Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
5527
Maria Bras-Amorós Tom Høholdt (Eds.)
Applied Algebra, AlgebraicAlgorithms, and Error-Correcting Codes 18th International Symposium, AAECC-18 Tarragona, Spain, June 8-12, 2009 Proceedings
13
Volume Editors Maria Bras-Amorós Universitat Rovira i Virgili Departament d’Enginyeria Informàtica i Matemàtiques Avinguda Països Catalans, 26, 43007 Tarragona, Catalonia, Spain, E-mail:
[email protected] Tom Høholdt The Technical University of Denmark Department of Mathematics Building 303, 2800 Lyngby, Denmark, E-mail:
[email protected]
Library of Congress Control Number: Applied for CR Subject Classification (1998): E.4, I.1, E.3, G.2, F.2 LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues ISSN ISBN-10 ISBN-13
0302-9743 3-642-02180-8 Springer Berlin Heidelberg New York 978-3-642-02180-0 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2009 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12690084 06/3180 543210
Preface
The AAECC symposia series was started in 1983 by Alain Poli (Toulouse), who, together with R. Desq, D. Lazard and P. Camion, organized the first conference. Originally the acronym AAECC stood for “Applied Algebra and Error-Correcting Codes.” Over the years its meaning has shifted to “Applied Algebra, Algebraic Algorithms and Error-Correcting Codes,” reflecting the growing importance of complexity, particularly for decoding algorithms. During the AAECC-12 symposium the Conference Committee decided to enforce the theory and practice of the coding side as well as the cryptographic aspects. Algebra was conserved, as in the past, but slightly more oriented to algebraic geometry codes, finite fields, complexity, polynomials, and graphs. The main topics for AAECC-18 were algebra, algebraic computation, codes and algebra, codes and combinatorics, modulation and codes, sequences, and cryptography. The invited speakers of this edition were Iwan Duursma, Henning Stichtenoth, and Fernando Torres. We would like to express our deep regret for the loss of Professor Ralf K¨ otter, who recently passed away and could not be our fourth invited speaker. Except for AAECC-1 (Discrete Mathematics 56, 1985) and AAECC-7 (Discrete Applied Mathematics 33, 1991), the proceedings of all the symposia have been published in Springer’s Lecture Notes in Computer Science (Vols. 228, 229, 307, 356, 357, 508, 539, 673, 948, 1255, 1719, 2227, 2643, 3857, 4851). It is a policy of AAECC to maintain a high scientific standard, comparable to that of a journal. This was made possible thanks to the many referees involved. Each submitted paper was evaluated by at least two international researchers. AAECC-18 received and refereed 50 submissions. Of these, 22 were selected for publication in these proceedings as regular papers and 7 were selected as extended abstracts. The symposium was organized by Maria Bras-Amor´os and Tom Høholdt, with the help of Jes´ us Manj´ on, Gl` oria Pujol, Jordi Castell` a, Antoni Mart´ınez, and Xavier Fern´ andez under the umbrella of the CRISES group for Cryptography and Statistical Secrecy, at the Universitat Rovira i Virgili, led by Josep DomingoFerrer. It was sponsored by the Catalan Government, the UNESCO Chair in Data Privacy located at Universitat Rovira i Virgili, and the Spanish Network on Mathematics of Information Society. We would like to dedicate these proceedings to the memory of our colleague Ralf K¨ otter. June 2009
Maria Bras-Amor´ os Tom Høholdt
Applied Algebra, Algebraic Algorithms, and Error Correcting Codes - AAECC-18
General Chair Maria Bras-Amor´ os
Universitat Rovira i Virgili, Catalonia, Spain
Co-chair Tom Høholdt
Technical University of Denmark
Program Committee Jacques Calmet Claude Carlet Gerard D. Cohen Cunsheng Ding Gui-Liang Feng Marc Giusti Guang Gong Joos Heintz Kathy Horadam Hideki Imai Navin Kashyap Shu Lin Oscar Moreno Wai Ho Mow Harald Niederreiter Michael E. O’Sullivan Ferruh Ozbudak Udaya Parampalli Alain Poli S. Sandeep Pradhan Asha Rao Shojiro Sakata Hong-Yeop Song Chaoping Xing
University of Karlsruhe, Germany University of Paris 8, France Ecole Nationale Sup´erieure des T´el´ecommunications, France Hong Kong University of Science and Technology, China University of Southwestern Louisiana, USA Ecole Polytechnique, France University of Waterloo, Canada University of Buenos Aires, Argentina and University of Cantabria, Spain RMIT University, Australia University of Tokyo, Japan Queen’s University, Canada University of California, USA University of Puerto Rico Southwest Jiaotong University, China National University of Singapore San Diego State University, USA Middle East Technical University, Turkey University of Melbourne, Australia University P. Sabatier, France University of Michigan at Ann Arbor, USA Royal Melbourne Institute of Technology, Australia Technical University of Denmark Yonsei University, Korea Nanyang Technological University, Singapore
VIII
Organization
Organizing Committee Jes´ us Manj´ on Universitat Gl` oria Pujol Universitat Jordi Castell` a-Roca Universitat Antoni Mart´ınez-Ballest´e Universitat Xavier Fern´ andez Universitat
Rovira Rovira Rovira Rovira Rovira
i i i i i
Virgili, Virgili, Virgili, Virgili, Virgili,
Catalonia, Catalonia, Catalonia, Catalonia, Catalonia,
Spain Spain Spain Spain Spain
Table of Contents
Codes The Order Bound for Toric Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peter Beelen and Diego Ruano
1
An Extension of the Order Bound for AG Codes . . . . . . . . . . . . . . . . . . . . . Iwan Duursma and Radoslav Kirov
11
Sparse Numerical Semigroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Munuera, F. Torres, and J. Villanueva
23
From the Euclidean Algorithm for Solving a Key Equation for Dual Reed–Solomon Codes to the Berlekamp-Massey Algorithm . . . . . . . . . . . . Maria Bras-Amor´ os and Michael E. O’Sullivan Rank for Some Families of Quaternary Reed-Muller Codes . . . . . . . . . . . . Jaume Pernas, Jaume Pujol, and Merc`e Villanueva Optimal Bipartite Ramanujan Graphs from Balanced Incomplete Block Designs: Their Characterizations and Applications to Expander/LDPC Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tom Høholdt and Heeralal Janwal Simulation of the Sum-Product Algorithm Using Stratified Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . John Brevik, Michael E. O’Sullivan, Anya Umlauf, and Rich Wolski A Systems Theory Approach to Periodically Time-Varying Convolutional Codes by Means of Their Invariant Equivalent . . . . . . . . . . Joan-Josep Climent, Victoria Herranz, Carmen Perea, and Virtudes Tom´ as
32 43
53
65
73
On Elliptic Convolutional Goppa Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jos´e Ignacio Iglesias Curto
83
The Minimum Hamming Distance of Cyclic Codes of Length 2ps . . . . . . . ¨ ¨ Hakan Ozadam and Ferruh Ozbudak
92
There Are Not Non-obvious Cyclic Affine-invariant Codes . . . . . . . . . . . . . ´ Jos´e Joaqu´ın Bernal, Angel del R´ıo, and Juan Jacobo Sim´ on
101
On Self-dual Codes over Z16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kiyoshi Nagata, Fidel Nemenzo, and Hideo Wada
107
X
Table of Contents
Cryptography A Non-abelian Group Based on Block Upper Triangular Matrices with Cryptographic Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ´ Rafael Alvarez, Leandro Tortosa, Jos´e Vicent, and Antonio Zamora Word Oriented Cascade Jump σ−LFSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . Guang Zeng, Yang Yang, Wenbao Han, and Shuqin Fan On Some Sequences of the Secret Pseudo-random Index j in RC4 Key Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Riddhipratim Basu, Subhamoy Maitra, Goutam Paul, and Tanmoy Talukdar
117 127
137
Very-Efficient Anonymous Password-Authenticated Key Exchange and Its Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SeongHan Shin, Kazukuni Kobara, and Hideki Imai
149
Efficient Constructions of Deterministic Encryption from Hybrid Encryption and Code-Based PKE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yang Cui, Kirill Morozov, Kazukuni Kobara, and Hideki Imai
159
Algebra Noisy Interpolation of Multivariate Sparse Polynomials in Finite Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ´ Alvar Ibeas and Arne Winterhof
169
New Commutative Semifields and Their Nuclei . . . . . . . . . . . . . . . . . . . . . . J¨ urgen Bierbrauer
179
Spreads in Projective Hjelmslev Geometries . . . . . . . . . . . . . . . . . . . . . . . . . Ivan Landjev
186
On the Distribution of Nonlinear Congruential Pseudorandom Numbers of Higher Orders in Residue Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Edwin D. El-Mahassni and Domingo Gomez Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t . . . . . . ´ ´ V´ıctor Alvarez, Jos´e Andr´ es Armario, Mar´ıa Dolores Frau, F´elix Gudiel, and Amparo Osuna
195 204
Extended Abstracts Interesting Examples on Maximal Irreducible Goppa Codes . . . . . . . . . . . Marta Giorgetti
215
Repeated Root Cyclic and Negacyclic Codes over Galois Rings . . . . . . . . . Sergio R. L´ opez-Permouth and Steve Szabo
219
Table of Contents
XI
Construction of Additive Reed-Muller Codes . . . . . . . . . . . . . . . . . . . . . . . . J. Pujol, J. Rif` a, and L. Ronquillo
223
Gr¨ obner Representations of Binary Matroids . . . . . . . . . . . . . . . . . . . . . . . . M. Borges-Quintana, M.A. Borges-Trenard, and E. Mart´ınez-Moro
227
A Generalization of the Zig-Zag Graph Product by Means of the Sandwich Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . David M. Monarres and Michael E. O’Sullivan
231
Novel Efficient Certificateless Aggregate Signatures . . . . . . . . . . . . . . . . . . . Lei Zhang, Bo Qin, Qianhong Wu, and Futai Zhang
235
Bounds on the Number of Users for Random 2-Secure Codes . . . . . . . . . . Manabu Hagiwara, Takahiro Yoshida, and Hideki Imai
239
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
243
The Order Bound for Toric Codes Peter Beelen and Diego Ruano DTU-Mathematics, Technical University of Denmark, Matematiktorvet, Building 303, 2800 Kgs. Lyngby, Denmark {P.Beelen,D.Ruano}@mat.dtu.dk
Abstract. In this paper we investigate the minimum distance of generalized toric codes using an order bound like approach. We apply this technique to a family of codes that includes the Joyner code. For some codes in this family we are able to determine the exact minimum distance.
1
Introduction
In 1998 J.P. Hansen considered algebraic geometry codes defined over toric surfaces [7]. Thanks to combinatorial techniques of such varieties he was able to estimate the parameters of the resulting codes. For example, the minimum distance was estimated using intersection theory. Toric geometry studies varieties which contain an algebraic torus as a dense subset and where moreover the torus acts on the variety. The importance of such varieties, called toric varieties, resides in their correspondence with combinatorial objects, which makes the techniques to study the varieties (such as cohomology, intersection theory, resolution of singularities, etc) more precise and at the same time tractable [3,6]. The order bound gives a way to obtain a lower bound for the minimum distance of linear codes [1,4,5,9]. Especially for codes from algebraic curves this technique has been very successful. In this article we will develop a similar bound for toric codes. Actually our bound also works for the more general class of generalized toric codes (see Section 2). This will give a new way of estimating the minimum distance of toric codes that in some examples give a better bound than intersection theory. Another advantage is that known algorithms [4,5] can be used to decode the codes up to half the order bound. As an example we will compute the order bound for a family of codes that includes the Joyner codes [11]. For this reason we call these codes generalized Joyner codes. Also we will compute the exact minimum distance for several generalized Joyner codes. It turns out that a combination of previously known techniques and the order bound gives a good estimate of the minimum distance of generalized Joyner codes. The paper is organized as follows. In Section 2 we will give an introduction to toric codes and generalized toric codes, while in Section 3 the order bound for
The work of D. Ruano is supported in part by DTU, H.C. Oersted post doc. grant (Denmark) and by MEC MTM2007-64704 and Junta de CyL VA065A07 (Spain).
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 1–10, 2009. c Springer-Verlag Berlin Heidelberg 2009
2
P. Beelen and D. Ruano
these codes will be established. The last section of the paper will illustrate the theory by applying the results to generalized Joyner codes.
2
Toric Codes and Generalized Toric Codes
Algebraic geometry codes [9,19] are usually defined evaluating algebraic functions over a non-singular projective variety X defined over a finite field. The functions of L(D) are evaluated at certain rational points of the curve (P = {P1 , . . . , Pn }), where D is a divisor whose support does not contain any of the evaluation points. The zeros and poles of the functions of L(D) are bounded by D. More precisely, the algebraic geometry code C(X, D, P) is the image of the linear map: ev : L(D) → Fnq f → (f (P1 ), . . . , f (Pn )) In this section we introduce toric codes, that is, algebraic geometry codes over toric varieties. One can define a toric variety and a Cartier divisor using a convex polytope, namely, a convex polytope is the same datum as a toric variety and Cartier divisor. Let M be a lattice isomorphic to Zr for some r ∈ Z and MR = M ⊗ R. Let P be an r-dimensional rational convex polytope in MR and let us consider XP and DP the toric variety and the Cartier divisor defined by P [15]. We may assume that XP is non singular, in other case we refine the fan [6, Section 2.6]. Let L(DP ) be the Fq -vector space of functions f over XP such that div(f ) + DP 0. The toric code C(P ) associated to P is the image of the linear evaluation map ev : L(DP ) → Fnq f → (f (t))t∈T where the set of points P = T is the algebric torus T = (F∗q )r . Since we evaluate at #T points, C(P ) has length n = (q − 1)r . One has that L(DP ) is the Fq -vector space generated by the monomials with exponents in P ∩ M L(DP ) = {X u = X1u1 · · · Xrur | u ∈ P ∩ M } ⊂ Fq [X1 , . . . , Xr ] The minimum distance of a toric code C(P ) may be estimated using intersection theory [8,15]. Also, it can be estimated using a multivariate generalization of Vandermonde determinants on the generator matrix [13]. For plane polytopes, r = 2, one can estimate the minimum distance using the Hasse-Weil bound and combinatorial invariants of the polytope (the Minkowsky sum [12] and the Minkowsky length [17]). An extension of toric codes are the so-called generalized toric codes [16]. The generalized toric code C(U ) is the image of the Fq -linear map ev : Fq [U ] → Fnq f → (f (t))t∈T where U ⊂ H = {0, . . . , q − 2}r and Fq [U ] is the Fq -vector space
The Order Bound for Toric Codes
3
Fq [U ] = X u = X1u1 · · · Xrur | u = (u1 , . . . , ur ) ∈ U ⊂ Fq [X1 , . . . , Xr ]. Let u be u mod ((q−1)Z)r , that is u = (u1 mod (q−1), . . . , ur mod (q−1)), for u ∈ Zr , and U = {u | u ∈ U }. The dimension of the code C(U ) is k = #U = #U , since the evaluation map ev is injective. By [16, Theorem 6], one has that the dual code of C(U ) is C(U ⊥ ), where ⊥ U = H \ −U, with −U = {−u | u ∈ U }. Namely, we have 0 if u + u = 0 u u ev(X ) · ev(X ) = (1) r (−1) if u + u = 0 for u, u ∈ H, where · denotes the inner product in Fnq . The family of generalized toric codes includes the ones obtained evaluating polynomials of an arbitrary subalgebra of Fq [X1 , . . . , Xr ] at T , in particular toric codes. However, there is no estimate so far for the minimum distance in this more general setting, the order bound techniques in this paper will apply to generalized toric codes as well. From now on we will consider generalized toric codes but for the sake of simplicity, we will just call them toric codes.
3
The Order Bound for Toric Codes
In this section we follow the order bound approach to estimate the minimum distance of the dual code of a toric code C(U ), for U ⊂ H. Let B1 = {g1 , . . . , gn } and B2 = {h1 , . . . , hn } be two bases of Fq [H]. For c = ev(f ) ∈ C(U ), we consider the syndrome matrix S(c) = (si,j )1≤i,j≤n , with si,j = (ev(gi ) ∗ ev(hj )) · ev(f ) = ev(gi hj ) · ev(f ), where ∗ denotes the componentwise product. In other words, S(c) = M1 D(c)M2t , where D is the diagonal matrix with c in the diagonal and M1 and M2 are the evaluation matrices given by ⎞ ⎞ ⎛ ⎛ g1 (t1 ) g1 (t2 ) · · · g1 (tn ) h1 (t1 ) h1 (t2 ) · · · h1 (tn ) ⎜ g2 (t1 ) g2 (t2 ) · · · g2 (tn ) ⎟ ⎜ h2 (t1 ) h2 (t2 ) · · · h2 (tn ) ⎟ ⎟ ⎟ ⎜ ⎜ , M M1 = ⎜ . = ⎟ ⎜ .. .. .. .. .. .. .. ⎟ . 2 ⎠ ⎝ .. ⎝ . . . . . . . ⎠ gn (t1 ) gn (t2 ) · · · gn (tn )
hn (t1 ) hn (t2 ) · · · hn (tn )
Here t1 , . . . , tn denote the points of the algebraic torus. Note that M1 and M2 have full rank, since the evaluation map is injective. This implies that the rank of S(c) equals wt(c). It is convenient to consider bases of Fq [H] consisting of monomials, that is, we set gi = X vi and hi = X wi , for i = 1, . . . , n, with {v1 , . . . , vn } = {w1 , . . . , wn } = H. Then, we can easily compute the syndrome matrix for a codeword using the following lemma. u and S(ev(f )) = (si,j )1≤i,j≤n the syndrome Lemma 1. Let f = u∈H λu X matrix of ev(f ). Then, one has that si,j = (−1)r λ−(vi +wj ) . In particular, si,j is equal to zero if and only if vi + wj ∈ / −supp(f ), where supp(f ) denotes the support of f , supp(f ) = {u ∈ H | λu = 0}.
4
P. Beelen and D. Ruano
Proof. By definition, si,j = ev(X vi +wj ) · ev( = (−1)r λ−(vi +wj )
λu X u ) =
u∈H
ev(X vi +wj ) · ev(λu X u )
u∈H
(by (1)).
Therefore, si,j is equal to zero if and only if −(vi + wj ) is not in the support of f . Equivalently, si,j = 0 if and only if vi + wj ∈ / −supp(f ).
To bound the minimum distance using order domain theory, we should give a lower bound for the rank of the syndrome matrix. Since the order bound gives an estimate for the minimum distance of the dual code, we begin by considering C(U )⊥ = C(U ⊥ ) to get a bound for the minimum distance of C(U ). Let H = {u1 , . . . , un }, with U ⊥ = {u1 , . . . , un−k } ⊂ H, notice that U = {−un−k+1 , . . . , −un }. We are dealing with an arbitrary order on H, we only require, for the sake of simplicity, that the first n − k elements of H are the elements of U ⊥ . For l ∈ {0, . . . , k − 1}, we consider the following filtration of codes depending on the previous ordering C C1 C2 · · · Cl Cl+1 , where C = C(U ⊥ ) and Cm = C(U ⊥ ∪ {un−k+1 , . . . , un−k+m }), for m = 1, . . . , l + 1, and their dual codes, ⊥ C ⊥ C1⊥ C2⊥ · · · Cl⊥ Cl+1 , ⊥ with C ⊥ = C(U ) and Cm = C(U \ {−un−k+1 , . . . , −un−k+m }), for m = 1, . . . , l + ⊥ 1, since (U ∪ {un−k+1 , . . . , un−k+m })⊥ = U \ {−un−k+1 , . . . , −un−k+m }. ⊥ We wish to bound the weight of c ∈ Cl⊥ \ Cl+1 . Let νl be the largest integer (in {1, . . . , n}) such that
• vi + wi = un−k+l+1 , for i = 1, . . . , νl . • vi + wj ∈ U ⊥ ∪ {un−k+1 , . . . , un−k+l }, for i = 1, . . . , νl and j < i. ⊥ Proposition 1. Let c ∈ Cl⊥ \ Cl+1 , then wt(c) ≥ νl . Proof. Let c = ev(f ), then f = λu X u , where u ∈ U \ {−un−k+1 ,. . . ,−un−k+l }. ⊥ Notice that λ−un−k+l+1 = 0, since c ∈ / Cl+1 . Hence we have by Lemma 1 that,
• si,i = 0, for i = 1, . . . , νl , since vi + wi = un−k+l+1 ∈ −supp(f ), for i = 1, . . . , νl . • si,j = 0, for i = 1, . . . , νl , since vi + wj ∈ U ⊥ ∪ {un−k+1 , . . . , un−k+l }, for j < i. That is, vi + wj ∈ / −supp(f ) because H \ (U ⊥ ∪ {un−k+1 , . . . , un−k+l }) = −U \ {un−k+1 , . . . , un−k+l } Therefore, the submatrix of S(c) consisting of the first νl rows and columns has full rank. In particular, the rank of S(c) is at least νl and the result holds since the rank of S(c) is equal to the weight of c.
The Order Bound for Toric Codes
5
For every l in {0, . . . , k − 1} we consider a filtration and we obtain a bound for ⊥ the weight of a word in Cl⊥ \ Cl+1 . Therefore, we have obtained the following bound for the minimum distance of C ⊥ = C(U ). Theorem 1. Let C(U ) be a toric code with U ⊂ H. Then, d(C(U )) ≥ min{νl | l = 0, . . . , k − 1}. Remark 1. We can apply known decoding algorithms [4,5] to decode a toric code C(U ) up to half of the order bound obtained in the previous theorem. In the next section we will use the above approach to estimate the minimum distance of a family of toric codes.
4
Generalized Joyner Codes
In this section we will introduce a class of toric codes that includes the wellknown Joyner code [11, Example 3.9]. After introducing these codes, we will calculate a lower bound for their minimum distances using techniques from Section 3. Then we will calculate another lower bound for the minimum distance using a combination of the order bound and Serre’s improvement of Hasse-Weil’s theorem on the number of rational points on a curve [18]. In some cases we are able to compute the exact minimum distance. In this section we will always assume that r = 2, so that H = {0, . . . , q − 2} × {0, . . . , q − 2}. Definition 1. Let q be a power of a prime and a an integer satisfying 2 ≤ a ≤ q − 2. We define the sets Ua = {(u1 , u2 ) ∈ H | u1 + u2 ≤ a + 1, u1 − au2 ≤ 0, −au1 + u2 ≤ 0}, Ta = {(u1 , u2 ) ∈ H | u1 + u2 ≤ a + 1, u1 ≥ 1, u2 ≥ 1}, Va = Ua \{(1, a)}, and Wa = Ua \{(a, 1)}. The set Ua consists of all elements of H lying in or on the boundary of the triangle with vertices (0, 0), (1, a) and (a, 1). Note that the condition on a ensures that the points (1, a) and (a, 1) are in H. Also note that the set Ua can be obtained by joining (0, 0) to the set Ta . All sets in the above definition are actually sets of integral points in a polytope, so the corresponding codes are classical toric codes. We wish to investigate the toric code C(Ua ) and begin by establishing some elementary properties: Lemma 2. The code C(Ua ) is an [(q − 1)2 , 1 + a(a + 1)/2, d] code over Fq and we have d ≤ (q − 1)(q − a).
6
P. Beelen and D. Ruano
Proof. Since the set Ua is contained in H, the dimension of the corresponding code is equal to the number of elements in Ua . Since Ua can be obtained by joining {(0, 0)} to the set Ta the formula for the dimension follows by a counting argument. To prove the result on the minimum distance first note that the code C(Ta ) is a subcode of C(Ua ). It is well known, [8, Theorem 1.3], that d(C(Ta )) = (q − 1)(q − a), so the result follows.
The code C(U4 ) is the Joyner code over Fq , see [11]. It is known that for q ≥ 37 its minimum distance is equal to (q − 1)(q − 4), [17], meaning that the upper bound in the previous lemma is attained. In fact equality already holds for much smaller q. Using a computer one finds that q = 8 is the smallest value of q for which equality holds. It is conjectured that for all q ≥ 8 one has that d(C(U4 )) = (q − 1)(q − 4). This behavior turns out to happen as well for other values of a. This is the reason we study these codes in this section. We proceed our investigation by calculating two lower bounds for the minimum distance of the codes C(Ua ). The first one holds for any q, while the second one turns out to be interesting only for large q. Proposition 2. The minimum distance d of the Fq -linear code C(Ua ) satisfies d ≥ (q − 1)(q − a − 1). Proof. Since d(C(Ta )) = (q − 1)(q − a), the proposition follows once we have shown that wt(c) ≥ (q − 1)(q − a − 1) for any c ∈ C(Ua )\C(Ta ). For such c it holds that c = ev(f ) for some f ∈ Fq [Ua ] satisfying that (0, 0) ∈ supp(f ). We will now use Proposition 1. Any number i between 0 and (q − 1)(q − a − 1) − 1 can be written uniquely as i = βi · (q − 1) + αi with αi and βi integers satisfying 0 ≤ αi ≤ q − 2 and 0 ≤ βi ≤ q − a − 2. For i between 0 and (q − 1)(q − a − 1) − 1 we then define vi = (αi , βi ) and wi = −vi . By construction of wi it then holds that vi + wi = (0, 0). On the other hand, if j < i, then vi + wj = (0, 0) and 0 ≤ βi − βj ≤ q − a − 2 implying that vi + wj ∈ −Ua . By Proposition 1, we get that wt(c) ≥ (q − 1)(q − a − 1).
For q = 8 and a = 4, the Joyner code case, we obtain that d ≥ 21. An other method to obtain a lower bound for the minimum distance of toric codes is to use intersection theory. For the Joyner code over F8 one can prove in this way that d ≥ 12, [14]. The bound we get compares favorably to it. Another advantage of the order bound techniques is that they are valid for generalized toric codes as well. Now we obtain a second lower bound on the minimum distance of the code C(Ua ). First we need a lemma. Lemma 3. Let c be a nonzero codeword from the code C(Va ) or the code C(Wa ). Then wt(c) ≥ (q − 1)(q − a). Proof. Suppose that c ∈ C(Va ) (the case that c ∈ C(Wa ) can be dealt with similarly by symmetry and will not be discussed below). If c ∈ C(Ta ), we are done. Therefore we can suppose that c ∈ C(Va )\C(Ta ). Exactly as in the proof
The Order Bound for Toric Codes
7
of Proposition 2 we now define for i between 0 and (q − 1)(q − a) − 1 the tuple ui = (αi , βi ). The only difference is that now βi is also allowed to be q − a, otherwise everything is the same. Further we also define wi = −vi . Then we have that vi + wi = (0, 0) and for j < i, we obtain that vi + wj = (0, 0) and 0 ≤ βi − βj ≤ q − a − 1. This implies that vi + wj ∈ −Va . The lemma now follows.
One can use Lemma 3 and the fact that d(C(Ta )) = (q−1)(q−a) [8], to restrict the number of possibilities for a non-zero codeword of weight less than (q − 1)(q − a). Namely, it has to be the evaluation of a function f = λu X u with non-zero coefficients λ(a,1) , λ(0,0) , λ(1,a) . We will use this in the following proposition. We distinguish cases between a = 2 and a > 2. Proposition 3. Let Ua be the set from Definition 1 and let a > 2. The minimum distance d of the code C(Ua ) satisfies a(a − 1) √ 2 d ≥ min (q − 1)(q − a), q − 3q + 2 − 2 q . 2 Proof. Let c ∈ C(Ua ) be a nonzero codeword and suppose that c = ev(f ). If supp(f ) ⊂ Ta then we know that wt(c) ≥ (q − 1)(q − a) from [8] as noted before. If supp(f ) ⊂ Va or supp(f ) ⊂ Wa then wt(c) ≥ (q − 1)(q − a) by Lemma 3. We are left with the case that {(0, 0), (1, a), (a, 1)} ⊂ supp(f ). In this case the Newton-polygon of the polynomial f is Minkowski-indecomposable which implies that the polynomial f is absolutely irreducible. We can therefore consider the algebraic curve Cf defined by the equation f = 0. From Newton-polygon theory it follows that this curve has geometric genus at most a(a − 1)/2 and that the edges from (0, 0) to (1, a) and from (0, 0) to (a, 1) correspond to two rational points at infinity using projective coordinates (see [2, Remark 3.18 and Theorem 4.2]). We denote by N the total number of pairs (α, β) ∈ F2q such that √ f (α, β) = 0. We claim that N ≤ q − 1 + a(a − 1)/2 q. If the curve C has no singularities, all solutions (α, β) correspond one-to-one to all affine Fq -rational points. Taking into account that there at least 2 rational points at infinity (in fact at least 3 if a = 2), the claim follows from Serre’s bound (and can be slightly improved if a = 2). If there are singularities, a solution (α, β) may not correspond to a rational point on C, but for every such solution the genus will drop at least one, so the claim still follows from Serre’s bound. The proposition
now follows, since wt(c) ≥ (q − 1)2 − N . For a = 2 we can determine the exact minimum distance. We will do so in the following theorem. This theorem is formulated using the existence or nonexistence of an elliptic curve with a certain number of points. The theorem is completely constructive, since it is very easy to determine if an elliptic curve defined over Fq with a certain number of points exists. To this end one can use the following fact [20, Theorem 4.1]: Let q be a power of a prime p and let t be an integer. There exists an elliptic curve defined over Fq with q + 1 + t points if and only if the following holds:
8
P. Beelen and D. Ruano
1. p | t and t2 ≤ 4q, 2. e is odd and one of the following holds (a) t = 0, (b) t2 = 2q and p = 2, (c) t2 = 3q and p = 3 , 3. e is even and one of the following holds (a) t2 = 4q, (b) t2 = q and p ≡ 1 mod 3, (c) t = 0 and p ≡ 1 mod 4. Theorem 2. Denote by d the minimum distance of the Fq -linear code C(U2 ). Further let t be the largest integer such that 1. 3 | q + 1 + t, 2. there exists an elliptic curve defined over Fq with q + 1 + t points. Then we have: d = q 2 − 3q + 3 − t. Proof. Analogously as in the previous proposition we only have to consider codewords coming from functions f such that {(0, 0), (1, 2), (2, 1)} ⊂ supp(f ). We again consider the curve Cf given by the equation f = 0. In this case all three edges of the Newton polygon of f correspond to rational points at infinity. The polynomial f can be written as α + βX1 X2 + γX1 X22 + δX12 X2 , where α, γ and δ are nonzero. We may assume that the curve does not have singularities, since otherwise the geometric genus of Cf is zero, which implies that the equation f = 0 has at most 1 + (q + 1) − 3 = q − 1 solutions in F2q (the first term represents the singular point which could have rational coordinates). This would give rise to a codeword of weight at least (q − 1)2 − (q − 1) = q 2 − 3q + 2. By changing variables to U = −γX1 X2 /δ and V = γX12 X2 /δ (or equivalently X1 = −V /U and X2 = δU 2 /(γV )) one can show that the curve Cf is also given by the equation V 2 − βU V /δ + αγV /δ 2 = U 3 . Since we have assumed that the curve Cf is nonsingular, it is an elliptic curve and we already found a Weierstrass equation for it. In (U, V ) coordinates one sees that (0, 0) is a point on the elliptic curve and using the addition formula one checks that this point has order three in the elliptic curve group. Denote the total number of rational points on Cf by q + 1 + t, then clearly there exists an elliptic curve with q + 1 + t points. Since the point (0, 0) has order three, the total number of rational points has to be a multiple of three and it follows that 3|q + 1 + t. Reasoning back we see that the total number of affine solutions to the equation f = 0 in F2q equals q + 1 + t − 3 = q − 2 + t. On the other hand there are no solutions with zero coordinates, so we have that wt(ev(f )) = (q − 1)2 − (q − 2 + t). It remains to be shown which values of t are possible when the polynomial f is varied. It is shown in [10, Section 4.2] that any elliptic curve having (0, 0) as a point of order three has a Weierstrass equation of the form V 2 + a1 U V + a3 V = U 3 , with a3 = 0. This implies that t can be any value satisfying the two conditions stated in the formulation of the theorem. Choosing the maximal one among these values gives a codeword of lowest possible nonzero weight. This concludes the proof.
The Order Bound for Toric Codes
9
If a > 2 the situation is more complicated. Given a fixed a the lower bound from Proposition 2 is good for relatively small q, while the lower bound from Proposition 3 becomes better as q becomes larger. A combination of the techniques from Section 3 and this section gives an in general good lower bound for the minimum distance. In some cases we are able to determine the minimum distance and we describe when this happens in the following theorem. Theorem 3. Let q be a prime power and consider a natural number a satisfying 2 < a ≤ q − 2. Define the set Ua as in Definition 1. Then we have the following for the minimum distance d of the Fq -linear code C(Ua ): √ d = (q − 1)(q − a) if 2(a − 2)(q − 1) ≥ (a − 1)a2 q. √ Proof. If (q − 1)(q − a) ≤ q 2 − 3q + 2 − a(a − 1)2 q/2, then it follows from Lemma 2 and Proposition 3 that d = (q − 1)(q − a). The theorem follows after some manipulation of this inequality.
For small values of a we obtain the following corollary. Note that the results for a = 4 are the same as in [17]. Corollary 1. We have d(C(U3 )) = (q − 1)(q − 3) d(C(U4 )) = (q − 1)(q − 4) d(C(U5 )) = (q − 1)(q − 5) d(C(U6 )) = (q − 1)(q − 6)
if if if if
q q q q
≥ 37, ≥ 37, = 41 or q ≥ 47, ≥ 59.
As in the case for the Joyner code it seems that the for many small values of q it also holds that d(C(Ua )) = (q − 1)(q − a). We know by Corollary 1 and some computer calculations that d(C(U3 )) = (q −1)(q −3) if q ≥ 23. It is known for the Joyner code that d(C(U4 )) = (q − 1)(q − 4) if q ≥ 8. Finally we conjecture that d(C(U5 )) = (q − 1)(q − 5) if q ≥ 9. It remains future work to prove this conjecture without the aid of a computer and to establish what happens for larger values of a.
References 1. Beelen, P.: The order bound for general algebraic geometric codes. Finite Fields Appl. 13(3), 665–680 (2007) 2. Beelen, P., Pellikaan, R.: The Newton polygon of plane curves with many rational points. Des. Codes Cryptogr. 21(1-3), 41–67 (2000) 3. Danilov, V.I.: The geometry of toric varieties. Russian Math. Surverys 33(2), 97– 154 (1978) 4. Duursma, I.M.: Majority coset decoding. IEEE Trans. Inform. Theory 39(3), 1067– 1070 (1993) 5. Feng, G.L., Rao, T.R.N.: Improved geometric Goppa codes. I. Basic theory. IEEE Trans. Inform. Theory 41(6, part 1), 1678–1693 (1995) 6. Fulton, W.: Introduction to toric varieties. Annals of Mathematics Studies, vol. 131. Princeton University Press, Princeton (1993)
10
P. Beelen and D. Ruano
7. Hansen, J.P.: Toric surfaces and error-correcting codes. In: Buchmann, J., Høholdt, T., Stichtenoth, H., Tapia-Recillas, H. (eds.) Coding theory, cryptography and related areas (Guanajuato, 1998), pp. 132–142. Springer, Berlin (2000) 8. Hansen, J.P.: Toric varieties Hirzebruch surfaces and error-correcting codes. Appl. Algebra Engrg. Comm. Comput. 13(4), 289–300 (2002) 9. Høholdt, T., van Lint, J.H., Pellikaan, R.: Algebraic geometry of codes. In: Handbook of coding theory, vol. I, II, pp. 871–961. North-Holland, Amsterdam (1998) 10. Husem¨ oller, D.: Elliptic curves, 2nd edn. Graduate Texts in Mathematics, vol. 111. Springer, New York (2004) 11. Joyner, D.: Toric codes over finite fields. Appl. Algebra Engrg. Comm. Comput. 15(1), 63–79 (2004) 12. Little, J., Schenck, H.: Toric surface codes and Minkowski sums. SIAM J. Discrete Math. 20(4), 999–1014 (2006) 13. Little, J., Schwarz, R.: On toric codes and multivariate Vandermonde matrices. Appl. Algebra Engrg. Comm. Comput. 18(4), 349–367 (2007) 14. Mart´ınez-Moro, E., Ruano, D.: Toric codes. In: Advances in Algebraic Geometry Codes. Series on Coding Theory and Cryptology, vol. 5, pp. 295–322. World Scientific, Singapore (2008) 15. Ruano, D.: On the parameters of r-dimensional toric codes. Finite Fields Appl. 13(4), 962–976 (2007) 16. Ruano, D.: On the structure of generalized toric codes. J. Symb. Comput. 44(5), 499–506 (2009) 17. Soprunov, I., Soprunova, E.: Toric surface codes and minkowski length of polygons. arXiv:0802.2088v1 [math. AG] (2008) 18. Serre, J.-P.: Sur le nombre des points rationnels d’une courbe alg´ebrique sur un corps fini. C. R. Acad. Sci. Paris S´er. I Math. 296(9), 397–402 (1983) 19. Tsfasman, M.A., Vl˘ adut¸, S.G.: Algebraic-geometric codes, Dordrecht. Mathematics and its Applications (Soviet Series), vol. 58 (1991) ´ 20. Waterhouse, W.C.: Abelian varieties over finite fields. Ann. Sci. Ecole Norm. Sup. 2(4), 521–560 (1969)
An Extension of the Order Bound for AG Codes Iwan Duursma and Radoslav Kirov Department of Mathematics, University of Illinois at Urbana-Champaign, 1409 W. Green Street (MC-382) Urbana, Illinois 61801-2975 http://www.math.uiuc.edu/{~ duursma, ~rkirov2}
Abstract. The most successful method to obtain lower bounds for the minimum distance of an algebraic geometric code is the order bound, which generalizes the Feng-Rao bound. By using a finer partition of the set of all codewords of a code we improve the order bounds by Beelen and by Duursma and Park. We show that the new bound can be efficiently optimized and we include a numerical comparison of different bounds for all two-point codes with Goppa distance between 0 and 2g − 1 for the Suzuki curve of genus g = 124 over the field of 32 elements. Keywords: Algebraic geometric code, order bound, Suzuki curve.
1
Introduction
To obtain lower bounds for the minimum distance of algebraic geometric codes we follow [7] and exploit a relation between the minimum distance of algebraic geometric codes and the representation of divisor classes by differences of base point free divisors. Theorem 2 gives a new improved lower bound for the weight of vectors in a subset of the code. In Section 5 we outline methods to efficiently optimize lower bounds for the minimum distance using the theorem. Section 6 has tables that compare different bounds for two-point Suzuki codes over the field of 32 elements. Let X/F be an algebraic curve (absolutely irreducible, smooth, projective) of genus g over a finite field F. Let F(X) be the function field of X/F. A nonzero rational function f ∈ F(X) has divisor (f ) = P ∈X ordP (f )P = (f )0 − (f )∞ , where the positive part (f )0 gives the zeros of f and their multiplicities, and the negative part (f )∞ gives the poles of f and their multiplicities. A divisor D = P mP P is principal if it is of the form D = (f ) for some f ∈ F(X). Two divisors D and D are linearly equivalent if D = D + (f ) for some f ∈ F(X). Given a divisor D on X defined over F, let L(D) denote the vector space over F of nonzero functions f ∈ F(X) for which (f ) + D ≥ 0 together with the zero function. A point P is a base point for the linear system of divisors {(f ) + D : f ∈ L(D)} if (f ) + D ≥ P for all f ∈ L(D), that is to say if L(D) = L(D − P ). We give the definition of an algebraic geometric code. For n distinct rational points P1 , . . . , Pn on X and for disjoint divisors D = P1 + · · · + Pn and G, the geometric Goppa code CL (D, G) is defined as the image of the map αL : L(G) −→ F n , f → ( f (P1 ), . . . , f (Pn ) ). M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 11–22, 2009. c Springer-Verlag Berlin Heidelberg 2009
12
I. Duursma and R. Kirov
With the Residue theorem, the dual code CL (D, G)⊥ can be expressed in terms of differentials. Let Ω(X) be the module of rational differentials for X/F. For a given divisor E on X defined over F, let Ω(E) denote the vector space over F of nonzero differentials ω ∈ Ω(X) for which (ω) ≥ E together with the zero differential. Let K represent the canonical divisor class. The geometric Goppa code CΩ (D, G) is defined as the image of the map αΩ : Ω(G − D) −→ F n , ω → ( ResP1 (ω), . . . , ResPn (ω) ). The code CΩ (D, G) is the dual code for the code CL (D, G). We use the following characterization of the minimum distance. Proposition 1. [7, Proposition 2.1] For the code CL (D, G), and for C = D−G, d(CL (D, G)) = min{deg A : 0 ≤ A ≤ D | L(A − C) = L(−C)}. For the code CΩ (D, G), and for C = G − K, d(CΩ (D, G)) = min{deg A : 0 ≤ A ≤ D | L(A − C) = L(−C)}. Proof. There exists a nonzero word in CL (D, G) with support in A, for 0 ≤ A ≤ D, if and only if L(G − D + A)/L(G − D) = 0. There exists a nonzero word in CΩ (D, G) with support in A, for 0 ≤ A ≤ D, if and only if Ω(G − A)/Ω(G) = 0 if and only if L(K − G + A)/L(K − G) = 0. It is clear that in each case d ≥ deg C. The lower bound dGOP = deg C is the Goppa designed minimum distance of a code.
2
Coset Bounds
For a point P disjoint from D, consider the subcodes CL (D, G − P ) ⊆ CL (D, G) and CΩ (D, G + P ) ⊆ CΩ (D, G). Proposition 2. [7, Proposition 3.5] Let A = supp(c) be the support of a codeword c = (cP : P ∈ D), with 0 ≤ A ≤ D. For C = D − G, c ∈ CL (D, G)\CL (D, G − P ) ⇒ L(A − C) = L(A − C − P ). For C = G − K, c ∈ CΩ (D, G)\CΩ (D, G + P ) ⇒ L(A − C) = L(A − C − P ). Proof. There exists a word in CL (D, G)\CL (D, G − P ) with support in A, for 0 ≤ A ≤ D, if and only if CL (D, G − (D − A)) = CL (D, G − P − (D − A)) only if L(G−D+A) = L(G−D+A−P ). There exists a word in CΩ (D, G)\CΩ (D, G+P ) with support in A, for 0 ≤ A ≤ D, if and only if CΩ (A, G) = CΩ (A, G + P ) only if Ω(G − A) = Ω(G − A + P ), which can be expressed as L(K − G + A) = L(K − G + A − P ).
An Extension of the Order Bound for AG Codes
13
The order bound for the minimum distance of an algebraic geometric code is motivated by the decoding procedures in [10], [8] and combines estimates for the weight of a word c ∈ CL (D, G)\ CL (D, G − P ), or c ∈ CΩ (D, G)\ CΩ (D, G + P ). The basic version (often referred to as the simple or first order bound [13], [4]) takes the form d(CL (D, G)) = min( min{wt(c) : c ∈ CL (D, G − iP )\CL (D, G − (i + 1)P )} ), i≥0
d(CΩ (D, G)) = min( min{wt(c) : c ∈ CΩ (D, G + iP )\CL (D, G + (i + 1)P )} ). i≥0
The order bound makes it possible to use separate estimates for different subsets of codewords. The bound is successful if for each subset, i.e. for each i, we can find an estimate that is better than a uniform lower bound for all codewords. Methods that provide uniform lower bounds include the Goppa designed minimum distance and bounds of floor type [17], [16], [11], [7]. It follows from the Singleton bound that the minimum distance of an algebraic geometric code can not exceed its designed minimum distance by more than g, where g is the genus of the curve. This implies that the minimum in the order bound occurs for i ∈ {0, . . . , g}. For a curve X defined over the field F, let Pic(X) be the group of divisor classes. Let Γ = {A : L(A) = 0} be the semigroup of effective divisor classes. For a divisor class C, define Γ (C) = {A : L(A) = 0 and L(A − C) = 0}, Γ ∗ (C) = {A : L(A) = 0 and L(A − C) = L(−C)}. The semigroup Γ (C) has the property that A + E ∈ Γ (C) whenever A ∈ Γ (C) and E ∈ Γ. With the extra structure Γ (C) is a semigroup ideal. Similar for Γ ∗ (C). For a suitable choice of divisor class C, the subsets of coordinates A that support a codeword in an algebraic geometric code CL (D, G) or CΩ (D, G) belong to the semigroup ideals Γ ∗ (C) ⊆ Γ (C). c ∈ CL (D, G)\{0} ⇒ A = supp(c) ∈ Γ ∗ (C) ⊆ Γ (C), c ∈ CΩ (D, G)\{0} ⇒ A = supp(c) ∈ Γ ∗ (C) ⊆ Γ (C). Following [7], our approach from here on will be to estimate the minimal degree of a divisor A ∈ Γ ∗ (C). For that purpose we no longer need to refer to the codes CL (D, G) or CΩ (D, G) after we choose C = D − G or C = G − K, respectively. We write C(C) to refer to any code with designed minimum support D − G or G − K in the divisor class C. In this short paper, we restrict ourselves to the case deg C > 0 (i.e. to codes with positive Goppa designed minimum distance), so that L(−C) = 0, and Γ ∗ (C) = Γ (C). The case deg C ≤ 0 is handled with a straightforward modification similar to that used in [7]. To apply the order bound argument we restate Proposition 2 in terms of semigroup ideals. For a given point P ∈ X, let ΓP = {A : L(A) = L(A − P )} be
14
I. Duursma and R. Kirov
the semigroup of effective divisor classes with no base point at P . For a divisor class C and for a point P , define the semigroup ideal ΓP (C) = {A : L(A) = L(A − P ) and L(A − C) = L(A − C − P )}. The implications in Proposition 2 become c ∈ CL (D, G)\CL (D, G − P ) ⇒ A = supp(c) ∈ ΓP (C), c ∈ CΩ (D, G)\CΩ (D, G + P ) ⇒ A = supp(c) ∈ ΓP (C). Let ΔP (C) be the complement of ΓP (C) in ΓP , ΔP (C) = {A : L(A) = L(A − P ) and L(A − C) = L(A − C − P )}. Theorem 1. (Duursma-Park [7]) Let {A1 ≤ A2 ≤ · · · ≤ Aw } ⊂ ΔP (C) be a sequence of divisors with Ai+1 ≥ Ai + P , for i = 1, . . . , w − 1. Then deg A ≥ w, for every divisor A ∈ ΓP (C) with support disjoint from Aw − A1 . Proof. A more general version is proved in the next section.
3
Order Bounds
Following [1], we use the order bound with a sequence of points {Qi : i ≥ 0}. For Ri = Q0 + Q1 + · · · + Qi−1 , i ≥ 0, c ∈ CL (D, G − Ri )\CL (D, G − Ri − Qi ) ⇒ A = supp(c) ∈ ΓQi (C + Ri ), c ∈ CΩ (D, G + Ri )\CΩ (D, G + Ri + Qi ) ⇒ A = supp(c) ∈ ΓQi (C + Ri ). Repeated application of Theorem 1 gives the following bound for the minimum distance. Corollary 1. (Duursma-Park bound dDP for the minimum distance) Let deg A ≥ w for A ∈ ΓQi (C + Ri ), i ≥ 0 Then a code with designed minimum support in the divisor class C and with divisor D disjoint from Aw − A1 as required by the application of Theorem 1, has minimum distance at least w. The Beelen bound dB for the minimum distance [1, Theorem 7, Remark 5] is similar but assumes in the application of Theorem 1 that Aw − A1 has support in the single point Qi , for a seqeunce {A1 ≤ A2 ≤ · · · ≤ Aw } ⊂ ΔQi (C + Ri ). The Beelen bound dB has a weaker version (which we will denote by dB0 ) that assumes that moreover A1 has support in the point Qi . The simple or first order bound uses the further specialization that Qi = P for i ≥ 0. For G = K + C, such that deg C > 0, and for D disjoint from P , the simple order bound becomes d(CL (D, G)⊥ ) ≥ min #(ΔP (C + iP ) ∩ {jP : j ≥ 0}). i≥0
For G = mP this is the original Feng-Rao bound.
An Extension of the Order Bound for AG Codes
15
The purpose of the order bound is to improve on uniform bounds such as the floor bound. In rare occasions the Beelen bound is less than bounds of floor type [1], [7] (this is the case for a single code in Table 1). Compared to the Beelen bound, the ABZ order bound dABZ [7, Theorem 6.6] allows Aj+1 − Aj ∈ {kQi : k > 0} for a single j in the range j = 1, . . . , w − 1. With that modification the ABZ order bound always is at least the ABZ floor bound dABZ [7, Theorem 2.4] which is the best known bound of floor type. In general, dB0 ≤ dB ≤ dABZ ≤ dDP and dABZ ≤ dABZ .
4
Extension of the Order Bound
We seek to exploit the argument in the order bound a step further by using a partition CL (D, G − iP )\CL (D, G − (i + 1)P ) = ∪j≥0 CL (D, G − iP − jQ)\ (CL (D, G − (i + 1)P − jQ) ∪ CL (D, G − iP − (j + 1)Q). We apply the argument in the setting of the divisor semigroups. For a finite set S of rational points, let ΓS = ∩P ∈S ΓP , and let ΓS (C) = {A ∈ ΓS : A − C ∈ ΓS }, so that ΓS (C) = ∩P ∈S ΓP (C). Let ΔS (C) = ∪P ∈S ΔP (C). Proposition 3. For a divisor class C, for a finite set of rational points S, and for P ∈ S, ΓS (C) ∩ ΓP = ∪i≥0 ΓS∪P (C + iP ). Proof. Since P ∈ S, P ∈ ΓS , and, for A − C − iP ∈ ΓS , A − C ∈ ΓS . Therefore ΓS∪P (C + iP ) = ΓS (C + iP ) ∩ ΓP (C + iP ) ⊆ ΓS (C) ∩ ΓP . For the other inclusion, let A ∈ ΓS (C)∩ΓP . Then A−C ∈ ΓS and L(A−C) = 0. Choose i ≥ 0 maximal such that L(A − C) = L(A − C − iP ). Then A − C − iP ∈ ΓS∪P . As a special case Γ (C) ∩ ΓP = ∪i≥0 ΓP (C + iP ). Theorem 1 gives a lower bound for deg A, for A ∈ ΓP (C). Combination of the lower bounds for C ∈ {C + iP : i ≥ 0} then gives a lower bound for deg A, for A ∈ Γ (C) ∩ ΓP . In combination with Γ (C + iP ) ∩ ΓQ = ∪j≥0 Γ{P,Q} (C + iP + jQ), we obtain, for S = {P, Q}, Γ (C) ∩ ΓS = ∪i,j≥0 ΓS (C + iP + jQ). The next theorem gives a lower bound for deg A, for A ∈ ΓS (C). For S = {P, Q}, combination of the lower bounds for C ∈ {C + iP + jQ : i, j ≥ 0} then gives a lower bound for deg A, for A ∈ Γ (C) ∩ ΓS . Theorem 2. (Main theorem) Let {A1 ≤ A2 ≤ · · · ≤ Aw } ⊂ ΔS (C) be a sequence of divisors with Ai ∈ ΔPi (C), Pi ∈ S, for i = 1, . . . , w, such that Ai − Pi ≥ Ai−1 for i = 2, . . . , w. Then deg A ≥ w, for every divisor A ∈ ΓS (C) with support disjoint from Aw − A1 .
16
I. Duursma and R. Kirov
Lemma 1. For D ∈ ΓP (C), ΔP (C) ⊆ ΔP (D ). Proof. For D − C ∈ ΓP , if A − C ∈ ΓP then A − D ∈ ΓP . Lemma 2. Let lC (A) = l(A) − l(A − C). Then A ∈ ΔP (C) ⇔ lC (A) − lC (A − P ) = 1. Proof. lC (A) − lC (A − P ) = 1 ⇔ (l(A) − l(A − P )) − (l(A − C) − l(A − C − P )) = 1 ⇔ l(A) − l(A − P ) = 1 ∧ l(A − C) − l(A − C − P ) = 0 ⇔ A ∈ ΔP (C). Proof. (Theorem 2) For A = D ∈ ΓS (C) ⊆ ΓPi (C) and for Ai ∈ ΔPi (C), Ai ∈ ΔPi (D ), by Lemma 1, and lD (Ai ) = lD (Ai − Pi ) + 1 by Lemma 2. With Ai − Pi ≥ Ai−1 , there exists a natural map L(Ai−1 )/L(Ai−1 − D ) −→ L(Ai − Pi )/L(Ai − Pi − D ). With (Ai − Pi ) − Ai−1 disjoint form D , the map is injective, since L(Ai−1 ) ∩ L(Ai − Pi − D ) = L(Ai−1 − D ). So that lD (Ai − Pi ) ≥ lD (Ai−1 ). Iteration over i yields deg D ≥ lD (Aw ) ≥ lD (Aw−1 ) + 1 ≥ · · · ≥ lD (A1 ) + w − 1 ≥ lD (A1 − P1 ) + w. To obtain lower bounds with the theorem, we need to construct sequences of divisors in ΔS (C). In the next section we discuss how this can be done effectively.
5
Efficient Computation of the Bounds
The main theorem can be used efficiently to compute coset distances, which in turn can be used to compute two-point code distances. To compute with a certain curve we use, for given points P and Q, a function dP,Q that encapsulates some geometric properties of the curve [7], see also [2], [14]. Lemma 3. Let B be a divisor and let P, Q be distinct points. There exists a unique integer k(B, P, Q) such that B + k P ∈ ΓQ if and only if k ≥ k(B, P, Q) Proof. This amounts to showing that B + kP ∈ ΓQ implies B + (k + 1)P ∈ ΓQ . Now use that ΓQ is a semigroup and that P ∈ ΓQ . Let us restrict our attention to two-point divisors. Using the previous notation define the following integer valued function. dP,Q (a) = k(aQ, P, Q) + a
An Extension of the Order Bound for AG Codes
17
Theorem 3. For a divisor A with support in {P, Q}, A ∈ ΓQ ⇔ deg(A) ≥ dP,Q (AQ ). Proof. Let A = kP +aQ, then deg(A) ≥ dP,Q (a) is equivalent to k ≥ k(aQ, P, Q), which by definition is aQ + kP ∈ ΓQ . This property makes the d function a powerful computational tool. Moreover, for m such that mP ∼ mQ, d is defined modulo m. In general, the function d depends on the ordering of the points P and Q, but it is easy to see that the functions dP,Q and dQ,P satisfy the relation dP,Q (a) = a + b if and only if dQ,P (b) = a + b. The function d = dP,Q and the parameter m are enough to compute sequences of two-point divisors Ai ∈ ΔP (C) as required by Theorem 1. A simple application of the Riemann-Roch Theorem give us that we can restrict our search to a finite range of divisors, since, for A ∈ ΔP (C), min{0, deg C} ≤ deg A ≤ max{2g − 1, deg C + 2g − 1}. Using mP ∼ mQ, we can assume moreover, for A = AP P + AQ Q, that AQ ∈ [0, . . . , m − 1]. To find long sequences of divisors Ai ∈ ΔP (C) we use a graph theory weightmaximizing algorithm on a rectangular grid T , such that Ti,j is a path of longest length up to the divisor A with degree i and AQ = j (i.e. A = iP + j(Q − P )). Computing bounds for the coset C(C)\C(C + P ) with Theorem 1 1. Initialize the first row of T (corresponding to degree i = min{0, deg C} − 1) with 0. 2. Update each row of T successively by the rule Ti,j = max{Ti−1,j−1 , Ti−1,j + BPi,j } where BPi,j is 1 if A ∈ ΔP (C) and 0 otherwise, for the divisor A of degree i with AQ = j. BPi,j is computed using Theorem 3. 3. Iterate up to the last row (corresponding to degree i = max{2g − 1, deg C + 2g − 1}). 4. Return the maximum value in the last row. Using the algorithm we compute bounds for the cosets C(C, S)\C(C + P, S) and C(C, S)\C(C + Q, S) over all possible divisors C. We store them in arrays CP and CQ where the row denotes the degree of C and the column is CQ (mod m). For rational points P and Q such that dP,Q = dQ,P , we can save some work and obtain the table CQ from CP with the relabeling CQi,j = CPi,j for j = i − j (mod m). After computing the coset bounds, we traverse all possible coset filtrations of all codes to find bounds for the minimum distances. We use a graph theory flow-maximizing algorithm on a rectangular grid D, such that Di,j is a bound for the minimum distance of a code C(C) with C of degree i and CQ = j. Computing bounds for the distances of all codes C(C) using P -coset and Q-coset tables 1. Initialize the last row of D (corresponding to degree i = 2g) with 2g.
18
I. Duursma and R. Kirov
2. Update each row of D successively by the rule Di,j = max{min{Di+1,j , CPi,j }, min{Di+1,j+1 , CQi,j }} 3. Iterate up to the first row (corresponding to degree i = 0). Theorem 2 can be used to obtain improved lower bounds for the cosets C(C)\C (C + P ) and C(C)\C(C + Q). The basic algorithm is the same as before with an extra step of using the table of all bounds for C(C, S)\ ∪P ∈S C(C + P, S) to produce the P - and Q-coset tables. Computing P - and Q-coset bounds using Theorem 2 1. Compute table CS with bounds for C(C, S)\ ∪P ∈S C(C + P, S). This step is done in exactly the same way as the computation for Theorem 1, but the new table T for each C has the update rule Ti,j = max{Ti−1,j−1 + BQi,i−j , Ti−1,j + BP i, j} 2. Initialize CP and CQ at the top row (corresponding to degree i = 2g) with 2g. 3. Compute CP in decreasing row order using the rule CPi,j = min{CPi+1,j+1 , Ti,j }. 4. Compute CQ in decreasing row order using the rule CQi,j = min{CQi+1,j , Ti,j }. Once the P - and Q-coset bounds are known, exactly the same minimizing flow method as the one used for obtaining dDP can be used for an improved bound, denoted dDK .
6
Tables for the Suzuki Curve over F32
The Suzuki curve over the field of q = 2q02 elements is defined by the equation y q + y = xq0 (xq + x). The curve has q 2 + 1 rational points and genus g = q0 (q − 1). The semigroup of Weierstrass nongaps at a rational point is generated by {q, q+q0 , q+2q0 , q+2q0 +1}. For any two rational points P and Q there exists a function with divisor (q + 2q0 + 1)(P − Q). Let m = q + 2q0 + 1 = (q0 + 1)2 + q0 2 , and let H be the divisor class containing mP ∼ mQ. The canonical divisor K ∼ 2(q0 − 1)H. For the Suzuki curve over the field of 32 elements we use q0 = 4, q = 32, g = 124, m = 41, K ∼ 6H. The action of the automorphism group on the rational points of the curve is 2-transitive, so that dP,Q (a) = d(a) does not depend on the choice of the points P and Q. For the Suzuki curve with parameter q0 , the d function is given by d(k) = (q0 − a)(q − 1) where a,b are the unique integers such that |a| + |b| ≤ q0 and k ≡ a(q0 + 1) + bq0 − q0 (q0 + 1) (mod m). A detailed explanation of the geometry behind this result
An Extension of the Order Bound for AG Codes
19
can be found in [7]. To store the function d as a list, we go through all integers a and b with |a| + |b| ≤ q0 . To compute d(k) for a single value k, we may use d(k) = (q − 1)(2q0 − q(k − 1, 2q0 + 1)+ − q(r(k − 1, 2q0 + 1), q0 + 1) − r(r(k − 1, 2q0 + 1), q0 + 1)), where q(a, b) and r(a, b) are the quotient and the remainder, respectively, when a is divided by b, and k − 1 is taken modulo m. Table 1. Number of improvements of one bound over another (top), and the maximum improvement (bottom), based on 10168 two-point codes for the Suzuki curve over F32 Floor bounds dBP T dGOP dBP T
Order bounds
dLM dGST dABZ
dB0
dB dABZ dDP dDK
6352 6352 6352 6352 6352 6352 · 4527 4551 4597 5260 5264
6352 6352 6352 5264 5264 5274
dLM dGST dABZ
· · ·
· 2245 2852 4711 4729 · · 2213 4711 4729 · · · 4665 4683
4731 4731 4757 4731 4731 4757 4685 4685 4711
dB0 dB dABZ dDP
· · · ·
1 1 · ·
1 1 · ·
1 1 · ·
· 176 · · · · · ·
374 412 1643 198 236 1565 · 38 1404 · · 1366
dGOP dBP T
1 ·
8 7
13 12
21 20
33 32
33 32
33 32
33 32
33 32
dLM dGST dABZ
· · ·
· · ·
7 · ·
15 8 ·
28 24 24
28 24 24
28 24 24
28 24 24
28 24 24
dB0 dB dABZ dDP
· · · ·
1 1 · ·
1 1 · ·
1 1 · ·
· · · ·
1 · · ·
5 5 · ·
5 5 1 ·
6 6 6 6
Table 2. Improvements of dDP and dDK over dB for 10168 two-point codes dDK − dDP = dDP − dB
0
1
2
3
4
= 0 8603 656 356 198 50 1 92 12 0 0 0 2 33 4 0 0 0 3 74 4 1 0 0 4 0 0 0 0 0 5 0 16 0 0 0
5
6
6 63 0 0 0 0 0 0 0 0 0 0
20
I. Duursma and R. Kirov
Table 3. Optimal codes (For given deg C = 2, . . . , 124 (= g), dDK is the maximum lower bound for a two-point code defined with C = CP P + CQ Q, and CQ gives values for which the maximum is achieved. Suppressed are divisors that define subcodes with the same minimum distance as already listed codes. Exchanging P and Q gives a similar code and listed are only divisors with CQ (mod m) ≤ CP (mod m). The last columns give the amount by which dDK exceeds similarly defined maximum lower bounds for dDP and dB , respectively. deg C dDK CQ 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
31 31 31 32 32 32 38 38 38 44 44 44 44 48 48 48 50 50 51 54 54 54 54 54 54 54 56 56 58 58 58 59 61 62 62 62 62 62 62 62 62
· · · [0] · · · [2] · · · [1] · [6] · · · [5] · · · [8] · · [9] · [10] · · · · · · · [12] · · [13] 2 [10] 2 2 [12] 3 [13] 3 [10 14] 3 [17] 3 3 2 2 2 2 2 [5]
deg C dDK CQ · · · · · · 3 3 3 · · · · 1 · · 1 · 1 3 3 1 3 · · · 1 · 2 2 2 3 3 3 3 3 2 2 2 2 2
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
62 62 63 64 64 64 68 68 70 72 72 72 74 75 75 75 75 77 77 78 79 80 80 81 83 85 85 85 87 87 87 87 89 91 91 91 91 93 93 93 93
· · [5] · [0] · · · [2] · · [2, 4] · [1] · [4, 6] · · [7] 2 [7] 2 · · · [8] · · [10] · [10] 1 [11] 2 [8] · [5] 1 [10] 1 [11] 2 1 [5, 9] 1 [10] 3 2 [7, 16] · · [10] 1 [10, 11] 3 1 [8] 1 [5, 7] · [10, 11] 2 2 1 [7] · [5]
deg C dDK CQ · · · · · · 2 2 2 · · · 2 3 2 · · · · · 1 2 · 1 1 2 1 1 3 2 · · 1 3 1 1 · 2 2 1 1
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124
94 94 96 96 96 96 99 100 102 102 103 104 106 106 106 108 110 110 110 112 112 114 114 114 116 118 118 118 120 120 121 121 122 123 124 124 124 125 126 127 128
[1] [0, 5] [1] [0]
[1, 2] [2, 3] [3] [5] [2] [7]
[2, 6] [7]
[6] [2] [7] [9] [2, 6, [7]
[6] [2] [7] [2, [2, [2, [2,
6, 6, 6, 6,
[1, 5, [1] [1] [1] [0]
· · · · · · · · · · · · · · · · · · · · · 1 1 · 7] · 2 2 1 1 1 2 9] 1 7, 10, 15] · 7] 1 7] 2 1 10, 19] · · · · ·
1 · 1 · · · · · · · · · · · · · · · · · · 1 1 · · 2 2 1 1 1 2 1 · 1 2 1 · · · · ·
An Extension of the Order Bound for AG Codes
21
Table 1 compares the Goppa bound dGOP , the base point bound dBP T (an improvement of the Goppa bound by one whenever C has a base point), the floor bounds dLM [16, Theorem 3], dGST [11, Theorem 2.4], dABZ [7, Theorem 2.4], and the order bounds dB0 , dB , dABZ , dDP , dDK . Each bound was optimized over the full parameter space corresponding to that bound. The computations for the Suzuki curve of genus g = 124 over F32 were very efficient, computations for the 2g · m = 10168 two-point codes with Goppa designed minimum distance in the range 0, . . . , 2g − 1, took less than 10 minutes on a desktop PC for any given bound. As can be seen, the Beelen bound dB offers only slight improvement over the weaker Beelen bound dB0 . Similar for the improvement of the Duursma-Park bound dDP over the weaker ABZ bound dABZ . Table 2 gives a further breakdown of the 236 codes for which dDP improves dB and the 1366 codes for which dDK improves dDP . The improvements are by at most 5 and 6, respectively. The 63 codes with dDK = dDP + 6 all have dDK = 62 which agrees with the actual minimum distance (namely realized by a choice of 62 points with P1 + . . . + P62 ∼ 31P + 31Q). For each of the 63 codes, the coset with C = 23P + 23Q is the unique coset where Theorem 1 fails to give a lower bound above 56. For the same coset, Theorem 2 gives a lower bound of 62, for example with a sequence A1 , . . . , A62 of type {A1 = 24Q, . . . , A24 = 114P + 24Q} (⊆ ΔP (C)) ∪ {A25 = 114P + 25Q, . . . , A38 = 114P + 40Q} (⊆ ΔQ (C)) ∪ {A39 = 156P, . . . , A62 = 270P } (⊆ ΔP (C)). Table 1 and Table 2 do not show whether the improvements occur for good codes or for poor codes. For Table 3 we select for each degree deg C, i.e. for each given Goppa designed minimum distance, the optimal code with respect to each of the bounds dB , dDP , dDK and we compare those. In this case, depending on the degree, the improvements of dDK , obtained with Theorem 2, over the bounds dB and dDP vary between 0 and 3.
References 1. Beelen, P.: The order bound for general algebraic geometric codes. Finite Fields Appl. 13(3), 665–680 (2007) 2. Beelen, P., Tuta¸s, N.: A generalization of the Weierstrass semigroup. J. Pure Appl. Algebra 207(2), 243–260 (2006) 3. Bras-Amor´ os, M., O’Sullivan, M.E.: On semigroups generated by two consecutive integers and improved Hermitian codes. IEEE Trans. Inform. Theory 53(7), 2560– 2566 (2007) 4. Campillo, A., Farr´ an, J.I., Munuera, C.: On the parameters of algebraic-geometry codes related to Arf semigroups. IEEE Trans. Inform. Theory 46(7), 2634–2638 (2000) 5. Carvalho, C., Torres, F.: On Goppa codes and Weierstrass gaps at several points. Des. Codes Cryptogr. 35(2), 211–225 (2005)
22
I. Duursma and R. Kirov
6. Chen, C.-Y., Duursma, I.M.: Geometric Reed-Solomon codes of length 64 and 65 over F8 . IEEE Trans. Inform. Theory 49(5), 1351–1353 (2003) 7. Duursma, I., Park, S.: Coset bounds for algebraic geometric codes, 36 pages (2008) (submitted), arXiv:0810.2789 8. Duursma, I.M.: Majority coset decoding. IEEE Trans. Inform. Theory 39(3), 1067– 1070 (1993) 9. Duursma, I.M.: Algebraic geometry codes: general theory. In: Munuera, C., Martinez-Moro, E., Ruano, D. (eds.) Advances in Algebraic Geometry Codes. Series on Coding Theory and Cryptography. World Scientific, Singapore (to appear) 10. Feng, G.L., Rao, T.R.N.: Decoding algebraic-geometric codes up to the designed minimum distance. IEEE Trans. Inform. Theory 39(1), 37–45 (1993) 11. G¨ uneri, C., Sitchtenoth, H., Taskin, I.: Further improvements on the designed minimum distance of algebraic geometry codes. J. Pure Appl. Algebra 213(1), 87– 97 (2009) 12. Hansen, J.P., Stichtenoth, H.: Group codes on certain algebraic curves with many rational points. Appl. Algebra Engrg. Comm. Comput. 1(1), 67–77 (1990) 13. Høholdt, T., van Lint, J.H., Pellikaan, R.: Algebraic geometry of codes. In: Handbook of coding theory, vol. I, II, pp. 871–961. North-Holland, Amsterdam (1998) 14. Kim, S.J.: On the index of the Weierstrass semigroup of a pair of points on a curve. Arch. Math. (Basel) 62(1), 73–82 (1994) 15. Kirfel, C., Pellikaan, R.: The minimum distance of codes in an array coming from telescopic semigroups. IEEE Trans. Inform. Theory 41(6, part 1), 1720–1732 (1995); Special issue on algebraic geometry codes 16. Lundell, B., McCullough, J.: A generalized floor bound for the minimum distance of geometric Goppa codes. J. Pure Appl. Algebra 207(1), 155–164 (2006) 17. Maharaj, H., Matthews, G.L.: On the floor and the ceiling of a divisor. Finite Fields Appl. 12(1), 38–55 (2006) 18. Matthews, G.L.: Weierstrass pairs and minimum distance of Goppa codes. Des. Codes Cryptogr. 22(2), 107–121 (2001) 19. Matthews, G.L.: Codes from the Suzuki function field. IEEE Trans. Inform. Theory 50(12), 3298–3302 (2004)
Sparse Numerical Semigroups C. Munuera1 , F. Torres2 , and J. Villanueva2 1
2
Dept. of Applied Mathematics, University of Valladolid Avda Salamanca SN, 47012 Valladolid, Castilla, Spain IMECC-UNICAMP, Cx.P. 6065, 13083-970, Campinas-SP, Brasil
Abstract. We investigate the class of numerical semigroups verifying the property ρi+1 − ρi ≥ 2 for every two consecutive elements smaller than the conductor. These semigroups generalize Arf semigroups.
1
Introduction
Let N0 be the set of nonnegative integers and H = {0 = ρ1 < ρ2 < · · ·} ⊆ N0 be a numerical semigroup of finite genus g. This means that the complement N0 \H is a set of g integers called gaps, Gaps(H) = {1 , . . . , g }. Then c = g + 1 is the smallest integer such that c + h ∈ H for all h ∈ N0 . The number c is called the conductor of H and clearly c = ρc−g+1 . Semigroups play an important role in the theory of one point algebraic geometry codes. In fact, the computation of the order bound on the minimum distance of such a code C(X , D, mQ) –constructed from a curve X , a divisor D on X and a point Q of X – involves computations in the Weierstrass semigroup associated to Q (see Section 5 and [6] for more details). Some families of semigroups have been studied in deep from this point of view. Relevant for this article are Arf and inductive semigroups. Let us remember that the semigroup H is said to be Arf if ρi + ρj − ρk ∈ H for i ≥ j ≥ k ≥ 0. The study of Arf semigroups has been carried in the context of Coding Theory, [3],[5], and Local Algebra Theory, [1], [2], [7], among others (see [12] and the references therein). Inductive semigroups are a particular class of Arf semigroups ([5]). A sequence (Sn ) of semigroups is called inductive if there exist sequences of positive integers (an ) and (bn ) such that S1 = N0 and for n > 1, Sn = an Sn−1 ∪ {m ∈ N0 : m ≥ an bn }. A semigroup is called inductive if it is a member of an inductive sequence. Note that the Arf condition implies (see Corollary 1 below) i − i−1 ≤ 2,
for i = 2, . . . , g,
(1)
or equivalently ρi+1 − ρi ≥ 2 for i = 1, . . . , c − g. There are numerical semigroups satisfying (1) which are not Arf, e.g. those given in Example 1. This fact provides a motivation to introduce and study the class of semigroups for which (1) holds. These will be called sparse. Let us remark that there exist other generalizations of Arf semigroups, proposed by different authors. For example, in ([4]) the Arf property is extended to a widely class of semigroups via the concept of patterns of semigroups. The sparse and pattern properties are disjoint in general. M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 23–31, 2009. c Springer-Verlag Berlin Heidelberg 2009
24
C. Munuera, F. Torres, and J. Villanueva
From a geometrical point of view, sparse semigroups are closely related to Weierstrass semigroups arising in double covering of curves, cf. [14]. Its arithmetical structure is strongly influenced by the parity of its largest gap g . Recall that for all semigroups we have g ≤ 2g − 1 ([9]). When equality holds then the semigroup is called symmetric. For even g we consider the family Hg,k := {H : H is a sparse semigroup of genus g and g = 2g − 2k}.
(2)
For each H ∈ Hg,k we prove that g ≤ 6k − 3 (Theorem 2); this was noticed by Barucci et al. [2, Thm. I.4.5] and Bras-Amor´ os [3, Prop. 3.2] when k = 1. The approach used here is quite different from theirs. In particular we show that ∪g Hg,k is finite. For g odd, write g = 2g − (2γ + 1) with γ ≥ 0 an integer. We prove (Theorem 3) that for g large enough with respect to γ, H is of the form ¯ ∪ {h ∈ N : h ≥ 2g − 2γ + 1}, H = 2H
(3)
¯ is a semigroup of genus γ uniquely determined by H. When γ = 0 then where H H is hyperelliptic, that is 2 ∈ H (which has already been noticed in [2], [3], [5]). Finally in Section 5 we observe that sparse semigroups of genus large enough satisfy the acute property in the sense of Bras-Amor´ os [3], and we obtain the order of such semigroups as a function of g and g (Proposition 5).
2
Sparse Semigroups
Let H = {0 = ρ1 < ρ2 < · · ·} be a semigroup of genus g and conductor c. Our starting point is the following result (see [5], Proposition 1). Proposition 1. H is Arf iff for all integers i ≥ j ≥ 1, it holds that 2ρi −ρj ∈ H. Corollary 1. The set of gaps of an Arf semigroup H of genus g satisfy (1). Proof. Let Gaps(H) = {1 , . . . , g }. Suppose there exists i ∈ {2, . . . , g} such that i − i−1 > 2. Then i − 2 and i − 1 are non-gaps of H and Proposition 1 implies 2(i − 1) − (i − 2) = i ∈ H, a contradiction. Example 1. Let g ≥ 7. The set H = N0 \ {1, . . . , g − 3, g − 1, g + 1, g + 2} = {0, g − 2, g, g + 3, g + 4, . . .} is a semigroup of genus g satisfying (1). Since 2g − (g − 2) = g + 2 ∈ H it is not Arf. Definition 1. A semigroup H = {0 = ρ1 < ρ2 < . . .} of genus g is called sparse if the set of its gaps, Gaps(H), satisfies (1). Equivalently, H is sparse if ρi+1 − ρi ≥ 2 for i = 1, . . . , c − g. Obviously, every Arf semigroup is sparse. Other remarkable examples of sparse semigroups are as follows.
Sparse Numerical Semigroups
25
¯ be an arbitrary semigroup of conductor c¯ and genus γ. Example 2. Let H ¯ ∪{h ∈ N0 : h ≥ R} (a) Given two integers a ≥ 2, R ≥ 0, the semigroup H := aH is sparse. In particular, inductive semigroups are sparse. Note that when a = 2 ¯ is Arf. and R ≥ 2¯ c, then H is Arf iff H ¯ ∪ {h ∈ N : h ≥ (b) Let g be an integer such that g ≥ γ + c¯. Then H = 2H 2g − 2γ + 1} is a sparse semigroup of genus g with g = 2g − (2γ + 1), whose number of even gaps is γ. In Section 4 we will show that every sparse semigroup with g odd and genus large enough with respect to γ arise in this way. For a semigroup H, let m = m(H) = ρ2 − 1. When H = N0 , m is the largest integer such that m = m. Then m(H) = 1 iff H is hyperelliptic. Proposition 2. Let H be a symmetric semigroup of genus g > 0. The following statements are equivalent: 1. H is Arf; 2. H is sparse; 3. m(H) = 1. Proof. The symmetry property implies g−1 = g − ρ2 ; then from inequalities (1) it follows that ρ2 = 2. This shows that (2) implies (3). If 2 ∈ H, assertion (1) follows from Example 2 as H = 2N0 ∪ {n ∈ N : n ≥ 2g}. Proposition 3. Let H be a semigroup with m(H) = 2 and g > 2. The following statements are equivalent: 1. H is Arf; 2. H is sparse; 3. H = 3, (3g + 2)/2 , (3g + 5)/2 . In this case g = (3g − 1)/2 . Proof. If H is sparse, by induction i = (3i − 1)/2 for i = 1, . . . , g. Then H = 3N0 ∪ {n ∈ N : n ≥ (3g + 2)/2 }. The following results show a strong restriction on the arithmetic of sparse semigroups. They generalize the statement (2) implies (3) in Proposition 2. Lemma 1. Let H be a sparse semigroup of genus g with Gaps(H) = {1 , . . . , g }. If g−i = g−i+1 − 1, then 2g−i < 3(g − i). Proof. The set {1, . . . , g−i } contains g − i gaps and t = g−i − (g − i) nongaps. Write {1, . . . , g−i } ∩ H = {ρ2 , . . . , ρt+1 }. All elements in the set {g−i − ρ2 , · · · , g−i − ρt+1 } ∪ {g−i+1 − ρ2 , · · · , g−i+1 − ρt+1 } are distinct gaps. If otherwise g−i −ρr = g−i+1 −ρs = g−i +1−ρs , then ρr +1 = ρs contradicting the fact of H being sparse. Thus 2(g−i −(g −i))+1 ≤ g −i and hence 2g−i ≤ 3(g −i)−1. Let m = m(H) = ρ2 − 1 be as above and let K = K(H) = 2g − g .
26
C. Munuera, F. Torres, and J. Villanueva
Theorem 1. Let H be a sparse semigroup of genus g with Gaps(H) = {1 , . . . , g }. Then 1. i ≥ 2i − K for all i = 1, . . . , g. In particular m ≤ K. 2. If i = 2i − K then j = 2j − K for j = i, . . . , g. 3. If g ≥ 2K − 1 then i = 2i − K for all i = 2K − 2, . . . , g. Proof. (1) By the sparse property, g − i ≤ 2(g − i) so that i ≥ 2i − K. Then 2(j + 1) − K ≤ j+1 ≤ j + 2 and the result follows. (2) is clear. (3) Since g = 2g − K by definition of K, it is enough to prove that i+1 = i + 2 for i = 2k − 2, . . . , g − 1. We use induction on j = g − i and thus we have to prove the statement g−j = g−j+1 − 2 for j = 1, . . . , g − 2K + 2. If g−1 = g − 2, the sparse property implies g−1 = g − 1. Then, according to Lemma 1, 2(g − 1) = 2g−1 < 3(g − 1) and so g < 2K − 1, a contradiction. Suppose that the assertion is true for all t = 1, . . . , j (1 ≤ j ≤ g − 2K + 1) and let us prove it for j + 1. If g−j−1 = g−j − 1, then from the Lemma it follows that 2(g−j − 1)) = 2g−j−1 < 3(g − i − 1). By induction hypothesis we have g−j = g − 2j, hence j > g − 2K + 1, which is also a contradiction. Example 3. Let H be a semigroup. According to the Theorem, m(H) ≤ K(H). When equality holds, then H is sparse iff i = 2i − K for i = K, . . . , g. Note that in this case H is also Arf. If K = 2k, then H is sparse iff H has exactly (g − k) even gaps. Here observe that 2(2k + 1) ≥ 2g − 2k + 2 so that g ≤ 3k. This example will be generalized in Theorem 2.
3
The Largest Gap: Even Case
As we have seen, the structure of sparse semigroups H = N0 is strongly influenced by its largest gap g , and in particular by the parity of g . Let us study now the case in which g is even and thus m = m(H) = ρ2 − 1 ≥ 2. Theorem 2. Let H be a sparse semigroup of genus g with g = 2g−K = 2g−2k. If g ≥ 4k − 1, then g ≤ 6k − m(H) − 1. In particular, the family of semigroups Hk = g≥0 Hg,k , where Hg,k is as in (2), is finite. Proof. From Theorem 1(3), the gaps 6k − 2 = 4k−1 < . . . < 2g − 2k = g , are all of them even numbers. Thus the gaps 3k − 1 = I−(g−4k+1) < · · · < g − k = I are consecutive integers and I ≤ 4k − 1. By the sparse property, 4k−1 − I ≤ 2(4k − 1 − I), hence 2I ≤ g + k (∗). If m ≥ g − k, then g ≤ 3k by Theorem 1(1), so that k = 1 and g = 3. If m < g − k, then m < 3k − 1. By applying the sparse property once again, it follows that 3k − 1 − m ≤ 2(I − g + 4k − 1 − m), i.e., 2g ≤ 2I + 5k − m − 1. Finally (∗) implies g ≤ 6k − m − 1. Example 4. In [2] and [3], Barucci et al. and Bras-Amor´os proved that there exist only two Arf semigroups with g = 2g − 2, namely H = {0, 3, 4, . . .} and H = {0, 3, 5, 6, 7, . . .}. By the above Theorem, the following statements are equivalent:
Sparse Numerical Semigroups
27
1. H is Arf with g = 2g − 2; 2. H is sparse with g = 2g − 2; 3. g = 2 and H = {0, 3, 4, . . .}, or g = 3 and H = {0, 3, 5, 6, 7, . . .} = 3N0 ∪{h ∈ N : h ≥ 5}. Example 5. By using the above results we can compute all sparse semigroups with g = 2g − 4. First note that all these semigroups have genus g ≥ 4. Also g ≤ 7. If otherwise g ≥ 4k = 8, Theorem 2 implies g ≤ 11 − m and hence m = 2 or m = 3. In both cases 12 ∈ H. Since 7 = 10, we have 12 ≥ 2g which is a contradiction. Thus 4 ≤ g ≤ 7. Let g > 4. By Theorem 1(1) we have 2 ≤ m ≤ 4. If m = 2 then Proposition 3 allows us to compute H. Let m = 3. We shall show that g = 5 and H = {0, 4, 7, 8, 9, 10, . . .} = 4N0 ∪ {h ∈ N : h ≥ 7}. In fact, here ρ2 = 4, 4 = 5. If 6 ∈ H, then {4, 6, 8, 10, . . .} ⊆ H, which implies 2g − 4 ≤ 2. Thus 5 = 6. If g > 5, 8 ∈ Gaps(H) by the sparse property. Finally the case m = 4 is computed via Example 3. We subsume all these computations as follows. Let H be a semigroup with g = 2g − 4. The following statements are equivalent: 1. H is Arf; 2. H is sparse; 3. g = 4 and Gaps(H) = {1, 2, 3, 4}, or g = 5 and H = 3, 8, 10, or g = 6 and H = 3, 10, 11, or g = 7 and H = 3, 11, 13, or g = 5 and H = 4, 7, 9, 10, or m(H) = 4 and hence H has exactly (g − 2) even gaps for g = 5, 6, 7. Remark 1. We can also study the cardinality of the family (2). Consider the semigroups H of genus g with g (H) = 2g − 2k and m(H) = 2k. We have 2k ≤ g ≤ 3k and we obtain exactly k + 1 of such semigroups (cf. Example 3).
4
The Largest Gap: Odd Case
Let us study now the case g odd. To do that we shall need some results concerning γ-hyperelliptic semigroups, which are those semigroups having exactly γ even gaps (cf. [13], [14]). Proposition 4. Let H be a semigroup. If there exists an integer γ ≥ 0 such that ρ2γ+2 = 6γ + 2 and g ≥ 6γ + 4, then H can be written in the form ¯ ∪ {uγ , . . . , u1 } ∪ {n ∈ N : n ≥ 2g} , H = 2H ¯ is a semigroup of genus γ and uγ < · · · < u1 ≤ 2g − 1 are precisely the where H γ odd nongaps of H up to 2g. Proof. See [13], Theorem 2.1. ¯ = Observe that H in the above Proposition has exactly γ even nongaps and H {h/2 : h ∈ H, h ≡ 0 (mod 2)}. The main result of this section is the following.
28
C. Munuera, F. Torres, and J. Villanueva
Theorem 3. Let H be a sparse semigroup with g = 2g − (2γ + 1). If g ≥ 6γ + 4 ¯ ∪ {uγ , . . . , u1 } ∪ {n ∈ N : n ≥ 2g}, where H ¯ is a semigroup of then H = 2H genus γ and ui = 2g − 2i + 1 for i = 1, . . . , γ. Proof. By Theorem 1 we have 2K−1 = 3K − 2 and 2K = 3K where K = 2γ + 1. Thus ρK+1 = 3K − 1, hence ρ2γ+2 = 6γ + 2 and the result follows. A semigroup as in the Theorem will be called ordinary γ-hyperelliptic. Remark 2. The lower bound on the genus in Theorem 3 is necessary as the following example shows. Let γ ≥ 1 be an integer and consider the Arf semigroup H = 3, 6γ + 2, 6γ + 4. H is 2γ-hyperelliptic of genus g = 4γ + 1 but g = 2g − (2γ + 1) (cf. Proposition 3). Remark 3. There exist γ-hyperelliptic sparse semigroups of genus g ≥ 2γ + 1. Let H be the semigroup defined in Example 3. Here i = i for i = 1, . . . , K, i = 2i − K for i = K, . . . , g, and K = 2γ + 1. Corollary 2. Let γ ≥ 0 be an integer. Let H be a sparse semigroup of genus ¯ = {h/2 : h ∈ H, h ≡ 0 (mod 2)}. g ≥ 6γ + 4 with g = 2g − (2γ + 1). Let H ¯ Then H is Arf if and only if so is H. ¯ ∪{h ∈ N : h ≥ 2g−2γ +1}. Since R = 2g−2γ +1 ≥ Proof. We can write H = 2H 2¯ c + 1 (note 2γ ≥ c¯), the result follows from Example 2. Example 6. If g ≥ 4 and g = 2g − 1, then γ = 0 and we find a new proof of Proposition 2. Remember that there exists just one semigroup of genus 1, namely {0, 2, 3, . . .} and two semigroups of genus 2, {0, 2, 4, 5, . . .} and {0, 3, 4, 5, . . .}. All of them are Arf semigroups. Example 7. Let H be a semigroup of genus g ≥ 10 with g = 2g − 3. The following statements are equivalent: 1. H is Arf; 2. H is sparse; 3. H is ordinary 1-hyperelliptic, that is H = 2{0, 2, 3, . . .} ∪ {n ∈ N0 : n ≥ 2g − 1}. Example 8. Let H be a semigroup of genus g ≥ 16 with g = 2g − 5. The following statements are equivalent: 1. H is Arf; 2. H is sparse; ¯ ∪ {2g − 3, 2g − 1}, where H ¯ is 3. H is ordinary 2-hyperelliptic, that is H = 2H a semigroup of genus 2. ¯ of genus γ which are not Example 9. For every γ ≥ 3, there exist semigroups H Arf. For example, H := N0 \ {1, 2, . . . , γ − 1, γ + 2}. Thus for every γ ≥ 3 there exist ordinary γ-hyperelliptic semigroups having largest gap g odd, which are not Arf property (cf. Corollary 2).
Sparse Numerical Semigroups
5
29
On the Order of Semigroups
In this section we are interested in the order bound of sparse semigroups. Let us remember that for a semigroup H = {0 = ρ1 < ρ2 < · · ·}, we can consider the sets A() = A[ρ ] := {(s, t) ∈ N2 : ρs + ρt = ρ } ∈ N, and their cardinals
ν := |A( + 1)|
(or equivalently, ν = |{ρ+1 − ρ1 , . . . , ρ+1 − ρ+1 } ∩ H|). For ∈ N0 we define the -th order bound of H as d (H) := min{νm : m ≥ }. The order bound was introduced in the context of the Theory of Error-Correcting codes. In fact, d (H) gives a lower bound on the minimum distance of Algebraic Geometry codes related to H (see [6] for details). The order bound is often difficult to compute. For this reason, we define the order number of H as o(H) := min{ ∈ N : νj ≤ νj+1 for all j ≥ } . It is clear that d (H) = ν for all ≥ o(H). Let us recall the concept of acute and near-acute semigroups (cf. [3], [8]). Write H = {0} ∪ [cm , dm ] ∪ · · · ∪ [c1 , d1 ] ∪ [c0 , ∞) , (4) where ci , di are non-negative integers such that di+1 + 2 ≤ ci ≤ di for i = 1, . . . , m, d1 + 2 ≤ c, and cm+1 = dm+1 = 0, with c0 equals the conductor c of H. Consider the following conditions: (I) c0 − d1 ≤ c1 − d2 ; (II) c0 − d1 ≤ d1 − d2 ; (III) 2d1 − c0 + 1 ∈ H. We say that H is acute if it satisfies (I); H is near-acute if it satisfies (II) or (III). Notice that (I) implies (II). Define the function ι on H by ι(ρi ) = i. In [8, Thm. 3.11] it is shown that o(H) = min{ι(2d1 + 1), ι(c0 + c1 − 1)} provided that H is near-acute (*). See also [10,11] for more information. Proposition 5. Let H be a sparse semigroup of genus g with g = 2g − K. If 2g ≥ 3g then H is acute and o(H) = 2g − g = 3g − 2K. Proof. Notation as in equation (4). From Theorem 1, c0 = g +1, c1 = d1 = g −1 and c2 = d2 = g − 3. We have 2d1 + 1 = c0 + c1 − 1 = 2g − 1 and thus H is acute. Now 2g − 1 = ρ2g −g and the proof follows from the above formula (∗) for o(H). Example 10. For g ≥ 7, let H = {0, g − 2, g, g + 3, g + 4 . . .} be the sparse semigroup of Example 1 (thus g = g + 2 = 2g − (g − 2)). It is easy to check that H is not near-acute. We claim that o(H) = g − 1 (in particular, o(H) is not given by the formula (∗) above).
30
C. Munuera, F. Torres, and J. Villanueva
Since r = 2c0 − g − 1 = g + 5, from [8, Thm. 2.6], the sequence (ν )≥g+5 is strictly increasing. Thus we have to consider the numbers ν for ≤ g + 5. Here ρ1 = 0, ρ2 = g − 2, ρ3 = g and ρ = g + − 1 for ≥ 4. Let k ∈ {1, . . . , 6} and i ∈ [k + 3, g + k − 1]. We have ρg+k − ρi = g + k − i ∈ [1, g − 3] and so νg+k−1 = |{ρg+k − ρ2 , . . . , ρg+k − ρk+2 } ∩ H| + 2 . If k = 1 then νg = |{g + 2, g} ∩ H| + 2 = 3, otherwise ρg+k − ρk+2 = g − 2 ,
if k ≥ 2
ρg+k − ρk+1 = g − 1 ∈ / H, ρg+k − ρk = g ,
if k ≥ 3 if k ≥ 4 .
In addition, if k ≥ 2 we obtain ρg+k − ρ2 = g + k + 1 ∈ H and thus νg+k−1 = |{ρg+k − ρ3 , . . . , ρg+k − ρk+2 } ∩ H| + 3 . Therefore νg+1 = |{g + 1, g − 2} ∩ H| + 3 = 4 , νg+2 = |{g + 2, g − 1, g − 2} ∩ H| + 3 = 4 , νg+3 = |{g + 3, g, g − 1, g − 2} ∩ H| + 3 = 6 , νg+4 = |{g + 4, g + 1, g, g − 1, g − 2} ∩ H| + 3 = 6 , νg+5 = |{g + 5, g + 2, g + 1, g, g − 1, g − 2} ∩ H| + 3 = 6 . Now, let s ∈ {0, 1} and j ∈ [4, g + s − 1] then ρg−s − ρj = g − s − j ∈ [1, g − 4] and so νg−s−1 = |{ρg−s − ρ2 , ρg−s − ρ3 } ∩ H| + 2 . Then νg−1 = |{g + 1, g − 1} ∩ H| + 2 = 2 , νg−2 = |{g, g − 2} ∩ H| + 2 = 4 . Thus o(H) = g − 1. Acknowledgment. The authors wish to thank the reviewers for their detailed comments and suggestions.
References 1. Arf, C.: Une interpretation alg´ebrique de la suite des ordres de multiplicit´e d´ une branche alg´ebrique. Proc. London Math. Soc. 50, 256–287 (1949) 2. Barucci, V., Dobbs, D.E., Fontana, M.: Maximality properties in numerical semigroups and applications to one-dimensional analytically irreducible local domains. Mem. Amer. Math. Soc. 125 (1997)
Sparse Numerical Semigroups
31
3. Bras-Amor´ os, M.: Acute semigroups, the order bound on the minimum distance, and the Feng-Rao improvement. IEEE Trans. Inform. Theory 50(6), 1282–1289 (2004) 4. Bras-Amor´ os, M., Garc´ıa, P.A.: Patterns on numerical semigroups. Linear Algebra and its Applications 414, 652–669 (2006) 5. Campillo, A., Farr´ an, J.I., Munuera, C.: On the parameters of algebraic-geometry codes related to Arf semigroups. IEEE Trans. Inform. Theory 46(7), 2634–2638 (2000) 6. Høholdt, T., van Lint, J.H., Pellikaan, R.: Algebraic Geometry codes. In: Pless, V., Huffman, C. (eds.) Handbook of Coding Theory, pp. 871–961. Elsevier, Amsterdam (1998) 7. Lipman, J.: Stable ideal and Arf semigroups. Amer. J. Math. 97, 791–813 (1975) 8. Munuera, C., Torres, F.: A note on the order bound on the minimum distance of AG codes and acute semigroups. Advances in Mathematics of Communications 2(2), 175–181 (2008) 9. Oliveira, G.: Weierstrass semigroups and the canonical ideal of non-trigonal curves. Manuscripta Math. 71, 431–450 (1991) 10. Oneto, A., Tamone, G.: On numerical Semigroups and the Order Bound. J. Pure Appl. Algebra 212, 2271–2283 (2008) 11. Oneto, A., Tamone, G.: On the order bound of one-point algebraic geometry codes. J. Pure Appl. Algebra (to appear) 12. Rosales, J.C., Garc´ıa-S´ anchez, P.A., Garc´ıa-Garc´ıa, J.I., Branco, M.B.: Arf Numerical Semigroups. Journal of Algebra 276, 3–12 (2004) 13. Torres, F.: On γ-hyperelliptic numerical semigroups. Semigroup Forum 55, 364–379 (1997) 14. Torres, F.: Weierstrass points and double coverings of curves with applications: Symmetric numerical semigroups which cannot be realized as Weierstrass semigroups. Manuscripta Math. 83, 39–58 (1994)
From the Euclidean Algorithm for Solving a Key Equation for Dual Reed–Solomon Codes to the Berlekamp–Massey Algorithm Maria Bras-Amor´ os1, and Michael E. O’Sullivan2 1
Departament d’Enginyeria Inform` atica i Matem` atiques Universitat Rovira i Virgili
[email protected] 2 Department of Mathematics and Statistics San Diego State University
[email protected]
Abstract. The two primary decoding algorithms for Reed-Solomon codes are the Berlekamp-Massey algorithm and the Sugiyama et al. adaptation of the Euclidean algorithm, both designed to solve a key equation. This article presents a new version of the key equation and a way to use the Euclidean algorithm to solve it. A straightforward reorganization of the algorithm yields the Berlekamp-Massey algorithm.
1
Introduction
For correcting primal Reed-Solomon codes a useful tool are the so-called locator and evaluator polynomials. Once we know them, the error positions are determined by the inverses of the roots of the locator polynomial and the error values can be computed by a formula due to Forney [4] which uses the evaluator and the derivative of the locator evaluated at the inverses of the error positions. Berlekamp presented in [1] a key equation determining the decoding polynomials for primal Reed-Solomon codes. Sugiyama et al. introduced in [9] their celebrated algoithm for solving this key equation based on the Euclidean algorithm for computing the greatest common divisor of two polynomials and the coefficients of the B´ezout’s identity. Another celebrated algorithm for solving the key equation is the algorithm by Berlekamp and Massey [1, 6] which is widely accepted to have better performance than the Sugiyama et al. algorithm, although its formulation is more difficult to understand. The connections between both algorithms were analyzed in [3, 5]. In this work we take the perspective of dual Reed Solomon codes. In Section 3 we present a key equation for dual Reed-Solomon codes and in Section 4 we
An extended version is in preparation for being submitted for publication. This work was partly supported by the Catalan Government through a grant 2005 SGR 00446 and by the Spanish Ministry of Education through projects TSI200765406-C03-01 E-AEGIS and CONSOLIDER CSD2007-00004 ARES.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 32–42, 2009. c Springer-Verlag Berlin Heidelberg 2009
From the Euclidean Algorithm for Solving a Key Equation
33
introduce an Euclidean-based algorithm for solving it, following Sugiyama et al.’s idea. While the key equation for primal Reed-Solomon codes states that a linear combination of the decoding polynomials is a multiple of a certain power of x, in the key equation presented here the linear combination has bounded degree. Then, while in Sugiyama et al.’s algorithm the locator and evaluator polynomials play the role of one of the B´ezout’s coefficients and the residue respectively, in the Euclidean algorithm presented here the locator and evaluator polynomials play the role of the two coefficients of the B´ezout’s identity. The decoding polynomials are now slightly different and in a sense more natural, since the error positions are given by the roots themselves of the locator polynomial rather than their inverses and that the error values are obtained by evaluating the evaluator polynomial and the derivative of the locator polynomial at the error positions rather than evaluating them at the inverses of the error positions. In addition the equivalent of the Forney formula does not have the minus sign. In Section 5 we prove that the new Euclidean-based algorithm is exactly the Berlekamp-Massey algorithm. The connections between the Euclidean and the Berlekamp-Massey algorithms seem to be much more transparent for the dual Reed-Solomon codes than for the primal codes [3, 5]. In Section 6 we move back to primal Reed Solomon codes and see how all the developments of the previous sections can be applied to primal codes with minor modifications.
2
Settings and Notations
In this section we establish the notions and notations that we will use in the present work. A general reference is [8]. Let F be a finite field of size q = pm and let α be a primitive element in F. Let n = q − 1. We identify the vector u = (u0 , . . . , un−1 ) with the polynomial u(x) = u0 + · · · + un−1 xn−1 and denote u(a) the evaluation of u(x) at a. Classically the (primal) Reed-Solomon code C ∗ (k) of dimension k is the cyclic code generated by the polynomial (x−α)(x−α2 ) · · · (x−αn−k ), and has generator matrix G∗ (k) = (αij )ij , for i = 0, . . . , k − 1, and j = 0, . . . , n − 1, while the dual Reed-Solomon code C(k) of dimension k is generated by the polynomial (x − α−(k+1) ) · · · (x − α−(n−1) )(x − 1) and has generator matrix G∗ (k) = (α(i+1)j )ij , for i = 0, . . . , k − 1, and j = 0, . . . , n − 1. Both codes have minimum distance d = n − k + 1. Furthermore, C(k)⊥ = ∗ C (n − k). There is a natural bijection from Fn to itself, c → c∗ , that takes C(k) to C ∗ (k). Indeed, let i be a vector of dimension k and let c = (c0 , c1 , . . . , cn−1 ) = iG(k) ∈ C(k), c∗ = (c∗0 , c∗1 , . . . , c∗n−1 ) = iG∗ (k) ∈ C ∗ (k). Then, c = (c∗0 , αc∗1 , α2 c∗2 , . . . , αn−1 c∗n−1 ).
(1)
In particular, c(αi ) = c∗0 + αc∗1 αi + α2 c∗2 α2i + · · · + αn−1 c∗n−1 α(n−1)i = c∗ (αi+1 ).
34
3 3.1
M. Bras-Amor´ os and M.E. O’Sullivan
Key Equations for Primal and Dual Reed-Solomon Codes Polynomials for Correction of RS Codes
Suppose that a word c∗ ∈ C ∗ (k) is transmitted and that an error e∗ occurred ∗ ∗ ∗ with t ≤ d−1 2 non-zero positions, so that u = c + e is received. Define the ∗ error locator polynomial Λ and the error evaluator polynomial Ω ∗ as Λ∗ = (1 − αi x), Ω∗ = e∗i αi (1 − αi x), i:e∗ i =0
i:e∗ i =0
j:e∗ j =0,j=i
and the syndrome polynomial S ∗ = e∗ (α) + e∗ (α2 )x + e∗ (α3 )x2 + · · · + e∗ (αn )xn−1 . Notice that from the received word we only know e∗ (α) = u∗ (α), . . . , e∗ (αn−k ) = u∗ (αn−k ). This is why we use the truncated syndrome polynomial S¯∗ = e∗ (α) + e∗ (α2 )x + · · · + e∗ (αn−k )xn−k−1 . If Λ∗ and Ω ∗ are known, the error positions can be identified as the indices i such that Λ∗ (α−i ) = 0 and the error values can be computed by the Forney formula [4] e∗i = −
3.2
Ω ∗ (α−i ) . Λ∗ (α−i )
Polynomials for Correction of Dual RS Codes
Suppose that a word c ∈ C(k) is transmitted and that an error e occurred with t ≤ d−1 2 non-zero positions, so that u = c + e is received. In this case define the error locator polynomial Λ and the error evaluator polynomial Ω as (x − αi ), Ω= ei (x − αi ), (2) Λ= i:ei =0
i:ei =0
j:ej =0,j=i
and the syndrome polynomial S = e(αn−1 ) + e(αn−2 )x + · · · + e(α)xn−2 + e(1)xn−1 .
(3)
Notice that now, from the received word we only know e(1) = u(1), . . . , e(αn−k−1 ) = u(αn−k−1 ). This is why we use the truncated syndrome polynomial S¯ = e(αn−k−1 )xk + e(αn−k−2 )xk+1 + · · · + e(1)xn−1 .
(4)
From the Euclidean Algorithm for Solving a Key Equation
35
In this case, if Λ and Ω are known, the error positions can be identified as the indices i such that Λ(αi ) = 0 (5) and the error values can be computed by the formula ei =
Ω(αi ) . Λ (αi )
(6)
It is easy to check that when the error vectors e and e∗ satisfy the relationship (1), then the polynomials Λ, Ω, S, S associated to the error e are related to the polynomials Λ∗ , Ω ∗ , S ∗ , S¯∗ associated to the error e∗ as follows: Λ = xt Λ∗ (1/x), Ω = xt−1 Ω ∗ (1/x), S = xn−1 S ∗ (1/x), S¯ = xn−1 S¯∗ (1/x). 3.3
Key Equations
A straightforward computation shows that ΛS = (xn − 1)Ω
(7)
or, equivalently, Λ∗ S ∗ = (1 − xn )Ω ∗ . See [7] for a proof. In particular, the polynomial Λ∗ S¯∗ + (xn − 1)Ω ∗ = Λ∗ (S¯∗ − S ∗ ) has only terms of order n − k or larger and this implies Λ∗ S¯∗ = Ω ∗ mod xn−k This is the key equation introduced by Berlekamp [1]. Massey [6] gave an algorithm for solving linear feedback shift registers and its application to decoding BCH and thus RS codes. Sugiyama et al. [9] recognized that the Euclidean algorithm for finding the greatest common divisor of two polynomials and for solving B´ezout’s identity, could also be adapted to solve this kind of equation for Λ∗ and Ω ∗ , starting with r−2 = xn−k and r−1 = S¯∗ . Their method is often referred to as the Euclidean decoding algorithm for Reed Solomon codes. On the other hand, the polynomial ΛS¯ − (xn − 1)Ω = Λ(S¯ − S) has degree d−1 d+1 at most d−1 2 + k − 1 = 2 + n − d = n − 2 . That is deg(ΛS¯ − (xn − 1)Ω) ≤ n − d+1 2 The aim of the present work is to deal with this alternative key equation on Λ and Ω and solving it by the Euclidean algorithm starting with r−2 = xn − 1 and ¯ r−1 = S.
4
Euclidean Algorithm for Dual Reed-Solomon Codes
The next theorem characterizes the decoding polynomials by means of the alternative key equation, the polynomial degrees, and their coprimality. It is the analogue of [2, Theorem 4.8] for the standard key equation, and the proof is similar.
36
M. Bras-Amor´ os and M.E. O’Sullivan
Theorem 1. Suppose that at most d−1 2 errors occurred. Then Λ and Ω are the unique polynomials f , ϕ satisfying the following properties. 1. deg(f S¯ − (xn − 1)ϕ) ≤ n − d+1 2
2. deg(ϕ) < deg(f ) ≤ d−1 2 3. f, ϕ are coprime. 4. f is monic
Consider the following algorithm: Initialize: r−2 = xn − 1, f−2 = 0, ϕ−2 = −1, ¯ f−1 = 1, ϕ−1 = 0, r−1 = S, i = −1. while deg(ri ) > n − d+1 : 2 qi = Quotient(ri−2 , ri−1 ) ri = Remainder(ri−2 , ri−1 ) ϕi = ϕi−2 − qi ϕi−1 fi = fi−2 − qi fi−1 end while Return fi /LeadingCoefficient(fi ), ϕi /LeadingCoefficient(fi )
The polynomial ϕ has been initialized to negative 1 because we want that fi , ϕi satisfy at each step fi S¯ − ϕi (xn − 1) = ri . The next theorem verifies the algorithm. Its proof has also been ommitted and will appear in the final form of this article. Theorem 2. If t ≤
5
d−1 2
then the algorithm outputs Λ and Ω.
From the Euclidean to the Berlekamp-Massey Algorithm
In this section we will perform a series of minor modifications to the previous algorithm that will lead to the Berlekamp-Massey algorithm. This will show that they are essentially the same. First of all, notice that the updating step in the previous algorithm can be expressed in matrix form as ri−1 fi−1 ϕi−1 −qi 1 ri fi ϕi = . 1 0 ri−1 fi−1 ϕi−1 ri−2 fi−2 ϕi−2 (0)
Furthermore, if qi = qi
−qi 1 1 0
=
(0)
1 −qi 01
(1)
(l)
+ qi x + · · · + qi xl then
(1)
1 −qi x 01
(2)
1 −qi x2 01
...
(l)
1 −qi xl 01
01 10
.
If we split all matrix multiplications we will need to add intermediate variables. Thus, it does not make sense anymore to consider matrices
From the Euclidean Algorithm for Solving a Key Equation
ri fi ϕi ri−1 fi−1 ϕi−1
37
.
˜ i , F˜i , Ψ˜i and matrices We will use auxiliary polynomials Ri , Fi , Ψi , R Ri Fi Ψi . R˜i F˜i Ψ˜i These matrices will satisfy at given stages of the algorithm (exactly when deg(Ri ) ˜ i )) that Ri , Fi , Ψi are the remainder and the intermediate B´ezout coeffi< deg(R ˜ i , F˜i , Ψ˜i cients in one of the steps of the original Euclidean algorithm and that R are respectively the remainder and the intermediate B´ezout coefficients previous to Ri , Fi , Ψi . The previous algorithm can be expressed now: Initialize: R−1 F−1 Ψ−1 S¯ 10 ˜ −1 F˜−1 Ψ˜−1 = xn − 1 0 −1 R : while deg(Ri ) > n − d+1 2 Ri+1 Fi+1 Ψi+1 Ri Fi Ψi 01 = ˜ i+1 F˜i+1 Ψ˜i+1 ˜ i F˜i Ψ˜i 10 R R (quotient=0) ˜i) : while deg(Ri ) ≥ deg(R ˜i) μ = LeadingTerm(Ri )/LeadingTerm(R (quotient=quotient+μ) Ri+1 Fi+1 Ψi+1 Ri Fi Ψi 1 −μ = ˜ i+1 F˜i+1 Ψ˜i+1 ˜ i F˜i Ψ˜i 01 R R end while end while Return Fi /LeadingCoefficient(Fi ), Ψi /LeadingCoefficient(Fi )
5.1
Making Remainders Monic
˜ i monic in all steps. This makes it easier to A useful modification is to keep R ˜i ). compute the μ’s. It is enough to force this every time we get deg(Ri ) < deg(R ˜ i ), then R ˜ i stays the same and so it remains monic. To When deg(Ri ) ≥ deg(R ˜ i by its leading coefficient μ ˜, and also F˜i and Ψ˜i in order this end we divide R to keep the properties. Analogously, we multiply Ri , Fi and Ψi by −˜ μ in order to make the Fi ’s monic. Another modification is the use of an integer p keeping track ˜ i and that we take leading coefficients of the degree difference between Ri and R of Ri instead of leading terms. With these modifications, the algorithm is now:
38
M. Bras-Amor´ os and M.E. O’Sullivan Initialize: R−1 R−1 Ψ−1 S¯ 10 = n ˜ −1 F˜−1 Ψ˜−1 x − 1 0 −1 R : while deg(Ri ) > n − d+1 2 μ = LeadingCoefficient(Ri ) ˜i) p = deg(Ri ) − deg(R if p < 0 then Ri+1 Fi+1 ˜ i+1 F˜i+1 R else Ri+1 Fi+1 ˜ i+1 F˜i+1 R end if end while
Ψi+1 Ψ˜i+1 Ψi+1 Ψ˜i+1
=
=
0 −μ 1/μ 0 1 −μxp 01
Ri Ri Ψi ˜ i F˜i Ψ˜i R Ri Fi Ψi ˜ i F˜i Ψ˜i R
Return Fi , Ψi
5.2
Joining Steps
It is easy to check that after each step corresponding to p < 0 the new p is exactly the previous one with opposite sign and so is μ. With this observation we can join each step corresponding to p < 0 with the following one: Initialize: R−1 F−1 Ψ−1 S¯ 10 = n ˜ −1 F˜−1 Ψ˜−1 x − 1 0 −1 R : while deg(Ri ) > n − d+1 2 μ = LeadingCoefficient(Ri ) ˜i) p = deg(Ri ) − deg(R if p < 0 then −p Ri Fi Ψi Ri+1 Fi+1 Ψi+1 x −μ = ˜ i+1 F˜i+1 Ψ˜i+1 ˜ i F˜i Ψ˜i 1/μ 0 R R else Ri+1 Fi+1 Ψi+1 Ri Fi Ψi 1 −μxp = ˜ i+1 F˜i+1 Ψ˜i+1 ˜ i F˜i Ψ˜i 01 R R end if end while Return Fi , Ψi
5.3
Skipping Remainders
˜ i ) is that we need Finally, the only reason to keep the polynomials Ri (and R to compute their leading coefficients (the μ’s). We now show that these leading coefficients may be obtained without reference to the polynomials Ri . This allows to compute the Fi , Ψi iteratively and dispense with the polynomials Ri .
From the Euclidean Algorithm for Solving a Key Equation
39
On one hand, the remainder Ri = Fi S¯ − Ψi (xn − 1) = Fi S¯ − xn Ψi + Ψi has degree at most n − 1 for all i ≥ −1. This means that all terms of xn Ψi cancel with terms of Fi S¯ and that the leading term of Ri must be either a term of Fi S¯ or a term of Ψi or a sum of a term of Fi S¯ and a term of Ψi . On the other hand, the algorithm only computes LeadingCoefficient(Ri ) while deg(Ri ) ≥ n − d−1 2 . We want to see that in this case the leading term of Ri has degree strictly larger than that of Ψi . Indeed, one can check that for i ≥ −1, deg(Ψi ) < deg(Fi ) and that all Fi ’s in the algorithm have degree at most d−1 2 . d−1 d−1 So deg(Ψi ) < deg(Fi ) ≤ d−1 ≤ d − ≤ n − ≤ deg(R ). i 2 2 2 The previous algorithm can be transformed in a way such that the remainders are not kept but their degrees: Initialize: ¯ −2 = n d −1 = deg(S) d 10 f−1 ϕ−1 = F−1 Ψ−1 0 −1 while di > n − d+1 : 2 ¯ di ) μ = Coefficient(fi S, p = di − di−1 if μ = 0 then di = di − 1 elseif p < 0 then −p fi+1 ϕi+1 x −μ fi ϕi = Fi+1 Ψi+1 Fi Ψi 1/μ 0 di = di−2 − 1 else
fi+1 ϕi+1 Fi+1 Ψi+1 di = di − 1 end if
=
1 −μxp 01
fi ϕi Fi Ψi
end while Return fi , ϕi
This last algorithm is the Berlekamp-Massey algorithm that solves the linear recurrence tj=0 Λj e(αi+j−1 ) = 0 for all i > 0. This recurrence is derived from Λ xnS−1 being a polynomial (see (7)) and thus having no terms of negative order in its expression as a Laurent series in 1/x, and from the equality 1 S e(α) e(α2 ) = + + ··· . e(1) + xn − 1 x x x2 A proof of this last equality can be found in [7].
6
Moving Back to Primal Reed-Solomon Codes
It is well known that the words of a dual Reed-Solomon code are in bijection with those of the primal Reed-Solomon code of the same dimension. The
40
M. Bras-Amor´ os and M.E. O’Sullivan
natural bijection was previously mentioned in (1). Suppose that a code word c∗ from the primal Reed-Solomon code C ∗ (k) is transmitted, and that an error e∗ is added to it giving the received word u∗ = c∗ + e∗ . Notice that if c = (c∗0 , αc∗1 , α2 c∗2 , . . . , αn−1 c∗n−1 ) and e = (e∗0 , αe∗1 , α2 e∗2 , . . . , αn−1 e∗n−1 ), then c ∈ C(k) and u := c + e = (u∗0 , αu∗1 , α2 u∗2 , . . . , αn−1 u∗n−1 ). So, e has the same nonzero positions as e∗ , and the error values e∗i added to u∗i can be computed from the error values ei added to ui by e∗i = ei /αi . Table 1. Steps of the classical Euclidean algorithm q(x) = 0 r(x) = x9 f (x) = 2 ϕ(x) = 0 q(x) = 0 r(x) = α14 x8 + αx7 + α10 x6 + α10 x5 + α17 x4 + α5 x3 + x2 + α11 x + 2 f (x) = 0 ϕ(x) = 1 q(x) = α12 x + α12 r(x) = α3 x7 + α22 x6 + 2x5 + α6 x4 + α16 x3 + α9 x2 + α5 x + α12 f (x) = 2 ϕ(x) = α25 x + α25 q(x) = α11 x + α21 r(x) = α3 x6 + x5 + α23 x4 + α19 x3 + α23 x2 + α24 x + α17 f (x) = α11 x + α21 ϕ(x) = α23 x2 + α3 x + α4 q(x) = x + 2 r(x) = α10 x5 + α11 x4 + α14 x3 + αx2 + α23 x + α3 f (x) = α24 x2 + α9 x + α2 ϕ(x) = α10 x3 + α20 x2 + α4 x + α16 q(x) = α19 x + α10 r(x) = α4 x3 + α4 x2 + α10 x + α20 f (x) = α4 x3 + x2 + α7 x + 2 ϕ(x) = α16 x4 + α4 x3 + α17 x2 + α9 x + α7
Now we can perform the previous versions of the algorithm using S¯ = e(αn−k−1 )xk + e(αn−k−2 )xk+1 + · · · + e(1)xn−1 = e∗ (αn−k )xk + e∗ (αn−k−1 )xk+1 + · · · + e∗ (α)xn−1 , which is known because it is equal to u∗ (αn−k )xk + u∗ (αn−k−1 )xk+1 + · · · + u∗ (α)xn−1 . Then, once we have the error positions, we can compute the error values as e∗i =
Ω(αi ) . αi Λ (αi )
As an example, consider F27 = F2 [x]/(x3 + 2x + 1) and suppose that α is the class of x. Suppose d = 10 and e∗ = α4 x + α2 x2 + α6 x8 + α3 x24 . For the classical Euclidean algorithm, we need the polynomial
From the Euclidean Algorithm for Solving a Key Equation
41
Table 2. Steps of the new Euclidean algorithm q(x) = 0 r(x) = x26 + 2 f (x) = 0 ϕ(x) = 2 q(x) = 0 r(x) = 2x25 + α11 x24 + x23 + α5 x22 + α17 x21 + α10 x20 + α10 x19 + αx18 + α14 x17 f (x) = 1 ϕ(x) = 0 q(x) = 2x + α24 r(x) = α14 x24 + α16 x23 + α25 x22 + α17 x21 + α20 x20 + α6 x19 + α7 x18 + α25 x17 + 2 f (x) = x + α11 ϕ(x) = 2 q(x) = α25 x + α17 r(x) = α16 x23 + α15 x22 + α7 x21 + α7 x20 + α11 x19 + α25 x18 + 2x17 + α25 x + α17 f (x) = α12 x2 + αx + α25 ϕ(x) = α25 x + α17 q(x) = α24 x + α6 r(x) = α24 x22 + 2x21 + α25 x20 + α9 x19 + α21 x18 + α3 x17 + α10 x2 + α24 x + α11 f (x) = α23 x3 + α9 x2 + α22 x + α15 ϕ(x) = α10 x2 + α24 x + α11 q(x) = α18 x + α18 r(x) = α8 x20 + α2 x19 + α21 x18 + α25 x17 + α15 x3 + α5 x2 + α25 x + α25 f (x) = α2 x4 + α4 x3 + α12 x2 + α25 x + α11 ϕ(x) = α15 x3 + α5 x2 + α25 x + α25
S¯∗ = e∗ (α)+e∗ (α2 )x+· · ·+e∗ (αn−k )xn−k−1 = α14 x8 +αx7 +α10 x6 +α10 x5 + a x + α5 x3 + x2 + α11 x + 2. For the new Euclidean algorithm we will use S¯ = e∗ (αn−k )xk +e∗ (αn−k−1 )xk+1 + · · · + e∗ (α)xn−1 = 2x25 + α11 x24 + x23 + α5 x22 + α17 x21 + α10 x20 + α10 x19 + αx18 + α14 x1 7. The steps of both algorithms are written in Table 1 and Table 2. 17 4
References 1. Berlekamp, E.R.: Algebraic coding theory. McGraw-Hill Book Co., New York (1968) 2. Cox, D.A., Little, J., O’Shea, D. (eds.): Using algebraic geometry, 2nd edn. Graduate Texts in Mathematics, vol. 185. Springer, New York (2005) 3. Dornstetter, J.-L.: On the equivalence between Berlekamp’s and Euclid’s algorithms. IEEE Trans. Inform. Theory 33(3), 428–431 (1987) 4. David Forney Jr., G.: On decoding BCH codes. IEEE Trans. Information Theory IT11, 549–557 (1965) 5. Heydtmann, A.E., Jensen, J.M.: On the equivalence of the Berlekamp-Massey and the Euclidean algorithms for decoding. IEEE Trans. Inform. Theory 46(7), 2614– 2624 (2000) 6. Massey, J.L.: Shift-register synthesis and BCH decoding. IEEE Trans. Information Theory IT-15, 122–127 (1969)
42
M. Bras-Amor´ os and M.E. O’Sullivan
7. O’Sullivan, M.E., Bras-Amor´ os, M.: The Key Equation for One-Point Codes, ch. 3, pp. 99–152 (2008) 8. Roth, R.: Introduction to coding theory. Cambridge University Press, Cambridge (2006) 9. Sugiyama, Y., Kasahara, M., Hirasawa, S., Namekawa, T.: A method for solving key equation for decoding Goppa codes. Information and Control 27, 87–99 (1975)
Rank for Some Families of Quaternary Reed-Muller Codes Jaume Pernas, Jaume Pujol, and Merc`e Villanueva Dept. of Information and Communications Engineering, Universitat Aut` onoma de Barcelona, Spain {jaume.pernas,jaume.pujol,merce.villanueva}@autonoma.edu
Abstract. Recently, new families of quaternary linear Reed-Muller codes such that, after the Gray map, the corresponding Z4 -linear codes have the same parameters and properties as the codes in the usual binary linear Reed-Muller family have been introduced. A structural invariant, the rank, for binary codes is used to classify some of these Z4 -linear codes. The rank is established generalizing the known results about the rank for Z4 linear Hadamard and Z4 -linear extended 1-perfect codes. Keywords: Rank, quaternary codes, Reed-Muller codes, Z4 -linear codes.
1
Introduction
Let Z2 and Z4 be the ring of integers modulo 2 and modulo 4, respectively. Let Zn2 be the set of all binary vectors of length n and let Zn4 be the set of all quaternary vectors of length n. Any nonempty subset C of Zn2 is a binary code and a subgroup of Zn2 is called a binary linear code or a Z2 -linear code. Equivalently, any nonempty subset C of Zn4 is a quaternary code and a subgroup of Zn4 is called a quaternary linear code. Some authors also use the term “quaternary codes” to refer to additive codes over GF (4) [1], but note that these are not the codes we are considering in this paper. The Hamming distance dH (u, v) between two vectors u, v ∈ Zn2 is the number of coordinates in which u and v differ. The Hamming weight of a vector u ∈ Zn2 , denoted by wH (u), is the number of nonzero coordinates of u. The minimum Hamming distance of a binary code C is the minimum value of dH (u, v) for u, v ∈ C satisfying u = v. The Gray map, φ : Zn4 −→ Z2n 2 given by φ(v1 , . . . , vn ) = (ϕ(v1 ), . . . , ϕ(vn )) where ϕ(0) = (0, 0), ϕ(1) = (0, 1), ϕ(2) = (1, 1), ϕ(3) = (1, 0), is an isometry which transforms Lee distances over Zn4 into Hamming distances over Z2n 2 . Let C be a quaternary linear code. Since C is a subgroup of Zn4 , it is isomorphic to an abelian structure Zγ2 × Zδ4 . Therefore, C is of type 2γ 4δ as a group, it has |C| = 2γ+2δ codewords and 2γ+δ codewords of order two. The binary image
This work was supported in part by the Spanish MEC and the European FEDER under Grant MTM2006-03250.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 43–52, 2009. c Springer-Verlag Berlin Heidelberg 2009
44
J. Pernas, J. Pujol, and M. Villanueva
C = φ(C) of any quaternary linear code C of length n and type 2γ 4δ is called a Z4 -linear code of binary length N = 2n and type 2γ 4δ . Two binary codes C1 and C2 of length n are said to be isomorphic if there is a coordinate permutation π such that C2 = {π(c) : c ∈ C1 }. They are said to be equivalent if there is a vector a ∈ Zn2 and a coordinate permutation π such that C2 = {a + π(c) : c ∈ C1 } [11]. Two quaternary linear codes C1 and C2 both of length n and type 2γ 4δ are said to be monomially equivalent, if one can be obtained from the other by permuting the coordinates and (if necessary) changing the signs of certain coordinates. They are said to be permutation equivalent if they differ only by a permutation of coordinates [9]. Note that if two quaternary linear codes C1 and C2 are monomially equivalent, then the corresponding Z4 -linear codes C1 = φ(C1 ) and C2 = φ(C2 ) are isomorphic. Two structural invariants for binary codes are the rank and dimension of the kernel. The rank of a binary code C, denoted by rC , is simply the dimension of C, which is the linear span of the codewords of C. The kernel of a binary code C, denoted by K(C), is the set of vectors that leave C invariant under translation, i.e. K(C) = {x ∈ Zn2 : C + x = C}. If C contains the all-zero vector, then K(C) is a binary linear subcode of C. The dimension of the kernel of C will be denoted by kC . These two invariants do not give a full classification of binary codes, since two nonisomorphic binary codes could have the same rank and dimension of the kernel. In spite of that, they can help in classification, since if two binary codes have different ranks or dimensions of the kernel, they are nonisomorphic. It is well-known that an easy way to built the binary linear Reed-Muller family of codes, denoted by RM , is using the Plotkin construction [11]. In [14],[15], Pujol et al. introduced new quaternary Plotkin constructions to build new families of quaternary linear Reed-Muller codes, denoted by RMs . The quaternary linear Reed-Muller codes RMs (r, m) of length 2m−1 , for m ≥ 1, 0 ≤ r ≤ m and 0 ≤ s ≤ m−1 2 , in these new families satisfy that the corresponding Z4 -linear codes have the same parameters and properties (length, dimension, minimum distance, inclusion and duality relationship) as the binary linear codes in the wellknown RM family. In the binary case, there is only one family. In contrast, in the quaternary case, for each m there are m+1 2 families, which will be distinguished using subindexes s from the set {0, . . . , m−1 2 }. The dimension of the kernel and rank have been studied for some families of Z4 -linear codes [2], [4], [5], [10], [12]. In the RM family, the RM (1, m) and RM (m − 2, m) binary codes are a linear Hadamard and extended 1-perfect code, respectively. Recall that a Hadamard code of length n = 2m is a binary code with 2n codewords and minimum Hamming distance n/2, and an extended 1-perfect code of length n = 2m is a binary code with 2n−m codewords and minimum Hamming distance 4. Equivalently, in the RMs families, the corresponding Z4 -linear code of any RMs (1, m) and RMs (m − 2, m) is a Hadamard and extended 1-perfect code, respectively [14],[15]. For the corresponding Z4 linear codes of RMs (1, m) and RMs (m − 2, m), the rank were studied and computed in [5],[10].
Rank for Some Families of Quaternary Reed-Muller Codes
Specifically,
45
γ + 2δ if s = 0, 1 and (1) γ + 2δ + δ−1 if s ≥ 2 2 rP = γ¯ + 2δ¯ + δ = 2m−1 + δ¯ (except rP = 11, if m = 4 and s = 0), where ¯ H = φ(RMs (1, m)) of type 2γ 4δ and P = φ(RMs (m − 2, m)) of type 2γ¯ 4δ . The dimension of the kernel was computed for all RMs (r, m) codes in [13]. The aim of this paper is the study of the rank for these codes, generalizing the known results about the rank for the RMs (r, m) codes with r ∈ {0, 1, m − 2, m − 1, m} [5],[10]. The paper is organized as follows. In Section 2, we recall some properties related to quaternary linear codes and the rank of these codes. Moreover, we describe the construction of the RMs families of codes. In Section 3, we establish the rank for all codes in the RMs families with s ∈ {0, 1}. Furthermore, we establish the rank for the RMs (r, m) codes with r ∈ {2, m−3}. In Section 4, we show that the rank allows us to classify the RMs (r, m) codes with r ∈ {2, m − 3}. Finally, the conclusions are given in Section 5. rH =
2 2.1
Preliminaries Quaternary Linear Codes
Let C be a quaternary linear code of length n and type 2γ 4δ . Although C is not a free module, every codeword is uniquely expressible in the form γ δ λi ui + μj vj , i=1
j=1
where λi ∈ Z2 for 1 ≤ i ≤ γ, μj ∈ Z4 for 1 ≤ j ≤ δ and ui , vj are vectors in Zn4 of order two and four, respectively. The vectors ui , vj give us a generator matrix G of size (γ + δ) × n for the code C. In [8], it was shown that any quaternary linear code of type 2γ 4δ is permutation equivalent to a quaternary linear code with a canonical generator matrix of the form 2T 2Iγ 0 , (2) S R Iδ where R, T are matrices over Z2 of size δ × γ and γ × (n − γ − δ), respectively; and S is a matrix over Z4 of size δ × (n − γ − δ). The concepts of duality for quaternary linear codes were also studied in [8], n where n the inner product for any two vectors u, v ∈ Z4 is⊥ defined as u · v = i=1 ui vi ∈ Z4 . Then, the dual code of C, denoted by C , is defined in the standard way C ⊥ = {v ∈ Zn4 : u · v = 0 for all u ∈ C}. The corresponding binary code φ(C ⊥ ) is denoted by C⊥ and called the Z4 -dual code of C = φ(C). Moreover, the dual code C ⊥ , which is also a quaternary linear code, is of type 2γ 4n−γ−δ . Let u ∗ v denote the component-wise product for any u, v ∈ Zn4 . Lemma 1 ([6],[7]). Let C be a quaternary linear code of type 2γ 4δ and let C = φ(C) be the corresponding Z4 -linear code. Let G be a generator matrix of C and let {ui }γi=1 be the rows of order two and {vj }δj=0 the rows of order four in G. Then, C is generated by {φ(ui )}γi=1 , {φ(vj ), φ(2vj )}δj=1 and {φ(2vj ∗ vk )}1≤j
46
2.2
J. Pernas, J. Pujol, and M. Villanueva
Quaternary Linear Reed-Muller Codes
Recall that a binary linear r th-order Reed-Muller code RM (r, m) with 0 ≤ r ≤ m and m ≥ 2 can be described using the Plotkin construction as follows [11]: RM (r, m) = {(u|u + v) : u ∈ RM (r, m − 1), v ∈ RM (r − 1, m − 1)}, where RM (0, m) is the repetition code {0, 1}, RM (m, m) is the universe code, and ”|” denotes concatenation. For m = 1, there are only two codes: the repetition code RM (0, 1) and the universe code RM (1, 1). This RM family of codes r m m m−r has length 2 , minimum distance 2 and dimension . Moreover, the i i=0 code RM (r − 1, m) is a subcode of RM (r, m) and the code RM (r, m) is the dual code of RM (m − 1 − r, m) for 0 ≤ r < m. In the recent literature [2],[3],[8],[16],[17] several families of quaternary linear codes have been proposed and studied trying to generalize the RM family. However, when the corresponding Z4 -linear codes are taken, they do not satisfy all the same properties as the RM family. In [14],[15], new quaternary linear ReedMuller families, RMs , such that the corresponding Z4 -linear codes have the parameters and properties of RM family of codes, were proposed. The following two constructions are necessary to generate these new RMs families. Definition 2 (Plotkin Construction). Let A and B be two quaternary linear codes of length n, types 2γA 4δA and 2γB 4δB , and minimum distances dA and dB , respectively. A new quaternary linear code PC(A, B) is defined as PC(A, B) = {(u|u + v) : u ∈ A, v ∈ B}. It is easy to see that if GA and GB are generator matrices of A and B, respectively, then the matrix GA GA GP C = 0 GB is a generator matrix of the code PC(A, B). Moreover, the code PC(A, B) is of length 2n, type 2γA +γB 4δA +δB , and minimum distance d = min{2dA , dB } [14],[15]. Definition 3 (BQ-Plotkin Construction). Let A, B, and C be three quaternary linear codes of length n; types 2γA 4δA , 2γB 4δB , and 2γC 4δC ; and minimum distances dA , dB , and dC , respectively. Let GA , GB , and GC be generator matrices of the codes A, B, and C, respectively. A new code BQ(A, B, C) is defined as the quaternary linear code generated by ⎛ ⎞ GA GA GA GA ⎜ 0 GB 2GB 3GB ⎟ ⎟ GBQ = ⎜ ⎝ 0 0 GˆB GˆB ⎠ , 0 0 0 GC where GB is the matrix obtained from GB after switching twos by ones in their γB rows of order two, and GˆB is the matrix obtained from GB after removing their γB rows of order two.
Rank for Some Families of Quaternary Reed-Muller Codes
47
The code BQ(A, B, C) is of length 4n, type 2γA +γC 4δA +γB +2δB +δC , and minimum distance d = min{4dA , 2dB , dC } [14],[15]. Now, the quaternary linear Reed-Muller codes RMs (r, m) of length 2m−1 , for m ≥ 1, 0 ≤ r ≤ m, and 0 ≤ s ≤ m−1 2 , will be defined. For the recursive construction it will be convenient to define them also for r < 0 and r > m. We begin by considering the trivial cases. The code RMs (r, m) with r < 0 is defined as the zero code. The code RMs (0, m) is defined as the repetition code with only the all-zero and all-two vectors. The code RMs (r, m) with r ≥ m is defined . For m = 1, there is only one family with s = 0, and as the whole space Zm−1 4 in this family there are only the zero, repetition and universe codes for r < 0, r = 0 and r ≥ 1, respectively. In this case, the generator matrix ofRM 0 (0, 1) is G0(0,1) = 2 and the generator matrix of RM0 (1, 1) is G0(1,1) = 1 . For any m ≥ 2, given RMs (r, m − 1) and RMs (r − 1, m − 1) codes, where 0 ≤ s ≤ m−2 2 , the RMs (r, m) code can be constructed in a recursive way using the Plotkin construction given by Definition 2 as follows: RMs (r, m) = PC(RMs (r, m − 1), RMs (r − 1, m − 1)). For example, for m = 2, the generator matrices of RM0 (r, 2), 0 ≤ r ≤ 2, are the following: 02 10 G0(0,2) = 2 2 ; G0(1,2) = ; G0(2,2) = . 11 01 Note that when m is odd, the RMs family with s = m−1 can not be generated 2 using the Plotkin construction. In this case, for any m ≥ 3, m odd and s = m−1 2 , given RMs−1 (r, m − 2), RMs−1 (r − 1, m − 2) and RMs−1 (r − 2, m − 2), the RMs (r, m) code can be constructed using the BQ-Plotkin construction given by Definition 3 as follows: RMs (r, m) = BQ(RMs−1 (r, m−2), RMs−1 (r−1, m−2), RMs−1 (r−2, m−2)). For example, for m = 3, there are two families. The RM0 family can be generated using the Plotkin construction. On the other hand, the RM1 family has to be generated using the BQ-Plotkin construction. The generator matrices of RM1 (r, 3), 0 ≤ r ≤ 3, are the following: G1(0,3) = 2 2 2 2 ; ⎛ ⎛ ⎞ ⎞ 1111 1111 ⎜0 1 2 3⎟ ⎜0 1 2 3⎟ 1111 ⎜ ⎟ ⎟ G1(1,3) = ; G1(2,3) = ⎜ ⎝ 0 0 1 1 ⎠ ; G1(3,3) = ⎝ 0 0 1 1 ⎠ . 0123 0002 0001 Table 1 shows the type 2γ 4δ of all these RMs (r, m) codes for m ≤ 10. The following proposition summarizes the parameters and properties of these RMs families of codes. Proposition 4 ([14],[15]). A quaternary linear Reed-Muller code RMs (r, m), with m ≥ 1, 0 ≤ r ≤ m, and 0 ≤ s ≤ m−1 2 , has the following parameters and properties:
48
J. Pernas, J. Pujol, and M. Villanueva
1. the length is n = 2m−1 ; 2. the minimum distance is d = 2m−r ; 3. 4. 5. 6.
3
r m the number of codewords is 2 , where k = ; i i=0 the code RMs (r − 1, m) is a subcode of RMs (r, m) for 0 ≤ r ≤ m; the codes RMs (1, m) and RMs (m−2, m), after the Gray map, are Z4 -linear Hadamard and Z4 -linear extended perfect codes, respectively; the code RMs (r, m) is the dual code of RMs (m − 1 − r, m) for −1 ≤ r ≤ m. k
Rank for Some Infinite Families of RMs(r, m) Codes
In this section, we will compute the rank for some infinite families of the quaternary linear Reed-Muller codes RMs (r, m). The rank of RMs (r, m) will be denoted by rs(r,m) instead of rRMs (r,m) . First of all, we will recall the result that gives us which of the RMs (r, m) codes are binary liner codes after the Gray map. Note that if we have a quaternary linear code of type 2γ 4δ which is a binary linear code after the Gray map, we can compute the rank as γ + 2δ [6],[7]. Proposition 5 ([13]). For all m ≥ 1, the corresponding Z4 -linear code of the RMs (r, m) code is a binary linear code if and only if ⎧ ⎨ s = 0 and r ∈ {0, 1, 2, m − 1, m}, s = 1 and r ∈ {0, 1, m − 1, m}, ⎩ s ≥ 2 and r ∈ {0, m − 1, m}. Now, we will give an expression for the parameters γ and δ of a quaternary linear Reed-Muller code RMs (r, m) of type 2γ 4δ , depending on s, r and m. Lemma 6. Let C be a quaternary linear Reed-Muller code RMs (r, m) of type 2γ 4δ . Then, for s ≥ 0, m ≥ 2s + 1 and 0 ≤ r ≤ m, r2 m − 2s − 1 s γ= r − 2i i i=0
and
r 1 m γ δ= − . 2 i=0 i 2
The next proposition gives us an important result for these quaternary linear Reed-Muller codes. In some cases, we obtain two codes with the same rank, but different s. We will prove that these codes are equal. This proposition will be used for the classification of some of the RMs (r, m) codes in Section 4, and it will also be used to calculate the rank of these codes as exceptions. Proposition 7. Given two codes RMs (r, m) and RMs−1 (r, m) of type 2γ 4δ and 2γ 4δ , respectively, such that m ≥ 3 is odd, r ≥ 2 is even, and s = m−1 2 , then RMs (r, m) = RMs−1 (r, m).
Rank for Some Families of Quaternary Reed-Muller Codes
49
Proof. The generator matrix Gs−1(r,m) of RMs−1 (r, m) is obtained using the Plotkin construction from RMs−1 (r − 1, m − 1) and RMs−1 (r, m − 1). Furthermore, the generator matrices of RMs−1 (r − 1, m − 1) and RMs−1 (r, m − 1) can be obtained using Plotkin construction again from codes with m − 2 value. So we can write the generator matrix Gs−1(r,m) as follows: ⎛ ⎞ Gs−1(r,m−2) Gs−1(r,m−2) Gs−1(r,m−2) Gs−1(r,m−2) ⎜ 0 Gs−1(r−1,m−2) 0 Gs−1(r−1,m−2) ⎟ ⎟. Gs−1(r,m) = ⎜ ⎝ 0 0 Gs−1(r−1,m−2) Gs−1(r−1,m−2) ⎠ 0 0 0 Gs−1(r−2,m−2) The generator matrix Gs(r,m) of RMs (r, m) can be obtained using the BQPlotkin construction given by Definition 3. Since r is even and m is odd, r − 1 and m − 2 are odd. In this case any RMs−1 (r − 1, m − 2) code, where s = m−1 2 , is of type 20 4δ . This result can be proved by induction on m and using the BQ Plotkin construction. Since Gs−1(r−1,m−2) is of type 20 4δ , then Gs−1(r−1,m−2) = ˆ Gs−1(r−1,m−2) and Gs−1(r−1,m−2) = Gs−1(r−1,m−2) . It is easy to find a linear combination of rows that transforms the matrix Gs−1(r,m) into the matrix Gs(r,m) . Now, we will give a recursive way to compute the rank for all Reed-Muller codes in the RM0 and RM1 families. Note that the first binary nonlinear code is RM0 (3, 5). Thus, for m < 5 the rank is γ + 2δ. Proposition 8. Let C be a quaternary linear Reed-Muller code RM0 (r, m). The rank of C for m ≥ 5 is 0 if r ∈ {0, 1, 2, m − 1, m} r0(r,m) = r0(r,m−1) + r0(r−1,m−1) + m−2 if r ∈ {3, . . . , m − 2} 2r−3 Proposition 9. Let C be a quaternary linear Reed-Muller code RM1 (r, m). The rank of C for m ≥ 5 is ⎧ if r ∈ {0, 1, m − 1, m} ⎨0 r1(r,m) = r1(r,m−1) + r1(r−1,m−1) + m − 2 if r = 2 ⎩ m−1 if r ∈ {3, . . . , m − 2}. 2 2r−3 The next proposition gives the rank for all quaternary linear Reed-Muller codes with r ∈ {0, 1, m − 3, m − 2, m − 1, m} and any s. Proposition 10. Let RMs (r, m) be a quaternary linear Reed-Muller code of type 2γ 4δ . The rank of RMs (r, m) can be computed as ⎧ γ + 2δ if r ∈ {0, m − 1, m} ⎪ ⎪ ⎪ γ + 2δ ⎪ ⎨ if r = 1 and s ∈ {0, 1} rs(r,m) = γ + 2δ + δ−1 if r = 1 and s ≥ 2 2 ⎪ m−1 ⎪ 2 + δ if r = m − 3 and m > 6 ⎪ ⎪ ⎩ m−1 2 +δ if r = m − 2 and m > 4.
50
J. Pernas, J. Pujol, and M. Villanueva
Finally, the next proposition gives a recursive way to compute the rank of all quaternary linear Reed-Muller codes with r = 2 and any s. Proposition 11. Let RMs (2, m) be a quaternary linear Reed-Muller code of type 2γ 4δ . The rank of RMs (2, m) can be computed as s+1 (m − s − 3), rs(2,m) = rs(2,m−1) + rs(1,m−1) + 2s + 2 except when m is odd and s =
m−1 2 ,
since the rank is rs(2,m) = rs−1(2,m) .
When m is odd and s = m−1 2 , by Proposition 7, the codes RMs (2, m) and RMs−1 (2, m) are equals. Thus, the rank is also the same. Note that, for s = 0 and r = 2, we have a binary linear code and the rank is γ + 2δ.
4
Classification of Some Families of RMs(r, m) Codes
In this section, we will show that this invariant, the rank, will allow us to classify these RMs (r, m) codes in some cases depending on the parameter r. This classification was given for r = 1 and r = m − 2 [5], [10]. Now, we will extend this result for r = 2 and r = m − 3. We are close to generalize this result for all 0 ≤ r ≤ m, but it is not easy to obtain a general form to compute the rank for all quaternary linear Reed-Muller codes RMs (r, m). Table 1 shows the type 2γ 4δ and the rank of all these RMs (r, m) codes for m ≤ 10. In these examples, you can see that the rank is always different, except for the codes quoted in Proposition 7. If two codes have different rank, we can say that they are nonisomorphic. The next theorem proves that for a given m ≥ 4, and r ∈ {2, m − 3}2, the RMs (r, m) codes have different rank, so they are nonisomorphic. In some cases, there is an exception, but we know by Proposition 7 that the codes are equal. Theorem 12. For all m ≥ 4 and r ∈ {2, m − 3}, there are at least m+1 2 nonisomorphic binary codes with the same parameters as the code RM (r, m), except when m is odd, and r is even. In this case, there are at least m−1 nonisomorphic 2 binary codes with the same parameters as the code RM (r, m). Proof. By Proposition 11, we know that rs(2,m) = rs(2,m−1) + rs(1,m−1) + 2s + s+1 increases or 2 (m − s − 3). If r = 1, the code is Hadamard and rs(1,m−1) is equal to, depending on s. For m ≥ 4 the expression 2s + s+1 (m − s − 3) 2 also increases, depending on s. We can suppose that rs(2,m−1) is crescent on s for m = 4 and proceed by induction on m. Therefore, rs(2,m) is different for every s, except when m is odd, where we have two codes with the same rank. By Proposition 7, these two codes are equal. By Proposition 10, we know that rs(m−3,m) = 2m−1 + δ. In Proposition 6, we can see a way to compute γ. Since r = m − 3, then the value of γ is decreasing on s. Thus, δ is crescent and the rank is also crescent, depending on s. When m is odd and r is even, we have again the case of two equal codes, solved in Proposition 7.
Rank for Some Families of Quaternary Reed-Muller Codes
51
Table 1. Type 2γ 4δ and rank rs(r,m) for all RMs (r, m) codes with m ≤ 10 and r ∈ {0, 1, 2, 3}, showing them in the form (γ, δ) rs(r,m) m 1 2 3 4 5 6
7
8
9
10
5
HHr 0 s H 0 0 0 1 0 1 0 1 2 0 1 2 0 1 2 3 0 1 2 3 0 1 2 3 3 0 1 2 3 3
(1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0) (1,0)
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
(0,1) (1,1) (2,1) (0,2) (3,1) (1,2) (4,1) (2,2) (0,3) (5,1) (3,2) (1,3) (6,1) (4,2) (2,3) (0,4) (7,1) (5,2) (3,3) (1,4) (8,1) (6,2) (4,3) (2,4) (0,5) (7,1) (5,2) (3,3) (1,4) (1,5)
2 2 3 4 4 5 5 6 6 7 7 7 8 8 8 9 11 9 9 10 12 10 10 11 13 16 11 11 12 14 17
(0,2) 4 (1,3) 7 (1,3) 7 (3,4) 11 (1,5) 13 (6,5) 16 (2,7) 21 (2,7) 21 (10,6) 22 (4,9) 31 (2,10) 35 (15,7) 29 (7,11) 43 (3,13) 53 (3,13) 53 (21,8) 37 (11,13) 57 (5,16) 75 (3,17) 82 (28,9) 46 (16,15) 73 (8,19) 101 (4,21) 118 (4,21) 118 (36,10) 56 (22,17) 91 (12,22) 131 (6,25) 161 (4,26) 172
m−3
(. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . ) (. . . )
(1,0) 1 (1,0) 1 (3,1) 5 (1,2) 5 (6,5) 16 (2,7) 21 (2,7) 21 (10,16) 47 (4,19) 51 (2,20) 52 (15,42) 106 (7,46) 110 (3,48) 112 (3,48) 112 (21, 99) 227 (11,104) 232 (5,107) 235 (3,108) 236 (28,219) 475 (16,225) 481 (8,229) 485 (4,231) 487 (4,231) 487 (36,466) 978 (22,473) 985 (12,478) 990 ( 6,481) 993 ( 4,482) 994
m−2
m−1
m
(1,0) 1 (2,1) 4 (0,2) 4 (3,4) 11 (1,5) 13 (4,11) 27 (2,12) 28 (0,13) 29 (5,26) 58 (3,27) 59 (1,28) 60 (6,57) 121 (4,58) 122 (2,59) 123 (0,60) 124 (7,120) 248 (5,121) 249 (3,122) 250 (1,123) 251 (8,247) 503 (6,248) 504 (4,249) 505 (2,250) 506 (0,251) 507 (9,502) 1014 (7,503) 1015 (5,504) 1016 (3,505) 1017 (1,506) 1018
(1,0) 1 (1,1) 3 (1,3) 7 (1,3) 7 (1,7) 15 (1,7) 15 (1,15) 31 (1,15) 31 (1,15) 31 (1,31) 63 (1,31) 63 (1,31) 63 (1,63) 127 (1,63) 127 (1,63) 127 (1,63) 127 (1,127) 255 (1,127) 255 (1,127) 255 (1,127) 255 (1,255) 511 (1,255) 511 (1,255) 511 (1,255) 511 (1,255) 511 (1,511) 1023 (1,511) 1023 (1,511) 1023 (1,511) 1023 (1,511) 1023
(0,1) 2 (0,2) 4 (0,4) 8 (0,4) 8 (0,8) 16 (0,8) 16 (0,16) 32 (0,16) 32 (0,16) 32 (0,32) 64 (0,32) 64 (0,32) 64 (0,64) 128 (0,64) 128 (0,64) 128 (0,64) 128 ( 0,128) 256 ( 0,128) 256 ( 0,128) 256 ( 0,128) 256 (0,256) 512 (0,256) 512 (0,256) 512 (0,256) 512 (0,256) 512 (0,512) 1024 (0,512) 1024 (0,512) 1024 (0,512) 1024 (0,512) 1024
Conclusions
In a recent paper [15], new families of quaternary linear codes, the RMs (r, m) codes, are constructed in such a way that, after the Gray map, the Z4 -linear codes fulfill the same properties and fundamental characteristics as the binary linear Reed-Muller codes. In this paper, a structural invariant for binary codes, the rank, is used to classify some of these new families of codes. Specifically, we classified the RMs (r, m) codes with r ∈ {2, m − 3}. The RMs (r, m) codes with r ∈ {0, 1, m − 2, m − 1, m} were already classified using the rank [5],[10]. As a future research, it would be interesting to compute the rank for the RMs (r, m) codes with r ∈ {3, . . . , m − 4} and s ≥ 2, in order to see whether it is possible to obtain a full classification of all these RMs (r, m) codes using this invariant. In this paper, we also proved that, when m is odd, m ≥ 5, and r is even, there are two codes with the same rank, because these two codes are equal. Moreover, we also computed the rank for all codes in the RM0 and RM1 families.
References 1. Bierbrauer, J.: Introduction to coding theory. Chapman & Hall/CRC, Boca Raton (2005) 2. Borges, J., Fern´ andez, C., Phelps, K.T.: Quaternary Reed-Muller codes. IEEE Trans. Inform. Theory 51(7), 2686–2691 (2005) 3. Borges, J., Fern´ andez-C´ ordoba, C., Phelps, K.T.: “ZRM codes”. IEEE Trans. Inform. Theory 54(1), 380–386 (2008)
52
J. Pernas, J. Pujol, and M. Villanueva
4. Borges, J., Phelps, K.T., Rif` a, J., Zinoviev, V.A.: On Z4 -linear Preparata-like and Kerdock-like codes. IEEE Trans. Inform. Theory 49(11), 2834–2843 (2003) 5. Borges, J., Phelps, K.T., Rif` a, J.: The rank and kernel of extended 1-perfect Z4 linear and additive non-Z4 -linear codes. IEEE Trans. Inform. Theory 49(8), 2028– 2034 (2003) 6. Fern´ andez-C´ ordoba, C., Pujol, J., Villanueva, M.: On rank and kernel of Z4 -linear codes. In: Barbero, A. (ed.) ICMCTA 2008. LNCS, vol. 5228, pp. 46–55. Springer, Heidelberg (2008) 7. Fern´ andez-C´ ordoba, C., Pujol, J., Villanueva, M.: Z2 Z4 -linear codes: rank and kernel. Discrete Applied Mathematics (2008) (submitted), arXiv:0807.4247 8. Hammons, A.R., Kumar, P.V., Calderbank, A.R., Sloane, N.J.A., Sol´e, P.: The Z4 linearity of Kerdock, Preparata, Goethals and related codes. IEEE Trans. Inform. Theory 40, 301–319 (1994) 9. Huffman, W.C., Pless, V.: Fundamentals of Error-Correcting Codes. Cambridge University Press, Cambridge (2003) 10. Krotov, D.S.: Z4 -linear Hadamard and extended perfect codes. In: International Workshop on Coding and Cryptography, Paris, France, January 8-12, pp. 329–334 (2001) 11. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes. NorthHolland Publishing Company, Amsterdam (1977) 12. Phelps, K.T., Rif` a, J., Villanueva, M.: On the additive (Z4 -linear and non-Z4 linear) Hadamard codes: rank and kernel. IEEE Trans. Inform. Theory 52(1), 316– 319 (2006) 13. Pernas, J., Pujol, J., Villanueva, M.: Kernel dimension for some families of quaternary Reed-Muller codes. In: Calmet, J., Geiselmann, W., M¨ uller-Quade, J. (eds.) Beth Festschrift. LNCS, vol. 5393, pp. 128–141. Springer, Heidelberg (2008) 14. Pujol, J., Rif` a, J., Solov’eva, F.I.: Quaternary Plotkin constructions and quaternary Reed-Muller codes. In: Bozta¸s, S., Lu, H.-F(F.) (eds.) AAECC 2007. LNCS, vol. 4851, pp. 148–157. Springer, Heidelberg (2007) 15. Pujol, J., Rif` a, J., Solov’eva, F.I.: Construction of Z4 -linear Reed-Muller codes. IEEE Trans. Inform. Theory 55(1), 99–104 (2009) 16. Solov’eva, F.I.: On Z4-linear codes with parameters of Reed-Muller codes. Problems of Inform. Trans. 43(1), 26–32 (2007) 17. Wan, Z.-X.: Quaternary codes. World Scientific Publishing Co. Pte. Ltd., Singapore (1997)
Optimal Bipartite Ramanujan Graphs from Balanced Incomplete Block Designs: Their Characterizations and Applications to Expander/LDPC Codes Tom Høholdt1 and Heeralal Janwal2 1
Institute for Mathematics, Technical University of Denmark, Denmark
[email protected] 2 Department of Mathematics and Computer Science, University of Puerto Rico (UPR), Rio Piedras Campus, P.O. Box: 23355, San Juan, PR 00931 - 3355
[email protected]
Abstract. We characterize optimal bipartite expander graphs and give necessary and sufficient conditions for optimality. We determine the expansion parameters of the BIBD graphs and show that they yield optimal expander graphs that are also bipartite Ramanujan graphs. In particular, we show that the bipartite graphs derived from finite projective and affine geometries yield optimal Ramanujan graphs. This in turn leads to a theoretical explanation of the good performance of a class of LDPC codes. Keywords: Bipartite graphs, expander graphs, eigenvalues of graphs, Ramanujan graphs, BIBD, finite geometries, LDPC and expander codes.
1
Introduction
An expander graph is a highly connected “sparse” graph (see, for example [51]). Expander graphs have numerous applications including those in communication science, computer science (especially complexity theory), network design, cryptography, combinatorics and pure mathematics (see the books and articles in the Bibliography). Expander graphs have played a prominent role in recent developments in coding theory (LDPC codes, expander codes, linear time encodable and decodable codes, codes attaining the Zyablov bound with low complexity of decoding) (see [55], [53], [54], [48], [57], [58], [38] [8],[27], [26], [4], [13], and others). We shall consider graphs X = (V, E), where V is the set of vertices and E is the set of edges of X . We will assume that the graph is undirected and connected and we shall only consider finite graphs. For F ⊂ V , the boundary ∂F is the set of edges connecting F to V \ F . The expanding constant, or isoperimetric constant of X is defined as, |∂F | h(X ) = min ∅=F ⊂V min{|F |, |V \ F |} M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 53–64, 2009. c Springer-Verlag Berlin Heidelberg 2009
54
T. Høholdt and H. Janwal
If X is viewed as the graph of a communication network, then h(X ) measures the “quality” of the network as a transmission network. In all applications, the larger the h(X ) the better, so we seek graphs (or families of graphs) with h(X ) as large as possible with some fixed parameters. It is well-known that the expansion properties of a graph are closely related to the eigenvalues of the adjacency matrix A of the graph X = (V, E); it is indexed by pairs of vertices x, y of X and Axy is the number of edges between x and y. When X has n vertices, A has n real eigenvalues, repeated according to multiplicities that we list in decreasing order μ0 ≥ μ1 ≥ . . . ≥ μn−1 It is also known that if X is D-regular, i.e. all vertices have degree D, then μ0 = D and if morover the graph is connected μ1 < D. Also X is bipartite if and only if −μ0 is an eigenvalue of A. We recall the following Theorem 1. Let X be a finite, connected , D-regular graph then (D − μ1 )/2 ≤ h(X ) ≤ 2D(D − μ1 ) and Theorem 2. Let (Xm )m≥1 be a family of finite connected, D-regular graphs with |Vm | → +∞ as m → ∞. Then √ lim inf μ1 (Xm ) ≥ 2 D − 1. N →∞
This leads to the following Definition 1. A finite connected, D-regular graph X is Ramanujan if, for every eigenvalue μ of A other than ±D, one has √ μ ≤ 2 D − 1. We will also need Definition 2 (Bipartite Ramanujan Graphs). Let X be a√(c, d)-regular √ bipartite graph. Then X is called a Ramanujan graph if μ1 (X ) ≤ c − 1 + d − 1. In this article, we address the optimality of expander graphs and the property of being Ramanujan for: A D-regular graphs; B (c, d)-regular bipartite graphs; C Irregular bipartite graphs. We also show that the bipartite graphs of Balanced Incomplete Block Designs yield optimal Ramanujan graphs (meaning that we have equality in definition 1). Furthermore, we show that the bipartite graphs derived from finite projective and affine geometries (PG/EG) also yield optimal Ramanujan graphs.
Optimal Bipartite Ramanujan Graphs
2
55
Optimal Expander Graphs and Balanced Incomplete Block Designs
Unless otherwise specified, for background on block designs, we follow Hall [14]. Definition 3. A balanced incomplete block design (BIBD) D with parameters (v, b, r, k, λ) is an incidence structure with a set V of v distinct varieties (or objects) denoted a1 , · · · , av and a set B of b distinct blocks denoted B1 , · · · , Bb , such that each of the b blocks is incident with k varieties, and each of the v varieties is incident with r blocks, and every pair of varieties is incident with precisely λ blocks. The design is called symmetric if b = v and its parameters are denoted by (v, k, r). Proposition 1. For a (BIBD) D with parameters (v, b, r, k, λ), 1. b · k = v · r 2. r(k − 1) = λ(v − 1) Remark 1. For a BIBD, Fisher’s inequality implies that b ≥ v and hence r ≥ k and for symmetric designs b = v, and r = k. Furthermore, in any (v, k, r) symmetric block design, every pair of blocks is incident with precisely λ varieties, and if v is even then k − λ is a square. Definition 4. A BIBD can be described by a v × b incidence matrix H := (hij ), where for 1 ≤ i ≤ v and 1 ≤ j ≤ b, hij = 1 if ai is incident with Bj , and hij := 0 else. Then D is a block design if and only if the following system of equations hold. B := HH T = (r − λ)I + λJv
(1v )T H = k1b
(1)
where Jv is the v×v all 1’s matrix, and 1v and 1b are all 1’s vectors of appropriate lengths. 2.1
The Bipartite Graph of a BIBD
Definition 5. Let D be a BIBD. The bipartite graph XD has left set of vertices V and right set of vertices B and the adjacency of the left and right vertices is defined by the incidence structure of the design. It is clear that the left vertices of XD all have degree r and the right vertices all have degree k so the graph is what is called (r, k)-regular. In the symmetric case all vertices have degree r so the graph is r-regular. The adjacency matrix of the bipartite graph XD is then, 0 H A= (2) HT 0 We will from the following proposition from [18] determine the eigenvalues of A.
56
T. Høholdt and H. Janwal
Proposition 2. Let M be a Hermitian matrix. 1. M is diagonalizable by a Unitary matrix. 2. The geometric multiplicity of an eigenvalue of M equals its algebraic multiplicity. 3. All the eigenvalues of M are real. l 4. If PM (x) = i=1 (x − λi )mi is the characteristic polynomial of M with distinct real eigenvalues λi with algebraic multiplicities mi , then the minimal polynomial of M is mM (x) = li=1 (x − λi ) With notations as above we can now prove Lemma 1. 1. The characteristic and minimal polynomials of B = HH T are PB (x) = (x − r · k)(x − (r − λ))v−1 , mB (x) = (x − r · k)(x − (r − λ)). 2. The characteristic and minimal polynomials of C = H T H are, PC (x) = xb−v (x − r · k)(x − (r − λ))v−1 , mC (x) = x(x − r · k)(x − (r − λ)). Proof. The matrix Jv has eigenvalue v with multiplicity 1 and corresponding eigenvector 1v , and eigenvalue 0 with multiplicity (v − 1) with eigenspace W = {Y |Y ⊥ 1v }. And all eigenvectors are also eigenvectors of I. From the relation 1, we conclude that r · k is an eigenvalue of B with eigenvector 1v , and (r − λ) with eigenspace W , and hence multiplicity (v − 1). Therefore, PB (X) = (x − r · k)(x − (r − λ))v−1 . Since B is a symmetric matrix, from Proposition 2, we have mB (x) = (x − r · k)(x − (r − λ)). Since HH T and H T H have the same nonzero eigenvalues with the same multiplicities ([11] p.186 ) we get PC (x) = xb−v (x − r · k)(x − (r − λ))v−1 , and from Propostion 2 mC (x) = x(x − r · k)(x − (r − λ)).
Q.E.D.
This in turn leads to Theorem 3. The adjacency matrix of the A of the bipartite √ graph XD √ of a (x) = (x− k · r)(x+ k · r) (v, b, r, k, λ) BIBD has characteristic polynomial P A √ √ v−1 v−1 b−v (x − r − λ) (x + r − λ) x , and minimal polynomial m (x) = (x − A √ √ √ √ k · r)(x + k · r)(x − r − λ)(x + r − λ)x. In particular, the eigenvalues are √ √ k · r with multiplicity 1, r − λ with multiplicity v−1, 0 with multiplicity b−v, √ √ − r − λ with multiplicity v − 1, and − k · r with multiplicity 1. Proof. First
HH T 0 AA = A = 0 HT H T
2
(3)
Therefore, the characteristic polynomial PA2 (x) = PB (x)PC (x). Therefore, from Lemma 1, PA2 (x) = (x − r · k)2 (x − (r − λ))2(v−1) xb−v . It is clear that PA2 (x2 ) = PA (x)PA (−x) and since the graph is bipartite we have PA (−x) = (−1)v+b PA (x) ( [11] p. 31) and hence
Optimal Bipartite Ramanujan Graphs
57
(−1)v+b (PA (x))2 = PA2 (x2 ) = (x2 − r · k)2 (x2 − (r − λ))2(v−1) x2(b−v) so we have PA (x) = ±(x2 − r · k)(x2 − (r − λ))v−1 xb−v and the theorem follows directly.
(4) Q.E.D.
Theorem 4. (I) The bipartite graph XD of a (v, √ b, r, k, λ) BIBD is a (r, k)regular bipartite Ramanujan graph with μ1 = r − λ. (II) The r-regular graph of √ a symmetric BIBD is an r-regular bipartite Ramanujan graph with μ1 = r − λ. Proof. The proofs of (I) and (II) follow from Theorem 1 by applying the definition of Ramanujan graphs and following the inequalities. Q.E.D. We also will show that the bipartite graph of a BIBD is an optimal expander graph for its parameters.
3
Characterization of Optimal Bipartite (c, d) Regular Expander Graphs
We recall that a matrix A with rows and columns indexed by a set X is called irreducible when it is not possible to find a proper subset of X so that A(x, y) = 0 whenever x ∈ S and y ∈ X \ S. Equivalently, A is not irreducible if and only if it is possible to apply a simultaneous row and column permutation a matrix in a square block form so that one of the blocks is a zero block. For the following Lemma, see for example ([18] p. 363). Lemma 2. Let D be a finite graph. Then the adjacency matrix of A is irreducible if and only if D is connected. We shall also need Proposition 3 (Perron-Frobenius). Let A be an irreducible non-negative matrix. Then, there is up to scalar multiples, a unique non-negative eigenvector a := (a1 , a2 , · · · , an ) all of whose coordinates ai are strictly positive. The corresponding eigenvalue μ0 (called the dominant eigenvalue of A has algebraic multiplicity 1 and μ0 ≥ μi for any eigenvalue μi of A. We recall the following special case of Courant-Fisher Theorem (called the Raleigh-Ritz Theorem) (see for example, ([18], Theorem 4.2.2)) Theorem 5. Let A be an n × n Hermitian matrix over the complex field C, then it is known that all its eigenvalues are real, with maximum eigenvalue μmax . ∗T For 0 = X ∈ C n , define the Raleigh quotient RX := XX ∗ TAX . Then μmax = X maxX=0 RX . Furthermore, RX ≤ μmax with equality if and only if X is an eigenvector corresponding to the eigenvalue μmax .
58
T. Høholdt and H. Janwal
Definition 6. Let X be a bipartite graph with average left degrees c and average right degree d. Define v := |E|/c and b := |E|/d. Let μ1 := maxγ {|γ||γ = μmax }, where the maximization is over the eigenvalues of X , and μmax is the maximum eigenvalue of the adjancency matrix of X . Theorem 6. Let X be a connected bipartite graph with average left √ degrees c and average right degree d and maximum eigenvalue μmax . Then cd ≤ μmax with equality if and only is X is a (c, d)-regular bipartite graph. Proof. If X is a (c, d)-regular bipartite graph, with adjancy matrix 0 H A= HT 0 d then it is easy to see that [1, 1, . . . , 1, τ, . . . , τ ] where τ = c is an eigenvec√ tor corresponding to the eigenvalue cd and then the inequality and equality follow from the Perron-Frobenius theorem (3). Conversely, assume that X is a connected bipartite graph on v left vertices of average degree c and b right vertices of average degree d and maximum eigenvalue μmax . Define Z:=
[1, 1, · · · , 1, τ, · · · , τ ]T , in which the first v component are 1, and where τ := dc . √ √ Then one can see that RZ = cd and therefore cd ≤ μmax . If we have equality then RZ = μmax , and from the Courant-Fisher Theorem Z is an eigenvector corresponding to the √maximal eigenvalue RZ . Therefore, AZ = μmax Z. By assumption, then AZ = cdZ. By solving the simultaneous equations, we get that cvi = c for every left vertex vi , and dbj = d for every right vertex bj . Consequently, X is a (c, d) regular graph. Q.E.D. Theorem 7. Let X be a connected graph such that both ±μmax are eigenvalues (i.e. it is a bipartite with say v left vertices and b right vertices). Suppose that 1 2 1 2 μi μi · = μ2max . 2v μ 2b μ i
i
Then X is a (c, d)-regular graph, where c := |E|/v and d := |E|/b Proof. By of the adjacency matrix, T race(AAT ) = 2|E|. But T race
definition T 2 (AA ) = μi μi Since, c := |E|/v and d = |E|/b, by assumption we get cd = μ2max and therefore by Theorem 6, X is a (c, d) regular graph with c := |E|/v and d := |E|/b. Q.E.D. Theorem 8. Suppose that X is a bipartite graph with v left vertices and b right vertices where b ≥ v ≥ 2. Suppose the eigenvalues are ±α with multiplicity 1, ±β with multiplicity (v − 1), 0 with multiplicity (b − v) (and α > β > 0). If α2 + (v − 1)β 2 = α2 bv
Optimal Bipartite Ramanujan Graphs
59
then X is the graph of a balanced incomplete block design with parameters (b, v, r, k, λ), where 2 2 2 2 r = α +(v−1)β , k = α +(v−1)β and λ = r − β 2 . b v Proof. With r and k as above we have that r = |E| v is the avearge degree of the left vertices and k = |E| is the average degree of the right vertices. But then √b by the condition we get rk = α so by Theorem 7, X is a bipartite (r, k)-regular graph. √ √ √ √ √ √ √ Let √ X = [ r, r, · · · , r, k, k, · · · , k], where the multiplicity of r is v, and k is b. √ Then we can verify that AX = r · kX. Therefore, √ (5) AT AX = r · kAX = (rk)X T 0 H HH 0 Let A = . Then AAT = A2 = . HT 0 0 HT H Let B = HH T , where H is the left-right incidence matrix of the bipartite graph, and let Q = H T H. Hence,√ from of H, we can confirm that BY = r · kY, where √ the definition √ Y = [ r, r, · · · , r]T . Consequently Z = [1, 1, · · · , 1]T is an eigenvector corresponding to the eigenvalue r · k. Then by ([11] p.186), B has eigenvalues α = r · k with multiplicity 1, β with multiplicity (v − 1). Therefore, the minimal polynomial of B is mB (x) = (x−α)q(x), where q(x) = (x − β). (Since B is real symmetric, it is diagonalizable by an orthonormal basis, and therefore its minimal polynomial is composed of distinct factors). Substituting B for X, we have Bq(B) = r · kq(B). Since α is a simple eigenvalue of B with eigenvector Z, q(B) has columns that are multiples of Z. However q(B) is symmetric, as B is symmetric, we conclude that all the column multiples are the same scalar c, i.e. q(B) = cJ, where J is the v × v matrix of all 1’s. Hence B − β 2 Iv = B − (r − λ)Iv = c · J. Hence, B = (r − λ)Iv + c · Jv . By taking Trace both sides, and since T race(B) = T race(HH T ) = rv, we conclude that c = λ. Hence B = HH T = (r − λ)I + λJ, and 1v H = k.1b since the graph is a bipartite graph. Therefore, H is the incidence matrix of a (v, b, r, k, λ) BIBD. Q.E.D. 3.1
Bounds on the Eigenvalues of Irregular Bipartite Graphs
Theorem 9. Let X be a connected bipartite graph with average left degree c and average right degree d. Let v := |E|/c and b := |E|/d (we assume b ≥ v) and maximum eigenvalue μmax . Then
60
T. Høholdt and H. Janwal
μ1 ≥
|E| − μ2max v−1
1/2
with equality if and only if the eigenvalues are ±μmax (with multiplicity 1), ±μ1 (with multiplicity (v − 1), and 0 with multiplicity b − v. Proof. Since X is a connected bipartite graph, the set of eigenvalues is S(X ) := {±μmax , ±μ1 , ±μ2 , ± · · · ± μN −1 , 0} with 2N non-zero eigenvalues, where the absolute values are in decreasing order. HH T 0 Since AAT = A2 = 0 HT H we get by taking the trace, T race(A2 ) = T race(HH T ) + T race(H T H) = vc + b · d = |E| + |E| = 2|E|. Therefore, 2|E| = T race(A2 ) = 2μmax 2 + 2μ1 2 +
N −1 2 i=2 [μ2i ≤ 2μmax 2 + 2(N − 1)μ1 2 . Hence μ1 ≥
|E| − μ2max N −1
1/2 ≥
|E| − μ2max v−1
1/2
where the last inequality follows from the fact that N ≤ v. It is clear that equality occurs if and only if the eigenvalues are ±μmax with multiplicity 1, ±μ1 with multiplicity (v − 1), and 0 with multiplicity b − v. Q.E.D. Corollary 1. Let X be a connected√(c, d)-regular bipartite graph. Define v := |E|/c and b := |E|/d. Then μmax = c · d, and μ1 ≥
|E| − μ2max v−1
1/2
with equality if the eigenvalues are ±μmax (with multiplicity 1), ±μ1 (with multiplicity (v − 1)), and 0 with multiplicity b − v. We can now derive the central theorem in this paper: Theorem 10. The bipartite graph XD of a √ (v, b, r, k, λ) BIBD is a (r, k)-regular bipartite Ramanujan graph with μ1 (X) = r − λ, and it is optimal expander graph with these parameters.
4
Optimal Ramanujan Graphs from Finite Geometries
In this we consider special type of BIBD that are highly structured, namely those coming from Finite Affine and Projective Geometries. The corresponding class of bipartite graphs from P G(n, IFq ) and EG(n, IFq ) coming from subspaces, yield us, by way of Theorem 12 optimal bipartite expander graphs. The class has been used to construct good LDPC codes ( [38], [39] , [40]) and our results on the eigenvalues and hence the expansion coefficient give a partial theoretical explanation of the good performance of these codes.
Optimal Bipartite Ramanujan Graphs
4.1
61
Optimal Bipartite Ramanujan Graphs of Projective Geometries
Let IFq be the finite field with q = pm elements. Let n ≥ 2 be an integer, and let s be an integer such that 1 ≤ s ≤ n − 1. We consider the incidence structure X (n, s, q) = (V, B), where V = the points of the n-dimensional projective geometry P G(n, q), and B = {S| dim S = s and S a subspace of P G(n, q)}. Proposition 4 (Hall [14]). The graph of the incidence structure X (n, s, q) is a bipartite graph of a (v, b, r, k, λ) BIBD with parameters b = (qn+1 −1)(qn+1 −q)···(qn+1 −qs ) (qn+1 −1) (qs+1 −1) (qn −1)···(qn −qs−1 ) , v = , k = , r = (qs+1 −1)(qs+1 −q)···(qs+1 −qs ) (q−1) (q−1) (qs −1)···(qs −qs−1 ) , λ=
(qn+1 −q2 )···(qn+1 −qs ) (qs+1 −q2 )···(qs+1 −qs ) .
Theorem 11. The bipartite graph √ X (n, s, q) is a (r, k)-regular bipartite Ramanujan graph with μ1 (X) = r − λ, and it is optimal expander graph with these parameters.
4.2
Optimal Bipartite Ramanujan Graphs of Affine Geometries
Let IFq be the finite field with q = pm elements. Let n ≥ 2 be an integer, and let s be an integer such that 1 ≤ s ≤ n − 1. We consider the incidence structure Y(n, s, q) = (V, B), where V = the points of the n-dimensional affine geometry EG(n, q), and B = {S| dim S = s and S a subspace of EG(n, q)}. Proposition 5 (Hall [14]). The graph of the incidence structure Y(n, s, q) is a bipartite graph of a (v, b, r, k, λ) BIBD with parameters b = qn (qn −1)(qn −q)···(qn+1 −qs−1 ) (qn −1)(qn −q)···(qn+1 −qs−1 ) n s qs (qs −1)(qs+1 −q)···(qs −qs−1 ) , v = q , b = (qs −1)(qs+1 −q)···(qs −qs−1 ) , k = q , λ = (qn −q2 )···(qn −qs−1 ) (qs −q)···(qs+1 −qs−1 ) .
Theorem 12. The bipartite graph √ Y(n, s, q) is a (r, k)-regular bipartite Ramanujan graph with μ1 (X) = r − λ, and it is optimal expander graph with these parameters.
5
Conclusion
In this article we have considered a special type of expander graphs coming from balanced incomplete block designs, in particular from finite geometries. We have shown that these yield optimal Ramanujan graphs. In a forthcoming paper, Janwa and Høholdt [17], we study the expander codes and LDPC codes that are derived from BIBD and codes based on finite affine and projective geometric based bipartite graph and provide theoretical bounds on their parameters such as distance and rate, and derive some excellent codes.
62
T. Høholdt and H. Janwal
References 1. Alon, N.: Eigenvalues and expanders. Combinatorica 6, 83–96 (1986) 2. Alon, N., Bruck, J., Naor, J., Naor, M., Roth, R.M.: Construction of asymptotically good low-rate error-correcting codes through pseudo-random graphs. IEEE Trans. on Inform. Theory 38(2), 509–516 (1992) 3. Alon, N., Lubotzky, A., Wigderson, A.: Semi-direct product in groups and zig-zag product in graphs: Connections and applications. In: Proc. of the 42nd FOCS, pp. 630–637 (2001) 4. Barg, A., Zemor, G.: Error exponent of expander codes. IEEE Trans. Inform. Theory 48(6), 1725–1729 (2002) 5. Bass, H.: The Ihara-Selberg zeta function of a tree lattice. International Journal of Mathematics 3(6), 717–797 (1992) 6. Biggs, N.: Algebraic Graph Theory, 2nd edn. Cambridge University Press, Cambridge (1994) 7. Cvetkovic, D.M., Doob, M., Sachs, H.: Spectra of Graphs: Theory and Applications. Academic Press, London (1979) 8. David Forney, G.: Codes on graphs: recent progress. Phys. A 302(104), 1–13 (2001) 9. Frey, B.J., Koetter, R., Forney, G.D., Kschischang, F.R., McEliece, R.J., Spielman, D.A. (eds.): Special Issue on Codes on Graphs and Iterative Algorithms. IEEE Trans. Inform. Theory 47 (2001) 10. Friedman, J. (ed.): Expanding Graphs. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 10. AMS (1993) 11. Godsil, C.D.: Algebraic Combinatorics. Chapman and Hall, Boca Raton (1993) 12. Guo, Q., Janwa, H.: Some properties of the zig-zag product of graphs. In: Presented at the 2nd NASA-EPSCoR meeting, November 10 (2003) (preprint) 13. Guruswami, V., Indyk, P.: Expander-Based Construction of Efficiently Decodable Codes. In: 42nd IEEE Symposium on Foundation of Computer Science, October 14-17, p. 658 (2001) 14. Hall Jr., M.: Combinatorial Theory, 2nd edn. Wiley Interscience in discrete Mathematics. Kluwer, Dordrecht (2003); Høholdt, T., Justesen, J.: Graph codes with Reed-Solomon component codes (submitted) 15. Høholdt, T., Justesen, J.: Graph codes with Reed-Solomon component codes. In: Proc. ISIT 2006, pp. 2002–2006 (2006) 16. Høholdt, T., Justesen, J.: Iterated analysis of iterated hard decision decoding of product codes with Reed-Solom component codes. In: Proc. ITW 2007 (September 2007) 17. Høholdt, T., Janwa, H.: Characterization of some optimal expander graphs and their application to expander/LDPC codes (in preparation) 18. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1992) 19. Janwa Jr., H.: “Relations among Expander Graphs, Codes, Sequence Design, and Their Applications. In: Proceedings of the 4th World Multi-conference on Systemics, Cybernetics and Informatics (SCI 2000 and ISAS 2000), Orlando Florida, July 23-26, vol. XI, pp. 122–124 (2000) 20. Janwa, H.: New (Explicit) Constructions of Asymptotic Families of Constant Degree Expander Graphs from Algebraic Geometric (AG) Codes and Their Applications to Tanner Codes. ABSTRACTS OF THE AMS 26(1), 197 (2005); Joint AMS annual Meeting, Atlanta, 2005; Special session on Algebraic Geometry and Codes
Optimal Bipartite Ramanujan Graphs
63
21. Janwa, H.: New Constructions of Sparse Quasi-Random Graphs (in preparation) 22. Janwa, H.: Further examples of Ramanujan graphs from AG codes (2003) (in preparation) 23. Janwa, H.: Covering radius of codes, expander graphs, and Waring’s problem over finite fields (in preparation) 24. Janwa, H.: A graph theoretic proof of the Delsarte bound on the covering radius of linear codes (preprint) 25. Janwa, H.: Good Expander Graphs and Expander Codes: Parameters and Decoding. In: Fossorier, M.P.C., Høholdt, T., Poli, A. (eds.) AAECC 2003. LNCS, vol. 2643, pp. 119–128. Springer, Heidelberg (2003) 26. Janwa, H., Lal, A.K.: “On Expander Graphs: Parameters and Applications (January 2001) (submitted), arXiv:cs.IT/04060-48v1 27. Janwa, H., Lal, A.K.: On Tanner Codes: Parameters and Decoding. Applicable Algebra in Engineering, Communication and Computing 13, 335–347 (2003) 28. Janwa, H.: Explicit Constructions of Asymptotic Families of Constant Degree Expander Graphs from Algebraic Geometric (AG) Codes. Congressus Numerantium 179, 193–207 (2006) 29. Janwa, H., Lal, A.K.: On the Generalized Hamming Weights and the Covering Radius of Linear Codes. LNCS, vol. 4871, pp. 347–356. Springer, Heidelberg (2007) 30. Janwa, H., Mattson Jr., H.F.: “The Projective Hypercube and Its Properties and Applications. In: Proceedings of the 2001 International Symposium on Information Theory (ISIT 2001), Washington D.C., USA (June 2001) 31. Janwa, H., Moreno, O.: Strongly Ramanujan graphs from codes, polyphasesequences, and Combinatorics. In: Proceedings of the International Symposium on Information Theory (ISIT 1997), Ulm, Germany, p. 408 (1997) 32. Janwa, H., Moreno, O.: New Constructions of Ramanujan Graphs and Good Expander Graphs from Codes, Exponential Sums and Sequences. IMA Summer Program on Codes, Systems and Graphical Models (August 1999) 33. Janwa, H., Moreno, O.: Elementary constructions of some Ramanujan graphs. Congressus Numerantium (December 1995) 34. Janwa, H., Moreno, O.: “Coding theoretic constructions of some number theoretic Ramanujan graphs. Congressus Numerantium 130, 63–76 (1998) 35. Janwa, H., Moreno, O.: “Expander Graphs, Ramanujan Graphs, Codes, Exponential Sums, and Sequences (to be submitted) 36. Janwa, H., Moreno, O., Kumar, P.V., Helleseth, T.: Ramanujan graphs from codes over Galois Rings (in preparation) 37. Janwa, H., Rangachari, S.S.: Ramanujan Graphs. Lecture Notes. Preliminary Version (July 1994) 38. Kou, Y., Lin, S., Fossorier, M.P.C.: Low-density parity-check codes based on finite geometries: a rediscovery and new results. IEEE Trans. Inform. Theory 47(7), 2711–2736 (2001) 39. Lin, S., Kou, Y.: A Geometric Approach to the Construction of Low Density Parity Check Codes. In: Presented at the IEEE 29th Communication Theory Workshop, Haynes City, Fla., May 7-10 (2001) 40. Lin, S., Kou, Y., Fossorier, M.: Finite Geometry Low Density Parity Check Codes: Construction, Structure and Decoding. In: Proceedings of the ForneyFest. Kluwer Academic, Boston (2000) 41. Li, W.-C.W.: Character sums and abelian Ramanujan graphs. Journal of Number Theory 41, 199–217 (1992) 42. Lubotzky, A.: Discrete Groups, Expanding Graphs and Invariant Measures. Birkh¨ auser, Basel (1994)
64
T. Høholdt and H. Janwal
43. Lubotzky, A., Phillips, R., Sarnak, P.: Ramanujan Graphs. Combinatorica 8, 261– 277 (1988) 44. Margulis, G.A.: Explicit group theoretic constructions of combinatorial schemes and their applications for construction of expanders and super-concentrators. Journal of Problems of Information Transformation, 39–46 (1988) 45. Reingold, O., Vadhan, S., Wigderson, A.: Entropy waves, the zig-zag graph product, and new constant-degree expanders. Ann. of Math. 155(1), 157–187 (2002) 46. Richardson, T., Urbanke, R.: The capacity of low-density parity check codes under message-passing decoding. IEEE Trans. Inform. Theory 47(2), 569–618 (2001) 47. Richardson, T.J., Shokrollahi, M.A., Urbanke, R.: Design of capacity-approaching irregular low-density parity-check codes. IEEE Trans. Inform. Theory 47(2), 619– 637 (2001) 48. Rosenthal, J., Vontobel, P.O.: Construction of LDPC Codes using Ramanujan Graphs and Ideas from Margulis. In: Allerton Conference (2000) 49. Roth, R.M., Skachek, V.: Improved nearly MDS codes. IEEE Trans. on Inform. Theory 52(8), 3650–3661 (2006) 50. Sarnak, P.: Some Applications of Modular Forms. Cambridge University Press, Cambridge (1990) 51. Sarnak, P.: What is... an Expander. Notices of the AMS 51(7), 762–763 (2004) 52. Shokrollahi, A.: Codes and graphs. In: Reichel, H., Tison, S. (eds.) STACS 2000. LNCS, vol. 1770, pp. 1–12. Springer, Heidelberg (2000) 53. Sipser, M., Spielman, D.A.: Expander codes. Codes and complexity. IEEE Trans. Inform. Theory 42(6, part 1), 1710–1722 (1996) 54. Spielman, D.A.: Linear-time encodable and decodable error-correcting codes. IEEE Trans. on Inform. Theory 42(6), 1710–1722 (1996) 55. Tanner, R.M.: Explicit concentrators from generalized N -gons. SIAM J. of Discrete and Applied Mathematics 5, 287–289 (1984) 56. van Lint, J.H., Wilson, R.M.: A Course in Combinatorics, 2nd edn. Cambridge University Press, Cambridge (2001) 57. Tanner, R.M.: Minimum-distance bounds by graph analysis. IEEE Trans. Inform. Theory 47(2), 808–821 (2001) 58. Zemor, G.: On Expander Codes. IEEE Trans. Inform. Theory. IT-47(2), 835–837 (2001)
Simulation of the Sum-Product Algorithm Using Stratified Sampling John Brevik1 , Michael E. O’Sullivan2 , Anya Umlauf2 , and Rich Wolski3 1
California State University, Long Beach 1250 Bellflower Blvd. Long Beach CA 90840 2 Department of Mathematics and Statistics San Diego State University 5500 Campanile Dr. San Diego, CA, USA, 92182 3 Department of Computer Science University of California, Santa Barbara, CA 75275
Abstract. Using stratified sampling a desired confidence level and a specified margin of error can be achieved with smaller sample size than under standard sampling. We apply stratified sampling to the simulation of the sum-product algorithm on a binary low-density parity-check code. Keywords: Sum-product algorithm, stratified sampling, simulation.
1
Introduction
One of the mysterious aspects of decoding low-density parity-check codes with the sum-product algorithm is the error floor appearing in the performance curve. As is now well known, the performance of the SPA, measured in either bit-errorrate or word-error-rate as a function of the signal-to-noise ratio, tends to have two regions: a “waterfall” portion, where the curve descends ever more steeply until it reaches the second region, the “error floor,” where the curve flattens out considerably. The change in descent has been attrributed to “near-codewords” [4], also called trapping sets [5]. The error floor for ensembles of codes using erasure decoding and also maximum a posteriori (MAP) decoding are analyzed with sophisticated probabilistic techniques in [6, 3.24,4.14]. The analysis of the error floor for the SPA appears to be quite challenging. In particular, for individual codes, “the error floor does not concentrate [to the ensemble average] and significant differences can arise among the individual members of an ensemble. It is therefore important to indentify ‘good elements’ of an ensemble” [6, p.266]. The error floor is important because it limits the utility of an LDPC code. In practice, it may be desirable to have assurance that the (bit) error rate at a particular signal-to-noise ratio is lower than 10−10 , but simulation to achieve
This research was supported by NSF CCF-Theoretical Foundations grants #0635382, #0635389 and #0635391.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 65–72, 2009. c Springer-Verlag Berlin Heidelberg 2009
66
J. Brevik et al.
a reasonable level of confidence at this rate can be expensive. There are some methods to improve code constructions to lower the error floor see e.g. [3,1], but there is no clear theory to explain the error floor and no method to compute it for a particular LDPC code. Thus one might hope for improved methods of simulation. Our approach to this problem is based on the observation that with high probability, a received vector for input in the SPA has a very small number of badly corrupted bits and will decode successfully; to look at it another way, the input vectors that fail to decode are concentrated in a region of relatively low probability. In this article, we summarize how stratified sampling may be used to greatly reduce the sample size required for a desired confidence level and margin of error. We apply stratified sampling to a particular LDPC code—a (3, 6) regular code of length 26—that is small but still interesting. The method reduces the number of samples required by roughly a factor of 70, at SNR 9.0, a margin of error of 20% of the error rate (which is roughly 4 × 10−6 ), and 95% confidence level.
2
Stratified Sampling
In this section, we give a brief resum´e of the ideas behind stratified sampling, which is a method for reducing the variance in a sample (Cf. [2, Ch. 8] for a more complete account). We give a rather general treatment, provide some simple examples, and finally interpret the method in the context of block-error rates for LDPC codes. 2.1
Theoretical Overview
Let (Ω, p) be a probability space, and let X be a real-valued random variable on Ω, which we will assume to have finite mean μ and variance σ 2 . We are interested in the problem of estimating the expected value μ = E(X) from a sample (x1 , x2 . . . , xr ) of size r taken from X. Provided that X is relatively well behaved ¯ has approximate distribution and that r is sufficiently large, the sample mean X N (μ, σ 2 /r). Again with the same provisions, σ 2 is adequately approximated by the sample variance s2 , so √ we can obtain a reasonable confidence interval for μ ∗ √ as (¯ x − z s/ r, x ¯ + z ∗ s/ r), where x ¯ is the mean obtained from our particular sample and z ∗ is an appropriate standard-normal critical value. Now suppose that Y is another random variable on Ω taking on the finite set Y1 , Y2 , . . . , Yk of values; we will refer to the events Y = Yk as strata of Ω. Suppose further that pk = p(Y = Yk ) is known or can be approximated effectively. This suggests an alternate method for estimating μ from a sample of size r: Take of the variables Xj = X|Yj , a sample (x j1 , xj2 . . . , xjrj ) of size rj from each where r = rj , and take the weighted sum x ˆ= pj x ¯j of sample means. The ¯ j has approximate distribution N (E(X|Yj ), V (X|Yj )/rj ); note sample mean X that pj E(X|Yj ) = μ. If we again assume that the j th sample variance s2j adequately approximates thetrue variance V (X|Yj ), we find that the variance ¯ j is approximately of pj X p2j s2j /rj .
Simulation of the Sum-Product Algorithm Using Stratified Sampling
To find the values of rj , subject to
67
rj = r that minimize the variance, use p2j s2j p s Lagrange multipliers: We find that − 2 = λ for all j, so rj = r · jpijsi . The rj approximate value of the total variance of this estimator is thus p2j s2j p i si ( pj sj )2 Var xˆ ≈ . (1) = rpj sj r We note that the term “stratified sampling” is sometimes used for the specific case in which the ri are chosen to be proportional to the pi . In general, this approach also leads to some reduction in variance, simply because it controls one contributor to sampling variance, namely the variance in the sampling distribution of Y -values. The approach outlined above clearly yields superior reduction in variance, based as it is on optimization. 2.2
An Example
Suppose that we wish to obtain an accurate estimate for the incidence of a certain gene in a population. Suppose further that the population is 70% white and 30% non-white, that we have obtained the approximate values of .01% incidence in the white population and 17% in the non-white population, and that these approximations provide acceptable values for the purposes of estimating sample variances. This gives an approximate value of .05107 for the proportion of the population with this gene, and the proportion for a “raw” sample of size r would .04846 .05107 · (1 − .05107) ≈ . thus have variance approximately r r Now write Y1 for the event that a member of the population is white and Y2 for the event√that s/he is non-white; then, in the notation established above, √ p1 = .7, s1 = .0001 · .9999 ≈ .01, so p1 s1 ≈ .007; and p2 = .3, s2 = .17 · .83 ≈ .3756, so p2 s2 ≈ .1127. The calculations above show that the optimal stratified .007 sample of size r takes r · ≈ .0585 from the white population and .007 + .1127 therefore about 94.15% of the total sample from the non-white population. The .0143 (.007 + .1127)2 = , an improvement sample variance in this scheme is r r by a factor of almost 3.4. In practical terms, √ the standard error (= standard deviation of the sample proportion) is about .0143 · .04846 ≈ .54 as large using stratified sampling as with the “raw” method. For a given sample size, then, stratification allows us to obtain an estimate with just over half the margin of error; alternatively, it allows us to use a much smaller sample – about .3 as large, since the standard error decreases as the square root of the sample size – to obtain the same level of accuracy. We can gain even more advantage if the population can be further separated into strata with different characteristics. Suppose that the non-white population breaks down as 25% African-American and 5% Asian-American and that the gene has incidence about 1% in the African-American population and about 97% in the Asian-American population. An analogous calculation to the above
68
J. Brevik et al.
calls for a sample that is about 17.3% white, 65.6% African-American, and 21.1% .00163 Asian-American, and the total variance in this case comes out to about r – a further improvement by a factor of about 9 over the first stratified example, and a further reduction by a factor of almost 3 in standard error. Compared to “raw” sampling, this final stratified sampling scheme reduces the necessary sample size for a given margin of error by a factor of about 29.7. 2.3
Stratified Sampling with LDPC Codes
We now consider a code of length n with code symbols ±1 (i.e., {(−1)0 , (−1)1 }), transmitted across a Gaussian memoryless binary symmetric channel. In this setting the probability space Ω consists of all vectors (1 , 2 , . . . , n ), 0 < j < 1, where j is the likelihood of the value −1 at the j th bit, inferred by the decoder at the initial step based on the received value at that bit. For the purposes of simulation it is sufficient to restrict to the all-1 codeword; see [6][Sec. 4.3]; by our assumptions, the components of Ω are iid and the distribution on each component can be calculated using normal distributions. Our random variable X of interest takes on the value 1 if the SPA decoder does not produce the all 1 codeword and 0 otherwise, so E(X) is the frame-error rate (FER) of the code. We can compute the density function for i under the assumption that the all-1 codeword was sent. Let T1 stand for the input at a bit given that a 1 is transmitted. T1 is distributed according to the Gaussian N1 (t) =
√1 e− σ 2π
(t−1)2 2σ2
. Similarly, if a −1 is (t+1)2
transmitted, the input T−1 is distributed according to N−1 (t) = σ√12π e− 2σ2 . t (−1) N−1 (t) Now, for a given input value t, the likelihood ratio is given by = t (1) N1 (t) 2t e− σ2 , and therefore the likelihood r(t) that a −1 was sent is equal to 2t
e− σ 2 1 t (−1) = − 2t = 2t . 2 t (−1) + t (0) e σ +1 1 + e σ2 Note that q = r(t) isa strictly decreasing function of t, and we can solve for 1−q σ2 t to obtain t = 2 ln q . Now, under the assumption that a 1 was actually transmitted, for any q in the interval (0, 1), the distribution function FQ (q) is given by 1−q σ2 ln FQ (q) = P (Q ≤ q) = P T1 ≥ 2 q σ 1−q T1 − 1 1 ≥ ln FQ (q) = P − σ 2 q σ ⎡ ⎛ ⎞⎤ 1−q σ 1 − ln 2 q σ 1 ⎠⎦ √ (2) FQ (q) = 1 − ⎣ ⎝1 + erf 2 2
Simulation of the Sum-Product Algorithm Using Stratified Sampling
69
x 2 2 where erf is the the error function,’ defined by erf(x) = √ e−u du so π 1 √ 1 that (1 + erf(X/ 2)) is the distribution function for the standard normal, any 2 particular value of which can be well approximated. Now, set some threshold value t1 above which a likelihood j is to be considered “high,” and stratify the set of likelihood vectors by the number of high entries contained in each, so that our variable Y takes on the values 0, 1, . . . , n corresponding to the number of high values in a vector. Note that pj = p(Y = j) is relatively straightforward to calculate from the distribution on and binomial coefficients. This situation is a prime candidate for stratified sampling, because the strata corresponding to large j have a much higher incidence of decoding failure than those with low j-values. We then carry out sampling as follows: For a given j, choose randomly the j “high” positions; then sample from the conditional density > t1 in these places and from the conditional density < t1 in the others. Of course, the threshold t1 is a parameter in this scheme. More generally, in fact, one can define multiple thresholds t1 < t1 < · · · < tm , define tm+1 = 1, and stratify Ω by the m-tuple of integers (y1 , y2 , . . . , ym ) where yj is the number of entries falling between tj and tj+1 , so that n − y1 − · · · − ym is the number of entries between 0 and t1 . Because the large sample sizes required to get meaningful estimates for error rates incurs a high cost in terms of time and compute cycles, our purpose for stratified sampling is not to obtain a superior estimate of block-error rate for a given sample size but rather to obtain an equally accurate estimate with a smaller sample size. In the next section, we show for a specific code the amount of sample-size reduction that is possible with stratified sampling.
3
Application
We applied stratified sampling to a (3, 6) regular graph with 26 bit nodes and 13 check nodes. The girth of the graph is 6. We simulated the sum-product algorithm for signal to noise ratios from 5 to 9, over which range the output FER drops from roughly 10−2 to 5 × 10−6 . We want to estimate block error ˆ Let r˜ be rate with a 95% confidence level and within 20% of the estimate θ. the sample size needed for the stratified sampling, and let r be the sample size needed for standard sampling. We are interested in comparing these values. We chose the threshold t1 to be 0.5. As described above, the probability space Ω is stratified by the number j of high values in the where j varies from jvector, 26−j 0 to 26. The probability of the jth stratum is 26 , where q is the j q (1 − q) probability, calculated as above, that any given received value is over 0.5. The vast majority of strata have very low probability and may be ignored. ˆ = σ2 , where For regular sampling, variance of θˆ can be calculated with Var(θ) r ˆ Sample size required for the standard sampling can be σ 2 is estimated by θˆ·(1− θ). 2 2 zα σ 2 ˆ calculated with r = M 2 , where σ is the variance of θ and M is margin of error.
70
J. Brevik et al.
ˆ With stratified sampling, variance of θ is found as in Equation 1 above: ( j pj sj )2 ˆ ≈ , where sj = θˆj · (1 − θˆj ). The relative sample sizes for Var(θ) r˜
a given margin of error is then given by ˆ θˆ · (1 − θ) r = . r˜ ( j pj sj )2 Note that this ratio is independent of the desired margin of error. In order to use stratified sampling for the problem of estimating FER, we first need to obtain an approximate value for the FER within each stratum in Table 1. Data for several strata at each SNR SNR M 2 stratum j θˆ 5 9.50e-03 3.61e-06 0 1 2 3 4 5 6 Total 6 3.36e-03 4.52e-07 0 1 2 3 4 5 Total 7 6.57e-04 1.73e-08 0 1 2 3 4 5 Total 8 8.65e-05 2.99e-10 0 1 2 3 4 Total 9 4.84e-06 9.36e-13 0 1 2 3 4 Total
pj 3.68e-01 3.75e-01 1.84e-01 5.75e-02 1.29e-02 2.23e-03 3.06e-04
θˆj 1.00e-10 1.04e-03 1.28e-02 6.19e-02 1.70e-01 4.05e-01 3.33e-01
sj 1.00e-05 3.22e-02 1.12e-01 2.41e-01 3.76e-01 4.91e-01 4.71e-01
5.46e-01 3.34e-01 9.84e-02 1.85e-02 2.51e-03 2.60e-04
1.00e-10 3.50e-04 1.61e-02 6.67e-02 1.44e-01 2.46e-01
1.00e-05 1.87e-02 1.26e-01 2.49e-01 3.51e-01 4.31e-01
7.19e-01 2.38e-01 3.80e-02 3.87e-03 2.84e-04 1.59e-05
1.00e-10 1.70e-04 7.31e-03 7.27e-02 1.90e-01 1.96e-01
1.00e-05 1.30e-02 8.52e-02 2.60e-01 3.92e-01 3.97e-01
8.55e-01 1.34e-01 1.01e-02 4.90e-04 1.70e-05
1.00e-10 3.31e-05 4.48e-03 6.86e-2 1.78e-01
1.00e-05 5.75e-03 6.68e-02 2.53e-01 3.83e-01
9.39e-01 5.91e-02 1.79e-03 3.46e-05 4.81e-07
1.00e-10 7.57e-06 1.54e-03 4.36e-02 2.63e-01
1.00e-05 2.75e-03 3.93e-02 2.04e-01 4.40e-01
p j × sj 3.68e-06 1.21e-02 2.06e-02 1.39e-02 4.87e-03 1.20e-03 1.44e-04 5.27e-02 5.46e-06 6.25e-03 1.24e-02 4.62e-03 8.81e-04 1.12e-04 2.43e-02 7.19e-06 3.11e-03 3.24e-03 1.01e-03 1.12e-04 6.32e-06 7.48e-03 8.55e-06 7.73e-04 6.77e-04 1.24e-04 6.51e-06 1.59e-03 9.39e-06 1.63e-04 7.01e-05 7.06e-06 2.12e-07 2.49e-04
rj 1 677 1,155 776 273 62 9 2,953 2 1,289 2,553 953 182 24 5,003 12 5,170 5,382 1,674 186 11 12,435 175 15,748 13,806 2,525 133 32,387 9,610 166,289 71,753 7,220 217 255,089
Simulation of the Sum-Product Algorithm Using Stratified Sampling
71
Table 2. Summarized data for each SNR ˆ · θˆ ( pj sj )2 ratio SNR r r˜ θˆ (1 − θ) j 5 9.50e-03 9.41e-03 2.77e-03 3 10,011 2,953 6 3.36e-03 3.35e-03 5.89e-04 6 28,466 5,003 7 6.57e-04 6.57e-04 5.59e-05 12 146,065 12,435 8 8.65e-05 8.65e-05 2.52e-06 34 1,109,868 32,387 9 4.84e-06 4.84e-06 6.21e-08 78 19,855,359 255,089 Z2 p s order to set the sample sizes rj = ( jpjjsj )˜ r = pj sj Mα2 ( j pj sj ). We do this j by taking relatively small samples; note that any sampling error has only the “second-order” effect of changing the estimated standard error s2j , so we expect the final outcome to be fairly robust to this type of error. Given these estimates, select rj vectors from each stratum j at random. This can be achieved in practice by first randomly choosing j bit positions for “high” values and then sampling “high” values and “low” values by conditioning the distribution according to Equation 2. The estimated value for the frame-error rate is θˆ = j pj θˆj . For all included strata, sample sizes mj are shown in Table 1. Table 2 shows how much sample-size reduction can be achieved with stratified sampling, together with calculated sample sizes needed for standard and stratified sampling for a margin of error approximately 20% of the FER. At SNR 5 the reduction ratio is approximately 3; however, as SNR increases this ratio also increases. For example, standard sampling for SNR9 requires almost 20 million units, whereas stratified sampling needs only 255 thousand, reducing sample size by a factor of approximately 77.
4
Conclusions and Future Work
The results shown here indicate that stratified sampling is an effective strategy for reducing sample size necessary to estimate decoder performance on an LDPC code at high SNR, within a specified margin of error. At this stage, there are a number of clear paths for future examination. First, a length-26 code is far to short to have any real-world value, so it is necessary to validate these results with longer codes. We have begun testing codes of length 282 using massively parallel batch jobs on the Condor pool 1 [7], 1
Condor provides (among many features) the ability to “harvest” computing cycles from idle computers that are connected to the Internet. For these experiments, we use a Condor pool located in the Computer Science Department at the University of California, Santa Barbara consisting of approximately 100 machines. In addition, the Condor Project at the University of Wisconsin allows UCSB’s Condor pool to “offload” work that exceeds the available idle machine capacity; thus each experiment had available to it up to 300 machines for the purpose of computing each Monte Carlo simulation. The ability to harvest unused cycles and load-share between institutions made the investigation feasible in a short span of time.
72
J. Brevik et al.
and it is already evident from preliminary results that it will be necessary to adjust the cutoff value q = 0.5; while we have achieved a speedup by a factor of 5 or so using such adjustments, it is not clear that the straightforward approach used for the length-26 code will achieve such dramatic results. However, there are a number of ways in which the stratification scheme can be made more sophisticated. As discussed earlier, one can use multiple thresholds, t1 < t2 < · · · < tm . For example, with m = 2, we could associate to each vector of likelihoods an ordered pair (a, b), where a stands for the number of values between, say, 0.5 and 0.9 and b stands for the number of values greater than 0.9. Probabilities of the various strata can still be calculated with binomial coefficients, and sampling can still be done with linear conditioning on randomnumber generators. Another possibility is to stratify based on the proximity of high-valued bits on the associated graph; intuitively, if high-valued bits are close together (e.g. sharing a check), the bits’ inaccurate values tends to reinforce one another and affect nearby bit estimates in the SPA. We note that our method can be adapted to bit-error rate (BER), which in some applications is a more suitable measure for decoding error than FER. In fact, a stratified method would likely work even better here, since the lowprobability badly behaved strata will likely have much more variance in the number of errors per word than the high-probability strata and will therefore contribute proportionally more to the variance of the BER estimator. As was kindly pointed out by one of the referees, the general framework of stratified sampling may well apply to a wide variety of communications systems, which typically lack precise analytical means for error analysis and therefore rely on sampling. While our experience is mainly in LDPC codes, exploring this broader applicability may lead to better and more general results.
References 1. Xu, J., Chen, L., Djurdjevic, I., Lin, S., Abdel-Ghaffar, K.: Construction of regular and irregular LDPC codes: geometry decomposition and masking. IEEE Trans. Inform. Theory 53, 121–134 (2007) 2. Ross, S.M.: Simulation, 4th edn. Elsevier Academic Press, Amsterdam (2006) 3. Tian, T., Jones, C., Villasenor, J.D., Wesel, R.D.: Construction of irregular LDPC codes with low error floors. In: IEEE International Conference on Communications, vol. 5, pp. 3125–3129 (2003) 4. MacKay, D.J.C., Postol, M.J.: Weaknesses of Margulis and Ramanujan–Margulis low-lensity parity-check codes. In: Proc. of MFCSIT 2002, Galway. Electronic Notes in Theoretical Computer Science, vol. 74. Elsevier, Amsterdam (2003) 5. Richardson, T.: Error floors of LDPC codes. In: Proc. of 41st Allerton Conf., Monticello, IL (September 2003) 6. Richardson, T., Urbanke, R.: Modern Coding Theory. Cambridge University Press, Cambridge (2008) 7. Tannenbaum, T., Litzkow, M.: The Condor Distributed Processing System. Dr. Dobbs Journal (February 1995)
A Systems Theory Approach to Periodically Time-Varying Convolutional Codes by Means of Their Invariant Equivalent Joan-Josep Climent1 , Victoria Herranz2 , Carmen Perea2, and Virtudes Tom´ as1, 1
Departament de Ci`encia de la Computaci´ o i Intel·lig`encia Artificial Universitat d’Alacant Campus de Sant Vicent del Raspeig, Ap. Correus 99, E-03080 Alacant, Spain 2 Centro de Investigaci´ on Operativa Departamento de Estad´ıstica, Matem´ aticas e Inform´ atica Universidad Miguel Hern´ andez Avenida de la Universidad s/n, E-03202 Elche - Alicante, Spain
Abstract. In this paper we construct (n, k, δ) time-variant convolutional codes of period τ . We use the systems theory to represent our codes by the input-state-output representation instead of using the generator matrix. The obtained code is controllable and observable. This construction generalizes the one proposed by Ogasahara, Kobayashi, and Hirasawa (2007). We also develop and study the properties of the timeinvariant equivalent convolutional code and we show a lower bound for the free distance in the particular case of MDS block codes. Keywords: Time-varying convolutional codes, controllability, observability, free distance.
1
Introduction
Convolutional codes [6, 10, 14] are an specific class of error correcting codes that can be represented as time-invariant discrete linear systems over a finite field [19]. They are used in phone data transmission, radio or mobile communication systems and in image transmissions from satellites [11, 13]. Convolutional coding is the main error correcting technique in data transmission applications due to its easy implementation and nice performance in random error channels [12, 23]. While it is usual for block codes to have the minimum distance guaranteed, for convolutional codes it is common to find out the minimum free distance by a
This work was partially supported by Spanish grants MTM2008-06674-C02-01 and MTM2008-06674-C02-02. The work of this author was also supported by a grant of the Vicerectorat d’Investigaci´ o, Desenvolupament i Innovaci´ o of the Universitat d’Alacant for PhD students.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 73–82, 2009. c Springer-Verlag Berlin Heidelberg 2009
74
J.-J. Climent et al.
code search. Since the required computation in a code search grows exponentially with the number of delay elements of a code, this code search of the minimum free distance can become difficult. As well, the computational requirements of the decoding increase with the amount of delay elements δ. In the construction proposed in this paper we will see that δ does not increase, what can reduce decoding computation [17]. The rest of the paper is organized as follows. In Section 2 we provide all the necessary concepts about convolutional codes to follow the rest of the paper. In Section 3 we give initial knowledge about periodically time variant convolutional codes and we construct the time-invariant equivalent one. In Subsection 3.1 we develop the minimality conditions for the constructed code, and in Subsection 3.2 we study sufficient conditions for the time-invariant code to be controllable and observable when this is formed by (n, 1, 1) or (n, n − 1, 1) codes. In Section 4 we give a lower bound on the free distance. Finally, we provide some conclusions and future research lines in Section 5.
2
Preliminaries
Let F be a finite field. A rate k/n convolutional code C is a submodule of Fn [z] that can be described (see [21, 25]) as C = v(z) ∈ Fn [z] | v(z) = G(z)u(z) with u(z) ∈ Fk [z] where u(z) is the information vector or information word, v(z) is the code vector or code word and G(z) is an n×k polynomial matrix with rank k called generator or encoding matrix of C. The complexity δ of C is the maximum of the degrees of the determinants of the k × k submatrices of any generator matrix of C. We call C an (n, k, δ) convolutional code (see [14]). We can describe an (n, k, δ) convolutional code C by means of the system xt+1 = Axt + But (1) , t = 0, 1, 2, . . . ; x0 = 0 y t = Cxt + Dut yt where A ∈ Fδ×δ , B ∈ Fδ×k , C ∈ F(n−k)×δ , D ∈ F(n−k)×k and v t = . ut The four matrices (A, B, C, D) are called the input-state-output representation of the code C. This representation was introduced in [21] and has been used in the last years to analyze and construct convolutional codes [1, 4, 5, 9, 18, 19, 21, 25]. For each positive integer j let us define the matrices ⎤ ⎡ C ⎢ CA ⎥ ⎥ ⎢ ⎥ ⎢ j−2 j−1 Φj (A, B) = B AB · · · A B A B and Ωj (A, C) = ⎢ ... ⎥ . ⎥ ⎢ ⎣ CAj−2 ⎦ CAj−1
A Systems Theory Approach
75
Definition 1. Let A and B be matrices of sizes δ × δ and δ × k, respectively. The pair (A, B) is controllable if rank(Φδ (A, B)) = δ. If (A, B) is a controllable pair, then the smallest integer κ such that rank(Φκ (A, B)) = δ is the controllability index of (A, B). Definition 2. Let A and C be matrices of sizes δ × δ and (n − k) × δ, respectively. The pair (A, C) is observable if rank(Ωδ (A, C)) = δ. If (A, C) is an observable pair, then the smallest integer ν such that rank(Ων (A, C)) = δ is the observability index of (A, C). An important distance measures of a convolutional code is the free distance ∞ ∞ df ree (C) = min wt(ut ) + wt(y t ) u0 =0
t=0
t=0
where wt(·) denotes the Hamming weight of a vector. Rosenthal and Smarandache [20] gave a generalization of the Singleton bound for block codes. Theorem 1 (Theorem 2.2 of [20]). If C is a (n, k, δ)-code over any field F, then δ + 1 + δ + 1. (2) df ree (C) ≤ (n − k) k This bound is known as the generalized Singleton bound. Analogously to block codes, we say that a (n, k, δ)-code is maximum distance separable (MDS) if the free distance attains the generalized Singleton bound.
3
Periodically Time-Variant Convolutional Codes
In this section we define periodically time-varying convolutional codes and explain the concrete characteristics of our construction. Let us assume that the matrices At , Bt , Ct and Dt at time t are of sizes δ × δ, δ × k, (n − k) × δ and (n − k) × k, respectively. A time-variant convolutional code can be defined by means of the system xt+1 = At xt + Bt ut (3) , t = 0, 1, 2, . . . x0 = 0 y t = Ct xt + Dt ut The defined time-variant convolutional code is a dynamical system which can have a minimal representation, that is, a representation where the matrices have the smallest size possible. It is known that the realization (A, B, C, D) of a linear system is minimal if and only if (A, B) is a controllable pair and (A, C) is an observable pair [2]. In Subsection 3.1 and 3.2 we will study the conditions for our code to have a minimal representation. If the matrices change periodically with periods τA , τB , τC and τD respectively, (that is, AτA +t = At , BτB +t = Bt , CτC +t = Ct and DτD +t = Dt for
76
J.-J. Climent et al.
all t) then we have a periodically time-varying convolutional code of period τ = lcm (τA, τB , τC , τD ). Any periodic time-varying convolutional code is equivalent to an invariant one [3, 15, 16]. Relating every state and every output to previous states, we can always rewrite any block of τ iterations at a given time j as xτ (j+1) = Aτ −1,0 xτ j
⎡
⎢ ⎢ + Aτ −1,1 B0 Aτ −1,2 B1 . . . Aτ −1,τ −1Bτ −2 Bτ −1 ⎢ ⎣ ⎡
yτ j y τ j+1 .. .
⎢ ⎢ ⎢ ⎣
y τ j+τ −1 ⎡
⎤
⎡
⎥ ⎥ ⎥, ⎦
uτ j+τ −1
⎤
C0 C1 A0,0 C2 A1,0 .. .
⎢ ⎥ ⎢ ⎥ ⎢ ⎥=⎢ ⎦ ⎢ ⎣
⎤
uτ j uτ j+1 .. .
⎥ ⎥ ⎥ ⎥ xτ j ⎥ ⎦
Cτ −1 Aτ −2,0
D0 C1 B0 C2 A1,1 B0 .. .
O D1 C2 B1 .. .
... ... ...
O O O .. .
O O O .. .
⎤
⎡ ⎤ ⎥ ⎢ uτ j ⎥ ⎢ ⎥ ⎢ uτ j+1 ⎥ ⎢ ⎥⎢ ⎥ ⎢ +⎢ ⎥⎢ ⎥ .. ⎥⎣ ⎢ ⎦ . ⎥ ⎢ ⎣ Cτ −2 Aτ −3,1 B0 Cτ −2 Aτ −3,2 B1 . . . Dτ −2 O ⎦ uτ j+τ −1 Cτ −1 Aτ −2,1 B0 Cτ −1 Aτ −2,2 B1 . . . Cτ −1 Bτ −2 Dτ −1 where Ai,j
Ai Ai−1 · · · Aj+1 Aj , = Ai ,
This system can be written as Xj+1 = AXj + BUj Yj = CXj + DUj where A = Aτ −1,0 ,
if i = j, if i = j. (4)
B = Aτ −1,1 B0 Aτ −1,2 B1 . . . Aτ −1,τ −1 Bτ −2 Bτ −1 ,
⎡ ⎤ D0 O C0 ⎢ D1 C1 B0 ⎢ ⎢ C1 A0,0 ⎥ ⎢ C2 A1,1 B0 ⎥ ⎢ C2 B1 ⎢ ⎥ ⎢ C = ⎢ C2 A1,0 ⎥ , D = ⎢ .. .. ⎢ ⎥ ⎢ .. . . ⎢ ⎦ ⎣ . ⎣ Cτ −2 Aτ −3,1 B0 Cτ −2 Aτ −3,2 B1 Cτ −1 Aτ −2,0 Cτ −1 Aτ −2,1 B0 Cτ −1 Aτ −2,2 B1 ⎡
⎡ Xj = xτ j ,
⎢ ⎢ Yj = ⎢ ⎣
yτ j y τ j+1 .. . y τ j+τ −1
⎤ O O O O ⎥ ⎥ O O ⎥ ⎥ .. .. ⎥ , . . ⎥ ⎥ ... Dτ −2 O ⎦ . . . Cτ −1 Bτ −2 Dτ −1
... ... ...
⎤
⎡
⎥ ⎥ ⎥ ⎦
⎢ ⎢ and Uj = ⎢ ⎣
uτ j uτ j+1 .. . uτ j+τ −1
⎤ ⎥ ⎥ ⎥. ⎦
A Systems Theory Approach
77
System (4) is the time-invariant convolutional code equivalent to the periodic time-varying system (3). Our particular construction replaces matrices At and Dt by fixed matrices A and D, respectively, for all t. Then, expression (3) turns into xt+1 = Axt + Bt ut , t = 0, 1, 2, . . . , (5) y t = Ct xt + Dut and matrices A, B, C and D of system (4) become A = Aτ , B = Aτ −1 B0 Aτ −2 B1 . . . ABτ −2 Bτ −1 , ⎡ ⎢ ⎢ C=⎢ ⎣
C0 C1 A .. .
Cτ −1 Aτ −1
3.1
⎤ ⎥ ⎥ ⎥, ⎦
⎡
D C1 B0 C2 AB0 .. .
O D C2 B1 .. .
... ... ...
O O O .. .
⎢ ⎢ ⎢ ⎢ D=⎢ ⎢ ⎢ ⎣ Cτ −2 Aτ −3 B0 Cτ −2 Aτ −4 B1 . . . D Cτ −1 Aτ −2 B0 Cτ −1 Aτ −3 B1 . . . Cτ −1 Bτ −2
⎤ O O⎥ ⎥ O⎥ ⎥ .. ⎥ . . ⎥ ⎥ O⎦ D
Minimality Conditions
In this subsection we study conditions for the controllability and observability of the new equivalent time-invariant convolutional code of our construction. Theorem 2. If τ k ≥ δ and the matrices Bj , for j = such 0, 1, . . . , τ − 1, are that Bj = A−(τ −j−1) Ej , being Ej ∈ Fδ×k with E = E0 E1 . . . Eτ −1 a full rank matrix, then the system defined by expression (4) is controllable. Proof. According to the form of the controllable matrix in (5) we have that Φδ (A, B) = B AB . . . Aδ−1 B = E Aτ E . . . A(δ−1)τ E which is clearly a full rank matrix. So, the system (4) is controllable.
Similarly, we have the following result for the observability. Here we denote by M T the transpose matrix of M . Theorem 3. If τ (n − k) ≥ δ and the matrices Cj , for j = 0, 1, . . . , τ − 1, are T such that Cj = Fj A−j , being Fj ∈ F(n−k)×δ with F = F0T F1T . . . FτT−1 a full rank matrix, then the system defined by expression (4) is observable. Note that if we choose Bj and Cj as in Theorems 2 and 3, then we ensure the minimality of the new invariant system. Moreover, taking A = Iδ , E as the parity check matrix of a block code, and F T such that Fj is the submatrix of I = I I I . . . formed by the rows j(n− k)+ l for l = 1, 2, . . . , n − k, then we obtain the periodically time-variant convolutional code proposed in [17]. So, our construction generalizes that case.
78
3.2
J.-J. Climent et al.
Minimality for Generic Periodic Time-Variant Convolutional Codes
Let us now return to the general case given by expression (3). Another representation of the time-invariant equivalent code is the following state-space form (see [7]): xt (h) + But (h) R(λ) xt (h) = A t (h) + Dut (h) y t (h) = C x where
⎡
⎢ ⎢ ut (h) = ⎢ ⎣
ut+hτ ut+hτ +1 .. .
⎤ ⎥ ⎥ ⎥, ⎦
⎡ ⎢ ⎢ y t (h) = ⎢ ⎣
y t+hτ y t+hτ +1 .. .
⎤ ⎥ ⎥ ⎥, ⎦
⎡ ⎢ ⎢ t (h) = ⎢ x ⎣
xt+hτ xt+hτ +1 .. .
⎤ ⎥ ⎥ ⎥ ⎦
ut+hτ +τ −1 y t+hτ +τ −1 xt+hτ +τ −1 O I(τ −1)δ and R(λ) = where λ denotes the one-step-forward time operator λIδ O in the variable h and O denotes the null matrix of the appropriate size. The matrices A, B, C and D are now block-diagonal matrices defined as A = diag (A0 , A1 , . . . , Aτ −1 ) , B = diag (B0 , B1 , . . . , Bτ −1 ) , C = diag (C0 , C1 , . . . , Cτ −1 ) and D = diag (D0 , D1 , . . . , Dτ −1 ) . A − R(λ) i 0 , then the folIf we define P (λ) = A − R(λ) B and P (λ) = C lowing theorem, which will be useful for the proofs of further results, holds also for finite fields. Theorem 4 ([7]). The periodic system (3) is controllable (respectively, observable) if and only if the matrix P i (λ) (respectively, P 0 (λ)) has full row (respectively, column) rank for all λ ∈ C. For the case where the subsystems forming the periodically time-varying convolutional code are (n, 1, 1)-codes or (n, n − 1, 1)-codes we have the following result. Theorem 5. If the periodically time-varying convolutional code is generated by τ controllable and observable (n, 1, 1)-codes (or (n, n − 1, 1)-codes), then the periodically time-varying convolutional code obtained is as well controllable and observable. Proof. We will develop the proof for the case (n, 1, 1). In this case ⎤ ⎡ ⎡ ct,0 dt,0 ⎢ ct,1 ⎥ ⎢ dt,1 ⎥ ⎢ ⎢ At = [at ], Bt = [bt ], Ct = ⎢ . ⎥ , and Dt = ⎢ . ⎣ .. ⎦ ⎣ .. ct,n−2
dt,n−2
⎤ ⎥ ⎥ ⎥. ⎦
A Systems Theory Approach
79
Following the previous theorem, our periodic system is controllable if and only if the matrix ⎤ ⎡ a0 −1 0 . . . 0 0 b0 0 . . . 0 ⎢ 0 a1 −1 . . . 0 0 0 b1 . . . 0 ⎥ ⎥ ⎢ ⎢ .. . . . . .. .. .. ⎥ i .. .. .. .. P (λ) = A − R(λ) B = ⎢ . . . . ⎥ ⎥ ⎢ ⎣ 0 0 0 . . . 0 −1 0 0 . . . 0 ⎦ −λ 0 0 . . . 0 aτ −1 0 0 . . . bτ −1 has full row rank. To ensure that this matrix has full rank we will check that bi = 0 for i = 0, 1, . . . , τ − 1. For every subsystem we have that controllability by Popov holds. Then Belevitch-Hautus test (see [8]) it holds that rank( zI − ai bi ) = 1 for every eigenvalue z of [ai ] and i = 0, 1, . . . , τ − 1. Since the only eigenvalue of [ai ] is z = ai then it must hold that bi = 0. Thus the matrix P i (λ) has full rank and the periodic time-varying convolutional code is controllable. On the other hand, by the previous result, our periodic system is observable if and only if the matrix ⎤ ⎡ a0 −1 0 ... 0 0 ⎥ ⎢ 0 0 a1 −1 . . . 0 ⎥ ⎢ .. .. .. .. ⎥ ⎢ .. ⎥ ⎢ . . . . . ⎥ ⎢ ⎥ ⎢ 0 0 0 . . . 0 −1 ⎥ ⎢ ⎢ −λ 0 0 . . . 0 a τ −1 ⎥ ⎥ ⎢ ⎥ ⎢ c0,0 0 0 ... 0 0 ⎥ ⎢ ⎥ ⎢ c0,1 0 0 0 0 ⎥ ⎢ ⎥ ⎢ .. .. .. .. .. ⎥ ⎢ . . . . . ⎥ ⎢ A − R(λ) 0 ⎥ ⎢ 0 0 ... 0 0 = ⎢ c0,n−2 P (λ) = ⎥ C ⎥ ⎢ 0 c1,0 0 ... 0 0 ⎥ ⎢ ⎥ ⎢ 0 0 ... 0 0 c1,1 ⎥ ⎢ ⎥ ⎢ . .. .. .. .. ⎥ ⎢ .. . . . . ⎥ ⎢ ⎥ ⎢ 0 0 c1,n−2 0 . . . 0 ⎥ ⎢ ⎢ 0 0 0 . . . 0 cτ −1,0 ⎥ ⎥ ⎢ ⎢ 0 0 0 . . . 0 cτ −1,1 ⎥ ⎥ ⎢ ⎥ ⎢ . .. .. .. .. ⎦ ⎣ .. . . . . 0 0 0 . . . 0 cτ −1,n−2 has full column rank. To ensure that this matrix has full rank we will check that for each i = 0, 1, . . . , τ − 1 there exist at least one j such that ci,j = 0. For every subsystem we assumed that⎛observability ⎤⎞ holds. Again by Popov⎡ zI − ai ⎜⎢ ci,0 ⎥⎟ ⎥⎟ ⎜⎢ ⎥⎟ ⎜⎢ Belevitch-Hautus test it holds that rank ⎜⎢ ci,1 ⎥⎟ = 1 for every eigenvalue ⎥⎟ ⎜⎢ .. ⎦⎠ ⎝⎣ . c1,n−2
80
J.-J. Climent et al.
z of [ai ] and i = 0, 1, . . . , τ − 1. Then it must exist at least one j such that ci,j = 0. Thus the matrix P 0 (λ) has full rank and the periodic time-varying convolutional code is observable. The proof for the case (n − 1, n, 1) is analogous.
4
Free Distance
In this section we obtain a lower bound on the free distance of the periodic timevarying code by means of the free distance of the invariant equivalent associated, due to the fact that both are the same (see [24]). Theorem 6 (Theorem 3.1 of [22]). Let C be an observable, rate k/n, degree δ, convolutional code defined through the matrices A, B, C and D. Let ν be the observability index of the pair (A, C) and suppose that there exists d ∈ Z+ such that Φdν (A, B) forms the parity-check matrix of a block code of distance d. Then the free distance of C is greater than or equal to d. Now we have the following result. Theorem 7. If the matrix Φdν (A, B) = B AB . . . Adν−1 B represents the parity-check matrix of an MDS block code of distance d, then df ree ≥ δ + 1. Proof. Let P = Aτ −1 B0 Aτ −2 B1 . . . Bτ −1 . The matrix B AB . . . Adν−1 B = P Aτ P . . . A(dν−1)τ P has size δ ×τ kdν. Since the parity-check matrix of a block code is a matrix of size (N −K)×N , we have that the parameters of the block code are [τ kdν, τ kdν−δ, d]. The block code is MDS so the distance d achieves the Singleton bound N −K +1. Then we have df ree (C) ≥ d = τ kdν − τ kdν + δ + 1 = δ + 1.
Note that if df ree (C) = δ + 1, then C cannot be an MDS convolutional code since, in order to attain the generalized Singleton bound, n = k should hold.
5
Conclusion
In this paper we propose a new method for constructing periodically time-variant convolutional codes. We study the minimality conditions of the new system and state some results about the dependence between the controllability and observability of the subsystems and the minimality of the time-invariant equivalent one. We also study the general case for particular forming subsystems. A lower bound on the free distance is as well given using the fact that both codes have the same df ree . As it is shown in this construction, the number of delay elements remains as δ. This fact makes that the decoding complexity of the new time-varying code
A Systems Theory Approach
81
does not increase respect to the complexity of the subsystems. This gives us a lower complexity of the arithmetic circuits when implementing the model. As future research lines we will study the properties of the time-invariant convolutional code depending on the properties of the subsystems forming the periodically time-variant convolutional one; we will attempt to construct MDS convolutional codes and maximum distance profile (MDP) convolutional codes, which are an optimum subclass.
References [1] Allen, B.M.: Linear Systems Analysis and Decoding of Convolutional Codes. Ph.D thesis, Department of Mathematics, University of Notre Dame, Indiana, USA (June 1999) [2] Antsaklis, P.J., Michel, A.N.: Linear Systems. McGraw-Hill, New York (1997) [3] Balakirsky, V.B.: A necessary and sufficient condition for time-variant convolutional encoders to be noncatastrophic. In: Cohen, G.D., Litsyn, S., Lobstein, A., Z´emor, G. (eds.) Algebraic Coding 1993. LNCS, vol. 781, pp. 1–10. Springer, Heidelberg (1994) [4] Climent, J.J., Herranz, V., Perea, C.: A first approximation of concatenated convolutional codes from linear systems theory viewpoint. Linear Algebra and its Applications 425, 673–699 (2007) [5] Climent, J.J., Herranz, V., Perea, C.: Linear system modelization of concatenated block and convolutional codes. Linear Algebra and its Applications 429, 1191–1212 (2008) [6] Dholakia, A.: Introduction to Convolutional Codes with Applications. Kluwer Academic Publishers, Boston (1994) [7] Grasselli, O.M., Longhi, S.: Finite zero structure of linear periodic discrete-time systems. International Journal of Systems Science 22(10), 1785–1806 (1991) [8] Hautus, M.L.J.: Controllability and observability condition for linear autonomous systems. Proceedings of Nederlandse Akademie voor Wetenschappen (Series A) 72, 443–448 (1969) [9] Hutchinson, R., Rosenthal, J., Smarandache, R.: Convolutional codes with maximum distance profile. Systems & Control Letters 54(1), 53–63 (2005) [10] Johannesson, R., Zigangirov, K.S.: Fundamentals of Convolutional Coding. IEEE Press, New York (1999) [11] Justesen, J.: New convolutional code constructions and a class of asymptotically good time-varying codes. IEEE Transactions on Information Theory 19(2), 220– 225 (1973) [12] Levy, Y., Costello Jr., D.J.: An algebraic approach to constructing convolutional codes from quasicyclic codes. DIMACS Series in Discrete Mathematics and Theoretical Computer Science 14, 189–198 (1993) [13] Massey, J.L., Costello, D.J., Justesen, J.: Polynomial weights and code constructions. IEEE Transactions on Information Theory 19(1), 101–110 (1973) [14] McEliece, R.J.: The algebraic theory of convolutional codes. In: Pless, V.S., Huffman, W.C. (eds.) Handbook of Coding Theory, pp. 1065–1138. Elsevier, NorthHolland, Amsterdam (1998) [15] Mooser, M.: Some periodic convolutional codes better than any fixed code. IEEE Transactions on Information Theory 29(5), 750–751 (1983)
82
J.-J. Climent et al.
[16] O’Donoghue, C., Burkley, C.: Catastrophicity test for time-varying convolutional encoders. In: Walker, M. (ed.) Cryptography and Coding 1999. LNCS, vol. 1746, pp. 153–162. Springer, Heidelberg (1999) [17] Ogasahara, N., Kobayashi, M., Hirasawa, S.: The construction of periodically time-variant convolutional codes using binary linear block codes. Electronics and Communications in Japan, Part 3 90(9), 31–40 (2007) [18] Rosenthal, J.: An algebraic decoding algorithm for convolutional codes. Progress in Systems and Control Theory 25, 343–360 (1999) [19] Rosenthal, J.: Connections between linear systems and convolutional codes. In: Marcus, B., Rosenthal, J. (eds.) Codes, Systems and Graphical Models. The IMA Volumes in Mathematics and its Applications, vol. 123, pp. 39–66. Springer, Heidelberg (2001) [20] Rosenthal, J., Smarandache, R.: Maximum distance separable convolutional codes. Applicable Algebra in Engineering, Communication and Computing 10, 15–32 (1999) [21] Rosenthal, J., Schumacher, J., York, E.V.: On behaviors and convolutional codes. IEEE Transactions on Information Theory 42(6), 1881–1891 (1996) [22] Rosenthal, J., York, E.V.: BCH convolutional codes. IEEE Transactions on Information Theory 45(6), 1833–1844 (1999) [23] Tanner, R.M.: Convolutional codes from quasicyclic codes: A link between the theories of block and convolutional codes. Technical Report USC-CRL-87-21 (November 1987) [24] Thommesen, C., Justesen, J.: Bounds on distances and error exponents of unit memory codes. IEEE Transactions on Information Theory 29(5), 637–649 (1983) [25] York, E.V.: Algebraic Description and Construction of Error Correcting Codes: A Linear Systems Point of View. Ph.D thesis, Department of Mathematics, University of Notre Dame, Indiana, USA (May 1997)
On Elliptic Convolutional Goppa Codes Jos´e Ignacio Iglesias Curto Departamento de Matem´ aticas, Universidad de Salamanca, Plaza de la Merced 1, 37008 Salamanca, Spain
Abstract. The algebraic geometric tools used by Goppa to construct block codes with good properties have been also used successfully in the setting of convolutional codes. We present here this construction carried out over elliptic curves, yielding a variety of codes which are optimal with respect to different bounds. We provide a number of examples for different values of their parameters, including some explicit strongly MDS convolutional codes. We also introduce some conditions for certain codes of this class to be MDS.
1
Introduction
Goppa codes, introduced by V.D. Goppa in the late seventies [4,5] were the seed of a fruitful link between coding theory and algebraic geometry. A further use of algebraic geometric tools in coding theory resulted in many other code constructions but also in the study of related open questions, as for instance the number of rational points of a given curve. Different algebraic elements have been used also to construct convolutional codes. A convolutional code of length n and dimension k is often defined as a k-dimensional subspace of Fnq (z). However it is known that there exists always a polynomial generator matrix. As explained in [1], the term “convolutional” is used because the output sequences can be regarded as the convolution of the input sequences with the sequences in the encoder. Hence, the output at time t does not only depend on the input at time t, but also on those at previous time instants. The output vector at time t is precisely the coefficient of z t . The amount of this dependance is a critical parameter of the convolutional code called the degree or the complexity of the code, δ, and can be computed as the sum of the row degrees of a polynomial generator matrix in row proper form. In fact convolutional codes of degree 0 are precisely linear block codes. In addition, the highest row degree of a polynomial generator matrix in row proper form stands for the number of previous inputs on which every output depends, and it is another invariant of the code known as its memory, m. Since the components of convolutional codewords are not constant the weight function considered is slightly different than the one in block coding. The weight of a convolutional codeword is the sum of the non-zero coefficients at each of
Research partially supported by the research contract MTM2006-076B of DGI and by Junta de Castilla y Le´ on under research project SA029A08.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 83–91, 2009. c Springer-Verlag Berlin Heidelberg 2009
84
J.I. Iglesias Curto
its components. In a similar way the distance of two codewords is defined, and the minimum among them is called the free distance, df , of the code. Similarly to block coding, several bounds on the free distance are used. Two of the most commonly ones used, to which we will refer later (and which are generalizations of bounds for block codes), are df ree ≤ S(n, k, δ) = (n − k)
δ k
+1 +δ+1
generalized Singleton bound [11] k(m+i)−δ−1
df ree ≤ max{d∈{1, . . . , S(n, k, δ)}| ˆ= N
l=0
d ˆ ≤ n(m + i), for all i ∈ N} ql
N ≡ {1, 2, . . .} if km = δ . N0 ≡ {0, 1, 2, . . .} if km > δ Griesmer bound [3]
By analogy with block coding, convolutional codes attaining the generalized Singleton bound are known as Maximum Distance Separable, MDS, codes. As convolutional codes are a generalization of linear block codes, one might wonder wether algebraic geometric tools, and in particular similar tools as the ones used by Goppa, could be also used to construct families of convolutional codes with certain good properties. A first attempt to define convolutional Goppa codes was made in [10], where instead of a curve, a family of curves parameterized by the affine line was considered. Instead of points, disjoint sections of the projection of this family over the affine line were taken, and instead of divisors on a curve, a Cartier divisor and an invertible sheaf. An analogous construction to the classical one led to a family of convolutional codes “of Goppa type”. After that, a more general construction with simpler geometric tools has been given in [9]. The paper is structured as follows. In Section 2 we expose the use of algebraic geometric elements to construct convolutional Goppa codes. In Section 3 we apply this construction to the case in which the curve considered is elliptic. An explicit expression of the elements involved, in particular of the rational points taken, will allow to determine the generator matrices of the codes so obtained. In Section 4 we present several examples of optimal elliptic convolutional Goppa codes. Finally in Section 5 we address the problem of characterizing MDS codes.
2
Geometric Construction of Convolutional Goppa Codes
Let Fq be a finite field and Fq (z) the field of rational functions on one variable. Let X be a smooth projective curve over Fq (z) of genus g and let us assume that Fq (z) is algebraically closed in the field of rational functions of X. Both Riemann-Roch and the Residues theorems still hold under this hypothesis [6].
On Elliptic Convolutional Goppa Codes
85
Let us take n different Fq (z)-rational points P1 , . . . , Pn and the divisor D = P1 +· · ·+Pn , with its associated invertible sheaf OX (D). We have then the exact sequence of sheaves 0 → OX (−D) → OX → Q → 0 ,
(1)
where Q is a sheaf with support only at the points Pi . Let G be another divisor on X with support disjoint from D. By tensoring the exact sequence (1) by the associated invertible sheaf OX (G), we have 0 → OX (G − D) → OX (G) → Q → 0 . and by taking global sections we get the sequence α
0 → H 0 (X, OX (G − D)) → H 0 (X, OX (G)) → H 0 (X, Q) → → H 1 (X, OX (G − D)) → H 1 (X, OX (G)) → 0
.
If we impose deg(G) < n = deg(D), we have an injective Fq (z)-linear map 0
/ L(G) s
α
n / F (z) × . . . × Fq (z) q
/ ...
/ (s(P1 ), . . . , s(Pn ))
Definition 1. The convolutional Goppa code C(D, G) defined by the divisors D and G is the image of α : L(G) → Fq (z)n . Given a subspace S ⊆ L(G), the convolutional Goppa code C(D, S) defined by D and S is the image of α|S . We can use the Riemann-Roch theorem to calculate the dimension of a convolutional Goppa code. Proposition 1 ([9]). C(D, G) is a convolutional code of length n = deg(D) and dimension k ≥ deg(G)+1−g. Further, if deg(G) > 2g−2 then k = deg(G)+1−g. Proof. Since deg(G) < n, dim L(G − D) = 0, hence the map α is injective and k = dim L(G). If 2g − 2 < deg(G), dim L(G) = 1 − g + deg(G) by the RiemannRoch theorem. The geometric tools to characterize the free distance of convolutional Goppa codes are much more sophisticated than the analogous ones in the block case, involving jets, osculating planes and an interpretation of the points Pi as sections over the affine line. Then, the calculus of the free distance could be interpreted as a problem of Enumerative Geometry over finite fields [9]. However a few conditions to obtain MDS convolutional codes will be given in Section 5. Similarly, a convolutional code can be constructed by considering the dual morphisms of those defining C(D, G) and by means of the Residues Theorem it can be proven that the code so constructed is the dual code of C(D, G).
86
3
J.I. Iglesias Curto
Convolutional Goppa Codes over Elliptic Curves
Let X ⊂ P2Fq (z) be a plane elliptic curve over Fq (z). Without loss of generality we will assume that X has a rational point of order at least 4 (so that there are enough rational points to define a convolutional code). Then, in an affine plane containing this point, X can be written in Tate Normal form [7] y 2 + axy + by = x3 + bx2
(2)
being x, y the affine coordinates in this plane and a, b ∈ Fq (z). Let P∞ be the point at infinity, P0 = (0, 0) and P1 , . . . , Pn n different rational points of X, with Pi = (xi , yi ) and xi , yi ∈ Fq (z). Consider the divisors D = P1 + · · · + Pn and G = rP∞ , with 0 < r < n. Recall that the divisors of the functions x, y are (x) = P0 + Q − 2P∞ , (y) = 2P0 + Q − 3P∞ where Q, Q are two rational points different from P0 , P∞ . Then, a basis of L(G) is given by {1, x, y, . . . , xi y j , . . .}, where i, j satisfy 2i + 3j ≤ r = deg(G) (and to avoid linear dependencies j = 0, 1). Since r < n then the evaluation map α : L(G) xi y j
/ Fq (z)n / (xi y j , . . . , xi y j ) 1 1 n n
is an injective morphism and Imα defines the convolutional Goppa code C(D, G) with length n. As g = 1 and deg(G) > 2g − 2 the code has dimension k = r = deg(G). Let us consider now the more general case with G = rP∞ −sP0 , where r, s > 0 and 0 < r − s < n. Then {xa y b | s ≤ a + 2b, 2a + 3b ≤ r} is a basis of L(G). The code C(D, G) has length n and dimension r − s. The explicit expression of a generator matrix of the code C(D, G) is ⎞ ⎛ a b x1 y1 xa2 y2b . . . xan ynb b a+1 b a+1 b ⎟ ⎜xa+1 ⎜ 1 y1 x2 y2 . . . xn yn ⎟ G=⎜ . . .. ⎟ . . .. .. ⎝ .. . ⎠ xc1 y1d
xc2 y2d . . . xcn ynd
The following examples illustrate this construction, as well as the fact that it provides optimal codes. Example 1. Let us consider the elliptic curve with Tate Normal form y 2 + zxy + y = x3 + x2 over the field F2 (z).
On Elliptic Convolutional Goppa Codes
87
We take the divisor D = P1 + P2 + P3 + P4 with P2 = (1 + z, 1 + z 2 ) P1 = (1 + z, z) 3 1+z 3 1+z 3 +z 4 +z 5 1+z 2 +z 4 ) P4 = ( 1+z ) P3 = ( z2 , z3 z2 , z3 and the divisor G = 3P∞ − P0 , being therefore L(G) = x, y . Then the convolutional Goppa Code defined by D and G is generated by the matrix 2 z z2 1 + z + z2 1 + z + z2 , 1 + z 1 + z2 + z3 1 + z + z3 0 2
2
z where the rows are the images of the functions 1+z x and zy + 1+z+z 1+z x respectively. C(D, G) has parameters [n, k, δ, m, df ree ] = [4, 2, 5, 3, 8]. The free distance of the code attains the Griesmer bound.
Example 2. We consider now the elliptic curve y 2 + zxy + 2z 2 y = x3 + 2z 2 x2 over F5 (z), and the divisor D = P1 + P2 , with support at the points 2
3
4
4
5
6
+3z +4z 2z +2z +3z P1 = (3z 2 , 3z 2 + 2z 3 ) P2 = ( 2z1+3z+z , 4+3z+2z 2 2 +z 3 )
We take the divisor G = 2P∞ − P0 , and we have L(G) = x . A generator matrix of the code C(D, G) is 2 + z + 2z 2 , 3 + 2z + z 2 2
given by the image of the function 4+2z+4z x. The code has parameters [n, k, δ, z2 df ree ] = [2, 1, 2, 6], and hence it is MDS. Example 3. Now we take the curve y 2 + (1 + z + z 2 )xy + (z 2 + z 3 )y = x3 + (z 2 + z 3 )x2 over Fq (z), with q = 2m . We consider the divisor D = P1 + P2 + P3 where P1 = (0, −z 3 − z 2 ) P2 = (z 2 − z, −z 4 − 2z 3 + z 2 ) P3 = (−z 2 − z, −z 3 + z) Let us take G = 2P∞ , then L(G) = 1, x and a generator matrix for the convolutional Goppa code C(D, G) is 1 1 1 , 01−z 1+z whose rows are images of the functions 1 and − z1 x. C(D, G), has parameters [n, k, δ, m, df ree ] = [3, 2, 1, 1, 3]. Then C(D, G) is an MDS convolutional code.
88
4
J.I. Iglesias Curto
Some Optimal Convolutional Goppa Codes over Elliptic Curves
The previous examples illustrate the construction of elliptic convolutional Goppa codes with different parameters from their defining elements, i. e., the elliptic curve and the divisors D and G. They also show the existence of optimal convolutional codes defined over elliptic curves. In this section we present a representative collection of convolutional Goppa codes defined over elliptic curves. A more extensive one can be found in [8]. The description of the codes is done by means of their generating elements. To characterize the elliptic curve over which each code is defined, let a, b be the parameters on its Tate Normal form (2). On this curve we take the points {Pi }7i=1 = {2P, 3P, −3P, 4P, −4P, 5P, −5P }, where P = (0, 0) = P0 is a rational point which belongs to the curve, and nP , n ∈ Z is obtained by the addition law defined over every elliptic curve. Together with them, the basis or the bases (when in the same curve and with the same divisor D there is more than one code for those parameters) of the space of functions that define each code are provided. We include for each code its parameters, the size p of the base field (which for the codes presented will be always prime) and the bound that it reaches so that it is considered optimal. As shown on the table, for those codes that are optimal with respect to the Griesmer bound (i.e., there does not exist MDS codes over that field) there exist also MDS codes if the size of the base field is big enough. 4.1
Strongly MDS Convolutional Codes
The set of strongly MDS convolutional codes is a particularly interesting subset of MDS convolutional codes. They are characterized by the property that to decode the coefficients vector at time t, the smallest possible number of coefficients vectors of the received word are needed. This property is very convenient to develop iterative decoding algorithms, as the one in [2], with an error decoding capability per time interval similar to MDS block codes of a large length. The family of convolutional Goppa codes defined over elliptic curves provides also a number of codes which are strongly MDS. A representative subset of them are presented in the following table. More examples can be found in [8]. We would like to stress two interesting facts. In [2] several examples of strongly MDS convolutional codes are presented, which are obtained by different methods. However all of them have in common that the length of the code and the characteristic of the base field must be coprime. This condition is not necessary in our construction and actually some of the codes obtained in this way don’t fulfill it. Secondly, the already mentioned decoding algorithm for this kind of codes that is proposed in the same paper, has the drawback of needing a general syndrome decoding algorithm. The variety of examples presented suggests that elliptic convolutional Goppa codes are a promising way to obtain strongly MDS codes with a rich algebraic structure that would allow to adapt that decoding scheme in order to make it practical.
On Elliptic Convolutional Goppa Codes
89
Table 1. Examples of optimal elliptic convolutional Goppa codes [n, k, δ] [2,1,1] [2,1,1] [2,1,2] [2,1,2] [2,1,7] [2,1,7] [3,1,1] [3,1,2] [3,1,2] [3,1,6] [4,1,2] [4,1,6] [4,1,8] [3,2,1] [3,2,1] [3,2,3] [4,2,1] [4,2,1] [4,2,3] [4,2,5] [4,2,5] [5,2,1] [4,3,2] [4,3,2]
p 3 ≥3 2 ≥5 7 13 ≥5 2 7 11 5 11 11 5 7 7 ≥3 ≥3 7 2 13 ≥3 7 11
df ree (a,b) L(G) 4 (α1 z, α1 z + 2z 2 ) {x} 4 (z, α2 z − α2 z 2 ) {x} 5 (1 + z, 1) {y}, {xy} 6 (2z, 1) {x} 15 (z + 2z 2 , 1 + 4z + 3z 2 ) {y}, {xy} 16 (3z 2 , 1 + 2z + 2z 2 ) {y}, {xy} 6 (z, 2 − 5z + 3z 2 ) {y}, {xy} 8 (z, z 2 ) {x} 9 (z, 1 + 2z + 4z 2 ) {y} 21 (z, 1 + 3z 2 ) {xy} 12 (z, 1 + 2z + 2z 2 ) {y} 28 (z, 1 + 4z + 6z 2 ) {xy} 36 (3z, 6 + 4z 2 ) {y} 3 (1 + z, z + 3z 2 ) {x, y}, {x2 , xy} 3 (1 + 2z, 2z + z 2 ) {x, y}, {x2 , xy} 2 6 (2z, 2 + 6z ) {x, y} 4 (z, −2 + 2z) {1, x} 4 (z, −2 + 3z − z 2 ) {1, x} 8 (2z, 3 + 2z 2 ) {x, y} 8 (z, 1) {x, y} 12 (z, 4 + 4z + 5z 2 ) {x, y} 5 (z, −2 + 3z − z 2 ) {1, x} 4 (5 + α3 z, 3 − α3 z) {1, x, y} 4 (6 + α3 z, 6 − α3 z) {1, x, y} α2 = −1 α3 = 0
i,D = Pi 1, 4 1, 4 4, 5 2, 6 6, 7 6, 7 1, 4, 5 1, 4, 5 1, 4, 5 1, 4, 5 2, 3, 6, 7 2, 3, 6, 7 2, 3, 6, 7 1, 4, 5 1, 4, 5 1, 4, 5 0, 1, 2, 4 1, 2, 4, 6 2, 3, 6, 7 2, 3, 6, 7 2, 3, 6, 7 0, 1, 2, 4, 6 1, 3, 6, 7 1, 3, 6, 7
Bound MDS MDS Griesmer MDS Griesmer MDS MDS Griesmer MDS MDS MDS MDS MDS MDS MDS MDS MDS MDS MDS Griesmer MDS MDS MDS MDS
Table 2. Examples of strongly MDS elliptic convolutional Goppa codes [n, k, δ] p df ree (a,b) L(G) i,D = Pi [2,1,1] 5 4 (z, 1 + 2z + 2z 2 ) {x} 2, 6 [2,1,1] 7 4 (z, 2 + 2z + 3z 2 ) {x} 2, 6 [2,1,2] 11 6 (0, 2 + α1 z) {y}, {xy} 4, 5 [2,1,2] 11,13 6 (2z, 3) {y}, {xy} 4, 5 [2,1,2] 13 6 (2z, 5 + 3z) {y}, {xy} 6, 7 [3,1,2] 3 3 (z 2 , 1 + z) {1, x} 1, 2, 6 [3,1,2] ≥ 5 3 (z, 1 − 3z + 2z 2 ) {x, y},{x2 , xy} 1, 4, 5 [4,2,1] 5,11 4 (2 + α2 z, α3 + α2 α3 z) {1, x} 0, 1, 2, 4 α1 = 1, . . . , 6, α2 ≥ 1, α3 ≥ 2
5
Conditions on Maximum Distance Separability
In [11] a condition is provided so that a codeword has weight bigger or equal to the value for the generalized Singleton bound, the so-called weight property. A codeword holds this property if every subset of k components has weight at
90
J.I. Iglesias Curto
least δ + 1. Hence a sufficient condition for a convolutional code to be MDS is that every codeword holds the weight property. As it is shown, the weight property implies that at least n − k + 1 components must have weight at least δ/k + 1. In particular, since these components must be of course different from 0, the previous condition is also a sufficient condition to be a MDS block code over Fq (z). It is easy to check that actually 1-dimensional MDS convolutional codes are also MDS as block codes. In particular, the necessary conditions for 1-dimensional MDS block codes are also of application in the convolutional case. One of the necessary conditions that have to be verified is the following. Proposition 2. If C(D, G) is a 1-dimensional MDS elliptic code, then for any point P such that G ∼ P , P ∈ / supp(D). Proof. Since G ∼ P , then there exists a function f such that (f ) = G − P , i. e., f ∈ L(G − P ). On the other side, f may not have a zero on the support of D since we assume that C(D, G) is MDS. Then P ∈ / supp(D). In particular, for elliptic convolutional Goppa codes we have Proposition 3. If C(D, G) is a 1-dimensional MDS convolutional code, then for any function f such that α(f ) is polynomial of degree δ, then w(f (Pi )) = δ + 1 for all i. Proof. It is a direct consequence of generalized Singleton bound. Note that these results cannot be extended to (n − 1)-dimensional codes (as one would do for block codes) since the dual code of a MDS convolutional code may not be MDS.
6
Conclusions
The Goppa construction to define linear block codes can be also considered in a very natural way on the convolutional coding context. In particular an explicit expression of convolutional Goppa codes defined over elliptic curves can be given. This construction yields optimal codes over different fields and for different values of their parameters. It is of particular interest the set of strongly MDS convolutional codes of this family. However the problem of determining the free distance is far harder than in the block coding case. Some conditions to get 1-dimensional MDS convolutional codes have been given, but still remains unknown wether they can be extended for codes of any dimension.
References 1. Forney, G.D.: Convolutional codes I: Algebraic structure. IEEE Trans. Information Theory (1970) 2. Gluesing-Luerssen, H., Rosenthal, J., Smarandache, R.: Strongly MDS convolutional codes. IEEE Trans. Information Theory (52), 584–598 (2006)
On Elliptic Convolutional Goppa Codes
91
3. Gluesing-Luerssen, H., Schmale, W.: Distance bounds for convolutional codes and some optimal codes (2003), arXiv:math/0305135v1 4. Goppa, V.D.: Codes associated with divisors. Probl. Peredachi Inform. 13(1), 33–39 (1977); Translation: Probl. Inform. Transmission 13, 22–26 (1977) 5. Goppa, V.D.: Codes on algebraic curves. Dokl. Adad. Nauk SSSR 259, 1289–1290 (1981); Translation: Soviet Math. Dokl. 24, 170–172 (1981) 6. Hartshorne, R.: Algebraic geometry. Grad. Texts in Math., vol. 52. Springer, New York (1977) 7. Husemoller, D.: Elliptic curves. Springer, New York (1987) 8. Iglesias Curto, J.I.: A study on convolutional codes. Classfication, new families and decoding, Ph.D. thesis, Universidad de Salamanca (2008) 9. Mu˜ noz Porras, J.M., Dominguez Perez, J.A., Iglesias Curto, J.I., Serrano Sotelo, G.: Convolutional Goppa codes. IEEE Trans. Inform. Theory 52(1), 340–344 (2006) 10. Mu˜ noz Porras, J.M., Dom´ınguez P´erez, J.A., Serrano Sotelo, G.: Convolutional codes of Goppa type. AAECC 15, 51–61 (2004) 11. Rosenthal, J., Smarandache, R.: Maximum distance separable convolutional codes. AAECC 10(1), 15 (1999)
The Minimum Hamming Distance of Cyclic Codes of Length 2ps ¨ ¨ Hakan Ozadam and Ferruh Ozbudak Department of Mathematics and Institute of Applied Mathematics ˙ on¨ Middle East Technical University, In¨ u Bulvarı, 06531, Ankara, Turkey {ozhakan,ozbudak}@metu.edu.tr
Abstract. We study cyclic codes of length 2ps over Fq , where p is an odd prime. Using the results of [1], we compute the minimum Hamming distance of these codes. Keywords: Cyclic code; repeated-root cyclic code; Hamming distance.
1
Introduction
Although it has been shown in [1] that repeated-root cyclic codes over finite fields are “asymptotically bad”, they remain to be interesting in some cases (see, for example, [6,7,8,9]). For instance, a sequence of binary repeated-root cyclic codes, that are optimal, was found in [8]. Moreover, in [6], it has been shown that the minimum Hamming distance of repeated-root cyclic codes over Galois rings can be determined by the minimum Hamming distance of codes of the same length over the residue finite field. Using this result of [6], the minimum Hamming distance of cyclic codes of length ps over Galois rings of characteristic pa is given in [7]. The minimum Hamming distance of cyclic codes of length ps over a finite field of characteristic p is given in [2]. Later in [5], we have shown that the minimum Hamming distance of these codes can also be computed by using the results of [1] via simpler and more direct methods compared to that of [2]. In this study, we extend our methods to cyclic codes of length 2ps . Namely, we determine the minimum Hamming distance of all cyclic codes, of length 2ps , over a finite field of characteristic p, where p is an odd prime and s is an arbitrary positive integer. This paper is organized as follows. In Section 2, we fix our notation and recall some preliminaries. In Section 3, using the results of [1], we compute the minimum Hamming distance of all cyclic codes of length 2ps over a finite field of characteristic p, where p is an odd prime. We summarize our results in Table 1 at the end of Section 3.
2
Preliminaries
Let p be an odd prime and Fq be a finite field of characteristic p. Let n be a positive integer. Throughout this paper we identify a codeword (a0 , a1 , . . . , an−1 ) M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 92–100, 2009. c Springer-Verlag Berlin Heidelberg 2009
The Minimum Hamming Distance of Cyclic Codes of Length 2ps
93
over Fq with the polynomial a(x) = a0 +a1 x+· · ·+an−1 xn−1 ∈ Fq [x]. We denote the minimum Hamming distance of a code C by dH (C). Let a = xn − 1 be an ideal of Fq [x] and let Ra be the finite ring given by Ra = Fq [x]/a. It is well-known that cyclic codes, of length n, over Fq are ideals of Ra (see, for example, Chapter 7 of [4]). Any element of Ra can be represented uniquely as f (x) + a where deg(f (x)) < n. The codeword which corresponds to f (x) + a is (f0 , f1 , . . . , fn−1 ), where f (x) = f0 + f1 x + · · · + fn−1 xn−1 ∈ Fq [x]. Since Fq [x] is a principal ideal domain, for any ideal I of Ra , there exists a unique monic polynomial g(x) ∈ Fq [x] with deg(g(x)) < n and g(x)|xn − 1 such that I = g(x). Let h(x) ∈ Fq [x] be a polynomial with h(0) = 0, then h(x) | xm − 1 for some positive integer m (c.f. [3, Lemma 3.1]). Let e be the least positive integer with h(x) | xe −1. The order of h(x) is defined to be e and it is denoted by ord(h) = e.
3
Cyclic Codes of Length 2ps
In this section we determine the minimum Hamming distance of all cyclic codes of length 2ps over Fq , where s is a positive integer. s s s s Let 0 ≤ i, j ≤ ps be integers. Since x2p − 1 = (x2 − 1)p = (x − 1)p (x + 1)p , s i j all cyclic codes of length 2p over Fq are of the form (x − 1) (x + 1) . s Let Ra = Fq [x]/x2p −1 and let C = (x−1)i (x+1)j ⊂ Ra . If (i, j) = (0, 0), then C = Ra and if (i, j) = (ps , ps ), then C = {0}. These are the trivial cyclic codes of length 2ps over Fq . In the rest of this section, we compute the minimum Hamming distance of all non-trivial cyclic codes of length 2ps over Fq . We partition the set {0, 1, . . . , ps } into 4 subsets. The first one is the range 0, . . . , ps−1 , the second one is the range ps−1 + 1, . . . , (p − 1)ps−1 , the third one is the range (p − 1)ps−1 + 1, . . . , ps − 1 and the last one just consists of ps . These ranges arise naturally from technicalities of our computations explained below. If i = 0, or j = 0, or 0 ≤ i, j ≤ ps−1 , then the minimum Hamming distance of C can easily be determined as shown in Lemma 2 and Lemma 3 below. We note that from Lemma 8 till Theorem 2, we consider only the cases, where i ≥ j, explicitly. This is because the cases, where j > i, can be treated similarly as the corresponding case of i > j and consequently, we have the analogous results in these cases. Finally, in Theorem 2, we state all of our results explicitly corresponding to each case including the ones where j > i. First we give an overview of the results in this section. If 0 < j ≤ ps−1 and s−1 p + 1 ≤ i ≤ ps , then the minimum Hamming distance of C is computed in Lemma 8 and Lemma 9. Note that we use the results of [1] for our computations after Lemma 5. If ps−1 + 1 ≤ j ≤ i ≤ (p − 1)ps−1 , then the minimum Hamming distance of C is computed in Lemma 10. If ps−1 + 1 ≤ j ≤ (p − 1)ps−1 < i ≤ ps − 1, then the minimum Hamming distance of C is computed in Lemma 11. If (p − 1)ps−1 + 1 ≤ j ≤ i ≤ ps − 1, then the minimum Hamming distance of C is computed in Lemma 12 and Lemma 13. If 0 < j ≤ ps−1 and i = ps , then the minimum Hamming distance of C is computed in Lemma 14. If ps−1 + 1 ≤ j ≤ (p − 1)ps−1 and i = ps , then the minimum Hamming distance of C is computed
94
¨ ¨ H. Ozadam and F. Ozbudak
in Lemma 15. Finally, if (p − 1)ps−1 + 1 ≤ j ≤ ps − 1 and i = ps , then the minimum Hamming distance of C is computed in Lemma 16. We begin to present and prove our results with the next lemma. Lemma 1. Let C be an ideal of Ra with {0} = C Ra . Then dH (C) ≥ 2. Proof. This follows from the observation that αxm is a unit in Ra for α ∈ Fq \{0} and m ∈ Z+ ∪ {0}.
Lemma 2. Let 0 < i, j ≤ ps be integers, let C = (x − 1)i and D = (x + 1)j . Then dH (C) = dH (D) = 2. s
Proof. The proof follows from Lemma 1 and the observation that (x− 1)i |xp − 1 s and (x + 1)j |xp + 1.
Lemma 3. Let C = (x + 1)i (x − 1)j , for some 0 ≤ i, j ≤ ps−1 with (i, j) = (0, 0). Then dH (C) = 2. s−1
Proof. For this range of i and j, we have (x + 1)i (x − 1)j |(x2p
− 1).
The following lemma shows that we can use the results of [1] for computing the minimum Hamming distance of the remaining codes. Lemma 4. Let f (x) = (x − 1)i (x + 1)j ∈ Fq [x] for some integers 0 ≤ i, j ≤ pe . Then ord(f ) = 2pe if j > pe−1 , or if j > 0 and i > pe−1 . Proof. The proof follows from the fact that ord(x + 1) = 2, [3, Theorem 3.8] and [3, Theorem 3.9].
Let G1 = {(x − 1)i (x + 1)j : G2 = {(x − 1)i (x + 1)j :
ps ≥ i > 0 and ps ≥ j > ps−1 }, ps ≥ i > ps−1
and ps ≥ j > 0} s
s
and let g(x) be any element of G1 ∪ G2 \ {(x − 1)p (x + 1)p }. By Lemma 4, ord(g(x)) = 2ps . Therefore we can use the results of [1] to determine the minimum Hamming distance of g(x). Let 0 ≤ t ≤ ps − 1 be an integer. For g(x) = (x − 1)i (x + 1)j ∈ G1 ∪ G2 \ {(x − s ps 1) (x + 1)p }, we define C = g(x), 1, if i > t, 1, if j > t, and ej,t = ei,t = 0, otherwise, 0, otherwise. Let g¯t (x) = (x − 1)ei,t (x + 1)ej,t and let C¯t = ¯ gt (x) ⊂ Fq [x]/x2 − 1 be the simple-root cyclic code depending on C and t. Following the conventions of [1], we set ⎧ ⎨ 2, if g¯t (x) = x − 1 or g¯t (x) = x + 1, dH (C¯t ) = 1, if g¯t (x) = 1, ⎩ ∞, if g¯t (x) = x2 − 1.
The Minimum Hamming Distance of Cyclic Codes of Length 2ps
95
Then we have the following cases. If i ≥ j > t or j ≥ i > t, and therefore dH (C¯t ) = ∞.
then g¯t (x) = x2 − 1
(1)
If i > t ≥ j or j > t ≥ i, then g¯t (x) = x − 1 or g¯t (x) = x + 1, respectively, and therefore dH (C¯t ) = 2.
(2)
If t ≥ i ≥ j or t ≥ j ≥ i, and therefore dH (C¯t ) = 1.
(3)
then g¯t (x) = 1
For 0 ≤ t ≤ ps − 1, let 0 ≤ t0 , t1 , . . . , ts−1 ≤ p − 1 be the uniquely determined integers such that t = t0 + t1 p + · · · + ts−1 ps−1 , and Pt be the positive integer given by Pt =
s−1
(tm + 1) ∈ Z.
m=0
We define (cf. [1, page 339]) the set T = {t : t = (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−(j−1) + rps−j , 1 ≤ j ≤ s, 1 ≤ r ≤ p − 1} ∪ {0}. In [1], it has been shown that we can express the minimum Hamming distance of C in terms of dH (C¯t ) and Pt . Lemma 5 ([1, Lemma 1]). Let C, C¯t and Pt be as above. The minimum Hamming distance of C satisfies dH (C) ≤ Pt dH (C¯t ) for all t ∈ {0, 1, . . . , ps − 1}. Theorem 1 ([1, Theorem 1]). Let C, C¯t and Pt be as above. Then dH (C) = Pt dH (C¯t ) for some t ∈ T . Combining Lemma 5 and Theorem 1, we obtain dH (C) = min{Pt dH (C¯t ) :
t ∈ T }.
(4)
Clearly, (1) implies that dH (C) = Pt dH (C¯t ) is impossible for i ≥ j > t and j ≥ i > t. So we will consider only the cases (2) and (3) in our computations. The following proposition is a useful tool for our computations.
Proposition 1. Let t, t ∈ T , then t < t if and only if Pt < Pt . Proof. We have t = (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−(e−1) + rps−e
t = (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−(e
−1)
and
+ r ps−e
for some 1 ≤ e, e ≤ s and 1 ≤ r, r ≤ p − 1 as t, t ∈ T . First assume that t > t . Then either e > e , or e = e and r > r . If e > e , then pe−e ≥ p so pe−e (r + 1) ≥ 2p. Moreover r ≤ p − 1 implies r + 1 ≤ p. So we get pe−e (r + 1) − (r + 1) ≥ p. Therefore
Pt − Pt = pe (pe−e (r + 1) − (r + 1)) ≥ pe p > 0.
96
¨ ¨ H. Ozadam and F. Ozbudak
Hence Pt > Pt . If e = e and r > r , then Pt − Pt = pe ((r + 1) − (r + 1)) = pe (r − r ) > 0 and hence Pt > Pt . Using similar arguments, we obtain that if
Pt > Pt , then t > t . This completes the proof. As an immediate consequence of Proposition 1, we have min{Pt :
t∈T ,
t ≥ j} = min{Pt :
t∈T ,
i > t ≥ j},
(5)
provided that these sets are nonempty. We use (5) from Lemma 8 on. The minimum value of Pt is given in the following two lemmas, as t runs through certain sets. These sets are determined from the ranges of i and j in the definition of cyclic codes. Note that if i is an integer satisfying 1 ≤ i ≤ (p − 1)ps−1 , then there exists a uniquely determined integer β such that 0 ≤ β ≤ p − 2 and βps−1 + 1 ≤ i ≤ (β + 1)ps−1 . Lemma 6. Let be an integer such that βps−1 + 1 ≤ ≤ (β + 1)ps−1 , where β is an integer with 0 ≤ β ≤ p − 2. Then min{Pt : t ≥ and t ∈ T } = β + 2. Proof. If t ≥ and t ∈ T , then t ≥ (β + 1)ps−1 . So we get min{t ∈ T : t ≥ } = (β + 1)ps−1 . Using Proposition 1, we obtain min{Pt : t ≥ and t ∈ T } = β + 2.
Note that ps − ps−1 < ps − ps−2 < · · · < ps − ps−s = ps − 1. Hence for an integer i satisfying (p − 1)ps−1 + 1 = ps − ps−1 + 1 ≤ i ≤ ps − 1, there exists a uniquely determined integer k such that 1 ≤ k ≤ s−1 and ps −ps−k +1 ≤ i ≤ ps −ps−k−1 . Moreover we have ps − ps−k < ps − ps−k + ps−k−1 < ps − ps−k + 2ps−k−1 < · · · < ps − ps−k + (p − 1)ps−k−1 and ps − ps−k + (p − 1)ps−k−1 = ps − ps−k−1 . Therefore for such integers i and k, there exists a uniquely determined integer τ with 1 ≤ τ ≤ p − 1 such that ps − ps−k + (τ − 1)ps−k−1 + 1 ≤ i ≤ ps − ps−k + τ ps−k−1 . Lemma 7. Let τ and k be integers with 1 ≤ τ ≤ p − 1 and 1 ≤ k ≤ s − 1. If is an integer with ps − ps−k + (τ − 1)ps−k−1 + 1 ≤ ≤ ps − ps−k + τ ps−k−1 , then min{Pt : t ≥ and t ∈ T } = (τ + 1)pk . Proof. For 0 ≤ α ≤ p − 1, we have ps − ps−k + αps−k−1 = (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−k + αps−k−1 . So we get (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−k + (τ − 1)ps−k−1 + 1 ≤ ≤ (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−k + τ ps−k−1 . Now t ≥ and t ∈ T implies t ≥ (p − 1)ps−1 + (p − 1)ps−2 + · · · + (p − 1)ps−k + t ≥ } = (p − 1)ps−1 + (p − 1)ps−2 + · · · + τ ps−k−1 . Thus min{t ∈ T : s−k s−k−1 (p − 1)p + τp . Hence, using Proposition 1, we obtain min{Pt : t ∈ T and t ≥ } = (τ + 1)ps−k .
Now we are ready to obtain the minimum Hamming distance of the cyclic codes when 0 < j ≤ ps−1 < i < ps in the following two lemmas. Recall that from Lemma 8 till Theorem 2 we state the results when j ≤ i only. For j > i, the
The Minimum Hamming Distance of Cyclic Codes of Length 2ps
97
analogous results are obtained by the same method, and their statements are included in Theorem 2. Lemma 8. Let C = (x − 1)i (x + 1)j , where 2ps−1 ≥ i > ps−1 ≥ j > 0. Then dH (C) = 3. Proof. Note that ps−1 ∈ T and i > ps−1 ≥ j. Hence the set {Pt : T and i > t ≥ j} is nonempty. So, by Lemma 6, we get min{Pt : T and i > t ≥ j} = 2. If i > t ≥ j, then dH (C¯t ) = 2 by (2), thus
t ∈ t ∈
t∈T
(6)
min{Pt dH (C¯t ) :
and i > t ≥ j} = 4.
If t ≥ i > j and 2ps−1 ≥ i > ps−1 , then, by Lemma 6, we get min{Pt : T and t ≥ i > j} = 3. Note that dH (C¯t ) = 1 by (3), so
t∈
t∈T
(7)
min{Pt dH (C¯t ) :
and t ≥ i > j} = 3.
Using (4), (6) and (7), we conclude dH (C) = 3.
Lemma 9. Let C = (x − 1)i (x + 1)j , where ps > i > 2ps−1 and ps−1 ≥ j > 0. Then dH (C) = 4. Proof. By Lemma 6, (2) and (3) we get min{Pt dH (C¯t ) : min{dH (C¯t )Pt :
t∈T
and i > t ≥ j} = 4,
(8)
t∈T
and t ≥ i > j} ≥ 4.
(9)
Using (4), (8) and (9), we obtain dH (C) = 4. We consider the case ps−1 + 1 ≤ j ≤ i ≤ (p − 1)ps−1 in the following lemma.
Lemma 10. Let j ≤ i, 1 ≤ β ≤ β ≤ p − 2 be integers with βps−1 + 1 ≤ i ≤ (β + 1)ps−1 and β ps−1 + 1 ≤ j ≤ (β + 1)ps−1 . Let C = (x − 1)i (x + 1)j , then dH (C) = min{β + 2, 2(β + 2)}.
Proof. First suppose that β = β . Note that if i > t ≥ j, then t ∈ T . For t ≥ i ≥ j, by Lemma 6, we get min{Pt : t ∈ T and t ≥ i ≥ j} = β + 2 = β + 2. For t ≥ i ≥ j, dH (C¯t ) = 1 by (3). So we have min{Pt dH (C¯t ) :
t∈T
and t ≥ i ≥ j} = β + 2 = β + 2.
Next we assume β < β. Then, using Lemma 6, we get min{Pt : t ∈ T t ≥ j} = (β + 2). Since i > t ≥ j, dH (C¯t ) = 2 by (2), therefore min{Pt dH (C¯t ) :
t∈T
t∈T
and i >
and i > t ≥ j} = 2(β + 2).
For t ≥ i ≥ j, using Lemma 6, we get min{Pt : (β + 2). Since t ≥ i ≥ j, dH (C¯t ) = 1 by (3), thus min{Pt dH (C¯t ) :
(10)
t∈T
(11)
and t ≥ i ≥ j} =
and t ≥ i ≥ j} = β + 2.
Using (4), (10), (11) and (12), we obtain dH (C) = min{2(β + 2), β + 2}.
(12)
¨ ¨ H. Ozadam and F. Ozbudak
98
The following lemma deals with the case ps−1 + 1 ≤ j < (p − 1)ps−1 < i ≤ ps − 1. Lemma 11. Let i, j, 1 ≤ τ ≤ p − 1, 1 ≤ β ≤ p − 2 and 1 ≤ k ≤ s − 1 be integers such that ps − ps−k + (τ − 1)ps−k−1 + 1 ≤ i ≤ ps − ps−k + τ ps−k−1 and βps−1 + 1 ≤ j ≤ (β + 1)ps−1 . Let C = (x − 1)i (x + 1)j , then dH (C) = 2(β + 2). Proof. By Lemma 6, (2), Lemma 7 and (3), we get min{Pt dH (C¯t ) : min{Pt dH (C¯t ) :
t∈T
and i > t ≥ j} = 2(β + 2),
t∈T
k
(13)
and t ≥ i > j} = (τ + 1)p .
(14)
It follows from β ≤ p − 2, 1 ≤ τ and 1 ≤ k that 2(β + 2) ≤ 2p ≤ (τ + 1)pk . So, using (4), (13) and (14), we obtain dH (C) = 2(β + 2)..
In the following two lemmas, we obtain the minimum Hamming distance when (p − 1)ps−1 + 1 ≤ j ≤ i ≤ ps − 1.
Lemma 12. Let j ≤ i, 1 ≤ k ≤ s − 1, 1 ≤ τ ≤ τ ≤ p − 1 be integers such that ps − ps−k + (τ − 1)ps−k−1 + 1 ≤ i ≤ ps − ps−k + τ ps−k−1 and ps − ps−k + (τ − 1)ps−k−1 + 1 ≤ j ≤ ps − ps−k + τ ps−k−1 .
Let C = (x − 1)i (x + 1)j , then dH (C) = min{2(τ + 1)pk , (τ + 1)pk }.
Proof. First suppose that τ = τ . Obviously, there exists no t ∈ T with i > t ≥ j. So we consider t ≥ i ≥ j. By Lemma 7 and (3) we get min{Pt dH (C¯t ) :
t∈T
and t ≥ j ≥ i} = (τ + 1)pk .
(15)
Next suppose that τ < τ . By Lemma 7, (2) and (3) we get min{Pt dH (C¯t ) : min{Pt dH (C¯t ) :
t∈T t∈T
and i > t ≥ j} = 2(τ + 1)pk ,
(16)
k
and t ≥ i ≥ j} = (τ + 1)p .
(17)
Using (4), (15), (16) and (17), we obtain dH (C) = min{2(τ + 1)pk , (τ + 1)pk }.
Lemma 13. Let i, j, 1 ≤ k < k ≤ s − 1, 1 ≤ τ , τ ≤ p − 1 be integers such that ps − ps−k + (τ − 1)ps−k−1 + 1 ≤ i ≤ ps − ps−k + τ ps−k−1 and ps − ps−k + (τ − 1)ps−k −1 + 1 ≤ j ≤ ps − ps−k + τ ps−k −1 . Let C = (x − 1)i (x + 1)j , then dH (C) = 2(τ + 1)pk . Proof. By Lemma 7, (2) and (3) we get min{Pt dH (C¯t ) : min{Pt dH (C¯t ) :
t∈T t∈T
and i > t ≥ j} = 2(τ + 1)pk , and t ≥ i > i} = (τ + 1)pk .
(18) (19)
It follows from k > k , τ + 1 ≥ 2 and τ + 1 ≤ p that pk (τ + 1) ≥ 2pk (τ + 1). So, using (4), (18) and (19) we obtain dH (C) = min{2(τ + 1)pk , (τ + 1)pk } =
2(τ + 1)pk .
The Minimum Hamming Distance of Cyclic Codes of Length 2ps
99
Finally it remains to consider the cases i = ps and 0 < j ≤ ps − 1. Clearly, for i = ps and 0 < j ≤ ps − 1, there is no t ∈ T with t ≥ i ≥ j, so we only consider i > t ≥ j in the following three lemmas. Their proofs follow from Lemma 6, Lemma 7, (2), (4) and similar arguments as above. s
Lemma 14. Let 0 < j ≤ ps−1 be an integer and let C = (x − 1)p (x + 1)j . Then dH (C) = 4. Lemma 15. Let 1 ≤ β ≤ p − 2, βps−1 + 1 ≤ j ≤ (β + 1)ps−1 be integers. Let s C = (x − 1)p (x + 1)j , then dH (C) = 2(β + 2). Lemma 16. Let 1 ≤ τ ≤ p − 1, 1 ≤ k ≤ s − 1, j be integers such that ps − s ps−k + (τ − 1)ps−k−1 + 1 ≤ j ≤ ps − ps−k + τ ps−k−1 . Let C = (x − 1)p (x + 1)j , then dH (C) = 2(τ + 1)pk . We summarize our results in the following theorem (see Table 1 also). Theorem 2. Let p be an odd prime and let a, s be arbitrary positive integers. For q = pa , all cyclic codes of length 2ps over Fq are of the form (x − 1)i (x + 1)j ⊂ s s Fq [x]/x2p − 1 with 0 ≤ i, j ≤ ps . Let C = (x − 1)i (x + 1)j ⊂ Fq [x]/x2p − 1. Table 1. The minimum Hamming distance of all non-trivial cyclic codes, of the form (x − 1)i (x + 1)j , of length 2ps over Fq . The parameters 1 ≤ β ≤ β ≤ p − 2, (2) (1) (3) (4) <τ ≤ p − 1, 1 ≤ τ, τ , τ ≤ p − 1 , 1 ≤ k ≤ s − 1, 1 ≤ k < k ≤ s − 1 1≤τ are integers. The cases with i ≥ j are given. For the cases with i ≤ j, see Remark 1. Case i 1 0 < i ≤ ps 2 0 ≤ i ≤ ps−1 3 ps−1 < i ≤ 2ps−1 4 2ps−1 < i ≤ ps 5 6 7 8
j j=0 0 ≤ j ≤ ps−1 0 < j ≤ ps−1 0 < j ≤ ps−1
βps−1 + 1 ≤ i ≤ (β + 1)ps−1 ps − ps−k + (τ − 1)ps−k−1 +1 ≤ i ≤ ps − ps−k + τ ps−k−1 ps − ps−k + (τ − 1)ps−k−1 +1 ≤ i ≤ ps − ps−k + τ ps−k−1 ps − ps−k + (τ (1) − 1)ps−k−1 +1 ≤ i ≤ ps − ps−k +τ (1) ps−k−1
ps − ps−k + (τ (3) − 1)ps−k 9
s
+1 ≤ i ≤ p − p
10
i = ps
11
i = ps
s−k
+τ (3) ps−k
−1
β ps−1 + 1 ≤ j ≤ (β + 1)ps−1 βps−1 + 1 ≤ j ≤ (β + 1)ps−1 s
s−k
p −p + (τ − 1)p (τ + 1)pk +1 ≤ j ≤ ps − ps−k + τ ps−k−1 min{ ps − ps−k + (τ (2) − 1)ps−k−1 2(τ (2) + 1)pk , +1 ≤ j ≤ ps − ps−k (2) s−k−1 (τ (1) + 1)pk } +τ p
ps − ps−k + (τ (4) − 1)ps−k +1 ≤ j ≤ p − p
−1
2(β + 2)
s−k−1
s
dH (C) 2 2 3 4 min{β + 2, 2(β + 2)}
s−k
−1
2(τ (4) + 1)pk
+τ (4) ps−k −1 s−1 βp + 1 ≤ j ≤ (β + 1)ps−1 ps − ps−k + (τ − 1)ps−k−1 +1 ≤ j ≤ ps − ps−k +τ ps−k−1
2(β + 2) 2(τ + 1)pk
100
¨ ¨ H. Ozadam and F. Ozbudak s
s s If (i, j) = (0, 0), then C is the whole space F2p q , and if (i, j) = (p , p ), then C is the zero space {0}. For the remaining values of (i, j), the minimum Hamming distance of the cyclic code C is given in Table 1.
Remark 1. We simplified Table 1 by giving the cases only with i ≥ j for some ranges of i and j. The corresponding case with j ≥ i have the same minimum Hamming distance by symmetry. For example in Case 1, the corresponding case is i = 0 and 0 ≤ j ≤ ps , and the minimum Hamming distance is 2.
Acknowledgements The authors would like to thank anonymous referees for their valuable comments. ¨ ITAK ˙ The authors were partially supported by TUB under Grant No. TBAG107T826.
References 1. Castagnoli, G., Massey, J.L., Schoeller, P.A., von Seemann, N.: On repeated-root cyclic codes. IEEE Trans. Inform. Theory 37, 337–342 (1991) 2. Dihn, H.Q.: On the linear ordering of some classes of negacyclic and cyclic codes and their distance distributions. Finite Fields Appl. 14, 22–40 (2008) 3. Lidl, R., Niederreiter, H.: Finite Fields. Cambridge University Press, Cambridge (1997) 4. MacWilliams, F.J., Sloane, N.J.A.: The Theory Of Error Correcting Codes. North Holland, Amsterdam (1978) ¨ ¨ 5. Ozadam, H., Ozbudak, F.: A note on negacyclic and cyclic codes of length ps over a finite field of characteristic p (2009) (submitted) 6. S˘ al˘ agean, A.: Repeated-root cyclic and negacyclic codes over a finite chain ring. Discrete Appl. Math. 154(2), 413–419 (2006) 7. L´ opez-Permouth, S.R., Szabo, S.: On the Hamming weight of repeated-root cyclic and negacyclic codes over Galois rings (2009) (preprint) 8. van Lint, J.H.: Repeated-root cyclic codes. IEEE Trans. Inform. Theory 37, 343–345 (1991) 9. Zimmerman, K.-H.: On generalizations of repeated-root cyclic codes. IEEE Trans. Inform. Theory 42, 641–649 (1996)
There Are Not Non-obvious Cyclic Affine-invariant Codes ´ Jos´e Joaqu´ın Bernal, Angel del R´ıo, and Juan Jacobo Sim´ on Departamento de Matem´ aticas, Universidad de Murcia, Spain
Abstract. We show that an affine-invariant code C of length pm is not permutation equivalent to a cyclic code except in the obvious cases: m = 1 or C is either {0}, the repetition code or its dual.
Affine-invariant codes were firstly introduced by Kasami, Lin and Peterson [KLP2] as a generalization of Reed-Muller codes. This class of codes has received the attention of several authors because of its good algebraic and decoding properties [D, BCh, ChL, Ho, Hu]. It is well known that every affine-invariant code can be seen as an ideal of the group algebra of an elementary abelian group in which the group is identified with the standard base of the ambient space. In particular, if C is a code of prime length then C is permutation equivalent to a cyclic code. Other obvious affine-invariant cyclic codes are the trivial code, {0}, the repetition code and the code form by all the even-like words, provided its length is a prime power. In this paper we prove that these are the only affine-invariant codes which are permutation equivalent to a cyclic code. Our main tools are an intrinsical characterization of group codes obtained in [BRS] and a description of the group of permutation automorphisms of nontrivial affine-invariant codes given in [BCh]. These results are reviewed in Section 1, where we also recall the definition and main properties of affine-invariant codes. In Section 2 we prove the main result of the paper.
1
Preliminaries
In this section we recall the definition of (left) group code and the intrinsical characterization given in [BRS]. We also recall the definition of affine-invariant code and the description of its group of permutation automorphisms given in [BCh]. All throughout F is a field of order a power of p, where p is a prime number. The finite field with ps elements is denoted by Fps . For a group G, we denote by FG the group ring of G with coefficients in F. All the group theoretical notions used in this paper can be easily founded in [R]. Definition 1. If E is the standard basis of Fn , C ⊆ Fn is a linear code and G is a group (of order n) then we say that C is a G-code if there is a bijection
Research supported by D.G.I. of Spain and Fundaci´ on S´eneca of Murcia.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 101–106, 2009. c Springer-Verlag Berlin Heidelberg 2009
102
´ del R´ıo, and J.J. Sim´ J.J Bernal, A. on
φ : E → G such that the linear extension of φ to an isomorphism φ : Fn → FG maps C to an ideal of FG. A group code is a linear code which is a G-code for some group G. A cyclic group code (respectively, abelian group code, solvable group code, etc) is a linear code which is G-code for some cyclic group G (respectively, abelian group, solvable group, etc). Let Sn denote the group of permutations of n symbols. Every σ ∈ Sn defines an automorphism of Fn in the obvious way, i.e. σ(x1 , . . . , xn ) = (xσ−1 (1) , . . . , xσ−1 (n) ). By definition, the group of permutation automorphisms of a linear code C of length n is PAut(C) = {σ ∈ Sn : σ(C) = C}. An intrinsical characterization of group codes C in terms of PAut(C) has been obtained in [BRS]. Four our purposes we only need to consider abelian groups. So we record in the following theorem the specialization of [BRS, Theorem 1.2] to abelian groups. Theorem 1. [BRS] Let C be a linear code of length n over a field F and G a finite abelian group of order n. Then C is a G-code if and only if G is isomorphic to a transitive subgroup H of Sn . In the remainder of the paper we assume that I = Fpm . Often we will be only using the underlying additive structure of I; for example, FI is the group algebra of this additive group with coefficients in F. Let S(I) denote the group of permutations of I. Every element of S(I) induces a unique F-linear bijection of the group algebra FI. For an F-subspace C of FI, let PAut(C) = {σ ∈ S(I) : σ(C) = C}. An affine-invariant code is an F-subspace C of FI formed by even-like words such that PAut(C) contains the maps of the form x ∈ I → αx + β, with α ∈ I ∗ = {a ∈ I : a = 0} and β ∈ I. These maps are called affine transformations of I. m Observe that if Fp is identified with FI via some bijection from {1, . . . , pm } to I, then the linear codes of length pm correspond to subspaces of FI in such a way that the groups of permutations automorphisms agree. Therefore if C is a subspace of FI and G is a group then C is a left G-code if and only if PAut(C) contains a regular subgroup H of S(I) isomorphic to G and it is a G-code if H can be selected such that CS(I) (H) ⊆ PAut(C). Affine-invariant codes can be seen as extended cyclic codes. Recall that a cyclic code C of length n over F is a subspace of Fn which is closed under cyclic permutations, that is if (x1 , x2 , . . . , xn−1 , xn ) is an element of C then so is (xn , x1 , x2 , . . . , xn−1 ). Cyclic codes are cyclic group codes via the bijection φ : E → G given by φ(ei ) = g i−1 , where E = {e1 , . . . , en } is the standard basis of Fn and G is a cyclic group of order n generated by g. Conversely, any ideal of the group algebra FG, with G a cyclic group of order n can be seen as a cyclic code with a suitable identification of the elements of G with the coordinates. The zeroes of a cyclic code C of length n are the n-th roots of unity α such that x0 + x1 α + x2 α2 + · · · + xn−1 αn−1 = 0, for every (x0 , x1 , . . . , xn−1 ) ∈ C.
There Are Not Non-obvious Cyclic Affine-invariant Codes
103
It is well known that every cyclic code is uniquely determined by its zeroes and conversely, if ζ is a primitive n-th root of unity and D is a union of q-cyclotomic classes modulo n then there is a unique q-ary cyclic code of length n whose set of zeroes is {ζ i : i ∈ D}. Let C ⊆ FI be an affine-invariant code and let C ∗ denote the code obtained by puncturing C at the coordinate labelled by 0. The permutation automorphisms of C which fix 0 induces permutation automorphisms of C ∗ . In particular, PAut(C ∗ ) contains the maps of the form x → αx, for α ∈ I ∗ . This maps form a cyclic group isomorphic to the group of units of the field I. So, C ∗ is a cyclic group code and C is the extended code obtained by adding a parity check coordinate. However not every code obtained by extending a cyclic code of length pm − 1 is affine-invariant. We recall a characterization of Kasami, Lin and Peterson of the extended cyclic codes which are affine-invariant in terms of the roots of the cyclic code [KLP1]. The p-adic expansion ofa non-negative integer x is the list of integers (x0, x1 , . . . ) with 0 ≤ xi < p and x = i≥0 xi pi . The p-adic expansion yields a partial ordering in the set of positive integers by setting x y if xi ≤ yi , for every i, where (xi ) and (yi ) are the p-adic expansions of x and y, respectively. Let n = pm − 1 and let α be a primitive element of I, i.e. a generator of I ∗ . Identify the standard basis of Fn , E = {e1 , . . . , en }, with I ∗ via the map ei → αi−1 . A cyclic code C ∗ of length n is determined by the following set DC ∗ = {i : 0 ≤ i < n, αi is a zero of C ∗ }. Since C ∗ is a q-ary cyclic code, with q = pr , the set DC ∗ is invariant under multiplication by q modulo n, that is, DC ∗ is a union of q-cyclotomic classes modulo n. Conversely, every union D of q-cyclotomic classes modulo n, yields to a uniquely defined cyclic code C ∗ of length n with D = DC ∗ . If C ∗ is a cyclic code and C is its corresponding extended code then the defining set of C is by definition DC = DC ∗ ∪ {0} if 0 ∈ DC , and DC = DC ∗ ∪ {n} if 0 ∈ DC ∗ . Proposition 1. [KLP1][Hu, Corollary 3.5] Let C ∗ be a cyclic code of length n = pm − 1 and C the extended code of C ∗ . Then C is affine-invariant if and only if DC satisfy the following condition for every 1 ≤ s, t ≤ n: s t and t ∈ DC
⇒
s ∈ DC .
(1)
The trivial code and the repetition code of length n are cyclic with defining sets {0, 1, 2, . . . , n − 1} and {1, 2, . . . , n − 1} respectively. Their duals, i.e. Fn and the space of all the even-like words, are also cyclic nwith defining sets ∅ and {0}. Recall that a word (x1 , . . . , xn ) is even-like if i=1 xi = 0. When n = pm − 1, except for the last one, all the others give rise to affine-invariant codes of length pm : the trivial code, the repetition code and the code formed by the even-like words. These three codes are known as the trivial affine-invariant codes. For future use we describe the affine-invariant codes of length 4.
104
´ del R´ıo, and J.J. Sim´ J.J Bernal, A. on
Example 1 (Affine-invariant codes of length 4). Let D be the defining set of an affine-invariant code C of length 4 over F2r . Then C is trivial as an affineinvariant code if and only if D is either {0}, {0, 1, 2} or {0, 1, 2, 3}. Since D is invariant by multiplication by 2r modulo 3 and satisfies condition (1), if r is odd then there are not non trivial affine-invariant codes. However, if r is even then there are two non-trivial affine-invariant codes with defining sets {0, 1} and {0, 2} respectively. If C is a trivial affine-invariant code then PAut(C) = Sn , and therefore C is G-code for every group G of order pm . So to avoid trivialities, in the remainder of the paper all the affine-invariant codes are suppose to be non-trivial. The group of permutations of a (non-trivial) affine-invariant code has been described by Berger and Charpin [BCh]. In order to present their description we need to introduce some notation. We use the notation N G to represent a semidirect product of N by G via some action of G on N , which is going to be clear from the context. That is, N and G are groups and there is a group homomorphism σ : G → Aut(N ) associating g ∈ G to σg . The underlying set of N G is the direct product N × G and the product is given by (n1 , g1 )(n2 , g2 ) = (n1 σg1 (n2 ), g1 g2 ). For every d|m let GL(IFpd ) and Aff d (I) denote the groups of linear and affine transformations of I as vector space over Fpd . The group of Fpd -automorphisms of the field I is denoted by Gal(I/Fpd ). We identify every element y ∈ I with the translation x → x + y. Then Aff d (I) = I GL(IFpd ). Given two divisors a and b of m with b|a, let Ga,b = {f ∈ GL(IFpb ) : f is τ − semilinear for some τ ∈ Gal(Fpa /Fpb )}. We claim that Ga,b = GL(IFpa ), Gal(I/Fpb ) . Indeed, if f is τ -semilinear with τ ∈ Gal(Fpa /Fpb ) then τ is the restriction of σ for some σ ∈ Gal(I/Fpb ) and f σ −1 ∈ GL(IFpa ). Theorem 2. [BCh, Corollary 2] Let C be a non-trivial affine-invariant code of length pm over Fq , with q = pr . Let a = a(C) = min{d|m : Aff d (I) ⊆ PAut(C)}, b = b(C) = min{d ≥ 1 : pd DC = DC } Then b|r, b|a|m and PAut(C) = Aff a (I), Gal(I/Fpb ) = I Ga,b . A method to compute a(C) and b(C) was firstly obtained by Delsarte [D]. Later, Berger and Charpin founded two alternative methods to calculate a(C) and b(C) which are sometimes computationally simpler [BCh].
2
Affine-invariant Cyclic Group Codes
Let C be an affine-invariant code. Then C is an I-code, since the group of translations of I (which we have identified with the additive group I) is a transitive
There Are Not Non-obvious Cyclic Affine-invariant Codes
105
subgroup of S(I) contained in PAut(C). So every affine-invariant code is an elementary abelian group code. In particular, if the length of C is prime then C is a cyclic group code. Next result shows that this one is the only type of non-trivial affine-invariant cyclic group codes. Theorem 3. A non-trivial affine-invariant code is permutation equivalent to a cyclic code if and only if it has prime length. Proof. Assume that C is a non trivial affine-invariant code of length pm which is permutation equivalent to a cyclic code. Then C is a cyclic group code. By Theorem 1, this implies that G = PAut(C) contains a cyclic subgroup of order pm or equivalently G contains an element of order pm . Let a = a(C), b = b(C), as in Theorem 2 and h = m/a. Let pt be the maximum p-th power dividing a/b, and pu the minimum p-th power greater or equal than h. We first show that the existence of an element g of order pm in G implies a strong relation on these parameters which reduces to some few cases. By Theorem 2, G = I Ga,b . Furthermore H = I GL(IFpa ) is a normal subgroup of index a/b in G. Since the order of G is a p-th power and pt is t the maximum p-th power dividing a/b, g p is an element of order pm−t in H. Furthermore, H is isomorphic to Fhpa GLh (Fpa ), where GLh (Fpa ) is the group of invertible h × h matrices with entries in Fpa . Therefore there is an element (x, A) ∈ Fhpa GLh (Fpa ) of order pm−t . This implies that A has order pk with m − t − 1 ≤ k. Since the order of A is a power of p, and the fields of characteristic p do not have elements of orden p, the only eigenvalue of A is 1. We may assume that A is given in Jordan form and hence A = I + N where N is an upper u triangular matrix with zeroes in the diagonal. Then N p = 0 and therefore u Ap = I. Thus m − t − 1 ≤ k ≤ u = logp (h) and we conclude that m ≤ 1 + t + logp (h) < 2 + logp (a) + logp (h) = 2 + logp (m).
(2)
We have to prove that m = 1. Otherwise, (2) implies that either m = 2 or m = 3 and p = 2. Case 1: m = 2. We claim that in this case p = 2. Indeed, if p > 2 then 0 < logp (2) < 1 and t = 0, since pt divides m = 2. Hence 2 = m ≤ 1 + logp (h) ≤ 1 + logp (m) = 2 and so h = 2. Hence (x, A) is an element of order p2 in F2p GL2 (Fp ). Thus there are 1y x1 , x2 , y ∈ Fp such that ((x1 , x2 ), A)p = (0, 1), where A = . However 01 p−1 p ((x1 , x2 ), A)p = (x + iyx ), py , A 1 2 i=0 p(p−1) = yx , 0 , 1 = (0, 1), 2 2 which is the desired contradiction. Hence p = 2 and so C has length 4. Since C is non-trivial as affine-invariant code, by Example 1, r is even and DC = {0, 1} or {0, 2}. By the definition of b(C) we have b = 2 and hence a = 2. We deduce that PAut(C) = I F∗4 . Hence
106
´ del R´ıo, and J.J. Sim´ J.J Bernal, A. on
I F2p is the only subgroup of order 4 of PAut(C) and we conclude that C is not cyclic group code. Case 2: m = 3 and p = 2. Then t = 0 and 3 ≤ 1 + log2 (h) ≤ 1 + log2 (m) = 3, by (2). Thus h = 3, or equivalently a = 1 and hence b = 1. Thus (x, A) is an element of order 8 in F32 GL3 (F2 ). Then u = log2 (3) = 2 and so A4 = 1. If A2 = 1 then x4 = 1, a contradiction. Therefore A is a Jordan matrix of order 4 and thus ⎛ ⎞ 110 A = ⎝0 1 1⎠ 001 However I + A + A2 + A3 = 0. Then (x, A)4 = ((I + A + A2 + A3 )x, A4 ) = (0, 1), a contradiction.
References [BCh]
Berger, T.P., Charpin, P.: The Permutation group of affine-invariant extended cyclic codes. IEEE Trans. Inform. Theory 42, 2194–2209 (1996) ´ Sim´ [BRS] Bernal, J.J., del R´ıo, A., on, J.J.: An intrinsical description of group codes. Designs, Codes, Cryptog. (to appear), doi:10.1007/s10623-008-9261-z [ChL] Charpin, P., Levy-Dit-Vehel, F.: On Self-dual affine-invariant codes, J. Comb. Theory, Series A 67, 223–244 (1994) [D] Delsarte, P.: On cyclic codes that are invariant under the general linear group. IEEE Trans. Inform. Theory IT-16, 760–769 (1970) [Ho] Hou, X.-D.: Enumeration of certain affine invariant extended cyclic codes. J. Comb. Theory, Series A 110, 71–95 (2005) [Hu] Huffman, W.C.: Codes and groups. In: Pless, V.S., Huffman, W.C., Brualdi, R.A. (eds.) Handbook of coding theory, vol. II, pp. 1345–1440. North-Holland, Amsterdam (1998) [KLP1] Kasami, T., Lin, S., Peterson, W.W.: Some results on cyclic codes which are invariant under the affine group and their applications. Information and Control 11, 475–496 (1967) [KLP2] Kasami, T., Lin, S., Peterson, W.W.: New generalizations of the Reed-Muller codes part I: primitive codes. IEEE Trans. Inform. Theory IT-14, 189–199 (1968) [R] Robinson, D.J.S.: A course in the theory of groups. Springer, Heidelberg (1996)
On Self-dual Codes over Z16 Kiyoshi Nagata , Fidel Nemenzo, and Hideo Wada Department of Business Studies and Informatics, Daito Bunka University, Tokyo, Japan
[email protected] Institute of Mathematics, University of the Philippines, Quezon City, Philippines
[email protected] Faculty of Science and Technology, Sophia University, Tokyo, Japan
[email protected]
Abstract. In this paper, we look at self-dual codes over the ring Z16 of integers modulo 16. From any doubly even self-dual binary code, we construct codes over Z16 and give a necessary and sufficient condition for the self-duality of induced codes. We then give an inductive algorithm for constructing all self-dual codes over Z16 , and establish the mass formula, which counts the number of such codes. Keywords: Self-dual codes, Doubly even codes, Finite rings, Mass formula.
1
Introduction
There has been much interest in codes over the ring Zm of integers modulo m and finite rings in general, since the discovery [5] of a relationship between non-linear binary codes and linear quaternary codes. By applying the Chinese Remainder Theorem [2] to self-dual codes over Zm , it suffices to look at codes over integers modulo prime powers. We consider the problem of finding the mass formula, which counts the number of self-dual codes of given length. In [4], Gaborit gave a mass formula for selfdual quaternary codes. This was generalized in [1] to the ring Zp2 and in [6] to Zp3 for all primes p. In [7], we gave a mass formula over Zps in [7] for any odd prime p and any integer s ≥ 4. In this paper, we will give an inductive algorithm for constructing self-dual codes over Z16 from a given binary code. As a consequence, we obtain a mass formula for such codes. A code of length n over a finite ring R is an R-submodule of Rn . Elements of codes are called codewords. Associated to a code C is a generator matrix, whose rows span C and are linearly independent in binary case. Two codewords as vectors) are orthogonal if x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ) (considered their Euclidean inner product x · y = i xi yi is zero. The dual C ⊥ of a code C over a ring R consists of all vectors of Rn which are orthogonal to every codeword in C. A code C is said to be self-dual (resp. self-orthogonal) if C = C ⊥ (resp. C ⊆ C ⊥ ).
Corresponding author.
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 107–116, 2009. c Springer-Verlag Berlin Heidelberg 2009
108
2
K. Nagata, F. Nemenzo, and H. Wada
Construction of Self-dual Codes over Z16
Every code C of length n over Z16 has a generator matrix which, after a suitable permutation of coordinates, can be written as ⎡ ⎤ ⎡ ⎤ T1 Ik1 A11 A12 A13 A14 ⎢ 2T2 ⎥ ⎢ 0 2Ik2 2A22 2A23 2A24 ⎥ ⎥ ⎢ ⎥ C=⎢ ⎣ 22 T3 ⎦ = ⎣ 0 0 22 Ik3 22 A33 22 A34 ⎦ , 23 T4 0 0 0 23 Ik4 23 A44 where Iki is the ki × ki identity matrix, and the other matrices Aij ’s (1 ≤ i ≤ j ≤ 4) are considered modulo 2j−i+1 . The columns are grouped in blocks of sizes k1 , k2 , k3 , k4 and k5 , with n = k1 + k2 + k3 + k4 + k5 . In [7] we showed that C is self-dual if and only if k1 = k5 , k2 = k4 and Ti Tjt ≡ 0 (mod 26−i−j ) for i and j such that 1 ≤ i ≤ j ≤ 4 and i + j ≤ 5. We showed in [7] that self-dual codes over the ring Zps are induced from codes over the ring Zps−2 , for any odd prime p and integer s ≥ 4. In this paper we look at the case when p = 2 and show that self-dual codes over Z24 can be constructed from a code over Z22 represented by a generator matrix ⎤ ⎡ T1 ⎣ T2 ⎦ (mod 22 ), 2T3 where T1 has some generator vectors t1 , ..., tk1 such that ti · ti ≡ 0 (mod 23 ) for all i = 1, ..., k1 . Such a representation of a quaternary code is called T1 -doubly even. We thus construct self-dual codes over Z24 from doubly even self-dual binary codes via the following constructive scheme diagram ⎤ ⎡ ⎤ ⎡ T1
T1 ⎢ 2T2 ⎥ ⎢ 2 ⎥ (mod 24 ) ← ⎣ T2 ⎦ (mod 22 ) ← T1 (mod 2), ⎣ 2 T3 ⎦ T2 2T 3 23 T4 where Ti ’s are composed of linearly independent binary codewords as row vectors and [Ti ] denotes the code generated by Ti . In the right of the diagram above, the boldface symbols denote doubly even binary codes. The boldface part in the middle of the diagram, T1 , denotes the T1 -doubly even representation. In [4] Gaborit gave the number of doubly even self-dual binary codes of size n and dimension k1 + k2 , and from this we obtain a mass formula for counting the number of T1 -doubly even representations of quaternary codes induced from a given doubly even self-dual binary code. We start with a doubly even self-dual binary code C0 of size n and of dimension k1 + k2 , and divide it into two subspaces of dimension k1 and k2 . The number k 1 −1 2k1 +k2 −i − 1 . The code C0 can be of such divisions is ρ(k1 + k2 , k1 ) = 2k1 −i − 1 i=0 represented as
On Self-dual Codes over Z16
C0 =
(0)
T1 (0) T2
=
109
(0) (0) (0) (0) Ik1 A11 A12 A13 A14 (0) (0) (0) , 0 Ik2 A22 A23 A24
(0)
where the A∗ are considered modulo 2. (0) Though A11 is supposed to be 0 in a usual binary code basis expression, we (0) should remark that some A11 is necessary, since we consider it mod 2 in the lifted quaternary expression. These differences are counted in ρ(k1 + k2 , k1 ). By a suitable permutation of columns, we can assume that the binary square matrix (0) (0) A13 A14 (0) and A14 are invertible. (0) (0) A23 A24 Next we construct a quaternary code C1 , represented by ⎤ ⎤ ⎡ (0) (0) (1) (0) (1) (0) (1) Ik1 A11 A12 + 2A12 A13 + 2A13 A14 + 2A14 T1 ⎢ (0) (0) (1) (0) (1) ⎥ C1 = ⎣ T2 ⎦ = ⎣ 0 Ik2 A22 A23 + 2A23 A24 + 2A24 ⎦ . (0) (0) 2T3 0 0 2Ik3 2A33 2A34 ⎡
(1)
Here we notice that A12 is necessary, since 2Ik3 becomes 22 Ik3 in the extended code over Z24 . There are two conditions for C1 to be self-dual, and the first is
T1 T t ≡ 0 (mod 2). T2 3
(0) (0) t A13 A14 (0) (0) This implies that (mod 2), thus Ik3 ≡ A33 A34 (0) (0) A23 A24 ⎛ −1 ⎞t (0) (0) (0) A13 A14 A12 ⎠ (0) (0) . A33 A34 is uniquely determined as ⎝ (0) (0) (0) A23 A24 A22
(0) (1) (0) T1 T1 T1 T1 Let = (0) + 2 (1) where (0) is a given doubly even self-dual T2 T2 T T2
2
(1) (1) (1) (1) 0 0 A12 A13 A14 T1 is a binary code. binary code and (1) = (1) (1) T2 0 0 0 A23 A24 Then the self-dual quaternary code has a T1 -doubly even representation if (0)
A12 (0) A22
(0)
T1 (0) T2
(0)
T1 (0) T2
t
t
t (0) (1) (1) (1) T1 T1 T1 2 T1 + 2 (0) +2 ≡0 (1) (1) (1) T2 T2 T2 T2
(1)
:= X + X t and the modulus of congruence is 8 only for the diagonal where X entries and 4 otherwise.
110
K. Nagata, F. Nemenzo, and H. Wada (j)
We describe some parts of Ti ’s using binary row vectors such as
(0) A12 (0) A22
(0) A13 (0) A23
(0) A14 (0) A24
⎞ ⎛ ⎞ ∗ a1 a1 ⎟ ⎜ ⎟ ⎜ = ⎝ ... ⎠ = ⎝ ... ... ⎠ , ak ∗ ak ⎛
(1) A12
0
(1) A13 (1) A23
(1) A14 (1) A24
⎞ x1 ⎟ ⎜ = ⎝ ... ⎠ , xk ⎛
where a’s and x’s are of size k3 + k2 + k1 = k3 + k , a ’s are of size k2 + k1 = k . Put xi = (0 xi ) for k1 + 1 ≤ i ≤ k with xi of size k . Dividing the left hand side of congruence (1) by 2, we get (fij ) + (ai · xj + aj · xi ) + 2(xi · xj ),
t (0) (0) T1 T1 with = (2fij ). Thus the self-dual quaternary code C1 has a (0) (0) T2 T2 T1 -doubly even representation if
fij 1 2 fii
≡ a i · xj + a j · xi ≡ ai · xi + xi · xi = (ai + 1) · xi
(i < j) (i = 1, ..., k1 )
where the congruence modulus is 2, and 1 = 1k3 +k denotes the all-1 vector of size k3 + k . The first condition together with the doubly even assumption for C0 is a necessary and sufficient condition for the self-duality of the quaternary code C1 , while the second condition is for the T1 -doubly even presentation of C1 . We now consider the equation in the following matrix representation. To illustrate, we give such matrix representation for the case when k1 = 2 and k2 = 1. We have ⎛ ⎛ ⎞ ⎞ a1 + 1 g1 ⎛ ⎞ ⎜ ⎜g ⎟ ⎟ xt1 a2 + 1 ⎜ ⎟⎝ t ⎠ ⎜ 2 ⎟ a a ⎜ ⎟ x2 ≡ ⎜ f12 ⎟ , 2 1 ⎝ a ⎝f ⎠ a1 ⎠ xt 3 13 3 a3 a2 f23 where gi := 12 fii (i = 1, 2) and the modulus of congruence is 2. In general, if M is the coefficient matrix in the equation, then the number of row vectors in M is 12 k (k − 1) + k1 with the size of each row equal to (k + k3 ) × k1 + k × k2 = k 2 + k1 k3 . We compute the rank of M . From the definition, the ranks of (ai ) and (ai ) are both equal to k . Any linearly dependent equation of row vectors in M must not contain any ai ’s and must contain at least one ai + 1. Thus we can assume that ai1 + · · · + aim ≡ 1 for some 1 ≤ i1 < · · · < im ≤ k1 .
(mod 2)
On Self-dual Codes over Z16
111
Going back to the starting code C0 , we write it as ⎞ ⎛ ⎞ ⎛ (0) e1 a1 a1 a1 (0) ⎜ .. .. .. ⎟ .. ⎟ ⎜ ⎟ ⎜ Ik1 A11 ⎟ ⎜ . . . ⎟ . ⎜ ⎜ ⎟ ⎜ ⎟ (0) ⎜ ⎟ ak1 ⎟ ⎜ ek1 ak ak1 ⎟ 1 =⎜ C0 = ⎜ ⎟. ⎜ ⎟ ak1 +1 ⎟ ⎜ 0 e1 ak1 +1 ⎟ ⎜ ⎜ ⎟ ⎜ ⎟ .. ⎜ . .. .. ⎟ ⎝ 0 Ik2 . ⎠ ⎝ .. . . ⎠ ak 0 ek2 ak The self-duality of this code implies that for any j (1 ≤ j ≤ k2 ) (0)
(0)
(0)
(0)
0 ≡ ((ai1 ai1 ) + · · · + (aim aim )) · (ej ak1 +j ) ≡ (ai1 + · · · + aim ) · ej + 1 (0)
(0)
using the equation 0 ≡ (ej ak1 +j ) · (ej index j ranges from 1 to k2 , we have (0)
ak1 +j ) = 1 + ak1 +j · ak1 +j . As the
(0)
ai1 + · · · + aim ≡ 1k2 , and (0)
(0)
(ai1 ai1 ) + · · · + (aim ais ) ≡ 1k2 +(k3 +k ) . Again from the self-duality of C0 , we have for any j (1 ≤ j ≤ k1 ) (0)
(0)
(0)
0 ≡ ((ei1 ai1 ai1 ) + · · · + (eim aim aim )) · (ej aj ≡ (ei1 + · · · + eim ) · ej + 1. (0)
(0)
For the second congruence, we used 0 ≡ (ej aj aj )·(ej aj (0) (aj
aj ) (0)
aj ) = 1+(aj
aj )·
aj ). As the index j ranges from 1 to k1 , we have ei1 + · · · + eim ≡ 1k1 .
Therefore we see that {i1 , ..., im } = {1, ..., k1 } and there is one possible linearly dependent equation in M only when (0)
(e1 a1
(0)
a1 ) + · · · + (ek1 ak1 ak1 ) = 1n ,
which is equivalent to 1n ∈ T1 . So the rank of M is 12 k (k − 1) + k1 − with (0) 1 if 1 ∈ T1 = (2) 0 otherwise. (0)
In the case when = 1, the doubly even quaternary code T1 contains 1 + 2x and 0 ≡ (1 + 2x) · (1 + 2x) = n + 4(1 · x + x · x) = n + 8x · x ≡ n
(mod 8).
We need to check whether the simultaneous congruent equations represented in the matrix have a solution or not. We do this by computing the sum of
112
K. Nagata, F. Nemenzo, and H. Wada
constants in the right-hand side corresponding to the rows in M which contain, for i = 1, ..., k1 , either ai + 1 or ai : k1
k k1 1 −1 gi + fij i=1 i=1 j=i+1 ⎛ k1 1 (0) (0) ≡ ⎝ (ei ai ai ) · (ei ai ai ) + 2 4 i=1
≡
(0)
ai
(0)
· aj + 2
1≤i<j≤k1
k1 k1 1 1 1 (0) (0) (ei ai ai ) · (ei ai ai ) ≡ 1n · 1n ≡ n 4 i=1 4 4 i=1
⎞
ai · aj ⎠
1≤i<j≤k1
≡ 0 (mod 2).
We now have the following proposition. Proposition 2.1. Let C0 be a doubly even self-dual binary code with the following representation
(0) (0) (0) (0) (0) T1 Ik1 A11 A12 A13 A14 C0 = (0) = (0) (0) (0) . T2 0 Ik2 A22 A23 A24 If λ(k1 , k2 , k3 ) denotes the number of T1 -doubly even self-dual quaternary codes with representation ⎤ ⎤ ⎡ ⎡ (0) (0) (1) (0) (1) (0) (1) Ik1 A11 A12 + 2A12 A13 + 2A13 A14 + 2A14 T1 ⎢ (0) (0) (1) (0) (1) ⎥ (3) C1 = ⎣ T2 ⎦ = ⎣ 0 Ik2 A22 A23 + 2A23 A24 + 2A24 ⎦ (0) (0) 2T3 0 0 2Ik3 2A33 2A34 then λ(k1 , k2 , k3 ) =
(0)
0 if 1 ∈ T1 1 2 2 k (k +1)−k1 +k1 k3 + otherwise
and 8 |n
,
where is defined as in (2). (0)
Proof. If 1 ∈ T1 and 8 |n then there are no solutions to the matrix congruent equation. Otherwise, we are free to choose binary values for the entries in the solution column (xti ). If w denotes the size of matrix M minus the rank of M , then w = (k 2 +k1 k3 )−( 12 k (k −1)+k1 −) = 12 k (k +1)−k1 +k1 k3 +. Once a T1 -doubly even self-dual quaternary code described in (3) is given, we can construct all the self-dual codes C = [2i Ti ]i=1,...,4 over Z24 , where T1 = (1) (2) (2) (3) (1) (4−i) T1 +22 (0 0 0 A13 A14 )+23 (0 0 0 0 A14 ) Ti = Ti +24−i (0 0 0 0 Ai4 ) for i = 2 (1) and 3, T4 = (0 0 0 Ik1 A44 ), the A∗ ’s are binary matrices and the T∗ ’s represent (0)−1 (0) (1)t T∗ ’s in (3). From the self-duality of C, we have At44 = A14 A13 , A34 = (0)−1 (1) (1)t (2)t (0)−1 (1) (1)t (2) (0)t (2) (0)t A14 (2−1 T1 T3 ) and A24 = A14 (2−2 T1 T2 + A13 A23 + A14 A24 ).
On Self-dual Codes over Z16 (2)
113
(2)
In the third equation A13 and A14 appear but these have not yet been given. (3) We will consider the condition for them and for A14 such as 1 (1) (1)t (2)t (2)t (0) (3)t T T + A 13 A13 + A 14 A14 + 2A14 A14 ≡ 0 22 1 1
mod 22 ,
where A 13 = A13 + 2A13 and A 14 = A14 + 2A14 . (1) (1)t (0) (2)t (2) Now we put (dij ) = 212 T1 T1 + A13 A13 for any k1 × k2 matrix A13 and (0) (2)t (2)t (0)−1 (xij ) = A14 A14 . Then A14 is uniquely determined by A14 (xij ) such that 1 xji ≡ dij − xij mod 2 for i < j and xii ≡ 2 dii (mod 2) . For i > j, the number of free binary values for xij is 12 k1 (k1 − 1). (1) (1)t (2)t (2)t Next we put (fij ) ≡ 12 212 T1 T1 + A 13 A13 + A 14 A14 mod 2 and (0)
(0)
(1)
(3)t
(0)
(1)
(3)t
(0)−1
(yij ) = A14 A14 . Then A14 is uniquely determined by A14 (yij ) where yji = fij − yij for i < j and 12 k1 (k1 + 1) is the number of freely chosen binary values for yij (i ≥ j). Therefore we have the following proposition. Proposition 2.2. For a fixed T1 -doubly even self-dual quaternary code described in (3), the number of induced codes over Z24 is 2k1 (k1 +k2 ) .
3
Mass Formula for Self-dual Code over Z16
Now we establish the mass formula. Theorem 3.1. Let N16 (n) denote the number of distinct self-dual codes over Z16 of length n. 1. If n is odd, then n−1
N16 (n) =
2
2
1 2 k (k +1)
−1 k
i=0
k =0
where
δ=
0 1
n−1
2n−2i−2 + (−1)δ 2 2 2i+1 − 1
n ≡ 1, 7 n ≡ 3, 5
−i−1
−1
(mod 8).
2. If n ≡ 2, 6 (mod 8), then n 2 −1
N16 (n) =
k =0
2
1 2 k (k +1)
−1 k
i=0
2n−2i−2 − 1 ρ(k ). 2i+1 − 1
ρ(k ),
114
K. Nagata, F. Nemenzo, and H. Wada
3. If n ≡ 4 (mod 8), then n 2 −2
N16 (n) =
2
1 2 k (k +1)
−1 k
i=0
k =0
n
2n−2i−2 − 2 2 −i−1 − 2 ρ(k ). 2i+1 − 1
4. If n ≡ 0 (mod 8), then n 2 −1
N16 (n) =
2
1 2 k (k +1)
k =0
n
2
+
2
−1 k
n
2n−2i−2 + 2 2 −i−1 − 2 ρ(k ) 2i+1 − 1
i=0 −2 k 1 2 k (k −1)
k =0
i=0
n
2n−2i−2 + 2 2 −i−1 − 2 ρ (k ). 2i+1 − 1
Here
ρ(k ) =
k
2
(n−k −1)k1
k1 =0
with ρ(k , k1 ) =
k 1 −1 i=0
ρ(k , k1 ), and ρ (k ) =
k k1 =0
2(n−k −1)k1
2k1 − 1 ρ(k , k1 ), 2k − 1
2k −i − 1 . 2k1 −i − 1
In the formulas above, we take the value of a product to be 1 whenever the starting index is greater than the limit. Proof. First note that the length n is a multiple of 4 whenever a doubly even binary code containing 1n exists. The number σ1 (n, k) of distinct doubly even self-orthogonal binary codes of length n and dimension k containing 1n as well as the number σ2 (n, k) of those which do not contain 1n are explicitly given in the proof of the Theorem 4.1 in [6]. We note that σ2 (n, n2 ) = 0 for any even n and σ2 (n, n2 − 1) = 0 for n ≡ 4 (mod 8). For each binary code C0 of size n and the dimension k , the number of partitions [Ti ] is ρ(k , k1 ) if 1 ∈ T1 and ρ(k − 1, k1 − 1) otherwise. Over each binary code with a given partition, there exist λ(k1 , k2 , k3 ) T1 doubly even representations for quaternary code C1 . And from each such code, we can construct 2k1 k distinct self-dual codes (Proposition 2.2). Thus if 8|n, the number of distinct self-dual codes of type (k1 , k2 , k3 ) as described in the left part of the scheme diagram in Section 2 is
(σ2 (n, k )ρ(k , k1 ) + 2σ1 (n, k )ρ(k − 1, k1 − 1)) λ=0 (k1 , k2 , k3 ) × 2k1 k .
Otherwise, the number of such codes is σ2 (n, k )ρ(k , k1 )λ=0 (k1 , k2 , k3 ) × 2k1 k . This is because σ1 (n, k )λ=0 (k1 , k2 , k3 ) = 0 if 8 |n, and λ=1 (k1 , k2 , k3 ) = 2λ=0 (k1 , k2 , k3 ) if it is not 0. Except for the case of λ = 0, λ=0 (k1 , k2 , k3 ) = 1 2 2 k (k +1)−k1 +k1 k3 from Proposition 2.1. Then the calculation −k1 + k1 k3 + k1 k = (k3 + k − l)k1 = (n − k − 1)k1 gives us the mass formulas.
On Self-dual Codes over Z16
4
115
Example for n = 8, k1 = 4, and k2 = k3 = 0 ⎛
⎞ 0111 ⎜1 0 1 1⎟ (0) (0) ⎟ Let C0 = I4 A14 with A14 = ⎜ ⎝ 1 1 0 1 ⎠ . Then C0 is a doubly even self-dual 1110 binary code with d(C0 ) = 4 and C0 18 . (0) (1) (1) We then have C1 = I4 A14 + 2A14 with A14 equal to ⎛ ⎞ 1 α1 + α3 + α4 + 1 α1 + α3 + α4 + α6 + α7 α1 ⎜ α2 + α3 + α4 + α5 + α6 1 α2 + α3 + α4 + α6 + α7 + 1 α2 ⎟ ⎜ ⎟, ⎝ α3 + α5 + α6 + 1 α3 1 α4 ⎠ α5 α6 α7 1 where α∗ can be any binary value. For instance, setting⎛α1 = α2 ⎞ = α4 = α5 = α6 = α7 = 0 and α3 = 1, we have 2131 ⎜3 2 1 1⎟ (0) (1) ⎟ C1 with A14 + 2A14 = ⎜ ⎝ 1 3 2 1 ⎠, a doubly even self-dual quaternary code with 1112 Lee distance dL (C1 ) =⎛ 6. ⎞ 4332 ⎜3 4 3 2⎟ ⎟ Computing (dij ) = ⎜ ⎝ 3 3 4 2 ⎠, we have 2222 ⎞ ⎛ 0 x12 x13 x14 0 x23 x24 ⎟ (2)t (0)−1 ⎜ x12 ⎟, where x = 1 − x. A14 = A14 ⎜ ⎝ x13 x23 0 x34 ⎠ 24 −x34 1 ⎛−x14 −x⎞ 0001 ⎜0 0 0 1⎟ ⎟ From (fij ) ≡ ⎜ ⎝ 0 0 0 1 ⎠, we obtain the induced code C over Z16 : 1110 (0) (1) (2) (3) C = I4 A14 + 2A14 + 22 A14 + 23 A14 ⎛
(3)t
with A14
y11 y12 (0)−1 ⎜ = A14 ⎜ ⎝ y13 y14
y12 y22 y23 y24
y13 y23 y33 y34
⎞ y14 y24 ⎟ ⎟. y34 ⎠ y44
References 1. Balmaceda, J., Betty, R., Nemenzo, F.: Mass formula for self-dual codes over Zp2 . Discrete Mathematics 308, 2984–3002 (2008) 2. Dougherty, S., Harada, M., Sol´e, P.: Self-dual codes over rings and the Chinese remainder theorem. Hokkaido Math. J. 28, 253–283 (1999)
116
K. Nagata, F. Nemenzo, and H. Wada
3. Fields, J., Gaborit, P., Leon, J., Pless, V.: All self-dual Z4 codes of length 15 or less are known. IEEE Trans. Inform. Theory 44, 311–322 (1998) 4. Gaborit, P.: Mass formulas for self-dual codes over Z4 and Fq + uFq rings. IEEE Trans. Inform. Theory 42, 1222–1228 (1996) 5. Hammons, A., Kumar, P., Calderbank, A., Sloane, N., Sol´e, P.: The Z4 linearity of Kerdock, Preparata, Goethals and related codes. IEEE Trans. Inform. Theory 40, 301–319 (1994) 6. Nagata, K., Nemenzo, F., Wada, H.: The number of self-dual codes over Zp3 . Designs, Codes and Cryptography 50, 291–303 (2009) 7. Nagata, K., Nemenzo, F., Wada, H.: Constructive algorithm of self-dual errorcorrecting codes. In: Proc. of 11th International Workshop on Algebraic and Combinatorial Coding Theory, pp. 215–220 (2008) ISSN 1313-423X
A Non-abelian Group Based on Block Upper Triangular Matrices with Cryptographic Applications ´ Rafael Alvarez, Leandro Tortosa, Jos´e Vicent, and Antonio Zamora Departamento de Ciencia de la Computaci´ on e Inteligencia Artificial, Universidad de Alicante, Campus de San Vicente, Ap. Correos 99, E–03080, Alicante, Spain
[email protected],
[email protected],
[email protected],
[email protected]
Abstract. The aim of this article is twofold. In the first place, we describe a special non-abelian group of block upper triangular matrices, and verify that choosing properly certain parameters, the order of the subgroup generated by one of these matrices can be as large as needed. Secondly, we propose and implement a new key exchange scheme based on these primitives. The security of the proposed system is based on discrete logarithm problem although a non-abelian group of matrices is used. The primary advantadge of this scheme is that no prime numbers are used and the efficiency is guaranteed by the use of a quick exponentiation algorithm for this group of matrices. Keywords: Polynomial matrices, Block matrices, Quick exponentiation, Cryptography, Public Key.
1
Introduction
A lot of popular public-key encryption systems are based on number theory problems such as factoring integers or finding discrete logarithms. The underlying algebraic structures are, very often, abelian groups, as we can see in [3,4,16]; this is especially true in the case of the Diffie-Hellman method (DH), that was the first practical public key technique and introduced in 1976 (see [8]). In Crypto 2000 [12], Ko et al. proposed a new public cryptosystem based on Braid groups, which are non abelian groups, and it was the first practical public key cryptosystem based on non abelian groups. The Discrete Logarithm Problem (DLP, see [7, 15, 20]) is, together with the Integer Factoring Problem (IFP, see [8]), one of the main problems upon which public-key cryptosystems are built. Thus, efficiently computable groups where the DLP is hard to break are very important in cryptography. The purpose of this proposal is the analysis and implementation of a key exchange scheme based on a special non-abelian group of block upper triangular matrices. The main idea of this paper is to study the cryptographic behavior M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 117–126, 2009. c Springer-Verlag Berlin Heidelberg 2009
´ R. Alvarez et al.
118
of products of the type M1v M2w , with v, w integers and M1 , M2 elements of the group of matrices previously mentioned. The paper is organized as follows: in Section 2 we present some properties concerning block upper triangular matrices defined over Zp . We will show that they can generate sets of large and known order if parameters are chosen adequately; due to this we study the order of the subgroup generated by a special matrix M ; in Section 3 we describe the design of a key exchange scheme based on the non-abelian group of triangular matrices, for this, we use a especial quick exponentiation method; finally, some conclusions are given.
2
Preliminaries
Given p a prime number and r, s ∈ N ,we denote by Matr×s (Zp ) the matrices of size r × s with elements in Zp , and by GLr (Zp ) and GLs (Zp ), the invertible matrices of size r × r and s × s respectively. We define AX Θ= , A ∈ GLr (Zp ), B ∈ GLs (Zp ), X ∈ Matr×s (Zp ) OB Theorem 1. The set Θ has a structure of non-abelian group for the product of matrices. The theorem that follows gives us a method for calculating the powers of this kind of matrices (for details see [6]). AX Theorem 2. Let M = be an element of the set Θ, we consider the OB subgroup generated by the different powers of M . Taking h as a non negative integer then h (h) A X h M = , (1) O Bh where X
(h)
=
0 h
h−i XB i−1 i=1 A
if h = 0 if h ≥ 1
(2)
Also, if 0 ≤ t ≤ h then X (h) = At X (h−t) + X (t) B h−t
(3)
X (h) = A(h−t) X (h) + X (h−t) B t .
(4)
As a consequence, in the case t = 1 we have X (h) = AX (h−1) + XB h−1 or X (h) = Ah−1 X + X (h−1) B
A Non-abelian Group Based on Block Upper Triangular Matrices
119
And, taking a, b non negative integers such as a + b ≥ 0, we have X (a+b) = Aa X (b) + X (a) B b
(5)
It is very important to achieve a high order of the group. With this aim, the following construction (see [11, 13]) guarantees a certain order. Let f (x) = a0 + a1 x + · · ·+ an−1 xn−1 + xn a monic polynomial in Zp [x], whose companion n × n matrix is ⎤ ⎡ 0 1 0 ... 0 0 ⎢ 0 0 1 ... 0 0 ⎥ ⎥ ⎢ ⎢ .. .. .. . . .. .. ⎥ ⎢ . . . . . . ⎥ A=⎢ ⎥ . ⎢ 0 0 0 ... 1 0 ⎥ ⎥ ⎢ ⎣ 0 0 0 ... 0 1 ⎦ −a0 −a1 −a2 . . . −an−2 −an−1 If f is an irreducible polynomial in Zp [x], then the order of the matrix A is equal to the order of any root of f in Fpn and the order of A divides pn − 1 (see [14]). Moreover, assuming that f is a primitive polynomial in Zp [x], the order of A is exactly pn − 1. Odoni, Varadharajan and Sanders [19] propose an extended scheme based on the construction of block matrices like this ⎤ ⎡ A1 0 . . . 0 ⎢ 0 A2 . . . 0 ⎥ ⎥ ⎢ A=⎢ . . . ⎥ , ⎣ .. .. . . . .. ⎦ 0 0 . . . Ak where Ai is the companion matrix of fi , and fi , for i = 1, 2, . . . , k, are different primitive polynomials in Zp [x] of degree ni , for i = 1, 2, . . . , k, respectively. The order of Ai is pni − 1, for i = 1, 2, . . . , k. Therefore, the order of A is lcm(pn1 − 1, pn2 − 1, . . . , pnk − 1). With the aim of using this type of matrix in a public key cryptosystem, Odoni, Varadharajan and Sanders conjugate this matrix A with an invertible matrix P of size n × n, with n = n1 + n2 + . . . + nk , obtaining a new matrix A = P AP −1 that has the same order as A. If we construct the blocks A and B of M ∈ Θ using primitive polynomials, we can guarantee a very high order. Let f (x) = a0 + a1 x+ · · ·+ ar−1 xr−1 + xr , g(x) = b0 + b1 x+ · · ·+ bs−1 xs−1 + xs be two primitive polynomials in Zp [x] and A, B the corresponding companion matrices. Let P and Q be two invertible matrices, such that A = P AP −1 and B = QBQ−1 . With this construction, the order of M is lcm(pr − 1, ps − 1). Using cyclotomic fields theory we know that the polynomial xn − 1 is divided by xd − 1 if d|n; therefore, if we choose r and s such that they are relatively prime, the number of common divisors is diminished and the lcm(pr − 1, ps − 1) will be maximum. Table 1 shows a comparison of the variation of the order of M depending on the value of p, r and s, where r and s are the sizes of blocks A and B respectively.
120
´ R. Alvarez et al. Table 1. Order of M depending of r, s and p r
s
2 3 3 4 5 31 47 60 130 216
3 4 5 5 6 32 48 61 131 217
p Order of M 2200 2200 2200 2200 2200 29 29 29 29 29
2800 21200 21400 21600 22000 2302 2457 2584 21264 22099
Given the excellent cryptographic properties of block upper triangular matrices, we will see an aplication of this set to public key cryptography.
3
Cryptographic Application: A Key Exchange Scheme
The DLP ( [17]) for a set G is finding, given α, β ∈ G, a nonnegative integer x (if it exists) such that β = αx . The smallest such integer x is called the discrete logarithm of β to the base α, and is written x = logα β. Clearly, the discrete logarithm problem for a general group G is exactly the problem of inverting the exponentiation function (see [18] for more information). This is especially true for the Diffie-Hellman method ( [8]), that was the first practical public key technique to be published. The security of this method for key exchanges, is based on the discrete logarithm problem, and it uses a prime number p and a primitive element r ∈ Zp . Privacy or security of messages is not the only problem area in cryptology. It is also important that user identity can be authenticated. Digital signature is a property of asymmetric cryptography, that allows authentication. It consists of two processes: signing a message and verifying a message signature; and it must depend on the message to be signed. The method presented in this section uses a non-abelian group based on the powers of a block upper triangular matrix, which is a very flexible technique. A1 X1 A2 X2 and M2 = be two elements of the set Θ with Let M1 = 0 B1 0 B2 orders m1 and m2 respectively. We define the following notation for a pair of numbers x, y ∈ N: Axy = Ax1 Ay2 , Bxy = B1x B2y and
(y)
(x)
Cxy = Ax1 X2 + X1 B2y .
A Non-abelian Group Based on Block Upper Triangular Matrices
121
If two users U and V wish to exchange a key, they may execute the following steps: 1. U and V agree on p ∈ N and M1 , M2 ∈ Θ, with m1 , m2 the orders of M1 and M2 , respectively. 2. U generates two random private keys r, s ∈ N such that 1 ≤ r ≤ m1 − 1, 1 ≤ s ≤ m2 − 1, and computes Ars , Brs , Crs constructing Ars Crs C= . 0 Brs 3. U sends C to V . 4. V generates two random private keys v, w ∈ N such that 1 ≤ v ≤ m1 − 1, 1 ≤ w ≤ m2 − 1, and computes Avw , Bvw , Cvw constructing Avw Cvw D= . 0 Bvw 5. V sends D to U . 6. The public keys of U and V are respectively the matrices C and D. 7. U computes (s)
(r)
Ku = Ar1 Avw X2 + Ar1 Cvw B2s + X1 Bvw B2s . 8. V computes (w)
Kv = Av1 Ars X2
(v)
+ Av1 Crs B2w + X1 Brs B2w .
The following theorem shows that Ku = Kv . (s)
(r)
Theorem 3. If Ku = Ar1 Avw X2 + Ar1 Cvw B2s + X1 Bvw B2s (w) (v) and Kv = Av1 Ars X2 + Av1 Crs B2w + X1 Brs B2w , then Ku = Kv . Proof 1. We have Ars Crs Avw Cvw r s C= = M1 M2 , D = = M1v M2w , 0 Brs 0 Bvw M1r =
M2s
(r)
Ar1 X1 0 B1r (s)
As2 X2 = 0 B2s
, M1v =
(v)
Av1 X1 0 B1v
and
M2w
,
(w)
Aw 2 X2 = 0 B2w
.
´ R. Alvarez et al.
122
Let Mu =
M1r DM2s
Av Kv . 0 Bv
Au Ku = 0 Bu
and Mv = M1v CM2w = Then
Mu = M1r DM2s = M1r M1v M2w M2s = M1v M1r M2s M2w = M1v CM2w = Mv and, consequently, Ku = Kv . As we have demonstrated in this theorem, now both U and V share a common and secret key, Ku = Kv = P. The private keys are r, s and v, w, respectively. These keys do not have to be prime numbers (we avoid primality tests). In our key exchange scheme based on block upper triangular matrices, the usage of big powers of matrices is required (see [1]); so the implementation of an efficient and trustworthy quick exponentiation algorithm (see [2, 9, 10, 21]) becomes necessary for the accomplishment of this task. Given n ∈ N, then a ordered set of indices exist I = {i1 , i2 , i3 , i4 , . . . , iq } so that n = 2i1 + 2i2 + 2i3 + . . . + 2iq . In order to compute the powers of A and B, i1 +2i2 +...+2iq
An = A2
i1 +2i2 +...+2iq
Bn = B2
i1
i2
iq
i1
i2
iq
= A2 A2 . . . A2
= B2 B2 . . . B2 ,
we use 0 A2 = A = A0 21 A = AA = A1 2 A2 = A2 A2 = A2 ...... e e−1 e−1 = Ae A2 = A2 A2 That is to say, computing big powers of the blocks A, B of matrices M , is reduced to multiplying matrices quickly. So, for block X we have the following theorem. Theorem 4. Given an integer number n, whose binary decomposition is n=
q
j=1
2ij ,
A Non-abelian Group Based on Block Upper Triangular Matrices
123
and a set of indices I = {i_1, i_2, i_3, i_4, ..., i_q}, we have:

X^{(n)} = \sum_{k=1}^{q} A^{n_1^{(k)}} X^{(n_2^{(k)})} B^{n_3^{(k)}}     (6)

where

n_1^{(k)} = \sum_{j=1}^{q-k} 2^{i_j}  for k = 1, 2, 3, ..., q − 1,  and  n_1^{(q)} = 0;

n_2^{(k)} = 2^{i_{q-k+1}}  for k = 1, 2, 3, ..., q;

n_3^{(k)} = \sum_{j=q-k+2}^{q} 2^{i_j}  for k = 2, 3, ..., q,  and  n_3^{(1)} = 0.

In Alvarez et al. (see [2]) we prove this theorem and show some examples of its use.
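A compact sketch of this block-wise fast exponentiation in Python is given below. It keeps only the blocks A, X, B, uses the relation X^{(a+b)} = A^a X^{(b)} + X^{(a)} B^b that underlies Theorem 4, and consumes the binary decomposition of n exactly as above; all function and variable names are ours, and the small consistency check against naive multiplication is only illustrative.

    import random

    def mat_mul(A, B, p):
        rows, inner, cols = len(A), len(B), len(B[0])
        return [[sum(A[i][k] * B[k][j] for k in range(inner)) % p for j in range(cols)]
                for i in range(rows)]

    def mat_add(A, B, p):
        return [[(x + y) % p for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

    def identity(n):
        return [[int(i == j) for j in range(n)] for i in range(n)]

    def block_power(A, X, B, n, p):
        # returns (A^n, X^(n), B^n) so that M^n = [[A^n, X^(n)], [0, B^n]],
        # using only squarings A_e = A^(2^e), B_e = B^(2^e), X_e = X^(2^e)
        r, s = len(A), len(B)
        An, Xn, Bn = identity(r), [[0] * s for _ in range(r)], identity(s)  # M^0
        Ae, Xe, Be = A, X, B                                                # M^(2^0)
        while n:
            if n & 1:
                Xn = mat_add(mat_mul(An, Xe, p), mat_mul(Xn, Be, p), p)     # X block first
                An = mat_mul(An, Ae, p)
                Bn = mat_mul(Bn, Be, p)
            Xe = mat_add(mat_mul(Ae, Xe, p), mat_mul(Xe, Be, p), p)         # square the blocks
            Ae = mat_mul(Ae, Ae, p)
            Be = mat_mul(Be, Be, p)
            n >>= 1
        return An, Xn, Bn

    # quick consistency check against naive multiplication of the full matrix
    p, r, s, n = 97, 2, 3, 12345
    A = [[random.randrange(p) for _ in range(r)] for _ in range(r)]
    B = [[random.randrange(p) for _ in range(s)] for _ in range(s)]
    X = [[random.randrange(p) for _ in range(s)] for _ in range(r)]
    M = [ra + rx for ra, rx in zip(A, X)] + [[0] * r + rb for rb in B]
    R = identity(r + s)
    for _ in range(n):
        R = mat_mul(R, M, p)
    An, Xn, Bn = block_power(A, X, B, n, p)
    assert [row[r:] for row in R[:r]] == Xn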
4
Security Analysis
Brute force attacks are infeasible if a sufficiently large order is chosen for M_1 and M_2, for example 1024 bits. Likewise, we use large values (about 1024 bits) for the private keys r, s, v, w, which lets us avoid meet-in-the-middle attacks. A widely used algorithm for the cryptanalysis of public key schemes based on matrix powers is due to Menezes and Wu (see [18]). Basically, it establishes the possibility of reducing the full discrete logarithm problem to a series of smaller discrete logarithms over finite fields. This algorithm is not viable for the presented scheme since no matrix powers are published. Another technique for effectively cryptanalyzing some schemes based on block upper triangular matrices has been developed by Climent, Gorla and Rosenthal [5] and is based on the Cayley-Hamilton theorem.

Theorem 5 (Cayley-Hamilton Theorem). Let M ∈ GL_n(Z_p) be a matrix with characteristic polynomial

q_M(λ) = det(λ I_n − M) = a_0 + a_1 λ + a_2 λ^2 + ... + a_{n−1} λ^{n−1} + λ^n.

Then

q_M(M) = a_0 I_n + a_1 M + a_2 M^2 + ... + a_{n−1} M^{n−1} + M^n = 0_n,

where I_n is the identity matrix of size n and 0_n the null matrix of the same size.

This attack is not viable either, since two different matrices with different characteristic equations are employed. Let us analyze why this type of attack is inefficient here.
Consider

M_1 = \begin{pmatrix} A_1 & X_1 \\ 0 & B_1 \end{pmatrix} ∈ Θ,   M_2 = \begin{pmatrix} A_2 & X_2 \\ 0 & B_2 \end{pmatrix} ∈ Θ,

two matrices of size n = r + s, with characteristic equations det(λI − M_1) = 0 and det(λI − M_2) = 0. Then

q_{M_1}(λ) = det \begin{pmatrix} λI − A_1 & −X_1 \\ 0 & λI − B_1 \end{pmatrix} = det(λI − A_1) · det(λI − B_1) = q_{A_1}(λ) · q_{B_1}(λ) = a_0 + a_1 λ + a_2 λ^2 + ... + a_{n−1} λ^{n−1} + λ^n,

q_{M_2}(λ) = det \begin{pmatrix} λI − A_2 & −X_2 \\ 0 & λI − B_2 \end{pmatrix} = det(λI − A_2) · det(λI − B_2) = q_{A_2}(λ) · q_{B_2}(λ) = b_0 + b_1 λ + b_2 λ^2 + ... + b_{n−1} λ^{n−1} + λ^n,

with q_{M_1}(λ) ≠ q_{M_2}(λ). The Cayley-Hamilton theorem guarantees that q_{M_1}(M_1) = q_{M_2}(M_2) = 0, so

a_0 I + a_1 M_1 + a_2 M_1^2 + ... + a_{n−1} M_1^{n−1} + M_1^n = 0,
a_0 I + a_1 M_1 + a_2 M_1^2 + ... + a_{n−1} M_1^{n−1} = −M_1^n;

multiplying by M_1,

a_0 M_1 + a_1 M_1^2 + a_2 M_1^3 + ... + a_{n−1} M_1^n = −M_1^{n+1}.

Replacing the value of M_1^n,

a_0 M_1 + a_1 M_1^2 + a_2 M_1^3 + ... + a_{n−1}(−a_0 I − a_1 M_1 − a_2 M_1^2 − ... − a_{n−1} M_1^{n−1}) = −M_1^{n+1},

and grouping terms we obtain

M_1^{n+1} = b_0 I + b_1 M_1 + b_2 M_1^2 + ... + b_{n−1} M_1^{n−1},

with b_0 = a_0 a_{n−1} and b_i = a_{n−1} a_i − a_{i−1} for i = 1, ..., n − 1. Following this process, for a certain p ≥ n we have

M_1^p = c_0 I + c_1 M_1 + c_2 M_1^2 + ... + c_{n−1} M_1^{n−1},     (7)
then

M_1^p = \begin{pmatrix} A_1^p & X_1^{(p)} \\ 0 & B_1^p \end{pmatrix} = c_0 \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} + c_1 \begin{pmatrix} A_1 & X_1 \\ 0 & B_1 \end{pmatrix} + ... + c_{n−1} \begin{pmatrix} A_1^{n−1} & X_1^{(n−1)} \\ 0 & B_1^{n−1} \end{pmatrix}.

Consequently,

A_1^p = c_0 I + c_1 A_1 + c_2 A_1^2 + ... + c_{n−1} A_1^{n−1},
X_1^{(p)} = c_1 X_1 + c_2 X_1^{(2)} + ... + c_{n−1} X_1^{(n−1)},
B_1^p = c_0 I + c_1 B_1 + c_2 B_1^2 + ... + c_{n−1} B_1^{n−1}.

If we proceed as in expression (7), we obtain

M_2^p = d_0 I + d_1 M_2 + d_2 M_2^2 + ... + d_{n−1} M_2^{n−1}.

In the scheme that we are analyzing, we know M_1 and M_2, so we can set up the following linear system:

M_1^r = e_1 M_1 + e_2 M_1^2 + ... + e_{n−1} M_1^{n−1},
M_2^v = f_1 M_2 + f_2 M_2^2 + ... + f_{n−1} M_2^{n−1}.

Since r and s are private keys, the coefficients e_1, e_2, ..., e_{n−1} and f_1, f_2, ..., f_{n−1}, as well as the matrices M_1^r and M_2^v, are unknown, rendering the system unsolvable. Therefore, the Climent, Gorla and Rosenthal technique (see [5]), based on the Cayley-Hamilton theorem, cannot be used to obtain a system suitable for cryptanalysis. In order to cryptanalyze this scheme, we must apply square root algorithms to the calculation of discrete logarithms. We can choose the blocks A_1, A_2, B_1 and B_2 so that the inefficiency of this type of attack is guaranteed.
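The reduction that such an attack would rely on can itself be illustrated: for a matrix whose characteristic polynomial q is known, the coefficients c_0, ..., c_{n−1} of expression (7) are simply the coefficients of x^p mod q(x). The sketch below checks this on a companion matrix over Z_p, so that its characteristic polynomial is the chosen q by construction; all helper names are ours, and nothing here depends on the block structure of Θ.

    def mat_mul(A, B, p):
        n = len(A)
        return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
                for i in range(n)]

    def mat_pow(M, e, p):
        n = len(M)
        R = [[int(i == j) for j in range(n)] for i in range(n)]
        while e:
            if e & 1:
                R = mat_mul(R, M, p)
            M = mat_mul(M, M, p)
            e >>= 1
        return R

    def xpow_mod(e, q, p):
        # coefficients c[0..n-1] of x^e mod q(x) over Z_p, with q monic of degree n
        n = len(q) - 1
        def mul(a, b):
            prod = [0] * (2 * n - 1)
            for i, ai in enumerate(a):
                if ai:
                    for j, bj in enumerate(b):
                        prod[i + j] = (prod[i + j] + ai * bj) % p
            for d in range(2 * n - 2, n - 1, -1):        # reduce with x^n = -sum q_k x^k
                if prod[d]:
                    c = prod[d]
                    prod[d] = 0
                    for k in range(n):
                        prod[d - n + k] = (prod[d - n + k] - c * q[k]) % p
            return prod[:n]
        res, base = [1] + [0] * (n - 1), [0, 1] + [0] * (n - 2)
        while e:
            if e & 1:
                res = mul(res, base)
            base = mul(base, base)
            e >>= 1
        return res

    def companion(q, p):
        # companion matrix of the monic polynomial q; its characteristic polynomial is q
        n = len(q) - 1
        M = [[0] * n for _ in range(n)]
        for i in range(1, n):
            M[i][i - 1] = 1
        for i in range(n):
            M[i][n - 1] = (-q[i]) % p
        return M

    p = 251
    q = [7, 3, 0, 5, 1]                  # q(x) = x^4 + 5x^3 + 3x + 7, an arbitrary monic example
    M = companion(q, p)
    P = 1000003
    c = xpow_mod(P, q, p)                # coefficients c_0, ..., c_3 of expression (7)

    n = len(q) - 1
    rhs = [[0] * n for _ in range(n)]
    Mk = [[int(i == j) for j in range(n)] for i in range(n)]
    for k in range(n):
        rhs = [[(rhs[i][j] + c[k] * Mk[i][j]) % p for j in range(n)] for i in range(n)]
        Mk = mat_mul(Mk, M, p)
    assert mat_pow(M, P, p) == rhs       # M^P = c_0 I + c_1 M + ... + c_{n-1} M^{n-1}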
5
Conclusions
We propose a key exchange scheme based on the behavior of matrix products of the type M_1^v M_2^w, where M_1, M_2 are elements of a non-abelian group of block upper triangular matrices of sufficiently large order and v, w are integers. One of the main advantages of this scheme is the absence of large prime numbers, avoiding the need for primality tests. Moreover, the proposed scheme is very efficient since it employs fast exponentiation algorithms for this type of matrix. The presented algebraic structure can be applied to other cryptographic applications beyond this proposal. It is based on the discrete logarithm problem for matrices and offers the advantage of reducing the required key length for a given level of security, as well as the use of non-prime numbers, which avoids the primality tests that slow cryptographic protocols down.
References 1. Agnew, G.B., Mullin, R.C., Vanstone, S.A.: Fast Exponentiation in GF (2n ). In: G¨ unther, C.G. (ed.) EUROCRYPT 1988. LNCS, vol. 330, pp. 251–255. Springer, Heidelberg (1988) ´ 2. Alvarez, R., Ferr´ andez, F., Vicent, J.F., Zamora, A.: Applying quick exponentiation for block upper triangular matrices. Applied Mathematics and Computation 183, 729–737 (2006) 3. Anshel, I., Anshel, M., Goldfeld, D.: An algebraic method for public-key cryptography. Mathematical Research Letters 6, 287–291 (1999) 4. Blake, I., Seroussi, G., Smart, N.: Elliptic Curves in Cryptography. London Mathematical Society. Lecture Notes, vol. 265. Cambridge University Press, Cambridge (1999) 5. Climent, J.J., Gorla, E., Rosenthal, J.: Cryptanalysis of the CFVZ cryptosystem. Advances in Mathematics of Communications (AMC) 1, 1–11 (2006) 6. Climent, J.J.: Propiedades espectrales de matrices: el ´ındice de matrices triangulares por bloques. La ra´ız Perron de matrices coc´ıclicas no negativas. Ph.D Thesis, Valencia. Sapin (1993) 7. Coppersmith, D., Odlyzko, A., Schroeppel, R.: Discrete logarithms in GF (p). Algorithmica, 1–15 (1986) 8. Diffie, W., Hellman, M.: New directions In Cryptography. IEEE Trans. Information Theory 22, 644–654 (1976) 9. Gathen, J.: Efficient and Optimal Exponentiation in Finite Fields. Computational Complexity 1, MR 94a:68061, 360–394 (1991) 10. Gordon, D.M.: A Survey of Fast Exponentiation Methods. Journal of Algorithms 27, 129–146 (1998) 11. Hoffman, K., Kunze, R.: Linear Algebra. Prentice-Hall, New Jersey (1971) 12. Ko, K.H., Lee, S.J., Cheon, J.H., Han, J.W., Kang, J.: New Public Key Cryptosystem Unsing Braid Groups. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 166–183. Springer, Heidelberg (2000) 13. Koblitz, N.: A Course in Number Theory and Cryptography. Springer, Heidelberg (1987) 14. Lidl, R., Niederreiter, H.: Introduction to Finite Fields and Their Applications. Cambridge University Press, Cambridge (1994) 15. McCurley, K.: The discret logarithm problem. Cryptology and Computational Number Theory. In: Proceedings of Symposia in Applied Mathematics, vol. 42, pp. 49–74 (1990) 16. Menezes, A., Van Oorschot, P., Vanstone, S.: Handbook of Applied Cryptography. CRC Press, Florida (2001) 17. Menezes, A., Wu, Y.-H.: A polynomial representation for logarithms in GF (q). Acta arithmetica 47, 255–261 (1986) 18. Menezes, A., Wu, Y.-H.: The Discrete Logarithm Problem in GL(n, q). Ars Combinatoria 47, 22–32 (1997) 19. Odoni, R.W.K., Varadharajan, V., Sanders, P.W.: Public Key Distribution in Matrix Rings. Electronic Letters 20, 386–387 (1984) 20. Pohlig, S.C., Hellman, M.E.: An improved algorithm for computing logarithms over GF (p) and its cryptographic significance. IEEE Trans. Info. Theory 24, 106–110 (1978) 21. Shuhong, G., Gathen, J., Panario, D., Shoup, V.: Algorithms for Exponentiation in Finite Fields. Journal of Symbolic Computation 29-6, 879–889 (2000)
Word Oriented Cascade Jump σ−LFSR

Guang Zeng, Yang Yang, Wenbao Han, and Shuqin Fan

Department of Applied Mathematics, Zhengzhou Information Science and Technology Institute, Zhengzhou 450002, China
Abstract. Bit oriented cascade jump registers were recently proposed as building blocks for stream ciphers. They are hardware oriented by design and hence inefficient in software. In this paper, word oriented cascade jump registers are presented, based on the design idea of bit oriented cascade jump registers. Their constructions make use of special word oriented σ−LFSRs, which can be implemented efficiently on modern CPUs and require only a little memory. Experimental results show that one type of efficient word oriented cascade jump σ−LFSR can be used as a building block for software oriented stream ciphers.

Keywords: Stream Cipher, Linear Feedback Shift Register (LFSR), Cascade Jump LFSR, σ−LFSR, Fast Software Encryption.
1
Introduction
Today stream ciphers are widely used in areas where the combination of security, performance and implementation complexity are of importance. One such area is wireless communication (GSM, 3G, IEEE802.11), where a low gate count in hardware implementation requirements prevail. Another area is efficient software encryption with speed about tens of gigabits per second. This opinion is supported by the eSTREAM project [1], which is an effort to identify new stream ciphers that might be interesting for widespread adoption. LFSR is known to allow fast hardware implementation and produces sequences with a large period and good statistical properties. But the inherent linearity of these sequences results in susceptibility to algebraic attacks [2,3]. A wellknown method to resist that is clock control. Due to the traditional clock control mechanism that either controls the LFSR irregularly clocking or shrinks or thins the output, clock controlled LFSRs have decreased rate of sequence generation since such LFSRs are usually stepped a few times to produce just one bit. A more effective method is dynamic feedback control, which is used in stream
This work has been supported by a grant from the National High Technology Research and Development Program of China (No.2006AA01Z425), and National Natural Science Foundation of China (No.90704003, No.60503011), and National Basic Research Program of China (No.2007CB807902).
cipher Pomaranch [4]. Pomaranch is a hardware oriented stream cipher and the core part is made of several bit oriented cascade jump registers [5], which was presented at SASC2004 Workshop. The main idea of cascade jump register [6] is to move the state of LFSR to another one over more than one step without having to step through all the intermediate states. The nature of cascade jump register is bit oriented design, so it is suitable for hardware implementation while its software implementation is inefficient. The recent trend in software oriented stream cipher is towards word oriented design such as Sober [7], Snow [8], Sosemanuk [9] and etc. This allows them not only to be efficiently implemented in software but also to increase the throughput since words instead of bits output. σ−LFSRs [10,11] are one type of efficient word oriented LFSRs, which are constructed by few fundamental instructions of modern processor and hence have high software efficiency. Combining the design ideas of bit oriented cascade jump register and word oriented σ−LFSR, we propose a cascade jump σ−LFSR. We examine one type of cascade jump σ−LFSRs on modern CPU architecture and make suggestions for design practices to maximize the speed, the security and decrease the implementation complexity of general cascade jump σ−LFSR. Finally, we give the jump index of our examples using Pollard Rho method.
2
Cascade Jump Register and σ−LFSR
Bit Oriented Cascade Jump Register. Linear Finite State Machine (LFSM) has been widely used in cryptography. Let F2 be the binary field and F2m be its extension field for some positive integer m. The state of the LFSM is represented t by a vector St = (rn−1 , · · · , r0t ) ∈ Fn2m at time t, where rit ∈ F2m and rit denotes the content of the ith memory cell after t transitions. As the finite state machine is linear, transitions from one state to the next can be described by a multiplication of the state vector with a transition matrix T ∈ Mmn (F2 ), i.e. St+1 = St · T , for t ≥ 0. In practice, matrix T is generally chosen to have a special form for convenience of implementation. If m = 1, the main idea of bit oriented cascade jump register is to find whether there exists an integer J, such that T J = T + I where I is a mn × mn identity matrix. If such a power of the transition matrix does exist, then one achieves the same effect as multiplying the state vector either by T J or by T + I. Moreover, since changing T into T +I is generally much simpler than rewiring T into T J for an arbitrary transition matrix T , the LFSR is easy to jump. This modification of the transition matrix has very low complexity and is much more attractive and efficient than the method of rewiring the LFSR. Hence bit oriented cascade jump registers can be used as a dynamic part controlled by two-value sequence in clock control stream cipher. Let g(x) = det(xI + T ) be the characteristic polynomial of T and g ⊥ (x) = g(x + 1) be the dual polynomial of g(x). If there exists an integer J such that T J = T + I, the value of J is called the jump index of g. The jump index always exists if g(x) is a primitive polynomial over F2 . Pomaranch uses 6 or 9 bit oriented
cascade jump registers for 80-bit and 128-bit keys respectively. Both registers are built on 18 bit memory cells, and have a primitive characteristic polynomial, i.e., when clocked only by zeros or ones they have a period of 2^18 − 1. The transition matrix used in Pomaranch has a very special form, namely

\begin{pmatrix}
d_0 & 0 & 0 & \cdots & 0 & 1 \\
1 & d_1 & 0 & \cdots & 0 & t_1 \\
0 & 1 & d_2 & \ddots & \vdots & \vdots \\
0 & 0 & \ddots & \ddots & 0 & \vdots \\
\vdots & \vdots & \ddots & 1 & d_{16} & t_{16} \\
0 & 0 & \cdots & 0 & 1 & d_{17} + t_{17}
\end{pmatrix}     (1)
The major advantage of this type of transition matrix is its efficient hardware implementation. To obtain a word oriented cascade jump register, one only needs to choose m ≥ 2, in particular m = 32 or 64 to match the word size of the CPU. But it should be noticed that the operation S_t · T is likely to be expensive on a computer unless the matrix T has a special form. In this case, σ−LFSRs are good choices because they are software oriented by design.

Word Oriented σ−LFSR. The main idea of the σ−LFSR [10,11] is to use a few fundamental logic instructions, arithmetic instructions, or Single Instruction Multiple Data techniques to construct highly efficient word oriented LFSRs with good cryptographic properties. Let C_0, ..., C_{n−1} ∈ M_m(F_2) and let the state at time t be S_t = (r_{n−1}^t, ..., r_0^t) ∈ F_{2^m}^n; then the next state is S_{t+1} = (r_{n−1}^{t+1}, ..., r_0^{t+1}) ∈ F_{2^m}^n, where r_i^{t+1} = r_{i+1}^t for 0 ≤ i ≤ n − 2 and

r_{n−1}^{t+1} = C_0 r_0^t + C_1 r_1^t + ... + C_{n−1} r_{n−1}^t.     (2)

The system is called a σ−LFSR of order n, and the matrix polynomial f(x) = x^n + C_{n−1} x^{n−1} + ... + C_1 x + C_0 ∈ M_m(F_2)[x] is called the σ−polynomial of the σ−LFSR. In fact, σ is a circular rotation operation defined as follows. Let α_0, ..., α_{m−1} be a basis of F_{2^m}; for α ∈ F_{2^m} there exists a vector (a_0, a_1, ..., a_{m−1}) ∈ F_2^m such that α = \sum_{i=0}^{m−1} a_i α_i. The operation σ over F_{2^m} is defined as

σ(α) = a_{m−1} α_0 + a_0 α_1 + ... + a_{m−2} α_{m−1},     (3)
and σ is the Frobenius automorphism over F_{2^m} when the basis is normal, namely σ(α) = α^2. It is easy to see that σ is a linear transformation of F_{2^m}/F_2. From the viewpoint of linear transformations, adjoining σ to F_{2^m} yields a new ring A_{2^m} = F_{2^m}[σ]; moreover, A_{2^m} ≅ M_m(F_2) [11]. To obtain a fast software implementation, we confine the coefficients to several special forms, as follows.

1. AND operation: ∧_V(α) = α ∧ V = \sum_{i=0}^{m−1} a_i c_i α_i, where V = \sum_{i=0}^{m−1} c_i α_i;
2. Circular rotation operation: σ^k(α);
3. Left shift operation L: L(α) = \sum_{i=1}^{m−1} a_i α_{i−1};
4. Right shift operation R: R(α) = \sum_{i=0}^{m−2} a_i α_{i+1};
5. LR shift combination operation: L^s + R^t.

If the output sequences generated by a σ−LFSR attain the maximal period, that σ−LFSR is called a primitive σ−LFSR. If s^∞ is a sequence generated by a primitive σ−LFSR, its bit coordinate sequences are all m-sequences with the same minimal polynomial [11]. In this case, the number of nonzero coefficients of the binary minimal polynomial is called the Hamming weight of the σ−LFSR. This parameter is important: the closer it is to half the degree of the minimal polynomial, the better the properties. With the special operations above, we can find many primitive σ−LFSRs with good cryptographic properties. The following is a primitive σ−LFSR over F_{2^16}.
Fig. 1. σ−LFSR with σ−polynomial x^8 + ∧_{0x3fb6} x + (L^1 + R^1)
The minimal polynomial of its coordinate sequences is p(x), with Hamming weight 45, where

p(x) = x^128 + x^121 + x^114 + x^112 + x^107 + x^105 + x^98 + x^96 + x^91 + x^75 + x^73 + x^72 + x^68 + x^66 + x^65 + x^64 + x^61 + x^59 + x^58 + x^57 + x^56 + x^54 + x^52 + x^51 + x^49 + x^47 + x^45 + x^42 + x^41 + x^38 + x^36 + x^35 + x^33 + x^31 + x^29 + x^27 + x^26 + x^25 + x^24 + x^20 + x^18 + x^11 + x^6 + x^4 + 1.

Furthermore, this σ−LFSR is extremely fast: its output speed is over 25 Gbit/s on a Pentium IV 3.0 GHz CPU.
3
Word Oriented Cascade Jump σ−LFSR
Cascade jump registers and σ−LFSRs are effective methods in clock control design and in word oriented LFSR design, respectively. If the two advantages can be combined, we can strengthen the security of the σ−LFSR and improve the performance of the bit oriented cascade jump register. With this idea, we study the word oriented cascade jump σ−LFSR. The transition matrix of a general word oriented cascade jump σ−LFSR is as follows:

T = \begin{pmatrix}
0 & 0 & \cdots & 0 & C_0 \\
I & 0 & \cdots & 0 & C_1 \\
\vdots & \ddots & \ddots & \vdots & \vdots \\
0 & \cdots & I & 0 & C_{n-2} \\
0 & \cdots & 0 & I & C_{n-1}
\end{pmatrix}_{mn \times mn}

Only if T has maximal multiplicative order in M_{mn}(F_2) does the output sequence generated by this σ−LFSR achieve the maximal period 2^{mn} − 1, and in this
case, there exists a jump index J such that T^J = T + I. The following theorem may be helpful in searching for a primitive T.

Theorem 1. [10] Let a σ−LFSR over F_{2^m} of order n have σ−polynomial f(x) = x^n + C_{n−1} x^{n−1} + ... + C_1 x + C_0 ∈ M_m(F_2)[x], where C_0 ∈ GL_m(F_2) and C_l = (c_l^{ij})_{m×m} for l = 0, 1, ..., n − 1. Let F(x) = (f^{ij}(x))_{m×m} ∈ M_m(F_2[x]) be the corresponding polynomial matrix of f(x), where

f^{ij}(x) = δ^{ij} x^n + \sum_{l=0}^{n−1} c_l^{ij} x^l ∈ F_2[x],   δ^{ij} = \begin{cases} 1, & i = j; \\ 0, & i ≠ j. \end{cases}

Then the σ−LFSR has maximal period if and only if the determinant |F(x)| is a primitive polynomial over F_2 of degree mn.

A word oriented cascade jump σ−LFSR can be used under a binary control sequence, namely by changing the transition matrix from T to T + I according to the control sequence. We should first choose the σ−polynomial f(x) to be primitive, and furthermore f^⊥(x) = f(x + 1) should also be primitive. If the σ−polynomial f^⊥ is not primitive, its output sequences will have short period with high probability and cannot provide the good theoretical properties of maximal period sequences. This weakness would make the word oriented cascade jump σ−LFSR very dangerous if the control sequence is not balanced. The dual transition matrix of a cascade jump σ−LFSR is identical to the transition matrix except for the entries on the main diagonal; equivalently, adding ones to the entries on the main diagonal of the transition matrix equals raising that matrix to the jump index power. Seeking fast efficiency, we focus on one type of word oriented cascade jump σ−LFSR with a σ−polynomial over M_m(F_2) of the form

f(x) = x^n + ∧_V x^r + (L^s + R^t),     (4)
where V ∈ F_{2^m}, 0 < s, t < m − 1 and (s + t) | m. We have made an exhaustive search for m = 16. Because of the symmetry in s and t (if Eq. (4) is primitive, then g(x) = x^n + ∧_V x^r + (L^t + R^s) is also primitive), we require s ≤ t. In total 6,815,744 σ−polynomials of the form of Eq. (4) were tested with n = 8. The following table lists the number of maximal period word oriented cascade jump σ−LFSRs.

Table 1. The number of primitive f and its dual f^⊥

        r = 1   r = 3   r = 5   r = 7   r = 2, 4, 6
f       2510    2415    2680    5666    0
f^⊥     2974    2334    2419    3570    0
Both    1382    48      54      46      0
Table 2. Hamming weight of some good σ−polynomials

SN   f(x)                                    Weight(f)   Weight(f^⊥)
1    x^8 + ∧_{0x5806} x + (L^3 + R^5)        9           33
2    x^8 + ∧_{0xffc0} x + (L^1 + R^7)        9           35
3    x^8 + ∧_{0x9239} x + (L^1 + R^1)        35          65
4    x^8 + ∧_{0x29d8} x + (L^1 + R^1)        33          75
5    x^8 + ∧_{0x2ab9} x^3 + (L^1 + R^1)      33          55
6    x^8 + ∧_{0xdb73} x^3 + (L^1 + R^1)      31          53
7    x^8 + ∧_{0xb036} x^5 + (L^1 + R^3)      21          57
8    x^8 + ∧_{0xf00e} x^5 + (L^1 + R^1)      29          65
9    x^8 + ∧_{0x4e2c} x^7 + (L^1 + R^1)      29          61
10   x^8 + ∧_{0x623c} x^7 + (L^1 + R^1)      39          69
There are in total 1530 word oriented cascade jump σ−LFSRs satisfying our selection principle, and the Hamming weight of f^⊥(x) is superior to that of f(x). To make the test results visible, some examples with both f and f^⊥ primitive are listed below.

Example 1. The tenth word oriented cascade jump σ−LFSR in Table 2 has the best Hamming weight properties. Both software implementations are very fast owing to the simplicity of f and f^⊥. The original σ−LFSR and the modified cascade jump σ−LFSR are shown in Figure 2 and Figure 3 respectively.
Fig. 2. σ−LFSR with σ−polynomial x^8 + ∧_{0x623c} x^7 + (L^1 + R^1)
Fig. 3. The Modified Cascade Jump σ−LFSR in Fig.2
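The following Python sketch illustrates how the register of Fig. 2/Fig. 3 can be stepped in software and what the jump costs. It assumes 16-bit words with bit i of a word holding the coordinate a_i, so that L and R become the single-bit word shifts used below; the paper's exact bit-ordering convention may differ, so this is not claimed to reproduce the exact maximal-period sequence. The jump itself, however, is convention-independent: replacing T by T + I means S_{t+1} = S_t T + S_t, i.e. the ordinary step XORed word-wise with the old state.

    MASK = 0xFFFF            # m = 16-bit words
    V = 0x623C               # AND mask of the SN 10 sigma-polynomial
    N_WORDS = 8              # order n = 8

    def feedback(state):
        # f(x) = x^8 + AND_V x^7 + (L^1 + R^1): C_7 = AND_V acts on r_7, C_0 = L + R on r_0
        # (bit-ordering assumption: coordinate a_i is bit i of the word)
        r0, r7 = state[0], state[7]
        return (((r0 << 1) ^ (r0 >> 1)) & MASK) ^ (r7 & V)

    def step(state):
        # ordinary transition S_{t+1} = S_t * T   (state[i] holds the word r_i)
        return state[1:] + [feedback(state)]

    def jump_step(state):
        # jumped transition S_{t+1} = S_t * (T + I) = S_t * T + S_t:
        # the same shift and feedback, XORed word-wise with the old state
        return [a ^ b for a, b in zip(step(state), state)]

    def run(seed_words, control_bits):
        # clock the register under a binary control sequence; control bit 1 selects the jump
        state = list(seed_words)
        out = []
        for bit in control_bits:
            state = jump_step(state) if bit else step(state)
            out.append(state[N_WORDS - 1])
        return out

    # example: 32 output words under an alternating control sequence
    print(run([i + 1 for i in range(N_WORDS)], [i % 2 for i in range(32)]))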
Searching for good word oriented σ−LFSRs of the form in Eq. (4) is a time-consuming procedure. The following two necessary conditions may decrease the amount of search work a little.

Theorem 2. If r is an even integer, f(x) defined in Eq. (4) cannot be primitive.
Proof: According to Theorem 1, we need to calculate |F(x)|. Since every entry of F(x) is then a square in F_2[x], the determinant must be a square polynomial, hence not a primitive polynomial.

Theorem 3. Polynomial f(x) in Eq. (4) cannot be primitive if s = t > 2.

Proof: Let C_{m,s} be the m × m matrix

C_{m,s} = \begin{pmatrix}
x_1 &        &        & 1      &        &        \\
    & \ddots &        &        & \ddots &        \\
    &        & \ddots &        &        & 1      \\
1   &        &        & \ddots &        &        \\
    & \ddots &        &        & \ddots &        \\
    &        & 1      &        &        & x_m
\end{pmatrix}_{m \times m}     (5)

i.e., the matrix with x_1, ..., x_m on the main diagonal and 1's on the s-th super- and subdiagonals.
In fact, setting x_i = x^n + c_i x^r for 0 ≤ i ≤ m − 1 in (5), we obtain the polynomial matrix expression of the f(x) defined in (4). By Theorem 1, we need to calculate |F(x)| and check whether it is a primitive polynomial. Next we prove that the determinant of C_{m,s} is a product of s polynomials whenever s ≥ 2; hence the determinant of F(x) must be reducible, and obviously not primitive.

For C_{m,s} with m ≤ 2s, we can use the last m − s rows to reduce the first m − s rows, then apply the Laplace theorem to the first m − s columns to compute the determinant; by suitably adjusting some columns we obtain an s × s diagonal matrix whose entries pair x_1 with x_{s+1}, x_2 with x_{s+2}, and so on. So |C_{m,s}| is a product of s polynomials.

Next we prove the general case by induction. Suppose that |C_{m,s}| can be transformed into an s × s diagonal matrix, using elementary row and column operations, whenever (k − 1)s < m ≤ ks. We need to prove that the same holds for ks < m ≤ (k + 1)s. In this case we partition the matrix C_{m,s} so that its leading principal ks × ks submatrix is C_{ks,s}, the remaining diagonal entries are x_{ks+1}, ..., x_m, and the only other nonzero entries are the 1's on the s-th super- and subdiagonals.
According to our hypothesis, C_{ks,s} can be transformed into an s × s diagonal matrix by elementary operations. Assuming the resulting diagonal matrix is diag(x_1, ..., x_s), after applying these transformations we have

|C_{m,s}| = \begin{vmatrix}
x_1 &        &     & \ast      &        &      \\
    & \ddots &     &           & \ddots &      \\
    &        & x_s &           &        & \ast \\
1   &        &     & x_{m-s+1} &        &      \\
    & \ddots &     &           & \ddots &      \\
    &        & 1   &           &        & x_m
\end{vmatrix}.
Then using the similar elementary operations as that in the case of m ≤ 2s, we can obtain another s × s diagonal matrix. Hence the determinant of Cm,s is the multiplication of s polynomials. For finding other types of word oriented cascade jump σ−LFSRs, one should choose both f (x) and f ⊥ (x) to be primitive. To be efficiently implemented in software, one needs to choose special matrix, which can be fast implemented on CPU. In the search procedure, one needs to carefully observe the test results to find some necessary condition for decreasing the search time. Finally, for the security aspect, one needs to pick out those whose Hamming weight are as large as possible.
4
Jump Index Calculation
The determination of the jump index of a given irreducible polynomial is a challenging task, especially for high degree polynomials, as it amounts to finding a discrete logarithm in a large finite extension field. By employing a small amount of theory and a large amount of computing power, it is nowadays easy to determine the jump index of our examples above. Let f(x) be the primitive σ−polynomial defined in Eq. (4); we briefly outline how we use the methods of Pohlig-Hellman and Pollard to compute d = log_x(x + 1) in the finite field F = F_{2^128}. First we use the Pohlig-Hellman method [12] to reduce the DLP in F^* to DLPs in subgroups of prime order p, with p dividing |F^*| = 2^128 − 1. There are in total 9 such subgroups, since 2^128 − 1 = 3 · 5 · 17 · 257 · 641 · 65537 · 274177 · 6700417 · 67280421310721. For each p dividing |F^*|, we compute d mod p from

(x + 1)^{|F^*|/p} = (x^{|F^*|/p})^{d mod p}.
Each of these equations represents a DLP in the group of order p generated by x^{|F^*|/p}. Having computed d mod p for all prime factors p, we use the Chinese Remainder Theorem to determine d. To find d mod p in each subgroup we use Pollard's Rho Method [13]. Assuming that G has group order |G| = p with p prime, we divide G into four pairwise disjoint sets T_1, T_2, T_3, T_4 according to the constant term and the coefficient of x. Define f_P : G → G as

f_P(y) = \begin{cases}
(x + 1) \cdot y, & y ∈ T_1; \\
y^2, & y ∈ T_2; \\
x \cdot y, & y ∈ T_3; \\
x \cdot (x + 1) \cdot y, & y ∈ T_4.
\end{cases}

We choose a random number α in the range {1, 2, ..., |G|}, compute a starting element y_0 = x^α, and put y_{i+1} = f_P(y_i) for i = 1, 2, .... This induces two sequences of integer triples, and we check whether y_{2i} matches y_i. If this is the case, we can obtain d mod p with very high probability. If not, we repeat the whole computation with another starting value y_0; but this case is very rare for large group orders |G| = p. The expected number of iterations is about 1.229\sqrt{π|G|/2}, and it takes about one minute to compute one discrete logarithm on average. The following table gives the jump indexes of the examples in Table 2. We list the jump index of f(x) and of its dual f^⊥(x). From the table, we can see the jump indexes are extremely large, hence the cascade jump σ−LFSR is indeed an effective method in clock control design.

Table 3. Jump index results for the examples in Table 2

SN 1             229293628387587534271494079939876772168
dual of SN 1     171093700099261338188907767079510869762
SN 2             336329875434468839552512651424276600321
dual of SN 2     221670400420824361821529547693718365366
SN 3             132704672458331951512744918391743389836
dual of SN 3     63484811825393477931744641796814909436
SN 4             242693385714521507540553960871504133333
dual of SN 4     242175253972759543846615773932946320432
SN 5             302663685763979419983753102844986248754
dual of SN 5     30363583456125956220431035122577343503
SN 6             38103675587383110403116109629553749812
dual of SN 6     21701473805279016312844913568808878794
SN 7             294387267487675766095140689863605610440
dual of SN 7     337929409571672598321702082033778437886
SN 8             196365381069874425320432318291992328986
dual of SN 8     106083960323802987776388354828025397851
SN 9             333320604818996548962247785769086115220
dual of SN 9     38808670205409475437208962295677546997
SN 10            199203367905034120698612111405886013754
dual of SN 10    258674226151357822616504808032925789541
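The reduce-and-recombine structure just described can be sketched on a toy group. The example below works in Z_211^* (211 is prime and 210 = 2 · 3 · 5 · 7 is squarefree) and solves each prime-order sub-problem by brute force instead of Pollard's rho, so only the Pohlig-Hellman and Chinese Remainder Theorem skeleton is illustrated; carrying this out in F_{2^128}^* as in the paper additionally requires arithmetic modulo the minimal polynomial and the rho iteration above. All names are ours.

    from math import prod

    p = 211                              # prime; group order p - 1 = 210 = 2 * 3 * 5 * 7
    order = p - 1
    primes = [2, 3, 5, 7]

    def find_generator():
        # g generates Z_p^* iff g^((p-1)/q) != 1 for every prime q dividing p - 1
        return next(g for g in range(2, p)
                    if all(pow(g, order // q, p) != 1 for q in primes))

    def crt(residues, moduli):
        # Chinese Remainder Theorem for pairwise coprime moduli
        M = prod(moduli)
        return sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, moduli)) % M

    g = find_generator()
    d_secret = 123                       # the logarithm to be recovered
    h = pow(g, d_secret, p)

    residues = []
    for q in primes:
        # project into the subgroup of order q: h^(order/q) = (g^(order/q))^(d mod q)
        gq, hq = pow(g, order // q, p), pow(h, order // q, p)
        residues.append(next(e for e in range(q) if pow(gq, e, p) == hq))  # brute-force DLP

    assert crt(residues, primes) == d_secret % order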
5
Conclusions
Modern processors have gained much better performance than their predecessors. This indicates that the design criteria of software oriented cryptographic algorithms should fully benefit from these changes. We have illustrated one type of word
oriented cascade jump σ−LFSR, which is software oriented by design and hence fast. Although not shown in great detail, it should be clear that these word oriented cascade jump σ−LFSR constructions can be used as building blocks in software oriented clock control stream ciphers in a very efficient way.
References 1. ECRYPT, eSTREAM: ECRYPT Stream Cipher Project, http://www.ecrypt.eu.org/stream/ 2. Courtois, N., Meier, W.: Algebraic attacks on stream ciphers with liners feedback. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 345–359. Springer, Heidelberg (2003) 3. Courtois, N.: Fast algebraic attacks on stream ciphers with linear feedback. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 176–194. Springer, Heidelberg (2003) 4. Jansen, C.J.A., Helleseth, T., Kholosha, A.: Cascade jump controlled sequence generator and Pomaranch stream cipher (Version 3), eSTREAM, ECRYPT Stream Cipher Project (2007) 5. Jansen, C.J.A.: Streamcipher design: Make your LFSRs jump! In: The State of the Art of Stream Ciphers, Workshop Record, Brugge, Belgium, October 2004, pp. 94–108 (2004) 6. Jansen, C.J.A.: Stream cipher design based on jumping finite state machines. Cryptology ePrint Archive, Report 2005/267 (2005), http://eprint.iacr.org/2005/267/ 7. Hawkes, P., Rose, G.G.: Primitive Specification and Supporting Documentation for SOBER-t32 Submission to NESSIE. In: First NESSIE Workshop (2000) 8. Ekdahl, P., Johansson, T.: A New Version of the Stream Cipher SNOW. In: Nyberg, K., Heys, H.M. (eds.) SAC 2002. LNCS, vol. 2595, pp. 47–61. Springer, Heidelberg (2003) 9. Berbain, C., Billet, O., et al.: Sosemanuk, a fast software-oriented stream cipher. ECRYPT Stream Cipher Project (2007) 10. Zeng, G., Han, W., He, K.C.: High efficiency feedback shift register: σ−LFSR. Cryptology ePrint Archive, Report 2007/114 (2007) 11. Zeng, G., He, K.C., Han, W.: A trinomial type of σ−LFSR oriented toward software implementation. Science in China Series F-Information Sciences 50(3), 359–372 (2007) 12. Pohlig, S.C., Hellman, M.E.: An improved algorithm for computing logarithms over GF (p) and its cryptographic significance. IEEE Transactions on Information Theory 24, 106–110 (1978) 13. Pollard, J.M.: Monte Carlo methods for index computation (mod p). Mathematics of Computation 32(143), 918–924 (1978) 14. Teske, E.: Speeding Up Pollard’s Rho Method for Computing Discrete Logarithms. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 541–554. Springer, Heidelberg (1998)
On Some Sequences of the Secret Pseudo-random Index j in RC4 Key Scheduling

Riddhipratim Basu¹, Subhamoy Maitra¹, Goutam Paul², and Tanmoy Talukdar¹

¹ Indian Statistical Institute, 203 B T Road, Kolkata 700 108, India
² Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India
Abstract. The RC4 Key Scheduling Algorithm (KSA) uses a secret pseudo-random index j which is dependent on the secret key. Let S_N be the permutation after the complete KSA of RC4. It is known that the value of j in round y + 1 can be predicted with high probability from S_N[y] for the initial values of y and from S_N^{-1}[y] for the final values of y. This fact has been exploited in several recent works on secret key recovery from S_N. In this paper, we perform an extensive analysis of some special sequences of indices corresponding to the j values that leak useful information for key recovery. We present new theoretical results on the probability and the number of such sequences. As an application, we explain a new secret key recovery algorithm that can recover a 16-byte secret key with a success probability of 0.1409. Our strategy has high time complexity at this point and requires further improvement to be feasible in practice.

Keywords: Bias, Cryptanalysis, Filter, Key Recovery, Permutation, RC4, Sequence, Stream Cipher.
1
Introduction
The RC4 stream cipher has been designed by Ron Rivest for RSA Data Security in 1987, and was a propriety algorithm until 1994. The RC4 cipher has two components, namely, the Key Scheduling Algorithm (KSA) and the Pseudo-Random Generation Algorithm (PRGA). The KSA expands a secret key k[0 . . . l − 1] into an array K[0 . . . N − 1] such that K[y] = k[y mod l] for any y, 0 ≤ y ≤ N − 1. Using this key, an identity permutation S[0 . . . N −1] of {0, 1, . . . N −1} is scrambled using two indices i, j. The PRGA uses this permutation to generate pseudorandom keystream bytes z1 , z2 , . . ., that are bitwise XOR-ed with the plaintext to generate the ciphertext at the sender end and bitwise XOR-ed with the ciphertext to get back the plaintext at the receiver end. Any addition used related to the RC4 description is in general addition modulo N unless specified otherwise.
The first and last authors worked for this paper during the winter break between the semesters (2008–2009) in their Bachelor of Statistics course.
KSA(K)
  Initialization:
    For i = 0, ..., N − 1
      S[i] = i;
    j = 0;
  Scrambling:
    For i = 0, ..., N − 1
      j = (j + S[i] + K[i]);
      Swap(S[i], S[j]);

PRGA(S)
  Initialization:
    i = j = 0;
  Keystream Generation Loop:
    i = i + 1;
    j = j + S[i];
    Swap(S[i], S[j]);
    t = S[i] + S[j];
    Output z = S[t];
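For experiments with the quantities studied in this paper, the two routines translate directly into Python (with N = 256). The KSA version below also records the intermediate values j_1, ..., j_N, since these are what the filters of Section 2 refer to; the function names are ours.

    N = 256

    def ksa(key):
        # RC4 KSA: returns the final permutation S_N, the final j_N,
        # and the intermediate values j_1, ..., j_N
        l = len(key)
        S, j, js = list(range(N)), 0, []
        for i in range(N):
            j = (j + S[i] + key[i % l]) % N
            S[i], S[j] = S[j], S[i]
            js.append(j)
        return S, j, js

    def prga(S, nbytes):
        # RC4 PRGA keystream generation
        S, i, j, out = list(S), 0, 0, []
        for _ in range(nbytes):
            i = (i + 1) % N
            j = (j + S[i]) % N
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) % N])
        return out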
We use the subscript r with the variables S, i and j to denote their updated values in round r, where 1 ≤ r ≤ N. According to this notation, S_N is the permutation after the completion of the KSA. By S_0 and j_0, we mean the initial permutation and the initial value of the index j respectively, before the KSA begins. We use the notation S^{-1} for the inverse of the permutation S, i.e., if S[y] = v, then S^{-1}[v] = y. By f_y(K), we denote the expression \frac{y(y+1)}{2} + \sum_{x=0}^{y} K[x], 0 ≤ y ≤ N − 1.
RC4 is the most popular and widely deployed software stream cipher. It is used in network protocols such as SSL, TLS, WEP and WPA, and also in Microsoft Windows, Lotus Notes, Apple AOCE, Oracle Secure SQL etc. The cipher is simple in structure, easy to implement and efficient in throughput. Yet after more than twenty years of cryptanalysis, the cipher is still not completely broken and is of great interest to the cryptographic community. One may refer to [7,9,13] and the references therein for a detailed exposition on RC4 cryptanalysis. In [12], it has been argued that the most likely value of the y-th element of the permutation after the KSA for the first few values of y is given by SN [y] = fy (K). The experimental values of the probabilities P (SN [y] = fy (K)) for y from 0 to 47 were reported in [12] without any theoretical proof. The expressions for the probabilities P (SN [y] = fy (K)) for all values of the index y in [0, N − 1] were theoretically derived in [10, Section 2] and the work [11, Section 2.1] generalized it to derive P (Sr [y] = fy (K)) for all rounds r, 1 ≤ r ≤ N . It has been shown in [6] that the bytes SN [y], SN [SN [y]], SN [SN [SN [y]]], and so on, are biased to fy (K). In particular, they showed that P (SN [SN [y]] = fy (K)) decreases from 0.137 for y = 0 to 0.018 for y = 31 and then slowly settles down to 0.0039 (beyond y = 48). In [10, Section 3], for the first time an algorithm is presented to recover the complete key from the final permutation after the KSA using the Roos’ biases, without any assumption on the key or IV. The algorithm recovers some secret key bytes by solving sets of independent equations of the form SN [y] = fy (K) and the remaining key bytes by exhaustive search. Subsequently, the work [2] additionally considered differences of the above equations and reported better results. Recently, [1] has accumulated the ideas in the earlier works [2,6,10] along with some additional new results to devise a more efficient algorithm for key recovery. After the publication of [2,10], another work [11] which has been performed independently and around the same time as [1] shows that each byte
of SN actually reveals secret key information. The key recovery algorithm of [11] sometimes outperform that of [2]. A recent work [3] starts with the equations of [2] and considers a bit-by-bit approach to key recovery. Getting back the secret key from the final permutation after the KSA is an important problem in RC4 cryptanalysis. State recovery from keystream [4,8,14], another important attack on RC4, can be turned into a key recovery attack if key reconstruction from SN is possible in a time complexity less than that of state recovery attacks. Moreover, in certain applications such as WEP [5], the key and the IV are combined in such a way that the secret key can be easily extracted from the session key. For these applications, if one can recover the session key from the permutation then it is possible to get back the secret key. In that case, for subsequent sessions where the same secret key would be used with different known IV’s, the RC4 encryption would be completely insecure. In this paper, we study sequences of jy+1 values corresponding to the rounds y + 1 when i takes the value y, 0 ≤ y ≤ N − 1. We concentrate on a span equal to the key length at each end of the permutation SN . Based on our sequence analysis in Section 2, we present a novel bidirectional search algorithm for secret key recovery in Section 3. Our algorithm can recover secret keys of length 16 bytes with a success probability of 0.1409, which is almost two times the currently best known value of 0.0745 reported in [1]. The time reported in [1] for probability 0.0745 is 1572 seconds, but it is not clear how the complexity of [1] would grow with increase in probability. Our analysis provides better probability and a method to achieve that, but further tuning is required to make our strategy feasible in practice as the time complexity is very high. Our main idea is to guess K[0], K[1], . . . using the left end of the permutation and guess K[l − 1], K[l − 2], . . . using the right end of the permutation simultaneously. When some key bytes are guessed, the KSA may be run (in the forward or in the backward direction, depending on the location of the guessed key bytes) until a known jy+1 is encountered. If this matches with the computed jy+1 , then the guessing is continued, else the partial key is discarded. Thus, the indices y for which jy+1 ’s are known act as “filters” for separating out the wrong keys from the correct candidates. All the existing works on key recovery from the final permutation after the KSA try to guess the entire key together. Our bidirectional key retrieval makes use of the fact that once certain key bytes are known, the choice of possible values of the remaining key bytes become restricted. To our knowledge, the concept of “filtering technique” as well as the notion of bidirectional key search is a completely different and new approach towards secret key recovery of RC4.
2
Theoretical Analysis of Sequences of Filter Indices
In this section, we study the structure and algebraic properties of the sequence of indices that act as filters for validating the secret key guesses. For simplicity, we restrict l to be a factor of N .
The secret index j is pseudo-random and may be considered uniformly distributed in [0, N − 1]. In general, if the values for a sequence of m many j's are guessed randomly, the success probability is N^{-m}. For N = 256 and m = 8, this value turns out to be 2^{-64}. In contrast, using our theoretical framework, we can guess a sequence of j values with very good probability (see Table 1).

Definition 1 (Event A_y). For 0 ≤ y ≤ N − 1, event A_y occurs if and only if the following three conditions hold.
1. S_y[y] = y and S_y[j_{y+1}] = j_{y+1}.
2. j_{y+1} ≥ y.
3. S_N[y] = S_{y+1}[y].

Proposition 1. p_y = P(A_y) = (\frac{N-y}{N}) \cdot (\frac{N-2}{N})^y \cdot (\frac{N-1}{N})^{N-1-y}, 0 ≤ y ≤ N − 1.

Proof. Let us analyze the conditions mentioned in the definition of A_y.
1. S_y[y] = y and S_y[j_{y+1}] = j_{y+1} occur if j_t ∉ {y, j_{y+1}} for 1 ≤ t ≤ y, the probability of which is (\frac{N-2}{N})^y.
2. P(j_{y+1} ≥ y) = \frac{N-y}{N}.
3. S_N[y] = S_{y+1}[y] if j_t ≠ y, ∀t ∈ [y + 2, N]. This happens with probability (\frac{N-1}{N})^{N-y-1}.
Multiplying the above three probabilities, we get the result.

Note that the event A_y implies j_{y+1} = S_N[y], and Proposition 1 is a variant of [10, Lemma 2] that relates j_{y+1} and S_N[y].

Definition 2 (Event B_y). For 0 ≤ y ≤ N − 1, event B_y occurs if and only if the following three conditions hold.
1. S_y[y] = y.
2. j_{y+1} ≤ y.
3. S_N[j_{y+1}] = S_{y+1}[j_{y+1}].

Proposition 2. p'_y = P(B_y) = (\frac{y+1}{N}) \cdot (\frac{N-1}{N})^{N-1}, 0 ≤ y ≤ N − 1.

Proof. Let us analyze the conditions mentioned in the definition of B_y.
1. S_y[y] = y occurs if j_t ≠ y for 1 ≤ t ≤ y, the probability of which is (\frac{N-1}{N})^y.
2. P(j_{y+1} ≤ y) = \frac{y+1}{N}.
3. S_N[j_{y+1}] = S_{y+1}[j_{y+1}] if j_t ≠ j_{y+1}, ∀t ∈ [y + 2, N]. This happens with probability (\frac{N-1}{N})^{N-y-1}.
Multiplying the above three probabilities, we get the result.

Note that the event B_y implies j_{y+1} = S_N^{-1}[y], 0 ≤ y ≤ N − 1. B_y is the same as the event E_1(y) defined in [11], and the proof of Proposition 2 has been presented as part of the proof of [11, Theorem 3].
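Both propositions are easy to check empirically. The Monte Carlo sketch below runs the KSA with random 16-byte keys, detects the events A_y and B_y directly from their definitions, and compares the observed frequencies with the closed forms; the trial count and the sampled indices are arbitrary choices of ours.

    import random

    N, l, TRIALS = 256, 16, 10000
    countA, countB = [0] * N, [0] * N

    for _ in range(TRIALS):
        key = [random.randrange(N) for _ in range(l)]
        S, j = list(range(N)), 0
        okA, okB = [False] * N, [False] * N
        valA = [0] * N                 # S_{y+1}[y], for condition 3 of A_y
        posB, valB = [0] * N, [0] * N  # j_{y+1} and S_{y+1}[j_{y+1}], for condition 3 of B_y
        for y in range(N):
            fixed = (S[y] == y)                        # S_y[y] = y ?
            j = (j + S[y] + key[y % l]) % N            # j_{y+1}
            okA[y] = fixed and S[j] == j and j >= y    # conditions 1-2 of A_y
            okB[y] = fixed and j <= y                  # conditions 1-2 of B_y
            S[y], S[j] = S[j], S[y]
            valA[y] = S[y]
            posB[y], valB[y] = j, S[j]
        for y in range(N):
            countA[y] += okA[y] and S[y] == valA[y]          # condition 3 of A_y
            countB[y] += okB[y] and S[posB[y]] == valB[y]    # condition 3 of B_y

    for y in (0, 8, 15, 128, 254):
        pA = ((N - y) / N) * ((N - 2) / N) ** y * ((N - 1) / N) ** (N - 1 - y)
        pB = ((y + 1) / N) * ((N - 1) / N) ** (N - 1)
        print(y, countA[y] / TRIALS, round(pA, 4), countB[y] / TRIALS, round(pB, 4))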
Definition 3 (Filter). An index y in [0, N − 1] is called a filter if either of the following two holds.
1. 0 ≤ y ≤ N/2 − 1 and event A_y occurs.
2. N/2 ≤ y ≤ N − 2 and event B_y occurs.

Definition 4 (Bisequence). Suppose j_N is known. Then a sequence of at least (t + t' + 1) many filters is called a (t, t')-bisequence if the following three conditions hold.
1. Exactly t many filters 0 ≤ i_1 < i_2 < ... < i_t ≤ l/2 − 1 exist in the interval [0, l/2 − 1].
2. Exactly t' many filters N − 1 − l/2 ≤ i'_{t'} < ... < i'_1 ≤ N − 2 exist in the interval [N − 1 − l/2, N − 2].
3. Either a filter i_{t+1} exists in the interval [l/2, l − 1] or a filter i'_{t'+1} exists in the interval [N − 1 − l, N − 2 − l/2].

Lemma 1. Given a set F_t of t many indices in [0, l/2 − 1], a set B_{t'} of t' many indices in [N − 1 − l/2, N − 2] and an index x in [l/2, l − 1] ∪ [N − 1 − l, N − 2 − l/2], the probability of the sequence of indices F_t ∪ B_{t'} ∪ {x} being a (t, t')-bisequence is

\Big( \prod_{i_u \in F_t} p_{i_u} \prod_{i_v \in [0, l-1] \setminus (F_t \cup \{x\})} q_{i_v} \prod_{i'_u \in B_{t'}} p'_{i'_u} \prod_{i'_v \in [N-1-l, N-2] \setminus (B_{t'} \cup \{x\})} q'_{i'_v} \Big) \tilde{p}_x,

where q_y = 1 − p_y, q'_y = 1 − p'_y for 0 ≤ y ≤ N − 1, and \tilde{p}_x = p_x or p'_x according as x ∈ [l/2, l − 1] or [N − 1 − l, N − 2 − l/2] respectively.

Proof. According to Definition 4, F_t ∪ B_{t'} ∪ {x} is a (t, t')-bisequence if the indices in F_t and B_{t'} and the index x are filters and the indices in ([0, l − 1] ∪ [N − 1 − l, N − 2]) \ (F_t ∪ B_{t'} ∪ {x}) are non-filters. Hence, the result follows from Propositions 1 and 2.

Definition 5 (Critical Filters). The last filter i_t within the first l/2 indices and the first filter i'_{t'} within the last l/2 indices of a (t, t')-bisequence are called the left critical and the right critical filters respectively. Together, they are called the critical filters.

Definition 6 (Favourable Bisequence). A (t, t')-bisequence is called d-favourable, d ≤ l/2, if the following seven conditions hold.
1. i_1 + 1 ≤ d.
2. i_{u+1} − i_u ≤ d, ∀u ∈ [1, t − 1].
3. l/2 − 1 − i_t ≤ d.
4. N − 1 − i'_1 ≤ d.
5. i'_v − i'_{v+1} ≤ d, ∀v ∈ [1, t' − 1].
6. i'_{t'} − (N − 1 − l/2) ≤ d.
7. i_{t+1} − i_t ≤ d or i'_{t'} − i'_{t'+1} ≤ d.
Let us interpret the conditions mentioned in Definition 6. For ease of reference, we introduce a dummy index −1 to the left of index 0. Conditions 1 and 4 ensure that the first filter to the right of index −1 and the first filter to the left of index N − 1 are at most at a distance d from indices −1 and N − 1 respectively. Conditions 2 and 5 ensure that the distance between any two consecutive filters to the left of the left critical filter and to the right of the right critical filter is at most d. Conditions 3 and 6 ensure that the left and right critical filters are at most at a distance d from indices l/2 − 1 and N − 1 − l/2 respectively. Consider the first filter i_{t+1} to the right of the left critical filter and the first filter i'_{t'+1} to the left of the right critical filter. For a (t, t')-bisequence, at least one of i_{t+1} and i'_{t'+1} must exist. Condition 7 ensures that whichever exists (and at least one of the two, if both exist) is located at most at a distance d from the corresponding critical filter.

Lemma 2. The number of distinct sequences of indices in [0, l/2 − 1] ∪ [N − 1 − l/2, N − 2] satisfying the conditions 1 through 6 of Definition 6 is

\sum_{\delta \le d, \delta' \le d} \; \sum_{t \le l/2 - \delta, \; t' \le l/2 - \delta'} c(\delta, t) c(\delta', t'),

where δ = l/2 − 1 − i_t, δ' = i'_{t'} − (N − 1 − l/2), and c(δ, t) is the coefficient of x^{l/2 − δ − t} in (1 + x + ... + x^{d−1})^t.

Proof. Let x_1 = i_1 and x_u = i_u − i_{u−1} − 1 for 2 ≤ u ≤ t be the numbers of non-filters between two consecutive filters in [0, i_t]. The total number of non-filters in the interval [0, i_t] is i_t − (t − 1) = (l/2 − 1 − δ) − (t − 1) = l/2 − δ − t. Thus, the number of distinct sequences of indices in [0, l/2 − 1] satisfying conditions 1, 2 and 3 of Definition 6 is the same as the number of non-negative integral solutions of x_1 + x_2 + ... + x_t = l/2 − δ − t, where 0 ≤ x_u ≤ d − 1, ∀u ∈ [1, t]. The number of solutions is given by c(δ, t). Similarly, the number of distinct sequences of indices in [N − 1 − l/2, N − 2] satisfying conditions 4, 5 and 6 of Definition 6 is c(δ', t'). Hence, the number of distinct sequences of indices in [0, l/2 − 1] ∪ [N − 1 − l/2, N − 2] satisfying the conditions 1 through 6 is \sum_{\delta \le d, \delta' \le d} \sum_{t \le l/2 - \delta, t' \le l/2 - \delta'} c(\delta, t) c(\delta', t').

Theorem 1. The probability of existence of a d-favourable (t, t')-bisequence in [0, l − 1] ∪ [N − 1 − l, N − 2] is

\pi_d = \sum_{t, t'} \sum_{F_t, B_{t'}} \prod_{i_u \in F_t} p_{i_u} \prod_{i_v \in [0, l/2-1] \setminus F_t} q_{i_v} \prod_{i'_u \in B_{t'}} p'_{i'_u} \prod_{i'_v \in [N-1-l/2, N-2] \setminus B_{t'}} q'_{i'_v} \Big( 1 - \prod_{y \in [l/2, i_t + d] \cup [i'_{t'} - d, N-2-l/2]} \tilde{q}_y \Big),
Corollary 1. If l ≤ 16 (i.e., the key length is small), then we have l l c(δ, t)pt q 2 −t c(δ , t )pt q 2 −t (1 − q d−δ q d−δ ), πd ≈ δ≤d,δ ≤d t≤ l −δ,t ≤ l −δ 2
where p =
2 l
l 2 −1
y=0
2
py , p =
2 l
N −2
y=N −1− 2l
py , q = 1 − p, q = 1 − p .
Proof. Approximating each py in the left half by the average p of the first 2l many py ’s and each py in the right half by the average p of the last 2l many py ’s, we get the result. The ranges of the variables in the summation account l for conditions 3 and 6 of Definition 6. The term c(δ, t)pt q 2 −t accounts for the l conditions 1 and 2, the term c(δ , t )pt q 2 −t accounts for the conditions 4 and 5, and the term (1 − q d−δ q d−δ ) accounts for the condition 7 of Definition 6. In Table 1, we compare the theoretical estimates of πd from Corollary 1 with the experimental values obtained by running the RC4 KSA with 10 million randomly generated secret keys of length 16 bytes. Table 1. Theoretical and experimental values of πd vs. d with l = 16 d 2 3 4 5 6 Theoretical 0.0055 0.1120 0.3674 0.6194 0.7939 Experimental 0.0052 0.1090 0.3626 0.6152 0.7909
The results indicate that our theoretical values match closely with the experimental values. Theorem 2. The number of distinct d-favourable (t, t )-bisequences containing exactly f ≤ 2l filters is
c(δ, t)c(δ , t )
δ≤d,δ ≤d t≤ l −δ,t ≤ l −δ 2
2d−δ−δ
s=1
2
2d − δ − δ l − 2d + δ + δ . s f − t − t − s
Proof. By Lemma 2, the number of distinct sequences of indices in [0, 2l − 1] ∪ l [N − 1 − 2 , N − 2] satisfying the conditions 1 through 6 of Definition 6 is δ≤d,δ ≤d t≤ 2l −δ,t ≤ 2l −δ c(δ, t)c(δ , t ). The justification for the two binomial coefficients with a sum over s is as follows. For condition 7 to be satisfied, at least one out of 2d − δ − δ indices in [ 2l , 2l − 1 + d − δ] ∪ [N − 1 − 2l − (d − δ ), N − 2 − 2l ] must be a filter. Let the number of filters in this interval be s ≥ 1. So the remaining f − t − t − s many filters must be amongst the remaining l − 2d + δ + δ many indices in [ 2l + d − δ, l − 1] ∪ [N − 1 − l, N − 2 − 2l − (d − δ )] Corollary 2. The number of distinct d-favourable (t, t )-bisequences containing at most F ≤ 2l filters is Cd,F = c(δ, t)c(δ , t ) δ≤d,δ ≤d t≤ l −δ,t ≤ l −δ 2
2
144
R. Basu et al.
2d−δ−δ
s=1
F −t−t −s l − 2d + δ + δ 2d − δ − δ . s r r=0
Proof. In addition to s ≥ 1 filters in [ 2l , 2l −1+d−δ]∪[N −1− 2l −(d−δ ), N −2− 2l ], we need r more filters in [ 2l + d − δ, l − 1] ∪ [N − 1 − l, N − 2 − 2l − (d − δ )], where t + t + s + r ≤ F . Corollary 3. The number of distinct d-favourable (t, t )-bisequences in [0, l − 1] ∪ [N − 1 − l, N − 2] (containing at most 2l filters) is c(δ, t)c(δ , t )(1 − 2−2d+δ+δ )2l . Cd,2l = δ≤d,δ ≤d t≤ l −δ,t ≤ l −δ 2
2
Proof. Substitute F = 2l in Corollary 2 and simplify. Alternatively, number of distinct sequences satisfying the conditions 1 the through 6 is δ,δ t,t c(δ, t)c(δ , t )2l and out of these the number of sequences violating the condition 7 is δ,δ t,t c(δ, t)c(δ , t )2l−2d+δ+δ . Subtracting, we get the result. Note that the term 2r stands for the number of ways in which a set of r indices may be a filter or a non-filter. As an illustration, we present Cd,F for different d and F values with l = 16 in Table 2. Table 2. Cd,F for different F and d values with l = 16 F 32 20 16 12 8 d = 2 227.81 227.31 224.72 218.11 23.58 d = 3 230.56 230.38 229.02 224.86 215.95 d = 4 231.47 231.35 230.38 227.17 220.14
Though we have assumed that l divides N in our analysis, the same idea works when l is not a factor of N . One only needs to map the indices from the right end of the permutation to the appropriate key bytes.
3
Application of Filter Sequences in Bidirectional Key Search
Here, we demonstrate how the theoretical framework of filter sequences developed in the previous section can be used to devise a novel key retrieval algorithm. Suppose that after the completion of the RC4 KSA, the final permutation SN and the final value jN of the index j is known. The update rule for KSA is jy+1 = jy + Sy [y] + K[y], 0 ≤ y ≤ N − 1. Since S0 is identity permutation and j0 = 0, we have j1 = K[0]. Thus, if index 0 is a filter, then K[0] is known. Again, if index N − 2 is a filter, then jN −1 is known and hence K[l − 1] = K[N − 1] = jN − jN −1 − SN −1 [N − 1] = jN − jN −1 − SN [jN ] is also known. Moreover, if two
On Some Sequences of the Secret Pseudo-random Index j
145
consecutive indices y − 1 and y are filters, 1 ≤ y ≤ N − 2, then jy and jy+1 are known and Sy [y] = y from the definition of filters. Thus, the correct value of the key byte K[y mod l] can be derived as K[y mod l] = jy+1 − jy − y. For full key recovery, apart from the knowledge of the final permutation SN and the final value jN of the index j, we also need to assume that a d-favourable (t, t )-bisequence exist. Given such a sequence, our algorithm does not require to guess more than d many key bytes at a time. For faster performance, we like to keep d small (typically, 4). Our basic strategy for full key recovery is as follows. First, we perform a partial key recovery using filters in the interval [0, M − 1] ∪ [N − 1 − M, N − 2], where l ≤ M ≤ 2l, such that the correct values of at least ncorr out of l key bytes are uniquely determined and at least nin out of ncorr are found from the filters in [0, l − 1] ∪ [N − 1 − l, N − 2]. Let P be the set of partially recovered key bytes, each of which is associated with a single value, namely, the correct value of itself. Next, we build a frequency table for the l − ncorr many unknown key bytes as follows. We guess one jy+1 from the 8 values −1 −1 −1 [y], SN [SN [[y]], SN [SN [y]], SN [SN [SN [y]]], SN [y], SN −1 −1 −1 −1 −1 −1 −1 [SN [SN [y]]], SN [SN [SN [SN [y]]]] and SN [SN [SN [SN [y]]]]. From two SN successive j values jy and jy+1 , 8 × 8 = 64 candidates for the key byte K[y] are obtained. This part is similar to the 6 × 6 = 36 candidate keys from up to −1 as in [1]. These candidates are three levels of nested indexing of SN and SN weighted according to their probabilities. We split the 256 possible values of each unknown key byte K[y] into 4 sets of size 64 each. Let Gy be the set of values obtained from the above frequency table and let H1,y , H2,y , H3,y be the other sets, called bad sets. For 0 ≤ y ≤ l − 1, let gy = P (K[y] ∈ Gy ). Since empirical results show that gy is almost the same for each y, we would be using the average l−1 g = 1l y=0 gy in place of each gy . After this, we perform a bidirectional search from both ends several times, each time with exactly one of the sets {Gy , H1,y , H2,y , H3,y } for each unknown key byte. We decide to use the bad sets for at most b of the unknown key bytes. The search algorithm, called BidirKeySearch, takes as inputs the final permutation SN , the final value jN , the partially recovered key set P, a set S of l−ncorr many candidate sets each of size 64 and a d-favourable (t, t )-bisequence B. For the left end, first we guess the key bytes K[0], . . . , K[i1 ] except those in P. Starting with S0 and j0 , we run the KSA until we obtain Si1 +1 and ji1 +1 . Since the correct value of ji1 +1 is known, the tuples leading to the correct ji1 +1 form a set Ti1 of candidate solutions for K[0 . . . i1 ]. These tuples are said to “pass the filter” i1 . Similarly, starting with ji1 +1 and each state Si1 +1 associated with each tuple in Ti1 , we guess K[i1 + 1], . . . , K[i2 ] except those in P. Thus, we obtain a set Ti2 of candidate solutions for K[0 . . . i2 ] passing the filter i2 , and so on until we reach a stage where we have a set Tit of candidate solutions for K[0 . . . it ] passing the left critical filter it . In the same manner, we start with SN and jN
146
R. Basu et al.
and run the KSA backwards until we reach a stage where we have a set Tit of candidate solutions for K[it + 1 . . . l − 1] passing the right critical filter it . Among the set of remaining key bytes L = {K[it + 1], . . . , K[it+1 ]} \ P on the left and the set of remaining key bytes R = {K[it +1 + 1], . . . , K[it ]} \ P on the right, some bytes are common. We first guess the smaller of these two sets and then guess only those key bytes from the larger set that are needed to complete the whole key. We can reduce the candidate keys by filtering the left half of the key using the right filters and the right half of the key using the left filters. In the BidirKeySearch algorithm description, i0 and i0 denote the two default filters −1 (a dummy filter) and N − 1 respectively. BidirKeySearch (SN , jN , P, S, B) Guess the Left Half-Key: 1. For u = 1 to t do 1.1. Iteratively guess {K[iu−1 + 1], . . . , K[iu ]} \ P.
1.2. Tiu ← {K[0], . . . , K[iu ]} : the tuple pass the filter iu . Guess the Right Half-Key: 2. For v =1 to t do 2.1. Iteratively guess {K[iv + 1], . . . , K[iv−1 ]} \ P.
2.2. Tiv ← {K[iv + 1], . . . , K[l − 1]} : the tuple pass the filter iv . Merge to Get the Full Key: 3. L ← {K[it + 1], . . . , K[it+1 ]} \ P and R ← {K[it +1 + 1], . . . , K[it ]} \ P. 4. If |L| < |R| then do 4.1. Guess L.
4.2. Tit+1 ← {K[0], . . . , K[it+1 ]} : the tuple pass the filter it+1 . 4.3. Guess R \ Tit+1 using the filter it +1 . 5. Else 5.1. Guess R.
5.2. Ti ← {K[it + 1], . . . , K[l − 1]} : the tuple pass the filter iv . t +1
5.3. Guess L \ Ti
t +1
using the filter it+1 .
Cross-Filtration: 6. If K[0 . . . m − 1] is guessed from the left filters and K[m . . . l − 1] is guessed from the right filters, then validate K[m . . . l − 1] using the left filters up to l − 1 and validate K[0 . . . m − 1] using the right filters up to N − 1 − l.
Let m1 be the complexity and α be the probability of obtaining a d-favourable (t, t ) bisequence along with the partially recovered key set P, satisfying the input requirements of BidirKeySearch. We require a search complexity of m2 = 2(M−l) for locating the ncorr − nin many correct key bytes from the filters in ncorr −nin [l, l+M −1]∪[N −1−l, N −2−M ]. We need to run the BidirKeySearch algorithm b n−ncorr r a total of m3 = 3 times and this gives a success probability r=0 r n−ncorr β = r=n−ncorr −b n−nrcorr g r (1−g)n−ncorr −r . If each run of the BidirKeySearch algorithm consumes τ time, the overall complexity for the full key recovery is m1 m2 m3 τ 28 and the success probability is αβ. The term 28 comes for guessing the correct value of jN .
For l = 16, d = 4, M = 20, n_corr = 6, n_in = 4, our experiments with 10 million randomly generated secret keys reveal that α ≈ 0.1537 and g ≈ 0.8928. Also, m_1 ≈ 2^31 (see Table 2) and m_2 ≈ 2^5. With b = 2, β and m_3 turn out to be 0.9169 and approximately 2^9, respectively. Thus, the success probability is αβ ≈ 0.1409 and the complexity is approximately 2^{31+5+9+8} τ = 2^53 τ. Though an exact estimate of τ is not feasible at this point, τ is expected to be small, since for wrong filter sequences the search is likely to terminate in a negligible amount of time. So far, the best known success probability for recovering a 16-byte secret key has been 0.0745 [1]. The time reported in [1] for this probability is 1572 seconds, but it is not clear how the complexity of [1] would grow with an increase in success probability. We achieve a better success probability (almost twice that of [1]), but with a very high time complexity, which is infeasible in practice unless further improvements are made.
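As a quick sanity check of these numbers, the following snippet evaluates the formulas above for the stated parameters, assuming the reconstructed expressions for m_2, m_3 and β and taking n = l = 16.

```python
# Quick numerical check of the reported parameters, assuming the formulas
# m2 = C(2(M-l), ncorr-nin), m3 = sum_{r<=b} C(n-ncorr, r)*3^r and the
# binomial expression for beta, with n = l = 16.
from math import comb, log2

l, M, ncorr, nin, b, g = 16, 20, 6, 4, 2, 0.8928
unknown = l - ncorr                      # key bytes not fixed by the filters

m2 = comb(2 * (M - l), ncorr - nin)
m3 = sum(comb(unknown, r) * 3**r for r in range(b + 1))
beta = sum(comb(unknown, r) * g**r * (1 - g)**(unknown - r)
           for r in range(unknown - b, unknown + 1))

print(f"m2   = {m2}  (~2^{log2(m2):.1f})")    # 28  (~2^4.8)
print(f"m3   = {m3} (~2^{log2(m3):.1f})")     # 436 (~2^8.8)
print(f"beta = {beta:.4f}")                   # ~0.9169, matching the text
```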
4 Conclusion
In this paper, we have performed a detailed theoretical study on sequences of j indices in RC4 KSA. We have also demonstrated an application of our sequence analysis in secret key recovery from the final permutation after the KSA. Though our key retrieval algorithm has good success probability, it has high time complexity. Currently, we are working on some more special sequences to reduce the complexity parts from those reported in Table 2 and thereby improve the overall time complexity for key recovery.
References 1. Akg¨ un, M., Kavak, P., Demirci, H.: New Results on the Key Scheduling Algorithm of RC4. In: Chowdhury, D.R., Rijmen, V. (eds.) INDOCRYPT 2008. LNCS, vol. 5365, pp. 40–52. Springer, Heidelberg (2008) 2. Biham, E., Carmeli, Y.: Efficient Reconstruction of RC4 Keys from Internal States. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086, pp. 270–288. Springer, Heidelberg (2008) 3. Khazaei, S., Meier, W.: On Reconstruction of RC4 Keys from Internal States. In: Calmet, J., Geiselmann, W., M¨ uller-Quade, J. (eds.) Mathematical Methods in Computer Science (MMICS). LNCS, vol. 5393, pp. 179–189. Springer, Heidelberg (2008) 4. Knudsen, L.R., Meier, W., Preneel, B., Rijmen, V., Verdoolaege, S.: Analysis Methods for (Alleged) RC4. In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998. LNCS, vol. 1514, pp. 327–341. Springer, Heidelberg (1998) 5. LAN/MAN Standard Committee. Wireless LAN medium access control (MAC) and physical layer (PHY) specifications, 1999 edition. IEEE standard 802.11 (1999) 6. Maitra, S., Paul, G.: New Form of Permutation Bias and Secret Key Leakage in Keystream Bytes of RC4. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086, pp. 253–269. Springer, Heidelberg (2008); A revised and extended version with the same title is available at the IACR Eprint Server, eprint.iacr.org, number 2007/261 (January 9, 2009)
7. Mantin, I.: Analysis of the stream cipher RC4. Master’s Thesis, The Weizmann Institute of Science, Israel (2001) 8. Maximov, A., Khovratovich, D.: New State Recovering Attack on RC4. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 297–316. Springer, Heidelberg (2008) 9. McKague, M.E.: Design and Analysis of RC4-like Stream Ciphers. Master’s Thesis, University of Waterloo, Canada (2005) 10. Paul, G., Maitra, S.: Permutation after RC4 Key Scheduling Reveals the Secret Key. In: Adams, C., Miri, A., Wiener, M. (eds.) SAC 2007. LNCS, vol. 4876, pp. 360–377. Springer, Heidelberg (2007) 11. Paul, G., Maitra, S.: RC4 State Information at Any Stage Reveals the Secret Key. IACR Eprint Server, eprint.iacr.org, number 2007/2008 (January 9, 2009); This is an extended version of [10] 12. Roos, A.: A class of weak keys in the RC4 stream cipher. Two posts in sci.crypt, message-id
[email protected] and
[email protected] (1995) 13. Tews, E.: Attacks on the WEP protocol. IACR Eprint Server, eprint.iacr.org, number 2007/471, December 15 (2007) 14. Tomasevic, V., Bojanic, S., Nieto-Taladriz, O.: Finding an internal state of RC4 stream cipher. Information Sciences 177, 1715–1727 (2007)
Very-Efficient Anonymous Password-Authenticated Key Exchange and Its Extensions SeongHan Shin1,2 , Kazukuni Kobara1,2, and Hideki Imai2,1 1 Research Center for Information Security (RCIS), National Institute of Advanced Industrial Science and Technology (AIST), 1-18-13, Sotokanda, Chiyoda-ku, Tokyo, 101-0021 Japan
[email protected] 2 Chuo University, 1-13-27, Kasuga, Bunkyo-ku, Tokyo, 112-8551 Japan
Abstract. An anonymous password-authenticated key exchange (anonymous PAKE) protocol is designed to provide both password-only authentication and user anonymity. In this paper, we propose a veryefficient anonymous PAKE (called, VEAP) protocol that provides the most efficiency among their kinds in terms of computation and communication costs. The VEAP protocol guarantees semantic security of session keys in the random oracle model under the chosen target CDH problem, and unconditional user anonymity against a semi-honest server. If the pre-computation is allowed, the computation cost of the VEAP protocol is the same as the well-known Diffie-Hellman protocol! In addition, we extend the VEAP protocol in two ways.
1 Introduction
Since the Diffie-Hellman protocol [8], many researchers have tried to design secure cryptographic algorithms/protocols for realizing secure channels. These algorithms/protocols are necessary because application-oriented protocols (e.g., web-mail, Internet banking/shopping, ftp) are frequently developed assuming the existence of such secure channels. The latter can be achieved by an authenticated key exchange (AKE) protocol, at the end of which the involved parties authenticate each other and share a common session key to be used for subsequent cryptographic algorithms. For authentication, human-memorable passwords are commonly used rather than high-entropy keys because of their convenience. Many password-based AKE protocols (see [10]) have been extensively investigated so far. However, one should be very careful about two major attacks on passwords: on-line and off-line dictionary attacks. While on-line attacks are applicable to all password-based protocols equally, they can be prevented by having the server take appropriate countermeasures (e.g., locking up accounts after consecutive password failures). But we cannot avoid off-line attacks by such countermeasures, mainly because these attacks can be performed off-line and independently of the attacked party.
1.1 Anonymous Password-Authenticated Key Exchange
A password-authenticated key exchange (PAKE) protocol allows a user, who remembers his/her password only, to authenticate with the counterpart server that holds the password or its verification data. In PAKE protocols (see [9]), a user should send his/her identity clearly in order to share an authenticated session key. Let us suppose an adversary who fully controls the networks. Though the adversary cannot impersonate any party in PAKE protocols, it is easy to collect a user’s personal information about the communication history itself. For this problem, Viet et al., [15] proposed an anonymous PAKE protocol and its threshold construction1 both of which simply combine a PAKE protocol [1] with an oblivious transfer (OT) protocol [7,14] for user’s anonymity. The user anonymity is guaranteed against an outside adversary as well as a passive server, who follows the protocol honestly. Later, Shin et al., [13] proposed an anonymous PAKE protocol that is only based on the PAKE protocol [1], and showed that their protocol significantly improved the efficiency compared to [15]. Very recently, Yang and Zhang [16] proposed a new anonymous PAKE protocol that is based on a different PAKE (i.e., SPEKE [11,12]) protocol. The main idea of [16] is that the user and the server share a Difffie-Hellman-like key, by using the SPEKE protocol [11,12], and then run a sequential Diffie-Hellman protocol partially-masked with the shared key. To our best knowledge, all the anonymous PAKE protocols are constructed from PAKE protocols and, among them, the construction of [16] seems the most efficient in terms of computation and communication costs. 1.2
Our Contributions
In this paper, we propose a very-efficient anonymous PAKE (called, VEAP) protocol whose core part is based on the blind signature scheme [5]. The VEAP protocol guarantees semantic security of session keys in the random oracle model under the chosen target CDH problem, and unconditional user anonymity against a semi-honest server. In the VEAP protocol, the user and the server are required to compute only one modular exponentiation, respectively, if the pre-computation is allowed. Surprisingly, this is the same computation cost of the well-known DiffieHellman protocol [8] which does not provide authentication at all. Also, we extend the VEAP protocol in two ways: the first is designed to reduce the communication cost of the VEAP protocol and the second shows that stripping off anonymity parts from the VEAP protocol results in a new PAKE protocol. Organization. In the next section, we explain some preliminaries to be used throughout this paper. In Section 3, we propose a very-efficient anonymous PAKE (called, VEAP) protocol with security proof and efficiency comparison. In Section 4, we extend the VEAP protocol in two ways. 1
In their threshold construction, the ”threshold” number of users should collaborate one another in order to be authenticated by the server. In this paper, we only focus on an anonymous PAKE protocol where the threshold t = 1.
2 Preliminary
2.1 Notation
Here, we explain some notation. Let G be a finite, cyclic group of prime order p and let g be a generator of G. The parameter (G, p, g) is public to everyone. In the following, all arithmetic operations are performed modulo p unless otherwise stated. Let l and κ be the security parameters for hash functions (i.e., the size of the hashed value) and a symmetric-key encryption scheme (i.e., the length of the key), respectively. Let {0, 1}* be the set of finite binary strings and {0, 1}^l be the set of binary strings of length l. Let "‖" denote the concatenation of bit strings. Let us denote G* = G \ {1}, where 1 is the identity element of G. The hash function G is a full-domain hash (FDH) that maps {0, 1}* to the elements of G*. While F : {0, 1}* → {0, 1}^κ, the other hash functions are denoted H_k : {0, 1}* → {0, 1}^{l_k}, for k = 1, 2, 3. Let U = {U_1, · · · , U_n} and S be the identities of a group of n users and the server, respectively.
2.2 Formal Model
In this subsection, we introduce an extended model, built upon [4,2], and security notions for anonymous PAKE. Model. In an anonymous PAKE protocol P , there are two parties Ui (∈ U ) and S where a pair of user Ui and server S share a low-entropy password pwi , chosen from a small dictionary Dpassword , for i (1 ≤ i ≤ n). Each of Ui and S may have several instances, called oracles involved in distinct, possibly concurrent, executions of P . We denote Ui (resp., S) instances by Uiμ (resp., S ν ) where μ, ν ∈ N, or by I in case of any instance. During the protocol execution, an adversary A has the entire control of networks whose capability can be represented as follows: – Execute(Uiμ , S ν ): This query models passive attacks, where the adversary gets access to honest executions of P between Uiμ and S ν . – Send(I, m): This query models active attacks by having A send a message m to an instance I. The adversary A gets back the response I generates in processing m according to the protocol P . – Reveal(I): This query handles misuse of the session key by any instance I. The query is only available to A, if the instance actually holds a session key, and at that case the key is released to A. – RegisterUser(Uj , pwj ): This query handles inside attacks by having A register a user Uj to server S with a password pwj (i.e., Uj ∈ U ). – Test(I): This query is used to measure how much the adversary can obtain information about the session key. The Test-query can be asked at most once by the adversary A and is only available to A if the instance I is fresh 2 . 2
We consider an instance I that has accepted (holding a session key SK). Let I be a partnered instance of I (refer to [4]). The instance I is fresh unless the following conditions are satisfied: (1) either Reveal(I) or Reveal(I ) is asked by A at some point; or (2) RegisterUser(I, ) is asked by A when I = Ui for any i.
This query is answered as follows: one flips a private coin b ∈ {0, 1}, and forwards the corresponding session key SK (Reveal(I) would output), if b = 1, or a random value with the same size except the session key, if b = 0. Security Notions. The adversary A is provided with random coin tosses, some oracles and then is allowed to invoke any number of queries as described above, in any order. The AKE security is defined by the game Gameake (A, P ) where the goal of the adversary is to guess the bit b, involved in the Test-query, by outputting this guess b . We denote the AKE advantage, by Advake P (A) = 2 Pr[b = b ] − 1, as the probability that A can correctly guess the value of b. The protocol P is said to be (t, ε)-AKE-secure if A’s advantage is smaller than ε for any adversary A running time t. As [15], we consider a semi-honest server S, who honestly follows the protocol P , but it is curious about the involved user’s identity. The user anonymity is defined by the probability distribution of messages in P . Let P (Ui , S) (resp., P (Uj , S)) be the transcript of P between Ui (resp., Uj ) and S. We can say that the protocol P is anonymous if, for any two users Ui , Uj ∈ U , Dist[P (Ui , S)] = Dist[P (Uj , S)] where Dist[c] denotes c’s probability distribution. This security notion means that the server S gets no information about the user’s identity. 2.3
Computational Assumptions
Chosen Target CDH (CT-CDH) Problem [3,5]. Let G = ⟨g⟩ be a finite cyclic group of prime order p with g as a generator. Let x be a random element of Z_p and let X ≡ g^x. A (t, ε)-CT-CDH_{g,G} attacker is a probabilistic polynomial time (PPT) machine B that is given X and has access to the target oracle T_G, returning random points W_i in G*, as well as the helper oracle (·)^x. Let q_T (resp., q_H) be the number of queries B made to the target (resp., helper) oracle. The success probability Succ^{ct-cdh}_{g,G}(B) of B attacking the chosen-target CDH problem is defined as the probability of outputting a set of m pairs ((j_1, K_1), · · · , (j_m, K_m)) where, for all i (1 ≤ i ≤ m), there exists j_i (1 ≤ j_i ≤ q_T) such that K_i ≡ W_{j_i}^x, all K_i are distinct, and q_H < q_T. We denote by Succ^{ct-cdh}_{g,G}(t) the maximal success probability over all adversaries running within time t. The CT-CDH assumption states that Succ^{ct-cdh}_{g,G}(t) ≤ ε for any t/ε not too large. Note that, if B makes one query to the target oracle, then the chosen-target CDH assumption is equivalent to the computational Diffie-Hellman assumption.
A Symmetric-Key Encryption Scheme. A symmetric-key encryption scheme SE consists of the following two algorithms (E, D), with a symmetric key K uniformly distributed in {0, 1}^κ: (1) given a message M and a key K, E produces a ciphertext C = E_K(M); and (2) given a ciphertext C and a key K, D recovers a message M = D_K(C). The semantic security of SE means that it is infeasible for an adversary to distinguish the encryptions of two messages M_0 and M_1 (of the same length), even though the adversary has access to the en/decryption oracles. We denote by Succ^{ss}_{SE}(t, q) the maximal success probability of a distinguisher running within time t and making at most q queries to the E and D oracles.
[Figure 1 shows the VEAP protocol between user U_i (holding pw_i) and server S (holding (U_j, pw_j), 1 ≤ j ≤ n): U_i sends (U, A); S replies with (S, X, A^x, {C_j}_{1≤j≤n}, V_S); U_i returns V_{U_i}. The individual computations are detailed in Section 3.1.]
Fig. 1. The VEAP protocol, where trans = A‖A^x‖X‖{C_j}_{1≤j≤n}
3 A Very-Efficient Anonymous PAKE Protocol
In this section, we propose a very-efficient anonymous PAKE (for short, VEAP) protocol that is the most efficient of its kind in terms of computation and communication costs. We also show that the VEAP protocol guarantees not only AKE security against an active adversary but also unconditional user anonymity against a semi-honest server, who honestly follows the protocol.
3.1 The VEAP Protocol
We assume that each user U_i of the group U = {U_1, · · · , U_n} has registered his/her password pw_i with server S. For simplicity, we assign the users consecutive integers i (1 ≤ i ≤ n) so that U_i can be regarded as the i-th user of U. In the VEAP protocol, any user U_i can share an authenticated session key with server S anonymously (see Fig. 1). Step 0 (Pre-computation of server S): At first, server S chooses a random number x from Z_p and a random master secret MS from {0, 1}^l, and computes its Diffie-Hellman public value X ≡ g^x. Then, the server repeats the following for all users U_j (1 ≤ j ≤ n): 1) it computes W_j ← G(U_j, pw_j); 2) it computes K_j ≡ (W_j)^x and derives a symmetric key K_j ← F(U_j, X, W_j, K_j); and 3) it generates a ciphertext C_j = E_{K_j}(MS) for the master secret MS. Step 1: The user U_i chooses a random number a from Z_p and computes W_i ← G(U_i, pw_i). The W_i is masked with g^a so that the resultant value
A is computed as A ≡ W_i × g^a. The user sends A and the group U of users' identities to server S. Step 2: For the received value A, server S computes A^x with the exponent x. Also, the server generates an authenticator V_S ← H_1(U‖S‖trans‖MS). Then, server S sends its identity S, X, A^x, {C_j}_{1≤j≤n} and V_S to user U_i. Step 3: After receiving the message from server S, user U_i first computes K_i ≡ A^x/X^a and K_i ← F(U_i, X, W_i, K_i). With K_i, the user decrypts the i-th ciphertext C_i as MS = D_{K_i}(C_i). If the received V_S is not valid, user U_i terminates the protocol. Otherwise, the user generates an authenticator V_{U_i} ← H_2(U‖S‖trans‖MS) and a session key SK ← H_3(U‖S‖trans‖MS). The authenticator V_{U_i} is sent to server S. Step 4: If the received V_{U_i} is not valid, server S terminates the protocol. Otherwise, the server generates a session key SK ← H_3(U‖S‖trans‖MS).
Rationale. Unlike [13,15,16], the novelty of the VEAP protocol is that it does not use the existing PAKE protocols [9,10] at all. In order to provide the most efficiency, the VEAP protocol has the following rationale: 1) Instead of unmasking A, server S sends X and A^x, which can be used to derive K_i if user U_i has computed W_i with the correct U_i and pw_i. Importantly, this procedure makes it possible for server S to pre-compute K_j for all users U_j (1 ≤ j ≤ n); 2) Server S generates only one Diffie-Hellman public value X, and its exponent is used to compute all of the K_j's; 3) Server S sends {C_j}_{1≤j≤n} by encrypting the master secret MS with K_j. This is enough to guarantee unconditional user anonymity against a semi-honest server; and 4) Both U_i and pw_i are used to compute the verification data W_i. This is crucial since an adversary is forced to mount an on-line dictionary attack on a specific user, not the others.
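The following toy Python sketch walks through the VEAP message flow under several simplifying assumptions: a small multiplicative group modulo a 64-bit prime stands in for G, SHA-256 stands in for G, F and H_1–H_3, a hash-based XOR pad stands in for (E, D), and the transcript hashed into the authenticators is abridged; none of these choices are the paper's concrete instantiations.

```python
# Toy walk-through of the VEAP flow (illustrative only; not secure parameters).
import hashlib, secrets

p = 0xFFFFFFFFFFFFFFC5            # toy prime modulus (2^64 - 59)
q = p - 1                         # exponents are taken mod p - 1 here
g = 5

def h(*parts):                    # generic hash standing in for G, F, H1-H3
    m = hashlib.sha256()
    for x in parts:
        m.update(str(x).encode() + b"|")
    return m.digest()

def G_hash(U, pw):                # W = G(U, pw), mapped into the group
    return pow(g, int.from_bytes(h("G", U, pw), "big") % q, p)

def enc(key, ms):                 # XOR pad standing in for E; dec is identical
    return bytes(a ^ b for a, b in zip(ms, h("E", key)))
dec = enc

users = {"U1": "pw1", "U2": "pw2", "U3": "pw3"}

# Server pre-computation: X, MS and one ciphertext C_j per user.
x = secrets.randbelow(q); X = pow(g, x, p)
MS = secrets.token_bytes(32)
C = {U: enc(h("F", U, X, G_hash(U, pw), pow(G_hash(U, pw), x, p)), MS)
     for U, pw in users.items()}

# User U_i: mask W_i with g^a and send A.
Ui, pwi = "U2", "pw2"
a = secrets.randbelow(q)
A = (G_hash(Ui, pwi) * pow(g, a, p)) % p

# Server: reply with X, A^x, {C_j} and V_S.
Ax = pow(A, x, p)
VS = h("H1", sorted(users), "S", A, Ax, X, MS)

# User: K_i = A^x / X^a, recover MS, check V_S, derive the session key.
Ki = (Ax * pow(pow(X, a, p), -1, p)) % p           # equals W_i^x
MSp = dec(h("F", Ui, X, G_hash(Ui, pwi), Ki), C[Ui])
assert h("H1", sorted(users), "S", A, Ax, X, MSp) == VS
assert h("H3", sorted(users), "S", A, Ax, X, MSp) == \
       h("H3", sorted(users), "S", A, Ax, X, MS)   # both sides share SK
```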
3.2 Security
In this subsection, we prove that the VEAP protocol is AKE-secure under the chosen target CDH problem in the random oracle model [6] and provides unconditional user anonymity against a semi-honest server.
Theorem 1 (AKE Security). Let P be the VEAP protocol where passwords are independently chosen from a dictionary of size N. For any adversary A within a polynomial time t, with less than q_send active interactions with the parties (Send-queries), q_execute passive eavesdroppings (Execute-queries) and asking q_hashF, q_hashG, q_hashH hash queries to F, G, and any H_k, for k = 1, 2, 3, respectively,

Adv^{ake}_P(A) ≤ 2n·q_sendS × Succ^{ss}_{SE}(t_1, q_1) + 16·q_sendS × Succ^{ct-cdh}_{g,G}(t_2 + q_hashG · τ_e)
  + 2q_sendU/N + 2q_sendS/|G| + q^2_hashF/2^κ + q^2_hashH/2^l + 6(q_sendU + q_sendS)/2^{l_1} + 2Q^2/2^{l_2}    (1)
where (1) qsendU (resp., qsendS ) is the number of Send-queries to Ui (resp., S) instance, (2) Q = qexecute + qsend and qsend ≤ qsendU + qsendS , (3) l1 and l2 are the output sizes of hash function H1 and H2 , respectively, and (4) τe denotes the computational time for an exponentiation in G.
The main proof strategy is as follows. In order to ”embed” X of the CT-CDH problem, the simulator selects a random value s in the set {1, · · · , qsendS } where qsendS is the number of Send-queries to S instance. If this is the s-th instance of party S, the simulator uses the X and forwards A to the helper oracle (·)x , and then returns the received Ax . Let qT1 and qT2 be the first and the second G-query, respectively. The simulator forwards qT1 and qT2 to the target oracle TG and returns the received WT1 and WT2 , respectively. For u (3 ≤ u ≤ hashG) in the {qT3 , · · · , qThashG } where hashG is the number of G-queries, the simulator chooses a random number wu from Zp and returns WTu ≡ WTw1u , if u ≡ 1 mod 2, or WTu ≡ WTw2u , if u ≡ 0 mod 2. That is, we simulate G oracle by answering with a randomized WT1 , for half of G-queries, and WT2 , for the remaining queries. Note that qT = 2 and qH = 1. Whenever there is a collision on Wji , the simulator can find the solution to the CT-CDH problem with the probability 1/2qsendS . Theorem 2 (Anonymity). The VEAP protocol provides unconditional user anonymity against a semi-honest server. Proof. Consider server S who honestly follows the VEAP protocol. It is obvious that server S cannot get any information about the user’s identity since the A has a unique discrete logarithm of g and, with the randomly-chosen number a, it is the uniform distribution over G∗ . This also implies that the server cannot distinguish Ai (of user Ui ) from Aj (of Uj=i ) because they are completely independent each other. In addition, even if server S receives the authenticator VUi the A does not reveal any information about the user’s identity since the probability, for all users, to get MS is equal. Therefore, Dist[P (Ui , S)] = Dist[P (Uj , S)] for any two users {Ui , Uj } ∈ U . Remark 1. In [16], they discussed user anonymity against a malicious server who does not follow the protocol. However, the NAPAKE protocol [16] does not provide user anonymity against such type of attacks because, if there are only two users in the group, the malicious server can always determine which user is the one, with probability 1/2, by using different random values for the users.
3.3 Efficiency Comparison
This subsection shows efficiency comparison of the VEAP protocol and the previous anonymous PAKE protocols [15,13,16] (see Table 1). With respect to computation costs, we count the number of modular exponentiations of user Ui and server S. In Table 1, ”Total” means the total number of modular exponentiations and ”T−P” is the remaining number of modular exponentiations after excluding those that are pre-computable. With respect to communication costs, we measure the bit-length of messages where | · | indicate the bit-length. As for computation cost of the VEAP protocol, user Ui (resp., server S) is required to compute 2 (resp., n + 2) modular exponentiations. When precomputation is allowed, the remaining costs of user Ui (resp., server S) are only 1 (resp., 1) modular exponentiation. Note that this is the same computation cost
Table 1. Efficiency comparison of anonymous PAKE protocols in terms of computation and communication costs, where n is the number of users

Protocols        | User U_i: Total / T−P | Server S: Total / T−P | Communication costs *1
APAKE [15]       | 6 / 4                 | 4n + 2 / 3n + 1       | (n + 2)|p| + (n + 1)|H|
TAP [13]         | 3 / 2                 | n + 1 / n             | 2|p| + (n + 1)|H|
NAPAKE [16] *2   | 4 / 3                 | n + 3 / 2             | (n + 3)|p| + |H|
VEAP             | 2 / 1                 | n + 2 / 1             | 3|p| + 2|H| + n|E|

*1: The bit-length of identities is excluded.
*2: In [16], the efficiency of the NAPAKE protocol is estimated incorrectly.
of the Diffie-Hellman protocol [8]. One can easily see that the VEAP protocol is the most efficient in the number of modular exponentiations for both user and server. As for communication cost, the VEAP protocol requires a bandwidth of (3|p| + 2|H| + n|E|)-bits, except the length of identities U and S, where the bandwidth for |H| and the modulus size |p| are independent from the number of users. If we consider the minimum security parameters (|p| = 1024, |H| = 160 and |E| = 128), the gap of communication costs between the VEAP protocol and the others [15,13,16] becomes larger as the number of users increases.
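The communication-cost formulas of Table 1 can be compared directly for the minimum parameters quoted above; the following snippet (an illustrative calculation, not part of the paper) prints the bandwidth of each protocol for a few group sizes n.

```python
# Communication cost (in bits) of each protocol as a function of the number of
# users n, using the Table 1 formulas with |p| = 1024, |H| = 160, |E| = 128.
P, H, E = 1024, 160, 128

costs = {
    "APAKE [15]":  lambda n: (n + 2) * P + (n + 1) * H,
    "TAP [13]":    lambda n: 2 * P + (n + 1) * H,
    "NAPAKE [16]": lambda n: (n + 3) * P + H,
    "VEAP":        lambda n: 3 * P + 2 * H + n * E,
}

for n in (10, 100, 1000):
    print(f"n = {n}:")
    for name, cost in costs.items():
        print(f"  {name:12s} {cost(n):>9d} bits")
```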
4 Its Extensions
4.1 Reducing Communication Cost of the VEAP Protocol
In this subsection, we show how to reduce the communication cost of the VEAP protocol, such that it is independent of the number of users, at the expense of slight increase of the computation cost (see Fig. 2). The main difference from the VEAP protocol is that server S fixes the values X and {Cj }1≤j≤n for a time period t. This obviously reduces the communication cost to be independent of the number of users. Instead, user Ui has to read a necessary information from server S’s public bulletin board. Note that, if an adversary changes these values, this extended protocol results in ”failure”. However, one should be careful about the following problems: 1) session key privacy against other legitimate users; and 2) forward secrecy. In fact, the values B x ≡ X b and B y ≡ Y b in the hash functions play an important role to solve the above problems in the extended protocol. 4.2
A New PAKE Protocol
Here, we show that stripping off the anonymity parts from the VEAP protocol yields a new kind of PAKE protocol (see Fig. 3). To the best of our knowledge, this PAKE protocol is the first construction built from the blind signature scheme [5].
[Figure 2 shows the extended VEAP protocol: server S publishes the temporarily-fixed values X and {C_j}_{1≤j≤n} (with their posting time and validity period t) on its public bulletin board; user U_i reads them, picks (a, b) and sends (U, A, B) with A ≡ W_i × g^a and B ≡ g^b; server S picks y, computes Y ≡ g^y, A^x, B^x and B^y and replies with (S, A^x, Y, V_S); the parties then exchange V_{U_i} and derive SK as in the VEAP protocol, with X^b ≡ B^x and Y^b ≡ B^y additionally hashed into the authenticators and the session key.]
Fig. 2. An extended VEAP protocol, where trans = U‖S‖A‖A^x‖X‖B‖Y‖{C_j}_{1≤j≤n}
[Figure 3 shows the new PAKE protocol derived from the VEAP protocol: user U_i sends (U_i, A) with A ≡ W_i × g^a; server S computes W_i ← G(U_i, pw_i), K_i ≡ (W_i)^x and replies with (S, X, A^x, V_S), where V_S ← H_1(U_i‖trans‖W_i‖K_i); the user computes K_i ≡ A^x/X^a, checks V_S, returns V_{U_i} ← H_2(U_i‖trans‖W_i‖K_i), and both parties derive SK ← H_3(U_i‖trans‖W_i‖K_i).]
Fig. 3. A new PAKE protocol from the VEAP protocol, where trans = S‖A‖A^x‖X
Acknowledgements We appreciate the helpful comments of anonymous reviewers.
References 1. Abdalla, M., Pointcheval, D.: Simple Password-Based Encrypted Key Exchange Protocols. In: Menezes, A. (ed.) CT-RSA 2005. LNCS, vol. 3376, pp. 191–208. Springer, Heidelberg (2005) 2. Bresson, E., Chevassut, O., Pointcheval, D.: New Security Results on Encrypted Key Exchange. In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 145–158. Springer, Heidelberg (2004) 3. Bellare, M., Namprempre, C., Pointcheval, D., Semanko, M.: The One-More-RSAInversion Problems and the Security of Chaum’s Blind Signature Scheme. Journal of Cryptology 16(3), 185–215 (2003) 4. Bellare, M., Pointcheval, D., Rogaway, P.: Authenticated Key Exchange Secure against Dictionary Attacks. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 139–155. Springer, Heidelberg (2000) 5. Boldyreva, A.: Threshold Signatures, Multisignatures and Blind Signatures Based on the Gap-Diffie-Hellman-Group Signature Scheme. In: Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp. 31–46. Springer, Heidelberg (2002) 6. Bellare, M., Rogaway, P.: Random Oracles are Practical: A Paradigm for Designing Efficient Protocols. In: Proc. of ACM CCS 1993, pp. 62–73. ACM Press, New York (1993) 7. Chu, C.K., Tzeng, W.G.: Efficient k-Out-of-n Oblivious Transfer Schemes with Adaptive and Non-adaptive Queries. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 172–183. Springer, Heidelberg (2005) 8. Diffie, W., Hellman, M.: New Directions in Cryptography. IEEE Transactions on Information Theory IT-22(6), 644–654 (1976) 9. IEEE P1363.2: Password-Based Public-Key Cryptography, http://grouper.ieee.org/groups/1363/passwdPK/submissions.html 10. http://www.jablon.org/passwordlinks.html 11. Jablon, D.: Strong Password-Only Authenticated Key Exchange. ACM Computer Communication Review 26(5), 5–20 (1996) 12. Jablon, D.: Extended Password Key Exchange Protocols Immune to Dictionary Attacks. In: WET-ICE 1997 Workshop on Enterprise Security (1997) 13. Shin, S.H., Kobara, K., Imai, H.: A Secure Threshold Anonymous PasswordAuthenticated Key Exchange Protocol. In: Miyaji, A., Kikuchi, H., Rannenberg, K. (eds.) IWSEC 2007. LNCS, vol. 4752, pp. 444–458. Springer, Heidelberg (2007) 14. Tzeng, W.G.: Efficient 1-Out-n Oblivious Transfer Schemes. In: Naccache, D., Paillier, P. (eds.) PKC 2002. LNCS, vol. 2274, pp. 159–171. Springer, Heidelberg (2002) 15. Viet, D.Q., Yamamura, A., Tanaka, H.: Anonymous Password-Based Authenticated Key Exchange. In: Maitra, S., Veni Madhavan, C.E., Venkatesan, R. (eds.) INDOCRYPT 2005. LNCS, vol. 3797, pp. 244–257. Springer, Heidelberg (2005) 16. Yang, J., Zhang, Z.: A New Anonymous Password-Based Authenticated Key Exchange Protocol. In: Chowdhury, D.R., Rijmen, V., Das, A. (eds.) INDOCRYPT 2008. LNCS, vol. 5365, pp. 200–212. Springer, Heidelberg (2008)
Efficient Constructions of Deterministic Encryption from Hybrid Encryption and Code-Based PKE Yang Cui1,2 , Kirill Morozov1, Kazukuni Kobara1,2, and Hideki Imai1,2 1 Research Center for Information Security (RCIS), National Institute of Advanced Industrial Science & Technology (AIST), Japan {y-cui,kirill.morozov,k-kobara,h-imai}@aist.go.jp http://www.rcis.aist.go.jp/ 2 Chuo University, Japan
Abstract. We build on the new security notion for deterministic encryption (PRIV) and the PRIV-secure schemes presented by Bellare et al at Crypto’07. Our work introduces: 1) A generic and efficient construction of deterministic length-preserving hybrid encryption, which is an improvement on the scheme sketched in the above paper; to our best knowledge, this is the first example of length-preserving hybrid encryption; 2) postquantum deterministic encryption (using the IND-CPA variant of code-based McEliece PKE) which enjoys a simplified construction, where the public key is re-used as a hash function. Keywords: Deterministic encryption, hybrid encryption, code-based encryption, searchable encryption, database security.
1 Introduction
Background. The notion of security against privacy adversary (denoted as PRIV) for deterministic encryption was pioneered by Bellare et al [2] featuring an upgrade from the standard onewayness property. Instead of not leaking the whole plaintext, the ciphertext was demanded to leak, roughly speaking, no more than the plaintext statistics does. In other words, the PRIV-security definition (formulated in a manner similar to the semantic security definition of [7]) requires that a ciphertext must be essentially useless for adversary who is to compute some predicate on the corresponding plaintext. Achieving PRIVsecurity demands two important assumptions: 1) the plaintext space must be large enough and have a smooth (i.e. high min-entropy) distribution; 2) the plaintext and the predicate are independent of the public key. Constructions satisfying two flavors of PRIV-security are presented in [2]: against chosen-plaintext (CPA) and chosen-ciphertext (CCA) attacks. The following three PRIV-CPA constructions are introduced in the random oracle (RO) model. The generic Encrypt-with-Hash (EwH) primitive features replacing of the coins used by the randomized encryption scheme with a hash of the public key concatenated with the message. The RSA deterministic OAEP (RSADOAEP) scheme provides us with length-preserving deterministic encryption. M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 159–168, 2009. c Springer-Verlag Berlin Heidelberg 2009
In the generic Encrypt-and-Hash (EaH) primitive, a ”tag” in the form of the plaintext’s hash is attached to the ciphertext of a randomized encryption scheme. These results were extended by Boldyreva et al [4] and Bellare et al [3] presenting new extended definitions, proving relations between them, and introducing, among others, new constructions without random oracles. Applications. The original motivation for this research comes from the demand on efficiently searchable encryption (ESE) in the database applications. Lengthpreserving schemes can also be used for encryption of legacy code and in the bandwidth-limited systems. Some more applications (although irrelevant to our work) to improving randomized encryption schemes were studied in [4, Sec. 8]. Motivation. The work [2, Sec. 5] sketches a method for encrypting long messages, but it is less efficient compared to the standard hybrid encryption, besides it is conjectured not to be length-preserving. Also, possible emerging of quantum computers raises demands for postquantum deterministic encryption schemes. Our Contribution. In the random oracle model, we present a generic and efficient construction of length-preserving deterministic hybrid encryption. In a nutshell, we prove that the session key can be computed by concatenating the public key with the first message block and inputting the result into key derivation function. This is a kind of re-using the (sufficient) entropy of message, and it is secure due to the assumption that the message is high-entropy and independent of the key. Meanwhile, Bellare et al. employ the hybrid encryption in a conventional way, which first encrypts a random session key to further encrypt the data, obviously losing the length-preserving property. Hence, we show that the claim of Bellare et al [2, Sec. 5]: “However, if using hybrid encryption, RSADOAEP would no longer be length-preserving (since an encrypted symmetric key would need to be included with the ciphertext)” is overly pessimistic. To our best knowledge, this is the first example of length-preserving hybrid encryption. For achieving postquantum deterministic encryption, we propose to plug in an IND-CPA secure variant [10] of the coding theory based (or code-based) McEliece PKE [9] into the generic constructions EaH and EwH, presented in [2, Sec. 5]. The McEliece PKE is believed to be resistant to quantum attacks, besides it has very fast encryption algorithm. Moreover, we point out a significant simplification: the public key (which is a generating matrix of some linear code) can be re-used as hash function. Related Work. The deterministic hybrid encryption scheme is based on the same principle as the RSA-DOAEP scheme of [2, Sec. 5], we just fill the gap which was overlooked there. Organization. The paper will be organized in the following way: Sec.2 provides the security definitions of deterministic encryption. Sec.3 gives the proposed generic and efficient construction of deterministic hybrid encryption, which leads to the first length-preserving construction, immediately. In Sec.4, we will provide deterministic encryption from the code-based PKE, which is postquantum secure and efficient due to the good property of the underlying PKE scheme. Next, in Sec.5, we further discuss how to extend the PRIV security to the chosenciphertext attack (CCA) scenario.
2 Preliminaries
Denote by |x| the cardinality of x. Denote by x̂ a vector and by x̂[i] the i-th component of x̂ (1 ≤ i ≤ |x̂|). Write x̂‖ŷ for the concatenation of vectors x̂ and ŷ. Let x ←_R X denote the operation of picking x from the set X uniformly at random. Denote by z ← A(x, y, ...) the operation of running algorithm A with input (x, y, ...) to output z. Write log x for the logarithm with base 2. We also write Pr[A(x) = y : x ←_R X] for the probability that A outputs y on input x, which is sampled from X. We say a function ε(k) is negligible if, for any constant c, there exists k_0 ∈ N such that ε(k) < (1/k)^c for any k > k_0. A public key encryption (PKE) scheme Π consists of a triple of algorithms (K, E, D). The key generation algorithm K outputs a pair of public and secret keys (pk, sk) on input 1^k, a security parameter k in unary notation. The encryption algorithm E on input pk and a plaintext x̂ outputs a ciphertext c. The decryption algorithm D takes sk and c as input and outputs the plaintext message x̂. We require that for any key pair (pk, sk) obtained from K and any plaintext x̂ from the plaintext space of Π, x̂ ← D(sk, E(pk, x̂)).
Definition 1 (PRIV [2]). Let a probabilistic polynomial-time (PPT) adversary A_DE against the privacy of the deterministic encryption Π = (K, E, D) be a pair of algorithms A_DE = (A_f, A_g), where A_f, A_g do not share any random coins or state. The advantage of the adversary is defined as follows:

Adv^{priv}_{Π,A_DE}(k) = Pr[Exp^{priv-1}_{Π,A_DE}(k) = 1] − Pr[Exp^{priv-0}_{Π,A_DE}(k) = 1]
where the experiments are described as follows:

Experiment Exp^{priv-1}_{Π,A_DE}(k):
  (pk, sk) ←_R K(1^k); (x̂_1, t_1) ←_R A_f(1^k); c ←_R E(1^k, pk, x̂_1); g ←_R A_g(1^k, pk, c);
  return 1 if g = t_1, else return 0

Experiment Exp^{priv-0}_{Π,A_DE}(k):
  (pk, sk) ←_R K(1^k); (x̂_0, t_0) ←_R A_f(1^k); (x̂_1, t_1) ←_R A_f(1^k); c ←_R E(1^k, pk, x̂_0); g ←_R A_g(1^k, pk, c);
  return 1 if g = t_1, else return 0
We say that Π is PRIV secure, if Advpriv Π,ADE (k) is negligible, for any PPT ADE with high min-entropy, where ADE has a high min-entropy μ(k) means that μ(k) ∈ ω(log(k)), and Pr[ˆ x[i] = x : (ˆ x, t) ←R Am (1k )] ≤ 2−μ(k) for all k, ∗ all 1 ≤ i ≤ |ˆ x|, and any x ∈ {0, 1} . In the underlying definition, the advantage of privacy adversary could be also written as priv−b Advpriv Π,ADE (k) = 2 Pr[ExpΠ,ADE (k) = b] − 1 where b ∈ {0, 1} and probability is taken over the choice of all of the random coins in the experiments.
Remarks. 1) The encryption algorithm Π need not be deterministic per se. For example, in a randomized encryption scheme, the random coins can be fixed in an appropriate way to yield a deterministic scheme (as explained in Sec. 4); 2) As argued in [2], A_f has no access to pk and A_g does not know the plaintext input to the encryption oracle chosen by A_f. This is required because the public key itself carries some non-trivial information about the plaintext if the encryption is deterministic.¹ Thus, equipping either A_f or A_g with both the public key and free choice of an input plaintext, in the way of the conventional indistinguishability notion [7] of PKE, the PRIV security cannot be achieved. It is possible to build PRIV security from indistinguishability (IND) security, as observed in [2]. In the following, we recall the notion of IND security.
Definition 2 (IND-CPA). We say a scheme Π = (K, E, D) is IND-CPA secure if the advantage Adv^{ind}_{Π,A} of any PPT adversary A = (A_1, A_2) is negligible (let s be the state information of A_1, and b̂ ∈ {0, 1}):

Adv^{ind}_{Π,A}(k) = 2 · Pr[ b̂ = b : (pk, sk) ←_R K(1^k); (x_0, x_1, s) ←_R A_1(1^k, pk); b ←_R {0, 1}; c ←_R E(1^k, pk, x_b); b̂ ←_R A_2(1^k, c, s) ] − 1

Remark. IND security is required by a variety of cryptographic primitives. However, for an efficiently searchable encryption used in database applications, IND-secure encryption may be considered overkill. For such a strong encryption, it is not known how to arrange fast (i.e., logarithmic in the database size) search. IND-secure symmetric key encryption (SKE) has been carefully discussed in the literature, such as [6, Sec. 7.2]. Given a key K ∈ {0, 1}^k and a message m, an encryption algorithm outputs a ciphertext χ. Provided χ and K, a decryption algorithm outputs the message m uniquely. Note that for a secure SKE, outputs of the encryption algorithm can be considered uniformly distributed in the range, when encrypted under independent session keys. Besides, it is easy to build an IND-secure SKE.
Definition 3 (IND-CPA SKE). A symmetric key encryption (SKE) scheme Λ = (K_SK, E_SK, D_SK) with key space {0, 1}^k is indistinguishable against chosen plaintext attack (IND-CPA) if the advantage Adv^{ind-cpa}_{Λ,B} of any PPT adversary B is negligible, where

Adv^{ind-cpa}_{Λ,B}(k) = 2 · Pr[ b̂ = b : K ←_R {0, 1}^k; b ←_R {0, 1}; b̂ ←_R B^{LOR(K,·,·,b)}(1^k) ] − 1,

where a left-or-right oracle LOR(K, M_0, M_1, b) returns χ ←_R E_SK(K, M_b). Adversary B is allowed to query the LOR oracle with two chosen messages M_0, M_1 (M_0 ≠ M_1, |M_0| = |M_1|).
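The following toy sketch makes the LOR oracle of Definition 3 concrete; the hash-counter stream cipher used as the SKE is a stand-in chosen only so that the game runs, not a scheme from the paper.

```python
# Toy illustration of the LOR(K,·,·,b) oracle and the IND-CPA game. The stream
# "cipher" (SHA-256 in counter mode) is only a runnable stand-in for Lambda.
import hashlib, secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def E_SK(key: bytes, msg: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + bytes(m ^ k for m, k in zip(msg, keystream(key, nonce, len(msg))))

def D_SK(key: bytes, ct: bytes) -> bytes:
    nonce, body = ct[:16], ct[16:]
    return bytes(c ^ k for c, k in zip(body, keystream(key, nonce, len(body))))

def lor_game(adversary) -> bool:
    """Run the IND-CPA game; the adversary wins when its guess equals b."""
    K = secrets.token_bytes(32)
    b = secrets.randbelow(2)
    oracle = lambda m0, m1: E_SK(K, (m0, m1)[b])   # LOR(K, M0, M1, b)
    return adversary(oracle) == b

# An adversary that ignores the ciphertexts guesses b correctly half the time.
print(lor_game(lambda oracle: (oracle(b"yes", b"no!"), secrets.randbelow(2))[1]))
```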
In other words, suppose that in Def. 1, Af knows pk. Then, Af can assign t1 to be the ciphertext c, and hence Ag always wins the game (returns 1). Put it differently, although Af and Ag are not allowed to share a state, knowledge of pk can help them to share it anyway.
3
Secure Deterministic Hybrid Encryption
In this section, we will present a generic composition of PKE and SKE to obtain deterministic hybrid encryption. Interestingly, the situation is different from conventional hybrid encryption. In that case, the overhead of communication cost includes at least the size of the session key, even if we pick the PKE scheme being a (length-preserving) one-way trapdoor permutation, e.g. RSA. However, we notice that in PRIV security definition, both of public key and plaintext are not simultaneously known by Af or Ag . Hence, one can save on generating and encrypting a random session key. Instead, the secret session key could be extracted from the combination of public key and plaintext which are available to a legal user contrary to the adversary. As we show next, such an approach may need a little higher min-entropy, but it works in principle.
164
3.1
Y. Cui et al.
Generic Composition of PRIV-Secure PKE and IND-CPA Symmetric Key Encryption
Given a PRIV secure PKE scheme Π = (K, E, D), and an IND-CPA secure SKE scheme Λ = (KSK , ESK , DSK ), we can achieve a deterministic hybrid encryption HE = (KH , EH , DH ). In the following, H : {0, 1}∗ → {0, 1}k is a key derivation function (KDF), modeled as a random oracle. In the following section, we simply write input vector x ˆ as x with length of |ˆ x| = v. Wlog, parse x = x ¯||x, where the |¯ x| and |x| are equivalent to the input domain of Π and Λ, respectively. Table 1. Generic Construction of Deterministic Hybrid Encryption EH (pk, x): DH (sk, c): KH (1k ): Parses x to x ¯||x Parse c to ψ||χ (pk, sk) ←R K(1k ) x ¯ ← D(sk, ψ) ψ ←R E (1k , pk, x ¯) Return (pk, sk) K ← H(pk||¯ x) K ← H(pk||¯ x) x ← DSK (K, χ) χ ←R ESK (K, x) Return x = x ¯||x Return c = ψ||χ
In the Table 1, the proposed construction is simple, efficient, and can be generically built from any PRIV PKE and IND-CPA SKE. Note that the secret session key is required to have high min-entropy in order to deny a brute-force attack to SKE. However, thanks to the PRIV security, the high min-entropy requirement is inherently fulfilled for any PPT privacy adversary, so that we can build a reduction of security of the deterministic hybrid encryption to security of deterministic PKE. Next, we will provide a sketch of our proof. 3.2
Security Proof
Theorem 1. In the random oracle model, given a PRIV PKE scheme Π = (K, E, D), and an IND-CPA SKE scheme Λ = (KSK , ESK , DSK ), if there is a PRIV adversary AH against the hybrid encryption HE = (KH , EH , DH ), then there exists PRIV adversary A or IND-CPA adversary B, s.t. priv ind−cpa (k) + qh v/2μ Advpriv HE,AH (k) ≤ AdvΠ,A (k) + AdvΛ,B
where qh is an upper bound on the number of queries to random oracle H, v is the plaintext size of Π, μ is defined by high min-entropy of PRIV security of Π. Proof. Since we assume a PPT adversary AH = (Af , Ag ) against the HE scheme, according to the definition of PRIV, there must be a non-negligible advantage in the following experiments. More precisely, if a successful adversary exists, then priv−1 priv−0 Advpriv HE,AH (k) = Pr[ExpHE,AH (k) = 1] − Pr[ExpHE,AH (k) = 1]
Efficient Constructions of Deterministic Encryption
Experiment Exppriv−1 HE,AH (k): (pk, sk) ←R K(1k ); (x1 , t1 ) ←R Af (1k ); Parse x1 to x ¯1 ||x1 ; ψ ←R E(1k , pk, x¯1 ); K ← H(pk||¯ x1 ); χ ←R ESK (K, x1 ); c ← ψ||χ; g ←R Ag (1k , pk, c); return 1 if g = t1 , else return 0
165
Experiment Exppriv−0 HE,AH (k): (pk, sk) ←R K(1k ); (x0 , t0 ) ←R Af (1k ), (x1 , t1 ) ←R Af (1k ); Parse x0 to x ¯0 ||x0 ; ψ ←R E(1k , pk, x ¯0 ); K ← H(pk||¯ x0 ); χ ←R ESK (K , x0 ); c ← ψ ||χ ; g ←R Ag (1k , pk, c ); return 1 if g = t1 , else return 0
is non-negligible for some AH . Next we present a simulator which gradually modifies the above experiments such that the adversary does not notice it. Our goal is to show that Advpriv HE,AH (k) is almost as big as the corresponding advantages defined for PRIV security of the PKE scheme and IND-CPA security of the SKE scheme, which are assumed negligible. Because of the high min-entropy requirement of PRIV adversary, it is easy to ¯0 = x ¯1 see that x0 = x1 , except with negligible probability. Thus, there must be x or x0 = x1 , or both. Hence, we need to consider the following cases. ¯0 = x ¯1 , the right part of xb (b ∈ {0, 1}), Case [¯ x0 = x¯1 ] Since x0 = x1 and x could be equal or not. – When x0 = x1 , the adversary has two targets, such as Π and Λ in two experiments. First look at the SKE scheme Λ. In this case, the inputs to Λ in two experiments are the same, but still unknown to Ag . The key derivation function H outputs K ← H(pk||¯ x1 ) and K ← H(pk||¯ x0 ). ¯1 , we have K = K . Note that Ag does not know x0 nor x1 , Since x¯0 = x thus does not know K, K , either. Then, Ag must tell which of χ, χ is the corresponding encryption under the unknown keys without knowing x0 , x1 (x0 = x1 ), which is harder than breaking IND-CPA security and (k). that could be bounded by Advind−cpa Λ,B On the other hand, the adversary can also challenge the PKE scheme Π to distinguish two experiments, but it will break the PRIV security. More precisely, the advantage in distinguishing ψ, ψ with certain K, K is at most Advpriv Π,A (k), since K, K are not output explicitly and unavailable to adversary. – when x0 = x1 , this case is similar to the above, except that the inputs to Λ are different. Ag can do nothing given χ, χ only, hence Ag ’s possible attack must be focused on Π, and its advantage can be bounded by Advpriv Π,A (k). Case [x0 = x1 ] Similarly, there must be either x ¯0 = x ¯1 or x ¯0 = x ¯1 . – when x¯0 = x ¯1 , the same session key K ← H(pk||¯ xb ) (b ∈ {0, 1}) is used for Λ. In this case, the ciphertexts ψ, ψ are the same, adversary will focus on distinguishing the χ, χ . Note that Af cannot compute K even though he knows the x¯0 = x ¯1 , because pk is not known to him
166
Y. Cui et al.
(otherwise, it will break the PRIV security of Π immediately!). Thus, the successful distinguishing requires Ag to choose the same x ¯0 = x ¯1 when querying to the random oracle. Then, Ag has a harder game than IND-CPA (because it does not know x0 , x1 ), whose advantage is bounded by Advind−cpa (k). Λ,B In order to be sure that adversary (Af , Ag ) mounting a brute-force attack to find out the session key of Λ cannot succeed, the probability to find the key in searching all the random oracle queries should be taken into account as well. Suppose that adversary makes at most qh queries to its random oracle, and the Π’s plaintext size is v. Then, this probability could be upper bounded by qh v/2μ (Note that this bound is in nature similar to that in [2, Sec.6.1]). – when x¯0 = x ¯1 , as we have discussed above, this will break the PRIV security of Π, and advantage of adversary could be bounded by Advpriv Π,A (k). Summarizing, we conclude that in all cases when (Af , Ag ) intends to break the PRIV security of our HE scheme, its advantage of distinguishing two experiments ind−cpa μ (k).
is bounded by the sum of Advpriv Π,A (k), qh v/2 and AdvΛ,B Length-preserving Deterministic Hybrid Encryption. The first lengthpreserving PRIV PKE scheme is RSA-DOAEP due to [2]. The length-preserving property is important in practical use, such as bandwidth-restricted applications. RSA-DOAEP makes use of the RSA trapdoor permutation and with a modified 3-round Feistel network achieves the same sizes of input and output. As we have proved in Theorem 1, a construction proposed in Table 1 leads to a deterministic hybrid encryption. In particular, RSA-DOAEP + IND-CPA SKE ⇒ a length-preserving deterministic hybrid encryption, because both RSA-DOAEP and IND-CPA SKE are length-preserving. Note that in [2, Sec.5.2], it is argued that RSA-DOAEP based hybrid encryption scheme cannot be length-preserving any more, because a random session key has to be embedded in RSA-DOAEP. However, by re-using the knowledge of public key pk and a part of the message, we can indeed build the first length-preserving deterministic hybrid encryption, which is not only convenient in practice, but also meaningful in theory.
4
Deterministic Encryption from Code-Based PKE
From a postquantum point of view, it is desirable to obtain deterministic encryption based on assumptions other than RSA or discrete log. Code-based PKE, such as McEliece PKE [9] is considered a promising candidate after being carefully studied for over thirty years. To our surprise, it is not the only motivation to achieve deterministic encryption from code-based PKE. Another good property of the McEliece PKE and its variants is that its public key could be used as a hash function to digest the message, which is originally noted in Stern’s paper [11], and recently designed by [1,8]. The advantage that public key itself is able to work as a hash function,
Efficient Constructions of Deterministic Encryption
167
Table 2. Construction of EwH Deterministic Encryption K(1k ): E (pk, HN , x): D(sk, HM , c): R ← HM (x) x, r , e ← DM (sk, c) (pk, sk) ←R KM (1k ) k Parse R to r||re Decode e to re HM ← H(1 , pk) Encode re to e R ← r ||re , R ← HM (x) Return (pk, HM , sk) c ← EM (pk, r||x; e) Return x if R = R Otherwise, return ⊥ Return c Table 3. Construction of EaH Deterministic Encryption K(1k ): E (pk, HM , x): D(sk, HM , c||T ): T ← HN (x) x, r, e ← DM (sk, c) (pk, sk) ←R KM (1k ) T ← HN (x) r ←R {0, 1}lp HN ← H(1k , pk) n Return x if T = T e ←R {0, 1} , s.t. Hw(e) = t Return (pk, HM , sk) Otherwise, return ⊥ c ← EM (pk, r||x; e) Return c||T
can do us a favor to build efficient postquantum deterministic encryption. We call this Hidden Hash (HH) property of McEliece PKE. Henceforth, we assume that this function behaves as a random oracle. In [2], two constructions satisfying PRIV security have been proposed: Encrypt-with-Hash (EwH) and Encrypt-and-Hash (EaH). Adapting the HH property of the McEliece PKE to the both constructions, we can achieve PRIV secure deterministic encryption. For proving PRIV security, we require the McEliece PKE to be IND-CPA secure, which has been proposed in [10]. (The proofs are deferred to the full version of this paper). Construction of EwH. Let ΠM = (KM , EM , DM ) be the IND-CPA McEliece PKE as described in Section 2, based on [n, l, 2t + 1] Goppa code family, with lp bit padding where lp = a·l for some 0 < a < 1, and plaintext length lm = l−lp . Let H be a hash family defined over a set of public keys of the McEliece PKE.
t n HM : {0, 1}lm → {0, 1}lp+log i=1 ( t ) and HN : {0, 1}lm → {0, 1}2k are uniquely defined by 1k and pk. Without knowledge of pk, there is no way to compute HM or HN (refer to [1,8] for details). e is an error vector, s.t. |e| = n with Hamming weight Hw(e) = t. According to Cover’s paper [5], it is quite efficient to find an injective mapping to encode the (short) bit string re into e, and vice versa. Our EwH scheme is presented in Table 2. Note that compared with the EwH scheme proposed by Bellare et al. [2], our scheme does not need to include pk into the hash, because hash function HM itself is made of pk. Public key pk could be considered as a part of the algorithm of the hash function, as well. When we model HM as a random oracle, we can easily prove the PRIV security in a similar way as Bellare et al’s EwH. A more favorable, efficiently searchable encryption (ESE) with PRIV security is EaH. EaH aims to model the practical scenario in database security, where a deterministic encryption of some keywords works as a tag attached to the encrypted data. To search the target data, it is only required to compute the
deterministic tag and compare it within the database, achieving a search time which is logarithmic in database size. Construction of EaH. The description of McEliece PKE is similar to the above. EaH scheme is described in Table 3. The HH property is employed in order to achieve PRIV secure efficiently searchable encryption.
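The search pattern enabled by EaH can be illustrated with a toy index: a deterministic tag derived from the public key and the keyword is stored in sorted order and located by binary search. SHA-256 keyed with pk stands in for the public-key-derived hash H_N, which is an assumption of this sketch.

```python
# Toy EaH-style searchable index: a deterministic tag T attached to each record
# allows equality search with logarithmically many comparisons.
import bisect, hashlib

def tag(pk: bytes, keyword: bytes) -> bytes:
    return hashlib.sha256(b"HN|" + pk + b"|" + keyword).digest()

class TaggedIndex:
    """Sorted list of (tag, record-id) pairs supporting binary search."""
    def __init__(self):
        self.entries = []
    def insert(self, pk, keyword, record_id):
        bisect.insort(self.entries, (tag(pk, keyword), record_id))
    def search(self, pk, keyword):
        t = tag(pk, keyword)
        i = bisect.bisect_left(self.entries, (t, ""))
        return [rid for tt, rid in self.entries[i:] if tt == t]

pk = b"demo-public-key"
idx = TaggedIndex()
idx.insert(pk, b"alice", "row-17")
idx.insert(pk, b"bob", "row-42")
print(idx.search(pk, b"alice"))   # -> ['row-17']
```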
5 Concluding Remarks
Extension to Chosen-Ciphertext Security. Above, we have proposed several PRIV secure deterministic encryption schemes, in CPA case. A stronger attack scenario, CCA, requires a little more care. As commented in [2], PRIV-CCA could be obtained from PRIV-CPA scheme with some additional cost, such as one-time signatures or other authentication techniques to deny a CCA attacker. We can employ those techniques to lift up CPA to CCA. The important issue is that we have achieved very efficient PRIV-CPA secure building blocks which enjoy some advantages over previous works. Open Question. Proving our constructions secure in the standard model is an open question and the topic of our future work.
References 1. Augot, D., Finiasz, M., Sendrier, N.: A Family of Fast Syndrome Based Cryptographic Hash Functions. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715, pp. 64–83. Springer, Heidelberg (2005) 2. Bellare, M., Boldyreva, A., O’Neill, A.: Deterministic and Efficiently Searchable Encryption. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 535–552. Springer, Heidelberg (2007) 3. Bellare, M., Fischlin, M., O’Neill, A., Ristenpart, T.: Deterministic Encryption: Definitional Equivalences and Constructions without Random Oracles. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 360–378. Springer, Heidelberg (2008) 4. Boldyreva, A., Fehr, S., O’Neill, A.: On Notions of Security for Deterministic Encryption, and Efficient Constructions without Random Oracles. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 335–359. Springer, Heidelberg (2008) 5. Cover, T.: Enumerative source encoding. IEEE IT 19(1), 73–77 (1973) 6. Cramer, R., Shoup, V.: Design and analysis of practical public-key encryption schemes secure against adaptive chosen ciphertext attack. SIAM Journal on Computing 33(1), 167–226 (2003) 7. Goldwasser, S., Micali, S.: Probabilistic encryption. Journal of Computer and System Sciences 28(2), 270–299 (1984) 8. Finiasz, M.: Syndrome Based Collision Resistant Hashing. In: Buchmann, J., Ding, J. (eds.) PQCrypto 2008. LNCS, vol. 5299, pp. 137–147. Springer, Heidelberg (2008) 9. McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory. Deep Space Network Progress Rep. 42-44, 114–116 (1978) 10. Nojima, R., Imai, H., Kobara, K., Morozov, K.: Semantic Security for the McEliece Cryptosystem without Random Oracles. Designs, Codes and Cryptography 49(13), 289–305 (2008) 11. Stern, J.: A new identification scheme based on syndrome decoding. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 13–21. Springer, Heidelberg (1994)
Noisy Interpolation of Multivariate Sparse Polynomials in Finite Fields

Álvar Ibeas¹ and Arne Winterhof²

¹ University of Cantabria, E-39071 Santander, Spain
[email protected]
² Johann Radon Institute for Computational and Applied Mathematics, Austrian Academy of Sciences, Altenbergerstr. 69, 4040 Linz, Austria
[email protected]
Abstract. We consider the problem of recovering an unknown sparse multivariate polynomial f ∈ Fp [X1 , . . . , Xm ] over a finite field Fp of prime order p from approximate values of f (t1 , . . . , tm ) at polynomially many points (t1 , . . . , tm ) ∈ Fm p selected uniformly at random. Our result is based on a combination of bounds on exponential sums with the lattice reduction technique. Keywords: Noisy interpolation, Sparse polynomials, Hidden number problem, Lattice reduction, Exponential sums.
1 Introduction
Let p be a prime. We will use the notation F_p for the finite field of p elements and identify it with the integers in the range {0, ..., p − 1}. For an integer s we denote by ⌊s⌋_p the remainder of s on division by p. For a positive integer m we consider the problem of finding an unknown multivariate nonconstant polynomial f ∈ F_p[X_1, ..., X_m] of weight at most w from approximate values of ⌊f(t_1, ..., t_m)⌋_p at polynomially many points (t_1, ..., t_m) ∈ F_p^m. More precisely, using the abbreviations X = (X_1, ..., X_m) and X^{e_j} = X_1^{e_{j,1}} · · · X_m^{e_{j,m}} if e_j = e_{j,1} + e_{j,2} p + · · · + e_{j,m} p^{m−1} with 0 ≤ e_{j,1}, e_{j,2}, ..., e_{j,m} < p, the polynomial f = f(X) is of the form

    f(X) = Σ_{j=1}^{w} α_j X^{e_j},    (1)
where 1 ≤ e_1 < e_2 < · · · < e_w < p^m. A polynomial time algorithm to recover a univariate sparse polynomial f ∈ F_p[X] from approximations to ⌊f(j)⌋_p at random j was introduced and investigated in [3,4]. The case of a univariate polynomial f(X) = αX corresponds to the hidden number problem; see [2,5,6] and the references therein. In the present paper we present an algorithm for arbitrary m.
The algorithm is based on the lattice reduction technique. To be more precise, a certain lattice is constructed which contains a vector, associated with the coefficients of f, that is close to a certain known vector. Using the lattice reduction technique one can hope to recover this vector. Then, in order to make these arguments rigorous, that is, to show that there are no other lattice vectors which are close enough to the known "target" vector, we need a certain uniform distribution property of f which is only guaranteed if the total degree of f is small enough. However, using a weaker uniform distribution property of f and ideas reminiscent of Waring's problem in a finite field, see for example [7], we can adapt the algorithm to polynomials of much larger degree. We use log z to denote the binary logarithm of z > 0. For a prime p and τ ≥ 0 we denote by MSB_{τ,p}(x) any integer u such that |⌊x⌋_p − u| ≤ p/2^{τ+1}. Roughly speaking, MSB_{τ,p}(x) gives the τ most significant bits of x; however, this definition is more flexible and better suits our purposes. In particular we remark that τ need not be an integer. We use bold lowercase letters to denote vectors in F_p^m and use notation analogous to that for X: x = (x_1, ..., x_m) and x^e = x_1^{e_1} · · · x_m^{e_m} if e = e_1 + e_2 p + · · · + e_m p^{m−1} with 0 ≤ e_1, ..., e_m < p.
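As a small illustration of the MSB_{τ,p} oracle defined above (a sketch only; returning x mod p plus bounded noise is one admissible choice, and the sparse polynomial below is an arbitrary toy instance, not one taken from the paper):

    import random

    def msb(x, p, tau):
        # Return an integer u with |(x mod p) - u| <= p / 2^(tau + 1):
        # a valid value of MSB_{tau,p}(x), modelled as x mod p plus bounded noise.
        bound = p // 2 ** (tau + 1)
        return (x % p) + random.randint(-bound, bound)

    # Noisy evaluations of a toy sparse polynomial (m = 2, w = 2) at random points.
    p, tau = 1_000_003, 12
    f = lambda x, y: (3 * pow(x, 5, p) + 7 * pow(y, 9, p)) % p
    data = [((tx, ty), msb(f(tx, ty), p, tau))
            for tx, ty in ((random.randrange(p), random.randrange(p)) for _ in range(10))]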
2 Preliminaries
For a prime p we denote e_p(z) = exp(2πiz/p). The following bound on exponential sums of a univariate polynomial is given in [1, Corollary 1.1]. It is nontrivial for polynomials of very large degree relative to p.

Lemma 1. Let F ∈ F_p[X] be a nonconstant polynomial of degree D and weight w. For any ε ∈ (0, 1), if D ≤ p(log log p)^{1−ε}/log p, the following bound holds provided that p is large enough:

    | Σ_{x∈F_p} e_p(F(x)) | ≤ p ( 1 − 1/(w log p)^{1+ε} ).
The following result extends this bound to exponential sums of a multivariate polynomial.

Lemma 2. Let F ∈ F_p[X] be a nonconstant multivariate polynomial of total degree D and weight at most w. For any ε ∈ (0, 1) and 0 < c < 1 with

    min_{i=1,...,m, deg_{X_i}(F)>0} deg_{X_i}(F) ≤ p(log log p)^{1−ε}/log p   and   D ≤ cp,

the following bound holds provided that p is large enough:

    | Σ_{x∈F_p^m} e_p(F(x)) | ≤ p^m ( 1 − 1/(w log p)^{1+ε} ).    (2)
Proof. We may assume that deg_{X_m}(F) = min_{i=1,...,m} deg_{X_i}(F) > 0 and

    F(X) = Σ_{i=1}^{w_0} a_i(X_1, ..., X_{m−1}) X_m^{e_i}

for some w_0 ≤ w and polynomials a_1, ..., a_{w_0} ∈ F_p[X_1, ..., X_{m−1}] not identically zero. Put S = Σ_{x∈F_p^m} e_p(F(x)). We have

    |S| ≤ Σ_{x_1,...,x_{m−1}∈F_p} | Σ_{x_m∈F_p} e_p( Σ_{i=1}^{w_0} a_i(x_1, ..., x_{m−1}) x_m^{e_i} ) |.

The polynomial a_{w_0} has at most p^{m−2} D zeros and therefore the univariate polynomials F(x_1, ..., x_{m−1}, X_m) are constant for at most p^{m−2} D choices of the values x_1, ..., x_{m−1}. Using Lemma 1 with a certain ε′ < ε for the rest, we get

    |S|/p^{m−1} ≤ D + (p − D)( 1 − 1/(w_0 log p)^{1+ε′} ) ≤ p ( 1 − (1−c)/(w_0 log p)^{1+ε′} ),

and the result follows.

For small total degree D we also use the Weil bound

    | Σ_{x∈F_p^m} e_p(F(x)) | < D p^{m−1/2}.    (3)
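A quick numerical sanity check of bounds of this kind, for a univariate toy example (illustrative only; the classical univariate Weil bound (D − 1)√p is used for comparison, while (3) is its weaker multivariate form):

    import cmath

    def abs_exp_sum(F, p):
        # |sum over x in F_p of e_p(F(x))| for a univariate integer polynomial F.
        return abs(sum(cmath.exp(2j * cmath.pi * (F(x) % p) / p) for x in range(p)))

    p, D = 101, 5
    F = lambda x: x ** 5 + 3 * x ** 2 + 7
    print(abs_exp_sum(F, p), "vs Weil bound", (D - 1) * p ** 0.5)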
Let S ⊆ F_p^m be a subset of cardinality s. For integers h, k ≥ 1, r, and a polynomial f(X) = Σ_{i=1}^{w} α_i X^{e_i} ∈ F_p[X], we denote by N(S, k, f, h, r) the number of solutions in S^k of the "Waring-like" equation

    f(x_1) + · · · + f(x_k) ≡ t (mod p),    r + 1 ≤ t ≤ r + h.

Let P_f be the set of polynomials Σ_{i=1}^{w} β_i X^{e_i} which are distinct from f. We say that the set S is (Δ, k, f)-homogeneously distributed modulo p if for every g ∈ P_f

    max_{1≤h,r≤p} | N(S, k, f − g, h, r) − h s^k p^{−1} | ≤ Δ s^k.
Lemma 3. Let f ∈ F_p[X] be such that

    | Σ_{x∈F_p^m} e_p( a(f(x) − g(x)) ) | ≤ B p^m

for all g ∈ P_f and a ∈ F_p^*. Then, for every integer k ≥ 1, F_p^m is (B^k log p, k, f)-homogeneously distributed modulo p.
Proof. By the well-known identity

    Σ_{a=0}^{p−1} e_p(au) = 0 if u ≢ 0 (mod p), and = p if u ≡ 0 (mod p),

we have

    N(F_p^m, k, f − g, h, r)
      = Σ_{t=r+1}^{r+h} Σ_{x_1,...,x_k∈F_p^m} (1/p) Σ_{a=0}^{p−1} e_p( a( f(x_1) − g(x_1) + · · · + f(x_k) − g(x_k) − t ) )
      = (1/p) Σ_{a=0}^{p−1} Σ_{t=r+1}^{r+h} e_p(−at) ( Σ_{x∈F_p^m} e_p( a(f(x) − g(x)) ) )^k.

The term corresponding to a = 0 is h p^{mk−1}. Applying the estimate

    max_{1≤h≤p} Σ_{a=1}^{p−1} | Σ_{t=r+1}^{r+h} e_p(at) | ≤ p log p

to the other terms we obtain the result.
As in [3,4], the presented method uses lattice basis reduction techniques. We briefly review a result on lattices. Let {b_1, ..., b_s} be a set of linearly independent vectors in R^r. The set

    L = { c_1 b_1 + · · · + c_s b_s | c_1, ..., c_s ∈ Z }

is called an s-dimensional lattice, and the set {b_1, ..., b_s} is called a basis of L. The search for lattice elements of small norm, or close to a given vector, is a widely investigated problem. As in [3,4], we use the following result.

Lemma 4. There exists a deterministic polynomial time algorithm which, for a given lattice L ⊂ R^s and vector r = (r_1, ..., r_s) ∈ R^s, finds a lattice vector v = (v_1, ..., v_s) satisfying the inequality

    Σ_{i=1}^{s} (v_i − r_i)^2 ≤ exp( O( s log² log s / log s ) ) · min { Σ_{i=1}^{s} (z_i − r_i)^2 : z = (z_1, ..., z_s) ∈ L }.
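In practice, an approximation of the kind guaranteed by Lemma 4 is obtained from lattice basis reduction followed by a close-vector routine. The following sketch uses Babai's rounding heuristic as a simple stand-in (it does not achieve the stated approximation factor; a rigorous implementation would combine LLL with the nearest-plane algorithm, e.g. via the fpylll library):

    import numpy as np

    def babai_round(basis_rows, target):
        # Express the target in lattice-basis coordinates, round the coordinates,
        # and map back: a crude closest-vector heuristic.
        B = np.array(basis_rows, dtype=float)               # rows = basis vectors
        coords = np.linalg.solve(B.T, np.array(target, dtype=float))
        return np.rint(coords) @ B

    # toy 2-dimensional example: a lattice vector near (7.2, 8.9)
    print(babai_round([[3.0, 1.0], [0.0, 4.0]], [7.2, 8.9]))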
3 General Interpolation Result
Let τ > 0 be a real number and e = (e1 , . . . , ew ) a vector of integers such that 1 ≤ e1 < · · · < ew < pm . For t1,1 , . . . , t1,k ; . . . ; td,1 , . . . , td,k ∈ Fm p we denote by Lτ,e,p (t1,1 , . . . , td,k ) the (d + w)-dimensional lattice generated by the rows of the
following matrix:

    ⎛ t_{1,1}^{e_1} + ··· + t_{1,k}^{e_1}   ···   t_{d,1}^{e_1} + ··· + t_{d,k}^{e_1}   1/2^{τ+1}   ···   0         ⎞
    ⎜            ⋮                                          ⋮                                ⋱                     ⎟
    ⎜ t_{1,1}^{e_w} + ··· + t_{1,k}^{e_w}   ···   t_{d,1}^{e_w} + ··· + t_{d,k}^{e_w}   0          ···   1/2^{τ+1} ⎟
    ⎜ p                                    ···   0                                     0          ···   0         ⎟    (4)
    ⎜            ⋮            ⋱                    ⋮                                               ⋱               ⎟
    ⎝ 0                                    ···   p                                     0          ···   0         ⎠
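The matrix (4) is straightforward to write down from the data (t_{i,j}) once the exponents e_1 < · · · < e_w are known. A minimal sketch (the helper power_eval and the data layout are our own illustrative choices):

    from fractions import Fraction

    def power_eval(point, e, p):
        # point = (x_1, ..., x_m) in F_p^m; x^e with e = e_1 + e_2 p + ... as in the text.
        prod = 1
        for coord in point:
            prod = prod * pow(coord, e % p, p) % p
            e //= p
        return prod

    def lattice_rows(points, exponents, p, tau):
        # points: d groups of k sample points t_{i,1}, ..., t_{i,k}; exponents: e_1 < ... < e_w.
        d, w = len(points), len(exponents)
        scale = Fraction(1, 2 ** (tau + 1))
        rows = []
        for j, e in enumerate(exponents):          # the w rows carrying the data
            rows.append([Fraction(sum(power_eval(t, e, p) for t in group) % p)
                         for group in points]
                        + [scale if col == j else Fraction(0) for col in range(w)])
        for i in range(d):                         # the d rows p times a unit vector
            rows.append([Fraction(p) if col == i else Fraction(0) for col in range(d)]
                        + [Fraction(0)] * w)
        return rows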
Lemma 5. Let p be a sufficiently large n-bit prime and let w ≥ 1 be an integer. We define d = 2⌈(nw)^{1/2}⌉ and η = 0.5(nw)^{1/2} + 3. Let f ∈ F_p[X] be defined by (1) and let k be a positive integer such that S ⊆ F_p^m is (2^{−η}, k, f)-homogeneously distributed modulo p. Assume that t_{1,1}, ..., t_{d,k} ∈ S are chosen uniformly and independently at random. Then with probability P ≥ 1 − 2^{−η}, for any vector s = (s_1, ..., s_d, 0, ..., 0) with

    ( Σ_{i=1}^{d} ( ⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p − s_i )^2 )^{1/2} ≤ 2^{−η} p,

all vectors v = (v_1, ..., v_d, v_{d+1}, ..., v_{d+w}) ∈ L_{τ,e,p}(t_{1,1}, ..., t_{d,k}) satisfying

    ( Σ_{i=1}^{d} (v_i − s_i)^2 )^{1/2} ≤ 2^{−η} p

are of the form

    v = ( ⌊ Σ_{j=1}^{w} β_j (t_{1,1}^{e_j} + · · · + t_{1,k}^{e_j}) ⌋_p , ..., ⌊ Σ_{j=1}^{w} β_j (t_{d,1}^{e_j} + · · · + t_{d,k}^{e_j}) ⌋_p , β_1/2^{τ+1}, ..., β_w/2^{τ+1} )

with some integers β_j ≡ α_j (mod p), j = 1, ..., w.

Proof. We define the modular distance between two integers λ and μ as

    dist_p(λ, μ) = min_{b∈Z} |λ − μ − bp| = min { ⌊λ − μ⌋_p , p − ⌊λ − μ⌋_p }.
Because of the homogeneous distribution property, we see that for any polynomial g ∈ P_f the probability P(g) that

    dist_p( g(t_1) + · · · + g(t_k), f(t_1) + · · · + f(t_k) ) ≤ 2^{−η+1} p

for t_1, ..., t_k ∈ S selected uniformly at random satisfies

    P(g) ≤ 2^{−η+2} + 2^{−η} = 5/2^η.

Therefore, for any g ∈ P_f, the probability that there is an i ∈ [1, d] with

    dist_p( g(t_{i,1}) + · · · + g(t_{i,k}), f(t_{i,1}) + · · · + f(t_{i,k}) ) > 2^{−η+1} p

equals 1 − P(g)^d ≥ 1 − (5/2^η)^d, where the probability is taken over t_{1,1}, ..., t_{d,k} ∈ S chosen uniformly and independently at random. Since #P_f = p^w − 1, we obtain

    Pr[ ∀g ∈ P_f, ∃i ∈ [1, d] : dist_p( g(t_{i,1}) + · · · + g(t_{i,k}), f(t_{i,1}) + · · · + f(t_{i,k}) ) > 2^{−η+1} p ]
        ≥ 1 − (p^w − 1)(5/2^η)^d > 1 − 2^{−η},

because d(η − log 5) > 2(nw)^{1/2}(0.5(nw)^{1/2} + 3 − log 5) > nw + η ≥ w log p + η, provided that p is large enough. We now fix some (t_{1,1}, ..., t_{d,k}) ∈ S^{dk} with

    min_{g∈P_f} max_{i∈[1,d]} dist_p( g(t_{i,1}) + · · · + g(t_{i,k}), f(t_{i,1}) + · · · + f(t_{i,k}) ) > 2^{−η+1} p.    (5)
Let v ∈ L_{τ,e,p}(t_{1,1}, ..., t_{d,k}) be a lattice point satisfying

    ( Σ_{i=1}^{d} (v_i − s_i)^2 )^{1/2} ≤ 2^{−η} p.

Since v ∈ L_{τ,e,p}(t_{1,1}, ..., t_{d,k}), there are some integers β_1, ..., β_w, z_1, ..., z_d such that

    v = ( Σ_{j=1}^{w} β_j (t_{1,1}^{e_j} + · · · + t_{1,k}^{e_j}) − z_1 p, ..., Σ_{j=1}^{w} β_j (t_{d,1}^{e_j} + · · · + t_{d,k}^{e_j}) − z_d p, β_1/2^{τ+1}, ..., β_w/2^{τ+1} ).
If β_j ≡ α_j (mod p), j = 1, ..., w, then for all i = 1, ..., d we have

    Σ_{j=1}^{w} β_j (t_{i,1}^{e_j} + · · · + t_{i,k}^{e_j}) − z_i p = ⌊ Σ_{j=1}^{w} β_j (t_{i,1}^{e_j} + · · · + t_{i,k}^{e_j}) ⌋_p = ⌊ f(t_{i,1}) + · · · + f(t_{i,k}) ⌋_p,

since otherwise there is an i ∈ [1, d] such that |v_i − s_i| > 2^{−η} p. Now suppose that β_j ≢ α_j (mod p) for some j ∈ [1, w]. In this case we have

    ( Σ_{i=1}^{d} (v_i − s_i)^2 )^{1/2} ≥ max_{i∈[1,d]} dist_p( Σ_{j=1}^{w} β_j (t_{i,1}^{e_j} + · · · + t_{i,k}^{e_j}), s_i )
        ≥ max_{i∈[1,d]} ( dist_p( f(t_{i,1}) + · · · + f(t_{i,k}), Σ_{j=1}^{w} β_j (t_{i,1}^{e_j} + · · · + t_{i,k}^{e_j}) ) − dist_p( s_i, f(t_{i,1}) + · · · + f(t_{i,k}) ) )
        > 2^{−η+1} p − 2^{−η} p = 2^{−η} p,

contradicting our assumption. As we have seen, condition (5) holds with probability exceeding 1 − 2^{−η}, and the result follows.

Now we state a rather general result.

Theorem 1. Let p be a sufficiently large n-bit prime and let w and k be positive integers of polynomial size. We define

    μ = ⌈(nw)^{1/2}⌉,   τ = μ + ⌈log n⌉ + ⌈log k⌉,   d = 2μ,   and   η = 0.5μ + 3.
There exists a deterministic polynomial time algorithm A such that for any polynomial f, defined by (1), such that S ⊆ F_p^m is (2^{−η}, k, f)-homogeneously distributed modulo p, given the kd(m + 1) integers

    t_{i,j} ∈ S   and   s_{i,j} = MSB_{τ,p}(f(t_{i,j})),   i = 1, ..., d,   j = 1, ..., k,

its output satisfies

    Pr_{t_{1,1},...,t_{d,k}∈F_p^m} [ A(t_{1,1}, ..., t_{d,k}; s_{1,1}, ..., s_{d,k}) = (α_1, ..., α_w) ] ≥ 1 + O(2^{−η}),

if t_{1,1}, ..., t_{d,k} are chosen uniformly and independently at random from S.

Proof. Let us consider the vector s = (s_1, ..., s_d, s_{d+1}, ..., s_{d+w}) where s_{d+j} = 0, j = 1, ..., w, and s_i = s_{i,1} + · · · + s_{i,k}, i = 1, ..., d. We have |f(t_{i,j}) − s_{i,j}| ≤ p/2^{τ+1} and thus |f(t_{i,1}) + · · · + f(t_{i,k}) − s_i| ≤ kp/2^{τ+1}.
Next we have

    f(t_{i,1}) + · · · + f(t_{i,k}) − s_i = ⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p − ⌊s_i⌋_p + νp

with ν ∈ {−1, 0, 1}. If ν = 1 then we have

    ⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p − ⌊s_i⌋_p + p = |f(t_{i,1}) + · · · + f(t_{i,k}) − s_i| < kp/2^{τ+1},

which is only possible if ⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p < kp/2^{τ+1}. Similarly, we can show that ν = −1 is only possible if ⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p > p − kp/2^{τ+1}. Because of the homogeneous distribution property, the probability that

    kp/2^{τ+1} ≤ ⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p ≤ p − kp/2^{τ+1}

for all i = 1, ..., d is 1 + O(dk/2^τ). Now we assume that ν = 0 and thus |⌊f(t_{i,1}) + · · · + f(t_{i,k})⌋_p − ⌊s_i⌋_p| < kp/2^{τ+1}. Multiplying the jth row vector of the matrix (4) by α_j and subtracting a certain multiple of the (w + j)th vector, j = 1, ..., w, we obtain a lattice point

    u = (u_1, ..., u_d, α_1/2^{τ+1}, ..., α_w/2^{τ+1}) ∈ L_{τ,e,p}(t_{1,1}, ..., t_{d,k})

such that |u_i − s_i| < kp 2^{−τ−1}, i = 1, ..., d + w, where u_{d+j} = α_j/2^{τ+1}, j = 1, ..., w. Therefore,

    Σ_{i=1}^{d+w} (u_i − s_i)^2 ≤ (d + w) 2^{−2τ−2} k^2 p^2.

We can assume that w ≤ n, because in the opposite case τ > n + ⌈log k⌉ and the result is trivial. Therefore d + w = O(τ). Now we can use Lemma 4 to find in polynomial time a lattice vector v = (v_1, ..., v_d, v_{d+1}, ..., v_{d+w}) ∈ L_{τ,e,p}(t_{1,1}, ..., t_{d,k}) such that

    Σ_{i=1}^{d+w} (v_i − s_i)^2 ≤ 2^{o(d+w)} min { Σ_{i=1}^{d+w} (z_i − s_i)^2 : z = (z_1, ..., z_{d+w}) ∈ L }
        ≤ 2^{−2τ+o(τ)} (d + w) k^2 p^2 ≤ 2^{−2τ+o(τ)} k^2 p^2 ≤ 2^{−2η−1} p^2,

provided that p is large enough. We also have

    Σ_{i=1}^{d} (u_i − s_i)^2 ≤ d 2^{−2τ−2} k^2 p^2 ≤ 2^{−2η−2} p^2.

Therefore Σ_{i=1}^{d} (u_i − v_i)^2 ≤ 2^{−2η} p^2. Applying Lemma 5, we see that v is of the form described there with β_j ≡ α_j (mod p) with probability at least 1 − 2^{−η}, and therefore the coefficients of f can be recovered in polynomial time.

Applying Lemma 3 and the Weil bound (3) to Theorem 1 we obtain (k = 1):
Corollary 1. Let p be a sufficiently large n-bit prime and let w ≥ 1 be an integer. We define μ = ⌈(nw)^{1/2}⌉, τ = μ + ⌈log n⌉, d = 2μ, η = 0.5μ + 3. There exists a deterministic polynomial time algorithm A such that for any polynomial f defined by (1), with known exponents 1 ≤ e_1 < · · · < e_w < p^m and total degree

    D ≤ p^{1/2} / (2^η log p),

given the d(m + 1) integers t_i ∈ F_p^m and s_i = MSB_{τ,p}(f(t_i)), i = 1, ..., d, its output satisfies

    Pr_{t_1,...,t_d∈F_p^m} [ A(t_1, ..., t_d; s_1, ..., s_d) = (α_1, ..., α_w) ] ≥ 1 + O(2^{−η}),

if t_1, ..., t_d are chosen uniformly and independently at random from F_p^m.

For polynomials of higher degree, Lemma 2 gives:

Corollary 2. Let p be a sufficiently large n-bit prime and let w ≥ 1 be an integer. We define μ = ⌈(nw)^{1/2}⌉, τ = μ + ⌈log n⌉ + ⌈log k⌉, d = 2μ, η = 0.5μ + 3, where k = ⌈3(w log p)^{1+ε} log p⌉ for 1 > ε > 0. There exists a deterministic polynomial time algorithm A such that for any polynomial f defined by (1), with known exponents 1 ≤ e_1 < · · · < e_w < p^m and satisfying (2), given the kd(m + 1) integers t_{i,j} ∈ F_p^m and s_{i,j} = MSB_{τ,p}(f(t_{i,j})), i = 1, ..., d, j = 1, ..., k, its output satisfies

    Pr_{t_{1,1},...,t_{d,k}∈F_p^m} [ A(t_{1,1}, ..., t_{d,k}; s_{1,1}, ..., s_{d,k}) = (α_1, ..., α_w) ] ≥ 1 + O(2^{−η}),

if t_{1,1}, ..., t_{d,k} are chosen uniformly and independently at random from F_p^m.

Remarks. Using the standard method for reducing incomplete exponential sums to complete ones, we can easily extend Corollary 1 to subboxes of F_p^m of the form

    S = { x_0 + (x_1, ..., x_m) : 0 ≤ x_i < K_i, i = 1, ..., m }

for some integers 1 ≤ K_1, ..., K_m ≤ p and x_0 ∈ F_p^m. Moreover, we can extend the results to the situation where the points are chosen at random from a sufficiently large subgroup of F_p^* of order T, by substituting the variable X by Y^{(p−1)/T}. The condition e_1 ≥ 1 is not absolutely necessary. However, if we do not restrict ourselves to the case f(0, ..., 0) = 0, a modified algorithm gives only an approximation of the constant term f(0, ..., 0).
References 1. Cochrane, T., Pinner, C., Rosenhouse, J.: Bounds on exponential sums and the polynomial Waring problem mod p. J. London Math. Soc. 67(2), 319–336 (2003) 2. Shparlinski, I., Winterhof, A.: A hidden number problem in small subgroups. Math. Comp. 74(252), 2073–2080 (2005) 3. Shparlinski, I., Winterhof, A.: Noisy interpolation of sparse polynomials in finite fields. Appl. Algebra Engrg. Comm. Comput. 16(5), 307–317 (2005)
4. Shparlinski, I.E.: Sparse polynomial approximation in finite fields. In: Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, pp. 209–215. ACM, New York (2001) (electronic) 5. Shparlinski, I.E.: Playing ‘hide-and-seek’ with numbers: the hidden number problem, lattices and exponential sums. In: Public-key cryptography. Proc. Sympos. Appl. Math., vol. 62, pp. 153–177. Amer. Math. Soc., Providence (2005) 6. Shparlinski, I.E., Winterhof, A.: A nonuniform algorithm for the hidden number problem in subgroups. In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 416–424. Springer, Heidelberg (2004) 7. Winterhof, A.: On Waring’s problem in finite fields. Acta Arith. 87(2), 171–177 (1998)
New Commutative Semifields and Their Nuclei

Jürgen Bierbrauer

Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931 (USA)
Abstract. Commutative semifields in odd characteristic can be equivalently described by planar functions (also known as PN functions). We describe a method to construct a semifield which is canonically associated to a planar function and use it to derive information on the nuclei directly from the planar function. This is used to determine the nuclei of families of new commutative semifields of dimensions 9 and 12 in arbitrary odd characteristic. Keywords: PN functions, planar functions, presemifields, semifields, middle nucleus, kernel, Dembowski-Ostrom polynomial, isotopy, strong isotopy.
1 Introduction
Until recently the only known families of commutative semifields in arbitrary odd characteristic aside from the fields themselves were the classical constructions by Dickson [7] and Albert [1]. The first provably new such general constructions were given in Zha-Kyureghyan-Wang [11] and [2]. The families constructed in Budaghyan-Helleseth [3] may be new as well, but this seems to remain unproved. The survey article of Kantor [9] gives more background information and comments on the scarcity of known commutative semifields in odd characteristic, in particular when the characteristic is > 3.

Definition 1. A presemifield is a set F with two binary operations, addition and ∗, such that
– F is a commutative group with respect to addition, with identity 0.
– F^∗ is a loop under multiplication.
– 0 ∗ a = 0 for all a.
– The distributive laws hold.
If moreover there is an element e ∈ F such that e ∗ x = x ∗ e = x for all x, we speak of a semifield.

Definition 2. Let F = F_{p^r} for an odd prime p. A function f : F → F is perfectly nonlinear (PN), also called a planar function, if for each 0 ≠ a ∈ F the directional derivative δ_a defined by δ_a(x) = f(x + a) − f(x) is bijective.
Let f : F → F and write it as a polynomial f(x) = Σ_i a_i x^i. Then f is a Dembowski-Ostrom (DO-) polynomial if all its monomials have p-weight ≤ 2 (the exponents are sums of two powers of p). In odd characteristic, planar DO-polynomials are equivalent to commutative presemifields, see Coulter-Henderson [4]:

Theorem 1. The following concepts are equivalent:
– commutative presemifields in odd characteristic;
– Dembowski-Ostrom polynomials which are PN functions.

The relation between those concepts is identical to the equivalence between quadratic forms and bilinear forms in odd characteristic, with the planar function in the role of the quadratic form. If ∗ is the presemifield product, then the corresponding planar function is f(x) = x ∗ x. When the planar function is given, the corresponding presemifield product is

    x ∗ y = (1/2){ f(x + y) − f(x) − f(y) }.    (1)
For an overview see also Section 9.3.2 of Horadam [8]. One way to construct a semifield from a commutative presemifield is the following: choose 0 = e ∈ F and define the new multiplication ◦ by (x ∗ e) ◦ (y ∗ e) = x ∗ y. Then ◦ describes a semifield with unit element e ∗ e. Definition 3. Let F = Frp be the r-dimensional vector space over Fp . Consider presemifields on F whose additions coincide with that of F. Two such presemifield multiplications ∗ and ◦ on F are isotopic if there exist α1 , α2 , β ∈ GL(r, p) such that β(x ◦ y) = α1 (x) ∗ α2 (y) always holds. They are strongly isotopic if we can choose α2 = α1 . This notion of equivalence is motivated by the fact that two presemifields are isotopic if and only if the corresponding projective planes are isomorphic. Let F = Fpr be the field of order pr . It is a commonly used method to replace a given commutative semifield of order pr by an isotopic copy which is defined on F and shares the additive structure and the unit element 1 with F. The question is then to which degree the semifield structure can be made to coincide with the field structure. As associativity is the only field axiom that a commutative semifield does not satisfy it is natural that associativity will be in the center of interest. Definition 4. Let F = Fpr and (F, ∗) a commutative semifield with unit 1 whose additive structure agrees with that of the field F. Define S = {c ∈ F |c ∗ x = cx for all x ∈ F }. M = {c ∈ F |(x ∗ c) ∗ y = x ∗ (c ∗ y) for all x, y ∈ F }. K = {c ∈ F |c ∗ (x ∗ y) = (c ∗ x) ∗ y for all x, y ∈ F }.
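As an elementary illustration of equation (1) and of the planarity condition in Definition 2, the following brute-force sketch works over the prime field F_p only (the paper's constructions live in F_{p^r}; the function names are ours):

    def planar_check_and_product(f, p):
        # Test the planarity condition of Definition 2 over F_p by brute force and,
        # if it holds, return the presemifield product (1): x*y = (f(x+y)-f(x)-f(y))/2.
        inv2 = pow(2, p - 2, p)                        # inverse of 2 in F_p, p an odd prime
        for a in range(1, p):                          # every nonzero direction a
            if len({(f((x + a) % p) - f(x)) % p for x in range(p)}) != p:
                return None                            # delta_a is not a bijection
        return lambda x, y: (f((x + y) % p) - f(x) - f(y)) * inv2 % p

    p = 11
    star = planar_check_and_product(lambda x: x * x % p, p)   # f(x) = x^2 is planar
    assert star is not None
    assert all(star(x, y) == x * y % p for x in range(p) for y in range(p))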
Here the dimensions of the middle nucleus M and of the kernel or left nucleus K of a commutative semifield are invariant under isotopy. The dimension of S depends on the embedding of the semifield in the field F. As mentioned in [5] we have K ⊆ M (if a ∗ (x ∗ y) = (a ∗ x) ∗ y for all x, y, then this also equals (a ∗ y) ∗ x = (y ∗ a) ∗ x = (x ∗ a) ∗ y = x ∗ (a ∗ y)). As M is closed under semifield multiplication and is associative, it is a field. The semifield multiplication on M can therefore be made to coincide with field multiplication. The same is true of the vector space structure of F over its subfield M. It follows that we can find a suitable isotope such that K ⊆ M ⊆ S. Here are the constructions from [11] and [2]:

Theorem 2. Let p be an odd prime, q = p^s, q' = p^t, F = F_{q^3}, s' = s/gcd(s, t), t' = t/gcd(s, t), s' odd. Let f : F → F be defined by

    f(x) = x^{1+q'} − v x^{q^2+qq'},   where ord(v) = q^2 + q + 1.

Then f is a PN function in each of the following cases:
– s' + t' ≡ 0 (mod 3);
– q ≡ q' ≡ 1 (mod 3).

Theorem 3. Let p be an odd prime, q = p^s, q' = p^t, K = F_q ⊂ F = F_{q^4} such that 2s/gcd(2s, t) is odd and q ≡ q' ≡ 1 (mod 4). Let f : F → F be defined by

    f(x) = x^{1+q'} − v x^{q^3+qq'},   where ord(v) = q^3 + q^2 + q + 1.
Then f is a PN function.

The first family of Theorem 2 was constructed in [11]; the second family of Theorem 2 and Theorem 3 are from [2]. In the generic case the first family of Theorem 2 is new, as was shown in [11]. It was proved in [2] that the semifields of order p^{4s} isotopic to the special case t = 2, s > 1 odd of Theorem 3 are not isotopic to Dickson or Albert semifields. Let f(x) be a Dembowski-Ostrom polynomial which is a PN function and ∗ the corresponding presemifield product (x ∗ y = (1/2){f(x + y) − f(x) − f(y)}). In the following section we describe a canonical construction of a (commutative) semifield strongly isotopic to (F, ∗) which allows one to read off information on the nuclei directly from f(x). In the last section we continue studying low-dimensional subfamilies of the planar functions of Theorems 2 and 3. In particular we describe 12-dimensional semifields with middle nucleus of dimension 2 and kernel F_p, as well as a new family of 9-dimensional semifields all of whose nuclei agree with the prime field. In the sequel p always denotes an odd prime.
2 From Commutative Presemifields to Semifields in Odd Characteristic
Definition 5. Let f(X) be a DO-polynomial defined on F = F_{p^r} for odd p. Let G = Gal(F|F_p) = {g_0 = id, g_1, ..., g_{r−1}} be the Galois group, where g_i(x) = x^{p^i}. Write

    f(X) = Σ_{i=0}^{r−1} a_i g_i(X^2) + Σ_{j<k} b_{jk} g_j(X) g_k(X),

where a_i, b_{jk} ∈ F. If f(X) is also a planar function, then the presemifield product defined by f(X) is

    x ∗ y = Σ_i a_i g_i(xy) + Σ_{j<k} (b_{jk}/2) ( g_j(x) g_k(y) + g_k(x) g_j(y) ).
Let t_i(X) = X^{p^i} − X.

Lemma 1. t_{mu}(X) is a polynomial in t_m(X).

Proof. Let Q = p^m. Then

    t_{mu}(X) = t_m(X)^{Q^{u−1}} + t_m(X)^{Q^{u−2}} + · · · + t_m(X) = g_{(u−1)m}(t_m(X)) + · · · + t_m(X).
Proposition 1. Let p odd, F = Fpr and (F, ∗) a commutative presemifield. Let α ∈ GL(r, p) and define a product ◦ by α(1) ∗ α(x ◦ y) = α(x) ∗ α(y). Then (F, ◦) is a commutative semifield with unit 1. It is strongly isotopic to (F, ∗). Proof. Obviously (F, ◦) is a commutative presemifield. It is related to (F, ∗) by the strong isotopy β(x ◦ y) = α(x) ∗ α(y) where β(x) = α(1) ∗ α(x). Choosing y = 1 shows α(1) ∗ α(x ◦ 1) = α(x) ∗ α(1). It follows x ◦ 1 = x. In case α = id we obtain 1 ∗ (x ◦ y) = x ∗ y. This is made explicit in the following definition and theorem. Definition 6. Let F = Fpr for odd p and f (x) a planar DO-polynomial on F. The associated semifield function is B(f (x)) where B ∈ GL(r, p) is the inverse of A(x) = x ∗ 1. The associated semifield product is the product ◦ defined by B(f (x)). r−1 2 Theorem 4. Let f (X) = j
Proof. Let x ∗ y be the presemifield product defined by f(X). We have

    A(x) = x ∗ 1 = Σ_i a_i g_i(x) + Σ_{j<k} (b_{jk}/2)( g_j(x) + g_k(x) )

and

    f(x) = A(x^2) + Σ_{j<k} (b_{jk}/2)( 2 g_j(x) g_k(x) − g_j(x^2) − g_k(x^2) ).

The expression in parentheses is

    2 g_j(x) g_k(x) − g_j(x^2) − g_k(x^2) = −( g_k(x) − g_j(x) )^2 = −g_j( t_{k−j}(x)^2 ).

This yields the associated semifield function

    B(f(x)) = x^2 − B( Σ_{j<k} (b_{jk}/2) g_j( t_{k−j}(x)^2 ) )

and the associated semifield product

    x ◦ y = xy − B( Σ_{j<k} (b_{jk}/2) g_j( t_{k−j}(x) t_{k−j}(y) ) ).
This follows from the linearity of Equation (1) and the fact that a term x^2 in the planar function turns into xy in the presemifield product. Observe that t_{mu}(X) is a polynomial in t_m(X) by Lemma 1. Let c ∈ F_{p^m}. Then c ◦ x = cx, as t_m(c) = 0. This shows F_{p^m} ⊆ S(F, ◦). In order to show F_{p^m} ⊆ M(F, ◦) it remains to be shown that (cx) ◦ y = x ◦ (cy) for all x, y. This also follows directly from the fact that t_{k−j}(cx) = c t_{k−j}(x) for all k, j such that b_{jk} ≠ 0.

Theorem 5. In the situation of Theorem 4 let l be the greatest common divisor of r and the numbers i, j, k where a_i ≠ 0 and j < k such that b_{jk} ≠ 0. Then the associated semifield has F_{p^l} in its left nucleus.

Proof. Let c ∈ F_{p^l}. We have to show (cx) ◦ y = c(x ∗ y). This follows from the form of B(f(x)) as given in the proof of Theorem 4 and the fact that A(x) and its inverse B(x) are linear over F_{p^l}.
3 Some Semifields and Their Nuclei
Theorem 6. The semifields of order p^{12} associated to the presemifields in the case s = 3, t = 2 of Theorem 3 have middle nucleus F_{p^2} and kernel F_p.

Proof. We have p ≡ 1 (mod 4), F = F_{p^{12}} and ord(v) = p^9 + p^6 + p^3 + 1. The planar function is

    f(x) = x^{1+p^2} − v x^{p^5+p^9}.

It follows from Theorem 4 that the middle nucleus M of the associated semifield (F, ◦) has even dimension. It was shown in [2] that dim(M) is not a multiple of
6. If dim(M) > 2, then dim(M) = 4. By a result of Menichetti [10] the semifield would be Albert which is not the case as we proved in [2]. It follows M = Fp2 . We have x ◦ y = xy − (1/2)B(t2 (x)t2 (y)) + (1/2)B(vg5 (t4 (x)t4 (y))) (see the proof of Theorem 4) and t4 (X) = t2 (X) + g2 (t2 (X)). Let K(X, Y ) be the polynomial such that x ◦ y = xy + K(t2 (x), t2 (y)). Then 2
2
K(X, Y ) = −(1/2)B(XY − vg5 ((X + X p )(Y + Y p )) = 5
7
5
7
7
5
= −(1/2)B(XY − v(XY )p − v(XY )p − vX p Y p − vX p Y p ). Assume dim(K) > 1. Then K = M = Fp2 . It has been proven in [5], Theorem 2 2 4.2, that this is equivalent with K(X, Y ) being a polynomial in X p and Y p . Although we do not know B explicitly it is obvious that this condition cannot be 11 i i+2 satisfied. In fact, let B(x) = i=0 βi gi (x). The absence of monomials X p Y p i for odd i shows β0 = β2 = . . . = β10 = 0. As (XY )p is absent for odd i we have 0 = βi − gi−5 (v)βi−5 − gi−7 (v)βi−7 = βi . This yields the contradiction B ≡ 0. We turn to Theorem 2. The smallest dimension for which new planar functions may result is r = 9 for the second subfamily. Here t should not be a multiple of 3 as otherwise a field or an Albert twisted field is obtained. Up to obvious 4 6 isotopy equivalences there are three cases, f (x) = x1+p −vxp +p , f (x) = x1+p − 3 7 2 3 8 vxp +p , f (x) = x1+p − vxp +p . We show that those yield new semifields all of whose nuclei agree with the prime field: Theorem 7. Let p ≡ 1 (mod 3), q = p3 , K = Fq ⊂ F = Fp9 and ord(v) = q 2 + q + 1. The semifields of order p9 associated to the planar functions 4
f (x) = x1+p − vxp
+p6
3
, f (x) = x1+p − vxp
+p7
2
3
or f (x) = x1+p − vxp
+p8
have middle nucleus Fp and are not isotopic to a commutative Albert semifield. Proof. Assume the middle nucleus M of a corresponding semifield has dimension > 1. Then the dimension is 3. By Menichetti [10] we are in the Albert case. It suffices therefore to prove that our presemifield is not isotopic to a commutative Albert presemifield. There are four cases to consider. Corresponds ing presemifields are described by the monomial planar functions X 1+p where s ∈ {1, 2, 3, 4}. Here case s = 3 corresponds to the uniquely determined Albert semifield with nucleus of dimension 3, the remaining values of s are representatives of the three isotopism classes of commutative Albert presemifields whose corresponding semifields have nucleus of dimension 1. Assume we have isotopy with one of those commutative Albert presemifields. It follows from CoulterHenderson [4], Corollary 2.8 that there is a strong isotopy. There exist invertible linear mappings 8 8 ai gi (x), β(x) = bi gi (x) α(x) = i=0
i=0
New Commutative Semifields and Their Nuclei
such that
185
s
α(x)1+p = β(f (x)). We complete the proof for the first type f (x). The proofs in the remaining cases are analogous. Observe that in the exponents modular distances (in the circle of length 9) d = 0, d = 3, d = 4 do not occur. In the case of distance d = 0 this yields ai ai+s = 0 for all i. The equations for d = 3 and d = 4 are the following: ai gs (ai+3−s ) + ai+3 gs (ai−s ) = 0. ai gs (ai+4−s ) + ai+4 gs (ai−s ) = 0. Without restriction a0 = 0. It follows as = a−s = 0. Evaluating the d = 3 equation for i ∈ {0, −3, s, s − 3} and the d = 4 equation for i ∈ {0, −4, s, s − 4} shows ai = 0 for i ∈ ±{s, s − 3, s − 4, s + 3, s + 4}. For s = 3 or s = 4 this yields the contradiction a0 = 0. For s = 1 or s = 2 the contradiction ai = 0 for all i = 0 is obtained.
References 1. Albert, A.A.: On nonassociative division algebras. Transactions of the American Mathematical Society 72, 296–309 (1952) 2. Bierbrauer, J.: New semifields, PN and APN functions. Designs, Codes and Cryptography (submitted) 3. Budaghyan, L., Helleseth, T.: New perfect nonlinear multinomials over Fp2k for any odd prime p,. In: Golomb, S.W., Parker, M.G., Pott, A., Winterhof, A. (eds.) SETA 2008. LNCS, vol. 5203, pp. 403–414. Springer, Heidelberg (2008) 4. Coulter, R.S., Henderson, M.: Commutative presemifields and semifields. Advances in Mathematics 217, 282–304 (2008) 5. Coulter, R.S., Henderson, M., Kosick, P.: Planar polynomials for commutative semifields with specified nuclei. Designs, Codes and Cryptography 44, 275–286 (2007) 6. Coulter, R.S., Matthews, R.W.: Planar functions and planes of Lenz-Barlotti class II. Designs, Codes and Cryptography 10, 167–184 (1997) 7. Dickson, L.E.: On commutative linear algebras in which division is always uniquely possible. Transactions of the American Mathematical Society 7, 514–522 (1906) 8. Horadam, K.J.: Hadamard matrices and their applications. Princeton University Press, Princeton (2007) 9. Kantor, W.M.: Commutative semifields and symplectic spreads. Journal of Algebra 270, 96–114 (2003) 10. Menichetti, G.: On a Kaplansky conjecture concerning three-dimensional division algebras over a finite field. Journal of Algebra 47, 400–410 (1977) 11. Zha, Z., Kyureghyan, G.M., Wang, X.: Perfect nonlinear binomials and their semifields. Finite Fields and Their Applications 15, 125–133 (2009)
Spreads in Projective Hjelmslev Geometries Ivan Landjev New Bulgarian University, 21 Montevideo str., 1618 Sofia, Bulgaria and Institute of Mathematics and Informatics, BAS, 8 Acad. G. Bonchev str., 1113, Sofia, Bulgaria
Abstract. We prove a necessary and sufficient condition for the existence of spreads in the projective Hjelmslev geometries PHG(RRn+1 ). Further, we give a construction of projective Hjelmslev planes from spreads that generalizes the familiar construction of projective planes from spreads in PG(n, q). Keywords: Projective Hjelmslev geometry, projective Hjelmslev plane, spread, finite chain ring. Subject Classifications: 51E26, 51E21, 51E22, 94B05.
1 Introduction In this paper, we introduce spreads in the projective Hjelmslev geometries PHG(Rn+1 R ). There exists an extensive literature about spreads in the projective geometries PG(k, q) (cf. [5] and the references there). The same objects in ring geometries have attracted little or no attention despite the connections with interesting areas as linear codes over finite chain rings. In what follows, we restrict ourselves mainly to spreads in geometries over chain rings of nilpotency index 2. This is a necessary step towards investigating geometries over chain rings of larger nilpotency index, because of the nested structure of the projective Hjelmslev geometries. On the other hand, geometries over rings have less regularities than the usual projective geometries, we settle for a problem that is tractable to some extent. Finally, there exists a complete classification for the chain rings R with |R| = q2 , R/ radR ∼ = Fq , so that in this case we have a description of all coordinate geometries. This paper is organized as follows. In Section 2, we give some basic facts about finite chain rings and the structure of projective Hjelmslev geometries over such rings. In Section 3, we prove a necessary and sufficient condition for the existence of spreads in the projective Hjelmslev geometries PHG(Rn+1 R ), where R is a finite chain ring of nilpotency index 2. We sketch the proof of the main theorem for arbitrary chain rings. In Section 4, we present a construction of projective Hjelmslev planes from spreads in PHG(Rn+1 R ). M. Bras-Amorós and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 186–194, 2009. c Springer-Verlag Berlin Heidelberg 2009
Spreads in Projective Hjelmslev Geometries
187
2 Basic Facts on Projective Hjelmslev Geometries A finite ring R (associative, with identity 1 = 0, ring homomorphisms preserving the identity) is called a left (resp. right) chain ring if the lattice of its left (resp. right) ideals forms a chain. It turns out that every left ideal is also a right ideal. Moreover, if N = rad R every proper ideal of R has the form N i = Rθi = θi R, for any θ ∈ N \ N 2 and some positive integer i. The factors N i /N i+1 are one-dimensional linear spaces over R/N. Hence, if R/N ∼ = Fq and m denotes the nilpotency index of N, the number of elements of R is equal to qm . For further facts about chain rings, we refer to [2,10,11]. As mentioned above, we consider chain rings of nilpotency index 2, i.e. chain rings with N = (0) and N 2 = (0). Thus we have always |R| = q2 , where R/N ∼ = Fq . Chain rings with this property have been classified in [3,12]. If q = pr there are exactly r + 1 isomorphism classes of such rings. These are: • for every σ ∈ Aut Fq the ring Rσ ∼ = Fq [X; σ]/(X 2 ) of so-called σ-dual numbers over addition and multiplication given Fq with underlying set Fq × Fq , component-wise by (x0 , x1 )(y0 , y1 ) = x0 y0 , x0 y1 + x1 σ(y0 ) ; • the Galois ring GR(q2 , p2 ) ∼ = Z p2 [X]/( f (X)), where f (X) ∈ Z p2 [X] is a monic polynomial of degree r, which is basic irreducible (cf. [10]). The rings Rσ with σ = id are noncommutative, while Rid is commutative. Moreover, char Rσ = p for every σ. The Galois ring GR(q2 , p2 ) is commutative and has characteristic p2 . From now on we denote by R a finite chain ring of nilpotency index 2. The only exception will be Theorem 8, where R is a chain ring of an arbitrary nilpotency index. Let R be a finite chain ring and consider the module M = RkR . Denote by M ∗ the set of all non-torsion vectors of M, i.e. M ∗ = M \ Mθ. Define sets P and L by
P = {xR; x ∈ M ∗ }, L = {xR + yR; x, y ∈ M ∗ , x, y linearly independent}, respectively, and take as incidence relation I ⊆ P × L set-theoretical inclusion. Further, define a neighbour relation on the sets of points and lines of the incidence structure (P , L , I) as follows: Y ) if there exist two different (N1) the points X,Y ∈ P are neighbours (notation X lines incident with both of them; (N2) the lines s,t ∈ L are neighbours (notation s t) if there exist two different points incident with both of them. The incidence structure Π = (P , L , I) with the neighbour relation is called the (k−1)dimensional (right) projective Hjelmslev geometry over R and is denoted by PHG(RkR ). The point set S ⊆ P is called a Hjelmslev subspace (or simply subspace) of PHG(RkR ) if for every two points X,Y ∈ S , there exists a line l incident with X and Y that is incident only with points of S . The Hjelmslev subspaces of PHG(RkR ) are of the form {xR; x ∈ U ∗ }, where U is a free submodule of M. The (projective) dimension of a subspace is equal to the rank of the underlying module minus 1. It is easily checked that is an equivalence relation on each one of the sets P and L . If [X] denotes the set of all points that are neighbours to X = xR, then [X] consists
188
I. Landjev
of all free rank 1 submodules of xR + Mθ. Similarly, the class [l] of all lines which are neighbours to l = xR + yR consists of all free rank 2 submodules of xR + yR + Mθ. More generally, two subspaces S and T , dim S = s, dim T = t, s ≤ t, are neighbours if {[X]; X ∈ S } ⊆ {[X]; X ∈ T }. In particular, we say that the point X is a neighbour of the subspace S if there exists a point Y ∈ S with X Y . The neighbour class [S ] contains all subspaces of dimension s that are neighbours to S . The next theorems give some insight into the structure of the projective Hjelmslev geometries PHG(RkR ) and are part of more general results [1,4,6,7,8,9,13]. As usual, n k q denotes the Gaussian coefficient: n (qn − 1)(qn−1 − 1) . . . (qn−k+1 − 1) = . k q (qk − 1)(qk−1 − 1) . . . (q − 1) Theorem 1. Let Π = PHG(RkR ) where R is a chain ring with |R| = q2 , R/N ∼ = Fq . Then k −1 (i) the number of points (hyperplanes) in Π is qk−1 nk q = qk−1 · qq−1 ; k−1 (ii) every point (hyperplane) has q neighbours; (iii) every subspace of dimension s − 1 is contained in exactly q(t−s)(k−t) k−s t−s q subspaces of dimension t − 1, where s ≤ t ≤ k; (iv) given a point P and a subspace S of dimension s − 1 containing P, there exist exactly qs−1 points in S that are neighbours to P. Note that the Hjelmslev spaces PHG(RkR ) are 2-uniform in the sense of [4]. Denote by η the natural homomorphism from Rk to Rk /Rk θ and by η the mapping induced by η on the submodules of Rk . It is clear that for every point X and every line l we have [X] = {Y ∈ P ; η(Y ) = η(X)}, [l] = {m ∈ L ; η(m) = η(l)}.
P
(resp. L ) the set of all neighbour classes of points (resp. lines). Let us denote by The following result is straightforward. Theorem 2. The incidence structure (P , L , I ) with incidence relation I defined by [X] I [l] ⇐⇒ ∃Y ∈ [X], ∃ m ∈ [l] : Y I m is isomorphic to the projective geometry PG(k − 1, q) Let S0 be a fixed subspace in PHG(RkR ) with dim S0 = s. Define the set P of subsets of P by P = {S ∩ [X]; X S0 , S ∈ [S0 ]}. The sets S ∩ [X] are either disjoint or coincide. Define an incidence relation I ⊂ P × L by / (S ∩ [X])I l ⇐⇒ l ∩ (S ∩ [X]) = 0.
Spreads in Projective Hjelmslev Geometries
189
Let L (S0 ) be the set of all lines in L incident with at least one point in P. For the lines l1 , l2 ∈ L (S0 ) we write l1 ∼ l2 if they are incident (under I) with the same elements of P. The relation ∼ is an equivalence relation under which L (S0 ) splits into classes of equivalent lines. Denote by L a set of representatives of the equivalence classes of lines in L (S0 ). The set of representatives L contains only two types of lines: lines l with l S0 . S0 and lines l with l Theorem 3. The incidence structure (P, L, I |P×L ) can be embedded into PG(k−1, q). A special case of this result is obtained if we take S0 to be a point. Given Π = (P , L , I) = PHG(RkR ) and a point P ∈ P , let L (P) be the set of all lines in L incident with points in [P]. For two lines s,t ∈ L (P) we write s ∼ t if s and t coincide on [P]. Denote by L1 a complete list of representatives of the lines from L (P) with respect to the equivalence relation ∼. Then we have the following result: Theorem 4
([P], L1 , I|[P]×L1 ) ∼ = AG(k − 1, q).
Finally, let two points X1 and X2 in Π = PHG(RkR ) be neighbours. Then any two lines incident with X1 and X2 are neighbours and belong to the same class, [l] say. In such case we say that the neighbour class [l] has the direction of the pair (X1 , X2 ).
3 The Existence of Spreads in Projective Hjelmslev Geometries Definition 5. An r-spread of the projective Hjelmslev geometry PHG(Rn+1 R ) is a set S of r-dimensional subspaces such that every point is contained in exactly one subspace of S . Theorem 6. Let R be a chain ring with |R| = q2 , R/ rad R ∼ = Fq . There exists a spread S of r-dimensional spaces of PHG(RnR ) if and only if r + 1 divides n + 1. Proof. The number of points in an r-dimensional subspace is qr r+1 1 q . The existence of r+1 r+1 r a spread of r dimensional subspaces implies that q 1 q divides qn n+1 1 q , i.e. 1 q divides n+1 , i.e. r + 1 divides n + 1. 1 q Assume that r + 1 divides n + 1 and let s be determined by n + 1 = (s + 1)(r + 1). First we consider the case where R = GR(q2 , p2 ). Take an algebra of dimension r + 1 over R = GR(q2 , p2 ), e.g. let this algebra be Rr+1 = R[X]/( f (X)) = GR(q2(r+1) , p2 ), where f is a monic irreducible polynomial of degree r + 1 over R. If α is a root of f in Rr+1 then every element β from Rr+1 can be written as β = b0 + b1 α + . . . + br αr , bi ∈ R. n+1 are isomorphic as modules over R. Thus each point in Clearly, Rs+1 r+1 , Rn+1 and R n+1 PHG(RR ) can be represented by an (s + 1)-tuple of elements from Rr+1 or as a unit in Rn+1 . In the same time, every (s + 1)-tuple over Rr+1 that has at least one coordinate that is a unit, can be viewed as a point in PHG(Rs+1 r+1 ).
190
I. Landjev
∗ Let (γ0 , γ1 , . . . , γs ) ∈ (Rs+1 r+1 ) be a nontorsion vector. Without loss of generality, let γ0 = 0. Consider the system −γ1 x0 + γ0 x1 =0 −γ2 x0 + + γ x =0 0 2 . (1) . .. ... = 0 −γs x0 + + γ0 xs = 0
The choice of the nonzero element is not essential. If we take γ j = 0. The system (1) is equivalent to −γi x j + γ j xi = 0 for i = 0, 1, . . . , s, i = j. The solutions of (1) form a free s+1 submodule of rank 1 Rs+1 r+1 , i.e. a point in PHG(Rr+1 ). This rank 1 submodule can be considered as a free submodule of rank (r + 1) of Rn+1 R , i.e. a r-dimensional subspace of PHG(Rn+1 ). Two (r + 1)-dimensional subspaces in Rn+1 obtained from different 1R R s+1 dimensional subspaces of Rr+1 do not have a common nontorsion vector. Now consider two different points (γ0 , γ1 , . . . , γs ) and (δ0 , δ1 , . . . , δs ) in PHG(Rs+1 r+1 ). These points give rise to systems of the type (1) having as solutions different points of s+1 PHG(Rs+1 r+1 ) (1-dimensional subspaces of Rr+1 ). Assume otherwise and let the (s + 1)tuple (x0 , x1 , . . . , xs )=(0, 0, . . . , 0) be a common solution of the two systems. Then x0 = λγ0 = µδ0 , x1 = λγ1 = µδ1 , . . . , xs = λγs = µδs , where λ, µ ∈ Rr+1 , λ, µ = 0. This is a contradiction since the points (γ0 , γ1 , . . . , γs ) and (δ0 , δ1 , . . . , δs ) were assumed to be different. It remains to prove that every point is contained in a r-dimensional subspace. The n n+1 ; the number of points in an r-dimensional ) is q number of points in PHG(Rn+1 R 1 q s(r+1) s+1 subspace is qr r+1 and the number of points in PHG(Rs+1 r+1 ) is q 1 q 1 qr+1 . Now we have r+1 s+1 qr+1 − 1 s(r+1) q(s+1)(r+1) − 1 n n+1 ·q = q · qs(r+1) = qr , qr 1 q 1 qr+1 1 q q−1 qr+1 − 1 which means that the r-dimensional subspaces cover all points of PHG(Rn+1 R ). Secondly, consider the case where R is the ring of σ dual numbers over the finite field Fq , i.e. R = Rσ = Fq + tFq . Denote by R the ring of σ -dual numbers Fqr+1 + tFqr+1 , where σ |Fqr+1 = σ. Similarly, let R be the ring of σ -dual numbers Fqn+1 + tFqn+1 , where σ |Fqn+1 = σ . The ring R is a subring of R which in turn is a subring of R . As
and Rn+1 and R R are isomorphic as (right) submodules over R. above, R s+1 R R Consider an arbitrary nontorsion vector (γ0 , γ1 , . . . , γs ) ∈ R R s+1 . Fix a component which is a unit, γ0 say, and consider the system of linear equations (1). The set of solutions of (1) is a free rank 1 submodule of R s+1 which can be viewed as a free rank R r submodule of Rn+1 R . Further the proof is completed as for Galois rings. Remark 7. Assume r + 1 divides n + 1. We can prove the existence of a spread of rdimensional subspaces using the nested structure of the projective Hjelmslev geometries.
Spreads in Projective Hjelmslev Geometries
191
Let H0 be a fixed subspace in PHG(RkR ) with dim H0 = r. Define the set P of subsets of P by P = {H ∩ [X]; X H0 , H is a subspace, dim H = s, H ∈ [H0 ]}. The sets H ∩ [X] are either disjoint or coincide. Define an incidence relation I ⊂ P × L by / (H ∩ [X])I l ⇐⇒ l ∩ (H ∩ [X]) = 0. Let L (H0 ) be the set of all lines in L incident with at least one point in P. For the lines l1 , l2 ∈ L (H0 ) we write l1 ∼ l2 if they are incident (under I) with the same elements of P. The relation ∼ is an equivalence relation under which L (H0 ) splits into nonintersecting classes of equivalent lines. Denote by L a set of representatives of the equivalence classes of lines in L (H0 ). The set of representatives L contains only two types of lines: H0 . lines l with l H0 and lines l with l It is known from [6] that the incidence structure (P, L, I |P×L ) can be embedded isomorphically into the projective geometry PG(k − 1, q). Hence we can construct a spread in PHG(Rn+1 R ) in the following way. We start with a spread in the factor geometry PG(n, q). This spread defines a set of neighbourhood classes of projective r-subspaces. Each one of these classes is isomorphic in the sense of the above mentioned result to a projective geometry PG(n, q) with an (n − r − 1)-dimensional space deleted. Now it suffices to take a spread which contains a spread of the deleted (n − r − 1)-dimensional subspace. A spread with this property can be constructed, for instance, by repeating the construction from the proof of Theorem 6. The exceptional (n − r − 1)-dimensional space can be taken as the space consisting of all points having zeros in the first r + 1 positions. Theorem 6 can be generalized to projective Hjelmslev geometries over arbitrary chain rings R. Theorem 8. Let R be a chain ring with |R| = qm , R/ rad R ∼ = Fq . The n-dimensional projective Hjelmslev geometry PHG(Rn+1 ) has a spread of r-dimensional projective R Hjelmslev subspaces if and only if r + 1 divides n + 1. Proof. We give only a sketch of a proof. For the sake of convenience, we set N = rad R = θR. As before, the "only if"-part is straightforward. The proof of the "if"-part uses induction on m and n. This result is obviously true for m = 1 and 2. It is also trivial if n = r for every m. Consider the factor geometry having as points the (m − 1)-neighbour classes on points. It is isomorphic to PHG((Rk /θm−i Rk )R/N m−i (cf. [6]). By the induction hypothesis, it has a spread of r-dimensional projective Hjelmslev subspaces. The preimage of these subspaces are of the form [Δ]m−1 where Δ is an r-dimensional Hjelmslev subspace in PHG(Rn+1 R ). Here [Δ] j is the class of all r-dimensional Hjelmslev subspaces that are ∼ j-th neighbours to Δ. Now [Δ] j can be imbedded isomorphically in PHG((R/N)n+1 R/N ) = PG(n, q) (cf. [6]) where the missing part is an (n − r − 1)-dimensional subspace, H say. Since r + 1 divides n − r = (n + 1) − (r + 1), we have that H contains a spread by the induction hypothesis. Now it is enough to take a spread which contains as a subset the spread of the missing (n − r − 1)-dimensional subspace.
192
I. Landjev
4 Projective Hjelmslev Planes from Spreads Spreads in Π = PHG(Rn+1 R ) can be used to construct projective Hjelmslev planes. Set n = 2t − 1, r = t − 1, s = 1. By Theorem 6 , there exists a spread S of r-dimensional subspaces of Π such that its image under the canonical map η is a (multiple of a) spread in PG(n, q). = PHG(Rn+2 ), e.g. by taking by taking as The geometry Π can be imbedded in Π R with first coordinate 0. Hence Π can be considered as a points of Π all points of Π Denote by [Π] the set of all neighbour hyperpalnes to Π in Π. Define hyperplane of Π. a new incidence structure as follows: Take as points: that are not incident with a point of [Π]. These are called proper (1) all points of Π points and their number is: qn+1
qn+2 − 1 qn+1 − 1 − qn+1 = q2(n+1) = q4t . q−1 q−1
(2) all subspaces of the form S, P ∩ H, \ [Π] and H is where S is an r-dimensional subspace from S , P is a point from Π a hyperplane of Π contained in the neighbour class [Π]. We cann call these ideal points. The number of choices for the point P is q4t = q2(n+1) . The number of choices for S ∈ S is qn q q−1−1 n+1
|S | =
qr q q−1−1 r+1
=
q2t−1 (q2t − 1) = qt (qt + 1). qt−1 (qt − 1)
The number of choices for H ∈ [Π] is qn+1 = q2t . For all points Q in S, P \ [Π], we have S, Q ∩ H = S, P ∩ H i.e. we get the same point in the new incidence structure. Hence for qr+1
qr+2 − 1 qr+1 − 1 − qr+1 = q2(r+1) = q2t . q−1 q−1
different points P we get the same (r + 1)-dimensional subspace S, P. As lines we take: \ [Π], i.e. (1) all subspaces of the form S, P, where S ∈ S and P is a point from Π these are all (r + 1)-dimensional subspaces through r-dimensional subspaces in the spread; (2) all hyperplanes H from [Π].
Spreads in Projective Hjelmslev Geometries
193
For the ideal points, we For the proper points neighbourhood is inherited from Π. have that S , P ∩ H S , P ∩ H if and only if S and S are neighbours in Π. By definition, two lines 1 and 2 are neighbours if for every point X ∈ 1 there exists a point Y ∈ 2 with X Y , and, conversely, for every Y ∈ 2 there exists an X ∈ 1 with Y X. \ [Π] Lemma 9. Let S be an r-dimensional subspace in Π and let P, Q be points from Π with P Q. Then S, P ∩ [Π] = S, Q ∩ [Π]. Proof. Assume there exists a point X ∈ [Π] with X ∈ S, Q, but X ∈ S, P. The lines PY and QY are neighbours. Therefore |PY ∩ QY | = q. The common points of both lines must be neighbours to Y . Hence the q common points must lie in [Π], contradiction to the initial assumption. Lemma 10. The number of hyperplanes from [Π] through a fixed r-dimensional flat S ∈ S (S ⊂ Π) is qt . Proof. Any r-dimensional flat in an (n + 1)-dimensional space can be given by a set of (n + 1) − r = t + 1 equations. Without loss of generality, let S be given by x0 = x1 = . . . = xt = 0 and let Π be the hyperplane defined by x0 = 0. An arbitrary hyperplane in [Π] containing S satisfies an equation of the form: x0 + θ(r1 x1 + r2 x2 + . . . + rt xt ) = 0.
(2)
We have θr = θs if and only if r − s ∈ rad R, therefore (2) describes all hyperplanes in [Π] through S when (r1 , r2 , . . . , rt ) runs Γt , where Γ is a set of elements no two of which are congruent modulo rad R. hence there are exactly q possibilities for each ri and the number of hyperplanes in [Π] through S is qt . According to Lemma 9 the number of the essentially different choices of P is qn+2 −1 q−1 qr+2 −1 q−1
− q q−1−1 n+1
−
qr+1 −1 q−1
=
qn+1 = qt . qr+1
qt (qt
+ 1) and the number of hyperplanes H from [Π] The number of choices for S is is q2t . On the other hand, by Lemma 10, we get the same intersection for qt different hyperplanes in [Π]. Each ideal point of the second type can be obtained for qt different subspaces S, P. Therefore the number of all points of the second type is qt · qt (qt + 1) · q2t = q3t + q2t . qt · qt Now it is a straightforward check that the defined incidence structure is indeed a projective Hjelmslev plane. Acknowledgements. This research has been supported by the Strategic Development Fund of The New Bulgarian University.
194
I. Landjev
References 1. Artmann, B.: Hjelmslev-Ebenen mit verfeinerten Nachbarschaftsrelationen. Mathematische Zeitschrift 112, 163–180 (1969) 2. Clark, W.E., Drake, D.A.: Finite chain rings. Abhandlungen aus dem mathematischen Seminar der Universität Hamburg 39, 147–153 (1974) 3. Cronheim, A.: Dual numbers, Witt vectors, and Hjelmslev planes. Geometriae Dedicata 7, 287–302 (1978) 4. Drake, D.A.: On n-Uniform Hjelmslev Planes. Journal of Combinatorial Theory 9, 267–288 (1970) 5. Hirschfeld, J.W.P.: Projective geometries over finite fields. Clarendon Press, Oxford (1998) 6. Honold, T., Landjev, I.: Projective Hjelmslev geometries. In: Proc. of the International Workshop on Optimal Codes, Sozopol, pp. 97–115 (1998) 7. Kreuzer, A.: Hjelmslev-Räume. Resultate der Mathematik 12, 148–156 (1987) 8. Kreuzer, A.: Projektive Hjelmslev-Räume, Dissertation, Technische Universität München (1988) 9. Kreuzer, A.: Hjelmslevsche Inzidenzgeometrie - ein Bericht, Bericht TUM-M9001, Technische Universität München, Beiträge zur Geometrie und Algebra Nr. 17 (January 1990) 10. McDonald, B.R.: Finite rings with identity. Marcel Dekker, New York (1974) 11. Nechaev, A.A.: Finite principal ideal rings. Mat. Sbornik of the Russian Academy of Sciences 20, 364–382 (1973) 12. Raghavendran, R.: Finite associative rings. Compositio Mathematica 21, 195–229 (1969) 13. Veldkamp, F.D.: Geometry over rings. In: Buekenhout, F. (ed.) Handbook of Incidence Geometry—Buildings and Foundations, pp. 1033–1084. Elsevier Science Publishers, Amsterdam (1995)
On the Distribution of Nonlinear Congruential Pseudorandom Numbers of Higher Orders in Residue Rings

Edwin D. El-Mahassni¹ and Domingo Gomez²

¹ Department of Computing, Macquarie University, North Ryde, NSW 2109, Australia
[email protected]
² Faculty of Science, University of Cantabria, E-39071 Santander, Spain
[email protected]
Abstract. The nonlinear congruential method is an attractive alternative to the classical linear congruential method for pseudorandom number generation. In this paper we present new discrepancy bounds for sequences of s-tuples of successive nonlinear congruential pseudorandom numbers of higher orders modulo a composite integer M .
1 Background
For an integer M > 1, we denote by Z_M the residue ring modulo M. In this paper we present some distribution properties of a generalization of pseudorandom number generators, first introduced in [7], defined by the recurrence

    u_{n+1} ≡ f( g_1(u_n, ..., u_{n−r+1}), ..., g_r(u_n, ..., u_{n−r+1}) )  (mod M),    (1)

where f(X_1, ..., X_r), g_1(X_1, ..., X_r), ..., g_r(X_1, ..., X_r) ∈ Z_M[X_1, ..., X_r], for n ≥ r − 1 and some initial values u_0, ..., u_{r−1}. To study this pseudorandom number generator, we define the sequence of polynomials f_k(X) ∈ Z_M[X], with X = (X_1, ..., X_r), by the recurrence relation

    f_k(X) ≡ f_{k−1}( g_1(X), ..., g_r(X) )  (mod M),   k ≥ 1,    (2)
where f_0(X) = f(X). Obviously, (1) becomes periodic with some period t ≤ M^r. Throughout this paper we assume that this sequence is purely periodic, i.e., u_n = u_{n+t} beginning with n = 0; otherwise we consider a shift of the original sequence.
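A small simulation of the recurrence (1) may help fix ideas (a sketch; the modulus and the polynomials f, g_1, g_2 below are arbitrary toy choices, not taken from the paper):

    def nlc_generator(f, gs, seed, M, length):
        # Higher-order nonlinear congruential generator (1):
        # u_{n+1} = f(g_1(u_n, ..., u_{n-r+1}), ..., g_r(u_n, ..., u_{n-r+1})) mod M.
        r = len(gs)
        state = list(seed)                            # u_0, ..., u_{r-1}
        out = list(state)
        for _ in range(length - r):
            window = tuple(reversed(state[-r:]))      # (u_n, ..., u_{n-r+1})
            inner = [g(*window) % M for g in gs]
            state.append(f(*inner) % M)
            out.append(state[-1])
        return out

    M = 91                                            # a composite modulus, 7 * 13
    f  = lambda y1, y2: y1 * y1 + 3 * y2 + 5
    g1 = lambda x1, x2: x1 + x2
    g2 = lambda x1, x2: x1 * x2 + 1
    print(nlc_generator(f, (g1, g2), seed=(2, 7), M=M, length=12))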
Domingo Gomez is partially supported by the Spanish Ministry of Education and Science grant MTM20067088.
Although the distribution of nonlinear congruential generators has been studied extensively, see [5,6,9] for instance, much less is known for its higher-order analogue. A result for a class of polynomials for prime moduli was established in [7], and this was later extended to a larger family of polynomials in [8]. In this paper we show a generalization of this result to a larger class of pseudorandom number generators.

1.1 Notation and First Results
This section begins with some notation. It will be assumed that N, A_i, B_i and b_i represent positive integers and 0 is the r-dimensional zero vector. The elements of Z_M will be identified with the integers {0, ..., M − 1}. For this reason, we can define e_M(z) = exp(2πiz/M) for any element z ∈ Z_M. For a polynomial f, G is the gcd of the coefficients of its nonconstant monomials with M. For a polynomial f with integer coefficients and total degree d we write f^{(p)}(X) ≡ f(X) (mod p). Finally, we denote by deg_{X_r} f the degree of the polynomial f in the variable X_r, taken modulo every prime factor p of M.

Lemma 1. Given a polynomial in the ring Z_M[X],

    f(X) ≡ Σ_{i_1=0}^{d_1} · · · Σ_{i_r=0}^{d_r} b_{i_1,...,i_r} X_1^{i_1} · · · X_r^{i_r}  (mod M),
with degXr f ≥ 1 , total degree d, then exist polynomials h1 (Xr ), . . . , hr−1 (Xr ) of degree less than d(log d + 1) such that (p)
(p)
g (p) (Xr ) = f (p) (a1 + h1 (Xr ), . . . , ar−1 + hr−1 (Xr ), Xr )
(3)
is a nonconstant polynomial for any p|M , p |G and any values a1 , . . . , ar−1 ∈ ZZM . Proof. Let p|M be a prime number not dividing G and D = log d + 1. By the definition of G, we note that f (p) (X) is not a constant polynomial and it can be expressed as f (p) (X) = h(X1 , . . . , Xr−1 )Xrd +f (X) where f (X) is a polynomial of degree stricly less than d in Xr . If d = 0 then f (p) (X) = h(X1 , . . . , Xr−1 ), but in any case h is a not a constant polynomial. Let IF be an extension field of degree D over ZZp . By the cardinality of IF, there exist ξ1 , . . . , ξr−1 ∈ IF such as h(ξ1 , . . . , ξr−1 ) = 0. It is easy to check that
f (p) (X1 + ξ1 Xr , . . . , Xr−1 + ξr−1 Xr , Xr ) = h(ξ1 , . . . , ξr−1 )Xrd + f (X) (4) where d ≥ d > 0 and f (X) is a polynomial with total degree less than d − 1 in Xr . Let E be a extension field of degree d + 1 over IF and let θ be a defining element of E over both ZZp and E, i.e. E ≡ ZZp (θ) ≡ IF(θ).
On the Distribution of Nonlinear Congruential Pseudorandom Numbers
197
The evaluation of the polynomial in (4) f (p) (a1 + ξ1 θ, . . . , ar−1 + ξr−1 θ, θ) = 0,
a1 , . . . , ar−1 ∈ ZZp
(5)
because the degree of the minimal polynomial of θ over IF is d + 1. (p)
(p)
For i = 1, . . . , r−1, each element ξi θ can be expressed as hi (θ), where hi (Xr ) ∈ ZZp [Xr ]. Applying the Chinese Remainder Theorem to the different polynomials (p) hi (Xr ) for each prime p|M , we find the corresponding hi (Xr ). By construction, (p) (p) f (p) (a1 + h1 (Xr ), . . . , ar−1 + hr−1 (Xr ), Xr ) is not the zero polynomial by (5) for any integer values a1 , . . . , ar−1 . Now, we proceed to define a family of polynomials depending on g1 (X),. . . ,gr (X) which will be the main subject of the article. Let IK be a field. We denote by T as the set of polynomials, f , such that s−1 j=0 aj (fk+j (X) − fl+j (X)) is nonconstant, where fi (X) are defined by (2) and aj ∈ IK, with at least one aj = 0 and k = l. Here is a sufficient condition for a certain polynomial to be in class T . To prove the result, we need some background. We start defining a homomorphism of polynomial rings φ : IK[X1 , . . . , Xr ] → IK[X1 , . . . , Xr ] with φ(Xi ) = gi (X). Polynomials g1 (X), . . . , gr (X) are said to be algebraically independent if the application φ is injective. φk denotes the composition of the function φ k times with φ0 being the identity map. Lemma 2. Let f (X) be a polynomial in IK[X] and g1 (X), . . . , gr (X) be algebraically independent and IF be an extension field of IK. Suppose that there exists (b1 , . . . , br ), (c1 , . . . , cr ) ∈ IFr two different zeros of the polynomials g1 (X), . . . , gr (X) with f (c1 , . . . , cr ) = f (b1 , . . . , br ), then f ∈ T . Proof. s−1 Suppose that k > l, and exist a0 , . . . , as−1 ∈ IK with a0 = 0 satisfying; j=0 aj (fk+j (X) − fl+j (X)) = K, where K ∈ IK. Then s−1
aj (fk+j (X) − fl+j (X)) = φk−1
s−1 j=0
aj (f1+j (X) − f1−k+l+j (X))
j=0
s−1 and this implies K = j=0 aj ((f1+j (X)) − (f1−k+l+j (X))) because φ is an injective map. By equation (2), we notice that for k = 0, we have that fk (b1 , . . . , br ) = fk−1 (0, . . . , 0) = fk (c1 , . . . , cr ), so substituting in the equation both points and subtracting the result, we get that a0 = 0. The last remark in this section is that conditions in this criterion can be tested using Groebner basis.
198
1.2
E.D. El-Mahassni and D. Gomez
Exponential Sums and Previous Results
We start by listing some previous bounds on exponential sums which will be used to establish our main results. The first Lemma is the well-known Hua-Loo Keng bound in a form which is a relaxation of the main result of [11] (see also Section 3 of [3] and Lemma 2.2 in [6]), followed by its multidimensional version. Lemma 3. For any polynomial f (X) = bd X d + . . . + b1 X + b0 ∈ ZZM [X] of degree d ≥ 1, there is a constant c0 > 0 where the bound eM (f (x)) < ec0 d M 1−1/d G1/d x∈ZZM
holds, where G = gcd(bd , . . . , b1 , M ). Lemma 4. Let f (X), with total degree d ≥ 2 and degree greater than one in Xr , be a polynomial with integer coefficients, with G = 1. Then the bound 2 2 eM (f (x1 , . . . , xr )) ≤ ec0 d (log d+1) M r−1/(d (log d+1)) x1 ,...,xr ∈ZZM holds, where c0 is some positive constant. Proof. We recall the univariate case that appears as Lemma 3. Then let g(X) = f (X1 + h1 (Xr ), X2 + h2 (Xr ), . . . , Xr ) where hi (Xr ) are the polynomials defined in Equation (3). It is easy to see that eM (g(x1 , x2 . . . , xr )) . eM (f (x1 , x2 . . . , xr )) = where the summations are taken over x1 , x2 . . . , xr ∈ ZZM since (x1 , . . . , xr ) → (x1 + h1 (xr ), x2 + h2 (xr ), . . . , xr ) merely permutates the points. By Lemma 1, for any selection x1 , . . . , xr−1 this polynomial is not constant modulo p and the gcd of the coefficients of g and M are coprime. Hence, applying Lemma 3, we have e (g(x , x . . . , x )) M 1 2 r x1 ,x2 ...,xr ∈ZZM ≤ eM (g(x1 , x2 . . . , xr )) x1 ,...,xr−1 ∈ZZM xr ∈ZZM
≤ ec0 d
2
(log d+1)
M r−1/d
2
(log d+1)
.
We obtain the last step by noting that the degree of g in Xr can be bounded by d2 (log d + 1) and so we are done.
On the Distribution of Nonlinear Congruential Pseudorandom Numbers
199
This now allows us to state and prove the following Lemma. Lemma 5. Let f (X) be a polynomial with integer coefficients with degXr f ≥ 1 and total degree d. Recalling the definition of G, ≤ ec0 d2 (log d+1) M r (G/M )1/d2 (log d+1) e (f (x , . . . , x )) M 1 r x1 ...,xr ∈ZZM Proof. We let fG (x1 , . . . , xr ) = (f (x1 , . . . , xr ) − f (0))/G and m = M/G. Then, = e (f (x , . . . , x )) e (f (x , . . . , x ) − f (0)) M 1 r M 1 r x1 ,...,xr ∈ZZM x1 ,...,xr ∈ZZM = Gr em (fG (x1 , . . . , xr )) x1 ,...,xr ∈ZZm Now fG (x1 , . . . , xr ) satisfies the conditions in Lemma 4, so: 2 2 r G em (fG (x1 , . . . , xr )) ≤ Gr ec0 d (log d+1) (m)r−1/d (log d+1) x1 ,...,xr ∈ZZm and so the result follows. Lastly, we will make use of the following lemma, which is essentially the multidimensional version of Lemma 2.3 of [6]. Lemma 6. Let f (X) ∈ ZZM [X] be a polynomial such that f (p) ∈ T for every p|M and let s−1
aj (fk+j (X) − fl+j (X)) =
j=0
d1 i1 =0
...
dr
bi1 ,...,ir X1i1 . . . Xrir ,
ir =0
where k = l. Recalling the definition of G, the following equality G = gcd(a0 , . . . , as−1 , M ) holds. Proof. We put Aj = aj /G and m = M/G, j = 0, . . . , s − 1. In particular, gcd(A0 , . . . , As−1 , m) = 1. It is enough to show that the polynomial H(X) =
s−1
Aj (fk+j (X) − fl+j (X))
j=0
is nonconstant modulo any prime p|m, for k = l.
(6)
200
E.D. El-Mahassni and D. Gomez
By definition, we have H (p) (X) ≡
s−1
(p) (p) Aj fk+j (X) − fl+j (X)
(mod p)
j=0
and H (p) (X) can not be a constant polynomial, since f (p) ∈ T and so we are done. 1.3
Discrepancy
For a sequence of N points −1 Γ = (γ0,n , . . . , γs−1,n )N n=0
(7)
of the half-open interval [0, 1)s , denote by ΔΓ its discrepancy, that is, TΓ (B) − |B| , ΔΓ = sup N B⊆[0,1)s where TΓ (B) is the number of points of the sequence Γ which hit the box B = [α0 , β0 ) × . . . × [αs−1 , βs−1 ) ⊆ [0, 1)s and the supremum is taken over all such boxes. For an integer vector a = (a0 , . . . , as−1 ) ∈ ZZs we put |a| =
max
i=0,...,s−1
|ai |,
r(a) =
s−1
max{|ai |, 1}.
(8)
i=0
We need the Erd¨ os–Tur´ an–Koksma inequality (see Theorem 1.21 of [4]) for the discrepancy of a sequence of points of the s-dimensional unit cube, which we present in the following form. Lemma 7. There exists a constant Cs > 0 depending only on the dimension s such that, for any integer L ≥ 1, for the discrepancy of a sequence of points (7) the bound ⎛ ⎛ ⎞⎞ N −1 s−1 1 1 1 ⎝ ⎝ ⎠ ΔΓ < Cs + exp 2πi aj γj,n ⎠ L N r(a) n=0 j=0 0<|a|≤L holds, where |a|, r(a) are defined by (8) and the sum is taken over all integer vectors a = (a0 , . . . , as−1 ) ∈ ZZs with 0 < |a| ≤ L. The currently best value of Cs is given in [2].
On the Distribution of Nonlinear Congruential Pseudorandom Numbers
2
201
Discrepancy Bound
Let the sequence (un ) generated by (1) be purely periodic with an arbitrary period t. For an integer vector a = (a0 , . . . , as−1 ) ∈ ZZs we introduce the exponential sum ⎛ ⎞ N −1 s−1 Sa (N ) = eM ⎝ aj un+j ⎠ . n=0
j=0
Theorem 1. Let the sequence (un ), given by (1) with a polynomial f (p) (X) ∈ T , for every prime divisor p of M , with total degree d and degXr f ≥ 1, be purely periodic with period t and t ≥ N ≥ 1. The bound |Sa (N )| = O N 1/2 M r/2 (log log(M/G))−1/2 max gcd(a0 ,...,as−1 ,M)=G
holds, where G = gcd(a0 , . . . , as−1 , M ) and the implied constant depends only on s and d. Proof. The proof follows a strategy first seen in [9]. Select any a = (a0 , . . . , as−1 ) ∈ ZZs with gcd(a0 , . . . , as−1 , M ) = G. It is obvious that for any integer k ≥ 0 we have ⎛ ⎞ N −1 s−1 Sa (N ) − ⎝ ⎠ eM aj un+k+j ≤ 2k. n=0 j=0 Therefore, for any integer K ≥ 1, K|Sa (N )| ≤ W + K 2 , where
⎛ ⎞ ⎛ ⎞ N N s−1 −1 K−1 s−1 −1 K−1 ⎝ ⎠ ⎝ ⎠ W = eM aj un+k+j ≤ eM aj un+k+j . n=0 k=0 n=0 k=0 j=0 j=0
Accordingly, letting x = x1 , . . . , xr , we obtain ⎛ ⎞2 N −1 K−1 s−1 2 ⎝ ⎠ eM aj fk+j (un , . . . , un−r+1 ) W ≤N n=0 k=0 j=0 ⎛ ⎞2 s−1 K−1 ⎝ ⎠ ≤N e a f (x) M j k+j j=0 x∈ZZrM k=0 ⎛ ⎞ K−1 s−1 K−1 =N eM ⎝ aj (fk+j (x) − fl+j (x))⎠ . k=0 l=0 x∈ZZrM
j=0
202
E.D. El-Mahassni and D. Gomez
If k = l, then the inner sum trivially equal to M r . There are K such sums. is s−1 Otherwise the polynomial j=0 aj (fk+j (X) − fl+j (X)) is nonconstant since f (p) ∈ T . Hence we can apply Lemma 5 and Lemma 6 (so that we only need consider aj , j = 0, . . . , s − 1, instead of the coefficients of f ) to the inner sum, obtaining the upper bound ec0 d
3(K+s−2)
M r−1/d
3(K+s−2)
G1/d
3(K+s−2)
for at most K 2 sums and positive constant c0 and noting that d2 (log d + 1) < d3 . Hence, W 2 ≤ KN M r + K 2 N ec0 d
3(K+s−2)
M r−1/d
3(K+s−2)
G1/d
3(K+s−2)
.
Now, without too much loss of generality we may assume (d + 1)3(K+s−2) ≥ 2. Next we put K = log log(M/G)/(3c log(d + 1)), for some c > 2 to guarantee that the first term dominates and the result follows. Next, let Ds (N ) denote the discrepancy of the points given by u un+s−1 n ,..., , n = 0, . . . , N − 1, M M in the s-dimensional unit cube [0, 1)s . Theorem 2. If the sequence (un ), given by (1) with a polynomial f (p) (X) ∈ T , for every prime divisor p of M , with total degree d and degXr f ≥ 1 is purely periodic with period t with t ≥ N ≥ 1, then the bound Ds (N ) = O N −1/2 M r/2 (log log log M )s /(log log M )1/2 holds, where the implied constant depends only on s and d. Proof. The statement follows from Lemma 7, taken with
L = N 1/2 M −r/2 (log log M )1/2 and the bound of Theorem 1, where all occurring G = gcd(a0 , . . . , as−1 , M ) are at most L.
References 1. Arkhipov, G.I., Chubarikov, V.N., Karatsuba, A.A.: Trigonometric Sums in Number Theory and Analysis, de Gruyter Expositions in Mathematics, Berlin, vol. 39 (2004) 2. Cochrane, T.: Trigonometric approximation and uniform distribution modulo 1. Proc. Amer. Math. Soc. 103, 695–702 (1988) 3. Cochrane, T., Zheng, Z.Y.: A Survey on Pure and Mixed Exponential Sums Modulo Prime Numbers. Proc. Illinois Millenial Conf. on Number Theory 1, 271–300 (2002)
On the Distribution of Nonlinear Congruential Pseudorandom Numbers
203
4. Drmota, M., Tichy, R.F.: Sequences, discrepancies and applications. Springer, Berlin (1997) 5. El-Mahassni, E.D., Shparlinski, I.E., Winterhof, A.: Distribution of nonlinear congruential pseudorandom numbers for almost squarefree integers. Monatsh. Math. 148, 297–307 (2006) 6. El-Mahassni, E.D., Winterhof, A.: On the distribution of nonlinear congruential pseudorandom numbers in residue rings. Intern. J. Number Th. 2(1), 163–168 (2006) 7. Griffin, F., Niederreiter, H., Shparlinski, I.: On the distribution of nonlinear recursive congruential pseudorandom numbers of higher orders. In: Fossorier, M.P.C., Imai, H., Lin, S., Poli, A. (eds.) AAECC 1999. LNCS, vol. 1719, pp. 87–93. Springer, Heidelberg (1999) 8. Gutierrez, J., Gomez-Perez, D.: Iterations of multivariate polynomials and discrepancy of pseudorandom numbers. In: Bozta, S., Sphparlinski, I. (eds.) AAECC 2001. LNCS, vol. 2227, pp. 192–199. Springer, Heidelberg (2001) 9. Niederreiter, H., Shparlinski, I.E.: On the distribution and lattice structure of nonlinear congruential pseudorandom numbers. Finite Fields and Their Appl. 5, 246– 253 (1999) 10. Niederreiter, H., Shparlinski, I.E.: Exponential sums and the distribution of inversive congruential pseudorandom numbers with prime-power modulus. Acta Arith. 92, 89–98 (2000) 11. Steˇckin, S.B.: An estimate of a complete rational exponential sum. Trudy Mat. Inst. Steklov. 143, 188–207 (1977) (in Russian)
Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t ´ ´ V´ıctor Alvarez, Jos´e Andr´ es Armario, Mar´ıa Dolores Frau, F´elix Gudiel, and Amparo Osuna Dpto. Matem´ atica Aplicada 1, E.T.S.I.Inform´ atica, Avda. Reina Mercedes s/n, 41012 Sevilla, Spain
Abstract. A new reduction on the size of the search space for cocyclic Hadamard matrices over dihedral groups D4t is described, in terms of the so called central distribution. This new search space adopt the form of a forest consisting of two rooted trees (the vertices representing subsets of coboundaries) which contains all cocyclic Hadamard matrices satisfying the constraining condition. Experimental calculations indicate that the ratio between the number of constrained cocyclic Hadamard matrices and the size of the constrained search space is greater than the usual ratio. Keywords: Hadamard matrix, cocyclic matrix, dihedral groups.
1
Introduction
Since Hadamard matrices (that is, {1, −1}-square matrices whose rows are pairwise orthogonal) were introduced at the end of the XIXth century, the interest in their construction has grown substantially, because of their multiple applications (see [7] for instance). For this reason, many attempts and efforts have been devoted to the design of good construction procedures, the latest involving heuristic techniques (see [5], [1], [4] for instance). Even alternative theoretical descriptions characterizing Hadamard matrices have been proposed (for instance, Ito’s works involving Hadamard graphs [11] in the middle eighties, and Hadamard groups [6,12] more recently). But no matter what one may think, Hadamard matrices keep on being elusive anyway. The point is that though it may be easily checked that the size of a Hadamard matrix is to be 2 or a multiple of 4, there is no certainty whether such a Hadamard matrix exists for every size 4t. This is the Hadamard conjecture, which remains unsolved for more than a century. In fact, the design of a procedure which outputs a Hadamard matrix of the desired size has shown to be as important as solving the Hadamard conjecture itself.
All authors are partially supported by the research projects FQM–296 and P07FQM-02980 from Junta de Andaluc´ıa and MTM2008-06578 from Ministerio de Ciencia e Innovaci´ on (Spain).
M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 204–214, 2009. c Springer-Verlag Berlin Heidelberg 2009
Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t
205
In the early 90s, a surprising link between homological algebra and Hadamard matrices [9] led to the study of cocyclic Hadamard matrices [10]. The main advantages of the cocyclic framework are that: – On one hand, the Hadamard test for cocyclic matrices [10] runs in O(t2 ) time, better than the O(t3 ) algorithm for usual (not necessarily cocyclic) Hadamard matrices. – On the other hand, the search space reduces drastically, though it still is often of exponential size (see [4,2] for details). Among the groups for which cocyclic Hadamard matrices have been found, it seems that dihedral groups D4t are more likely to give a more density of cocyclic Hadamard matrices, even for every order multiple of 4 (see [8,3,2] for instance). Unfortunately, the task of explicitly construct cocyclic Hadamard matrices over D4t is considerably difficult, since the search space inherits a exponential size. New ideas dealing with this problem are welcome. The purpose of this paper is to describe a new reduction on the size of the search space for cocyclic Hadamard matrices over dihedral groups D4t . The key idea is exploiting the notions of i-paths and intersections introduced in [2], in order to design a forest consisting of two rooted trees (the vertices representing subsets of coboundaries) which contains all cocyclic Hadamard matrices satisfying the so called central distribution. We will explain these notions in the following section. We organize the paper as follows. Section 2 is devoted to preliminaries on cocyclic matrices. Section 3 describes the new search space for cocyclic Hadamard matrices over D4t . We include some final remarks and comments.
2
Preliminaries on Cocyclic Matrices
Consider a multiplicative group G = {g1 = 1, g2 , . . . , g4t }, not necessarily abelian. A cocyclic matrix Mf over G consists in a binary matrix Mf = (f (gi , gj )) coming from a 2-cocycle f over G, that is, a map f : G × G → {1, −1} such that f (gi , gj )f (gi gj , gk ) = f (gj , gk )f (gi , gj gk ),
∀ gi , gj , gk ∈ G.
We will only use normalized cocycles f (and hence normalized cocyclic matrices Mf ), so that f (1, gj ) = f (gi , 1) = 1 for all gi , gj ∈ G (and correspondingly Mf = (f (gi , gj )) consists of a first row and column all of 1s). A basis B for 2-cocycles over G consists of some representative 2-cocycles (coming from inflation and transgression) and some elementary 2-coboundaries ∂i , so that every cocyclic matrix admits a unique representation as a Hadamard (pointwise) product M = M∂i1 . . . M∂iw · R, in terms of some coboundary matrices M∂ij and a matrix R formed from representative cocycles. Recall that every elementary coboundary ∂d is constructed from the characteristic set map δd : G → {±1} associated to an element gd ∈ G, so that −1 gd = gi for δd (gi ) = (1) ∂d (gi , gj ) = δd (gi )δd (gj )δd (gi gj ) 1 gd = gi
206
´ V. Alvarez et al.
Although the elementary coboundaries generate the set of all coboundaries, they might not be linearly independent (see [3] for details). Moreover, since the elementary coboundary ∂g1 related to the identity element in G is not normalized, / B. we may assume that ∂g1 ∈ The cocyclic Hadamard test asserts that a cocyclic matrix is Hadamard if and only if the summation of each row (but the first) is zero [10]. In what follows, the rows whose summation is zero are termed Hadamard rows. This way, a cocyclic matrix Mf is Hadamard if and only if every row (Mf )i is a Hadamard row, 2 ≤ i ≤ 4t. In [2] the Hadamard character of a cocyclic matrix is described in an equivalent way, in terms of generalized coboundary matrices, i-walks and intersections. We reproduce now these notions. ¯ ∂j related to a elementary coboundary The generalized coboundary matrix M th ∂j consists in negating the j -row of the matrix M∂j . Note that negating a row of a matrix does not change its Hadamard character. As it is pointed out in [2], ¯ ∂j contains exactly two negative entries every generalized coboundary matrix M in each row s = 1, which are located at positions (s, i) and (s, e), for ge = gs−1 gi . We will work with generalized coboundary matrices from now on. ¯ ∂i : 1 ≤ j ≤ w} of generalized coboundary matrices defines an A set {M j ¯ l1 , . . . , M ¯ lw ) so that i-walk if these matrices may be ordered in a sequence (M th consecutive matrices share exactly one negative entry at the i -row. Such a walk is called an i-path if the initial and final matrices do not share a common −1, and an i-cycle otherwise. As it is pointed out in [2], every set of generalized coboundary matrices may be uniquely partitioned into disjoint maximal i-walks. From the definition above, it is clear that every maximal i-path contributes two negative occurrences at the ith -row. This way, a characterization of Hadamard rows (consequently, of Hadamard matrices) may be easily described in terms of i-paths. Proposition 1. [2] The ith row of a cocyclic matrix M = M∂i1 . . . M∂iw · R is Hadamard if and only if 2ci − 2Ii = 2t − ri
(2)
¯ ∂i , . . . , M ¯ ∂i }, ri counts where ci denotes the number of maximal i-paths in {M w 1 th the number of −1s in the i -row of R and Ii indicates the number of positions ¯ ∂i . . . M ¯ ∂i share a common −1 in their ith -row. in which R and M w 1 ¯ ∂i . . . M ¯ ∂i share From now on, we will refer to the positions in which R and M w 1 a common −1 in a given row simply as intersections, for brevity. We will now focus on the case of dihedral groups.
3
Cocyclic Matrices over D4t
Denote by D4t the dihedral group ZZ 2t ×χ ZZ 2 of order 4t, t ≥ 1, given by the presentation < a, b|a2t = b2 = (ab)2 = 1 >
Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t
207
and ordering {1 = (0, 0), a = (1, 0), . . . , a2t−1 = (2t − 1, 0), b = (0, 1), . . . , a2t−1 b = (2t − 1, 1)} In [6] a representative 2-cocycle f of [f ] ∈ H 2 (D4t , ZZ 2 ) ∼ = ZZ 32 is written interchangeably as a triple (A, B, K), where A and B are the inflation variables and K is the transgression variable. All variables take values ±1. Explicitly, ij ij k A , i + j < 2t, A B , i ≥ j, i j k i j k f (a b, a b ) = f (a , a b ) = Aij K, i + j ≥ 2t, Aij B k K, i < j, Let β1 , β2 and γ denote the representative 2-cocycles related to (A, B, K) = (−1, 1, 1), (1, −1, 1), (1, 1, −1) respectively. A basis for 2-coboundaries is described in [2], and consists of the elementary coboundaries {∂a , . . . , ∂a2t−3 b }. This way, a basis for 2-cocycles over D4t is given by B = {∂a , . . . , ∂a2t−3 b , β1 , β2 , γ}. Computational results in [6,2] suggest that the case (A, B, K) = (1, −1, −1) contains a large density of cocyclic Hadamard matrices. Furthermore, as it is pointed out in Theorem 2 of [2], cocyclic matrices over D4t using R = β2 γ are Hadamard matrices if and only if rows from 2 to t are Hadamard, so that the cocyclic test runs four times usual. faster than A A for From now on, we assume that R = Mβ2 · Mγ = B −B ⎞ ⎛ ⎛ ⎞ 1 −1 · · · −1 1 1 ··· 1 ⎜ .. . . . . .⎟ · ⎜1 ⎜ . . . .. ⎟ · · −1 ⎟ ⎜ ⎟ ⎟ ⎜ (3) A=⎜. . ⎟ and B = ⎜ ⎟ .. ⎝ .. · ·· · ·· .. ⎠ ⎝1 . −1 ⎠ 1 −1 · · · −1 1 1 ··· 1 Now we would like to know how the 2-coboundaries in B have to be combined to form i-paths, 2 ≤ i ≤ t. This information is given in Proposition 7 of [2]. Proposition 2. [2] For 1 ≤ i ≤ 2t, a maximal i-walk consists of a maximal subset in (M∂1 , . . . , M∂2t ) or (M∂2t+1 , . . . , M∂4t ) formed from matrices (. . . , Mj , Mk , . . .) which are cyclically separated in i − 1 positions (that is j ± (i − 1) ≡ k mod 2t). Notice that since ri = 2(i − 1) for 2 ≤ i ≤ t, the cocyclic Hadamard test reduces to check whether ci − Ii = t − i + 1, for 2 ≤ i ≤ t. Thus ci uniquely determines Ii and reciprocally, 2 ≤ i ≤ t. In fact, the way in which intersections may be introduced at the ith -row is uniquely determined. More explicitly Lemma 1. The following table gives a complete distribution of the coboundaries in B which may create an intersection at a given row. For clarity in the reading, ¯ ∂i simply by i: we note the generalized coboundary M
´ V. Alvarez et al.
208
row coboundaries 2 2t, 2t + 1 3 2, 2t − 1, 2t, 2t + 1, 2t + 2 4 ≤ k ≤ t 2, . . . , k − 1, 2t − k + 2, . . . , 2t + k − 1, 4t − k + 2, . . . , 4t − 2 Proof It may be seen by inspection, taking into account the distribution of the negative occurrences in R and the form of the generalized coboundary matrices.
Lemma 2. In particular, there are some coboundaries which do not produce any intersection at all, at rows 2 ≤ k ≤ t, which we term free intersection coboundaries. More concretely, t coboundaries 2 2, 3, 6 t > 2 t, t + 1, 3t, 3t + 1 Proof It suffices to take the set difference between B and the set of coboundaries used in the lemma above.
Lemma 3. Furthermore, the following table distributes the coboundaries which produce a intersection at every row, so that coboundaries which produce the same negative occurrence at a row are displayed vertically in the same column. row 2 3 4 ≤ k ≤ t
2t-1 2t-k+2
2t − k + 3 2
... ...
2t − 1 k−2
Proof It may be seen by inspection.
2t 2t 2
coboundaries 2t + 1 2t + 1
2t
2t + 1
k-1
4t-k+2
2t+2 2t + 2 4t − k + 1
... ...
2t + k − 3 4t − 2
2t + k − 2
2t+k-1
Remark 1. Notice that: – The set of coboundaries which may produce an intersection at the ith -row is included in the analog set corresponding to the (i + 1)th -row. – The boxed coboundaries do not produce any intersection at the precedent rows. Now one could ask whether cocyclic Hadamard matrices exist for any formal distribution of pairs (ci , Ii ) satisfying the relations ci −Ii = t−i+1, for 2 ≤ i ≤ t. Actually, this is not the case. Proposition 3. Not all of the formal sequences [(c2 , I2 ), . . . , (ct , It )] satisfying ci − Ii = t − i + 1 give rise to cocyclic Hadamard matrices over D4t , for t ≥ 3.
Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t
209
Proof Proposition 10 of [2] bounds the number w of coboundaries in B to multiply with R so that a cocyclic Hadamard matrix is formed, so that t − 1 ≤ w ≤ 3t + 2. In particular, for t ≥ 6, we know that 5 ≤ w. Consequently, the case I2 = . . . = It = 0 is not feasible, since from Lemma 2 we know that only up to 4 coboundaries may be combined so that no intersection is generated at any row. This proves the Lemma for t ≥ 6. We now study the remaining cases. Taking into account that 0 ≤ Ii ≤ ri = 2i − 2, we may have a look in the way in which cocyclic Hadamard matrices are distributed regarding the number of intersections Ii , for those groups D4t for which the whole set of cocyclic Hadamard matrices have been computed until now. These are precisely t = 2, 3, 4, 5. For t = 2, we formally have 3 solutions for the equation c2 − I2 = 1, c2 1 2 3 I2 0 1 2 Each of these solutions gives rise to some cocyclic Hadamard matrices Mf , I2 0 1 2 |Mf | 4 10 2
c2 − I2 = 2 coming c3 − I3 = 1 from the combination of any solution of each of the equations For t = 3, we formally have 15 solutions for the system
c2 2 3 4 I2 0 1 2
c3 1 2 3 4 5 I3 0 1 2 3 4
Only 9 of the 15 hypothetical solutions are real solutions (there are no combinations of coboundaries meeting the other 6 “theoretical” solutions), distributed in the following way: (I2 , I3 ) (0, 1) (0, 2) (0, 3) (1, 1) (1, 2) (1, 3) (2, 1) (2, 2) (2, 3) |Mf | 6 10 2 8 20 8 2 10 6 ⎧ ⎨ c2 − I2 = 3 For t = 4, we formally have 105 solutions for the system c3 − I3 = 2 coming ⎩ c4 − I4 = 1 from the combination of any solution of each of the equations c2 3 4 5 I2 0 1 2
c3 2 3 4 5 6 I3 0 1 2 3 4
c4 1 2 3 4 5 6 7 I4 0 1 2 3 4 5 6
Only 36 of the 105 hypothetical solutions are real solutions, distributed in the following way:
210
´ V. Alvarez et al.
(I2 , I3 , I4 ) (0, 0, 2) (0, 1, 1) (0, 1, 2) (0, 1, 3) (0, 1, 4) (0, 2, 2) (0, 2, 3) (0, 2, 4) (0, 2, 5) |Mf | 8 6 18 18 6 12 12 24 12 (I2 , I3 , I4 ) (0, 3, 3) (0, 3, 4) (1, 0, 3) (1, 1, 1) (1, 1, 2) (1, 1, 3) (1, 1, 4) (1, 1, 5) (1, 2, 1) |Mf | 4 8 12 18 12 8 4 6 4 (I2 , I3 , I4 ) (1, 2, 2) (1, 2, 3) (1, 2, 4) (1, 2, 5) (1, 3, 2) (1, 3, 3) (1, 3, 4) (1, 4, 3) (2, 1, 2) |Mf | 24 72 24 4 12 32 20 20 2 (I2 , I3 , I4 ) (2, 2, 1) (2, 2, 2) (2, 2, 3) (2, 2, 4) (2, 3, 2) (2, 3, 3) (2, 1, 3) (2, 1, 4) (2, 1, 5) |Mf | 6 6 2 12 24 12 12 24 12
⎧ c2 − I2 ⎪ ⎪ ⎨ c3 − I3 For t = 5, we formally have 945 solutions for the system c4 − I4 ⎪ ⎪ ⎩ c5 − I5 from the combination of any solution of each of the equations c2 4 5 6 I2 0 1 2
c3 3 4 5 6 7 I3 0 1 2 3 4
c4 2 3 4 5 6 7 8 I4 0 1 2 3 4 5 6
=4 =3 coming =2 =1
c5 1 2 3 4 5 6 7 8 9 I4 0 1 2 3 4 5 6 7 8
Only 153 of the 945 hypothetical solutions are real solutions, (I2 , I3 , I4 , I5 ) (0, 0, 1, 3) (0, 0, 2, 3) (0, 1, 1, 4) (0, 1, 2, 2) (0, 1, 2, 3) (0, 1, 2, 4) |Mf | 4 12 3 6 12 21 (I2 , I3 , I4 , I5 ) (0, 1, 2, 5) (0, 1, 3, 2) (0, 1, 3, 3) (0, 1, 3, 4) (0, 1, 3, 5) (0, 1, 3, 6) |Mf | 6 6 12 21 21 9 (I2 , I3 , I4 , I5 ) (0, 1, 3, 7) (0, 1, 4, 3) (0, 1, 4, 4) (0, 1, 4, 5) (0, 1, 4, 6) (0, 1, 4, 7) |Mf | 3 6 3 9 18 6 (I2 , I3 , I4 , I5 ) (0, 2, 0, 2) (0, 2, 1, 2) (0, 2, 1, 3) (0, 2, 1, 4) (0, 2, 2, 2) (0, 2, 2, 3) |Mf | 2 4 2 2 4 12 (I2 , I3 , I4 , I5 ) (0, 2, 2, 4) (0, 2, 3, 2) (0, 2, 3, 3) (0, 2, 3, 4) (0, 2, 3, 5) (0, 2, 3, 6) |Mf | 4 8 8 24 14 2 (I2 , I3 , I4 , I5 ) (0, 2, 4, 4) (0, 2, 4, 5) (0, 2, 4, 6) (0, 2, 5, 4) (0, 3, 2, 5) (0, 3, 3, 2) |Mf | 12 26 4 4 2 2 (I2 , I3 , I4 , I5 ) (0, 3, 3, 3) (0, 3, 3, 4) (0, 3, 3, 5) (0, 3, 4, 2) (0, 3, 4, 3) (0, 3, 4, 4) |Mf | 7 6 6 1 4 5 (I2 , I3 , I4 , I5 ) (0, 3, 4, 5) (0, 3, 5, 4) (1, 0, 1, 4) (1, 0, 2, 2) (1, 0, 2, 3) (1, 0, 2, 4) |Mf | 6 1 2 4 2 4 (I2 , I3 , I4 , I5 ) (1, 0, 4, 2) (1, 0, 4, 3) (1, 0, 4, 4) (1, 0, 5, 4) (1, 1, 1, 3) (1, 1, 1, 4) |Mf | 4 2 4 2 6 3 (I2 , I3 , I4 , I5 ) (1, 1, 2, 3) (1, 1, 2, 4) (1, 1, 2, 5) (1, 1, 2, 6) (1, 1, 3, 2) (1, 1, 3, 3) |Mf | 7 14 22 8 4 16 (I2 , I3 , I4 , I5 ) (1, 1, 3, 4) (1, 1, 3, 5) (1, 1, 3, 6) (1, 1, 3, 7) (1, 1, 4, 3) (1, 1, 4, 4) |Mf | 20 8 16 16 5 10 (I2 , I3 , I4 , I5 ) (1, 1, 4, 5) (1, 1, 4, 6) (1, 1, 5, 3) (1, 1, 5, 4) (1, 2, 1, 3) (1, 2, 1, 4) |Mf | 18 8 2 1 4 8
Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t
211
(I2 , I3 , I4 , I5 ) (1, 2, 2, 2) (1, 2, 2, 3) (1, 2, 2, 4) (1, 2, 2, 5) (1, 2, 3, 2) (1, 2, 3, 3) |Mf | 4 16 28 16 24 20 (I2 , I3 , I4 , I5 ) (1, 2, 3, 4) (1, 2, 3, 5) (1, 2, 4, 2) (1, 2, 4, 3) (1, 2, 4, 4) (1, 2, 4, 5) |Mf | 32 56 4 12 24 24 (I2 , I3 , I4 , I5 ) (1, 2, 5, 4) (1, 3, 1, 2) (1, 3, 1, 3) (1, 3, 1, 4) (1, 3, 2, 2) (1, 3, 2, 3) |Mf | 8 1 3 2 8 16 (I2 , I3 , I4 , I5 ) (1, 3, 2, 4) (1, 3, 2, 5) (1, 3, 3, 3) (1, 3, 3, 4) (1, 3, 3, 5) (1, 3, 4, 2) |Mf | 10 6 24 32 16 8 (I2 , I3 , I4 , I5 ) (1, 3, 4, 3) (1, 3, 4, 4) (1, 3, 4, 5) (1, 3, 5, 2) (1, 3, 5, 3) (1, 3, 5, 4) |Mf | 16 22 10 3 9 6 (I2 , I3 , I4 , I5 ) (1, 4, 1, 3) (1, 4, 2, 3) (1, 4, 2, 4) (1, 4, 3, 3) (1, 4, 3, 4) (1, 4, 4, 3) |Mf | 2 6 6 16 8 6 (I2 , I3 , I4 , I5 ) (1, 4, 4, 4) (1, 4, 5, 3) (2, 1, 2, 3) (2, 1, 2, 4) (2, 1, 2, 5) (2, 1, 2, 6) |Mf | 6 2 2 1 3 6 (I2 , I3 , I4 , I5 ) (2, 1, 2, 7) (2, 1, 3, 2) (2, 1, 3, 3) (2, 1, 3, 4) (2, 1, 3, 5) (2, 1, 3, 6) |Mf | 2 2 4 7 7 3 (I2 , I3 , I4 , I5 ) (2, 1, 3, 7) (2, 1, 4, 2) (2, 1, 4, 3) (2, 1, 4, 4) (2, 1, 4, 5) (2, 1, 5, 4) |Mf | 1 2 4 7 2 1 (I2 , I3 , I4 , I5 ) (2, 2, 1, 4) (2, 2, 2, 4) (2, 2, 2, 5) (2, 2, 2, 6) (2, 2, 3, 2) (2, 2, 3, 3) |Mf | 4 12 26 4 8 8 (I2 , I3 , I4 , I5 ) (2, 2, 3, 4) (2, 2, 3, 5) (2, 2, 3, 6) (2, 2, 4, 2) (2, 2, 4, 3) (2, 2, 4, 4) |Mf | 24 14 2 4 12 4 (I2 , I3 , I4 , I5 ) (2, 2, 5, 2) (2, 2, 5, 3) (2, 2, 5, 4) (2, 2, 6, 2) (2, 3, 1, 4) (2, 3, 2, 2) |Mf | 4 2 2 2 3 3 (I2 , I3 , I4 , I5 ) (2, 3, 2, 3) (2, 3, 2, 4) (2, 3, 2, 5) (2, 3, 3, 2) (2, 3, 3, 3) (2, 3, 3, 4) |Mf | 12 15 18 6 21 18 (I2 , I3 , I4 , I5 ) (2, 3, 3, 5) (2, 3, 4, 5) (2, 4, 2, 3) |Mf | 18 6 12 a Attending to the tables above, we conclude that, for 2 ≤ t ≤ 5, there is
large density of cocyclic Hadamard matrices in the case ci = t for 2 ≤ i ≤ t, that is, (I2 , . . . , It ) = (1, . . . t − 1). We call this case the central distribution for intersections and i-paths on D4t . We include now a table comparing the number central of cocyclic Hadamard |Mf | of the amount matrices in the central distribution with the proportion % = cases |Mf | of cocyclic Hadamard matrices over D4t by the total number cases of valid distributions of intersections (I2 , . . . , It ). The last column contains the number of cocyclic Hadamard matrices of the most prolific case: t cases |Mf | % central 2 3 16 5.33 10 3 9 72 8 20 4 36 512 14.22 72 5 153 1400 9.15 32
best 10 20 72 56
´ V. Alvarez et al.
212
It seems then reasonable trying to constraint the search for cocyclic Hadamard matrices over D4t to the central distribution case. The search space in the central distribution (I2 , . . . , It ) = (1, . . . , t−1) may be represented as a forest of two rooted trees of depth t−1. We identify each level of the tree to the correspondent row of the cocyclic matrix at which intersections are being counted, so that the roots of the trees are located at level 2 (corresponding to the intersections created at the second row of the cocyclic matrix). This way the level i contains those coboundaries which must be added to the father configuration in order to get the desired i − 1 intersections at the ith -row, for 2 ≤ i ≤ t. The root of the first tree is ∂2t , whereas the root of the second tree is ∂2t+1 , since from Lemma 1 these are the only coboundaries which may give an intersection at the second row. As soon as one of these coboundaries is used, the other one is forbidden, since otherwise a second intersection would be introduced at the second row. Now one must add some coboundaries to get two intersections at the third row. Since ∂2t is already used and ∂2t+1 is forbidden, there are only 3 coboundaries left (those boxed in the table of Lemma 3). Successively, in order to construct the nodes at level k, one must add some of the correspondent boxed coboundaries of the table of Lemma 3, since the remaining coboundaries are either used or forbidden. We include the forests corresponding to the cases t = 2, 3, 4 for clarity.
t
trees Level 2
2 Level 2
4
5
Level 2
6
7
3 Level 3
Level 3 5
(2,5,8)
8
5
8
Level 2
Level 3
2
7
10
(2,7,10)
Level 4
4
6
11
14 (3,6,11) (3,6,14)(3,11,14)
6
11
14 (3,6,11)(3,6,14)(3,11,14)
11
14 (3,6,11)(3,6,14)(3,11,14)
9
Level 2
Level 3
6
2
7
10
Level 4 3
6
11(14,3,6) (14,3,11) (14,6,11)
3
6
11(14,3,6) (14,3,11) (14,6,11)
3
6
11(14,3,6) (14,3,11) (14,6,11)
8
Rooted Trees Searching for Cocyclic Hadamard Matrices over D4t
213
Every branch ending at level t gives a cocyclic matrix meeting the desired distribution of intersections (I2 , . . . , It ). Now one has to check whether any of the 16 possible combinations with the free intersection coboundaries {∂t ,∂t+1 ,∂3t ,∂3t+1 } of Lemma 2 gives rise to a cocyclic Hadamard matrix (that is, to the desired distribution of i-paths, (c2 , . . . , ct ) = (t, . . . , t)). We now give some properties of the trees above. Proposition 4. In the circumstances above, it may be proved that 1. The skeleton (i.e., the branches, forgetting about the indexes of the coboundaries used) of the trees related to D4t are preserved to form the levels from 2 to t corresponding to the trees of D4(t+1) . 2. Among the boxed coboundaries {∂2t−k+2 , ∂k−1 , ∂4t−k+2 , ∂2t+k−1 to be added at level k of the trees, precisely one of them removes an intersection, whereas the remaining three adds one intersection each. 3. At each level, either just one or exactly three boxed coboundaries must be used, there is no other possible choice in order to meet the desired amount of intersections. 4. Consequently, a branch may be extended from level k to level k + 1 if and only if k − hk ∈ {−1, 1, 3}, where hk denotes the number of intersections inherited from level k to level k + 1. 5. Branches ending at levels above level t will never give rise to cocyclic Hadamard matrices meeting the central distribution. This will be more frequent the greater t is. 6. Both trees may have branches ending at level t which may not produce any cocyclic Hadamard matrix at all. This will be more frequent the greater t is. Proof Most of the properties are consequences of the results explained in Lemma 1 through Lemma 3, and are left to the reader. Property 3 comes as a result of a parity condition: there must be an odd number of intersections at odd levels, and an even number of intersections at even levels. Since boxed coboundaries either add an intersection each, or just one of them removes an intersection (added by a coboundary previously used), the parity condition leads to the result. Concerning Property 5, we give a branch not reaching the last level for t = 9: level 2 3 4 5 6 7 8 cob. 18 17 21 4, 22, 33 32 6, 13, 24 12, 25, 30 Concerning Property 6, we give a branch reaching the last level for t = 9, which do not give rise to any cocyclic Hadamard matrix at all: level 2 3 4 5 6 7 8 9 cob. 19 17 21 22 14 6, 24, 31 12, 25, 30 29
So far, it is evident that the above trees reduce the search space for cocyclic Hadamard matrices over D4t , constraining the solutions to the central distribution case.
214
´ V. Alvarez et al.
There is only one question left. Is the new proportion ratioc of cocyclic Hadamard matrices in the central distribution case by the size of the reduced space greater than the proportion ratiog of general cocyclic Hadamad matrices by the size of the general search space? It seems so, attending to the table below (we have followed the calculations of [2] about the size of the general search space in D4t ). t |Mf | g. size ratiog 2 16 32 0.5 3 72 492 0.146 4 512 8008 0.063
|Mf | central size ratioc 10 16 0.625 20 96 0.208 72 576 0.125
We claim that developing a heuristic search in the forest described above will produce some cocyclic Hadamard matrices over D4t more likely than any other technique applied till now to the general case. This will be the goal of our work in the near future.
References ´ 1. Alvarez, V., Armario, J.A., Frau, M.D., Real, P.: A genetic algorithm for cocyclic Hadamard matrices. In: Fossorier, M.P.C., Imai, H., Lin, S., Poli, A. (eds.) AAECC 2006. LNCS, vol. 3857, pp. 144–153. Springer, Heidelberg (2006) ´ 2. Alvarez, V., Armario, J.A., Frau, M.D., Real, P.: A system of equations for describing cocyclic Hadamard matrices. Journal of Comb. Des. 16(4), 276–290 (2008) ´ 3. Alvarez, V., Armario, J.A., Frau, M.D., Real, P.: The homological reduction method for computing cocyclic Hadamard matrices. J. Symb. Comput. 44, 558–570 (2009) ´ 4. Alvarez, V., Frau, M.D., Osuna, A.: A genetic algorithm with guided reproduction for constructing cocyclic Hadamard matrices. In: ICANNGA 2009 (2009) (sent) 5. Baliga, A., Chua, J.: Self-dual codes using image resoration techniques. In: Bozta, S., Sphparlinski, I. (eds.) AAECC 2001. LNCS, vol. 2227, pp. 46–56. Springer, Heidelberg (2001) 6. Flannery, D.L.: Cocyclic Hadamard matrices and Hadamard groups are equivalent. J. Algebra 192, 749–779 (1997) 7. Hedayat, A., Wallis, W.D.: Hadamard Matrices and Their Applications. Ann. Stat. 6, 1184–1238 (1978) 8. Horadam, K.J.: Hadamard matrices and their applications. Princeton University Press, Princeton (2006) 9. Horadam, K.J., de Launey, W.: Cocyclic development of designs. J. Algebraic Combin. 2(3), 267–290 (1993); Erratum: J. Algebraic Combin. 3(1), 129 (1994) 10. Horadam, K.J., de Launey, W.: Generation of cocyclic Hadamard matrices. In: Computational algebra and number theory, pp. 279–290. Math. Appl., Kluwer Acad. Publ, Dordrecht (1995) 11. Ito, N.: Hadamard Graphs I. Graphs Combin. 1(1), 57–64 (1985) 12. Ito, N.: On Hadamard Groups. J. Algebra 168, 981–987 (1994)
Interesting Examples on Maximal Irreducible Goppa Codes Marta Giorgetti Dipartimento di Fisica e Matematica, Universita’ dell’Insubria
Abstract. A full categorization of irreducible classical Goppa codes of degree 4 and length 9 is given: it is an interesting example in the context of finding an upper bound for the number of Goppa codes which are permutation non-equivalent and irreducible and maximal with fixed parameters q, n and r (Fq is the field of the Goppa code, the Goppa polynomial has coefficients in Fqn and its degree is r) using group theory techniques.
1
The Number of Non-equivalent Goppa Codes
Definition 1. Let g(x) = gi xi ∈ Fqn [x] and let L = {ε1 , ε2 , . . ., εN } denote n a subset of elements of Fq which are not roots of g(x). Then the Goppa code G(L, g) is defined as the set of all vectors c = (c1 , c2 , . . ., cN ) with components in N ci ≡ 0 mod g(x). Fq which satisfy the condition: i=0 x−ε i If L = {ε1 , ε2 , . . ., εN } = Fqn the Goppa code is said maximal ; if the degree of g(x) is r, then the Goppa code is called a Goppa code of degree r; if g(x) is irreducible over Fqn the Goppa code is called irreducible . In this case, a
1 1 1 , α−ε , 1 , where parity check matrix for G(L, g) can be Hα = α−ε 2 , . . ., α−εqn−1 α α ∈ Fqnr is a root of g(x) and ε = F∗qn . We denote the Goppa code G(L, g) as C(α) when a parity check matrix of type Hα is considered. It is important to stress that by using parity check matrix Hα to define G(L, g) we implicitly fix an order in L. We observe that the Goppa code G(L, g) is the subfield subcode of code over Fqnr having as parity check matrix Hα . We denote by Ω = Ω(q, n, r) the set of Goppa codes, with fixed parameters q, n, r, S = S(q, n, r) the set of all elements in Fqnr of degree r over Fqn and P = P(q, n, r) the set of irreducible polynomials of degree r in Fqn [x]. In [4] the action on Ω is obtained by considering an action on S of an ”semiaffine” group T = AGL(1, q n )σ in the following way: for α ∈ S and t ∈ T , i αt = aαq + b for some a, b ∈ Fqn , a = 0 and i = 1 . . . nr. The action gives a number of orbits over S which is an upper bound for the number of non equivalent Goppa codes. An important result of [4] is the following: i
Theorem 1. [4] If α, β ∈ S are related as it follows β = ζαq + ξ for some ζ, ξ ∈ Fqn , ζ = 0, i = 1. . .nr, then C(α) is equivalent to C(β). M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 215–218, 2009. c Springer-Verlag Berlin Heidelberg 2009
216
M. Giorgetti
In [2] the action of a group F G isomorphic to AΓ L(1, q n ) on the q n columns of the parity check matrix Hα is considered. We point out that columns of Hα are in bijective correspondence with the elements of Fqn . The group F G induces on Ω the same orbits which arise from the action introduced in [4]. This action does not describe exactly the orbits of permutation non equivalent Goppa codes, since in some cases the number of permutation non-equivalent Goppa codes is less than the number of orbits of T on S. The group F G acts faithfully on the columns of Hα : it can be seen as a subgroup of the symmetric group Sqn . In [2] it has been proved that there exists exactly one maximal subgroup M (isomorphic to AGL(nm, p)) of Sqn (Aqn ) containing F G (q = pm ). This suggests that one could consider the action of M on codes to reach the right bound. From this result one could hope that, when it is not possible to reach the exact number s of permutation non-equivalent Goppa codes by the action of F G, s is obtained by considering the group AGL(nm, p). Unfortunately, this is not always true as it is shown in the next section. The following examples were introduced by Ryan in this PhD thesis [4]. In the next, we thoroughly analyze them, pointing out the group action of AGL(nm, p). Classification of Ω(3, 2, 4). Let q = 3, n = 2, r = 4; let ε be a primitive element of F32 with minimal polynomial x2 + 2x + 2; let L = [ε, ε2 , . . . , ε7 , 1, 0]; let P = P(3, 2, 4) be the set of all irreducible polynomials of degree 4 in F9 , |P| = 1620 and let S = S(3, 2, 4) be the set of all elements of degree 4 over F9 , |S| = 6480. Let Γ (g, L) be a maximal irreducible Goppa code of length 9 over F3 , g ∈ P. We denote by SS the symmetric group on S. We consider the action of T on S: it gives 13 orbits on S. It means that there are at most 13 classes of maximal irreducible Goppa codes. We choose a representative for each class. Table 1 shows the thirteen classes: for each representative code Γi , we give the corresponding Goppa polynomial gi (x), the code parameters [n, k, d] and the generator matrix M . Table 1. Representatives of the 13 classes obtaining in the action of T over S Γi Γ1 Γ2 Γ3 Γ4 Γ5 Γ6 Γ7 Γ8 Γ9 Γ10 Γ11 Γ12 Γ13
gi (x) x4 + f 3 x3 + f x + f 5 x4 + f 7 x3 + x2 + f 5 x + f 3 x4 + f 5 x + f x4 + f 5 x2 + f 6 x + f 2 x4 + f 7 x2 + f 2 x + f 5 x4 + f x3 + f 5 x2 + f 3 x + f 6 x4 + f 6 x3 + f 2 x2 + 2x + f 5 x4 + 2x3 + 2x2 + 2x + f 7 x4 + f 5 x3 + f 2 x2 + f 3 x4 + 2x3 + f 3 x2 + f 6 , x4 + f x3 + f x2 + f x + f 2 , x4 + f 3 x3 + f 2 x2 + 2x + f 3 , x4 + f 3 x3 + f 5 x2 + x + f 6
[n, k, d] generator matrixM [9, 1, 9] [112212212] [9, 1, 5] [010222010] [9, 1, 9] [122221112] [9, 1, 6] [120101202] [9, 1, 6] [112001011] [9, 1, 7] [001111122] [9, 1, 5] [001220110] [9, 1, 6] [121120200] [9, 1, 6] [010021112] [9, 1, 5] [120201200] [9, 1, 7] [120220221] [9, 1, 6] [101012012] [9, 1, 6] [011101202]
Interesting Examples on Maximal Irreducible Goppa Codes
217
The analysis of parameters [n, k, d] and generator matrices shows that these 13 code representantives can not be equivalent, since they have different minimum distances. We can observe that Γ1 is permutation equivalent to Γ3 ; Γ2 and Γ10 are permutation equivalent to Γ7 ; Γ11 is permutation equivalent to Γ6 ; Γ4 is permutation equivalent to Γ8 ; Γ9 are Γ12 are permutation equivalent to Γ13 . We can conclude that the number of different classes of permutation non equivalent codes is 6 and not 13 (Γ5 composes a permutation equivalence class). Moreover Γ5 , Γ4 , Γ8 , Γ9 , Γ12 and Γ13 are monomially equivalent, so there are only 4 equivalence classes of non equivalent Goppa codes. In Table 2 we summarize the results of the group actions as follows. The action of T on S, T ≤ SS , creates 13 orbits: we report the number of elements in each orbit |ST | and we count the number of Goppa codes corresponding to these elements (by abuse of notation we write |ΓiT |). For each representative Γi , we consider its permutation group P(Γi ): we obtain the number of codes |S9 | ; the number of codes which are permutation equivalent to it by computing |P(Γ i )| permutation equivalent to Γi under the actions of F G (and AGL = AGL(2, 3)) is |F G| |AGL| (and |ALG∩P(Γ , respectively). We use symbols ♣, ♦, ♥ obtained as |F G∩P(Γ i )| i )| and ♠ to denote the four monomial equivalence classes and symbols ⊕, , ⊗ to denote the permutation classes when they are different from the monomial classes. We write P.E. to say Permutation Equivalent. In this example, the action of the only maximal permutation group AGL ≤ Sqn , which contains F G, is not sufficient to unify disjoint orbits of non equivalent codes. Only the whole symmetric group Sqn gives the right number of non equivalent Goppa codes. Remark 1. It is interesting to analyzing polynomials in P. We denote by P♣ the set of polynomials corresponding to Goppa codes in the ♣ equivalence class, and Table 2. Different group actions |ST | |ΓiT | |P(Γi )|
Γi Γ1 Γ3 Γ2 Γ10 Γ7 Γ6 Γ11 Γ5 Γ4 Γ8 Γ9 Γ12 Γ13
♣ ♣ ♦ ♦ ♦ ♠ ♠ ♥ ♥ ♥ ♥ ♥ ♥
⊗ ⊕ ⊕
144 576 576 576 576 576 288 576 576 576 288 576 576 6480
|S9 | |P(Γi )|
18 2880 126 72 2880 P.E.Γ1 144 288 1260 144 288 P.E.Γ2 144 288 P.E.Γ2 144 480 756 72 480 P.E.Γ6 144 720 504 144 432 840 144 432 P.E.Γ4 72 288 1260 144 288 P.E.Γ9 144 288 P.E.Γ9 1530 4746
|F G| |AGL| |P(Γi )∩F G| |P(Γi )∩ALG|
18 72 144 144 144 144 72 144 144 144 72 144 144 1530
54 72 432 216 432 216 108 216 432 216 108 216 216 2934
218
M. Giorgetti
so on for the others, hence P = P♣ ∪ P♦ ∪ P♠ ∪ P♥ . We denote by P∗,Γi , the set of polynomials in P∗ , ∗ ∈ {♣, ♦, ♥,♠}, corresponding to the codes in ΓiT . It is easy to check that if g ∈ P♣ , g has the following shape x4 + εi x3 + εj x + εk for some εi , εj , εk ∈ Fqn and the x2 coefficient is equal to zero. We know that |P♣,Γ1 | = 36 and |Γ1T | = 18. This means that more than one polynomial generates the same code. We can show that couples of polynomials in P♣,Γ1 generate the same code. Moreover if g1 , g2 ∈ Γ1T generate the same Goppa code then they have the same coefficients except for the constant term: we can obtain one constant term from the other by arising to the q-th power. For example polynomials x4 +ε6 x3 +ε2 x+ε2 and x4 + ε6 x3 + ε2 x + ε6 generate the same Goppa code. A similar argument can conduce us to say that polynomials in P♣,Γ3 are 576, but they generate 72 different Goppa codes. We have that 4 polynomials create the same Goppa code and we find the following relation: given a polynomial g ∈ P♣,Γ3 , g = x4 +εi x3 +εj x+εk , then the following tree polynomials generate the same Goppa code: g = x4 + εiq x3 + εjq x + εkq , g = x4 + εj x3 + εi x + εk and g = x4 + εjq x3 + εjq x + εkq . Analogous arguments can be used to describe set of polynomials in P♦ , P♥ and P♠ . Codes in Ω(2, 5, 6): let us consider the codes studied in [3]. Let q = 2, n = 5 r = 6 and let f be a primitive element of Fqn with minimal polynomial x5 +x2 +1; let L = [f, f 2 , . . . , f 30 , 1, 0]. We consider the following two polynomials p1 := x6 + f 22 x5 + f 2 x4 + f 25 x3 + f 10 x + f 3 and p2 := x6 + f 20 x5 + f 19 x4 + f 19 x3 + f 12 x2 +f 4 x+f 2 8. They generate equivalent Goppa codes Γ1 (L, p1 ) and Γ2 (L, p2 ), but their roots are in different orbits under the action of T over S = S(2, 5, 6). To know how many codes are in each orbits we take a representative code and we construct its orbit under the permutation group F G ≤ S32 . We verify that the action of the maximal subgroup AGL(2, 5) containing F G does not unify the two orbits. Also in this case, the only permutation group which gives the right number of non equivalent Goppa code is the whole symmetric group Sqn .
References 1. Chen, C.-L.: Equivalent irreducible Goppa codes. IEEE Trans. Inf. Theory 24, 766– 770 (1978) 2. Dalla Volta, F., Giorgetti, M., Sala, M.: Permutation equivalent maximal irreducible Goppa codes. Designs, Codes and Cryptography (submitted), http://arxiv.org/abs/0806.1763 3. Ryan, J.A., Magamba, K.: Equivalent irreducible Goppa codes and the precise number of quintic Goppa codes of length 32. In: AFRICON 2007, September 26-28, pp. 1–4 (2007) 4. Ryan, J.: Irreducible Goppa codes, Ph.D. Thesis, University College Cork, Cork, Ireland (2002)
Repeated Root Cyclic and Negacyclic Codes over Galois Rings Sergio R. L´ opez-Permouth and Steve Szabo Department of Mathematics, Ohio University, Athens, Ohio-45701, USA
Abstract. In this notice we describe the ideal structure of all cases of cyclic and negacyclic codes of length ps over a Galois ring alphabet that have not yet been discussed in the literature. Unlike in the cases reported earlier in the literature by various authors, the ambient spaces here are never chain rings. These ambient rings do nonetheless share the properties of being local and having a simple socle. Keywords: repeated root codes, cyclic codes, polynomial codes, Galois rings.
1
Introduction
While the literature on cyclic and negacylic codes over chain rings (such as Galois rings) has experienced much growth in recent years (see [1, 2, 3, 4, 5, 6]), in most instances the studies have been focused only on the cases where the characteristic of the alphabet ring is coprime to the code length, the so-called simple root codes. A few of the contributions to the study of the cases where the characteristic of the alphabet ring is not coprime to the code length (repeated root codes) are [7, 8, 9, 10, 11, 12]. In this paper we focus on the repeated root case where the code length is in fact ps , a power of a prime p. In all papers dealing with this type of code so far, the codes correspond to principal ideals because in every case considered thus far the code ambient has been a chain ring. It turns out that in the remaining cases the code ambients are no longer chain rings and in fact, not even principal ideal rings. The authors are submitting a complete account of their research elsewhere [13]; the results announced here are proven there and methods to calculate the minimum distance of all negacyclic and cyclic codes considered are also given there. The three cases of codes of length ps not previously tackled in the literature are: cyclic and negacyclic codes over GR(pa , m) for odd prime p and a > 1 and cyclic codes over GR(2a , m) for a > 1. It is easy to see that the ambients a GR(pa ,m)[x] ,m)[x] and GR(p for negacyclic and cyclic codes for an odd prime p xps +1 xps −1 and a > 1 are isomorphic under the isomorphism sending x to −x. Hence, all the results about the lattice structure of their ideals can be done for one and translated into the other in a straightforward way. Surprisingly, the structure a ,m)[x] of the cyclic codes over GR(2a , m) for a > 1 for the ambient ring GR(2 x2s −1 M. Bras-Amor´ os and T. Høholdt (Eds.): AAECC 2009, LNCS 5527, pp. 219–222, 2009. c Springer-Verlag Berlin Heidelberg 2009
220
S.R. L´ opez-Permouth and S. Szabo
is similar to the two other ambients in this paper and, in fact, the proofs of the corresponding results are highly parallel. For convenience, we opt to state all results on the cyclic case so all results can be stated with a single notation without restrictions on the parity of p. It is worthwhile noticing, however, that GR(2a ,m)[x] is not isomorphic to its negacyclic counterpart which is a chain ring x2s −1 [14]. One should also point out that in [13] where these results appear in their entirety including proofs, it was found to be better to work on the negacyclic case for the proofs when p is odd. In the case of p = 2, the results can be obtained using x + 1 or x − 1. For the reason mentioned, x + 1 was chosen to present the p = 2 case in [13] also. Here we will use x − 1 for the purposes of stating the results for arbitrary prime p in a concise way. We use standard ring-theoretic terminology which we include here for the convenience of the reader. For further information, the reader may consult a standard reference such [15]. For an account focusing on finite rings, see [16]. A left module M of a ring R is simple if M = 0 and M has no R-submodules other than (0) and M . The socle of a ring R denoted by soc(R), is the sum of of all minimal left ideals of R. The Jacobson Radical of ring R denoted by J(R), is the intersection of all maximal left ideals of R. A chain ring is a ring whose ideals are linearly ordered by inclusion. Since all rings in this paper are commutative, the use of left modules, left ideals, etc. is unnecessary in our context and the reader may simply ignore the word ”left” in each definition.”
2
The Main Results
The following results hold for ambient rings and p an arbitrary prime. Proposition 1. In
GR(pa ,m)[x] , xps −1
p
a−1
(ps −1)
(x − 1)
for integers a > 1, s > 0
the element (x − 1) is nilpotent.
Proposition 2. The ambient ring GR(pa ,m)[x] = p, x − 1. s xp −1 Proposition 3. The socle soc
GR(pa ,m)[x] xps −1
GR(pa ,m)[x] xps −1
GR(pa ,m)[x] xps −1
of
is a local ring with radical J
GR(pa ,m)[x] xps −1
is the simple module
.
Proposition 4. In the ambient ring
GR(pa ,m)[x] xps −1
1. p ∈ / x − 1 2. x − 1 ∈ / p a ,m)[x] 3. GR(p is not a chain ring s xp −1 4. p, x − 1 is not a principal ideal Theorem 1. The ambient ring that is not a chain ring.
GR(pa ,m)[x] xps −1
is a finite local ring with simple socle
Repeated Root Cyclic and Negacyclic Codes over Galois Rings
221
Example 1. To illustrate Theorem 1, we provide the following figure. It shows Z [x] the ideal lattice of x332−1 . Notice that the radical is 3, x − 1 and the socle is 3(x − 1)2 . More importantly, we see that the ring is not a chain ring. 1 3, x − 1 jjj ???TTTTTTT ?? TTTT jjjj j j j TTTT ?? jjj j j TTTT ?? j j j j TTTT j ? j j ? j TTTT j ? j j ? j TTTT j ? j2jj 3 + (x − 1) 3 + 2(x − x − 1 3, (x − 1) OO1) WWWWW OOOWZWZWZWZWZZZZ OOO WWWWW OOO WWWWZWZZZZZZZ OOO WWWWW WWWWW ZZZZZZZ OOO OOO WW ZZ WWWWW OOO O WWWWW ZZZZZZZZZZZZ WWWWWWWW OOO ZZZZZZZ WWWWW OOOOO WWWWW OOO Z W Z W W Z Z W W Z W Z Z W OOO ZZZZWZWZWZWWOWOOOO WWWWW ZZZW O 2 2 W 3 WWWWW 3 + (x − 1) , 3(x 3 + 2(x −O1) (x − 1)2 OOOWW−WZW1) Z OOO WWWWW OOO WZWZWZWZWZWZZZZZZZZ O W O W WWWWW ZZZZZZZ WWWWW OOO OOO ZZ WWWWW W OOO OOO WWWWW ZZZZZZZZZZZZ WWWWWWWW OOO O Z W Z W W Z WWWWW ZZZZZZZ WWWWW OOOOO OOO ZZZZZZZWWWW OO WWWWW OOO ZZZZWZWZWZWOO WWW O 2 2 3(x 3 + 6(x − 1) + (x 3 + (x − 1)2 TTT−T1) 3 + 3(x − 1) + ?? (x − 1) j − 1) TTTT ?? jjjj j j j TTTT ?? jj TTTT ?? jjjj TTTT ?? jjjj j j TTTT j TTTT ??? jjjj TT ? jjjjjj 2 3(x − 1) 0
3
Conclusion
The results in this paper shed light on a more general class of codes and their ambient structures, namely polynomial codes over Galois rings. It was shown here that such codes are not always principal. This abridged version of [13] should serve as insight for another paper by the authors which is in preparation at this time [17]. In that paper, the ambient rings for polynomial codes over Galois rings are studied which is the natural next step for the research presented here.
References 1. Calderbank, A.R., Sloane, N.J.A.: Modular and p-adic cyclic codes. Des. Codes Cryptogr. 6(1), 21–35 (1995) 2. Kanwar, P., L´ opez-Permouth, S.R.: Cyclic codes over the integers modulo pm . Finite Fields Appl. 3(4), 334–352 (1997) 3. Kiah, H.M., Leung, K.H., Ling, S.: Cyclic codes over GR(p2 , m) of length pk . Finite Fields Appl 14(3), 834–846 (2008)
222
S.R. L´ opez-Permouth and S. Szabo
4. Pless, V.S., Qian, Z.: Cyclic codes and quadratic residue codes over Z4 . IEEE Trans. Inform. Theory 42(5), 1594–1600 (1996) 5. S˘ al˘ agean, A.: Repeated-root cyclic and negacyclic codes over a finite chain ring. Discrete Appl. Math. 154(2), 413–419 (2006) 6. Wolfmann, J.: Negacyclic and cyclic codes over Z4 . IEEE Trans. Inform. Theory 45(7), 2527–2532 (1999) 7. Abualrub, T., Oehmke, R.: Cyclic codes of length 2e over Z4 . Discrete Appl. Math. 128(1), 3–9 (2003); International Workshop on Coding and Cryptography (WCC 2001), Paris (2001) 8. Abualrub, T., Oehmke, R.: On the generators of Z4 cyclic codes of length 2e . IEEE Trans. Inform. Theory 49(9), 2126–2133 (2003) 9. Castagnoli, G., Massey, J.L., Schoeller, P.A., von Seemann, N.: On repeated-root cyclic codes. IEEE Trans. Inform. Theory 37(2), 337–342 (1991) 10. Dougherty, S.T., Park, Y.H.: On modular cyclic codes. Finite Fields Appl. 13(1), 31–57 (2007) ¨ ¨ 11. Ozadam, H., Ozbudak, F.: A note on negacyclic and cyclic codes of length ps over a finite field of characteristic p (2009) (preprint) 12. van Lint, J.H.: Repeated-root cyclic codes. IEEE Trans. Inform. Theory 37(2), 343–345 (1991) 13. L´ opez-Permouth, S.R., Szabo, S.: On the Hamming weight of repeated root cyclic and negacyclic codes over Galois rings. arXiv:0903.2791v1 (submitted) 14. Dinh, H.Q.: Negacyclic codes of length 2s over Galois rings. IEEE Trans. Inform. Theory 51(12), 4252–4262 (2005) 15. Lam, T.Y.: A first course in noncommutative rings, 2nd edn. Graduate Texts in Mathematics, vol. 131. Springer, New York (2001) 16. McDonald, B.R.: Finite rings with identity. Pure and Applied Mathematics, vol. 28. Marcel Dekker Inc., New York (1974) 17. L´ opez-Permouth, S.R., Szabo, S.: Polynomial codes over Galois rings (in preparation)
Construction of Additive Reed-Muller Codes
J. Pujol, J. Rifà, and L. Ronquillo
Department of Information and Communications Engineering, Universitat Autònoma de Barcelona, 08193-Bellaterra, Spain
Abstract. The well-known Plotkin construction is, in the current paper, generalized and used to yield new families of Z2Z4-additive codes, whose length, dimension and minimum distance are studied. These new constructions enable us to obtain families of Z2Z4-additive codes such that, under the Gray map, the corresponding binary codes have the same parameters and properties as the usual binary linear Reed-Muller codes. Moreover, the first family is the usual binary linear Reed-Muller family.
Keywords: Z2Z4-additive codes, Plotkin construction, Reed-Muller codes, Z2Z4-linear codes.
1
Introduction
The aim of our paper is to obtain a generalization of the Plotkin construction giving rise to families of Z2Z4-additive codes such that, after the Gray map, the corresponding Z2Z4-linear codes have the same parameters and properties as the family of binary linear RM codes. Moreover, we want the corresponding codes with parameters (r, m) = (1, m) and (r, m) = (m−2, m) to be, respectively, any one of the non-equivalent Z2Z4-linear Hadamard codes and Z2Z4-linear 1-perfect codes.
2
Constructions of Z2 Z4 -Additive Codes
In general, any non-empty subgroup $\mathcal{C}$ of $\mathbb{Z}_2^{\alpha} \times \mathbb{Z}_4^{\beta}$ is a Z2Z4-additive code, where $\mathbb{Z}_2^{\alpha}$ denotes the set of all binary vectors of length α and $\mathbb{Z}_4^{\beta}$ is the set of all β-tuples in $\mathbb{Z}_4$. Let $\mathcal{C}$ be a Z2Z4-additive code, and let $C = \Phi(\mathcal{C})$, where $\Phi : \mathbb{Z}_2^{\alpha} \times \mathbb{Z}_4^{\beta} \longrightarrow \mathbb{Z}_2^{n}$ is given by the map $\Phi(u_1, \ldots, u_\alpha \mid v_1, \ldots, v_\beta) = (u_1, \ldots, u_\alpha \mid \phi(v_1), \ldots, \phi(v_\beta))$, where $\phi(0) = (0,0)$, $\phi(1) = (0,1)$, $\phi(2) = (1,1)$, $\phi(3) = (1,0)$ is the usual Gray map from $\mathbb{Z}_4$ onto $\mathbb{Z}_2^2$. Since the Gray map is distance preserving, the Hamming distance of a Z2Z4-linear code $C$ coincides with the Lee distance computed on the Z2Z4-additive code $\mathcal{C} = \Phi^{-1}(C)$.
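The distance-preserving property of the Gray map can be checked numerically; the sketch below is not from the paper and uses an arbitrary small value of β for the brute-force check.

```python
# A minimal sketch (not from the paper): the Gray map phi on Z_4 and a brute-force
# check, for a small assumed beta, that it carries Lee distance on Z_4^beta
# to Hamming distance on Z_2^(2*beta).
from itertools import product

PHI = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}   # Gray map on Z_4

def gray(v):
    """Extend phi coordinatewise to a quaternary vector."""
    return tuple(bit for q in v for bit in PHI[q])

def lee_distance(u, v):
    return sum(min((a - b) % 4, (b - a) % 4) for a, b in zip(u, v))

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

beta = 3  # illustrative choice
for u in product(range(4), repeat=beta):
    for v in product(range(4), repeat=beta):
        assert lee_distance(u, v) == hamming_distance(gray(u), gray(v))
print("Gray map is Lee-to-Hamming distance preserving on Z_4^%d" % beta)
```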
This work has been partially supported by the Spanish MICINN Grants MTM2006-03250, TSI2006-14005-C02-01, PCI2006-A7-0616 and also by the Comissionat per a Universitats i Recerca de la Generalitat de Catalunya under grant FI2008.
A Z2Z4-additive code $\mathcal{C}$ is also isomorphic to an abelian structure $\mathbb{Z}_2^{\gamma} \times \mathbb{Z}_4^{\delta}$. Therefore, $\mathcal{C}$ has $|\mathcal{C}| = 2^{\gamma}4^{\delta}$ codewords and, moreover, $2^{\gamma+\delta}$ of them are of order two. We call such a code $\mathcal{C}$ a Z2Z4-additive code of type (α, β; γ, δ) and its binary image $C = \Phi(\mathcal{C})$ a Z2Z4-linear code of type (α, β; γ, δ). Although $\mathcal{C}$ may not have a basis, it is important and appropriate to define a generator matrix for $\mathcal{C}$ as

$$ G = \begin{pmatrix} B_2 & Q_2 \\ B_4 & Q_4 \end{pmatrix}, \qquad (1) $$

where $B_2$ and $B_4$ are binary matrices of size γ × α and δ × α, respectively; $Q_2$ is a γ × β quaternary matrix whose row vectors have order two; and $Q_4$ is a δ × β quaternary matrix whose row vectors have order four.

2.1
Plotkin Construction
In this section we show that the well-known Plotkin construction can be generalized to Z2Z4-additive codes.

Definition 1 (Plotkin Construction). Let $\mathcal{X}$ and $\mathcal{Y}$ be any two Z2Z4-additive codes of types (α, β; γ_X, δ_X), (α, β; γ_Y, δ_Y) and minimum distances d_X, d_Y, respectively. If G_X and G_Y are the generator matrices of $\mathcal{X}$ and $\mathcal{Y}$, then the matrix

$$ G_P = \begin{pmatrix} G_X & G_X \\ 0 & G_Y \end{pmatrix} $$

is the generator matrix of a new Z2Z4-additive code $\mathcal{C}$.

Proposition 2. The code $\mathcal{C}$ defined above is a Z2Z4-additive code of type (2α, 2β; γ, δ), where γ = γ_X + γ_Y, δ = δ_X + δ_Y, binary length n = 2α + 4β, size $2^{\gamma+2\delta}$ and minimum distance d = min{2d_X, d_Y}.
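A small illustration of the minimum-distance behaviour d = min{2d_X, d_Y} is sketched below. It is not from the paper: it uses the classical binary (u | u + v) Plotkin construction that Definition 1 generalizes, applied to two small binary codes chosen here purely as an example.

```python
# A minimal sketch (not from the paper): the classical binary Plotkin construction,
# checked on two small linear codes.  X = [4,3] even-weight code, Y = [4,1] repetition
# code; the Plotkin matrix [[GX, GX], [0, GY]] then generates an [8,4,4] code.
import itertools

def span(G):
    """All F_2-linear combinations of the rows of G."""
    words = set()
    for coeffs in itertools.product((0, 1), repeat=len(G)):
        w = [0] * len(G[0])
        for c, row in zip(coeffs, G):
            if c:
                w = [a ^ b for a, b in zip(w, row)]
        words.add(tuple(w))
    return words

def min_distance(code):
    return min(sum(w) for w in code if any(w))  # linear code: d = min nonzero weight

GX = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]   # even-weight code, d_X = 2
GY = [[1, 1, 1, 1]]                                # repetition code, d_Y = 4

# Plotkin generator matrix GP = [[GX, GX], [0, GY]]
GP = [row + row for row in GX] + [[0] * 4 + row for row in GY]

dX, dY, dP = (min_distance(span(G)) for G in (GX, GY, GP))
print(dX, dY, dP, dP == min(2 * dX, dY))   # expected: 2 4 4 True
```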
2.2 BA-Plotkin Construction
Applying two Plotkin constructions, one after another, but slightly changing the submatrices in the generator matrix, we obtain a new construction with interesting properties with regard to the minimum distance of the generated code. We call this new construction the BA-Plotkin construction.

Given a Z2Z4-additive code $\mathcal{C}$ with generator matrix $G$, we denote by $G[b_2]$, $G[q_2]$, $G[b_4]$ and $G[q_4]$ the four submatrices $B_2$, $Q_2$, $B_4$, $Q_4$ of $G$ defined in (1), respectively; and by $G[b]$ and $G[q]$ the submatrices $\begin{pmatrix} B_2 \\ B_4 \end{pmatrix}$ and $\begin{pmatrix} Q_2 \\ Q_4 \end{pmatrix}$ of $G$, respectively.

Definition 3 (BA-Plotkin Construction). Let $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ be any three Z2Z4-additive codes of types (α, β; γ_X, δ_X), (α, β; γ_Y, δ_Y), (α, β; γ_Z, δ_Z) and minimum distances d_X, d_Y, d_Z, respectively. Let G_X, G_Y and G_Z be the generator matrices of the Z2Z4-additive codes $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$, respectively. We define a new code $\mathcal{C}$ as the Z2Z4-additive code generated by
$$
G_{BA} =
\begin{pmatrix}
G_X[b] & G_X[b] & 2G_X[b] & G_X[q] & G_X[q] & G_X[q] & G_X[q] \\
0 & G_Y[b_2] & G_Y[b_2] & 0 & 2\widehat{G}_Y[q_2] & \widehat{G}_Y[q_2] & 3\widehat{G}_Y[q_2] \\
0 & G_Y[b_4] & G_Y[b_4] & 0 & G_Y[q_4] & 2G_Y[q_4] & 3G_Y[q_4] \\
G_Y[b_4] & G_Y[b_4] & 0 & 0 & 0 & G_Y[q_4] & G_Y[q_4] \\
0 & 0 & 0 & 0 & G_Z[q] & 0 & G_Z[b]
\end{pmatrix},
$$
where $\widehat{G}_Y[q_2]$ denotes the matrix obtained from $G_Y[q_2]$ after switching twos by ones in its γ_Y rows of order two, and considering the ones from the third column block of the construction as ones in the quaternary ring $\mathbb{Z}_4$.

Proposition 4. The code $\mathcal{C}$ defined above is a Z2Z4-additive code of type (2α, α + 4β; γ, δ), where γ = γ_X + γ_Z, δ = δ_X + γ_Y + 2δ_Y + δ_Z, binary length n = 4α + 8β, size $2^{\gamma+2\delta}$ and minimum distance d = min{4d_X, 2d_Y, d_Z}.
3
Additive Reed-Muller Codes
We will refer to Z2Z4-additive Reed-Muller codes as ARM. Just as there is only one RM family in the binary case, in the Z2Z4-additive case there are $\lfloor\frac{m+2}{2}\rfloor$ families for each value of m. Each one of these families will contain one of the $\lfloor\frac{m+2}{2}\rfloor$ non-isomorphic Z2Z4-linear extended perfect codes which are known to exist for any m [1]. We will identify each family ARM_s(r, m) by a subindex s ∈ {0, . . . , $\lfloor\frac{m}{2}\rfloor$}.

3.1
The Families of ARM(r, 1) and ARM(r, 2) Codes
We start by considering the case m = 1, that is, the case of codes of binary length n = 2^1. The Z2Z4-additive Reed-Muller code ARM(0, 1) is the repetition code, of type (2, 0; 1, 0), which only has one nonzero codeword (the vector with two binary coordinates of value 1). The code ARM(1, 1) is the whole space $\mathbb{Z}_2^2$, thus a Z2Z4-additive code of type (2, 0; 2, 0). Both codes ARM(0, 1) and ARM(1, 1) are binary codes with the same parameters and properties as the corresponding binary RM(r, 1) codes (see [2]). We will refer to them as ARM_0(0, 1) and ARM_0(1, 1), respectively.
The generator matrix of ARM_0(0, 1) is $G_0(0,1) = \begin{pmatrix} 1 & 1 \end{pmatrix}$ and the generator matrix of ARM_0(1, 1) is $G_0(1,1) = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.

For m = 2 we have two families, s = 0 and s = 1, of additive Reed-Muller codes of binary length n = 2^2. The family ARM_0(r, 2) consists of binary codes obtained from applying the Plotkin construction defined in Proposition 2 to the family ARM_0(r, 1). For s = 1, we define ARM_1(0, 2), ARM_1(1, 2) and ARM_1(2, 2) as the codes with generator matrices $G_1(0,2) = \begin{pmatrix} 1 & 1 & 2 \end{pmatrix}$, $G_1(1,2) = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 1 \end{pmatrix}$ and $G_1(2,2) = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$, respectively.
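The sketch below is not from the paper; it takes the (assumed) reading of the garbled generator matrix G_1(1, 2) given above, enumerates the corresponding Z2Z4-additive code, and checks that its Gray image has the RM(1, 2) parameters: length 4, 2^3 codewords, minimum distance 2.

```python
# A minimal sketch (not from the paper): generate the Z2Z4-additive code with the
# assumed generator matrix G_1(1,2) = [[1 1 | 2], [0 1 | 1]] (alpha = 2, beta = 1),
# take its Gray image, and check it has the RM(1,2) parameters [4, 2^3, 2].
from itertools import product

ALPHA, BETA = 2, 1
GEN = [((1, 1), (2,)), ((0, 1), (1,))]   # rows as (binary part, quaternary part)
PHI = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}

def add(u, v):
    ub, uq = u
    vb, vq = v
    return (tuple((a + b) % 2 for a, b in zip(ub, vb)),
            tuple((a + b) % 4 for a, b in zip(uq, vq)))

def scale(k, u):
    w = ((0,) * ALPHA, (0,) * BETA)
    for _ in range(k):
        w = add(w, u)
    return w

# All codewords: coefficients 0..3 for every generator row (extra multiples collapse).
code = set()
for coeffs in product(range(4), repeat=len(GEN)):
    w = ((0,) * ALPHA, (0,) * BETA)
    for k, row in zip(coeffs, GEN):
        w = add(w, scale(k, row))
    code.add(w)

gray_image = {ub + tuple(b for q in uq for b in PHI[q]) for ub, uq in code}
weights = sorted(sum(c) for c in gray_image if any(c))
print(len(gray_image), min(weights))   # expected: 8 2  (parameters of RM(1,2))
```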
3.2
Plotkin and BA-Plotkin Constructions
Take the family ARM_s and let ARM_s(r, m−1), ARM_s(r−1, m−1) and ARM_s(r−2, m−1), 0 ≤ s ≤ $\lfloor\frac{m-1}{2}\rfloor$, be three consecutive codes with parameters (α, β; γ', δ'), (α, β; γ'', δ'') and (α, β; γ''', δ'''); binary length n = 2^{m−1}; minimum distances 2^{m−r−1}, 2^{m−r} and 2^{m−r+1}; and generator matrices G_s(r, m−1), G_s(r−1, m−1) and G_s(r−2, m−1), respectively. By using Proposition 2 and Proposition 4 we can prove the following results:

Theorem 5. For any r and m ≥ 2, 0 < r < m, the code ARM_s(r, m) obtained by applying the Plotkin construction from Definition 1 to the codes ARM_s(r, m−1) and ARM_s(r−1, m−1) is a Z2Z4-additive code of type (2α, 2β; γ, δ), where γ = γ' + γ'' and δ = δ' + δ''; binary length n = 2^m; size 2^k codewords, where $k = \sum_{i=0}^{r}\binom{m}{i}$; minimum distance 2^{m−r}; and ARM_s(r−1, m) ⊂ ARM_s(r, m).

We consider ARM_s(0, m) to be the repetition code with only one nonzero codeword (the vector with 2α ones and 2β twos) and ARM_s(m, m) to be the whole space $\mathbb{Z}_2^{2\alpha} \times \mathbb{Z}_4^{2\beta}$.

Theorem 6. For any r and m ≥ 3, 0 < r < m, s > 0, use the BA-Plotkin construction from Definition 3, where the generator matrices G_X, G_Y, G_Z stand for G_s(r, m−1), G_s(r−1, m−1) and G_s(r−2, m−1), respectively, to obtain a new Z2Z4-additive code ARM_{s+1}(r, m+1) of type (2α, α + 4β; γ, δ), where γ = γ' + γ''', δ = δ' + γ'' + 2δ'' + δ'''; binary length n = 2^{m+1}; 2^k codewords, where $k = \sum_{i=0}^{r}\binom{m+1}{i}$; minimum distance 2^{m−r+1} and, moreover, ARM_{s+1}(r−1, m+1) ⊂ ARM_{s+1}(r, m+1).

To be coherent with all notations, the code ARM_{s+1}(−1, m+1) is defined as the code containing only the all-zero codeword, the code ARM_{s+1}(0, m+1) is defined as the repetition code with only one nonzero codeword (the vector with 2α ones and α + 4β twos), whereas the codes ARM_{s+1}(m, m+1) and ARM_{s+1}(m+1, m+1) are defined as the even Lee weight code and the whole space $\mathbb{Z}_2^{2\alpha} \times \mathbb{Z}_4^{\alpha+4\beta}$, respectively.

Using both Theorem 5 and Theorem 6 we can now construct all ARM_s(r, m) codes for m > 2. Once the Gray map is applied, all these codes give rise to binary codes with the same parameters and properties as the RM codes. Moreover, when m = 2 or m = 3, they also have the same codewords.
References
1. Borges, J., Rifà, J.: A characterization of 1-perfect additive codes. IEEE Trans. Inform. Theory 45(5), 1688–1697 (1999)
2. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes. North-Holland Publishing Company, Amsterdam (1977)
Gröbner Representations of Binary Matroids
M. Borges-Quintana¹, M.A. Borges-Trenard¹, and E. Martínez-Moro²
¹ Dpto. de Matemática, FCMC, U. de Oriente, Santiago de Cuba, Cuba
[email protected], [email protected]
² Dpto. de Matemática Aplicada, U. de Valladolid, Castilla, Spain
[email protected]
Abstract. Several constructions in binary linear block codes are also related to matroid theory topics. These constructions rely on a given order in the ground set of the matroid. In this paper we define the Gröbner representation of a binary matroid and we show how it can be used for studying different sets: bases, cycles, activity intervals, etc.
1
Introduction
In this work we provide a representation of a binary matroid. We have tried to keep the "Gröbner machinery" [11] out of the paper for those readers not familiar with it. Effective methods for computing the Gröbner representation, which heavily depend on Gröbner basis techniques, can be found in other papers of the authors [1,2,3,4]. The starting point of the research in Gröbner representations can be found in [5,11,12,16].

Let E be a finite set called the ground set and I ⊂ 2^E. We say that M = (E, I) is a matroid if the following conditions are satisfied: (i) ∅ ∈ I, (ii) if A ∈ I and B ⊆ A, then B ∈ I, and (iii) if A, B ∈ I and |A| = |B| + 1, then there is an element x ∈ A \ B such that B ∪ {x} ∈ I.

Let F_2 be the finite field with 2 elements and G a binary matrix with column index set E. Let I be the collection of sets I ⊆ E such that the column submatrix indexed by I has F_2-independent columns. If we consider ∅ ∈ I, then M = (E, I) is the binary matroid with ground set E and independent sets I. A base B of M is a maximum cardinality set B ∈ I, and a circuit C of M is a subset of E that indexes an F_2-minimal dependent column submatrix of G. We call all the dependent sets (minimal and non-minimal) cycles. G can be seen as a generator matrix of a code. From now on we consider only binary matrices that give projective codes, i.e. those for which any two columns of the generator matrix are different (equivalently, the dual code has minimum weight at least 3). Circuits (respectively cycles) of the matroid defined by G are in one-to-one correspondence with the minimum weight codewords (respectively codewords)
Partially funded by "Agencia Española de Cooperación Internacional", Project A/016959/08. Partially funded by "Junta de Castilla y León" VA65A07 and "Ministerio Educación y Ciencia" I+D/I+D+I MTM2007-66842, MTM2007-64704.
of the dual code. In matroids the greedy algorithm always succeeds in choosing a base of minimum weight; formally:

Proposition 1. Let M = (E, I) be a matroid where the elements of E are totally ordered. Then
1. there is a base A = {a_1, a_2, . . . , a_k} with a_1 < a_2 < . . . < a_k, called the first basis, such that for any other base X = {x_1, x_2, . . . , x_k} with x_1 < x_2 < . . . < x_k we have a_i ≤ x_i;
2. there is a base B = {b_1, b_2, . . . , b_k} with b_1 < b_2 < . . . < b_k, called the last basis, such that for any other base X = {x_1, x_2, . . . , x_k} with x_1 < x_2 < . . . < x_k we have b_i ≥ x_i.

The first base is computed by a greedy algorithm: {a_1} is the smallest independent set of size 1, then a_2 is the smallest element such that {a_1, a_2} is independent, and so on. The last base is built in a similar way.
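As an illustration (not from the paper), the greedy computation of the first basis can be carried out directly on a binary matrix, testing independence by Gaussian elimination over F_2; the matrix used below (a generator matrix of the [7,4] Hamming code) is an assumption made only for the example.

```python
# A minimal sketch (not from the paper): the greedy first basis of the binary matroid
# of a matrix G over F_2, with independence tested by row reduction.
def rank_f2(vectors):
    """Rank over F_2 of a list of 0/1 vectors (Gaussian elimination)."""
    mat = [list(v) for v in vectors]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for j in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][j]), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][j]:
                mat[i] = [a ^ b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r

def first_basis(G):
    """Greedily pick column indices forming the first basis of the matroid of G."""
    cols = list(zip(*G))          # columns of G, indexed by the ground set E = 0..n-1
    chosen, basis = [], []
    for e, col in enumerate(cols):
        if rank_f2(chosen + [col]) > len(chosen):   # adding e keeps the set independent
            chosen.append(col)
            basis.append(e)
    return basis

# Generator matrix of the [7,4] Hamming code (an illustrative choice).
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
print(first_basis(G))   # expected: [0, 1, 2, 3]
```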
2
A Gröbner Representation of a Binary Matroid
Let $\{e_i\}_{i=1}^{n}$ be the canonical basis of $\mathbb{F}_2^n$ (i.e. e_i is the vector with 1 in position i and 0 elsewhere). We consider the following total order on the elements of $\mathbb{F}_2^n$. For x, y ∈ $\mathbb{F}_2^n$ we say that x ≺_e y if |supp(x)| ≤ |supp(y)|, or, if |supp(x)| = |supp(y)|, then x ≺ y, where ≺ is the lexicographic ordering on the vectors of $\mathbb{F}_2^n$. Consider a binary matroid and C its associated projective code. We define the Gröbner representation of the matroid as follows.

Canonical forms. The set N is a transversal of $\mathbb{F}_2^n / C^{\perp}$ such that for each n ∈ N \ {0} there exists e_i, 1 ≤ i ≤ n, such that n = n' + e_i and n' ∈ N.

Matphi. Let φ : N × $\{e_i\}_{i=1}^{n}$ → N be the function that maps each pair (n, e_i) to the element in N representing the coset containing n + e_i.

We call the pair (N, φ) a Gröbner representation of M. Suppose we consider the ideal given by I(M) = {x^a − x^b | (a − b) ∈ C^{⊥}} ⊆ $\mathbb{F}_2[x_1, x_2, \ldots, x_n]$, where (abusing the notation) we consider an element a in $\mathbb{F}_2^n$ as a vector a = (a_1, a_2, . . . , a_n) ∈ $\mathbb{Z}^n$; then x^a represents the monomial $\prod_{i=1}^{n} x_i^{a_i}$. A Gröbner representation [11] of the ideal I(M) is given by (N, φ). An algorithm for computing a Gröbner basis of the ideal can be found in [1]. The following greedy algorithm computes the first base given (N, φ):

1. FB ← 0, v ← 0.
2. For i = 1, . . . , n do: if φ(v, e_i) = 0 then FB ← FB + e_i, v ← φ(v, e_i).
3. Return FB.

The returned vector is a 0-1 vector whose ones index the elements of the first basis. The same algorithm applies for the last base, just changing the order of the loop counter. One can compute a trellis for a linear code isomorphic to Muder's minimal trellis [13] based on the first and last basis, see [6].
3
Activities
Let M = (E, I) be a matroid on a linearly ordered set E, and let A ⊆ E. We say that an element e ∈ E is M-active with respect to A if there is a circuit γ such that e ∈ γ ⊆ A ∪ {e} and e is the smallest element of the circuit. We denote by Act_M(A) the set of M-active elements with respect to A.

If C is the binary code associated to the matroid, then a codeword c = (c_1, . . . , c_n) of the code C^⊥ (i.e. a cycle of the matroid) is said to have the span interval [i, j], where i ≤ j, if c_i c_j ≠ 0, and c_l = 0 if l < i or j < l. In this case the codeword c is said to be sup-active in the interval [i, j − 1], and has span length j − i + 1. Sup-active words in binary codes are closely connected with the construction and analysis of trellis oriented generator matrices of C^⊥ (see [8,10] and the references therein).

Given (N, φ) for a matroid M, the nodes of the GR-graph (Gröbner representation graph) are the set N of canonical forms, and there is an edge from node n_1 to n_2 if there is an i, 1 ≤ i ≤ n, such that φ(n_1, e_i) = n_2; the label of that edge is e_i. For A ⊆ E = [1, n] we define the graph GR_A obtained from the GR-graph of the matroid by deleting all the edges labelled with e_i such that i ∈ E \ A, and an element j is active if:
1. either j ∈ A and there is a circuit in GR_A such that all the labels e_l fulfill l ≥ j and the (binary) sum of all the labels is different from 0,
2. or j ∉ A and there is a circuit in GR_{A∪{j}} such that all the labels e_l fulfill l ≥ j and the (binary) sum of all the labels is different from 0.

Let A be an interval and let I_A be the elimination ideal of I = I(M) (see [7] for a definition) for the variables in {x_i}_{i∈A}. Let g be the greatest element in A; then the cycles correspond to the non-zero binomials in e
References
1. Borges-Quintana, M., Borges-Trenard, M.A., Fitzpatrick, P., Martínez-Moro, E.: On a Gröbner bases and combinatorics for binary codes. Appl. Algebra Engrg. Comm. Comput. 19(5), 393–411 (2008)
2. Borges-Quintana, M., Borges-Trenard, M.A., Martínez-Moro, E.: A general framework for applying FGLM techniques to linear codes. In: Fossorier, M.P.C., Imai, H., Lin, S., Poli, A. (eds.) AAECC 2006. LNCS, vol. 3857, pp. 76–86. Springer, Heidelberg (2006)
3. Borges-Quintana, M., Borges-Trenard, M.A., Martínez-Moro, E.: On a Gröbner bases structure associated to linear codes. J. Discrete Math. Sci. Cryptogr. 10(2), 151–191 (2007)
4. Borges-Quintana, M., Borges-Trenard, M.A., Martínez-Moro, E.: A Gröbner representation of linear codes. In: Advances in Coding Theory and Cryptography, pp. 17–32. World Scientific, Singapore (2007)
5. Borges-Quintana, M., Borges-Trenard, M.A., Mora, T.: Computing Gröbner Bases by FGLM Techniques in a Noncommutative Setting. J. Symb. Comp. 30, 429–449 (2000)
6. Cameron, P.: Codes, matroids and trellises, citeseerx.ist.psu.edu/343758.html
7. Cox, D., Little, J., O'Shea, D.: Ideals, varieties, and algorithms (1992)
8. Esmaeili, M., Gulliver, T., Secord, N.: Trellis complexity of linear block codes via atomic codewords. In: Chouinard, J.-Y., Fortier, P., Gulliver, T.A. (eds.) Information Theory 1995. LNCS, vol. 1133, pp. 130–148. Springer, Heidelberg (1996)
9. Faugère, J., Gianni, P., Lazard, D., Mora, T.: Efficient Computation of Zero-Dimensional Gröbner Bases by Change of Ordering. J. Symb. Comp. 16(4), 329–344 (1993)
10. Kschischang, F.R., Sorokine, V.: On the trellis structure of block codes. IEEE Trans. Inform. Theory 41, 1924–1937 (1995)
11. Mora, T.: Solving Polynomial Equation Systems II: Macaulay's Paradigm and Gröbner Technology. Cambridge Univ. Press, Cambridge (2005)
12. Mora, T.: The FGLM Problem and Möller's Algorithm on Zero-dimensional Ideals. In: [15]
13. Muder, D.J.: Minimal trellises for block codes. IEEE Trans. Inform. Theory 34(5, part 1), 1049–1053 (1988)
14. Oxley, J.G.: Matroid Theory. Oxford University Press, Oxford (1992)
15. Sala, M., Mora, T., Perret, L., Sakata, S., Traverso, C. (eds.): Gröbner Bases, Coding, and Cryptography. Springer, Heidelberg (2009) (to appear)
16. Reinert, B., Madlener, K., Mora, T.: A Note on Nielsen Reduction and Coset Enumeration. In: Proc. ISSAC 1998, pp. 171–178. ACM, New York (1998)
A Generalization of the Zig-Zag Graph Product by Means of the Sandwich Product
David M. Monarres¹ and Michael E. O'Sullivan²
¹ Grossmont College, El Cajon, CA
² San Diego State University, San Diego, CA
Abstract. In this paper we develop a generalization of the zig-zag graph product created by Reingold, Vadhan, and Wigderson [8]. We do this by using a broader definition of directed and undirected graphs in which incidence is determined by functions from the edge set to the vertex set. We introduce the sandwich product of graphs and show how our general zig-zag product is a sandwich product.
1
Introduction
In this extended abstract we present a generalization of the zig-zag product of graphs, which is defined in the manner outlined in the paper by Hoory et al. [3]. For this definition let
– G be an m-regular graph on n vertices,
– H be a d-regular graph on m vertices,
– v_1, v_2, . . . , v_m be an enumeration of V(H),
– for each u ∈ V(G), e^1_u, e^2_u, . . . , e^m_u be a fixed enumeration of the edges incident with u.
Definition 1. The zig-zag product of graphs G and H, denoted G ⓩ H, is a graph on V(G) × V(H) for which (u, v_i) is adjacent to (w, v_j) if and only if there exist k, l ∈ {1, 2, . . . , m} such that both
– (v_i, v_k), (v_l, v_j) ∈ E(H), and
– e^k_u = e^l_w.

The development of the zig-zag product was a seminal step in the explicit construction of expander graphs. These graphs have been used to address fundamental problems in computer science, such as network design [6,7] and complexity theory [11,10], and areas of pure mathematics, such as topology [1] and measure theory [5]. For a more general survey of the development of the construction of expander graphs see Hoory et al. [3].
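The adjacency condition of Definition 1 can be exercised on a toy instance; the sketch below is not from the paper and chooses the smallest convenient pair of graphs (a triangle for G and K_2 for H) purely as an illustration.

```python
# A minimal sketch (not from the paper): Definition 1 on a toy pair of graphs,
# G = triangle (2-regular on 3 vertices) and H = K_2 (so m = 2).
from itertools import product

# Edges of G incident with each vertex, in a fixed enumeration e^1_u, e^2_u.
G_edges_at = {0: [frozenset({0, 1}), frozenset({0, 2})],
              1: [frozenset({0, 1}), frozenset({1, 2})],
              2: [frozenset({1, 2}), frozenset({0, 2})]}
H_vertices = [0, 1]                      # v_1, v_2 (0-indexed here)
H_edges = {frozenset({0, 1})}            # K_2

def h_adjacent(a, b):
    return frozenset({a, b}) in H_edges

def zigzag_adjacent(p, q):
    (u, vi), (w, vj) = p, q
    # (u, v_i) ~ (w, v_j) iff there are k, l with (v_i, v_k), (v_l, v_j) in E(H)
    # and the k-th edge at u equals the l-th edge at w.
    return any(h_adjacent(vi, vk) and h_adjacent(vl, vj)
               and G_edges_at[u][vk] == G_edges_at[w][vl]
               for vk, vl in product(H_vertices, repeat=2))

V = list(product(G_edges_at, H_vertices))
for p in V:
    print(p, "->", [q for q in V if q != p and zigzag_adjacent(p, q)])
```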
This research was supported by NSF CCF- Theoretical Foundations grant #0635382.
2
Graphs, Sandwiches, and the Zig-Zag Product
Our approach does not require the regularity of the first graph to be equal to the number of edges in the second graph; moreover, neither graph needs to be regular. We use definitions of digraph and of graph that allow for loops and parallel edges. The definition is perhaps not standard but is found in Harary [2] (where the term net is used instead of digraph) and is close to the definitions used by MacLane [4] and Serre [9]. We introduce the sandwich product of graphs and we show that the zig-zag product of graphs may be concisely presented as the sandwich product of two relatively simple graphs.

2.1
Directed and Undirected Graphs
In the following definitions, a graph is a digraph with additional structure.

Definition 2. A directed graph (or digraph), denoted by the letter G, is a collection of two sets E(G) and V(G) together with two functions σ_G and τ_G from E(G) to V(G), where
– E(G) and V(G) are known as the edge and vertex sets of the net G,
– σ_G and τ_G are known as the source and terminus functions of the net G.

Definition 3. An undirected graph, or just graph, is a digraph G in which there exists a unique involution ρ_G : E → E such that τ_G = σ_G ◦ ρ_G. We call the function ρ_G the rotation mapping of the graph G.

The term rotation map was used in [8]. In the usual depiction of a graph an edge between two vertices represents two edges according to the preceding definition (one going each direction), and the two are images of each other under the involution ρ_G.

The tip-to-tail concatenation of the graphs used in the definitions of the zig-zag and replacement products has a connection to a more general construct, the pullback of mappings of sets.

Definition 4. Let A, B, and C be sets and let f : A → C and g : B → C be functions. Then the pullback of the set functions f and g is the set A ×_C B = {(a, b) | f(a) = g(b)} together with the standard coordinate projections π_1 and π_2.

We can look at the tip-to-tail traversing of the edges of two graphs in the zig-zag product as really just the pullback of the terminus of one graph and the source of the other. More generally we may define the concatenation of any two graphs as follows:

Definition 5. Let G and H be directed graphs with a common vertex set V and let E(G) ×_V E(H) be the pullback of the mappings τ_G and σ_H. Then the concatenation of G with H, denoted by G ⓒ H, is the directed graph with vertex set V, edge set E(G) ×_V E(H), source map σ = σ_G ◦ π_1 and terminus map τ = τ_H ◦ π_2.
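The pullback used in Definitions 4 and 5 is easy to compute explicitly; the sketch below is not from the paper, and the two small edge sets in it are invented only for illustration.

```python
# A minimal sketch (not from the paper): the pullback A x_C B of two set maps
# f : A -> C and g : B -> C, as used for tip-to-tail edge concatenation.
def pullback(A, B, f, g):
    """Return {(a, b) in A x B : f(a) = g(b)} with the two coordinate projections."""
    P = [(a, b) for a in A for b in B if f(a) == g(b)]
    pi1 = lambda p: p[0]
    pi2 = lambda p: p[1]
    return P, pi1, pi2

# Toy example: edges of two digraphs on vertices {0, 1}, each edge a (source, terminus) pair.
EG = [(0, 1), (1, 0), (1, 1)]          # edges of G
EH = [(0, 0), (1, 0)]                  # edges of H
tau_G = lambda e: e[1]                 # terminus in G
sigma_H = lambda e: e[0]               # source in H
concat_edges, _, _ = pullback(EG, EH, tau_G, sigma_H)
print(concat_edges)   # pairs (e, f) with the head of e equal to the tail of f
```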
While G ⓒ H is not guaranteed to be a graph, the double concatenation of the form G ⓒ H ⓒ G always is. We will call this type of concatenation the sandwich product.
Definition 6. Let G and H be graphs with common vertex set V. Then the sandwich product of G with H, denoted G ⓢ H, is the graph defined by the triple concatenation G ⓢ H = G ⓒ H ⓒ G.

The sandwich product is a "zig-zag"-like product, but with the sole requirement that the two graphs have the same vertex set.

2.2
A Zig-Zag Sandwich
We can now display the zig-zag product of two graphs as the sandwich product of two simple graphs. The first factor we call the zig product and the second factor the zag product.

Definition 7. Let G and H be graphs. The zig product of G with H, denoted G ⓘ H, is |V(G)| copies of the graph H: it has edge set V(G) × E(H), vertex set V(G) × V(H), source mapping id_{V(G)} × σ_H and rotation mapping id_{V(G)} × ρ_H.
The zig product graph will lie at the beginning and end of our product, much like how we start and end in a "cloud" when we traverse the replacement product graph in a "zig-zag" fashion. We now need to define our analog of the bridges between the "clouds" in the zig-zag. We will call this graph the zag product. The zig-zag product is the sandwich product of the zig and the zag.

Definition 8. Let G and H be graphs and φ : E(G) → V(H) be any mapping. Then the zag product of G and H, denoted G ⓐ H, is the graph with vertex set V(G) × V(H), edge set E(G ⓐ H) = E(G), source mapping σ_zag = σ_G × φ and rotation mapping ρ_zag = ρ_G.

The zig-zag product is then the sandwich product of the zig and the zag:
G ⓩ H = (G ⓘ H) ⓢ (G ⓐ H).
The definition above requires only a mapping from the edge set of G to the vertex set of H. It is more general than the one in the paper by Reingold et al. [8], in which G must be regular with regularity equal to the number of vertices in H.
References
1. Gromov, M.: Spaces and questions. Geom. Funct. Anal. (Special Volume, Part I), 118–161 (2000); GAFA 2000 (Tel Aviv, 1999)
2. Harary, F., Norman, R.Z., Cartwright, D.: Structural Models: An Introduction to the Theory of Directed Graphs. John Wiley & Sons, Inc., Chichester (1966)
3. Hoory, S., Linial, N., Wigderson, A.: Expander Graphs and their Applications. Bulletin of the American Mathematical Society 43(4), 439–561 (2006)
4. MacLane, S.: Categories for the working mathematician. Springer, Heidelberg (1971)
5. Lubotzky, A.: Discrete groups, expanding graphs and invariant measures. Progress in Mathematics, vol. 125. Birkhäuser, Basel (1994); With an appendix by Rogawski, J.D.
6. Pippenger, N.: Sorting and selecting in rounds. SIAM Journal on Computing 16(6), 1032–1038 (1987)
7. Pippenger, N., Yao, A.C.: Rearrangeable networks with limited depth. SIAM Journal of Algebraic and Discrete Methods 3, 411–417 (1982)
8. Reingold, O., Vadhan, S., Wigderson, A.: Entropy waves, the zig-zag graph product, and new constant-degree expanders. Annals of Mathematics 155, 157–187 (2002)
9. Serre, J.-P.: Trees. Springer, Heidelberg (1980)
10. Valiant, L.G.: Graph-theoretic arguments in low-level complexity. In: Gruska, J. (ed.) MFCS 1977. LNCS, vol. 53, pp. 162–176. Springer, Heidelberg (1977)
11. Urquhart, A.: Hard examples for resolution. Journal of the Association for Computing Machinery 34(1), 209–219 (1987)
Novel Efficient Certificateless Aggregate Signatures
Lei Zhang¹, Bo Qin¹,³, Qianhong Wu¹,², and Futai Zhang⁴
¹ UNESCO Chair in Data Privacy, Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Països Catalans 26, E-43007 Tarragona, Catalonia
{lei.zhang,bo.qin,qianhong.wu}@urv.cat
² Key Lab. of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Computer, Wuhan University, China
³ Dept. of Maths, School of Science, Xi'an University of Technology, China
⁴ College of Mathematics and Computer Science, Nanjing Normal University, Nanjing, China
[email protected]
Abstract. We propose a new efficient certificateless aggregate signature scheme which has the advantages of both aggregate signatures and certificateless cryptography. The scheme is proven existentially unforgeable against adaptive chosen-message attacks under the standard computational Diffie-Hellman assumption. Our scheme is also efficient in both communication and computation. The proposal is practical for message authentication in many-to-one communications.
1
Introduction
The recently introduced notion of aggregate signatures [2] allows an efficient algorithm to aggregate n signatures of n distinct messages from n different signers into one single signature. The resulting aggregate signature can convince a verifier that the n signers did indeed sign the n original messages. These properties greatly reduce the resulting signature size and make aggregate signatures very applicable to message authentication in many-to-one communications. The inception of certificateless cryptography [1] efficiently addresses the key escrow problem in ID-based cryptography. In certificateless cryptosystems, a trusted Key Generation Center (KGC) helps each user to generate his private key. Unlike ID-based cryptosystems, the KGC in certificateless cryptosystems merely determines a partial private key rather than a full private key for each user. Then the user computes the resulting private key from the obtained partial private key and a self-chosen secret value. As for the public key of each user, it is computed from the KGC's public parameters and the secret value chosen by the user. With this mechanism, certificateless cryptosystems avoid the key escrow problem of ID-based cryptosystems. The advantages of certificateless cryptosystems motivate a number of further studies. The first certificateless signature scheme was presented by Al-Riyami
and Paterson [1]. A security definition of certificateless signatures was formalized in [7] and the Al-Riyami-Paterson scheme was analyzed in this model. The security model of CLS schemes was further enhanced in [6,8,9]. Two Certificateless Aggregate Signature (CLAS) schemes were recently presented [5] with security proofs in a weak model similar to that in [7]. Subsequently, a new CLAS scheme was proposed in [10] and proven secure in a stronger security model. As for efficiency, the existing schemes require a relatively large number of pairing computations in the verification process and suffer from long resulting signatures.

Our Contribution. In this paper, we propose a novel CLAS scheme which is more efficient than existing schemes. In the random oracle model, our CLAS scheme is proven existentially unforgeable against adaptive chosen-message attacks under the standard CDH assumption. It allows multiple signers to sign multiple documents in an efficient way, and the total verification information (the length of the signature) consists of only 2 group elements. Our scheme is also very efficient in computation, and the verification procedure needs only a very small constant number of pairing computations, independent of the number of aggregated signatures.
2
Our Certificateless Aggregate Signature Scheme
In this section, we propose a new certificateless aggregate signature scheme. Our scheme is realized in groups allowing efficient bilinear maps [3].

2.1
The Scheme
The specification of the scheme is as follows.
– Setup: Given a security parameter, the KGC chooses a cyclic additive group G_1 which is generated by P with prime order q, chooses a cyclic multiplicative group G_2 of the same order and a bilinear map e : G_1 × G_1 −→ G_2. The KGC also chooses a random λ ∈ Z_q^* as the master-key, sets P_T = λP, and chooses cryptographic hash functions H_1 ∼ H_4 : {0, 1}^* −→ G_1 and H_5 : {0, 1}^* −→ Z_q^*. The system parameter list is params = (G_1, G_2, e, P, P_T, H_1 ∼ H_5).
– Partial-Private-Key-Extract: This algorithm is performed by the KGC; it accepts params, the master-key λ and a user's identity ID_i ∈ {0, 1}^*, and generates the partial private key for the user as follows.
1. Compute Q_{i,0} = H_1(ID_i, 0), Q_{i,1} = H_1(ID_i, 1).
2. Output the partial private key (D_{i,0}, D_{i,1}) = (λQ_{i,0}, λQ_{i,1}).
– UserKeyGen: This algorithm takes as input params and a user's identity ID_i, selects a random x_i ∈ Z_q^* and sets his secret/public key as x_i / P_i = x_i P.
– Sign: To sign a message M_i using the signing key (x_i, D_{i,0}, D_{i,1}), the signer, whose identity is ID_i and whose corresponding public key is P_i, first chooses a one-time-use string Δ and then performs the following steps.
1. Choose a random r_i ∈ Z_q^*, and compute R_i = r_i P.
2. Compute T = H_2(Δ), V = H_3(Δ), W = H_4(Δ).
3. Compute h_i = H_5(M_i||Δ||ID_i||P_i).
4. Compute S_i = D_{i,0} + x_i V + h_i(D_{i,1} + x_i W) + r_i T.
5. Output σ_i = (R_i, S_i) as the signature on M_i.
– Aggregation: Anyone can act as an aggregate signature generator and can aggregate a collection of individual signatures that use the same string Δ. For an aggregating set (using the same string Δ) of n users with identities {ID_1, . . . , ID_n} and corresponding public keys {P_1, . . . , P_n}, and message-signature pairs (M_1, σ_1 = (R_1, S_1)), . . . , (M_n, σ_n = (R_n, S_n)), respectively, the aggregate signature generator computes, from the individual signatures, R = Σ_{i=1}^{n} R_i and S = Σ_{i=1}^{n} S_i, and outputs the aggregate signature σ = (R, S).
– Aggregate Verify: To verify an aggregate signature σ = (R, S) signed by n users with identities {ID_1, . . . , ID_n} and corresponding public keys {P_1, . . . , P_n} on messages {M_1, . . . , M_n} under the same string Δ, the verifier performs the following steps.
1. Compute T = H_2(Δ), V = H_3(Δ), W = H_4(Δ), and for all i, 1 ≤ i ≤ n, compute h_i = H_5(M_i||Δ||ID_i||P_i), Q_{i,0} = H_1(ID_i, 0), Q_{i,1} = H_1(ID_i, 1).
2. Verify whether
$$ e(S, P) \stackrel{?}{=} e\Big(P_T, \sum_{i=1}^{n} Q_{i,0} + \sum_{i=1}^{n} h_i Q_{i,1}\Big)\, e(T, R)\, e\Big(W, \sum_{i=1}^{n} h_i P_i\Big)\, e\Big(V, \sum_{i=1}^{n} P_i\Big). $$
If the equation holds, output true. Otherwise, output false.

In our scheme, each user in an aggregating set must use the same one-time-use string Δ when signing. As mentioned in [4], it is straightforward to choose such a Δ in certain settings. For example, if the signers have access to some loosely synchronized clocks, Δ can be chosen based on the current time. Furthermore, if Δ is sufficiently long, then it will be statistically unique. By exploiting an approach similar to that presented in [4], the one-time-use restriction on the common reference string Δ in the above scheme can also be removed to achieve better applicability.
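The algebraic consistency of the verification equation for honestly generated signatures can be sanity-checked without a real pairing library. The sketch below is not from the paper: it replaces G_1 by Z_q with generator P = 1 and models e(X, Y) by the product X·Y mod q (a "pairing in the exponent" toy model); all parameter values and the random-oracle-style hashes are illustrative assumptions, not an actual pairing group.

```python
# A minimal sketch (not from the paper): toy model of the scheme with G_1 = Z_q,
# P = 1 and e(X, Y) modeled as X*Y mod q, used only to check that the aggregate
# verification equation balances for honest signatures.
import random

q = 1000003          # a prime standing in for the (assumed) group order
rand = lambda: random.randrange(1, q)

lam = rand()          # master key; P_T = lam * P with P = 1
n = 4                 # number of signers (illustrative)
T, V, W = rand(), rand(), rand()          # T = H2(Delta), V = H3(Delta), W = H4(Delta)

Q0, Q1 = [rand() for _ in range(n)], [rand() for _ in range(n)]   # H1(ID_i, 0/1)
D0, D1 = [lam * a % q for a in Q0], [lam * a % q for a in Q1]     # partial private keys
x = [rand() for _ in range(n)]                                    # secret values
P_pub = x[:]                                                      # P_i = x_i * P
h = [rand() for _ in range(n)]                                    # H5(M_i||Delta||ID_i||P_i)
r = [rand() for _ in range(n)]
R_i = r[:]                                                        # R_i = r_i * P

S_i = [(D0[i] + x[i] * V + h[i] * (D1[i] + x[i] * W) + r[i] * T) % q for i in range(n)]
R, S = sum(R_i) % q, sum(S_i) % q                                 # aggregation

# e(S, P) versus the product of the four right-hand pairings, written in the exponent.
lhs = S * 1 % q
rhs = (lam * (sum(Q0) + sum(h[i] * Q1[i] for i in range(n)))
       + T * R
       + W * sum(h[i] * P_pub[i] for i in range(n))
       + V * sum(P_pub)) % q
print(lhs == rhs)   # expected: True
```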
2.2 Security Analysis
Two types of adversaries, who may access services in addition to those provided to the attacker against regular signatures, are considered in CL-PKC: the Type I adversary and the Type II adversary. A Type I adversary is not allowed to access the master-key, but he can replace the public key of any user with a value of his choice. A Type II adversary can access the master-key but cannot replace the public key of any user. In a secure CLAS scheme, it is infeasible for both Type I and Type II adversaries to forge a valid signature. Under the standard computational Diffie-Hellman assumption, the proposed CLAS scheme is provably secure against both types of adversaries in the random oracle model. The formal security proof is in the full version of this paper.
3
Conclusion
We presented an efficient certificateless aggregate signature scheme. To verify an aggregate signature signed by n users on n messages under the same string, a verifier only needs to compute four pairing operations. The proposal is provably secure in the random oracle model assuming that the computational Diffie-Hellman problem is hard. Our CLAS scheme can be applied to authentication in bandwidth-limited scenarios such as many-to-one communications.
Acknowledgments and Disclaimer
This paper is partly supported by the Spanish Government through projects CONSOLIDER INGENIO 2010 CSD2007-00004 ARES, TSI2007-65406-C03-01 E-AEGIS and by the National Natural Science Foundation of China (No. 60673070). The views of the author with the UNESCO Chair in Data Privacy do not necessarily reflect the position of UNESCO nor commit that organization.
References
1. Al-Riyami, S.S., Paterson, K.G.: Certificateless Public Key Cryptography. In: Laih, C.-S. (ed.) ASIACRYPT 2003. LNCS, vol. 2894, pp. 452–473. Springer, Heidelberg (2003)
2. Boneh, D., Gentry, C., Shacham, H., Lynn, B.: Aggregate and Verifiably Encrypted Signatures from Bilinear Maps. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 416–432. Springer, Heidelberg (2003)
3. Boneh, D., Franklin, M.: Identity-based Encryption from the Weil Pairing. SIAM J. Comput. 32, 586–615 (2003); a preliminary version appeared in: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, pp. 213–229. Springer, Heidelberg (2001)
4. Gentry, C., Ramzan, Z.: Identity-Based Aggregate Signatures. In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T.G. (eds.) PKC 2006. LNCS, vol. 3958, pp. 257–273. Springer, Heidelberg (2006)
5. Gong, Z., Long, Y., Hong, X., Chen, K.: Two Certificateless Aggregate Signatures from Bilinear Maps. In: Proc. of the Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, pp. 188–193 (2007)
6. Hu, B.C., Wong, D.S., Zhang, Z., Deng, X.: Key Replacement Attack Against a Generic Construction of Certificateless Signature. In: Batten, L.M., Safavi-Naini, R. (eds.) ACISP 2006. LNCS, vol. 4058, pp. 235–246. Springer, Heidelberg (2006)
7. Huang, X., Susilo, W., Mu, Y., Zhang, F.: On the Security of Certificateless Signature Schemes from Asiacrypt 2003. In: Desmedt, Y.G., Wang, H., Mu, Y., Li, Y. (eds.) CANS 2005. LNCS, vol. 3810, pp. 13–25. Springer, Heidelberg (2005)
8. Huang, X., Mu, Y., Susilo, W., Wong, D.S., Wu, W.: Certificateless Signature Revisited. In: Pieprzyk, J., Ghodosi, H., Dawson, E. (eds.) ACISP 2007. LNCS, vol. 4586, pp. 308–322. Springer, Heidelberg (2007)
9. Zhang, Z., Wong, D.: Certificateless Public-Key Signature: Security Model and Efficient Construction. In: Zhou, J., Yung, M., Bao, F. (eds.) ACNS 2006. LNCS, vol. 3989, pp. 293–308. Springer, Heidelberg (2006)
10. Zhang, L., Zhang, F.: A New Certificateless Aggregate Signature Scheme. Computer Communications (2009), doi:10.1016/j.comcom.2008.12.042
Bounds on the Number of Users for Random 2-Secure Codes
Manabu Hagiwara¹,², Takahiro Yoshida¹,², and Hideki Imai¹,³
¹ Research Center for Information Security, National Institute of Advanced Industrial Science and Technology, 1-18-13 Sotokanda, Chiyoda-ku, Tokyo 101-0021, Japan
[email protected]
² Center for Research and Development Initiative, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo, Japan
[email protected]
³ Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
[email protected]
1
Introduction
The illegal re-distribution problem is the problem of a regular user who, having received content, re-distributes it without legal permission. This has become a social problem all over the world. As indicated in [2], the problem has been known for a few hundred years or more. Boneh and Shaw formalized it as collusion-secure fingerprinting (c-secure codes) for digital data [2]. After their work, the study of c-secure codes has become a popular research stream in information security [1]. One remarkable result is Tardos's code [5], which constructs c-secure codes of approximately optimal length by a random coding method. On the other hand, it is an interesting research direction to restrict the number of colluders to small values [3]. The small-colluder case provides much knowledge, many techniques, and many examples for c-secure code theory. In [4], it is shown that coin-flipping codes are useful as 2-secure codes, that is, c-secure codes restricted to at most two colluders. In this extended abstract, we investigate coin-flipping codes as 2-secure codes. We give three formulas representing the success probability of a simple tracing algorithm for coin-flipping codes. They show that even if the code length is short, e.g. 128 bits, the algorithm traces one of the colluders with high probability, e.g. 0.9999. A non-trivial achievable rate for our coin-flipping code is also given.
2
Notations and Definitions
Let n be a positive integer. First, the administrator divides the content into n or more parts and chooses n parts from the divided content. A user i can request to obtain a copy of the content one time. The administrator registers the person i. This is formalized as follows. The set of registered
users U is defined as a set of integers. For ease, the set U starts with 1 and is consecutive, i.e. U = {1, 2, 3, . . .}. The administrator flips coins n times. We assume that the probability of the face side is 0.5 and that of the other side is 0.5 as well. If the jth coin shows the face side, we put c_j^{(i)} = 0; otherwise, we put c_j^{(i)} = 1. Therefore the administrator obtains
the n-bit sequence c^{(i)} = (c_1^{(i)}, c_2^{(i)}, . . . , c_n^{(i)}). The administrator stores c^{(i)} in a database which only the administrator accesses. We call the set of codewords constructed by coin-flipping a coin-flipping code. The bit c_j^{(i)} is encrypted and embedded into the jth part of the requested content. Assume that no user can decrypt it or distinguish which bit is embedded. The user i receives the requested content in which the secret n-bit sequence c^{(i)} has been embedded by the administrator.

2.1
We introduce the notion, called marking-assumption. For details, see the reference [2]. It is assumed that any user cannot delete the embedded codeword or a part of the codeword from a distributed content. The only attack users can is to shuffle the distributed content. We assumed that user cannot distinguish which bit is embedded to the distributed contents. However, it is possible for (a) (b) users a and b to find a part associated to jth part if the different bits cj = cj are embedded by comparing their content. 2.2
Tracing Algorithm for Coin-Flipping Codes
If illegally re-distributed content is found somewhere, the administrator decrypts a bit sequence y from the content. The administrator tries to trace at least one of the colluders by analyzing y. A tracing algorithm T is an algorithm which takes a bit sequence as input and outputs a subset of the users U. The details of our algorithm are the following. For the input y and a user i ∈ U, calculate the Hamming distance d(c^{(i)}, y). Denote this Hamming distance by s_i and call it the score of i for y. Since Hamming distances are non-negative integers, there is a minimum value s in {d(c^{(i)}, y) | i ∈ U}. Define the set U_s := {i ∈ U | d(c^{(i)}, y) = s} and output U_s. We denote the output U_s by T(y). Remark that the output U_s is not the empty set. We call the algorithm T the Hamming distance tracing algorithm.
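The tracing algorithm is easy to simulate; the sketch below is not from the paper, and the two colluders in it follow one simple attack (choosing differing positions uniformly at random under the marking assumption), which is not necessarily their best strategy, so the estimate only illustrates the mechanism.

```python
# A minimal sketch (not from the paper): coin-flipping code with the Hamming distance
# tracing algorithm, estimating its success rate against a simple collusion strategy.
import random

def simulate(n=32, num_users=8, trials=2000):
    hits = 0
    for _ in range(trials):
        codewords = [[random.randint(0, 1) for _ in range(n)] for _ in range(num_users)]
        a, b = random.sample(range(num_users), 2)          # the two colluders
        # Marking assumption: agreeing positions are kept, differing ones chosen freely
        # (here uniformly at random, one simple strategy).
        y = [codewords[a][j] if codewords[a][j] == codewords[b][j]
             else random.randint(0, 1) for j in range(n)]
        scores = [sum(ci != yi for ci, yi in zip(c, y)) for c in codewords]
        s = min(scores)
        accused = {i for i, si in enumerate(scores) if si == s}   # T(y) = U_s
        hits += accused <= {a, b}                                 # success: only colluders accused
    return hits / trials

print(simulate())   # rough empirical success rate for n = 32, U = 8 (illustrative values)
```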
3
Main Results
Let us introduce a numerical security indicator for coin-flipping codes as 2-secure codes. We define the success probability S_{n,U} by
S_{n,U} = Prob[T(y) ⊂ {colluders}, y : illegally re-distributed code]
        = Prob[c^{(1)}, c^{(2)}, . . . ∈_R {0,1}^n, a, b ∈_R U, y = S_{a,b}(c^{(a)}, c^{(b)}), T(y) ⊂ {a, b}],
where S_{a,b}(c^{(a)}, c^{(b)}) is the best strategy for a and b with inputs (c^{(a)}, c^{(b)}) and Ex(·) is the expected value function.
Theorem 1. Let n be the code length of a coin-flipping code and U the number of users. Then the success probability S_{n,U} can be written in the following two forms:
1) $S_{n,U} = \frac{1}{2^n} \sum_{0 \le s \le n,\; s\ \mathrm{odd}} \binom{n+1}{s}\, 2^{-(U-2)\,h(n,(s-1)/2)}$,
2) $S_{n,U} = \mathrm{Ex}_{x \in \{0,1\}^n}\!\left[\left(\tfrac{1}{2}\right)^{(U-2)\,h(n,\mathrm{wt}(x)/2)}\right]$,
where $h(n,a) := n - \log_2\!\left(2^n - \sum_{0 \le t \le a} \binom{n}{t}\right)$ and wt(x) is the Hamming weight of x.

We call S_{n,U} a security parameter. How large S_{n,U} should be depends on the application; it is enough that the users give up re-distributing illegally. It is easy to calculate h(n, a) if n is not too large, e.g. n = 128, and 0 ≤ a < n/2, by using mathematical software, e.g. MuPAD, MATLAB, MAPLE, Mathematica and so on. Once a list of values h(n, a) is obtained, it is easy to calculate S_{n,U} by form (1) of Theorem 1. The following is a table of upper bounds on the maximal number of users for satisfying S_{n,U} > 0.99, 0.999, 0.9999.

Table 1. Upper bounds on the number of users satisfying a given security parameter and code length

           S_{n,U} > 0.99   S_{n,U} > 0.999   S_{n,U} > 0.9999
n = 32            3                2                  2
n = 64           57                7                  2
n = 128       68031             5363                516
From now on, let us introduce the following game G: (1) Toss n coins and count the number of face sides. (2) Repeat step (1) U − 2 times and denote the maximal number of face sides by s_1. (3) Toss n coins again and denote the number of face sides by s_2. (4) You win if 2s_1 + s_2 < 2n; you lose otherwise. Denote the winning probability of the above game by W_{n,U}.

Theorem 2. For U ≥ 3 and n ≥ 1, W_{n,U} = S_{n,U}.

It is interesting to know a theoretical capacity on the maximal number of users as the length tends to infinity. Recently, the capacity of collusion-secure codes has been investigated by many researchers [1]. It is not trivial that such a capacity exists for our code. We investigate an achievable rate R for the coin-flipping code with the Hamming distance tracing algorithm, where an achievable rate R is a constant such that lim_{n→∞} S_{n,U} = 0 for R_0 > R, where U = 2^{R_0 n}.

Theorem 3. For any R < 1 − h(1/4), lim_{n→∞} S_{n,U} = 1, where h(·) is the binary entropy function, i.e. h(x) = −x log_2 x − (1 − x) log_2(1 − x). In particular, the capacity is more than 1 − h(1/4), if it exists.
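The game G is also straightforward to simulate; the sketch below is not from the paper, and its parameter values are illustrative. By Theorem 2 the winning probability it estimates equals S_{n,U} (defined with the colluders' best strategy), so it lower-bounds the success rate observed against the simple random collusion strategy simulated earlier.

```python
# A minimal sketch (not from the paper): Monte Carlo estimate of W_{n,U} for the game G.
import random

def game_G(n=32, num_users=8, trials=2000):
    wins = 0
    for _ in range(trials):
        s1 = max(sum(random.randint(0, 1) for _ in range(n)) for _ in range(num_users - 2))
        s2 = sum(random.randint(0, 1) for _ in range(n))
        wins += (2 * s1 + s2 < 2 * n)   # winning condition of step (4)
    return wins / trials

print(game_G())   # rough estimate of W_{n,U} for n = 32, U = 8
```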
4
Applications: Digital Cinema Distribution and Hard Paper Distribution
A recent illegal re-distribution problem is the purchase of unofficial copied DVDs. It is said that someone records a film with his own video camera in a cinema theater and then makes illegally copied DVDs to sell. In this scenario, the administrator is a film distribution company and the users are cinema theaters; the administrator divides a film into n or more sets of frames. If a cinema theater requests the film, the administrator embeds an n-bit codeword into the associated n sets of frames. If illegally re-distributed content is found, the administrator traces a cinema theater where the film was recorded. The administrator then asks that cinema theater to pay more attention to customers' illegal recording. Cooperation between the administrator and the cinema theaters makes illegal recording harder for pirate customers. At an early stage, it may be possible for a pirate customer to find cinema theaters which do not pay much attention to such illegal recording. However, by repeated efforts of the administrator and the theaters, we expect to reduce such crime. There are not many film screens; for example, there were 3221 screens in Japan in 2007 [6]. In Japan, the success probability of coin-flipping codes is more than 0.999 if the code length is 128.

The next application is to protect hard-copy papers from illegal re-distribution. Here the administrator is the president of a company or the chairman of the directors' meeting. In this case, the number of users is at most a few dozen. Furthermore, the number of colluders is at most a few, for social reasons. The administrator embeds each bit of a fingerprinting codeword into each page of the material. If there is a large mass of material, it is hard to delete the embedding from the point of view of cost.
References
1. Amiri, E., Tardos, G.: High rate fingerprinting codes and the fingerprinting capacity. In: Proc. of the Twentieth Annual ACM-SIAM Symp. on Discrete Algorithms
2. Boneh, D., Shaw, J.: Collusion-secure fingerprinting for digital data. IEEE Transactions on Information Theory 44, 480–491 (1998)
3. Schaathun, H.G.: Fighting three pirates with scattering codes. In: Proc. Inter. Symp. on Information Theory (ISIT 2004), 27 June–2 July, p. 202 (2004)
4. Hagiwara, M., et al.: A short random fingerprinting code against a small number of pirates. In: Fossorier, M.P.C., Imai, H., Lin, S., Poli, A. (eds.) AAECC 2006. LNCS, vol. 3857, pp. 193–202. Springer, Heidelberg (2006)
5. Tardos, G.: Optimal Probabilistic Fingerprint Codes. In: Proceedings of the 35th Annual ACM Symp. on Theory of Computing, pp. 116–125 (2003); J. of the ACM (to appear)
6. Website of Motion Picture Producers Association of Japan, Inc., http://www.eiren.org/history_e/index.html
Author Index
Álvarez, Rafael; Álvarez, Víctor; Armario, José Andrés; Basu, Riddhipratim; Beelen, Peter; Bernal, José Joaquín; Bierbrauer, Jürgen; Borges-Quintana, M.; Borges-Trenard, M.A.; Bras-Amorós, Maria; Brevik, John; Climent, Joan-Josep; Cui, Yang;
Duursma, Iwan; El-Mahassni, Edwin D.; Fan, Shuqin; Frau, María Dolores; Giorgetti, Marta; Gomez, Domingo; Gudiel, Félix; Hagiwara, Manabu; Han, Wenbao; Herranz, Victoria; Høholdt, Tom; Ibeas, Álvar; Iglesias Curto, José Ignacio; Imai, Hideki;
Janwal, Heeralal; Kirov, Radoslav; Kobara, Kazukuni; Landjev, Ivan; López-Permouth, Sergio R.; Maitra, Subhamoy; Martínez-Moro, E.; Monarres, David M.; Morozov, Kirill; Munuera, C.; Nagata, Kiyoshi; Nemenzo, Fidel;
O'Sullivan, Michael E.; Osuna, Amparo; Özadam, Hakan; Özbudak, Ferruh; Paul, Goutam; Perea, Carmen; Pernas, Jaume; Pujol, Jaume; Qin, Bo; Rifà, J.; Río, Ángel del; Ronquillo, L.; Ruano, Diego;
Shin, SeongHan; Simón, Juan Jacobo; Szabo, Steve; Talukdar, Tanmoy; Tomás, Virtudes; Torres, F.; Tortosa, Leandro; Umlauf, Anya; Vicent, José; Villanueva, J.; Villanueva, Mercè; Wada, Hideo; Winterhof, Arne; Wolski, Rich; Wu, Qianhong;
Yang, Yang; Yoshida, Takahiro; Zamora, Antonio; Zeng, Guang; Zhang, Futai; Zhang, Lei