Inverse Problems in Vibration
SOLID MECHANICS AND ITS APPLICATIONS Volume 119 Series Editor:
G.M.L. GLADWELL, Department of Civil Engineering, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1
Aims and Scope of the Series

The fundamental questions arising in mechanics are: Why?, How?, and How much? The aim of this series is to provide lucid accounts written by authoritative researchers giving vision and insight in answering these questions on the subject of mechanics as it relates to solids.

The scope of the series covers the entire spectrum of solid mechanics. Thus it includes the foundation of mechanics; variational formulations; computational mechanics; statics, kinematics and dynamics of rigid and elastic bodies; vibrations of solids and structures; dynamical systems and chaos; the theories of elasticity, plasticity and viscoelasticity; composite materials; rods, beams, shells and membranes; structural control and stability; soils, rocks and geomechanics; fracture; tribology; experimental mechanics; biomechanics and machine design.

The median level of presentation is the first year graduate student. Some texts are monographs defining the current state of the field; others are accessible to final year undergraduates; but essentially the emphasis is on readability and clarity.
For a list of related mechanics titles, see final pages.
Inverse Problems in Vibration Second Edition by
Graham M.L. Gladwell University of Waterloo, Department of Civil Engineering, Waterloo, Ontario, Canada
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 1-4020-2721-4
Print ISBN: 1-4020-2670-6
©2005 Springer Science + Business Media, Inc.
Print ©2004 Kluwer Academic Publishers, Dordrecht

All rights reserved. No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher.

Created in the United States of America

Visit Springer's eBookstore at: http://ebooks.springerlink.com
and the Springer Global Website Online at: http://www.springeronline.com
All appearance indicates neither a total exclusion nor a manifest presence of divinity, but the presence of a God who hides himself. Everything bears this character. Pascal’s Pensées, 555.
Contents

1 Matrix Analysis
1.1 Introduction
1.2 Basic definitions and notation
1.3 Matrix inversion and determinants
1.4 Eigenvalues and eigenvectors

2 Vibrations of Discrete Systems
2.1 Introduction
2.2 Vibration of some simple systems
2.3 Transverse vibration of a beam
2.4 Generalised coordinates and Lagrange’s equations: the rod
2.5 Vibration of a membrane and an acoustic cavity
2.6 Natural frequencies and normal modes
2.7 Principal coordinates and receptances
2.8 Rayleigh’s Principle
2.9 Vibration under constraint
2.10 Iterative and independent definitions of eigenvalues

3 Jacobi Matrices
3.1 Sturm sequences
3.2 Orthogonal polynomials
3.3 Eigenvectors of Jacobi matrices
3.4 Generalised eigenvalue problems

4 Inverse Problems for Jacobi Systems
4.1 Introduction
4.2 An inverse problem for a Jacobi matrix
4.3 Variants of the inverse problem for a Jacobi matrix
4.4 Reconstructing a spring-mass system; by end constraint
4.5 Reconstruction by using modification
4.6 Persymmetric systems
4.7 Inverse generalised eigenvalue problems
4.8 Interior point reconstruction

5 Inverse Problems for Some More General Systems
5.1 Introduction: graph theory
5.2 Matrix transformations
5.3 The star and the path
5.4 Periodic Jacobi matrices
5.5 The block Lanczos algorithm
5.6 Inverse problems for pentadiagonal matrices
5.7 Inverse eigenvalue problems for a tree

6 Positivity
6.1 Introduction
6.2 Minors
6.3 A general representation of a symmetric matrix
6.4 Quadratic forms
6.5 Perron’s theorem
6.6 Totally non-negative matrices
6.7 Oscillatory matrices
6.8 Totally positive matrices
6.9 Oscillatory systems of vectors
6.10 Eigenproperties of TN matrices
6.11 u-line analysis

7 Isospectral Systems
7.1 Introduction
7.2 Isospectral flow
7.3 Isospectral Jacobi systems
7.4 Isospectral oscillatory systems
7.5 Isospectral beams
7.6 Isospectral finite-element models
7.7 Isospectral flow, continued

8 The Discrete Vibrating Beam
8.1 Introduction
8.2 The eigenanalysis of the cantilever beam
8.3 The forced response of the beam
8.4 The spectra of the beam
8.5 Conditions on the data for inversion
8.6 Inversion by using orthogonality
8.7 A numerical procedure for the inverse problem

9 Discrete Modes and Nodes
9.1 Introduction
9.2 The inverse mode problem for a Jacobi matrix
9.3 The inverse problem for a single mode of a spring-mass system
9.4 The reconstruction of a spring-mass system from two modes
9.5 The inverse mode problem for the vibrating beam
9.6 Courant’s nodal line theorem
9.7 Some properties of FEM eigenvectors
9.8 Strong sign graphs
9.9 Weak sign graphs
9.10 Generalisation to M, K problems

10 Green’s Functions and Integral Equations
10.1 Introduction
10.2 Green’s functions
10.3 Some functional analysis
10.4 The Green’s function integral equation
10.5 Oscillatory properties of Green’s functions
10.6 Oscillatory systems of functions
10.7 Perron’s Theorem and compound kernels
10.8 The interlacing of eigenvalues
10.9 Asymptotic behaviour of eigenvalues and eigenfunctions
10.10 Impulse responses

11 Inversion of Continuous Second-Order Systems
11.1 A historical review
11.2 Transformation operators
11.3 The hyperbolic equation for K(x, y)
11.4 Uniqueness of solution of an inverse problem
11.5 The Gel’fand-Levitan integral equation
11.6 Reconstruction of the Sturm-Liouville system
11.7 An inverse problem for the vibrating rod
11.8 An inverse problem for the taut string
11.9 Some non-classical methods
11.10 Some other uniqueness theorems
11.11 Reconstruction from the impulse response

12 A Miscellany of Inverse Problems
12.1 Constructing a piecewise uniform rod from two spectra
12.2 Isospectral rods and the Darboux transformation
12.3 The double Darboux transformation
12.4 Gottlieb’s research
12.5 Explicit formulae for potentials
12.6 The research of Y.M. Ram et al.

13 The Euler-Bernoulli Beam
13.1 Introduction
13.2 Oscillatory properties of the Green’s function
13.3 Nodes and zeros for the cantilever beam
13.4 The fundamental conditions on the data
13.5 The spectra of the beam
13.6 Statement of the inverse problem
13.7 The reconstruction procedure
13.8 The total positivity of matrix P is sufficient

14 Continuous Modes and Nodes
14.1 Introduction
14.2 Sturm’s Theorems
14.3 Applications of Sturm’s Theorems
14.4 The research of Hald and McLaughlin

15 Damage Identification
15.1 Introduction
15.2 Damage identification in rods
15.3 Damage identification in beams

Index

Bibliography
Preface

The last thing one settles in writing a book is what one should put in first.
Pascal’s Pensées, 19

In 1902 Jacques Hadamard introduced the term well-posed problem. His definition, an abstraction from the known properties of the classical problems of mathematical physics, had three elements:

Existence: the problem has a solution.
Uniqueness: the problem has only one solution.
Continuity: the solution is a continuous function of the data.

Much of the research into theoretical physics and engineering before and after 1902 has concentrated on formulating problems, with properly chosen initial and/or boundary conditions, so that their solutions do have these characteristics: the problems are well posed. Over the years it began to be recognized that there were important and apparently sensible questions that could be asked that did not fall into the category of well-posed problems. They were eventually called ill-posed problems. Many of these problems looked like a classical problem except that the roles of known and unknown quantities had been reversed: the data, the known, were related to the outcome, the solution of a classical problem; while the unknowns were related to the data for the classical problem: they were thus called inverse problems, in contrast to the direct classical problems. (Later reflection suggested that the choice of which to be called direct and which to be called inverse was partly a historical accident.) For completeness, one should add that not all such inverse problems are ill-posed, and not all ill-posed problems are inverse problems!
This book is about inverse problems in vibration, and many of these problems are ill-posed because they fail to satisfy one or more of Hadamard’s criteria: they may not have a solution at all, unless the data are properly chosen; they may have many solutions; the solution may not be a continuous function of the data. In particular, as the data are varied by small amounts, they can leave the feasible region, in which there are one or more solutions, and enter the region where there is no solution.
Classical vibration theory is concerned, in large part, with the infinitesimal undamped free vibration of various discrete or continuous bodies. This book is concerned only with such classical vibration theory. One of the basic problems in this theory is the determination of the natural frequencies (eigenfrequencies or simply eigenvalues) and normal modes of the vibrating body. A body that is modelled as a discrete system of rigid masses, rigid rods, massless springs, or as a finite element model (FEM) will be governed by an ordinary matrix differential equation in time t with constant coefficients. It will have a finite number of eigenvalues, and the normal modes will appear as vectors, called eigenvectors. A body that is modelled as a continuous system will be governed by a set of partial differential equations in time and one or more spatial variables. It will have an infinity of eigenvalues, and the normal modes will be functions, eigenfunctions, of the space variables.

In the context of classical theory, inverse problems are concerned with the construction of a model of a given type, i.e., a mass-spring system, a string, etc., that has given eigenvalues and/or eigenvectors or eigenfunctions, i.e., given spectral data. In general, if some such spectral data are given, there can be no system, a unique system, or many systems, having these properties. In the original, 1986, edition of this book, we were concerned exclusively with a stricter class of inverse problems, the so-called reconstruction problems. Here the data are such that there is only one vibrating system of the specified type which has the given spectral properties. In this new edition we have widened the scope of our study to include inverse problems that do not fall under this strict classification.

Before describing what the book is, we first say what it is not: it is not a book about computation.
In Engineering, the almost universal approach to inverse problems is through least squares: find a system which minimizes the distance between the predicted and desired behaviours. While the early studies were examples of brute force, there is now an established and rigorous discipline governing such approaches, based on the work of Tikhonov, Morozov etc. See for example Kirsch (1996). We do not refer to any of this work in this book. Rather, we are concerned with basic analysis, qualitative properties, whether a problem has one or more solutions, etc. There are occasions when one method that we describe, that should theoretically lead to the construction of a solution, is found in practice to be ill-conditioned, and this has led to another, better behaved, procedure; in such a case we have presented both methods and discussed why one fails while the other succeeds; see for example Section 4.3.

Because we are concerned with fundamental analysis, the range of physical systems that we can consider is relatively narrow; essentially it is confined to the basic elements of structures, rods, beams and membranes, and excludes structures composed of combinations of these elements. This restriction in scope is understandable; indeed, until the introduction of the finite element method and high-speed large-memory computing, the only direct vibration problems that could be solved were those involving those same structural elements in isolation. The study of inverse problems is at an earlier stage of evolution than that of direct problems.
The book falls into two parts: Chapters 1-9 are concerned with discrete systems, Chapters 10-14 with continuous systems. Matrix analysis is the language of discrete systems, and it is developed, as needed, in Chapters 1 and 3. Thus, Chapter 1 provides the basic definitions and introduces quadratic forms, minimax theorems, eigenvalues, etc. Chapter 2 provides the basic physics of the vibrating systems that are analysed. Chapter 3 lays out the classical analysis of Jacobi matrices, the matrices that appear in the simplest kinds of vibrating systems, in-line sequences of masses connected by springs. Chapter 4 concerns inverse problems for Jacobi matrices. Chapter 5 provides an introduction to more general discrete systems, and the language of graph theory that is needed to analyse them.

Inverse problems in vibration are concerned with constructing a vibrating system of a particular type, e.g., a string, a beam, a membrane, that has specified (behavioural) properties. The system so constructed must be realistic: its defining parameters, masses, lengths, stiffnesses, etc., must be positive. Signs, positive and negative, lie at the heart of any deep discussion of inverse problems. Chapter 6, on Positivity, introduces the mathematics relating to different kinds of matrices: positive, totally positive, oscillatory, etc. This mathematics, due to Fekete, Perron, Gantmacher, Krein and others, was first applied to vibrating systems by Gantmacher and Krein in their classic Oscillation Matrices and Kernels and Small Oscillations of Mechanical Systems (1950), which has just recently (2002) been reprinted by the American Mathematical Society.

Sometimes the data that are supplied are insufficient to identify a unique vibrating system; there is then a family of systems having the specified properties - an isospectral family. Chapter 7 describes how one can form such isospectral families, and be sure that each member of the family has the necessary positivity properties.
There are essentially two ways of forming families: algebraic, and differential. The former uses a carefully chosen rotation to go from one member to another. The latter uses the idea of isospectral flow; a matrix can flow, under so-called Toda flow, along a path so that it retains the same eigenvalues and at the same time retains a particular structure and particular positivity properties. Chapter 8 is concerned with one particular type of vibrating system: a beam vibrating in flexure. This problem had been a severe stumbling block in the early history of inverse problems. Chapter 9 completes the first part of the book with a study of modes, i.e., normal modes, and nodes. This analysis depends heavily on the positivity study of Chapter 6.

The second part of the book, Chapters 10-14, is concerned with continuous systems. The problems appear in two related forms, differential equations and integral equations. The integral equations, which use the Green’s function for the system, are the easier to analyse, for it is the Green’s function, Gantmacher and Krein’s kernel, that has the all-important positivity properties. Moreover, the Green’s function operator appearing in the integral equation is a concrete example of a positive compact self-adjoint operator in a Hilbert space, so that we may immediately make use of the well-developed theory of such operators, as described in Chapter 10.

Chapter 11 uses this theory, and the fundamental Gel’fand-Levitan transformation operator, to provide solutions to some inverse problems for the Sturm-Liouville equation. This equation, which appears in three related forms, is the governing equation for the vibrating string and rod. The Chapter describes the classical approach, as well as some recent techniques that are more readily adaptable to computation. Chapter 12 discusses families of isospectral continuous systems. Chapter 13 applies the Gel’fand-Levitan transformation to the inverse problem for the continuous Euler-Bernoulli beam. Chapter 14 is a short (too short) study of inverse nodal problems. While it is difficult in practice to measure a vibration mode, it is comparatively easy to locate the nodes of a particular mode. There is now a considerable body of research, due primarily to McLaughlin and Hald, that focuses on what nodal data is sufficient to identify, say, the mass distribution on a vibrating string, rod, or membrane, and how one can construct such a vibrating system from a knowledge of some nodes of some modes. Section 14.4 briefly reports on this research. The book concludes with another short chapter on damage identification.

The history of mathematics and the physical sciences leads to an important far-reaching conclusion: the study of one topic can throw light on many other topics, even on some which at first seem to have no connection with the original topic. The study of inverse problems in vibration provides a clear example of this connectedness. On the one hand, there are topics in inverse problems that are illumined by knowledge in other fields, notably linear algebra and operator theory; on the other hand the study of inverse vibration problems throws light on the classical direct problems by highlighting the fundamental qualitative properties of solutions.
A remark on the quotations from Pascal’s Pensées is in order. I used the translation by W.F. Trotter that appeared in Everyman’s Library, published by J.M. Dent & Sons in 1956. My copy is dated 26th April 1957 and contains an 8d (old pence) ticket for the London Transport bus No. 73 from Euston Road to Stoke Newington, reminding me that the Pensées were my daily bus reading to and from my ‘digs’ when I was Assistant Lecturer in Mathematics at University College London. I chose the Pensées for the chapter captions because it is clear from his writings that Pascal considered the search for God to be an inverse problem. His comments on the place of reason, heart and will in seeking a solution of the problem, though sometimes enigmatic, are as deep and relevant in 2004 as they were in 1654. I hope that these excerpts from the Pensées will whet readers’ appetites for Pascal’s writings.

The caption for Chapter 11 reminds me that many people have contributed to this book. Some were acknowledged in the Preface to the first edition. This new edition contains material taken from papers written with graduate students Brad Willms, Mohamed Movahheddy, Hongmei Zhu and with colleagues Brian Davies, Josef Leydold, Peter Stadler and Antonino Morrassi. In addition to these, I have freely taken from papers by numerous colleagues worldwide, as referenced in the bibliography. Parts of the book were read at the proof stage by Antonino Morrassi, Maeve McCarthy, Oscar Rojo and Michele Dilena. I thank them for pointing out many errors and shortcomings, some of which I have managed to correct. The book was typed by Tracy Taves. Thank you for your stamina and your attention to detail. Colin Campbell helped us out with his understanding of the idiosyncrasies of LaTeX. Finally, I acknowledge the patience and understanding of my wife, Joyce, who saw me immersed in books in my study for years on end.

George Carrier once remarked that the aim of mathematics is insight, not numbers. It is the author’s wish that this book will provide insight into the many interconnected topics in mathematics, physics and engineering that appear in the study of inverse problems in vibration.

G.M.L. Gladwell
Waterloo, Ontario
March, 2004
Chapter 1

Matrix Analysis

It is a bad sign when, on seeing a person, you remember his book.¹
Pascal’s Pensées
1.1 Introduction
The book relies heavily on matrix analysis. In this Chapter we shall present the basic definitions and properties of matrices, and provide proofs of some important theorems that will be used later. Since matrix analysis now has an established position in Engineering and Science, it will be assumed that the reader has had some exposure to it; the presentation in the early stages will therefore be brief. The reader may supplement the treatment here with standard texts.
1.2 Basic definitions and notation
We use the word iff to mean ‘if and only if’. A matrix is a rectangular array of real or complex numbers together with a set of rules that specify how the numbers are to be manipulated. A matrix $\mathbf{A}$ is said to have order $m \times n$ if it has $m$ rows and $n$ columns. The set of all real matrices, i.e., matrices with real entries, of order $m \times n$, is sometimes denoted by $\mathbb{R}^{m \times n}$. Following Horn and Johnson (1985) [183], we use the simpler notation $M_{m,n}$, and say $\mathbf{A} \in M_{m,n}$. We write

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$

¹ Blaise Pascal (1623-1662) lived among the French intelligentsia, and in that context it was a bad sign; one should be known for more than just a book one had written. When the first edition of this book was being translated into Chinese, the translator objected, for in 20th century China, it would be a good sign. If you met someone you knew who had written a book, you would mention it immediately!
The entry in row $i$ and column $j$ is $a_{ij}$, and $\mathbf{A}$ is often written simply as $\mathbf{A} = (a_{ij})$. Two matrices $\mathbf{A}, \mathbf{B}$ are said to be equal if they have the same order $m \times n$, and if

$$a_{ij} = b_{ij}, \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n);$$

then we write $\mathbf{A} = \mathbf{B}$. The transpose of the matrix $\mathbf{A}$ is the $n \times m$ matrix $\mathbf{A}^T$, whose rows are the columns of $\mathbf{A}$. We note that the transpose of $\mathbf{A}^T$ is $\mathbf{A}$; we say that $\mathbf{A}$ and $\mathbf{A}^T$ are transposes (of each other), and write this $(\mathbf{A}^T)^T = \mathbf{A}$. For example

$$\mathbf{A} = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 7 \end{bmatrix}, \qquad \mathbf{A}^T = \begin{bmatrix} 1 & 2 \\ 2 & 6 \\ 4 & 7 \end{bmatrix}$$

are transposes. If $m = n$ then the $m \times n$ matrix $\mathbf{A}$ is said to be a square matrix of order $n$: $\mathbf{A} \in M_{n,n}$; we abbreviate $M_{n,n}$ to $M_n$; thus $\mathbf{A} \in M_n$. A square matrix that is equal to its transpose is said to be symmetric; in this case

$$\mathbf{A} = \mathbf{A}^T,$$

or alternatively

$$a_{ij} = a_{ji}, \quad (i, j = 1, 2, \ldots, n).$$

The set of real symmetric matrices of order $n$ is denoted by $S_n$. The matrix

$$\mathbf{A} = \begin{bmatrix} 1 & 2 & 9 \\ 2 & 4 & 6 \\ 9 & 6 & 3 \end{bmatrix}$$

is symmetric. The square matrix $\mathbf{A}$ is said to be diagonal if it has non-zero entries only on the principal diagonal running from top left to bottom right. We write

$$\mathbf{A} = \begin{bmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ 0 & a_{22} & 0 & \cdots & 0 \\ 0 & 0 & a_{33} & \cdots & 0 \\ 0 & 0 & 0 & \cdots & a_{nn} \end{bmatrix} = \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{nn}).$$
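The definitions of transpose, symmetric matrix and diagonal matrix can be tried out with a few lines of Python. This is only a sketch: a matrix is stored as a list of rows, and the helper names `transpose`, `is_symmetric` and `diag` are illustrative choices, not notation used in the book.

```python
# A matrix is stored as a list of rows; helper names are illustrative.

def transpose(a):
    """Return the transpose: row i of the result is column i of a."""
    return [list(col) for col in zip(*a)]


def is_symmetric(a):
    """True iff a is square and equal to its transpose."""
    return len(a) == len(a[0]) and a == transpose(a)


def diag(*entries):
    """Build the square matrix diag(d1, ..., dn)."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2, 4],
     [2, 6, 7]]          # order 2 x 3
S = [[1, 2, 9],
     [2, 4, 6],
     [9, 6, 3]]          # symmetric, order 3

AT = transpose(A)        # order 3 x 2
print(AT)
print(is_symmetric(S), is_symmetric(A))
print(diag(1, 1, 1))     # the unit matrix of order 3
```

Storing rows as lists keeps the sketch free of external libraries; for larger problems an array library would be the more usual choice.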
The unit matrix of order $n$ is $\mathbf{I} = \mathbf{I}_n = \mathrm{diag}(1, 1, \ldots, 1)$. The elements of this matrix are denoted by the Kronecker delta

$$\delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} \tag{1.2.1}$$
The zero matrix of order $m \times n$ is the matrix with all its $m \times n$ entries zero. A matrix with 1 column and $n$ rows is called a column vector of order $n$, and is written

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \{x_1, x_2, \ldots, x_n\}.$$

The set of all such real vectors constitutes a linear vector space that we denote by $V_n$. The transpose of a column vector is a row vector, written

$$\mathbf{x}^T = [x_1, x_2, \ldots, x_n].$$

Two matrices $\mathbf{A}, \mathbf{B}$ may be added or subtracted iff they have the same order $m \times n$. Their sum and difference are matrices $\mathbf{C}$ and $\mathbf{D}$ respectively of the same order $m \times n$, the elements of which are

$$c_{ij} = a_{ij} + b_{ij}, \qquad d_{ij} = a_{ij} - b_{ij}.$$

We write $\mathbf{C} = \mathbf{A} + \mathbf{B}$, $\mathbf{D} = \mathbf{A} - \mathbf{B}$. The product of a matrix $\mathbf{A}$ by a number (or scalar) $k$ is the matrix $k\mathbf{A}$ with elements $k a_{ij}$. Two matrices $\mathbf{A}$ and $\mathbf{B}$ can be multiplied in the sense $\mathbf{AB}$ only if the number of columns of $\mathbf{A}$ is equal to the number of rows of $\mathbf{B}$. Thus if $\mathbf{A}$ has order $m \times n$, $\mathbf{B}$ has order $n \times p$ then $\mathbf{AB} = \mathbf{C}$, where $\mathbf{C}$ has order $m \times p$. We write

$$\mathbf{A}(m \times n) \times \mathbf{B}(n \times p) = \mathbf{C}(m \times p). \tag{1.2.2}$$

The element in row $i$ and column $j$ of $\mathbf{C}$ is $c_{ij}$, and is equal to the sum of the elements of row $i$ of $\mathbf{A}$ multiplied by the corresponding elements of column $j$ of $\mathbf{B}$. Thus

$$c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj}, \tag{1.2.3}$$
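and for example

$$\begin{bmatrix} 2 & 3 & 1 \\ 1 & 6 & 7 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ -1 & 0 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 7 & 1 & 1 \\ -6 & 8 & 9 & 7 \end{bmatrix}.$$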
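The row-by-column rule (1.2.3) translates directly into code. The sketch below, with the assumed helper name `matmul`, forms the product of a $2 \times 3$ and a $3 \times 4$ matrix; the numerical entries are illustrative values, and the function refuses to multiply when the number of columns of the first factor differs from the number of rows of the second.

```python
def matmul(a, b):
    """Product C = AB, with c_ij = sum over k of a_ik * b_kj."""
    m, n, p = len(a), len(b), len(b[0])
    if any(len(row) != n for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[2, 3, 1],
     [1, 6, 7]]              # order 2 x 3
B = [[1, 2, 1, 0],
     [0, 1, -1, 0],
     [-1, 0, 2, 1]]          # order 3 x 4

C = matmul(A, B)             # order 2 x 4
print(C)
```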
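The most important consequence of this definition is that matrix multiplication is (in general) non-commutative, i.e., $\mathbf{AB} \neq \mathbf{BA}$. Indeed, if $\mathbf{A}$ is $(m \times n)$ and $\mathbf{B}$ is $(n \times p)$ then $\mathbf{BA}$ cannot be formed at all unless $m = p$. Even when $m = p$, the two matrices are not necessarily equal, as is shown by the example

$$\mathbf{A} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix}, \quad \mathbf{AB} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad \mathbf{BA} = \begin{bmatrix} -1 & -1 \\ 1 & 2 \end{bmatrix}. \tag{1.2.4}$$

In addition, this definition implies that there are divisors of zero; i.e., there can be non-zero matrices $\mathbf{A}, \mathbf{B}$ such that $\mathbf{AB} = \mathbf{0}$. An example is provided by

$$\begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

The product of $\mathbf{A}(m \times n)$ and a column vector $\mathbf{x}(n \times 1)$ is a column vector $\mathbf{y}(m \times 1)$ with elements

$$y_i = a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n, \quad (i = 1, 2, \ldots, m). \tag{1.2.5}$$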
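This means that the set of $m$ equations

$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= y_1, \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= y_2, \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= y_m, \end{aligned} \tag{1.2.6}$$

may be written as the single matrix equation

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}, \tag{1.2.7}$$

or

$$\mathbf{Ax} = \mathbf{y}. \tag{1.2.8}$$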
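Three of the points made above, non-commutativity, divisors of zero, and the matrix form of a set of linear equations, can be checked numerically. The following is a sketch with illustrative matrices and the assumed helper names `matmul` and `matvec`.

```python
def matmul(a, b):
    """Product AB using c_ij = sum over k of a_ik * b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, x):
    """Product y = Ax, with y_i = a_i1 x_1 + ... + a_in x_n."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


# Non-commutativity: AB and BA are both 2 x 2 but differ.
A = [[1, 1], [0, 1]]
B = [[-1, 0], [1, 1]]
AB = matmul(A, B)
BA = matmul(B, A)

# Divisors of zero: two non-zero matrices whose product is the zero matrix.
P = [[1, 1], [2, 2]]
Q = [[1, 1], [-1, -1]]
PQ = matmul(P, Q)

# Two equations in three unknowns written as y = Ax.
y = matvec([[1, 2, 3], [4, 5, 6]], [1, 0, -1])

print(AB, BA, PQ, y)
```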
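The product of an $(n \times 1)$ column vector $\mathbf{x}$ and its transpose $\mathbf{x}^T$ $(1 \times n)$ is an $n \times n$ symmetric matrix

$$\mathbf{x}\mathbf{x}^T = \begin{bmatrix} x_1^2 & x_1 x_2 & \cdots & x_1 x_n \\ x_2 x_1 & x_2^2 & \cdots & x_2 x_n \\ \vdots & \vdots & & \vdots \\ x_n x_1 & x_n x_2 & \cdots & x_n^2 \end{bmatrix}. \tag{1.2.9}$$

On the other hand, the product of $\mathbf{x}^T$ $(1 \times n)$ and $\mathbf{x}$ $(n \times 1)$ is a $(1 \times 1)$ matrix, i.e., a scalar

$$\mathbf{x}^T \mathbf{x} = x_1^2 + x_2^2 + \cdots + x_n^2. \tag{1.2.10}$$

This quantity, which is positive iff the $x_i$ (assumed to be real) are not all zero, is called the square of the $L_2$ norm of $\mathbf{x}$, i.e.,

$$\|\mathbf{x}\|^2 = \mathbf{x}^T \mathbf{x}, \qquad \|\mathbf{x}\| = (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2}. \tag{1.2.11}$$

The scalar (or dot) product of $\mathbf{x}$ and $\mathbf{y}$ is defined to be

$$\mathbf{x}^T \mathbf{y} = \mathbf{y}^T \mathbf{x} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n. \tag{1.2.12}$$

Two vectors are said to be orthogonal if

$$\mathbf{x}^T \mathbf{y} = 0. \tag{1.2.13}$$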
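A short sketch of (1.2.9)-(1.2.13) for real vectors follows. The function names `outer`, `dot` and `norm` and the sample vectors are assumptions made for illustration.

```python
import math


def dot(x, y):
    """Scalar product x^T y = x_1 y_1 + ... + x_n y_n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """L2 norm: the square root of x^T x."""
    return math.sqrt(dot(x, x))


def outer(x):
    """The n x n symmetric matrix x x^T, with entries x_i x_j."""
    return [[xi * xj for xj in x] for xi in x]


x = [3, 4]
y = [-4, 3]

print(outer(x))      # symmetric, order 2
print(norm(x))       # length of x
print(dot(x, y))     # zero, so x and y are orthogonal
```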
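It has been noted that matrix multiplication is non-commutative. This holds even if the matrices are square (see (1.2.4)) or symmetric, as illustrated by

$$\begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 1 & 0 \end{bmatrix}. \tag{1.2.14}$$

This example, which shows that the product of two symmetric matrices is not (necessarily) symmetric, hints also that there might be a relation between the products $\mathbf{AB}$ and $\mathbf{BA}$. This result is sufficiently important to be called:

Theorem 1.2.1

$$(\mathbf{AB})^T = \mathbf{B}^T \mathbf{A}^T, \tag{1.2.15}$$

so that when $\mathbf{A}$, $\mathbf{B}$ are symmetric, then

$$(\mathbf{AB})^T = \mathbf{BA}. \tag{1.2.16}$$

Proof. Consider the element in row $i$, column $j$ on each side of (1.2.15). Suppose $\mathbf{A}$ is $(m \times n)$, $\mathbf{B}$ is $(n \times p)$, then $\mathbf{AB}$ is $m \times p$ and $(\mathbf{AB})^T$ is $p \times m$. Then

$$((\mathbf{AB})^T)_{ij} = (\mathbf{AB})_{ji} = \sum_{k=1}^{n} a_{jk} b_{ki},$$

and

$$(\mathbf{B}^T \mathbf{A}^T)_{ij} = (\text{row } i \text{ of } \mathbf{B}^T) \times (\text{column } j \text{ of } \mathbf{A}^T) = (\text{column } i \text{ of } \mathbf{B}) \times (\text{row } j \text{ of } \mathbf{A}) = \sum_{k=1}^{n} b_{ki} a_{jk}. \qquad \blacksquare$$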
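Theorem 1.2.1 can be spot-checked numerically. The sketch below compares $(\mathbf{AB})^T$ with $\mathbf{B}^T\mathbf{A}^T$ for a pair of pseudo-random integer matrices, and then checks (1.2.16) for two symmetric matrices whose products in the two orders differ. The helper names and the seeded generator are illustrative assumptions; a numerical check of this kind is of course not a proof.

```python
import random


def transpose(a):
    """Rows of the result are the columns of a."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product AB using c_ij = sum over k of a_ik * b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


rng = random.Random(0)  # fixed seed so the run is repeatable


def random_matrix(m, n):
    """An m x n matrix of small random integers."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(m)]


A = random_matrix(2, 3)
B = random_matrix(3, 4)
lhs = transpose(matmul(A, B))                # (AB)^T, order 4 x 2
rhs = matmul(transpose(B), transpose(A))     # B^T A^T, order 4 x 2
print(lhs == rhs)

# Symmetric case: (S1 S2)^T equals S2 S1, although S1 S2 differs from S2 S1.
S1 = [[1, 2], [2, 2]]
S2 = [[1, -1], [-1, 1]]
print(transpose(matmul(S1, S2)) == matmul(S2, S1))
```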
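Exercises 1.2

1. If

$$\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 5 \\ 3 & 5 & 8 \end{bmatrix}$$

find a square matrix $\mathbf{B}$ such that $\mathbf{AB} = \mathbf{0}$. Show that if $a_{33}$ is changed then the only possible matrix $\mathbf{B}$ would be the zero matrix.

2. Show that, whatever the matrix $\mathbf{A}$, the two matrices $\mathbf{A}\mathbf{A}^T$ and $\mathbf{A}^T\mathbf{A}$ are symmetric. Are these two matrices equal?

3. Show that if $\mathbf{A}$, $\mathbf{B}$ are square and of order $n$, and $\mathbf{A}$ is symmetric, then $\mathbf{B}\mathbf{A}\mathbf{B}^T$ and $\mathbf{B}^T\mathbf{A}\mathbf{B}$ are symmetric.

4. Show that if $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ can be multiplied in the order $\mathbf{ABC}$, then $(\mathbf{ABC})^T = \mathbf{C}^T \mathbf{B}^T \mathbf{A}^T$.

5. If $\mathbf{x}$ is complex, then its $L_2$ norm is defined by

$$\|\mathbf{x}\|^2 = |x_1|^2 + |x_2|^2 + \cdots + |x_n|^2.$$

Show that $\|\mathbf{x}\|^2 = \mathbf{x}^* \mathbf{x}$, where $\mathbf{x}^* = \bar{\mathbf{x}}^T$, the complex conjugate transpose of $\mathbf{x}$.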
1.3 Matrix inversion and determinants
In this section we shall be concerned almost exclusively with square matrices. The determinant of a (square) matrix $\mathbf{A}$, denoted by $\det(\mathbf{A})$ or $|\mathbf{A}|$, is defined to be
$$\det(\mathbf{A}) = |\mathbf{A}| = \sum \pm\, a_{1i_1}a_{2i_2}\cdots a_{ni_n}, \tag{1.3.1}$$
where the suffices $i_1, i_2, \ldots, i_n$ are a permutation of the numbers $1, 2, 3, \ldots, n$; the sign is $+$ if the permutation is even, and $-$ if it is odd, and the summation is carried out over all $n!$ permutations of $1, 2, 3, \ldots, n$. We note that each product in the sum contains just one element from each row and just one element from each column of $\mathbf{A}$. Thus for $2 \times 2$ and $3 \times 3$ matrices respectively
$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21},$$
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} - a_{12}a_{21}a_{33} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}. \tag{1.3.2}$$
The permutation $i_1, i_2, \ldots, i_n$ is even or odd according to whether it may be obtained from $1, 2, \ldots, n$ by means of an even or an odd number of interchanges, respectively. Thus $1, 3, 2, 4$ and $2, 3, 1, 4$ are respectively odd and even permutations of $1, 2, 3, 4$ because
$$(1, 2, 3, 4) \to (1, 3, 2, 4), \qquad (1, 2, 3, 4) \to (2, 1, 3, 4) \to (2, 3, 1, 4).$$
We now list some of the properties of determinants.

Lemma 1.3.1 If two rows (or columns) of $\mathbf{A}$ are interchanged, the determinant retains its numerical value, but changes sign.
If the new matrix is called $\mathbf{B}$ then $b_{1i} = a_{2i}$, $b_{2i} = a_{1i}$, $b_{ji} = a_{ji}$ $(j = 3, 4, \ldots, n)$ and
$$\det(\mathbf{B}) = \sum \pm\, b_{1i_1}b_{2i_2}b_{3i_3}\cdots b_{ni_n} = \sum \pm\, a_{2i_1}a_{1i_2}a_{3i_3}\cdots a_{ni_n} = \sum \pm\, a_{1i_2}a_{2i_1}a_{3i_3}\cdots a_{ni_n}.$$
But if $i_1, i_2, i_3, \ldots, i_n$ is even (odd) then $i_2, i_1, i_3, \ldots, i_n$ is odd (even), so that each term in $\det(\mathbf{B})$ appears in $\det(\mathbf{A})$ (and vice versa) with the opposite sign, so that $\det(\mathbf{B}) = -\det(\mathbf{A})$.

Lemma 1.3.2 If two rows (columns) of $\mathbf{A}$ are identical then $\det(\mathbf{A}) = 0$.
If the two rows (columns) are interchanged, then, on the one hand, $\det(\mathbf{A})$ is unchanged, while on the other, by Lemma 1.3.1, $\det(\mathbf{A})$ changes sign. Thus $\det(\mathbf{A}) = -\det(\mathbf{A})$ and hence $\det(\mathbf{A}) = 0$.

Lemma 1.3.3 If one row (column) of $\mathbf{A}$ is multiplied by $k$ then the determinant is multiplied by $k$.
Each term in the expansion is multiplied by $k$.

Lemma 1.3.4 If two rows (columns) of $\mathbf{A}$ are proportional, then $\det(\mathbf{A}) = 0$.
This follows from Lemmas 1.3.2 and 1.3.3: by Lemma 1.3.3 the factor of proportionality may be taken outside the determinant, which then has two identical rows.

Lemma 1.3.5 If one row (column) of $\mathbf{A}$ is added to another row (column) then the determinant is unchanged.
If the matrix $\mathbf{B}$ is obtained, say, by adding row 2 to row 1 then $b_{1i} = a_{1i} + a_{2i}$, $b_{ji} = a_{ji}$, $j = 2, 3, \ldots, n$. Thus
$$\det(\mathbf{B}) = \sum \pm\, b_{1i_1}b_{2i_2}b_{3i_3}\cdots b_{ni_n} = \sum \pm\, (a_{1i_1} + a_{2i_1})a_{2i_2}a_{3i_3}\cdots a_{ni_n} = \sum \pm\, a_{1i_1}a_{2i_2}a_{3i_3}\cdots a_{ni_n} + \sum \pm\, a_{2i_1}a_{2i_2}a_{3i_3}\cdots a_{ni_n},$$
and the first sum is $\det(\mathbf{A})$ while the second, being the determinant of a matrix whose first and second rows are equal, is zero.
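The definition (1.3.1) and the lemmas above can be tried out directly. The following is a minimal sketch, not an efficient algorithm: it sums over all $n!$ permutations exactly as in (1.3.1), determining the sign from the number of inversions (whose parity agrees with the parity of the number of interchanges). The function names and the test matrix are assumptions introduced here for illustration.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one.

    The parity of the number of inversions equals the parity of the
    number of interchanges needed to reach `perm` from the natural order.
    """
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return 1 if inversions % 2 == 0 else -1


def det(A):
    """Determinant from definition (1.3.1): a signed sum over all n! permutations."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= A[row][col]  # one element from each row and each column
        total += term
    return total


# Hypothetical test matrix, chosen only for illustration.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
swapped = [A[1], A[0], A[2]]                               # Lemma 1.3.1
repeated = [A[0], A[0], A[2]]                              # Lemma 1.3.2
added = [[a + b for a, b in zip(A[0], A[1])], A[1], A[2]]  # Lemma 1.3.5

print(det(A), det(swapped), det(repeated), det(added))
```

If the lemmas hold, the four printed values are $d$, $-d$, $0$ and $d$, where $d = \det(\mathbf{A})$. The cost grows like $n!$, so this is only sensible for very small matrices.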
Lemma 1.3.6 If a linear combination of rows (columns) of $\mathbf{A}$ is added to another row (column) then the determinant is unchanged.
This follows directly from Lemmas 1.3.3 and 1.3.5. We may now prove

Theorem 1.3.1 If the rows (columns) of $\mathbf{A}$ are linearly dependent then $\det(\mathbf{A}) = 0$.

Proof. Denote the rows by $\mathbf{a}_1^T, \mathbf{a}_2^T, \ldots, \mathbf{a}_n^T$. By hypothesis, there are scalars $c_1, c_2, \ldots, c_n$, not all zero, such that
$$c_1\mathbf{a}_1^T + c_2\mathbf{a}_2^T + \cdots + c_n\mathbf{a}_n^T = \mathbf{0}.$$
There is a $c_i$ not zero; let it be $c_m$. Then
$$\mathbf{a}_m^T = -\sum_{\substack{i=1 \\ i \neq m}}^{n} (c_i/c_m)\,\mathbf{a}_i^T.$$
If the sum $\sum_{i \neq m}(c_i/c_m)\mathbf{a}_i^T$ is added to row $m$ of $\mathbf{A}$, the new matrix has a zero row, so that its determinant, which by Lemma 1.3.6 is $\det(\mathbf{A})$, is zero. ∎

Before proving the converse of this theorem, we need some more notation. A minor of order $p$ of a matrix $\mathbf{A}$ is the determinant of a (square) submatrix of $\mathbf{A}$ formed by taking elements from $p$ rows $i_1, i_2, \ldots, i_p$ and $p$ columns $j_1, j_2, \ldots, j_p$. We denote the minor by
$$A(i_1, i_2, \ldots, i_p;\ j_1, j_2, \ldots, j_p).$$
Thus if
$$\mathbf{A} = \begin{bmatrix} 2 & 1 & 3 \\ -1 & 2 & 4 \\ 1 & 0 & 7 \end{bmatrix} \tag{1.3.3}$$
then
$$A(1; 1) = 2, \qquad A(1, 2; 1, 2) = \begin{vmatrix} 2 & 1 \\ -1 & 2 \end{vmatrix} = 5, \qquad A(1, 2; 2, 3) = -2.$$
There is an important special case. The minor of order $n - 1$ obtained by deleting the $i$th row and $j$th column of $\mathbf{A}$ is denoted by $\hat{a}_{ij}$. Thus for the $\mathbf{A}$ in (1.3.3),
$$\hat{a}_{11} = \begin{vmatrix} 2 & 4 \\ 0 & 7 \end{vmatrix} = 14, \qquad \hat{a}_{12} = \begin{vmatrix} -1 & 4 \\ 1 & 7 \end{vmatrix} = -11, \qquad \hat{a}_{13} = \begin{vmatrix} -1 & 2 \\ 1 & 0 \end{vmatrix} = -2.$$
The minors $\hat{a}_{ij}$ occur in the expansion of a determinant: for the determinant in (1.3.2) we may write
$$\det(\mathbf{A}) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) = a_{11}\hat{a}_{11} - a_{12}\hat{a}_{12} + a_{13}\hat{a}_{13}. \tag{1.3.4}$$
This is called the expansion of $\det(\mathbf{A})$ along the first row. Thus for $\mathbf{A}$ in (1.3.3) we have
$$33 = 2 \times 14 - 1 \times (-11) + 3 \times (-2).$$
The coefficients $\hat{a}_{11}, -\hat{a}_{12}, \hat{a}_{13}$ in (1.3.4) are called the cofactors of $a_{11}, a_{12}, a_{13}$ respectively, and are denoted by $A_{11}, A_{12}, A_{13}$ respectively. Thus we write (1.3.4) as
$$\det(\mathbf{A}) = a_{11}A_{11} + a_{12}A_{12} + a_{13}A_{13} = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}. \tag{1.3.5}$$
If we take the cofactors of the first row and multiply them by the elements of another row, say the second row, then we get zero:
$$\begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{21}A_{11} + a_{22}A_{12} + a_{23}A_{13} = 0. \tag{1.3.6}$$
The determinant on the left is zero because it has two rows equal. These two results, (1.3.5) and (1.3.6), are special cases of

Theorem 1.3.2
$$\sum_{k=1}^{n} a_{ik}A_{jk} = \det(\mathbf{A})\,\delta_{ij}, \tag{1.3.7}$$
$$\sum_{k=1}^{n} A_{ki}a_{kj} = \det(\mathbf{A})\,\delta_{ij}, \tag{1.3.8}$$
where $\delta_{ij}$ is defined in (1.2.1).
Proof. When $i = j$, so that $\delta_{ii} = 1$, these equations merely state the definition of a cofactor. When $i \neq j$ they state that the determinant of a matrix with two rows (or columns) equal is zero. ∎

Now compare equation (1.3.7) with (1.2.3). If we define a matrix $\mathbf{B}$ such that
$$b_{kj} = A_{jk} \tag{1.3.9}$$
then we can write (1.3.7) as
$$\sum_{k=1}^{n} a_{ik}b_{kj} = \det(\mathbf{A})\,\delta_{ij}, \tag{1.3.10}$$
which, in matrix terms, states that
$$\mathbf{AB} = \det(\mathbf{A})\,\mathbf{I}. \tag{1.3.11}$$
Likewise, (1.3.8) may be written
$$\mathbf{BA} = \det(\mathbf{A})\,\mathbf{I}. \tag{1.3.12}$$
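The matrix $\mathbf{B}$ is called the adjoint (or adjugate) of $\mathbf{A}$ and is denoted by $\mathrm{adj}(\mathbf{A})$. Thus equations (1.3.11), (1.3.12) state that
$$\mathbf{A}\,\mathrm{adj}(\mathbf{A}) = \mathrm{adj}(\mathbf{A})\,\mathbf{A} = \det(\mathbf{A})\,\mathbf{I}. \tag{1.3.13}$$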
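Equation (1.3.13) is easy to check on a small example. The sketch below builds the adjugate from cofactors, using the definition $b_{kj} = A_{jk}$ of (1.3.9), and multiplies it by $\mathbf{A}$. The helper names are introduced here for illustration, and the entries of the matrix of (1.3.3) are taken with $a_{21} = -1$, a reading that is an assumption recovered from the quoted minors and from $\det(\mathbf{A}) = 33$.

```python
def minor_matrix(A, i, j):
    """Return A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row, as in (1.3.4)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j)) for j in range(len(A)))


def adjugate(A):
    """Adjugate B with b_kj = A_jk, the cofactor of a_jk, as in (1.3.9)."""
    n = len(A)
    return [
        [(-1) ** (j + k) * det(minor_matrix(A, j, k)) for j in range(n)]
        for k in range(n)
    ]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


# Matrix (1.3.3), with the signs assumed as described above.
A = [[2, 1, 3], [-1, 2, 4], [1, 0, 7]]
B = adjugate(A)

print(det(A))
print(matmul(A, B))
print(matmul(B, A))
```

Both products should be $\det(\mathbf{A})$ times the identity matrix. Dividing $\mathbf{B}$ by $\det(\mathbf{A})$ then gives the inverse discussed below, although for anything but tiny matrices this is far more expensive than elimination.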
We are now in a position to prove the converse of Theorem 1.3.1, namely

Theorem 1.3.3 If $\det(\mathbf{A}) = 0$, then the rows (columns) of $\mathbf{A}$ are linearly dependent.

Proof. We prove the result for the columns. That for the rows may be proved likewise. We will prove it by induction on $n$. It certainly holds, trivially, when $n = 1$, for then $\det(\mathbf{A}) = a_{11}$. Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ be the columns of $\mathbf{A}$, and suppose $\det(\mathbf{A}) = 0$. Either each set of $n - 1$ vectors selected from $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is a linearly dependent set, in which case the complete set is linearly dependent as required, or there is a set of $n - 1$ vectors, which without loss of generality we may take to be $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_{n-1}$, which is linearly independent. Now imagine creating a set of vectors $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{n-1}$ by deleting the $i$th row of each of the vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_{n-1}$. For at least one value of $i$ the set $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{n-1}$ must be linearly independent. By the inductive hypothesis, the $(n - 1) \times (n - 1)$ determinant formed from these vectors must be non-zero; at least one of the terms $b_{kj}$ in equation (1.3.10) is non-zero. If $\det(\mathbf{A}) = 0$, equation (1.3.10) states that
$$\sum_{k=1}^{n} a_{ik}b_{kj} = 0, \qquad i, j = 1, 2, \ldots, n. \tag{1.3.14}$$
Since $\mathbf{a}_k = \{a_{1k}, a_{2k}, \ldots, a_{nk}\}$, we may write the $n$ equations (1.3.14) obtained by taking $i = 1, 2, \ldots, n$, as
$$\sum_{k=1}^{n} b_{kj}\mathbf{a}_k = \mathbf{0}, \qquad j = 1, 2, \ldots, n. \tag{1.3.15}$$
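For at least one value of $j$, not all the $b_{kj}$ are zero; the columns $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are linearly dependent. ∎

Theorem 1.3.4 The matrix equations
$$\mathbf{Ax} = \mathbf{0}, \qquad \mathbf{y}^T\mathbf{A} = \mathbf{0}$$
have non-trivial solutions $\mathbf{x}$ and $\mathbf{y}$ respectively iff $\det(\mathbf{A}) = 0$.

Proof. The theorem is a corollary of Theorems 1.3.1 and 1.3.3. If $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are the columns of $\mathbf{A}$, then
$$\mathbf{Ax} = [\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n]\{x_1, x_2, \ldots, x_n\} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n.$$
We can find $x_1, \ldots, x_n$, not all zero, such that
$$x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{0}$$
iff $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are linearly dependent. By Theorems 1.3.1 and 1.3.3 this happens iff $\det(\mathbf{A}) = 0$. This happens in turn iff the rows of $\mathbf{A}$ are linearly dependent, i.e., $\mathbf{y}^T\mathbf{A} = \mathbf{0}$ has a non-trivial solution. ∎

Theorem 1.3.5 If $\mathbf{A}$, $\mathbf{B}$ are square matrices of order $n$ then
$$\det(\mathbf{AB}) = \det(\mathbf{A}) \cdot \det(\mathbf{B}).$$
The proof of this result is left to Ex. 1.3.3.

The square matrix $\mathbf{A}$ is said to be singular if $\det(\mathbf{A}) = 0$, non-singular or invertible if $\det(\mathbf{A}) \neq 0$. Theorem 1.3.4 shows that if $\mathbf{A}$ is non-singular, then the equation $\mathbf{Ax} = \mathbf{0}$ has only the trivial solution $\mathbf{x} = \mathbf{0}$. Ex. 1.3.5 extends this result: if $\mathbf{A}$ is non-singular, then the matrix equations
$$\mathbf{AP} = \mathbf{0}, \qquad \mathbf{QA} = \mathbf{0}$$
have only the trivial solutions $\mathbf{P} = \mathbf{0}$, $\mathbf{Q} = \mathbf{0}$; when $\mathbf{A}$ is non-singular there are no divisors of zero. The matrix $\mathbf{R}$ is said to be an inverse of $\mathbf{A}$ if $\mathbf{AR} = \mathbf{I}$.

Theorem 1.3.6 If $\mathbf{A}$ has an inverse, it is unique, and $\mathbf{RA} = \mathbf{I}$.

Proof. Suppose $\mathbf{AR} = \mathbf{I}$. Theorem 1.3.5 shows that
$$\det(\mathbf{A}) \cdot \det(\mathbf{R}) = \det(\mathbf{I}) = 1 \tag{1.3.16}$$
so that $\det(\mathbf{A}) \neq 0$: $\mathbf{A}$ is non-singular. If $\mathbf{R}_1$, $\mathbf{R}_2$ were two inverses, then $\mathbf{AR}_1 = \mathbf{I} = \mathbf{AR}_2$, so that $\mathbf{A}(\mathbf{R}_1 - \mathbf{R}_2) = \mathbf{0}$. But $\mathbf{A}$ is non-singular, so that $\mathbf{R}_1 - \mathbf{R}_2 = \mathbf{0}$: $\mathbf{R}_2 = \mathbf{R}_1$. Now if $\mathbf{AR} = \mathbf{I}$ then $\mathbf{ARA} = \mathbf{A}$, i.e., $\mathbf{A}(\mathbf{RA} - \mathbf{I}) = \mathbf{0}$. But $\mathbf{A}$ is non-singular, so that $\mathbf{RA} - \mathbf{I} = \mathbf{0}$, i.e., $\mathbf{RA} = \mathbf{I}$. ∎

Theorem 1.3.6 shows that if $\mathbf{A}$ has an inverse, then $\mathbf{A}$ is non-singular. The contrapositive of this statement is that if $\mathbf{A}$ is singular it does not have an inverse. We now prove the converse.

Theorem 1.3.7 If $\mathbf{A}$ is non-singular, then it has an inverse.

Proof. If $\mathbf{A}$ is non-singular, then $\det(\mathbf{A}) \neq 0$, and equation (1.3.13) may be written
$$\mathbf{AR} = \mathbf{RA} = \mathbf{I}, \tag{1.3.17}$$
where $\mathbf{R} = \mathrm{adj}(\mathbf{A})/\det(\mathbf{A})$. ∎

If $\mathbf{A}$ is non-singular, its unique inverse is denoted by $\mathbf{A}^{-1}$. We have
$$\mathbf{AA}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}. \tag{1.3.18}$$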
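Theorem 1.3.8 The equation
$$\mathbf{Ax} = \mathbf{b} \tag{1.3.19}$$
either has a unique solution, if $\mathbf{A}$ is non-singular; or, if $\mathbf{A}$ is singular, it has a solution only for certain $\mathbf{b}$.

Proof. If $\mathbf{A}$ is non-singular then $\mathbf{x} = \mathbf{A}^{-1}(\mathbf{Ax}) = \mathbf{A}^{-1}\mathbf{b}$ is the unique solution. If $\mathbf{A}$ is singular, then there is one (or more) $\mathbf{y}$ such that
$$\mathbf{y}^T\mathbf{A} = \mathbf{0}.$$
Then
$$\mathbf{y}^T(\mathbf{Ax}) = \mathbf{y}^T\mathbf{b} = 0,$$
so that (1.3.19) has a solution only if $\mathbf{b}$ is orthogonal to any $\mathbf{y}$ which satisfies $\mathbf{y}^T\mathbf{A} = \mathbf{0}$. If $\mathbf{A}$ is singular then $\mathbf{Ax} = \mathbf{0}$ has one or more solutions $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m$, so that if $\mathbf{x}_0$ is one solution satisfying $\mathbf{Ax}_0 = \mathbf{b}$, then
$$\mathbf{x} = \mathbf{x}_0 + \sum_{i=1}^{m} c_i\mathbf{x}_i \tag{1.3.20}$$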
is also a solution for arbitrary $c_1, c_2, \ldots, c_m$. ∎

Note that trying to solve $\mathbf{Ax} = \mathbf{b}$ by actually finding the inverse of $\mathbf{A}$ is an extremely wasteful and clumsy procedure. Finding $\mathbf{A}^{-1}$ is equivalent to solving $\mathbf{Ax} = \mathbf{b}$ for all possible $\mathbf{b}$, not just for the given $\mathbf{b}$. Techniques for solving $\mathbf{Ax} = \mathbf{b}$ form the subject matter of numerical linear algebra, for which see Bishop, Gladwell and Michaelson (1965) [33] or Golub and Van Loan (1983) [135]. Note also that we have not in fact shown how to find one solution $\mathbf{x}_0$ if $\mathbf{b}$ is in fact orthogonal to all solutions of $\mathbf{y}^T\mathbf{A} = \mathbf{0}$; this too is covered in numerical linear algebra.

In numerical linear algebra the starting point of almost all the procedures for solving linear equations such as (1.3.19), whether $\mathbf{A}$ is square or not, or of finding determinants, is Gaussian elimination. This is a systematic reduction of an array $(a_{ij})$ to (usually) upper triangular form by subtracting multiples of one equation from another. Lemma 1.3.6 shows that the determinant of coefficients is unchanged under such an operation. The application of Gaussian elimination to the equations
$$\begin{aligned} x_1 + 3x_2 + 2x_3 &= 6 \\ 2x_1 + 5x_2 + 6x_3 &= 13 \\ 3x_1 + 4x_2 + 7x_3 &= 14 \end{aligned}$$
would proceed as follows; only the coefficients need be retained:
$$\begin{array}{rrr|r} 1 & 3 & 2 & 6 \\ 2 & 5 & 6 & 13 \\ 3 & 4 & 7 & 14 \end{array} \;\to\; \begin{array}{rrr|r} 1 & 3 & 2 & 6 \\ 0 & -1 & 2 & 1 \\ 0 & -5 & 1 & -4 \end{array} \;\to\; \begin{array}{rrr|r} 1 & 3 & 2 & 6 \\ 0 & -1 & 2 & 1 \\ 0 & 0 & -9 & -9 \end{array}$$
The determinant of $\mathbf{A}$ is $1 \times (-1) \times (-9) = 9$. The last of the new equations gives $-9x_3 = -9$, $x_3 = 1$; when substituted in the new second equation this gives $-x_2 = 1 - 2x_3 = -1$, $x_2 = 1$; then $x_1 + 3 + 2 = 6$ gives $x_1 = 1$.
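The elimination just carried out by hand can be reproduced in a few lines. This is a minimal sketch under stated assumptions: it uses exact rational arithmetic, performs no row interchanges (so it simply stops if a zero pivot is met), and the function name `gaussian_elimination` is introduced here for illustration. It is not a substitute for the pivoted algorithms described in the references on numerical linear algebra.

```python
from fractions import Fraction


def gaussian_elimination(A, b):
    """Reduce [A : b] to upper triangular form, then back-substitute.

    Returns the reduced augmented array, det(A) and the solution x.
    No pivoting is done, so a zero pivot raises an error.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        if M[col][col] == 0:
            raise ZeroDivisionError("zero pivot: a row interchange would be needed")
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            # Subtract a multiple of the pivot row; by Lemma 1.3.6 this
            # leaves the determinant of the coefficients unchanged.
            M[row] = [x - factor * p for x, p in zip(M[row], M[col])]
    determinant = Fraction(1)
    for i in range(n):
        determinant *= M[i][i]  # product of the diagonal of the triangular form
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return M, determinant, x


reduced, determinant, x = gaussian_elimination(
    [[1, 3, 2], [2, 5, 6], [3, 4, 7]], [6, 13, 14]
)
for row in reduced:
    print([str(v) for v in row])
print("det =", determinant, " x =", [str(v) for v in x])
```

The rows printed should match the final array of the worked example, with the determinant obtained as the product of the diagonal entries.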
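Exercises 1.3

1. Show that if $\mathbf{A}$ is upper (lower) triangular, i.e., $a_{ij} = 0$ if $i > j$ $(i < j)$, then
$$\det(\mathbf{A}) = a_{11}a_{22}\cdots a_{nn}.$$
2. If
$$\mathbf{A} = \begin{bmatrix} 1 & 3 & 2 \\ 2 & 5 & 6 \\ 3 & 4 & 7 \end{bmatrix}$$
find $\mathbf{A}^{-1}$. Verify that $\mathbf{AA}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$.
3. Prove that if $\mathbf{A}$, $\mathbf{B}$ are square matrices of order $n$, then $\det(\mathbf{AB}) = \det(\mathbf{A}) \cdot \det(\mathbf{B})$. Hint: consider the $2n \times 2n$ matrix
$$\mathbf{C} = \begin{bmatrix} \mathbf{A} & \mathbf{0} \\ -\mathbf{I} & \mathbf{B} \end{bmatrix}.$$
Show that $\det(\mathbf{C}) = \det(\mathbf{A}) \cdot \det(\mathbf{B})$. Now add multiples of rows $(n + 1)$ to $2n$ to rows $1$ to $n$ to delete all elements in the top left quarter of $\mathbf{C}$. The elements in the top right quarter will be those of $\mathbf{AB}$.
4. Use Gaussian elimination to solve the equations
$$\begin{aligned} x_1 + 2x_2 + 4x_3 + 8x_4 &= 9 \\ x_2 + 3x_3 + 2x_4 &= 1 \\ x_1 + 2x_2 + 5x_3 + 6x_4 &= 3 \\ x_1 + 3x_2 + 4x_3 + 7x_4 &= 10 \end{aligned}$$
5. Show that if $\mathbf{A}$ is non-singular, then the matrix equations $\mathbf{AP} = \mathbf{0}$ and $\mathbf{QA} = \mathbf{0}$ have only the trivial solutions $\mathbf{P} = \mathbf{0}$, $\mathbf{Q} = \mathbf{0}$, respectively.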
1.4 Eigenvalues and eigenvectors
If $\mathbf{A}$ and $\mathbf{C}$ are square matrices of order $n$ then the equation
$$\mathbf{Cx} = \lambda\mathbf{Ax} \tag{1.4.1}$$
will have a non-trivial solution $\mathbf{x}$ (i.e., one for which $\|\mathbf{x}\| \neq 0$) iff the matrix $\mathbf{C} - \lambda\mathbf{A}$ is singular, i.e., the scalar $\lambda$ satisfies the determinantal or characteristic equation
$$\det(\mathbf{C} - \lambda\mathbf{A}) = 0. \tag{1.4.2}$$
The roots of this equation are called the eigenvalues of the matrix pair $(\mathbf{C}, \mathbf{A})$; they may be real or complex. If $\lambda$ is an eigenvalue, a vector $\mathbf{x}$ satisfying (1.4.1) is called an eigenvector corresponding to $\lambda$.
In many mathematical texts, attention is focused almost exclusively on the case when $\mathbf{A} = \mathbf{I}$. In this case $\lambda$ is said to be an eigenvalue of $\mathbf{C}$. The problem (1.4.1) is called the generalised eigenvalue problem. In Mechanics there are many problems in which two matrices, $\mathbf{C}$, $\mathbf{A}$ appear, and it will be convenient to develop the theory for this case.

The eigenvalue theory for general, i.e., not necessarily symmetric, matrices $\mathbf{C}$, $\mathbf{A}$ is extremely complicated. (See Ex. 1.4.8.) However, for all, or almost all, the problems encountered in this book, the matrices $\mathbf{C}$, $\mathbf{A}$ have special properties: they are real and symmetric, and one at least is positive definite, defined as follows.

Suppose $\mathbf{A}$ is real and symmetric, and $\mathbf{x}$ is a real $n \times 1$ column vector. The quantity $\mathbf{x}^T\mathbf{Ax}$ is a scalar. Written in full it is
$$\mathbf{x}^T\mathbf{Ax} = a_{11}x_1^2 + 2a_{12}x_1x_2 + \cdots + 2a_{1n}x_1x_n + a_{22}x_2^2 + \cdots + 2a_{2n}x_2x_n + \cdots + a_{nn}x_n^2. \tag{1.4.3}$$
This is called a quadratic form. In many physical applications the kinetic energy and the potential energy of a mechanical system may be expressed as quadratic forms in the generalised velocities or displacements, respectively. The kinetic energy of a system is always positive, unless all the generalised velocities are zero. This leads us to a definition.

The matrix $\mathbf{A}$ is said to be positive definite if $\|\mathbf{x}\| \neq 0$ implies $\mathbf{x}^T\mathbf{Ax} > 0$. (Clearly, if $\|\mathbf{x}\| = 0$, so that $x_1 = 0 = x_2 = \cdots = x_n$, then $\mathbf{x}^T\mathbf{Ax} = 0$.) If $\mathbf{A}$ satisfies only the weaker condition that $\|\mathbf{x}\| \neq 0$ implies $\mathbf{x}^T\mathbf{Ax} \geq 0$, so that there is a vector $\mathbf{x}$ with $\|\mathbf{x}\| \neq 0$ and $\mathbf{x}^T\mathbf{Ax} = 0$, then $\mathbf{A}$ is said to be positive semi-definite. We will find later that the matrix associated with the potential energy of an unanchored system is positive semi-definite; there is a vector $\mathbf{x}$ corresponding to a rigid body displacement of the system, for which the potential (or strain) energy is zero.

Theorem 1.4.1 If $\mathbf{C}$, $\mathbf{A}$ are real and symmetric, and $\mathbf{A}$ is positive definite, then the eigenvalues and eigenvectors of (1.4.1) are real.

Proof.
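Suppose $\lambda$, $\mathbf{x}$, possibly complex, and with $\|\mathbf{x}\| \neq 0$, are an eigenpair of (1.4.1). Multiply both sides by $\mathbf{x}^* = \bar{\mathbf{x}}^T$ to obtain
$$\mathbf{x}^*\mathbf{Cx} = \lambda\,\mathbf{x}^*\mathbf{Ax}. \tag{1.4.4}$$
The quantities $\mathbf{x}^*\mathbf{Ax}$ and $\mathbf{x}^*\mathbf{Cx}$ are both real. This is so because $\mathbf{x}^*\mathbf{Ax}$, for instance, is a scalar, and therefore equal to its own transpose. Thus
$$a = \mathbf{x}^*\mathbf{Ax} = (\mathbf{x}^*\mathbf{Ax})^T = \mathbf{x}^T\mathbf{A}^T\bar{\mathbf{x}} = \mathbf{x}^T\mathbf{A}\bar{\mathbf{x}} = \overline{(\mathbf{x}^*\mathbf{Ax})} = \bar{a},$$
but if $a = \bar{a}$, then $a$ is real. Similarly, $\mathbf{x}^*\mathbf{Cx}$ is real. Moreover, if $\|\mathbf{x}\| \neq 0$, i.e., at least one element in $\mathbf{x}$ is not zero, then $a$ is strictly positive, i.e., $a > 0$. For let $\mathbf{x} = \mathbf{u} + i\mathbf{v}$ where $\mathbf{u}$, $\mathbf{v}$ are real; then
$$\mathbf{x}^*\mathbf{Ax} = (\mathbf{u}^T - i\mathbf{v}^T)\mathbf{A}(\mathbf{u} + i\mathbf{v}) = \mathbf{u}^T\mathbf{Au} + i\{\mathbf{u}^T\mathbf{Av} - \mathbf{v}^T\mathbf{Au}\} + \mathbf{v}^T\mathbf{Av}.$$
But since $\mathbf{x}^*\mathbf{Ax}$ is real, the imaginary term is zero, and thus
$$\mathbf{x}^*\mathbf{Ax} = \mathbf{u}^T\mathbf{Au} + \mathbf{v}^T\mathbf{Av} > 0.$$
The inequality is strict because either at least one element of $\mathbf{u}$ is non-zero, in which case $\mathbf{u}^T\mathbf{Au} > 0$; or if $\mathbf{u} \equiv \mathbf{0}$, at least one element of $\mathbf{v}$ is non-zero, in which case $\mathbf{v}^T\mathbf{Av} > 0$.

Now return to equation (1.4.4); $\mathbf{x}^*\mathbf{Cx}$ and $\mathbf{x}^*\mathbf{Ax}$ are both real and $\mathbf{x}^*\mathbf{Ax}$ is positive. Hence $\lambda = \mathbf{x}^*\mathbf{Cx}/\mathbf{x}^*\mathbf{Ax}$ is real. Since $\lambda$ is real, the vector $\mathbf{x}$, obtained by solving a set of simultaneous linear equations with real coefficients, may be taken to be real. Therefore $\mathbf{x}^* = \mathbf{x}^T$, and we can write
$$\lambda = \mathbf{x}^T\mathbf{Cx}/\mathbf{x}^T\mathbf{Ax}. \qquad ∎$$
This ratio is often called, and we will call it, the Rayleigh Quotient corresponding to equation (1.4.1). (It was Lord Rayleigh (Rayleigh (1894) [290]) who, in his classical treatise Theory of Sound, used this quotient to take the first steps towards the variational treatment of eigenvalues. We discuss this further in Chapter 2.) We write
$$R = R(\mathbf{x}) = \mathbf{x}^T\mathbf{Cx}/\mathbf{x}^T\mathbf{Ax}. \tag{1.4.5}$$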
Ex. 1.4.7 shows the necessity of having one of the matrices $\mathbf{A}$, $\mathbf{C}$ positive definite.

The conditions which must be satisfied if a (symmetric) matrix $\mathbf{A}$ is to be positive definite or positive semi-definite may be expressed in terms of the principal minors of $\mathbf{A}$. A principal minor of order $p$ of a matrix $\mathbf{A}$ (symmetric or not) is a determinant of a submatrix formed from $p$ rows $i_1, i_2, \ldots, i_p$ and the same $p$ columns $i_1, i_2, \ldots, i_p$. Thus for $\mathbf{A}$ in (1.3.3),
$$\begin{vmatrix} 2 & 1 \\ -1 & 2 \end{vmatrix}, \qquad \begin{vmatrix} 2 & 3 \\ 1 & 7 \end{vmatrix}, \qquad \begin{vmatrix} 2 & 4 \\ 0 & 7 \end{vmatrix}, \qquad \begin{vmatrix} 2 & 1 & 3 \\ -1 & 2 & 4 \\ 1 & 0 & 7 \end{vmatrix}$$
are all principal minors. In the notation of Section 1.3, a principal minor is $A(i_1, i_2, \ldots, i_p;\ i_1, i_2, \ldots, i_p)$. There is a special notation for the leading principal minors of $\mathbf{A}$; these are as follows:
$$D_1 = a_{11}, \qquad D_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \qquad \ldots, \qquad D_n = |\mathbf{A}| = \det(\mathbf{A}). \tag{1.4.6}$$
Now we may state

Theorem 1.4.2 The symmetric matrix $\mathbf{A}$ is positive definite iff the leading principal minors $(D_i)_1^n$ are all positive. $\mathbf{A}$ will be positive semi-definite iff $(D_i)_1^{n-1} \geq 0$, $D_n = 0$.
This will not be proved until Chapter 5. Note that since $D_n = \det(\mathbf{A})$, this states that a positive-definite matrix is non-singular, and a positive semi-definite matrix is singular. We may now refine Theorem 1.4.1 to give

Theorem 1.4.3 If $\mathbf{C}$, $\mathbf{A}$ are real and symmetric and $\mathbf{A}$ is positive definite then equation (1.4.1) will have $n$ real eigenvalues, although they need not be distinct. If $\mathbf{C}$ is positive definite they will be positive; if $\mathbf{C}$ is positive semi-definite they will be non-negative.

Proof. Equation (1.4.2) may be expanded in terms of the coefficients $c_{ij} - \lambda a_{ij}$; the result is an $n$th degree polynomial equation for $\lambda$, namely
$$\Delta(\lambda) \equiv \det(\mathbf{C} - \lambda\mathbf{A}) \equiv \Delta_0 + \Delta_1\lambda + \Delta_2\lambda^2 + \cdots + \Delta_n\lambda^n = 0. \tag{1.4.7}$$
Most of the coefficients $\Delta_i$ are complicated functions of $a_{ij}$ and $c_{ij}$, but the first and last may be easily identified, namely
$$\Delta_0 = \det(\mathbf{C}), \qquad \Delta_n = (-1)^n\det(\mathbf{A}). \tag{1.4.8}$$
Since $\mathbf{A}$ is positive-definite, $\det(\mathbf{A}) > 0$ so that $\Delta_n \neq 0$. This means that equation (1.4.7) is a proper equation of degree $n$ with $n$ roots $(\lambda_i)_1^n$. This proves the first part of the Theorem. If $\mathbf{C}$ is positive-definite, then both numerator and denominator of the Rayleigh Quotient (1.4.5) will be positive, so that $(\lambda_i)_1^n > 0$. If $\mathbf{C}$ is only positive semi-definite, then the numerator of the Rayleigh Quotient is only positive or zero (i.e., non-negative), so that the $\lambda_i$ are non-negative. Moreover, since
$$\lambda_1\lambda_2\cdots\lambda_n = (-1)^n\Delta_0/\Delta_n = \det(\mathbf{C})/\det(\mathbf{A}),$$
equation (1.4.7) will have at least one zero root when $\det(\mathbf{C}) = 0$. ∎

Under the conditions of Theorem 1.4.3 the eigenvalues $(\lambda_i)_1^n$ may be labelled in increasing order:
$$0 \leq \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n. \tag{1.4.9}$$

Theorem 1.4.4 Eigenvectors $\mathbf{u}_i$, $\mathbf{u}_j$ corresponding to two different eigenvalues $\lambda_i$, $\lambda_j$ $(\lambda_i \neq \lambda_j)$ of the symmetric matrix pair $(\mathbf{C}, \mathbf{A})$ are orthogonal w.r.t. both $\mathbf{A}$ and $\mathbf{C}$, i.e.,
$$\mathbf{u}_i^T\mathbf{Au}_j = 0 = \mathbf{u}_i^T\mathbf{Cu}_j. \tag{1.4.10}$$
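Proof. By definition
$$\mathbf{Cu}_i = \lambda_i\mathbf{Au}_i, \qquad \mathbf{Cu}_j = \lambda_j\mathbf{Au}_j. \tag{1.4.11}$$
Transpose the first equation and multiply it on the right by $\mathbf{u}_j$; multiply the second equation on the left by $\mathbf{u}_i^T$, to obtain
$$(\mathbf{u}_i^T\mathbf{C})\mathbf{u}_j = \lambda_i(\mathbf{u}_i^T\mathbf{A})\mathbf{u}_j, \qquad \mathbf{u}_i^T(\mathbf{Cu}_j) = \lambda_j\mathbf{u}_i^T(\mathbf{Au}_j).$$
Subtract these two equations to yield
$$(\lambda_i - \lambda_j)\mathbf{u}_i^T\mathbf{Au}_j = 0.$$
But $\lambda_i - \lambda_j \neq 0$, so that $\mathbf{u}_i^T\mathbf{Au}_j = 0$, and hence $\mathbf{u}_i^T\mathbf{Cu}_j = 0$. ∎

Premultiplying the first of equations (1.4.11) by $\mathbf{u}_i^T$ we find
$$c_i \equiv \mathbf{u}_i^T\mathbf{Cu}_i = \lambda_i\mathbf{u}_i^T\mathbf{Au}_i \equiv \lambda_ia_i. \tag{1.4.12}$$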
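Sometimes, we will normalise an eigenvector $\mathbf{u}_i$ w.r.t. $\mathbf{A}$; then $a_i = 1$, $c_i = \lambda_i$. An important corollary of this result is

Theorem 1.4.5 If the symmetric matrix pair $(\mathbf{C}, \mathbf{A})$ has distinct eigenvalues $(\lambda_i)_1^n$, and $\mathbf{A}$ is positive-definite, then the eigenvectors $\mathbf{u}_i$ are linearly independent, and therefore span $V_n$, the space of $n$-vectors.

Proof. The eigenvectors $\mathbf{u}_i$ are linearly independent; for suppose
$$\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_n\mathbf{u}_n = \mathbf{0};$$
multiplying by $\mathbf{u}_i^T\mathbf{A}$ we have
$$\alpha_1(\mathbf{u}_i^T\mathbf{Au}_1) + \alpha_2(\mathbf{u}_i^T\mathbf{Au}_2) + \cdots + \alpha_n(\mathbf{u}_i^T\mathbf{Au}_n) = 0.$$
But $\mathbf{u}_i^T\mathbf{Au}_j = 0$ if $i \neq j$, so that only the $i$th term in this equation is non-zero, and hence
$$\alpha_i(\mathbf{u}_i^T\mathbf{Au}_i) = 0.$$
Since $\mathbf{A}$ is positive definite, $\mathbf{u}_i^T\mathbf{Au}_i > 0$ and $\alpha_i = 0$; all the $(\alpha_i)_1^n$ are zero; the $\mathbf{u}_i$ are linearly independent. Any vector $\mathbf{u} \in V_n$ may be written uniquely as
$$\mathbf{u} = \sum_{j=1}^{n} \alpha_j\mathbf{u}_j \tag{1.4.13}$$
where
$$\alpha_j = \mathbf{u}_j^T\mathbf{Au}/\mathbf{u}_j^T\mathbf{Au}_j. \qquad ∎ \tag{1.4.14}$$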
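Purely as a check on Theorems 1.4.3 to 1.4.5, and not as a method for computing eigenvalues, the $2 \times 2$ case can be worked through explicitly: the characteristic equation (1.4.7) is then a quadratic. The sketch below assumes a hypothetical symmetric pair $(\mathbf{C}, \mathbf{A})$ with $\mathbf{A}$ positive definite; the matrices and function names are chosen here for illustration only.

```python
import math


def bilinear(x, M, y):
    """Return x^T M y for 2-vectors x, y and a 2x2 matrix M."""
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))


def eigenpairs_2x2(C, A):
    """Eigenpairs of C u = lambda A u for real symmetric 2x2 C, A, A positive definite."""
    d0 = C[0][0] * C[1][1] - C[0][1] * C[1][0]            # Delta_0 = det(C)
    d2 = A[0][0] * A[1][1] - A[0][1] * A[1][0]            # Delta_2 = det(A)
    d1 = -(C[0][0] * A[1][1] + C[1][1] * A[0][0]
           - C[0][1] * A[1][0] - C[1][0] * A[0][1])       # Delta_1
    root = math.sqrt(d1 * d1 - 4.0 * d2 * d0)             # real, by Theorem 1.4.1
    pairs = []
    for lam in sorted([(-d1 - root) / (2.0 * d2), (-d1 + root) / (2.0 * d2)]):
        # This u satisfies the first row of (C - lam A) u = 0; it is assumed
        # to be non-zero, which holds for the example below.
        u = [C[0][1] - lam * A[0][1], -(C[0][0] - lam * A[0][0])]
        pairs.append((lam, u))
    return pairs


# Hypothetical example pair: C and A are both symmetric and positive definite.
C = [[2.0, -1.0], [-1.0, 2.0]]
A = [[2.0, 0.0], [0.0, 1.0]]
(lam1, u1), (lam2, u2) = eigenpairs_2x2(C, A)

cross_A = bilinear(u1, A, u2)                              # (1.4.10)
cross_C = bilinear(u1, C, u2)                              # (1.4.10)
rayleigh1 = bilinear(u1, C, u1) / bilinear(u1, A, u1)      # (1.4.5)

# Expand an arbitrary vector v in the eigenvectors, (1.4.13)-(1.4.14).
v = [1.0, 0.0]
alphas = [bilinear(u, A, v) / bilinear(u, A, u) for u in (u1, u2)]
rebuilt = [alphas[0] * u1[i] + alphas[1] * u2[i] for i in range(2)]

print(lam1, lam2, cross_A, cross_C, rayleigh1, rebuilt)
```

Up to rounding, the two cross terms should vanish, the Rayleigh Quotient of $\mathbf{u}_1$ should reproduce $\lambda_1$, and `rebuilt` should coincide with `v`.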
In this book we are not concerned with methods for computing eigenvalues and eigenvectors. A simple treatment of the classical techniques may be found in Bishop, Gladwell and Michaelson (1965) [33]. A comprehensive account of modern techniques is given by Golub and Van Loan (1983) [135]. The classical treatise on the symmetric eigenvalue problem is Parlett (1980) [264]. We are concerned only with the qualitative properties of eigenvalues.

Exercises 1.4

1. If
$$\mathbf{A} = \begin{bmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & -1 & 2 & -1 \\ & & -1 & 1 \end{bmatrix}$$
show that $\mathbf{A}$ is positive semi-definite. For what $\mathbf{x}$ is $\mathbf{Ax} = \mathbf{0}$?
2. Show that $\mathbf{A}^{-1}$ is positive definite iff $\mathbf{A}$ is positive definite.
3. Verify the conditions given in Theorem 1.4.2 for $\mathbf{A}$ to be positive definite, when $n = 2$, by writing
$$\mathbf{x}^T\mathbf{Ax} = a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2 = a_{11}\left\{\left(x_1 + \frac{a_{12}}{a_{11}}x_2\right)^2 + \left(\frac{a_{11}a_{22} - a_{12}^2}{a_{11}^2}\right)x_2^2\right\}.$$
Extend the analysis to $n = 3$.
4. Find the eigenvalues and eigenvectors of the pair
$$\mathbf{C} = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}, \qquad \mathbf{A} = \begin{bmatrix} 1 & & \\ & 1 & \\ & & 1 \end{bmatrix}.$$
Hint: replace the eigen-equation by the equivalent recurrence relation $-x_{r-1} + (2 - \lambda)x_r - x_{r+1} = 0$ with appropriate end conditions for $r = 1$, $r = 3$, and seek a solution of the form $x_r = A\cos r\theta + B\sin r\theta$. Generalise this result.
5. Show that if $n = 3$, $\mathbf{A}$ is symmetric, and $D_1, D_2, D_3$ of equation (1.4.6) are all positive, then all the principal minors of $\mathbf{A}$ are positive. Hint: write $a_{11}\det(\mathbf{A})$ as a $2 \times 2$ determinant with elements which are minors of $\mathbf{A}$ of order 2. This is a particular case of a general result, see e.g., Gantmacher (1959) [97].
6. Show that the real symmetric matrix $\mathbf{A}$ has positive eigenvalues iff it is positive-definite.
7. Take
C=
·
1 1 1 1
¸
> A=
·
1 1 1 1
¸
=
The eigenvalues are not real. Where does the argument used in the proof of Theorem 1.4.1 break down? 8. Take C=
$$\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
Show that equation (1.4.1) has only one eigenvalue and one eigenvector, so that the eigenvectors do not span the space $V_2$. This is the kind of difficulty attending the non-symmetric eigenvalue problem.
Chapter 2
Vibrations of Discrete Systems

Our nature consists in motion; complete rest is death.
Pascal's Pensées, 129
2.1 Introduction
The formulation and solution of the equations governing the motion of a discrete vibrating system, i.e., one which has a finite number of degrees of freedom, have been fully considered elsewhere. See for example, Bishop and Johnson (1960) [34], Bishop, Gladwell and Michaelson (1965) [33], Meirovich (1975) [234]. In this chapter we shall give a brief account of those parts of the theory that will be needed for the solution of inverse problems. Throughout this book we shall be concerned with the infinitesimal vibration of a conservative system about some datum configuration, which will usually be an equilibrium position. Before embarking on a general discussion we shall first formulate the equations of motion for some simple vibrating systems.
2.2 Vibration of some simple systems
Figure 2.2.1 shows a vibrating system consisting of $n$ masses connected by linear springs of stiffnesses $(k_r)_1^n$. The whole lies in a straight line on a smooth horizontal table and is excited by forces $(F_r(t))_1^n$. Newton's equations of motion for the system are
$$m_r\ddot{u}_r = F_r + \theta_{r+1} - \theta_r, \qquad r = 1, 2, \ldots, n - 1, \tag{2.2.1}$$
$$m_n\ddot{u}_n = F_n - \theta_n, \tag{2.2.2}$$
where $\dot{}$ denotes differentiation with respect to time. Hooke's law states that the spring forces are given by
$$\theta_r = k_r(u_r - u_{r-1}), \qquad r = 1, 2, \ldots, n. \tag{2.2.3}$$
If the left hand end is pinned then
$$u_0 = 0. \tag{2.2.4}$$
Forced vibration analysis concerns the solution of these equations for given forcing functions $F_r(t)$. Free vibration analysis consists in finding solutions to the equations which require no external excitation, i.e., $F_r(t) \equiv 0$, $r = 1, 2, \ldots, n$, and which satisfy the stated end conditions.
Figure 2.2.1 - $n$ masses connected by springs

The system shown in Figure 2.2.1 has considerable engineering importance. It is the simplest possible discrete model for a rod vibrating in longitudinal motion. Here the masses and stiffnesses are obtained by lumping the continuously distributed mass and stiffness of the rod.

Equations (2.2.1) - (2.2.4) also describe the torsional vibrations of the system shown in Figure 2.2.2, provided that the $u_r$, $k_r$, $m_r$ are interpreted as torsional rotations, torsional stiffnesses and moments of inertia respectively. Such a discrete system provides a simple model for the torsional vibrations of a rod with a continuous distribution of inertia and stiffness.

There is a third system which is mathematically equivalent to equations (2.2.1) - (2.2.4). This is the transverse motion of the string shown in Figure 2.2.3 which is pulled taut by a tension $T$ and which is loaded by masses $(m_r)_1^n$. (But note that the string shown in Figure 2.2.3 has its right hand end fixed, rather than free, as in Figures 2.2.1 and 2.2.2. In order to simulate a string with a free end, the last segment of the string must be attached to a massless ring that slides on a smooth vertical rod.) If, in accordance with the assumption of infinitesimal vibration, the string departs very little from the straight line equilibrium position, then the equation governing the motion of mass $m_r$ may be derived by considering Figure 2.2.4.
Figure 2.2.2 - A torsionally vibrating system

Figure 2.2.3 - $n$ masses on a taut string

Figure 2.2.4 - The forces acting on the mass $m_r$
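Newton's equation of motion yields
$$m_r\ddot{u}_r = F_r + T\sin\alpha_{r+1} - T\sin\alpha_r \tag{2.2.5}$$
$$\phantom{m_r\ddot{u}_r} = F_r + \theta_{r+1} - \theta_r, \tag{2.2.6}$$
where, for small deflections, we may take $\sin\alpha_r = \alpha_r$,
$$\theta_r = T\alpha_r = k_r(u_r - u_{r-1}), \qquad k_r = T/\ell_r.$$
In order to express equations (2.2.1) - (2.2.3) in matrix form we use (2.2.3) to obtain
$$m_r\ddot{u}_r = F_r + k_{r+1}u_{r+1} - (k_{r+1} + k_r)u_r + k_ru_{r-1},$$
$$m_n\ddot{u}_n = F_n - k_nu_n + k_nu_{n-1},$$
which yields
$$\begin{bmatrix} m_1 & & & \\ & m_2 & & \\ & & \ddots & \\ & & & m_n \end{bmatrix}\begin{bmatrix} \ddot{u}_1 \\ \ddot{u}_2 \\ \vdots \\ \ddot{u}_n \end{bmatrix} + \begin{bmatrix} k_1 + k_2 & -k_2 & 0 & \cdots & 0 & 0 \\ -k_2 & k_2 + k_3 & -k_3 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -k_n & k_n \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{bmatrix}. \tag{2.2.7}$$
This equation may be written
$$\mathbf{M}\ddot{\mathbf{u}} + \mathbf{Ku} = \mathbf{F} \tag{2.2.8}$$
where the matrices $\mathbf{M}$, $\mathbf{K}$ are called respectively the inertia (or mass) and the stiffness matrices of the system. Note that both $\mathbf{M}$ and $\mathbf{K}$ are symmetric; this is a property shared by the matrices corresponding to any conservative system. We note also that both $\mathbf{M}$, $\mathbf{K}$ are positive-definite. In this particular example the matrix $\mathbf{M}$ is diagonal while $\mathbf{K}$ is tridiagonal, i.e., its only non-zero elements are on the principal diagonal and the two neighbouring diagonals, called the codiagonals.

Equation (2.2.3) can also be constructed by introducing $\boldsymbol{\theta} = \{\theta_1, \theta_2, \ldots, \theta_n\}$ and noting that
$$\begin{bmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{bmatrix} = \begin{bmatrix} k_1 & & & \\ & k_2 & & \\ & & \ddots & \\ & & & k_n \end{bmatrix}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ -1 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -1 & 1 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$
which will be written
$$\boldsymbol{\theta} = \hat{\mathbf{K}}\mathbf{E}^T\mathbf{u}, \tag{2.2.9}$$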
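where
$$\mathbf{E} = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & -1 \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix}, \qquad \mathbf{E}^{-1} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 0 & 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}_{\text{(upper triangular)}} \tag{2.2.10}$$
(that is, $\mathbf{E}^{-1}$ has every element on and above the principal diagonal equal to 1, and zeros below it), and $\hat{\mathbf{K}} = \mathrm{diag}(k_1, k_2, \ldots, k_n)$. Using the matrix $\mathbf{E}$, we may write equations (2.2.1) - (2.2.2) in the form
$$\mathbf{M}\ddot{\mathbf{u}} = -\mathbf{E}\boldsymbol{\theta} + \mathbf{F},$$
so that on using (2.2.9) we find
$$\mathbf{M}\ddot{\mathbf{u}} + \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{u} = \mathbf{F}, \tag{2.2.11}$$
and
$$\mathbf{K} = \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T. \tag{2.2.12}$$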
For free vibration analysis there are two important end conditions. The right hand end may be free, in which case there is no restriction on the $(u_i)_1^n$, or it may be fixed, in which case $u_n = 0$.
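The factorisation (2.2.12) can be checked numerically against the tridiagonal matrix displayed in (2.2.7). The sketch below forms $\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T$ for a short chain with the right hand end free; the spring stiffnesses are hypothetical values and the function names are introduced here for illustration.

```python
def bidiagonal_E(n):
    """The matrix E of (2.2.10): 1 on the diagonal, -1 on the superdiagonal."""
    return [[1 if j == i else -1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def stiffness_matrix(k):
    """K = E K_hat E^T of (2.2.12), with K_hat = diag(k_1, ..., k_n)."""
    n = len(k)
    E = bidiagonal_E(n)
    K_hat = [[k[i] if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(matmul(E, K_hat), transpose(E))


# Hypothetical stiffnesses k_1, k_2, k_3 for a three-mass chain.
k = [1, 2, 3]
K = stiffness_matrix(k)
for row in K:
    print(row)
```

For three springs the rows printed should be $(k_1 + k_2, -k_2, 0)$, $(-k_2, k_2 + k_3, -k_3)$ and $(0, -k_3, k_3)$, in agreement with (2.2.7).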
Exercises 2.2

1. Verify that the stiffness matrix in equation (2.2.7) satisfies the conditions of Theorem 1.4.2. Obtain a proof that applies to principal minors of any order $i$, such that $1 \leq i \leq n$.
2. Consider the multiple pendulum of Figure 2.2.5.

Figure 2.2.5 - A compound pendulum made up of $n$ inextensible strings

Show that the kinetic and potential energies of the system for small oscillations are given by
$$2T = m_1\dot{y}_1^2 + m_2\dot{y}_2^2 + \cdots + m_n\dot{y}_n^2,$$
$$2V = \sigma_1\frac{y_1^2}{\ell_1} + \sigma_2\frac{(y_2 - y_1)^2}{\ell_2} + \cdots + \sigma_n\frac{(y_n - y_{n-1})^2}{\ell_n},$$
where $\sigma_r = g\sum_{s=r}^{n} m_s$.

2.3 Transverse vibration of a beam
Figure 2.3.1 shows a simple discrete model for the transverse vibration of a beam; it consists of $n + 2$ masses $(m_r)_{-1}^{n}$ linked by massless rigid rods of lengths $(\ell_r)_0^n$ which are themselves connected by $n$ rotational springs of stiffnesses $(k_r)_1^n$. The mass and stiffness of the beam, which are actually distributed along the length, have been lumped at $n + 2$ points.

The discrete system is governed by a set of four first-order difference equations, which may be deduced from Figure 2.3.2.
Figure 2.3.1 - A discrete model of a vibrating beam
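Figure 2.3.2 - The configuration around $m_r$

For small displacements, the rotations are
$$\theta_r = (u_r - u_{r-1})/\ell_r, \qquad r = 0, 1, \ldots, n.$$
If the $r$th spring has rotational stiffness $k_r$, then the moment $\tau_r$ needed to produce a relative rotation $\theta_{r+1} - \theta_r$ of the two rigid rods on either side of $m_r$ is
$$\tau_r = k_{r+1}(\theta_{r+1} - \theta_r), \qquad r = 0, 1, \ldots, n - 1.$$
Equilibrium of the rod linking $m_r$ and $m_{r+1}$ yields the shearing forces
$$\phi_r = (\tau_r - \tau_{r+1})/\ell_{r+1}, \qquad r = -1, 0, \ldots, n - 1,$$
while Newton's equation of motion for mass $m_r$ is
$$m_r\ddot{u}_r = \phi_r - \phi_{r-1}, \qquad r = -1, 0, \ldots, n.$$
Here $\phi_{-2}$, $\phi_n$ and $\tau_{-1}$, $\tau_n$ denote external shearing forces and bending moments, respectively, applied to the ends.

Suppose that the left hand end is clamped so that $u_{-1} = 0 = u_0$; then only the masses $(m_r)_1^n$ move, and the governing equations may be written
$$\boldsymbol{\theta} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}, \tag{2.3.2}$$
$$\boldsymbol{\tau} = \hat{\mathbf{K}}\mathbf{E}^T\boldsymbol{\theta}, \tag{2.3.3}$$
$$\boldsymbol{\phi} = \mathbf{L}^{-1}\mathbf{E}\boldsymbol{\tau} - \ell_n^{-1}\tau_n\mathbf{e}_n, \tag{2.3.4}$$
$$\mathbf{M}\ddot{\mathbf{u}} = -\mathbf{E}\boldsymbol{\phi} + \phi_n\mathbf{e}_n, \tag{2.3.5}$$
where $\mathbf{u} = \{u_1, u_2, \ldots, u_n\}$, $\boldsymbol{\theta} = \{\theta_1, \theta_2, \ldots, \theta_n\}$, $\boldsymbol{\tau} = \{\tau_0, \tau_1, \ldots, \tau_{n-1}\}$, $\boldsymbol{\phi} = \{\phi_0, \phi_1, \ldots, \phi_{n-1}\}$, $\hat{\mathbf{K}} = \mathrm{diag}(k_r)$, $\mathbf{L} = \mathrm{diag}(\ell_r)$, $\mathbf{M} = \mathrm{diag}(m_r)$, $\mathbf{e}_n = \{0, 0, \ldots, 0, 1\}$ and $\mathbf{E}$ is given in equation (2.2.10). Equations (2.3.2) - (2.3.5) may be combined to give
$$\mathbf{M}\ddot{\mathbf{u}} + \mathbf{Ku} = \phi_n\mathbf{e}_n + \ell_n^{-1}\tau_n\mathbf{Ee}_n, \tag{2.3.6}$$
where
$$\mathbf{K} = \mathbf{EL}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T. \tag{2.3.7}$$
This equation has the same general form as equation (2.2.8). We note that $\mathbf{M}$ and $\mathbf{K}$ are again symmetric and positive-definite, $\mathbf{M}$ being diagonal, and $\mathbf{K}$ being pentadiagonal.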
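The structure claimed for the beam stiffness matrix can be checked by forming the product (2.3.7) for a small model. The following sketch uses exact rational arithmetic and assumes that `k` holds $(k_r)_1^n$ and `lengths` holds $(\ell_r)_1^n$ (the rod $\ell_0$ does not enter once the left hand end is clamped); the numerical values and function names are hypothetical, chosen here for illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    return [list(col) for col in zip(*X)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


def bidiagonal_E(n):
    """The matrix E of (2.2.10): 1 on the diagonal, -1 on the superdiagonal."""
    return [[1 if j == i else -1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def beam_stiffness(k, lengths):
    """K = E L^-1 E K_hat E^T L^-1 E^T, equation (2.3.7)."""
    n = len(k)
    E = bidiagonal_E(n)
    Et = transpose(E)
    L_inv = diag([1 / Fraction(l) for l in lengths])
    K_hat = diag([Fraction(v) for v in k])
    K = E
    for factor in (L_inv, E, K_hat, Et, L_inv, Et):
        K = matmul(K, factor)  # multiply the factors in the order of (2.3.7)
    return K


# Uniform hypothetical model: unit stiffnesses and unit rod lengths.
K_uniform = beam_stiffness([1, 1, 1], [1, 1, 1])

# Non-uniform hypothetical model with five moving masses.
K = beam_stiffness([3, 1, 4, 1, 5], [2, 1, 2, 1, 2])
n = len(K)
is_symmetric = K == transpose(K)
is_pentadiagonal = all(K[i][j] == 0 for i in range(n) for j in range(n) if abs(i - j) > 2)

print(K_uniform)
print(is_symmetric, is_pentadiagonal)
```

Both flags should be true: symmetry follows because (2.3.7) has the form $\mathbf{G}\hat{\mathbf{K}}\mathbf{G}^T$ with $\mathbf{G} = \mathbf{EL}^{-1}\mathbf{E}$, and the band structure follows because $\mathbf{G}$ has only three non-zero diagonals.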
2.4 Generalised coordinates and Lagrange's equations: the rod
The idea that a discrete system is one composed of a finite number of masses connected by springs is unnecessarily restrictive. The general concept is that of a system whose motion is specified by $n$ generalised coordinates $(q_r)_1^n$ that are functions of time $t$ alone. The systems considered in Sections 2.2, 2.3 are indeed discrete in this sense and the generalised coordinates corresponding to the system in Figure 2.2.1 are $(u_r)_1^n$. However, the more general concept would also cover, for instance, a model of a non-uniform longitudinally vibrating rod constructed by using the finite element method (see for example, Zienkiewicz (1971) [343], Strang and Fix (1973) [311]). In such a model, shown in Figure 2.4.1, the rod is first divided into $n + 1$ elements. In the $r$th element, shown in Figure 2.4.2, the longitudinal displacement $y(x, t)$ is taken to have a simple linear form:
$$y(x, t) = y_r(t)(1 - \xi) + y_{r+1}(t)\xi, \qquad x_r \leq x \leq x_{r+1}, \tag{2.4.1}$$
where $\xi = (x - x_r)/\ell_r$ runs from 0 at the left hand end of the element to 1 at the right. Equations (2.4.1) with $r = 0, 1, \ldots, n$ express the displacement at every point of the rod in terms of the $n + 2$ generalised coordinates $(y_r)_0^{n+1}$. When the end conditions are imposed there will be, as before, only $n$ coordinates $(y_r)_1^n$.
Figure 2.4.1 - A rod divided into elements

Figure 2.4.2 - One element of the rod (labels: $A_r$, $x_r$, $x_{r+1}$)

When the finite element method is used, it is not possible to set up the equations of motion by using Newton’s equations of motion, for there is no actual ‘mass’ to which forces are applied. Instead we may use Lagrange’s equations. For a conservative system with kinetic energy $T$ and potential or strain energy $V$, which are functions of $n$ coordinates $(q_r)_1^n$, Lagrange’s equations state that
$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_r}\right) - \frac{\partial T}{\partial q_r} + \frac{\partial V}{\partial q_r} = Q_r, \quad (r = 1, 2, \ldots, n). \tag{2.4.2}$$
Here $Q_r$ is the generalised force corresponding to $q_r$ in the sense that the work done by external forces acting on the system when the system is displaced from a configuration specified by $(q_r)_1^n$ to one specified by $(q_r + \delta q_r)_1^n$ is
$$\delta W_e = \sum_{r=1}^{n} Q_r\,\delta q_r.$$
For the system shown in Figure 2.2.1 the kinetic and potential energies are
$$T = \frac{1}{2}\sum_{r=1}^{n} m_r\dot{y}_r^2, \quad V = \frac{1}{2}\sum_{r=1}^{n} k_r(y_{r+1} - y_r)^2, \tag{2.4.3}$$
and $Q_r = F_r(t)$. Thus
$$\frac{\partial T}{\partial \dot{y}_r} = m_r\dot{y}_r, \quad \frac{\partial V}{\partial y_r} = -k_r(y_{r+1} - y_r) + k_{r-1}(y_r - y_{r-1}),$$
and equation (2.4.2) yields (2.2.1). For the finite element model of Figure 2.4.1, the kinetic and potential energies of the system will be
$$T = \frac{1}{2}\int_0^{\ell} A\rho\,[\dot{y}(x, t)]^2\,dx, \quad V = \frac{1}{2}\int_0^{\ell} AE\left[\frac{\partial y}{\partial x}(x, t)\right]^2 dx,$$
where $A(x)$, $\rho(x)$, $E(x)$ are the (possibly variable) cross-sectional area, density and Young’s modulus of the rod. On inserting the assumed form of $y(x, t)$ given in (2.4.1) we find
$$T = \frac{1}{2}\sum_{r=0}^{n}\int_0^1 A(x_r + \xi\ell_r)\,\rho(x_r + \xi\ell_r)\,[\dot{y}_r(1 - \xi) + \dot{y}_{r+1}\xi]^2\,\ell_r\,d\xi, \tag{2.4.4}$$
$$V = \frac{1}{2}\sum_{r=0}^{n}\int_0^1 A(x_r + \xi\ell_r)\,E(x_r + \xi\ell_r)\,[y_{r+1} - y_r]^2\,\ell_r^{-1}\,d\xi. \tag{2.4.5}$$
On carrying out the integrations, perhaps numerically if $A(x)$, $\rho(x)$, $E(x)$ are variable, we may write
$$T = \frac{1}{2}\sum_{r=0}^{n+1}\sum_{s=0}^{n+1} m_{rs}\dot{y}_r\dot{y}_s, \tag{2.4.6}$$
$$V = \frac{1}{2}\sum_{r=0}^{n+1}\sum_{s=0}^{n+1} k_{rs}y_r y_s. \tag{2.4.7}$$
If the rod is fixed at both ends, then
$$y_0 = 0 = y_{n+1}, \tag{2.4.8}$$
so that all the sums in (2.4.6), (2.4.7) run from 1 to $n$. In this case
$$\frac{\partial T}{\partial \dot{y}_r} = \sum_{s=1}^{n} m_{rs}\dot{y}_s, \quad \frac{\partial V}{\partial y_r} = \sum_{s=1}^{n} k_{rs}y_s,$$
and equation (2.4.2) yields the following equation for free vibration:
$$\sum_{s=1}^{n} m_{rs}\ddot{y}_s + \sum_{s=1}^{n} k_{rs}y_s = 0, \quad (r = 1, 2, \ldots, n).$$
This equation may, as before, be condensed into the matrix equation
$$\mathbf{M}\ddot{\mathbf{y}} + \mathbf{K}\mathbf{y} = \mathbf{0}. \tag{2.4.9}$$
We note that, for the rod with the kinetic and potential energies given by (2.4.6), (2.4.7), the matrices $\mathbf{M}$, $\mathbf{K}$ are symmetric, tridiagonal matrices with sign properties. They are tridiagonal because $m_{rs}$, $k_{rs}$ are zero unless $r = s$ or $r = s \pm 1$. The sign properties may be deduced from (2.4.4), (2.4.5): the codiagonal elements $m_{r,r+1}$, $m_{r,r-1}$ of $\mathbf{M}$ are positive, while $k_{r,r+1}$, $k_{r,r-1}$ are negative. Thus
$$\mathbf{M} = \begin{bmatrix} a_1 & b_1 & & \\ b_1 & a_2 & \ddots & \\ & \ddots & \ddots & b_{n-1} \\ & & b_{n-1} & a_n \end{bmatrix}, \quad \mathbf{K} = \begin{bmatrix} c_1 & -d_1 & & \\ -d_1 & c_2 & \ddots & \\ & \ddots & \ddots & -d_{n-1} \\ & & -d_{n-1} & c_n \end{bmatrix}. \tag{2.4.10}$$
These sign properties of $\mathbf{M}$, $\mathbf{K}$ will later be shown to have important implications for the qualitative properties of a vibrating rod.

On the basis of these examples we now pass to the general case. For a conservative system with generalised coordinates $(q_r)_1^n$ which specify small displacements from a position of stable equilibrium, the kinetic and potential energies will have the form
$$T = \frac{1}{2}\sum_{r=1}^{n}\sum_{s=1}^{n} m_{rs}\dot{q}_r\dot{q}_s, \tag{2.4.11}$$
$$V = \frac{1}{2}\sum_{r=1}^{n}\sum_{s=1}^{n} k_{rs}q_r q_s, \tag{2.4.12}$$
where the matrices $\mathbf{M} = (m_{rs})$ and $\mathbf{K} = (k_{rs})$ are symmetric, in that
$$m_{sr} = m_{rs}, \quad k_{sr} = k_{rs}.$$
The equations governing free vibration may be written
$$\mathbf{M}\ddot{\mathbf{q}} + \mathbf{K}\mathbf{q} = \mathbf{0}. \tag{2.4.13}$$
We note that equations (2.4.11), (2.4.12) may be written
$$T = \frac{1}{2}\dot{\mathbf{q}}^T\mathbf{M}\dot{\mathbf{q}}, \quad V = \frac{1}{2}\mathbf{q}^T\mathbf{K}\mathbf{q}. \tag{2.4.14}$$
It is not possible for any arbitrarily chosen symmetric matrix $\mathbf{M}$ to be an inertia matrix, because the kinetic energy $T$ is an essentially positive quantity, i.e., it is always positive except when each of the $\dot{q}_r$ is zero, in which case it is zero. Thus $\mathbf{M}$ must be positive definite (see Section 1.4). The restrictions on the matrix $\mathbf{K}$ are slightly less severe since, although the strain energy will always be positive or zero, it will actually be zero if the system has a rigid-body displacement. Notice, for example, that the $V$ of (2.4.5) will be zero if $\mathbf{y}$ is the rigid body displacement
$$y_0 = y_1 = \cdots = y_n = y_{n+1}.$$
This will be a possible displacement of the system in Figure 2.2.1 only if both ends are free. We conclude that if the system is not constrained so that one point is fixed, then $\mathbf{K}$ is positive semi-definite.
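The assembly described by (2.4.4) - (2.4.8) can be sketched in a few lines. This is a minimal illustration rather than the author's procedure: the function name `rod_matrices`, the use of Gauss-Legendre quadrature for the element integrals, and the uniform test rod are all assumptions made for the example.

```python
import numpy as np

def rod_matrices(x, A, rho, E, n_gauss=4):
    """Assemble M, K for nodes x[0..n+1], then impose y_0 = y_{n+1} = 0."""
    x = np.asarray(x, dtype=float)
    N = len(x)                                   # n + 2 nodes, n + 1 elements
    M = np.zeros((N, N))
    K = np.zeros((N, N))
    g, w = np.polynomial.legendre.leggauss(n_gauss)
    xi, w = 0.5 * (g + 1.0), 0.5 * w             # map Gauss points from [-1, 1] to [0, 1]
    for r in range(N - 1):
        l = x[r + 1] - x[r]                      # element length
        xs = x[r] + xi * l                       # physical positions of the Gauss points
        shape = np.vstack([1.0 - xi, xi])        # linear shape functions of (2.4.1)
        Me = l * (shape * (w * A(xs) * rho(xs))) @ shape.T      # integrand of (2.4.4)
        ke = np.sum(w * A(xs) * E(xs)) / l                      # integrand of (2.4.5)
        M[r:r + 2, r:r + 2] += Me
        K[r:r + 2, r:r + 2] += ke * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return M[1:-1, 1:-1], K[1:-1, 1:-1]          # drop the two fixed end coordinates

# Uniform rod of unit length, five equal elements, unit A, rho and E (illustrative values).
one = lambda s: np.ones_like(s)
M, K = rod_matrices(np.linspace(0.0, 1.0, 6), A=one, rho=one, E=one)
```

For this uniform case each element contributes $(\ell/6)\begin{bmatrix}2 & 1\\ 1 & 2\end{bmatrix}$ to the mass matrix and $(1/\ell)\begin{bmatrix}1 & -1\\ -1 & 1\end{bmatrix}$ to the stiffness matrix, so the assembled matrices are tridiagonal with positive codiagonal in `M` and negative codiagonal in `K`, in line with (2.4.10).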
Exercises 2.4

1. Use equations (2.4.4), (2.4.5) to evaluate the mass and stiffness matrices for a uniform rod in longitudinal vibration subject to the end conditions (2.4.8).

2. Use the form (2.4.5) of the strain energy of the rod to show that the stiffness matrix $\mathbf{K}$ for a rod fixed at the left and free at the right has the form
$$\mathbf{K} = \begin{bmatrix} k_1 + k_2 & -k_2 & & & \\ -k_2 & k_2 + k_3 & -k_3 & & \\ & \ddots & \ddots & \ddots & \\ & & -k_{n-1} & k_{n-1} + k_n & -k_n \\ & & & -k_n & k_n \end{bmatrix}.$$

2.5 Vibration of a membrane and an acoustic cavity
Over the last three or four decades, computational vibration analysis has developed to such an extent that it can analyse the vibration of almost anything: rods, beams, plates, trusses, steel and concrete buildings, bridges, aircraft, and so on. Inverse vibration analysis in the strict form we consider in this book can hope to encompass only comparatively simple structures: strings, rods, beams, membranes and acoustic cavities and, even now, inverse problems for membranes and cavities are still open; all we can do is find some qualitative properties of the vibration. The vibrations of a membrane and of an acoustic cavity are mathematically similar: both involve just one scalar quantity, the transverse
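displacement $u(x, y)$, for the membrane under unit tension; the excess pressure $p(x, y, z)$, for the acoustic cavity. Both are governed by a wave equation:
$$\Delta u = \rho\frac{\partial^2 u}{\partial t^2}, \quad \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}, \tag{2.5.1}$$
for a membrane with mass density $\rho(x, y)$, and
$$\Delta p = \frac{\partial^2 p}{\partial t^2}, \quad \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}, \tag{2.5.2}$$
for the acoustic cavity. To set up the finite element model (FEM) of a membrane we consider the energies
$$T = \frac{1}{2}\iint_D \rho\,\dot{u}^2\,dx\,dy, \tag{2.5.3}$$
$$V = \frac{1}{2}\iint_D (\nabla u)^2\,dx\,dy. \tag{2.5.4}$$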
The simplest FEM is based on triangulation. For an arbitrary triangular element $P_1, P_2, P_3$ as shown in Figure 2.5.1, we take
$$u(x, y) = a + bx + cy. \tag{2.5.5}$$

Figure 2.5.1 - A triangular finite element (vertices $P_1, P_2, P_3$ with nodal values $u_1, u_2, u_3$)

If $u$ takes the values $u_1, u_2, u_3$ at the vertices $P_1, P_2, P_3$ respectively, then
$$u_i = a + bx_i + cy_i, \quad i = 1, 2, 3. \tag{2.5.6}$$
We can solve these equations for $a, b, c$ and hence express $T, V$ for one element, i.e., $T_e, V_e$, as quadratic forms
$$T_e = \frac{1}{2}\dot{\mathbf{u}}_e^T\mathbf{M}_e\dot{\mathbf{u}}_e, \tag{2.5.7}$$
$$V_e = \frac{1}{2}\mathbf{u}_e^T\mathbf{K}_e\mathbf{u}_e, \tag{2.5.8}$$
with coefficients which are functions of the coordinates $(x_i, y_i)$, $i = 1, 2, 3$. We are not particularly interested in the magnitudes of the coefficients; we are more interested in their signs. First we investigate the elements of $\mathbf{K}_e$. Equations (2.5.6) give
$$b\Delta = u_1(y_2 - y_3) + u_2(y_3 - y_1) + u_3(y_1 - y_2),$$
$$c\Delta = -\{u_1(x_2 - x_3) + u_2(x_3 - x_1) + u_3(x_1 - x_2)\},$$
where
$$\Delta = \begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix} = \pm 2\,\mathrm{area}(P_1P_2P_3).$$
Since $(\nabla u)^2 = b^2 + c^2$ is constant over the element, $V_e = \frac{1}{4}|\Delta|(b^2 + c^2)$, and the coefficient of, say, $u_1u_2$ in $V_e$ is
$$-\{(x_3 - x_1)(x_3 - x_2) + (y_3 - y_1)(y_3 - y_2)\}/(2|\Delta|) = -|P_1P_3|\cdot|P_2P_3|\cos\theta/(2|\Delta|),$$
where $\theta$ is the angle of the triangle at $P_3$. Users of finite element methods have found that compact, i.e., acute angled, triangles give more accurate computational results than elongated triangles that have an obtuse angle. If the triangle has all its angles acute, then $k_{12,e}$, $k_{23,e}$ and $k_{31,e}$ are all negative: $\mathbf{K}_e$ has the sign pattern
$$\mathbf{K}_e = \begin{bmatrix} + & - & - \\ - & + & - \\ - & - & + \end{bmatrix}. \tag{2.5.9}$$
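The dependence of the signs in $\mathbf{K}_e$ on the angles of the triangle can be checked directly. The following sketch is illustrative: the function name `triangle_stiffness` and the three sample triangles are hypothetical, and the element matrix is computed from the linear interpolation (2.5.5) with unit tension.

```python
def triangle_stiffness(p1, p2, p3):
    """Element stiffness matrix for V_e = 1/2 * integral of |grad u|^2 over the triangle."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    delta = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)   # = +/- 2 * area
    beta = [y2 - y3, y3 - y1, y1 - y2]     # b * delta = sum(beta[i] * u[i])
    gamma = [x3 - x2, x1 - x3, x2 - x1]    # c * delta = sum(gamma[i] * u[i])
    scale = 1.0 / (2.0 * abs(delta))
    return [[scale * (beta[i] * beta[j] + gamma[i] * gamma[j]) for j in range(3)]
            for i in range(3)]

acute = triangle_stiffness((0.0, 0.0), (1.0, 0.0), (0.5, 0.8))
right = triangle_stiffness((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))     # right angle at P1
obtuse = triangle_stiffness((0.0, 0.0), (1.0, 0.0), (-0.5, 0.5))   # obtuse angle at P1
```

In this sketch the entry linking the two vertices opposite an angle is negative when that angle is acute, zero when it is a right angle, and positive when it is obtuse, which is why the sign pattern (2.5.9) needs all three angles acute.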
To find the signs of the coefficients in $T_e$, it is convenient to write (2.5.7) in terms of the areal coordinates, $\phi_i(x, y)$, of the triangle; if $P$ is an arbitrary point of the triangle, then
$$u(x, y) = u_1\phi_1(x, y) + u_2\phi_2(x, y) + u_3\phi_3(x, y),$$
where
$$\phi_1 = \frac{\mathrm{area}(PP_2P_3)}{\mathrm{area}(P_1P_2P_3)}, \quad \phi_2 = \frac{\mathrm{area}(PP_3P_1)}{\mathrm{area}(P_1P_2P_3)}, \quad \phi_3 = \frac{\mathrm{area}(PP_1P_2)}{\mathrm{area}(P_1P_2P_3)},$$
as shown in Figure 2.5.2.

Figure 2.5.2 - $P_1P_2P_3$ is split into three triangles by the point $P(x, y)$

Since $\phi_1, \phi_2, \phi_3$ are all positive when $P$ is inside the triangle $P_1P_2P_3$, all the coefficients in $T_e$ are positive: $\mathbf{M}_e$ has the form
$$\mathbf{M}_e = \begin{bmatrix} + & + & + \\ + & + & + \\ + & + & + \end{bmatrix}. \tag{2.5.10}$$
Now we assemble the element matrices to form the global mass and stiffness matrices. The membrane is replaced by an assembly of triangles $\Delta_i$ with vertices $P_i$ and edges $P_iP_j$ as shown in Figure 2.5.3. The boundary condition $u = 0$ is imposed on the outer vertices labelled ‘0’.
Figure 2.5.3 - An assembly of triangular elements (interior vertices labelled 1 to 6, boundary vertices labelled 0)

Figure 2.5.4 - The angles between outward drawn normals to the faces are all obtuse (tetrahedron $P_1P_2P_3P_4$ with face normals $n_1$, $n_3$)

For this particular configuration, the matrices $\mathbf{M}$, $\mathbf{K}$ have the sign patterns

[Equation (2.5.11): $6 \times 6$ sign patterns of $\mathbf{M}$ and $\mathbf{K}$ for the mesh of Figure 2.5.3]

We note that if $i \ne j$, then $m_{ij} > 0$, $k_{ij} < 0$ iff $P_i$, $P_j$ are the ends of an edge $P_iP_j$ of the mesh.

The finite element analysis of a 3D acoustic cavity proceeds in a similar way. The elements are taken to be tetrahedra, and the pressure $p(x, y, z)$ is taken as
$$p(x, y, z) = a + bx + cy + dz \tag{2.5.12}$$
in each tetrahedron. Now it is found (Zhu (2000) [342], Gladwell and Zhu (2002) [131]) that if the angles between the normals to the faces are all obtuse, as shown in Figure 2.5.4, then the element mass and stiffness matrices have the form
$$\mathbf{M}_e = \begin{bmatrix} + & + & + & + \\ + & + & + & + \\ + & + & + & + \\ + & + & + & + \end{bmatrix}, \quad \mathbf{K}_e = \begin{bmatrix} + & - & - & - \\ - & + & - & - \\ - & - & + & - \\ - & - & - & + \end{bmatrix}. \tag{2.5.13}$$
This means that when the matrices are assembled they have the same kind of sign pattern as before: if $i \ne j$ then $m_{ij} > 0$, $k_{ij} < 0$ iff $P_iP_j$ is an edge of the mesh.
Applying Lagrange’s equation to the energies
$$T = \frac{1}{2}\dot{\mathbf{u}}^T\mathbf{M}\dot{\mathbf{u}}, \quad V = \frac{1}{2}\mathbf{u}^T\mathbf{K}\mathbf{u},$$
we find the equation governing the vibration as
$$\mathbf{M}\ddot{\mathbf{u}} + \mathbf{K}\mathbf{u} = \mathbf{0}. \tag{2.5.14}$$

2.6 Natural frequencies and normal modes
The matrix equation (2.4.13) represents a set of second order equations with constant coefficients. Following usual practice we seek the solution in the form
$$\mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \\ \cdot \\ q_n \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ \cdot \\ x_n \end{bmatrix}\sin(\omega t + \phi), \tag{2.6.1}$$
where the constants $x_r$, frequency $\omega$ and phase angle $\phi$ are to be determined. When $\mathbf{q}$ has the form (2.6.1), then
$$\ddot{\mathbf{q}} = -\omega^2\mathbf{q} = -\omega^2\mathbf{x}\sin(\omega t + \phi), \tag{2.6.2}$$
so that equation (2.4.13) demands that
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{x} = \mathbf{0}, \quad \lambda = \omega^2. \tag{2.6.3}$$
This is the eigenvalue equation (1.4.1) and, since $\mathbf{M}$ is positive-definite and $\mathbf{K}$ is either positive semi-definite or positive-definite, the whole of the analysis developed in Section 1.4 can be used here. Thus the equation has $n$ eigenvalues $(\lambda_i)_1^n$ satisfying
$$0 \le \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n, \tag{2.6.4}$$
and $n$ corresponding eigenvectors $(\mathbf{x}_i)_1^n$ satisfying
$$(\mathbf{K} - \lambda_i\mathbf{M})\mathbf{x}_i = \mathbf{0}. \tag{2.6.5}$$
The frequencies $\omega_i = (\lambda_i)^{1/2}$ are called the natural frequencies of the system, and the eigenvectors are called the normal or principal modes. Note the distinction between $x_i$, a scalar, and $\mathbf{x}_i$, a vector.

In order to become acquainted with the properties of natural frequencies and normal modes we shall consider the system specified by equation (2.2.7) and, to simplify the algebra, shall assume that
$$(m_r)_1^n = m, \quad (k_r)_1^n = k. \tag{2.6.6}$$
In this case the eigenvalue equation may be written
$$\begin{bmatrix} 2 - \lambda & -1 & 0 & \cdots & 0 \\ -1 & 2 - \lambda & -1 & \cdots & 0 \\ \cdot & \cdot & \cdot & \cdots & \cdot \\ 0 & \cdots & -1 & 2 - \lambda & -1 \\ 0 & \cdots & 0 & -1 & 1 - \lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \cdot \\ x_{n-1} \\ x_n \end{bmatrix} = \mathbf{0}, \tag{2.6.7}$$
where
$$\lambda = m\omega^2/k. \tag{2.6.8}$$
To solve for the $x_r$ we use the idea suggested in Exercise 1.4.4, namely to write (2.6.7) as the recurrence relation
$$-x_{r-1} + (2 - \lambda)x_r - x_{r+1} = 0, \quad (r = 1, 2, \ldots, n). \tag{2.6.9}$$
The first of equations (2.6.7) may be written in this form if $x_0$ is taken to be zero; this may be interpreted as stating that the left hand mass ($m_0$) is fixed. On the other hand, the last of equations (2.6.7) may be written in the form (2.6.9) if $x_{n+1}$ is taken to be equal to $x_n$. Thus the end conditions for the recurrence (2.6.9) are
$$x_0 = 0 = x_{n+1} - x_n. \tag{2.6.10}$$
The recurrence relation has the general solution
$$x_r = A\cos r\theta + B\sin r\theta, \tag{2.6.11}$$
where, on substitution into (2.6.9), we find that $\theta$ must satisfy
$$\cos(r - 1)\theta + \cos(r + 1)\theta = 2\cos\theta\cos r\theta = (2 - \lambda)\cos r\theta,$$
$$\sin(r - 1)\theta + \sin(r + 1)\theta = 2\cos\theta\sin r\theta = (2 - \lambda)\sin r\theta,$$
i.e.,
$$2\cos\theta = 2 - \lambda.$$
The end conditions will be satisfied if and only if
$$A = 0 = \sin(n + 1)\theta - \sin n\theta = 2\cos[(n + 1/2)\theta]\sin(\theta/2),$$
so that the possible values of $\theta$ are
$$\theta = \theta_i = \frac{(2i - 1)\pi}{2n + 1}, \quad (i = 1, 2, \ldots, n),$$
while the corresponding values of $\lambda$ are
$$\lambda_i = 2 - 2\cos\theta_i = 4\sin^2\left[\frac{(2i - 1)\pi}{2(2n + 1)}\right]. \tag{2.6.12}$$
Thus, in the $i$th mode, the displacement amplitude of the $r$th mass is
$$x_r = \sin r\theta_i = \sin\left[\frac{(2i - 1)r\pi}{2n + 1}\right]. \tag{2.6.13}$$
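The closed forms (2.6.12), (2.6.13) can be verified by substituting them back into the recurrence (2.6.9) and the end conditions (2.6.10). The helper names `fixed_free_mode` and `residual` below are hypothetical; the check is a sketch in plain Python and makes no claim beyond the uniform chain of (2.6.7).

```python
import math

def fixed_free_mode(n, i):
    """Eigenvalue (2.6.12) and amplitudes x_0, ..., x_{n+1} from (2.6.13)."""
    theta = (2 * i - 1) * math.pi / (2 * n + 1)
    lam = 4.0 * math.sin(theta / 2.0) ** 2
    x = [math.sin(r * theta) for r in range(0, n + 2)]
    return lam, x

def residual(n, i):
    """Largest violation of the recurrence (2.6.9) and the end conditions (2.6.10)."""
    lam, x = fixed_free_mode(n, i)
    rec = max(abs(-x[r - 1] + (2.0 - lam) * x[r] - x[r + 1]) for r in range(1, n + 1))
    return max(rec, abs(x[0]), abs(x[n + 1] - x[n]))

worst = max(residual(4, i) for i in range(1, 5))   # all four modes for n = 4
```

The value of `worst` is at the level of rounding error, which confirms that each pair $(\lambda_i, \mathbf{x}_i)$ satisfies every row of (2.6.7), including the modified last row.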
The modes for the case $n = 4$, which are shown in Figure 2.6.1, exhibit properties that are held by all eigenvectors of a tridiagonal matrix (such as that in (2.6.7)), namely

(a) the $i$th mode crosses the axis $(i - 1)$ times - the zeros at the ends are not counted;

(b) the nodes (points where the mode crosses the axis) of the $i$th mode interlace those of the neighbouring ($(i - 1)$th and $(i + 1)$th) modes.

Figure 2.6.1 - The modes of the spring-mass system for $n = 4$ (modes numbered 1 to 4)

For a proof of the convergence of this class of discrete models to the continuous beam, and for an estimate of the discretisation error on frequencies and mode shapes, see Davini (1996) [74].

If instead of being free at the right hand end, the system were pinned there, then the analysis would be unchanged except that the end conditions would be
$$x_0 = 0 = x_n.$$
In this case $\theta$ would have to satisfy $\sin n\theta = 0$ so that
$$\theta = \phi_i = \frac{i\pi}{n}, \quad i = 1, 2, \ldots, n - 1, \tag{2.6.14}$$
and the corresponding eigenvalues, which we will label $(\lambda'_i)_1^{n-1}$, would be
$$\lambda'_i = 4\sin^2\left(\frac{i\pi}{2n}\right). \tag{2.6.15}$$
In the $i$th mode, the $r$th displacement amplitude is
$$y_r = \sin(r\phi_i) = \sin\left[\frac{ri\pi}{n}\right]. \tag{2.6.16}$$
The two sets of eigenvalues $(\lambda_i)_1^n$ and $(\lambda'_i)_1^{n-1}$ are related in a way which will be found to be general for problems of this type (see equation (2.9.10)), namely
$$0 < \lambda_1 < \lambda'_1 < \cdots < \lambda_{n-1} < \lambda'_{n-1} < \lambda_n. \tag{2.6.17}$$
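The interlacing (2.6.17) can be confirmed from the two closed forms (2.6.12) and (2.6.15). The function names below are hypothetical and the sketch only covers the uniform chain treated in this section.

```python
import math

def free_end_eigenvalues(n):
    """Eigenvalues (2.6.12) of the fixed-free chain."""
    return [4.0 * math.sin((2 * i - 1) * math.pi / (2 * (2 * n + 1))) ** 2
            for i in range(1, n + 1)]

def pinned_end_eigenvalues(n):
    """Eigenvalues (2.6.15) of the chain with x_n = 0."""
    return [4.0 * math.sin(i * math.pi / (2 * n)) ** 2 for i in range(1, n)]

def interlaces(n):
    """True if 0 < lam_1 < lam'_1 < ... < lam'_{n-1} < lam_n, as in (2.6.17)."""
    lam, lam_p = free_end_eigenvalues(n), pinned_end_eigenvalues(n)
    merged = [0.0]
    for i in range(n - 1):
        merged += [lam[i], lam_p[i]]
    merged.append(lam[n - 1])
    return all(a < b for a, b in zip(merged, merged[1:]))

checks = [interlaces(n) for n in range(2, 12)]
```

The strict inequalities hold because the arguments of the sines satisfy $(2i - 1)/(2(2n + 1)) < i/(2n) < (2i + 1)/(2(2n + 1))$ whenever $i < n$, and $\sin^2$ is increasing on $(0, \pi/2)$.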
Exercises 2.6

1. Consider the beam system of Figure 2.3.1 in the case when $(m_i)_1^n = m$, $(k_i)_1^n = k$, $(\ell_i)_0^n = \ell$. Show that the recurrence relation linking the $(u_r)_0^{n+2}$ may be written
$$u_{r-2} - 4u_{r-1} + (6 - \lambda)u_r - 4u_{r+1} + u_{r+2} = 0,$$
where $\lambda = m\omega^2\ell^2/k$. Seek a solution of the recurrence relation of the form
$$u_r = A\cos r\theta + B\sin r\theta + C\cosh r\phi + D\sinh r\phi$$
and find $\theta$, $\phi$ so that the end conditions $u_{-1} = 0 = u_0 = u_{n-1} = u_n$ are satisfied. Hence find the natural frequencies and normal modes of the system; i.e., a clamped-clamped beam. A physically more acceptable discrete approximation of a beam is considered in detail by Gladwell (1962) [103] and Lindberg (1963) [215].
2.7 Principal coordinates and receptances
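Theorem 1.4.5 states that the vectors $(\mathbf{x}_i)_1^n$ span the space of $n$-vectors, so that any arbitrary vector $\mathbf{q}(t)$ may be written
$$\mathbf{q}(t) = p_1\mathbf{x}_1 + p_2\mathbf{x}_2 + \cdots + p_n\mathbf{x}_n. \tag{2.7.1}$$
This may be condensed into the matrix equation
$$\mathbf{q} = \mathbf{X}\mathbf{p}, \tag{2.7.2}$$
where $\mathbf{X}$ is the $n \times n$ matrix with the $\mathbf{x}_i$ as its columns, i.e., $\mathbf{x}_i = \{x_{1i}, x_{2i}, \ldots, x_{ni}\}$. The coordinates $p_1, p_2, \ldots, p_n$, called the principal coordinates, will in general be functions of $t$; they indicate the extent to which the various eigenvectors $\mathbf{x}_i$ participate in the vector $\mathbf{q}$. The energies $T, V$ take particularly simple forms when $\mathbf{q}$ is expressed in terms of the principal coordinates. For equation (2.7.2) implies
$$\dot{\mathbf{q}} = \mathbf{X}\dot{\mathbf{p}}, \tag{2.7.3}$$
so that
$$T = \frac{1}{2}(\mathbf{X}\dot{\mathbf{p}})^T\mathbf{M}(\mathbf{X}\dot{\mathbf{p}}) = \frac{1}{2}\dot{\mathbf{p}}^T(\mathbf{X}^T\mathbf{M}\mathbf{X})\dot{\mathbf{p}}. \tag{2.7.4}$$
But the element in row $i$, column $j$ of the matrix $\mathbf{X}^T\mathbf{M}\mathbf{X}$ is simply $\mathbf{x}_i^T\mathbf{M}\mathbf{x}_j$ and according to (1.4.12) this is zero if $i \ne j$, $a_i$ if $i = j$. Thus
$$\mathbf{X}^T\mathbf{M}\mathbf{X} = \mathrm{diag}(a_1, a_2, \ldots, a_n), \tag{2.7.5}$$
so that
$$T = \frac{1}{2}\{a_1\dot{p}_1^2 + a_2\dot{p}_2^2 + \cdots + a_n\dot{p}_n^2\}. \tag{2.7.6}$$
Similarly
$$V = \frac{1}{2}\mathbf{p}^T(\mathbf{X}^T\mathbf{K}\mathbf{X})\mathbf{p} \tag{2.7.7}$$
and
$$\mathbf{X}^T\mathbf{K}\mathbf{X} = \mathrm{diag}(\lambda_1a_1, \lambda_2a_2, \ldots, \lambda_na_n), \tag{2.7.8}$$
so that
$$V = \frac{1}{2}\{\lambda_1a_1p_1^2 + \lambda_2a_2p_2^2 + \cdots + \lambda_na_np_n^2\}. \tag{2.7.9}$$
Equations (2.7.6), (2.7.9) show that the search for eigenvalues and eigenvectors for a symmetric matrix pair $(\mathbf{M}, \mathbf{K})$ is equivalent to the search for a coordinate transformation $\mathbf{q} \to \mathbf{p}$ which will simultaneously convert two quadratic forms $\mathbf{q}^T\mathbf{M}\mathbf{q}$ and $\mathbf{q}^T\mathbf{K}\mathbf{q}$ to sums of squares.

We shall now use the principal coordinates to obtain the response of a system to sinusoidal forces. Equations (2.4.2) and (2.4.14) show that the equation governing the response to generalised forces $(Q_r)_1^n$ is
$$\mathbf{M}\ddot{\mathbf{q}} + \mathbf{K}\mathbf{q} = \mathbf{Q}, \tag{2.7.10}$$
where $\mathbf{q} = \{q_1, q_2, \ldots, q_n\}$. If the forces have frequency $\omega$ and are all in phase, then $\mathbf{Q}$ and $\mathbf{q}$ may be written
$$\mathbf{Q} = \boldsymbol{\Phi}\sin(\omega t + \phi), \quad \mathbf{q} = \mathbf{x}\sin(\omega t + \phi). \tag{2.7.11}$$
In this case equations (2.6.1) - (2.6.2) yield
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{x} = \boldsymbol{\Phi}. \tag{2.7.12}$$
To solve this equation we express $\mathbf{x}$ in terms of the eigenvectors $\mathbf{x}_i$, so that
$$\mathbf{x} = \pi_1\mathbf{x}_1 + \pi_2\mathbf{x}_2 + \cdots + \pi_n\mathbf{x}_n = \mathbf{X}\boldsymbol{\pi}, \tag{2.7.13}$$
where $\pi_1, \pi_2, \ldots, \pi_n$ are the amplitudes of the principal coordinates $p_1, p_2, \ldots, p_n$. Substitute (2.7.13) into (2.7.12) and multiply the resulting equation by $\mathbf{X}^T$; the result is
$$\mathbf{X}^T(\mathbf{K} - \lambda\mathbf{M})\mathbf{X}\boldsymbol{\pi} = \mathbf{X}^T\boldsymbol{\Phi} = \boldsymbol{\Xi}. \tag{2.7.14}$$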
But now the matrix of coefficients of the set of $n$ equations for the unknowns $\pi_1, \pi_2, \ldots, \pi_n$ is diagonal, and the $i$th equation is simply
$$(\lambda_i - \lambda)a_i\pi_i = \Xi_i,$$
so that
$$\pi_i = \frac{\Xi_i}{a_i(\lambda_i - \lambda)}. \tag{2.7.15}$$
In order to interpret this result we consider the response to a single generalised force $Q_r$. In this case
$$\boldsymbol{\Phi} = \{0, 0, \ldots, \Phi_r, 0, \ldots, 0\}, \quad \boldsymbol{\Xi} = \Phi_r\{x_{r1}, x_{r2}, \ldots, x_{rn}\}, \quad \pi_i = \frac{\Phi_r x_{ri}}{a_i(\lambda_i - \lambda)},$$
and the $s$th displacement amplitude is
$$x_s = \sum_{i=1}^{n}\pi_i x_{si} = \alpha_{rs}\Phi_r, \tag{2.7.16}$$
where
$$\alpha_{rs} = \sum_{i=1}^{n}\frac{x_{ri}x_{si}}{a_i(\lambda_i - \lambda)}. \tag{2.7.17}$$
The quantity $\alpha_{rs}$ is the receptance (Bishop and Johnson (1960) [34]) giving the amplitude of response of $q_s$ to a unit amplitude generalised force $Q_r$. The fact that $\alpha_{rs}$ is symmetric, i.e.,
$$\alpha_{rs} = \alpha_{sr}, \tag{2.7.18}$$
is a reflection of the reciprocal theorem which holds for forced harmonic excitation.

Exercises 2.7

1. Use the orthogonality of the $(\mathbf{x}_i)_1^n$ w.r.t. the inertia matrix to show that $\mathbf{x}_i^T\mathbf{M}\mathbf{q} = p_i a_i$.
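The modal sum (2.7.17) for the receptance can be compared with a direct solution of (2.7.12). The sketch below is illustrative only: the function name `receptance_modal`, the restriction to a diagonal mass matrix, and the sample $3 \times 3$ matrices are assumptions made for the example.

```python
import numpy as np

def receptance_modal(M, K, lam):
    """Matrix of receptances alpha_rs built from the modal sum (2.7.17)."""
    # Assumption of this sketch: M is diagonal, so scaling by M^(-1/2) gives a
    # standard symmetric eigenvalue problem with the same eigenvalues.
    d = np.sqrt(np.diag(M))
    evals, U = np.linalg.eigh(K / np.outer(d, d))
    X = U / d[:, None]                        # columns are the eigenvectors x_i
    a = np.einsum("ri,rs,si->i", X, M, X)     # a_i = x_i^T M x_i
    return (X / (a * (evals - lam))) @ X.T

# Hypothetical three-mass system.
M = np.diag([1.0, 2.0, 1.5])
K = np.array([[3.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
lam = 0.3                                     # excitation value of lambda = omega^2

alpha = receptance_modal(M, K, lam)
direct = np.linalg.inv(K - lam * M)           # amplitudes from (K - lam M) x = Phi
```

Here `alpha` and `direct` agree to rounding error, and `alpha` is symmetric, which is the reciprocity (2.7.18). The modal form also makes clear that the response grows without bound as $\lambda$ approaches any $\lambda_i$.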
2.8 Rayleigh’s Principle
Consider a conservative system with generalised coordinates $(q_r)_1^n$ vibrating with harmonic motion given by (2.6.1). Its kinetic and potential energies will be
$$T = \frac{1}{2}\dot{\mathbf{q}}^T\mathbf{M}\dot{\mathbf{q}} = \omega^2\cos^2\omega t\,T_0, \quad V = \frac{1}{2}\mathbf{q}^T\mathbf{K}\mathbf{q} = \sin^2\omega t\,V_0,$$
where
$$T_0 = \frac{1}{2}\mathbf{x}^T\mathbf{M}\mathbf{x}, \quad V_0 = \frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}. \tag{2.8.1}$$
Since the system is conservative,
$$T + V = \mathrm{const.},$$
so that
$$\omega^2\cos^2\omega t\,T_0 + (1 - \cos^2\omega t)V_0 = \mathrm{const.},$$
and therefore
$$\omega^2T_0 = V_0.$$
This we may write as
$$\lambda = \frac{V_0}{T_0} = \frac{\mathbf{x}^T\mathbf{K}\mathbf{x}}{\mathbf{x}^T\mathbf{M}\mathbf{x}}.$$
If the system is vibrating freely at frequency $\omega$, then $\omega$ must be one of the natural frequencies and $\mathbf{x}$ the corresponding eigenvector. If $\omega = \omega_i$, then $\lambda = \lambda_i$, $\mathbf{x} = \mathbf{x}_i$ and
$$\lambda_i = \frac{\mathbf{x}_i^T\mathbf{K}\mathbf{x}_i}{\mathbf{x}_i^T\mathbf{M}\mathbf{x}_i}, \tag{2.8.2}$$
which agrees with equation (1.4.5). Rayleigh’s Principle states that the stationary values of the Rayleigh Quotient
$$\lambda_R = \frac{\mathbf{x}^T\mathbf{K}\mathbf{x}}{\mathbf{x}^T\mathbf{M}\mathbf{x}}, \tag{2.8.3}$$
viewed as a function of the components $(x_r)_1^n$, occur when $\mathbf{x}$ is an eigenvector $\mathbf{x}_i$. The corresponding stationary value of $\lambda_R$ is $\lambda_i$.

Proof. Rayleigh’s Principle has a long history - see for example Temple and Bickley (1933) [322] or Washizu (1982) [330]. We shall state the proof in a number of ways because each is instructive.

First consider $\lambda_R$ as a ratio of $V_0$ and $T_0$ and write down the partial derivative of this quotient w.r.t. $x_r$. We have
$$\frac{\partial T_0}{\partial x_r} = m_{r1}x_1 + m_{r2}x_2 + \cdots + m_{rn}x_n,$$
$$\frac{\partial V_0}{\partial x_r} = k_{r1}x_1 + k_{r2}x_2 + \cdots + k_{rn}x_n,$$
and
$$\frac{\partial}{\partial x_r}\left(\frac{V_0}{T_0}\right) = \frac{1}{T_0}\frac{\partial V_0}{\partial x_r} - \frac{V_0}{T_0^2}\frac{\partial T_0}{\partial x_r} = \frac{1}{T_0}\left\{\frac{\partial V_0}{\partial x_r} - \lambda_R\frac{\partial T_0}{\partial x_r}\right\},$$
so that, on inserting the expressions for $\partial V_0/\partial x_r$ and $\partial T_0/\partial x_r$, we obtain just the $r$th row of the matrix equation (2.6.3) with $\lambda_R$ in place of $\lambda$. The complete set of $n$ equations which state that $V_0/T_0$ is stationary w.r.t. all the $(x_r)_1^n$ is the matrix equation (2.6.3), which is satisfied when $\mathbf{x}$ is an eigenvector $\mathbf{x}_i$ and $\lambda$ is the corresponding eigenvalue $\lambda_i$.

Now express the energies in terms of principal coordinates. If
$$p_i = \pi_i\sin(\omega t + \phi),$$
then equations (2.7.6), (2.7.9) show that
$$T_0 = \frac{1}{2}\{a_1\pi_1^2 + a_2\pi_2^2 + \cdots + a_n\pi_n^2\},$$
$$V_0 = \frac{1}{2}\{\lambda_1a_1\pi_1^2 + \lambda_2a_2\pi_2^2 + \cdots + \lambda_na_n\pi_n^2\}.$$
Since $\mathbf{M}$ is assumed to be positive definite, there is no loss in generality in taking each $a_i = 1$; then
$$\lambda_R = \frac{\lambda_1\pi_1^2 + \lambda_2\pi_2^2 + \cdots + \lambda_n\pi_n^2}{\pi_1^2 + \pi_2^2 + \cdots + \pi_n^2}, \tag{2.8.4}$$
so that, in particular,
$$\lambda_R - \lambda_1 = \frac{(\lambda_2 - \lambda_1)\pi_2^2 + \cdots + (\lambda_n - \lambda_1)\pi_n^2}{\pi_1^2 + \pi_2^2 + \cdots + \pi_n^2}. \tag{2.8.5}$$
Since the $\lambda_i$ are labelled in increasing (or non-decreasing) order, the quantities $\lambda_i - \lambda_1$, $i = 2, 3, \ldots, n$, are non-negative, and so
$$\lambda_R \ge \lambda_1.$$
If $\lambda_1$ is strictly less than $\lambda_2$ then equality occurs only when $\pi_2 = 0 = \cdots = \pi_n$, i.e., when the system is vibrating in its first principal mode. Equation (2.8.5) states the important property that whatever values are taken for $(x_r)_1^n$, the value of the Rayleigh quotient will never be less than $\lambda_1$ and (when $\lambda_1 < \lambda_2$) will be equal to $\lambda_1$ only if the ratios $x_1 : x_2 : \ldots : x_n$ correspond to those of the first eigenvector $x_{11}, x_{21}, \ldots, x_{n1}$.

Equation (2.8.5) shows that $\lambda_1$ is the global minimum of $\lambda_R$, and it may be proved in an exactly similar way that
$$\lambda_R \le \lambda_n, \tag{2.8.6}$$
so that $\lambda_n$ is the global maximum of $\lambda_R$. If $\lambda_i$ is an intermediate eigenvalue, so that $\lambda_1 < \lambda_i < \lambda_n$, then
$$\lambda_R - \lambda_i = \frac{-\sum_{j=1}^{i-1}(\lambda_i - \lambda_j)\pi_j^2 + \sum_{j=i+1}^{n}(\lambda_j - \lambda_i)\pi_j^2}{\pi_1^2 + \pi_2^2 + \cdots + \pi_n^2}. \tag{2.8.7}$$
In this case $\lambda_R$ will be neither strictly less nor strictly greater than $\lambda_i$ for variations of the $\pi_j$; $\lambda_R$ has a saddle point in the $i$th mode ($\pi_j = 0$, $j \ne i$). However, for computational purposes it is important that the difference between $\lambda_R$ and $\lambda_i$ depends on the squares of the quantities $\pi_j$. This means that if $\mathbf{x}$ is ‘nearly’ in the $i$th mode, so that the $\pi_j$ with $j \ne i$ are much smaller than $\pi_i$, i.e., $\pi_i \approx 1$, $\pi_j = O(\varepsilon)$, then $\lambda_R - \lambda_i = O(\varepsilon^2)$.

Since $\mathbf{M}$ is positive definite, $\mathbf{x}^T\mathbf{M}\mathbf{x} > 0$, and the problem of finding the stationary values of the Rayleigh quotient $\lambda_R$ given by equation (2.8.3) is equivalent to finding the stationary values of $\mathbf{x}^T\mathbf{K}\mathbf{x}$ subject to the restriction that $\mathbf{x}^T\mathbf{M}\mathbf{x} = 1$. This in turn is equivalent to finding the stationary values of
$$F \equiv \mathbf{x}^T\mathbf{K}\mathbf{x} - \lambda\mathbf{x}^T\mathbf{M}\mathbf{x}, \tag{2.8.8}$$
subject to $\mathbf{x}^T\mathbf{M}\mathbf{x} = 1$. Here $\lambda$ acts as a Lagrange parameter. Note that
$$\frac{\partial F}{\partial x_r} = 2\sum_{s=1}^{n}k_{rs}x_s - 2\lambda\sum_{s=1}^{n}m_{rs}x_s,$$
so that the set of equations $\partial F/\partial x_r = 0$ yields equation (2.6.3), viz. $(\mathbf{K} - \lambda\mathbf{M})\mathbf{x} = \mathbf{0}$.
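The second-order accuracy of the Rayleigh quotient can be seen in a small experiment. The sketch below is an illustration under stated assumptions: it uses the uniform fixed-free chain of (2.6.7) with unit masses and stiffnesses, perturbs the first mode (for which (2.8.5) guarantees a one-signed error), and the perturbation vector `noise` is an arbitrary hypothetical choice.

```python
import numpy as np

def rayleigh_quotient(K, M, x):
    """Rayleigh quotient (2.8.3) for a trial vector x."""
    return (x @ K @ x) / (x @ M @ x)

n = 6
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[-1, -1] = 1.0                       # free right-hand end, as in (2.6.7)
M = np.eye(n)                         # unit masses, so eigh gives M-orthonormal modes
evals, X = np.linalg.eigh(K)

noise = np.array([1.0, -1.0, 2.0, 0.5, -1.5, 1.0])   # arbitrary perturbation direction
errors = []
for eps in (1e-1, 1e-2, 1e-3):
    x = X[:, 0] + eps * noise         # first mode plus an O(eps) error
    errors.append(rayleigh_quotient(K, M, x) - evals[0])
```

Each tenfold reduction in `eps` reduces the entries of `errors` by roughly a factor of one hundred, and every entry is non-negative, consistent with $\lambda_R \ge \lambda_1$ and $\lambda_R - \lambda_1 = O(\varepsilon^2)$.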
2.9 Vibration under constraint
The concept of a system vibrating under constraint is important in the solution of inverse problems. Suppose a system has generalised coordinates $(q_r)_1^n$, but they are constrained to satisfy a relation
$$f(q_1, q_2, \ldots, q_n) = 0.$$
For small vibrations about $q_1 = 0 = \cdots = q_n$, this relation may be replaced by
$$\mathbf{q}^T\mathbf{d} = d_1q_1 + d_2q_2 + \cdots + d_nq_n = 0,$$
where
$$d_r = \frac{\partial f}{\partial q_r}(q_1, q_2, \ldots, q_n)\Big|_{q_1 = 0 = q_2 = \cdots = q_n}.$$
Two of the most important constraints will correspond to a certain $q_r$ being zero, or two, $q_r$ and $q_s$, being equal.

Now suppose that the system is vibrating with frequency $\omega$, where $\omega^2 = \lambda$, and
$$\mathbf{q} = \mathbf{x}\sin\omega t.$$
Rayleigh’s Principle states that the (natural frequencies)$^2$ will be the stationary values of $F$, given in equation (2.8.8), but now subject to the further constraint
$$\mathbf{x}^T\mathbf{d} = 0. \tag{2.9.1}$$
Thus we must find the stationary values of
$$F = \mathbf{x}^T\mathbf{K}\mathbf{x} - \lambda\mathbf{x}^T\mathbf{M}\mathbf{x} - 2\nu\mathbf{x}^T\mathbf{d}, \tag{2.9.2}$$
where $\nu$ is another Lagrange parameter (the 2 is inserted purely for convenience). The equations $\partial F/\partial x_r = 0$ now yield
$$\mathbf{K}\mathbf{x} - \lambda\mathbf{M}\mathbf{x} - \nu\mathbf{d} = \mathbf{0}. \tag{2.9.3}$$
By comparing this with equation (2.7.12) we see that $\nu\mathbf{d}$ is a generalised force; it is the force required to maintain the constraint (2.9.1). In order to analyse equation (2.9.3) we express $\mathbf{x}$ in terms of principal coordinates, using equation (2.7.13). Then
$$\mathbf{K}\mathbf{X}\boldsymbol{\pi} - \lambda\mathbf{M}\mathbf{X}\boldsymbol{\pi} - \nu\mathbf{d} = \mathbf{0}. \tag{2.9.4}$$
Multiply throughout by $\mathbf{X}^T$ and use equations (2.7.5) and (2.7.8), which show that both $\mathbf{X}^T\mathbf{M}\mathbf{X}$ and $\mathbf{X}^T\mathbf{K}\mathbf{X}$ will be diagonal matrices; the $r$th row of the resulting equation is
$$\lambda_ra_r\pi_r - \lambda a_r\pi_r - \nu b_r = 0, \quad r = 1, 2, \ldots, n, \tag{2.9.5}$$
where
$$\mathbf{b} = \mathbf{X}^T\mathbf{d}. \tag{2.9.6}$$
Equations (2.9.5) yield
$$\pi_r = \frac{\nu b_r}{a_r(\lambda_r - \lambda)}, \tag{2.9.7}$$
which, when substituted in the constraint (2.9.1), i.e.,
$$\mathbf{x}^T\mathbf{d} \equiv \boldsymbol{\pi}^T\mathbf{X}^T\mathbf{d} \equiv \boldsymbol{\pi}^T\mathbf{b} = 0, \tag{2.9.8}$$
yields the frequency equation
$$B(\lambda) \equiv \sum_{i=1}^{n}\frac{b_i^2}{a_i(\lambda_i - \lambda)} = 0. \tag{2.9.9}$$
The form of this equation has important consequences. Consider first the case in which none of the $b_i$ is zero. The coefficients $(b_i^2/a_i)_1^n$ will all be positive and the graph of $B(\lambda)$ against $\lambda$ will have the form shown in Figure 2.9.1. Since $B(\lambda_i + 0)$ is very large negative, $B(\lambda_{i+1} - 0)$ is very large positive, and $B(\lambda)$ is steadily increasing between $\lambda_i$ and $\lambda_{i+1}$, $B(\lambda)$ will have just $n - 1$ zeros, $(\lambda'_i)_1^{n-1}$, that interlace the $\lambda_i$ in the sense that
$$\lambda_i < \lambda'_i < \lambda_{i+1}, \quad (i = 1, 2, \ldots, n - 1). \tag{2.9.10}$$
This inequality may be interpreted as follows: if a linear constraint is applied to a system, each natural frequency increases (or, more precisely, does not decrease), but does not exceed the next natural frequency of the original system.
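The interlacing (2.9.10) can be observed numerically by restricting the eigenvalue problem to the subspace $\mathbf{x}^T\mathbf{d} = 0$. The sketch below does not use the frequency equation (2.9.9); it swaps in a null-space projection, which is an assumption of the example, as are the function names and the sample five-mass system.

```python
import numpy as np

def generalised_eigenvalues(K, M):
    """Eigenvalues of (K - lam M) x = 0 via a Cholesky factor of M."""
    L = np.linalg.cholesky(M)
    A = np.linalg.solve(L, np.linalg.solve(L, K).T).T    # L^-1 K L^-T
    return np.linalg.eigvalsh(0.5 * (A + A.T))

def constrained_eigenvalues(K, M, d):
    """Eigenvalues of the system restricted to the subspace x^T d = 0."""
    d = np.asarray(d, dtype=float).reshape(-1, 1)
    Q, _ = np.linalg.qr(d, mode="complete")
    Z = Q[:, 1:]                        # orthonormal basis of the null space of d^T
    return generalised_eigenvalues(Z.T @ K @ Z, Z.T @ M @ Z)

n = 5
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[-1, -1] = 1.0                         # fixed-free chain with unit stiffnesses
M = np.diag([1.0, 2.0, 1.0, 3.0, 1.5])  # hypothetical masses
d = np.array([1.0, -1.0, 0.0, 0.0, 0.0])   # constraint q_1 = q_2

lam = generalised_eigenvalues(K, M)
lam_c = constrained_eigenvalues(K, M, d)
```

The four constrained eigenvalues in `lam_c` fall between consecutive entries of `lam`, that is, `lam[i] <= lam_c[i] <= lam[i + 1]` for each `i`.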
Figure 2.9.1 - The eigenvalues of a constrained system interlace the original eigenvalues (graph of $B(\lambda)$ against $\lambda$, with $\lambda_1, \lambda'_1, \lambda_2, \ldots, \lambda'_{n-1}, \lambda_n$ marked on the axis)

If all the $b_i$ are non-zero then the inequalities in (2.9.10) are strictly obeyed. Now, however, suppose some of the $b_i$ are zero; in particular consider the constraint
$$\pi_1 = 0, \tag{2.9.11}$$
for which $(b_i)_2^n = 0$. In this case $(\pi_i)_2^n$ are the principal coordinates of the system and the corresponding eigenvalues are
$$\lambda'_i = \lambda_{i+1}, \quad (i = 1, 2, \ldots, n - 1). \tag{2.9.12}$$
If the constraint is $\pi_j = 0$, $1 < j \le n$, then the principal coordinates are $\pi_1, \pi_2, \ldots, \pi_{j-1}, \pi_{j+1}, \ldots, \pi_n$, so that
$$\lambda'_i = \lambda_i, \quad i = 1, 2, \ldots, j - 1; \qquad \lambda'_i = \lambda_{i+1}, \quad i = j, j + 1, \ldots, n - 1.$$
If the constraint is (2.9.8) and some particular $b_j$ is zero, then equation (2.9.5) shows that
$$\pi_i = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
is a solution corresponding to $\lambda = \lambda_j$. This means that a constraint (2.9.8) with $b_j = 0$ does not affect the $j$th principal mode. Figure 2.9.2 shows the form of $B(\lambda)$ when $b_2 = 0$. The graph may a) pass to the left of $\lambda_2$, in which case $\lambda'_{1a} < \lambda_2$, $\lambda'_{2a} = \lambda_2$; or b) pass to the right, in which case $\lambda'_{1b} = \lambda_2$, $\lambda'_{2b} > \lambda_2$.

If two constraints are applied, then the constrained system will have $n - 2$ eigenvalues $(\lambda''_i)_1^{n-2}$ satisfying
$$\lambda'_i \le \lambda''_i \le \lambda'_{i+1}, \quad (i = 1, 2, \ldots, n - 2),$$
where $\lambda'_i$ are the eigenvalues of the system subject to one of the constraints. Thus
$$\lambda_i \le \lambda'_i \le \lambda''_i \le \lambda'_{i+1} \le \lambda_{i+2}, \quad (i = 1, 2, \ldots, n - 2),$$
or
$$\lambda_i \le \lambda''_i \le \lambda_{i+2}, \quad (i = 1, 2, \ldots, n - 2). \tag{2.9.13}$$

Figure 2.9.2 - The form of $B(\lambda)$ when $b_2 = 0$; either a) $\lambda'_1 < \lambda_2$, $\lambda'_2 = \lambda_2$ or b) $\lambda'_1 = \lambda_2$, $\lambda'_2 > \lambda_2$.
2.10 Iterative and independent definitions of eigenvalues
In this section we take a closer look at the eigenvalues of (2.6.3) in relation to the Rayleigh Quotient
$$\lambda_R = \frac{\mathbf{x}^T\mathbf{K}\mathbf{x}}{\mathbf{x}^T\mathbf{M}\mathbf{x}}. \tag{2.10.1}$$
We assume that $\mathbf{K}$ is symmetric (it may or may not be positive semi-definite) and that $\mathbf{M}$ is positive definite. The importance of the latter assumption is that the denominator of (2.10.1) is never zero and always positive for all $\mathbf{x} \ne \mathbf{0}$.

First, we note that $\lambda_R$ is a homogeneous function of $\mathbf{x}$ in the sense that
$$\lambda_R(c\mathbf{x}) = \lambda_R(\mathbf{x}), \quad c \ne 0.$$
This means that we can always scale any $\mathbf{x}$ so that the denominator of (2.10.1) is unity, i.e.,
$$\mathbf{x}^T\mathbf{M}\mathbf{x} = 1. \tag{2.10.2}$$
The vectors $\mathbf{x}$ with this property constitute a closed and bounded subset $D_1 \subset V_n$. Now consider the Rayleigh Quotient on $D_1$; it is
$$\lambda_R = \mathbf{x}^T\mathbf{K}\mathbf{x}. \tag{2.10.3}$$
This is a continuous function of the variables $x_1, x_2, \ldots, x_n$ on the closed bounded region $D_1$ so that, by Weierstrass’ Theorem on continuous functions, it attains its minimum value on $D_1$, i.e., for some vector $\mathbf{x} \in D_1$. (Recall the definition of a closed set $S$: if $\{y_i\}$ is a convergent sequence in $S$ then its limit $\lim_{i \to \infty} y_i = y$ is also in $S$.) There may be more than one such minimizing vector, but there is always at least one, which we denote by $\mathbf{x}_1$. The corresponding minimum value of $\lambda_R$ we denote by $\lambda_1$. We have the result
$$\lambda_1 = \min_{\mathbf{x} \in D_1}\mathbf{x}^T\mathbf{K}\mathbf{x} = \mathbf{x}_1^T\mathbf{K}\mathbf{x}_1. \tag{2.10.4}$$
Having found $\mathbf{x}_1$ and $\lambda_1$, we set up a new minimum problem: finding the minimum of $\mathbf{x}^T\mathbf{K}\mathbf{x}$ on the subset $D_2$ of $D_1$ that consists of vectors $\mathbf{x}$ orthogonal to $\mathbf{x}_1$, i.e., $\mathbf{x}$ satisfying $\mathbf{x}^T\mathbf{M}\mathbf{x}_1 = 0$. This subset is again closed and bounded so that by Weierstrass’ Theorem there is a vector $\mathbf{x}_2 \in D_2$ which minimizes $\mathbf{x}^T\mathbf{K}\mathbf{x}$ on $D_2$; the minimum value is $\lambda_2$. We have
$$\lambda_2 = \min_{\mathbf{x} \in D_2}\mathbf{x}^T\mathbf{K}\mathbf{x} = \mathbf{x}_2^T\mathbf{K}\mathbf{x}_2, \tag{2.10.5}$$
and $\mathbf{x}_2^T\mathbf{M}\mathbf{x}_2 = 1$, $\mathbf{x}_2^T\mathbf{M}\mathbf{x}_1 = 0$. Since $\lambda_2$ is the minimum of $\mathbf{x}^T\mathbf{K}\mathbf{x}$ on $D_2$, a subset of $D_1$, $\lambda_2$ cannot be less than $\lambda_1$, i.e., $\lambda_2 \ge \lambda_1$. Proceeding in this way we find a set of vectors $\mathbf{x}_i$ and numbers $\lambda_i$, $(i = 1, 2, \ldots, n)$, such that
$$\lambda_i = \min_{\mathbf{x} \in D_i}\mathbf{x}^T\mathbf{K}\mathbf{x} = \mathbf{x}_i^T\mathbf{K}\mathbf{x}_i, \tag{2.10.6}$$
$$\mathbf{x}_i^T\mathbf{M}\mathbf{x}_j = \begin{cases} 1 & i = j, \\ 0 & i \ne j, \end{cases} \tag{2.10.7}$$
and $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$.

This procedure is iterative: we cannot set up the minimizing problem that gives $\lambda_2$ until we have found $\mathbf{x}_1$, and generally we cannot set up the minimizing problem that gives $\lambda_i$ until we have found $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{i-1}$. There is another procedure in which we can find any $\lambda_i$, $\mathbf{x}_i$ without first finding $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{i-1}$; this is called the independent or minimax procedure.

In the independent procedure we start as before:
$$\lambda_1 = \min_{\mathbf{x} \in D_1}\mathbf{x}^T\mathbf{K}\mathbf{x} = \mathbf{x}_1^T\mathbf{K}\mathbf{x}_1.$$
Now we return to the analysis of Section 2.9 relating to vibration under a constraint. The inequality (2.9.10) shows that if none of the $(b_i)_1^n$ is zero, then the first constrained eigenvalue, $\lambda'_1$, is strictly less than $\lambda_2$. Equations (2.9.11), (2.9.12) show that if the constraint is $\pi_1 = 0$, then $\lambda'_1 = \lambda_2$. The quantity $\pi_1$ is the amplitude of the component of $\mathbf{x}_1$ in $\mathbf{x}$, and on premultiplying equation (2.7.13) by $\mathbf{x}_1^T\mathbf{M}$ we see that
$$\mathbf{x}_1^T\mathbf{M}\mathbf{x} = \pi_1\mathbf{x}_1^T\mathbf{M}\mathbf{x}_1 = \pi_1. \tag{2.10.8}$$
Thus $\pi_1 = 0$ means that $\mathbf{x}$ is orthogonal to $\mathbf{x}_1$ w.r.t. the matrix $\mathbf{M}$; this is the constraint which yields the maximum value of $\lambda'_1$, namely $\lambda_2$. Thus
$$\max_{\mathbf{d}}\ \min_{\substack{\mathbf{x} \in D_1 \\ \mathbf{x} \perp \mathbf{d}}}\mathbf{x}^T\mathbf{K}\mathbf{x} = \lambda_2, \tag{2.10.9}$$
where $\mathbf{x} \perp \mathbf{d}$ means $\mathbf{x}^T\mathbf{M}\mathbf{d} = 0$; the $\mathbf{d}$ which maximizes the minimum is $\mathbf{x}_1$. We may now extend this analysis to higher eigenvalues by using (2.9.13); thus
$$\max_{\mathbf{d}_1, \mathbf{d}_2}\ \min_{\substack{\mathbf{x} \in D_1 \\ \mathbf{x} \perp \mathbf{d}_1, \mathbf{d}_2}}\mathbf{x}^T\mathbf{K}\mathbf{x} = \lambda_3,$$
and generally
$$\max_{\mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_i}\ \min_{\substack{\mathbf{x} \in D_1 \\ \mathbf{x} \perp \mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_i}}\mathbf{x}^T\mathbf{K}\mathbf{x} = \lambda_{i+1}. \tag{2.10.10}$$
Again, the $\mathbf{d}$’s that maximize the minimum in the general case are $\mathbf{d}_1 = \mathbf{x}_1, \mathbf{d}_2 = \mathbf{x}_2, \ldots, \mathbf{d}_{n-1} = \mathbf{x}_{n-1}$.

The minimax definition of eigenvalues seems to have been noted first by Fischer (1905) [88]. The iterative and independent definitions of eigenvalues are discussed at length in Courant and Hilbert (1953) [64], and in the more specialised volume Gould (1966) [151]. The motivation for Gould’s book was the search for lower bounds for eigenvalues; discretising methods like the finite element method almost always lead to upper bounds.

Exercises 2.10

1. Examine the arguments in Sections 2.9, 2.10 in the case when two eigenvalues are equal, e.g., $\lambda_1 = \lambda_2$.

2. Use the minimax procedure to show that if stiffness is added to a system, i.e., the stiffness matrix is changed from $\mathbf{K}$ to $\mathbf{K}'$, and $\mathbf{x}^T\mathbf{K}'\mathbf{x} \ge \mathbf{x}^T\mathbf{K}\mathbf{x}$ for all $\mathbf{x} \in V_n$, then none of the eigenvalues of the system decreases. Why can you prove this result only for $\lambda_1$ by using the iterative definition?
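The maximin characterisation (2.10.9) of $\lambda_2$ lends itself to a quick numerical experiment. The sketch below is illustrative: the helper names `modes` and `constrained_minimum`, the sample matrices and the random trial vectors are assumptions of the example, and the constrained minimum is computed by projection onto the subspace $\mathbf{x}^T\mathbf{M}\mathbf{d} = 0$.

```python
import numpy as np

def modes(K, M):
    """Eigenvalues and M-orthonormal eigenvectors of (K - lam M) x = 0."""
    L = np.linalg.cholesky(M)
    A = np.linalg.solve(L, np.linalg.solve(L, K).T).T    # L^-1 K L^-T
    evals, U = np.linalg.eigh(0.5 * (A + A.T))
    return evals, np.linalg.solve(L.T, U)

def constrained_minimum(K, M, d):
    """Minimum of x^T K x over x^T M x = 1 subject to x^T M d = 0."""
    Q, _ = np.linalg.qr((M @ d).reshape(-1, 1), mode="complete")
    Z = Q[:, 1:]                          # basis of the constrained subspace
    return modes(Z.T @ K @ Z, Z.T @ M @ Z)[0][0]

n = 5
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[-1, -1] = 1.0
M = np.diag([1.0, 2.0, 1.0, 3.0, 1.5])    # hypothetical masses

lam, X = modes(K, M)
rng = np.random.default_rng(1)
trial = [constrained_minimum(K, M, rng.standard_normal(n)) for _ in range(200)]
best = constrained_minimum(K, M, X[:, 0])  # constraint vector d = x_1
```

No randomly chosen constraint pushes the minimum above `lam[1]`, while the choice $\mathbf{d} = \mathbf{x}_1$ attains it, which is the content of (2.10.9).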
Chapter 3
Jacobi Matrices Let no one say that I have said nothing new; the arrangement of the subject is new. Pascal’s Pensées, 22
3.1
Sturm sequences
In this Chapter we will analyse the properties of the eigenvalues and eigenvectors of systems with the special tridiagonal mass and stiness matrices met in Chapter 2. We will start by considering systems like that for the system in Figure 2.2.1, for which the mass matrix is diagonal and the stiness matrix is tridiagonal, with negative codiagonal. At the end of the section we will show that many of the results may be generalised to apply to systems like that in (2.4.10) in which the mass matrix is tridiagonal with positive codiagonal. The most important property of the eigenvalues of such systems is that they are simple, i.e., distinct (Theorem 3.1.3). Thus 1 ? 2 ? = = = ? q = If xu is the uth eigenvector, then as u increases, the eigenvectors oscillate more and more (Theorem 3.3.1) in such a way that the zeros of xu interlace those of the neighbouring xu1 and xu+1 (3.3.4). We shall now establish these and other results. Throughout the next few Chapters, we redevelop analysis originally established by Gantmacher and Krein (1950) [98]. Their book was republished in 2002.. We start with a definition: Definition 3.1.1 A Jacobi matrix is a positive semi-definite symmetric tridiagonal matrix with (strictly) negative codiagonal. Note: Dierent authors define a Jacobi matrix in dierent ways; some choose the codiagonal to be strictly positive. 49
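Now we consider the equation
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{x} = \mathbf{0} \tag{3.1.1}$$
where $\mathbf{K}$ is a Jacobi matrix. First, we suppose that $\mathbf{M}$ is a (strictly) positive diagonal matrix, as in (2.2.7), and we reduce (3.1.1) to standard form. Take $\mathbf{M} = \mathrm{diag}(m_1, m_2, \ldots, m_n)$ and write $\mathbf{M} = \mathbf{D}^2$, where
$$\mathbf{D} = \mathrm{diag}(d_1, d_2, \ldots, d_n), \quad d_i = m_i^{1/2};$$
introduce the vector $\mathbf{u}$ related to $\mathbf{x}$ by
$$\mathbf{u} = \mathbf{D}\mathbf{x}, \quad \mathbf{x} = \mathbf{D}^{-1}\mathbf{u}$$
and premultiply (3.1.1) by $\mathbf{D}^{-1}$ to obtain $\mathbf{D}^{-1}(\mathbf{K} - \lambda\mathbf{D}^2)\mathbf{D}^{-1}\mathbf{u} = \mathbf{0}$, i.e.,
$$(\mathbf{J} - \lambda\mathbf{I})\mathbf{u} = \mathbf{0}, \tag{3.1.2}$$
where
$$\mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}. \tag{3.1.3}$$
The matrix $\mathbf{J}$, like $\mathbf{K}$, is a Jacobi matrix, and has the same eigenvalues as the system (3.1.1). We write
$$\mathbf{J} = \begin{bmatrix} a_1 & -b_1 & & & \\ -b_1 & a_2 & -b_2 & & \\ & \ddots & \ddots & \ddots & \\ & & -b_{n-2} & a_{n-1} & -b_{n-1} \\ & & & -b_{n-1} & a_n \end{bmatrix}. \tag{3.1.4}$$
The analysis now centres on the leading principal minors (see (1.4.6)) of the matrix $\mathbf{J} - \lambda\mathbf{I}$. We define
$$P_0 = 1, \quad P_1(\lambda) = a_1 - \lambda, \quad P_2(\lambda) = \begin{vmatrix} a_1 - \lambda & -b_1 \\ -b_1 & a_2 - \lambda \end{vmatrix}, \quad \text{etc.} \tag{3.1.5}$$
so that finally
$$P_n(\lambda) = \det(\mathbf{J} - \lambda\mathbf{I}). \tag{3.1.6}$$
The minors satisfy the three-term recurrence relation
$$P_{r+1}(\lambda) = (a_{r+1} - \lambda)P_r(\lambda) - b_r^2 P_{r-1}(\lambda), \quad r = 1, 2, \ldots, n-1, \tag{3.1.7}$$
which enables us to calculate $P_2, P_3, \ldots, P_n$ successively from $P_0, P_1$. Since the zeros of any $P_r(\lambda)$ are the eigenvalues of the truncated symmetric matrix obtained by retaining just the first $r$ rows and columns of $\mathbf{J}$, they are all real. We now prove: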
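The recurrence (3.1.7) is easy to check numerically. The following is a minimal sketch, not part of the original text: the function name `sturm_sequence` and the small 3 x 3 test matrix are illustrative assumptions. It evaluates $P_0, \ldots, P_n$ at a given $\lambda$ and compares $P_n(\lambda)$ with $\det(\mathbf{J} - \lambda\mathbf{I})$ as in (3.1.6).

```python
import numpy as np

def sturm_sequence(a, b, lam):
    """Return [P_0, P_1, ..., P_n] at lam using the recurrence (3.1.7).

    a holds the diagonal a_1..a_n of J; b holds the magnitudes b_1..b_{n-1}
    of the codiagonal (J itself carries -b_r off the diagonal).
    """
    P = [1.0, a[0] - lam]                      # P_0 and P_1
    for r in range(1, len(a)):
        # P_{r+1} = (a_{r+1} - lam) P_r - b_r^2 P_{r-1}
        P.append((a[r] - lam) * P[r] - b[r - 1] ** 2 * P[r - 1])
    return P

# Illustrative data (an assumption made for this sketch).
a = [2.0, 3.0, 4.0]
b = [1.0, 2.0]
J = np.diag(a) - np.diag(b, 1) - np.diag(b, -1)

lam = 3.0
P = sturm_sequence(a, b, lam)
det = np.linalg.det(J - lam * np.eye(len(a)))  # should agree with P[-1]
```

The recurrence costs O(n) operations per value of $\lambda$, whereas a general determinant evaluation costs O(n^3).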
Theorem 3.1.1 If $b_r^2 > 0$ $(r = 1, 2, \ldots, n-1)$, then the $(P_r(\lambda))_0^n$ form a Sturm sequence, the defining properties of which are
1. $P_0(\lambda)$ has everywhere the same sign ($P_0(\lambda) \equiv 1$).
2. When $P_r(\lambda)$ vanishes, $P_{r+1}(\lambda)$ and $P_{r-1}(\lambda)$ are non-zero and have opposite signs.

Proof. In order to establish property 2 we note first that two successive $P_r$ cannot be simultaneously zero, i.e., for the same $\lambda = \lambda_0$. For if $P_{s+1}(\lambda_0) = 0 = P_s(\lambda_0)$ then equation (3.1.7) shows that $P_{s-1}(\lambda_0) = 0$, so that finally we must have $P_1$ and $P_0$ zero; but $P_0(\lambda_0) = 1$, which yields a contradiction. The latter part of property 2 now follows directly from (3.1.7).

Before proceeding further we must define the sign change function $s_r(\lambda)$. This is the integer-valued function equal to the cumulative number of sign changes in the sequence $P_0, P_1(\lambda), P_2(\lambda), \ldots, P_r(\lambda)$. Thus if
$$\mathbf{J} = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 3 & -2 \\ 0 & -2 & 4 \end{bmatrix},$$
then
$$P_0 = 1, \quad P_1(\lambda) = -\lambda + 2, \quad P_2(\lambda) = \lambda^2 - 5\lambda + 5, \quad P_3(\lambda) = -\lambda^3 + 9\lambda^2 - 21\lambda + 12.$$
For $\lambda = 0$ the sequence of values is 1, 2, 5, 12. Since there is no change of sign in the sequence, each $s_r(0) = 0$. For $\lambda = 3$ the sequence is 1, -1, -1, 3, so that $s_1(3) = s_2(3) = 1$, $s_3(3) = 2$.

Theorem 3.1.2 $s_r(\lambda)$ changes only when $\lambda$ passes through a zero of the last polynomial, $P_r(\lambda)$.

Proof. Clearly, $s_r(\lambda)$ can change only when $\lambda$ passes through a zero of one of the $P_s(\lambda)$, $(s \le r)$; it therefore suffices to prove that $s_r(\lambda)$ does not change at all when $\lambda$ passes through a zero of an intermediate $P_s(\lambda)$, $(s < r)$. Suppose $P_s(\lambda_0) = 0$, where $1 \le s < r$; then $P_{s-1}(\lambda_0)$ and $P_{s+1}(\lambda_0)$ will both be non-zero and have opposite signs. The signs of the triad $P_{s-1}(\lambda_0), P_s(\lambda_0), P_{s+1}(\lambda_0)$ are therefore $+\,0\,-$ or $-\,0\,+$. Suppose the first to be the case and, for definiteness, that $P_s(\lambda)$ increases as $\lambda$ passes through $\lambda_0$ (the other possibilities may be handled similarly). Then for values of $\lambda$ sufficiently close to $\lambda_0$ and less than $\lambda_0$ the signs are $+\,-\,-$, while for values of $\lambda$ sufficiently close to and greater than $\lambda_0$ the signs are $+\,+\,-$.

Thus, whether $\lambda$ is greater than or less than $\lambda_0$ there is just one change of sign in the triad of values of $P_{s-1}(\lambda), P_s(\lambda), P_{s+1}(\lambda)$. In other words the triad of polynomials $P_{s-1}(\lambda), P_s(\lambda), P_{s+1}(\lambda)$ will not contribute any change to $s_r(\lambda)$ as $\lambda$ passes through $\lambda_0$. But no other members of the sequence will contribute any change to $s_r(\lambda)$ as $\lambda$ passes through $\lambda_0$ (unless $\lambda_0$ is a zero of another $P_t(\lambda)$, $|t - s| \ge 2$, in which case again there will be no change in $s_r(\lambda)$), so that $s_r(\lambda)$ will not change at all. Clearly, $s_r(\lambda)$ is not well defined when $P_r(\lambda) = 0$.
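The sign change function can be sketched in a few lines. This is an illustrative helper, not the author's code: the name `sign_changes` is an assumption, and intermediate zeros are simply skipped, which is justified by property 2 of a Sturm sequence (the neighbours of a zero have opposite signs, so exactly one change is still counted).

```python
def sign_changes(values):
    """Count sign changes in P_0, P_1(lam), ..., P_r(lam).

    Intermediate zeros are skipped. The count is not well defined when
    the last value is zero, so that case raises an error.
    """
    if values[-1] == 0:
        raise ValueError("s_r is not well defined when P_r vanishes")
    nonzero = [v for v in values if v != 0]
    return sum(1 for p, q in zip(nonzero, nonzero[1:]) if p * q < 0)

# Values of P_0..P_3 for the 3 x 3 example in the text.
at_zero = [1, 2, 5, 12]      # lambda = 0
at_three = [1, -1, -1, 3]    # lambda = 3

s_at_zero = [sign_changes(at_zero[: r + 1]) for r in range(1, 4)]
s_at_three = [sign_changes(at_three[: r + 1]) for r in range(1, 4)]
```

Here `s_at_zero` and `s_at_three` hold $s_1, s_2, s_3$ at $\lambda = 0$ and $\lambda = 3$ respectively.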
Theorem 3.1.3 The zeros of $P_r(\lambda)$ are simple, i.e., distinct. In addition, if $P_r(\lambda_0) \ne 0$ and $s_r(\lambda_0) = k$, then $P_r(\lambda)$ has $k$ zeros less than $\lambda_0$.

Proof. Since $P_s(\lambda) = (-)^s\lambda^s + \cdots$, all $P_s(\lambda)$ will be positive for sufficiently large negative $\lambda$, i.e., $\lambda \le \alpha$, so that $s_r(\alpha) = 0$: $\alpha$ may be taken to be zero if $\mathbf{J}$ is positive definite. On the other hand, for sufficiently large positive $\lambda$, i.e., $\lambda \ge \beta$, the $P_s(\lambda)$ will alternate in sign, so that $s_r(\beta) = r$. Now since $s_r(\lambda)$ can increase only when $\lambda$ passes through a zero of $P_r(\lambda)$, all the zeros of $P_r(\lambda)$ must be distinct. For if $\lambda_0$ were a zero of even multiplicity then $s_r(\lambda)$ would not increase at all as $\lambda$ passed through $\lambda_0$, while $s_r(\lambda)$ would increase only by unity if $\lambda_0$ were a zero of odd multiplicity. The second part of the theorem now follows immediately.

Corollary 3.1.1 The eigenvalues of a Jacobi matrix are distinct.

Corollary 3.1.2 The number of zeros of $P_r(\lambda)$ satisfying $\alpha < \lambda < \beta$ is equal to $s_r(\beta) - s_r(\alpha)$.

Corollary 3.1.3 If $\lambda_0$ is a zero of $P_r(\lambda)$ then, as $\lambda$ passes from $\lambda_0-$ to $\lambda_0+$, the sign of $P_{r-1}(\lambda)P_r(\lambda)$ changes from $+$ to $-$, and $s_r(\lambda)$ increases by unity.

Theorem 3.1.4 Between any two neighbouring zeros of $P_r(\lambda)$ there lies one and only one zero of $P_{r-1}(\lambda)$, and one and only one zero of $P_{r+1}(\lambda)$.

Proof. Let $\lambda_1, \lambda_2$ be the two neighbouring zeros. Suppose, for the sake of argument, that $P_r(\lambda_1-) > 0$; then $P_r(\lambda_1+) < 0$ and $P_r(\lambda_2-) < 0$. By Corollary 3.1.3, $P_{r-1}(\lambda_1+) > 0$ and $P_{r-1}(\lambda_2-) < 0$, so that $P_{r-1}(\lambda)$ changes sign between $\lambda_1+$ and $\lambda_2-$, and therefore has at least one zero in $(\lambda_1, \lambda_2)$. Now property 2 of Sturm sequences shows that $P_{r+1}(\lambda_i)$ and $P_{r-1}(\lambda_i)$, $(i = 1, 2)$, have opposite signs. Thus $P_{r+1}(\lambda_1+) < 0$, $P_{r+1}(\lambda_2-) > 0$, so that $P_{r+1}(\lambda)$ has at least one zero in $(\lambda_1, \lambda_2)$. Now suppose, if possible, that $P_{r-1}(\lambda)$ (or $P_{r+1}(\lambda)$) had two (or more) zeros in $(\lambda_1, \lambda_2)$; then $P_r(\lambda)$ would have a zero in $(\lambda_1, \lambda_2)$, contrary to the hypothesis that $\lambda_1, \lambda_2$ are neighbouring zeros.

This theorem is usually stated in the form: the eigenvalues of successive principal minors interlace each other.
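Theorem 3.1.4 can be observed directly on a small example. The sketch below is an illustration under stated assumptions: the helper names `leading_spectra` and `strictly_interlace` and the test matrix are choices made here, and `numpy.linalg.eigvalsh` is used in place of root-finding on the $P_r$.

```python
import numpy as np

def leading_spectra(J):
    """Eigenvalues (ascending) of the leading r x r blocks of J, r = 1..n.

    These are the zeros of P_1, ..., P_n.
    """
    n = J.shape[0]
    return [np.linalg.eigvalsh(J[:r, :r]) for r in range(1, n + 1)]

def strictly_interlace(inner, outer):
    """True if outer[i] < inner[i] < outer[i+1] for every i."""
    return all(outer[i] < inner[i] < outer[i + 1] for i in range(len(inner)))

# Illustrative Jacobi matrix (an assumption made for this sketch).
J = np.array([[2.0, -1.0, 0.0],
              [-1.0, 3.0, -2.0],
              [0.0, -2.0, 4.0]])

spectra = leading_spectra(J)
checks = [strictly_interlace(spectra[r - 1], spectra[r])
          for r in range(1, len(spectra))]
distinct = bool(np.all(np.diff(spectra[-1]) > 0))   # Corollary 3.1.1
```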
3.2 Orthogonal polynomials
There is an intimate connection between Jacobi matrices and orthogonal polynomials. In this section we outline some of the basic properties of orthogonal polynomials. Two polynomials $p(x), q(x)$ are said to be orthogonal w.r.t. the weight function $w(x) > 0$ over an interval $(a, b)$ if
$$(p, q) \equiv \int_a^b w(x)p(x)q(x)\,dx = 0. \tag{3.2.1}$$
A familiar example is provided by the Laguerre polynomials $(L_n(x))_0^\infty$, i.e.,
$$L_0(x) = 1, \quad L_1(x) = x - 1, \quad L_2(x) = x^2 - 4x + 2, \ldots$$
which are orthogonal w.r.t. the weight function $e^{-x}$ over $(0, \infty)$, i.e.,
$$\int_0^\infty e^{-x}L_n(x)L_m(x)\,dx = 0, \quad m \ne n.$$
One of the important properties of such polynomials is that they satisfy a three-term recurrence relation. The relation for the $L_r(x)$, for example, is
$$L_{r+1}(x) = (x - 2r - 1)L_r(x) - r^2 L_{r-1}(x).$$
In this section we shall be concerned, not with a continuous orthogonality relation of the form (3.2.1), but with a discrete orthogonality relation
$$(p, q) \equiv \sum_{i=1}^n w_i\,p(\xi_i)q(\xi_i) = 0; \quad (w_i)_1^n > 0, \tag{3.2.2}$$
where $(\xi_i)_1^n$ are $n$ points, satisfying $\xi_1 < \xi_2 < \cdots < \xi_n$. To introduce the concept formally we let $P_n$ denote the linear space of polynomials of order $n$, i.e., the set of all polynomials $p(x)$ with degree $k \le n - 1$, with real coefficients. On this space $(\cdot,\cdot)$ acts as an inner product since it is positive definite, bilinear and symmetric, i.e.,
1. $(p, p) \equiv ||p||^2 > 0$ if $p(x) \not\equiv 0$
2. $(\alpha p, q) = \alpha(p, q)$, $(p + q, r) = (p, r) + (q, r)$
3. $(p, q) = (q, p)$
In addition
4. $(xp, q) = (p, xq)$

We now prove

Theorem 3.2.1 There is a unique sequence of monic polynomials $(q_i(x))_0^n$, i.e., such that $q_i(x)$ has degree $i$ and leading coefficient (of $x^i$) unity, which are orthogonal with respect to the inner product $(\cdot,\cdot)$, i.e., for which
$$(q_i, q_j) = 0, \quad i \ne j.$$
Proof. The $q_i(x)$ may be constructed by applying the familiar Gram-Schmidt orthogonalisation procedure to the linearly independent polynomials $(x^i)_0^{n-1}$. Thus
$$q_0 = 1, \quad q_i(x) = x^i - \sum_{j=0}^{i-1}\alpha_{ij}q_j(x), \quad (i = 1, 2, \ldots, n-1),$$
where
$$(q_i, q_j) = (x^i, q_j) - \alpha_{ij}(q_j, q_j) = 0$$
so that
$$\alpha_{ij} = (x^i, q_j)/||q_j||^2, \quad j = 0, 1, \ldots, i-1.$$
We note that the polynomial
$$q_n(x) = \prod_{i=1}^n (x - \xi_i)$$
is the monic polynomial of degree $n$ in the sequence. It is orthogonal to $(q_i)_0^{n-1}$; in fact it is orthogonal to all functions, since $q_n(\xi_i) = 0$, $i = 1, 2, \ldots, n$.

The Gram-Schmidt procedure does not provide a computationally convenient means for computing the $q_i$; instead we use a result of Forsythe (1957) [90].

Theorem 3.2.2 The monic polynomials $(q_i)_0^n$ satisfy a three-term recurrence relation of the form
$$q_i(x) = (x - \alpha_i)q_{i-1}(x) - \beta_{i-1}^2 q_{i-2}(x), \quad i = 1, 2, \ldots, n, \tag{3.2.3}$$
with the initial values
$$q_{-1}(x) = 0, \quad q_0(x) = 1. \tag{3.2.4}$$
Proof. $q_i(x) - xq_{i-1}(x)$ is a polynomial of degree $(i-1)$. It may therefore be expressed in terms of (the linearly independent, see Ex. 3.2.1) $q_0, q_1, \ldots, q_{i-1}$. Thus
$$q_i(x) - xq_{i-1}(x) = c_0q_0 + c_1q_1 + \cdots + c_{i-1}q_{i-1}. \tag{3.2.5}$$
Take the inner product of this equation with $q_j(x)$, $(j = 0, 1, \ldots, i-1)$; thus
$$(q_i, q_j) - (q_{i-1}, xq_j) = \sum_{k=0}^{i-1} c_k(q_k, q_j) = c_j||q_j||^2, \tag{3.2.6}$$
where the second term on the left has been rewritten by using property 4, above. But if $j = 0, 1, \ldots, i-1$, then the first term on the left is zero, and if $j = 0, 1, \ldots, i-3$, then $xq_j$ has degree at most $i-2$ and so is orthogonal to $q_{i-1}$. Thus $c_j = 0$ if $j = 0, 1, 2, \ldots, i-3$ and there are only two terms, $c_{i-1}$ and $c_{i-2}$, on the right of (3.2.5), i.e.,
$$q_i(x) - xq_{i-1}(x) = c_{i-2}q_{i-2}(x) + c_{i-1}q_{i-1}(x). \tag{3.2.7}$$
Moreover equation (3.2.6) gives
$$\alpha_i = -c_{i-1} = (q_{i-1}, xq_{i-1})/||q_{i-1}||^2, \qquad c_{i-2} = -(q_{i-1}, xq_{i-2})/||q_{i-2}||^2. \tag{3.2.8}$$
But $xq_{i-2}$ is a monic polynomial of degree $i-1$; it may therefore be expressed in the form
$$xq_{i-2}(x) = q_{i-1}(x) + \sum_{j=0}^{i-2} d_jq_j(x)$$
so that
$$(q_{i-1}, xq_{i-2}) = ||q_{i-1}||^2$$
and thus $c_{i-2}$ is negative and equal to $-\beta_{i-1}^2$, where
$$\beta_i = ||q_i||/||q_{i-1}||. \tag{3.2.9}$$
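Taken together, (3.2.3), (3.2.4), (3.2.8) and (3.2.9) give a step-by-step procedure. A minimal sketch follows, assuming the polynomials are represented only by their values at the points $\xi_i$; the function name `monic_orthogonal` and the sample points and weights are illustrative assumptions, not taken from the text.

```python
import numpy as np

def monic_orthogonal(xi, w):
    """Values of q_0..q_n at the points xi, with alpha_i and beta_i.

    Implements q_i = (x - alpha_i) q_{i-1} - beta_{i-1}^2 q_{i-2}
    with alpha_i from (3.2.8) and beta_i from (3.2.9).
    """
    xi = np.asarray(xi, dtype=float)
    w = np.asarray(w, dtype=float)
    n = xi.size
    q_prev = np.zeros(n)              # q_{-1} = 0
    q = np.ones(n)                    # q_0 = 1
    Q = [q.copy()]
    alpha, beta = [], []
    norm_prev = None                  # ||q_{i-2}||^2
    for i in range(1, n + 1):
        norm = np.sum(w * q * q)                      # ||q_{i-1}||^2
        alpha.append(np.sum(w * xi * q * q) / norm)   # alpha_i
        beta2 = 0.0 if norm_prev is None else norm / norm_prev
        if norm_prev is not None:
            beta.append(np.sqrt(beta2))               # beta_{i-1}
        q_prev, q = q, (xi - alpha[-1]) * q - beta2 * q_prev
        norm_prev = norm
        Q.append(q.copy())
    return np.array(alpha), np.array(beta), np.array(Q)

# Illustrative points and weights (assumptions for this sketch).
xi = [0.0, 1.0, 3.0, 4.0]
w = np.array([1.0, 2.0, 1.0, 3.0])
alpha, beta, Q = monic_orthogonal(xi, w)

# Gram matrix of q_0..q_{n-1}; off-diagonal entries should vanish.
G = (Q[:4] * w) @ Q[:4].T
```

The last row `Q[4]` holds the values of $q_n$ at the points, which should vanish to rounding error.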
Equations (3.2.3), (3.2.4) with (3.2.8), (3.2.9) enable us to compute the polynomials $\{q_i\}_1^{n-1}$ step by step. Thus with $q_{-1} = 0$, $q_0 = 1$ we first compute $\alpha_1$ from (3.2.8); this substituted into (3.2.3) gives $q_1$. Now we compute $\alpha_2, \beta_1$ and find $q_2$, etc.

In inverse problems we will need to express the weights $w_i$ in terms of the polynomials $q_{n-1}$ and $q_n$. For this we note that if $f(x)$ is any polynomial in $P_{n-1}$, i.e., of degree $n-2$ or less, then
$$(q_{n-1}, f) \equiv \sum_{i=1}^n w_i\,q_{n-1}(\xi_i)f(\xi_i) = 0.$$
But if such a combination
$$\sum_{i=1}^n m_i f(\xi_i), \quad m_i = w_i\,q_{n-1}(\xi_i)$$
is zero for any $f(x)$ in $P_{n-1}$ then
$$\sum_{i=1}^n m_i\xi_i^k = 0, \quad (k = 0, 1, \ldots, n-2),$$
since each $x^k$ is in $P_{n-1}$, i.e.,
$$\mathbf{B}\mathbf{m} \equiv \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ \xi_1 & \xi_2 & \xi_3 & \cdots & \xi_n \\ \xi_1^2 & \xi_2^2 & \xi_3^2 & \cdots & \xi_n^2 \\ \vdots & \vdots & \vdots & & \vdots \\ \xi_1^{n-2} & \xi_2^{n-2} & \xi_3^{n-2} & \cdots & \xi_n^{n-2} \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_n \end{bmatrix} = \mathbf{0}. \tag{3.2.10}$$
It is shown in Ex. 3.2.2 that this equation has the solution
$$m_i = \gamma \Big/ {\prod_{j=1}^{n}}' (\xi_i - \xi_j). \tag{3.2.11}$$
Apart from the arbitrariness of the factor $\gamma$, this is the unique solution. The prime means that the term $j = i$ is omitted. Now since
$$q_n(\xi) = \prod_{j=1}^n (\xi - \xi_j), \tag{3.2.12}$$
we have
$$q_n'(\xi_i) = {\prod_{j=1}^{n}}' (\xi_i - \xi_j),$$
where the prime on the left denotes differentiation! Returning to equation (3.2.11) we can deduce that, for some $\gamma$,
$$m_i \equiv w_i\,q_{n-1}(\xi_i) = \gamma/q_n'(\xi_i), \quad i = 1, 2, \ldots, n.$$
Since the $\{q_i\}_0^n$ satisfy the three-term recurrence relation (3.2.3) it follows, by the arguments used in Section 3.1, that the zeros of $q_n(x)$ and $q_{n-1}(x)$ must interlace and therefore (Ex. 3.2.3) $q_{n-1}(\xi_i)q_n'(\xi_i) > 0$. This means that the weights
$$w_i = \gamma/\{q_{n-1}(\xi_i)q_n'(\xi_i)\} \tag{3.2.13}$$
are positive. This equation is important: it means that if the monic polynomials $q_n(\xi)$, $q_{n-1}(\xi)$ are given, and if their zeros interlace, then they may be viewed as the $n$th and $(n-1)$th members, respectively, of a sequence of monic polynomials orthogonal w.r.t. the weights $w_i$ given by (3.2.13), and the points $\{\xi_i\}_1^n$.

Exercises 3.2
1. Show that if the polynomials $\{q_i\}_0^k$, $k < n$, are orthogonal w.r.t. the inner product (3.2.2), then they are linearly independent. Hence deduce that any polynomial $p(x)$ of degree $k-1$ may be expressed uniquely in the form
$$p(x) = \sum_{j=0}^{k-1} c_jq_j(x),$$
and that $q_k(x)$ is orthogonal to each polynomial of degree $k-1$.
2. Show that if the Vandermonde determinant $\Delta$ is defined by
$$\Delta = \begin{vmatrix} 1 & 1 & \cdots & 1 \\ \xi_1 & \xi_2 & \cdots & \xi_{n-1} \\ \xi_1^2 & \xi_2^2 & \cdots & \xi_{n-1}^2 \\ \vdots & \vdots & & \vdots \\ \xi_1^{n-2} & \xi_2^{n-2} & \cdots & \xi_{n-1}^{n-2} \end{vmatrix},$$
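then
$$\Delta = \prod_{j=2}^{n-1}\prod_{k=1}^{j-1}(\xi_j - \xi_k) = \gamma \Big/ \prod_{j=1}^{n-1}(\xi_n - \xi_j),$$
where
$$\gamma = \prod_{j=2}^{n}\prod_{k=1}^{j-1}(\xi_j - \xi_k).$$
Hence deduce (3.2.11).
3. The zeros $\{\xi_i\}_1^n$ of $q_n(x)$, and $\{\eta_i\}_1^{n-1}$ of $q_{n-1}(x)$ must satisfy $\xi_1 < \eta_1 < \xi_2 < \cdots < \eta_{n-1} < \xi_n$. Show that $(-)^{n-i}q_n'(\xi_i) > 0$, $(-)^{n-i}q_{n-1}(\xi_i) > 0$ and hence $q_n'(\xi_i)q_{n-1}(\xi_i) > 0$.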
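Returning to the weight formula (3.2.13), the following sketch computes the weights from two interlacing sets of zeros and checks both their positivity and the orthogonality conditions behind (3.2.10). The function name `weights_from_zeros`, the symbol `eta` for the zeros of $q_{n-1}$, and the sample data are assumptions made for illustration.

```python
import numpy as np

def weights_from_zeros(xi, eta, gamma=1.0):
    """Weights w_i = gamma / (q_{n-1}(xi_i) q_n'(xi_i)), as in (3.2.13).

    xi: the n zeros of q_n; eta: the n-1 zeros of q_{n-1}, assumed to
    interlace the xi strictly.
    """
    xi = np.asarray(xi, dtype=float)
    eta = np.asarray(eta, dtype=float)
    # q_{n-1}(xi_i) = prod_j (xi_i - eta_j)
    q_nm1 = np.array([np.prod(x - eta) for x in xi])
    # q_n'(xi_i) = prod_{j != i} (xi_i - xi_j)
    dq_n = np.array([np.prod(np.delete(x - xi, i)) for i, x in enumerate(xi)])
    return gamma / (q_nm1 * dq_n), q_nm1

# Illustrative interlacing zeros: 0 < 1 < 2 < 3 < 5.
xi = np.array([0.0, 2.0, 5.0])
eta = np.array([1.0, 3.0])
w, q_nm1 = weights_from_zeros(xi, eta)

# (q_{n-1}, x^k) for k = 0..n-2 should all vanish.
moments = [np.sum(w * q_nm1 * xi**k) for k in range(len(xi) - 1)]
```

If the two sets of zeros do not interlace, some of the computed weights come out negative, which is a quick diagnostic for inconsistent data.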
3.3 Eigenvectors of Jacobi matrices
In this section we establish some properties of the eigenvectors of Jacobi matrices, in preparation for the solution of 'inverse mode problems'. We return to the analysis of Section 3.1 and prove

Theorem 3.3.1 The sequence $(u_{r,j})_{r=1}^n$ for the $j$th eigenvector has exactly $j-1$ sign reversals.

Proof. The $u_{r,j}$ are determined from equation (3.1.2) for $\lambda = \lambda_j$; this may be written
$$-b_{r-1}u_{r-1,j} + (a_r - \lambda_j)u_{r,j} - b_ru_{r+1,j} = 0, \quad (r = 1, 2, \ldots, n) \tag{3.3.1}$$
where $u_{0,j}, u_{n+1,j}$ are interpreted as zero, i.e.,
$$u_{0,j} = 0 = u_{n+1,j}. \tag{3.3.2}$$
Choose an arbitrary $b_n > 0$ and put
$$v_1 = u_{1,j}, \quad v_2 = b_1u_{2,j}, \ldots, \quad v_{n+1} = b_1b_2\cdots b_n u_{n+1,j}$$
and multiply equation (3.3.1) by $b_1b_2\cdots b_{r-1}$ to obtain
$$-b_{r-1}^2v_{r-1} + (a_r - \lambda_j)v_r - v_{r+1} = 0, \quad (r = 1, 2, \ldots, n). \tag{3.3.3}$$
On comparing this equation with (3.1.7) we see that it has the solution
$$v_0 = 0, \quad v_1 = 1, \quad v_r = P_{r-1}(\lambda_j), \quad (r = 1, 2, \ldots, n+1)$$
which, because of $P_n(\lambda_j) = 0$, satisfies the end-condition $v_{n+1} = 0$. Thus,
$$u_{r,j} = (b_1b_2\cdots b_{r-1})^{-1}P_{r-1}(\lambda_j), \tag{3.3.4}$$
and since $\lambda_j$ lies between the $(j-1)$th and $j$th zeros of $P_{n-1}(\lambda)$, $s_{n-1}(\lambda_j) = j-1$.

Before establishing further properties of the eigenvectors we introduce the concept of a u-line.
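Theorem 3.3.1 can be checked numerically on a small case. This is a sketch under stated assumptions: the helper `sign_reversals`, its tolerance, and the test matrix are illustrative; `numpy.linalg.eigh` returns eigenvalues in ascending order, so column $j-1$ of its eigenvector matrix corresponds to $\lambda_j$.

```python
import numpy as np

def sign_reversals(v, tol=1e-12):
    """Number of sign reversals in the sequence v, ignoring tiny entries."""
    s = np.sign(v[np.abs(v) > tol])
    return int(np.sum(s[:-1] != s[1:]))

# Illustrative Jacobi matrix (an assumption made for this sketch).
J = np.array([[2.0, -1.0, 0.0],
              [-1.0, 3.0, -2.0],
              [0.0, -2.0, 4.0]])

lam, U = np.linalg.eigh(J)          # ascending eigenvalues, orthonormal columns
counts = [sign_reversals(U[:, j]) for j in range(J.shape[0])]
```

The count is unaffected by the arbitrary overall sign that a numerical eigensolver attaches to each eigenvector.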
Definition 3.3.1 Let $\mathbf{u} = \{u_1, u_2, \ldots, u_{n+1}\}$ be a vector. We shall define the u-line as the broken line in the plane joining the points with coordinates
$$x_r = r, \quad y_r = u_r, \quad (r = 1, 2, \ldots, n+1).$$
Thus, between $(x_r, y_r)$ and $(x_{r+1}, y_{r+1})$, $y(x)$ is defined by
$$y(x) = (r + 1 - x)u_r + (x - r)u_{r+1}, \quad (r = 1, 2, \ldots, n),$$
as shown in Figure 3.3.1.

Now return to Theorem 3.3.1. For arbitrary (real) $\lambda$ the sequence given by
$$u_0 = 0, \quad u_r(\lambda) = u_1(b_1b_2\cdots b_{r-1})^{-1}P_{r-1}(\lambda), \quad (r = 1, 2, \ldots, n+1),$$
satisfies the recurrence (3.3.1) for $r = 1, 2, \ldots, n$. (It will satisfy the last equation with $u_{n+1} = 0$ iff $P_n(\lambda) = 0$.) For arbitrary $\lambda$, the vector $\mathbf{u}(\lambda) = \{u_1(\lambda), \ldots, u_{n+1}(\lambda)\}$ defines a $u(\lambda)$-line. We now investigate the nodes of this line, i.e., the points $x$ at which $y(x) = 0$.

First we note that if $u_r(\lambda) = 0$, i.e., $P_{r-1}(\lambda) = 0$, then $P_r(\lambda)$ and $P_{r-2}(\lambda)$, i.e., $u_{r+1}$ and $u_{r-1}$, will have opposite signs, so that the $u(\lambda)$-line will cross the $x$-axis at $x = r$. Secondly, if $u_r$ and $u_{r+1}$ have opposite signs, then $y(x)$ has a node between $r$ and $r+1$. This implies that the $u(\lambda_j)$-line has exactly $j$ nodes, excluding the left hand end where $u_0 = 0$, but including the right hand end. Moreover, if $\lambda_j < \lambda < \lambda_{j+1}$, then the $u(\lambda)$-line will have exactly $j$ nodes, again excluding the left hand end where $u_0 = 0$.

Table 3.3.1 shows the signs of $u_r$ for the whole range of $\lambda$-values, for the case $n = 3$. The last line in the table shows the number of nodes in the $u(\lambda)$-line. Figure 3.3.1 shows the form of the $u(\lambda)$-line for the starred values of $\lambda$. We now establish an identity which will enable us to prove further results concerning the eigenvectors.
Figure 3.3.1 - The $u(\lambda)$-lines for selected values of $\lambda$.
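Table 3.3.1 - The signs of $u_r$ for different values of $\lambda$ ($n = 3$). The three unlabelled columns are values of $\lambda$ lying strictly between neighbouring eigenvalues.

λ        0     λ_1   (λ_1 < λ < λ_2)   (λ_1 < λ < λ_2)   λ_2   (λ_2 < λ < λ_3)   λ_3
u_1      +     +     +                 +                 +     +                 +
u_2      +     +     +                 0                 -     -                 -
u_3      +     +     0                 -                 -     0                 +
u_4      +     0     -                 -                 0     +                 0
nodes    0     1     1                 1                 2     2                 3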
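The node counts in the last row of the table can be reproduced with a short computation. The sketch below is illustrative only: the names `u_line` and `count_nodes`, the choice $u_1 = 1$, $b_n = 1$, and the test matrix are assumptions. Nodes are counted on the interval to the right of the left hand end, with a vanishing $u_{n+1}$ counted as a node at the right hand end.

```python
import math

def u_line(a, b, lam):
    """u_1..u_{n+1} with u_r = P_{r-1}(lam) / (b_1 ... b_{r-1}), u_1 = 1."""
    n = len(a)
    bb = list(b) + [1.0]                      # arbitrary b_n > 0
    P = [1.0, a[0] - lam]
    for r in range(1, n):
        P.append((a[r] - lam) * P[r] - b[r - 1] ** 2 * P[r - 1])
    return [P[k] / math.prod(bb[:k]) for k in range(n + 1)]

def count_nodes(u):
    """Number of nodes of the u-line, excluding the left hand end."""
    nonzero = [x for x in u if x != 0]
    crossings = sum(1 for p, q in zip(nonzero, nonzero[1:]) if p * q < 0)
    return crossings + (1 if u[-1] == 0 else 0)

# Illustrative data: diagonal a and codiagonal magnitudes b of J.
a = [2.0, 3.0, 4.0]
b = [1.0, 2.0]
# For this J the eigenvalues lie in (0,1), (2,3) and (5,6).
nodes = {lam: count_nodes(u_line(a, b, lam)) for lam in (0.0, 1.0, 2.0, 3.0, 6.0)}
```

An interior zero such as $u_2 = 0$ at $\lambda = 2$ is handled by skipping it, since its neighbours necessarily have opposite signs.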
Consider the solutions $\mathbf{u}, \mathbf{v}$ of the equations (3.3.1) corresponding to $\lambda, \mu$ respectively. Suppose that $u_0 = 0 = v_0$ and that some positive value has been assigned to $b_n$. Then
$$-b_{r-1}u_{r-1} + a_ru_r - b_ru_{r+1} = \lambda u_r, \quad (r = 1, 2, \ldots, n)$$
$$-b_{r-1}v_{r-1} + a_rv_r - b_rv_{r+1} = \mu v_r, \quad (r = 1, 2, \ldots, n).$$
Eliminating $a_r$ from these equations, we find
$$b_rt_r - b_{r-1}t_{r-1} = (\mu - \lambda)(s_r - s_{r-1}) \tag{3.3.5}$$
where
$$t_r = u_{r+1}v_r - u_rv_{r+1}, \quad s_r = \sum_{i=1}^r u_iv_i \tag{3.3.6}$$
so that on summing over $r = p, p+1, \ldots, q$ $(1 \le p \le q \le n)$, we obtain
$$b_qt_q - b_{p-1}t_{p-1} = (\mu - \lambda)(s_q - s_{p-1}). \tag{3.3.7}$$
In particular, if $p = 1$, so that $u_0 = 0 = v_0 = t_0 = s_0$,
$$b_qt_q = (\mu - \lambda)s_q. \tag{3.3.8}$$
We now prove

Theorem 3.3.2 If $\lambda < \mu$, then between any two nodes of the $u(\lambda)$-line there is at least one node of the $u(\mu)$-line.

Proof. Let $\alpha, \beta$ $(\alpha < \beta)$ be two neighbouring nodes of the $u(\lambda)$-line and suppose that $p - 1 \le \alpha < p$, $q < \beta \le q + 1$, $(p \le q)$, so that
$$y(\alpha, \lambda) \equiv (p - \alpha)u_{p-1} + (\alpha - p + 1)u_p = 0, \tag{3.3.9}$$
$$y(\beta, \lambda) \equiv (q + 1 - \beta)u_q + (\beta - q)u_{q+1} = 0 \tag{3.3.10}$$
and $y(x, \lambda) \ne 0$ for $\alpha < x < \beta$. For the sake of definiteness suppose that $y(x, \lambda) > 0$ for $\alpha < x < \beta$; then $u_p, u_{p+1}, \ldots, u_q$ are all positive. We now need to prove that $y(x, \mu)$ has a zero between $\alpha$ and $\beta$. Suppose $y(x, \mu)$ has no such zero, that is, it has the same sign for $\alpha < x < \beta$. Without loss of generality we can assume that $y(x, \mu) > 0$ for $\alpha < x < \beta$; that is, $y(\alpha, \mu) \ge 0$, $y(\beta, \mu) \ge 0$ and $v_p, v_{p+1}, \ldots, v_q$ are all positive. Thus,
$$(p - \alpha)v_{p-1} + (\alpha - p + 1)v_p \ge 0, \tag{3.3.11}$$
$$(q + 1 - \beta)v_q + (\beta - q)v_{q+1} \ge 0, \tag{3.3.12}$$
and on eliminating $\alpha$ between (3.3.9), (3.3.11), and $\beta$ between (3.3.10), (3.3.12) we deduce that $t_{p-1} \ge 0$, $t_q \le 0$. On the other hand, $s_q - s_{p-1} = \sum_{i=p}^q u_iv_i > 0$, so that the LHS of (3.3.7) is non-positive, while the RHS is positive, providing a contradiction. If we had assumed $y(x, \mu) < 0$ for $\alpha < x < \beta$, then we would have found the LHS of (3.3.7) non-negative and the RHS negative.

Theorem 3.3.3 As $\lambda$ increases continuously, the nodes of the $u(\lambda)$-line shift continuously to the left.

Proof. Let $\alpha_1(\lambda), \alpha_2(\lambda), \ldots$ be the nodes of the $u(\lambda)$-line, and suppose $\lambda < \mu$ and that $\alpha_1(\mu), \alpha_2(\mu), \ldots$ are the nodes of the $u(\mu)$-line. We need to prove that $\alpha_r(\mu) < \alpha_r(\lambda)$ for all those values of $r$ corresponding to the $u(\lambda)$-line. Since, by Theorem 3.3.2, there is at least one of the $\alpha_r(\mu)$ between any two of the $\alpha_r(\lambda)$, it is sufficient to prove that $\alpha_1(\mu) < \alpha_1(\lambda) = x$. Suppose if possible that $\alpha_1(\mu) \ge x$ and that $q < x \le q + 1$ $(1 \le q \le n)$; then all $u_1, u_2, \ldots, u_q$ and $v_1, v_2, \ldots, v_q$ will be positive while
$$(q + 1 - x)u_q + (x - q)u_{q+1} = 0,$$
$$(q + 1 - x)v_q + (x - q)v_{q+1} \ge 0,$$
which imply $t_q \le 0$. On the other hand $s_q > 0$, which, when used with (3.3.8), provides a contradiction.

Table 3.3.1 shows how the first node of $u(\lambda)$ appears at the right hand end $(n+1)$ when $\lambda = \lambda_1$ and gradually shifts to the left, how the second node appears when $\lambda = \lambda_2$, etc.

Theorem 3.3.4 The nodes of two successive eigenvectors interlace.

Proof. Let the eigenvectors correspond to $\lambda_j$ and $\lambda_{j+1}$. The nodes of the $u(\lambda_j)$- and $u(\lambda_{j+1})$-lines are $(\alpha_r(\lambda_j))_{r=1}^j$ and $(\alpha_r(\lambda_{j+1}))_{r=1}^{j+1}$ respectively; and $\alpha_j(\lambda_j) = \alpha_{j+1}(\lambda_{j+1}) = n + 1$. Theorem 3.3.3 shows that $\alpha_1(\lambda_{j+1}) < \alpha_1(\lambda_j)$, while Theorem 3.3.2 applied to the two zeros $\alpha_{j-1}(\lambda_j)$ and $\alpha_j(\lambda_j) \equiv n + 1$ shows that $\alpha_j(\lambda_{j+1}) > \alpha_{j-1}(\lambda_j)$. These two inequalities imply that the only possible ordering of the nodes is
$$0 < \alpha_1(\lambda_{j+1}) < \alpha_1(\lambda_j) < \alpha_2(\lambda_{j+1}) < \cdots < \alpha_{j-1}(\lambda_j) < \alpha_j(\lambda_{j+1}) < \alpha_j(\lambda_j) = \alpha_{j+1}(\lambda_{j+1}) = n + 1. \tag{3.3.13}$$
The derivation of certain other important properties of the eigenmodes will be deferred until Section 5.7, where properties of an oscillatory matrix will be used. See Gladwell (1991a) [119] for some related results.
Exercises 3.3
1. Show that the first and last components of any eigenvector of a Jacobi matrix must be non-zero.
2. Show that if the matrix $\mathbf{J}$ of (3.1.4), with negative off-diagonal elements, has an eigenpair $\lambda_i, \mathbf{u}_i$, then the corresponding matrix, $\tilde{\mathbf{J}}$ say, with positive off-diagonal elements, has eigenpair $\lambda_i, \mathbf{Z}\mathbf{u}_i$, where $\mathbf{Z}$ is given by $\mathbf{Z} = \mathrm{diag}(1, -1, 1, \ldots, (-)^{n-1})$. This means that the eigenvector corresponding to the smallest eigenvalue, $\lambda_1$, has $n-1$ sign changes, while that corresponding to $\lambda_n$ has none. Show that if the eigenvalues of $\tilde{\mathbf{J}}$ are numbered in reverse, i.e., $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$, then Theorem 3.3.1 remains valid.

3.4 Generalised eigenvalue problems
In Section 2.4 we showed that the eigenvalue problem for a finite element model of a vibrating rod could be reduced to a generalised eigenvalue problem
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{u} = \mathbf{0} \tag{3.4.1}$$
where $\mathbf{K}, \mathbf{M}$ were both symmetric tridiagonal matrices, $\mathbf{K}$ having negative codiagonal and $\mathbf{M}$ having positive codiagonal. If $\mathbf{M}$ is positive definite, and $\mathbf{K}$ is positive semi-definite (i.e., $\mathbf{K}$ is a Jacobi matrix), then the analysis of Chapter 1 shows that the eigenvalues are non-negative. Under these conditions we may prove that the solutions of (3.4.1) share the properties of the eigenvalue problem in normal form, i.e., equation (3.1.7). In particular, we can show that the eigenvalues of (3.4.1) are distinct and that the sequence $(u_{r,j})_{r=1}^n$ for the $j$th eigenvector has exactly $j-1$ sign reversals. To obtain these results we need to return to the analysis in Section 3.1 onwards and see what changes have to be made. We start with the principal minors of the matrix $\mathbf{K} - \lambda\mathbf{M}$, using the notation of (2.4.10):
$$P_0(\lambda) = 1, \quad P_1(\lambda) = c_1 - \lambda a_1, \quad P_2(\lambda) = \begin{vmatrix} c_1 - \lambda a_1 & -d_1 - \lambda b_1 \\ -d_1 - \lambda b_1 & c_2 - \lambda a_2 \end{vmatrix}, \ldots \tag{3.4.2}$$
so that finally $P_n(\lambda) = \det(\mathbf{K} - \lambda\mathbf{M})$. The minors satisfy the three-term recurrence relation
$$P_{r+1}(\lambda) = (c_{r+1} - \lambda a_{r+1})P_r(\lambda) - (d_r + \lambda b_r)^2P_{r-1}(\lambda). \tag{3.4.3}$$
The argument used in Sections 3.1, 3.3 was based on the fact that the sequence of principal minors defined by (3.1.5), (3.1.7) was a Sturm sequence. The sequence defined by (3.4.2), (3.4.3), however, is not a Sturm sequence. For if $P_r(\lambda) = 0$, then $P_{r+1}(\lambda) = -(d_r + \lambda b_r)^2P_{r-1}(\lambda)$, and if it happens that $d_r + \lambda b_r = 0$, then $P_{r+1}(\lambda)$ would be zero, and not, as required by condition 2 of Theorem 3.1.1, of opposite sign to $P_{r-1}(\lambda)$.

Now we make the crucial observation that if we restrict attention to $\lambda \ge 0$, then the $P_r(\lambda)$ do form a Sturm sequence, because $b_r, d_r$ being positive eliminates the possibility that $d_r + \lambda b_r = 0$. If we assume that $\mathbf{M}$ is positive definite and $\mathbf{K}$ is positive semi-definite, then all the eigenvalues $\lambda_r$ will be non-negative and we may proceed as before. Thus Theorem 3.1.1 holds provided that $\lambda \ge 0$, and Theorem 3.1.2 holds. The proof of Theorem 3.1.3 must be slightly changed. In the expansion of $P_s(\lambda)$ in powers of $\lambda$ we have
$$P_s(\lambda) = \pi_{s,0} + \pi_{s,1}\lambda + \cdots + \pi_{s,s}\lambda^s. \tag{3.4.4}$$
The first term, $\pi_{s,0}$, is the $s$th principal minor of $\mathbf{K}$ and, since $\mathbf{K}$ is positive semi-definite, $\pi_{s,0} > 0$ for $s = 1, 2, \ldots, n-1$ and $\pi_{n,0} \ge 0$; since $P_s(0) = \pi_{s,0}$ we have $s_r(0) = 0$. The last term in (3.4.4) is $\pi_{s,s} = (-)^s$ (the $s$th principal minor of $\mathbf{M}$), so that for sufficiently large $\lambda$, i.e., $\lambda \ge \beta$, $s_r(\lambda) = r$. The remainder of the proof of Theorem 3.1.3, the Corollaries 3.1.1-3.1.3 and Theorem 3.1.4 follow as before.

We need to make small changes in the proof of Theorem 3.3.1. The $u_{r,j}$ are determined from the equations
$$-(d_{r-1} + \lambda_jb_{r-1})u_{r-1,j} + (c_r - \lambda_ja_r)u_{r,j} - (d_r + \lambda_jb_r)u_{r+1,j} = 0 \tag{3.4.5}$$
for $r = 1, 2, \ldots, n$, where $u_{0,j} = 0 = u_{n+1,j}$. Put $d_r + \lambda_jb_r = e_r$, choose an arbitrary $e_n > 0$, and put
$$v_1 = u_{1,j}, \quad v_2 = e_1u_{2,j}, \ldots, \quad v_{n+1} = e_1e_2\cdots e_nu_{n+1,j}$$
and multiply equation (3.4.5) by $e_1e_2\cdots e_{r-1}$ to obtain
$$-e_{r-1}^2v_{r-1} + (c_r - \lambda_ja_r)v_r - v_{r+1} = 0, \quad r = 1, 2, \ldots, n.$$
On comparing this with (3.4.3) we see that it has the solution
$$v_0 = 0, \quad v_1 = 1, \quad v_r = P_{r-1}(\lambda_j), \quad (r = 1, 2, \ldots, n+1).$$
Again, we conclude that $s_{n-1}(\lambda_j) = j - 1$. We may make similar changes to the proofs of Theorems 3.3.2-3.3.4.

Exercises 3.4
1. Make appropriate changes in the proofs of Theorems 3.3.2-3.3.4.
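The claims of Section 3.4 can be tested on a small generalised problem. The sketch below is illustrative: the numerical values of $c_r, d_r, a_r, b_r$ are assumptions chosen so that $\mathbf{K}$ and $\mathbf{M}$ are positive definite, and the generalised problem is reduced to a symmetric standard one through a Cholesky factor of $\mathbf{M}$ (a standard technique, not one described in the text).

```python
import numpy as np

def tridiag(diag, off):
    """Symmetric tridiagonal matrix with given diagonal and codiagonal."""
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

# Illustrative data in the notation of (3.4.2): K has diagonal c and
# codiagonal -d, M has diagonal a and codiagonal +b.
c = np.array([2.0, 3.0, 4.0]); d = np.array([1.0, 2.0])
a = np.array([4.0, 4.0, 4.0]) / 6; b = np.array([1.0, 1.0]) / 6
K = tridiag(c, -d)
M = tridiag(a, b)

def minors(lam):
    """P_0..P_n of K - lam*M from the recurrence (3.4.3)."""
    P = [1.0, c[0] - lam * a[0]]
    for r in range(1, len(c)):
        P.append((c[r] - lam * a[r]) * P[r]
                 - (d[r - 1] + lam * b[r - 1]) ** 2 * P[r - 1])
    return P

# Reduce (K - lam M)u = 0 to a symmetric problem with M = L L^T.
L = np.linalg.cholesky(M)
A = np.linalg.solve(L, np.linalg.solve(L, K).T).T   # L^{-1} K L^{-T}
lam, Y = np.linalg.eigh(A)
U = np.linalg.solve(L.T, Y)                          # eigenvectors of (3.4.1)

def sign_reversals(v, tol=1e-12):
    s = np.sign(v[np.abs(v) > tol])
    return int(np.sum(s[:-1] != s[1:]))

counts = [sign_reversals(U[:, j]) for j in range(3)]
```

For this data the eigenvalues should be positive and distinct, and `counts` should list $j - 1$ sign reversals for the $j$th eigenvector.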
Chapter 4
Inverse Problems for Jacobi Systems

People are generally better persuaded by the reasons which they themselves have discovered than by those which have come into the minds of others.
Pascal's Pensées, 10

4.1 Introduction
Research on these inverse problems began in the former Soviet Union, with the work of M.G. Krein. It appears that his primary interest was in the qualitative properties of the solutions of, and the inverse problems for, the Sturm-Liouville equation (see Chapter 10), and the discrete problems were studied because such problems were met in any approximate analysis of Sturm-Liouville problems. Krein’s early papers Krein (1933) [198], Krein (1934) [199] concern the theory of Sturm sequences, while the Supplement to Gantmacher and Krein (1950) [98], Gantmacher and Krein (2002) and Krein (1952) [202] make use of the theory of continued fractions developed by Stieltjes (1918) [310]. Krein sees his results as giving mechanical interpretations of Stieltjes’ analysis. Consider the simple system shown in Figure 4.1.1a.
Figure 4.1.1 - The system (springs $k_1$, $k_2$; masses $m_1$, $m_2$) is a) free and b) fixed at the right hand end
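If $m_1, m_2, k_1, k_2$ are given, then the analysis of Chapter 2 shows how we can find the two natural frequencies, $\omega_1, \omega_2$, of the system: $\lambda_1 = \omega_1^2$, $\lambda_2 = \omega_2^2$ are the eigenvalues of the equation
$$\begin{bmatrix} k_1 + k_2 - \lambda m_1 & -k_2 \\ -k_2 & k_2 - \lambda m_2 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \mathbf{0}. \tag{4.1.1}$$
The eigenvalues are the roots of the determinant:
$$\Delta(\lambda) \equiv m_1m_2\lambda^2 - \{k_2m_1 + (k_1 + k_2)m_2\}\lambda + k_1k_2 = 0. \tag{4.1.2}$$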
Now consider the inverse problem. First it is clear that if one set of values $m_1, m_2, k_1, k_2$ has been found that yields specified eigenvalues $\lambda_1, \lambda_2$, and if $a > 0$, then $am_1, am_2, ak_1, ak_2$ will be another set yielding the same eigenvalues: there are not four quantities to be found, only three ratios $m_1 : m_2 : k_1 : k_2$. Knowing these ratios, we would need one more quantity, for instance the total mass $m = m_1 + m_2$, or the total stiffness $k$ given by $1/k = 1/k_1 + 1/k_2$, to find the absolute values of the four quantities $m_1, m_2, k_1, k_2$.

But even knowing two eigenvalues $\lambda_1, \lambda_2$, we cannot find the three ratios; we need one more piece of information. One possible piece is the single eigenvalue $\lambda^* = \omega^{*2}$ of the system obtained by fixing $m_2$, as shown in Figure 4.1.1b. This is
$$\lambda^* = \omega^{*2} = \frac{k_1 + k_2}{m_1}. \tag{4.1.3}$$
The sum and product of the roots $\lambda_1, \lambda_2$ of equation (4.1.2) are
$$\lambda_1 + \lambda_2 = \frac{k_2m_1 + (k_1 + k_2)m_2}{m_1m_2} = \frac{k_2}{m_2} + \frac{k_1 + k_2}{m_1}, \tag{4.1.4}$$
$$\lambda_1\lambda_2 = \frac{k_1k_2}{m_1m_2}. \tag{4.1.5}$$
Subtracting (4.1.3) from (4.1.4) we obtain
$$\frac{k_2}{m_2} = \lambda_1 + \lambda_2 - \lambda^*, \tag{4.1.6}$$
and then (4.1.5) gives
$$\frac{k_1}{m_1} = \frac{\lambda_1\lambda_2}{\lambda_1 + \lambda_2 - \lambda^*}, \tag{4.1.7}$$
and finally (4.1.3) gives
$$\frac{k_2}{m_1} = \lambda^* - \frac{k_1}{m_1} = \lambda^* - \frac{\lambda_1\lambda_2}{\lambda_1 + \lambda_2 - \lambda^*} = \frac{(\lambda^* - \lambda_1)(\lambda_2 - \lambda^*)}{\lambda_1 + \lambda_2 - \lambda^*}. \tag{4.1.8}$$
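Equations (4.1.6)-(4.1.8) translate directly into a few lines of code. The sketch below is illustrative: the function name `two_mass_from_spectra`, the symbol `lam_star` for the constrained eigenvalue, and the choice of fixing $m_1$ to set the overall scale are assumptions made here.

```python
import math

def two_mass_from_spectra(lam1, lam2, lam_star, m1=1.0):
    """Recover (m1, m2, k1, k2) from lam1 < lam_star < lam2.

    The three ratios come from (4.1.6)-(4.1.8); m1 is supplied to fix
    the otherwise arbitrary overall scale.
    """
    if not (lam1 < lam_star < lam2):
        raise ValueError("need lam1 < lam_star < lam2 for a realistic system")
    k2_over_m2 = lam1 + lam2 - lam_star          # (4.1.6)
    k1_over_m1 = lam1 * lam2 / k2_over_m2        # (4.1.7)
    k2_over_m1 = lam_star - k1_over_m1           # (4.1.8)
    k1 = k1_over_m1 * m1
    k2 = k2_over_m1 * m1
    m2 = k2 / k2_over_m2
    return m1, m2, k1, k2

# Forward problem for an illustrative system m1=1, m2=2, k1=3, k2=4:
# lam1 + lam2 = 9 and lam1 * lam2 = 6 by (4.1.4), (4.1.5); lam_star = 7.
disc = math.sqrt(9.0 ** 2 - 4.0 * 6.0)
lam1, lam2 = (9.0 - disc) / 2, (9.0 + disc) / 2
recovered = two_mass_from_spectra(lam1, lam2, 7.0, m1=1.0)
```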
The general theory of vibration under constraint (Section 2.9) states that $\lambda_1 < \lambda^* < \lambda_2$, so that all the quantities on the right hand sides of (4.1.6)-(4.1.8) are positive: the solution is realistic.

The theory presented in this Chapter provides various generalisations of this analysis to a lumped-mass system made up of $n$ masses. The Chapter falls into three parts: a discussion of inverse problems for a Jacobi matrix; mass-spring realisations of these problems; generalisations and variants of these problems.
Exercises 4.1
1. Show that if $\mathbf{u}_1, \mathbf{u}_2$ are the eigenvectors of (4.1.1), normalised so that $\mathbf{u}_i^T\mathbf{M}\mathbf{u}_j = \delta_{ij}$, then the equation giving the eigenvalue $\lambda^*$ is
$$\frac{u_{2,1}^2}{\lambda_1 - \lambda^*} + \frac{u_{2,2}^2}{\lambda_2 - \lambda^*} = 0$$
so that knowing $\lambda^*$ is equivalent to knowing $u_{2,2} : u_{2,1}$.
2. Show that for the system of Figure 4.1.1, the system of given stiffness $k$, $(1/k = 1/k_1 + 1/k_2)$, and least mass $m = m_1 + m_2$, is found for $\lambda^* = \lambda_1 + \lambda_2 - \sqrt{\lambda_1\lambda_2}$.
3. Show that for a taut string with tension $T$ and unit length with just one concentrated mass $m$ located at a distance $\ell_1$ from the left hand end, $\ell_2$ from the right, the frequency $\omega$ is given by
$$k_1 + k_2 - \lambda m = 0$$
where
$$k_i = \frac{T}{\ell_i}, \quad \ell_1 + \ell_2 = 1.$$
Hence find the system of least mass having a given frequency $\omega = \sqrt{\lambda}$. This suggests the problem of finding a string of least mass having concentrated masses $(m_i)_1^n$ separated by distances $\ell_1, \ell_2, \ldots, \ell_{n+1}$, where $\sum_{i=1}^{n+1}\ell_i = 1$. Barcilon and Turchetti (1980) [23] considered this problem in a wider context, but did not find a closed form solution for the discrete problem.
4.2 An inverse problem for a Jacobi matrix
It was shown in Section 3.1 that the (natural frequencies)$^2$ of a lumped mass system may be obtained as the eigenvalues of a Jacobi matrix
$$\mathbf{J} = \begin{bmatrix} a_1 & -b_1 & & & \\ -b_1 & a_2 & -b_2 & & \\ & \ddots & \ddots & \ddots & \\ & & -b_{n-2} & a_{n-1} & -b_{n-1} \\ & & & -b_{n-1} & a_n \end{bmatrix}. \tag{4.2.1}$$
If the system is connected, i.e., the stiffnesses between masses are strictly positive, then the codiagonal elements $-b_i$ are strictly negative. The basic theorem is
Theorem 4.2.1 There is a unique Jacobi matrix $\mathbf{J}$ having specified eigenvalues $(\lambda_i)_1^n$, where
$$0 \le \lambda_1 < \lambda_2 < \cdots < \lambda_n \tag{4.2.2}$$
and with normalised eigenvectors $(\mathbf{u}_i)_1^n$ having non-zero specified values $(u_{1i})_1^n$ or $(u_{ni})_1^n$ of their first or last components respectively; recall that $\mathbf{u}_i = \{u_{1i}, u_{2i}, \ldots, u_{ni}\}$.

(We recall Ex. 3.3.1, that the first and last components of an eigenvector of a Jacobi matrix are both non-zero.)

Proof. The theorem is at once an existence (there is ...) and a uniqueness (... a unique) theorem. We shall prove existence by actually constructing a matrix, and will do so by using the so-called Lanczos algorithm; the algorithm demonstrates that $\mathbf{J}$ is unique. This algorithm has the advantage that numerically it is well conditioned. An independent proof that the matrix is unique is left to Ex. 4.2.2. The proof will be presented for the case in which $(u_{1j})_1^n$ are specified. The eigenvectors $\mathbf{u}_i$ satisfy
$$\mathbf{J}\mathbf{u}_i = \lambda_i\mathbf{u}_i. \tag{4.2.3}$$
Use the column vectors $(\mathbf{u}_i)_1^n$ to construct a square matrix $\mathbf{U}$: $\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$. The orthonormality conditions $\mathbf{u}_i^T\mathbf{u}_j = \delta_{ij}$ yield
$$\mathbf{U}^T\mathbf{U} = \mathbf{I}.$$
This means that $\mathbf{U}^T$ is the inverse of $\mathbf{U}$: $\mathbf{U}$ is an orthogonal matrix. But if $\mathbf{U}^T\mathbf{U} = \mathbf{I}$, then Theorem 1.3.6 states that $\mathbf{U}\mathbf{U}^T = \mathbf{I}$ also. Now put $\mathbf{U}^T = \mathbf{X}$; then $\mathbf{U}\mathbf{U}^T = \mathbf{X}^T\mathbf{X} = \mathbf{I}$. But this means that the columns of $\mathbf{X}$, like the columns of $\mathbf{U}$, are orthonormal. Call the columns $(\mathbf{x}_i)_1^n$, so that $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n]$; then
$$\mathbf{x}_i^T\mathbf{x}_j = \delta_{ij}.$$
The reason why we have introduced the vectors $\mathbf{x}_i$ is that
$$\mathbf{x}_1 = \{x_{11}, x_{21}, \ldots, x_{n1}\} = \{u_{11}, u_{12}, \ldots, u_{1n}\} \tag{4.2.4}$$
is given as part of the data. Now we proceed to rewrite the eigenvalue equations (4.2.3) as equations for the $\mathbf{x}_i$. The set of equations (4.2.3) for $i = 1, 2, \ldots, n$ may be written
$$\mathbf{J}\mathbf{U} = \mathbf{U}\boldsymbol{\Lambda}, \quad \boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n). \tag{4.2.5}$$
Thus, on transposing we find
$$\mathbf{X}\mathbf{J} = \boldsymbol{\Lambda}\mathbf{X}. \tag{4.2.6}$$
Written in full, this equation is
$$[\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n] \begin{bmatrix} a_1 & -b_1 & & \\ -b_1 & a_2 & -b_2 & \\ & \ddots & \ddots & -b_{n-1} \\ & & -b_{n-1} & a_n \end{bmatrix} = \boldsymbol{\Lambda}[\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n]. \tag{4.2.7}$$
67
Take this equation column by column. The first column is d1 x1 e1 x2 = ax1 =
(4.2.8)
Premultiply this by xW1 , using xW1 x1 = 1> xW1 x2 = 0; d1 xW1 x1 = d1 = xW1 ax1 = Now rewrite equation (4.2.8) as e1 x2 = d1 x1 ax1 = z2 = The vector z2 is known, because d1 > x1 > a are all known. The vector x2 is to be a unit vector, so that e1 ||x2 || = e1 = ||z2 ||= and x2 = z2 @e1 . (4.2.7):
Having found d1 > e1 > x2 we proceed to the next column of e1 x1 + d2 x2 e2 x3 = ax2
Again, premultiplying by xW2 we find d2 = xW2 ax2 , and then e2 x3 = d2 x2 e1 x1 ax2 = z3 so that e2 ||x3 || = e2 = ||z3 ||>
x3 = z3 @e3
and so on. This procedure is called the Lanczos algorithm; see Lanczos (1950) [203], Golub (1973) [132], Golub and Van Loan (1983) [135] and Kautsky and Golub (1983) [192]. It produces a matrix J and at the same time constructs the columns (xl )q1 which yield X = UW . Actually, what we have described is an inverse version of the original Lanczos algorithm. This original algorithm solved the following problem: Given a symmetric matrix A and a vector x1 such that xW1 x1 = 1, compute a symmetric tridiagonal matrix J and an orthogonal matrix X = [x1 > x2 > = = = > xq ] such that A = XJXW . In our use of the algorithm, we start with A = a. We have defined a Jacobi matrix as a positive semi-definite symmetric tridiagonal matrix with strictly negative codiagonal. If the spectrum (l )q1 satisfies the inequalities (4.2.2), so that 1 0, then the J constructed by the Lanczos algorithm from A a will be a Jacobi matrix. Exercises 4.2 1. Show that the vectors xl constructed in the Lanczos algorithm satisfy xWl xm = lm
l> m = 1> 2> = = = > q
even though this orthogonality is apparently established only for |lm| 1.
68
Chapter 4 2. Show that there cannot be two distinct Jacobi matrices J and J0 with (J) = (J0 ) and with the same values of the first components (x1l )q1 of their normalised eigenvectors. 3. Rewrite the procedure described in equation (4.2.5) on, to solve the original Lanczos problem.
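The column-by-column construction in the proof of Theorem 4.2.1 can be sketched as follows. This is a minimal illustration, not a production implementation: the function name `inverse_lanczos` and the sample data are assumptions, the supplied first components are rescaled to unit length, and no reorthogonalisation is performed, so it should only be trusted for small, well-separated spectra.

```python
import numpy as np

def inverse_lanczos(lam, x1):
    """Build J from eigenvalues lam and first eigenvector components x1.

    Follows the steps after (4.2.8): a_r = x_r^T Lambda x_r, then
    b_r x_{r+1} = a_r x_r - b_{r-1} x_{r-1} - Lambda x_r.
    """
    lam = np.asarray(lam, dtype=float)
    x = np.asarray(x1, dtype=float)
    x = x / np.linalg.norm(x)            # x_1 must be a unit vector
    n = lam.size
    a = np.zeros(n)
    b = np.zeros(n - 1)
    x_prev = np.zeros(n)
    b_prev = 0.0
    for r in range(n):
        a[r] = x @ (lam * x)             # Lambda is diagonal
        if r == n - 1:
            break
        z = a[r] * x - b_prev * x_prev - lam * x
        b[r] = np.linalg.norm(z)
        x_prev, x, b_prev = x, z / b[r], b[r]
    return np.diag(a) - np.diag(b, 1) - np.diag(b, -1)

# Illustrative data (assumptions): distinct eigenvalues and a unit
# vector of non-zero first components.
lam = np.array([1.0, 3.0, 6.0])
x1 = np.array([0.5, 0.5, np.sqrt(0.5)])
J = inverse_lanczos(lam, x1)

# Check: the spectrum of J and the first components of its eigenvectors.
lam_check, U = np.linalg.eigh(J)
first_components = np.abs(U[0, :])
```

The absolute value in the check reflects the sign ambiguity of numerically computed eigenvectors.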
4.3 Variants of the inverse problem for a Jacobi matrix
First, we introduce some notation. Suppose $\mathbf{A} \in M_n$. The set of eigenvalues of $\mathbf{A}$, the spectrum of $\mathbf{A}$, is denoted by $\sigma(\mathbf{A})$. If $\mathbf{A}$ is symmetric, i.e., $\mathbf{A} \in S_n$, then $\sigma(\mathbf{A})$ is a sequence of real numbers $(\lambda_i)_1^n$, where $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \le \lambda_n$. If $\mathbf{M}, \mathbf{K} \in S_n$, then the set of eigenvalues of equation (3.1.1) is denoted by $\sigma(\mathbf{M}, \mathbf{K})$; again it is a sequence of real numbers $(\lambda_i)_1^n$ satisfying $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$. See Kautsky and Golub (1983) [192], deBoor and Saff (1986) [76] for a discussion that places the Jacobi matrix problem in a wider context. Friedland and Melkman (1979) [94] discuss the inverse eigenvalue problem in the context of non-negative matrices.

If $\mathbf{A} \in S_n$, the matrix obtained by deleting the $i$th row and column of $\mathbf{A}$ is called a truncated matrix. It will sometimes be denoted by $\mathbf{A}_i$; its eigenvalues will be denoted $\sigma(\mathbf{A}_i)$. Now suppose that $\mathbf{A} \in S_n$ is a Jacobi matrix $\mathbf{J}$; then its eigenvalues will be distinct, and the eigenvalues $\sigma(\mathbf{J}_1) = (\mu_i)_1^{n-1}$ will strictly interlace $(\lambda_i)_1^n$, i.e.,
$$0 \le \lambda_1 < \mu_1 < \lambda_2 < \cdots < \mu_{n-1} < \lambda_n. \tag{4.3.1}$$
The problem of reconstructing $\mathbf{J}$ from $\sigma(\mathbf{J})$ and $\sigma(\mathbf{J}_1)$ seems to have been studied first by Hochstadt (1967) [173]. He proved that there is at most one matrix $\mathbf{J}$ with the required property. Hochstadt (1973) [176] attempted to construct this unique Jacobi matrix, but he did not show that his method would always lead to real values of the codiagonal elements $b_i$. Hald (1976) [160] presented another construction and showed that, in theory, it would always work provided that the eigenvalues satisfied the interlacing condition (4.3.1). In practice, however, the construction was found to break down due to loss of significant figures. Hald also showed that Hochstadt's construction will always lead to real $b_i$ provided that (4.3.1) holds. Gray and Wilson (1976) [154] presented an alternative, inductive construction of $\mathbf{J}$. An independent uniqueness proof was given by Hald (1976) [160].

In this section we shall present two methods for constructing $\mathbf{J}$. The first relies on the theory of orthogonal polynomials described in Section 3.2. The second, which will later be generalised to inverse problems for band matrices, relies on the Lanczos algorithm described in Section 4.2. Note that we have chosen to define a Jacobi matrix so that it is positive semi-definite. Many of the results require only the interlacing of the $\lambda$'s and the $\mu$'s, without any restriction on positivity.
4. Inverse Problems for Jacobi Systems
The first method is best described by supposing that $\sigma(\mathbf{J}) = (\lambda_i)_1^n$ and $\sigma(\mathbf{J}_n) = (\mu_i)_1^{n-1}$ are known, and satisfy (4.3.1). Remember that $\mathbf{J}_n$ is obtained by deleting the $n$th row and column of $\mathbf{J}$. Now we form monic polynomials $p_i(\lambda)$, rather than the polynomials $P_i(\lambda)$ in equation (3.1.5). We form two polynomials
$$p_n(\lambda) = \prod_{i=1}^{n}(\lambda - \lambda_i), \qquad p_{n-1}(\lambda) = \prod_{i=1}^{n-1}(\lambda - \mu_i). \tag{4.3.2}$$
The polynomials are the $n$th and $(n-1)$th monic polynomials of the sequence of monic polynomials with weights given by equation (3.2.13), i.e., up to a constant factor,
$$w_i \propto \frac{1}{p_{n-1}(\lambda_i)\,p_n'(\lambda_i)} \tag{4.3.3}$$
and points $(\lambda_i)_1^n$. In addition, they are the $n$th and $(n-1)$th principal minors of the matrix $(\lambda\mathbf{I} - \mathbf{J})$. The polynomials $p_r(\lambda)$ therefore satisfy
$$p_r(\lambda) = (\lambda - a_r)\,p_{r-1}(\lambda) - b_{r-1}^2\,p_{r-2}(\lambda). \tag{4.3.4}$$
Hald's method of reconstructing $\mathbf{J}$ is as follows: he starts from $p_n(\lambda), p_{n-1}(\lambda)$ and constructs $p_{n-2}(\lambda)$, and in the process finds $a_n$ and $b_{n-1}$, by synthetic division. Then from $p_{n-1}(\lambda), p_{n-2}(\lambda)$ he constructs $p_{n-3}(\lambda)$ and finds $a_{n-1}$ and $b_{n-2}$, and so on. The process is inherently unstable because the polynomials $p_{n-2}, p_{n-3}, \dots, p_1$ are found by successively cancelling the leading terms in the preceding pair of polynomials, and this cancellation destroys the leading digits.

de Boor and Golub (1978) [75] proceed quite differently. Having found the weights $w_i$ by using (4.3.3), they construct the polynomials in the natural order by using the analysis of Section 3.2, i.e.,
$$p_{-1}(\lambda) = 0, \qquad p_0(\lambda) = 1, \tag{4.3.5}$$
$$p_r(\lambda) = (\lambda - a_r)\,p_{r-1}(\lambda) - b_{r-1}^2\,p_{r-2}(\lambda), \tag{4.3.6}$$
with the numbers $a_r, b_r$ computed by
$$a_r = \frac{(\lambda p_{r-1}, p_{r-1})}{\|p_{r-1}\|^2}, \qquad b_r = \frac{\|p_r\|}{\|p_{r-1}\|}, \qquad r = 1, 2, \dots, n-1. \tag{4.3.7}$$
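The forward recurrence (4.3.5)-(4.3.7) can be sketched as follows. This is an illustrative implementation only: it assumes the discrete inner product $(p, q) = \sum_i w_i\,p(\lambda_i)\,q(\lambda_i)$, the function name `jacobi_from_weights` is hypothetical, and the weights in the usage example are taken to be the squared first components of the eigenvectors of a small test matrix (which is what the weights amount to for the leading principal minors).

```python
import numpy as np

def jacobi_from_weights(lam, w):
    """Recurrence (4.3.5)-(4.3.7) with (p, q) = sum_i w_i p(lam_i) q(lam_i).

    Returns the diagonal a and the codiagonal magnitudes b of J.
    """
    lam = np.asarray(lam, dtype=float)
    w = np.asarray(w, dtype=float)
    n = lam.size
    a = np.zeros(n)
    b = np.zeros(n - 1)
    p_prev = np.zeros(n)          # p_{-1} sampled at the points lam_i
    p = np.ones(n)                # p_0 sampled at the points lam_i
    norm2 = w @ (p * p)           # ||p_0||^2
    for r in range(n):
        a[r] = w @ (lam * p * p) / norm2          # a_{r+1}
        if r == n - 1:
            break
        b2_prev = b[r - 1] ** 2 if r > 0 else 0.0
        p_next = (lam - a[r]) * p - b2_prev * p_prev
        norm2_next = w @ (p_next * p_next)
        b[r] = np.sqrt(norm2_next / norm2)        # b_{r+1} = ||p_{r+1}|| / ||p_r||
        p_prev, p, norm2 = p, p_next, norm2_next
    return a, b

# Illustrative data: a small tridiagonal matrix with codiagonal -b.
a_true = np.array([1.0, 2.0, 3.0, 4.0])
b_true = np.array([0.5, 1.0, 1.5])
J = np.diag(a_true) - np.diag(b_true, 1) - np.diag(b_true, -1)
lam, U = np.linalg.eigh(J)
a, b = jacobi_from_weights(lam, U[0, :] ** 2)
```

Only ratios of inner products enter the recurrence, so the weights may be scaled by any positive constant without changing `a` and `b`.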
This process is numerically stable. The only major difficulty encountered by de Boor and Golub lay in the computation of the weights $w_i$. In seeking to overcome this difficulty, they used the reflection of $\mathbf{J}$ about its second diagonal. The matrix
$$\mathbf{T} = \begin{bmatrix} 0 & 0 & \cdots & 1 \\ & & 1 & \\ & \cdot & & \\ 1 & & & 0 \end{bmatrix} \tag{4.3.8}$$
is orthogonal and symmetric, so that $\mathbf{T}^2 = \mathbf{I}$. It reverses the order of the rows and the columns of $\mathbf{J}$, i.e., it transforms $\mathbf{J}$ into
$$\bar{\mathbf{J}} = \mathbf{T}\mathbf{J}\mathbf{T} = \begin{bmatrix} a_n & -b_{n-1} & & & \\ -b_{n-1} & a_{n-1} & -b_{n-2} & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & -b_1 \\ & & & -b_1 & a_1 \end{bmatrix}. \tag{4.3.9}$$
If, therefore, the elements of $\bar{\mathbf{J}}$ are denoted by $\bar{a}_r, \bar{b}_r$ then
$$\bar{a}_r = a_{n+1-r}, \qquad \bar{b}_r = b_{n-r}.$$
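The leading principal minors of $\lambda\mathbf{I} - \bar{\mathbf{J}}$ are the trailing principal minors of $\lambda\mathbf{I} - \mathbf{J}$; we denote them by $\bar{p}_r(\lambda)$. We prove

Theorem 4.3.1 For $i = 1, 2, \dots, n$,
$$p_{n-1}(\lambda_i)\,\bar{p}_{n-1}(\lambda_i) = (b_1 b_2 \cdots b_{n-1})^2 = b^2.$$

Proof. For once we step out of sequence, and use the notation we will introduce in Section 6.2. Let $\mathbf{B} = \lambda_i\mathbf{I} - \mathbf{J}$ and let $\theta$ denote the sequence $\{2, 3, \dots, n-1\}$, then
$$p_{n-1}(\lambda_i) = B(\theta \cup 1), \qquad \bar{p}_{n-1}(\lambda_i) = B(\theta \cup n).$$
Using Sylvester's theorem (Corollary 2 of Theorem 6.2.2), with $B(\theta)$ as pivotal block, we obtain
$$0 = B(\theta)\det(\mathbf{B}) = \begin{vmatrix} B(\theta \cup 1) & B(\theta \cup 1; \theta \cup n) \\ B(\theta \cup 1; \theta \cup n) & B(\theta \cup n) \end{vmatrix},$$
i.e.,
$$0 = p_{n-1}(\lambda_i)\,\bar{p}_{n-1}(\lambda_i) - (b_1 b_2 \cdots b_{n-1})^2.$$

This result means that the polynomials $\bar{p}_n(\lambda), \bar{p}_{n-1}(\lambda), \dots, \bar{p}_1(\lambda), \bar{p}_0(\lambda)$ are the monic polynomials related to the weights
$$w_i = \frac{b^2}{\bar{p}_{n-1}(\lambda_i)\,\bar{p}_n'(\lambda_i)} = \frac{p_{n-1}(\lambda_i)}{p_n'(\lambda_i)}. \tag{4.3.10}$$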
These weights are more easily constructed than those in (4.3.3). In this procedure, the terms in the matrix $\mathbf{J}$ are computed in the order $\bar{a}_1, \bar{b}_1, \bar{a}_2, \dots, \bar{b}_{n-1}, \bar{a}_n$, i.e., in the order $a_n, b_{n-1}, a_{n-1}, \dots, b_1, a_1$.

The second method of constructing $\mathbf{J}$ is due to Golub and Boley (1977) [133]. See also de Boor and Sa (1986) [76]. It relies on the fact that, once we know $\sigma(\mathbf{J})$ and $\sigma(\mathbf{J}_1)$ we may compute the vector $\mathbf{x}_1$ of first components of the eigenvectors of $\mathbf{J}$; these are the data needed for construction by the Lanczos algorithm of Section 4.2. We can carry out the analysis for an arbitrary symmetric matrix $\mathbf{A} \in S_n$, rather than a Jacobi matrix. Barcilon (1978) [19] concentrated on the eigenvectors corresponding to $\lambda_i$ and $\mu_i$, rather than using the $\mu_i$ to find the quantities $x_{i1}$; his subsequent analysis did not lend itself to computation.

If $\mathbf{A} \in S_n$, then the eigenvalues of $\mathbf{A}_1$ are the stationary values of $\mathbf{u}^T\mathbf{A}\mathbf{u}$ subject to $\mathbf{u}^T\mathbf{u} = 1$ and the constraint $u_1 = 0$, i.e., $\mathbf{u}^T\mathbf{e}_1 = 0$. Thus they are the stationary values of
$$f = \mathbf{u}^T\mathbf{A}\mathbf{u} - \lambda\,\mathbf{u}^T\mathbf{u} - 2\nu\,\mathbf{u}^T\mathbf{e}_1, \tag{4.3.11}$$
where $\mathbf{e}_1 = \{1, 0, \dots, 0\}$ and $\lambda, \nu$ are Lagrange parameters. The condition that $f$ be stationary yields
$$\mathbf{A}\mathbf{u} - \lambda\mathbf{u} - \nu\mathbf{e}_1 = \mathbf{0}. \tag{4.3.12}$$
Since the eigenvectors $\mathbf{u}_i$ of $\mathbf{A}$ span $V_n$, we may write
$$\mathbf{u} = \sum_{i=1}^{n}\alpha_i\mathbf{u}_i, \tag{4.3.13}$$
and then
$$\mathbf{A}\mathbf{u} = \sum_{i=1}^{n}\alpha_i\mathbf{A}\mathbf{u}_i = \sum_{i=1}^{n}\lambda_i\alpha_i\mathbf{u}_i,$$
so that (4.3.12) becomes
$$\sum_{i=1}^{n}(\lambda_i - \lambda)\,\alpha_i\mathbf{u}_i = \nu\mathbf{e}_1,$$
and the orthogonality condition $\mathbf{u}_j^T\mathbf{u}_i = \delta_{ij}$ gives
$$(\lambda_j - \lambda)\,\alpha_j = \nu\,\mathbf{u}_j^T\mathbf{e}_1 = \nu u_{1j} = \nu x_{j1},$$
where we have used (4.2.4). Substituting for $\alpha_i$ in (4.3.13) we find
$$\mathbf{u} = \nu\sum_{i=1}^{n}\frac{x_{i1}\mathbf{u}_i}{\lambda_i - \lambda}, \tag{4.3.14}$$
and the condition $u_1 = 0$, and $u_{1i} = x_{i1}$, yields the eigenvalue equation
$$\sum_{i=1}^{n}\frac{(x_{i1})^2}{\lambda_i - \lambda} = 0. \tag{4.3.15}$$
We note that if $\mathbf{A}$ is a Jacobi matrix, none of the coefficients $x_{i1}$ will be zero (Ex. 3.3.1). The analysis of Section 2.9 shows that the roots $(\mu_i)_1^{n-1}$ of this equation will then strictly interlace the $(\lambda_i)_1^n$, as in (4.3.1). Since $\mathbf{x}_1 = \{x_{11}, x_{21}, \dots, x_{n1}\}$ and $\mathbf{x}_1$ is the first column of the orthogonal matrix $\mathbf{X} = \mathbf{U}^T$, we have $\|\mathbf{x}_1\|^2 = 1 = \sum_{i=1}^{n}(x_{i1})^2$, so that we have the identity
$$\sum_{i=1}^{n}\frac{(x_{i1})^2}{\lambda - \lambda_i} = \frac{\prod_{i=1}^{n-1}(\lambda - \mu_i)}{\prod_{i=1}^{n}(\lambda - \lambda_i)}. \tag{4.3.16}$$
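The identity (4.3.16) says that the $(x_{i1})^2$ are the residues, at $\lambda = \lambda_i$, of the ratio of the two characteristic polynomials. A short numerical check of this is sketched below; the function name `first_components_squared` and the test matrix are assumptions introduced for illustration.

```python
import numpy as np

def first_components_squared(lam, mu):
    """Residues of prod(lam - mu_j) / prod(lam - lam_j) at lam = lam_i."""
    lam = np.asarray(lam, dtype=float)
    mu = np.asarray(mu, dtype=float)
    x_sq = np.empty(lam.size)
    for i in range(lam.size):
        num = np.prod(lam[i] - mu)                   # numerator evaluated at lam_i
        den = np.prod(np.delete(lam[i] - lam, i))    # denominator with factor i omitted
        x_sq[i] = num / den
    return x_sq

# Illustrative tridiagonal matrix with codiagonal -b.
b = np.array([0.5, 1.0, 1.5])
J = np.diag([1.0, 2.0, 3.0, 4.0]) - np.diag(b, 1) - np.diag(b, -1)
lam, U = np.linalg.eigh(J)
mu = np.linalg.eigvalsh(J[1:, 1:])    # spectrum of J_1 (first row and column deleted)
x_sq = first_components_squared(lam, mu)
```

Here `x_sq` can be compared with `U[0, :] ** 2`, the squared first components of the normalised eigenvectors computed directly.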
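(Note that, for large $\lambda$, both sides approach $1/\lambda$.) On multiplying (4.3.16) through by $(\lambda - \lambda_i)$ and then putting $\lambda = \lambda_i$ we find
$$(x_{i1})^2 = \frac{\prod_{j=1}^{n-1}(\mu_j - \lambda_i)}{\prod_{j=1}^{n}{}'(\lambda_j - \lambda_i)}, \qquad i = 1, 2, \dots, n, \tag{4.3.17}$$
where $'$ indicates that the term $j = i$ has been omitted. The interlacing condition ensures that the right hand side of (4.3.17) is strictly positive for each $i = 1, 2, \dots, n$. This equation thus yields $\mathbf{x}_1$.

We stress the importance of the analysis in equations (4.3.11)-(4.3.17). It shows that if $\mathbf{A}$ is an arbitrary symmetric matrix, then $\sigma(\mathbf{A})$ and $\sigma(\mathbf{A}_1)$ determine the vector $\mathbf{x}_1$ of first components of the normalised eigenvectors of $\mathbf{A}$. Conversely, $\sigma(\mathbf{A})$ and $\mathbf{x}_1$ determine $\sigma(\mathbf{A}_1)$.

There is a third inverse problem which appears in a number of contexts. Given two strictly increasing sequences $(\lambda_i)_1^n$ and $(\lambda_i^*)_1^n$ with
$$0 \le \lambda_1 < \lambda_1^* < \lambda_2 < \lambda_2^* < \cdots < \lambda_n < \lambda_n^* \tag{4.3.18}$$
determine $\mathbf{J} \in S_n$ such that $\sigma(\mathbf{J}) = (\lambda_i)_1^n$, and $\sigma(\mathbf{J}^*) = (\lambda_i^*)_1^n$, where $\mathbf{J}^* = (a_1^* - a_1)\mathbf{E}_{1,1} + \mathbf{J}$. (The matrix $\mathbf{J}^*$ differs from $\mathbf{J}$ only in the 1,1 position.)

Suppose $\mathbf{A} \in S_n$ is an arbitrary symmetric matrix, and that $\mathbf{A}^*$ differs from $\mathbf{A}$ only in the 1,1 position, i.e., $\mathbf{A}^* = \mathbf{A} + (a_{1,1}^* - a_{1,1})\mathbf{E}_{1,1}$. We will show that $\sigma(\mathbf{A})$ and $\sigma(\mathbf{A}^*)$ determine $\mathbf{x}_1$. The eigenvalue equation for $\mathbf{A}^*$ is
$$\mathbf{A}^*\mathbf{u} = \lambda^*\mathbf{u} \tag{4.3.19}$$
which we write
$$\mathbf{A}\mathbf{u} + (a_{1,1}^* - a_{1,1})\,u_1\mathbf{e}_1 = \lambda^*\mathbf{u}.$$
Write
$$\mathbf{u} = \sum_{i=1}^{n}\alpha_i\mathbf{u}_i, \tag{4.3.20}$$
so that equation (4.3.19) becomes
$$\sum_{i=1}^{n}\lambda_i\alpha_i\mathbf{u}_i + (a_{1,1}^* - a_{1,1})\,u_1\mathbf{e}_1 = \lambda^*\sum_{i=1}^{n}\alpha_i\mathbf{u}_i,$$
and therefore,
$$(\lambda^* - \lambda_i)\,\alpha_i = (a_{1,1}^* - a_{1,1})\,u_1 u_{1i},$$
which when substituted into (4.3.20), yields
$$\mathbf{u} = (a_{1,1}^* - a_{1,1})\,u_1\sum_{i=1}^{n}\frac{u_{1i}\mathbf{u}_i}{\lambda^* - \lambda_i}.$$
Equating the first components on each side of this equation, we have
$$1 = (a_{1,1}^* - a_{1,1})\sum_{i=1}^{n}\frac{x_{i1}^2}{\lambda^* - \lambda_i},$$
where $x_{i1} = u_{1i}$. The roots of this equation are $(\lambda_i^*)_1^n$, so that
$$1 - (a_{1,1}^* - a_{1,1})\sum_{i=1}^{n}\frac{x_{i1}^2}{\lambda - \lambda_i} = \prod_{i=1}^{n}\left(\frac{\lambda - \lambda_i^*}{\lambda - \lambda_i}\right) \tag{4.3.21}$$
and therefore
$$(a_{1,1}^* - a_{1,1})\,x_{i1}^2 = (\lambda_i^* - \lambda_i)\prod_{j=1}^{n}{}'\left(\frac{\lambda_j^* - \lambda_i}{\lambda_j - \lambda_i}\right). \tag{4.3.22}$$
By comparing the traces of $\mathbf{A}$ and $\mathbf{A}^*$ we see that
$$a_{1,1}^* - a_{1,1} = \sum_{j=1}^{n}(\lambda_j^* - \lambda_j) > 0. \tag{4.3.23}$$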
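Equations (4.3.22) and (4.3.23) can be checked numerically on a small example. The sketch below is illustrative only: the function name `first_components_from_shift`, the test matrix and the size of the change made to the 1,1 entry are assumptions, not data from the text.

```python
import numpy as np

def first_components_from_shift(lam, lam_star):
    """Return (a* - a, x_sq) from the spectra of A and A*, via (4.3.22)-(4.3.23)."""
    lam = np.asarray(lam, dtype=float)
    lam_star = np.asarray(lam_star, dtype=float)
    n = lam.size
    delta = np.sum(lam_star - lam)            # trace condition (4.3.23)
    x_sq = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i            # omit the term j = i
        ratio = np.prod((lam_star[others] - lam[i]) / (lam[others] - lam[i]))
        x_sq[i] = (lam_star[i] - lam[i]) * ratio / delta
    return delta, x_sq

# Illustrative data: change only the 1,1 entry of a small tridiagonal matrix.
b = np.array([0.5, 1.0, 1.5])
A = np.diag([1.0, 2.0, 3.0, 4.0]) - np.diag(b, 1) - np.diag(b, -1)
A_star = A.copy()
A_star[0, 0] += 0.7
lam, U = np.linalg.eigh(A)
lam_star = np.linalg.eigvalsh(A_star)
delta, x_sq = first_components_from_shift(lam, lam_star)
```

In this example `delta` should recover the change 0.7 in the 1,1 entry, and `x_sq` should match the squared first components `U[0, :] ** 2`.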
Thus, equation (4.3.22) expresses $(x_{i1})^2$ in terms of $\sigma(\mathbf{A})$ and $\sigma(\mathbf{A}^*)$, and the interlacing condition (4.3.18) ensures that $(x_{i1})^2$ will be positive. If we know that $\mathbf{A}$ is a Jacobi matrix then, of course, we can use the Lanczos algorithm to determine it. Note that nowhere in the analysis do we need the restriction that $\lambda_1$ is non-negative; only the strict interlacing is needed.

A matrix $\mathbf{A}$ is said to be persymmetric if it is symmetric, and also symmetric about the second diagonal, the one going from top right to bottom left. Thus $\mathbf{A}$ is persymmetric if $\bar{\mathbf{A}}$ given by (4.3.9) satisfies
$$\bar{\mathbf{A}} = \mathbf{A}. \tag{4.3.24}$$
If $\mathbf{A}$ is tridiagonal and persymmetric, then
$$a_r = a_{n+1-r}, \qquad b_r = b_{n-r}. \tag{4.3.25}$$
The final inverse problem considered here concerns the reconstruction of a persymmetric Jacobi matrix. Now we need only one spectrum, not two. We prove

Theorem 4.3.2 There is a unique persymmetric Jacobi matrix $\mathbf{J}$ with $\sigma(\mathbf{J}) = (\lambda_i)_1^n$, satisfying $0 \le \lambda_1 < \lambda_2 < \cdots < \lambda_n$.

Proof. The simplest proof is perhaps to show that if the eigenvalues $(\lambda_i)_1^n$ are known, then it is possible to find the weights for the construction of the orthogonal polynomials $p_r(\lambda)$. Indeed if $\mathbf{J}$ is persymmetric then the minor $p_r(\lambda)$ is equal to $\bar{p}_r(\lambda)$. But then Theorem 4.3.1 shows that
$$[p_{n-1}(\lambda_i)]^2 = b^2, \quad \text{i.e.,} \quad p_{n-1}(\lambda_i) = \pm b,$$
so that equation (4.3.10) yields
$$w_i = \pm b / p_n'(\lambda_i). \tag{4.3.26}$$
Since the signs of $p_n'(\lambda_i)$ will alternate with $i$, then so must the signs in (4.3.26) if the $w_i$ are to be positive. The magnitude of $b$ is irrelevant to the construction of the $p_r(\lambda)$.

See Hochstadt (1979) [182] for another variant of this inverse eigenvalue problem.

Exercises 4.3
1. Show that if $\mathbf{B} = \lambda_i\mathbf{I} - \mathbf{J}$, then $B(1, 2, \dots, n-1;\ 2, 3, \dots, n) = (-)^{n-1} b_1 b_2 \cdots b_{n-1}$.
2. Show that the $x_{i1}$ computed from (4.3.22) do satisfy $\sum_{i=1}^{n} x_{i1}^2 = 1$.
3. If you like using a computer, then try to reconstruct a Jacobi matrix using Hald's method, or that of de Boor and Golub. Start with the matrix $\mathbf{J}$ with $a_i = 2$, $b_i = 1$, $i = 1, 2, \dots, n-1$; $a_n = 2$. Set up recurrence relations to give $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ and use these as data to reconstruct $\mathbf{J}$.
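The proof of Theorem 4.3.2 is constructive, and a numerical sketch of it is given here. It is an illustration under stated assumptions: the weights are taken as $w_i = 1/|p_n'(\lambda_i)|$ (the constant $b$ is dropped because only ratios of inner products matter), the orthogonal polynomials are built with the forward recurrence of de Boor and Golub, and the function name `persymmetric_jacobi` and the test data are hypothetical.

```python
import numpy as np

def persymmetric_jacobi(lam):
    """Build the persymmetric tridiagonal matrix with spectrum lam.

    Weights follow (4.3.26) up to scale: w_i = 1 / |p_n'(lam_i)|.
    """
    lam = np.sort(np.asarray(lam, dtype=float))
    n = lam.size
    w = np.array([1.0 / abs(np.prod(np.delete(lam[i] - lam, i))) for i in range(n)])
    a = np.zeros(n)
    b = np.zeros(n - 1)
    p_prev = np.zeros(n)          # p_{-1} at the points lam_i
    p = np.ones(n)                # p_0 at the points lam_i
    norm2 = w @ (p * p)
    for r in range(n):
        a[r] = w @ (lam * p * p) / norm2
        if r == n - 1:
            break
        b2_prev = b[r - 1] ** 2 if r > 0 else 0.0
        p_next = (lam - a[r]) * p - b2_prev * p_prev
        norm2_next = w @ (p_next * p_next)
        b[r] = np.sqrt(norm2_next / norm2)
        p_prev, p, norm2 = p, p_next, norm2_next
    return np.diag(a) - np.diag(b, 1) - np.diag(b, -1)

# Illustrative persymmetric test matrix: a_r = a_{n+1-r}, b_r = b_{n-r}.
b_true = np.array([0.5, 2.0, 0.5])
J_true = np.diag([1.0, 3.0, 3.0, 1.0]) - np.diag(b_true, 1) - np.diag(b_true, -1)
J_rec = persymmetric_jacobi(np.linalg.eigvalsh(J_true))
```

Only one spectrum is supplied, and `J_rec` should agree with `J_true` to rounding error.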
4.4 Reconstructing a spring-mass system; by end constraint
We may divide the problem of reconstructing an in-line spring-mass system into three stages:
i) Formulate the problem as an inverse eigenvalue problem for a Jacobi matrix $\mathbf{J}$.
ii) Solve this problem and find $\mathbf{J}$.
iii) Recover the mass and stiffness matrices $\mathbf{M}$ and $\mathbf{K}$ from $\mathbf{J}$.

Stage i) was discussed in Section 3.1; we repeat the analysis here. For an in-line system, the frequency equation governing free vibration is
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{y} = \mathbf{0}. \tag{4.4.1}$$
For the system shown in Figure 4.4.1 the matrices $\mathbf{K}$ and $\mathbf{M}$ are given explicitly in (2.2.7). We write $\mathbf{M} = \mathbf{D}^2$, where $\mathbf{D} = \mathrm{diag}(d_1, d_2, \dots, d_n)$, put $\mathbf{D}\mathbf{y} = \mathbf{u}$ and reduce (4.4.1) to
$$(\mathbf{J} - \lambda\mathbf{I})\mathbf{u} = \mathbf{0}, \tag{4.4.2}$$
where
$$\mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}. \tag{4.4.3}$$
Stage ii) was the subject of Section 4.3. Given the spectra of the systems in Figure 4.4.1a) and b), i.e., $\sigma(\mathbf{J}) = (\lambda_i)_1^n$ and $\sigma(\mathbf{J}_1) = (\mu_i)_1^{n-1}$, we construct $\mathbf{x}_1$, the vector of first components of the eigenvectors $\mathbf{u}_i$ of (4.4.2), and then construct $\mathbf{J}$ by using the Lanczos algorithm of Section 4.2.
Figure 4.4.1 - Two possible ways of constraining the end of a fixed-free system (panels a, b and c)

It remains to consider Stage iii). By using the explicit form of $\mathbf{K}$ in equation (2.2.7) we can verify that if
$$\mathbf{e} = \{1, 1, \dots, 1\}, \tag{4.4.4}$$
then
$$\mathbf{K}\mathbf{e} = \{k_1, 0, 0, \dots, 0\}. \tag{4.4.5}$$
Physically, this equation states that a static force $k_1$ applied to mass $m_1$ will extend the first spring by unit amount and at the same time displace all the remaining masses $m_2, m_3, \dots, m_n$ by unit amount to the right, as if everything to the right of $m_1$ were a rigid body. Since $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$ we have
$$\mathbf{D}\mathbf{J}\mathbf{D}\mathbf{e} = \mathbf{D}\mathbf{J}\mathbf{D}\{1, 1, \dots, 1\} = \{k_1, 0, 0, \dots, 0\},$$
i.e.,
$$\mathbf{J}\mathbf{d} = \mathbf{J}\{d_1, d_2, \dots, d_n\} = \{k_1/d_1, 0, 0, \dots, 0\}. \tag{4.4.6}$$
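(Note that $\mathbf{D} = \mathrm{diag}(d_1, d_2, \dots, d_n)$, while $\mathbf{d} = \{d_1, d_2, \dots, d_n\}$.) We need to be sure that $\mathbf{d}$ so calculated will be a strictly positive vector. We prove

Theorem 4.4.1 If $\mathbf{J} \in S_n$ is a non-singular Jacobi matrix, then $\mathbf{J}^{-1}$ is a strictly positive matrix, meaning that each element of $\mathbf{J}^{-1}$ is strictly positive; we write this $\mathbf{J}^{-1} > 0$.

Proof. We use induction. Write
$$\mathbf{J} = \begin{bmatrix} a_1 & -\mathbf{b}^T \\ -\mathbf{b} & \mathbf{J}_1 \end{bmatrix}, \qquad \mathbf{b} = \{b_1, 0, \dots, 0\}.$$
We will have achieved our goal if we can show that if $\mathbf{J}_1^{-1} > 0$ then $\mathbf{J}^{-1} > 0$. Suppose
$$\mathbf{J}^{-1} = \begin{bmatrix} h_1 & \mathbf{h}^T \\ \mathbf{h} & \mathbf{H} \end{bmatrix},$$
then
$$\mathbf{J}\mathbf{J}^{-1} = \begin{bmatrix} a_1 & -\mathbf{b}^T \\ -\mathbf{b} & \mathbf{J}_1 \end{bmatrix}\begin{bmatrix} h_1 & \mathbf{h}^T \\ \mathbf{h} & \mathbf{H} \end{bmatrix} = \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}$$
so that
$$-\mathbf{b}\mathbf{h}^T + \mathbf{J}_1\mathbf{H} = \mathbf{I}, \qquad -\mathbf{b}h_1 + \mathbf{J}_1\mathbf{h} = \mathbf{0}.$$
Since $\mathbf{J}$ is a non-singular Jacobi matrix, it is positive definite; so therefore is $\mathbf{J}^{-1}$, by Ex. 1.4.2; therefore $h_1 > 0$, and so $\mathbf{h} = \mathbf{J}_1^{-1}\mathbf{b}h_1 > 0$. (Note that the product of $\mathbf{J}_1^{-1}$, which is strictly positive by hypothesis, and the non-negative non-zero vector $\mathbf{b}$, is strictly positive.) Therefore,
$$\mathbf{H} = \mathbf{J}_1^{-1}\mathbf{b}\mathbf{h}^T + \mathbf{J}_1^{-1} > 0.$$
(Note that since $\mathbf{J}_1^{-1} > 0$, all we need in order to prove that $\mathbf{H} > 0$, is that $\mathbf{J}_1^{-1}\mathbf{b}\mathbf{h}^T \ge 0$, i.e., $\mathbf{h} \ge 0$; actually though, $\mathbf{h} > 0$.) Thus $\mathbf{H} > 0$, $\mathbf{h} > 0$ and $h_1 > 0$ so that $\mathbf{J}^{-1} > 0$.

We may now return to equation (4.4.6). Take the unique reconstructed non-singular $\mathbf{J}$ and solve
$$\mathbf{J}\mathbf{x} = \mathbf{J}\{x_1, x_2, \dots, x_n\} = \{1, 0, \dots, 0\} = \mathbf{e}_1.$$
The solution $\mathbf{x}$ is strictly positive: $\mathbf{x} > 0$. Thus the solution of equation (4.4.6) is $\mathbf{d} = c\mathbf{x}$, for some as yet unknown $c > 0$. The total mass of the system is
$$m = \sum_{i=1}^{n} m_i = \sum_{i=1}^{n} d_i^2 = \|\mathbf{d}\|^2 = c^2\|\mathbf{x}\|^2.$$
Thus, knowing $m$ and $\|\mathbf{x}\|^2$, we can find $c > 0$ and $\mathbf{d}$, and thus $\mathbf{D}$. Then $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$, and because $\mathbf{K}$ satisfies (4.4.5), it necessarily (Ex. 4.4.1) has the form $\mathbf{K} = \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T$ given in equation (2.2.12), where $\hat{\mathbf{K}} = \mathrm{diag}(k_1, k_2, \dots, k_n)$. This completes the reconstruction. The reconstruction from the spectra of a) and c) proceeds along similar lines; we merely renumber the masses starting from the right (Ex. 4.4.2).

This reconstruction may be used in a reversed situation: it shows that any non-singular Jacobi matrix $\mathbf{J}$ may be expressed uniquely as
$$\mathbf{J} = \mathbf{D}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{D}^{-1}, \tag{4.4.7}$$
where $\mathbf{D}, \hat{\mathbf{K}}$ are strictly positive diagonal matrices and $\|\mathbf{D}\| = 1$; this corresponds to $m = 1$ in equation (4.4.6).

Now we consider the fixed-fixed case shown in Figure 4.4.2a; there is essentially only one constraint we can apply, to $m_n$, as shown in Figure 4.4.2b).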
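Stage iii) for the fixed-free system can be sketched numerically as follows. This is a minimal illustration under assumptions: the function name `untangle_fixed_free` is hypothetical, the stiffness matrix is taken to have diagonal $k_i + k_{i+1}$ (with $k_{n+1} = 0$) and codiagonal $-k_{i+1}$, and the masses and stiffnesses in the example are invented test data.

```python
import numpy as np

def untangle_fixed_free(J, total_mass):
    """Recover masses and spring stiffnesses from J and the total mass.

    Solves J x = e_1, scales d = c x so that sum(d**2) = total_mass,
    and forms K = D J D, following equation (4.4.6).
    """
    n = J.shape[0]
    e1 = np.zeros(n)
    e1[0] = 1.0
    x = np.linalg.solve(J, e1)                   # strictly positive by Theorem 4.4.1
    c = np.sqrt(total_mass) / np.linalg.norm(x)
    d = c * x
    K = np.diag(d) @ J @ np.diag(d)
    k = np.empty(n)
    k[0] = K[0, :].sum()                         # first component of K e, i.e. k_1
    k[1:] = -np.diag(K, 1)                       # codiagonal of K is -k_2, ..., -k_n
    return d ** 2, k

# Illustrative forward problem, then the reconstruction.
m_true = np.array([1.0, 2.0, 3.0])
k_true = np.array([4.0, 5.0, 6.0])
K = (np.diag(k_true + np.append(k_true[1:], 0.0))
     - np.diag(k_true[1:], 1) - np.diag(k_true[1:], -1))
d_true = np.sqrt(m_true)
J = K / np.outer(d_true, d_true)                 # J = D^{-1} K D^{-1}
m_rec, k_rec = untangle_fixed_free(J, m_true.sum())
```

The only datum needed beyond $\mathbf{J}$ is the total mass, which fixes the scale factor $c$.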
Figure 4.4.2 - A fixed-fixed system, and a constrained system (panels a and b)

We start our analysis as before. The stiffness matrix for the system in a) is
$$\mathbf{K} = \begin{bmatrix} k_1 + k_2 & -k_2 & & \\ -k_2 & k_2 + k_3 & -k_3 & \\ & \cdot & \cdot & \cdot \\ & & -k_n & k_n + k_{n+1} \end{bmatrix}. \tag{4.4.8}$$
Knowing the spectra $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ of the systems a) and b) we can construct $\mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}$ where again $\mathbf{M} = \mathbf{D}^2$. Now however
$$\mathbf{K}\{1, 1, \dots, 1\} = \{k_1, 0, 0, \dots, 0, k_{n+1}\}: \tag{4.4.9}$$
this states that in order to produce unit static displacements of the masses, we must apply two forces, $k_1$ at $m_1$ and $k_{n+1}$ at $m_n$. Thus
$$\mathbf{D}\mathbf{J}\mathbf{D}\{1, 1, \dots, 1\} = k_1\mathbf{e}_1 + k_{n+1}\mathbf{e}_n$$
so that
$$\mathbf{J}\mathbf{d} = \mathbf{J}\{d_1, d_2, \dots, d_n\} = (k_1/d_1)\mathbf{e}_1 + (k_{n+1}/d_n)\mathbf{e}_n. \tag{4.4.10}$$
First, consider the equation
$$\mathbf{J}\mathbf{y} = \mathbf{e}_n. \tag{4.4.11}$$
Simple algebra shows that the solution is
$$y_i = b_i b_{i+1} \cdots b_{n-1} P_{i-1}/P_n, \tag{4.4.12}$$
where $P_i$ is the $i$th leading principal minor of $\mathbf{J}$ (see equation (1.4.6)). Since $\mathbf{J}$ is positive definite, equation (4.4.12) confirms that the solution $\mathbf{y}$ is positive, as predicted by Theorem 4.4.1. We can find the solution of
$$\mathbf{J}\mathbf{x} = \mathbf{e}_1 \tag{4.4.13}$$
in a similar way (Ex. 4.4.3); all we need here is that, according to Theorem 4.4.1, $\mathbf{x} > 0$. Using $\mathbf{x}$ and $\mathbf{y}$ we may write the solution of (4.4.10) as
$$\mathbf{d} = (k_1/d_1)\mathbf{x} + (k_{n+1}/d_n)\mathbf{y}. \tag{4.4.14}$$
In particular,
$$d_n = (k_1/d_1)x_n + (k_{n+1}/d_n)y_n = \frac{k_1}{d_1}x_n + \frac{k_{n+1}}{d_n}\frac{P_{n-1}}{P_n}. \tag{4.4.15}$$
But $P_n = \prod_{i=1}^{n}\lambda_i$ and $P_{n-1} = \prod_{i=1}^{n-1}\mu_i$, so that we can write equation (4.4.15) as
$$m_n - k_{n+1}\frac{\prod_{i=1}^{n-1}\mu_i}{\prod_{i=1}^{n}\lambda_i} = \frac{k_1 d_n x_n}{d_1}. \tag{4.4.16}$$
Now consider this equation. The system in Figure 4.4.2a) has $2n+1$ parameters. Choose one of the parameters, and divide the remaining $2n$ parameters by it; we obtain $2n$ ratios. The two spectra $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ provide $2n-1$ ratios, so one more ratio is needed. The chosen parameter is merely a scaling factor; the total mass, or alternatively one individual mass, say $m_n$, would determine it. If we take $m_n$ as known, then equation (4.4.16) states that the required $2n$th ratio, $k_{n+1}/m_n$, must be chosen so that
$$0 < \frac{k_{n+1}}{m_n} < \frac{\prod_{i=1}^{n}\lambda_i}{\prod_{i=1}^{n-1}\mu_i}. \tag{4.4.17}$$
This inequality was first pointed out by Nylen and Uhlig (1997a) [253]. Once we have chosen $k_{n+1}/m_n$ satisfying this inequality, then equation (4.4.16) determines $k_1/d_1$, since $x_n$ is known, and $d_n = m_n^{1/2}$. With $k_{n+1}/d_n$ and $k_1/d_1$ known, equation (4.4.14) gives $\mathbf{d}$ and hence $\mathbf{D}$ and $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$. The reconstruction is complete.

The third system is free-free, as shown in Figure 4.4.3a); constraining $m_1$ we obtain the fixed-free system in Figure 4.4.3b).
Figure 4.4.3 - A constraint is applied to a free-free system (panels a and b)

The pair is essentially the same as the pair in Figure 4.4.1, with $k_1 = 0$. The analysis starts as before; the only difference is that the lowest frequency of a) is $\lambda_1 = 0$. Still, from $\sigma(\mathbf{J})$ and $\sigma(\mathbf{J}_1)$ we can construct $\mathbf{J}$ uniquely, but now $\mathbf{J}$ will be singular, i.e., positive semi-definite. The stiffness matrix $\mathbf{K}$ of system a) will satisfy
$$\mathbf{K}\{1, 1, \dots, 1\} = \mathbf{0}. \tag{4.4.18}$$
Now we need a result like Theorem 4.4.1 which covers the case when $\mathbf{J}$ is singular. It is

Theorem 4.4.2 If $\mathbf{J}$ is a singular Jacobi matrix then the equation $\mathbf{J}\mathbf{x} = \mathbf{0}$ has a unique strictly positive solution $\mathbf{x}$ satisfying $\|\mathbf{x}\| = 1$.

The proof is straightforward; see Ex. 4.4.4. Now we may complete the reconstruction. We take $\mathbf{J}$ and write $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$, then (4.4.18) becomes
$$\mathbf{K}\mathbf{e} = \mathbf{D}\mathbf{J}\mathbf{D}\mathbf{e} = \mathbf{D}\mathbf{J}\mathbf{d} = \mathbf{0}.$$
Thus $\mathbf{d} = c\mathbf{x}$ where $\mathbf{x}$ is governed by Theorem 4.4.2, and if the total mass $m = 1$, then $c = 1$. This gives $\mathbf{d}$ and hence $\mathbf{D}$ and
$$\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D} = \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T$$
where $\hat{\mathbf{K}} = \mathrm{diag}(k_1, k_2, \dots, k_n)$. Again, we can use this result to show that an arbitrary singular Jacobi matrix may be written
$$\mathbf{J} = \mathbf{D}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{D}^{-1} \tag{4.4.19}$$
where now $\hat{\mathbf{K}}$ has first diagonal entry zero.
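The free-free reconstruction can be sketched in the same spirit. The code below is an illustration only: the function name `untangle_free_free` is hypothetical, the null vector of $\mathbf{J}$ is obtained from a symmetric eigensolver (one of several possible ways), and the masses and stiffnesses are invented test data.

```python
import numpy as np

def untangle_free_free(J, total_mass=1.0):
    """Recover masses and springs k_2..k_n from a singular Jacobi matrix J.

    Uses the positive unit null vector x of J (Theorem 4.4.2), d = c x.
    """
    vals, vecs = np.linalg.eigh(J)
    x = vecs[:, 0]                     # eigenvector for the smallest eigenvalue (zero)
    x = x * np.sign(x.sum())           # choose the strictly positive orientation
    d = np.sqrt(total_mass) * x        # ||x|| = 1, so sum(d**2) = total_mass
    K = np.diag(d) @ J @ np.diag(d)
    return d ** 2, -np.diag(K, 1)

# Illustrative free-free chain with total mass 1.
m_true = np.array([0.2, 0.3, 0.5])
k_true = np.array([4.0, 5.0])          # k_2, k_3
K = np.array([[4.0, -4.0, 0.0],
              [-4.0, 9.0, -5.0],
              [0.0, -5.0, 5.0]])
d_true = np.sqrt(m_true)
J = K / np.outer(d_true, d_true)
m_rec, k_rec = untangle_free_free(J)
```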
Exercises 4.4
1. Show that if $\mathbf{K}\mathbf{e} = k_1\mathbf{e}_1$, and $\mathbf{E}^{-1}$ is given by (2.2.10), then $\mathbf{K}\mathbf{E}^{-T}$ is bidiagonal and $\mathbf{E}^{-1}\mathbf{K}\mathbf{E}^{-T}$ is diagonal.
2. Reconstruct the system of Figure 4.4.1a) from the spectra $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ of a) and c) respectively.
3. Use the solution (4.4.12) of equation (4.4.11), and the transformation from $\mathbf{J}$ to $\bar{\mathbf{J}}$ given in (4.3.9) to find the solution to equation (4.4.13).
4. Provide a constructive proof of Theorem 4.4.2, by writing $\mathbf{x}$ in terms of the principal minors of $\mathbf{J}$.
5. Suppose that the eigenvalues $(\lambda_i)_1^n$ of the system in Figure 4.4.2a are known, as are the eigenvalues $(\lambda_i^*)_1^n$ when the stiffness $k_{n+1}$ is replaced by some unknown stiffness $k_{n+1}^*$. Show that there is a one-parameter family of systems, each member of which has the stated eigenvalues.
6. Show that if $\mathbf{J}$ is a non-singular Jacobi matrix, then its inverse $\mathbf{C} = \mathbf{J}^{-1}$ has the form
$$\mathbf{C} = \begin{bmatrix} u_1 v_1 & u_1 v_2 & \cdots & u_1 v_n \\ u_1 v_2 & u_2 v_2 & \cdots & u_2 v_n \\ \cdot & \cdot & \cdots & \cdot \\ u_1 v_n & u_2 v_n & \cdots & u_n v_n \end{bmatrix}, \quad \text{i.e.,} \quad c_{ij} = \begin{cases} u_i v_j, & i \le j, \\ u_j v_i, & i \ge j, \end{cases}$$
and that $(u_i)_1^n$, $(v_i)_1^n$ are strictly positive, and satisfy
$$\frac{u_1}{v_1} < \frac{u_2}{v_2} < \cdots < \frac{u_n}{v_n}.$$
This result is quoted in Gantmacher and Krein (1950) [98], but may have been known earlier.
4.5 Reconstruction by using modification
The simplest way to modify a system is to attach a spring at a free end, thus going from the system in Figure 4.5.1a) to that in Figure 4.5.1b). (We have renumbered the masses so that the spring is attached at $m_1$.)

Figure 4.5.1 - A spring is added to the system

This is an example of the analysis of Section 4.3. The spectra for a) and b) are $\sigma(\mathbf{J}) = (\lambda_i)_1^n$ and $\sigma(\mathbf{J}^*) = (\lambda_i^*)_1^n$ respectively. Because we have added stiffness to the system, we have $\lambda_i^* > \lambda_i$, as in (4.3.18).
i) Use the trace condition to find
$$a_1^* - a_1 = \sum_{i=1}^{n}(\lambda_i^* - \lambda_i).$$
ii) Use $a_1 = k_1/m_1$ and $a_1^* = (k_1 + k_0)/m_1$ to find $k_0/m_1 = a_1^* - a_1$.
iii) Use $a_1^* - a_1$ and equation (4.3.22) to find $x_{i1}^2$, and hence $\mathbf{x}_1 = \{x_{11}, x_{21}, \dots, x_{n1}\}$.
iv) Use the Lanczos algorithm to find $\mathbf{J}$.
v) Use a variant of the analysis given in Section 4.4 to untangle $\mathbf{K}$ and $\mathbf{M}$ from $\mathbf{J}$.

As an alternative modification we may add mass to the system, specifically a mass $m_1^*$ to $m_1$. In this case it is easier to work initially with the original equation (4.4.1) than with the reduced equation (4.4.2). Again, we start with the free-fixed system of Figure 4.5.1a. The eigenvalue problem for a) is
$$\mathbf{K}\mathbf{y}_i = \lambda_i\mathbf{M}\mathbf{y}_i.$$
That for the modified system is
$$\mathbf{K}\mathbf{y} = \lambda^*\mathbf{M}^*\mathbf{y}, \tag{4.5.1}$$
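where $\mathbf{M}^* = \mathbf{M} + m_1^*\mathbf{E}_{1,1}$. Since we have added mass to the system, the eigenvalues must satisfy
$$0 < \lambda_1^* < \lambda_1 < \cdots < \lambda_n^* < \lambda_n. \tag{4.5.2}$$
Express $\mathbf{y}$ as a combination of the $\mathbf{y}_i$:
$$\mathbf{y} = \sum_{i=1}^{n}\alpha_i\mathbf{y}_i, \tag{4.5.3}$$
then
$$\mathbf{K}\mathbf{y} = \sum_{i=1}^{n}\alpha_i\mathbf{K}\mathbf{y}_i = \sum_{i=1}^{n}\lambda_i\alpha_i\mathbf{M}\mathbf{y}_i,$$
and
$$\mathbf{M}^*\mathbf{y} = \sum_{i=1}^{n}\alpha_i\mathbf{M}\mathbf{y}_i + m_1^*\mathbf{E}_{1,1}\mathbf{y},$$
so that equation (4.5.1) becomes
$$\sum_{i=1}^{n}\lambda_i\alpha_i\mathbf{M}\mathbf{y}_i = \lambda^*\sum_{i=1}^{n}\alpha_i\mathbf{M}\mathbf{y}_i + \lambda^* m_1^*\mathbf{E}_{1,1}\mathbf{y}.$$
Premultiply both sides by $\mathbf{y}_j^T$, using the orthonormality condition $\mathbf{y}_j^T\mathbf{M}\mathbf{y}_i = \delta_{ij}$:
$$\lambda_j\alpha_j = \lambda^*\alpha_j + \lambda^* m_1^* y_{1j} y_1,$$
and on substituting for $\alpha_j$ in (4.5.3) and equating the first elements of the vectors on each side, we find
$$1 = \lambda^* m_1^*\sum_{i=1}^{n}\frac{(y_{1i})^2}{\lambda_i - \lambda^*}. \tag{4.5.4}$$
In order to use this equation to obtain the first components $u_{1i}$ of the eigenvectors $\mathbf{u}_i$ of the reduced equation, for use in the Lanczos algorithm, we need to express $y_{1i}$ in terms of $u_{1i}$. The equation $\mathbf{D}\mathbf{y} = \mathbf{u}$ gives $d_1 y_{1i} = u_{1i} = x_{i1}$, so that we may write (4.5.4) as
$$1 = \varepsilon\lambda^*\sum_{i=1}^{n}\frac{(x_{i1})^2}{\lambda_i - \lambda^*}, \qquad \varepsilon = \frac{m_1^*}{m_1}.$$
Since the roots of the equation are $(\lambda_i^*)_1^n$ we have
$$1 - \varepsilon\lambda\sum_{i=1}^{n}\frac{(x_{i1})^2}{\lambda_i - \lambda} = c\,\frac{\prod_{i=1}^{n}(\lambda_i^* - \lambda)}{\prod_{i=1}^{n}(\lambda_i - \lambda)}. \tag{4.5.5}$$
Equating both sides for $\lambda = 0$ and $\lambda \to \infty$, we have
$$1 = c\prod_{i=1}^{n}(\lambda_i^*/\lambda_i), \qquad 1 + \varepsilon\sum_{i=1}^{n} x_{i1}^2 = c.$$
The interlacing condition (4.5.2) gives
$$c = \prod_{i=1}^{n}(\lambda_i/\lambda_i^*) > 1.$$
The orthonormality condition gives $\sum_{i=1}^{n} x_{i1}^2 = 1$, so that $m_1^*/m_1 = \varepsilon = c - 1 > 0$. Finally, multiplying (4.5.5) throughout by $\lambda_i - \lambda$ and then putting $\lambda = \lambda_i$ we find
$$\varepsilon\lambda_i x_{i1}^2 = c\,\frac{\prod_{j=1}^{n}(\lambda_i - \lambda_j^*)}{\prod_{j=1}^{n}{}'(\lambda_i - \lambda_j)}. \tag{4.5.6}$$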
The interlacing condition (4.5.2) ensures that $x_{i1}^2 > 0$. Now we use $\mathbf{x}_1$ in the Lanczos algorithm, and the untangling procedure as before.

There are still more ways in which to obtain a second spectrum, for which see Nylen and Uhlig (1997a) [253], Nylen and Uhlig (1997b) [254]. Ram (1993) [276] supposes that the system of Figure 4.5.1 is modified by adding both a mass $m$ to $m_1$ and a spring $k_0$. He makes use of some simple but powerful results found in Ram and Blech (1991) [277].

We close this section by supposing that an oscillating force $F\sin\omega t$ is applied to the free end of the spring-mass system of Figure 4.5.1a). The matrix equation governing the response $\mathbf{y}\sin\omega t$ is
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{y} = F\mathbf{e}_1.$$
Write
$$\mathbf{y} = \sum_{i=1}^{n}\alpha_i\mathbf{y}_i,$$
where $\mathbf{y}_i$ is the $i$th eigenvector, normalised so that $\mathbf{y}_i^T\mathbf{M}\mathbf{y}_j = \delta_{ij}$. We obtain
$$(\lambda_i - \lambda)\,\alpha_i = F y_{1i},$$
and hence
$$\mathbf{y} = F\sum_{i=1}^{n}\frac{y_{1i}\mathbf{y}_i}{\lambda_i - \lambda},$$
so that
$$y_1 = F\sum_{i=1}^{n}\frac{y_{1i}^2}{\lambda_i - \lambda}. \tag{4.5.7}$$
When the eigenvalue problem is reduced to standard, $\mathbf{J}$, form, then $\mathbf{D}\mathbf{y} = \mathbf{u}$, so that $d_1 y_{1i} = u_{1i} = x_{i1}$ and we may write
$$y_1 = \frac{F}{m_1}\sum_{i=1}^{n}\frac{x_{i1}^2}{\lambda_i - \lambda}, \tag{4.5.8}$$
where, as usual, $\mathbf{x}_1 = \{x_{11}, x_{21}, \dots, x_{n1}\}$ is the vector of first components of the eigenvectors of $\mathbf{J}$. The quantity $y_1/F$ is called the frequency response function, specifically the frequency response function for the displacement $y_1$ due to a unit force applied at $y_1$. This function may also be identified as a direct receptance for $y_1$, as described, for instance, in Bishop and Johnson (1960) [34]. The two spectra $\sigma(\mathbf{J}) = (\lambda_i)_1^n$ and $\sigma(\mathbf{J}_1) = (\mu_i)_1^{n-1}$ are the poles and zeros of the response function. The interlacing of these two spectra may thus be interpreted as the interlacing of the poles and zeros of the response function, a result which is well known in control theory. The result of Section 4.3 may thus be stated as follows: the response function, and specifically its poles and zeros, uniquely determines the matrix $\mathbf{J}$. As we have seen, once we know $\mathbf{J}$ and the form of the stiffness matrix $\mathbf{K}$, we may untangle $\mathbf{M}$ and $\mathbf{K}$ from $\mathbf{J}$. See Gladwell and Gbadeyan (1985) [106] for an alternative treatment.

An experimental-theoretical study of the problem of reconstructing a spring-mass system from frequency response data for an actual system may be found in Gladwell and Movahhedy (1995) [123] and Movahhedy, Ismail and Gladwell (1995) [242].
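The statement that the two spectra are the poles and zeros of the response function can be checked numerically. The sketch below is illustrative: the function name `receptance`, the particular masses and stiffnesses, and the evaluation point 1.3 are assumptions. It evaluates the modal sum (4.5.8), compares it with a direct solve of $(\mathbf{K} - \lambda\mathbf{M})\mathbf{y} = F\mathbf{e}_1$ with $F = 1$, and evaluates it at the eigenvalues of $\mathbf{J}_1$.

```python
import numpy as np

def receptance(lam_eval, lam, x_sq, m1):
    """y_1 / F from (4.5.8): (1/m1) * sum_i x_i1^2 / (lam_i - lam_eval)."""
    return np.sum(x_sq / (lam - lam_eval)) / m1

# Illustrative free-fixed chain: mass 1 is free, the last spring is tied to a wall.
m = np.array([1.0, 2.0, 3.0])
k = np.array([4.0, 5.0, 6.0])
K = (np.diag(np.append(0.0, k[:-1]) + k)
     - np.diag(k[:-1], 1) - np.diag(k[:-1], -1))
M = np.diag(m)
d = np.sqrt(m)
J = K / np.outer(d, d)                       # J = D^{-1} K D^{-1}

lam, U = np.linalg.eigh(J)                   # poles
mu = np.linalg.eigvalsh(J[1:, 1:])           # zeros: spectrum of J_1
x_sq = U[0, :] ** 2

# Direct response to a unit force at mass 1, away from any pole.
e1 = np.array([1.0, 0.0, 0.0])
y = np.linalg.solve(K - 1.3 * M, e1)
modal = receptance(1.3, lam, x_sq, m[0])
zeros = np.array([receptance(z, lam, x_sq, m[0]) for z in mu])
```

Here `modal` should agree with `y[0]`, and the entries of `zeros` should vanish to rounding error.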
4.6 Persymmetric systems
It was shown in Section 4.3 that a persymmetric Jacobi matrix $\mathbf{J}$ can be reconstructed uniquely from its eigenvalues. We shall now consider some physical problems relating to persymmetric matrices. Figure 4.6.1 shows a system of $2n$ masses connected by $(2n+1)$ springs and fixed at each end. Suppose that the system is symmetrical about the mid point, so that
$$m_r = m_{2n+1-r}, \qquad k_r = k_{2n+1-r}, \qquad (r = 1, 2, \dots, n). \tag{4.6.1}$$
The odd numbered principal modes of the system will be symmetrical about the mid-point; they will thus be the principal modes of one half (say the left-hand half) of the system with the mid-point of the system free, as in Figure 4.6.2(a). Thus the odd numbered eigenvalues $\Lambda_1, \Lambda_3, \dots, \Lambda_{2n-1}$ of the complete system will be the eigenvalues of the left-hand half under the conditions fixed-free, i.e.,
$$\Lambda_{2i-1} = \lambda_i, \qquad i = 1, 2, \dots, n. \tag{4.6.2}$$

Figure 4.6.1 - A symmetrical system with $2n$ masses

On the other hand, the even-numbered principal modes of the system will be antisymmetrical about the mid-point so that the even-numbered eigenvalues $\Lambda_2, \Lambda_4, \dots, \Lambda_{2n}$ will be the eigenvalues of the left-hand half under the condition fixed-fixed, as in Figure 4.6.2(b).

Figure 4.6.2 - (a) The odd numbered modes are symmetrical. (b) The even numbered ones are antisymmetrical.

Thus
$$\Lambda_{2i} = \mu_i, \qquad i = 1, 2, \dots, n. \tag{4.6.3}$$
This means that the left-hand half, and hence the whole system, may be uniquely constructed, using the analysis of Section 4.4, from the eigenvalues $\Lambda_1, \dots, \Lambda_{2n}$ and the total mass.

Figure 4.6.3 shows a symmetrical system with $2n-1$ masses and $2n$ springs. Now the odd-numbered symmetrical modes will be the modes of the left-hand half with $(m_n/2)$ at the end and free there, as in Figure 4.6.4(a). On the other hand, the even-numbered, antisymmetrical modes will be the modes of the left-hand half with $m_n$ fixed as in Figure 4.6.4(b). Thus
$$\Lambda_{2i-1} = \lambda_i, \qquad i = 1, 2, \dots, n, \tag{4.6.4}$$
$$\Lambda_{2i} = \mu_i, \qquad i = 1, 2, \dots, n-1. \tag{4.6.5}$$
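The splitting of the spectrum of a symmetrical system into the spectra of two half-systems can be illustrated for the case of $2n$ masses. The sketch below rests on assumptions that are not spelled out in the text: the helper `chain_eigs` is hypothetical, the data are invented, and in the antisymmetric case the half-system is closed by a spring of twice the middle stiffness, since the mid-point of the middle spring stays fixed and half a spring is twice as stiff.

```python
import numpy as np

def chain_eigs(m, k):
    """Eigenvalues of an in-line chain with masses m[0..n-1] and springs k[0..n].

    k[0] ties m[0] to the left wall and k[n] ties m[n-1] to the right wall;
    a zero entry means that end is free.
    """
    m = np.asarray(m, dtype=float)
    k = np.asarray(k, dtype=float)
    K = np.diag(k[:-1] + k[1:]) - np.diag(k[1:-1], 1) - np.diag(k[1:-1], -1)
    d = np.sqrt(m)
    return np.linalg.eigvalsh(K / np.outer(d, d))   # spectrum of D^{-1} K D^{-1}

# Illustrative symmetrical system with 2n = 4 masses and 5 springs.
m_half = [1.0, 2.0]
k_half = [3.0, 4.0]
k_mid = 5.0
full = chain_eigs(m_half + m_half[::-1], k_half + [k_mid] + k_half[::-1])
sym = chain_eigs(m_half, k_half + [0.0])            # mid-point free
anti = chain_eigs(m_half, k_half + [2.0 * k_mid])   # mid-point fixed
```

With the eigenvalues in ascending order, `full[0::2]` should coincide with `sym` and `full[1::2]` with `anti`.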
Figure 4.6.3 - A symmetrical system with $2n-1$ masses.

Figure 4.6.4 - (a) The odd numbered modes are symmetrical. (b) The even numbered modes are antisymmetrical.
4.7 Inverse generalised eigenvalue problems
In this section we consider how we can reconstruct a finite element model from spectral data. The eigenvalue problem is
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{y} = \mathbf{0}, \tag{4.7.1}$$
where as in (2.4.10), both $\mathbf{K}$ and $\mathbf{M}$ are symmetric tridiagonal, $\mathbf{K}$ with negative codiagonal, $\mathbf{M}$ with positive codiagonal. Since one spectrum is insufficient even to reconstruct one tridiagonal matrix, it is certainly insufficient to reconstruct two. We therefore assume, following Gladwell (1999) [127], that $\mathbf{M}$ can be written in terms of $\mathbf{K}$:
$$\mathbf{M} = \mathbf{D}^2 - c\mathbf{K}, \qquad c > 0, \tag{4.7.2}$$
where $\mathbf{D}$ is an as yet undetermined diagonal matrix with positive entries, and $c$ is an arbitrary positive number. Since $\mathbf{K}$ has negative codiagonal, $\mathbf{M}$ will have positive codiagonal. Now
$$\mathbf{K} - \lambda\mathbf{M} = \mathbf{K} - \lambda(\mathbf{D}^2 - c\mathbf{K}) = (1 + c\lambda)\mathbf{K} - \lambda\mathbf{D}^2 = (1 + c\lambda)\{\mathbf{K} - \nu\mathbf{D}^2\}, \qquad \nu = \lambda/(1 + c\lambda).$$
Thus (4.7.1) reduces to
$$(\mathbf{K} - \nu\mathbf{D}^2)\mathbf{y} = \mathbf{0}, \tag{4.7.3}$$
which, as in Section 3.1, we can reduce to
$$(\mathbf{J} - \nu\mathbf{I})\mathbf{u} = \mathbf{0}, \tag{4.7.4}$$
where $\mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}$ and $\mathbf{u} = \mathbf{D}\mathbf{y}$. Suppose that (4.7.1) has specified eigenvalues $(\lambda_i)_1^n$, where $\lambda_i \ge 0$, then $\mathbf{J}$ has eigenvalues $(\nu_i)_1^n$ where $\nu_i = \lambda_i/(1 + c\lambda_i) \ge 0$, showing that $\mathbf{J}$, and thus $\mathbf{K}$, is positive semi-definite. The matrix $\mathbf{M}$ can be written
$$\mathbf{M} = \mathbf{D}(\mathbf{I} - c\mathbf{J})\mathbf{D} \tag{4.7.5}$$
and the matrix $\mathbf{I} - c\mathbf{J}$ has eigenvalues $1 - c\nu_i = 1/(1 + c\lambda_i) > 0$, showing that $\mathbf{M}$ is positive definite.

To reconstruct $\mathbf{J}$ we need a second spectrum. If the eigenvalues of (4.7.1) under the constraint $u_n = 0$ are $(\mu_i)_1^{n-1}$, then the eigenvalues of (4.7.4) under the same constraint will be $\xi_i = \mu_i/(1 + c\mu_i)$. We note that the interlacing
$$\lambda_1 < \mu_1 < \lambda_2 < \cdots < \mu_{n-1} < \lambda_n \tag{4.7.6}$$
yields the interlacing
$$\nu_1 < \xi_1 < \nu_2 < \cdots < \xi_{n-1} < \nu_n. \tag{4.7.7}$$
Having found $\mathbf{J}$, we need to find $\mathbf{D}$ so that $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$ satisfies the characteristic stiffness equation (4.4.9). This can be done exactly as in Section 4.4. Gladwell (1999) [127] finds wider families of systems with the given spectra. See Ram and Gladwell (1994) [289] for a different approach to reconstructing a finite element model of a rod.
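The change of variable $\nu = \lambda/(1 + c\lambda)$ that underlies this section can be verified numerically. The sketch below is illustrative only: the stiffnesses, the diagonal of $\mathbf{D}$ and the value of $c$ are invented, chosen so that $\mathbf{M} = \mathbf{D}^2 - c\mathbf{K}$ is positive definite, and the generalised eigenvalues are obtained through a Cholesky reduction (one of several possible routes).

```python
import numpy as np

# Illustrative data (assumptions, not from the text).
k = np.array([4.0, 5.0, 6.0])
d = np.array([1.0, 1.5, 2.0])
c = 0.05

K = (np.diag(k + np.append(k[1:], 0.0))
     - np.diag(k[1:], 1) - np.diag(k[1:], -1))     # negative codiagonal
M = np.diag(d ** 2) - c * K                        # assumed form M = D^2 - cK

J = K / np.outer(d, d)                             # J = D^{-1} K D^{-1}
nu = np.linalg.eigvalsh(J)                         # eigenvalues of (J - nu I) u = 0

# Generalised eigenvalues of (K - lam M) y = 0 via M = L L^T.
L = np.linalg.cholesky(M)
L_inv = np.linalg.inv(L)
lam = np.linalg.eigvalsh(L_inv @ K @ L_inv.T)

mapped = lam / (1.0 + c * lam)                     # should reproduce nu
```

Because the map $\lambda \mapsto \lambda/(1 + c\lambda)$ is increasing for $\lambda \ge 0$, the ordering of the eigenvalues is preserved, so `mapped` and `nu` can be compared entry by entry.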
4.8 Interior point reconstruction
Suppose, following Gladwell and Willms (1988) [113], we have a spring-mass system with $n$ masses, under some end conditions, as in Figure 4.8.1(a). (We exclude the free-free condition at this stage.) If a sinusoidal force $F\sin\omega t$ is applied to mass $m_{m+1}$, where $0 < m < n-1$, then the response at mass $m_{m+1}$ may be calculated as in equation (4.5.7):
$$x_{m+1}/F = \sum_{i=1}^{n}\frac{(x_{m+1,i})^2}{\lambda_i - \lambda}.$$
The poles of this response function are the eigenvalues $(\lambda_i)_1^n$ of the whole system, $A$. The zeros of the response function will be the eigenvalues of the system constrained so that $x_{m+1} = 0$, i.e., they will be eigenvalues of the systems, $B$, on the left, or $C$, on the right, of $x_{m+1}$, as shown in Figure 4.8.1(b). Different ways of assigning the eigenvalues of the constrained system to the two subsystems $B$ and $C$ will lead to different reconstructed systems. When this assignment has been made, then we know the eigenvalues $(\lambda_i)_1^n$, $(\mu_i)_1^m$ and $(\nu_i)_1^p$ of systems $A, B, C$ respectively; $p = n - m - 1$. Within themselves these sets of eigenvalues must be distinct. There are two cases.

a) The constrained system has no double eigenvalues. That is, all the $(\mu_i)_1^m$ and $(\nu_i)_1^p$ are distinct; if they are arranged in ascending order and relabelled $(\tilde{\mu}_i)_1^{n-1}$, they will satisfy $\lambda_1 < \tilde{\mu}_1 < \lambda_2 < \cdots < \tilde{\mu}_{n-1} < \lambda_n$; this is equivalent to the statement that no eigenvector $\mathbf{x}_i$ of $\mathbf{J}$ has a node at $x_{m+1}$, i.e., $x_{m+1,i} \ne 0$ for all $i = 1, 2, \dots, n$.

b) Two members of a pair $(\mu_j, \nu_k)$ are identical; now there is an $i$ such that $\lambda_i = \mu_j = \nu_k$; this will occur iff $x_{m+1,i} = 0$. There can be more than one such pair.

To analyse the situation we suppose that the eigenvalue equation (4.4.1) has been reduced to normal form, (4.4.2), and we partition $\mathbf{J}$, with blocks of sizes $m$, $1$ and $p$, as
$$\mathbf{J} = \begin{bmatrix} \mathbf{B} & -\mathbf{b}_m & \mathbf{0} \\ -\mathbf{b}_m^T & a_{m+1} & -\mathbf{c}_{m+1}^T \\ \mathbf{0} & -\mathbf{c}_{m+1} & \mathbf{C} \end{bmatrix}, \tag{4.8.1}$$
where $\mathbf{b}_m^T = \{0, 0, \dots, b_m\}$, $\mathbf{c}_{m+1}^T = \{b_{m+1}, 0, 0, \dots, 0\}$.

Figure 4.8.1 - The mass $m_{m+1}$ is constrained (panels a and b).
Now we consider the principal minors of $\lambda\mathbf{I} - \mathbf{J}$. We denote the leading principal minors by $p_i(\lambda)$ and the trailing principal minors by $q_i(\lambda)$. The Laplace expansions of $p_n(\lambda) = \det(\lambda\mathbf{I} - \mathbf{J})$ using the first $m$ and first $m + 1$ rows are

\[
p_n(\lambda) = p_m(\lambda)\,q_{p+1}(\lambda) - b_m^2\,p_{m-1}(\lambda)\,q_p(\lambda), \tag{4.8.2}
\]
\[
\phantom{p_n(\lambda)} = p_{m+1}(\lambda)\,q_p(\lambda) - b_{m+1}^2\,p_m(\lambda)\,q_{p-1}(\lambda). \tag{4.8.3}
\]

We know that

\[
p_n(\lambda) = \prod_{i=1}^{n}(\lambda - \lambda_i), \qquad p_m(\lambda) = \prod_{j=1}^{m}(\lambda - \mu_j), \qquad q_p(\lambda) = \prod_{k=1}^{p}(\lambda - \nu_k),
\]

and thus equations (4.8.2), (4.8.3) give

\[
p_n(\mu_j) = -b_m^2\,p_{m-1}(\mu_j)\,q_p(\mu_j), \tag{4.8.4}
\]
\[
p_n(\nu_k) = -b_{m+1}^2\,p_m(\nu_k)\,q_{p-1}(\nu_k). \tag{4.8.5}
\]

In case (a), all the quantities appearing in the latter equations are non-zero, so that, apart from the factors $b_m^2$ and $b_{m+1}^2$, these equations yield $p_{m-1}(\mu_j)$ and $q_{p-1}(\nu_k)$, respectively. These quantities are just what is needed to compute the matrices $\mathbf{B}$ and $\mathbf{C}$, respectively, using Forsythe's algorithm in Section 3.2. The weights $(w_j)_b$ for $\mathbf{B}$ are given by

\[
b_m^2 (w_j)_b = b_m^2\,\frac{p_{m-1}(\mu_j)}{p_m'(\mu_j)} = -\frac{p_n(\mu_j)}{p_m'(\mu_j)\,q_p(\mu_j)}, \tag{4.8.6}
\]

while those for $\mathbf{C}$ are

\[
b_{m+1}^2 (w_k)_c = b_{m+1}^2\,\frac{q_{p-1}(\nu_k)}{q_p'(\nu_k)} = -\frac{p_n(\nu_k)}{q_p'(\nu_k)\,p_m(\nu_k)}. \tag{4.8.7}
\]

To verify that the weights $(w_j)_b$ are positive, we suppose that $\mu_j$ has $s$ $\nu$'s to its left, and $p - s$ to its right; then $\lambda_{j+s} < \mu_j < \lambda_{j+s+1}$. If a number $x$ may be written $x = (-1)^n c$, where $c > 0$, then we say $\operatorname{sgn}(x) = n$. Now we can easily verify that

\[
\operatorname{sgn}[p_n(\mu_j),\; p_m'(\mu_j),\; q_p(\mu_j)] = [n - j - s,\; m - j,\; p - s]
\]

so that

\[
\operatorname{sgn}(w_j)_b = 1 + (n - j - s) + (m - j) + p - s = 2n - 2j - 2s = \text{even}
\]

so that $(w_j)_b > 0$; we may prove similarly that $(w_k)_c > 0$. Thus $\mathbf{B}$ may be reconstructed uniquely. At the end, $p_{m-1}(\lambda)$ will be known, and so the $p_{m-1}(\mu_j)$ will be known. Any one of these values may be substituted into (4.8.4) to yield $b_m^2$. The matrix $\mathbf{C}$ and $b_{m+1}^2$ may be found in a similar manner.
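The case (a) relations (4.8.6) and (4.8.7) can be checked numerically. The following Python sketch is illustrative only: it assumes NumPy, uses made-up test data, and takes the weight $(w_j)_b$ to be the square of the last component of the $j$th normalised eigenvector of $\mathbf{B}$ (and $(w_k)_c$ the square of the first component for $\mathbf{C}$), which is consistent with (4.8.11) and (4.8.13). The helper names `jacobi`, `poly` and `dpoly` are hypothetical.

```python
import numpy as np

def jacobi(a, b):
    """Jacobi matrix with diagonal a and codiagonal -b (all b > 0)."""
    return np.diag(a) - np.diag(b, 1) - np.diag(b, -1)

def poly(roots, x):
    """Monic polynomial with the given roots, evaluated at x."""
    return np.prod(x - roots)

def dpoly(roots, j):
    """Derivative of that polynomial, evaluated at its own root roots[j]."""
    return np.prod(np.delete(roots[j] - roots, j))

# Illustrative data: n masses, mass m+1 (1-based) is constrained.
n, m = 7, 3
p = n - m - 1
a = np.arange(2.0, 2.0 + n)          # diagonal 2, 3, ..., 8
b = np.full(n - 1, 0.5)              # codiagonal magnitudes
J = jacobi(a, b)

lam = np.linalg.eigvalsh(J)                    # spectrum of the whole system
mu, Y = np.linalg.eigh(J[:m, :m])              # B: leading m x m block
nu, Z = np.linalg.eigh(J[m + 1:, m + 1:])      # C: trailing p x p block

b_m, b_m1 = b[m - 1], b[m]                     # b_m and b_{m+1}

# Left sides: b_m^2 * (last components of B's eigenvectors)^2, etc.
lhs_B = b_m**2 * Y[-1, :]**2
lhs_C = b_m1**2 * Z[0, :]**2

# Right sides of (4.8.6), (4.8.7), built from the three spectra only.
rhs_B = np.array([-poly(lam, mu[j]) / (dpoly(mu, j) * poly(nu, mu[j]))
                  for j in range(m)])
rhs_C = np.array([-poly(lam, nu[k]) / (dpoly(nu, k) * poly(mu, nu[k]))
                  for k in range(p)])

print(np.allclose(lhs_B, rhs_B), np.allclose(lhs_C, rhs_C))
```

Because the eigenvectors of $\mathbf{B}$ are normalised, the right-hand sides for $\mathbf{B}$ sum to $b_m^2$, which is how $b_m^2$ can be recovered from the spectra alone.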
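In case (b) there is a common factor $\lambda - \lambda_i$, with $\lambda_i = \mu_j = \nu_k$, in each of the equations (4.8.2) and (4.8.3). Cancel this factor and then put $\lambda = \lambda_i$. Since

\[
p_{m+1}(\lambda) = (\lambda - a_{m+1})\,p_m(\lambda) - b_m^2\,p_{m-1}(\lambda), \qquad q_{p+1}(\lambda) = (\lambda - a_{m+1})\,q_p(\lambda) - b_{m+1}^2\,q_{p-1}(\lambda),
\]

we have

\[
p_{m+1}(\mu_j) = -b_m^2\,p_{m-1}(\mu_j), \qquad q_{p+1}(\nu_k) = -b_{m+1}^2\,q_{p-1}(\nu_k),
\]

and thus both equations (4.8.2) and (4.8.3) reduce to

\[
p_n'(\lambda_i) = -b_m^2\,p_{m-1}(\mu_j)\,q_p'(\nu_k) - b_{m+1}^2\,p_m'(\mu_j)\,q_{p-1}(\nu_k). \tag{4.8.8}
\]

This is the single equation that replaces the pair of equations (4.8.4) and (4.8.5) for the common eigenvalue. Using (4.8.6) and (4.8.7), we may write (4.8.8) as

\[
W_i = -\frac{p_n'(\lambda_i)}{p_m'(\mu_j)\,q_p'(\nu_k)} = b_m^2 (w_j)_b + b_{m+1}^2 (w_k)_c, \tag{4.8.9}
\]

and we note that $W_i$ is a positive quantity. Now we proceed as follows to find the family of Jacobi matrices having the specified eigenvalues. Choose $\theta \in (0, \pi/2)$ and put

\[
b_m^2 (w_j)_b = W_i \cos^2\theta, \qquad b_{m+1}^2 (w_k)_c = W_i \sin^2\theta.
\]

If there is more than one triple of common eigenvalues, then this procedure may be followed for each. Combine these weights with those corresponding to the distinct eigenvalues, and compute $\mathbf{B}$ and $\mathbf{C}$. At the final stage $p_{m-1}(\lambda)$ and $q_{p-1}(\lambda)$ will be known, so that $b_m^2$ and $b_{m+1}^2$ may be found from equations (4.8.4) and (4.8.5) for one of the distinct eigenvalues, and there will be at least one, as before.

There is an alternative procedure which elucidates the situation in which one or more triples $\lambda_i, \mu_j, \nu_k$ are equal, and which uses the Lanczos algorithm. Express the eigenvalue problem for $\mathbf{J}$ in (4.8.1) in terms of the normalised eigenvectors $(\mathbf{y}_j)_1^m$ and $(\mathbf{z}_j)_1^p$ of $\mathbf{B}$ and $\mathbf{C}$ respectively. Thus

\[
\mathbf{x} = \tilde{\mathbf{I}}_m \Big(\sum_{j=1}^{m} p_j \mathbf{y}_j\Big) + x_{m+1}\mathbf{e}_{m+1} + \tilde{\mathbf{J}}_p \Big(\sum_{k=1}^{p} q_k \mathbf{z}_k\Big)
\]

where $\tilde{\mathbf{I}}_m = \begin{bmatrix} \mathbf{I}_m \\ \mathbf{0} \end{bmatrix}$ and $\tilde{\mathbf{J}}_p = \begin{bmatrix} \mathbf{0} \\ \mathbf{I}_p \end{bmatrix}$. The eigenvalue problem becomes

\[
\begin{bmatrix}
\mu_1 - \lambda & & & -s_1 & & & \\
 & \ddots & & \vdots & & & \\
 & & \mu_m - \lambda & -s_m & & & \\
-s_1 & \cdots & -s_m & a_{m+1} - \lambda & -t_1 & \cdots & -t_p \\
 & & & -t_1 & \nu_1 - \lambda & & \\
 & & & \vdots & & \ddots & \\
 & & & -t_p & & & \nu_p - \lambda
\end{bmatrix}
\begin{bmatrix} p_1 \\ \vdots \\ p_m \\ x_{m+1} \\ q_1 \\ \vdots \\ q_p \end{bmatrix} = \mathbf{0} \tag{4.8.10}
\]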
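where

\[
s_j = b_m\,y_{m,j}, \qquad t_k = b_{m+1}\,z_{1,k}. \tag{4.8.11}
\]

Thus

\[
(\mu_j - \lambda)p_j - s_j x_{m+1} = 0, \quad j = 1, 2, \ldots, m, \qquad (\nu_k - \lambda)q_k - t_k x_{m+1} = 0, \quad k = 1, 2, \ldots, p,
\]

so that

\[
\Big\{ a_{m+1} - \lambda_i + \sum_{j=1}^{m} \frac{s_j^2}{\lambda_i - \mu_j} + \sum_{k=1}^{p} \frac{t_k^2}{\lambda_i - \nu_k} \Big\}\, x_{m+1,i} = 0, \qquad i = 1, \ldots, n.
\]

In case (a) $x_{m+1,i} \neq 0$, $i = 1, 2, \ldots, n$, so that

\[
a_{m+1} - \lambda + \sum_{j=1}^{m} \frac{s_j^2}{\lambda - \mu_j} + \sum_{k=1}^{p} \frac{t_k^2}{\lambda - \nu_k} = -\frac{p_n(\lambda)}{p_m(\lambda)\,q_p(\lambda)} \tag{4.8.12}
\]

which yields

\[
s_j^2 = -\frac{p_n(\mu_j)}{p_m'(\mu_j)\,q_p(\mu_j)}, \qquad t_k^2 = -\frac{p_n(\nu_k)}{p_m(\nu_k)\,q_p'(\nu_k)} \tag{4.8.13}
\]

for $j = 1, 2, \ldots, m$; $k = 1, 2, \ldots, p$, in agreement with (4.8.6), (4.8.7). Now $b_m^2$, $b_{m+1}^2$ may be computed from

\[
b_m^2 = \sum_{j=1}^{m} s_j^2, \qquad b_{m+1}^2 = \sum_{k=1}^{p} t_k^2. \tag{4.8.14}
\]

With $(y_{m,j})_1^m$ and $(z_{1,k})_1^p$ known, from equations (4.8.11)-(4.8.14), $\mathbf{B}$ and $\mathbf{C}$ may be computed by using the Lanczos algorithm.

In case (b), suppose that there are $r \geq 1$ triples $\{i_q, j_q, k_q\}$, $q = 1, 2, \ldots, r$ such that $\lambda_{i_q} = \mu_{j_q} = \nu_{k_q}$, then

\[
f(\lambda) \equiv a_{m+1} - \lambda + {\sum_{j=1}^{m}}' \frac{s_j^2}{\lambda - \mu_j} + {\sum_{k=1}^{p}}' \frac{t_k^2}{\lambda - \nu_k} + \sum_{q=1}^{r} \frac{s_{j_q}^2 + t_{k_q}^2}{\lambda - \lambda_{i_q}} \tag{4.8.15}
\]

has, as its $m + p - r + 1 = n - r$ roots, the $n - r$ non-degenerate $\lambda_i$. In equation (4.8.15), the prime on a sum means that the degenerate triples are omitted. Now the separate $s_j^2$ and $t_k^2$, and the values $W_q = s_{j_q}^2 + t_{k_q}^2$ for the degenerate modes will be known. Thus as before

\[
s_{j_q}^2 + t_{k_q}^2 = W_q, \qquad s_{j_q}^2 = W_q \cos^2\theta_q, \qquad t_{k_q}^2 = W_q \sin^2\theta_q
\]

where $W_q$ is defined as in (4.8.9). With the $\theta_q$ chosen, the $s_{j_q}^2$ and $t_{k_q}^2$ are all known. Equation (4.8.14) yields $b_m^2$ and $b_{m+1}^2$ as functions of the parameters $\{\theta_q\}_1^r$ (and note that $b_m^2 + b_{m+1}^2$ is invariant) so that the $(y_{m,i})_1^m$ and $(z_{1,j})_1^p$ are known, and $\mathbf{B}$ and $\mathbf{C}$ may be calculated from the Lanczos algorithm.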
An alternative approach to the interior reconstruction problem may be found in Nylen and Uhlig (1997a) [253]. The mass-spring models considered in this chapter are very similar to the shear building model used extensively by Takewaki and his coworkers. They have formulated various hybrid inverse problems in which part of a structure is given and part is yet to be found in order to yield a structure with specified spectral (eigenvalue or modal) properties. A full, definitive description of these problems and their use in structural design may be found in the monograph Takewaki (2000) [321]. Among the original papers most closely related to the concerns of this chapter are the following: Takewaki and Nakamura (1995) [317], Takewaki, Nakamura and Arita (1996) [318], Takewaki and Nakamura (1997) [319] and Takewaki (1999) [320].
Chapter 5
Inverse Problems for Some More General Systems

Words differently arranged have a different meaning, and meanings differently arranged have different effects.
Pascal’s Pensées, 23
5.1 Introduction: graph theory
The inverse problems considered in Chapter 4 are special, simply because Jacobi matrices are special matrices. In this chapter we will consider some slightly more general problems but must admit that there are still only a few problems that we have been able to solve.

The special feature of a Jacobi matrix is its structure: it is tridiagonal, with strictly negative codiagonal. (It is also positive semi-definite, but that is another matter.) The structure of the matrix $\mathbf{J}$ in equation (4.4.2) is related to the structures of $\mathbf{K}$ and $\mathbf{M}$ in (4.4.1); $\mathbf{K}$ is tridiagonal while $\mathbf{M}$ is diagonal. The structures of $\mathbf{K}$ and $\mathbf{M}$, in turn, derive from the structure of the system, an in-line mass system, to which they belong. $\mathbf{K}$, the stiffness matrix, relates to the stiffnesses, the connectors, between masses. $\mathbf{K}$ is tridiagonal because each interior mass $m_i$, $2 \leq i \leq n - 1$, is connected only to its immediate neighbours $m_{i-1}$ and $m_{i+1}$; the end masses $m_1$ and $m_n$ each have just one neighbour, $m_2$ or $m_{n-1}$ respectively.

The natural tool for describing and analysing the structure of a system is graph theory. This is not the place to prove any theorems in graph theory, but it is useful to introduce some of the basic concepts. A graph $G$ is a set of vertices, connected by edges. The set of vertices is called the vertex set, and is denoted by $V$; the set of edges is called the edge set, $E$. Figure 5.1.1 shows a graph. This is actually an example of a simple, undirected graph. It is simple because there is at most one edge connecting any two vertices; the edge connecting vertices $i$ and $j$ is denoted by $(i, j)$. The graph is undirected because there is no preferred direction associated with an edge. Henceforth, the term graph will be used to mean a simple, undirected graph.

The adjacency matrix $\mathbf{A}$ of a graph $G$ is the symmetric matrix defined by

\[
a_{ij} = \begin{cases} 1 & \text{iff } i \neq j \text{ and } (i, j) \in E, \\ 0 & \text{otherwise.} \end{cases} \tag{5.1.1}
\]

The adjacency matrix for the graph in Figure 5.1.1 is

\[
\mathbf{A} = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}.
\]

Figure 5.1.1 - A graph.

With any symmetric matrix $\mathbf{A}$ we may associate a graph; the rule is

\[
\text{if } i \neq j \text{ then } (i, j) \in E \text{ iff } a_{ij} \neq 0. \tag{5.1.2}
\]
Using this rule we see that the graph associated with a Jacobi matrix is an (unbroken) path, as in Figure 5.1.2. The path is clearly one of the simplest graphs.
Figure 5.1.2 - The graph associated with a Jacobi matrix.

Another simple graph is a star on $n$ vertices, shown in Figure 5.1.3.
Figure 5.1.3 - A star on $n$ vertices.

A (symmetric) bordered diagonal matrix $\mathbf{B}$ has a star on $n$ vertices as its associated graph.

\[
\mathbf{B} = \begin{bmatrix} a_1 & \hat b_1 & \cdots & \hat b_{n-1} \\ \hat b_1 & a_2 & & \\ \vdots & & \ddots & \\ \hat b_{n-1} & & & a_n \end{bmatrix} \tag{5.1.3}
\]

A periodic Jacobi matrix is one of the form

\[
\mathbf{J}_{per} = \begin{bmatrix} a_1 & b_1 & & & b_n \\ b_1 & a_2 & b_2 & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & b_{n-1} \\ b_n & & & b_{n-1} & a_n \end{bmatrix}. \tag{5.1.4}
\]
It is tridiagonal except for the terms $b_n$ in the top right and bottom left. The underlying graph is a ring on $n$ vertices as shown in Figure 5.1.4.

Figure 5.1.4 - A ring on $n$ vertices.

The graph associated with a pentadiagonal matrix, such as occurred in Section 2.3 in the analysis of the vibration of a beam, is a strut, as shown in Figure 5.1.5.
Figure 5.1.5 - The strut on $n$ (even) vertices is the underlying graph of a pentadiagonal matrix.
The graph associated with a 2 × 2 block tridiagonal matrix is also a strut, but now one with double connections, as shown in Figure 5.1.6.
Figure 5.1.6 - The graph underlying a 2 × 2 block tridiagonal matrix.

The graphs shown in Figs. 5.1.1-5.1.6 are all connected graphs: there is a chain consisting of a sequence of edges connecting any one vertex to any other vertex. Note that the intersections of the diagonals in Figure 5.1.6 are not vertices of the graph. The graphs shown in Figure 5.1.7a), b) are disconnected.
Figure 5.1.7 - Renumbering does not essentially change a graph. (Panels a) and b).)

In order to test whether the underlying graph of a given (symmetric) matrix is connected or not, we note that renumbering the vertices of a graph does not change the essential character of a graph; the graphs a) and b) in Figure 5.1.7 are essentially the same. Renumbering the vertices of a graph leads to a rearranging of the rows and of the columns of any (symmetric) matrix based on that graph. When a graph is disconnected, it may be partitioned, as in Figure 5.1.7a), into a set of connected subgraphs. Then we can always rearrange the numbering, as in b), so that vertex numbers in any one connected subgraph form a consecutive sequence. The adjacency matrices of the graphs a) and b) are

\[
\mathbf{A}_1 = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \end{bmatrix}, \qquad
\mathbf{A}_2 = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}.
\]

We see, in this example, that when the vertices are renumbered so that each connected subgraph has consecutive numbering, then the adjacency matrix splits into two separate submatrices: such a (symmetric) matrix is said to be reducible. A symmetric matrix $\mathbf{A}$ is said to be irreducible iff it cannot be transformed to the form

\[
\mathbf{A} = \begin{bmatrix} \mathbf{B} & \mathbf{0} \\ \mathbf{0} & \mathbf{C} \end{bmatrix} \tag{5.1.5}
\]

by any rearrangement of rows and columns. If it is reducible, then it can be transformed to the form (5.1.5), and of course $\mathbf{B}$ and $\mathbf{C}$ may perhaps themselves be reduced further.

Note: The concepts of connectedness of a directed graph, and the corresponding concept of irreducibility of a general (not necessarily symmetric) matrix, are more complex than those described here. See Horn and Johnson (1985) [183] Section 6.2.21.

Now we may state the general result.

Theorem 5.1.1 The (symmetric) matrix $\mathbf{A}$ is irreducible iff its underlying graph is connected.

It is easy to check that if a spring (other than $k_1$) is removed from a spring mass system such as that in Figure 4.4.1, then the underlying graph becomes disconnected, and the stiffness matrix becomes reducible.

A tree is a special kind of connected graph: one which has no circuits. Now there is a unique chain of edges connecting any one vertex to any other. The path and the star are both trees, but a ring, see Figure 5.1.4, is not a tree. A connected graph has one or more spanning trees. If $G$ is a connected graph with vertex set $V$, then a spanning tree $S$ of $G$ is a maximal tree with the vertex set $V$; if any more edges in $E$ were added to $S$ then it would cease to be a tree: it would have a circuit. Figure 5.1.8 shows three possible spanning trees for the graph $G$ in Figure 5.1.1.
Figure 5.1.8 - Three spanning trees for the graph in Figure 5.1.1.

It may be proved that all the spanning trees of a given graph $G$ have the same number of edges. Nabben (2001) [243], in a wide ranging paper, discusses Green’s matrices for trees.
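The connectivity test behind Theorem 5.1.1 can be sketched in a few lines of Python. This is an illustration rather than anything taken from the text: it assumes NumPy, the function names (`components`, `is_irreducible`, `block_diagonal_order`, `spanning_tree_edges`) are hypothetical, and vertices are numbered from 0 rather than 1. The two matrices are the adjacency matrices of Figures 5.1.1 and 5.1.7a) quoted above.

```python
import numpy as np

def components(A):
    """Connected components of the graph underlying symmetric A, rule (5.1.2)."""
    A = np.asarray(A)
    n = A.shape[0]
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        comp, stack = [], [start]
        seen.add(start)
        while stack:                      # depth-first search from `start`
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if j != i and A[i, j] != 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        comps.append(sorted(comp))
    return comps

def is_irreducible(A):
    """A symmetric matrix is irreducible iff its graph has one component."""
    return len(components(A)) == 1

def block_diagonal_order(A):
    """Vertex renumbering that makes each component consecutive."""
    return [i for comp in components(A) for i in comp]

def spanning_tree_edges(A):
    """Edges of one spanning tree of a connected graph (DFS from vertex 0)."""
    A = np.asarray(A)
    n = A.shape[0]
    seen, edges, stack = {0}, [], [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j != i and A[i, j] != 0 and j not in seen:
                seen.add(j)
                edges.append((i, j))
                stack.append(j)
    return edges

A_fig511 = np.array([[0, 1, 1, 0, 0],
                     [1, 0, 1, 0, 0],
                     [1, 1, 0, 1, 1],
                     [0, 0, 1, 0, 1],
                     [0, 0, 1, 1, 0]])
A1 = np.array([[0, 0, 0, 1, 0],
               [0, 0, 1, 0, 1],
               [0, 1, 0, 0, 1],
               [1, 0, 0, 0, 0],
               [0, 1, 1, 0, 0]])

perm = block_diagonal_order(A1)
A1_reordered = A1[np.ix_(perm, perm)]     # same rows and columns permuted
print(is_irreducible(A_fig511), is_irreducible(A1), perm)
```

Applying the same permutation to rows and columns is exactly the renumbering of vertices described above; for the disconnected example it produces the block form (5.1.5).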
5.2 Matrix transformations
In the first part of this book we are concerned very largely with matrix eigenvalue problems. One of the basic questions we face is this: ‘What operations, i.e., transformations, may we apply to a matrix, or a matrix pair, which will leave its eigenvalues unchanged, i.e., invariant?’ We now discuss this question.

Suppose $\mathbf{C}, \mathbf{A} \in M_n$. The set of matrices $\mathbf{C} - \lambda\mathbf{A}$ is called the matrix pencil based on the pair $(\mathbf{C}, \mathbf{A})$. As stated in Section 1.4, the eigenvalues of the pair $(\mathbf{C}, \mathbf{A})$ are the values of $\lambda$ for which the equation

\[
(\mathbf{C} - \lambda\mathbf{A})\mathbf{x} = \mathbf{0}
\]

has a non-trivial solution $\mathbf{x} \in V_n$. The eigenvalues are the roots of

\[
\det(\mathbf{C} - \lambda\mathbf{A}) = 0.
\]

Suppose $\mathbf{P}, \mathbf{R} \in M_n$ are constant matrices, i.e., they are independent of $\lambda$. Since

\[
\det(\mathbf{P}\mathbf{C}\mathbf{R} - \lambda\mathbf{P}\mathbf{A}\mathbf{R}) = \det(\mathbf{P}) \cdot \det(\mathbf{C} - \lambda\mathbf{A}) \cdot \det(\mathbf{R})
\]

we may deduce that if $\mathbf{P}, \mathbf{R}$ are non-singular, so that $\det(\mathbf{P}) \neq 0$, $\det(\mathbf{R}) \neq 0$, then

\[
\det(\mathbf{P}\mathbf{C}\mathbf{R} - \lambda\mathbf{P}\mathbf{A}\mathbf{R}) = 0 \text{ iff } \det(\mathbf{C} - \lambda\mathbf{A}) = 0,
\]

so that the transformation ‘premultiply by $\mathbf{P}$, and postmultiply by $\mathbf{R}$’ leaves the eigenvalues invariant. The transformation is called an equivalence transformation. It is a special equivalence relation (Ex. 5.2.1). In general, an equivalence (transformation) will transform a symmetric pencil into an unsymmetric pencil. Those which preserve symmetry are characterised by

\[
\mathbf{P} = \mathbf{R}^T. \tag{5.2.1}
\]

An equivalence changes a pencil $\mathbf{A} - \lambda\mathbf{I}$ into $\mathbf{P}\mathbf{A}\mathbf{R} - \lambda\mathbf{P}\mathbf{R}$. If it is to change $\mathbf{A} - \lambda\mathbf{I}$ into $\mathbf{B} - \lambda\mathbf{I}$, then we must choose $\mathbf{P}, \mathbf{R}$ so that $\mathbf{P}\mathbf{R} = \mathbf{I}$, i.e.,

\[
\mathbf{P} = \mathbf{R}^{-1}. \tag{5.2.2}
\]

An equivalence with this property is called a similarity (transformation). An equivalence which satisfies both (5.2.1) and (5.2.2) is called a rotation or an orthogonal transformation. We reserve the symbol $\mathbf{Q}$ to denote the ‘$\mathbf{P}$’ of such a transformation. Equations (5.2.1), (5.2.2) show that

\[
\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I}: \tag{5.2.3}
\]

$\mathbf{Q}$ is an orthogonal matrix; the matrices $\mathbf{U}$ and $\mathbf{X}$ in Section 4.2 were orthogonal matrices. We recall that the columns (rows) of an orthogonal matrix are mutually orthogonal, and each column (row) has norm 1; if $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n]$, then

\[
\mathbf{q}_i^T\mathbf{q}_j = \delta_{ij}. \tag{5.2.4}
\]

If $n = 2$, an orthogonal matrix has the form

\[
\mathbf{Q} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}. \tag{5.2.5}
\]

When $n = 2$, the eigenvalue problem relates to a plane, and this $\mathbf{Q}$ corresponds to a rotation of the $x, y$ axes through an angle $\theta$ about the $z$-axis.

It is difficult to write down the most general expression for an orthogonal matrix in $M_n$. Instead, we use the fact that a product of orthogonal matrices is itself orthogonal (Ex. 5.2.3). There is a particularly simple and powerful orthogonal matrix which can be constructed by making a rank-one change to the identity matrix:

\[
\mathbf{Q} = \mathbf{I} - 2\gamma\,\mathbf{x}\mathbf{x}^T \tag{5.2.6}
\]

will be orthogonal if

\[
\mathbf{Q}\mathbf{Q}^T = (\mathbf{I} - 2\gamma\mathbf{x}\mathbf{x}^T)(\mathbf{I} - 2\gamma\mathbf{x}\mathbf{x}^T) = \mathbf{I} - 4\gamma\mathbf{x}\mathbf{x}^T + 4\gamma^2(\mathbf{x}^T\mathbf{x})(\mathbf{x}\mathbf{x}^T) = \mathbf{I},
\]

i.e., if $\gamma$ is chosen so that

\[
\gamma = 1/\mathbf{x}^T\mathbf{x}. \tag{5.2.7}
\]
Such a transform is called a Householder transformation; note that $\mathbf{Q}$ in (5.2.6) is symmetric, i.e., $\mathbf{Q} = \mathbf{Q}^T$. Householder transformations are used in various contexts; one is the reduction of a symmetric matrix to tridiagonal form, as we now describe.

Suppose $\mathbf{Q}$ is given by (5.2.6), and $\mathbf{A} \in S_n$. We wish to choose $\mathbf{Q}$, i.e., find $\mathbf{x}$, so that the transformed matrix $\mathbf{Q}\mathbf{A}\mathbf{Q} = \mathbf{B}$ has zero elements in its first row and column, except for the first two, $b_{11}, b_{12}$. First consider the postmultiplication by $\mathbf{Q}$, and use the abbreviation $\rho_1$ for row 1 of a matrix. With $\mathbf{Q}$ given by (5.2.6), we have

\[
\mathbf{C} = \mathbf{A}\mathbf{Q} = \mathbf{A} - 2\gamma(\mathbf{A}\mathbf{x})\mathbf{x}^T, \qquad \rho_1(\mathbf{C}) = \rho_1(\mathbf{A}\mathbf{Q}) = \rho_1(\mathbf{A}) - 2\gamma(\rho_1(\mathbf{A})\mathbf{x})\mathbf{x}^T.
\]

Thus $c_{1i} = a_{1i} - 2\gamma(\rho_1(\mathbf{A})\mathbf{x})x_i$, $i = 1, 2, \ldots, n$. We now choose

\[
x_i = a_{1i}, \qquad i = 3, 4, \ldots, n. \tag{5.2.8}
\]

Then $c_{1i} = 0$, $i = 3, 4, \ldots, n$ if

\[
2\gamma(\rho_1(\mathbf{A})\mathbf{x}) = 1. \tag{5.2.9}
\]

This gives one equation for the remaining unknowns $x_1, x_2$. Now carry out the premultiplication:

\[
\mathbf{Q}\mathbf{A}\mathbf{Q} = \mathbf{B} = \mathbf{Q}\mathbf{C} = \mathbf{C} - 2\gamma\mathbf{x}(\mathbf{x}^T\mathbf{C}),
\]

so that

\[
\rho_1(\mathbf{B}) = \rho_1(\mathbf{C}) - 2\gamma\,\rho_1(\mathbf{x})(\mathbf{x}^T\mathbf{C}).
\]

Thus if the premultiplication is not to change the zero elements in the first row of $\mathbf{C}$, we must choose $x_1 = 0$. Now equations (5.2.7)-(5.2.9) give

\[
2(a_{12}x_2 + a_{13}^2 + \cdots + a_{1n}^2) = x_2^2 + (a_{13}^2 + \cdots + a_{1n}^2),
\]

which yields

\[
x_2 = a_{12} \pm S, \tag{5.2.10}
\]

where

\[
S^2 = \sum_{i=2}^{n} a_{1i}^2. \tag{5.2.11}
\]

Thus the required $\mathbf{x}$ is

\[
\mathbf{x} = \{0, a_{12} \pm S, a_{13}, \ldots, a_{1n}\} \tag{5.2.12}
\]

and for numerical purposes we choose the sign of $S$ to be that of $a_{12}$. This is the basic Householder transformation; it reduces an arbitrary symmetric $\mathbf{A}$ to a matrix

\[
\mathbf{B} = \begin{bmatrix} a_{11} & \mathbf{b}^T \\ \mathbf{b} & \mathbf{B}_1 \end{bmatrix}, \tag{5.2.13}
\]

where $\mathbf{b} = b_1\mathbf{e}_1$. This completes the first step in the reduction to tridiagonal form. Now we apply another Householder transformation to the submatrix $\mathbf{B}_1$, using a new $\mathbf{x}$ with $x_1 = 0 = x_2$. This second transformation will leave $a_{11}, \mathbf{b}, \mathbf{b}^T$ unchanged, and will eliminate all but the first two elements of the first row and column of $\mathbf{B}_1$. After $n - 2$ applications, the matrix becomes tridiagonal. Once the matrix has been reduced to tridiagonal form, its eigenvalues can easily be located by using the sign count function $s_r(\lambda)$ of Section 3.1. Details on the numerical implementation of this reduction may be found in Bishop, Gladwell and Michaelson (1965) [33] (Chapter 9), Golub and Van Loan (1983) [135] Section 8.2.

We make two comments. Because $x_1 = 0$, the $\mathbf{Q}$ in (5.2.6) may be written

\[
\mathbf{Q} = \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1 \end{bmatrix}, \tag{5.2.14}
\]

where $\mathbf{Q}_1$ is an orthogonal matrix in $M_{n-1}$. This has an important consequence. Not only does the transformation preserve $\sigma(\mathbf{A})$, i.e., $\sigma(\mathbf{A}) = \sigma(\mathbf{B})$, but also $\sigma(\mathbf{A}_1) = \sigma(\mathbf{B}_1)$.

Secondly, we can use a trivial modification of the Householder transformation to reduce a general symmetric matrix $\mathbf{A}$ to, say, pentadiagonal form. We take

\[
\mathbf{x} = \{0, 0, a_{13} \pm S, a_{14}, \ldots, a_{1n}\} \tag{5.2.15}
\]

where $S^2 = \sum_{i=3}^{n} a_{1i}^2$. This transformation preserves $\sigma(\mathbf{A})$, $\sigma(\mathbf{A}_1)$, $\sigma(\mathbf{A}_{1,2})$, where the last denotes the spectrum of $\mathbf{A}$ with rows and columns 1 and 2 removed.
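The reduction to tridiagonal form described above can be sketched as follows. This is a minimal illustration assuming NumPy, not a production routine: it forms each $\mathbf{Q}$ explicitly, which is wasteful but keeps the correspondence with (5.2.6)-(5.2.12) visible. The function name `householder_tridiagonalize` and the variable `gamma` (the scalar of (5.2.7)) are naming assumptions.

```python
import numpy as np

def householder_tridiagonalize(A):
    """Reduce symmetric A to tridiagonal form by n-2 Householder steps.

    Step k uses x with zeros in positions 0..k, so every Q has the form
    diag(1, Q1) as in (5.2.14).
    """
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(n - 2):
        x = np.zeros(n)
        x[k + 1:] = A[k, k + 1:]          # x_i = a_{ki} beyond the diagonal
        S = np.linalg.norm(x)             # S^2 = sum of those a_{ki}^2
        if S == 0.0:
            continue                      # row already reduced
        # x_{k+1} = a_{k,k+1} +/- S, sign of S chosen as that of a_{k,k+1}
        x[k + 1] += S if A[k, k + 1] >= 0 else -S
        gamma = 1.0 / (x @ x)             # (5.2.7)
        Q = np.eye(n) - 2.0 * gamma * np.outer(x, x)   # (5.2.6), symmetric
        A = Q @ A @ Q
    return A

# Illustrative use on a random symmetric matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M + M.T
T = householder_tridiagonalize(A)

print(np.allclose(np.triu(T, 2), 0.0))                          # tridiagonal
print(np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(A)))  # same spectrum
```

Since every step leaves the first row and column index untouched by $\mathbf{Q}_1$, the spectrum of the matrix with its first row and column deleted is preserved as well, which is the property used in Section 5.3.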
Exercises 5.2

1. An equivalence relation, ‘$a$ is related to $b$’, written $aRb$, has three defining properties:
 • reflexivity, $aRa$
 • symmetry, if $aRb$ then $bRa$
 • transitivity, if $aRb$ and $bRc$, then $aRc$
 A set of elements related by an equivalence relation is called an equivalence class. Use the joint operation ‘premultiply by $\mathbf{P}$ and postmultiply by $\mathbf{R}$’ (with $\mathbf{P}, \mathbf{R}$ non-singular) to define an equivalence relation and an equivalence class for matrix pairs $(\mathbf{C}, \mathbf{A})$.
2. Show that the transformation $\mathbf{B} = \mathbf{Q}\mathbf{A}\mathbf{Q}^T$ defines an equivalence relation and a corresponding equivalence class.
3. Show that if $\mathbf{Q}_1, \mathbf{Q}_2$ are orthogonal, then so is $\mathbf{Q}_1\mathbf{Q}_2$. Show by counterexample that if $\mathbf{Q}_1, \mathbf{Q}_2$ are symmetric, then $\mathbf{Q}_1\mathbf{Q}_2$ is not necessarily so.
4. Show that if $\mathbf{x}$ is given by (5.2.11), then $\gamma$ in (5.2.7) is given by $2\gamma S(S + a_{12}) = 1$.
5. Verify that the $\mathbf{Q}$ obtained as a result of $n - 2$ successive Householder transformations has the form (5.2.14).
5.3 The star and the path
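In Section 5.1 we noted that the graph associated with a bordered diagonal matrix (5.1.3) is a star on $n$ vertices, as in Figure 5.1.3. There is a particularly simple inverse eigenvalue problem connected with a bordered diagonal matrix $\mathbf{B}$: construct $\mathbf{B}$ so that $\sigma(\mathbf{B}) = (\lambda_i)_1^n$, $\sigma(\mathbf{B}_1) = (\mu_i)_1^{n-1}$. The usual variational arguments show that the two spectra must interlace, at least in a loose sense:

\[
\lambda_1 \leq \mu_1 \leq \lambda_2 \leq \cdots \leq \mu_{n-1} \leq \lambda_n. \tag{5.3.1}
\]

For simplicity we assume that the $(\mu_i)_1^{n-1}$ are distinct:

\[
\mu_1 < \mu_2 < \cdots < \mu_{n-1}. \tag{5.3.2}
\]

We write $\mathbf{B}$ in the form (5.1.3), i.e.,

\[
\mathbf{B} = \begin{bmatrix} a_1 & \hat{\mathbf{b}}^T \\ \hat{\mathbf{b}} & \mathbf{M} \end{bmatrix}, \tag{5.3.3}
\]

where $\mathbf{M}$ is diagonal, and $\hat{\mathbf{b}} = \{\hat b_1, \hat b_2, \ldots, \hat b_{n-1}\}$. Clearly, we can make $\sigma(\mathbf{B}_1) = (\mu_i)_1^{n-1}$ by taking $\mathbf{M} = \operatorname{diag}(\mu_1, \mu_2, \ldots, \mu_{n-1})$. The trace condition gives

\[
a_1 = \sum_{i=1}^{n} \lambda_i - \sum_{i=1}^{n-1} \mu_i. \tag{5.3.4}
\]

Now consider the eigenvector equations for $\mathbf{B}$:

\[
(a_1 - \lambda)v_1 + \sum_{i=1}^{n-1} \hat b_i v_{i+1} = 0, \qquad \hat b_i v_1 + (\mu_i - \lambda)v_{i+1} = 0, \quad i = 1, 2, \ldots, n - 1,
\]

which give the eigenvalue equation

\[
a_1 - \lambda - \sum_{i=1}^{n-1} \frac{\hat b_i^2}{\mu_i - \lambda} = 0.
\]

This is to have roots $(\lambda_i)_1^n$, so that

\[
a_1 - \lambda - \sum_{i=1}^{n-1} \frac{\hat b_i^2}{\mu_i - \lambda} = \frac{\prod_{i=1}^{n}(\lambda_i - \lambda)}{\prod_{i=1}^{n-1}(\mu_i - \lambda)} \tag{5.3.5}
\]

and hence

\[
\hat b_i^2 = -\frac{\prod_{j=1}^{n}(\mu_i - \lambda_j)}{\prod_{j=1}^{\prime\, n-1}(\mu_i - \mu_j)}, \qquad i = 1, 2, \ldots, n - 1, \tag{5.3.6}
\]

where, as usual, $'$ denotes $i \neq j$; the interlacing condition (5.3.1) yields $\hat b_i^2 \geq 0$. We can choose the sign of $\hat b_i$ to be $+$ or $-$. Because we have assumed that the $\mu_i$ are distinct, a given $\mu_i$ can coincide only with its neighbours $\lambda_i$ or $\lambda_{i+1}$.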
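Equation (5.3.6) shows that $\hat b_i = 0$ iff $\mu_i$ coincides with either of these two $\lambda$’s. If $\hat b_i = 0$, then the edge $(1, i + 1)$ is absent from the underlying graph.

Having constructed the bordered diagonal matrix $\mathbf{B}$, we have a new way to construct a tridiagonal $\mathbf{J}$ such that $\sigma(\mathbf{J}) = (\lambda_i)_1^n$, $\sigma(\mathbf{J}_1) = (\mu_i)_1^{n-1}$: we can apply Householder transformations to $\mathbf{B}$ to get $\mathbf{J}$. On account of Ex. 5.2.5, the transformation will have the form

\[
\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1 \end{bmatrix}
\begin{bmatrix} a_1 & \hat{\mathbf{b}}^T \\ \hat{\mathbf{b}} & \mathbf{M} \end{bmatrix}
\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1^T \end{bmatrix}
= \begin{bmatrix} a_1 & b_1\mathbf{e}_1^T \\ b_1\mathbf{e}_1 & \mathbf{J}_1 \end{bmatrix}, \tag{5.3.7}
\]

or equivalently

\[
\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1^T \end{bmatrix}
\begin{bmatrix} a_1 & b_1\mathbf{e}_1^T \\ b_1\mathbf{e}_1 & \mathbf{J}_1 \end{bmatrix}
\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1 \end{bmatrix}
= \begin{bmatrix} a_1 & \hat{\mathbf{b}}^T \\ \hat{\mathbf{b}} & \mathbf{M} \end{bmatrix}. \tag{5.3.8}
\]

On carrying out the multiplication, we find

\[
\mathbf{Q}_1^T\mathbf{J}_1\mathbf{Q}_1 = \mathbf{M}, \qquad \mathbf{Q}_1^T b_1\mathbf{e}_1 = \hat{\mathbf{b}}. \tag{5.3.9}
\]

The first equation shows that the eigenvectors of $\mathbf{J}_1$ are the columns of $\mathbf{Q}_1$: the $i$th eigenvector is

\[
\mathbf{q}_i = \{q_{1i}, q_{2i}, \ldots, q_{(n-1),i}\}. \tag{5.3.10}
\]

The second equation shows that, apart from the factor $b_1$, the vector $\hat{\mathbf{b}}$ is the vector of first components of the eigenvectors of $\mathbf{J}_1$:

\[
\hat{\mathbf{b}} = b_1\{q_{11}, q_{12}, \ldots, q_{1,n-1}\}. \tag{5.3.11}
\]

Thus, apart from the factor $b_1$, $\hat{\mathbf{b}}$ is the vector $\mathbf{x}_1$ needed for the construction of $\mathbf{J}_1$ from the Lanczos algorithm of Section 4.2. The factor $b_1$ is given by $b_1 = \|\hat{\mathbf{b}}\|$. Note the difference between (5.3.6) and (4.3.17): the former, according to (5.3.11), gives the first components of the eigenvectors of $\mathbf{J}_1$; the latter gives the first components of the eigenvectors of $\mathbf{J}$. Sussman-Fort (1982) [312] discusses connections between the inverse eigenvalue problems for Jacobi and bordered matrices.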
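The construction of this section can be sketched in Python as follows. The sketch assumes NumPy and strictly interlacing data; the function names (`bordered_from_spectra`, `lanczos`, `jacobi_from_spectra`) are hypothetical, all $\hat b_i$ are taken positive (one of the $2^{n-1}$ sign choices), and a full reorthogonalisation step is added to the scalar Lanczos recurrence for numerical safety, which is not part of the description in the text. The Jacobi matrix is assembled with negative codiagonal, following the convention of Section 5.1; the sign of the codiagonal does not affect either spectrum.

```python
import numpy as np

def bordered_from_spectra(lam, mu):
    """Bordered diagonal B with spectrum lam and sigma(B_1) = mu."""
    lam, mu = np.asarray(lam, float), np.asarray(mu, float)
    a1 = lam.sum() - mu.sum()                                   # (5.3.4)
    bhat2 = np.array([
        -np.prod(mu[i] - lam) / np.prod(np.delete(mu[i] - mu, i))
        for i in range(mu.size)
    ])                                                          # (5.3.6)
    bhat = np.sqrt(bhat2)            # assumes interlacing, so bhat2 >= 0
    B = np.diag(np.concatenate(([a1], mu)))
    B[0, 1:] = bhat
    B[1:, 0] = bhat
    return a1, bhat, B

def lanczos(A, x1):
    """Scalar Lanczos: Jacobi J (codiagonal -b) and orthogonal X, A = X J X^T."""
    n = A.shape[0]
    X = np.zeros((n, n))
    a, b = np.zeros(n), np.zeros(n - 1)
    X[:, 0] = x1 / np.linalg.norm(x1)
    for k in range(n):
        a[k] = X[:, k] @ A @ X[:, k]
        if k == n - 1:
            break
        z = a[k] * X[:, k] - A @ X[:, k]
        if k > 0:
            z -= b[k - 1] * X[:, k - 1]
        z -= X[:, :k + 1] @ (X[:, :k + 1].T @ z)   # reorthogonalise
        b[k] = np.linalg.norm(z)
        X[:, k + 1] = z / b[k]
    J = np.diag(a) - np.diag(b, 1) - np.diag(b, -1)
    return J, X

def jacobi_from_spectra(lam, mu):
    """Jacobi J with sigma(J) = lam and sigma(J_1) = mu."""
    a1, bhat, _ = bordered_from_spectra(lam, mu)
    J1, _ = lanczos(np.diag(np.asarray(mu, float)), bhat)  # x1 = bhat/||bhat||
    n = len(lam)
    J = np.zeros((n, n))
    J[0, 0] = a1
    J[0, 1] = J[1, 0] = -np.linalg.norm(bhat)               # b1 = ||bhat||
    J[1:, 1:] = J1
    return J

lam = [1.0, 3.0, 5.0, 7.0]
mu = [2.0, 4.0, 6.0]
a1, bhat, B = bordered_from_spectra(lam, mu)
J = jacobi_from_spectra(lam, mu)
print(a1, bhat**2)
```

For the data shown, $a_1 = 4$ and $\hat b_i^2 = 15/8,\ 9/4,\ 15/8$, whose sum $6$ is $b_1^2$.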
Exercises 5.3

1. Explore what happens to $\mathbf{J}$ when one or more of the $\mu$’s coincides with a $\lambda$.
5.4 Periodic Jacobi matrices
In Section 5.1 we showed that the graph underlying a periodic Jacobi matrix is a ring on $n$ vertices. The following analysis is due to Ferguson (1980) [87] and Boley and Golub (1984) [35], Boley and Golub (1987) [36].
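A periodic Jacobi matrix $\mathbf{J}_{per}$ has $2n$ terms, $(a_i, b_i)_1^n$. We show how to construct $\mathbf{J}_{per}$ from $\sigma(\mathbf{J}_{per}) = (\lambda_i)_1^n$, $\sigma(\mathbf{J}_{per,1}) = (\mu_i)_1^{n-1}$ and one extra piece of data:

\[
\beta = b_1 b_2 \cdots b_n. \tag{5.4.1}
\]

It is convenient to consider two matrices, the original matrix $\mathbf{J}_{per}$ of (5.1.4), and another matrix $\mathbf{J}^-_{per}$ with $b_n$ replaced by $-b_n$. We suppose $\sigma(\mathbf{J}^-_{per}) = (\lambda^-_i)_1^n$; clearly there are relations between the $\lambda_i$ and the $\lambda^-_i$. The $\lambda_i$ and $\mu_i$ will again interlace as in (5.3.1), as will the $\lambda^-_i$ and $\mu_i$; again we suppose that the $(\mu_i)_1^{n-1}$ are distinct, i.e., (5.3.2) holds.

We start by constructing two bordered diagonal matrices, $\mathbf{B}$ from $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$, $\mathbf{B}^-$ from $(\lambda^-_i)_1^n$ and $(\mu_i)_1^{n-1}$. They will have the form

\[
\mathbf{B} = \begin{bmatrix} a_1 & \hat{\mathbf{b}}^T \\ \hat{\mathbf{b}} & \mathbf{M} \end{bmatrix}, \qquad
\mathbf{B}^- = \begin{bmatrix} a^-_1 & \hat{\mathbf{b}}^{-T} \\ \hat{\mathbf{b}}^- & \mathbf{M} \end{bmatrix}. \tag{5.4.2}
\]

Here $a_1, \hat{\mathbf{b}}$ will be given by (5.3.4), (5.3.6), and $a^-_1, \hat{\mathbf{b}}^-$ will be obtained from (5.3.4), (5.3.6) by replacing $\lambda_i$ by $\lambda^-_i$. Since $\sigma(\mathbf{J}_{per}) = \sigma(\mathbf{B})$ and $\sigma(\mathbf{J}_{per,1}) = \sigma(\mathbf{B}_1)$, $\mathbf{J}_{per}$ and $\mathbf{B}$ are related by an orthogonal transformation of the form

\[
\mathbf{B} = \begin{bmatrix} a_1 & \hat{\mathbf{b}}^T \\ \hat{\mathbf{b}} & \mathbf{M} \end{bmatrix}
= \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1^T \end{bmatrix}
\begin{bmatrix} a_1 & b_1\mathbf{e}_1^T + b_n\mathbf{e}_{n-1}^T \\ b_1\mathbf{e}_1 + b_n\mathbf{e}_{n-1} & \mathbf{A}_1 \end{bmatrix}
\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1 \end{bmatrix}
\]

where $\mathbf{e}_1 = \{1, 0, \ldots, 0\}$, $\mathbf{e}_{n-1} = \{0, 0, \ldots, 1\}$ are in $V_{n-1}$, and similarly

\[
\mathbf{B}^- = \begin{bmatrix} a^-_1 & \hat{\mathbf{b}}^{-T} \\ \hat{\mathbf{b}}^- & \mathbf{M} \end{bmatrix}
= \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1^T \end{bmatrix}
\begin{bmatrix} a^-_1 & b_1\mathbf{e}_1^T - b_n\mathbf{e}_{n-1}^T \\ b_1\mathbf{e}_1 - b_n\mathbf{e}_{n-1} & \mathbf{A}_1 \end{bmatrix}
\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_1 \end{bmatrix}.
\]

The subblocks of these equations corresponding to $\hat{\mathbf{b}}$ and $\hat{\mathbf{b}}^-$ are

\[
\hat{\mathbf{b}} = \mathbf{Q}_1^T(b_1\mathbf{e}_1 + b_n\mathbf{e}_{n-1}), \qquad
\hat{\mathbf{b}}^- = \mathbf{Q}_1^T(b_1\mathbf{e}_1 - b_n\mathbf{e}_{n-1}),
\]

which on addition and subtraction give

\[
\hat{\mathbf{b}} + \hat{\mathbf{b}}^- = 2b_1\mathbf{Q}_1^T\mathbf{e}_1 = 2b_1\mathbf{x}_1, \qquad
\hat{\mathbf{b}} - \hat{\mathbf{b}}^- = 2b_n\mathbf{Q}_1^T\mathbf{e}_{n-1} = 2b_n\mathbf{x}_{n-1},
\]

where, as in (5.3.9), $\mathbf{x}_1$ is the first column of $\mathbf{Q}_1^T$ and $\mathbf{x}_{n-1}$ is the $(n - 1)$th column. If we know $\hat{\mathbf{b}}$ and $\hat{\mathbf{b}}^-$, then these equations give $b_1$ and $b_n$ (up to sign) since $\|\mathbf{x}_1\| = 1 = \|\mathbf{x}_{n-1}\|$. Once we have $a_1$, $b_1$ and $\mathbf{x}_1$ we may compute $\mathbf{J}_{per,1}$ from the Lanczos algorithm as before.

However, in finding $\mathbf{B}$ and $\mathbf{B}^-$, specifically in finding $\hat{\mathbf{b}}$ and $\hat{\mathbf{b}}^-$, we assumed that we knew both the $(\lambda_i)_1^n$ and the $(\lambda^-_i)_1^n$. We complete the analysis by showing that we can in fact find $\hat{\mathbf{b}}^-$ from the $(\lambda_i)_1^n$ and $\beta$ in (5.4.1).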
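The periodic Jacobi matrix $\mathbf{J}_{per}$ differs from a regular Jacobi matrix only in the presence of the entries $b_n$ in the corners. This means that $\det(\lambda\mathbf{I} - \mathbf{J}_{per})$ and $\det(\lambda\mathbf{I} - \mathbf{J}^-_{per})$ will differ from the $n$th principal minor $p_n(\lambda)$ only by terms linear and quadratic in $b_n$. In fact

\[
\det(\lambda\mathbf{I} - \mathbf{J}_{per}) = \prod_{i=1}^{n}(\lambda - \lambda_i) = p_n(\lambda) - b_n^2\,r_{n-2}(\lambda) - 2\beta,
\]
\[
\det(\lambda\mathbf{I} - \mathbf{J}^-_{per}) = \prod_{i=1}^{n}(\lambda - \lambda^-_i) = p_n(\lambda) - b_n^2\,r_{n-2}(\lambda) + 2\beta,
\]

where $r_{n-2}$ is the principal minor taken from rows and columns $2, 3, \ldots, n - 1$. Subtracting these two equations, we find

\[
\prod_{i=1}^{n}(\lambda - \lambda^-_i) = \prod_{i=1}^{n}(\lambda - \lambda_i) + 4\beta.
\]

This means that we can express $(\hat b^-_j)^2$ in terms of $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$:

\[
(\hat b^-_j)^2 = -\frac{\prod_{i=1}^{n}(\mu_j - \lambda^-_i)}{\prod_{i=1}^{\prime\, n-1}(\mu_j - \mu_i)} = -\frac{\prod_{i=1}^{n}(\mu_j - \lambda_i) + 4\beta}{\prod_{i=1}^{\prime\, n-1}(\mu_j - \mu_i)}.
\]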
But this expression is not automatically non-negative if the $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ satisfy the interlacing condition. We must examine this more closely. Suppose first that $\beta = 0$. The expression is certainly non-negative, and actually positive if the $\lambda$’s and $\mu$’s strictly interlace. If they strictly interlace then, from continuity considerations we can conclude that, for each value of $j$, the expression will be non-negative for $\beta$ lying in a closed interval $[-e_j, f_j]$ around zero, $e_j > 0$, $f_j > 0$. This means that all the $(\hat b^-_j)^2$ will actually be non-negative in the intersection of these closed intervals. For $\beta$ in this intersection the problem as posed has a solution; for $\beta$ outside this interval it has no (real) solution. Boley and Golub (1987) [36] present an algorithm to compute $\mathbf{J}_{per}$ in this way. See also Boley and Golub (1984) [35]. Xu (1998) [339] provides a detailed analysis of the problem and shows (Theorem 2.8.3) that there is a solution iff

\[
\prod_{k=1}^{n} |\mu_j - \lambda_k| \geq 2\beta\,(1 + (-1)^{n-j+1}) \tag{5.4.3}
\]

for all $j = 1, 2, \ldots, n - 1$. Note that if $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ are given, then the inequality (5.4.3) provides an upper bound for $\beta$. Andrea and Berry (1992) [9] provide a completely different approach to the problem via continued fractions.
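The determinant identity relating the two spectra is easy to check numerically. The Python sketch below assumes NumPy, uses made-up entries, and takes the codiagonal entries $b_i$ of the periodic matrix to be positive; the function names `periodic_jacobi` and `charpoly` are hypothetical. It verifies that $\det(\lambda\mathbf{I} - \mathbf{J}^-_{per}) - \det(\lambda\mathbf{I} - \mathbf{J}_{per}) = 4\beta$ for several $\lambda$, and that each $\lambda^-_k$ is a root of $\prod_i(\lambda - \lambda_i) + 4\beta$.

```python
import numpy as np

def periodic_jacobi(a, b, corner_sign=1.0):
    """Periodic Jacobi matrix: diagonal a, codiagonal b[0..n-2],
    corner entries corner_sign * b[n-1] (n >= 3 assumed)."""
    n = len(a)
    J = np.diag(np.asarray(a, float))
    for i in range(n - 1):
        J[i, i + 1] = J[i + 1, i] = b[i]
    J[0, n - 1] = J[n - 1, 0] = corner_sign * b[n - 1]
    return J

def charpoly(J, lam):
    """det(lam*I - J), evaluated directly."""
    return np.linalg.det(lam * np.eye(J.shape[0]) - J)

# Illustrative data.
a = np.array([2.0, 3.0, 1.5, 2.5, 3.5])
b = np.array([0.7, 0.4, 0.9, 0.6, 0.5])
beta = np.prod(b)                               # (5.4.1)

J_plus = periodic_jacobi(a, b, +1.0)            # J_per
J_minus = periodic_jacobi(a, b, -1.0)           # b_n replaced by -b_n

lams = np.linspace(-1.0, 6.0, 8)
diff = np.array([charpoly(J_minus, t) - charpoly(J_plus, t) for t in lams])

lam_plus = np.linalg.eigvalsh(J_plus)
lam_minus = np.linalg.eigvalsh(J_minus)
residual = np.array([np.prod(t - lam_plus) + 4.0 * beta for t in lam_minus])

print(np.allclose(diff, 4.0 * beta), np.allclose(residual, 0.0))
```

The check illustrates why $\beta$ is the only extra datum needed: the whole spectrum of $\mathbf{J}^-_{per}$ is determined by $(\lambda_i)_1^n$ and $\beta$.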
5.5 The block Lanczos algorithm
In Section 5.1, we exhibited Figs. 5.1.5 and 5.1.6, and showed that the matrices underlying these graphs were pentadiagonal or block tridiagonal. In order to develop methods for solving inverse problems for such systems, we need a block version of the fundamental Lanczos algorithm described in Section 4.2.
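First we recall the original scalar version: Given a symmetric matrix $\mathbf{A}$, and a vector $\mathbf{x}_1$, compute a Jacobi matrix $\mathbf{J}$ as in equation (4.2.1) and an orthogonal matrix $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n]$ such that $\mathbf{A} = \mathbf{X}\mathbf{J}\mathbf{X}^T$. The algorithm proceeds by using the two equations

\[
\mathbf{J} = \mathbf{X}^T\mathbf{A}\mathbf{X}, \qquad \mathbf{A}\mathbf{X} = \mathbf{X}\mathbf{J}, \tag{5.5.1}
\]

alternately. Thus the (1,1) term in (5.5.1a) gives $a_1 = \mathbf{x}_1^T\mathbf{A}\mathbf{x}_1$ and the first column of (5.5.1b) gives

\[
\mathbf{A}\mathbf{x}_1 = a_1\mathbf{x}_1 - b_1\mathbf{x}_2,
\]

which we rewrite as

\[
b_1\mathbf{x}_2 = a_1\mathbf{x}_1 - \mathbf{A}\mathbf{x}_1 = \mathbf{z}_2, \tag{5.5.2}
\]

which gives

\[
b_1 = \|\mathbf{z}_2\|, \qquad \mathbf{x}_2 = \mathbf{z}_2/b_1.
\]

Now the (2,2) term in $\mathbf{J}$ gives $a_2 = \mathbf{x}_2^T\mathbf{A}\mathbf{x}_2$, and the second column of (5.5.1b) is

\[
\mathbf{A}\mathbf{x}_2 = -b_1\mathbf{x}_1 + a_2\mathbf{x}_2 - b_2\mathbf{x}_3
\]

which we rewrite as

\[
b_2\mathbf{x}_3 = -b_1\mathbf{x}_1 + a_2\mathbf{x}_2 - \mathbf{A}\mathbf{x}_2 = \mathbf{z}_3
\]

which gives

\[
b_2 = \|\mathbf{z}_3\|, \qquad \mathbf{x}_3 = \mathbf{z}_3/b_2
\]

and so on.

We now construct a block version of these equations, following Boley and Golub (1987) [36]. We start with a symmetric matrix $\mathbf{A} \in S_n$ and suppose $n = ps$ for some integer $s$. We will reduce $\mathbf{A}$ to a block tridiagonal matrix $\mathbf{J}$, where

\[
\mathbf{J} = \begin{bmatrix} \mathbf{A}_1 & -\mathbf{B}_1^T & & \\ -\mathbf{B}_1 & \mathbf{A}_2 & -\mathbf{B}_2^T & \\ & \ddots & \ddots & \ddots \\ & & -\mathbf{B}_{s-1} & \mathbf{A}_s \end{bmatrix}. \tag{5.5.3}
\]

Here $\mathbf{A}_1, \ldots, \mathbf{A}_s$ are symmetric, i.e., in $S_p$, and the $\mathbf{B}_i$ are upper triangular matrices in $M_p$. We assume that in addition to $\mathbf{A}$, we are given $p$ orthonormal vectors $(\mathbf{x}_i)_1^p \in V_n$ which form the columns of $\mathbf{X}_1 = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_p] \in M_{n,p}$. The matrix $\mathbf{X}_1$ therefore satisfies $\mathbf{X}_1^T\mathbf{X}_1 = \mathbf{I}_p$. The aim of the procedure is to construct $\mathbf{J}$ and an orthogonal matrix $\mathbf{X} = [\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_s]$ such that $\mathbf{A} = \mathbf{X}\mathbf{J}\mathbf{X}^T$.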
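Just as in the scalar Lanczos process, we consider the two equations

\[
\mathbf{J} = \mathbf{X}^T\mathbf{A}\mathbf{X}, \qquad \mathbf{A}\mathbf{X} = \mathbf{X}\mathbf{J}. \tag{5.5.4}
\]

The first $p \times p$ block of the first equation gives

\[
\mathbf{A}_1 = \mathbf{X}_1^T\mathbf{A}\mathbf{X}_1
\]

while the first $n \times p$ block of the second gives

\[
\mathbf{A}\mathbf{X}_1 = \mathbf{X}_1\mathbf{A}_1 - \mathbf{X}_2\mathbf{B}_1
\]

which we rewrite as

\[
\mathbf{X}_2\mathbf{B}_1 = \mathbf{X}_1\mathbf{A}_1 - \mathbf{A}\mathbf{X}_1 = \mathbf{Z}_2.
\]

In the scalar version we had $b_1\mathbf{x}_2 = \mathbf{z}_2$, from which we immediately concluded that $b_1 = \|\mathbf{z}_2\|$, and hence $\mathbf{x}_2 = \mathbf{z}_2/b_1$. In the block version we have constructed $\mathbf{Z}_2 \in M_{n,p}$ and we wish to write it as $\mathbf{X}_2\mathbf{B}_1$. Write

\[
\mathbf{X}_2 = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_p], \qquad \mathbf{Z}_2 = [\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_p]
\]

and

\[
\mathbf{B}_1 = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1p} \\ & b_{22} & \cdots & b_{2p} \\ & & \ddots & \vdots \\ & & & b_{pp} \end{bmatrix};
\]

then finding $(\mathbf{y}_i)_1^p$ and the elements of $\mathbf{B}_1$ is essentially a Gram-Schmidt process: finding orthonormal combinations of the vectors $(\mathbf{z}_i)_1^p$. Thus $b_{11}\mathbf{y}_1 = \mathbf{z}_1$ implies

\[
b_{11} = \pm\|\mathbf{z}_1\|, \qquad \mathbf{y}_1 = \mathbf{z}_1/b_{11} \tag{5.5.5}
\]

and then $b_{12}\mathbf{y}_1 + b_{22}\mathbf{y}_2 = \mathbf{z}_2$ gives

\[
b_{12} = \mathbf{y}_1^T\mathbf{z}_2, \qquad b_{22}\mathbf{y}_2 = \mathbf{z}_2 - b_{12}\mathbf{y}_1 = \mathbf{w}_2
\]

so that

\[
b_{22} = \pm\|\mathbf{w}_2\|, \qquad \mathbf{y}_2 = \mathbf{w}_2/b_{22}, \quad \text{etc.} \tag{5.5.6}
\]

The Gram-Schmidt process is closely related to the QR algorithm. The decomposition $\mathbf{X}_2\mathbf{B}_1 = \mathbf{Z}_2$ involves writing $\mathbf{Z}_2$ as the product of $\mathbf{X}_2$, which is in $M_{n,p}$ but which satisfies $\mathbf{X}_2^T\mathbf{X}_2 = \mathbf{I}_p$, and an upper triangular matrix $\mathbf{B}_1 \in M_p$. Because $\mathbf{X}_2$ is not simply an orthogonal matrix in $M_p$, the usual QR algorithm has to be modified to effect the decomposition.

Now we can proceed as before. We have found $\mathbf{X}_2$, so that $\mathbf{A}_2 = \mathbf{X}_2^T\mathbf{A}\mathbf{X}_2$ and

\[
\mathbf{A}\mathbf{X}_2 = -\mathbf{X}_1\mathbf{B}_1^T + \mathbf{X}_2\mathbf{A}_2 - \mathbf{X}_3\mathbf{B}_2
\]

so that

\[
\mathbf{X}_3\mathbf{B}_2 = -\mathbf{X}_1\mathbf{B}_1^T + \mathbf{X}_2\mathbf{A}_2 - \mathbf{A}\mathbf{X}_2 = \mathbf{Z}_3
\]

from which $\mathbf{X}_3, \mathbf{B}_2$ may be found, as before, by Gram-Schmidt. Note that different choices for the square roots, as in (5.5.5) and (5.5.6), will lead to different matrices $\mathbf{J}$. Boley and Golub (1987) [36] present a detailed algorithm for the process.

Further studies on the block-Lanczos algorithm have been carried out by Underwood (1975) [325] and Golub and Underwood (1977) [134]. See also Mattis and Hochstadt (1981) [222]. A completely different and highly efficient procedure for the solution of band matrix inverse problems has been developed by Biegler-König (1980) [28], Biegler-König (1981a) [29], Biegler-König (1981b) [30], Biegler-König (1981c) [31]. See also Gragg and Harrod (1984) [153] for a procedure based on Rutishauser’s algorithm; they explore the connections to a number of other problems. See also Gladwell and Willms (1989) [114] and Friedland (1977) [92], Friedland (1979) [93], and particularly, Chu (1998) [58].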
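The block recurrence can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the algorithm of Boley and Golub: it assumes NumPy, uses the reduced QR factorisation `numpy.linalg.qr` in place of explicit Gram-Schmidt (so the signs of the diagonal of each $\mathbf{B}_i$ are whatever that routine returns), adds a full reorthogonalisation step for numerical safety, and assumes no breakdown (every $\mathbf{Z}_i$ has full column rank). The function name `block_lanczos` is hypothetical.

```python
import numpy as np

def block_lanczos(A, X1):
    """Block Lanczos: return block tridiagonal J and orthogonal X with
    A = X J X^T, X[:, :p] = X1, sub-diagonal blocks -B_i (B_i upper triangular)."""
    n, p = X1.shape
    s = n // p                                  # n = p*s assumed
    Xs, A_blocks, B_blocks = [X1], [], []
    for i in range(s):
        Xi = Xs[i]
        Ai = Xi.T @ A @ Xi                      # diagonal block A_i
        A_blocks.append(Ai)
        if i == s - 1:
            break
        Z = Xi @ Ai - A @ Xi                    # Z_{i+1} = X_i A_i - A X_i ...
        if i > 0:
            Z -= Xs[i - 1] @ B_blocks[i - 1].T  # ... - X_{i-1} B_{i-1}^T
        Xprev = np.hstack(Xs)
        Z -= Xprev @ (Xprev.T @ Z)              # reorthogonalise (added safeguard)
        Q, R = np.linalg.qr(Z)                  # Z = X_{i+1} B_i, B_i upper triangular
        Xs.append(Q)
        B_blocks.append(R)
    X = np.hstack(Xs)
    J = np.zeros((n, n))
    for i, Ai in enumerate(A_blocks):
        J[i * p:(i + 1) * p, i * p:(i + 1) * p] = Ai
    for i, Bi in enumerate(B_blocks):
        J[(i + 1) * p:(i + 2) * p, i * p:(i + 1) * p] = -Bi
        J[i * p:(i + 1) * p, (i + 1) * p:(i + 2) * p] = -Bi.T
    return J, X

# Illustrative use: n = 6, p = 2, so J is pentadiagonal.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M + M.T
X1, _ = np.linalg.qr(rng.standard_normal((6, 2)))   # two orthonormal start vectors
J, X = block_lanczos(A, X1)

print(np.allclose(X.T @ X, np.eye(6)), np.allclose(X @ J @ X.T, A))
```

With $p = 2$ and upper triangular $\mathbf{B}_i$ the resulting $\mathbf{J}$ has bandwidth five, which is why the block algorithm is the natural tool for the pentadiagonal problem of Section 5.6.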
5.6 Inverse problems for pentadiagonal matrices
We could pose an inverse eigenvalue problem for a general symmetric matrix with $2p + 1$ bands, as in Boley and Golub (1987) [36]. Instead, we will confine ourselves to the case $p = 2$, a pentadiagonal matrix $\mathbf{A}$. The pentadiagonal case occurs in the inverse problem for a vibrating beam, but we shall defer considering the beam until we have discussed positivity in Chapter 6; the pentadiagonal matrix giving the stiffness matrix of the beam has a very special form; certain terms in it must be positive, and others must be negative. In this section we will not be concerned with these matters of sign.

Suppose we are given

\[
\sigma(\mathbf{A}) = (\lambda_i)_1^n, \qquad \sigma(\mathbf{A}_1) = (\mu_i)_1^{n-1}, \qquad \sigma(\mathbf{A}_{1,2}) = (\nu_i)_1^{n-2}, \tag{5.6.1}
\]

where, as before, $\sigma(\mathbf{A}_{1,2})$ denotes the spectrum of $\mathbf{A}$ when its first two rows and columns are removed. Clearly the eigenvalues must interlace; and for simplicity we assume that the interlacing is strict:

\[
\lambda_1 < \mu_1 < \lambda_2 < \cdots < \mu_{n-1} < \lambda_n, \tag{5.6.2}
\]
\[
\mu_1 < \nu_1 < \mu_2 < \cdots < \nu_{n-2} < \mu_{n-1}. \tag{5.6.3}
\]

Our aim is to construct $\mathbf{A}$ such that (5.6.1) holds. We write

\[
\mathbf{A} = \begin{bmatrix} a_1 & \mathbf{b}^T \\ \mathbf{b} & \mathbf{A}_1 \end{bmatrix}, \tag{5.6.4}
\]

where only the first two components of the vector $\mathbf{b}$ are non-zero. We denote the eigenvector matrix of $\mathbf{A}$ by $\mathbf{Q}$, and of $\mathbf{A}_1$ by $\mathbf{Q}^{(1)}$ so that

\[
\mathbf{Q}^T\mathbf{A}\mathbf{Q} = \boldsymbol{\Lambda}, \qquad \mathbf{Q}^{(1)T}\mathbf{A}_1\mathbf{Q}^{(1)} = \mathbf{M}. \tag{5.6.5}
\]
The eigenvectors of A are therefore ql , where Q = [q1 > q2 > = = = > qq ] while those (1) (1) (1) (1) of A1 are ql , where Q(1) = [q1 > q2 > = = = > qq1 ]. We start by constructing a bordered diagonal matrix, as in Section 5.3: ¸ · ˆW d1 b (5.6.6) B= ˆ b M such that (B) = (l )q1 , and (M) = (l )q1 . The term d1 is given by the 1 trace: q q1 X X l l > (5.6.7) d1 = l=1
l=1
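while $\hat{\mathbf{b}}$ is given by (5.3.6):

$$(\hat{b}_i)^2 = -\frac{\prod_{j=1}^{n} (\mu_i - \lambda_j)}{\prod_{j=1,\, j \neq i}^{n-1} (\mu_i - \mu_j)}. \tag{5.6.8}$$

Now, following equation (5.3.8) we relate $\mathbf{A}$ to $\mathbf{B}$:

$$\begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}^{(1)T} \end{bmatrix} \begin{bmatrix} a_1 & \mathbf{b}^T \\ \mathbf{b} & \mathbf{A}_1 \end{bmatrix} \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}^{(1)} \end{bmatrix} = \begin{bmatrix} a_1 & \hat{\mathbf{b}}^T \\ \hat{\mathbf{b}} & \mathbf{M} \end{bmatrix} = \mathbf{B}. \tag{5.6.9}$$

As in (5.3.9), we have

$$\mathbf{Q}^{(1)T} \mathbf{b} = \hat{\mathbf{b}}. \tag{5.6.10}$$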
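The construction of the bordered diagonal data in (5.6.7) and (5.6.8) can be sketched in a few lines. The following is a minimal illustration, not the author's algorithm: the function name `bordered_diagonal_data` and the two sample spectra are assumptions chosen for the example. It uses exact rational arithmetic and checks the result through the secular equation $a_1 - \lambda - \sum_i \hat{b}_i^2/(\mu_i - \lambda) = 0$, which must vanish at every $\lambda_k$.

```python
from fractions import Fraction
from math import prod


def bordered_diagonal_data(lam, mu):
    """Return (a1, [bhat_i^2]) for the bordered diagonal matrix B of (5.6.6).

    lam: the n eigenvalues of B; mu: the n-1 diagonal entries of M.
    The two lists are assumed to interlace strictly.
    """
    a1 = Fraction(sum(lam) - sum(mu))            # trace condition (5.6.7)
    bhat_sq = []
    for i, m in enumerate(mu):
        num = prod(Fraction(m - l) for l in lam)
        den = prod(Fraction(m - other) for j, other in enumerate(mu) if j != i)
        bhat_sq.append(-num / den)               # equation (5.6.8)
    return a1, bhat_sq


# Illustrative data (an assumption, not taken from the text).
lam = [1, 3, 5, 7]
mu = [2, 4, 6]
a1, bhat_sq = bordered_diagonal_data(lam, mu)


def secular(x):
    """a1 - x - sum_i bhat_i^2 / (mu_i - x); zero exactly at the eigenvalues of B."""
    return a1 - x - sum(b2 / (m - x) for b2, m in zip(bhat_sq, mu))


residuals = [secular(l) for l in lam]
```

Strict interlacing is what makes every `bhat_sq` entry positive, so real square roots exist; the choice of sign for each $\hat{b}_i$ is left open here.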
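Now however, in contrast to the situation in Section 5.3, $\mathbf{b}$ is not just a multiple of $\mathbf{e}_1$, so that $\hat{\mathbf{b}}$ does not give the vector of first components of the eigenvectors of $\mathbf{A}_1$. But we can use the analysis of Section 4.3 to obtain the first components of the eigenvectors of $\mathbf{A}$ and $\mathbf{A}_1$:

$$q_{1i}^2 = \frac{\prod_{j=1}^{n-1} (\mu_j - \lambda_i)}{\prod_{j=1,\, j \neq i}^{n} (\lambda_j - \lambda_i)}, \qquad (q_{1i}^{(1)})^2 = \frac{\prod_{j=1}^{n-2} (\nu_j - \mu_i)}{\prod_{j=1,\, j \neq i}^{n-1} (\mu_j - \mu_i)}. \tag{5.6.11}$$

To apply the block Lanczos algorithm to construct $\mathbf{A}$ we need not just the vector $\mathbf{x}_1$ of first components of eigenvectors of $\mathbf{A}$, but also $\mathbf{x}_2$ of second components, making up $\mathbf{X}_1 = [\mathbf{x}_1, \mathbf{x}_2] \in M_{n,2}$. Partition the vector $\mathbf{q}_i$:

$$\mathbf{q}_i = \begin{bmatrix} q_{1i} \\ \mathbf{y}_i \end{bmatrix}, \qquad \mathbf{y}_i \in V_{n-1}. \tag{5.6.12}$$

Since $\mathbf{q}_i$ is the $i$th eigenvector of $\mathbf{A}$, and $\mathbf{A}$ is given by (5.6.4), we may write

$$\begin{bmatrix} a_1 & \mathbf{b}^T \\ \mathbf{b} & \mathbf{A}_1 \end{bmatrix} \begin{bmatrix} q_{1i} \\ \mathbf{y}_i \end{bmatrix} = \lambda_i \begin{bmatrix} q_{1i} \\ \mathbf{y}_i \end{bmatrix}, \tag{5.6.13}$$

so that

$$q_{1i} \mathbf{b} + \mathbf{A}_1 \mathbf{y}_i = \lambda_i \mathbf{y}_i.$$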
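A quick consistency check on (5.6.11) is that the squared first components must be positive and sum to one, because they form the first row of an orthogonal matrix. The sketch below evaluates the formula exactly for three sample spectra; the helper name `first_component_squares` and the data are assumptions made for illustration only.

```python
from fractions import Fraction
from math import prod


def first_component_squares(outer, inner):
    """Squared first components of the normalised eigenvectors, as in (5.6.11).

    outer: spectrum of a symmetric matrix;
    inner: spectrum of the same matrix with its first row and column removed.
    """
    squares = []
    for i, li in enumerate(outer):
        num = prod(Fraction(m - li) for m in inner)
        den = prod(Fraction(lj - li) for j, lj in enumerate(outer) if j != i)
        squares.append(num / den)
    return squares


# Illustrative, strictly interlacing spectra (assumed, not from the text).
lam = [1, 3, 5, 7]   # spectrum of A
mu = [2, 4, 6]       # spectrum of A_1
nu = [3, 5]          # spectrum of A_{1,2}

q1_sq = first_component_squares(lam, mu)     # first components for A
q1_1_sq = first_component_squares(mu, nu)    # first components for A_1
```

Both lists sum to exactly 1, which is the Lagrange interpolation identity behind the formula.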
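Now premultiply by $\mathbf{Q}^{(1)T}$ to obtain

$$q_{1i} \mathbf{Q}^{(1)T} \mathbf{b} + \mathbf{Q}^{(1)T} \mathbf{A}_1 \mathbf{y}_i = \lambda_i \mathbf{Q}^{(1)T} \mathbf{y}_i. \tag{5.6.14}$$

But equation (5.6.10) gives $\mathbf{Q}^{(1)T} \mathbf{b} = \hat{\mathbf{b}}$, and equation (5.6.5b) gives $\mathbf{Q}^{(1)T} \mathbf{A}_1 = \mathbf{M} \mathbf{Q}^{(1)T}$, so that equation (5.6.14) gives

$$q_{1i} \hat{\mathbf{b}} = -(\mathbf{M} - \lambda_i \mathbf{I}) \mathbf{Q}^{(1)T} \mathbf{y}_i$$

and hence

$$\mathbf{y}_i = -q_{1i} \mathbf{Q}^{(1)} (\mathbf{M} - \lambda_i \mathbf{I})^{-1} \hat{\mathbf{b}}.$$

We need just the first term in $\mathbf{y}_i$; it is

$$y_{1i} = -q_{1i} \sum_{j=1}^{n-1} \frac{q_{1j}^{(1)} \hat{b}_j}{\mu_j - \lambda_i}, \qquad i = 1, 2, \ldots, n. \tag{5.6.15}$$

Since $\hat{b}_j$ is given by (5.6.8), and $q_{1i}$, $q_{1j}^{(1)}$ are given by (5.6.11), this equation yields $y_{1i}$, and hence $\mathbf{x}_2 = \{y_{11}, y_{12}, \ldots, y_{1n}\}$.

Exercises 5.6
1. Verify that the vector $\mathbf{x}_2$ given by (5.6.15) is indeed orthogonal to $\mathbf{x}_1$, as required.
2. Extend the procedure described in this section to the general case of a $2p + 1$ band matrix.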
5.7
Inverse eigenvalue problems for a tree
The inverse eigenvalue problems for a path and a star are particular examples of a general problem. Both the path, as shown in Figure 5.1.2, and the star, in Figure 5.1.3, are trees, as defined in Section 5.1. The matrices corresponding to these trees are a Jacobi matrix $\mathbf{J}$, or, as we will choose here, a sign-reversed Jacobi matrix $\mathbf{A} = \tilde{\mathbf{J}}$, and a bordered diagonal matrix respectively. In both problems, two spectra were specified, namely $\sigma(\mathbf{A}) = (\lambda_i)_1^n$, and $\sigma(\mathbf{A}_{,1}) = (\mu_i)_1^{n-1}$; the second spectrum corresponded to the eigenvectors $\mathbf{u}$ set to zero at a prescribed vertex, vertex 1. In both cases the two spectra had to satisfy the Cauchy interlacing inequalities

$$\lambda_1 \leq \mu_1 \leq \lambda_2 \leq \cdots \leq \mu_{n-1} \leq \lambda_n. \tag{5.7.1}$$
In both cases also, if the inequalities (5.7.1) were all strict, the matrix $\mathbf{A}$ was irreducible, and the corresponding graph $\mathcal{G}$ was connected. The purpose of this section is to serve as an introduction to an important paper by Duarte (1989) [81]. This paper reviews the history of inverse eigenvalue problems for trees, and establishes a general result. We will present analysis covering the simpler parts of the general case. As we will do sometimes in Chapter 6, Duarte labels eigenvalues in decreasing order, and we do the same. Specifically, we will show that if $\mathcal{G}$ is a tree on $n$ vertices $\mathcal{V}$, and if two spectra $(\lambda_i)_1^n$, $(\mu_i)_1^{n-1}$ are given, satisfying

$$\lambda_1 > \mu_1 > \lambda_2 > \cdots > \mu_{n-1} > \lambda_n > 0, \tag{5.7.2}$$

then we can find a symmetric matrix $\mathbf{A} \in S_n$ on $\mathcal{G}$ such that $\sigma(\mathbf{A}) = (\lambda_i)_1^n$, $\sigma(\mathbf{A}_{,1}) = (\mu_i)_1^{n-1}$. We take the strict interlacing and the positivity condition for simplicity; Duarte relaxes these conditions.

We start by observing that the two cases that we have considered so far, the path (Jacobi), and star (bordered diagonal), have common features. First, we note that the entries of the constructed matrices may be considered as functions of the data $(\lambda_i)_1^n$, $(\mu_i)_1^{n-1}$. Secondly, we note that in both matrices there are $n^2 - n - 2(n - 1) = n^2 - 3n + 2$ constant functions, which in fact are all zero. This suggests the following questions:
1. Can the constant functions appearing in $\mathbf{A}$ be other than the zero function?
2. Can the number of these constant functions be increased?

The answer to the first question is NO. For if $\mathbf{A} \in S_n$ has eigenvalues $(\lambda_i)_1^n$, then (Ex. 5.7.1) $|a_{ij}| \leq \max_k |\lambda_k|$, so that $\mathbf{A}$ can have no fixed entry, independent of the eigenvalues, other than zero.

To answer the second question we note that if the inequalities (5.7.2) hold, then $\mathbf{A}$ must be irreducible. For if $\mathbf{A}$ were reducible, i.e., after possibly renumbering the vertices, it could be written

$$\mathbf{A} = \begin{bmatrix} \mathbf{B} & \mathbf{0} \\ \mathbf{0} & \mathbf{C} \end{bmatrix},$$

then $\mathbf{A}$ and $\mathbf{A}_{,1}$ (which after renumbering, would be $\mathbf{A}_{,i}$) would have a common eigenvalue, a situation that is precluded by (5.7.2). Thus $\mathbf{A}$ is irreducible and $\mathcal{G}$ is connected. Now we note that $\mathbf{A}$ must be positive definite so that no diagonal term $a_{ii}$ can be zero. The maximum number of zero entries will be attained for matrices whose graph is a tree, and this number is precisely $n^2 - 3n + 2$ (Ex. 5.7.2). Thus the answer to the second question is NO also.

Having answered these questions, we proceed to the analysis. We start by considering a tree $\mathcal{G}$, choose a vertex of $\mathcal{V}$, label it 1, and see the effect of deleting vertex 1 - this is the graph corresponding to deleting row 1 and column 1 of $\mathbf{A}$. First, we need a symbol, $\mathcal{N}$, to denote the set of $m$ vertices $j$ of $\mathcal{G}$ which are connected to vertex 1.

Now we use $\mathcal{G}'$ to denote the graph obtained from $\mathcal{G}$ by deleting vertex 1. Figure 5.7.1 shows two examples. In Figure 5.7.1a, where vertex 1 is at the end of a path, $\mathcal{N} = \{2\}$ and $\mathcal{G}'$ is the connected graph with vertices $\{2, 3, 4, 5, 6\}$. In Figure 5.7.1b, $\mathcal{N} = \{2, 4\}$ and $\mathcal{G}'$ has two connected components, one on either side of vertex 1; we call these $\mathcal{G}'_2$, $\mathcal{G}'_4$ respectively. In general, $\mathcal{G}'$ will have $m$ connected components which we label $\mathcal{G}'_j$, $j \in \mathcal{N}$; the corresponding matrix $\mathbf{A}_{,1}$ will have $m$ irreducible components.
Figure 5.7.1 - Deleting vertex 1 from a path.

Figure 5.7.2 shows another example, a star. Now $\mathcal{N} = \{2, 3, 4, 5\}$ and $\mathcal{G}'$ has 4 connected components: $\mathcal{G}'_j = \{j\}$, $j = 2, 3, 4, 5$.
Figure 5.7.2 - Deleting the centre of a star.

Finally, we need a symbol for the graph obtained by deleting vertex $j \in \mathcal{N}$ from $\mathcal{G}'_j$; we call it $\mathcal{G}''_j$. Figure 5.7.3 shows these subgraphs for the graphs $\mathcal{G}'$ in Figure 5.7.1.
Figure 5.7.3 - The subgraphs $\mathcal{G}''_j$ for the graphs $\mathcal{G}'$.
Note that for the star, the vertex set of $\mathcal{G}''_j$ is empty because the vertex set of $\mathcal{G}'_j$ is $\{j\}$. Having established notation that allows us to see what happens when we delete a vertex of a graph, we need to consider the two sets of eigenvalues, and how these relate to the matrix $\mathbf{A}$. To do this we first return to two examples we have already treated, those corresponding to deleting an end vertex of a path, and the centre of a star.

First, the path with end vertex 1. The eigenvalues $(\lambda_i)_1^n$ and $(\mu_i)_1^{n-1}$ are the zeros of the trailing monic principal minors, $P'_n(\lambda)$, $P'_{n-1}(\lambda)$ respectively, and, in the notation of equation (4.3.4), these are linked by

$$P'_n(\lambda) = (\lambda - a_1) P'_{n-1}(\lambda) - b_1^2 P'_{n-2}(\lambda). \tag{5.7.3}$$

We note that the graphs corresponding to $P'_n$, $P'_{n-1}$, $P'_{n-2}$ are precisely $\mathcal{G}$, $\mathcal{G}' \equiv \mathcal{G}'_2$ and $\mathcal{G}'' \equiv \mathcal{G}''_2$; in fact $P'_n$, $P'_{n-1}$, $P'_{n-2}$ are the characteristic polynomials $\Delta$ of $\mathbf{A}$, and of the submatrices of $\mathbf{A}$ on $\mathcal{G}'$ and $\mathcal{G}''_2$:

$$P'_n(\lambda) = \Delta(\mathbf{A}), \qquad P'_{n-1}(\lambda) = \Delta(\mathbf{A}(\mathcal{G}')), \qquad P'_{n-2}(\lambda) = \Delta(\mathbf{A}(\mathcal{G}''_2)).$$

We note also that $a_1 = a_{11}$, $b_1 = a_{12}$ and $\mathcal{N} = \{2\}$. This means that we can write (5.7.3) as

$$\Delta(\mathbf{A}) = (\lambda - a_{11}) \Delta(\mathbf{A}(\mathcal{G}')) - \sum_{j \in \mathcal{N}} a_{1j}^2 \Delta(\mathbf{A}(\mathcal{G}''_j)). \tag{5.7.4}$$
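Now consider the star. The equation corresponding to (5.7.3) is equation (5.3.5):

$$\lambda - a_1 - \sum_{i=1}^{n-1} \frac{\hat{b}_i^2}{\lambda - \mu_i} = \frac{\prod_{i=1}^{n} (\lambda - \lambda_i)}{\prod_{i=1}^{n-1} (\lambda - \mu_i)}. \tag{5.7.5}$$

To rewrite this in the same notation as (5.7.4), we note that for a star on $m + 1 = n$ vertices, with the centre labelled 1, $\mathcal{N} = \{2, 3, \ldots, m + 1\}$, $a_1 = a_{11}$, $\hat{b}_i = a_{1,i+1}$, so that

$$\Delta(\mathbf{A}) = (\lambda - a_{11}) \Delta(\mathbf{A}(\mathcal{G}')) - \sum_{j \in \mathcal{N}} a_{1j}^2 \prod_{k \in \mathcal{N} \setminus j} \Delta(\mathbf{A}(\mathcal{G}'_k)). \tag{5.7.6}$$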
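Note that we have assigned the $m = n - 1$ $\mu$'s to the $m$ connected components of $\mathcal{G}'$ so that $\mu_i$ is assigned to $\mathcal{G}'_{i+1}$, $i = 1, 2, \ldots, m$. Note also that although the first terms on the right of (5.7.4), (5.7.6) are identical, the second terms are different. For the star, the vertex set of $\mathcal{G}''_j$ is empty. Parter (1960) [265] obtained a general result which embraces the particular cases (5.7.4) and (5.7.6):

Lemma 5.7.1

$$\Delta(\mathbf{A}) = (\lambda - a_{11}) \Delta(\mathbf{A}(\mathcal{G}')) - \sum_{j \in \mathcal{N}} a_{1j}^2 \Delta(\mathbf{A}(\mathcal{G}''_j)) \prod_{k \in \mathcal{N} \setminus j} \Delta(\mathbf{A}(\mathcal{G}'_k)),$$

with the convention that $\Delta(\mathbf{A}(\mathcal{G}''_j)) = 1$ if, as for the star, the vertex set of $\mathcal{G}''_j$ is empty.

Lemma 5.7.1, like the corresponding result (5.7.5) for the star, is effectively a partial fraction expansion. In the general case, it is

$$\frac{\Delta(\mathbf{A})}{\Delta(\mathbf{A}(\mathcal{G}'))} = \lambda - a_{11} - \sum_{j \in \mathcal{N}} a_{1j}^2 \frac{\Delta(\mathbf{A}(\mathcal{G}''_j))}{\Delta(\mathbf{A}(\mathcal{G}'_j))}, \tag{5.7.7}$$

where we have used the fact that $\mathcal{G}'$ has $m$ separate connected components $\mathcal{G}'_j$, so that

$$\Delta(\mathbf{A}(\mathcal{G}')) = \prod_{j \in \mathcal{N}} \Delta(\mathbf{A}(\mathcal{G}'_j)).$$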
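Equation (5.7.7) provides the basis for an inductive argument: deleting vertex 1 of $\mathcal{G}$ splits $\mathcal{G}'$ into $m$ components $\mathcal{G}'_j$, and $\mathcal{G}''_j$ bears the same relation to $\mathcal{G}'_j$ as $\mathcal{G}'$ does to $\mathcal{G}$. This means that if we can effect the reconstruction of $\mathbf{A}$ on the components $\mathcal{G}'_j$ of $\mathcal{G}'$ from data referring to $\mathcal{G}''_j$ and $\mathcal{G}'_j$, then we can reconstruct the whole of $\mathbf{A}$. Now since $\mathcal{G}'_j$ is itself a tree, and $\mathcal{G}''_j$ is obtained by deleting vertex $j$ from $\mathcal{G}'_j$, the roots of $\Delta(\mathbf{A}(\mathcal{G}''_j))$ should interlace the roots of $\Delta(\mathbf{A}(\mathcal{G}'_j))$, just as the $\mu_i$ interlace the $\lambda_i$, i.e., (5.7.2). But equation (5.7.7) gives $\Delta(\mathbf{A}(\mathcal{G}''_j))$ as a result of the partial fraction expansion. We are given

$$\Delta(\mathbf{A}) = f(\lambda) = \prod_{i=1}^{n} (\lambda - \lambda_i), \tag{5.7.8}$$

$$\Delta(\mathbf{A}(\mathcal{G}')) = g(\lambda) = \prod_{i=1}^{n-1} (\lambda - \mu_i). \tag{5.7.9}$$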
Now we must assign the $n - 1$ $\mu$'s among the $m$ components $\mathcal{G}'_j$. Suppose $\mathcal{G}'_j$ has $v_j$ vertices; then we must split the indices $\{1, 2, \ldots, n - 1\}$ into $m$ sets, so that if $j \in \mathcal{N}$ then $\mathcal{G}'_j$ is assigned $v_j$ indices. This is equivalent to grouping the terms in $g(\lambda)$ into $m$ terms $g_j(\lambda)$, where $g_j(\lambda)$ has degree $v_j$:

$$g(\lambda) = \prod_{j \in \mathcal{N}} g_j(\lambda).$$

This means that we must check to see if, when $f(\lambda)/g(\lambda)$ is expanded into partial fractions, as

$$\frac{f(\lambda)}{g(\lambda)} = \lambda - a - \sum_{j \in \mathcal{N}} y_j \frac{h_j(\lambda)}{g_j(\lambda)}, \tag{5.7.10}$$

where $h_j(\lambda)$ is a monic polynomial with $\deg(h_j) < \deg(g_j)$, and if the $\lambda$'s and $\mu$'s interlace as in (5.7.2), then $y_j$ is positive, and the zeros of $h_j(\lambda)$ and $g_j(\lambda)$ interlace. To do this, it is best to change back into a form like Lemma 5.7.1 by multiplying throughout by $g(\lambda)$:

$$f(\lambda) = (\lambda - a) g(\lambda) - \sum_{j \in \mathcal{N}} y_j h_j(\lambda) u_j(\lambda),$$

where $u_j(\lambda) = g(\lambda)/g_j(\lambda)$. We note that

$$u_j(\lambda) = \prod_{s \in Q} (\lambda - \mu_s),$$

where $Q$ consists of $\{1, 2, \ldots, n - 1\}$ less those indices which are assigned to $g_j$. Choose $j \in \mathcal{N}$ and suppose that $\mu_r$, $\mu_{r+p}$ are two successive zeros of $g_j(\lambda)$; then, since $g(\mu_r) = 0 = g(\mu_{r+p})$, we have

$$f(\mu_r) = -y_j h_j(\mu_r) u_j(\mu_r), \qquad f(\mu_{r+p}) = -y_j h_j(\mu_{r+p}) u_j(\mu_{r+p}).$$

We need to show that $h_j(\lambda)$ has a zero between $\mu_r$ and $\mu_{r+p}$, i.e., that $h_j(\mu_r)$, $h_j(\mu_{r+p})$ have opposite signs. The terms $(\mu_r - \mu_s)$ and $(\mu_{r+p} - \mu_s)$ appearing in $u_j(\mu_r)$ and $u_j(\mu_{r+p})$ will have the same sign except for those $\mu_s$ lying between $\mu_{r+p}$ and $\mu_r$; there are $p - 1$ such $\mu$'s, with indices $r + p - 1, \ldots, r + 2, r + 1$. Thus

$p$ odd: $u_j(\mu_r)$, $u_j(\mu_{r+p})$ have the same sign;
$p$ even: $u_j(\mu_r)$, $u_j(\mu_{r+p})$ have opposite signs.

By assumption $f(\lambda)$ has just one zero between any two successive $\mu$'s; thus

$p$ odd: $f(\mu_r)$, $f(\mu_{r+p})$ have opposite signs;
$p$ even: $f(\mu_r)$, $f(\mu_{r+p})$ have the same sign.

Combining these results, we see that $h_j(\mu_r)$ and $h_j(\mu_{r+p})$ must have opposite signs.

Now we check that $y_j$ is positive. Suppose $v_j = q$, and the roots of $g_j(\lambda)$ and $h_j(\lambda)$ are $(\xi_i)_1^q$ and $(\eta_i)_1^{q-1}$ respectively, where

$$\xi_1 > \eta_1 > \xi_2 > \cdots > \eta_{q-1} > \xi_q.$$

Then

$$g_j(\lambda) = \prod_{i=1}^{q} (\lambda - \xi_i), \qquad h_j(\lambda) = \prod_{i=1}^{q-1} (\lambda - \eta_i), \qquad f(\lambda) = \prod_{i=1}^{n} (\lambda - \lambda_i),$$

and suppose $\xi_1 = \mu_r$, so that

$$f(\mu_r) = \prod_{i=1}^{n} (\mu_r - \lambda_i).$$

Now $\lambda_1, \lambda_2, \ldots, \lambda_r$ are all greater than $\mu_r$, so that the sign of $f(\mu_r)$ is $(-)^r$. All the $\eta_i$ are smaller than $\mu_r$ so that $h_j(\mu_r) > 0$. Finally

$$u_j(\mu_r) = \prod (\mu_r - \mu_i),$$

where the product is taken over those $\mu_i \in \{\mu_1, \mu_2, \ldots, \mu_{n-1}\} \setminus \{\xi_1, \xi_2, \ldots, \xi_q\}$. But for the sign we need to consider only those $\mu_i > \mu_r$; there are $r - 1$ of these, so that the sign of $u_j(\mu_r)$ is $(-)^{r-1}$. Thus $y_j > 0$.
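The first stage of the construction, the partial fraction expansion (5.7.10), can be sketched as follows. This is an illustration under stated assumptions rather than a definitive algorithm: the function name `stage_one`, the dictionary layout, and the particular assignment of the $\mu$'s to the two components are all hypothetical. It uses the relation $f(\mu_s) = -y_j h_j(\mu_s) u_j(\mu_s)$ at each zero $\mu_s$ of $g_j$, and recovers $y_j$ as the leading coefficient of the interpolating polynomial through the values $y_j h_j(\mu_s)$.

```python
from fractions import Fraction
from math import prod


def stage_one(lam, mu, groups):
    """Partial fraction data of (5.7.10).

    groups maps each neighbour j of vertex 1 to the mu values assigned to the
    component G'_j (listed in decreasing order). Returns a together with, for
    each j, the pair (y_j, [h_j evaluated at the zeros of g_j]).
    """
    a = Fraction(sum(lam) - sum(mu))
    result = {}
    for j, zeros in groups.items():
        others = [m for m in mu if m not in zeros]     # zeros of u_j
        # w_s = y_j * h_j(mu_s) = -f(mu_s) / u_j(mu_s)
        w = [-prod(Fraction(z - l) for l in lam)
             / prod(Fraction(z - o) for o in others)
             for z in zeros]
        # y_j is the leading coefficient of the polynomial of degree
        # len(zeros) - 1 interpolating the values w_s at the zeros of g_j.
        y = sum(ws / prod(Fraction(z - t) for t in zeros if t != z)
                for ws, z in zip(w, zeros))
        result[j] = (y, [ws / y for ws in w])
    return a, result


# Spectra from Exercise 5.7.3, in decreasing order; the grouping of the mu's
# between the components attached to vertices 2 and 6 is a hypothetical choice.
lam = [17, 15, 13, 11, 9, 7, 5, 3, 1]
mu = [16, 14, 12, 10, 8, 6, 4, 2]
groups = {2: [16, 12, 8, 4], 6: [14, 10, 6, 2]}
a, comps = stage_one(lam, mu, groups)
```

For each component the returned $y_j$ should be positive and the values of $h_j$ should alternate in sign at successive zeros of $g_j$, which is the interlacing established above; then $a_{11} = a$ and $a_{1j} = \sqrt{y_j}$.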
This yields the first stage in the construction of $\mathbf{A}$: take $f(\lambda)$, $g(\lambda)$ and form the partial fraction expansion (5.7.10); $a_{11} = a$, $a_{1j} = (y_j)^{1/2}$, while the zeros of $g_j(\lambda)$ and $h_j(\lambda)$ are the eigenvalues of the components of $\mathbf{A}_{,1}$. Figure 5.7.4 shows an example of a tree.

Figure 5.7.4 - A tree on 9 vertices.

The matrix $\mathbf{A}$ on this tree is a symmetric $9 \times 9$ matrix whose non-zero entries are the nine diagonal entries and the pairs $a_{ij} = a_{ji}$ corresponding to the eight edges of the tree.
In stage 1, $\mathbf{A}_{,1}$ has two components; we find $a_{11}$, $a_{12}$, $a_{16}$ and we find new data which will allow us to construct the star on vertices $\{2, 3, 4, 5\}$, and the starpath on vertices $\{6, 7, 8, 9\}$. To carry out the second stage we can, if we choose, relabel each of the connected components so that $2 \to 1$ and $6 \to 1$.

We have assumed that the data for constructing $\mathbf{A}$ is two strictly interlacing spectra. However, as with the path and the star, it is possible to use one spectrum $\sigma(\mathbf{A}) = (\lambda_i)_1^n$ and the first coefficients $u_{1i}$, $i = 1, 2, \ldots, n$ of the normalised eigenvectors of $\mathbf{A}$, instead. We recall the result proved for a general matrix $\mathbf{A} \in S_n$, namely that the eigenvalues of $\mathbf{A}_{,1}$ are the zeros of

$$\sum_{i=1}^{n} \frac{(u_{1i})^2}{\lambda_i - \lambda} = 0.$$
Further discussion of, and reference to, eigenvalue problems related to trees may be found in Nylen and Uhlig (1994) [252]. Further references to the vast literature on inverse eigenvalue problems may be found in Gladwell (1986a) [107], Gladwell (1996) [124], Nocedal and Overton (1983) [251], Friedland, Nocedal and Overton (1987) [95], Ikramov and Chugunov (2000) [184], Xu (1998) [339], Chu (1998) [58] and Chu and Golub (2001) [59].
Exercises 5.7
1. Show that if $\mathbf{A} \in S_n$ has eigenvalues $\lambda_i$, then $|a_{ij}| \leq \max_k |\lambda_k|$ for all $i, j = 1, 2, \ldots, n$.
2. Show that if $\mathbf{A} \in S_n$ is a matrix on $\mathcal{G}$, then the maximum number of (non-diagonal) zero entries in $\mathbf{A}$ is attained when $\mathcal{G}$ is a tree, and is $n^2 - 3n + 2$.
3. Construct an algorithm to form $\mathbf{A}$ from $(\lambda_i)_1^n$, $(\mu_i)_1^{n-1}$, given the structure of $\mathcal{G}$. Use it to construct $\mathbf{A}$ on the graph $\mathcal{G}$ of Figure 5.7.4. Take $\{\lambda_i\}_1^9 = \{1, 3, 5, 7, 9, 11, 13, 15, 17\}$, $\{\mu_i\}_1^8 = \{2, 4, 6, 8, 10, 12, 14, 16\}$. As a check, find the eigenvalues of $\mathbf{A}$ and $\mathbf{A}_1$.
Chapter 6
Positivity

There are then two kinds of intellect: the one able to penetrate acutely and deeply into the conclusions of given premises, and this is the precise intellect; the other able to comprehend a great number of premises without confusing them, and this is the mathematical intellect.
Pascal's Pensées, 2
6.1
Introduction
The basic eigenvalue analysis of real symmetric matrices was discussed in Chapter 1. The eigenvalue properties described there are shared by all positive-definite (or semi-definite) matrices. This Chapter, which may be missed on a first reading, provides proofs of some of the results which were used in Chapter 1. Foremost among these are Theorem 6.3.1, that if $\mathbf{A} \in S_n$, then it has $n$ real eigenvectors which are orthonormal, and thus span $V_n$; and Theorem 6.3.7 that provides necessary and sufficient conditions for the matrix $\mathbf{A}$ to be positive-definite. Signs, positive or negative, provide the recurring theme for this Chapter, and hence our choice for the Chapter heading: positivity.

In Chapter 3 we focussed our attention on a narrower class, Jacobi matrices, and found that they had additional eigen-properties: they had distinct eigenvalues and, with increasing $i$, the eigenvector $\mathbf{u}_i$ became increasingly oscillatory, meaning that there was an increasing number of sign changes among the elements $u_{1i}, u_{2i}, \ldots, u_{ni}$. It will be shown in this Chapter that many of the eigen-properties of such matrices are shared by a wider class of so-called oscillatory matrices. Actually, there are twin classes of matrices, oscillatory and sign-oscillatory, as described in Section 6.5. If $\mathbf{A}$ is oscillatory, and $\mathbf{Z} = \mathrm{diag}(1, -1, 1, \ldots, (-)^{n-1})$, then $\tilde{\mathbf{A}} = \mathbf{Z}\mathbf{A}\mathbf{Z}$ is sign-oscillatory, and vice versa. The Jacobi matrix $\mathbf{J}$ of equation (3.1.4) is actually sign-oscillatory. These matrices were introduced and extensively studied by Gantmacher and Krein (1950) [98], see also Gantmacher (1959) [97]. The matrices appearing in lumped-mass or finite element models of strings, rods and beams are all oscillatory or sign-oscillatory; this Chapter serves as reference material for the study of oscillatory matrices.

The theorem upon which the whole of the analysis of oscillatory matrices depends, is Perron's theorem (Theorem 6.5.1). This relates to a strictly positive matrix, one that has all its elements strictly positive, and states that such a matrix has one eigenvalue, the greatest in magnitude, that is real and positive; the corresponding eigenvector has all its coefficients strictly positive. The matrices appearing in mechanics are usually not strictly positive; such matrices appear in Economics and Operational Analysis. Instead, the matrices are oscillatory. (See the precise definition in Section 6.6.1.) In order to apply Perron's theorem to such matrices, we need two essential steps. First, if $\mathbf{A}$ is oscillatory, then $\mathbf{B} = \mathbf{A}^{n-1}$ is totally positive (TP). This term, which is introduced in Section 6.6.1, means that not only all the elements of $\mathbf{B}$ are strictly positive, but also all the minors (Section 6.2) of $\mathbf{B}$. Note that the eigenvalues of $\mathbf{B}$ are the $(n - 1)$th powers, $\lambda_i^{n-1}$, of the eigenvalues of $\mathbf{A}$, while its eigenvectors are the eigenvectors of $\mathbf{A}$. The other step that is needed is the introduction of the concept of a compound matrix (Section 6.2). The compound matrix $\mathcal{A}_p$ is formed from all the $N = \binom{n}{p}$ $p$th-order minors of $\mathbf{A}$. The important Binet-Cauchy Theorem, Theorem 6.2.3, shows (Ex. 6.3.1) that the eigenvalues of $\mathcal{A}_p$ are simply products of $p$ eigenvalues of $\mathbf{A}$. The argument then runs as follows. Suppose $\mathbf{A}$ is oscillatory, then $\mathbf{B} = \mathbf{A}^{n-1}$ is TP, and hence, for $p = 1, 2, \ldots, n$, $\mathcal{B}_p$ is strictly positive (not TP). The first conclusion (Theorem 6.10.1) is that the eigenvalues of $\mathbf{A}$ are positive and distinct, like those of $\mathbf{J}$ or $\tilde{\mathbf{J}}$.

Before beginning the analysis proper, we point out a notational matter which must be understood if confusion is to be avoided. In Chapter 3, in dealing with a Jacobi matrix $\mathbf{J}$, a positive semi-definite tridiagonal matrix with negative codiagonal, the eigenvalues were labelled in increasing order, i.e., $0 \leq \lambda_1 < \lambda_2 < \cdots < \lambda_n$. The eigenvectors then became increasingly oscillatory, as described in Theorem 3.3.1. In Ex. 3.3.2, it was pointed out that if the eigenvalues of $\tilde{\mathbf{J}} = \mathbf{Z}\mathbf{J}\mathbf{Z}$, a positive semi-definite tridiagonal matrix with positive codiagonal (an oscillatory matrix if it is actually non-singular, i.e., positive-definite) are labelled in decreasing order, i.e., $\lambda_1 > \lambda_2 > \cdots > \lambda_n \geq 0$, then the eigenvectors still satisfy Theorem 3.3.1. In this Chapter, in dealing with oscillatory matrices, we shall keep the same ordering, i.e., $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$. Theorem 6.10.2 is a generalisation of Theorem 3.3.1.
6.2
Minors
Suppose $\mathbf{A} \in M_n$. To gain some insight into the structure of $\mathbf{A}$, and into the relative sizes of its elements, we introduce the concept of a minor. A minor of order $p$ of the matrix $\mathbf{A}$ is the determinant constructed from the elements of $\mathbf{A}$ in $p$ different rows and $p$ different columns. Thus, the elements of $\mathbf{A}$ themselves are minors of order 1, while $\det(\mathbf{A})$ is the only minor of order $n$; $a_{13}$, $\begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}$ and $\det(\mathbf{A})$ are all minors of $\mathbf{A}$.

Following Ando (1987) [4] we let $Q_{p,n}$, with $1 \leq p \leq n$, denote the set of strictly increasing sequences $\alpha$ of $p$ integers $\alpha_1, \alpha_2, \ldots, \alpha_p$ taken from $\omega = \{1, 2, \ldots, n\}$. The complement $\alpha'$ of $\alpha$ is the increasingly arranged sequence $\{1, 2, \ldots, n\} \setminus \alpha = \omega \setminus \alpha$, so that $\alpha' \in Q_{n-p,n}$. When $\alpha \in Q_{p,n}$, $\beta \in Q_{q,n}$ and $\alpha \cap \beta = \emptyset$, their union, $\alpha \cup \beta$, should always be rearranged increasingly to become an element of $Q_{r,n}$ ($r = p + q$). We will often use two special sequences: $\theta = \theta(p) = \{1, 2, \ldots, p\}$ and $\phi = \phi(p) = \{n - p + 1, \ldots, n\}$, and their complements $\theta' = \theta'(p) = \{p + 1, \ldots, n\}$, $\phi' = \phi'(p) = \{1, 2, \ldots, n - p\}$. When the argument is omitted in $\theta$ or $\phi$, it will be understood to be $p$.

The submatrix formed from rows $\alpha$ and columns $\beta$ of $\mathbf{A}$ is denoted by $\mathbf{A}[\alpha|\beta]$; $\mathbf{A}[\alpha|\alpha]$ is written $\mathbf{A}[\alpha]$. The minor of $\mathbf{A}$ taken from rows $\alpha$ and columns $\beta$ is denoted by $A(\alpha; \beta)$; thus

$$A(\alpha; \beta) = \begin{vmatrix} a_{\alpha_1,\beta_1} & a_{\alpha_1,\beta_2} & \cdots & a_{\alpha_1,\beta_p} \\ a_{\alpha_2,\beta_1} & a_{\alpha_2,\beta_2} & \cdots & a_{\alpha_2,\beta_p} \\ \cdot & \cdot & \cdots & \cdot \\ a_{\alpha_p,\beta_1} & a_{\alpha_p,\beta_2} & \cdots & a_{\alpha_p,\beta_p} \end{vmatrix}. \tag{6.2.1}$$

The minor $A(\alpha; \alpha)$ is abbreviated to $A(\alpha)$. The cofactor of $a_{ij}$, introduced in Section 1.3, is a minor with a sign attached to it:

$$A_{ij} = (-)^{i+j} \hat{a}_{ij}, \tag{6.2.2}$$

where

$$\hat{a}_{ij} = A(i'; j'), \tag{6.2.3}$$
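and $i' = \{1, 2, \ldots, i - 1, i + 1, \ldots, n\} = \omega \setminus i$, $j' = \{1, 2, \ldots, j - 1, j + 1, \ldots, n\} = \omega \setminus j$; $\hat{a}_{ij}$ is sometimes called the minor of $a_{ij}$.

If $\mathbf{A} \in M_n$, then we can form a new matrix $\hat{\mathbf{A}} = (\hat{a}_{ij})$ from the minors of elements of $\mathbf{A}$. We may prove

Theorem 6.2.1 Let $\hat{\mathbf{A}} = (\hat{a}_{ij})$, then the minors of $\hat{\mathbf{A}}$ are given by

$$\hat{A}(\alpha; \beta) = (\det(\mathbf{A}))^{p-1} A(\alpha'; \beta').$$

Proof. Consider the theorem for $\alpha = \beta = \theta$; the general case may be obtained by a suitable rearrangement of the rows and columns. Since $\hat{a}_{ij} = (-)^{i+j} A_{ij}$, we may write

$$B = \hat{A}(\theta; \theta) = \begin{vmatrix} A_{11} & A_{12} & \cdots & A_{1p} \\ A_{21} & A_{22} & \cdots & A_{2p} \\ \cdot & \cdot & \cdots & \cdot \\ A_{p1} & A_{p2} & \cdots & A_{pp} \end{vmatrix}. \tag{6.2.4}$$

Multiplying this by $\det(\mathbf{A}) = \det(\mathbf{A}^T)$, and writing the determinant in (6.2.4) as that of an $n \times n$ matrix, we find

$$B \cdot \det(\mathbf{A}) = \det\left( \begin{bmatrix} A_{11} & \cdots & A_{1p} & A_{1,p+1} & \cdots & A_{1n} \\ \cdot & \cdots & \cdot & \cdot & \cdots & \cdot \\ A_{p1} & \cdots & A_{pp} & A_{p,p+1} & \cdots & A_{pn} \\ 0 & \cdots & 0 & 1 & \cdots & 0 \\ \cdot & \cdots & \cdot & \cdot & \cdots & \cdot \\ 0 & \cdots & 0 & 0 & \cdots & 1 \end{bmatrix} \cdot \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \cdot & \cdot & \cdots & \cdot \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix} \right),$$

so that on using equation (1.3.10) we obtain

$$B \cdot \det(\mathbf{A}) = \begin{vmatrix} \det(\mathbf{A}) & \cdots & 0 & 0 & \cdots & 0 \\ \cdot & \cdots & \cdot & \cdot & \cdots & \cdot \\ 0 & \cdots & \det(\mathbf{A}) & 0 & \cdots & 0 \\ a_{1,p+1} & \cdots & a_{p,p+1} & a_{p+1,p+1} & \cdots & a_{n,p+1} \\ \cdot & \cdots & \cdot & \cdot & \cdots & \cdot \\ a_{1n} & \cdots & a_{pn} & a_{p+1,n} & \cdots & a_{nn} \end{vmatrix} = (\det(\mathbf{A}))^p A(\theta'; \theta'),$$

so that the theorem holds when $\det(\mathbf{A}) \neq 0$. Continuity considerations show that the theorem also holds when $\det(\mathbf{A}) = 0$.

One of the implications of this theorem is that when $\det(\mathbf{A}) = 0$, the rank of $\hat{\mathbf{A}}$ is at most 1, meaning that all the rows of $\hat{\mathbf{A}}$ are multiples of each other, as are all the columns. There is another corollary

Corollary 6.2.1

$$\det(\hat{\mathbf{A}}) = (\det(\mathbf{A}))^{n-1}.$$

There is another way to form a matrix from minors of a given matrix. Suppose $\mathbf{A} \in M_n$ and $1 \leq p < n$, and put $\theta = \theta(p) := \{1, 2, \ldots, p\}$. We can define $b_{ij}$ by

$$b_{ij} = A(\theta \cup i; \theta \cup j), \qquad i, j = p + 1, p + 2, \ldots, n.$$

The matrix $\mathbf{B} \in M_{n-p}$. Thus, if $p = 2$ and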
1 9 0 A =9 7 2 1
6 2 3 4 · ¸ 1 1 2 : : > then B = 5 1 = 1 4 1 8 2 2 0 3 2
The matrix $\mathbf{B}$ is called a bordered matrix; the indices $i$, $j$ border $\theta$. Sylvester's Theorem on bordered determinants is

Theorem 6.2.2 Suppose $\mathbf{A} \in M_n$, $1 \leq p < n$, $b_{ij} = A(\theta \cup i; \theta \cup j)$ for $i, j = p + 1, p + 2, \ldots, n$, and $\mathbf{B} = (b_{ij})$, then

$$\det(\mathbf{B}) = B(\theta'; \theta') = (A(\theta; \theta))^{n-p-1} \det(\mathbf{A}).$$
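Sylvester's identity is easy to check on a small integer matrix. The sketch below is illustrative only: the helper names `det`, `minor` and `bordered_matrix` and the sample matrix are assumptions, not taken from the text. It forms the bordered matrix $\mathbf{B}$ with $p = 2$ and compares $\det(\mathbf{B})$ with $(A(\theta;\theta))^{n-p-1}\det(\mathbf{A})$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def minor(a, rows, cols):
    """Minor A(rows; cols) with 1-based, increasing index sequences."""
    return det([[a[i - 1][j - 1] for j in cols] for i in rows])


def bordered_matrix(a, p):
    """B = (b_ij), b_ij = A(theta U i; theta U j), i, j = p+1, ..., n."""
    n = len(a)
    theta = list(range(1, p + 1))
    return [[minor(a, theta + [i], theta + [j]) for j in range(p + 1, n + 1)]
            for i in range(p + 1, n + 1)]


# An arbitrary illustrative 4 x 4 integer matrix.
A = [[1, 2, 3, 4],
     [0, 1, 1, 2],
     [2, 1, 4, 1],
     [1, 2, 2, 0]]
p = 2
B = bordered_matrix(A, p)
lhs = det(B)
rhs = minor(A, [1, 2], [1, 2]) ** (len(A) - p - 1) * det(A)
```

Because all arithmetic is on integers, the comparison `lhs == rhs` is exact.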
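Proof. Theorem 6.2.1, with $p$ replaced by $n - p - 1$, and $\alpha = \{p + 1, \ldots, r - 1, r + 1, \ldots, n\} = \theta' \setminus r$, $\beta = \{p + 1, \ldots, s - 1, s + 1, \ldots, n\} = \theta' \setminus s$, shows that

$$c_{rs} = \hat{A}(\alpha; \beta) = (\det(\mathbf{A}))^{n-p-2} A(\theta \cup r; \theta \cup s) = (\det(\mathbf{A}))^{n-p-2} b_{rs}. \tag{6.2.5}$$

The Corollary of Theorem 6.2.1 shows that if $\mathbf{C} = (c_{rs})_{p+1}^n$, then

$$\det(\mathbf{C}) = (\hat{A}(\theta'; \theta'))^{n-p-1}. \tag{6.2.6}$$

But according to (6.2.5),

$$\det(\mathbf{C}) = B(\theta'; \theta') (\det(\mathbf{A}))^{(n-p-2)(n-p)} = \det(\mathbf{B}) (\det(\mathbf{A}))^{(n-p-2)(n-p)}, \tag{6.2.7}$$

and from Theorem 6.2.1

$$\hat{A}(\theta'; \theta') = (\det(\mathbf{A}))^{n-p-1} A(\theta; \theta), \tag{6.2.8}$$

so that on substituting (6.2.8) into (6.2.6) we find

$$\det(\mathbf{C}) = (\det(\mathbf{A}))^{(n-p-1)^2} (A(\theta; \theta))^{n-p-1},$$

which, on comparison with (6.2.7), yields the required result when $\det(\mathbf{A}) \neq 0$. Continuity considerations show that the theorem still holds when $\det(\mathbf{A}) = 0$.

Corollary 6.2.2 If $\alpha, \beta \in Q_{s,n}$, $\alpha \cap \theta = \emptyset$, $\beta \cap \theta = \emptyset$, then

$$B(\alpha; \beta) = (A(\theta; \theta))^{s-1} A(\theta \cup \alpha; \theta \cup \beta).$$

Corollary 6.2.3 Suppose $\alpha, \beta \in Q_{p,n}$ and $b_{ij} = A(\alpha \cup i; \beta \cup j)$ for $i = \alpha_p + 1, \ldots, n$; $j = \beta_p + 1, \ldots, n$ and $\gamma, \delta \in Q_{q,n}$ with $\gamma_1 > \alpha_p$, $\delta_1 > \beta_p$, then

$$B(\gamma; \delta) = (A(\alpha; \beta))^{q-1} A(\alpha \cup \gamma; \beta \cup \delta).$$

This is the general form of Sylvester's Theorem. For a proof, see Gantmacher (1959) [97], Vol. I, p. 32.

We now introduce the powerful Binet-Cauchy Theorem.

Theorem 6.2.3 If $\mathbf{A} \in M_{m,k}$, $\mathbf{B} \in M_{k,n}$ and $\mathbf{C} = \mathbf{A}\mathbf{B}$, $\alpha \in Q_{p,m}$, $\beta \in Q_{p,n}$ then

$$C(\alpha; \beta) = \sum_{\gamma} A(\alpha; \gamma) B(\gamma; \beta), \tag{6.2.9}$$

where the sum extends to all $\gamma \in Q_{p,k}$.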
The theorem is a generalisation of the formula for $c_{ij}$, namely

$$c_{ij} = \sum_{p=1}^{k} a_{ip} b_{pj}.$$

The proof may be found in Gantmacher (1959) [97], Vol. I, p. 9. The importance of the Binet-Cauchy Theorem lies in its application to compound matrices, which we now define. Suppose first that $\mathbf{A}$ is square, i.e., $\mathbf{A} \in M_n$. We shall define the compound matrix $\mathcal{A}_p$. Consider all the sequences $\alpha \in Q_{p,n}$; there are

$$N = \binom{n}{p} = \frac{n!}{p!(n - p)!}$$

such sequences. For given $n$, $p$, the $N$ sequences may be arranged in ascending order $1, 2, \ldots, N$. This may be done by associating with the sequence $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_p\}$ the number with digits $\alpha_1, \alpha_2, \ldots, \alpha_p$ in the base of $d = N + 1$. This procedure associates a specific index $s = s(\alpha)$ with each sequence $\alpha$; $s$ lies in the range $1 \leq s \leq N$. Thus, when $n = 5$, $p = 3$, we have $N = 10$, and the combinations are 123, 124, 125, 134, 135, 145, 234, 235, 245, 345. Thus $s(124) = 2$, while $s(245) = 9$. The element $(\mathcal{A}_p)_{st}$ of $\mathcal{A}_p$ is then given by

$$(\mathcal{A}_p)_{st} = A(\alpha; \beta),$$

where $s = s(\alpha)$, $t = s(\beta)$.

Although we shall not need it in this book, a compound matrix can be defined for a rectangular matrix $\mathbf{A} \in M_{m,n}$. Now

$$\mathcal{A}_p \in M_{M,N}, \qquad M = \binom{m}{p}, \qquad N = \binom{n}{p},$$

and $(\mathcal{A}_p)_{st}$ is given by (6.2.9) for $\alpha \in Q_{p,m}$, $\beta \in Q_{p,n}$. The Binet-Cauchy Theorem now states

Theorem 6.2.4 If $\mathbf{A} \in M_{m,k}$, $\mathbf{B} \in M_{k,n}$, $\mathbf{C} = \mathbf{A}\mathbf{B}$ and $p \leq \min(m, k, n)$, then

$$\mathcal{C}_p = \mathcal{A}_p \mathcal{B}_p.$$

Proof. The equation (6.2.9) may be written

$$(\mathcal{C}_p)_{rs} = \sum_{t=1}^{\binom{k}{p}} (\mathcal{A}_p)_{rt} (\mathcal{B}_p)_{ts},$$

where $r = s(\alpha)$, $s = s(\beta)$, $t = s(\gamma)$.

Corollary 6.2.4 If $\mathbf{A} \in M_n$ is non-singular then the $p$th order compound matrix of $\mathbf{A}^{-1}$ is the inverse of the $p$th order compound matrix of $\mathbf{A}$.

Proof. Let $\mathbf{B} = \mathbf{A}^{-1}$, then $\mathbf{A}\mathbf{B} = \mathbf{I}$ implies $\mathcal{A}_p \mathcal{B}_p = \mathcal{I}_p$ so that $\mathcal{B}_p = (\mathcal{A}_p)^{-1}$.
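The definition of the compound matrix and the statement of Theorem 6.2.4 can be illustrated directly. The following sketch relies on the fact that `itertools.combinations` generates index sequences in the same lexicographic order as the list 123, 124, ..., 345 above; the function names `compound`, `matmul` and `rank_of` and the two sample matrices are assumptions made for this example.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def compound(a, p):
    """p-th compound matrix: all p-th order minors, sequences in lexicographic order."""
    row_sets = list(combinations(range(len(a)), p))
    col_sets = list(combinations(range(len(a[0])), p))
    return [[det([[a[i][j] for j in cols] for i in rows]) for cols in col_sets]
            for rows in row_sets]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank_of(alpha, n):
    """1-based index s(alpha) of an increasing sequence taken from {1, ..., n}."""
    return list(combinations(range(1, n + 1), len(alpha))).index(tuple(alpha)) + 1


# Illustrative rectangular matrices: A is 3 x 4, B is 4 x 3.
A = [[1, 2, 0, 1],
     [0, 1, 3, 2],
     [2, 0, 1, 1]]
B = [[1, 0, 2],
     [2, 1, 0],
     [0, 1, 1],
     [1, 1, 1]]
C = matmul(A, B)
binet_cauchy_holds = compound(C, 2) == matmul(compound(A, 2), compound(B, 2))
```

Here `compound(A, 2)` is a $3 \times 6$ matrix and `compound(B, 2)` is $6 \times 3$, matching the dimensions $M = \binom{m}{p}$, $N = \binom{n}{p}$ given above.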
Exercises 6.2
1. If $\mathbf{A} \in M_n$ is non-singular, then equation (1.3.20) shows that its inverse $\mathbf{R} = \mathbf{A}^{-1}$ has elements

$$r_{ij} = A_{ji}/\det(\mathbf{A}) = (-)^{i+j} \hat{a}_{ji}/\det(\mathbf{A}).$$

Use Theorem 6.2.1 to show that if $\alpha, \beta \in Q_{p,n}$ then

$$\det(\mathbf{A}) R(\alpha; \beta) = (-)^t A(\beta'; \alpha'), \qquad t = \sum_{m=1}^{p} (\alpha_m + \beta_m).$$

2. If $\theta = \{1, 2, \ldots, p\}$ and $\phi = \{n - p + 1, \ldots, n\}$ then $A(\theta; \phi)$ and $A(\phi; \theta)$ are called the $p$th order corner minors of $\mathbf{A}$. Use Ex. 6.2.1 to show that the corner minors of $\mathbf{R}$ are given by

$$\det(\mathbf{A}) \cdot R(\theta; \phi) = (-)^t A(\phi'; \theta'),$$

where $t = p(n + 1)$. Note that $A(\phi'; \theta')$ is an $(n - p)$th order corner minor of $\mathbf{A}$.

3. Equations (1.3.7), (1.3.8) are a particular case of Laplace's expansion of a determinant,

$$\det(\mathbf{A}) = \sum_{\beta} (-)^t A(\alpha; \beta) A(\alpha'; \beta'),$$

where $\alpha \in Q_{p,n}$ is fixed, the sum is taken over all $\beta \in Q_{p,n}$ and $t = \sum_{m=1}^{p} (\alpha_m + \beta_m)$. Establish this result and show that there is a similar expansion with $\beta$ fixed and $\alpha$ varying over $Q_{p,n}$.

4. Suppose $\mathbf{A} \in M_n$. Use the Binet-Cauchy theorem to show that the $p$th compound matrix of $\mathbf{A}^m$ is $(\mathcal{A}_p)^m$.

5. Use the Binet-Cauchy theorem to show that if $\mathbf{Q}$ is an orthogonal matrix, then so is $\mathcal{Q}_p$, the $p$th compound matrix of $\mathbf{Q}$.

6. If $\mathbf{B} = \mathbf{A}^m$, write the minors of $\mathbf{B}$ in terms of the minors of $\mathbf{A}$; use the notation (6.2.9) to show that

$$B(\alpha; \beta) = \sum A(\alpha; \gamma^{(1)}) A(\gamma^{(1)}; \gamma^{(2)}) \cdots A(\gamma^{(m-1)}; \beta),$$

where the sum is over all $\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(m-1)} \in Q_{p,n}$.
6.3
A general representation of a symmetric matrix
We begin with two theorems.

Theorem 6.3.1 If $\mathbf{A} \in S_n$, then $\mathbf{A}$ has $n$ real eigenvectors forming an orthonormal system.

Theorem 6.3.2 To each $m$-fold eigenvalue $\lambda_0$ of $\mathbf{A} \in S_n$, there correspond $m$ linearly independent eigenvectors.

In Section 1.4 we showed that the eigenvectors corresponding to distinct eigenvalues are orthogonal. This means that if all the eigenvalues of $\mathbf{A}$ are distinct, then it has $n$ orthogonal eigenvectors which may be scaled so that they are orthonormal. It suffices to prove Theorem 6.3.2.

Proof. Suppose that $\lambda_0$ is an $m$-fold eigenvalue of $\mathbf{A}$, i.e., $\Delta(\lambda) = \det(\mathbf{A} - \lambda\mathbf{I})$ has $\lambda_0$ as an $m$-fold root, and that $\mathbf{B} = \mathbf{A} - \lambda_0\mathbf{I}$ has rank $p$, so that the equation

$$\mathbf{B}\mathbf{u} \equiv (\mathbf{A} - \lambda_0\mathbf{I})\mathbf{u} = \mathbf{0} \tag{6.3.1}$$

has $r = n - p$ linearly independent solutions. We need to prove that $r = m$. Now

$$\Delta(\lambda) = \det(\mathbf{A} - \lambda\mathbf{I}) = \det(\mathbf{B} - (\lambda - \lambda_0)\mathbf{I}) = \sum_{i=0}^{n} (-)^i T_{n-i} (\lambda - \lambda_0)^i,$$

where $T_j$ is the sum of the $j$th-order principal minors of $\mathbf{B}$, and $T_0 = 1$. But $\mathbf{B}$ has rank $p$, so that $T_n = 0 = T_{n-1} = \cdots = T_{p+1}$ and therefore

$$\Delta(\lambda) = \pm(\lambda - \lambda_0)^r \{T_p - (\lambda - \lambda_0) T_{p-1} + \cdots \pm (\lambda - \lambda_0)^p T_0\},$$

so that $m \geq r$. It is sufficient to prove that $T_p \neq 0$, for then $\Delta(\lambda)$ will have an $r$-fold root, i.e., $m = r$. Without loss of generality we may assume that the first $p$ rows of $\mathbf{B}$ are linearly independent, so that any row of $\mathbf{B}$ may be expressed as a linear combination of the first $p$ rows, i.e.,

$$b_{ij} = \sum_{k=1}^{p} c_{ik} b_{kj}, \qquad i, j = 1, 2, \ldots, n,$$

which may be written

$$\mathbf{B} = \mathbf{C}\mathbf{B}_0, \tag{6.3.2}$$

where $\mathbf{C} \in M_{n,p}$, and $\mathbf{B}_0 \in M_{p,n}$ is formed from the first $p$ rows of $\mathbf{B}$. Now apply the Binet-Cauchy Theorem 6.2.3 to (6.3.2):

$$B(\alpha; \beta) = \sum_{\gamma} C(\alpha; \gamma) B_0(\gamma; \beta). \tag{6.3.3}$$

But $\mathbf{C}$ has only $p$ columns, and similarly $\mathbf{B}_0$ has only $p$ rows, and they are the rows $1, 2, \ldots, p$ of $\mathbf{B}$. Thus, there is only one term in the expansion (6.3.3):

$$B(\alpha; \beta) = C(\alpha; \theta) B(\theta; \beta), \tag{6.3.4}$$

where $\theta = \{1, 2, \ldots, p\}$. Similarly

$$B(\beta; \theta) = C(\beta; \theta) B(\theta; \theta). \tag{6.3.5}$$

But $\mathbf{B}$ is symmetric, so that $B(\theta; \beta) = B(\beta; \theta)$, and thus, on combining (6.3.4), (6.3.5), we have

$$B(\alpha; \beta) = C(\alpha; \theta) C(\beta; \theta) B(\theta; \theta). \tag{6.3.6}$$

All the minors on the left cannot vanish, since then $\mathbf{B}$ would have rank less than $p$; we must have $B(\theta; \theta) \neq 0$. But then (6.3.6) with $\beta = \alpha$ gives

$$B(\alpha; \alpha) = (C(\alpha; \theta))^2 B(\theta; \theta).$$

This means that all the $p$th order principal minors of $\mathbf{B}$ have the same sign, and one at least, $B(\theta; \theta)$, is non-zero. Thus $T_p$, their sum, must be non-zero. Hence $m = r$.

We may now assert that if $\mathbf{A} \in S_n$, then it has $n$ eigenvalues $(\lambda_i)_1^n$ and $n$ orthonormal eigenvectors $(\mathbf{u}_i)_1^n$. This means that

$$\mathbf{A}\mathbf{u}_i = \lambda_i \mathbf{u}_i, \qquad i = 1, 2, \ldots, n,$$

which may be combined to yield

$$\mathbf{A}\mathbf{U} = \mathbf{U}\boldsymbol{\Lambda}, \qquad \mathbf{U}\mathbf{U}^T = \mathbf{U}^T\mathbf{U} = \mathbf{I}, \tag{6.3.7}$$

and this may be transformed to

$$\mathbf{A} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T. \tag{6.3.8}$$

This is a most important representation of a symmetric matrix.
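The expansion of $\det(\mathbf{B} - x\mathbf{I})$ in sums of principal minors, used in the proof of Theorem 6.3.2, can be checked on a small example. In the sketch below (helper names and the sample matrix are assumptions) $\mathbf{A}$ has the double eigenvalue $\lambda_0 = 1$, so $\mathbf{B} = \mathbf{A} - \lambda_0\mathbf{I}$ has rank $p = 1$ and the sums $T_2$, $T_3$ vanish while $T_1$ does not.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def principal_minor_sums(b):
    """[T_0, T_1, ..., T_n]: T_j is the sum of the j-th order principal minors of b."""
    n = len(b)
    return [sum(det([[b[i][j] for j in idx] for i in idx])
                for idx in combinations(range(n), k))
            for k in range(n + 1)]


def char_poly_value(b, x):
    """det(b - x I), evaluated directly."""
    n = len(b)
    return det([[b[i][j] - (x if i == j else 0) for j in range(n)] for i in range(n)])


# Illustrative symmetric matrix with eigenvalues 4, 1, 1 (lambda_0 = 1 is double).
A = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]
lambda_0 = 1
B = [[A[i][j] - (lambda_0 if i == j else 0) for j in range(3)] for i in range(3)]
T = principal_minor_sums(B)
n = len(B)
expansion_ok = all(
    char_poly_value(B, x) == sum((-x) ** i * T[n - i] for i in range(n + 1))
    for x in range(-3, 4)
)
```

With $T = [1, 3, 0, 0]$ the polynomial has the factor $(\lambda - \lambda_0)^2$, so the multiplicity $m = 2$ equals $r = n - p = 2$, as the theorem asserts.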
Exercises 6.3
1. Apply the Binet-Cauchy Theorem, in the form of Theorem 6.2.4, to equation (6.3.7) to show that the eigenvalues of $\mathcal{A}_p$ are all the products $\lambda_{i_1} \lambda_{i_2} \cdots \lambda_{i_p}$.
2. Show that the eigenvectors of $\mathcal{A}_p$ are the columns of the compound matrix $\mathcal{U}_p$.
6.4
Quadratic forms
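Suppose $\mathbf{A} \in S_n$, then

$$A(\mathbf{x}, \mathbf{x}) \equiv \mathbf{x}^T \mathbf{A} \mathbf{x} = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + \cdots + 2 a_{n,n-1} x_{n-1} x_n + a_{nn} x_n^2 \tag{6.4.1}$$

is called the quadratic form associated with $\mathbf{A}$. One of our aims in this section is to find necessary and sufficient conditions for $\mathbf{A}$ to be positive definite (PD), i.e., $A(\mathbf{x}, \mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$.

First, we consider a number of different ways of expressing $A(\mathbf{x}, \mathbf{x})$. Let

$$A_i(\mathbf{x}) = \sum_{j=1}^{n} a_{ij} x_j, \qquad i = 1, 2, \ldots, n, \tag{6.4.2}$$

then

$$A(\mathbf{x}, \mathbf{x}) = \sum_{j=1}^{n} x_j A_j(\mathbf{x}). \tag{6.4.3}$$

This yields

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} & A_1(\mathbf{x}) \\ a_{21} & a_{22} & \cdots & a_{2n} & A_2(\mathbf{x}) \\ \cdot & \cdot & \cdots & \cdot & \cdot \\ a_{n1} & a_{n2} & \cdots & a_{nn} & A_n(\mathbf{x}) \\ A_1(\mathbf{x}) & A_2(\mathbf{x}) & \cdots & A_n(\mathbf{x}) & A(\mathbf{x}, \mathbf{x}) \end{vmatrix} = 0, \tag{6.4.4}$$

since the last column is a combination of the first $n$ columns, and

Theorem 6.4.1 If $\det(\mathbf{A}) \neq 0$, then

$$A(\mathbf{x}, \mathbf{x}) = -\frac{1}{\det(\mathbf{A})} \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} & A_1(\mathbf{x}) \\ a_{21} & a_{22} & \cdots & a_{2n} & A_2(\mathbf{x}) \\ \cdot & \cdot & \cdots & \cdot & \cdot \\ a_{n1} & a_{n2} & \cdots & a_{nn} & A_n(\mathbf{x}) \\ A_1(\mathbf{x}) & A_2(\mathbf{x}) & \cdots & A_n(\mathbf{x}) & 0 \end{vmatrix}. \tag{6.4.5}$$

Proof. Expand the zero determinant (6.4.4) along its last row.

Now we introduce the quantities

$$X_1(\mathbf{x}) = A_1(\mathbf{x}), \quad X_2(\mathbf{x}) = \begin{vmatrix} a_{11} & A_1(\mathbf{x}) \\ a_{21} & A_2(\mathbf{x}) \end{vmatrix}, \quad X_3(\mathbf{x}) = \begin{vmatrix} a_{11} & a_{12} & A_1(\mathbf{x}) \\ a_{21} & a_{22} & A_2(\mathbf{x}) \\ a_{31} & a_{32} & A_3(\mathbf{x}) \end{vmatrix}, \tag{6.4.6}$$

etc., up to $X_n(\mathbf{x})$, and prove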
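Theorem 6.4.2 If $\theta = \{1, 2, \ldots, p\}$ and $D_p = A(\theta; \theta) \neq 0$, $p = 1, 2, \ldots, n$, then the $(X_p(\mathbf{x}))_1^n$ are linearly independent.

Proof. We see that $X_1(\mathbf{x}) = \sum_{j=1}^{n} a_{1j} x_j$, while $X_2(\mathbf{x}) = \sum_{j=2}^{n} (a_{11} a_{2j} - a_{21} a_{1j}) x_j$, and generally

$$X_p(\mathbf{x}) = D_p x_p + \text{terms in } x_{p+1}, \ldots, x_n.$$

Thus we see that in the reversed sequence $X_n(\mathbf{x}), X_{n-1}(\mathbf{x}), \ldots, X_1(\mathbf{x})$, each term involves one more $x_j$ than the previous one. This means that the $(X_i(\mathbf{x}))_1^n$ can all be simultaneously zero iff all the $(x_i)_1^n$ are zero.

This leads us to an important expression for $A(\mathbf{x}, \mathbf{x})$ given by

Theorem 6.4.3 (Jacobi). If $D_0 = 1$, $\theta = \{1, 2, \ldots, p\}$ and $D_p = A(\theta; \theta) \neq 0$, $p = 1, 2, \ldots, n$, then

$$A(\mathbf{x}, \mathbf{x}) = \sum_{p=1}^{n} \frac{(X_p(\mathbf{x}))^2}{D_p D_{p-1}}. \tag{6.4.7}$$

Note that, on account of Theorem 6.4.2, this equation expresses $A(\mathbf{x}, \mathbf{x})$ as a sum of multiples of squares of linearly independent combinations of the $(x_i)_1^n$.

Proof. Put $P_0 = 0$, and

$$P_p(\mathbf{x}, \mathbf{x}) = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1p} & A_1(\mathbf{x}) \\ a_{21} & a_{22} & \cdots & a_{2p} & A_2(\mathbf{x}) \\ \cdot & \cdot & \cdots & \cdot & \cdot \\ a_{p1} & a_{p2} & \cdots & a_{pp} & A_p(\mathbf{x}) \\ A_1(\mathbf{x}) & A_2(\mathbf{x}) & \cdots & A_p(\mathbf{x}) & 0 \end{vmatrix} \tag{6.4.8}$$

and find the recurrence relation linking $P_p$ and $P_{p-1}$. $P_p(\mathbf{x}, \mathbf{x})$ is the determinant of a symmetric matrix $\mathbf{C} \in S_{p+1}$, i.e., $P_p(\mathbf{x}, \mathbf{x}) = C(\theta(p + 1); \theta(p + 1))$. Apply Theorem 6.2.2 to this, letting

$$b_{ij} = C(\theta(p - 1) \cup i; \theta(p - 1) \cup j), \qquad i, j = p, p + 1,$$

then

$$\det(\mathbf{B}) = b_{pp} b_{p+1,p+1} - b_{p,p+1} b_{p+1,p} = C(\theta(p - 1); \theta(p - 1)) \, C(\theta(p + 1); \theta(p + 1)). \tag{6.4.9}$$

But $b_{pp} = D_p$, $b_{p+1,p+1} = P_{p-1}(\mathbf{x}, \mathbf{x})$, $b_{p,p+1} = b_{p+1,p} = X_p(\mathbf{x})$, while $C(\theta(p - 1); \theta(p - 1)) = D_{p-1}$, $C(\theta(p + 1); \theta(p + 1)) = P_p(\mathbf{x}, \mathbf{x})$. Thus, equation (6.4.9) gives

$$D_p P_{p-1}(\mathbf{x}, \mathbf{x}) - X_p^2(\mathbf{x}) = D_{p-1} P_p(\mathbf{x}, \mathbf{x})$$

or, since the $D_p$ are non-zero,

$$\frac{X_p^2(\mathbf{x})}{D_p D_{p-1}} = -\frac{P_p(\mathbf{x}, \mathbf{x})}{D_p} + \frac{P_{p-1}(\mathbf{x}, \mathbf{x})}{D_{p-1}}, \qquad p = 1, 2, \ldots, n. \tag{6.4.10}$$

Now Theorem 6.4.1 states that

$$A(\mathbf{x}, \mathbf{x}) = -\frac{P_n(\mathbf{x}, \mathbf{x})}{D_n},$$

so that on summing equation (6.4.10) from 1 to $n$ and using $P_0 = 0$ we find the required sum (6.4.7).

Theorem 6.4.4 Suppose $\mathbf{A} \in S_n$, then $\mathbf{A}$ is PD iff $D_i > 0$, $i = 1, 2, \ldots, n$.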
Proof. First we prove that if $\mathbf{A}$ is PD, then $\det(\mathbf{A}) > 0$. Since $\mathbf{A} \in S_n$, it has, by the Corollary to Theorem 6.3.2, $n$ eigenvalues $(\lambda_i)_1^n$ and $n$ orthonormal eigenvectors $(\mathbf{u}_i)_1^n$ such that $\mathbf{A}\mathbf{u}_i = \lambda_i\mathbf{u}_i$. Thus $\lambda_i = \mathbf{u}_i^T\mathbf{A}\mathbf{u}_i = A(\mathbf{u}_i,\mathbf{u}_i) > 0$ and therefore $\det(\mathbf{A}) = \prod_{i=1}^{n}\lambda_i > 0$, i.e., $D_n > 0$. If $\mathbf{A}$ is PD, then the matrix obtained by deleting the last $j$ rows and columns of $\mathbf{A}$ is PD, for $j = 1, 2, \ldots, n-1$. Therefore, their determinants are positive, i.e., $(D_{n-j})_1^{n-1} > 0$. We have proved that if $\mathbf{A}$ is PD, then $(D_i)_1^n > 0$.

Now suppose that $(D_i)_1^n > 0$, then equation (6.4.7) shows that $A(\mathbf{x},\mathbf{x}) > 0$, for the $(X_i(\mathbf{x}))_1^n$ can be simultaneously zero only if $\mathbf{x} = \mathbf{0}$. Thus $\mathbf{A}$ is PD.

Corollary 6.4.1 If $\mathbf{A} \in S_n$ is PD, then all the principal minors $A(\alpha;\alpha) = A(\alpha)$, $\alpha \in Q_{p,n}$, $p = 1, 2, \ldots, n$, are positive.

If $\mathbf{A} \in S_n$ is merely positive semi-definite (PSD), then the leading principal minors, and indeed all the principal minors, are non-negative. We prove

Theorem 6.4.5 If $\mathbf{A} \in S_n$ is PSD and, for some $p$ satisfying $1 \le p < n$, $D_p = A(\theta;\theta) = 0$, then every principal minor bordering $D_p$ is zero. In particular, the leading principal minors $D_q$, $p \le q \le n$, are zero.

We prove that the $D_q$ are zero, and leave the remaining result to an Exercise.

Proof. There are two cases:

i) $p = 1$, then $D_1 = a_{11} = 0$, and
$$\begin{vmatrix} a_{11} & a_{1j} \\ a_{j1} & a_{jj} \end{vmatrix} = -a_{1j}^2 \ge 0$$
implies $(a_{1j})_1^n = 0$, so that $(D_q)_1^n = 0$; in this case $x_1$ does not appear in $A(\mathbf{x},\mathbf{x})$ at all.

ii) $a_{11} \neq 0$ and, for some $p$, $1 \le p \le n-1$, $D_p \neq 0$, $D_{p+1} = 0$. (If $p = n-1$, there is nothing further to prove; we may therefore take $p < n-1$.) We introduce bordered determinants
$$b_{ij} = A(\theta \cup i;\ \theta \cup j), \quad i, j = p+1, \ldots, n$$
and form $\mathbf{B} = (b_{ij})_{p+1}^{n}$. By Sylvester's identity (Corollary to Theorem 6.2.2), if $\alpha_1 > p$ and $\alpha \in Q_{r,n}$, $r \le n-p$, then
$$B(\alpha;\alpha) = (A(\theta;\theta))^{r-1} A(\theta \cup \alpha;\ \theta \cup \alpha)$$
so that $\mathbf{B}$ is PSD. Since $b_{p+1,p+1} = D_{p+1} = 0$, the matrix falls under case i), and if $q > p+1$, $r = q-p-1$, $\alpha = \{p+1, \ldots, q\}$, then
$$0 = B(\alpha;\alpha) = \{A(\theta(p);\theta(p))\}^{r} A(\theta(q);\theta(q))$$
so that $A(\theta(p);\theta(p)) = D_p \neq 0$ implies $A(\theta(q);\theta(q)) = D_q = 0$.

This theorem implies that if $\mathbf{A} \in S_n$ is PSD, then, for some $p$, $1 \le p < n$, $(D_i)_1^p > 0$, $(D_i)_{p+1}^n = 0$.
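The identities in Theorems 6.4.3 and 6.4.4 can be checked numerically on a small example. The following is only an illustrative sketch: the helper names `det`, `minor` and `jacobi_terms`, the test matrix and the test vector are assumptions made here, not part of the text. It uses exact rational arithmetic so that the Jacobi sum (6.4.7) can be compared with $\mathbf{x}^T\mathbf{A}\mathbf{x}$ without rounding.

```python
from fractions import Fraction

def det(M):
    """Determinant by exact (Fraction) Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:                      # a row swap changes the sign
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def minor(A, rows, cols):
    return det([[A[i][j] for j in cols] for i in rows])

def jacobi_terms(A, x):
    """Return ([D_0..D_n], [X_1..X_n]) in the notation of (6.4.6)-(6.4.7)."""
    n = len(A)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    D = [Fraction(1)] + [minor(A, range(p), range(p)) for p in range(1, n + 1)]
    # X_p: leading p x p block with its last column replaced by A_i(x)
    X = [det([list(A[i][:p - 1]) + [Ax[i]] for i in range(p)])
         for p in range(1, n + 1)]
    return D, X

A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # symmetric test matrix (an assumption)
x = [1, -2, 3]
D, X = jacobi_terms(A, x)
quad = sum(x[i] * A[i][j] * x[j] for i in range(3) for j in range(3))
jac = sum(X[p - 1] ** 2 / (D[p] * D[p - 1]) for p in range(1, 4))
is_pd = all(d > 0 for d in D[1:])       # criterion of Theorem 6.4.4
```

For this matrix the leading principal minors are all positive, so the criterion of Theorem 6.4.4 reports PD, and the two ways of evaluating the quadratic form agree.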
Exercises 6.4

1. Show that $\mathbf{A} \in S_n$ is PD iff its eigenvalues $(\lambda_i)_1^n$ are positive; it is PSD iff its eigenvalues are non-negative.

2. Show that if $\mathbf{A} \in S_n$ is PSD then $\mathbf{A}$ is singular, and that $\mathbf{x}^T\mathbf{A}\mathbf{x} = 0$ iff $\mathbf{A}\mathbf{x} = \mathbf{0}$.

3. Show that if $\mathbf{A} \in S_n$ is PSD and if $a_{ii} = 0$ for some $i$ satisfying $1 \le i \le n$, then $a_{ij} = 0$ for $j = 1, 2, \ldots, n$. This means that if $a_{ii} = 0$ then $x_i$ does not appear in $\mathbf{x}^T\mathbf{A}\mathbf{x}$.

4. Show that if $\mathbf{A} \in S_n$ is PSD and has rank $r$ then it has a positive principal minor of order $r$.

These examples are merely a selection of properties of PD and PSD matrices to be found in Chapter 7 of Horn and Johnson (1985) [183].
6.5
Perron’s theorem
Most matrices appearing in classical vibration problems are symmetric. It is therefore known that they have real eigenvalues, and a complete set of orthonormal eigenvectors. Often the matrices are PD, so that their eigenvalues, in addition to being real, are positive. However, the whole theory relating to oscillatory matrices depends on a basic result relating to a class of not necessarily symmetric matrices, as we now describe.

We recall some definitions. If a vector $\mathbf{x}$ has all its elements positive (non-negative) we shall say $\mathbf{x} > 0$ ($\ge 0$) and shall say that $\mathbf{x}$ is positive (non-negative). If $\mathbf{x}, \mathbf{y}$ are in $V_n$ then $\mathbf{x} \ge \mathbf{y}$ is equivalent to $\mathbf{x} - \mathbf{y} \ge 0$. The matrix $\mathbf{A} \in M_n$ is said to be positive (non-negative) if $a_{ij} > 0$ ($\ge 0$) for all $i, j = 1, 2, \ldots, n$.

Up to this point the only norm we have used for a vector $\mathbf{x} \in V_n$ is the Euclidean, or so-called $L_2$ norm:
$$\|\mathbf{x}\|_2 = \Big(\sum_{i=1}^{n} |x_i|^2\Big)^{1/2}. \qquad (6.5.1)$$
We can define the $L_2$ norm of a matrix $\mathbf{A} \in M_n$:
$$\|\mathbf{A}\|_2 = \Big(\sum_{i,j=1}^{n} |a_{ij}|^2\Big)^{1/2}. \qquad (6.5.2)$$
The norm is variously called the Frobenius norm, Schur norm or Hilbert-Schmidt norm. We will need another norm, the $L_1$ norm:
$$\|\mathbf{x}\|_1 = \sum_{i=1}^{n} |x_i|, \qquad (6.5.3)$$
$$\|\mathbf{A}\|_1 = \sum_{i,j=1}^{n} |a_{ij}|. \qquad (6.5.4)$$
A norm is like a distance; as such it must satisfy various fundamental conditions, for which see Ex. 6.5.1. For a definitive and extensive study of vector and matrix norms, see Horn and Johnson (1985) [183], Section 5.6.

We may now prove Perron's theorem, following Bellman (1970) [25].

Theorem 6.5.1 (Perron). Suppose $\mathbf{A} \in M_n$ and $\mathbf{A} > 0$. Then $\mathbf{A}$ has a unique eigenvalue $\lambda$ which has greatest absolute value. This eigenvalue is positive and simple, and its associated eigenvector can be taken to be positive. The eigenvalue is often called the Perron root of $\mathbf{A}$.

Proof. Let $S(\lambda)$ be the set of all non-negative $\lambda$ for which there exist non-negative $\mathbf{x}$ such that
$$\mathbf{A}\mathbf{x} \ge \lambda\mathbf{x}. \qquad (6.5.5)$$
We shall consider only $L_1$-normalised vectors $\mathbf{x}$, i.e., such that $\|\mathbf{x}\|_1 = \sum_{i=1}^{n} x_i = 1$. (Since $\mathbf{x} \ge 0$, $|x_i| = x_i$.) This therefore excludes the zero vector. If $\mathbf{x}$ satisfies (6.5.5), then Ex. 6.5.2 shows that
$$\lambda\|\mathbf{x}\|_1 \le \|\mathbf{A}\mathbf{x}\|_1 \le \|\mathbf{A}\|_1 \cdot \|\mathbf{x}\|_1, \qquad (6.5.6)$$
so that $0 \le \lambda \le \|\mathbf{A}\|_1$. This shows that the set $S(\lambda)$ is bounded. It is clearly not empty, because $\mathbf{A}$ is positive. The bounded set $S(\lambda)$ has a least upper bound; let it be $\lambda_0$. Let $\lambda_1, \lambda_2, \ldots$ be a sequence of $\lambda$'s in $S(\lambda)$ converging to $\lambda_0$, and $\mathbf{x}_i$ a corresponding sequence of $\mathbf{x}$'s satisfying $\mathbf{A}\mathbf{x}_i \ge \lambda_i\mathbf{x}_i$. The set of all $\mathbf{x}$ such that $\|\mathbf{x}\|_1 = 1$ is closed and bounded; therefore, the sequence $\{\mathbf{x}_i\}$ contains a convergent subsequence, converging to a non-negative vector $\mathbf{x}_0$ satisfying $\|\mathbf{x}_0\|_1 = 1$, and (Ex. 6.5.3)
$$\mathbf{A}\mathbf{x}_0 \ge \lambda_0\mathbf{x}_0. \qquad (6.5.7)$$
This means that $\lambda_0 \in S(\lambda)$. We shall now show that equality holds in (6.5.7), and we do so by reduction to a contradiction. Let $\mathbf{d} = \mathbf{A}\mathbf{x}_0 - \lambda_0\mathbf{x}_0 \ge 0$ and suppose one of the $d_i$, say $d_j$, is positive. Put
$$y_i = x_{i0} + (d_j/2\lambda_0)\,\delta_{ij};$$
then the $i$th row of $\mathbf{A}\mathbf{y} - \lambda_0\mathbf{y}$ is
$$e_i = d_i + a_{ij}d_j/(2\lambda_0) - d_j\delta_{ij}/2 > 0.$$
Now let $\lambda = \lambda_0 + \min_i(e_i/y_i)$, then $\lambda > \lambda_0$, and
$$\mathbf{A}\mathbf{y} - \lambda\mathbf{y} = \mathbf{e} - (\lambda - \lambda_0)\mathbf{y} = \mathbf{e} - \min_i(e_i/y_i)\,\mathbf{y} \ge 0.$$
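This states that $\lambda \in S(\lambda)$, and that $\lambda$ is greater than the least upper bound, $\lambda_0$, of $S(\lambda)$. This contradiction implies that there is equality in equation (6.5.7), i.e.,
$$\mathbf{A}\mathbf{x}_0 = \lambda_0\mathbf{x}_0. \qquad (6.5.8)$$
Thus $\lambda_0$ is an eigenvalue and $\mathbf{x}_0$ is an eigenvector, and $\mathbf{x}_0$ is necessarily positive (Ex. 6.5.4).

We will show that $\lambda_0$ is the required Perron root. Suppose that there is another eigenvalue $\lambda$, possibly complex, such that $|\lambda| \ge \lambda_0$, with $\mathbf{z} \neq \mathbf{0}$ being an associated eigenvector, so that $\mathbf{A}\mathbf{z} = \lambda\mathbf{z}$. Let $|\mathbf{z}|$ denote the vector with elements $|z_1|, |z_2|, \ldots, |z_n|$, then we deduce that
$$|\lambda|\,|\mathbf{z}| = |\mathbf{A}\mathbf{z}| \le \mathbf{A}|\mathbf{z}|. \qquad (6.5.9)$$
But then the maximum property of $\lambda_0$ implies $|\lambda| \le \lambda_0$, and hence $|\lambda| = \lambda_0$. Now the argument used earlier shows that equality holds in equation (6.5.9), i.e., $\mathbf{A}|\mathbf{z}| = \lambda_0|\mathbf{z}|$, $|\mathbf{z}| > 0$. But then
$$|\mathbf{A}\mathbf{z}| = \mathbf{A}|\mathbf{z}| \qquad (6.5.10)$$
and (Ex. 6.5.5) this can hold only if $\mathbf{z} = c\mathbf{w}$, where $c$ is complex and $\mathbf{w}$ is positive; and this implies that $\lambda$ is positive, i.e., $\lambda = \lambda_0$.

We now show that $\mathbf{x}_0$ and $\mathbf{w}$, both positive and both eigenvectors corresponding to $\lambda_0$, are equivalent. Put $\mathbf{y} = \mathbf{x}_0 - \varepsilon\mathbf{w}$, and take
$$\varepsilon = \min_i(x_{i0}/w_i) = x_{j0}/w_j;$$
then $\mathbf{y}$ is a non-negative eigenvector corresponding to $\lambda_0$ with $y_j = 0$, so that
$$a_{j1}y_1 + a_{j2}y_2 + \cdots + a_{jn}y_n = 0,$$
and since $a_{ji} > 0$ for $i = 1, 2, \ldots, n$ we must have $\mathbf{y} = \mathbf{0}$. Thus $\mathbf{x}_0 = \varepsilon\mathbf{w}$ so that $\lambda_0$ is a simple eigenvalue. Thus $\lambda_0$ has all the properties asserted for the Perron root $\lambda$.

Exercises 6.5

1. A vector norm must satisfy three conditions:
a) $\|\mathbf{x}\| \ge 0$, and $\|\mathbf{x}\| = 0$ iff $\mathbf{x} = \mathbf{0}$
b) $\|c\mathbf{x}\| = |c| \cdot \|\mathbf{x}\|$
c) $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$
Show that both the $L_1$ and the $L_2$ norm satisfy these conditions.

2. Show that if $\mathbf{A} \in M_n$, $\mathbf{x} \in V_n$, then $\|\mathbf{A}\mathbf{x}\|_1 \le \|\mathbf{A}\|_1 \cdot \|\mathbf{x}\|_1$.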
3. Verify that the vector $\mathbf{x}_0$ will in fact satisfy the inequality (6.5.7).

4. Show that if $\mathbf{x}$ is a non-negative eigenvector of a positive matrix $\mathbf{A} \in M_n$, then $\mathbf{x} > 0$. This has the following logical negative consequence: if $\mathbf{A} > 0$, $\mathbf{x} \ge 0$, $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$ and $x_i = 0$ for some $i = 1, \ldots, n$, then $\mathbf{x} = \mathbf{0}$.

5. Show that if $\mathbf{A} \in M_n$ is positive, then equation (6.5.10) can hold only if $\mathbf{z} = c\mathbf{w}$, where $c$ is complex and $\mathbf{w} > 0$.
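The proof of Perron's theorem in Section 6.5 is an existence argument and does not give an algorithm. As an illustration only, the sketch below approximates the Perron root of a small positive matrix by power iteration with $L_1$ normalisation; this is a different technique from the proof, and the function name `perron_root`, the iteration count and the test matrix are all assumptions made here. It also checks the bound $\lambda_0 \le \|\mathbf{A}\|_1$ from (6.5.6).

```python
def perron_root(A, iters=200):
    """Power iteration on a positive matrix, keeping ||x||_1 = 1 throughout."""
    n = len(A)
    x = [1.0 / n] * n                     # positive, L1-normalised start
    lam = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)                      # equals ||A x||_1, since y > 0
        x = [v / lam for v in y]          # renormalise so that ||x||_1 = 1
    return lam, x

A = [[1.0, 2.0], [3.0, 4.0]]              # positive, not symmetric
lam0, x0 = perron_root(A)

norm1_A = sum(abs(v) for row in A for v in row)      # ||A||_1 as in (6.5.4)
# At an eigenvector the ratios (A x)_i / x_i all equal the eigenvalue.
ratios = [sum(A[i][j] * x0[j] for j in range(2)) / x0[i] for i in range(2)]
exact = (5 + 33 ** 0.5) / 2               # larger root of t^2 - 5t - 2 = 0
```

The other eigenvalue of this matrix is negative and smaller in absolute value, so the iteration converges quickly to the Perron root and to a positive eigenvector.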
6.6
Totally non-negative matrices
Suppose $\mathbf{A} \in M_n$. The matrix $\mathbf{A}$ is said to be positive (see Section 6.5), written $\mathbf{A} > 0$, if $a_{ij} > 0$ for all $i, j = 1, 2, \ldots, n$. Total positivity concerns all the minors of $\mathbf{A}$ (see Section 6.2), not just its elements. If $\mathbf{A} \in M_{m,n}$, we say that $\mathbf{A}$ is

1. TN (totally non-negative) if all the minors of $\mathbf{A}$ are non-negative.

If $\mathbf{A} \in M_n$, we say that $\mathbf{A}$ is

2. NTN (non-singular and totally non-negative) if $\mathbf{A}$ is non-singular and TN;

3. TP (totally positive) if all the minors are (strictly) positive;

4. O (oscillatory) if $\mathbf{A}$ is TN, and a power, $\mathbf{A}^m$, is TP.

Note that some authors, including ourselves in Gladwell (1986b) [108], use totally positive (TP) instead of totally non-negative (TN), and strictly totally positive (STP) instead of totally positive (TP). Also, in Gladwell (1986b) [108], following Gantmacher and Krein (1950) [98], we used completely instead of totally; completely positive now has a quite different connotation. Reader, beware of these subtle distinctions!

The concept of an oscillatory (or oscillation) matrix was effectively introduced by Gantmacher and Krein in the 1930's, see Gantmacher and Krein (1950) [98]. It was developed further by Gantmacher (1959) [97]. The concept of total positivity had arisen much earlier than this, e.g., Fekete (1913) [86]; it was first systematically explored by Karlin (1968) [190] in his book Total Positivity, Volume 1. (Volume 2 has never appeared!) Ando (1987) [4] reviews its history and proves important new results.

All the concepts, total positivity, oscillatory, etc., arise in the study of in-line systems, rods, beams, splines, Sturm-Liouville differential equations, etc. The study of total positivity involves the delicate treatment of inequalities. Here are two typical examples, which the reader may verify:
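i) if $a \ge 0$ or $d \ge 0$; $b \ge 0$ and $c \ge 0$; and
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} > 0,$$
then $a > 0$ and $d > 0$;

ii) if $a \ge 0$ or $d \ge 0$; $b > 0$ and $c > 0$; and
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \ge 0,$$
then $a > 0$ and $d > 0$.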
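The definitions of TN and TP in this section can be tested directly on small matrices by enumerating every minor. The sketch below is a brute-force illustration only; the helper names `det`, `all_minors`, `is_TN` and `is_TP` and the two test matrices are assumptions made here. The number of minors grows very quickly with the order, so this is practical only for small matrices.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by exact (Fraction) Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def all_minors(A):
    """Yield every minor A(alpha; beta) of a (possibly rectangular) matrix."""
    m, n = len(A), len(A[0])
    for p in range(1, min(m, n) + 1):
        for rows in combinations(range(m), p):
            for cols in combinations(range(n), p):
                yield det([[A[i][j] for j in cols] for i in rows])

def is_TN(A):
    return all(v >= 0 for v in all_minors(A))

def is_TP(A):
    return all(v > 0 for v in all_minors(A))

T = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # tridiagonal: TN, but has zero entries
P = [[1, 1], [1, 2]]                    # every entry and the determinant positive
tn_T, tp_T, tp_P = is_TN(T), is_TP(T), is_TP(P)
```

Here `T` is TN but not TP because two of its entries (first-order minors) are zero, while `P` is TP.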
The concept of total positivity is similar to positive-definiteness, but there are important differences between the two concepts: positive-definiteness applies only to symmetric matrices, TP applies to any matrices in $M_n$; the condition for positive-definiteness involves only the principal minors, while TP involves all the minors. Clearly, if $\mathbf{A} \in S_n$ is TN then it is PSD; if it is TP then it is PD; but the converses of these results are false (Ex. 6.6.1).

There is a theorem like Theorem 6.4.5 for matrices which are TN:

Theorem 6.6.1 If $\mathbf{A} \in M_n$, and $\mathbf{A}$ is TN, and $\mathbf{A}$ has a zero principal minor, then every minor bordering it is also zero.

Proof. For simplicity we confine our attention to the leading principal minors; this restriction can be removed at the expense of some complication in the argument. As in Theorem 6.4.5, there are two cases:

1) $D_1 = a_{11} = 0$. We assert that this implies that either $(a_{i1})_1^n = 0$ or $(a_{1j})_1^n = 0$. If this were not true, then we could find $a_{i1} > 0$ and $a_{1j} > 0$ for some $i, j$ satisfying $2 \le i \le n$, $2 \le j \le n$. But then
$$\begin{vmatrix} a_{11} & a_{1j} \\ a_{i1} & a_{ij} \end{vmatrix} = -a_{i1}a_{1j} < 0,$$
which contradicts the statement that $\mathbf{A}$ is TN. Thus if $a_{11} = 0$ then either the first row of $\mathbf{A}$ or the first column of $\mathbf{A}$ must be zero. (See also Ex. 6.6.2.)

2) $a_{11} \neq 0$. Then for some $p$ ($1 \le p \le n-1$) we have $D_p \neq 0$, $D_{p+1} = 0$. (Again, if $p = n-1$, there is nothing further to prove.) We introduce bordered determinants
$$b_{ij} = A(\theta \cup i;\ \theta \cup j), \quad i, j = p+1, \ldots, n$$
and form the matrix $\mathbf{B} = (b_{ij})_{p+1}^{n}$. By Sylvester's identity (Corollary 6.2.3), if $\alpha, \beta \in Q_{r,n}$, $\alpha_1 > p$, $\beta_1 > p$, then
$$B(\alpha;\beta) = (A(\theta;\theta))^{r-1} A(\theta \cup \alpha;\ \theta \cup \beta)$$
so that $\mathbf{B}$ is TN. Since $b_{p+1,p+1} = D_{p+1} = 0$, the matrix falls under case 1. If $q > p+1$, and $\alpha = \beta = \{p+1, \ldots, q\}$, then
$$B(\alpha;\beta) = (A(\theta(p);\theta(p)))^{q-p-1} A(\theta(q);\theta(q)) = 0.$$
But since $D_p = A(\theta(p);\theta(p)) \neq 0$, we have $D_q = 0$.
Theorem 6.6.2 If $\mathbf{A} \in M_n$ and $\mathbf{A}$ is NTN, then all its principal minors are positive.

Proof. If any principal minor were zero, then, by Theorem 6.6.1, $D_n = \det(\mathbf{A})$ would be zero, but $\mathbf{A}$ is non-singular, so that $\det(\mathbf{A}) \neq 0$ (in fact $\det(\mathbf{A}) > 0$).

Corollary 6.6.1 If $\mathbf{A} \in S_n$ is NTN, then $\mathbf{A}$ is PD.

If $\mathbf{A} \in M_n$ is NTN, then we know that some elements of $\mathbf{A}$ are strictly positive, in particular, by Theorem 6.6.2, the diagonal elements $a_{ii}$. We now prove an important result which shows that $\mathbf{A}$ will have a so-called staircase structure. We first introduce some definitions.

Let $p = \{p_1, p_2, \ldots, p_n\}$ be a sequence of integers from $\{1, 2, \ldots, n\}$. Then $p$ is a staircase sequence if $p_1 \le p_2 \le \cdots \le p_n$ and $p_i \ge i$ for all $i = 1, 2, \ldots, n$. Thus $p = \{2, 3, 3, 5, 5\}$ is a staircase sequence. Suppose $p$ and $q$ are staircase sequences. A matrix $\mathbf{A} \in M_n$ is said to be a $p, q$-staircase matrix if $a_{ij} = 0$ when $j > p_i$ or $i > q_j$. Suppose $p = \{2, 3, 3, 5, 5\}$, $q = \{2, 4, 5, 5, 5\}$; then
$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & & & \\ a_{21} & a_{22} & a_{23} & & \\ & a_{32} & a_{33} & & \\ & a_{42} & a_{43} & a_{44} & a_{45} \\ & & a_{53} & a_{54} & a_{55} \end{bmatrix}$$
is a $p, q$-staircase matrix. The characteristic of a staircase matrix is that if an element in the upper (lower) triangle is zero then all elements to the right (left) and above (below) are zero. Clearly, if $\mathbf{A} \in S_n$, then $p = q$; we say $\mathbf{A}$ is a $p$-staircase matrix. We are now ready for

Theorem 6.6.3 If $\mathbf{A} \in M_n$ is NTN, it is a staircase matrix.

Proof. The elements in the upper and lower triangles may be dealt with separately; we consider just the upper triangle. Suppose $j > i$ and $a_{ij} = 0$. If $k > j$, then
$$\begin{vmatrix} a_{ij} & a_{ik} \\ a_{jj} & a_{jk} \end{vmatrix} \ge 0, \quad a_{ij} = 0, \quad a_{jj} > 0, \quad a_{ik} \ge 0$$
imply $a_{ik} = 0$; all elements to the right of $a_{ij}$ are also zero. Now consider the terms $\{a_{ii}, a_{i,i+1}, \ldots, a_{in}\}$ in the $i$th row. The first is positive; there is therefore a least index $m$, with $i \le m \le n$, such that $a_{ij} = 0$ for $j > m$; call this index $p_i$; $p_i \ge i$. Now suppose $j > i$ and $a_{ij} = 0$. If $k < i$ then
$$\begin{vmatrix} a_{ki} & a_{kj} \\ a_{ii} & a_{ij} \end{vmatrix} \ge 0, \quad a_{ij} = 0, \quad a_{ii} > 0, \quad a_{kj} \ge 0$$
imply $a_{kj} = 0$. Thus $a_{ij} = 0$ and $k < i$ implies $a_{kj} = 0$. Thus $j > p_i$ implies $a_{kj} = 0$, i.e., $j > p_i$ implies $j > p_k$; $p_k \le p_i$. Thus $p$ is a staircase sequence, and the upper triangle of $\mathbf{A}$ is a staircase.

Theorem 6.6.4 If $\mathbf{A} \in M_n$ is TN and $1 \le p < n$, then
$$A(\theta(n);\theta(n)) \le A(\theta(p);\theta(p)) \cdot A(\theta'(p);\theta'(p)).$$
Recall that $\theta(p) = \{1, 2, \ldots, p\}$, $\theta'(p) = \{p+1, \ldots, n\}$.

Proof. On account of Theorem 6.6.1 we may assume without loss of generality that all the principal minors are positive, for if any were zero, then the inequality would be satisfied trivially because then, by Theorem 6.6.1, $\det(\mathbf{A}) = 0$. The theorem is true for $n = 2$ since
$$A(\theta(2);\theta(2)) = a_{11}a_{22} - a_{12}a_{21} \le a_{11}a_{22}.$$
We prove the theorem by induction, and assume that it holds for matrices of order $n-1$ or less. We introduce the matrix $\mathbf{B}$ of Theorem 6.6.1:
$$b_{ij} = A(\theta \cup i;\ \theta \cup j), \quad i, j = p+1, \ldots, n,$$
and
$$B(\theta';\theta') = (A(\theta;\theta))^{n-p-1} A(\theta(n);\theta(n)) = (D_p)^{n-p-1} D_n$$
which we reverse to give $D_n = B(\theta';\theta')/(D_p)^{n-p-1}$. Since $\mathbf{B}[\theta'|\theta']$ is of order $n-p \le n-1$, the inductive hypothesis applies to it:
$$B(\theta';\theta') \le b_{p+1,p+1}\, B(\theta'(p+1);\theta'(p+1))$$
and thus
$$D_n \le b_{p+1,p+1}\, B(\theta'(p+1);\theta'(p+1))/(D_p)^{n-p-1}. \qquad (6.6.1)$$
Applying Sylvester's identity again, we have
$$B(\theta'(p+1);\theta'(p+1)) = (A(\theta;\theta))^{n-p-2} A(\gamma;\gamma)$$
where $\gamma = \theta \cup \theta'(p+1) = \{1, 2, \ldots, p, p+2, \ldots, n\}$, which when combined with (6.6.1) and $b_{p+1,p+1} = D_{p+1}$ gives
$$D_n \le D_{p+1}(D_p)^{n-p-2} A(\gamma;\gamma)/(D_p)^{n-p-1} = D_{p+1} A(\gamma;\gamma)/D_p. \qquad (6.6.2)$$
Now we use the inductive hypothesis again to give
$$A(\gamma;\gamma) \le D_p\, A(\theta'(p+1);\theta'(p+1))$$
which, when combined with (6.6.2), gives
$$A(\theta(n);\theta(n)) \le A(\theta(p+1);\theta(p+1)) \cdot A(\theta'(p+1);\theta'(p+1))$$
which shows that the result holds for matrices of order $n$.
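The inequality of Theorem 6.6.4 is easy to check on a concrete TN matrix. The sketch below is illustrative only; the helper names `det` and `principal` and the tridiagonal test matrix are assumptions made here. It compares $\det(\mathbf{A})$ with the product of the two complementary leading and trailing principal minors for each split $p$, and also compares each leading principal minor with the product of the corresponding diagonal entries.

```python
from fractions import Fraction

def det(M):
    """Determinant by exact (Fraction) Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def principal(A, idx):
    """Principal minor A(idx; idx)."""
    return det([[A[i][j] for j in idx] for i in idx])

A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]    # a TN matrix (assumed example)
n = len(A)
full = principal(A, range(n))            # D_n = det(A)
# A(theta(p)) * A(theta'(p)) for p = 1, ..., n-1
splits = [principal(A, range(p)) * principal(A, range(p, n))
          for p in range(1, n)]
leading = [principal(A, range(p)) for p in range(1, n + 1)]
diag_products = []
prod = Fraction(1)
for i in range(n):
    prod *= A[i][i]
    diag_products.append(prod)           # a_11 a_22 ... a_pp
```

For this matrix $\det(\mathbf{A}) = 4$ while both splits give 6, and the leading minors 2, 3, 4 are bounded by the diagonal products 2, 4, 8.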
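Corollary 6.6.2 If $\mathbf{A} \in M_n$ is TN then $D_p \le a_{11}a_{22} \cdots a_{pp}$, $1 \le p \le n$.

Theorem 6.6.4 is expressed as a result concerning principal minors of a TN matrix $\mathbf{A}$, but since any square matrix taken from a subset of rows and columns of such an $\mathbf{A}$ is also TN we can state

Corollary 6.6.3 If $\mathbf{A} \in M_n$ is TN, $\alpha, \beta \in Q_{q,n}$, and $\alpha, \beta \subset \theta(p) = \theta$ (i.e., $\alpha_q, \beta_q \le p$), then
$$A(\alpha \cup \theta';\ \beta \cup \theta') \le A(\alpha;\beta)\, A(\theta';\theta').$$
Similarly, if $\alpha, \beta \in Q_{q,n}$, and $\alpha, \beta \subset \theta'(p) = \theta'$ (i.e., $\alpha_1, \beta_1 \ge p+1$), then
$$A(\theta \cup \alpha;\ \theta \cup \beta) \le A(\theta;\theta)\, A(\alpha;\beta).$$
See Ando (1987) [4] for generalisations of this result.

Theorem 6.6.5 Suppose $\mathbf{A} \in M_{m,n}$ is TN. If $\mathbf{A}$ has $p$ linearly dependent rows, labelled by $\alpha \in Q_{p,m}$ with $\alpha_1 = 1$, $\alpha_p = m$, of which the first $p-1$, labelled $\alpha\backslash\alpha_p$, and the last $p-1$, labelled $\alpha\backslash\alpha_1$, are linearly independent (l.i.), then $\mathbf{A}$ has rank $p-1$.

Proof. Clearly, the rank of $\mathbf{A}$ is at least $p-1$; we show that it is not greater than $p-1$, i.e., it is exactly $p-1$. The linearly dependent rows are specified by $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_p\}$. If $p > n$, then rank$(\mathbf{A}) \le n < p$, so that rank$(\mathbf{A}) = p-1$. Take $p \le n$. The row $\alpha_p$ may be expressed in terms of rows $\alpha_1, \alpha_2, \ldots, \alpha_{p-1}$:
$$a_{\alpha_p,j} = \sum_{k=1}^{p-1} c_k a_{\alpha_k,j}, \quad j = 1, 2, \ldots, n, \qquad (6.6.3)$$
and since rows $\alpha\backslash\alpha_1$ are l.i., $c_1 \neq 0$. Since rows $\alpha\backslash\alpha_p$ are l.i., there is $\beta' \in Q_{p-1,n}$ such that $A(\alpha\backslash\alpha_p;\beta') > 0$. On substituting for $a_{\alpha_p,j}$ from (6.6.3) we find
$$A(\alpha\backslash\alpha_1;\beta') = (-)^p c_1 A(\alpha\backslash\alpha_p;\beta').$$
Therefore $A(\alpha\backslash\alpha_1;\beta') \neq 0$, but this minor is non-negative and therefore it is strictly positive; therefore $(-)^p c_1 > 0$.

Now suppose $q \in \alpha'$, then $\alpha_r < q < \alpha_{r+1}$ for some index $r$ satisfying $1 \le r \le p-1$. If $\beta \in Q_{p,n}$ then, on substituting for $a_{\alpha_p,j}$, as before, we find
$$A((\alpha\backslash\alpha_1) \cup q;\ \beta) = (-)^{p+1} c_1 A((\alpha\backslash\alpha_p) \cup q;\ \beta). \qquad (6.6.4)$$
The inequality $(-)^p c_1 > 0$ implies that the minors on either side of (6.6.4) have opposite signs. But both are non-negative so that both are zero. Since $\beta$ is an arbitrary member of $Q_{p,n}$, this means that any row $q \in \alpha'$ may be expressed as a linear combination of the rows $\alpha\backslash\alpha_1$, or equivalently of $\alpha\backslash\alpha_p$. Thus the rank of $\mathbf{A}$ is $p-1$.

We now prove a corollary of this result, but since its truth is not immediately clear, we state it as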
Theorem 6.6.6 If $\mathbf{A} \in M_{m,n}$ is TN and there exist $\alpha \in Q_{p,m}$, $\beta \in Q_{p,n}$ such that $\alpha_1 = 1$, $\alpha_p = m$, $\beta_1 = 1$, $\beta_p = n$ and
$$A(\alpha;\beta) = 0, \quad A(\alpha\backslash\alpha_p;\ \beta\backslash\beta_p) > 0, \quad A(\alpha\backslash\alpha_1;\ \beta\backslash\beta_1) > 0$$
then $\mathbf{A}$ has rank $p-1$.

Proof. Apply Theorem 6.6.5 to the matrix with rows $\{1, 2, \ldots, m\}$ and columns $\beta$ of $\mathbf{A}$. It has $p$ linearly dependent rows $\alpha$, of which the first $p-1$, $\alpha\backslash\alpha_p$, and the last $p-1$, $\alpha\backslash\alpha_1$, are linearly independent. Therefore, it has rank $p-1$, so that its $p$ columns $\beta$ are linearly dependent. These columns are columns of $\mathbf{A}$, and so are rows of $\mathbf{A}^T$. Now apply Theorem 6.6.5 to $\mathbf{A}^T$. Its rows $\beta$ are linearly dependent, while the first $p-1$, $\beta\backslash\beta_p$, and last $p-1$, $\beta\backslash\beta_1$, are linearly independent. Therefore, by Theorem 6.6.5, $\mathbf{A}^T$ has rank $p-1$; $\mathbf{A}$ has rank $p-1$.

Exercises 6.6

1. Exhibit $\mathbf{A} \in S_2$ which is PD but not TN.

2. Use Theorem 6.6.3 to prove that if $\mathbf{A}$ is NTN and $a_{1n} > 0$, $a_{n1} > 0$, then $\mathbf{A}$ is a (strictly) positive matrix. Markham (1970) [221] stated this result for oscillatory $\mathbf{A}$, but NTN is sufficient. Find even weaker conditions for the result to hold. (See Gladwell (1998) [126].) See Gasca and Pena (1992) [99] for related work.
6.7
Oscillatory matrices
We introduced four terms at the beginning of Section 6.6: TN, NTN, TP and O. In this section we are concerned with the last, oscillatory. We note that TN is weaker than NTN, which in turn is weaker than TP. O is by definition stronger than NTN; it is weaker than TP because
$$\mathbf{A} = \begin{bmatrix} 2 & 1 & \\ 1 & 2 & 1 \\ & 1 & 2 \end{bmatrix} \qquad (6.7.1)$$
is O because
$$\mathbf{A}^2 = \begin{bmatrix} 5 & 4 & 1 \\ 4 & 6 & 4 \\ 1 & 4 & 5 \end{bmatrix}$$
is TP, but $\mathbf{A}$ itself is not TP. Note that if $\mathbf{A}^m$ is TP, then $\mathbf{A}$ is necessarily non-singular. We can therefore define $\mathbf{A}$ to be O, if $\mathbf{A}$ is TN and $\mathbf{A}^m$ is TP.

We will show later that if $\mathbf{A} \in S_n$, $\mathbf{A}$ is PD, and tridiagonal with positive codiagonal, then $\mathbf{A}$ is O. Clearly though, the class of oscillatory matrices is much larger than this. We will first obtain some preliminary results which will allow
us to characterise oscillatory matrices. It is oscillatory matrices, and not TP matrices, which appear in applications to inverse problems.

We have defined an oscillatory (O) matrix as a TN matrix which is such that a power $\mathbf{A}^m$ is TP. Using this definition, we cannot easily check whether a TN matrix is O. Our principal aim in this section is to obtain an easily applicable test for $\mathbf{A}$ to be O. As a first step we prove

Theorem 6.7.1 If $\mathbf{A} \in M_n$ is O, then any principal submatrix $\mathbf{B} \in M_p$ formed by deleting successive rows and columns of $\mathbf{A}$ is O.

Proof. Clearly, any principal submatrix is TN; the question is whether it is O. It is sufficient to show that $\mathbf{B} = \mathbf{A}_1$, obtained by deleting the first row and column of $\mathbf{A}$, is O. We use Ex. 6.2.6, deduced from the Binet-Cauchy Theorem (equation (6.2.9)), to obtain the minors of a power of a matrix in terms of the minors of the original matrix. Suppose that $\mathbf{A}^m = \mathbf{C}$ is TP, and consider the minors of $\mathbf{D} = \mathbf{B}^m$. We retain the original numbering of rows and columns, so that $\mathbf{B} = (a_{ij})_2^n$. Then if $\alpha, \beta \in Q_{p,n}$ and $\alpha_1 \ge 2$, $\beta_1 \ge 2$, we have
$$D(\alpha;\beta) = \sum A(\alpha;\gamma^{(1)})\, A(\gamma^{(1)};\gamma^{(2)}) \cdots A(\gamma^{(m-1)};\beta) \qquad (6.7.2)$$
where the sum is taken over all sequences $\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(m-1)} \in Q_{p,n}$ with $\gamma_1^{(i)} \ge 2$, $i = 1, 2, \ldots, m-1$. Now consider the corresponding minors of $\mathbf{C} = \mathbf{A}^m$: $C(1 \cup \alpha;\ 1 \cup \beta)$; we have
$$C(1 \cup \alpha;\ 1 \cup \beta) = \sum A(1 \cup \alpha;\delta^{(1)})\, A(\delta^{(1)};\delta^{(2)}) \cdots A(\delta^{(m-1)};\ 1 \cup \beta) \qquad (6.7.3)$$
where the sum is taken over all sequences $\delta^{(1)}, \delta^{(2)}, \ldots, \delta^{(m-1)} \in Q_{p+1,n}$. Since $\mathbf{C}$ is TP, each of its minors must be positive; this implies that for at least one set $\delta^{(1)}, \delta^{(2)}, \ldots, \delta^{(m-1)}$, the product on the right of (6.7.3) must be positive; this implies that each of the minors entering that product must be strictly positive, for they are all non-negative. Now if $\delta \in Q_{p+1,n}$, it may be written $\delta_1 \cup \gamma$, where $\gamma \in Q_{p,n}$ and $\gamma_1 \ge 2$. This means that with the particular set $\delta^{(1)}, \ldots, \delta^{(m-1)} \in Q_{p+1,n}$ for which all the terms in (6.7.3) are positive, one may associate a set $\gamma^{(1)}, \ldots, \gamma^{(m-1)} \in Q_{p,n}$ which appears in the product (6.7.2). Now we use Corollary 6.6.3; it shows that for this particular choice of $(\gamma^{(i)})_1^{m-1}$, all the minors on the right of (6.7.2) must be positive, for if one were zero, say the first, then
$$A(1 \cup \alpha;\ \delta_1^{(1)} \cup \gamma^{(1)}) \le a_{1,\delta_1^{(1)}}\, A(\alpha;\gamma^{(1)}) = 0$$
contrary to the fact that the minor on the left is positive. We conclude that one product in the sum on the right of (6.7.2) is positive; $\mathbf{D} = \mathbf{B}^m$ is TP; $\mathbf{B}$ is O.

We defined a principal minor of $\mathbf{A}$ as $A(\alpha;\alpha) \equiv A(\alpha)$. We now define a quasi-principal minor. The minor $A(\alpha;\beta)$ is said to be quasi-principal if $\alpha, \beta \in Q_{p,n}$ and
$$1 \le \alpha_1, \beta_1 < \alpha_2, \beta_2 < \cdots < \alpha_p, \beta_p \le n \qquad (6.7.4)$$
and
$$|\alpha_1 - \beta_1| \le 1, \quad |\alpha_2 - \beta_2| \le 1, \quad \ldots, \quad |\alpha_p - \beta_p| \le 1. \qquad (6.7.5)$$
Thus a principal minor is also a quasi-principal minor. The statement $\alpha_1, \beta_1 < \alpha_2, \beta_2$ means that each of $\alpha_1$ and $\beta_1$ is less than each of $\alpha_2$ and $\beta_2$, but there is no ordering of $\alpha_1$ and $\beta_1$, nor of $\alpha_2$ and $\beta_2$; thus
$$\alpha_1 < \alpha_2, \quad \alpha_1 < \beta_2, \quad \beta_1 < \alpha_2, \quad \beta_1 < \beta_2.$$
The minors $A(1,3;2,3)$, $A(1,3;1,3)$, $A(1,2;1,3)$ are all quasi-principal, but $A(1,2;2,3)$ is not.

Note that for $\mathbf{A}$ given in (6.7.1), and this matrix $\mathbf{A}$ is O, all these quasi-principal minors are positive. This is a particular case of

Theorem 6.7.2 If $\mathbf{A} \in M_n$, $\mathbf{A}$ is NTN and $a_{i,i+1} > 0$, $a_{i+1,i} > 0$ for $i = 1, 2, \ldots, n-1$, then all the quasi-principal minors of $\mathbf{A}$ are positive.

Proof. We will use induction on the order, $p$, of the minors. The first order quasi-principal minors are the diagonal terms $a_{ii}$, which are positive because of Corollary 6.6.2; and $a_{i,i+1}$ and $a_{i+1,i}$, which are positive by the statement of the theorem. Suppose then that all the quasi-principal minors of order $p-1$ are positive. We will prove that all those of order $p$ are positive. For suppose this were not true, so that
$$A(\alpha_1, \alpha_2, \ldots, \alpha_p;\ \beta_1, \beta_2, \ldots, \beta_p) = 0$$
where the indices satisfy the inequalities (6.7.4), (6.7.5). But then
$$A(\alpha_1, \alpha_2, \ldots, \alpha_{p-1};\ \beta_1, \beta_2, \ldots, \beta_{p-1}) \quad \text{and} \quad A(\alpha_2, \alpha_3, \ldots, \alpha_p;\ \beta_2, \beta_3, \ldots, \beta_p)$$
will be quasi-principal minors of order $p-1$, and so positive. Now Theorem 6.6.6 states that the matrix with rows $\alpha_1, \alpha_1+1, \ldots, \alpha_p$ and columns $\beta_1, \beta_1+1, \ldots, \beta_p$ has rank $p-1$. Let $h = \max(\alpha_1, \beta_1)$, then it follows from the inequalities (6.7.4), (6.7.5) that
$$\alpha_1, \beta_1 \le h; \quad \alpha_p, \beta_p \ge h+p-1.$$
Therefore, the minor $A(h, h+1, \ldots, h+p-1)$ is a $p$th order minor of a matrix with rank $p-1$, and so is zero. But this minor is a principal minor of $\mathbf{A}$, and Theorem 6.6.4 shows that $\det(\mathbf{A}) = 0$; but $\mathbf{A}$ is NTN and thus non-singular. This contradiction implies that all the quasi-principal minors of $\mathbf{A}$ are positive.

We are now in a position to prove the important

Theorem 6.7.3 If $\mathbf{A} \in M_n$ is NTN then it is O iff $a_{i,i+1} > 0$, $a_{i+1,i} > 0$ for $i = 1, 2, \ldots, n-1$.
Proof. We first prove that if it is O, then $a_{i,i+1} > 0$, $a_{i+1,i} > 0$. If it is O then Theorem 6.7.1 states that the matrix
$$\mathbf{B} = \begin{bmatrix} a_{ii} & a_{i,i+1} \\ a_{i+1,i} & a_{i+1,i+1} \end{bmatrix}$$
is O, so that $\mathbf{D} = \mathbf{B}^m$ is TP for some $m$. But if say $a_{i,i+1} = 0$ then $d_{i,i+1} = 0$, whatever the value of $m$. Similarly, if $a_{i+1,i} = 0$, then $d_{i+1,i} = 0$. Thus $a_{i,i+1} > 0$ and $a_{i+1,i} > 0$.

We must now prove that if $a_{i,i+1} > 0$, $a_{i+1,i} > 0$ for all $i = 1, 2, \ldots, n-1$, then there is a power of $\mathbf{A}$ which is TP. We shall show that $\mathbf{A}^{n-1}$ is TP. We shall use Theorem 6.7.2, which states that the quasi-principal minors of $\mathbf{A}$ are positive. We recall the result used in Theorem 6.7.1, that a minor of $\mathbf{B} = \mathbf{A}^{n-1}$ is a sum of products of $n-1$ minors of $\mathbf{A}$. We need to show that the sum corresponding to a particular minor $B(\alpha;\beta)$ has at least one positive term in it.

First, we note that if $B(\alpha;\beta) > 0$ for one particular value of $m$ (with $\mathbf{B} = \mathbf{A}^m$), then it will be positive for $m+1$ also, and thus for all subsequent $m$; for since $\mathbf{C} = \mathbf{A}^{m+1} = \mathbf{A} \cdot \mathbf{A}^m = \mathbf{A}\mathbf{B}$, the Binet-Cauchy expansion for the minor $C(\alpha;\beta)$ will contain the term $A(\alpha;\alpha) \cdot B(\alpha;\beta)$. This is positive because, by Theorem 6.6.2, the principal minors of $\mathbf{A}$ are positive.

This implies that, to show that $B(\alpha;\beta) > 0$ holds for $\mathbf{B} = \mathbf{A}^{n-1}$, it is sufficient to show that for some $m$ satisfying $1 \le m \le n-1$ the expansion for $B(\alpha;\beta)$ will contain one product consisting entirely of quasi-principal minors. The problem is essentially how we can step from the sequence $\alpha$ to the sequence $\beta$ through intermediate sequences $\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(m-1)}$ such that
$$A(\alpha;\gamma^{(1)}),\ A(\gamma^{(1)};\gamma^{(2)}),\ \ldots,\ A(\gamma^{(m-1)};\beta)$$
are all quasi-principal. Take an example. Suppose $p = 3$, $\alpha = \{1, 2, 3\}$ and $\beta = \{3, 5, 6\}$; we step as follows:
$$\{1, 2, 3\} \to \{2, 3, 4\} \to \{3, 4, 5\} \to \{3, 5, 6\}.$$
The required exponent $m$ is the number of steps needed to go from $\alpha$ to $\beta$, and this is
$$D = \max_{1 \le r \le p} |\alpha_r - \beta_r|. \qquad (6.7.6)$$
The quantity $D$ (3 in the example) may be viewed as the distance $D(\alpha, \beta)$ between two sequences (see Ex. 6.7.2). If $A(\alpha;\beta)$ is quasi-principal then $D(\alpha, \beta) \le 1$; if $A(\alpha;\beta)$ is quasi-principal but not principal, then $D(\alpha, \beta) = 1$. The greatest distance between two sequences $\alpha, \beta \in Q_{p,n}$ is $n-p$; it occurs for instance when
$$\alpha = \{1, 2, \ldots, p\}, \quad \beta = \{n-p+1, n-p+2, \ldots, n\};$$
this in turn is maximized when $p = 1$, i.e., $\alpha = \{1\}$, $\beta = \{n\}$. We conclude that if $m = n-1$, then the Binet-Cauchy expansion for any minor of $\mathbf{B}$ will contain one product consisting entirely of quasi-principal minors of $\mathbf{A}$; $\mathbf{B}$ is TP; $\mathbf{A}$ is O.

We conclude this section by analyzing how oscillatory matrices relate to the Jacobi matrices which occupied our attention in earlier chapters. We defined a
Jacobi matrix in Section 3.1: $\mathbf{J} \in S_n$, $\mathbf{J}$ is PSD, and $\mathbf{J}$ has negative co-diagonal. $\mathbf{J}$ is clearly not O, but

Theorem 6.7.4 If $\mathbf{J}$ is PD, then $\mathbf{A} = \tilde{\mathbf{J}} = \mathbf{Z}\mathbf{J}\mathbf{Z}$ is O.

Proof. We recall that $\mathbf{Z} = \mathrm{diag}(1, -1, 1, \ldots, (-)^{n-1})$, so that in the notation of equation (3.1.4),
$$a_{i,i+1} = a_{i+1,i} = b_i > 0.$$
According to Theorem 6.7.3 it is sufficient to show that $\mathbf{A} \equiv \tilde{\mathbf{J}}$ is TN. Consider a minor $A(\alpha;\beta)$. There are three cases:

1) $\alpha = \beta$, then $A(\alpha;\alpha) > 0$ since $\mathbf{A}$ is PD.

2) $D(\alpha, \beta) = 1$, i.e., $A(\alpha;\beta)$ is quasi-principal, thus it may be expressed as a product of principal minors and $b$'s; $A(\alpha;\beta) > 0$.

3) $D(\alpha, \beta) > 1$, then $A(\alpha;\beta) = 0$.

For $\mathbf{A} = \tilde{\mathbf{J}}$ only the quasi-principal minors are positive; the others are zero.

If $\mathbf{A} \in M_n$, then $\tilde{\mathbf{A}} = \mathbf{Z}\mathbf{A}\mathbf{Z}$ is called the sign-reverse of $\mathbf{A}$.

Theorem 6.7.5 Suppose $\mathbf{A} \in M_n$. $\mathbf{A}$ is NTN, TP, O iff $(\tilde{\mathbf{A}})^{-1}$ is NTN, TP, O respectively.

Proof. We recall from Section 1.3 that $\mathbf{A}^{-1} = \mathbf{R}$, where $r_{ij} = A_{ji}/\det(\mathbf{A})$, where $A_{ij}$ is given by $A_{ij} = (-)^{i+j}\hat{a}_{ij}$. This means that it is sufficient to show that $\mathbf{A}$ is NTN, TP or O iff $\hat{\mathbf{A}} = (\hat{a}_{ij})$ is NTN, TP, O. But Theorem 6.2.1 shows immediately that $\mathbf{A}$ is NTN or TP iff $\hat{\mathbf{A}}$ is NTN or TP respectively. If $\mathbf{A}$ is O then $\hat{a}_{i,i+1}$ and $\hat{a}_{i+1,i}$ are given by Theorem 6.2.1 as quasi-principal minors of $\mathbf{A}$, and so are positive; $\hat{\mathbf{A}}$ is O; and vice versa, if $\hat{\mathbf{A}}$ is O, then so is $\mathbf{A}$.

If $\tilde{\mathbf{A}}$ is oscillatory we shall say that $\mathbf{A}$ is sign-oscillatory (SO). This implies, in particular, that a non-singular Jacobi matrix is SO.

Corollary 6.7.1 If $\mathbf{A}$ is SO, then $\mathbf{A}^{-1}$ is O.

Exercises 6.7

1. Why is it not sufficient to define $\mathbf{A}$ to be O if, for some $m$, $\mathbf{A}^m$ is TP? Exhibit an example of $\mathbf{A} \in M_2$ such that $\mathbf{A}^2$ is TP but $\mathbf{A}$ is not TN.

2. Show that the distance $D(\alpha, \beta)$ satisfies the basic conditions for a distance: $D(\alpha, \beta) \ge 0$; $D(\alpha, \beta) = 0$ iff $\alpha = \beta$; $D(\alpha, \beta) \le D(\alpha, \gamma) + D(\gamma, \beta)$.

3. Show that if $\mathbf{A} \in M_n$ is tridiagonal, then it is O iff
a) its principal minors are non-negative
b) $a_{i,i+1} > 0$, $a_{i+1,i} > 0$ for $i = 1, 2, \ldots, n-1$
c) it is non-singular.
4. We say that a tridiagonal matrix $\mathbf{A}$ as described in Ex. 6.7.3 has half-bandwidth 1; it has 1 diagonal above, and 1 below, the principal diagonal. Show that if $1 \le p \le n-1$ then $\mathbf{A}^p$ has half-bandwidth $p$.

5. Show that if $a_{i,i+1} \neq 0$, $a_{i+1,i} \neq 0$, then a tridiagonal matrix $\mathbf{A}$ may be symmetrized by using diagonal matrices, i.e., we can find diagonal $\mathbf{C}, \mathbf{D}$ so that $\mathbf{C}\mathbf{A}\mathbf{D}$ is symmetric. Show that this means that an oscillatory tridiagonal matrix may be symmetrized to a $\tilde{\mathbf{J}}$ matrix by using positive diagonals $\mathbf{C}, \mathbf{D}$, i.e., $\mathbf{C}\mathbf{A}\mathbf{D} = \tilde{\mathbf{J}}$.

6. Suppose $\mathbf{A}, \mathbf{B} \in M_n$. Show that if $\mathbf{A}, \mathbf{B}$ are TN, then so is $\mathbf{C} = \mathbf{A}\mathbf{B}$.

7. Show that if $\mathbf{A}, \mathbf{B}$ are O, then so is $\mathbf{C} = \mathbf{A}\mathbf{B}$.

8. Show that if $\mathbf{A} \in M_n$ is O then $\mathbf{A}^{n-1}$ is O.

9. Show that if $\mathbf{A} = \mathbf{J}^{-1}$ then $\mathbf{A}$, which is O by Theorem 6.7.5, is actually a (strictly) positive matrix, i.e., it is full. Note that by Ex. 6.6.2, it is sufficient to show that $a_{1n} > 0$.

10. Show that if $\mathbf{A}$ is O, then the indices of its staircase structure (Section 6.6) satisfy $p_i \ge i+1$, $q_j \ge j+1$.

11. Show that if $\mathbf{A}$ has eigenvalues $\lambda_l$ and eigenvectors $\mathbf{u}_l$, then $\mathbf{B} = (\tilde{\mathbf{A}})^{-1}$ has eigenvalues $\mu_k = 1/\lambda_l$ and eigenvectors $\mathbf{v}_k = \mathbf{Z}\mathbf{u}_l$, where $l = n+1-k$.

12. Exhibit counterexamples to show that if $\mathbf{A}$ is one of TN, TP or O, then a compound matrix $\mathcal{A}_p$ need not have the same property.
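The test of Theorem 6.7.3 (NTN with strictly positive co-diagonals) can be compared, on a small example, with the definition of O (TN with a TP power). The sketch below is illustrative only: the helper names are assumptions made here, the TN check is brute force over all minors, and the matrices are the example (6.7.1) together with a Jacobi matrix whose sign-reverse it is, as in Theorem 6.7.4.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by exact (Fraction) Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def minors(A):
    n = len(A)
    for p in range(1, n + 1):
        for rows in combinations(range(n), p):
            for cols in combinations(range(n), p):
                yield det([[A[i][j] for j in cols] for i in rows])

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def passes_theorem_673(A):
    """NTN plus positive co-diagonals, the criterion of Theorem 6.7.3."""
    n = len(A)
    tn = all(v >= 0 for v in minors(A))
    codiag = all(A[i][i + 1] > 0 and A[i + 1][i] > 0 for i in range(n - 1))
    return tn and det(A) != 0 and codiag

def sign_reverse(A):
    """Z A Z with Z = diag(1, -1, 1, ...): entry (i, j) is (-1)^(i+j) a_ij."""
    n = len(A)
    return [[(-1) ** (i + j) * A[i][j] for j in range(n)] for i in range(n)]

A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]        # matrix (6.7.1)
J = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]    # a PD Jacobi matrix
A2 = matmul(A, A)                            # A^(n-1) with n = 3
criterion = passes_theorem_673(A)
power_is_tp = all(v > 0 for v in minors(A2))
A_is_tp = all(v > 0 for v in minors(A))
```

For this example both routes agree: the criterion holds and $\mathbf{A}^{2}$ is TP although $\mathbf{A}$ itself is not.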
6.8
Totally positive matrices
The matrix $\mathbf{A} \in M_n$ is TP if all its minors are positive. This is equivalent to the statement that all the compound matrices $\mathcal{A}_p$, $p = 1, 2, \ldots, n$, are (strictly) positive. There are
$$P = n^2 + \binom{n}{2}^2 + \binom{n}{3}^2 + \cdots + \binom{n}{n-1}^2 + 1 \qquad (6.8.1)$$
elements to be checked. Using a result due to Fekete (1913) [86], Ando (1987) [4] proved that one need check only a much smaller set of minors. As in Section 6.2, let $Q_{p,n}$ denote the set of strictly increasing sequences $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_p\}$ chosen from $1, 2, \ldots, n$. We write
$$d(\alpha) = \sum_{i=1}^{p-1} (\alpha_{i+1} - \alpha_i - 1)$$
and note that if $\alpha \in Q_{p,n}$, then $d(\alpha) = 0$ iff $\alpha_{i+1} = \alpha_i + 1$ for $i = 1, 2, \ldots, p-1$; i.e., $d(\alpha) = 0$ iff $\alpha_1, \alpha_2, \ldots, \alpha_p$ consists of consecutive integers. We define $Q^0_{k,n}$ as the subset of $Q_{k,n}$ consisting of those $\alpha$ with $d(\alpha) = 0$. Following Theorem 2.5 of Ando (1987) [4] we have
Theorem 6.8.1 $\mathbf{A} \in M_n$ is TP if $A(\alpha;\beta) > 0$ for all $\alpha, \beta \in Q^0_{k,n}$, $k = 1, 2, \ldots, n$.

Proof. Let us prove that
$$A(\alpha;\beta) > 0 \quad \text{for } \alpha, \beta \in Q_{k,n}, \quad k = 1, 2, \ldots, n \qquad (6.8.2)$$
by induction on $k$. When $k = 1$, this is trivial because $Q_{k,n} = Q^0_{k,n}$. Assume that (6.8.2) is true with $k-1$ ($k \ge 2$) in place of $k$. First fix $\alpha \in Q_{k,n}$ with $d(\alpha) = 0$, i.e., $\alpha \in Q^0_{k,n}$, and let us prove (6.8.2) with this $\alpha$ by induction on $\ell = d(\beta)$. When $\ell = 0$ this follows by the assumption of the theorem. Suppose $A(\alpha;\gamma) > 0$ for all minors whenever $\gamma \in Q_{k,n}$ and $d(\gamma) \le \ell-1$, with $\ell \ge 1$. Take $\beta \in Q_{k,n}$ with $d(\beta) = \ell$. Then there is a $p$ such that
$$\beta_1 < p < \beta_k, \quad d(\tau \cup \{\beta_1, p\}) \le \ell-1 \quad \text{and} \quad d(\tau \cup \{p, \beta_k\}) \le \ell-1$$
where $\tau = \{\beta_2, \beta_3, \ldots, \beta_{k-1}\}$. Now use the identity ((1.39) of Ando (1987) [4])
$$A(\omega;\ \tau \cup \{p\})\, A(\alpha;\ \tau \cup \{\beta_1, \beta_k\}) = A(\omega;\ \tau \cup \{\beta_1\})\, A(\alpha;\ \tau \cup \{p, \beta_k\})$$
for any $\omega \in Q_{k-1,n}$ with $\omega \subset \alpha$. It follows from the induction assumptions that the right hand side is positive, as is $A(\omega;\ \tau \cup \{p\})$, so that $A(\alpha;\ \tau \cup \{\beta_1, \beta_k\}) \equiv A(\alpha;\beta) > 0$. This proves (6.8.2) for $\alpha \in Q_{k,n}$ with $d(\alpha) = 0$. Apply the same argument row-wise to conclude that (6.8.2) is generally true.

We may use precisely the same argument to prove the

Corollary 6.8.1 Suppose $\mathbf{A} \in M_n$. If all minors $A(\alpha;\beta) > 0$ for $\alpha, \beta \in Q_{k-1,n}$, and $A(\alpha;\beta) > 0$ for $\alpha, \beta \in Q^0_{k,n}$, then $A(\alpha;\beta) > 0$ for all $\alpha, \beta \in Q_{k,n}$.

This result mirrors the test for a matrix $\mathbf{A} \in S_n$ to be PD; to show that $\mathbf{A}$ is PD, it is sufficient to show that the leading principal minors $D_1, D_2, \ldots, D_n$ are all positive. The importance of the result lies in the fact that, with it, the number of minors to be checked for positivity is much smaller than that given by (6.8.1).

The test in Theorem 6.8.1 determines whether an arbitrary matrix $\mathbf{A} \in M_n$ is TP. If it is known that $\mathbf{A}$ is TN, then one needs to check only a very small number of minors for strict positivity to determine whether $\mathbf{A}$ is TP, as stated in

Theorem 6.8.2 If $\mathbf{A} \in M_n$ is TN, then it is TP if its corner minors are positive.

Proof. The corner minors are the minors
$$A(1, 2, \ldots, p;\ n-p+1, \ldots, n), \quad A(n-p+1, \ldots, n;\ 1, 2, \ldots, p), \quad p = 1, 2, \ldots, n.$$
The result follows immediately from Theorem 6.6.6 and Theorem 6.8.1. Consider a minor $A(\alpha;\beta)$ with $\alpha, \beta \in Q^0_{k,n}$. Suppose $\alpha = \{i, i+1, \ldots, i+k-1\}$, $\beta = \{j, j+1, \ldots, j+k-1\}$. If $i \ge j$ then $A(\alpha;\beta)$ is a principal minor of the corner submatrix $A(n-p+1, \ldots, n;\ 1, 2, \ldots, p)$ with $p = n-(i-j)$. This submatrix is NTN, so that, by Theorem 6.6.6, its
6. Positivity
145
principal minors are positive. If l ? m, then D(; ) is a principal minor of D(1> 2> = = = > s; q s + 1> = = = > q) with s = q (m l). Since D(; ) A 0 for all > 5 T0n>q , n = 1> 2> = = = > q, Theorem 6.8.1 states that D is TP. Exercises 6.8 1. Show that if A is NTN and B is TP, then AB and BA are TP. 2. Show that if slm = exp[n(l m)2 ], then P = (slm ) is TP. See Section 7 of Ando (1987) [4]. 3. Use Ex. 6.8.3 to show that a NTN matrix A may be approximated arbitrarily closely, in the O1 norm (see (6.5.4)) by the TP matrix B = PAP. (Again, see Section 7 of Ando (1987) [4].)
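The saving offered by Theorem 6.8.1 can be seen in a small numerical sketch. The code below is an illustration only, not part of the original text: the function names, the size $n = 4$ and the constant `kappa` (playing the role of $k$ in Ex. 6.8.2) are our own assumptions. It checks only the minors built from consecutive rows and columns, and compares the verdict with a brute-force check of every minor.

```python
from itertools import combinations
from math import comb

import numpy as np


def contiguous_minors_positive(A):
    """Test of Theorem 6.8.1: minors with consecutive rows and columns only."""
    n = A.shape[0]
    for k in range(1, n + 1):            # order of the minor
        for i in range(n - k + 1):       # first row index (0-based)
            for j in range(n - k + 1):   # first column index (0-based)
                if np.linalg.det(A[i:i + k, j:j + k]) <= 0.0:
                    return False
    return True


def all_minors_positive(A):
    """Brute-force check of every minor, for comparison on small matrices."""
    n = A.shape[0]
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                if np.linalg.det(A[np.ix_(rows, cols)]) <= 0.0:
                    return False
    return True


n, kappa = 4, 0.5                        # illustrative values (assumptions)
idx = np.arange(n)
P = np.exp(-kappa * (idx[:, None] - idx[None, :]) ** 2)   # matrix of Ex. 6.8.2

n_contiguous = sum((n - k + 1) ** 2 for k in range(1, n + 1))
n_all = sum(comb(n, k) ** 2 for k in range(1, n + 1))

print(contiguous_minors_positive(P), all_minors_positive(P))
print(n_contiguous, "contiguous minors out of", n_all)
```

A floating-point determinant is of course only a heuristic test of strict positivity; for ill-conditioned matrices an exact or interval computation would be needed.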
6.9 Oscillatory systems of vectors
Before discussing the eigenproperties of totally positive matrices, we need to analyse some sign properties of vectors. Let $u_1, u_2, \ldots, u_n$ be a sequence of real numbers. If some of them are zero we may assign them arbitrarily chosen signs. We can then compute the number of sign changes in the sequence. This number may change, depending on the choice of signs for the zero terms. The greatest and least values of this number are denoted by $S_{\mathbf{u}}^+$ and $S_{\mathbf{u}}^-$ respectively, where $\mathbf{u} = \{u_1, u_2, \ldots, u_n\}$. If $S_{\mathbf{u}}^+ = S_{\mathbf{u}}^-$, we speak of an exact number of sign changes in the sequence, and denote this by $S_{\mathbf{u}}$. Clearly this case can occur only when

1. $u_1, u_n \ne 0$

2. when $u_i = 0$ for some $i$ satisfying $2 \le i \le n-1$, then $u_{i-1}u_{i+1} < 0$, i.e., $u_{i-1}$ and $u_{i+1}$ are both non-zero, and have opposite signs.

In this case $S_{\mathbf{u}}$ is the number of sign changes when the zero terms are removed.

We say that a system of vectors $\mathbf{u}_k = \{u_{1k}, u_{2k}, \ldots, u_{nk}\}$, $k = 1, 2, \ldots, p$, is an oscillatory system if, for any $(c_k)_1^p$ with

$$\sum_{k=1}^{p} c_k^2 > 0, \tag{6.9.1}$$

the vector

$$\mathbf{u} = \sum_{k=1}^{p} c_k \mathbf{u}_k \tag{6.9.2}$$

satisfies $S_{\mathbf{u}}^+ \le p - 1$. Clearly, we need only consider $p \le n$. Taking $p = 1$ we see that $S_{\mathbf{u}_1}^+ = 0$, i.e., $\mathbf{u}_1 > 0$; for $p = 2$, $S_{\mathbf{u}}^+ \le 1$, etc.
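The two counts $S_{\mathbf{u}}^-$ and $S_{\mathbf{u}}^+$ are easily computed. The following is a minimal sketch, with function names of our own choosing: $S_{\mathbf{u}}^-$ is obtained by discarding the zero terms, and $S_{\mathbf{u}}^+$ by a small dynamic programme that tries both signs for every zero term and keeps the larger count.

```python
def s_minus(u):
    """Least number of sign changes: zero terms are simply discarded."""
    signs = [1 if x > 0 else -1 for x in u if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def s_plus(u):
    """Greatest number of sign changes over all sign choices for zero terms."""
    neg_inf = float("-inf")
    # best[s] = largest count so far, given that the current term carries sign s
    best = {1: neg_inf, -1: neg_inf}
    for position, x in enumerate(u):
        if x == 0:
            allowed = (1, -1)            # a zero term may take either sign
        else:
            allowed = (1,) if x > 0 else (-1,)
        new = {1: neg_inf, -1: neg_inf}
        for s in allowed:
            if position == 0:
                new[s] = 0
            else:
                # either keep the previous sign, or change it and count one more
                new[s] = max(best[s], best[-s] + 1)
        best = new
    return max(best.values())


for u in ([1, 0, -1], [1, 0, 1], [1, 0, 0, 1], [0, 0, 0, 0]):
    print(u, s_minus(u), s_plus(u))
```

The first sequence has an exact number of sign changes ($S_{\mathbf{u}}^- = S_{\mathbf{u}}^+ = 1$); in the second the zero term lies between terms of the same sign, so the two counts differ.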
Theorem 6.9.1 The necessary and sufficient condition for the system $(\mathbf{u}_k)_1^p$ to be an oscillatory system is that all the minors $U(\alpha; \theta)$ be different from zero, and have the same sign, for $\alpha \in Q_{p,n}$, $\theta = \{1, 2, \ldots, p\}$.

Proof. The minors in question are

$$U(\alpha_1, \alpha_2, \ldots, \alpha_p;\ 1, 2, \ldots, p). \tag{6.9.3}$$

Remember that $\alpha_1, \alpha_2, \ldots, \alpha_p$ refer to components of the vectors, while $1, 2, \ldots, p$ refer to the vector index $k$. The theorem states that when $p = 1$, $u_{11}, u_{21}, \ldots, u_{n1}$ must all be non-zero and have the same sign; this is certainly equivalent to $S_{\mathbf{u}}^+ = 0$. For $p = 2$, it states that

$$U(1, 2; 1, 2) = \begin{vmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{vmatrix}, \quad U(1, 3; 1, 2) = \begin{vmatrix} u_{11} & u_{12} \\ u_{31} & u_{32} \end{vmatrix}, \quad \ldots, \quad U(n-1, n; 1, 2) = \begin{vmatrix} u_{n-1,1} & u_{n-1,2} \\ u_{n,1} & u_{n,2} \end{vmatrix}$$

are all non-zero and have the same sign.

We first prove the necessity. If a minor (6.9.3) were to vanish, then we could find numbers $(c_k)_1^p$, not all zero, such that

$$\sum_{k=1}^{p} c_k u_{\alpha_j, k} = 0, \qquad j = 1, 2, \ldots, p. \tag{6.9.4}$$

But then the vector $\mathbf{u}$ given by (6.9.2) would have $p$ zero terms $u_{\alpha_1} = 0 = u_{\alpha_2} = \cdots = u_{\alpha_p}$ so that, by Ex. 6.9.1, $S_{\mathbf{u}}^+ \ge p > p - 1$.

In order to show that the minors all have the same sign it is sufficient to show that all minors $U(\beta; \theta)$ for $\beta$ next to $\alpha$ in the sense $D(\alpha; \beta) = 1$ (see equation (6.7.6)) all have the same sign. These are the minors $(U_j)_1^p$, where $U_j = U(\alpha^{(j)}; \theta)$ and

$$\alpha^{(1)} = \{2, 3, \ldots, p+1\}, \qquad \alpha^{(j)} = \{1, 2, \ldots, j-1, j+1, \ldots, p+1\}, \quad j = 2, 3, \ldots, p.$$

These must all have the same sign as $U_{p+1} = U(\theta; \theta)$. Introduce a vector $\mathbf{u}_{p+1}$ such that

$$u_{i,p+1} = \begin{cases} (-)^{j+1} U_{p+1}, & i = j, \\ (-)^{p+1} U_j, & i = p+1, \\ 0, & \text{otherwise.} \end{cases} \tag{6.9.5}$$

Then

$$U(1, 2, \ldots, p+1;\ 1, 2, \ldots, p+1) = (-)^{p+1}\{(-)^{p+1} u_{p+1,p+1} U_{p+1} + (-)^{j} u_{j,p+1} U_j\} = 0$$

so that we can find $(c_k)_1^{p+1}$, not all zero, such that

$$\sum_{k=1}^{p+1} c_k u_{i,k} = 0, \qquad i = 1, 2, \ldots, p+1.$$

But then the vector (6.9.2) will have coordinates

$$u_i = -c_{p+1} u_{i,p+1}, \qquad i = 1, 2, \ldots, p+1.$$

The quantity $c_{p+1}$ cannot be zero, for then $\mathbf{u}$ would have $p+1$ zero terms and hence $S_{\mathbf{u}}^+ \ge p$. Choose $c_{p+1}$ so that $c_{p+1} U_{p+1} > 0$; then, according to (6.9.5),

$$(u_i)_1^{j-1} = 0, \quad u_j = (-)^{j} c_{p+1} U_{p+1}, \quad (u_i)_{j+1}^{p} = 0, \quad u_{p+1} = (-)^{p} c_{p+1} U_j.$$

If $U_j$, $U_{p+1}$ had opposite signs, then $u_j$ would have the sign of $(-)^j$, and $u_{p+1}$ would have the sign of $(-)^{p+1}$. This means that we can assign the signs of the zero $u_i$ so that, for all $i = 1, 2, \ldots, p+1$, $u_i$ has the sign of $(-)^i$. But then $S_{\mathbf{u}}^+ = p$. This proves the necessity.

Now we prove the sufficiency. Suppose that all the minors (6.9.3) were non-zero and had the same sign, which we may take to be positive. We will prove $S_{\mathbf{u}}^+ \le p - 1$, by assuming the contrary, i.e., $S_{\mathbf{u}}^+ \ge p$. If that were so we could find $p + 1$ components $u_{\alpha_1}, u_{\alpha_2}, \ldots, u_{\alpha_{p+1}}$ such that

$$u_{\alpha_j} u_{\alpha_{j+1}} \le 0, \qquad j = 1, 2, \ldots, p. \tag{6.9.6}$$

The $(u_{\alpha_j})_1^{p}$ cannot be simultaneously zero, for then the $(c_k)_1^p$, not all zero, would satisfy equation (6.9.4), the determinant of which is not zero. Now consider the zero determinant

$$\begin{vmatrix} u_{\alpha_1,1} & u_{\alpha_1,2} & \cdots & u_{\alpha_1,p} & u_{\alpha_1} \\ u_{\alpha_2,1} & u_{\alpha_2,2} & \cdots & u_{\alpha_2,p} & u_{\alpha_2} \\ \cdot & \cdot & \cdots & \cdot & \cdot \\ u_{\alpha_{p+1},1} & u_{\alpha_{p+1},2} & \cdots & u_{\alpha_{p+1},p} & u_{\alpha_{p+1}} \end{vmatrix} = 0,$$

and expand it along its last column:

$$\sum_{k=1}^{p+1} u_{\alpha_k} (-)^{p+1+k}\, U(\alpha_1, \alpha_2, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_{p+1};\ 1, 2, \ldots, p) = 0.$$

But this is impossible because the minors are all positive and, by (6.9.6), the quantities $(-)^k u_{\alpha_k}$ all have the same sign, and are not all zero. This completes the proof.

Exercises 6.9

1. Consider the real sequence $u_1, u_2, \ldots, u_n$. Show that if $(u_i)_1^n = 0$ then $S_{\mathbf{u}}^- = 0$, $S_{\mathbf{u}}^+ = n - 1$. Show also that if $p$ $(0 \le p < n)$ of the $u_i$ are zero then $S_{\mathbf{u}}^- \le n - p - 1$ and $p \le S_{\mathbf{u}}^+ \le n - 1$, while if $p$ $(1 < p < n)$ successive $u_i$ are zero then $S_{\mathbf{u}}^+ - S_{\mathbf{u}}^- \ge p$.
6.10 Eigenproperties of TN matrices
Since TN matrices are not necessarily symmetric we cannot immediately assume that their eigenvalues are real; to do so we must make use of their special properties.

Theorem 6.10.1 The eigenvalues of a TP matrix are positive and distinct.

Proof. Suppose that $\mathbf{A} \in M_n$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, possibly complex. We order them in decreasing modulus, i.e., so that $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|$. Since $\mathbf{A}$ is TP, it is positive; Perron's theorem (Theorem 6.5.1) states that $\lambda_1$ is positive and $\lambda_1 > |\lambda_2|$. Since $\mathbf{A}$ is TP, the compound matrix $\mathcal{A}_2$ is positive; its eigenvalues are the products $\lambda_i \lambda_j$, $i, j = 1, 2, \ldots, n$, $i < j$. It too has a positive eigenvalue, greater in magnitude than any other; it must be $\lambda_1 \lambda_2$, so that $\lambda_1 \lambda_2 > 0$ and $\lambda_1 \lambda_2 > |\lambda_1 \lambda_3|$. Thus $\lambda_2 > 0$ and $\lambda_2 > |\lambda_3|$. Now we consider $\mathcal{A}_3$ and deduce that $\lambda_1 \lambda_2 \lambda_3 > 0$ and $\lambda_1 \lambda_2 \lambda_3 > |\lambda_1 \lambda_2 \lambda_4|$, i.e., $\lambda_3 > 0$ and $\lambda_3 > |\lambda_4|$, and so on.

Corollary 6.10.1 The eigenvalues of an oscillatory matrix are positive and distinct.

Proof. For if $\mathbf{A} \in M_n$ is O, then $\mathbf{B} = \mathbf{A}^m$ is TP for all $m \ge n - 1$. But if the eigenvalues of $\mathbf{A}$ are $(\lambda_i)_1^n$, those of $\mathbf{B}$ are $\mu_i = \lambda_i^m$; since $\mu_1 > \mu_2 > \cdots > \mu_n > 0$, and $\lambda_i \ge 0$, we have $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$.

We now show that the eigenvectors of an oscillatory matrix behave exactly like those of a $\tilde{\mathbf{J}}$ matrix, i.e., like those of a Jacobi matrix when the ordering of the eigenvalues is reversed (see the comment at the end of Section 6.1).

Theorem 6.10.2 Suppose $\mathbf{A} \in M_n$ is O, and has eigenvalues $(\lambda_i)_1^n$ satisfying $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$. Let $\mathbf{u}_k = \{u_{1k}, u_{2k}, \ldots, u_{nk}\}$ be an eigenvector corresponding to $\lambda_k$; it is unique apart from a factor. Let

$$\mathbf{u} = \sum_{k=p}^{q} c_k \mathbf{u}_k, \qquad \sum_{k=p}^{q} c_k^2 > 0; \tag{6.10.1}$$
then the number of sign changes among the components of $\mathbf{u}$ for differing $(c_k)_p^q$ satisfies

$$p - 1 \le S_{\mathbf{u}}^- \le S_{\mathbf{u}}^+ \le q - 1. \tag{6.10.2}$$

Proof. Since the eigenvectors of $\mathbf{A}$ are also the eigenvectors of $\mathbf{A}^m$, and since $\mathbf{A}^m$ is TP for some $m \le n - 1$, we lose no generality by assuming that $\mathbf{A}$ is TP. Suppose $1 \le q \le n$, $\alpha = \{i_1, i_2, \ldots, i_q\} \in Q_{q,n}$, $\theta = \{1, 2, \ldots, q\}$. Then the minors $U(\alpha; \theta)$ are the coordinates of the eigenvector of the compound matrix $\mathcal{A}_q$ corresponding to the maximum eigenvalue $\lambda_1 \lambda_2 \cdots \lambda_q$. By Perron's theorem all the components of this eigenvector have the same sign. If the sign of the $q$-th set of minors is $E_q$ then, by multiplying the vectors $(\mathbf{u}_k)_1^n$ by $E_1, E_2/E_1, \ldots, E_n/E_{n-1}$ respectively, we can make

$$U(\alpha; \theta) > 0, \qquad q = 1, 2, \ldots, n.$$

Theorem 6.9.1 now shows that $S_{\mathbf{u}}^+ \le q - 1$.

To prove the second part of the theorem we put $\mathbf{B} = (\tilde{\mathbf{A}})^{-1}$. Theorem 6.7.5 shows that if $\mathbf{A}$ is TP, then so is $\mathbf{B}$, and Ex. 6.7.11 shows that it has eigenvalues $\mu_k = 1/\lambda_l$ and eigenvectors $\mathbf{v}_k = \mathbf{Z}\mathbf{u}_l$, where $l = n + 1 - k$. Thus

$$\mathbf{v}_k = \{v_{1k}, v_{2k}, \ldots, v_{nk}\} = \{u_{1l}, -u_{2l}, u_{3l}, \ldots, (-)^{n-1} u_{nl}\}.$$

The result already proved shows that the number of sign interchanges in

$$\mathbf{v} = \sum_{k=n+1-q}^{n+1-p} c_{n+1-k} \mathbf{v}_k = \mathbf{Z} \sum_{l=p}^{q} c_l \mathbf{u}_l$$

satisfies $S_{\mathbf{v}}^+ \le n - p$. But since $v_i = (-)^{i-1} u_i$ we have $S_{\mathbf{v}}^+ + S_{\mathbf{u}}^- = n - 1$, so that $S_{\mathbf{u}}^- \ge p - 1$.

Corollary 6.10.2 The vector $\mathbf{u} = \mathbf{u}_k$ has exactly $k - 1$ sign changes ($S_{\mathbf{u}}^- = S_{\mathbf{u}}^+ = k - 1$).

Corollary 6.10.3 $u_{nk} \ne 0$, so that $\mathbf{u}_k$ may be chosen so that $u_{nk} > 0$.

The argument used in this theorem leads directly to

Corollary 6.10.4 For each $p$ such that $1 \le p \le n$, the minors $U(\alpha_1, \alpha_2, \ldots, \alpha_p;\ 1, 2, \ldots, p)$ have the same sign for all $\alpha \in Q_{p,n}$.

The minors of Corollary 6.10.4 relate to components $\alpha_1, \alpha_2, \ldots, \alpha_p$ of the first $p$ eigenvectors. We now prove a result in which components and eigenvalue indices are reversed; this theorem will play a vital role in the inverse problem for the discrete vibrating beam (Chapter 8).

Before stating the theorem we repeat comments we have made on the relation between oscillatory (O) and sign-oscillatory (SO) matrices. If $\mathbf{A}$ is O, with eigenvalues $(\lambda_i)_1^n$ ordered so that $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$, then its eigenvectors $(\mathbf{u}_k)_1^n$ satisfy Theorem 6.10.2 so that, in particular, $\mathbf{u}_k$ has exactly $k - 1$ sign changes. If $\mathbf{A}$ is SO and we label its eigenvalues $(\lambda_i)_1^n$ in reverse order, i.e., so that $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n$, then its eigenvectors $(\mathbf{u}_k)_1^n$ again satisfy Theorem 6.10.2, so that, in particular, $\mathbf{u}_k$ has $k - 1$ sign changes. We will phrase the final theorem of this section for an SO matrix.

Theorem 6.10.3 If $\mathbf{A} \in M_n$ is SO, with eigenvalues $(\lambda_i)_1^n$ satisfying $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n$, then its eigenvectors $(\mathbf{u}_i)_1^n$ may be chosen so that

$$U(\phi; \alpha) > 0 \tag{6.10.3}$$

for $\phi = \{n-p+1, n-p+2, \ldots, n\}$ and each $\alpha \in Q_{p,n}$.
Proof. The analysis of Section 6.3 (see Ex. 6.3.2) shows that $U(\phi; \alpha)$ is the last component of the eigenvector of the compound matrix $\mathcal{A}_p$ corresponding to the $s$th eigenvalue $\lambda_{\alpha_1} \lambda_{\alpha_2} \cdots \lambda_{\alpha_p}$, where $s = s(\alpha_1, \alpha_2, \ldots, \alpha_p)$. The more general statement of Theorem 6.10.3 is that all the elements $U(\phi; \alpha)$ have the same sign, which is thus the sign for the case $p = 1$, i.e., for $u_{ni}$. The proof is by induction on $p$. Corollary 6.10.3 shows that $u_{ni} \ne 0$. Choose $u_{ni} > 0$ for $i = 1, 2, \ldots, n$; the theorem then holds for $p = 1$. Suppose the result holds for $p$. Corollary 6.10.2 shows that $\mathbf{u}_i$ has $i - 1$ sign changes, so that $(-)^{i-1} u_{1i} > 0$. Choose $(c_j)_i^{i+p}$ so that

$$\mathbf{u} = \sum_{j=i}^{i+p} c_j \mathbf{u}_j, \qquad \sum_{j=i}^{i+p} c_j^2 > 0$$

and

$$u_{n-p+1} = 0 = u_{n-p+2} = \cdots = u_n,$$

using the choice

$$c_j = (-)^{j-i}\, U(\phi; \alpha \setminus j), \qquad j = i, i+1, \ldots, i+p,$$

where $\alpha = \{i, i+1, \ldots, i+p\}$, $\phi = \{n-p+1, \ldots, n\}$. The vector $\mathbf{u}$ has the form

$$\mathbf{u} = \{u_1, u_2, \ldots, u_{n-p}, 0, 0, \ldots, 0\}$$

and has first element

$$u_1 = c_i u_{1,i} + c_{i+1} u_{1,i+1} + \cdots + c_{i+p} u_{1,i+p}.$$

Since, by hypothesis, the result is true for $p$, the coefficients $c_j$ satisfy $(-)^{j} c_{i+j} > 0$; this and the inequality $(-)^{i+j-1} u_{1,i+j} > 0$ yield $(-)^{i-1} c_{i+j} u_{1,i+j} > 0$, so that $(-)^{i-1} u_1 > 0$. By Theorem 6.10.2,

$$i - 1 \le S_{\mathbf{u}}^- \le S_{\mathbf{u}}^+ \le p + i - 1$$

and since the last $p$ elements of $\mathbf{u}$ are zero, there must be exactly $i - 1$ sign changes in the first $n - p$ elements of $\mathbf{u}$; but $(-)^{i-1} u_1 > 0$, so that the last non-zero element, $u_{n-p}$, must be positive, i.e.,

$$u_{n-p} = U(n-p, n-p+1, \ldots, n;\ i, i+1, \ldots, i+p) > 0, \qquad i + p \le n.$$

This shows that all $(p+1)$th order minors with consecutive indices $i, i+1, \ldots, i+p$ are positive, and Theorem 6.8.1 shows that all $(p+1)$th order minors are positive.
Exercises 6.10

1. Show that if $\mathbf{u}_j$, $\mathbf{u}_{j+1}$ are eigenvectors of an O or SO matrix $\mathbf{A} \in M_n$, then $u_{n-1,j} u_{n,j+1} - u_{n-1,j+1} u_{n,j}$ is non-zero and has the same sign for $j = 1, 2, \ldots, n-1$.

2. Show that the proof used in Theorem 6.10.3 may be used to show that if $\mathbf{A} \in M_n$ is O with eigenvalues $(\lambda_i)_1^n$ satisfying $0 < \lambda_n < \lambda_{n-1} < \cdots < \lambda_1$, then its eigenvectors $(\mathbf{u}_i)_1^n$ may be chosen so that $U(\phi; \alpha) > 0$ for $\phi = \{n-p+1, \ldots, n\}$ and each $\alpha \in Q_{p,n}$.

3. The matrix

$$\mathbf{A} = \begin{bmatrix} 2 & 1 & & \\ 1 & 2 & 1 & \\ & 1 & 2 & 1 \\ & & 1 & 2 \end{bmatrix}$$

is O. Use the recurrence method described in Section 2.6 to find its eigenvalues $(\lambda_i)_1^4$, labelled so that $0 < \lambda_4 < \lambda_3 < \lambda_2 < \lambda_1$, and its eigenvectors. [Note: the eigenvectors may be written explicitly in terms of $x = \sin(\pi/5)$ and $y = \sin(2\pi/5)$.] Choose the signs of the eigenvectors so that they obey Corollary 6.10.4. Make a different choice so that they obey Ex. 6.10.2.

4. If $\mathbf{u}$ is an eigenvector of $\mathbf{A} \in M_n$, and $\mathbf{T}$ is the reversing matrix given in equation (4.3.8), then $\mathbf{v} = \mathbf{T}\mathbf{u}$ is an eigenvector of $\mathbf{B} = \mathbf{TAT}$ corresponding to the same eigenvalue $\lambda$. Use this result, and Ex. 6.10.2, to show that if $\mathbf{B}$ is O, then

$$V(p, p-1, \ldots, 1;\ \alpha_1, \ldots, \alpha_p) > 0.$$
6.11 u-line analysis

We recall the concept of a u-line corresponding to the vector $\mathbf{u} = \{u_1, u_2, \ldots, u_n\}$, from Section 3.3: it is the broken line made up of the links joining the points with coordinates $(x, y) = (i, u_i)$, so that

$$u(x) = (i + 1 - x)u_i + (x - i)u_{i+1}, \qquad i \le x \le i + 1.$$

Theorem 6.11.1 Let $\mathbf{u}_k$ be an eigenvector corresponding to eigenvalue $\lambda_k$ of an oscillatory matrix $\mathbf{A}$. The corresponding $\mathbf{u}_k$-line, $u^{(k)}(x)$, has no links on the $x$-axis, and has just $k - 1$ nodes, i.e., simple zeros where $u^{(k)}(x)$ changes sign.

Proof. A link of a u-line can lie along the $x$-axis only if two successive $u_i$ are zero, but this is precluded by the Corollary to Theorem 6.10.2. Since $S_{\mathbf{u}} = k - 1$, the u-line has just $k - 1$ nodes.
Corollary 6.11.1 If $\alpha$, $\beta$ are successive nodes of a $\mathbf{u}_k$-line, then $|\alpha - \beta| > 1$.

Theorem 6.11.2 The u-lines corresponding to two successive eigenvectors of an oscillatory matrix cannot have a common node.

Proof. Suppose, if possible, that $u^{(k)}(\alpha) = 0 = u^{(k+1)}(\alpha)$, and put

$$u(x) = c\,u^{(k)}(x) - u^{(k+1)}(x).$$

Theorem 6.10.2 shows that

$$k - 1 \le S_{\mathbf{u}}^- \le S_{\mathbf{u}}^+ \le k. \tag{6.11.1}$$

The Corollary to Theorem 6.11.1 shows that $u^{(k)}(x)$ and $u^{(k+1)}(x)$ will both be non-zero in $(\alpha, \alpha + 1]$. Choose $\beta$ so that $\alpha < \beta \le \alpha + 1$, and put $c = u^{(k+1)}(\beta)/u^{(k)}(\beta)$. Then $u(x)$ will have two zeros, $\alpha$, $\beta$, such that $\alpha < \beta \le \alpha + 1$; it must therefore have a link along the $x$-axis, which means that two successive $u_i$ must vanish. According to Ex. 6.9.1 this means that $S_{\mathbf{u}}^+ - S_{\mathbf{u}}^- \ge 2$, contradicting (6.11.1).

Theorem 6.11.3 The nodes of u-lines corresponding to two successive eigenvectors $\mathbf{u}_k$, $\mathbf{u}_{k+1}$ of an oscillatory matrix interlace.

Proof. Suppose that $\alpha$, $\beta$ are two successive nodes of the $\mathbf{u}_{k+1}$-line; then $u^{(k+1)}(\alpha) = 0 = u^{(k+1)}(\beta)$ and $\beta - \alpha > 1$. Suppose, if possible, that the $\mathbf{u}_k$-line has no node in $(\alpha, \beta)$. Without loss of generality we may assume that

$$u^{(k)}(x) > 0 \ \text{in } [\alpha, \beta], \qquad u^{(k+1)}(x) > 0 \ \text{in } (\alpha, \beta).$$

Put $u(x) = c\,u^{(k)}(x) - u^{(k+1)}(x)$; then

$$k - 1 \le S_{\mathbf{u}}^- \le S_{\mathbf{u}}^+ \le k. \tag{6.11.2}$$

For sufficiently large $c$, $u(x) > 0$ in $[\alpha, \beta]$. Decrease $c$ to a certain value $c_0$ at which $u(x)$ first vanishes at least once, at a point $\gamma$ in $[\alpha, \beta]$. Clearly $c_0 > 0$ and $u_0(x) = c_0 u^{(k)}(x) - u^{(k+1)}(x)$ does not vanish at $\alpha$ or $\beta$, so that $\alpha < \gamma < \beta$. Thus $u_0(x) \ge 0$ in $[\alpha, \beta]$ and $u_0(\gamma) = 0$. The broken line $u_0(x)$ cannot have a complete link on the $x$-axis, for then, as in Theorem 6.11.2, it would be zero at two successive $u_0(i)$ and $S_{\mathbf{u}_0}^+ - S_{\mathbf{u}_0}^- \ge 2$, contradicting (6.11.2). Since $u_0(\gamma) = 0$, and $u_0(x)$ is positive on either side of $\gamma$, $\gamma$ must be a break-point of the $u_0(x)$ line, say $i$, so that

$$u_0(i - 1) > 0, \qquad u_0(i) = 0, \qquad u_0(i + 1) > 0$$

and again $S_{\mathbf{u}_0}^+ - S_{\mathbf{u}_0}^- \ge 2$, contradicting (6.11.2). We conclude that between any two nodes of $u^{(k+1)}(x)$ there must be at least one node of $u^{(k)}(x)$. But $u^{(k)}(x)$ has only $k - 1$ nodes, while $u^{(k+1)}(x)$ has $k$ nodes. Thus $u^{(k)}(x)$ has no more than one node between two nodes of $u^{(k+1)}(x)$, i.e., it has exactly one node there; the two sets of nodes interlace.
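The statements of this section may be observed numerically. The sketch below is an illustration under our own assumptions: it uses the $6 \times 6$ analogue of the matrix of Ex. 6.10.3, and the helper names are hypothetical. It orders the eigenvalues so that $\lambda_1 > \lambda_2 > \cdots > \lambda_n$, finds the nodes of each u-line from the link formula for $u(x)$, and checks the node count, the separation of successive nodes and the interlacing.

```python
import numpy as np


def sign_changes(u):
    """Exact number of sign changes; assumes no component of u is zero."""
    s = np.sign(u)
    return int(np.sum(s[:-1] != s[1:]))


def u_line_nodes(u):
    """Zeros of the broken line joining the points (i, u_i), i = 1, ..., n.

    Assumes no component of u is exactly zero, so that every node lies
    strictly inside a link (true for the example used here)."""
    nodes = []
    for i in range(len(u) - 1):
        a, b = u[i], u[i + 1]
        if a * b < 0:
            # u(x) = (i+1-x) u_i + (x-i) u_{i+1} vanishes at x = i + u_i/(u_i - u_{i+1});
            # the extra +1 converts the 0-based loop index to the 1-based i of the text.
            nodes.append((i + 1) + a / (a - b))
    return nodes


def interlace(inner, outer):
    """True if exactly one element of `inner` lies strictly between each
    pair of successive elements of `outer`."""
    return all(sum(lo < x < hi for x in inner) == 1
               for lo, hi in zip(outer, outer[1:]))


n = 6
A = 2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # tridiagonal (1, 2, 1)

lam, U = np.linalg.eigh(A)
order = np.argsort(lam)[::-1]            # lambda_1 > lambda_2 > ... > lambda_n
lam, U = lam[order], U[:, order]

# node_sets[k] holds the nodes of the u-line of the (k+1)-th eigenvector
node_sets = [u_line_nodes(U[:, k]) for k in range(n)]
for k in range(n):
    print(k + 1, sign_changes(U[:, k]), np.round(node_sets[k], 3))
```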
Chapter 7
Isospectral Systems

We view things not only from different sides, but with different eyes; we have no wish to find them alike.
Pascal's Pensées, 124
7.1 Introduction
We will say that two systems are isospectral if they have the same eigenvalues. (Some authors use the term cospectral.) In our context a 'system' is characterised by a symmetric matrix $\mathbf{A} \in S_n$, or perhaps by two symmetric matrices $\mathbf{M}, \mathbf{K} \in S_n$. In the notation of Section 4.3, two matrices $\mathbf{A}, \mathbf{B} \in S_n$ are said to be isospectral if

$$\sigma(\mathbf{A}) = \sigma(\mathbf{B}) \tag{7.1.1}$$

and two systems $(\mathbf{M}, \mathbf{K})$ and $(\mathbf{M}', \mathbf{K}')$ are said to be isospectral if

$$\sigma(\mathbf{M}, \mathbf{K}) = \sigma(\mathbf{M}', \mathbf{K}'). \tag{7.1.2}$$

We recall that if $\mathbf{M}$, $\mathbf{M}'$ are positive definite, then we may reduce the problem to (7.1.1). In Section 5.2, when discussing matrix transformations, we showed that if $\mathbf{Q}$ is an orthogonal matrix, i.e., one satisfying

$$\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I} \tag{7.1.3}$$

and if

$$\mathbf{B} = \mathbf{Q}\mathbf{A}\mathbf{Q}^T \tag{7.1.4}$$

then $\mathbf{A}$ and $\mathbf{B}$ are isospectral. The converse is true: if $\mathbf{A}$ and $\mathbf{B} \in S_n$ are isospectral, then they are related by (7.1.4) for some $\mathbf{Q}$. To prove this, we may use the general representation of a symmetric matrix given in the Corollary to Theorem 6.3.2. Suppose $\mathbf{A}, \mathbf{B} \in S_n$ have the same eigenvalues $(\lambda_i)_1^n$. Put $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$; then

$$\mathbf{A} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T \quad \text{and} \quad \mathbf{B} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T$$

where both $\mathbf{U}$ and $\mathbf{V}$ are orthogonal. Thus

$$\mathbf{B} = \mathbf{V}\mathbf{U}^T \cdot \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T \cdot \mathbf{U}\mathbf{V}^T = \mathbf{V}\mathbf{U}^T \cdot \mathbf{A} \cdot \mathbf{U}\mathbf{V}^T.$$

But since $\mathbf{U}$, $\mathbf{V}$ are orthogonal, so is $\mathbf{Q} = \mathbf{V}\mathbf{U}^T$ (Ex. 5.2.2). Thus $\mathbf{B} = \mathbf{Q}\mathbf{A}\mathbf{Q}^T$. We recall Ex. 5.2.2, that this transformation defines an equivalence class, an isospectral family of matrices.

This means that, from a purely mathematical viewpoint, the problem of characterizing isospectral systems governed by a single matrix is solved: the matrices $\mathbf{A}$ and $\mathbf{B}$ are linked by some orthogonal matrix $\mathbf{Q}$. However, this result is insufficient for applications to vibrating systems. For there we are concerned with vibrating systems of a particular type, as described for instance in Section 5.1. It may easily be verified that if the matrix $\mathbf{A}$ has a particular form, in the sense that it relates to a particular graph $G$, and if $\mathbf{Q}$ is an arbitrary orthogonal matrix, then $\mathbf{B}$ will not necessarily have the same form, i.e., relate to the same graph $G$. In practice, the conditions on the system matrix are even more stringent; there are conditions on the signs of matrix elements.

This is the question we address in this Chapter: given one system, specified by $\mathbf{A}$ or $(\mathbf{M}, \mathbf{K})$, with the matrices having some particular form, specified by a graph $G$, and perhaps some sign conditions, how can we find other systems $\mathbf{B}$ or $(\mathbf{M}', \mathbf{K}')$ satisfying the same conditions? We do not seek just an isospectral family, but a special isospectral family (i.e., a subfamily), the members of which share certain special characteristics. So far, the results which have been obtained relate to comparatively simple systems. We start our quest by considering the concept of isospectral flow.
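The converse argument of this section, that two isospectral symmetric matrices are linked by $\mathbf{Q} = \mathbf{V}\mathbf{U}^T$, may be illustrated by a short sketch. The eigenvalues, the random seed and the helper `random_orthogonal` are arbitrary choices of ours, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed (assumption)
n = 4
lam = np.array([1.0, 2.0, 3.5, 5.0])     # common eigenvalues (illustrative)


def random_orthogonal(rng, n):
    """Orthogonal factor of the QR factorisation of a random matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q


U = random_orthogonal(rng, n)
V = random_orthogonal(rng, n)
A = U @ np.diag(lam) @ U.T               # A = U Lambda U^T
B = V @ np.diag(lam) @ V.T               # B = V Lambda V^T

Q = V @ U.T                              # the linking orthogonal matrix
print(np.allclose(Q @ Q.T, np.eye(n)))   # Q is orthogonal
print(np.allclose(B, Q @ A @ Q.T))       # B = Q A Q^T, equation (7.1.4)
```

Neither $\mathbf{A}$ nor $\mathbf{B}$ built in this way has any particular zero pattern, which is the point made in the text: an arbitrary $\mathbf{Q}$ does not respect the form of the system.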
7.2 Isospectral flow
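Suppose $\mathbf{A} \in S_n$ has eigenvalues $(\lambda_i)_1^n$ and eigenvectors $(\mathbf{u}_i)_1^n$; then equation (6.3.8) states that if $\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$ and $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, then

$$\mathbf{A} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T, \qquad \mathbf{U}\mathbf{U}^T = \mathbf{U}^T\mathbf{U} = \mathbf{I}. \tag{7.2.1}$$

Now suppose that $\mathbf{U}$ depends on a single parameter $t$, and that $\mathbf{U}(t)$, and hence $\mathbf{A}(t)$, varies in such a way that the eigenvalues, and hence $\boldsymbol{\Lambda}$, remain invariant. Using $\cdot$ to denote $d/dt$, we have

$$\dot{\mathbf{A}} = \dot{\mathbf{U}}\boldsymbol{\Lambda}\mathbf{U}^T + \mathbf{U}\boldsymbol{\Lambda}\dot{\mathbf{U}}^T = (\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T)(\mathbf{U}\dot{\mathbf{U}}^T) + (\dot{\mathbf{U}}\mathbf{U}^T)(\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T). \tag{7.2.2}$$

On differentiating the second equation in (7.2.1) we find

$$\dot{\mathbf{U}}\mathbf{U}^T + \mathbf{U}\dot{\mathbf{U}}^T = \mathbf{0}$$

so that on writing

$$\mathbf{S} = \mathbf{U}\dot{\mathbf{U}}^T \tag{7.2.3}$$

we find

$$\dot{\mathbf{U}}\mathbf{U}^T = -\mathbf{S} = \mathbf{S}^T, \tag{7.2.4}$$

and we can write equation (7.2.2) as

$$\dot{\mathbf{A}} = \mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A}. \tag{7.2.5}$$

This is the differential equation governing isospectral flow. We note from equation (7.2.4) that the matrix $\mathbf{S}$ is skew symmetric. We note also that the differential equation governing $\mathbf{U}$ is

$$\dot{\mathbf{U}} = -\mathbf{S}\mathbf{U},$$

and that because $\mathbf{S}$ is skew symmetric and $\mathbf{A}$ is symmetric, $\dot{\mathbf{A}}$, given by (7.2.5), is symmetric.

We may apply this analysis in reverse. Suppose $\mathbf{S}(t)$ is a skew symmetric matrix, and let $\mathbf{U}(t)$ be the solution of the equation

$$\dot{\mathbf{U}}(t) = -\mathbf{S}(t)\mathbf{U}(t), \qquad \mathbf{U}(0) = \mathbf{U}_0$$

where $\mathbf{U}_0$ is an orthogonal matrix; then

$$(\mathbf{U}^T\mathbf{U})^{\bullet} = \dot{\mathbf{U}}^T\mathbf{U} + \mathbf{U}^T\dot{\mathbf{U}} = \mathbf{U}^T\mathbf{S}\mathbf{U} - \mathbf{U}^T\mathbf{S}\mathbf{U} = \mathbf{0}.$$

But since $\mathbf{U}_0^T\mathbf{U}_0 = \mathbf{I}$, $\mathbf{U}^T(t)\mathbf{U}(t) = \mathbf{I}$ for all $t$; $\mathbf{U}(t)$ is orthogonal. Now, with this $\mathbf{S}(t)$ and $\mathbf{U}(t)$ we consider the equation

$$\dot{\mathbf{A}}(t) = \mathbf{A}(t)\mathbf{S}(t) - \mathbf{S}(t)\mathbf{A}(t), \qquad \mathbf{A}(0) = \mathbf{A}_0$$

where $\mathbf{A}_0 = \mathbf{U}_0\boldsymbol{\Lambda}\mathbf{U}_0^T$. We have

$$(\mathbf{U}^T\mathbf{A}\mathbf{U})^{\bullet} = \dot{\mathbf{U}}^T\mathbf{A}\mathbf{U} + \mathbf{U}^T\dot{\mathbf{A}}\mathbf{U} + \mathbf{U}^T\mathbf{A}\dot{\mathbf{U}} = \mathbf{U}^T\mathbf{S}\mathbf{A}\mathbf{U} + \mathbf{U}^T(\mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A})\mathbf{U} - \mathbf{U}^T\mathbf{A}\mathbf{S}\mathbf{U} = \mathbf{0}$$

so that

$$\mathbf{U}^T\mathbf{A}\mathbf{U} = \mathbf{U}_0^T\mathbf{A}_0\mathbf{U}_0 = \boldsymbol{\Lambda}.$$

Equation (7.2.5) provides a way in which to construct a one-dimensional family, i.e., a trajectory, of isospectral systems, and we will explore its use later. At this point however, we will discuss the connection between equation (7.2.5) and matrix factorisation.

One of the basic procedures of numerical linear algebra is the Gram-Schmidt procedure for orthogonalisation: given a set of vectors $(\mathbf{a}_i)_1^m \in V_n$, construct a set of orthonormal vectors $(\mathbf{q}_i)_1^r \in V_n$ by forming combinations of the $\mathbf{a}_i$. The Gram-Schmidt procedure gives a way to factorise a non-singular matrix $\mathbf{A} \in M_n$. Since $\mathbf{A}$ is non-singular, its columns are linearly independent, and so span $V_n$; the Gram-Schmidt procedure will yield $n$ orthonormal vectors $(\mathbf{q}_i)_1^n$ spanning $V_n$; we obtain the factorisation by writing the $\mathbf{a}_i$'s in terms of the $\mathbf{q}$'s. Let

$$\mathbf{A} = [\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n],$$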
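then we choose $(\mathbf{q}_i)_1^n$ so that

$$\mathbf{a}_m = \sum_{k=1}^{m} r_{km}\mathbf{q}_k, \qquad m = 1, 2, \ldots, n,$$

which we may assemble to give

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdot & \cdot & \cdots & \cdot \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} = \begin{bmatrix} q_{11} & q_{12} & \cdots & q_{1n} \\ q_{21} & q_{22} & \cdots & q_{2n} \\ \cdot & \cdot & \cdots & \cdot \\ q_{n1} & q_{n2} & \cdots & q_{nn} \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \cdots & r_{2n} \\ & & \cdots & \cdot \\ & & & r_{nn} \end{bmatrix},$$

i.e.,

$$\mathbf{A} = \mathbf{Q}\mathbf{R}. \tag{7.2.6}$$

The $\mathbf{q}_i$ and the $r$'s are found in Theorem 3.2.1:

$$r_{11} = \|\mathbf{a}_1\|, \quad \mathbf{q}_1 = \mathbf{a}_1/r_{11}, \quad r_{12} = \mathbf{q}_1^T\mathbf{a}_2, \quad r_{22} = \|\mathbf{a}_2 - r_{12}\mathbf{q}_1\|, \quad \mathbf{q}_2 = (\mathbf{a}_2 - r_{12}\mathbf{q}_1)/r_{22},$$

etc. We note that the diagonal terms $r_{ii}$ are positive.

One of the basic results related to the QR factorisation is that if $\mathbf{A} = \mathbf{Q}\mathbf{R}$, then

$$\mathbf{A}' \equiv \mathbf{R}\mathbf{Q} = \mathbf{Q}^T(\mathbf{Q}\mathbf{R})\mathbf{Q} = \mathbf{Q}^T\mathbf{A}\mathbf{Q} \tag{7.2.7}$$

which means that $\mathbf{A}'$, obtained by reversing the factors $\mathbf{Q}$ and $\mathbf{R}$, is isospectral to $\mathbf{A}$. One of the ways in which the QR algorithm is used in numerical linear algebra is to form a sequence of matrices $\mathbf{A}, \mathbf{A}', \mathbf{A}'', \ldots$ by continually reversing factors:

$$\mathbf{A} = \mathbf{Q}\mathbf{R}, \quad \mathbf{A}' = \mathbf{R}\mathbf{Q} = \mathbf{Q}'\mathbf{R}', \quad \mathbf{A}'' = \mathbf{R}'\mathbf{Q}' = \mathbf{Q}''\mathbf{R}'', \ \ldots \tag{7.2.8}$$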
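The reversal (7.2.7) and the sequence (7.2.8) may be tried out directly. The following is a sketch only: the helper `qr_positive`, which fixes the signs so that the diagonal terms $r_{ii}$ are positive as in the Gram-Schmidt construction, and the $3 \times 3$ starting matrix are our own choices.

```python
import numpy as np


def qr_positive(A):
    """QR factorisation normalised so that the diagonal of R is positive."""
    Q, R = np.linalg.qr(A)
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0
    # (Q S)(S R) = Q R because S = diag(s) satisfies S S = I
    return Q * s, s[:, None] * R


def qr_reversal_sequence(A, steps):
    """Apply the reversal A = QR -> A' = RQ of (7.2.7) `steps` times."""
    for _ in range(steps):
        Q, R = qr_positive(A)
        A = R @ Q
    return A


A0 = np.array([[4.0, 1.0, 0.0],
               [1.0, 3.0, 1.0],
               [0.0, 1.0, 2.0]])         # illustrative symmetric matrix

A1 = qr_reversal_sequence(A0, 1)
A50 = qr_reversal_sequence(A0, 50)

print(np.linalg.eigvalsh(A0))
print(np.linalg.eigvalsh(A1))            # same eigenvalues after one reversal
print(np.round(A50, 6))                  # close to diagonal for this example
```

For this particular matrix the iterates remain symmetric and approach a diagonal matrix; no claim about convergence in general is intended.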
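Under certain conditions, the sequence converges to an upper triangular matrix or, if $\mathbf{A}$ is symmetric, to a diagonal matrix composed of the eigenvalues. We will use the basic reversal (7.2.7) and the sequence (7.2.8) in this book, but we are not interested in the convergence properties of the sequence, for which see Golub and Van Loan (1983) [135].

We now show that, for a special choice of the skew symmetric matrix $\mathbf{S}$, we may relate the sequence (7.2.8) to an isospectral flow. In doing so we will have to retrace some of the steps we have already taken. Suppose $\mathbf{A} \in S_n$, and that $\mathbf{A}^+$ is its strict upper triangle, i.e.,

$$\mathbf{A}^+ = \begin{bmatrix} 0 & a_{12} & \cdots & \cdots & a_{1n} \\ & 0 & a_{23} & \cdots & a_{2n} \\ & & \ddots & & \vdots \\ & & & 0 & a_{n-1,n} \\ & & & & 0 \end{bmatrix}; \tag{7.2.9}$$

then $\mathbf{A}$ may be written

$$\mathbf{A} = \mathbf{A}^+ + \mathbf{A}^{+T} + \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{nn}) = \mathbf{A}^{+T} - \mathbf{A}^+ + 2\mathbf{A}^+ + \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{nn}) = \mathbf{S} + \mathbf{R} \tag{7.2.10}$$

where $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^+$ is skew-symmetric and $\mathbf{R} = 2\mathbf{A}^+ + \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{nn})$ is upper triangular. We note that any symmetric matrix has this unique decomposition into a skew-symmetric matrix and an upper triangular matrix. We now start to retrace our steps:

Lemma 7.2.1 Suppose $\mathbf{S}$ is skew symmetric, and let $\mathbf{Q}$ be the solution to the problem

$$\dot{\mathbf{Q}} = \mathbf{Q}\mathbf{S}, \qquad \mathbf{Q}(0) = \mathbf{I}; \tag{7.2.11}$$

then $\mathbf{Q}$ is an orthogonal matrix.

Proof.

$$(\mathbf{Q}\mathbf{Q}^T)^{\bullet} = \dot{\mathbf{Q}}\mathbf{Q}^T + \mathbf{Q}\dot{\mathbf{Q}}^T = \mathbf{Q}\mathbf{S}\mathbf{Q}^T + \mathbf{Q}\mathbf{S}^T\mathbf{Q}^T = \mathbf{Q}(\mathbf{S} + \mathbf{S}^T)\mathbf{Q}^T = \mathbf{0}.$$

Since $\mathbf{Q}(0)\mathbf{Q}^T(0) = \mathbf{I}$, we have $\mathbf{Q}(t)\mathbf{Q}^T(t) = \mathbf{I}$.

Lemma 7.2.2 Let $\mathbf{A}(t)$ be the solution of the problem

$$\dot{\mathbf{A}} = \mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A}, \qquad \mathbf{A}(0) = \mathbf{A}_0; \tag{7.2.12}$$

then $\mathbf{A}(t) = \mathbf{Q}^T(t)\mathbf{A}_0\mathbf{Q}(t)$, where $\mathbf{Q}(t)$ is as in Lemma 7.2.1.

Proof. Let $\mathbf{Z}(t) = \mathbf{Q}(t)\mathbf{A}(t)\mathbf{Q}^T(t)$; then

$$\dot{\mathbf{Z}} = \dot{\mathbf{Q}}\mathbf{A}\mathbf{Q}^T + \mathbf{Q}\dot{\mathbf{A}}\mathbf{Q}^T + \mathbf{Q}\mathbf{A}\dot{\mathbf{Q}}^T = \mathbf{Q}\mathbf{S}\mathbf{A}\mathbf{Q}^T + \mathbf{Q}(\mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A})\mathbf{Q}^T + \mathbf{Q}\mathbf{A}\mathbf{S}^T\mathbf{Q}^T = \mathbf{Q}\mathbf{A}(\mathbf{S} + \mathbf{S}^T)\mathbf{Q}^T = \mathbf{0}.$$

This shows that $\mathbf{Z}(t) = \mathbf{Z}(0) = \mathbf{A}(0) = \mathbf{A}_0$ so that $\mathbf{Q}\mathbf{A}\mathbf{Q}^T = \mathbf{A}_0$, i.e., $\mathbf{A} = \mathbf{Q}^T\mathbf{A}_0\mathbf{Q}$.

The orthogonal matrix $\mathbf{Q}$ was introduced as 'the solution to the differential equation (7.2.11)'. We now show that we may identify it through a QR factorisation:

Lemma 7.2.3 If the matrix $\exp(t\mathbf{A}_0)$ has the QR-decomposition

$$\exp(t\mathbf{A}_0) = \mathbf{Q}(t)\mathbf{R}(t), \tag{7.2.13}$$

then $\mathbf{Q}(t)$ satisfies equation (7.2.11), and $\mathbf{A}(t) = \mathbf{Q}^T(t)\mathbf{A}_0\mathbf{Q}(t)$ is the solution of (7.2.12).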
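Proof. Here

$$\exp(t\mathbf{A}_0) = \mathbf{I} + t\mathbf{A}_0 + \frac{t^2}{2}\mathbf{A}_0^2 + \cdots \tag{7.2.14}$$

is the solution of the equation

$$\dot{\mathbf{X}}(t) = \mathbf{A}_0\mathbf{X}(t), \qquad \mathbf{X}(0) = \mathbf{I}.$$

Taking derivatives of both sides of (7.2.13), we find

$$(\mathbf{Q}\mathbf{R})^{\bullet} = \dot{\mathbf{Q}}\mathbf{R} + \mathbf{Q}\dot{\mathbf{R}} = \mathbf{A}_0\exp(t\mathbf{A}_0) = \mathbf{A}_0\mathbf{Q}\mathbf{R}$$

so that

$$\dot{\mathbf{Q}} + \mathbf{Q}\dot{\mathbf{R}}\mathbf{R}^{-1} = \mathbf{A}_0\mathbf{Q},$$

and

$$\mathbf{Q}^T\dot{\mathbf{Q}} + \dot{\mathbf{R}}\mathbf{R}^{-1} = \mathbf{Q}^T\mathbf{A}_0\mathbf{Q} = \hat{\mathbf{A}}(t). \tag{7.2.15}$$

But $\hat{\mathbf{A}}(t)$ is a symmetric matrix and $\mathbf{Q}$ is orthogonal, so that $\mathbf{Q}^T\dot{\mathbf{Q}}$ is skew symmetric, and $\dot{\mathbf{R}}\mathbf{R}^{-1}$ is upper triangular: equation (7.2.15) gives the unique decomposition of $\hat{\mathbf{A}}$ as the sum of a skew-symmetric and an upper triangular matrix, i.e.,

$$\mathbf{Q}^T\dot{\mathbf{Q}} = \hat{\mathbf{A}}^{+T} - \hat{\mathbf{A}}^+ = \hat{\mathbf{S}}.$$

On the other hand

$$\dot{\hat{\mathbf{A}}} = \mathbf{Q}^T\mathbf{A}_0\dot{\mathbf{Q}} + \dot{\mathbf{Q}}^T\mathbf{A}_0\mathbf{Q} = (\mathbf{Q}^T\mathbf{A}_0\mathbf{Q})(\mathbf{Q}^T\dot{\mathbf{Q}}) + (\dot{\mathbf{Q}}^T\mathbf{Q})(\mathbf{Q}^T\mathbf{A}_0\mathbf{Q}) = \hat{\mathbf{A}}\hat{\mathbf{S}} - \hat{\mathbf{S}}\hat{\mathbf{A}}$$

and $\hat{\mathbf{A}}(0) = \mathbf{A}_0$. But this means that $\hat{\mathbf{A}}$ satisfies the same differential equation as $\mathbf{A}$, and has the same initial value, $\mathbf{A}_0$;

$$\hat{\mathbf{A}}(t) = \mathbf{A}(t) = \mathbf{Q}^T\mathbf{A}_0\mathbf{Q}.$$

We may now state

Theorem 7.2.1 Suppose $\mathbf{A}(t)$ is the solution to the differential equation (7.2.12). For $k = 1, 2, \ldots$ suppose

$$\exp(\mathbf{A}(k-1)) = \mathbf{Q}_k\mathbf{R}_k;$$

then

$$\exp(\mathbf{A}(k)) = \mathbf{R}_k\mathbf{Q}_k$$

where $\mathbf{Q}_k = \mathbf{Q}(k)$, $\mathbf{R}_k = \mathbf{R}(k)$.

Proof. Lemmas 7.2.2 and 7.2.3 show that $\mathbf{A}(t) = \mathbf{Q}^T(t)\mathbf{A}_0\mathbf{Q}(t)$ and

$$\exp(t\mathbf{A}(0)) = \mathbf{Q}(t)\mathbf{R}(t) \tag{7.2.16}$$

so

$$\mathbf{R}(t)\mathbf{Q}(t) = \mathbf{Q}^T(t)(\mathbf{Q}(t)\mathbf{R}(t))\mathbf{Q}(t) = \mathbf{Q}^T(t)\exp(t\mathbf{A}_0)\mathbf{Q}(t).$$

Now consider

$$\begin{aligned} \exp(t\mathbf{A}(t)) &= \exp(\mathbf{Q}^T(t)\,t\mathbf{A}_0\,\mathbf{Q}(t)) \\ &= \mathbf{I} + \mathbf{Q}^T t\mathbf{A}_0\mathbf{Q} + \frac{(\mathbf{Q}^T t\mathbf{A}_0\mathbf{Q})^2}{2!} + \cdots \\ &= \mathbf{I} + \mathbf{Q}^T t\mathbf{A}_0\mathbf{Q} + \frac{(\mathbf{Q}^T t\mathbf{A}_0\mathbf{Q})(\mathbf{Q}^T t\mathbf{A}_0\mathbf{Q})}{2!} + \cdots \\ &= \mathbf{Q}^T\left(\mathbf{I} + t\mathbf{A}_0 + \frac{t^2\mathbf{A}_0^2}{2!} + \cdots\right)\mathbf{Q} \\ &= \mathbf{Q}^T\exp(t\mathbf{A}_0)\mathbf{Q} = \mathbf{R}(t)\mathbf{Q}(t). \end{aligned} \tag{7.2.17}$$

This means that, taking $t = 1$ in (7.2.16), we have

$$\exp(\mathbf{A}(0)) = \mathbf{Q}_1\mathbf{R}_1$$

and taking $t = 1$ in (7.2.17),

$$\exp\mathbf{A}(1) = \mathbf{R}_1\mathbf{Q}_1 = \mathbf{Q}_2\mathbf{R}_2$$

etc.

We describe this result by saying that the solutions of (7.2.12) at integral times $0, 1, 2, \ldots$ give the iterates in the QR-sequence (7.2.8) starting from $\exp(\mathbf{A}(0)) = \exp\mathbf{A}_0 = \mathbf{Q}_1\mathbf{R}_1$.

We conclude this section with a note on the historical development of the theory of isospectral flow. The analysis had its beginnings in the investigation of the so-called Toda lattice, Toda (1970) [324], a set of $n$ particles constrained to move on a line under exponential repulsive forces. Symes (1980) [315], Symes (1982) [316] gives references to the roots of the problem in Physics, and establishes the theory, basically as described above, for the particular case encountered in the Toda lattice, that $\mathbf{A}$ is a Jacobi matrix. The analysis for a Jacobi matrix was developed further by Nanda (1982) [245], Nanda (1985) [246] and by Deift, Nanda and Tomei (1983) [77]. The generalisation of the theory to an arbitrary complex non-symmetric matrix is due to Chu (1984) [57]. Watkins (1984) [331] gives a survey of the general theory, and its extension to other matrix factorisations such as LR (lower triangular matrix $\mathbf{L}$, multiplied by upper triangular matrix $\mathbf{R}$) or the Cholesky factorisation $\mathbf{L}\mathbf{L}^T$. Chu and Norris (1988) [60] explore the connection between isospectral flows and abstract matrix factorisations. Most of this research is concerned with the connection between isospectral flow and the procedures used in numerical linear algebra; this is not our concern in this book.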
Rather, we are interested in isospectral flow as a way of constructing isospectral systems, as we will show in later sections of this Chapter. We will take up the topic of isospectral flow in Section 7.6 after we have considered algebraic procedures for obtaining isospectral systems.
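Lemma 7.2.3 and Theorem 7.2.1 lend themselves to a numerical check. The sketch below is ours, not the author's procedure: it integrates (7.2.12), with $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^+$, from $t = 0$ to $t = 1$ by a classical fourth-order Runge-Kutta scheme (an assumption of convenience), and compares the result with $\mathbf{Q}^T\mathbf{A}_0\mathbf{Q}$, where $\exp(\mathbf{A}_0) = \mathbf{Q}\mathbf{R}$. The starting matrix, the step count and the helper names are illustrative.

```python
import numpy as np


def skew_part(A):
    """S = A^{+T} - A^{+}, where A^{+} is the strict upper triangle of A."""
    A_plus = np.triu(A, k=1)
    return A_plus.T - A_plus


def flow_rhs(A):
    """Right-hand side of the isospectral flow (7.2.12): AS - SA."""
    S = skew_part(A)
    return A @ S - S @ A


def integrate_flow(A0, t_end=1.0, steps=2000):
    """Classical RK4 integration of dA/dt = AS - SA from 0 to t_end."""
    A = A0.copy()
    h = t_end / steps
    for _ in range(steps):
        k1 = flow_rhs(A)
        k2 = flow_rhs(A + 0.5 * h * k1)
        k3 = flow_rhs(A + 0.5 * h * k2)
        k4 = flow_rhs(A + h * k3)
        A = A + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return A


def expm_sym(A):
    """Exponential of a symmetric matrix via its spectral decomposition."""
    lam, U = np.linalg.eigh(A)
    return (U * np.exp(lam)) @ U.T


def qr_positive(A):
    """QR factorisation normalised so that the diagonal of R is positive."""
    Q, R = np.linalg.qr(A)
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0
    return Q * s, s[:, None] * R


A0 = np.array([[2.0, 1.0, 0.0],
               [1.0, 1.0, 0.5],
               [0.0, 0.5, 0.5]])         # illustrative symmetric matrix

A1_flow = integrate_flow(A0)             # A(1) from the differential equation
Q, R = qr_positive(expm_sym(A0))         # exp(A_0) = Q_1 R_1
A1_qr = Q.T @ A0 @ Q                     # A(1) from Lemma 7.2.3

print(np.max(np.abs(A1_flow - A1_qr)))              # small
print(np.max(np.abs(expm_sym(A1_qr) - R @ Q)))      # exp(A(1)) = R_1 Q_1
```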
7.3 Isospectral Jacobi systems
We follow Gladwell (1995) [121] and start our discussion by considering the particular case of the spring-mass system shown in Figure 4.4.2a and reproduced as Figure 7.3.1.
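[Figure 7.3.1: An in-line spring-mass system]

The governing equation is

$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{y} = \mathbf{0}, \tag{7.3.1}$$

where

$$\mathbf{K} = \begin{bmatrix} k_1 + k_2 & -k_2 & 0 & \cdots & 0 \\ -k_2 & k_2 + k_3 & -k_3 & \cdots & 0 \\ \cdot & \cdot & \cdot & \cdots & \cdot \\ 0 & \cdots & & -k_n & k_n + k_{n+1} \end{bmatrix}, \tag{7.3.2}$$

$$\mathbf{M} = \mathrm{diag}(m_1, m_2, \ldots, m_n). \tag{7.3.3}$$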
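We will assume that the chain of masses and springs is unbroken, so that

$$(k_i)_2^n > 0, \qquad (m_i)_1^n > 0.$$

There are three particular cases:

(S) supported; $k_1 > 0$, $k_{n+1} > 0$

(C) cantilever; $k_1 > 0$, $k_{n+1} = 0$

(F) free; $k_1 = 0$, $k_{n+1} = 0$

If two systems, 1 and 2, are isospectral then, in the notation of Section 4.3,

$$\sigma(\mathbf{M}_1, \mathbf{K}_1) = \sigma(\mathbf{M}_2, \mathbf{K}_2). \tag{7.3.4}$$

There are two almost trivial ways of obtaining an isospectral pair. First, if $c > 0$, then

$$\sigma(c\mathbf{M}, c\mathbf{K}) = \sigma(\mathbf{M}, \mathbf{K}).$$

Secondly, if we physically turn the system around and renumber the masses and springs from the left, then we will not change the eigenvalues. Renumbering is equivalent to pre- and post-multiplying by the matrix $\mathbf{T}$ of equation (4.3.8). Thus (7.3.4) will hold if

$$\mathbf{M}_2 = \mathbf{T}\mathbf{M}_1\mathbf{T}, \qquad \mathbf{K}_2 = \mathbf{T}\mathbf{K}_1\mathbf{T}. \tag{7.3.5}$$

To obtain non-trivial isospectral pairs, we reduce (7.3.1) to standard form. We write

$$\mathbf{M} = \mathbf{D}^2, \qquad \mathbf{D}\mathbf{y} = \mathbf{u}, \qquad \mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}, \tag{7.3.6}$$

so that

$$(\mathbf{J} - \lambda\mathbf{I})\mathbf{u} = \mathbf{0}. \tag{7.3.7}$$

First, consider a cantilever system. Now as in (4.4.7), $\mathbf{K}$ may be factorised as

$$\mathbf{K} = \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T, \qquad \hat{\mathbf{K}} = \mathrm{diag}(k_1, k_2, \ldots, k_n),$$

and

$$\mathbf{J} = \mathbf{D}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{D}^{-1}. \tag{7.3.8}$$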
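To obtain an isospectral pair, we need

Lemma 7.3.1 If $\mathbf{A}, \mathbf{B} \in M_n$, then $\mathbf{AB}$ and $\mathbf{BA}$ have the same eigenvalues, except perhaps for zero.

Proof. Suppose $\lambda \ne 0$ is an eigenvalue of $\mathbf{AB}$, so that, for some $\mathbf{x} \ne \mathbf{0}$, $\mathbf{ABx} = \lambda\mathbf{x}$. Since $\lambda \ne 0$ and $\mathbf{x} \ne \mathbf{0}$, we have $\mathbf{Bx} \ne \mathbf{0}$. Now $\mathbf{B}(\mathbf{ABx}) = \mathbf{BA}(\mathbf{Bx}) = \lambda\mathbf{Bx}$, so that $\mathbf{Bx}$ is an eigenvector of $\mathbf{BA}$ corresponding to the eigenvalue $\lambda$. We have proved that any non-zero eigenvalue of $\mathbf{AB}$ is an eigenvalue of $\mathbf{BA}$. Now reverse the roles of $\mathbf{A}$ and $\mathbf{B}$ to complete the proof.

Write $\hat{\mathbf{K}} = \mathbf{F}^2$, so that

$$\mathbf{J} = (\mathbf{D}^{-1}\mathbf{E}\mathbf{F})(\mathbf{F}\mathbf{E}^T\mathbf{D}^{-1}). \tag{7.3.9}$$

Now apply the Lemma: the eigenvalues of $\mathbf{J}$ are non-zero (in fact, positive) so that if

$$\mathbf{J}' = (\mathbf{F}\mathbf{E}^T\mathbf{D}^{-1})(\mathbf{D}^{-1}\mathbf{E}\mathbf{F}), \tag{7.3.10}$$

then $\sigma(\mathbf{J}') = \sigma(\mathbf{J})$. To form a spring-mass system corresponding to $\mathbf{J}'$ we reverse the reduction to standard form, and write $(\mathbf{J}' - \lambda\mathbf{I})\mathbf{u} = \mathbf{0}$ as

$$(\mathbf{E}^T\mathbf{M}^{-1}\mathbf{E} - \lambda\hat{\mathbf{K}}^{-1})\mathbf{v} = \mathbf{0}, \qquad \mathbf{v} = \mathbf{F}\mathbf{u}. \tag{7.3.11}$$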
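The factor swap (7.3.9)-(7.3.10) can be checked on a small cantilever. In the sketch below the stiffnesses and masses are arbitrary illustrative values, and the form taken for $\mathbf{E}$ (unit diagonal, $-1$ on the superdiagonal) is our assumption about the factorisation quoted from (4.4.7); the code confirms that this $\mathbf{E}$ reproduces the matrix $\mathbf{K}$ of (7.3.2) with $k_{n+1} = 0$.

```python
import numpy as np

k = np.array([3.0, 2.0, 4.0, 1.5])       # k_1, ..., k_n (cantilever: k_{n+1} = 0)
m = np.array([1.0, 2.0, 0.5, 1.2])       # m_1, ..., m_n
n = len(m)

# Assumed form of E: ones on the diagonal, -1 on the superdiagonal.
E = np.eye(n) - np.eye(n, k=1)
K = E @ np.diag(k) @ E.T                 # K = E K_hat E^T

D_inv = np.diag(1.0 / np.sqrt(m))        # M = D^2
F = np.diag(np.sqrt(k))                  # K_hat = F^2

X = D_inv @ E @ F
J = X @ X.T                              # (7.3.9):  J  = (D^-1 E F)(F E^T D^-1)
J_prime = X.T @ X                        # (7.3.10): J' = (F E^T D^-1)(D^-1 E F)

eig_J = np.linalg.eigvalsh(J)
eig_J_prime = np.linalg.eigvalsh(J_prime)

# Eigenvalues of the original pencil (K - lambda M) and of the pencil (7.3.11),
# each rewritten as an ordinary eigenvalue problem.
eig_MK = np.sort(np.real(np.linalg.eigvals(np.diag(1.0 / m) @ K)))
eig_7311 = np.sort(np.real(np.linalg.eigvals(np.diag(k) @ E.T @ np.diag(1.0 / m) @ E)))

print(eig_J)
print(eig_J_prime)
```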
This is the eigenvalue equation for a reversed cantilever. We may verify this by noting that

$$\mathbf{T}\mathbf{E}\mathbf{T} = \mathbf{E}^T, \qquad \mathbf{T}^2 = \mathbf{I},$$

and thus

$$\mathbf{T}\mathbf{E}^T\mathbf{M}^{-1}\mathbf{E}\mathbf{v} = \mathbf{T}\mathbf{E}^T\mathbf{T} \cdot \mathbf{T}\mathbf{M}^{-1}\mathbf{T} \cdot \mathbf{T}\mathbf{E}\mathbf{T} \cdot \mathbf{T}\mathbf{v} = \mathbf{E}\hat{\mathbf{K}}'\mathbf{E}^T \cdot \mathbf{T}\mathbf{v},$$

so that we may write equation (7.3.11) as

$$(\mathbf{K}' - \lambda\mathbf{M}')\mathbf{T}\mathbf{v} = \mathbf{0},$$

where

$$\mathbf{K}' = \mathbf{E}\hat{\mathbf{K}}'\mathbf{E}^T, \qquad \hat{\mathbf{K}}' = \mathbf{T}\mathbf{M}^{-1}\mathbf{T}, \qquad \mathbf{M}' = \mathbf{T}\hat{\mathbf{K}}^{-1}\mathbf{T}.$$

This system relates to a cantilever with

$$k_i' = m_{n-i+1}^{-1}, \qquad m_i' = k_{n-i+1}^{-1}, \qquad i = 1, 2, \ldots, n,$$

and $\sigma(\mathbf{M}', \mathbf{K}') = \sigma(\mathbf{M}, \mathbf{K})$. This pair was pointed out by Ram and Elhay (1995a) [285]. See also Ram and Elhay (1998) [287].

In the analysis we have just described, we started with a system specified by $\mathbf{M}$, $\mathbf{K}$ and formed the Jacobi matrix $\mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}$. This passage from a spring-mass system to a Jacobi matrix is unique, but starting from a given Jacobi matrix we may construct an infinite family of spring-mass systems, as we will now show. The stiffness matrix $\mathbf{K}$ of (7.3.2) has the property

$$\mathbf{K}\{1, 1, 1, \ldots, 1\} = \{k_1, 0, \ldots, 0, k_{n+1}\}; \tag{7.3.12}$$

this equation states that in order to move all the masses statically to the right by unit displacement, we must apply forces $k_1$ and $k_{n+1}$ to masses $m_1$ and $m_n$ respectively. We follow the analysis developed in Section 4.4. Since $\mathbf{J} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}$ we have $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$, so that equation (7.3.12) yields

$$\mathbf{J}\{d_1, d_2, \ldots, d_n\} = \{k_1 d_1^{-1}, 0, \ldots, 0, k_{n+1} d_n^{-1}\}.$$

Thus in order to find a spring-mass system we must take $\mathbf{J}$ and find a solution to the equation

$$\mathbf{J}\mathbf{d} = \{\alpha, 0, \ldots, 0, \beta\}, \qquad \mathbf{d} = \{d_1, d_2, \ldots, d_n\} \tag{7.3.13}$$

where $\alpha \ge 0$, $\beta \ge 0$, $\alpha + \beta > 0$. If $\mathbf{J}$ is non-singular, then Theorem 4.4.1 ensures that $\mathbf{d} > 0$. Thus to construct a spring-mass system we may choose $\alpha$, $\beta$ to be arbitrary non-negative constants, not both zero. This is equivalent to choosing arbitrary spring stiffnesses $k_1$ and $k_{n+1}$; for when we solve equation (7.3.13) we find

$$k_1 = \alpha d_1, \qquad k_{n+1} = \beta d_n; \tag{7.3.14}$$

we have a two-parameter family of isospectral systems. If we demand that the reconstructed system be a cantilever, so that $\beta = 0 = k_{n+1}$, then the solution is essentially unique; we can make it unique by taking $m_1 = 1$ or $\sum_{i=1}^{n} m_i = 1$.
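The construction just described, from a Jacobi matrix to a two-parameter family of spring-mass systems, may be sketched as follows. The function name, the $4 \times 4$ matrix $\mathbf{J}$ and the pairs $(\alpha, \beta)$ are illustrative assumptions; the code solves (7.3.13), forms $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$ and $\mathbf{M} = \mathbf{D}^2$, and checks (7.3.12), (7.3.14) and the eigenvalues.

```python
import numpy as np


def spring_mass_from_jacobi(J, alpha, beta):
    """Solve J d = {alpha, 0, ..., 0, beta} and return K = D J D, the masses d_i^2, and d."""
    n = J.shape[0]
    rhs = np.zeros(n)
    rhs[0] += alpha
    rhs[-1] += beta
    d = np.linalg.solve(J, rhs)
    D = np.diag(d)
    return D @ J @ D, d ** 2, d


n = 4
# Illustrative non-singular Jacobi matrix: positive definite, negative codiagonal.
J = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

results = []
for alpha, beta in [(1.0, 0.0), (1.0, 2.0), (0.5, 3.0)]:
    K, masses, d = spring_mass_from_jacobi(J, alpha, beta)
    k_first, k_last = alpha * d[0], beta * d[-1]          # equation (7.3.14)
    eigs = np.sort(np.real(np.linalg.eigvals(np.diag(1.0 / masses) @ K)))
    results.append((K, masses, d, k_first, k_last, eigs))
    print(alpha, beta, np.round(masses, 4), np.round(eigs, 4))
```

Each pair $(\alpha, \beta)$ gives different masses and stiffnesses but the same eigenvalues; the first pair, with $\beta = 0$, gives a cantilever.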
If $\mathbf{J}$ is singular we use Theorem 4.4.2, which ensures that there is a positive solution of
$$\mathbf{J}\mathbf{d} = \mathbf{0} \tag{7.3.15}$$
and then construct $\mathbf{K} = \mathbf{D}\mathbf{J}\mathbf{D}$, $\mathbf{M} = \mathbf{D}^2$; again the system is essentially unique.

We now discuss two different ways of constructing families of isospectral Jacobi matrices. We let $M(\lambda_1, \lambda_2, \ldots, \lambda_n)$ denote the set of Jacobi matrices $\mathbf{J}$ such that $\sigma(\mathbf{J}) = (\lambda_i)_1^n$.

The first follows directly from the analysis of Section 4.3: we can reconstruct $\mathbf{J}$ uniquely from $\sigma(\mathbf{J}) = (\lambda_i)_1^n$ and the vector $\mathbf{x}_1$ of first components of the normalised eigenvectors $\mathbf{u}_i$ of $\mathbf{J}$. We know that these first components $x_{11}, x_{21}, \ldots, x_{n1}$ are all non-zero, so that we can take them to be all positive, and they satisfy
$$\mathbf{x}_1^T\mathbf{x}_1 = 1 = x_{11}^2 + x_{21}^2 + \cdots + x_{n1}^2. \tag{7.3.16}$$
This means that each $\mathbf{J} \in M$ may be associated with a point $P = (x_{11}, x_{21}, \ldots, x_{n1})$ in the (strictly) positive orthant of the unit $n$-sphere. (In more precise terms, $M$ is a smooth $(n-1)$-dimensional manifold diffeomorphic to the strictly positive orthant of the unit $n$-sphere.)

The second way uses QR factorisation, as discussed in Section 7.2. Suppose $\mathbf{A} \in S_n$ and $\mu$ is not an eigenvalue of $\mathbf{A}$. Then $\mathbf{A} - \mu\mathbf{I}$ is non-singular, and so may be factorised:
$$\mathbf{A} - \mu\mathbf{I} = \mathbf{Q}\mathbf{R}. \tag{7.3.17}$$
Here $\mathbf{Q}$ is an orthogonal matrix, and $\mathbf{R}$ is upper triangular with positive diagonal terms $r_{ii}$; this factorisation (7.3.17) is unique. Now form the matrix $\mathbf{A}'$ from the equation
$$\mathbf{A}' - \mu\mathbf{I} = \mathbf{R}\mathbf{Q}. \tag{7.3.18}$$
Equations (7.3.17), (7.3.18) define a transformation $G_\mu : \mathbf{A} \to \mathbf{A}'$. The matrix $\mathbf{A}'$ is symmetric, and is isospectral to $\mathbf{A}$:
$$\mathbf{A}' = \mu\mathbf{I} + \mathbf{R}\mathbf{Q} = \mathbf{Q}^T(\mu\mathbf{I} + \mathbf{Q}\mathbf{R})\mathbf{Q} = \mathbf{Q}^T\mathbf{A}\mathbf{Q}. \tag{7.3.19}$$
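The transformation defined by (7.3.17), (7.3.18) can be tried on a small example. The sketch below is illustrative only: the helper names `matmul`, `qr_positive`, `g_step` and `trace_of_power` are assumptions, and the QR factorisation is done by modified Gram-Schmidt, which returns $\mathbf{R}$ with positive diagonal for a non-singular matrix. Equality of the traces of $\mathbf{A}^p$ and $\mathbf{A}'^p$ for $p = 1, 2, 3$ is used as a simple check of isospectrality for a $3 \times 3$ matrix.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def qr_positive(M):
    """QR by modified Gram-Schmidt; R has a positive diagonal if M is non-singular."""
    n = len(M)
    q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [M[i][j] for i in range(n)]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(q[k] * v[k] for k in range(n))
            v = [v[k] - R[i][j] * q[k] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / R[j][j] for x in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R


def g_step(A, mu):
    """One step A -> A' with A - mu I = QR and A' - mu I = RQ."""
    n = len(A)
    shifted = [[A[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    Q, R = qr_positive(shifted)
    RQ = matmul(R, Q)
    return [[RQ[i][j] + (mu if i == j else 0.0) for j in range(n)] for i in range(n)]


def trace_of_power(A, p):
    P = A
    for _ in range(p - 1):
        P = matmul(P, A)
    return sum(P[i][i] for i in range(len(A)))


# Sample Jacobi matrix (negative codiagonal); mu = 1 is not one of its eigenvalues.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
A1 = g_step(A, mu=1.0)
```

For this sample the result stays symmetric and tridiagonal with negative codiagonal, in line with the theorem proved next.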
We now prove

Theorem 7.3.1 If $\mathbf{A}$ is a Jacobi matrix, then so is $\mathbf{A}'$.

Proof. We first show that if $\mathbf{A}$ is tridiagonal, then so is $\mathbf{A}'$. Equations (7.3.17), (7.3.18) give
$$\mathbf{R}\mathbf{A} = \mathbf{R}(\mu\mathbf{I} + \mathbf{Q}\mathbf{R}) = (\mu\mathbf{I} + \mathbf{R}\mathbf{Q})\mathbf{R} = \mathbf{A}'\mathbf{R}. \tag{7.3.20}$$
This relation between $\mathbf{A}$ and $\mathbf{A}'$ is fundamental, and is often more instructive than (7.3.17), (7.3.18) or (7.3.19). Consider the $i, j$ term in the products on either side of (7.3.20), and take $i \ge j$:
$$\sum_{k=1}^{n} r_{ik} a_{kj} = \sum_{k=1}^{n} a'_{ik} r_{kj}, \qquad j = 1, 2, \ldots, n-1;\ i = j, j+1, \ldots, n. \tag{7.3.21}$$
Since $\mathbf{R}$ is upper triangular, $r_{ik}$ is non-zero only for $k = i, i+1, \ldots, n$. Since $\mathbf{A}$ is tridiagonal, $a_{kj}$ is non-zero only for $k = j-1, j, j+1$. Thus the product on the left is non-zero only for $k$ running from $i$ to $j+1$; it is identically zero if $i \ge j+2$. Since $\mathbf{R}$ is upper triangular, the index $k$ on the right runs from $k = 1, 2, \ldots, j$. Thus
$$\sum_{k=i}^{j+1} r_{ik} a_{kj} = \sum_{k=1}^{j} a'_{ik} r_{kj}. \tag{7.3.22}$$
In particular therefore
$$\sum_{k=1}^{j} a'_{ik} r_{kj} = 0, \qquad j = 1, 2, \ldots, n-2;\ i = j+2, \ldots, n. \tag{7.3.23}$$
Taking $j = 1$ we find $a'_{i1} r_{11} = 0$, and since $r_{11} > 0$,
$$a'_{i1} = 0, \qquad i = 3, \ldots, n.$$
Now take $j = 2$:
$$a'_{i1} r_{12} + a'_{i2} r_{22} = 0, \qquad i = 4, \ldots, n.$$
But $a'_{i1} = 0$ for these values, and $r_{22} > 0$, so that
$$a'_{i2} = 0, \qquad i = 4, \ldots, n.$$
Proceeding in this way we find
$$a'_{ij} = 0, \qquad j = 1, 2, \ldots, n-2;\ i = j+2, \ldots, n. \tag{7.3.24}$$
Thus $\mathbf{A}'$ has only one non-zero diagonal below the principal diagonal. But $\mathbf{A}'$ is symmetric, so that it is tridiagonal.

To show that if $\mathbf{A}$ is Jacobi, then so is $\mathbf{A}'$ we return to equation (7.3.22). Since $\mathbf{A}'$ is tridiagonal, we can rewrite (7.3.22) as
$$\sum_{k=i}^{j+1} r_{ik} a_{kj} = \sum_{k=i-1}^{j} a'_{ik} r_{kj}. \tag{7.3.25}$$
Take $i = j+1$, then each sum has just one term:
$$r_{ii} a_{i,i-1} = a'_{i,i-1} r_{i-1,i-1}, \qquad i = 2, \ldots, n: \tag{7.3.26}$$
if $a_{i,i-1}$ is positive (negative) then so is $a'_{i,i-1}$.

We now suppose $\mathbf{A} = \mathbf{J}$, a Jacobi matrix, and prove

Theorem 7.3.2 The operator $G_\mu$ is commutative when applied to Jacobi matrices:
$$G_\mu G_\nu \mathbf{J} = G_\nu G_\mu \mathbf{J}. \tag{7.3.27}$$
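Proof. Consider the relation between the eigenvectors of $\mathbf{J}$ and $\mathbf{J}'$. Suppose $\mathbf{u}$ is a normalised eigenvector of $\mathbf{J}$:
$$\mathbf{J}\mathbf{u} = \lambda\mathbf{u},$$
then
$$\mathbf{J}'\mathbf{Q}^T\mathbf{u} = (\mathbf{Q}^T\mathbf{J}\mathbf{Q})\mathbf{Q}^T\mathbf{u} = \mathbf{Q}^T\mathbf{J}\mathbf{u} = \lambda\mathbf{Q}^T\mathbf{u},$$
so that $\mathbf{u}' = \mathbf{Q}^T\mathbf{u}$ is a normalised eigenvector of $\mathbf{J}'$. We may express this eigenvector in another way. Since
$$\mathbf{J}\mathbf{u} = (\mathbf{Q}\mathbf{R} + \mu\mathbf{I})\mathbf{u} = \lambda\mathbf{u},$$
we have $\mathbf{Q}\mathbf{R}\mathbf{u} = (\lambda - \mu)\mathbf{u}$, or
$$\mathbf{u}' = \mathbf{Q}^T\mathbf{u} = \frac{\mathbf{R}\mathbf{u}}{\lambda - \mu}. \tag{7.3.28}$$
This equation shows that the last component of the eigenvector $\mathbf{u}'_i$ may be taken to be
$$u'_{ni} = \frac{r_{nn}(\mu)\, u_{ni}}{|\lambda_i - \mu|}. \tag{7.3.29}$$
This shows that, under the operation $G_\mu$, the last components of the eigenvectors are simply multiplied by two terms: one, $r_{nn}(\mu)$, independent of $i$, and the other $|\lambda_i - \mu|^{-1}$. This means that the last components of the normalised eigenvectors of either of the matrices in (7.3.27) will be proportional to
$$\frac{u_{ni}}{|\lambda_i - \mu|\,|\lambda_i - \nu|}. \tag{7.3.30}$$
Since they are proportional, and the sum of the squares of each set is unity, the two sets must be the same. But a Jacobi matrix is uniquely determined by its eigenvalues and the last components of its normalised eigenvectors. Therefore, (7.3.27) holds, and $G_\mu$ is commutative. We prove a stronger result in Theorem 7.4.2.

Theorem 7.3.3 If $\mathbf{A}, \mathbf{B} \in M$, then we can find a unique set $(\mu_i)_1^{n-1}$ such that $\mu_1 < \mu_2 < \cdots < \mu_{n-1}$ and
$$G_{\mu_1} G_{\mu_2} \cdots G_{\mu_{n-1}} \mathbf{A} = \mathbf{B}. \tag{7.3.31}$$
Proof. It is sufficient to show that we can pass from one set of last components $(u_{ni})_1^n$ to any other set $(v_{ni})_1^n$ in $n-1$ $G_\mu$ operations. But equation (7.3.29) shows that this is equivalent to choosing $\mu_1, \mu_2, \ldots, \mu_{n-1}$ such that
$$u_{ni} \prod_{j=1}^{n-1} |\lambda_i - \mu_j|^{-1} \propto v_{ni}, \qquad i = 1, 2, \ldots, n.$$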
This is equivalent to choosing the polynomial
$$P(\lambda) = K \prod_{j=1}^{n-1} (\lambda - \mu_j)$$
such that
$$|P(\lambda_i)| = u_{ni}/v_{ni}, \qquad i = 1, 2, \ldots, n.$$
If we choose the $(\mu_i)_1^{n-1}$ so that
$$\lambda_1 < \mu_1 < \lambda_2 < \cdots < \mu_{n-1} < \lambda_n \tag{7.3.32}$$
then
$$P(\lambda_i) = (-)^{n-i} u_{ni}/v_{ni}, \qquad i = 1, 2, \ldots, n.$$
But there is a unique such polynomial $P(\lambda)$ of degree $n-1$, taking values of opposite signs at $n$ points $\lambda_i$, and it will have $n-1$ roots $\mu_i$ satisfying (7.3.32).

Corollary 7.3.1 If $G_\mu \mathbf{A} = \mathbf{B}$, then we can find $(\mu_i)_1^{n-1}$ such that $G_{\mu_1} G_{\mu_2} \cdots G_{\mu_{n-1}} \mathbf{B} = \mathbf{A}$, and hence find $G_\mu^{-1}$.

Corollary 7.3.2 We can find $(\mu_i)_1^{n-1}$ such that $G_{\mu_1} G_{\mu_2} \cdots G_{\mu_{n-1}} \mathbf{A} = \mathbf{A}$.

Exercises 7.3
1. Consider the free-free case, in which $k_1 = 0 = k_{n+1}$. Use Lemma 7.3.1 to obtain a cantilever which has the same eigenvalues as the original system apart from the zero eigenvalue corresponding to the rigid body mode.
2. Construct a formal inductive proof of equation (7.3.24).
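The proof of Theorem 7.3.3 is constructive, and the shifts $\mu_j$ can be computed directly. The following is a rough sketch under stated assumptions: the function names `interpolating_poly` and `shifts_between`, the sample eigenvalues and the sample eigenvector components are hypothetical. It builds the interpolating polynomial with $P(\lambda_i) = (-)^{n-i} u_{ni}/v_{ni}$ and locates its root in each gap $(\lambda_i, \lambda_{i+1})$ by bisection.

```python
def interpolating_poly(xs, ys):
    """Return the Lagrange interpolant through the points (xs[i], ys[i])."""
    def P(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return P


def shifts_between(lams, u_last, v_last, iters=200):
    """Hypothetical helper: the n-1 shifts mu_j of Theorem 7.3.3.

    lams: eigenvalues in increasing order; u_last, v_last: positive last
    components of the normalised eigenvectors of the two Jacobi matrices.
    """
    n = len(lams)
    # P(lambda_i) = (-1)^(n-i) u_ni / v_ni for i = 1..n (index i is 0-based below).
    ys = [(-1) ** (n - 1 - i) * u_last[i] / v_last[i] for i in range(n)]
    P = interpolating_poly(lams, ys)
    roots = []
    for i in range(n - 1):                 # exactly one sign change per gap
        lo, hi = lams[i], lams[i + 1]
        for _ in range(iters):             # plain bisection
            mid = 0.5 * (lo + hi)
            if (P(lo) < 0) == (P(mid) < 0):
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots, P


# Illustrative data: both vectors have unit length.
lams = [1.0, 2.0, 4.0]
u_last = [2.0 / 3.0, 2.0 / 3.0, 1.0 / 3.0]
v_last = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
mus, P = shifts_between(lams, u_last, v_last)
```

The computed shifts interlace the eigenvalues as in (7.3.32); applying the $n-1$ operations $G_{\mu_j}$ themselves is left out of this sketch.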
7.4 Isospectral oscillatory systems
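In Section 7.3 we considered the operator $G_\mu$ defined by equations (7.3.17) and (7.3.18). We showed, amongst other things, that if $\mathbf{J}$ is tridiagonal, then so is $\mathbf{J}'$; if $a_{i+1,i} < 0\ (> 0)$, then $a'_{i+1,i} < 0\ (> 0)$. We recall from Section 6.6 that a positive-definite (symmetric) tridiagonal matrix with positive co-diagonal is a particular example of an oscillatory matrix, as defined at the beginning of Section 6.6, and characterised by Theorem 6.7.3. This means that if $\mathbf{A}$ is a symmetric tridiagonal oscillatory matrix, $\mu$ is not an eigenvalue of $\mathbf{A}$, and the diagonal elements of $\mathbf{R}$ are positive, then the operations
$$\mathbf{A} - \mu\mathbf{I} = \mathbf{Q}\mathbf{R} \tag{7.4.1}$$
$$\mathbf{A}' - \mu\mathbf{I} = \mathbf{R}\mathbf{Q} \tag{7.4.2}$$
yield a new matrix $\mathbf{A}'$ that is symmetric, tridiagonal and oscillatory. Following Gladwell (1998) [126] we will now state that this is a special case of a general result:

Theorem 7.4.1 Suppose $\mathbf{A} \in S_n$, let $P$ denote one of the properties NTN, O, TP, and let $\mathbf{A}'$ be defined from equations (7.4.1), (7.4.2). Then $\mathbf{A}'$ has property $P$ iff $\mathbf{A}$ has property $P$.

This Theorem states that $\mathbf{A}'$ is NTN iff $\mathbf{A}$ is NTN, $\mathbf{A}'$ is O iff $\mathbf{A}$ is O, and $\mathbf{A}'$ is TP iff $\mathbf{A}$ is TP. Implicit in the theorem is the condition that the diagonal elements of $\mathbf{R}$, which are necessarily non-zero because $\mathbf{A} - \mu\mathbf{I}$ is non-singular, are chosen to be positive.

The two conditions, that $\mathbf{A}$ is symmetric ($\mathbf{A} \in S_n$), and $\mu$ is not an eigenvalue of $\mathbf{A}$, are essential, as we now show by counterexamples. Take $\mu = 0$ and
$$\mathbf{A} = \begin{bmatrix} 2 & a \\ 1 & 2 \end{bmatrix} \tag{7.4.3}$$
then
$$\mathbf{Q} = \frac{1}{\sqrt{5}}\begin{bmatrix} 2 & -1 \\ 1 & 2 \end{bmatrix}, \qquad \mathbf{R} = \frac{1}{\sqrt{5}}\begin{bmatrix} 5 & 2a+2 \\ 0 & 4-a \end{bmatrix}, \qquad \mathbf{A}' = \frac{1}{5}\begin{bmatrix} 12+2a & 4a-1 \\ 4-a & 2(4-a) \end{bmatrix}.$$
If $a = \tfrac{1}{5}$, $\mathbf{A}$ is O and TP, $\mathbf{A}'$ is not even TN; when $a = 0$, $\mathbf{A}$ is NTN and $\mathbf{A}'$ is not TN.

The condition that $\mu$ is not an eigenvalue is essential. For when $a = 1$ the matrix $\mathbf{A}$ in (7.4.3) is O and TP, and its eigenvalues are $\lambda_1 = 3$, $\lambda_2 = 1$. (Recall that when we consider oscillatory matrices we label the eigenvalues in decreasing order.) Take $\mu = 1$, then
$$\mathbf{A} - \mathbf{I} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} c & -c \\ c & c \end{bmatrix}\begin{bmatrix} 2c & 2c \\ 0 & 0 \end{bmatrix}, \qquad c = \frac{1}{\sqrt{2}}, \tag{7.4.4}$$
$$\mathbf{A}' - \mathbf{I} = \begin{bmatrix} 2c & 2c \\ 0 & 0 \end{bmatrix}\begin{bmatrix} c & -c \\ c & c \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \mathbf{A}' = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$
The matrix $\mathbf{A}'$ in (7.4.4) is not oscillatory. In general, if $\mathbf{A} \in S_n$ is O then its eigenvalues are distinct (Corollary to Theorem 6.10.1). This means that if $\mu = \lambda_k$ for some $k$, then $\mathbf{A} - \mu\mathbf{I}$ has rank $n-1$ and $r_{nn} = 0$, but no other $r_{ii}$ is zero. Thus the last row of $\mathbf{A}' - \mu\mathbf{I}$ will be identically zero, in particular $a'_{n,n-1} = 0$, so that, by Theorem 6.7.3, $\mathbf{A}'$ cannot be O.

The proof of Theorem 7.4.1 requires delicate treatment of inequalities. It may be found in Gladwell (1998) [126] and will not be reproduced here. We merely give some hints on the proof. First, it relies on an earlier result of Cryer (1973) [66] for the case $\mu = 0$. See also Cryer (1976) [67]. Cryer's results may be used to show that if $\mathbf{A}$ (not necessarily symmetric) is NTN, O or TP, and $\mathbf{A} = \mathbf{L}\mathbf{U}$ where $\mathbf{L}$ ($\mathbf{U}$) is lower (upper) triangular, then $\mathbf{A}' = \mathbf{U}\mathbf{L}$ is NTN, O or TP respectively. Since $\mathbf{A}$ is PD we may replace QR factorisation for the case $\mu = 0$ by two successive Cholesky $\mathbf{L}\mathbf{L}^T$ factorisations:
$$\mathbf{A} = \mathbf{L}_1\mathbf{L}_1^T, \qquad \mathbf{B} = \mathbf{L}_1^T\mathbf{L}_1 = \mathbf{L}_2\mathbf{L}_2^T, \qquad \mathbf{A}' = \mathbf{L}_2^T\mathbf{L}_2.$$
We write
$$\mathbf{Q} = \mathbf{L}_1\mathbf{L}_2^{-T} = \mathbf{L}_1^{-T}\mathbf{L}_2, \qquad \mathbf{R} = \mathbf{L}_2^T\mathbf{L}_1^T,$$
and note that
$$\mathbf{Q}\mathbf{Q}^T = \mathbf{L}_1\mathbf{L}_2^{-T}(\mathbf{L}_2^{-1}\mathbf{L}_1^T) = \mathbf{I},$$
so that $\mathbf{Q}$ is orthogonal. Now
$$\mathbf{A} = \mathbf{L}_1\mathbf{L}_1^T = (\mathbf{L}_1\mathbf{L}_2^{-T})(\mathbf{L}_2^T\mathbf{L}_1^T) = \mathbf{Q}\mathbf{R},$$
$$\mathbf{A}' = \mathbf{L}_2^T\mathbf{L}_2 = (\mathbf{L}_2^T\mathbf{L}_1^T)(\mathbf{L}_1^{-T}\mathbf{L}_2) = \mathbf{R}\mathbf{Q}.$$
If $\mathbf{A}$ has property $P$, then Cryer's result shows that $\mathbf{B}$ has property $P$, and then again $\mathbf{A}'$ has property $P$.

The proof also relies on the Binet-Cauchy Theorem. Equation (7.3.20) states that
$$\mathbf{R}\mathbf{A} = \mathbf{A}'\mathbf{R}, \tag{7.4.5}$$
so that the Binet-Cauchy Theorem 6.2.4 gives
$$\mathcal{R}_p\mathcal{A}_p = \mathcal{A}'_p\mathcal{R}_p. \tag{7.4.6}$$
We now prove

Lemma 7.4.1
$$\mathcal{R}_p(\mathbf{A}^m)_p = (\mathbf{A}'^m)_p\mathcal{R}_p, \qquad m = 1, 2, \ldots \tag{7.4.7}$$
Proof. The Binet-Cauchy Theorem gives $(\mathbf{A}^m)_p = (\mathcal{A}_p)^m = \mathcal{A}_p^m$ and similarly $(\mathbf{A}'^m)_p = \mathcal{A}'^m_p$. By equation (7.4.6), the result holds for $m = 1$. Suppose it holds for one value, $m$, then
$$\mathcal{A}'^{(m+1)}_p\mathcal{R}_p = \mathcal{A}'_p(\mathcal{A}'^m_p\mathcal{R}_p) = \mathcal{A}'_p(\mathcal{R}_p\mathcal{A}^m_p) = (\mathcal{A}'_p\mathcal{R}_p)\mathcal{A}^m_p = (\mathcal{R}_p\mathcal{A}_p)\mathcal{A}^m_p = \mathcal{R}_p\mathcal{A}^{m+1}_p;$$
the result holds for $m+1$.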
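The equivalence between one unshifted QR step and two successive Cholesky factorisations, used in the hints above, is easy to check numerically. The sketch below is illustrative only; the helper names `cholesky`, `qr_positive`, `matmul` and `transpose` and the sample matrix are assumptions, and modified Gram-Schmidt stands in for any QR routine that returns $\mathbf{R}$ with positive diagonal.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def cholesky(A):
    """Lower-triangular L with A = L L^T, for a positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def qr_positive(M):
    """QR by modified Gram-Schmidt; R has a positive diagonal if M is non-singular."""
    n = len(M)
    q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [M[i][j] for i in range(n)]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(q[k] * v[k] for k in range(n))
            v = [v[k] - R[i][j] * q[k] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / R[j][j] for x in v])
    return [[q_cols[j][i] for j in range(n)] for i in range(n)], R


# Sample PD tridiagonal matrix with positive codiagonal (an oscillatory matrix).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]

L1 = cholesky(A)
B = matmul(transpose(L1), L1)            # B = L1^T L1
L2 = cholesky(B)
A_chol = matmul(transpose(L2), L2)       # A' = L2^T L2

Q, R = qr_positive(A)                    # mu = 0
A_qr = matmul(R, Q)                      # A' = RQ

max_diff = max(abs(A_chol[i][j] - A_qr[i][j]) for i in range(3) for j in range(3))
```

The two routes agree to rounding error because the QR factorisation with positive diagonal is unique.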
Equations (7.4.5)-(7.4.7) generally yield complicated relations between the elements of $\mathbf{A}$ and $\mathbf{A}'$, $\mathcal{A}_p$ and $\mathcal{A}'_p$, but for some important special cases the relations are simple. Consider equation (7.4.5) in element form:
$$\sum_{k=i}^{n} r_{ik} a_{kj} = \sum_{k=1}^{j} a'_{ik} r_{kj}. \tag{7.4.8}$$
If $i = n$ and $j = 1$, there is only one term in each sum:
$$r_{nn} a_{n1} = a'_{n1} r_{11}. \tag{7.4.9}$$
The hypothesis of Theorem 7.4.1 is that $\mathbf{A}$ is NTN (at least). Ex. 6.6.1 states that if $\mathbf{A}$ is NTN and $a_{n1} > 0$, then $\mathbf{A}$ is a positive matrix (strictly positive, but not TP!). In fact, $a_{n1} > 0$ is the first of the conditions in Theorem 6.8.2 for a (symmetric) NTN matrix to be TP: $a_{n1}$ is the first of the corner minors of $\mathbf{A}$, as discussed in Theorem 6.8.2. The general corner minor is $A(\phi; \theta)$ where $\theta = \{1, 2, \ldots, p\}$, $\phi = \{n-p+1, \ldots, n\}$. This is the corner element $N, 1$ in the matrix $\mathcal{A}_p$. Thus equation (7.4.6) gives
$$r_{n-p+1,n-p+1} \cdots r_{nn}\, A(\phi; \theta) = A'(\phi; \theta)\, r_{11} \cdots r_{pp} \tag{7.4.10}$$
so that $A'(\phi; \theta) > 0$ iff $A(\phi; \theta) > 0$. This result, combined with some delicate reasoning, shows that $\mathbf{A}'$ is TP iff $\mathbf{A}$ is TP.

To show that $\mathbf{A}'$ is TN iff $\mathbf{A}$ is TN, we use a result due to Ando (1987) [4], that a TN matrix may be approximated arbitrarily closely, in, say, the $L_1$ norm, by a TP matrix.

Finally, to show that $\mathbf{A}'$ is O iff $\mathbf{A}$ is O we use Lemma 7.4.1. That shows that the corner minors of $\mathbf{A}'^m$ are positive iff the corner minors of $\mathbf{A}^m$ are positive. So if $\mathbf{A}$ is O, it is NTN, and therefore $\mathbf{A}'$ is NTN. Again, if $\mathbf{A}$ is O, $\mathbf{A}^m$ is TP for some $m \le n-1$, its corner minors are positive, so therefore are those of $\mathbf{A}'^m$; $\mathbf{A}'^m$ is TP; $\mathbf{A}'$ is O.

We conclude from Theorem 7.4.1 that the operator $G_\mu$ maintains the properties NTN, O, TP (and SO also) invariant, provided of course that $\mathbf{A}$ is symmetric, $\mu$ is not an eigenvalue of $\mathbf{A}$, and $\mathbf{R}$ has positive diagonal.

In Section 6.6 we showed (Theorem 6.6.3) that an NTN matrix is a staircase matrix. We now prove

Theorem 7.4.2 Suppose $\mathbf{A} \in S_n$ is NTN and is a $\mathbf{p}$-staircase matrix, then $\mathbf{A}' = G_\mu\mathbf{A}$ is also a $\mathbf{p}$-staircase matrix.

Proof. Since $\mathbf{A}'$ is NTN, it is a staircase matrix, say a $\mathbf{p}'$-staircase. The fundamental relation (7.3.21) gives
$$\sum_{k=i}^{p_j} r_{ik} a_{kj} = \sum_{k=1}^{j} a'_{ik} r_{kj}. \tag{7.4.11}$$
We use induction to prove $p'_j = p_j$, $j = 1, 2, \ldots, n$. Take $j = 1$. If $i > p_1$, the L.H.S. is zero, so that $a'_{i1} r_{11} = 0$; $p'_1 \le p_1$. If $i = p_1$ then
$$r_{ii} a_{i1} = a'_{i1} r_{11},$$
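so that $a'_{i1} > 0$; $p'_1 = p_1$. Suppose that $p'_j = p_j$ for $j = 1, 2, \ldots, m-1$. If $j = m$ and $i > p_m$ in (7.4.11) then
$$\sum_{k=1}^{m} a'_{ik} r_{km} = 0. \tag{7.4.12}$$
But since $\mathbf{A}'$ is a staircase, $i > p_m$ implies $i > p_k = p'_k$ for $k = 1, 2, \ldots, m-1$, so that there is only one term, the last, in the sum (7.4.12); $a'_{im} = 0$. Thus $p'_m \le p_m$. Now take $j = m$, $i = p_m$, then
$$r_{ii} a_{im} = \sum_{k=1}^{m} a'_{ik} r_{km}.$$
If $p_m > p_{m-1}$, then there is only one term, the last, on the right, and $r_{ii} a_{im} = a'_{im} r_{mm}$ so that $p'_m = p_m$. If $p_m = p_{m-1}$, then the inequalities $p'_m \ge p'_{m-1}$, $p'_m \le p_m$ imply $p'_m = p_m$.

Arbenz and Golub (1995) [12] show that staircase patterns are effectively the only ones invariant under the symmetric QR algorithm.

In Theorem 7.3.2 we showed that the operator $G_\mu$ applied to a Jacobi matrix was commutative. We now show a stronger result.

Theorem 7.4.3 The operator $G_\mu$ is commutative.

Proof. We need to show that $G_{\mu_1}G_{\mu_2} = G_{\mu_2}G_{\mu_1}$. Consider the operations $G_{\mu_1}\mathbf{A} = \mathbf{A}_1$, $G_{\mu_2}\mathbf{A}_1 = \mathbf{A}_2$; $G_{\mu_2}\mathbf{A} = \mathbf{A}_3$, $G_{\mu_1}\mathbf{A}_3 = \mathbf{A}_4$:
$$\mathbf{A} - \mu_1\mathbf{I} = \mathbf{Q}_1\mathbf{R}_1, \qquad \mathbf{A}_1 - \mu_1\mathbf{I} = \mathbf{R}_1\mathbf{Q}_1,$$
$$\mathbf{A}_1 - \mu_2\mathbf{I} = \mathbf{Q}_2\mathbf{R}_2, \qquad \mathbf{A}_2 - \mu_2\mathbf{I} = \mathbf{R}_2\mathbf{Q}_2;$$
$$\mathbf{A} - \mu_2\mathbf{I} = \mathbf{Q}_3\mathbf{R}_3, \qquad \mathbf{A}_3 - \mu_2\mathbf{I} = \mathbf{R}_3\mathbf{Q}_3,$$
$$\mathbf{A}_3 - \mu_1\mathbf{I} = \mathbf{Q}_4\mathbf{R}_4, \qquad \mathbf{A}_4 - \mu_1\mathbf{I} = \mathbf{R}_4\mathbf{Q}_4.$$
These equations give
$$\mathbf{A}_1 - \mu_2\mathbf{I} = \mathbf{Q}_1^T(\mathbf{A} - \mu_2\mathbf{I})\mathbf{Q}_1 = \mathbf{Q}_2\mathbf{R}_2, \quad \text{i.e.,} \quad \mathbf{Q}_1^T\mathbf{Q}_3\mathbf{R}_3\mathbf{Q}_1 = \mathbf{Q}_2\mathbf{R}_2, \tag{7.4.13}$$
$$\mathbf{A}_3 - \mu_1\mathbf{I} = \mathbf{Q}_3^T(\mathbf{A} - \mu_1\mathbf{I})\mathbf{Q}_3 = \mathbf{Q}_4\mathbf{R}_4, \quad \text{i.e.,} \quad \mathbf{Q}_3^T\mathbf{Q}_1\mathbf{R}_1\mathbf{Q}_3 = \mathbf{Q}_4\mathbf{R}_4. \tag{7.4.14}$$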
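Equations (7.4.13), (7.4.14) give
$$\mathbf{Q}_3\mathbf{R}_3 = \mathbf{Q}_1\mathbf{Q}_2\mathbf{R}_2\mathbf{Q}_1^T, \qquad \mathbf{Q}_1\mathbf{R}_1 = \mathbf{Q}_3\mathbf{Q}_4\mathbf{R}_4\mathbf{Q}_3^T.$$
Since $\mathbf{A} - \mu_2\mathbf{I}$ and $\mathbf{A} - \mu_1\mathbf{I}$ commute, we have $\mathbf{Q}_3\mathbf{R}_3\mathbf{Q}_1\mathbf{R}_1 = \mathbf{Q}_1\mathbf{R}_1\mathbf{Q}_3\mathbf{R}_3$, and on substituting the expressions above we find
$$(\mathbf{Q}_1\mathbf{Q}_2\mathbf{R}_2\mathbf{Q}_1^T)(\mathbf{Q}_1\mathbf{R}_1) = (\mathbf{Q}_3\mathbf{Q}_4\mathbf{R}_4\mathbf{Q}_3^T)(\mathbf{Q}_3\mathbf{R}_3),$$
or
$$\mathbf{Q}_1\mathbf{Q}_2\mathbf{R}_2\mathbf{R}_1 = \mathbf{Q}_3\mathbf{Q}_4\mathbf{R}_4\mathbf{R}_3.$$
Now $\mathbf{Q}_1\mathbf{Q}_2$, $\mathbf{Q}_3\mathbf{Q}_4$ are orthogonal matrices while $\mathbf{R}_2\mathbf{R}_1$ and $\mathbf{R}_4\mathbf{R}_3$ are upper triangular with positive diagonal. But a non-singular matrix has a unique factorisation QR (with positive diagonal). Therefore,
$$\mathbf{Q}_1\mathbf{Q}_2 = \mathbf{Q}_3\mathbf{Q}_4, \qquad \mathbf{R}_2\mathbf{R}_1 = \mathbf{R}_4\mathbf{R}_3,$$
so that, since
$$\mathbf{A}_4 = \mathbf{Q}_4^T\mathbf{A}_3\mathbf{Q}_4 = \mathbf{Q}_4^T\mathbf{Q}_3^T\mathbf{A}\mathbf{Q}_3\mathbf{Q}_4,$$
$$\mathbf{A}_2 = \mathbf{Q}_2^T\mathbf{A}_1\mathbf{Q}_2 = \mathbf{Q}_2^T\mathbf{Q}_1^T\mathbf{A}\mathbf{Q}_1\mathbf{Q}_2,$$
we have $\mathbf{A}_4 = \mathbf{A}_2$.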
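The commutativity asserted by Theorem 7.4.3 can be observed on a small full symmetric matrix. This is a minimal numerical sketch, with assumed helper names (`matmul`, `qr_positive`, `g_step`) and an arbitrarily chosen sample matrix; the two shifts are taken outside the Gershgorin interval of the matrix so that neither is an eigenvalue.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def qr_positive(M):
    """QR by modified Gram-Schmidt; R has a positive diagonal if M is non-singular."""
    n = len(M)
    q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [M[i][j] for i in range(n)]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(q[k] * v[k] for k in range(n))
            v = [v[k] - R[i][j] * q[k] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / R[j][j] for x in v])
    return [[q_cols[j][i] for j in range(n)] for i in range(n)], R


def g_step(A, mu):
    """One step A -> A' with A - mu I = QR and A' - mu I = RQ."""
    n = len(A)
    shifted = [[A[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    Q, R = qr_positive(shifted)
    RQ = matmul(R, Q)
    return [[RQ[i][j] + (mu if i == j else 0.0) for j in range(n)] for i in range(n)]


# Symmetric sample matrix; its eigenvalues lie in [0.5, 5.5] by Gershgorin's theorem.
A = [[4.0, 1.0, 0.5], [1.0, 3.0, 1.0], [0.5, 1.0, 2.0]]
mu1, mu2 = 0.0, 6.0

A2 = g_step(g_step(A, mu1), mu2)     # G_{mu2} G_{mu1} A
A4 = g_step(g_step(A, mu2), mu1)     # G_{mu1} G_{mu2} A
max_diff = max(abs(A2[i][j] - A4[i][j]) for i in range(3) for j in range(3))
```

The difference between the two orderings is at rounding level, as the proof predicts.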
7.5 Isospectral beams
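We set up the eigenvalue problem for the (cantilever) beam in Section 2.3:
$$\mathbf{K}\mathbf{y} = \lambda\mathbf{M}\mathbf{y}$$
where
$$\mathbf{K} = \mathbf{E}\mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T, \tag{7.5.1}$$
$$\mathbf{M} = \mathbf{D}^2, \qquad \mathbf{D} = \mathrm{diag}(d_1, d_2, \ldots, d_n). \tag{7.5.2}$$
As usual, we reduce the problem to standard form: $\mathbf{A}\mathbf{u} = \lambda\mathbf{u}$, where
$$\mathbf{A} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}. \tag{7.5.3}$$
First, we obtain a simple isospectral system by using Lemma 7.3.1. Write
$$\hat{\mathbf{K}} = \mathbf{F}^2, \qquad \mathbf{F} = \mathrm{diag}(f_1, f_2, \ldots, f_n)$$
then we may write $\mathbf{A}$ as
$$\mathbf{A} = (\mathbf{D}^{-1}\mathbf{E}\mathbf{L}^{-1}\mathbf{E}\mathbf{F}) \cdot (\mathbf{F}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{D}^{-1}).$$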
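Now apply Lemma 7.3.1; the eigenvalues of $\mathbf{A}$ are non-zero (in fact, positive) so that if
$$\mathbf{A}' = (\mathbf{F}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{D}^{-1}) \cdot (\mathbf{D}^{-1}\mathbf{E}\mathbf{L}^{-1}\mathbf{E}\mathbf{F})$$
then $\sigma(\mathbf{A}') = \sigma(\mathbf{A})$. To form a discrete beam corresponding to $\mathbf{A}'$ we reverse the reduction to standard form, and write $\mathbf{A}'\mathbf{u}' = \lambda\mathbf{u}'$ as
$$\mathbf{K}'\mathbf{y}' = \lambda\mathbf{M}'\mathbf{y}' \tag{7.5.4}$$
where
$$\mathbf{K}' = \mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\hat{\mathbf{K}}'\mathbf{E}\mathbf{L}^{-1}\mathbf{E}, \tag{7.5.5}$$
$$\hat{\mathbf{K}}' = \mathbf{M}^{-1}, \qquad \mathbf{M}' = \hat{\mathbf{K}}^{-1}. \tag{7.5.6}$$
This is the eigenvalue equation for a reversed cantilever, as we may verify just as we did for the spring-mass system in Section 7.3: we operate on (7.5.4) by the reversing matrix $\mathbf{T}$. Thus,
$$\mathbf{T}\mathbf{K}'\mathbf{T} \cdot \mathbf{T}\mathbf{y}' = \lambda\,\mathbf{T}\mathbf{M}'\mathbf{T} \cdot \mathbf{T}\mathbf{y}',$$
where
$$\mathbf{T}\mathbf{K}'\mathbf{T} = \mathbf{T}\mathbf{E}^T\mathbf{T} \cdot \mathbf{T}\mathbf{L}^{-1}\mathbf{T} \cdot \mathbf{T}\mathbf{E}^T\mathbf{T} \cdot \mathbf{T}\hat{\mathbf{K}}'\mathbf{T} \cdot \mathbf{T}\mathbf{E}\mathbf{T} \cdot \mathbf{T}\mathbf{L}^{-1}\mathbf{T} \cdot \mathbf{T}\mathbf{E}\mathbf{T} = \mathbf{E}\mathbf{L}'^{-1}\mathbf{E}\hat{\mathbf{K}}'\mathbf{E}^T\mathbf{L}'^{-1}\mathbf{E}^T, \tag{7.5.7}$$
which has the form (7.5.1) of a cantilever stiffness matrix; on the right the primed matrices denote the reversed quantities. The new cantilever is related to the old by
$$k'_i = m^{-1}_{n-i+1}, \qquad l'_i = l_{n-i+1}, \qquad m'_i = k^{-1}_{n-i+1}. \tag{7.5.8}$$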
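The factor-reversal step based on Lemma 7.3.1 can be illustrated for a three-mass beam. The sketch below rests on several assumptions: the form of $\mathbf{E}$ (taken here as upper bidiagonal with $1$ on the diagonal and $-1$ on the superdiagonal, so that $\mathbf{E}^{-1}$ is upper triangular with unit entries), the sample data, and the helper names are all illustrative. Writing $\mathbf{X} = \mathbf{D}^{-1}\mathbf{E}\mathbf{L}^{-1}\mathbf{E}\mathbf{F}$, it compares $\mathbf{A} = \mathbf{X}\mathbf{X}^T$ with $\mathbf{A}' = \mathbf{X}^T\mathbf{X}$ through the traces of their first three powers.

```python
from functools import reduce


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def diag(v):
    return [[v[i] if i == j else 0.0 for j in range(len(v))] for i in range(len(v))]


def chain(*mats):
    return reduce(matmul, mats)


def trace_of_power(A, p):
    P = A
    for _ in range(p - 1):
        P = matmul(P, A)
    return sum(P[i][i] for i in range(len(A)))


n = 3
# Assumed form of E: 1 on the diagonal, -1 on the superdiagonal.
E = [[1.0 if j == i else -1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
d = [1.0, 2.0, 3.0]          # D = diag(d), M = D^2
lengths = [1.0, 0.5, 2.0]    # L = diag(lengths)
f = [2.0, 1.0, 3.0]          # F = diag(f), K_hat = F^2

X = chain(diag([1.0 / x for x in d]), E, diag([1.0 / x for x in lengths]), E, diag(f))
A = matmul(X, transpose(X))          # A  = X X^T
A_rev = matmul(transpose(X), X)      # A' = X^T X, isospectral to A
```

For this sample both matrices also show the checkerboard sign pattern expected of a sign-oscillatory matrix.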
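To construct a family of isospectral beams, we use the operator $G_\mu$ defined by equations (7.4.1), (7.4.2). We carry out the following steps:
i) start with a beam, defined by $\hat{\mathbf{K}}$, $\mathbf{L}$, $\mathbf{M} = \mathbf{D}^2$.
ii) construct $\mathbf{A}$ as in (7.5.1)-(7.5.3). $\mathbf{A}$ is symmetric, pentadiagonal, and sign-oscillatory.
iii) choose $\mu$, not an eigenvalue of $\mathbf{A}$, and form $\mathbf{A}' = G_\mu\mathbf{A}$; $\mathbf{A}'$ also is symmetric, pentadiagonal and sign-oscillatory.
iv) factorise $\mathbf{A}' = (\mathbf{D}')^{-1}\mathbf{K}'(\mathbf{D}')^{-1}$ and form $\mathbf{M}' = (\mathbf{D}')^2$, $\mathbf{K}' = \mathbf{E}(\mathbf{L}')^{-1}\mathbf{E}\hat{\mathbf{K}}'\mathbf{E}^T(\mathbf{L}')^{-1}\mathbf{E}^T$.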
The only step which needs to be completed is iv). We must show that the new symmetric pentadiagonal sign-oscillatory matrix $\mathbf{A}'$ may be factorised as in (7.5.1)-(7.5.3), with some new positive diagonal matrices $\mathbf{D}'$, $\hat{\mathbf{K}}'$, $\mathbf{L}'$. We first give the gist of the procedure, and afterwards show that it will always work.

The new matrix $\mathbf{A}'$ is related to the new mass and stiffness matrices $\mathbf{K}'$, $\mathbf{M}' = \mathbf{D}'^2$ by equation (7.5.3). We start, as we did with the spring-mass system in Section 7.3, by considering simple static deflection of the beam, as shown in Figure 7.5.1. We apply forces $f_1, f_2$ at masses 1 and 2 so that all the masses have unit deflection. The force-deflection equation is
$$\mathbf{K}'\{1, 1, \ldots, 1\} = \{f_1, -f_2, 0, \ldots, 0\}.$$
But $\mathbf{A}' = \mathbf{D}'^{-1}\mathbf{K}'\mathbf{D}'^{-1}$, so that $\mathbf{K}' = \mathbf{D}'\mathbf{A}'\mathbf{D}'$, and thus
$$\mathbf{D}'\mathbf{A}'\{d'_1, d'_2, \ldots, d'_n\} = \{f_1, -f_2, 0, \ldots, 0\}$$
and
$$\mathbf{A}'\{d'_1, d'_2, \ldots, d'_n\} = \{g_1, -g_2, 0, \ldots, 0\} \tag{7.5.9}$$
where $g_i = f_i/d'_i$, $i = 1, 2$.

Figure 7.5.1 - Two forces $f_1, f_2$ are required to produce unit deflections.

The matrix $\mathbf{A}'$ is SO, so that, by Theorem 6.7.5, $\mathbf{B}' \equiv (\mathbf{A}')^{-1}$ is O. The solution of (7.5.9) is
$$d'_i = b'_{i1} g_1 - b'_{i2} g_2, \qquad i = 1, 2, \ldots, n. \tag{7.5.10}$$
Take $g_1 = 1$; we now show that if $g_2$ is small enough, so that $d'_n$ is positive, then all the $d'_i$ will be positive. For if $0 < g_2 < b'_{n1}/b'_{n2}$, then
$$d'_i > b'_{i1} - b'_{i2} b'_{n1}/b'_{n2} = (b'_{i1} b'_{n2} - b'_{i2} b'_{n1})/b'_{n2} \ge 0,$$
because $\mathbf{B}'$ is O. We will show later that $b'_{i1}, b'_{i2}$ are strictly positive for $i = 1, 2, \ldots, n$, so that the $d'_i$ are strictly positive. Assuming that this is true for the moment, we have now found $\mathbf{d}'$ satisfying (7.5.9) for some $g_1 = 1$, $g_2 > 0$. The vector $\mathbf{d}'$ is the first column of the matrix $\mathbf{D}'\mathbf{E}^{-T}$; $\mathbf{E}^{-1}$ is given in equation (2.2.10). We now show that the matrix
$$\mathbf{C}' = \mathbf{E}^{-1}\mathbf{D}'\mathbf{A}'\mathbf{D}'\mathbf{E}^{-T} \tag{7.5.11}$$
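is a Jacobi matrix. Suppose
$$\mathbf{A}' = \begin{bmatrix}
a'_1 & -b'_1 & c'_1 & & & \\
-b'_1 & a'_2 & -b'_2 & c'_2 & & \\
c'_1 & -b'_2 & a'_3 & -b'_3 & \ddots & \\
 & \ddots & \ddots & \ddots & \ddots & c'_{n-2} \\
 & & \ddots & \ddots & \ddots & -b'_{n-1} \\
 & & & c'_{n-2} & -b'_{n-1} & a'_n
\end{bmatrix}$$
then $\mathbf{A}'\mathbf{D}'\mathbf{E}^{-T}$ has just one diagonal below the principal diagonal; that diagonal has elements $-g_2, -c'_1 d'_1, -c'_2 d'_2, \ldots, -c'_{n-2} d'_{n-2}$. The matrix $\mathbf{E}^{-1}\mathbf{D}'$ is upper triangular, so that $\mathbf{C}' \equiv \mathbf{E}^{-1}\mathbf{D}'(\mathbf{A}'\mathbf{D}'\mathbf{E}^{-T})$ also will have just one diagonal below the principal diagonal. But $\mathbf{C}'$ is symmetric, so that it will also have just one diagonal above the principal diagonal: it is a symmetric tridiagonal matrix with co-diagonal
$$-d'_2 g_2,\ -c'_1 d'_1 d'_3,\ -c'_2 d'_2 d'_4,\ \ldots,\ -c'_{n-2} d'_{n-2} d'_n. \tag{7.5.12}$$
Denote the matrix obtained by deleting rows and columns $1, 2, \ldots, i-1$ of $\mathbf{A}'$ by $\mathbf{A}'_i$ and let $\mathbf{d}'_i = \{0, 0, \ldots, 0, d'_i, d'_{i+1}, \ldots, d'_n\}$, then the diagonal elements of $\mathbf{C}'$ may be written
$$c'_{ii} = \mathbf{d}'^T_i\mathbf{A}'_i\mathbf{d}'_i, \qquad i = 1, 2, \ldots, n. \tag{7.5.13}$$
To show that $\mathbf{C}'$ is a Jacobi matrix, we need to show that it is PSD. Actually, since the original $\mathbf{A}$ was PD, the new $\mathbf{A}'$ is PD, and so is $\mathbf{C}'$, because
$$\mathbf{x}^T\mathbf{C}'\mathbf{x} = (\mathbf{x}^T\mathbf{E}^{-1}\mathbf{D}')\mathbf{A}'(\mathbf{D}'\mathbf{E}^{-T}\mathbf{x}) = \mathbf{y}^T\mathbf{A}'\mathbf{y} > 0.$$
We have constructed a Jacobi matrix $\mathbf{C}'$ from $\mathbf{A}'$. We now use the result obtained in (4.4.7) for the factorisation of a Jacobi matrix. In changed notation we may write
$$\mathbf{C}' = (\mathbf{L}')^{-1}\mathbf{E}\hat{\mathbf{K}}'\mathbf{E}^T(\mathbf{L}')^{-1}, \tag{7.5.14}$$
so that on combining (7.5.11) and (7.5.14) we find
$$\mathbf{A}' = (\mathbf{D}')^{-1}\mathbf{E}(\mathbf{L}')^{-1}\mathbf{E}\hat{\mathbf{K}}'\mathbf{E}^T(\mathbf{L}')^{-1}\mathbf{E}^T(\mathbf{D}')^{-1}, \tag{7.5.15}$$
as required.

We now examine this procedure. We must show that the terms $b'_{i1}, b'_{i2}$ are strictly positive, and that the terms $c'_i$ in the last band of $\mathbf{A}'$, which appear in the codiagonal of $\mathbf{C}'$, are positive. To verify these matters we must return to the $G_\mu$ algorithm, specifically to equations (7.4.5)-(7.4.9). The terms $b_{n1}, b'_{n1}$ are elements of $\mathbf{B} \equiv \mathbf{A}^{-1}$ and $\mathbf{B}' \equiv (\mathbf{A}')^{-1}$ respectively. Taking inverses of the terms on each side of equation (7.4.5) we find
$$\mathbf{R}^{-1}\mathbf{B}' = \mathbf{B}\mathbf{R}^{-1} \tag{7.5.16}$$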
and on equating the $n, 1$ terms we find
$$r_{nn}^{-1} b'_{n1} = b_{n1} r_{11}^{-1}. \tag{7.5.17}$$
The original $\mathbf{A}$ is given by equations (7.5.1)-(7.5.3), so that
$$\mathbf{B} = \mathbf{A}^{-1} = \mathbf{D}\mathbf{E}^{-T}\mathbf{L}\mathbf{E}^{-T}\hat{\mathbf{K}}^{-1}\mathbf{E}^{-1}\mathbf{L}\mathbf{E}^{-1}\mathbf{D}$$
so that, with $\mathbf{E}^{-1}$ given by equation (2.2.10), it is clear that $b_{n1} > 0$ and thus equation (7.5.17) gives $b'_{n1} > 0$. We now show that $b'_{n2} > 0$. The matrix $\mathbf{B}'$ is known to be oscillatory; it is thus TN so that the minor $B'(1, n; 1, 2) \ge 0$; thus
$$\begin{vmatrix} b'_{11} & b'_{12} \\ b'_{n1} & b'_{n2} \end{vmatrix} = b'_{11} b'_{n2} - b'_{n1} b'_{12} \ge 0, \tag{7.5.18}$$
and $b'_{n1} > 0$, $b'_{12} > 0$, $b'_{11} > 0$ imply $b'_{n2} > 0$. We apply a similar argument to show that $b'_{i1} > 0$, $b'_{i2} > 0$:
$$\begin{vmatrix} b'_{i1} & b'_{ii} \\ b'_{n1} & b'_{ni} \end{vmatrix} \ge 0,\ i \ge 2; \qquad \begin{vmatrix} b'_{i2} & b'_{ii} \\ b'_{n2} & b'_{ni} \end{vmatrix} \ge 0,\ i \ge 3 \tag{7.5.19}$$
imply $b'_{i1} > 0$, $b'_{i2} > 0$ respectively. We have proved that the procedure will always yield a vector $\mathbf{d}'$ which is strictly positive. Further discussion and results may be found in Gladwell (2002b) [130].

Exercises 7.5
1. Show that there is a 2-parameter system of isospectral beams corresponding to simple scaling, i.e., in which all the masses are scaled by the same factor, the stiffnesses by another, and the lengths by a third one.
2. The argument used in (7.5.17), (7.5.18) is due to Markham (1970) [221]. Show that if $\mathbf{B}$ is O, and an element $b_{ij}$ with $i > j$, i.e., an element in the lower triangle, is zero, then all the elements below and to the left of $b_{ij}$ are also zero. This implies that if $\mathbf{B}$ is O, then it has staircase structure, as discussed at the end of Section 7.4. Also, if $b_{n1} > 0$ and $b_{1n} > 0$, then $\mathbf{B}$ is a strictly positive matrix.
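The reconstruction in step iv) of this section can be tried on a small example. The following is only a sketch under stated assumptions: $\mathbf{E}$ is taken as upper bidiagonal with $1$ on the diagonal and $-1$ on the superdiagonal (so that $\mathbf{E}^{-1}$ is upper triangular with unit entries), the sign-oscillatory pentadiagonal test matrix is built from made-up beam data, and all helper names are hypothetical. It forms $\mathbf{B} = \mathbf{A}^{-1}$, picks $g_1 = 1$ and $g_2$ inside the admissible interval, computes $\mathbf{d}$ from (7.5.10), and assembles $\mathbf{C} = \mathbf{E}^{-1}\mathbf{D}\mathbf{A}\mathbf{D}\mathbf{E}^{-T}$ as in (7.5.11).

```python
from functools import reduce


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def diag(v):
    return [[v[i] if i == j else 0.0 for j in range(len(v))] for i in range(len(v))]


def chain(*mats):
    return reduce(matmul, mats)


def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def inverse(A):
    n = len(A)
    return transpose([solve(A, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)])


n = 4
E = [[1.0 if j == i else -1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
E_inv = [[1.0 if j >= i else 0.0 for j in range(n)] for i in range(n)]

# Test matrix A = X X^T with X = D^{-1} E L^{-1} E F (illustrative data).
X = chain(diag([1.0, 0.5, 2.0, 1.0]), E, diag([1.0, 2.0, 0.5, 1.0]), E, diag([2.0, 1.0, 3.0, 1.5]))
A = matmul(X, transpose(X))

B = inverse(A)                                           # oscillatory inverse
g2 = 0.5 * B[n - 1][0] / B[n - 1][1]                     # 0 < g2 < b_n1 / b_n2
d = [B[i][0] - B[i][1] * g2 for i in range(n)]           # (7.5.10) with g1 = 1
C = chain(E_inv, diag(d), A, diag(d), transpose(E_inv))  # (7.5.11)
```

For this sample the vector `d` is positive and `C` is tridiagonal with negative codiagonal, so it can be passed on to the Jacobi factorisation (7.5.14).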
7.6 Isospectral finite-element models
In Section 2.4 we showed that a finite-element model of a rod in longitudinal vibration had tridiagonal mass and stiffness matrices, the former with positive codiagonal, the latter with negative. The explicit form of the stiffness matrix was given in Ex. 2.4.2. In this section, following Gladwell (1998) [126], Gladwell (1999) [127], we consider how we can find a finite-element system $\mathbf{M}', \mathbf{K}'$ for a rod which is isospectral to a given finite-element system $\mathbf{M}, \mathbf{K}$ for a rod. We first consider a simple way of constructing an isospectral family $\mathbf{M}', \mathbf{K}'$, and then consider a procedure that will yield a large family. See Gladwell (1997) [125] for an earlier attempt to solve this problem.

For simplicity we consider a cantilever rod, i.e., one that is fixed at the left, free at the right. The eigenvalue equation is
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{y} = \mathbf{0}. \tag{7.6.1}$$
Instead of working with $\mathbf{K}$ and $\mathbf{M}$, we will work with $\tilde{\mathbf{K}} = \mathbf{Z}\mathbf{K}\mathbf{Z}$ and $\mathbf{M}$; both these are tridiagonal with positive codiagonal, i.e., they are oscillatory (O). We factorise them as
$$\tilde{\mathbf{K}} = \mathbf{A}\mathbf{A}^T, \qquad \mathbf{M} = \mathbf{B}\mathbf{B}^T, \tag{7.6.2}$$
where, relying on Cryer (1973) [66], we know that $\mathbf{A}, \mathbf{B}$ are lower bidiagonal with positive codiagonals. When reduced to normal form, the equation (7.6.1) is
$$(\tilde{\mathbf{G}} - \lambda\mathbf{I})\mathbf{u} = \mathbf{0}, \tag{7.6.3}$$
where $\tilde{\mathbf{G}} = \mathbf{B}^{-1}\mathbf{K}\mathbf{B}^{-T}$, i.e., $\mathbf{G} = \tilde{\mathbf{B}}^{-1}\tilde{\mathbf{K}}\tilde{\mathbf{B}}^{-T}$ is O:
$$\mathbf{G} = \tilde{\mathbf{B}}^{-1}\mathbf{A}\mathbf{A}^T\tilde{\mathbf{B}}^{-T}. \tag{7.6.4}$$
Thus one way to obtain an isospectral system $\mathbf{M}', \mathbf{K}'$ is to find lower bidiagonal $\mathbf{C}, \mathbf{D}$ with positive codiagonals such that
$$\tilde{\mathbf{K}}' = \mathbf{C}\mathbf{C}^T, \qquad \mathbf{M}' = \mathbf{D}\mathbf{D}^T, \tag{7.6.5}$$
and
$$\mathbf{G} = \tilde{\mathbf{B}}^{-1}\mathbf{A}\mathbf{A}^T\tilde{\mathbf{B}}^{-T} = \tilde{\mathbf{D}}^{-1}\mathbf{C}\mathbf{C}^T\tilde{\mathbf{D}}^{-T}. \tag{7.6.6}$$
This holds iff
$$\tilde{\mathbf{B}}^{-1}\mathbf{A} = \tilde{\mathbf{D}}^{-1}\mathbf{C}. \tag{7.6.7}$$
Straightforward algebra shows that this implies
$$c_{ii} = v_i a_{ii}, \qquad d_{ii} = v_i b_{ii}, \qquad i = 1, 2, \ldots, n \tag{7.6.8}$$
$$c_{i+1,i} = v_{i+1} a_{i+1,i}, \qquad d_{i+1,i} = v_{i+1} b_{i+1,i}, \qquad i = 2, 3, \ldots, n-1 \tag{7.6.9}$$
where $(v_i)_1^n$ are arbitrary positive constants, and
$$a_{11} d_{21} + b_{11} c_{21} = v_2(a_{11} b_{21} + a_{21} b_{11}) = v_2 p. \tag{7.6.10}$$
The general, positive, solution of (7.6.10) is
$$c_{21} = v_2 p \sin^2\theta / b_{11}, \qquad d_{21} = v_2 p \cos^2\theta / a_{11}, \tag{7.6.11}$$
where $0 < \theta < \pi/2$. This provides an $(n+1)$-parameter family of matrices $\mathbf{M}', \mathbf{K}'$ specified by the $(n+1)$ parameters $(v_i)_1^n$ and $\theta$.
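The $(n+1)$-parameter family defined by (7.6.8)-(7.6.11) can be checked against (7.6.6) on a small example. This is a minimal sketch: the helper names (`tilde`, `solve_lower`, `family_member`, `reduced_matrix`), the sample bidiagonal factors and the parameter values are assumptions, and the tilde operation is taken to be $\tilde{\mathbf{X}} = \mathbf{Z}\mathbf{X}\mathbf{Z}$ with $\mathbf{Z} = \mathrm{diag}(1, -1, 1, \ldots)$, i.e. a sign change on entries with $i + j$ odd.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def tilde(X):
    """Z X Z with Z = diag(1, -1, 1, ...): flips the sign where i + j is odd."""
    return [[(-1) ** (i + j) * X[i][j] for j in range(len(X))] for i in range(len(X))]


def solve_lower(Lo, M):
    """Forward substitution: return X with Lo X = M, Lo lower triangular."""
    n, m = len(Lo), len(M[0])
    X = [[0.0] * m for _ in range(n)]
    for j in range(m):
        for i in range(n):
            s = sum(Lo[i][k] * X[k][j] for k in range(i))
            X[i][j] = (M[i][j] - s) / Lo[i][i]
    return X


def family_member(A, B, v, theta):
    """Hypothetical helper implementing (7.6.8)-(7.6.11) for lower bidiagonal A, B."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        C[i][i] = v[i] * A[i][i]
        D[i][i] = v[i] * B[i][i]
    for i in range(2, n):                      # rows 3..n in 1-based numbering
        C[i][i - 1] = v[i] * A[i][i - 1]
        D[i][i - 1] = v[i] * B[i][i - 1]
    p = A[0][0] * B[1][0] + A[1][0] * B[0][0]
    C[1][0] = v[1] * p * math.sin(theta) ** 2 / B[0][0]
    D[1][0] = v[1] * p * math.cos(theta) ** 2 / A[0][0]
    return C, D


def reduced_matrix(A, B):
    X = solve_lower(tilde(B), A)               # X = B~^{-1} A
    return matmul(X, transpose(X))             # G = X X^T


A = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [0.0, 2.0, 1.0]]
B = [[1.0, 0.0, 0.0], [0.5, 2.0, 0.0], [0.0, 1.0, 1.5]]
C, D = family_member(A, B, v=[1.5, 0.7, 2.0], theta=0.6)
G_old, G_new = reduced_matrix(A, B), reduced_matrix(C, D)
max_diff = max(abs(G_old[i][j] - G_new[i][j]) for i in range(3) for j in range(3))
```

Both pairs of factors give the same reduced matrix $\mathbf{G}$, so the two finite-element systems share their eigenvalues.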
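Unless the parameters $v_i$ are chosen properly, the new matrix $\mathbf{K}' = \tilde{\mathbf{C}}\tilde{\mathbf{C}}^T$ will not have the form of a stiffness matrix of a cantilever finite element model of a rod. Such a matrix, with $\mathbf{K}'$ given in Ex. 2.4.2, has the defining property
$$\mathbf{K}'\{1, 1, 1, \ldots, 1\} = \{k'_1, 0, 0, \ldots, 0\}. \tag{7.6.12}$$
Equations (7.6.8)-(7.6.11) show that $\mathbf{C}$ has the form
$$\mathbf{C} = \mathbf{N}\mathbf{C}_0, \qquad \mathbf{N} = \mathrm{diag}(v_1, v_2, \ldots, v_n)$$
and $\mathbf{C}_0$ depends only on $\theta$. Thus
$$\mathbf{K}' = \mathbf{N}\tilde{\mathbf{C}}_0\tilde{\mathbf{C}}_0^T\mathbf{N}$$
so that equation (7.6.12) yields
$$\mathbf{N}\tilde{\mathbf{C}}_0\tilde{\mathbf{C}}_0^T\mathbf{N}\{1, 1, \ldots, 1\} = \{k'_1, 0, \ldots, 0\},$$
i.e.,
$$\tilde{\mathbf{C}}_0\tilde{\mathbf{C}}_0^T\{v_1, v_2, \ldots, v_n\} = \{k'_1/v_1, 0, \ldots, 0\}. \tag{7.6.13}$$
Since $\tilde{\mathbf{C}}_0\tilde{\mathbf{C}}_0^T$ is a non-singular Jacobi matrix, i.e., it is SO, its inverse is positive. Thus, equation (7.6.13) yields positive $(v_i)_1^n$, apart from a single positive factor.

To obtain a wider family we use the general theory of Section 7.4: we form $\mathbf{G}'$ from
$$\mathbf{G} - \mu\mathbf{I} = \mathbf{Q}\mathbf{R}, \qquad \mathbf{G}' - \mu\mathbf{I} = \mathbf{R}\mathbf{Q}, \tag{7.6.14}$$
so that $\mathbf{G}'$ is O. We must show that if $\mathbf{G}$ can be factorised as in (7.6.4), then $\mathbf{G}'$ can be factorised in the form
$$\mathbf{G}' = \tilde{\mathbf{D}}^{-1}\mathbf{C}\mathbf{C}^T\tilde{\mathbf{D}}^{-T}, \tag{7.6.15}$$
where $\mathbf{C}, \mathbf{D}$ are lower bidiagonal with positive codiagonals.

To establish the band forms, we consider how $\mathbf{G}$ was constructed: $\tilde{\mathbf{G}} = \mathbf{B}^{-1}\mathbf{K}\mathbf{B}^{-T}$ or $\mathbf{K} = \mathbf{B}\tilde{\mathbf{G}}\mathbf{B}^T$. This we can write as $\mathbf{H} = \tilde{\mathbf{G}}\mathbf{B}^T$, $\mathbf{K} = \mathbf{B}\mathbf{H}$. The equation $\mathbf{B}\mathbf{H} = \mathbf{K}$ is
$$\sum_{k=1}^{n} b_{ik} h_{kj} = k_{ij}. \tag{7.6.16}$$
But $\mathbf{K}$ is tridiagonal, so that $k_{ij} = 0$ for $i = 1, 2, \ldots, n-2$; $j = i+2, \ldots, n$. The matrix $\mathbf{B}$ is lower bidiagonal, so that (7.6.16) gives
$$b_{i,i-1} h_{i-1,j} + b_{ii} h_{ij} = 0, \qquad i = 1, 2, \ldots, n-2;\ j = i+2, \ldots, n.$$
Thus, taking $i = 1$ we find
$$b_{11} h_{1j} = 0, \qquad j = 3, 4, \ldots, n$$
but taking $i = 2$ in (7.6.16) we have
$$b_{21} h_{1j} + b_{22} h_{2j} = 0, \qquad j = 4, 5, \ldots, n$$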
so that
$$b_{22} h_{2j} = 0, \qquad j = 4, 5, \ldots, n$$
and generally
$$h_{ij} = 0, \qquad i = 1, 2, \ldots, n-2;\ j = i+2, \ldots, n. \tag{7.6.17}$$
Now consider $\mathbf{H} = \tilde{\mathbf{G}}\mathbf{B}^T$, which is equivalent to
$$h_{ij} = \sum_{k=1}^{n} \tilde{g}_{ik} b_{jk}$$
and when combined with (7.6.17), this gives
$$\tilde{g}_{i,j-1} b_{j,j-1} + \tilde{g}_{ij} b_{jj} = 0, \qquad i = 1, 2, \ldots, n-2;\ j = i+2, \ldots, n.$$
Since $\tilde{g}_{ij} = (-)^{i+j} g_{ij}$, and $\mathbf{G}$ is symmetric, we may write these equations as
$$b_{j,j-1}\begin{bmatrix} g_{j-1,1} \\ g_{j-1,2} \\ \vdots \\ g_{j-1,j-2} \end{bmatrix} = b_{jj}\begin{bmatrix} g_{j1} \\ g_{j2} \\ \vdots \\ g_{j,j-2} \end{bmatrix}, \qquad j = 3, 4, \ldots, n. \tag{7.6.18}$$
We will show that these equations mean that the compound matrix $\mathcal{G}_2$ of $2 \times 2$ minors of $\mathbf{G}$ has a pattern of zeros like that shown in Figure 7.6.1. Starting from its left hand end, the first $n-3$ terms in the last row of $\mathcal{G}_2$ are
$$G(n-1, n; 1, 2),\ G(n-1, n; 1, 3),\ \ldots,\ G(n-1, n; 1, n-2).$$

Figure 7.6.1 - The rectangles in the lower left and upper right are zeros.
These are all zero, for, by (7.6.18) with $j = n$,
$$G(n-1, n; 1, k) = \begin{vmatrix} g_{n-1,1} & g_{n-1,k} \\ g_{n,1} & g_{n,k} \end{vmatrix} = 0, \qquad k = 2, \ldots, n-2$$
has its two rows proportional. Now we investigate the first $n-4$ terms in the penultimate row of $\mathcal{G}_2$:
$$G(n-2, n; 1, 2),\ G(n-2, n; 1, 3),\ \ldots,\ G(n-2, n; 1, n-3).$$
To show that these are all zero we consider the zero determinant
$$\begin{vmatrix} g_{n-2,1} & g_{n-2,1} & g_{n-2,k} \\ g_{n-1,1} & g_{n-1,1} & g_{n-1,k} \\ g_{n1} & g_{n1} & g_{nk} \end{vmatrix} = 0$$
and expand it along its first column to give
$$g_{n-2,1}\, G(n-1, n; 1, k) - g_{n-1,1}\, G(n-2, n; 1, k) + g_{n1}\, G(n-2, n-1; 1, k) = 0. \tag{7.6.19}$$
However, $\mathbf{G}$ given by (7.6.4) is a full matrix with all positive terms so that if any two of the minors in (7.6.19) are zero, then so is the third. But if $k = 2, 3, \ldots, n-3$ then the first is zero, and (7.6.18) with $j = n-1$ shows that the third is zero, and thus the second is also. Proceeding in this way we find that $G(i, j; 1, k) = 0$ for $3 \le i < j$, $k = 2, \ldots, i-1$. This provides a non-increasing pattern of zeros for the columns of $\mathcal{G}_2$ in the lower triangle. Now the equation
$$\mathcal{R}_2\mathcal{G}_2 = \mathcal{G}'_2\mathcal{R}_2 \tag{7.6.20}$$
shows that $\mathcal{G}'_2$ has a precisely corresponding pattern, and by tracing the steps in the analysis we can conclude that $\mathbf{G}'$ can be factorised just like $\mathbf{G}$. We obtain one factorisation
$$\mathbf{G}' = \tilde{\mathbf{D}}_0^{-1}\mathbf{C}_0\mathbf{C}_0^T\tilde{\mathbf{D}}_0^{-T}, \tag{7.6.21}$$
and then note that equivalently
$$\mathbf{G}' = \tilde{\mathbf{D}}^{-1}\mathbf{C}\mathbf{C}^T\tilde{\mathbf{D}}^{-T}$$
where
$$\mathbf{C} = \mathbf{N}\mathbf{C}_0, \qquad \mathbf{D} = \mathbf{N}\mathbf{D}_0$$
and $\mathbf{N}$ is an arbitrary diagonal matrix. Now we choose $\mathbf{N}$, as before, to make $\mathbf{K}' = \tilde{\mathbf{C}}\tilde{\mathbf{C}}^T$ have the form of a stiffness matrix.

Exercises 7.6
1. Use equation (7.6.19) to verify that $\mathcal{G}_2$ and $\mathcal{G}'_2$ have precisely the same staircase patterns, and so show that $\mathbf{G}'$ may be factorised as (7.6.21).
7.7 Isospectral flow, continued
In Section 7.2 we obtained the isospectral flow equation
$$\dot{\mathbf{A}} = \mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A}, \tag{7.7.1}$$
which governs the isospectral evolution of a symmetric matrix $\mathbf{A}$; $\mathbf{S}$ is a skew symmetric matrix. In this section we investigate whether the pattern of zero and non-zero elements in $\mathbf{A}$, and the pattern of signs of elements of $\mathbf{A}$, are invariant in this flow. We will restrict our attention to a few types of matrices which appear in vibration problems since the general problem is extremely complicated. Ashlock, Driessel and Hentzel (1997) [13], in a very general discussion of Toda flow, show amongst many results, that staircase patterns are the only patterns that remain invariant under Toda flow. Their paper has a valuable summary of the pertinent literature.

We start with tridiagonal $\mathbf{A}$ and take $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$, i.e.,
$$\mathbf{A} = \begin{bmatrix}
a_1 & -b_1 & & & \\
-b_1 & a_2 & -b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & -b_{n-1} \\
 & & & -b_{n-1} & a_n
\end{bmatrix}, \qquad
\mathbf{S} = \begin{bmatrix}
0 & +b_1 & & & \\
-b_1 & 0 & +b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & +b_{n-1} \\
 & & & -b_{n-1} & 0
\end{bmatrix}. \tag{7.7.2}$$
Now $\mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A}$ is also tridiagonal, so that $\mathbf{A}$ retains its tridiagonal form, and
$$\dot{a}_i = 2b_i^2 - 2b_{i-1}^2, \qquad \dot{b}_i = (a_{i+1} - a_i)b_i, \qquad i = 1, 2, \ldots, n \tag{7.7.3}$$
where $b_0, b_n$ are taken to be zero.

We examine the signs of the diagonal and codiagonal elements. The flow is isospectral so that if $\sigma(\mathbf{A}(0)) = (\lambda_i)_1^n$ and all the $\lambda_i$ are positive, then $\mathbf{A}(t)$, like $\mathbf{A}(0)$, will be positive definite; $a_i > 0$, $i = 1, 2, \ldots, n$. For given $i$, $b_i(t)$ satisfies $\dot{b}_i(t) = f(t)b_i(t)$, where $f(t) = a_{i+1}(t) - a_i(t)$. This has the solution $b_i(t) = C\exp(F(t))$, where $F(t) = \int_0^t f(s)\,ds$. Now $f(t)$ is bounded for all $t$, so that $b_i(t)$ retains the sign of $C = b_i(0)$. Thus $b_i(t)$ is $>, <, = 0$, depending on whether $b_i(0)$ is $>, <, = 0$. We conclude that each codiagonal term retains the sign it had when $t = 0$. In particular, if the signs of the codiagonal terms are all positive, i.e., $\mathbf{A}(0)$ is O, or negative, i.e., $\mathbf{A}(0)$ is SO, then $\mathbf{A}(t)$ is correspondingly O or SO.
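The sign-retention argument can be watched in a short numerical experiment. The sketch below integrates the Toda equations in the form $\dot{a}_i = 2b_i^2 - 2b_{i-1}^2$, $\dot{b}_i = (a_{i+1} - a_i)b_i$ with $b_0 = b_n = 0$, which is what $\dot{\mathbf{A}} = \mathbf{A}\mathbf{S} - \mathbf{S}\mathbf{A}$ gives for a tridiagonal matrix with diagonal $a_i$ and codiagonal $-b_i$. The function names, the classical Runge-Kutta integrator, the step size and the starting values are all assumptions made for illustration.

```python
def toda_rhs(a, b):
    """Right-hand side for (a_1..a_n, b_1..b_{n-1}) with b_0 = b_n = 0."""
    n = len(a)
    bb = [0.0] + list(b) + [0.0]                      # bb[i] = b_i for i = 0..n
    da = [2.0 * bb[i + 1] ** 2 - 2.0 * bb[i] ** 2 for i in range(n)]
    db = [(a[i + 1] - a[i]) * b[i] for i in range(n - 1)]
    return da, db


def rk4_step(a, b, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def moved(da, db, s):
        return ([x + s * y for x, y in zip(a, da)], [x + s * y for x, y in zip(b, db)])
    k1 = toda_rhs(a, b)
    k2 = toda_rhs(*moved(*k1, h / 2))
    k3 = toda_rhs(*moved(*k2, h / 2))
    k4 = toda_rhs(*moved(*k3, h))
    a_new = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(a, k1[0], k2[0], k3[0], k4[0])]
    b_new = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(b, k1[1], k2[1], k3[1], k4[1])]
    return a_new, b_new


def invariants(a, b):
    """tr(A) and tr(A^2) for the tridiagonal matrix with diagonal a, codiagonal -b."""
    return sum(a), sum(x * x for x in a) + 2.0 * sum(x * x for x in b)


a0, b0 = [3.0, 2.0, 1.0], [1.0, -0.5]                 # mixed codiagonal signs on purpose
a, b = a0[:], b0[:]
for _ in range(1000):                                  # integrate to t = 2
    a, b = rk4_step(a, b, 0.002)
```

Each $b_i$ keeps the sign it started with, while the two trace invariants stay constant to within the integration error.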
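Before generalizing this analysis, we introduce some notation. The matrix $\mathbf{S}$ in (7.7.2) is clearly related to $\mathbf{A}$; it may be written as a so-called Hadamard product:
$$\begin{bmatrix}
0 & +b_1 & & & \\
-b_1 & 0 & +b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & +b_{n-1} \\
 & & & -b_{n-1} & 0
\end{bmatrix} =
\begin{bmatrix}
a_1 & -b_1 & & & \\
-b_1 & a_2 & -b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & -b_{n-1} \\
 & & & -b_{n-1} & a_n
\end{bmatrix} \circ
\begin{bmatrix}
0 & -1 & & & \\
+1 & 0 & -1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & -1 \\
 & & & +1 & 0
\end{bmatrix}. \tag{7.7.4}$$
The Hadamard product is quite distinct from the usual matrix product. It is defined only for two matrices $\mathbf{A}, \mathbf{B}$ of the same size, i.e., $\mathbf{A}, \mathbf{B} \in M_{m,n}$, and is given by the pairwise product of corresponding elements. If $\mathbf{C} = \mathbf{A} \circ \mathbf{B}$, then $c_{ij} = a_{ij} b_{ij}$, for $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$. Thus the matrix $\mathbf{S}$ in (7.7.4) may be written $\mathbf{S} = \mathbf{A} \circ \mathbf{Y}$, where
$$\mathbf{Y} = \begin{bmatrix}
0 & -1 & & & \\
+1 & 0 & -1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & -1 \\
 & & & +1 & 0
\end{bmatrix} \tag{7.7.5}$$
is itself a skew-symmetric matrix. (Clearly, if $\mathbf{A}$ is symmetric and $\mathbf{Y}$ is skew-symmetric, then $\mathbf{A} \circ \mathbf{Y}$ is skew-symmetric.)

This brings us to the next example, in which $\mathbf{A}$ is a periodic Jacobi matrix; now we take
$$\mathbf{A} = \begin{bmatrix}
a_1 & -b_1 & & & -b_n \\
-b_1 & a_2 & -b_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & -b_{n-1} \\
-b_n & & & -b_{n-1} & a_n
\end{bmatrix}, \qquad
\mathbf{Y} = \begin{bmatrix}
0 & -1 & & & +1 \\
+1 & 0 & -1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & -1 \\
-1 & & & +1 & 0
\end{bmatrix}. \tag{7.7.6}$$
It is easy to verify (Ex. 7.7.2) that $\mathbf{A}$ retains its form under the flow (7.7.1) with $\mathbf{S} = \mathbf{A} \circ \mathbf{Y}$, and that all the $a_i$ and $b_i$ retain their signs.

We note in passing that for tridiagonal matrices we have two ways to form an isospectral family: using the operator $G_\mu$ of Section 7.3, or by using the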
isospectral flow equation with $\mathbf{S}$ given by (7.7.2). The periodic Jacobi form is not invariant under $G_\mu$, and it is not clear that there is a factorisation and reversal operation under which it is invariant. The only algebraic way to form an isospectral family seems to be to use the spectrum $(\lambda_i)_1^n$ and a second spectrum $(\mu_i)_1^{n-1}$ and reconstruct the matrix as in Section 5.4. For the periodic case, the isospectral flow equation with $\mathbf{S}$ given in (7.7.6) provides a conceptually simpler procedure.

There is a second comment. We showed in Section 7.3 that we can pass from any one Jacobi matrix $\mathbf{J}$ to any other isospectral Jacobi matrix $\mathbf{J}'$ in $n-1$ operations $G_\mu$. It is doubtful that isospectral flow, with $\mathbf{S}$ given by (7.7.2), will lead from one $\mathbf{J}$ to any other isospectral $\mathbf{J}'$ (see Ex. 7.7.7).

We will now show, following Gladwell (2002) [129], that this permanence of sign of a tridiagonal matrix under the Toda flow (7.7.1) is a special case of the permanence of the total positivity properties NTN, TP, O, SO under Toda flow. We recall from Section 6.8 that it is the positivity of the corner minors of $\mathbf{A}$ that is crucial in determining whether a TN matrix $\mathbf{A}$ is TP. We first prove a theorem regarding the flow of these corner minors under the Toda flow (7.7.1).

Theorem 7.7.1 Suppose $\mathbf{A} \in S_n$ satisfies (7.7.1), with $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$, $\mathbf{B} = \mathbf{A}^m$, $c_p = B(1, 2, \ldots, p; n-p+1, \ldots, n)$, then $c_p(t)$ satisfies
$$\dot{c}_p = \Big(\sum_{j=n-p+1}^{n} a_{jj} - \sum_{j=1}^{p} a_{jj}\Big)c_p, \qquad p = 1, 2, \ldots, n. \tag{7.7.7}$$
Proof. Denote the $p$th order corner matrix of $\mathbf{B}$ by $\mathbf{B}_p$, and suppose that its columns are $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_p$. Thus
$$\mathbf{b}_j = [b_{n-p+1,j}, b_{n-p+2,j}, \ldots, b_{n,j}]^T.$$
Ex. 7.7.3 shows that $\mathbf{B}$ satisfies
$$\dot{\mathbf{B}} = \mathbf{B}\mathbf{S} - \mathbf{S}\mathbf{B},$$
with $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$, so that
$$\dot{b}_{ij} = (a_{ii} - a_{jj})b_{ij} - 2\sum_{k=1}^{j-1} a_{jk} b_{ik} + 2\sum_{k=i+1}^{n} a_{ik} b_{kj}$$
and
$$\dot{\mathbf{b}}_j = -a_{jj}\mathbf{b}_j - 2\sum_{k=1}^{j-1} a_{jk}\mathbf{b}_k + \mathbf{C}\mathbf{b}_j, \tag{7.7.8}$$
where $\mathbf{C} \in M_p$ is given by
$$c_{ik} = \begin{cases} a_{ii}, & k = i, \\ 2a_{ik}, & k = i+1, \ldots, n, \\ 0 & \text{otherwise} \end{cases}$$
for l> n = q s + 1> = = = > q. Now fs = det(b1 > b2 > = = = > bs ), so that f˙s =
s X
det(b1 > b2 > = = = > bm1 > b˙ m > bm+1 > = = = > bs )=
(7.7.9)
m=1
Consider the sums obtained by inserting each of the three terms in $\dot{\mathbf{b}}_j$ from (7.7.8) into (7.7.9). The first gives
$$-\sum_{j=1}^{p} a_{jj}\, c_p.$$
The second gives zero because it is merely a combination of the first $j-1$ columns; the third may be written $\sum_{j=n-p+1}^{n} a_{jj}\, c_p$.

We now prove

Theorem 7.7.2 Let $P$ denote one of the properties TN, NTN, TP, O, SO. If $\mathbf{A}(0) \in S_n$ has property $P$, then $\mathbf{A}(t)$, given as the solution of (7.7.1) with $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$, has the same property $P$.

Proof. Suppose first that $\mathbf{A}(0)$ is TP. The corner minors $c_p$ of $\mathbf{A}(t)$ are thus positive when $t = 0$; they satisfy $\dot{c}_p = f(t)\, c_p$ where
$$f(t) = \sum_{j=n-p+1}^{n} a_{jj} - \sum_{j=1}^{p} a_{jj}$$
is bounded: $|f(t)| \le \mathrm{tr}(\mathbf{A}(t)) = \mathrm{tr}(\mathbf{A}(0))$. This implies that these corner minors remain positive. At $t = 0$, all the minors of $\mathbf{A}$ are positive. By continuity, therefore, all the minors are positive in some open interval $(a, b)$ around $t = 0$. Suppose if possible that one or more of the minors became zero at $t = b$. $\mathbf{A}(b)$ would be NTN and its corner minors would be positive, so that, by Theorem 6.8.2, it would be TP. This contradiction implies that $\mathbf{A}(t)$ is TP for all $t$.

Now suppose that $\mathbf{A}(0)$ is TN. By Ando's result, given in Ex. 6.8.3, $\mathbf{A}(0)$ may be approximated arbitrarily closely in the $L_1$ norm by a TP matrix $\mathbf{C}(0, k) = \mathbf{P}(k)\mathbf{A}(0)\mathbf{P}(k)$ where
$$\mathbf{P}(k) = (p_{ij}), \qquad p_{ij} = \exp[-k(i-j)^2].$$
We now suppose $\mathbf{C}(t, k)$ is the solution of
$$\dot{\mathbf{C}}(t, k) = \mathbf{C}(t, k)\mathbf{S}(t, k) - \mathbf{S}(t, k)\mathbf{C}(t, k)$$
where $\mathbf{S}(t, k) = \mathbf{C}^{+T}(t, k) - \mathbf{C}^{+}(t, k)$. By our previous argument, $\mathbf{C}(t, k)$ is TP for all $t$ and all $k$, and since (Ex. 7.7.3)
$$\|\dot{\mathbf{A}}(t) - \dot{\mathbf{C}}(t, k)\| = O(\exp(-k)), \tag{7.7.10}$$
we have
$$\lim_{k \to \infty} \mathbf{C}(t, k) = \mathbf{A}(t): \tag{7.7.11}$$
the minors of $\mathbf{A}(t)$ are the limits, as $k \to \infty$, of the (positive) minors of $\mathbf{C}(t, k)$; all the minors of $\mathbf{A}(t)$ are non-negative: $\mathbf{A}(t)$ is NTN.

Finally, suppose $\mathbf{A}(0)$ is O. It is NTN and so, by the previous result, $\mathbf{A}(t)$ is NTN. When $t = 0$, the minors of $(\mathbf{A}(0))^m = \mathbf{B}(0)$ are strictly positive for $m \ge n-1$. The corner minors of $\mathbf{B}(t) = (\mathbf{A}(t))^m$ remain positive (Ex. 7.7.3). $\mathbf{B}(t)$ is then NTN, with positive corner minors; $\mathbf{B}(t)$ is TP; $\mathbf{A}(t)$ is O. It now follows trivially that if $\mathbf{A}(0)$ is SO, then so is $\mathbf{A}(t)$.

We can immediately apply this result to obtain other isospectral mass reduced stiffness matrices for the discrete beam. Starting from $\mathbf{A}(0)$ in equation (7.5.3), we can form $\mathbf{A}(t)$; $\mathbf{A}(t)$, like $\mathbf{A}(0)$, will be SO. Ex. 7.7.5 shows that the corner minors of $\mathbf{B}(t) = \mathbf{A}^{-1}(t)$ will be strictly positive, and Ex. 7.7.6 shows that the elements in the outer diagonal of $\mathbf{A}(t)$ will be positive. These are the results needed for the reconstruction of $\mathbf{M}', \mathbf{K}', \mathbf{L}'$ from $\mathbf{A}(t)$.

Markham (1970) [221] shows that an oscillatory (or sign-oscillatory) matrix must have staircase form. It may be verified (Ex. 7.7.4) that the isospectral flow with $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$ preserves such staircase forms. In particular, one may show that the outermost elements of the staircase retain their signs: if they are strictly positive (negative) when $t = 0$, they will remain strictly positive (negative).
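The corner-minor law (7.7.7) can be checked directly on a small matrix. The sketch below is illustrative only. It assumes that $\mathbf{A}^{+}$ denotes the strict upper triangle of $\mathbf{A}$, takes $m = 1$ so that $\mathbf{B} = \mathbf{A}$, and uses the lower-left corner blocks as in the proof of Theorem 7.7.1. The function names and the 3 x 3 test matrix are hypothetical and do not come from the text.

```python
def toda_rhs(A):
    """Right-hand side of dA/dt = A S - S A with S = (A+)^T - A+.

    Assumption: A+ is the strict upper triangle of A, so S carries a_ij
    below the diagonal and -a_ij above it.
    """
    n = len(A)
    S = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i > j:
                S[i][j] = A[i][j]
            elif i < j:
                S[i][j] = -A[i][j]
    AS = [[sum(A[i][k] * S[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    SA = [[sum(S[i][k] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[AS[i][j] - SA[i][j] for j in range(n)] for i in range(n)]


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def corner(M, p):
    """Lower-left p x p corner block: rows n-p+1..n, columns 1..p."""
    return [row[:p] for row in M[len(M) - p:]]


def corner_minor_rate(A, Adot, p):
    """d/dt of the corner minor, differentiating one column at a time as in (7.7.9)."""
    C, Cdot = corner(A, p), corner(Adot, p)
    rate = 0
    for j in range(p):
        mixed = [[Cdot[i][k] if k == j else C[i][k] for k in range(p)] for i in range(p)]
        rate += det(mixed)
    return rate


def predicted_rate(A, p):
    """Right-hand side of (7.7.7)."""
    n = len(A)
    factor = sum(A[j][j] for j in range(n - p, n)) - sum(A[j][j] for j in range(p))
    return factor * det(corner(A, p))


A = [[4, 2, 1], [2, 5, 3], [1, 3, 6]]   # hypothetical symmetric test matrix
Adot = toda_rhs(A)
for p in (1, 2, 3):
    print(p, corner_minor_rate(A, Adot, p), predicted_rate(A, p))
```

Integer entries keep the comparison exact. For $p = n$ both sides vanish, which is consistent with the determinant being invariant under an isospectral flow.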
Exercises 7.7
1. Write $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$ as a Hadamard product $\mathbf{S} = \mathbf{A} \circ \mathbf{Y}$.
2. Verify that if $\mathbf{Y}$ is given in (7.7.6), then $\mathbf{A}$ in (7.7.6) retains its form under the flow (7.7.1).
3. Establish the results (7.7.10), (7.7.11).
4. Show that the isospectral flow (7.7.1) with $\mathbf{S} = \mathbf{A}^{+T} - \mathbf{A}^{+}$ preserves staircase forms; these include block banded forms, with no holes.
5. Show that $\mathbf{B} = \mathbf{A}^{-1}$ satisfies the same isospectral flow equation (7.7.1), i.e., $\dot{\mathbf{B}} = \mathbf{B}\mathbf{S} - \mathbf{S}\mathbf{B}$, and that the corner minors of $\mathbf{B}$ satisfy (7.7.7).
6. Show that if $\mathbf{A}$ has half-bandwidth $r$, so that $a_{ij} = 0$ if $|i-j| > r$, then the elements in the outdiagonal of $\mathbf{A}$ retain their signs.
7. Find two isospectral matrices $\mathbf{J}, \mathbf{J}'$ with the property that one cannot flow from $\mathbf{J}$ to $\mathbf{J}'$ in a Toda flow with $\mathbf{S}$ given by equation (7.7.2).
Chapter 8
The Discrete Vibrating Beam

A thinking reed - It is not from space that I must seek my dignity, but from the government of my thought. I shall have no more if I possess worlds. By space the universe encompasses and swallows me up like an atom; by thought I comprehend the world.
Pascal's Pensées, 348

8.1
Introduction
In this Chapter we shall present in detail the solution of the inverse problem for the discrete spring-mass model of a vibrating beam discussed in Section 2.3. This model is important because it is the simplest model - it is in effect a finite-difference approximation - for a beam with continuously distributed mass. See Gladwell (1991) [116] for a qualitative discussion of the customary finite element model of a beam. The inverse problem for a continuous beam will be considered in Chapter 13.

The inverse problem for a discrete beam was first considered by Barcilon (1976) [18], Barcilon (1979) [20], Barcilon (1982) [21]. He established that the reconstruction of such a system would require three spectra, corresponding to three different end conditions. The necessary and sufficient conditions for these spectra to correspond to a realizable system, one with positive masses, lengths and stiffnesses, were derived by Gladwell (1984) [104]. Two papers by Sweet (1969) [313], Sweet (1971) [314] consider the discrete model of a beam obtained by using the so-called 'method of straight lines'; he shows that the coefficient matrix obtained in this procedure is (similar to) an oscillatory matrix. See also Gladwell (1991b) [117].

The plan of the Chapter is as follows. In Section 8.2 we show that the (squares of the) natural frequencies of the system are the eigenvalues of an oscillatory matrix. This means that the eigenvalues are distinct and the eigenvectors $\mathbf{u}_i$ have all the properties derived in Section 6.10. It is found also that not only $\mathbf{u}_i$, but also $\boldsymbol{\theta}_i, \boldsymbol{\tau}_i, \boldsymbol{\phi}_i$, the slopes, moments and shearing forces, have these same properties (Theorem 8.2.2 and Ex. 8.2.1). Theorem 8.2.2 derives an additional result, that the beam always bends away from the axis at a free end. In Section 8.4 the oscillatory properties of the eigenvectors are used in the ordering of the natural frequencies of the system corresponding to different end conditions. In Section 8.5 it is shown that while it is possible to take three spectra as the data for the reconstruction, it is better to take one spectrum, that corresponding to a free end, and the end values $u_{n,i}, \theta_{n,i}$ of the normalised eigenvectors, as the basic data. In this way, the conditions on the data may be written as determinantal inequalities. In Section 8.6, a procedure for inversion is presented and it is shown that the conditions (Theorem 8.5.1), which were put forward earlier, are in fact sufficient to ensure that all the physical parameters, masses, lengths and stiffnesses, will be positive. In Section 8.7 a numerical procedure, based on the Block Lanczos algorithm, is described for the actual computation of the physical parameters.
8.2
The eigenanalysis of the cantilever beam
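The equations governing the response of the discrete beam were derived in Section 2.3. Equation (2.3.6) shows that vibration with frequency $\omega$ is governed by the equation
$$\lambda\mathbf{M}\mathbf{u} = \mathbf{K}\mathbf{u} - \phi_n\mathbf{e}_n - \ell_n^{-1}\tau_n\mathbf{E}\mathbf{e}_n, \qquad \lambda = \omega^2,$$
where $\mathbf{E}$ is given in equation (2.2.10), $\mathbf{e}_n = \{0, 0, \ldots, 1\}$, and $\phi_n$ and $\tau_n$ are the shearing force and bending moment applied at the free end. This means that the free vibrations satisfy
$$\lambda\mathbf{M}\mathbf{u} = \mathbf{K}\mathbf{u}, \tag{8.2.1}$$
which may be reduced to standard form
$$\mathbf{A}\mathbf{v} = \lambda\mathbf{v} \tag{8.2.2}$$
by the substitutions
$$\mathbf{M} = \mathbf{D}^2, \qquad \mathbf{v} = \mathbf{D}\mathbf{u}, \qquad \mathbf{A} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}. \tag{8.2.3}$$

Theorem 8.2.1 The matrix $\mathbf{A}$ is sign-oscillatory.

Proof. Equation (2.3.7) shows that
$$\mathbf{K} = \mathbf{E}\mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T$$
where $\mathbf{L}, \hat{\mathbf{K}}$ are diagonal matrices with positive elements. We recall, from Section 6.7, that a matrix $\mathbf{A}$ is said to be sign-oscillatory (SO) if $\tilde{\mathbf{A}} = \mathbf{Z}\mathbf{A}\mathbf{Z}$, with $\mathbf{Z} = \mathrm{diag}(1, -1, \ldots, (-1)^{n-1})$, is oscillatory (O). The matrix
$$\tilde{\mathbf{E}} = \begin{bmatrix} 1 & 1 & & & \\ & 1 & 1 & & \\ & & \ddots & \ddots & \\ & & & 1 & 1 \\ & & & & 1 \end{bmatrix}$$
is NTN (see the beginning of Section 6.6). Also, Ex. 6.7.6 shows that $\tilde{\mathbf{B}} = \tilde{\mathbf{E}}\mathbf{L}^{-1}\tilde{\mathbf{E}}$ is NTN, as is its transpose, and hence also $\tilde{\mathbf{K}} = \tilde{\mathbf{B}}\hat{\mathbf{K}}\tilde{\mathbf{B}}^T$, and $\tilde{\mathbf{A}} = \mathbf{D}^{-1}\tilde{\mathbf{K}}\mathbf{D}^{-1}$. Now, according to Theorem 6.7.3, to show that $\tilde{\mathbf{A}}$ is oscillatory, it is sufficient to show that $\tilde{a}_{i+1,i} > 0$, $i = 1, 2, \ldots, n-1$. This is easily verified. Thus $\tilde{\mathbf{A}}$ is O, and $\mathbf{A}$ is sign-oscillatory.

Theorem 8.2.1 has important consequences. It means that the eigenvalues $(\lambda_i)_1^n$ are distinct (Corollary to Theorem 6.10.1), that the last element, $u_{n,i}$, of each eigenvector $\mathbf{u}_i$ of equation (8.2.1) may be chosen to be (strictly) positive (Corollary to Theorem 6.10.2); note that equation (8.2.3) gives $v_j = d_j u_j$, so that $v_n > 0$ implies $u_n > 0$; and the $u_{j,i}$ will satisfy the inequalities (6.10.3). We now prove

Theorem 8.2.2 The vectors $(\boldsymbol{\theta}_j)_1^n$ are the eigenvectors of a sign-oscillatory matrix.

Proof. Since $\boldsymbol{\theta} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}$ and thus $\mathbf{u} = \mathbf{E}^{-T}\mathbf{L}\boldsymbol{\theta}$, we have
$$\lambda\mathbf{M}\mathbf{u} = \lambda\mathbf{M}\mathbf{E}^{-T}\mathbf{L}\boldsymbol{\theta} = \mathbf{K}\mathbf{u} = \mathbf{E}\mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T(\mathbf{E}^{-T}\mathbf{L}\boldsymbol{\theta})$$
so that
$$\lambda(\mathbf{L}\mathbf{E}^{-1}\mathbf{M}\mathbf{E}^{-T}\mathbf{L})\boldsymbol{\theta} = \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\boldsymbol{\theta}$$
or
$$\lambda\mathbf{G}\boldsymbol{\theta} = \mathbf{H}\boldsymbol{\theta}, \qquad \mathbf{G}^{-1}\mathbf{H}\boldsymbol{\theta} = \lambda\boldsymbol{\theta}.$$
The matrix $\mathbf{G}$ is O, so that $\widetilde{\mathbf{G}^{-1}}$ is O (Theorem 6.7.5). $\mathbf{H}$ is SO, so that $\tilde{\mathbf{H}}$ is O. Therefore, by Ex. 6.7.7, $\widetilde{\mathbf{G}^{-1}}\tilde{\mathbf{H}}$ is O, and thus $\mathbf{G}^{-1}\mathbf{H}$ is SO.

Theorem 8.2.2 means that the $\boldsymbol{\theta}_i$ must satisfy all the requirements for the eigenvectors of an SO matrix, e.g., $\theta_{n,i} \ne 0$. We now show that, for the particular SO matrix governing the beam, if the $u_{n,i}$ are chosen so that $u_{n,i} > 0$, so that all the minors $U_{n,s}$ of Theorem 6.10.3 are positive, then $\theta_{n,i}$, and hence all the corresponding minors $\Theta_{n,s} = \Theta(n-p+1, n-p+2, \ldots, n;\; i_1, i_2, \ldots, i_p)$, will be positive. It is sufficient to prove

Theorem 8.2.3 Each eigenvector of the cantilever beam satisfies $u_{n,j}\theta_{n,j} > 0$.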
Proof. Choose $\mathbf{u}_j$ so that $u_{n,j} > 0$. There is an index $r$ $(1 \le r \le n-1)$ such that
i) $u_{i,j} > 0$, $i = r, r+1, \ldots, n$,
ii) $u_{r-1,j} \le 0$.
Note that when $j = 1$, then $r = 1$; we have $u_{0,1} = 0$. Thus $\theta_{r,j} = (u_{r,j} - u_{r-1,j})/\ell_r > 0$. Now, since
$$\boldsymbol{\phi}_j = \lambda_j\mathbf{E}^{-1}\mathbf{M}\mathbf{u}_j$$
then, because of the form of $\mathbf{E}^{-1}$ given in equation (2.2.10), we have
$$\phi_{i,j} > 0, \qquad i = r-1, \ldots, n-1.$$
But $\boldsymbol{\tau}_j = \mathbf{E}^{-1}\mathbf{L}\boldsymbol{\phi}_j$ so that, again,
$$\tau_{i,j} > 0, \qquad i = r-1, \ldots, n-1.$$
Now consider the equation linking the $\theta_i$ and $\tau_i$, namely
$$\theta_{i+1} - \theta_i = k_{i+1}^{-1}\tau_i$$
and sum from $r$ to $n-1$ to obtain
$$\theta_{n,j} - \theta_{r,j} = \sum_{i=r}^{n-1} k_{i+1}^{-1}\tau_{i,j} > 0$$
so that $\theta_{n,j} > 0$.

Theorem 8.2.2, while showing that the $\boldsymbol{\theta}_j$ are eigenvectors of a sign-oscillatory matrix, shows that $\mathbf{u}_j$ and $\boldsymbol{\theta}_j$ must both have precisely $j-1$ sign changes. This means that the first mode $\mathbf{u}_1$ will steadily increase, i.e.,
$$0 < u_{1,1} < u_{2,1} < \cdots < u_{n,1},$$
as shown in Figure 8.2.1, while the $j$-th mode $(j > 1)$ will have $j-1$ portions that are convex towards the axis, and one final portion that bends away from the axis, as shown in Figure 8.2.2.
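Figure 8.2.1 - The first mode steadily increases

Figure 8.2.2 - The end of the mode bends away from the axis

Exercises 8.2
1. Show that $\boldsymbol{\tau}_j$ and $\boldsymbol{\phi}_j$ are eigenvectors of the equations
$$\lambda\hat{\mathbf{K}}^{-1}\boldsymbol{\tau} = \mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{M}^{-1}\mathbf{E}\mathbf{L}^{-1}\mathbf{E}\boldsymbol{\tau}$$
$$\lambda\mathbf{L}\mathbf{E}^{-T}\hat{\mathbf{K}}^{-1}\mathbf{E}^{-1}\mathbf{L}\boldsymbol{\phi} = \mathbf{E}^T\mathbf{M}^{-1}\mathbf{E}\boldsymbol{\phi}$$
and that each is the eigenvector of a sign-oscillatory matrix.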
8.3
The forced response of the beam
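The equation governing the response to an end shearing force and bending moment is equation (2.3.6), which for vibration of frequency $\omega$ becomes
$$\lambda\mathbf{M}\mathbf{u} = \mathbf{K}\mathbf{u} - \phi_n\mathbf{e}_n - \ell_n^{-1}\tau_n\mathbf{E}\mathbf{e}_n. \tag{8.3.1}$$
Since the eigenvectors $\mathbf{u}_j$ of the clamped-free beam span $V_n$, and are orthogonal w.r.t. $\mathbf{M}$ and $\mathbf{K}$, we may write
$$\mathbf{u} = \sum_{j=1}^{n} \beta_j\mathbf{u}_j,$$
and find
$$\beta_j = (\phi_n u_{n,j} + \tau_n\theta_{n,j})/(\lambda_j - \lambda),$$
where the modes are normalised so that
$$\mathbf{u}_j^T\mathbf{M}\mathbf{u}_k = \delta_{jk}.$$
Thus
$$\mathbf{u} = \sum_{j=1}^{n} \frac{(\phi_n u_{n,j} + \tau_n\theta_{n,j})\,\mathbf{u}_j}{\lambda_j - \lambda}, \tag{8.3.2}$$
and on multiplying through by $\mathbf{L}^{-1}\mathbf{E}^T$ we find
$$\boldsymbol{\theta} = \sum_{j=1}^{n} \frac{(\phi_n u_{n,j} + \tau_n\theta_{n,j})\,\boldsymbol{\theta}_j}{\lambda_j - \lambda}. \tag{8.3.3}$$
These two equations completely characterise the forced response of the beam. In the terminology of Bishop and Johnson (1960) [34], equations (8.3.2), (8.3.3) give the end receptances for the beam: the displacement (slope) at one coordinate $i$ due to a unit shearing force or bending moment at the end. In particular, for the end displacement and slope we have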
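$$u_n = \alpha\phi_n + \alpha'\tau_n, \tag{8.3.4}$$
$$\theta_n = \alpha'\phi_n + \alpha''\tau_n, \tag{8.3.5}$$
where
$$\alpha = \sum_{j=1}^{n} \frac{(u_{n,j})^2}{\lambda_j - \lambda}, \qquad \alpha' = \sum_{j=1}^{n} \frac{u_{n,j}\theta_{n,j}}{\lambda_j - \lambda}, \tag{8.3.6}$$
$$\alpha'' = \sum_{j=1}^{n} \frac{(\theta_{n,j})^2}{\lambda_j - \lambda}. \tag{8.3.7}$$

8.4
The spectra of the beam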
Now suppose that the left hand end of the beam remains clamped while the conditions at the right hand end are varied. The possible end conditions and eigenvalues, (eigenfrequency)$^2$, are as follows:

  end condition    constraints                                            eigenvalues
  free             $\phi_n = 0 = \tau_n$                                  $(\lambda_i)_1^n$
  sliding          $\theta_n = 0 = \phi_n$                                $(\sigma_i)_1^{n-1}$
  anti-resonant    $u_n = 0 = \phi_n$ or $\theta_n = 0 = \tau_n$          $(\nu_i)_1^{n-1}$
  pinned           $u_n = 0 = \tau_n$                                     $(\mu_i)_1^{n-1}$
  clamped          $u_n = 0 = \theta_n$                                   $(\gamma_i)_1^{n-2}$
Note that the anti-resonant frequencies are those at which the application of an end bending moment produces no end displacement; we will show that there are q 1 such frequencies, and that they are also the frequencies at which the application of an end shearing force produces no end rotation. We will now relate the various eigenvalues to the receptances derived in Section 8.3. We first state Theorem 8.4.1 If (sm )q1 A 0 and {1 ? {2 ? = = = ? {q , then the equation i ({) =
q X m=1
sm =0 {m {
has q 1 real zeros m satisfying {m ? m ? {m+1 = Proof. In each interval ({m > {m+1 ), i ({) is strictly increasing from 4 to +4, and will cross the {-axis just once. We now substitute the end conditions in the receptance equations (8.3.6), (8.3.7), starting with the sliding condition; q X (q>m )2 m=1
m
= 0 has zeros ( l )q1 = 1
(8.4.1)
Making use of Theorem 8.2.3 we may state q X xq>m q>m m=1
and
m
q X (xq>m )2 m=1
m
= 0 has zeros ( l )q1 > 1
(8.4.2)
= 0 has zeros (l )q1 = 1
(8.4.3)
To find the relative positions of the eigenvalues we need Theorem 8.4.2 Suppose (sm )q1 A 0, (tm )q1 A 0, {1 ? {2 ? = = = ? {q , i ({) =
q X m=1
sm > {m {
j({) =
q X m=1
tm {m {
, ( l )q1 are the zeros of i ({), j({) respectively. If sm tl sl tm A and that ( l )q1 1 1 0 for l A m then l A l for l = 1> 2> = = = > q 1. Proof. sl j({) tl i ({) =
q X sl tm sm tl = {m { m=1
Put { = l , so that {l ? l ? {l+1 , and divide the sum into two parts, thus sl j( l ) =
l1 X sm tl sl tm m=1
l {m
+
q X sl tm sm tl = {m l m=l+1
Under the stated conditions, each of the numerators and denominators on the right will be positive, so that j( l ) A 0, i.e., j({) has already become positive when i ({) has just become zero, i.e., l A l . Note that, as in the discussion of positivity in Chapter 6, it is su!cient to have sm tm+1 sm+1 tm A 0 for m = 1> 2> = = = > q 1, for then sm tl sl tm A 0 for all l A m. The converse of Theorem 8.4.2 is not true - see Ex. 8.4.1. We now apply this Theorem, first to l and l . Take sm = xq>m q>m and tm = 2q>m , then pm tl sl tm = q>l q>m (xq>m q>l xq>l q>m ) = q>l q>m (xq>l xq1>m xq>m xq1>l )@cq . To show that this is positive, we use Theorem 6.10.3 with s = 2> l1 = m> l2 = l; it gives ¯ ¯ ¯ xq1>m xq1>l ¯ ¯A0 ¯ ¯ xq>m xq>l ¯
for l A m, and thus l A l . We find in an exactly similar way that l A l . Finally, since the clamped conditions may be obtained by applying the extra constraint q = 0 to the pinned condition, the usual theory of vibration under constraint gives l A l . This gives the following ordering: 0 ? 1 ? 1 ? 1 ? 1 ? ( 1 > 2 ) ? 2 ? 2 ? 2 ? ( 2 > 3 ) ? = = = ( q2 > q1 ) ? q1 ? q1 ? q1 ? q =
(8.4.4)
Note that the relative position of m and m+1 is (so far) indeterminate; in numerical experiments it was always found that m A m+1 . See Gladwell (1985) [105], Gladwell (1991b) [117].
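The interlacing argument of Theorems 8.4.1 and 8.4.2 is easy to explore numerically. The following sketch is an illustration rather than part of the original development: it locates the single zero of a sum of the form $\sum_j p_j/(x_j - x)$ in each gap between consecutive poles by bisection, relying on the fact that the sum increases from $-\infty$ to $+\infty$ there. The function names are hypothetical; the data are those quoted in Ex. 8.4.1.

```python
def rational_sum(x, poles, weights):
    """Evaluate sum_j w_j / (x_j - x) for positive weights w_j."""
    return sum(w / (xj - x) for xj, w in zip(poles, weights))


def zeros_between_poles(poles, weights, tol=1e-12):
    """Return the n-1 zeros, one in each open interval (x_j, x_{j+1}).

    The poles must be strictly increasing and the weights positive, so the
    sum is strictly increasing on each interval and bisection applies.
    """
    zeros = []
    for left, right in zip(poles, poles[1:]):
        gap = right - left
        lo, hi = left + 1e-9 * gap, right - 1e-9 * gap
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if rational_sum(mid, poles, weights) < 0:
                lo = mid      # still on the negative branch: zero lies to the right
            else:
                hi = mid
        zeros.append(0.5 * (lo + hi))
    return zeros


poles = (1.0, 4.0, 7.0)
p = (4.0, 1.0, 4.0)   # weights of f
q = (5.0, 1.0, 7.0)   # weights of g
zeros_f = zeros_between_poles(poles, p)
zeros_g = zeros_between_poles(poles, q)
print(zeros_f, zeros_g)
print([rational_sum(z, poles, q) > 0 for z in zeros_f])
```

For these data each zero of $g$ lies to the left of the corresponding zero of $f$, even though $p_1 q_2 - p_2 q_1 < 0$, which is the point of the counterexample in Ex. 8.4.1.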
Exercises 8.4 1. Construct a counterexample to show that the converse of Theorem 8.4.2 is false. Take q = 3, ({1 > {2 > {3 ) = (1> 4> 7), (s1 > s2 > s3 ) = (4> 1> 4), (t1 > t2 > t3 ) = (5> 1> 7). Find 1 > 2 > 1 > 2 and show that j( 1 ) A 0, j( 2 ) A 0, so that 1 A 1 , 2 A 2 , but s1 t2 s2 t1 ? 0, s2 t3 s3 t2 A 0. 2. Show that if sm A 0, tm A 0, sm tm+1 sm+1 tm A 0 for m = 1> 2> = = = > q 1, then sm tl sl tm A 0 for all l A m. Compare with Theorem 6.8.1. 3. Use equations (8.4.2), (8.4.3) to deduce that Q f1 q1 m=1 ( m l ) 2 q>l = Qq0 m=1 (m l )
8. The Discrete Vibrating Beam
x2q>l
193 Q f2 qm=1 (m l ) = Qq0 m=1 (m l )
where 0 denotes m 6= l, and f1 , f2 are constants.
4. Develop an intuitive argument to show that l ? l by considering a clamped-clamped beam made up of two identical cantilevers of length c@2 welded together at their free ends. are the (frequency)2 values for which the applica5. The eigenvalues ( l )q2 1 tion of a force and moment at the free end produce xq = 0 = q . Use the equations (8.3.4)-(8.3.7) to show that the l are the roots of q X (xq>l q>m xq>m q>l )2 = 0= (l )(m ) l>m=1
8.5
Conditions on the data for inversion
In the inverse eigenvalue problem for the beam it is required to construct a beam with given eigenvalues. Barcilon showed (for his model) that the beam cannot be uniquely determined from two spectra, and attempted to prove that it could be so determined (apart from a scale factor) from three properly chosen spectra. His procedure (in our notation) was to start from (l > l > l )q1 (and note that he had q of each of the l > l , not q 1 as in the model of Figure 2.3.1) satisfying 1 ? 1 ? 1 ? 2 = = = q ? q ? q and compute the frequencies (l )q1 and ( l )q1 (again note that he had q of the 1 l and q 1 of l ) using some recurrence relations. For his model it was not possible to prove that the eigenvalues l > l so computed satisfied the complete set of inequalities (similar to (8.4.4)). He had to place subsidiary conditions on (l > l > l )q1 in order for the inequalities to be satisfied. His second step was a stripping procedure for computing the parameters oq > nq > pq of the last segment, and for computing the corresponding eigenvalues (l > l > l )q1 of the truncated 1 system obtained by deleting the last segment. The oq > nq > pq were all found to be positive but, even with the extra conditions on the (l > l > l )q1 , it was not possible to prove that the new (starred) eigenvalues satisfied the necessary orderings, which meant that if the stripping procedure were continued, negative masses, stinesses or lengths might be encountered at some stage. He concluded that further conditions must be placed on the data, preferably conditions which could be applied ab initio, so eliminating the need for checks at each stage of the stripping procedure. We shall now state such conditions and construct a new stripping procedure. The spectra, from which will be drawn the data for the inverse problem, may be divided into three parts: (i) (l )q1 ; (ii) ( l > l > l )q1 ; (iii) ( l )q2 . 1 1
Suppose that (i) is given. Each spectrum which is given from (ii) then determines, to within an arbitrary multiplier, the set of coe!cients (q>l )2 > (xq>l q>l ) or (xq>l )2 respectively, from the eigenvalue equations (8.4.1)-(8.4.3); see Ex. 8.4.3 and an analogous result for xq>l q>l . If any two of the spectra in (ii) are given, then the two sets of coe!cients yield the third set, and hence the third spectrum. (Note that since xq>l q>l A 0, there is no ambiguity in taking the square root of x2q>l 2q>l .) However, if two given spectra, say ( l )q1 and (l )q1 1 1 satisfy the appropriate ordering, l ? l , then the third set (l )q1 need not 1 satisfy its appropriate ordering, l ? l . Two counterexamples are provided in Ex. 8.5.1, 8.5.2, and these clearly show that the ordering requirements on the two given spectra e.g., l ? l , are insu!cient for the existence of a real model, with positive ol > nl > pl ; they do not even ensure the ordering of the remaining spectrum. We now prove the fundamental Theorem 8.5.1 A necessary condition for the existence of a real (i.e., positive) model corresponding to data sets (l > xq>l > q>l )q1 is that the matrix P 5 Pq+1>q given by 5 6 xq>1 xq>2 === xq>q 9 q>1 q>2 === q>q : 9 : 9 1 xq>1 2 xq>2 = = = q xq>q : 9 : P=9 : 9 21 q>1 22 q>2 = = = 2q q>q : 7 1 xq>1 2 xq>2 = = = q xq>q 8 · · === · should have all its minors are positive. Note that the last row of P is u1 xq>1
u2 xq>2
===
uq xq>q
u1 q>1
u2 q>2
===
uq q>q
or according to whether q is even or odd respectively, and u = [q@2]. Proof. Because of the repetitive nature of the rows of P, Theorem 6.8.1 shows that all the minors will be positive i S (1> 2> = = = > s; l> l+1> = = = > l+s1) A 0 S (2> 3> = = = > s+1; l> l+1> = = = > l+s1) A 0 (8.5.2) for s = 1> 2> = = = > q and l = 1> 2> = = = > q s + 1. The proof follows directly from Theorem 6.10.3, for ¯ ¯ ¯ xq1>l xq1>l+1 ¯ ¯ ¯ A 0= X (q 1> q; l> l + 1) = ¯ ¯ xq>l xq>l+1
But the recurrence xq1 = xq oq q yields ¯ ¯ ¯ xq>l oq q>l xq>l+1 oq q>l+1 ¯ ¯ ¯ X (q 1> q; l> l + 1) = ¯ ¯ xq>l xq>l+1 ¯ ¯ ¯ x xq>l+1 ¯¯ = oq ¯¯ q>l = oq S (1> 2; l> l + 1) A 0 q>l q>l+1 ¯
which we write in abbreviated notation as [xq1 > xq ] = [xq oq q > xq ] = oq [xq > q ] = oq S (1> 2; l> l + 1)= Similarly, the relations between the xl > l > l > !l in Section 2.3 and Theorem 6.10.3 applied to the l (note that Theorem 8.2.2 shows that m is an eigenvector of an SO matrix) give 0 ? [q1 > q ] = [q nq1 q1 > q ] = nq1 [ q1 > q ] = nq1 oq [q1 > q ] = nq1 oq pq [xq > q ] = nq1 oq pq [q > xq ] = nq1 oq pq S (2> 3; l> l + 1)= Proceeding in this way we may relate the minors occurring in Theorem 6.10.3, for X or , to those appearing in S . Thus X (q 2> q 1> q; l> l + 1> l + 2) = Sq2 S (1> 2> 3; l> l + 1> l + 2) (q 2> q 1> q; l> l + 1> l + 2) = Tq2 S (2> 3> 4; l> l + 1> l + 2) where 1 2 oq oq1 pq pq1 Sq2 = nq1 oq2 oq1 pq > Tq2 = nq1 nq1
and generally X (q s + 1> q s + 2> = = = > q; l> l + 1> = = = > l + s 1) = Sqs+1 S (1> 2> = = = > s; l> l + 1> l + s 1)
(8.5.3)
(q s + 1> q s + 2> = = = > q; l> l + 1> = = = > l + s 1) = Tqs+1 S (2> 3> = = = > s + 1; l> l + 1> = = = > l + s 1)
(8.5.4)
where, as will be important in our discussion later, Sqs+1 and Tqs+1 are products of the pl > ol > nl for l = q s + 2> = = = > q. It will be shown below that the condition that P is TP is also su!cient for the existence of a real model.
Exercises 8.5 1. Construct a counterexample to show that l ? l ? l ? l+1 does not imply l ? l ? l ? l ? l+1 . Take q = 3, (1 > 2 > 3 ) = (1> 4> 7)> (x3>1 > x3>2 > x3>3 ) = (2> 1> 2) so that (1 > 2 ) = (2> 5). Take 3>1 = 3@2> 3>2 = 1 and find 3>3 so that 1 ? 1 > 2 ? 2 > 1 ? 1 but 2 A 2 . 2. With the same l and x3>l data, but with 3>1 = 1, find 3>3 so that 1 ? 1 , 2 ? 2 , 2 ? 2 , but 1 A 1 . 3. Take q = 3> xq>l = 1> q>1 = 1> 1 = 1, and find two sets of values of q>2 and q>3 , 2 > 3 so that the positivity conditions of Theorem 8.5.1 are fulfilled and 1 ? 2 in one case, 1 A 2 in the other. This proves that the relative positions of l and l+1 are indeterminate.
8.6
Inversion by using orthogonality
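In this section we show how the system parameters may be found, at least in theory, from the eigenvalue data, and establish necessary and sufficient conditions on the data for the system parameters to be positive.

Suppose that we are given $(\lambda_i, u_{n,i}, \theta_{n,i})_1^n$ for a cantilever beam, so that $\tau_{n,i} = 0 = \phi_{n,i}$. We will show that we can construct a beam, and that if the data satisfy the condition stated in Theorem 8.5.1, then all the system parameters will be positive. We start with the system equation
$$\lambda_i\mathbf{M}\mathbf{u}_i = \mathbf{K}\mathbf{u}_i,$$
and, as usual, put $\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$, $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Then
$$\mathbf{M}\mathbf{U}\boldsymbol{\Lambda} = \mathbf{K}\mathbf{U}, \tag{8.6.1}$$
and the orthogonality of the $\mathbf{u}_i$ w.r.t. $\mathbf{K}, \mathbf{M}$ yields
$$\mathbf{U}^T\mathbf{M}\mathbf{U} = \mathbf{I}, \qquad \mathbf{U}^T\mathbf{K}\mathbf{U} = \boldsymbol{\Lambda}. \tag{8.6.2}$$
The first of these equations gives
$$\mathbf{M}^{-1} = \mathbf{U}\mathbf{U}^T, \tag{8.6.3}$$
so that
$$\frac{1}{m_j} = \sum_{i=1}^{n} (u_{j,i})^2, \qquad j = 1, 2, \ldots, n. \tag{8.6.4}$$
Since $(u_{n,i})_1^n$ are known, we have found
$$\frac{1}{m_n} = \sum_{i=1}^{n} (u_{n,i})^2. \tag{8.6.5}$$
The matrix $\mathbf{U}\mathbf{U}^T$ is diagonal; its term $j, j-1$ is
$$\sum_{i=1}^{n} u_{j,i}u_{j-1,i} = 0,$$
which, with $\mathbf{u}_{j-1} = \mathbf{u}_j - \ell_j\boldsymbol{\theta}_j$, gives
$$\sum_{i=1}^{n} (u_{j,i})^2 - \ell_j\sum_{i=1}^{n} u_{j,i}\theta_{j,i} = 0,$$
and using (8.6.4) we find
$$\frac{1}{m_j\ell_j} = \sum_{i=1}^{n} u_{j,i}\theta_{j,i}, \tag{8.6.6}$$
which with $j = n$, yields $\ell_n$.

The next step is the determination of $k_n$. For this we need the explicit expression for $\mathbf{K}$:
$$\mathbf{K} = \mathbf{E}\mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T.$$
This gives
$$\hat{\mathbf{K}}^{-1} = \mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{K}^{-1}\mathbf{E}\mathbf{L}^{-1}\mathbf{E}. \tag{8.6.7}$$
Now we use the second of equations (8.6.2) to give
$$\mathbf{K}^{-1} = \mathbf{U}\boldsymbol{\Lambda}^{-1}\mathbf{U}^T$$
which, when substituted in (8.6.7), gives
$$\hat{\mathbf{K}}^{-1} = \mathbf{E}^T(\mathbf{L}^{-1}\mathbf{E}^T\mathbf{U})\boldsymbol{\Lambda}^{-1}(\mathbf{U}^T\mathbf{E}\mathbf{L}^{-1})\mathbf{E}.$$
But
$$\boldsymbol{\Theta} = [\boldsymbol{\theta}_1, \boldsymbol{\theta}_2, \ldots, \boldsymbol{\theta}_n] = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{U}$$
so that
$$\hat{\mathbf{K}}^{-1} = \mathbf{E}^T\boldsymbol{\Theta}\boldsymbol{\Lambda}^{-1}\boldsymbol{\Theta}^T\mathbf{E}$$
which yields
$$\frac{1}{k_j} = \sum_{i=1}^{n} \frac{(\theta_{j,i} - \theta_{j-1,i})^2}{\lambda_i}, \qquad j = 1, 2, \ldots, n.$$
But $\theta_{j,i} - \theta_{j-1,i} = k_j^{-1}\tau_{j-1,i}$ so that
$$k_j = \sum_{i=1}^{n} \frac{(\tau_{j-1,i})^2}{\lambda_i}. \tag{8.6.8}$$
Now take $j = n$, then $\tau_{n-1,i} = \ell_n\phi_{n-1,i} = \ell_n m_n\lambda_i u_{n,i}$, so that
$$k_n = m_n^2\ell_n^2\sum_{i=1}^{n} \lambda_i(u_{n,i})^2. \tag{8.6.9}$$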
Having found $m_n, \ell_n$ and $k_n$ we now state the steps in the algorithm to reconstruct the system.
i) set $j = n$.
ii) $u_{n,i}$, $\theta_{n,i}$, $\tau_{n,i} = 0 = \phi_{n,i}$ are known from data.
iii) compute $m_j, \ell_j$ from equations (8.6.4), (8.6.6); for $j = n$ the first of these is (8.6.5).
iv) compute
$$u_{j-1,i} = u_{j,i} - \ell_j\theta_{j,i}, \qquad \phi_{j-1,i} = \phi_{j,i} + m_j\lambda_i u_{j,i}, \qquad \tau_{j-1,i} = \tau_{j,i} + \ell_j\phi_{j-1,i}.$$
v) compute $k_j$ from (8.6.8).
vi) compute $\theta_{j-1,i} = \theta_{j,i} - \tau_{j-1,i}/k_j$.
vii) set $j = j - 1$. If $j \ge 1$ go to iii), otherwise stop.

We note that the quantities $(u_{n,i}, \theta_{n,i})_1^n$ will be known only to within arbitrary multiplying factors. If a second, primed, set is related to the first by
$$u'_{j,i} = \alpha u_{j,i}, \qquad \theta'_{j,i} = \beta\theta_{j,i}, \tag{8.6.10}$$
then the algorithm yields
$$m'_j = m_j/\alpha^2, \qquad k'_j = k_j/\beta^2, \qquad \ell'_j = \ell_j\alpha/\beta, \tag{8.6.11}$$
or
$$\frac{m'_j\ell_j'^2}{k'_j} = \frac{m_j\ell_j^2}{k_j}, \qquad j = 1, 2, \ldots, n. \tag{8.6.12}$$
Equations (8.6.11), (8.6.12) define the equivalence class of systems corresponding to the given data. The validity of this inversion procedure is based on

Theorem 8.6.1 The total positivity of the matrix $\mathbf{P}$ of Theorem 8.5.1 is necessary and sufficient for the existence of a real (positive) model having three given spectra, i.e., $(\lambda_i)_1^n$ and two of $(\sigma_i, \nu_i, \mu_i)_1^{n-1}$.

Proof. The necessity was proved in Theorem 8.5.1. We prove the sufficiency. Consider the equations
$$\mathbf{M}^{-1} = \mathbf{U}\mathbf{U}^T, \qquad \boldsymbol{\Theta} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{U}$$
and construct the matrix
$$\mathbf{B} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{M}^{-1} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{U}\mathbf{U}^T = \boldsymbol{\Theta}\mathbf{U}^T.$$
Now form the $p$th compound matrix equation by using the Binet-Cauchy Theorem:
$$\mathcal{B}_p = \mathcal{L}_p^{-1}\mathcal{E}_p^T\mathcal{M}_p^{-1} = \boldsymbol{\Theta}_p\mathcal{U}_p^T.$$
Since $\mathcal{L}_p^{-1}$ and $\mathcal{M}_p^{-1}$ are diagonal matrices, and each principal minor of $\mathcal{E}_p^T$ is unity, the bottom right-hand element of $\mathcal{B}_p$ is
$$b_{NN} = \prod_{k=n-p+1}^{n} (m_k\ell_k)^{-1} = \sum_{s=1}^{N} \Theta_{N,s}U_{N,s}, \tag{8.6.13}$$
where the notation is as in Section 6.2. We now proceed by induction. Suppose that conditions (8.5.2) are satisfied, and that $\ell_n, \ell_{n-1}, \ldots, \ell_{n-p+2}$ are all positive. Each $U_{N,s}$ and $\Theta_{N,s}$ may be expressed, as in equations (8.5.3), (8.5.4), as a product of terms involving $m_j, k_j^{-1}$, which are all positive, and terms involving $\ell_n, \ell_{n-1}, \ldots, \ell_{n-p+2}$ which are positive by hypothesis. Each such $\Theta_{N,s}$, $U_{N,s}$ is thus positive. Therefore, equation (8.6.13) shows that $\ell_{n-p+1} > 0$. But $\ell_n > 0$, so that all $\ell_j$ are positive.
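The stripping steps i) to vii) of this section can be tried on a small example by a round trip: build a beam, compute its modal end data, then strip it back. The sketch below is a minimal illustration under stated assumptions, not the book's program. It assumes $\mathbf{E}$ is the upper bidiagonal matrix with 1 on the diagonal and $-1$ above it, builds $\mathbf{K} = \mathbf{E}\mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T$, and uses a plain cyclic Jacobi eigen-solver (a substituted technique, chosen only to keep the example free of external libraries). All function names and the test values are hypothetical.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def diag(v):
    return [[v[i] if i == j else 0.0 for j in range(len(v))] for i in range(len(v))]


def beam_stiffness(ell, k):
    """K = F Khat F^T with F = E L^-1 E (assumed form of E: 1 on, -1 above the diagonal)."""
    n = len(ell)
    E = [[1.0 if j == i else -1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
    F = matmul(matmul(E, diag([1.0 / x for x in ell])), E)
    return matmul(matmul(F, diag(k)), [list(col) for col in zip(*F)])


def jacobi_eigh(A, sweeps=60):
    """Cyclic Jacobi rotations for a small symmetric matrix; columns of V are eigenvectors."""
    n = len(A)
    A = [row[:] for row in A]
    V = diag([1.0] * n)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                gap = A[q][q] - A[p][p]
                ang = math.pi / 4 if gap == 0.0 else 0.5 * math.atan(2.0 * A[p][q] / gap)
                c, s = math.cos(ang), math.sin(ang)
                for M in (A, V):                 # post-multiply A and V by the rotation
                    for r in range(n):
                        x, y = M[r][p], M[r][q]
                        M[r][p], M[r][q] = c * x - s * y, s * x + c * y
                for r in range(n):               # pre-multiply A by the transposed rotation
                    x, y = A[p][r], A[q][r]
                    A[p][r], A[q][r] = c * x - s * y, s * x + c * y
    return [A[i][i] for i in range(n)], V


def modal_data(m, ell, k):
    """Return (lambda_i, u_{n,i}, theta_{n,i}) with u_i^T M u_i = 1 and u_{n,i} > 0."""
    n = len(m)
    K = beam_stiffness(ell, k)
    d = [math.sqrt(x) for x in m]
    A = [[K[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]
    lam, V = jacobi_eigh(A)
    u_n, th_n = [], []
    for i in range(n):
        u = [V[r][i] / d[r] for r in range(n)]
        if u[-1] < 0:
            u = [-x for x in u]
        u_prev = u[-2] if n > 1 else 0.0
        u_n.append(u[-1])
        th_n.append((u[-1] - u_prev) / ell[-1])
    return lam, u_n, th_n


def strip_beam(lam, u_n, th_n):
    """Steps i)-vii): recover masses, lengths and stiffnesses from the end data."""
    n = len(lam)
    u, th = list(u_n), list(th_n)
    phi, tau = [0.0] * n, [0.0] * n
    m, ell, k = [0.0] * n, [0.0] * n, [0.0] * n
    for j in range(n - 1, -1, -1):
        m[j] = 1.0 / sum(x * x for x in u)                              # (8.6.4)
        ell[j] = 1.0 / (m[j] * sum(x * t for x, t in zip(u, th)))       # (8.6.6)
        phi = [f + m[j] * l * x for f, l, x in zip(phi, lam, u)]        # shear force
        tau = [t + ell[j] * f for t, f in zip(tau, phi)]                # bending moment
        k[j] = sum(t * t / l for t, l in zip(tau, lam))                 # (8.6.8)
        u = [x - ell[j] * t for x, t in zip(u, th)]                     # uses the old slopes
        th = [t - s / k[j] for t, s in zip(th, tau)]
    return m, ell, k


m, ell, k = [1.0, 2.0, 1.5], [1.0, 0.5, 2.0], [3.0, 1.0, 2.0]
print(strip_beam(*modal_data(m, ell, k)))
```

Because the recursion starts at the free end and works inward, rounding errors accumulate as the number of segments grows, so a sketch of this kind is only suitable for small systems.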
8.7
A numerical procedure for the inverse problem
The algorithm described in Section 8.6 has primarily theoretical value. It shows that if the data satisfy the conditions in Theorem 8.5.1, then the system parameters constructed by the algorithm will be positive. However, starting as it does from the free end and computing the successive model parameters, the algorithm suffers from the same kind of ill conditioning that was encountered in the inverse problem for the rod in Section 4.3. To obtain a reliable numerical procedure we use the Block Lanczos algorithm described in Section 5.5. To use this algorithm, we reduce the governing equation (8.2.1) to standard form
$$\mathbf{A}\mathbf{q} = \lambda\mathbf{q}$$
where
$$\mathbf{A} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}, \qquad \mathbf{M} = \mathbf{D}^2, \qquad \mathbf{q} = \mathbf{D}\mathbf{u}.$$
To apply the Block Lanczos algorithm to the pentadiagonal matrix $\mathbf{A}$ $(p = 2)$, we use the algorithm starting from the free end $(n)$ rather than the fixed end (1). Thus we need the vectors $\mathbf{x}_1, \mathbf{x}_2$ containing the $n$th and $(n-1)$st terms of the normalised eigenvectors of $\mathbf{A}$:
$$\mathbf{x}_1 = \{q_{n,1}, q_{n,2}, \ldots, q_{n,n}\}, \qquad \mathbf{x}_2 = \{q_{n-1,1}, q_{n-1,2}, \ldots, q_{n-1,n}\}.$$
Now
$$q_{n,i} = d_n u_{n,i}, \qquad q_{n-1,i} = d_{n-1}u_{n-1,i} = d_{n-1}\{u_{n,i} - \ell_n\theta_{n,i}\}.$$
Equation (8.6.5) gives $m_n$, and $d_n = m_n^{1/2}$. Equation (8.6.6) gives $\ell_n$ and hence $u_{n-1,i}$, and then equation (8.6.4) with $j = n-1$ gives $m_{n-1}$. Thus the data $(\lambda_i, u_{n,i}, \theta_{n,i})_1^n$ give the vectors $\mathbf{x}_1, \mathbf{x}_2$ which are needed for the Block Lanczos algorithm.

Now suppose that we have computed $\mathbf{A} = \mathbf{D}^{-1}\mathbf{K}\mathbf{D}^{-1}$ from the Block Lanczos algorithm. We must now untangle $\mathbf{A}$ to give $\mathbf{K}$ and $\mathbf{M}$. We do this much as we did for the rod, in Section 4.4: we use the static behaviour of the system, as we did in Section 7.5. First, we apply external static forces $f_1, f_2$ to masses 1 and 2, and deform the system as shown in Figure 8.7.1.
Figure 8.7.1 - Two static forces are needed to deflect all the masses by the same amount

For this configuration $\mathbf{u} = \{1, 1, \ldots, 1\}$, so that $\mathbf{q} = \{d_1, d_2, \ldots, d_n\}$. The static equation is
$$\mathbf{K}\mathbf{u} = \mathbf{f} = \{f_1, f_2, 0, \ldots, 0\}$$
i.e., $\mathbf{D}\mathbf{A}\mathbf{D}\mathbf{u} = \mathbf{f}$ or
$$\mathbf{A}\mathbf{d} = \{d_1^{-1}f_1,\, d_2^{-1}f_2,\, 0, \ldots, 0\}.$$
Consider this equation. We know $\mathbf{A}$, and we know the last two components $d_{n-1}, d_n$. But $\mathbf{A}$ is pentadiagonal so that, knowing $d_{n-1}, d_n$, we can compute $d_{n-2}, \ldots, d_1$, and find $d_1^{-1}f_1, d_2^{-1}f_2$ and hence $f_1, f_2$.

Having found the masses $(m_j = d_j^2)$, we find the lengths. We apply a single force $k_1\ell_1^{-1}$ at $m_1$ and find
$$\mathbf{u}' = \{\ell_1,\, \ell_1 + \ell_2,\, \ldots,\, \ell_1 + \ell_2 + \cdots + \ell_n\}$$
as shown in Figure 8.7.2.

Figure 8.7.2 - One static force will deflect the beam as a straight line

Now the equation
$$\mathbf{K}\mathbf{u}' = \{k_1\ell_1^{-1}, 0, \ldots, 0\}$$
yields
$$\mathbf{A}\{d_1\ell_1,\, d_2(\ell_1 + \ell_2),\, \ldots,\, d_n(\ell_1 + \cdots + \ell_n)\} = \{d_1^{-1}k_1\ell_1^{-1}, 0, \ldots, 0\}.$$
This means that if we invert the equation
$$\mathbf{A}\mathbf{x} = \{1, 0, \ldots, 0\}$$
we will find
$$d_i(\ell_1 + \ell_2 + \cdots + \ell_i) = c\,x_i, \qquad c = d_1^{-1}k_1\ell_1^{-1}. \tag{8.7.1}$$
This yields the $\ell_i$, and the theory of Section 8.6 shows that they will all be positive if the data satisfy the conditions of Theorem 8.6.1.

The last step is to find the $k_i$. Using the form of $\mathbf{E}^{-1}$ in equation (2.2.10), we may write equation (8.7.1) as
$$\mathbf{A}\mathbf{D}\mathbf{E}^{-T}\mathbf{L}\{1, 1, \ldots, 1\} = \{d_1^{-1}k_1\ell_1^{-1}, 0, \ldots, 0\}$$
i.e.,
$$\mathbf{L}\mathbf{E}^{-1}\mathbf{D}\mathbf{A}\mathbf{D}\mathbf{E}^{-T}\mathbf{L}\{1, 1, \ldots, 1\} = \{k_1, 0, \ldots, 0\}$$
and then as in Section 4.4, we deduce that
$$\mathbf{L}\mathbf{E}^{-1}\mathbf{D}\mathbf{A}\mathbf{D}\mathbf{E}^{-T}\mathbf{L} = \mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T$$
which gives $\hat{\mathbf{K}}$. The reconstruction is complete.
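The first untangling step, recovering the masses from $\mathbf{A}$, amounts to a backward recursion on the rows of $\mathbf{A}\mathbf{d}$ whose right-hand side is zero. The sketch below illustrates this under stated assumptions: $\mathbf{E}$ is taken as the upper bidiagonal matrix with 1 on the diagonal and $-1$ above it, the matrix $\mathbf{A}$ is built directly from a known beam rather than by the Block Lanczos algorithm, and the function names and numerical values are hypothetical.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def diag(v):
    return [[v[i] if i == j else 0.0 for j in range(len(v))] for i in range(len(v))]


def reduced_stiffness(m, ell, k):
    """A = D^-1 K D^-1 with K = E L^-1 E Khat E^T L^-1 E^T and D = diag(sqrt(m_j))."""
    n = len(m)
    E = [[1.0 if j == i else -1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
    F = matmul(matmul(E, diag([1.0 / x for x in ell])), E)
    K = matmul(matmul(F, diag(k)), [list(col) for col in zip(*F)])
    d = [math.sqrt(x) for x in m]
    return [[K[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]


def masses_from_A(A, d_last_two):
    """Recover m_j = d_j^2 from pentadiagonal A and the known d_{n-1}, d_n.

    Rows 3..n of A d have zero right-hand side, so row i fixes d_{i-2}
    once d_{i-1}, ..., d_{i+2} are known (0-based indices below).
    """
    n = len(A)
    d = [0.0] * n
    d[n - 2], d[n - 1] = d_last_two
    for i in range(n - 1, 1, -1):
        known = sum(A[i][j] * d[j] for j in range(i - 1, min(n, i + 3)))
        d[i - 2] = -known / A[i][i - 2]
    return [x * x for x in d]


m = [1.0, 2.0, 1.5, 0.5]
ell = [1.0, 0.5, 2.0, 1.0]
k = [3.0, 1.0, 2.0, 4.0]
A = reduced_stiffness(m, ell, k)
print(masses_from_A(A, (math.sqrt(m[-2]), math.sqrt(m[-1]))))
```

The recursion divides by the outermost sub-diagonal entry of $\mathbf{A}$ at each step, which is non-zero for a beam with positive parameters; the lengths and stiffnesses would then follow from (8.7.1) and the final factorisation.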
Chapter 9
Discrete Modes and Nodes

Memory is necessary for all the operations of reason.
Pascal's Pensées

9.1
Introduction

The emphasis in all the preceding chapters has been on eigenvalues, and on reconstructing a system from eigenvalue data. In this chapter we turn our attention to eigenvectors. In Sections 9.2, 9.3 we consider the question of constructing a Jacobi matrix that has one or more given eigenvectors, and then go on to constructing a spring mass system from such data. In Section 9.4 we comment on the more difficult problems of constructing a discrete vibrating beam from eigenmode data.

Up to this point, all the systems are basically in-line systems, so that the underlying matrices are band matrices, and either oscillatory or sign-oscillatory. In the remaining sections, we widen our study and see what can be said about eigenvectors and their signs, i.e., about modes and nodes, for the equation
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{u} = \mathbf{0}, \tag{9.1.1}$$
where $\mathbf{K}, \mathbf{M}$ relate to some simple 2-D and 3-D systems, specifically membranes and acoustic cavities. We do not yet have any results about constructing $\mathbf{M}, \mathbf{K}$ from eigenvector data in this case; the properties of the eigenvectors do however provide necessary conditions on the eigenvector data for the masses and stiffnesses of the underlying system to be positive. Note that in the 2-D and 3-D problems, we will use $N$ to denote the order of the system, and $n$ to label a particular eigenvalue.

9.2
The inverse mode problem for a Jacobi matrix
In this section we consider the problem of constructing a Jacobi matrix that has one or more specified eigenvectors. Following Vijay (1972) [329], Gladwell (1986c) [109] we prove

Theorem 9.2.1 The vector u is an eigenvector of a Jacobi matrix iff $S_u^+ = S_u^-$.

Proof. We recall the definitions of $S_u^+$, $S_u^-$ from Section 6.9. The necessity, i.e., only if, follows from Theorem 6.10.2. To prove sufficiency, i.e., if, we need to show first that if $S_u^+ = S_u^-$ then we can find $(a_i)_1^n > 0$, $(b_i)_1^{n-1} > 0$, such that

$(a_1 - \lambda)u_1 - b_1 u_2 = 0,$
$-b_{i-1}u_{i-1} + (a_i - \lambda)u_i - b_i u_{i+1} = 0, \quad i = 2, 3, \ldots, n-1,$
$-b_{n-1}u_{n-1} + (a_n - \lambda)u_n = 0.$ (9.2.1)

First, suppose that $(u_i)_1^n \neq 0$, then we may take $(b_i)_1^{n-1} = 1$, $a_i = \lambda + c_i$, $c_i = (u_{i-1} + u_{i+1})/u_i$, $i = 1, 2, \ldots, n$, where $u_0 = 0 = u_{n+1}$. Thus, the matrix

$$\mathbf{C} = \begin{bmatrix} c_1 & -1 \\ -1 & c_2 & -1 \\ & \ddots & \ddots & \ddots \\ & & \ddots & \ddots & -1 \\ & & & -1 & c_n \end{bmatrix}$$

satisfies $\mathbf{Cu} = \mathbf{0}$, and $\mathbf{A} = \lambda\mathbf{I} + \mathbf{C}$. The matrix C, having strictly negative codiagonal, will have distinct eigenvalues $(\nu_i)_1^n$, one of which will be zero because C is singular. The matrix A will have eigenvalues $(\lambda + \nu_i)_1^n$, so that if $\lambda$ is chosen so that

$\lambda \ge \max_{1 \le i \le n}(-\nu_i)$

then A, having non-negative eigenvalues, will be PSD; A will be a Jacobi matrix.

What happens when one of the $u_i$ is zero? The condition $S_u^+ = S_u^-$ implies $u_1 \neq 0$, $u_n \neq 0$. Suppose $u_m = 0$ for just one $m$ satisfying $1 < m < n$, then $u_{m-1}$, $u_{m+1}$ will be non-zero and have opposite signs, so that $u_{m-1}u_{m+1} < 0$. The $m$th line of equation (9.2.1) is

$b_{m-1}u_{m-1} + b_m u_{m+1} = 0$

so that $a_m$, $b_{m-1}$, $b_m$ may be taken so that

$a_m = \lambda, \quad b_{m-1} = 1, \quad b_m = -u_{m-1}/u_{m+1}.$

The remaining $b_i$ may be chosen so that

$(b_i)_1^{m-1} = 1, \quad (b_i)_m^{n-1} = b_m$
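and then $a_i = \lambda + c_i$, $c_m = 0$, $c_i = b_i(u_{i-1} + u_{i+1})/u_i$, $i \neq m$, and again $u_0 = 0 = u_{n+1}$. Now we construct

$$\mathbf{C} = \begin{bmatrix} c_1 & -b_1 \\ -b_1 & c_2 & -b_2 \\ & \ddots & \ddots & \ddots \\ & & \ddots & \ddots & -b_{n-1} \\ & & & -b_{n-1} & c_n \end{bmatrix}$$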
which satisfies $\mathbf{Cu} = \mathbf{0}$. Now $\mathbf{A} = \lambda\mathbf{I} + \mathbf{C}$, where $\lambda$ is chosen as before. This argument may easily be generalised to the case when two or more (non-consecutive) $u_i$ are zero.

The next Theorem relates to two given vectors.

Theorem 9.2.2 Suppose $\mathbf{u}, \mathbf{v} \in V_n$, and define $s_i$, $t_i$ as in equation (3.3.6). The necessary and sufficient conditions for u, v to be eigenvectors of a Jacobi matrix corresponding to two eigenvalues $\lambda$, $\mu$, unspecified apart from the ordering $\lambda < \mu$, are
(a) $S_u^+ = S_u^-$, $S_v^+ = S_v^-$
(b) $s_n = 0$
(c) either $s_i = 0 = t_i$ or $s_i t_i > 0$ for $i = 1, 2, \ldots, n$.

Proof. The conditions are necessary, for Corollary 6.10.2 yields (a). The orthogonality condition $\mathbf{u}^T\mathbf{v} = 0$ yields (b), while equation (3.3.8) yields (c). Note that (a) implies that $u_1$, $v_1$ are not zero, so that $s_1 = u_1 v_1 \neq 0$. Hence, $s_1 t_1 > 0$. Also, $s_n = 0$ implies $s_{n-1} = -u_n v_n$; again (a) implies that $u_n$, $v_n$ are not zero so that $s_{n-1}t_{n-1} > 0$. Without loss of generality, we may take $u_1 > 0$, $v_1 > 0$.

The conditions are interesting because they imply that v has more sign changes than u, i.e., $S_v > S_u$. To see this, we argue as in Theorems 3.3.2, 3.3.3. First, suppose that the first zero of the u-line is $\xi_1(\lambda) = x$, and of the v-line, $\xi_1(\mu)$. We prove $\xi_1(\mu) < \xi_1(\lambda)$. Suppose if possible that $\xi_1(\mu) \ge \xi_1(\lambda) = x$, and that $q < \xi_1(\lambda) \le q + 1$ $(1 \le q < n - 1)$, then all $(u_i)_1^q$ and $(v_i)_1^q$ will be positive, while

$(q + 1 - x)u_q + (x - q)u_{q+1} = 0,$
$(q + 1 - x)v_q + (x - q)v_{q+1} \ge 0,$

which imply $t_q \le 0$. On the other hand, $s_q > 0$, which, when used with (3.3.8), provides a contradiction.
Now we show that there is a zero of the v-line between any two consecutive nodes of the u-line. Let $\xi$, $\eta$ $(\xi < \eta)$ be two neighbouring nodes of the u-line and suppose that

$p - 1 \le \xi < p, \quad q < \eta \le q + 1 \quad (p \le q)$

so that

$(p - \xi)u_{p-1} + (\xi - p + 1)u_p = 0$ (9.2.2)

$(q + 1 - \eta)u_q + (\eta - q)u_{q+1} = 0$ (9.2.3)

and $u_p, u_{p+1}, \ldots, u_q$ have the same sign, say positive. Suppose the v-line had no zero in $(\xi, \eta)$, and without loss of generality, were positive there. Then $v_p, v_{p+1}, \ldots, v_q$ would be all positive, while

$(p - \xi)v_{p-1} + (\xi - p + 1)v_p \ge 0$ (9.2.4)

$(q + 1 - \eta)v_q + (\eta - q)v_{q+1} \ge 0.$ (9.2.5)

On eliminating $\xi$ between (9.2.2), (9.2.4), and $\eta$ between (9.2.3), (9.2.5), we find $t_{p-1} \ge 0$, $t_q \le 0$, which, with (c) imply $s_{p-1} \ge 0$, $s_q \le 0$ and therefore $s_q - s_{p-1} \le 0$. But $s_q - s_{p-1} = \sum_{i=p}^{q} u_i v_i > 0$, a contradiction. We can show similarly (Ex. 9.2.1) that the v-line has a node to the right of the last node of the u-line: the v-line has more nodes than the u-line.

Now we proceed to the construction. First, suppose that $s_i t_i > 0$ for $i = 1, 2, \ldots, n - 1$, then equations (3.3.1), (3.3.6), (3.3.8) show that

$a_i = \lambda + (\mu - \lambda)\left\{ \frac{v_i u_{i+1}}{t_i} + s_{i-1}\frac{(v_{i-1}u_{i+1} - u_{i-1}v_{i+1})}{t_{i-1}t_i} \right\}$

$b_i = (\mu - \lambda)(s_i/t_i),$

where $i = 2, \ldots, n - 1$ in the first formula, $i = 1, \ldots, n - 1$ in the second. The two remaining quantities $a_1$, $a_n$ are given by

$a_1 = \lambda + (\mu - \lambda)\frac{s_1 u_2}{t_1 u_1}, \quad a_n = \lambda + (\mu - \lambda)\frac{s_{n-1}u_{n-1}}{t_{n-1}u_n}.$

We may write these equations in the form

$a_i = \lambda + (\mu - \lambda)c_i, \quad b_i = (\mu - \lambda)d_i.$

We note that the $b_i$ are positive. Now $\mathbf{A} = \lambda\mathbf{I} + (\mu - \lambda)\mathbf{C}$ where

$$\mathbf{C} = \begin{bmatrix} c_1 & -d_1 \\ -d_1 & c_2 & \ddots \\ & \ddots & \ddots & -d_{n-1} \\ & & -d_{n-1} & c_n \end{bmatrix}.$$
Thus C, having non-zero codiagonal, will have distinct eigenvalues $(\nu_i)_1^n$. Thus A will have eigenvalues $\lambda + (\mu - \lambda)\nu_i$, and A will be PSD if $\lambda + (\mu - \lambda)\min(\nu_i) \ge 0$. The slight modifications to the argument which must be made if an $s_i$ is zero, are left to the exercises.

Exercises 9.2
1. Show that the conditions (a), (b), (c) of Theorem 9.2.2 imply that the v-line will have a node to the right of the last node of the u-line.
2. Show that if two consecutive $s_i$ are zero, i.e., $s_{m-1} = 0 = s_m$ $(2 \le m \le n - 2)$ then $u_m = 0 = v_m$, and deduce that three consecutive $s_i$ cannot be zero.
3. Show that if $s_m = 0$ but $s_{m-1} \neq 0$, then $b_m$ may be chosen arbitrarily. Find a replacement for $a_m$.
4. Modify the argument to cover the case $s_{m-1} = 0 = s_m$.
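The construction used in the proof of Theorem 9.2.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's code: the function name `jacobi_from_two_vectors` is hypothetical, and since equation (3.3.6) is not reproduced in this section the sketch assumes $s_i = \sum_{k=1}^{i} u_k v_k$ and $t_i = u_{i+1}v_i - u_i v_{i+1}$, the form consistent with $b_i = (\mu - \lambda)s_i/t_i$. It covers only the case $s_i t_i > 0$ and does not test whether the resulting matrix is PSD.

```python
import math


def jacobi_from_two_vectors(u, v, lam, mu):
    """Return (a, b): the diagonal a_1..a_n and codiagonal b_1..b_{n-1}."""
    n = len(u)
    # Partial sums s_i = u_1 v_1 + ... + u_i v_i (s[i] holds s_{i+1}).
    s, total = [], 0.0
    for uk, vk in zip(u, v):
        total += uk * vk
        s.append(total)
    # Assumed form of t_i, chosen so that (mu - lam) * s_i = b_i * t_i.
    t = [u[i + 1] * v[i] - u[i] * v[i + 1] for i in range(n - 1)]
    if any(s[i] * t[i] <= 0 for i in range(n - 1)):
        raise ValueError("this sketch needs s_i * t_i > 0 for i = 1..n-1")
    d = mu - lam
    b = [d * s[i] / t[i] for i in range(n - 1)]
    a = [0.0] * n
    a[0] = lam + d * s[0] * u[1] / (t[0] * u[0])
    for i in range(1, n - 1):
        a[i] = lam + d * (
            v[i] * u[i + 1] / t[i]
            + s[i - 1] * (v[i - 1] * u[i + 1] - u[i - 1] * v[i + 1])
            / (t[i - 1] * t[i])
        )
    a[n - 1] = lam + d * s[n - 2] * u[n - 2] / (t[n - 2] * u[n - 1])
    return a, b


# Test data: two eigenvectors of the 3 x 3 matrix with diagonal 2, codiagonal -1.
r2 = math.sqrt(2.0)
a, b = jacobi_from_two_vectors([1.0, r2, 1.0], [1.0, 0.0, -1.0], 2.0 - r2, 2.0)
```

For these test vectors the sketch returns, up to rounding, the diagonal (2, 2, 2) and codiagonal (1, 1) of the matrix they were taken from.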
9.3 The inverse problem for a single mode of a spring-mass system

We recall from Section 2.2 that the eigenmodes $\mathbf{u}_j$ of the system of Figure 2.2.1 are the eigenvectors of the equation

$\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{u} = \lambda\mathbf{M}\mathbf{u}.$ (9.3.1)

The matrix $\mathbf{M}^{-1}(\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T)$ is sign-oscillatory (SO), so the analysis of Section 6.10 applies to the eigenvectors $\mathbf{u}_j$. (Note that $\mathbf{M}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T$ is not symmetric, but the analysis of SO and O matrices does not depend on symmetry.) Write $w_i = u_i - u_{i-1}$ so that, with $u_0 = 0$,

$\mathbf{w} = \mathbf{E}^T\mathbf{u}, \quad \mathbf{u} = \mathbf{E}^{-T}\mathbf{w}.$

Equation (9.3.1) may be written

$(\mathbf{E}^T\mathbf{M}^{-1}\mathbf{E})\hat{\mathbf{K}}\mathbf{w} = \lambda\mathbf{w}$

and again the matrix on the left is SO. This means that the vectors $\mathbf{w}_j$ will have the properties listed in Section 6.10 for the eigenvectors of an SO matrix. We first prove two theorems regarding the shape of the vector $\mathbf{u}_j$. The first is a simple analogue of the maximum principle which appears in elliptic equations.

Theorem 9.3.1 An eigenmode of (9.3.1) cannot have an interior negative maximum or an interior positive minimum.
Proof. Suppose $2 \le i \le n - 1$. The $i$th line of (9.3.1) is

$k_i w_i - k_{i+1}w_{i+1} = \lambda m_i u_i.$

Suppose u has a relative maximum at $u_i$. Then $u_i \ge u_{i-1}$, $u_i \ge u_{i+1}$, so that $w_i \ge 0$, $w_{i+1} \le 0$, and hence $u_i \ge 0$. In fact, since $w_i$, $w_{i+1}$ cannot be simultaneously zero, $u_i > 0$.

Theorem 9.3.2 Two neighbouring $u_i$ can be equal only at a relative maximum or minimum.

Proof. Suppose $u_i = u_{i-1}$, then $w_i = 0$, so that $w_{i-1}$, $w_{i+1}$ are non-zero and have opposite signs, i.e.,

$(u_{i-1} - u_{i-2})(u_{i+1} - u_i) < 0$

or equivalently

$(u_i - u_{i-2})(u_i - u_{i+1}) > 0.$

This implies that $u_i$ $(= u_{i-1})$ is either strictly greater or strictly less than its neighbours $u_{i-2}$ and $u_{i+1}$: there is a relative maximum or minimum at $u_i$.

The theorems show (Ex. 9.3.1) that $\mathbf{u}_j$ will have $j - 1$ portions which bend toward the axis, and a final portion which bends away from the axis, as shown in Figure 9.3.1.

Figure 9.3.1 - The jth mode of a spring-mass system

Theorem 9.3.3 The necessary and sufficient conditions for u to be the $j$th mode of a spring-mass system in the fixed-free configuration are that
(a) $S_u^+ = S_u^- = S_w^+ = S_w^- = j - 1$,
(b) $u_1 w_1 > 0$.

Proof. The necessity of these conditions has already been established. To prove sufficiency, we first note that no two of $u_i$, $u_{i-1}$, $w_i$ can be simultaneously zero; now we construct a system.
The mode will have a shape like that shown in Figure 9.3.1. Thus $u_i$ will start positive, and will increase ($u_i > 0$, $w_i > 0$), until an index $r$, the first for which

$u_r > 0, \quad w_r \ge 0, \quad w_{r+1} < 0.$

Then $u_i$ will decrease ($u_i > 0$, $w_i < 0$) until an index $s$, the first for which

$u_s \ge 0, \quad u_{s+1} < 0, \quad w_s < 0.$

Now $u_i$ will continue to decrease ($u_i < 0$, $w_i < 0$) until an index $t$, the first for which

$u_t < 0, \quad w_t \le 0, \quad w_{t+1} > 0$

and then proceed to increase again. The governing equation (9.3.1) may be written

$\hat{\mathbf{K}}\mathbf{w} = \lambda\mathbf{E}^{-1}\mathbf{M}\mathbf{u}.$

Since $\mathbf{E}^{-1}$ is given by equation (2.2.10), we have

$k_i w_i = \lambda\sum_{k=i}^{n} m_k u_k = \lambda\sigma_i.$ (9.3.2)

This shows that we should take the $u_i$, and choose $m_i$ so that $w_i$ and $\sigma_i$ have the same sign. For the construction, we must choose $(m_i)_1^n > 0$ so that the following conditions hold:
(i) $\sigma_r \ge 0$, with $\sigma_r = 0$ iff $w_r = 0$; then $\sigma_i = \sigma_r + \sum_{k=i}^{r-1} m_k u_k > 0$
(ii) $\sigma_{r+1} < 0$; then $\sigma_i < 0$ for $i = r + 1, \ldots, s$
(iii) $\sigma_t \le 0$, with $\sigma_t = 0$ iff $w_t = 0$; then $\sigma_i < 0$ for $i = s + 1, \ldots, t - 1$
(iv) $\sigma_{t+1} > 0$, and so on.

Finding $(m_i)_1^n$ with these properties is essentially a linear programming problem. It yields a set of $\sigma_i$ having the same sign as $w_i$. If $w_i \neq 0$, then $k_i$ is given by equation (9.3.2), while if a particular $w_i$ is zero, $k_i$ may be given an arbitrary positive value.

The question of reconstructing a spring-mass system from modal data was considered by Porter (1970) [267], Porter (1971) [268], but he did not discuss the necessary or sufficient conditions on the modes for the masses and stiffnesses to be positive.
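Once a set of positive masses has been chosen, the remaining step of the construction follows directly from (9.3.2). The sketch below is illustrative only: the function name `springs_from_mode` and the value 1.0 used for an arbitrary stiffness are assumptions, and the code does not solve the linear programming problem of choosing the masses; it only checks a given choice and returns the stiffnesses.

```python
def springs_from_mode(u, m, lam):
    """Given a mode u, trial masses m > 0 and eigenvalue lam > 0,
    return stiffnesses k from k_i w_i = lam * sigma_i (equation (9.3.2))."""
    n = len(u)
    # w = E^T u: first differences, with u_0 = 0.
    w = [u[0]] + [u[i] - u[i - 1] for i in range(1, n)]
    # sigma_i = sum_{k=i}^{n} m_k u_k, accumulated from the free end.
    sigma = [0.0] * n
    running = 0.0
    for i in range(n - 1, -1, -1):
        running += m[i] * u[i]
        sigma[i] = running
    k = []
    for wi, si in zip(w, sigma):
        if wi == 0 and si == 0:
            k.append(1.0)  # arbitrary positive value (assumed choice)
        elif wi * si > 0:
            k.append(lam * si / wi)
        else:
            raise ValueError("masses do not give sigma_i the sign of w_i")
    return k


# First-mode example: every u_i and w_i is positive, so any positive masses work.
u = [1.0, 3.0, 6.0]
m = [1.0, 1.0, 1.0]
lam = 1.0
k = springs_from_mode(u, m, lam)  # [10.0, 4.5, 2.0]

# Residuals of k_i w_i - k_{i+1} w_{i+1} = lam m_i u_i, with k_{n+1} w_{n+1} = 0.
w = [u[0]] + [u[i] - u[i - 1] for i in range(1, len(u))]
kw = [ki * wi for ki, wi in zip(k, w)] + [0.0]
residuals = [kw[i] - kw[i + 1] - lam * m[i] * u[i] for i in range(len(u))]
```

For a higher mode the same function can be used to test candidate masses; a `ValueError` signals that the candidate violates conditions (i)-(iv).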
Exercises 9.3
1. Show that the $j$th mode of a fixed-free spring-mass system will have $j - 1$ portions which bend toward the axis, and a final portion which bends away from the axis.
2. Construct a spring-mass system with 7 masses that has third mode $\mathbf{u} = \{1, 2, 1, -1, -2, -1, 1\}$.
9.4 The reconstruction of a spring-mass system from two modes
The construction described in Section 9.3 is far from unique. In this section, following Gladwell (1986c) [109], we shall show that, provided certain conditions are satisfied, there is essentially a unique system for which two given modes are eigenmodes.

We first provide a counterexample to show that even if u, v separately satisfy the conditions of Theorem 9.3.3, there may be no system for which they are both eigenmodes, corresponding to two eigenvalues $\lambda$, $\mu$, respectively, with $\lambda < \mu$. Write $\mathbf{w} = \mathbf{E}^T\mathbf{u}$, $\mathbf{z} = \mathbf{E}^T\mathbf{v}$ and suppose

$\mathbf{u} = \{1, 3, 6\}, \quad \mathbf{w} = \{1, 2, 3\}, \quad \mathbf{v} = \{1, -1, 4\}, \quad \mathbf{z} = \{1, -2, 5\}.$

The governing equations are

$\lambda m_1 = k_1 - 2k_2, \quad 3\lambda m_2 = 2k_2 - 3k_3, \quad 6\lambda m_3 = 3k_3,$
$\mu m_1 = k_1 + 2k_2, \quad -\mu m_2 = -2k_2 - 5k_3, \quad 4\mu m_3 = 5k_3,$

so that

$\frac{\mu}{3\lambda} = \frac{5}{6} = \frac{2k_2 + 5k_3}{2k_2 - 3k_3}$

i.e., $2k_2 = -45k_3$, which is unrealizable.

In order to derive the conditions on the modes, we formalize the elimination procedure used in this counterexample. The recurrence relations are

$\lambda m_i u_i = k_i w_i - k_{i+1}w_{i+1}, \quad i = 1, 2, \ldots, n - 1$ (9.4.1)

$\mu m_i v_i = k_i z_i - k_{i+1}z_{i+1}, \quad i = 1, 2, \ldots, n - 1$ (9.4.2)

and

$\lambda m_n u_n = k_n w_n, \quad \mu m_n v_n = k_n z_n.$ (9.4.3)

Thus,

$\frac{\lambda}{\mu} = \frac{w_n}{u_n} \cdot \frac{v_n}{z_n}.$ (9.4.4)

We know that one of the conditions will have to be $S_u^+ = S_u^- = S_w^+ = S_w^-$, and correspondingly $S_v^+ = S_v^- = S_z^+ = S_z^-$. These will entail that $u_n$, $v_n$, $w_n$, $z_n$ will all be non-zero and may be chosen to have the same sign, say positive. The condition $\mu > \lambda$ then demands

$u_n z_n - v_n w_n > 0.$ (9.4.5)
Eliminating $k_i$, $k_{i+1}$ in turn from equations (9.4.1), (9.4.2) we find

$m_i(\lambda u_i z_i - \mu v_i w_i) = k_{i+1}(w_i z_{i+1} - w_{i+1}z_i)$
$m_i(\lambda u_i z_{i+1} - \mu v_i w_{i+1}) = k_i(w_i z_{i+1} - w_{i+1}z_i)$

so that on substituting $\lambda/\mu$ from (9.4.4) we find

$\lambda m_i p_i = k_{i+1}w_n v_n r_i, \quad \lambda m_i q_i = k_i w_n v_n r_i$ (9.4.6)

where

$p_i = u_i v_n w_n z_i - u_n v_i w_i z_n$
$q_i = u_i v_n w_n z_{i+1} - u_n v_i w_{i+1} z_n$
$r_i = w_i z_{i+1} - w_{i+1}z_i.$

Thus we may state

Theorem 9.4.1 The necessary and sufficient conditions for u, v to be eigenmodes of a (fixed-free) spring-mass system for some eigenvalues $\lambda$, $\mu$ $(\lambda < \mu)$, are
a) $S_u = S_w < S_v = S_z$
b) $v_n w_n > 0$
c) $u_n z_n - v_n w_n > 0$
d) for each $i$, $1 \le i \le n - 1$, the three quantities $p_i$, $q_i$, $r_i$ have the same strict sign or are all identically zero; this sign need not be the same for all $i$.

Proof. The necessity of the conditions has already been demonstrated. If the conditions hold, and none of the triplets is zero, then equations (9.4.6), for $i = 1, 2, \ldots, n - 1$, give the $2(n - 1)$ ratios

$m_1/k_1, m_1/k_2; \; m_2/k_2, m_2/k_3; \; \ldots; \; m_{n-1}/k_{n-1}, m_{n-1}/k_n.$

The final equations (9.4.3), (9.4.4) are left for the ratios $m_n/k_n$ and $\lambda/\mu$. Thus if we choose say $\lambda$ and $m_n$ then the system is uniquely determined. If a triplet $p_k$, $q_k$, $r_k$ is identically zero then $m_k$, $k_k$ may be chosen arbitrarily (positive). We note (Ex. 9.4.1) that the conditions a)-c) preclude the triples $p_1, q_1, r_1$, or $p_{n-1}, q_{n-1}, r_{n-1}$ from being zero.

In the particular case in which the eigenvalues are consecutive, the conditions may be made sharper, to give

Theorem 9.4.2 The necessary and sufficient conditions for u, v to be eigenmodes corresponding to consecutive eigenvalues of the spring-mass system are that
a) $v_n w_n > 0$
b) $u_n z_n - v_n w_n > 0$
c) $(p_i, q_i, r_i)_1^{n-1} > 0$.

Proof. The necessity of a) and b) follow from (9.4.4) and (9.4.5). The necessity of $(r_i)_1^{n-1} > 0$ is established in Gladwell (1985a) [109]; equation (9.4.6) then shows that $(p_i, q_i)_1^{n-1} > 0$. The sufficiency of the conditions follows as before.

Exercises 9.4
1. Show that conditions a)-c) of Theorem 9.4.1 imply $p_1 < 0$. Show also that the assumption $(p_{n-1}, q_{n-1}, r_{n-1}) = 0$ leads to a contradiction.
2. Construct a spring-mass system with first and second modes $\mathbf{u} = \{1, 3, 6, 10, 15\}$, $\mathbf{v} = \{1, 4, 2, 1, 5\}$.
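The elimination behind Theorem 9.4.1 lends itself to a short numerical check. The following Python sketch is an illustration under stated assumptions rather than the author's algorithm: the helper names (`differences`, `triplets`, `same_strict_sign`, `reconstruct`) are hypothetical, the zero-triplet case is not handled, and conditions a)-c) are not tested. It evaluates $p_i$, $q_i$, $r_i$, and, given $\lambda$ and $m_n$, marches back from the free end using (9.4.3) and (9.4.6). The test modes are the first two modes of a chain of three unit masses and unit springs, with v scaled so that $v_n > 0$.

```python
import math


def differences(u):
    """w = E^T u: w_1 = u_1, w_i = u_i - u_{i-1}."""
    return [u[0]] + [u[i] - u[i - 1] for i in range(1, len(u))]


def triplets(u, v):
    """Return [(p_i, q_i, r_i)] for i = 1..n-1 as defined after (9.4.6)."""
    w, z = differences(u), differences(v)
    c = v[-1] * w[-1]  # the common factor v_n w_n
    out = []
    for i in range(len(u) - 1):
        p = u[i] * c * z[i] - u[-1] * v[i] * w[i] * z[-1]
        q = u[i] * c * z[i + 1] - u[-1] * v[i] * w[i + 1] * z[-1]
        r = w[i] * z[i + 1] - w[i + 1] * z[i]
        out.append((p, q, r))
    return out


def same_strict_sign(triple):
    return all(x > 0 for x in triple) or all(x < 0 for x in triple)


def reconstruct(u, v, lam, m_n):
    """Masses and stiffnesses from two modes, given lam and the last mass."""
    n = len(u)
    w = differences(u)
    c = v[-1] * w[-1]
    m, k = [0.0] * n, [0.0] * n
    m[-1] = m_n
    k[-1] = lam * m_n * u[-1] / w[-1]          # from (9.4.3)
    pqr = triplets(u, v)
    for i in range(n - 2, -1, -1):
        p, q, r = pqr[i]
        m[i] = k[i + 1] * c * r / (lam * p)    # first equation of (9.4.6)
        k[i] = lam * m[i] * q / (c * r)        # second equation of (9.4.6)
    return m, k


# The counterexample of this section: condition d) fails for i = 2.
bad = triplets([1.0, 3.0, 6.0], [1.0, -1.0, 4.0])
bad_ok = [same_strict_sign(t) for t in bad]

# Modes 1 and 2 of three unit masses joined by unit springs (fixed-free).
th = math.pi / 7.0
u = [math.sin(i * th) for i in (1, 2, 3)]
v = [-math.sin(3 * i * th) for i in (1, 2, 3)]   # sign chosen so v_n > 0
lam = 2.0 - 2.0 * math.cos(th)
m, k = reconstruct(u, v, lam, 1.0)
```

In this run `bad_ok` flags the second triplet of the counterexample as inconsistent, while `m` and `k` come back as the unit masses and springs of the test chain, up to rounding.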
9.5 The inverse mode problem for the vibrating beam

In this section, we consider the questions of whether and how we may construct a discrete model of a beam, as described in Section 2.3, from a single mode u. As could be expected, this question is considerably more difficult than the corresponding question for a rod. Since the question was definitively answered in Gladwell, Willms, He and Wang (1989) [115], we will merely state the principal results obtained there.

We recall that the eigenvalue problem for the cantilever beam may be obtained from equation (2.3.6):

$\mathbf{Ku} \equiv \mathbf{E}\mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{u} = \lambda\mathbf{Mu}.$ (9.5.1)

The matrix K is a pentadiagonal SO matrix, so that the eigenvalues are simple, and the eigenvector $\mathbf{u}_j = \mathbf{u}$ has sign count $S_u = j - 1$. As with the rod, we can easily show (Ex. 9.5.1) that

$\boldsymbol{\theta} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}, \quad \boldsymbol{\tau} = \hat{\mathbf{K}}\mathbf{E}^T\boldsymbol{\theta}, \quad \boldsymbol{\phi} = \mathbf{L}^{-1}\mathbf{E}\boldsymbol{\tau}$ (9.5.2)

are also eigenvectors of SO matrices, so that $S_\theta = S_\tau = S_\phi = j - 1$ also. We note that although $\boldsymbol{\theta}$ can be formed only when the lengths $l_i$ are known, $\boldsymbol{\theta}$ and the difference $\mathbf{E}^T\mathbf{u}$ will have the same sign count. In considering the construction problem we shall in fact assume that the $(l_i)_1^n$ are given, and seek to construct $(k_i, m_i)_1^n$. In order to find conditions that must be satisfied by the eigenmodes we need some preliminary results.
Lemma 9.5.1 If u is not identically zero, and $S_u^- = j - 1$, $(j \ge 1)$, then there is an index $k$ and indices $(q_i)_1^j$ such that $1 \le q_1 < q_2 < \cdots < q_j \le n$ and $(-)^{k+i-1}u_{q_i} > 0$ for $i = 1, 2, \ldots, j$. Conversely, if there exist $k$ and $(q_i)_1^j$ such that $(-)^{k+i-1}u_{q_i} > 0$ for $i = 1, 2, \ldots, j$, then $S_u^- \ge j - 1$.

Proof. Take $q_1$ as the index of the first non-zero $u_i$, and let $(-)^k = \mathrm{sign}(u_{q_1})$; then $(-)^k u_{q_1} > 0$. Take $q_2$ as the index of the first $u_i$ with sign opposite to $u_{q_1}$, then $(-)^{k+1}u_{q_2} > 0$, and so on. For example, in the sequence 0, 1*, 0, -4*, 2*, 0, 3, -5*, $S_u^- = 3$, so that $j = 4$ and the $q_i$ are the indices of the starred entries; that is, $(q_1, q_2, q_3, q_4) = (2, 4, 5, 8)$. If $S_u^- = j - 1$, then we can find $(q_i)_1^j$. Conversely, if we can find $(q_i)_1^j$, then $S_u^-$ must be at least $j - 1$. It may be that $S_u^-$ is even larger; in any case $S_u^- \ge j - 1$.

Lemma 9.5.2 If $\mathbf{v} = \mathbf{E}^T\mathbf{u}$, then $S_v^- \ge S_u^-$.

Proof. Note that $v_1 = u_1$, $v_2 = u_2 - u_1$, $\ldots$, $v_n = u_n - u_{n-1}$. Suppose that $S_u^- = j - 1$. Choose $k$ and $(q_i)_1^j$ as in Lemma 9.5.1. Then

$(-)^k v_{q_1} = (-)^k u_{q_1} > 0$
$(-)^{k+i-1}v_{q_i} = (-)^{k+i-1}(u_{q_i} - u_{q_i - 1}) \ge (-)^{k+i-1}u_{q_i} > 0, \quad i = 2, \ldots, j$

so that, by Lemma 9.5.1, $S_v^- \ge j - 1$.

Lemma 9.5.3 If $\mathbf{v} = \mathbf{E}^T\mathbf{u}$, then $S_v^+ \ge S_u^+$.

The proof, following similar lines to that of Lemma 9.5.2, is given in Gladwell, Willms, He and Wang (1989) [115]. We may now use these Lemmas to prove

Theorem 9.5.1 If $l_i > 0$, $i = 1, 2, \ldots, n$, $\mathbf{w} = \mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}$, and $S_u = S_w = j - 1$, then $S_\theta = j - 1$. In addition, if $m_i > 0$, $k_i > 0$, $i = 1, \ldots, n$, then $S_\phi = S_\tau = j - 1$.

Proof. We note that w has the same sign properties as $\boldsymbol{\tau}$ (see (9.5.2)). Now $\boldsymbol{\theta} = \mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}$, so that by Lemma 9.5.2, $S_\theta^- \ge S_u^- = j - 1$. On the other hand, $\mathbf{w} = \mathbf{E}^T\boldsymbol{\theta}$, so that, by Lemma 9.5.3, $S_\theta^+ \le S_w^+ = j - 1$. Therefore, $j - 1 \le S_\theta^- \le S_\theta^+ \le j - 1$, so that $S_\theta^- = S_\theta^+ = S_\theta = j - 1$. This proves the first part. Now consider the converse. Clearly Lemmas 9.5.2, 9.5.3 hold if $\mathbf{E}^T$ is replaced by $\mathbf{E}$ ($\mathbf{E}^T$ is the forward difference operator, $\mathbf{E}$ the backward operator). Since $\boldsymbol{\tau} = \hat{\mathbf{K}}\mathbf{w}$, we have $S_\tau = S_w$ if $(k_i)_1^n > 0$. Lemma 9.5.2 applied to $\boldsymbol{\phi} = \mathbf{L}^{-1}\mathbf{E}\boldsymbol{\tau}$ shows that $S_\phi^- \ge S_\tau^- = j - 1$. Lemma 9.5.3 applied to $\lambda\mathbf{Mu} = \mathbf{E}\boldsymbol{\phi}$ shows that $S_\phi^+ \le S_u^+ = j - 1$. Therefore, $j - 1 \le S_\phi^- \le S_\phi^+ \le j - 1$ so that $S_\phi = j - 1$.

Suppose that two vectors u, w are given. The necessary and sufficient conditions that they should be related in the sense $\mathbf{w} = \mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}$ for some positive diagonal L is that the vectors $\boldsymbol{\theta} = \mathbf{E}^{-T}\mathbf{w}$ and $\mathbf{v} = \mathbf{E}^T\mathbf{u}$ should be related by $\mathbf{v} = \mathbf{L}\boldsymbol{\theta}$. This means that $\theta_i = \sum_{k=1}^{i} w_k$ and $v_i = u_i - u_{i-1}$ must be positive,
zero or negative in step, that is $\theta_i v_i \ge 0$ with $\theta_i = 0$ iff $v_i = 0$. If $\theta_i \neq 0$ then $l_i = v_i/\theta_i$; if $\theta_i = 0$, then $l_i$ is arbitrary. If u, $\boldsymbol{\theta}$, w are so related, then Theorem 9.5.1 shows that $S_u = S_w = j - 1$ implies $S_\theta = j - 1$. We now state

Theorem 9.5.2 Let u, $\boldsymbol{\theta}$, w relate to the $j$th mode of the cantilever beam. Let $(q_i, r_i, s_i)_1^j$ be the sets of indices for u, $\boldsymbol{\theta}$, w respectively, as in Lemma 9.5.1. Then
(i) $q_{i-1} < r_i \le q_i$, $r_{i-1} < s_i \le r_i$, $i = 2, 3, \ldots, j$,
(ii) $s_i \le q_i$, $i = 2, 3, \ldots, j$ and $s_i \ge q_{i-2} + 2$, $i = 3, \ldots, j$,
(iii) if $u_{q_i - 1} = 0$, then $r_i < q_i$; if $\theta_{r_i - 1} = 0$, then $s_i < r_i$; in either of these cases, therefore, $s_i < q_i$,
(iv) if $w_{s_i - 1} = 0$, then $s_i > q_{i-2} + 2$, $i = 3, \ldots, j$.

Note: This theorem and Lemmas 9.5.2, 9.5.3 may be considered as codifications and extensions of a discrete form of Rolle's Theorem. They give precision to the intuitively obvious statement, that there must be at least one change of sign in the first differences $\boldsymbol{\theta}$, w (that is, the derivatives) between any changes of sign of u, $\boldsymbol{\theta}$ respectively. The formal proof is given in Gladwell, Willms, He and Wang (1989) [115]. We may now state

Theorem 9.5.3 Suppose that u and positive $(l_i)_1^n$ are given. The necessary and sufficient conditions for them to correspond to the $j$th mode of a cantilever beam are that $S_u = S_w = j - 1$, where

$\mathbf{w} = \mathbf{E}^T\mathbf{L}^{-1}\mathbf{E}^T\mathbf{u}.$

Proof. The conditions have already been shown to be necessary. We may prove that they are sufficient by actually constructing a set of $(k_i, m_i)_1^n$ which are all positive. The governing equation (9.5.1) may be written

$\lambda\mathbf{Mu} = \mathbf{E}\boldsymbol{\phi}, \quad \boldsymbol{\phi} = \mathbf{L}^{-1}\mathbf{E}\hat{\mathbf{K}}\mathbf{w}.$

We may write this as

$\hat{\mathbf{K}}\mathbf{w} = \mathbf{E}^{-1}\mathbf{L}\boldsymbol{\phi}, \quad \boldsymbol{\phi} = \lambda\mathbf{E}^{-1}\mathbf{Mu}$

and because $\mathbf{E}^{-1}$ has the form (2.2.10), we have

$k_i w_i = \sum_{k=i}^{n} l_k\phi_k = \tau_i, \quad \phi_i = \lambda\sum_{k=i}^{n} m_k u_k$

which imply

$\tau_i = \tau_{i+1} + l_i\phi_i, \quad \phi_i = \phi_{i+1} + \lambda m_i u_i, \quad i = 1, 2, \ldots, n,$ (9.5.3)
with $\phi_{n+1} = 0 = \tau_{n+1}$. We give the construction procedure for the simplest case: $j = 1$. Algorithms and examples relating to the general case may be found in Gladwell, Willms, He and Wang (1989) [115]. When $j = 1$ all the $(u_i, w_i)_1^n$ will be positive. The $(m_i)_1^n$ and $\lambda$ may be assigned arbitrary positive values; equation (9.5.3b) gives $(\phi_i)_1^n$ which, when substituted in (9.5.3a), yield $(\tau_i)_1^n$. Then $k_i = \tau_i/w_i$, so that the $(k_i)_1^n$ are uniquely determined.

Exercises 9.5
1. Show that if u is the $j$th eigenvector of (9.5.1), then $\boldsymbol{\theta}$, $\boldsymbol{\tau}$, $\boldsymbol{\phi}$ are also $j$th eigenvectors of SO matrices.
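The $j = 1$ procedure described above amounts to two backward recurrences followed by a division. The Python sketch below illustrates it under stated assumptions: the function name `beam_from_first_mode` is hypothetical, the operator $\mathbf{E}^T$ is taken to act as $(\mathbf{E}^T\mathbf{u})_i = u_i - u_{i-1}$ with $u_0 = 0$ (consistent with Lemma 9.5.2), and the general case $j > 1$ is not attempted.

```python
def beam_from_first_mode(u, lengths, m, lam):
    """Stiffnesses k_i of a cantilever beam model whose first mode is u,
    for given lengths l_i, trial masses m_i > 0 and eigenvalue lam > 0."""
    n = len(u)
    # theta = L^{-1} E^T u and w = E^T theta (first differences, zero start).
    theta = [(u[i] - (u[i - 1] if i else 0.0)) / lengths[i] for i in range(n)]
    w = [theta[i] - (theta[i - 1] if i else 0.0) for i in range(n)]
    if min(u) <= 0 or min(w) <= 0:
        raise ValueError("the j = 1 case needs all u_i and w_i positive")
    # Backward recurrences (9.5.3), started from phi_{n+1} = 0 = tau_{n+1}.
    phi = [0.0] * (n + 1)
    tau = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        phi[i] = phi[i + 1] + lam * m[i] * u[i]
        tau[i] = tau[i + 1] + lengths[i] * phi[i]
    return [tau[i] / w[i] for i in range(n)]


# Example data (chosen only for illustration): u_i = i^2 with unit lengths.
k = beam_from_first_mode([1.0, 4.0, 9.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], 1.0)
# theta = (1, 3, 5), w = (1, 2, 2), phi = (14, 13, 9), tau = (36, 22, 9)
```

With these data the sketch gives `k = [36.0, 11.0, 4.5]`, all positive as Theorem 9.5.3 requires for a first mode.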
9.6 Courant's nodal line theorem

We now start our discussion of the properties of eigenvectors of a class of systems that includes discrete models of membranes and acoustic cavities. Since the results we obtain are discrete analogues of results relating to continuous systems, we will start by discussing these, principally Courant's Nodal Line Theorem (CNLT), which relates to the Dirichlet eigenfunctions $u(\mathbf{x})$ of elliptic differential equations. It is well-known that such problems have positive eigenvalues with infinity as the only limit point; we label them so that

$0 < \lambda_1 \le \lambda_2 \le \cdots$ (9.6.1)

Now the eigenvalues need not be distinct. If $\lambda_n$ has multiplicity $r$ we label the eigenvalues so that

$\lambda_{n-1} < \lambda_n = \lambda_{n+1} = \cdots = \lambda_{n+r-1} < \lambda_{n+r}.$ (9.6.2)

CNLT (Courant and Hilbert (1953) [64], Chapter VI, Section 6) is a theorem of wide applicability with a remarkably simple proof based on the minimax property of the Rayleigh quotient. It relates to the Dirichlet eigenfunctions of elliptic partial differential equations, the simplest and most important of which is the Helmholtz equation

$\Delta u + \lambda\rho u = 0, \quad \mathbf{x} \in D.$ (9.6.3)

The Dirichlet boundary condition is

$u(\mathbf{x}) = 0, \quad \mathbf{x} \in \partial D.$ (9.6.4)

Here $\Delta u$ is the Laplacian, $\rho(\mathbf{x})$ is positive and bounded, and $D$ is a domain in $\mathbb{R}^m$ ($m$-dimensional Euclidean space). Equations (9.6.3), (9.6.4) govern the spatial eigenmodes of a vibrating membrane with fixed boundary in $\mathbb{R}^2$; and acoustic standing waves in $\mathbb{R}^3$.
The nodal set of $u(\mathbf{x})$ is defined as the set of points $\mathbf{x}$ such that $u(\mathbf{x}) = 0$. It is known (Cheng (1976) [53]) that for $D \subset \mathbb{R}^m$, the nodal set of an eigenfunction of (9.6.3), (9.6.4) is locally composed of hypersurfaces of dimension $m - 1$. These hypersurfaces cannot end in the interior of $D$, which implies that they are either closed, or begin and end on the boundary. In particular, therefore, in the plane ($m = 2$), the nodal set of the eigenfunction $u(\mathbf{x})$ of (9.6.3), (9.6.4) is made up of continuous curves, called nodal lines, which are either closed, or begin and end on the boundary.

CNLT states that each eigenfunction $u_n(\mathbf{x})$ corresponding to $\lambda_n$ divides $D$, by its nodal set, into at most $n$ subdomains, called nodal domains, or the more informative sign domains, in which $u_n(\mathbf{x})$ has one sign.

We recall proofs of two versions of CNLT so that we can indicate later how the continuous and discrete results differ from each other. We express the analysis in variational form. Define

$(u, v)_D = \int_D \nabla u \cdot \nabla v \, d\mathbf{x}, \quad [u, v]_D = \int_D \rho u v \, d\mathbf{x}.$

Here $\nabla = \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \ldots, \frac{\partial}{\partial x_m}\right)$ is the grad operator, and

$\int_D \cdot \, d\mathbf{x} = \int\!\!\int \cdots \int_D \cdot \, dx_1\, dx_2 \ldots dx_m.$

The fundamental theorem for the Rayleigh quotient

$\lambda_R = \frac{(u, u)_D}{[u, u]_D},$ (9.6.5)

is that if $u$ is orthogonal to the first $n - 1$ eigenmodes of (9.6.3), (9.6.4), i.e.,

$[u, u_i]_D = 0, \quad i = 1, 2, \ldots, n - 1,$

then $\lambda_R \ge \lambda_n$, with equality iff $u(\mathbf{x}) = u_n(\mathbf{x})$. We first prove a weak version of CNLT:

Theorem 9.6.1 Suppose the eigenvalues $\lambda_i$ of (9.6.3), (9.6.4) are ordered as in (9.6.1), and $u_n(\mathbf{x})$ is an eigenfunction corresponding to $\lambda_n$. If $\lambda_n$ has multiplicity $r \ge 1$, so that (9.6.2) holds, then $u_n(\mathbf{x})$ has at most $n + r - 1$ sign domains.

Proof. Suppose $u_n(\mathbf{x})$ has $p$ sign domains $D_i$ such that $\bigcup_{i=1}^{p} D_i = D$. Define

$w_i(\mathbf{x}) = \begin{cases} \alpha_i u_n(\mathbf{x}) & \mathbf{x} \in D_i \\ 0 & \text{otherwise} \end{cases}$

and take

$v(\mathbf{x}) = \sum_{i=1}^{p} c_i w_i(\mathbf{x}), \quad \sum_{i=1}^{p} c_i^2 = 1.$ (9.6.6)
Since the $D_i$ are disjoint, $(w_i(\mathbf{x}))_1^p$ are orthogonal. Scale the $w_i$, that is, choose the $\alpha_i$, so that $[w_i, w_i]_D = 1$, then

$[v, v]_D = \sum_{i=1}^{p} c_i^2 [w_i, w_i]_D = \sum_{i=1}^{p} c_i^2 = 1.$

Since $w_i(\mathbf{x})$ satisfies (9.6.3) with $\lambda = \lambda_n$, on $D_i$, and $w_i(\mathbf{x}) = 0$ on $\partial D_i$, the divergence theorem gives

$(w_i, w_i)_{D_i} = \int_{D_i} \nabla w_i \cdot \nabla w_i \, d\mathbf{x} = \int_{D_i} \{\mathrm{div}(w_i\nabla w_i) - w_i\Delta w_i\} \, d\mathbf{x} = \int_{\partial D_i} w_i\frac{\partial w_i}{\partial n} \, d\mathbf{x} + \lambda_n\int_{D_i} \rho w_i^2 \, d\mathbf{x} = \lambda_n.$

Thus $(v, v)_D = \sum_{i=1}^{p} c_i^2 (w_i, w_i)_{D_i} = \sum_{i=1}^{p} c_i^2\lambda_n = \lambda_n$, so that $\lambda_R = \lambda_n$. But we may choose $(c_i)_1^p$ so that $[v, u_i]_D = 0$, $i = 1, 2, \ldots, p - 1$, and hence, for that choice, Rayleigh's principle states that $\lambda_R \ge \lambda_p$. Thus $\lambda_p \le \lambda_n$. Since $\lambda_n < \lambda_{n+r}$, we have $\lambda_p < \lambda_{n+r}$ so that $p < n + r$, $p \le n + r - 1$.

Note that this proof does not require $D$ to be connected. Note also that if $\lambda_n$ is simple, so that $r = 1$, then the Theorem states that $u_n(\mathbf{x})$ has at most $n$ sign domains. We need to strengthen the result for multiple eigenvalues, reducing the upper bound $n + r - 1$ to $n$. To reduce the upper bound in this way we need what is called a unique continuation theorem. Loosely speaking, what such a theorem states is that if a solution of (9.6.3) is identically zero in a finite region of $D$ then it is zero throughout $D$; the only way that it can be continued from the zero patch is by taking it identically zero. (Specifically, for those who have a functional analysis background, Jerison and Kenig (1985) [188] proved that if any solution $u \in H_0^1(D)$ of the weak version of (9.6.3) vanishes on a non-empty open subset of a connected domain $D$, then $u \equiv 0$ in $D$.) Using this result we can prove

Theorem 9.6.2 Suppose $D$ is connected, the eigenvalues of (9.6.3), (9.6.4) are ordered as in (9.6.1), and $u_n(\mathbf{x})$ is an eigenfunction corresponding to $\lambda_n$, then $u_n(\mathbf{x})$ has at most $n$ sign domains.

Proof. Suppose $u_n(\mathbf{x})$ has $p > n$ sign domains. Define the $w_i(\mathbf{x})$ as before, and define $v(\mathbf{x})$ by (9.6.6) with $c_{n+1} = 0 = \cdots = c_p$, so that $v(\mathbf{x}) \equiv 0$ on $D_{n+1}, \ldots, D_p$. Again we have $\lambda_R = \lambda_n$, and we may choose $(c_i)_1^n$ so that $[v, u_i]_D = 0$, $i = 1, 2, \ldots, n - 1$. Thus $v(\mathbf{x})$ is an eigenfunction of (9.6.3), (9.6.4), but it is identically zero on $D_{n+1}$ and hence, by the unique continuation theorem, it is identically zero on $D$. This contradiction implies $p \le n$.

We note that the theorem, which is due to Herrmann (1935) [171] and Pleijel (1956) [266], implies that if $D$ is connected, then $\lambda_1$ is simple, i.e., $\lambda_1 < \lambda_2$. For any eigenfunction $u_1(\mathbf{x})$ can have at most one sign domain, i.e., it has the
same sign throughout $D$. There cannot be two functions $u$, $v$, which are of one sign in a connected domain $D$ and are orthogonal to each other.

Theorem 9.6.3 Theorem 9.6.2 holds even if $D$ is not connected.

Proof. Suppose $D$ consists of $q$ connected domains $(D_k)_1^q$. Label the eigenvalues $\lambda_i^{(k)}$ of each $D_k$ increasingly, and suppose the corresponding eigenfunctions are $u_i^{(k)}(\mathbf{x})$. Now assemble the eigenvalue sequences $\{\lambda_i^{(k)}\}$, $k = 1, 2, \ldots, q$; $i = 1, 2, \ldots$ into one non-decreasing sequence $\{\lambda_j\}$ to give the eigenvalues of $D$. The corresponding eigenfunctions of $D$ are

$u_j(\mathbf{x}) = \begin{cases} u_i^{(k)}(\mathbf{x}) & \text{on } D_k \\ 0 & \text{elsewhere.} \end{cases}$

The ordinal number $j$ of a given $\lambda_i^{(k)}$ in this sequence will satisfy $j \ge i$. Theorem 9.6.2 for $D_k$ states that $u_i^{(k)}(\mathbf{x})$ has no more than $i$ sign domains on $D_k$, so that $u_j(\mathbf{x})$ will have no more than $j$ sign domains on $D_k$, and it will be zero elsewhere.
9.7 Some properties of FEM eigenvectors

Our aim in the next few sections is to obtain discrete versions of Theorems 9.6.1-9.6.3. In a first step towards achieving this aim, we discuss some properties of eigenvectors of finite element models. We return to the analysis of Section 2.5 and suppose that we are dealing with a FEM model of a membrane with fixed boundary using linear interpolation over acute angled triangles, or correspondingly of an acoustic cavity using linear interpolation over tetrahedra with obtuse angles between normals to faces.

In each of these models, the FEM mesh yields a set of vertices connected by edges to form a graph. There are two kinds of vertices, boundary vertices, where $u = 0$ because of the boundary conditions, and the remainder. These non-boundary vertices are those that appear in the analysis; they form a graph $\mathcal{G}$ on $N$ vertices $P_i \in \mathcal{V}$ with edge set $\mathcal{E}$. The FEM analysis yields two matrices K, M on $\mathcal{G}$ with the properties that if $i \neq j$ then

$k_{ij} < 0, \; m_{ij} > 0$ if $(P_i, P_j) \in \mathcal{E}$; $\quad k_{ij} = 0, \; m_{ij} = 0$ otherwise. (9.7.1)

Note that if $(P_i, P_j) \in \mathcal{E}$, we say that $P_i$, $P_j$ are adjacent vertices, and we write $P_i \sim P_j$. The analysis will revolve around nodal vertices, i.e., vertices $P_i$ where $u_i = 0$. We first prove

Theorem 9.7.1 Under conditions (9.7.1), a non-boundary nodal vertex of an eigenvector of (9.1.1) cannot have neighbours that are all of one sign.

Proof. Suppose $P_i$, a non-boundary vertex, is nodal, i.e., $u_i = 0$. The $i$th line of (9.1.1) is

$\sum_j (k_{ij} - \lambda m_{ij})u_j = 0,$ (9.7.2)
where the sum is over those $j$ ($\neq i$) for which $P_i \sim P_j$; for those $j$, $k_{ij} - \lambda m_{ij} < 0$. If $u_j \ge 0$ ($\le 0$) for all such $j$, with at least one inequality strict, then the left-hand side of (9.7.2) would be strictly negative (positive), which is a contradiction.

This theorem implies that a non-boundary nodal vertex must either have both positive and negative neighbours, or all nodal neighbours. We may extend this statement to say that a set of nodal vertices of an eigenvector must have positive and negative neighbours: it must separate positive and negative vertex sets. If $\mathcal{G}$ is connected, so that K and M are irreducible (see Busacker and Saaty (1965) [46]), then we can say more: if an eigenvector u is non-negative then it must be strictly positive. Such an eigenvector must correspond to the lowest eigenvalue, which must therefore be simple: there cannot be two positive eigenvectors u, v which are orthogonal w.r.t. M: $\lambda_1 < \lambda_2$.

There is an important maximum principle for the p.d.e. (9.6.3): a solution $u(\mathbf{x})$ cannot have an interior positive minimum or an interior negative maximum (Protter and Weinberger (1984) [271]). To state the discrete version of this principle, we must divide the non-boundary vertices of a FEM mesh into two subsets: vertices adjacent to boundary vertices, that we call near-boundary vertices; the remainder, that we term interior vertices.

Theorem 9.7.2 If $\mathcal{G}$ is connected, and (9.7.1) holds, an eigenvector of (9.1.1) cannot have a local positive minimum or a local negative maximum at an interior vertex.

Proof. By definition, an interior vertex is adjacent only to non-boundary vertices. It is therefore a vertex of an interior element, i.e., an element that has no vertices on the boundary. Because of the way in which it is formed, by (2.5.6), the stiffness matrix $\mathbf{K}_e$ of an interior element admits a rigid-body mode, $\{1, 1, 1\}$ for a triangular mesh, $\{1, 1, 1, 1\}$ for a tetrahedral mesh. If $P_i$ is an interior vertex, all the elements to which $P_i$ belongs are interior elements. This means that after assembling the $\mathbf{K}_e$ to form K we may deduce that, if $P_i$ is an interior vertex, then $\sum_j k_{ij} = 0$, where again the sum is taken over all $j$ such that $P_j \sim P_i$. The $i$th line of (9.1.1) is

$0 = \sum_j k_{ij}u_j - \lambda\sum_j m_{ij}u_j,$

so that

$\sum_j k_{ij}(u_j - u_i) + \Big(\sum_j k_{ij}\Big)u_i = \lambda\sum_j m_{ij}u_j.$ (9.7.3)

Suppose that there is a local positive minimum at an interior vertex $P_i$, so that $u_i \ge 0$ and $u_j - u_i \ge 0$ for all $j$ such that $P_j \sim P_i$, and either the first inequality is strict, or the second inequality is strict for at least one $j$ such that $P_j \sim P_i$. (We need the connectedness of $\mathcal{G}$ to be sure that every vertex $P_i$ does have a neighbour.) The first sum on the left is non-positive, while the second sum is zero; the sum on the right is non-negative; one of the two sides, left or right, is non-zero. This is impossible.

This theorem relates to the eigenvectors of (9.1.1), but we can immediately reword it to apply to FEM eigenfunctions obtained by linear interpolation from
the vertex values. An eigenfunction obtained by linear interpolation can have local maxima and minima only at the vertices of the mesh. We conclude that an eigenfunction cannot have a local positive minimum or a local negative maximum at an interior vertex. Loosely speaking, we may say that a mode may have waves, but not dimples.

One of the mainstays of the theory related to (9.6.3) is the unique continuation theorem. It was this that allowed us to reduce the upper bound on the number of sign domains for eigenfunctions of (9.6.3), (9.6.4), from $n + r - 1$ to $n$. There is no straightforward discrete analogue of unique continuation; there is an analogue, as described in Lemma 9.9.2, but it is not straightforward. Figure 9.7.1 shows an example of a FEM eigenmode with zero patches. If the matrices K and M are symmetrical about the $x$- and $y$-axes, then there will be a mode that is antisymmetrical about both axes, so that the vertex values must have the signs shown. There are four completely zero triangles in the centre, and four other pairs of zero triangles, but the eigenmode is not identically zero.

Figure 9.7.1 - An eigenvector can have one or more zero (shaded) polygons

Even though there is no straightforward discrete analogue of unique continuation, we can still obtain discrete analogues of Theorems 9.6.1, 9.6.2. First, we need to find the discrete FEM counterparts of the sign domains of the continuous theorems. There are two distinct ways of looking at the piecewise linear function $u$ obtained from an eigenvector of (9.1.1): looking at the values $u_i$, and particularly at the signs of $u_i$, at the vertices $P_i$ of $\mathcal{G}$; looking at the subregions with piecewise straight boundaries on which the linearly interpolated $u(\mathbf{x})$ has one sign, either loosely, $u(\mathbf{x}) \ge 0$ ($\le 0$) or strictly, $u(\mathbf{x}) > 0$ ($< 0$).

Consider the first way. The FEM mesh defines a graph $\mathcal{G}$ with $N$ vertices $P_i$. A FEM vector $\mathbf{u} \in V_N$ associates a value $u_i$ and in particular a sign +, 0, or -, to each vertex $P_i$ of $\mathcal{G}$. We may connect the (strictly) positive vertices
by edges of E to form maximal connected subgraphs of G, called strong positive sign graphs. We may do the same with the negative vertices, to form strong negative sign graphs. In this way, we can partition the graph G into disjoint strong positive and strong negative sign graphs, and zero vertices. Figure 9.7.2 shows a graph with 2 strong positive and 2 strong negative sign graphs, each of which has just one vertex. Alternatively, we may partition G into weak positive and weak negative sign graphs, by forming maximal connected subgraphs of nonnegative, and non-positive vertices, respectively. The graph in Figure 9.7.2 has just one weak positive sign graph, and one weak negative sign graph; these weak sign graphs overlap.
Figure 9.7.2 - The graph has two strong positive and two strong negative sign graphs; it has just one weak positive, and one weak negative sign graph

Two sign graphs $S_1, S_2$, strong or weak, are said to be adjacent if there are vertices $P_1 \in S_1$, $P_2 \in S_2$ such that $P_1 \sim P_2$. We need the following simple but important property:
Lemma 9.7.1 If two different sign graphs are adjacent, then they have opposite signs.

Proof. If they had the same sign then one at least would not be maximal.

Note that while two adjacent strong sign graphs are disjoint, two adjacent weak sign graphs may overlap.

Now consider the second way: looking at the signs of the piecewise linear ‘eigenfunction’ interpolated from the vertex values $u_i$ of an eigenvector $\mathbf{u}$. This
‘eigenfunction’ is defined on a domain with piecewise straight (in $\mathbb{R}^2$) or piecewise plane (in $\mathbb{R}^3$) boundary, that may be some approximation to an original domain $D$. We are not concerned with how good the approximation is, nor are we concerned with convergence or taking a ‘sufficiently fine’ mesh. Thus, we will simply call the FEM domain $D$, and forget that there might have been some other original domain with perhaps curved boundary. The domain $D$ may be divided, like the graph $\mathcal{G}$, into strong sign subdomains, $D_i$, on which $u(\mathbf{x})$ has one strict sign, and on the boundaries of which $u(\mathbf{x}) = 0$. Each of these domains will be polygonal in $\mathbb{R}^2$, polyhedral in $\mathbb{R}^3$. In particular, the nodal places of $u$ in $\mathbb{R}^2$ will be piecewise straight lines, either closed or beginning and ending on the boundary, or nodal polygons, as in Figure 9.7.1. In $\mathbb{R}^3$ they will be piecewise plane surfaces which are either closed or begin and end on the boundary, or polyhedra. Instead of using strong sign domains, we may use weak; they too will have piecewise straight or piecewise plane boundaries. A weak positive and a weak negative sign domain may overlap.

For triangular or tetrahedral meshes corresponding to linear interpolation, there is a clear correspondence between the sign graphs on the one hand and the sign domains on the other. For each strong or weak, positive or negative sign domain there is exactly one strong or weak, positive or negative, sign graph. This means that we can count the number of sign domains by counting the number of sign graphs.

We note however, that the rectangular FEM mesh which is sometimes used in $\mathbb{R}^2$ does not have such simple properties. Inside a rectangle, $u(x, y)$ has a bilinear interpolation

$$u(x, y) = p + qx + ry + sxy.$$

Now all four vertices of the rectangle are neighbours of each other, in the sense that all the off-diagonal entries in the element matrices are non-zero.
This is why we show the vertices of the rectangle joined by the diagonals as well as by the sides, as in Figure 9.7.3. (But the intersection of the diagonals is not a vertex of the graph.)

Figure 9.7.3 - A rectangular finite element; each vertex is connected to all the others
It may be shown for this mesh that the element mass matrix is strictly positive, and that the off-diagonal entries of the element stiffness matrix are strictly negative iff the sides $a, b$ of the rectangle satisfy $1/\sqrt{2} < a/b < \sqrt{2}$, i.e., if the rectangle is not too thin. There is a similar result (Ex. 9.7.1) for a rectangular box mesh in $\mathbb{R}^3$. Thus, under these conditions, the matrices $\mathbf{K}, \mathbf{M}$ for the whole mesh will satisfy the inequalities (9.7.1). This means that we can apply the results of the analysis below to the sign graphs of a rectangular mesh, but as the example in Figure 9.7.4 shows, we cannot extend them to the sign domains. Figure 9.7.4 shows a mesh made up of nine square elements. The vertices $A$ and $B$ are adjacent and have the same sign, so that they belong to the same sign graph. However, because nodal lines in an element are now hyperbolic, and not straight, $A$ and $B$ lie in different sign domains; there is an intervening negative sign domain between them.
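The aspect-ratio condition quoted above for the rectangular element can be checked numerically. The sketch below assumes that the element stiffness matrix is the one generated by the Dirichlet energy $\int |\nabla u|^2$ with bilinear shape functions; this choice, and the helper names `rect_stiffness` and `offdiag_all_negative`, are assumptions made for illustration and are not spelled out in the text. The integrals are evaluated with $2 \times 2$ Gauss quadrature, which is exact for these integrands.

```python
import numpy as np


def rect_stiffness(a, b):
    """Element stiffness matrix (Dirichlet energy) of an a-by-b bilinear rectangle.

    Vertices are ordered (0,0), (a,0), (a,b), (0,b).
    """
    xi_n = np.array([-1.0, 1.0, 1.0, -1.0])    # reference coordinates of the vertices
    eta_n = np.array([-1.0, -1.0, 1.0, 1.0])
    g = 1.0 / np.sqrt(3.0)                     # 2-point Gauss abscissa, weights are 1
    K = np.zeros((4, 4))
    for xi in (-g, g):
        for eta in (-g, g):
            # Shape functions N_k = (1 + xi_k xi)(1 + eta_k eta)/4 and their gradients.
            dN_dx = xi_n * (1.0 + eta_n * eta) / 4.0 * (2.0 / a)
            dN_dy = eta_n * (1.0 + xi_n * xi) / 4.0 * (2.0 / b)
            K += (np.outer(dN_dx, dN_dx) + np.outer(dN_dy, dN_dy)) * (a * b / 4.0)
    return K


def offdiag_all_negative(K):
    """True when every off-diagonal entry of K is strictly negative."""
    off = K[~np.eye(K.shape[0], dtype=bool)]
    return bool(np.all(off < 0.0))


for ratio in (0.6, 0.8, 1.0, 1.3, 1.5):
    ok = offdiag_all_negative(rect_stiffness(ratio, 1.0))
    print(f"a/b = {ratio:3.1f}: all off-diagonal entries negative? {ok}")
```

Under these assumptions the sign property holds for the ratios between $1/\sqrt{2} \approx 0.707$ and $\sqrt{2} \approx 1.414$ and fails for the two ratios outside that interval.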
Figure 9.7.4 - Vertices $A$ and $B$ are adjacent, but belong to different sign domains

Exercises 9.7

1. Find the conditions on the ratios of the dimensions of a rectangular box so that the stiffness matrix based on linear interpolation of the assumed modes $1, x, y, z, yz, zx, xy, xyz$ has the sign property (9.7.1).
9.8
Strong sign graphs
The discussion in Section 9.7 should have made it clear that we can study the sign properties of an eigenvector on a graph G as a problem in its own right,
that is, without considering the problem as arising from a FEM model. We will do this and, to simplify the analysis, we will consider the eigenvalue problem in standard form, namely

$$(\mathbf{A} - \lambda\mathbf{I})\mathbf{u} = \mathbf{0} \tag{9.8.1}$$

under the following assumption: if $i \ne j$ then

$$a_{ij} < 0 \ \text{if}\ (P_i, P_j) \in \mathcal{E}, \qquad a_{ij} = 0 \ \text{if}\ (P_i, P_j) \notin \mathcal{E}. \tag{9.8.2}$$
We will then show at the end that all the results hold for (9.1.1) under the condition (9.7.1). In this section, we will understand sign graph to mean strong sign graph.

The theorem we are about to prove regarding the number of sign graphs is a discrete analogue of Theorem 9.6.1. In order to prove it, we need to set up a procedure mimicking that used in Theorem 9.6.1, and prove a Lemma, following Davies, Gladwell, Leydold and Stadler (2001) [71].

Suppose $\mathbf{u}$ is an eigenvector of (9.8.1) in the eigenspace of $\lambda_n$. Suppose $\mathbf{u}$ has $m$ sign graphs $S_i$, $i = 1, 2, \ldots, m$. Define $m$ vectors $\mathbf{w}_i$, $i = 1, 2, \ldots, m$, such that

$$\mathbf{w}_i = \begin{cases} \mathbf{u} & \text{on } S_i, \\ \mathbf{0} & \text{otherwise.} \end{cases}$$

Explicitly, let $\mathbf{w}_i = \{w_{i,1}, w_{i,2}, \ldots, w_{i,N}\}$. Then $w_{i,j} = u_j$ if $P_j \in S_i$, and $w_{i,j} = 0$ otherwise. Thus

$$\sum_{i=1}^{m} \mathbf{w}_i = \mathbf{u}.$$

Now form

$$\mathbf{v} = \sum_{i=1}^{m} c_i \mathbf{w}_i. \tag{9.8.3}$$
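Using straightforward algebra, we may verify (Ex. 9.8.1) Duval and Reiner’s Lemma (Duval and Reiner (1999) [82]).

Lemma 9.8.1

$$\mathbf{v}^T\mathbf{A}\mathbf{v} - \lambda\,\mathbf{v}^T\mathbf{v} = \sum_{i=1}^{m} c_i^2\, \mathbf{w}_i^T(\mathbf{A}\mathbf{u} - \lambda\mathbf{u}) - \frac{1}{2}\sum_{i,j=1}^{m} (c_i - c_j)^2\, \mathbf{w}_i^T\mathbf{A}\mathbf{w}_j.$$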
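Because the lemma is a purely algebraic identity, it can be spot-checked numerically before it is used. The sketch below is such a check, not a proof: it takes an arbitrary symmetric matrix, an arbitrary vector $\mathbf{u}$ (which need not be an eigenvector), an arbitrary scalar $\lambda$, and disjoint index groups standing in for the sign graphs. The matrix size, the groups and the random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 7, 3

B = rng.standard_normal((N, N))
A = (B + B.T) / 2.0                       # any symmetric matrix
u = rng.standard_normal(N)                # any vector; need not be an eigenvector
lam = 0.37                                # any scalar

# Disjoint supports that together cover every vertex, standing in for S_1..S_m.
groups = [[0, 1], [2, 3, 4], [5, 6]]
W = np.zeros((m, N))
for i, idx in enumerate(groups):
    W[i, idx] = u[idx]                    # w_i = u on S_i, zero elsewhere

c = rng.standard_normal(m)
v = c @ W                                 # v = sum_i c_i w_i, as in (9.8.3)

lhs = v @ A @ v - lam * (v @ v)
first_sum = sum(c[i] ** 2 * (W[i] @ (A @ u - lam * u)) for i in range(m))
second_sum = 0.5 * sum(
    (c[i] - c[j]) ** 2 * (W[i] @ A @ W[j]) for i in range(m) for j in range(m)
)
rhs = first_sum - second_sum

print("left-hand side :", lhs)
print("right-hand side:", rhs)
```

The check relies only on the symmetry of $\mathbf{A}$, on the $\mathbf{w}_i$ having disjoint supports, and on $\sum_i \mathbf{w}_i = \mathbf{u}$, which are the same ingredients needed for Ex. 9.8.1.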
This leads to

Theorem 9.8.1 Any eigenvector corresponding to $\lambda_n$ has at most $n + r - 1$ sign graphs.

Here the governing equation is (9.8.1), $\mathbf{A}$ satisfies (9.8.2), the $(\lambda_n)_1^N$ are ordered as in (9.6.1), and $\lambda_n$ has multiplicity $r$, so that (9.6.2) holds.

Proof. Since none of the $\mathbf{w}_i$ is identically zero and they are disjoint, their linear span has dimension $m$. It follows that there are real constants $(c_i)_1^m$, not all zero, such that $\mathbf{v}$ is non-zero and is orthogonal to the first $(m - 1)$ eigenvectors of $\mathbf{A}$, i.e., $(\mathbf{u}_j)_1^{m-1}$:

$$\mathbf{v}^T\mathbf{u}_j = 0, \quad j = 1, 2, \ldots, m - 1.$$

Without loss of generality we can take $\mathbf{v}^T\mathbf{v} = 1$. Therefore, by the minimax theorem (Section 2.10) we have

$$\mathbf{v}^T\mathbf{A}\mathbf{v} \ge \lambda_m. \tag{9.8.4}$$
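Now use Lemma 9.8.1 with $\lambda = \lambda_n$, $\mathbf{u} = \mathbf{u}_n$. We find

$$\mathbf{v}^T\mathbf{A}\mathbf{v} - \lambda_n = -\frac{1}{2}\sum_{i,j=1}^{m} (c_i - c_j)^2\, \mathbf{w}_i^T\mathbf{A}\mathbf{w}_j. \tag{9.8.5}$$

We will show that the sum on the right is non-negative. A term $\mathbf{w}_i^T\mathbf{A}\mathbf{w}_j$ is non-zero only if $\mathbf{w}_i, \mathbf{w}_j$ correspond to adjacent sign graphs; adjacent sign graphs have opposite signs (Lemma 9.7.1); adjacent sign graphs are disjoint. This means that any non-zero product $\mathbf{w}_i^T\mathbf{A}\mathbf{w}_j$ involves only negative, off-diagonal entries in $\mathbf{A}$; therefore

$$\mathbf{w}_i^T\mathbf{A}\mathbf{w}_j = (\pm)(-)(\mp) = +.$$

Therefore, equation (9.8.5) gives

$$\mathbf{v}^T\mathbf{A}\mathbf{v} - \lambda_n \le 0. \tag{9.8.6}$$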
This combined with (9.8.4) states that $\lambda_m \le \lambda_n$. Since $\lambda_n < \lambda_{n+r}$, we have $m < n + r$, i.e., $m \le n + r - 1$.

Note that we cannot deduce that the inequality in (9.8.6) is strict, because $c_i - c_j$ might be zero for all those pairs $i, j$ for which $\mathbf{w}_i^T\mathbf{A}\mathbf{w}_j$ was (strictly) positive.

As we stated earlier, Theorem 9.8.1 is a discrete counterpart of CNLT in the form of Theorem 9.6.1. Various researchers attempted to reduce the bound $n + r - 1$. Friedman (1993) [96] gave the example of a star on $N$ vertices to show that the bound could not be reduced, as in Theorem 9.6.2, to $n$. For the star, the second eigenvalue of the so-called Laplacian matrix (Ex. 9.8.2) has multiplicity $N - 2$, and has an eigenvector with $N - 1$ sign graphs. If therefore $N - 1 > 2$, i.e., $N \ge 4$, then a second eigenvector has more than 2 sign graphs. In spite of this counterexample, Duval and Reiner (1999) [82] attempted to reduce the bound to $n$; the error in their logic is pinpointed in Zhu (2000) [342]; essentially their error lay in thinking that the inequality in (9.8.6) could be made strict. Comments on partly erroneous results put forward by Friedman (1993) [96] and van der Holst (1996) [326] may be found in Davies, Gladwell, Leydold and Stadler (2001) [71].

We note that the distinction between the bounds $n + r - 1$ and $n$ appears only when $r > 1$, i.e., $\lambda_n$ is multiple. Following Gladwell and Zhu (2002) [131] we now show that although it is not possible to reduce the bound $n + r - 1$ when $\lambda_n$ is multiple, it is possible to construct $r$ orthogonal vectors $(\mathbf{u}_j)_n^{n+r-1}$, spanning the eigenspace of $\lambda_n$, such that $\mathbf{u}_j$ has at most $j$ sign graphs, $j = n, n+1, \ldots, n + r - 1$. In fact, it is possible to go further and construct $r$ linearly independent (but not necessarily orthogonal) vectors spanning the eigenspace of $\lambda_n$, such that each of them has at most $n$ sign graphs. We introduce the notation $SG(\mathbf{u})$ for the number of sign graphs of $\mathbf{u}$.

Theorem 9.8.2 Under the conditions stated in Theorem 9.8.1, if $\mathbf{u}$ is an eigenvector corresponding to $\lambda_n$, and $SG(\mathbf{u}) = m > n$, then in the notation of (9.8.3) we may find

$$\mathbf{v} = \sum_{j=1}^{n} c_j \mathbf{w}_j$$
such that $\mathbf{v}$ is an eigenvector corresponding to $\lambda_n$, and $SG(\mathbf{v}) \le n$.

Proof. We can choose $c_j$, not all zero, such that $\mathbf{v}$ is orthogonal to $(\mathbf{u}_i)_1^{n-1}$. By the minimax theorem the Rayleigh quotient $R$ of $\mathbf{v}$ satisfies $R \ge \lambda_n$. By Lemma 9.8.1, $R \le \lambda_n$. Thus $R = \lambda_n$ and $\mathbf{v}$ is an eigenvector corresponding to $\lambda_n$. By its construction, $SG(\mathbf{v}) \le n$.

We denote a normalised $\mathbf{v}$ so formed, by $\mathbf{v} = T((\mathbf{w}_j)_1^n, (\mathbf{u}_i)_1^{n-1})$. This $\mathbf{v}$ may not be unique; there is always a non-trivial set $(c_j)_1^n$, but it need not be unique.

Note that in Theorem 9.6.2, for the continuous CNLT, we suppose that the eigenfunction $u_n(\mathbf{x})$ has more than $n$ sign domains, and we construct a purported eigenfunction $v(\mathbf{x})$ orthogonal to $(u_i(\mathbf{x}))_1^{n-1}$, but zero in $D_{n+1}$; then we use unique continuation of an eigenfunction on a connected domain $D$ to show that $v(\mathbf{x}) \equiv 0$ in $D$; this contradicted the hypothesis that $v(\mathbf{x})$ was an eigenfunction, i.e., not trivial. In the discrete case we start with an eigenvector $\mathbf{u}_n$ with $SG(\mathbf{u}_n) = m > n$, and construct another $\mathbf{v}$ with $SG(\mathbf{v}) \le n$; the new eigenvector has at least one zero sign graph, but it is an eigenvector, and there is no contradiction involved. We may now prove

Theorem 9.8.3 Suppose the conditions stated in Theorem 9.8.1 hold. If $\lambda_n$ is an eigenvalue of multiplicity $r$, then we may find $r$ orthonormal eigenvectors $(\mathbf{u}_j)_n^{n+r-1}$ corresponding to $\lambda_n$, such that $SG(\mathbf{u}_j) \le j$, $j = n, n+1, \ldots, n + r - 1$.

Proof. The $r$-dimensional eigenspace $V$ of $\lambda_n$ has an orthonormal basis $(\mathbf{v}_j)_n^{n+r-1}$. Theorem 9.8.1 states that $SG(\mathbf{v}_j) \le n + r - 1$ for $j = n, n+1, \ldots, n + r - 1$. If $SG(\mathbf{v}_n) \le n$, take $\mathbf{u}_n = \mathbf{v}_n$; otherwise $SG(\mathbf{v}_n) > n$. In this case if $(\mathbf{w}_j)_1^m$, $(m > n)$ are the sign graph vectors of $\mathbf{v}_n$, take $\mathbf{u}_n = T((\mathbf{w}_j)_1^n; (\mathbf{u}_j)_1^{n-1})$, so that $SG(\mathbf{u}_n) \le n$.

We now proceed by induction. Suppose we have constructed orthonormal vectors $\mathbf{u}_n, \mathbf{u}_{n+1}, \ldots, \mathbf{u}_{n+s-1}$ $(1 < s < r)$ such that $SG(\mathbf{u}_j) \le j$, for $j = n, n+1, \ldots, n + s - 1$. We show how to construct $\mathbf{u}_{n+s}$. First, find a new orthonormal basis $(\mathbf{u}_j)_n^{n+s-1}$, $(\mathbf{x}_j)_{n+s}^{n+r-1}$ for $V$. If $SG(\mathbf{x}_{n+s}) \le n + s$, then take $\mathbf{u}_{n+s} = \mathbf{x}_{n+s}$; otherwise $SG(\mathbf{x}_{n+s}) > n + s$; in this case, if $(\mathbf{w}_j)_1^m$ $(m > n + s)$ are the sign graph vectors of $\mathbf{x}_{n+s}$, take $\mathbf{u}_{n+s} = T((\mathbf{w}_j)_1^{n+s}; (\mathbf{u}_j)_1^{n+s-1})$. We may proceed in this way to find $(\mathbf{u}_j)_n^{n+r-1}$ such that $SG(\mathbf{u}_j) \le j$.

We now strengthen this result and prove

Theorem 9.8.4 Suppose the conditions stated in Theorem 9.8.1 hold, and that $\lambda_n$ is an eigenvalue with multiplicity $r$, and eigenspace $V$. There is a basis $(\mathbf{u}_j)_n^{n+r-1}$ for $V$ such that $SG(\mathbf{u}_j) \le n$.

Proof. We proceed much as in Theorem 9.8.3. We construct $\mathbf{u}_n$ as before, and then use induction: we suppose that we have found a basis $(\mathbf{u}_j)_n^{n+s-1}$, $(\mathbf{x}_j)_{n+s}^{n+r-1}$ for $V$ such that $SG(\mathbf{u}_j) \le n$ for $j = n, n+1, \ldots, n + s - 1$, and we show how to construct $\mathbf{u}_{n+s}$. If $SG(\mathbf{x}_{n+s}) \le n$, then $\mathbf{u}_{n+s} = \mathbf{x}_{n+s}$; otherwise $SG(\mathbf{x}_{n+s}) = n + t$, $1 \le t \le r - 1$. In this case, let $W$ be the space spanned by the sign graph vectors $(\mathbf{w}_j)_1^{n+t}$ of $\mathbf{x}_{n+s}$: if $\mathbf{w} \in W$, then $\mathbf{w} = \sum_{j=1}^{n+t} c_j\mathbf{w}_j = \mathbf{W}\mathbf{c}$. Let $Y$ be the subspace of $W$ orthogonal to $(\mathbf{u}_j)_1^{n-1}$; $Y$ is not trivial because $\mathbf{x}_{n+s} = \sum_{j=1}^{n+t} \mathbf{w}_j \in Y$. If $\mathbf{y} \in Y$, then $\mathbf{y} = \mathbf{W}\mathbf{c}$ and $\mathbf{u}_j^T\mathbf{y} = \mathbf{u}_j^T\mathbf{W}\mathbf{c} = 0$, $j = 1, 2, \ldots, n - 1$. Of these $n - 1$ constraints on the $c_j$, $m \le n - 1$ are independent; they may be written $\mathbf{B}\mathbf{c} = \mathbf{0}$, where $\mathbf{B} \in M_{m,n+t}$. Then the matrix $\mathbf{B}$ has $m$ linearly independent columns which, by suitably renumbering the $\mathbf{w}_j$, may be taken as the first $m$. Thus $\mathbf{B}\mathbf{c} = \mathbf{0}$ may be written

$$[\mathbf{B}_1, \mathbf{B}_2]\begin{bmatrix}\mathbf{c}_1 \\ \mathbf{c}_2\end{bmatrix} = \mathbf{0}, \tag{9.8.7}$$

where $\mathbf{B}_1 \in M_m$ is non-singular, $\mathbf{B}_2 \in M_{m,n+t-m}$,

$$\mathbf{c}_1 = \{c_1, c_2, \ldots, c_m\}, \quad \mathbf{c}_2 = \{c_{m+1}, \ldots, c_{n+t}\}.$$

The solution space of (9.8.7) is spanned by the $n + t - m$ solutions obtained by taking $c^{(i)}_{2,k} = \delta_{ik}$, $i = m + 1, \ldots, n + t$, and then solving for $\mathbf{c}_1^{(i)}$. Each such choice gives a vector $\mathbf{y}_i = \mathbf{W}\mathbf{c}^{(i)}$; these vectors are linearly independent and they span $Y$; by construction $SG(\mathbf{y}_i) \le m + 1 \le n$. At least one of the $\mathbf{y}_i$, say $\mathbf{y}_p$, must be linearly independent of $(\mathbf{u}_j)_n^{n+s-1}$, for $\mathbf{x}_{n+s} \in Y$ is, by construction, linearly independent of $(\mathbf{u}_j)_n^{n+s-1}$. Take $\mathbf{u}_{n+s} = \mathbf{y}_p$, then $SG(\mathbf{u}_{n+s}) \le n$. We may proceed in this way to find $(\mathbf{u}_j)_n^{n+r-1}$ such that $SG(\mathbf{u}_j) \le n$.
We conclude this section by discussing some other implications of Lemma 9.8.1. Suppose that $\mathbf{u}$ is an eigenvector corresponding to a multiple eigenvalue $\lambda_n$, so that $\mathbf{A}\mathbf{u} = \lambda_n\mathbf{u}$. Suppose that $SG(\mathbf{u}) = m > n$, and $\mathbf{v}$ given by (9.8.3) has been computed so that it is orthogonal to $(\mathbf{u}_j)_1^{n-1}$. Then, as we showed before, $\mathbf{v}$ is also an eigenvector corresponding to $\lambda_n$, i.e., $\mathbf{A}\mathbf{v} = \lambda_n\mathbf{v}$. Then Lemma 9.8.1 with $\lambda = \lambda_n$ demands

$$\sum_{i,j=1}^{m} (c_i - c_j)^2\, \mathbf{w}_i^T\mathbf{A}\mathbf{w}_j = 0. \tag{9.8.8}$$
But, as we showed earlier, $\mathbf{w}_i^T\mathbf{A}\mathbf{w}_j \ge 0$, with strict inequality iff $S_i, S_j$ are adjacent. Equation (9.8.8) implies that if $S_i, S_j$ are adjacent, then $c_i = c_j$. This means that if one sign graph, $S_i$, is omitted in the construction of $\mathbf{v}$ from the sign graphs of $\mathbf{u}$ (i.e., $c_i = 0$), then any sign graph $S_j$ adjacent to $S_i$ must also be omitted ($c_j = c_i = 0$). On the other hand, if one sign graph $S_i$ is included in $\mathbf{v}$, then any other sign graph $S_j$ adjacent to $S_i$ must be included, and must be included with the same weight as $S_i$: $c_j = c_i$. This means that in the construction of $\mathbf{v}$ from the sign graphs of $\mathbf{u}$, any connected graph composed of sign graphs of $\mathbf{u}$ must either be included or excluded as a whole. This leads to

Theorem 9.8.5 Suppose the conditions stated in Theorem 9.8.1 hold. Suppose that $\mathbf{u}$, an eigenvector corresponding to $\lambda_n$, has more than $n$ sign graphs, so that $SG(\mathbf{u}) = n + g$, $g \ge 1$. These sign graphs may be grouped into $g + s$ mutually disjoint connected graphs $(C_j)_1^{g+s}$, and $s \ge 1$.

Proof. If $s < 1$, i.e., $s \le 0$, then there are at most $g$ connected graphs $C_j$. If we form a non-trivial eigenvector from the $n + g$ sign graphs of $\mathbf{u}$, by deleting $g$ of them, at least one $S_j$ from each $C_j$, then none of the $C_j$ will appear; $\mathbf{v}$ will be identically zero. This contradiction implies $s \ge 1$.

This theorem has a number of corollaries:

(i) If $\mathbf{u}$ has $m = n + g$ sign graphs, then a connected component $C_j$ can contain at most $n$ sign graphs. For if one contained $n + 1$ sign graphs, then there would be at most $1 + (n + g - n - 1) = g$ connected components. This provides a somewhat restricted counterpart of Theorem 9.6.2.

(ii) If there are $n$ sign graphs in one component $C_j$, and $n \ge 2$, then $g \ge 2$. For if $n$ sign graphs are in one component $C_j$, they must constitute an eigenvector; so too will the remaining $n + g - n = g$ sign graphs. If $n \ge 2$, an eigenvector, being orthogonal to $\mathbf{u}_1$, must have at least two sign graphs; $g \ge 2$.

(iii) If $\mathcal{G}$ is connected and $\mathbf{u}_n$ has no zeros then, whether $\lambda_n$ is simple or multiple, $SG(\mathbf{u}_n) \le n$. For if there are no zero vertices then all the sign graphs fall into one component.

Exercises 9.8

1. Establish Duval and Reiner’s Lemma 9.8.1.

2. Consider the star on $N$ vertices with $a_{11} = N - 1$, $a_{ii} = 1$, $a_{1i} = -1$, $i = 2, \ldots, N$. Show that its eigenvalues are $0, 1, N$. Show that the second eigenvalue has multiplicity $N - 2$, and that there is an eigenvector corresponding to $\lambda_2$ with $N - 1$ sign graphs.

3. Construct $N - 2$ orthogonal eigenvectors of $\lambda_2$ for the star in Ex. 9.8.2 such that $\mathbf{u}_j$ has just $j$ sign graphs, $j = 2, 3, \ldots, N - 1$.

4. For the same star, construct $N - 2$ linearly independent eigenvectors $\mathbf{u}_j$ such that each has just 2 sign graphs.
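The star example used in this section, and the construction in Theorem 9.8.2, can be explored numerically. The sketch below takes $N = 5$ (an arbitrary illustrative choice), builds the star matrix of Ex. 9.8.2, picks one eigenvector of $\lambda_2 = 1$ with $N - 1$ strong sign graphs, and then combines its first $n = 2$ sign graph vectors so that the result is orthogonal to $\mathbf{u}_1$. The helper `strong_sign_graphs` and the particular vector `u` are assumptions made for illustration; the sketch is a numerical illustration and is not meant to replace the exercises.

```python
import numpy as np


def strong_sign_graphs(u, A, tol=1e-12):
    """Return the strong sign graphs of u as sorted lists of vertex indices.

    Two vertices are treated as adjacent when the off-diagonal entry of A is non-zero.
    """
    N = len(u)
    sign = np.where(u > tol, 1, np.where(u < -tol, -1, 0))
    seen = np.zeros(N, dtype=bool)
    graphs = []
    for start in range(N):
        if sign[start] == 0 or seen[start]:
            continue
        stack, comp = [start], []
        seen[start] = True
        while stack:
            p = stack.pop()
            comp.append(p)
            for q in range(N):
                if q != p and A[p, q] != 0 and sign[q] == sign[start] and not seen[q]:
                    seen[q] = True
                    stack.append(q)
        graphs.append(sorted(comp))
    return graphs


N = 5
A = np.eye(N)
A[0, 0] = N - 1          # centre vertex
A[0, 1:] = -1.0
A[1:, 0] = -1.0

eigvals, U = np.linalg.eigh(A)     # ascending eigenvalues, orthonormal eigenvectors

# An eigenvector of lambda_2 = 1: centre value zero, leaf values summing to zero.
u = np.array([0.0, 1.0, 1.0, -1.0, -1.0])
graphs = strong_sign_graphs(u, A)

# Theorem 9.8.2 with n = 2: combine the first n sign graph vectors so that
# v is orthogonal to the first n - 1 eigenvectors.
n = 2
Wn = np.zeros((n, N))
for i in range(n):
    Wn[i, graphs[i]] = u[graphs[i]]
C = U[:, : n - 1].T @ Wn.T         # (n-1) x n matrix of orthogonality constraints
c = np.linalg.svd(C)[2][-1]        # a unit vector in the null space of C
v = c @ Wn

print("eigenvalues:", np.round(eigvals, 10))
print("SG(u) =", len(graphs))
print("SG(v) =", len(strong_sign_graphs(v, A)))
print("residual |Av - v| =", np.linalg.norm(A @ v - v))
```

Every non-zero leaf is a sign graph on its own because leaves are adjacent only to the centre, which is a zero vertex of `u`. The constructed `v` remains in the eigenspace of $\lambda_2$ but has only two sign graphs, which is the behaviour Theorem 9.8.2 describes.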
9.9
Weak sign graphs
In order to obtain a proper discrete analogue of Theorem 9.6.2, we must consider weak sign graphs.

Lemma 9.9.1 Suppose $S_1, S_2$ are adjacent weak sign graphs. There is a pair of vertices $P_1, P_2$ such that $P_1 \in S_1$, $P_2 \in S_2 \setminus S_1$ (i.e., $P_2$ is in $S_2$, but not in $S_1$) and $P_1 \sim P_2$.

Proof. Without loss of generality, assume $S_1$ is weak positive and $S_2$ is weak negative. If $S_1, S_2$ are disjoint, then by the definition of adjacency, there exist $P_1 \in S_1$, $P_2 \in S_2$ such that $P_1 \sim P_2$; because $S_1, S_2$ are disjoint, $P_2 \in S_2 \setminus S_1$. Otherwise, $S_1, S_2$ have a non-empty intersection $S_1 \cap S_2$. $S_1 \cap S_2$ is a strict subgraph of $\mathcal{G}$ so that not all vertices $P_1 \in S_1 \cap S_2$ can be interior vertices in the sense described in Section 9.7. Any boundary vertex $P_1$ will have the required property: for such a $P_1$, there will be a vertex $P_2$ such that $P_2 \sim P_1$, and $u_2 < 0$, i.e., $P_2 \in S_2 \setminus S_1$.

Now suppose $\mathbf{u}$, an eigenvector corresponding to $\lambda_n$, has $m \ge n$ weak sign graphs $S_i$. We define $\mathbf{w}_i$, $i = 1, 2, \ldots, m$ as before, and we choose $c_i$, $i = 1, 2, \ldots, m$, not all zero, to make $\mathbf{v}$ given by (9.8.3) orthogonal to $\mathbf{u}_i$, $i = 1, 2, \ldots, m - 1$. We prove a continuation result for the coefficients $c_i$ that is a discrete analogue of the unique continuation principle for eigenfunctions.

Lemma 9.9.2 Suppose $m \ge n$, and two of the weak sign graphs $S_1$ and $S_2$ of $\mathbf{u}$ are adjacent, then $c_2 = c_1$.

Proof. Without loss of generality we may suppose that $S_1$ is weak positive and $S_2$ is weak negative. We proceed as in the derivation of equation (9.8.8). The minimax theorem implies $\mathbf{v}^T\mathbf{A}\mathbf{v} \ge \lambda_m$, and Lemma 9.8.1 implies $\mathbf{v}^T\mathbf{A}\mathbf{v} \le \lambda_n$, and

$$\sum_{i,j=1}^{m} (c_i - c_j)^2\, \mathbf{w}_i^T\mathbf{A}\mathbf{w}_j = 0. \tag{9.9.1}$$
Now use Lemma 9.9.1. If $S_1$ and $S_2$ are disjoint, then there is a pair $P_1, P_2$ such that $P_1 \in S_1$, $P_2 \in S_2$ and $P_1 \sim P_2$; thus $u_1 > 0$, $u_2 < 0$, $a_{12} < 0$. Thus $\mathbf{w}_1^T\mathbf{A}\mathbf{w}_2 \ge u_1 a_{12} u_2 > 0$, and (9.9.1) implies $c_1 = c_2$.

Otherwise $S_1, S_2$ overlap. Since $\mathbf{v}^T\mathbf{A}\mathbf{v} \le \lambda_n$, $\mathbf{v}$, like $\mathbf{u}$, is in the eigenspace of $\lambda_n$, and therefore so is

$$\mathbf{z} = c_1\mathbf{u} - \mathbf{v} = \sum_{j=1}^{m} (c_1 - c_j)\mathbf{w}_j.$$

By definition $w_{j,i} = 0$ unless $P_i \in S_j$. Choose $P_1$ and $P_2$ as in Lemma 9.9.1: $P_1 \in S_1 \cap S_2$ implies $u_1 = 0$, i.e., $w_{j,1} = 0$ for all $j$, so that $z_1 = 0$. Since $\mathbf{z}$ is in the eigenspace of $\lambda_n$, we have

$$\lambda_n\mathbf{z} = \mathbf{A}\mathbf{z} = \sum_{j=1}^{m} (c_1 - c_j)\mathbf{A}\mathbf{w}_j,$$

so that

$$\lambda_n z_1 = 0 = \sum_{j=1}^{m} (c_1 - c_j)(\mathbf{A}\mathbf{w}_j)_1 = \sum_{j=1}^{m} (c_1 - c_j)\sum_{i=2}^{N} a_{1i} w_{j,i}, \tag{9.9.2}$$

where we have used $w_{j,1} = 0$. The term $a_{1i}$, for $i \ge 2$, is zero unless $P_i \sim P_1$. Since $u_1 = 0$, all such $P_i$ are in $S_1$ or $S_2$. The sum in (9.9.2) is therefore over $j = 2$ only:

$$0 = (c_1 - c_2)\sum_{i=2}^{N} a_{1i} w_{2,i}.$$

Since $S_2$ is weak negative, $a_{1i} w_{2,i} \ge 0$ for $i = 2, \ldots, N$: each term in the sum is non-negative. Since $P_1 \sim P_2$ we have $a_{12} < 0$; since $P_2 \in S_2 \setminus S_1$, $w_{2,2} = u_2 < 0$, so that

$$\sum_{i=2}^{N} a_{1i} w_{2,i} \ge a_{12} u_2 > 0,$$

and hence $c_1 = c_2$.

We are now in a position to establish
Theorem 9.9.1 If $\mathcal{G}$ is connected, any eigenvector corresponding to $\lambda_n$ has at most $n$ weak sign graphs.

Proof. Suppose, if possible, that $\mathbf{u}$ has $m$ weak sign graphs $S_i$, $i = 1, 2, \ldots, m$, and $m > n$. At least one of the coefficients $c_i$, say $c_1$, is non-zero. Since $n \ge 1$, we have $m \ge 2$. Since $\mathcal{G}$ is connected, $S_1$ must be adjacent to at least one other weak sign graph, which we label $S_2$. Lemma 9.9.2 states that $c_2 = c_1$. If $m \ge 3$, one of $S_1, S_2$ must be adjacent to one of the remaining sign graphs $S_i$, $i = 3, \ldots, m$, say $S_3$, otherwise $\mathcal{G}$ would not be connected. Therefore $c_3 = c_2 = c_1$ by Lemma 9.9.2. In $m - 1$ steps, we conclude that $c_m = c_{m-1} = \cdots = c_2 = c_1$. Hence $\mathbf{v} = c_1\mathbf{u}$. But $\mathbf{v}$ was constructed so that it was orthogonal to $\mathbf{u}_i$ for $i = 1, 2, \ldots, m - 1$; if $m > n$, $\mathbf{v}$ is orthogonal to $\mathbf{u}$, contradicting $\mathbf{v} = c_1\mathbf{u}$. Therefore, $m \le n$.
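The contrast between the strong bound $n + r - 1$ and the weak bound $n$ can be seen on the star of Ex. 9.8.2. The sketch below uses $N = 6$ and a hand-picked eigenvector of $\lambda_2 = 1$ that has a zero leaf; both choices, and the helper names, are illustrative assumptions. As in the earlier sketch, a weak sign graph is assumed to contain at least one non-zero vertex.

```python
import numpy as np


def components(mask, A):
    """Connected components of the subgraph induced by the vertices where mask is True."""
    N = len(mask)
    seen = np.zeros(N, dtype=bool)
    comps = []
    for start in range(N):
        if not mask[start] or seen[start]:
            continue
        stack, comp = [start], []
        seen[start] = True
        while stack:
            p = stack.pop()
            comp.append(p)
            for q in range(N):
                if q != p and A[p, q] != 0 and mask[q] and not seen[q]:
                    seen[q] = True
                    stack.append(q)
        comps.append(comp)
    return comps


def count_sign_graphs(u, A, weak, tol=1e-12):
    """Total number of (strong or weak) sign graphs of u on the graph of A."""
    pos, neg = u > tol, u < -tol
    if weak:
        zero = ~pos & ~neg
        # Assumed convention: a weak sign graph must contain a non-zero vertex.
        pos_c = [c for c in components(pos | zero, A) if pos[c].any()]
        neg_c = [c for c in components(neg | zero, A) if neg[c].any()]
    else:
        pos_c, neg_c = components(pos, A), components(neg, A)
    return len(pos_c) + len(neg_c)


N = 6
A = np.eye(N)
A[0, 0] = N - 1
A[0, 1:] = -1.0
A[1:, 0] = -1.0

# Eigenvector of lambda_2 = 1 (n = 2, multiplicity r = N - 2 = 4):
# centre zero, leaf values summing to zero, one leaf also zero.
u = np.array([0.0, 3.0, -1.0, -1.0, -1.0, 0.0])

n_strong = count_sign_graphs(u, A, weak=False)
n_weak = count_sign_graphs(u, A, weak=True)
print("strong sign graphs:", n_strong, "(bound n + r - 1 = 5)")
print("weak sign graphs:  ", n_weak, "(bound n = 2)")
```

The zero centre joins all non-negative leaves into one weak positive sign graph and all non-positive leaves into one weak negative sign graph, so the weak count stays at $n = 2$ even though the strong count exceeds it.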
9.10
Generalisation to M, K problems
The proof of Theorem 9.8.1, on strong sign graphs, hinges on two fundamental results: Courant’s minimax theorem, and Duval and Reiner’s Lemma 9.8.1. Theorem 9.9.1 on weak sign graphs, uses these two, and Lemmas 9.9.1, 9.9.2.
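All these intermediate steps may be generalised to give results for the problem (9.1.1), in which $\mathbf{K}$ is PSD, $\mathbf{M}$ is PD, and $\mathbf{K}, \mathbf{M}$ satisfy (9.7.1). Thus, since $\mathbf{M}$ is PD, the minimax theorem holds for the Rayleigh quotient $\mathbf{v}^T\mathbf{K}\mathbf{v}/\mathbf{v}^T\mathbf{M}\mathbf{v}$. Duval and Reiner’s Lemma 9.8.1 may be generalised to read

Lemma 9.10.1

$$\mathbf{v}^T(\mathbf{K} - \lambda\mathbf{M})\mathbf{v} = \sum_{i=1}^{m} c_i^2\, \mathbf{w}_i^T(\mathbf{K} - \lambda\mathbf{M})\mathbf{u} - \frac{1}{2}\sum_{i,j=1}^{m} (c_i - c_j)^2\, \mathbf{w}_i^T(\mathbf{K} - \lambda\mathbf{M})\mathbf{w}_j.$$

Since $\mathbf{K}$ is PSD and $\mathbf{M}$ is PD, the eigenvalues $\lambda_i$ are non-negative. This means that when $\mathbf{w}_i, \mathbf{w}_j$ correspond to adjacent sign graphs

$$\mathbf{w}_i^T(\mathbf{K} - \lambda\mathbf{M})\mathbf{w}_j = (\pm)\{(-) - \lambda(+)\}(\mp) = +.$$

All the arguments used to establish Theorems 9.8.1, 9.9.1 proceed as before with $\mathbf{A}$ replaced by $\mathbf{K} - \lambda\mathbf{M}$.

Exercises 9.10

1. Establish Lemma 9.10.1.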
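A simple instance of the generalised problem is a uniform mesh of linear finite elements for a fixed-fixed string, which is an illustrative assumption and not an example taken from the text. Its stiffness matrix has negative off-diagonal entries and its consistent mass matrix has positive ones, which is the sign pattern the argument above relies on. The sketch below solves $\mathbf{K}\mathbf{u} = \lambda\mathbf{M}\mathbf{u}$ through a Cholesky factor of $\mathbf{M}$ and counts strong sign graphs on the underlying path graph; the mesh size and helper name are hypothetical.

```python
import numpy as np

n_el = 8                       # number of linear elements (illustrative)
h = 1.0 / n_el
N = n_el - 1                   # interior vertices of a fixed-fixed string

off = np.ones(N - 1)
K = (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1)) / h          # stiffness
M = (4.0 * np.eye(N) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0    # consistent mass

# Reduce K u = lambda M u to a standard symmetric problem using M = L L^T.
L = np.linalg.cholesky(M)
Linv = np.linalg.inv(L)
lam, Y = np.linalg.eigh(Linv @ K @ Linv.T)
U = Linv.T @ Y                 # columns are eigenvectors of the pencil (K, M)


def strong_sign_graphs_on_path(u, tol=1e-9):
    """On a path graph the strong sign graphs are the maximal runs of one strict sign."""
    count, prev = 0, 0
    for x in u:
        s = 1 if x > tol else (-1 if x < -tol else 0)
        if s != 0 and s != prev:
            count += 1
        prev = s
    return count


counts = [strong_sign_graphs_on_path(U[:, k]) for k in range(N)]
edge_entries = [(K - l * M)[0, 1] for l in lam]   # off-diagonal entry of K - lambda M

for k in range(N):
    print(f"n = {k + 1}: lambda = {lam[k]:9.4f}, sign graphs = {counts[k]}")
```

For every computed eigenvalue the off-diagonal entries of $\mathbf{K} - \lambda\mathbf{M}$ on the edges remain negative, and the number of sign graphs of the $n$th eigenvector does not exceed $n$, as Theorem 9.8.1 requires for simple eigenvalues.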
Chapter 10
Green’s Functions and Integral Equations

Mathematicians who are only mathematicians have exact minds, provided all things are explained to them by means of definitions and axioms; otherwise they are inaccurate and unsufferable, for they are only right when the principles are quite clear.
Pascal’s Pensées
10.1
Introduction
In this and the following two chapters we shall be concerned with the vibration of, and the inverse problems for, three systems with continuously distributed mass: the taut vibrating string, and the rod in longitudinal or torsional vibration. In this section we state the governing differential equation. In Section 10.2 we introduce the Green’s function and reformulate the eigenvalue problem giving the natural frequencies as an integral equation. In Section 10.3 we recall the relevant spectral theory for compact self-adjoint operators on a Hilbert space, and in Section 10.4 we apply it to the Green’s function integral equation. This chapter thus serves as introductory material for the study of inverse problems in Chapter 11.

The equation governing the free (infinitesimal, undamped) vibration of a taut string having unit tension, mass per unit length $\rho^2(x)$, vibrating with frequency $\omega$ is

$$v''(x) + \lambda\rho^2(x)v(x) = 0, \tag{10.1.1}$$

where $\lambda = \omega^2$ and $' \equiv d/dx$. We denote the mass per unit length by $\rho^2(x)$, rather than by $\rho(x)$, to indicate that it is positive, and to avoid continual repetition of $\rho^{1/2}(x)$. The end conditions will be assumed to be

$$v'(0) - hv(0) = 0 = v'(1) + Hv(1), \tag{10.1.2}$$

where $h, H \ge 0$ and $h, H$ are not both zero. This means that the ends $x = 0$, $x = 1$ are attached to fixed supports by the use of springs having stiffnesses $h, H$ respectively. Of course a real (physical) string cannot have a ‘free’ end in the straightforward sense. However, we can simulate a free end by attaching the end to a device that moves transversely in such a way that the slope of the string at the end remains zero.

The free longitudinal vibrations of a thin straight rod of cross-sectional area $A(x)$, density $\rho$ and Young’s modulus $E$ are governed by the equation

$$(A(x)w'(x))' + \lambda A(x)w(x) = 0, \tag{10.1.3}$$

where $\lambda = \rho\omega^2/E$. The end conditions are

$$w'(0) - hw(0) = 0 = w'(1) + Hw(1), \tag{10.1.4}$$

where again $h, H \ge 0$ and $h, H$ are not both zero.

The free torsional vibrations of a thin straight rod of second moment of area $J(x)$, density $\rho$ and shear modulus $G$ are governed by the equation

$$(J(x)\theta'(x))' + \lambda J(x)\theta(x) = 0, \tag{10.1.5}$$

where $\lambda = \rho\omega^2/G$. The end conditions are

$$\theta'(0) - h\theta(0) = 0 = \theta'(1) + H\theta(1). \tag{10.1.6}$$
There is clearly a one-one correspondence $(E, A, \lambda, w) \leftrightarrow (G, J, \lambda, \theta)$ between the longitudinal and torsional systems, but we now show that, by means of a transformation of variables, all these systems may be reduced to the same basic equation.

In equation (10.1.3) introduce a new variable $\xi$, where

$$\xi'(x) = 1/A(x), \quad w(x) = v(\xi). \tag{10.1.7}$$

Then $A(x)w'(x) = A(x)\dot{v}(\xi)\xi'(x) = \dot{v}(\xi)$, where $\cdot \equiv d/d\xi$. Hence $A(Aw')' = \ddot{v}$, and equation (10.1.3) becomes

$$\ddot{v}(\xi) + \lambda\rho^2(\xi)v(\xi) = 0, \tag{10.1.8}$$

with $\rho(\xi) = A(x)$. If

$$\xi(x) = \int_0^x \frac{dt}{A(t)}, \quad 1 = \int_0^1 \frac{dt}{A(t)}, \tag{10.1.9}$$

then the end conditions (10.1.4) become

$$\dot{v}(0) - hA(0)v(0) = 0 = \dot{v}(1) + HA(1)v(1). \tag{10.1.10}$$

Since $A(x)$ is positive and bounded, equation (10.1.8) has the same form as (10.1.1), and equation (10.1.10) has the same form as (10.1.2). This means that we may concentrate our attention on equations (10.1.1), (10.1.2).
We showed that equation (10.1.3) could be transformed into (10.1.1) by a simple change of variable. If we assume further smoothness in $A(x)$, that it has a second derivative, then we may transform (10.1.3) into another equation which is often viewed as the standard form, the so-called Sturm-Liouville equation. In equation (10.1.3) put $y(x) = f(x)w(x)$; then

$$(Aw')' = [A(f^{-1}y' - f'f^{-2}y)]' = Af^{-1}y'' + \{(Af^{-1})' - Af'f^{-2}\}y' - (Af'f^{-2})'y.$$

Choose the function $f$ to make the terms in $y'$ vanish:

$$(Af^{-1})' - Af'f^{-2} = A'f^{-1} - 2Af'f^{-2} = 0,$$

i.e., $(Af^{-2})' = 0$ or $f = A^{1/2}$. Then

$$(Aw')' + \lambda Aw \equiv fy'' - f''y + \lambda fy = 0$$

or

$$y''(x) + [\lambda - q(x)]y(x) = 0, \tag{10.1.11}$$

where

$$q(x) = f''(x)/f(x). \tag{10.1.12}$$

We note that since (10.1.3) may be transformed into (10.1.1), the latter may be transformed into (10.1.11). In fact if

$$v(x) = y(\xi)/f(\xi), \quad f(\xi) = \rho^{1/2}(x), \quad \xi'(x) = f^2(\xi) \tag{10.1.13}$$

then

$$v' = \dot{v}f^2 = f\dot{y} - \dot{f}y, \quad v'' = f^2(f\ddot{y} - \ddot{f}y)$$

and

$$v'' + \lambda\rho^2 v \equiv f^2(f\ddot{y} - \ddot{f}y) + \lambda f^4 f^{-1}y = 0$$

so that

$$\ddot{y}(\xi) + [\lambda - q(\xi)]y(\xi) = 0, \tag{10.1.14}$$

where

$$q(\xi) = \ddot{f}(\xi)/f(\xi). \tag{10.1.15}$$
where If ({) is continuous in [0,1] then equation (10.1.1) shows that y({) has a continuous second derivative. If ({) has a simple discontinuity at { = then y 0 ({) is continuous while y 00 ({) has a discontinuity at { = : ¯ ¯ ¯{ = + ¯{ = + = y()2 ({)¯¯ = (10.1.16) y 00 ({)¯¯ { = { =
If ({) therefore is piecewise continuous in (0,1) then y 00 ({) is piecewise continuous also.
To show that any eigenvalues of (10.1.1), (10.1.2) must be real and positive we may argue as follows: suppose $\lambda$, possibly complex, is an eigenvalue, and $v(x)$ a corresponding eigenfunction. Multiply (10.1.1) by $\bar{v}(x)$ and integrate over $(0, 1)$:

$$\int_0^1 v''\bar{v}\,dx + \lambda\int_0^1 \rho^2 v\bar{v}\,dx = 0.$$

Integrate the first term by parts and use the end conditions (10.1.2):

$$\int_0^1 v'\bar{v}'\,dx + hv(0)\bar{v}(0) + Hv(1)\bar{v}(1) = \lambda\int_0^1 \rho^2 v\bar{v}\,dx. \tag{10.1.17}$$

The terms on the left are real; the integral on the right is real and positive; so $\lambda$ is real. The sum on the left can be zero only when $v'(x) \equiv 0$ and $hv(0) = 0 = Hv(1)$. There are two cases to consider:

i) $h, H > 0$; in this case $v(0) = v'(0) = v(1) = v'(1) = 0$ so that $v(x) \equiv 0$, and there is no eigenfunction $v(x)$.

ii) $h = 0 = H$; in this case the supports have no stiffness, and there is an eigenvalue $\lambda = 0$ with eigenfunction $v(x) = $ constant. This is called a rigid-body mode. Apart from this case, any eigenvalue is strictly positive.

Any eigenvalues must be simple, for if $u(x), v(x)$ were two different eigenfunctions corresponding to the same eigenvalue $\lambda$, then

$$u''(x)v(x) - u(x)v''(x) = 0,$$

i.e.,

$$u'(x)v(x) - u(x)v'(x) = \text{constant}.$$

But at $x = 0$, the end condition (10.1.2) gives

$$u'(0)v(0) - u(0)v'(0) = 0.$$

Thus

$$u'(x)v(x) - u(x)v'(x) = 0,$$

and $u(x), v(x)$ are proportional.

Suppose $v_1(x), v_2(x)$ are eigenfunctions of (10.1.1), (10.1.2) corresponding to different eigenvalues $\lambda_1, \lambda_2$. Then

$$v_1'' + \lambda_1\rho^2 v_1 = 0 = v_2'' + \lambda_2\rho^2 v_2$$

and

$$\int_0^1 (v_1''v_2 - v_2''v_1)\,dx + (\lambda_1 - \lambda_2)\int_0^1 \rho^2 v_1 v_2\,dx = 0.$$

But

$$\int_0^1 (v_1''v_2 - v_2''v_1)\,dx = [v_1'v_2 - v_2'v_1]_0^1 = 0$$

on account of the end conditions, and hence, since $\lambda_1 - \lambda_2 \ne 0$, $v_1$ and $v_2$ are orthogonal in the sense

$$\int_0^1 \rho^2 v_1 v_2\,dx = 0.$$
We have shown that if equations (10.1.1), (10.1.2) have eigenvalues then they will satisfy

$$0 \le \lambda_1 < \lambda_2 < \cdots \tag{10.1.18}$$

with equality only as stated above. The corresponding eigenfunctions $v_i(x)$ will be orthogonal; they may be normalised so that

$$\int_0^1 \rho^2 v_i v_j\,dx = \delta_{ij}. \tag{10.1.19}$$
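The normalisation (10.1.19) can be illustrated with a case in which the eigenfunctions are known in closed form. The sketch below assumes $\rho = 1$ with a free end at $x = 0$ ($h = 0$) and a fixed end at $x = 1$ ($H = \infty$), the special case listed in Ex. 10.1.2 b), for which $\omega_n = (n - 1/2)\pi$ and the eigenfunctions are $\cos\omega_n x$. The factor $\sqrt{2}$, the grid size and the helper `trapezoid` are illustrative choices.

```python
import numpy as np


def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) * 0.5)


x = np.linspace(0.0, 1.0, 4001)
n_modes = 4

# Assumed special case: rho = 1, h = 0, H = infinity, so omega_n = (n - 1/2) pi.
omegas = [(n - 0.5) * np.pi for n in range(1, n_modes + 1)]
modes = [np.sqrt(2.0) * np.cos(w * x) for w in omegas]     # normalised eigenfunctions

# Gram matrix of the eigenfunctions with weight rho^2 = 1.
gram = np.array([[trapezoid(vi * vj, x) for vj in modes] for vi in modes])
print(np.round(gram, 6))
```

Up to quadrature error, the Gram matrix is the identity, which is the statement of (10.1.19) for this particular string.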
We have shown that the differential equation we are studying may be presented in three different forms: (10.1.1), (10.1.3) or (10.1.11). For vibration purposes the fundamental equations are the first two: (10.1.1) for the taut string; (10.1.3) for the rod. Equation (10.1.11), called the Sturm-Liouville equation, is introduced as a standard mathematical form because it is easier to analyse, particularly for the asymptotic form of the eigenvalues, and for the inverse problem. Equation (10.1.11) is the one that has been studied by most pure mathematicians, but in our study of vibration problems, we must always remember that it is a secondary equation.

In this chapter, we will study some of the basic properties of the equations, particularly the so-called spectral theory. In Chapter 11 we will study some inverse problems: how to reconstruct the functions $\rho(x)$, $A(x)$ or $q(x)$, appearing respectively in the three forms of the equation.

In the spectral theory there are six main topics:

i) The existence of an infinite sequence of real distinct eigenvalues with only one limit point, $+\infty$. For equations (10.1.1) and (10.1.3) these are all positive apart perhaps for the first, which is zero when $h = 0 = H$.

ii) The completeness of eigenfunctions on $[0, 1]$.

iii) The asymptotic form of the eigenvalues and the so-called norming constants.

iv) The interlacing of eigenvalues corresponding to different end constants $h, H$.

v) The oscillatory properties of eigenfunctions: how many nodes they have.

vi) The interlacing of nodes of neighbouring eigenfunctions.

Each of these topics may be studied in various ways, but there are basically just two avenues of approach: through the study of the differential equation itself; by converting the differential equation to an integral equation and studying that.

Of the six topics, the most difficult is undoubtedly ii), the completeness of the eigenfunctions. In their recent monograph, Levitan and Sargsjan (1991) [212] study completeness by reducing (10.1.11) to an integral equation and then using a variety of approaches to establish completeness. We will approach topics i) and ii) differently, in a way that mimics somewhat the matrix approach to discrete problems, by starting from (10.1.1), converting it to an eigenvalue problem for an integral operator, and establishing the necessary functional analysis. This approach takes more pages than Levitan and Sargsjan’s, but we believe it has merit. For the establishment of the asymptotic form of the eigenvalues we will start from (10.1.11).

Topics v) and vi), nodes and interlacing, were studied by Sturm in his original work. The classical treatment, beautifully presented, may be found in Ince (1927) [185]. Levitan and Sargsjan follow Sturm’s approach. We will use the total positivity properties of the integral equation, following the lines of Gantmacher and Krein (1950) [98].

There are two ways to normalise the governing equations and to number the eigenvalues; both have their own advantages and disadvantages, and we shall therefore use both, at different times; we label them V, for vibration, and S, for Sturm-Liouville.

V: the governing equation is (10.1.1) or (10.1.3), the equation holds for $x \in [0, 1]$; the end conditions are (10.1.2) or (10.1.4); the eigenvalues are labelled $(\lambda_i)_1^\infty$, the eigenfunctions $(v_i(x))_1^\infty$.

S: the governing equation is (10.1.11), the equation holds for $x \in [0, \pi]$; the end conditions are $y'(0) - hy(0) = 0 = y'(\pi) + Hy(\pi)$; the eigenvalues are labelled $(\lambda_i)_0^\infty$ and the eigenfunctions $(y_i(x))_0^\infty$.
Thus we will use V for the analysis in Sections 10.2-10.8 based on the Green’s function approach to equation (10.1.11). We will use S for the study of the asymptotic form of the eigenvalues in Section 10.9, and for the analysis of the inverse problems for the Sturm-Liouville equation (10.1.11) in Chapter 11.

Exercises 10.1

1. Show that the eigenvalues and eigenfunctions of (10.1.1) for $\rho = 1$ and the end conditions (10.1.2) are given by
$$\lambda_n = \omega_n^2, \quad \omega_n = \alpha_n + \beta_n + (n-1)\pi, \quad n = 1, 2, \ldots$$
where
$$\alpha_n = \arctan(h/\omega_n), \quad \beta_n = \arctan(H/\omega_n)$$
and
$$v_n = \cos(\omega_n x - \alpha_n), \quad n = 1, 2, \ldots$$
Hence, show that $\omega_n$ is an increasing function of $h$ and $H$ and that, when $h, H$ are positive, there is just one eigenvalue $\omega_n$ in each of the intervals $((n-1)\pi, n\pi)$, $n = 1, 2, \ldots$

2. Consider various special cases of Ex. 10.1.1. Thus,
a) $h = 0 = H$, then $\omega_n = (n-1)\pi$, $n = 1, 2, \ldots$ Note: in this case, which was considered earlier, there is a zero eigenvalue with eigenfunction $v_1 = 1$.
b) $h = 0$, $H = \infty$, then $\omega_n = (n - 1/2)\pi$, $v_n = \cos \omega_n x$.
c) $h = \infty$, $H = \infty$, then $\omega_n = n\pi$, $v_n = \sin \omega_n x$.
d) $h, H$ finite, then for large $n$
$$\omega_n = (n-1)\pi + \frac{h + H}{(n-1)\pi} + O\left(\frac{1}{n^3}\right).$$
Note that this expression indicates that it would be an advantage to label the eigenvalues $\lambda_0, \lambda_1, \ldots$ rather than $\lambda_1, \lambda_2, \ldots$

3. Explore how the end conditions change as one equation of (10.1.1), (10.1.3), (10.1.11) is changed to another. Note that the basic equations for vibration purposes are (10.1.1), (10.1.2) and (10.1.3), (10.1.4) in which $h, H$ are nonnegative. Note particularly that if (10.1.3) is changed to (10.1.1), i.e., to (10.1.8), the end conditions retain the same form; compare (10.1.10) and (10.1.2). But when (10.1.1) or (10.1.3) is changed to the standard form (10.1.11) the end conditions change: $v'(0) - h v(0) = 0$ becomes $\dot{y}(0) - k y(0) = 0$, and $h > 0$ does not imply $k > 0$.
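The frequency equation of Ex. 10.1.1 has no closed-form solution for finite positive $h, H$, but it is easy to solve numerically. The following is a minimal sketch, not part of the original text: the function names `omega` and `omega_asymptotic` are hypothetical, and plain bisection is used because the residual changes sign exactly once on $((n-1)\pi, n\pi)$.

```python
import math

def omega(n, h, H, tol=1e-12):
    """Solve w = atan(h/w) + atan(H/w) + (n-1)*pi on ((n-1)*pi, n*pi); assumes h, H > 0."""
    def residual(w):
        return w - math.atan(h / w) - math.atan(H / w) - (n - 1) * math.pi

    lo = max((n - 1) * math.pi, 1e-12)  # residual is negative at the left end
    hi = n * math.pi                    # residual is positive at the right end
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def omega_asymptotic(n, h, H):
    """Leading terms of Ex. 10.1.2 d); only meaningful for n >= 2."""
    base = (n - 1) * math.pi
    return base + (h + H) / base

if __name__ == "__main__":
    for n in (2, 5, 50):
        print(n, omega(n, 1.0, 1.0), omega_asymptotic(n, 1.0, 1.0))
```

Comparing the two columns for increasing $n$ gives a quick check of the asymptotic form in part d) of Ex. 10.1.2.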
10.2 Green’s functions
The idea of a Green’s function is perhaps most easily introduced by considering the static deflection of a string with fixed ends due to a distributed load $f(x)$. The governing equation is
$$-v''(x) = f(x) \tag{10.2.1}$$
and the end conditions are $v(0) = 0 = v(1)$. If instead of a distributed load we consider a single unit concentrated load at $x = s$, then the string will be straight on each side of $x = s$, and have a discontinuity in its slope at $x = s$, as shown in Figure 10.2.1.

[Figure 10.2.1 - The plucked string.]

Thus
$$v(x) = \begin{cases} Ax, & 0 \le x \le s, \\ B(1 - x), & s < x \le 1. \end{cases} \tag{10.2.2}$$
Equilibrium of the two portions gives
$$\left[\frac{dv}{dx}\right]_{x = s-}^{x = s+} = -1,$$
so that, on using (10.2.2), we find $A + B = 1$. Continuity yields $As = B(1 - s)$ so that $A = (1 - s)$, $B = s$. We call the resulting deflection $G(x, s)$; thus
$$G(x, s) = \begin{cases} x(1 - s), & 0 \le x \le s, \\ s(1 - x), & s \le x \le 1. \end{cases} \tag{10.2.3}$$
To obtain the deflection of the string under the action of the distributed load $f(x)$ we combine the actions of the concentrated forces $f(s)\,ds$ at the locations $s$; thus
$$v(x) = \int_0^1 G(x, s) f(s)\,ds. \tag{10.2.4}$$
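As a quick numerical check of (10.2.3) and (10.2.4), the sketch below evaluates the superposition integral by the midpoint rule. It is an illustration only: the names `green_string` and `deflection` and the choice of quadrature are assumptions, not part of the text. For the uniform load $f = 1$ the exact deflection is $x(1 - x)/2$.

```python
def green_string(x, s):
    """G(x, s) of (10.2.3): deflection at x due to a unit load at s, both ends fixed."""
    return x * (1.0 - s) if x <= s else s * (1.0 - x)

def deflection(x, load, cells=2000):
    """Midpoint-rule approximation of v(x) in (10.2.4) for a load function f(s)."""
    ds = 1.0 / cells
    total = 0.0
    for j in range(cells):
        s = (j + 0.5) * ds
        total += green_string(x, s) * load(s)
    return total * ds

def slope_jump(s, eps=1e-6):
    """Finite-difference estimate of dG/dx at x = s+ minus dG/dx at x = s-."""
    right = (green_string(s + eps, s) - green_string(s, s)) / eps
    left = (green_string(s, s) - green_string(s - eps, s)) / eps
    return right - left

if __name__ == "__main__":
    x = 0.3
    print(deflection(x, lambda s: 1.0), x * (1.0 - x) / 2.0)
    print(slope_jump(0.4))
```

The second printed line estimates the slope discontinuity at the load point, which should be close to $-1$.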
Clearly, we may generalise this procedure, and define a Green’s function for the general end conditions (10.1.2). We introduce two solutions of $v''(x) = 0$: $\phi(x)$ satisfying the condition $\phi'(0) - h\phi(0) = 0$; $\psi(x)$ satisfying $\psi'(1) + H\psi(1) = 0$. Since $\phi''(x) = 0 = \psi''(x)$, we have $\phi(x)\psi''(x) - \phi''(x)\psi(x) = 0$, which on integrating gives $\phi(x)\psi'(x) - \phi'(x)\psi(x) = \text{const}$. We choose this constant as $-1$, so that
$$\phi(x)\psi'(x) - \phi'(x)\psi(x) = -1, \tag{10.2.5}$$
and define
$$G(x, s) = \begin{cases} \phi(x)\psi(s), & 0 \le x \le s, \\ \phi(s)\psi(x), & s \le x \le 1, \end{cases} \tag{10.2.6}$$
then $G(x, s)$ is continuous at $x = s$, while
$$\left[\frac{\partial G}{\partial x}(x, s)\right]_{x = s-}^{x = s+} = -1.$$
Note that
$$\phi(x) = A(1 + hx), \quad \psi(x) = B(1 + H(1 - x)), \quad \text{where} \quad AB = 1/(h + H + hH), \tag{10.2.7}$$
and the conditions $h \ge 0$, $H \ge 0$, $h + H > 0$ ensure that the denominator in (10.2.7) is positive. We note that the Green’s function is symmetric, i.e.,
$$G(x, s) = G(s, x), \tag{10.2.8}$$
and that the functions $\phi(x)$, $\psi(x)$ are positive, $\phi(x)$ increasing while $\psi(x)$ decreasing. In fact, (10.2.5) shows that $\phi(x)/\psi(x)$ is an increasing function of $x$. There is thus a clear parallel between the Green’s function and the Green’s matrix introduced in Section 10.5.

For our purposes, the most important use of the Green’s function is that it reduces the free vibration problem (10.1.1), (10.1.2) to an eigenvalue problem for an integral equation:
$$v(x) = \lambda \int_0^1 \rho^2(s) G(x, s) v(s)\,ds. \tag{10.2.9}$$
With the changes of variable
$$u(x) = \rho(x) v(x), \quad K(x, s) = \rho(x)\rho(s) G(x, s), \tag{10.2.10}$$
we may transform (10.2.9) into the symmetric equation
$$\int_0^1 K(x, s) u(s)\,ds = \mu u(x), \tag{10.2.11}$$
in which $K(x, s) = K(s, x)$, and $\mu = 1/\lambda$. There is a well established body of theory for such integral equations, which we now recall. The theory relates to a compact, self-adjoint linear operator in a separable Hilbert space. In Section 10.3 we summarize the theory regarding the spectrum of such an operator, and in Section 10.4 we apply it to the operator equation (10.2.11).
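The passage from (10.2.6), (10.2.7) to the eigenvalue problem (10.2.11) can be tried out numerically. The sketch below is an illustration under stated assumptions, not the method of the text: it samples $K(x, s)$ at cell midpoints (a Nyström-type discretisation), finds the largest $\mu$ by power iteration, and returns $\lambda_1 = 1/\mu_1$. The names `make_green` and `lowest_eigenvalue`, the split $A = 1$ of the product $AB$, and the grid sizes are all hypothetical choices.

```python
import math

def make_green(h, H):
    """Return G(x, s) of (10.2.6) with phi, psi from (10.2.7); A = 1 is an arbitrary split of AB."""
    A = 1.0
    B = 1.0 / (h + H + h * H)

    def phi(x):
        return A * (1.0 + h * x)

    def psi(x):
        return B * (1.0 + H * (1.0 - x))

    def G(x, s):
        return phi(x) * psi(s) if x <= s else phi(s) * psi(x)

    return G

def lowest_eigenvalue(h, H, rho=lambda x: 1.0, cells=100, sweeps=80):
    """Approximate lambda_1 = 1/mu_1 of (10.2.11) by power iteration on a midpoint grid."""
    G = make_green(h, H)
    ds = 1.0 / cells
    pts = [(j + 0.5) * ds for j in range(cells)]
    K = [[rho(x) * rho(s) * G(x, s) * ds for s in pts] for x in pts]
    u = [1.0 / math.sqrt(cells)] * cells
    mu = 0.0
    for _ in range(sweeps):
        w = [sum(k * t for k, t in zip(row, u)) for row in K]
        mu = sum(a * b for a, b in zip(w, u))  # Rayleigh quotient, since ||u|| = 1
        norm = math.sqrt(sum(a * a for a in w))
        u = [a / norm for a in w]
    return 1.0 / mu

if __name__ == "__main__":
    lam = lowest_eigenvalue(1.0, 1.0)
    w = math.sqrt(lam)
    # For rho = 1, Ex. 10.1.1 with n = 1 requires w = atan(h/w) + atan(H/w).
    print(lam, w - 2.0 * math.atan(1.0 / w))
```

For $\rho = 1$ the printed residual should be small, which ties the integral-equation formulation back to the frequency equation of Ex. 10.1.1.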
Exercises 10.2

1. Find the solutions of $(Aw')' = 0$, $\phi(x)$ satisfying (10.1.4a), and $\psi(x)$ satisfying (10.1.4b), and make
$$A(x)\{\phi(x)\psi'(x) - \phi'(x)\psi(x)\} = -1$$
and hence write (10.1.3) as an integral equation
$$w(x) = \lambda \int_0^1 A(s) G(x, s) w(s)\,ds.$$

2. Show that if $v(x)$ satisfies
$$v'' + \lambda\rho^2 v = 0, \quad v(0) = 0 = v'(1),$$
and $\rho(x)$ has a continuous first derivative, then $u = v'$ satisfies $(\rho^{-2}u')' + \lambda u = 0$, $u'(0) = 0 = u(1)$. Hence show that $u(x)$ satisfies
$$u(x) = \lambda \int_0^1 K(x, s) u(s)\,ds$$
where
$$K(x, s) = \int_{x_+}^1 \rho^2(t)\,dt, \quad x_+ = \max(x, s).$$

3. Show that if $w(x)$ satisfies
$$(Aw')' + \lambda A w = 0, \quad w(0) = 0 = w'(1), \quad A > 1,$$
then $v = Aw'$ satisfies $(Bv')' + \lambda B v = 0$ where $B = 1/A$, $v'(0) = 0 = v(1)$. Hence, show that
$$v(x) = \lambda \int_0^1 G(x, s) B(s) v(s)\,ds$$
where
$$G(x, s) = \int_{x_+}^1 A(t)\,dt, \quad x_+ = \max(x, s).$$
10.3 Some functional analysis
In the first edition of this book, in order to prove the existence of eigenvalues and eigenfunctions for the integral equation, i.e., operator equation, (10.2.11) we referred the reader to the classical treatment of integral equations in Courant and Hilbert (1953) [64]. Instead, in this edition, we sketch the functional analysis approach to existence by providing the reader with a sign-posted journey through parts of the book Functional Analysis by Lebedev, Vorovich and Gladwell (1996) [205]. We refer to definitions and theorems in that book by the abbreviations Def. and Th. respectively.

The journey starts with the definition of a metric space $X$, Def. 2.1.4: a set of elements governed by a distance metric $d(x, y)$ satisfying certain distance axioms. After defining an open ball or $\epsilon$-neighbourhood of a point $x_0 \in X$, Def. 2.2.1, we define an open set in $X$ as one in which every point is an interior point. Then, after defining limit points Def. 2.2.3, we define a closed set as one that contains all its limit points, Def. 2.2.6. We define the closure $\bar{S}$ of a set $S$ as the set obtained by adding to $S$ all its limit points, and say, Def. 2.2.7, that $S$ is dense in a set $T$ if $\bar{S} \supseteq T$.

The journey continues through metric spaces, to give the metric space versions of limit of a sequence, Def. 2.4.1; Cauchy sequence, Def. 2.4.2; and complete metric space, Def. 2.5.1: a metric space in which every Cauchy sequence has a limit. The definitions Def. 2.6.1, 2.6.2 and the completion theorem Th. 2.6.1 explain how any metric space may be completed. The definition of an operator is given in

Definition 10.3.1 Let $X$ and $Y$ be metric spaces. A correspondence $Ax = y$, $x \in X$, $y \in Y$ is called an operator from $X$ into $Y$, if to each $x \in X$ there corresponds no more than one $y \in Y$. The set of all those $x \in X$ for which there exists a corresponding $y \in Y$ is called the domain of $A$ and denoted by $D(A)$; the set of all $y$ arising from $x \in X$ is called the range of $A$ and denoted by $R(A)$. Thus
$$R(A) = \{y \in Y;\ y = Ax,\ x \in X\}.$$
We say that $A$ is an operator on $D(A)$ into $Y$, or on $D(A)$ onto $R(A)$. We also say that $R(A)$ is the image or map of $D(A)$ under $A$. The null space of $A$, denoted by $N(A)$, is the set of all $x \in X$ such that $Ax = 0$.

A functional, Def. 2.7.2, is defined as an operator from $X$ to the real numbers $\mathbb{R}$, or complex numbers $\mathbb{C}$. The definition of a continuous operator Def. 2.7.3 is the straightforward analogue of continuity of an ordinary function.

The journey now passes to linear spaces (Section 2.8) over $\mathbb{R}$ or $\mathbb{C}$, with the property that if $x, y \in X$ then $\alpha x + \beta y \in X$; when equipped with a norm $\|\cdot\|$, they become normed linear spaces, Def. 2.8.1. After defining a subspace, Def. 2.8.4, we define closed subspace Def. 2.8.5, linear dependence and independence Def. 2.8.6; and dimension, Def. 2.8.8. We carry the notion of an operator in a metric space over and define a linear operator, Def. 2.9.2, in a normed linear space as one that satisfies $A(\alpha x + \beta y) = \alpha A(x) + \beta A(y)$; define a continuous linear operator, and the norm of a continuous linear operator from $X$ to $Y$ by (Th. 2.9.1)
$$\|A\| = \sup_{x \in D(A)} \frac{\|Ax\|_Y}{\|x\|_X}. \tag{10.3.1}$$
$A$ is continuous, or bounded, iff $\|A\|$ is finite.

The concepts of metric, $d(x, y)$, and norm, $\|x\|$, generalise the notions of distance and magnitude in $\mathbb{R}^3$, respectively. We now pass to an inner product space $X$ in which an inner product $(x, y)$ is defined for every pair $x, y \in X$. This inner product satisfies the axioms

P1: $(x, x) \ge 0$, and $(x, x) = 0$ iff $x = 0$;
P2: $(x, y) = \overline{(y, x)}$;
P3: $(\alpha x + \beta y, z) = \alpha(x, z) + \beta(y, z)$.

Here, $\alpha, \beta \in \mathbb{C}$ and the overbar in P2 denotes complex conjugate. In a real inner product space, P2 is replaced by P2': $(x, y) = (y, x)$. In an inner product space we may define a norm by
$$\|x\| = (x, x)^{1/2}.$$
That this does in fact provide a norm in the usual sense follows from the Cauchy-Schwarz inequality (Th. 2.12.1)
$$|(x, y)| \le \|x\| \cdot \|y\|, \tag{10.3.2}$$
with equality when $x \neq 0$, $y \neq 0$, iff $x = \alpha y$. For an inner product space $X$ we may define the terms orthogonal and orthonormal: $x$ and $y$ are orthogonal if $(x, y) = 0$; a system $\{g_k\} \subset X$ is orthonormal if
$$(g_m, g_n) = \delta_{mn} = \begin{cases} 1, & m = n, \\ 0, & m \neq n. \end{cases} \tag{10.3.3}$$
We may easily extend the concepts of closed and complete to inner product spaces, and we call a complete inner product space a Hilbert space $H$, Def. 2.12.5. The concept of orthogonality leads to the idea of the orthogonal decomposition of a Hilbert space into a closed subspace $M$ and its orthogonal complement $N = M^{\perp}$; if $x \in H$, then $x$ may be written
$$x = m + n, \quad m \in M, \quad n \in N. \tag{10.3.4}$$
Clearly, a closed subspace of a Hilbert space is itself a Hilbert space. This leads to Riesz’s representation theorem Th. 4.3.3, which states that any continuous (i.e., bounded) linear functional $F(x)$ on $H$ may be expressed as an inner product:
$$F(x) = (x, f) \quad \text{for every } x \in H, \tag{10.3.5}$$
and $\|F\| = \|f\|$.

We now define a separable Hilbert space $H$, Def. 4.1.3, one that contains a countable (enumerable) dense subset $\{f_n\}$. From such a sequence we may, by the usual Gram-Schmidt procedure, construct an orthonormal set $\{g_k\}$ that is dense in $H$; this will be a complete orthonormal system in the sense that if $x \in H$ and $\epsilon > 0$ are given, there is a finite linear combination of the $g_k$ such that
$$\left\| x - \sum_{k=1}^{n} \alpha_k g_k \right\| \le \epsilon. \tag{10.3.6}$$
In this case any $x \in X$ has a unique representation
$$x = \sum_{k=1}^{\infty} \alpha_k g_k, \quad \alpha_k = (x, g_k), \tag{10.3.7}$$
and Parseval’s equality holds:
$$\|x\|^2 = \sum_{k=1}^{\infty} |\alpha_k|^2. \tag{10.3.8}$$
It may be argued that almost all existence proofs in Functional Analysis rely on the concept of a compact set in a metric space. The concept compact is similar to, but must be sharply distinguished from, the concepts closed and complete. In brief, $S \subset X$ is closed if it contains all its limit points; $S$ is complete if every Cauchy sequence in $S$ has a limit point in $S$. A set $S \subset X$ is compact Def. 6.1.1 if every sequence $\{x_n\}$ in $S$ contains a subsequence $\{x_{n_k}\}$ which converges to a point $x \in S$.

The classical Bolzano-Weierstrass Theorem (Th. 1.1.2) states that in a finite-dimensional space e.g., $\mathbb{R}^N$, a set $S$ is compact iff it is closed and bounded. This result is false for general metric spaces. To be precise, a compact set $S \subset X$ is closed and bounded, but a closed and bounded set is compact only if the space $X$ is finite-dimensional. In order to find a criterion for compactness of a set $S$ in an infinite-dimensional metric space we must generalise the classical Heine-Borel Theorem; this uses the concept of an $\varepsilon$-covering.

Definition 10.3.2 Let $X$ be a metric space, and suppose $S \subset X$. A finite set of $N$ balls $B(x_n, \varepsilon)$ with $x_n \in X$ and $\varepsilon > 0$ is said to be a finite $\varepsilon$-covering of $S$, if every element of $S$ lies inside one of the balls $B(x_n, \varepsilon)$, i.e.,
$$S \subset \bigcup_{n=1}^{N} B(x_n, \varepsilon).$$
The set of centers $\{x_n\}$ of a finite $\varepsilon$-covering is called a finite $\varepsilon$-net for $S$.

Definition 10.3.3 Let $X$ be a metric space. A set $S \subset X$ is said to be totally bounded if it has a finite $\varepsilon$-covering for every $\varepsilon > 0$.

Hausdorff’s compactness criterion is now

Theorem 10.3.1 Let $X$ be a complete metric space. A set $S \subset X$ is compact iff it is closed and totally bounded.

In a compact set the points are, as the word compact suggests, close together; the centers $x_n$ form a network, and each point in $S$ is near one of the $x_n$. Having the concept of a compact set, we may introduce the idea of a compact (linear) operator.

Definition 10.3.4 Let $X, Y$ be metric spaces. A linear operator from $X$ to $Y$ is said to be compact if it maps the unit ball into a compact set in $Y$.

Note that the map of the unit ball may not itself be a compact set; it is in a compact set. We say that it is precompact, meaning that it may be made compact by closing it: its closure is compact. If the range of a linear operator $A$ is finite-dimensional, we say that $A$ is a finite-dimensional operator. The Bolzano-Weierstrass Theorem then implies that a finite-dimensional operator is compact. We may now use Hausdorff’s compactness criterion to obtain a wider class of compact operators.

Theorem 10.3.2 Let $X, Y$ be metric spaces, and suppose $Y$ is complete. If the sequence of compact linear operators $\{A_n\}$ from $X$ to $Y$ converges uniformly to $A$, then $A$ is compact.
Proof. Uniform convergence means $\|A - A_n\| \to 0$. Let $S$ be the unit ball in $X$. Choose $\varepsilon > 0$, and then choose $A_n$ so that $\|Ax - A_n x\| < \varepsilon/3$ for all $x \in S$. The operator $A_n$ is compact; therefore the map $A_n(S)$ of $A_n$ is precompact; its closure is compact. Therefore, by Th. 6.2.1, it is totally bounded; there is a finite set $\{x_1, x_2, \ldots, x_m\} \subset S$ such that every point in $A_n(S)$ lies in a ball of radius $\varepsilon/3$ around one of $A_n x_1, A_n x_2, \ldots, A_n x_m$. Choose $x \in S$, then choose $i$ so that $\|A_n x - A_n x_i\| < \varepsilon/3$; then
$$\|Ax - Ax_i\| \le \|Ax - A_n x\| + \|A_n x - A_n x_i\| + \|A_n x_i - A x_i\| \le \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon.$$
This means that the set $A(S)$ is totally bounded and therefore, again by Th. 6.2.1, precompact. (Note that we need $Y$ to be complete.) Thus $A$ is compact.

Having introduced one concept, compact, we now introduce another, self-adjoint. To do so we suppose from now on that $A$ is a continuous linear operator on a Hilbert space $H$, i.e., from $H$ to $H$; we say $A \in B(H, H)$. If $x, y \in H$, then $G(x) = (Ax, y)$ is a continuous functional on $H$; therefore, there is a $g \in H$ such that $(Ax, y) = (x, g)$. Clearly, $g$ depends linearly on $y$, and in fact is the map of $y$ under a new continuous operator $A^*$, called the adjoint of $A$; thus $g = A^* y$ and
$$(Ax, y) = (x, A^* y). \tag{10.3.9}$$
If $A^* = A$, then $A$ is said to be self-adjoint. If $A$ is self-adjoint, the functional $F(x) = (Ax, x)$ is real valued, because
$$\overline{F(x)} = \overline{(Ax, x)} = (x, Ax) = (Ax, x) = F(x).$$
This functional is extremely important because, if $A \in B(H, H)$ is self-adjoint, then there are two ways to write $\|A\|$, one from (10.3.1), namely
$$\|A\| = \sup \|Ax\| \quad \text{for } \|x\| = 1 \tag{10.3.10}$$
and another involving $F(x)$, namely
$$\|A\| = \sup |F(x)| = \sup |(Ax, x)| \quad \text{for } \|x\| = 1. \tag{10.3.11}$$
We denote
$$\sup\{F(x)\} = M, \quad \inf\{F(x)\} = m, \quad \text{for } \|x\| = 1. \tag{10.3.12}$$
Clearly,
$$\|A\| = \sup(|M|, |m|). \tag{10.3.13}$$
We are now in a position to define an eigenvalue of an operator $A \in B(H, H)$.
Definition 10.3.5 Suppose $A \in B(H, H)$. The scalar $\mu$ is called an eigenvalue of $A$ if there is a non-zero $x \in H$ such that $Ax = \mu x$; $x$ is called an eigenvector corresponding to $\mu$.

Note that we use $\mu$, rather than $\lambda$, to denote an eigenvalue, so that we can use $\lambda = 1/\mu$ to denote an eigenvalue of the differential equation (10.1.1). Clearly, any eigenvalue of a self-adjoint operator must be real, for $Ax = \mu x$ implies $(Ax, x) = \mu(x, x)$. See also Ex. 10.3.1.

Theorem 10.3.3 If $A \in B(H, H)$ is self-adjoint and $\mu$ is not an eigenvalue of $A$, then $R(A - \mu I)$ is dense in $H$.

Proof. We need to show that the closure of $R(A - \mu I)$ is $H$. This is equivalent to saying that if $z$ is orthogonal to all $(A - \mu I)x$, then $z = 0$. If $z$ is orthogonal to all $(A - \mu I)x$, then
$$0 = (z, (A - \mu I)x) = (z, Ax) - \bar{\mu}(z, x) = ((A - \bar{\mu} I)z, x)$$
for all $x \in H$. But, on taking $x = (A - \bar{\mu} I)z$, we find $(A - \bar{\mu} I)z = 0$. If $z$ is not zero, this states that $\bar{\mu}$ is an eigenvalue of $A$. But $A$ is self-adjoint so that $\bar{\mu}$ is real, i.e., $\bar{\mu} = \mu$; $\mu$ is an eigenvalue of $A$, contrary to hypothesis.

We now generalise the concept of an eigenvalue and introduce the concept of the spectrum of an operator.

Definition 10.3.6 Suppose $A \in B(H, H)$. The spectrum of $A$, denoted by $\sigma(A)$, is the set of all complex numbers $\mu$ such that $A - \mu I$ does not have a bounded inverse. The resolvent set $\rho(A)$ is the complement of $\sigma$, i.e., $\rho = \mathbb{C} \setminus \sigma$.

We recall that if $A \in B(H, H)$ then $\|Ax\| \le \|A\| \cdot \|x\|$; if $A$ is to have a bounded inverse then $\|Ax\| \ge k\|x\|$ for some $k > 0$. We prove

Lemma 10.3.1 If $A \in B(H, H)$ and $\|Ax\| \ge k\|x\|$ for all $x \in H$ and some $k > 0$, then $R(A)$ is closed.

Proof. Suppose $\{x_n\} \subset H$ and $Ax_n \to y$. The sequence $\{Ax_n\}$ is a Cauchy sequence, and so therefore is $\{x_n\}$ because $\|x_m - x_n\| \le \|Ax_m - Ax_n\|/k$. Since $H$ is complete, there is $x \in H$ such that $x_n \to x$. By continuity we have $Ax_n \to Ax$, so that $y = Ax$, i.e., $y \in R(A)$: $R(A)$ is closed.

We may now characterise the resolvent set of a self-adjoint operator.

Theorem 10.3.4 Suppose $A \in B(H, H)$ is self-adjoint, then $\mu \in \rho(A)$ iff $\|(A - \mu I)x\| \ge k\|x\|$ for all $x \in H$ and some $k > 0$.

Proof. If $\mu \in \rho(A)$, then $(A - \mu I)$ has a bounded inverse, so that $\|x\| = \|(A - \mu I)^{-1}(A - \mu I)x\| \le \|(A - \mu I)^{-1}\| \cdot \|(A - \mu I)x\|$, i.e., $\|(A - \mu I)x\| \ge \|(A - \mu I)^{-1}\|^{-1} \cdot \|x\|$.
Conversely, if $\|(A - \mu I)x\| \ge k\|x\|$ for all $x \in H$, then $\mu$ is not an eigenvalue, so Theorem 10.3.3 states that $R(A - \mu I)$ is dense in $H$, while Lemma 10.3.1 states that $R(A - \mu I)$ is closed. Thus $R(A - \mu I) = H$, and $\|(A - \mu I)x\| \ge k\|x\|$ states that $(A - \mu I)$ has a bounded inverse, i.e., $\mu \in \rho(A)$.

We now show that if $A \in B(H, H)$ is self-adjoint then its spectrum is real, non-empty, and lies within the interval $[m, M]$.

Theorem 10.3.5 If $A \in B(H, H)$ is self-adjoint, then $\sigma(A)$ is a non-empty subset of $[m, M]$, and $m, M \in \sigma(A)$.

Proof. First we prove that the spectrum is real. For suppose $\mu = \alpha + i\beta$, $\beta \neq 0$; then for all $x \in H$,
$$\|(A - \mu I)x\|^2 = (Ax - \alpha x - i\beta x, Ax - \alpha x - i\beta x) = \|(A - \alpha I)x\|^2 + \beta^2\|x\|^2 \ge \beta^2\|x\|^2.$$
Theorem 10.3.4 shows that $\mu \in \rho(A)$. Thus if $\mu \in \sigma(A)$ then $\mu$ must be real.

We now show that if $\mu < m$, then $\mu \in \rho(A)$. We have, on the one hand,
$$((A - \mu I)x, x) \le \|(A - \mu I)x\| \cdot \|x\|$$
and, on the other,
$$((A - \mu I)x, x) = (Ax, x) - \mu\|x\|^2 \ge m\|x\|^2 - \mu\|x\|^2 = (m - \mu)\|x\|^2,$$
so that
$$\|(A - \mu I)x\| \ge (m - \mu)\|x\|,$$
and Theorem 10.3.4 shows that $\mu \in \rho(A)$. We can show similarly that if $\mu > M$, then $\mu \in \rho(A)$. We have thus shown that, if $\sigma(A)$ exists, it must lie in $[m, M]$.

We now show that $M \in \sigma(A)$. By the definition of sup, there is a sequence $\{x_n\}$ such that $\|x_n\| = 1$, and $(Ax_n, x_n) \to M$. Therefore,
$$\|(A - MI)x_n\|^2 = (Ax_n - Mx_n, Ax_n - Mx_n) = \|Ax_n\|^2 - 2M(Ax_n, x_n) + M^2\|x_n\|^2 \le M^2 - 2M(Ax_n, x_n) + M^2 = 2M(M - (Ax_n, x_n)) \to 0.$$
Thus $M$, and similarly $m$, are in $\sigma(A)$.

So far, we have shown that a self-adjoint operator $A \in B(H, H)$ has a non-empty real spectrum that lies in $[m, M]$. Now we suppose that, in addition to being self-adjoint, $A$ is a compact operator. In that case the spectrum consists entirely of eigenvalues, apart perhaps from zero. This is given in

Theorem 10.3.6 If $A \in B(H, H)$ is self-adjoint and compact and if $\mu \in \sigma(A)$ and $\mu \neq 0$, then $\mu$ is an eigenvalue of $A$.
Proof. If $\mu \in \sigma(A)$ then, by definition, $A - \mu I$ does not have a bounded inverse. There is therefore (Ex. 10.3.3) a sequence $\{x_n\}$ such that $\|x_n\| = 1$ and $Ax_n - \mu x_n \to 0$ as $n \to \infty$. Since $A$ is compact it maps $\{x_n\}$ into a precompact set. This means that there is a subsequence $\{x_{n_k}\}$ such that $Ax_{n_k} \to y \in H$. We then have
$$x_{n_k} = \mu^{-1}[Ax_{n_k} - (Ax_{n_k} - \mu x_{n_k})] \to \mu^{-1} y$$
and therefore, since $A$ is continuous,
$$y = \lim_{k \to \infty} Ax_{n_k} = \mu^{-1} A y.$$
Since $\|x_n\| = 1$ and $\mu \neq 0$ we have $\|y\| \neq 0$, so that $y$ is an eigenvector corresponding to $\mu$.

Since we have already proved that $m, M \in \sigma(A)$, we now know that, provided $m, M$ are not zero (and if $A$ is not zero, one of them at least must be non-zero because of (10.3.13)), $m$ and $M$ are eigenvalues of $A$: a non-zero compact self-adjoint operator has at least one real eigenvalue. Having shown that $A$ has at least one eigenvalue, we now prove

Theorem 10.3.7 A non-zero compact self-adjoint operator in a Hilbert space $H$ has a finite or infinite sequence of orthonormal eigenvectors $x_1, x_2, \ldots$ corresponding to non-zero eigenvalues $\mu_1, \mu_2, \ldots$ $(|\mu_1| \ge |\mu_2| \ge \cdots)$.

Proof. By Theorem 10.3.6 there is an eigenvector $x_1$, with $\|x_1\| = 1$, $Ax_1 = \mu_1 x_1$, where
$$\mu_1 = \pm \sup |(Ax, x)|, \quad \|x\| = 1;$$
$\mu_1$ is either $m$ or $M$, and $|\mu_1| = \|A\|$. Rename the Hilbert space $H_1$, the operator as $A_1$, let $M_1$ be the space spanned by $x_1$, and decompose $H_1$ into $H_2$ and $M_1$ as in equation (10.3.4). The space $H_2$ is a Hilbert space. If $x \in H_2$, then $A_1 x \in H_2$, for
$$(A_1 x, x_1) = (x, A_1 x_1) = (x, \mu_1 x_1) = \mu_1(x, x_1) = 0.$$
This means that we may define a new operator $A_2$ in $H_2$, by
$$A_2 x = A_1 x, \quad x \in H_2.$$
This operator is called the restriction of $A_1$ to $H_2$; it is clearly a self-adjoint compact linear operator in the Hilbert space $H_2$. If this operator is not identically zero we may apply Theorem 10.3.6 to it, and find an eigenvector $x_2$ such that
$$A_2 x_2 = \mu_2 x_2, \quad \|x_2\| = 1.$$
Since $x_2 \in H_2$, we have $(x_2, x_1) = 0$ and, for $\|x\| = 1$,
$$|\mu_2| = \sup_{x \in H_2} |(A_1 x, x)| \le \sup_{x \in H_1} |(A_1 x, x)| = |\mu_1|.$$
We now continue this process; we let $M_2$ be the space spanned by $x_2$, decompose $H_2$ into $H_3$ and $M_2$, call $A_3$ the restriction of $A_2$ to $H_3$, and find an eigenvalue $\mu_3$ and eigenvector $x_3$, and so on. Generally
$$|\mu_k| = \sup_{x \in H_k} |(Ax, x)| = |(Ax_k, x_k)|, \quad \|x\| = 1 = \|x_k\|. \tag{10.3.14}$$
Either the process stops after a finite number of steps or it continues indefinitely. In the former case there is an integer $n$ for which the restriction $A_{n+1}$ of $A_1$ to $H_{n+1}$ is identically zero, i.e.,
$$\sup_{x \in H_{n+1}} |(Ax, x)| = 0, \quad \|x\| = 1. \tag{10.3.15}$$
In this case we obtain a finite sequence of orthonormal eigenvectors $x_1, x_2, \ldots, x_n$. The latter case is the subject of

Theorem 10.3.8 Suppose $A \in B(H, H)$ is a self-adjoint compact operator. If $A$ has an infinity of eigenvalues, they are enumerable with zero being the only limit point.

Proof. The procedure described in Theorem 10.3.7 produces a sequence of eigenvalues $\mu_1, \mu_2, \ldots$ such that $|\mu_1| \ge |\mu_2| \ge \cdots$, and a corresponding sequence of orthonormal eigenvectors $x_1, x_2, \ldots$ Consider all those eigenvalues satisfying $|\mu| > c$. If there is an infinite sequence $x_1, x_2, \ldots$ corresponding to such eigenvalues, then
$$\|Ax_m - Ax_n\|^2 = \|\mu_m x_m - \mu_n x_n\|^2 = |\mu_m|^2 + |\mu_n|^2 \ge 2c^2. \tag{10.3.16}$$
But since $A$ is compact, the sequence $\{Ax_n\}$ must have a convergent subsequence; this contradicts (10.3.16). Hence there is at most a finite set of eigenvectors corresponding to eigenvalues satisfying $|\mu| > c$. The eigenvalues may be enumerated by placing their absolute values in the intervals $(1, \infty), (1/2, 1], (1/3, 1/2], \ldots$; there is a finite number in each of this enumerable set of intervals; the eigenvalues can have zero as their only limit point.

Theorem 10.3.9 Let $A \in B(H, H)$ be a compact self-adjoint operator with eigenvalues $\mu_i$ ordered so that $|\mu_1| \ge |\mu_2| \ge \cdots$, and corresponding orthonormal eigenvectors $x_1, x_2, \ldots$ The eigenvectors $\{x_i\}$ are complete in the range of $A$, i.e., for every $f = Ah$, $h \in H$, the Parseval equality
$$\|f\|^2 = \sum_{k=1}^{\infty} |(f, x_k)|^2 \tag{10.3.17}$$
holds.
Proof. First, suppose the process described in Theorem 10.3.7 stops. Take $f = Ah$, and consider
$$g = h - \sum_{k=1}^{n} (h, x_k) x_k. \tag{10.3.18}$$
We have $(g, x_k) = 0$, $k = 1, 2, \ldots, n$, so that $g \in H_{n+1}$ and hence $x = g/\|g\|$ satisfies (10.3.15) so that $\|Ax\| = 0$, i.e., $Ag = 0$. Thus
$$0 = Ag = Ah - \sum_{k=1}^{n} (h, x_k) A x_k = Ah - \sum_{k=1}^{n} (Ah, x_k) x_k = f - \sum_{k=1}^{n} (f, x_k) x_k,$$
so that
$$f = \sum_{k=1}^{n} (f, x_k) x_k.$$
Now consider the case in which the process does not stop. There is an enumerable sequence of eigenvalues $\{\mu_i\}$ with zero as limit point. Choose $\epsilon > 0$ and then choose $N$ so that if $n > N$, then $|\mu_n|^2 < \epsilon$. Take $n > N$. Suppose $f = Ah$ and consider $g$ given by (10.3.18); $g \in H_{n+1}$ so that
$$\frac{\|Ag\|}{\|g\|} \le |\mu_{n+1}|.$$
Thus
$$\|Ag\| \le |\mu_{n+1}|\,\|g\| \le |\mu_{n+1}|\,\|h\|,$$
so that, as before,
$$\|Ag\|^2 = \left\| f - \sum_{k=1}^{n} (f, x_k) x_k \right\|^2 \le |\mu_{n+1}|^2 \|h\|^2$$
or equivalently
$$0 \le \|f\|^2 - \sum_{k=1}^{n} |(f, x_k)|^2 \le |\mu_{n+1}|^2 \cdot \|h\|^2 \le \epsilon \|h\|^2,$$
which implies Parseval’s equality
$$\sum_{k=1}^{\infty} |(f, x_k)|^2 = \|f\|^2.$$
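In finite dimensions every symmetric matrix is a compact self-adjoint operator, so the construction in Theorem 10.3.7 and the Parseval equality (10.3.17) can be tried out directly. The sketch below is only an illustration: the helper names are hypothetical, power iteration stands in for the supremum in (10.3.14), and it assumes that no two eigenvalues share the same absolute value and that the fixed starting vector is not orthogonal to the eigenvector being sought.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(A, x):
    return [dot(row, x) for row in A]

def project_out(x, basis):
    """Remove from x its components along the orthonormal vectors in basis."""
    for e in basis:
        c = dot(x, e)
        x = [xi - c * ei for xi, ei in zip(x, e)]
    return x

def eigen_sequence(A, sweeps=500, tol=1e-12):
    """Eigenpairs (mu_k, x_k) with |mu_1| >= |mu_2| >= ..., found by repeated restriction."""
    n = len(A)
    pairs, basis = [], []
    for _ in range(n):
        x = project_out([float(i + 1) for i in range(n)], basis)
        norm = math.sqrt(dot(x, x))
        if norm < tol:
            break
        x = [xi / norm for xi in x]
        for _ in range(sweeps):
            y = project_out(matvec(A, x), basis)  # stay in the complement H_k
            norm = math.sqrt(dot(y, y))
            if norm < tol:                        # the restriction of A is zero
                break
            x = [yi / norm for yi in y]
        mu = dot(matvec(A, x), x)
        if abs(mu) < tol:                         # the process stops, as in (10.3.15)
            break
        pairs.append((mu, x))
        basis.append(x)
    return pairs

if __name__ == "__main__":
    A = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 3.0]]  # eigenvalues 3, 2, 0
    pairs = eigen_sequence(A)
    f = matvec(A, [1.0, -2.0, 0.5])                           # f = A h lies in the range of A
    print([mu for mu, _ in pairs])
    print(dot(f, f), sum(dot(f, x) ** 2 for _, x in pairs))
```

The example matrix is singular, so the process stops after two steps; the two printed sums agree because $f$ is in the range of $A$, whereas a general vector $h$ need not satisfy the equality.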
We now obtain another result by making a further assumption concerning $A$; thus we introduce

Definition 10.3.7 A self-adjoint continuous linear operator $A$ in a Hilbert space $H$ is called strictly positive if $(Ax, x) \ge 0$ for all $x \in H$ and $(Ax, x) = 0$ iff $x = 0$.

For a strictly positive, compact, self-adjoint operator in a Hilbert space the process described in Theorem 10.3.7 can stop only if $H$ itself is finite dimensional. This leads to

Theorem 10.3.10 Let $A$ be a strictly positive compact self-adjoint operator in an infinite dimensional Hilbert space $H$. There is an orthonormal system $\{x_n\}$ which is a basis for $H$, and $A$ has the representation
$$Ax = \sum_{k=1}^{\infty} \mu_k (x, x_k) x_k.$$
Proof. Let $y \in H$ and consider
$$y_{n+1} = y - \sum_{k=1}^{n} (y, x_k) x_k,$$
where $\{x_k\}$ is the orthonormal sequence of eigenvectors, as in Theorem 10.3.9. It is easy to show that $\{y_n\}$ is a Cauchy sequence. We wish to prove that its limit is zero. Assume that it is not, i.e., $y_n \to z \neq 0$. Since $y_{n+1} \in H_{n+1}$ we have
$$\frac{(Ay_{n+1}, y_{n+1})}{\|y_{n+1}\|^2} \le \mu_{n+1}.$$
But $\mu_n \to 0$ as $n \to \infty$ so that passage to the limit gives
$$\frac{(Az, z)}{\|z\|^2} = 0,$$
which is a contradiction since $A$ is strictly positive. Therefore, $z = 0$ and
$$y = \sum_{k=1}^{\infty} (y, x_k) x_k, \quad y \in H,$$
so that $\{x_k\}$ forms a basis for $H$, and moreover
$$Ay = \sum_{k=1}^{\infty} (y, x_k) A x_k = \sum_{k=1}^{\infty} \mu_k (y, x_k) x_k.$$
This theorem shows that one can have a strictly positive compact self-adjoint operator only in a separable Hilbert space.

Corollary 10.3.1 Under the condition of Theorem 10.3.10 we can introduce a norm
$$\|x\|_A = (Ax, x)^{1/2}$$
and a corresponding inner product
$$(x, y)_A = (Ax, y).$$
The completion of $H$ with respect to this norm is called $H_A$.

Exercises 10.3

1. Show that eigenvectors $x$ and $y$, corresponding to two different eigenvalues of a self-adjoint operator $A$, are orthogonal, i.e., $(x, y) = 0$.

2. Show that the operator $A^{-1}$ is bounded on $R(A)$ iff there is a constant $c > 0$, such that, if $x \in D(A)$, then $\|Ax\| \ge c\|x\|$.

3. Use Ex. 10.3.2 to show that $A^{-1}$ is unbounded iff there is a sequence $\{x_n\}$ such that $\|x_n\| = 1$, $\|Ax_n\| \to 0$.

4. Show that a compact self-adjoint operator is strictly positive iff its eigenvalues are positive.
10.4 The Green’s function integral equation
We must now exhibit the integral operator
$$Au = \int_0^1 K(x, s) u(s)\,ds \tag{10.4.1}$$
as a strictly positive, self-adjoint, compact operator in a separable Hilbert space. In order to make this identification we need some results about functions. We start with the space of continuous functions on the closed interval $[0, 1]$. We call this $C[0, 1]$. The fundamental result about a function $f(x) \in C[0, 1]$ is that $f(x)$ is bounded on $[0, 1]$, and actually attains its upper bound. We may thus form a normed linear space from $C[0, 1]$ by using the norm
$$\|f\|_\infty = \sup_{x \in [0, 1]} |f(x)|. \tag{10.4.2}$$
Convergence of a sequence of functions $\{f_n(x)\}$ in the norm (10.4.2) is uniform convergence. Weierstrass’ Theorem on uniform convergence states that a uniformly Cauchy sequence $\{f_n(x)\}$, i.e., a Cauchy sequence in the norm (10.4.2), of uniformly continuous functions on $[0, 1]$ converges to a uniformly continuous function. This translates into the statement that $C[0, 1]$ under the norm (10.4.2) is complete.

We may introduce another norm on $C[0, 1]$:
$$\|f\|_2 = \left\{ \int_0^1 (f(x))^2\,dx \right\}^{1/2}. \tag{10.4.3}$$
The example in Ex. 10.4.1 shows that $C[0, 1]$ is not complete under this norm. However, we may use the completion theorem, and complete this space. We may make the space an inner-product space by using the inner product
$$(f, g) = \int_0^1 f(x) g(x)\,dx. \tag{10.4.4}$$
We call this complete inner-product space, i.e., Hilbert space, $L^2(0, 1)$. Here $L$ stands for Lebesgue. Remember that while the elements of $C[0, 1]$ are uniformly continuous functions, the elements of $L^2(0, 1)$ are equivalence classes of Cauchy sequences of uniformly continuous functions. The space $L^2(0, 1)$ is known to be separable (Th. 4.1.4).

Now we start to examine the operator $A$ from $L^2(0, 1)$ to $L^2(0, 1)$, defined by
$$Au = \int_0^1 K(x, s) u(s)\,ds$$
where
$$K(x, s) = \rho(x)\rho(s) G(x, s) \tag{10.4.5}$$
and $G(x, s)$ is given in (10.2.6). The operator $A$ is self-adjoint in $L^2(0, 1)$ because $K(x, s)$ is symmetric.

Now we examine the continuity of the operator. Suppose first that $\rho(x) \in C[0, 1]$; then $K(x, s) \in C([0, 1] \times [0, 1])$ so that $K(x, s)$ is bounded on the square, i.e., $K(x, s) \le M$ and
$$\|Au\|_\infty = \sup_{x \in [0, 1]} |Au| \le M \sup_{x \in [0, 1]} |u| = M\|u\|_\infty,$$
so that $\|A\| \le M$: $A$ is continuous. Now examine continuity in $L^2(0, 1)$. We have
$$\|Au\|^2 = \int_0^1 \left\{ \int_0^1 K(x, s) u(s)\,ds \right\}^2 dx.$$
Again, if $K(x, s) \in C([0, 1] \times [0, 1])$ then $K(x, s) \le M$ and
$$\|Au\|^2 \le M^2 \int_0^1 (u(s))^2\,ds = M^2 \|u\|^2$$
so that $A$ is continuous. Now suppose that $\rho(x) \in L^2(0, 1)$. Since $K(x, s) = \rho(x)\rho(s)G(x, s)$, and $G(x, s) \in C([0, 1] \times [0, 1])$ we have $|G(x, s)| \le M$ and
$$|K(x, s)| \le \rho(x)\rho(s) M.$$
Thus
$$\|Au\|^2 \le M^2 \int_0^1 \rho^2(x) \left\{ \int_0^1 \rho(s) u(s)\,ds \right\}^2 dx.$$
The Schwarz inequality (10.3.2) gives
$$\left\{ \int_0^1 \rho(s) u(s)\,ds \right\}^2 \le \int_0^1 \rho^2(s)\,ds \int_0^1 u^2(s)\,ds,$$
so that
$$\|Au\|^2 \le M^2 \|\rho\|^4 \|u\|^2.$$
Thus
$$\|A\| \le M \|\rho\|^2,$$
and $A$ is continuous.

In order to prove that $A$ is compact we note that if a function $f(x, s)$ is continuous on the unit square, i.e., $f \in C([0, 1] \times [0, 1])$, then it may be approximated uniformly by a finite sum of the form
$$\sum_{i=1}^{n} \alpha_i(x) \beta_i(s).$$
The Green’s function $G(x, s)$ is continuous on the unit square, and is symmetric in $x$ and $s$. Thus there are functions $\{\theta_i(x)\}_1^\infty$ such that, given $\varepsilon > 0$, we can find $N$ so that if $n > N$ then
$$\sup \left| G(x, s) - \sum_{i=1}^{n} \theta_i(x)\theta_i(s) \right| \le \varepsilon,$$
for $(x, s) \in ([0, 1] \times [0, 1])$. This means that if
$$K_n(x, s) = \rho(x)\rho(s) \sum_{i=1}^{n} \theta_i(x)\theta_i(s),$$
and
$$A_n u = \int_0^1 K_n(x, s) u(s)\,ds,$$
then $A_n$ is a finite-dimensional operator, and thus compact. If $\rho \in L^2(0, 1)$ then $A$ is the limit of a sequence of compact linear operators $\{A_n\}$, and is thus compact by Theorem 10.3.2.

Reader, congratulations if you have read and followed thus far. We have tried to provide a sign-posted journey; clearly, we have not proved every step, but we had no intention of doing that. We could have taken a short cut by merely stating that ‘it can be shown that $A$ is compact’, but we hope that the route we have taken has been more pleasant and instructive.

What can we conclude from our study? If $\rho(x) \in L^2(0, 1)$, the integral equation has a finite or enumerable sequence of positive eigenvalues $\mu_1, \mu_2, \ldots$ satisfying $|\mu_1| \ge |\mu_2| \ge \cdots$, and a corresponding set of eigenfunctions $\{u_i\}_0^\infty$ which are orthonormal under the $L^2(0, 1)$ norm. However, this result is not as satisfying as we would like, because the eigenfunctions, being in $L^2(0, 1)$, are not functions in the ordinary sense, but equivalence classes of Cauchy sequences of functions in $C[0, 1]$. Can we say anything more about them?

First, we note that if $u$ satisfies (10.2.11) then $v$ satisfies (10.2.9) where, remember, we now switch $\mu \to 1/\lambda$. Thus, the eigenvalues $\lambda_i$ of (10.2.9) satisfy
$$0 < |\lambda_1| \le |\lambda_2| \le \cdots$$
Actually, we proved earlier that the $\lambda_i$ are distinct and positive, i.e., they satisfy (10.1.18):
$$0 < \lambda_1 < \lambda_2 < \cdots$$
We have not yet shown that there is an infinity of eigenvalues, nor have we shown, in the Green’s function analysis, that they are distinct; we will eventually do this.
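The uniform approximation of $G(x, s)$ by finite sums of products, which underlies the compactness argument above, can be seen concretely for the string with fixed ends. The sketch below uses the standard Fourier sine expansion of the kernel (10.2.3), with $\theta_k(x) = \sqrt{2}\sin(k\pi x)/(k\pi)$; this particular choice of functions is an assumption made for illustration and is not taken from the text, and the function names are hypothetical.

```python
import math

def green_string(x, s):
    """G(x, s) of (10.2.3) for the string with both ends fixed."""
    return x * (1.0 - s) if x <= s else s * (1.0 - x)

def finite_rank(x, s, terms):
    """Partial sum of theta_k(x) * theta_k(s), theta_k(x) = sqrt(2) sin(k pi x) / (k pi)."""
    total = 0.0
    for k in range(1, terms + 1):
        total += 2.0 * math.sin(k * math.pi * x) * math.sin(k * math.pi * s) / (k * math.pi) ** 2
    return total

def sup_error(terms, grid=40):
    """Largest deviation between G and its finite-rank approximation on a uniform grid."""
    pts = [i / grid for i in range(grid + 1)]
    return max(abs(green_string(x, s) - finite_rank(x, s, terms)) for x in pts for s in pts)

if __name__ == "__main__":
    for n in (5, 20, 50):
        print(n, sup_error(n))
```

The printed errors shrink roughly like $1/n$, consistent with the tail bound $\sum_{k > n} 2/(k\pi)^2 < 2/(\pi^2 n)$.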
We may write (10.2.9) as
$$v(x) = \lambda \int_0^1 \rho(s) G(x, s) u(s)\,ds. \tag{10.4.6}$$
If $\rho \in L^2(0, 1)$ and $u \in L^2(0, 1)$ then the integrand in (10.4.6) is integrable in $s$ and uniformly continuous in $x$, so that the left hand side, $v(x)$, is continuous: $v(x) \in C[0, 1]$, and we may properly speak of an eigenfunction. If $\rho \in C[0, 1]$ then $v(x)$ actually has a continuous second derivative, and satisfies equation (10.1.1), for on using the form of $G(x, s)$ given in (10.2.6) we see that
$$v(x) = \lambda\psi(x) \int_0^x \rho^2(s)\phi(s)v(s)\,ds + \lambda\phi(x) \int_x^1 \rho^2(s)\psi(s)v(s)\,ds \tag{10.4.7}$$
so that
$$v(0) = \lambda\phi(0) \int_0^1 \rho^2(s)\psi(s)v(s)\,ds, \quad v(1) = \lambda\psi(1) \int_0^1 \rho^2(s)\phi(s)v(s)\,ds.$$
Now, differentiating (10.4.7), which we can do because all the integrands are continuous, we find
$$v'(x) = \lambda\psi'(x) \int_0^x \rho^2(s)\phi(s)v(s)\,ds + \lambda\psi(x)\rho^2(x)\phi(x)v(x) + \lambda\phi'(x) \int_x^1 \rho^2(s)\psi(s)v(s)\,ds - \lambda\phi(x)\rho^2(x)\psi(x)v(x).$$
Thus
$$v'(0) = \lambda\phi'(0) \int_0^1 \rho^2(s)\psi(s)v(s)\,ds = h v(0), \quad v'(1) = \lambda\psi'(1) \int_0^1 \rho^2(s)\phi(s)v(s)\,ds = -H v(1).$$
Thus $v(x)$ satisfies the stated end conditions. On differentiating a second time, using $\phi''(x) = 0 = \psi''(x)$, we find
$$v''(x) = \lambda(\phi(x)\psi'(x) - \phi'(x)\psi(x))\rho^2(x)v(x)$$
and on account of (10.2.5), this is
$$v''(x) + \lambda\rho^2(x)v(x) = 0.$$

Exercises 10.4

1. Consider the sequence $\{f_n(x)\}$ in $C[0, 1]$:
$$f_n(x) = \begin{cases} x^{-1/4}, & 1/n \le x \le 1, \\ n^{1/4}, & 0 \le x \le 1/n. \end{cases}$$
Show that $\{f_n(x)\}$ is a Cauchy sequence under the $L^2$ norm (10.4.3), but $\{f_n(x)\}$ converges to
$$f(x) = x^{-1/4}$$
which is not in $C[0, 1]$. Hence $C[0, 1]$ is not complete under the $L^2$ norm.
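The behaviour asked for in Ex. 10.4.1 can be observed numerically before it is proved. The following sketch is an illustration only: the function names are hypothetical and the $L^2$ distance is approximated by a midpoint rule. It compares the distance between $f_n$ and $f_m$ in the norm (10.4.3) with their distance in the sup norm (10.4.2).

```python
import math

def f(n, x):
    """f_n of Ex. 10.4.1: x**(-1/4) on [1/n, 1], held at n**(1/4) on [0, 1/n]."""
    return n ** 0.25 if x <= 1.0 / n else x ** -0.25

def l2_distance(n, m, cells=100000):
    """Midpoint-rule approximation of the norm (10.4.3) applied to f_n - f_m."""
    dx = 1.0 / cells
    total = 0.0
    for j in range(cells):
        x = (j + 0.5) * dx
        total += (f(n, x) - f(m, x)) ** 2
    return math.sqrt(total * dx)

def sup_distance(n, m):
    """The norm (10.4.2) of f_n - f_m, attained on [0, 1/max(n, m)]."""
    return abs(n ** 0.25 - m ** 0.25)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, l2_distance(n, 4 * n), sup_distance(n, 4 * n))
```

The $L^2$ distances shrink as $n$ grows while the sup-norm distances grow, which is the numerical face of the statement that the sequence is Cauchy in one norm but not in the other.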
10.5 Oscillatory properties of Green’s functions
In Section 10.4 we showed that when $h \ge 0$, $H \ge 0$, $h + H > 0$, the integral equation (10.2.9) has eigenvalues $\lambda_i$ satisfying $0 < |\lambda_1| \le |\lambda_2| \le \cdots$; if there are an infinity of them, then
\[ 0 < |\lambda_1| \le |\lambda_2| \le \cdots \to \infty. \]
On the other hand, in Section 10.1, we showed that the eigenvalues of the (equivalent) equation (10.1.1) are positive and distinct, i.e.,
\[ 0 < \lambda_1 < \lambda_2 < \cdots. \]
This means that the Green’s function $G(x,s)$ must have some special properties which lead to the eigenvalues being distinct; we now discuss these properties.

We start by defining the interval $I$, as follows:
\[ I = \begin{cases} [0,1] & \text{if } h, H \text{ are finite}, \\ (0,1] & \text{if } h = \infty,\ H \text{ is finite}, \\ [0,1) & \text{if } h \text{ is finite},\ H = \infty, \\ (0,1) & \text{if } h = \infty = H. \end{cases} \]
Note that when $h = \infty$, the end condition $u'(0) - hu(0) = 0$ becomes $u(0) = 0$, i.e., the end $x = 0$ is fixed. This means that $I$ is the set of movable points in $[0,1]$. Equations (10.2.6), (10.2.7) show that
\[ G(x,s) \ge 0 \ \text{for } x, s \in [0,1], \qquad G(x,s) > 0 \ \text{for } x, s \in I. \]
We now introduce the concept of an oscillatory kernel.

Definition 10.5.1 If $0 < x_1 < x_2 < \cdots < x_n < 1$, and $\mathbf{x} = [x_1, x_2, \ldots, x_n]$, then we say $\mathbf{x} \in \mathcal{Q}$. If $x_1, x_n \in I$ then we say $\mathbf{x} \in \mathcal{I}$. A kernel $K(x,s)$ on $[0,1] \times [0,1]$ is said to be oscillatory if

i) $K(x,s) > 0$ for $x, s \in I$

ii) $K(\mathbf{x};\mathbf{s}) \ge 0$ for $\mathbf{x}, \mathbf{s} \in \mathcal{Q}$

iii) $K(\mathbf{x};\mathbf{x}) > 0$ for $\mathbf{x} \in \mathcal{Q}$

Here
\[ K(\mathbf{x};\mathbf{s}) = \begin{vmatrix} K(x_1,s_1) & K(x_1,s_2) & \cdots & K(x_1,s_n) \\ K(x_2,s_1) & K(x_2,s_2) & \cdots & K(x_2,s_n) \\ \cdot & \cdot & \cdots & \cdot \\ K(x_n,s_1) & K(x_n,s_2) & \cdots & K(x_n,s_n) \end{vmatrix} \]
and take note of Ex. 10.5.1 which shows that iii) must necessarily hold for $\mathbf{x} \in \mathcal{I}$.

Theorem 10.5.1 A kernel $K(x,s)$ is oscillatory iff the matrix $\mathbf{A} = (a_{ij}) = (K(x_i,x_j))$ is an oscillatory matrix for any $\mathbf{x} \in \mathcal{I}$.
Proof. Suppose the kernel is oscillatory then, in the notation of Section 6.2, if $\alpha = (i_1, i_2, \ldots, i_p)$, $\beta = (j_1, j_2, \ldots, j_p)$ then
\[ A(\alpha;\beta) = K(\mathbf{x}';\mathbf{s}') \ge 0 \]
where $x'_k = x_{i_k}$, $s'_k = x_{j_k}$, $k = 1, 2, \ldots, p$. Thus $\mathbf{A}$ is TN. Now
\[ a_{i,i+1} = K(x_i,x_{i+1}) > 0, \qquad a_{i+1,i} = K(x_{i+1},x_i) > 0 \]
while
\[ \det(\mathbf{A}) = K(\mathbf{x};\mathbf{x}) > 0. \]
Thus $\mathbf{A}$ satisfies the three conditions for it to be oscillatory: it is TN, the terms next to the principal diagonal are positive, and it is non-singular. We may reverse this argument to show that if $\mathbf{A}$ is oscillatory then $K(x,s)$ is an oscillatory kernel.

Note that in addition to being oscillatory, $\mathbf{A}$ is a strictly positive matrix for $\mathbf{x}, \mathbf{s} \in \mathcal{I}$. We now show that the Green’s function $G(x,s)$ defined in (10.2.6), (10.2.7) is an oscillatory kernel. To do so we recall the definition of a Green’s matrix.

Definition 10.5.2 The matrix $\mathbf{G} = (g_{ij})$ is called a Green’s matrix if
\[ g_{ij} = \begin{cases} a_i b_j, & i \le j, \\ a_j b_i, & i \ge j, \end{cases} \]
where $(a_i)_1^n, (b_i)_1^n \in \mathbb{R}$. Note that $\mathbf{G}$ is symmetric.

Theorem 10.5.2 If $\alpha = (i_1, i_2, \ldots, i_p)$, $\beta = (j_1, j_2, \ldots, j_p)$ then
\[ G(\alpha;\beta) = a_{k_1} \prod_{r=2}^{p} \begin{vmatrix} a_{k_r} & a_{l_{r-1}} \\ b_{k_r} & b_{l_{r-1}} \end{vmatrix} \, b_{l_p} \tag{10.5.1} \]
where $k_m = \min(i_m, j_m)$, $l_m = \max(i_m, j_m)$, provided that $i_m, j_m < i_{m+1}, j_{m+1}$. Recall that this means that
\[ i_m < i_{m+1}, \quad i_m < j_{m+1}, \quad j_m < i_{m+1}, \quad j_m < j_{m+1}. \]
Proof. If $i_1 < i_2$ but $j_1 \ge i_2$, then the first two rows of the minor are
\[ \begin{matrix} g_{i_1,j_1} & g_{i_1,j_2} & \cdots & g_{i_1,j_p} \\ g_{i_2,j_1} & g_{i_2,j_2} & \cdots & g_{i_2,j_p} \end{matrix} \]
but these are
\[ \begin{matrix} a_{i_1}b_{j_1}, & a_{i_1}b_{j_2}, & \cdots & a_{i_1}b_{j_p} \\ a_{i_2}b_{j_1}, & a_{i_2}b_{j_2}, & \cdots & a_{i_2}b_{j_p} \end{matrix} \]
and are thus proportional, so that the minor is zero. Similarly, if $j_1 < j_2 \le i_1$, the first two columns will be proportional and the minor zero. Thus, we may assume $\max(i_1, j_1) < \min(i_2, j_2)$. Suppose further, for definiteness, that $i_2 \le j_2$ (otherwise the argument proceeds with the first two columns), then the first two rows are
\[ \begin{matrix} a_{k_1}b_{l_1}, & a_{i_1}b_{j_2}, & \cdots & a_{i_1}b_{j_p} \\ a_{j_1}b_{i_2}, & a_{i_2}b_{j_2}, & \cdots & a_{i_2}b_{j_p} \end{matrix} \]
so that the terms in columns $2, 3, \ldots, p$ are proportional. Multiplying row 2 by $a_{i_1}/a_{i_2}$ and subtracting it from the first, we find the only non-zero term, the first in the first row, to be
\[ a_{k_1}b_{l_1} - a_{i_1}a_{j_1}b_{i_2}/a_{i_2} = a_{k_1}b_{l_1} - a_{k_1}a_{l_1}b_{k_2}/a_{k_2} = \frac{a_{k_1}}{a_{k_2}} \begin{vmatrix} a_{k_2} & a_{l_1} \\ b_{k_2} & b_{l_1} \end{vmatrix} \]
so that
\[ G(\alpha;\beta) = a_{k_1} \begin{vmatrix} a_{k_2} & a_{l_1} \\ b_{k_2} & b_{l_1} \end{vmatrix} \cdot \frac{1}{a_{k_2}}\, G(\alpha \backslash i_1; \beta \backslash j_1) \]
from which the theorem follows by induction.
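The factorisation (10.5.1) can be checked numerically on a small example. The following is a minimal sketch, not part of the original text: the helper names (`det`, `green_matrix`, `minor_by_formula`, `formula_matches_all_minors`) and the integer data `a`, `b` are illustrative assumptions, indices are 0-based, and the minor is taken to be zero whenever the condition $i_m, j_m < i_{m+1}, j_{m+1}$ fails, as in the proof above.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small minors)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        sub = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(sub)
    return total

def green_matrix(a, b):
    """g_ij = a_i b_j for i <= j and a_j b_i for i >= j."""
    n = len(a)
    return [[a[min(i, j)] * b[max(i, j)] for j in range(n)] for i in range(n)]

def minor_by_formula(a, b, rows, cols):
    """Right-hand side of (10.5.1); zero when i_m, j_m < i_{m+1}, j_{m+1} fails."""
    k = [min(i, j) for i, j in zip(rows, cols)]
    l = [max(i, j) for i, j in zip(rows, cols)]
    if any(l[r - 1] >= k[r] for r in range(1, len(k))):
        return 0
    value = a[k[0]] * b[l[-1]]
    for r in range(1, len(k)):
        value *= a[k[r]] * b[l[r - 1]] - a[l[r - 1]] * b[k[r]]
    return value

def formula_matches_all_minors(a, b):
    """Compare (10.5.1) with a direct determinant for every minor of every order."""
    g = green_matrix(a, b)
    n = len(a)
    for p in range(1, n + 1):
        for rows in combinations(range(n), p):
            for cols in combinations(range(n), p):
                direct = det([[g[i][j] for j in cols] for i in rows])
                if direct != minor_by_formula(a, b, rows, cols):
                    return False
    return True

a = [1, 2, 3, 5]  # illustrative integer data, so the arithmetic is exact
b = [7, 5, 3, 2]
print(formula_matches_all_minors(a, b))
```

Integer entries are used so that the comparison is exact; with floating-point data a tolerance would be needed.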
Theorem 10.5.3 The Green’s matrix $\mathbf{G}$ is TN iff all $(a_i)_1^n, (b_i)_1^n$ have the same strict sign and
\[ \frac{a_1}{b_1} \le \frac{a_2}{b_2} \le \cdots \le \frac{a_n}{b_n}. \tag{10.5.2} \]
Moreover, $\mathbf{G}$ will be oscillatory iff $(a_i)_1^n, (b_i)_1^n$ have the same strict sign and
\[ \frac{a_1}{b_1} < \frac{a_2}{b_2} < \cdots < \frac{a_n}{b_n}. \tag{10.5.3} \]

Proof. There is no loss in generality in assuming that all $(a_i)_1^n, (b_i)_1^n$ are positive. It was shown in Theorem 10.5.2 that a minor is zero unless
\[ i_1, j_1 < i_2, j_2 < \cdots < i_p, j_p. \]
Each of the second order determinants in (10.5.1) is non-negative iff
\[ \frac{a_{l_{i-1}}}{b_{l_{i-1}}} \le \frac{a_{k_i}}{b_{k_i}}, \qquad i = 2, \ldots, p. \]
This is exactly the condition (10.5.2). $\mathbf{G}$ is TN and $g_{i,i+1} > 0$, $g_{i+1,i} > 0$, so that the only condition to be fulfilled for $\mathbf{G}$ to be oscillatory is that it must be non-singular. Thus each second order determinant in the factorisation of $G(\alpha;\beta)$ must be positive, which is (10.5.3).

Corollary 10.5.1 Let $\phi(x), \psi(x)$ be continuous in $[0,1]$ and
\[ K(x,s) = \begin{cases} \phi(x)\psi(s), & 0 \le x \le s \le 1, \\ \phi(s)\psi(x), & 0 \le s \le x \le 1. \end{cases} \]
If $\phi(x)\psi(x) > 0$ in $(0,1)$ and $\phi(x)/\psi(x)$ is an increasing function of $x$ in $(0,1)$ then
\[ K(\mathbf{x};\mathbf{s}) \ge 0 \ \text{for } \mathbf{x}, \mathbf{s} \in \mathcal{Q}. \]
If $\phi(x)\psi(x) > 0$ in $I$, and $\phi(x)/\psi(x)$ is a strictly increasing function of $x$ in $I$, then $K(\mathbf{x};\mathbf{s}) > 0$ iff $\mathbf{x}, \mathbf{s} \in \mathcal{I}$ and
\[ x_1, s_1 < x_2, s_2 < \cdots < x_n, s_n. \]

Theorem 10.5.4 The Green’s function $G(x,s)$ given by (10.2.5), (10.2.6) is oscillatory and a minor $G(\mathbf{x};\mathbf{s}) > 0$ iff $\mathbf{x}, \mathbf{s} \in \mathcal{I}$ and $x_1, s_1 < x_2, s_2 < \cdots < x_n, s_n$.

Proof. Equation (10.2.7) shows that $\phi(x)\psi(x) > 0$ in $I$. Equation (10.2.5) yields
\[ \frac{d}{dx}\left[\frac{\phi(x)}{\psi(x)}\right] = \frac{\phi'(x)\psi(x) - \phi(x)\psi'(x)}{[\psi(x)]^2} = \frac{1}{[\psi(x)]^2} > 0 \ \text{in } I \]
so that $\phi(x)/\psi(x)$ is strictly increasing in $I$, and thus the result follows from Corollary 10.5.1.

In order to ascertain the meaning of the oscillatory character of the Green’s function, consider a string under the action of $n$ concentrated forces $(F_i)_1^n$ applied normal to the string at $n$ points $(s_i)_1^n$ in $I$. The displacement is
\[ u(x) = \sum_{i=1}^{n} G(x,s_i)F_i. \]
Thus $G(x,s) > 0$ (condition i) of Definition 10.5.1) means that the displacement due to a single force $F$ occurs ‘in the same direction’ as the force. To see the meaning of condition iii) of Definition 10.5.1 we note that the strain energy of the string under the action of the $n$ forces is
\[ U = \frac{1}{2}\sum_{i=1}^{n} u(s_i)F_i = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} G(s_i,s_j)F_iF_j, \]
so that condition iii) states that $U$ is positive definite (for forces applied at movable points, i.e., in $I$). The essential nature of an oscillatory kernel is evidenced in

Theorem 10.5.5 Under the action of $n$ forces $(F_i)_1^n$ the displacement $u(x)$ of the string can change its sign no more than $n - 1$ times.

Proof. Suppose that forces $(F_i)_1^n$ are applied at points $(s_i)_1^n$ where $\mathbf{s} \in \mathcal{I}$. If $s_1 > 0$, then
\[ u(x) = \phi(x)\sum_{i=1}^{n} F_i\psi(s_i), \qquad 0 \le x \le s_1, \]
so that $u(x)$ is of one sign in $[0, s_1]$. If
\[ \sum_{i=1}^{n} F_i\psi(s_i) = 0 \]
then $u(x)$ is identically zero in $[0, s_1]$. Otherwise, it is of one sign, and can be zero only at $x = 0$, and that only if the string is fixed at $x = 0$, i.e., $h = \infty$. In the interval $[s_j, s_{j+1}]$, $j = 1, 2, \ldots, n - 1$,
\[ u(x) = \psi(x)\sum_{i=1}^{j} F_i\phi(s_i) + \phi(x)\sum_{i=j+1}^{n} F_i\psi(s_i). \]
Since $\phi(x), \psi(x)$ are linearly independent, the displacement $u(x)$ is identically zero in $[s_j, s_{j+1}]$ iff
\[ \sum_{i=1}^{j} F_i\phi(s_i) = 0 = \sum_{i=j+1}^{n} F_i\psi(s_i). \]
If this is not the case then $u(x)$ can have at most one zero in $[s_j, s_{j+1}]$. For if there were two, say $\xi, \eta$ such that $s_j \le \xi < \eta \le s_{j+1}$, then $\psi(\xi)\phi(\eta) - \psi(\eta)\phi(\xi) = 0$, contradicting the fact that $\phi(x)/\psi(x)$ is a strictly increasing function. Finally, if $s_n \le x \le 1$ then
\[ u(x) = \psi(x)\sum_{i=1}^{n} F_i\phi(s_i) \]
so that again $u(x)$ has one sign. It is identically zero if
\[ \sum_{i=1}^{n} F_i\phi(s_i) = 0, \]
otherwise it can be zero only at $x = 1$, and that only if $H = \infty$. We conclude that $u(x)$ can change sign at most $n - 1$ times, at most once in each of
\[ (s_1, s_2],\ [s_2, s_3],\ \ldots,\ [s_{n-1}, s_n). \]
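Theorem 10.5.5 can be illustrated numerically. The sketch below is an illustration only and makes several assumptions that are not in the original text: both $h$ and $H$ are finite, and the Green’s function is taken in the closed form $G(x,s) = (1+hx)(1+H(1-s))/(h+H+hH)$ for $x \le s$, which is what $\phi \propto 1+hx$, $\psi \propto 1+H(1-x)$ give when normalised so that $\phi'\psi - \phi\psi' = 1$. The function names, the sample values $h = 1$, $H = 2$ and the random forces are all hypothetical.

```python
import random

def green(x, s, h, H):
    """Assumed closed form of G(x, s) for finite end constants h, H with h + H > 0."""
    lo, hi = min(x, s), max(x, s)
    return (1 + h * lo) * (1 + H * (1 - hi)) / (h + H + h * H)

def displacement(x, points, forces, h, H):
    """u(x) = sum_i G(x, s_i) F_i."""
    return sum(F * green(x, s, h, H) for s, F in zip(points, forces))

def sign_changes(values, tol=1e-12):
    """Number of sign changes in a sequence, ignoring entries of negligible size."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for p, q in zip(signs, signs[1:]) if p != q)

def max_sign_changes(n, trials=200, grid=400, h=1.0, H=2.0, seed=0):
    """Largest number of sign changes of u(x) seen over random force systems."""
    rng = random.Random(seed)
    xs = [i / grid for i in range(grid + 1)]
    worst = 0
    for _ in range(trials):
        points = sorted(rng.uniform(0.05, 0.95) for _ in range(n))
        forces = [rng.uniform(-1, 1) for _ in range(n)]
        u = [displacement(x, points, forces, h, H) for x in xs]
        worst = max(worst, sign_changes(u))
    return worst

for n in (1, 2, 3, 4):
    print(n, max_sign_changes(n))  # the theorem predicts a count of at most n - 1
```

Sampling on a grid can only miss sign changes, never create them, so the observed counts are lower bounds on the true counts; the check is a sanity test of the bound, not a proof.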
Exercises 10.5

1. Continuity of the minor in ii) of Definition 10.5.1 shows that it will be non-negative for $(x_i)_1^n$, $(s_i)_1^n$ satisfying $0 \le x_1 < x_2 < \cdots < x_n \le 1$ and $0 \le s_1 < s_2 < \cdots < s_n \le 1$. Use Theorem 6.6.5 to show that iii) necessarily holds for $\mathbf{x} \in \mathcal{I}$.
10.6 Oscillatory systems of functions

In this section we shall derive some basic results that are needed to establish further properties of the eigensolutions. Let $(\phi_i(x))_1^n$ be a sequence of functions defined on an interval $I$ ($[0,1]$, $(0,1]$, $[0,1)$, or $(0,1)$).
Theorem 10.6.1 The necessary and sufficient condition for the functions $(\phi_i(x))_1^n$ to be linearly dependent is that
\[ \Phi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n) \equiv \begin{vmatrix} \phi_1(x_1) & \phi_1(x_2) & \cdots & \phi_1(x_n) \\ \phi_2(x_1) & \phi_2(x_2) & \cdots & \phi_2(x_n) \\ \cdot & \cdot & \cdots & \cdot \\ \phi_n(x_1) & \phi_n(x_2) & \cdots & \phi_n(x_n) \end{vmatrix} \]
be zero for any $(x_r)_1^n \in I$.

Proof. The condition is necessary. For if the functions $(\phi_i(x))_1^n$ are linearly dependent then there are constants $(c_i)_1^n$, not all zero, such that
\[ \sum_{i=1}^{n} c_i\phi_i(x) = 0 \ \text{for } x \in I. \]
This means that for any $(x_r)_1^n \in I$ we have
\[ \sum_{i=1}^{n} c_i\phi_i(x_r) = 0, \qquad r = 1, 2, \ldots, n. \tag{10.6.1} \]
Since the $(c_i)_1^n$ are not all zero, the determinant of coefficients in (10.6.1) must be zero.

We prove sufficiency by induction. If $n = 1$, then $\Phi = 0$ states that $\phi_1(x_1) = 0$ for any $x_1 \in I$, i.e., $\phi_1(x) \equiv 0$ for $x \in I$. Suppose therefore that
\[ \Phi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n) = 0 \ \text{for all } (x_i)_1^n \in I. \]
We need to prove that the $(\phi_i(x))_1^n$ are linearly dependent. Assume that $(\phi_i(x))_1^{n-1}$ are linearly independent (for if they were dependent then so would the $(\phi_i(x))_1^n$ be), then there are $(x_r)_1^{n-1} \in I$ such that
\[ \Phi(x_1, x_2, \ldots, x_{n-1}; 1, 2, \ldots, n-1) \ne 0. \tag{10.6.2} \]
But then, for all $x \in I$
\[ \Phi(x_1, x_2, \ldots, x_{n-1}, x; 1, 2, \ldots, n) = 0. \]
Expand this determinant along its last column; the result has the form (10.6.1) in which $c_n$, being the determinant (10.6.2), is not zero.

Definition 10.6.1 A sequence of continuous functions $(\phi_i(x))_1^n$ is said to constitute a Chebyshev sequence on $I$ if, for any set of real constants $(c_i)_1^n$, not all zero, the function
\[ \phi(x) = \sum_{i=1}^{n} c_i\phi_i(x) \]
does not vanish more than $n - 1$ times on $I$.
Theorem 10.6.2 The sequence $(\phi_i(x))_1^n$ is a Chebyshev sequence on $I$ iff $\Phi(\mathbf{x};\theta)$ maintains strictly fixed sign for $\mathbf{x} \in \mathcal{I}$; $\theta$ denotes $(1, 2, \ldots, n)$.

Proof. If $\Phi = 0$ for some $\mathbf{x} \in \mathcal{I}$ then, and only then, will the equation
\[ \sum_{i=1}^{n} c_i\phi_i(x_r) = 0, \qquad r = 1, 2, \ldots, n \]
have a non-zero solution $(c_i)_1^n$, i.e., the function $\phi(x)$ will have $n$ different zeros. On the other hand, since $\mathcal{I}$ is a connected subset of $\mathbb{R}^n$ and $\Phi$ is a continuous function, the fact that $\Phi \ne 0$ in $\mathcal{I}$ means that $\Phi$ has strictly fixed sign on $\mathcal{I}$. Without loss of generality we may take $\Phi > 0$.

Definition 10.6.2 A sequence of continuous functions $(\phi_i(x))_1^\infty$ will be called a Markov sequence in $I$ if, for each $n = 1, 2, \ldots$ the sequence $(\phi_i(x))_1^n$ is a Chebyshev sequence.

Theorem 10.6.2 shows that $(\phi_i(x))_1^\infty$ is a Markov sequence iff, for $n = 1, 2, \ldots$, $\Phi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n)$ has the same strict sign for any $\mathbf{x} \in \mathcal{I}$.

We now explore the nature of the zeros of a combination
\[ \phi(x) = \sum_{i=1}^{n} c_i\phi_i(x), \qquad \sum_{i=1}^{n} c_i^2 > 0 \]
of continuous functions $\phi_i(x)$ in a Chebyshev sequence. By definition, $\phi(x)$ has at most $n - 1$ zeros in $I$. We may divide these zeros into three groups: $s$ simple nodes in $(0,1)$, $d$ double nodes in $(0,1)$, and $p$ end-zeros at 0 or 1 if these are in $I$. In any two-sided vicinity of a simple node $\xi$, there are points $x_1, x_2$ such that $x_1 < \xi < x_2$ and
\[ \phi(x_1)\phi(x_2) < 0. \]
In any two-sided vicinity of a double node $\xi$, there are points $x_1, x_2$ such that $x_1 < \xi < x_2$ and
\[ \phi(x_1)\phi(x_2) > 0. \]
The statement that $(\phi_i(x))_1^n$ form a Chebyshev sequence means that
\[ s + d + p \le n - 1. \]
We now establish
Theorem 10.6.3 If the continuous functions $(\phi_i(x))_1^n$ form a Chebyshev sequence on $I$, then
\[ s + 2d + p \le n - 1 \]
i.e., in the estimate of the number of zeros, each double node may be counted twice.

Proof. Let $(x_i)_1^m \in I$ satisfy $x_1 < x_2 < \cdots < x_m$. If $\phi(x_k) \ne 0$ for $k = 1, 2, \ldots, m$, then the maximum number of sign changes in the sequence $(\phi(x_k))_1^m$ occurs if, for some integer $h$ (either 0 or 1)
\[ (-)^{h+k}\phi(x_k) > 0, \qquad k = 1, 2, \ldots, m. \]
If some $\phi(x_k)$ are zero we may assign signs, + or -, to them and obtain different sign change counts for the sequence $(\phi(x_k))_1^m$; the sign change count will be maximum, $m - 1$, if for some integer $h$ (either 0 or 1)
\[ (-)^{h+k}\phi(x_k) \ge 0, \qquad k = 1, 2, \ldots, m. \]
A set of $m$ points with this property is said to have property $Z$. Consider some examples. Figure 10.6.1 shows $\phi(x)$ with a zero at $x_1 = 0 \in I \equiv [0,1)$ and two simple nodes in $(0,1)$.
Figure 10.6.1 - $\phi(x)$ has 2 simple nodes.

The points $(x_i)_1^4 \in I$ have property $Z$. (Note that we are not interested in the value of $\phi(1)$ since 1 is not in $I$.) Here $s = 2$, $p = 1$ and $m = 4 = s + p + 1$. Now suppose also that $\phi(x)$ has a double node at $x_4$, as in Figure 10.6.2, with $I \equiv [0,1)$.

Figure 10.6.2 - $\phi(x)$ has 2 simple nodes and one double node.

The points $(x_i)_1^6 \in I$ have property $Z$. Here $s = 2$, $d = 1$ and $m = 6 = s + 2d + p + 1$. In general, if $\phi(x)$ has $s$ simple nodes, $d$ double nodes and $p$ end zeros, then we may find $m = s + 2d + p + 1$ points with property $Z$. Suppose, if possible, that $s + 2d + p \ge n$, then we may find $n + 1$ points $(x_i)_1^{n+1}$ with property $Z$, i.e.,
\[ (-)^{h+k}\phi(x_k) \ge 0, \qquad k = 1, 2, \ldots, n + 1. \tag{10.6.3} \]
Since $\phi(x)$ is a linear combination of $(\phi_i)_1^n$, the functions $\phi_1, \phi_2, \ldots, \phi_n, \phi = \phi_{n+1}$ are linearly dependent. Therefore, by Theorem 10.6.1,
\[ \Phi(x_1, x_2, \ldots, x_{n+1}; 1, 2, \ldots, n+1) = 0. \]
Expand this zero determinant along its last row; we get
\[ \sum_{k=1}^{n+1} (-)^{n+k+1}\phi(x_k)\,\Phi(x_1, x_2, \ldots, x_{k-1}, x_{k+1}, \ldots, x_{n+1}; 1, 2, \ldots, n) = 0. \]
Since $(\phi_i(x))_1^n$ form a Chebyshev sequence, the determinants in this equation have the same strict sign, by Theorem 10.6.2. Moreover, by the assumption (10.6.3), the terms $(-)^{n+k+1}\phi(x_k)$ have the same (loose) sign. This means that $\phi(x_k) = 0$ for $k = 1, 2, \ldots, n + 1$, but this is impossible: since the $(\phi_i)_1^n$ form a Chebyshev system, $\phi(x)$ has at most $n - 1$ zeros. We conclude that $m \le n$, i.e., $s + 2d + p \le n - 1$.

We now introduce an extra condition on the functions $\{\phi_i(x)\}_1^\infty$, that they are orthonormal, and prove the fundamental

Theorem 10.6.4 If $\{\phi_i(x)\}_1^\infty$ is a Markov sequence of continuous functions on $I$, and the $\phi_i(x)$ are orthonormal with respect to some inner product, i.e., $(\phi_i, \phi_j) = \delta_{ij}$, then

1) $\phi_1(x)$ has no zeros in $I$.

2) $\phi_i(x)$ has $i - 1$ simple nodes and no other zeros in $I$.

3) $\phi(x) = \sum_{i=j}^{k} c_i\phi_i(x)$, $1 \le j \le k$, $\sum_{i=j}^{k} c_i^2 > 0$, has not less than $j - 1$ simple nodes in $(0,1)$, and not more than $k - 1$ zeros in $I$; in the notation of Theorem 10.6.3, $s + 2d + p \le k - 1$.

Proof. Note that 1) and 2) are particular cases of 3), and all that is left to be proved in 3) is that $\phi(x)$ has not less than $j - 1$ simple nodes. The functions $(\phi_i)_1^\infty$ form a Markov sequence. This means that if $0 < x_1 < x_2 < \cdots < x_n < 1$, then
\[ \Phi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n) \]
has fixed sign, which we may take to be positive. Let $(\xi_i)_1^s$ be the simple nodes of $\phi(x)$ in $(0,1)$ and define
\[ \psi(x) = \Phi(\xi_1, \xi_2, \ldots, \xi_s, x; 1, 2, \ldots, s+1). \]
If $x > \xi_s$ then $\psi(x) > 0$. If $\xi_p < x < \xi_{p+1}$, $p = 1, 2, \ldots, s - 1$,
\[ \psi(x) = (-)^{s-p}\,\Phi(\xi_1, \xi_2, \ldots, \xi_p, x, \xi_{p+1}, \ldots, \xi_s; 1, 2, \ldots, s+1), \]
while if $x < \xi_1$,
\[ \psi(x) = (-)^{s}\,\Phi(x, \xi_1, \ldots, \xi_s; 1, 2, \ldots, s+1). \]
Thus $\psi(x)$ changes sign as $x$ passes through each node; $\psi(x)$ has just $s$ zeros, the $s$ simple nodes $(\xi_i)_1^s$. These are the same simple nodes as $\phi(x)$. Therefore,
\[ (\psi, \phi) \ne 0. \]
But $\psi$ is a combination of $(\phi_i)_1^{s+1}$ while $\phi$ is a combination of $(\phi_i)_j^k$; these combinations must overlap, i.e., $s + 1 \ge j$, $s \ge j - 1$.

Theorem 10.6.5 Under the conditions of Theorem 10.6.4, the simple nodes of $\phi_i(x)$ and $\phi_{i+1}(x)$ interlace.

Proof. Any combination
\[ \phi(x) = c_i\phi_i(x) + c_{i+1}\phi_{i+1}(x), \qquad c_i^2 + c_{i+1}^2 > 0 \]
has either $i - 1$ or $i$ zeros in $(0,1)$, and all these zeros are simple nodes. ($s \ge i - 1$, $s + 2d + p \le i$ imply $d = 0$ and either $s = i - 1$, $p = 0$ or 1; or $s = i$, $p = 0$.) Suppose the nodes of $\phi_{i+1}(x)$ are $(\xi_j)_1^i$; write $\xi_0 = 0$, $\xi_{i+1} = 1$, so that
\[ 0 = \xi_0 < \xi_1 < \cdots < \xi_i < \xi_{i+1} = 1 \]
and consider
\[ \psi(x) = \phi_i(x)/\phi_{i+1}(x). \]
In each of the intervals $(\xi_j, \xi_{j+1})$, $j = 0, \ldots, i$, the function $\psi(x)$ is continuous, since $\phi_{i+1}(x)$ is non-zero. We now show that $\psi(x)$ is monotonic in each of these intervals. Suppose, if possible, that $\psi(x)$ were not monotonic in an interval $(\xi_j, \xi_{j+1})$. Then there would exist points $x_1, x_2, x_3$ such that $\xi_j < x_1 < x_2 < x_3 < \xi_{j+1}$ and $\psi(x_1) - \psi(x_2)$, $\psi(x_2) - \psi(x_3)$ have opposite signs. Without loss of generality we may assume $\psi(x_1) < \psi(x_2)$, $\psi(x_3) < \psi(x_2)$. The function $\psi(x)$, being continuous in $[x_1, x_3]$, assumes its maximum value in $[x_1, x_3]$. This maximum must occur at an interior point, $x_0$, of $[x_1, x_3]$ since $\psi(x_1), \psi(x_3)$ are both less than $\psi(x_2)$. Therefore,
\[ \psi(x) - \psi(x_0) \le 0 \ \text{for all } x \in [x_1, x_3] \]
and thus
\[ \phi(x) = \phi_{i+1}(x)\{\psi(x) - \psi(x_0)\} = \phi_i(x) - \psi(x_0)\phi_{i+1}(x) \]
retains its sign in the neighbourhood of its zero, $x_0$. This contradicts the statement that $\phi(x)$ has only simple nodes. Hence $\psi(x)$ is monotonic in each interval $(\xi_j, \xi_{j+1})$, $j = 0, 1, \ldots, i$.

We now consider the behaviour of $\psi(x)$ near one of the nodes $(\xi_j)_1^i$ of $\phi_{i+1}(x)$. Since $\psi(x)$ is monotonic in each of the intervals $(\xi_j, \xi_{j+1})$, $j = 0, \ldots, i$, the limits
\[ \lim_{x \to \xi_j^-} \psi(x) = L_1, \qquad \lim_{x \to \xi_j^+} \psi(x) = L_2 \]
will exist for all $j = 1, 2, \ldots, i$; they may be finite or infinite. If $\xi_j$ is not a node of $\phi_i(x)$ then $L_1$ and $L_2$ will be infinite and have opposite signs. We will show that this is the only case that can occur.

Suppose that $\xi_j$ is a node of $\phi_i(x)$, as well as of $\phi_{i+1}(x)$. Then $L_1$, $L_2$ may be finite or infinite but will at least have the same sign. Suppose, without loss of generality, that $\psi(x)$ is monotonic increasing in $(\xi_{j-1}, \xi_j)$. If $\psi(x)$ is monotonic decreasing in $(\xi_j, \xi_{j+1})$ there are five possible cases, shown in Figure 10.6.3:

a) $L_1 = \infty$, $L_2 = \infty$
b) $L_1 = \infty$, $L_2$ finite
c) $L_1$ finite, $L_2 = \infty$
d) $L_1$ finite, $L_2 = L_1$
e) $L_1$ finite, $L_2 \ne L_1$
Figure 10.6.3 - $\psi(x)$ is monotonic decreasing in $(\xi_j, \xi_{j+1})$; panels a) to e).

If $\psi(x)$ is monotonic increasing in $(\xi_j, \xi_{j+1})$ there are just three possible cases shown in Figure 10.6.4:

f) $L_1 = \infty$, $L_2$ finite
g) $L_1$ finite, $L_2 = L_1$
h) $L_1$ finite, $L_2 \ne L_1$

Figure 10.6.4 - $\psi(x)$ is monotonic increasing in $(\xi_j, \xi_{j+1})$; panels f) to h).

In all but cases a), d) there is a line $y = h$, shown, such that $\psi(x)$ crosses this line as $x$ passes through $\xi_j$. Thus $\psi(x) - h$ changes sign at $x = \xi_j$ and thus
\[ \phi(x) = \phi_{i+1}(x)(\psi(x) - h) = \phi_i(x) - h\phi_{i+1}(x) \]
retains its sign as $x$ passes through its zero $\xi_j$, contradicting the statement that all the zeros of $\phi(x)$ are simple nodes. Now take case d), suppose $L_1 = L_2 = h$, and consider the function
\[ \phi(x, f) = \phi_{i+1}(x)(\psi(x) - f) = \phi_i(x) - f\phi_{i+1}(x); \]
when $f = h$, $\phi(x, h)$ has either $i - 1$ or $i$ nodes. Now take $f = h - \varepsilon = h'$, where $\varepsilon > 0$. We may find $x_1, x_2$ such that
\[ \xi_{j-1} < x_1 < \xi_j < x_2 < \xi_{j+1}, \qquad \psi(x_1) = \psi(x_2) = h'. \]
Since $\phi_{i+1}(x)$ retains its sign and $\psi(x) - h'$ changes its sign as $x$ passes through $x_1$ and $x_2$, these points are nodes of $\phi(x, h')$. Thus $\phi(x, f)$ acquires two new nodes as $f$ passes from $h$ to $h - \varepsilon$, but this is impossible since $\phi(x, h)$ and $\phi(x, h')$ both have either $i - 1$ or $i$ nodes.

We conclude that if $\xi_j$ is a node of $\phi_i(x)$ then the only possibility is a). But this means that all the limits $L_1, L_2$ for $j = 1, 2, \ldots, i$ must be infinite; $\psi(x)$ must assume all values in each interval $(\xi_j, \xi_{j+1})$, $j = 1, 2, \ldots, i - 1$; $\psi(x)$ must have a node in each, and so too must $\phi_i(x)$. But $\phi_i(x)$ has just $i - 1$ nodes, so none of the $(\xi_j)_1^i$ can be nodes of $\phi_i(x)$: case a) cannot occur; $\psi(x)$ must be monotonic increasing in all the intervals $(\xi_j, \xi_{j+1})$, $j = 0, 1, \ldots, i$, or monotonic decreasing in all of them; the nodes of $\phi_i(x)$ and $\phi_{i+1}(x)$ interlace.
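The Chebyshev and Markov properties of this section can be explored numerically. The sketch below is illustrative only: it assumes the concrete family $\phi_i(x) = \sin\{(i - 1/2)\pi x\}$ on $I = (0,1]$ (the modes of a uniform string fixed at $x = 0$ and free at $x = 1$), and the helper names, the sample grid and the random coefficients are hypothetical choices rather than anything from the original text.

```python
import math
import random
from itertools import combinations

def phi(i, x):
    """Assumed test family: phi_i(x) = sin((i - 1/2) pi x), i = 1, 2, ..."""
    return math.sin((i - 0.5) * math.pi * x)

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for c, entry in enumerate(m[0]):
        sub = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(sub)
    return total

def determinant_signs(n, grid=10):
    """Signs of det[phi_i(x_j)] over all increasing n-point subsets of {1/grid, ..., 1}."""
    points = [k / grid for k in range(1, grid + 1)]
    signs = set()
    for xs in combinations(points, n):
        value = det([[phi(i, x) for x in xs] for i in range(1, n + 1)])
        signs.add(1 if value > 0 else -1)
    return signs

def sign_changes(values, tol=1e-12):
    """Number of sign changes in a sequence, ignoring entries of negligible size."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for p, q in zip(signs, signs[1:]) if p != q)

def max_zero_count(n, trials=200, grid=500, seed=1):
    """Largest number of sign changes seen in random combinations of phi_1..phi_n on (0, 1]."""
    rng = random.Random(seed)
    xs = [k / grid for k in range(1, grid + 1)]
    worst = 0
    for _ in range(trials):
        c = [rng.uniform(-1, 1) for _ in range(n)]
        values = [sum(c[i - 1] * phi(i, x) for i in range(1, n + 1)) for x in xs]
        worst = max(worst, sign_changes(values))
    return worst

for n in (1, 2, 3, 4):
    print(n, determinant_signs(n), max_zero_count(n))
```

For each $n$ the set of determinant signs should contain a single element, in line with Theorem 10.6.2, and the sign-change count should not exceed $n - 1$. The sign itself may differ from one $n$ to the next; only its constancy for fixed $n$ matters.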
10.7 Perron’s Theorem and compound kernels

Our aim in this section is to show that the eigenfunctions $v_i(x)$ of the integral equation (10.2.9) form a Markov sequence. Following the discussion of total positivity in Chapter 6, we base our analysis on continuous versions of Perron’s Theorem and the Cauchy-Binet Theorem. Just as the matrix version of Perron’s Theorem holds for arbitrary positive (square) matrices, not just symmetric ones, so there is a continuous version holding for arbitrary (not necessarily symmetric) positive kernels. However, the proof of the theorem for the arbitrary, nonsymmetric, case is beyond the scope of this book. We will therefore state the theorem for the general case but prove it only for the symmetric case, which is in fact all we need. We have

Theorem 10.7.1 If the continuous kernel $K(x,s)$ satisfies
\[ K(x,s) \ge 0, \qquad K(x,x) > 0, \qquad x, s \in (0,1) \]
then the eigenvalue $\lambda_1$ of the integral equation
\[ u(x) = \lambda \int_0^1 K(x,s)u(s)\,ds \tag{10.7.1} \]
which has smallest absolute value is positive and simple; the corresponding eigenfunction $u_1(x)$ has no zero in $(0,1)$.

Proof. In Section 10.3 we showed that a non-zero self-adjoint compact operator $A$ has at least one, non-zero, eigenvalue
\[ \lambda = \sup_{\|x\| = 1} (Ax, x). \]
When translated into the language of the integral equation (10.7.1), this states that the equation (10.7.1) has an eigenvalue $\lambda_1$ satisfying
\[ \frac{1}{\lambda_1} = \max\left\{ \frac{F(u)}{\|u\|^2} \right\}, \tag{10.7.2} \]
where
\[ F(u) = \int_0^1\!\!\int_0^1 K(x,s)u(x)u(s)\,dx\,ds, \]
and
\[ \|u\|^2 = \int_0^1 u^2(x)\,dx. \]
This maximum is actually achieved by $u_1(x)$ which satisfies
\[ u_1(x) = \lambda_1 \int_0^1 K(x,s)u_1(s)\,ds. \tag{10.7.3} \]
Now consider $w_1(x) = |u_1(x)|$. Clearly $\|w_1\|^2 = \|u_1\|^2$ while $F(w_1) \ge F(u_1)$, which means that $w_1(x)$ is also an eigenfunction, satisfying (10.7.3), i.e.,
\[ w_1(x) = \lambda_1 \int_0^1 K(x,s)w_1(s)\,ds. \tag{10.7.4} \]
Suppose that $u_1(x)$ had an isolated zero for some $\xi \in (0,1)$. On the basis of $K(\xi,\xi) > 0$ and the continuity of $K$, we have $K(\xi,s) > 0$, $w_1(s) > 0$ for some interval $(\xi, \xi + \varepsilon)$, $\varepsilon > 0$. Thus, at $\xi$, the left hand side of (10.7.4) is zero, while the right hand side is positive; this is a contradiction. A zero interval in $u_1(x)$ may be ruled out similarly. This means that any eigenfunction corresponding to $\lambda_1$ must have the same sign in $(0,1)$. There cannot be two mutually orthogonal eigenfunctions which maintain fixed sign in $(0,1)$ so that $\lambda_1$ must be simple and positive.

The proof is thus complete if we can show that if $\lambda$ is a negative eigenvalue of (10.7.1) then $|\lambda| > \lambda_1$. Let $v(x)$ be a normalised eigenfunction corresponding to $\lambda$, so that
\[ v(x) = \lambda \int_0^1 K(x,s)v(s)\,ds, \]
and therefore
\[ |v(x)| \le |\lambda| \int_0^1 K(x,s)|v(s)|\,ds. \tag{10.7.5} \]
The function $v(x)$, being orthogonal to $w_1(x)$, cannot retain one sign in $(0,1)$ so that there must be strict inequality in (10.7.5). Therefore
\[ |v(x)| < |\lambda| \int_0^1 K(x,s)|v(s)|\,ds \]
and thus
\[ (|v|, |v|) < |\lambda|\,F(|v|). \]
But, by (10.7.2),
\[ (|v|, |v|) \ge \lambda_1 F(|v|) \]
so that $|\lambda| > \lambda_1$: $\lambda_1$ is the eigenvalue of smallest modulus and is positive and simple.

Starting from a kernel $K(x,s)$ on $[0,1] \times [0,1]$ we may use the minors introduced in Section 10.5 to define a compound kernel $K(\mathbf{x};\mathbf{s})$ defined on $\bar{\mathcal{Q}} \times \bar{\mathcal{Q}}$, where $\bar{\mathcal{Q}}$ is the $n$-dimensional simplex
\[ 0 \le x_1 \le x_2 \le \cdots \le x_n \le 1. \]
If $\mathbf{x}$ is an interior point of $\bar{\mathcal{Q}}$ then
\[ 0 < x_1 < x_2 < \cdots < x_n < 1 \]
so that $\mathbf{x} \in \mathcal{Q}$. The place of the Cauchy-Binet Theorem is taken by

Theorem 10.7.2 If three kernels $K(x,s)$, $L(x,s)$, $N(x,s)$ defined on $[0,1] \times [0,1]$ are related by
\[ N(x,s) = \int_0^1 K(x,t)L(t,s)\,dt, \qquad x, s \in [0,1] \]
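then
\[ N(\mathbf{x};\mathbf{s}) = \int_{\bar{\mathcal{Q}}} K(\mathbf{x};\mathbf{t})L(\mathbf{t};\mathbf{s})\,d\mathbf{t}, \qquad \mathbf{x}, \mathbf{s} \in \bar{\mathcal{Q}}, \]
where the integration is taken over the simplex $\bar{\mathcal{Q}}$.

Proof. The result follows immediately from splitting the integral over the $n$-dimensional $[0,1] \times [0,1] \times \cdots \times [0,1]$ into $n!$ integrals over simplices $0 \le x_{i_1} \le x_{i_2} \le \cdots \le x_{i_n} \le 1$.

Theorem 10.7.3 If $(\lambda_i)_1^\infty$ and $(u_i(x))_1^\infty$ are eigenvalues and corresponding eigenfunctions of (10.7.1), then
\[ U(\mathbf{x}) = \Lambda \int_{\bar{\mathcal{Q}}} K(\mathbf{x};\mathbf{s})U(\mathbf{s})\,d\mathbf{s}, \tag{10.7.6} \]
where
\[ \Lambda = \lambda_{i_1}\lambda_{i_2}\cdots\lambda_{i_n}, \qquad U(\mathbf{s}) = U(\mathbf{s};\alpha), \]
$\alpha = (i_1, i_2, \ldots, i_n)$ and $1 \le i_1 < i_2 < \cdots < i_n$.

Proof. Equation (10.7.1) shows that
\[ U(\mathbf{x};\alpha) = \lambda_{i_1}\lambda_{i_2}\cdots\lambda_{i_n} \int_{\bar{\mathcal{Q}}} K(\mathbf{x};\mathbf{s})U(\mathbf{s};\alpha)\,d\mathbf{s}. \]

We may now extend Perron’s Theorem to equation (10.7.6).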
Theorem 10.7.4 If the continuous kernel $K(x,s)$ satisfies
\[ K(\mathbf{x};\mathbf{s}) \ge 0, \qquad K(\mathbf{x};\mathbf{x}) > 0, \qquad \mathbf{x}, \mathbf{s} \in \mathcal{Q} \]
then the eigenvalue of (10.7.6) which has smallest modulus is positive and simple; the corresponding eigenfunction $U(\mathbf{x})$ has no zero in $\mathcal{Q}$.

Proof. The proof in the situation in which $K(\mathbf{x};\mathbf{s})$ is symmetric is the analogue of that in Theorem 10.7.1.

Now we may prove

Theorem 10.7.5 If the continuous kernel $K(x,s)$ satisfies
\[ K(\mathbf{x};\mathbf{s}) \ge 0, \qquad K(\mathbf{x};\mathbf{x}) > 0, \qquad \mathbf{x}, \mathbf{s} \in \mathcal{I} \]
then all the eigenvalues of equation (10.7.1) are positive and simple, i.e., $0 < \lambda_1 < \lambda_2 < \cdots$, and the corresponding eigenfunctions form a Markov sequence in $I$.

Proof. Order the eigenvalues of (10.7.1) so that $|\lambda_1| \le |\lambda_2| \le \cdots$; then the eigenvalue of (10.7.6) that has smallest modulus is $\lambda_1\lambda_2\cdots\lambda_n$. Thus Theorem 10.7.4 states that

a) $\lambda_1\lambda_2\cdots\lambda_n > 0$

b) $\lambda_1\lambda_2\cdots\lambda_n < |\lambda_1\lambda_2\cdots\lambda_{n-1}\lambda_{n+1}|$

for all $n = 2, 3, \ldots$. Thus in turn we have the following: $\lambda_1 > 0$, $\lambda_1 < |\lambda_2|$, $\lambda_1\lambda_2 > 0$ and thus $\lambda_1 < \lambda_2$; $\lambda_1\lambda_2 < |\lambda_1\lambda_3| = \lambda_1|\lambda_3|$, $\lambda_1\lambda_2\lambda_3 > 0$ and thus $\lambda_2 < \lambda_3$, and so on. Theorems 10.7.3 and 10.7.4 show that the eigenfunction corresponding to the lowest eigenvalue, namely
\[ U(\mathbf{x};\theta) = U(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n), \]
has no zeros in $\mathcal{Q}$, and in fact has fixed sign on $\mathcal{I}$, and this is the necessary and sufficient condition for the sequence $u_i(x)$ to form a Markov sequence on $I$.

Note that we have shown that if $K(x,s)$ is an oscillatory kernel then the corresponding operator $A$ is a strictly positive (compact self-adjoint linear) operator. Thus, Theorem 10.3.9 applies, and the eigenfunctions form a complete orthonormal system in $H$.

Let us now consider the application of these results to the integral equation governing the vibrations of the string. We recall that we wrote this equation in two ways, namely (10.2.9) and (10.2.11); these are
\[ v(x) = \lambda \int_0^1 \rho^2(s)G(x,s)v(s)\,ds \]
and
\[ u(x) = \lambda \int_0^1 K(x,s)u(s)\,ds. \]
Suppose that $\rho(x)$ is piecewise continuous on $[0,1]$ then, as we showed earlier, $v(x)$, the actual (amplitude of the) string vibration, is continuous while $u(x)$ is piecewise continuous. The Theorems we have proved in this section were phrased in terms of a continuous kernel $K(x,s)$, but clearly this is unnecessarily restrictive. We used the continuity of $K(x,s)$ because we assumed only that $K(x,s) \ge 0$, $K(x,x) > 0$. It was continuity which allowed us to extend $K(x,x) > 0$ to $K(x,s) > 0$ for $s$ near $x$. If, as for the string, we have
\[ K(x,s) = \rho(x)\rho(s)G(x,s) > 0 \]
for $x, s \in I$, we do not need to invoke continuity. A similar argument applies to Theorem 10.7.4. We have $K(\mathbf{x};\mathbf{s}) > 0$ when $\mathbf{x}, \mathbf{s} \in \mathcal{I}$ and $x_1, s_1 < x_2, s_2 < \cdots < x_n, s_n$. We may thus conclude that $U(\mathbf{x};\theta)$ has fixed sign on $\mathcal{I}$, and hence the corresponding minor $V(\mathbf{x};\theta)$ formed from the $v_i(x)$ of equation (10.2.9) has fixed sign on $\mathcal{I}$; the $v_i(x)$ form a Markov sequence on $I$.

Since the $v_i(x)$ form a Markov sequence, they have the properties established in Section 10.6: $v_i(x)$ has exactly $i - 1$ simple nodes in $(0,1)$, and the nodes of $v_i(x)$ and $v_{i+1}(x)$ interlace. Figure 10.7.1 shows typical shapes of a string with end conditions $u(0) = 0 = u'(1)$. Note that we may simulate the ‘free’ end condition $u'(1) = 0$ by passing the string through a slider at $x = 1$ that keeps the string horizontal there. See Section 2.2. Alternatively, we may simulate a free end by viewing just the left hand half of a symmetrical string stretched between 0 and 2, and considering just the symmetrical modes; these will satisfy $u'(1) = 0$. The modes are qualitatively like the modes $\sin\{(i - 1/2)\pi x\}$ of a uniform string.

Figure 10.7.1 - Typical modes $u_1(x)$, $u_2(x)$, $u_3(x)$ of a string under the conditions $u(0) = 0 = u'(1)$.
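Perron’s Theorem for a positive symmetric kernel can be illustrated on the simplest string. The sketch below is an assumption-laden illustration rather than anything from the original text: it takes a uniform string ($\rho = 1$) with $u(0) = 0 = u'(1)$, for which $\phi(x) = x$, $\psi(x) = 1$ and hence $K(x,s) = \min(x,s)$, discretises the integral equation (10.7.1) by the midpoint rule, and applies power iteration. The function name `smallest_eigenpair` and the parameters `n`, `iterations` are hypothetical.

```python
import math

def smallest_eigenpair(kernel, n=100, iterations=40):
    """Approximate lambda_1 and u_1 of u(x) = lambda * int_0^1 K(x, s) u(s) ds.

    The integral operator is replaced by an n-by-n matrix (midpoint rule) and its
    dominant eigenvalue 1/lambda_1 is found by power iteration.
    """
    step = 1.0 / n
    nodes = [(i + 0.5) * step for i in range(n)]
    matrix = [[kernel(x, s) * step for s in nodes] for x in nodes]
    u = [1.0] * n
    mu = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * u[j] for j in range(n)) for row in matrix]
        norm_w = math.sqrt(sum(v * v for v in w))
        norm_u = math.sqrt(sum(v * v for v in u))
        mu = norm_w / norm_u          # estimate of the dominant eigenvalue 1/lambda_1
        u = [v / norm_w for v in w]   # normalised iterate
    return 1.0 / mu, nodes, u

lam1, nodes, u1 = smallest_eigenpair(lambda x, s: min(x, s))
print(lam1, (math.pi / 2) ** 2)   # the uniform string has lambda_1 = (pi/2)^2
print(min(u1) > 0)                # the first mode keeps one sign in (0, 1)
```

Because the starting vector and the discretised kernel are both positive, every iterate stays positive; the sketch therefore shows the conclusion of the theorem on an example but does not test the hypothesis $K(x,x) > 0$ independently.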
10.8 The interlacing of eigenvalues

In Section 2.9, when discussing vibration under constraint, we used a variational formulation of the matrix eigenvalue problem and, in order to discuss how eigenvalues change under constraint, we used Courant’s minimax theorem. This theorem may be extended to a self-adjoint compact operator $A$ in Hilbert space. For simplicity we assume that $A$ is positive definite. In Section 10.3 we found the greatest eigenvalue of $A$ as
\[ \lambda_1 = \sup_{x \in H} F(x) = F(x_1) \]
where
\[ F(x) = (Ax, x)/\|x\|^2. \tag{10.8.1} \]
Then we decomposed $H = H_1$ into $M_1$, the space spanned by $x_1$, and its orthogonal complement $H_2$: $H_1 = M_1 + H_2$, and found
\[ \lambda_2 = \sup_{x \in H_2} F(x) = F(x_2). \]
Generally,
\[ \lambda_{n+1} = \sup_{x \in H_{n+1}} F(x) = F(x_{n+1}) \]
where $M_n$ is the space spanned by $x_1, x_2, \ldots, x_n$, and $H = M_n + H_{n+1}$. This is the iterative procedure for finding the eigenvalues.
The corresponding minimax procedure is as follows:
\[ \lambda_1 = \sup_{x \in H} F(x) = F(x_1). \]
Now take $y_1 \in H$, let $N_1$ be the space spanned by $y_1$, and decompose $H$ as $H = N_1 + H_2$. Then
\[ \lambda_2 = \inf_{y_1 \in H}\ \sup_{x \in H_2} F(x) = F(x_2). \]
Generally, let $N_n$ be the space spanned by $y_1, y_2, \ldots, y_n$, and $H = N_n + H_{n+1}$, then
\[ \lambda_{n+1} = \inf_{N_n \subset H}\ \sup_{x \in H_{n+1}} F(x) = F(x_{n+1}). \]
The advantage possessed by the minimax form over the iterative form is seen most clearly when it is required to order the eigenvalues of two different operators $A$, $A'$. If it is known that
\[ (A'x, x) \ge (Ax, x) \]
so that
\[ F'(x) = (A'x, x)/\|x\|^2 \ge F(x), \]
then
\[ \lambda'_{n+1} = \inf_{N_n \subset H}\ \sup_{x \in H_{n+1}} F'(x) \ge \inf_{N_n \subset H}\ \sup_{x \in H_{n+1}} F(x) = \lambda_{n+1}: \tag{10.8.2} \]
the eigenvalues of $A'$ are greater than or equal to those of $A$; we can compare the eigenvalues because the infs and sups are taken over the same subspaces. By contrast, in the iterative scheme, the subspace $H_{n+1}$ is related to the operator: it is the subspace orthogonal to the space spanned by the previously found eigenvectors $x_1, x_2, \ldots, x_n$.

If in addition
\[ (A'x, x) - (Ax, x) = C(x, y)^2, \tag{10.8.3} \]
for some $C > 0$ and $y \in H$, then we can say more. Equation (10.8.3) implies
\[ F(x) = F'(x) \ \text{if } (x, y) = 0. \]
Thus
\[ \lambda_n = \inf_{N_{n-1} \subset H}\ \sup_{x \in H_n} F(x) \ge \inf_{N'_n \subset H}\ \sup_{x \in H'_{n+1}} F'(x), \]
where $N'_n$ is the space spanned by the arbitrary $y_1, y_2, \ldots, y_{n-1}$ and $y$, and $H = N'_n + H'_{n+1}$. But this inf cannot be less than that taken over $N_n$, so that
\[ \lambda_n \ge \lambda'_{n+1}. \tag{10.8.4} \]
The inequalities (10.8.2), (10.8.4) imply that the eigenvalues of $A$ and $A'$ interlace in the sense
\[ \lambda'_1 \ge \lambda_1 \ge \lambda'_2 \ge \lambda_2 \ge \cdots \]
We now apply this theory to the eigenvalues of the string under different end conditions. When translated into the language of integral equations, equation (10.8.1) becomes
\[ F(u) = \int_0^1\!\!\int_0^1 K(x,s)u(x)u(s)\,dx\,ds \Big/ \int_0^1 u^2(x)\,dx \]
where
\[ K(x,s) = \rho(x)\rho(s)G(x,s), \]
and $G(x,s)$ is given by (10.2.6), $\phi(x), \psi(x)$ by (10.2.7). Since $G(x,s)$ depends on $h, H$ we write it as $G(x,s,h,H)$. Simple algebra shows that
\[ G(x,s,h,H') - G(x,s,h,H) = C_1(H - H')(1 + hx)(1 + hs) \]
and
\[ G(x,s,h',H) - G(x,s,h,H) = C_2(h - h')(1 + H(1 - x))(1 + H(1 - s)) \]
where
\[ C_1(h + H' + hH')(h + H + hH) = 1 = C_2(h' + H + h'H)(h + H + hH). \]
This implies that
\[ F(u,h,H') - F(u,h,H) = C_1(H - H')(u, w_1)^2/\|u\|^2 \]
and
\[ F(u,h',H) - F(u,h,H) = C_2(h - h')(u, w_2)^2/\|u\|^2 \]
where
\[ w_1(x) = (1 + hx)\rho(x), \qquad w_2(x) = (1 + H(1 - x))\rho(x) \]
and
\[ (u, v) = \int_0^1 u(x)v(x)\,dx. \]
Remembering that $\lambda$ must be replaced by $1/\lambda$, we may now apply the previous theory as follows:

a) if $H < H'$ then
\[ \lambda_n(h,H) \le \lambda_n(h,H') \le \lambda_{n+1}(h,H) \tag{10.8.5} \]

b) if $h < h'$ then
\[ \lambda_n(h,H) \le \lambda_n(h',H) \le \lambda_{n+1}(h,H) \tag{10.8.6} \]

If $h < h'$ and $H < H'$ then, by combining a) and b), we find
\[ \lambda_n(h,H) \le \lambda_n(h',H) \le \lambda_n(h',H') \]
and
\[ \lambda_n(h,H) \le \lambda_n(h',H) \le \lambda_n(h',H') \le \lambda_{n+1}(h',H) \le \lambda_{n+2}(h,H). \tag{10.8.7} \]
Note that we have used loose inequalities throughout, but in general the inequalities will be strict, as we now show. We obtained these interlacing results by using the Green's function formulation of the eigenvalue problem. There is another approach using a variational formulation for the original differential equation (10.1.1). The eigenvalue problem (10.1.1), (10.1.2) is equivalent to finding the stationary values of
$$J(v) \equiv \int_0^1 [v'(x)]^2\,dx + h\,v^2(0) + H\,v^2(1)$$
subject to
$$(\rho^2 v, v) \equiv \int_0^1 \rho^2(x)\,v^2(x)\,dx = 1. \tag{10.8.8}$$
The following argument may be made rigorous. We introduce a Lagrange parameter $\lambda$ and consider $G(v) = J(v) - \lambda(\rho^2 v, v)$. Then
$$\lim_{\varepsilon\to 0}\frac{G(v+\varepsilon\eta) - G(v)}{2\varepsilon} = \int_0^1 v'\eta'\,dx + h\,v(0)\eta(0) + H\,v(1)\eta(1) - \lambda\int_0^1 \rho^2 v\,\eta\,dx.$$
Integrate the first term by parts, rearrange the terms, and equate the whole to zero:
$$-\int_0^1 (v'' + \lambda\rho^2 v)\,\eta\,dx - [v'(0) - h\,v(0)]\,\eta(0) + [v'(1) + H\,v(1)]\,\eta(1) = 0.$$
This will be zero for all variations $\eta(x)$ only if $v(x)$ satisfies (10.1.1) and (10.1.2). Suppose that $\{\lambda_n, v_n(x)\}_1^\infty$ are the eigenvalues and eigenfunctions of (10.1.1), (10.1.2), normalised so that $(\rho^2 v_m, v_n) = \delta_{mn}$. Then
$$\int_0^1 v_m'(x)\,v_n'(x)\,dx = [v_m(x)\,v_n'(x)]_0^1 + \lambda_n\delta_{mn} = -h\,v_m(0)v_n(0) - H\,v_m(1)v_n(1) + \lambda_n\delta_{mn}. \tag{10.8.9}$$
Now consider the variational problem for equation (10.1.1) under the end conditions
$$v'(0) - h'v(0) = 0 = v'(1) + Hv(1), \tag{10.8.10}$$
where $h' > h$. This is the problem of finding the stationary values of
$$J'(v) = \int_0^1 [v'(x)]^2\,dx + h'v^2(0) + Hv^2(1)$$
subject to (10.8.8). Expand $v(x)$ in terms of the eigenfunctions $v_m(x)$:
$$v(x) = \sum_{m=1}^{\infty} c_m v_m(x).$$
Now use the integral (10.8.9) to get
$$J'(v) = \sum_{m=1}^{\infty} \lambda_m c_m^2 + (h'-h)\,v^2(0). \tag{10.8.11}$$
The equations giving the values of $c_m$ that make $J'(v)$ stationary are
$$\lambda_m c_m + (h'-h)\,v_m(0)\,v(0) - \lambda c_m = 0, \qquad m = 1, 2, \ldots$$
i.e., $c_m = (h'-h)\,v_m(0)\,v(0)/(\lambda - \lambda_m)$, so that the condition
$$v(0) = \sum_{m=1}^{\infty} c_m v_m(0)$$
gives
$$1 = (h'-h)\sum_{m=1}^{\infty}\frac{v_m^2(0)}{\lambda - \lambda_m}. \tag{10.8.12}$$
This is the analogue of equation (4.3.21), and immediately gives the strict form of (10.8.6):
$$\lambda_n(h,H) < \lambda_n(h',H) < \lambda_{n+1}(h,H). \tag{10.8.13}$$
The end values, $v_m(0)$, cannot be zero unless $h = \infty$, i.e., the end $x = 0$ is fixed; this case is excluded by $h' > h$. We may employ a similar procedure to get the strict form of (10.8.5).
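The strict interlacing (10.8.13) can be checked numerically in the simplest case. The sketch below is an illustration only, under assumptions that are not in the text: a uniform string with $\rho(x) = 1$ on $[0,1]$ and positive $h$, $H$, for which $v = \cos\omega x + (h/\omega)\sin\omega x$ and the end condition at $x = 1$ reduces to $(h+H)\,\omega\cos\omega - (\omega^2 - hH)\sin\omega = 0$ with $\lambda = \omega^2$. The function names `char_fn` and `eigenvalues` and the sample values of $h$, $h'$, $H$ are hypothetical.

```python
import math


def char_fn(omega, h, H):
    """omega * (v'(1) + H v(1)) for v = cos(omega x) + (h/omega) sin(omega x), rho = 1."""
    return (h + H) * omega * math.cos(omega) - (omega**2 - h * H) * math.sin(omega)


def eigenvalues(h, H, count, step=1e-3):
    """First `count` eigenvalues lambda = omega^2 (h, H > 0 assumed), by scan plus bisection."""
    roots = []
    a = 1e-6                       # skip the trivial zero of char_fn at omega = 0
    fa = char_fn(a, h, H)
    while len(roots) < count:
        b = a + step
        fb = char_fn(b, h, H)
        if fa * fb < 0:            # sign change: a root lies in (a, b)
            lo, hi, flo = a, b, fa
            for _ in range(60):    # plain bisection
                mid = 0.5 * (lo + hi)
                fmid = char_fn(mid, h, H)
                if flo * fmid <= 0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            roots.append((0.5 * (lo + hi)) ** 2)
        a, fa = b, fb
    return roots


lam = eigenvalues(1.0, 2.0, 6)        # lambda_n(h, H)
lam_prime = eigenvalues(3.0, 2.0, 5)  # lambda_n(h', H) with h' > h
for n in range(5):
    # each row should be strictly increasing, as in (10.8.13)
    print(n + 1, lam[n], lam_prime[n], lam[n + 1])
```

Each printed row lists $\lambda_n(h,H)$, $\lambda_n(h',H)$, $\lambda_{n+1}(h,H)$; the scan step is a convenience choice and would need refining if two roots were closer than the step.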
Exercises 10.8
1. Derive the expression (10.8.11) for the functional $J'(v)$.
2. If $(\lambda'_m)_1^\infty$ are the eigenvalues of (10.1.1) subject to (10.8.10), i.e., $\lambda'_m = \lambda_m(h',H)$, show that
$$1 - (h'-h)\sum_{m=1}^{\infty}\frac{v_m^2(0)}{\lambda - \lambda_m} = \prod_{m=1}^{\infty}\left(\frac{\lambda - \lambda'_m}{\lambda - \lambda_m}\right)$$
and hence deduce that
$$(h'-h)\,v_n^2(0) = (\lambda'_n - \lambda_n)\prod_{m=1}^{\infty}{}'\left(\frac{\lambda_n - \lambda'_m}{\lambda_n - \lambda_m}\right),$$
where $'$ denotes $m \ne n$.
3. How should the infinite product be interpreted so that, with $h' > h$, the interlacing (10.8.13), i.e., $\lambda_1 < \lambda'_1 < \lambda_2 < \cdots$, yields positive values of $v_m^2(0)$?

These examples show that knowing $(\lambda_n, \lambda'_n)_1^\infty$ we may compute the so-called norming constants $(\rho_n)_1^\infty = (v_n^2(0))_1^\infty$; conversely, knowing $(\lambda_n, \rho_n)_1^\infty$, we may compute $(\lambda'_n)_1^\infty$. See Elhay, Gladwell, Golub and Ram (1999) [85] for further discussion of eigenvector-eigenvalue relations like (10.8.12).
10.9
Asymptotic behaviour of eigenvalues and eigenfunctions
For the solution of inverse problems in Chapter 11 we shall need to know the asymptotic behaviour of the eigenvalues $\lambda_n$, the eigenfunctions $y_n(x)$ and the norming constants for large $n$. To examine this behaviour it is convenient to suppose that $\rho(x)$ in equation (10.1.1), or $A(x)$ in equation (10.1.3), is sufficiently smooth that equation (10.1.1) or (10.1.3) may be transformed to the Sturm-Liouville form (10.1.11) with $q(x) \in C[0,\pi]$. We now use the numbering convention S described in Section 10.1. First, we need an existence and uniqueness theorem. This is provided by Titchmarsh (1962) [323].

Theorem 10.9.1 If $q(x) \in C[0,\pi]$ then, for any $\alpha$ there exists a unique solution $y(x,\lambda)$ of (10.1.14) such that $y(0,\lambda) = \sin\alpha$, $y'(0,\lambda) = -\cos\alpha$. For any fixed $x \in [0,\pi]$, $y(x,\lambda)$ is an entire function of $\lambda$.

[Note: Here $\lambda$ is taken to be a complex variable; an entire function of a complex variable is one that is analytic throughout the finite $\lambda$-plane, so that in particular it has no poles there.]

On the basis of this theorem we denote the solution of
$$y''(x) + (\lambda - q(x))\,y(x) = 0 \tag{10.9.1}$$
satisfying the condition
$$\phi(0,\lambda) = 1, \qquad \phi'(0,\lambda) = h \tag{10.9.2}$$
by $\phi(x,\lambda)$. We assume that $h$ is finite and that $q(x) \in C[0,\pi]$. Write $\lambda = \omega^2$, then (10.9.1) may be written
$$y''(x) + \omega^2 y(x) = q(x)\,y(x).$$
Treating the right hand side as a forcing function, we may use the so-called Duhamel solution
$$\phi(x,\lambda) = A\cos\omega x + B\sin\omega x + \frac{1}{\omega}\int_0^x \sin\omega(x-t)\,q(t)\,\phi(t,\lambda)\,dt, \tag{10.9.3}$$
where $A = 1$, $B = h/\omega$. In this equation we can treat $\omega$ as a complex variable and can obtain an estimate for $\phi(x,\lambda)$ for large $|\omega|$:

Lemma 10.9.1 Let $\omega = \sigma + i\tau$. Then there exists $\omega_0 > 0$ such that for $|\omega| > \omega_0$
$$\phi(x,\lambda) = \cos\omega x + O\!\left(\frac{\exp(|\tau|x)}{|\omega|}\right) \tag{10.9.4}$$
uniformly with respect to $x$ in $[0,\pi]$.

Proof. Put $\phi(x,\lambda) = \exp(|\tau|x)\,f(x)$, then it follows from (10.9.3) that
$$f(x) = (\cos\omega x + h\omega^{-1}\sin\omega x)\exp(-|\tau|x) + \omega^{-1}\int_0^x \sin\omega(x-t)\,\exp(-|\tau|(x-t))\,q(t)\,f(t)\,dt. \tag{10.9.5}$$
Let $M = \max_{0\le x\le\pi}|f(x)|$, then equation (10.9.5) gives
$$M \le 1 + \frac{|h|}{|\omega|} + \frac{M}{|\omega|}\int_0^\pi |q(t)|\,dt.$$
Thus
$$M \le \left(1 + \frac{|h|}{|\omega|}\right)\Big/\left(1 - \frac{1}{|\omega|}\int_0^\pi |q(t)|\,dt\right)$$
provided that the denominator is positive, that is, provided that
$$|\omega| > \int_0^\pi |q(t)|\,dt.$$
For such $\omega$, $|\phi(x,\lambda)| \le M\exp(|\tau|x)$ so that on substituting this into the integral (10.9.3) we find (10.9.4).

We may now use the estimate (10.9.4) to estimate the eigenvalues of (10.9.1) subject to
$$y'(0) - hy(0) = 0 = y'(\pi) + Hy(\pi); \tag{10.9.6}$$
we assume that $H$, like $h$, is finite. In Section 10.1 we showed that the eigenvalues are real; we may therefore take $\tau = 0$ in (10.9.4) and find $\phi(x,\lambda) = \cos\omega x + O(\omega^{-1})$. The eigenvalues are the solutions of
$$\phi'(\pi,\lambda) + H\phi(\pi,\lambda) = 0 \tag{10.9.7}$$
which for large $|\omega|$ becomes
$$-\omega\sin\omega\pi + O(1) = 0, \tag{10.9.8}$$
which clearly has solutions near to integers for large $\omega$. There is in fact only one solution near any large integer $n$ for, on differentiating (10.9.8) with respect to $\omega$ (which is justified because (10.9.8) is actually (10.9.7), which is an analytic function of $\omega$), we find $-\omega\pi\cos\omega\pi + O(1)$, which is not zero near a large integer. We conclude that the eigenvalues, which are arranged in the order $\lambda_0 < \lambda_1 < \lambda_2 < \cdots$, eventually become positive and near the square of an integer. To obtain a precise estimate of the eigenvalues we use

Rouché's Theorem If $f(z)$ and $g(z)$ are analytic within and on a closed contour $C$ and $|g(z)| < |f(z)|$ on $C$, then $f(z)$ and $f(z) + g(z)$ have the same number of zeros inside $C$.

To apply this theorem we take $f(\omega) = -\omega\sin\omega\pi$, $f(\omega) + g(\omega) = \phi'(\pi,\lambda) + H\phi(\pi,\lambda)$, and take $C$ to be a circle with centre $O$, radius $N + \frac12$, in the $\omega$-plane. Then for large enough $N$, $|g(\omega)| < |f(\omega)|$ on $C$, so that $f(\omega)$ and $f(\omega) + g(\omega)$ have the same number of zeros inside $C$. The eigenvalues $\lambda$ are real, and both $f(\omega)$ and $f(\omega) + g(\omega)$ are even functions of $\omega$. This means that the zeros, $\omega$, will lie on the real axis, $\omega = \pm\sqrt{\lambda}$ if $\lambda \ge 0$, or on the imaginary axis, $\omega = \pm i\sqrt{|\lambda|}$ if $\lambda < 0$. The number of eigenvalues is therefore $\frac12$(number of zeros of $f(\omega) + g(\omega)$) $= \frac12$(number of zeros of $f(\omega)$). But the zeros of $f(\omega)$ are $\pm 0, \pm 1, \ldots, \pm N$; there are $2N + 2$, so that there are $N + 1$ eigenvalues inside $C$. We conclude that
$$\omega_n = n + O(1). \tag{10.9.9}$$
We may now make this estimate more precise by substituting (10.9.9) in (10.9.8). Put $\omega_n = n + \delta_n$, then
$$(n + \delta_n)\sin(\pi\delta_n) + O(1) = 0$$
so that $\sin(\pi\delta_n) = O(n^{-1})$, i.e., $\delta_n = O(n^{-1})$. This means that, for large $n$,
$$\sqrt{\lambda_n} \equiv \omega_n = n + O(n^{-1}). \tag{10.9.10}$$
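We continue to examine this estimate. We can write (10.9.3) and its derivative as
$$\phi(x,\lambda) = \cos\omega x\,\{1 - \omega^{-1}q_1(x)\} + \omega^{-1}\sin\omega x\,\{h + q_2(x)\}, \tag{10.9.11}$$
$$\phi'(x,\lambda) = \cos\omega x\,\{h + q_2(x)\} - \omega\sin\omega x\,\{1 - \omega^{-1}q_1(x)\}, \tag{10.9.12}$$
where
$$q_1(x) = \int_0^x \sin\omega t\;q(t)\,\phi(t,\lambda)\,dt, \tag{10.9.13}$$
$$q_2(x) = \int_0^x \cos\omega t\;q(t)\,\phi(t,\lambda)\,dt. \tag{10.9.14}$$
Thus
$$q_1(x) = o(1), \qquad q_2(x) = \frac12\int_0^x q(t)\,dt + o(1) \tag{10.9.15}$$
and
$$\phi(x,\lambda) = \cos\omega x + O(\omega^{-1}), \tag{10.9.16}$$
$$\phi'(x,\lambda) = -\omega\sin\omega x + \left\{h + \frac12\int_0^x q(t)\,dt\right\}\cos\omega x + o(1), \tag{10.9.17}$$
so that (10.9.7) may be written
$$\pi c\cos\omega\pi - \omega\sin\omega\pi + o(1) = 0, \tag{10.9.18}$$
where
$$c = \frac{1}{\pi}\left(h + H + \frac12\int_0^\pi q(t)\,dt\right). \tag{10.9.19}$$
Equation (10.9.18) gives $\tan\omega\pi = \pi c\,\omega^{-1} + o(\omega^{-1})$ so that on putting $\omega_n = n + \delta_n$ as before, we find
$$\tan\pi\delta_n = \pi c\,n^{-1} + o(n^{-1}),$$
$$\delta_n = c\,n^{-1} + o(n^{-1}),$$
$$\sqrt{\lambda_n} = \omega_n = n + c\,n^{-1} + o(n^{-1}). \tag{10.9.20}$$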
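The estimate (10.9.20) can be illustrated in a case where the eigenvalue equation is explicit. The sketch below assumes $q(x) \equiv 0$ on $[0,\pi]$ and positive $h$, $H$ (assumptions made only for this example), so that $\phi = \cos\omega x + (h/\omega)\sin\omega x$, the condition (10.9.7) multiplied by $\omega$ reads $(h+H)\,\omega\cos\omega\pi - (\omega^2 - hH)\sin\omega\pi = 0$, and $c = (h+H)/\pi$. The names `char_fn` and `omega_n` are hypothetical.

```python
import math


def char_fn(omega, h, H):
    """omega * (phi'(pi) + H phi(pi)) for q = 0 and phi = cos(omega x) + (h/omega) sin(omega x)."""
    return ((h + H) * omega * math.cos(omega * math.pi)
            - (omega**2 - h * H) * math.sin(omega * math.pi))


def omega_n(n, h, H):
    """Root of char_fn in (n, n + 1/2), found by bisection (n >= 1, h + H > 0 assumed)."""
    lo, hi = float(n), n + 0.5
    f_lo = char_fn(lo, h, H)
    if f_lo * char_fn(hi, h, H) >= 0:
        raise ValueError("no sign change in (n, n + 1/2); choose a larger n")
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        f_mid = char_fn(mid, h, H)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


h, H = 1.0, 2.0
c = (h + H) / math.pi            # (10.9.19) with q = 0
for n in (10, 20, 40):
    w = omega_n(n, h, H)
    # n * (omega_n - n) should approach c as n grows
    print(n, w, n * (w - n), c)
```

The bracket $(n, n+\frac12)$ relies on $\tan\omega\pi = (h+H)\,\omega/(\omega^2 - hH)$ being positive once $\omega^2 > hH$; for other sign combinations of $h$ and $H$ a different bracket would be needed.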
We now consider the asymptotic form of the eigenfunctions. Equations (10.9.11), (10.9.15) give
$$\phi(x,\lambda) = \cos\omega x + h\,\omega^{-1}\sin\omega x + \frac12\,\omega^{-1}\sin\omega x\int_0^x q(t)\,dt + o(\omega^{-1}).$$
Substituting for $\omega_n$ from (10.9.20), we find
$$\phi(x,\lambda_n) = \cos nx - c\,x\,n^{-1}\sin nx + h\,n^{-1}\sin nx + \tfrac12 n^{-1}\sin nx\int_0^x q(t)\,dt + o(n^{-1}) = \cos nx + n^{-1}\beta(x)\sin nx + o(n^{-1}), \tag{10.9.21}$$
where
$$\beta(x) = h - c\,x + \frac12\int_0^x q(t)\,dt. \tag{10.9.22}$$
To derive the asymptotic expression for the normalised eigenfunctions, we compute the integral
$$\alpha_n^2 = \int_0^\pi \phi^2(x,\lambda_n)\,dx = \int_0^\pi \{\cos^2 nx + n^{-1}\beta(x)\sin 2nx\}\,dx + o(n^{-1}).$$
Since $\beta(x)$ is differentiable,
$$\int_0^\pi \beta(x)\sin 2nx\,dx = O(n^{-1})$$
so that
$$\alpha_n^2 = \frac{\pi}{2} + o(n^{-1}) \tag{10.9.23}$$
and the normalised eigenfunction is
$$y_n(x) = \frac{\phi(x,\lambda_n)}{\alpha_n} = \sqrt{\frac{2}{\pi}}\,\{\cos nx + n^{-1}\beta(x)\sin nx\} + o(n^{-1}). \tag{10.9.24}$$
So far we have assumed only that $q(x)$ is continuous. If we assume that $q(x)$ has a bounded derivative, then the $o(1)$ terms in (10.9.15) are $O(\omega^{-1})$; for example
$$\int_0^x \sin 2\omega t\;q(t)\,dt = \left[-\frac{\cos 2\omega t}{2\omega}\,q(t)\right]_0^x + \frac{1}{2\omega}\int_0^x \cos 2\omega t\;q'(t)\,dt = O(\omega^{-1}).$$
In this case the terms $o(1)$, $o(n^{-1})$ in equations (10.9.17)-(10.9.24) may be replaced by $O(n^{-1})$ and $O(n^{-2})$ respectively.

Now consider the case in which $h = \infty$, $H$ is finite. The end condition at $x = 0$ is $y(0) = 0$, and the solution of (10.9.1) satisfying the condition
$$\psi(0,\lambda) = 0, \qquad \psi'(0,\lambda) = 1, \tag{10.9.25}$$
is
$$\psi(x,\lambda) = \omega^{-1}\sin\omega x + \omega^{-1}\int_0^x \sin\omega(x-t)\,q(t)\,\psi(t,\lambda)\,dt, \tag{10.9.26}$$
and we can show as before (see Ex. 10.9.1) that
$$\psi(x,\lambda) = \omega^{-1}\sin\omega x + O(\omega^{-2}), \tag{10.9.27}$$
$$\psi'(x,\lambda) = \cos\omega x + O(\omega^{-1}). \tag{10.9.28}$$
This means that the second end condition, (10.9.7), has the form
$$\cos\omega\pi + O(\omega^{-1}) = 0,$$
which has solutions near $n + \frac12$:
$$\omega_n = n + \frac12 + \delta_n. \tag{10.9.29}$$
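We write $\psi(x,\lambda)$ and $\psi'(x,\lambda)$ as before:
$$\psi(x,\lambda) = \omega^{-1}\sin\omega x\,\{1 + q_2(x)\} - \omega^{-1}\cos\omega x\;q_1(x), \tag{10.9.30}$$
$$\psi'(x,\lambda) = \cos\omega x\,\{1 + q_2(x)\} + \sin\omega x\;q_1(x), \tag{10.9.31}$$
where
$$q_1(x) = \int_0^x \sin\omega t\;q(t)\,\psi(t,\lambda)\,dt, \tag{10.9.32}$$
$$q_2(x) = \int_0^x \cos\omega t\;q(t)\,\psi(t,\lambda)\,dt. \tag{10.9.33}$$
Since $\psi(t,\lambda)$ has the form (10.9.27), we have
$$q_1(x) = \frac12\,\omega^{-1}\int_0^x q(t)\,dt + o(\omega^{-1}), \tag{10.9.34}$$
$$q_2(x) = o(\omega^{-1}) \tag{10.9.35}$$
and
$$\psi'(\pi,\lambda) + H\psi(\pi,\lambda) = \cos\omega\pi + \omega^{-1}\left\{H + \frac12\int_0^\pi q(t)\,dt\right\}\sin\omega\pi + o(\omega^{-1}). \tag{10.9.36}$$
Putting $\omega = n + \frac12 + \delta_n$ we find, as before, that
$$\omega_n = n + \frac12 + \frac{c}{n + \frac12} + o(n^{-1}), \tag{10.9.37}$$
where
$$c = \frac{1}{\pi}\left(H + \frac12\int_0^\pi q(t)\,dt\right). \tag{10.9.38}$$
Similarly, if $h$ is finite and $H = \infty$, then
$$\omega_n = n + \frac12 + \frac{c}{n + \frac12} + o(n^{-1}), \tag{10.9.39}$$
where
$$c = \frac{1}{\pi}\left(h + \frac12\int_0^\pi q(t)\,dt\right). \tag{10.9.40}$$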
Finally, consider the case $h = \infty$, $H = \infty$, so that the end conditions are the Dirichlet conditions
$$\psi(0,\lambda) = 0 = \psi(\pi,\lambda).$$
Substituting from (10.9.27) we find that the second condition is
$$\omega^{-1}\sin\omega\pi + O(\omega^{-2}) = 0.$$
For large $N$, there are as many zeros inside the circle of radius $N + \frac12$ as there are zeros of $\omega^{-1}\sin\omega\pi$; there are $2N$ such zeros: $\pm 1, \pm 2, \ldots, \pm N$. Thus $\omega_n = n + 1 + \delta_n$ and we find, as before, that
$$\omega_n = n + 1 + \frac{c}{n+1} + o(n^{-1}), \tag{10.9.41}$$
where
$$c = \frac{1}{2\pi}\int_0^\pi q(t)\,dt. \tag{10.9.42}$$
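A quick sanity check of (10.9.41), (10.9.42) is possible when the potential is constant, because the Dirichlet eigenvalues are then known exactly. The sketch below assumes $q(x) \equiv q_0$ (an illustrative choice, not a case treated in the text), for which $y_n = \sin((n+1)x)$, $\lambda_n = (n+1)^2 + q_0$ and $c = q_0/2$. The names `omega_exact` and `first_correction` are hypothetical.

```python
import math


def omega_exact(n, q0):
    """Exact omega_n = sqrt(lambda_n) for y'' + (lambda - q0) y = 0, y(0) = 0 = y(pi)."""
    return math.sqrt((n + 1) ** 2 + q0)


def first_correction(n, q0):
    """(n + 1) * (omega_n - (n + 1)), which (10.9.41) predicts tends to c."""
    return (n + 1) * (omega_exact(n, q0) - (n + 1))


q0 = 3.0
c = q0 / 2.0                      # (10.9.42): (1 / (2 pi)) * integral of q0 over [0, pi]
for n in (10, 100, 1000):
    print(n, first_correction(n, q0), c)
```

In this special case the remainder is in fact $-q_0^2/(8(n+1)^3)$ to leading order, which is smaller than the general estimate requires.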
Again, if $q(x)$ has a bounded derivative, then the terms $o(n^{-1})$ in (10.9.37), (10.9.39), (10.9.41) may be replaced by $o(n^{-2})$. [Note: there are several small errors in Levitan and Sargsjan (1991) [212], as there undoubtedly are in this book; $n$ in their equation (2.19) in Section 1.2.4 should be $n + 1$.]

A historical note is in order. In the many papers on asymptotic estimates, many different assumptions are made regarding the smoothness of $q(x)$: it is continuous; it has a bounded derivative; it has a piecewise continuous derivative; it has a continuous derivative; etc. It is known that if $q(x)$ is continuous it need not have a derivative at all; there is a pathological function that is continuous in $[0,\pi]$ but is differentiable nowhere. However, in the older treatments, e.g., Ince (1927) [185], and some of the Soviet literature, it is assumed implicitly that if $q(x)$ is said to be continuous, then it has a derivative but that this derivative is not necessarily continuous; it is piecewise continuous. Similarly, if $q(x)$ is said to have $r$ continuous derivatives then it has a piecewise continuous $(r+1)$th derivative.

One of the most extensive studies of asymptotic estimates of the Sturm-Liouville spectrum was carried out by Hochstadt (1961) [172], who used a variant of the WKB method. He supposes that the mean value of $q(x)$ is zero. Equation (10.9.1) may be reduced to this form by writing it as
$$y''(x) + (\omega^{*2} - q^*(x))\,y(x) = 0, \tag{10.9.43}$$
where
$$\omega^{*2} = \omega^2 - \bar q, \qquad q^*(x) = q(x) - \bar q, \qquad \bar q = \frac{1}{\pi}\int_0^\pi q(x)\,dx. \tag{10.9.44}$$
When $h$, $H$ are finite, and $q(x)$ is twice continuously differentiable, he shows that
$$(\omega_n^2 - \bar q)^{1/2} = n + b_0 n^{-1} + b_1 n^{-3} + O(n^{-4}), \tag{10.9.45}$$
where
$$b_0 = \frac{h+H}{\pi}, \qquad b_1 = d_1 - d_2, \tag{10.9.46}$$
$$d_1 = \frac{1}{8\pi}\left\{\int_0^\pi [q^*(t)]^2\,dt + q'(\pi) - q'(0) + 4h\,q^*(0) + 4H\,q^*(\pi)\right\}, \tag{10.9.47}$$
$$d_2 = \left(\frac{h+H}{\pi}\right)^2 + \frac{1}{3\pi}\left(h^3 + H^3\right). \tag{10.9.48}$$
Note that $d_1 = 0$ when $q(x) = \text{const}$, i.e., $q^*(x) = 0$. Hochstadt also considered the various special cases in which $h$ or $H$ are infinite. See also Fix (1967) [89], Pöschel and Trubowitz (1987) [269] and Rundell (1997) [294].

Equation (10.9.45) may be written
$$\omega_n = n + a_0 n^{-1} + a_1 n^{-3} + O(n^{-4}), \tag{10.9.49}$$
where
$$a_0 = \frac{1}{\pi}\left(h + H + \frac12\int_0^\pi q(t)\,dt\right) = c, \tag{10.9.50}$$
$$a_1 = b_1 - \frac18\,\bar q^{\,2} - \frac12\,b_0\,\bar q, \tag{10.9.51}$$
where $b_1$ is given by equation (10.9.46). Equation (10.9.49) gives
$$\lambda_n = n^2 + 2a_0 + c_0 n^{-2} + O(n^{-3}), \tag{10.9.52}$$
where $c_0 = a_0^2 + 2a_1$. Equation (10.9.23) gives a first asymptotic estimate of the so-called norming constants $\rho_n = y_n^2(0) = [\phi(0,\lambda_n)]^2/\alpha_n^2$:
$$\rho_n = \frac{2}{\pi} + O(n^{-2}). \tag{10.9.53}$$
Levitan (1987) [211] shows that if $q(x)$ is twice continuously differentiable then
$$\rho_n = \frac{2}{\pi}\left(1 + e_0 n^{-2} + O(n^{-3})\right). \tag{10.9.54}$$
Suppose $(\lambda_n)_0^\infty$, $(\mu_n)_0^\infty$ are the eigenvalues of (10.9.1) corresponding to the end conditions
$$y'(0) - h_1 y(0) = 0 = y'(\pi) + Hy(\pi), \tag{10.9.55}$$
$$y'(0) - h_2 y(0) = 0 = y'(\pi) + Hy(\pi), \tag{10.9.56}$$
so that
$$\lambda_n^{1/2} = n + a_0 n^{-1} + a_1 n^{-3} + O(n^{-4}),$$
$$\mu_n^{1/2} = n + a_0' n^{-1} + a_1' n^{-3} + O(n^{-4}).$$
After a long derivation based on Ex. 10.8.2, (with the change of numbering from V to S) Levitan shows that h0 = V
2 d0 d1 (d0 d00 )2 + d0 + 10 > 6 d0 d0
where
$$S = \lambda_0 - \mu_0 + \sum_{n=1}^{\infty}\left[(\lambda_n - \mu_n) - 2(a_0 - a_0')\right].$$
(10.9.57)
(10.9.58)
We note that (10.9.52) shows that this series converges. Note that equation (10.9.54) is important for stating the asymptotic form of $\rho_n$, and not for the actual expression (10.9.54) for $e_0$, i.e., as a way of finding $\rho_n$; the result in Ex. 10.8.2 (with the change of numbering from V to S) shows how to find $\rho_n = v_n^2(0)$ from two spectra. McNabb, Anderssen and Lapwood (1976) [233] discuss the asymptotics of the eigenvalues when there are one or two discontinuities in the potentials.

Exercises 10.9
1. Show that when $h = \infty$,
$$(\omega_n^2 - \bar q)^{1/2} = n + \frac12 + b_0\left(n + \frac12\right)^{-1} + b_1 n^{-3} + O(n^{-4}),$$
where $b_0 = H/\pi$, $b_1 = d_1 - d_2$,
$$d_1 = \frac{1}{8\pi}\left\{\int_0^\pi [q^*(t)]^2\,dt + 4H\,q^*(\pi) + q'(0) + q'(\pi)\right\},$$
$$d_2 = \left(\frac{H}{\pi}\right)^2 + \frac{1}{3\pi}H^3.$$
10.10
Impulse responses
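Consider a rod of density $\rho$, Young's modulus $E$, cross section $A(x)$ and length 1, free at $x = 0$ and fixed at $x = 1$. Suppose that at time $t' = 0$ the rod is at rest, and is then set in motion by a force $g(t')$ applied at the end $x = 0$. The governing equations are
$$\rho A(x)\frac{\partial^2 u}{\partial t'^2} = \frac{\partial}{\partial x}\left(EA(x)\frac{\partial u}{\partial x}\right), \tag{10.10.1}$$
$$EA\left.\frac{\partial u}{\partial x}\right|_{x=0} = -g(t'), \qquad t' > 0,$$
$$u(1,t') = 0, \qquad u(x,0) = 0 = \frac{\partial u}{\partial t'}(x,0), \qquad 0 \le x \le 1.$$
Instead of real time $t'$ we use the scaled time $t = ct'$, $c = \sqrt{E/\rho}$, and put $g(t) = g(t')/E$. We may replace the end force $g(t)$ by a distributed loading $g(x,t)$ over a small interval $(0,\varepsilon)$, so that
$$g(x,t) = \lim_{\varepsilon\to 0}\left(\frac{g(t)}{\varepsilon}\right) = g(t)\,\delta(x)$$
so that equation (10.10.1) becomes
$$A(x)\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(A(x)\frac{\partial u}{\partial x}\right) + g(t)\,\delta(x). \tag{10.10.2}$$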
Take the Laplace transform of this equation, and put
$$U(x,s) = \int_0^\infty \exp(-st)\,u(x,t)\,dt, \qquad G(s) = \int_0^\infty \exp(-st)\,g(t)\,dt,$$
to obtain
$$s^2 A(x)\,U(x,s) = (A(x)\,U')' + G(s)\,\delta(x). \tag{10.10.3}$$
The solution of this equation that satisfies the end condition $U(1,s) = 0$ may be written $U(x,s) = K(x,s)\,G(s)$, so that, by the convolution theorem,
$$u(x,t) = \int_0^t k(x,t-\tau)\,g(\tau)\,d\tau, \tag{10.10.4}$$
where $k(x,t)$ is the inverse Laplace transform of $K(x,s)$, i.e.,
$$k(x,t) = \frac{1}{2\pi i}\int_\Gamma K(x,s)\exp(st)\,ds,$$
where $\Gamma$ is a line $(\gamma - i\infty, \gamma + i\infty)$ lying to the right of the singularities of $K(x,s)$. The function $k(x,t)$ is called the (displacement) impulse response function. Clearly, when $g(\tau)$ is a unit impulse, i.e., $g(\tau) = \delta(\tau)$, then equation (10.10.4) shows that $u(x,t) = k(x,t)$.

If $(\omega_n^2, u_n(x))_1^\infty$ are the (scaled) eigenvalues and normalised eigenfunctions of the free-fixed rod, i.e.,
$$(A(x)\,u_n'(x))' + \omega_n^2 A(x)\,u_n(x) = 0, \qquad u_n'(0) = 0 = u_n(1),$$
then we may expand $U(x,s)$ in the form
$$U(x,s) = \sum_{n=1}^{\infty}\alpha_n(s)\,u_n(x)$$
so that equation (10.10.3) becomes
$$\sum_{n=1}^{\infty}(s^2 + \omega_n^2)\,A(x)\,\alpha_n(s)\,u_n(x) = G(s)\,\delta(x).$$
Multiplying through by $u_m(x)$ and integrating over $(0,1)$, using orthogonality and the result
$$\int_0^1 u_n(x)\,\delta(x)\,dx = u_n(0),$$
we obtain
$$(s^2 + \omega_n^2)\,\alpha_n(s) = G(s)\,u_n(0)$$
and
$$K(x,s) = \sum_{n=1}^{\infty}\frac{u_n(0)\,u_n(x)}{s^2 + \omega_n^2},$$
for which the inverse is
$$k(x,t) = \begin{cases}\displaystyle\sum_{n=1}^{\infty}\frac{u_n(0)\,u_n(x)}{\omega_n}\sin\omega_n t, & t > 0,\\[1ex] 0, & t \le 0.\end{cases} \tag{10.10.5}$$
For a uniform rod
$$u_n(x) = \sqrt{2}\cos\left[\frac{(2n-1)\pi x}{2}\right], \qquad \omega_n = \frac{(2n-1)\pi}{2},$$
so that
$$k(x,t) = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\cos\left[\frac{(2n-1)\pi x}{2}\right]\sin\left[\frac{(2n-1)\pi t}{2}\right]}{2n-1}, \tag{10.10.6}$$
i.e.,
$$k(x,t) = \frac12\left\{S\left[\frac{\pi(x+t)}{2}\right] - S\left[\frac{\pi(x-t)}{2}\right]\right\},$$
where
$$S(x) = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)x}{2n-1}. \tag{10.10.7}$$
Now $S(x)$ is discontinuous at $0, \pm\pi, \pm 2\pi, \ldots$, and
$$S(x) = \operatorname{sign}(x), \qquad -\pi < x < \pi \tag{10.10.8}$$
(Gradshteyn and Ryzhik (1965), 1.442.1). From equation (10.10.8) we may deduce the behaviour of the rod subjected to an impulse at $t = 0$. Thus if $x > t$, then $x + t < 2x < 2$, so that
$$S\left[\frac{\pi(x+t)}{2}\right] = 1, \qquad S\left[\frac{\pi(x-t)}{2}\right] = 1,$$
and $k(x,t) = 0$. This may be interpreted as showing that the effect of the impulse moves along the rod with scaled speed 1, i.e., real speed $c$, and the rod is at rest for $x > t$. Analysis of the partial differential equation (10.10.2) shows that this result is true even when $A(x)$ is not uniform (Courant and Hilbert (1962)). For the uniform rod, behind the initial disturbance, i.e., for $x < t$, $x + t < 2$, the two values are $S[\pi(x+t)/2] = 1$ and $S[\pi(x-t)/2] = -1$, so that $k(x,t) = 1$. When the disturbance reaches the end $x = 1$ and starts to return we have
$$S\left[\frac{\pi(x+t)}{2}\right] = -1, \qquad S\left[\frac{\pi(x-t)}{2}\right] = -1,$$
so the unit step which had stretched from $x = 0$ to $x = 1$ is annihilated starting from $x = 1$. So the process continues indefinitely.

Sometimes it is convenient to use velocity and (scaled) stress as variables, i.e.,
$$v(x,t) = \frac{\partial u}{\partial t}, \qquad p(x,t) = A(x)\frac{\partial u}{\partial x}, \tag{10.10.9}$$
then equation (10.10.2) may be written
$$A(x)\frac{\partial v}{\partial t} = \frac{\partial p}{\partial x} + g(t)\,\delta(x), \qquad A(x)\frac{\partial v}{\partial x} = \frac{\partial p}{\partial t}, \tag{10.10.10}$$
and the velocity $v(x,t)$ is given by
$$v(x,t) = \int_0^t \hat h(x,t-\tau)\,g(\tau)\,d\tau, \tag{10.10.11}$$
where
$$\hat h(x,t) = \frac{\partial k}{\partial t}(x,t)$$
must be interpreted as a generalised function. Equation (10.10.5) shows that
$$\hat h(x,t) = \begin{cases}\displaystyle\sum_{n=1}^{\infty}u_n(0)\,u_n(x)\cos(\omega_n t), & t \ge 0,\\[1ex] 0, & t < 0,\end{cases} \tag{10.10.12}$$
and thus
$$\hat h(0,t) = \sum_{n=1}^{\infty}u_n^2(0)\cos(\omega_n t), \qquad t \ge 0. \tag{10.10.13}$$
For the uniform rod, therefore,
$$\hat h(0,t) = \begin{cases}\displaystyle 2\sum_{n=1}^{\infty}\cos\left[\frac{(2n-1)\pi t}{2}\right], & t \ge 0,\\[1ex] 0, & t < 0.\end{cases}$$
We note that (Gradshteyn and Ryzhik (1965), 1.442.1)
$$\int_0^t \hat h(0,\tau)\,d\tau = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin[(2n-1)\pi t/2]}{2n-1} = 1 \qquad (0 < t < 2),$$
so that for $0 < t < 2$,
$$\hat h(0,t) = \delta(t).$$
For larger values of $t$, $\hat h(0,t)$ may be evaluated by using the relation $\hat h(0,t+2) = -\hat h(0,t)$. For a non-uniform rod it can be shown that
$$\hat h(0,t) = \delta(t) + h(t), \tag{10.10.14}$$
where $h(t)$ is continuously differentiable. (See Ex. 10.10.2.)
Exercises 10.10
1. Show that $S(x)$ given in (10.10.6) satisfies
$$S(x + \pi) = -S(x), \qquad S(x + 2\pi) = S(x)$$
and hence show that
$$S(x) = (-1)^{|n|}, \qquad n\pi < x < (n+1)\pi.$$
2. Show that if the rod is such that its eigenvalues $\omega_n$ and eigenfunctions $u_n(x)$ satisfy
$$\omega_n = \frac{(2n-1)\pi}{2}, \qquad [u_n(0)]^2 = 2, \qquad n = N+1, \ldots$$
then its impulse response may be written in the form (10.10.14), where
$$h(t) = \sum_{n=1}^{N}\left\{[u_n(0)]^2\cos(\omega_n t) - 2\cos\left[\frac{(2n-1)\pi t}{2}\right]\right\}.$$
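The travelling-step picture of the impulse response of the uniform free-fixed rod can be seen directly from partial sums of the eigenfunction series $k(x,t) = \frac{4}{\pi}\sum_n \cos[(2n-1)\pi x/2]\,\sin[(2n-1)\pi t/2]/(2n-1)$. The sketch below compares a truncated sum with the closed form $\frac12\{S[\pi(x+t)/2] - S[\pi(x-t)/2]\}$, where $S$ is the square wave equal to $\operatorname{sign}(x)$ on $(-\pi,\pi)$ and of period $2\pi$. The function names, the truncation level and the sample points are illustrative assumptions; points on the wavefront itself are avoided because the truncated series converges slowly there.

```python
import math


def impulse_response_series(x, t, terms=4000):
    """Truncated eigenfunction series for k(x, t) of the uniform free-fixed rod."""
    total = 0.0
    for n in range(1, terms + 1):
        a = (2 * n - 1) * math.pi / 2          # omega_n
        total += math.cos(a * x) * math.sin(a * t) / (2 * n - 1)
    return 4.0 / math.pi * total


def square_wave(x):
    """S(x): +1 on (0, pi), -1 on (pi, 2 pi), period 2 pi (jump points not handled)."""
    y = math.fmod(x, 2 * math.pi)
    if y < 0:
        y += 2 * math.pi
    return 1.0 if y < math.pi else -1.0


def impulse_response_closed(x, t):
    """k(x, t) = (1/2) { S[pi (x + t) / 2] - S[pi (x - t) / 2] }."""
    return 0.5 * (square_wave(math.pi * (x + t) / 2) - square_wave(math.pi * (x - t) / 2))


# ahead of the front, behind the front, and after reflection at the fixed end
for x, t in [(0.8, 0.3), (0.3, 0.8), (0.5, 1.8)]:
    print(x, t, impulse_response_series(x, t), impulse_response_closed(x, t))
```

The three sample points correspond to $x > t$ (rod still at rest), $x < t$ with $x + t < 2$ (behind the front), and $x + t > 2$ (after the reflected front has passed).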
Chapter 11
Inversion of Continuous Second-Order Systems

Certain authors, speaking of their works, say, "My book," "My commentary," "My history," etc. They resemble middle class people who have a house of their own, and always have "My house" on their tongue. They would do better to say, "Our book," "Our commentary," "Our history," etc., because there is in them usually more of other people's than their own.
Pascal's Pensées, 43
11.1
A historical review
It was shown in Section 10.1 that the Sturm-Liouville equation can appear in three different forms. The one preferred by pure mathematicians seems to be (10.1.14):
$$y''(x) + [\lambda - q(x)]\,y(x) = 0. \tag{11.1.1}$$
In vibration problems, the equation
$$u''(x) + \lambda\rho^2(x)\,u(x) = 0 \tag{11.1.2}$$
appears in the transverse vibrations of a taut string, while
$$(A(x)\,v'(x))' + \lambda A(x)\,v(x) = 0 \tag{11.1.3}$$
occurs in the longitudinal or torsional vibrations of a thin straight rod of cross section $A(x)$. As with all inverse problems (see Parker (1977) [263], the introduction of Newton (1983) [249], Sabatier (1978) [295], Sabatier (1985) [298], Groetsch (1993) [155], Groetsch (2000) [156] or Kirsch (1996) [193]) there are three aspects to the inverse problem:
i) existence, i.e., mathematically, is there a function $q(x)$, $\rho(x)$ or $A(x)$, or physically, is there a vibrating system, with the required properties?
ii) uniqueness, i.e., is there only one system with these properties?
iii) construction, i.e., how can we construct one or more systems from the given data?
These questions, which are closely related, have been gradually elucidated over the past seventy years. In this chapter we will use the numbering convention S given in Section 10.1, unless we state otherwise.

Ambarzumian (1929) [3] considered the question of uniqueness in a special case. He considered equation (11.1.1) with the symmetrical end conditions
$$y'(0) = 0 = y'(\pi), \tag{11.1.4}$$
and the equation $y''(x) + \lambda y(x) = 0$ with the same end conditions. He showed that if the two systems have the same spectrum $(\lambda_n)_0^\infty$, where $\lambda_n = n^2$, then $q(x)$ is identically zero. Note that he considered symmetrical end conditions, so that only one spectrum is needed. His proof has a defect in that it relies on a perturbation method which requires $q(x)$ to be small.

The fundamental paper on the inverse problem for the equation (11.1.1) is Borg (1946) [39]. He showed that if $q(x)$ is symmetric, i.e.,
$$q(x) = q(\pi - x), \tag{11.1.5}$$
then the spectrum of equation (11.1.1) corresponding to the end conditions (11.1.4), or to the (Dirichlet) end conditions
$$y(0) = 0 = y(\pi), \tag{11.1.6}$$
determines $q(x)$ uniquely. This validates Ambarzumian's earlier result. (See also Hochstadt and Kim (1970) [174].)

It is important to bear in mind a fundamental feature of equations (11.1.1)-(11.1.3): if the system is symmetrical about the mid-point $x = 1/2$, and the end conditions are symmetrical also, then in general one spectrum corresponding to one set of end conditions is sufficient to determine it. If it is not symmetrical then two spectra, corresponding to two different end conditions at one end, are required. In this connection, Gottlieb (1986) [138] constructs some interesting counterexamples. Recall that a uniform string fixed at both ends, i.e., a violin string, has natural frequencies $\omega_i$ that are all multiples of $\omega_1$; we say that the spectrum (in $\omega$, not $\lambda$) is harmonic. It is this property that makes the violin a musical instrument: the overtones of a string are all harmonics of the fundamental tone. A harmonic spectrum is a special case of a uniformly spaced spectrum; here $\omega_{i+1} - \omega_i = \text{constant}$. The uniform string is special in the sense that it has a harmonic spectrum $\omega_i = (i+1)\pi$, $i = 0, 1, 2, \ldots$ for fixed-fixed ends, and a harmonic spectrum $\omega_i = (i + 1/2)\pi$, $i = 0, 1, 2, \ldots$ for fixed-free ends (see the note on a free end at the beginning of Section 10.1). Gottlieb (1986) [138] constructs piecewise uniform strings with one step that have one harmonic spectrum, either for fixed-fixed or fixed-free ends (see Section 12.4). In each case the other spectrum is uniformly spaced but not harmonic. His analysis thus highlights the need to consider two spectra to ensure uniqueness.

Borg also considered equation (11.1.1) for two sets of end conditions; one set
$$\cos\alpha\;y(0) + \sin\alpha\;y'(0) = 0 = \cos\beta\;y(\pi) + \sin\beta\;y'(\pi), \tag{11.1.7}$$
and the other
$$\cos\alpha\;y(0) + \sin\alpha\;y'(0) = 0 = \cos\gamma\;y(\pi) + \sin\gamma\;y'(\pi), \tag{11.1.8}$$
that differ only at the end $x = \pi$, i.e., $\beta \ne \gamma$. He showed that if $\sin\alpha = 0 = \sin\gamma$, so that (11.1.8) is equivalent to (11.1.6), and $\sin\beta \ne 0$, then two interlacing spectra (as in Section 10.8) determine a unique nonsymmetric function $q(x)$. If $\sin\alpha\,\sin\gamma \ne 0$, then $q(x)$ is uniquely determined by two spectra that are short of the first eigenvalue $\lambda_0$ of the first spectrum corresponding to (11.1.7).

Borg's results were extended and simplified by Levinson (1949) [207]. He proved that if the spectra of (11.1.1) for each of the end conditions (11.1.7), (11.1.8) are given, and if $\sin(\beta - \gamma) \ne 0$, that is if (11.1.7), (11.1.8) are not identical, then $q(x)$ is uniquely determined. (Remember that this means that there is not more than one $q(x)$, not that there is at least one $q(x)$.) This result was extended by Hochstadt (1973) [175], Hochstadt (1975a) [177], who considered the extent to which $q(x)$ was determined when some eigenvalues $\lambda_n$, $\mu_n$, corresponding respectively to the end conditions (11.1.7), (11.1.8), were unknown; see also Hald (1978a) [162], Barcilon (1974c) [16] and further references given there. For the symmetrical case, Levinson showed that if it is known that (11.1.5) holds almost everywhere in $(0,\pi)$, and if $\alpha + \beta = \pi$, i.e., $h = H$ in (11.1.2), then $q(x)$ is uniquely determined by the spectrum for the end conditions (11.1.7). This result includes Borg's results for (11.1.4) ($h = 0 = H$) and (11.1.6) ($h = \infty = H$) as special cases.

Marchenko (1950) [218], Marchenko (1952) [219], Marchenko (1953) [220] made these results a little sharper. He showed that if $q(x) \in L_1(0,\pi)$ and $\sin(\beta - \gamma) \ne 0$, then the spectra of (11.1.1) corresponding to (11.1.7), (11.1.8) determine $q(x)$ and $\tan\alpha$, $\tan\beta$, $\tan\gamma$ uniquely. A full account of the uniqueness theorem may be found in Levinson (1949) [209].
Further results may be found in Hochstadt (1973) [175], Hochstadt (1975b) [178], Hochstadt (1976) [179], Hochstadt (1977) [180], Hochstadt and Lieberman (1978) [181], Sabatier (1979a) [296], Sabatier (1979b) [297], Hald (1984) [165], Seidman (1985) [301], McLaughlin (1986) [228], Levitan (1987) [211] and Kirsch (1996) [193].

These results are all concerned with uniqueness. Basically, they all state that it is not possible to find more than one function $q(x)$ corresponding to two spectra. However, it was shown in Chapter 10 that the eigenvalues of (11.1.1), (11.1.7), and of (11.1.1), (11.1.8) have a number of specific properties, e.g., they interlace, and they have the asymptotic form (10.9.22) if $h$, $H$ are finite, or one of the others listed in Section 10.9 if $h$ or $H$ is infinite. The question is therefore: what are sufficient conditions for two sets of numbers $(\lambda_n)_0^\infty$ and $(\mu_n)_0^\infty$ to be the spectra of equation (11.1.1) corresponding to two sets of end conditions, like (11.1.7), (11.1.8)? The conditions will, of course, depend on what conditions we demand of $q(x)$.

We note that, when viewed as a purely mathematical problem, this problem is very difficult, as an inspection of the literature will immediately verify. However, the difficulties arise because it is assumed that the data consist of two infinite sequences, either $(\lambda_n, \mu_n)_0^\infty$ or perhaps $(\lambda_n, \rho_n)_0^\infty$, where $(\rho_n)_0^\infty$ are the norming constants introduced in Section 10.8. In practical inverse vibration problems, it is not possible to measure more than a (small) finite number of frequencies. In that case we find that the sufficient conditions are that the eigenvalues are positive (they are the squares of the natural frequencies) and interlace as discussed in Section 10.8. We make a few remarks on the mathematical problem for the sake of completeness.

Levitan (1964b) [210] proved the following result. Let $(\lambda_n)_0^\infty$, $(\mu_n)_0^\infty$ be sets of real numbers satisfying
$$\lambda_0 < \mu_0 < \lambda_1 < \mu_1 < \cdots, \tag{11.1.9}$$
$$\lambda_n^{1/2} = n + a_0 n^{-1} + a_1 n^{-2} + O(n^{-3}), \tag{11.1.10}$$
$$\mu_n^{1/2} = n + a_0' n^{-1} + a_1' n^{-2} + O(n^{-3}), \tag{11.1.11}$$
where $a_0 \ne a_0'$. Then there exists an equation of the form (11.1.1) with a continuous real valued function $q(x)$ and real numbers $h$, $h'$, $H$ such that $(\lambda_n)_0^\infty$ is the spectrum of (11.1.1) subject to
$$y'(0) - hy(0) = 0 = y'(\pi) + Hy(\pi), \tag{11.1.12}$$
$(\mu_n)_0^\infty$ is the spectrum subject to
$$y'(0) - h'y(0) = 0 = y'(\pi) + Hy(\pi), \tag{11.1.13}$$
and
$$a_0' - a_0 = (h' - h)/\pi. \tag{11.1.14}$$
Note that the asymptotic form (11.1.10) was obtained in Section 10.9 by assuming that $q(x)$ had a bounded derivative, and that the explicit expressions for $a_1$ given in equations (10.9.51)-(10.9.53) were obtained under the assumption that $q(x)$ was twice continuously differentiable, although actually it would have been sufficient to assume that $q''(x)$ was, say, piecewise continuous, or even just bounded. We showed in Section 10.9 that if it is assumed only that $q(x)$ was continuous, then the asymptotic form of $\lambda_n^{1/2}$ was (10.9.20). Thus we note that the sufficient conditions (11.1.10), (11.1.11) are stronger than the necessary condition (10.9.20). As Levitan (1964b) [210] shows, there is a similar mismatch between necessary and sufficient conditions if it is required that $q(x)$ have, say, $r$ continuous derivatives. See Levitan and Sargsjan (1991) [212] and references given there.

Levitan's result is a refinement of results contained in the final section of Gel'fand and Levitan (1951) [100]. In that paper the data for the inverse problem are $(\lambda_n)_0^\infty$ and the norming constants $(\rho_n)_0^\infty$. They showed that if $\lambda_n^{1/2}$ had the asymptotic form (11.1.10), and $\rho_n$ had the asymptotic form (10.9.54), then there exists a continuous function $q(x)$ for which $(\lambda_n, \rho_n)_0^\infty$ are the spectral constants corresponding to (11.1.12). Note that there are various special cases of these results corresponding to $h$ or $H$ being infinite.

After these few remarks on sufficient conditions, we come to the third question: How does one construct $q(x)$ from the given data? Here the fundamental paper is Gel'fand and Levitan (1951) [100]. They show that $q(x)$ and the constants $h$, $H$ are uniquely determined from $(\lambda_n, \rho_n)_0^\infty$. They develop a procedure for reconstructing $q(x)$ based on an earlier paper by Marchenko (1950) [218]; we describe a modified form of this procedure in this chapter.

Three papers by Krein (1951a) [200], Krein (1951b) [201], Krein (1952) [202] considered the question of uniqueness, existence and reconstruction for the taut string (equation (11.1.2)). He used his theory of the extension of positive definite functions. His results were generally stated without proof, and his methods have been used only by a few later authors; see Gopinath and Sondhi (1971) [137], and Landau (1983) [204].

Gopinath and Sondhi (1970) [136], in considering the determination of the shape of the human vocal tract from acoustical measurements, encountered Webster's horn equation (10.10.10), and devised two methods for its inversion. The first is in the spirit of Gel'fand and Levitan, and can be replaced by the analysis of Section 11.6. The second is set in the time domain and relies on the impulse response described in Section 10.10. This formulation was improved and extended by Gopinath and Sondhi (1971) [137] and is described in Section 11.11. A recent review of the vocal tract inverse problem was given by Sondhi (1984) [309].
The interconnections between all the various procedures for the inversion of second-order problems have been analysed by Burridge (1980) [45]; he pays particular attention to the case in which the cross-sectional area function D({) is discontinuous. See also Hald (1984) [165]. One of the early strands of research into actually constructing the potential in a Sturm-Liouville problem, used some finite dierence/finite element approximation to the governing equation. One of the di!culties that had to be faced was that the eigenvalues derived from a discrete approximation diverge, with increasing mode number, from those predicted by the dierential equation. Paine and his colleagues made a detailed study of this problem. See Paine and de Hoog (1980) [258], Paine, de Hoog and Andersen (1981) [259], Paine (1982) [256], Paine (1984) [257], Andrew and Paine (1985) [10], Andrew and Paine (1986) [11]. A study of inverse problems for Sturm-Liouville systems modelled as discrete (Jacobi) systems was carried out by Andersson (1970) [5]; he did not have the results on inverse problems for Jacobi matrices at his disposal. See also Barcilon (1974a) [14]. Hald (1972) [159] made a detaled study of the problem,
later Hald (1977) [161] paid particular attention to the Sturm-Liouville problem with symmetric potential. Hald (1978b) [163] discussed the discrete system obtained by applying the Rayleigh-Ritz procedure to the continuous problem, and considered the limiting case in which the number of terms in the Fourier series expansions of $q(x)$ tends to infinity. A modern version of such an approximation procedure may be found in Section 11.9. Barcilon (1983) [22] attempted to derive the (continuous) density of the string from the known solution to the inverse problem for the discrete system, but his procedure does not lend itself to computation. A straightforward and comparatively simple solution of the inverse Sturm-Liouville system for a rod, using a piecewise uniform model, is given in Section 12.1.
11.2 Transformation operators
The fundamental step in the elucidation of all three aspects, uniqueness, existence and reconstruction, is the introduction of the Gel'fand-Levitan-Marchenko transformation operator. This operator relates the solution of one Sturm-Liouville equation to another. Consider two equations, a base equation
$$\phi''(x) + (\lambda - p(x))\phi(x) = 0, \qquad 0 \le x \le \pi, \tag{11.2.1}$$
subject to the single boundary condition
$$\phi'(0) - h\phi(0) = 0, \tag{11.2.2}$$
and another equation
$$\psi''(x) + (\lambda - q(x))\psi(x) = 0, \qquad 0 \le x \le \pi, \tag{11.2.3}$$
subject to another boundary condition
$$\psi'(0) - h'\psi(0) = 0. \tag{11.2.4}$$
We seek an operator of the form
$$\psi(x) = \phi(x) + \int_0^x K(x,y)\phi(y)\,dy, \tag{11.2.5}$$
that transforms a solution of (11.2.1), (11.2.2) into a solution of (11.2.3), (11.2.4). Differentiation of (11.2.5) gives
$$\psi'(x) = \phi'(x) + K(x,x)\phi(x) + \int_0^x K_x(x,y)\phi(y)\,dy,$$
where
$$K_x(x,y) = \frac{\partial K}{\partial x}(x,y).$$
A second differentiation gives
$$\psi''(x) = \phi''(x) + \frac{dK(x,x)}{dx}\phi(x) + K(x,x)\phi'(x) + K_x(x,x)\phi(x) + \int_0^x K_{xx}(x,y)\phi(y)\,dy.$$
This is the first term in (11.2.3). The last term is
$$-q(x)\psi(x) = -q(x)\phi(x) - \int_0^x q(x)K(x,y)\phi(y)\,dy.$$
This leaves the second term:
$$\lambda\psi(x) = \lambda\phi(x) + \lambda\int_0^x K(x,y)\phi(y)\,dy = \lambda\phi(x) + \int_0^x K(x,y)\{p(y)\phi(y) - \phi''(y)\}\,dy.$$
We evaluate the last integral in this expression by parts twice:
$$\int_0^x K(x,y)\phi''(y)\,dy = \big[K(x,y)\phi'(y) - K_y(x,y)\phi(y)\big]_0^x + \int_0^x K_{yy}(x,y)\phi(y)\,dy.$$
Collect the terms in these equations to write (11.2.3); the result is as follows:
$$\psi'' + (\lambda - q)\psi = \phi''(x) + (\lambda - q(x))\phi(x) + \frac{dK(x,x)}{dx}\phi(x) + K(x,x)\phi'(x) + K_x(x,x)\phi(x) - \big[K(x,y)\phi'(y) - K_y(x,y)\phi(y)\big]_{y=0}^{y=x} + \int_0^x \{K_{xx}(x,y) - K_{yy}(x,y) + (p(y) - q(x))K(x,y)\}\phi(y)\,dy.$$
Now use the facts that $\phi(x)$ satisfies (11.2.1), and that
$$K_x(x,x) + K_y(x,x) = \frac{dK(x,x)}{dx},$$
to obtain
$$\psi'' + (\lambda - q)\psi = \Big\{2\frac{dK(x,x)}{dx} + p(x) - q(x)\Big\}\phi(x) + K(x,0)\phi'(0) - K_y(x,0)\phi(0) + \int_0^x \{K_{xx}(x,y) - K_{yy}(x,y) + (p(y) - q(x))K(x,y)\}\phi(y)\,dy.$$
The right-hand side vanishes identically, so that (11.2.3) is satisfied, if we take
$$K_{xx}(x,y) - K_{yy}(x,y) + (p(y) - q(x))K(x,y) = 0, \qquad 0 \le y \le x \le \pi, \tag{11.2.6}$$
$$\frac{dK(x,x)}{dx} = \frac12(q(x) - p(x)), \qquad 0 \le x \le \pi, \tag{11.2.7}$$
$$K(x,0)\phi'(0) - K_y(x,0)\phi(0) = 0, \qquad 0 \le x \le \pi. \tag{11.2.8}$$
Now we examine the boundary conditions at $x = 0$. Clearly
$$\psi(0) = \phi(0), \qquad \psi'(0) = \phi'(0) + K(0,0)\phi(0). \tag{11.2.9}$$
If $h, h'$ are finite then $\phi(0) \ne 0$, $\phi'(0) = h\phi(0)$ imply
$$K_y(x,0) - hK(x,0) = 0, \qquad 0 \le x \le \pi, \tag{11.2.10}$$
and $\psi'(0) = (h + K(0,0))\psi(0)$, so that
$$K(0,0) = h' - h, \tag{11.2.11}$$
and hence, with (11.2.7),
$$K(x,x) = h' - h + \frac12\int_0^x (q(y) - p(y))\,dy. \tag{11.2.12}$$
Note that if $h = \infty$, so that $\phi(0) = 0$, then equation (11.2.8) implies
$$K(x,0) = 0, \qquad 0 \le x \le \pi, \tag{11.2.13}$$
and $\psi(x)$ satisfies $\psi(0) = 0$, i.e., $h' = \infty$. In that case equation (11.2.12) is replaced by
$$K(x,x) = \frac12\int_0^x (q(y) - p(y))\,dy. \tag{11.2.14}$$
With this kernel $K(x,y)$, the equation (11.2.5) transforms a solution of (11.2.1), (11.2.2) into a solution of (11.2.3), (11.2.4).
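The relations (11.2.7) and (11.2.12) between the diagonal values of the kernel and the two potentials can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the grid and the sample potentials are illustrative assumptions, and the quadrature and differentiation rules (trapezoidal rule, `np.gradient`) are simply convenient choices.

```python
import numpy as np

def kernel_diagonal(x, p, q, h, h_new):
    """K(x,x) = h' - h + (1/2) * integral_0^x (q - p) dt, as in (11.2.12).
    The integral is approximated by the cumulative trapezoidal rule."""
    d = 0.5 * (q - p)
    steps = 0.5 * (d[1:] + d[:-1]) * np.diff(x)
    return (h_new - h) + np.concatenate(([0.0], np.cumsum(steps)))

def potential_from_diagonal(x, p, k_diag):
    """Recover q from (11.2.7): q(x) = p(x) + 2 dK(x,x)/dx."""
    return p + 2.0 * np.gradient(k_diag, x)

# Hypothetical sample data: base potential p = 0, new potential q = cos 2x.
x = np.linspace(0.0, np.pi, 2001)
p = np.zeros_like(x)
q = np.cos(2.0 * x)
k_diag = kernel_diagonal(x, p, q, h=0.0, h_new=0.3)
q_back = potential_from_diagonal(x, p, k_diag)
print(np.max(np.abs(q_back - q)))   # small discretisation error
```

In an actual inversion the roles are reversed: $K(x,x)$ is computed first, and $q(x)$ and $h'$ are then read off from it; the sketch only confirms that the two formulas are mutually consistent.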
11.3 The hyperbolic equation for $K(x,y)$
The kernel $u(x,y) = K(x,y)$ satisfies the hyperbolic equation (11.2.6), i.e.,
$$u_{xx} - u_{yy} + (p(y) - q(x))u = 0, \qquad 0 \le y \le x \le \pi, \tag{11.3.1}$$
in the upper triangle OIC shown in Figure 11.3.1. The characteristics for this equation are the lines $x \pm y = \text{const}$. The kernel has the value (11.2.12), i.e.,
$$u(x,x) = h' - h + \frac12\int_0^x (q(t) - p(t))\,dt, \tag{11.3.2}$$
on the characteristic $x = y$, and satisfies the condition (11.2.10), i.e.,
$$u_y(x,0) - hu(x,0) = 0, \qquad 0 \le x \le \pi, \tag{11.3.3}$$
on the $x$-axis. First we discuss how $u(x,y)$ may be continued to the lower triangle OID so that the boundary condition (11.3.3) is satisfied. There are three cases:

i) $h = 0$. Now $u_y(x,0) = 0$ so that we continue $u(x,y)$ to the lower triangle as an even function of $y$, i.e., $u(x,y) = u(x,-y)$;
Figure 11.3.1 - $0 \le y \le x \le \pi$ in the upper triangle OIC.

then
$$u(x,-x) = u(x,x) = h' + \frac12\int_0^x (q(t) - p(t))\,dt.$$
ii) $h = \infty$. Now $u(x,0) = 0$ so that we continue $u(x,y)$ as an odd function of $y$, i.e., $u(x,y) = -u(x,-y)$ so that, according to (11.2.14),
$$u(x,x) = -u(x,-x) = \frac12\int_0^x (q(t) - p(t))\,dt.$$
iii) $h$ is finite and not zero. Define
$$u(x,y) = \exp(-hy)K(x,y), \tag{11.3.4}$$
then
$$u_y(x,0) = K_y(x,0) - hK(x,0) = 0.$$
This means that we should continue $u(x,y)$ as an even function of $y$. The values of $u(x,y)$ on the characteristics are
$$u(x,x) = u(x,-x) = \exp(-hx)K(x,x),$$
where $K(x,x)$ is given by (11.2.12). Since $K(x,y)$ satisfies (11.3.1), $u(x,y)$ satisfies
$$u_{xx} - u_{yy} - 2h\,\mathrm{sign}(y)\,u_y + (p(y) - q(x) - h^2)u = 0 \tag{11.3.5}$$
throughout the triangle OCD, i.e., $0 \le |y| \le x \le \pi$. Here
$$\mathrm{sign}(y) = \begin{cases} +1, & y > 0, \\ -1, & y < 0, \end{cases}$$
and $p(-y) = p(y)$.

The equations (11.3.1), (11.3.5) are hyperbolic partial differential equations. The existence and uniqueness properties of such equations are the subject matter of treatises on p.d.e.'s. In keeping with the philosophy of this book, we shall not assume that the reader is acquainted with these properties, and will derive them ab initio. There are two fundamental questions regarding the p.d.e.'s (11.3.1) and (11.3.5): what boundary data lead to a unique solution? How can we find this unique solution from the boundary data? It transpires that there are two kinds of suitable boundary data, giving rise to two problems:

The Goursat problem, in which $u$ is given on the characteristics $x = \pm y$.

The Cauchy problem, in which $u$ and $u_x$ are given on the side CD, i.e., on $x = \pi$, $-\pi \le y \le \pi$.

In both these cases we can reduce the solution of the p.d.e. to the solution of a Volterra integral equation, and we can show that this equation has a unique solution.

The Goursat problem

We first consider cases i) and ii); the governing equation is equation (11.3.1). Stokes' theorem in 2-D is
$$\iint_S \Big(\frac{\partial w_2}{\partial x} - \frac{\partial w_1}{\partial y}\Big)\,dx\,dy = \oint_\Gamma w_1\,dx + w_2\,dy, \tag{11.3.6}$$
where $\Gamma$ is the boundary of the region $S$, traversed counter-clockwise. Apply this theorem to the rectangle OBPA in Figure 11.3.2, with $w_1 = u_y$, $w_2 = u_x$. Then
$$\frac{\partial w_2}{\partial x} - \frac{\partial w_1}{\partial y} = u_{xx} - u_{yy} = f(x,y)u,$$
where $f(x,y) = q(x) - p(y)$. The L.H.S. of equation (11.3.6) is thus
$$\iint_S f(x,y)u(x,y)\,dx\,dy, \tag{11.3.7}$$
Figure 11.3.2 - The rectangle OBPA.

where $S$ is the rectangle OBPA. The R.H.S. of equation (11.3.6) is made up of four line integrals, along OB + BP + PA + AO. To evaluate these integrals, it is convenient to introduce the so-called characteristic coordinates
$$\xi = \frac12(x + y), \qquad \eta = \frac12(x - y). \tag{11.3.8}$$
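Equivalently,
$$x = \xi + \eta, \qquad y = \xi - \eta. \tag{11.3.9}$$
The partial derivatives in these coordinates are
$$\frac{\partial}{\partial\xi} = \frac{\partial x}{\partial\xi}\frac{\partial}{\partial x} + \frac{\partial y}{\partial\xi}\frac{\partial}{\partial y} = \frac{\partial}{\partial x} + \frac{\partial}{\partial y}, \qquad \frac{\partial}{\partial\eta} = \frac{\partial x}{\partial\eta}\frac{\partial}{\partial x} + \frac{\partial y}{\partial\eta}\frac{\partial}{\partial y} = \frac{\partial}{\partial x} - \frac{\partial}{\partial y}.$$
Consider the integral along OB,
$$I_1 = \int_{OB} w_1\,dx + w_2\,dy.$$
On OB, $x = \eta$, $y = -\eta$, so that $dx = d\eta$, $dy = -d\eta$, and $w_1\,dx + w_2\,dy = (u_y - u_x)\,d\eta = -u_\eta\,d\eta$, so that
$$I_1 = -\int_{OB} u_\eta\,d\eta = -[u]_O^B = -u(B) + u(O).$$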
On BP, $\eta = \text{const}$, $dx = dy = d\xi$, so that
$$I_2 = \int_{BP} u_\xi\,d\xi = [u]_B^P = u(P) - u(B).$$
Similarly
$$I_3 = u(P) - u(A), \qquad I_4 = u(O) - u(A).$$
Thus the R.H.S. of (11.3.6) is $2u(P) - 2u(A) - 2u(B) + 2u(O)$. Since A has coordinates $((x+y)/2, (x+y)/2)$ and B has coordinates $((x-y)/2, (y-x)/2)$, we find
$$u(x,y) = u\Big(\frac{x+y}{2}, \frac{x+y}{2}\Big) + u\Big(\frac{x-y}{2}, \frac{y-x}{2}\Big) - u(0,0) + \frac12\iint_S f(x',y')u(x',y')\,dx'\,dy'. \tag{11.3.10}$$
This equation expresses $u(x,y)$ as a sum of two parts: the first, comprising the first three terms, is made up of data on the characteristics; the second is an integral over the rectangle OBPA. In the characteristic coordinates, equation (11.3.10) is
$$u(\xi+\eta, \xi-\eta) = u(\xi,\xi) + u(\eta,-\eta) - u(0,0) + \frac12\int_0^\xi\Big\{\int_0^\eta f(\sigma+\tau, \sigma-\tau)u(\sigma+\tau, \sigma-\tau)\,d\tau\Big\}d\sigma, \tag{11.3.11}$$
which has the form of a Volterra integral equation. We note that when $h = 0$, $u(\eta,-\eta) = u(\eta,\eta)$; when $h = \infty$, $u(\eta,-\eta) = -u(\eta,\eta)$ and $u(0,0) = 0$.

Equation (11.3.11) has a unique solution for given data on the characteristics. For if there were two solutions, then their difference, $u = u_1 - u_2$, would satisfy
$$u(\xi+\eta, \xi-\eta) = \frac12\int_0^\xi\Big\{\int_0^\eta f(\sigma+\tau, \sigma-\tau)u(\sigma+\tau, \sigma-\tau)\,d\tau\Big\}d\sigma. \tag{11.3.12}$$
The classical way to show that this equation has only the trivial solution is as follows. The function $f$ is bounded: $|f| \le 2M$. This means that $v(\xi,\eta) = |u(\xi+\eta, \xi-\eta)|$ satisfies
$$v(\xi,\eta) \le M V(\xi,\eta),$$
where
$$V(\xi,\eta) = \int_0^\xi\int_0^\eta v(\sigma,\tau)\,d\tau\,d\sigma.$$
Suppose $0 \le \xi \le k$, and $0 \le \eta \le k$, then
$$v(\xi,\eta) \le M V(k,k)$$
so that
$$V(k,k) \le Mk^2 V(k,k).$$
If $v(\xi,\eta)$ is not identically zero in $[0,k] \times [0,k]$, then this inequality is clearly impossible if $Mk^2 < 1$. Choose $k_0$ so that $Mk_0^2 < 1$, then $u \equiv 0$ in $[0,k_0] \times [0,k_0]$. Now suppose $(\xi,\eta) \in [0,\sqrt2 k_0] \times [0,\sqrt2 k_0]$, then for $(\xi,\eta)$ outside $[0,k_0] \times [0,k_0]$ we have
$$v(\xi,\eta) \le M V(\sqrt2 k_0, \sqrt2 k_0),$$
so that
$$V(\sqrt2 k_0, \sqrt2 k_0) \le M V(\sqrt2 k_0, \sqrt2 k_0)(2k_0^2 - k_0^2),$$
which, with $Mk_0^2 < 1$, provides a contradiction. By continuing this argument inductively we deduce that $v \equiv 0$, i.e., $u \equiv 0$.

The extension of this argument to case iii), when $h$ is finite but not zero, is a little complicated but not essentially difficult. Proceeding exactly as before, and using (11.3.5) to replace $u_{xx} - u_{yy}$, we find the equation corresponding to (11.3.10) to be
$$u(x,y) = u\Big(\frac{x+y}{2}, \frac{x+y}{2}\Big) + u\Big(\frac{x-y}{2}, \frac{y-x}{2}\Big) - u(0,0) + \frac12\iint_S f(x',y')u(x',y')\,dx'\,dy' + h\iint_S \mathrm{sign}(y')\,u_{y'}\,dx'\,dy', \tag{11.3.13}$$
where
$$f(x,y) = q(x) - p(y) + h^2, \tag{11.3.14}$$
and $p(-y) = p(y)$. Again we can use Stokes' theorem to write the last term as a sum of line integrals; see Ex. 11.3.1. The resulting equation is again a Volterra integral equation with a unique solution.

The Cauchy problem

We proceed as in the Goursat problem, but now apply Stokes' theorem to the triangle ABP in Figure 11.3.3.

Figure 11.3.3 - The triangle ABP, with vertices $A(\pi, x+y-\pi)$, $B(\pi, \pi-x+y)$ and $P(x,y)$, when a) $x + y < \pi$ and b) $x + y > \pi$.
In cases i) and ii) the L.H.S. of equation (11.3.6) is (11.3.7), while the R.H.S. is the sum of line integrals along AB + BP + PA. Now
$$I_1 = \int_{AB} w_1\,dx + w_2\,dy = \int_{AB} u_x\,dy,$$
$$I_2 = \int_{BP} w_1\,dx + w_2\,dy = \int_{BP} u_\xi\,d\xi = [u]_B^P = u(P) - u(B),$$
$$I_3 = \int_{PA} w_1\,dx + w_2\,dy = -\int_{PA} u_\eta\,d\eta = -[u]_P^A = u(P) - u(A).$$
This yields the Volterra integral equation
$$2u(x,y) = u(\pi, x+y-\pi) + u(\pi, \pi-x+y) - \int_{x+y-\pi}^{\pi-x+y} u_x(\pi,t)\,dt + \iint_S f(x',y')u(x',y')\,dx'\,dy', \tag{11.3.15}$$
where now $S$ is the triangle ABP. Again, in case iii), $f$ is given by (11.3.14) and there is an extra term
$$2h\iint_S \mathrm{sign}(y')\,u_{y'}\,dx'\,dy' \tag{11.3.16}$$
to be added to the R.H.S. of (11.3.15). This is given in Ex. 11.3.2. In all cases $u(x,y)$ is given as the solution of a Volterra integral equation; the solution is uniquely determined by the values of $u$ and $u_x$ on the line CD, i.e., $x = \pi$, $-\pi \le y \le \pi$.

We now show how the uniqueness of solution of the hyperbolic equation for $K(x,y)$ may be used to show the uniqueness of an inverse problem for the Sturm-Liouville equation.

Exercises 11.3

1. Show that the integral in (11.3.13) may be written I
Z Z
=
vljq(|)x| g{g| = 2
V
Z
x( + > )g +
+
Z
x( + > )g +
0
Z
Z
Z
{|
x(v> 0)gv
0
x( + > )g
0
x(> )g +
0
Z
x( > )g =
0
2. Show that the integral in (11.3.16) may be written L
=
Z Z
vljq(|)x| g{g| =
V
Z
Z
x( + > )g +
x( + > )g Z
x( + > )g 2
Z
{+|
x(v> 0)gv
when | A 0> { + | ? and Z Z Z vljq(|)x| g{g| = L= V
x(+> )g
Z
x( + > )g
when | A 0> { + | .
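The reduction of the Goursat problem of Section 11.3 to a Volterra integral equation also suggests a simple numerical scheme: successive approximation on a grid in the characteristic coordinates. The sketch below is illustrative only and is not taken from the text. It works with $U(\xi,\eta) = u(\xi+\eta, \xi-\eta)$, takes the data $a(\xi) = U(\xi,0)$ and $b(\eta) = U(0,\eta)$ as given functions, and treats the constant multiplying the double integral as a parameter `kappa`; the function name, the grid size and the number of sweeps are assumptions.

```python
import numpy as np

def solve_goursat(f, a, b, kappa, n=201, length=1.0, sweeps=60):
    """Successive approximation for the Volterra equation
        U(xi, eta) = a(xi) + b(eta) - a(0) + kappa * int_0^xi int_0^eta f U dtau dsigma
    on the square [0, length] x [0, length] in characteristic coordinates."""
    s = np.linspace(0.0, length, n)
    xi, eta = np.meshgrid(s, s, indexing="ij")
    data = a(xi) + b(eta) - a(0.0)          # contribution of the characteristic data
    fvals = f(xi, eta)
    step = s[1] - s[0]
    u = data.copy()
    for _ in range(sweeps):
        g = fvals * u
        # trapezoidal rule on each grid cell, then cumulative sums in both directions
        cell = 0.25 * (g[:-1, :-1] + g[1:, :-1] + g[:-1, 1:] + g[1:, 1:]) * step**2
        integral = np.zeros_like(u)
        integral[1:, 1:] = np.cumsum(np.cumsum(cell, axis=0), axis=1)
        u = data + kappa * integral
    return s, u

# Test case with a known solution: f = 1, U = 1 on both characteristics, kappa = 1,
# for which U(xi, eta) = I_0(2 sqrt(xi * eta)) (modified Bessel function).
ones = lambda t: np.ones_like(t, dtype=float)
s, U = solve_goursat(lambda xi, eta: np.ones_like(xi), ones, ones, kappa=1.0)
exact = np.i0(2.0 * np.sqrt(np.outer(s, s)))
print(np.max(np.abs(U - exact)))
```

Each sweep is one step of the iteration $U \leftarrow \text{data} + \kappa \iint fU$; the rapid convergence reflects the Volterra (triangular) structure on which the uniqueness argument relies.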
11.4 Uniqueness of solution of an inverse problem
With the uniqueness results of Section 11.3 we are now in a position to show that the potential $p(x)$ in (11.2.1) is uniquely determined by two spectra corresponding to two different conditions at one end of $(0,\pi)$.

Theorem 11.4.1 Suppose that there were two potentials $p(x), q(x) \in C[0,\pi]$ with the following properties:
i) $y'' + (\lambda - p)y = 0$, $y'(0) - h_1y(0) = 0 = y'(\pi) + H_1y(\pi)$ has spectrum $(\lambda_n)_0^\infty$;
ii) $y'' + (\lambda - p)y = 0$, $y'(0) - h_1y(0) = 0 = y'(\pi) + H_1'y(\pi)$ has spectrum $(\mu_n)_0^\infty$;
iii) $y'' + (\lambda - q)y = 0$, $y'(0) - h_2y(0) = 0 = y'(\pi) + H_2y(\pi)$ has spectrum $(\lambda_n)_0^\infty$;
iv) $y'' + (\lambda - q)y = 0$, $y'(0) - h_2y(0) = 0 = y'(\pi) + H_2'y(\pi)$ has spectrum $(\mu_n)_0^\infty$.
If $H_2 \ne H_2'$, then $p(x) = q(x)$, $h_1 = h_2$, $H_1 = H_2$, $H_1' = H_2'$.

Proof. First we use the known asymptotic forms for the eigenvalues. Equation (10.9.20) states that
$$\sqrt{\lambda_n} = n + cn^{-1} + o(n^{-1}),$$
where
$$c = \frac1\pi\Big(h + H + \frac12\int_0^\pi q(x)\,dx\Big).$$
Thus, since i) and iii) have the same spectrum,
$$h_1 + H_1 + \frac12\int_0^\pi p(x)\,dx = h_2 + H_2 + \frac12\int_0^\pi q(x)\,dx, \tag{11.4.1}$$
and because ii) and iv) have the same spectrum,
$$h_1 + H_1' + \frac12\int_0^\pi p(x)\,dx = h_2 + H_2' + \frac12\int_0^\pi q(x)\,dx. \tag{11.4.2}$$
Now we transform a solution $\phi_n(x)$ of i) corresponding to the eigenvalue $\lambda_n$, into a solution $\psi_n(x)$ of iii) with the same eigenvalue:
$$\psi_n(x) = \phi_n(x) + \int_0^x K(x,y)\phi_n(y)\,dy,$$
and, according to (11.2.11), (11.2.12), we have
$$K(0,0) = h_2 - h_1, \tag{11.4.3}$$
$$K(\pi,\pi) = h_2 - h_1 + \frac12\int_0^\pi (q(x) - p(x))\,dx. \tag{11.4.4}$$
We examine the boundary condition at $x = \pi$:
$$\psi_n(\pi) = \phi_n(\pi) + \int_0^\pi K(\pi,y)\phi_n(y)\,dy,$$
$$\psi_n'(\pi) = \phi_n'(\pi) + K(\pi,\pi)\phi_n(\pi) + \int_0^\pi K_x(\pi,y)\phi_n(y)\,dy,$$
so that
$$\psi_n'(\pi) + H_2\psi_n(\pi) = \phi_n'(\pi) + H_1\phi_n(\pi) + \{K(\pi,\pi) + H_2 - H_1\}\phi_n(\pi) + \int_0^\pi \{K_x(\pi,y) + H_2K(\pi,y)\}\phi_n(y)\,dy.$$
Now $\phi_n'(\pi) + H_1\phi_n(\pi) = 0$, and equations (11.4.1), (11.4.4) show that
$$K(\pi,\pi) + H_2 - H_1 = 0. \tag{11.4.5}$$
Thus
$$\int_0^\pi \{K_x(\pi,y) + H_2K(\pi,y)\}\phi_n(y)\,dy = 0. \tag{11.4.6}$$
But the $\{\phi_n\}$ form a complete orthogonal set on $(0,\pi)$, so that
$$K_x(\pi,y) + H_2K(\pi,y) = 0. \tag{11.4.7}$$
Applying the same argument to ii) and iv), we find
$$K_x(\pi,y) + H_2'K(\pi,y) = 0 \tag{11.4.8}$$
and, since $H_2 \ne H_2'$ by hypothesis,
$$K_x(\pi,y) = 0 = K(\pi,y). \tag{11.4.9}$$
This holds for $0 \le y \le \pi$, and therefore also for $-\pi \le y \le \pi$. But in Section 11.3 we showed that if $K(x,y)$ satisfies this condition, then $K(x,y) \equiv 0$ in $0 \le |y| \le x \le \pi$. Now equation (11.2.7) implies $p(x) = q(x)$, (11.4.3) implies $h_1 = h_2$, (11.4.5) implies $H_1 = H_2$, and (11.4.2) implies $H_1' = H_2'$.

In this proof we have assumed that $h_1, h_2, H_1, H_2, H_1', H_2'$ are all finite, but the argument may easily be adapted to the situation in which some of these are infinite.

As we noted in the historical review, if it is known that $q(x)$ is symmetric about $\pi/2$, i.e., $q(x) = q(\pi - x)$, then $q(x)$ is uniquely determined from one spectrum corresponding to symmetrical end conditions, i.e., $h = H$. For since the governing equation (11.1.1) and the end conditions
$$y'(0) - hy(0) = 0 = y'(\pi) + hy(\pi) \tag{11.4.10}$$
are invariant under the transformation $x \to \pi - x$, the solutions of (11.1.1) satisfying (11.4.10) must satisfy $y(\pi - x) = \pm y(x)$. Since the lowest eigenfunction $y_0(x)$ can have no zero in $(0,\pi)$, the even eigenfunctions must satisfy $y'(\pi/2) = 0$, while the odd ones must satisfy $y(\pi/2) = 0$. This means that the given spectrum $(\lambda_n)_0^\infty$ must split into two: $(\lambda_{2n})_0^\infty$ corresponding to
$$y'(0) - hy(0) = 0 = y'(\pi/2),$$
and $(\lambda_{2n+1})_0^\infty$ corresponding to
$$y'(0) - hy(0) = 0 = y(\pi/2).$$
We thus have two spectra which will uniquely determine $q(x)$ on $[0, \pi/2]$; the symmetry then gives $q(x)$ on $[\pi/2, \pi]$.

For other uniqueness theorems, see McLaughlin (1986) [228], McLaughlin and Rundell (1987) [230]. The literature on uniqueness and existence of solutions of inverse problems for the various forms of equations (11.1.1)-(11.1.3) is so vast that one can only make some pointers to the literature. Hald (1984) [165] is useful for a review of the early research. Other studies of problems with discontinuous $q(x)$ in (11.1.1), or $A(x)$ in (11.1.3), include Willis (1985) [334], Kobayashi (1988) [197], Andersson (1988a) [6], (1988b) [7], Coleman and McLaughlin (1993a) [62], (1993b) [63].
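The splitting of the spectrum for a symmetric potential can be illustrated numerically. The sketch below is an assumption-laden illustration rather than anything from the text: it discretises $y'' + (\lambda - q)y = 0$ by second-order finite differences (with the end conditions imposed through ghost points and the matrix symmetrised), and the function name `sl_eigenvalues`, the sample potential $q(x) = \cos 2x$ and the value $h = 0.5$ are arbitrary choices.

```python
import numpy as np

def sl_eigenvalues(q, length, h, right_H, count):
    """Lowest `count` finite-difference eigenvalues of y'' + (lam - q) y = 0 on [0, length].
    Left end:  y'(0) - h y(0) = 0.
    Right end: y'(L) + right_H y(L) = 0, or y(L) = 0 when right_H is None.
    `q` holds the potential at the n + 1 equally spaced grid points."""
    n = q.size - 1
    dx = length / n
    diag = 2.0 / dx**2 + np.asarray(q, dtype=float)
    off = -np.ones(n) / dx**2
    diag[0] += 2.0 * h / dx          # ghost-point treatment of the left end condition
    off[0] *= np.sqrt(2.0)           # symmetrising scale factor for the end node
    if right_H is None:              # Dirichlet end: drop the last node
        diag, off = diag[:-1], off[:-1]
    else:
        diag[-1] += 2.0 * right_H / dx
        off[-1] *= np.sqrt(2.0)
    S = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(S)[:count]

x = np.linspace(0.0, np.pi, 401)
q = np.cos(2.0 * x)                  # symmetric about pi/2
h = 0.5
full = sl_eigenvalues(q, np.pi, h, h, 6)                 # conditions (11.4.10)
even = sl_eigenvalues(q[:201], np.pi / 2, h, 0.0, 3)     # y'(pi/2) = 0
odd = sl_eigenvalues(q[:201], np.pi / 2, h, None, 3)     # y(pi/2) = 0
print(full)
print(even, odd)
```

With this grid the point $\pi/2$ is a node, so the even-indexed eigenvalues of the full problem coincide (to rounding error) with those of the half-interval problem with $y'(\pi/2) = 0$, and the odd-indexed ones with those of the problem with $y(\pi/2) = 0$.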
11.5 The Gel'fand-Levitan integral equation
The transformation operator introduced in equation (11.2.5) transforms a solution $\phi(x)$ of equation (11.2.1) subject to the single end condition (11.2.2) into a solution $\psi(x)$ of a new equation (11.2.3) subject to the single end condition (11.2.4). But, as in Section 11.4, we require more of the transformation: that it produce a complete orthonormal set (c.o.s.) of eigenfunctions for the new equation (11.2.3) subject to two end conditions, at $0$ and $\pi$.

Denote the unique solution of (11.2.1), (11.2.2) satisfying $\phi(0) = \alpha$ by $\phi(x, \lambda, \alpha)$. The eigenfunction $\phi_n(x)$ of (11.2.1) subject to the end conditions
$$\phi'(0) - h\phi(0) = 0 = \phi'(\pi) + H\phi(\pi) \tag{11.5.1}$$
is therefore
$$\phi_n(x) = \phi(x, \lambda_n, \phi_n(0)). \tag{11.5.2}$$
We are going to construct new orthonormal eigenfunctions $\psi(x)$ of (11.2.3) subject to end conditions
$$\psi'(0) - h'\psi(0) = 0 = \psi'(\pi) + H'\psi(\pi), \tag{11.5.3}$$
from equation (11.2.5). We denote the new eigenvalues by $(\mu_n)_0^\infty$ and write
$$\chi_n(x) = \phi(x, \mu_n, \psi_n(0)), \tag{11.5.4}$$
$$\psi_n(x) = \chi_n(x) + \int_0^x K(x,t)\chi_n(t)\,dt. \tag{11.5.5}$$
Note that $\chi_n(x)$ is the solution of equation (11.2.1), (11.2.2) for $\lambda = \mu_n$, while $\psi_n(x)$ is to be the $n$th orthonormal eigenfunction of equation (11.2.3) subject to (11.5.3).

The eigenfunctions $\{\phi_n\}_0^\infty$ of the base problem do form a c.o.s. on $(0,\pi)$. This means that if $g \in L^2(0,\pi)$, then
$$\|g\|^2 \equiv \int_0^\pi [g(x)]^2\,dx = \sum_{n=0}^\infty a_n^2, \tag{11.5.6}$$
where
$$a_n = (g, \phi_n) = \int_0^\pi g(x)\phi_n(x)\,dx. \tag{11.5.7}$$
This implies also that if $g, h \in L^2(0,\pi)$, then
$$(g, h) = \int_0^\pi g(x)h(x)\,dx = \sum_{n=0}^\infty a_nb_n,$$
where $b_n = (h, \phi_n)$. The eigenfunctions $\{\psi_n\}_0^\infty$ are to form a c.o.s. on $(0,\pi)$, so that
$$\|g\|^2 = \sum_{n=0}^\infty a_n'^2, \tag{11.5.8}$$
where $a_n' = (g, \psi_n)$. Equation (11.5.5) shows that
$$\psi_n = \chi_n + K\chi_n, \tag{11.5.9}$$
where $K$ is the operator defined by
$$Ku = \int_0^x K(x,t)u(t)\,dt. \tag{11.5.10}$$
Now
$$(Ku, v) = \int_0^\pi \Big\{\int_0^t K(t,x)u(x)\,dx\Big\}v(t)\,dt,$$
and on interchanging the order of integration we see that
$$(Ku, v) = \int_0^\pi \Big\{\int_x^\pi K(t,x)v(t)\,dt\Big\}u(x)\,dx.$$
The adjoint operator $K^*$ is defined by $(Ku, v) = (u, K^*v)$, so that
$$K^*v = \int_x^\pi K(t,x)v(t)\,dt. \tag{11.5.11}$$
Return to equation (11.5.9); we can write
$$a_n' = (g, \chi_n) + (g, K\chi_n),$$
so that the equation
$$0 = \sum_{n=0}^\infty a_n'^2 - \sum_{n=0}^\infty a_n^2$$
can be written
$$0 = \sum_{n=0}^\infty \{(g,\chi_n)^2 + 2(g,\chi_n)(g,K\chi_n) + (g,K\chi_n)^2 - (g,\phi_n)^2\}. \tag{11.5.12}$$
Put $K^*g = G$ then, since $G \in L^2(0,\pi)$, we have
$$(g, G) - \sum_{n=0}^\infty (g,\phi_n)(G,\phi_n) = 0,$$
and this is equivalent to
$$(Kg, g) - \sum_{n=0}^\infty (g,\phi_n)(g,K\phi_n) = 0. \tag{11.5.13}$$
Similarly
$$(G, G) - \sum_{n=0}^\infty (G,\phi_n)^2 = 0$$
is equivalent to
$$(KK^*g, g) - \sum_{n=0}^\infty (g,K\phi_n)^2 = 0. \tag{11.5.14}$$
Now form the combined equation (11.5.12) + 2*(11.5.13) + (11.5.14) and group the terms to get
$$0 = S_1 + S_2 + S_3 + S_4, \tag{11.5.15}$$
where
$$S_1 = \sum_{n=0}^\infty \{(g,\chi_n)^2 - (g,\phi_n)^2\},$$
$$S_2 = 2\sum_{n=0}^\infty \{(g,\chi_n)(g,K\chi_n) - (g,\phi_n)(g,K\phi_n)\},$$
$$S_3 = \sum_{n=0}^\infty \{(g,K\chi_n)^2 - (g,K\phi_n)^2\},$$
$$S_4 = 2(g, Kg) + (g, KK^*g).$$
In order to represent these products of integrals as multiple integrals we use the simple identity
$$\int_0^\pi g(x)\,dx\int_0^\pi h(y)\,dy = \int_0^\pi\!\!\int_0^x g(x)h(y)\,dy\,dx + \int_0^\pi\!\!\int_0^y g(x)h(y)\,dx\,dy$$
obtained by dividing the square $(0,\pi) \times (0,\pi)$ into two triangles. This yields
$$S_1 = 2\int_0^\pi\!\!\int_0^x g(x)g(y)F(x,y)\,dy\,dx,$$
$$S_2 = 2\int_0^\pi\!\!\int_0^x g(x)g(y)\int_0^x K(x,t)F(t,y)\,dt\,dy\,dx + 2\int_0^\pi\!\!\int_0^y g(x)g(y)\int_0^x K(x,t)F(t,y)\,dt\,dx\,dy,$$
$$S_3 = 2\int_0^\pi\!\!\int_0^y g(x)g(y)\int_0^x K(x,t)\Big[\int_0^y K(y,s)F(s,t)\,ds\Big]dt\,dx\,dy,$$
$$S_4 = 2\int_0^\pi\!\!\int_0^x g(x)g(y)K(x,y)\,dy\,dx + 2\int_0^\pi\!\!\int_0^y g(x)g(y)\int_0^x K(x,t)K(y,t)\,dt\,dx\,dy,$$
where
$$F(x,y) = \sum_{n=0}^\infty \{\chi_n(x)\chi_n(y) - \phi_n(x)\phi_n(y)\}, \tag{11.5.16}$$
so that equation (11.5.15) gives
$$\int_0^\pi\!\!\int_0^x g(x)g(y)\Big\{J(x,y) + \int_0^y J(x,t)K(y,t)\,dt\Big\}dy\,dx = 0, \tag{11.5.17}$$
where
$$J(x,y) = K(x,y) + \int_0^x K(x,t)F(t,y)\,dt + F(x,y). \tag{11.5.18}$$
Since $g(x)$ is an arbitrary function in $L^2(0,\pi)$, equation (11.5.17) implies
$$J(x,y) + \int_0^y J(x,t)K(y,t)\,dt = 0, \qquad 0 \le y \le x \le \pi. \tag{11.5.19}$$
For fixed $x$, this is a homogeneous Volterra integral equation for $J(x,y)$, and we may argue exactly as in Section 11.3 that its only solution is $J(x,y) = 0$, for $0 \le y \le x$. Thus
$$K(x,y) + \int_0^x K(x,t)F(t,y)\,dt + F(x,y) = 0, \qquad 0 \le y \le x \le \pi. \tag{11.5.20}$$
This is the Gel'fand-Levitan integral equation for $K(x,y)$. Note that for fixed $x$, (11.5.19) is a Volterra equation for $J(x,y)$; on the other hand, for fixed $x$, (11.5.20) is a Fredholm equation for $K(x,y)$.

There is one matter in this analysis that needs to be examined: the convergence of the series in equation (11.5.16). There are two ways to approach this question: examine the asymptotic form of the terms in the series and find the conditions under which the series is convergent; or make an assumption that will obviate the question by turning the infinite series into a finite one. We shall follow the latter course.

We started this section by taking a base problem consisting of equation (11.2.1) and end conditions (11.5.1); the c.o.s. of eigenfunctions of this problem
is $\{\phi_n\}_0^\infty$. We then used the operator $K$ to construct a new c.o.s. of eigenfunctions $\{\psi_n\}_0^\infty$ for a new problem. The orthonormal eigenfunctions $\psi_n$ were constructed from the solutions $\chi_n = \phi(x, \mu_n, \psi_n(0))$ of the base equation (11.2.1) with $\lambda = \mu_n$, and with initial conditions $\chi_n(0) = \psi_n(0)$, $\chi_n'(0) = h\psi_n(0)$. Now we introduce the

Truncation Assumption
$$\mu_n = \lambda_n, \qquad \psi_n(0) = \phi_n(0) \quad \text{for } n = N+1, N+2, \ldots$$
This means that, for $n = N+1, N+2, \ldots$,
$$\chi_n(x) = \phi(x, \mu_n, \psi_n(0)) = \phi(x, \lambda_n, \phi_n(0)) = \phi_n(x),$$
so that
$$F(x,y) = \sum_{n=0}^N \{\chi_n(x)\chi_n(y) - \phi_n(x)\phi_n(y)\}. \tag{11.5.21}$$
We now prove

Theorem 11.5.1 Let $F(x,y)$ be given by (11.5.21), and suppose that $K(x,y)$ is continuous in $y$, $0 \le y \le x$, for each fixed $x$, $0 \le x \le \pi$. Then there exists at most one solution of equation (11.5.20).

Proof. We need to show that the homogeneous Fredholm integral equation
$$f(y) + \int_0^x F(t,y)f(t)\,dt = 0, \qquad 0 \le y \le x, \tag{11.5.22}$$
has only the zero solution. Multiply (11.5.22) by $f(y)$ and integrate from $0$ to $x$ to obtain
$$\int_0^x [f(y)]^2\,dy + \int_0^x\!\!\int_0^x F(t,y)f(t)f(y)\,dt\,dy = 0. \tag{11.5.23}$$
The function
$$g(y) = \begin{cases} f(y), & 0 \le y \le x, \\ 0, & x < y \le \pi, \end{cases}$$
is in $L^2(0,\pi)$, so that
$$\int_0^x [f(y)]^2\,dy = \int_0^\pi [g(y)]^2\,dy = \sum_{m=0}^\infty a_m^2,$$
where
$$a_m = \int_0^\pi g(y)\phi_m(y)\,dy = \int_0^x f(y)\phi_m(y)\,dy.$$
On the other hand
$$\int_0^x\!\!\int_0^x F(t,y)f(t)f(y)\,dt\,dy = \sum_{m=0}^N (b_m^2 - a_m^2) = \sum_{m=0}^\infty (b_m^2 - a_m^2),$$
where
$$b_m = \int_0^x f(y)\chi_m(y)\,dy = \int_0^\pi g(y)\chi_m(y)\,dy,$$
and we have used $\chi_m(y) = \phi_m(y)$ for $m > N$ to give $b_m = a_m$ for $m > N$. Equation (11.5.23) now gives
$$\sum_{m=0}^\infty b_m^2 = 0,$$
that is
$$b_m = 0, \qquad m = 0, 1, \ldots$$
We must show that this implies $g(y) = 0$. This is equivalent to showing that $b_m = 0$, $m = 0, 1, 2, \ldots$ implies $a_m = 0$, $m = 0, 1, 2, \ldots$ Now
$$b_m = (g, \chi_m) = \sum_{n=0}^\infty c_{mn}a_n, \qquad m = 0, 1, 2, \ldots, \tag{11.5.24}$$
where $c_{mn} = (\chi_m, \phi_n)$. If $m > N$, then $\chi_m = \phi_m$ and $c_{mn} = \delta_{mn}$, so that $b_m = 0$ implies $a_m = 0$. Thus the sum in (11.5.24) is over $n = 0, 1, \ldots, N$, and we have the $N+1$ equations
$$0 = \sum_{n=0}^N c_{mn}a_n, \qquad m = 0, 1, \ldots, N.$$
If there is a pair $m_0, n_0$ such that $\chi_{m_0} = \phi_{n_0}$ then equation (11.5.24) with $m = m_0$ gives $a_{n_0} = 0$ so that, when $m \ne m_0$, the term with $n = n_0$ may be omitted. This means that we need consider only those $m, n$ for which $\chi_m \ne \phi_n$. Renumber these $0, 1, \ldots, N'$. We need to show that these $N'+1$ equations have only the trivial solution, i.e., their determinant of coefficients is not zero. The equations
$$\chi_m'' + (\mu_m - p)\chi_m = 0 = \phi_n'' + (\lambda_n - p)\phi_n$$
yield
$$(\lambda_n - \mu_m)\chi_m\phi_n = \chi_m''\phi_n - \chi_m\phi_n'',$$
so that
$$(\lambda_n - \mu_m)c_{mn} = [\chi_m'\phi_n - \chi_m\phi_n']_0^\pi.$$
Since $\chi_m$ and $\phi_n$ satisfy the same condition at $x = 0$, and $\phi_n'(\pi) = -H\phi_n(\pi)$, we have
$$(\lambda_n - \mu_m)c_{mn} = d_me_n,$$
where
$$d_m = \chi_m'(\pi) + H\chi_m(\pi), \qquad e_n = \phi_n(\pi).$$
Both $d_m, e_n$ are non-zero, and thus the determinant of coefficients is
$$\det(C) = \prod_{n=0}^{N'} d_ne_n\,\det\big(1/(\lambda_n - \mu_m)\big) \tag{11.5.25}$$
and it may easily be shown (Ex. 11.5.1) that this is non-zero when, as we know, $\lambda_n - \mu_m \ne 0$ for all $m, n$.

We have proved that, under the Truncation Assumption (TA), the Gel'fand-Levitan integral equation has at most one solution. In fact it is a degenerate integral equation with a solution of the form
$$K(x,y) = \sum_{m=0}^N \{F_m(x)\chi_m(y) - G_m(x)\phi_m(y)\}. \tag{11.5.26}$$
On substituting (11.5.26) into (11.5.20) and equating multiples of $\chi_m(y)$, $\phi_m(y)$ to zero we find
$$F_m(x) + \sum_{n=0}^N \{b_{mn}(x)F_n(x) - c_{mn}(x)G_n(x)\} + \chi_m(x) = 0, \tag{11.5.27}$$
$$G_m(x) + \sum_{n=0}^N \{c_{nm}(x)F_n(x) - d_{mn}(x)G_n(x)\} + \phi_m(x) = 0, \tag{11.5.28}$$
for $m = 0, 1, \ldots, N$, where
$$b_{mn}(x) = b_{nm}(x) = \int_0^x \chi_m(t)\chi_n(t)\,dt,$$
$$c_{mn}(x) = \int_0^x \chi_m(t)\phi_n(t)\,dt,$$
$$d_{mn}(x) = d_{nm}(x) = \int_0^x \phi_m(t)\phi_n(t)\,dt.$$
We may verify (Ex. 11.5.2) that these equations do have a unique solution, as stated by Theorem 11.5.1.

When we first introduced the transformation operator $K$ in Section 11.2, we showed that $K(x,y)$ must satisfy the hyperbolic differential equation (11.2.6). In this section we showed that $K(x,y)$ must satisfy the integral equation (11.5.20). In order to relate these two equations we note (Ex. 11.5.3) that $F(x,y)$ given by (11.5.21) satisfies the hyperbolic equation
$$F_{xx}(x,y) - F_{yy}(x,y) + (p(y) - p(x))F(x,y) = 0. \tag{11.5.29}$$
It is not difficult to show (Ex. 11.5.4) that if $K$ satisfies (11.5.20) then it satisfies the differential equation (11.2.6), where $q(x)$ is given by (11.2.7).

Exercises 11.5

1. Show that if $\mu_m \ne \lambda_n$ for all $m, n = 0, 1, \ldots, N$, then the matrix $C = (c_{mn}) = (1/(\mu_m - \lambda_n))$ is non-singular.

2. Show that the equations (11.5.27), (11.5.28) have a unique solution. Hint: consider the homogeneous equations obtained by omitting $\chi_m(x)$, $\phi_m(x)$; multiply the first by $F_m(x)$, the second by $G_m(x)$ and add the equations for $m = 0, 1, \ldots, N$.
3. Show that if $F(x,y)$ is given by (11.5.21) then it satisfies equation (11.5.29).

4. Show that if $K(x,y)$ satisfies equation (11.5.20) then
$$L(x,y) + \int_0^x L(x,t)F(t,y)\,dt + [K(x,0)F_x(0,y) - K_y(x,0)F(0,y)] = 0,$$
where
$$L(x,y) \equiv K_{xx}(x,y) - K_{yy}(x,y) + (p(y) - q(x))K(x,y)$$
and $q(x)$ is related to $p(x)$ by equation (11.2.7). Show that the term in square brackets is zero, and hence, by Theorem 11.5.1, $L(x,y) = 0$ for $0 \le y \le x \le \pi$; this is equation (11.2.6).

5. Show that the solutions of equations (11.5.27), (11.5.28) may be written
$$F_n(x) = -\Big\{\chi_n(x) + \int_0^x K(x,y)\chi_n(y)\,dy\Big\} = -\psi_n(x),$$
$$G_n(x) = -\Big\{\phi_n(x) + \int_0^x K(x,y)\phi_n(y)\,dy\Big\}.$$

11.6 Reconstruction of the Sturm-Liouville system
First, we recapitulate what we have achieved in this chapter so far. We have shown that by starting with one S-L system
$$y''(x) + (\lambda - p(x))y(x) = 0, \tag{11.6.1}$$
$$y'(0) - hy(0) = 0 = y'(\pi) + Hy(\pi), \tag{11.6.2}$$
with eigenvalues $(\lambda_n)_0^\infty$ and c.o.s. of eigenfunctions $(\phi_n)_0^\infty$ we may, by introducing the operator $K$, form a new S-L system
$$y''(x) + (\lambda - q(x))y(x) = 0, \tag{11.6.3}$$
$$y'(0) - h'y(0) = 0 = y'(\pi) + H'y(\pi), \tag{11.6.4}$$
with eigenvalues $(\mu_n)_0^\infty$ and c.o.s. of eigenfunctions $(\psi_n)_0^\infty$ given by equation (11.2.5). In order to find the new system we need the $(\mu_n)_0^\infty$ and the end values $(\psi_n(0))_0^\infty$ of the eigenfunctions $\psi_n(x)$ which are yet to be found. We can find these, as in Section 10.8, from two spectra of equation (11.6.3): $(\mu_n)_0^\infty$ corresponding to the end conditions (11.6.4), and $(\nu_n)_0^\infty$ corresponding to
$$y'(0) - h_1'y(0) = 0 = y'(\pi) + H'y(\pi).$$
Changing equation (10.8.12) to the numbering system S we find that $(\nu_n)_0^\infty$ are the roots of
$$1 = (h_1' - h')\sum_{n=0}^\infty \frac{\psi_n^2(0)}{\lambda - \mu_n}. \tag{11.6.5}$$
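The Truncation Assumption allows us to write this
$$1 = (h_1' - h')\Big\{\sum_{n=0}^N \frac{\psi_n^2(0)}{\lambda - \mu_n} + \sum_{n=N+1}^\infty \frac{\phi_n^2(0)}{\lambda - \lambda_n}\Big\}. \tag{11.6.6}$$
This gives $N+1$ equations
$$1 = (h_1' - h')\Big\{\sum_{n=0}^N \frac{\psi_n^2(0)}{\nu_m - \mu_n} + \sum_{n=N+1}^\infty \frac{\phi_n^2(0)}{\nu_m - \lambda_n}\Big\}, \qquad m = 0, 1, \ldots, N, \tag{11.6.7}$$
for the $N+1$ quantities $\{\psi_n^2(0)\}_0^N$. As in Ex. 11.5.1, the determinant of coefficients is not zero. To check that the $\psi_n^2(0)$ are indeed positive, we write (11.6.6) as
$$1 - (h_1' - h')\Big\{\sum_{n=0}^N \frac{\psi_n^2(0)}{\lambda - \mu_n} - f(\lambda)\Big\} = \prod_{m=0}^N \Big(\frac{\lambda - \nu_m}{\lambda - \mu_m}\Big)g(\lambda), \tag{11.6.8}$$
where
$$f(\lambda) = \sum_{n=N+1}^\infty \frac{\phi_n^2(0)}{\lambda_n - \lambda},$$
and for definiteness we take $h_1' > h'$. The functions $f(\lambda)$, $g(\lambda)$ are positive for $0 < \lambda \le \mu_N$, and the $\mu_n, \nu_n$ interlace according to
$$\mu_0 < \nu_0 < \mu_1 < \cdots \tag{11.6.9}$$
Multiplying (11.6.8) throughout by $(\lambda - \mu_n)$ and then putting $\lambda = \mu_n$ we find
$$(h_1' - h')\psi_n^2(0) = (\nu_n - \mu_n)\prod_{m=0}^{N}{}' \Big(\frac{\mu_n - \nu_m}{\mu_n - \mu_m}\Big)g(\mu_n), \tag{11.6.10}$$
where the prime indicates that the factor $m = n$ is omitted, so that the interlacing gives $\psi_n^2(0) > 0$.

Taking an arbitrary value of $h_1'$ has the disadvantage that the $\psi_n^2(0)$ depend on $h_1'$ and $h'$. If $h_1' = \infty$, then equation (11.6.6) takes the simpler form
$$\sum_{n=0}^N \frac{\psi_n^2(0)}{\lambda - \mu_n} + \sum_{n=N+1}^\infty \frac{\phi_n^2(0)}{\lambda - \lambda_n} = 0. \tag{11.6.11}$$
This yields
$$\sum_{n=0}^N \frac{\psi_n^2(0)}{\nu_m - \mu_n} = f(\nu_m), \qquad m = 0, 1, \ldots, N, \tag{11.6.12}$$
for $\psi_n^2(0)$, $n = 0, 1, \ldots, N$.

We are now ready to proceed to the reconstruction. We need to find $q(x)$, $h'$, $H'$ such that the first $N+1$ eigenvalues of (11.6.3), (11.6.4) are the specified $(\mu_n)_0^N$, and the first $N+1$ end values of the normalised eigenfunctions are $(\psi_n(0))_0^N$. We take the following steps: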
Step 1: Choose a base system (11.6.1), (11.6.2), and find $\{\phi_n(x)\}_0^N$, $\{\chi_n(x)\}_0^N$ given by (11.5.2), (11.5.4) respectively. Under the Truncation Assumption, the values of $\mu_{N+1}, \mu_{N+2}, \ldots$, which are not part of the data, are taken to be $\lambda_{N+1}, \lambda_{N+2}, \ldots$ respectively. We must therefore choose the base system so that $\mu_N < \lambda_{N+1}$. The simplest choice for the base system would be to take $p(x) = 0$, and $h, H$ each to be $0$ or $\infty$. If for example $h = 0$, $H = \infty$ then (11.6.1), (11.6.2) reduce to
$$y'' + \lambda y = 0, \tag{11.6.13}$$
$$y'(0) = 0 = y(\pi), \tag{11.6.14}$$
and
$$\phi_n(x) = \sqrt{\frac2\pi}\cos\Big(n + \frac12\Big)x, \qquad \lambda_n = \Big(n + \frac12\Big)^2, \tag{11.6.15}$$
$$\chi_n = \psi_n(0)\cos\omega_nx, \qquad \mu_n = \omega_n^2. \tag{11.6.16}$$
This choice for a base system would therefore be appropriate provided that $\mu_N < (N + \frac32)^2$, i.e., $\omega_N < N + \frac32$. Since the $\mu_n$ are to be the eigenvalues of some S-L system, they must have the asymptotic form given in Section 10.9. Depending on the end conditions, they must therefore have the form (10.9.20), (10.9.39) or (10.9.41); in any of these cases, $\omega_N < N + \frac32$ for large enough $N$. If of course $\omega_n$ had the form (10.9.41), then it would be more appropriate to take $h = \infty$, $H = \infty$, so that $h' = \infty$.

Step 2: Form $F(x,y)$ given by (11.5.21) and solve equations (11.5.27), (11.5.28) for $\{F_n(x), G_n(x)\}_0^N$.

Step 3: Form $K(x,y)$ from equation (11.5.26).

Step 4: Form $q(x)$ from equation (11.2.7).

Step 5: Find $h'$ from equation (11.2.11).

Step 6: Find $H'$.

For this final step we proceed as follows. Since $d_{mn}(\pi) = \delta_{mn}$, equation (11.5.28) gives
$$\sum_{n=0}^N c_{nm}F_n(\pi) + \phi_m(\pi) = 0,$$
where, as in (11.5.24), $c_{nm} = c_{nm}(\pi)$. Differentiating (11.5.28) w.r.t. $x$ and putting $x = \pi$, we find
$$\sum_{n=0}^N c_{nm}F_n'(\pi) + \phi_m(\pi)K(\pi,\pi) + \phi_m'(\pi) = 0.$$
Now use Ex. 11.5.5, which shows that $F_n(\pi) = -\psi_n(\pi)$, and the fact that $\phi_m'(\pi) + H\phi_m(\pi) = 0$, to give
$$\sum_{n=0}^N c_{nm}\{\psi_n'(\pi) + (H - K(\pi,\pi))\psi_n(\pi)\} = 0.$$
But $\det(C) \ne 0$, so that
$$\psi_n'(\pi) + (H - K(\pi,\pi))\psi_n(\pi) = 0. \tag{11.6.17}$$
This means
$$H' = H - K(\pi,\pi). \tag{11.6.18}$$
Apart from the introduction of the Truncation Assumption, the analysis described in this chapter so far is the classical Gel'fand-Levitan inversion of the Sturm-Liouville equation. While the method has great theoretical value, it is impractical; the stumbling block is Step 2, the solution of the equations for $F_n(x)$, $G_n(x)$, and the subsequent Steps 3, 4 which give $q(x)$ by differentiating $K(x,y)$. In Section 11.9 we describe other methods that use the partial differential equation satisfied by $K(x,y)$.
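Steps 1 to 5 can be sketched numerically for the simplest base system $p(x) = 0$, $h = 0$, $H = \infty$ of (11.6.13)-(11.6.16). The code below is an illustrative sketch under stated assumptions, not the author's implementation: the function names are hypothetical, the data `mu`, `psi0` in the usage example are invented, the integrals $b_{mn}, c_{mn}, d_{mn}$ are approximated by the trapezoidal rule, and the target eigenvalues are assumed positive so that $\omega_n = \sqrt{\mu_n}$ is real.

```python
import numpy as np

def cumtrapz0(values, x):
    """Cumulative trapezoidal integral from x[0], taken along the last axis."""
    steps = 0.5 * (values[..., 1:] + values[..., :-1]) * np.diff(x)
    out = np.zeros_like(values)
    out[..., 1:] = np.cumsum(steps, axis=-1)
    return out

def gelfand_levitan(mu, psi0, x):
    """Steps 1-5 for the base system p = 0, h = 0, H = infinity.
    mu[n], psi0[n]: target eigenvalues and end values psi_n(0), n = 0..N."""
    mu = np.asarray(mu, dtype=float)
    psi0 = np.asarray(psi0, dtype=float)
    n = np.arange(mu.size)
    phi = np.sqrt(2.0 / np.pi) * np.cos(np.outer(n + 0.5, x))   # (11.6.15)
    chi = psi0[:, None] * np.cos(np.outer(np.sqrt(mu), x))      # (11.6.16)
    b = cumtrapz0(chi[:, None, :] * chi[None, :, :], x)         # b_mn(x)
    c = cumtrapz0(chi[:, None, :] * phi[None, :, :], x)         # c_mn(x)
    d = cumtrapz0(phi[:, None, :] * phi[None, :, :], x)         # d_mn(x)
    size = mu.size
    eye = np.eye(size)
    F = np.empty_like(chi)
    G = np.empty_like(phi)
    for i in range(x.size):
        # (11.5.27), (11.5.28) as one linear system for (F_0..F_N, G_0..G_N) at x[i]
        top = np.hstack([eye + b[:, :, i], -c[:, :, i]])
        bottom = np.hstack([c[:, :, i].T, eye - d[:, :, i]])
        rhs = -np.concatenate([chi[:, i], phi[:, i]])
        sol = np.linalg.solve(np.vstack([top, bottom]), rhs)
        F[:, i], G[:, i] = sol[:size], sol[size:]
    k_diag = np.sum(F * chi - G * phi, axis=0)   # K(x,x) from (11.5.26)
    q = 2.0 * np.gradient(k_diag, x)             # (11.2.7) with p = 0
    h_new = k_diag[0]                            # (11.2.11) with h = 0
    return F, G, k_diag, q, h_new

x = np.linspace(0.0, np.pi, 401)
mu = [0.3, 2.4]          # hypothetical data; mu_N must lie below lambda_{N+1} = 6.25
psi0 = [0.8, 0.75]       # hypothetical end values psi_n(0)
F, G, k_diag, q, h_new = gelfand_levitan(mu, psi0, x)
print(h_new, q[:3])
```

If the data coincide with those of the base system ($\mu_n = \lambda_n$, $\psi_n(0) = \sqrt{2/\pi}$), then $F(x,y) = 0$ and the sketch returns $K = 0$, $q = 0$, which is a convenient sanity check. The numerical differentiation in Step 4 is the weak point noted in the text.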
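Exercises 11.6

1. Show that equations (11.2.12), (11.6.18) imply
$$\frac1\pi\Big(h' + H' + \frac12\int_0^\pi q(t)\,dt\Big) = \frac1\pi\Big(h + H + \frac12\int_0^\pi p(t)\,dt\Big),$$
i.e., the constant $c$ in the asymptotic form of the eigenvalues is the same for the new system and the base system. Since we took $\mu_n = \lambda_n$ for $n = N+1, N+2, \ldots$, this equation must hold; see the asymptotic form (10.9.22).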
11.7 An inverse problem for the vibrating rod
The inversion procedure that we have described so far has been for the Sturm-Liouville equation (11.2.1). As we have already pointed out, this is not the basic equation for vibrating systems. In this section, at the risk of repetition, we show how the ideas behind the S-L inversion may be adapted to the rod equation (11.1.3). We start with a base problem
$$(A(x)u'(x))' + \lambda A(x)u(x) = 0, \qquad 0 \le x \le \pi, \tag{11.7.1}$$
write $A(x) = a^2(x)$, and scale the independent variable $x$ so that $0 \le x \le \pi$. The eigenfunctions $u_n(x)$ of (11.7.1) subject to some end conditions yet to be described, are orthonormal with weight function $a^2(x)$, i.e.,
$$\int_0^\pi a^2(x)u_m(x)u_n(x)\,dx = \delta_{mn},$$
so that the functions !q ({) = d({)xq ({)>
q = 0> 1> = = =
form a c.o.s. Provided that d({) 5 F 2 (0> )>
(11.7.2)
!({) = d({)x({) satisfies
!00 + ( s)! = 0>
(11.7.3)
s({) = d00 ({)@d({)=
(11.7.4)
where Suppose that the end condition at { = 0 is !0 (0) k!(0) = 0>
(11.7.5)
then the corresponding end condition for x({) is d(0)x0 (0) + (d0 (0) kd(0))x(0) = 0= Without loss of generality we may choose the base system so that d(0) = 1>
d0 (0) kd(0) = 0=
(11.7.6)
This means d({) is the solution of (11.7.3) for = 0, that satisfies the end condition (11.7.5) and !(0) = 1, and the base rod is free at { = 0. The rod that is to be constructed is governed by (E({)y0 ({))0 + E({)y({) = 0>
0 { =
(11.7.7)
Write $B(x) = b^2(x)$, $\psi(x) = b(x)v(x)$, then
\[ \psi'' + (\lambda - q)\psi = 0, \tag{11.7.8} \]
where
\[ q(x) = b''(x)/b(x). \tag{11.7.9} \]
We now use the operator $K$ to link $\psi$ to $\phi$:
\[ \psi(x) = \phi(x) + \int_0^x K(x,t)\phi(t)\,dt, \tag{11.7.10} \]
i.e.,
\[ b(x)v(x) = a(x)u(x) + \int_0^x K(x,t)a(t)u(t)\,dt. \tag{11.7.11} \]
As we know from Section 11.2, this operator transforms the solution of (11.7.3) satisfying $\phi(0) = 1$ and (11.7.5) into a solution of (11.7.8) satisfying $\psi(0) = 1$ and
\[ \psi'(0) - (h + K(0,0))\psi(0) = 0. \tag{11.7.12} \]
This last condition is equivalent to
\[ b(0)v'(0) + \{b'(0) - (h + K(0,0))b(0)\}v(0) = 0. \tag{11.7.13} \]
If we choose the new system so that
\[ b'(0) - (h + K(0,0))b(0) = 0, \quad b(0) = 1, \tag{11.7.14} \]
then $v'(0) = 0$: the new rod is free at $x = 0$. In this case $b(x)$ is the solution of (11.7.8) for $\lambda = 0$ satisfying the conditions (11.7.14). Thus $b(x)$ is related to $a(x)$ by equation (11.7.10):
\[ b(x) = a(x) + \int_0^x K(x,t)a(t)\,dt. \tag{11.7.15} \]
In particular, if we choose $h = 0$, $a(x) = 1$, then
\[ b(x) = 1 + \int_0^x K(x,t)\,dt. \tag{11.7.16} \]
The remainder of the analysis is as before: $K(x,y)$ satisfies
\[ K(x,y) + \int_0^x K(x,t)F(t,y)\,dt + F(x,y) = 0, \quad 0 \le y \le x \le \pi, \tag{11.7.17} \]
where
\[ F(x,y) = a(x)a(y)\sum_{n=0}^{N}\big(w_n(x)w_n(y) - u_n(x)u_n(y)\big). \tag{11.7.18} \]
Here $\chi_n(x) = a(x)w_n(x)$ is the solution of (11.7.3) with $\lambda = \mu_n$ satisfying $\chi_n(0) = \psi_n(0)$, $\chi_n'(0) = h\chi_n(0)$. This means that $w_n(0) = v_n(0)$, $w_n'(0) = 0$. Here $(\mu_n)_0^N$ are the eigenvalues of the new system, $v_n(0)$ is the end value of the corresponding normalised eigenfunction $v_n(x)$, and $u_n(x)$ is the normalised eigenfunction of (11.7.1). Again we must choose the base system so that $\mu_N < \lambda_{N+1}$. If we make the choice $a(x) = 1$, $h = 0$, $H = \infty$, then this means $\sqrt{\mu_N} = \omega_N < N + \tfrac{3}{2}$. The solution of equation (11.7.17) has the form
\[ K(x,y) = a(x)a(y)\sum_{n=0}^{N}\{F_n(x)w_n(y) - G_n(x)u_n(y)\}, \]
where $F_n(x)$, $G_n(x)$ satisfy
\[ F_m(x) + \sum_{n=0}^{N}\{b_{mn}(x)F_n(x) - c_{mn}(x)G_n(x)\} + w_m(x) = 0, \tag{11.7.19} \]
\[ G_m(x) + \sum_{n=0}^{N}\{c_{nm}(x)F_n(x) - d_{mn}(x)G_n(x)\} + u_m(x) = 0, \tag{11.7.20} \]
and
\[ b_{mn}(x) = \int_0^x a^2(t)w_m(t)w_n(t)\,dt, \quad c_{mn}(x) = \int_0^x a^2(t)w_m(t)u_n(t)\,dt, \quad d_{mn}(x) = \int_0^x a^2(t)u_m(t)u_n(t)\,dt. \]
We note that $d_{mn}(\pi) = \delta_{mn}$.

It is important to note that the $b(x)$ generated by the construction procedure will always be positive. We show this by supposing that $b(x_0) = 0$ for some $x_0 \in [0,\pi]$ and arriving at a contradiction. By analogy with Ex. 11.5.5, we have
\[ a(x)F_n(x) = -\Big\{\chi_n(x) + \int_0^x K(x,y)\chi_n(y)\,dy\Big\}, \tag{11.7.21} \]
\[ a(x)G_n(x) = -\Big\{\phi_n(x) + \int_0^x K(x,y)\phi_n(y)\,dy\Big\}, \tag{11.7.22} \]
where $\chi_n(x) = \phi(x,\mu_n,c_1)$, $\phi_n(x) = \phi(x,\lambda_n,c_2)$, and $\phi(x,\lambda,c)$ denotes the solution of (11.7.3) for $\phi(0) = c$, satisfying (11.7.5). But if $\chi_n$, $\phi_n$ are solutions of (11.7.3), then $a(x)F_n(x)$, $a(x)G_n(x)$ given by (11.7.21), (11.7.22) are solutions of (11.7.8). Thus
\[ a(x)F_n(x) = -b(x)v(x,\mu_n,c_1), \quad a(x)G_n(x) = -b(x)v(x,\lambda_n,c_2), \]
where $v(x,\lambda,c)$ denotes the solution of (11.7.7) satisfying $v(0) = c$, $v'(0) = 0$. Note that $v(x,\mu_n,c_1)$ is an unnormalised eigenfunction of the new system. This means that if $b(x_0) = 0$, then $F_n(x_0) = 0 = G_n(x_0)$ for $n = 0, 1, \dots, N$, and hence, from equations (11.7.19), (11.7.20), $w_n(x_0) = 0 = u_n(x_0)$ for $n = 0, 1, \dots, N$. But $u_n(x)$ is the $n$th eigenfunction of the base system, and when $n = 0$, $u_0(x)$ has no zero in $[0,\pi]$, except possibly at $x = \pi$ when $H = \infty$. Therefore, the only possibility is $x_0 = \pi$, $H = \infty$, and then $w_n(\pi) = 0$, $n = 0, 1, \dots, N$ also. This means that $(\mu_n)_0^N$ are eigenvalues of the base problem and $\mu_n = \lambda_n$, $n = 0, 1, \dots, N$. This contradiction implies $b(x_0) \neq 0$. Since $b(0) = 1$ and $b(x)$ is continuous, we must have $b(x) > 0$ for $x \in [0,\pi]$.

To conclude this section we return to the end conditions. The base problem, in the S-L form (11.7.3), has end conditions
\[ \phi'(0) - h\phi(0) = 0 = \phi'(\pi) + H\phi(\pi). \]
In terms of $u$, these are
\[ u'(0) = 0 = a(\pi)u'(\pi) + (a'(\pi) + Ha(\pi))u(\pi), \]
where we have taken $a'(0) = ha(0)$. The end conditions for the S-L form of the new problem are
\[ \psi'(0) - h'\psi(0) = 0 = \psi'(\pi) + H'\psi(\pi). \]
In terms of $v$, these are
\[ v'(0) = 0 = b(\pi)v'(\pi) + \{b'(\pi) + (H - K(\pi,\pi))b(\pi)\}v(\pi), \]
where we have used (11.6.18) to give $H' = H - K(\pi,\pi)$. We note that while the choices (11.7.6), (11.7.14) for $a(x)$, $b(x)$ make the analysis straightforward, it is not necessary to make these choices. Again, if we know the eigenvalues of the new rod for the end $x = 0$ free and fixed, then we can find $(v_n(0))_0^N$ as in Section 11.6.

For examples of reconstruction, see Gladwell and Dods (1987) [111]. See also Andersson (1988a) [6], (1988b) [7]; for a detailed study of the inverse problem for equation (11.7.1), see Knobel and Lowe (1993) [195]. For the case in which $A(x)$ is rough, see Coleman (1989) [61], Coleman and McLaughlin (1993a) [62], (1993b) [63].
11.8 An inverse problem for the taut string

In Section 10.1, we showed how the three forms of the Sturm-Liouville equation, (10.1.1), (10.1.3) and (10.1.11), were related. In approaching the taut string, it is somewhat easier to start from (10.1.3), the rod, rather than from the standard form (10.1.11). We recall part of the analysis in Section 10.1, and make a few changes in the way we normalise variables. Suppose $u(x)$ satisfies equation (11.7.1), i.e.,
\[ (A(x)u'(x))' + \lambda A(x)u(x) = 0, \]
and the end conditions
\[ u'(0) - hu(0) = 0 = u'(\pi) + Hu(\pi). \]
Scale $A(x)$ so that
\[ \int_0^\pi \frac{dt}{A(t)} = \pi, \]
and introduce a new variable $\xi$ by the equation
\[ \xi = \int_0^x \frac{dt}{A(t)}, \]
so that $0 \le \xi \le \pi$, and $\xi'(x) = 1/A(x)$. Put
\[ u(x) = y(\xi), \quad A(x) = \rho(\xi), \]
then $A(x)u'(x) = \dot{y}(\xi)$, and equation (11.7.1) becomes
\[ \ddot{y}(\xi) + \lambda\rho^2(\xi)y(\xi) = 0. \tag{11.8.1} \]
The end conditions become
\[ \dot{y}(0) - hA(0)y(0) = 0 = \dot{y}(\pi) + HA(\pi)y(\pi). \]
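The change of variable from the rod to the string can be sketched numerically. The following is a minimal illustration, assuming $A(x)$ is available as positive samples on a grid covering $[0,\pi]$; the function name `rod_to_string` and the trapezoidal quadrature are choices made here for illustration, not part of the original analysis.

```python
import numpy as np

def rod_to_string(x, A):
    """Map a rod profile A(x) on [0, pi] to a string profile rho(xi).

    x : increasing grid on [0, pi];  A : positive samples of A(x).
    Returns (xi, rho, scale); scale * A is the rescaled rod profile whose
    integral of dt/A(t) over [0, pi] equals pi.
    """
    x = np.asarray(x, dtype=float)
    A = np.asarray(A, dtype=float)
    # cumulative trapezoidal integral of 1/A from 0 to each grid point
    inv = 1.0 / A
    steps = 0.5 * (inv[1:] + inv[:-1]) * np.diff(x)
    xi = np.concatenate(([0.0], np.cumsum(steps)))
    # rescaling A by a constant leaves the rod equation unchanged
    scale = xi[-1] / np.pi
    xi = xi / scale      # xi now runs from 0 to pi
    rho = scale * A      # rho(xi) = A(x) for the rescaled rod
    return xi, rho, scale
```

For a uniform rod, `rod_to_string(x, np.ones_like(x))` returns `xi` equal to `x` and `rho` identically 1, as expected.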
We note that the new spring constants are scaled versions $hA(0)$, $HA(\pi)$ of the old, but that the end conditions
\[ u'(0) = 0 = u(\pi), \quad u(0) = 0 = u(\pi) \tag{11.8.2} \]
remain invariant:
\[ \dot{y}(0) = 0 = y(\pi), \quad y(0) = 0 = y(\pi). \tag{11.8.3} \]
This means that, under either of these two sets of end conditions, the string has the same eigenvalues as the rod, and in particular the asymptotic forms of the eigenvalues are the same.

If therefore we are given two sequences of eigenvalues $(\mu_n)_0^\infty$, $(\nu_n)_0^\infty$ which purport to be the eigenvalues of a taut string under the two sets of end conditions (11.8.2), we must first scale them (effectively to find the length, $L$, of the string to which they correspond) so that they correspond to a string of length $\pi$. They will then have the asymptotic forms
\[ \mu_n = (n + \tfrac{1}{2})^2 + \alpha_n, \quad \nu_n = (n + 1)^2 + \beta_n. \]
Given $(\mu_n)_0^\infty$, $(\nu_n)_0^\infty$, we then find the end values $y_n(0)$ of the normalised eigenfunctions from the fact that
\[ \sum_{n=0}^{\infty} \frac{y_n^2(0)}{\mu_n - \lambda} = 0 \]
has roots $(\nu_n)_0^\infty$. We use this in truncated form, as in Section 11.6, to find $(y_n(0))_0^N$. We note that
\[ \int_0^\pi \rho^2(\xi)y_n^2(\xi)\,d\xi = \int_0^\pi A(x)w_n^2(x)\,dx = 1 \]
and $y_n(0) = w_n(0)$. This means that we have the data needed to find the new rod $B(x)$ as in (11.7.7). We now reverse the analysis given at the beginning of this section. Thus we scale $B(x)$ so that
\[ \int_0^\pi \frac{dt}{B(t)} = \pi, \]
and then introduce a new variable $\xi$ by
\[ \xi = \int_0^x \frac{dt}{B(t)}, \quad 0 \le \xi \le \pi. \]
The new mass density of the string is $\rho(\xi) = B(x)$.
11.9 Some non-classical methods

In this section we explain in general terms the theory behind some recent approximate methods for inverting the Sturm-Liouville equation. The theory is largely due to Rundell and Sacks (1992a) [292], Rundell and Sacks (1992b) [293]; see also Lowe, Pilant and Rundell (1992) [216] and Rundell (1997) [294]. The distinguishing feature of the methods is that they rely on the hyperbolic equation satisfied by $K(x,y)$, rather than on the Gel'fand-Levitan integral equation. We start by recalling analysis from Section 11.2 onwards. The base problem is
\[ y'' + (\lambda - p)y = 0, \quad 0 \le x \le \pi, \tag{11.9.1} \]
\[ y'(0) - hy(0) = 0 = y'(\pi) + Hy(\pi). \tag{11.9.2} \]
The eigenfunctions $(\phi_n)_0^\infty$ of this problem form a c.o.s. This means that if
\[ f(x) = \sum_{n=0}^{N} a_n\phi_n(x) \]
then $a_n = (f,\phi_n)$. Suppose that $(\mu_n)_0^\infty$ is a new spectrum and, in the notation of (11.5.4), $\chi_n(x) = \phi(x,\mu_n,1)$. If we expand $f(x)$ in terms of these functions:
\[ f(x) = \sum_{n=0}^{N} b_n\chi_n(x), \]
then we can find the $b_n$ from
\[ a_m = (f,\phi_m) = \sum_{n=0}^{N} b_n(\chi_n,\phi_m) = \sum_{n=0}^{N} c_{nm}b_n, \quad m = 0, 1, \dots, N. \tag{11.9.3} \]
We showed in equation (11.5.25) that
\[ (\lambda_m - \mu_n)c_{nm} = (\chi_n'(\pi) + H\chi_n(\pi))\phi_m(\pi). \]
The end value, $\phi_m(\pi)$, is not zero; we are assuming that $H$ is finite. As in Section 11.5, if $\chi_{n_0}'(\pi) + H\chi_{n_0}(\pi) = 0$ for some $n_0$, then $\mu_{n_0}$ is an eigenvalue of the base system with $\chi_{n_0}(x)$ being a (not necessarily normalised) eigenfunction; i.e., $\chi_{n_0}(x) = c\phi_{m_0}(x)$. In that case, equation (11.9.3) with $m = m_0$ yields $a_{m_0} = b_{n_0}$ and there are just $N - 1$ equations for the remaining $b_n$. In any case, we can solve equation (11.9.3) for $b_0, b_1, \dots, b_N$. This is the first result we will use.
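As an illustration of solving (11.9.3), the sketch below takes one concrete base problem, assumed here purely for the example: $p = 0$, $h = 0$, $H = 0$, so that $\lambda_m = m^2$, $\phi_0 = 1/\sqrt{\pi}$, $\phi_m = \sqrt{2/\pi}\cos mx$ and $\chi_n(x) = \cos(\sqrt{\mu_n}\,x)$. It fills $c_{nm}$ from the relation $(\lambda_m - \mu_n)c_{nm} = \chi_n'(\pi)\phi_m(\pi)$ and solves for the $b_n$. The function names are hypothetical, and the sketch assumes every $\mu_n$ is positive and differs from every $\lambda_m$.

```python
import numpy as np

def overlap_matrix(mu, N):
    """c[n, m] = (chi_n, phi_m) for the assumed base problem with p = h = H = 0."""
    m = np.arange(N + 1)
    lam = m.astype(float) ** 2                     # base eigenvalues m^2
    phi_pi = np.where(m == 0, 1.0 / np.sqrt(np.pi),
                      np.sqrt(2.0 / np.pi) * (-1.0) ** m)   # phi_m(pi)
    w = np.sqrt(np.asarray(mu, dtype=float))
    dchi_pi = -w * np.sin(w * np.pi)               # chi_n'(pi) for chi_n = cos(w x)
    return dchi_pi[:, None] * phi_pi[None, :] / (lam[None, :] - w[:, None] ** 2)

def chi_coefficients(a, mu):
    """Solve a_m = sum_n c_nm b_n, m = 0..N, for the coefficients b_n."""
    C = overlap_matrix(mu, len(a) - 1)
    return np.linalg.solve(C.T, np.asarray(a, dtype=float))
```

Given the inner products `a` of some $f$ with $\phi_0, \dots, \phi_N$, `chi_coefficients(a, mu)` returns the coefficients of the expansion of $f$ in the $\chi_n$.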
In the classical method we suppose first that we have two spectra of an S-L equation with (unknown) potential $q(x)$ corresponding to two sets of end conditions
\[ (\mu_n)_0^\infty \ \text{for}\ y'(0) - h'y(0) = 0 = y'(\pi) + H'y(\pi), \]
\[ (\nu_n)_0^\infty \ \text{for}\ y'(0) - h_1'y(0) = 0 = y'(\pi) + H'y(\pi). \tag{11.9.4} \]
Note that the conditions at $x = \pi$ are the same, while those at $x = 0$ are different. We then used equation (11.6.6), or preferably (11.6.11), to find the end values $(\psi_n(0))_0^\infty$ corresponding to (11.9.4). (Of course we introduced the Truncation Assumption, so that we had to find only $(\psi_n(0))_0^N$.) We then introduced the operator $K$ and found $K(x,y)$ so that the normalised eigenfunctions $(\psi_n(x))_0^\infty$ corresponding to (11.9.4) were given by
\[ \chi_n(x) = \phi(x,\mu_n,\psi_n(0)), \quad \psi_n(x) = \chi_n(x) + \int_0^x K(x,y)\chi_n(y)\,dy. \]
The particular $K(x,y)$ that transforms $(\chi_n(x))_0^\infty$ into a c.o.s. $(\psi_n(x))_0^\infty$ is the solution of the Gel'fand-Levitan integral equation (11.5.20). The potential $q(x)$ is given by
\[ q(x) = p(x) + \frac{2\,dK(x,x)}{dx} \]
and the values of $h'$, $H'$ are given by (11.2.11) and (11.6.18):
\[ h' = h + K(0,0), \quad H' = H - K(\pi,\pi). \]
Rundell and Sacks proceed differently. They suppose that we are given two spectra $(\mu_n)_0^N$, $(\nu_n)_0^N$ corresponding to an S-L equation
\[ y'' + (\lambda - q)y = 0 \tag{11.9.5} \]
under two sets of end conditions that differ at $x = \pi$ (not at 0 as in the classical approach):
\[ (\mu_n)_0^N \ \text{for}\ y'(0) - h'y(0) = 0 = y'(\pi) + H_1'y(\pi), \tag{11.9.6} \]
\[ (\nu_n)_0^N \ \text{for}\ y'(0) - h'y(0) = 0 = y'(\pi) + H_2'y(\pi). \tag{11.9.7} \]
Now we find $K(x,y)$ so that
\[ \chi_n(x) = \phi(x,\mu_n,1), \quad \psi_n(x) = \chi_n(x) + \int_0^x K(x,y)\chi_n(y)\,dy \]
give (unnormalised) eigenfunctions of (11.9.5) corresponding to the end conditions (11.9.6), while
\[ \theta_n(x) = \phi(x,\nu_n,1), \quad \eta_n(x) = \theta_n(x) + \int_0^x K(x,y)\theta_n(y)\,dy \]
gives (unnormalised) eigenfunctions of (11.9.5) corresponding to the end conditions (11.9.7). Note that $K(x,y)$ will be given by the theory of Section 11.2, not by that of Section 11.5. This means that $K(x,y)$ will satisfy the hyperbolic equation (11.2.6) and the boundary condition (11.2.10). The basic theory of Section 11.2 states that the transformation operator transforms a solution of the base equation (11.9.1) satisfying (11.9.2a) into a solution of (11.9.5) and (11.9.6a). This means that $(\psi_n(x))_0^N$ will be eigenfunctions of (11.9.5) corresponding to the end conditions (11.9.6) if
\[ \psi_n'(\pi) + H_1'\psi_n(\pi) = 0, \quad n = 0, 1, \dots, N. \tag{11.9.8} \]
Similarly $(\eta_n(x))_0^N$ will be eigenfunctions of (11.9.5) corresponding to the end conditions (11.9.7) if
\[ \eta_n'(\pi) + H_2'\eta_n(\pi) = 0, \quad n = 0, 1, \dots, N. \tag{11.9.9} \]
The problem is therefore to find a solution of the hyperbolic equation (11.2.6) that satisfies equations (11.9.8), (11.9.9). Before considering how to do this we make some preliminary simplifications.

If the given sequences $(\mu_n)_0^\infty$, $(\nu_n)_0^\infty$ are indeed the spectra of some S-L equation (11.9.5) corresponding to (11.9.6), (11.9.7) respectively, then they must have one of the asymptotic forms listed in Section 10.9: (10.9.47) if $h'$ is finite; Exercise 10.9.1 if $h' = \infty$. Assume for the sake of argument that $h' = \infty$; then, by examining the sequences, we can recover $\bar{q}$ from either of the two equations
\[ \lim_{n\to\infty}\Big\{(\mu_n - \bar{q})^{1/2} - \big(n + \tfrac{1}{2}\big)\Big\} = 0 = \lim_{n\to\infty}\Big\{(\nu_n - \bar{q})^{1/2} - \big(n + \tfrac{1}{2}\big)\Big\}. \tag{11.9.10} \]
With this $\bar{q}$, form $q^*(x) = q(x) - \bar{q}$ as in (10.9.44). The starred system has eigenvalues $\mu_n^* = \mu_n - \bar{q}$, $\nu_n^* = \nu_n - \bar{q}$ corresponding to (11.9.6), (11.9.7) respectively. We note that even if the equation (11.9.5) was derived from a physical system with positive eigenvalues, and the limits show that $\mu_n$, $\nu_n$ will eventually exceed $\bar{q}$, there is no guarantee that all the starred quantities $\mu_n^*$, $\nu_n^*$ will be positive.

We now consider the reduced, i.e., starred, system and drop the asterisks. We showed in Section 11.2 that if $h' = \infty$, then we must take a base system with $h = \infty$. Then $K(x,0) = 0$, $0 \le x \le \pi$, so that $K(x,y)$ is continued as an odd function of $y$ into the lower triangle in Figure 11.3.1. For simplicity we take $p(x) = 0$, so that equation (11.2.14) gives $K(\pi,\pi) = 0$. We choose $H = 0$ so that the base system is
\[ \phi'' + \lambda\phi = 0, \quad 0 \le x \le \pi, \quad \phi(0) = 0 = \phi'(\pi), \tag{11.9.11} \]
with eigenvalues $\lambda_n = (n + \tfrac{1}{2})^2$, $n = 0, 1, \dots$, and eigenfunctions
\[ \phi_n(x) = \sqrt{\frac{2}{\pi}}\,\sin\big(n + \tfrac{1}{2}\big)x. \]
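We define $\phi(x,\lambda)$ as the solution of (11.9.11) satisfying
\[ \phi(0,\lambda) = 0, \quad \phi'(0,\lambda) = 1. \]
This means that if $\omega = |\lambda|^{1/2}$, then
\[ \phi(x,\lambda) = \frac{\sin\omega x}{\omega} \ \text{if}\ \lambda > 0; \qquad \phi(x,\lambda) = \frac{\sinh\omega x}{\omega} \ \text{if}\ \lambda < 0. \]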
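The two branches of $\phi(x,\lambda)$ can be collected in one small function. This is only a sketch; the name `phi` and the explicit handling of the limiting case $\lambda = 0$ (where the solution of $\phi'' = 0$ with the same initial data is $\phi = x$) are additions made here for completeness.

```python
import numpy as np

def phi(x, lam):
    """Solution of phi'' + lam*phi = 0 with phi(0) = 0, phi'(0) = 1."""
    x = np.asarray(x, dtype=float)
    if lam == 0.0:
        return x.copy()                    # limiting case: phi'' = 0 gives phi = x
    w = np.sqrt(abs(lam))                  # omega = |lambda|^(1/2)
    if lam > 0:
        return np.sin(w * x) / w           # oscillatory branch
    return np.sinh(w * x) / w              # exponential branch for negative lambda
```

The negative branch matters in practice because, as noted above, some of the shifted eigenvalues may be negative.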
We now define
\[ \chi_n(x) = \phi(x,\mu_n), \quad \theta_n(x) = \phi(x,\nu_n), \quad n = 0, 1, \dots, N, \]
and construct $\psi_n(x)$, $\eta_n(x)$, the eigenfunctions of (11.6.3) corresponding to (11.9.6), (11.9.7) respectively, by using the transformation operator $K(x,y)$:
\[ \psi_n(x) = \chi_n(x) + \int_0^x K(x,t)\chi_n(t)\,dt, \tag{11.9.12} \]
\[ \eta_n(x) = \theta_n(x) + \int_0^x K(x,t)\theta_n(t)\,dt. \]
Now consider the equation (11.9.8). Equation (11.9.12) gives
\[ \psi_n(\pi) = \chi_n(\pi) + \int_0^\pi K(\pi,y)\chi_n(y)\,dy \]
and
\[ \psi_n'(\pi) = \chi_n'(\pi) + \int_0^\pi K_x(\pi,y)\chi_n(y)\,dy + K(\pi,\pi)\chi_n(\pi). \]
But $K(\pi,\pi) = 0$, so that
\[ 0 = \psi_n'(\pi) + H_1'\psi_n(\pi) = \chi_n'(\pi) + H_1'\chi_n(\pi) + \int_0^\pi \{K_x(\pi,y) + H_1'K(\pi,y)\}\chi_n(y)\,dy. \tag{11.9.13} \]
This equation gives the inner products of the function
\[ f_1(y) = K_x(\pi,y) + H_1'K(\pi,y) \]
with respect to the $\chi_n(y)$. But knowing these we can use the analysis leading to (11.9.3) to find the inner products with respect to $(\sin nt)_1^{N+1}$, because $\sin nt$ is the solution of the base problem under the Dirichlet conditions $\phi(0) = 0 = \phi(\pi)$. Proceeding in exactly the same way for the second spectrum $(\nu_n)_0^N$, we can find the inner products of the function
\[ f_2(y) = K_x(\pi,y) + H_2'K(\pi,y) \]
with respect to $(\theta_n(y))_0^N$, and hence to $(\sin nt)_1^{N+1}$. By taking multiples of $f_1(y)$ and $f_2(y)$ we find
\[ K(\pi,y) = \sum_{n=1}^{N+1} a_n\sin ny, \quad K_x(\pi,y) = \sum_{n=1}^{N+1} b_n\sin ny. \tag{11.9.14} \]
Note that these expansions give $K(\pi,0) = 0 = K(\pi,\pi)$ and $K_x(\pi,0) = 0$, as required (recall that $K(x,0) \equiv 0$), but they make $K_x(\pi,\pi) = 0$, which is an unnecessary restriction. We recall that when $h = \infty$, $K(x,y)$ is an odd function of $y$, and the expansions (11.9.14) are odd in $y$. Now return first to equation (11.2.7), which states
\[ q(x) = \frac{2\,dK(x,x)}{dx}, \tag{11.9.15} \]
and then to equation (11.3.15), which expresses $K(x,y)$ in terms of $K$ and $K_x$ on the line $x = \pi$, and an integral over the triangle $ABP$ of Figure 11.3.3. When $x = y$, the triangle is as shown in Figure 11.9.1, so that equation (11.3.15) gives
\[ 2K(x,x) = K(\pi,2x-\pi) + K(\pi,\pi) - \int_{2x-\pi}^{\pi} K_x(\pi,t)\,dt + \int_x^\pi\Big\{\int_{2x-s}^{s} q(s)K(s,t)\,dt\Big\}ds. \]

Figure 11.9.1 - The triangle $ABP$ when $x = y$, with vertices $P(x,x)$, $A(\pi,2x-\pi)$ and $B(\pi,\pi)$.

On differentiating w.r.t. $x$ we find
\[ \frac{2\,dK(x,x)}{dx} = 2K_y(\pi,2x-\pi) + 2K_x(\pi,2x-\pi) - 2\int_x^\pi q(s)K(s,2x-s)\,ds. \]
This equation provides the basis for an iterative solution to the problem. Putting
\[ G(x) = 2K_y(\pi,2x-\pi) + 2K_x(\pi,2x-\pi), \]
we use the equation in the form
\[ q_{m+1}(x) = G(x) - 2\int_x^\pi q_m(s)K(s,2x-s)\,ds \tag{11.9.16} \]
to obtain a new value of $q(x)$ from an existing one. Treating the R.H.S. of (11.9.16) as the result of operating on $q_m$ by an operator $T$, we have
\[ q_{m+1} = Tq_m. \]
The potential $q(x)$ is thus sought as a fixed point of the mapping $Tq$. The actual numerical implementation is not our primary concern; for that see, say, Rundell (1997) [294]. In principle we can proceed as follows:

Step 1: Start from some initial approximation, for example $q_0(x) = G(x)$. Put $m = 0$.

Step 2: Solve the Cauchy problem $K_{xx} - K_{yy} - q_mK = 0$ with $K$, $K_x$ given by (11.9.14) on $x = \pi$. This can be done using standard numerical procedures.

Step 3: Form $q_{m+1}(x)$ from equation (11.9.16). Put $m = m + 1$ and return to Step 2 until convergence is achieved.

We have no space to refer to the many other numerical methods; see, for example, Brown, Samko, Knowles and Marletta (2003) [41].
11.10 Some other uniqueness theorems

The fundamental uniqueness theorem in Section 11.4 showed that the potential $q(x)$ and the end constants $h$, $H$ were uniquely determined by two spectra. The crucial step in the analysis was that the completeness of the eigenfunctions meant that equation (11.4.6) implied $K(\pi,y) = 0 = K_x(\pi,y)$ for $0 \le y \le \pi$. But this is the Cauchy data for the hyperbolic equation (11.2.6); since the data is zero, $K(x,y) = 0$ for $0 \le y \le x \le \pi$, and $p(x) = q(x)$, $h_1 = h_2$ and $H_1 = H_2$.

The fundamental uniqueness theorem uses two spectra corresponding to two different end constants at one end. We now show that if just one spectrum is known, then there are various other sets of auxiliary data which will lead to a unique system. As in Section 11.4, we phrase the uniqueness theorem in terms of general end conditions, i.e., end constants that are neither zero nor infinite. The special cases in which one or both are zero or infinite may be covered by straightforward modifications of the argument.

Theorem 11.10.1 Suppose that there were two potentials $p(x), q(x) \in C[0,\pi]$ with the following properties:

i) $y'' + (\lambda - p)y = 0$, $y'(0) - h_1y(0) = 0 = y'(\pi) + H_1y(\pi)$ has the spectrum $(\lambda_n)_0^\infty$ and eigenfunctions $(\phi_n(x))_0^\infty$;

ii) $y'' + (\lambda - q)y = 0$, $y'(0) - h_2y(0) = 0 = y'(\pi) + H_2y(\pi)$ has the same spectrum $(\lambda_n)_0^\infty$ and eigenfunctions $(\psi_n(x))_0^\infty$;

and one of the following properties holds:

iii) $\dfrac{\psi_n(\pi)}{\psi_n(0)} = \dfrac{\phi_n(\pi)}{\phi_n(0)}$, $n = 0, 1, 2, \dots$

iv) $\dfrac{\psi_n'(\pi)}{\psi_n'(0)} = \dfrac{\phi_n'(\pi)}{\phi_n'(0)}$, $n = 0, 1, 2, \dots$

v) $\dfrac{\psi_n^2(\alpha)}{\int_0^\pi \psi_n^2(x)\,dx} = \dfrac{\phi_n^2(\alpha)}{\int_0^\pi \phi_n^2(x)\,dx}$, $\alpha = 0$ or $\pi$, $n = 0, 1, 2, \dots$

vi) $\dfrac{\psi_n'^2(\alpha)}{\int_0^\pi \psi_n^2(x)\,dx} = \dfrac{\phi_n'^2(\alpha)}{\int_0^\pi \phi_n^2(x)\,dx}$, $\alpha = 0$ or $\pi$, $n = 0, 1, 2, \dots$

vii) $p(x) = p(\pi - x)$, $q(x) = q(\pi - x)$, $h_1 = H_1$, $h_2 = H_2$,

then $p(x) = q(x)$, $h_1 = h_2$, $H_1 = H_2$.

Proof. $\psi_n(x)$ is related to $\phi_n(x)$ by
\[ \psi_n(x) = \phi_n(x) + \int_0^x K(x,y)\phi_n(y)\,dy \tag{11.10.1} \]
so that
\[ \psi_n'(x) = \phi_n'(x) + K(x,x)\phi_n(x) + \int_0^x K_x(x,y)\phi_n(y)\,dy, \tag{11.10.2} \]
\[ \psi_n(0) = \phi_n(0), \]
\[ \psi_n'(0) - h_2\psi_n(0) = \phi_n'(0) - h_1\phi_n(0) + \{K(0,0) + h_1 - h_2\}\phi_n(0), \]
\[ \psi_n'(\pi) + H_2\psi_n(\pi) = \phi_n'(\pi) + H_1\phi_n(\pi) + \{K(\pi,\pi) + H_2 - H_1\}\phi_n(\pi) + \int_0^\pi \{K_x(\pi,y) + H_2K(\pi,y)\}\phi_n(y)\,dy. \tag{11.10.3} \]
Since i) and ii) have the same spectrum,
\[ h_1 + H_1 + \frac{1}{2}\int_0^\pi p(x)\,dx = h_2 + H_2 + \frac{1}{2}\int_0^\pi q(x)\,dx, \]
but
\[ K(x,x) = h_2 - h_1 + \frac{1}{2}\int_0^x \{q(t) - p(t)\}\,dt \]
so that $K(0,0) + h_1 - h_2 = 0 = K(\pi,\pi) + H_2 - H_1$. Since $\psi_n'(\pi) + H_2\psi_n(\pi) = 0 = \phi_n'(\pi) + H_1\phi_n(\pi)$, equation (11.10.3) implies $K_x(\pi,y) + H_2K(\pi,y) = 0$ as in (11.4.7). Now bring in the extra information:

iii) Since $\psi_n(0) = \phi_n(0)$, we have $\psi_n(\pi) = \phi_n(\pi)$ and thus, from (11.10.1),
\[ \int_0^\pi K(\pi,y)\phi_n(y)\,dy = 0, \quad n = 0, 1, 2, \dots \]
and $K(\pi,y) = 0$, so that $K_x(\pi,y) = 0$, and the conclusion follows as before.

iv) Expressing $\psi_n'(\pi)$ and $\psi_n'(0)$ in terms of $\phi_n$, we find, after some manipulations, that
\[ (h_1H_2 - h_2H_1)\phi_n(0)\phi_n(\pi) = \phi_n'(0)\int_0^\pi K_x(\pi,y)\phi_n(y)\,dy. \]
Let $n \to \infty$; then the Riemann-Lebesgue Lemma states that
\[ \lim_{n\to\infty}\int_0^\pi K_x(\pi,y)\phi_n(y)\,dy = 0. \]
Thus $h_1H_2 - h_2H_1 = 0$, and $K_x(\pi,y) = 0$, and we proceed as before.

v) We need to get an expression for $\int_0^\pi \phi_n^2(x)\,dx = \rho_n$. We show that two eigenfunctions satisfying i) are orthogonal, i.e., $\int_0^\pi \phi_m(x)\phi_n(x)\,dx = 0$, by taking the two equations
\[ \phi_n'' + (\lambda_n - p)\phi_n = 0 = \phi_m'' + (\lambda_m - p)\phi_m, \]
multiplying the first by $\phi_m$, the second by $\phi_n$, subtracting the resulting equations and integrating over $(0,\pi)$. To find $\rho_n$, we need to take the equation $\phi'' + (\lambda - p)\phi = 0$ for $\lambda_n$ and for another $\lambda$ infinitesimally close to it. We proceed as follows. Let $\phi = \phi(x,\lambda,c)$ be the solution of
\[ \phi'' + (\lambda - p)\phi = 0, \quad \phi'(0) - h_1\phi(0) = 0, \quad \phi(0) = c. \tag{11.10.4} \]
Then, on letting $\dot{} = \partial/\partial\lambda$, we find
\[ \dot\phi'' + (\lambda - p)\dot\phi + \phi = 0, \quad \dot\phi'(0) - h_1\dot\phi(0) = 0, \quad \dot\phi(0) = 0. \tag{11.10.5} \]
Multiplying (11.10.4a) by $\dot\phi$, (11.10.5a) by $\phi$, subtracting and integrating over $(0,\pi)$, and putting $\lambda = \lambda_n$, so that $\phi = \phi_n$, we find
\[ \big[\dot\phi_n\phi_n' - \dot\phi_n'\phi_n\big]_0^\pi = \int_0^\pi \phi_n^2\,dx. \]
At the lower limit, the L.H.S. is zero; at the upper limit it gives
\[ -(\dot\phi_n'(\pi) + H_1\dot\phi_n(\pi))\phi_n(\pi) = \int_0^\pi \phi_n^2(x)\,dx. \tag{11.10.6} \]
We may carry out the same calculation for $\psi_n(x)$, and find
\[ -(\dot\psi_n'(\pi) + H_2\dot\psi_n(\pi))\psi_n(\pi) = \int_0^\pi \psi_n^2(x)\,dx. \tag{11.10.7} \]
Since $K(x,y)$ is independent of $\lambda$, equations (11.10.1), (11.10.2), when differentiated w.r.t. $\lambda$, give
\[ \dot\psi_n'(\pi) + H_2\dot\psi_n(\pi) = \dot\phi_n'(\pi) + H_1\dot\phi_n(\pi). \tag{11.10.8} \]
Thus v) with (11.10.6), (11.10.7) yield
\[ \frac{\psi_n(\pi)}{\phi_n(\pi)} = \frac{\psi_n^2(\alpha)}{\phi_n^2(\alpha)}, \quad n = 0, 1, \dots \]
If $\alpha = 0$, then $\psi_n(0) = \phi_n(0)$ yields $\psi_n(\pi) = \phi_n(\pi)$; if $\alpha = \pi$, then again $\psi_n(\pi) = \phi_n(\pi)$. But if $\psi_n(\pi) = \phi_n(\pi)$, then equation (11.10.1) shows that
\[ \int_0^\pi K(\pi,y)\phi_n(y)\,dy = 0, \]
so that $K(\pi,y) = 0$, and we proceed as before.

vi) On using (11.10.6)-(11.10.8) we find
\[ \frac{\psi_n'^2(\alpha)}{\phi_n'^2(\alpha)} = \frac{\psi_n(\pi)}{\phi_n(\pi)}. \]
If $\alpha = 0$, then the L.H.S. is $h_2^2/h_1^2$. Thus
\[ \psi_n(\pi) = \phi_n(\pi) + \int_0^\pi K(\pi,y)\phi_n(y)\,dy = \big(h_2^2/h_1^2\big)\phi_n(\pi). \]
Again, $K(\pi,y) = 0$. If $\alpha = \pi$, then $\psi_n(\pi) = \big(H_1^2/H_2^2\big)\phi_n(\pi)$ and again $K(\pi,y) = 0$.

vii) The potential and the end conditions are invariant under the transformation $x \to \pi - x$. Thus, all the eigenfunctions must be either symmetric or antisymmetric about $x = \pi/2$. More precisely, $\phi_{2n}(x)$, $\psi_{2n}(x)$ are symmetric while $\phi_{2n+1}(x)$, $\psi_{2n+1}(x)$ are antisymmetric. Thus
\[ \frac{\psi_n(\pi)}{\psi_n(0)} = \pm 1 = \frac{\phi_n(\pi)}{\phi_n(0)} \]
so that this is a special case of iii).

Corresponding to each of the sets of auxiliary data in iii)-vii) we may devise a way to estimate $K(\pi,y)$ and $K_x(\pi,y)$ for $0 \le y \le \pi$. We may then proceed as in Section 11.9 to construct the potential.

Hochstadt and Lieberman (1978) [181] considered the problem of determining $q(x)$ in $[0,\pi/2]$ from knowledge of $q(x)$ in $[\pi/2,\pi]$ and one spectrum, say that for the Dirichlet end conditions $y(0) = 0 = y(\pi)$. The non-classical method described in Section 11.9 lends itself well to this problem. Suppose the Dirichlet spectrum is $(\lambda_n)_0^\infty$; it must have the asymptotic form (10.9.41), i.e.,
\[ \sqrt{\lambda_n} = \omega_n = n + 1 + \frac{c}{n+1} + o(n^{-1}), \]
where
\[ c = \frac{1}{2\pi}\int_0^\pi q(x)\,dx \]
is known. Without loss of generality we take $p(x) = 0$ in the base problem. Let $\chi_n(x)$ be the solution of
\[ \chi_n'' + \lambda_n\chi_n = 0, \quad \chi_n(0) = 0, \quad \chi_n'(0) = 1, \]
and let
\[ \psi_n(x) = \chi_n(x) + \int_0^x K(x,y)\chi_n(y)\,dy. \]
The equation $\psi_n(\pi) = 0$ is
\[ \chi_n(\pi) + \int_0^\pi K(\pi,y)\chi_n(y)\,dy = 0, \]
which yields $K(\pi,y)$, $0 \le y \le \pi$. The kernel $K$ satisfies $K(x,0) = 0$, and
\[ K(x,x) = \frac{1}{2}\int_0^x q(t)\,dt = \frac{1}{2}\int_0^\pi q(t)\,dt - \frac{1}{2}\int_x^\pi q(t)\,dt. \]
Since $c$ is known, and $q(x)$ is known for $x \ge \pi/2$, so is $K(x,x)$ for $\pi/2 \le x \le \pi$.

We need to recall the arguments we used in Section 11.3. We considered the Goursat problem in which $u(x,y)$ is known on the two characteristics $x = \pm y$, for $0 \le x \le \pi$, and we showed that $u(x,y)$ is uniquely determined. Under Dirichlet conditions the kernel $K(x,y) = u(x,y)$ is an odd function of $y$, so that $K(x,0) = 0$. The uniqueness result is therefore that $K(x,y)$ is determined in the region $0 \le y \le x \le \pi$ if it is known on the two parts $y = 0$, $0 \le x \le \pi$, and $x = y$, $0 \le x \le \pi$, of the boundary. But we can argue just as in Section 11.3 that if $K(x,y)$ is known, as indicated by the asterisks, on the two parts $x = \pi$, $0 \le y \le \pi$, and $\pi/2 \le x = y \le \pi$ of the boundary of the shaded region in Figure 11.10.1a, then it is known in that region. That means that we can find $K(x,y)$ on the third part of the boundary: $y = \pi - x$, $\pi/2 \le x \le \pi$. Now consider the new shaded region in Figure 11.10.1b. The kernel $K$ is known on the two parts $y = 0$ and $y = \pi - x$, for $\pi/2 \le x \le \pi$, of the boundary, again indicated by asterisks; therefore it is known throughout that shaded region, and therefore $K(\pi/2,y)$ and $K_x(\pi/2,y)$ are known for $0 \le y \le \pi/2$. Finally, we consider a Cauchy problem for the shaded region in Figure 11.10.1c; $K$ and $K_x$ are known on $x = \pi/2$, so that $K$ is known throughout. Thus $K(x,x)$ is known for $0 \le x \le \pi/2$, and
\[ q(x) = \frac{2\,dK(x,x)}{dx}. \]
Figure 11.10.1 - Boundary value problems on 3 triangles: panels a), b) and c), drawn in the $x$, $y$ plane, with asterisks marking the parts of the boundary on which $K$ is known.
11.11 Reconstruction from the impulse response

In this section we describe analysis, derived by Gopinath and Sondhi, by which $A(x)$ can be reconstructed from the impulse response $\hat{h}(0,t)$ of Section 10.10. See Gopinath and Sondhi (1970) [136], (1971) [137], Sondhi and Gopinath (1971) [308] and Sondhi (1984) [309]. Suppose a unit impulse is applied to the free end, $x = 0$, at time $t = 0$. It is intuitively clear that the response $\hat{h}(0,t)$ at the end of the rod at time $t$ is independent of the shape of the rod for $x > \ell$, where $\ell = t/2$. This is because any effect on $\hat{h}(0,t)$ due to the shape for $x > \ell$ would not be felt until after time $t$, the time taken for a disturbance moving with (scaled) speed 1 to reach $x = \ell$ and return. Sondhi and Gopinath demonstrate the converse, namely that knowledge of $\hat{h}(t)$ for $0 \le t \le 2$ is sufficient (and necessary) for the determination of $A(x)$ for $0 \le x \le 1$.

The solution is based upon the following observation. Suppose the rod is at rest at time $t = t_0$, i.e., $v(x,t_0) = 0 = p(x,t_0)$ for $0 \le x \le 1$, and a force is applied at the free end $x = 0$. At time $t = t_0 + a$ the rod will still be at rest for $x \ge a$, because the scaled wave speed is 1. Integrating the first of equations (10.10.10), we obtain
\[ A(x)\big[v(x,t)\big]_{t_0}^{t_0+a} = A(x)v(x,t_0+a) = -\int_{t_0}^{t_0+a}\frac{\partial p}{\partial x}\,dt \]
and on a second integration, w.r.t. $x$, we find
\[ \int_0^a A(x)v(x,t_0+a)\,dx = \int_{t_0}^{t_0+a} p(0,t)\,dt. \tag{11.11.1} \]
If now, for every $a$, we could find a force $p(0,t)$ such that $v(x,t_0+a) = 1$ for $0 \le x \le a$ then, for that case, equation (11.11.1) would give
\[ \int_0^a A(x)\,dx = \int_{t_0}^{t_0+a} p(0,t)\,dt. \tag{11.11.2} \]
Figure 11.11.1 - The region in the $x$, $t$ plane.

Thus the integral of $A(x)$, and hence $A(a)$, would be determined as a function of $a$. We now show that such a force exists, and can be determined from a knowledge of $\hat{h}(0,t)$.

If $v(x,t)$, $p(x,t)$ satisfy equation (10.10.10), then so do $v(x,-t)$, $-p(x,-t)$ and, by superposition,
\[ V(x,t) = v(x,t) + v(x,-t), \quad P(x,t) = p(x,t) - p(x,-t). \]
Trivially, $V(x,t) = 2$, $P(x,t) = 0$ is such a solution. The analysis of the Cauchy Problem in Section 11.3 states that this is the unique solution in the triangular region in Figure 11.11.1 which satisfies the conditions $V(0,t) = 2$, $P(0,t) = 0$ for $-a \le t \le a$. Thus if $p(0,t)$ is such that $V(0,t) = 2$, $P(0,t) = 0$ for $-a \le t \le a$, then everywhere in the triangle, $V(x,t) = 2$, $P(x,t) = 0$. In particular, when $t = 0$, $V(x,0) = 2v(x,0) = 2$ implies $v(x,0) = 1$; this gives the $v(x,t)$ required in equation (11.11.2) if $t_0$ is taken to be $-a$. To find the required pressure $p(0,t)$, we note that since the rod is at rest at $t = -a$, equation (10.10.11) gives
\[ v(0,t) = \int_{-a}^{t}\hat{h}(0,t-\tau)p(0,\tau)\,d\tau \]
so that if $v(0,t) + v(0,-t) = 2$ then
\[ \int_{-a}^{t}\hat{h}(0,t-\tau)p(0,\tau)\,d\tau + \int_{-a}^{-t}\hat{h}(0,-t-\tau)p(0,\tau)\,d\tau = 2. \]
The solution of this equation depends on $a$; we therefore write
\[ p(0,\tau) := f(a,\tau). \]
Now using the fact that $p(0,\tau)$ is even in $\tau$, and that equation (10.10.14) yields
\[ \hat{h}(t) = \delta(t) + h(t), \]
we find
\[ f(a,t) + \frac{1}{2}\int_{-a}^{a} h(|t-\tau|)f(a,\tau)\,d\tau = 1. \tag{11.11.3} \]
Once $f(a,\tau)$ is known, equation (11.11.2) gives
\[ \int_0^a A(x)\,dx = \int_0^a f(a,\tau)\,d\tau. \]
Equation (11.11.3) may be written in operator form
\[ (I + H_a)f(a,t) = 1. \]
Sondhi and Gopinath show that if $\hat{h}(t)$ is the impulse response of an actual rod, then the operator $I + H_a$ will be positive definite, so that equation (11.11.3) will have a unique solution. They show moreover that the corresponding $A(a)$ will be positive, provided that it is continuous. In addition, they show that if $I + H_a$ is positive definite, then there is a rod (i.e., an $A(x)$) which has this impulse response.

We now apply the procedure to the problem of Section 11.6, i.e., the reconstruction of a rod which has, from some index on, the same eigenvalues and end values of normalised eigenfunctions as the uniform rod, i.e.,
\[ \omega_i = \omega_i^0, \quad u_i(0) = u_i^0(0), \quad i = n+1, n+2, \dots \]
where
\[ \omega_i^0 = \frac{(2i+1)\pi}{2}, \quad [u_i^0(0)]^2 = 2. \]
In this case $h(t)$ is given by Ex. 10.10.2, and the kernel $h(|t-\tau|)$ is degenerate. Since $f(a,t)$ is even in $t$, we may write equation (11.11.3) as
\[ f(a,t) + \frac{1}{2}\int_0^a \{h(|t-\tau|) + h(|t+\tau|)\}f(a,\tau)\,d\tau = 1, \quad 0 \le t \le a. \tag{11.11.4} \]
Now the kernel is
\[ H(t,\tau) = \frac{1}{2}\{h(|t-\tau|) + h(|t+\tau|)\} = \sum_{i=0}^{n}\big\{[u_i(0)]^2\cos\omega_it\cos\omega_i\tau - [u_i^0(0)]^2\cos\omega_i^0t\cos\omega_i^0\tau\big\}. \]
Since the kernel is degenerate, the solution may be found by a straightforward matrix inversion. Thus equation (11.11.4) gives
\[ f(a,t) = 1 + \sum_{i=0}^{n}\big\{a_i(a)[u_i(0)]^2\cos\omega_it - b_i(a)[u_i^0(0)]^2\cos\omega_i^0t\big\}, \]
and on substituting this into equation (11.11.4) we find
\[ a_i + \int_0^a \cos\omega_i\tau\,d\tau + \sum_{j=0}^{n}(b_{ij}a_j - c_{ij}b_j) = 0, \]
\[ b_i + \int_0^a \cos\omega_i^0\tau\,d\tau + \sum_{j=0}^{n}(c_{ji}a_j - d_{ij}b_j) = 0, \]
where
\[ b_{ij} = [u_j(0)]^2\int_0^a \cos\omega_i\tau\cos\omega_j\tau\,d\tau, \quad c_{ij} = [u_j^0(0)]^2\int_0^a \cos\omega_i\tau\cos\omega_j^0\tau\,d\tau, \quad d_{ij} = [u_j^0(0)]^2\int_0^a \cos\omega_i^0\tau\cos\omega_j^0\tau\,d\tau. \]
Once $f(a,t)$ has been found, $A(x)$ may be computed from equation (11.11.2). This completes the inversion.

This analysis has intimate connections to the whole area of inverse scattering; see for example Burridge (1980) [45], Bube and Burridge (1983) [43], Landau (1983) [204], Bruckstein and Kailath (1987) [42], Chadan and Sabatier (1989) [52]. Further references may be found in Gladwell (1993) [120].
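The degenerate-kernel solution of (11.11.4) can be sketched as a small linear solve. The code below is an illustration under stated assumptions, not the formulation given above. It works with the unknowns $\beta_k = \int_0^a g_k(\tau)f(a,\tau)\,d\tau$, where $g_k$ runs over all the cosines in the kernel, rather than with the coefficients $a_i$, $b_i$. The integrals are taken by the trapezoidal rule, and the names `solve_f` and `_trap` are hypothetical.

```python
import numpy as np

def _trap(y, t):
    """Trapezoidal rule for samples y on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def solve_f(a, omega, U, omega0, U0, npts=2001):
    """Solve f(t) + int_0^a H(t, tau) f(tau) dtau = 1 on [0, a], where
    H(t, tau) = sum_i U_i cos(w_i t) cos(w_i tau) - U0_i cos(w0_i t) cos(w0_i tau)
    and U_i = [u_i(0)]^2, U0_i = [u0_i(0)]^2.
    Returns (t, f, area); area = int_0^a f dtau estimates int_0^a A(x) dx.
    """
    t = np.linspace(0.0, a, npts)
    freqs = np.concatenate([omega, omega0]).astype(float)
    s = np.concatenate([U, -np.asarray(U0, dtype=float)])    # signed weights
    g = np.cos(np.outer(freqs, t))                           # g_k(t) = cos(freq_k t)
    G = np.array([[_trap(gk * gj, t) for gj in g] for gk in g])
    m = np.array([_trap(gk, t) for gk in g])
    # f = 1 - sum_k s_k beta_k g_k  with  (I + G diag(s)) beta = m
    beta = np.linalg.solve(np.eye(len(s)) + G * s[None, :], m)
    f = 1.0 - (s * beta) @ g
    return t, f, _trap(f, t)
```

When the data coincide with those of the uniform rod the kernel vanishes, the routine returns $f = 1$, and the area equals $a$, which corresponds to $A(x) = 1$. Differentiating the area with respect to $a$, for example by finite differences over a range of $a$ values, would then give an estimate of $A(a)$.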
Chapter 12

A Miscellany of Inverse Problems

Symmetry is what we see at a glance; based on the fact that there is no reason for any difference, and based also on the face of man; whence it happens that symmetry is only wanted in breadth, not in height or depth.
Pascal's Pensées, 28

12.1 Constructing a piecewise uniform rod from two spectra
All the uniqueness proofs and construction algorithms described in Chapter 11 relate to the construction of a continuous system (i.e., continuous $q(x)$, $A(x)$ or $\rho(x)$). The basic data is two infinite sequences, which may be two spectra corresponding to different end conditions, or one spectrum and some auxiliary data, as in Theorem 11.10.1. If two finite data sets are given then they are either complemented by using the Truncation Assumption, as in Section 11.5, or the system is approximated numerically, as in Section 11.9. In this section we show how a piecewise uniform rod may be constructed so that it has precisely two given finite spectra; we do not use the Truncation Assumption. Andersson (1990) [8] was the first to provide a constructive algorithm; we follow the analysis given in Gladwell (1991c) [118], which places Andersson's algorithm in the context of inversion algorithms in seismology and transmission line theory; see Bube and Burridge (1983) [43], Bruckstein and Kailath (1987) [42] and Gladwell (1993) [120].

Andersson considered a vibrating rod, i.e., equation (11.1.3), viz.
\[ (A(x)v'(x))' + \omega^2A(x)v(x) = 0 \tag{12.1.1} \]
subject to the end conditions
\[ \text{i)}\ v'(0) = 0 = v'(L); \qquad \text{ii)}\ v(0) = 0 = v'(L); \tag{12.1.2} \]
these correspond to free-free and fixed-free ends respectively. He showed that if there were given $n + 1$ frequencies $(\omega_k)_0^n$ satisfying
\[ 0 = \omega_0 < \omega_1 < \cdots < \omega_n = n\pi/2L, \tag{12.1.3} \]
and such that the even $\omega_j$ were eigenvalues for equation (12.1.1) for i), the odd for ii), then there exists a unique rod with piecewise constant $A(x)$, such that
\[ A(x) = A_j, \quad (j-1)\Delta \le x \le j\Delta, \quad j = 1, 2, \dots, n, \tag{12.1.4} \]
where $\Delta = L/n$, $A_1 = 1$, as shown in Figure 12.1.1.

Figure 12.1.1 - A stepped rod with $n$ segments.

In seismology and transmission line theory, a medium with parameters that are constant over equal intervals of depth $\Delta$, such as (12.1.4), is called a Goupillaud medium. In transmission line theory, as in most inverse scattering problems, the data do not relate to eigenvalues; there are no eigenvalues, or so-called bound states. Instead the data refer to the response to an input. One way of expressing the data uses the reflected wave $U(t)$ at equal intervals $2\Delta$, due to an incoming wave $D(t)$ also sampled at intervals $2\Delta$. One of the fundamental questions is to ask whether a given reflected wave and incoming wave actually correspond to a Goupillaud medium. This is the question: 'Are the data realisable?' The realisability criterion can be phrased by introducing the $Z$-transforms, $U(z)$ and $D(z)$, of $U(t)$ and $D(t)$, and defining the left-reflection function
\[ R(z) = \frac{U(z)}{D(z)} \]
and then putting
\[ f_1(z) = z^{-1}R(z). \]
The realisability criterion is
\[ M(f_1) \equiv \sup_{|z|\le 1}|f_1(z)| \le 1. \tag{12.1.5} \]
Schur (1917) [300] constructed an algorithm to test whether a function $f_1(z)$ satisfies (12.1.5), that is, is bounded by 1 on the unit disc. The algorithm is based on the fundamental Maximum Modulus Principle (see Gilbarg and Trudinger (1977) [102]): the maximum modulus of a function $f(z)$ (of the complex variable $z = x + iy$) which is regular (holomorphic) in a closed region always lies on the boundary of that region. Note that $f(z)$ is said to have a maximum modulus at $z_0$ if $|f(z_0)| \ge |f(z)|$ for all $z$ in some neighbourhood $|z - z_0| < \epsilon$ of $z_0$. An important corollary of the principle is that if $f(z)$ has a maximum modulus at an interior point $z_0$ of a region in which it is regular, then $f(z) = f(z_0)$ throughout the region. Schur's algorithm is based on the fact that if $|\gamma| < 1$, then
$$w = \frac{z - \gamma}{1 - \bar{\gamma} z} \qquad (12.1.6)$$
maps $|z| \le 1$ onto $|w| \le 1$, and $|z| = 1$ onto $|w| = 1$. His algorithm is based on the recurrence
$$f_j(z) = \frac{1}{z} \cdot \frac{f_{j-1}(z) - \gamma_j}{1 - \bar{\gamma}_j f_{j-1}(z)}, \qquad j = 2, 3, \ldots \qquad (12.1.7)$$
where $\gamma_j = f_{j-1}(0)$. Suppose $M(f_{j-1}) \le 1$. There are two possibilities: either $|\gamma_j| = 1$, in which case the condition $M(f_{j-1}) \le 1$ and the maximum modulus principle force $f_{j-1}(z) = \gamma_j$, so that the sequence terminates at $f_{j-1}(z)$; or $|\gamma_j| < 1$, in which case $M(f_j) \le 1$. Thus the condition (12.1.5) used with the recurrence (12.1.7) leads to a finite or infinite sequence $\gamma_2, \gamma_3, \ldots$ with the property $|\gamma_j| \le 1$, where the inequality is strict except possibly for the last one. We note in particular that if $M(f_1) = 1$, then $M(f_j) = 1$ for all $j$, and if the sequence terminates at $j = n+1$, it will do so with $|f_n| = 1$.

Now we formulate the vibration problem so that we obtain a recurrence of the form (12.1.7). First we replace equation (12.1.1) by two coupled first-order equations, namely
$$v'(x) = i\omega p(x)/A(x), \qquad p'(x) = i\omega A(x)v(x).$$
Note that $i\omega p(x) = A(x)v'(x)$, so that it is $v(x)$ and $p(x)$ that are continuous at a point at which $A(x)$ is discontinuous. Put $\eta(x) = \{A(x)\}^{1/2}$ and define down and up quantities
$$D = \tfrac{1}{2}(\eta v + \eta^{-1} p), \qquad U = \tfrac{1}{2}(\eta v - \eta^{-1} p).$$
These satisfy the equations
$$D' = i\omega D + \eta'\eta^{-1} U, \qquad U' = -i\omega U + \eta'\eta^{-1} D,$$
so that if $A(x)$ = constant, then $\eta' = 0$ and
$$D' = i\omega D, \qquad U' = -i\omega U \qquad (12.1.8)$$
which have the solutions
$$D = D_0 \exp(i\omega x), \qquad U = U_0 \exp(-i\omega x). \qquad (12.1.9)$$
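Suppose $A(x)$ has the form (12.1.4). Define the quantities
$$D_j = D(j\Delta+), \qquad D_j' = D(j\Delta-), \qquad U_j = U(j\Delta+), \qquad U_j' = U(j\Delta-), \qquad (12.1.10)$$
where $+$ or $-$ indicates a value just to the right or left of $j\Delta$, respectively (a prime on $D_j$ or $U_j$ marks the left-hand value, not a derivative). Equations (12.1.9) show that
$$D_j' = \exp(i\omega\Delta) D_{j-1}, \qquad U_j' = \exp(-i\omega\Delta) U_{j-1}.$$
Put $\exp(i\omega\Delta) = z^{1/2}$, then
$$\begin{bmatrix} D_j' \\ U_j' \end{bmatrix} = \begin{bmatrix} z^{1/2} & 0 \\ 0 & z^{-1/2} \end{bmatrix} \begin{bmatrix} D_{j-1} \\ U_{j-1} \end{bmatrix}. \qquad (12.1.11)$$
Let
$$H_j = \frac{1}{2} \begin{bmatrix} \eta_j & \eta_j^{-1} \\ \eta_j & -\eta_j^{-1} \end{bmatrix},$$
where $\eta_j = A_j^{1/2}$; then equation (12.1.8) and the continuity of $v$ and $p$ across a discontinuity of $A(x)$ give
$$\begin{bmatrix} D_{j-1} \\ U_{j-1} \end{bmatrix} = H_j \begin{bmatrix} v_{j-1} \\ p_{j-1} \end{bmatrix}, \qquad \begin{bmatrix} D_{j-1}' \\ U_{j-1}' \end{bmatrix} = H_{j-1} \begin{bmatrix} v_{j-1} \\ p_{j-1} \end{bmatrix}$$
so that
$$\begin{bmatrix} D_{j-1} \\ U_{j-1} \end{bmatrix} = H_j H_{j-1}^{-1} \begin{bmatrix} D_{j-1}' \\ U_{j-1}' \end{bmatrix}. \qquad (12.1.12)$$
The matrix $\Theta_j = H_j H_{j-1}^{-1}$ may be written
$$\Theta_j = \frac{1}{\sigma_j} \begin{bmatrix} 1 & -\gamma_j \\ -\gamma_j & 1 \end{bmatrix} \qquad (12.1.13)$$
where
$$\sigma_j = (1 - \gamma_j^2)^{1/2}, \qquad \gamma_j = (A_{j-1} - A_j)/(A_{j-1} + A_j). \qquad (12.1.14)$$
We can combine equations (12.1.11), (12.1.12) to obtain
$$\begin{bmatrix} D_j' \\ U_j' \end{bmatrix} = \begin{bmatrix} z^{1/2} & 0 \\ 0 & z^{-1/2} \end{bmatrix} \Theta_j \begin{bmatrix} D_{j-1}' \\ U_{j-1}' \end{bmatrix}. \qquad (12.1.15)$$
Put $U_j'/D_j' = f_j(z)$, then equation (12.1.15) gives
$$f_j(z) = \frac{1}{z} \cdot \frac{f_{j-1}(z) - \gamma_j}{1 - \gamma_j f_{j-1}(z)}, \qquad (12.1.16)$$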
which, since $\gamma_j$ is real ($\gamma_j = \bar{\gamma}_j$), is precisely Schur's recurrence (12.1.7).

Before considering the inverse problem of reconstructing the cross-sections $A_j$ from the spectra, we consider the simpler problem of computing the spectra from the cross-sections. Suppose we are given $(A_j)_1^n$, with $A_1 = 1$, and we wish to find the eigenvalues corresponding to the end conditions i) and ii). Suppose the rod is vibrating with frequency $\omega$ and the condition $v'(L) = 0$ is satisfied; then without loss of generality we can take $v(L) = 1$; then $p_n = 0$, $v_n = 1$, so that $D_n' = \eta_n/2 = U_n'$ and $f_n(z) = 1$. The values of $v(0) = v_0$, $p(0) = p_0$ are related to $D_0$, $U_0$ by
$$D_0 = \tfrac{1}{2}\{\eta_0 v_0 + \eta_0^{-1} p_0\}, \qquad U_0 = \tfrac{1}{2}\{\eta_0 v_0 - \eta_0^{-1} p_0\},$$
where $\eta_0$ denotes the value of $\eta$ at $x = 0+$, so that
$$\frac{\eta_0^2 v_0}{p_0} = \frac{D_0 + U_0}{D_0 - U_0} = \frac{z^{-1/2} D_1' + z^{1/2} U_1'}{z^{-1/2} D_1' - z^{1/2} U_1'} = \frac{1 + g(z)}{1 - g(z)} \qquad (12.1.17)$$
where
$$g(z) = z f_1(z). \qquad (12.1.18)$$
In the forward problem, we are given $f_n(z) = 1$ and the $(\gamma_j)_2^n$ with $|\gamma_j| < 1$. We may thus compute $f_{n-1}(z), f_{n-2}(z), \ldots, f_1(z)$ using the recurrence (12.1.16) in its reverse form:
$$f_{j-1}(z) = \frac{z f_j(z) + \gamma_j}{1 + \gamma_j z f_j(z)}, \qquad j = n, n-1, \ldots, 2. \qquad (12.1.19)$$
The mapping of $z f_j(z)$ onto $f_{j-1}(z)$ has the form (12.1.6). Thus the region $|z f_j(z)| \le 1$ is mapped onto $|f_{j-1}(z)| \le 1$, and $|z f_j(z)| = 1$ is mapped onto $|f_{j-1}(z)| = 1$. But $f_n(z) = 1$, so that each of the $(f_j(z))_1^n$ has $|f_j(z)| = 1$ when $|z| = 1$, i.e., when $\omega$ is real. Thus the function $w = g(z)$ maps $|z| \le 1$ onto $|w| \le 1$, and $|z| = 1$ onto $|w| = 1$. When $g(z)$ is expressed in terms of $z$ it has the form
$$g(z) = z P_{n-1}(z)/Q_{n-1}(z), \qquad (12.1.20)$$
where $P_{n-1}(z)$, $Q_{n-1}(z)$ are polynomials of degree $n-1$. Thus $g(z)$ maps the circle $|z| = 1$ into itself $n$ times. Equation (12.1.19) shows that if $f_j(z^{-1}) = 1/f_j(z)$, then $f_{j-1}(z^{-1}) = 1/f_{j-1}(z)$. But $f_n(z^{-1}) = 1 = 1/f_n(z)$, so that indeed
$$f_j(z^{-1}) = 1/f_j(z), \qquad j = 1, 2, \ldots, n, \qquad (12.1.21)$$
and hence
$$g(z^{-1}) = 1/g(z). \qquad (12.1.22)$$
The mapping of $|z| = 1$ into itself caused by $g(z)$ produces two sets of $n$ points on $|z| = 1$ of significance, namely
$$\mathcal{A} = \{z;\ |z| = 1 \text{ and } g(z) = 1\}, \qquad \mathcal{B} = \{z;\ |z| = 1 \text{ and } g(z) = -1\}.$$
The points in $\mathcal{A}$ correspond to values of $z$ for which, according to (12.1.17), $p_0 = 0$; the $z$ values give values of $\omega$ which are eigenvalues of i). Similarly the $z$ values in $\mathcal{B}$ give $v_0 = 0$, so that $\omega$ corresponds to an eigenvalue of ii). The known interlacing of these two sets of eigenvalues means that the points of $\mathcal{A}$ and $\mathcal{B}$ will interlace on the circle $|z| = 1$. Equation (12.1.22) shows that if $z$ is a member of either set, then $z^{-1} = \bar{z}$ is a member of the same set. Figure 12.1.2 shows the arrangement of the two sets when $n = 2$ and $n = 3$. Since $f_n(1) = 1$, the recurrence (12.1.19) shows that $f_j(1) = 1$ for $j = n, n-1, \ldots, 1$. Thus $g(1) = 1$: the point $1$ is in $\mathcal{A}$. On the other hand $f_j(-1) = (-1)^{n-j}$, so that $g(-1) = (-1)^n$: the point $-1$ is in $\mathcal{A}$ if $n$ is even, in $\mathcal{B}$ if $n$ is odd. It may easily be verified that there are $n+1$ values of $z$ in $\mathcal{A} \cup \mathcal{B}$ which satisfy
$$0 \le \arg(z) \le \pi. \qquad (12.1.23)$$
If these points are $z_k = \exp(i\theta_k)$, where $0 = \theta_0 < \theta_1 < \cdots < \theta_n = \pi$, then $\omega_k = \theta_k/(2\Delta) = n\theta_k/(2L)$, so that $0 = \omega_0 < \omega_1 < \cdots < \omega_n = n\pi/(2L)$. These points $z$, other than $z = \pm 1$, yield $n-1$ points $\bar{z} = z^{-1}$ on the lower half of the circle. Thus the system also has the eigenvalues
$$\omega_{n+j} = \pi/\Delta - \omega_{n-j}, \qquad j = 0, 1, \ldots, n. \qquad (12.1.24)$$
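Since $z = \exp(2i\omega\Delta)$ is a periodic function of $\omega$ with period $\pi/\Delta$, each value of $z$ gives rise to an infinite sequence of eigenvalues with equal spacing $\pi/\Delta$, and each $z^{-1}$ gives another such sequence. Thus the system not only has the eigenvalues $(\omega_j)_0^{2n}$, but also
$$\omega_{mn+j} = \frac{m\pi}{2\Delta} + \omega_j, \qquad j = 0, 1, \ldots, n, \quad \text{if } m \text{ is even}, \qquad (12.1.25)$$
$$\omega_{mn+j} = \frac{(m+1)\pi}{2\Delta} - \omega_{n-j}, \qquad j = 0, 1, \ldots, n, \quad \text{if } m \text{ is odd}. \qquad (12.1.26)$$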
Now consider the inverse problem, that of determining the $\gamma_j$ from the spectrum. We are given $n+1$ eigenvalues $\omega_j$ satisfying (12.1.3). We must use them to construct $g(z)$ and hence $f_1(z)$, and then find the $\gamma_j$ which will lead eventually to $f_n(z) = 1$.
Figure 12.1.2 - The members of $\mathcal{A}$ (×) and $\mathcal{B}$ (∘) interlace on the circle (panels for $n = 2$ and $n = 3$).
First consider the case in which $n$ is even: $n = 2m$. Of the $n + 1 = 2m + 1$ eigenvalues, $m + 1$ are even, corresponding to i), and $m$ are odd, corresponding to ii). The set $\mathcal{A}$ consists of $2m$ points: $z_0 = 1$, $z_{2m} = -1$, and the $m - 1$ pairs $z_{2j}, z_{2j}^{-1}$, $j = 1, 2, \ldots, m-1$. The $2m$ odd $z$'s in $\mathcal{B}$ occur in $m$ pairs $z_{2j-1}, z_{2j-1}^{-1}$, $j = 1, 2, \ldots, m$. Thus equation (12.1.17) gives
$$\frac{\eta_0^2 v_0}{p_0} = \frac{1 + g(z)}{1 - g(z)} = \frac{C \prod_{j=1}^{m} (z - z_{2j-1})(z - z_{2j-1}^{-1})}{(z^2 - 1) \prod_{j=1}^{m-1} (z - z_{2j})(z - z_{2j}^{-1})}, \qquad (12.1.27)$$
so that $g(z) = 1$ when $z$ is a root of the denominator, and $g(z) = -1$ when $z$ is a root of the numerator. The constant $C$ must be chosen so that $g(0) = 0$, i.e., $C = -1$; the numerator of $g(z)$ will thus have no constant term, while the highest powers of $z$, namely $z^{2m}$, in the denominator will cancel, so that $g(z)$ will have the form (12.1.20). Denote the right hand side of equation (12.1.27) by $f(z)$, so that
$$\frac{1 + g(z)}{1 - g(z)} = f(z) = \zeta. \qquad (12.1.28)$$
The function $f$ will map the open, connected region $D = \{z : |z| < 1\}$ into an open, connected region in the $\zeta$-plane. When $|z| = 1$ we can easily verify that $f(z)$ given by (12.1.27) satisfies $\overline{f(z)} = -f(z)$, so that $\bar{\zeta} = -\zeta$: $\zeta$ lies on the imaginary axis. The function $f$ maps $z = 0$ onto $\zeta = 1$, so that we may conclude that $f$ maps $|z| \le 1$ into the right hand half plane, i.e., if $|z| \le 1$, then $\Re\{f(z)\} \ge 0$; if $|z| = 1$, then $\Re\{f(z)\} = 0$. Since the given eigenvalues $\omega_j$ corresponding to i) and ii) interlace, the members of $\mathcal{A}$ and $\mathcal{B}$ interlace; then as we proceed counterclockwise around $|z| = 1$ starting at $z = 1$, the points of $\mathcal{A}$ and $\mathcal{B}$ are mapped successively onto the point at infinity and the origin in the $\zeta$-plane. Equation (12.1.28) implies
$$g(z) = \frac{f(z) - 1}{f(z) + 1} = \frac{\zeta - 1}{\zeta + 1}.$$
But if $\Re(\zeta) \ge 0$, then $|\zeta - 1| \le |\zeta + 1|$, so that $|g(z)| \le 1$. We conclude that $g(z)$, and hence by the Schwarz lemma $f_1(z)$, is bounded by 1 on the unit disc.
Now apply Schur's algorithm to produce a sequence $(f_j(z))_1^n$. The form of $g(z)$ given by (12.1.27) leads to a form
$$f_1(z) = P_{n-1}(z)/Q_{n-1}(z) \qquad (12.1.29)$$
with real coefficients. Therefore, all $\gamma_j$ will be real. Equation (12.1.27) shows that $g(z)$ has the properties
$$g(z^{-1}) = 1/g(z), \qquad g(1) = 1.$$
Therefore $f_1(z^{-1}) = 1/f_1(z)$, and $f_1(1) = 1$. Equation (12.1.16) now shows that
$$f_j(z^{-1}) = 1/f_j(z), \qquad f_j(1) = 1, \qquad j = 1, 2, \ldots, n \qquad (12.1.30)$$
because the statement is true for $j = 1$. Thus $f_j(z)$ will have the form
$$f_j(z) = P_{n-j}(z)/Q_{n-j}(z), \qquad j = 1, 2, \ldots, n, \qquad (12.1.31)$$
so that the sequence will terminate with $f_n(z) = 1$ as required, and the $\gamma_j$ will satisfy
$$-1 < \gamma_j < 1, \quad j = 2, 3, \ldots, n; \qquad \gamma_{n+1} = 1. \qquad (12.1.32)$$
Since $A_1 = 1$ by assumption, these $\gamma_j$ lead to a unique set of finite, positive $(A_j)_1^n$ as required. We stress that the single condition (12.1.5) ensures the existence of the $\gamma_j$ satisfying (12.1.32).

For computational purposes, Schur's algorithm leads to a recurrence relation for the coefficients in the polynomials $P_{n-j}(z)$ and $Q_{n-j}(z)$. Let
$$P_{n-j}(z) = \sum_{k=0}^{n-j} a_{n-j,k} z^k, \qquad Q_{n-j}(z) = \sum_{k=0}^{n-j} b_{n-j,k} z^k.$$
Equation (12.1.27) yields the starting values $a_{n-1,k}$ and $b_{n-1,k}$ ($k = 0, 1, \ldots, n-1$) from the data. Equations (12.1.30) and (12.1.31) imply that
$$a_{n-j,k} = b_{n-j,n-j-k}, \qquad k = 0, 1, \ldots, n-j,$$
so that the sequences $\{a_{n-j,k}\}_0^{n-j}$ and $\{b_{n-j,k}\}_0^{n-j}$ consist of the same numbers, in opposite orders. The recurrence (12.1.16) yields
$$\gamma_j = a_{n-j+1,0}/b_{n-j+1,0},$$
$$a_{n-j,k} = a_{n-j+1,k+1} - \gamma_j b_{n-j+1,k+1}, \qquad k = 0, 1, \ldots, n-j,$$
$$b_{n-j,k} = b_{n-j+1,k} - \gamma_j a_{n-j+1,k}, \qquad k = 0, 1, \ldots, n-j.$$
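The coefficient recurrence can be checked with a short round-trip sketch. This is an illustration under stated assumptions rather than the published algorithm: `build_f1` is a hypothetical helper that generates $f_1 = P/Q$ from chosen areas by running the recurrence backwards from $f_n = 1$, `schur_gammas` applies the coefficient recurrence for $\gamma_j$, $a_{n-j,k}$, $b_{n-j,k}$, and `areas_from_gammas` assumes the relation $\gamma_j = (A_{j-1} - A_j)/(A_{j-1} + A_j)$ with $A_1 = 1$. Coefficient lists are in ascending powers of $z$.

```python
def build_f1(areas):
    """Hypothetical helper: f_1 = P/Q for a stepped rod, starting from f_n = 1."""
    gammas = [(p - c) / (p + c) for p, c in zip(areas, areas[1:])]
    P, Q = [1.0], [1.0]
    for gam in reversed(gammas):
        zP = [0.0] + P                      # z * P(z)
        Qpad = Q + [0.0]
        P, Q = ([s + gam * t for s, t in zip(zP, Qpad)],
                [t + gam * s for s, t in zip(zP, Qpad)])
    return P, Q

def schur_gammas(P, Q):
    """Peel off gamma_2, gamma_3, ... from f_1 = P/Q by the coefficient recurrence."""
    a, b = list(P), list(Q)
    gammas = []
    while len(a) > 1:
        gam = a[0] / b[0]                   # gamma_j = a_{.,0} / b_{.,0}
        gammas.append(gam)
        m = len(a) - 1                      # the degree drops by one each step
        a, b = ([a[k + 1] - gam * b[k + 1] for k in range(m)],
                [b[k] - gam * a[k] for k in range(m)])
    return gammas

def areas_from_gammas(gammas, first_area=1.0):
    """Invert gamma_j = (A_{j-1} - A_j)/(A_{j-1} + A_j), starting from A_1."""
    areas = [first_area]
    for gam in gammas:
        areas.append(areas[-1] * (1.0 - gam) / (1.0 + gam))
    return areas

if __name__ == "__main__":
    P, Q = build_f1([1.0, 2.0, 0.5, 3.0])
    print(areas_from_gammas(schur_gammas(P, Q)))
```

Each step rescales both polynomials by the common factor $1 - \gamma_j^2$, which leaves the ratio $f_j = P/Q$ unchanged, so no renormalisation is needed in this sketch.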
In its simplest terms, the algorithm has three steps; we have adapted the procedure of Kailath and Lev-Ari (1987) [189]:

I. Take the coefficients of $P_{n-1}(z)$ from equation (12.1.27) and construct $G_0 \in M_{2,n}$:
$$G_0 = \begin{bmatrix} a_0 & a_1 & \cdots & a_{n-1} \\ a_{n-1} & a_{n-2} & \cdots & a_0 \end{bmatrix}, \qquad a_k = a_{n-1,k}.$$

II. Compute $\gamma_2 = a_0/a_{n-1}$ and construct
$$G_0' = \begin{bmatrix} 1 & -\gamma_2 \\ -\gamma_2 & 1 \end{bmatrix} G_0 = \begin{bmatrix} 0 & a_0' & \cdots & a_{n-3}' & a_{n-2}' \\ a_{n-2}' & a_{n-3}' & \cdots & a_0' & 0 \end{bmatrix}.$$

III. Shift the top row of the matrix formed in II to the left and delete the last column to form $G_1 \in M_{2,n-1}$:
$$G_1 = \begin{bmatrix} a_0' & a_1' & \cdots & a_{n-2}' \\ a_{n-2}' & a_{n-3}' & \cdots & a_0' \end{bmatrix}$$
and repeat from step II with $G_1$ in place of $G_0$.

Bruckstein and Kailath (1987) [42] showed that Schur's algorithm is computationally stable and efficient. Note that by making minor changes in the analysis (see Ex. 12.1.2) we can construct a Goupillard model of a rod from $n$ interlacing eigenvalues $0 < \omega_1 < \omega_2 < \cdots < \omega_n = n\pi/(2L)$ corresponding to the end conditions
$$\text{i)}\ \ v'(0) = 0 = v(L), \ \text{odd } \omega_j; \qquad \text{ii)}\ \ v(0) = 0 = v(L), \ \text{even } \omega_j. \qquad (12.1.33)$$
However, it is not possible to use the essentially algebraic method described here to construct the $A_i$ from the eigenvalues $0 < \omega_1 < \omega_2 < \cdots < \omega_n$ corresponding to the general end conditions
$$v'(0) = 0 = v'(L) + Kv(L); \qquad v(0) = 0 = v'(L) + Kv(L).$$
This is because $\omega$ will appear in the analysis as itself, and not just in the form $\exp(2i\omega\Delta)$. It is possible to modify the analysis (see Ex. 12.1.3) so that it can be applied to a piecewise uniform string, governed by equation (10.1.1), but now the model will consist of a string with density $\rho^2(x)$ satisfying $\rho(x) = \eta_j^2$, $x_{j-1} < x < x_j$, where $\eta_j^2(x_j - x_{j-1})$ = constant, $j = 1, 2, \ldots, n$.

Exercises 12.1

1. Make the necessary modifications to the analysis of this section so that it applies to the case of $n$ odd. Take $n = 2m - 1$. Show that $\mathcal{A}$ consists of $z_0 = 1$ and $m - 1$ pairs $z_{2j}, z_{2j}^{-1}$, $j = 1, 2, \ldots, m-1$, and $\mathcal{B}$ consists of $z_{2m-1} = -1$ and $m - 1$ pairs $z_{2j-1}, z_{2j-1}^{-1}$, $j = 1, 2, \ldots, m-1$. Hence show that
$$\frac{\eta_0^2 v_0}{p_0} = \frac{1 + g(z)}{1 - g(z)} = C \cdot \frac{z + 1}{z - 1} \prod_{j=1}^{m-1} \frac{(z - z_{2j-1})(z - z_{2j-1}^{-1})}{(z - z_{2j})(z - z_{2j}^{-1})},$$
where again $g(0) = 0$ implies $C = -1$.

2. Make the necessary changes so that it applies to (12.1.33).
3. For the string governed by (10.1.1) with $\rho(x) = \eta^2(x)$, the appropriate up and down quantities are given by (12.1.8), where now
$$v'(x) = i\omega p(x), \qquad p'(x) = i\omega \eta^4(x) v(x).$$
Note that it is $v$ and $v'$, i.e., $p$, that are continuous at discontinuities of $\eta(x)$. Show that
$$D' = i\omega\eta^2 D + \eta'\eta^{-1} U, \qquad U' = -i\omega\eta^2 U + \eta'\eta^{-1} D$$
and that when $\eta$ = const,
$$D = D_0 \exp(i\omega\eta^2 x), \qquad U = U_0 \exp(-i\omega\eta^2 x).$$
This means that we must choose intervals of uniformity so that $\eta_j^2(x_j - x_{j-1})$ = const. Now when the $\eta_j$ have been found, one must also find the points $x_j$ of discontinuity.
12.2 Isospectral rods and the Darboux transformation
We denote the spectrum of the rod governed by the equation
$$(Av')' + \lambda Av = 0, \qquad (12.2.1)$$
and the end conditions
$$A(0)v'(0) - kv(0) = 0 = A(\pi)v'(\pi) + Kv(\pi), \qquad (12.2.2)$$
by $\sigma(A, k, K)$. If two such rods have the same spectrum, i.e.,
$$\sigma(A_1, k_1, K_1) = \sigma(A_2, k_2, K_2), \qquad (12.2.3)$$
we say that they are isospectral. The simplest, almost trivial, pair of isospectral rods is obtained by physically turning the rod and restraints around so that
$$A_2(x) = A_1(\pi - x), \qquad k_2 = K_1, \qquad K_2 = k_1.$$
This will have no effect on the spectrum, so that
$$\sigma(A(x), k, K) = \sigma(A(\pi - x), K, k). \qquad (12.2.4)$$
To avoid complications we shall henceforth assume that $A(x) = a^2(x)$ is a positive, twice continuously differentiable function of $x$. This is unnecessarily restrictive, but at this time we are not interested in discussing the finer points of analysis. We leave it to the reader to see what regularity conditions are sufficient for various points of the analysis.
To obtain the next simplest pair we note that if $v$ satisfies (12.2.1), then $w = Av'$ satisfies
$$(A^{-1}w')' + \lambda A^{-1} w = 0,$$
which is precisely (12.2.1) with $A$ replaced by $A^{-1}$. Now consider the end conditions. We have
$$w = Av', \qquad w' = -\lambda Av.$$
Thus if the original rod is a cantilever, with $v(0) = 0 = v'(\pi)$, then the new rod satisfies $w'(0) = 0 = w(\pi)$, so that it is a reversed cantilever. The cantilever cannot have a zero eigenvalue so that we conclude
$$\sigma(A, \infty, 0) = \sigma(A^{-1}, 0, \infty)$$
and using (12.2.4) we deduce also that
$$\sigma(A(x), \infty, 0) = \sigma(A^{-1}(\pi - x), \infty, 0).$$
This is a result that has been known for many years, see Eisner (1967) [83], Benade (1976) [26], and was recently pointed out again by Ram and Elhay (1995) [285]; they examined many other interesting dualities. If the original rod is free, so that $v'(0) = 0 = v'(\pi)$, then $w(0) = 0 = w(\pi)$, so that the new rod is supported. But the free rod has a zero eigenvalue with eigenfunction $v = 1$, for which $w = 0$. Thus the zero eigenvalue will not appear in the spectrum for the supported rod. We conclude that
$$\sigma'(A, 0, 0) = \sigma(A^{-1}, \infty, \infty)$$
where the prime indicates that the zero eigenvalue has been omitted.

To conduct a more systematic search for isospectral pairs, we reduce (12.2.1) to standard Sturm-Liouville form, as in Section 10.1. Write
$$A = a^2, \qquad y = av, \qquad (12.2.5)$$
then
$$Av' = a^2 v' = ay' - a'y, \qquad (12.2.6)$$
so that (12.2.1) reduces to the Sturm-Liouville form
$$y'' + (\lambda - p)y = 0, \qquad (12.2.7)$$
where
$$a'' - pa = 0. \qquad (12.2.8)$$
For given $A$ or $a$, there is a unique $p$, but for given $p$ there are many $a$. This allows us to obtain further isospectral sets. Although rather obvious, and observed already by Bernoulli and Euler, the indeterminacy introduced by the Liouville transformation in the inverse eigenvalue problem seems to have been systematically studied first by Hochstadt (1975a) [177]. He proved that classical uniqueness theorems for Sturm-Liouville problems hold, modulo a Liouville transformation: if $a_0$ is one $a$ corresponding to a given $p$, then variation of parameters gives the general solution
$$a(x) = a_0(x)\left\{ d_0 + d_1 \int_0^x \frac{ds}{a_0^2(s)} \right\}, \qquad d_0, d_1 \text{ constant}.$$
The normalization condition $a(0) = 1$ gives $d_0 = 1$, so that
$$a(x) = a_0(x)\left\{ 1 + d_1 \int_0^x \frac{ds}{a_0^2(s)} \right\}. \qquad (12.2.9)$$
The constant $d_1$ must be chosen so that $A > 0$ for $0 \le x \le \pi$; this happens iff
$$1 + d_1\kappa > 0, \qquad \text{where } \kappa = \int_0^\pi \frac{ds}{a_0^2(s)}. \qquad (12.2.10)$$
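The one-parameter family $a(x) = a_0(x)\{1 + d_1 \int_0^x ds/a_0^2(s)\}$ can be checked numerically. The sketch below is an illustration only: it assumes the sample choice $a_0(x) = \cosh x$ (for which $p = a_0''/a_0 = 1$ and the family has the closed form $\cosh x + d_1 \sinh x$), uses a composite Simpson rule for the integral, and estimates $p = a''/a$ by central differences. The names `family_member`, `potential` and `admissible` are hypothetical.

```python
import math

def simpson(f, lo, hi, panels=200):
    """Composite Simpson rule on [lo, hi]; panels must be even."""
    h = (hi - lo) / panels
    total = f(lo) + f(hi)
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0

def family_member(a0, d1):
    """Return a(x) = a0(x) * (1 + d1 * int_0^x ds / a0(s)^2)."""
    def a(x):
        return a0(x) * (1.0 + d1 * simpson(lambda s: 1.0 / a0(s) ** 2, 0.0, x))
    return a

def potential(a, x, h=1e-2):
    """Estimate p = a''/a at x, with a'' from a central difference."""
    return (a(x + h) - 2.0 * a(x) + a(x - h)) / (h * h * a(x))

def admissible(a0, d1):
    """Positivity test 1 + d1 * int_0^pi ds / a0(s)^2 > 0."""
    return 1.0 + d1 * simpson(lambda s: 1.0 / a0(s) ** 2, 0.0, math.pi) > 0.0

if __name__ == "__main__":
    a = family_member(math.cosh, 0.7)
    # Both potentials should be close to 1: a0 and a share the same p.
    print(potential(math.cosh, 1.3), potential(a, 1.3))
    print(admissible(math.cosh, 0.7), admissible(math.cosh, -1.2))
```

Every member of the family reproduces the same $p$, which is why the Liouville transformation cannot distinguish between them.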
If $v_0$, $v$ are solutions of (12.2.1) corresponding to the same $y$, then
$$a_0 v_0 = y = av.$$
A simple calculation shows that if $v_0$ satisfies the conditions
$$A_0(0)v_0'(0) - k_0 v_0(0) = 0 = A_0(\pi)v_0'(\pi) + K_0 v_0(\pi) \qquad (12.2.11)$$
then y satisfies (12.2.2) with n = n0 g1
N = N0 (1 + g1 ) + g1 d0 ()
(12.2.12)
where is given by (12.2.10). Thus, provided that g1 satisfies m0 ? g1 ? n0 >
m0 =
N0 Nr + d0 ()
(12.2.13)
we have a one-parameter family of rods with positive spring constraints:
$$\sigma(A, k, K) = \sigma(A_0, k_0, K_0).$$
In particular, if $k_0 = \infty = K_0$, then $k = \infty = K$, and
$$\sigma(A, \infty, \infty) = \sigma(A_0, \infty, \infty),$$
provided only that $d_1$ satisfies (12.2.10). In a series of papers, Isaacson and Trubowitz (1983) [186], Isaacson, McKean and Trubowitz (1984) [187], Dahlberg and Trubowitz (1984) [68], Trubowitz and his co-workers have given a complete characterisation of the isospectral potentials $p(x)$ for the Sturm-Liouville problem (12.2.7) with different sets of boundary conditions. Coleman and McLaughlin (1993a) [62], Coleman and McLaughlin (1993b) [63] extended this analysis to equation (12.2.1) with Dirichlet boundary conditions. In this section we have a more modest aim: to show how to obtain families of rods isospectral to a given one, following Gladwell and Morassi (1995) [122].

The analysis is based on the fundamental result that if $A$ and $B$ are two linear operators then $AB + \mu$ and $BA + \mu$ have the same eigenvalues except perhaps for $\mu$. For if $AB + \mu$ has eigenvalue $\lambda$ then there is a $u \ne 0$ such that $(AB + \mu)u = \lambda u$. Thus $ABu = (\lambda - \mu)u$, so that $\lambda \ne \mu$ implies $Bu \ne 0$. Now $B(ABu) = BA(Bu) = (\lambda - \mu)Bu$, i.e., $(BA + \mu)Bu = \lambda(Bu)$. Since $Bu \ne 0$, $\lambda$ is an eigenvalue of $BA + \mu$. To apply this to our situation we factorise the operator
$$D^2 - p + \mu = (D + \theta)(D - \theta) = D^2 - \theta' - \theta^2,$$
where $D = d/dx$. Thus $p = \theta' + \theta^2 + \mu$. Put $\theta = g'/g$, so that $p = (g''/g) + \mu$. This means that $g$ satisfies
$$g'' + (\mu - p)g = 0. \qquad (12.2.14)$$
Now if $y$ satisfies
$$y'' + (\lambda - p)y = 0, \qquad (12.2.15)$$
then
$$(D^2 - p + \lambda)y = \{(D + \theta)(D - \theta) + \lambda - \mu\}y = 0,$$
so that $z = (D - \theta)y$ satisfies $\{(D - \theta)(D + \theta) + \lambda - \mu\}z = 0$, i.e., $(D^2 + \theta' - \theta^2 + \lambda - \mu)z = 0$. Write this as
$$z'' + (\lambda - q)z = 0 \qquad (12.2.16)$$
where
$$q = -\theta' + \theta^2 + \mu = p - 2\theta' = p - 2(\ln g)''. \qquad (12.2.17)$$
We can interpret this analysis, called the Darboux Lemma or the Darboux Transformation, after Darboux (1882) [69], Darboux (1915) [70], in various ways. We can say that, starting from one system with potential $p$ and solution $y$, we can find another system with potential $q$ and solution
$$z = (D - \theta)y = y' - \frac{g'}{g}y = \frac{[g, y]}{g}, \qquad (12.2.18)$$
where the bracket is defined by
$$[g, y] := gy' - g'y. \qquad (12.2.19)$$
Alternatively we can say that, given two solutions, $y$ of (12.2.15) and $g$ of (12.2.14), we can form a solution $z$ of (12.2.16) given by (12.2.18), where $q$ is related to $p$ by (12.2.17).
Note that $\lambda \ne \mu$. It may be shown (Ex. 12.2.1) that, when $\lambda = \mu$, the general solution of (12.2.16) is
$$z = \frac{1}{g}\left( 1 + d\int_0^x g^2(s)\,ds \right), \qquad d = \text{constant}. \qquad (12.2.20)$$
Suppose that we have a rod $A(x)$ with spectrum $\{\lambda_n\}_0^\infty$ corresponding to end conditions (12.2.2). Transforming to Sturm-Liouville form, we have a set of eigenfunctions $y_n$ satisfying
$$y_n'' + (\lambda_n - p)y_n = 0, \qquad (12.2.21)$$
where $p$ is given by (12.2.8), and the end conditions
$$y_n'(0) - hy_n(0) = 0 = y_n'(\pi) + Hy_n(\pi), \qquad (12.2.22)$$
where
$$h = k + a'(0)/a(0), \qquad H = K - a'(\pi)/a(\pi). \qquad (12.2.23)$$
In particular the zeroth eigenfunction $y_0$ will satisfy
$$y_0'' + (\lambda_0 - p)y_0 = 0. \qquad (12.2.24)$$
Taking $\mu = \lambda_0$, $g = y_0$ we deduce that
$$z_n = \frac{1}{y_0}[y_0, y_n] \qquad (12.2.25)$$
is a solution of
$$z_n'' + (\lambda_n - q)z_n = 0 \qquad (12.2.26)$$
where
$$q = p - 2(\ln y_0)''. \qquad (12.2.27)$$
We can use this result only if $y_0$ is positive in $0 \le x \le \pi$. This will be the case if $k$, $K$ are finite. Since $y_0$, $y_n$ satisfy the same conditions (12.2.22), $z_n$ will satisfy
$$z_n(0) = 0 = z_n(\pi). \qquad (12.2.28)$$
This means that the eigenfunctions of the new Sturm-Liouville system will satisfy Dirichlet end conditions. We must now find a function $b(x)$, or in fact a family of such $b(x)$, corresponding to $q$. The original S-L system was (12.2.7). As we showed earlier, there is a family of rods with cross sections $A(x) = a^2(x)$ associated with this $p$. If $a_0(x)$ is one such, then each member of the family may be written
$$a(x) = a_0(x)\left\{ 1 + d_1\int_0^x \frac{ds}{a_0^2(s)} \right\}. \qquad (12.2.29)$$
We note that if $d_1$ satisfies (12.2.10), then $a(x)$ will be positive throughout $[0, \pi]$; otherwise $a(x)$ will change sign once in $[0, \pi]$. All the $a(x)$ will satisfy (12.2.8). On replacing $\lambda$ by 0 in the preceding analysis, we find that
$$b = \frac{1}{y_0}[y_0, a] \qquad (12.2.30)$$
satisfies $b'' - qb = 0$. For this $b$ to correspond to a proper rod, it must have one sign throughout $[0, \pi]$. First, we show that $b(x)$ can have at most one zero in any interval in which $a(x)$, given by (12.2.29), is of one sign. For suppose $b(x)$ had two such zeros, $x_1, x_2$ ($x_1 < x_2$), in such an interval; then by Rolle's theorem, $[y_0, a]'$ must be zero at an intermediate point. But
$$[y_0, a]' = (y_0 a' - y_0' a)' = y_0 a'' - y_0'' a = \lambda_0 y_0 a \ne 0,$$
which is a contradiction. There are two cases:

i) $a$, given by (12.2.29), is positive throughout $[0, \pi]$. Now $a > 0$ and $1 + d_1\kappa > 0$ (see (12.2.10)), where $\kappa = \int_0^\pi ds/a_0^2(s)$. Now $b$ can have at most one zero in $[0, \pi]$, and so it will have no zero if it has the same sign at $0$ and $\pi$. A simple calculation shows that
$$b(0) = -ka(0), \qquad b(\pi) = Ka(\pi). \qquad (12.2.31)$$
Since $k$, $K$ are related to $k_0$, $K_0$ by (12.2.12), $b(x)$ will have one sign throughout if
$$d_1 > k_0 \quad \text{or} \quad -\frac{1}{\kappa} < d_1 < -j_0, \qquad (12.2.32)$$
where $j_0$ is given by (12.2.13).

ii) $a(x)$, given by (12.2.29), has one zero in $[0, \pi]$. Now $a(\xi) = 0$ for some $\xi \in [0, \pi]$, and $d_1 \le -1/\kappa$. Since $b(\xi) = d_1/a_0(\xi) < 0$, $b(x)$ will have the same sign throughout iff $b(0) < 0$, $b(\pi) < 0$, i.e., if $d_1 < -j_0$. But since $d_1 \le -1/\kappa$, this is satisfied automatically.

We conclude that (12.2.29), (12.2.30) provide a proper rod with fixed end conditions if $d_1 > k_0$ or $d_1 < -j_0$. Note that in both cases the intermediate system specified by $a(x)$, $k$, $K$ will not be proper because the inequalities (12.2.13) will not be satisfied. Note that the restriction $\lambda \ne \mu$ in the original analysis relating to $AB + \mu$ and $BA + \mu$ means that the new rod $b(x)$, with fixed ends, will not have the eigenvalue $\lambda_0$, so that
$$\sigma'(A_0, k_0, K_0) = \sigma(B, \infty, \infty), \qquad (12.2.33)$$
where $B = b^2$ and the prime indicates that $\lambda_0$ has been deleted.
If the original rod is free ($k_0 = 0 = K_0$), then $\lambda_0 = 0$ and $y_0 = g$. Now equation (12.2.20) states that the general solution of $b'' - qb = 0$ is
$$b = \frac{1}{g}\left( 1 + d\int_0^x g^2(s)\,ds \right). \qquad (12.2.34)$$
This will be positive in $[0, \pi]$ provided $1 + d\int_0^\pi g^2(s)\,ds > 0$. Again
$$\sigma'(A_0, 0, 0) = \sigma(B, \infty, \infty). \qquad (12.2.35)$$
We now show that $\lambda_n$, and $w_n = z_n/b$ with $z_n$ given by (12.2.25), are in fact the $(n-1)$th eigenvalue and eigenfunction of the $B$ rod. First, we show that there is a zero of $y_n$ between two zeros of $z_n$. If $x_1, x_2$ are two consecutive zeros of $z_n$, then
$$0 = [y_0, y_n]\Big|_{x_1}^{x_2} = \int_{x_1}^{x_2} (y_0 y_n'' - y_0'' y_n)\,ds = (\lambda_0 - \lambda_n)\int_{x_1}^{x_2} y_0 y_n\,ds.$$
But $y_0$ has constant sign throughout $[0, \pi]$, so that $y_n$ must change sign, and have a zero, between $x_1$ and $x_2$. Now we show that there is a zero of $z_n$ between consecutive zeros of $y_n$. This follows from (12.2.25), namely
$$z_n = y_n' - \frac{y_0'}{y_0}y_n;$$
when $y_n = 0$, $z_n = y_n'$. But $y_n'$ has opposite signs at successive zeros of $y_n$. Thus $z_n$ changes sign, and therefore has a zero, between zeros of $y_n$. We conclude that the zeros of $y_n$ and $z_n$ interlace. But $y_n$ has $n$ zeros in $(0, \pi)$ while $z_n(0) = 0 = z_n(\pi)$. Therefore $z_n$ has $(n-1)$ zeros in $(0, \pi)$; it is the $(n-1)$th eigenfunction. We may thus rewrite (12.2.33) as
$$\lambda_n(A_0, k_0, K_0) = \lambda_{n-1}(B, \infty, \infty). \qquad (12.2.36)$$
The foregoing analysis breaks down where $y_0$ has a zero at an end, as it does when one or other end of the original rod is fixed. For such cases, and to eliminate the prime in (12.2.33), we must modify the analysis by reversing the order of the factors in the differential equation twice. Crum (1955) [65] has a different approach to finding pairs of solutions to the Sturm-Liouville equation.

Exercises 12.2

1. Show that the general solution of equation (12.2.16) is given by equation (12.2.20).

2. Equation (12.2.36) states that the $(n-1)$th eigenvalue of one rod is equal to the $n$th eigenvalue of another. This means that the $(n-1)$th eigenvalue of (12.2.26) is equal to the $n$th of (12.2.21). Examine the asymptotic forms of the two spectra as given by equations (10.9.19), (10.9.20) to show that they are consistent with this statement.
12.3 The double Darboux transformation
Suppose we have a rod $A_0(x)$ with spectrum $\{\lambda_n\}_0^\infty$ corresponding to end conditions (12.2.2). Transforming to S-L form, we have a set of eigenfunctions $y_n$ satisfying $y_n'' + (\lambda_n - p)y_n = 0$ and some end conditions
$$y_n'(0) - hy_n(0) = 0 = y_n'(\pi) + Hy_n(\pi),$$
as before. We now choose a particular eigenvalue and eigenfunction $\lambda_m$, $y_m$; $m$ does not need to be zero. Thus $y_m$ satisfies $y_m'' + (\lambda_m - p)y_m = 0$. Applying the Darboux lemma, we find a non-trivial solution
$$z_n = \frac{1}{y_m}[y_m, y_n], \qquad n \ne m \qquad (12.3.1)$$
of
$$z_n'' + (\lambda_n - q)z_n = 0 \qquad (12.3.2)$$
where
$$q = p - 2(\ln y_m)''. \qquad (12.3.3)$$
On the other hand, the second part of the Darboux lemma, equation (12.2.20), states that the general solution of the equation
$$z_m'' + (\lambda_m - q)z_m = 0 \qquad (12.3.4)$$
is
$$z_m = \frac{1}{y_m}\left( 1 + d\int_0^x y_m^2(s)\,ds \right). \qquad (12.3.5)$$
We now apply the Darboux lemma to equations (12.3.2), (12.3.4), and deduce that if $n \ne m$, then
$$w_n = \frac{1}{z_m}[z_m, z_n] \qquad (12.3.6)$$
is a non-trivial solution of
$$w_n'' + (\lambda_n - r)w_n = 0 \qquad (12.3.7)$$
where
$$r = q - 2(\ln z_m)'' = p - 2(\ln(y_m z_m))''. \qquad (12.3.8)$$
We now examine $w_n$ and $r$. First, we note that equation (12.3.5) gives
$$y_m z_m = 1 + d\int_0^x y_m^2(s)\,ds. \qquad (12.3.9)$$
If $y_m$ has been normalised so that $\int_0^\pi y_m^2(s)\,ds = 1$, then $y_m z_m$ will be positive, and so $r$ will be continuous, if $d > -1$. We now evaluate $w_n$: it is
$$w_n = \frac{1}{z_m}(z_m z_n' - z_m' z_n) = z_n' - \frac{z_m'}{z_m}z_n.$$
But equation (12.3.1) shows that
$$z_n' = y_n'' - \frac{y_m''}{y_m}y_n - \frac{y_m'}{y_m}z_n = (\lambda_m - \lambda_n)y_n - \frac{y_m'}{y_m}z_n$$
so that
$$w_n = (\lambda_m - \lambda_n)y_n - \frac{(y_m z_m)'}{(y_m z_m)}z_n.$$
But since $z_n(0) = 0$,
$$z_n = \frac{y_m y_n' - y_m' y_n}{y_m} = \frac{1}{y_m}\int_0^x (y_m y_n'' - y_m'' y_n)\,ds = \frac{(\lambda_m - \lambda_n)}{y_m}\int_0^x y_m y_n\,ds.$$
This means that $w_n$ has a factor $(\lambda_m - \lambda_n)$, so that if we define
$$w_n^0 = \frac{w_n}{\lambda_m - \lambda_n}$$
and use (12.3.9) to give $y_m z_m$, we find
$$w_n^0 = y_n - \frac{d\,y_m\int_0^x y_m(s)y_n(s)\,ds}{1 + d\int_0^x y_m^2(s)\,ds}. \qquad (12.3.10)$$
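Formula (12.3.10) is easy to explore numerically. The sketch below is a hedged illustration with assumed data: it takes $p = 0$ with Dirichlet ends on $[0, \pi]$, so that $y_n = \sqrt{2/\pi}\,\sin nx$ are normalised eigenfunctions, evaluates the two running integrals in closed form, and integrates $[w_n^0]^2$ by Simpson's rule for a case with $n \ne m$. The function names and the sample values of $n$, $m$, $d$ are hypothetical.

```python
import math

def y(n, x):
    """Assumed eigenfunctions: sqrt(2/pi) sin(n x), normalised on [0, pi]."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def int_ym_sq(m, x):
    """Closed form of int_0^x y_m(s)^2 ds."""
    return (x - math.sin(2 * m * x) / (2 * m)) / math.pi

def int_ym_yn(m, n, x):
    """Closed form of int_0^x y_m(s) y_n(s) ds, valid for n != m."""
    return (math.sin((m - n) * x) / (m - n)
            - math.sin((m + n) * x) / (m + n)) / math.pi

def w0(n, m, d, x):
    """w_n^0 = y_n - d y_m int(y_m y_n) / (1 + d int(y_m^2))."""
    return y(n, x) - d * y(m, x) * int_ym_yn(m, n, x) / (1.0 + d * int_ym_sq(m, x))

def squared_norm(n, m, d, panels=2000):
    """Composite Simpson estimate of int_0^pi w0(x)^2 dx; panels must be even."""
    h = math.pi / panels
    total = w0(n, m, d, 0.0) ** 2 + w0(n, m, d, math.pi) ** 2
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * w0(n, m, d, k * h) ** 2
    return total * h / 3.0

if __name__ == "__main__":
    for d in (-0.5, 0.0, 2.5):
        print(d, squared_norm(3, 1, d))
```

With these assumed eigenfunctions the printed norms should stay close to 1 for any $d > -1$ when $n \ne m$, and setting $d = 0$ recovers $w_n^0 = y_n$.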
We see that this is a non-trivial solution of (12.3.7) even when $n$ in that equation is equal to $m$. It may also be shown (Ex. 12.3.1) that $w_n^0$ is normalised so that $\int_0^\pi [w_n^0(s)]^2\,ds = 1$.

Now we must find the corresponding rods. We started with a rod with $a(x)$ satisfying
$$a'' - pa = 0. \qquad (12.3.11)$$
Applying the Darboux lemma to this equation and $y_m'' + (\lambda_m - p)y_m = 0$ we find that
$$b = \frac{1}{y_m}[y_m, a] \qquad (12.3.12)$$
satisfies
$$b'' - qb = 0. \qquad (12.3.13)$$
Now apply the Darboux lemma to this equation and (12.3.5), and we find that
$$c = \frac{1}{z_m}[z_m, b] \qquad (12.3.14)$$
satisfies
$$c'' - rc = 0. \qquad (12.3.15)$$
We can find $c$ as we found $w_n^0$:
$$c(x) = a(x) - \frac{d\,y_m(x)[y_m, a]}{\lambda_m\left\{1 + d\int_0^x y_m^2(s)\,ds\right\}}. \qquad (12.3.16)$$
Note that just as $a(x)$ is one solution of (12.3.11), and $b(x)$ is one solution of (12.3.13), so $c(x)$ is one solution of (12.3.15); other solutions may be found as in Section 12.2, see equation (12.2.9). We now consider whether $c(x)$ is of one sign in $[0, \pi]$. Suppose the end conditions for the original rod were (12.2.2), i.e.,
$$A(0)v'(0) - k_1 v(0) = 0 = A(\pi)v'(\pi) + K_1 v(\pi).$$
Equations (12.2.5), (12.2.6) show that these transform to
$$[a, y](0) - k_1 y(0)/a(0) = 0 = [a, y](\pi) + K_1 y(\pi)/a(\pi)$$
so that the end values of $c(x)$ given by (12.3.16) satisfy
$$\frac{c(0)}{a(0)} = 1 + \frac{d k_1 y_m^2(0)}{\lambda_m a^2(0)} = \beta_0, \qquad (12.3.17)$$
$$\frac{c(\pi)}{a(\pi)} = 1 - \frac{d K_1 y_m^2(\pi)}{\lambda_m (1 + d) a^2(\pi)} = \beta_1. \qquad (12.3.18)$$
Note that unless the original rod is fixed ($y_m(0) = 0$) or free ($k_1 = 0$) at the left hand end, the new $c(x)$ will not be normalised so that $c(0) = 1$. We now show that if $d > -1$ then $\beta_0$ and $\beta_1$ are both positive. Let $v_m$ be the $m$th mode of the original rod; then $(Av_m')' + \lambda_m Av_m = 0$ so that
$$\lambda_m = \lambda_m\int_0^\pi Av_m^2\,dx = -\int_0^\pi v_m(Av_m')'\,dx = -[v_m(Av_m')]_0^\pi + \int_0^\pi Av_m'^2\,dx = k_1 v_m^2(0) + K_1 v_m^2(\pi) + \int_0^\pi Av_m'^2\,dx,$$
so that $\lambda_m > k_1 v_m^2(0) + K_1 v_m^2(\pi)$ and hence
$$\beta_0 > \frac{(1 + d)k_1 v_m^2(0) + K_1 v_m^2(\pi)}{\lambda_m} > 0,$$
$$\beta_1 > \frac{(1 + d)k_1 v_m^2(0) + K_1 v_m^2(\pi)}{\lambda_m(1 + d)} > 0.$$
These inequalities hold provided that $k_1 v_m^2(0)$ and $K_1 v_m^2(\pi)$ are not both zero, i.e., provided that at most one end of the rod is free ($k_1$ or $K_1$ is zero) or fixed ($v_m(0)$ or $v_m(\pi)$ is zero). We now have a one-parameter family of rods $c(x) = c(x, d)$ defined for $x \in [0, \pi]$, $d > -1$; each member of the family is positive at $x = 0$ and $x = \pi$ and, when $d = 0$, $c(x, 0) = a(x)$ is positive for $x \in [0, \pi]$. To show that $c(x, d)$ must be positive for all $x \in [0, \pi]$, $d > -1$, we use the following deformation lemma.
Lemma 12.3.1 Let $h_t$, $0 \le t \le 1$, be a family of real valued functions on $a \le x \le b$, which is jointly continuously differentiable in $t$ and $x$. Suppose that for every $t$, $h_t$ has a finite number of zeros in $[a, b]$, all of which are simple, and has boundary values with signs that are independent of $t$. Then $h_0$ and $h_1$ have the same number of zeros in $[a, b]$.

This is a slightly extended version of Lemma 3 in Pöschel and Trubowitz (1987) [269] (p. 41); they simply supposed that $h_t$ has boundary values that are independent of $t$, but it may easily be seen that their proof holds if only the signs of these boundary values are independent of $t$. It may easily be seen that $c(x, d)$ can have only a finite number of zeros, and that these must be simple (Ex. 12.3.2), so that the lemma implies that $c(x, d)$, like $c(x, 0) = a(x)$, must have no zeros, and thus be positive, for $x \in [0, \pi]$ and $d > -1$. We may use the deformation lemma to show that $c(x)$ is positive for the limiting cases in which each end of the rod is either free or supported.

We now examine the end conditions for the new rod. The eigenfunctions of the new rod are $u_n = w_n^0/c$. A tedious, but straightforward calculation shows that the new rod has end conditions
$$C(0)u'(0) - k_2 u(0) = 0 = C(\pi)u'(\pi) + K_2 u(\pi)$$
where $C(x) = c^2(x)$, and
$$k_2 = \beta_0 k_1, \qquad K_2 = \beta_1 K_1.$$
Thus
$$\sigma(A, k_1, K_1) = \sigma(C, k_2, K_2)$$
and in particular
$$\sigma(A, 0, 0) = \sigma(C, 0, 0) \quad \text{and} \quad \sigma(A, \infty, \infty) = \sigma(C, \infty, \infty).$$
It must be remembered, of course, that the particular $C$ that is formed from a given $A$ depends on the end conditions corresponding to the original rod, and the value of $m$ that is chosen in the Darboux transformation.

Exercises 12.3

1. Show that $w_n^0$ given by (12.3.10) is normalised so that
$$\int_0^\pi [w_n^0(s)]^2\,ds = 1.$$

2. Show that the zeros of $c(x, d)$ given by (12.3.16) are simple.
12.4 Gottlieb's research
H.P.W. Gottlieb has been carrying out research into various vibrating systems - rods, strings, beams, membranes, plates, etc. - amongst other matters, since 1984. In this section we briefly describe some of these researches, those related to strings, rods and beams. We start by considering one of his early papers, Gottlieb (1986) [138], which builds on earlier papers by Levinson (1976) [208] and Sakata and Sakata (1980) [299]. We made a comment about Gottlieb (1986) [138] in Section 11.1; Gottlieb’s work was motivated by the fact, central to the analysis of Chapter 11, that two spectra, corresponding to two dierent conditions at one end of the string, are needed to determine the string density uniquely. Consider the string shown in Figure 12.4.1, with a density 2 ({) that has one step, at { = 0.
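Figure 12.4.1 - A stepped string. (Labels: densities $\rho_1^2$ and $\rho_2^2$; ends at $-\alpha$ and $1-\alpha$; step at $0$.)

For fixed ends, at $-\alpha$ and $(1-\alpha)$, the end and continuity conditions are
$$u(-\alpha) = 0 = u(1-\alpha), \qquad [u]_0 = 0 = [u']_0.$$
Thus
$$u(x) = \begin{cases} u_1(x) & \text{for } x \in [-\alpha, 0], \\ u_2(x) & \text{for } x \in [0, 1-\alpha], \end{cases}$$
where
$$u_1'' + \rho_1^2\omega^2 u_1 = 0 = u_2'' + \rho_2^2\omega^2 u_2.$$
Thus
$$u_1(x) = A\sin\{\rho_1\omega(x+\alpha)\}, \qquad u_2(x) = B\sin\{\rho_2\omega(1-\alpha-x)\},$$
so that the continuity conditions at $x = 0$ give
$$A\sin(\rho_1\omega\alpha) = B\sin\{\rho_2\omega(1-\alpha)\}, \qquad \rho_1 A\cos(\rho_1\omega\alpha) = -\rho_2 B\cos\{\rho_2\omega(1-\alpha)\}.$$
This gives the frequency equation
$$\rho_2\sin(\rho_1\omega\alpha)\cos\{\rho_2\omega(1-\alpha)\} + \rho_1\sin\{\rho_2\omega(1-\alpha)\}\cos(\rho_1\omega\alpha) = 0. \tag{12.4.1}$$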
This is the frequency equation for the general case of a string with one step, as shown in Figure 12.4.1. Gottlieb examines the special case in which
$$\rho_1\alpha = \rho_2(1-\alpha). \tag{12.4.2}$$
Now (12.4.1) reduces to
$$(\rho_1+\rho_2)\sin(2\rho_1\omega\alpha) = 0$$
which has the spectrum
$$\omega_n = n\pi/(2\rho_1\alpha), \qquad n = 1, 2, \ldots \tag{12.4.3}$$
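The reduction of (12.4.1) under (12.4.2) can be checked numerically. The following is a minimal sketch, not part of Gottlieb's analysis: the function name and the sample values of $\rho_1$ and $\alpha$ are illustrative assumptions. It evaluates the left-hand side of (12.4.1) at the frequencies (12.4.3).

```python
import math

def stepped_string_frequency_function(omega, rho1, rho2, alpha):
    """Left-hand side of (12.4.1) for the fixed-fixed stepped string."""
    theta1 = rho1 * omega * alpha
    theta2 = rho2 * omega * (1.0 - alpha)
    return (rho2 * math.sin(theta1) * math.cos(theta2)
            + rho1 * math.sin(theta2) * math.cos(theta1))

# Illustrative values (assumptions): pick rho1 and alpha, then fix rho2 from (12.4.2).
rho1, alpha = 2.0, 0.3
rho2 = rho1 * alpha / (1.0 - alpha)

# Candidate frequencies from (12.4.3): omega_n = n*pi / (2*rho1*alpha).
harmonic = [n * math.pi / (2.0 * rho1 * alpha) for n in range(1, 6)]

# Residuals of (12.4.1); they should vanish up to rounding error.
residuals = [stepped_string_frequency_function(w, rho1, rho2, alpha) for w in harmonic]
```

Each residual is zero to rounding error, and the frequencies are integer multiples of the first, which is the harmonic property claimed for this special case.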
The spectrum is harmonic: $\omega_n = n\omega_1$. To compare this spectrum with that of a uniform string of density $\rho^2$, fixed at $x = -\alpha, 1-\alpha$, we note that the governing equations are
$$u'' + \rho^2\omega^2 u = 0, \qquad u(-\alpha) = 0 = u(1-\alpha)$$
so that $u = A\sin\{\rho\omega(x+\alpha)\}$, where $\sin(\rho\omega) = 0$. Now
$$\omega_n = n\pi/\rho, \qquad n = 1, 2, \ldots \tag{12.4.4}$$
If $\rho = 2\rho_1\alpha$, then the two spectra, (12.4.3) and (12.4.4), are identical. To distinguish between the two strings, we must examine their spectra for fixed-free ends. Now (Ex. 12.4.1) the frequency equation for the stepped string is
$$\cos(2\rho_1\omega\alpha) = (\rho_2-\rho_1)/(\rho_2+\rho_1), \tag{12.4.5}$$
so that the spectrum is uniformly spaced, but not harmonic. The frequency equation for the uniform string is $\cos\rho\omega = 0$, with harmonic spectrum $\omega_n = (n-\tfrac12)\pi/\rho$, $n = 1, 2, \ldots$

Gottlieb (1986) [138] considers other strings, and extends his analysis to multi-segment strings, some with harmonic spectra, in Gottlieb (1987a) [139]. For the special case (12.4.2), the discontinuous string in Figure 12.4.1 is isospectral to the uniform string, for fixed-fixed ends. In Gottlieb (1988a) [141] and the somewhat simpler paper Gottlieb (2002) [149], Gottlieb examines continuous isospectral strings, as we now describe. Start with a string governed by
$$v''(\xi) + \lambda\phi(\xi)v(\xi) = 0, \tag{12.4.6}$$
with fixed ends at 0, 1, so that
$$v(0) = 0 = v(1). \tag{12.4.7}$$
We seek transformations to a new coordinate $x$ and new displacement $u$, that preserve the structural form of the governing equation, and the fixed end conditions. Let
$$x = x(\xi), \qquad v(\xi) = \eta(x)u(x), \tag{12.4.8}$$
where $\eta(x)$ is some positive non-singular function of $x$. We wish to find $x(\xi)$ and $\eta(x)$, so that the new displacement $u$ satisfies
$$\ddot u(x) + \lambda f(x)u(x) = 0 \tag{12.4.9}$$
where $f(x)$ is some new density, dual to the density function $\phi(\xi)$, and $\cdot = d/dx$. Now
$$v' = \frac{dv}{d\xi} = \frac{dx}{d\xi}\cdot\frac{dv}{dx} = x'\dot v = x'(\eta\dot u + \dot\eta u)$$
and
$$v'' = x''(\eta\dot u + \dot\eta u) + (x')^2(\eta\ddot u + 2\dot\eta\dot u + \ddot\eta u) = (x')^2\eta\ddot u + (x''\eta + 2(x')^2\dot\eta)\dot u + (x''\dot\eta + (x')^2\ddot\eta)u. \tag{12.4.10}$$
To maintain the form of the equations, one must have
$$x''\dot\eta + (x')^2\ddot\eta = 0 \tag{12.4.11}$$
$$x''\eta + 2(x')^2\dot\eta = 0. \tag{12.4.12}$$
Equation (12.4.11) may be written $(x'\dot\eta)' = 0$; and since $x'\dot\eta = \eta'$, we have $\eta'' = 0$ and
$$\eta(x(\xi)) = a\xi + b. \tag{12.4.13}$$
Equation (12.4.12) implies $x''\eta^2 + 2(x')^2\eta\dot\eta = 0$, i.e., $(x'\eta^2)' = 0$, so that $x' = c/\eta^2$, i.e.,
$$\frac{dx}{d\xi} = \frac{c}{(a\xi+b)^2}, \qquad x(\xi) = -\frac{c}{a(a\xi+b)} + d.$$
The requirements that $x(0) = 0$ and $x(1) = 1$ give
$$c = abd, \qquad ad = a + b$$
so that on taking $ad = 1$, and then writing $-a$ for $a$, we find
$$x = \frac{\xi}{1+a(1-\xi)}, \qquad \xi = \frac{(1+a)x}{1+ax}, \qquad \eta = \frac{1+a}{1+ax};$$
clearly, we must take $a > -1$. Now (12.4.10) gives
$$f(x) = \phi(\xi(x))/x'^2 = \dot\xi^2\phi(\xi(x)) \tag{12.4.14}$$
and since $\dot\xi = (1+a)/(1+ax)^2$ we have
$$f(x) = \frac{(1+a)^2}{(1+ax)^4}\,\phi\!\left(\frac{(1+a)x}{1+ax}\right). \tag{12.4.15}$$
The relation between the solutions $u(x)$ and $v(\xi)$ is
$$u(x) = \frac{1+ax}{1+a}\,v\!\left(\frac{(1+a)x}{1+ax}\right). \tag{12.4.16}$$
We stress that the system (12.4.9) with fixed end conditions is isospectral to (12.4.6), (12.4.7), for all values of $a > -1$.

The transformation from one coordinate $\xi$ to another, $x$, has a group structure. First we note that if $\xi = \frac{(1+a)x}{1+ax}$, then $x = \frac{\xi}{1+a(1-\xi)} = \frac{(1+a')\xi}{1+a'\xi}$ where $a' = -a/(1+a)$: thus, if $a$ characterises $\xi \to x$ then $a'$ characterises $x \to \xi$. Note that if $a' = -a/(1+a)$ then $a = -a'/(1+a')$, and that $a > -1$ implies $a' > -1$ and vice versa. This shows that each transformation has an inverse. Now consider a product of transforms. Suppose
$$x_1 = \frac{(1+a_1)x_2}{1+a_1x_2}, \qquad x_2 = \frac{(1+a_2)x_3}{1+a_2x_3}$$
then
$$x_1 = \frac{(1+a_1+a_2+a_1a_2)x_3}{1+(a_1+a_2+a_1a_2)x_3} = \frac{(1+a_{1,2})x_3}{1+a_{1,2}x_3}$$
where
$$a_{1,2} = a_1+a_2+a_1a_2 = a_2+a_1+a_2a_1. \tag{12.4.17}$$
We note that
$$(1+a_{1,2}) = (1+a_1)(1+a_2), \tag{12.4.18}$$
so that $a_1 > -1$, $a_2 > -1$ implies $a_{1,2} > -1$: the product of two transformations is a transformation, and (12.4.17) shows that the product is commutative. There is an identity transformation, $a = 0$, and the associative property holds (Ex. 12.4.2): the transformations form a group, a one-parameter Lie group.

Now consider the density functions. When $f$ and $\phi$ are linked by (12.4.15) then we say that $f$ is the dual of $\phi$ with respect to $a$. Since $1+ax = 1/(1+a'\xi)$ and $1+a = 1/(1+a')$, we can rewrite (12.4.15) as
$$\phi(\xi) = \frac{(1+a')^2}{(1+a'\xi)^4}\,f\!\left(\frac{(1+a')\xi}{1+a'\xi}\right). \tag{12.4.19}$$
This shows that $\phi$ is the dual of $f$ with respect to $a'$. We may express this symbolically as
$$f = D(\phi, a) \;\to\; \phi = D(f, a')$$
and now we may verify (Ex. 12.4.3) that if
$$f_2 = D(f_1, a_1), \qquad f_3 = D(f_2, a_2) \qquad\text{then}\qquad f_3 = D(f_1, a_{1,2}). \tag{12.4.20}$$
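The group properties of the coordinate transformation can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names `transform`, `compose` and `inverse_parameter`, and the sample parameter values, are hypothetical and not taken from Gottlieb's papers. It checks the composition law (12.4.17), the inverse parameter $a' = -a/(1+a)$, and the additivity of $\ln(1+a)$ that follows from (12.4.18).

```python
import math

def transform(x, a):
    """Coordinate map x -> (1 + a) x / (1 + a x), valid for a > -1."""
    return (1.0 + a) * x / (1.0 + a * x)

def compose(a1, a2):
    """Composition law (12.4.17): a_{1,2} = a1 + a2 + a1*a2."""
    return a1 + a2 + a1 * a2

def inverse_parameter(a):
    """Parameter a' = -a / (1 + a) of the inverse transformation."""
    return -a / (1.0 + a)

# Illustrative parameters (assumptions), both greater than -1.
a1, a2, x3 = 0.7, -0.4, 0.35

x2 = transform(x3, a2)                        # x2 in terms of x3
x1_two_steps = transform(x2, a1)              # x1 in terms of x2
x1_one_step = transform(x3, compose(a1, a2))  # single map with a_{1,2}

# Applying a and then a' should return the starting point.
round_trip = transform(transform(x3, a1), inverse_parameter(a1))

# (12.4.18) makes ln(1 + a) additive under composition.
log_sum = math.log1p(a1) + math.log1p(a2)
log_composed = math.log1p(compose(a1, a2))
```

Because $1 + a_{1,2} = (1+a_1)(1+a_2)$, the quantity $\ln(1+a)$ acts as an additive parameter for the one-parameter group.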
This means that the dual w.r.t. $a_2$ of the dual of $f_1$ w.r.t. $a_1$ is just another dual of $f_1$, with respect to $a_{1,2}$. Gottlieb (1986) [138] provides some examples. For the simplest, we start with $\phi(\xi) = 1$, then
$$f(x) = (1+a)^2/(1+ax)^4. \tag{12.4.21}$$
Both these systems have spectrum $\lambda_n = \omega_n^2$, where
$$\omega_n = n\pi, \qquad n = 1, 2, \ldots$$
for fixed-fixed ends. The eigenfunctions are
$$v_n(\xi) = \sin(n\pi\xi), \qquad u_n(x) = \frac{1+ax}{1+a}\sin\left\{\frac{n\pi(1+a)x}{1+ax}\right\}$$
and we note that while the nodes of the former are the equidistant points
$$\xi_m = m/n, \qquad m = 1, 2, \ldots, n-1,$$
the nodes of the latter are
$$x_m = m/\{n + a(n-m)\}, \qquad m = 1, 2, \ldots, n-1.$$
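As a rough numerical illustration of this example, the sketch below checks that the stated eigenfunction of the density (12.4.21) vanishes at the nodes $x_m$ and satisfies $\ddot u + \lambda_n f u = 0$ with $\lambda_n = (n\pi)^2$ to within finite-difference error. The function names, the step size `h`, and the sample values of `n` and `a` are assumptions made for the illustration.

```python
import math

def borg_density(x, a):
    """Density (12.4.21): f(x) = (1 + a)^2 / (1 + a x)^4."""
    return (1.0 + a) ** 2 / (1.0 + a * x) ** 4

def borg_mode(x, n, a):
    """Eigenfunction u_n(x) of the dual string with fixed ends."""
    xi = (1.0 + a) * x / (1.0 + a * x)
    return (1.0 + a * x) / (1.0 + a) * math.sin(n * math.pi * xi)

def borg_nodes(n, a):
    """Interior nodes x_m = m / (n + a (n - m)), m = 1, ..., n - 1."""
    return [m / (n + a * (n - m)) for m in range(1, n)]

def residual(x, n, a, h=1e-4):
    """Central-difference residual of u'' + (n pi)^2 f u = 0 at x."""
    second = (borg_mode(x + h, n, a) - 2.0 * borg_mode(x, n, a)
              + borg_mode(x - h, n, a)) / h ** 2
    return second + (n * math.pi) ** 2 * borg_density(x, a) * borg_mode(x, n, a)

n, a = 3, 0.5  # illustrative mode number and parameter (assumptions)
nodes = borg_nodes(n, a)
node_values = [borg_mode(x, n, a) for x in nodes]
max_residual = max(abs(residual(0.1 * k, n, a)) for k in range(1, 10))
```

For $a > 0$ the nodes crowd towards $x = 0$, where the density (12.4.21) is largest.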
Gottlieb calls (12.4.21) the Borg density because it was discussed by Borg (1946) [39]. Another example is given in Ex. 12.4.4.

Gottlieb (1987b) [140] studied isospectral beams. In the notation of Section 13.7, his analysis is as follows. Start with the governing equation (13.1.4):
$$\frac{d^2}{dx^2}\left(r(x)\frac{d^2u}{dx^2}\right) = \lambda a(x)u(x) \tag{12.4.22}$$
and introduce a new variable $s = s(x)$ so that (12.4.22) reduces to the standard form
$$\frac{d^4v(s)}{ds^4} + \frac{d}{ds}\left(A(s)\frac{dv}{ds}\right) + B(s)v(s) = \lambda v(s). \tag{12.4.23}$$
As in Section 13.7 we write
$$b(s) = \left(\frac{a(x)}{r(x)}\right)^{1/4}, \qquad c^2(s) = (a(x)r^3(x))^{1/4}$$
so that
$$r(x) = c^2(s)b^{-1}(s), \qquad a(x) = c^2(s)b^3(s)$$
where
$$\frac{ds}{dx} = b(s). \tag{12.4.24}$$
In terms of $s$, equation (12.4.22) is
$$(b(c^2(bu')')')' = \lambda b^2c^2u \tag{12.4.25}$$
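where $' = d/ds$. Put $u(x) = v(s)/\{b(s)c(s)\}$; then
$$bu' = (v' - \alpha v)/c \tag{12.4.26}$$
$$c^2(bu')' = c\{v'' - (\alpha+\gamma)v' - (\alpha' - \alpha\gamma)v\} \tag{12.4.27}$$
$$(c^2(bu')')' = c\{v''' - \alpha v'' - (2\alpha'+\phi)v' + (-\alpha''+\alpha\phi)v\} \tag{12.4.28}$$
$$(b(c^2(bu')')')' = bc\{v'''' + (Av')' + Bv\} \tag{12.4.29}$$
where
$$\beta = \frac{b'}{b}, \qquad \gamma = \frac{c'}{c}, \qquad \alpha = \beta+\gamma, \qquad \phi = \gamma' + \gamma^2,$$
$$A = -3\alpha' - \alpha^2 - \phi, \qquad B = (-\alpha''+\alpha\phi)' + \alpha(-\alpha''+\alpha\phi). \tag{12.4.30}$$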
This means that the transformed system (12.4.23) will correspond to a uniform beam if $A = 0 = B$. We will of course have to check which, if any, of the end conditions are preserved. The only end condition that is preserved in all cases is the clamped condition:
$$u = 0 = \frac{du}{dx} \;\Rightarrow\; v = 0 = \frac{dv}{ds}.$$
Equations (12.4.30) show that one solution of $A = 0 = B$ is given by $\alpha = 0 = \phi$. Since (12.4.26), (12.4.28) show that, when $\alpha = 0 = \phi$,
$$bu' = v'/c, \qquad (c^2(bu')')' = cv''',$$
any such solution will preserve a sliding-end condition. We explore this solution: $\alpha = (bc)'/bc$ so that $\alpha = 0$ implies $bc$ = constant; $\phi = c''/c$, so that $\phi = 0$ implies $c(s) = ps + q$, where $p, q$ are constants. The coordinate transformation becomes
$$\frac{ds}{dx} = (ps+q)^{-1}$$
so that
$$\frac{(ps+q)^2}{2p} = x + d.$$
We choose the constants so that $x = 0, 1$ correspond respectively to $s = 0, 1$:
$$1 + ks = (1+Kx)^{1/2}, \qquad 1 + k = (1+K)^{1/2}.$$
The original beam is given by
$$r(x) = r_0(1+Kx)^{3/2}, \qquad a(x) = a_0(1+Kx)^{-1/2}.$$
As we noted earlier, this beam will have the same spectrum as a uniform beam for clamped-clamped and clamped-sliding end conditions.
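The conditions that $bc$ is constant and $c$ is linear in $s$ can be checked for this tapered beam with a short numerical sketch. The values of `K` and `r0` are illustrative assumptions, and `a0` is chosen here (also an assumption, made so that $ds/dx = b$ holds exactly on the unit interval) as $a_0 = r_0\{K/(2k)\}^4$.

```python
import math

K = 3.0                          # illustrative taper parameter (assumption)
k = math.sqrt(1.0 + K) - 1.0     # from 1 + k = (1 + K)^(1/2)
r0 = 1.0                         # illustrative stiffness scale (assumption)
a0 = r0 * (K / (2.0 * k)) ** 4   # assumed normalisation so that ds/dx = b

def r(x):
    return r0 * (1.0 + K * x) ** 1.5

def a(x):
    return a0 * (1.0 + K * x) ** -0.5

def b(x):
    return (a(x) / r(x)) ** 0.25         # b = (a / r)^(1/4)

def c(x):
    return (a(x) * r(x) ** 3) ** 0.125   # c^2 = (a r^3)^(1/4)

def s(x):
    return (math.sqrt(1.0 + K * x) - 1.0) / k   # from 1 + k s = (1 + K x)^(1/2)

xs = [0.1 * i for i in range(11)]
bc_values = [b(x) * c(x) for x in xs]                  # constant when alpha = 0
c_over_linear = [c(x) / (1.0 + k * s(x)) for x in xs]  # constant when c is linear in s
h = 1e-6
ds_dx_error = max(abs((s(x + h) - s(x - h)) / (2.0 * h) - b(x)) for x in xs[1:-1])
```

Both lists are constant to rounding error, consistent with $\alpha = 0 = \phi$ for this beam.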
Another solution is given by $b(s)$ = constant. Now $\alpha = \gamma$ so that $A = 0$ implies $2\gamma' + \gamma^2 = 0$, and thus $\gamma'' + \gamma\gamma' = 0$ and $-\alpha'' + \alpha\phi = -\gamma'' + \gamma(\gamma' + \gamma^2) = \gamma(2\gamma' + \gamma^2) = 0$, so that $B = 0$. Now $2\gamma' + \gamma^2 = 2c''/c - c'^2/c^2 = 0$ and $c = (ps+q)^2$. Since $x = s$ we have
$$r(x) = r_0(px+q)^4, \qquad a(x) = a_0(px+q)^4.$$
Other examples are given in the exercises. Gottlieb (1987b) [140] studies many other cases in detail; see also Abrate (1995) [1]. Gottlieb has studied isospectral membranes and plates in Gottlieb (1988) [142], Gottlieb (1991) [144], Gottlieb (1992b) [146], Gottlieb (1993) [147], Gottlieb (2000) [148], Gottlieb (2004a) [150]. In a recent paper, Gottlieb showed that the only mappings of the plane that transform the membrane equation into another membrane equation are conformal mappings.
Exercises 12.4
1. Set up the frequency equation for the stepped string in Figure 12.4.1 for fixed-free end conditions and obtain (12.4.5); find $\omega_n$.
2. Show that the product of transformations defined by (12.4.14) is associative, i.e., $a_{(1,2),3} = a_{1,(2,3)}$.
3. Verify (12.4.17).
4. Show that if $\phi(\xi) = (1+b\xi)^n$, then $f(x) = (1+a)^2(1+cx)^n/(1+ax)^{n+4}$ where $c = a + b + ab$.
5. Show that in the special case $n = -2$, $c = 0$, the dual string is just the original string turned around, i.e., $f(x) = \{1+b(1-x)\}^{-2}$.
6. The composition law for the transformation group is (12.4.17). Show that if $\alpha = \ln(1+a)$ then the composition law becomes additive: $\alpha_{1,2} = \alpha_1 + \alpha_2$.
7. Another possible solution of (12.4.30) is given by $A = 0$, $\alpha'' = \alpha\phi$. Explore this solution.
8. Explore solutions of $A = 0 = B$ by seeking $b = b_0S^{\mu}$, $c = c_0S^{\nu}$ where $S = ps + q$, and $\mu, \nu$ are to be determined.
12.5 Explicit formulae for potentials

We discussed at length in Chapters 10, 11 what spectral data are necessary to determine the potential in a Sturm-Liouville equation. By potential we mean either $q(x)$ in (10.1.14), $\rho(x)$ (or $\rho^2(x)$) in (10.1.8) or $A(x)$ in (10.1.3). In general, as we have found, there is no explicit formula for a potential; rather, it is found after a long process involving integral and/or differential equations. In this section we describe some explicit formulae that have been found for various particular cases. We will give few derivations since these are generally very lengthy; instead we will make references to the original papers.

We start with Gel'fand and Levitan (1953) [101]. They considered equation (10.1.14) under the free-free end condition $y'(0) = 0 = y'(1)$, with $q(x) \in C^1(0,1)$ and
$$\int_0^1 q(x)\,dx = 0. \tag{12.5.1}$$
They showed that if $(\lambda_n)_1^\infty$ denote the eigenvalues of (10.1.14), and $(\mu_n)_1^\infty$ are the corresponding eigenvalues of the same equation with $q(x) \equiv 0$, i.e.,
$$-y''(x) = \mu y(x), \qquad y'(0) = 0 = y'(1),$$
then
$$\sum_{n=1}^\infty(\lambda_n - \mu_n) = \frac14(q(0) + q(1)). \tag{12.5.2}$$
Halberg and Kramer (1960) [158] extended this result to the end conditions
$$y'(0) - hy(0) = 0 = y'(1) + Hy(1). \tag{12.5.3}$$
If $h, H$ are finite, then (12.5.2) holds; if $h$ is finite and $H = \infty$, then
$$\sum_{n=1}^\infty(\lambda_n - \mu_n) = \frac14(q(0) - q(1)); \tag{12.5.4}$$
if $h = \infty$, $H$ is finite, then
$$\sum_{n=1}^\infty(\lambda_n - \mu_n) = \frac14(q(1) - q(0)); \tag{12.5.5}$$
and if $h = \infty = H$ (the ends are fixed) then
$$\sum_{n=1}^\infty(\lambda_n - \mu_n) = -\frac14(q(0) + q(1)). \tag{12.5.6}$$
Barcilon (1974d) [17] gives an alternative derivation of these results. Note that all these formulae give a sum or difference of values of $q$ at the end points as a function of the effect of $q$ on the eigenvalues.

Barcilon (1983) [22] examined the string equation (10.1.8) and considered the eigenfunctions for a part of the string; we change his formulation somewhat. Barcilon first proves

Theorem 12.5.1 Let $(\lambda_n)_1^\infty$ be the spectrum of
$$u'' + \lambda\rho u = 0, \tag{12.5.7}$$
$$u(0) = 0 = u(1), \tag{12.5.8}$$
and $(\mu_n)_1^\infty$ the spectrum of
$$u'' + \mu\rho u = 0, \qquad u(0) = 0 = u'(1). \tag{12.5.9}$$
If the function $\rho(x)$ is continuous and bounded away from zero, then
$$\rho(1) = \frac{1}{\mu_1}\prod_{n=1}^\infty\frac{\lambda_n^2}{\mu_{n+1}\mu_n}. \tag{12.5.10}$$
We may make a (rather weak) check on this by considering the case $\rho \equiv 1$ (Ex. 12.5.1). Now we consider the string with ends at 0 and $x$, and find the spectra by simply scaling the coordinate so that the string occupies $(0, 1)$. This gives the

Corollary 12.5.1 If $(\lambda_n(x))_1^\infty$ and $\{\mu_n(x)\}_1^\infty$ are the spectra of (12.5.7) for the end conditions $u(0) = 0 = u(x)$, $u(0) = 0 = u'(x)$ respectively, then
$$\rho(x) = \frac{1}{x^2\mu_1(x)}\prod_{n=1}^\infty\frac{[\lambda_n(x)]^2}{\mu_{n+1}(x)\mu_n(x)}. \tag{12.5.11}$$
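The weak check of (12.5.10) mentioned above can be sketched numerically for a string of constant density. This is an illustration only: the function name, the constant `rho0` and the truncation level are assumptions, and the closed-form eigenvalues used are those of $u'' + \lambda\rho_0 u = 0$ on $(0,1)$.

```python
import math

def barcilon_partial_product(rho0, terms):
    """Partial product of (12.5.10) for a string of constant density rho0.

    Fixed-fixed eigenvalues: lambda_n = (n pi)^2 / rho0.
    Fixed-free eigenvalues:  mu_n = ((n - 1/2) pi)^2 / rho0.
    """
    def lam(n):
        return (n * math.pi) ** 2 / rho0

    def mu(n):
        return ((n - 0.5) * math.pi) ** 2 / rho0

    product = 1.0 / mu(1)
    for n in range(1, terms + 1):
        product *= lam(n) ** 2 / (mu(n + 1) * mu(n))
    return product

rho0 = 2.5  # illustrative constant density (assumption)
estimate = barcilon_partial_product(rho0, 100000)
coarse = barcilon_partial_product(rho0, 100)
```

The partial products approach `rho0` slowly, with an error that decays roughly like the reciprocal of the number of factors, so the formula is better suited to analysis than to direct computation.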
Barcilon's formula (12.5.11) involves two spectra, for two different end conditions at $x$. Pranger (1989) [270] expresses $\rho(x)$ in terms of just one spectrum $\{\lambda_n(x)\}$ for (12.5.7) subject to $u(0) = 0 = u(x)$. He shows that if
$$s(x) = \sum_{n=1}^\infty\{\lambda_n(x)\}^{-1}, \tag{12.5.12}$$
if $\rho(x)$ is positive, has a continuous first derivative, and has a second derivative in $L^2$, then $\rho(x)$ is given by the remarkable explicit formula
$$\rho(x) = \left(\frac{d^2}{dx^2} + \frac{2}{x}\frac{d}{dx}\right)s(x). \tag{12.5.13}$$
Gottlieb (1992a) [145] considers some examples and counter-examples of this formula. First, if $\rho(x) \equiv 1$, then $\lambda_n(x) = (n\pi/x)^2$ so that
$$s(x) = \frac{x^2}{\pi^2}\sum_{n=1}^\infty\frac{1}{n^2}$$
and substitution into (12.5.13) recovers $\rho(x) = 1$. See also Ex. 12.5.2. Pranger considers some other explicit formulae.
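The uniform-string check of (12.5.13) can also be carried out numerically. The sketch below truncates the series (12.5.12) and applies the differential operator by central differences; the function names, the truncation level, the step `h` and the evaluation point are assumptions made for the illustration.

```python
import math

def s_uniform(x, terms=20000):
    """Truncated s(x) = sum of 1/lambda_n(x), with lambda_n(x) = (n pi / x)^2."""
    return sum((x / (n * math.pi)) ** 2 for n in range(1, terms + 1))

def pranger_density(s, x, h=1e-2):
    """Apply (d^2/dx^2 + (2/x) d/dx) to s at x using central differences."""
    second = (s(x + h) - 2.0 * s(x) + s(x - h)) / h ** 2
    first = (s(x + h) - s(x - h)) / (2.0 * h)
    return second + 2.0 * first / x

rho_estimate = pranger_density(s_uniform, 0.6)
```

Since $\sum 1/n^2 = \pi^2/6$, the exact $s(x)$ is $x^2/6$ and the operator returns 1; the small deficit in `rho_estimate` comes from truncating the series.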
Gottlieb (1992a) [145] also considers some cases of discontinuous $\rho(x)$ to show that, as Pranger himself thought, his formula holds under wider conditions than he assumed.

Exercises 12.5
1. Take $\rho \equiv 1$ in (12.5.7) and find the eigenvalues $\lambda_n$ and $\mu_n$ for (12.5.7) subject to (12.5.8) and (12.5.9) respectively. Check that equation (12.5.10) gives $\rho(1) = 1$. [Use the identity
$$\prod_{n=1}^\infty\left(1 - \frac{1}{4n^2}\right) = \frac{2}{\pi},$$
given in 0.2622 in Gradshteyn and Ryzhik (1965) [152].]
2. The string with density given by (12.4.21), i.e.,
$$\rho(y) = (1+a)^2/(1+ay)^4, \qquad 0 \le y \le 1,$$
is isospectral to the uniform string. Use the scaling $xy = \xi$ to find $\lambda_n(x)$ and hence recover $\rho(x)$ from equation (12.5.13).
12.6 The research of Y.M. Ram et al.
For the whole of his scientific career, Ram's research has been related, more or less, to some aspect of inverse problems, interpreted in a loose sense. Since much of this work does not fit neatly into just one category, we have chosen to devote this section to it. It is impossible to do justice to it in so short a space and therefore we limit our treatment to the questions that he and his colleagues asked and the methods they used to answer them. We limit our attention to those papers related to undamped vibrating systems.

One of the earliest papers is Ram, Braun and Blech (1988) [272]. This paper is in the tradition of modal analysis, see, for example, Berman (1984) [27]. They ask the following question: for a system with unknown mass and stiffness matrices $\mathbf{M}$ and $\mathbf{K}$, but with its first $n$ eigenmodes and eigenvalues known from modal testing, how can one find approximations to the first $n$ eigenmodes and eigenvalues of a modified system $\mathbf{M} + \Delta\mathbf{M}$, $\mathbf{K} + \Delta\mathbf{K}$? They show that upper bounds to the eigenvalues are given by the eigenvalue problem
$$(\boldsymbol{\Lambda} + \boldsymbol{\Phi}^T\Delta\mathbf{K}\boldsymbol{\Phi})\mathbf{x} = \lambda(\mathbf{I} + \boldsymbol{\Phi}^T\Delta\mathbf{M}\boldsymbol{\Phi})\mathbf{x}. \tag{12.6.1}$$
They illustrate their analysis by an example of a vibrating beam. The bounds obtained in this paper were upper bounds to the eigenvalues of the modified structure because they were found as stationary values of the Rayleigh quotient in a constrained subspace. Ram and Braun (1990a) [273] obtain both upper and lower bounds by judicious use of the independent definition of eigenvalues discussed in Section 2.10, and the fact that decreasing (increasing) the stiffness, i.e., the strain energy, of a structure, decreases (increases) the eigenvalues. They show moreover that both the upper and lower bounds that they obtain are optimal. This paper gives the clearest introduction to the search for upper and lower bounds. In Ram and Braun (1990b) [274] they extend their results to obtain upper and lower bounds to $n$ eigenvalues, not necessarily the lowest $n$, by using Lehmann's optimal interval (see Parlett (1980) [264], pp. 198-202). In Ram, Blech and Braun (1990) [275], this analysis is placed in an abstract matrix setting and related to previous matrix analytical results. Braun and Ram (1991) [40] give various examples of the application of this analysis.

Ram and Blech (1991) [277] prove a nice result regarding the addition of an oscillator of stiffness $k$ and mass $m$ to a vibrating system with a single spatial direction of motion: the eigenvalues of the original system that are less than $k/m$ increase, while those greater than $k/m$ decrease. They introduce their analysis with the example given in Ex. 12.6.1. Ram and Braun (1991) [278] apply the analysis derived in Ram, Braun and Blech (1988) [272] to the inverse problem: find $\Delta\mathbf{M}$, $\Delta\mathbf{K}$ to give a specified spectrum. They apply their result to some simple examples. See also Ram and Braun (1993) [280] for more examples.

Ram and Caldwell (1992) [279] consider the reconstruction of a spring-mass system with a single direction of motion in which the masses are not simply in-line, as in Jacobi systems, but form a multiply connected system. The data consist of various spectra obtained by anchoring one or more of the masses to the ground. The solution obtained is not unique; note that the graph of the system is not a tree, as in the system considered by Duarte (1989) [81], and described in Section 5.7.
We have already referred to Ram (1993) [276] in Section 4.5. He applies the result in Ram and Blech (1991) [277] to the situation when an oscillator of stiffness $k$ and mass $m$ is attached to the free end of an in-line mass-spring system. Ram and Gladwell (1994) [289] consider a finite element model of a vibrating string, for which both the stiffness and the mass matrices are tridiagonal, and show that both these matrices may be constructed from a single eigenvalue and two eigenvectors. Since this method is extremely sensitive to error, they also use overdetermined data. Ram (1994a) [281] discusses the reconstruction of the mass-spring model of a beam in transverse vibration described in Section 2.3 from three eigenvectors, one eigenvalue and the total mass and length of the beam. Unfortunately, no criteria are given for deciding whether the mode/eigenvalue data will lead to a realistic model; see Gladwell, Willms, He, and Wang (1989) [115] for a discussion of this matter. In Ram (1994b) [282] he returns to the idea introduced in Ram and Blech (1991) [277] to enlarge a spectral gap: to modify a system so that the modified eigenvalues $\tilde\lambda_i$, $\tilde\lambda_{i+1}$ satisfy $\tilde\lambda_i < \lambda_i$, $\tilde\lambda_{i+1} > \lambda_{i+1}$. He shows that this may be accomplished by judiciously adding two oscillators, $k_1, m_1$ and $k_2, m_2$ with $k_1/m_1 > \lambda_{i+1}$ and $k_2/m_2 < \lambda_i$. Certain specific conditions, which he states, must be satisfied.
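In Ram (1994c) [283] he considers the continuous model for an axially vibrating rod, in the form
$$(ru')' + \lambda au = 0, \tag{12.6.2}$$
and shows that $r$ and $a$ may be reconstructed from two eigenvalues, the corresponding eigenmodes and the total mass of the rod. He states the conditions on the given modes that ensure that $r(x)$, $a(x)$ will be positive, and gives a number of examples.

Ram and Elhay (1995a) [285] attempt a difficult problem, the reconstruction of the mass and stiffness matrices, $\mathbf{M}$, $\mathbf{K}$ of a system from modal and spectral data, on the assumption that $\mathbf{M}$, $\mathbf{K}$ are symmetric band matrices. This construction procedure still leaves many important questions open for further research. Ram and Elhay (1996) [284] consider the theory of dynamic absorbers, and their use in dynamic modification problems.

In the important paper, Sivan and Ram (1997) [306], the authors confront the realisation that the mass and stiffness matrices for a given kind of system will have a specific form. They consider the forms associated with a general mass-spring system as in Ram and Caldwell (1992) [279]; the mass matrix is diagonal, while the stiffness matrix has negative (or non-positive) off-diagonal elements and is diagonally dominant. They start with raw spectral data $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ and modal data $\boldsymbol{\Phi}$. In theory, the mass and stiffness matrices $\mathbf{M}$, $\mathbf{K}$ should satisfy
$$\boldsymbol{\Phi}^T\mathbf{M}\boldsymbol{\Phi} = \mathbf{I}, \qquad \boldsymbol{\Phi}^T\mathbf{K}\boldsymbol{\Phi} = \boldsymbol{\Lambda}$$
so that, in theory
$$\mathbf{M} = \boldsymbol{\Phi}^{-T}\boldsymbol{\Phi}^{-1}, \qquad \mathbf{K} = \boldsymbol{\Phi}^{-T}\boldsymbol{\Lambda}\boldsymbol{\Phi}^{-1}. \tag{12.6.3}$$
But in general, the given matrices $\boldsymbol{\Lambda}$, $\boldsymbol{\Phi}$ will not yield a diagonal $\mathbf{M}$, or $\mathbf{K}$ with the required positivity properties. They therefore pose the problem of finding $\tilde{\boldsymbol{\Lambda}}$, $\tilde{\boldsymbol{\Phi}}$, near, in some sense, to $\boldsymbol{\Lambda}$, $\boldsymbol{\Phi}$, such that $\mathbf{M}$, $\mathbf{K}$, computed from (12.6.3) do have the correct form. They divide the problem into two parts:

Problem 12.6.1 Given $\boldsymbol{\Phi}$, determine $\tilde{\boldsymbol{\Phi}}$ such that $\mathbf{M} = \tilde{\boldsymbol{\Phi}}^{-T}\tilde{\boldsymbol{\Phi}}^{-1}$ is a positive diagonal matrix, and minimises $||\boldsymbol{\Phi} - \tilde{\boldsymbol{\Phi}}||$.

Problem 12.6.2 Given $\boldsymbol{\Lambda}$ and $\tilde{\boldsymbol{\Phi}}$, determine $\tilde{\boldsymbol{\Lambda}}$ which minimises $||\boldsymbol{\Lambda} - \tilde{\boldsymbol{\Lambda}}||$, such that $\mathbf{K} = \tilde{\boldsymbol{\Phi}}^{-T}\tilde{\boldsymbol{\Lambda}}\tilde{\boldsymbol{\Phi}}^{-1}$ has the required form.

They give algorithms for solving both these problems. This paper provides a promising starting point for realistic construction procedures. Ram and Elhay (1998) [287] is related to isospectral Jacobi systems, and contains a novel fixed-point approach to constructing a particular Jacobi matrix.

Sivan and Ram (1999) [307] return to the analysis of Ram and Braun (1991) [278]. They start from the equations $\mathbf{K}\boldsymbol{\Phi} = \mathbf{M}\boldsymbol{\Phi}\boldsymbol{\Lambda}$ and partition $\boldsymbol{\Phi}, \boldsymbol{\Lambda} \in M_n$ into
$$\boldsymbol{\Phi} = [\boldsymbol{\Phi}_1 | \boldsymbol{\Phi}_2], \qquad \boldsymbol{\Lambda} = \begin{bmatrix}\boldsymbol{\Lambda}_1 & \\ & \boldsymbol{\Lambda}_2\end{bmatrix}$$
where $\boldsymbol{\Phi}_1 \in M_{n,m}$, $\boldsymbol{\Lambda}_1 \in M_m$ are given. They pose the problem of finding $\tilde{\boldsymbol{\Phi}}_1 = \boldsymbol{\Phi}_1\mathbf{W}$, and $\tilde{\boldsymbol{\Lambda}}_1 = \mathrm{diag}(\tilde\lambda_i)$ and mass and stiffness matrices $\tilde{\mathbf{M}} = \mathbf{M} + \hat{\mathbf{M}}$, $\tilde{\mathbf{K}} = \mathbf{K} + \hat{\mathbf{K}}$ to minimise the norm of
$$\mathbf{R} = (\tilde{\mathbf{M}})^{-1/2}(\tilde{\mathbf{K}}\tilde{\boldsymbol{\Phi}}_1 - \tilde{\mathbf{M}}\tilde{\boldsymbol{\Phi}}_1\tilde{\boldsymbol{\Lambda}}_1)$$
and apply their analysis to some simple spring-mass problems. The greatest problem to be overcome is that of ensuring that the matrices $\tilde{\mathbf{M}}$, $\tilde{\mathbf{K}}$ have prescribed forms.

Burak and Ram (2001) [44] treat the eigenvalue equation $(\mathbf{K} - \lambda\mathbf{M})\mathbf{u} = 0$ by writing both $\mathbf{K}$ and $\mathbf{M}$ as sums
$$\mathbf{K} = \sum\alpha_i\mathbf{K}_i, \qquad \mathbf{M} = \sum\beta_i\mathbf{M}_i$$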
where $\mathbf{K}_i$, $\mathbf{M}_i$ are matrices with fixed elements that reflect the connectivity, the graph, of the system, and $\alpha_i$, $\beta_i$ are unknown parameters. When the system is an in-line mass-spring system, the parameters may be obtained from two modes and an eigenvalue, as in Ram and Gladwell (1994) [289].

Ram and Elishakoff (2004) [288] return to the problem of reconstructing a rod cross-sectional area from a mode, for both discrete and continuous models. For the continuous model, the governing equation is (10.1.3):
$$(A(x)u'(x))' + \lambda A(x)u(x) = 0.$$
They concentrate on the problem of finding $A(x)$ when $u(x)$ is a polynomial, and discuss particular low order polynomials for the fundamental and the first few overtones of a free-free rod.

In conclusion, we note that the research conducted by Ram and his colleagues demonstrates the complexity of inverse problems: the data must be available from testing or elsewhere, the construction algorithms must be robust, and the model that is constructed must be realistic - it should satisfy all the necessary positivity and connectivity constraints. Ram and his colleagues have made important advances in many diverse aspects of these matters; in spite of this there is still ample opportunity for more research in fulfilling all the requirements of a satisfactory solution to the many inverse problems that arise in vibration theory.
Exercises 12.6
1. Consider a uniform taut string of unit length, fixed at $x = 0$, free at $x = 1$ (the end $x = 1$ is attached to a massless ring that slides at right angles to the string). Find its eigenvalues. Now replace the slider by an oscillator of mass $m$ and stiffness $k$. Show that the eigenvalues $\lambda_i < k/m$ increase, while those with $\lambda_i > k/m$ decrease.
Chapter 13
The Euler-Bernoulli Beam

There is enough light for those who only desire to see, and enough obscurity for those who have a contrary disposition.
Pascal's Pensées, 430

13.1 Introduction

The free undamped infinitesimal vibrations, of frequency $\omega$, of a thin straight beam of length $L$ are governed by the Euler-Bernoulli equation
$$\frac{d^2}{dx^2}\left(EI(x)\frac{d^2u(x)}{dx^2}\right) = \rho A(x)\omega^2u(x), \qquad 0 \le x \le L. \tag{13.1.1}$$
Here $E$ is Young's modulus, $\rho$ is the density, both assumed constant; $A(x)$ is the cross-sectional area at section $x$, $I(x)$ is the second moment of this area about the axis through the centroid at right angles to the plane of vibration (the neutral axis). We put
$$x = Ls, \qquad u(s) = u(x), \qquad r(s) = \frac{I(x)}{I(x_c)}, \qquad a(s) = \frac{A(x)}{A(x_c)}, \tag{13.1.2}$$
$$\lambda = \rho A(x_c)L^4\omega^2/(EI(x_c)), \tag{13.1.3}$$
where $x_c \in [0, L]$. Equation (13.1.1) then becomes
$$(r(s)u''(s))'' = \lambda a(s)u(s), \qquad 0 \le s \le 1, \tag{13.1.4}$$
where $' = d/ds$. From now on we will use $x$ rather than $s$ for the dimensionless independent variable. Both $r(x)$ and $a(x)$ are positive, i.e.,
$$r(x) > 0, \qquad a(x) > 0, \qquad x \in [0, 1].$$
We shall assume throughout that $a(x), r(x) \in C^2[0, 1]$; they are twice continuously differentiable in $[0, 1]$.
For a beam, the most common end-conditions are
free : $u'' = 0 = u'''$, (13.1.5)
pinned : $u = 0 = u''$, (13.1.6)
sliding : $u' = 0 = (ru'')'$, (13.1.7)
clamped : $u = 0 = u'$. (13.1.8)

There are certain combinations of these end conditions which allow movement of the beam as a rigid body:
free-free : $u = 1$, and $u = x$, (13.1.9)
free-sliding : $u = 1$, (13.1.10)
sliding-sliding : $u = 1$. (13.1.11)

Note that the free-free beam has two rigid-body modes. The two which are given above are not orthogonal, but a combination $cx + d$ can be found that is orthogonal to the first, $u = 1$ (Ex. 13.1.1).

The ends may be restrained by translational and rotational spring devices. In this case the end conditions are
$$(r(x)u''(x))'_0 + h_1u(0) = 0 = (r(x)u''(x))'_1 - h_2u(1), \tag{13.1.12}$$
$$r(0)u''(0) - k_1u'(0) = 0 = r(1)u''(1) + k_2u'(1). \tag{13.1.13}$$
Here $h_1, h_2$ are the translational and $k_1, k_2$ are the rotational stiffnesses. The conditions (13.1.5)-(13.1.8) correspond respectively to $h = 0 = k$; $h = \infty$, $k = 0$; $h = 0$, $k = \infty$; $h = \infty = k$. We shall say that the system governed by equations (13.1.4), (13.1.12), (13.1.13) is positive if
$$h_1 + h_2 > 0, \qquad k_1 + k_2 > 0. \tag{13.1.14}$$
Since $h_1, h_2, k_1, k_2 \ge 0$, this means that one of $h_1, h_2$ and one of $k_1, k_2$ must be strictly positive; this rules out rigid-body modes. Papanicolaou (1995) [261] considers spectral theory for a periodic beam; we do not discuss this.

Theorem 13.1.1 The Euler-Bernoulli operator $Bu \equiv (r(x)u''(x))''$ is self-adjoint, i.e., $(Bu, v) = (u, Bv)$ under the end conditions (13.1.12), (13.1.13).

Proof.
$$(Bu, v) - (u, Bv) = \int_0^1\{(ru'')''v - (rv'')''u\}dx = [(ru'')'v - ru''v' - (rv'')'u + rv''u']_0^1.$$
Under any of the conditions (13.1.12), (13.1.13), the bracketed term is zero at each end.
Theorem 13.1.2 Eigenvalues of an Euler-Bernoulli system are non-negative, and are positive iff the system is positive.

Proof. Suppose $u(x)$ is an eigenfunction of equation (13.1.4) corresponding to $\lambda$, then
$$Bu = \lambda au.$$
Thus $(Bu, \bar u) = \lambda(au, \bar u)$. But $(Bu, \bar u) = (u, B\bar u) = (u, \bar\lambda a\bar u) = \bar\lambda(au, \bar u)$. Thus $\lambda = \bar\lambda$ and $\lambda$ is real. Now
$$(Bu, u) = [(ru'')'u - ru''u']_0^1 + \int_0^1r(u'')^2dx$$
so that
$$\lambda(au, u) = h_1u^2(0) + h_2u^2(1) + k_1[u'(0)]^2 + k_2[u'(1)]^2 + \int_0^1r(u'')^2dx. \tag{13.1.15}$$
Since $u(x)$ is an eigenfunction, $(au, u) > 0$. There can be a zero eigenvalue only if the right hand side of (13.1.15) is identically zero. The integral is zero only if $u''(x) \equiv 0$, i.e., $u(x) = cx + d$. Each of the other terms must be separately zero, so that
$$h_1d^2 = 0 = h_2(c+d)^2 = k_1c^2 = k_2c^2. \tag{13.1.16}$$
Suppose $h_1, h_2, k_1, k_2$ are finite. Equation (13.1.16) implies that either $k_1 = 0 = k_2$, in which case the system is not positive, or $c = 0$. If $c = 0$, then either $h_1 = 0 = h_2$, in which case the system is not positive; or $d = 0$, in which case $u(x) \equiv 0$, so that $u(x)$ is not an eigenfunction. We conclude that if $h_1, h_2, k_1, k_2$ are finite, then the eigenvalues are positive only if the system is positive. The cases when one or more of the $h_i, k_i$ are infinite may be considered similarly (Ex. 13.1.2).

Before introducing the Green's function in general, we consider the special case of a cantilever beam, i.e., a beam clamped at $x = 0$, free at $x = 1$. If a unit concentrated load (made dimensionless as in (13.1.2)) is applied to the beam at $x = s$ $(0 < s \le 1)$, then the deflection $u(x)$ and its first two derivatives will be continuous in $[0, 1]$, while its third derivative will have a jump discontinuity at $x = s$. Equilibrium demands
$$[(r(x)u''(x))']_{x=s-}^{x=s+} = 1.$$
The end conditions at $x = 1$, namely $u''(1) = 0 = u'''(1)$, then yield
$$(r(x)u''(x))' = \begin{cases} -1, & 0 \le x < s, \\ 0, & s \le x \le 1, \end{cases}$$
and
$$r(x)u''(x) = \begin{cases} s - x, & 0 \le x < s, \\ 0, & s \le x \le 1. \end{cases}$$
Thus
$$u'(x) = \int_0^{x_0}\frac{(s-t)}{r(t)}dt,$$
where $x_0 = \min(x, s)$, so that the displacement, i.e., the Green's function $G(x, s)$, is
$$G(x, s) = \int_0^{x_0}\frac{(x-t)(s-t)}{r(t)}dt. \tag{13.1.17}$$
Under general end-conditions of the form (13.1.12), (13.1.13), the Green's function has the following properties:
1. $G(x, s)$ is, for fixed $s$, a continuous function of $x$, and satisfies the end-conditions (13.1.12), (13.1.13).
2. Except at $x = s$, the first four derivatives of $G(x, s)$ w.r.t. $x$ are continuous in $[0, 1]$. At $x = s$, the third derivative has a jump discontinuity given by
$$\left[\frac{\partial}{\partial x}\left(r(x)\frac{\partial^2G(x,s)}{\partial x^2}\right)\right]_{x=s-}^{x=s+} = 1. \tag{13.1.18}$$
3. $B_xG(x, s) = 0$ for $0 \le x < s$ and $s < x \le 1$.

Theorem 13.1.3 The Green's function is symmetric, i.e., $G(x, s) = G(s, x)$.

Proof. See Ex. 13.1.3.

Theorem 13.1.4 If $f(x)$ is piecewise continuous then
$$u(x) = \int_0^1G(x, s)f(s)ds \tag{13.1.19}$$
is a solution of
$$Bu = f(x), \tag{13.1.20}$$
and satisfies the end conditions (13.1.12), (13.1.13). Conversely, if $u(x)$ satisfies (13.1.20) and the end conditions (13.1.12), (13.1.13), then it can be represented by (13.1.19).

This follows immediately from the properties (1)-(3). The construction procedure used for the cantilever beam can be generalised. It can be shown (Ex. 13.1.5) that
$$G(x, s) = \begin{cases} \phi(x)\theta(s) + \psi(x)\chi(s), & 0 \le x \le s, \\ \phi(s)\theta(x) + \psi(s)\chi(x), & s \le x \le 1, \end{cases} \tag{13.1.21}$$
where $\phi(x), \psi(x)$ are linearly independent solutions of $Bu = 0$ satisfying the end-conditions at $x = 0$, while $\theta(x), \chi(x)$ are linearly independent solutions of $Bu = 0$ satisfying the end-conditions at $x = 1$. Note that for (13.1.17) these functions are
$$\phi(x) = \int_0^x\frac{(x-t)dt}{r(t)}, \qquad \psi(x) = -\int_0^x\frac{t(x-t)dt}{r(t)}, \qquad \theta(x) = x, \qquad \chi(x) = 1. \tag{13.1.22}$$
Theorem 13.1.4 allows us to replace the differential equation (13.1.4) and end-conditions (13.1.12), (13.1.13) by the integral equation
$$u(x) = \lambda\int_0^1a(s)G(x, s)u(s)ds. \tag{13.1.23}$$
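The cantilever Green's function (13.1.17) is easy to evaluate numerically. The sketch below uses composite Simpson quadrature; the function name, the number of panels and the sample stiffness functions are assumptions made for the illustration. For the uniform beam, $r \equiv 1$, the integral can be done by hand and gives $G(x, s) = x^2(3s - x)/6$ for $x \le s$, which the code reproduces.

```python
def cantilever_green(x, s, r, steps=2000):
    """G(x, s) of (13.1.17): integral of (x - t)(s - t)/r(t) over [0, min(x, s)].

    Uses composite Simpson quadrature; `steps` must be even.
    """
    upper = min(x, s)
    if upper <= 0.0:
        return 0.0
    h = upper / steps

    def integrand(t):
        return (x - t) * (s - t) / r(t)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def uniform(t):
    return 1.0                 # uniform beam, r = 1

def tapered(t):
    return (1.0 + t) ** 3      # illustrative tapered stiffness (assumption)

g_uniform = cantilever_green(0.4, 0.7, uniform)
g_closed_form = 0.4 ** 2 * (3.0 * 0.7 - 0.4) / 6.0
g_tapered_xs = cantilever_green(0.4, 0.7, tapered)
g_tapered_sx = cantilever_green(0.7, 0.4, tapered)
```

The two tapered values agree, in line with the symmetry $G(x, s) = G(s, x)$, and are smaller than the uniform value because the tapered beam is stiffer everywhere.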
Exercises 13.1
1. $\phi_{1,1} = 1$ and $\phi_{1,2} = cx + d$ will be orthogonal rigid-body modes of a free-free beam if
$$\int_0^1a(x)\phi_{1,1}(x)\phi_{1,2}(x)dx = 0.$$
Show that when $a(x)$ is symmetrical about $x = \tfrac12$, i.e., $a(x) = a(1-x)$, then $\phi_{1,2}(x) = c(x - \tfrac12)$.
2. Show that eigenfunctions $\phi_i(x), \phi_j(x)$ of (13.1.4), (13.1.12), (13.1.13) corresponding to different eigenvalues $\lambda_i, \lambda_j$ are orthogonal, i.e.,
$$(\phi_i, a\phi_j) = \int_0^1a(x)\phi_i(x)\phi_j(x)dx = \delta_{ij}.$$
Show also that
$$(\phi_i'', r\phi_j'') = \int_0^1r(x)\phi_i''(x)\phi_j''(x)dx = \lambda_i\delta_{ij}.$$
3. Show that the Green's function $G(x, s)$ for (13.1.4) under the end-conditions (13.1.12), (13.1.13) is symmetric.
4. Show that the Green's function for a pinned-pinned beam is
$$G(x, s) = xs\int_s^1\frac{(1-t)^2dt}{r(t)} + (1-x)(1-s)\int_0^x\frac{t^2dt}{r(t)} + x(1-s)\int_x^s\frac{t(1-t)dt}{r(t)}$$
when $x \le s$. Identify $\theta, \phi, \psi, \chi$ for this $G(x, s)$. Show that $G(x, s) > 0$ when $x, s \in (0, 1)$, and that $y(x) = G(x, s)$ satisfies $y'(0) > 0$, $y'(1) < 0$.
5. Establish (13.1.21). Use the fact that $u(x) = G(x, s)$ has the forms
$$u(x) = \begin{cases} c(s)\phi(x) + d(s)\psi(x), & 0 \le x < s, \\ e(s)\theta(x) + f(s)\chi(x), & s < x \le 1. \end{cases}$$
Now use the facts that $u, u', u''$ are continuous at $s$ while the third derivative has the jump given by (13.1.18).
13.2 Oscillatory properties of the Green's function
First we prove some preliminary results. Theorem 13.2.1 Under the end conditions (13.1.12), (13.1.13) for finite positive k1 > k2 > n1 > n2 , the Green’s function x({) = J({> v) for 0 v 1 satisfies ½ f 0 { ? v 0 00 0 P ({) := (u({)x ({)) = (13.2.1) 1f v?{1 where 0 ? f ? 1. Proof. Properties (2) and (3) of the Green’s function imply that (13.2.1) hold for some f; we prove that 0 ? f ? 1. Consider the values of P({) u({)x00 ({). The end conditions (13.1.13) preclude the case in which P(0) 0 and P(1) 0, for then x0 (0) 0, x0 (1) 0 so that x00 ({0 ) 0, i.e., P ({0 ) 0 for some {0 5 (0> 1). But P({) is linear in each of (0> v) and (v> 1), so that P(v) 0, P 0 (v) = (P(v) P(0))@v 0, P 0 (v+) = (P(1)P(v))@(1v) 0, and therefore [P 0 (v)]+ 0, contradicting (13.1.18). Secondly, if f 0 or f 1, i.e., if P 0 ({) has the same weak sign throughout [0> 1], then the case P(0) 0, P(1) 0 is excluded. For then P({) 0 throughout [0> 1], while the end conditions yield x0 (0) 0, x0 (1) 0 which contradicts x00 ({) 0 for all { 5 [0> 1]. Clearly, x({) 0 is excluded. Suppose f 0, so that P 0 ({) 0, { 5 [0> 1], then P(0) 0, P (1) 0, since all other cases are excluded. Thus, the end condition (13.1.13) yields x0 (0) 0, x0 (1) 0 and, since P ({) is piecewise linear, we may argue as before that x0 ({) 0 for all { 5 [0> 1]. But P 0 (0) 0, P 0 (1) 0 and the end condition (13.1.12) imply x(0) 0, x(1) 0 contradicting x0 ({) 0 for all { 5 [0> 1]. The case x0 ({) 0 is excluded by (13.1.18). If f 1 then P 0 ({) 0, { 5 [0> 1] and P(0) 0, P(1) 0 and the end conditions yield x0 (0) 0, x0 (1) 0 and, as before, x0 ({) 0 for { 5 [0> 1]. But now x(0) 0, x(1) 0, which is again contradictory. We conclude that 0 ? f ? 1. Corollary 13.2.1 P ({) cannot have the same sign throughout [0> 1]. Proof. If P ({) 0, { 5 [0> 1], then P(0) 0 and P(1) 0, which has been excluded. If P ({) 0 then P (0) 0, P(1) 0 so that the end conditions (13.1.13) yield x0 (0) 0, x0 (1) 0 which contradicts P ({) 0. 
In this Theorem and Corollary, we have assumed that $h_1, h_2, k_1, k_2$ are finite and positive, but the results still hold if some or all of them are infinite, or if some of them are zero, provided that $h_1 + h_2 > 0$, $k_1 + k_2 > 0$, i.e., provided the system is positive. See Ex. 13.2.1.
Theorem 13.2.2 Under the end conditions (13.1.12), (13.1.13), the Green's function satisfies
$$G(x,s) \ge 0, \quad x, s \in [0,1], \tag{13.2.2}$$
$$G(x,s) > 0, \quad x, s \in I. \tag{13.2.3}$$

Proof. Here $I$ has the same meaning as in Chapter 10: it is $[0,1]$ if $h_1, h_2$ are finite, $(0,1]$ if $h_1 = \infty$, i.e., $u(0) = 0$, etc. Theorem 13.2.1 and the Corollary show that $M(x)$ cannot have the same sign throughout $[0,1]$; there is one zero to the left of $s$ and/or one zero to the right. If $u(x) = G(x,s)$, then the three possible forms of $M'(x), M(x), u'(x), u(x)$ are shown in Figure 13.2.1.
Figure 13.2.1 - The formation of the Green's function, showing $M'(x), M(x), u'(x), u(x)$ in $[0,1]$.
In anticipation of the next result, we recall a classical theorem and prove a refinement.

Theorem 13.2.3 (Rolle) Suppose $\phi(x)$ is continuous in $[a,b]$ and differentiable in $(a,b)$. If $\phi(a) = 0 = \phi(b)$ then $\exists\, c \in (a,b)$ such that $\phi'(c) = 0$, i.e., $\phi'(x)$ has a zero place in $(a,b)$.

We need the following refinement.

Theorem 13.2.4 Suppose $\phi(x)$ is continuous in $[a,b]$ and differentiable in $(a,b)$. If $\phi(a) = 0 = \phi(b)$ and $\phi(x)$ is not identically zero in $[a,b]$, then $\phi'(x)$ has a nodal place in $(a,b)$.

We recall that $f(x)$ is said to have a node at $c$ if in any two-sided vicinity of $c$ there are points $\xi_1, \xi_2$ such that $\xi_1 < c < \xi_2$ and $f(\xi_1)f(\xi_2) < 0$. Alternatively $f(x)$ can have a nodal interval $[c,d]$ such that in any two-sided vicinity of $[c,d]$ there are points $\xi_1, \xi_2$ such that $\xi_1 < c < d < \xi_2$ and $f(\xi_1)f(\xi_2) < 0$.

Proof. Since $\phi(x)$ is continuous in $[a,b]$, it assumes its maximum and minimum values in $[a,b]$. Since $\phi(x)$ is not identically zero, one of these must be non-zero. Without loss of generality we may suppose that it is the maximum; it will therefore be assumed at one or more points $\xi \in (a,b)$, or in an interval $[c,d] \subset (a,b)$. In the former case $\xi$ is a node of $\phi'(x)$, in the latter $[c,d]$ is a nodal place.

Theorem 13.2.5 Suppose $\phi(x), \phi'(x)$ are continuous in $[a,b]$, and $\phi'(x)$ has $n$ nodes $(\xi_i)_1^n$ such that $a = \xi_0 < \xi_1 < \xi_2 < \cdots < \xi_n < \xi_{n+1} = b$; then the function $\phi(x)$ has at most one zero place in each of the intervals $[a,\xi_1], [\xi_1,\xi_2], \ldots, [\xi_n,b]$, $n+1$ in all. If $\phi(a)\phi'(a) > 0$ then $\phi(x)$ has no zero in $[a,\xi_1]$, while if $\phi(b)\phi'(b) < 0$ it has no zero in $[\xi_n,b]$. The satisfaction of each of these inequalities thus reduces the number of zero places of $\phi(x)$ by 1.

Proof. The first part follows from Theorem 13.2.4: if $\phi(x)$ had two zeros in $[\xi_i,\xi_{i+1}]$, then $\phi'(x)$ would have a nodal place in $(\xi_i,\xi_{i+1})$, contrary to hypothesis. For the second we note that if $\phi(a)\phi'(a) > 0$ then $\phi(a)\phi'(\xi) > 0$ for $\xi \in [a,\xi_1)$. The mean value theorem states that for every $x \in (a,\xi_1]$ there is a $\xi \in (a,x)$ such that
$$\phi(a)\phi(x) = \phi(a)[\phi(a) + (x-a)\phi'(\xi)] > 0.$$
Thus $\phi(x)$ has no zero in $[a,\xi_1]$. Similarly, if $\phi(b)\phi'(b) < 0$ then $\phi(x)$ has no zero in $[\xi_n,b]$.

Corollary 13.2.2 If instead of being continuous and having nodes at $(\xi_i)_1^n$, $\phi'(x)$ is continuous and of one sign in each of the intervals $[a,\xi_1), (\xi_1,\xi_2), \ldots, (\xi_n,b]$, and has jumps and may thus change sign only at $(\xi_i)_1^n$, then the results concerning $\phi(x)$ still hold.

We are now ready for

Theorem 13.2.6 Under the action of $n$ forces $(F_i)_1^n$ acting at $(s_i)_1^n$, where $0 \le s_1 < s_2 < \cdots < s_n \le 1$, the beam can reverse its sign at most $n-1$ times.
Proof. We assume that the beam is a positive system, as described in Section 13.1. First, we assume that $h_1, h_2, k_1, k_2$ are positive, $s_1 > 0$ and $s_n < 1$. The deflection of the beam is
$$u(x) = \sum_{i=1}^n F_i G(x,s_i),$$
and because of (13.1.18) it satisfies
$$M'(x) = \begin{cases} c_0, & x \in [0,s_1), \\ c_i, & x \in (s_i,s_{i+1}),\ i = 1, \ldots, n-1, \\ c_n, & x \in (s_n,1], \end{cases}$$
where
$$c_i = c_0 + \sum_{j=1}^i F_j, \quad i = 1, 2, \ldots, n.$$
Thus $M'(x)$ has the property stated in the Corollary to Theorem 13.2.5, so that $M(x)$ has at most $n+1$ zero places, at most one in each of $[0,s_1], [s_1,s_2], \ldots, [s_n,1]$. Thus $M(x)$, and therefore $u''(x)$, has at most $n+1$ nodes in $(0,1)$, so that by Theorem 13.2.5, $u'(x)$ has at most $n+2$ nodes and $u(x)$ has at most $n+3$ nodes in $(0,1)$.

Now consider the sequences $M'(0), M(0), u'(0), u(0)$ and $M'(1), M(1), u'(1), u(1)$, i.e.,
$$-h_1u(0),\ k_1u'(0),\ u'(0),\ u(0) \quad \text{and} \quad h_2u(1),\ -k_2u'(1),\ u'(1),\ u(1).$$
First, suppose that $u(0), u'(0), u(1), u'(1)$ are all non-zero; then equation (13.1.13) shows that
$$u'(0)u''(0) > 0 \quad \text{and} \quad u'(1)u''(1) < 0.$$
But then Theorem 13.2.5 states that $u'(x)$ has at most $n$ nodes, and $u(x)$ has at most $n+1$ nodes. Now
$$M'(0)M(0) = -h_1k_1u'(0)u(0), \qquad M'(1)M(1) = -h_2k_2u'(1)u(1),$$
so that either $M'(0)M(0) > 0$ or $u'(0)u(0) > 0$, and either $M'(1)M(1) < 0$ or $u'(1)u(1) < 0$. If one of the left hand inequalities is satisfied then $M(x)$ has one less node than before, while if one of the right hand inequalities is satisfied then $u(x)$ has one less node. Thus in any case $u(x)$ has at most $n-1$ nodes.

A detailed consideration of special cases is left to the exercises, but in the typical case $h_1 = 0$, $k_1$ finite and non-zero, we may argue as follows. $M'(0) = 0$, so that $M'(x) \equiv 0$ in $[0,s_1]$, $M(x) = k_1u'(0)$ in $[0,s_1]$, so that $M(x)$ has no node in $[0,s_1]$; $u''(0)u'(0) > 0$ so that the remainder of the argument holds.
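The statement of Theorem 13.2.6 can be checked numerically in a simple special case. The sketch below is illustrative only: it assumes a uniform cantilever with $r(x) = 1$, for which the Green's function has the closed form $G(x,s) = m^2(3M - m)/6$ with $m = \min(x,s)$, $M = \max(x,s)$, and the force values and positions are arbitrary choices, not data from the text. The function names are hypothetical.

```python
def green_cantilever(x, s):
    """Green's function of a uniform cantilever, assuming r(x) = 1."""
    lo, hi = min(x, s), max(x, s)
    return lo * lo * (3.0 * hi - lo) / 6.0


def deflection(x, forces, points):
    """u(x) = sum_i F_i G(x, s_i)."""
    return sum(F * green_cantilever(x, s) for F, s in zip(forces, points))


def sign_reversals(values, tol=1e-14):
    """Count strict sign reversals in a sequence, ignoring (near-)zero entries."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Illustrative data (assumed): four alternating forces.
points = [0.2, 0.45, 0.7, 0.95]
forces = [1.0, -2.0, 2.5, -1.0]
grid = [i / 2000 for i in range(2001)]
u = [deflection(x, forces, points) for x in grid]
reversals = sign_reversals(u)
print(reversals, "sign reversals for", len(forces), "forces")
```

Sampling on a grid can only undercount reversals, so this is a consistency check rather than a proof.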
Theorem 13.2.6 holds the key to the proof that the Green's function is an oscillatory kernel. However, in order to prove this, we must continue the investigation of oscillatory systems of functions started in Section 10.5. First, we introduce

Definition 13.2.1 The function $\phi(x)$ is said to reverse its sign $k$ times in the interval $I$, and this is denoted by $S_\phi = k$, if there are $k+1$ points $(x_i)_1^{k+1}$ in $I$ such that $x_1 < x_2 < \cdots < x_{k+1}$ and
$$\phi(x_i)\phi(x_{i+1}) < 0, \quad i = 1, 2, \ldots, k,$$
and there do not exist $k+2$ points with this property.

Evidently, if $\phi(x)$ is continuous in $[0,1]$ and $S_\phi = k$, then $\phi(x)$ has $k$ nodal places in $(0,1)$.

Before going further we state a basic composition formula. Suppose $\{\phi_i(x)\}_1^n$ are continuous in $[0,1]$, $M(x,s)$ is continuous in $[0,1] \times [0,1]$, and
$$\psi_i(x) = \int_0^1 M(x,s)\phi_i(s)\,ds;$$
then
$$\Psi(\mathbf{x};\theta) = \int\!\!\int\!\cdots\!\int_V M(\mathbf{x};\mathbf{s})\,\Phi(\mathbf{s};\theta)\,d\mathbf{s}, \tag{13.2.4}$$
where $V$ is the simplex defined by $0 \le s_1 < s_2 < \cdots < s_n \le 1$,
$$\Psi(\mathbf{x};\theta) = \det(\psi_i(x_j)) = \Psi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n),$$
$$M(\mathbf{x};\mathbf{s}) = \det(M(x_i,s_j)),$$
$$\Phi(\mathbf{s};\theta) = \det(\phi_i(s_j)) = \Phi(s_1, s_2, \ldots, s_n; 1, 2, \ldots, n),$$
$d\mathbf{s} = ds_1\,ds_2 \cdots ds_n$, and $\theta = \{1, 2, \ldots, n\}$.

Theorem 13.2.7 Suppose $\phi(x)$ is continuous in $[0,1]$ and $S_\phi \le n-1$. If $M(x,s)$ is a continuous kernel with the property
$$M(\mathbf{x};\mathbf{s}) > 0 \quad \text{for } \mathbf{x}, \mathbf{s} \in Q,$$
then
$$\psi(x) = \int_0^1 M(x,s)\phi(s)\,ds$$
does not vanish more than $n-1$ times in $[0,1]$.

Proof. $Q$ is defined in Definition 10.5.1. By assumption there are $n+1$ points $(\xi_i)_0^n$ such that $0 = \xi_0 < \xi_1 < \cdots < \xi_n = 1$ and $\phi(x)$ has one sign and is not identically zero on each interval $(\xi_{i-1},\xi_i)$, $i = 1, 2, \ldots, n$. Put
$$\psi_i(x) = \int_{\xi_{i-1}}^{\xi_i} M(x,s)\phi(s)\,ds;$$
then
$$\psi(x) = \sum_{i=1}^n \psi_i(x),$$
and for all $(x_i)_1^n$ such that $0 \le x_1 < x_2 < \cdots < x_n \le 1$ we have (Ex. 13.2.2)
$$\Psi(\mathbf{x};\theta) = \int_{\xi_{n-1}}^{\xi_n} \cdots \int_0^{\xi_1} M(\mathbf{x};\mathbf{s})\,\phi(s_1)\phi(s_2)\cdots\phi(s_n)\,d\mathbf{s}. \tag{13.2.5}$$
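The integrand is not identically zero and its non-zero values have one and the same sign, so that $\Psi(\mathbf{x};\theta)$ is strictly of one sign, and hence the $(\psi_i(x))_1^n$ form a Chebyshev sequence (Definition 10.6), and $\psi(x)$ does not vanish more than $n-1$ times in $[0,1]$.

An important kernel that satisfies the conditions of Theorem 13.2.7 is provided by
$$M_\varepsilon(x,y) = \frac{2}{\varepsilon\sqrt{\pi}}\exp\{-(x-y)^2/\varepsilon^2\}.$$
This kernel has a remarkable property. Suppose $\phi(x) \in C[0,1]$, and define
$$\psi(x,\varepsilon) = \int_0^1 M_\varepsilon(x,s)\phi(s)\,ds. \tag{13.2.6}$$
If we define $\phi(s) = 0$ for $s > 1$, then
$$\psi(x,\varepsilon) = \frac{2}{\varepsilon\sqrt{\pi}}\int_0^\infty \exp\{-(x-s)^2/\varepsilon^2\}\phi(s)\,ds = \frac{2}{\sqrt{\pi}}\int_0^\infty \exp(-\xi^2)\phi(x+\varepsilon\xi)\,d\xi,$$
and
$$\lim_{\varepsilon\to 0}\psi(x,\varepsilon) = \frac{2}{\sqrt{\pi}}\int_0^\infty \exp(-\xi^2)\,d\xi \cdot \phi(x) = \phi(x). \tag{13.2.7}$$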
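The smoothing property behind (13.2.7) can be observed numerically. The following is a minimal sketch under stated assumptions: the Gaussian is normalised here to unit mass over the whole line, $K_\varepsilon(x,s) = \exp\{-(x-s)^2/\varepsilon^2\}/(\varepsilon\sqrt{\pi})$, and $\phi$ is taken to vanish outside $[0,1]$, so that the smoothed function tends to $\phi(x)$ at an interior point. A positive constant factor in the kernel does not affect the sign-counting arguments. The test function $\cos 3s$, the point $x = 0.5$ and the helper name `smoothed` are illustrative choices.

```python
import math


def smoothed(phi, x, eps, n=20000):
    """Midpoint-rule value of the integral over [0, 1] of K_eps(x, s) * phi(s),
    with K_eps(x, s) = exp(-((x - s) / eps)**2) / (eps * sqrt(pi))."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += math.exp(-((x - s) / eps) ** 2) * phi(s)
    return total * h / (eps * math.sqrt(math.pi))


def phi(s):
    return math.cos(3.0 * s)


x = 0.5
errors = [abs(smoothed(phi, x, eps) - phi(x)) for eps in (0.1, 0.03, 0.01)]
print(errors)  # the error shrinks as eps decreases
```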
Using this kernel we may prove

Theorem 13.2.8 Let $\{\phi_i(x)\}_1^n$ be linearly independent functions in $C[0,1]$, and define
$$\phi(x) = \sum_{i=1}^n c_i\phi_i(x).$$
The necessary and sufficient condition for $S_\phi \le n-1$ in $[0,1]$ for all $c_i$ not all zero is that
$$\Phi(\mathbf{x};\theta) \equiv \Phi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n), \quad \mathbf{x} \in Q,$$
should have fixed sign, i.e., one and the same sign for those points for which it is not zero.

Proof. If for certain $c_i$ not all zero, $S_\phi \le n-1$ and
$$\psi_i(x,\varepsilon) = \int_0^1 M_\varepsilon(x,s)\phi_i(s)\,ds,$$
then Theorem 13.2.7 shows that
$$\psi(x,\varepsilon) = \sum_{i=1}^n c_i\psi_i(x,\varepsilon)$$
vanishes in $[0,1]$ not more than $n-1$ times. Conversely, equation (13.2.7) shows that if $\psi(x,\varepsilon)$ vanishes not more than $n-1$ times, then $S_\phi \le n-1$. Thus $S_\phi \le n-1$ for all $c_i$, not all zero, iff $\{\psi_i(x,\varepsilon)\}_1^n$ form a Chebyshev system in $[0,1]$, i.e., iff $\Psi_\varepsilon(\mathbf{x};\theta) := \det(\psi_i(x_j,\varepsilon))$ has strictly fixed sign when $\mathbf{x} \in Q$. If $\Psi_\varepsilon(\mathbf{x};\theta)$ has strictly fixed sign, then
$$\Phi(\mathbf{x};\theta) = \lim_{\varepsilon\to 0}\Psi_\varepsilon(\mathbf{x};\theta)$$
will have fixed sign. On the other hand, since the $\{\phi_i(x)\}_1^n$ are linearly independent, $\Phi$ will not be identically zero. Thus (13.2.4) with $M = M_\varepsilon$ shows that if $\Phi$ has fixed sign, then $\Psi_\varepsilon$ will have strictly fixed sign.

We have now established all the results needed to prove

Theorem 13.2.9 The Green's function of a positive Euler-Bernoulli system is oscillatory.

Proof. There are three conditions to be fulfilled in Definition 10.5.1. Theorem 13.2.2 yields i), and the argument via strain energy yields iii). It remains to prove ii). Theorem 13.2.6 states that if
$$u(x) = \sum_{i=1}^n F_i G(x,s_i)$$
then $S_u \le n-1$. Put $\phi_i(x) = G(x,s_i)$; then, since the $\{\phi_i(x)\}_1^n$ are linearly independent, Theorem 13.2.8 states that
$$\Phi(x_1, x_2, \ldots, x_n; 1, 2, \ldots, n) = G(\mathbf{x};\mathbf{s}) \ge 0$$
for $\mathbf{x}, \mathbf{s} \in Q$. This is ii).

The following theorem states which of the determinants in ii) are zero and which are non-zero; it is the analogue of Theorem 10.5.4 for the beam.

Theorem 13.2.10 $G(\mathbf{x};\mathbf{s}) > 0$ iff $\mathbf{x}, \mathbf{s} \in \mathcal{I}$ and $x_i < s_{i+2}$ and $s_i < x_{i+2}$ for $i = 1, 2, \ldots, n-2$.
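The dichotomy in Theorem 13.2.10 can be illustrated on a concrete kernel. The sketch below again assumes the uniform cantilever with $r(x) = 1$, whose Green's function is $G(x,s) = m^2(3M - m)/6$ with $m = \min(x,s)$, $M = \max(x,s)$; the sample points are arbitrary, and the helper names are hypothetical. It evaluates two minors that satisfy the conditions of the theorem and one third-order minor with $x_1 \ge s_3$, which should vanish.

```python
def green_cantilever(x, s):
    """Green's function of a uniform cantilever, assuming r(x) = 1."""
    lo, hi = min(x, s), max(x, s)
    return lo * lo * (3.0 * hi - lo) / 6.0


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def green_minor(xs, ss):
    """G(x; s) = det(G(x_i, s_j)) for increasing sequences xs, ss."""
    return det([[green_cantilever(x, s) for s in ss] for x in xs])


positive_2 = green_minor([0.2, 0.3], [0.7, 0.9])            # n = 2: no extra condition
positive_3 = green_minor([0.3, 0.6, 0.9], [0.3, 0.6, 0.9])  # x_1 < s_3 and s_1 < x_3
degenerate_3 = green_minor([0.7, 0.8, 0.9], [0.1, 0.2, 0.3])  # x_1 >= s_3
print(positive_2, positive_3, degenerate_3)
```

In the degenerate case every entry has $x_i \ge s_j$, so the matrix has rank 2 and the computed determinant is zero up to rounding.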
Proof. The first condition is necessary; for if one of $x_1, x_n, s_1, s_n$ is not in $I$, e.g., $x_1 = 0$, then $G(x_1,s_i) = 0$ for $i = 1, 2, \ldots, n$ and the determinant is zero. Now suppose there is an index $k$ such that $1 \le k \le n-2$ and $x_k \ge s_{k+2}$. Then $x_i \ge s_j$ for $i = k, k+1, \ldots, n$ and $j = 1, 2, \ldots, k+2$. Consider entries in the submatrix taken from rows $k, k+1, \ldots, n$ and columns $1, 2, \ldots, k+2$ of the matrix $(G(x_i,s_j))$. Since $x_i \ge s_j$ for each entry, equation (13.1.21) shows that
$$G(x_i,s_j) = \phi(s_j)\theta(x_i) + \psi(s_j)\chi(x_i),$$
so that the matrix will have rank 2. If $n = 3$, then $k = 1$ and the submatrix is the complete matrix, which has rank 2 and therefore has zero determinant. If $n \ge 4$, then we evaluate the $n \times n$ determinant using Laplace's expansion with minors of order $k+2$ taken from the first $k+2$ columns; each such minor, having $k+2 \ge 3$ rows, will be zero, so that the determinant will be zero. Thus $x_i < s_{i+2}$ is necessary for $G(\mathbf{x};\mathbf{s})$ to be positive, and so similarly is $s_i < x_{i+2}$.

Now we prove the sufficiency. Suppose $\mathbf{x}, \mathbf{s} \in \mathcal{I}$, $x_i < s_{i+2}$, and $s_i < x_{i+2}$ for $i = 1, 2, \ldots, n-2$. We will prove the determinant is positive by induction. When $n = 1$, the result holds, by Theorem 13.2.2. Suppose that, if possible, it holds for $n-1$, but not for $n$, i.e., there exist $(x_i')_1^n, (s_i')_1^n$ in $I$ satisfying $x_i' < s_{i+2}'$, $s_i' < x_{i+2}'$ for $i = 1, 2, \ldots, n-2$ such that $G(x_1', x_2', \ldots, x_n'; s_1', s_2', \ldots, s_n') = 0$, but $G(x_1', x_2', \ldots, x_{n-1}'; s_1', s_2', \ldots, s_{n-1}') > 0$ and $G(x_2', x_3', \ldots, x_n'; s_2', s_3', \ldots, s_n') > 0$. Now choose arbitrary points $(x_i)_1^n, (s_i)_1^n$ such that
$$x_1' \le x_1 < x_2 < \cdots < x_n \le x_n', \qquad s_1' \le s_1 < s_2 < \cdots < s_n \le s_n',$$
and renumber $(x_i)_1^n, (x_i')_1^n$ increasingly as $(x_i'')_1^{2n}$, and similarly for the $s$ points. The $2n \times 2n$ matrix $(G(x_i'',s_j''))$ is TN and the minors corresponding to $(x_i')_1^n$ and $(s_i')_1^n$ fit the criteria of Theorem 6.6.6. Therefore, the matrix has rank $n-1$, so that
$$G(x_1, x_2, \ldots, x_n; s_1, s_2, \ldots, s_n) = 0. \tag{13.2.8}$$
There are two cases: $n \ge 3$ and $n = 2$. In the first case $x_1' < s_3' \le s_n'$ and $s_1' < x_3' \le x_n'$ imply that the intervals $(x_1',x_n')$ and $(s_1',s_n')$ overlap. We may therefore take $x_i = s_i$ for $i = 1, 2, \ldots, n$, so that (13.2.8) yields $G(x_1, \ldots, x_n; x_1, \ldots, x_n) = 0$, contradicting condition iii) of Definition 10.5.1. If $n = 2$, equation (13.2.8) states that $G(x_1, x_2; s_1, s_2) = 0$ for all $x_1, x_2, s_1, s_2$ satisfying $x_1' \le x_1 < x_2 \le x_2'$, $s_1' \le s_1 < s_2 \le s_2'$. Without loss of generality we can take $x_2' \le s_1'$, so that $x_1 < s_1 < s_2$, $x_2 \le s_1 < s_2$ and
$$G(x_1, x_2; s_1, s_2) = \begin{vmatrix} \phi(x_1)\theta(s_1) + \psi(x_1)\chi(s_1) & \phi(x_1)\theta(s_2) + \psi(x_1)\chi(s_2) \\ \phi(x_2)\theta(s_1) + \psi(x_2)\chi(s_1) & \phi(x_2)\theta(s_2) + \psi(x_2)\chi(s_2) \end{vmatrix} = \begin{vmatrix} \phi(x_1) & \psi(x_1) \\ \phi(x_2) & \psi(x_2) \end{vmatrix} \cdot \begin{vmatrix} \theta(s_1) & \chi(s_1) \\ \theta(s_2) & \chi(s_2) \end{vmatrix} = 0.$$
One or other of the factors in this equation must be zero. Suppose that for some $s_1, s_2$ the second factor is not zero; then the first must be zero for all
$x_1, x_2$ such that $x_1' \le x_1 < x_2 \le x_2'$. But that means that $\phi(x), \psi(x)$ are proportional, contradicting the fact that they are linearly independent solutions of $(EIy'')'' = 0$ satisfying the end conditions at $x = 0$. Similarly, if the first factor is not zero for some $x_1, x_2$, then the second must be identically zero, which again is impossible. Hence, we have arrived at a contradiction. The stated conditions are sufficient to ensure that $G(x_1, x_2, \ldots, x_n; s_1, s_2, \ldots, s_n) > 0$.

Exercises 13.2

1. Establish Theorem 13.2.1 when some of the $h_i, k_i$ are $0$ or $\infty$, but the system is still positive.

2. Verify equation (13.2.4) in the case $n = 2$. Show that
$$\Psi(x_1, x_2; 1, 2) = \int_0^1\!\!\int_0^1 M(x_1, x_2; s_1, s_2)\,\phi_1(s_1)\phi_2(s_2)\,ds_2\,ds_1$$
$$= \frac{1}{2}\int_0^1\!\!\int_0^1 M(x_1, x_2; s_1, s_2)\,\Phi(s_1, s_2; 1, 2)\,ds_2\,ds_1$$
$$= \int_0^1\!\!\int_0^{s_1} M(x_1, x_2; s_1, s_2)\,\Phi(s_1, s_2; 1, 2)\,ds_2\,ds_1.$$

3. Establish equation (13.2.5) for $n = 2$.

4. Verify the Corollary of Theorem 13.2.5.

5. Establish Theorem 13.2.6 when some of the $h_i, k_i$ are $0$ or $\infty$, but the system is still positive.
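The composition formula (13.2.4) is the continuous counterpart of the Cauchy-Binet formula for the determinant of a matrix product. As a warm-up for Exercise 2, the following sketch checks the discrete $n = 2$ case exactly in integer arithmetic: the integral over $s$ is replaced by a sum over four sample points, and the sum over the simplex $s_1 < s_2$ becomes a sum over pairs of column indices. The matrices are arbitrary illustrative data, not taken from the text.

```python
from itertools import combinations


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# M[i][k] plays the role of M(x_i, s_k); Phi[k][j] plays the role of phi_j(s_k).
M = [[2, 1, 3, 0],
     [1, 4, 1, 2]]
Phi = [[1, 0],
       [2, 1],
       [0, 3],
       [1, 1]]

# psi_j(x_i) = sum_k M(x_i, s_k) phi_j(s_k)
product = [[sum(M[i][k] * Phi[k][j] for k in range(4)) for j in range(2)]
           for i in range(2)]
lhs = det2(product)

# Sum over s_a < s_b of M(x_1, x_2; s_a, s_b) * Phi(s_a, s_b; 1, 2)
rhs = sum(
    det2([[M[0][a], M[0][b]], [M[1][a], M[1][b]]])
    * det2([[Phi[a][0], Phi[a][1]], [Phi[b][0], Phi[b][1]]])
    for a, b in combinations(range(4), 2)
)
print(lhs, rhs)
```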
13.3 Nodes and zeros for the cantilever beam
For the cantilever beam the governing equations are
$$(r(x)u''(x))'' = \lambda a(x)u(x), \tag{13.3.1}$$
$$u(0) = 0 = u'(0), \qquad u''(1) = 0 = u'''(1). \tag{13.3.2}$$
The theory of Section 13.2 shows that the Green's function for the beam is an oscillatory kernel on $I = (0,1]$, so that the eigenvalues $(\lambda_i)_1^\infty$ are distinct, and the eigenfunctions $(\phi_i(x))_1^\infty$ have properties (1)-(3) stated in Theorem 10.6.4. We need to strengthen this classical result. To do so, we suppose, as in Section 13.1, that $a(x), r(x) \in C^2[0,1]$, and put $M(x) = r(x)u''(x)$. Equations (13.3.1)-(13.3.2) show that $M(x)$ satisfies
$$(b(x)M''(x))'' = \lambda s(x)M(x), \tag{13.3.3}$$
$$M''(0) = 0 = M'''(0), \qquad M(1) = 0 = M'(1), \tag{13.3.4}$$
where $b(x) = 1/a(x)$, $s(x) = 1/r(x)$. Thus $M(x)$ is an eigenfunction of a reversed cantilever on $(0,1)$, and is thus an eigenfunction of an oscillatory kernel on $[0,1)$. We now state
Theorem 13.3.1 If $\{\phi_i(x)\}_1^\infty$ are the eigenfunctions of a cantilever beam, then

1. $\phi_1(x), \phi_1'(x)$ have no zeros in $(0,1]$,

2. $\phi_i(x), \phi_i'(x)$ have $(i-1)$ nodes in $(0,1)$ and no other zeros in $(0,1]$,

3. If
$$\phi(x) = \sum_{i=j}^k c_i\phi_i(x), \quad 1 \le j \le k, \quad \sum_{i=j}^k c_i^2 > 0,$$
then $\phi(x)$ and $\phi'(x)$ have not less than $(j-1)$ nodes and not more than $(k-1)$ zeros in $(0,1]$,

4. $M_i(x) := r(x)\phi_i''(x)$ and $M_i'(x)$ have the properties 2) and 3) on $[0,1)$.

Proof. The stated properties of $\phi_i(x), M_i(x)$ follow from Theorem 10.6.4. We verify those for $\phi_i'(x), M_i'(x)$.

1) $\phi_1'(0) = 0$; if $\phi_1'(x_0) = 0$ for some $x_0 \in (0,1]$, then Rolle's Theorem 13.2.3 states that there is a $\xi \in (0,x_0)$ such that $\phi_1''(\xi) = 0$, contradicting the fact that $M_1(x)$ has no zero in $[0,1)$.

2) $\phi_i(x)$ has $i-1$ nodes $(x_j)_1^{i-1}$ in $(0,1)$ and a zero at $x_0 = 0$. By Theorem 13.2.4, $\phi_i'(x)$ has $i-1$ nodes $(\xi_j)_1^{i-1}$ satisfying $x_{j-1} < \xi_j < x_j$, $j = 1, \ldots, i-1$; it also has a zero at $x = 0$. If $\phi_i'(x)$ had any other zero in $(0,1]$ then $\phi_i''(x)$ would have more than $i-1$ zeros in $[0,1)$, contradicting 4) for $M_i$.

3) The part relating to the $\phi_i'(x)$ may be proved in a similar way. See Ex. 13.3.1.

4) This follows because $M_i(x)$ is an eigenfunction of the reversed cantilever.

Theorem 13.3.2 If $\phi_i(x)$ is an eigenfunction of a cantilever beam then $\phi_i(1)\phi_i'(1) > 0$.

Proof. Theorem 13.3.1 shows that $\phi_i(1)\phi_i'(1) \ne 0$; we show that $\phi_i(1)$ and $\phi_i'(1)$ have the same sign. The Green's function for the cantilever is given in equation (13.1.17), and
$$\phi_i(x) = \lambda_i\int_0^1 G(x,s)a(s)\phi_i(s)\,ds,$$
so that
$$\phi_i'(x) = \lambda_i\int_0^1 \frac{\partial G(x,s)}{\partial x}a(s)\phi_i(s)\,ds.$$
Since
$$\frac{\partial G}{\partial x}(x,s) = \int_0^{\min(x,s)} \frac{(s-t)\,dt}{r(t)},$$
we find that
$$[\phi_i'(x)]_x^1 = \lambda_i\int_x^1 a(s)\phi_i(s)\left\{\int_x^s \frac{(s-t)\,dt}{r(t)}\right\}ds. \tag{13.3.5}$$
Suppose $x^*$ is the largest zero of $\phi_i(x)$ in $[0,1)$; it will be a node if $i \ge 2$, and $0$ if $i = 1$. Since $\phi_i(1) > 0$, we have $\phi_i'(x^*) \ge 0$, and thus $\phi_i(x) > 0$ for $x \in (x^*,1]$. Thus equation (13.3.5) yields $\phi_i'(1) - \phi_i'(x^*) > 0$ so that $\phi_i'(1) > 0$.

Corollary 13.3.1 If $\phi_i(x)$ is an eigenfunction of a cantilever beam then $M_i(0)M_i'(0) < 0$, since $M_i(1-x)$ is an eigenfunction of a cantilever clamped at its left hand end.

Exercises 13.3

1. Establish the part 3) of Theorem 13.3.1 relating to $\phi_i'(x)$.
13.4 The fundamental conditions on the data
We are now in a position to prove the fundamental

Theorem 13.4.1 Suppose $a(x), r(x)$ have derivatives of all orders (this restriction can be relaxed but it is sufficient for our purposes); then the infinite matrix
$$P = \begin{bmatrix} u_1 & u_2 & u_3 & \cdots \\ \theta_1 & \theta_2 & \theta_3 & \cdots \\ \lambda_1u_1 & \lambda_2u_2 & \lambda_3u_3 & \cdots \\ \lambda_1\theta_1 & \lambda_2\theta_2 & \lambda_3\theta_3 & \cdots \\ \lambda_1^2u_1 & \lambda_2^2u_2 & \lambda_3^2u_3 & \cdots \\ \vdots & & & \end{bmatrix}$$
is TP. Here $u_i := \phi_i(1)$, $\theta_i := \phi_i'(1)$, and the $\phi_i(x)$ have been chosen so that $\phi_i(1) > 0$.

Before starting the proof proper, we give the gist of the argument in a simple case. Consider the determinant
$$\phi(x) = \begin{vmatrix} \phi_2(x) & \phi_3(x) & \phi_4(x) \\ u_2 & u_3 & u_4 \\ \theta_2 & \theta_3 & \theta_4 \end{vmatrix};$$
since this may be written
$$\phi(x) = \sum_{i=2}^4 c_i\phi_i(x),$$
Theorem 13.3.1 states that it has at least one node in $(0,1)$, and at most 3 zeros in $(0,1]$. In fact, since $\phi(1) = 0 = \phi'(1) = \phi''(1) = \phi'''(1)$, it has a fourfold zero at $x = 1$. Since Theorem 13.3.1 does not state how to count such a multiple zero, we must consider the zeros of $\phi'(x)$ and $\phi''(x)$. We know that $\phi(x)$ has one node in $(0,1)$ and has zeros at $0$ and $1$. Suppose $\phi(x)$ had two zeros $a_1, a_2$ in $(0,1)$. By using Theorem 13.2.3 we deduce that

$\phi'(x)$ has zeros $0, b_1, b_2, b_3, 1$,

$M(x) := r\phi''(x)$ has zeros $c_1, c_2, c_3, c_4, 1$.

But $M(x) = \sum_{i=2}^4 c_iM_i(x)$ and, by part 4) of Theorem 13.3.1, $M(x)$ has at most 3 zeros in $[0,1)$. This is a contradiction; $\phi(x)$ has just one zero, a node, in $(0,1)$. Now $\phi_i(x)$ has exactly $i-1$ changes of sign in $(0,1)$, so that $\phi_i(1) > 0$ implies $(-)^{i-1}\phi_i(0+) > 0$. We will eventually prove the Theorem by induction on the order of the minors. Suppose therefore that all the $2 \times 2$ minors of $P$ are positive; then, since
$$(-)\phi_2(0+) > 0, \quad (-)^2\phi_3(0+) > 0, \quad (-)^3\phi_4(0+) > 0,$$
we see by expanding the determinant $\phi(x)$ along its first row that $(-)\phi(0+) > 0$ and hence that $\phi(x) > 0$ immediately to the left of $x = 1$. Now expand $\phi(x)$ for small $x$ in a Taylor series about $x = 1$:
$$\phi(1-x) = \frac{x^4}{4!}\phi^{iv}(1) + O(x^5),$$
so that
$$\phi^{iv}(1) = \begin{vmatrix} \lambda_2u_2 & \lambda_3u_3 & \lambda_4u_4 \\ u_2 & u_3 & u_4 \\ \theta_2 & \theta_3 & \theta_4 \end{vmatrix} = \begin{vmatrix} u_2 & u_3 & u_4 \\ \theta_2 & \theta_3 & \theta_4 \\ \lambda_2u_2 & \lambda_3u_3 & \lambda_4u_4 \end{vmatrix} > 0.$$
We may treat
$$\psi(x) = \begin{vmatrix} \phi_2'(x) & \phi_3'(x) & \phi_4'(x) \\ \theta_2 & \theta_3 & \theta_4 \\ \lambda_2u_2 & \lambda_3u_3 & \lambda_4u_4 \end{vmatrix}$$
in exactly the same way: $\psi(x)$ has just one zero, a node, in $(0,1)$; $\phi_i'(1) > 0$ implies $(-)^{i-1}\phi_i'(0+) > 0$, $(-)\psi(0+) > 0$ and hence $\psi(x) > 0$ immediately to the left of $x = 1$. But
$$\psi(1-x) = \frac{x^4}{4!}\psi^{iv}(1) + O(x^5),$$
and
$$\psi^{iv}(1) = \begin{vmatrix} \theta_2 & \theta_3 & \theta_4 \\ \lambda_2u_2 & \lambda_3u_3 & \lambda_4u_4 \\ \lambda_2\theta_2 & \lambda_3\theta_3 & \lambda_4\theta_4 \end{vmatrix} > 0.$$
We now generalise this analysis to provide a formal proof of the theorem.

Proof. We use the Corollary to Theorem 6.8.2 to prove the theorem by induction on the order of the minors. All minors of order 1 are positive; assume that all minors of order $p$ are positive; we will prove that all minors of order $p+1$ involving consecutive rows and columns are positive. Because of the repetitive nature of the rows of $P$, it is sufficient to consider just two types of minors:
those beginning with $u_m, u_{m+1}, \ldots, u_n$, and those beginning with $\theta_m, \theta_{m+1}, \ldots, \theta_n$. As we showed in the example, both may be treated in a similar way; we consider just the first. Consider
$$\phi(x) = \begin{vmatrix} \phi_m(x) & \phi_{m+1}(x) & \cdots & \phi_n(x) \\ u_m & u_{m+1} & \cdots & u_n \\ \theta_m & \theta_{m+1} & \cdots & \theta_n \\ \vdots & \vdots & \cdots & \vdots \end{vmatrix} = \sum_{i=m}^n c_i\phi_i(x).$$
Take $n = m+p$. If $p$ is even, i.e., $p = 2q$, then the last row is $\lambda_m^{q-1}\theta_m, \ldots, \lambda_n^{q-1}\theta_n$; if $p$ is odd, i.e., $p = 2q-1$, then the last row is $\lambda_m^{q-1}u_m, \ldots, \lambda_n^{q-1}u_n$. Theorem 13.3.1 states that $\phi(x)$ has at least $(m-1)$ nodes in $(0,1)$, and at most $(n-1)$ zeros in $(0,1]$. Suppose $p$ is even; then $\phi(x)$ has a zero of multiplicity $2p$ at 1. Suppose $\phi(x)$ had $j$ zeros in $(0,1)$, where $j \ge m-1$. Thus $\phi(x)$ has zeros $0, a_1, a_2, \ldots, a_j, 1$; $\phi'(x)$ has zeros $0, b_1, \ldots, b_{j+1}, 1$; $M(x)$ has zeros $c_1, c_2, \ldots, c_{j+2}, 1$. Now introduce the notation
$$M_{,1} := M', \quad M_{,2} := a^{-1}(x)M'', \quad M_{,3} := (a^{-1}M'')', \quad M_{,4} := r(a^{-1}M'')'';$$
then equation (13.3.3) states that $M_{i,4} = \lambda_iM_i$. Now extend this notation: if $k = 4s + t$ then
$$M_{,k} := (M_{,4s})_{,t},$$
so that $M_{i,k} = (M_{i,4s})_{,t} = \lambda_i^sM_{i,t}$. Clearly we may deduce from Theorem 13.2.3 that there is a zero of $M_{,k+1}$ between any two zeros of $M_{,k}$. Now we can extend our study of zeros.

$M(x)$ has zeros $c_1, c_2, \ldots, c_{j+2}, 1$,

$M_{,1}(x)$ has zeros $d_1, d_2, \ldots, d_{j+2}, 1$,

$M_{,2}(x)$ has zeros $0, e_2, \ldots, e_{j+3}, 1$,

$M_{,3}(x)$ has zeros $0, f_2, \ldots, f_{j+4}, 1$,

$M_{,4}(x)$ has zeros $g_1, \ldots, g_{j+4}, 1$, etc.

In each 4-cycle, two zeros appear at $x = 0$, for $M_{,4s+2}$ and $M_{,4s+3}$. Thus $M_{,4q-4}$ has $j + 2q$ zeros in $[0,1)$. But $M_{i,4q} = \lambda_i^qM_i$, so that
$$M_{,4q-4} = \sum_{i=m}^n c_i\lambda_i^{q-1}M_i$$
and hence, by part 4) of Theorem 13.3.1, $M_{,4q-4}$ can have at most $n-1 = m+2q-1$ zeros in $[0,1)$. Therefore, $j + 2q \le m + 2q - 1$, and $j \le m-1$ so that $j = m-1$: $\phi(x)$ has just $m-1$ zeros, all nodes, in $(0,1)$.

We continue the argument as in the example. Assume that all minors of $P$ of order $p$ are positive. Since $\phi_i(1) > 0$, we have $(-)^{i-1}\phi_i(0+) > 0$ and by expanding $\phi(x)$ along its first row we find $(-)^{m-1}\phi(0+) > 0$, and hence, since $\phi(x)$ has just $m-1$ changes of sign in $(0,1)$, $\phi(x) > 0$ immediately to the left of $x = 1$. We now expand $\phi(1-x)$ for small $x$:
$$\phi(1-x) = \frac{x^{2p}}{(2p)!}\phi^{(2p)}(1) + O(x^{2p+1}),$$
so that
$$\phi^{(2p)}(1) = \begin{vmatrix} u_m & u_{m+1} & \cdots & u_n \\ \theta_m & \theta_{m+1} & \cdots & \theta_n \\ \cdot & \cdot & \cdots & \cdot \\ \lambda_m^qu_m & \lambda_{m+1}^qu_{m+1} & \cdots & \lambda_n^qu_n \end{vmatrix} > 0.$$
This is a minor of order $(p+1)$ in $P$. Since all the other cases may be analysed in a similar way, we deduce that all the minors of $P$ involving $(p+1)$ consecutive rows and columns are positive; the Corollary to Theorem 6.8.1 states that $P$ is TP.

Exercises 13.4

1. Establish the generalisation of the argument used with $\psi(x)$ in equation (13.4.1).
13.5 The spectra of the beam
Suppose that the beam of equation (13.1.1), specified by length $L$, cross-section $A(x)$, and second moment of area $I(x)$, is transformed into one of length $L^*$, cross-section $A^*(x^*)$, second moment $I^*(x^*)$, where
$$x^* = \alpha x, \quad I^*(x^*) = \beta I(x), \quad A^*(x^*) = \gamma A(x), \quad L^* = \alpha L; \tag{13.5.1}$$
then the spectra of the new beam under any combination of the end conditions (13.1.5)-(13.1.8) will be the same as those of the original beam provided that
$$\alpha^4 = \beta/\gamma. \tag{13.5.2}$$
With this relationship, equation (13.5.1) defines a two-parameter family of isospectral beams.

Now consider a beam, clamped at $x = 0$, and acted on by a concentrated static force $F$ and bending moment $M$ at its free end $x = L$. The deflection is given by
$$(I(x)u''(x))'' = 0 \tag{13.5.3}$$
subject to
$$u(0) = 0 = u'(0), \quad (I(x)u''(x))'_{x=L} = -F, \quad I(L)u''(L) = M, \tag{13.5.4}$$
so that
$$u(x) = F\int_0^x \frac{(x-s)(L-s)}{I(s)}\,ds + M\int_0^x \frac{(x-s)}{I(s)}\,ds,$$
and the end displacement and slope are given by
$$u(L) = G_2F + G_1M, \qquad u'(L) = G_1F + G_0M, \tag{13.5.5}$$
where the receptances $G_i$ are given by
$$G_i = \int_0^L \frac{(L-s)^i\,ds}{I(s)}, \quad i = 0, 1, 2.$$
For the transformed beam the receptances will be
$$G_i^* = \frac{\alpha^{i+1}}{\beta}G_i, \quad i = 0, 1, 2.$$
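The static receptances and their behaviour under the scaling (13.5.1) are easy to check numerically. The sketch below is illustrative: it uses composite Simpson quadrature, an arbitrary second moment of area $I(x) = 1 + x$, and scale factors called `alpha` and `beta` here (assumed names for the factors multiplying $x$ and $I$). It confirms that each receptance of the transformed beam equals the original one multiplied by (length factor)$^{i+1}$ divided by the factor applied to $I$.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def receptances(I, L):
    """G_i = integral_0^L (L - s)**i / I(s) ds for i = 0, 1, 2."""
    return [simpson(lambda s, i=i: (L - s) ** i / I(s), 0.0, L) for i in range(3)]


def I(x):
    return 1.0 + x  # illustrative second moment of area (assumed)


L = 1.0
alpha, beta = 2.0, 3.0  # assumed scale factors for x and I

G = receptances(I, L)
G_star = receptances(lambda xs: beta * I(xs / alpha), alpha * L)
ratios = [G_star[i] / G[i] for i in range(3)]
expected = [alpha ** (i + 1) / beta for i in range(3)]
uniform = receptances(lambda x: 1.0, 1.0)  # uniform beam: 1, 1/2, 1/3
print(ratios, expected, uniform)
```

Because the ratios depend on $i$, matching the length and any one receptance, or any two receptances, fixes both scale factors at 1.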
We conclude that equation (13.5.2) and any two of the four equations
$$L^* = L, \qquad G_i^* = G_i, \quad i = 0, 1, 2 \tag{13.5.6}$$
demand that $\alpha = 1 = \beta = \gamma$, so that the beams are identical.

We now use the results of Section 13.4 to order the eigenvalues for a beam clamped at $x = 0$ and subject to different end conditions at $x = 1$. (We shall work with the dimensionless equation (13.1.4) and use the numbering 1, 2, 3, ...) Consider the variational problem of finding the stationary values of the functional
$$J(u) = \frac{1}{2}\int_0^1 r(x)(u''(x))^2\,dx - \frac{\lambda}{2}\int_0^1 a(x)u^2(x)\,dx - Fu(1) - Mu'(1). \tag{13.5.7}$$
=
Z
0
1
{(ux00 )00 dx}xg{ + [u(1)x00 (1) P ]x0 (1)
[(u({)x00 ({))0{=1 + I ]x(1)
(13.5.8)
so that the displacement that makes M stationary satisifes equation (13.1.4) and the end conditions u(1)x00 (1) = P>
(u({)x00 ({))0{=1 = I
(13.5.9)
i.e., it is the displacement of the cantilever due to the concentrated static force I and moment P applied at { = 1. We now use the eigenfunctions (!l ({))4 1 of the cantilever to find an alternative expression for this displacement. The eigenfunctions of the cantilever are complete in O2 (0> 1). Write x({) =
4 X
fl !l ({)
l=1
and use Ex. 13.1.3 to give M(x) =
4 4 4 1X 2 X 2 X l fl fl fl {I xl + P l } 2 l=1 2 l=1 l=1
388
Chapter 13
where xl = !l (1), l = !0l (1) and the eigenfunctions have been normalised so that Z 1
0
d({)!2l ({)g{ = 1=
M(x) will be stationary if (l )fl = I xl + Pl i.e., x({) =
4 X (I xl + Pl ) l=1
l
!l ({)=
This yields the end receptances {1 > {10 > {0 1 > {0 10 of the beam with the properties x({) = {1 I + {10 P> x0 ({) = {0 1 I + {0 10 P; 4 4 X X xl !l ({) l !l ({) > {10 = > {1 = l l l=1 l=1 {0 1
=
4 X xl !0 ({) l
l=1
l
>
{0 10 =
4 X l !0 ({) l
l=1
l
=
(13.5.10) (13.5.11)
We now use these expressions to obtain equations for the eigenvalues of the beam corresponding to various conditions at { = 1. The eigenvalues of the clamped-pinned beam are the values of $2 for which the application of an end force I alone (i.e., P = 0) produces no end displacement, i.e., x(0) = 0. They are thus the roots of the equation 11 = 0, i.e., 4 X x2l = 0= (13.5.12) l=1 l
We will denote them by (l )4 1 . Since xl A 0, they satisfy l ? l ? l+1 >
l = 1> 2> = = =
Similarly, the eigenvalues (l )4 1 of the clamped-sliding beam are the values of $ 2 for which P alone (i.e., I = 0) produces no end slope, i.e., x0 (1) = 0. They are the roots of 10 10 = 0, i.e., 4 X l=1
2l =0 l
and since l A 0, they satisfy l ? l ? l+1 >
l = 1> 2> = = =
(13.5.13)
The anti-resonant eigenvalues ( l )4 1 are those at which I alone produces no slope, or equivalently P alone produces no displacement; they are the roots of 110 = 0, i.e., 4 X xl l = 0= (13.5.14) l=1 l
Since xl l A 0 (Theorem 13.3.2), they satisfy l ? l ? l+1 >
l = 1> 2> = = =
We can order l > l > l by using Theorem 8.4.2 and the total positivity of the matrix S in Theorem 13.4.1: xm l xl m A 0 for l A m gives the ordering l ? l ? l =
(13.5.15)
Since the clamped end condition may be obtained by adding another constraint x0 (1) = 0 to the pinned condition, and alternatively by adding the constraint x(1) = 0 to the sliding condition, the clamped-clamped eigenvalues ( l )4 1 will satisfy l ? l ? l+1 > l ? l ? l+1 = Putting all these inequalities together we find 1 ? 1 ? 1 ? 1 ? (2 > 1 ) ? 2 ? 2 ? 2 ? (3 > 2 ) = = =
(13.5.16)
As with the discrete beam (see Ex. 8.5.3) the relative position of l and l+1 is indeterminate. Tables 7.2(b), (c) of Bishop and Johnson (1960) [34] show that for the uniform beam 1 A 2 > 2 ? 3 > 3 A 4 , etc. and that thereafter l and l+1 are vertically identical. In order to find the asymptotic forms of the eigenvalues corresponding to dierent end conditions we use the WKB approach, see for example Carrier, Krook and Pearson (1966) [49], p. 291. First we make a change of independent variable: ¶1 Z {µ d(w) 4 gw> v= u(w) 0 and write ¶1 µ 1 d({) 4 > f2 (v) = (u3 ({)d({)) 4 > e(v) = u({) then µ ¶1 g gv d({) 4 g g g = = = e(v) > g{ gv g{ u({) gv gv and µ ¶ µ ¶ g2 g g g g u({) 2 = u({)e(v) e(v) = f2 (v) e(v) = g{ gv gv gv gv Thus equation (13.3.1) becomes µ µ µ ¶¶¶ g g gx 2 g e e f e = e3 f2 x> gv gv gv gv
since $a = b^3c^2$. Thus putting $' = d/ds$ we find
$$(b(c^2(bu')')')' = \lambda b^2c^2u, \quad 0 \le s \le L, \tag{13.5.17}$$
where
$$L = \int_0^1 \left(\frac{a(t)}{r(t)}\right)^{\frac{1}{4}}dt. \tag{13.5.18}$$
The new end conditions are
$$u(0) = 0 = u'(0), \tag{13.5.19}$$
$$(bu')'(L) = 0 = (c^2(bu')')'(L). \tag{13.5.20}$$
Now suppose that $\lambda$ is large positive, put $\lambda = z^4$ and expand the left hand side of equation (13.5.17) to give
$$p^2(s)u^{iv}(s) + 4p(s)p'(s)u'''(s) + f_1(s)u''(s) + f_2(s)u'(s) = z^4p^2(s)u(s), \tag{13.5.21}$$
where
$$p = bc, \quad f_1 = (p^2)'' + bc^2b'' + 2bcb'c', \quad f_2 = (b(c^2b')')'.$$
For large $z$ it will be the first two terms of (13.5.21) that will be dominant. We look for a solution having the form
$$U(s) = \exp\left(\int (z\psi_1(s) + \psi_2(s))\,ds\right).$$
After substituting this into (13.5.21) and retaining only the terms involving $z^4$ and $z^3$ we find
$$\psi_1^4 = 1, \qquad p^2(6\psi_1^2\psi_1' + 4\psi_1^3\psi_2) + 4pp'\psi_1^3 = 0,$$
so that $\psi_1 = \pm 1, \pm i$ and $p\psi_2 + p' = 0$, i.e., $\exp\left(\int\psi_2(s)\,ds\right) = p^{-1}(s)$. There are thus four solutions corresponding to the four values of $\psi_1$, and we may write
$$u(s) = p^{-1}(s)\{A\cos zs + B\sin zs + C\cosh zs + D\sinh zs\}. \tag{13.5.22}$$
Apart from the factor $p^{-1}(s)$, this has exactly the same form as that for a uniform beam, so that for large $z$ the eigenvalue equation will be the same as for a uniform beam of equivalent length $L$. Thus, for the cantilever the four end conditions (13.5.19), (13.5.20) will yield the eigenvalue equation (Bishop and Johnson (1960) [34], p. 382)
$$\cos zL\cosh zL + 1 = 0, \tag{13.5.23}$$
so that
$$\cos zL = -\operatorname{sech} zL \simeq -2\exp(-zL),$$
and
$$z_iL \simeq (2i-1)\frac{\pi}{2}, \quad i = 1, 2, \ldots,$$
' ' ' '
(4l + 1)4 4 @(256O4 )> (l 1)4 4 @O4 > (4l 1)4 4 @(256O4 )> (2l + 1)4 4 @(16O4 )=
We note that these do obey the interlacing conditions (13.5.15), and that l ' l+1 . Note also that, taking account of the change of notation, the values of l > l > l agree with those given by Barcilon (1982) [21]; his values of 2u > $ 2u (our l > l ) are incorrect.
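The quality of the asymptotic estimate for the cantilever can be seen by solving the eigenvalue equation $\cos z\cosh z + 1 = 0$ directly (taking $L = 1$) and comparing the roots with $(2i-1)\pi/2$. The sketch below uses plain bisection on the brackets $[(i-1)\pi, i\pi]$, on which the function changes sign; the helper names are illustrative.

```python
import math


def frequency_function(z):
    """Left hand side of the cantilever eigenvalue equation with L = 1."""
    return math.cos(z) * math.cosh(z) + 1.0


def bisect(f, a, b, iterations=80):
    """Bisection for a root of f in [a, b], assuming f(a) and f(b) differ in sign."""
    fa = f(a)
    for _ in range(iterations):
        m = 0.5 * (a + b)
        fm = f(m)
        if (fa > 0) == (fm > 0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)


roots = [bisect(frequency_function, (i - 1) * math.pi, i * math.pi)
         for i in range(1, 6)]
asymptotic = [(2 * i - 1) * math.pi / 2 for i in range(1, 6)]
gaps = [abs(r - a) for r, a in zip(roots, asymptotic)]
for i, (r, a) in enumerate(zip(roots, asymptotic), start=1):
    print(i, round(r, 6), round(a, 6))
```

The gap between the exact and asymptotic values decays roughly like $2\exp(-z_i)$, so the estimate is already accurate to three decimal places by the third root.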
Exercises 13.5 1. Verify the statement (13.5.2). 2. Carry out the integration from equations (13.5.3), (13.5.4) to (13.5.5). 3. Derive the expression for M in equation (13.5.7) by replacing x by x + x, in (13.5.6), neglecting the second order terms and integrating by parts twice. 4. Show that the asymptotic form for the eigenvalue equation for the clampedclamped beam is cos }O cosh }O1 = 0, and use this equation and (13.5.23) to show that for large l> l is alternately greater and less than l+1 .
13.6 Statement of the inverse problem
Inverse problems for the vibrating Euler-Bernoulli beam seem to have been studied first by Niordson (1967) [250]. He was not concerned with the reconstruction of a unique beam from sufficient data, in the sense to be described below. Rather, he was concerned with constructing a beam in a class having $n$ arbitrary parameters so that it would have $n$ specified eigenvalues which would be perturbations on the eigenvalues of the uniform cantilever beam. The proper study of the inverse problem for the vibrating Euler-Bernoulli beam began with the work of Barcilon. He realised that there are three questions to be answered. First, what spectral (and other) data are required to determine the properties (cross-sectional area $A(x)$, second moment of area $I(x)$) of the beam? In Barcilon (1974b) [15], Barcilon (1974c) [16] he showed that three spectra, corresponding to three different end conditions, are required. Secondly, what are the necessary and sufficient conditions on the data to ensure that the
beam properties will be realistic, i.e., $A(x) > 0$, $I(x) > 0$? Barcilon battled with this question in Barcilon (1982) [21], but it was not fully answered until Gladwell (1986d) [110]. Thirdly, how can the beam be reconstructed? Barcilon (1976) [18] answered this question for the case in which the spectra were small perturbations on those for the uniform beam, but a proper reconstruction procedure was not available until McLaughlin (1984b) [227].

As a result of the analysis described in Section 13.5 we may state that there is only a two-parameter family of beams which have three given spectra $\{\lambda_i, \mu_i, \nu_i\}_1^\infty$ (or either of the two alternative triples of spectra considered there). The particular member in the family may be found as in (13.5.1). The spectra $\{\lambda_i, \mu_i, \nu_i\}_1^\infty$ will have to satisfy certain conditions, amongst which will be some asymptotic ones. The argument of Section 13.5 shows that to be given $\{\lambda_i, \mu_i, \nu_i\}_1^\infty$ and some appropriate asymptotic conditions is equivalent to being given $\{\lambda_i, u_i, \theta_i\}_1^\infty$ and some other asymptotic conditions. We can, and shall, circumvent the asymptotic conditions with the way in which the problem is posed in practice: only $(\lambda_i, u_i, \theta_i)_1^n$ are given, while the remainder are chosen so that
$$(\lambda_i, u_i, \theta_i)_{n+1}^\infty = (\lambda_i^o, u_i^o, \theta_i^o)_{n+1}^\infty \tag{13.6.1}$$
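where the quantities with superscript $o$ relate to a known beam which, without loss of generality, may be taken to be a uniform beam. In this case, equation (13.5.12), for example, may be written
$$\sum_{i=1}^{n}\frac{u_i^2}{\lambda_i-\lambda}-\sum_{i=1}^{n}\frac{(u_i^o)^2}{\lambda_i^o-\lambda}+\sum_{i=1}^{\infty}\frac{(u_i^o)^2}{\lambda_i^o-\lambda}=0. \tag{13.6.2}$$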
The infinite sum is an end receptance of the uniform beam and may be expressed in closed form; in fact (Bishop and Johnson (1960) [34]), if $u_i^o = 1$, then
$$\sum_{i=1}^{\infty}\frac{(u_i^o)^2}{\lambda_i^o-\lambda} = -\frac{\cos\phi\sinh\phi-\sin\phi\cosh\phi}{4\phi^{3}(\cos\phi\cosh\phi+1)}, \qquad \phi=\lambda^{1/4}.$$
Thus, the statement that (13.6.2) is satisfied by $\lambda = (\mu_i)_1^n$ yields $n$ simultaneous linear equations for $(u_i^2)_1^n$. The first set of necessary conditions is therefore as follows:
1) $\mu_i \neq \lambda_j$, $\mu_i \neq \lambda_j^o$ for all $i, j = 1, 2, \ldots$. These ensure that the matrix of coefficients in the equations for $(u_i^2)_1^n$ is non-singular, and the right hand sides are well-defined.
2) the solution $(u_i^2)_1^n$ must be positive. (See Ex. 13.6.1)
Similarly, if the $(\theta_i^2)_1^n$ are to be determined from equation (13.5.13) then we need
3) $\nu_i \neq \lambda_j$, $\nu_i \neq \lambda_j^o$ for all $i, j = 1, 2, \ldots$
4) the solution $(\theta_i^2)_1^n$ must be positive.
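The closed form for the end receptance can be checked numerically. The sketch below assumes a uniform cantilever of unit length with $a = r = 1$, so that $\lambda_i^o = \phi_i^4$ with $\cos\phi_i\cosh\phi_i + 1 = 0$, and unit end values $u_i^o = 1$; the function names are illustrative only. It compares the closed form $(\sin\phi\cosh\phi - \cos\phi\sinh\phi)/(4\phi^3(\cos\phi\cosh\phi + 1))$ with a truncated series.

```python
import math

def cantilever_phis(count):
    """Roots phi_i of cos(phi)*cosh(phi) + 1 = 0 (uniform clamped-free beam)."""
    def g(phi):
        # Same roots as cos*cosh + 1, divided by cosh to avoid overflow.
        return math.cos(phi) + 1.0 / math.cosh(phi)
    roots = []
    for i in range(1, count + 1):
        lo, hi = (i - 1) * math.pi, i * math.pi   # g changes sign on this interval
        for _ in range(100):                      # plain bisection
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots

def receptance_closed_form(lam):
    """Closed form of sum_i 1/(lam_i^o - lam) for lam > 0, with phi**4 = lam."""
    phi = lam ** 0.25
    num = math.sin(phi) * math.cosh(phi) - math.cos(phi) * math.sinh(phi)
    den = 4.0 * phi ** 3 * (math.cos(phi) * math.cosh(phi) + 1.0)
    return num / den

def receptance_series(lam, terms=100):
    """Truncated series, assuming unit end values u_i^o = 1."""
    return sum(1.0 / (phi ** 4 - lam) for phi in cantilever_phis(terms))

if __name__ == "__main__":
    lam = 5.0
    print(receptance_closed_form(lam), receptance_series(lam))
```

The two values should agree to within the truncation error of the series, which decays like the inverse cube of the number of terms retained.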
Provided that these conditions are satisfied, i.e., $(u_i, \theta_i)_1^n$ may be found, then we shall show that the positivity of the minors of $P$ of Theorem 13.4.1, which has been shown to be necessary, is also a sufficient condition for the construction of a unique realistic beam. The analysis shows that three properly chosen spectra are required to reconstruct a beam uniquely. Gottlieb (1987b) [140] made an exhaustive study of beams that have one or two spectra in common, and/or in common with a uniform beam, for various combinations of end conditions. His study thus highlights the need for three (properly chosen) spectra. See also Gottlieb (1988) [142].
Exercises 13.6
1. By retaining (13.6.2) in the form
$$\sum_{i=1}^{\infty}\frac{u_i^2}{\lambda_i-\lambda}=0$$
show that $u_i^2 > 0$ if the roots of (13.6.2) interlace the $\lambda_i$, i.e., $\lambda_i < \mu_i < \lambda_{i+1}$, $i = 1, 2, \ldots$.
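A finite analogue of Ex. 13.6.1 can be explored in exact arithmetic. The sketch below is an illustration under stated assumptions, not the infinite-dimensional result: it takes $n$ poles $\lambda_i$ and $n-1$ roots $\mu_k$, fixes the normalisation $\sum x_i = 1$ (chosen here for convenience), and uses the partial-fraction formula $x_i = \prod_k(\mu_k-\lambda_i)/\prod_{j\neq i}(\lambda_j-\lambda_i)$ for the weights of $\sum_i x_i/(\lambda_i - t)$.

```python
from fractions import Fraction

def weights_from_roots(lam, mu):
    """Weights x_i such that sum_i x_i/(lam_i - t) vanishes at every t in mu.

    lam has n distinct entries, mu has n - 1; the weights sum to 1.
    """
    n = len(lam)
    if len(mu) != n - 1:
        raise ValueError("need exactly n - 1 roots for n poles")
    x = []
    for i in range(n):
        num = Fraction(1)
        for m in mu:
            num *= Fraction(m) - Fraction(lam[i])
        den = Fraction(1)
        for j in range(n):
            if j != i:
                den *= Fraction(lam[j]) - Fraction(lam[i])
        x.append(num / den)
    return x

def residual(lam, x, t):
    """Value of sum_i x_i/(lam_i - t)."""
    return sum(xi / (Fraction(li) - Fraction(t)) for li, xi in zip(lam, x))

if __name__ == "__main__":
    lam, mu = [1, 4, 9], [2, 6]            # interlaced: 1 < 2 < 4 < 6 < 9
    print(weights_from_roots(lam, mu))      # all weights positive
    print(weights_from_roots(lam, [2, 3]))  # not interlaced: a negative weight
```

With interlacing roots every weight comes out positive; moving a root out of its interval makes at least one weight negative.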
13.7
The reconstruction procedure
The procedure is essentially the same as that described in Chapter 11 for the vibrating rod, and is due to McLaughlin (1976) [223], McLaughlin (1978) [224], McLaughlin (1981) [225], McLaughlin (1984a) [226], McLaughlin (1984b) [227]. Papanicalaou and Kravvaritis (1997) [262] consider the special case, $a(x)r(x) = 1$; in this case the problem can effectively be reduced to a second order problem. See also Gladwell (1991d) [119]. We use a transformation operator as described in Section 11.3. We suppose that we wish to construct a cantilever beam, i.e., functions $r(x)$ and $a(x)$, such that the equation
$$(r(x)u''(x))'' = \lambda a(x)u(x), \tag{13.7.1}$$
subject to the end conditions
$$u(0) = 0 = u'(0), \qquad u''(1) = 0 = u'''(1), \tag{13.7.2}$$
has specified eigenvalues $(\lambda_i)_1^\infty$, and has eigenfunctions $(\phi_i(x))_1^\infty$, normalised w.r.t. $a(x)$, i.e., such that
$$\int_0^1 a(x)\phi_i(x)\phi_j(x)\,dx = \delta_{ij}, \qquad i, j = 1, 2, \ldots$$
which have specified values of $(\phi_i(1), \phi_i'(1))_1^\infty$.
First make a change in the independent variable similar to that used in Section 13.5:
$$s = \int_x^1 \left(\frac{a(t)}{r(t)}\right)^{1/4} dt \tag{13.7.3}$$
and write
$$b(s) = \left(\frac{a(x)}{r(x)}\right)^{1/4}, \qquad c^2(s) = (r^3(x)a(x))^{1/4}, \tag{13.7.4}$$
$$p(s) = b(s)c(s), \tag{13.7.5}$$
then equation (13.7.1) becomes
$$(b(c^2(bu')')')' - \lambda b^2c^2u = 0, \tag{13.7.6}$$
where $' \equiv d/ds$ and
$$L = \int_0^1 \left(\frac{a(t)}{r(t)}\right)^{1/4} dt, \qquad 0 \le s \le L, \tag{13.7.7}$$
while the end conditions become
$$(bu')'(0) = 0 = (c^2(bu')')'(0) \tag{13.7.8}$$
$$u(L) = 0 = u'(L). \tag{13.7.9}$$
Without loss of generality, we assume that $b(0) = 1 = c(0)$. Just as with the Sturm-Liouville reconstruction, we introduce a base problem
$$(b_0(c_0^2(b_0v')')')' - \lambda b_0^2c_0^2v = 0 \tag{13.7.10}$$
$$(b_0v')'(0) = 0 = (c_0^2(b_0v')')'(0) \tag{13.7.11}$$
$$v(L) = 0 = v'(L) \tag{13.7.12}$$
where $b_0(s)$, $c_0(s)$ are known (e.g., $b_0(s) = 1 = c_0(s)$), $b_0(0) = 1 = c_0(0)$, and $p_0(s) = b_0(s)c_0(s)$. This base problem has a certain set of eigenvalues $(\lambda_i^0)_1^\infty$ and its eigenfunctions $\phi_i^0(s)$, normalised so that
$$\int_0^L p_0^2(s)\phi_i^0(s)\phi_j^0(s)\,ds = \delta_{ij}, \qquad i, j = 1, 2, \ldots \tag{13.7.13}$$
will have end values $(\phi_i^0(0), \phi_i^{0\prime}(0))_1^\infty$. For given values $\lambda$, $\alpha$, $\beta$ we may define a unique function
$$v(s; \lambda, \alpha, \beta) \equiv v(s) \tag{13.7.14}$$
which is the solution of equation (13.7.10) satisfying
$$v(0) = \alpha, \qquad v'(0) = \beta, \qquad (b_0v')'(0) = 0 = (c_0^2(b_0v')')'(0). \tag{13.7.15}$$
Clearly
$$v(s; \lambda_i^0, \phi_i^0(0), \phi_i^{0\prime}(0)) = \phi_i^0(s). \tag{13.7.16}$$
The eigenfunctions $\{\phi_i^0(s)\}_1^\infty$ are orthogonal with weight function $p_0^2(s)$, as shown by equation (13.7.13). The eigenfunctions $\{\phi_i(s)\}_1^\infty$ of equations (13.7.6)-(13.7.9) are to be orthogonal w.r.t. $p^2(s)$, i.e.,
$$\int_0^L p^2(s)\phi_i(s)\phi_j(s)\,ds = \delta_{ij}, \qquad i, j = 1, 2, \ldots \tag{13.7.17}$$
Therefore, following (11.3.5) we construct (13.7.6) so that the solution (13.7.16) of equation (13.7.10) is transformed into a solution of equation (13.7.6) satisfying
$$u(0) = \alpha, \qquad u'(0) = \beta, \qquad (bu')'(0) = 0 = (c^2(bu')')'(0) \tag{13.7.18}$$
by means of the equation
$$p(s)u(s) = p_0(s)v(s) + \int_0^s K(s,t)p_0^2(t)v(t)\,dt. \tag{13.7.19}$$
The eigenfunctions of equations (13.7.6)-(13.7.9) will be $\phi_i(s) = u(s; \lambda_i, \phi_i(s)|_{s=0}, \phi_i'(s)|_{s=0})$ and we note that
$$\phi_i(s)|_{s=0} = \phi_i(x)|_{x=1}, \qquad \left.\frac{d\phi_i(s)}{ds}\right|_{s=0} = -\left.\frac{d\phi_i(x)}{dx}\right|_{x=1}.$$
If the eigenvalues $\lambda_i$ and end values $\phi_i(0)$, $\phi_i'(0)$ (with variable $s$) are chosen so that
$$\lambda_i = \lambda_i^0, \qquad \phi_i(0) = \phi_i^0(0), \qquad \phi_i'(0) = \phi_i^{0\prime}(0), \qquad i = n+1, \ldots$$
then the system $\{\phi_i(s)\}_1^\infty$ will form a complete orthogonal set with weight $p^2(s)$ iff $K(r,s)$ satisfies the analogue of equation (11.5.20), i.e.,
$$K(r,s) + \int_0^r p_0^2(t)K(r,t)F(t,s)\,dt + p_0(r)F(r,s) = 0, \qquad 0 \le s \le r, \tag{13.7.20}$$
where
$$F(r,s) = \sum_{i=1}^{n}\{v_i(r)v_i(s) - v_i^0(r)v_i^0(s)\} \tag{13.7.21}$$
and
$$v_i(s) = v(s; \lambda_i, \phi_i(0), \phi_i'(0)), \qquad v_i^0(s) = v(s; \lambda_i^0, \phi_i^0(0), \phi_i^{0\prime}(0)) \equiv \phi_i^0(s).$$
We note that
$$u(s; 0, 1, 0) = 1, \qquad v(s; 0, 1, 0) = 1$$
so that equation (13.7.19) gives
$$p(s) = p_0(s) + \int_0^s K(s,t)p_0^2(t)\,dt. \tag{13.7.22}$$
On the other hand, if
$$q(s) = \int_0^s \frac{dt}{b(t)}, \qquad q_0(s) = \int_0^s \frac{dt}{b_0(t)} \tag{13.7.23}$$
then
$$u(s; 0, 0, 1) = q(s), \qquad v(s; 0, 0, 1) = q_0(s)$$
so that
$$p(s)q(s) = p_0(s)q_0(s) + \int_0^s K(s,t)p_0^2(t)q_0(t)\,dt. \tag{13.7.24}$$
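Once $p(s)$ and $q(s)$ are known, the definitions (13.7.3)-(13.7.5) and (13.7.23) can be inverted pointwise. The relations used in the sketch below are worked out here from those definitions rather than quoted from the text: $b = 1/q'$, $c = p/b$, $x = 1 - q(s)$, $a = c^2b^3$ and $r = c^2/b$. The function name is hypothetical, and $q'$ is estimated by finite differences on the supplied grid.

```python
def beam_from_p_q(s, p, q):
    """Recover samples (x, a, r) from p(s) and q(s) on an increasing s-grid.

    Uses b = 1/q', c = p/b, x = 1 - q, a = c^2 b^3, r = c^2 / b.
    q' is a central difference, one-sided at the two end points.
    """
    n = len(s)
    out = []
    for i in range(n):
        j0, j1 = max(i - 1, 0), min(i + 1, n - 1)
        dq = (q[j1] - q[j0]) / (s[j1] - s[j0])
        if dq <= 0.0 or p[i] <= 0.0:
            raise ValueError("p must be positive and q increasing")
        b = 1.0 / dq
        c = p[i] / b
        out.append((1.0 - q[i], c * c * b ** 3, c * c / b))
    return out

if __name__ == "__main__":
    s = [i / 100 for i in range(101)]
    p = [1.0 - 0.5 * si for si in s]   # corresponds to a = r = ((1 + x)/2)^2
    q = list(s)                        # b = 1, so q(s) = s
    for x, a, r in beam_from_p_q(s, p, q)[::25]:
        print(x, a, r)
```

The positivity of $p$ and the monotonicity of $q$ are exactly what is needed for these formulas to return a realistic beam, which is why the sketch refuses data that violate them.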
The reconstruction procedure is thus as follows:
• solve equation (13.7.20) for $K(s,t)$
• find $p(s)$, $q(s)$ from equations (13.7.22), (13.7.24)
• find $b(s)$, $c(s)$ from equations (13.7.4), (13.7.5)
• find $x$, $a(x)$, $r(x)$ from equations (13.7.3), (13.7.4).
To justify this procedure we need to verify that when $p(s)$, $q(s)$ are given by (13.7.22), (13.7.24) then
1) $p(s)$, $q(s)$ are well-defined and positive, and $q(s)$ is an increasing function.
2) $u(s)$ satisfies the end conditions at $s = 0$.
3) $u(s)$ satisfies the differential equation (13.7.6).
4) $u_i(s)$ satisfies the end conditions at $s = L$.
We shall consider these points in the order 2, 3, 4, 1. Equation (13.7.22) yields $p(0) = p_0(0) = 1$, while equation (13.7.19) yields $p(0)u(0) = u(0) = p_0(0)v(0) = v(0) = \alpha$. On differentiating equation (13.7.22) we obtain
$$p'(0) = p_0'(0) + K(0,0),$$
while on differentiating equation (13.7.19) we find
$$p'(0)u(0) + p(0)u'(0) = p_0'(0)v(0) + p_0(0)v'(0) + K(0,0)p_0^2(0)v(0),$$
which yields $u'(0) = v'(0) = \beta$. By continuing this differentiation we may establish the remainder of 2) and 3). As in Section 11.5, the solution of equation (13.7.20) has the form
$$K(r,s) = \sum_{i=1}^{n}\{F_i(r)v_i(s) - G_i(r)v_i^0(s)\} \tag{13.7.25}$$
where $F_i(r)$, $G_i(r)$ satisfy
$$p_0(r)v_i(r) + F_i(r) + \sum_{j=1}^{n}\{b_{ij}F_j(r) - c_{ij}G_j(r)\} = 0 \tag{13.7.26}$$
$$p_0(r)v_i^0(r) + G_i(r) + \sum_{j=1}^{n}\{c_{ji}F_j(r) - d_{ij}G_j(r)\} = 0 \tag{13.7.27}$$
where
$$b_{ij}(r) = \int_0^r p_0^2(t)v_i(t)v_j(t)\,dt, \qquad c_{ij}(r) = \int_0^r p_0^2(t)v_i(t)v_j^0(t)\,dt, \qquad d_{ij}(r) = \int_0^r p_0^2(t)v_i^0(t)v_j^0(t)\,dt.$$
In considering point 4) we need to discuss two cases, $i \le n$ and $i > n$. For the first we note that on substituting (13.7.25) into (13.7.19) and using (13.7.26) we may deduce
$$p(s)u_i(s) = -F_i(s), \qquad i = 1, 2, \ldots, n. \tag{13.7.28}$$
But equation (13.7.27) with $r = L$ and the orthogonality conditions $d_{ij}(L) = \delta_{ij}$ (the $v_i^0(t)$ are normalised eigenfunctions of the base problem) yield
$$\sum_{j=1}^{n} c_{ji}(L)F_j(L) = 0 \tag{13.7.29}$$
since $v_i^0(L) = 0$. Thus if $\mathbf{C} = (c_{ij}(L))$ is non-singular, and $p(L) \neq 0$, then
$$F_j(L) = 0 = u_j(L).$$
On differentiating (13.7.27) we find, under the same proviso, that
$$F_j'(L) = 0 = u_j'(L), \qquad j = 1, 2, \ldots, n.$$
We shall return to the proviso, $p(L) \neq 0$, later. When $i > n$, then $v_i(L) = v_i^0(L) = \phi_i^0(L) = 0$, so that equation (13.7.19) yields
$$p(L)u_i(L) = \int_0^L K(L,t)p_0^2(t)v_i^0(t)\,dt,$$
so that on substituting for $K(L,t)$ from equation (13.7.25) we find
$$p(L)u_i(L) = \sum_{j=1}^{n}\left\{F_j(L)\int_0^L p_0^2(t)v_j(t)v_i^0(t)\,dt - G_j(L)\int_0^L p_0^2(t)v_j^0(t)v_i^0(t)\,dt\right\}.$$
But since $i > n$, $v_i^0(t)$ is orthogonal to all of $\{v_j^0(t)\}_1^n$. Therefore, again, if $F_j(L) = 0$, $j = 1, 2, \ldots, n$ and $p(L) \neq 0$, then $u_i(L) = 0$. The satisfaction of $u_i'(L) = 0$ may be verified similarly (Ex. 13.7.3).
We now discuss point 1). The first step is the determination of $F_i(s)$, $G_i(s)$ from equations (13.7.26), (13.7.27). The argument used in Section 11.5 shows that the matrix of coefficients in these equations is non-singular unless $r = L$. Thus there remains only the case $r = L$. Then the matrix of coefficients in (13.7.26), (13.7.27) takes the form
$$\mathbf{A} = \begin{bmatrix} \mathbf{I} + \mathbf{B} & -\mathbf{C} \\ \mathbf{C}^T & \mathbf{0} \end{bmatrix}$$
so that $\det\mathbf{A} = (\det\mathbf{C})^2$. Thus $\det\mathbf{C} \neq 0$ is a necessary and sufficient condition for the $F_i(r)$, $G_i(r)$, and hence $p(s)$, to be well-defined.

We now enquire as to when and whether $p(s) > 0$. Suppose $p(s) = 0$ for some $s \in [0, L]$, then (13.7.28) and the similar equation
$$p(s)u_i^0(s) = p(s)u(s; \lambda_i^0, \phi_i^0(0), \phi_i^{0\prime}(0)) = -G_i(s), \qquad i = 1, 2, \ldots, n$$
show that $F_i(s) = 0 = G_i(s)$, $i = 1, 2, \ldots, n$ and hence, on account of (13.7.26), (13.7.27), $p_0(s)v_i(s) = 0 = p_0(s)v_i^0(s)$, $i = 1, 2, \ldots, n$. But $p_0(s)$, corresponding to an actual beam, is always positive, and the only common zero of the $v_i^0(s)$ is $s = L$. Thus the only possible zero of $p(s)$ is $s = L$. At $s = L$, equation (13.7.27) reduces to
$$\sum_{j=1}^{n} c_{ji}(L)F_j(L) = 0$$
so that if $\det\mathbf{C}(L) \neq 0$ then $F_j(L) = 0$, $j = 1, 2, \ldots, n$. Then (13.7.26) reduces to
$$p_0(L)v_i(L) - \sum_{j=1}^{n} c_{ij}(L)G_j(L) = 0, \qquad i = 1, 2, \ldots, n. \tag{13.7.30}$$
Equations (13.7.22), (13.7.25) yield
$$p(L) = p_0(L) - \sum_{j=1}^{n} G_j(L)\int_0^L p_0^2(t)v_j^0(t)\,dt. \tag{13.7.31}$$
Put
$$\mathbf{C} = (c_{ij}(L)), \qquad \mathbf{g} = [G_1(L), \ldots, G_n(L)], \qquad \mathbf{v} = [v_1(L), \ldots, v_n(L)]$$
then (13.7.30) becomes $p_0(L)\mathbf{v} = \mathbf{C}\mathbf{g}$ so that on multiplying (13.7.31) by $\mathbf{v}$ we have
$$p(L)\mathbf{v} = p_0(L)\mathbf{v} - \mathbf{H}\mathbf{g} = (\mathbf{C} - \mathbf{H})\mathbf{g},$$
where
$$c_{ij} - h_{ij} = \int_0^L p_0^2(t)(v_i(t) - v_i(L))v_j^0(t)\,dt = e_{ij}.$$
Thus
$$p(L)\mathbf{v} = \mathbf{E}\mathbf{g}.$$
This means that if $\mathbf{E}$ is non-singular then $p(L) \neq 0$. If $\mathbf{C}$ and $\mathbf{E}$ are non-singular then all the stated operations may be carried out to obtain $p(s)$, $q(s)$ and hence $b(s)$, $c(s)$. Since $b(s)$, $c(s)$ are never zero, and $b(0)c(0) = p(0) = 1$, $b(s)$, $c(s)$ are always positive, and hence $a(x), r(x) > 0$. Lowe (1993) [217] considers a special case of equation (13.7.1) in which $r(x) = [a(x)]^2$, and uses a construction based on a Fourier series for $a(x)$.

Exercises 13.7
1. Verify the transformation of equations (13.7.1), (13.7.2) to (13.7.6)-(13.7.9). Show that the conditions (13.7.8) are equivalent to
$$q'(0)u''(0) - q''(0)u'(0) = 0 = q'(0)u'''(0) - q'''(0)u'(0).$$
2. Show that if $v(s)$ satisfies
$$q_0'(0)v''(0) - q_0''(0)v'(0) = 0 = q_0'(0)v'''(0) - q_0'''(0)v'(0)$$
then $u$ satisfies the conditions in Ex. 13.7.1.
3. We established $u_i(L) = 0 = u_i'(L)$ for $i \le n$; establish them for $i > n$.
13.8
The total positivity of matrix P is sufficient
In Section 13.4 we showed that the eigenvalues $(\lambda_i)_1^\infty$ and end values $u_i$, $\theta_i$ of a cantilever beam make the infinite matrix $P$ of Theorem 13.4.1 totally positive. In Section 13.7 we found some sufficient conditions for the reconstruction of an actual beam from such data. We now show that the total positivity of the matrix $P$ is not only necessary but sufficient. It was shown in Section 13.7 that the reconstruction will proceed provided that the matrices $\mathbf{C}$ and $\mathbf{E}$ are non-singular. Here
$$c_{ij} = \int_0^L p_0^2(s)v_i(s)v_j^0(s)\,ds, \qquad e_{ij} = \int_0^L p_0^2(s)[v_i(s) - v_i(L)]v_j^0(s)\,ds$$
and $i, j = 1, 2, \ldots, n$. Suppose, if possible, that $\mathbf{C}$ were singular. Its rows will be linearly dependent, i.e., there are multipliers $(\alpha_i)_1^n$, not all zero, such that
$$\sum_{i=1}^{n}\alpha_ic_{ij} = 0, \qquad j = 1, 2, \ldots, n.$$
Thus
$$\int_0^L p_0^2(s)\left\{\sum_{i=1}^{n}\alpha_iv_i(s)\right\}v_j^0(s)\,ds = 0, \qquad j = 1, 2, \ldots, n. \tag{13.8.1}$$
But since the $\{v_j^0(s)\}_1^\infty$, being the eigenfunctions of the base problem, form a complete orthogonal set with weight function $p_0^2(s)$, equation (13.8.1) means that the sum in the integrand is a linear combination of the remaining $\{v_j^0(s)\}_{n+1}^\infty$. Thus
$$f(s) := \sum_{i=1}^{n}\alpha_iv_i(s) + \sum_{i=n+1}^{\infty}\alpha_iv_i^0(s) = 0.$$
In particular, $f(s)$ and all its derivatives must be zero at $s = 0$, the free end. Consider the case $b_0(s) \equiv 1 \equiv c_0(s)$. Now
$$v_i(s)|_{s=0} = \phi_i(x)|_{x=1} = u_i, \qquad b(s)v_i'(s)|_{s=0} = v_i'(s)|_{s=0} = -\left.\frac{d\phi_i(x)}{dx}\right|_{x=1} = -\theta_i,$$
while equation (13.7.10) gives
$$v_i^{iv}(0) = \lambda_iv_i(0) = \lambda_iu_i, \qquad v_i^{v}(0) = -\lambda_i\theta_i.$$
Thus
$$v_i^{(m)}(0) = 0 = v_i^{0(m)}(0) \quad \text{for } m = 2, 3;\ 6, 7;\ \ldots,$$
so that $f^{(m)}(0)$ is identically zero for these values of $m$. The equations obtained by setting $f^{(m)}(0)$ to zero for the remaining values $0, 1;\ 4, 5;\ \ldots$ are therefore
$$\sum_{i=1}^{\infty}\lambda_i^ju_i\alpha_i = 0, \qquad \sum_{i=1}^{\infty}\lambda_i^j\theta_i\alpha_i = 0, \qquad j = 0, 1, 2, \ldots \tag{13.8.2}$$
and here we have used the fact that $u_i^0 = u_i$, $\theta_i^0 = \theta_i$ for $i = n+1, \ldots$ But the matrix of coefficients for equations (13.8.2) is just the matrix $P$ of Theorem 13.4.1, and every minor of $P$ is positive so that $P$ has infinite rank. Thus all the $\alpha_i$ are zero, contradicting the assumed singularity of $\mathbf{C}$. Thus $\mathbf{C}$ is non-singular. When $b_0(s)$, $c_0(s)$ are not identically unity, the rows of the matrix are linear combinations of the rows of $P$, so that the conclusion still follows.

A similar argument shows that if $\mathbf{E}$ is singular then there are multipliers $(\beta_i)_1^\infty$, not all zero, such that
$$g(s) := \sum_{i=1}^{n}\beta_i\{v_i(s) - v_i(L)\} + \sum_{i=n+1}^{\infty}\beta_iv_i^0(s) = 0,$$
i.e.,
$$g'(s) = \sum_{i=1}^{n}\beta_iv_i'(s) + \sum_{i=n+1}^{\infty}\beta_iv_i^{0\prime}(s) = 0.$$
When $b_0(s) = 1 = c_0(s)$, the matrix of coefficients for the equations obtained by setting $g^{(m)}(0)$ to zero for $m = 1;\ 4, 5;\ 8, 9;\ \ldots$ is just the matrix formed from rows $2, 3, \ldots$ of matrix $P$. We conclude that the $\beta_i$ are identically zero and that $\mathbf{E}$ is non-singular. We conclude that the total positivity of the matrix $P$ is a necessary and sufficient condition for the reconstruction of a realistic beam.

Note that this conclusion is subject to the condition (13.6.1), and that the total positivity of the matrix $P$ ensures only that $a(x)$, $r(x)$ will be positive; they may still vary wildly along the beam, and in this case the vibration of the beam will not be governed by the Euler-Bernoulli equation, which applies only to slender beams, i.e., ones for which $a(x)$, $r(x)$ do not differ much from the values of a uniform beam. If the beam is not uniformly slender, i.e., if the ratios
$$\frac{\max a(x)}{\min a(x)}, \qquad \frac{\max r(x)}{\min r(x)}, \qquad x \in [0, 1],$$
are not nearly unity, then the vibration of the beam is affected by thickness effects, and the simple Euler-Bernoulli model is inadequate. See Gladwell, England and Wang (1987) [112]. Note also that experimental studies of the natural frequencies of even a slender uniform beam show that the natural frequencies start to depart from the classical Euler-Bernoulli values after about the fourth or fifth frequency. This means that although the study of the inverse problem yields valuable insights into the behaviour of the Euler-Bernoulli beam, it must in many ways be considered as an academic exercise, and should be used only to find a beam in which the departures from the uniform beam are small and only a very few frequencies are to be changed, and that only by small amounts. For such problems, perturbation methods combined with least squares approaches form an alternative avenue; but such numerical methods are outside the purview of this book.
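The two variation ratios are simple to monitor on a reconstructed beam. The helper below is a sketch that assumes $a(x)$ and $r(x)$ have been sampled at points of $[0,1]$; it deliberately sets no threshold for "nearly unity", since the text gives none, and its name is illustrative.

```python
def variation_ratios(a_samples, r_samples):
    """Return (max a / min a, max r / min r) for positive sampled values."""
    if min(a_samples) <= 0.0 or min(r_samples) <= 0.0:
        raise ValueError("a(x) and r(x) must be positive for a realistic beam")
    return (max(a_samples) / min(a_samples),
            max(r_samples) / min(r_samples))

if __name__ == "__main__":
    # A mildly tapered beam sampled at five stations (illustrative numbers).
    a = [1.00, 1.02, 1.05, 1.03, 1.01]
    r = [1.00, 1.04, 1.10, 1.06, 1.02]
    print(variation_ratios(a, r))
```

Ratios close to one indicate a beam for which the Euler-Bernoulli model remains plausible; large ratios signal that thickness effects are likely to matter.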
Chapter 14
Continuous Modes and Nodes

If there were no obscurity, man would not be sensible of his corruption; if there were no light, man would not hope for a remedy. Pascal’s Pensées, 585
14.1
Introduction
Throughout most of the preceding chapters, the emphasis has been placed on eigenvalues; in this chapter, we turn our attention to eigenmodes and, in particular, to the nodes of eigenmodes. We will find that, in contrast to inverse eigenvalue problems, there are no easily stated inverse nodal problems. There are some uniqueness results pertaining to nodes, most of which are due to McLaughlin and Hald; there are also some approximate solutions of inverse nodal problems, again mostly due to McLaughlin and Hald; both of these topics are studied in Section 14.4. It should, however, be stated from the outset that it is impossible to do justice either to these uniqueness results or to the approximate solutions in the space available in this chapter; all we can do is to give an introduction to the published papers, discuss the methods used and some of the results obtained.

We take this opportunity to point out a fundamental difference between eigenvalues and nodes of a continuous system, and consequently between inverse eigenvalue and inverse nodal problems. Eigenvalues are global quantities; they are properties of the system as a whole. By contrast, a node, in particular the position of a node, is related to the properties of the system around that node; it is a local property.

We begin our discussion of modes and nodes by making reference to Sturm’s Theorems relating to the nodes of a second order equation. These theorems have wide applicability, are easily proved, and yield valuable insight into the properties of the solutions of Sturm-Liouville systems. Sturm’s original results appeared in 1836. The most complete account was given by Bôcher (1917) [37]. A detailed account also appears in Chapter X of Ince (1927) [185].
14.2
Sturm’s Theorems
In Section 10.1 we introduced three equations, (10.1.1), (10.1.3) and (10.1.11) that appear in vibration problems. These equations all contain the frequency parameter $\lambda$, and they must all be complemented by end conditions in order to yield a well-posed eigenvalue problem. Sturm’s Theorems may be formulated for a wider class of equations that includes (10.1.1), (10.1.3) and (10.1.11), and apply to the equation without regard for end conditions. Consider the equation
$$(Ay')' + By = 0, \tag{14.2.1}$$
and suppose that $A(x)$, $A'(x)$ and $B(x)$ are continuous and $A(x) > 0$ throughout an interval $[a, b]$. These conditions are unnecessarily restrictive; we could suppose, say, that they were piecewise continuous with a finite number of points of discontinuity, or even in a wider class. We leave such niceties to the interested reader.

For the starting point of our discussion, we note that if $y(x)$ is a continuous solution of (14.2.1) and $y(c) = 0 = y'(c)$ for some $c \in [a, b]$, then $y$ is identically zero. If $A$, $B$ have derivatives of all orders then $y(c) = 0 = y'(c)$ implies $y''(c) = 0 = y'''(c) = \cdots$, so that the Taylor expansion of $y(x)$ is identically zero. If only $A$, $A'$, $B$ are continuous then we may reach the same conclusion by converting (14.2.1) into an integral equation. Alternatively, we may approximate $A$, $B$ arbitrarily closely by $\tilde{A}(x)$, $\tilde{B}(x)$ that do have derivatives of all orders, and reach the same conclusion. From this result, we may deduce that every zero (node) of a solution of (14.2.1) is simple: if $y(c) = 0$, then $y'(c) \neq 0$, and $y$ crosses the axis at $x = c$. We may deduce also that no continuous solution of (14.2.1) can have an infinity of nodes in $[a, b]$. For if there were an infinity of nodes then, by the Bolzano-Weierstrass Theorem, they would have at least one limit point $c \in [a, b]$ and we can show (Ex. 14.2.1) that, at $c$, not only $y(c) = 0$ but $y'(c) = 0$: $y \equiv 0$.
Now suppose that $u$, $v$ are two solutions of (14.2.1), so that
$$(Au')' + Bu = 0 = (Av')' + Bv.$$
Multiplying the first by $v$, the second by $u$, subtracting and rearranging, we find
$$(A(vu' - uv'))' = 0,$$
so that
$$A(vu' - uv') = \text{constant} = C. \tag{14.2.2}$$
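The constancy of $A(vu' - uv')$ is easy to observe numerically. The sketch below assumes smooth coefficient functions supplied as Python callables and uses a classical fourth-order Runge-Kutta step on the first-order system $y' = w/A$, $w' = -By$ with $w = Ay'$; the function names and the choice of integrator are illustrative, not part of the text.

```python
def integrate(A, B, y0, dy0, a=0.0, b=1.0, steps=2000):
    """RK4 for (A y')' + B y = 0, as y' = w/A, w' = -B y with w = A y'.

    Returns the list of (x, y, w) at the grid points.
    """
    h = (b - a) / steps
    x, y, w = a, y0, A(a) * dy0

    def rhs(x, y, w):
        return w / A(x), -B(x) * y

    path = [(x, y, w)]
    for _ in range(steps):
        k1 = rhs(x, y, w)
        k2 = rhs(x + h / 2, y + h / 2 * k1[0], w + h / 2 * k1[1])
        k3 = rhs(x + h / 2, y + h / 2 * k2[0], w + h / 2 * k2[1])
        k4 = rhs(x + h, y + h * k3[0], w + h * k3[1])
        y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        w += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
        path.append((x, y, w))
    return path

def wronskian_products(A, B, a=0.0, b=1.0):
    """Values of A (v u' - u v') along [a, b] for two independent solutions."""
    u = integrate(A, B, 1.0, 0.0, a, b)   # u(a) = 1, u'(a) = 0
    v = integrate(A, B, 0.0, 1.0, a, b)   # v(a) = 0, v'(a) = 1
    return [pv[1] * pu[2] - pu[1] * pv[2] for pu, pv in zip(u, v)]

if __name__ == "__main__":
    values = wronskian_products(lambda x: 1.0 + x, lambda x: 4.0 + x * x)
    print(min(values), max(values))   # both close to -A(a) = -1
```

For these initial conditions the constant is $-A(a)$, so any drift in the printed values measures the integration error rather than a property of the equation.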
Since, by hypothesis, $A(x) > 0$, the constant is zero iff the Wronskian, $vu' - uv'$, is zero, i.e., iff the solutions are proportional, i.e., $u = kv$. Henceforward, we will say that two solutions $u$, $v$ are the same if $u = kv$, different if there is no $k$ such that $u = kv$. From this we may immediately deduce
Theorem 14.2.1 Two different solutions of (14.2.1) cannot have a common zero.

Proof. $u(c) = 0 = v(c)$ implies $C = 0$ in (14.2.2).

Theorem 14.2.2 The nodes of two real different solutions of (14.2.1) separate each other.

Proof. First note that it is necessary to include the word ‘real’ in the statement because $y(x) = \cos x + i\sin x$, a solution of $y'' + y = 0$, has no nodes on the real axis. Now suppose one solution of (14.2.1), $u$, has two nodes $x_1, x_2 \in [a, b]$, and $v$ is a second, different, solution. By Theorem 14.2.1, $v(x_1), v(x_2) \neq 0$. Suppose $v(x)$ has no node in $(x_1, x_2)$, then it must have the same sign in $[x_1, x_2]$, say positive. That means that $z = u(x)/v(x)$ is continuous, has a continuous derivative in $[x_1, x_2]$, and is zero at the ends $x_1$ and $x_2$. Therefore, by Rolle’s Theorem, $z'(\xi) = 0$ for some $\xi \in (x_1, x_2)$. But
$$z' = \frac{vu' - uv'}{v^2}.$$
The numerator of this expression is the Wronskian, which is not zero because $u$, $v$ are different, and the denominator is $v^2$, which is positive by hypothesis. Thus, $z' \neq 0$ throughout $(x_1, x_2)$. This contradiction implies that $v$ has a node in $(x_1, x_2)$.

Corollary 14.2.1 If $u$, $v$ are two different solutions of (14.2.1), then the numbers of nodes of $u$, $v$ in any interval $[\alpha, \beta] \subset [a, b]$ cannot differ by more than one.

Theorems 14.2.1, 14.2.2 concern two different solutions of the same equation (14.2.1); the next results concern the solutions of two different equations.

Theorem 14.2.3 Suppose $u(x)$ is a solution of
$$(Ay')' + B_1y = 0,$$
and $v(x)$ is a solution of
$$(Ay')' + B_2y = 0,$$
where $B_1 \le B_2$ in $[a, b]$ and $B_1(x) < B_2(x)$ for some $x \in [a, b]$, then $v(x)$ has a node between any two nodes of $u(x)$.

Proof. Suppose $x_1$, $x_2$ are consecutive nodes of $u$, and suppose, if possible, that $v$ has no node in $(x_1, x_2)$. With no loss in generality, we may assume that $u(x), v(x) > 0$ in $(x_1, x_2)$. The equations
$$(Au')' + B_1u = 0, \qquad (Av')' + B_2v = 0$$
yield, as before,
$$-u(Av')' + v(Au')' = (B_2 - B_1)uv$$
so that
$$-(A(uv' - vu'))' = (B_2 - B_1)uv$$
which, on integration, gives
$$-[A(uv' - vu')]_{x_1}^{x_2} = \int_{x_1}^{x_2}(B_2 - B_1)uv\,dx. \tag{14.2.3}$$
Since $u(x_1) = 0 = u(x_2)$ the L.H.S. is
$$P \equiv -A(x_1)v(x_1)u'(x_1) + A(x_2)v(x_2)u'(x_2).$$
Since $u(x) > 0$ in $(x_1, x_2)$ we have $u'(x_1) > 0$, $u'(x_2) < 0$ so that $P \le 0$. $P$ can be zero only if $v(x_1) = 0 = v(x_2)$ in which case the Wronskian of $u$ and $v$ is zero, $u$ and $v$ are the same solution in $[x_1, x_2]$, and
$$\int_{x_1}^{x_2}(B_2 - B_1)uv\,dx = 0.$$
Since $uv > 0$ in $(x_1, x_2)$ and $B_1$, $B_2$ are continuous, this forces $B_1 = B_2$ in $(x_1, x_2)$. Otherwise, $P < 0$, and $B_1 \le B_2$ implies that the R.H.S. of (14.2.3) is non-negative ($\ge 0$). This contradiction implies that $v(x)$ has a node between $x_1$ and $x_2$.

Picone extended Sturm’s Theorem 14.2.3 to give

Theorem 14.2.4 Suppose $u(x)$ is a solution of
$$(A_1y')' + B_1y = 0$$
and $v(x)$ is a solution of
$$(A_2y')' + B_2y = 0$$
where $A_1 \ge A_2 > 0$, $B_1 \le B_2$ in $[a, b]$ and $A_1(\xi) > A_2(\xi)$, $B_1(\eta) < B_2(\eta)$ for some $\xi, \eta \in [a, b]$. Then $v(x)$ has a node between any two nodes of $u(x)$.

Proof. Picone wrote
$$\left[\frac{u}{v}(A_1u'v - A_2uv')\right]' = (B_2 - B_1)u^2 + (A_1 - A_2)u'^2 + A_2\left(u' - \frac{uv'}{v}\right)^2. \tag{14.2.4}$$
Suppose, as before, that $x_1$, $x_2$ are two consecutive nodes of $u$, that $u(x) > 0$ in $(x_1, x_2)$ so that $u'(x_1) > 0$, $u'(x_2) < 0$. Suppose $v$ has no node in $(x_1, x_2)$ and that $v(x_1), v(x_2) > 0$. On integrating (14.2.4) over $(x_1, x_2)$ we find
$$\left[\frac{u}{v}(A_1u'v - A_2uv')\right]_{x_1}^{x_2} = \int_{x_1}^{x_2}(B_2 - B_1)u^2\,dx + \int_{x_1}^{x_2}(A_1 - A_2)u'^2\,dx + \int_{x_1}^{x_2}A_2\left(u' - \frac{uv'}{v}\right)^2dx. \tag{14.2.5}$$
The L.H.S. is zero because $u(x_1) = 0 = u(x_2)$, while the R.H.S. is positive. This contradiction implies that $v$ has a node in $(x_1, x_2)$. The L.H.S. is still zero even if $v(x)$ is zero at one or both of $x_1$, $x_2$. For if $v(x)$ is zero, say, at $x_1$ then
$$\lim_{x \to x_1}\frac{u}{v} = \frac{u'(x_1)}{v'(x_1)}$$
so that
$$\lim_{x \to x_1}\left[\frac{u}{v}(A_1u'v - A_2uv')\right] = (A_1 - A_2)uu'|_{x_1} = 0.$$
Note that, in exceptional cases, discussed in Ex. 14.2.2, the R.H.S. may be zero, in which cases we must modify our conclusion slightly. We may use Picone’s formula (14.2.4) to prove two separation theorems. The first is

Theorem 14.2.5 Suppose $u(x)$ is the solution of
$$(A_1u')' + B_1u = 0 \tag{14.2.6}$$
subject to
$$u(a) = \alpha, \qquad u'(a) = \alpha' \tag{14.2.7}$$
and $v(x)$ is the solution of
$$(A_2v')' + B_2v = 0$$
subject to
$$v(a) = \beta, \qquad v'(a) = \beta'.$$
We make the following assumptions:
1) $A_1 \ge A_2 > 0$, $B_1 \le B_2$ in $[a, b]$.
2) $\alpha$, $\alpha'$ are not both zero, nor are $\beta$, $\beta'$.
3) If $\alpha \neq 0$, then
$$\frac{A_2(a)\beta'}{\beta} \le \frac{A_1(a)\alpha'}{\alpha}$$
which implies $\beta \neq 0$.
4) The identity $B_1 \equiv 0 \equiv B_2$ is not satisfied in any finite part of $[a, b]$.
If $u(x)$ has $n$ nodes in $(a, b]$, then $v(x)$ has at least $n$ nodes in $(a, b]$, and the $i$th node of $v(x)$ is less than the $i$th node of $u(x)$.

Proof. Let $x_1, x_2, \ldots, x_n$ be the nodes of $u(x)$ in $(a, b]$, so that
$$a < x_1 < x_2 < \cdots < x_n \le b.$$
Sturm’s Theorem 14.2.4 states that $v(x)$ has a node between any two consecutive nodes $x_i$, $x_{i+1}$. The Theorem holds therefore if we can show that $v(x)$ has a node between $a$ and $x_1$.
If $u(x)$ is zero at the left hand end point, then, by Theorem 14.2.4, $v(x)$ has a node between $a$ and $x_1$. We therefore suppose that $\alpha \neq 0$, so that condition 3) implies $\beta \neq 0$. Integrate the Picone formula (14.2.4) between $a$ and $x_1$, assuming that $v(x)$ has no node in $(a, x_1)$; the integral of the L.H.S. is
$$\left[\frac{u}{v}(A_1u'v - A_2uv')\right]_a^{x_1} = -u^2(a)\left(\frac{A_1(a)\alpha'}{\alpha} - \frac{A_2(a)\beta'}{\beta}\right)$$
which, by condition 3), is not positive. The integral of the R.H.S. of (14.2.4) is positive. This contradiction implies that $v(x)$ has a node between $a$ and $x_1$.

This theorem allows us to deduce what happens to the nodes of $u(x)$, the solution of equations (14.2.6), (14.2.7) when $A(x)$ decreases continuously and $B(x)$ increases continuously, while $\alpha$ and $\alpha'$ are kept invariant: each new node enters at $x = b$ and moves towards $x = a$.
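The separation property can be illustrated on a pair of equations with variable coefficients. The sketch below is a numerical illustration under assumptions made here: $A_1 = A_2 = 1 + x$, two hypothetical choices of $B_1 \le B_2$, the common initial data $u(a) = v(a) = 1$, $u'(a) = v'(a) = 0$, a fourth-order Runge-Kutta integrator, and nodes located by linear interpolation between grid points.

```python
def nodes(A, B, y0, dy0, a=0.0, b=1.0, steps=4000):
    """Nodes in (a, b] of the solution of (A y')' + B y = 0, y(a)=y0, y'(a)=dy0."""
    h = (b - a) / steps
    x, y, w = a, y0, A(a) * dy0           # w stands for A y'

    def rhs(x, y, w):
        return w / A(x), -B(x) * y

    found = []
    for _ in range(steps):
        k1 = rhs(x, y, w)
        k2 = rhs(x + h / 2, y + h / 2 * k1[0], w + h / 2 * k1[1])
        k3 = rhs(x + h / 2, y + h / 2 * k2[0], w + h / 2 * k2[1])
        k4 = rhs(x + h, y + h * k3[0], w + h * k3[1])
        y_new = y + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        w_new = w + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if y * y_new < 0.0 or (y_new == 0.0 and y != 0.0):
            found.append(x + h * y / (y - y_new))   # linear interpolation
        x, y, w = x + h, y_new, w_new
    return found

if __name__ == "__main__":
    A = lambda x: 1.0 + x
    nodes_u = nodes(A, lambda x: 30.0 + 10.0 * x, 1.0, 0.0)   # B1
    nodes_v = nodes(A, lambda x: 60.0 + 10.0 * x, 1.0, 0.0)   # B2 >= B1
    print(nodes_u)
    print(nodes_v)
```

In runs of this kind each node of the solution with the larger $B$ lies to the left of the corresponding node of the other solution, as the theorem asserts.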
Exercises 14.2
1. Suppose $y(x)$ has an infinity of nodes in $[a, b]$ with limit point $c$. Use the Mean Value Theorem to show that $y'(c) = 0$.
2. Explore how the R.H.S. of (14.2.5) can actually be zero. Show that one can ensure that it is not zero by imposing the condition that $B_1$ and $B_2$ are not identically zero in any finite part of $(a, b)$.
3. See how the nodes of $y'' + \omega^2y = 0$, $y(0) = \alpha$, $y'(0) = 0$ in $[0, 1]$ travel from 1 towards 0 as $\omega$ increases.
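For Ex. 14.2.3 the solution is $y = \alpha\cos\omega x$, whose nodes are $x_k = (2k-1)\pi/(2\omega)$. A short sketch (the function name is illustrative) lists the nodes that fall in $[0, 1]$ so that their motion with $\omega$ can be tabulated.

```python
import math

def cosine_nodes(omega):
    """Nodes in [0, 1] of y = alpha*cos(omega*x), i.e. of y'' + omega^2 y = 0
    with y(0) = alpha != 0 and y'(0) = 0."""
    found = []
    k = 1
    while (2 * k - 1) * math.pi / (2.0 * omega) <= 1.0:
        found.append((2 * k - 1) * math.pi / (2.0 * omega))
        k += 1
    return found

if __name__ == "__main__":
    for omega in (1.0, 2.0, 5.0, 8.0):
        print(omega, cosine_nodes(omega))
```

As $\omega$ grows, each existing node moves towards 0 and new nodes enter the interval at $x = 1$.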
14.3
Applications of Sturm’s Theorems
Sturm’s Theorems describe what happens to a node of a solution of equation (14.2.1) when $A(x)$ or $B(x)$ change. In this section we look at the inverse question: what can we deduce about changes in $A(x)$, $B(x)$ from changes in nodal positions? First, consider the taut string governed by equation (10.1.1), namely
$$u'' + \lambda\rho^2u = 0. \tag{14.3.1}$$
Recall that $\lambda = \omega^2$, $\rho^2(x)$ is the mass per unit length, and that the end conditions are
$$u'(0) - hu(0) = 0 = u'(1) + Hu(1).$$
Equation (14.3.1) has the form (14.2.1) with
$$A(x) = 1, \qquad B(x) = \lambda\rho^2(x).$$
Consider what happens when a small mass is removed from the string at some interior point $c$. We can imagine that the mass is removed continuously over
a small interval $(c - \varepsilon, c + \varepsilon)$. Removal of mass increases (or at least does not decrease) the natural frequency. Denote the new natural frequency by $\omega^*$, $\lambda^* = \omega^{*2}$, the new mass distribution by $\rho^{*2}(x)$, and let $v$ be the solution of
$$v'' + \lambda^*\rho^{*2}v = 0$$
subject to
$$v'(0) - hv(0) = 0 = v'(1) + Hv(1).$$
Suppose $u$ has a node $\xi \in (0, c - \varepsilon]$, and apply Theorem 14.2.5 to $[0, c - \varepsilon]$. In that interval $A_1 = A_2$, $\rho = \rho^*$ and $B_1 = \lambda\rho^2 \le \lambda^*\rho^{*2} = B_2$. Thus $v$ has a node $\eta \in (0, c - \varepsilon]$, and $\eta \le \xi$. If $u$ has $n$ nodes $(\xi_i)_1^n \in (0, c - \varepsilon]$, then $v$ has at least $n$ nodes $(\eta_i)_1^n \in (0, c - \varepsilon]$, and $\eta_i \le \xi_i$. Thus nodes to the left of $c - \varepsilon$ move to the left. By physically turning the string around, we see that nodes to the right of $c + \varepsilon$ will move to the right: the nodes move away from $c$. We note that the result holds if mass is removed over any interval, small or not. (But the theorem does not yield information about the movement of nodes in the interval.) Also, if mass is added rather than removed then the nodes will move toward the added mass.

We may draw a conclusion regarding the inverse question: if nodes move away from (toward) an interval, then mass has been removed (added) in that interval. This holds only for one interval; if there are two or more intervals in which mass is removed (added), then there will be interaction between the two effects. Note that in Theorem 14.2.5, and in our analysis in this section, we predicted the movement of nodes to the left of $c - \varepsilon$ by considering only the solution of the differential equation and the left-hand end conditions
$$u(0) = \alpha, \qquad u'(0) = h\alpha.$$
We can make a crude estimate of the amount of mass added or removed in a small interval if we can identify two neighbouring nodes $x_1$, $x_2$ of a mode with frequency $\omega$, such that after the mass is added, the node $x_1$ moves to the right to $x_1^*$, and $x_2$ to the left, to $x_2^*$, and the frequency decreases to $\omega^*$. Suppose that the original mass per unit length between $x_1$ and $x_2$ was constant, $\rho^2$, and that after the mass is added it is $(\rho + \delta\rho)^2$. Since $x_1$, $x_2$ are consecutive nodes of the initial mode
$$\omega\rho(x_2 - x_1) = \pi$$
and similarly
$$\omega^*(\rho + \delta\rho)(x_2^* - x_1^*) = \pi.$$
This means that, knowing $x_2 - x_1$, $x_2^* - x_1^*$, $\omega$ and $\omega^*$ we may find $(\rho + \delta\rho)/\rho$:
$$\frac{\rho + \delta\rho}{\rho} = \frac{\omega}{\omega^*}\cdot\frac{x_2 - x_1}{x_2^* - x_1^*} > 1.$$
For a slightly more challenging problem, let us consider the effect of point damage to a rod in longitudinal vibration, following Gladwell and Morassi (1999)
[128]. Recall that for an undamaged rod with cross-sectional area $A(x)$, the governing equations are (10.1.3), (10.1.4):
$$(Au')' + \lambda Au = 0, \tag{14.3.2}$$
$$u'(0) - hu(0) = 0 = u'(1) + Hu(1). \tag{14.3.3}$$
We note that the ‘stiffness’ term, $(Au')'$, has the same distribution, $A$, as the ‘inertia’ term $\lambda Au$. If the rod is damaged by a small notch at $x = c$, then the stiffness will be seriously affected while the inertia term will be almost unaffected. For this reason, we model the notch as a spring so that, at $c$,
$$[u'(c)] = 0, \tag{14.3.4}$$
$$k[u(c)] = A(c)u'(c), \tag{14.3.5}$$
where $[f(c)] := f(c+) - f(c-)$. The undamaged rod corresponds to $k \to \infty$, i.e., $\varepsilon = 1/k \to 0$. We may show, as expected, that the natural frequencies are increasing functions of $k$, i.e., decreasing functions of $\varepsilon$. We may find the first order variation of the natural frequencies with $\varepsilon$ by taking
$$u(x) = u_0(x) + \varepsilon v(x), \qquad \lambda = \lambda_0 + \varepsilon\,\delta\lambda,$$
in (14.3.2)-(14.3.5). We find that
$$(Au_0')' + \lambda_0Au_0 = 0, \tag{14.3.6}$$
$$(Av')' + \lambda_0Av + \delta\lambda\,Au_0 = 0, \tag{14.3.7}$$
$$[v'(c)] = 0, \tag{14.3.8}$$
$$[v(c)] = A(c)u_0'(c). \tag{14.3.9}$$
Multiplying (14.3.6) by $v$, (14.3.7) by $u_0$ and subtracting, and then integrating from 0 to 1, using (14.3.3), we find
$$(A(c)u_0'(c))^2 + \delta\lambda\int_0^1 Au_0^2\,dx = 0,$$
which, with the normalising condition
$$\int_0^1 Au_0^2\,dx = 1$$
gives
$$\delta\lambda = -(A(c)u_0'(c))^2. \tag{14.3.10}$$
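The first order estimate (14.3.10) can be compared with an exact eigenvalue in a simple case. The sketch below assumes a uniform free-free rod ($A = 1$, $h = H = 0$) with the spring model above; for that case the frequency equation $\sin\mu = \varepsilon\mu\sin(\mu c)\sin(\mu(1-c))$, with $\lambda = \mu^2$, was worked out for this illustration and is not quoted from the text. The root near $m\pi$ is found by bisection and $(\lambda - \lambda_0)/\varepsilon$ is compared with $-(u_0'(c))^2 = -2m^2\pi^2\sin^2(m\pi c)$.

```python
import math

def notched_rod_mu(m, c, eps):
    """Root mu near m*pi of sin(mu) = eps*mu*sin(mu*c)*sin(mu*(1 - c)).

    Assumes eps is small enough for the bracket [m*pi - 0.1, m*pi + 0.1]
    to contain exactly one sign change.
    """
    def f(mu):
        return math.sin(mu) - eps * mu * math.sin(mu * c) * math.sin(mu * (1.0 - c))
    lo, hi = m * math.pi - 0.1, m * math.pi + 0.1
    if f(lo) * f(hi) > 0.0:
        raise ValueError("no sign change in the bracket; eps may be too large")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def first_order_rate(m, c):
    """delta-lambda from (14.3.10) with u0 = sqrt(2)*cos(m*pi*x) and A = 1."""
    return -2.0 * (m * math.pi) ** 2 * math.sin(m * math.pi * c) ** 2

if __name__ == "__main__":
    m, c, eps = 1, 0.3, 1.0e-4
    mu = notched_rod_mu(m, c, eps)
    print((mu * mu - (m * math.pi) ** 2) / eps, first_order_rate(m, c))
```

The two printed numbers differ only by terms of higher order in $\varepsilon$, and both are negative, consistent with the frequencies decreasing as the notch deepens.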
This equation shows how the natural frequencies change with $\varepsilon$. We now show how the modes, and particularly the nodes, change with $\varepsilon$. To do that, we use Theorem 14.2.4 again. We consider the portion of the rod to the left of $c$; there the displacement is given by the solution of (14.3.2) and the first of equations
(14.3.3) with $\lambda = \lambda_0 + \varepsilon\,\delta\lambda < \lambda_0$. We identify $B_2$ with the undamaged case ($B_2 = \lambda_0A$) and $B_1$ with the damaged case ($B_1 = \lambda A$). According to Theorem 14.2.5, the nodes corresponding to $B_2$ lie to the left of those corresponding to $B_1$. That is, due to the damage, nodes move toward the damage.

We now determine the first order change in the positions of the nodes. To do this, we estimate the first order changes in the nodes of $u$ to the left of $c$. This means that we are looking at the first order change in the solution of
$$(A\theta')' + \lambda A\theta = 0, \qquad \theta'(0) - h\theta(0) = 0.$$
Note that we write the dependent variable as $\theta$ to emphasize that we are not looking at an eigenmode, just at the solution satisfying the left-hand end condition. This solution is uniquely determined apart from an arbitrary multiplicative constant. Put $\theta = \theta_0 + \varepsilon\psi$, $\lambda = \lambda_0 + \varepsilon\,\delta\lambda$, and find
$$(A\psi')' + \lambda_0A\psi + \delta\lambda\,A\theta_0 = 0, \tag{14.3.11}$$
$$\psi'(0) - h\psi(0) = 0. \tag{14.3.12}$$
To solve these equations we use the method of variation of parameters: we write $\psi = \theta_0f$. After some manipulation, we find that this will satisfy (14.3.11), (14.3.12) if
$$(A\theta_0^2f')' + \delta\lambda\,A\theta_0^2 = 0, \qquad f'(0) = 0.$$
Thus
$$A\theta_0^2f' + \delta\lambda\int_0^x A\theta_0^2\,dx = 0. \tag{14.3.13}$$
If $x_0$ is a node of $\theta_0$, then the corresponding node of $\theta$ is $x_0 + \varepsilon\delta$ where, to first order,
$$0 = \theta(x_0 + \varepsilon\delta) = \theta_0(x_0 + \varepsilon\delta) + \varepsilon\psi(x_0) = \theta_0(x_0) + \varepsilon\delta\,\theta_0'(x_0) + \varepsilon\psi(x_0).$$
Thus
$$\delta = -\psi(x_0)/\theta_0'(x_0). \tag{14.3.14}$$
Now $\psi(x) = \theta_0(x)f(x)$, and since $\theta_0(x) \to 0$ as $x \to x_0$, we must have $f(x) \to \infty$ as $x \to x_0$. We must therefore evaluate $\psi(x_0)$ by writing $f(x) = 1/g(x)$ and using l’Hôpital’s rule:
$$\psi(x_0) = \lim_{x \to x_0}\frac{\theta_0(x)}{g(x)} = \frac{\theta_0'(x_0)}{g'(x_0)}.$$
Putting $f = 1/g$ in (14.3.13) we find
$$-\frac{A\theta_0^2(x)}{g^2(x)}g'(x) + \delta\lambda\int_0^x A\theta_0^2\,dx = 0,$$
14. Continuous Modes and Nodes
411
and on taking the limit { $ {0 , we find D({0 )
µ
00 ({0 ) j 0 ({0 )
¶2
0
j ({0 ) +
Z
{0
D20 g{ = 0=
(14.3.15)
0
To find the change in a node of a mode, we put 0 ({) = x0 ({) and combine (14.3.15) with (14.3.14) and (14.3.10) to give Z {0 Dx20 g{@(D({0 )[x00 ({0 )]2 ) (14.3.16) % = %[D(f)x00 (f)]2 0
as the change in the position of the node, from {0 to {0 + %; as expected, A 0. s In the particular case of a uniform free-free rod, for which D = 1> x0 = 2 cos[(q 1){]> q = 1> 2> = = = we find that the pth mode moves from {0 = (2p 1)@(2q 2)> p = 1> = = = > q 1, to {0 + % where % = %{0 sin2 [(q 1)f]= The corresponding changes for nodes to the right of f are % = %(1 {0 ) sin2 [(q 1)f]= These results show that, for a given mode, the changes in node positions increase as the node, {0 , approaches the damage position. The proportional changes for those nodes to the left of f> %@{0 , and for those to the right, %@(1 {0 ), are the same for each node; they depend only on the position of the notch. This means that to find the position of the damaged point we look for two nodes of a mode that have moved towards each other; the notch lies between these nodes. An experimental study based on these results may be found in Gladwell and Morassi (1999) [128], which also gives references to the related literature.
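The closed form for the uniform free-free rod can be checked against a direct evaluation of (14.3.16). The following is a minimal sketch, assuming $A = 1$, the normalised mode $\sqrt{2}\cos[(n-1)\pi x]$ and a node lying to the left of the damage; the function names and the trapezoidal quadrature are illustrative choices, not part of the original analysis.

```python
import numpy as np


def node_shift_quadrature(n, m, c, num=20001):
    """Coefficient of epsilon in (14.3.16) for A = 1, by trapezoidal quadrature.

    Intended for the m-th node of the n-th mode when that node lies
    to the left of the damage position c.
    """
    w = (n - 1) * np.pi                       # u0(x) = sqrt(2) cos(w x)
    x0 = (2 * m - 1) / (2 * n - 2)            # undamaged node position
    x = np.linspace(0.0, x0, num)
    y = 2.0 * np.cos(w * x) ** 2              # A u0^2 with A = 1
    integral = np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0
    du0_c_sq = 2.0 * w ** 2 * np.sin(w * c) ** 2     # [A(c) u0'(c)]^2
    du0_x0_sq = 2.0 * w ** 2 * np.sin(w * x0) ** 2   # A(x0) [u0'(x0)]^2
    return x0, du0_c_sq * integral / du0_x0_sq


def node_shift_closed_form(n, m, c):
    """Closed form x0 sin^2[(n-1) pi c] quoted for nodes to the left of c."""
    x0 = (2 * m - 1) / (2 * n - 2)
    return x0, x0 * np.sin((n - 1) * np.pi * c) ** 2


if __name__ == "__main__":
    for m in (1, 2):                          # nodes 1/6 and 1/2, both left of c = 0.6
        print(node_shift_quadrature(4, m, 0.6), node_shift_closed_form(4, m, 0.6))
```

Multiplying the returned coefficient by $\varepsilon$ gives the first-order displacement of the node; the ratio of coefficient to $x_0$ is the same for every node to the left of the notch, as stated in the text.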
14.4
The research of Hald and McLaughlin
Both Ole Hald and Joyce McLaughlin have been studying inverse problems, amongst other topics, for many years, and we have referred to their individual researches on numerous occasions already. In this section we make a brief report on their joint work on inverse nodal problems. Inverse nodal problems differ from the inverse eigenvalue problems that form the subject matter of most of this book, in many subtle ways. We have already noted that while an eigenvalue, a natural frequency, reflects the properties of a system as a whole, a nodal position relates to the properties of the system near the node. But there are other differences, differences in the paths from data to system properties. When the data consist of eigenvalues (and maybe some norming constants) there is usually some algorithm that gives the exact values of a set of parameters defining the properties of a unique system which has this spectral data. In contrast, any researcher approaching an inverse nodal problem soon realises that nodal positions, the totality of nodal positions for all
the principal modes, provide too much data. For example, for a string fixed at its ends, the first mode has no node, the next has one, and so on; the first $n$ modes have a total of $n(n-1)/2$ nodes. Somehow we must make a choice: choose all the nodes of one mode, or choose one node from each mode, for example. Clearly, different choices will yield different models. The situation is made more complex because a continuous system, like a string, has an infinity of modes, and thus of nodes. From a mathematical point of view it would be reassuring to know that if one chose nodes of more and more modes in a particular way, then the resulting systems would converge in some sense to a unique system, and that one could give numerical estimates of the error one would make by using a finite number, $n$, of properly chosen nodes.

There are thus three distinct parts to the 'solution' of an inverse nodal problem: finding an approximate system which has given nodal positions for certain mode(s); establishing that if an infinity of nodes, chosen in a certain way, is given, there is no more than one vibrating system, of a specified type, that has these nodes for certain of its modes; constructing bounds for the error in truncating the infinite set of nodes at a certain number $n$. The first part is relatively simple; Hald and McLaughlin provide a number of algorithms for the various types of Sturm-Liouville system described in Section 10.1. The other two parts are difficult, and require a daunting array of analytical tools; we shall therefore content ourselves with giving the gist of the methods used and theorems proved; the interested reader may consult the original papers that are readily available.

Our starting point is a fundamental paper by McLaughlin alone, McLaughlin (1988) [231]. This deals with part 2 of the problem, uniqueness. McLaughlin considers the Sturm-Liouville equation (10.1.14) with Dirichlet end conditions:
$$y'' + (\lambda - q)y = 0, \tag{14.4.1}$$
$$y(0) = 0 = y(1), \tag{14.4.2}$$
where $q \in L^2(0,1)$. First recall that if $q_1, q_2$ are two potentials with $q_2 = q_1 + c$, where $c$ is a constant, then noting that $\lambda - q_2 = (\lambda - c) - q_1$, we see that the eigenvalues of the two problems differ by $c$ while the eigenfunctions, and thus the nodes of the eigenfunctions, remain the same. This means that nodal information alone can yield $q$ only to within an arbitrary additive constant: any uniqueness theorem related to nodal information must contain the added information
$$\int_0^1 q_1(x)\,dx = \int_0^1 q_2(x)\,dx. \tag{14.4.3}$$
McLaughlin proves that if two potentials $q_1, q_2$ satisfy (14.4.3), and if the eigenfunctions $y_n(q_1, x)$, $y_n(q_2, x)$ have a common set of nodes that is dense in (0,1) (see Section 10.3 for the definition of dense), then $q_1 = q_2$ in $L^2(0,1)$. The gist of the proof is as follows.
First consider (14.4.1), (14.4.2) with $q \equiv 0$. The eigenvalues are $\lambda_n = (n\pi)^2$, $n = 1, 2, \ldots$; the eigenfunctions are $y_n(x) = y_n(0, x) = \sin n\pi x$; the nodes of $y_n(x)$ are $x_{n,j}(0) = j/n$, $j = 1, 2, \ldots, (n-1)$. Note that $y_1(x)$ has no node. Now group the numbers 2,3,4,... as follows: 2; 4,3; 8,7,6,5; ... This is equivalent to writing
$$n = 2^{k+1} - m; \qquad k = 0, 1, 2, \ldots; \quad m = 0, 1, \ldots, 2^k - 1. \tag{14.4.4}$$
The $(m+1)$th node of the $n$th mode is
$$x_{n,m+1}(0) = (m+1)/(2^{k+1} - m) \tag{14.4.5}$$
and the set of numbers $x_{n,m+1}(0)$ for $k = 0, 1, 2, \ldots$; $m = 0, 1, \ldots, 2^k - 1$ is dense in (0,1); the numbers are $\frac{1}{2}$; $\frac{1}{4}, \frac{2}{3}$; $\frac{1}{8}, \frac{2}{7}, \frac{3}{6}, \frac{4}{5}$; ... The uniqueness result is

Theorem 14.4.1 Let $q_1, q_2 \in L^2(0,1)$, and consider the eigenvalue problems $y'' + (\lambda - q_i)y = 0$, $y(0) = 0 = y(1)$, $i = 1, 2$. For each $n \ge 2$, suppose that the positions of the nodes, chosen according to (14.4.5) satisfy
$$x_{n,j}(q_1) = x_{n,j}(q_2), \qquad n = 2, 3, \ldots$$
and that
$$\int_0^1 q_1(x)\,dx = \int_0^1 q_2(x)\,dx,$$
then $q_1 = q_2$ in $L^2(0,1)$.
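The selection rule (14.4.4), (14.4.5) is easy to enumerate. The sketch below is only an illustration for the case $q \equiv 0$: it lists one node per mode in the grouped order and reports the largest gap left in (0,1), which shrinks as more groups are included; the function names are hypothetical.

```python
from fractions import Fraction


def selected_nodes(k_max):
    """One node per mode, chosen by n = 2^(k+1) - m, node index m + 1, for q = 0."""
    chosen = []
    for k in range(k_max + 1):
        for m in range(2 ** k):
            n = 2 ** (k + 1) - m                              # mode number (14.4.4)
            chosen.append((n, m + 1, Fraction(m + 1, n)))     # node position (14.4.5)
    return chosen


def largest_gap(k_max):
    """Largest interval of (0,1) containing none of the selected nodes."""
    points = sorted({Fraction(0), Fraction(1)} | {x for _, _, x in selected_nodes(k_max)})
    return max(b - a for a, b in zip(points, points[1:]))


if __name__ == "__main__":
    for n, j, x in selected_nodes(2):
        print(f"mode {n}: node {j} at x = {x}")
    print([float(largest_gap(k)) for k in range(6)])
```

Exact rational arithmetic is used so that coincident positions, such as $3/6$ and $1/2$, are recognised as the same point.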
McLaughlin contrasts this inverse nodal problem with inverse eigenvalue problems for the Sturm-Liouville equation, and recalls that two spectra, corresponding to two different end conditions at one end (or some equivalent data, e.g., norming constants) are needed to determine $q$. She comments 'what can be shown. . . is that the position of one node, albeit judiciously chosen, for each eigenfunction, $n \ge 2$, is more than enough data to determine $q$ uniquely (apart from a constant). It seems then that the nodal positions in some sense contain "more" information about this potential $q$ than either a set of eigenvalues or a set of norming constants.'

While McLaughlin (1988) [231] was concerned only with part 2, uniqueness, Hald and McLaughlin (1989) [167] consider all three aspects, approximation, uniqueness and error bounds. They consider a generic equation with free end conditions:
$$(pv')' + \omega^2\rho^2 v = 0, \tag{14.4.6}$$
$$v'(0) = 0 = v'(1). \tag{14.4.7}$$
If $p \equiv 1$, this is the string equation (10.1.1) (with density $\rho^2$). If $p \equiv \rho^2$, this is the rod equation (10.1.3) with $A \equiv p \equiv \rho^2$.
One problem that they consider is the string ($p \equiv 1$) with free ends. They construct a string with piecewise constant density as follows. Suppose the nodes of the $n$th ($n \ge 2$) mode $v_n(x)$ are $(x_j)_1^{n-1}$, where $0 < x_1 < x_2 < \ldots < x_{n-1} < 1$. Consider $v_n(x)$ in an interval $(x_{j-1}, x_j)$. In that interval $v_n(x)$ is the fundamental mode of the string fixed at the ends $x_{j-1}$ and $x_j$, and $\omega_n$ is the fundamental frequency; in $(0, x_1)$ ($(x_{n-1}, 1)$) it is the fundamental mode for a string free at 0, fixed at $x_1$ (fixed at $x_{n-1}$, free at 1). Suppose, therefore, that the non-uniform string is replaced by a string with uniform density $\rho_j^2$ in the interval $(x_{j-1}, x_j)$, $j = 1, \ldots, n$ where $x_0 = 0$, $x_n = 1$. For the $j$th ($2 \le j \le n-1$) part of the string, the governing equation is
$$v'' + \omega_n^2\rho_j^2 v = 0, \qquad v(x_{j-1}) = 0 = v(x_j),$$
so that $v(x) = \sin\{\pi(x - x_{j-1})/(x_j - x_{j-1})\}$, and
$$\rho_j = \pi/(\omega_n(x_j - x_{j-1})), \qquad j = 2, \ldots, n-1. \tag{14.4.8}$$
For the first segment $(0, x_1)$ we have the end conditions $v'(0) = 0 = v(x_1)$ so that $v(x) = \cos(\pi x/(2x_1))$, and
$$\rho_1 = \pi/(2\omega_n x_1). \tag{14.4.9}$$
Similarly for the last segment $(x_{n-1}, 1)$,
$$\rho_n = \pi/(2\omega_n(1 - x_{n-1})). \tag{14.4.10}$$
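Formulas (14.4.8)-(14.4.10) translate directly into a few lines of code. The following is a minimal sketch for the free-ended string only; the function name and argument layout are assumptions made for illustration, and the returned values are the $\rho_j$ (the density on each segment is $\rho_j^2$).

```python
import math


def string_rho_from_nodes(nodes, omega_n):
    """Piecewise-constant rho_j from the nodes of the n-th mode of a free-free string.

    nodes   : increasing interior node positions x_1 < ... < x_{n-1} in (0, 1)
    omega_n : natural frequency of that mode
    """
    xs = [0.0] + list(nodes) + [1.0]
    n = len(xs) - 1                            # number of segments
    rho = []
    for j in range(1, n + 1):
        length = xs[j] - xs[j - 1]
        if j == 1 or j == n:
            # End segments are free at one end, fixed at the node: (14.4.9), (14.4.10)
            rho.append(math.pi / (2.0 * omega_n * length))
        else:
            # Interior segments are fixed at both nodes: (14.4.8)
            rho.append(math.pi / (omega_n * length))
    return rho


if __name__ == "__main__":
    # Uniform string with rho = 1: the 5th mode is cos(4 pi x), omega = 4 pi,
    # with nodes at 1/8, 3/8, 5/8, 7/8.
    print(string_rho_from_nodes([1 / 8, 3 / 8, 5 / 8, 7 / 8], 4.0 * math.pi))
```

For the uniform string the construction returns $\rho_j = 1$ on every segment, which is a convenient sanity check before applying it to measured nodes.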
This is the approximation in the specific case $p \equiv 1$, and the end conditions (14.4.7). If the end conditions are $v(0) = 0 = v(1)$, then equation (14.4.8) holds for $j = 1, n$ also. Hald and McLaughlin present similar algorithms to compute approximations to $p$ and $\rho$ in other cases, and for the Sturm-Liouville potential $q$. For the rod equation, in which $p = \rho^2$, they first find a potential $q$, and then reverse the transformation leading to (10.1.14) to find $A(x)$. They also point out a fundamental difference between the nodes of the string equation (10.1.1) and the rod equation (10.1.3) or the Sturm-Liouville equation (10.1.14): a perturbation of $\rho$ in the string equation may cause (relatively) large changes in the nodal positions; by contrast a perturbation in $A(x)$ or $q$ may cause only minuscule changes in the nodes of a high mode. They comment, ". . . the information in the nodal positions which we use to approximate the. . . impedance function ($A(x)$) sits much deeper in the data than the information about the density (of the string)." These remarks concern the first part of the solution, the approximate construction. The greater part of Hald and McLaughlin (1989) [167] concerns error bounds and uniqueness theorems. For the simple case of the string with free
ends, for instance, they show that the procedure outlined above gives a second order approximation to the density at the mid points of the interior intervals, i.e., $\rho((x_j + x_{j-1})/2) = \rho_j$ but only a first order approximation for the two end intervals. They give precise error bounds which show how the rate of convergence with increasing $n$ depends on the smoothness of the density. They present numerous case studies here and also in Hald and McLaughlin (1988) [166]. The uniqueness results that they obtain are generalisations of those found in McLaughlin (1988) [231]; a typical one is

Theorem 14.4.2 Let $p \equiv 1$, and suppose that the second derivative of $\rho$ is integrable. Then $\rho$ is uniquely determined (up to a multiplicative constant) by any dense set of nodes.

In Hald and McLaughlin (1998) [169] they return to the inverse nodal problem and develop a theory governing approximation, uniqueness and error bounds for (14.4.6), subject to Dirichlet end conditions, when $p$ and $\rho$ are functions of bounded variation.

Hald and McLaughlin (1996) [168] deal with inverse nodal problems for nonuniform rectangular membranes. Space does not allow us to consider these problems. We simply note that they pose two difficulties. Consider a uniform rectangular membrane with sides $a, b$ vibrating with fixed edges. Its eigenvalues are
$$\lambda = \omega^2 = \pi^2\left(\frac{m^2}{a^2} + \frac{n^2}{b^2}\right).$$
If $\alpha = a/b$ and $\alpha^2$ is rational, then there will be multiple eigenvalues, and one can find eigenvalues with multiplicity exceeding any stated number. If $\alpha^2$ is irrational, each eigenvalue will be distinct, but one can find two eigenvalues as close as one wishes. This closeness poses problems in the search for error bounds. The second difficulty relates to the shape of nodal domains, regions bounded by nodal lines. For the uniform rectangular membrane the nodal domains are themselves rectangles; the eigenfunction corresponding to $\lambda_{m,n}$ divides the rectangle into $mn$ equal rectangles, as shown in Figure 14.4.1a.
However, if the membrane density is perturbed from its uniform value, then the nodal domains may change dramatically, as shown in Figure 14.4.1b. This complicates the search for an approximation to the density; one would like to have a situation which is a generalisation of that for the string; the perturbed nodal domain is roughly a rectangle. One could then assume that the density was constant over that rectangle, and use the fact that the eigenfunction is the fundamental eigenfunction on the region bounded by the nodal lines. The major contribution of Hald and McLaughlin (1996) [168] is that they show how both these difficulties may be overcome, how one can find good approximations to the density, and how to obtain uniqueness theorems. McLaughlin (2000) [232] reconsiders inverse problems for a rectangular membrane. She considers three different
approaches to the problem: in the first, the data consists of mode shape level sets and frequencies; in the second, it consists of frequencies and boundary mode shape measurements; in the third, the data consists of frequencies for four different boundary value problems. Local existence and uniqueness results are established together with numerical results for approximate solutions.
Figure 14.4.1 - Nodal domains change from rectangles (a) to irregular figures (b).
Chapter 15
Damage Identification

Chance gives rise to thoughts, and chance removes them; no art can keep or acquire them. A thought has escaped me. I wanted to write it down. I write instead, that it has escaped me.
Pascal's Pensées, 585
15.1
Introduction
As we mentioned in the Preface, the identification of damage in a vibrating structure from changes in vibratory behaviour is an inverse problem in a loose interpretation of the term. Since such damage identification has potentially important practical value, it is appropriate for it to be included in any treatment of inverse problems but, since it is essentially an application of inverse techniques and must be combined with numerical methods, it is only of marginal relevance in this book which, as we stated in the Preface, is concerned primarily with theoretical and qualitative matters. We will therefore confine our remarks in this Chapter to a survey of the literature, and an examination of the methods used, the assumptions that are made, and the conclusions that may be drawn regarding damage identification in certain simple cases.

We begin our discussion with some statements that may be grasped intuitively: If a structure is damaged, its vibratory behaviour will change. By vibratory behaviour we mean the response of a structure to time-varying forces. We will assume that the structure is undamped so that we may speak about frequency response, the response of the structure to sinusoidal forces with a specific frequency $\omega$. As usual, we focus our attention on the natural frequencies and corresponding principal mode shapes of the structure; these may be obtained (at least in theory) by applying standard modal analysis techniques to the frequency responses at various points of the structure. Thus, we make the following statement:
The vibratory behaviour of a structure may be characterised by its natural frequencies and corresponding principal mode shapes. On the other hand: Structural damage may be characterised by its locations, intensities and types; we thus refer to a damage pattern. Strictly speaking, a structure is said to be damaged when it undergoes a change that reduces its stiffness, or more generally reduces its strain energy. Under this definition, damage reduces the natural frequencies of the structure, or at least does not increase them. We shall loosen this definition and define damage as a (small) change in the structure; this would include (positive or negative) changes in stiffness or in mass.

Now consider the 'simple' forward problem: Given a specified structure, find the changes in vibratory behaviour brought about by specified damage. The solution of this forward problem depends critically on there being models for the undamaged and damaged structures, from which the natural frequencies and mode shapes may be extracted using established methods. It is known that, under certain conditions, this problem may be well posed: specific damage will cause a unique set of changes to the natural frequencies and mode shapes; and these changes will be continuous functions of the damage parameters. However, almost all the inverse problems, in which one tries to find the damage (i.e., its locations, intensities and types) which gave rise to specified behavioural changes, are ill-posed. Specifically, there may be no damage pattern (whether damage is interpreted strictly or loosely) that would give rise to a certain set of behavioural changes; or there may be more than one pattern that would produce the same set of behavioural changes; and there is no guarantee that the damage parameters will be continuous functions of the behavioural changes.
The fact that there may be more than one damage pattern giving rise to specific changes in natural frequencies is a consequence of the fact that natural frequencies are global constructs - they depend on the complete structure, its distribution of mass and stiffness, and the way in which it is supported. It is sometimes possible to identify a damage pattern because a specific damage pattern will affect different frequencies by differing amounts. We may make this statement precise. Suppose it is known that a structure is damaged just at one location, but the location, $s$, and magnitude, $d$, are unknown. Generally, for small $d$, the change in the $i$th frequency, $\delta\omega_i$, will have the form
$$\delta\omega_i = d \cdot f_i(s): \tag{15.1.1}$$
it depends linearly on $d$, and non-linearly on the position. Thus, if the position, $s$, is known, then the change, $\delta\omega_i$, in one frequency, may be enough to determine $d$ (provided that $f_i(s)$ is known). If $s$ is unknown, then we consider the ratio of the changes to two frequencies:
$$\frac{\delta\omega_i}{\delta\omega_j} = \frac{f_i(s)}{f_j(s)}. \tag{15.1.2}$$
Thus, if the form of $f_i(s)$ is known as a function of $s$, then it may be possible to find the value of $s$ corresponding to a given value of $\delta\omega_i/\delta\omega_j$. In any particular case, it will be necessary to determine whether there is no, one, or more than one value(s) of $s$ satisfying (15.1.2). If (one or more values of) $s$ is known, then $d$ may be found from (15.1.1). Clearly, if the damage is not restricted to one location, then the identification procedure will be very complicated. We divide our discussion into two parts: damage identification in rods, and in beams.
15.2
Damage identification in rods
For a rod in longitudinal vibration, we model damage as a crack that stays open; following Freund and Herrmann (1976) [91] or Cabib, Freddi, Morassi and Percivale (2001) [47] we model such a crack as a longitudinal spring of stiffness $k$, and write $1/k = d$. In one of the early papers, Adams, Cawley, Pye and Stone (1978) [2] (see also Cawley and Adams (1979) [50]; and Hearn and Testa (1991) [170] for references to engineering studies) considered a damaged one-dimensional system (a generalised rod) modelled as two parts $B$ and $C$, linked by a spring of stiffness $k$. If $\beta_{ss} := \beta_{ss}(s, \omega)$ and $\gamma_{ss} := \gamma_{ss}(s, \omega)$ are direct receptances (Bishop and Johnson (1960) [34]) of $B$ and $C$ at $x = s$, then the usual receptance analysis gives the frequency equation of the damaged system as
$$\beta_{ss}(s, \omega) + \gamma_{ss}(s, \omega) + d = 0.$$
Thus, if $\omega_m = \omega_m^0 + \delta\omega_m$, $\omega_n = \omega_n^0 + \delta\omega_n$, where $\omega_m^0, \omega_n^0$ are undamaged frequencies, then
$$\beta_{ss}(s, \omega_m) + \gamma_{ss}(s, \omega_m) = \beta_{ss}(s, \omega_m^0) + \gamma_{ss}(s, \omega_m^0) + \delta\omega_m \frac{\partial}{\partial\omega}\{\beta_{ss}(s, \omega) + \gamma_{ss}(s, \omega)\}\Big|_{\omega = \omega_m^0}.$$
The sum of the first two terms on the right is zero because $\omega_m^0$ is a natural frequency of the undamaged system ($d = 0$). Thus,
$$\delta\omega_m \frac{\partial}{\partial\omega}\{\beta_{ss}(s, \omega) + \gamma_{ss}(s, \omega)\}\Big|_{\omega = \omega_m^0} + d = 0$$
which may be rearranged in the form (15.1.1). Narkis (1994) [247] used this approach for a uniform free-free rod. For a general rod, the perturbation analysis of Section 14.3 shows that if $\lambda = \omega^2$ then, with $\varepsilon = d$ and $\mu$ denoting the first order coefficient in $\lambda = \lambda_0 + \varepsilon\mu$,
$$\delta\lambda_m = \varepsilon\mu = \mu d = -(A(s)u_m'(s))^2 d. \tag{15.2.1}$$
Morassi (2001) [237] made extensive use of this result. He showed that the problem of determining the location $s$ from changes in two natural frequencies is generally ill-posed: if the system is symmetrical, then damage at any one of a
set of symmetrical points will produce identical changes in natural frequencies. Even if the system is not symmetrical, damage at different locations can still produce identical changes in two natural frequencies. Morassi (2003) [238] obtains particular results for uniform rods under various end conditions and determines situations in which the knowledge of $\delta\lambda_m, \delta\lambda_n$ does, and does not, uniquely determine the location $s$. Thus, for example, for a rod under free end conditions, he defines
$$C_m^F = -\frac{\delta\lambda_m^F}{2m^2\pi^2}.$$
The $m$th ($m \ge 1$) mode shape is $u_m(x) = \sqrt{2}\cos(m\pi x)$ so that (15.2.1) gives
$$\delta\lambda_m^F = -2m^2\pi^2\sin^2(m\pi s)\,d$$
so that
$$C_m^F = d\sin^2(m\pi s).$$
Now use the trigonometric identities to deduce that
$$\sin^2(2m\pi s) = (2\sin m\pi s \cdot \cos m\pi s)^2 = 4\sin^2 m\pi s - 4\sin^4 m\pi s$$
and hence
$$d(4C_m^F - C_{2m}^F) = 4(C_m^F)^2$$
so that the amount and location of the damage are given by
$$d = \frac{C_m^F}{1 - C_{2m}^F/(4C_m^F)}, \qquad \sin^2 m\pi s = \frac{C_m^F}{d}.$$
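The two formulas above can be exercised on synthetic data. The sketch below assumes the first order shifts $\delta\lambda_m^F = -2m^2\pi^2\sin^2(m\pi s)\,d$ for the uniform free-free rod and inverts them for $d$ and the candidate positions; the function names are hypothetical, and the list of candidates makes the symmetry ambiguity explicit.

```python
import math


def free_rod_shift(d, s, m):
    """First-order change in the m-th eigenvalue of a uniform free-free rod, from (15.2.1)."""
    return -2.0 * (m * math.pi) ** 2 * math.sin(m * math.pi * s) ** 2 * d


def damage_from_two_shifts(dl_m, dl_2m, m):
    """Recover d and the candidate locations s from the shifts of modes m and 2m.

    Assumes dl_m is non-zero, i.e. the damage is not at a point where the
    strain of mode m vanishes.
    """
    c_m = -dl_m / (2.0 * (m * math.pi) ** 2)
    c_2m = -dl_2m / (2.0 * (2 * m * math.pi) ** 2)
    d = c_m / (1.0 - c_2m / (4.0 * c_m))
    sin2 = min(1.0, max(0.0, c_m / d))          # guard against rounding
    theta = math.asin(math.sqrt(sin2))
    candidates = set()
    for j in range(m + 1):                       # all s in (0, 1) with sin^2(m pi s) = sin2
        for t in (j * math.pi + theta, j * math.pi - theta):
            s = t / (m * math.pi)
            if 0.0 < s < 1.0:
                candidates.add(round(s, 12))
    return d, sorted(candidates)


if __name__ == "__main__":
    d_true, s_true, m = 0.01, 0.3, 1
    shifts = free_rod_shift(d_true, s_true, m), free_rod_shift(d_true, s_true, 2 * m)
    print(damage_from_two_shifts(*shifts, m))    # d and the symmetric pair of positions
```

With $m = 1$ the recovered positions are $s$ and $1 - s$, which illustrates why the location is determined only up to the symmetry of the rod.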
Similarly, he shows that $d$ and $\sin^2 m\pi s$ may be uniquely determined from $C_{m+1}^F$ and $C_m^S$, defined as
$$C_{m+1}^F = -\frac{\delta\lambda_{m+1}^F}{2(m+1)^2\pi^2}, \qquad C_m^S = -\frac{\delta\lambda_m^S}{2m^2\pi^2} \tag{15.2.2}$$
where the superscript $S$ refers to the rod when it is supported at both ends, so that $\delta\lambda_m^S$ is the change in its $m$th eigenvalue. (Ex. 15.2.1). Morassi and Dilena (2002) [239] analyse the analogous problem of determining the magnitude and location of a point mass attached to a thin rod from its effect on the natural frequencies.

Morassi (1997) [236] sets up the problem of crack detection in a rod as an inverse problem in the spirit of Chapter 11, following Hald (1984) [165]. He shows that the position of the crack is uniquely determined from the asymptotic form of the spectrum. Biscontin, Morassi and Wendel (1998) [32] study the asymptotic form of the spectrum for a uniform free-free rod of unit length with a spring of stiffness $k$ at $x = c$. The eigenvalues $\lambda$ (with $\lambda^2 = \omega^2$, as in Ex. 15.2.2) are the roots of
$$p(\lambda) \equiv \lambda\sin\lambda c\,\sin\lambda(1 - c) - k\sin\lambda = 0. \tag{15.2.3}$$
We can study two kinds of asymptotics, for $k$ large or $k$ small. For $k \to \infty$ the two parts of the rod are firmly joined together: the rod is an undamaged free-free rod with eigenvalues $\lambda_m = m\pi$, $m = 0, 1, 2, \ldots$ For large $k$, i.e., small $\varepsilon = 1/k$, the $m$th eigenvalue is $\lambda_m = m\pi + \varepsilon\mu_m$ where (Ex. 15.2.2)
$$\mu_m = (-)^m m\pi\sin m\pi c\,\sin m\pi(1 - c).$$
This is the kind of small change that we have been observing in the analysis above. For small $k$, the asymptotic form is centred about $k = 0$; now the rod splits into two free-free rods, one of length $c$, the other $1 - c$; there are two branches
$$\lambda_{m_1} = m_1\pi/c, \qquad \lambda_{m_2} = m_2\pi/(1 - c).$$
We now perturb these branches and seek eigenvalues of the form
$$\lambda = \lambda_{m_1} + k\nu_{m_1}, \qquad \lambda = \lambda_{m_2} + k\nu_{m_2}$$
and find (Ex. 15.2.2), to first order, that
$$\nu_{m_1} = 1/(m_1\pi), \qquad \nu_{m_2} = 1/(m_2\pi).$$
This gives the asymptotic form of the two branches as
$$\lambda_{m_1} = \frac{m_1\pi}{c} + \frac{k}{m_1\pi} + o\left(\frac{1}{m_1}\right), \qquad \lambda_{m_2} = \frac{m_2\pi}{1 - c} + \frac{k}{m_2\pi} + o\left(\frac{1}{m_2}\right).$$
Biscontin et al. found experimental evidence of two such branches in some steel rods.

Our discussion so far has focused on the identification of damage from its effect on natural frequencies. We discussed the effect on nodal positions, for a rod, in Section 14.3. This is essentially a qualitative result which could be a useful adjunct in an experimental/numerical study, see Gladwell and Morassi (1999) [128]. Wu and Fricke (1989) [335], Wu and Fricke (1990) [336] and Wu and Fricke (1991) [337] discuss the problem of finding one or more small blockages in a uniform duct by using measured eigenfrequency shifts.

Exercises 15.2

1. Consider a uniform rod of unit length under supported ($S$) and free ($F$) end conditions. Define $C_{m+1}^F, C_m^S$ as in (15.2.2). Show that if the damage, $d$, is located at $x = s$, then
$$d = C_{m+1}^F + C_m^S, \qquad \cos[2(m+1)\pi s] = -1 + \frac{2}{1 + C_{m+1}^F/C_m^S}.$$

2. Set up the eigenvalue equation (15.2.3) for the uniform free-free rod, governed by the equation
$$u'' + \lambda^2 u = 0, \qquad u'(0) = 0 = u'(1),$$
when there is a spring of stiffness $k$ connecting the parts to the left and right of $x = c$ (see equation (14.3.5)). Establish the asymptotic forms for the eigenvalues for small and large $k$.
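The two asymptotic regimes of (15.2.3) in Section 15.2 can be explored numerically. The sketch below takes the frequency equation in the form $\lambda\sin\lambda c\,\sin\lambda(1-c) - k\sin\lambda = 0$ and locates roots by plain bisection; the bracket widths, the sample values of $c$ and $k$, and all function names are assumptions chosen for illustration.

```python
import math


def p(lam, c, k):
    """Left-hand side of the frequency equation (15.2.3)."""
    return lam * math.sin(lam * c) * math.sin(lam * (1.0 - c)) - k * math.sin(lam)


def bisect(f, a, b, tol=1e-13):
    """Root of f in [a, b] by bisection; f(a) and f(b) must differ in sign."""
    fa, fb = f(a), f(b)
    if fa * fb > 0.0:
        raise ValueError("root is not bracketed")
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0.0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


def large_k_root(m, c, k):
    """Root just below m*pi when the spring is stiff (small epsilon = 1/k)."""
    return bisect(lambda lam: p(lam, c, k), m * math.pi - 0.05, m * math.pi)


def small_k_root(m1, c, k):
    """Root just above m1*pi/c when the spring is weak."""
    lam0 = m1 * math.pi / c
    return bisect(lambda lam: p(lam, c, k), lam0, lam0 + 0.02)


if __name__ == "__main__":
    c = 0.3
    for m in (1, 2, 3):
        mu = (-1) ** m * m * math.pi * math.sin(m * math.pi * c) * math.sin(m * math.pi * (1 - c))
        print(m, large_k_root(m, c, 1000.0) - m * math.pi, mu / 1000.0)
    print(small_k_root(1, c, 0.01) - math.pi / c, 0.01 / math.pi)
```

Each printed pair compares the computed shift of a root with its first order estimate, $\varepsilon\mu_m$ for the stiff spring and $k/(m_1\pi)$ for the weak one.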
15.3
Damage identification in beams
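A number of early papers, including Cawley and Adams (1979) [50], Hearn and Testa (1991) [170] used a sensitivity analysis based on the general discrete equation
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{u} = 0. \tag{15.3.1}$$
Suppose the stiffness and mass matrices are perturbed to $\mathbf{K} + \delta\mathbf{K}$, $\mathbf{M} + \delta\mathbf{M}$, respectively, and the solution is $\mathbf{u} + \delta\mathbf{u}$, $\lambda + \delta\lambda$. Then
$$(\mathbf{K} + \delta\mathbf{K} - (\lambda + \delta\lambda)(\mathbf{M} + \delta\mathbf{M}))(\mathbf{u} + \delta\mathbf{u}) = 0$$
and, to first order, this is
$$(\mathbf{K} - \lambda\mathbf{M})\mathbf{u} + (\mathbf{K} - \lambda\mathbf{M})\delta\mathbf{u} + (\delta\mathbf{K} - \lambda\delta\mathbf{M})\mathbf{u} - \delta\lambda\,\mathbf{M}\mathbf{u} = 0$$
so that on premultiplying by $\mathbf{u}^T$ and using (15.3.1) and $\mathbf{u}^T\mathbf{M}\mathbf{u} = 1$, we find
$$\delta\lambda = \mathbf{u}^T\delta\mathbf{K}\mathbf{u} - \lambda\mathbf{u}^T\delta\mathbf{M}\mathbf{u}.$$
In particular, if there is only a change in the stiffness of the structure, then
$$\delta\lambda = \mathbf{u}^T\delta\mathbf{K}\mathbf{u}. \tag{15.3.2}$$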
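A small numerical illustration of (15.3.2) may help. The sketch below uses a hypothetical fixed-free chain of five unit masses and unit springs (so $\mathbf{M} = \mathbf{I}$), not any model from the cited papers: it computes the first order eigenvalue changes for a unit loss of stiffness in each spring in turn, and then picks the spring whose sensitivity pattern best matches, in the least squares sense, the observed changes.

```python
import numpy as np


def chain_stiffness(k):
    """Stiffness matrix of a fixed-free chain; spring j joins mass j-1 (ground if j = 0) to mass j."""
    n = len(k)
    K = np.zeros((n, n))
    for j, kj in enumerate(k):
        e = np.zeros(n)
        e[j] = 1.0
        if j > 0:
            e[j - 1] = -1.0
        K += kj * np.outer(e, e)
    return K


def first_order_shifts(K, dK):
    """delta lambda_i = u_i^T dK u_i, with M = I so that u_i^T M u_i = 1."""
    _, U = np.linalg.eigh(K)
    return np.array([U[:, i] @ dK @ U[:, i] for i in range(K.shape[0])])


def locate_damaged_spring(k, observed):
    """Spring index and stiffness loss that best fit the observed eigenvalue changes."""
    K = chain_stiffness(k)
    best = None
    for j in range(len(k)):
        unit_loss = np.zeros(len(k))
        unit_loss[j] = -1.0                      # unit reduction of spring j
        sens = first_order_shifts(K, chain_stiffness(unit_loss))
        amount = sens @ observed / (sens @ sens)  # least squares magnitude
        residual = np.linalg.norm(observed - amount * sens)
        if best is None or residual < best[2]:
            best = (j, amount, residual)
    return best[0], best[1]


if __name__ == "__main__":
    k = np.ones(5)
    k_damaged = k.copy()
    k_damaged[2] -= 1e-3
    observed = np.linalg.eigvalsh(chain_stiffness(k_damaged)) - np.linalg.eigvalsh(chain_stiffness(k))
    print(locate_damaged_spring(k, observed))
```

The chain is deliberately unsymmetrical (fixed at one end, free at the other); for a symmetrical structure two springs would produce the same pattern of changes and the fit could not distinguish them.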
This equation shows that if the changes in $\mathbf{K}$ are known, then the changes in natural frequencies may be found. One way to solve the inverse problem is to compute the changes in the various frequencies produced by changes in each element of a finite element model of the structure, in turn, and then determine which element change yields a set of frequency changes closest (either by inspection or in some least squares sense) to that found or specified. Cawley and Adams (1979) [50], Yuen (1985) [341], Morassi and Rovere (1997) [241], Vestroni and Capecchi (1996) [327], Vestroni and Capecchi (2000) [328] follow this general approach. See Shen and Taylor (1991) [304] for a careful and detailed engineering study of the problem, treated in a least squares form. See also Liang, Hu and Choy (1992a) [213], Liang, Hu and Choy (1992b) [214], Davini, Gatti and Morassi (1993) [72], Cerri and Vestroni (2000) [51], and Capecchi and Vestroni (1999) [48]. There are a number of papers that discuss, in many different ways, how a crack in a flexurally vibrating beam should be modelled, including Freund and Herrmann (1976) [91], Chondros and Dimarogonas (1980) [55], Gudmundson (1982) [157], Christides and Barr (1984) [54], Shen and Pierre (1990) [303], Rizos, Aspragathos and Dimarogonas (1990) [291], Ostachowicz and Krawczuk
(1991) [255], Chondros, Dimarogonas and Yao (1998) [56], and other papers cited in these. The simplest model of a crack is a rotational spring of stiffness $k$; see Chondros and Dimarogonas (1980) [55], Narkis (1994) [247], or Boltezar, Strancar and Kuhelj (1998) [38]. All these researchers approach the problem in their own ways; typically Boltezar et al. set up the frequency equation for a uniform beam with a crack, modelled as a rotational spring of stiffness $k$, at an interior location $R$, and find the value of $R$ that will yield the same stiffness $k$ deduced from the measured (actually numerically predicted) changes in the first six natural frequencies. See Wu (1994) [338] for a different approach, and Natke and Cempel (1991) [248] for a review of the subject.

Following Morassi (1993) [235] we set up a perturbation analysis for a beam with a rotational spring of stiffness $k$ at location $s$, when $d := 1/k = \varepsilon$ is small. We follow the lines laid out for the rod in Section 13.1. The beam is governed by equation (13.1.4), the end conditions (13.1.12), (13.1.13), and the jump conditions at $x = s$:
$$[u] = 0 = [ru''] = [(ru'')'], \qquad r(s)u''(s) = k[u'],$$
where, as usual, $[f] := f(s+) - f(s-)$. Writing
$$u(x) = u_0(x) + \varepsilon v(x), \qquad \lambda = \lambda_0 + \varepsilon\mu,$$
in (13.1.4) we find
$$(ru_0'')'' = \lambda_0 a u_0, \tag{15.3.1}$$
$$(rv'')'' = \lambda_0 a v + \mu a u_0, \tag{15.3.2}$$
where both $u_0$ and $v$ satisfy the end conditions (13.1.12), (13.1.13), and $v$ satisfies the jump conditions
$$[v] = 0 = [rv''] = [(rv'')'], \qquad [v'] = r(s)u_0''(s).$$
Multiplying (15.3.2) by $u_0$, (15.3.1) by $v$, subtracting and integrating over (0,1), using the end and jump conditions, we find (Ex. 15.3.1) that
$$\mu = -(r(s)u_0''(s))^2, \tag{15.3.3}$$
so that the change in the $m$th natural frequency is
$$\delta\lambda_m = -(r(s)u_m''(s))^2 d. \tag{15.3.4}$$
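As a simple instance of (15.3.4), consider a uniform simply supported beam. The sketch below assumes $r = a = 1$ and the normalised modes $u_m = \sqrt{2}\sin(m\pi x)$, so that $u_m'' = -\sqrt{2}(m\pi)^2\sin(m\pi x)$ and $\delta\lambda_m = -2m^4\pi^4\sin^2(m\pi s)\,d$; these modelling choices and the function names are illustrative assumptions. The inversion mirrors the rod case: the shifts of modes $m$ and $2m$ give $d$ and $\sin^2 m\pi s$.

```python
import math


def beam_shift(d, s, m):
    """First-order eigenvalue change from (15.3.4) for r = a = 1, u_m = sqrt(2) sin(m pi x)."""
    return -2.0 * (m * math.pi) ** 4 * math.sin(m * math.pi * s) ** 2 * d


def severity_and_sin2(dl_m, dl_2m, m):
    """Recover d and sin^2(m pi s) from the shifts of modes m and 2m (dl_m must be non-zero)."""
    c_m = -dl_m / (2.0 * (m * math.pi) ** 4)          # equals d sin^2(m pi s)
    c_2m = -dl_2m / (2.0 * (2 * m * math.pi) ** 4)    # equals d sin^2(2 m pi s)
    d = c_m / (1.0 - c_2m / (4.0 * c_m))
    return d, c_m / d


if __name__ == "__main__":
    d_true, s_true, m = 0.02, 0.35, 1
    d, sin2 = severity_and_sin2(beam_shift(d_true, s_true, m), beam_shift(d_true, s_true, 2 * m), m)
    print(d, sin2, math.sin(m * math.pi * s_true) ** 2)
    # Damage at s and at 1 - s gives the same shift: the symmetry ambiguity.
    print(beam_shift(d_true, s_true, 3), beam_shift(d_true, 1.0 - s_true, 3))
```

Because only $\sin^2 m\pi s$ is recovered, positions that are mirror images in the midpoint of the beam cannot be told apart from these two shifts.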
Morassi (1993) [235] noted that this shows that the change in $\lambda_m$ $(= \omega_m^2)$ is proportional to the potential energy stored at location $s$ in the undamaged beam; also, it is proportional to the square of the curvature of the undamaged beam at $s$. Morassi (2003) [238] uses (15.3.4) just as he used the corresponding equation (15.2.1) for the rod. He shows for instance that the severity and location of the damage in a uniform simply-supported beam is uniquely determined (except for symmetry) by the changes in the $m$th and $2m$th frequencies. An alternative identification is provided by the changes in the $m$th frequency of the beam under
simply supported boundary conditions and the $(m+1)$th frequency of the beam under sliding-sliding end conditions. See Ex. 15.3.2. Clearly, he uses these conditions because they are the only ones for which the modes are simple sines or cosines; in the general case the modes involve both sinusoidal and hyperbolic terms. The procedure could easily be generalised to a consideration of the changes of frequencies under other end conditions. Morassi and Rollo (2001) [240] use (15.3.4) to estimate the positions of two cracks in a simply supported beam.

There have been a few papers devoted to damage identification from other effects, namely curvature, mode shape and nodal positions. Thus, Pandey, Biswas and Samman (1991) [260] noted that the curvature of a principal mode of a damaged beam increased in a region localised about the damaged zone; this was different from simply the change in a mode shape, which generally was not localised about the damaged zone, see Rizos, Aspragathos and Dimarogonas (1990) [291]. Pandey et al. used this curvature effect to locate damage.

Dilena and Morassi (2002) [80] and Dilena (2003) [78] make a systematic study to see whether the conclusion of Gladwell and Morassi (1999) [128], for a rod, that nodes move toward the damage location, could be generalised to apply to a flexurally vibrating beam. The result for the rod follows from Sturm's theorems, as described in Section 14.3. The vibration modes of a beam are governed by the fourth order equation (13.1.4), and not by the simple second order equation (14.3.2) for the rod. As Leighton and Nehari (1958) [206] showed in their massive authoritative discussion of oscillatory properties of fourth order equations, there are no simple extensions of Sturm's results to such equations. There are points, called conjugate points, and it may be proved that conjugate points move toward damage, but conjugate points have no clear physical interpretation.
To corroborate this conclusion, Dilena and Morassi (2002) [80] found counterexamples: nodes do not always move toward the damage location. The simplest counterexample is shown in Figure 15.3.1, adapted from Dilena and Morassi (2002) [80]. Figure 15.3.1(a) shows the first (proper) bending mode (i.e., mode 3) of a free-free beam. It has two nodes, labelled 1 and 2. Figure 15.3.1(b) shows the sign of the change in position of node 1 due to damage of magnitude d (measured in some way) at position s. We note that the sign depends almost entirely on s. Node 1 is at roughly 0.2; the figure shows that if s < 0.41 then the node moves to the left (negative), while if s > 0.41 it moves to the right (positive). That means that when s is in (0.2, 0.41) the node moves the ‘wrong’ way. Figure 15.3.1(c) shows that there is a similar interval near the second node in which damage causes the node to move the ‘wrong’ way. For a corresponding axially vibrating rod the sign change would occur at the node: if the damage is to the left of the (undamaged rod) node, the node will move left (negative); if it is to the right, the node will move right (positive). For a beam it appears that one may state that damage ‘far away’ from a node causes the node to move toward the damage. Dilena and Morassi (2002) [79] extend their analysis to higher modes, and find that there is a difference in the effects of damage on the so-called external and internal nodes; an external node is an extreme node, one nearest an end of the rod. They complement their study with experimental tests.
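The rod behaviour described above, in which a node moves toward the damage, can be checked numerically on a small model. The sketch below is an illustration under stated assumptions, not the analysis of the cited papers: it takes a uniform free-free rod of unit length, models the damage as a linear spring of dimensionless flexibility d at position s (the displacement jumps by d*u'(s) while the axial force stays continuous), and tracks the single node of the first non-rigid mode. The spring model and the function names are assumptions introduced here. Under this model the frequency equation is sin(beta) = d*beta*sin(beta*s)*sin(beta*(1 - s)), which reduces to sin(beta) = 0 when d = 0.

```python
import math


def damaged_wavenumber(s, d, tol=1e-12):
    """Root beta in [pi/2, pi] of sin(b) = d*b*sin(b*s)*sin(b*(1-s)).

    Assumed model: free-free uniform rod of unit length with a spring-like
    crack of small dimensionless flexibility d at position s.
    """
    def f(b):
        return math.sin(b) - d * b * math.sin(b * s) * math.sin(b * (1.0 - s))

    # For small d, f(pi/2) > 0 and f(pi) <= 0, so bisection brackets the root.
    lo, hi = 0.5 * math.pi, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def node_position(s, d):
    """Node of the first non-rigid mode of the damaged rod.

    Left of the crack the mode is cos(beta*x); right of it, cos(beta*(1-x)).
    Assumes s is not within the (small) node shift of the undamaged node 0.5.
    """
    beta = damaged_wavenumber(s, d)
    left_node = math.pi / (2.0 * beta)         # zero of cos(beta*x)
    right_node = 1.0 - math.pi / (2.0 * beta)  # zero of cos(beta*(1-x))
    # Keep the candidate that lies on its own side of the crack.
    return left_node if left_node < s else right_node


if __name__ == "__main__":
    for s in (0.3, 0.7):
        print(s, node_position(s, 0.05))
```

In this model the node leaves its undamaged position 0.5 and moves left when s = 0.3 and right when s = 0.7, that is, toward the damage in both cases, because the damage lowers beta below pi.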
[Figure 15.3.1: panel (a) shows the mode with nodes 1 and 2; panels (b) and (c) have axes labelled d and s, with ticks from 0.0 to 1.0.]
Figure 15.3.1 - When the damage is in the unshaded (shaded) region the node moves to the left (right).

Exercises 15.3

1. Derive equation (15.3.4) for the change in eigenvalue due to damage d (= 1/k) at s.
2. Find the change in the pth eigenvalue of a simply supported, and of a sliding-sliding, uniform beam brought about by a rotational spring of stiffness k at x = s.
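As a numerical companion to the frequency-based approach of this section, the sketch below evaluates first-order eigenvalue changes for a uniform beam of length L carrying a rotational spring of stiffness k (flexibility d = 1/k) at x = s. It assumes a first-order perturbation estimate of the form (change in eigenvalue) = -d*(EI*u''(s))^2 for a mass-normalised mode u; this may differ in notation or normalisation from (15.3.4), and the function names are hypothetical. For the simply supported modes sin(m*pi*x/L) and the sliding-sliding modes cos(m*pi*x/L), the estimate gives fractional changes -(2*EI*d/L)*sin^2(m*pi*s/L) and -(2*EI*d/L)*cos^2(m*pi*s/L), so their ratio is tan^2(m*pi*s/L) and locates the damage without knowledge of d.

```python
import math


def fractional_shift_supported(m, s, d, EI=1.0, L=1.0):
    """First-order fractional eigenvalue change, simply supported beam.

    Mode m has shape sin(m*pi*x/L); d is the rotational flexibility 1/k.
    """
    return -(2.0 * EI * d / L) * math.sin(m * math.pi * s / L) ** 2


def fractional_shift_sliding(m, s, d, EI=1.0, L=1.0):
    """Same estimate for the sliding-sliding beam.

    The elastic mode cos(m*pi*x/L) is the (m+1)th mode, because the
    sliding-sliding beam also has a rigid-body mode.
    """
    return -(2.0 * EI * d / L) * math.cos(m * math.pi * s / L) ** 2


def locate_damage(m, shift_supported, shift_sliding, L=1.0):
    """Recover a damage position from the two (non-positive) shifts.

    Uses shift_supported / shift_sliding = tan^2(m*pi*s/L) and returns
    the candidate lying in [0, L/(2m)].
    """
    theta = math.atan2(math.sqrt(-shift_supported), math.sqrt(-shift_sliding))
    return L * theta / (m * math.pi)


if __name__ == "__main__":
    a = fractional_shift_supported(1, 0.3, 0.01)
    b = fractional_shift_sliding(1, 0.3, 0.01)
    print(a, b, locate_damage(1, a, b))
```

The recovered value is only one candidate: other positions, for example L - s, produce the same pair of shifts, so frequency data of this kind locate the damage only up to such symmetries.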
Bibliography

[1] Abrate, S. (1995) Vibration of non-uniform rods and beams. [47], 185, 703-716. 361.
[2] Adams, R.D., Cawley, P., Pye, C.J. and Stone, B.J. (1978) A vibration technique for non-destructively assessing the integrity of structures. [44], 20, 93-100. 419.
[3] Ambarzumian, V. (1929) Über eine Frage der Eigenwerttheorie. [92], 53, 690-695. 290.
[4] Ando, T. (1987) Totally positive matrices. [57], 90, 165-219. 120, 133, 137, 143, 143, 144, 145, 145, 169.
[5] Andersson, L.-E. (1970) On the effective determination of the wave operator in the case of a difference equation corresponding to a Sturm-Liouville differential equation. [41], 29, 467-497. 293.
[6] Andersson, L.-E. (1988a) Inverse eigenvalue problems with discontinuous coefficients. [29], 4, 353-397. 305, 319.
[7] Andersson, L.-E. (1988b) Inverse eigenvalue problems for a Sturm-Liouville equation in impedance form. [9], 4, 929-971. 305, 319.
[8] Andersson, L.-E. (1990) Algorithms for solving inverse eigenvalue problems for Sturm-Liouville equations, in Inverse Problems in Action, ed. P.C. Sabatier, Berlin: Springer. 335.
[9] Andrea, S.A. and Berry, T.G. (1992) Continued fractions and periodic Jacobi matrices. [57], 161, 117-134. 105.
[10] Andrew, A.L. and Paine, J.W. (1985) Correction of Numerov’s eigenvalue estimates. [68], 47, 289-300. 293.
[11] Andrew, A.L. and Paine, J.W. (1986) Correction of finite element estimates for Sturm-Liouville eigenvalues. [68], 50, 205-215. 293.
[12] Arbenz, P. and Golub, G.H. (1995) Matrix shapes invariant under the symmetric QR algorithm. [67], 2, 87-93. 170.
[13] Ashlock, D.A., Driessel, K.R. and Hentzel, I.R. (1997) On matrix structures invariant under Toda-like isospectral flows. [57], 254, 29-48. 180.
[14] Barcilon, V. (1974a) Iterative solution of the inverse Sturm-Liouville equation. [42], 15, 429-436. 293.
[15] Barcilon, V. (1974b) On the uniqueness of inverse eigenvalue problems. [24], 38, 287-298. 391.
[16] Barcilon, V. (1974c) On the solution of inverse eigenvalue problems of high orders. [24], 39, 143-154. 291, 391.
[17] Barcilon, V. (1974d) A note on a formula of Gel’fand and Levitan. [41], 48, 43-50. 362.
[18] Barcilon, V. (1976) Inverse problems for a vibrating beam. [36], 27, 346-358. 185, 392.
[19] Barcilon, V. (1978) Discrete analog of an iterative method for inverse eigenvalue problems for Jacobi matrices. [42], 29, 295-300. 71.
[20] Barcilon, V. (1979) On the multiplicity of solutions of the inverse problem for a vibrating beam. [82], 37, 605-613. 185.
[21] Barcilon, V. (1982) Inverse problems for the vibrating beam in the free-clamped configuration. [69], 304, 211-252. 185, 391, 392.
[22] Barcilon, V. (1983) Explicit solution of the inverse problem for a vibrating string. [41], 93, 222-234. 294, 362.
[23] Barcilon, V. and Turchetti, G. (1980) Extremal solutions of inverse eigenvalue problems with finite spectral data. [90], 2, 139-148. 65.
[24] Barcilon, V. (1990) Two-dimensional inverse eigenvalue problem. [29], 6, 11-20.
[25] Bellman, R. (1970) Introduction to Matrix Analysis. New York: McGraw-Hill. 131.
[26] Benade, A.H. (1976) Fundamentals of Musical Acoustics. London: Oxford University Press. 345.
[27] Berman, A. (1984) System identification of structural dynamic models - theoretical and practical bounds. [5], 84-0929, 123-129. 364.
[28] Biegler-König, F.W. (1980) Inverse Eigenwertprobleme. Dissertation, Bielefeld. 108.
[29] Biegler-König, F.W. (1981a) A Newton iteration process for inverse eigenvalue problems. [68], 37, 349-354. 108.
[30] Biegler-König, F.W. (1981b) Construction of band matrices from spectral data. [57], 40, 79-84. 108.
[31] Biegler-König, F.W. (1981c) Sufficient conditions for the solvability of inverse eigenvalue problems. [57], 40, 89-100. 108.
[32] Biscontin, G., Morassi, A. and Wendel, P. (1998) Asymptotic separation of the spectrum in notched rods. [53], 4, 237-251. 420.
[33] Bishop, R.E.D., Gladwell, G.M.L. and Michaelson, S. (1965) The Matrix Analysis of Vibration. Cambridge: Cambridge University Press. 12, 17, 19, 101.
[34] Bishop, R.E.D. and Johnson, D.C. (1960) The Mechanics of Vibration. Cambridge: Cambridge University Press. 19, 40, 84, 190, 389, 390, 392, 419.
[35] Boley, D. and Golub, G.H. (1984) A modified method for reconstructing periodic Jacobi matrices. [60], 42, 143-150. 103, 105.
[36] Boley, D. and Golub, G.H. (1987) A survey of matrix inverse eigenvalue problems. [29], 3, 595-622. 103, 105, 106, 108, 108.
[37] Bôcher, M. (1917) Leçons sur les méthodes de Sturm dans la théorie des équations différentielles linéaires et leurs développements modernes. Paris. 403.
[38] Boltezar, M., Strancar, B. and Kuhelj, A. (1998) Identification of transverse crack locations in flexural vibrations of free-free beams. [47], 211, 729-734. 423.
[39] Borg (1946) Eine Umkehrung der Sturm-Liouvilleschen Eigenwertaufgabe. [1], 78, 1-96. 290, 359.
[40] Braun, S.G. and Ram, Y.M. (1991) Predicting the effect of structural modification: Upper and lower bounds due to modal truncation. [27], 6, 199-211. 365.
[41] Brown, B.M., Samko, V.S., Knowles, I.W. and Marletta, M. (2003) Inverse spectral problem for the Sturm-Liouville equation. [29], 19, 235-252. 325.
[42] Bruckstein, A.M. and Kailath, T. (1987) Inverse scattering for discrete transmission-line models. [87], 29, 359-389. 334, 335, 343.
[43] Bube, K.P. and Burridge, R. (1983) The one-dimensional inverse problem of reflection seismology. [86], 25, 497-559. 334, 335.
[44] Burak, S. and Ram, Y.M. (2001) The construction of physical parameters from spectral data. [63], 15, 3-10. 367.
[45] Burridge, R. (1980) The Gel’fand-Levitan, the Marchenko, and the Gopinath-Sondhi integral equations of inverse scattering theory, regarded in the context of inverse impulse-response problems. [90], 2, 305-323. 293, 334.
[46] Busacker, R.G. and Saaty, T.L. (1965) Finite Graphs and Networks: an Introduction with Applications. New York: McGraw-Hill. 218.
[47] Cabib, E., Freddi, L., Morassi, A. and Percivale, D. (2001) Thin notched beams. [39], 64, 157-178. 419.
[48] Capecchi, D. and Vestroni, F. (1999) Monitoring of structural systems by using frequency data. [23], 28, 447-461. 422.
[49] Carrier, G.F., Krook, M. and Pearson, C.E. (1966) Functions of a Complex Variable. New York: McGraw-Hill. 389.
[50] Cawley, P. and Adams, R.D. (1979) The location of defects in structures from measurements of natural frequencies. [48], 14, 49-57. 419, 422, 422.
[51] Cerri, M.N. and Vestroni, F. (2000) Detection of damage in beams subjected to diffused cracking. [47], 234, 259-276. 422.
[52] Chadan, K. and Sabatier, P.C. (1989) Inverse Problems in Quantum Scattering. 2nd Ed. New York: Springer-Verlag. 334.
[53] Cheng, S.Y. (1976) Eigenfunctions and nodal sets. [13], 51, 43-55. 215.
[54] Christides, S. and Barr, A.D.S. (1984) One-dimensional theory of cracked Euler-Bernoulli beams. [28], 26, 639-648. 422.
[55] Chondros, T.G. and Dimarogonas, A.D. (1980) Identification of cracks in welded joints of complex structures. [47], 69, 531-538. 422, 423.
[56] Chondros, T.G., Dimarogonas, A.D. and Yao, J. (1998) A continuous cracked beam vibration theory. [47], 215, 17-34. 423.
[57] Chu, M.T. (1984) The generalized Toda flow, the QR algorithm and the center manifold theory. [81], 5, 187-201. 159.
[58] Chu, M.T. (1998) Inverse eigenvalue problems. [86], 40, 1-39. 108, 117.
[59] Chu, M.T. and Golub, G.H. (2002) Structured inverse eigenvalue problems. [2], 11, 1-71. 117.
[60] Chu, M.T. and Norris, L.K. (1988) Isospectral flows and abstract matrix factorizations. [85], 25, 1383-1391. 159.
[61] Coleman, C.F. (1989) Inverse Spectral Problem with a Rough Coefficient. Ph.D. Thesis. Rensselaer Polytechnic Institute, Troy, N.Y. 319.
[62] Coleman, C.F. and McLaughlin, J.R. (1993a) Solution of the inverse spectral problem for an impedance with integrable derivative I. [8], 46, 145-184. 305, 319, 346.
[63] Coleman, C.F. and McLaughlin, J.R. (1993b) Solution of the inverse spectral problem for an impedance with an integrable derivative II. [8], 46, 185-212. 305, 319, 346.
[64] Courant, R. and Hilbert, D. (1953) Methods of Mathematical Physics. Vol. 1, New York: Interscience. 48, 214, 240.
[65] Crum, M.M. (1955) Associated Sturm-Liouville systems. [76], 6, 121-127. 350.
[66] Cryer, C.W. (1973) The LU-factorization of totally positive matrices. [57], 7, 83-92. 168, 176.
[67] Cryer, C.W. (1976) Some properties of totally positive matrices. [57], 15, 1-25. 168.
[68] Dahlberg, B.E.J. and Trubowitz, E. (1984) The inverse Sturm-Liouville problem III. [16], 37, 255-267. 346.
[69] Darboux, G. (1882) Sur la représentation sphérique des surfaces. [17], 94, 1343-1345. 347.
[70] Darboux, G. (1915) Leçons sur la Théorie Générale des Surfaces et les Applications Géométriques du Calcul Infinitésimal. Vol. II. Paris: Gauthier-Villars. 347.
[71] Davies, E.B., Gladwell, G.M.L., Leydold, J. and Stadler, P.F. (2001) Discrete nodal domain theorems. [57], 336, 51-60. 223, 224.
[72] Davini, C., Gatti, F. and Morassi, A. (1993) A damage analysis of steel beams. [62], 28, 27-37. 422.
[73] Davini, C., Morassi, A. and Rovere, N. (1995) Modal analysis of notched bars: tests and comments on the sensitivity of an identification technique. [47], 179, 513-527.
[74] Davini, C. (1996) Note on a parameter lumping in the vibrations of uniform beams. [79], 28, 83-99. 37.
[75] de Boor, C. and Golub, G.H. (1978) The numerically stable reconstruction of a Jacobi matrix from spectral data. [57], 21, 245-260. 69.
[76] de Boor, C. and Saff, E.B. (1986) Finite sequences of orthogonal polynomials connected by a Jacobi matrix. [57], 75, 43-55. 68, 70.
[77] Deift, P., Nanda, T., and Tomei, C. (1983) Ordinary differential equations and the symmetric eigenvalue problem. [85], 20, 1-22. 159.
[78] Dilena, M. (2003) On damage identification in vibrating beams from changes in node positions, in Davini, C. and Viola, E. (Eds) Problems in Structural Identification and Diagnostics: General Aspects and Applications. New York: Springer. 424.
[79] Dilena, M. and Morassi, A. (2002) Identification of crack location in vibrating beams from changes in node positions. [47], 255, 915-930. 424.
[80] Dilena, M. and Morassi, A. (2002) The use of antiresonances for crack detection in beams. [47]. 424, 424, 424.
[81] Duarte, A.L. (1989) Construction of acyclic matrices from spectral data. [57], 113, 173-182. 110, 365.
[82] Duval, A.M. and Reiner, V. (1999) Perron-Frobenius type results and discrete versions of nodal domain theorems. [57], 294, 259-268. 223, 224.
[83] Eisner, E. (1976) Complete solution of the ‘Webster’ horn equation. [92], 41, 1126-1146. 345.
[84] El-Badia, A. (1989) On the uniqueness of a bi-dimensional inverse spectral problem. [18], 308, 273-276.
[85] Elhay, S., Gladwell, G.M.L., Golub, G.H. and Ram, Y.M. (1999) On some eigenvector-eigenvalue relations. [84], 20, 563-574. 276.
[86] Fekete, M. (1913) Über ein Problem von Laguerre. [78], 34, 89-100, 110-120. 133, 143.
[87] Ferguson, W.E. (1980) The construction of Jacobi and periodic Jacobi matrices with prescribed spectra. [60], 35, 1203-1220. 103.
[88] Fischer, E. (1905) Über quadratische Formen mit reellen Koeffizienten. [65], 16, 234-249. 48.
[89] Fix, G. (1967) Asymptotic eigenvalues of Sturm-Liouville systems. [41], 19, 519-525. 283.
[90] Forsythe, G.E. (1957) Generation and use of orthogonal polynomials for data fitting with a digital computer. [54], 5, 74-88. 54.
[91] Freund, L.B. and Herrmann, G. (1976) Dynamic fracture of a beam or plate in plane bending. [37], 76, 112-116. 419, 422.
[92] Friedland, S. (1977) Inverse eigenvalue problems. [57], 17, 15-51. 108.
[93] Friedland, S. (1979) The reconstruction of a symmetric matrix from the spectral data. [41], 71, 412-422. 108.
[94] Friedland, S. and Melkman, A.A. (1979) Eigenvalues of non-negative Jacobi matrices. [57], 25, 239-253. 68.
[95] Friedland, S., Nocedal, J., and Overton, M. (1987) The formulation and analysis of numerical methods for inverse eigenvalue problems. [85], 24, 634-667. 116.
[96] Friedman, J. (1993) Some geometric aspects of graphs and their eigenfunctions. [22], 69, 487-525. 224, 224.
[97] Gantmacher, F.R. (1959) The Theory of Matrices. New York: Chelsea Publishing Co. 18, 118, 122, 123, 133.
[98] Gantmakher, F.P. and Krein, M.G. (1950) Oscillation Matrices and Kernels and Small Vibrations of Mechanical Systems. 1961 Translation by U.S. Atomic Energy Commission, Washington, D.C. A revised edition was published in (2002) by AMS Chelsea Publishing, Providence, listing the first author as Gantmacher, not Gantmakher. 49, 63, 80, 118, 133, 133, 236.
[99] Gasca, M. and Peña, J.M. (1992) Total positivity and Neville elimination. [57], 165, 25-44. 138.
[100] Gel’fand, I.M. and Levitan, B.M. (1951) On the determination of a differential equation from its spectral function. (In Russian). [31], 15, 309-360. (In English). [54], 1, 253-304. 293.
[101] Gel’fand, I.M. and Levitan, B.M. (1953) On a simple identity for the characteristic values of a differential operator of the second order (in Russian). [21], 88, 593-596. 362.
[102] Gilbarg, D. and Trudinger, N.S. (1977) Elliptic Partial Differential Equations of Second Order. Berlin: Springer. 337.
[103] Gladwell, G.M.L. (1962) The approximation of uniform beams in transverse vibration by sets of masses elastically connected. Proceedings of the 4th U.S. Congress of Applied Mechanics, 169-176, New York: American Society of Mechanical Engineers. 38.
[104] Gladwell, G.M.L. (1984) The inverse problem for the vibrating beam. [74], 393, 277-295. 185.
[105] Gladwell, G.M.L. (1985) Qualitative properties of vibrating systems. [74], 401, 299-315. 192.
[106] Gladwell, G.M.L. and Gbadeyan, J. (1985) On the inverse problem of the vibrating string and rod. [77], 38, 169-174. 84.
[107] Gladwell, G.M.L. (1986a) Inverse problems in vibration. [79], 39, 1013-1018. 116.
[108] Gladwell, G.M.L. (1986b) Inverse Problems in Vibration. Dordrecht: Martinus Nijhoff Publishers. 133, 133.
[109] Gladwell, G.M.L. (1986c) The inverse mode problem for lumped-mass systems. [77], 39, 297-307. 203, 209, 211.
[110] Gladwell, G.M.L. (1986d) The inverse problem for the Euler-Bernoulli beam. [74], 407, 199-218. 392.
[111] Gladwell, G.M.L. and Dods, S.R.A. (1987) Examples of reconstruction of vibrating rods from spectral data. [47], 119, 267-276. 319.
[112] Gladwell, G.M.L., England, A.H. and Wang, D. (1987) Examples of reconstruction of an Euler-Bernoulli beam from spectral data. [47], 119, 81-94. 401.
[113] Gladwell, G.M.L. and Willms, N.B. (1988) The reconstruction of a tridiagonal system from its frequency response at an interior point. [29], 4, 1018-1024. 87.
[114] Gladwell, G.M.L. and Willms, N.B. (1989) A discrete Gel’fand-Levitan method for band-matrix inverse eigenvalue problems. [29], 5, 165-179. 108.
[115] Gladwell, G.M.L., Willms, N.B., He, B., and Wang, D. (1989) How can we recognise an acceptable mode shape for a vibrating beam? [77], 42, 303-316. 211, 212, 213, 214, 365.
[116] Gladwell, G.M.L. (1991a) Qualitative properties of finite element models I: Sturm-Liouville systems. [77], 44, 249-265. 185.
[117] Gladwell, G.M.L. (1991b) Qualitative properties of finite-element models II: the Euler-Bernoulli beam. [77], 44, 267-284. 185, 192.
[118] Gladwell, G.M.L. (1991c) The application of Schur’s algorithm to an inverse eigenvalue problem. [29], 7, 557-565. 335.
[119] Gladwell, G.M.L. (1991d) On the scattering of waves in a non-uniform Euler-Bernoulli beam. [72], 205, 31-34. 61, 393.
[120] Gladwell, G.M.L. (1993) Inverse Problems in Scattering. Dordrecht: Kluwer Academic Publishers. 334, 335.
[121] Gladwell, G.M.L. (1995) On isospectral spring-mass systems. [29], 11, 591-602. 160.
[122] Gladwell, G.M.L. and Morassi, A. (1995) On isospectral rods, horns and strings. [29], 11, 533-544. 347.
[123] Gladwell, G.M.L. and Movahhedy, M. (1995) Reconstruction of a mass-spring system from spectral data I: Theory. [30], 1, 179-189. 84.
[124] Gladwell, G.M.L. (1996) Inverse problems in vibration-II. [9], 49, 525-534. 116.
[125] Gladwell, G.M.L. (1997) Inverse vibration problems for finite element models. [29], 13, 311-322. 176.
[126] Gladwell, G.M.L. (1998) Total positivity and the QR algorithm. [57], 271, 257-272. 138, 167, 167, 175.
[127] Gladwell, G.M.L. (1999) Inverse finite element vibration problems. [47], 211, 309-324. 86, 87, 175.
[128] Gladwell, G.M.L. and Morassi, A. (1999) Estimating damage in a rod from changes in node positions. [30], 7, 215-233. 409, 411, 421, 424.
[129] Gladwell, G.M.L. (2002a) Total positivity and Toda flow. [57], 350, 279-284. 182.
[130] Gladwell, G.M.L. (2002b) Isospectral vibrating beams. [74], 458, 2691-2703. 175.
[131] Gladwell, G.M.L. and Zhu, H.M. (2002) Courant’s nodal line theorem and its discrete counterparts. [77], 55, 1-15. 34, 224.
[132] Golub, G.H. (1973) Some uses of the Lanczos algorithm in numerical linear algebra, in J.H.H. Miller (Ed.) Topics in Numerical Analysis, New York: Academic Press. 67.
[133] Golub, G.H. and Boley, D. (1977) Inverse eigenvalue problems for band matrices, in G.A. Watson (Ed.) Numerical Analysis. Heidelberg, New York: Springer Verlag, 23-31. 70.
[134] Golub, G.H. and Underwood, R.R. (1977) Block Lanczos method for computing eigenvalues, in Rice, J.R. (Ed.) Mathematical Software III. New York: Springer, 23-31. 108.
[135] Golub, G.H. and Van Loan, C.F. (1983) Matrix Computations. Baltimore: The Johns Hopkins University Press. 12, 17, 67, 101, 156.
[136] Gopinath, B. and Sondhi, M.M. (1970) Determination of the shape of the human vocal tract from acoustical measurements. [12], 1195-1214. 293, 331.
[137] Gopinath, B. and Sondhi, M.M. (1971) Inversion of the telegraph equation and the synthesis of non-uniform lines. [25], 59, 383-392. 293, 293, 331.
[138] Gottlieb, H.P.W. (1986) Harmonic frequency spectra of vibrating stepped strings. [47], 108, 63-72 and 345. 290, 291, 355, 355, 356, 359.
[139] Gottlieb, H.P.W. (1987a) Multi-segment strings with exactly harmonic spectra. [47], 118, 283-290. 356.
[140] Gottlieb, H.P.W. (1987b) Isospectral Euler-Bernoulli beams with continuous density and rigidity functions. [74], 413, 235-250. 359, 361, 393.
[141] Gottlieb, H.P.W. (1988a) Isospectral operators: some model examples with discontinuous coefficients. [41], 132, 123-137. 356.
[142] Gottlieb, H.P.W. (1988b) Density distribution for isospectral circular membranes. [82], 48, 948-951. 361, 393.
[143] Gottlieb, H.P.W. (1989) On standard eigenvalues of variable-coefficient heat and rod equations. [37], 56, 146-148.
[144] Gottlieb, H.P.W. (1991) Inhomogeneous clamped circular plates with standard vibration spectra. [37], 58, 729-730. 361.
[145] Gottlieb, H.P.W. (1992a) Examples and counterexamples for a string density formula in the case of a discontinuity. [41], 164, 363-369. 363, 364.
[146] Gottlieb, H.P.W. (1992b) Axisymmetric isospectral annular plates and membranes. [26], 50, 107-112. 361.
[147] Gottlieb, H.P.W. (1993) Inhomogeneous annular plates with exactly beamlike radial spectra. [26], 50, 107-112. 361.
[148] Gottlieb, H.P.W. (2000) Exact solutions for vibrations of some annular membranes with inhomogeneous radial densities. [47], 233, 165-170. 361.
[149] Gottlieb, H.P.W. (2002) Isospectral strings. [29], 18, 971-978. 356.
[150] Gottlieb, H.P.W. (2004a) Isospectral circular membranes. [29], 20, 155-161. 361.
[151] Gould, S.H. (1966) Variational Methods for Eigenvalue Problems. Toronto: University of Toronto Press. 48.
[152] Gradshteyn, I.S. and Ryzhik, I.M. (1965) Tables of Integrals, Series and Products, 4th ed., Moscow 1963. English Translation, A. Jeffrey (Ed.) New York: Academic Press. 364.
[153] Gragg, W.B. and Harrod, W.J. (1984) Numerically stable reconstruction of Jacobi matrices from spectral data. [68], 44, 317-335. 108.
[154] Gray, L.J. and Wilson, D.G. (1976) Construction of a Jacobi matrix from spectral data. [57], 14, 131-134. 68.
[155] Groetsch, C.W. (1993) Inverse Problems in the Mathematical Sciences. Braunschweig: Vieweg Verlag. 289.
[156] Groetsch, C.W. (2000) Inverse Problems: Activities for Undergraduates. Washington, D.C.: Mathematical Association of America. 289.
[157] Gudmundson, P. (1982) Eigenfrequency changes of structures due to cracks, notches or other geometrical changes. [51], 30, 339-353. 422.
[158] Halberg, C.J.A. and Kramer, V.A. (1960) A generalization of the trace concept. [22], 27, 607-617. 362.
[159] Hald, O.H. (1972) On Discrete and Numerical Inverse Sturm-Liouville Problems. Ph.D. Thesis, New York University, New York, NY. 293.
[160] Hald, O.H. (1976) Inverse eigenvalue problems for Jacobi matrices. [57], 14, 63-85. 68.
[161] Hald, O.H. (1977) Discrete inverse Sturm-Liouville problems. [68], 27, 249-256. 294.
[162] Hald, O.H. (1978a) The inverse Sturm-Liouville problem with symmetric potentials. [1], 141, 263-291. 291.
[163] Hald, O.H. (1978b) The inverse Sturm-Liouville equation and the Rayleigh-Ritz method. [60], 32, 687-705. 294.
[164] Hald, O.H. (1983) Inverse eigenvalue problems for the mantle, II. [24], 72, 139-164.
[165] Hald, O.H. (1984) Discontinuous inverse eigenvalue problems. [16], 37, 539-577. 291, 293, 305, 420.
[166] Hald, O.H. and McLaughlin, J.R. (1988) Inverse problems using nodal position data - uniqueness results, algorithms, and bounds. Proceedings, Centre for Mathematical Analysis, Australian National University, Special Program in Inverse Problems, ed. R.S. Anderssen and G.N. Newsam. 17, 32-58. 415.
[167] Hald, O.H. and McLaughlin, J.R. (1989) Solutions of inverse nodal problems. [29], 5, 307-347. 413, 414.
[168] Hald, O.H. and McLaughlin, J.R. (1996) Inverse nodal problems: finding the potential from nodal lines. [64], 119, 415, 415.
[169] Hald, O.H. and McLaughlin, J.R. (1998) Inverse problems: recovery of BV coefficients from nodes. [29], 14, 245-273. 415.
[170] Hearn, G. and Testa, R.B. (1991) Modal analysis for damage detection in structures. [49], 117, 3042-3063. 419, 422.
[171] Herrmann, H. (1935) Beziehungen zwischen den Eigenwerten und Eigenfunktionen verschiedener Eigenwertprobleme. [61], 40, 221-241. 216.
[172] Hochstadt, H. (1961) Asymptotic estimates of the Sturm-Liouville spectrum. [16], 14, 749-764. 282.
[173] Hochstadt, H. (1967) On some inverse problems in matrix theory. [10], 18, 201-207. 68.
[174] Hochstadt, H. and Kim, M. (1970) On a singular inverse eigenvalue problem. [11], 37, 243-254. 290.
[175] Hochstadt, H. (1973) The inverse Sturm-Liouville problem. [16], 26, 715-729. 291, 291.
[176] Hochstadt, H. (1974) On the reconstruction of a Jacobi matrix from spectral data. [57], 8, 435-446. 68.
[177] Hochstadt, H. (1975a) On inverse problems associated with Sturm-Liouville operators. [38], 17, 220-235. 291, 345.
[178] Hochstadt, H. (1975b) Well posed inverse spectral problems. [73], 72, 2496-2497. 291.
[179] Hochstadt, H. (1976) On the determination of the density of a vibrating string from spectral data. [41], 55, 673-685. 291.
[180] Hochstadt, H. (1977) On the well posedness of the inverse Sturm-Liouville problem. [38], 23, 402-413. 291.
[181] Hochstadt, H. and Lieberman, B. (1978) An inverse Sturm-Liouville problem with mixed given data. [82], 34, 676-680. 291, 329.
[182] Hochstadt, H. (1979) On the reconstruction of a Jacobi matrix from mixed given data. [57], 28, 113-115. 74.
[183] Horn, R.A. and Johnson, C.R. (1985) Matrix Analysis. Cambridge: Cambridge University Press. 1, 97, 130, 131.
[184] Ikramov, Kh.D. and Chugunov, V.N. (2000) Inverse matrix eigenvalue problems. [43], 98, 51-135. 117.
[185] Ince, E.L. (1927) Ordinary Differential Equations. London: Longmans, Green. 236, 282, 403.
[186] Isaacson, E.L. and Trubowitz, E. (1983) The inverse Sturm-Liouville problem I. [16], 36, 767-783. 346.
[187] Isaacson, E.L., McKean, H.P. and Trubowitz, E. (1984) The inverse Sturm-Liouville problem II. [16], 37, 1-11. 346.
[188] Jerison, D. and Kenig, C. (1985) Unique continuation and absence of positive eigenvalues for Schrödinger operators. [7], 121, 463-494. 216.
[189] Kailath, T. and Lev-Ari, H. (1985) On mappings between covariance matrices and physical systems. [20], 47, 241-252. 342.
[190] Karlin, S. (1968) Total Positivity, Vol. 1. Stanford: Stanford University Press. 133.
[191] Kato, T. (1976) Perturbation Theory for Linear Operators. Springer Verlag, New York.
[192] Kautsky, J. and Golub, G.H. (1983) On the calculation of Jacobi matrices. [57], 52, 439-455. 67, 68.
[193] Kirsch, A. (1996) An Introduction to the Mathematical Theory of Inverse Problems. New York: Springer Verlag. 289, 291.
[194] Knobel, R. and McLaughlin, J.R. (1992) A reconstruction method for the two spectra inverse Sturm-Liouville problem, preprint.
[195] Knobel, R. and Lowe, B.D. (1993) An inverse Sturm-Liouville problem for an impedance. [36], 44, 433-450. 319.
[196] Knobel, R. and McLaughlin, J.R. (1994) Reconstruction method for a two-dimensional inverse problem. [91], 45, 794-826.
[197] Kobayashi, M. (1988) Discontinuous Inverse Sturm-Liouville Problems with Symmetric Potentials. Ph.D. Thesis. University of California at Berkeley. 305.
[198] Krein, M.G. (1933) On the spectrum of a Jacobian matrix, in connection with the torsional oscillation of shafts. (In Russian). [59], 40, 455-466. 63.
[199] Krein, M.G. (1934) On nodes of harmonic oscillations of mechanical systems of a certain special type. (In Russian). [59], 41, 339-348. 63.
[200] Krein, M.G. (1951a) Determination of the density of a non-homogeneous symmetric cord from its frequency spectrum. (In Russian). [21], 76, 345-348. 293.
[201] Krein, M.G. (1951b) On the inverse problem for a non-homogeneous cord. (In Russian). [21], 82, 669-672. 293.
[202] Krein, M.G. (1952) Some new problems in the theory of Sturm systems. (In Russian). [71], 16, 555-563. 63, 293.
[203] Lanczos, C. (1950) An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. [46], 45, 225-232. 67.
[204] Landau, H.J. (1983) The inverse problem for the vocal tract and the moment problem. [83], 14, 1019-1035. 293, 334.
[205] Lebedev, L.P., Vorovich, I.I. and Gladwell, G.M.L. (1996) Functional Analysis: Applications in Mechanics and Inverse Problems. Dordrecht: Kluwer Academic Publishers. 240.
[206] Leighton, W. and Nehari, Z. (1958) On the oscillation of solutions of self-adjoint linear differential equations of the fourth order. [88], 89, 325-377. 424.
[207] Levinson, N. (1949) The inverse Sturm-Liouville problem. [66], 25-30. 291.
[208] Levinson, M. (1976) Vibrations of stepped strings and beams. [47], 49, 287-291. 355.
[209] Levitan, B.M. (1964a) Generalized Translation Operators and Some of Their Applications. Jerusalem: Israel Program for Scientific Translations. Chapters IV, V. 291.
[210] Levitan, B.M. (1964b) On the determination of a Sturm-Liouville equation by spectra. (In Russian). [31], 28, 63-68. (In English) [54], 68, 1-20. 292.
[211] Levitan, B.M. (1987) Inverse Sturm-Liouville Problems. Utrecht: VNU Science Press. 283, 291.
[212] Levitan, B.M. and Sargsjan, I.S. (1991) Sturm-Liouville and Dirac Operators. Dordrecht: Kluwer Academic Publishers. 235, 282, 293.
[213] Liang, R.Y., Hu, J. and Choy, F. (1992a) Theoretical study of crack-induced eigenfrequency changes on beam structures. [40], 118, 384-396. 422.
[214] Liang, R.Y., Hu, J. and Choy, F. (1992b) Quantitative NDE technique for assessing damages in beam structures. [40], 118, 1469-1487. 422.
[215] Lindberg, G.M. (1963) The vibration of non-uniform beams. [3], 14, 387-395. 38.
[216] Lowe, B.D., Pilant, M. and Rundell, W. (1992) The recovery of potentials from finite spectral data. [83], 23, 482-504. 321.
[217] Lowe, B.D. (1993) Construction of an Euler-Bernoulli beam from spectral data. [47], 163, 165-171. 399.
[218] Marchenko, V.A. (1950) On certain questions in the theory of differential operators of second order. (In Russian). [21], 72, 457-460. 291, 293.
[219] Marchenko, V.A. (1952) Some problems in the theory of one-dimensioned second order differential operators I (In Russian). [89], 1, 327-420. 291.
[220] Marchenko, V.A. (1953) Some problems in the theory of one-dimensioned second order differential operators II (In Russian). [89], 2, 3-82. 291.
[221] Markham, T. (1970) On oscillatory matrices. [57], 3, 143-158. 138, 175, 184.
[222] Mattis, M.P. and Hochstadt, H. (1981) On the construction of band matrices from spectral data. [57], 38, 109-119. 108.
[223] McLaughlin, J.R. (1976) An inverse problem of order four. [83], 7, 646-661. 393.
[224] McLaughlin, J.R. (1978) An inverse problem of order four - an infinite case. [83], 9, 395-413. 393.
[225] McLaughlin, J.R. (1981) Fourth order inverse eigenvalue problems, in Knowles, I.W. and Lewis, R.T. (Eds) Spectral Theory of Differential Operators. New York: North Holland, 327-335. Crum, M.M. (1995) [78], 6, 121-127. 393.
[226] McLaughlin, J.R. (1984a) Bounds for constructed solutions of second and fourth order inverse eigenvalue problems, in I.W. Knowles and T.R. Lewis (Eds) Differential Equations. New York: Elsevier/North Holland, 437-443. 393.
[227] McLaughlin, J.R. (1984b) On constructing solutions to an inverse Euler-Bernoulli beam problem, in F. Santosa et al (Eds) Inverse Problems of Acoustic and Elastic Waves. Philadelphia: SIAM, 341-347. 392, 393.
[228] McLaughlin, J.R. (1986) Analytical methods for recovering coefficients in differential equations from spectral data. [87], 28, 53-72. 291, 305.
[229] McLaughlin, J.R. (1986) Uniqueness theorem for second order inverse eigenvalue equations. [41], 118, 38-41.
[230] McLaughlin, J.R. and Rundell, W. (1987) A uniqueness theorem for an inverse Sturm-Liouville problem. [42], 28, 1471-1472. 305.
[231] McLaughlin, J.R. (1988) Inverse spectral theory using nodal points as data - a uniqueness result. [41], 73, 354-362. 412, 413, 415.
[232] McLaughlin, J.R. (2000) Solving inverse problems with spectral data, in Colton, D., Engl, H.W., Louis, A.K., McLaughlin, J.R. and Rundell, W. (Eds) Surveys on Solution Methods for Inverse Problems. Vienna, Springer-Verlag. pp. 169-194. 415.
[233] McNabb, A., Anderssen, R.S. and Lapwood, E.R. (1976) Asymptotic behaviour of the eigenvalues of a Sturm-Liouville system with discontinuous coefficients. [41], 54, 741-751. 284.
[234] Meirovitch, L. (1975) Elements of Vibration Analysis. New York: McGraw-Hill. 19.
[235] Morassi, A. (1993) Crack-induced changes in eigenparameters on beam structures. [40], 119, 1798-1803. 423, 423.
[236] Morassi, A. (1997) An uniqueness result on crack localization in vibrating rods. [30], 4, 231-254. 420.
[237] Morassi, A. (2001) Identification of a crack in a rod based on changes in a pair of natural frequencies. [47], 242, 577-596. 419.
[238] Morassi, A. (2003) The crack detection problem in vibrating beams, in Davini, C. and Viola, E. (Eds) Problems in Structural Identification and Diagnostics: General Aspects and Applications. New York: Springer, 163-177. 420.
[239] Morassi, A. and Dilena, M. (2002) On point mass identification in rods and beams from minimal frequency measurements. [30], 10, 183-201. 420.
[240] Morassi, A. and Rollo, M. (2001) Identification of two cracks in a simply supported beam from minimal frequency measurements. [53], 7, 729-739. 424.
[241] Morassi, A. and Rovere, N. (1997) Localizing a notch in a steel frame from frequency measurements. [40], 123, 422-432. 422.
[242] Movahhedy, M., Ismail, F. and Gladwell, G.M.L. (1995) Reconstruction of a mass-spring system from spectral data II: Experiment. [30], 1, 315-327. 84.
[243] Nabben, R. (2001) On Green’s matrices for trees. [84], 22, 1014-1026. 98.
[244] Nachman, A., Sylvester, J. and Uhlmann, G. (1988) An n-dimensional Borg-Levinson theorem. [14], 115, 595-605.
[245] Nanda, T. (1982) Ph.D. Thesis, New York University, New York. 159.
[246] Nanda, T. (1985) Differential equations and the QR algorithm. [85], 22, 310-321. 159.
[247] Narkis, Y. (1994) Identification of crack location in vibrating simply-supported beams. [47], 172, 549-558. 419, 423.
[248] Natke, H.G. and Cempel, C. (1991) Fault detection and localisation in structures: a discussion. [45], 5, 345-356. 423.
[249] Newton, R.G. (1983) The Marchenko and Gel’fand-Levitan methods in the inverse scattering problem in one and three dimensions, in J.G. Bednar, et al. (Eds.) Conference on Inverse Scattering: Theory and Application. Philadelphia: SIAM. 1-74. 289.
[250] Niordson, F.I. (1967) A method of solving inverse eigenvalue problems, in B. Broberg, J. Hults and F.I. Niordson (Eds) Recent Progress in Applied Mechanics: The Folke Odqvist Volume. Stockholm: Almqvist and Wiksell, 373-382. 391.
[251] Nocedal, J. and Overton, M.L. (1983) Numerical methods for solving inverse eigenvalue problems. [55], 1005, 212-226. 116.
[252] Nylen, P. and Uhlig, F. (1994) Realizations of interlacing by tree-patterned matrices. [58], 38, 13-37. 116.
[253] Nylen, P. and Uhlig, F. (1997a) Inverse eigenvalue problems associated with spring-mass systems. [57], 254, 409-425. 79, 83, 92.
[254] Nylen, P. and Uhlig, F. (1997b) Inverse eigenvalue problem: existence of special spring-mass systems. [29], 13, 1071-1081. 83.
[255] Ostachowicz, W.M. and Krawczuk, M. (1991) Analysis of the effect of cracks on the natural frequencies of a cantilever beam. [47], 150, 191-201. 423.
[256] Paine, J. (1982) Correction of Sturm-Liouville eigenvalue estimates. [60], 39, 415-420. 293.
[257] Paine, J. (1984) A numerical method for the inverse Sturm-Liouville problem. [81], 5, 149-156. 293.
[258] Paine, J.W. and de Hoog, F.R. (1980) Uniform estimation of the eigenvalues of Sturm-Liouville problems. [33], 21, 365-383. 293.
[259] Paine, J.W., de Hoog, F.R. and Anderssen, R.S. (1981) On the correction of finite difference eigenvalue approximations for Sturm-Liouville problems. [19], 26, 123-139. 293.
[260] Pandey, A.K., Biswas, M. and Samman, M.M. (1991) Damage detection from changes in curvature mode shapes. [47], 145, 321-332. 424.
[261] Papanicolaou, V.G. (1995) The spectral theory of the vibrating periodic beam. [14], 170, 359-373. 369.
[262] Papanicolaou, V.G. and Kravvaritis, D. (1997) An inverse spectral problem for the Euler-Bernoulli equation for the vibrating beam. [29], 13, 1083-1092. 393.
[263] Parker, R.L. (1977) Understanding inverse theory. [8], 5, 35-64. 289.
[264] Parlett, B.N. (1980) The Symmetric Eigenvalue Problem. Englewood Cliffs: Prentice Hall. 17, 365.
[265] Parter, S. (1960) On the eigenvalues of a class of matrices. [54], 8, 376-388. 113.
[266] Pleijel, A. (1956) Remarks on Courant’s nodal line theorem. [16], 543-550. 216.
[267] Porter, B. (1970) Synthesis of lumped-parameter vibrating systems by an inverse Holzer technique. [44], 12, 17-19. 208.
[268] Porter, B. (1971) Synthesis of lumped-parameter vibrating systems using transfer matrices. [28], 13, 29-34. 208.
[269] Pöschel, J. and Trubowitz, E. (1987) Inverse Spectral Theory. Boston: Academic Press. 283, 354.
[270] Pranger, W.A. (1989) A formula for the mass density of a vibrating string in terms of the trace. [41], 141, 399-404. 363.
[271] Protter, M.H. and Weinberger, H.F. (1984) Maximum Principles in Differential Equations. New York: Springer. 218.
[272] Ram, Y.M., Braun, S. and Blech, J.J. (1988) Structural modification in truncated systems by the Rayleigh-Ritz method. [47], 125, 203-209. 364, 365.
[273] Ram, Y.M. and Braun, S.G. (1990a) Structural dynamic modification using truncated data: Bounds for the eigenvalues. [63], 4, 39-52. 364.
[274] Ram, Y.M. and Braun, S.G. (1990b) Upper and lower bounds for the natural frequencies of modified structures based on truncated modal testing results. [47], 137, 69-81. 365.
[275] Ram, Y.M., Blech, J.J. and Braun, S.G. (1990) Eigenproblem error bounds with application to the symmetric dynamic system modification. [84], 11, 553-564. 365.
[276] Ram, Y.M. (1993) Inverse eigenvalue problem for a modified vibrating system. [82], 53, 1762-1775. 83, 365.
[277] Ram, Y.M. and Blech, J.J. (1991) The dynamic behaviour of a vibrating system after modification. [47], 150, 357-370. 83, 365, 365, 365.
[278] Ram, Y.M. and Braun, S.G. (1991) An inverse problem associated with the dynamic modification of structures. [37], 58, 233-237. 365, 366.
[279] Ram, Y.M. and Caldwell, J. (1992) Physical parameters reconstruction of a free-free mass-spring system from its spectra. [82], 52, 140-152. 365, 366.
[280] Ram, Y.M. and Braun, S.G. (1993) Eigenvector error bounds and their application to structural modification. [4], 31, 759-764. 365.
[281] Ram, Y.M. (1994a) Inverse mode problems for the discrete model of a vibrating beam. [47], 169, 239-252. 365.
[282] Ram, Y.M. (1994b) Enlarging a spectral gap by structural modification. [47], 176, 225-234. 365.
[283] Ram, Y.M. (1994c) An inverse mode problem for the continuous model of an axially vibrating rod. [37], 61, 624-628. 366.
[284] Ram, Y.M. and Elhay, S. (1996) The theory of a multi degree of freedom dynamic absorber. [47], 195, 607-615. 366.
[285] Ram, Y.M. and Elhay, S. (1995a) Dualities in vibrating rods and beams: continuous and discrete models. [47], 181, 583-594. 162, 345, 366.
[286] Ram, Y.M. and Elhay, S. (1995b) The construction of band symmetric models for vibrating systems from modal analysis data. [47], 184, 759-766.
[287] Ram, Y.M. and Elhay, S. (1998) Constructing the shape of a rod from eigenvalues. [15], 14, 597-608. 162, 366.
[288] Ram, Y.M. and Elishakoff, I. (2004) Reconstructing the cross-sectional area of an axially-vibrating non-uniform rod from one of its mode shapes. [74]. 367.
[289] Ram, Y.M. and Gladwell, G.M.L. (1994) Constructing a finite element model of a vibratory rod from eigendata. [47], 169, 229-237. 87, 365, 367.
[290] Rayleigh, Lord (1984) The Theory of Sound. London: Macmillan. 15.
[291] Rizos, P.F., Aspragathos, N. and Dimarogonas, A.D. (1990) Identification of crack location and magnitude in a cantilever beam from the vibration modes. [47], 138, 381-388. 422, 424.
[292] Rundell, W. and Sacks, P.E. (1992a) Reconstruction techniques for classical inverse Sturm-Liouville problems. [60], 58, 161-183. 321.
[293] Rundell, W. and Sacks, P.E. (1992b) The reconstruction of Sturm-Liouville operators. [29], 8, 457-482. 321.
[294] Rundell, W. (1997) Inverse Sturm-Liouville problems, in Chadan, K., Colton, D., Päivärinta, L. and Rundell, W., (Eds.) An Introduction to Inverse Scattering and Inverse Spectral Problems. Philadelphia: SIAM. 67-131. 283, 321, 326.
[295] Sabatier, P.C. (1978) Spectral and scattering inverse problems. [42], 19, 2410-2425. 289.
[296] Sabatier, P.C. (1979a) On some spectral problems and isospectral evolutions connected with the classical string problem. I. Constants of motion. [56], 26, 477-482. 291.
[297] Sabatier, P.C. (1979b) On some spectral problems and isospectral evolution connected with the classical string problem. II. Evolution equations. [56], 26, 483-486. 291.
[298] Sabatier, P.C. (1985) Inverse problems - an introduction. [29], 1, i-iv. 289.
[299] Sakata, T. and Sakata, Y. (1980) Vibrations of a taut string with stepped mass density. [47], 71, 315-317. 355.
[300] Schur, J. (1917) Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. [34], 147, 205-232. 337.
[301] Seidman, T. (1985) Convergent approximation scheme for the inverse Sturm-Liouville problem. [29], 1, 251-262. 291.
[302] Seidman, T. (1988) An inverse eigenvalue problem with rotational symmetry. [29], 4, 1093-1115.
[303] Shen, M.-H.H. and Pierre, C. (1990) Natural modes of Bernoulli-Euler beams with symmetric cracks. [47], 138, 115-134. 422.
[304] Shen, M.-H.H. and Taylor, J.E. (1991) An identification problem for vibrating cracked beams. [47], 150, 457-484. 422.
[305] Sinha, J.K., Friswell, M.I. and Edwards, S. (2002) Simplified models for the location of cracks in beam structures using measured vibration data. [47], 251, 13-38.
[306] Sivan, D.D. and Ram, Y.M. (1997) Optimal construction of a mass-spring system from prescribed modal and spectral data. [47], 201, 323-334. 366.
[307] Sivan, D.D. and Ram, Y.M. (1999) Physical modifications to vibratory systems with assigned eigendata. [37], 66, 427-432. 366.
[308] Sondhi, M.M. and Gopinath, B. (1971) Determination of vocal-tract shape from impulse response of the lips. [32], 49, 1867-1873. 331.
[309] Sondhi, M.M. (1984) A survey of the vocal tract inverse problem: theory, computation and experiments, in F. Santosa, Y.-H. Pao, W.W. Symes and C. Holland. (Eds.) Inverse Problems of Acoustic and Elastic Waves. Philadelphia: SIAM. 293, 331.
[310] Stieltjes, T.J. (1918) Oeuvres Complètes. Vol. 2, Groningen: Noordhoff. 63.
[311] Strang, G. and Fix, G.J. (1973) An Analysis of the Finite Element Method. Prentice-Hall, Englewood Cliffs, NJ. 26.
[312] Sussman-Fort, S.E. (1982) Reconstruction of bordered-diagonal and Jacobi matrices from spectral data. [50], 314, 271-282. 103.
[313] Sweet, R.A. (1969) Properties of a semi-discrete approximation to the beam equation with a second order term. [35], 5, 329-339. 185.
[314] Sweet, R.A. (1971) Oscillation properties of a semi-discrete approximation to the beam equation with a second order term. [35], 7, 119-125. 185.
[315] Symes, W.W. (1980) Hamiltonian group actions and integrable systems. [70], 1, 339-374. 159.
[316] Symes, W.W. (1982) The QR algorithm and scattering for the finite nonperiodic Toda lattice. [70], 4, 275-280. 159.
[317] Takewaki, I. and Nakamura, T. (1995) Hybrid inverse mode problems for FEM-shear models. [40], 121, 873-880. 92.
[318] Takewaki, I., Nakamura, T. and Arita, Y. (1996) A hybrid inverse mode problem for fixed-fixed mass-spring models. [52], 118, 641-648. 92.
[319] Takewaki, I. and Nakamura, T. (1997) Hybrid inverse mode problem for structure-foundation systems. [40], 123, 312-321. 92.
[320] Takewaki, I. (1999) Hybrid inverse eigenmode problem for top-linked twin shear building models. [28], 41, 1133-1153. 92.
[321] Takewaki, I. (2000) Dynamic Structured Design: Inverse Problem Approach. Southampton, UK: WIT Press. 92.
[322] Temple, G. and Bickley, W.G. (1933) Rayleigh’s Principle and its Applications to Engineering. London: Oxford University Press. 41.
[323] Titchmarsh, E.C. (1962) Eigenfunction Expansions. Part I. Oxford: Oxford University Press. 276.
[324] Toda, M. (1970) Waves in nonlinear lattices. [75], 45, 174-200. 159.
[325] Underwood, R.R. (1975) An Iterative Block-Lanczos Method for the Solution of Large Sparse Symmetric Eigenproblems. Ph.D. Thesis, Stanford University. 108.
[326] van der Holst, H. (1996) Topological and Spectral Graph characterizations. Ph.D. Thesis. Universiteit van Amsterdam. 224.
[327] Vestroni, F. and Capecchi, D. (1996) Damage evaluation in cracked vibrating beams using experimental frequencies and finite element models. [53], 2, 69-86. 422.
[328] Vestroni, F. and Capecchi, D. (2000) Damage detection in beam structures based on frequency measurements. [59], 126, 761-768. 422.
[329] Vijay, D.K. (1972) Some Inverse Problems in Mechanics. M.A.Sc. Thesis, University of Waterloo. 203.
[330] Washizu, K. (1982) Variational Methods in Elasticity and Plasticity. 3rd Edition, Oxford: Pergamon Press. 41.
[331] Watkins, D.S. (1984) Isospectral flows. [87], 26, 379-391. 159.
[332] Weinberger, H. (1974) Variational Methods for Eigenvalue Approximation. Regional Conf. Ser. in Appl. Math., 15, SIAM.
[333] Willis, C. (1986) An inverse method using toroidal mode data. [29], 2, 111-130.
[334] Willis, C. (1985) Inverse Sturm-Liouville problems with two discontinuities. [29], 1, 263-289. 305.
[335] Wu, Q.L. and Fricke, F. (1989) Estimation of blockage dimensions in a duct using measured eigenfrequency shifts. [47], 133, 289-301. 421.
[336] Wu, Q.L. and Fricke, F. (1990) Determination of blocking locations and cross-sectional area in a duct by eigenfrequency shifts. [32], 87, 67-75. 421.
[337] Wu, Q.L. and Fricke, F. (1991) Determination of the size of an object and its location in a rectangular cavity by eigenfrequency shifts - 1st order approximations. [47], 144, 131-147. 421.
[338] Wu, Q.L. (1994) Reconstruction of crack function of beams from eigenvalue shifts. [47], 173, 279-282. 423.
[339] Xu, S.F. (1998) An Introduction to Inverse Algebraic Eigenvalue Problems. Braunschweig: Vieweg. 105, 117.
[340] Yen, A. (1978) Numerical Solution of the Inverse Sturm-Liouville Problem. Ph.D. Thesis, University of California at Berkeley.
[341] Yuen, M.M.F. (1985) A numerical study of the eigenparameters of a damaged cantilever. [47], 103, 301-310. 422.
[342] Zhu, H.M. (2000) Courant’s Nodal Line Theorem and its Discrete Counterparts. Ph.D. Thesis, University of Waterloo. 34, 224.
[343] Zienkiewicz, O.Z. (1971) The Finite Element Method in Engineering Science, London: McGraw-Hill. 26.
List of Journals

No. | Full Journal Name | Abbreviated Journal Title | Call #
1 | acta mathematica | acta math | QA1 .A185
2 | acta numerica | acta numerica | QA297. A327
3 | aeronautical quarterly | aeron q | TL501 .R7
4 | AIAA journal. | aiaa j | TL501.A688 A2
5 | american institute of aeronautics and astronautics paper | am inst aeronaut astronaut pap | TL512. A66
6 | american mathematical society translations series 2 | amer math soc trans ser 2 | QA 1.A522
7 | annals of mathematics | ann math | QA1 .A6
8 | annual review of earth and planetary sciences | annu rev earth planet sci | QE1 .A674
9 | applied mechanics reviews | appl mech rev | TA1 .A639
10 | archiv der mathematik | arch math | QA 1.A66
11 | archive for rational mechanics and analysis | arch rat mech anal | QA801 .A6
12 | bell systems technical journal | bell sys tech j | TK 1.B425
13 | commentarii mathematici helvetici | comment math helvetici | QA1 .C7
14 | communications in mathematical physics | commun math phys | QC1 .C6
15 | communications in numerical methods in engineering | comm numer methods engrg | TA335 .C65
16 | communications on pure and applied mathematics | comm pure appl math | QA1 .C718
17 | comptes rendus academie des sciences | cr acad sci paris | AS162.P315
18 | comptes rendus des seances academie des sciences serie 1 mathematique | cr acad sci paris sec I math | Q46 .A14
19 | computing | co | QA 76.C582
20 | contemporary mathematicians | contemp mathematicians | monographic series
21 | doklady akademii nauk sssr | dokl ak sssr | Q60.A3
22 | duke mathematical journal | duke math j | QA1 .D8
23 | earthquake engineering and structural dynamics | earthquake eng struct dyn | TH1095 .E27x
24 | geophysical journal royal astronomical society | geophys j r astr soc | QD96.A643x
25 | ieee transactions on sonics and ultrasonics | ieee trans sonics ultrason | QC244 .I53
26 | ima journal of applied mathematics | ima j appl math | QA 1.I522
27 | international journal of analytical and experimental modal analysis | intl j analyt exptl modal analysis | TA654.15.I56
28 | international journal of mechanical sciences | intl j mech sci | TJ1 .I59
29 | inverse problems | inverse pr | QA370 .I52x
30 | inverse problems in engineering | inverse probl eng | TA347.D45 I582
31 | izvestiia akademii nauk sssr seriya matematicheskaya | izv akad nauk sssr ser mat | G 3271 .C55
32 | journal acoustical society of america | j acoust soc am | QC 221.A4
33 | journal australian mathematical society series b applied mathematics | j austral math soc series b | QA1 .J97645
34 | journal fuer die reine und angewandte mathematik | j reine angew math | QA 1.J95
35 | journal institute of mathematics and its applications | j inst math appl | QA1 .I552
36 | journal of applied mathematics and physics | j appl math phys | QA1 .Z5
37 | journal of applied mechanics | j app mech | TA1 .J6
38 | journal of differential equations | j diff equa | QA371 .J73
39 | journal of elasticity | j elast | QA931 .J6x
40 | journal of engineering mathematics | j eng math | TA1 .A5233
41 | journal of mathematical analysis and applications | j math anal appl | QA1 .J596
42 | journal of mathematical physics | j math phys | QA1 .J598
43 | journal of mathematical sciences | j math sci | QA1.J63x
44 | journal of mechanical engineering science | j mech eng sci | TJ1.J6
45 | journal of mechanical systems and systems processing | mech syst signal processing | TA654 .M38
46 | journal of research, united states national bureau of standards, section b. mathematical sciences | j res nat bur standards sect b | QA1 .U571
47 | journal of sound and vibration | j sound vib | QC221 .J6
48 | journal of strain analysis | j strain anal | TG265 .J6
49 | journal of structural engineering asce | j struct eng asce | TA1 .A5235
50 | journal of the franklin institute b engineering and applied mathematics | j franklin inst b | T1 .F8
51 | journal of the mechanics and physics of solids | j mech phys solids | TA350 .J68
52 | journal of vibration and acoustics. Transactions of the asme | j vib acoust trans asme | TJ1 .J68x
53 | journal of vibration and control | j vib control | TJ212 .J68x
54 | journal society of industrial and applied mathematics | jsiam | QA1 .S73
55 | lecture notes in mathematics | lecture notes in math | QA3 .L28
56 | lettere al nuovo cimento | lett nuovo c | QC 1.L4
57 | linear algebra and its applications | lin alg app | QA 251.L52
58 | linear and multilinear algebra | linear multilin algebra | QA251 .L524x
59 | matematicheskii sbornik | mat sb | QA1 .M4
60 | mathematics of computation | math comp | QA 47.M29
61 | mathematische zeitschrift | math z | QA1 .M799
62 | meccanica journal of the italian association of theoretical and applied mechanics | meccanica j ital assoc theoret appl mech | QA801 .M4x
63 | mechanical systems and signal processing | mech syst signal processing | TA654 .M38
64 | memoirs of the american mathematical society | mono series | QA1 .A514
65 | monatshefte fuer mathematik und physik | monatsh math phys | QA1 .M877
66 | nordisk mathematisk tidsskrift b | nord mat tidsskr b | QA1 .N83
67 | numerical linear algebra with applications | numer linear algebra appl | QA184. N88
68 | numerische mathematik | numer math | QA1 .N8
69 | philosophical transactions royal society of london series a mathematical and physical sciences | phil trans roy soc lond a | Q 41.L8
70 | physica d. nonlinear phenomena | phys d | QC1 .P3834
71 | prikladnaya matematika i mekhanika | prik mat mekh | TA350 . U34
72 | proceedings of the institution of mechanical engineers | proc inst mech eng | TJ 1.I5
73 | proceedings of the national academy of science | proc nat acad sci | Q11 .N26
74 | proceedings royal society of london series a mathematical and physical science | proc roy soc lond a | Q41 .L72
75 | progress of theoretical physics supplement | prog theor phys suppl | QC1 .P8852x
76 | quarterly journal of mathematics oxford second series | q j math oxford ser 2 | QA1 .Q22
77 | quarterly journal of mechanics and applied mathematics | q j mech appl math | QA1 .Q23
78 | Rendiconti del Circolo matematico di Palermo | rend circ mat palermo | QA1 .C6
79 | rendiconti dell’istituto di matematica dell’universita di trieste | rend istit mat univ trieste | QA1 .T82a
80 | shock and vibration digest | shock vib dig | US1 DH 41 S37
81 | siam journal on algebraic and discrete methods | siam j alg disc math | QA1 .S732x
82 | siam journal on applied mathematics | siam j appl math | QA1 .S73
83 | siam journal on mathematical analysis | siam j math anal | QA 1.S25
84 | siam journal on matrix analysis and applications | siam j matrix anal appl | QA1 .S732x
85 | siam journal on numerical analysis | siam j num anal | QA297.S52x
86 | siam journal on scientific and statistical computing | siam j sci stat comput | QA264 .S53x
87 | siam review | siamr | QA1 .S5
88 | transactions of the american mathematical society | n y am mth st | QA1 .A522
89 | trudy moskovskogo matematicheskogo obshchestva | trudy mosk mat obsch | QA1 .M988
90 | wave motion | wamod | QA927.W3x
91 | zeitschrift fuer angewandte mathematik und physik | zamp | QA1 .Z5
92 | zeitschrift fuer physik | z phys | QC1 .Z4