Fourier Analysis of Economic Phenomena

Monographs in Mathematical Economics 2 Toru Maruyama Fourier Analysis of Economic Phenomena Monographs in Mathemati...

Author: Toru Maruyama

38 downloads 1066 Views 5MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Report copyright / DMCA form

DOWNLOAD PDF

Monographs in Mathematical Economics 2

Toru Maruyama

Fourier Analysis of Economic Phenomena

Monographs in Mathematical Economics Volume 2

Editor-in-chief Toru Maruyama, Tokyo, Japan Series editors Shigeo Kusuoka, Tokyo, Japan Jean-Michel Grandmont, Malakoff, France R. Tyrrell Rockafellar, Seattle, USA

More information about this series at http://www.springer.com/series/13278

Toru Maruyama

Fourier Analysis of Economic Phenomena

123

Toru Maruyama Professor Emeritus Keio University Tokyo, Japan

ISSN 2364-8279 ISSN 2364-8287 (electronic) Monographs in Mathematical Economics ISBN 978-981-13-2729-2 ISBN 978-981-13-2730-8 (eBook) https://doi.org/10.1007/978-981-13-2730-8 Library of Congress Control Number: 2018958313 © Springer Nature Singapore Pte Ltd. 2018 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

Fourier analysis as a branch of mathematical analysis was born from ingenious attempt to express a function as a weighted sum of trigonometric waves. In particular, abstract theory addressing a similar problem on topological groups is sometimes called harmonic analysis. However, Fourier analysis and harmonic analysis are almost synonymous, and we do not distinguish between these two names in this book. The interaction between Fourier analysis and other fields of mathematics has a long and remarkable history. The role of Fourier analysis in physics has also been impossible to exaggerate. However, the situation is quite different regarding its relation with economics. Indeed, Fourier analysis has not been so familiar to economists, except for specialists of time series analysis. Therefore, we have to confess that the extent of the interaction between Fourier analysis and economics is quite restricted. Nevertheless, the narrowness of the interaction realm is a completely different matter from its significance. As a peculiar example of significant interaction between them, I would like to choose the studies into periodic economic phenomena as the main materials for this book. Not a few economic variables show behaviors over time which enjoy periodic or almost periodic regularities. Trade cycles or business fluctuations are typical examples of periodic economic phenomena. Many economic theorists since the latter half of the nineteenth century have incessantly endeavored to examine periodic fluctuations of economic activities. In the process of my research into trade cycles, two classical works impressed on me the close connection between dynamic economic theory and Fourier analysis. The first one is N. Kaldor’s theory of trade cycles based upon nonlinear investment functions. The second is the theory of periodic behaviors of moving average processes due to E. Slutsky. Although their works are more or less rudimentary and devoid of mathematical rigor, it is indeed surprising that their results can be basically justified by more exact mathematical reasoning. Rather, I should say, they invoked several new developments of mathematics. It is remarkable that the works of Kaldor and Slutsky share a common mathematical

v

vi

Preface

background of Fourier analysis. That is why I selected these two as key figures for this book. It is the object of this book to describe the entire mathematical skeleton which supports their economic theories exploring the periodic behaviors of an economy. This book is not intended to be a well-balanced collection of economic applications or something like a convenient cookbook. What I expect the readers to find in the book is a systematic body of carefully constructed mathematical theories which are hidden in the background of the well-known traditional economic doctrines mentioned above. I do not intend to go into details of many economic theories developed more recently. If I dared to try it, I would have to spend many pages on the explanations of concepts and reasonings peculiar to economics. It would eventually obscure the main theme of the book. The attitude I chose is to restrict myself to a rather narrow field of applications familiar to many economists and to give full details of mathematical materials instead. The Kaldorian theory of trade cycles is discussed in Chap. 11. The Hopf bifurcation theorem plays a key role in the existence proof of periodic solutions of the fundamental equation of the Kaldor model. A natural approach to Hopf’s theory is explored from the viewpoint of Fourier analysis. The classical results (Chaps. 1 and 2) on the convergence problem of Fourier series as well as the Carleson–Hunt theorem are effectively made use of. The theory of Fredholm operators provides another prerequisite. We also give a systematic exposition of the topic in Chap. 10.

The theory of periodic or almost periodic stationary processes, initiated by Slutsky, is discussed in Chaps. 8 and 9. The most essential tool is provided by the Herglotz–Bochner theorem on the spectral representation of positive definite functions. It is prepared in Chap. 6 in the framework of generalized harmonic analysis fortified by L. Schwartz’s theory of distributions. Stone’s theorem on one-parameter groups of unitary operators (Chap. 7) is crucial for the spectral representation of stationary processes.

Preface

vii

Chapters 1, 2, 3, 4, and 5 may be regarded as a menu for a standard basic course on Fourier analysis. The contents of these chapters are more or less commonsense in this mathematical discipline. Classical Fourier series (Chaps. 1 and 2); Fourier transforms on L1 , L2 , S (space of rapidly decreasing functions), and S (space of tempered distributions) (Chaps. 3 and 4); and the summation method and spectral synthesis (Chap. 5) are the contents of the menu. This book is the English edition of the Japanese version, published in 2017 by Chisen Shokan, Tokyo. There is no significant difference between them, except one point. The last section of Chap. 8 in the English edition is not included in the Japanese version. This section is a brief exposition of E. Slutsky’s original work, and it was added on the advice of the series editor in charge. Since I had started to learn Fourier analysis in his class, the late Professor Tatsuo Kawata continuously gave me kind encouragement as well as rich advice. Reading through the manuscript of this book, I am impressed by Professor Kawata’s decisive influence upon me, both in the motivation and in the way of thinking. The materials of the book are based upon my lectures given during the graduate course in economics at Keio University and the graduate course in mathematical sciences at the University of Tokyo. The critical comments from my former students, including Messrs. H. Kawabi, Y. Hosoya, and C. Yu Chaowen, were always beneficial. The kind and detailed comments by the anonymous referee deserve special mention. I revised, though not sufficiently enough, the earlier manuscript according to the advice of the referee. I have to express my sincere gratitude to Professor S. Kusuoka for the tedious work of checking the manuscript as the editor in charge. I am very much indebted to him for his kind advice concerning probability theory, which I am not familiar with very well. I would also like to devote my thanks to Professors A. Ioffe, L. Nirenberg, and P.H. Rabinowitz for their suggestive comments on my work. I incorporate in the text several of my articles published elsewhere; I appreciate the generous permissions of the American Mathematical Society, Operations Research Society of Japan, and Society of Mathematical Economics. Finally, I express my deepest thanks to the members of my Dream Team, Mmes. Y. Hagiwara, Y. Katsuragi, T. Noma, and H. Yusa for their efficient cooperation. Tokyo, Japan March 2018

Toru Maruyama

Contents

1

Fourier Series on Hilbert Spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Orthonormal Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4 Completeness of Orthonormal Systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1 1 5 11 14 21

2

Convergence of Classical Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Dirichlet Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Dini, Jordan Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Almost Everywhere Convergence: Historical Survey . . . . . . . . . . . . . 2.4 Uniform Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5 Fejér Integral and (C, 1)-summability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

23 24 28 34 37 40 44

3

Fourier Transforms (I) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 Fourier Integrals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Fourier Transforms on L1 (R, C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 Application: Heat Equation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

47 47 52 60 62

4

Fourier Transforms (II) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Fourier Transforms of Rapidly Decreasing Functions . . . . . . . . . . . . 4.2 Fourier Transforms on L2 (R, C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3 Application: Integral Equations of Convolution Type. . . . . . . . . . . . . 4.4 Fourier Transforms of Tempered Distributions . . . . . . . . . . . . . . . . . . . . 4.5 Fourier Transforms on L2 (R, C) Revisited . . . . . . . . . . . . . . . . . . . . . . . . 4.6 Periodic Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

65 65 72 75 78 87 92 99

5

Summability Kernels and Spectral Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 5.1 Shift Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 5.2 Summability Kernels on [−π, π ] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 ix

x

Contents

5.3 Spectral Synthesis on [−π, π ] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4 Summability Kernels on R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5 Spectral Synthesis on R: Inverse Fourier Transforms on L1 . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

109 111 115 119

6

Fourier Transforms of Measures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 Radon Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Fourier Coefficients of Measures (1). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Fourier Coefficients of Measures (2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 Herglotz’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5 Fourier Transforms of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.6 Bochner’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.7 Convolutions of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.8 Wiener’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

121 121 122 126 129 136 146 154 158 163

7

Spectral Representation of Unitary Operators . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1 Lax–Milgram Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2 Conjugate Operators and Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3 Unitary Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4 Resolution of the Identity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5 Spectral Representation of Unitary Operators . . . . . . . . . . . . . . . . . . . . . 7.6 Stone’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

165 165 169 175 177 180 187 191

8

Periodic Weakly Stationary Processes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.1 Stochastic Processes of Second Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 Weakly Stationary Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.3 Periodicity of Weakly Stationary Stochastic Process . . . . . . . . . . . . . 8.4 Orthogonal Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.5 Spectral Representation of Weakly Stationary Processes . . . . . . . . . 8.6 Spectral Density Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.7 A Note on Slutsky’s Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

193 194 200 210 216 221 227 238 243

9

Almost Periodic Functions and Stochastic Processes . . . . . . . . . . . . . . . . . . . 9.1 Almost Periodic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C). . . . . . . . . . . . . . . . . . . 9.3 Spectrum of Almost Periodic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.4 Fourier Series of Almost Periodic Functions . . . . . . . . . . . . . . . . . . . . . . 9.5 Almost Periodic Weakly Stationary Stochastic Processes . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

245 245 247 256 266 274 277

10

Fredholm Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.1 Direct Sums and Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.2 Fredholm Operators: Definitions and Examples. . . . . . . . . . . . . . . . . . . 10.3 Parametrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

279 279 290 293

Contents

xi

10.4 Product of Fredholm Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295 10.5 Stability of Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 11

Hopf Bifurcation Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Ljapunov–Schmidt Reduction Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Abstract Hopf Bifurcation Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.3 Classical Hopf Bifurcation for Ordinary Differential Equations . 11.4 Smoothness of F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.5 dim V = 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.6 codim R = 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.7 Linear Independence of P Mv ∗ and P Nv ∗ (1) . . . . . . . . . . . . . . . . . . . . 11.8 Linear Independence of P Mv ∗ and P Nv ∗ (2) . . . . . . . . . . . . . . . . . . . . 11.9 Hopf Bifurcation in Cr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.10 Kaldorian Business Fluctuations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.11 Ljapunov’s Center Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

301 303 305 309 312 315 317 321 322 328 330 335 344

Appendix A Exponential Function eiθ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A.1 Complex Exponential Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A.2 Imaginary Exponential Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A.3 Torus R/2π Z . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A.4 A Homomorphism of R into U . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A.5 Functions and σ -Fields on the Torus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

347 347 349 352 354 355 356

Appendix B Topics from Functional Analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B.1 Inductive Limit Topology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B.2 Duals of Locally Convex Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

357 357 366 377

Appendix C Theory of Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C.1 The Space D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C.2 Examples of Test Functions and an Approximation Theorem . . . . C.3 Distributions: Definition and Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . C.4 Differentiation of Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C.5 Topologies on the Space D(Ω) of Distributions . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

379 379 384 388 392 396 402

Addendum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404 Name Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405 Subject Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407

Chapter 1

Fourier Series on Hilbert Spaces

Let e1 , e2 , · · · , el be the standard basis of an l-dimensional Euclidean space consisting of l unit vectors. Then any vector x can be expressed as x=

l

ci ei

i=1

and such an expression is determined uniquely. The coefficients c1 , c2 , · · · , cl are computed as ci = x, ei (inner product). Is a similar expression possible for an infinite dimensional vector space (i.e. preHilbert space or Hilbert space)? It is the theory of Fourier series which answers this question. In this chapter, we start by defining a Fourier series on an abstract Hilbert space and go on to discuss its fundamental properties.

1.1 Hilbert Spaces Let H be a complex vector space.1,2 A function ·, · : H × H → C is called an inner product on H if it satisfies the following four conditions for any elements x, y, x1 , and x2 of H: (i) x, x 0 ; x, x = 0 ⇔ x = 0. (ii) x, y = y, x (conjugate complex). 1A

systematic theory of functional analysis in the framework of Hilbert spaces is discussed in many textbooks in this discipline. For instance, Halmos [4] and Schwartz [8] are very well-written classics. See also Lax [6] Chap. 6 and Maruyama [7] Chap. 3. 2 In the case of a real vector space H, an inner product is a real-valued function ·, · : H × H → R such that (i)–(iv) are satisfied. Of course, (ii) should be rewritten as x, y = y, x. © Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_1

1

2

1 Fourier Series on Hilbert Spaces

(iii) αx, y = αx, y ; α ∈ C. (iv) x1 + x2 , y = x1 , y + x2 , y. It follows from the above axioms that: (a) x, y1 + y2 = x, y1 + x, y2 . (b) x, αy = αx, ¯ y ; α ∈ C. Given an inner product ·, · on H, we define a function · : H → R by

x =

x, x,

x ∈ H.

(1.1)

Then a couple of important inequalities immediately follow: (I) Schwarz inequality |x, y| x · y ; (II) triangular inequality x + y x + y . We recognize that the function · is a norm on H, taking account of (II). A normed vector space endowed with the norm (1.1) defined by an inner product is called a pre-Hilbert space. A complete pre-Hilbert space is called a Hilbert space. We briefly pick up a few basic facts concerning a Hilbert space. 1◦ (parallelogram law) (i) In a pre-Hilbert space H, the equality

x + y 2 + x − y 2 = 2( x 2 + y 2 )

(1.2)

holds good for any x, y ∈ H, (ii) Conversely, if (H, · ) is a normed space which satisfies (1.2), then it is a pre-Hilbert space. 2◦ (Pythagorean theorem) Two elements x, y of a Hilbert space H are said to be orthogonal if x, y = 0. The notation x ⊥ y means that these two elements are orthogonal. If x and y in H are orthogonal, we have

x + y 2 = x 2 + y 2 . More generally, the relation

x1 + x2 + · · · + xn 2 = x1 2 + x2 2 + · · · + xn 2 holds good if x1 , x2 , · · · , xn in H are mutually orthogonal, i.e. xi ⊥ xj (i j ). 3◦ (orthogonality) Two elements x and y of a Hilbert space H are orthogonal if and only if

x x + λy for all

λ ∈ C.

1.1 Hilbert Spaces

3

4◦ (Minimum distance theorem) Let C be a nonempty closed convex set in a Hilbert space H. Then there exists, for any z ∈ H, a unique element x ∈ C such that

x − z = inf y − z . y∈C

Definition 1.1 Let H be a pre-Hilbert space and M be its nonempty subset. Then the set M ⊥ = {x ∈ H | x ⊥ y

for all

y ∈ M}

is called the orthogonal complement of M. 5◦ (orthogonal decomposition) Let H be a Hilbert space and M be its closed subspace. Then for any z ∈ H, there exist some x ∈ M and y ∈ M⊥ such that z = x + y. Such x and y are determined uniquely. The following theorem due to F. Riesz is the most important foundation in the theory of Hilbert spaces. It asserts that any continuous linear functional on a Hilbert space can be represented as an inner product. Theorem 1.1 (F. Riesz) Let H be a Hilbert space. (i) If we define an operator Λy : H → C for each y ∈ H by Λy (x) = x, y, then Λy ∈ H and Λy = y . (H is the dual space of H.) (ii) Conversely, for each Λ ∈ H , there exists uniquely yΛ ∈ H such that Λ(x) = x, yΛ for all

x ∈ H.

Proof (i) We have, by the Schwarz inequality, that |Λy (x)| = |x, y| x · y . Consequently, Λy is a bounded linear functional with

Λy y . On the other hand, since Λy (y) = y, y = y 2 ,

(1.3)

4

1 Fourier Series on Hilbert Spaces

it follows that

Λy y .

(1.4)

Hence Λ = y by (1.3) and (1.4). (ii) Assume Λ ∈ H . The theorem is trivial in the case of Λ ≡ 0. So, without loss of generality, we may assume Λ 0. M = KerΛ is a nonempty closed subspace and M H. For any z ∈ H\M, there exists a unique x ∈ M such that

x − z = inf y − z

y∈M

(1.5)

(by 4◦ ). Since M is a vector subspace, x + λw ∈ M for any w ∈ M and λ ∈ C. We have, by (1.5),

x − z x + λw − z . Hence we can conclude, by 3◦ , that (x − z) ⊥ w

for any

w ∈ M.

(1.6)

Here x − z M, since x ∈ M and z M, and so Λ(x − z) 0. If we define u=

x−z , Λ(x − z)

then Λ(u) = 1. It follows that Λ(v − Λ(v)u) = Λ(v) − Λ(v)Λ(u) = 0

(1.7)

for all v ∈ H. u ⊥ w for all w ∈ M by (1.6), and v − Λ(v)u ∈ M by (1.7). Therefore we have v − Λ(v)u, u = v, u − Λ(v) u 2 = 0. Consequently, we can conclude that Λ(v) = v,

u

u 2

for all

v ∈ H.

Defining yΛ by yΛ =

u ,

u 2

we arrive at our desired result. The uniqueness of yΛ is obvious.

1.2 Orthonormal Systems

5

1.2 Orthonormal Systems We start by discussing the theory of Fourier series on real or complex Hilbert spaces. ·, · and · denote an inner product and a norm of a Hilbert space, respectively. Definition 1.2 A subset Φ of a Hilbert space H is called an orthogonal system in H if ϕ, ϕ = 0 for any two distinct elements ϕ and ϕ of Φ. In particular, Φ is said to be an orthonormal system if ϕ = 1 for all ϕ ∈ Φ. It is convenient to keep in mind the following rule of calculation. That is, if Φ = {ϕ1 , ϕ2 , · · · , ϕn } is an orthonormal system, then n n n n n 2 ci ϕi = ci ϕi , ci ϕi = ci ci = | ci |2 i=1

i=1

i=1

i=1

(1.8)

i=1

for any of the complex numbers c1 , c2 , · · · , cn , where ci is the conjugate complex number of ci . It follows that n

ci ϕi = 0

⇒

ci = 0 for all

i.

i=1

Therefore vectors which form an orthonormal system are linearly independent. We next show some typical examples of orthonormal systems. Although we have no chance to make use of Examples 1.4, 1.5, and 1.6, we briefly give proofs.3 Example 1.1 The system of unit vectors e1 = (1, 0, · · · , 0), e2 = (0, 1, 0, · · · , 0), · · · , el = (0, · · · , 0, 1) in Cl (or Rl ) forms an orthonormal system. Example 1.2 The system of functions 1 1 1 1 1 √ , √ cos x, √ sin x, · · · , √ cos nx, √ sin nx, · · · ; n = 1, 2, · · · π π π π 2π forms an orthonormal system in L2 ([−π, π ], C) or L2 ([−π, π ], R). Example 1.3 The system of functions 1 √ einx ; 2π

n = 0, ±1, ±2, · · ·

forms an orthonormal system in L2 ([−π, π ], C).

3 See

Folland [3] Chap. 6, Terasawa [10] pp. 145–149, pp. 414–416, Yosida [11] Chap. 1, §3 and Chap. 2, §2.

6

1 Fourier Series on Hilbert Spaces

Example 1.4 The system of functions

2n + 1 Pn (x); 2

n = 0, 1, 2, · · ·

forms an orthonormal system in L2 ([−1, 1], C), where Pn (x) =

1 2n n!

·

dn 2 n (x − 1) dx n

(Legendre polynomial).

Pn (x) is a polynomial of degree n, and the term of the highest degree is (2n)!/2n (n!)2 · x n .4 For any function f : [−1, 1] → R ∈ Cn ([−1, 1], R), we obtain, by repeating integration by parts n times, that5

1

dn 2 n!f, Pn = f (x) n (x 2 − 1)n dx = (−1)n dx −1 n

1

−1

dn f (x) · (x 2 − 1)n dx. dx n (1.9)

Applying this formula to f = Pm (m < n), we have 2n n!Pm , Pn = 0

(m < n),

since the n-th derivative of Pm is zero. The same result also holds good for m > n by a similar reasoning. Hence it follows that Pm , Pn = 0

if m n.

This proves the orthogonality of Pm and Pn (m n). We next show that Pn = 1. The n-th derivative of Pn (x) is computed as dn 1 · 2 · · · · · · (2n) (2n)! = = 1 · 3 · 5 · · · (2n − 1). Pn (x) = n dx n 2 · n! (2 · 1)(2 · 2) · · · · · · (2 · n) Applying (1.9) to f = Pn , we obtain

4 The 5 The

term of the highest degree = (1/2n n!){(2n)(2n − 1) · · · (n + 1)}x n . first process of integration by parts is as follows: f (x)

1 1 d n−1 d n−1 2 n (x − 1) − f (x) n−1 (x 2 − 1)n dx −1 dx n−1 dx −1

=−

1

−1

f (x)

d n−1 2 (x − 1)n dx. dx n−1

1.2 Orthonormal Systems

7

2n n!Pn , Pn = (−1)n

1

−1

(1 · 3 · 5 · · · · · · (2n − 1))(x 2 − 1)n dx

= (1 · 3 · 5 · · · · · · (2n − 1))

(1.10) 1 −1

(1 − x 2 )n dx.

The following computation is based upon the beta integration formula.6

1

−1

1

(1 − x 2 )n dx = 2

(1 − x 2 )n dx =

0

1

Γ (n + 1)Γ 2 = 3 Γ n+ 2

1

1

(1 − y)n y − 2 dy

0

√ n! π = 3 Γ n+ 2

(1.11)

n! 2n+1 · n! = , =

1 1 3 1 · 3 · 5 · · · · · · (2n + 1) ······ n + 2 2 2 where Γ is the gamma function. Equations (1.10) and (1.11) imply

Pn 2 =

6 We

2 . 2n + 1

denote by C+ the complex half-plane Rez > 0. The function (Γ : C+ → C) defined by

∞ Γ (s) = x s−1 e−x dx

(t)

0

is called the gamma function. Γ is analytic on C+ . Sometimes the domain of Γ is taken to be (0, ∞) rather than C+ . In the text above, we also follow this policy. The function B : C+ ×C+ → C defined by

1

B(p, q) =

x p−1 (1 − x)q−1 dx

0

is called the beta function. (Note that the integral is convergent.) The following formulas hold good (cf. Cartan [1] Chap. V, §3 and Takagi [9] Chap. 5, §68): 1◦ Γ (s + 1) = sΓ (s). We also obtain Γ (n + 1) = n! for all n ∈ N by induction. 2◦ B(p, q) = Γ (p)Γ (q)/Γ (p + q). 3◦ Γ (s)Γ (1 − s) = π/ sin π s. √ If s = 1/2, in particular, we have √ Γ (1/2) = π . 4◦ Γ n + 12 = 21 32 · · · n − 12 π (n = 0, 1, 2, · · · ).

8

1 Fourier Series on Hilbert Spaces

Hence we have finally that 2n + 1 Pn (x) = 1. 2 2 Example 1.5 The system of functions √

1

x2

Hn (x)e− 2 ;

√ 4

2n n! π

n = 0, 1, 2, · · ·

forms an orthonormal system in L2 (R, C), where Hn (x) = (−1)n ex

2

d n −x 2 e dx n

(Hermite polynomial).

Hn (x) is a polynomial of n-th degree, and its term of highest degree is (2x)n . Applying integration by parts to any f ∈ Cn (R, R), we have

∞

−∞

f (x)Hn (x)e−x dx = (−1)n 2

∞

−∞

f (x)

d n −x 2 e dx = dx n

∞

−∞

dn 2 f (x)·e−x dx. dx n (1.12)

It follows from (1.12) that

∞

Hm (x)Hn (x)e−x dx = 0 2

−∞

for m < n. We also get the same result for m > n. In other words, Hm and Hn (m 2 2 n) are orthogonal with respect to the measure e−x dx with the weight e−x . If we 2 define H˜ n (x) = Hn (x)e−x /2 , then H˜ m and H˜ n (m n) are orthogonal in the usual sense, i.e.

∞

−∞

Hm (x)e−

x2 2

x2

· Hn (x)e− 2 dx = 0

if m n.

Setting f = Hn = (2x)n + · · · in (1.12), we have (d n /dx n )f (x) = 2n n!. Hence

Hn (x)e

2

− x2

= 2 n! 2

n

∞

−∞

e−x dx = 2n n! 2

It follows finally that 2 √ 1 √ Hn (x)e− x2 = 1. 2n n! 4 π 2

√

π.

1.2 Orthonormal Systems

9

Example 1.6 The system of functions x 1 Ln (x)e− 2 ; n!

n = 0, 1, 2, · · ·

forms an orthonormal system in L2 ((0, ∞), C), where Ln (x) = ex

d n n −x (x e ) dx n

(Laguerre polynomial).

Ln (x) is a polynomial of degree n. The term of the highest degree is (−1)n x n . For any f ∈ Cn ([0, ∞), R), integration by parts gives

∞

f (x)Ln (x)e

−x

dx =

0

∞

f (x) · ex

0

= (−1)n 0

d n n −x (x e ) · e−x dx dx n (1.13)

∞

dn f (x) · (x n e−x )dx. dx n

It follows that

∞

Lm (x)Ln (x)e−x dx = 0

0

if m < n. The same result also holds good for m > n. If we define L˜n (x) = Ln (x)e−x/2 , L˜ m and L˜ n (m n) are orthogonal, i.e.

∞

x

x

Lm (x)e− 2 · Ln (x)e− 2 dx = 0

if n m.

0

Apply (1.13) for f = Ln . Then we have

1 2 (−1)n ∞ − x2 (−1)n · x n e−x dx Ln (x)e = 2 n! (n!)2 0

1 1 1 = x n e−x dx = Γ (n + 1) = 1, (n!)2 0 (n!)2 taking account of d n /dx n (Ln ) = (−1)n . We have illustrated some concrete examples of orthonormal systems in various Hilbert spaces. Here arises a question: does there exist an orthonormal system in a general Hilbert space? The next theorem gives a positive answer to this problem. Theorem 1.2 (Schmidt’s orthogonalization) Let {xm } be a linearly independent sequence in a Hilbert space H. Then we can construct an orthonormal system {ϕn } of the form

10

1 Fourier Series on Hilbert Spaces

Fig. 1.1 Orthogonalization

ϕ1 = c11 x1 ,

ϕ2 = c21 x1 + c22 x2 ,

···

ϕn = cn1 x1 + cn2 x2 + · · · + cnn xn ,

···

by choosing suitable complex numbers cnm . Let us start by illustrating the intuitive idea of the proof (See Fig. 1.1). Since {xn } is linearly independent, xn 0 for all n. So we can define ϕ1 = x1 / x1 . Draw a line spanned by x1 . Let H be the intersection of this line and another one which runs through the top of the vector x2 and is orthogonal to the first. The length of 0H is

x2 cos θ = x2 · ϕ1 cos θ = x2 , ϕ1 , where θ is the angle between ϕ1 and x2 . The vector x2 defined by x2 = x2 − x2 , ϕ1 ϕ1 is the vector connecting the top of x2 and H . x2 is clearly orthogonal to ϕ1 . If we adjust the length by ϕ2 = x2 / x2 , ϕ1 is orthogonal to ϕ2 and ϕ1 = ϕ2 = 1. We have only to repeat this process. Proof (of Theorem 1.2) Define {xn } and {ϕn } by x1 = x1 , ϕ1 = x1 / x1

x2 = x2 − x2 , ϕ1 ϕ1 , ϕ2 = x2 / x2

·················· ·················· n xn+1 = xn+1 − xn+1 , ϕk ϕk , ϕn+1 = xn+1 / xn+1

. k=1

1.3 Fourier Series

11

We note that each ϕn is a linear combination of {x1 , x2 , · · · , xn } and each xn is a linear combination of {ϕ1 , ϕ2 , · · · , ϕn }. Such a construction as above is possible because each xn is not zero. It can be confirmed as follows. First of all, it is obvious that x1 = x1 0 since {xn } is linearly independent. Assume next that xk 0 (1 k n) and xn+1 = 0. Then xn+1 =

n xn+1 , ϕk ϕk . k=1

Since each ϕk is a linear combination of {x1 , x2 , · · · , xk }, xn+1 would be a linear combination of {x1 , x2 , · · · , xn }. It is a contradiction to the linear independence of {x1 , x2 , · · · , xn+1 }. Hence we must have xn+1 0. It is clear that ϕn = 1 for all n. Finally, we show the orthogonality of {ϕn }. To start with, ϕ1 , ϕ2 = 0 since x2 , ϕ1 = x2 , ϕ1 − x2 , ϕ1 ϕ1 , ϕ1 = x2 , ϕ1 − x2 , ϕ1 = 0. Assume next that ϕ1 , ϕ2 , · · · , ϕn are orthogonal to each other. Then it is easy to show that ϕn+1 , ϕj = 0

for j = 1, 2, · · · , n.

(1.14)

In fact, we have by simple calculations that , ϕj xn+1

= xn+1 , ϕj −

n

xn+1 , ϕk ϕk , ϕj = xn+1 , ϕj − xn+1 , ϕj = 0,

k=1

which implies (1.14).

Theorem 1.2 tells us that any Hilbert space of infinite dimension has an orthonormal system which consists of at least countable vectors.

1.3 Fourier Series The unit vectors e1 = (1, 0, · · · , 0), e2 = (0, 1, 0, · · · , 0), · · · , el = (0, · · · , 0, 1) form an orthonormal system in Rl . Any element x ∈ Rl can be expressed as a linear combination of e1 , e2 , · · · , and el : x=

l

xi ei .

(1.15)

i=1

Such an expression is unique. Is a similar expression possible in a general Hilbert space?

12

1 Fourier Series on Hilbert Spaces

Definition 1.3 Let H be a Hilbert space and {ϕ1 , ϕ2 , · · · , ϕn , · · · } be an orthonormal system. For any x ∈ H, x, ϕn (n = 1, 2, · · · ) is called the n-th Fourier coefficient of x with respect to {ϕn }, and ∞

x, ϕn ϕn

(1.16)

n=1

is the Fourier series of x with respect to {ϕn }. Of course, the series (1.16) may or may not be convergent. In any case, the expression x∼

∞

x, ϕn ϕn

n=1

tells us that (1.16) is the Fourier series of x with respect to {ϕn }. Now we have to answer the basic question: does the series (1.16) converge to x in the norm, i.e. p x, ϕn ϕn → 0 x −

as

p → ∞?

n=1

The answer to this question is given by Theorem 1.6. However, we need some more preparation before we arrive at the answer. In the case where the system of functions explained in Example 1.2 is chosen as an orthonormal system in L2 ([−π, π ], C), the Fourier series of f ∈ L2 ([−π, π ], C) is given by ∞

a0 + f (x) ∼ (an cos nx + bn sin nx), 2

(1.17)

n=1

where 1 an = π bn =

1 π

π

−π

π −π

f (x) cos nxdx;

n = 0, 1, 2, · · · ,

f (x) sin nxdx;

n = 1, 2, · · · .

If the orthogonal system is as in Example 1.3, the Fourier series of f ∈ L2 ([−π, π ], C) is given by f (x) ∼

∞ n=−∞

cn einx ,

(1.18)

1.3 Fourier Series

13

where 1 cn = 2π

π

f (x)e−inx dx;

−π

n = 0, ±1, ±2, · · · .

This is called the Fourier series in complex form.7 We now proceed to investigate the meaning of Fourier coefficients from the viewpoint of approximation theory. Suppose that ϕ1 , ϕ2 , · · · , ϕn be an orthonormal system in a Hilbert space H. Consider a problem to approximate any element x of H by a linear combination of ϕj ’s. We wish to find the coefficients c1 , c2 , · · · , cn so as to minimize n ci ϕi . x − i=1

Theorem 1.3 (Best approximation by Fourier coefficients) Let ϕ1 , ϕ2 , · · · , ϕn be an orthonormal system in a Hilbert space H. For any x ∈ H, the inequality n n x, ϕi ϕi x − ci ϕi x − i=1

i=1

holds good. Proof n 2 J ≡ x − ci ϕi i=1

= x −

n

ci ϕi , x −

i=1

= x, x −

n

ci ϕi

i=1 n

ci ϕi , x −

i=1

n

c¯i x, ϕi +

n

ci c¯j ϕi , ϕj

i,j =1

i=1

(1.19) 7 More

rigorously, the Fourier coefficient corresponding to (1/

π 1 f (x)e−inx dx, √ 2π −π

√

2π )einx is

and the Fourier series is ∞ n=−∞

1 √ 2π

π −π

1 f (x)e−inx dx · √ einx . 2π

14

1 Fourier Series on Hilbert Spaces

= x 2 −

n

ci ϕi , x −

i=1

= x 2 −

n

−

c¯i x, ϕi +

i=1

ci ϕi , x −

i=1 n

n

n

| ci |2

(cf. (1.8) on page 5)

i=1

n

c¯i x, ϕi +

i=1

n

| ci |2 +

i=1

n ϕi , xx, ϕi i=1

ϕi , xx, ϕi

i=1

= x + 2

n

| ci − x, ϕi | − 2

n

i=1

| x, ϕi |2 .

i=1

Hence J attains its minimum when ci = x, ϕi (i = 1, 2, · · · , n).

According to Theorem 1.2, there exists an orthonormal system {ϕ1 , ϕ2 , · · · , ϕn , · · · } consisting of countably infinite number of vectors in a Hilbert space of infinite dimension. If we substitute x, ϕi as ci (i = 1, 2, · · · , n) in (1.19), then

x 2 −

n

| x, ϕi |2 0

i=1

since J 0. Passing to the limit as n → ∞, we have an important result. Theorem 1.4 (Bessel’s inequality) If H is a Hilbert space of infinite dimension and {ϕ1 , ϕ2 , · · · , ϕn , · · · } is an orthonormal system in H, then we have ∞

| x, ϕn |2 x 2 .

i=1

Corollary 1.1 (Riemann–Lebesgue lemma) lim x, ϕn = 0.

n→∞

1.4 Completeness of Orthonormal Systems We now examine the conditions under which x ∈ H can be represented in the form of Fourier series based upon these preparations. A sufficient number of vectors must be included in an orthonormal system in order to represent x as a linear combination of vectors in an orthonormal system even in the case of an l-dimensional Euclidean space. In this sense, the menu of the orthonormal system is required to be sufficiently rich in order to represent x ∈ H (Hilbert space) in the form of Fourier series.

1.4 Completeness of Orthonormal Systems

15

Definition 1.4 Let Φ be an orthonormal system in a Hilbert space H. Φ is said to be complete (as an orthonormal system) if there is no orthonormal system which contains Φ as a proper subset. An orthonormal system Φ in a Hilbert space H is complete if and only if x, ϕ = 0

for all

ϕ ∈ Φ ⇒ x = 0.

(Proof is almost obvious.) All the orthonormal systems shown in Examples 1.1–1.6 above are complete. Here we shall only show the completeness of 8 Φ=

1 √ einx ; n = 0, ±1, ±2, · · · 2π

in L2 ([−π, π ], C) (in Example 1.3). By Theorem 1.3, we obtain n n n 1 1 1 | αj |2 , √ cj eij x f − √ αj eij x = f 22 − f − 2 2 2π 2π 2π j =−n j =−n j =−n (1.20) for any f ∈ L2 ([−π, π ], C) and any complex numbers cj (j = 0, ±1, ±2, · · · ), where · 2 is the L2 -norm and αj ’s are Fourier coefficients:

π 1 1 ij x αj = f , √ e = √ f (x)e−ij x dx. 2π 2π −π If f is continuous with f (−π ) = f (π ), in particular, the left-hand side of (1.20) can be arbitrarily small by choosing cj and n in a suitable manner (Weierstrass approximation theorem). Hence the subspace span Φ which is spanned by Φ is L2 dense in M = {f ∈ L2 ([−π, π ] , C)| f is continuous, f (−π ) = f (π )}. Since M is dense in L2 ([−π, π ], C), span Φ is dense in L2 ([−π, π ], C). Consequently, the left-hand side of (1.20) can be arbitrarily small by suitable choice of cj ’s. Thus if 1 αj = f, √ eij x = 0 2π

for all

j = 0, ±1, · · ·

for f ∈ L2 ([−π, π ], C), then f = 0. This proves the completeness of Φ.9 8 See 9 We

Folland [3] Chap. 6 and Kawata [5] pp. 33–36 for other orthonormal systems. acknowledge Yosida [12] p. 88 for the proof here.

16

1 Fourier Series on Hilbert Spaces

The next theorem confirms the existence of a complete orthonormal system in a general Hilbert space. Theorem 1.5 (Existence of orthonormal system) For any orthonormal system Φ in a Hilbert space, there exists a complete orthonormal system which contains Φ. Proof We denote by O the family of all the orthonormal systems in H which contain Φ as a subset. Define a partial ordering ≺ on O by Φ ≺ Φ ⇐⇒ Φ ⊂ Φ . Let {Φα } be a chain (i.e. totally ordered subfamily) of O. Then we must have ∪α Φα ∈ O and

Φα ≺ ∪α Φα

for all α.

Hence the partial ordering structure (O, ≺) is inductive. Zorn’s lemma confirms the existence of a maximal element in O with respect to ≺. We can now state and prove the basic theorem concerning the Fourier expansion of x ∈ H. Theorem 1.6 (Fundamental theorem of Fourier series expansion) The following five statements for an orthonormal system Φ = {ϕ1 , ϕ2 , · · · , ϕn , · · · } in a Hilbert space H are equivalent to each other: (i) Φ is complete. (ii) For any x, y ∈ H, x, ϕn = y, ϕn for all n ⇒ x = y. (iii) Every x ∈ H can be represented as x=

∞ x, ϕn ϕn , n=1

in the sense that p lim x − x, ϕn ϕn = 0.

p→∞

n=1

(iv) For any x, y ∈ H, x, y =

∞ n=1

αn βn ,

1.4 Completeness of Orthonormal Systems

17

where αn = x, ϕn ,

βn = y, ϕn ;

n = 1, 2, · · · .

(v) For any x ∈ H.

x 2 =

∞

| x , ϕn |2 .

(Parseval’s equality)

n=1

Most of the proof seems to be easy. So we show only the step (ii) ⇒ (iii) in detail. By Bessel’s inequality, we obtain p

| x, ϕn |2 x 2 < ∞

n=1

for any x ∈ H. If we denote by Sp a partial sum of the Fourier series, i.e. Sp =

p x, ϕn ϕn , n=1

it follows that (p < q)

Sq − Sp = 2

q

| x, ϕn |2 −→ 0

as

p, q −→ ∞.

n=p+1

Hence the sequence {Sp } is Cauchy in H, and so it has a limit y ∈ H, i.e. ∞ y= x, ϕn ϕn . n=1

Since y, ϕn = x, ϕn

for all n

for this y, (ii) implies that x = y. This proves (iii). However, we have to make sure that we do not have a positive answer to the question: “Is there any complete orthonormal system which consists of countable vectors?” If the answer is yes, any vector can be expanded by its Fourier series. The next theorem fills this gap, fortunately.

18

1 Fourier Series on Hilbert Spaces

Theorem 1.7 (orthonormal system consisting of countable vectors) If a Hilbert space H is separable, there exists a complete orthonormal system which consists of countable vectors. Proof The theorem is clear if dim H < ∞, so we may assume dim H = ∞ without loss of generality. Let D = {x1 , x2 , · · · } be a countable dense subset of H. We denote by ϕ1 the first element of D which is not zero (that is, the nonzero element with the smallest suffix). Next let ϕ2 be the first element of D which does not belong to span {ϕ1 }. Repeating this process, we define ϕn as the first element of D which does not belong to span{ϕ1 , ϕ2 , · · · , ϕn−1 }. Since we assume that dim H = ∞, we get a linearly independent family {ϕ1 , ϕ2 , · · · , ϕn , · · · } consisting of countable (distinct) vectors. Then we can prove that span{ϕn ; n = 1, 2, · · · } = D = H.

(1.21)

Assume that (1.21) is not true. Then there must exist some xp ∈ D such that xp ∈ H \ span{ϕn }. xp is, of course, different from ϕ1 . Since xp is not chosen as ϕn+1 for any n, xp is not the first element of D which does not belong to span{ϕ1 , ϕ2 , · · · , ϕn }. Hence there is no element of {xp , xp+1 , · · · } which is chosen as ϕn+1 . This holds good for all n. So {ϕ1 , ϕ2 , · · · } must be chosen from {x1 , x2 , · · · , xp−1 } (elements of D whose suffixes are smaller than p). But it contradicts the fact that {ϕ1 , ϕ2 , · · · } has countably infinite distinct elements. Thus we can conclude that (1.21) holds good. Applying the Schmidt orthogonalization (Theorem 1.2) to {ϕn }, we obtain an orthonormal system { ϕn } in H. Finally, we proceed to show the completeness of { ϕn }. Since each ϕn is a linear combination of {ϕ1 , ϕ2 , · · · , ϕn } and each ϕn is a linear combination of {ϕ1 , ϕ2 , · · · , ϕn } (as remarked in the proof of Theorem 1.2), we have ϕn ; n = 1, 2, · · · } span{ϕn ; n = 1, 2, · · · } = span{ and both of these sets are dense in H. If we assume that { ϕn } is not complete, there must exist some nonzero x ∈ H such that x, ϕn = 0 for all n.

(1.22)

Since span{ ϕn } is dense in H, there exists some sequence {ξp } of finite linear combinations of { ϕn } such that ξp → x (as p → ∞). It follows from (1.22) that x, ξp = 0 for all p.

1.4 Completeness of Orthonormal Systems

19

Passing to the limit, we must have lim x, ξp = x, x = x 2 = 0,

p→∞

which contradicts x 0. Thus we can conclude that { ϕn } is complete.

Corollary 1.2 (separable Hilbert space l2 ) Any separable Hilbert space is isomorphic to l2 . Proof Let {ϕn } be a complete orthonormal system which consists of countable vectors. (The existence of such a system is guaranteed by Theorem 1.7.) Thanks to Theorem 1.6, we have x=

∞ x, ϕn ϕn n=1

and

x = 2

∞

| x, ϕn |2 < ∞.

n=1

If we define an operator T by T : x −→ {x, ϕ1 , x, ϕ2 , · · · },

x ∈ H,

T is an isometric linear operator of the form T : H −→ l2 . Since T is isometric, it is injective.10 Furthermore, T is surjective. In order to show it, let {αn } be any element of l2 . Since Sp =

p

αn ϕn ;

p = 1, 2, · · ·

n=1

is a Cauchy sequence, there exists some x ∈ H such that x=

∞

αn ϕn

n=1

10 The

injectivity of T is also verified by Theorem 1.6 (ii).

20

1 Fourier Series on Hilbert Spaces

and αn = x, ϕn ;

n = 1, 2, · · · .

This proves the surjectivity of T . We conclude that T : H → l2 is an isometric isomorphism.

Finally, we would like to add some remarks. Remark 1.1 1◦ Any separable Hilbert space has a complete orthonormal system which consists of countable vectors. Hence any element can be expanded by Fourier series by using this system. However, the situation is different in the case of a general Hilbert space, because it may be impossible to construct a complete system consisting of countable vectors. (See Dudley [2] pp. 126–131, Yosida [12] pp. 86–88.) 2◦ How can we compute Fourier coefficients when we consider L2 ([−l, l], C) instead of L2 ([−π, π ], C) as a concrete Hilbert space? If we change the variable t ∈ [−l, l] to x ∈ [−π, π ] by x=

πt , l

i.e. t =

lx , π

f (t) ∈ L2 ([−l, l], C) can be transformed to a function f˜(x) ∈ L2 ([−π, π ], C) by f˜(x) ≡ f

lx . π

The Fourier coefficients of f˜ are given by 1 an = π = = =

1 π 1 π 1 l

π

−π

π −π

f

lx cos nx dx π

l

−l

f˜(x) cos nx dx

f (t) cos

π nπ t · dt l l

f (t) cos

nπ t dt ; n = 0, 1, 2, · · · . l

l

−l

References

21

Similarly, we have 1 bn = l

l −l

f (t) sin

nπ t dt; l

n = 1, 2, · · · .

Thus the Fourier series of f is given in the form ∞ nπ t nπ t a0 . + + bn sin an cos 2 l l n=1

References 1. Cartan, H.: Théorie élémentaires des fonctions analytiques d’une ou plusieurs variables complexes. Hermann, Paris (1961) (English edn.) Elementary Theory of Analytic Functions of One or Several Complex Variables. Addison Wesley, Reading (1963) 2. Dudley, R.M.: Real Analysis and Probability. Wadsworth and Brooks, Pacific Grove (1988) 3. Folland, G.B.: Fourier Analysis and its Applications. American Mathematical Society, Providence (1992) 4. Halmos, P.R.: Introduction to a Hilbert Space and the Theory of Spectral Multiplicity. Chelsea, New York (1951) 5. Kawata, T.: Fourier Kaiseki (Fourier Analysis). Sangyo Tosho, Tokyo (1975) (Originally published in Japanese) 6. Lax, P.D.: Functional Analysis. Wiley, New York (2002) 7. Maruyama, T.: Kansu Kaisekigaku (Functional Analysis). Keio Tsushin, Tokyo (1980) (Originally published in Japanese) 8. Schwartz, L.: Analyse hilbertienne. Hermann, Paris (1979) 9. Takagi, T.: Kaiseki Gairon (Treatise on Analysis), 3rd edn. Iwanami Shoten, Tokyo (1961) (Originally published in Japanese) 10. Terasawa, K.: Shizen Kagakusha no tame no Sugaku Gairon (Treatise on Mathematics for Natural Scientists). Iwanami Shoten, Tokyo (1954) (Originally published in Japanese) 11. Yosida, K.: Lectures on Differential and Integral Equations. Interscience, New York (1960) 12. Yosida, K.: Functional Analysis, 3rd edn. Springer, New York (1971)

Chapter 2

Convergence of Classical Fourier Series

We have discussed basic contents of the theory of Fourier series on a general Hilbert space. We now proceed to the classical problem concerning the Fourier series expansion of an integrable function with respect to the trigonometric functions. If we choose L2 ([−π, π ], C) as a Hilbert space and 1 1 1 1 1 √ , √ cos x, √ sin x, · · · , √ cos nx, √ sin nx, · · · ; π π π π 2π n = 1, 2, · · · as a complete orthonormal system, the Fourier series of f ∈ L2 ([−π, π ], C) is given in the form ∞

a0 + (an cos nx + bn sin nx), 2 n=1

where an =

1 π

π

−π

f (x) cos nx dx,

bn =

1 π

π

−π

f (x) sin nx dx.

This Fourier series converges to f in L2 -norm. However, it is necessary for us to investigate convergence criteria of Fourier series other than L2 -convergence: pointwise convergence, almost everywhere convergence, uniform convergence, and so on. This chapter is devoted to these important topics.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_2

23

24

2 Convergence of Classical Fourier Series

2.1 Dirichlet Integrals Let f : R → C (or R) be a 2π -periodic function which is integrable on [−π, π ]. We are now following the traditional setting rather than the framework of a Hilbert space. However,

1 π

an =

π −π

bn =

f (x) cos nx dx,

1 π

π

−π

f (x) sin nx dx

make sense for any f ∈ L1 ([−π, π ], C) by Hölder’s inequality. Therefore we can consider the formal series ∞

a0 + (an cos nx + bn sin nx) 2

(2.1)

n=1

for f ∈ L1 ([−π, π ], C) in a way similar to that for L2 ([−π, π ], C). We call (2.1) the Fourier series of f , although L1 ([−π, π ], C) is not a Hilbert space. We start by investigating the convergence of (2.1) at a specific point x ∈ [−π, π ]. It is the so-called Dirichlet integral which provides us with an effective tool. A partial sum Sn (x) of (2.1) at x can be computed as follows: 1 Sn (x) = 2π

n

1 π f (t) cos 0 · t dt + f (t) cos kt dt cos kx π −π −π π

k=1

+

n

1 π f (t) sin kt dt sin kx π −π k=1

=

1 π

π

−π

1 1 f (t)dt + 2 π

n π k=1 −π

+

f (t) cos kt cos kx dt

n

1 π f (t) sin kt sin kx dt π −π

(2.2)

k=1

=

1 π

π

−π

f (t)

n 1 + cos k(t − x) dt. 2 k=1

We have to know the formula 2n + 1 sin u 1 2 + cos u + cos 2u + · · · + cos nu = u 2 2 sin 2 (assuming sin(u/2) 0).

(2.3)

2.1 Dirichlet Integrals

25

In fact, summing up each side of sin

1 u u = · 2 sin , 2 2 2

sin

3u u u − sin = cos u · 2 sin , 2 2 2

·················· sin

2n + 1 2n − 1 u u − sin u = cos nu · 2 sin , 2 2 2

sin

2n + 1 u 1 u = 2 sin + cos u + · · · + cos nu , 2 2 2

we obtain

from which (2.3) immediately follows. We obtain, by (2.2) and (2.3), that

1 Sn (x) = π

sin

π −π

f (t)

2n + 1 (t − x) 2 dt. t −x 2 sin 2

(2.4)

The formula (2.4) can be rewritten in the form

1 Sn (x) = π

2n + 1 z 2 dz f (x + z) z −π −x 2 sin 2

π −x

sin

(2.5)

by changing variables: z = t − x. Since the entire integrand of (2.5) is 2π -periodic,

1 Sn (x) = π

π −π

f (x + z)

2n + 1 z 2 dz. z 2 sin 2

sin

(2.6)

The integral appearing in (2.6) is called the Dirichlet integral, and the function Dn (z) defined by 2n + 1 sin z 1 2 · Dn (z) ≡ 2π z sin 2

(2.7)

26

2 Convergence of Classical Fourier Series

is called the Dirichlet kernel. Then Sn (x) can be expressed as

Sn (x) =

π

−π

f (x + z)Dn (z)dz.

(2.8)

Remark 2.1 Of course, the integrand of (2.4) can not be defined at t such that 2−1 (t − x) = mπ (m ∈ Z) because sin 2−1 (t − x) = 0 for such t. However, the set {t ∈ [−π, π ]|2−1 (t − x) = mπ, m ∈ Z} is of measure zero, so it does not influence the integral. Rigorously speaking, the Dirichlet integral (2.7) can not be defined at z such that z/2 = mπ (m ∈ Z). But we may assign any values at these points. It is clear by (2.3) that

π

Dn (z)dz = 1.

−π

(2.9)

Hence by (2.8) and (2.9), the relation

Sn (x) − c =

π −π

(f (x + z) − c)Dn (z)dz

(2.10)

holds good for any c. In particular, it is obvious that

Sn (x) − f (x) =

π −π

(f (x + z) − f (x))Dn (z)dz

(2.10 )

for c = f (x). Since Dn (z) is an even function, Sn (x) can be reexpressed as

π

Sn (x) =

π

[f (x + z) + f (x − z)]Dn (z)dz = 2

0

φ(z)Dn (z)dz, 0

where φ(z) =

1 f (x + z) + f (x − z) . 2

Consequently, we have

π

Sn (x) − c = 2

[φ(z) − c]Dn (z)dz

(2.11)

0

for any c. In the special case of c = f (x), (2.11) becomes

π

Sn (x) − f (x) = 2 0

f (x + z) + f (x − z) 2

− f (x) Dn (z)dz.

(2.11 )

2.1 Dirichlet Integrals

27

Thus we obtain the following theorem. Theorem 2.1 (convergence at a point) Let f : R → C (or R) be a 2π -periodic function which is integrable on [−π, π ]. The sequence of partial sums Sn (x) of the Fourier series (2.1) converges to f (x) at a point x if and only if the integral on the right-hand side of (2.10 ) or (2.11 ) converges to 0. Of course, Sn (x) converges to c if and only if the integral on the right-hand side of (2.10) or (2.11) converges to 0. Similar results also follow when the family of functions 1 √ einx ; 2π

n = 0, ±1, ±2, · · ·

is chosen as an orthonormal system. Let f : R → C be 2π -periodic and integrable on [−π, π ]. Consider the formal Fourier series ∞

(2.1 )

cn einx ,

n=−∞

where cn =

1 2π

π −π

f (t)e−int dt;

n = 0, ±1, ±2, · · · .

Computing the partial sums Sn (x), we have

π n

n 1 π 1 −ikt ikx f (t)e dt · e = f (t) eik(x−t) dt. Sn (x) = 2π 2π −π −π k=−n

The sum appearing in

k=−n

(2.2 ) n

(2.2 )

can be rewritten as

eik(x−t) =

k=−n

n

e−ikz =

k=−n

n

eikz

k=−n

by changing variables: z = t − x. This is exactly equal to 2π · Dn (z), where Dn (z) is the Dirichlet kernel (2.7); i.e., n 1 ikz 1 e−inz − einz+iz e = 2π 2π 1 − eiz k=−n

2n+1

2n+1

1 e−i( 2 )z − ei( 2 · = 2π e−iz/2 − eiz/2

)z

(2.7 )

28

2 Convergence of Classical Fourier Series

1 sin 2n+1 2 z · 2π sin 2z

=

= Dn (z). Hence the partial sums (2.2 ) of the Fourier series in complex form can be expressed as

Sn (x) =

π

−π

(2.8 )

f (x + z)Dn (z)dz.

We can verify, by a similar argument, that Theorem 2.1 also holds good for the convergence of Fourier series in complex form.

2.2 Dini, Jordan Tests We now provide more convenient criteria of convergence of a Fourier series. The next lemma establishes a result analogous to the Riemann–Lebesgue lemma (Corollary 1.1, p. 14) for a Fourier series on a Hilbert space in the case of L1 . Lemma 2.1 For any ϕ ∈ L1 ([a, b], R),

b

lim

p→∞ a

lim

ϕ(x) sin pxdx = 0,

p→∞ a

b

lim

p→∞ a

b

ϕ(x) cos pxdx = 0,

ϕ(x)eipx dx = 0.

Proof It is enough for us to check only the first one. We can prove the second quite similarly. And the last one immediately follows from the other two. Assume, in addition, that ϕ is of the class C1 for the moment. Then we obtain, by integration by parts, that

a

b

b cos px b cos px dx ϕ(x) sin pxdx = −ϕ(x) ϕ (x) + a p p a

b cos pb cos pa cos px = −ϕ(b) + ϕ(a) + dx −→ 0 ϕ (x) p p p a as

p −→ ∞.

We depend upon the continuity of ϕ (and so the boundedness of ϕ (x) · cos px) to conclude that the third term tends to 0.

2.2 Dini, Jordan Tests

29

We now consider the general case: ϕ ∈ L1 ([a, b], R). Since C1 ([a, b], R) is dense in L1 ([a, b], R), there exists, for any ε > 0, some ϕε ∈ C1 such that ϕ − ϕε 1 < ε/2. ( · 1 denotes the L1 -norm.) Applying the above result to this ϕε , we must have

b a

ε ϕε (x) sin px dx < 2

for sufficiently large p. Hence we obtain the following evaluation for large p:

a

b

ϕ(x) sin px dx b

[ϕ(x) − ϕε (x)] sin px dx + a

ϕ − ϕε 1 · sin px ∞

b a

ϕε (x) sin px dx

ε + 2

(Hölder’s inequality, · ∞ is the essential sup norm) ϕ − ϕε 1 ·1 +

ε 2

ε ε + 2 2

= ε. The same result is also valid for ϕ ∈

L1 ([a, b], C).1

Corollary 2.1 For any ϕ ∈ L1 ([−π, π ], C), the Fourier coefficients {an }, {bn } and {cn } tend to 0 as n → ∞. We now proceed to the convergence problem of a Fourier series at a point. Assumptions imposed on f are still the same. Making use of (2.11) in the previous section and writing ξ(z) = φ(z) − c, we have

π Sn (x) − c = 2 ξ(z)Dn (z) dz 0

1 = π

1 We

π 0

2n + 1 z 2 ξ(z) dz z sin 2

sin

owe this result to Kolmogorov–Fomin [14] pp. 387–388. For more general results, see Kawata [11] pp. 63–64.

30

2 Convergence of Classical Fourier Series

1 = π

sin

δ 0

1 π

+

2n + 1

π sin 2n + 1 z z 1 2 2 ξ(z) dz + ξ(z) dz z z π δ sin 2 2 (2.12)

δ

sin

0

≡ I1 + I2 + I3

2n + 1 z 2

1

1 z − z sin 2 2

ξ(z) dz

(0 < δ < π ).

It can be readily verified that I2 → 0 as n → ∞. To see this, we have to observe that

π z ξ(z) sin nz + dz 2 sin z δ 2 (ξ(z)(sin(z/2))−1 is integrable on [δ, π ], and so this integral makes sense)

π π z ξ(z) z ξ(z) = sin nz cos cos nz sin dz + dz, z 2 sin 2 sin z δ δ 2 2

where cos

z ξ(z) , 2 sin z 2

sin

z ξ(z) 2 sin z 2

are both integrable on [δ, π ]. Hence, by Lemma 2.1, I2 → 0 as n → ∞. How about I3 ? Since z z z z 3) − sin − + O(z 1 1 2 2 2 2 = O(z) z = z z z − z = z sin sin sin 2 2 2 2 2 2 we obtain ⎛ ⎜ ⎝

⎞ 1 ⎟ 1 z − z ⎠ ξ(z) ∈ L [0, δ], C . sin 2 2 1

So, again by Lemma 2.1, we conclude that I3 → 0 as n → ∞. There remains only to check I1 .

(as z → 0),

2.2 Dini, Jordan Tests

31

Theorem 2.2 (convergence at a point) Let f : R → C (or R) be a function which is 2π -periodic and integrable on [−π, π ]. The sequence of partial sums of the Fourier series of f converges to some number c at a point x if and only if

δ

sin

0

2n + 1 z 2 ξ(z) dz → 0 (as n → ∞) z

for some δ ∈ (0, π ), where ξ(z) =

1 f (x + z) + f (x − z) − c. 2

Thus the answer to the question whether a Fourier series converges or not at a point x depends upon the behavior of f in a neighborhood (x − δ, x + δ) of x (δ may be arbitrarily small). We call this the local property of the convergence of a Fourier series. The next theorem immediately follows from Theorem 2.2 and Lemma 2.1. Theorem 2.3 (Dini test) Let f : R → C (or R) be a function which is 2π -periodic and integrable on [−π, π ]. We write ξ(z) = (1/2)[f (x + z) + f (x − z)] − c for some number c. If

0

δ

| ξ(z) | dz < ∞ z

(2.13)

for some δ > 0, then Sn (x) → c as n → ∞. The proof is almost obvious. The condition (2.13) is called the Dini condition. Remark 2.2 It is easy to check that the Dini condition is satisfied if and only if

δ

−δ

f (x + z) − c dz < ∞ z

(2.14)

for some δ > 0. (The assumptions imposed on f are the same as above.)2 We state another convergence criterion due to Jordan. Theorem 2.4 (Jordan test) Let f : R → C (or R) be a function which is 2π -periodic and integrable on [−π, π ]. If f is of bounded variation on some neighborhood [x − δ, x + δ] (δ > 0), then Sn (x) −→

2 The

1 f (x + 0) + f (x − 0) 2

(as n → ∞).

requirement (2.14) is called the Dini condition in Kolmogorov–Fomin [14], Vol.II, p. 388.

32

2 Convergence of Classical Fourier Series

Lemma 2.2 There exists some constant C such that

b a

sin t dt C t

(2.15)

for any a < b. (C does not depend upon a and b.) Proof We assume first that 0 a < b. The case 1 a < b: There exists some η ∈ [a, b] which satisfies

b a

1 sin t dt = t a

η

1 (cos a − cos η) a

sin t dt =

a

by the second mean-valued theorem of integration.3 Hence

b a

sin t dt 2. t

The case 0 a < b 1:

b

0 a

sin t dt t

b

dt 1.

a

The case 0 a 1 < b:

b a

sin t 1 b dt + 1 + 2 = 3. t a 1

3 Weierstrass-type. Let g : [a, b]

→ R be integrable and h : [a, b] → R be bounded and monotone. Then there exists some η ∈ [a, b] such that

b

η

g(x)h(x)dx = h(a)

a

b

g(x)dx + h(b)

a

g(x)dx η

(cf. Stromberg [16] pp. 328–329, Takagi [17] pp. 287–288). Bonnet-type. Let g : [a, b] → R be integrable and h : [a, b] → R be nonnegative and nonincreasing. Then there exists some η ∈ [a, b] such that

b

η

g(x)h(x)dx = h(a)

a

g(x)dx. a

If h is nonnegative and nondecreasing, then there exists some η ∈ [a, b] such that

b

b

g(x)h(x)dx = h(b)

a

(cf. Kawata [10] I, p. 18, Stromberg [16] p. 334).

g(x)dx η

2.2 Dini, Jordan Tests

33

By the above observations, (2.15) holds good for C = 3 in the case 0 a < b. Hence in the general case (not necessarily 0 a < b), C = 6 (for instance) satisfies (2.15). (C is independent of a and b.)4 Proof of Theorem 2.4. Without loss of generality, we may assume that f is realvalued. Since f is of bounded variation on [x −δ, x +δ], f (x +0) and f (x −0) certainly exist. Specifying c = (1/2) [f (x + 0) + f (x − 0)], we write ξ(z) = (1/2) f (x + z) + f (x − z) − c. Then ξ is of bounded variation on [−δ, δ] and ξ(0+) = 0. There exists a pair of nonnegative and nondecreasing functions h1 , h2 : [−δ, δ] → R which satisfies h1 (0+) = h2 (0+) = 0, and ξ(z) = h1 (z) − h2 (z); z ∈ [−δ, δ].

(2.16)

For any ε > 0, there exists some δ > 0 such that 0 < z δ ⇒ 0 h1 (z), h2 (z) < ε.

(2.17)

Reviewing the calculation of (2.12) on page 30, we should recall that Sn (x) − c is split into three parts: Sn (x) − c = I1 + I2 + I3 . We established there that I2 + I3 → 0 as n → ∞. Hence | I2 + I3 |< ε

for all

n n0

for a sufficiently large n0 ∈ N. Consequently, 2n + 1

z sin 1 δ 2 | Sn (x) − c | ξ(z)dz +ε z π 0 2 2n + 1 2 δ sin z 2 2 hi (z)dz + ε 0 π z i=1

(2.18)

for large n. The magnitude of the integral

4 We acknowledge Kawata [11] pp. 83–84 for the proof. It can be proved that C

satisfies (2.15). See fine calculations by Kawata [10] I, pp. 67–69.

= π/2+4/π 2.8

34

2 Convergence of Classical Fourier Series

δ 0

sin

2n + 1 z 2 hi (z)dz z

is evaluated as follows. By the second mean-value theorem of integration and (2.17), we have 2n + 1 2n + 1 2n + 1

δ sin δ sin δ sin z z z 2 2 2 hi (z)dz = hi (δ) dz ε dz θi 0 θi z z z (2.19) for certain θi ∈ [0, δ]. The final integration is reduced to

2−1 (2n+1)δ 2−1 (2n+1)θi

sin w dw w

(2.20)

by changing variables: w = 2−1 (2n + 1)z. The value of (2.20) is bounded by some absolute constant C, by Lemma 2.2. Hence we obtain | Sn (x) − c |

4C

4 εC + ε = +1 ε π π

for all n n0

by (2.18) and (2.19). This completes the proof.5

(2.21)

2.3 Almost Everywhere Convergence: Historical Survey In the previous section, we studied some necessary and sufficient conditions as well as striking sufficient conditions for a Fourier series to be convergent at a point. However, it is natural to ask if we can expect a Fourier series of a function with nice properties (continuity, integrability, and so on) to be automatically convergent everywhere. Is there any class of functions which guarantees pointwise convergence or almost everywhere convergence? This is a serious problem in the long history of the theory of Fourier series. First of all, P. du Bois Reymond, in 1876, constructed an example of a continuous function, the Fourier series of which diverges at a point. An optimistic expectation as stated above was doomed to failure by this discovery. We show its basic ideas by following several steps.6 Let x be any point of [−π, π ]. (i) We first prove that

5 The

proof here is due to Kawata [11] p. 85. [14] II, pp. 390–391.

6 Kolmogorov–Fomin

2.3 Almost Everywhere Convergence: Historical Survey

π −π

35

|Dn (z)|dz −→ ∞ (as n → ∞),

where Dn (z) is the Dirichlet kernel. By the definition of Dn (z), we have 2n + 1 z sin 2 z . |Dn (z)| = 2π sin 2 It is easy to see that 2n + 1 zk = 1 sin 2 for any zk such that 2n + 1 1 π, zk = k + 2 2

k ∈ Z.

If such a zk is in [−π, π ], we have −n − 1 k n. Here we make use only of 0 k n. Let Ik be a sufficiently small interval with center zk (0 k n) such that 2n + 1 2n + 1 2n + 1 2k + 1 π z− zk = z− π < , 2 2 2 2 3 that is, |z − zk | <

2π . 3(2n + 1)

The length of Ik is equal to 4π/3(2n + 1) and 2n + 1 1 z > sin 2 2 on Ik . On the other hand, the values of sin(z/2) on Ik satisfy z z 1 sin < < · 2 2 2

π 2k + 1 π+ 2 3 < k + 1 π. 2n + 1 2n + 1 2

36

2 Convergence of Classical Fourier Series

Hence we have

n

n 1 4π 1 1 · 2π 2 k+1 3(2n + 1) k=0 π 2n + 1

|Dn (z)|dz Ik

k=0

=

n 1 1 → ∞ as 3π k+1

n → ∞.

k=0

This proves (i). (ii) Define an operator Λn : C([−π, π ], C) → C by

Λn (f ) =

π −π

f (x + z)Dn (z)dz ; f ∈ C [−π, π ], C .

Then Λn is a continuous linear functional on C([−π, π ], C). {Λn |n ∈ N} is not bounded in the operator norm by (i) and the resonance theorem.7 Therefore it is not bounded in the weak topology. (iii) Since {Λn } is not weakly convergent by (ii), there exists some f ∈ C([−π, π ], C) for which

lim

π

n→∞ −π

f (x + z)Dn (z)dz

does not exist. (iv) Thus du Bois Reymond’s conclusion immediately follows. A lot of trials followed the work of du Bois Reymond. For instance, G.H. Hardy, in 1913, gave a illuminating result that the series ∞ n=−∞

cn einx

1 log(| n | +2)

is convergent almost everywhere for any f ∈ L1 ([−π, π ], C), where cn ’s are Fourier coefficients in complex form. Depending upon the accumulation of these experimental results, N.N. Lusin, in 1915, raised an important question explicitly – “Is there any f ∈ L2 ([−π, π ], C), the Fourier series of which diverges on a set of positive measure?”

7 Banach–Steinhaus resonance theorem: Let X be a Banach space, Y a normed space, and {Tα |α ∈ A} a family of bounded linear operators of X into Y. If {Tα x|α ∈ A} is bounded in Y for each x ∈ X, then {Tα |α ∈ A} is uniformly bounded, i.e. supα∈A Tα < ∞. (The converse is obvious.) cf. Dunford–Schwartz [2] pp. 52–53, Maruyama [15] pp. 344–345, Yosida [18] p. 69.

2.4 Uniform Convergence

37

Lusin’s clarification of the problem promoted and invoked further researches very much. In particular, the qualitative difference between L1 and Lp (p > 1) was highlighted in relation with Lusin’s problem. One of the most remarkable results was obtained by A.N. Kolmogorov in 1923. It surprised the mathematical world by showing an example of f ∈ L1 ([−π, π ], C), the Fourier series of which diverges everywhere. The mathematical research exhibited a great jump in 1960s. The decades 1920s– 1930s may be called the period of classical accumulation which leads to that of harvest in 1960s. Y. Katznelson proved: “One of the following statements holds good for Lp ([−π, π ], C), 1 p < ∞: a. the Fourier series of every f ∈ Lp converges almost everywhere, b. there exists some f ∈ Lp , the Fourier series of which diverges everywhere.” Furthermore, J-P. Kahane and Y. Katznelson also proved the same result for the space C([−π, π ], C).8 And the best result came last – L. Carleson established the following decisive result in 1966. Carleson’s theorem The Fourier series of any f ∈ L2 ([−π, π ], C) converges to f almost everywhere. R.A. Hunt extended this result to the case of any p > 1 in the following year. Thus Lusin’s problem was solved completely. Carleson’s theorem will play a crucial role in Chaps. 9 and 11 in this book. However, we can find no easy way leading to the proof of this theorem. So it is very regrettable that we have to state the theorem without its proof.9

2.4 Uniform Convergence We state and prove the condition which guarantees the uniform convergence of a Fourier series. In this case, functions to be expanded by Fourier series must be continuous. Theorem 2.5 (uniform convergence) Assume that f : R → C (or R) is 2π periodic and absolutely continuous. If, in addition, f ∈ L2 ([−π, π ], C), then the Fourier series of f converges uniformly to f on R.

8 The classical works cited here are Hardy [3], Kolmogorov [12, 13], Katznelson [9] and Kahane– Katznelson [7]. See also Hardy–Rogozinski [4]. Zygmund [19] is the best treatise in the discipline. 9 The classical works are Carleson [1] and Hunt [5]. Jørsboe and Melbro [6] devoted their entire volume to a clear-cut proof of Carleson’s theorem.

38

2 Convergence of Classical Fourier Series

Proof Since f is absolutely continuous, it is differentiable almost everywhere. We denote by an and bn the Fourier coefficients (with respect to trigonometrical functions) of f : an =

1 π

bn =

1 π

π

−π

π

−π

f (x) cos nx;

n = 0, 1, 2, · · · ,

f (x) sin nx;

n = 0, 1, 2, · · · .

We obtain, by integration by parts,10 that an =

1 π

π

−π

f (x) cos nx =

1 π

sin nx π f (x) n −π

π 1 b − f (x) sin nxdx = − n . n −π n

Similarly, we get bn =

1 π

π −π

f (x) sin nx =

an . n

Hence the following evaluation holds good: ∞

| an | + | bn |

n=1

∞ | bn | | an | , + n n

(2.22)

| bn | 1 | bn |2 + 2 . n n

(2.23)

n=1

where 2

| an | 1 | an |2 + 2 , n n

2

Applying Bessel’s inequality (Theorem 1.4, p. 14) to f ∈ L2 ([−π, π ], C), we have ∞

| an |2 + | bn |2

< ∞.

(2.24)

n=1

u be absolutely continuous and v be integrable on [a, b]. Then u, v is also integrable. Denoting by V the indefinite integral of v, we obtain the formula:

10 Let

b

b

u(t)v(t)dt = u(b)V (b) − u(a)V (a) −

a

cf. Kato [8] p. 107, Stromberg [16] p. 323.

a

u (t)V (t)dt.

2.4 Uniform Convergence

39

By (2.22), (2.23) and (2.24), it follows that ∞

| an | + | bn |

< ∞.

(2.25)

n=1

Since the absolute value of each term of the Fourier series

a0 + an cos nx + bn sin nx 2 ∞

n=1

of f is bounded by the corresponding term of the positive series ∞

| a0 | + | an | + | bn | , 2 n=1

the Fourier series of f uniformly converges to some continuous function ϕ : R → C.11 Finally, both f and ϕ are square integrable on [−π, π ], since they are continuous. Theorem 1.6 (p. 16) implies that f = ϕ because all the Fourier coefficients of f and ϕ are identical. That is, the Fourier series of f uniformly converges to f . 12 Remark 2.3 1◦ It is obvious that Theorem 2.5 is valid also for a Fourier series in complex form. The Fourier coefficients cn are computed as 1 cn = √ 2π

π

−π

1 1 · √ in 2π 1 = cn , in

=

f (x)e−inx dx

π

−π

f (x)e−inx dx

(integration by parts)

where the cn ’s are the Fourier coefficients of f . Hence ∞ n=−∞

11 cf.

|cn |

∞

1 |c |. |n| n n=−∞

Stromberg [16] pp. 141–142, Takagi [17] p. 155, Theorem 39. proof given here is due to Kolmogorov–Fomin [14], pp. 391–392.

12 The

40

2 Convergence of Classical Fourier Series

By the evaluation 2

1 |cn | |cn |2 + 2 n n

and Bessel’s inequality, we obtain ∞

|cn | < ∞.

n=−∞

The remaining arguments are the same. 2◦ Assume that f is of C1 -class on [−π, π ], in particular. Then it is clear that the Fourier series of f uniformly converges to f and that the series which sums up the Fourier coefficients is absolutely convergent.

2.5 Fejér Integral and (C, 1)-summability Consider a series

∞

an , in general. The average of its partial sums

n=0

Sn =

n−1

ak

k=0

is called the (C, 1)-sum and is denoted by σn ; i.e. n−1 1 σn = Sk . n k=0

If σn converges to σ (as n → ∞), we say that the series

∞

an is first summable

n=0

in the sense of Cesàro or, briefly, (C, 1)-summable. More generally, the concept of (C, m)-summability is discussed in Kawata [10] I, p. 12. Remark 2.4 13 (i) If a series

∞

an converges, it is (C, 1)-summable to the same limit.

n=0

13 Takagi

[17] p. 9, p. 275. Related results are neatly discussed in Stromberg [16] pp. 473–484.

2.5 Fejér Integral and (C, 1)-summability

41

It can be verified as follows. Assume that Sn → S ∗ (as n → ∞). If we define tn by Sn = S ∗ + tn (n = 1, 2, · · · ), then tn → 0. σn can be rewritten as σn =

1 1 (S0 + S2 + · · · + Sn−1 ) = S ∗ + (t0 + t1 + · · · + tn−1 ). n n

For any ε > 0, there exists some k0 ∈ N such that |tk | < ε

for all

k k0 .

Defining T by T = Max{ |t0 |, |t1 |, · · · , |tk0−1 | }, we obtain the evaluation 1 (t0 + t1 + · · · + tn−1 ) 1 {T · k0 + ε(n − k0 )} 1 T · k0 + ε. n n n It follows that |(1/n)(t0 + t1 + · · · + tn−1 )| < 2ε. Hence σn → S ∗ . (ii) The converse is not necessarily true. We show it by an example. A partial sum of the series 1−1+1−1+· · · is given by S2n = 0, S2n+1 = 1. It must be observed that the series does not converge. However, σn → 1/2, since σ2n = n/2n = 1/2 and σ2n+1 = (n + 1)/(2n + 1). Let f : R → C (or R) be a 2π -periodic function which is integrable on [−π, π ]. A partial sum of the Fourier series of f is denoted by a0 (ak cos kx + bk sin kx), + 2 n−1

Sn (x) =

k=1

and the (C, 1)-sum σn (x) =

n−1 1 Sk (x) n k=0

is also called a Fejér sum of f . As was already shown, Sk (x) can be expressed in the form

Sk (x) =

π

−π

f (x + z)Dk (z)dz,

42

2 Convergence of Classical Fourier Series

where Dk (x) is the Dirichlet kernel. Hence we have

σn (x) =

π

−π

f (x + z)

n−1 ! 1 Dk (z) dz. n

(2.26)

k=0

Defining Kn (z) by Kn (z) =

n−1 1 Dk (z), n

(2.27)

k=0

we can rewrite (2.26) by

σn (x) =

π

−π

f (x + z)Kn (z)dz.

(2.28)

Kn (z) is called the Fejér kernel and the integral (2.28) is called the Fejér integral. Lemma 2.3 (representation of Fejér kernel) z 2 1 sin n · 2 Kn (z) = . 2nπ z sin 2

(2.29)

Proof Multiplying 2 sin2 (z/2) to both sides of (2.27), we have n−1 n−1 2k + 1 1 1 z z 2 z = = z sin Kn (z) · 2 sin Dk (z)2 sin 2 sin 2 n 2 2nπ 2 2 2

k=0

=

1 2nπ

n−1

k=0

cos kz − cos(k + 1)z =

k=0

1 nz 1 (1 − cos nz) = sin2 . 2nπ nπ 2

The relation (2.29) immediately follows from this. (Of course we are assuming sin(z/2) 0.) Lemma 2.4 (properties of Fejér kernel) (i)

Kn (z) 0. π Kn (z)dz = 1. (ii) −π

(iii) If we define Mn (δ) by Mn (δ) = sup Kn (z) δ|z|π

(δ > 0), then

2.5 Fejér Integral and (C, 1)-summability

43

lim Mn (δ) = 0.

n→∞

Proof (i) is obvious. Taking account of

π −π

Dk (z)dz = 1,

we obtain (ii) by (2.27). So we have only to check (iii). By Lemma 2.3, we have ⎧ ⎫2 nz ⎪ ⎪ ⎪ ⎪ 1 ⎨ sin 2 ⎬ 0 Kn (z) = . ⎪ 2nπ ⎪ ⎪ ⎩ sin z ⎪ ⎭ 2 Since it is obvious that sin2

δ z sin2 2 2

for |z| ∈ [δ, π ], we have nz sin2 1 1 2 1 · · 0 Kn (z) 2nπ 2nπ δ δ sin2 sin2 2 2 for such z. Hence Mn (δ)

1 · 2nπ

1 sin2

δ 2

−→ 0 (as n → ∞).

Theorem 2.6 (Fejér’s summation formula) If a function f : R → C (or R) is continuous and 2π -periodic, then its Fejér sum σn (x) converges uniformly to f on R. Proof Since f is continuous and periodic, it is bounded and uniformly continuous. So there exists δ > 0 for each ε > 0 such that | x − x | δ

⇒

| f (x ) − f (x ) |

ε . 3

44

2 Convergence of Classical Fourier Series

Fix such a positive number δ > 0. It follows that

f (x) − σn (x) =

π

−π

=

[f (x) − f (x + z)]Kn (z)dz

−δ

−π

+

δ −δ

π

+

(2.30) δ

≡ J1 + J2 + J3 . Defining A = sup |f (x)| (< ∞), we obtain the evaluations x∈R

| J1 | 2A | J3 | 2A

−δ

−π

π

Kn (z)dz 2A · Mn (δ) · π,

(2.31)

Kn (z)dz 2A · Mn (δ) · π,

(2.32)

ε . 3

(2.33)

δ

ε | J2 | 3

δ

−δ

Kn (z)dz

We observe that (2.31), (2.32), and (2.33) are valid regardless of x. The right-hand sides of (2.31) and (2.32) can be smaller than ε/3 for a sufficiently large n, by Lemma 2.4 (iii). Hence | f (x) − σn (x) |

ε ε ε + + =ε 3 3 3

for all

x∈R

for large n. This proves the theorem.

We will recapitulate Fejér’s summability method in more detail in Chap. 5.

References 1. Carleson, L.: On the convergence and growth of partial sums of Fourier series. Acta Math. 116, 135–157 (1966) 2. Dunford, N., Schwartz, J.T.: Linear Operators, Part 1. Interscience, New York (1958) 3. Hardy, G.H.: On the summability of Fourier series. Proc. London Math. Soc. 12, 365–372 (1913) 4. Hardy, G.H., Rogozinski, W.W.: Fourier Series, 3rd edn. Cambridge University Press, London (1956) 5. Hunt, R.A.: On the convergence of Fourier series. In: Proceedings of the Conference on Orthogonal Expansions and Their Continuous Analogues, pp. 234–255. Southern Illinois University Press, Carbondale (1968) 6. Jørsboe, O.G., Melbro, L.: The Carleson-Hunt Theorem on Fourier Analysis. Springer, Berlin (1982) 7. Kahane, J.P., Katznelson, Y.: Sur les ensembles de divergence des séries trigonométriques. Stud. Math. 26, 305–306 (1966)

References

45

8. Kato, T.: Iso Kaiseki (Functional Analysis). Kyoritsu Shuppan, Tokyo (1957) (Originally published in Japanese) 9. Katznelson, Y.: Sur les ensembles de divergence des séries trigonometriques. Stud. Math. 26, 301–304 (1966) 10. Kawata, T.: Ohyo Sugaku Gairon (Elements of Applied Mathematics). I, II, Iwanami Shoten, Tokyo (1950, 1952) (Originally published in Japanese) 11. Kawata, T.: Fourier Kaiseki (Fourier Analysis). Sangyo Tosho, Tokyo (1975) (Originally published in Japanese) 12. Kolmogorov, A.N.: Une série de Fourier-Lebesgue divergent presque partout. Fund. Math. 4, 324–328 (1923) 13. Kolmogorov, A.N.: Une série de Fourier-Lebesgue divergent partout. C. R. Acad. Sci. Paris 183, 1327–1328 (1926) 14. Kolmogorov, A.N., Fomin, S.V.: (English edn.) Introductory Real Analysis. Prentice-Hall, New York (1970) (Japanese edn., translated from 4th edn.) Kansu-kaiseki no Kiso, 4th edn. Iwanami Shoten, Tokyo (1979), (Originally published in Russian). I use the Japanese edition for quotations. 15. Maruyama, T.: Kansu Kaisekigaku (Functional Analysis). Keio Tsushin, Tokyo (1980) (Originally published in Japanese) 16. Stromberg, K.R.: An Introduction to Classical Real Analysis. American Mathematical Society, Providence (1981) 17. Takagi, T.: Kaiseki Gairon (Treatise on Analysis), 3rd edn. Iwanami Shoten, Tokyo (1961) (Originally published in Japanese) 18. Yosida, K.: Functional Analysis, 3rd edn. Springer, New York (1971) 19. Zygmund, A.: Trigonometric Series, 2nd edn. Cambridge University Press, Cambridge (1959)

Chapter 3

Fourier Transforms (I)

The objects of classical theory of Fourier series discussed in the preceding chapter are periodic functions. Is it possible to construct an analogous theory for nonperiodic functions? It is the theory of Fourier transforms which answers this question positively.

3.1 Fourier Integrals We first make a rough sketch of ideas in order to grasp what is going on. A function f : R → C is assumed to be integrable on each finite interval and to satisfy some condition which assures the convergence of its Fourier series (say, Dini’s condition) at each point. The restriction of f to the interval [−l, l] can be expanded by its Fourier series in the form (cf. pp. 20–21) a0 nπ nπ + x + bn sin x , an cos 2 l l ∞

f (x) =

(3.1)

n=1

where an =

1 l

1 bn = l

l −l

f (t) cos

nπ tdt; n = 0, 1, 2, · · · , l (3.2)

l

nπ tdt; n = 1, 2, · · · . f (t) sin l −l

Substituting (3.2) into (3.1), we obtain

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_3

47

48

3 Fourier Transforms (I)

f (x) =

1 2l

l −l

f (t)dt +

l ∞ nπ 1 nπ tdt · cos x f (t) cos l −l l l n=1

l ∞ 1 nπ nπ + f (t) sin tdt · sin x l −l l l n=1

=

1 2l

1 = 2l

l −l

f (t)dt +

l ∞ 1 nπ (t − x)dt f (t) cos l −l l n=1

l ∞ 1 π nπ (t − x)dt. f (t)dt + f (t) cos π l l −l −l l

(3.3)

n=1

If we assume, in addition, that f is integrable on R, ) then the first term of (3.3) tends to 0 as l → ∞. Furthermore, the summation part (· · · ) of the second term can be regarded as an approximation of the integral on [0, ∞) (with respect to λ) of the function

l −l

f (t) cos λ(t − x)dt.

(3.4)

Dividing [0, ∞) into intervals of length π/ l, we evaluate the values of (3.4) at the right end-point of each interval ) and sum up these values after the multiplication by π/ l. This operation gives (· · · ). (See Fig. 3.1.) The second term of (3.3) may be expected to converge to 1 π

∞ ∞ −∞

0

f (t) cos λ(t − x)dt dλ

as l → ∞, since the decomposition becomes finer and finer as l → ∞. Although this is not a rigorous argument, we expect the relation 1 f (x) = π

∞ ∞

0

−∞

f (t) cos λ(t − x)dt dλ

(3.5)

to be almost correct. If we define aλ and bλ by aλ =

1 π

∞

f (t) cos λtdt, 0

Fig. 3.1 Interpretation of (3.3)

bλ =

1 π

∞

f (t) sin λtdt, 0

(3.6)

3.1 Fourier Integrals

49

we have the expression

∞

f (x) =

(aλ cos λx + bλ sin λx)dλ.

(3.7)

0

This representation of f in the integral form should be regarded as an analogue to the Fourier series of periodic functions. The outline of the derivation of (3.7) shown above seems almost correct. However, there remains some dubious reasoning in the limit operation. We now try to give a more rigorous proof of (3.7). Theorem 3.1 (Fourier integral formula) Assume that a function f : R → C is integrable and satisfies

f (x + z) − f (x) dz < ∞ z −δ δ

(3.8)

for each δ > 0 and each x ∈ R. Then we obtain f (x) =

1 π

∞ ∞

0

−∞

f (t) cos λ(t − x)dt dλ.

(3.5)

Remark 3.1 Note that (3.8) is exactly Dini’s condition in the case c = f (x) (cf. p. 31). Proof If we define J (A) (A > 0) by 1 J (A) = π

A ∞

0

−∞

f (t) cos λ(t − x)dt dλ,

(3.9)

we have A ∞

−∞

0

|f (t) cos λ(t − x)|dt dλ < ∞,

(3.10)

since f is integrable on R. Hence applying Fubini’s theorem, we obtain J (A) =

1 π

1 = π =

1 π

∞ −∞

A

∞ −∞

f (t) cos λ(t − x)dλ dt

0

∞ −∞

f (t)

sin A(t − x) dt t −x

f (x + z)

sin Az dz z

(3.11)

(changing variables : z = t − x).

50

3 Fourier Transforms (I)

Consequently, J (A) − f (x) = =

1 π 1 π

∞

f (x + z) − f (x) sin Az dz z

−∞

f (x + z) − f (x) sin Az dz (3.12) z −N

f (x) 1 f (x + z) sin Az sin Az dz − dz + π |z|N z π z |z|N N

≡ I1 + I2 + I3 . Taking account of the well-known result 1 π

∞

sin Az dz = 1 z

−∞

(A > 0),

(3.13)

we get the evaluations ε ε , | I3 |< 3 3

| I2 |<

for any ε > 0 if N > 0 is sufficiently large. Fix such a large N > 0. Then (3.8) and Lemma 2.1 (pp. 28–29) imply | I1 |< ε/3 for a large A. Thus we obtain | J (A) − f (x) |<

ε ε ε + + =ε 3 3 3

for a large A > 0. This proves (3.5). The expression (3.5) is called the Fourier integral formula. Since the integral

∞ −∞

f (t) cos λ(t − x)dt

appearing in (3.5) is an even function in λ, f (x) can be written in the form f (x) =

1 2π

∞

−∞

∞ −∞

f (t) cos λ(t − x)dt dλ.

By the integrability of f , the integral

∞ −∞

f (t) sin λ(t − x)dt

(3.14)

3.1 Fourier Integrals

51

exists, and it is an odd function in λ. Hence

∞ ∞ 1 f (t) sin λ(t − x)dt dλ = 0, 2π −∞ −∞

(3.15)

where the integral (with respect to λ) is in the sense of Cauchy’s principal value; i.e.

N

lim

N →∞ −N

· · · dλ.

The operation (3.14)–i(3.15) gives the Fourier integral formula in complex form: f (x) =

1 2π

∞

−∞

∞

−∞

f (t)e−iλ(t−x) dt dλ.

(3.16)

If we define g(λ) by 1 g(λ) = √ 2π

∞ −∞

f (t)e−iλt dt,

(3.17)

g(λ)eiλx dλ.

(3.18)

(3.16) becomes 1 f (x) = √ 2π

∞ −∞

We must be sure that the integral in (3.17) always exists, but the integral in (3.18) exists only in the sense of Cauchy’s principal value in general. While (3.17) just defines a function g, (3.18) is a variant of the Fourier integral formula and implies a positive claim that the right-hand side is equal to f (x). Needless to say, g(λ) corresponds to Fourier coefficients and (3.18) corresponds to the Fourier series in the case of periodic functions.1 We now proceed to the theory of Fourier transforms based upon the preliminary observations explained above. The function g(λ) defined by (3.17) is the Fourier transform of f ∈ L1 (R, C), which can be interpreted as a Fourier coefficient of a function with period ∞.2

1 The

exposition in this section is primarily due to Kolmogorov–Fomin [6] Chap. 8, §3. basic textbooks on the classical theory of Fourier transforms are Dym–McKean [2], Goldberg [3], Kawata [4, 5], Titchmarsh [10]. 2 Some

52

3 Fourier Transforms (I)

3.2 Fourier Transforms on L1 (R, C) Definition 3.1 Let f be an element of L1 (R, C). The function fˆ : R −→ C defined by

1 fˆ(t) = √ 2π

∞ −∞

f (x)e−itx dx

is called the Fourier transform of f . The mapping F : f −→ fˆ is also called the Fourier transform. Theorem 3.2 (uniform continuity) The Fourier transform fˆ of f ∈ L1 (R, C) is uniformly continuous on R. Proof Since 1 fˆ(t + u) − fˆ(t) = √ 2π

∞

−∞

f (x){e−i(t+u)x − e−itx }dx,

we obtain the evaluation 1 | fˆ(t + u) − fˆ(t) | √ 2π

∞ −∞

| f (x) || e−iux − 1 | dx.

The right-hand side is independent of t. Taking account of | f (x) || e−iux − 1 | 2 | f (x) |, we can apply the dominated convergence theorem to conclude that the right-hand side converges to 0 as u → 0. Hence fˆ is uniformly continuous. It is quite easy to check the following relations for f, g ∈ L1 (R, C) and α ∈ C: (f + g)(t) = fˆ(t) + g(t), ˆ

(3.19)

* )(t) = α fˆ(t), (αf

(3.20)

1 | fˆ(t) | √

f 1 . 2π

(3.21)

We denote by U(R, C) the space of all the uniformly continuous functions of R into C, endowed with the usual vector operations and the uniform convergence norm. Then the mapping F : f −→ fˆ is a bounded linear operator of the form F : L1 (R, C) −→ U(R, C),

3.2 Fourier Transforms on L1 (R, C)

53

the operator norm of which is given by 1

F √ 2π by (3.21). More properties of fˆ will be discussed later. In particular, it must be remembered that fˆ vanishes at infinity. Remark 3.2 The following formulas hold good for f, g ∈ L1 (R, C). fˆ (t) = fˆ(−t). (f (·) is the conjugate complex of f (·).) If we write fy (x) = f (x − y), then fˆy (t) = fˆ(t)e−ity . If we write ϕ(x) = λf (λx) for a real λ 0, then ϕ(t) ˆ = fˆ(t/λ). ∞ 1 (iv) If we define h(x) = √ g(t)eitx dt (integrable), 2π −∞ then

∞ (h ∗ f )(x) = g(t)fˆ(t)eitx dt,

(i) (ii) (iii)

−∞

where h ∗ f is the convolution of h and f , i.e.

(h ∗ f )(x) =

∞ −∞

h(x − y)f (y)dy.

We now illustrate some examples of Fourier transforms. Example 3.1 If we define a function f : R → R by + f (x) =

1 on (a, b), 0 otherwise,

the Fourier transform of this function is given by 1 fˆ(t) = √ 2π

b

e−itx dx.

a

Hence it is evaluated as b−a fˆ(t) = √ 2π

54

3 Fourier Transforms (I)

at t = 0, and b −1 1 fˆ(t) = √ e−itx = √ (e−ita − e−itb ) a 2π it 2π it at t 0. If a = −b in particular, fˆ becomes 1 2 sin tb fˆ(t) = √ (eitb − e−itb ) = √ . 2π it 2π t Example 3.2 The Fourier transform of f (x) = e−γ |x| 1 fˆ(t) = √ 2π

∞

1 e−γ |x| e−itx dx = √ 2π −∞

∞ −∞

(γ > 0) is given by

e−γ |x| (cos tx − i sin tx)dx,

where

∞ −∞

e

−γ |x|

∞

sin txdx =

e

−γ x

sin txdx +

0 ∞

=

e−γ x sin txdx −

0

−∞

0 ∞

0

eγ x sin txdx

e−γ y sin t (−y)dy

(changing variables: − x = y)

∞

∞ −γ x = e sin txdx − e−γ y sin tydy 0

0

= 0. Hence we have 1 fˆ(t) = √ 2π

∞

2 e−γ |x| cos txdx = √ 2π −∞

0

∞

2 e−γ x cos txdx ≡ √ I. 2π

By integration by parts, I can be computed as I=

γ2

γ . + t2

So we obtain 2 γ fˆ(t) = √ . · 2 2π γ + t 2 Example 3.3 Find the Fourier transform of f (x) = e−x 1 fˆ(t) = √ 2π

∞

−∞

e−x

2 /2

e−itx dx = e−t

2 /2

2 /2

1 √ 2π

. ∞

−∞

e−(x+it)

2 /2

dx.

3.2 Fourier Transforms on L1 (R, C)

55

Fig. 3.2 Closed path γ

Integrating the holomorphic function e−z Fig. 3.2, we obtain

e−z

2 /2

2 /2

along a closed path γ given in

dz = 0

γ

by Cauchy’s theorem.3 Hence 1 √ 2π

λ

−λ

e−(x+it)

2 /2

1 dx = √ 2π

λ

−λ

e−x

2 /2

1 dx + √ 2π

0

e−(−λ+it)

2 /2

idt

t

(3.22) 1 + √ 2π

t

e−(λ+it)

2 /2

idt.

0

The second and the third terms of (3.22) converge to 0 as λ → ∞ for a fixed t. In fact, we can verify this fact for the third term by observing

t t t 2 2 2 2 2 e−(λ+it) /2 dt e−(λ −t )/2 dt = e−λ /2 et /2 dt → 0 as λ → ∞. 0

0

0

Consequently, we obtain by the Gauss integration that 1 fˆ(t) = √ 2π

∞

−∞

e

−x 2 /2 −itx

e

dx = e

−t 2 /2

1 √ 2π

Thus we arrive at the conclusion that f (x) = e−x invariant under the Fourier transform.

3 cf.

Cartan [1] Chap. II, and Takagi [9] pp. 208–210.

2 /2

∞

−∞

e−x

2 /2

dx = e−t

2 /2

.

is a special function which is

56

3 Fourier Transforms (I)

It immediately follows that the Fourier transform of f (x) = e−ax given by

2

(a > 0) is

1 2 fˆ(t) = √ e−t /4a . 2a We now list some basic properties of Fourier transforms. We start by investigating the convolutions of two functions. Let f, g : R −→ C be integrable. The convolution of f and g is defined by

h(x) =

∞

−∞

f (x − y)g(y)dy.

(3.23)

A simple calculation gives

∞

−∞

∞

−∞

|f (x − y)g(y)|dx dy =

∞

−∞

∞

−∞

|f (z)|dz |g(y)|dy

= f 1 · g 1 < ∞. Thus we observe that: (i) h(x) is defined at almost every x, and (ii) h is integrable. The definition (3.23) makes sense almost everywhere and h(x) is usually denoted by (f ∗ g)(x). Theorem 3.3 (Fourier transform of a convolution) (f ∗ g)(t) =

√

2π fˆ(t)g(t) ˆ

for f, g ∈ L1 (R, C). Proof The theorem is proved by a direct calculation: 1 (f ∗ g)(t) = √ 2π 1 = √ 2π 1 = √ 2π

∞

−∞

∞

−∞

∞

−∞

∞

∞ −∞ ∞ −∞

f (x − y)g(y)dy e−itx dx f (x − y)e−itx dx g(y)dy

∞ −∞

f (w)e

−itw

1 = √ f (w)e−itw dw · 2π −∞ √ ˆ = 2π fˆ(t)g(t).

dw g(y)e−ity dy ∞

−∞

g(y)e−ity dy

3.2 Fourier Transforms on L1 (R, C)

57

Theorem 3.4 (Fourier transform of a derivative) Assume that f ∈ L1 (R, C) is absolutely continuous on every finite interval and f ∈ L1 (R, C). Then f, (t) = it fˆ(t). Proof Since f is absolutely continuous on any finite interval, f (x) can be expressed as

x f (t)dt. f (x) = f (0) + 0

The right-hand side converges as x → ±∞, since f ∈ L1 (R, C). This limit is 0, since f ∈ L1 (R, C). Consequently, we have 1 f, (t) = √ 2π 1 = √ 2π

∞

−∞

f (x)e−itx dx

∞ −itx f (x)e + it

1 = it · √ 2π

−∞

∞ −∞

∞ −∞

f (x)e

−itx

dx

f (x)e−itx dx

= it fˆ(t). Remark 3.3 The corresponding result for the Fourier coefficients of an absolutely continuous function on [−π, π ] is as follows. Let f : R → C be a 2π -periodic function which is integrable on [−π, π ] and satisfies fˆ(0) = 0. If we define

F (x) = f (0) +

x

f (t)dt, 0

then F is a 2π -periodic continuous function and the Fourier coefficients of F are computed as 1 Fˆ (n) = fˆ(n), in

n 0.

In fact, the continuity (even the absolute continuity) of F is obvious. F is 2π periodic since

x+2π

F (x + 2π ) − F (x) = x

f (x)dt =

√ 2π fˆ(0) = 0.

58

3 Fourier Transforms (I)

Finally, we see by integration by parts that 1 Fˆ (n) = √ 2π

−1 F (t)e−int dt = √ 2π −π π

π −π

F (t)

1 −int 1 e dt = fˆ(n) −in in (assuming n 0).

Repeating the same reasoning as that in Theorem 3.4, we obtain the next corollary. Corollary 3.1 (Fourier transform of higher derivatives) If the ν-th derivative f (ν) (ν = 0, 1, · · · , k − 1) of f : R → C is absolutely continuous on each finite interval and f (ν) ∈ L1 (R, C) (ν = 0, 1, · · · , k − 1), then (k) (t) = (it)k fˆ(t). f-

The next result is an analogue of the Riemann–Lebesgue lemma concerning Fourier coefficients (Corollary 1.1, p. 14). Theorem 3.5 (Riemann–Lebesgue) For f ∈ L1 (R, C), lim fˆ(t) = 0.

|t|→∞

Proof Assume first that g ∈ C10 (R, C), i.e. g is continuously differentiable and the support of g is compact. Then we have, by Theorem 3.4 and (3.21), 1 | g, (t) |=| t g(t) ˆ | √

g 1 , 2π which implies 1 1 · √

g 1 = 0. |t|→∞ | t | 2π

ˆ | lim lim | g(t)

|t|→∞

(3.24)

For any f ∈ L1 (R, C) and any ε > 0, there exists some g ∈ C10 (R, C) such that

f − g 1 <

√ 2π · ε.

Again by (3.21), we have | fˆ(t) − g(t) ˆ | ε. It follows, from (3.24) and (3.25), that lim | fˆ(t) | ε.

|t|→∞

(3.25)

3.2 Fourier Transforms on L1 (R, C)

59

Since ε > 0 is arbitrary, we conclude that lim | fˆ(t) |= 0.

|t|→∞

This establishes the important result that the Fourier transform of a function in L1 is uniformly continuous and vanishes at infinity. The next theorem clarifies how fast fˆ vanishes at infinity. Theorem 3.6 (speed of vanishing at infinity) Assume that the derivatives f (ν) (ν = 0, 1, · · · , k − 1) of a function are absolutely continuous on each finite interval and f (ν) ∈ L1 (R, C) (ν = 0, 1, · · · , k − 1). Then fˆ(t) = o(| t |−k )

| t |→ ∞.

as

Proof By Corollary 3.1, (k) (t) = (it)k fˆ(t) = f-

1 (

1 k it )

fˆ(t).

(k) (t) → 0 (as | t |→ ∞) by Theorem 3.5. Hence we obtain Since f (k) ∈ L1 , f(k) (t) |= | f-

1 |

1 t

|k

| fˆ(t) |→ 0

as

| t |→ ∞.

This theorem tells us that the higher the derivatives f admits, the more rapidly fˆ vanishes at infinity. We can also prove a dual result that the more rapidly f vanishes at infinity, the smoother fˆ becomes. We prepare a lemma. Lemma 3.1 For any complex number a = α + iβ, + | eia − 1 |

|a|e−β

for

β 0,

|a|

for

β 0.

Theorem 3.7 (derivative of Fourier transform) Assume that f, L1 (R, C). Then fˆ is differentiable and d ˆ f (t) = (−ixf )(t). dt .

Proof First we observe that −ihx

∞ 1 fˆ(t + h) − fˆ(t) −1 −itx e = √ dx. f (x)e h h 2π −∞

xf (x) ∈

60

3 Fourier Transforms (I)

By Lemma 3.1, −ihx − 1 −itx e f (x)e | xf (x) | . h It is clear that f (x)e

−itx

e−ihx − 1 h

−→ −ixf (x)e−itx

as h → 0.

Combining these results and the assumption xf (x) ∈ L1 (R, C), we obtain, by the dominated convergence theorem, −ihx

∞ 1 −1 fˆ(t + h) − fˆ(t) −itx e f (x)e lim = lim √ dx h→0 h→0 h h 2π −∞ 1 = √ 2π

∞ −∞

(−ixf (x))e−itx dx

.

= (−ixf )(t). Similarly: Corollary 3.2 (higher derivatives of Fourier transform) If f (x), xf (x), · · · , x k f (x) are integrable, then fˆ is differentiable k-times and .

fˆ(ν) (t) = [(ix)ν f ](t) , ν = 0, 1, · · · , k. We have arrived at a significant result: the Fourier transform fˆ of a function f which is very smooth and vanishes at infinity very rapidly is also very smooth and vanishes at infinity very rapidly. This observation leads to the idea of the space of rapidly decreasing functions, which is to be discussed in Chap. 4. But before going on to this topic, we make a digression to a simple application to physics. The heat equation will be discussed in the next section.

3.3 Application: Heat Equation We interpret u(x, t) as the temperature at a point x ∈ (−∞, ∞) on a line and at time t 0. Given an initial condition u(x, 0) = u0 (x), the dynamic process of heat conduction is described by the following partial differential equation, called the heat equation:

3.3 Application: Heat Equation

61

∂ 2u ∂u (x, t) = 2 (x, t) ∂t ∂x

(3.26)

subject to u(x, 0) = u0 (x).

(3.27)

u0 (x) is a known function, and we suppose that u0 (x),

u0 (x),

u0 (x) ∈ L1 (R, R).

We would like to find a solution of (3.26) and (3.27) in the space of functions u which satisfy: (i) x → u(x, t), x → ux (x, t), x → uxx (x, t) are integrable on R for any fixed t (ux = ∂u/∂x, uxx = ∂ 2 u/∂x 2 ) and (ii) for any bounded interval [0, T ], there exists some integrable function f : R −→ R such that | ut (x, t) | f (x)

for all

t ∈ [0, T ]

x ∈ R.

and

The Fourier transforms of both sides of (3.26) (with respect to x) are given by 1 Fourier transform of the left-hand side of (3.26) = √ 2π

∞ −∞

1 ∂ = √ 2π ∂t

ut (x, t)e−iλx dx ∞

−∞

u(x, t)e−iλx dx

(making use of differentiation formula under integral4 ), and 1 Fourier transform of the right-hand side of (3.26) = −λ2 · √ 2π

∞

−∞

u(x, t)e−iλx dx

(by Corollary 3.1). If we define v(λ, t) by 1 √ 2π

∞ −∞

u(x, t)e−iλx dx = ν(λ, t),

we have vt (λ, t) = −λ2 v(λ, t).

4 cf.

(3.28)

Takagi [9] p. 166, Theorem 42. We must be sure that we are dealing with an improper integral here. See also Stromberg [8] p. 380.

62

3 Fourier Transforms (I)

It is not difficult to solve (3.28) under the initial condition v(λ, 0) = u,0 (λ).

(3.29)

In fact, (3.28) and (3.29) can be regarded as an ordinary differential equation in t when λ is arbitrarily fixed. The solution is given by v(λ, t) = e−λ t u,0 (λ). 2

(3.30)

Applying the result on p. 56 to the case a = 1/4t, we can observe that e−λ t is the Fourier transform of 2

1 2 x → √ e−x /4t . 2t If we denote by F the mapping f → fˆ, we obtain, by (3.30), that 1 2 √ e−x /4t (λ) · F (u0 )(λ) 2t 1 1 −x 2 /4t = √ ∗ u0 (x) (λ) ·F √ e 2π 2t

v(λ, t) = F

(by Theorem 3.3).

Hence v(λ, t) is the Fourier transform of u(x, t) =

1 √ 2 πt

∞

−∞

e−(x−ξ )

2 /4t

· u0 (ξ )dξ =

1 √ 2 πt

∞ −∞

e−ξ

2 /4t

· u0 (x − ξ )dξ. (3.31)

The function u(x, t) thus obtained is the solution of (3.26) and (3.27), and is called the Poisson formula of the heat equation. Check that (3.31) is really the solution.5

References 1. Cartan, H.: Théorie élémentaires des fonctions analytiques d’une ou plusieurs variables complexes. Hermann, Paris (1961) (English edn.) Elementary Theory of Analytic Functions of One or Several Complex Variables. Addison Wesley, Reading (1963) 2. Dym, H., McKean, H.P.: Fourier Series and Integrals. Academic Press, New York (1972) 3. Goldberg, R.R.: Fourier Transforms. Cambridge University Press, Cambridge (1961) 4. Kawata, T.: Fourier Henkan to Laplace Henkan (Fourier Transforms and Laplace Transforms). Iwanami Shoten, Tokyo (1957) (Originally published in Japanese) exposition here is basically due to Kolmogorov–Fomin [6] Chap. 8, §4, 6◦ . See also Schwartz [7] pp. 211–215 and pp. 311–315.

5 The

References

63

5. Kawata, T.: Fourier Kaiseki (Fourier Analysis). Sangyo Tosho, Tokyo (1975) (Originally published in Japanese) 6. Kolmogorov, A.N., Fomin, S.V.: (English edn.) Introductory Real analysis. Prentice-Hall, New York (1970) (Japanese edn. translated from 4th edn.) Kansu-kaiseki no Kiso, 4th edn. Iwanami Shoten, Tokyo (1979) (Originally published in Russian). I use the Japanese edition for quotations. 7. Schwartz, L.: Méthods mathématique pour les sciences physiques. Hermann, Paris (1965) 8. Stromberg, K.R.: An Introduction to Classical Real Analysis. American Mathematical Society, Providence (1981) 9. Takagi, T.: Kaiseki Gairon (Treatise on Analysis), 3rd edn. Iwanami Shoten, Tokyo (1961) (Originally published in Japanese) 10. Titchmarsh, E.C.: Introduction to the Theory of Fourier Integrals, 2nd edn. Oxford, London (1948)

Chapter 4

Fourier Transforms (II)

In the previous chapter, we observed a peculiar relation between the smoothness and the rapidity of vanishing at infinity of a function f , as well as its Fourier transform fˆ. Based upon this observation, we introduce an important function space S, which is invariant under the Fourier transforms. We then proceed to L2 -theory of Fourier transforms due to M. Plancherel. As a simple application of Plancherel’s theory, we discuss how to solve integral equations of convolution type. Finally, a tempered distribution is defined as an element of S , and its Fourier transform is examined in detail.

4.1 Fourier Transforms of Rapidly Decreasing Functions As we have already remarked, it is conjectured that if a function f is very smooth and vanishes at infinity very fast, its Fourier transform fˆ enjoys the same properties. We now provide a rigorous proof of this conjecture. Definition 4.1 A smooth (differentiable infinitely many times) function f R −→ C is said to be rapidly decreasing at infinity if it satisfies

:

sup | x n f (m) (x) |< ∞ x∈R

for all n, m = 0, 1, 2, · · · . We denote by S(R) the set of all the rapidly decreasing functions. (f (m) denotes the m-th derivative of f .) The condition sup | x n f (m) (x) |< ∞ means that every derivative of f decreases x∈R

to 0 more rapidly than any power of 1/ | x | as | x |→ ∞.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_4

65

66

4 Fourier Transforms (II)

We often denote by D(R) the set of all the smooth functions f : R −→ C with 2 compact support. It is obvious that D ⊂ S. The function f (x) = e−x is an example 1 which is contained in S but not in D. It is quite easy to verify the following facts. 1◦ A function f ∈ E(R) belongs to S(R) if and only if sup |P (x)f (m) (x)| < ∞ x∈R

for all the polynomials P (x) and all the nonnegative integers m 0. 2◦ lim P (x)f (m) (x) = 0

|x|→∞

for any f ∈ S(R) and any polynomial P (x). In fact, there exists some C > 0 such that sup |(1 + |x|2 )P (x)f (m) (x)| C < ∞ x∈R

by 1◦ , since (1 + |x|2 )P (x) is a polynomial. Hence |P (x)f (m) (x)|

C 1 + |x|2

for all x ∈ R. It implies that lim P (x)f (m) (x) = 0. Note that the convergence to |x|→∞

0 is uniform with respect to m. Theorem 4.1 (properties of S) (i) If f is an element of S, x n f (m) (x) is bounded and integrable for any integers n, m 0. (ii) If f is an element of S, (x n f (x))(m) is bounded and integrable for any integers n, m 0. Proof (i) By the definition of S, there exists some M > 0 such that sup | x n+2 f (m) (x) | M < ∞. x∈R

Consequently, | x n f (m) (x) |

1 Where

M x2

for all

x ∈ R.

the domain and the range are clear enough, we write just S, D and so on, for brevity.

4.1 Fourier Transforms of Rapidly Decreasing Functions

(i) immediately follows from this. (ii) is clear from (i) and the Leibnitz formula.

67

S is a vector space, the topology of which is defined by the family of seminorms pn,m (f ) = sup | x n f (m) (x) | ; n, m = 0, 1, 2, · · · . x∈R

Thus S is a metrizable locally convex Hausdorff topological vector space. A sequence {fν } converges to 0 (in the above topology) if and only if x n fν(m) (x) → 0

as

ν → ∞ (uniformly)2

for all n, m = 0, 1, 2, · · · . In the case of n = m = 0, in particular, we have p0,0 (f ) = sup |f (x)|

(uniform-norm).

x∈R

So the convergence of {fν } to f in the topology of S implies the uniform convergence. We shall use the following notations of some function spaces: C∞ (R, C) :

the space of continuous functions (R → C) vanishing at infinity.

C0 (R, C) :

the space of continuous functions (R → C) with compact support.

C∞ 0 (R, C) = D(R) :

the space of infinitely differentiable functions (R → C) with compact support.

We now list several peculiar relations between these function spaces from topological viewpoint. The first three results concern the uniform convergence: 1◦ C0 is dense in C∞ with respect to the uniform convergence topology. 2◦ D is dense in C0 with respect to the uniform convergence topology. 3◦ D is dense in S with respect to the topology defined above (consequently also with respect to the uniform convergence topology). Proof Let f be any element of S. We choose ψ ∈ D so as to satisfy ψ(x) = 1 for 2 See

|x| 1.

Schwartz [11], Bourbaki [1] and Grothendieck [4] for the theory of locally convex spaces. See also Appendix B in this book.

68

4 Fourier Transforms (II)

Define a function fε , for any ε > 0, by fε (x) = f (x)ψ(εx). Then fε ∈ D and D α (fε (x) − f (x)) = D α [f (x)(ψ(εx) − 1)] is a linear combination of functions of the form (β + γ = α, γ > 0),

D β f (x)εγ D γ ψ(εx) (D f (x))(ψ(εx) − 1). α

Hence fε → f as ε → 0 by the definition of the topology of S.

The following two facts concern the relationships with Lp (R, C). First: 4◦ C0 is dense in Lp (R, C) (p 1) with respect to Lp -topology.3 Combining 2◦ and 4◦ , we obtain the next one: 5◦ D is dense in Lp (R, C) (p 1) with respect to Lp -topology. (Since D ⊂ S, S is also dense in Lp (R, C).) Theorem 4.2 is a basic result which verifies that S is invariant under the Fourier transform. Theorem 4.2 (Fourier transform on S) The Fourier transform on S is an automorphism on S. The inverse transform is given by

∞ 1 f (t)eixt dt , f ∈ S. f˜(x) = √ 2π −∞ (This is called the Fourier inversion formula.) Proof Since f ∈ S is an integrable function, we can define its Fourier transform fˆ by 1 fˆ(t) = √ 2π

∞ −∞

f (x)e−itx dx , f ∈ S.

(4.1)

fˆ is differentiable infinitely many times. In fact, the n-th formal derivative of the integrand of (4.1) is (−i)n x n f (x)e−itx , 3A

more general theorem is explained in Maruyama [9] pp. 232–233.

(4.2)

4.1 Fourier Transforms of Rapidly Decreasing Functions

69

the absolute value of which is | x n f (x) | (independent of t). Since x n f (x) is integrable by Theorem 4.1, the integral of (4.2) over R is uniformly convergent in t. Hence, by applying the formula of differentiation under an (improper) integral,4 we obtain

∞ 1 (n) ˆ f (t) = √ (−i)n x n f (x)e−itx dx. (4.3) 2π −∞ Thus fˆ is differentiable arbitrarily many times. By Theorem 3.4 (or Corollary 3.1, p. 58), 1 i m t m fˆ(t) = √ 2π

∞

−∞

f (m) (x)e−itx dx.

(4.4)

It follows that

∞ dm 1 e−itx m (x n f (x))dx √ dx 2π −∞ n f (t) = i m t m · x= i m t m · fˆ(n) (t) · (−i)−n

(by (4.3))

(4.5)

= i m+n t m fˆ(n) (t), which gives m ˆ(n)

|t f

1 (t) | √ 2π

∞

−∞

|(x n f (x))(m) |dx.

(4.6)

Since (x n f (x))(m) is integrable by Theorem 4.1(ii), the right-hand side of (4.6) is finite, and so sup | t m fˆ(n) (t) |< ∞. t∈R

This holds good for any m, n = 0, 1, 2, · · · . We can conclude that fˆ ∈ S. The continuity of the Fourier transform F : S → S is easily seen from (4.6). In fact, if {fν } is a sequence in S which converges to 0, then it satisfies, by the Leibnitz formula: sup | (x ν fν (x))(m) |→ 0 x∈R

4 Takagi

[13] p. 166.

as

ν→∞

70

4 Fourier Transforms (II)

for any m, n = 0, 1, 2, · · · . Hence we have, by (4.6), sup | t m fˆν(n) (t) |→ 0 as

ν → ∞,

t∈R

which implies that {fˆν } converges to 0 in S. This proves the continuity of the Fourier transform F . We define the operator F˜ which associates with each f ∈ S the function 1 f˜(x) = √ 2π

∞

−∞

f (t)eixt dt.

The difference between F and F˜ is just the sign of the exponential. Obviously, F˜ (f )(x) = F (f )(−x). It is clear that F˜ is a continuous linear operator of S into itself. Finally, we have to show that F˜ is the inverse of F . It is enough to show that F˜ ◦ F = I ; i.e. f˜ˆ = f

for all

f ∈ S,

(4.7)

F ◦ F˜ = I ; i.e. fˆ˜ = f

for all

f ∈ S,

(4.8)

where I is the identity operator on S. We prove only (4.7) ((4.8) can be proved similarly), which is equivalent to 1 √ 2π

∞ −∞

fˆ(t)eixt dt = f (x)

for all f ∈ S.

(4.7 )

For any f, g ∈ S and any ε > 0, we have

∞

1 g(εt)fˆ(t)eixt dt = √ 2π −∞ 1 = √ 2π 1 = √ 2π

−∞

∞

∞

g(εt)

−∞

∞

−∞

−∞

f (y)e−ity dy eixt dt

∞ −∞

∞

∞ −∞

g(εt)e

−it (y−x)

g(εt)e−i(εt)

dt f (y)dy

y−x ε

dt f (y)dy.

(4.9)

4.1 Fourier Transforms of Rapidly Decreasing Functions

71

Changing the variables by w = εt, z = (y − x)/ε, (4.9) is rewritten as

∞

1 g(εt)fˆ(t)eixt dt = √ 2π −∞

=

∞

−∞

In a particular case of g(t) = e−t

g(0)

∞ −∞

∞ −∞

∞

−∞

g(w)e−iwz dw f (x + εz)dz

g(z)f ˆ (x + εz)dz.

2 /2

(4.10)

, we obtain

fˆ(t)eixt dt = f (x)

∞ −∞

g(z)dz ˆ

(4.11)

by passing to the limit ε ↓ 0 and the dominated convergence theorem. We note that g(0) = 1 and

∞ −∞

g(z)dz ˆ =

√ 2π

by Example 3.3 (p. 54). Therefore 1 √ 2π

∞

−∞

fˆ(t)eixt dt = f (x).

This completes the proof.5

Henceforth we denote the operator F˜ : f −→ f˜ (S → S) by F −1 and call it the inverse Fourier transform. Remark 4.1 It is clear that S = {fˆ|f ∈ S} is dense in C∞ (by 1◦ on p. 67). Furthermore, if we write Sc ≡ {f ∈ S|the support of fˆ is compact}, Sc and {fˆ|f ∈ Sc } are dense in S (and hence in C∞ ) with respect to the uniform convergence topology.

5 The proof of (4.7) here is due to Yosida [16], p. 147. Although this approach is a little bit technical,

I adopt it because of its simplicity. There are various other approaches. For instance, see Kawata [6] Chap. 11, Treves [14] Theorem 25.1 or Maruyama [8] Theorem 4.13.

72

4 Fourier Transforms (II)

The proof is as follows. If f is any element of S, then fˆ ∈ S. Hence there exists a sequence {gn } of D such that gn → fˆ (in S-topology) (cf. 3◦ on p. 67). Let fn ∈ S be the inverse Fourier transform of gn ; i.e. fˆn = gn . Then fn ∈ Sc . {fn } uniformly converges to f , since fn → f (in S-topology) by the continuity of the inverse Fourier transform. This proves that Sc is dense in S. The denseness of {fˆ|f ∈ Sc } in S can be proved exactly in the same way.

4.2 Fourier Transforms on L2 (R, C) It may not be possible to define the Fourier transform of a square-integrable function, since L2 (R, C) L1 (R, C). However, the Plancherel theorem guarantees the existence of the Fourier transform of a function in L2 (R, C) in a suitable sense. Theorem 4.3 (Plancherel) The Fourier transform F on S can be uniquely extended to a bounded linear operator of L2 (R, C) into itself. This is called the Fourier transform on L2 (R, C) and is denoted by F2 . Similarly, the inverse Fourier transform F −1 can be uniquely extended as a bounded linear operator of L2 (R, C) into itself, denoted by F2−1 . The following properties hold good for f, g ∈ L2 (R, C): F2 f , g = f, F2−1 g. (·, · is the inner product in L2 (R, C).)

F2 f 2 = f 2 . F2 : L2 (R, C) → L2 (R, C) is a bijection. F2−1 is the inverse of F2 .

r 1 f (x)e−itx dx. (l.i.m. is the limit with respect to (v) (F2 f )(t) = l.i.m. √ r→∞ 2π −r L2 -norm.)

r 1 (vi) (F2−1 f )(x) = l.i.m. √ f (t)eixt dt. r→∞ 2π −r

(i) (ii) (iii) (iv)

Proof We denote by F and F −1 the Fourier transform and the inverse Fourier transform on S, respectively. Then it is easy to check that 1 Ff, g = √ 2π 1 = √ 2π

∞

−∞

∞

−∞

+

∞

−∞

f (x)e

f (x)

∞

−∞

−itx

dx g(t)dt /

g(t)eixt dt dx

= f, F −1 g for all f, g ∈ S. If g = Ff , in particular, (4.12) implies that

(4.12)

4.2 Fourier Transforms on L2 (R, C)

73

Ff 2 = f 2

f ∈ S.

for all

(4.13)

Hence F : S → S is an isometric (with respect to L2 -norm) operator. S being dense in L2 (by 5◦ on p. 68),6 F can be uniquely extended to a bounded linear operator F2 on L2 , preserving the norm. Similarly, F −1 : S → S admits the unique extension to a bounded linear operator F2−1 on L2 . (Note that whether F2−1 is the inverse of F2 has not been verified yet. It is proved in (iv).) (i) Suppose that f, g ∈ L2 (R, C). Since S is dense in L2 (R, C), there are sequences {fn }, {gn } in S such that l.i.m.fn = f, n

l.i.m.gn = g. n

According to (4.12), we have Ffn , gn = fn , F −1 gn

for all

n.

Hence (taking account of F = F2 , F −1 = F2−1 on S) we obtain F2 fn , gn = fn , F2−1 gn

for all

n.

(4.14)

Passing to the limit of (4.14) as n → ∞, we get (i) by the continuities of F2 , F2−1 and the inner product. (ii) Obvious from (4.13). (iii) The injectiveness of F2 immediately follows from (ii). For any g ∈ L2 (R, C), there exists a sequence {gn } in S which satisfies l.i.m.gn = g. Taking account of the fact that F : S → S is an automorphism, we define fn = F −1 gn . Since {fn } is a Cauchy sequence in L2 (R, C) by (ii), there exists some f ∈ L2 (R, C) such that l.i.m.fn = f . By the continuity of F2 , we have n

gn = F2 fn = Ffn → F2 f = g. This shows that F2 is a surjection. (iv) We have to show that F2−1 ◦ F2 = F2 ◦ F2−1 = I . For any f ∈ L2 (R, C), there exists a sequence {fn } in S which converges to f (in L2 ). Then (F2−1 ◦ F2 )fn = (F −1 ◦ F )fn = fn .

(4.15)

Passing to the limit, (4.15) gives (F2−1 ◦ F2 )f = f. direct proof is also possible. Since f ∈ S is bounded and integrable, f · f is integrable (i.e. f ∈ L2 ) by Hölder’s inequality. We have only to approximate f ∈ L2 by a simple function ϕ with compact support and to approximate ϕ by a smooth function with compact support.

6A

74

4 Fourier Transforms (II)

Similarly for (F2 ◦ F2−1 )f = f . (v) Let f ∈ L2 (R, C). We define fr (x) by + fr (x) =

f (x)

if

| x | r,

0

if

| x |> r.

Then there exists a sequence {ϕn } in S such that: a. {ϕn } converges to fr in L2 , and b. there exists a bounded interval J containing [−r, r] which satisfies supp ϕn ⊂ J for all n. Since {ϕn } converges to fr in L1 ,

∞ 1 ∞ 1 −itx ϕ (x)e dx − fr (x)e−itx dx √ √ n 2π −∞ 2π −∞

∞ 1 √ | ϕn (x) − fr (x) | dx → 0 2π −∞ as

n → ∞.

Consequently, we obtain 1 F2 fr (t) = √ 2π

∞ −∞

fr (x)e

−itx

1 dx = √ 2π

r

−r

f (x)e−itx dx,

by extension by continuity. Thus (v) follows from fr −f 2 → 0 and the continuity of F2 . (vi) can be proved similarly. Remark 4.2 1◦ It is clear that F2 f, F2 g = f, g 2◦

for any f, g ∈ L2 (R, C). For any f ∈ L1 (R, C) ∩ L2 (R, C), the Fourier transform fˆ in the usual sense and the Fourier transform F2 f in the sense of Plancherel coincide. In fact, since

fr − f 1 → 0, we have fˆr − fˆ ∞ → 0 by (3.21) on p. 52. On the other hand, fˆr − F2 f 2 → 0 as shown above. Hence fˆ = F2 f .7

7 The

proof here is due to Dunford–Schwartz [2] III, pp. 1988–1989. See also Sect. 4.4 in this chapter and Schwartz [12] Chap. VII for the connection with the theory of distributions.

4.3 Application: Integral Equations of Convolution Type

75

4.3 Application: Integral Equations of Convolution Type In this section, we consider how to solve an integral equation of convolution type:

f (x) = g(x) +

∞

−∞

K(x − y)f (y)dy,

(4.16)

where the functions g, K : R → C are given. We wish to find a function f which satisfies (4.16). This problem provides a typical sample of applications of Plancherel’s theorem. We start with a rough sketch of ideas. The Fourier transforms of both sides of (4.16) give fˆ(t) = g(t) ˆ +

√ ˆ 2π fˆ(t) · K(t)

from which fˆ is expressed as fˆ(t) =

g(t) ˆ . √ ˆ 1 − 2π K(t)

Hence we have, by the inverse Fourier transform, 1 f (x) = √ 2π

∞

−∞

g(t) ˆ eixt dt. √ ˆ 1 − 2π K(t)

This f seems to be a solution of (4.16). However, we have to admit that there are several obscure points in the above reasoning. First of all, it is required that the Fourier transforms in a certain sense can be defined for g, K and f . In particular, we have to apply the inverse Fourier transform at the final step. Is it possible? We have to specify some suitable function √ spaces which permit these operations. Furthermore, it is very possible that ˆ 1 − 2π K(t) = 0, which is an obstacle to computing fˆ(t). Taking account of the difficulties stated above, we assume a couple of conditions: (i) g ∈ L2 (R, C). (ii) K ∈ L1 (R, C);

ˆ | K(t) | C <

√1 2π

for all t ∈ R.

We would like to find a solution of (4.16) in L2 (R, C) under these conditions. We use the notation , for denoting Fourier transforms of functions of either L1 or L2 , because no confusion may occur. Under the conditions (i) and (ii), we have g(t) ˆ ∈ L2 (R, C). √ ˆ 1 − 2π K(t)

76

4 Fourier Transforms (II)

Hence, by Theorem 4.3,

1 f (x) ≡ l.i.m. √ r→∞ 2π

r

−r

g(t) ˆ eixt dt √ ˆ 1 − 2π K(t)

can be defined and f ∈ L2 (R, C). We can also define a function h by

h(x) =

∞ −∞

K(x − y)f (y)dy

and this h belongs to L2 (R, C).8 The Fourier transform of h in the sense of Plancherel is given by ˆ = l.i.m. √1 h(t) r→∞ 2π

r

−r

∞ −∞

K(x − y)f (y)dy e−itx dx =

√ ˆ fˆ(t). 2π K(t) (4.17)

This can be proved as follows. If we define fA = f · χ[−A,A] , fA ∈ L1 ∩ L2 and l.i.m.fA = f.

A→∞

(4.18)

It follows from Theorem 4.3 (continuity of Fourier transform) that l.i.m.fˆA = fˆ (in the sense of Plancherel).

A→∞

Define a function hA (x) by

hA (x) =

∞

−∞

K(x − y)fA (y)dy.

Then the Fourier transform formula of the convolution of L1 -functions gives hˆ A =

8 In

√

2π Kˆ · fˆA .

general, it holds good that u ∗ v ∈ Lp (R, C) and u ∗ v p u p · v 1 for any u ∈ p ∞) and v ∈ L1 (R, C). See Maruyama [9] pp. 235–236.

Lp (R, C) (1

4.3 Application: Integral Equations of Convolution Type

77

Since9 l.i.m.hA = h,

(4.19)

A→∞

we obtain, applying Theorem 4.3 again, l.i.m.hˆ A = l.i.m.

A→∞

√

A→∞

ˆ 2π Kˆ · fˆA = h.

(4.20)

ˆ we can conclude By the boundedness of K, hˆ =

√

2π Kˆ · fˆ.

This proves (4.17). Since fˆ(t) =

g(t) ˆ , √ ˆ 1 − 2π K(t)

we obtain √

ˆ = h(t)

ˆ g(t) 2π K(t) ˆ . √ ˆ 1 − 2π K(t)

Consequently, √

(g + h)(t) = g(t) ˆ +

ˆ 2π g(t) ˆ K(t) g(t) ˆ = = fˆ(t). √ √ ˆ ˆ 1 − 2π K(t) 1 − 2π K(t)

relation (4.19) can be verified as follows. In general, if ur ∈ L2 (R, C), and v ∈ L1 (R, C), and l.i.m.ur = u, then

9 The

r→∞

l.i.m. r→∞

v(x − z)ur (z)dz =

v(x − z)u(z)dz.

(†)

In fact, by the footnote just above,

v(x − z)ur (z)dz − v(x − z)u(z)dz = [ur (z) − u(z)]v(x − z)dz 2

2

ur − u 2 · v 1 → 0 as which establishes (†). Now (4.19) immediately follows from (†).

r → ∞,

78

4 Fourier Transforms (II)

Taking the inverse Fourier transforms of both sides in the sense of Plancherel, we have f = g + h, i.e.

f (x) = g(x) +

∞

−∞

K(x − y)f (y)dy.

This proves that f is a solution of the integral equation (4.16).10

4.4 Fourier Transforms of Tempered Distributions We now proceed to the theory of Fourier transforms of distributions in the sense of L. Schwartz. The readers can find a brief exposition of Schwartz’s theory in the appendices of this book. Let Ω be an open set in R. We denote by E(Ω) the set of all the infinitely differentiable complex-valued functions defined on Ω. For each compact set K ⊂ Ω and m ∈ N ∪ {0}, we define a seminorm pK,m by pK,m (ϕ) = sup |D s ϕ(x)|. x∈K sm

The space E(Ω) endowed with the topology by the family of seminorms {pK,m } is a locally convex Hausdorff topological vector (or linear) space (LCHTVS). Furthermore, this topology is completely metrizable. So E(Ω) is a Fréchet space (cf. Appendix B, Sect. B.1). Let K be a compact set in Ω. We denote by DK (Ω) the set of all the infinitely differentiable functions whose supports are contained in K; i.e. DK (Ω) = {ϕ ∈ E(Ω)|suppϕ ⊂ K}. DK (Ω) is a linear subspace of E(Ω). DK (Ω) is also a LCHTVS, the topology of which is defined by the relative topology induced by E(Ω). It is defined by the family of seminorms pK,m (ϕ) = sup |D s ϕ(x)|,

m ∈ N ∪ {0}.

x∈K sm

DK (Ω) is also a Fréchet space.

10 This

section is basically due to Kawata [5]II, Chap. 17. However, the proof of (4.19) given there does not seem correct.

4.4 Fourier Transforms of Tempered Distributions

79

Construct a sequence {K1 , K2 , · · · } of compact sets in Ω which satisfies (i) Kn ⊂ int.Kn+1 , and ∞ (ii) Kn = Ω. n=1

We denote by D(Ω) the set of all elements of E(Ω) with compact support. Then it is the union of DKn (Ω) (n = 1, 2, · · · ); i.e. D(Ω) =

∞

DKn (Ω).

n=1

The topology of each DKn (Ω) is explained above. On the other hand, the topology of D(Ω) is given as the strict inductive limit topology induced by {DKn (Ω)}. This topology is uniquely determined independently of the choice of {Kn } (cf. Appendix B, Sect. B.1). Each element of D(Ω) is called a test function. D(Ω) is a LCHTVS, but is not metrizable. A net {ϕα }α∈A (A is a directed set) in D(Ω) converges to some ϕ ∗ ∈ D(Ω) if and only if there exists a compact set K ⊂ Ω (independent of α) such that suppϕα ⊂ K

for all

α∈A

(4.21)

and D s ϕα → D s ϕ ∗

(uniform convergence)

(4.22)

for all s ∈ N ∪ {0}. An element of the dual space D(Ω) of D(Ω) is called a distribution (or a generalized function) on Ω. For instance, a locally integrable function f : Ω → C defines a distribution Tf : D(Ω) → C by

Tf (ϕ) =

f (x)ϕ(x)dx,

ϕ ∈ D(Ω).

We sometimes denote the distribution Tf simply by f if no confusion occurs. For any distribution T , we define another distribution S by S(ϕ) = −T (ϕ ),

ϕ ∈ D(Ω).

S is called the generalized derivative or distributional derivative of T and denoted by T . Any distribution admits a distributional derivative of any order. The s-th distributional derivative, denoted by D s T or T (s) , is defined by D s T (ϕ) = (−1)s T (D s ϕ),

ϕ ∈ D(Ω),

80

4 Fourier Transforms (II)

where D s ϕ is the s-th derivative of ϕ in the usual sense. The basic motivation to define the concept of distributional derivative in this way is explained in Appendix B, Sect. B.4. If we restrict T ∈ S(R) to D(R) ⊂ S(R), T is continuous on D(R) with respect to its topology (defined by the strictly inductive limit). Observe that the identity mapping of D(R) into S(R) is continuous. The topolopy of S(R) is defined in Sect. 4.1. In fact, if a net {ϕα } in D(R) converges to 0∈ D(R), then (4.21) and (4.22) must hold good. Hence sup |x n ϕα(m) (x)| = sup |x n ϕα(m) (x)| → 0

x∈R

x∈K

for any m = 0, 1, 2, · · · . This implies that {ϕα } converges to 0 in the topology of S(R). Thus we conclude that the identity I : D(R) → S(R) is continuous. In other words, the proper topology of D(R) is stronger than the relative topology induced by S(R). So T |D(R) is continuous, that is, T |D(R) is a distribution. Conversely, if a distribution T ∈ D(R) is continuous with respect to the relative topology of D(R) induced by S(R), T can be uniquely extended to an element of S(R) since D(R) is dense in S(R). (cf. 3◦ on p. 67.) Based upon the reasoning so far discussed, we can identify S(R) with a subspace of D(R) by identifying T ∈ S(R) with its unique extension to a distribution ∈ D(R) . S(R) ⊂ D(R) . Definition 4.2 An element of S(R) is called a tempered distribution. The motivation for this terminology will be made clear soon. Let us show several examples. Example 4.1 An integrable function f : R → C defines a tempered distribution. In fact, the functional

Tf ϕ = f (x)ϕ(x)dx, ϕ ∈ S(R), R

which makes sense, is continuous with respect to the topology of S(R). Example 4.2 A bounded measurable function f : R → C defines a tempered distribution. In fact, the functional

Tf ϕ = f (x)ϕ(x)dx, ϕ ∈ S(R), R

which makes sense because any ϕ ∈ S(R) is integrable, is continuous. Example 4.3 Assume that a locally integrable function f : R → C satisfies lim |x|−k |f (x)| = 0,

|x|→∞

(4.23)

4.4 Fourier Transforms of Tempered Distributions

81

and so there exists some constant A > 0 such that |f (x)| A|x|k

for large x.

Then f defines a tempered distribution. It can be verified as follows. If ϕ ∈ S(R), we have sup|x k+2 ϕ(x)| B

for some

B<∞

x∈R

and so |ϕ(x)|

B |x|k+2

for all

x ∈ R.

Hence it holds good that |f (x)ϕ(x)|

A·B |x|2

for large x.

Thus the functional

Tf ϕ =

R

f (x)ϕ(x)dx,

ϕ ∈ S(R)

makes sense, and is continuous on S(R). The concept of tempered distribution is a generalization of a distribution defined by a locally integrable function which is slowly increasing in the sense of (4.23). Example 4.4 Assume that μ is a Borel measure on R which satisfies

R

(1 + |x|2 )−k dμ < ∞

for some nonnegative integer k. Then μ defines a tempered distribution by the relation

ϕdμ, ϕ ∈ S(R). Tμ (ϕ) = R

(Such a measure is called a tempered measure.) It is almost obvious because any rapidly decreasing function ϕ satisfies ϕ(x) = o((1 + |x|2 )−k ) for sufficiently large |x|.

82

4 Fourier Transforms (II)

Example 4.5 Let f : R → C be a function of the class C∞ . f is said to be slowly increasing if there exists some k ∈ N ∪ {0}, for each m ∈ N ∪ {0}, such that 1 |f (m) (x)| = 0. |x|→∞ |x|k lim

A slowly increasing function f defines a tempered distribution Tf . Fix a slowly increasing function f . Let S be an operator which transforms each ϕ ∈ S to f ϕ. S is a continuous linear operator of S into itself. For a slowly increasing function f and T ∈ S , their product f T is defined by f T : ϕ → T (f ϕ),

ϕ ∈ S.

f T is an element of S . Example 4.6 A distribution (on R) with compact support is a tempered distribution. It is obvious that D(R) ⊂ S(R) ⊂ E(R). We have only to observe the following two facts: 1◦ The proper topology of S(R) is stronger than the relative topology of S(R) induced by E(R). 2◦ Since D(R) is dense in E(R) (cf. Lemma C.5, p. 399), S(R) is also dense in E(R). Hence E(R) can be regarded as a subspace of S(R) ; i.e. E(R) ⊂ S(R) ⊂ D(R) . The derivatives of a tempered distribution are defined in the same way as the usual distributional derivatives. The operator ϕ → D p ϕ,

ϕ ∈ S(R)

is a continuous linear operator of S(R) into S(R) for any nonnegative integer p 0. Hence, given T ∈ S(R) , it is possible to define an operator ϕ → T (D p ϕ),

ϕ ∈ S(R)

which is continuous. Thus we can define derivatives of tempered distributions as follows. Definition 4.3 Let p be a nonnegative integer. The p-th distributional derivative of T ∈ S(R) is defined by D p T (ϕ) = (−1)p T (D p ϕ),

ϕ ∈ S(R).

If T is a tempered distribution, so is D p T . We can now define the Fourier transform of a tempered distribution. We should recall the fact that the Fourier transform F of S(R) into S(R)(ϕ → ϕ) ˆ is an automorphism of S(R). So if T ∈ S(R) , the operator Tˆ defined by

4.4 Fourier Transforms of Tempered Distributions

Tˆ : ϕ → T (ϕ), ˆ

83

ϕ ∈ S(R)

is also a tempered distribution. We call this operator Tˆ the Fourier transform of T . Definition 4.4 Let T be a tempered distribution. The tempered distribution Tˆ defined by Tˆ (ϕ) = T (ϕ), ˆ

ϕ ∈ S(R)

is called the Fourier transform of T . The Fourier transform on S(R) is an automorphism of S(R) into itself. In order to see this clearly, we introduce the concept of dual operators. Definition 4.5 Let X and Y be two topological vector (linear) spaces, and A : X → Y a continuous linear operator. Then the operator A : Y → X defined by A : y → y ◦ A,

y ∈ Y

is called the dual operator of A. Theorem 4.4 Let X and Y be a pair of topological vector spaces, and A : X → Y a continuous linear operator. Then the dual operator A : Y → X is continuous in the strong topology.11 Proof Let {yα } be a net on Y , which converges to 0 in the strong topology. Let B be any bounded set in X. Then A(B) is bounded in Y, since A is continuous. Taking account of the relation A (yα ) = yα ◦ A, we obtain sup |A (yα )(x)| = sup |(yα ◦ A)(x)| = sup |yα (y)| → 0.

x∈B

x∈B

y∈A(B)

This relation holds good for any bounded set B ⊂ X. Hence we have immediately that yα → 0 (strong topology) ⇒ A (yα ) → 0 (strong topology). We denote by F : S(R) → S(R) the operator which associates a tempered distribution T ∈ S(R) with its Fourier transform. F is also called the Fourier transform.12 A relation between F : S(R) → S(R) and F : S(R) → S(R) is given by FT (ϕ) = Tˆ (ϕ) = T (ϕ) ˆ = T (F ϕ) for all

11 cf.

T ∈ S(R) , ϕ ∈ S(R),

Appendix B, Sect. B.2 (p. 366).

12 While the Fourier transform on S(R) is denoted by F , the Fourier transform on S(R)

is denoted by the German Fraktur letter F. The different letters are used for the sake of clear distinction.

84

4 Fourier Transforms (II)

that is, FT = T ◦ F

for all

T ∈ S(R) .

In other words, F is the dual operator of F : F = F .

(4.24)

We define an operator F−1 : S(R) → S(R) by ˜ F−1 T (ϕ) = T (F −1 ϕ) = T (ϕ) (where T ∈ S(R) and ϕ ∈ S(R)). F−1 is called the inverse Fourier transform. As usual, we sometimes write T˜ instead of F−1 T . It is also obvious that F−1 is the dual operator of F −1 . F−1 = (F −1 ) .

(4.25)

Theorem 4.5 F is an automorphism of S(R) . The inverse Fourier transform F−1 is its inverse. Proof The linearity and the injectivity are clear. The surjectivity can be shown as follows. If T is an element of S(R) , then ˜ˆ = T (ϕ) Tˆ˜ (ϕ) = T˜ (ϕ) ˆ = T (ϕ)

for all ϕ ∈ S(R).

This shows that T is the Fourier transform of T˜ ∈ S(R) . Similarly, the inverse of F is given by the inverse Fourier transform, since (F−1 ◦ F)(T )(ϕ) = T (F ◦ F −1 )(ϕ) = T (ϕ)

for all

T ∈ S(R) , ϕ ∈ S(R).

Hence F−1 ◦ F = I (identity). F and F−1 are continuous (in the strong topology) in view of Theorem 4.4.

Remark 4.3 We use the notation ϕˇ (or Tˇ ) which means ϕ(x) ˇ = ϕ(−x),

Tˇ (ϕ) = T (ϕ). ˇ

Then the following relations can be verified without any difficulty: (i) ϕˆˆ = ϕ, ˇ ϕ ∈ S(R). ˆ ˆ ˇ (ii) T = T , T ∈ S(R) . We now explain several examples of Fourier transforms of tempered distributions.

4.4 Fourier Transforms of Tempered Distributions

85

Example 4.7 The distributions defined by f ∈ L1 (R, C) and fˆ (usual Fourier transform) are denoted by Tf and Tfˆ , respectively. Then * T f = Tfˆ . In fact, it follows from

1 f (x)ϕ(x)dx ˆ = √ f (x) e−ixξ ϕ(ξ )dξ dx 2π R R R

ϕ(ξ ) e−ixξ f (x)dx dξ = Tfˆ (ϕ); ϕ ∈ S(R).

* T ˆ = f (ϕ) = Tf (ϕ) 1 = √ 2π

R

R

This example illuminates that the concept of Fourier transform of tempered distributions is a generalization of the usual Fourier transform. √ 1 Example 4.8 δˆ = √ T1 , T,1 = 2π δ. (T1 is the distribution defined by the 2π function = 1.) By direct computation, we have 1 ˆ δ(ϕ) = δ(ϕ) ˆ = ϕ(0) ˆ = √ 2π

1 1 · ϕ(ξ )dξ = √ T1 (ϕ) 2π R

for all

ϕ ∈ S(R).

This shows the first relation. The second one immediately follows from 1 δ = δˇ = δˆˆ = √ T,1 . 2π int . The relation Example 4.9 We next evaluate e-

1 int ϕ = √ e2π

0

ϕ(x)e

gives the inverse Fourier transform of int ϕ = e-

√ 2π δn ϕ

−itx

1 dx eint dt,

ϕ ∈ S(R)

√ 2π ϕ. ˆ Hence

(δn is the Dirac function concentrating at n),

which gives int = e-

√ 2π δn .

Example 4.10 For T ∈ S(R) , we have T, = ix Tˆ ,

(4.26)

86

4 Fourier Transforms (II)

- = −Tˆ , ixT

(4.27)

where ixT (resp. ix Tˆ ) is a tempered distribution defined as the product of ix and T (resp. Tˆ ). We recapitulate the formula (4.3): 1 D p fˆ(ξ ) = √ 2π

R

D p (e−iξ x f (x))dx,

f ∈ S(R).

(4.28)

The relation (4.26) follows from - = T (ixϕ) - = ix Tˆ (ϕ). ˆ = −T (ϕˆ ) = −T (−ixϕ) T, (ϕ) = T (ϕ) Readers should recall the formula (4.4): q f (ξ ) = i q ξ q fˆ(ξ ). D

(4.29)

Making use of this formula, we obtain (4.27) by - )(ϕ) = (ixT )(ϕ) (ixT ˆ = T (ix ϕ) ˆ = T (ϕ, ) = Tˆ (ϕ ) = −Tˆ (ϕ). (4.29)

We add here several remarks on simple convergence of Fourier transforms of distributions. 1◦ If a sequence {Tn } in D(R) simply converges to some T ∈ D(R) , then the sequence {Tn } of the derivatives (in the sense of distribution) also simply converges to T . (More generally, the sequence {D p Tn } of the p-th derivatives simply converges to D p T .) Proof For any ϕ ∈ D(R), we have lim T ϕ n→∞ n

= − lim Tn ϕ = −T ϕ = T ϕ. n→∞

2◦

S(R)

A sequence {Tn } in (the space of tempered distributions) simply converges to T ∈ S(R) if and only if the sequence {Tˆn } of the Fourier transforms simply converges to Tˆ .13 Proof Suppose that {Tn } simply converges to T . Then lim Tˆn ϕ = lim Tn ϕˆ = T ϕˆ = Tˆ ϕ,

n→∞

n→∞

Hence {Tˆn } simply converges to Tˆ . 13 cf.

Yosida and Kato [15] pp. 98–99.

ϕ ∈ S(R).

4.5 Fourier Transforms on L2 (R, C) Revisited

87

Conversely, assume that {Tˆn } simply converges to Tˆ . Let ψ be the inverse Fourier transform of ϕ ∈ S(R) (i.e. ψˆ = ϕ). Then lim Tn ϕ = lim Tn ψˆ = lim Tˆn ψ = Tˆ ψ = T ψˆ = T ϕ,

n→∞

n→∞

n→∞

which proves that {Tn } simply converges to T .

3◦ F : S(R) → S(R) is simply continuous. 4◦ n

ck δk simply converges ⇐⇒

k=−n

n

ikx simply converges ck e-

k=−n

⇐⇒

n

ck eikx simply converges.

k=−n

These relations can be confirmed by combining 2◦ and 3◦ .

4.5 Fourier Transforms on L2 (R, C) Revisited We have already studied Plancherel’s theory of Fourier transforms on L2 (R, C) in Sect. 4.2. The theory of distributions sheds new light on this topic.14 Lemma 4.1 f ∈ Lp (R, C) (p 1) defines a tempered distribution by the relation

Tf (ϕ) =

R

f (x)ϕ(x)dx;

ϕ ∈ S(R).

In order to see this, it is enough to show that the measure dμ = |f |dx is tempered (cf. Example 4.4). It is verified by applying Hölder’s inequality to

(1 + |x|2 )−k |f (x)|dx. R

We also have to recall the fact that S(R) is dense in L2 (R, C) (cf. 5◦ on p. 68). Theorem 4.3 (Plancherel) Let f be an element of L2 (R, C). We denote by Tf the distribution defined by f . 2 * (i) Tf is a tempered distribution and T f is defined by some fˆ ∈ L (R, C); i.e.

* T f = Tfˆ

for some

fˆ ∈ L2 (R, C).

Moreover, fˆ 2 = f 2 . ( · 2 is the L2 -norm.) 14 cf.

Yosida [16] pp. 151–155.

88

4 Fourier Transforms (II)

(ii) fˆ obtained in (i) is expressed as 1 fˆ(x) = l.i.m. √ h↑∞ 2π

|y|h

e−ixy f (y)dy.

(iii) The mapping f → fˆ is a bijection of L2 (R, C) onto L2 (R, C). (iv) For any f, g ∈ L2 (R, C), f, g = fˆ, g ˆ (inner product in L2 ). The mapping F2 : L2 (R, C) → L2 (R, C) defined by F2 : f → fˆ is called the Fourier transform on L2 (R, C). F2 (f ) = fˆ is also called the Fourier transform of f . The concept of the Fourier transform on L2 given in Sect. 4.2 (p. 72) is, of course, equivalent to that just defined above. Proof of Theorem 4.3 (i) We saw that f ∈ L2 (R, C) defines a tempered distribution in Lemma 4.2. Consequently, the Fourier transform of Tf can be defined. It follows, then, that

* ˆ = f (x)ϕ(x)dx ˆ |T f (ϕ)| = |Tf (ϕ)| R

(4.30) f 2 · ϕ

ˆ 2 = f 2 · ϕ 2

for all

ϕ ∈ S(R)

by Schwarz’s inequality. ( ϕ

ˆ 2 = ϕ 2 comes from (4.13).) 2 * This shows that T f is an L -continuous linear functional on S(R). Since S(R) is 2 * dense in L (R, C), Tf can be uniquely extended to a continuous linear functional on 2 * * L2 (R, C). The norm of T f viewed as an operator on L (R, C) satisfies T f f 2 by (4.30). Thus, thanks to Riesz’s representation theorem (Theorem 1.1, p. 3), there exists uniquely some fˆ ∈ L2 (R, C) such that

* T (ϕ) = fˆ(x)ϕ(x)dx = Tfˆ (ϕ), f R

i.e.

R

(4.31)

f (x)ϕ(x)dx ˆ =

R

fˆ(x)ϕ(x)dx

for all

ϕ ∈ S(R).

* This fˆ satisfies fˆ 2 = T f f 2 as shown above. Hence

fˆ 2 f 2 .

(4.32)

fˆˆ 2 fˆ 2 f 2 .

(4.33)

Similarly, we obtain

4.5 Fourier Transforms on L2 (R, C) Revisited

89

On the other hand, we have

ˆ ˆ ˆ f (−x)ϕ(x)dx f (x)ϕ(x)dx ˆ = f (x)ϕ(x)dx = R

R

R

for all

ϕ ∈ S(R)

by Remark 4.3 (p. 84) and (4.31). Hence fˆˆ(x) = f (−x)

a.e.

(4.34)

Thus it follows that

fˆˆ 2 = f 2 .

(4.35)

Combining (4.33) and (4.35), we obtain f 2 = fˆ 2 . (ii) Define a function fh (h > 0) by + f (x) for |x| h, fh (x) = 0 for |x| > h. Then we have (say, by the dominated convergence theorem) lim fh − f 2 = 0.

h→∞

Hence, by (i), lim f,h − fˆ 2 = 0,

h→∞

i.e. fˆ = l.i.m.f,h .

(4.36)

h→∞

It follows from (4.31) that

R

f,h (x)ϕ(x)dx =

R

fh (x)ϕ(x)dx ˆ

1 f (x) √ e−ixy ϕ(y)dy dx 2π R |x|h

1 = e−ixy f (x)dx ϕ(y)dy √ 2π R |x|h

=

for all

ϕ ∈ S(R)

(apply Fubini’s theorem in view of the integrability of f (x) on |x| h),

which implies 1 f,h (x) = √ 2π (ii) is verified by (4.36) and (4.37).

|x|h

e−ixy f (y)dy.

(4.37)

90

4 Fourier Transforms (II)

Fig. 4.1 Fourier transform as automorphism

(iii) Since the Fourier transform F2 : L2 (R, C) → L2 (R, C) is an isometric linear operator, it is obviously injective. (See Fig. 4.1.) 2 Now, the inverse Fourier transform F−1 Tf = T f of the distribution Tf defined 2 by f ∈ L (R, C) is a distribution defined by 1 f˜(x) = l.i.m. √ h→∞ 2π

|y|h

eixy f (y)dy,

(4.38)

which is an element of L2 (R, C). (Follow a similar argument in (i) and (ii))15 The mapping F2−1 : f → f˜ is called the inverse Fourier transform on 2 L (R, C). (f˜ is also called the inverse Fourier transform of f as in the preceding discussions.) We can show f 2 = f˜ 2 as in (i). (See Fig. 4.2.) If we define S2 (R, C) by S2 (R, C) = {Tf ∈ S (R, C)|f ∈ L2 (R, C)}, F is an automorphism on S2 . Correspondingly, F2 is an automorphism on L2 . (iv) This is easy.

2 is a 2 ∈ L2 (R, C) determines a tempered distribution Tf , it has its inverse T f ∈ S(R) . T f continuous linear functional on S(R) (with respect to L2 -norm). This can be checked by a similar 2 computation to that in (4.30). Hence T f can be uniquely extended to a continuous linear functional 2 on L2 (R, C). By Riesz’s theorem, there exists some f˜ ∈ L2 (R, C) which represents T f . f˜ is given in a concrete form

1 f˜(x) = l.i.m. √ eixy f (y)dy h→∞ 2π |y|h

15 Since f

2 * (cf. (4.36) and (4.37)). T f = Tf˜ is the inverse operator of T f = Tfˆ on S(R). So it is clear that these are mutually inverse as operators extended to L2 (R, C).

4.5 Fourier Transforms on L2 (R, C) Revisited

91

Fig. 4.2 Inverse Fourier transform as automorphism

Theorem 4.6 (Parseval) The following relations hold true for any f and g ∈ L2 (R, C):

f (x)g(x)dx = fˆ(ξ )g(−ξ ˆ )dξ. (i) R

R (ii) f (y)g(x − y)dy. fˆ(ξ )g(ξ ˆ )eiξ x dξ = R

R

Proof (i) It can easily be verified that gˆ = g(−ξ ˆ ) by a simple calculation: ˆ ) = l.i.m. √1 g(ξ h→∞ 2π 1 = l.i.m. √ h→∞ 2π 1 = l.i.m. √ h→∞ 2π

|x|h

−iξ x g(x)e ¯ dx

|x|h

g(x)eiξ x dx

|x|h

g(x)e−i(−ξ x) dx

= g(−ξ ˆ ). It follows from Theorem 4.3 (iv) and (4.39) that

ˆ )dξ fˆ(ξ )g(−ξ ˆ )dξ = fˆ(ξ )g(ξ R

R

ˆ = f, g ¯ = fˆ, g

=

R

f (x)g(x)dx.

(Theorem 4.3 (iv))

(4.39)

92

4 Fourier Transforms (II)

(ii) We try to find the Fourier transform of g(x − y) with respect to y. First we assume that g is an element of S. 1 √ 2π

1 g(x − y)e−iξy dy = √ 2π R

R

g(t)e−iξ(x−t) dt

(changing variables : t = x − y)

1 = e−iξ x √ g(t)e−iξ(−t) dt 2π R

−iξ x 1 =e g(t)e−i(−ξ )t dt √ 2π R = e−iξ x g(−ξ ˆ ). In the general case of g ∈ L2 (R, C), the same relation holds good, since S is dense in L2 and the Fourier transform is continuous. By (i), we obtain

R

f (y)g(x − y)dy =

R

fˆ(ξ )g(ξ ˆ )eiξ x dξ.

4.6 Periodic Distributions A locally integrable function f ∈ L1loc. (R, C) defines a distribution Tf by the relation16,17

∞ Tf (ϕ) = f (x)ϕ(x)dx, ϕ ∈ D(R). −∞

Translating f (x) by τ ∈ R, the function f (x − τ ) defines a distribution Tf (x−τ ) by

Tf (x−τ ) (ϕ) = =

16 This

∞

−∞

∞ −∞

f (x − τ )ϕ(x)dx f (x)ϕ(x + τ )dx,

ϕ ∈ D(R).

section is based upon Maruyama [10]. by L1loc. (R, C) the space of locally integrable complex-valued functions defined on R. D(R) is the space of test functions. See Sect. 4.4 and Appendix C. 17 We denote

4.6 Periodic Distributions

93

Hence, if f is a periodic function with period τ (or simply, τ -periodic), we must have

∞

∞ f (x)ϕ(x)dx = f (x)ϕ(x + τ )dx −∞

−∞

for all ϕ ∈ D(R). Generalizing this reasoning, the concept of periodic distribution is defined as follows.18 Definition 4.6 A distribution T ∈ D(R) is called a periodic distribution with period τ if T (ϕ(x)) = T (ϕ(x + τ )) for all ϕ ∈ D(R). We denote by Dτ (R) the set of τ -periodic distributions. For the sake of simplicity, we assume τ = 2π from now on. T denotes the torus (cf. Appendix A). To start with, let us confirm that D2π (R) can be identified with C∞ (T, C) . Assume that S ∈ C∞ (T, C) is given. We associate with each ϕ ∈ D(R) a new function ϕ(x) ˜ =

∞

ϕ(x + 2nπ ).

(4.40)

n=−∞

The value ϕ(x) ˜ is defined without any ambiguity because the support of ϕ is compact and so the right-hand side of (4.40) is actually a finite sum for each x. Since ϕ˜ is 2π -periodic and infinitely differentiable, it can be regarded as an element of C∞ (T, C). If we define an operator T on D(R) by T (ϕ) = S(ϕ), ˜

ϕ ∈ D(R),

(4.41)

T is a continuous linear functional and so T ∈ D(R) . Since it is obvious that T (ϕ(x)) = T (ϕ(x + 2π )), T is a 2π - periodic distribution; i.e. T ∈ D2π (R) . Conversely, let T be an element of D2π (R) . It can be shown that T ∈ C∞ (T, C) . We prepare a lemma due to Yosida–Kato [15], §7. Lemma 4.2 For any a > 0, there exists some θ ∈ D(R) which satisfies19 (i) supp θ = [−a, a], and ∞ (ii) θ (x + na) = 1. n=−∞

18 D(R) 19 The

denotes the dual space of D(R). Each element of D(R) is called a distribution. notation supp θ means the support of the function θ.

94

4 Fourier Transforms (II)

Proof Define a function θ : R → C by

θ (x) =

⎧

⎪ ⎪ ⎨ ⎪ ⎪ ⎩0

a

|x|

exp −

3 a 1 1 dw dw exp − w(a − w) w(a − w) 0

for |x| a, for |x| > a.

Then it is clear that θ ∈ D(R) and (i) is satisfied. Computing θ (x − a) for |x| a, we obtain that 3 a

a 1 1 θ (x − a) = dw dw. exp − exp − w(a − w) w(a − w) |x−a| 0 Changing the variable by z = a − w, we can rewrite it as 3 a 1 1 dz dw θ (x − a) = − exp − z(a − z) w(a − w) x 0 3 a

x 1 1 dz dw = exp − exp − z(a − z) w(a − w) 0 0

0

exp −

if x ∈ [0, a]. Consequently, it follows that θ (x) + θ (x − a) = 1 for

x ∈ [0, a].

We also obtain the same relation for x ∈ [−a, 0) by a similar argument. There exists, for each x ∈ R, a unique integer k ∈ Z such that ka |x| < (k + 1)a. Therefore we must have ∞

θ (x + na) = θ (x − ka) + θ (x − (k + 1)a) = 1.

n=−∞

This proves (ii).

Let us go back to prove that T ∈ D2π (R) can be regarded as an element of

C∞ (T, C) .

Any ψ ∈ C∞ (T, C) can be regarded as a 2π - periodic smooth function defined on R. If θ ∈ D(R) is a function which satisfies (i) and (ii) of Lemma 4.2 for a = 2π , then ψθ ∈ D(R). Define an operator U on C∞ (T, C) by U (ψ) = T (ψθ ).

(4.42)

4.6 Periodic Distributions

95

It is well-defined in the sense that U does not depend upon the choice of θ . In fact, if η ∈ D(R) satisfies ∞

η(x + 2nπ ) = 0,

n=−∞

we have20 ∞ T (ψη) = T ψ(x)η(x) θ (x + 2nπ ) n=−∞

=

∞

T (ψ(x − 2nπ )η(x − 2nπ )θ (x))

(by the periodicity of T )

n=−∞

=

∞

T (η(x − 2nπ ) · ψ(x)θ (x))

n=−∞

=T

(by the periodicity of ψ)

5 η(x − 2nπ ) ψ(x)θ (x)

4 ∞ n=−∞

= 0. This confirms that U is well-defined. U is continuous on C∞ (T, C). In fact, for any net {ψα } in C∞ (T, C) which converges to some ψ ∈ C∞ (T, C), we have21 ψα θ → ψθ

in D(R).

Consequently, it follows that U (ψα ) = T (ψα θ ) → T (ψθ ) = U (ψ). Thus we establish the chain illustrated by Fig. 4.3.

Fig. 4.3 C∞ (T, C) and D2π (R)

20

p k=−p

21 We

θ(x + 2kπ ) →

∞

θ(x + 2nπ ) (in C∞ ) on supp ψη.

n=−∞

should note that supp ψα θ ⊂ supp θ.

96

4 Fourier Transforms (II)

It is natural to ask if S is identical with U . The answer is positive as confirmed by the following calculation: U (ψ) = T (ψθ ) 4 ∞ 5 =S ψ(x + 2nπ )θ (x + 2nπ ) n=−∞

4 =S

∞

5 ψ(x)θ (x + 2nπ )

n=−∞

4

= S ψ(x)

(by the periodicity of ψ) 5

∞

θ (x + 2nπ )

n=−∞

= S(ψ)

for any

ψ ∈ C∞ (T, C).

The reasoning discussed above justifies the following result. Theorem 4.7 (periodic distribution) There is a one-to-one correspondence between C∞ (T, C) and D2π (R) . The operators defined by (4.41) and (4.42) are inverse to each other. Let T be a 2π -periodic distribution; i.e. T ∈ D2π (R) . The Fourier coefficients of T are defined by 1 cn = √ T (e−inx ), 2π

n ∈ Z.

The formal series ∞ 1 cn einx √ 2π n=−∞

is called the Fourier series of T . Remark 4.4 Assume that a trigonometric series ∞ 1 Cn einx √ 2π n=−∞

(4.43)

4.6 Periodic Distributions

97

simply converges to a distribution T ; i.e. 1 √ 2π

π

p

−π k=−p

Ck eikx · ϕ(x)dx → T (ϕ)

for any

ϕ ∈ D2π (R).

(4.44)

If we = e−inx ,√the left-hand side of (4.44) converges √ consider a special case of ϕ −inx ) = 2π cn . Hence we must have Cn = cn , to 2π Cn . On the other hand, T (e and so the series (4.43) is nothing other than the Fourier series of T . Theorem 4.8 (Fourier coefficients of periodic distribution) Consider a sequence {cn }n∈Z of complex numbers. The cn ’s are the Fourier coefficients of some 2π periodic distribution if and only if there exists some N ∈ N ∪ {0} such that cn = O(|n|N ) ; i.e. Proof

22

(4.45)

|cn | K|n|N

for some

K > 0.

Assume first that {cn } satisfies (4.45). We write formally u(x) =

n0

1 cn einx , (in)N +2

x ∈ R.

(4.46)

Since

1

n0

|n|N +2

|cn ||einx |

K n2 n0

by the assumption (4.45), the right-hand side of (4.46) is absolutely, uniformly convergent. Hence u(x) is a continuous function which satisfies |u(x)| K

1 . n2 n0

Consequently, u(x) defines a distribution, and un (x) =

n k=−n k0

22 See

1 ck eikx (ik)N +2

Folland [3], pp. 320–322 and Lax [7], p. 570 for an outline of ideas.

(4.47)

98

4 Fourier Transforms (II)

simply converges to u(x) exactly speaking, to the distribution defined by u(x) . In fact, it can be verified by

∞ −∞

(un (x) − u(x))ϕ(x)dx as

∞ −∞

|un (x) − u(x)| · |ϕ(x)|dx → 0

n → ∞ for any

ϕ ∈ D(R).

It follows that, in view of 1◦ on p. 86, D N +2 u(x)(ϕ) = lim

n→∞

n k=−n k0

n 1 N +2 ikx c (ik) e (ϕ) = lim ck eikx (ϕ). k n→∞ (ik)N +2 k=−n k0

Hence we must have n 1 1 ck eikx → √ (c0 + D N +2 u(x)) √ 2π k=−n 2π

simply as

n → ∞.

By Remark 4.4, √ cn ’s are the Fourier coefficient of the 2π - periodic distribution defined by (1/ 2π )(c0 + D N +2 u(x)). Let us go over to the converse. Assume that n 1 ck eikx √ 2π k=−n

simply converges to some distribution. If cn ’s do not satisfy (4.45), there exists a sequence {nr } of integers such that |cnr | > |nr |r ;

r = 1, 2, · · · .

(4.48)

Define a pair of functions λ(x) and ϕ(x) by + λ(x) =

e−x

2 /(1−x 2 )

if |x| < 1, if |x| 1

0

(4.49)

and ϕ(x) =

∞ r=1

cn−1 λ(x − nr ). r

(4.50)

References

99

The right-hand side of (4.50) is a finite sum for each x and ϕ(x) ∈ D(R). By the definition of ϕ, we have23 ϕ(n) =

+ 0 cn−1 r

if

n nr ,

if

n = nr

(r = 1, 2, · · · ).

It follows that n

ck δk (ϕ) =

k=−n

=

∞ −∞

n

n

ck δk ϕ(t)dt

k=−n

ck ϕ(k)

(4.51)

k=−n

= the number of nr s between − n and n. The right-hand side of (4.51) diverges to ∞ as n → ∞. Consequently, by 4◦ on n p. 87, ck eikt can not be simply convergent. Contradiction. k=−n

References 1. Bourbaki, N.: Eléments de mathématique; Espaces vectoriels topologiques. Hermann, Paris, Chaps. 1–2 (seconde édition) (1965), Chaps. 3–5 (1964) (English edn.) Elements of Mathematics; Topological Vector Spaces. Springers, Berlin/Heidelberg/New York (1987) 2. Dunford, N., Schwartz, J.T.: Linear Operators, Part 1–3. Interscience, New York (1958–1971) 3. Folland, G.B.: Fourier Analysis and Its Applications. American Mathematical Society, Providence (1992) 4. Grothendieck, A.: Topological Vector Spaces. Gordon and Breach, New York (1973) 5. Kawata, T.: Ohyo Sugaku Gairon (Elements of Applied Mathematics). I, II, Iwanami Shoten, Tokyo (1950, 1952) (Originally published in Japanese) 6. Kawata, T.: Fourier Kaiseki (Fourier Analysis). Sangyo Tosho, Tokyo (1975) (Originally published in Japanese) 7. Lax, P.D.: Functional Analysis. Wiley, New York (2002) 8. Maruyama, T.: Kansu Kaisekigaku (Functional Analysis). Keio Tsushin, Tokyo (1980) (Originally published in Japanese) 9. Maruyama, T.: Sekibun to Kansu-kaiseki (Integration and Functional Analysis). Springer, Tokyo (2006) (Originally published in Japanese) 10. Maruyama, T.: Herglotz-Bochner representation theorem via theory of distributions. J. Operations Res. Soc. Japan 60, 122–135 (2017) 11. Schwartz, L.: Functional Analysis. Courant Institute of Mathematical Sciences, New York University, New York (1964)

23 n

nr ⇒ λ(n − nr ) = 0, n = nr ⇒ λ(nr − nr ) = 1.

100

4 Fourier Transforms (II)

12. Schwartz, L.: Théorìe des distributions. Nouvelle édition, entièrement corrigée, refondue et augmentée, Hermann, Paris (1973) 13. Takagi, T.: Kaiseki Gairon (Treatise on Analysis), 3rd edn. Iwanami Shoten, Tokyo (1961) (Originally published in Japanese) 14. Treves, J.F.: Topological Vector Spaces, Distributions and Kernels. Academic Press, New York (1967) 15. Yosida, K., Kato T.: Ohyo Sugaku (Applied Mathematics). I, Shokabou, Tokyo (1961) (Originally published in Japanese) 16. Yosida, K.: Functional Analysis, 3rd edn. Springer, New York (1971)

Chapter 5

Summability Kernels and Spectral Synthesis

We have discussed the (C, 1)-summability of Fourier series in Chap. 2. We recapitulate this topic primarily in the framework of Fourier series in complex form. The details of the Fejér summability on [−π, π ] and R are examined. The fundamental results on spectral synthesis based upon the summability technique are the main concerns of this chapter. Similar procedures will be effectively utilized in the proof of the Herglotz–Bochner theorem in Chap. 6.

5.1 Shift Operators Fixing x0 ∈ R, we define the shift operator τx0 : Lp (R, C) → Lp (R, C)1 by (τx0 f )(x) = f (x − x0 ),

f ∈ Lp ,

x ∈ R.

(5.1)

Since τx0 f p = f p , the operator τx0 is an isometric automorphism of Lp onto itself. Furthermore, x → τx is a group-homomorphism of R into Aut (Lp )2 because τx ◦ τy = τx+y ;

x, y ∈ R.

(5.2)

Theorem 5.1 (continuity theorem) (i) The operator F : R → C0 (R, C) defined by F : x → τx f

(5.3)

for any fixed f ∈ C0 (R, C) is uniformly continuous. 1 More

generally, we can consider a locally compact Hausdorff commutative topological group instead of R, and Haar measure instead of Lebesgue measure. 2 Aut(Lp ) is the group of automorphisms on Lp . © Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_5

101

102

5 Summability Kernels and Spectral Synthesis

(ii) If f ∈ Lp (R, C) (1 p < ∞), then the operator F : R → Lp (R, C) defined by (5.3) is uniformly continuous. Proof (i) Since f ∈ C0 is uniformly continuous, there exists some δ > 0, for each ε > 0, such that |f (x) − f (y)| < ε

if |x − y| < δ.

If |x0 − y0 | < δ, |(x − y0 ) − (x − x0 )| < δ

for any x.

Hence |τx0 f (x) − τy0 f (x)| = |f (x − x0 ) − f (x − y0 )| < ε for any x ∈ R. That is,

τx0 f − τy0 f ∞ ε

if |x0 − y0 | < δ.

(ii) Assume f ∈ Lp (1 p < ∞). Then for any ε > 0, there exists some ϕ ∈ C0 such that f − ϕ p < ε/3. It is trivial that τx f − τy f = τx (f − ϕ) + τx ϕ − τy (f − ϕ) − τy ϕ.

(5.4)

Taking account of

τx (f − ϕ) p = τy (f − ϕ) p = f − ϕ p <

ε , 3

we obtain

τx f − τy f p <

2 ε + τx ϕ − τy ϕ p 3

by (5.4). If we denote supp ϕ by K, it is easy to check that supp(τx ϕ − τy ϕ) ⊂ (x + K) ∪ (y + K), which implies m(supp(τx ϕ − τy ϕ)) < 2mK (m is the Lebesgue measure). Consequently, it follows that

τx ϕ − τy ϕ p τx ϕ − τy ϕ ∞ · (2mK)1/p .

(5.5)

5.1 Shift Operators

103

If δ > 0 is small enough,

τx ϕ − τy ϕ ∞ <

ε , 3(2mK)1/p

if

|x − y| < δ

(5.6)

by (i). Combining (5.4), (5.5) and (5.6), we have

τx ϕ − τy ϕ p < ε

if |x − y| < δ.

Theorem 5.2 (shift operator and convolution) If f ∈ is continuous and integrable, then3

∞

−∞

L1 (R, C)

and k : R → C

k(x)τx f dx = k ∗ f.

Proof Assume first that f is continuous and suppf is compact. In this case, the integration on the left-hand side is actually evaluated on some finite interval. Hence

∞

−∞

k(x)τx f dx = lim

(xj +1 − xj )k(xj )τxj f,

(5.7)

j

where the limit is taken with respect to L1 -norm as the decomposition of the interval of integration becomes finer and finer. On the other hand, we have (k ∗ f )(x) = lim

(xj +1 − xj )k(xj )f (x − xj )

(uniform convergence).

j

(5.8) Comparing (5.7) and (5.8), the proof is finished in this special case. We shall now turn to the general case: f ∈ L1 . There exists, for any ε > 0, some continuous function g with compact support which satisfies f − g 1 < ε. Since

∞ −∞

k(x)τx gdx = k ∗ g

3 This

result may seem to be trivial. However, we have to note carefully what it means. The righthand side is a function defined by

∞ (k ∗ f )(y) = k(x)f (y − x)dx. −∞

The left-hand side is the Cauchy–Bochner integral of the L1 -valued function x → k(x)τx f . The theorem asserts that these two are equal. cf. Amann and Escher [1], II pp. 17–23, Maruyama [6] pp. 329–333 for the Cauchy–Bochner integral.

104

5 Summability Kernels and Spectral Synthesis

as observed above, it follows that

∞

−∞

k(x)τx f dx − k ∗ f =

∞ −∞

k(x)τx (f − g)dx + k ∗ (g − f ).

Consequently, we obtain

∞ −∞

k(x)τx f dx − k ∗ f 2 k 1 · ε. 1

p L2π (R, C)

We denote by the set of all the measurable complex-valued functions on R which are 2π -periodic and p-th integrable on [−π, π ]. Fixing x0 ∈ R, we p p define an operator τx0 : L2π (R, C) → L2π (R, C) (p 1) by p

(τx0 f )(x) = f (x − x0 ), f ∈ L2π , x ∈ R. p

Clearly, τx0 f ∈ L2π . Regarding it as an Lp -function on [−π, π ], we note

f Lp ([−π,π ],C) = τx0 f Lp ([−π,π ],C) . The following two theorems can be shown in a similar manner to Theorems 5.1 and 5.2. Theorem 5.1 (continuity theorem) (i) The operator F : R → C2π (R, C) defined by F : x → τx f

(5.9)

for any fixed f ∈ C2π (R, C) is uniformly continuous. C2π (R, C) denotes the set of all the complex-valued 2π -periodic continuous functions on R. C2π (R, C) is endowed with the uniform convergence topology. p p (ii) If f ∈ L2π (R, C) (1 p < ∞), then the operator F : R → L2π defined by (5.9) is uniformly continuous. Theorem 5.2 If f ∈ L12π (R, C) and k ∈ C2π (R, C), then4

π

−π

4 See

k(x)τx f dx = k ∗ f.

footnote 3 for the interpretation of the result.

5.2 Summability Kernels on [−π, π ]

105

5.2 Summability Kernels on [−π, π] We now go over to the Fejér method, which is quite effective for examining the summability of Fourier series in complex form.5 We first give a general definition. Definition 5.1 A sequence {kn : [−π, π ] → R} of continuous functions is called a summability kernel if it satisfies the following conditions:

π kn (x)dx = 1 for all n = 1, 2, · · · (i) −π

(ii) kn 1 constant (iii) lim

n→∞ δ|x|π

for all n = 1, 2, · · ·

|kn (x)|dx = 0

δ ∈ (0, π ).

for any

Lemma 5.1 Let X be a Banach space and a function ϕ : [−π, π ] → X be continuous. Then

π lim kn (x)ϕ(x)dx = ϕ(0) n→∞ −π

for any summability kernel {kn } on [−π, π ]. Proof In view of the condition (i) in the definition of summability kernels, it holds good, for any δ ∈ (0, π ), that

π −π

kn (x)ϕ(x)dx − ϕ(0) =

−π

=

π

δ

−δ

kn (x)(ϕ(x) − ϕ(0))dx (5.10)

+

= I1 + I2 . δ|x|π

I1 is evaluated as

I1 Max ϕ(x) − ϕ(0) · kn 1 . |x|δ

(5.11)

Choosing δ > 0 sufficiently small for any ε > 0, the right-hand side of (5.11) is less than ε by the continuity of ϕ and the condition (ii) in the definition. As for I2 , we have

|kn (x)|dx. (5.12)

I2 Max ϕ(x) − ϕ(0)

δ|x|π

5 Katznelson

[3] Chap. I, §2 is quite helpful.

δ|x|π

106

5 Summability Kernels and Spectral Synthesis

By the condition (iii), the right-hand side of (5.12) is less than ε for large n. Equations (5.10), (5.11) and (5.12) imply

π

−π

kn (x)ϕ(x)dx → ϕ(0)

as

n → ∞.

In the special case of X = R, the assertion of Lemma 5.1 can be expressed as “the sequence {kn dx} of measures on [−π, π ] w ∗ -converges to the Dirac measure δ0 ”. The operator F : R → L12π defined by F : x → τx f for a fixed f ∈ L12π (R, C) is, by Theorem 5.1 , uniformly continuous. Of course, F (0) = f . Hence the following theorem follows from Lemma 5.1. Theorem 5.3 (convergence of kn ∗ fτ ) For any f ∈ L12π (R, C) and a summability kernel {kn } on [−π, π ],

π −π

kn (x)τx f dx → f

as

n→∞

in · 1 . The Fejér kernel ⎛

⎞2 nx sin 1 ⎜ 2 ⎟ ⎜ ⎟ Kn (x) = 2nπ ⎝ x ⎠ sin 2

(5.13)

is a special example of summability kernel (cf. Chap. 2, Sect. 2.5, p. 42). Kn can be represented in the form 1 Kn (x) = 2π

n−1 j =−(n−1)

|j | ij x 1− e . n

We prove the relation (5.14) by a direct calculation. First of all, taking account of the elementary fact sin2 we obtain

1 1 1 x 1 = (1 − cos x) = − e−ix + − eix , 2 2 4 2 4

(5.14)

5.2 Summability Kernels on [−π, π ]

x sin 2 2

n−1

107

|j | ij x 1 ix 1 −ix 1 1− − e e = − e + n 4 2 4

j =−(n−1)

n−1

|j | ij x 1− e n

j =−(n−1)

1 inx 1 −inx 1 (∗) 1 − e − e = + n 4 2 4

(5.15)

1 nx sin2 . n 2

=

We had better check the second equality (∗), since it may be a little bit troublesome: 1 −ix 1 1 ix − e + − e 4 2 4

n−1

j =−(n−1)

|j | ij x e 1− n

n−1 n−1 j j 1 1 ij x 1− 1− e + e−ij x = 2 n 2 n j =0

−

j =1

n−1 n−1 1 1 j j 1− 1− ei(j −1)x − e−i(j +1)x 4 n 4 n j =0

−

j =1

n−1 n−1 j j 1 1 1− 1− ei(j +1)x − e−i(j −1)x . 4 n 4 n j =0

j =1

(5.16) Changing the indices j − 1 and j + 1 to j , (5.16) is rewritten in the form n−1 n−1 j j 1 1 ij x e e−ij x 1− 1− + (5.16) = 2 n 2 n j =0 j =1 n n−2 j − 1 −ij x 1 j + 1 ij x 1 1− e − 1− e − 4 n 4 n j =2 j =−1 n n−2 j − 1 ij x j + 1 −ij x 1 1 1− 1− e e − − . 4 n 4 n j =1

(5.17)

j =0

A

B

We continue calculations, dividing (5.16) into two blocks A and B. Note that the three formulas constituting the block A share common terms eij x , j = 1, 2, · · · , n − 2. The coefficients of these terms in A are seen to be cancelled: coefficient of e

ij x

j 1 j +1 1 j −1 1 1− − 1− − 1− = 0. = 2 n 4 n 4 n

108

5 Summability Kernels and Spectral Synthesis

Fig. 5.1 Calculation of (5.17) A

Fig. 5.2 Calculation of (5.17) B

Hence, the remaining part of A is given by 0 i0x 1 n − 1 i(n−1)x 1 1− e + 1− e A = 2 n 2 n −1 + 1 i(−1)x 1 0 + 1 i0x 1 1− e 1− e − − 4 n 4 n n − 1 − 1 i(n−1)x 1 n − 1 inx 1 1− e 1− e . − − 4 n 4 n

(5.18)

Similar arguments apply also to the block B. The constituent three formulas of B share common terms e−ij x , j = 2, 3, · · · , n − 2. The coefficient of these terms in B must be zero: coefficient of e

−ij x

j 1 j −1 1 j +1 1 1− − 1− − 1− = 0. = 2 n 4 n 4 n

Gathering the remaining parts ofB, we have B =

1 1 −i1x 1 n − 1 −i(n−1)x 1− e 1− e + 2 n 2 n (n − 1) − 1 −i(n−1)x 1 n − 1 −inx 1 1− e 1− e − − 4 n 4 n 0 + 1 −i0x 1 1 + 1 −i1x 1 1− e 1− e − . − 4 n 4 n

(5.19)

The formula (5.16) is the sum of (5.18) and (5.19). Thus, removing the cancelled parts, we obtain

5.3 Spectral Synthesis on [−π, π ]

109

1 1 −inx 1 1 inx (5.16) = A + B = + − e . − e n 4 2 4 This proves the equality (∗) in (5.15). (See Fig. 5.1 and Fig. 5.2.) The representation (5.14) immediately follows from (5.15) and (5.13).

5.3 Spectral Synthesis on [−π, π] We have discussed the concept of convolution on R and its applications in Chap. 3, Sect. 3.2. To start with, a similar concept of convolution on [−π, π ] is introduced. For any f, g ∈ L12π (R, C), we define the convolution h(x) on [−π, π ] by

h(x) =

π

−π

f (x − y)g(y)dy.

(5.20)

h(x) is denoted by (f ∗ g)(x), the same notation as the usual convolution on R. (i) h(x) is defined for a.e. x ∈ [−π, π ]. (ii) h(x) is integrable on [−π, π ]. (iii) The convolution operation ∗ on [−π, π ] is commutative, associative and distributive (with respect to addition). (iv) The Fourier coefficients of f, g and f ∗ g satisfy the relation √ 2π fˆ(n)g(n), ˆ

f ∗ g(n) = where fˆ(n) is given by 1 fˆ(n) = √ 2π

π −π

f (x)e−inx dx.

g(n) ˆ and f ∗ g(n) are defined similarly. Theorem 5.4 (special convolution; einx ∗ f ) Assume that f ∈ L12π (R, C). (i) If ϕ(x) = einx , (ϕ ∗ f )(x) = (ii) If ϕ(x) =

n

√

2π fˆ(n)einx .

aj eij x ,

j =−n

(ϕ ∗ f )(x) =

√

2π

n j =−n

aj fˆ(j )eij x .

110

5 Summability Kernels and Spectral Synthesis

Proof (i) If ϕ(x) = einx , we have

(ϕ ∗ f )(x) =

π −π

ein(x−ξ ) f (ξ )dξ = einx

π −π

f (ξ )e−inξ dξ =

√

2π fˆ(n)einx .

(ii) is an immediate consequence of (i). We denote by σn (x) the convolution of f ∈ σn (x) = (Kn ∗ f )(x),

L12π (R, C)

and Kn on [−π, π ]; i.e.

x ∈ [−π, π ].

(5.21)

Then σn (x) can be expressed as 1 σn (x) = √ 2π

n−1 j =−(n−1)

|j | ˆ 1− f (j )eij x , n

(5.22)

by (5.14) in the preceding section and Theorem 5.4. The partial sum Sn (x) of the Fourier series of f in complex form is given by n−1

1 Sn (x) = √ 2π

fˆ(j )eij x .

(5.23)

j =−(n−1)

The number of occurrences of fˆ(j )eij x (j = 0, ±1, · · · , ±(n − 1)) in the average of partial sums 1 {S0 (x) + S1 (x) + · · · + Sn−1 (x)} n

(5.24)

is n − |j |. Consequently, we obtain (5.24) =

n·

1 √

1 = √ 2π

n−1

2π

(n − |j |)fˆ(j )eij x

j =−(n−1) n−1

j =−(n−1)

|j | ˆ 1− f (j )eij x = σn (x) n (by (5.22)).

Theorem 5.5 (σn = (C, 1)-sum) For f ∈ L12π (R, C), σn (x) =

1 {S0 (x) + S1 (x) + · · · + Sn−1 (x)}. n

(5.25)

5.4 Summability Kernels on R

111

That is, σn (x) is nothing other than the (C, 1)-sum of the Fourier series in complex form. ∞ Therefore if we impose an additional assumption that fˆ(n)einx converges, n=−∞

it follows that6 ∞ 1 σn (x) → √ fˆ(n)einx 2π n=−∞

as

n → ∞.

On the other hand, σn (x) converges to f in L1 by (5.21) and Theorem 5.3. So some subsequence {σ√ to f a.e. However, the original sequence n } converges ) {σn } converges to (1/ 2π ) fˆ(n)einx . Consequently, it turns out that {σn } itself converges to f a.e.: σn (x) → f (x) a.e. Thus an important result immediately follows. Theorem 5.6 (spectral synthesis) Assume that f ∈ L12π (R, C) and ∞

|fˆ(n)| < ∞.

n=−∞

Then σn (x) → f (x)

a.e. as n → ∞;

i.e. ∞ 1 fˆ(n)einx . f (x) = √ 2π n=−∞

This theorem illuminates a procedure to recover the original function f when its Fourier coefficients are given. This operation is called the spectral synthesis.

5.4 Summability Kernels on R We now try to extend the Fejér method of summability and the spectral synthesis on [−π, π ] to similar problems on R.7

6 If

a series

∞

an converges, it is (C, 1)-summable to the same limit (cf. Remark 2.4, p. 40).

n=0 7 cf.

Katznelson [3] Chap. VI, §1.

112

5 Summability Kernels and Spectral Synthesis

Definition 5.2 A family of continuous functions {kλ : R → R} (λ ∈ (0, ∞), or λ ∈ N) is called a summability kernel on R if it satisfies:

∞ kλ (x)dx = 1 for all λ, (i) −∞

(ii) kλ 1 = O(1) as λ → ∞,

(iii) lim |kλ (x)|dx = 0 for any

δ > 0.

λ→∞ |x|>δ

If a function f ∈ L1 (R, R) satisfies

∞

−∞

f (x)dx = 1,

a summability kernel can be made based upon f . That is, if we define kλ (x) = λf (λx),

(5.26)

then {kλ } is a summability kernel. In fact, the condition (i) is verified by changing the variables: y = λx. (ii) is satisfied, since

kλ 1 =

∞ −∞

|kλ (x)|dx =

∞ −∞

|f (y)|dy = f 1

for every λ > 0. It is also easy to check (iii), since

|x|>δ

|kλ (x)|dx =

|y|>λδ

|f (y)|dy → 0 as

λ → ∞.

For instance, if we define functions A(x) and G(x) by A(x) =

1 −|x| e , 2

1 2 G(x) = √ e−x , π

the integrals of them over R are equal to 1. So it is possible to make summation kernels based upon them. The kernels based upon A and G are called the Abel summability kernel and the Gauss summability kernel, respectively.8 We concentrate here on the Fejér kernel on R, which will play a crucial role later on.

8 See Kawata [4] II, pp. 66–73.

The Poisson summation kernel is also very useful, although it is not discussed in this book. Helson [2] and Malliavin [5] contain suggestive expositions of the Poisson kernel.

5.4 Summability Kernels on R

113

Define a function K(x) by

1 K(x) = 2π

sin x2

2

x 2

(5.27)

.

The following lemma can be verified by a simple calculation. Lemma 5.2 K(x) =

1 2π

1 −1

(1 − |ξ |)eiξ x dξ.

(5.28)

Proof

1 −1

(1 − |ξ |)eiξ x dξ =

1

−1

eiξ x dξ −

1

ξ eiξ x dξ −

0

0 −1

(−ξ )eiξ x dξ

1 2 sin x ξ cos ξ xdξ −2 x 0

1 sin ξ x 1 2 2 sin x sin ξ x −2 ξ − dξ = − 2 (cos x − 1) = x x x x 0 0 x 2 sin 2 x 2 = − 2 1 − 2 sin2 − 1 = . x 2 x 2 =

This proves (5.28).

K(x) is clearly integrable. (The integral is equal to 1 as is shown later.) If we define Kλ (x) (λ > 0) by Kλ (x) = λK(λx),

(5.29)

{Kλ } satisfies the conditions (ii) and (iii) of summability kernels. It remains to show (i). By the properties of the Fejér kernel on [−π, π ] (Lemma 2.4, pp. 42–43), 1 lim n→∞ 2π

δ −δ

1 n+1

4

sin (n+1)x 2 sin x2

52 dx = 1

(5.30)

for any δ ∈ (0, π ). For λ = n + 1, we have 1 Kn+1 (x) = 2π(n + 1) 1 = 2π(n + 1)

4

sin (n+1)x 2

52

x 2

sin x2 x 2

2 4

sin (n+1)x 2 sin x2

52 .

114

5 Summability Kernels and Spectral Synthesis

Hence 1 Kn+1 (x) > 2π(n + 1)

sin δ δ

2 4

sin (n+1)x 2 sin x2

52

for sufficiently small δ > 0.9 It follows that

sin δ δ

2

1 2π

δ

1 −δ n + 1

4

<

52

sin (n+1)x 2 sin x2 δ

−δ

Kn+1 (x)dx

1 2π

π

−π

1 n+1

4

5 (n+1)x 2

sin 2 sin x2

(5.31) dx.

The integration appearing in the first term of (5.31) satisfies

1 2π

δ −δ

1 n+1

4

sin (n+1)x 2 sin x2

52 dx → 1 as

n→∞

for arbitrary δ ∈ (0, π ), by Lemma 2.4 (pp. 42–43). The third term of (5.31) is equal to 1 for all n. Consequently, we obtain the evaluation

sin δ δ

2

<

δ

−δ

Kn+1 (x)dx 1

for sufficiently large n. On the other hand, since

δ

−δ

Kn+1 (x)dx →

∞

−∞

K(x)dx

as

n → ∞,

we also get10

sin δ δ

2

∞ −∞

K(x)dx 1.

that (sin z/z) < 0 for sufficiently small z > 0. 6∞ 6δ 6 = −∞ Kn+1 (x)dx = −δ Kn+1 (x)dx + |x|>δ Kn+1 (x)dx. The last term tends to 0 as n → ∞.

9 Note

6 10 ∞ K(x)dx −∞

5.5 Spectral Synthesis on R: Inverse Fourier Transforms on L1

115

This evaluation holds good for any δ ∈ (0, π ). Thus we finally obtain

∞

−∞

K(x)dx = 1,

which implies that the integral of Kλ (x) is also equal to 1. We conclude that {Kλ } satisfies all the requirements of summability kernels. {Kλ } is called the Fejér summability kernel on R.

5.5 Spectral Synthesis on R: Inverse Fourier Transforms on L1 We now try to look for a method of inverse Fourier transforms. Given any integrable function, is it possible to find some function, the Fourier transform of which is exactly equal to it? We already know the positive answer to this question in the frameworks of L2 (Plancherel’s Theorem 4.3, p. 72), S(Theorem 4.2, p. 68) and S (Theorem 4.5, p. 84). But how about in the case of L1 ? The procedure to find some function, the Fourier transform of which is given is called spectral synthesis. A vector-valued integration appearing in the next lemma (which corresponds to Lemma 5.1) is the one in the sense of Cauchy–Bochner.11 Lemma 5.3 Let X be a Banach space, ϕ : R → X a bounded continuous function and {kλ } a summability kernel. Then

∞

lim

λ→∞ −∞

kλ (x)ϕ(x)dx = ϕ(0).

Proof Taking account of the condition (i) of summability kernels, we have

∞ −∞

kλ (x)ϕ(x)dx − ϕ(0) =

∞ −∞

kλ (x)(ϕ(x) − ϕ(0))dx =

δ

−δ

+

|x|>δ

= I1 + I2 (5.32)

for any δ > 0. I1 can be evaluated as

I1 Max ϕ(x) − ϕ(0) · kτ 1 . |x|δ

(5.33)

Let ε > 0 be any positive number. If we choose δ > 0 sufficiently small, the righthand side of (5.33) is less than ε. As for I2 , we obtain 11 See

Amann and Escher [1] II, pp. 17–23, Maruyama [6] pp. 329–333.

116

5 Summability Kernels and Spectral Synthesis

I2 sup ϕ(x) − ϕ(0)

|x|>δ

|x|>δ

|kλ (x)|dx.

(5.34)

The right-hand side of (5.34) is also less than ε > 0 for sufficiently large λ > 0 by the condition (iii) of summability kernels and the boundedness of ϕ. The relation (5.32), and the evaluations (5.33), (5.34) imply

∞

−∞

kλ (x)ϕ(x)dx → ϕ(0)

as

λ → ∞.

Given f ∈ L1 (R, C), we define an operator F : R → L1 (R, C) by F : x → τx f . By Theorem 5.1, F is uniformly continuous and bounded. Of course, F (0) = f . Thus we obtain the following result from Lemma 5.3. Theorem 5.7 (convergence of kλ ∗ f ) Let f, g ∈ L1 (R, C) and {kλ } a summability kernel. Then

∞ kλ (x)τx f dx → f as λ → ∞ −∞

in · 1 . Lemma 5.4 Let f, g ∈ L1 (R, C). If h is expressed as

1 h(x) = √ 2π

∞

−∞

H (ξ )eiξ x dx

for some integrable function H (ξ ), then it follows that

(h ∗ f )(x) =

∞

−∞

H (ξ )fˆ(ξ )eiξ x dξ.

Proof 1 (h ∗ f )(x) = √ 2π 1 = √ 2π

=

∞

−∞

∞

∞

−∞ −∞

∞

∞

−∞ −∞

H (ξ )eiξ(x−y) dξ · f (y)dy

f (y)e−iξy dy · H (ξ )eiξ x dξ

H (ξ )fˆ(ξ )eiξ x dξ.

5.5 Spectral Synthesis on R: Inverse Fourier Transforms on L1

117

When we use the Fejér summability kernel, we have

λ Kλ (x) = 2π

1

−1

(1 − |ξ |)e

iξ ·λx

1 dξ = 2π

λ

−λ

|ξ | iξ x 1− e dξ λ

(5.35)

(changing variables). Specity H (ξ ) in Lemma 5.4 as ⎧ ⎨ √1 1 − 2π H (ξ ) = ⎩ 0

|ξ | λ

on

[−λ, λ],

on

[−λ, λ]c .

(5.36)

Then H ∈ L1 and 1 Kλ (x) = √ 2π

∞

−∞

H (ξ )eiξ x dξ

(5.37)

by (5.35). Thanks to Lemma 5.4,

(Kλ ∗ f )(x) =

∞

−∞

H (ξ )fˆ(ξ )eiξ x dξ.

This is an analogue of (5.22). By Theorem 5.7, we have the following result. Theorem 5.8 (analogue of σn → f ) For f ∈ L1 (R, C), 1 √ 2π

λ

|ξ | ˆ 1− f (ξ )eiξ x dξ → f λ

−λ

as

λ → ∞ in · 1 .

Corollary 5.1 If f ∈ L1 (R, C) and fˆ(ξ ) = 0 (ξ ∈ R), then f = 0. This means that different integrable functions have different Fourier transforms. If we further assume that fˆ ∈ L1 , then 1 lim √ λ→∞ 2π

λ −λ

1−

|ξ | ˆ f (ξ )eiξ x dξ λ

1 = lim √ λ→∞ 2π 1 = √ 2π

∞ −∞

|ξ | ˆ χ[−λ,λ] (ξ ) 1 − f (ξ )eiξ x dξ λ −∞ ∞

fˆ(ξ )eiξ x dξ

(dominated convergence theorem).

118

5 Summability Kernels and Spectral Synthesis

Thus we obtain the formula of spectral synthesis; i.e. the inverse Fourier transforms on L1 . Theorem 5.9 (spectral synthesis) If both f and fˆ are in L1 (R, C), then

1 f (x) = √ 2π

∞

−∞

fˆ(t)eitx dt.

The function space A = {f ∈ L1 (R, C) | fˆ ∈ L1 (R, C)} is called the Wiener algebra, since it forms an algebra with respect to the usual operations and the convolution ∗. Remark 5.1 In order to show that A(R) is an algebra, it is enough to verify that f ∗ g ∈ A(R) for any f, g ∈ A(R). Basic properties of A(R) are picked up in the following.12 1◦ f ∈ A(R) ⇔ fˆ ∈ A(R). Proof By Theorem 5.9, 1 f (x) = √ 2π

∞ −∞

fˆ(t)eitx dt

for f ∈ A(R). If we define g(x) = f (−x)(∈ L1 (R, C)), 1 g(x) = √ 2π

∞

−∞

fˆ(t)e−itx dt.

Hence ˆˆ g = f, which implies that fˆ ∈ A(R). The converse can be proved similarly. 2◦ Define f A(R) = f 1 + fˆ 1 . Then 1

fˆ ∞ √ f A(R) . 2π (Obvious from (3.21), p. 52.) 3◦ f ∈ A(R) ⇒ f ∈ Lp (R, C) (1 p ∞).

12 cf.

Malliavin [5] Chap. III, §2.

References

119

Proof

∞

−∞

p−1

|f |p dx f ∞ · f 1 .

4◦

f, g ∈ A(R) ⇒ f ∗ g ∈ A(R). ∗g = Proof Since f ∗ g 1 f 1 · g 1 and f

f ∗ h 1 =

√

2π fˆg

ˆ 1

√

√

2π fˆg, ˆ we have

2π fˆ ∞ g

ˆ 1 < ∞.

Hence f ∗ g ∈ A(R).

Since H (ξ ) defined by (5.36) is integrable, *λ (ξ ) H (ξ ) = K by (5.37) and Theorem 5.9. Consequently, it follows that ⎧ ⎨ √1 1 − 2π Kλ ∗f (ξ ) = ⎩ 0

.

|ξ | λ

fˆ(ξ ) on on

[−λ, λ], [−λ, λ]c .

Combining this result with Theorem 5.8, we obtain the next theorem. Theorem 5.10 (dense subspace of L1 ) {f ∈ L1 (R, C) | suppfˆ is compact} is dense in L1 .

References 1. Amann, H., Escher, J.: Analysis, I, II. Birkhäuser, Basel (2005, 2008) 2. Helson, H.: Harmonic Analysis. Wadsworth and Brooks, Belmont (1991) 3. Katznelson, Y.: An Introduction to Harmonic Analysis, 3rd edn. Cambridge University Press, Cambridge (2004) 4. Kawata, T.: Ohyo Sugaku Gairon (Elements of Applied Mathematics). I, II, Iwanami Shoten, Tokyo (1950, 1952) (Originally published in Japanese) 5. Malliavin, P.: Integration and Probability. Springer, New York (1995) 6. Maruyama, T.: Suri-keizaigaku no Hoho (Methods in Mathematical Economics). Sobunsha, Tokyo (1995) (Originally published in Japanese)

Chapter 6

Fourier Transforms of Measures

So far, we have studied Fourier transforms or Fourier coefficients of functions defined on R or T. Inverse procedures to recover original functions from given Fourier transforms or Fourier coefficients were also discussed (spectral synthesis). However, there are many functions to which the methods of classical Fourier analysis can not be applied. In this chapter, we develop the theory of Fourier transforms of measures as a similar but new method to overcome such difficulties. The celebrated theorems, due to G. Herglotz and S. Bochner, to represent positive semi-definite functions in terms of Fourier transforms of some positive measures will be made use of in later chapters. It is one of the aims of this chapter to give rigorous proofs of the Herglotz–Bochner theorems. There are several approaches to them. We present here a couple of ways. One is based on the summability method using the Fejér kernel. Another is due to the theory of Fourier transforms of distributions. Its relation to the representation theorem of one-parameter groups of unitary operators and to Fourier analysis of weakly stationary stochastic processes is discussed in detail in succeeding chapters.

6.1 Radon Measures To start with, we briefly discuss the relation between the space of continuous functions and its dual space. Let X be a locally compact Hausdorff topological space. We denote by C∞ (X, C)(resp. C∞ (X, R)) the set of all the complex-valued (resp. real-valued) continuous functions vanishing at infinity.1 It is a Banach space with respect to the norm of uniform convergence:

is said to vanish at infinity if there exists, for each ε > 0, some compact set K ⊂ X such that |f (x)| < ε for all x ∈ K c .

1 A function f

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_6

121

122

6 Fourier Transforms of Measures

f ∞ = sup |f (x)|, x∈X

f ∈ C∞ .

Let μ be a complex-valued regular measure on (X, B(X)), the total variation |μ| of which is finite. B(X) is the Borel σ -field on X. The completion of such a measure with respect to |μ| is called a Radon measure. The set of all the Radon measures is denoted by M(X). M(X) is a Banach space, the norm of which is given by the total variation μ M(X) = |μ|. In particular, the set of all the positive (real-valued) Radon measures is denoted by M+ (X). For any μ ∈ M(X), we define a linear functional Λμ on C∞ (X, C) by

Λμ f =

f (x)dμ, X

f ∈ C∞ (X, C).

Then Λμ is bounded; i.e. Λμ ∈ C∞ (X, C) . Conversely, for any Λ ∈ C∞ (X, C) , there exists a measure μΛ ∈ M(X) which satisfies

f (x)dμΛ , f ∈ C∞ (X, C) Λf = X

and Λ = |μ|. Such a measure μ is uniquely determined. Thus the two Banach spaces C∞ (X, C) and M(X) are isomorphic to each other. This result is called the Riesz–Markov–Kakutani theorem.2 Riesz–Markov–Kakutani theorem Let X be a locally compact Hausdorff topological space. Then C∞ (X, C) and M(X) are isometrically isomorphic as Banach spaces; i.e. M(X) C∞ (X, C) .

6.2 Fourier Coefficients of Measures (1) The space C(T, C) of complex-valued continuous function on T is a Banach space with the uniform convergence norm. By the Riesz–Markov–Kakutani theorem, the Banach space M(T) of complex-valued Radon measures on T (norm is given by the total variation) is isomorphic to the dual space of C(T, C). From now on, the measurable space (T, B(T)) is identified with ([−π, π ), B([−π, π ))). (cf. Appendix A.)

2 Malliavin

[7] pp. 96–97, Maruyama [9] pp. 300–315.

6.2 Fourier Coefficients of Measures (1)

123

Definition 6.1 For μ ∈ M(T), 1 μ(n) ˆ = √ 2π

π

−π

e−inx dμ(x), 3

n∈Z

are called the Fourier coefficients of μ. Fourier coefficients of a measure are also called Fourier–Stieltjes coefficients in order to distinguish them from Fourier coefficients of a function. In particular, the Fourier coefficients of a measure μf = f dx defined by a function f ∈ L1 (T, C) are given by

π 1 e−inx f (x)dx, n ∈ Z. μˆ f (n) = √ 2π −π These are nothing other than usual Fourier coefficients of f . Theorem 6.1 For any f ∈ C(T, C) and μ ∈ M(T),

π

−π

n−1

f (x)dμ = lim

n→∞

1−

j =−(n−1)

|j | ˆ f (j )μ(−j ˆ ). n

(6.1)

Proof Consider first a trigonometric polynomial 1 P (x) = √ 2π

n−1

aj eij x

j =−(n−1)

as a special case of f . Then we obtain

π −π

P (x)dμ =

n−1

1 aj √ 2π j =−(n−1)

π

−π

eij x dμ =

n−1

aj μ(−j ˆ ).

j =−(n−1)

If we define σn (x) by 1 σn (x) = √ 2π

n−1

1−

j =−(n−1)

|j | ˆ f (j )eij x n

as in the preceding chapter (p. 110), it follows that

π −π

σn (x)dμ =

n−1 j =−(n−1)

|j | ˆ 1− f (j )μ(−j ˆ ) n

by the above result. 3 The

interval of integration should be interpreted as [−π, π ).

124

6 Fourier Transforms of Measures

By Theorem 2.6 (p. 43), σn (x) uniformly converges to f (x) as n → ∞. Hence

π

−π

π

f (x)dμ = lim

n→∞ −π

n−1

= lim

n→∞

σn (x)dμ

j =−(n−1)

|j | ˆ 1− f (j )μ(−j ˆ ). n

It is assured that the limit of) the right-hand side of (6.1) exists by Theorem 6.1. That is, (C, 1)-summability of fˆ(j )μ(j ˆ ) is verified. Corollary 6.1 Let μ ∈ M(T). If μ(n) ˆ = 0 for all n ∈ Z, then μ = 0. This is obvious, since

μ(n) ˆ = 0 for all n

⇒

π

−π

f (x)dμ = 0 for all f ∈ C(T, C)

by Theorem 6.1. Furthermore, the corollary implies the uniqueness of Fourier coefficients of a measure; i.e. μν

⇒

μˆ νˆ .

Define a couple of operators Sn : C(T, C) → C(T, C) and Sn∗ : M(T) → M(T) by 1 Sn :f → √ 2π 1 Sn∗ :μ → √ 2π

n−1

fˆ(j )eij x ,

(6.2)

μ(j ˆ )eij x dx.

(6.3)

j =−(n−1) n−1 j =−(n−1)

They are bounded linear operators which associate f and μ with the partial sums of their Fourier series. If we change fˆ(j )eij x and μ(j ˆ )eij x by fˆ(−j )e−ij x and μ(−j ˆ )e−ij x , respectively, there occurs no change in what (6.2) and (6.3) mean. Let us compute μ ◦ Sn and Sn∗ μ at f ∈ C(T, C), regarding μ ∈ M(T) as a bounded linear functional on C(T, C).

(μ ◦ Sn )f =

π

−π

(Sn f )(x)dμ =

1 = √ 2π

n−1 j =−(n−1)

fˆ(j )

π −π

π −π

1 √ 2π

n−1

fˆ(j )eij x dμ

j =−(n−1)

eij x dμ =

n−1 j =−(n−1)

(6.4) fˆ(j )μ(−j ˆ ).

6.2 Fourier Coefficients of Measures (1)

125

On the other hand, we have (Sn∗ μ)f =

n−1

π

1 f (x) √ 2π −π

1 = √ 2π =

j =−(n−1)

n−1

μ(−j ˆ )

j =−(n−1)

n−1

μ(j ˆ )eij x dx π −π

f (x)e−ij x dx

(6.5)

fˆ(j )μ(−j ˆ ).

j =−(n−1)

Comparing (6.4) and (6.5), we obtain (μ ◦ Sn )f = (Sn∗ μ)f

(6.6)

for every f ∈ C(T, C) and μ ∈ M(T). Thus we should observe that Sn∗ is the dual operator of Sn .4 Similarly, if we define the operators σn and σn∗ by 1 σn :f → √ 2π 1 σn∗ :μ → √ 2π

n−1 j =−(n−1) n−1 j =−(n−1)

|j | ˆ 1− f (j )eij x , n

(6.7)

|j | 1− μ(j ˆ )eij x dx, n

(6.8)

σn∗ is the dual operator of σn . Theorem 6.2 For any μ ∈ M(T), w ∗ - lim σn∗ μ = μ.

(6.9)

n→∞

Proof σn f (x) uniformly converges to f (x) for any f ∈ C(T, C) (cf. Theorem 2.6, p. 43). Hence we have

μf =

π −π

f (x)dμ = lim

n→∞ −π

π

= lim (σn∗ μ)f = lim n→∞

4 See

p. 83 for dual operators.

π

n→∞ −π

σn f (x)dμ = lim (μ ◦ σn )f n→∞

f (x)d(σn∗ μ).

126

6 Fourier Transforms of Measures

The fourth equality comes from the fact that σn∗ is the dual operator of σn . The relation (6.9) immediately follows from this.

6.3 Fourier Coefficients of Measures (2) We have already studied conditions under which a given sequence {an }∞ n=−∞ of complex numbers is the one of Fourier coefficients of some periodic distribution (Theorem 4.8, p. 97). In this section, we try to look for certain conditions which assure that the an ’s are Fourier coefficients of a Radon measure on T = [−π, π ). In particular, under what conditions are the an ’s Fourier coefficients of some positive Radon measure on T? It is the well-known theorem due to G. Herglotz which gives an answer to this question.5 Theorem 6.3 For a given sequence {an }∞ n=−∞ , the following two statements are equivalent. (i) There exists some μ ∈ M(T) which satisfies μ C (constant) and an = μ(n) ˆ (n ∈ Z). (ii) For any trigonometric polynomial P , it holds good that ∞ (6.10) Pˆ (n)a−n C P ∞ . n=−∞

Proof (i)⇒(ii): Let P (x) be a trigonometric polynomial. By Theorem 6.1, we have

π n−1 |j | ˆ 1− P (x)dμ = lim P (j )μ(−j ˆ ), (6.11) n→∞ n −π j =−(n−1)

where Pˆ (j ) = 0 except some finite j ’s. Hence the right-hand side of (6.11) =

∞

Pˆ (n)μ(−n) ˆ =

n=−∞

(i)

∞

Pˆ (n)a−n .

(6.12)

n=−∞

Since μ C, (6.11) and (6.12) imply ∞ π = ˆ (n)a−n C P ∞ . P (x)dμ P −π n=−∞

This proves (ii). 5 For

Sects. 6.3–6.8, we are very much indebted to Katznelson [5] Chap. I, §7 and Chap. VI, except for the distribution approach to the Herglotz–Bochner theorem.

6.3 Fourier Coefficients of Measures (2)

127

(ii)⇒(i): Let T be the set of all the trigonometric polynomials on T (with uniform convergence topology). We define a linear functional Λ : T → C by Λ : P →

∞

Pˆ (n)a−n .

(6.13)

n=−∞

Λ is bounded; i.e. Λ C for some C > 0. Λ can be extended uniquely to a bounded linear functional on C(T, C), preserving the norm. We denote this extended functional also by Λ. By the Riesz–Markov–Kakutani theorem, there exists a Radon measure μ ∈ M(T) which satisfies

Λf =

π −π

f ∈ C(T, C).

f (x)dμ,

For a special case of f (x) = einx , we have

Λeinx =

π

−π

einx dμ =

√ 2π μ(−n). ˆ

On the other hand, (6.13) implies

π √ 1 einx e−inx dx · a−n = 2π · a−n . Λeinx = √ 2π −π Then an = μ(n) ˆ follows from (6.14) and (6.15). It is obvious that μ C.

(6.14)

(6.15)

Corollary 6.2 For a sequence {an }∞ n=−∞ of complex numbers, the following two statements are equivalent: (i) There exists some μ ∈ M(T) which satisfies μ C (constant) and an = μ(n) ˆ (n ∈ Z). (ii) Let σn (x) be the (C, 1)-sum of a series ∞ 1 an einx , √ 2π n=−∞

that is, 1 σn (x) = √ 2π

n−1 j =−(n−1)

|j | 1− aj eij x . n

Then the measures σn dx ∈ M(T) are uniformly bounded, i.e.

σn dx C for some C (constant).

for all n ∈ Z

(6.16)

128

6 Fourier Transforms of Measures

Remark 6.1 Let P (x) be any trigonometric polynominal P (x) =

n

(αj ∈ C).

αj eij x

j =−n

Then the function the Fourier series of which is P (x) is P (x) itself.6 This applies to σn (x) defined above. Proof of Corollary 6.2 (i)⇒(ii): Assume that there exists some μ ∈ M(T) such that μ C and an = μ(n) ˆ (n ∈ Z). Then 1 σn (x)dx = √ 2π 1 = √ 2π

n−1 j =−(n−1) n−1 j =−(n−1)

|j | 1− aj eij x dx n |j | 1− μ(j ˆ )eij x dx n

= σn∗ μ. That is, σn dx = σn∗ μ. By Theorem 6.2, w ∗ - lim σn∗ μ = μ. Hence sup σn∗ μ < ∞. n→∞

n

(ii)⇒ (i): Conversely, suppose that σn dx C for all n ∈ Z. Let P be any trigonometric polynomial. Defining σn by (6.16), we can prove that

Pˆ (j )a−j = lim

π

n→∞ −π

P (x)σn (x)dx.

In fact,

π

−π

6 The

P (x)σn (x)dx =

π

1 P (x) √ 2π −π

n−1

1−

j =−(n−1)

|j | aj eij x dx n

Fourier coefficients of P (x) are given by

π √ 1 P (x)e−ij x dx = 2π αj √ 2π −π

for |j | n, and 0 for |j | n + 1. Hence the Fourier series of P (x) is computed as P (x) =

n √ n 1 2π αj √ eij x = αj eij x . 2π j =−n j =−n

(6.17)

6.4 Herglotz’s Theorem

129

1 = √ 2π

−π

P (x)

n−1

1 = √ 2π =

π

j =−(n−1)

n−1 j =−(n−1)

n−1 j =−(n−1)

|j | 1− a−j e−ij x dx n

π |j | 1− a−j P (x)e−ij x dx n −π

|j | 1− a−j Pˆ (j ). n

Pˆ (j ) = 0 except finite number of j s. We obtain (6.17) by letting n → ∞. (ii) and (6.17) imply Pˆ (j )a−j C P .

Consequently, we have (i) by Theorem 6.3. C

Remark 6.2 Even if we change the statement “ σn dx for all n ∈ Z” in Corollary 6.2(i) to “ σn dx C for infinitely many n ∈ Z”, the equivalence of (i) and (ii) holds good. If we denote indices n for which σn dx C by n , (6.17) implies

Pˆ (j )a−j = lim

π

n→∞ −π

P (x)σn dx = lim

π

n →∞ −π

P (x)σn dx.

) Hence | Pˆ (j )a¯ j | C P . We should remind the readers of this remark in later discussions in Sect. 6.4. That is, if (6.18) in Lemma 6.2 holds good for infinitely many n, it is equivalent to (i).

6.4 Herglotz’s Theorem Given a sequence {an }∞ n=−∞ of complex numbers, we pose a question: under what conditions can we represent an ’s as Fourier coefficients of some positive Radon measure? It is Herglotz’s theorem which answers this question. We denote the (C, 1)-sum of the Fourier series of a function f by σn f as in Sect. 6.2. Lemma 6.1 If f ∈ L1 (T, C) is real-valued and m f (x) a.e., then m σn f (x). Proof The Fejér kernel Kn satisfies Kn (x) 0 and

π −π

Kn (x)dx = 1.

130

6 Fourier Transforms of Measures

Hence

σn f (x) − m =

π

−π

Kn (ξ )(f (x − ξ ) − m)dξ 0.

Lemma 6.2 For a sequence statements are equivalent:

{an }∞ n=−∞

of complex numbers, the following two

(i) There exists a positive Radon measure μ ∈ M(T) such that an = μ(n) ˆ (n ∈ Z). (ii) n−1 j =−(n−1)

|j | 1− aj eij x 0 n

(6.18)

for any n ∈ Z and x ∈ T. Proof (i)⇒(ii): Suppose that an = μ(n) ˆ (n ∈ Z). Then 1 √ 2π

|j | 1 1− aj eij x = √ n 2π j =−(n−1) n−1

n−1

j =−(n−1)

|j | 1− μ(j ˆ )eij x n

(6.18 ) is the (C, 1)-sum of the Fourier series of μ. σn∗ μ denotes the measure determined by (6.18 ) in the manner of Sect. 6.2 (cf.(6.8)). For any f ∈ C(T, R) with f 0, we have

π

π f (x)d(σn∗ μ) = σn f (x)dμ, (6.19) −π

−π

where σn f is the (C, 1)-sum of the Fourier series of f . By f 0 and Lemma 6.1, σn f (x) 0. Consequently, (6.19) 0. Since this is true for all the continuous functions f 0, we obtain σn∗ μ 0. (6.18) immediately follows. ∞ an einx , (ii)⇒(i): Regarding (6.18) as the (C, 1)-sum of the series ϕ(x) ∼ n=−∞

we denote it by σn ϕ. Assume that σn ϕ(x) 0. σn ϕ, viewed as a measure, satisfies

σn ϕ =

π

−π

σn ϕ(x)dx = 2π a0 .

ˆ Since Hence, by Corollary 6.2, there exists a μ ∈ M(T) such that an = μ(n). σn∗ μ(x) = σn ϕ(x)dx

6.4 Herglotz’s Theorem

131

(cf. p. 125), Theorem 6.2 implies that

π −π

f (x)dμ = lim

π

n→∞ −π

f (x)d(σn∗ μ) = lim

π

n→∞ −π

f (x)σn ϕ(x)dx 0

(6.20) holds good for all the continuous functions f 0. So the measure μ must be positive. We now define the key concept to state and prove Herglotz’s theorem. Definition 6.2 Let G be either Z or R.7 A function f : G → C is said to be positive semi-definite if n

f (xi − xj )λi λ¯ j 0

(6.21)

i,j =1

for any finite elements x1 , x2 , · · · , xn of G and λ1 , λ2 , · · · , λn ∈ C. The simple properties of a positive semi-definite function can easily be verified. Lemma 6.3 Let G be either Z or R. A positive semi-definite function f : G → C satisfies the following properties: (i) f (0) 0, (ii) f (x) = f (−x), (iii) |f (x)| f (0). Proof (i) This is an obvious result if we consider a special case where n = 1, x1 = 0 and λ1 = 1 in the above definition. (ii) Consider the special case where n = 2, x1 = 0, x2 = x, λ1 = λ2 = 1. Then 2f (0) + f (x) + f (−x) 0.

(6.22)

If λ2 = i, we have f (0) + f (x)i − f (−x)i − f (0) = f (x)i − f (−x)i 0.

(6.23)

Combining (i) and (6.22), we learn that f (x) + f (−x) is real. And (6.23) tells us that f (x) − f (−x) is purely imaginary. Hence f (x) = f (−x).

7G

may be, more generally, a commutative group.

132

6 Fourier Transforms of Measures

(iii) In the case of |f (x)| = 0, (iii) is obvious by (i). So we have only to check the case |f (x)| > 0. If n = 2, x1 = 0, x2 = x, λ1 = −f (x), λ2 = |f (x)|, then 2f (0)|f (x)|2 − 2|f (x)|3 0. This implies |f (x)| f (0), since |f (x)| > 0.

Remark 6.3 1◦ Consider an (n × n)-matrix (αij ), the (i, j )-element αij of which is f (xi − xj ) (1 i, j n). If f is positive semi-definite, then (αij ) is an Hermitian matrix by Lemma 6.3(ii). (6.21) means that the Hermitian form determined by (αij ) is positive semi-definite. 2◦ f (0) f (−x) f (x) f (0) is an Hermitian matrix obtained in the case n = 2, x1 = 0, x2 = x. The Hermitian form determined by this matrix is positive semi-definite. Thanks to the well-known “sign theorem”of sub-determinants, Lemma 6.3 is almost trivial.8 Lemma 6.4 Assume that a function f : R → C is positive semi-definite and continuous at t = 0. Then it has the following properties: (i) f is uniformly continuous on R. (ii) For any continuous function ζ : R → C,

T 0

T

f (s − t)ζ (s)ζ (t)dsdt 0,

(6.24)

0

(T > 0). Proof (i) Since |f (t)| f (0) by Lemma 6.3, f ≡ 0 in the case of f (0) = 0. Hence it is sufficient to consider the case of f (0) > 0. Without loss of generality, we may assume f (0) = 1. Specifying the value of t as t1 = 0, t2 = t and t3 = t +h, we have

8 See

textbooks on linear algebra for Hermitian forms and their signs.

6.4 Herglotz’s Theorem

133

f (0) f (−t) f (−t − h) f (t) f (0) f (−h) = 1 − |f (t)|2 − |f (t + h)|2 f (t + h) f (h) f (0) − |f (h)|2 + 2Re{f (t)f (h)f (t + h)} 0 by the positive semi-definiteness of f (sign theorem of Hermitian forms). Thus it follows that |f (t) − f (t + h)|2 = |f (t)|2 + |f (t + h)|2 − 2Re{f (t)f (t + h)} 1 − |f (h)|2 + 2Re{f (t)f (t + h)[f (h) − 1]} 1 − |f (h)|2 + 2|1 − f (h)| 4|1 − f (h)|. Since we are assuming that f (0) = 1, f is uniformly continuous. (ii) Consider two decompositions of [0, T ] : 0 = s0 < s1 < · · · < sn = T and 0 = t0 < t1 < · · · < tn = T . By the continuity of f , the integration in (6.24) is the limit of its Riemann sum n

f (si − tj )ζ (si )ζ (tj )(si − si−1 )(tj − tj −1 ).

i,j =1

Since f is positive semi-definite, the Riemann sum is 0. Hence (ii) follows.

A sequence {an }∞ n=−∞ of complex numbers can be regarded as a function ϕ : Z → C. Therefore we can talk about the positive semi-definiteness of {an }. That is, {an } is positive semi-definite if p

ani −nj λi λ¯ j 0

i,j =1

for any finite integers n1 , n2 , · · · , np and λ1 , λ2 , · · · , λp ∈ C. Theorem 6.4 (Herglotz) 9 For a sequence {an }∞ n=−∞ of complex numbers, the following two statements are equivalent. (i) {an }∞ n=−∞ is positive semi-definite. (ii) There exists a positive Radon measure μ ∈ M(T) which satisfies an = μ(n) ˆ (n ∈ Z).

9 Herglotz

[3]. Essentially the same result is obtained by Carathéodory [2] and Toeplitz [13]. The proof given here is due to Katznelson [5] Chap. 1, §7.

134

6 Fourier Transforms of Measures

Proof (ii)⇒(i): Assume that μ is a positive Radon measure on T such that an = μ(n) ˆ (n ∈ Z). Fix a finite number of integers, say n1 , n2 , · · · , np . Then for any n, m ∈ {n1 , n2 , · · · , np } and any complex numbers λn , λm , we have

an−m λn λ¯ m =

n,m

n,m

1 √ 2π

π −π

e−i(n−m)x dμ · λn λ¯ m

2

π 1 −inx = √ λn e dμ 0. 2π −π n (i)⇒(ii): We write formally ∞ 1 ϕ(x) ∼ √ an einx 2π n=−∞

and define + λn =

einx

if

|n| N − 1,

0

if

|n| N

for any N ∈ N. Then

an−m λn λ¯ m =

Cj,N aj eij x

(6.25)

j

where

Cj,N = Max{0, 2N − 1 − |j |}.

Hence 1 1 σ2N ϕ(x) = √ Cj,N aj eij x . · 2N − 1 2π j By (i) and (6.25), we obtain σ2N ϕ(x) 0 for all N ∈ N. (ii) follows by Lemma 6.2 and Remark 6.2. Remark 6.4 The positive measure μ which satisfies (ii) is unique by Corollary 6.1. So far we have proved the Herglotz theorem having recourse to the summability method. We now turn to discussing an alternative proof based upon the theory of periodic distributions.10 The proof of the “if” part is easy and well-known. So it is enough to show the “only if” part.

10 Due

to Maruyama [9]. I acknowledge the priority of Lax [6], pp. 142–143 for the idea of proof.

6.4 Herglotz’s Theorem

135

Assume that {an }n∈Z is positive semi-definite. By Lemma 6.3 above, we have |an | a0

for all n ∈ Z.

(6.26)

Hence it is obvious that |an | const. |n|N for some N ∈ N ∪ {0}. It follows from Theorem 4.8 (p. 97) that the an ’s are the Fourier coefficients of some 2π -periodic distribution T ; i.e. 1 an = √ T (e−inx ). 2π

(6.27)

Let ϕ be any element of D2π (R) and ϕk ’s its Fourier coefficients. By a simple computation, we have Tϕ = T

1 1 inx = √ a¯n ϕn . ϕn T (einx ) = ϕn e √ 2π 2π

(6.28)

Since ϕ is 2π -periodic and smooth, the series summing its Fourier coefficients is absolutely convergent.11 Taking account of (6.26), we observe that the right-hand side of (6.28) converges. We now proceed to show that T is positive. Let qN (x) be any trigonometric polynomial of order N : N

qN (x) =

φn einx .

(6.29)

n=−N

If we adopt |qN (x)|2 =

N

φn φ¯ k ei(n−k)x

n,k=−N

as ϕ ∈ D2π (R), (6.28) implies T (|qN (x)|2 ) =

√

2π

ak−n φn φ¯ k 0.

(6.30)

Any q(x) ∈ D2π (R) can be approximated by a sequence of trigonometric polynomials in C∞ -topology. Passing to the limit, (6.30) implies T (|q(x)|2 ) 0 for any

11 See

Theorem 2.5 (p. 37) and Remark 2.3 (p. 39).

q(x) ∈ D2π (R).

(6.31)

136

6 Fourier Transforms of Measures

Let p(x) ∈ D2π (R) be nonnegative (real-valued) and q(x) = q(x) ∈ D2π (R) and

√

p(x). Then

T (p(x)) 0 by (6.31). Consequently, the distribution T is positive and hence it is a positive measure;12 i.e.

T (ϕ) = ϕ(x)dμ for all ϕ ∈ C∞ (T, C) T

for some μ ∈ M+ (T). So we must have the desired result: 1 1 an = √ T (e−inx ) = √ 2π 2π

T

e−inx dμ.

6.5 Fourier Transforms of Measures Is any similar representation possible for a positive definite function defined on R rather than T? The well-known theorem due to S. Bochner provides a positive answer to this question. There are several approaches to its proof. We first discuss the approach based upon the summability method, and then present an alternative proof from the viewpoint of distribution theory. This section is devoted to a few preparatory results for Bochner’s theorem. Definition 6.3 For a measure μ ∈ M(R), 1 μ(ξ ˆ )= √ 2π

∞

−∞

e−iξ x dμ(x)

is called the Fourier transform of μ. Another terminology, the Fourier–Stieltjes transform is also used in order to distinguish the Fourier transform of a measure from that of a function. It is easy to show that μ(ξ ˆ ) is uniformly continuous. The Fourier transform of a measure dμf = f dx defined by a function f ∈ L1 (R, C) is given by 1 μˆ f (ξ ) = √ 2π

∞ −∞

e−iξ x f (x)dx.

It is nothing other than the usual Fourier transform of f . 12 See

Appendix C, Sect. C.3, or Schwartz [12], Chap. 1, §4, Théorème 5. M+ (T) denotes the set of all the positive Radon measures on T.

6.5 Fourier Transforms of Measures

137

Definition 6.4 The convolution of a measure μ ∈ M(R) and a function ϕ ∈ C∞ (R, C) is defined by the integration13

(μ ∗ ϕ)(x) =

∞ −∞

ϕ(x − y)dμ(y).

It is clear that

μ ∗ ϕ ∞ μ M(R) · ϕ ∞ , where μ M(R) is the norm of μ defined by its total variation |μ|. The convolution operation can also be defined for a pair of a measure μ ∈ M(R) and a μ-integrable functions ϕ ∈ L1μ (R, C) in a similar manner. The following result corresponds to Theorem 6.1. Theorem 6.5 For μ ∈ M(R) and f ∈ A (Wiener algebra),

∞

−∞

f (x)dμ =

∞

fˆ(ξ )μ(−ξ ˆ )dξ.

−∞

Proof By the inversion formula (Theorem 5.9, p. 118), we have 1 f (x) = √ 2π

∞ −∞

fˆ(ξ )eiξ x dξ.

Applying Fubini’s theorem, we can deduce the desired result:

∞

1 f (x)dμ = √ 2π −∞

∞

∞

−∞ −∞

fˆ(ξ )eiξ x dξ dμ(x)

∞ ∞ 1 = √ eiξ x dμ(x) fˆ(ξ )dξ 2π −∞ −∞

∞ fˆ(ξ )μ(−ξ ˆ )dξ. = −∞

The result corresponding to Corollary 6.1 immediately

follows.14

Corollary 6.3 Let μ ∈ M(R). If μ(ξ ˆ ) = 0 for every ξ ∈ R, then μ = 0.

13 See

Sect. 6.7 in this chapter for the convolution of two measures. = 0 for a.e. ξ ” is sufficient to get μ = 0. However, μ(ξ ˆ ) is eventually equal to 0 for every ξ , since we know that μ(ξ ˆ ) is continuous. 14 “μ(ξ ˆ )

138

6 Fourier Transforms of Measures

Theorem 6.6 Let ϕ ∈ C(R, C). Define a function Φλ (λ > 0) by 1 Φλ (x) = √ 2π

λ

−λ

1−

|ξ | ϕ(ξ )eiξ x dξ. λ

Then the following two statements are equivalent: (i) There exists some μ ∈ M(R) such that ϕ = μ. ˆ (ii) Φλ ∈ L1 (R, C) for all λ > 0, and Φλ 1 = O(1) as λ → ∞. Proof (i)⇒(ii): Writing ϕ = μˆ (μ ∈ M(R)), we compute the convolution of μ and the Fejér kernel Kλ :

(μ ∗ Kλ )(x) =

∞

−∞

Kλ (x − y)dμ(y)

1 λ (1 − |θ |)eiθ(x−y)λ dθ · dμ(y) (by Lemma 5.2, p. 113) −∞ 2π −1

1 ∞ λ = e−iθλy dμ(y) (1 − |θ |)eiθλx dθ 2π −1 −∞ 789:

=

∞

√

2π μ(θλ) ˆ

1 |ξ | = √ 1− μ(ξ ˆ )eiξ x dξ λ 2π −λ

λ |ξ | 1 1− ϕ(ξ )eiξ x dξ = √ λ 2π −λ

λ

(changing variables: θ λ = ξ ) (by (i))

= Φλ (x). Hence (ii) follows from

Φλ 1 = =

∞

|Φλ (x)|dx =

−∞

∞ ∞

−∞ −∞

∞ −∞

|(μ ∗ Kλ )(x)|dx

|Kλ (x − y)|d|μ|(y)dx

Kλ 1 · μ M(R) . (ii)⇒(i): Define a measure μλ by Φλ (x)dx. Since {Φλ }λ>0 is L1 -bounded, μλ = Φλ (x)dx has a limiting point μ (in w∗ -topology) as λ → ∞ thanks to Alaoglu’s theorem.15 We can show that ϕ = μˆ as follows.

15 cf.

Lax [6] pp. 120–121, Maruyama [8] pp. 354–355.

6.5 Fourier Transforms of Measures

We first show that

∞ −∞

139

ϕ(−ξ )g(ξ )dξ =

∞

−∞

μ(−ξ ˆ )g(ξ )dξ

(6.32)

for any g ∈ S (space of rapidly decreasing functions). Let G ∈ S be the inverse ˆ Then it follows that Fourier transform of g; i.e. g = G.

|ξ | g(ξ )ϕ(−ξ )dξ = lim g(ξ )ϕ(−ξ ) 1 − dξ λ→∞ −λ λ −∞

λ

∞ 1 |ξ | −iξ x dξ G(x)e dx ϕ(−ξ ) 1 − = lim √ λ→∞ −λ λ 2π −∞ λ

∞ |ξ | 1 −iξ x 1− ϕ(−ξ )e = lim √ G(x) dξ dx λ→∞ λ 2π −∞ −λ λ

∞ |ξ | 1 iξ x 1− ϕ(ξ )e dξ dx G(x) = lim √ λ→∞ λ 2π −∞ −λ ∞

λ

∞

= lim

(changing the variable − ξ to ξ )

λ→∞ −∞

∞

= lim =

λ→∞ −∞

∞ −∞ ∞

G(x)Φλ (x)dx G(x)dμλ

G(x)dμ

w∗

(μλ → μ)

∞ 1 g(ξ )eiξ x dξ · dμ(x) √ 2π −∞ −∞

∞

∞ 1 iξ x = g(ξ ) √ e dμ(x) dξ 2π −∞ −∞

∞ = g(ξ )μ(−ξ ˆ )dξ. =

−∞

This proves (6.32). Since S is dense in C∞ , (6.32) tells us that

∞ −∞

g(ξ ){ϕ(−ξ ) − μ(−ξ ˆ )}dξ = 0

ˆ ), that is, ϕ = μ. ˆ for any g ∈ C∞ . Therefore ϕ(−ξ ) = μ(−ξ

140

6 Fourier Transforms of Measures

Lemma 6.5 Let μn , μ ∈ M(R) and ϕ ∈ C(R, C). If w ∗ - lim μn = μ n→∞

and lim μˆ n (ξ ) = ϕ(ξ )

f or each

n→∞

ξ ∈ R,

then μˆ = ϕ. Proof As in the proof of the previous theorem, pick up any g ∈ S. Let G ∈ S be ˆ Then16 the inverse Fourier transform of g; i.e. g = G.

∞

1 g(ξ )μ(−ξ ˆ )dξ = √ 2π −∞

∞

−∞

∞ −∞

eiξ x dμ(x) g(ξ )dξ

∞ ∞ 1 g(ξ )eiξ x dξ dμ(x) = √ 2π −∞ −∞

∞ G(x)dμ(x) = −∞

= lim

∞

n→∞ −∞

G(x)dμn (x)

1 = lim √ n→∞ 2π

∞

−∞

∞

−∞

w∗

(by μn → μ) g(ξ )eiξ x dξ dμn (x)

∞ ∞ 1 eiξ x dμn g(ξ )dξ = lim √ n→∞ 2π −∞ −∞

∞ = lim g(ξ )μˆ n (−ξ )dξ n→∞ −∞

=

∞ −∞

g(ξ )ϕ(−ξ )dξ

(dominated convergence theorem).

16 g(ξ )μ(−ξ ˆ )

→ g(ξ )ϕ(−ξ ). And {μn } is bounded (with respect to the operator norm), since μn → μ in w ∗ -topology. Furthermore, since g ∞ < ∞, there exists some F ∈ L1 such that |g(ξ )μˆ n (−ξ )| |g(ξ )| · C F (ξ ).

Then apply the dominated convergence theorem.

6.5 Fourier Transforms of Measures

141

Hence

∞

−∞

g(ξ )(μ(−ξ ˆ ) − ϕ(−ξ ))dξ = 0

for any g ∈ S. μˆ = ϕ follows from the continuity of μ(−ξ ˆ ) − ϕ(−ξ ).17

The next theorem corresponds to Theorem 6.3. Theorem 6.7 The following statements are equivalent for ϕ ∈ C(R, C): (i) There exists a μ ∈ M(R) such that ϕ = μ. ˆ (ii) There exists a constant C such that ∞ ˆ(ξ )ϕ(−ξ )dξ C sup |f (x)| f −∞

x∈R

for any f ∈ S. Proof (i)⇒(ii): If we choose C = μ , (ii) is obvious by Theorem 6.5. (ii)⇒(i): Define a linear functional Λ : S → C by

Λ : f →

∞

−∞

fˆ(ξ )ϕ(−ξ )dξ.

(6.33)

Then Λ is bounded and Λ C by (ii). (S is endowed with the uniform convergence norm.) Since S is dense in C∞ , Λ can be extended uniquely as an element of (C∞ ) preserving the norm. The extended functional is denoted by the same notation Λ. By the Riesz–Markov–Kakutani theorem, there exists a unique μ ∈ M(R) such that

Λf =

∞

−∞

f (x)dμ,

f ∈ C∞

(6.34)

( μ C). By Theorem 6.5 again, we have

∞

−∞

f (x)dμ =

∞

−∞

fˆ(ξ )μ(−ξ ˆ )dξ,

f ∈ S.

(6.35)

on the contrary, that μ(− ˆ ξ¯ ) − ϕ(−ξ¯ ) 0 (say, > 0) for some ξ¯ . Then there exists a neighborhood U of ξ¯ such that μ(−ξ ˆ ) − ϕ(−ξ ) > 0 on U . There exists some g ∈ S such that supp g ⊂ U and g(x) > 0 in the interior of its support. Then clearly

∞ g(ξ )(μ(−ξ ˆ ) − ϕ(−ξ ))dξ > 0. 17 Assume,

−∞

Contradiction.

142

6 Fourier Transforms of Measures

Combining (6.33) and (6.35), we obtain

∞

fˆ(ξ ){ϕ(−ξ ) − μ(−ξ ˆ )}dξ = 0,

−∞

f ∈ S.

(6.36)

Taking account of S = {fˆ|f ∈ S} and the continuity of ϕ(−ξ ) − μ(−ξ ˆ ), we conclude that ϕ = μˆ by (6.36). We should note that even if we change the function space S appearing in (ii) ◦ of Theorem 6.7 to Sc or C∞ 0 , the equivalence of (i) and (ii) is still valid. (See 1 , ◦ 2 on p. 67 and Remark 4.1 on p. 71). Theorem 6.8 The following two statements are equivalent for ϕ ∈ C(R, C): (i) There exists some μ ∈ M(R) such that ϕ = μ. ˆ (ii) There exists some constant C such that the numerical sequence {ϕ(λn)}n∈Z , for any λ > 0, is the sequence of the Fourier coefficients of some νλ ∈ M(T) with

νλ C (independent of λ). Proof (i)⇒(ii): Assume (i). Let E be a measurable set in T and define En = E + 2π n (n ∈ Z), E˜ = ∪n En . Then we also define a measure ν on T by ν(E) = ˜ 18 ν satisfies μ(E). ϕ(n) = μ(n) ˆ = νˆ (n),

ν μ .

If we define a measure μλ (λ > 0) on R by

∞ −∞

f (x)dμλ =

∞

−∞

f (λx)dμ,

f ∈ C∞ ,

then μλ = μ and ˆ ). μˆ λ (ξ ) = μ(λξ

(6.37)

In fact, (6.37) follows from 1 μˆ λ (ξ ) = √ 2π

18 For

∞

1 e−iξ x dμλ (x) = √ 2π −∞

a 2π -periodic function f , we have

∞

f (x)dμ = −∞

∞

−∞

π −π

f (x)dν.

e−iξ ·λx dμ(x) = μ(λξ ˆ ).

6.5 Fourier Transforms of Measures

143

Hence ϕ(λn) = μ(λn) ˆ = μˆ λ (n). Let νλ be the measure on T which corresponds to μλ . Since νλ μλ = μ

for every λ > 0, it is enough to define C = μ . (ii)⇒(i): Let f be an element of Sc = {f ∈ S | supp fˆ is compact}. Since ˆ f (ξ )ϕ(−ξ ) is an element of C0 , the integration

∞ −∞

fˆ(ξ )ϕ(−ξ )dξ

can be approximated by Riemann sum. That is, for any ε > 0, dividing some interval [−A, A] ⊃supp fˆ into sufficiently small intervals, the length of which is less than λ, we obtain ∞ (6.38) fˆ(ξ )ϕ(−ξ )dξ < λ fˆ(λn)ϕ(−λn) + ε. −∞ n Define a function ψλ (x) : T → C by ψλ (x) =

∞

f

m=−∞

x + 2π m . λ

(6.39)

Since f ∈ Sc (⊂ S), the series ψλ (x) is well-defined and continuous. The Fourier coefficients of ψλ (x) are computed as (cf. Fig. 6.1) 1 ψˆ λ (n) = √ 2π 1 = √ 2π λ = √ 2π

π

−π

π

ψλ (x)e−inx dx ∞

−π m=−∞

π λ

− πλ

f

x + 2π m −inx e dx λ

2π m −iλnx e f x+ dx λ m=−∞ ∞

(changing the variable x/λ by x) ∞ π λ λ 2π m −iλnx = √ e f x+ dx λ 2π m=−∞ − πλ (by uniform convergence of · · · ).

(6.40)

144

6 Fourier Transforms of Measures

Fig. 6.1 Calculation of (6.40)

The integration of (6.40) is evaluated as

π λ

− πλ

π + 2π m λ λ 2π m −iλnx e f x+ dx = f (u)e−iλnu · e2π inm du π 2π λ − λ + λm (changing variables: u = x + (2π m)/λ)

=

π 2π m λ+ λ

− πλ + 2πλm

f (u)e−iλnu du

(6.41)

(e2π inm = 1). Substituting (6.41) into (6.40),19 we obtain λ ψˆ λ (n) = √ 2π

∞

−∞

f (u)e−iλnu du = λfˆ(λn).

(6.42)

Since f ∈ S, we can show that

ψλ ∞ f ∞ + ε

(6.43)

for sufficiently small λ. In fact, there exists some M such that f x + 2π m λ

M x + 2π m λ

2

M 1 · 2 2 m π λ

(6.44)

for all x ∈ T because f is rapidly decreasing.20 Consequently, we have

up the upper and lower ends of the integration (6.41) for all m ∈ Z, we get the integration over (−∞, ∞). See Fig. 6.1. 2 π − 2π |m| 2 π2 π2 20 x + 2π m = 2 (1 − 2|m|)2 2 m2 , where |m| 1. λ λ λ λ 19 Summing

6.5 Fourier Transforms of Measures

|ψλ (x)|

145

∞ f x + 2π m λ

m=−∞

x M λ2 f · + λ π 2 m2 (m=0)

A

(6.45)

m0

(constant A is independent of x).

Hence we obtain, λ0 being fixed, sup

x + 2π m <ε f λ0

(6.46)

x∈T |m|>N 0

for sufficiently large N0 . The evaluation (6.46) holds good for λ λ0 . For |m| N0 , x + 2π m 1 sup f 2N + 1 f ∞ λ 0 x∈T

(6.47)

for sufficiently small λ λ0 . Choosing λ λ0 very small so that both (6.45) and (6.47) hold good, we obtain, by (6.45), (6.46) and (6.47),

ψλ ∞ f ∞ + ε.

(6.48)

The evaluation (6.43) immediately follows from this. By (ii), we can choose νλ ∈ M(T) so that ϕ(λn) = νˆ λ (n) hold good. The relations |ψˆ λ (n)| jointly imply

√

and

νλ C

2π ψλ ∞ , |ˆνλ (n)| (1/

√

2π )C and (6.42)

∞ ∞ ψˆ λ (n)ˆνλ (−n) C ψλ ∞ . fˆ(λn)ϕ(−λn) = λ n=−∞

n=−∞

Furthermore, we obtain, by (6.38), (6.43) and (6.49), that

∞

−∞

ˆ f (ξ )ϕ(−ξ )dξ C ψλ ∞ + ε

C( f ∞ + ε) + ε = C f ∞ + (C + 1)ε,

(6.49)

146

6 Fourier Transforms of Measures

from which it follows that

∞ −∞

ˆ f (ξ )ϕ(−ξ )dξ C f ∞ ,

since ε > 0 is arbitrary. This holds good for all f ∈ Sc . Hence (i) follows by Theorem 6.7 and the remark to it (pp. 141–142).

6.6 Bochner’s Theorem We have completed all the preparations for proving Bochner’s theorem. Theorem 6.9 For ϕ ∈ Cb (R, C), the following two statements are equivalent: (i) There exists some μ ∈ M+ (R) such that ϕ = μ. ˆ (ii) For any nonnegative f ∈ A (Wiener algebra),

∞ −∞

fˆ(ξ )ϕ(−ξ )dξ 0.

(6.50)

Proof (i)⇒(ii): If we assume (i), we have

∞

−∞

f (x)dμ =

∞

−∞

(i) fˆ(ξ )μ(−ξ ˆ )dξ =

∞

−∞

fˆ(ξ )ϕ(−ξ )dξ 0

for any nonnegative f ∈ A by Theorem 6.5. (ii)⇒(i): If there exists some μ ∈ M(R) such that ϕ = μ, ˆ (ii) implies

∞

−∞

f (x)dμ =

∞

−∞

fˆ(ξ )ϕ(−ξ )dξ 0

for any nonnegative f ∈ S. (Keep in mind that S ⊂ A.) Since S is dense in C∞ , μ must be a positive measure. So it remains only to show the existence of μ ∈ M(R) which satisfies ϕ = μ. ˆ We should recall a couple of properties of the Fejér kernel Kλ (x): 1◦ 1 1 Kλ (x) = K(λx) = λ 2π

sin λx/2 λx/2

2 (6.51)

is nonnegative and converges to 1/2π as λ → 0.21 This convergence is uniform on any compact set in R. 21 For

the Fejér kernel, see Chap. 5, Sect. 5.4 (pp. 111–115).

6.6 Bochner’s Theorem

147

2◦ 1 K(λx)(ξ ) = Max λ

.

1 |ξ | 1− ,0 . √ λ 2π

(6.52)

It can be proved as follows. Since K(λx) = Kλ (x)/λ, .

K(λx)(ξ ) =

1 ˆ Kλ (ξ ). λ

(6.53)

As we have already verified,22 ⎧ ⎨ √1 1 − 2π Kˆ λ (ξ ) = ⎩ 0

|ξ | λ

on

[−λ, λ],

[−λ, λ]c |ξ | 1 1− ,0 . = Max √ λ 2π on

(6.54)

(cf. Fig. 6.2.) The relation (6.52) follows from (6.53) and (6.54). We proceed to show that √ ∞ 2π lim Kˆ λ (ξ )ϕ(−ξ )dξ = ϕ(0). λ→0 λ −∞

(6.55)

Since 1 λ

∞ −∞

Kˆ λ (ξ )dξ =

∞

1 K(λx)(ξ )dξ = √ 2π −∞ .

(6.56)

by (6.54), it follows that

∞ √2π ∞ √2π Kˆ λ (ξ )ϕ(−ξ )dξ − ϕ(0) = Kˆ λ (ξ ){ϕ(−ξ ) − ϕ(0)}dξ −∞ λ −∞ λ √ 2π ˆ

Kλ 1 · ϕ(−ξ ) − ϕ(0) ∞ = ϕ(−ξ ) − ϕ(0) ∞ . (6.57) λ 789: =1

By the continuity of ϕ, sup

ξ ∈[−λ,λ]

22 cf.

Chap. 5, Sect. 5.5, p. 119.

|ϕ(−ξ ) − ϕ(0)| → 0

148

6 Fourier Transforms of Measures

Fig. 6.2 Graph of λ1 Kˆ λ (ξ )

as λ → 0. That is, the most right-hand side of (6.57) converges to 0. This proves (6.55). Finally, taking account of Theorem 6.7 and the remark to it, we wish to prove the existence of a constant C such that ∞ (6.58) fˆ(ξ )ϕ(−ξ )dξ C f ∞ −∞

for any f ∈ C∞ 0 . Without loss of generality, we may assume that f is real-valued. (If not, we have only to divide f into real and imaginary parts.) If we choose sufficiently small λ > 0, for each ε > 0, we have23 2π(ε + f ∞ )K(λx) − f (x) 0

for all

x ∈ R.

(6.59)

The left-hand side of (6.59) is contained in A and nonnegative. So it follows by (ii) that

∞

∞ 1 K(λx)(ξ )ϕ(−ξ )dξ fˆ(ξ )ϕ(−ξ )dξ (ε + f ∞ ) 2π −∞ −∞

∞ 1 = (ε + f ∞ ) Kˆ λ (ξ )ϕ(−ξ )dξ λ −∞ √

∞ 2π 1 (ε + f ∞ ) = √ Kˆ λ (ξ )ϕ(−ξ )dξ 2π λ −∞ .

−→

ε + f ∞ ϕ(0) √ 2π

as

λ→0

(by (6.55)). (6.60)

23 We have already seen that K(λx) → 1/2π (uniformly on any compact set) as λ → 0 (cf. p. 146). We can choose λ (independent of x ∈ supp f ) so that (6.59) holds good on supp f . The inequality (6.59) is satisfied, of course, outside supp f .

6.6 Bochner’s Theorem

149

Hence

∞ −∞

fˆ(ξ )ϕ(−ξ )dξ

√ 2π (ε + f ∞ )ϕ(0).

(6.61)

By repeating the same arguments √ for −ϕ, we can prove the converse inequality. Thus (6.58) is satisfied for C = 2π ϕ(0). Combining Theorem 6.8 and Theorem 6.9, we obtain the next result. Theorem 6.10 For ϕ ∈ Cb (R, C), the following two statements are equivalent: (i) There exists some μ ∈ M+ (R) such that ϕ = μ. ˆ (ii) For any λ > 0, there exists some νλ ∈ M+ (T) such that the numerical sequence {ϕ(λn)}n∈Z is the sequence of Fourier coefficients of νλ . Proof (i)⇒(ii) is proved in the same way as in Theorem 6.8. In the course of the proof, the measure μλ is positive. Hence the corresponding νλ is also positive. (ii)⇒(i): Since νλ ∈ M+ (T) satisfies ϕ(λn) = νˆ λ (n),

n ∈ Z,

we obtain

π 1 1 dνλ = √ νλ . ϕ(0) = √ 2π −π 2π √ By Theorem 6.8 (for C = 2π ϕ(0)), there exists some μ ∈ M(R) which satisfies ϕ = μ. ˆ We show that (6.50) is valid for any nonnegative f ∈ A. (It is required in order to apply Theorem 6.9.) Without loss of generality, we may assume that supp fˆ is compact.24 Under this assumption, the integration appearing in the left-hand side of (6.50) can be approximated by Riemann sum so that (6.38) in the previous section (p. 143) is satisfied. Furthermore, if we define a function ψλ : T → C as in (6.39) in the previous section, we have ψˆ λ (n) = λfˆ(λn),

n∈Z

(6.62)

1 ˆ f ∈ A and f 0, there exists a sequence {gn } in C∞ 0 which converges to f in L . Let fn be the inverse Fourier transform of gn ; i.e. fˆn = gn . Then {fn } is a sequence in Sc . If

∞

∞ fˆn (ξ )ϕ(−ξ )dξ = gn (ξ )ϕ(−ξ )dξ 0

24 If

−∞

−∞

for each n, we obtain, by passing to the limit,

∞ fˆ(ξ )ϕ(−ξ )dξ 0. −∞

150

6 Fourier Transforms of Measures

as we have already seen (cf. (6.42)). And, by (6.48),

ψλ ∞ f ∞ + ε for sufficiently small λ > 0. Consequently, Theorem 6.1, λ

fˆ(λn)ϕ(−λn) =

)

ψˆ λ (n)ˆνλ (−n) =

(6.63)

ψˆ λ (n)ˆνλ (−n) converges and, by

π −π

ψλ (x)dνλ 0.

(6.64)

By Riemann approximation, we obtain

∞ −∞

fˆ(ξ )ϕ(−ξ )dξ λ

fˆ(λn)ϕ(−λn) − ε.

We get (6.50) from (6.64) and (6.65) by passing to the limit. Theorem 6.11 equivalent:

(Bochner) 25

(6.65)

For ϕ : R → C, the following two statements are

(i) ϕ is positive semi-definite and continuous at 0.26 (ii) There exists some μ ∈ M+ (R) such that ϕ = μ. ˆ Proof (ii)⇒ (i): For any ξ1 , ξ2 , · · · , ξp ∈ R, z1 , z2 , · · · , zp ∈ C, we have j,k

1 ϕ(ξj − ξk )zj z¯ k = √ 2π 1 = √ 2π

∞

−∞ j,k ∞ −∞

e−iξj x zj · eiξk x z¯ k dμ(x)

2 zj e−iξj x dμ(x)

0. Thus ϕ is positive semi-definite. Its continuity at 0 is obvious. (i)⇒(ii): If ϕ is positive semi-definite, the sequence {ϕ(λn)}n∈Z is positive semi-definite for any λ > 0. Hence, by Theorem 6.4 due to Herglotz, there exists some νλ ∈ M+ (T) such that ϕ(λn) = νˆ λ (n), (ii) immediately follows by Theorem 6.10.

25 Bochner 26 By

at 0.

n ∈ Z.

[1]. μ, which satisfies (ii), is unique by Corollary 6.3. Lemma 6.4, a positive semi-definite function is uniformly continuous on R if it is continuous

6.6 Bochner’s Theorem

151

Corresponding to Theorem 6.4, we can give an alternative proof of Bochner’s theorem, based upon distribution theory. Remark 6.5 Since S(R) is a dense subspace of C∞ (R, C) (space of continuous functions vanishing at infinity), any μ ∈ M (space of Radon measures on R) can be regarded as a tempered distribution; i.e. μ ∈ S(R) . Here arises a question. Does the usual Fourier transform (Fourier–Stieltjes transform) of μ coincide with its Fourier transform in the sense of distribution? Let us evaluate the Fourier transform μ(θ ˆ ) in the sense of distribution for θ ∈ S(R): 0 1

1 −iξ x ˆ μ(θ ˆ ) = μ(θ ) = μ √ θ (x)e dx 2π R 1 = √ 2π

0

R

R

1 θ (x)e−iξ x dx dμ(ξ )

0 1

1 θ (x) √ e−iξ x dμ(ξ ) dx 2π R R

=

=

R

θ (x)μ(ξ ˆ )dx.

(μ(ξ ˆ )is the usual Fourier transform of μ.) This equality holds good for any θ ∈ S(R). Consequently, we conclude that the two definitions are identical. As in Theorem 6.11, (ii)⇒(i) is easy and well-known. So we have only to prove (i)⇒(ii).27 Assume (i). Then ϕ is bounded since |ϕ(x)| ϕ(0)

for all

x∈R

(6.66)

by Lemma 6.3. Hence ϕ defines a tempered distribution (∈ S(R )). We denote by ϕˇ ∈ S(R) the inverse Fourier transform (as a distribution). By Parseval’s theorem, we have28 ϕ(s) = (ϕ)ˆ(s) ˇ = ϕ(ˆ ˇ s ),

27 The

for

s ∈ S(R).

basic ideas are given in Lax [6], pp. 144–146. See also Maruyama [10]. ϕ denotes the tempered distribution defined by the bounded function ϕ. Hence

ϕ(s) = ϕ(x)s(x)dx, s ∈ S(R).

28 Here

R

(6.67)

152

6 Fourier Transforms of Measures

The positive semi-definiteness of ϕ implies (by Lemma 6.4)

R R

ϕ(x − y)θ (x)θ (y)dxdy 0

θ ∈ D(R),

for any

which can be rewritten as (changing variables: z = x − y)

R R

ϕ(z)θ (x)θ (x − z)dxdz 0.

(6.68)

If we define the function Θ ∈ D(R) by

Θ(z) =

θ (x)θ (x − z)dx,

R

(6.69)

(6.68) can be rewritten as

R

(6.66 )

ϕ(z)Θ(z)dz 0.

By (6.69), 1 ˆ Θ(w) = √ 2π 1 = √ 2π

0

R

R

0

R

R

1

θ (x)θ (x − z)dx e−iwz dz 1

θ (x)θ (u)dx e−iwx eiwu du

(changing variables : u = x − z) 1 = √ 2π = θˆ (w) · =

R

θ (x)e−iwx dx ·

R

iwu ¯ θ(u)e du

√ ¯ 2π θˆ (w)

√ 2π |θˆ (w)|2 .

It follows that 1 ˆ |θˆ (w)|2 = θˆ (w)θ¯ˆ (w) = √ Θ(w). 2π

(6.70)

(6.67), (6.66 ) and (6.70) imply ˆ = ϕ( ˇ Θ) ˇ ϕ(Θ) = ϕ( (6.67)

(6.70)

√

ˆ 2) 2π |θ|

(6.66 )

0.

(6.71)

6.6 Bochner’s Theorem

153

The inequality (6.71) holds good for any θ ∈ D(R). Since D(R) is dense in S(R), (6.71) is also valid for any θ ∈ S(R). Let p(w) 0√be an element of D(R) (space of smooth functions with compact √ ˆ support). Then p(w) is also an element of D(R) ⊂ S(R). Using p as |θ| 29 in (6.71), we obtain ϕ(p) ˇ 0

0 p ∈ D(R).

for any

(6.72)

The result (6.72) tells us that ϕˇ is a positive distribution, and so it is a positive measure; i.e.

θ (x)dμ, θ ∈ D(R) (6.73) ϕ(θ ˇ )= R

for some positive measure μ. (cf. Appendix C, pp. 391–392.) Finally, we claim that μ(R) < ∞. Let g be an element of D(R) which is nonnegative and satisfies g(x) = 1 on

[−1, 1].

(6.74)

Define a function gn by gn (x) = g(x/n). Let G and Gn be the inverse Fourier transforms of g and gn , respectively. Taking account of Gn (y) = nG(ny), we apply (6.67) to get

ϕ(g ˇ n) =

R

(6.75)

ϕ(y)nG(ny)dy.

Since ϕˇ is a positive distribution, the properties (g 0 and (6.74)) of g imply

ϕ(g ˇ n)

[−n,n]

dμ.

Furthermore, it follows from (6.66) that

ϕ(y)nG(ny) ϕ(0) n|G(ny)|dy = ϕ(0) |G(y)|dy. R

R

R

(6.76)

The right-hand side of (6.76) is independent of n. If we write

C = ϕ(0)

29 We

R

|G(y)|dy,

note that there exists a unique function in S(R), the Fourier transform of which is just

√

p.

154

6 Fourier Transforms of Measures

we obtain

[−n,n]

dμ C,

and so μ(R) C. Looking at (6.73), we can conclude that ϕ is the Fourier transform of the distribution defined by μ; i.e. ϕ(s) = μ(s) ˆ for

s ∈ D(R).

This equality holds good for all s ∈ S(R).

6.7 Convolutions of Measures Let G be either T or R.30 As a final topic of this chapter, we now proceed to N. Wiener’s theory to evaluate the discrete part of a Radon measure. M(G) denotes the space of Radon measures on G. Its subset Md (G) is defined by ∞ ∞ Md (G) = μ ∈ M(G) μ = αn δxn , |αn | < ∞, αn ∈ C . n=1

n−1

That is, μ ∈ Md (G) is a complex measure with finite total variation concentrating at a countable set {xn }, each element of which is assigned a mass αn .

30 I

repeat again the exposition of the concept of the torus. A binary relation ∼ on R is defined by x ∼ y ⇐⇒ x − y ∈ 2π Z.

∼ is an equivalence relation, and any two numbers are equivalent if and only if the difference between them is a multiple of 2π . Any real number has its equivalent number in [−π, π ). Given a function f : [−π, π ) → C, we may not be able to define f (x + y), since x + y may not be in [−π, π ). However, there exists some z ∈ [−π, π ) which is equivalent to x + y. We understand f (x + y) as f (z). A similar convention applies to the Dirac measures on [−π, π ). That is, we understand δx+y as δz . More generally, we can consider f (x1 + x2 + · · · + xn ) or δx1 +x2 +···+xn in the same way. When we consider a function on R rather than [−π, π ), such a convention is not necessary. Exactly speaking, we consider a function defined on the quotient group R/2π Z modulo 2π Z. This is called torus and denoted by T. cf. Appendix A. We are indebted to Malliavin [7] Chap. III, §1 for the expositions in this section.

6.7 Convolutions of Measures

155

The convolution δx1 ∗ δx2 of a pair of Dirac measures δx1 and δx2 is defined by δx1 ∗ δx2 = δx1 +x2 . Generalizing this concept, we obtain the following definition. Definition 6.5 Let μ and ν be elements of M(G) of the form μ= ν=

∞

∞

αn δxn ,

n=1

n=1

∞

∞

βn δyn ,

n=1

|αn | < ∞, (6.77) |βn | < ∞.

n=1

Then the convolution μ ∗ ν of μ and ν is defined by μ∗ν =

(6.78)

αn βm δxn +ym .

n,m

μ ∗ ν is, of course, an element of Md (G). We list some elementary properties of the convolution, the verification of which is easy. 1◦ μ ∗ ν = ν ∗ μ. 2◦ (μ ∗ ν) ∗ θ = μ ∗ (ν ∗ θ ). 3◦ (μ + ν) ∗ θ = μ ∗ θ + ν ∗ θ. 4◦ λ(μ ∗ ν) = (λμ) ∗ ν = μ ∗ (λν) for any λ ∈ R. 5◦ μ ∗ ν μ · ν . In summary, Md (G) is a commutative normed algebra31 with respect to the convolution operation ∗. Lemma 6.6 Let θ be the convolution of μ and ν ∈ Md (G). Then

f (x)dθ = G

f (x + y)dμ(x)dν(y)

for all

G G

f ∈ C∞ (G, C).

(f ∈ C(G, C) in the case of G = [−π, π )). Proof If μ and ν are of the form (6.77), θ is given by θ=

αn βm δxn +ym .

n,m

31 See

Naimark [11], Lax [6] Chaps. 17–20 and Maruyama [8] Chap. 7 for normed algebras.

156

6 Fourier Transforms of Measures

For f ∈ C∞ (G, C), we have

f (x + y)dμ(x)dν(y) = G G

f (xn + ym )αn βm =

f (x)dθ. G

n,m

We now extend the convolution to the entire space M(G). Given a pair of μ and ν ∈ M(G), a linear functional Λμ,ν ∈ C∞ (G, C) is defined by

Λμ,ν (f ) =

f (x + y)dμ(x)dν(y), G G

f ∈ C∞ (G, C).

Λμ,ν ∈ C∞ (G, C) , since we have the evaluation

|Λμ,ν (f )|

|f (x + y)|d|μ|(x)d|ν|(y) G G

f ∞

d|μ|(x)d|ν|(y) G G

f ∞ μ

ν . So there exists some unique θ ∈ M(G) such that

Λμ,ν (f ) =

f (x)dθ, G

f ∈ C∞ (G, C)

by the Riesz–Markov–Kakutani theorem. We call this θ the convolution of μ and ν and denote it by θ = μ ∗ ν. Similarly to the case of Md (G), we can show the following properties (i)–(v) of the convolution on M(G). Furthermore, (vi) can also be verified in addition to them. Theorem 6.12 Let μ, ν, θ ∈ M(G). Then the following relations hold good: μ ∗ ν = ν ∗ μ. (μ ∗ ν) ∗ θ = μ ∗ (ν ∗ θ ). (μ + ν) ∗ θ = μ ∗ θ + ν ∗ θ . λ(μ ∗ ν) = (λμ) ∗ ν = μ ∗ (λν) for any λ ∈ C.

μ ∗ ν μ · ν . Hence M(G) is a Banach algebra with respect to the operation ∗. (vi) If two sequences {μn }, {νn } in M(G) converge to μ0 and ν0 respectively in w ∗ -topology, then {μn ∗ νn } also converges to μ0 ∗ ν0 in w ∗ -topology.

(i) (ii) (iii) (iv) (v)

The following basic result holds good for the Fourier transform μ ∗ ν of the convolution of μ and ν ∈ M(G). μˆ and νˆ are Fourier transforms of μ and ν, respectively.

6.7 Convolutions of Measures

157

Theorem 6.13 For μ and ν ∈ M(G), μ ∗ν =

√ 2π μˆ · νˆ .

Proof Writing θ = μ ∗ ν, we obtain 1 θˆ (t) = √ 2π

1 e−itx dθ (x)= √ 2π G

e−it (x+y) dμ(x)dν(y)=

√ 2π μ(t)ˆ ˆ ν (t).

G G

A measure contained in Md (G) is called a discrete measure. On the other hand, μ ∈ M(G) is said to be continuous if μ({x}) = 0 for all x ∈ G. It is well-known that any μ ∈ M(G) can be represented as a sum of a discrete measure μd and a continuous measure μc ; i.e. μ = μd + μc . Such a decomposition is determined uniquely.32 We list some elementary rules of the operation ∗ on M(G): 1◦ If μ ∈ M(G) is continuous, μ ∗ ν is also continuous for all ν ∈ M(G). 2◦ For μ ∈ M(G), we define33 μ ∈ M(G) by μ (E) = μ(−E). If μ = μc + μd , then

μ = μc + μd . 3◦

μ ∗ μ = (μc ∗ μc + μc ∗ μd + μd ∗ μc ) + μd ∗ μd . 789: 789: continuous

4◦ If μd =

discrete

aj δτj , then

j

μd =

a¯j δ−τj .

j

bounded Borel measure μ on G can be represented as μ = μa + μs + μd , where μa m (Lebesque measure) μs is continuous and μs ⊥ m, and μd is discrete. Such a decomposition is uniquely determined. cf. Igari [4] p. 137. 33 μ * = μ. ¯ˆ 32 Any

158

6 Fourier Transforms of Measures

Consequently, (μ ∗ μ )({0}) =

|aj |2 .

j

6.8 Wiener’s Theorem We first discuss Wiener’s theorem on [−π, π ) and then go over to the case of R.34 The following lemma is a simple consequence of 4◦ above. Lemma 6.7 For any μ ∈ M(G), |μ(τ )|2 = (μ ∗ μ )({0}), where the left-hand side is the sum of countable τ ’s on which the mass of the discrete μd concentrates. Hence μ is continuous if and only if (μ ∗ μ )({0}) = 0. Theorem 6.14 For any μ ∈ M([−π, π )) and τ ∈ [−π, π ), √

N 2π μ(j ˆ )eij τ . N →∞ 2N + 1

μ({τ }) = lim

j =−N

Proof We start by showing some preparatory calculations. Define ϕN (t) =

N

eij t ,

t ∈ [−π, π ).

(6.79)

j =−N

If t = 0, ϕN (t) = 2N + 1. If, on the contrary, t 0,35 N j =0

34 We 35

eij t =

ei(N +1)t − 1 . eit − 1

owe this to Katznelson [5] Chap. I, §7 and Chap. VI, §2.

(6.80)

6.8 Wiener’s Theorem

159

Similarly, we have 0

eij t =

j =−N

ei(N +1)t − 1 . eit − 1

(6.81)

It follows from (6.80) and (6.81) that ϕN (t) =

2 cos(N + 1)t − 2 − 1. eit − 1

(6.82)

Taking account of t 0, choose a sufficiently small ε > 0 such that |t| > ε. If we define α = Min |eiθ − 1|, |θ|>ε

we obtain |ϕN (t)|

4 +1 α

(6.83)

by (6.82). By the above results, we get by passing to the limit N → ∞ + 1 1 if ϕN (t) → 2N + 1 0 if

t = 0, t 0.

(Note that ϕN (0) = 2N + 1.) Of course 1 ϕ (t) 2N + 1 N 1.

(6.84)

(6.85)

If we write ϕN (τ − t) =

N j =−N

eij (τ −t) =

N

eij τ e−ij t ,

j =−N

we obtain + 1 1 if t = τ, ϕN (τ − t) → 2N + 1 0 if t τ, 1 1 ϕ (τ − t) 2N + 1 N

(6.84 )

(6.85 )

160

6 Fourier Transforms of Measures

by the above calculations. Hence by the dominated convergence theorem,

lim

π

N →∞ −π

1 ϕN (τ − t)dμ(t) = 2N + 1

π −π

χ{τ } dμ = μ({τ }).

(6.86)

The relation (6.86) can be rewritten as √ N 2π μ(j ˆ )eij τ = μ({τ }) lim N →∞ 2N + 1 j =−N

by definition of ϕN . Theorem 6.15 (Wiener) For μ ∈ M([−π, π )),

N 2π |μ(j ˆ )|2 . N →∞ 2N + 1

|μ({τ })|2 = lim

j =−N

If μ is continuous, in particular, then the limit appearing on the right-hand side is 0. Proof By Lemma 6.7, we have

|μ({τ })|2 = (μ ∗ μ )({0}) √

N 2π = lim (μ ∗ μ )(j )eij 0 N →∞ 2N + 1 .

j =−N

N 2π , (j ) μ(j ˆ )μ N →∞ 2N + 1

= lim

(Theorem 6.13)

j =−N

N 2π μ(j ˆ )μ(j ˆ ) (by footnote 33 on p. 157) N →∞ 2N + 1

= lim

j =−N

N 2π |μ(j ˆ )|2 . N →∞ 2N + 1

= lim

j =−N

It is not hard to transform the above results so as to fit with the case M(R). Define a function ϕN by

ϕn (t) =

n

−n

eiξ t dξ,

t ∈ R.

(6.87)

6.8 Wiener’s Theorem

161

This is clearly an analogue of (6.79). Passing to the limit n → ∞,36 we have + 1 1 ϕn (t) → 2n 0

if t = 0, if t 0

(6.88)

and 1 ϕn (t) 1. 2n

(6.89)

If we write

ϕn (τ − t) =

n

−n

eiξ(τ −t) dξ,

we have + 1 1 ϕn (τ − t) → 2n 0

if t = τ, if t τ,

(6.88 )

by the above result (n → ∞), and 1 ϕn (τ − t) 1. 2n

(6.89 )

By the dominated convergence theorem,

∞

lim

n→∞ −∞

36 Assume

1 ϕn (τ − t)dμ(t) = 2n

∞

−∞

χ{τ } dμ = μ({τ }).

t 0. Then ϕn (t) =

1 iξ t n 1 2 e = (eint − e−int ) = sin nt. it it t −n

Hence 1 1 ϕn (t) = sin t. 2n nt We obtain (6.88) as n → ∞.

(6.90)

162

6 Fourier Transforms of Measures

Since 1 2n

∞

1 ϕn (τ − t)dμ(t) = 2n −∞

∞

n

eiξ τ · e−iξ t dξ dμ(t)

−∞ −n

∞

n iξ τ

1 e e−iξ t dμ(t)dξ 2n −n −∞ √ n 2π = μ(ξ ˆ )eiξ τ dξ 2n −n

=

by the definition of ϕn , we can rewrite (6.90) as √ n 2π lim μ(ξ ˆ )eiξ τ dξ = μ({τ }). n→∞ 2n −n

(6.90 )

This is the result which corresponds to Theorem 6.14: Theorem 6.14 For any μ ∈ M(R) and τ ∈ R, we have √ n 2π iξ τ lim μ(n)e ˆ dξ = μ({τ }). n→∞ 2n −n It follows from Theorem 6.14 that |μ({τ })|2 = (μ ∗ μ )({0}) (by Lemma 6.7) √ n 2π (μ ∗ μ )(ξ )eiξ 0 dξ = lim n→∞ 2n −n

2π n , (ξ )dξ μ(ξ ˆ )μ = lim n→∞ 2n −n

π n |μ(ξ ˆ )|2 dξ (by footnote 33 on p. 157). = lim n→∞ n −n Summing up: Theorem 6.16 (Wiener) For μ ∈ M(R),

π n→∞ n

|μ({τ })|2 = lim

n

−n

|μ(ξ ˆ )|2 dξ.

If μ is continuous, the limit appearing on the right-hand side is 0.37

37 The

θn +n

upper and lower ends of the integration can be θn

for any sequence {θn }.

(6.91)

References

163

References 1. Bochner, S.: Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Math. Ann. 108, 378–410 (1933) 2. Carathéodory, C.: Über den Variabilitätsbereich der Koefficienten von Potenzreihen die gegebene Werte nicht annemen. Math. Ann. 54, 95–115 (1907) 3. Herglotz, G.: Über Potenzreihen mit positiven reellen Teil in Einheitskreise. Berichte Verh Säcks. Akad. Wiss. Leipzig. Math.-phys. Kl. 63, 501–511 (1911) 4. Igari, S.: Jitsu-kaiseki Nyumon (Introduction to Real Analysis). Iwanami Shoten, Tokyo (1996) (Originally published in Japanese) 5. Katznelson, Y.: An Introduction to Harmonic Analysis, 3rd edn. Cambridge University Press, Cambridge (2004) 6. Lax, P.D.: Functional Analysis. Wiley, New York (2002) 7. Malliavin, P.: Integration and Probability. Springer, New York (1995) 8. Maruyama, T.: Kansu Kaisekigaku (Functional Analysis). Keio Tsushin, Tokyo (1980) (Originally published in Japanese) 9. Maruyama, T.: Sekibun to Kansu-kaiseki (Integration and Functional Analysis). Springer, Tokyo (2006) (Originally published in Japanese) 10. Maruyama, T.: Herglotz-Bochner representation theorem via theory of distributions. J. Operations Res. Soc. Japan 60, 122–135 (2017) 11. Naimark, M.A.: Normed Algebras. Wolters Noordhoff, Groningen (1972) 12. Schwartz, L.: Théorìe des distributions. Nouvelle édition, entièrement corrigée, refondue et augmentée, Hermann, Paris (1973) 13. Toeplitz, O.: Über die Fouriersche Entwickelung positive Funktionen. Rend. di Cinc. Mat. di Palermo 32, 191–192 (1911)

Chapter 7

Spectral Representation of Unitary Operators

The main topic of this chapter is the spectral representation of unitary operators as well as one-parameter groups of unitary operators. That is, the problem is how to represent such objects by certain analogues of Fourier transforms. We are going to describe theories based upon the Herglotz–Bochner theorem already discussed in Chap. 6. However, the Herglotz–Bochner theorem can be conversely deduced from the spectral representation theorem of unitary operators. The results obtained here will play a crucial role in the spectral representation theory of weakly stationary stochastic processes to be discussed in the next chapter.

7.1 Lax–Milgram Theorem Linear spaces appearing in the following are basically complex ones. We have to add one more important fundamental result to the exposition on Hilbert spaces discussed in Chap. 1. Definition 7.1 Let X be a vector space. A function Φ : X × X → C is called a sesquilinear functional if it satisfies the following conditions, where x, x , y, y are any elements of X and α ∈ C: (i) (ii) (iii) (iv)

Φ(x + x , y) = Φ(x, y) + Φ(x , y). Φ(x, y + y ) = Φ(x, y) + Φ(x, y ). Φ(αx, y) = αΦ(x, y). Φ(x, αy) = αΦ(x, ¯ y).

Furthermore, if Φ satisfies Φ(x, y) = Φ(y, x), it is said to be skew-symmetric. © Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_7

165

166

7 Spectral Representation of Unitary Operators

An inner product ·, · on a Hilbert space is clearly a skew-symmetric sesquilinear functional.1 Remark 7.1 Φ(x, x) is not necessarily real for a sesquilinear functional Φ. However, if Φ is skew-symmetric, in addition, Φ(x, x) must be real. This concept permits us to generalize the Riesz theorem (Theorem 1.1) to represent duals of Hilbert spaces.2 Theorem 7.1 (Lax–Milgram) Let H be a Hilbert space. Suppose that a function B : H × H → C is a sesquilinear functional which satisfies the following two conditions: (i) There exists a constant α > 0 such that |B(x, y)| α x · y

for all x, y ∈ H.

(ii) There exists a constant β > 0 such that |B(y, y)| β y 2

for all y ∈ H.

Then there exists a unique yΛ ∈ H for each Λ ∈ H which satisfies Λ(x) = B(x, yΛ ) for all

x ∈ H.

Proof For a fixed y ∈ H, the function x → B(x, y) is a bounded linear functional on H. By Riesz’s theorem, there exists some unique z ∈ H such that B(x, y) = x, z

for all

x ∈ H.

(7.1)

If we write Ay = z, (7.1) can be rewritten as B(x, y) = x, Ay,

1 However,

x, y ∈ H.

(7.2)

we have to remember that an inner product must satisfy one more requirement, x, x 0;

x, x = 0 ⇔ x = 0,

besides skew-symmetric sesquilinearity. 2 The Lax–Milgram theorem has a lot of applications. Among them, its application to the existence theorem of solutions for elliptic partial differential equations is most peculiar. See Evans [4], Chap. 6.

7.1 Lax–Milgram Theorem

167

A : H → H is a bounded linear operator. It can be verified as follows. In fact, A is linear, since we have, for any λ1 , λ2 ∈ C and y1 , y2 ∈ H, x, A(λ1 y1 + λ2 y2 ) = B(x, λ1 y1 + λ2 y2 )

(by (7.2))

= λ¯ 1 B(x, y1 ) + λ¯ 2 B(x, y2 )

(by sesquilinearity)

= λ¯ 1 x, Ay1 ) + λ¯ 2 x, Ay2 = x, λ1 Ay1 + λ2 Ay2 for each x ∈ H. Furthermore,

Ay 2 = Ay, Ay = B(Ay, y) α Ay · y . (7.2)

(i)

In the case of Ay 0, we obtain

Ay α y

for all

y∈H

by dividing both sides by Ay . If Ay = 0, the same result holds good obviously. Hence A is a bounded linear operator. We next show that A is injective. In fact β y 2 |B(y, y)| = |y, Ay| y · Ay

by (ii). It follows immediately that β y Ay .

(7.3)

This proves that A is injective. The inequality (7.3) implies that the image A(H) of A is a closed subspace of H. Assume that Ayn → z

as

n→∞

for a sequence {yn } in H. Then {yn } is Cauchy, since β yn − ym Ayn − Aym

by (7.3). Therefore {yn } has a limit y0 ∈ H. Since A is continuous, we have Ayn → Ay0

as

n → ∞.

168

7 Spectral Representation of Unitary Operators

By the uniqueness of the limit, we must have z = Ay0 and hence z ∈ A(H). This proves the closedness of A(H).3 However, it holds good actually that A(H) = H.

(7.4)

Suppose that A(H) H to the contrary. Then there exists some x ∈ A(H)⊥ \ {0}. But it follows that x = 0, since β x 2 |B(x, x)| = x, Ax = 0. Contradiction. Applying Riesz’s theorem again to Λ ∈ H , there exists a unique zΛ ∈ H such that Λ(x) = x, zΛ

x ∈ H.

for all

(7.5)

By (7.4), there exists some yΛ ∈ H such that zΛ = AyΛ . Hence B(x, yΛ ) = x, AyΛ = x, zΛ = Λ(x), that is, Λ(x) = B(x, yΛ )

x ∈ H.

for all

∈ H Finally, we show the uniqueness of yΛ . Suppose that there is another yΛ which satisfies ) Λ(x) = B(x, yΛ ) = B(x, yΛ

for all

x ∈ H.

Then ) = 0 for all B(x, yΛ − yΛ

3 There

x ∈ H.

is an alternative way to prove z = Ay0 . Since (7.2) implies B(x, yn ) = x, Ayn for each fixed

x ∈ X,

we have B(x, y0 ) = x, z for all x ∈ H by passing to the limit. Hence z = Ay0 .

7.2 Conjugate Operators and Projections

169

By (ii), we obtain 2

|B(yΛ − yΛ , yΛ − yΛ )| = 0. β yΛ − yΛ . This proves yΛ = yΛ

Remark 7.2 The skew-symmetricity of B is not assumed in Theorem 7.1. If it is assumed, B itself is an inner product. Consequently, the Lax–Milgram theorem immediately follows through the Riesz theorem applied to B. An important point about the Lax–Milgram theorem is that the skew-symmetricity is not required. Let Φ(x, y) be a sesquilinear functional on H. It is said to be bounded if |Φ(x, y)| α x · y

for all

x, y ∈ H

for some α > 0. The set of all the bounded sesquilinear functionals forms a (complex) vector space under usual operations and a norm on this space is given by

Φ = sup x0 y0

|Φ(x, y)| .

x · y

7.2 Conjugate Operators and Projections We denote by L(H) the space of bounded linear operators (H → H) on a complex Hilbert space H. Fixing T ∈ L(H), we define a function Φ : H × H → C by Φ(x, y) = T x, y.

(7.6)

Then Φ is clearly a sesquilinear functional on H and it satisfies |Φ(x, y)| T · x · y

and so T . On the other hand, since

T x 2 = T x, T x = Φ(x, T x) · x · T x ,

T x Φ · x if T x 0. Hence T Φ . Thus we prove that

T = Φ .

(7.7)

It should be observed that any bounded sesquilinear functional on H can be represented in the form (7.6). It is easily shown by appealing to the Riesz theorem.

170

7 Spectral Representation of Unitary Operators

Theorem 7.2 For any bounded sesquilinear functional Φ on H, there exists uniquely some T ∈ L(H) such that Φ(x, y) = T x, y

x, y ∈ H.

Furthermore, T = Φ . Proof Fixing x ∈ H, we define Φx (y) = Φ(x, y). Then Φx (y) ∈ H . Hence by the Riesz theorem, there exists uniquely some T x ∈ H such that Φx (y) = y, T x,

y ∈ H.

The operator T : x → T x thus defined is clearly linear,4 and T = Φ by (7.7). Theorem 7.3 For any T ∈ L(H), there exists uniquely some T ∗ ∈ L(H) which satisfies T x, y = x, T ∗ y,

x, y ∈ H.

It holds good that T = T ∗ . Proof Define a couple of functionals, Φ and : H × H → C, by Φ(x, y) = T x, y,

(x, y) = Φ(y, x) = T y, x.

These are bounded sesquilinear functionals and

Φ = = T . By Theorem 7.2, there exists uniquely some T ∗ ∈ L(H) such that (x, y) = T ∗ x, y,

x, y ∈ H,

and

T ∗ = = T . This T ∗ satisfies T x, y = Φ(x, y) = (y, x) = T ∗ y, x = x, T ∗ y,

x, y ∈ H.

x1 , x2 ∈ H and α1 , α2 ∈ C. Then, for any y ∈ H, y, α1 T x1 + α2 T x2 = α¯1 y, T x1 + α¯2 y, T x2 = α¯1 Φ(x1 , y) + α¯2 Φ(x2 , y) = Φ(α1 x1 + α2 x2 , y) = y, T (α1 x1 + α2 x2 ). This implies α1 T x1 + α2 T x2 = T (α1 x1 + α2 x2 ).

4 Let

7.2 Conjugate Operators and Projections

171

Definition 7.2 The operator T ∗ defined by Theorem 7.3 is called the conjugate operator or adjoint operator of T .5 The elementary properties of conjugate operators are as follows6 : 1◦ T ∗∗ = T . 2◦ For S, T ∈ L(H) and α, β ∈ C, ¯ ∗. ¯ ∗ + βT (αS + βT )∗ = αS 3◦ For S, T ∈ L(H), (ST )∗ = T ∗ S ∗ . 4◦ T 2 = T ∗ T . Example 7.1 Any T ∈ L(Cn ) can be represented by a matrix (tij ). The conjugate operator T ∗ of T is represented by the conjugate transposed matrix of (tij ). Example 7.2 Consider the Hilbert space L2 ([0, 1], C). Let K(x, y) be a measurable function of [0, 1] × [0, 1] into C such that 1 1

0

|K(x, y)|2 dxdy < ∞.

0

If we define an operator T on L2 ([0, 1]2 , C) by

1

(Tf )(x) =

K(x, y)f (y)dy, 0

then T is a bounded linear operator. The dual operator T ∗ of T is given by

∗

1

(T f )(y) =

K(x, y)f (x)dx. 0

If we define a function K ∗ : [0, 1] × [0, 1] → C by K ∗ (y, x) = K(x, y),

abstract theory of linear operators on a Hilbert space (theory of C ∗ -algebra), see Arveson [1], Diximier [2] and Maruyama [7] Chap. 7. 6 For instance, 4◦ can be proved as follows: T ∗ T T ∗ · T = T 2 . On the other hand, 5 For

T x 2 = T x, T x = T ∗ T x, x T ∗ T · x 2 for all x ∈ H. This implies T 2 T ∗ T .

172

7 Spectral Representation of Unitary Operators

then T ∗ can also be expressed as

∗

1

(T f )(y) =

K ∗ (y, x)f (x)dx.

0

Example 7.3 The Fourier transform on L2 (R, C) in the sense of Plancherel is denoted by F2 . By Theorem 4.3 (p. 72), F2∗ = F2−1

(inverse Fourier transform).

Definition 7.3 An operator T ∈ L(H) is called a symmetric or Hermitian operator if T = T ∗ .7 Example 7.4 In Example 7.1 above, T = T ∗ if tij = tj i . In Example 7.2, T = T ∗ if K(x, y) = K(x, y) a.e. x, y. Theorem 7.4 The following statements are equivalent for T ∈ L(H): (i) T is symmetric. (ii) The sesquilinear operator Φ defined by Φ(x, y) = T x, y (x, y ∈ H) is skew-symmetric. (iii) T x, x is real for all x ∈ H. Proof (i)⇔(ii):

This is obvious because Φ(x, y) = Φ(y, x)

for all

x, y ∈ H

is equivalent to T x, y = T y, x = y, T ∗ x = T ∗ x, y (i)⇔(iii):

for all

x, y ∈ H.

If T is symmetric, x, T x = T x, x = x, T x

for any x ∈ H. Hence T x, x is real. Conversely, assume that T x, x is real for all x ∈ H. Then x + y, T (x + y) = x, T x + y, T y + x, T y + y, T x for any x, y ∈ H. Then the left-hand side and the first two terms on the right-hand side are real. So x, T y + y, T x ≡ α 7 For

(7.8)

the concept of a self-adjoint operator which is closely related with that of a symmetric operator, see Lax [6] Chap. 32.

7.2 Conjugate Operators and Projections

173

is also real. Substituting iy instead of y, we observe similarly that ix, T y − iy, T x ≡ β

(7.9)

is real. By (7.8) and (7.9), we have 2x, T y = α − iβ, 2y, T x = α + iβ,

(7.10) α, β ∈ R.

(7.11)

A simple calculation 2T x, y = 2y, T x = α − iβ = 2x, T y (7.11)

(7.10)

gives a conclusion T x, y = x, T y,

x, y ∈ H,

that is, T = T ∗ .

Remark 7.3 We used the imaginary unit i in the course of the proof (iii)⇒(i). This is an essential point. This argument can not be applied to a real Hilbert space. In fact, T x, x is always real in a real Hilbert space. However, a linear operator is not necessarily symmetric.8 Combining Theorems 7.2 and 7.4, we get the following corollary. Corollary 7.1 For any bounded and skew-symmetric sesquilinear functional Φ : H → C, there exists uniquely some symmetric operator T ∈ L(H) such that Φ(x, y) = T x, y,

x, y ∈ H.

Furthermore, it holds good that T = Φ . Let M be a closed subspace of a Hilbert space H. We have already seen the orthogonal decomposition; that is, any z ∈ H can be uniquely represented as z = x + y,

x ∈ M,

y ∈ M⊥ .

The linear operator P : H → M which associates each z ∈ H with x in the decomposition is called a projection or projection operator of H into M. We list a few elementary properties of a projection.

8 This

remark is due to Kato [5] p. 257.

174

7 Spectral Representation of Unitary Operators

1◦ Let P be the projection of H into the closed subspace M. M1 and M2 are defined by M1 = {x ∈ H|P x = x},

M2 = {P x|x ∈ H}.

Then M = M1 = M2 . 2◦ Let P be the projection of H into the closed subspace M. If P x = x for x ∈ H, then P x = x. Proof Since x = P x + (x − P x), P x ∈ M and x − P x ∈ M⊥ , we have

x 2 = P x 2 + x − P x 2 . By assumption P x = x . Hence x − P x = 0.

Theorem 7.5 Let H be a Hilbert space and M its closed subspace. The projection P : H → M satisfies the following properties: (i) P 2 = P (ii) P ∗ = P

(idempotency), (symmetricity).

If M {0}, then P = 1. Proof Let z = x + y (x ∈ M, y ∈ M⊥ ) be the orthogonal decomposition of z ∈ H. P 2 = P follows from P 2 z = P x = x = P z. Since

P z 2 = x 2 x 2 + y 2 = z 2 , we obtain P 1. In the case of M {0}, there exists some z ∈ M \ {0}. We must have P z = z. Hence it follows that P = 1. Finally, we show P = P ∗ . Decompose x, y ∈ H in the form x = x1 + x2 ,

x1 ∈ M,

x2 ∈ M⊥ ,

y = y1 + y2 ,

y1 ∈ M,

y2 ∈ M⊥ .

P x, y can be calculated as P x, y = x1 , y = x1 , y1 = x, y1 = x, P y. This holds good for any x, y ∈ H. Hence P = P ∗ . The next theorem is the converse assertion of Theorem 7.5.

7.3 Unitary Operators

175

Theorem 7.6 Assume that P ∈ L(H) satisfies (i) P 2 = P and (ii) P ∗ = P . Then P is the projection of H into M = {x ∈ H|P x = x}. Proof By (i), we obtain P (P x) = P x

(7.12)

for all x ∈ H. We also obtain, by (ii), y, x − P x = y, x − P y, x = 0

(7.13)

for all x ∈ H and y ∈ M. It follows from (7.12) and (7.13) that P x ∈ M,

x − P x ∈ M⊥ .

Taking account of x = P x + (x − P x),

we get the desired result.

In sum, a projection operator on a Hilbert space can be characterized as an idempotent and symmetric operator.

7.3 Unitary Operators We start with the definition and some properties of unitary operators. Definition 7.4 An operator T ∈ L(H) is called a unitary operator if it satisfies T T ∗ = T ∗ T = I (identity operator). The Fourier transform and the inverse Fourier transform are typical examples of this category. The set of unitary operators in L(H) forms a group with respect to the composition of operators. It is called the unitary group. Theorem 7.7 The following three statements are equivalent for an operator T ∈ L(H): (i) T is a unitary operator. (ii) T (H) = H and T x, T y = x, y for all x, y ∈ H.

176

7 Spectral Representation of Unitary Operators

(iii) T (H) = H and

T x = x

for all x ∈ H. Proof (i)⇒(ii): Assume T is unitary. Then T is invertible and T ∗ is the inverse, since T T ∗ = I by definition. Hence T is a bijection. By a simple calculation, we have T x, T y = x, T ∗ T y = x, y for all

x, y ∈ H.

(ii)⇒(iii): Obvious. (iii)⇒(i): Assuming (iii), we obtain T ∗ T x, x = T x, T x = T x 2 = x 2 = x, x for all x ∈ H; i.e. (T ∗ T − I )x, x = 0 for all

x ∈ H.

(7.14)

Hence T ∗ T = I . (See the remark below.) Since T is a bijection of H onto itself by assumption, there exists the inverse operator T −1 of T . Thus we must have T −1 = T ∗ , which implies T ∗ T = T T ∗ = I. T ∈ L(H) is unitary if and only if it is an isometric isomorphism on H. Remark 7.4 The deduction of T ∗ T = I from (7.14) in the proof of Theorem 7.7 is based upon the following general principle: Sx, x = 0 for all

x∈H⇒S=0

(7.15)

for S ∈ L(H). In fact, according to (7.15), S(x + y), x + y = Sx, y + Sy, x = 0,

(7.16)

S(x + iy), x + iy = −i Sx, y + i Sy, x = 0

(7.17)

for x, y ∈ H. Multiplying (7.17) by i and adding it to (7.16), we have 2 Sx, y = 0

for all

x, y ∈ H.

7.4 Resolution of the Identity

177

This implies that Sx = 0 for all

x ∈ H.

In the proof of Theorem 7.7, T ∗ T − I plays the role of S.

7.4 Resolution of the Identity In this section, we discuss a general theory of the resolution of the identity, which is to take part in the spectral representation problem of unitary operators, the main topic in the next section. Definition 7.5 Let (Ω, E) be a measurable space and H a Hilbert space. A mapping E : E → L(H) is called a resolution of the identity on E if it satisfies the following conditions: 1◦ E(∅) = 0, E(Ω) = I. 2◦ E(A) is a projection for each A ∈ E. 3◦ E(A ∩ B) = E(A)E(B) for A, B ∈ E. 4◦ A, B ∈ E, A ∩ B = ∅ ⇒ E(A ∪ B) = E(A) + E(B). 5◦ If we define, for each x, y ∈ H, a function Ex,y : E → C by Ex,y (A) = E(A)x, y , A ∈ E, then Ex,y is a complex measure on (Ω, E). The next result is an easy consequence of the definition. Theorem 7.8 (properties of the resolution) A resolution E of the identity satisfies the following properties, where A, B ∈ E and x ∈ H: (i) Ex,x (A) = E(A)x 2 . (ii) Ex,x is a (positive) measure on (Ω, E) and Ex,x (Ω) = x 2 . (iii) E(A)E(B) = E(B)E(A). (iv) If A ∩ B = ∅, the images of E(A) and E(B) are orthogonal to each other. (v) E is finitely additive.

178

7 Spectral Representation of Unitary Operators

Proof

(i) By 2◦ , E(A) is a projection. Hence Ex,x (A) = E(A)x, x = E(A)2 x, x = E(A)x, E(A)x = E(A)x 2 .

(ii) Obvious by E(Ω) = I and (i). (iii) By 3◦ , E(A)E(B) = E(A ∩ B) = E(B)E(A). (iv) Suppose that A ∩ B = ∅. Then E(A)x, E(B)y = x, E(A)E(B)y (by E(A) = E(A)∗ ) =◦ x, E(A ∩ B)y = x, E(∅)y =◦ x, 0 = 0 3

1

for any x, y ∈ H. (v) is also clear by 4◦ . Remark 7.5 1◦ E is not necessarily countably additive. Let An ∈ E be a sequence of mutually disjoint measurable sets. Does a series ∞

(7.18)

E(An )

n=1

of operators converge strongly? Since each E(An ) is a projection, its norm is 0 or 1 (cf. Theorem 7.5). Hence the sequence of partial sums N

E(An );

N = 1, 2, · · ·

n=1

of (7.18) can not be Cauchy except for the case where all but at most finite terms of (7.18) are zero. This observation shows that the series (7.18) does not strongly converge in general except for some special case. Some more detailed explanation may be required for the above argument. If we define Mn = {x ∈ H|E(An )x = x} for each operator E(An ), then E(An ) is a projection to Mn . In the case of Mn {0}, E(An ) = 1. Suppose that Mn {0} for infinitely many n, and say, Mk0 {0}, m < k0 < n. Consider partial sums m k=1

E(Ak ),

n k=1

E(Ak )

7.4 Resolution of the Identity

179

of (7.18) and suppose that x0 ∈ Mk0 , x0 = 1. Since E(Ak )x0 , E(Ak )x0 = 0, we obtain n 2 n = E(A )x

E(Ak )x0 2 E(Ak0 )x0 = x0 = 1. k 0 k=m+1

k=m+1

n E(Ak ) Hence 1. There exist infinitely many indices which satisfy a k=m+1

similar condition to k0 above. Thus we observe that the sequence of partial sums of (7.18) can not be Cauchy. 2◦ However, E(·)x is σ -additive for any fixed x ∈ H. To show this, we should recall the following general principle.9 Let {xn }be a sequence in a Hilbert space H such that xn ⊥xm for n m. Then the following statements are equivalent: (i) (ii)

∞ n=1 ∞

xn converges strongly.

xn 2 < ∞.

n=1

∞ (iii) xn , y converges for any y ∈ H. n=1

We now go back to 2◦ . Let An ∈ E (n = 1, 2, · · · ) be mutually disjoint measurable sets. By Theorem 7.8 (iv), E(An )x, E(Am )x = 0 if n m. Since E(·)x, y is a complex measure by 5◦ in the definition, we obtain ∞ ∞ E An x, y = E(An )x, y for all n=1

y ∈ H.

n=1

Hence, by (i)⇔(iii) in the general result, ∞

E(An )x

(strong convergence)

n=1

exists. We can rewrite the right-hand side of (7.19) as

9 See

Yosida [9] p. 69; Maruyama [7] p. 168.

(7.19)

180

7 Spectral Representation of Unitary Operators

∞ N E(An )x, y = lim E(An )x, y N →∞

n=1

n=1

= =

lim

N

N →∞

E(An )x, y

(by continuity of inner product)

n=1

∞

E(An )x, y

(by strong convergence of the first term).

n=1

Since ∞ ∞ E An x, y = E(An )x, y n=1

n=1

for any y ∈ H, we obtain E

∞

∞ An x = E(An )x.

n=1

n=1

This proves the σ -additivity of E(·)x.

7.5 Spectral Representation of Unitary Operators Let H be a (complex) Hilbert space and U ∈ L(H) a unitary operator. If we define a sequence {an }n∈Z by an = U n x, x ,

n∈Z

for a fixed x ∈ H, then it is positive semi-definite; that is, for any sequence {zn }n∈Z of complex numbers all of which are zero except finite elements,

an−m zn z¯ m 0.

n,m

It can be verified by10 n,m

10 Note

that U ∗ = U −1 .

an−m zn z¯ m =

n,m

U n−m x, x zn z¯ m

7.5 Spectral Representation of Unitary Operators

181

n m n m = U x, U x zn z¯ m = zn U x, zm U x n,m

n

m

2 = zn U n x 0. n Hence, by Herglotz’s theorem (Theorem 6.4, p. 133), an ’s can be represented as Fourier coefficients of some positive Radon measure mx ∈ M+ (T) on T:

1 an = U n x, x = √ e−inθ dmx (θ ). (7.20) 2π T We have to remark here that the measure mx depends upon x. It is clear that 1 √ mx (T) = x 2 2π

(7.21)

by setting n = 0. We now try to represent U n x, y in a similar way for fixed x, y ∈ H. As is well-known, U n x, y can be expressed as a linear combination of11 U n (x ± y), x ± y ,

U n (x ± iy), x ± iy .

Hence by (7.20), we obtain a representation 1 U x, y = √ 2π

n

where

T

e−inθ dmx,y (θ ),

(7.22)

4mx,y = mx+y − mx−y + imx+iy − imx−iy .

Needless to say, mx,x = mx . Theorem 7.9 Let U ∈ L(H) be a unitary operator. Then there exists a Borel complex measure mx,y for any fixed x, y ∈ H, such that 1 U x, y = √ 2π

n

T

e−inθ dmx,y (θ ).

Such a measure mx,y is determined uniquely. Theorem 7.10 The measure mx,y obtained above has the following properties for any S ∈ B(T): (i) (x, y) → mx,y (S) is sesquilinear. (ii) (x, y) → mx,y (S) is skew-symmetric. 11 U n x, y

= 14 { U n (x + y), x + y − U n (x − y), x − y + i U n (x + iy), x + iy − i U n (x − iy), x − iy }.

182

(iii)

7 Spectral Representation of Unitary Operators √1 |mx,y (S)| 2π

x · y

for any S ∈ B(T).

Proof (i) For α, β ∈ C and x1 , x2 ∈ H, U n (αx1 + βx2 ), y = α U n x1 , y + β U n x2 , y

β α e−inθ dmx1 ,y + √ e−inθ dmx2 ,y = √ 2π T 2π T

1 = √ e−inθ d(αmx1 ,y + βmx2 ,y ), 2π T which implies mαx1 +βx2 ,y = αmx1 ,y + βmx2 ,y . Similarly, we have ¯ x,y2 . ¯ x,y1 + βm mx,αy1 +βy2 = αm (ii) By a simple computation, U n y, x = y, U −n x = U −n x, y

1 1 inθ e dmx,y = √ e−inθ dmx,y . = √ 2π T 2π T Similarly, 1 U n y, x = √ 2π

T

e−inθ dmy,x .

Therefore mx,y = my,x . (iii) Define a function ϕS : H × H → C by ϕS (x, y) = mx,y (S) for fixed S ∈ B(T). ϕS is a skew-symmetric, sesquilinear functional as we have already checked. It is also clear that ϕS (x, x) = mx,x (S) = mx (S) 0 and ϕS (0, 0) = 0 for x = 0. By Schwarz’s inequality,12 S is not necessarily an inner product because “x 0 ⇒ ϕS (x, x) > 0” is not fulfilled in general. But all the requirements other than this are satisfied. Hence we can apply Schwarz’s inequality here. The condition “x 0 ⇒ x, x 0” is not necessary for Schwarz’s inequality.

12 ϕ

7.5 Spectral Representation of Unitary Operators

183

|mx,y (S)| mx (S)1/2 · my (S)1/2 .

(7.23)

The inequality (7.23) implies 1 √ |mx,y (S)| x · y , 2π 1 1 since √ mx (T) = x 2 , √ my (T) = y 2 (by (7.21)) and mx (·) as well as 2π 2π my (·) are measures. Thus ϕS (x, y) is a skew-symmetric sesquilinear functional on H × H which is bounded in the sense of |ϕS (x, y)|

√ 2π x · y .

(7.24)

Thanks to Corollary 7.1, we obtain the next theorem.13 Theorem 7.11 There exists a symmetric operator E(S) ∈ L(H), for each S ∈ B(R), such that 1 √ mx,y (S) = E(S)x, y , 2π

x, y ∈ H.

1 It is determined uniquely and satisfies E(S) = √ ϕS . 2π Theorem 7.12 The operator E : B(R) → L(H) is a resolution of identity on B(R). Proof We have to show 1◦ –5◦ on p. 177. 1◦ Since |mx,y (∅)| = |ϕ ∅ (x, y)| = 0

for all

x, y ∈ H

by (7.23), it holds true that E(∅) = 0. Setting n = 0 in (7.22), we have 1 x, y = √ mx,y (T) = E(T)x, y for all 2π Hence E(T) = I .

13 We

owe this to Lax [6] Chap. 31, especially Section 31.7.

x, y ∈ H.

184

7 Spectral Representation of Unitary Operators

2◦ E(S) is symmetric by Theorem 7.11. Applying 3◦ (to be shown below) to the special case S = T , we obtain E(S) = E(S)2 . So E(S) is a projection by Theorem 7.6. 3◦ Taking account of the relation between mx,y and E(·)x, y , we have 1 U n x, y = √ 2π

T

e−inθ dmx,y (θ ) =

T

e−inθ E(dθ )x, y .

(7.25)

Substituting n by n + k, we also have

U n+k x, y =

T

e−inθ e−ikθ E(dθ )x, y .

(7.26)

Replacing x in (7.25) by U k x, we further obtain

U

n+k

x, y =

T

e−inθ E(dθ )U k x, y .

(7.27)

The comparison of (7.26) and (7.27) gives14 e−ikθ E(dθ )x, y = E(dθ )U k x, y .

(7.28)

χS denotes the characteristic function of S ∈ B(R). Then it follows from (7.25) (n = 0) that

T

χS (θ )e−ikθ E(dθ )x, y = E(S)U k x, y . = U k x, E(S)y (by E(S) = E(S)∗ )

= e−ikθ E(dθ )x, E(S)y .

(7.25)

(7.29)

T

Since the first and the fourth terms in (7.29) are equal, χS (θ ) E(dθ )x, y = E(dθ )x, E(S)y .

the Fourier transforms of two Radon measures on T are equal, then the two measures are coincide.

14 If

7.5 Spectral Representation of Unitary Operators

185

Integrating both sides on T ∈ B(T),15 we have

T

χT (θ )χS (θ ) E(dθ )x, y = E(T )x, E(S)y .

The left-hand side is equal to E(S ∩ T )x, y , and so E(S ∩ T )x, y = E(T )x, E(S)y = E(S)E(T )x, y . This proves E(S ∩ T ) = E(S)E(T ). 4◦ Assume S ∩ T = ∅. Then mx,y (S ∪ T ) = E(S ∪ T )x, y .

(7.30)

On the other hand, mx,y (S ∪ T ) = mx,y (S) + mx,y (T ) = E(S)x, y + E(T )x, y = (E(S) + E(T ))x, y .

(7.31)

Comparing (7.30) and (7.31), we obtain E(S ∪ T ) = E(S) + E(T ). 5◦ It is easily observed that mx,y (·) = E(·)x, y is a complex measure by its construction. In addition to 1◦ –5◦ , E(·) has some more properties. 6◦ E(S)U = U E(S). In fact, U n (U x), y = U n x, U −1 y since U is unitary. Hence mU x,y = mx,U −1 y .

(7.32)

Both sides of (7.32) can be rewritten as mU x,y (S) = E(S)U x, y , mx,U −1 y (S) = E(S)x, U −1 y = U E(S)x, y .

15 χ χ T S

= χS∩T .

(7.33)

186

7 Spectral Representation of Unitary Operators

The relations (7.32) and (7.33) imply E(S)U = U E(S). Needless to say, all the properties corresponding to Theorem 7.8 are satisfied. That is, the following statements hold good for S, T ∈ B(T) and x ∈ H. 7◦ mx,x (S) = E(S)x, x = E(S)x 2 . 8◦ mx,x (·) is a positive measure on (T, B(T)) and mx,x (T) = x 2 . 9◦ E(S)E(T ) = E(T )E(S). 10◦ If S ∩ T = ∅, the images of E(S) and E(T ) are orthogonal. U n x, y can be expressed in terms of E(·) in the form

U x, y = n

T

e

−inθ

E(dθ )x, y =

−inθ

e E(dθ ) x, y . T 789: integration with respect to the vector measureE(·)x

We may write this relation symbolically as

Un =

T

e−inθ dE.

In the special case of n = 1, we have

U=

T

e−iθ dE.

This is the spectral representation of U . Remark 7.6 The above result is based upon the observation that the numerical sequence {an = U n x, x} defined by the corresponding composition U n (n ∈ Z) of a unitary operator U is positive semi-definite. By Herglotz’s theorem, an ’s can be represented as the Fourier coefficients of a positive Radon measure on T. We can generalize this result to the family {Un | n ∈ Z} of unitary operators with parameter n which satisfies Un Um = Un+m (n, m ∈ Z),

T0 = I.

(The family {U n | n ∈ Z} of compositions is its special case.) In this case, too, a quite similar representation is possible. That is, there exists uniquely some bounded symmetric operator E(S), for each S ∈ B(T), which satisfies

Un =

T

e−inθ dE(θ ),

n ∈ Z.

7.6 Stone’s Theorem

187

In the next section, we consider the representation problem of the group of unitary operators where the parameter space is R rather than Z. It may be naturally conjectured that Bochner’s theorem plays a key role instead of Herglotz’s theorem.

7.6 One-Parameter Group of Unitary Operators and its Spectral Representation: Stone’s Theorem Definition 7.6 A family {Tt |t ∈ R} of bounded linear operators on a Hilbert space H is called a one-parameter group of linear operators16 if Tt Ts = Tt+s

(t, s ∈ R),

T0 = I.

A one-parameter group {Tt } is said to be of class-C0 if s- limTt x = Tt0 x t→t0

for all

x∈H

for each t0 ∈ R.17 We now consider the special case of a one-parameter group consisting of unitary operators.18 Lemma 7.1 Let {Ut |t ∈ R} be a one-parameter group of class-C0 consisting of unitary operators. A function ϕ : R → C is defined by ϕ(t) = Ut x, x . Then ϕ enjoys the following properties: (i) ϕ is continuous. (ii) ϕ(−t) = ϕ(t). n ϕ(ti − tj )zi z¯ j 0 (iii) i,j =1

for any finite t1 , t2 , · · · , tn ∈ R and z1 , z2 , · · · , zn ∈ C. Proof The continuity of ϕ is clear. We also note that U−t = Ut−1 = Ut∗ ,

(7.34)

t is restricted to [0, ∞), {Tt | t 0} is called a semi-group. one-parameter group {Tt } of class-C0 is also said to be strongly continuous. 18 This theory was established by Stone [8]. I benefited much from Lax [6] Chap. 35, as well as Dunford [3]. 16 If

17 A

188

7 Spectral Representation of Unitary Operators

since {Ut |t ∈ R} is a one-parameter group. By (7.32), ϕ(−t) = U−t x, x = Ut∗ x, x = x, Ut∗ x = Ut x, x = ϕ(t). Finally, the positive semi-definiteness follows from a simple computation:

=

ϕ(ti − tj )zi z¯ j =

Uti x, Utj xzi z¯ j =

Uti −tj x, x zi z¯ j =

zi Uti x,

Ut−1 Uti x, x zi z¯ j j

2 z j Ut j x = z i Ut i x 0

for t1 , t2 , · · · , tn ∈ R and z1 , z2 , · · · , zn ∈ C.

We apply Bochner’s theorem (Theorem 6.11, p. 150) to the function ϕ(t) to obtain its representation as the Fourier transform of a positive Radon measure m(· ; x) on R. That is, there exists a measure m(· ; x) ∈ M+ (R) such that 1 ϕ(t) = Ut x, x = √ 2π

R

e−itλ dm(λ; x).

For fixed x, y ∈ H, Ut x, y is expressed as a linear combination of Ut (x ± y), x ± y ,

Ut (x ± iy), x ± iy .

Hence there exists some (complex) measure m(· ; x, y) which satisfies 1 Ut x, y = √ 2π

R

e−itλ dm(λ; x, y).

(7.35)

Theorem 7.13 Let {Ut | t ∈ R} be a one-parameter group of class-C0 consisting of unitary operators. Then there exists a Borel complex measure m(· ; x, y) for each x, y ∈ H such that 1 Ut x, y = √ 2π

R

e−itλ dm(λ; x, y).

(7.36)

Such a measure m(· ; x, y) is unique. The properties of m(· ; x, y) are summarized as follows. We can prove it in a similar way to Theorem 7.10. Theorem 7.14 m(· ; x, y) has the following properties: (i) (x, y) → m(· ; x, y) is a sesquilinear functional. (ii) (x, y) → m(· ; x, y) is skew-symmetric. (iii) √1 |m(S; x, y)| x · y for any S ∈ B(R). 2π

It is obvious that m(· ; x, x) = m(· ; x).

7.6 Stone’s Theorem

189

Thus if we define a function ΦS : H × H → C (S ∈ B(R)) by 1 ΦS : (x, y) → √ m(S; x, y), 2π it is a skew-symmetric sesquilinear functional with

ΦS (x, y) x · y . Consequently, by Corollary 7.1, there exists, for each S ∈ B(R), a bounded symmetric operator E(S) ∈ L(H) which satisfies 1 ΦS (x, y) = √ m(S; x, y) = E(S)x, y , 2π

x, y ∈ H.

(7.37)

E(S) is uniquely determined and E(S) = ΦS . By Theorem 7.13 and (7.37), (7.36) can be reexpressed eventually in the form

Ut x, y =

R

e−itλ E(dλ)x, y.

This is the final form of Stone’s theorem. Theorem 7.15 (Stone) Let {Ut |t ∈ R} be a one-parameter group of class-C0 consisting of unitary operators. Then there exists, for each S ∈ B(R), a unique bounded symmetric operator E(S) ∈ L(H) which satisfies

Ut x, y =

R

e−itλ E(dλ)x, y.

The proof of Stone’s theorem explained here is based upon Bochner’s theorem. However, it is remarkable that Bochner’s theorem can be deduced from Stone’s theorem, conversely. That is, these two theorems are mathematically equivalent.19 Let ϕ ∈ Cb (R, C) be positive semi-definite. We denote by H the set of functions of R into C, the values of which are zero except some finite points. H is a vector space with usual operations and an inner product ·, ·20 : (x + y)(t) = x(t) + y(t), (αx)(t) = α · x(t), x, y = ϕ(t − s)x(t)y(s). t,s∈R

19 We

owe this to Yosida [9] pp. 346–347. is not necessarily true that x, x = 0 ⇒ x = 0. x, x 0 follows from the positive semidefiniteness of ϕ.

20 It

190

7 Spectral Representation of Unitary Operators

H is a pre-Hilbert space under these operations. We define a subspace21 N of H by N = {x ∈ H|x, x = 0}. Let ξx , ξy be elements of the quotient space H/N which contain x and y, respectively. H/N becomes a pre-Hilbert space under the inner product ξx , ξy = x, y. The completion H˜ of H/N is a Hilbert space. It is clear that the operator Uτ : H → H (τ ∈ R) defined by (Uτ x)(t) = x(t − τ ),

x∈H

satisfies Uτ x, Uτ y = x, y, Uτ Uσ = Uτ +σ ,

x, y ∈ H,

τ, σ ∈ R,

U0 = I.

˜ ∈ R) by Hence if we define another operator U˜ τ ; H˜ → H(τ (U˜ τ ξx )(t) = (Uτ x)(t), ˜ and it is of class-C0 by the {U˜ τ } is a one-parameter group of unitary operators on H; continuity of ϕ. Therefore, by Stone’s theorem, there exists, for each E ∈ B(R), a symmetric operator E(S) such that

U˜ τ ξx , ξy = Uτ x, y = e−iτ λ E(dλ)x, y. R

If we define a complex measure m(· ; x, y) by √ m(S; x, y) = 2π E(S)x, y, m(· ; x, x) is real by Theorem 7.14.

any two elements x and y of N, we must have x + y ∈ N. It can be shown as follows. By the positive semi-definiteness of ϕ, ϕ(−u) = ϕ(u) (Lemma 6.3, p. 131). And we may apply Schwarz’s inequality to ·, ·. Since x, x = y, y = 0, we have

21 For

x + y, x + y = 2Rex, y. Hence by Schwarz’s inequality, |x + y, x + y| 2x, x1/2 · y, y1/2 = 0. x + y ∈ N immediately follows.

References

191

Specifying x0 ∈ H by x0 (t) =

+ 1

if t = τ,

0

if t τ,

we obtain Uτ x0 , x0 =

ϕ(t − s)Uτ x0 (t)x0 (s) =

t,s∈R

ϕ(t − s)x0 (t − τ )x0 (s)

t,s∈R

by definition of ·, ·. Since x0 (t − τ ) 0 and x0 (s) 0 only when t = 2τ and s = τ , it follows that Uτ x0 , x0 = ϕ(τ ). Consequently, 1 ϕ(τ ) = √ 2π

R

e−iτ λ dm(λ; x0 , x0 ).

This proves Bochner’s theorem.

References 1. Arveson, W.: An Invitation to C ∗ -Algebra. Springer, Berlin (1976) 2. Dixmier, J.: C ∗ -Algebras. North-Holland, Amsterdam (1977) 3. Dunford, N.: On one-parameter groups of linear transformations. Ann. Math. 39, 305–356 (1938) 4. Evans, L.: Partial Differential Equations. American Mathematical Society, Providence (1998) 5. Kato, T.: Iso Kaiseki (Functional Analysis). Kyoritsu Shuppan, Tokyo (1957) (Originally published in Japanese) 6. Lax, P.D.: Functional Analysis. Wiley, New York (2002) 7. Maruyama, T.: Kansu Kaisekigaku (Functional Analysis). Keio Tsushin, Tokyo (1980) (Originally published in Japanese) 8. Stone, M.H.: On one-parameter unitary groups in Hilbert space. Ann. Math. 33, 643–648 (1932) 9. Yosida, K.: Functional Analysis, 3rd edn. Springer, New York (1971)

Chapter 8

Fourier Analysis of Periodic Weakly Stationary Processes

During the decade around 1930, the world economy was thrown into a serious depression that nobody had previously experienced. A number of economic theories were proposed in order to analyze and control the violence of business fluctuations. We enumerate, say, the interaction theory of accelerator and multiplier by P.A. Samuelson and J.R. Hicks, the nonlinear oscillation model by N. Kaldor and R. Goodwin, the analysis based upon the differential equation with delay by R. Frisch, and so on. Furthermore, we should remember that J.M. Keynes’ innovative macroeconomic theory paved the way leading to dynamic theory of economic fluctuations. From a mathematical point of view, the work of E. Slutsky [29], a Ukrainian mathematician, is particularly remarkable.1 It was published in Econometrica, 1937. He tried to explain more or less regular fluctuations of macro-economic movements based upon the overlapping effects of random shocks. However, frankly speaking, his analysis was rather experimental and devoid of mathematical rigor.2 In this chapter and the next, we investigate periodic and almost periodic behaviors of stationary stochastic processes and give a systematic exposition of the mathematical skeleton of the Slutsky problem from the viewpoint of classical Fourier analysis. The Bochner–Herglotz theory concerning Fourier representation of positive semi-definite functions provides the key analytical tool.3

1 R.

Frisch [6] also deserves special attention. Maruyama [22] for the outline of the theories of business fluctuations. See also Samuelson [24, 25], Hicks [9], Kaldor [11], R. Frisch [6], Keynes [18], Slutsky [28–30]. 3 Kawata [16, 17], Maruyama [20] and Wold [31] are classical works on Fourier analysis of stationary stochastic processes, which provided me with all the basic mathematical background. Among more recent literature, I wish to mention Brémaud [1]. Granger and Newbold [7] Chap. 2, Hamilton [8] Chap. 3 and Sargent [26] Chap. XI are textbooks written from the standpoint of economics. 2 See

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_8

193

194

8 Periodic Weakly Stationary Processes

8.1 Stochastic Processes of Second Order Let (Ω, E, P ) be a probability space, and T a subset of R. T is usually interpreted as the space of time and, in this chapter, T is assumed to be either R or Z. A function X : T × Ω → C is said to be a stochastic process if the function ω → X(t, ω) is (E, B(C))-measurable for any fixed t ∈ T . The trajectory of X(t, ω) for a fixed ω, that is, the function t → X(t, ω), is called the sample function of this stochastic process. Let T be the set of all the finite tuples of elements of T , that is T = {t = (t1 , t2 , · · · , tn ) | tj ∈ T , j = 1, 2, · · · , n, n ∈ N}. Xt (ω) denotes the vector Xt (ω) = (X(t1 , ω), X(t2 , ω), · · · , X(tn , ω)),

t ∈ T.

The set function νXt : B(Cn ) → R defined by νXt (E) = P {ω ∈ Ω|Xt (ω) ∈ E},

E ∈ B(Cn )

is a measure on B(Cn ), called the distribution of Xt (ω). The set of all the distributions {νXt | t ∈ T } is called the system of finite dimensional distributions determined by X(t, ω). {νXt | t ∈ T } is determined by a given stochastic process X(t, ω). Conversely, assume now that a probability measure νt on (Cn , B(Cn )) is given for each t = (t1 , t2 , · · · , tn ) ∈ T . The set of all of them is {νt | t ∈ T }. Does there exist a stochastic process, the system of finite dimensional distributions of which is {νt | t ∈ T }? The positive answer to this question is given by the famous theorem due to A.N. Kolmogorov.4 Kolmogorov’s Theorem Assume that a family {νt | t ∈ T } of probability measures satisfies the following, (i) and (ii). Then there exists a stochastic process X : T × Ω → C(or R), the system of finite dimensional distributions of which is {νt | t ∈ T }. (i) If t1 = (t1 , t2 , · · · , tn ) and t2 = (t1 , t2 , · · · , tn , tn+1 ), νt1 (E) = νt2 (E × C)

for

E ∈ B(Cn ).

(ii) Let π be a permutation of n letters, t = (t1 , t2 , · · · , tn ) ∈ T , s = (s1 , s2 , · · · sn ) = π(t). Then

4 See

Kawata [17] pp. 5–7, for the proof. Loève [19] pp. 92–94 and Dudley [5] pp. 346–349 are also suggestive.

8.1 Stochastic Processes of Second Order

195

νt (E) = νs (π(E))

for E ∈ B(Cn ),

where π(E) = {(xs1 , · · · , xsn ) | (xt1 , · · · , xtn ) ∈ E}. From now on, the space T of time is assumed to be R, unless otherwise mentioned. A stochastic process X(t, ω) is said to be of the p-th order if

E|X(t, ω)| =

|X(t, ω)|p dP (ω) < ∞,

p

Ω

where E is the expectation operator. In this chapter, the special case of p = 2 is our main concern. Let X : T × Ω → C be a stochastic process of the second order. The following two functions play a crucial role: R : T × T → C,

ρ : T × T → C.

R(s, t) is defined by R(s, t) = EX(s, ω)X(t, ω),

s, t ∈ T

and is called the correlation function of X. ρ(s, t) is defined by ρ(s, t) = E[X(s, ω) − EX(s, ω)][X(t, ω) − EX(t, ω)],

s, t ∈ T

and is called the covariance function of X. We list several basic properties of covariance function ρ(s, t) of a stochastic process X(t, ω): 1◦ ρ(s, t) = R(s, t) − EX(s, ω)EX(t, ω). Proof ρ(s, t) = E[X(s, ω)X(t, ω) − X(s, ω)EX(t, ω) − X(t, ω)EX(s, ω) + EX(s, ω) · EX(t, ω)] = E[X(s, ω)X(t, ω)] − EX(s, ω) · EX(t, ω) − EX(t, ω) · EX(s, ω) + EX(s, ω) · EX(t, ω) = R(s, t) − EX(s, ω) · EX(t, ω). 2◦

ρ(s, t) = ρ(t, s). 3◦ ρ(t, t) = E|X(t, ω) − EX(t, ω)|2 0. ρ(t, t) is nothing other than the variance of X(t, ω).

196

8 Periodic Weakly Stationary Processes

4◦ |ρ(s, t)|2 ρ(s, s) · ρ(t, t). Proof By Schwarz’s inequality, |ρ(s, t)|2 E|X(s, ω) − EX(s, ω)|2 × E|X(t, ω) − EX(t, ω)|2 = ρ(s, s) · ρ(t, t). 5◦

The (n × n)-matrix M = (ρ(ti , tj ))1i,j n

defined for any ti , tj ∈ R (i, j = 1, 2, · · · , n) and n ∈ N is an Hermite matrix which determines an Hermitian form of positive semi-definite. Proof It is obvious that M is Hermite by 2◦ . M determines an Hermitian form of positive semi-definite, since n i,j =1

n ρ(ti , tj )zi z¯ j = E [X(ti , ω) − EX(ti , ω)] × [X(tj , ω) − EX(tj , ω)]zi z¯ j i,j =1

2 n = E [X(ti , ω) − EX(ti , ω)]zi 0. i=1

Let X : T × Ω → C(T = R) be a stochastic process of the second order. When we regard X as a function of ω, fixing t ∈ T , we sometimes write it as Xt (ω). Definition 8.1 If a function A : R → L2 (Ω, C) defined by A : t → X(t, ω) is continuous at a point t0 ∈ R; i.e. E|X(t, ω) − X(t0 , ω)|2 → 0 as

t → t0 ,

we say that X(t, ω) is strongly continuous at t0 . If X(t, ω) is strongly continuous at every t ∈ R, X(t, ω) is said to be strongly continuous on R. In other words, X(t, ω) is strongly continuous at t0 if there exists some δ > 0, for each ε > 0, such that E|X(t, ω) − X(t0 , ω)|2 < ε if |t − t0 | < δ. Even if X(t, ω) is strongly continuous on R, the magnitude of δ depends upon ε and t, in general. If δ depends upon ε but is independent of t, we say that X(t, ω) is uniformly strongly continuous.

8.1 Stochastic Processes of Second Order

197

1◦ If X(t, ω) is strongly continuous, the function t → EX(t, ω) is continuous. Proof At any fixed t0 ∈ T , we have | EX(t, ω) − EX(t0 , ω) | 2 = | E[X(t, ω) − X(t0 , ω)] · 1 | 2 E | X(t, ω) − X(t0 , ω) | 2 · E(1)2

(Schwarz’s inequality)

= E | X(t, ω) − X(t0 , ω) | → 0 as 2

t → t0 .

2◦ R(s, t) is continuous if and only if both EX(t, ω) and ρ(s, t) are continuous. Proof Suppose that R(s, t) is continuous. Making use of the proof of 1◦ again, we have | EX(t, ω) − EX(t0 , ω) | 2 E | X(t, ω) − X(t0 , ω) | 2

(see the proof of 1◦ )

= E | X(t, ω) | 2 +E | X(t0 , ω) | 2 −2ReEX(t, ω)X(t0 , ω) = R(t, t) + R(t0 , t0 ) − 2ReR(t, t0 ).

(8.1)

The formula (8.1) converges to 0 as t → t0 , since R(s, t) is continuous. Hence t → EX(t, ω) is continuous. ρ(s, t) is computed as ρ(s, t) = R(s, t) − EX(s, ω)EX(t, ω).

(8.2)

The second term on the right-hand side of (8.2) is continuous as shown above. So ρ(s, t) is also continuous. Conversely, suppose that both EX(t, ω) and ρ(s, t) are continuous. The continuity of R(s, t) follows from (8.2). Theorem 8.1 (strong continuity) The following statements are equivalent: (i) X(t, ω) is strongly continuous. (ii) R(s, t) is continuous. (iii) R(s, s) is continuous in s, and the function s → R(s, t) is continuous for each fixed t ∈ T . Proof (i)⇒(ii): For s0 , t0 ∈ R, we have R(s, t) − R(s0 , t0 ) = E[X(s, ω)X(t, ω) − X(s0 , ω)X(t0 , ω)] = E{[X(s, ω) − X(s0 , ω)]X(t, ω)} + E{X(s0 , ω)[X(t, ω) − X(t0 , ω)]}. By Schwarz’s inequality,

(8.3)

198

8 Periodic Weakly Stationary Processes

(E | X(s, ω) − X(s0 , ω) | · | X(t, ω) | )2 E | X(s, ω) − X(s0 , ω) | 2 · E | X(t, ω) | 2 . (8.4) (E | X(s0 , ω) | · | X(t, ω) − X(t0 , ω) | )2 E | X(s0 , ω) | 2 · E | X(t, ω) − X(t0 , ω) | 2 . (8.5)

We obtain (ii) by (i), (8.3), (8.4) and (8.5). (ii)⇒(iii): Obvious. (iii)⇒(i): We have already observed in (8.1) that E | X(t, ω) − X(t0 , ω) | 2 → 0 as

t → t0 .

We introduce another concept of continuity which is weaker than strong continuity. Given a stochastic process X(t, ω) of the second order, we define a subspace H(X) of L2 (Ω, C) by H(X) = span{X(t, ω) | t ∈ R}, which is called the linear subspace determined by X. It is clear that H(X) is a Hilbert space (under the inner product of L2 ). Definition 8.2 X(t, ω) is H-weakly continuous at t0 ∈ R if EY (ω)X(t, ω) → EY (ω)X(t0 , ω)

as

t → t0

for any Y ∈ H(X). X(t, ω) is said to be H-weakly continuous on R or just H-weakly continuous if it is H-weakly continuous at every t0 ∈ R. If X(t, ω) is strongly continuous, it is H-weakly continuous. In fact, it immediately follows from the evaluation (Schwarz’s inequality) that |EY (ω)X(t, ω) − EY (ω)X(t0 , ω)| Y 2 · X(t, ω) − X(t0 , ω) 2 for any Y ∈ L2 . Theorem 8.2 (separability of H(X)) If X(t, ω) is H-weakly continuous, H(X) is a separable Hilbert space. Proof Let D be the set of all random variables of the form n

rj X(tj , ω),

n ∈ N,

rj : rational complex, 5 tj ∈ Q.

j =1

5 That

is, both the real and imaginary parts are rationals.

8.1 Stochastic Processes of Second Order

199

Then D is a countable set in H(X). If we define M = spanD, M is a separable closed subspace of H(X). So our target is to show that M = H(X). Since H(X) is a Hilbert space, it can be expressed as H(X) = M ⊕ M⊥ , where M⊥ is the orthogonal complement of M. Hence any Y ∈ H(X) can be uniquely represented in the form Y (ω) = V (ω) + W (ω),

V ∈ M,

W ∈ M⊥ .

Then we can show W = 0 as an element of L2 . Let t0 be an element of R and {tn } a sequence in Q converging to t0 . The H-weak continuity of X(t, ω) implies EW (ω)X(tn , ω) → EW (ω)X(t0 , ω)

as

n → ∞.

(8.6)

However, EW (ω)X(tn , ω) = 0 for all n,

(8.7)

since X(tn , ω) ∈ D ⊂ M and W ∈ M⊥ . Hence it is obvious by (8.6) and (8.7) that EW (ω)X(t0 , ω) = 0. Therefore it follows that EW (ω)

0 n

1 cj X(tj , ω) = 0

j =1

for any element

n

cj X(tj , ω), (cj ∈ C, tj ∈ R, n ∈ N) of span{X(t, ω) | t ∈ R}.

j =1

According to the extension by continuity, we obtain EW (ω)Z(ω) = 0 for all

Z ∈ H(X),

from which W = 0 (in L2 ) follows. By the uniqueness of orthogonal decomposition, we obtain Y (ω) = V (ω),

200

8 Periodic Weakly Stationary Processes

from which Y ∈ M follows. Since this holds good for any Y ∈ H(X), we obtain H(X) ⊂ M. Thus we prove that M = H(X). If X(t, ω) is strongly continuous, it is H-weakly continuous, and so H(X) is separable by Theorem 8.2.

8.2 Weakly Stationary Stochastic Processes Throughout this section, T = R, Z or N and (Ω, E, P ) is a probability space. X(t, ω) : T × Ω → C is a stochastic process. In the case of T = Z or N, we sometimes write Xt (ω) instead of X(t, ω). Definition 8.3 A stochastic process X(t, ω) is strongly stationary if the distribution of (X(t1 + t, ω), X(t2 + t, ω), · · · , X(tn + t, ω)) is independent of t for any t = (t1 , t2 , · · · , tn ) ∈ T . We define Φt1 ,t2 ,··· ,tn (E) for each E ∈ B(Cn ) by Φt1 ,t2 ,··· ,tn (E) = P {ω ∈ Ω|(X(t1 , ω), · · · , X(tn , ω)) ∈ E}. Then X(t, ω) is strongly stationary if and only if Φt1 ,t2 ,··· ,tn = Φt1 +t,t2 +t,··· ,tn +t ,

n ∈ N,

t1 , t2 , · · · , tn ∈ T .

Definition 8.4 X(t, ω) is said to be weakly stationary if the following conditions are satisfied: (i) The absolute moment of the second order is finite: E|X(t, ω)|2 < ∞ for each

t ∈ T.

(ii) The expectation is constant throughout time: EX(t, ω) = m(t) = m constant for all t ∈ T . (iii) The covariance depends only upon the difference u = s − t of times: ρ(s, t) = ρ(s + h, t + h) for any

s, t, h ∈ T .

In this case, the convariance ρ(s, t) is a function of s − t. Hence we write it as ρ(u), u = s − t instead of ρ(s, t).6 6 We should use distinct notations for one-variable function ρ(u) and two variable function ρ(s, t). However, we use the same notation because there seems no possibility of confusion. ρ(u) is also called the autocovariance and ρ(u)/ρ(0) is called the autocorrelation of the process.

8.2 Weakly Stationary Stochastic Processes

201

Theorem 8.3 If a stochastic process X(t, ω) is strongly stationary and E|X(t0 , ω)|2 < ∞ for some t0 ∈ T , then X(t, ω) is weakly stationary. Proof X(t, ω) is a stochastic process of the second order, since E|X(t, ω)|2 = E|X(t0 , ω)|2 < ∞ for all

t ∈T

by the strong stationarity. Furthermore,

m(t) =

C

ξ dΦt (ξ ),

ρ(s, t) =

C C

(ξ − m(s))(η − m(t))dΦs,t (ξ, η).

Hence X(t, ω) is weakly stationary.

Remark 8.1 The strong stationarity does not imply the weak one because the condition E|X(t, ω)|2 < ∞ is not required in the definition of the strong one. Example 8.1 Let Xj : Ω → C (j = 1, 2, · · · , n) be mutually orthogonal random variables in L2 (Ω, C) with mean 0, i.e. EXj (ω) = 0,

j = 1, 2, · · · , n,

EXj (ω)Xk (ω) = 0

if j k. (orthogonality)

We define X(t, ω) by X(t, ω) =

n

eiλj t Xj (ω),

t ∈ R,

j =1

where λj ∈ R (j = 1, 2, · · · , n). Then X(t, ω) is weakly stationary with EX(t, ω) = 0 ρ(u) =

n

for all

t ∈ R,

eiλj u E|Xj (ω)|2 .

j =1

In fact, (8.8) is obvious. ρ(s, t) is computed as

(8.8) (8.9)

202

8 Periodic Weakly Stationary Processes

ρ(s, t) = EX(s, ω)X(t, ω) = E

n n

eiλj s e−iλk t Xj (ω)Xk (ω)

j =1 k=1

=

n

eiλj (s−t) E|Xj (ω)|2 .

j =1

So we obtain (8.9) by setting u = s − t. Example 8.2 Let Xj , Yj : Ω → R (j = 1, 2, · · · , n) be orthogonal random variables with EXj (ω) = EYj (ω) = 0,

j = 1, 2, · · · , n,

E|Xj (ω)|2 = E|Yj (ω)|2 = 1,

j = 1, 2, · · · , n.

We define X(t, ω) by X(t, ω) =

n

aj [Xj (ω) cos λj t + Yj (ω) sin λj t],

j =1

where αj , λj ∈ R (j = 1, 2, · · · , n). Then it is clear that EX(t, ω) = 0

for all

t ∈ R.

Since Xj ’s and Yj ’s are random variables with mean 0 and variance 1, the covariance ρ(s, t) is computed as ρ(s, t) = EX(s, ω)X(t, ω) 1 0 n 2 2 2 aj (Xj (ω) cos λj s cos λj t + Yj (ω) sin λj s sin λj t) =E j =1

=

n

aj2 cos λj (s − t).

j =1

Hence we have, setting u = s − t, that ρ(u) =

n

aj2 cos λj u.

j =1

This proves that X(t, ω) is weakly stationary.

8.2 Weakly Stationary Stochastic Processes

203

Example 8.3 (moving average) Let Xn : Ω → C (n ∈ Z) be orthogonal random variables in L2 (i.e. EXj (ω)Xk (ω) = 0 for j k) with EXn = 0,

E|Xn (ω)|2 = σ 2

(independent of n)

for each n ∈ Z. Define Yn (ω)(n ∈ Z) by Yn (ω) =

∞

n ∈ Z,

ck−n Xk (ω),

(8.10)

k=−∞

where {cν }ν∈Z ∈ l2 (C). Then we can prove that the right-hand side of (8.10) strongly converges in L2 and {Yn (ω)} is weakly stationary with EYn (ω) = 0,

n ∈ Z,

(8.11)

and ρ(u) = σ 2

∞

cu+ν c¯ν .

ν=−∞

We start by proving L2 -convergence of the right-hand side of (8.10). Fix n ∈ Z. Suppose first that p, p ∈ Z satisfy 0 p < p . We have, by a simple computation, 2 p p E ck−n Xk (ω) = cj −n ck−n EXj (ω)Xk (ω)

k=p

j,k=p

=

p

|ck−n |2 σ 2

(by orthogonality)

k=p

=σ

2

p k=p

|ck−n | = σ 2

2

−n p

|cν |2 .

(8.12)

ν=p−n

When p and p get larger (n being fixed), p − n and p − n become positive and their absolute values grow indefinitely. Therefore (8.12) → 0 as p, p → ∞, since {cν } ∈ l2 (C). (See Fig. 8.1.)

Fig. 8.1 Calculation of (8.12) – 1

204

8 Periodic Weakly Stationary Processes

Fig. 8.2 Calculation of (8.12) – 2

Similarly, in the case of p < p < 0, we have (Fig. 8.2) 2 p p−n 2 ck−n Xk (ω) = σ |cν |2 E k=p

p, p → −∞.

→ 0 as

ν=p −n

We consider partial sums of (8.10): U (ω) =

q

V (ω) =

ck−n Xk (ω),

q

ck−n Xk (ω)

k=p

k=p

with7 p < p < 0, 0 q < q and evaluate the distance between them based upon preliminary computations shown above (Fig. 8.3): −n p−n 2 q

U (ω) − V (ω) 22 = c X (ω) + c X (ω) ν n+ν ν n+ν

ν=p −n

2

ν=q−n

p−n 2 q 2 −n = c X (ω) + c X (ω) ν n+ν ν n+ν

2

ν=p −n

= σ2

p−n ν=p −n

→0

as

|cν |2 +

−n q

ν=q−n

(by orthogonality)

2

|cν |2

ν=q−n

p, p → −∞, q, q → ∞.

This proves that the sequence of partial sums q

ck−n Xk (ω)

p < 0, q 0, p → −∞, q → ∞

k=p

this evaluation, we are examining the situation where the upper limits (q, q ) of the sums tend to ∞, and the lower limits (p, p ) tend to −∞. So, without loss of generality, we may assume p , p < 0, 0 q, q . There are various cases other than p < p and q < q . However, we can treat them in the same manner.

7 In

8.2 Weakly Stationary Stochastic Processes

205

Fig. 8.3 Calculation of U (ω) − V (ω) 22

of the right-hand side of (8.10) is Cauchy in L2 . Hence the right-hand side of (8.10) strongly converges in L2 . Its limit is, as the above argument shows, determined uniquely and independent of the choice of p → −∞ and q → ∞. So Yn (ω) is defined without any ambiguity. The zero mean condition (8.11) is verified as follows. By definition8 ⎡ EYn (ω) = E ⎣l.i.m. p→∞

⎤

p

ck−n Xk (ω)⎦

(p > 0).

k=−p

Since Ω is a probability space, L2 -convergence implies L1 -convergence. Hence we obtain EYn (ω) = lim E p→∞

p

ck−n Xk (ω) = 0.

k=−p

Finally, we evaluate the covariance: ρ(m, n) = EYm (ω)Yn (ω) = σ 2 lim E p→∞

p

ck−m ck−n = σ 2

k=−p

∞

cn−m+ν c¯ν .

ν=−∞

Writing u = n − m, we obtain ρ(u) = σ 2

∞

cu+ν c¯ν .

ν=−∞

Thus we conclude that {Yn (ω)} is a weakly stationary process. The weakly stationary stochastic process defined by (8.10) is called a moving average process of {Xn (ω)} or a linear stochastic process generated by {Xn (ω)}.

8 It is known

limits of

Hence we can specify the upper and lower ) that the right-hand side converges strongly. as p and −p. l.i.m. denotes the limit in L2 .

206

8 Periodic Weakly Stationary Processes

Definition 8.5 A stochastic process Xn : Ω → C (n ∈ Z) is called a white noise if it satisfies: (i) EXn (ω) = 0 for all n ∈ Z, (ii) E | Xn (ω)|2 = σ 2 for all n ∈ Z, and (iii) EXn (ω)Xm (ω) = 0 for all n m. The orthogonality expressed in (iii) is also called the condition of serially uncorrelatedness. In terms of this terminology, Example 8.3 says that a moving average process of a white noise is weakly stationary. The next results concern the relations between the strong continuity of a weakly stationary process and the continuity of its covariance. Theorem 8.4 Let X : R × Ω → C be a weakly stationary stochastic process with EX(t, ω) = 0 for all t ∈ R. Then the following statements are equivalent: (i) (ii) (iii) (iv)

X(t, ω) is strongly continuous at t = 0. X(t, ω) is uniformly strongly continuous on R. ρ(u) is continuous at u = 0. ρ(u) is uniformly continuous on R.

Proof (i)⇔(ii): Assume that X(t, ω) is strongly continuous at t = 0. The uniform continuity follows from E|X(t + h, ω) − X(t, ω)|2 = E|X(t + h, ω)|2 + E|X(t, ω)|2 − 2ReEX(t + h, ω)X(t, ω) = E|X(h, ω)|2 + E|X(0, ω)|2 − 2ReEX(h, ω)X(0, ω) (by weak stationarity) = E|X(h, ω) − X(0, ω)|2 .

(8.13)

The converse assertion is obvious. (ii)⇒(iv): By a simple calculation, we have 0 12 |ρ(u + h) − ρ(u)| = E X(u + h, ω)X(0, ω) − X(u, ω)X(0, ω) 2

E|X(u + h, ω) − X(u, ω)|2 · E|X(0, ω)|2 . (iv) follows from (8.14) and (ii). (iv)⇒(iii): Obvious. (iii)⇒(ii): The right-hand side of (8.13) is rewritten as

(8.14)

8.2 Weakly Stationary Stochastic Processes

207

E|X(t + h, ω) − X(t, ω)|2 = E|X(h, ω) − X(0, ω)|2

(by (8.13))

= 2E|X(0, ω)|2 − 2ReEX(h, ω)X(0, ω) (by weak stationarity) = 2[ρ(0) − Reρ(h)].

(8.15)

(ii) follows from (8.15) and (iii).

Definition 8.6 If X(t, ω) is (L ⊗ E, B(C))-measurable, X(t, ω) is called a measurable stochastic process, where L is the Lebesgue σ -field on R. The next theorem shows the strong continuity of a measurable weakly stationary stochastic process. Theorem 8.5 (Crum)9 The covariance function of a measurable weakly stationary process X(t, ω) is continuous. Hence X(t, ω) is strongly continuous. Proof We may assume that EX(t, ω) = 0 without loss of generality. It is sufficient to show the continuity of ρ(u) at u = 0, by Theorem 8.4. To start with, we verify e−t X(t, ω) ∈ L2 (R, C) 2

a.e.

In fact, it immediately follows from

E

e−2t |X(t, ω)|2 dt = 2

R

R

e−2t E|X(t, ω)|2 dt = ρ(0) 2

e−2t dt < ∞. 2

R

Define a couple of functions, Y (ω) and Z(u, ω), as follows:

Y (ω) =

R

Z(u, ω) =

R

e−2t |X(t, ω)|2 dt, 2

|e−t X(t, ω) − e−(t+u) X(t + u, ω)|2 dt. 2

2

(Of course, they are defined for almost every ω.) By the continuity of the shift operator (Theorem 5.1, p. 101), the function u → e−(t+u) X(t + u, ω) 2

is continuous. This implies that

9 Crum

[3].

(R → L2 )

(8.16)

208

8 Periodic Weakly Stationary Processes

Z(u, ω) → 0

u → 0 a.e.

as

(8.17)

By the definition (8.16) of Z,10 0 Z(u, ω) 1 0

2 2 e−2t |X(t, ω)|2 dt + e−2(t+u) |X(t + u, ω)|2 dt 2 R

R

= 4Y (ω). Since EY (ω) < ∞, (8.17) and the dominated convergence theorem imply lim EZ(u, ω) = 0.

(8.18)

u→0

On the other hand,

2 2 Z(u, ω) = [e−2t |X(t, ω)|2 + e−2(t+u) |X(t + u, ω)|2 R

− 2Re e−t

=2

[e−2t |X(t, ω)|2 − e−t 2

R

2 −(t+u)2

2 −(t+u)2

X(t + u, ω)X(t, ω)]dt

ReX(t + u, ω)X(t, ω)]dt (by changing variables).

Hence

EZ(u, ω) = 2

[e−2t ρ(0) − e−t 2

R

= 2ρ(0)

2 −(t+u)2

e−2t dt − 2e−u Re ρ(u) 2

R

Re ρ(u)]dt

2

= 2[ρ(0)−Reρ(u)]e−u

2

R

e−2t

2 −2tu

R

e−2t

2 −2tu

dt

2 2 dt + 2ρ(0) e−2t (1−e−2tu−u )dt. R

Writing, for the sake of simplicity,

10 It

is well-known that |α − β|2 2(|α|2 + |β|2 )

for any α, β ∈ C, in general. Specifying α and β as α = e−t X(t, ω), 2

we obtain the desired result.

β = e−(t+u) X(t + u, ω), 2

8.2 Weakly Stationary Stochastic Processes

I1 =

R

e−2t

2 −2tu

209

I2 =

dt,

R

e−2t (1 − e−2tu−u )dt, 2

2

we obtain EZ(u, ω) = 2[ρ(0) − Re ρ(u)]e−u I1 + 2ρ(0)I2 . 2

(8.19)

By the monotone convergence theorem, it holds good that11 11 The

graph of the function t → e−2tu is depicted as follows according to the sign of u.

(a)

(b)

(a) In the case of u > 0, e−2tu monotonically converges to 1 at each t as u → 0. To say more in detail, it is monotonically increasing in the region t > 0, and decreasing in the region 2 2 t < 0. Since we may assume 0 < u 1, without loss of generality, e−2t −2tu e−2t −2t in the region t < 0. Hence applying the dominated convergence theorem and the monotone convergence theorem to the first and the second terms, respectively, of the right-hand side of the integral

R

e−2t

2 −2tu

dt =

we obtain

R

0

2

−∞

e−2t

e−2t e−2tu dt +

∞

e−2t e−2tu dt, 2

0

2 −2tu

dt →

e−2t dt 2

R

(†)

as u → 0 (u > 0). (b) A similar argument applies to the case of u < 0, u → 0. In general, assume that a numerical sequence {un } (un may either be positive or negative) converges to 0. Suppose

2 2 e−2t e−2tun dt e−2t dt. R

R

Then for sufficiently small ε > 0, there exists a subsequence {un } of {un } such that

e−2t 2 −2tun dt − e−2t 2 dt ε for all n . R

R

If we choose a further subsequence {un } of {un } of the type (a) or (b), then we must have

e−2t 2 −2tun dt − e−2t 2 dt ε for all n . R

A contradiction occurs.

R

210

8 Periodic Weakly Stationary Processes

lim I1 =

u→0

e−2t dt, 2

R

lim I2 = 0.

u→0

(8.20)

Hence it follows that ρ(0) − Re ρ(u) → 0 as

u→0

by (8.18), (8.19) and (8.20).That is, Re ρ(u) → ρ(0) Taking account of |ρ(u)| ρ(0)

as

u → 0.

(8.21)

(cf. 4◦ on p. 196), we have

(Re ρ(u))2 + (Im ρ(u))2 ρ(0)2 .

(8.22)

By (8.21) and (8.22), Im ρ(u) → 0

as

u → 0.

as

u → 0.

Thus we obtain the desired result: ρ(u) → ρ(0)

8.3 Periodicity of Weakly Stationary Stochastic Process Let X : T × Ω → C be a weakly stationary stochastic process. Then it is easy to see that its covariance function ρ(u) is positive semi-definite. In fact, n i,j =1

ρ(ti − tj )λi λj =

n

EX(ti , ω)X(tj , ω)λi λj

i,j =1 n 2 = E X(tj , ω)λj 0 i,j =1

for any tj ∈ T and λj ∈ C (j = 1, 2, · · · , n). In the case of T = Z, the following result is immediately obtained by Herglotz’s theorem 6.4 (p. 133). Theorem 8.6 (Spectral representation of ρ : T = Z) If X : Z × Ω → C is a weakly stationary process, its covariance function ρ(u) can be expressed as the Fourier transform of certain positive Radon measure ν on T:

8.3 Periodicity of Weakly Stationary Stochastic Process

1 ρ(n) = √ 2π

T

211

e−inθ dν(θ ).

Such a measure ν is determined uniquely. In the case of T = R, the corresponding result holds good. Theorem 8.6 (Spectral representation of ρ : T = R) If X : R × Ω → C is a measurable and weakly stationary process, its covariance function ρ(u) can be expressed as the Fourier transform of certain positive Radon measure ν on R: 1 ρ(u) = √ 2π

R

e−iut dν(t).

Such a measure ν is determined uniquely. Since X(t, ω) is measurable, ρ(u) is continuous by Theorem 8.5. We have already confirmed the positive semi-definiteness of ρ(u). Hence Theorem 8.6 follows from Bochner’s Theorem 6.11 (p. 150). √ ν is not necessarily a probability measure because ρ(0) = 1/ 2π is not necessarily satisfied. The Radon measure ν appearing Theorem 8.6 or Theorem 8.6 is called the spectral measure of X(t, ω). The function F : R → R defined by F (α) = ν((−∞, α)),

α∈R

is called the spectral distribution function of X(t, ω). If ν is absolutely continuous with respect to the Lebesgue measure dt, the Radon–Nikodým derivative p(t) ∈ L1 (R, R), p(t) 0 is called the spectral density function of X(t, ω); i.e.

ν(E) =

p(t)dt. E

The spectral density function, if it exists, is unique. We will discuss later on the condition which assures the existence of a spectral density function of a stochastic process. Example 8.4 We recapitulate Example 8.1 on p. 201. That is, consider a stochastic process X(t, ω) =

n

eiλj t Xj (ω),

t ∈R

(λj ∈ R, j = 1, 2, · · · , n),

j =1

where Xj : Ω → C (j = 1, 2, · · · , n) is a random variable with mean 0 and Xj ⊥Xk (j k). Then X(t, ω) is weakly stationary and

212

8 Periodic Weakly Stationary Processes

EX(t, ω) = 0 ρ(u) =

n

for all

t ∈ R,

eiλj u E|Xj (ω)|2

j =1

as we have already verified. We may assume λj ∈ [−π, π ), without loss of generality. E|Xj (ω)|2 is the variance of Xj (ω) and is denoted by σj2 . The spectral measure of X(t, ω) is a Radon measure on T which assigns a mass √

2π E|Xj (ω)|2 =

√

2π σj2

to each t = −λj ; i.e. 1 σj2 δ−λj , √ ν= 2π j =1 n

where δλj is a Dirac measure concentrating at λj . There is no spectral density function. Furthermore, the spectral distribution function is given by ⎧ ⎪ 0 for α < −λn , ⎪ ⎪ ⎪ n ⎪ ⎪ ⎪ ⎨ σj2 for − λk α < −λk−1

1 √ F (α) = j =k ⎪ 2π ⎪ n ⎪ ⎪ ⎪ ⎪ σj2 ⎪ ⎩

for

(k = 2, 3, · · · , n),

− λ1 α,

j =1

where λ1 < λ2 < · · · < λn . Example 8.5 Look at Example 8.2 on p. 202. That is, consider a stochastic process X(t, ω) =

n

aj [Xj (ω) cos λj t + Yj (ω) sin λj t],

j =1

where Xj , Yj : Ω → R (j = 1, 2, · · · , n) are orthogonal random variables with mean 0 and variance 1. aj , λj ∈ R (j = 1, 2, · · · , n). Then X(t, ω) is weakly stationary and EX(t, ω) = 0 ρ(u) =

n j =1

for all

t ∈ R,

aj2 cos λj u.

(8.23)

8.3 Periodicity of Weakly Stationary Stochastic Process

213

We may assume again that λj ∈ T, without loss of generality. The spectral√measure of X(t, ω) is, by (8.23), the Radon measure on T which assigns a mass 2π aj2 /2 to each t = ±λj ; i.e. ν=

√

2π

n a2 j j =1

2

δλj

n a2 n a2 √ √ j j δ−λj = 2π (δλj + δ−λj ). + 2π 2 2 j =1

j =1

There is no spectral density function. The spectral distribution function of X(t, ω) √ is a step function with a jump 2π aj2 /2 at each t = ±λj . The spectral measure of a moving average process (Example 8.3) will be discussed later on. Here arises a natural question.12 When a positive Radon measure ν on T is given, does there exist a weakly stationary stochastic process the spectral measure of which is exactly ν? The following proposition answers this question positively. Theorem 8.7 A positive Radon measure ν on T is assumed to be given. Then there exist a probability space (Ω, E, P ) and a stochastic process Xn : Ω → C(n ∈ Z) which satisfies the following conditions: (i) Xn (ω) is a weakly stationary process. (ii) ν is the spectral measure of Xn (ω). The next result is a continuous-time (T = R) version of the above. Theorem 8.7 A positive Radon measure ν on R is assumed to be given. Then there exist a probability space (Ω, E, P ) and a stochastic process X : R × Ω → C which satisfies the following conditions: (i) X(t, ω) is a measurable and weakly stationary process.13 (ii) ν is the spectral measure of X(t, ω). Proof Let ν be a positive Radon measure on R. If we define a set function θ on B(R) by θ (E) = ν(E)/ν(R),

E ∈ B(R),

then θ is a Radon probability measure on R. As is well-known in probability theory, there exist a certain probability space (Ω, E, P ) and a couple of (real) random variables Y (ω), Z(ω) which are defined on Ω and satisfy the following conditions14 :

12 Kawata

[16], pp. 150–151. is strongly continuous by Theorem 8.5. 14 Let Φ , Φ , · · · be a sequence of Borel probability measures. Then there exist a certain 1 2 probability space (Ω, E, P ) and mutually independent random variables, the distributions of which are Φ1 , Φ2 , · · · (Itô [10], p. 68). 13 X(t, ω)

214

8 Periodic Weakly Stationary Processes

a. Y (ω) and Z(ω) are independent. b. The distribution of Z(ω) is θ . c. EY (ω) = 0, EY (ω)2 = √1 ν(R). 2π

Define a stochastic process X(t, ω) by X(t, ω) = Y (ω)e−iZ(ω)t . Then it satisfies15 EX(t, ω) = EY (ω)Ee−iZ(ω)t = 0, ρ(t + u, t) = EX(t + u, ω)X(t, ω) = E[Y (ω)2 e−iZ(ω)u ]

1 1 −iλu = √ ν(R) e dθ (λ) = √ e−iλu dν(λ). 2π 2π R R The covariance ρ(t + u, t) of X(t, ω) depends only upon u, and it is represented as the Fourier transform of ν. Since the Fourier transform of ν is uniformly continuous,16 X(t, ω) is strongly continuous. The measurability of X(t, ω) is also clear by its definition. We now investigate the periodic behaviors of weakly stationary processes. Definition 8.7 Let X : T × Ω → C be a weakly stationary process with the covariance function ρ(u). X(t, ω) is called a periodic weakly stationary process with period τ or, in short, τ -periodic if the ρ(u) is periodic with period τ ; i.e. ρ(u + τ ) = ρ(u). Theorem 8.8 is a characterization of periodic processes in the discrete-time case. Theorem 8.8 is its continuous-time version. Theorem 8.8 Let X : Z × Ω → C be a weakly stationary process with the spectral measure ν. Then the following three statements are equivalent:

15 Let X , X , · · · 1 2

be independent real random variables and g1 , g2 , · · · : R → C Borel measurable. Then Y1 = g1 (X1 ), Y2 = g2 (X2 ), · · · are independent random variables (Itô [10], p. 66). Based upon this result, Y (ω) and e−iZ(ω)t in the text are independent. 16 If we denote by νˆ the Fourier transform of ν, we have

1 |ˆν (t + h) − νˆ (t)| = √ (e−i(t+h)x − e−itx )dν(x) 2π R

1 |e−ihx − 1|dν(x) √ 2π R → 0 as by the dominated convergence theorem.

h → 0,

8.3 Periodicity of Weakly Stationary Stochastic Process

215

(i) Xn (ω) is τ -periodic. (ii) Xn+τ (ω) − Xn (ω) = 0 a.e.(ω) for all n ∈ Z. (iii) If E ∈ B(T) and E ∩ {2kπ/τ |k ∈ Z} = ∅, then ν(E) = 0. Theorem 8.8 Let X : R × Ω → C be a measurable and weakly stationary process with the spectral measure ν. Then the following three statements are equivalent: (i) X(t, ω) is τ -periodic. (ii) X(t + τ, ω) − X(t, ω) = 0. a.e.(ω) for all t ∈ R. (iii) If E ∈ B(R) and E ∩ {2kπ/τ |k ∈ Z} = ∅, then ν(E) = 0. Proof (i)⇒(ii): We assume that ρ(·) is a periodic function with period T , and show E|X(t + T , ω) − X(t, ω)|2 = 0, which is equivalent to (ii). It is easily verified by a direct computation: E|X(t + T , ω)−X(t, ω)|2 = E|X(t + T , ω)|2 + E|X(t, ω)|2 − 2ReEX(t +T , ω)X(t, ω) = 2ρ(0) − 2Reρ(T ) = 2(ρ(0) − Reρ(T )) = 0 (by (i)).

(ii)⇒(i): (i) follows from (ii) because |ρ(u + T ) − ρ(u)|2 = |E[X(u + T , ω)X(0, ω) − X(u, ω)X(0, ω)]|2 E|X(u + T , ω) − X(u, ω)|2 · E|X(0, ω)|2 (Schwarz’s inequality) = 0 (by (ii)). (i)⇒(iii): If ρ(·) is a periodic function with period T , then

0 = 2ρ(0) − ρ(T ) − ρ(−T ) = 2

R

(1 − cos tT )dν(t).

Since 1 − cos tT 0, we must have ν(E) = 0 (E ∈ B(R)) if E contains no point t such that 1 − cos tT = 0; i.e. E ∩ {2kπ/T |k ∈ Z} = ∅. (iii)⇒(i): Assume (iii). Let ν be the spectral measure of X(t, ω); i.e. 1 ρ(u) = √ 2π

R

e−iut dν(t).

216

8 Periodic Weakly Stationary Processes

Fig. 8.4 Spectral distribution function

Writing ak = ν({2π k/T }) (k ∈ Z), we obtain ∞ 1 ρ(u) = √ ak e−iu·2π k/T 2π k=−∞

by (iii). This is clearly a T -periodic function. Remark 8.2 The spectral distribution function of X(t, ω) is defined by F (α) = ν((−∞, α]),

α ∈ R.

Then (iii) means that F (α) is a step function with possible discontinuities at {2kπ/T |k ∈ Z}. See Fig. 8.4. If X(t, ω) is a T -periodic weakly stationary process, the spectral measure concentrates on a countable set in T or R, informally called the energy set of X(t, ω), such that the distance of any adjacent two points is some multiple of 2π/T . We have to keep in mind that the periodic weakly stationary process can not have a spectral density function. Sargent [26] (Chap. XI) talks about the spectral density function in this case. It is, according to Sargent, zero on (2(k − 1)π/T , 2kπ/T ) and has a “spike” at a jump-point of F (α). This exposition is something like a parable and not correct mathematically. We need Schwartz’s distribution theory to grasp the situation exactly.

8.4 Orthogonal Measures Our crucial objects in this chapter include the spectral representation of a weakly stationary process and the absolute continuity of spectral measures with respect to the Lebesgue measure. For the analysis of these problems, the concept of orthogonal measures is indispensable. We now state its definition and basic properties. B(R) denotes the Borel σ -field on R, m is the Lebesgue measure on R, and B∗ (R) = {S ∈ B(R)|m(S) < ∞}. (Ω, E, P ) is a probability space.

8.4 Orthogonal Measures

217

Definition 8.8 A function ξ : B(R) × Ω → C is called an orthogonal measure (or L2 -orthogonal measure) if the following three conditions are satisfied: (i) The function ω → ξ(S, ω) is in L2 (Ω, C) for every S ∈ B∗ (R). ∞ (ii) If S1 , S2 , · · · ∈ B∗ (R) are mutually disjoint and Sn ∈ B∗ (R), then n=1

ξ

∞

∞ Sn , ω = ξ(Sn , ω) in L2 (Ω, C).17

n=1

(σ -additivity)

n=1

(iii) If S, S ∈ B∗ (R) and S ∩ S = ∅, then

Eξ(S, ω)ξ(S , ω) = 0 (orthogonality). If (i), (ii) and (iii) are satisfied on B(R) instead of B∗ (R), ξ(S, ω) is called a finite orthogonal measure. Let ξ be an orthogonal measure. The function ω → ξ(R, ω) may not be in L2 (Ω, C) in general. However, it is in L2 (Ω, C) if ξ is a finite orthogonal measure. The concept of orthogonal measures is similarly defined in the case of T = T. In this case, any orthogonal measure is automatically finite. Theorem 8.9 Let ξ : B(R) × Ω → C be a finite orthogonal measure. The set function νξ : B(R) → R defined by νξ (S) = ξ(S, ω) 22 ,

S ∈ B(R)

is a finite measure on (R, B(R)). Proof It is obvious that νξ (S) 0 for all S ∈ B(R). Furthermore, if Sn ∈ B(R) (n = 1, 2, · · · ) are mutually disjoint, then νξ

∞

Sn

∞ n 2 2 = ξ Sn , ω = lim ξ(Sj , ω)

n=1

2 (ii) n→∞

n=1

= lim

n

(iii) n→∞

ξ(Sj , ω) 22 =

j =1

j =1

∞

νξ (Sn ).

n=1

This proves the σ -additivity of νξ (·). νξ (R) < ∞ is clear by (i). 17 Hence

if the right-hand side is a finite sum, ξ

p j =1

2

p Sj , ω = ξ(Sj , ω) a.e. j =1

218

8 Periodic Weakly Stationary Processes

Theorem 8.9 does not hold for a general orthogonal measure. If νξ satisfies νξ (S) = cm(S)

S ∈ B∗ (R),

for all

ξ is called a Khintchine orthogonal measure. We now define the integration by means of a finite orthogonal measure. Let ϕ : R → C be a simple function of the form ϕ(x) =

p

cj ∈ C,

cj χSj (x),

Sj ∈ B(R),

(8.24)

j =1

Si ∩ Sj = ∅ (i j ),

m(Sj ) < ∞,

(χSj is the characteristic function of Sj ). m is the Lebesgue measure on R and it is also denoted by dx, dy, · · · in order to specify integrating variables. We define the integral of ϕ by

R

ϕ(x)ξ(dx, ω) =

p

cj ξ(Sj , ω).

(8.25)

j =1

Of course, the representation (8.24) of ϕ is not unique. However, the integral is welldefined in the sense that the value of (8.25) is uniquely determined independently of the representation (8.24). This can be proved in the same manner as for the ordinary Lebesgue integrals. n The integral of a linear combination αk ϕk (x) (αk ∈ C) of simple functions k=1

is defined by

n R k=1

αk ϕk (x)ξ(dx, ω) =

n

αk

k=1

R

ϕk (x)ξ(dx, ω).

Defining a measure νξ as in Theorem 8.9, we obtain 2

ϕ(x)ξ(dx, ω) = |ϕ(x)|2 νξ (dx). R

2

R

In fact, it is verified by left-hand side of (8.26) =

n k=1

|ck |2 νξ (Sk ) = right-hand side of (8.26).

(8.26)

8.4 Orthogonal Measures

219

The set of measurable functions f : R → C which is square-integrable with respect to νξ is denoted by L2νξ (R, C). For any f ∈ L2νξ (R, C), there exists a sequence {ϕn } of simple functions in 2 Lνξ (R, C) such that

f − ϕn 2νξ ,2 =

R

|f (x) − ϕn (x)|2 dνξ → 0

as

n → ∞.

(8.27)

By (8.26) and (8.27),

2

ϕm (x)ξ(dx, ω) − ϕn (x)ξ(dx, ω) R R 2

= |ϕm (x) − ϕn (x)|2 dνξ → 0 as n, m → ∞. R

That is, {ϕn } is Cauchy in L2νξ . Hence

R

ϕn (x)ξ(dx, ω)

converges to some element of L2νξ . We call this function the integral of f with respect to ξ(S, ω) and write it as

R

(8.28)

f (x)ξ(dx, ω).

It can be shown as usual that the limit (8.28) is determined independently of the choice of {ϕn } converging to f . The integration by an orthogonal measure has basic properties as follows: 1◦ For f1 , f2 , · · · , fp ∈ L2νξ (R, C), α1 , α2 , · · · , αp ∈ C,

p R k=1

αk fk (x)ξ(dx, ω) =

p k=1

αk

R

fk (x)ξ(dx, ω).

2◦ Let {fn } be a sequence which converges to f ∈ L2νξ (R, C) in L2νξ . Then

R

in L2 (Ω, C).

fn (x)ξ(dx, ω) →

R

f (x)ξ(dx, ω)

as

n→∞

220

8 Periodic Weakly Stationary Processes

The two formulas in Theorem 8.10 will be effectively made use of in later discussions. 2◦ stated above can be proved by them. Theorem 8.10 (D-K formulas)

E f (x)ξ(dx, ω) g(x)ξ(dx, ω) = f (x)g(x)dνξ (x) R

R

R

for any

(8.29) f, g ∈ L2νξ (R, C).

In particular,

2

|f (x)|2 dνξ (x) f (x)ξ(dx, ω) = 2

R

R

for f ∈

(8.30) L2νξ (R, C).

For the sake of later reference, we call (8.29) and (8.30) the the Doob–Kawata formulas (D–K formulas). Proof We prove (8.30) first. If f is a simple function in L2νξ , (8.30) is clear. If f is a general element of L2νξ , there exists a sequence {ϕn } of simple functions in L2νξ such that

f − ϕn 2νξ ,2 → 0

as

n→∞

( · νξ ,2 is the L2 -norm with respect to νξ ). It follows that

R

|ϕn (x)|2 dνξ (x) →

R

|f (x)|2 dνξ (x).

(8.31)

On the other hand,

2

2 ϕn (x)ξ(dx, ω) → f (x)ξ(dx, ω) , R

R

2

(8.32)

2

by the definition (8.28) of integration. Since the left-hand side of (8.31) and that of (8.32) are equal, (8.30) follows. Next we turn to (8.29). By (8.30), we obtain 2

(f (x) − g(x))ξ(dx, ω) = |f (x) − g(x)|2 dνξ . R

2

R

(8.33)

8.5 Spectral Representation of Weakly Stationary Processes

221

Both sides of (8.33) are evaluated as follows: 2

2

left-hand side of (8.33) = f (x)ξ(dx, ω) + g(x)ξ(dx, ω) R R 2 2

− 2ReE f (x)ξ(dx, ω) g(x)ξ(dx, ω). R

right-hand side of (8.33) =

R

|f (x)|2 dνξ (x) +

− 2Re

R

R

R

|g(x)|2 dνξ (x)

f (x)g(x)dνξ (x).

Hence by (8.30) and (8.33), we obtain

ReE

R

f (x)ξ(dx, ω) ·

R

g(x)ξ(dx, ω) = Re

R

f (x)g(x)dνξ (x).

(8.34)

Replacing f (x) − g(x) in (8.33) by if (x) − g(x), (8.34) holds good after changing f (x) to if (x); i.e.

ImE

R

f (x)ξ(dx, ω) ·

R

g(x)ξ(dx, ω) = Im

R

f (x)g(x)dνξ (x).

(8.35)

The formula (8.29) follows from (8.34) and (8.35).

8.5 Spectral Representation of Weakly Stationary Stochastic Processes: Cramér–Kolmogorov Theorem In Sect. 8.3, we discussed the problem of representing the convariance function of a weakly stationary stochastic process by the Fourier transform of some positive Radon measure. In this section, we proceed to the Fourier analytic representation of a weakly stationary process itself. Theorem 8.11 (Cramér–Kolmogorov)18 Let X : R × Ω → C be a measurable and weakly stationary process with EX(t, ω) = a (for all t ∈ R). Then there exists an orthogonal measure ξ : B(R) × Ω → C which satisfies 1 X(t, ω) = a + √ 2π 18 According

R

e−iλt ξ(dλ, ω).

(8.36)

to Itô [10] p. 255, Kolmogorov’s important article was published in C.R. Acad. Sci. URSS, 26 (1940), 115–118. However, I have never read it, very regrettably. That is why I dropped it from the reference list.

222

8 Periodic Weakly Stationary Processes

Conversely, the stochastic process X(t, ω) represented by the above formula in terms of an orthogonal measure ξ(S, ω) is weakly stationary. ξ is uniquely determined corresponding to X. The same result holds good for the case T = Z. In this case, B(R) should be replaced by B(T) and the scope of integration in (8.36) should be T instead of R. Remark 8.3 1◦ Since we are assuming the measurability of X(t, ω), X(t, ω) is strongly continuous by Crum’s theorem (Theorem 8.5). 2◦ Let M be a complete metric space and D a dense subset in M. If a function f : D → D is an isometric (and so uniformly continuous) surjection, f is, of course, a bijection. By extension by continuity,19 f can be uniquely extended to an isometric function of M into M. We denote the extended function by the same notation f . f : M → M is an isometric injection. It can be proved that f is also a surjection. Let y be any point of M. Since D¯ = M, there exists a sequence {yn } in D which converges to y: yn → y

as

n → ∞,

yn ∈ D.

Since f : D → D is a bijection, there exists a unique xn ∈ D, for each n, such that f (xn ) = yn . Since {yn } is convergent and f is isometry, {xn } is Cauchy in D. Hence, by the completeness of M, {xn } converges to some x ∈ M: xn → x

as

n → ∞.

f (xn ) → f (x) and f (xn ) = yn → y imply y = f (x). This proves that f : M → M is surjective. We make use of this reasoning in the proof. Proof of Theorem 8.11 We may assume a = 0 without loss of generality. Otherwise we apply our reasoning below to X (t, ω) = X(t, ω) − a. If we define M = span{X(t, ω)|t ∈ R},

¯ H(X) = M,

an element of M has the form n

ai X(ti , ω),

ai ∈ C.

i=1

H(X) is a closed subspace of L2 (Ω, C), and is itself a Hilbert space.

19 See

Maruyama [21] pp. 74–76.

8.5 Spectral Representation of Weakly Stationary Processes

223

We now proceed to define an operator Uτ (τ ∈ R) on H(X). To start with, we define Uτ X(t, ω) = X(t + τ, ω). If W (ω) ∈ M is expressed in the form W (ω) =

n

(8.37)

ai X(ti , ω),

i=1

Uτ W is defined by Uτ W (ω) =

n

ai X(ti + τ, ω).

i=1

We have to check that Uτ W is well-defined in the sense that the value defined above is independent of the expression (8.37) of W (ω). Let W (ω) =

m

bj X(sj , ω)

j =1

be another expression of W (ω). We have to show n

ai X(ti + τ, ω) =

m

bj X(sj + τ, ω).

(8.38)

j =1

i=1

Taking account of a = 0, we have (·, · is the inner product of L2 , ρ is the covariance)

αi X(ti + τ, ω),

i

βj X(sj + τ, ω) =

j

=

αi β¯j ρ(ti − sj )

i,j

αi X(ti , ω),

i

βj X(sj , ω) ,

(8.39) αi , βj ∈ C,

j

from which 2 2 αi X(ti + τ, ω) = αi X(ti , ω) i

2

i

2

(8.39 )

224

8 Periodic Weakly Stationary Processes

follows. As a special case of (8.39 ), it follows that 2 2 m n n n ai X(ti +τ, ω)− bj X(sj +τ, ω) = ai X(ti , ω)− bj X(sj , ω) . i=1

2

j =1

2

j =1

i=1

Since the right-hand side is 0, (8.38) is verified. Thus the operator Uτ is defined on M, and its images are also contained in M. Uτ : M → M. We list elementary properties of Uτ : 1◦ (linearity) Uτ (αV + βW ) = αUτ V + βUτ W, α, β ∈ C; V , W ∈ M. 2◦ (group) Uτ Uθ W = Uτ +θ W, W ∈ M. U0 = I (identity). 3◦ (isometry) Uτ V , Uτ W = V , W , V , W ∈ M. In particular, Uτ V 22 =

V 22 (by (8.39), (8.39 )). 4◦ (surjection) Any element n

ai X(ti , ω)

i=1

of M is a value of Uτ since Uτ

n

n

ai X(ti − τ, ω) =

i=1

ai X(ti , ω).

i=1

Uτ can be uniquely extended to an isometric linear operator on H(X). This extension is also denoted by the same notation Uτ . By Remark 8.3, 2◦ (p. 222), Uτ : H(X) → H(X),

τ ∈R

is a surjection. That is, Uτ is a unitary operator on H(X) and {Uτ |τ ∈ R} is a oneparameter group. We next show the strong continuity of this one-parameter group; i.e. for τ0 ∈ R s- lim Uτ V = Uτ0 V τ →τ0

for all

V ∈ H(X).

For any V ∈ H(X) and ε > 0, there exists some W ∈ M such that

V − W 2 <

ε . 3

(8.40)

8.5 Spectral Representation of Weakly Stationary Processes

225

If W (ω) has the form W (ω) =

n

ai X(ti , ω),

i=1

we have the evaluation

Uτ W − Uτ0 W 2

n

|ai | · X(ti + τ, ω) − X(ti + τ0 , ω) 2 .

i=1

By the strong continuity of X (Remark 8.3, 1◦ on p. 222),

Uτ W − Uτ0 W 2 <

ε 3

(8.41)

provided that τ and τ0 are sufficiently near.

Uτ V − Uτ0 V 2 Uτ V − Uτ W 2 + Uτ W − Uτ0 W 2 + Uτ0 W − Uτ0 V 2 . The first and third terms of the right-hand side are less than ε/3 by the isometry of Uτ and Uτ0 together with (8.40). And the second term is also less than ε/3 by (8.41), provided that |τ − τ0 | is very small. Hence

Uτ V − Uτ0 V 2 < ε provided that |τ − τ0 | is sufficiently small. This holds good for every V ∈ H(X). Thus it is confirmed that {Uτ } is a strongly continuous one-parameter group of unitary operators on H(X). According to Stone’s theorem (Theorems 7.13, 7.15, pp. 188–189), it has a representation 1 Uτ = √ 2π

R

e−iτ λ E(dλ),

(8.42)

where E(·) is a resolution of the identity on (R, B(R)), the values of which are bounded symmetric operators. The more detailed expression of (8.42) is 1 Uτ V , W = √ 2π

R

e−itλ E(dλ)V , W ,

V , W ∈ H(X).

Define a function ξ : B(R) × Ω → C by ξ(S, ω) = E(S)X(0, ω),

S ∈ B(R).

226

8 Periodic Weakly Stationary Processes

We also define νξ (S) (S ∈ B(R)) by20 νξ (S) = E(S)X(0, ω), X(0, ω)L2 (Ω,C) = E 2 (S)X(0, ω), X(0, ω)

(E(S)is a projection)

= E(S)X(0, ω), E(S)X(0, ω) = ξ(S, ω) 22

(8.43)

(E(S)is symmetric)

(by the definition of ξ ),

S ∈ B(R),

where ·, · is the inner product of L2 (Ω, C). Then ξ is an L2 -orthogonal measure. In fact, the set function S → ξ(S, ω) is σ -additive for any fixed ω ∈ Ω. The function ω → ξ(S, ω) is square-integrable for any fixed S ∈ B(R). It remains to show the orthogonality. For S1 , S2 ∈ B(R), ξ(S1 , ω), ξ(S2 , ω) = E(S1 )X(0, ω), E(S2 )X(0, ω) = E(S2 )E(S1 )X(0, ω), X(0, ω) = E(S1 ∩ S2 )X(0, ω), X(0, ω) = νξ (S1 ∩ S2 )

(E(·) is symmetric) (3◦ on p. 177)

(by (8.43)).

Hence if S1 ∩ S2 = ∅, Eξ(S1 , ω)ξ(S2 , ω) = νξ (S1 ∩ S2 ) = 0. Finally, X(t, ω) can be represented as21 1 X(t, ω) = Ut X(0, ω) = √ 2π

R

e

−iλt

1 E(dλ)X(0, ω) = √ 2π

R

e−iλt ξ(dλ, ω).

This completes the proof. The relation ν(S) = νξ (S) = ξ(S, ω) 22 (8.43)

20 ν (S) = ξ(S, ω) 2 ξ 2 21 The last equality n

αi χSi (λ)

is a special case of Theorem 7.8(i) (p. 177). is verified as follows. We consider a simple function ϕ(λ)

(Si ∩ Sj = ∅

if i j ).

i=1

R

ϕ(λ)dE(λ)X(0, ω) =

n i=1

αi E(Si )X(0, ω) =

n

αi ξ(Si , ω) =

i=1

The integration of e−iλt is a limit of such integrations of simple functions.

R

ϕ(λ)ξ(dλ, ω).

=

8.6 Spectral Density Functions

227

holds good for the covariance ρ(u) and the spectral measure ν. In fact, ρ(u) = X(u, ω), X(0, ω)

1 = √ e−iλu E(dλ)X(0, ω), X(0, ω) 2π R

1 = √ e−iλu dνξ (λ). 2π R Consequently, ν = νξ by the uniqueness of the spectral measure.

8.6 Spectral Density Functions As we saw in previous sections, any periodic weakly stationary process can not have spectral density functions. Hence a problem arises: under what conditions does a weakly stationary process have a spectral density function?22 Let ξ : B(T) × Ω → C be an orthogonal measure with Eξ(S, ω) = 0 for any S ∈ B(T). There exist a probability space (Ω , E , P ) and a weakly stationary process Yn (ω ) : Ω → C(n ∈ Z), the spectral measure of which is the Lebesgue measure m on T (by Theorem 8.7). We denote by Eω (resp. Eω ) the expectation operator on (Ω, E, P ) (resp. (Ω , E , P )). The Cramér–Kolmogorov theorem assures the representation 1 Yn (ω ) = √ 2π

T

e−iλn η(dλ, ω )

for some orthogonal measure η : B(T)×Ω → C. (We assume that Eω Yn (ω ) = 0.) η satisfies: (a) Eω η(S, ω ) = 0 for any S ∈ B(T), and (b) Eω |η(S, ω )|2 = m(S) for any S ∈ B(T). In order to express some relations between ξ and η, we need the “adjunction” method23 as follows. E(ω,ω ) denotes the expectation operator on the product probability space (Ω × Ω , E ⊗ E , P ⊗ P ). 1(ω) (resp. 1(ω )) denotes the constant function which is identically 1 on Ω (resp. on Ω ). Then the following conditions hold good:

22 This

problem was studied by Doob [4] Chap. X, §8, Chap. XI, §8 and Kawata [17] pp. 69–73. I try to clarify the subtle details embedded in their works. cf. Maruyama [23]. 23 See Doob [4] Chap. II, §2.

228

8 Periodic Weakly Stationary Processes

(i) E(ω,ω ) η(S, ω )1(ω) = Eω η(S, ω ) = 0 for any S ∈ B(T). (ii) E(ω,ω ) |η(S, ω )1(ω)|2 = νη (S) = m(S) for any S ∈ B(T). (iii) E(ω,ω ) ξ(S, ω)1(ω ) · η(S , ω )1(ω) = Eω ξ(S, ω)Eω η(S , ω ) = 0 for any S and S ∈ B(T). Theorem 8.12 Let Xn (ω) (n ∈ Z) be a weakly stationary process with the spectral measure ν. Assume also that EXn (ω) = 0. Then the following two statements are equivalent: (i) Xn (ω) has the spectral density function. (ii) Xn (ω) is a linear stochastic process; that is, there exist a sequence {cn } of complex numbers such that ∞

|cn |2 < ∞

n=−∞

and a stochastic process Zn (ω)(n ∈ Z) with EZn (ω) = 0,

E|Zn (ω)|2 = 1, (8.44)

EZn (ω)Zm (ω) = 0 if

nm

which satisfy ∞

Xn (ω) =

ck−n Zk (ω)

a.e.

k=−∞

(The convergence on the right-hand side is in L2 (Ω, C).) Proof (i)⇒(ii): Let p(λ) 0 be the Radon–Nikodým derivative of ν. We also define α(λ) = p(λ). Since α(λ) ∈ L2 (T, C) (actually real-valued), it can be expanded by the Fourier series24 : ∞ 1 αk eikλ , α(λ) ∼ √ 2π k=−∞ ∞

(8.45)

|αk |2 < ∞,

k=−∞

where αk is the Fourier coefficient corresponding to (1/

√

2π )eikλ (k ∈ Z).

24 The convergence of the series in (8.45) is in L2 (T, C). However, the series is, actually, convergent

a.e. and is equal to α(λ) according to Carleson’s theorem. cf. Carleson [2].

8.6 Spectral Density Functions

229

By the Cramér–Kolmogorov theorem, Xn (ω) is represented as the Fourier transform of some orthogonal measure ξ(S, ω). According to the argument preceeding the theorem, there exist a probability space (Ω , E , P ) and an orthogonal measure η(S, ω ) : B(T)×Ω → C which satisfies (i), (ii) and (iii) mentioned above (p. 228). Divide T into T+ and T0 , defined as T+ = {t ∈ T|α(t) > 0}, T0 = {t ∈ T|α(t) = 0}. We define a couple of functions, α1 (λ) and α2 (λ), by + α1 (λ) = + α2 (λ) =

1 α(λ)

on T+ ,

0

on T0 ,

0

on T+ ,

1

on T0 .

Furthermore, define a function γ : B(T) × Ω × Ω → C by25 γ (S, ω, ω ) =

α1 (λ)ξ(dλ, ω)1(ω ) + S

α2 (λ)η(dλ, ω )1(ω)

S

=

α1 (λ)ξ(dλ, ω) + S

(8.46) α2 (λ)η(dλ, ω ).

S

For the definition (8.46) to be possible, we have to check that α1 (·) ∈ L2νξ and α2 (·) ∈ L2νη , where νξ (S) = Eω |ξ(S, ω)|2

νη (S) = Eω |η(S, ω )|2 ,

and

S ∈ B(T),

respectively. But there is no difficulty in checking it as shown in the following:

1 p(λ)dm(λ) α(λ)2

|α1 (λ)|2 dνξ = S

S∩T+

dm(λ) = m(S ∩ T+ ) < ∞,

= S∩T+

(8.47)

|α2 (λ)|2 dνη = S

dm(λ) = m(S ∩ T0 ) < ∞. S∩T0

the case T0 = ∅ (and so α(λ) never vanishes), the discussion becomes much easier, since it is enough to define γ (S, ω) simply by

1 γ (S, ω) = ξ(dλ, ω) S α(λ)

25 In

for any S ∈ B(T). Clearly, νγ (S) = m(S).

230

8 Periodic Weakly Stationary Processes

γ (S, ω, ω ) is an orthogonal measure. For instance, the orthogonality is proved as follows. Let S and S ∈ B(T) be disjoint. Then we obtain E(ω,ω ) γ (S, ω, ω )γ (S , ω, ω )

= Eω α1 (λ)ξ(dλ, ω) α1 (λ)ξ(dλ, ω) S

S

+ E(ω,ω )

α1 (λ)ξ(dλ, ω) S

α2 (λ)η(dλ, ω )

+ E(ω,ω ) S

α2 (λ)η(dλ, ω )

+ Eω

S

S

S

α2 (λ)η(dλ, ω ) (8.48)

S

α1 (λ)ξ(dλ, ω)

α2 (λ)η(dλ, ω ).

The second and third terms of (8.48) are zero because of the orthogonality of ξ and η in the sense of (iii) (p. 228). The first and the fourth terms are also zero because S ∩ S = ∅. By a computation similar to (8.47) and (8.48), we have26 E(ω,ω ) |γ (S, ω, ω )|2 = m(S),

S ∈ B(T).

Finally, a function γ : B(T) × Ω → C is defined by γ (S, ω) = Eω γ (S, ω, ω ).

26

E(ω,ω ) |γ (S, ω, ω )|2

2 = Eω α1 (λ)ξ(dλ, ω) S

2 + Eω α2 (λ)η(dλ, ω ) S

S

S

|α2 (λ)|2 νη (dλ) S

+ 2ReE(ω,ω )

α2 (λ)η(dλ, ω )1(ω) S

|α1 (λ)|2 νξ (dλ) +

=

α1 (λ)ξ(dλ, ω)1(ω )

+ 2ReE(ω,ω )

α2 (λ)η(dλ, ω )

α1 (λ)ξ(dλ, ω) S∩T+

= m(S ∩ T+ ) + m(S ∩ T0 ) + 0.

S∩T0

(8.49)

8.6 Spectral Density Functions

231

Taking account of the properties of γ , γ is shown to be an orthogonal measure with νγ (S) = m(S ∩ T+ ). The Cramér–Kolmogorov representation theorem gives27 1 Xn (ω) = √ 2π 1 = √ 2π

T

T

e−inλ ξ(dλ, ω) (8.50) e−inλ α(λ)γ (dλ, ω).

Since the Fourier series (8.45) converges to α(λ) in L2 , it follows that Xn (ω) =

∞

1 αk e−i(n−k)λ γ (dλ, ω) a.e. 2π T

(8.51)

k=−∞

In fact, (8.51) is verified by the computation: p

2 1 Eω Xn (ω) − αk ei(k−n)λ γ (dλ, ω) 2π T k=−p

p 2 1 = Eω Xn (ω) − αk ei(k−n)λ γ (dλ, ω) 2π T k=−p

1 = Eω √ 2π

p 2 ! 1 e−inλ α(λ) − √ αk eikλ γ (dλ, ω) 2π k=−p T

(by (8.50))

27

T

e−inλ α(λ)γ (dλ, ω)

1 e−inλ α(λ) · α2 (λ)η(dλ, ω ) ξ(dλ, ω) + Eω α(λ) T+ T0

= e−inλ ξ(dλ, ω) = e−inλ ξ(dλ, ω). e−inλ α(λ) ·

=

T+

T

The final equality is justified by

2

Eω e−inλ ξ(dλ, ω) = T0

T0

=

T0

νξ (dλ) (by D-K formula)

p(λ)dm = 0 (p(λ) = 0 on T0 ).

232

8 Periodic Weakly Stationary Processes

=

1 2π

=

1 2π

1 2π

→0

p 2 1 − αk eikλ dνγ (λ) √ α(λ) 2π k=−p T

(by D-K formula)

p 2 1 − αk eikλ dm(λ) √ α(λ) 2π T+ k=−p

(by νγ (S) = m(S ∩ T+ ))

p 2 1 αk eikλ dm(λ) α(λ) − √ 2π T k=−p p→∞

as

(by (8.45)).

If we define 1 Zj (ω) = √ 2π and ck = (1/

√

T

eij λ γ (dλ, ω),

2π )αk , then

Xn (ω) =

∞

∞

ck Zk−n (ω) =

cn+j Zj (ω)

(in L2 ).

j =−∞

k=−∞

It is easy to check that Zn (ω) satisfies the condition (8.44). EZj (ω) = 0 is clear from the definition (8.49) of γ (E = Eω ). The variance = 1 and the orthogonality come from 2π EZn (ω)Zm (ω)

= E einλ γ (dλ, ω) e−imλ γ (dλ, ω)

=

T

T

ei(n−m)λ dνγ =

T

T+

ei(n−m)λ dm(λ)

=

T

ei(n−m)λ dm(λ) −

T0

ei(n−m)λ dνγ =

T

ei(n−m)λ dm(λ)

= δn,m × 2π. The second equality is justified by D-K formula. (ii)⇒(i): Conversely, assume that Xn (ω) is a moving average of a stochastic process Zn (ω) which satisfies (8.44). Since Zn (ω) is weakly stationary, there exists an orthogonal measure ξ(S, ω) on B(T) × Ω such that 1 Zn (ω) = √ 2π

T

e−inλ ξ(dλ, ω)

8.6 Spectral Density Functions

233

by the Cramér–Kolmogorov theorem. If we define νξ (S) = ξ(S, ω) 22 as usual, νξ √ is the spectral measure of Zn (ω), which has the spectral density function 1/ 2π (constant function).28 Consequently, we obtain, for some {cn } ∈ l2 (C), that

N 1 Xn (ω) = l.i.m. √ ck−n e−ikλ ξ(dλ, ω) N →∞ 2π T k=−N (8.52) 1 = l.i.m. √ N →∞ 2π

N

T k=−N

ck−n e−ikλ ξ(dλ, ω).

Since {cn } ∈ l2 (C), cn is the Fourier coefficient of some C(λ) ∈ L2 (T, C), and29 q 1 ck e−ikλ − C(λ) → 0 √ 2 2π k=−p

as

p, q → ∞.

Hence (writing k − n = j ), N

ck−n e−ikλ = e−inλ

√

2π e−inλ C(λ) in L2 as N → ∞ for fixed n; i.e.

N √ ck−n e−ikλ − 2π e−inλ C(λ) → 0 2

k=−N

28 Since

29 The

cj e−ij λ

j =−N −n

k=−N

tends to

N −n

as

N → ∞.

(8.53)

Zn (ω) is a white noise, the covariance is given by + 1 if u = 0, EZn+u (ω)Zn (ω) = 0 if u 0.

convergence of (1/

√

2π )

q

ck e−ikλ to C(λ) also holds good “almost everywhere”

k=−p

thanks √ to the Carleson theorem. Hence ck is the Fourier coefficient of C(λ) corresponding to (1/ 2π )e−ikλ .

234

8 Periodic Weakly Stationary Processes

Taking account of the fact that the density function of νξ is 1/ above, we have

2 EXn (ω) − C(λ)e−inλ ξ(dλ, ω)

√

2π as remarked

T

N 2 1 = El.i.m. √ ck−n e−ikλ ξ(dλ, ω)− C(λ)e−inλ ξ(dλ, ω) (by (8.52)) N →∞ 2π T T k=−N

1 E N →∞ 2π T

= lim

1 N →∞ 2π

= lim

N −n

cj e−ij λ −

T j =−N −n

1 √ N →∞ T 2π

√

2 ! 2π C(λ) e−inλ ξ(dλ, ω)

j =−N −n N −n

= lim

cj e−ij λ −

2 √ 2π C(λ) dνξ

(by D-K formula)

2 1 cj e−ij λ − C(λ) √ dm(λ) 2π j =−N −n N −n

= 0 (by (8.53)). Hence we obtain 1 Xn (ω) = √ 2π

√ T

2π C(λ)e−inλ ξ(dλ, ω),

a.e.

If we define θ (S, ω) by θ (S, ω) =

√ 2π C(λ)ξ(dλ, ω),

S ∈ B(T),

S

then θ (S, ω) is an orthogonal measure30 and 1 Xn (ω) = √ 2π

30 The

T

e−inλ θ (dλ, ω).

orthogonality, for instance, can be verified as follows. If S and S ∈ B(T) are disjoint,

Eθ(S, ω)θ(S , ω) = E C(λ)χS (λ)ξ(dλ, ω) C(λ)χS (λ)ξ(dλ, ω) T

=

T

T

|C(λ)|2 χS (λ)χS (λ)dνξ = 0 (by D-K formula).

8.6 Spectral Density Functions

235

This is the spectral representation of Xn (ω) in terms of θ (S, ω). Consequently, the spectral measure ν of Xn (ω) is given by

ν(S) = E|θ (S, ω)|2 = 2π =

√

|C(λ)|2 dνξ S

|C(λ)|2 dm(λ),

2π S

which is, of course, absolutely continuous with respect to m.

We prepare a lemma to be used in the proof of Theorem 8.13. We denote by F2 (resp. F2−1 ) the Fourier transform (resp. inverse Fourier transform) in the sense of Plancherel.31 Lemma 8.1 1 √ 2π

1 f (z)g(z)e−iuz dm(z) = √ F2−1 f (λ − u), F2−1 g(λ) ¯ 2π R for any f and g ∈ L2 (R, C).

(We have to note f · g ∈ L1 (R, C). ·, · denotes the inner product in L2 (R, C).) Proof By definition of the inverse Fourier transform in the sense of Plancherel, we have 1 F2−1 (f (z)e−izu )(λ) = l.i.m. √ A→∞ 2π 1 = l.i.m. √ A→∞ 2π

A −A A −A

f (z)e−izu eizλ dm(z) f (z)ei(λ−u)z dm(z)

= F2−1 f (λ − u). Taking account of f · g ∈ L1 , we obtain 1 √ 2π

R

f (z) · g(z)e−izu dm(z) 1 = √ f (z)e−izu , g(z) ¯ 2π

31 We

can also establish 1 √ 2π

R

1 f (λ)g(λ)e−izλ dm(λ) = √ (F2 f ∗ F2 g)(z). 2π

(cf. Kawata [15] pp. 282–283.)

(8.54)

236

8 Periodic Weakly Stationary Processes

1 = √ F2−1 (f (z)e−izu )(λ), F2−1 g(λ) ¯ 2π (by Parseval equality) 1 = √ F2−1 f (λ − u), F2−1 g(λ) ¯ (by (8.54)) . 2π Theorem 8.13 Let X(t, ω)(t ∈ R) be a measurable and weakly stationary process with the spectral measure ν. Assume also that EX(t, ω) = 0. Then the following two statements are equivalent: (i) (ii)

X(t, ω) has the spectral density function. There exists a measurable and weakly stationary process X : R × Ω → C of the form

X (t, ω) =

R

w(λ − t)γ (dλ, ω) a.e. (ω),

the spectral measure of which is ν, where w(·) ∈ L2 (R, C) and the orthogonal measure γ (S, ω) : B(R) × Ω → C satisfies νγ (S) = E|γ (S, ω)|2 = m(S ∩ R+ ), S ∈ B∗ (R). (The definition of R+ is given below.) Proof By the Cramér–Kolmogorov theorem, there exists an orthogonal measure ξ(S, ω) on B(R) × Ω such that 1 X(t, ω) = √ 2π

e−iλt ξ(dλ, ω).

R

(i)⇒(ii): Let p(λ) 0 be the Radon–Nikodým derivative of ν. If we define α(λ) =

p(λ),

then α ∈ L2 (R, R). The covariance ρ(u) of X(t, ω) can be written as 1 ρ(u) = √ 2π

R

e

−iuz

1 dν(z) = √ 2π

R

e−iuz α(z) · α(z)dm(z) (8.55)

1 = √ F2−1 α(λ − u), F2−1 α(λ). 2π The third equality is assured by Lemma 8.1.

8.6 Spectral Density Functions

237

Divide R into R+ and R0 defined as R+ = {t ∈ R|α(t) > 0}, R0 = {t ∈ R|α(t) = 0}. As in Theorem 8.12, we introduce the orthogonal measure η(S, ω) : B(R) × Ω → C, two functions α1 (·) and α2 (·), and another orthogonal measure γ (S, ω). We can construct η and γ so as to be νη (S) = m(S),

νγ = m(S ∩ R+ )

for

S ∈ B(R).

Note that the values of νη and νγ admit ∞. Since α(·) is in L2 with respect to the Lebesgue measure, it is also in L2 with respect to νγ . If we define a weakly stationary process X (t, ω) by 1 X (t, ω) = √ 4 2π

R

α(λ − t)γ (dλ, ω),

then the covariance ρ (u) of X (t, ω) is32 ρ (u) = EX (t + u, ω)X (t, ω)

1 = √ E α(λ − (t + u))γ (dλ, ω) α(λ − t)γ (dλ, ω) 2π R R

1 = √ α(λ − (t + u))α(λ − t)dνγ (λ) 2π R

1 = √ F2−1 α(λ − u)F2−1 α(λ)dm(λ). 2π R 32 The

(8.56) (by D-K formula)

third line of (8.56)

1 α(λ − (t + u))α(λ − t)dm(λ) + α(λ − (t + u))α(λ − t) dνγ = √ 2π R+ R0 789: (†)

1 = √ (†)dm(λ) − (†)dm(λ) + (†)dνγ 2π R R0 R0

1 (†)dm(λ) − (†)dm(λ) − (†)dm(λ) = √ λ∈R0 λ∈R0 2π R λ−t∈R+ λ−t∈R0

+ (†)dνγ (λ) + (†)dνγ (λ) 1 = √ 2π

λ∈R0 λ−t∈R+

λ∈R0 λ−t∈R0

R

(†)dm(λ) −

+

(R0 −t)∩R+

(R0 −t)∩R+

α(λ − u)α(λ )dm(λ )

α(λ − u)α(λ )dνγ (λ ) .

The last two terms cancel out. So we obtain (8.56).

238

8 Periodic Weakly Stationary Processes

By (8.55) and (8.56), we get ρ = ρ . That is, the spectral density function of is p(·). (ii)⇒(i): Conversely, assume (ii). The covariance ρ(u) of

X (t, ω)

X(t, ω) =

R

w(λ − t)γ (dλ, ω)

is given by

ρ(u) =

∞

−∞

F2−1 w(λ − u)F2−1 w(λ)dm(λ)

as in the calculation of ρ (·) above. Then we get, again by Lemma 8.1,

ρ(u) =

R

F2−1 w(λ − u)F2−1 w(λ)dm(λ)

= F2−1 w(λ − u), F2−1 w(λ)

|w(z)|2 e−iuz dm(z) = R

1 = √ 2π Finally, if we define p(z) = of X(t, ω).

√ 2π |w(z)|2 e−iuz dm(z).

√

R

2π |w(z)|2 , p(z) is the spectral density function

8.7 A Note on Slutsky’s Work So far, we have investigated the periodic behaviors of weakly stationary stochastic processes.33 In particular, most of this chapter is devoted to a characterization of the periodicity by means of the concept of spectral measures. As we have already noted, it is E. Slutsky who initiated research on the periodic behaviors of moving average processes generated by white noises. He discovered a special case of moving average which asymptotically approaches to some sine or cosine curve. However, his work is rather experimental and is, regrettably, devoid of any general principle which characterizes the periodicity. This shortcoming seems to accrue from the lack of Fourier analytic viewpoint.

33 This

section was added according to Professor S. Kusuoka’s suggestion. It is not included in the Japanese edition of this monograph.

8.7 A Note on Slutsky’s Work

239

In this section, we make a digression to a brief exposition of what Slutsky did in his pioneering work. Let V (n) be a function of discrete time n ∈ Z. The operator B defined by BV (n) = V (n − 1),

n∈Z

(8.57)

is called the backward operator or lag operator.34 B k denotes the repeated applications of B k-times; i.e. B k V (n) = V (n − k),

n ∈ Z.

(8.58)

The backward operator is sometimes used in a polynomial form: κ(B) = a0 + a1 B + · · · + ap B p ,

(8.59)

i.e. κ(B)V (n) = a0 V (n) + a1 V (n − 1) + · · · + ap V (n − p). Let (Ω, E, P ) be a probability space and Xn : Ω → R(n ∈ Z) a white noise. Slutsky [28] constructs the following stochastic process based upon Xn . First take two-period moving sums: Xn(1) =Xn + Xn−1 = (1 + B)Xn , (1)

Xn(2) =Xn(1) + Xn−1 = (1 + B)2 Xn ,

(8.60)

··············· ··············· (μ−1)

Xn(μ) =Xn(μ−1) + Xn−1

= (1 + B)μ Xn .

Then take the second differences consecutively: (μ)

ΔXn(μ) =Xn(μ) − Xn−1 = (1 − B)Xn(μ) , (μ)

Δ2 Xn(μ) =ΔXn(μ) − ΔXn−1 = (1 − B)2 Xn(μ) , ··············· ··············· (μ)

Δν Xn(μ) =Δν−1 Xn(μ) − Δν−1 Xn−1 = (1 − B)ν Xn(μ) . μ and ν are some natural numbers. 34 For

backward operators, see Granger and Newbold [7] Chap. 1.

(8.61)

240

8 Periodic Weakly Stationary Processes

Slutsky proposes a weakly stationary stochastic process Ynμ,ν (ω) = Δν Xn(μ) (ω) = (1 − B)ν (1 + B)μ Xn (ω),

(8.62)

where μ and ν are parametric natural numbers. This is a candidate, the behavior of which approaches asymptotically to a simple periodic curve, like sine/cosine. Before we proceed to Slutsky’s work, let us evaluate the second difference of a simple periodic function, say yn = A cos αn

(A > 0),

(8.63)

for the sake of comparison: Δ2 yn = yn − 2yn−1 + yn−2 = A{cos αn − 2 cos α(n − 1) + cos α(n − 2)} αn − αn + 2α αn + αn − 2α cos − 2 cos α(n − 1) = A 2 cos 2 2

(8.64)

= 2A cos α(n − 1)(cos α − 1) = 2A(cos α − 1)yn−1 . Since | cos α| 1, we obtain −a ≡ 2A(cos α − 1) 0 and (8.64) can be written as Δ2 yn = −ayn−1 .

(8.65)

Let Zn : Ω → R be a weakly stationary stochastic process with EZn = 0 and E|Zn |2 = σ 2 (constant independent of n). If Zn is exactly equal to yn , the second difference of Zn is given by Δ2 Zn = −aZn−1 . However, in general, this equality is not satisfied. The gap Gn between them, i.e. Δ2 Zn = −aZn−1 + Gn , is interpreted as a measure of deviation of Zn from the periodic curve yn = A cos αn in the spirit of Slutsky. The value of a which minimizes E(G2n ) = E(Δ2 Zn + aZn−1 )2 is given by a=− =−

EΔ2 Zn · Zn−1 EΔ2 Zn · Zn−1 = − 2 σ2 EZn−1 1 {E(Zn − 2Zn−1 + Zn−2 )Zn−1 } σ2

(8.66)

8.7 A Note on Slutsky’s Work

241

1 2 {EZn Zn−1 − 2EZn−1 + EZn−1 Zn−2 } σ2 1 = − 2 {2ρ(1) − 2σ 2 } σ ρ(1) =2 1− 2 . σ

=−

(8.67)

ρ(1) is the covariance of Zn ; i.e. ρ(1) = EZm Zm−1 for any m ∈ Z. Substituting (8.67) into (8.66), we obtain E(G2n )

2 ρ(1) 2 = E Δ Zn + 2 1 − 2 Zn−1 . σ

(8.68)

Since EGn = 0, EG2n is the variance of Gn . Hence, for any t > 0, P {ω ∈ Ω| |Gn | t}

E(G2n ) t2

(8.69)

as is well-known (Tchebichev’s inequality).35 As we have already seen, Slutsky proposes a weakly stationary stochastic process Ynμ,ν (ω) = Δν Xn(μ) (ω) = (1 − B)ν (1 + B)μ Xn (ω),

(8.62)

where μ and ν ∈ N ∪ {0} are parameters. This is a candidate, the behavior of which approaches asymptotically to a simple periodic curve like sine/cosine curves, as the parameters, μ and ν, go to infinity in a specific manner. Slutsky obtained a couple of theorems. The first theorem elucidates a set of conditions under which a sequence of stochastic processes converges (in some sense) to a simple periodic function. The second theorem proves the convergence μ,ν of Yn defined by (8.62) to a periodic function by means of the first theorem. μ

[I] Suppose that {Zn (ω)} is a sequence of (real-valued) stochastic processes which satisfy the conditions a–d: μ

a. EZn = 0 for all n ∈ Z and μ ∈ N ∪ {0}. μ b. E|Zn |2 = σμ2 (constant independent of n). μ μ μ c. The autocorrelation rm between Zn and Zn−m is denoted by μ

μ = rm μ

μ

EZn Zn−m (independent of n). σμ2

In particular, |r1 | C < 1 for sufficiently large μ. 35 Loève

[19] p. 11.

242

8 Periodic Weakly Stationary Processes μ

μ

d. The correlation coefficient between Δ2 Zn and Zn−1 , that is μ

μ

μ

EΔ2 Zn · Zn−1

? = μ μ E(Δ2 Zn )2 E(Zn−1 )2

μ

EΔ2 Zn · Zn−1 σμ2

tends to −1 as μ → ∞. μ

Then there exists some sinusoid A cos 2π n/T for which the sequence {Gn } of μ μ μ gaps (defined by Gn = Δ2 Zn + a μ Zn−1 as above) converges to zero in probability; that is, for any ε, η > 0, there exists some μ0 such that μ Gn ε η P ω ∈ Ω σμ

for all μ μ0 .

μ,ν

defined above has two parameters, μ and ν. The stochastic process Yn However, if we fix the ratio ν/μ at a constant, μ (or ν) may be regarded as the only parameter associated with the process. Thus we denote the process simply by μ μ,ν Yn instead of Yn . μ Slutsky proves that the sequence {Yn }μ∈N of stochastic processes satisfies all the requirements a–d of Theorem [I]. For instance, it follows that μ

|r1 | C < 1

for all μ

in this case, since μ μ

μ r1

=

EYn Yn−1 σμ2

=

1 − ν/μ μ−ν < μ+ν+1 ν/μ + 1

and ν/μ = constant. Thus the condition c is affirmed. μ μ The autocorrelation between Δ2 Yn and Yn−1 is evaluated as @ −

(2ν + 1)(μ + ν + 2) . (2ν + 3)(μ + ν + 1)

The value tends to −1 as μ → ∞, ν/μ being fixed. The condition d is thus confirmed. Theorem [II] immediately follows. μ

[II] The conclusion of Theorem [I] holds good for the sequence {Yn }μ∈N defined μ above instead of {Zn }μ∈N . We shall skip the proofs and more rigorous discussions because Slutsky’s work was done outside the world of Fourier analysis.36 36 Sargent

[26] Chap. 11 and Shinkai [27] Chaps.7–8 are very suggestive.

References

243

References 1. Brémaud, P.: Fourier Analysis and Stochastic Processes. Springer, Cham (2014) 2. Carleson, L.: On the convergence and growth of partial sums of fourier series. Acta Math. 116, 135–157 (1966) 3. Crum, M.M.: On positive definite functions. Proc. London Math. Soc. 6, 548–560 (1956) 4. Doob, J.L.: Stochastic Processes. Wiley, New York (1953) 5. Dudley, R.M.: Real Analysis and Probability. Wadsworth and Brooks, Pacific Grove (1988) 6. Frisch, R.: Propagation problems and inpulse problems in dynamic economics. In: Economic Essays in Honor of Gustav Cassel. Allen and Unwin, London (1933) 7. Granger, C.W.J., Newbold., P.: Forecasting Economic Time Series, 2nd edn. Academic, Orlando (1986) 8. Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton (1994) 9. Hicks, J.R.: A Contribution to the Theory of the Trade Cycle. Oxford University Press, London (1950) 10. Itô, K.: Kakuritsu-ron (Probability Theory). Iwanami Shoten, Tokyo (1953) (Originally published in Japanese) 11. Kaldor, N.: A model of the trade cycle. Econ. J. 50, 78–92 (1940) 12. Kawata, T.: Ohyo Sugaku Gairon (Elements of Applied Mathematics), I, II. Iwanami Shoten, Tokyo (1950, 1952) (Originally published in Japanese) 13. Kawata, T.: Fourier Henkan to Laplace Henkan (Fourier and Laplace Transforms). Iwanami Shoten, Tokyo (1957) (Originally published in Japanese) 14. Kawata, T.: On the Fourier series of a stationary stochastic process, I, II. Z. Wahrsch. Verw. Gebiete 6, 224–245 (1966); 13, 25–38 (1969) 15. Kawata, T.: Fourier Kaiseki (Fourier Analysis). Sangyo Tosho, Tokyo (1975) (Originally published in Japanese) 16. Kawata, T.: Fourier Kaiseki to Tokei (Fourier Analysis and Statistics). Kyoritsu Shuppan, Tokyo (1985) (Originally published in Japanese) 17. Kawata, T.: Teijyo Kakuritsu Katei (Stationary Stochastic Processes). Kyoritsu Shuppan, Tokyo (1985) (Originally published in Japanese) 18. Keynes, J.M.: The General Theory of Employment, Interest and Money. Macmillan, London (1936) 19. Loève, M.: Probability Theory, 3rd edn. van Nostrand, Princeton (1963) 20. Maruyama, G.: Harmonic Analysis of Stationary Stochastic Processes. Mem. Fac. Sci. Kyushu Univ. Ser. A, 4, 45–106 (1949) 21. Maruyama, T.: Suri-keizaigaku no Hoho (Methods in Mathematical Economics). Sobunsha, Tokyo (1995) (Originally published in Japanese) 22. Maruyama, T.: Shinko Keizai Genron (Principles of Economics), 3rd edn. Iwanami Shoten, Tokyo (2013) (Originally published in Japanese) 23. Maruyama, T.: Fourier Analysis of Periodic Weakly Stationary Processes. A Note on Slutsky’s Observation. Adv. Math. Eco. 20, 151–180 (2016) 24. Samuelson, P.A.: Interaction between the multiplier analysis and the principle of acceleration. Rev. Econ. Stud. 21, 75–78 (1939) 25. Samuelson, P.A.: A synthesis of the principle of acceleration and the multiplier. J. Polit. Econ. 47, 786–797 (1939) 26. Sargent, T.J.: Macroeconomic Theory. Academic, New York (1979) 27. Shinkai, Y.: Keizai-hendo no Riron (Theory of Economic Fluctuations). Iwanami Shoten, Tokyo (1967) (Originally published in Japanese) 28. Slutsky, E.: Alcune proposizioni sulla teoria delle funzioni aleatorie. Giorn. Inst. Ital. degli Attuari 8, 193–199 (1937)

244

8 Periodic Weakly Stationary Processes

29. Slutsky, E.: The summation of random causes as the source of cyclic processes. Econometrica 5, 105–146 (1937) 30. Slutsky, E.: Sur les fonctions aléatoires presque-periodiques et sur la décomposition des fonctions aléatoires stationnaires en composantes. Actualités Sci. Ind. 738, 38–55 (1938) 31. Wold, H.: A Study in the Analysis of Stationary Time Series, 2nd edn. Almquist and Wicksell, Uppsala (1953)

Chapter 9

Almost Periodic Functions and Weakly Stationary Stochastic Processes

It is a basic idea for the classical theory of Fourier series to express periodic functions as compositions of harmonic waves. This idea can be successfully extended to nonperiodic functions by means of Fourier transforms. However, we will be confronted with a lot of obstacles when we consider Lp -function spaces in the case p > 2. H. Bohr developed an investigation to mitigate this difficulty in the 1920s. S. Bochner as well as J. von Neumann succeeded Bohr’s approach, and their endeavors yielded an abundant harvest in the form of the theory of almost periodic functions. In this chapter, we are mainly interested in its connection with the theory of weakly stationary stochastic processes. It was observed in the last chapter that the periodicity of a weakly stationary process is characterized by some regular discreteness of the spectral measure. On the other hand, the almost periodicity of a weakly stationary process is characterized by a nonregular discreteness of the spectral measure.1

9.1 Almost Periodic Functions Definition 9.1 Let f : R → C be a function and ε a positive real number. τ ∈ R is called an ε-almost period of f if sup |f (x − τ ) − f (x)| < ε. x∈R

1I

am much indebted to Dunford and Schwartz [7] pp. 281–285, Katznelson [8] pp. 191–200, Kawata [9] II, pp. 96–103, 149–152, [10], [11] pp. 78–86, Loomis [12], Rudin [16, 17] for the contents of this chapter. The classical works cited above are Bohr [4, 5], von Neumann [19], Bochner [2] and Bochner and von Neumann [3].

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_9

245

246

9 Almost Periodic Functions and Stochastic Processes

f is said to be (uniformly) almost periodic in the sense of Bohr [5] if (i) f is continuous, and (ii) there exists Λ = Λ(ε, f ) ∈ R for each ε > 0 such that any interval, the length of which is Λ(ε, f ), contains an ε-almost period of f . The set AP(R, C) of all the almost periodic functions of R into C forms a closed subalgebra of L∞ (R, C), as will be shown later on. Example 9.1 τ = 0 is an ε-almost period for any ε > 0. If f is a τ -periodic function, the period τ of f is an ε-almost period for any ε > 0. Suppose that f is uniformly continuous. Then for any ε > 0, τ is an ε-almost period if |τ | is sufficiently small. Example 9.2 Any periodic continuous function f : R → C is almost periodic. f (x) = cos x + cos π x is not periodic.2 However, f (x) is almost periodic because, as we show later on (Theorem 9.2), a sum of almost periodic functions is almost periodic. If f is almost periodic, |f |, f¯, af, f (λx) are also almost periodic, where a ∈ C, λ ∈ R. The following two lemmata show the boundedness and the uniform continuity of an almost periodic function. Lemma 9.1 Suppose that f ∈ AP(R, C) and ε > 0. Let I be any closed interval of length Λ(ε, f ). Then f (R) is contained in the ε-neighborhood of f (I ). Therefore f is bounded. Proof Choose any x ∈ R. Let τ be an ε-almost period contained in x − I . Of course x − τ ∈ I , and |f (x) − f (x − τ )| < ε.

(9.1)

Hence f (R) is contained in the ε-neighborhood of f (I ); i.e. sup |f (x)| sup |f (x)| + ε. x∈R

x∈I

So f is bounded. Corollary 9.1 If f ∈ AP(R, C), then

f2

∈ AP(R, C).

fact, we note that f (0) = 2 and f (x) 2 if x 0. If f (x) = 2, we must have cos x = cos π x = 1. It follows that both x = 2kπ (k ∈ Z) and π x = 2kπ (k ∈ Z) would have to hold good at the same time. However, it is impossible. 2 In

9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C)

247

Proof For any x, τ ∈ R, f 2 (x − τ ) − f 2 (x) = (f (x − τ ) + f (x))(f (x − τ ) − f (x)). Let ε be any positive number and τ an ε/(2 f ∞ )-almost period. Then it follows that |f 2 (x − τ ) − f 2 (x)| 2 f ∞ ·

ε = ε, 2 f ∞

which shows that τ is an ε-almost period of f 2 . Lemma 9.2 Any almost periodic function is uniformly continuous.

Proof Assume f ∈ AP(R, C). For ε > 0, we define Λ = Λ(ε/3, f ). f is uniformly continuous on the interval [0, Λ]. Hence it follows that, for a sufficiently small δ(ε) > 0, sup |f (x + η) − f (x)| < x∈[0,Λ]

ε 3

if |η| < δ(ε).

(9.2)

Let y ∈ R and τ an ε/3-almost period of f contained in the interval [y − Λ, y]. Then |f (y +η)−f (y)| |f (y +η)−f (y −τ +η)|+|f (y −τ +η)−f (y −τ )|+|f (y −τ )−f (y)|. (9.3)

The second term of the right-hand side of (9.3) is less than ε/3 if |η| < δ(ε). (Note that y − τ ∈ [0, Λ].) The first and the third terms on the right-hand side are less than ε/3, respectively, since τ is an ε/3-almost period. Hence we have, by (9.3). |f (y + η) − f (y)| < ε

if |η| < δ(ε).

9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C) In the preceding section, we showed that an almost periodic function is bounded and uniformly continuous. We now proceed to prove that AP(R, C) is a closed subalgebra of L∞ (R, C). For f ∈ L∞ (R, C), we write W0 (f ) = {fy (x) = f (x − y)|y ∈ R}.

(9.4)

Theorem 9.1 (Bochner) The following two statements are equivalent for f ∈ L2 (R, C):

248

9 Almost Periodic Functions and Stochastic Processes

(i) f is an almost periodic function. (ii) W0 (f ) is relatively compact in L∞ (R, C). Proof (i)⇒(ii): Since L∞ is a complete metric space, it is sufficient to verify that W0 (f ) is totally bounded in order to show its relative compactness. Let Λ = Λ(ε/2, f ) for ε > 0. We make a decomposition η1 < η2 < · · · < ηJ of [0, Λ]. By the uniform continuity of f (Theorem 9.2), we obtain for any y0 ∈ [0, Λ] inf fy0 − fηj ∞ <

1j J

ε 2

(9.5)

provided that the decomposition is sufficiently fine.3 For any y ∈ R, let τ be an ε/2-almost period of f contained in [y − Λ, y]. y0 ≡ y − τ ∈ [0, Λ] and ε . 2

(9.6)

y ∈ R.

(9.7)

fy − fy0 ∞ = fy − fy−τ ∞ < It follows from (9.5) and (9.6) that inf fy − fηj ∞ < ε

1j J

for any

This shows that fηj (1 j J ) is an ε-net of W0 (f ). Hence W0 (f ) is totally bounded. (ii)⇒(i): Conversely, assume that W0 (f ) is totally bounded. Then, for any ε > 0, there exist some yj ∈ R (1 j J ) such that W0 (f ) ⊂

J

Bε (fyj ).

(9.8)

j =1

3 For

any ε > 0, there exists some δ > 0 such that |f (u) − f (v)| <

ε 2

if |u − v| < δ.

We make a decomposition ηj (1 j J ) of [0, Λ] so that distances of any two adjacent points are less than δ. y0 ∈ [0, Λ] is contained in a small interval [ηj , ηj +1 ]. Hence |f (x − y0 ) − f (x − ηj )| <

ε . 2

(Note that |(x − y0 ) − (x − ηj )| = |y0 − ηj | < δ.)

This holds good for any x ∈ R, and so (9.5) follows.

9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C)

249

We may assume that Bε (fyj ) ∩ W0 (f ) ∅ for all j,

(9.9)

without loss of generality. Define Λ = 2 Max |yj |. 1j J

(9.10)

We now show that any interval I of length Λ contains an ε-almost period of f . Let y be the midpoint of I ; i.e. 1 0 Λ Λ . I = y − ,y + 2 2

(9.11)

By (9.8), there exists some j0 ∈ {1, 2, · · · , J } such that fy − fyj0 ∞ < ε.4 Let τ = y − yj0 . Then it is clear that τ ∈ I , since |yj0 | Λ/2 by (9.10). (cf. Fig. 9.1.) On the other hand,

fτ − f ∞ = fτ + yj0 − fyj0 ∞ < ε. 789: =y

Thus I contains an ε-almost period τ of f . Fig. 9.1 Relation between y and yj0

4

(9.12)

250

9 Almost Periodic Functions and Stochastic Processes

Finally, we have to show that f is continuous. We can actually prove a stronger result, lim fη − f ∞ = 0

(uniform continuity).

η→0

(9.13)

Let fyj (1 j J ) be an ε-net of W0 (f ) which satisfies (9.8). If we define Ej by Ej = {τ |fτ ∈ Bε (fyj )},

1 j J,

(9.14)

we have J

Ej = R.

j =1

Hence there exists at least one Ej , say Ej0 , which has a positive measure. It is wellknown that Ej0 − Ej0 is a neighborhood of 0.5 y = y − y (y , y ∈ Ej0 ) satisfies |fy (x)−f (x)| |f (x − (y − y ))−f (x + y − yj0 )| + |f (x + y −yj0 )−f (x)| = |f (x + y − y ) − f (x + y − yj0 )| + |f (x + y − y ) − f (x + y − yj0 )| < ε + ε = 2ε,

from which (9.13) follows. Definition 9.2 Let f ∈ L∞ (R, C). The translation closed convex hull is the set W (f ) = co

W0 (af ).6

|a|1

Remark 9.1 1◦ W (f ) is the set of all the · ∞ -limit of functions of the form ak fxk , xk ∈ R, |ak | 1.

(9.15)

To prove this, we first note that an element of co W0 (af ) |a|1

5 This

delicate result is called Steinhaus’ theorem. See Dudley [6] p. 80 and Stromberg [18] pp. 297–298. 6 coA ¯ is the closed convex hull of a set A. coA is the convex hull of A.

9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C)

251

can be expressed in the form p αk · ak fxk ,

(9.16)

k=1

|ak | 1, xk ∈ R, αk 0,

p

αk = 1.

k=1

Writing βk = αk · ak , we can rewrite (9.16) as p

βk fxk ,

k=1

p

|βk | 1.

k=1

Hence any element of W (f ) is a · ∞ -limit of functions of the form (9.15). Conversely, consider a function of the form (9.15). Writing ak = |ak |eiθk (without loss of generality, we may assume p k=1

ak fxk =

p

|ak |eiθk fxk =

k=1

)

|ak | 0), we have

p p |ak |eiθk |a | j fxk . p k=1 j =1 |aj | j =1

)

Furthermore, if we define αk = |ak |/ |aj |, αk ’s are nonnegative real ) numbers, the sum of which is equal to 1. On the other hand, ak = eiθk ( |aj |) is a complex number with |ak | 1. Hence any function of the form (9.15) can be expressed in the form (9.16). It follows that any · ∞ -limit of functions of the form (9.15) is contained in W (f ). 2◦ If f is uniformly continuous, in particular, W (f ) is equal to the · ∞ -closure of the set of functions of the form ϕ ∗ f,

ϕ ∈ L1 (R, C),

ϕ 1 1.

In order to see this, it is sufficient to show that any function of the form K

ak fxk ,

xk ∈ R,

k=1

K

|ak | 1

k=1

can be · ∞ -approximated by a convolution ϕ ∗ f, ϕ ∈ L1 , ϕ 1 1. Since f is uniformly continuous, there exists some δ > 0, for each ε > 0, such that |f (u) − f (v)| < ε

if |u − v| < δ,

252

9 Almost Periodic Functions and Stochastic Processes

where it is possible to choose δ > 0 so that [xk − δ, xk + δ] (k = 1, 2, · · · , K) are mutually disjoint. Define a function ϕ(y) by

ϕ(y) =

⎧ ak ⎪ ⎪ ⎨ 2δ

for

y ∈ [xk − δ, xk + δ],

⎪ ⎪ ⎩0

for

y

K

[xk − δ, xk + δ].

k=1

Since

)

xk +δ xk −δ

|ak | 1, it is obvious that ϕ 1 1. We also have

f (x − y)ϕ(y)dy − ak f (x − xk )

1 = 2δ

xk +δ

xk −δ

|ak | 2δ

xk +δ

xk −δ

1 f (x − y)ak dy − 2δ

xk +δ

xk −δ

ak f (x − xk )dy |ak | · 2δ · ε = ε|ak |, 2δ

|f (x − y) − f (x − xk )|dy

where |(x − y) − (x − xk )| = |xk − y| < δ. Summing up for all k, we obtain

K K f (x − y)ϕ(y)dy − ε a f (x − x ) |ak | ε. k k R

k=1

k=1

3◦ For f ∈ L∞ (R, C), W (eiξ x f ) = {eiξ x g|g ∈ W (f )}. To prove this, we first note that any element of co

W0 (aeiξ x f ) can be

|a|1

expressed in the form

ak eiξ(x−xk ) f (x − xk ) = eiξ x

k

k

= eiξ x

ak e−iξ xk f (x − xk ) 789: =bk

bk f (x − xk ) ∈ eiξ x co

where xk ∈ R,

W0 (bf ),

|b|1

k

)

|ak | 1, according to 1◦ . Hence

co

|a|1

W0 (aeiξ x f ) ⊂ eiξ x co

|b|1

W0 (bf ).

(9.17)

9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C)

Conversely, if g ∈ co

253

W0 (bf ),

|b|1

eiξ x g(x) =

bk f (x − xk ) · eiξ x =

k

=

k

where xk ∈ R,

bk f (x − xk )eiξ(x−xk ) · eiξ xk

k

bk eiξ xk f (x − xk )eiξ(x−xk ) = 789:

=ak

)

|ak | 1. Since

co

)

ak eiξ(x−xk ) f (x − xk ),

k

|bk | 1,

W0 (aeiξ x f ) ⊃ eiξ x co

|a|1

W0 (bf ).

(9.18)

|b|1

By (9.17) and (9.18), both sides are equal. Taking closures of them, we have W (eiξ x f ) = eiξ x W (f ). 3◦ is thus confirmed. Lemma 9.3 For f ∈ L∞ (R, C), the following three statements are equivalent: (i) W (f ) is compact with respect to · ∞ . (ii) W0 (f ) is relatively compact with respect to · ∞ . (iii) f is almost periodic. Proof (ii)⇔(iii) is known by Theorem 9.1 and (i)⇒(ii) is obvious. So it remains to show (ii)⇒(i). Let h ∈ W (f ). For any ε > 0, there exists g ∈ co W0 (af ) such that |a|1

h − g ∞ < ε.

(9.19)

We express g as g(x) =

k

ak fxk (x),

|ak | 1

(9.20)

k

(cf. Remark 9.1, 1◦ ). Since W0 (f ) is relatively compact, there exists an ε-net {fy1 , fy2 , · · · , fyJ } of W0 (f ) consisting of finite points. Each fxk in (9.20) is contained in an εneighborhood of some fyj (1 j J ). Let j (k) be any one of such j ’s. Then we have

254

9 Almost Periodic Functions and Stochastic Processes

ak fxk (x)− ak fyj (k) (x) = ak {fxk (x) − fyj (k) (x)} k

k

k

(9.21) |ak | · |fxk (x) − fyj (k) | = ε.

k

Collecting k’s such that j (k) = j , for each j we write (j ) = {k|j (k) = j }. Then we obtain

ak fyj (k) (x) =

J j =1

k

J J ak fyj (x) = ak fyj (x) = bj fyj (x), j =1

k∈(j )

j =1

k∈(j )

789: =bj

(9.22) where

)

|bj | 1. It follows from (9.21) and (9.22) that J g(x) − bj fyj (x) j =1

If we introduce the norm z 1 =

J

(9.23)

< ε. ∞

|zj | in RJ , (b1 , b2 , · · · , bJ ) is contained in

j =1

its unit ball. Let c1 , c2 , · · · , cN be an εJ −1 f −1 ∞ -net of the unit ball. Then bj fyj (x) − bj fyj (x) j

j

<ε

(9.24)

∞

for some bj ∈ {c1 , c2 , · · · , cN }.7 By (9.19), (9.23) and (9.24), 7 Select some c

this cn as

n

∈ {c1 , c2 , · · · , cN } such that (b1 , b2 , · · · , bJ )−cn 1 < εJ −1 f −1 ∞ and express cn = (cn,1 , cn,2 , · · · , cn,J ) = (b1 , b2 , · · · , bJ ).

Then the following evaluation follows. b f (x) − c f (x) (b − c )f (x) j y n,j y j n,j y j j j j

j

∞

j

∞

= (b − b )f (x) j y j j j

< ε. ∞

9.2 AP(R, C) as a Closed Subalgebra of L∞ (R, C)

255

h − bj fyj j

< 3ε.

∞

Thus j

αj fyj

j = 1, 2, · · · , J, αj ∈ {c1 , c2 , · · · , cN }

is a 3ε-net of W (f ) consisting of J × N elements. Hence W (f ) is totally bounded. It is also closed (complete) and so compact. Based upon the preparations made above, we can prove the following fundamental theorem. Theorem 9.2 AP(R, C) is a closed subalgebra of L∞ (R, C). Proof It is clear that W (f + g) ⊂ W (f ) + W (g) for any f, g ∈ AP(R, C). Since W (f ) and W (g) are compact by Lemma 9.3, W (f + g) is relatively compact. W (f + g) being closed, however, it is compact. Again by Lemma 9.3, f + g ∈ AP(R, C). Since f 2 , g 2 , (f + g)2 ∈ AP(R, C), it is easy to see fg =

1 {(f + g)2 − f 2 − g 2 } ∈ AP(R, C). 2

Hence AP(R, C) is a subalgebra of L∞ (R, C). Finally, we have to show that AP(R, C) is closed. For any f ∈ AP(R, C) and ε > 0, there exists some g ∈ AP(R, C) such that

f − g ∞ <

ε . 3

Let τ be a ε/3-almost period of g. Then we obtain

fτ − f ∞ fτ − gτ ∞ + gτ − g ∞ + g − f ∞ <

ε ε ε + + = ε. 3 3 3

That is, τ is an ε-almost period of f . So any interval of the length Λ(ε/3, g) contains an ε-almost period of f . Thus f is an almost periodic function.

256

9 Almost Periodic Functions and Stochastic Processes

9.3 Spectrum of Almost Periodic Functions We mean by a trigonometric polynomial a function of the form f (x) =

n

ξj ∈ R.

aj eiξj x ,

j =1

ξj (j = 1, 2, · · · , n) is called a frequency of f . Note that ξj may be any real number in this definition, while ξj is assumed to be an integer in the ordinary usage of the term “trigonometric polynomial”. Remark 9.2 Each of eiξj x is periodic, and so almost periodic. Hence, by Theorem 9.2, trigonometric polynomials and their uniform limits are almost periodic. If a measure μ ∈ M(R) is discrete, i.e. μ=

aj δξj ,

|aj | < ∞

(δξj is a Dirac measure concentrating at ξj ), then its Fourier transform 1 μ(x) ˆ = √ 2π

1 e−itx dμ(t) = √ aj e−iξj x 2π R

is almost periodic. Definition 9.3 Let f ∈ L∞ (R, C). The norm spectrum σ (f ) of f is the set of ξ ∈ R such that aeiξ x ∈ W (f ) for a 0 with sufficiently small |a|. Remark 9.3 1◦ σ (f ) may be empty. For instance, W (f ) ⊂ C∞ (R, C) for any f ∈ C∞ (R, C). Hence aeiξ x W (f ) for any a 0. This implies σ (f ) = ∅. ◦ 2 For f ∈ L∞ (R, C), σ (eiξ x f ) = ξ + σ (f ). It is verified as follows: θ ∈ σ (eiξ x f ) ⇔ For a with sufficiently small |a|, aeiθx ∈ W (eiξ x f ) ⇔ For some g ∈ W (f ), aeiθx = eiξ x g (Remark 9.1, 3◦ )

9.3 Spectrum of Almost Periodic Functions

⇔ ⇔

257

For some g ∈ W (f ), aei(θ−ξ )x = g θ − ξ ∈ σ (f ).

The Fourier transform fˆ of f ∈ L∞ (R, C) which appears in the following discussions should be understood as the one in the sense of distributions. (cf. Appendix C.) Lemma 9.4 For any f ∈ L∞ (R, C), σ (f ) ⊂ supp fˆ. (supp fˆ is the support of the distribution fˆ.)8 Proof Since f,y = e−iξy fˆ,9 we obtain supp f,y = supp fˆ. Hence supp gˆ ⊂ supp fˆ for every g ∈ W (f ).10

8 cf. 9 In

Appendix C, p. 397. fact, we have

1 f,y (ϕ) = fy (ϕ) ϕ(ξ )e−iξ x dξf (x − y)dx ˆ = √ 2π

1 ϕ(ξ )e−iξ(y+z) dξf (z)dz = √ 2π = f (ϕ · e−iξy ) = fˆ(ϕ · e−iξy ) = e−iξy fˆ(ϕ)

for any ϕ ∈ S by the definition of the Fourier transform of distributions. Hence f,y = e−iξy fˆ. In the course of computations, f (·) and fy (·) are tempered distributions defined by the functions f and fy . cf. Appendix C, Sect. C.3, p. 389. 10 We shall give a brief proof. Since g ∈ W (f ), there exists a sequence {g } of functions of the n form ak fxk , xk ∈ R, |ak | 1 such that gn − g ∞ → 0 (as n → ∞), by Remark 9.1, 1◦ . Hence it is easy to see that the sequence of tempered distributions defined by gn ’s simply converges to the tempered distribution defined by g. Consequently, gˆn also simply converges to gˆ in S (cf. 2◦ on p. 86). If the support of a function ϕ ∈ S is contained outside suppfˆ, gˆ n (ϕ) = 0 since suppfˆxk = suppfˆ. Therefore g(ϕ) ˆ = lim gˆn (ϕ) = 0. We conclude suppgˆ ⊂ suppfˆ. n→∞

258

9 Almost Periodic Functions and Stochastic Processes

Assume that σ (f ) ∅ and ξ ∈ σ (f ). If g(x) = aeiξ x ∈ W (f ), √ gˆ = 2π aδξ .11 Hence we have supp gˆ = {ξ }. Consequently, ξ ∈ supp fˆ for ξ ∈ σ (f ). This implies σ (f ) ⊂ supp fˆ.

Definition 9.4 Let F ∈ L1 (R, C). Ω(F, δ) = sup F (x + y) − F (x) 1 |y|<δ

is called the L1 -modulus of continuity of F . We list below several elementary properties of Ω(F, δ), which are derived from the definition: 1◦ Ω(ηF (ηx), δ) = Ω(F, ηδ). In fact, left-hand side = sup ηF (η(x + y)) − ηF (ηx) 1 = sup F (z + ηy) − F (z) 1 |y|<δ

|y|<δ

(changing variables: z = ηx) = sup F (z + w) − F (z) 1 = right-hand side. |w|<δη

2◦ Given δ > 0,12 lim Ω(ηF (ηx), δ) = 0.

η→0

11 The

Fourier transform of eiξ x can be evaluated as in Example 4.9 on p. 85. For any ϕ ∈ S, we

have iξ x (ϕ) = e-

∞ −∞

eiξ x ϕ(x)dx ˆ =

√ 2π ϕ(ξ ).

Hence iξ x = e-

√

2π δξ

is derived. 12 Since F ∈ L1 , the mapping y → τ F (R → L1 ) is uniformly continuous by Theorem 5.1 y (p. 101). It is, of course, continuous at y = 0. Hence there exists some θ > 0, for each ε > 0, such that τy F − F 1 < ε provided that |y| < θ. Combining this result and 1◦ , we obtain 2◦ .

9.3 Spectrum of Almost Periodic Functions

259

3◦ If we define gη by gη = ηF (ηx) ∗ h for h ∈ L∞ (R, C), then |gη (x) − gη (y)| Ω(F, η|x − y|) · h ∞ . 4◦ gη is defined in the same manner as in 3◦ . If lim gη (x) exists at each point x, η→0

then lim gη (x) = c(constant) for all

η→0

x ∈ R.

If x y, we obtain |gη (x) − gη (y)| Ω(ηF (ηx), |x − y|) · h ∞ by 1◦ and 2◦ . The right-hand side converges to 0 as η → 0, according to 2◦ . Hence 4◦ follows. 6 5◦ Assume that F 1 1 and R F (x)dx 0. If f ∈ L∞ (R, C) is uniformly continuous, gη = ηF (ηx) ∗ f ∈ W (f ) by Remark 9.1, 2◦ . If the uniform limit of gη exists, the limit is contained in W (f ). In the case f ∈ AP(R, C), a limiting point (with respect to the uniform convergence) exists, since W (f ) is compact (Lemma 9.3). 6 Lemma 9.5 Assume that f ∈ AP(R, C), F ∈ L1 (R, R), F (x) 0 and R F (x)dx = 1. Then the limit lim ηF (ηx) ∗ f

η→0

(uniform convergence)

exists and lim ηF (ηx) ∗ f = 0

η→0

⇔

0 σ (f ).

Proof Since f is uniformly continuous, gη = ηF (ηx)∗f ∈ W (f ) by 5◦ just stated. W (f ) being compact, there exists some sequence ηn → 0 such that {gηn } has a limit and the limit is constant by 4◦ above. 1◦ Suppose that gηn (x) → 0 (uniform convergence).

(9.25)

260

9 Almost Periodic Functions and Stochastic Processes

Then 0 ∈ W (f ). It is also easy to see that lim ηn F (ηn x) ∗ h = 0 (uniform convergence)

(9.26)

n→∞

for any h ∈ W (f ).13 The proof proceeds as follows. Computing the left-hand side for h(x) = fξ (x) = f (x − ξ ) ∈ W0 (f ), we obtain

ηn F (ηn x) ∗ fξ =

=

=

R

R

R

ηn F (ηn (x − y))fξ (y)dy =

R

ηn F (ηn (x − z − ξ ))f (z)dz

ηn F (ηn (x − y))f (y − ξ )dy

(changing variables: y − ξ = z)

ηn F (ηn (x − ξ − z))f (z)dz → 0

n → ∞.

as

Although this relation is valid for fixed x − ξ , we also obtain, by (9.25), that lim ηn F (ηn x) ∗ fξ = 0 (uniform convergence).

n→∞

(9.27)

Therefore the same relation as (9.27) holds good even if we replace fξ by a function of the form ak fξk , ξk ∈ R, |ak | 1 (9.28) k

(

k

is a finite sum). h ∈ W (f ) is a uniform limit of a sequence of functions of the

k

form (9.28). Hence we conclude that (9.26) is true. In particular, if h is a constant c contained in W (f ),14 we obtain ηF (ηx) ∗ c = c,

(9.29)

from which it follows that c = 0 by (9.26). Hence if (9.25) is satisfied, the only constant contained in W (f ) is 0. By the definition of σ (f ), we have 0 σ (f ).

13 Obviously 14 It

(9.26) holds good if we use f instead of h. can be verified by

ηF (ηx) ∗ c = ηF (η(x − y))cdy = c ηF (η(x − y))dy = c F (u)du = c R

R

R

(changing variables: η(x − y) = u).

9.3 Spectrum of Almost Periodic Functions

261

Suppose that there exists a limiting point of {gη (x)} other than 0; i.e. lim θn F (θn x) ∗ f = α 0

n→∞

(uniform convergence)

(9.30)

for some sequence θn → 0. If this is the case, α ∈ W (f ).15 But this is impossible because the only constant contained in W (f ) is 0. Thus, if (9.25) is satisfied, lim ηF (ηx) ∗ f = 0 (uniform convergence)

η→0

and 0 σ (f ). This is the first case. 2◦ Suppose next that a limiting point of gη is not 0; that is, (9.29) holds good for some sequence θn → 0. In this case, we obtain lim θn F (θn x) ∗ (f − α) = 0.

n→∞

Hence it follows that lim ηF (ηx) ∗ f = α

η→0

(uniform convergence).

Here we have the alternative case ; i.e. gη uniformly converges to some constant other than 0, and 0 ∈ σ (f ). As is illuminated by the computation in (9.29), lim ηF (ηx) ∗ f

η→0

is determined independently of F provided that F ∈ L1 (R, R),F 0, = 1.

6

R F (x)dx

Definition 9.5 Let f ∈ AP(R, C). The number M(f ) which satisfies 0 σ (f − M(f )) is called the mean value of f . The concept of a norm spectrum can be characterized in terms of the mean value.

15 By

definition of W (f ), we have 0 ∈ σ (f ).

262

9 Almost Periodic Functions and Stochastic Processes

Theorem 9.3 For f ∈ AP(R, C),

1 T →∞ 2T

M(f ) = lim

T −T

f (y)dy.

Proof It holds good that M(f ) = lim ηF (ηx) ∗ f η→0

by Lemma 9.5. As is remarked above, the value of M(f ) is determined indepen6 dently of the choice of F provided that F ∈ L1 (R, C), F 0, R F (x)dx = 1. Let us specify F as + F (x) =

1 2

if |x| < 1,

0

if |x| 1

and T as T = 1/η. Then

ηF (ηx) ∗ f =

=

x+ η1

x− η1

=

x+T

=

T

ηF (η(x − y))f (y)dy 789:

(cf. Fig. 9.2)

1 1 F (x − y) f (y)dy T T

1 f (y)dy + 2T

−T

1 = 2T

ηF (η(x − y))f (y)dy

=1/2

x−T

R

T −T

x+T T

1 f (y)dy + 2T

(9.31)

1 1 F (x − y) f (y)dy T T x+T

f (y)dy T

= I1 + I2 .

|I2 |

1 2T

T

x+T

|f (y)|dy

|x|

f ∞ → 0 2T

as

T → ∞.

(9.32)

9.3 Spectrum of Almost Periodic Functions

263

Fig. 9.2 F (η(x − y))

By (9.31) and (9.32), we obtain 1 T →∞ 2T

M(f ) = lim I1 = lim T →∞

T −T

f (y)dy.

Theorem 9.4 The following three statements are equivalent for f ∈ AP(R, C): (i) ξ ∈ σ (f ). (ii) 0 ∈ σ (f e−iξ x ). (iii) M(f e−iξ x ) 0. This result is clear from Remark 9.3, 2◦ and Lemma 9.5. Corollary 9.216 (i) If we define 1 f (x) = √ 2π

R

eiξ x dμ(ξ )

for μ ∈ M(R), then μ is the Fourier transform of f . (Hence any μ ∈ M(R) is the Fourier transform of some function in L∞ .) (ii) Let f ∈ AP(R, C). If μ ∈ M(R) satisfies μ = fˆ and μ{0} 0, then 0 ∈ σ (f ).17 Proof (i) Let μ ∈ M(R). If we define f ∈ L∞ by 1 f (x) = √ 2π

16 Katznelson

R

eiξ x dμ(ξ ),

[8] 2nd edn., pp. 159–160. There is a difference in the explanation of this result between the second edition and the third one. The latter seems to contain a slip. 17 fˆ is the Fourier transform of f in the sense of distribution.

264

9 Almost Periodic Functions and Stochastic Processes

then we obtain18 1 fˆ(ϕ) = √ 2π 1 = √ 2π

R R

iξ x ϕ(x)e ˆ dμ(ξ )dx

R

R

iξ x ϕ(x)e ˆ dx dμ(ξ )

=

R

ϕ(ξ )dμ(ξ ) for any ϕ ∈ S (inverse transform).

Thus fˆ = μ follows. (ii) We specify F as Fejér’s summation kernel19 Kη (x) = ηK(ηx), and define gη = ηK(ηx) ∗ f.

(9.33)

√ ξ ξ ˆ ˆ ˆ · f = 2π K · μ. 2π K η η

(9.34)

Then gˆ η is evaluated as20 gˆ η =

√

Regarding gˆ η as an element of S , we obtain w ∗ - lim gˆ η = η→0

√ 2π μ{0}δ0 .

Hence gη w ∗ -converges to the inverse Fourier transform of thanks to 2◦ on p. 86. That is, w ∗ - lim gη = μ{0} 0. η→0

(9.35) √ 2π μ{0}δ0 , i.e. μ{0}

(9.36)

gη has a uniform limit by Lemma 9.5. The limit is nothing other than (9.36). Hence, again by Lemma 9.5, we have 0 ∈ σ (f ). Thus if fˆ ∈ M(R), M(f ) can be characterized as μ{0} = fˆ({0}) = M(f ).

18 fˆ

is the Fourier transform of the tempered distribution defined by f . 3 x x 2 19 See Chap. 5, Sect. 5.4. K(x) = 1 sin . 2π 2 2 20 fˆ, gˆ are the Fourier transforms of distributions. η

9.3 Spectrum of Almost Periodic Functions

265

Similarly, we have fˆ({ξ }) = M(f e−iξ x ).

(9.37)

The relation (9.37) is the representation of the discrete part of the measure fˆ. It will be shown later that the continuous part of fˆ actually does not exist (cf. Corollary 9.3). It is easy to show the following formulas concerning the average M(·) on AP(R, C): 1◦ M(f + g) = M(f ) + M(g). 2◦ M(af ) = aM(f ), a ∈ C. 3◦ M(fy ) = M(f ). Lemma 9.6 Let f ∈ AP(R, C). Then M(f ) > 0 if f 0 and f is not identically 0. Proof We may assume f (0) > 0 by 3◦ . Hence on

f (x) > α

(−α, α)

for sufficiently small α > 0. Let Λ = Λ(α/2, f ). Then any interval of the length Λ contains an α/2-almost period τ of f . Without loss of generality, we may assume Λ 2α. Since f (x) >

α 2

(τ − α, τ + α),

on

we have

f (x)dx I

α · 2α = α 2 2

for an interval I of the length Λ. Consequently, we obtain, by Theorem 9.3, 1 M(f ) = lim n→∞ 2nΛ

nΛ

−nΛ

f (x)dx

α2 . Λ

Corollary 9.3 If μ ∈ M(R) and μˆ ∈ AP(R, C), then μ is discrete. Proof μ can be expressed in the form μ = μd + μc , where μd is the discrete part and μc is the continuous part. If μˆ ∈ AP(R, C), then μˆc ∈ AP(R, C), since μˆ d ∈ AP(R, C) (cf. Remark 9.2). So we can conclude |μˆ c |2 ∈ AP(R, C). By Wiener’s theorem 6.15 (p. 160), we obtain 1 T →∞ 2T lim

T

−T

|μˆ c (x)|2 dx = 0.

266

9 Almost Periodic Functions and Stochastic Processes

Therefore μc = 0 by Lemma 9.6.

9.4 Fourier Series of Almost Periodic Functions Let ·, ·M be an operation on AP(R, C) × AP(R, C) defined by ¯ f, gM = M(f g),

f, g ∈ AP(R, C).

(9.38)

·, ·M is an inner product on AP(R, C), and so AP(R, C) becomes a pre-Hilbert space with respect to this operation. The family {eiξ x |ξ ∈ R} forms an orthonormal family with respect to ·, ·M . In fact, it can immediately be verified by e

iξ x

,e

iηx

M

1 = lim T →0 2T

T

−T

e

i(ξ −η)x

dx =

+ 1

if ξ = η,

0

if ξ η.

(9.39)

For f ∈ AP(R, C), the numbers defined by 1 T →∞ 2T

f, eiξ x M = M(f e−iξ x ) = lim

T

−T

f (x)e−iξ x dx

(9.40)

are called the Fourier coefficients of f with respect to {eiξ x |ξ ∈ R}, and are denoted by fˆ{ξ }.21 For any finite numbers ξ1 , ξ2 , · · · , ξn ∈ R, we obtain f −

n

fˆ{ξj }eiξj x , f −

j =1

n

fˆ{ξj }eiξj x M

j =1

=f, f M −

n

fˆ{ξj }f, eiξj x M

j =1

−

n

fˆ{ξj }eiξj x , f M +

j =1

= M(|f |2 ) −

n

n

|fˆ{ξj }|2

j =1

|fˆ{ξj }|2 0.

j =1

Hence it follows that

21 We use such a notation because we wish to interpret the Fourier transform of f as something like a measure and its value at ξ as the mass at ξ .

9.4 Fourier Series of Almost Periodic Functions n

267

|fˆ{ξj }|2 f, f M = M(|f |2 ),

(9.41)

j =1

which is obviously a familiar inequality corresponding to Bessel’s inequality. It implies that there are at most countable ξ ’s such that fˆ{ξ } 0. Some exposition may be required to clarify this point. Assume that there are infinitely many ξ ’s such that |fˆ{ξ }| 1. Choose any countable elements ξ1 , ξ2 , · · · among them. Then the sum on the left-hand side of (9.41) is greater than any n ∈ N and exceeds M(|f |2 ) if n is very large. This contradicts (9.41). Furthermore, there are only finite ξ ’s such that 1/(n + 1) |fˆ{ξ }| < 1/n (n = 1, 2, · · · ). Suppose that there are infinitely many ξ ’s (say, ξ1 , ξ2 , · · · ) which satisfy this inequality for some n. If we sum up (n + 1)2 terms |fˆ(ξj )|2 , the sum is greater than 1. Hence the left-hand side of (9.41) grows indefinitely. A contradiction to (9.41) occurs again. Since ξ ∈ σ (f ) ⇔ M(f e−iξ x ) = fˆ{ξ } 0, σ (f ) is a countable set. Each ξn (n ∈ Z) contained in σ (f ) is called a Fourier index of f , and the formal series ∞

cn eiξn x

n=−∞

is called the Fourier series of an almost periodic function f , where cn = M(f e−iξn x ) and ξn ∈ σ (f ). Theorem 9.5 For f ∈ AP(R, C), σ (f ) is a countable set. For a pair of f and g ∈ AP(R, C), we define the mean convolution (f ∗ g)(x) M

by 1 T →∞ 2T

(f ∗ g)(x) = My (f (x − y)g(y)) = lim M

T −T

f (x − y)g(y)dy.

(9.42)

My (f (x − y)g(y)) is well-defined because f (x − y)g(y) is an almost periodic function in y for each fixed x. Lemma 9.7 f ∗ g ∈ AP(R, C) for f, g ∈ AP(R, C). If M(|g|) 1, then f ∗ g ∈ M

M

W (f ). Proof Let Λ = Λ(ε/ g ∞ , f ). Then any interval of length Λ contains an ε/ g ∞ almost period τ . It is not difficult to show that

268

9 Almost Periodic Functions and Stochastic Processes

sup |(f ∗ g)(x − τ ) − (f ∗ g)(x)| ε. M

x

(9.43)

M

In fact, (9.43) follows from

1 |(f ∗ g)(x − τ ) − (f ∗ g)(x)| = lim M M T →∞ 2T

T

−T

1 T →∞ 2T

lim lim

T →∞

f (x − τ − y)g(y)dy

1 − lim T →∞ 2T

T −T

T −T

f (x − y)g(y)dy

|f (x −τ − y)−f (x −y)||g(y)|dy

ε 1 · 2T · · g ∞ ε. 2T

g ∞

That is, any interval of length Λ contains an ε-almost period of f ∗ g. Hence f ∗ g M

M

is almost periodic. Assume next that M(|g|) < 1; i.e. 1 T →∞ 2T

lim

T −T

|g(y)|dy < 1.

Then 1 2T

T

−T

|g(y)|dy < 1

for sufficiently large T . If we define a function ϕ by + ϕ(y) =

1 2T

g(y)

0

on

[−T , T ],

on

[−T , T ]c ,

then ϕ 1 < 1 and ϕ ∗ f ∈ W (f ) (cf. p. 251, 2◦ ) That is, 1 2T

T

−T

f (x − y)g(y)dy ∈ W (f )

(9.44)

for sufficiently large T . The left-hand side of (9.44) is convergent for each x by (9.42). f ∗ g is the uniform limit of the left-hand side of (9.44) which is in M

W (f ). Since W (f ) is · ∞ -compact, f ∗ g ∈ W (f ). M

Assume that M(|g|) = 1. For any ε > 0,

9.4 Fourier Series of Almost Periodic Functions

1 2T

T

−T

|g(y)|dy < 1 + ε,

269

1 2T

i.e.

T

−T

|g(y)| dy < 1 1+ε

holds good for sufficiently large T > 0. If we define a function ϕε by + ϕε (y) =

1 2T (1+ε) g(y)

on

[−T , T ],

0

on

[−T , T ]c ,

then ϕε 1 < 1. Applying the argument above, we obtain f ∗ g/(1 + ε) ∈ W (f ). M

Since f ∗ g/(1+ε) uniformly converges to f ∗ g as ε ↓ 0, we obtain f ∗ g ∈ W (f ). M M M The operation of mean convolution enjoys all the basic properties of ordinary convolution. We now list a few of them. 1◦ (f ∗ g){ξ } = fˆ{ξ } · g{ξ ˆ } for f, g ∈ AP(R, C). M

This relation follows from a direct computation: (f ∗ g){ξ } = Mx (My (f (x − y)g(y))e−iξ x ) M

= Mx My (f (x − y)e−iξ(x−y) g(y)e−iξy ) = fˆ{ξ }g{ξ ˆ }. 2◦ f ∗ eiξ x = fˆ{ξ }eiξ x for f ∈ AP(R, C). M

3◦ If we define a function g(x) by g(x) =

g{ξ ˆ j }eiξj x

(finite sum),

j

then f ∗ g(f ∈ AP(R, C)) is given by M

f ∗g= M

g{ξ ˆ j }fˆ{ξj }eiξj x .

j

Let f ∈ AP(R, C). If we define a function f ∗ by f ∗ (x) = f (−x),

(9.45)

fˆ∗ {ξ } = fˆ{ξ }.

(9.46)

it is easy to see

270

9 Almost Periodic Functions and Stochastic Processes

Hence we obtain (f ∗ f ∗ ){ξ } = |fˆ{ξ }|2 .

(9.47)

M

In the case of f ∞ 1, we obtain f ∗ f ∗ ∈ W (f ). M

Lemma 9.8 Suppose that f ∈ AP(R, C). (i) h = f ∗ f ∗ is positive semi-definite. M

(ii) h can be represented as the Fourier transform of a positive Radon measure on R. (iii) hˆ is a positive Radon measure on R. Proof

(i) For xj ∈ R, zj ∈ C (j = 1, 2, · · · , p), we obtain p i,j =1

h(xi − xj )zi z¯ j =

0 i,j

1 lim T →∞ 2T

1 T →∞ 2T

= lim

1 T →∞ 2T

= lim

T

T

−T

−T i,j

∗

−xj +T

−xj −T

1

f (xi − xj − y)f (y)dy zi z¯ j

f (xi − xj − y)f (−y)dyzi z¯ j

f (xi + u)f (xj + u)zi z¯ j du

i,j

(changing variables: u = −xj − y) 2

−xj +T 1 = lim f (xj + u) · zj du T →∞ 2T −xj −T j

0. (ii) Clear from Bochner’s theorem (Theorem 6.11, p. 150). (iii) If h = μˆ for some μ ∈ M+ (R), we have

1 ˆ ϕ(ξ ˆ ) e−iξ x dμ(x)dξ h(ϕ) = h(ϕ) ˆ = μ( ˆ ϕ) ˆ = √ 2π R R 1

0

1 i(−x)ξ ϕ(ξ ˆ )e dξ dμ(x)= ϕ(−x)dμ(x) =√ 2π R R R

= ϕ(x)dμ(x) = μ(ϕ) for any ϕ ∈ S. R

It follows that hˆ = μ, where h(·), μ(·) are distributions defined by h and μ, the ˆ μ, Fourier transforms of which are h, ˆ respectively.

9.4 Fourier Series of Almost Periodic Functions

271

It is obvious that h = μˆ ⇔ hˆ = μ

(9.48)

by (iii). Since the Radon measure hˆ is discrete by Corollary 9.3, it can be expressed as ˆ }δξ . hˆ = (9.49) h{ξ Consequently, h(x) =

ˆ }eiξ x = h{ξ

(9.47)

|fˆ{ξ }|2 eiξ x .

(9.50)

If x = 0, in particular, we have M(|f |2 ) = h(0) =

|fˆ{ξ }|2 .

(9.51)

This is Parseval’s equality for almost periodic functions. The Fourier indices are at most countable in any one of (9.49), (9.50) and (9.51). Hence the sums appearing in them are countable sums. The orthonormal system {eiξ x |ξ ∈ R} in AP(R, C) is complete by (9.51) and Theorem 1.6 (p. 16). Thus f ∈ AP(R, C), f 0 ⇒ σ (f ) ∅. Lemma 9.9 For any finite real numbers ξ1 , ξ2 , · · · , ξn and ε > 0, there exists a trigonometric polynomial P such that: (i) (ii) (iii)

P (0) 0, M(P ) = 1, and Pˆ {ξj } > 1 − ε for j = 1, 2, · · · , n.

Proof Assume first that ξ1 , ξ2 , · · · , ξn are all integers. Choose m ∈ Z so that 1 Max |ξj | < m. ε 1j n If we adopt Fejér’s kernel m 1− Km (x) = k=−m

|k| eikx , m+1

as P (x), then P (x) = Km (x) satisfies (i)–(iii). We proceed to the general case. Note that there exist real numbers λ1 , λ2 , · · · , λq for ξ1 , ξ2 , · · · , ξn such that:

272

(a)

9 Almost Periodic Functions and Stochastic Processes q

θh λh = 0, θh ∈ Q ⇒ θh = 0 for all h, and

h=1

(b) there exist Aj,h ∈ Z which satisfy q

ξj =

Aj,h λh , j = 1, 2, · · · , n

h=1

for each j = 1, 2, · · · , n. We can verify this fact by induction in n.22 the case n = 1, P (x) = eiξ1 x works. Next assume that there exist λ1 , λ2 , · · · , λq which satisfy (a) and (b) for ξ1 , ξ2 , · · · , ξn−1 . Consider ξ1 , ξ2 , · · · , ξn . We may assume that

22 In

ξn

q

λh Z

h=1

for any λ1 , λ2 , · · · , λq which satisfy (a) and (b). (Otherwise ξn is also expressible in the form (b) by using λ1 , λ2 , · · · λq .) Define λ by λ = ξn −

q

λh zh

for

zh ∈ Z.

h=1

Assume that q

θh λh + θλ = 0,

θh , θ ∈ Q.

h=1

If θ = 0, then θh = 0 (h = 1, 2, · · · , q) by (a). Hence we may concentrate at the case θ 0. Since q

θh λh + θλ =

h=1

q

q q θh λh + θ ξn − λh zh = (θh − θzh )λh + θξn = 0,

h=1

h=1

h=1

it is obvious that ξn = −

q θh − θzh λh . θ h=1

Expressing θh /θ = vh /uh (uh , vh ∈ Z, h = 1, 2, · · · , q), we denote by u∗ the least common multiple of uh ’s. If we write θh /θ = vh∗ /u∗ , we obtain ξn = −

q ∗ v h=1

Furthermore, if we write

λ∗h

= λh ξj =

h u∗

/u∗

q h=1

q λh − z h λh = − (vh∗ − u∗ zh ) ∗ . u h=1

(h = 1, 2, · · · , q), ξj ’s can be reexpressed as u∗ Aj,h λ∗h , j = 1, 2, · · · , n − 1,

9.4 Fourier Series of Almost Periodic Functions

273

Choose ε1 > 0 and m ∈ N so that Max|Aj,k | (1 − ε1 ) > 1 − ε, q

ε1 >

j,k

m

.

Define a function P (x) by P (x) =

q A

Km (λh x).

(9.52)

h=1

Then it is easy to see that P (x) satisfies (i). In order to prove (ii), we first rewrite P (x) as P (x) =

|kq | |k1 | 1− ··· 1 − ei(k1 λ1 +···+kq λq )x , m+1 m+1

(9.53)

) where the summation on the right-hand side of (9.53) runs over all (k1 , k2 , · · · , kq ) such that |k1 | m, · · · , |kq | m. (By (a), the expression (9.53) is unique.) It follows that Pˆ (0) = the constant of the right-hand of (9.53) = M(P ) = 1 from (9.53). Furthermore, Pˆ {ξj } = [the constant of the right-hand of (9.53)]

T )q 1 ei( h=1 (kh −Aj,h )λh )x dx. × lim T →∞ 2T −T Here remain only terms such as kh = Aj,h in the above integration. Hence Pˆ {ξj } = Pˆ

q

Aj,h λh

h=1

ξn = −

q A |Aj,h | = 1− m+1 h=1

q (vh∗ − u∗ zh )λ∗h . h=1

Thus

λ∗1 , λ∗2 , · · ·

, λ∗q

satisfy (a) and (b) for ξ1 , ξ2 , · · · , ξn−1 and also ξn ∈

q

λ∗h Z.

h=1

This contradicts our assumption. Hence θ must be zero.

274

9 Almost Periodic Functions and Stochastic Processes

(1 − ε1 )q > 1 − ε.

This proves (iii).

Theorem 9.6 For any f ∈ AP(R, C), there exists a sequence of trigonometric polynomials in W (f ) which uniformly converges to f . Proof σ (f ) is a countable set by Theorem 9.5. So we may write σ (f ) = {ξ1 , ξ2 , · · · }. Let Pn be a trigonometric polynomial which satisfies (i)–(iii) of Lemma 9.9 for ξ1 , ξ2 , · · · , ξn and ε = 1/n. If we define Tn = f ∗ Pn , then M

Tn ∈ W (f ) by Lemma 9.7. It also follows that lim Tˆn {ξj } = fˆ{ξj } = 0 for all

n→∞

ξj ∈ σ (f )

by Lemma 9.9 (iii). If ξ σ (f ), then Tˆn {ξ } = fˆ{ξ } = 0 for all n. On the other hand, let g be a · ∞ -limiting point of Tn in the compact set W (f ). If a subsequence {Tn } of {Tn } uniformly converges to g, we have lim Tˆn {ξ } = g{ξ ˆ }.

n →∞

Hence g{ξ ˆ } = fˆ(ξ ) for all ξ. By the remark concerning the uniqueness of Fourier coefficients, we must have g = f . Consequently, {Tn } uniformly converges to f .

9.5 Almost Periodic Weakly Stationary Stochastic Processes Let (Ω, E, P ) be a probability space. T is either Z or R, interpreted as time space. For the sake of simplicity, we assume EX(t, ω) = 0 (t ∈ R). Let X : T × Ω → C be a weakly stationary process with the covariance function ρ(u). X(t, ω) is called an almost periodic weakly stationary process if ρ(u) is almost periodic. The following theorems play roles similar to those of Theorems 8.8 and 8.8 (pp. 214–215). Theorem 9.7 Let X : Z × Ω → C be a weakly stationary process with the spectral measure ν. Then the following three statements are equivalent: (i) Xn (ω) is an almost periodic process.

9.5 Almost Periodic Weakly Stationary Stochastic Processes

275

(ii) There exists Γ = Γ (ε, X) ∈ R for each ε > 0 such that any interval, the length of which is Γ (ε, X), contains some τ ∈ Z which satisfies sup E|Xn+τ (ω) − Xn (ω)|2 < ε. n∈Z

(iii) ν is discrete. Theorem 9.7 Let X : R × Ω → C be a measurable and weakly stationary process with the spectral measure ν. Then the following three statements are equivalent: (i) X(t, ω) is an almost periodic process. (ii) There exists Γ = Γ (ε, X) ∈ R for each ε > 0 such that any interval, the length of which is Γ (ε, X), contains some τ ∈ R which satisfies sup E|X(t + τ, ω) − X(t, ω)|2 < ε. t∈R

(iii) ν is discrete. We prove only Theorem 9.7 . (Theorem 9.7 can be proved in an entirely similar manner.) Proof of Theorem 9.7 23 have

(i)⇒(ii): Assume that ρ(μ) is almost periodic. Then we

sup E|X(t + τ, ω) − X(t, ω)|2 = 2[ρ(0) − Reρ(τ )] t∈R

= 2Re[ρ(0) − ρ(τ )] 2|ρ(0) − ρ(τ )|

(9.54)

2 sup |ρ(u + τ ) − ρ(u)|. u∈R

Since ρ(·) is almost periodic, there exists a number Λ(ε/2, ρ) > 0, for each ε > 0, such that any interval the length of which is Λ(ε/2, ρ) contains an ε/2-almost period, τ . Since ε τ ∈ R sup |ρ(u + τ ) − ρ(u)| < 2 u∈R ⊂ τ ∈ R sup E|X(t + τ, ω) − X(t, ω)|2 < ε t∈R

by (9.54), we see that (ii) is satisfied by setting Γ (ε, X) = Λ(ε/2, ρ). (ii)⇒(i): Assume (ii). By a simple calculation, we obtain the inequality |ρ(u + τ ) − ρ(u)|2 = |E[X(u + τ, ω)X(0, ω) − X(u, ω)X(0, ω)]|2 23 The

proof of (i)⇔(ii) is due to Kawata [11] pp. 80–82.

276

9 Almost Periodic Functions and Stochastic Processes

E|X(u + τ, ω) − X(u, ω)|2 E|X(0, ω)|2 (by Schwarz’s inequality). It follows that |ρ(u + τ ) − ρ(u)| [E|X(t + τ, ω) − X(u, ω)|2 ]1/2 ρ(0)1/2 . Hence we have ε2 τ ∈ R sup E|X(u + τ, ω)−X(u, ω)|2 < ρ(0) u∈R ⊂ τ ∈ R sup |ρ(u + τ ) − ρ(u)| < ε . u∈R

Thus (i) holds good by setting Λ(ε, ρ) = Γ (ε2 /ρ(0), X). (i)⇒(iii): The covariance function ρ is represented by the Fourier transform of some positive Radon measure ν on R. Since ρ(u) = νˆ (u) is almost periodic, ν must be discrete by Corollary 9.3. (iii)⇒(i): Assume that the spectral measure ν of X is discrete, say ν=

∞

an δξn

(δξn : Dirac measure).

n=1

Then the covariance can be expressed as 1 ρ(u) = √ 2π

∞ 1 e−iλu dν(λ) = √ an e−iξn u . 2π R n=1

(9.55)

Since an 0, the series (9.55) is absolutely and hence uniformly convergent. Thus ρ(u) is the uniform limit of trigonometric polynomials,24 and hence almost periodic. A (measurable) weakly stationary process X(t, ω) is τ -periodic if and only if its spectral measure ν concentrates on a countable set in T or R such that the distance of any adjacent two points is some multiple of 2π/τ (cf. Theorem 8.8 (p. 215)). X(t, ω) is almost periodic if and only if ν is discrete (cf. Theorem 9.7 ).

24 A

function of the form f (x) =

n

aj e−iξj u

(ξj ∈ R)

j =1

is called a trigonometric polynomial. Any trigonometric polynomial is almost periodic.

References

277

Therefore if X(t, ω) is periodic or almost periodic, the spectral measure ν can never be absolutely continuous with respect to the Lebesgue measure. Now under what conditions does the spectral measure have the density function? The answers were given in Theorem 8.12 (p. 228) and Theorem 8.13 (p. 236). ν is absolutely continuous with respect to the Lebesgue measure if and only if X(t, ω) is a moving average process of white noise. So we can conclude that any moving average process can be neither periodic nor almost periodic. n However, we know that the set Md+ (R) = αj δxj αj ∈ R, xj ∈ R, n ∈ N j =1

of all the discrete positive Radon measures is w∗ -dense in M+ (R), the set of all the positive Radon measures (cf. Billingsley [1] p. 237, Malliavin [13] Chap. II, §6, Maruyama [14] Chap. 8,§2). The theorem below immediately follows from this observation. Theorem 9.8 Let X(t, ω) be a measurable and weakly stationary process on R×Ω with the spectral measure ν. Then there exists a sequence of almost periodic weakly stationary processes Xk (t, ω) with the spectral measure ν k such that ν k converges to ν in the w ∗ -topology. Proof Since Md+ (R) is w ∗ -dense in M+ (R) (metrizable), there exists a sequence ν k ∈ Md+ (R) which w ∗ -converges to ν. By Theorem 8.7 (p. 213), there is a measurable and weakly stationary process Xk (t, ω), the spectral measure of which is exactly equal to ν k . Xk (t, ω) is almost periodic because ν k ∈ Md+ (R). A similar result also holds good for the discrete time case. (In this case R should be replaced by T as usual.) Theorem 9.8 tells us that any weakly stationary process X(t, ω) can be approximated by a sequence {Xk (t, ω)} of almost periodic weakly stationary processes in the sense that the sequence of the spectral measures of Xk (t, ω) converges to the spectral measure of X(t, ω) in the w∗ -topology.25

References 1. Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968) 2. Bochner, S.: Beiträge zur Theorie der fastperiodischen Funktionen, I, II. Math. Ann. 96, 119– 147, 383–409 (1927) 3. Bochner, S., von Neumann, J.: Almost periodic functions in a group, II. Trans. Amer. Math. Soc. 37, 21–50 (1935) 4. Bohr, H.: Zur Theorie der fastperiodischen Funktionen. I–III. Acta Math. 45, 29–127 (1925); 46, 101–214 (1925); 47, 237–281 (1926) 5. Bohr, H.: Fastperiodische Funktionen. Springer, Berlin (1932)

25 This

observation is based upon Maruyama [15].

278

9 Almost Periodic Functions and Stochastic Processes

6. Dudley, R.M.: Real Analysis and Probability. Wadsworth and Brooks, Pacific Grove (1988) 7. Dunford, N., Schwartz, J.T.: Linear Operators, Part 1, Interscience, New York (1958) 8. Katznelson, Y.: An Introduction to Harmonic Analysis, 3rd edn. Cambridge University Press, Cambridge (2004) 9. Kawata, T.: Ohyo Sugaku Gairon (Elements of Applied Mathematics). I, II. Iwanami Shoten, Tokyo (1950, 1952) (Originally published in Japanese) 10. Kawata, T.: On the Fourier series of a stationary stochastic process, I, II. Z. Wahrsch. Verw. Gebiete, 6, 224–245 (1966); 13, 25–38 (1969) 11. Kawata, T.: Teijo Kakuritsu Katei (Stationary Stochastic Processes). Kyoritsu Shuppan, Tokyo (1985) (Originally published in Japanese) 12. Loomis, L.: The spectral characterization of a class of almost periodic functions. Ann. Math. 72, 362–368 (1960) 13. Malliavin, P.: Integration and Probability. Springer, New York (1995) 14. Maruyama, T.: Sekibun to Kansu-kaiseki (Integration and Functional Analysis). Springer, Tokyo (2006) (Originally published in Japanese) 15. Maruyama, T.: Fourier analysis of periodic weakly stationary processes. A note on Slutsky’s observation. Adv. Math. Econ. 20, 151–180 (2016) 16. Rudin, W.: Weak almost periodic functions and Fourier–Stieltjes transforms. Duke Math. J. 26, 215–220 (1959) 17. Rudin, W.: Fourier Analysis on Groups. Interscience, New York (1962) 18. Stromberg, K.R.: An Introduction to Classical Real Analysis. American Mathematical Society, Providence (1981) 19. von Neumann, J.: Almost periodic functions in a group, I. Trans. Amer. Math. Soc. 36, 445–492 (1934)

Chapter 10

Fredholm Operators

A bounded linear operator acting between Banach spaces is called a Fredholm operator if the dimension of its kernel and the codimension of its image are both finite. An equation defined by a Fredholm operator sometimes enjoys a nice property which reduces the difficulties associated with infinite dimension to the finite dimensional problem. The object of this chapter is to study the basic elements of Fredholm operators, which will be made use of in the next chapter in the context of bifurcation theory. It is not easy to extend the spectral theory of linear operators between finite dimensional spaces to infinite dimensional case. Several special kinds of operators are known for which detailed spectral theory can be constructed even in an infinite dimensional context; compact operators and Fredholm operators are typical and remarkable examples. In this book, we do not intend to go into details of spectral theory itself.1

10.1 Direct Sums and Projections All the normed linear spaces in this chapter are assumed to be real. Let M1 and M2 be two subspaces of a vector space X. If every x ∈ X can be expressed uniquely as x = m1 + m2 for some m1 ∈ M1 and m2 ∈ M2 , X is said to be a direct sum of M1 and M2 , and this situation is denoted by X = M1 ⊕ M2 . Suppose that X = M1 ⊕ M2 . If x ∈ X is expressed uniquely as x = m 1 + m2 ;

m1 ∈ M1 , m2 ∈ M2 ,

1 We are indebted to Kuroda [3] Chap. 11 and Zeidler [6] Chaps. 3 and 5. See also Kato [2] Chap. 6 for the spectral theory of compact operators.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_10

279

280

10 Fredholm Operators

the operators P1 :x → m1

(X → M1 ),

P2 :x → m2

(X → M2 )

are called the projection to M1 (along M2 ) and the projection to M2 (along M1 ), respectively. Both P1 and P2 are linear operators and have basic properties as follows: 1◦ P12 = P1 , P22 = P2 . 2◦ P1 (X) = M1 , P2 (X) = M2 . 3◦ M2 = (I − P1 )(X) = KerP1 , M1 = (I − P2 )(X) = KerP2 . Theorem 10.1 Suppose that X is a linear space and a linear operator P : X → X satisfies P 2 = P . Then X = M1 ⊕ M2 , where M1 = P (X),

M2 = (I − P )(X).

Proof It is clear that M1 and M2 are linear subspaces of X. For each x ∈ X, we define m1 and m2 as m1 = P x,

m2 = (I − P )x.

Then, obviously, x = m1 + m 2 ;

m1 ∈ M1 , m2 ∈ M2 .

Such an expression is unique. If another expression x = m1 + m2 ;

m1 ∈ M1 , m2 ∈ M2

is possible, there exist x , x ∈ M which satisfy m1 = P x ,

m2 = (I − P )x

by the definitions of M1 and M2 . Since P 2 = P , we have P m1 = m1 ,

P m2 = 0.

Hence it holds good that m1 = P x = P m1 + P m2 = P m1 = m1 .

10.1 Direct Sums and Projections

281

Similarly, it is easy to show that m2 = m2 .

Theorem 10.2 Let X be a linear space and M its subspace. (i) There exists a subspace N of X such that X = M ⊕ N. (ii) If X = M ⊕ N, then codimM = dimN. (iii) If X = M ⊕ N, then dimX = dimM + dimN = dimM + codimM. (Hence in the case of dimX < ∞, the equality codimM = dimX − dimM holds good.) Proof (i) Let P be the set of all the linear operators P : D(P ) → M such that the domain D(P ) of P includes M and P (x) = x for all x ∈ M. Of course, P ∅.2 Define a partial ordering ≤ on P by P1 ≤ P2 ⇔ P2 is an extension of P1 . Then any chain in P has an upper bound in P. Hence, by Zorn’s lemma, P has a maximal element P0 with respect to ≤. It can be shown that P0 is defined on the whole space X. Assume, on the contrary, that D(P0 ) X. Define a subspace N by N = D(P0 ) + span{x0 }, where x0 is any element of X \ D(P0 ). Then the operator P : N → M defined by P (x + αx0 ) = P0 (x),

x ∈ D(P0 ),

α∈R

is an element of P and is an extension of P0 . This contradicts the maximality of P0 . Furthermore, P02 = P0 can be verified easily. In fact, P0 (P0 x) = P0 x, since P0 x ∈ M for any x ∈ X. Hence (i) follows by Theorem 10.1. (ii) Assume that X = M ⊕ N. Define a linear operator T : M → X/N by T x = ξx ,

x ∈ M,

where ξx is the residue class x + N of x modulo N. Since T is clearly a bijection, we obtain dimM = dimX/N.

2 Let

P : M → M the identity mapping on M. This P is clearly contained in P.

282

10 Fredholm Operators

(iii) It is enough to show only dimX = dimM + dimN. If dimM = ∞ or dimN = ∞, it is obvious that dimX = ∞. So there is nothing to prove in this case. If dimM < ∞ and dimN < ∞, the union of bases of M and N forms a basis of X. The desired result immediately follows. We now introduce a topology to X. Suppose that X is a normed linear space and X = M1 ⊕ M2 for some pair of subspaces, M1 and M2 . Let P1 and P2 be the projections to M1 and M2 , respectively. Then it is almost obvious that the following three statements are equivalent: (i) P1 is continuous. (ii) P2 is continuous. (iii) Both P1 and P2 are continuous. If any one of (i)–(iii) holds good, X is said to be a topological direct sum of M1 and M2 . M1 and M2 are called topological complements of each other. Theorem 10.3 Let X be a normed linear space, with a subspace M. (i) M has a topological complement if and only if there exists a continuous linear operator P : X → X such that a. P 2 = P ,

b. P (X) = M.

(ii) If M has a topological complement, M is closed. Theorem 10.4 Let X be a Banach space such that X = M1 ⊕ M2 for some subspaces M1 , M2 . Then the following two statements are equivalent: (i) X is a topological direct sum of M1 and M2 . (ii) M1 and M2 are closed. Proof Since (i)⇒(ii) is obvious, we have only to prove (ii)⇒(i). Let P1 and P2 be the projections to M1 and to M2 , respectively. We have to show their continuity. By the remark stated above, it is sufficient to show the continuity of either of P1 and P2 , say P1 . Furthermore, P1 is continuous if its graph G(P1 ) = {(x, m1 ) ∈ X × M1 | m1 = P1 x} is closed in X × M1 .3 Let {xn } be a sequence in X such that m1n = P1 xn ,

m2n = P2 xn ,

n = 1, 2, · · ·

Then we have, of course, (xn , m1n ) ∈ G(P1 ). If (xn , m1n ) → (x ∗ , m∗1 )

as

n → ∞,

3 This is so by the closed graph theorem (Dunford and Schwartz [1] pp. 57–58, Maruyama [4] pp. 145–146).

10.1 Direct Sums and Projections

283

then m2n → m∗2 ≡ x ∗ − m∗1 (as n → ∞). By (ii), m∗1 ∈ M1 and m∗2 ∈ M2 . Hence m∗1 = P1 x ∗ ,

m∗2 = P2 x ∗

and so (x ∗ , m∗1 ) ∈ G(P1 ). Therefore G(P1 ) is closed.

Theorem 10.5 provides several cases in which a Banach space can be a topological direct sum. We need a new concept of a biorthogonal system to prove it. Definition 10.1 Suppose that X is a normed linear space and that x1 , x2 , · · · , xn ∈ X and Λ1 , Λ2 , · · · , Λn ∈ X .4 {xj , Λj }j =1,2,··· ,n is called a biorthogonal system of X if it satisfies Λi , xj = δij

for all

i, j = 1, 2, · · · , n.

Lemma 10.1 Let X be a normed linear space. If Λ1 , Λ2 , · · · , Λk+1 ∈ X satisfy KerΛ1 ∩ · · · ∩ KerΛk ⊂ KerΛk+1 , then Λk+1 can be expressed as a linear combination of Λ1 , Λ2 , · · · , Λk . Proof We prove this lemma by induction. Suppose first that k = 1, and KerΛ1 ⊂ KerΛ2 . Without loss of generality, we may assume that Λ1 0. Since X/KerΛ1 R, we obtain X = KerΛ1 ⊕ span{x0 },

x0 KerΛ1

(topological direct sum)

by Theorem 10.5 (iii). Any x ∈ X can be expressed as x = u + αx0 ,

u ∈ KerΛ1 ,

α ∈ R.

Taking account of KerΛ1 ⊂ KerΛ2 , we obtain Λ1 (x) = αΛ1 (x0 ),

Λ2 (x) = αΛ2 (x0 ).

Since Λ1 (x0 ) 0, we can define γ = Λ2 (x0 )/Λ1 (x0 ). Thus Λ2 (x) = γ Λ1 (x). Next we assume that the claim of the lemma holds good if the number of given elements of X is k. Consider the case of k + 1(k 2). We write Λ˜ j = Λj |KerΛ1 ,

j = 2, 3, · · · , k + 1.

If we regard KerΛ1 as a normed linear space, Λ˜ j ∈ (KerΛ1 ) ,

4 X

j = 2, 3, · · · , k + 1.

is the dual space of X. Λi , xj means the value of Λj at xj ; i.e. Λj (xj ). δij is Kronecker’s delta.

284

10 Fredholm Operators

By assumption KerΛ˜ 2 ∩ KerΛ˜ 3 ∩ · · · ∩ KerΛ˜ k ⊂ KerΛ˜ k+1 . Hence, by assumption of induction, we can express Λ˜ k+1 as Λ˜ k+1 = γ2 Λ˜ 2 + · · · + γk Λ˜ k (γj ∈ R). If we define Φ ∈ X by Φ = Λk+1 − (γ2 Λ2 + · · · + γk Λk ) ∈ X , Φ satisfies Φ = Λ˜ k+1 − (γ2 Λ˜ 2 + · · · + γk Λ˜ k ) = 0

on

KerΛ1 .

That is KerΛ1 ⊂ KerΦ. Consequently, we can express Φ as Φ = γ1 Λ1 , following the case of k = 1. Finally, we obtain Λk+1 = γ1 Λ1 + γ2 Λ2 + · · · + γk Λk . Lemma 10.2 Let X be a normed linear space. (i) If x1 , · · · , xn ∈ X are linearly independent, there exist some Λ1 , · · · , Λn ∈ X such that {xj , Λj }j =1,2,··· ,n is a biorthogonal system. (ii) If Λ1 , · · · , Λn ∈ X are linearly independent, there exist some x1 , · · · , xn ∈ X such that {xj , Λj }j =1,2,··· ,n is a biorthogonal system. Proof (i) Let Mj be a subspace of X spanned by x1 , · · · , xj −1 , xj +1 , · · · , xn , excluding xj . Of course, xj Mj . By the Hahn–Banach theorem, there exists a Λj ∈ X which satisfies Λj (xj ) = 1 and vanishes on Mj . Combining x1 , x2 , · · · , xn with Λ1 , Λ2 , · · · , Λn thus B chosen, we obtain a biorthogonal system. (ii) By Lemma 10.1, KerΛi \ KerΛj ∅ for every j . Hence there exists some xj ∈ X which satisfies

ij

Λj (xj ) = 1,

Λi (xj ) = 0 for

Thus {xj , Λj }j =1,2,··· ,n is a biorthogonal system.

i j.

Theorem 10.5 Let X be a Banach space, with a subspace M. Then M has a topological complement if either one of the following conditions is satisfied: (i) X is a Hilbert space, with a closed subspace M. (ii) dim M < ∞. (iii) M is closed, and codimM < ∞.

10.1 Direct Sums and Projections

285

Proof (i) is explained in almost all the textbooks on functional analysis,5 so we may omit it here. (ii) Let {x1 , x2 , · · · , xn } be the basis of M. By Lemma 10.1, there exist Λ1 , Λ2 , · · · , Λn ∈ X such that {xj , Λj }j =1,··· ,n is a biorthogonal system. Define a linear operator P : X → X by Px =

n

Λj (x)xj ;

x ∈ X.

j =1

Then P is continuous and satisfies P 2 = P , since P 2x =

n

Λk (P x)xk =

k=1

=

Λk

n

Λj (x)xj xk

j =1

k=1

n 0 n k=1

n

1 n Λj (x)Λk (xj ) xk = Λk (x)xk = P x;

j =1

x ∈ X.

k=1

We note that P x ∈ M for any x ∈ X, since P x is a linear combination of x1 , x2 , · · · , xn . Hence P (X) ⊂ M. Conversely, it is also clear that P (X) ⊃ M, since P xj = xj (j = 1, 2, · · · , n). So we obtain P (X) = M. By Theorem 10.3, M has a topological complement. (iii) Let ξyk ≡ yk + M;

k = 1, 2, · · · , p

be a basis of X/M. (Note that dim(X/M) < ∞.) For any x ∈ X, there exists some k such that x ∈ yk + M. Hence there exists some z ∈ M such that x = yk + z.

(10.1)

If we define N = span{y1 , y2 , · · · , yp }, it is clear that X = M + N. We now show that the expression (10.1) is unique. The only important point to be checked for proving the uniqueness is M ∩ N = {0}. In fact, if 0

p

αk yk ∈ M ∩ N,

k=1

5 See

Maruyama [4] p. 190, Yosida [5] p. 82 for instance.

286

10 Fredholm Operators

then we have p 0 = ξ)

αk yk

=

k=1

p

αk ξyk

(0 on the left-hand side means ξ0 ),

k=1

which is a contradiction to the assumption that ξyk (k = 1, 2, · · · , p) is a basis of X/M. Suppose that x ∈ X can be expressed in two ways, x = z + y = z + y ,

z, z ∈ M,

y, y ∈ N.

Then we have obviously z − z = y − y ∈ M ∩ N. This implies that z = z and y = y . Hence X = M ⊕ N. By Theorem 10.4, it is a topological direct sum. If a subspace M satisfies (iii), then any subspace N containing M also satisfies (iii). Theorem 10.6 Let X be a Banach space and M a closed subspace with codim M < ∞. Then any subspace N containing M is closed and codim N < ∞. Proof The canonical mappings, π1 : X → X/M and π2 : N → N/M, are continuous in the quotient topologies for X/M and N/M. Since dim X/M < ∞, it is obvious that dim N/M < ∞. Hence N/M is a closed subspace of X/M. It follows that N = π1−1 (N/M) is closed. Suppose that ξx1 , ξx2 , · · · , ξxn ∈ X/N are linearly independent, where xj ∈ X and ξxj = xj + N(j = 1, 2, · · · , n). Since n

αj xj ≡ 0

(mod N) ⇒

αj = 0 for all j,

αj xj ≡ 0 (mod M) ⇒

αj = 0 for all j.

j =1

we also obtain n j =1

That is, x1 + M, x2 + M, · · · , xn + M are linearly independent in X/M. Consequently, we have dimX/N dimX/M.

10.1 Direct Sums and Projections

287

Lemma 10.3 Let X be a normed linear space, with a closed subspace M. Then dim M⊥ < ∞ if and only if there exists a subspace N which satisfies X = M ⊕ N and dim N < ∞. In this case, dim M⊥ = dimN.6 Proof (Sufficiency) Assume that X = M ⊕ N and dim N = n < ∞. Let v1 , v2 , · · · , vn be a basis of N. Then there exist Λj ∈ M⊥ (j = 1, 2, · · · , n) such that {vj , Λj }j =1,2,··· ,n is a biorthogonal system. We can prove this in a way similar to Lemma 10.2 (i). That is, if we define Nj = span{v1 , · · · , vj −1 , vj +1 , · · · , vn },

j = 1, 2, · · · , n

and Mj = M ⊕ Nj , then Mj is a closed subspace by Theorem 10.3. Of course vj Mj . By the Hahn–Banach theorem, there exists Λj ∈ X , which satisfies Λj (vj ) = 1,

Λj (v) = 0 on Mj .

Λ1 , Λ2 , · · · , Λn thus defined clearly fulfill the desired conditions. Λ1 , Λ2 , · · · , Λn are linearly independent.7 We have only to verify that any Λ ∈ ⊥ M is a linear combination of Λ1 , Λ2 , · · · , Λn in order to conclude that M⊥ = span{Λ1 , Λ2 , · · · , Λn } and dim M⊥ = n. Let Λ ∈ M⊥ . Defining Φ=

n

Λ(vj )Λj ,

j =1

we obtain Φ(vk ) =

n

Λ(vj )δj k = Λ(vk ).

j =1

On the other hand, x ∈ X can be expressed as x =u+

n

αj vj ,

u ∈ M.

j =1

6 M⊥

denotes the annihilator, i.e. M⊥ = {Λ ∈ X |Λx = 0 for all x ∈ M}.

α1 Λ1 + · · · + αn Λn = 0. Evaluate the left-hand side at vj . By the definition of a biorthogonal system, it follows that αj = 0.

7 Set

288

10 Fredholm Operators

Since Λ, Φ ∈ M⊥ , we have Λ(x) = Φ(x)

for all

x∈X

by Λ(vj ) = Φ(vj ). That is, Λ = Φ. Hence Λ is a linear combination of Λj ’s. (Necessity) Assume that dim M⊥ = n < ∞ and Λ1 , Λ2 , · · · , Λn form a basis of M⊥ . By Lemma 10.2, there exists a biorthogonal system of the form {uj , Λj }j =1,2,··· ,n . Let N = span{v1 , v2 , · · · , vn }. We show that X = M ⊕ N (topological direct sum). It is easy to verify M ∩ N = {0}. Any x ∈ M ∩ N can be expressed as x=

m

αj ∈ R,

αj vj ,

j =1

since x ∈ N. On the other hand, Λj (x) =

n

αk Λ(vk ) = αj = 0

k=1

by x ∈ M. Hence x = 0. For any x ∈ X, we define w=

n

Λj (x)vj ∈ N,

u = x − w.

j =1

We have only to show that u ∈ M. On the contrary, suppose that u M. By the Hahn–Banach theorem, there exists Λ ∈ M⊥ , which satisfies Λ(u) 0. Λ can be expressed in the form Λ=

n

βj Λj ,

βj ∈ R.

j =1

Since Λj (w) =

n k=1

Λk (x)Λj (vk ) =

n

Λk (x)δj k = Λj (x),

k=1

we obtain Λ(w) = Λ(x). Hence Λ(u) = 0, which contradicts Λ(u) 0.

Lemma 10.4 Suppose that X is a normed linear space. If M is a closed subspace of X and N is a finite dimensional subspace of X such that M ∩ N = {0}, then M ⊕ N is a closed subspace.

10.1 Direct Sums and Projections

289

Proof Let {xn } be a sequence in M ⊕ N which converges to x ∗ . We have to show that x ∗ ∈ M ⊕ N. Each xn can be expressed as xn = un + vn ;

un ∈ M, vn ∈ N.

Then we observe that the sequence { vn } is bounded. If it is unbounded, there exists some subsequence {vn } of {vn } such that vn → ∞ (as n → ∞). Define zn =

vn .

vn

Then we obtain un 1 1 + z

un + vn =

xn → 0 n= v

v

v n n n

as

n → ∞.

However, {zn } is a sequence in the unit sphere SN of N, and un / vn ∈ M. So the above result implies inf ρ(z, M) = 0,

z∈SN

where ρ(z, M) is the distance between z and M. Since SN is compact by dim N < ∞, there exists some z0 ∈ SN such that ρ(z0 , M) = 0. On the other hand, we are assuming that M is closed and M ∩ N = {0}. Hence we obtain ρ(z0 , M) > 0 since z0 ∈ SN (and so z0 0). Contradiction. Thus we have proved that { vn } is bounded. Hence there exists a subsequence {vn } (different subsequence from {vn } used above) of {vn } such that vn → v ∗ ∈ N. Since un = xn − vn and xn → x ∗ , vn → v ∗ , we obtain un → x ∗ − v ∗ ≡ u∗

as

n → ∞.

u∗ ∈ M because M is closed. We conclude that x ∗ ∈ M ⊕ N.

290

10 Fredholm Operators

10.2 Fredholm Operators: Definitions and Examples Definition 10.2 Let X and Y be normed linear spaces. A bounded linear operator T : X → Y is called a Fredholm operator if it satisfies dim KerT < ∞ and

codimT (X) < ∞.

If T is a Fredholm operator, κ(T ) = dim KerT − codimT (X) is called the index of T . Example 10.1 If X = Rn and Y = Rm , any linear operator T : Rn → Rm is Fredholm. Since X/KerT T (X), codim KerT = dim T (X). Consequently, we obtain κ(T ) = dim KerT − codimT (X) = (n − codim KerT ) − (m − dim T (X)) = n − m. Example 10.2 Let X = C1 ([a, b], R) and Y = C([a, b], R).8 A linear operator T is given by T : f → f ,

f ∈ C1 .

Since KerT is the space of constant functions, dim KerT = 1. Furthermore, codimT (X) = 0, since T (X) = Y. Hence T is Fredholm and κ(T ) = 1. Example 10.3 Let λ ∈ R\{0} and a continuous function K(x, y) : [a, b]×[a, b] → R be given. Define an operator T : L2 ([a, b], R) → L2 ([a, b], R) by

Tf (x) =

b

K(x, y)f (y)dy − λf (x).

a

(X = Y = L2 ([a, b], R).) As is well-known, the operator

b

S : f →

K(x, y)f (y)dy a

8 The

topologies of C1 and C are defined by the C1 -norm and the uniform convergence norm, respectively.

10.2 Fredholm Operators: Definitions and Examples

is a compact operator on L2 ([a, b], R). Since T codimT (X).9 Hence T is Fredhom and κ(T ) = 0.

291

= S − λI , dim KerT

=

Theorem 10.7 Let X and Y be Banach spaces. If T : X → Y is a bounded linear operator and codimT (X) < ∞, then T (X) is closed. Proof By Theorem 10.2, there exists some subspace Z such that10 Y = T (X) ⊕ Z. Since dim Z < ∞, Z is a closed subspace. Define an operator S : X × Z → Y by S(x, z) = T x + z;

x ∈ X, z ∈ Z.

Then S is a bounded linear operator of the Banach space X × Z onto another Banach space Y. Hence, by the open mapping theorem,11 S is an open mapping. If we define an operator S˜ : X × Z/KerS → Y by S˜ : (x, z) + KerS → T x + z, ˜ T (X) coincides X × Z/KerS and Y are isomorphic (as Banach spaces) through S. ˜ Thus we can conclude that with the image of the Banach space X × {0}/KerS by S. T (X) is complete (and so closed). Theorem 10.8 Let X be a Banach space, and K : X → X a compact operator. (i) dim Ker(I + K) < ∞. (ii) dim Ker(I + K ) < ∞.12 (iii) The image of I + K is closed.

A : X → X be a compact operator and λ 0. All the following four numbers are finite and equal. (A is the dual operator of A.)

9 Let

α = dim Ker(A − λI ),

α = dim Ker(A − λL), 10 We

β = codim(A − λI )(X), β = codim(A − λI )(X ).

can not have recourse to Theorem 5.1 because the closedness of T (X) has not been shown yet. 11 See Maruyama [4] pp. 139–142, Yosida [5] pp. 75–77. 12 K is the dual operator of K.

292

10 Fredholm Operators

Proof (i) Let S be the unit ball of Ker(I + K). Since I = −K on Ker(I + K), we have S = −K(S). So S is relatively compact because K is a compact operator. S being closed, it is compact. Hence dim Ker(I + K) < ∞. (ii) Similarly, (ii) follows as in (i), since (I + K) = I + K and K is a compact operator. (iii) According to (i) and Theorem 10.5, there exists a closed subspace M such that X = M ⊕ Ker(I + K) (topological direct sum). It should be observed that there exists a number θ > 0 such that

m θ (I + K)m for all

m ∈ M.

(10.2)

If there is no such θ , there exists a sequence {mn } of M, which satisfies13 mn ∈ M,

mn = 1,

(I + K)mn → 0.

Since K is a compact operator. {Kmn } has a convergent subsequence {Kmn }. Let y ∗ be its limit. Then mn → −y ∗ , since Kmn → y ∗ .

(I + K)mn = mn + Kmn → 0,

That is, y ∗ ∈ Ker(I + K). −y ∗ ∈ M because M is closed. Hence we obtain y ∗ ∈ M ∩ Ker(I + K). It follows that y ∗ = 0. However, it is a contradiction to mn → −y ∗ ,

mn = 1.

Finally, we show that (I + K)(X) is closed. Assume that (I + K)xn = un → u∗ ,

xn ∈ X.

Since X = M ⊕ Ker(I + K), we can express un as un = (I + K)mn ,

mn ∈ M.

(10.3)

reason is as follows. For any ε > 0, there exists some vε ∈ M such that vε > ε−1 (I + K)vε . So defining mε = vε / vε , we have

13 The

(I + K)mε < ε.

10.3 Parametrix

293

By (10.2) and (10.3), we have

mn − mk θ un − uk . Hence {mn } is Cauchy, and so it has a limit m∗ . By the continuity of I + K, u∗ = lim un = lim (I + K)mn = (I + K)m∗ , n→∞

n→∞

which implies u∗ ∈ (I + K)(X).

10.3 Parametrix To start with, we prepare a useful lemma, which will be made use of frequently. Lemma 10.5 Let X and Y be Banach spaces, and T : X → Y a Fredholm operator. Assume that M ⊂ X and Z ⊂ Y are subspaces which satisfy X = V ⊕ M, Y = Z ⊕ R (topological direct sum), where V = KerT and R = T (X). Then the restriction T |M of T to M is an isomorphism between two normed spaces, M and R. Proof It is obvious that T |M : M → R is linear. Any x ∈ X is uniquely expressed as x = v + m (v ∈ V, m ∈ M). Hence T x = T v + T m = 0 + T m. Thus we obtain R = T |M (M). T |M is injective, since {m ∈ M | T |M m = 0} = V ∩ M = {0}. This proves that T |M is a bijection. Therefore it has the inverse T |−1 M : R → M. M and R are closed by Theorem 10.4. Hence, by the open mapping theorem,14 T |M is an open mapping. We conclude that T |M is an isomorphism between M and R. Theorem 10.9 (parametrix) Let X, Y be Banach spaces and T : X → Y a bounded linear operator. Then the following statements are equivalent: (i) T is a Fredholm operator. (ii) There exist bounded linear operators Pl , Pr : Y → X as well as compact linear operators Cl : X → X,

14 See

footnote 11, p. 291.

Cr : Y → Y

294

10 Fredholm Operators

such that Pl T = I + Cl ,

(10.4)

T Pr = I + Cr .

(10.5)

Proof (i)⇒(ii): If T is Fredholm, there exist subspaces M ⊂ X and Z ⊂ Y such that X = V ⊕ W,

Y = Z ⊕ R (topological direct sum),

where V ≡ KerT and R ≡ T (X). (Note that R is closed by Theorem 10.7.) Let πV : X → V,

πZ : Y → Z

be canonical projections corresponding to the above direct sums. The restriction T |M : M → R of T to M is an isomorphism by Lemma 10.5. Let S : Y → M be an operator of Y into M, which maps each element y = z + r (z ∈ Z, r ∈ R) of Y to T |−1 M (r); i.e. S : (z + r) → T |−1 M (r). Then ST : X → X and T S : Y → Y can be expressed as ST = I − πV , T S = I − πZ . Since T is Fredholm, dim πV (X) < ∞,

dim πZ (Y) < ∞.

Hence both πV and πZ are compact operators. Defining Pl = Pr = S, Cl = −πV ,

Cr = −πZ ,

we obtain (ii). (ii)⇒(i): We next assume (ii). By (10.4), KerT ⊂ Ker(I + Cl ). Theorem 10.7 tells us that dim Ker(I + Cl ) < ∞. Hence dim KerT dim Ker(I + Cl ) < ∞.

10.4 Product of Fredholm Operators

295

On the other hand, we have (I + Cr )(Y) ⊂ T (X) = R by (10.5). (I + Cr )(Y) is closed (by Theorem 10.7) and codim(I + Cr )(Y) < ∞ (see footnote 9 on p. 291). We obtain codimR < ∞ by Theorem 10.6. We conclude that T is Fredholm. The operators Pl and Pr , the existence of which has just been confirmed by the above lemma, are called the left parametrix and the right parametrix, respectively. Remark 10.1 1◦ The operator S : Y → X obtained above is Fredholm and κ(S) = −κ(T ). (It is clear that KerS = Z and S(Y) = M.) 2◦ As is observed from the proof above, (i)⇒(ii) can be reformulated in a more exact form. “Let T be a Fredholm operator. Then there exist a bounded linear operator S : Y → X and a couple of compact operators Cl : X → X and Cr : Y → Y which satisfy ST = I + Cl ,

(10.4 )

T S = I + Cr .

(10.5 )

Furthermore: (a) S is Fredholm, (b) dim Cl (X) < ∞, (c) κ(S) = −κ(T ).”

dim Cr (Y) < ∞,

10.4 Product of Fredholm Operators Theorem 10.10 Let X, Y and Z be Banach spaces, and let T : X → Y and S : Y → Z be Fredholm operators. Then the product ST = S ◦ T : X → Z is also Fredholm and κ(ST ) = κ(S) + κ(T ). Proof Since T and S are Fredholm, dim KerT < ∞ and codimS(Y) < ∞, which implies that S(Y) is closed (by Theorem 10.7). Then there exist subspaces W ⊂ X and U ⊂ Z such that

296

10 Fredholm Operators

X = KerT ⊕ W, Z = U ⊕ S(Y)

(10.6) (topological direct sum)

(10.7)

by Theorem 10.5. Y can be expressed as a direct sum in terms either of T or S. If we define Y1 = T (X) ∩ KerS,

(10.8)

dim Y1 < ∞. Since T (X) and KerS are closed subspaces, they can be expressed as T (X) = Y1 ⊕ Y2 ,

(10.9)

KerS = Y1 ⊕ Y3

(10.10)

(topological direct sums) by Theorem 10.5. Suppose that y ∈ T (X) ∩ Y3 . Then y ∈ T (X) ∩ KerS = Y1 by (10.8). By Y1 ∩ Y3 = {0}, we obtain y = 0. That is, T (X) ∩ Y3 = {0}. Hence we can consider T (X) ⊕ Y3 . Since T (X) is closed and codimT (X) < ∞, T (X ⊕ YZ ) is closed and codim(T (X) ⊕ YZ ) < ∞ by Theorem 10.6. Applying Theorem 10.5 repeatedly, we obtain Y = (T (X) ⊕ Y3 ) ⊕ Y4 = T (X) ⊕ (Y3 ⊕ Y4 )

(10.11)

for some subspace Y4 . By (10.9) and (10.10), Y = Y1 ⊕ Y2 ⊕ Y3 ⊕ Y4 = Y2 ⊕ Y4 ⊕ KerS.

(10.12)

Since codimT (X) = dim Y3 + dim Y4 by (10.11), we obtain κ(T ) = dim KerT − (dim Y3 + dim Y4 ).

(10.13)

We also have dim KerS = dim Y1 + dim Y3 by (10.10). Hence κ(S) = dim Y1 + dim Y3 − dim U by (10.7).

(10.14)

10.4 Product of Fredholm Operators

297

Noting that T |W : W → T (X) is an isomorphism by Lemma 10.5, we have, from (10.9), that −1 W = T |−1 W (Y1 ) ⊕ T |W (Y2 ).

(10.15)

Combining this with (10.6), we have −1 X = KerT ⊕ T |−1 W (Y1 ) ⊕ T |W (Y2 ).

(10.16)

KerT ⊕ T |−1 W (Y1 ) ⊂ Ker(ST ).

(10.17)

It is clear that

ST is injective on T |−1 W (Y2 ). In fact, S is injective on Y2 , since T (X) = (T (X) ∩ KerS) ⊕ Y2 by (10.8) and (10.9). And T maps T |−1 W (Y2 ) to Y2 one-to-one (Lemma 10.5). Consequently, Ker(ST ) = KerT ⊕ T |−1 W (Y1 ), ST (X) = ST (T |−1 W (Y2 )) = SY2 .

(10.18) (10.19)

It follows from (10.18) that dim Ker(ST ) = dim KerT + dim Y1 .

(10.20)

According to (10.12), S maps Y2 ⊕ Y4 onto S(Y) one-to-one (Lemma 10.5). Since Y2 is a closed subspace and dim Y4 < ∞, Y2 ⊕ Y4 is a closed subspace by Lemma 10.4. Applying Lemma 10.5 again, we verify that S|Y2 ⊕Y4 is an isomorphism between Y2 ⊕ Y4 and S(Y). Hence ST (X) is closed by (10.19). Since Z = U ⊕ S(Y) = U ⊕ S(Y2 ) ⊕ S(Y4 ), we obtain codimST (X) = dim U + dim Y4

(10.21)

by (10.19). By (10.20) and (10.21), κ(ST ) = dim KerT + dim Y1 − (dim U + dim Y4 ). Taking account of (10.13) and (10.14), we conclude κ(ST ) = κ(S) + κ(T ).

298

10 Fredholm Operators

10.5 Stability of Indices Theorem 10.11 Let X and Y be Banach spaces, and T : X → Y a Fredholm operator. Then there exists some ε > 0 such that κ(U ) = κ(T ) for any Fredholm operator U : X → Y with U − T < ε. Proof By Remark 10.1, there exist a bounded linear operator S : Y → X and a pair of compact operators, Cl : X → X and Cr : Y → Y which satisfy (10.4 ), (10.5 ) as well as (a), (b) and (c) there. Let ε = S −1 .15 If V : X → Y is a bounded linear operator with V < ε, then SV : X → X satisfies SV S · V < 1 and V S : Y → Y also satisfies V S < 1. Hence I + SV and I + V S have bounded inverses (I + SV )−1 and (I + V S)−1 .16 If we define W = T + V , (10.4 ) and (10.5 ) become SW = I + SV + Cl , W S = I + V S + Cr . Multiplying (I + SV )−1 by the left of the first formula, and multiplying (I + V S)−1 by the right of the second, we obtain (I + SV )−1 SW = I + (I + SV )−1 Cl , W S(I + V S)−1 = I + Cr (I + V S)−1 . Writing (I + SV )−1 S = Pl , S(I + V S)−1 = Pr , (I + SV )−1 Cl = Cl and Cr (I + V S)−1 = Cr , we can rewrite the above relations as Pl W = I + Cl ,

W P r = I + Cr .

Since dim Cl (X) < ∞ and dim Cr (Y) < ∞ (Remark 10.1, 2◦ ), Cl and Cr are compact operators. Hence, by Theorem 10.9, W is a Fredholm operator. We next evaluate the index. Since κ(S) = −κ(T ) and κ((I + SV )−1 ) = 0, we obtain κ(Pl W ) = κ((I + SV )−1 SW ) = κ((I + SV )−1 ) + κ(S) + κ(W ) = −κ(T )+κ(W ) by Theorem 10.10. On the other hand, dim Cl (X) < ∞ implies κ(I + Cl ) = 0. Therefore κ(E) = κ(T ).

the case of S = 0, we may consider ε = ∞. But such a case occurs when dim X < ∞ and dim Y < ∞. 16 Maruyama [4] pp. 153–154. 15 In

References

299

Theorem 10.12 Let X and Y be Banach spaces. If T : X → Y is a Fredholm operator and K : X → Y is a compact operator, then T + K is Fredholm and κ(T + K) = κ(T ). Proof There exist parametrices Pl , Pr : Y → X and compact operators Cl : X → X, Cr : Y → Y such that Pl T = I + Cl ,

T Pr = I + Cr .

Consequently, Pl (T + K) = I + Cl + Pl K, (T + K)Pr = I + Cr + KPr . Since K is a compact operator, so are Pl K and KPr . So we observe that Pl and Pr are parametrices of T + K. That is, T + K is Fredholm. Define a continuous mapping ϕ : [0, 1] → L(X, Y) (space of bounded linear operators) by ϕ(t) = T + tK. Since tK is a compact operator, T + tK is Fredholm for every t ∈ [0, 1]. κ(T + tK) is locally constant on [0, 1] by Theorem 10.11. Hence, by connectedness of [0, 1], κ(T + tK) = constant

for all t ∈ [0, 1].

References 1. Dunford, N., Schwartz, J.T.: Linear Operators, Part 1. Interscience, New York (1958) 2. Kato, T.: Iso Kaiseki (Functional Analysis). Kyoritsu Shuppan, Tokyo (1957) (Originally published in Japanese) 3. Kuroda, N.: Kansu Kaiseki (Functional Analysis). Kyoritsu Shuppan, Tokyo (1980) (Originally published in Japanese) 4. Maruyama, T.: Suri-keizaigaku no Hoho (Methods in Mathematical Economics). Sobunsha, Tokyo (1995) (Originally published in Japanese) 5. Yosida, K.: Functional Analysis. 3rd edn. Springer, New York (1971) 6. Zeidler, E.: Applied Functional Analysis. Springer, New York (1995)

Chapter 11

Hopf Bifurcation Theorem

The Hopf bifurcation theorem provides an effective criterion for finding periodic solutions for ordinary differential equations. Although various proofs of this classical theorem are known, there seems to be no easy way to arrive at the goal. Among them, the idea of Ambrosetti and Prodi [1] is particularly noteworthy. They start by formulating the Hopf theorem in an abstract fashion and then try to deduce the classical result from it. Let F (ω, μ, ·) be a smooth function of a Banach space X into another one Y with a pair (ω, μ) of real parameters. In order to find some bifurcation point (ω∗ , μ∗ ) of the equation F (ω, μ, x) = 0, the derivative Dx F (ω∗ , μ∗ , 0) of F with respect to x ∈ X at (ω∗ , μ∗ , 0) plays a crucial role. As will be stated in Theorem 11.1 exactly, the condition that both the dimension of the kernel of Dx F (ω∗ , μ∗ , 0) and the codimension of the image of Dx F (ω∗ , μ∗ , 0) are 2, together with another condition, assures that there occurs a bifurcation phenomenon at (ω∗ , μ∗ ). Based upon this result, Ambrosetti and Prodi successfully paved the way to deducing the classical Hopf theorem and elucidate its mathematical structure.1 Let f (μ, x) be a given function of R × Rn into Rn . We consider the ordinary differential equation F (ω, μ, x(·)) ≡ ω

dx(·) − f (μ, x(·)) = 0 dt

(11.1)

with a pair (ω, μ) of parameters. Here Ambrosetti and Prodi adopt some suitable function space Cr (resp. Cr−1 ) as X (resp. Y). In order to apply the abstract theorem mentioned above to this concrete problem, we have to calculate the dimension of the kernel of the operator

1 See

Hopf [8] for the original formulation. Crandall and Rabinowitz [4] is extremely suggestive.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8_11

301

302

11 Hopf Bifurcation Theorem

Dx F (ω∗ , μ∗ , 0) : x → ω∗

dx − Dx f (μ∗ , 0)x dt

(11.2)

of X into Y and the codimension of its image. Ambrosetti and Prodi’s idea to overcome this problem is very simple in a sense. They first expand the function x(t) in the uniformly convergent Fourier series x(t) =

∞

uk eikt .

(11.3)

k=−∞

If the derivative dx/dt = x(·) ˙ can be expanded in the form x(t) ˙ =

∞

ikuk eikt ,

(11.4)

k=−∞

we obtain Dx F (ω∗ , μ∗ , 0)x =

∞

[ikω∗ I − Dx f (μ∗ , 0)]uk eikt

(11.5)

k=−∞

by substituting (11.3) and (11.4) into (11.2), where I is the (n × n)-identity matrix. Hence the kernel of Dx F (ω∗ , μ∗ , 0) consists of all the x(·) such that [ikω∗ I − Dx f (μ∗ , 0)]uk = 0 for all k ∈ Z (the set of all the integers). And when we wish to determine the image of Dx F (ω∗ , μ∗ , 0), we have only to examine the equation [ikω∗ I − Dx f (μ∗ , 0)]uk = vk

for all

k ∈ Z,

(11.6)

where the vk ’s are the Fourier coefficients of y ∈ Y. The image of Dx F (ω∗ , μ∗ , 0) is the set of all the elements y of Y, for which the equation (11.6) is solvable. The main purpose of the present chapter is to establish the Hopf theorem in the framework of some Sobolev space instead of Cr . This approach seems to enable us to simplify the technical details in the course of the proof to some extent. The basic result (cf. p. 37) due to Carleson [2] and Hunt [9] plays a crucial role in our theory. Incidentally, we have to note that we require the condition that x(·) is of the class Cr , r 3 and y(·) is of the class Cr−1 in order to justify the expression (11.4) if we remain in the space Cr . As will be shown later, we have, for any ε > 0, that ∞ ∞ ikt iku e

kuk

k k=−∞

k=−∞

|k|
kuk + ε

1

|k|N

|k|r−1

11.1 Ljapunov–Schmidt Reduction Method

303

for sufficiently large N . Hence if we assume r 3, the left-hand side is uniformly convergent and the expression (11.4) is valid. Thus although Ambrosetti and Prodi assume only r = 1, we have to impose some more restrictions on the smoothness of x(·) and y(·), and some more subtle and careful account of the magnitudes of the Fourier coefficients seems to be required. The other object of this chapter is to fortify these analytical details and to complete the Ambrosetti and Prodi theory from the standpoint of the classical Fourier analysis. Furthermore, we apply this theory to laying a solid mathematical foundation for N. Kaldor’s theory of business fluctuations.2

11.1 Ljapunov–Schmidt Reduction Method Let X and Y be Banach spaces. Consider an equation f (λ, x) = 0 defined by a function f ∈ Cp (R × X, Y) (p 1). Although this equation is defined in the framework of infinite dimensional spaces in general, it can be solved by reducing it to the case of finite variables and finite equations if certain conditions are satisfied. This is the so-called Ljapunov–Schmidt reduction method. Suppose that a function f ∈ Cp (R × X, Y) (p 1) satisfies the following four conditions. We use several notations: T = Dx f (λ∗ , 0),

V = KerT ,

R = T (X)

where λ∗ ∈ R. (i) f (λ∗ , 0) = 0. (ii) T is not invertible. (iii) V has a topological complement in X: X = V ⊕ W (topological direct sum). (iv) R is closed, and has a topological complement Z: Y = Z ⊕ R (topological direct sum). Remark 11.1 If T is Fredholm, (iii) and (iv) are satisfied. (cf. Theorem 10.4 on p. 282, Theorem 10.7 on p. 291.) 2 This

chapter is based upon Ambrosetti and Prodi [1] and Maruyama [13, 14]. Masuda [16] Part I, Chap. 6 is quite helpful as a good exposition of bifurcation theory in general.

304

11 Hopf Bifurcation Theorem

Let P : Y → Z and Q : Y → R be the projections of Y into Z and R, respectively. If we express each x ∈ X as x = v + w;

v ∈ V, w ∈ W,

f (λ, x) = 0 is equivalent to Pf (λ, v + w) = 0,

(11.7a)

Qf (λ, v + w) = 0.

(11.7b)

Defining ϕ(λ, x) = f (λ, x) − T x, we obtain f (λ, x) = T w + ϕ(λ, v + w), since T v = 0, where x = v + w (v ∈ V, w ∈ W).3 The equality (11.7b) can be rewritten as Qf (λ, v + w) = QT w + Qϕ(λ, v + w) = T w + Qϕ(λ, v + w) = 0,

(11.8)

since T w ∈ R. Furthermore, if we define Φ(λ, v, w) = T w + Qϕ(λ, v + w), then Φ ∈ Cp (R × V × W, R) and Dw Φ(λ∗ , 0, 0) : W → R is given by Dw Φ(λ∗ , 0, 0) : w → T w + QDx ϕ(λ∗ , 0)w. Since ϕ(λ, x) = f (λ, x) − T x by definition, it follows that Dx ϕ(λ∗ , 0) = Dx f (λ∗ , 0) − T = 0. Hence Dw Φ(λ∗ , 0, 0) = T |W . We should note that T |W : W → R is an isomorphism (cf. Lemma 10.5, p 293).

3 In the case of f (λ∗ , 0), in particular, f (λ∗ , x)

= f (λ∗ , x) − f (λ∗ , 0) = Dx f (λ∗ , 0)x + ϕ(λ∗ , x). Thus, ϕ(λ∗ , x) is a residue term of the linear approximation of f (λ∗ , x).

11.2 Abstract Hopf Bifurcation Theorem

305

We now apply the implicit function theorem4 to the equation (11.7b) ⇔ (11.8): Φ(λ, v, w) = T w + Qϕ(λ, v + w) = 0. Of course, Φ(λ∗ , 0, 0) = 0. There exist a neighborhood Γλ∗ of λ∗ , a neighborhood Γ0,V of 0 ∈ V, a neighborhood Γ0,W of 0 ∈ W and a unique function γ ∈ Cp (Γλ∗ × Γ0,V , Γ0,W ) such that: a. Φ(λ, v, γ (λ, v)) = 0 b. γ (λ∗ , 0) = 0.

for all

(λ, v) ∈ Γλ∗ × Γ0,V ,

Remark 11.2 1◦ If f (λ, 0) = 0 for all λ ∈ R, γ (λ, 0) = 0

λ ∈ Γλ∗ .

for all

2◦ Dv γ (λ∗ , 0) = 0. Substituting w = γ (λ, v) to (11.7a), we obtain P (f (λ, v + γ (λ, v))) = 0 for all

(λ, v) ∈ Γλ∗ × Γ0,V .

(11.9)

This is called the bifurcation equation. Combining with w = γ (λ, v), it is locally (that is, in Γλ∗ × Γ0,V × Γ0,W ) equivalent to f (λ, x) = 0. When Dx f (λ∗ , 0) = T is Fredholm in particular, the number of variables and the number of equations are finite. So the problem is reduced to the finite dimensional case.

11.2 Abstract Hopf Bifurcation Theorem Let X and Y be a pair of real Banach spaces. F (ω, μ, x) is assumed to be a function of the class C2 (R2 × X, Y) which satisfies F (ω, μ, 0) = 0

for all

(ω, μ) ∈ R2 .

A point (ω∗ , μ∗ ) ∈ R2 is called a bifurcation point of F if (ω∗ , μ∗ , 0) is in the closure of the set S = {(ω, μ, x) ∈ R2 × X | x 0, F (ω, μ, x) = 0}.

4 See Maruyama [12], pp. 281–283 for the implicit function theorem in infinite dimensional spaces.

306

11 Hopf Bifurcation Theorem

We shall use several notations for the sake of simplicity: T = Dx F (ω∗ , μ∗ , 0), V = Ker T ,

R = T (X),

2 M = Dx,μ F (ω∗ , μ∗ , 0), 2 N = Dx,ω F (ω∗ , μ∗ , 0).

T is the derivative of F with respect to x at (ω∗ , μ∗ , 0). It is a bounded linear operator of X into Y. V and R are the kernel and the image of T , respectively. M (resp. N ) is the second derivative of F with respect to (x, μ) (resp. (x, ω)) at 2 F is the bounded linear operator of R into L(X, Y) (the (ω∗ , μ∗ , 0). Since Dx,μ Banach space of all the bounded linear operators of X into Y), it can be identified 2 F. with an element of L(X, Y). The same is true for Dx,ω The following theorem is an abstract version of the Hopf bifurcation theorem due to Ambrosetti and Prodi [1] (pp. 136–139). Theorem 11.1 Let X and Y be real Banach spaces. Assume that F (ω, μ, x) is a function of the class C2 (R2 × X, Y) which satisfies the following two conditions: 1. dim V = 2. R is closed and codimR = 2. We represent X and Y in the forms of topological direct sums: X = V ⊕ W,

Y = Z ⊕ R.

We denote by P the projection of Y into Z, and by Q the projection of Y into R. 2. There exists some point v∗ ∈ V such that P Mv ∗ and P Nv ∗ are linearly independent.5 Then (ω∗ , μ∗ ) is a bifurcation point of F . The condition that dimV = 2 and codimR = 2 implies that T : X → Y is a Fredholm operator with index zero. The proof of this theorem is based upon the Ljapunov–Schmidt reduction method, which is explained in the preceding section. Proof Expressing x =v+w ;

v ∈ V,

w ∈ W,

we can rewrite F (ω, μ, x) in the form F (ω, μ, x) = T x + ϕ(ω, μ, x) = T w + ϕ(ω, μ, v + w). 5 Let

(11.10)

L(X, Y) be the space of bounded linear operators of X into Y. M and N are bounded linear operators of R into L(X, Y). Hence each of M and N can be regarded as a point of L(X, Y). Mv ∗ and N v ∗ are points of Y. Thus we can let P act on them.

11.2 Abstract Hopf Bifurcation Theorem

307

This is the definition of the function ϕ : R2 × V × W → Y. QF (ω, μ, x) = 0 can also be rewritten as QF (ω, μ, v + w) = QT w + Qϕ(ω, μ, v + w) = T w + Qϕ(ω, μ, v + w) = 0. If we define Φ(ω, μ, v, w) = T w + Qϕ(ω, μ, v + w), then Φ ∈ C2 (R2 × V × W, Y) and Dw Φ(ω∗ , μ∗ , 0, 0)w = T w + QDx ϕ(ω∗ , μ∗ , 0)w.

(11.11)

By the definition (11.10) of ϕ, Dx ϕ(ω∗ , μ∗ , 0) = Dx F (ω∗ , μ∗ , 0) − T = 0.

(11.12)

The equations (11.11) and (11.12) imply Dw Φ(ω∗ , μ∗ , 0, 0) = T |W , which gives an isomorphism between W and R.6 Being based upon the above observation, we apply the implicit function theorem to the equation Φ = 0. Then there exist a neighborhood G of (ω∗ , μ∗ , 0) and a function γ ∈ C2 (G, W) such that Φ(ω, μ, v, γ (ω, μ, v)) = 0

for all

(ω, μ, v) ∈ G,

γ (ω∗ , μ∗ , 0) = 0.

(11.13)

We also obtain Dν γ (ω∗ , μ∗ , 0) = −[Dw Φ(ω∗ , μ∗ , 0, 0)]−1 ◦ Dν Φ(ω∗ , μ∗ , 0, 0) = 0 (cf. Remark 11.2, 2◦ ). Thus the bifurcation equation (corresponding to (11.9)) P (F (ω, μ, v + γ (ω, μ, v))) = 0 holds good. 6 cf.

Lemma 10.5 on p. 293.

(11.14)

308

11 Hopf Bifurcation Theorem

Define a function h : R3 → Z by h(ω, μ, s) = P F (ω, μ, sv ∗ + γ (ω, μ, sv ∗ )). h is of class C2 and satisfies h(ω, μ, 0) = 0. The partial derivative of h with respect to s is given by Ds h(ω, μ, s) = P Dx F (ω, μ, sv ∗ + γ (ω, μ, sv ∗ )) · [v ∗ + Dv γ (ω, μ, sv ∗ )v ∗ ]. (11.15) We denote it by χ (ω, μ, s); i.e. χ (ω, μ, s) = Ds h(ω, μ, s). χ is of class C1 and satisfies χ (ω∗ , μ∗ , 0) = Ds h(ω∗ , μ∗ , 0) = 0 by (11.13) and (11.15). We now try to find derivatives of χ . First of all, we have 2 F (ω∗ , μ∗ , 0)[v ∗ + Dv γ (ω∗ , μ∗ , 0)v ∗ ] Dμ χ (ω∗ , μ∗ , 0) = P Dx,μ 2 + P Dx F (ω∗ , μ∗ , 0)[Dv,μ γ (ω∗ , μ∗ , 0)v ∗ ] 789: =0

=

2 P Dx,μ F (ω∗ , μ∗ , 0)v ∗

= P Mv ∗ .

Similarly, Dω χ (ω∗ , μ∗ , 0) = P Nv ∗ . P Mv ∗ , P Nv ∗ ∈ Z and dim Z = 2. So we may regard P Mv ∗ and P Nv ∗ as two elements of R2 . By assumption, det D(ω,μ) χ (ω∗ , μ∗ , 0) 0. Applying the implicit function theorem to the equation χ (ω, μ, s) = 0, we can solve this equation with respect to (ω, μ) locally in the neighborhood of (ω∗ , μ∗ , 0). That is, there exist functions, ω(s) and μ(s), which are defined in a neighborhood of s = 0, and satisfy χ (ω(s), μ(s), s) = 0, ω(0) = ω∗ ,

μ(0) = μ∗ .

11.3 Classical Hopf Bifurcation for Ordinary Differential Equations

309

By this relation, the bifurcation equation (11.14) becomes P F (ω(s), μ(s), sv ∗ + γ (ω(s), μ(s), sv ∗ )) = 0. We observe (ω(s), μ(s)) → (ω∗ , μ∗ )

as

s → 0.

If we define xs = sv ∗ + γ (ω(s), μ(s), sv ∗ ), then xs 0 (s 0) and xs → 0 as s → 0. Thus we conclude that (ω∗ , μ∗ ) is a bifurcation point of F .

11.3 Classical Hopf Bifurcation for Ordinary Differential Equations We now turn to the classical bifurcation phenomena of periodic solutions for an ordinary differential equation. Let f (μ, x) be a function of the class C2 (R×Rn , Rn ), and consider the differential equation dx = f (μ, x). ds

(11.16)

Changing the time variable s by the relation t = ωs

(ω 0),

we rewrite the equation (11.16) as 1 dx = f (μ, x), dt ω i.e. ω

dx = f (μ, x). dt

(11.16 )

This is an ordinary differential equation with two real parameters, ω and μ. For the sake of simplicity, we assume that the function f (μ, x) satisfies f (μ, 0) = 0 for all

μ ∈ R.

310

11 Hopf Bifurcation Theorem

n We denote by W1,2 2π (R, R ) the set of all the 2π -periodic and absolutely continuous functions x : R → Rn such that x| ˙ [0,2π ] ∈ L2 ([0, 2π ], Rn ), where x| ˙ [0,2π ] denotes the restriction of x˙ = dx/dt to the interval [0, 2π ]; i.e. n W1,2 2π = {x : R → R | x is 2π -periodic, absolutely continuous and

x(·)| ˙ [0,2π ] ∈ L2 ([0, 2π ], Rn )}. W1,2 2π is a Banach space under the norm

x W1,2 = 2π

2π

x(t) dt 2

1/2

+

0

2π

2

x(t)

˙ dt

1/2

.

0

We also denote by L22π (R, Rn ) the set of all the 2π -periodic measurable functions y : R → Rn such that y|[0,2π ] ∈ L2 ([0, 2π ], Rn ); i.e. L22π = {y : R → Rn | y is 2π -periodic and y|[0,2π ] ∈ L2 ([0, 2π ], Rn )}. L22π is a Banach space under the norm

y L2 = 2π

2π

y(t) 2 dt

1/2

.

0

2 In this section, we adopt W1,2 2π as X and L2π as Y, respectively; i.e.

X = W1,2 2π ,

Y = L22π .

Define the function F : R2 × X → Y by F (ω, μ, x) = ω

dx − f (μ, x). dt

Then we can prove that F is a function of the class C2 (R2 × X, Y) provided that the following assumptions are satisfied. Assumption 1 (i) There exists some constants α and β ∈ R such that

f (μ, x) α + β x

for all

x ∈ Rn .

(ii) There exists some constant ρ such that

Dx f (μ, x) ,

D 2 f (μ, x) ρ

for all

x ∈ Rn .

11.3 Classical Hopf Bifurcation for Ordinary Differential Equations

311

The proof of the fact F (·) ∈ C2 (R2 × X, Y) will be given in the next section. It is obvious that F (ω, μ, 0) = 0

for all

(ω, μ) ∈ R2 .

(ω∗ , μ∗ ) ∈ R2 is called a bifurcation point of F if there exists a sequence (ωn , μn , xn ) in R2 × X such that ⎧ ⎪ ⎪ ⎨F (ωn , μn , xn ) = 0, (ωn , μn ) → (ω∗ , μ∗ ) as n → ∞, and ⎪ ⎪ ⎩x 0, x → 0 as n → ∞. n

n

Each xn is a nontrivial (not identically zero) periodic solution with period 2π of the equation ωn

dx − f (μn , x) = 0. dt

Hence, changing the time-variable to s again, we obtain a periodic solution Xn (s) = xn (ωn s) with period τn = 2π/ωn for the original equation (11.16). Consequently, we have that τn → τ ∗ =

2π , ω∗

Xn W1,2

2π/ωn

→ 0 as

n → ∞,

provided that ω∗ 0. The target of our investigations is to find a bifurcation point of F according to the principle of Theorem 11.1. We have to note that the derivative Dx F (ω, μ, 0) : x → ωx˙ − Dx f (μ, 0)x

(11.17)

is to play the most important role in the course of our discussions. (x˙ means dx/dt.) Of course, Dx F (ω, μ, 0) is a bounded linear operator of X into Y. If we write Aμ = Dx f (μ, 0), Aμ is an (n × n)-matrix and (11.17) can be rewritten in the form Dx F (ω, μ, 0)x = ωx˙ − Aμ x. Here we need a couple of assumptions to be imposed upon the matrix Aμ at some (ω∗ , μ∗ ).

312

11 Hopf Bifurcation Theorem

Assumption 2 Aμ∗ is regular, and ±iω∗ (ω∗ > 0) are simple eigenvalues of Aμ∗ . Assumption 3 None of ±ikω∗ (k ±1) is an eigenvalue of Aμ∗ .

11.4 Smoothness of F In this section, we examine the differentiability of the so-called Nemyckii operator.7 Let us define the operator Φ on W1,2 2π by x ∈ W1,2 2π

Φ(x(·)) = f (μ, x(·)),

for any fixed μ. If Assumption 1(i) is satisfied, then Φ(x(·)) is in L22π . 2 Lemma 11.1 Under Assumption 1(i), the operator Φ : W1,2 2π → L2π is continuous.

Proof Assume that xn (·) → x0 (·) (as n → ∞) in W1,2 2π . It obviously implies that 2 xn (·) → x0 (·) (as n → ∞) in L2π . Then there exists some subsequence {xn (·)} and a function ϕ(·) ∈ L2 ([0, 2π ], R) such that xn (t) → x0 (t) a.e. on

xn (t) ϕ(t)

[0, 2π ], and

a.e. on

[0, 2π ].

Since f is C2 by assumption, we must have Φ(xn (·)) = f (μ, xn (·)) → Φ(x0 (·)) = f (μ, x0 (·)) a.e.

(11.18)

Taking account of Assumption 1(i), we obtain

f (μ, xn (·)) α + β xn (·) α + βϕ(t)

a.e.

Now (11.18) and (11.19) imply that

Φ(xn (·)) − Φ(x0 (·)) 2L2 2π

=

2π

f (μ, xn (t)) − f (μ, x0 (t)) 2 dt

0

→ 0 as

n → ∞

by the dominated convergence theorem.

7 Related

topics are discussed in Ambrosetti-Prodi [1] pp. 17–21.

(11.19)

11.4 Smoothness of F

313

If follows that Φ(xn (·)) → Φ(x0 (·)) in L22π . (If not, a contradiction would occur to the above argument.) 2 Lemma 11.2 If Assumption 1 is satisfied, then the operator Φ : W1,2 2π → L2π is twice continuously differentiable.

Proof Let us begin by evaluating the first variation of Φ. For any x, z ∈ W1,2 2π , we have 1 [Φ(x(·) + λz(·)) − Φ(x(·))](t) λ 1 = [f (μ, x(t) + λz(t)) − f (μ, x(t))] λ 1 = [Dx f (μ, x(t))λz(t) + o(λz(t))] λ → Dx f (μ, x(t))z(t)

λ → 0,

(11.20)

x(·), z(·) ∈ W1,2 2π

(11.21)

as

where we must have Dx f (μ, x(·))z(·) ∈ L22π

for any

by Assumption 1 and z(·) ∈ W1,2 2π . We can also confirm that, for any ε > 0, 2 1 [Φ(x(·) + λz(·)) − Φ(x(·))](t) − Dx f (μ, x(t))z(t) λ 2 o(λz(t)) D − D = f (μ, x(t))z(t) + f (μ, x(t))z(t) x x λ ε2 · z(t) 2

for sufficiently small λ.

Hence we obtain, by (11.20) and the dominated convergence theorem, that 1 [Φ(x(·) + λz(·)) − Φ(x(·))] → Dx f (μ, x(·))z(·) λ in L22π as λ → 0. Therefore Φ has the first variation of the form (11.21). It is easy to check that the mapping z(·) → Dx f (μ, x(·))z(·) 2 is a bounded linear operator of W1,2 2π into L2π . Hence Φ is Gâteaux-differentiable.

314

11 Hopf Bifurcation Theorem

Furthermore, Φ turns out to be Fréchet-differentiable8 since x(·) → Dx f (μ, x(·)) 1,2 2 is a continuous mapping of W1,2 2π into L(W2π , L2π ) (the space of bounded linear 2 9 operators of W1,2 2π into L2π ). Finally, we can prove that Φ is twice continuously differentiable by a similar method as above. So we omit the details.

Writing T = Dx F (ω∗ , μ∗ , 0), we have T x = 0 if and only if ω∗ x˙ − Aμ∗ x = 0. In order to apply Theorem 11.1 to our classical problem in Sect. 11.3, we have to start by confirming that (a) the dimension of the kernel of T is 2, and (b) the codimension of the image of T is also 2. For brevity, we have to show that dim Ker T = 2, codim T (X) = 2.

and

Henceforth, we denote Ker T by V and T (X) by R. Remark 11.3 Actually, we can confirm a priori that T is a Fredholm operator with index zero when ω∗ = 0. Hence we do not have to check both dimV = 2 and codimR = 2, because either one follows from the other automatically. 2 The operator T : x → ω∗ x˙ − Aμ∗ x (W1,2 2π → L2π ) can be rewritten as T x = ω∗ x˙ − Aμ∗ x = ω∗ x˙ + x − x − Aμ∗ x = ω∗ x˙ + x − (I + Aμ∗ )x. Since the mapping x → ω∗ x˙ + x (ω∗ 0) is an isomorphism between W1,2 2π and L22π , it is a Fredholm operator with index zero. We also know that the inclusion 2 10 mapping of W1,2 2π into L2π is a compact operator. Consequently, the mapping x → 1,2 (I +Aμ∗ )x is also a compact operator of W2π into L22π . (Note that Aμ∗ is a bounded operator of L22π into itself.) Thus we confirm that the operator T can be expressed

8 Let V and W be a couple of Banach spaces. Assume that a function ϕ of an open subset U of V into W is Gâteaux-differentiable in a neighborhood V of x ∈ U . We denote by δϕ(v) the Gâteaux-derivative of ϕ at v. If the function v → δϕ(v)(V → L(V, W)) is continuous, then ϕ is Fréchet-differentiable. cf. Maruyama [12] pp. 236–237. 9 The continuity of the mapping can be proved in the same manner as in the proof of Lemma 11.1. Assumption 1(i) is used again for the dominated convergence argument. 10 This is a special case of the Rellich–Kondrachov Compactness Theorem. Evans [5] pp. 272–274.

11.5 dim V = 2

315

as a sum of a Fredholm operator with index zero and a compact operator. It follows from Theorem 10.12 (p. 299) that T is also a Fredholm operator with index zero. However, we prefer an elementary way to prove both dimV = 2 and codimR = 2 without having recourse to the Fredholm operator theory discussed above. I appreciate Professor S. Kusuoka’s suggestion on this point.

11.5 dim V = 2 Expanding x ∈ X in the Fourier series, we obtain x(t) =

∞

uk eikt ,

uk ∈ Cn ,

(11.22)

k=−∞

where uk is a vector, the j -th coordinate of which is given by 1 2π

2π

xj (t)e−ikt dt,

j = 1, 2, · · · , n.

0

We must have the relation u−k = u¯ k

(conjugate),

k ∈ Z,

since x(t) is a real vector. Furthermore, we have to keep in mind that the series (11.22) is uniformly convergent by√ Theorem 2.5 (p. 37).11 The k-th Fourier coefficient (times 1/ 2π , rigorously speaking) of x(·)is ˙ given 2 . Hence we obtain , it is clear that x(·) ˙ ∈ L by ikuk . Since x(·) ∈ W1,2 2π 2π ∞

x(t) ˙ =

ikuk eikt

a.e.,

(11.23)

k=−∞

that is, the Fourier series of x(·) ˙ given by the right-hand side of (11.23) converges a.e. and equal to x(·). ˙ This result is justified by Carleson’s theorem (p. 37). It follows that ω∗ x˙ − Aμ∗ x =

∞

[ikω∗ I − Aμ∗ ]uk eikt = 0

a.e.

k=−∞

a 2π -periodic function ϕ : R → R is absolutely continuous and its derivative ϕ belongs to L2 ([0, 2π ], R), then the Fourier series of ϕ uniformly converges to ϕ on R. The k-th Fourier coefficient of ϕ is given by ik ϕ(k), ˆ where ϕ(k) ˆ is the k-th Fourier coefficient of ϕ. 11 If

316

11 Hopf Bifurcation Theorem

By the uniqueness of √ the Fourier coefficients (with respect to the complete orthonormal system (1/ 2π )eikt ; k = 0, ±1, ±2, · · · ), we must have [ikω∗ I − Aμ∗ ]uk = 0 for all

k ∈ Z.

(11.24)

It is enough to find all x ∈ X, the Fourier coefficients of which satisfy (11.24), in order to determine V. By the regularity of Aμ∗ in Assumptions 2 and 3, ikω∗ I − Aμ∗ ,

k ±1

are all invertible. Therefore we have simply uk = 0

for k ±1.

Thus all the coefficients in (11.22) besides k = ±1 must be zero. Consequently, all we have to do is to find functions whose Fourier coefficients u±1 satisfy [±iω∗ I − Aμ∗ ]u±1 = 0. Cn

(11.25)

Since ±iω∗ are simple eigenvalues of Aμ∗ by Assumption 2, there exists ξ ∈ (ξ 0) such that12 Ker{iω∗ I − Aμ∗ } = span{ξ }.

(11.26)

On the other hand, Ker{−iω∗ I − Aμ∗ } = span{ξ¯ }. The Fourier coefficients u±1 which satisfy (11.25) can be expressed as u+1 = aξ,

u−1 = bξ¯

(a, b ∈ C).

Then any real solution x(t) of (11.23) is of the form: x(t) = aξ eit + a¯ ξ¯ e−it . If we set a = α + iβ, ξ = γ + iδ (α, β ∈ R; γ , δ ∈ Rn ), it follows from simple calculations that x(t) = 2Re[aξ eit ] = 2[α(γ cos t − δ sin t) − β(γ sin t + δ cos t)]. 12 span{ξ }

denotes the subspace of Cn spanned by ξ .

11.6 codim R = 2

317

Writing p(t) = γ cos t − δ sin t and q(t) = γ sin t + δ cos t, we have x(t) = 2αp(t) − 2βq(t), where α and β are any real numbers. It can easily be checked that p(t) and q(t) are linearly independent.13 Thus we have shown that p(·) and q(·) form a base of V. And so dimV = 2.

11.6 codim R = 2 We now proceed to examining the dimension of the quotient space Y/R of Y modulo R. Writing T x = y,

x ∈ X,

y ∈ Y,

we again expand x and √ y in the Fourier series. Let uk (resp. vk ) be the Fourier coefficients (times 1/ 2π ) of x (resp. y). Then we must have ∞

[ikω∗ I − Aμ∗ ]uk eikt =

k=−∞

∞

vk eikt .

(11.27)

k=−∞

2 Since x ∈ W1,2 2π , y ∈ L2π and both of them are 2π -periodic, the right-hand side of (11.27) converges a.e. again by Carleson’s theorem (p. 37). By the uniqueness of the Fourier coefficients, we must have

[ikω∗ I − Aμ∗ ]uk = vk

for all

k ∈ Z.

(11.28)

13 Set μp(t)+νq(t) = μ(γ cos t −δ sin t)+ν(γ sin t +δ cos t) = (μγ +νδ) cos t +(νγ −μδ) sin t = 0. Then we have + μγ + νδ = 0,

νγ − μδ = 0. It follows that + μνγ + ν 2 δ = 0, μνγ − μ2 δ = 0. Hence (ν 2 + μ2 )δ = 0. If μ 0 or ν 0, δ must be zero. And so ξ = γ , that is ξ = ξ¯ (real vector). Thus we get a contradiction.

318

11 Hopf Bifurcation Theorem

By the regularity of Aμ∗ in Assumptions 2 and 3, uk (k ±1) in (11.28) can be solved uniquely in the form: u0 = −A−1 μ∗ v0 , uk = [ikω∗ I − Aμ∗ ]−1 vk ,

k 0, ±1.

(11.29)

Since the norm of (1/ ikω∗ )Aμ∗ is less than 1 for sufficiently large |k|’s (say |k| k0 ), we have ∗

[ikω I − Aμ∗ ]

−1

0 1−1 1 1 I− = Aμ∗ ikω∗ ikω∗ 0 1 1 1 1 2 ∗ I+ = Aμ + A ∗ + ··· ikω∗ ikω∗ (ikω∗ )2 μ 1 1 for |k| k0 . I + O = ikω∗ k2

(11.30)

It follows, from (11.29) and (11.30), that uk = [ikω∗ I − Aμ∗ ]−1 vk =

1 1 vk v + O k ikω∗ k2

for

|k| k0 .

Consequently, k0,±1

uk eikt =

[ikω∗ I − Aμ∗ ]−1 vk eikt

|k|
+

|k|k0

1 vk ikt e + O 2 vk eikt . ikω∗ k

(11.31)

|k|k0

We claim that the second term of the right-hand side of (11.31) is of the class In order to prove it, we define the function θ (t) by

W1,2 2π .

θ (t) =

vk eikt .

(11.32)

|k|k0

Then θ (t) is of the class L22π . Let Θ(t) be the indefinite integral of θ (t), that is

t

Θ(t) =

θ (τ )dτ. 0

(11.33)

11.6 codim R = 2

319

˙ Clearly, Θ(t) is of the class W1,2 2π and Θ(t) = θ (t) a.e. Expanding θ (t) in the Fourier series, we confirm that the coefficient θˆ (0) corresponding to k = 0 is zero. Hence we obtain by Remark 3.3 (p. 57)14 that θˆ (k) for all k 0, ik vk 1 Θ(t) = eikt ω∗ ikω∗ ˆ Θ(k) =

i.e.

|k|k0

= the second term of (11.31). This is of the class W1,2 2π by (11.33). We can also prove that the third term of the right-hand side of (11.31) is of the class W1,2 2π by a similar argument as above. In this case we define the function θ (t) ∈ 2 L2π by θ (t) =

ikO

|k|k0

1 k2

vk · eikt

instead of (11.32). Θ(t) is the indefinite integral of θ (t) as in (11.33). The first term of the right-hand side of (11.31) is obviously of the class W1,2 2π . This finishes the proof of the claim that the right-hand side of (11.31) is of the class W1,2 2π . Thus denoting the Fourier coefficients of any y ∈ Y by vk ’s, the function

uk eikt = −A−1 μ∗ v0 +

k±1

[ikω∗ I − Aμ∗ ]−1 vk eikt

k0,±1

is of the class W1,2 2π and uk ’s defined here satisfy the relation (11.28).

f : R → R (we may replace R by Rn ) be a 2π -periodic function which is integrable on [−π.π ]. Furthermore, we assume fˆ(0) = 0 (fˆ(0) is the Fourier coefficient corresponding to k = 0). If we define

t F (t) = f (0) + f (τ )dτ, 14 Let

0

F is a 2π -periodic continuous function and 1 Fˆ (k) = fˆ(k), ik

k 0.

320

11 Hopf Bifurcation Theorem

We shall now go over to k = ±1. The equation [±iω∗ I − Aμ∗ ]u±1 = v±1

(11.34)

does or does not have a solution. Since codim[±iω∗ I − Aμ∗ ](Cn ) = 1 by Assumption 2, there must exist some ϕ ∈ Cn (ϕ 0) such that Cn /[iω∗ I − Aμ∗ ](Cn ) = span{ϕ + [iω∗ I − Aμ∗ ](Cn )}. On the other hand, we also have Cn /[−iω∗ I − Aμ∗ ](Cn ) = span{ϕ¯ + [−iω∗ I − Aμ∗ ](Cn )}. An element of Y is not contained in R if and only if its Fourier coefficients v±1 (corresponding to k = ±1) do not admit the existence of u±1 which satisfy (11.34). Such vector v1 (resp. v−1 ) is contained in a class of span {ϕ + [iω∗ I − Aμ∗ ](Cn )} (resp. ϕ¯ + [−iω∗ I − Aμ∗ ](Cn )). Therefore any element of Y/R can be expressed as aϕeit + bϕe ¯ −it + R;

a, b ∈ C.

In order to find a real solution, we should set b = a. ¯ If we write ϕ = γ + iδ, a = α + iβ (α, β ∈ R; γ , δ ∈ Rn ), we obtain (in the same manner as the calculations on pp. 316–317) ¯ −it = 2[α(γ cos t − δ sin t) − β(γ sin t + δ cos t)] aϕeit + bϕe = 2αp(t) − 2βq(t), where p(t) = γ cos t − δ sin t and q(t) = γ sin t + δ cos t. Thus we have confirmed that any element of Y/R can be expressed as a linear combination of the two linearly independent elements, p(t) + R and q(t) + R. Hence codim R = 2. Remark 11.4 We can specify ϕ = ξ (see (11.26) for the definition of ξ ). Therefore we will henceforth choose ξ as ϕ. 2 Remark 11.5 We can represent X = W1,2 2π and Y = L2π in the form of topological direct sums:

X = V ⊕ W,

Y = Z ⊕ R,

by choosing suitable subspaces W ⊂ X and Z ⊂ Y, respectively. Of course, V = KerT and R = T (X) as stated on p. 303. We also denote by P the projection of Y into Z corresponding to the direct product defined above.

11.7 Linear Independence of P Mv ∗ and P N v ∗ (1)

321

11.7 Linear Independence of P Mv ∗ and P Nv ∗ (1) According to Assumption 2, ±iω∗ are simple eigenvalues of Aμ∗ . Hence Cn can be expressed as a direct sum as (cf. Fig. 11.1) Cn = Ker[±iω∗ I − Aμ∗ ] ⊕ [±iω∗ I − Aμ∗ ](Cn ).

(11.35)

We shall now concentrate on the case +iω∗ . (The case −iω∗ can be discussed similarly.) Let η ∈ Cn (η 0) be any vector which is orthogonal to [iω∗ I − Aμ∗ ](Cn ). Define a function g : R × C × Cn → Cn × C by (λI − Aμ )(ξ + θ ) . g(μ, λ, θ ) = η, θ ) (·, · denotes the inner product, that is η, θ = nj=1 ηj θ¯j .) We should note again that the vector ξ is defined by (11.26) on p. 316. Then the function g is of the class C1 and satisfies g(μ∗ , iω∗ , 0) = 0. We are now going to solve the equation g(μ, λ, θ ) = 0 locally with respect to (λ, θ ) in terms of μ in some neighborhood of (μ∗ , iω∗ , 0). The derivative of g with respect to (λ, θ ) is given by D(λ,θ) g(μ∗ , iω∗ , 0)(λ, θ ) =

λξ + (iω∗ I − Aμ∗ )θ . η, θ

Here D(λ,θ) g(μ∗ , iω∗ , 0) ((n+1)×(n+1) matrix) is regular by (11.35).15 Applying the implicit function theorem, we obtain the following lemma. Lemma 11.3 There exist a couple of functions, λ(μ) and θ (μ) of the C1 -class which are defined in some neighborhood of μ∗ and satisfy

any (α0 , β0 ) ∈ Cn × C, there exist some λ0 ∈ C and γ0 ∈ (iω∗ I − Aμ∗ )(Cn ) such that α0 = λ0 ξ + γ0 . Such λ0 and γ0 are unique. Let (α0 , β0 ) = (0, 0). Then we must have λ0 = 0 and γ0 = 0. The equation ∗ (iω I − Aμ∗ )θ 0 = η, θ 0

15 For

has a unique solution θ = 0(∈ Cn ) because Ker[iω∗ I − Aμ∗ ] ∩ Kerη, · = Ker[iω∗ I − Aμ∗ ] ∩ [iω∗ I − Aμ∗ ](Cn ) = {0}. Thus we conclude that D(λ,θ ) g(μ∗ , iω∗ , 0) is injective.

322

11 Hopf Bifurcation Theorem

Fig. 11.1 Orthogonal decomposition of Cn

(λ(μ)I − Aμ )(ξ + θ (μ)) 0 = , η, θ (μ) 0

(11.36)

and λ(μ∗ ) = iω∗ ,

θ (μ∗ ) = 0.

(11.37)

Denoting ξ + θ (μ) in (11.36) by ξ(μ), we can rewrite the relations (11.36) and (11.37) as follows: Aμ ξ(μ) = λ(μ)ξ(μ),

(11.36 )

λ(μ∗ ) = iω∗ , ξ(μ∗ ) = ξ.

(11.37 )

11.8 Linear Independence of P Mv ∗ and P Nv ∗ (2) Since codim[iω∗ I − Aμ∗ ](Cn ) = 1 by Assumption 2, there exists a nonzero vector η ∈ Cn such that η, κ = 0

for all

κ ∈ [iω∗ I − Aμ∗ ](Cn )

as we have seen above. Taking account of the fact16 that ξ [iω∗ I − Aμ∗ ](Cn ), we can choose η so that η, ξ = 1. The function Π : κ → η, κξ, κ ∈ Cn 16 iω∗

is a simple eigenvalue, again by Assumption 2.

11.8 Linear Independence of P Mv ∗ and P N v ∗ (2)

323

Fig. 11.2 Spectral projection

is called the spectral projection17 associated with ξ . We can similarly define the spectral projection Π associated with ξ¯ by using η¯ instead of η. Writing Aμ∗ =

d Aμ μ=μ∗ dμ

for the sake of simplicity, we get the following result. Lemma 11.4 Π Aμ∗ ξ = λ (μ∗ )ξ, ΠAμ∗ ξ¯ = λ (μ∗ )ξ¯ .

17 Look

at Fig. 11.2. For the sake of an intuitive exposition, the vectors η, ξ and κ are treated as real vectors. By η, ξ = η · ξ cos θ = 1, it follows that η = 1/ ξ cos θ. Hence Π (κ) = ξ η, κ = ( κ cos ζ / ξ cos θ) · ξ.

Since κ cos ζ = 0A (the length of the segment) and ξ cos θ = 0B, Π (κ) =

0A · ξ. 0B

Here ζ is the angle between η and κ, and θ is the one between ξ and η. κ can be represented uniquely as κ = αη + βz for some α, β ∈ C and z ∈ [iω∗ I − Aμ∗ ](Cn ). On the other hand, η can be represented uniquely in the form η = aξ + bz for some a, b ∈ C and z ∈ [iω∗ I − Aμ∗ ](Cn ). Since η 2 = aξ + bz , η = aξ, η + bz , η = a, it follows that η = η 2 ξ + bz . Hence we have κ = αη + βz = α η 2 ξ + (αbz + βz). Furthermore, Π (κ) = η, κξ = η, α η 2 ξ +(αbz +βz)ξ = α η 2 ξ. (η, αbz +βz = 0 because αbz + βz ∈ [iω∗ I − Aμ∗ ](Cn ). ) Thus we obtain κ = Π (κ) + (αbz + βz). This is the direct sum of Cn corresponding to span{ξ } and [iω∗ I − Aμ∗ ](Cn ).

324

11 Hopf Bifurcation Theorem

Proof It is enough to prove only the first part. It is straightforward that Aμ ξ = Aμ (ξ − ξ(μ)) + Aμ ξ(μ) = Aμ (ξ − ξ(μ)) + λ(μ)ξ(μ). Define ξ =

d . ξ(μ) μ=μ∗ dμ

Then it follows that Aμ∗ ξ = Aμ∗ (ξ − ξ(μ∗ )) − Aμ∗ ξ + λ (μ∗ )ξ(μ∗ ) + λ(μ∗ )ξ = λ (μ∗ )ξ(μ∗ ) + (λ(μ∗ )I − Aμ∗ )ξ = λ (μ∗ )ξ + (iω∗ I − Aμ∗ )ξ . Taking account of the fact Π [(iω∗ I − Aμ∗ )ξ ] = 0, we must have Π Aμ∗ ξ = λ (μ∗ )ξ. We have finished the preparation for the spectral projection. We shall now go 2 F (ω∗ , μ∗ , 0) back to our task to evaluate P Nv ∗ and P Mv ∗ . Recall that M = Dx,μ 2 ∗ ∗ and N = Dx,ω F (ω , μ , 0). If we specify v ∗ ∈ V as v ∗ = ξ eit + ξ¯ e−it , it follows that Dx F (ω, μ, 0)v ∗ = ωv˙ ∗ − Aμ v ∗ = iω(ξ eit − ξ¯ e−it ) − Aμ (ξ eit + ξ¯ e−it ).

(11.38)

Evaluation of P Nv ∗ . We obtain, by (11.38), that 2 Dxω F (ω∗ , μ∗ , 0) v ∗ = iξ eit − i ξ¯ e−it . 789: N

In general, the projection P y of y ∈ Y into Z can be calculated as P y = Π (v1 )eit + Π (v−1 )e−it ,

(11.39)

11.8 Linear Independence of P Mv ∗ and P N v ∗ (2)

325

where v±1 are the Fourier coefficients of y corresponding to k = ±1.18 Hence, by (11.39), P N v ∗ is evaluated as ¯ ξ¯ ξ¯ e−it = iξ eit − i ξ¯ e−it . P N v ∗ = iη, ξ ξ eit − iη,

(11.40)

Evaluation of P Mv ∗ . Dividing λ(μ) (obtained by Lemma 11.3) into real and imaginary parts, we write λ(μ) = α(μ) + iβ(μ). We also write λ (μ) = α (μ) + iβ (μ). It follows from (11.38) that 2 F (ω, μ, 0) v ∗ = −Aμ∗ (ξ eit + ξ¯ e−it ). Dxμ 789: M

Therefore we have by Lemma 11.4 that P Mv ∗ = −Π Aμ∗ ξ eit − Π Aμ∗ ξ¯ e−it = −λ (μ∗ )ξ eit − λ (μ∗ )ξ¯ e−it = −α (μ∗ )(ξ eit + ξ¯ e−it ) − β (μ∗ )(ξ eit − ξ¯ e−it ).

(11.41)

Comparing (11.40) and (11.41), we get a simple fact that P Nv ∗ and P Mv ∗ are linearly independent if and only if α (μ∗ ) 0. Thus all the requirements in Theorem 11.1 are fulfilled if we make an additional assumption that α (μ∗ ) 0. Theorem 11.2 Let f (μ, x) : R × Rn → Rn be a function of the class C2 (R × Rn , Rn ) which satisfies f (μ, 0) = 0 for all μ ∈ R. Suppose that Assumptions 1–3 as well as the condition α (μ∗ ) 0 are satisfied. Then (ω∗ , μ∗ ) is a bifurcation point of F (ω, μ, x) = ωdx/dt − f (μ, x). Remark 11.6 (Due to S. Kusuoka) The additional condition α (μ∗ ) 0 can be expressed in an alternative equivalent form.

each of the Fourier coefficients of y by the direct sum corresponding to span{ξ } and [iω∗ I − Aμ∗ ](Cn ), and delete all the terms which do not contribute to the former.

18 Express

326

11 Hopf Bifurcation Theorem

We denote by V0 (resp. W0 ) the kernel (resp. the image) of ω∗ I + A2μ∗ ; i.e. 2

V0 = Ker{ω∗ I + A2μ∗ }, 2

W0 = [ω∗ I + A2μ∗ ](Rn ). 2

We have, of course, that Rn = V0 ⊕ W0 .

(11.42)

Let π0 : Rn → V0 be the projection of Rn into V0 corresponding to the above direct product (11.42). Then we can prove that α (μ∗ ) = 0

if and only if trAμ∗ π0 0,

where tr is the trace of a matrix. C C n Proof Let VC 0 = C ⊗ V0 and π0 : C → V0 be the complexifications of V0 and π0 , respectively. Recall the formula (11.26) in Sect. 11.5; i.e.

Ker{iω∗ I − Aμ∗ } = span{ξ }. We write ξ = a + ib;

a, b ∈ Rn .

It can easily be checked that Aμ∗ a = −ω∗ b

and

Aμ∗ b = ω∗ a.

Consequently, we also have that a and b are contained in V0 . They must be linearly independent. Hence there exist some a˜ and b˜ ∈ Rn which satisfy the following conditions: a, ˜ a = 1, ˜ a = 0, b,

a, ˜ b = 0, ˜ b = 1, and b,

˜ w = 0 a, ˜ w = b,

for all w ∈ W0 .

We then define the vector η by η=

1 ˜ (a˜ + i b). 2

This η clearly satisfies the condition in Sect. 11.7 (p. 321) that η 0 and η is orthogonal to [iω∗ I − Aμ∗ ](C∗ ).

11.8 Linear Independence of P Mv ∗ and P N v ∗ (2)

327

Since Π Aμ∗ ξ = λ (μ∗ )ξ by Lemma 11.4, it follows that λ (μ∗ ) ≡ η, Aμ∗ ξ =

1 ˜ Aμ∗ ξ . a˜ + i b, 2

Finally, we obtain the desired result by α (μ∗ ) = Reλ (μ∗ ) =

1 ˜ Aμ∗ b} = 1 trAμ∗ π0 . {a, ˜ Aμ∗ a + b, 2 2

This proves the desired result. Example 11.1 Consider the differential equation of the Van der Pol type x¨ − (μ − 3x 2 )x˙ + x = 0,

(11.43)

where μ is a real parameter.19 The equation (11.43) is equivalent to +

x˙ = y, y˙ = −x + (μ − 3x 2 )y.

(11.44)

Let u = (x, y). Define f (μ, u) = (y, −x + (μ − 3x 2 )y), and set ω∗ = 1. Then (11.44) is reduced to the form ωu˙ = f (μ, u). If we define F (ω, μ, u) = ωu˙ − f (μ, u), then (ω∗ , μ∗ ) = (1, 0) is a bifurcation point of F . Du f (μ, 0) = Aμ is given by

0 1 Aμ = . −1 μ Hence the eigenvalues λ(μ) of Aμ are 1 λ(μ) = (μ ± 2

? μ2 − 4).

In the case of μ = μ∗ = 0, we have λ(0) = ±i.

19 x˙

and x¨ denote the first and second derivatives of x with respect to t, respectively.

328

11 Hopf Bifurcation Theorem

There are clearly simple eigenvalues of Aμ∗ = A0 and there is no other eigenvalue. A0 is regular. Writing λ(μ) = α(μ) + iβ(μ), we finally obtain α (0) = 1/2 0. Consequently, we can conclude that (ω∗ , μ∗ ) = (1, 0) is a bifurcation point of F .

11.9 Hopf Bifurcation in Cr In the preceding sections, we examined the Hopf bifurcation phenomena in the framework of the Sobolev space X = W1,2 2π . However, we have to note that some technical modifications are required when we consider the same problem in the alternative space consisting of periodic smooth functions. Here we specify a couple of function spaces, X and Y, as X = {x ∈ Cr (R, Rn ) | x(t + 2π ) = x(t) for all t}, Y = {x ∈ Cr−1 (R,Rn ) | y(t + 2π ) = y(t) for all t}, where r 3. Furthermore, the function f : R × Rn → Rn is assumed to be of the class r−1 C (R × Rn , Rn ). 1◦ The first point to be examined is the formula (11.23) in Sect. 11.5. In the alternative setting, we have to proceed somewhat more carefully as follows. Since x is of the class Cr (r 3), we must have uk = o

1 as |k|r

|k| → ∞.20

Therefore, for any ε > 0, there exists some N ∈ N such that |uk | ε · (1/|k|r ) (|k| N). By differentiating the right-hand side of (11.22) of Sect. 11.5 termwise, we obtain ∞ ∞ ikuk eikt

kuk

k=−∞

k=−∞

kuk + ε

|k|
=

|k|
20 See

Theorem 3.6 (p. 59).

|k|N

kuk + ε

|k|N

|k| ·

1 |k|r

1 . |k|r−1

11.9 Hopf Bifurcation in Cr

329

Thus, taking account of the condition r 3, the series obtained by the termwise differentiation of the right-hand side of (11.22) is uniformly convergent. Hence we obtain21 x(t) ˙ =

∞

ikuk eikt .

k=−∞

2◦ The second modification is required in Sect. 11.6. The argument which succeeds the formula (11.31) should be replaced by the following one. We claim that the second term of the right-hand side of (11.31) in Sect. 11.6 is of the class Cr . In order to prove it, we define the function θ (t) by θ (t) = vk eikt . |k|k0

Then θ (t) is of the class Cr−1 . Let Θ(t) be the indefinite integral of θ (t), that is

t

Θ(t) =

(11.45)

θ (τ )dτ. 0

˙ Clearly, Θ(t) is continuously differentiable and Θ(t) = θ (t). Expanding θ (t) in the ˆ Fourier series, we confirm that the coefficient θ (0) corresponding to k = 0 is zero. Hence we obtain by Remark 3.3 (p. 57) again22 that θˆ (k) for all k 0, ik vk 1 Θ(t) = eikt ∗ ω ikω∗ ˆ Θ(k) =

i.e.

|k|k0

= the second term of (11.31). This is of the class Cr by (11.45). The third term of the right-hand side of (11.31) in Sect. 11.6 is of the class Cr−1 , and converges at every t. Differentiating termwise formally, we get |k|k0

21 Assume

O

1 vk · ikeikt . k2

(11.46)

that ϕn : [a, b] → R is of class C1 (n = 1, 2, · · · ) and S(t) =

∞

ϕn (t) is convergent.

n=1

If

∞ ∞ ϕn (t) converges uniformly, S(t) is differentiable and S (t) = ϕn (t). cf. Takagi [21], n=1

n=1

pp. 158–159, Stromberg [20] pp. 214–215. We assumed r 3 to use this theorem. 22 See footnote 11 on p. 315.

330

11 Hopf Bifurcation Theorem

The series (11.31) is uniformly convergent, since vk = o(1/k r−1 ) (r 3). Hence the third term of (11.31) in Sect. 11.6 is differentiable and (11.46) deduced above is exactly its derivative. Thus the third term is of the class Cr . The first term of the right-hand side of (11.31) is obviously of the class Cr . This finishes the proof of the claim that the right-hand side of (11.46) is of the class Cr . No change is required at all in the remaining part of the proof. Theorem 11.3 Let f (μ, x) : R × Rn → Rn be a function of the class Cr−1 (R × Rn , Rn ) (r 3) which satisfies f (μ, 0) = 0 for all μ ∈ R. Suppose that Assumptions 2, 3 and α (μ∗ ) 0 are satisfied. Then (ω∗ , μ∗ ) is a bifurcation point of F (ω, μ, x) = ωdx/dt − f (μ, x).

11.10 Kaldorian Business Fluctuations The phenomena of quasi-regular business fluctuations have been attracting the interest of numerous economists from both of theoretical and empirical viewpoints. In particular, the theories to describe and explain economic fluctuations have deepened very much since the “Great Depression”, through the 1930s. I have also to add Frisch’s famous paper [6] and Keynes’book [11] as indispensable contributions providing solid foundations for further developments since then. Among the various economic theories on business fluctuations, the Hicks [7]– Samuelson [17, 18] model and the Kaldor [10] model should be regarded as the most typical and influential. The former tries to explain the business cycles as a result of interactions between the multiplier and the accelerator, whereas the latter is based upon the so-called “profit principle” of investment which is embodied in nonlinear investment functions. In the present chapter, we concentrate upon the Kaldorian theory of business fluctuations. Yasui [23] tried to clarify the mathematical structure of the theory and successfully reduced the fundamental dynamics due to Kaldor to the Liénard differential equation, which was quite familiar in the field of mathematical analysis. Yasui also investigated the periodic behavior of the solutions, having recourse to the theory of nonlinear oscillations which was then in the course of vigorous development.23 However, from the mathematical viewpoint, some more rigorous reasoning seems to be required in order to establish the existence of periodic solutions. In this section, we present an approach to the existence problem of periodic solutions of the Kaldor–Yasui equation. The proof is based upon the Hopf bifurcation theorem. According to the profit principle due to N. Kaldor, we assume that the gross investment I depends upon the (real) national income Y and the capital stock K. The relation between I and (Y, K) is expressed by the equation I = ψ(Y, K), 23 For

an introductory exposition of business cycle theory, see Maruyama [15], Chap. 18.

11.10 Kaldorian Business Fluctuations

331

which satisfies the conditions DY ψ > 0 and

DK ψ < 0,

where DY ψ and DK ψ are the partial derivatives of ψ with respect to Y and K, respectively. From now on, we specify the function ψ as ψ(Y, K) = F (Y ) − μK

(μ > 0),

(11.47)

for the sake of simplicity. The function F is assumed to be twice continuously differentiable. If we denote by δ > 0 the rate of depreciation of capital, the condition I = δK

(11.48)

describes the situation of zero net investment. The saving function S(Y ) is simply given by S(Y ) = sY

(0 < s < 1).

In the state of zero net investment, we must have, by (11.47) and (11.48), that F (Y ) − μK = δK, that is, K=

1 F (Y ). μ+δ

I=

δ F (Y ), μ+δ

It follows that (11.49)

since I = δK. The equation (11.49) expresses the relation between Y and I , under which there is no variation of the capital stock K. In Fig. 11.3, the relation is depicted as the curve RR with upward slope. The equilibrium S = I on RR is attained at the level of Y such that sY =

δ F (Y ). μ+δ

We denote this level of Y by Y0 . The corresponding level K0 of the capital stock and that I = 0 of the (gross) investment are determined by (11.48) and (11.49). The difference between I = ψ(Y, K) and I0 = ψ(Y0 , K0 ) is calculated as I − I0 = F (Y ) − F (Y0 ) − μ(K − K0 ).

(11.50)

332

11 Hopf Bifurcation Theorem

Fig. 11.3 Koldor’s investment function

Fig. 11.4 Relation of y and i

Changing the variables as i = I − I0 , y = Y − Y0 , k = K − K0 and f (y) = F (Y ) − F (Y0 ), we can rewrite the equation (11.50) in the form i = f (y) − μk.

(11.51)

Similarly, the relation (11.49) is reexpressed as (cf. Fig. 11.4) i=

δ f (y). μ+δ

Assume further that the rate of change in y is proportional to the difference between the investment and saving, say εy˙ = i − sy,

ε > 0,

(11.52)

where y˙ means the derivative of y with respect to time. Substituting (11.51) into (11.52), we have εy˙ = f (y) − μk − sy,

(11.53)

11.10 Kaldorian Business Fluctuations

333

(11.53 )

εy˙ − f (y) + μk + sy = 0.

i.e.

According to (11.51) again, the net investment is given by k˙ = i − δk = f (y) − μk − δk.

(11.54)

Differentiating (11.53 ) with respect to time t, we obtain εy¨ − f (y)y˙ + μk˙ + s y˙ = εy¨ − f (y)y˙ + μ[f (y) − μk − δk] + s y˙

= εy¨ + [s − f (y)]y˙ + μ[εy˙ + sy − δk]

(by (11.54))

(by (11.53))

= εy¨ + [s + με − f (y)]y˙ + μ(sy − δk) = εy¨ + [s + με − f (y)]y˙ + μsy + δεy˙ − δf (y) + δsy

(by (11.53))

= εy¨ + [ε(μ + δ) + s − f (y)]y˙ + s(μ + δ)y − δf (y) = 0.

(11.55)

Dividing both sides of (11.55) by ε, we have 1 y¨ + ϕ(y)y˙ + g(y) = 0, ε

(11.56)

ϕ(y) = ε(μ + δ) + s − f (y),

(11.57)

where

and 0 1 δ s(μ + δ) g(y) = y − f (y) . ε δ The differential equation (11.56) is clearly of the Liénard type. This is the fundamental dynamic equation of the Kaldor–Yasui theory of business cycles. It is a duty for economic theorists to look for some striking sufficient conditions which guarantees the existence of periodic solutions for the differential equation (11.56). We now present a couple of alternative approaches leading to this target. See Chang and Smyth [3] and Shinasi [19] as related contributions. The Kaldor–Yasui equation (11.56) can be rewritten as ⎞ ⎛ z 1 0 y˙ ⎠. δ s(μ + δ) =⎝ 1 y − f (y) − [ε(μ + δ) + s − f (y)]z − z˙ ε ε δ

(11.58)

334

11 Hopf Bifurcation Theorem

If we denote t (y, z) by u and the right-hand side of (11.58) by h(μ, u), regarding μ as a parameter, (11.58) can be shortened as u˙ = h(μ, u).

(11.59)

K(ω, μ, u) = ωu˙ − h(μ, u) = 0,

(11.60)

Consider an equation

where ω is another parameter. Writing Aμ = Du h(μ, 0), we obtain ⎛

0

Aμ = ⎝ s(μ + δ) δ − + f (0) ε ε

1

−

⎞

⎠ . 1 ε(μ + δ) + s − f (0) ε

We specify the value μ∗ of μ so that24 −

s(μ + δ) δ + f (0) = −1, ε ε

(11.61)

and 1 − [ε(μ + δ) + s − f (0)] = 0. ε The value of ω∗ is specified as ω∗ = 1. Then ±iω∗ = ±i are simple eigenvalues of Aμ∗ . If we write the eigenvalue of Aμ by λ(μ) = α(μ) ± iβ(μ), we can easily confirm that α (μ∗ ) 0. We thus obtain the following theorem by Theorem 11.3. Theorem 11.4 (ω∗ , μ∗ ) = (1, ε(δ 2 + 1)/(s − δε)) is a bifurcation point of K(ω, μ, u) defined by (11.60). This implies the existence of a nontrivial periodic solution for (11.58) ⇔ (11.59) (with period near 2π ) corresponding to μ near to μ∗ .

24 We

obtain s = −ε(μ + δ) + f (0). It follows from this relation and (11.61) that ε = s(μ + δ) − δf (0) = s(μ + δ) − δ(s + ε(μ + δ)),

which gives μ∗ =

ε (δ 2 + 1). s − δε

11.11 Ljapunov’s Center Theorem

335

11.11 Ljapunov’s Center Theorem It is well-known that trajectories of planar linear ordinary differential equations can be classified by eigenvalues of coefficient matrix. In particular, the equilibrium point is a center if all the eigenvalues are purely imaginary. However, the situation becomes quite different in the case of nonlinear systems as illustrated by the following example. Example 11.2 Consider a nonlinear ordinary differential equation x˙ = −y − x(x 2 + y 2 ), (11.62) y˙ = x − y(x + y ). 2

2

The matrix which gives a linear approximation of the right-hand side of (11.62) around the origin 0 is 0 −1 A= . 1 0 The eigenvalues of A are ±i. However, trajectories of the equation (11.62) do not show the behaviors moving around a center. In fact, we obtain d dt

0

1 1 2 2 (x + y ) = x x˙ + y y˙ = −(x 2 + y 2 )2 2

for any solution (x(t), y(t)) of (11.62). Hence it follows that x2 + y2 =

1 2t + C

(C is constant).

The trajectory is not a closed orbit. Thus the origin can not be a center. Under what conditions does a center occur for a nonlinear differential equation? We next study this problem by making use of Hopf’s bifurcation theorem. Let Ω be an open set in Rn and Ω be an open subset of Ω. Suppose that a function f : Ω → Rn is continuous and has partial derivatives. A continuous differentiable function u : Ω → R is called a first integral of the differential equation x˙ = f (x)

(11.63)

u(ϕ(t)) = constant

(11.64)

if

336

11 Hopf Bifurcation Theorem

for any solution ϕ(t) of (11.63) such that its trajectory is contained in Ω . The condition (11.64) means that the value of u(ϕ(t)) depends upon the choice of ϕ(t) but is independent of t. Lemma 11.5 A function u is a first integral of (11.63) if and only if Du(x), f (x) = 0

for all x ∈ Ω .

(11.65)

Proof Suppose that u is a first integral of (11.63). Let ϕ(t, ξ ) be a solution of (11.63) with the initial value ξ ∈ Ω ; i.e. ϕ(0, ξ ) = ξ . By definition of the first integral, d u(ϕ(t, ξ ))|t=0 = Du(ξ ), f (ξ ) = 0. dt This relation holds good for any ξ ∈ Ω . Hence (11.65) follows. Suppose conversely that (11.65) holds good. Let ϕ(t) be a solution of (11.63), the trajectory of which is contained in Ω . If we define a function v(t) by v(t) = u(ϕ(t)), we obtain d v(t) = Du(ϕ(t)), f (ϕ(t)) = 0 dt by (11.65). Consequently, v(t) = u(ϕ(t)) =constant.

Example 11.3 Let a function H : Rn × Rn → R be differentiable. Consider a differential equation called the Hamiltonian system x˙ = −Dy H (x, y), (11.66) y˙ = Dx H (x, y). Then H itself is a first integral of (11.66). Lemma 11.6 Let f ∈ C1 (Rn , Rn ). Suppose that u ∈ C1 (Rn , R) is a first integral of the differential equation. x˙ = f (x).

(11.67)

If x is a solution of x˙ = f (x) + μDu(x),

μ∈R

with period T , then x is also a solution of (11.67) with period T .

(11.68)

11.11 Ljapunov’s Center Theorem

337

Proof There is nothing to prove in the case μ = 0. So we may assume that μ 0. Let x be a T -periodic solutions of (11.68). If we define v(t) = u(x(t)), then we obtain v(t) ˙ =

d u(x(t)) = Du(x(t)), x(t) ˙ dt

= Du(x(t)), f (x(t)) + μDu(x(t)) (by (11.68)) = Du(x(t)), f (x(t)) + μ Du(x(t)) 2 . Since u is a first integral of (11.67), Du(x(t)), f (x(t)) = 0. Hence v(t) ˙ = μ Du(x(t)) 2 .

(11.69)

v˙ 0.

(11.70)

If μ > 0, we obtain

So v(t) is nondecreasing. On the other hand, v(0) = u(x(0)) = u(x(T )) = v(T )

(11.71)

by the periodicity of x. Hence v(t) ˙ = 0 by (11.70) and (11.71). It follows from (11.69) that μDu(x(t)) = 0. We obtain the same result in the case of μ < 0. Thus we conclude that x is a T periodic solution of (11.67). Theorem 11.5 (Ljapunov) Let f ∈ C2 (Rn , Rn ) satisfy f (0) = 0. Assume that A = Df (0) satisfies the following two conditions: (i) A is regular and ±iω∗ (ω∗ > 0) are simple eigenvalues of A. (ii) ikω∗ is not an eigenvalue of A for all k ∈ Z other than k ±1. Assume further that u ∈ C3 (Rn , R) is a first integral of (11.63) and D 2 u(0) ≡ B is regular. Then (ω∗ , μ∗ ) = (ω∗ , 0) is a bifurcation point of ωx˙ = f (x) + μDu(x)

338

11 Hopf Bifurcation Theorem

in the space X = {x ∈ Cr (R, Rn ) | x(t + 2π ) = x(t)

for all t}, r 3.

Proof The following reasoning depends upon Theorem 11.3: 1◦ Define a function ϕ ∈ C2 by ϕ(μ, x) = f (x) + μDu(x).

(11.72)

ϕ plays a role corresponding to f in Theorem 11.3. 2◦ Since Du(ξ ), f (x) = 0 (x ∈ Rn ) by Lemma 11.5, we obtain D 2 u(ξ )y, f (ξ ) + Du(ξ ), Df (ξ )y = 0 for all

ξ, y ∈ Rn .

(11.73)

y ∈ Rn .

(11.74)

If ξ = 0, in particular, D 2 u(0)y, f (0) + Du(0), Df (0)y = 0 for all The first term of (11.74) is zero, since f (0) = 0. Hence t y = 0 Du(0), Ay = ADu(0),

for all

y ∈ Rn .

By the regularity of A, we must have Du(0) = 0. It follows from the definition (11.72) that ϕ(μ, 0) = f (0) + μDu(0) = 0. 3◦ Writing Aμ = Dx ϕ(μ, 0), we have Aμ = Dx f (0) + μD 2 u(0) = A + μB. The following two conditions are satisfied for μ∗ = 0 by (i) and (ii): (a) Aμ∗ is regular, and ±iω∗ are simple eigenvalues of Aμ∗ . (b) ikω∗ is not an eigenvalue of Aμ∗ for all k ∈ Z other than k = ±1. 4◦ We are now going to verify the final condition of the Hopf theorem. Differentiating (11.73) with respect to ξ , we obtain at ξ = 0 that D 3 u(0)(y, z), f (0) + Df (0)z, D 2 u(0)y + D 2 u(0)z, Df (0)y + Du(0), D 2 f (0)(y, z) = 0 for all

y, z ∈ Rn .

11.11 Ljapunov’s Center Theorem

339

The first and the fourth terms on the left-hand side disappear, since f (0) = 0 and Du(0) = 0. Hence Az, By + Ay, Bz = 0

for all

y, z ∈ Rn .

Since B is a symmetric matrix, t + BA)y = 0 for all z,tABy + z, BAy = z, (AB

y, z ∈ Rn

and so t AB + BA = 0.

(11.75)

Applying a suitable transformation (±iω∗ are simple eigenvalues of A), we obtain A=

S 0 , 0R

where S=

0 −ω∗ ω∗ 0

and ±iω∗ are not contained in the spectrum of R. (S is a (2 × 2)-matrix, and R is a (n − 2) × (n − 2)-matrix.) On the other hand, let us decompose B as 4

5

U W B= t , W V where U (resp. V ) is a (2 × 2)-symmetric matrix (resp. (n − 2) × (n − 2)-symmetric matrix) and W is a 2 × (n − 2)-matrix. It follows from (11.75) that SU = U S,

(11.76)

SW = W R.

(11.77)

For instance, (11.76) can be verified as follows. Looking at the multiplication of (2 × 2)-submatrices in the upper left zones in the calculation ⎛ ⎞ ⎞ 0 −ω∗ 0 ω∗ 0⎟ ⎜ −ω∗ 0 0 ⎟ U W U W ⎜ t ⎜ ω∗ 0 ⎟ ⎟ = 0, + AB + BA = ⎜ tW V ⎝ ⎠ tW V ⎠ ⎝ 0 tR 0 R ⎛

340

11 Hopf Bifurcation Theorem

we can easily observe that −SU + U S = 0, which is equivalent to (11.76). Write the symmetric matrix U as U=

ab . ba

By (11.76), we must have b = 0, since ω∗ 0. Hence a0 U= . 0a Furthermore, the matrix W is zero. In order to see this, we decompose W as X W = , Y where X, Y ∈ Rn−2 . By (11.77), we have XR + ω∗ Y = 0,

(11.78)

∗

(11.79)

Y R − ω X = 0. It follows from (11.79) that X=

1 Y R. ω∗

(11.80)

Substituting (11.80) into (11.78), we have 1 1 2 Y R 2 + ω∗ Y = ∗ Y [R 2 + ω∗ I ] ω∗ ω 1 = ∗ Y (R + iω∗ I )(R − iω∗ I ) ω

(11.81)

= 0. Y = 0 follows from (11.81), since ±iω∗ are not eigenvalues of R. Hence X = 0 by (11.80). Thus we obtain W = 0. That is, B has the following form:

11.11 Ljapunov’s Center Theorem

341

⎞

⎛

a 0 ⎜0 a ⎜ B=⎝

0⎟

⎟. ⎠

0 V a 0 by the regularity of B. Consequently, A + μB is expressed as ⎛

μa −ω∗ ⎜ ω∗ μa A + μB = ⎜ ⎝

0

⎞

0

⎟ ⎟. ⎠

R + μV

Find eigenvalues λ(μ) of A + μB with parameter μ, which are equal to ±iω∗ when μ = 0. Then λ(μ) = μa ± iω∗

(a 0).

We have verified all the conditions of the Hopf theorem.

The above theorem tells us that the differential equation ωn x˙ = f (x) + μn Du(x),

n = 1, 2, · · ·

has a nontrivial solution xn (t) with period 2π and xn → 0 (in Cr ) as ωn → ω∗ and μn → 0. Specifying ω∗ = 1, we obtain a solution xn with period Tn = 2π/ωn of x˙ = f (x) + μn Du(x),

n = 1, 2, · · ·

(11.82)

such that xn → 0 as μn → 0. According to Lemma 11.6, however, xn is a solution not only of (11.82) but also of x˙ = f (x). That is, there exists a sequence of nontrivial solutions with period 2π/ωn such that xn → 0 as n → ∞. Remark 11.7 1◦ The domain of f , which determines the differential equation, is assumed to be Rn in Theorem 11.5. It is clear that Rn can be substituted by any open set Ω containing 0. In this case, the domain of the first integral may be Ω. That is, what we require is “u(ϕ(t)) = constant for any solution ϕ(t) of (11.67) such that its trajectory is contained in Ω”.

342

11 Hopf Bifurcation Theorem

Fig. 11.5 Trajectories of Volterra–Lotka equation (1)

2◦ In Theorem 11.5, the origin 0 is considered as an equilibrium point of (11.67) and A = Df (0). If an equilibrium point x0 is nonzero, then the problem can be reduced to the above case by changing the variable v = x − x0 . Example of Application (Volterra–Lotka Differential Equation) Consider a planar system of ordinary differential equations x˙ = x(a − by), (11.83) y˙ = y(cx − d), where a, b, c and d are positive constants (x˙ and y˙ denote the derivatives with respect to the time variable t). x(t) and y(t) may be interpreted, for instance, as the quantity of corn at time t and the population at time t. The life of a man is supported by corn. (11.83) means that the growth ratios, x/x ˙ and y/y, ˙ are equal to (a − by) and (cx − d), respectively. Expressing the right-hand side of (11.83) as a vector f (x, y), we can rewrite (11.83) in the form x˙ = f (x, y). y˙ The equilibrium points of (11.83) are (0, 0) and (d/c, a/b) ≡ (x0 , y0 ). The trajectories with initial values on axes are depicted in Fig. 11.5. Adapting to the interpretation as above, we restrict our attention to x 0, y 0. When x = 0 and y 0, x(a − by)|x=0 = 0. When x 0 and y = 0, y(cx − d)|y=0 = 0. Hence a nonnegative solution (x(t), y(t)) of (11.83) exists for any nonnegative initial value (α0 , β0 ).25 We denote by Ω the interior of the first orthant;

25 See

Yamaguti [22], p. 25.

11.11 Ljapunov’s Center Theorem

343

i.e. Ω = {(x, y) ∈ R2 |x > 0, y > 0}. Any solution with initial value in Ω remains in Ω. Evaluating Df (x0 , y0 ) at the equilibrium point (x0 , y0 ), we obtain

0 −bd/c Df (x0 , y0 ) = . ca/b 0 √ The eigenvalues are ±i ad ≡ iω0 . Both of them are simple. We next look for a first integral. By a simple manipulation of (11.83), we obtain acx − bdy = cx˙ + by, ˙ d

y˙ x˙ − (acx − bdy) + a = 0. x y

(11.84) (11.85)

Substituting (11.85) into (11.84), we further obtain cx˙ − d

x˙ y˙ + by˙ − a = 0. x y

(11.86)

Any solution of (11.83) must satisfy (11.86). Integrating (11.86) with respect to a solution (x(t), y(t)) which remains in Ω, we get cx(t) − d log x(t) + by(t) − a log y(t) = const. Define a function u : Ω → R by u(x, y) = cx − d log x + by − a log y. Then it satisfies d u(x(t), y(t)) = 0. dt That is, u(x, y) is a first integral of (11.83). Finally, d/x02 0 D u(x0 , y0 ) = 0 a/y02 2

is regular. Hence, according to Theorem 11.5, (x0 , y0 ) must be a center. (cf. Fig. 11.6.)

344

11 Hopf Bifurcation Theorem

Fig. 11.6 Trajectories of Volterra–Lotka equation (2)

References 1. Ambrosetti, A., Prodi, G.: A Primer of Nonlinear Analysis. Cambridge University Press, Cambridge (1993) 2. Carleson, L.: On the convergence and growth of partial sums of Fourier series. Acta Math. 116, 135–157 (1966) 3. Chang, W.W., Smyth, D.J.: The existence and persistence of cycles in a non-linear model of Kaldor’s 1940 model re-examined. Rev. Econ. Stud. 38, 37–44 (1971) 4. Crandall, M.G., Rabinowitz, P.H.: The Hopf bifurcation theorem. MRC Technical Summary Report, No. 1604, University of Wisconsin Math. Research Center (1976) 5. Evans, L.: Partial Differential Equations. American Mathematical Society, Providence (1998) 6. Frisch, R.: Propagation problems and inpulse problems in dynamic economics. In: Economic Essays in Honor of Gustav Cassel. Allen and Unwin, London (1933) 7. Hicks, J.R.: A Contribution to the Theory of the Trade Cycle. Oxford University Press, London (1950) 8. Hopf, E.: Abzweigung einer periodishcen Lösung von einer stationären Lösung eines Differentialsystems. Ber. Math. Phys. Sächsische Akademie der Wissenschafters, Leipzig 94, 1–22 (1942) 9. Hunt, R.A.: On the convergence of Fourier series. In: Proceedings of the Conference on Orthogonal Expanisions and their Continuous Analogues, Southern Illinois University Press, Carbondale, pp. 234–255 (1968) 10. Kaldor, N.: A model of the trade cycle. Econ. J. 50, 78–92 (1940) 11. Keynes, J.M.: The General Theory of Employment, Interest and Money. Macmillan, London (1936) 12. Maruyama, T.: Suri-keizaigaku no Hoho (Methods in Mathematical Economics). Sobunsha, Tokyo (1995) (Originally published in Japanese) 13. Maruyama, T.: Existence of periodic solutions for Kaldorian business fluctuations. Contemp. Math. 514, 189–197 (2010) 14. Maruyama, T.: On the Fourier analysis approach to the Hopf bifurcation theorem. Adv. Math. Econ. 15, 41–65 (2011) 15. Maruyama, T.: Shinko Keizai Genron (Principles of Economics). Iwanami Shoten, Tokyo (2013) (Originally published in Japanese) 16. Masuda, K. (ed.): Ohyo-kaiseki Handbook (Handbook of Applied Analysis). Springer, Tokyo (2010) (Originally published in Japanese)

References

345

17. Samuelson, P.A.: Interaction between the multiplier analysis and the principle of acceleration. Rev. Econ. Stud. 21, 75–78 (1939) 18. Samuelson, P.A.: A synthesis of the principle of acceleration and the multiplier. J. Polit. Econ. 47, 786–797 (1939) 19. Shinasi, G.J.: A nonlinear dynamic model of short run fluctuations. Rev. Econ. Stud. 48, 649– 656 (1981) 20. Stromberg, K.R.: An Introduction to Classical Real Analysis. American Mathematical Society, Providence (1981) 21. Takagi, T.: Kaiseki Gairon (Treatise on Analysis), 3rd edn. Iwanami Shoten, Tokyo (1961) (Originally published in Japanese) 22. Yamaguti, M: Hisenkei Gensho no Sugaku (Mathematics of Nonlinear Phenomena). Asakura Shoten, Tokyo (1972) (Originally published in Japanese) 23. Yasui, T.: Jireishindo to Keiki Junkan. In:Kinko-bunseki no Kihon Mondai. (Self-oscillations and trade cycles. In: Fundamental Problems in Equilibrium Analysis.) Iwanami Shoten, Tokyo (1965) (Originally published in Japanese)

Appendix A

Exponential Function eiθ

The imaginary exponential function eiθ incessantly appears in Fourier analysis. Everybody knows that eiθ is a complex number, the real and the imaginary parts of which are cos θ and sin θ , respectively. If we take it for granted, the trajectory of eiθ is exactly the unit circle in the complex plane C. However, it is not so easy for us to give a coherent exposition of this concept. We would like to provide an idea on how to look at this concept systematically in order to promote more rigorous reasoning.1

A.1 Complex Exponential Function We define a function ez of a complex variable z by ez =

∞ 1 n z . n!

(A.1)

n=0

Since the convergence radius of the power series on the right-hand side of (A.1) is infinite, (A.1) is absolutely convergent. The derivative is d z e = ez . dz

(A.2)

It is easy to see that

ez+z = ez · ez

(A.3)

1I

am indebted to Cartan [1] and Takagi [5] very much. See van der Waerden [6] for basic knowledge on group theory.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8

347

A Exponential Function eiθ

348

for z, z ∈ C. In fact, it is verified by2 ∞ ∞ 1 n 1 n z · z n! n!

ez · ez =

n=0

n=0

n ∞

=

n=0 p=0

1 zp z(n−p) p!(n − p)!

∞ 1 (z + z )n = n! n=0

= ez+z . Since ez · e−z = 1 by (A.3), we observe that ez 0 for any z ∈ C.

2 In

general, let

∞ n=0

un and

∞

vn be absolutely convergent series. Define

n=0

wn =

n

up vn−p .

p=0

Then

∞

wn is also absolutely convergent and

n=0 ∞

∞

wn =

n=0

un

∞

n=0

vn .

(†)

n=0

We can prove this relation as follows. Defining αp =

∞

|un |,

βq =

n=p

∞

|vn |,

n=q

we obtain ∞

|wn |

n=0

∞ ∞

|up | |vq | = α0 · β0 .

p=0 q=0

For m 2n, the absolute value of m k=0

wk −

n k=0

uk

n

vk

k=0

is bounded by some finite sum of |up | · |vq | (either p or q is greater than n). This sum is, in turn, bounded by α0 βn+1 + β0 αn+1 . Hence it converges to 0 as n → ∞. Thus we verify (†).

A.2 Imaginary Exponential Function

349

If we write z ∈ C as z = x + iy (x, y ∈ R), then ez = ex · eiy . The study of ez is eventually reduced to examining the properties of the real exponential function ex and of the imaginary exponential function eiy . It does not seem necessary here to give any expositions concerning the properties of ex and the logarithmic function as its inverse.3 So we exclusively concentrate on the imaginary exponential function eiy (y ∈ R).

A.2 Imaginary Exponential Function By definition of eiθ given by (A.1) in the preceding section, e−iθ = eiθ (θ ∈ R). Hence eiθ · e−iθ = |eiθ |2 = 1, and so |eiθ | = 1 for all

θ ∈ R.

(A.4)

The set U = {z ∈ C| |z| = 1} forms a unit circle of the complex plane C, and is a group with respect to the multiplication. A function ϕ : R → U defined by ϕ(θ ) = eiθ ,

θ ∈R

(A.5)

is a (group) homomorphism of an additive group R into the multiplicative group U . In order to examine ϕ closely, we divide eiθ into the real part and the imaginary part. We define cos θ as the real part of eiθ and sin θ as its imaginary part; i.e. eiθ = cos θ + i sin θ.

(A.6)

Then cos θ and sin θ are given by the power series cos θ = 1 −

1 2 (−1)n 2n θ + ··· + θ + ··· , 2 (2n)! (A.7)

sin θ = θ −

3 cf.

(−1)n

1 3 θ + ··· + θ 2n+1 + · · · , 3! (2n + 1)!

Stromberg [4] pp. 134–136, Takagi [5] pp. 189–193.

A Exponential Function eiθ

350

the convergence radii of which are both infinite.4 The derivatives are evaluated as d cos θ = − sin θ, dθ

d sin θ = cos θ. dθ

(A.8)

Since cos 0 = 1 and cos θ is continuous, there exists some θ0 > 0 such that θ ∈ [0, θ0 ].

cos θ > 0 for all

(A.9)

Hence, by (A.8), sin θ is strictly increasing on [0, θ0 ] and sin θ0 is positive, since sin 0 = 0. We now show that there exists some θ > 0 such that cos θ = 0. Suppose that cos θ > 0 on [θ0 , θ1 ]. We know that

θ1

cos θ1 − cos θ0 = −

sin θ dθ.

(A.10)

θ0

cos θ > 0 on the interval of integration. So sin θ is increasing there and sin θ > sin θ0 ≡ a. Hence

θ1

sin θ dθ > a(θ1 − θ0 ).

θ0

4 Consider

a series

∞

an (x) of functions an : R → R.

n=1

1◦ If there exists a sequence {cn } of positive numbers such that |an (x)| cn ; n = 0, 1, 2, · · · , and

∞

cn < ∞,

n=1

on some interval I , then

∞

an (x) uniformly converges on I .

n=1

2◦ If each an (x) is continuously differentiable, convergent, then

∞

∞

an (x) converges, and

n=1

an (x) is also continuously differentiable and

n=1 ∞ ∞ d d an (x) = an (x). dx dx n=1

n=1

(cf. Stromberg [4] pp. 213–215, Takagi [5] pp. 155–159.)

∞ n=1

an (x) is uniformly

A.2 Imaginary Exponential Function

351

By (A.10) and cos θ1 > 0, we obtain θ1 − θ0 <

1 cos θ0 . a

Thus we must conclude that there exists some point θ ∈ [θ0 , θ0 + (1/a) cos θ0 ] such that cos θ = 0. We define π/2 as the minimum positive number which satisfies cos θ = 0. This is the definition of π . On the interval [0, π/2], cos θ is monotonically decreasing from 1 to 0, and sin θ is increasing from 0 to 1. That is, ϕ is a bijection of [0, π/2] onto U1 = {(u, v) ∈ U | u 0, v 0}. The inverse of ϕ is also continuous, since [0, π/2] is compact. Thus [0, π/2] is homeomorphic to U1 through ϕ. Similarly, ϕ gives a homeomorphic relation between each of the following interval and the corresponding quarter of the unit circle: 0

1 π , π ←→ U2 = {(u, z) ∈ U | u 0, v 0}, 2 0 1 3 π, π ←→ U3 = {(u, z) ∈ U | u 0, v 0}, 2 0 3 π, 2π ←→ U4 = {(u, z) ∈ U | u 0, v < 0}. 2 For instance, suppose θ moves on [π/2, π ]. Write θ = π/2 + ξ . When θ moves from π/2 to π , ξ moves from 0 to π/2. It is easy to see that5 eiθ = ei(π/2+ξ ) = ei·π/2 · eiξ π iξ π e = − sin ξ + i cos ξ. = cos + i sin 2 2 When θ moves on [π/2, π ] and so ξ moves on [0, π/2], the real part of eiθ decreases from 0 to −1, and the imaginary part also decreases from 1 to 0. Therefore the trajectory of ϕ(θ ) = eiθ (θ ∈ [π/2, π ]) is the quarter U2 of the unit circle. We can apply similar reasonings to other pairs. Hence [0, 2π ) and U are homeomorphic to each other through ϕ. Of course, U is also homeomorphic to [−π, π ) instead of [0, 2π ).

5 The

second equality is justified because the function ϕ defined by (A.5) is a homomorphism.

A Exponential Function eiθ

352

Since the unit element of U is 1, the kernel Kerϕ = {θ ∈ R|ϕ(θ ) = 1} of the homomorphism ϕ is the set of multiples of 2π ; i.e. Kerϕ = 2π Z = {2π n|n ∈ Z}.

(A.11)

A.3 Torus R/2πZ 2π Z = {2π n|n ∈ Z} is a subgroup of the additive group R. The factor group R/2π Z of R modulo 2π Z is called the (one dimensional) torus, which is denoted by T. T and the unit circle U is isomorphic through ϕ defined in the preceding section: i.e. T = R/2π Z U.

(A.12)

This is justified by the isomorphism theorem of groups and (A.11) in the preceding section. T endowed with the quotient topology is a Hausdorff topological space. Let ξ : R → T be a canonical mapping; i.e. a function which maps each θ ∈ R to its residue class θ + 2π Z. By the definition of the quotient topology, ξ is continuous. Since T is the image of [0, 2π ] by ξ , T is compact in the quotient topology. If we define another function ϕ˜ : T → U by ϕ(θ ˜ + 2π Z) = ϕ(θ ),

θ ∈ R,

(A.13)

ϕ˜ is a continuous bijection.6 continuity of ϕ˜ can be proved as follows. Let Σ be an open neighborhood of ϕ(θ ˜ 0 + 2π Z) = ϕ(θ0 ). By the continuity of ϕ, there exists an open set Θ(⊂ R) containing θ0 such that

6 The

ϕ(θ) ∈ Σ Define a set

Θ

for all θ ∈ Θ.

by Θ = {θ + 2π z | θ ∈ Θ, z ∈ Z} =

(Θ + 2π z).

z∈Z

Θ is an open set in R which contains θ0 and satisfies ϕ(θ) ∈ Σ

for all θ ∈ Θ .

Let Γ be the set of residue classes containing elements of Θ ; i.e. Γ = {θ + 2π Z | θ ∈ Θ }.

A.3 Torus R/2π Z

353

Consequently, ϕ˜ is a homeomorphism between T and U . (Note that U is compact.) We conclude that the torus T and the unit circle U are isomorphic as groups and homeomorphic as topological spaces. ϕ˜ −1 (u) ∈ T is uniquely determined for each u ∈ U ; that is, a unique real number modulo 2π Z is determined. We call this real number the argument of u and denote it by arg u. For any complex number z 0, we define its argument by arg z = arg

z . |z|

(A.14)

(Since z/|z| ∈ U , the right-hand side has already been defined above.) We must note that arg z is defined at most modulo 2π Z. Making use of this notation, z can be expressed as z = |z|ei arg z .

(A.15)

Given a complex number t ∈ C (t 0), the complex number z which satisfies ez = t is given by log |t| + i arg t.

(A.16)

This complex number is the definition of log t, i.e. log t = log |t| + i arg t. Since arg t is determined only modulo 2π Z, log t is also defined only modulo 2π i · Z. Let D be a region (open connected set) in C \ {0}. A continuous function f : D → C which satisfies ef (t) = t (t ∈ D) (that is, f (t) is a value of log t) is called a branch of log t. If f (t) is a branch of log t, then its derivative f (t) exists and f (t) = 1/t. Proof For h 0 with sufficiently small |h| (and so t + h ∈ D), we have f (t + h) − f (t) f (t + h) − f (t) = f (t+h) . h e − ef (t)

As h → 0, this number approaches to the inverse of the limit of (ez − ez )/(z − z) as z → z = f (t). Hence f (t) is the inverse of the derivative of ez at z = f (t), that is, e−f (t) = 1/t.

Then θ0 + 2π Z ∈ Γ and ξ −1 (Γ ) = Θ . Hence, by the definition of the quotient topology, Γ (⊂ T) is open in T. It also holds good that ϕ(θ ˜ + 2π Z) = ϕ(θ) ∈ Σ This proves the continuity of ϕ. ˜

for all θ ∈ Θ

(i.e. for all θ + 2π Z ∈ Γ ).

A Exponential Function eiθ

354

A.4 A Homomorphism of R into U We define a function ψx : R → U for a fixed x ∈ R by ψx (θ ) = eixθ . ψx is a homomorphism between the two groups. How about the converse? That is, is any homomorphism η : R → U represented in the form η(θ ) = eixθ for some x ∈ R?7 Since η is continuous and η(0) = 1, there exists some δ > 0 such that

δ

η(t)dt = α 0.

(A.17)

0

Furthermore,

δ

αη(θ ) = η(θ )

δ

η(t)dt =

0

η(θ + t)dt

0

(A.18)

θ+δ

=

η(t)dt

(changing variables)

θ

because η is a homomorphism. Although we assume only the continuity of η, η actually turns out to be differentiable by (A.18).8 αη (θ ) =

δ

η (θ + t)dt = η(θ + δ) − η(θ )

0

= η(θ )η(δ) − η(θ ) = η(θ )(η(δ) − 1). Hence η (θ ) = η(θ ) ·

η(δ) − 1 . α

Since (η(δ) − 1)/α on the right-hand side is exactly equal to η (0),9 we have η (θ ) = xη(θ ), 7 We

x = η (0).

owe this to Rudin [3] pp. 12–13. [2] pp. 381–382.

8 Maruyama 9

αη (0) =

δ o

η (t)dt = η(δ) − η(0) = η(δ) − 1.

(A.19)

A.5 Functions and σ -Fields on the Torus

355

Solving this differential equation with the initial value η(0) = 1, we obtain η(θ ) = eixθ

(A.20)

by the differentiation formula of a branch of log t.10 Thus we conclude that any continuous homomorphism ψ : R → U can be expressed in the form ψ(θ ) = eiθx for a uniquely determined x ∈ R.

A.5 Functions and σ -Fields on the Torus We denote by F(T, C) the set of complex-valued functions on T and by F2π (R, C) the set of complex-valued functions on R which are 2π -periodic. Then there is a oneto-one correspondence between these two sets. Define an operator T : F(T, C) → F2π (R, C) by (Tf )(θ ) = f (ξ(θ )) = f (θ + 2π Z),

f ∈ F(T, C).

The kernel KerT of T is {0}. Hence F(T, C) and F2π (R, C) are isomorphic as linear spaces. (ξ is the canonical mapping which maps each θ to its residue class θ +2π Z.) We denote by C(T, C) the space of complex-valued continuous functions on T, and by Cb (R, C) the space of bounded complex-valued continuous functions on R. Both of them are Banach spaces with the uniform convergence norm. If we define a linear operator T : C(T, C) → Cb (R, C) by (Tf )(θ ) = f (ξ(θ )),

f ∈ C(T, C),

Tf is a continuous function with period 2π . So if we denote by Cb2π (R, C) the space of bounded complex-valued continuous functions with period 2π , C(T, C) is isomorphic to Cb2π (R, C) as Banach spaces. It is also clear that C(T, C) is also isomorphic to {f ∈ C([−π, π ), C) | lim f (θ ) = θ↑π

f (−π )}. Since T and U are homeomorphic, the measurable spaces (T, B(T)) and (U, B(U )) are isomorphic, where B(·) denotes the Borel σ -field. Although [−π, π ) and U are not homeomorphic, (U, B(U )) and ([−π, π ), B([−π, π ))) are isomorphic. So we can consider [−π, π ) instead of U or T as a measurable space.

= xθ+ constant. But the constant must be 0 since η(0) = 1. On the other hand, log η(θ) = log |η(θ)| + i argη(θ) = i argη(θ). i argη(θ) = θx. If we express x as x = a + ib, we obtain i argη(θ) = ib(θ). Hence η(θ) = ei argη(θ ) = eibθ . This proves (A.20).

10 log η(θ)

356

A Exponential Function eiθ

References 1. Cartan, H.: Théorie élémentaires des fonctions analytiques d’une ou plusieurs variables complexes. Hermann, Paris (1961) (English edn.) Elementary Theory of Analytic Functions of One or Several Complex Variables. Addison Wesley, Reading (1963) 2. Maruyama, T.: Suri-keizaigaku no Hoho (Methods in Mathematical Economics). Sobunsha, Tokyo (1995) (Originally published in Japanese) 3. Rudin, W.: Fourier Analysis on Groups. Interscience, New York (1962) 4. Stromberg, K.R.: An Introduction to Classical Real Analysis. American Mathematical Society, Providence (1981) 5. Takagi, T.: Kaiseki Gairon (Treatise on Analysis), 3rd edn. Iwanami Shoten, Tokyo (1961) (Originally published in Japanese) 6. van der Waerden, B.L.: Algebra, nouvelle édn. Unger, New York (1970)

Appendix B

Topics from Functional Analysis

We have studied the theory of Fourier transforms of distributions, and then applied it to the theory of almost periodic functions. Spaces like D(Ω), D(Ω) , S(Ω) and S(Ω) , which play basic roles in the theory in question, are not normed spaces. So more abstract theories of locally convex topological vector spaces are indispensable. However, it seems to be a digression for us to describe the entire landscape of the theory in this book. We ask our readers to read through some reliable textbooks on functional analysis for systematic knowledge.1 In this appendix, we are going to give a supplementary exposition of a couple of topics for the sake of readers’ convenience; on the concept of inductive limit and on the dual space of a locally convex topological vector space. The topology of D(Ω) which appears in Appendix C is determined by the inductive limit and its dual D(Ω) is the space of distributions. Hence the basic theories explained in this appendix are more or less required in order to study the distributions at a rigorous level.

B.1 Inductive Limit Topology A distribution is a continuous linear functional on a topological vector space consisting of certain smooth functions. The topology of its domain is defined by the inductive limit. So we start by explaining this concept.

1 For

general theories of locally convex spaces, Bourbaki [1], and Grothendieck [2] are most reliable works. We are much indebted to Schwartz [6] Chap. III and Schwartz [5] Chap. IV as well as Appendix, Part II for the theories of inductive limit and barreled spaces, which are discussed in relation to the theory of distributions.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8

357

358

B Topics from Functional Analysis

Let X be a vector space and {Xα }α∈A a family of subspaces of X such that X = Xα . We suppose that each Xα is a locally convex Hausdorff topological vector α∈A

(or linear) space (abbreviated as LCHTVS) and its topology is denoted by Tα . We also denote by B0 the set of all the balanced convex subsets U such that U

B

Xα ∈ N(0, Tα )

for all α ∈ A, where N(0, Tα ) is a perfect family of neighborhoods of 0 ∈ Xα with respect to Tα . In fact, such a neighborhood U of 0 certainly exists. Let Vα be a balanced convex neighborhood of 0 ∈ Xα . Then U ≡ co Vα certainly satisfies the required α

properties.2 If we define B = {x + U |x ∈ X, U ∈ B0 }, a topology T of X, the base of which is B, is determined. By the construction of B0 , X endowed with the topology T is a locally convex topological vector space (not necessarily Hausdorff). This topology T is called the inductive limit topology. The relative topology of Xα induced by T is weaker than Tα . The inductive limit topology can be characterized in several ways. Consider a topology T on X which satisfies the following conditions: (a) X endowed with T is a locally convex topological vector space. (b) The relative topology Tα of Xα induced by T is weaker than Tα . (c) The embedding mapping Iα : Xα → X(α ∈ A) is continuous with respect to T . Then the inductive limit is characterized as follows. Theorem B.1 (i) The inductive limit topology T is the strongest one which satisfies (a) and (b). (ii) T is the strongest one which satisfies (a) and (c). Proof (i) Let T be a topology of X which satisfies (a) and (b). If U ∈ N(0, T ) is a balanced convex set, U ∩ Xα ∈ N(0, Tα )

for all

α ∈ A.

Since Tα is weaker than Tα , we also have 2 Let X be a vector space and M its subset. M is balanced if αx ∈ M for every x ∈ M and α ∈ C with |α| 1. M is said to be absorbing if there exists some α > 0 for every x ∈ X such that (1/α)x ∈ M. (cf. Maruyama [3] pp. 246–248, Schwartz [5] p. 7.) coM is the convex hull of M.

B.1 Inductive Limit Topology

359

U ∩ Xα ∈ N(0, Tα )

for all

α ∈ A.

Hence U ∈ N(0, T ). This proves that T is weaker than T . (ii) Let T be a topology of X which satisfies (a) and (c). If U ∈ N(0, T ) is a balanced convex set, U ∩ Xα ∈ N(0, Tα )

for all

α ∈ A,

by (c). Hence U ∈ N(0, T ).

The following several theorems show that seminorms, linear functionals and linear operators defined on (X, T ) are continuous if and only if they are continuous on (Xα , Tα )(α ∈ A). Theorem B.2 Let (Xα , Tα ), α ∈ A be LCHTVS’s. Suppose that the vector space Xα is endowed with the inductive limit topology T . The following two X = α∈A

statements are equivalent for any seminorm p on X: (i) p is continuous with respect to T . (ii) p is continuous on Xα for all α ∈ A. Proof (i)⇒(ii): If p is continuous on (X, T ), p is, of course, continuous on (Xα , Tα ) (where Tα is the relative topology of Xα induced by T ). Since Tα is stronger than Tα , p is continuous on (Xα , Tα ). (ii)⇒(i): If we assume (ii), the set {x ∈ X|p(x) 1} is balanced and convex, and satisfies {x ∈ X|p(x) 1} ∩ Xα ∈ N(0, Tα )

for all

α ∈ A.

Hence, by the definition of the topology of T , we have {x ∈ X|p(x) 1} ∈ N(0, T ). Thus p is continuous on (X, T ).3

Theorem B.3 (Xα , Tα ) and (X, T ) are the same as in Theorem B.2. The following two statements are equivalent for a linear functional Λ : X → C:

3 N(0) is the perfect family of neighborhoods of 0. The following five statements are equivalent for a seminorm p(·) on a topological vector space X. (i) p is continuous. (ii) {x ∈ X|p(x) < 1} is open. (iii) {x ∈ X|p(x) < 1} ∈ N(0). (iv) {x ∈ X|p(x) 1} ∈ N(0). (v)p is continuous at 0 (Schwartz [5] p. 10).

360

B Topics from Functional Analysis

(i) Λ is continuous with respect to T . (ii) Λ is continuous on Xα for all α. Proof Λ is continuous on (X, T ). ⇐⇒ |Λ(·)| is continuous seminorm on (X, T ). ⇐⇒ |Λ(·)| is continuous seminorm on (Xα , Tα ) for all α. (Theorem B.2)

⇐⇒

Λ is continuous on (Xα , Tα ) for all α.

Corollary B.1 (Xα , Tα ) and (X, T ) are the same as in Theorem B.2. Let Y be a locally convex topological vector space and T : X → Y a linear operator. Then the following two statements are equivalent: (i) T is continuous with respect to T . (ii) T is continuous on Xα for all α ∈ A. The proof is quite easy. It is enough to check q ◦ T for a continuous seminorm q on Y. The inductive limit topology T is called a strictly inductive limit topology if the following three conditions are satisfied in the definition of T : (i) A = N. (ii) Xn ⊂ Xn+1 and the topology Tn on Xn coincides with the relative topology on Xn , induced by Tn+1 . (iii) Xn is a closed subspace of Xn+1 . Remark B.1 1◦ (i) Let M be a subspace of a topological vector space X. Suppose that U is a convex, open, balanced neighborhood of 0 ∈ X, and V is a convex, open, balanced neighborhood of 0 ∈ M such that U ∩ M ⊂ V . Then co(U ∪ V ) is a convex, open, balanced neighborhood of 0 ∈ X. (ii) Let q be a continuous seminorm on M, and p a continuous seminorm on X such that q(x) p(x)

for all

x ∈ M.

Then there exists a continuous seminorm q˜ on X which satisfies a. q(x) ˜ = q(x) for all x ∈ M, b. q(x) ˜ p(x) for all x ∈ X. 2◦ The following results are generalizations of the facts discussed in Chap. 10. (i) Let X be a vector space. Suppose that Y and Z are orthogonal complements of each other; i.e. X = Y ⊕ Z. If P : X → Y is a projection, it satisfies P (X) = Y, P −1 ({0}) = Z and P 2 = P . Conversely, assume that P : X → X is a linear operator which satisfies P 2 = P . Then there exist uniquely a

B.1 Inductive Limit Topology

361

pair of subspaces Y, Z ⊂ X such that X = Y ⊕ Z and P is a projection of X into Y. (ii) Suppose that X is a Hausdorff topological vector space, and Y, Z and P are just as in (i). If we define a linear operator ϕ : Y × Z → X by ϕ : (x, y) → x + y, then ϕ is a injection. In the case that ϕ is a homeomorphism, Y and Z are said to be topological complements of each other. Y and Z are topological complements if and only if P is continuous. And Y and Z are closed subspaces of X if P is continuous. Furthermore, the converse holds good if dimY < ∞ or dimZ < ∞. (iii) Let X be a LCHTVS. If Y is a subspace of X such that either a. dimY < ∞, or b. Y is closed and codimY < ∞, then Y has a topological complement Z. Theorem B.4 (Continuous extension of seminorm) Let (Xn , Tn ) be LCHTVS’s. ∞ Xn is endowed with a strictly inductive limit topology T . Any continuous X= n=1

seminorm pn on (Xn , Tn ) can be extended as a continuous seminorm on (X, T ). Proof If we define a set V in Xn by V = {x ∈ Xn |pn (x) 1}, there exists a convex, open, balanced neighborhood U of 0 ∈ Xn+1 such that U ∩ Xn ⊂ V by the definition of the strictly inductive limit topology. If we denote by q the continuous seminorm on Xn+1 determined by U , it holds good that pn (x) q(x)

for all

x ∈ Xn .

Applying Remark B.1, 1◦ above, we know that pn can be extended to a continuous seminorm pn+1 on Xn+1 . Repeating the same process, we obtain a seminorm p on the entire space X. The restriction of p to each Xk (k n) is continuous. Hence p is continuous on X by Theorem B.2. It should be noted that the relative topology of Xn induced by (X, T ) coincides with Tn . This is an important implication of Theorem B.4.

362

B Topics from Functional Analysis

Corollary B.2 A strictly inductive limit topology is Hausdorff.4 Remark B.2 Let X be a vector space. Suppose that both {(X and {(X˜ m , T˜ m )} n , Tn )} X˜ m . Assume Xn = are increasing sequences of LCHTVS’s such that X = n

m

further that: (a) For any n ∈ N, there exists some m ∈ N such that Xn ⊂ X˜ m and the relative topology of Xn induced by T˜ m coincides with Tn . (b) For any m ∈ N, there exists some n ∈ N such that X˜ m ⊂ Xn and the relative topology of X˜ m induced by Tn coincides with T˜ m . Then the strictly inductive limit topology on X defined by {Xn } coincides with that defined by {X˜ m }. Theorem B.5 (Xn , Tn ) and (X, T ) are the same as in Theorem B.4. The following two statements are equivalent for a set A ⊂ X: (i) A is bounded in (X, T ), (ii) A is bounded in (Xn , Tn ) for some n. Proof It is sufficient to show (i)⇒(ii) because (ii)⇒(i) is obvious. Suppose that there is no Xn such that A ⊂ Xn . Then there exist a subsequence {X˜ m } of {Xm } and a sequence {x1 0, x2 , · · · , xm , · · · } of A which satisfy • x1 ∈ X˜ 1 , xm ∈ X˜ m \ X˜ m−1 ; ∞ X˜ m . • X=

m = 2, 3, · · ·

m=1

• {X˜ m } satisfies (a) and (b) in the above remark. Note that the strictly inductive limit topology on X determined by {X˜ m } coincides with T . Construct a sequence {pn } of seminorms by induction. First of all, choose a continuous seminorm p1 on X˜ 1 such that p1 (x1 ) = 1. Next assume that a continuous seminorm pm−1 on X˜ m−1 is constructed so that pm−1 |X˜ m−2 = pm−2 ,

pm−1 (xm−1 ) = m − 1.

We now define Z = span{xm } = Cxm (⊂ X˜ m ), M = span(X˜ m−1 ∪ Z).

0. Since x ∈ Xn for some n, there exists a continuous seminorm pn such that pn (x) = 1. Extend this pn , by Theorem B.2, to a continuous seminorm on X.

4 It is sufficient to find disjoint neighborhoods of 0 and x

B.1 Inductive Limit Topology

363

Applying Remark B.1, 2◦ (iii), we observe that X˜ m−1 and Z are topological complements in M of each other since X˜ m−1 is closed and dimM/X˜ m−1 = dimZ = 1. Hence X˜ m−1 × Z and X˜ m−1 ⊕ Z = M are homeomorphic. x ∈ M can be represented as x = y + λxm ;

λ ∈ C, y ∈ X˜ m−1 ,

or x = (y, λ), for brevity. If we define qm by qm (x) = pm−1 (y) + m|λ|,

x = (y, λ) ∈ X˜ m−1 ⊕ Z,

qm is a continuous seminorm on X˜ m−1 ⊕ Z = M and qm (xm ) = m. We can extend qm to a continuous seminorm pm on X˜ m , by Remark B.1, 1◦ (ii). Repeating this process, we obtain a seminorm p on X defined by p(x) = pm (x)

if x ∈ X˜ m .

Then p|X˜ m = pm (m ∈ N). Hence p is continuous on each X˜ m . It follows that p is continuous on X by Theorem B.2. However, p is not bounded on A. This contradicts the boundedness of A. Thus we have shown that A ⊂ Xn for some n. The next important consequence follows immediately from Theorem B.5. Corollary B.3 Let (Xn , Tn ) and (X, T ) be the same as in Theorem B.4. The following two statements are equivalent for any net {xα } in X: (i) {xα } converges to some x ∗ ∈ X in the strict inductive limit topology. (ii) {xα } is contained in some Xn and it converges to x ∗ ∈ Xn . Definition B.1 A closed, convex, balanced and absorbing set in a topological vector space is called a barrel. A LCHTVS is called a barreled space if every barrel is a neighborhood of the origin 0. The inductive limit topology of barreled spaces is also barreled. We now show it successively. Theorem B.6 A LCHTVS X is barreled if and only if every lower semi-continuous (l.s.c.) seminorm is continuous. Proof Assume first that X is a barreled LCHTVS, and a seminorm p is l.s.c. on X. If we define A = {x ∈ X|p(x) 1} then A is clearly convex, balanced and absorbing. It is closed, since p is l.s.c. Hence A is a barrel and a neighborhood of 0. Therefore p is continuous. (cf. footnote 3 on p. 359.)

364

B Topics from Functional Analysis

Conversely, assume that every l.s.c. seminorm on X is continuous. Let A be a barrel. If we define p(x) = inf{λ > 0|x ∈ λA}, then p(x) is finite, since A is absorbing. It is also easily seen that p is a seminorm.5 We next show that A = {x ∈ X|p(x) 1}. In case of x ∈ A, it is obvious that p(x) 1. If p(x) < 1, we must have x ∈ A, since A is balanced. If p(x) = 1, (1 − ε)x ∈ A

for all

ε ∈ (0, 1).

So we must have x ∈ A, since A is closed. Furthermore, p is l.s.c. again by the closedness of A. Thus p is continuous, and A is a neighborhood of 0. Hence X is barreled. Lemma B.1 Let X be a Baire space6 and a function f : X → R be l.s.c. Then for any open set U in X, there exists an open set V ⊂ U such that sup f (x) = M < ∞. x∈V

Proof U is a Baire space, since it is an open set in a Baire space.7 We restrict our attention to U . If we define Fn = {x ∈ U |f (x) n};

n = 1, 2, · · · ,

we observe that int.Fn ∅ for some n by the Baire category theorem. If we define V = int.Fn , it satisfies the required property. Remark B.3 Let X be a Baire space and {fα : X → R} a family of continuous functions such that sup|fα (x)| < ∞ α

5 In

general, suppose that K is a balanced, absorbing and convex neighborhood of 0 of a LCHTVS. If we define x pK (x) = inf r > 0 ∈ K , r

pK (·) is a seminorm. It is called the Minkowski functional determined by K (cf. Maruyama [3] pp. 250–251 and Yosida [7] p. 24). 6 A topological space X is called a Baire space if X\M = X for any set M ⊂ X of the first category. (cf. Maruyama [3] pp. 28–33.) 7 Maruyama [3] Theorem 1.14, p. 30.

B.1 Inductive Limit Topology

365

for each x ∈ X. Then {fα } is uniformly bounded on some open set V in X. (Consider f (x) = sup|fα (x)|.) This is a slight generalization of Osgood’s uniform α

boundedness principle. Theorem B.7 If a LCHTVS X is a Baire space, it is barreled. Proof Let p be a l.s.c. seminorm on X. Then, by Lemma B.1, there exists an open set V ∈ X such that sup p(x) = M < ∞. x∈V

Pick up any x ∈ V and define W = V − x. Then W is an open neighborhood of 0. We observe that p(w) 2M

for any w ∈ W,

since w = v − x for some v ∈ V . Hence 1 W ⊂ {x ∈ X|p(x) 1}. 2M It tells us that {x ∈ X|p(x) 1} is a neighborhood of 0. So p is continuous. By Theorem B.6, we conclude that X is barreled. Corollary B.4 Any Banach space is barreled. Any Fréchet space is barreled. Example B.1 C([0, 1], R) endowed with L1 -norm is not barreled. (Check U = {f ∈ C([0, 1], R) | f ∞ 1}.) Theorem B.8 Suppose that X is a vector space and {Xα }α∈A is a family of vector Xα . If every Xα is barreled and X is endowed with subspaces of X such that X = α∈A

the inductive limit topology, then X is also barreled. Proof If A is a barrel in X, A ∩ Xα is a barrel in Xα for all α. Since Xα is barreled, A ∩ Xα is a neighborhood of 0 in Xα . Consequently, A is a neighborhood of 0 in X by the definition of the inductive limit topology. So X is barreled. By Corollary B.4 and Theorem B.8, it is clear that E(Ω), DK (Ω) and D(Ω), which play crucial roles in the theory of distributions, are all barreled. (See Corollary C.1 for details.) Definition B.2 A barreled space is called a Montel space if any bounded set is relatively compact. It will be discussed later on that D(Ω) and E(Ω) are typical Montel spaces. (cf. Theorem C.5, p. 384.)

366

B Topics from Functional Analysis

B.2 Duals of Locally Convex Spaces Definition B.3 Let X be a topological vector space (TVS) and Y a locally convex topological vector space. We denote by L(X, Y) the set of all the continuous linear operators of X into Y. There are three ordinary topologies on L(X, Y): 1◦ Simple topology: If we define px,q by px,q (T ) = q(T x);

T ∈ L(X, Y)

(B.1)

for x ∈ X and a continuous seminorm q on Y, then px,q is a seminorm on L(X, Y). The topology on L(X, Y) generated by the family of seminorms {px,q |x ∈ X, q is a continuous seminorm on Y} is called the simple topology. In the special case of Y = C, the simple topology is nothing other than the weak*-topology. 2◦ Compact topology: If we define pK,q by pK,q (T ) = sup q(T x);

T ∈ L(X, Y)

(B.2)

x∈K

for a compact set K ⊂ X and a continuous seminorm q on Y, pK,q is a seminorm on L(X, Y). The topology on L(X, Y) generated by the family of seminorms of the form (B.2) is called the compact topology. 3◦ Strong topology: If we define pB,q by pB,q (T ) = sup q(T x);

T ∈ L(X, Y)

(B.3)

x∈B

for a bounded set B ⊂ X and a continuous seminorm q on Y, pB,q is a seminorm on L(X, Y). The topology on L(X, Y) generated by the family of seminorms of the form (B.3) is called the strong topology. Theorem B.9 Let X be a LCHTVS. The following two statements are equivalent: (i) X is barreled. (ii) Any w ∗ -bounded set in X is equi-continuous. Proof (i)⇒(ii): Let A be a w∗ -bounded set in X . The seminorm x → |Λx| defined for each Λ ∈ A is continuous. Hence if we define pA (x) ≡ sup |Λx|, Λ∈A

pA (x) is finite (by w ∗ -boundedness of A) and l.s.c. Since X is barreled, pA is continuous by Theorem B.6. Consequently, for each ε > 0, there exists a neighborhood U of 0 ∈ X, such that

B.2 Duals of Locally Convex Spaces

367

sup pA (x) = sup sup |Λx| ε. x∈U

x∈U Λ∈A

That is, A is equi-continuous. (ii)⇒(i): Conversely, assume (ii). If B is a barrel in X,8 B ◦ = {Λ ∈ X || Λx| 1 for all x ∈ B} is w ∗ -bounded in X . It can be proved as follows. Since B is absorbing, we have, for any z ∈ X, λz ∈ B for some λ ∈ C. Hence sup |Λz| = sup

Λ∈B ◦

Λ∈B ◦

1 1 |Λ(λz)| . |λ| |λ|

This holds good for any z ∈ X. Thus B ◦ is w ∗ -bounded. Hence B ◦ is equicontinuous by (ii). Defining B ◦◦ = {x ∈ X| |Λx| 1 for all Λ ∈ B ◦ }, we obtain B = B ◦◦ by the Hahn–Banach theorem. It follows that9 B = B ◦◦ ∈ N(0).

That is, X is barreled.

Remark B.4 For any set A ⊂ X, in general, A◦◦ is the smallest closed, convex and balanced set containing A. The proof is not so hard. A◦◦ ⊃ coA is obvious, since A◦◦ is a closed, convex and balanced set containing A. Suppose that A◦◦ ⊂ coA is not true. Then there exists some x0 ∈ A◦◦ \ coA. By the Hahn–Banach theorem, there exists some Λ ∈ X which satisfies B {Λx0 } = ∅. Λ(coA) The shape of Λ(coA) is: • a closed disc with center 0 in C in the case of complex vector space; • a closed segment with center 0 in R in the case of real vector space. In any case, Λ(coA) = {z ∈ C or R | |z| ρ}

A be a set in X. The polar set A◦ of A is defined by A◦ = {Λ ∈ X | |Λx| 1 for all x ∈ A}. We list several simple properties of A◦ : (i) {0}◦ = X , X◦ = {0} ∈ X . (ii) A ⊂ B ⇒ B ◦ ⊂ A◦ . (iii) (λA)◦ = (1/λ)A◦ for λ ∈ C \ {0}. (iv) (A ∪ B)◦ = A◦ ∩ B ◦ . (cf. Schwartz [5] pp. 56–57.) 9 For any 0 < ε < 1, there exists some V ∈ N(0) such that |Λx| < ε < 1 for all x ∈ V and all Λ ∈ B ◦ by the equi-continuity of B ◦ . Hence V ⊂ B ◦◦ . Thus we obtain B ◦◦ ∈ N(0). 8 Let

368

B Topics from Functional Analysis

for some ρ 0. Since Λx0 is located outside this set, |Λx0 | > ρ. Defining Γ = Λ/ρ, we have Γ ∈ A◦ , since |Γ x| 1 for any x ∈ coA. On the other hand, |Γ x0 | > 1. This contradicts x0 ∈ A◦◦ . Hence it follows that A◦◦ ⊂ coA. Theorem B.10 Suppose that X is a barreled space and Y is a LCHTVS. Then any bounded set in L(X, Y) with respect to the simple topology (s-bounded for brevity) is equi-continuous. Proof Let H ⊂ L(X, Y) be s-bounded; i.e. H (x) ≡ {T x ∈ Y|T ∈ H } is bounded in Y for any x ∈ X. We wish to show that U≡

B

T −1 (V ) ∈ NX (0)

T ∈H

for any V ∈ NY (0). (NX (0) and NY (0) are the perfect family of neighborhoods of 0 in X and Y, respectively.) Without loss of generality, we may assume that V is closed, convex and balanced. Since T −1 (V ) is closed, convex and balanced for any T ∈ H , so is U=

B

T −1 (V ).

T ∈H

By the s-boundedness of H , there exists some λ ∈ R, for each x ∈ X, which satisfies λH (x) = {T (λx) ∈ Y|T ∈ H } ⊂ V . That is, λx ∈ T −1 (V ). Thus we have verified that U is absorbing. So U is a barrel. Since X is barreled, U ∈ NX (0). Remark B.5 Suppose that X and Y {0} are LCHTVS’s. If every s-bounded set in L(X, Y) is equi-continuous, U is barreled. (The converse of Theorem B.10.) It is proved by applying Theorem B.9 to obtain a contradiction. We now proceed to the generalized Ascoli–Arzelà theorem and then to the Banach–Steinhaus theorem. Theorem B.11 (Ascoli–Arzelà) Let X be a topological space and Y a locally convex topological vector space. Suppose that F is an equi-continuous subset of C(X, Y), the space of continuous functions of X into Y. (i) The closure F¯ of F with respect to the pointwise convergence topology is an equi-continuous set in C(X, Y). (ii) The following three topologies on F (similarly for F¯ ) coincide:

B.2 Duals of Locally Convex Spaces

369

a. topology defined by pointwise convergence on a dense subset of X : (Ta ). b. pointwise convergence topology on X : (Tb ). c. uniform convergence on compacta topology on X : (Tc ). (iii) If F (x) = {f (x) ∈ Y|f ∈ F } is relatively compact in Y for each x ∈ X, then F is relatively compact in Tc . Proof 10 (i) Let x ∈ X and V ∈ NY (0). There exists some balanced W ∈ NY (0) such that W + W + W ⊂ V. Since F is equi-continuous on X, there exists some U ∈ NX (x) such that f (z) − f (x) ∈ W

for all

z ∈ U, f ∈ F .

If f˜ ∈ F¯ , there exists some f ∈ F , for each z ∈ U , such that f˜(z) − f (z) ∈ W,

f˜(x) − f (x) ⊂ W.

Therefore f˜(z) − f˜(x) = (f˜(z) − f (z)) + (f (z) − f (x)) + (f (x) − f˜(x)) ∈ W + W + W ⊂ V. Hence F¯ is equi-continuous on X. (And so, F¯ ⊂ C(X, Y).) (ii) It is obvious that Ta ⊂ Tb ⊂ Tc . So it is enough to show Tc ⊂ Ta ; i.e. N(f, Tc ) ⊂ N(f, Ta ) for each f ∈ F . Without loss of generality, we may assume f = 0. Let D = {zα } be a dense subset of X. Suppose that K ⊂ X is compact, and p is a continuous seminorm on Y. Then qK,p (f ) = sup p(f (x));

f ∈F

x∈K

is a continuous seminorm on F with respect to Tc . We now show that S ≡ {f ∈ F |qK,p (f ) < 1} belongs to N(0, Ta ). 10 Schwartz

[5] pp. 78–80.

370

B Topics from Functional Analysis

Define V ∈ NY (0) by 1 . V = y ∈ Yp(y) < 4 By the equi-continuity of F , there exists some Ux ∈ NX (x) for each x ∈ K such that f (Ux ) − f (x) ⊂ V

for all

f ∈ F.

Since {Ux |x ∈ K} is an open covering of K, it has a finite subcovering {Ux1 , · · · , Uxn }. D ∩ Uxj ∅ for j = 1, 2, · · · , n because D¯ = X. Pick up zαj ∈ D ∩ Uxj and we write zαj = zj for notational simplicity. It is clear that f (Uxj )−f (zj ) = (f (Uxj )−f (xj ))+(f (xj )−f (zj )) ⊂ V +V

for all

Hence there exists some zj ∈ D for each x ∈ K such that p(f (x) − f (zj )) <

1 2

for all

f ∈ F.

Thus it follows that sup Min p(f (x) − f (zj ))

x∈K 1j n

1 . 2

Defining n B 1 T = f ∈ F qzj ,p (f ) < , 2 j =1

we obtain T ∈ N(0, Ta ). For f ∈ T , x ∈ K, we have p(f (x)) p(f (x) − f (zj )) + p(f (zj )) and qK,p (f ) = sup p(f (x)) < x∈K

1 1 + = 1. 2 2

This implies f ∈ S, that is, T ⊂ S. We conclude that S ∈ N(0, Ta ).

f ∈ F.

B.2 Duals of Locally Convex Spaces

371

(iii) F (x) = {f (x) ∈ Y|f ∈ F } is relatively compact in Y. So, by Tihonov’s theorem, A F (x) x∈X

is relatively compact in YX . Thus F is relatively compact with respect to Tc by (i) and (ii). The generalized version of the Banach–Steinhaus theorem can be proved based upon Theorem B.11. Theorem B.12 (Banach–Steinhaus) Let X be a barreled space and Y a LCHTVS. Suppose that a sequence {Tn } in L(X, Y) has a pointwise limit, that is lim Tn (x) ≡ n→∞

T (x) exists for each x ∈ X. Then T ∈ L(X, Y) and {Tn } converges to T with respect to the compact topology. Proof {Tn } is bounded in L(X, Y) with respect to the simple topology, since lim Tn (x) exists for each x ∈ X. Since X is barreled, {Tn } is equi-continuous by

n→∞

Theorem B.10. Furthermore, thanks to (i) and (ii) of Theorem B.11, T ∈ L(X, Y) and {Tn } converges to T with respect to the compact topology.

We are led to the resonance theorem that the boundedness (with respect to the original topology) and the weak boundedness coincide in a locally convex space. We make use of a quotient space of a locally convex space. We shall explain a few basic points. Let X be a locally convex topological vector space and M a subspace of X. An element of the quotient space X/M, say a residue class containing x, is denoted by ξx . (i) Let p be a seminorm on X. If we define p˙ : X/M → R by p(ξ ˙ x ) = inf p(z), z∈ξx

then p˙ is a seminorm, called the quotient seminorm on X/M. The following equality holds good: {ξx ∈ X/M|p(ξ ˙ x ) < 1} = {ξx ∈ X/M|p(x) < 1}. (ii) If the topology on X is generated by a family {pα } of seminorms, the topology on X/M generated by {p˙ α } coincides with the quotient topology. (iii) The quotient topology is Hausdorff if and only if M is a closed subspace. (iv) If X is a Fréchet space and M is a closed subspace, then X/M is also a Fréchet space. (v) If X is a barreled space and M is a closed subspace, then X/M is also a barreled space. (vi) (X/M) M◦ .

372

B Topics from Functional Analysis

Theorem B.13 (Banach–Mackey Resonance Theorem) Let X be a locally convex topological vector space (not necessarily Hausdorff) and M a subset of X. Then M is bounded if and only if it is weakly bounded. Proof 1◦ It is well-known that the theorem holds good in the case of a normed space.11 2◦ We next consider a locally convex space X, the topology of which is defined by a single seminorm p. Since N = {x ∈ X|p(x) = 0} is a closed subspace of X, X/N is a locally convex topological vector space. The quotient seminorm p˙ defined by p actually turns out to be a norm. If z ∈ ξx , z=x+y

for some y ∈ N.

Hence p(z) p(x) + p(y) = p(x). We also obtain p(x) p(z) similarly. It follows that p(x) = p(z); that is, p is constant on each ξx ∈ X/M. Hence p(x) = p(ξ ˙ x ). It follows that p(ξ ˙ x ) = 0 ⇔ ξx = N. This proves that p˙ is a norm. For Λ ∈ X , there exists a constant C such that |Λ(x)| C · p(x)

for all

x ∈ X.

(It is verified in a similar way in the case of a normed space.) Since p(x) = 0 for x ∈ N, Λ(x) = 0

for all

x ∈ N.

Corresponding to Λ ∈ X , we define (cf. Fig. B.1) ˙ x ) = Λ(x ˙ + N) = Λ(x). Λ(ξ

11 Yosida

[7] p. 69, and Maruyama [3] p. 351.

B.2 Duals of Locally Convex Spaces

373

Fig. B.1 Definition of Λ˙

Then ˙ x )| < 1} = {ξx ∈ X/N | Λ(x) < 1} = π({x ∈ X | Λ(x) < 1}) {ξx ∈ X/N | |Λ(ξ (π : X → X/N is the canonical projection). This set is open, since π is an open mapping. Hence it is a neighborhood of 0 ∈ X/N. Thus Λ˙ is continuous and Λ˙ ∈ (X/N) . If we define a linear operator A : X → (X/N) by ˙ A : Λ → Λ, it is clear that KerA = {0} ⊂ X . That is, A is a linear bijection. Hence X and (X/N) are isomorphic. So M is weakly bounded in X if and only if M˙ ≡ {ξx |x ∈ M} is weakly bounded in X/N. Since X/N is a normed space, M˙ is strongly bounded in X/N; i.e. p(ξ ˙ x) B for some B < ∞. This means that

for all

x∈M

374

B Topics from Functional Analysis

p(x) B

for all

x ∈ M.

3◦ Finally, let X be a general locally convex space. Suppose that a family {pα } of seminorms defines the topology on X. We denote by Xα the locally convex space, the topology on which is defined by a single seminorm pα . The original topology on X is, of course, stronger than the topology on Xα . Hence σ (X, X ) is stronger than σ (X, Xα ). It follows that, if M is weakly bounded in X, it is also weakly bounded in Xα . M is bounded in Xα by 2◦ . This holds good for all α, and so sup pα (x) < ∞ for all

α.

x∈M

Thus M is bounded on X.

Remark B.6 The above theorem does not hold good if X is not locally convex. To illuminate the situation by an example, let X = L1/2 . Then X = {0}. It follows that any set in X is weakly bounded. However, X is not bounded with respect to the original topology.12 Corollary B.5 Any weakly convergent sequence in a locally convex topological vector space is bounded. Definition B.4 Let X be a locally convex topological vector space. X is said to be semi-reflexive if the dual space X of the locally convex topological vector space X endowed with the strong topology coincides with X. X is called reflexive if X is semi-reflexive and the strong topology on X coincides with the topology on X. The following two results are most basic to characterize semi-reflexivity and reflexivity. Theorem B.14 (Mackey) Let X be a LCHTVS. The following two statements are equivalent: (i) X is reflexive. (ii) Any bounded set in X is weakly relatively compact. Theorem B.15 (reflexive vs semi-reflexive) Let X be a LCHTVS. The following two statements are equivalent. (i) X is reflexive. (ii) X is semi-reflexive and barreled. We prepare a couple of lemmata for the proofs. Lemma B.2 Let X be a LCHTVS and (X )∗ the algebraic dual space of X , endowed with the topology σ ((X )∗ , X ). Then f ∈ (X )∗ belongs to (X ) (the topology of X 12 This

example is due to Rudin [4] p. 82. See also the revised version, pp. 35–36.

B.2 Duals of Locally Convex Spaces

375

is the strong one) if and only if f belongs to the closure of some bounded set in X. (Note that we regard X as a subspace of (X )∗ .) Proof For f ∈ (X )∗ , we obtain f ∈ X ⇔ f ∈ V ◦ = {u ∈ (X )∗ |sup |u(v)| 1}

V ∈ NX (0).

for some

v∈V

A set V ⊂ X is a neighborhood of 0 (with respect to the strong topology) if and only if there exists a bounded set B in X such that B ◦ ⊂ V . Without loss of generality, we may assume that B is also closed, convex and balanced. Thus f ∈ (X )∗ belongs to (X ) if and only if there is a bounded set B in X such that f ∈ B ◦◦ . ¯ we have f ∈ B ◦◦ = B. ¯ Hence f ∈ X follows from the above result. If f ∈ B, Conversely, if f ∈ (X ) , it follows that f ∈ B ◦◦ for some bounded set B in X. Since ¯ Hence we may admit that B is closed, convex and balanced, we obtain B ◦◦ = B. ¯ f ∈ B. Lemma B.3 Suppose that X is a locally convex topological vector space and V is a neighborhood of 0 ∈ X. Then V ◦ is w ∗ -compact. Proof Let V be a neighborhood of 0 ∈ X. Then V ◦ is equi-continuous.13 Since V is absorbing, V ◦ (x) = {Λ(x)|Λ ∈ V ◦ } is a bounded set in C for all x ∈ X. Hence V ◦ is w ∗ -compact by Theorem B.11, (ii) and (iii). Proof of Theorem B.14 (i)⇒(ii): Let X be semi-reflexive and B a bounded set in X. B ◦ is a neighborhood of 0 in X with respect to the strong topology. By Lemma B.3, B ◦◦ is σ (X , X )-compact in X = X. Hence B ⊂ B ◦◦ is weakly relatively compact. (ii)⇒(i): Assume that any bounded set B ⊂ X is weakly relatively compact. If we regard X as a subspace of (X )∗ , the weak topology σ (X, X ) coincides with the relative topology of X induced by σ ((X )∗ , X ). We denote by B

σ (X,X )

the weak closure of B in X, and by B

σ ((X )∗ , X )-closure

(X )∗ ,

of B in by assumption, it follows that

B

13 Let

respectively. Since B

σ (X,X )

=B

σ ((X )∗ ,X )

σ (X,X )

If H = V ◦ , in particular, V ◦ is equi-continuous.

the

is weakly compact

.

X be a LCHTVS and H ⊂ X . The following statements are equivalent:

a. H is equi-continuous. b. H ⊂ V ◦ for some V ∈ NX (0). c. H ◦ ∈ NX (0).

σ ((X )∗ ,X )

376

B Topics from Functional Analysis

If f ∈ X , there exists a bounded set B ⊂ X which satisfies f ∈B

σ ((X )∗ ,X )

=B

σ (X,X )

(⊂ X)

by Lemma B.2. Hence X ⊂ X. The converse inclusion X ⊂ X is obvious. So X = X ; that is X is semi-reflexive. We now proceed to the proof of Theorem B.15. Lemma B.4 Let X be a locally convex topological vector space. If V is a barrel in X with respect to the w ∗ -topology, then V is a neighborhood of 0 in X with respect to the strong topology. Proof Since V is absorbing, it is easy to see that V ◦ is weakly bounded. V ◦ is also bounded with respect to the original topology by Theorem B.13. Consequently, V = V ◦◦ is a neighborhood of 0 in X with respect to the strong topology. Lemma B.5 If X is semi-reflexive, X (endowed with the strong topology) is barreled. Proof Let V be a barrel in X (endowed with the strong topology). Since X = X , σ (X , X) = σ (X , X ). Hence V is a barrel with respect to the w ∗ -topology. By Lemma B.4, V is a neighborhood of 0 in X with respect to the strong topology. Proof of Theorem B.15 If X is reflexive, so is X (endowed with the strong topology). Hence X is semi-reflexive and X = X is barreled by Lemma B.3 and Lemma B.4. Conversely, suppose that X is semi-reflexive, and barreled. Compare X endowed with the original topology and (X ) . Let V be a neighborhood of 0 in X with respect to the original topology, and W a neighborhood of 0 in (X ) . We may assume that both V and W are closed, convex and balanced without loss of generality. V ◦ is equi-continuous since V ∈ NX (0) (cf. footnote 13 on p. 375), and V ◦ (x) = {Λ(x)|Λ ∈ V ◦ } is a bounded set in C for all x ∈ X because V is absorbing. Hence V ◦ is w ∗ relatively compact by Theorem B.11. By X = X (semi-reflexivity), σ (X , X) = σ (X , X ). Therefore V ◦ is weakly relatively compact and so weakly bounded. By Theorem B.13, V ◦ is also bounded with respect to the original topology. It follows that V ◦◦ ∈ NX (0). Thus we know that the topology on X is stronger than the original topology on X. (The assumption that X is barreled is not used here.)

References

377

Suppose next that B is a bounded set in X and define W = {x ∈ X|pB (x) 1}, where pB (x) = sup |Λ(x)|. Λ∈B

This means that W = B ◦ . W is a closed, convex and balanced set with respect to the original topology. It is easy to see that W is absorbing by the boundedness of B. Hence W is a barrel, and so W ∈ NX (0). Thus it is verified that the original topology of X is stronger than that of X . Combining with the previous result, we conclude that X is reflexive. Remark B.7 1◦ A Montel space is reflexive. 2◦ If X is a reflexive locally convex topological vector space, X (endowed with the strong topology) is also reflexive. Theorem B.16 (dual of Montel space) If X is a Montel space, X (endowed with the strong topology) is also a Montel space. Proof Since X is a Montel space, X is reflexive by the above remark. Hence X is barreled by Theorem B.15. There remains to show that any bounded set in X is relatively compact. Let A be a bounded set in X . Since X is barreled, A is equi-continuous by Theorem B.10. A is weakly relatively compact by the reflexivity of X and the boundedness of A. According to Theorem B.11, the w ∗ -topology (= the weak topology) coincides with the compact-topology on A. Hence A is relatively compact with respect to the compact topology. Any bounded set is relatively compact, since X is a Montel space. Therefore the strong topology and the compact topology coincide on X . Consequently, A is relatively compact with respect to the strong topology.

References 1. Bourbaki, N.: Eléments de mathématique; Espaces vectoriels topologiques. Hermann, Paris, Chaps. 1–2 (seconde édition) (1965), Chaps. 3–5 (1964). (English edn.) Elements of Mathematics; Topological Vector Spaces. Springer, Berlin/Heidelberg/New York (1987) 2. Grothendieck, A.: Topological Vector Spaces. Gordon and Breach, New York (1973) 3. Maruyama, T.: Kansu Kaisekigaku (Functional Analysis). Keio Tsushin, Tokyo (1980) (Originally published in Japanese) 4. Rudin, W.: Functional Analysis. McGraw-Hill, New York (1973)

378

B Topics from Functional Analysis

5. Schwartz, L.: Functional Analysis. Courant Institute of Mathematical Sciences, New York University, New York (1964) 6. Schwartz, L.: Théorìe des distributions. Nouvelle édition, entièrement corrigée, refondue et augmentée, Hermann, Paris (1973) 7. Yosida, K.: Functional Analysis, 3rd edn. Springer, New York (1971)

Appendix C

Theory of Distributions

In this appendix, we discuss basic elements of the theory of distributions due to L. Schwartz, which are required to understand the text rigorously. We freely make use of various concepts and results in abstract functional analysis which were supplied in Appendix B.1

C.1 The Space D A distribution is a continuous linear functional on the space D consisting of very smooth functions. We start by constructing the basic space D. Let Ω be a nonempty open set in Rl . We denote2 by E(Ω) the set of all the complex-valued functions defined on Ω which are differentiable infinitely many times. We restrict our attention to the case of l = 1 for the sake of simplicity. D s denotes the differential operator to take the s-th derivative. We define a seminorm pK,m on E(Ω) by pK,m (ϕ) = sup |D s ϕ(x)| x∈K sm

for each compact set K ⊂ Ω and m ∈ N ∪ {0}. E(Ω) becomes a LCHTVS endowed with the topology generated by the family {pK,m |K ⊂ Ω is compact, m ∈ N ∪ {0}}. Theorem C.1 (space E(Ω)) E(Ω) is a Fréchet space. Proof We can construct a sequence {K1 , K2 , · · · } of compact sets in Ω which satisfies 1 cf. 2 Of

Schwartz [6] Chaps. I–III, Yosida [8] pp. 46–52, 62–64. course, we may write C∞ (Ω, C). But we follow the tradition of distribution theory here.

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8

379

380

C Theory of Distributions

(a) Kn ⊂ int.Kn+1 , ∞ (b) Kn = Ω. n=1

(For instance, the sequence defined by B 1 n = 1, 2, · · · Kn = Bn (0) x ∈ R ρ(Ω c , x) n does work.) Then any compact set in Ω is contained in some Kn . Hence the family {pKn ,m |m, n ∈ N} of countable seminorms determines a topology on E(Ω). E(Ω) endowed with this topology is metrizable. We next show the completeness. Let {ϕn } be a Cauchy sequence in E(Ω). Then for any differential operator D s , {D s ϕn } converges to some function gs : Ω → C uniformly in wide sense. In particular, ϕn (x) = D 0 ϕn (x) → g0 (x) ;

x ∈ Ω.

It follows that gs = D s g0 . Consequently, g0 ∈ E(Ω)

and

ϕn → g0

in E(Ω).

Thus E(Ω) is complete. Theorem C.2 (i) A bounded set in E(Ω) is totally bounded. (ii) E(Ω) is not normable.

Proof (i) Let {pKn ,m } be a family of countable seminorms which defines the topology on E(Ω). If A is a bounded set in E(Ω), then sup pKn ,m (ϕ) ≡ CKn,m < ∞ for each

n, m ∈ N ∪ {0}.

ϕ∈A

For any sequence {ϕj } in A, {D s ϕj } is uniformly bounded and equi-continuous on each Kn and for each differential operator D s . Then applying the Ascoli– Arzelà theorem (Theorem B.11, p. 368) and Cantor’s diagonal method, {ϕj } has a convergent subsequence in E(Ω). That is, A is sequentially compact in E(Ω), and so totally bounded. It is recommended that readers prove (ii).3

3 Let X be a Hausdorff topological vector space. The topology is defined by a norm if and only if there exists a convex and bounded neighborhood of 0.

C.1 The Space D

381

Let K be a compact set in Ω. We define a subspace DK (Ω) of E(Ω) by DK (Ω) = {ϕ ∈ E(Ω)|supp ϕ ⊂ K}. The relative topology of DK (Ω) induced by E(Ω) is determined by the family pK,m (ϕ) = sup |D s ϕ(x)|;

m ∈ N ∪ {0}

x∈K sm

of seminorms, and DK (Ω) becomes a LCHTVS. Theorem C.3 (space DK ) DK (Ω) is a Fréchet space. Proof Define a linear functional Λx : E(Ω) → C(x ∈ Ω) by Λx : ϕ → ϕ(x). Λx is continuous by the definition of the topology on E(Ω). Since DK (Ω) =

B

KerΛx ,

x∈Ω\K

DK (Ω) is a closed subspace of E(Ω). So DK (Ω) is a Fréchet space.

We denote by D(Ω) the set of all elements of E(Ω) with compact support;4 i.e. D(Ω) = {ϕ ∈ E(Ω)| supp ϕ is compact} = {DK (Ω)| K is compact subset in Ω}. If we choose a sequence {Kn } of compact sets in Ω so that it satisfies (a) and (b) on p. 380, the sequence {DKn (Ω)} of Fréchet spaces determines the strictly inductive limit topology on D(Ω) =

∞

DKn (Ω).

n=1

This topology is determined independently of the choice of {Kn } (cf. Remark B.2, p. 362). Elements of D(Ω) are called test functions.

may use a more ordinary notation C∞ 0 (Ω, C) instead of D(Ω). Again we follow here the tradition of distribution theory.

4 We

382

C Theory of Distributions

The basic properties of D(Ω) are summarized in the next theorem; they are direct consequences of the general theory of inductive limit topology (cf. Appendix B, Sect. B.1). Theorem C.4 (basic properties of D(Ω)) (i) D(Ω) is a LCHTVS. (ii) Let K be a compact set in Ω. Then the relative topology of DK (Ω) induced by D(Ω) coincides with the original topology DK (Ω). (iii) B ⊂ D(Ω) is bounded if and only if B is a bounded set in DK (Ω) for some compact set K ⊂ Ω. (iv) A net {ϕα } in D(Ω) converges to ϕ ∗ ∈ D(Ω) if and only if there exists some compact set K ⊂ Ω (independent of α) such that suppϕα ⊂ K

for all α

and D s ϕα → D s ϕ ∗

uniformly

ϕα → ϕ ∗

(i.e.

in DK (Ω))

for any differential operator D s . (v) D(Ω) is complete. Proof (i)–(iv) are obvious in view of the general theory discussed in Appendix B.1. So it is enough to examine only (v).5 Let {ϕα } be a Cauchy net in D(Ω). The identity mapping I : D(Ω) → E(Ω) is continuous. Hence {ϕα } is also a Cauchy net in E(Ω). Since E(Ω) is a Fréchet space by Theorem C.1, there exists some ϕ ∗ ∈ E(Ω) such that ϕα → ϕ ∗

in E(Ω).

We have only to show that the support of ϕ ∗ is compact. Suppose that the support of ϕ ∗ is not compact. There exist a sequence {Kn } of compact sets, which satisfies (a) and (b) on p. 380, and a sequence {xn } such that xn ∈ Kn \ Kn−1 ,

5 We

ϕ ∗ (xn ) 0

owe this to Choquet [2] Vol.I, pp. 305–306.

(n = 1, 2, · · · ).

C.1 The Space D

383

If we define |ϕ ∗ (xn )| , V = ϕ ∈ D(Ω)ϕ ∈ DKn (Ω) ⇒ ϕ ∞ < 2 V is a neighborhood of 0 in D(Ω). Since {ϕα } is Cauchy, there exists some α0 such that ϕα − ϕβ ∈ V

for all α, β ( α0 .

Since the support of each ϕα is compact, ϕα0 (xn ) = 0 for sufficiently large n. Hence (setting β = α0 ) we have |ϕα (xn )| <

|ϕ ∗ (xn )| 2

for all

α ( α0 .

This, however, contradicts ϕα (x) → ϕ ∗ (x)

pointwise. 6

Remark C.1 If {Kn } is a sequence of compact sets in Ω such that Kn ⊂ int. Kn+1 ,

∞

Kn = Ω,

n=1

then it is clear that DKn (Ω) ⊂ DKn+1 (Ω);

∞

DKn (Ω) = D(Ω).

n=1

(i) Each DKn (Ω) is a closed subspace of D(Ω). (ii) int.DKn (Ω) = ∅ for each n. (iii) D(Ω) is not metrizable. The next corollary immediately follows from Theorem C.1, Theorem C.3, Corollary B.4 (p. 365) and Theorem B.8 (p. 365). Corollary C.1 E(Ω), DK (Ω) and D(Ω) are barreled spaces.

6 It

is much easier to show that any Cauchy sequence in D(Ω) converges in D(Ω) itself.

384

C Theory of Distributions

Theorem C.5 D(Ω) and E(Ω) are Montel spaces. Proof Let B a bounded set in D(Ω). By Theorem B.5 (p. 362), B is a bounded set in DK (Ω) for some compact K ⊂ Ω. Note that DK (Ω) is metrizable. Let {ϕj } be a sequence in B. Then {D s ϕj } is uniformly bounded and equi-continuous on K for any differential operator D s . Hence {ϕj } has a subsequence convergent in DK (Ω) by the Ascoli–Arzelà theorem (Theorem B.11, p. 368) and Cantor’s diagonal process. That is, B is sequentially compact in DK (Ω), and so relatively compact in DK (Ω). Since the identical embedding I : DK (Ω) → D(Ω) is continuous (Theorem B.1, p. 358), I (cl.DK (Ω) B) = cl.DK (Ω) B is a compact set in D(Ω), which contains B. Hence B is relatively compact. We can show that E(Ω) is a Montel space in a similar way.

C.2 Examples of Test Functions and an Approximation Theorem In this section, some important examples of test functions will be shown. We also discuss approximation of continuous functions by elements of D(Ω), as well as certain “ampleness” of D(Ω). We prepare a few results, the first of which is a special case of the finite increment formula. Lemma C.1 Let I be an open interval in R. Consider a function f : I → Rn . (i) If f is differentiable on I , then

f (y) − f (x) |y − x| sup Df (x + t (y − x))

0t1

for all x, y ∈ I . (ii) If f is continuous on [x, y] and differentiable on (x, y) for x, y ∈ I , then

f (y) − f (x) |y − x| sup Df (x + t (y − x)) . 0
(iii) Under the same assumptions as in (ii),

f (y) − f (x) − v, y − x |y − x| sup Df (x + t (y − x)) − v

0
for v ∈ Rn .

C.2 Examples of Test Functions and an Approximation Theorem

385

Proof (i) Choose a sufficiently large M > 0 so that M > sup Df (x + t (y − x)) . 0t1

(If the right-hand side is ∞, (i) is obvious.) Define a set E by E = {t ∈ [0, 1] | f (x + t (y − x)) − f (x) Mt|x − y|}. E is closed by the continuity of f , and clearly 0 ∈ E. Hence E has the maximum, say s. Suppose that s < 1. If t > s and t − s is sufficiently small, then it follows that

f (x + t (y − x)) − f (x) f (x + t (y − x)) − f (x + s(y − x))

+ f (x + s(y − x)) − f (x)

M|(t − s)(y − x)| + Ms|y − x| = Mt|y − x|. Hence t is also an element of E. It contradicts the maximality of s on E. So we must have s = 1. (ii) and (iii) are easy.7 Lemma C.2 Let I be an open interval in R and F ⊂ I a closed set. Suppose that a continuous function f : I → Rn is identically zero on F and differentiable on I \ F . Suppose further that x ∈ ∂F and a sequence {yj } in I \ F converges to x. If Df (yj ) → 0 (as j → ∞), then Df (x) exists and is equal to 0. The next lemma follows from Lemma C.2. Lemma C.3 Let P be a polynomial. Define a function f : R → R by ⎧

⎨P 1 exp − 1 if x > 0, x x f (x) = ⎩0 if x 0. Then f is differentiable at 0 and Df (0) = 0.8 We now proceed to discuss a typical example of test functions. Example C.1 Define a function ϕ : R → R by ϕ(x) =

if |x| 1, 1 ⎩exp − if |x| < 1. 1−x 2

x → f (x) − xv for (iii). acknowledge Hölmander [3] I, pp. 6–7 for Lemma C.1–C.3.

7 Consider 8 We

⎧ ⎨0

386

C Theory of Distributions

Then suppϕ = [−1, 1] and ϕ is infinitely differentiable at any x with |x| ≷ 1. We have only to check the smoothness of ϕ at x = ±1. If we define g : R → R by ⎧

⎨exp − 1 if t > 0, t g(t) = ⎩0 if t 0, g is infinitely differentiable at all points (including 0) by repeated application of Lemma C.3. If we also define h : R → R by h(x) = 1 − x 2 , it is also infinitely differentiable. Since ϕ can be represented as ϕ(x) = (g ◦ h)(x), ϕ is infinitely differentiable everywhere. Hence ϕ ∈ D(R). In the next section, we define distributions as continuous linear functionals on D(Ω). That is, D(Ω) is the space of distributions. Although detailed discussion is postponed to the next section, we should now observe that any locally integrable function f : Ω → C defines a distribution Tf by

Tf (ϕ) =

f (x)ϕ(x)dx,

ϕ ∈ D(Ω).

Ω

Given two different locally integrable functions f and g, we may ask whether D(Ω) is ample enough to discriminate Tf and Tg . That is, is there any ϕ ∈ D(Ω) such that Tf (ϕ) Tg (ϕ) for any different locally integrable functions, f and g? Theorem C.6 Let f : Ω → C be a locally integrable function. If

f (x)ϕ(x)dx = 0 Ω

for any ϕ ∈ D(Ω), then f (x) = 0 a.e. x ∈ Ω. Proof We divide our proof in three steps. 1◦ For any continuous function ϕ : Ω → C with compact support,

f (x)ψ(x)dx = 0. Ω

For any ε > 0 and δ > 0, there exists a function ϕ ∈ D(Ω) such that suppϕ ⊂ Bδ (suppψ),

ϕ − ψ ∞ ε.

C.2 Examples of Test Functions and an Approximation Theorem

387

ϕ satisfies

(ϕ(x) − ψ(x))f (x)dx ε Ω

|f (x)|dx. Bδ (suppψ)

Since

f (x)ϕ(x)dx = 0 Ω

by assumption, we obtain

ε f (x)ψ(x)dx Ω

|f (x)|dx.

Bδ (suppψ)

Passing to the limit as ε → 0 (δ being fixed), we verify 1◦ . 2◦ For any bounded measurable function η : Ω → C with compact support,

f (x)η(x)dx = 0. Ω

We start by proving the existence of a sequence {ψn : Ω → C} of continuous functions which satisfies: a. ψn (x) → η(x) a.e., b. suppψn ⊂ K for all n for some compact set K ⊂ Ω, and c. ψn ∞ η ∞ . As is well-known, there exists a sequence {θn : Ω → C} of continuous functions which converges to η a.e.9 Let K ⊂ Ω be a compact set such that suppη ⊂ int.K. Then there exists a continuous function α : Ω → [0, 1] such that10 α(suppη) = {1},

suppα ⊂ K.

Define θ˜n : Ω → C by θ˜n (x) = α(x)θn (x);

n = 1, 2, · · ·

Each θ˜n is a continuous function with suppθ˜n ⊂ K and θ˜n → η a.e. So {θ˜n } satisfies a and b. We express θ˜n (x) as ˜ θ˜n (x) = |θ˜n (x)|ei·arg θn (x)

9 Maruyama

[5] pp. 232–233. [5] p. 317n.

10 Maruyama

388

C Theory of Distributions

and define βn (x) and ψn (x) by βn (x) = Min{ η ∞ , |θ˜n (x)|}, ˜

ψn (x) = βn (x)ei·arg θn (x) ;

n = 1, 2, · · ·

Then {ψn } certainly satisfies all of a, b and c. By 1◦ , we have

f (x)ψn (x)dx = 0;

n = 1, 2, · · · .

Ω

On the other hand, it follows that

f (x)η(x)dx = lim f (x)ψn (x)dx = lim f (x)ψn (x)dx = 0 K n→∞

Ω

n→∞ Ω

by a, b and the dominated convergence theorem.11 Thus we obtain 2◦ . 3◦ f (x) = 0 a.e. Specify a bounded measurable function η : Ω → C with compact support as η(x) =

+ 0 e−i·argf (x)

if |x| > a

or

if |x| a

and

f (x) = 0, f (x) 0,

where a > 0. η thus defined is measurable and |η(x)| = 0 or 1. It is obvious that suppη ⊂ {x ∈ Ω | x a}. We obtain

f (x)η(x)dx =

Ω

|x|a

|f (x)|dx = 0

by 2◦ . Since this holds good for all a > 0, f (x) = 0 a.e.12

C.3 Distributions: Definition and Examples We now define the distributions in the sense of L. Schwartz and give several examples.

11 |f (x)ψ (x)| n 12 The proof of

η ∞ |f (x)|, and the right-hand side is integrable. Theorem C.5 is due to Schwartz [7] pp. 68–70. The significance of the theorem is discussed clearly in Kolmogorov and Fomin [4] pp. 212–213 (Japanese edn.).

C.3 Distributions: Definition and Examples

389

Definition C.1 Each element of D(Ω) , the dual space of D(Ω), is called a distribution. That is, a distribution is a continuous linear functional on D(Ω), the topological structure of which is discussed in Appendix B. Example C.2 Let f ; Ω → C be a locally integrable function. Define an operator Tf : D(Ω) → C by

Tf (ϕ) =

f (x)ϕ(x)dx. Ω

Tf is well-defined, and it is clearly linear. The continuity can be proved as follows. Let {ϕα } be a net in D(Ω) which converges to ϕ ∗ ∈ D(Ω). There exists a compact set K ⊂ Ω such that suppϕα ⊂ K. (cf. Theorem C.4.) Then it follows that

∗ ∗ |Tf (ϕα )−Tf (ϕ )| = f (x){ϕα (x)−ϕ (x)}dx |f (x)|dx·sup |ϕα (x)−ϕ ∗ (x)|. K

K

x

Since f is locally integrable and ϕα − ϕ ∗ ∞ → 0, we obtain |Tf (ϕα ) − Tf (ϕ ∗ )| → 0. Hence Tf is continuous. According to Theorem C.6, f =g

a.e.

⇔

Tf = T g

for any couple of locally integrable functions, f and g. If we identify two locally integrable functions which are equal a.e., there exists one-to-one correspondence between a locally integrable function f and the distribution Tf which is defined by f . Therefore we sometimes confuse f and Tf consciously and write f (ϕ) instead of Tf (ϕ). Example C.3 Let D be an arbitrary differential operator and f a locally integrable function. If we define an an operator T : D(Ω) → C by

T (ϕ) =

f (x)Dϕ(x)dx, Ω

T is a distribution.

390

C Theory of Distributions

Example C.4 Define an operator δ : D(Ω) → C by δ(ϕ) = ϕ(0). δ is a distribution, called the Dirac distribution. Similarly, an operator δ(a) defined by δ(a) (ϕ) = ϕ(a);

a∈Ω

is called the Dirac distribution which assigns a mass 1 at a point a. Example C.5 Define a space KK (Ω) of functions by KK (Ω) = {ϕ ∈ C(Ω, C)|suppϕ ⊂ K}, where K ⊂ Ω is compact. That is, KK (Ω) is the space of all continuous functions, the supports of which are contained in K. KK (Ω) is a linear space under usual operations. If we define pK (·) on KK (Ω) by pK (ϕ) = sup |ϕ(x)|, x∈K

pK (·) is a norm. KK (Ω) is a Banach space under this norm. Define K(Ω) by K(Ω) =

{KK (Ω)|K is a compact set in Ω}.

Then the strictly inductive limit topology can be defined on K(Ω) by {KK (Ω)} as in D(Ω). 1◦ B ⊂ K(Ω) is bounded if and only if B is bounded in KK (Ω) for some compact set K ⊂ Ω. 2◦ A net {ϕα } in K(Ω) converges to ϕ ∗ ∈ K(Ω) if and only if there exists some compact set K ⊂ Ω (independent of α) such that suppϕα ⊂ K

for all α

and ϕα → ϕ ∗

uniformly.

These results immediately follow from the general theory discussed in Appendix B. Each element of K(Ω) , the dual space of K(Ω), is called a Radon measure on Ω.13 13 In

Sect. 4.1, a continuous linear functional on the Banach space C∞ (with uniform convergence norm) is called a Radon measure. We give the same name to each element of K(Ω) . However,

C.3 Distributions: Definition and Examples

391

A Radon measure is obviously a distribution. We add here a few remarks concerning the relation between distributions and Radon measures. Remark C.2 A distribution T is defined by some Radon measure μ (i.e. the restriction of μ to D is equal to T ) if and only if T is continuous on DK endowed with the relative topology induced by KK for any compact set K ⊂ Ω. This proposition can be proved as follows. (It is enough to show only the sufficiency since the necessity is obvious.) K

(i) Let H be a compact set in R. We denote by DHH the closure of DH in KH . K K Then T |DH can be extended to continuous linear functional T¯H on DHH . (DHH is endowed with the relative topology induced by KH .) C K (ii) T can be extended to a linear functional T¯ on {DHH |H is compact} with T¯ = T¯H for each compact set H . C KH (iii) {DH |H is compact} = K. (iv) T¯ is a Radon measure. (v) Set T¯ = μ. μ is uniquely determined by T . Thus there is a one-to-one correspondence between some subspace of D and K . So a Radon measure can be identified with the corresponding distribution. A distribution T is said to be real valued if T (ϕ) is real for any real-valued ϕ ∈ D. A distribution T is called positive if T (ϕ) 0 for any ϕ ∈ D with ϕ 0. A positive distribution is a Radon measure. To see this, we have to show a positive distribution T is continuous on DK endowed with the relative topology induced by KK for each compact K. Let {ϕα } be a net in DK such that ϕα uniformly converges to 0. Choose ψ ∈ D so that ψ 0

on R ;

ψ 1 on K.

Then there exists εα > 0 such that |ϕα | εα ψ;

εα ↓ 0.

Write ϕα in the form

a rigorous distinction is required. If we restrict ν ∈ C∞ to K, we obtain ν|K ∈ K . If a net {ϕα } in K converges to ϕ ∗ ∈ K, then there exists some compact set K such that suppϕα ⊂ K for all α and ϕα → ϕ ∗ in KK . Hence {ϕα } uniformly converges to ϕ ∗ in C∞ . It follows that ν(ϕα ) → ν(ϕ ∗ ) since ν|K is continuous with respect to the uniform convergence topology. Thus, the concept of a Radon measure as an element of K is more general than that in the sense of Chap. 4. See Bourbaki [1], Choquet [2] Vol. I, Chap. 3 for Radon measures in general. Furthermore, Schwartz [7] Chap. 1.4 should be consulted for a correspondence between distributions and Radon measures.

392

C Theory of Distributions

ϕα = uα + ivα ;

uα , vα ∈ D.

Then we obtain −εα ψ uα , vα +εα ψ. Hence it holds good that |T (ϕα )| 2εα T (ψ), since T is a positive distribution.

C.4 Differentiation of Distributions Let f : R → C be a continuously differentiable function. We denote the distributions defined by f and f by the same notation. Evaluating the distribution f , we obtain f (ϕ) =

R

f · ϕdx =

+∞

−∞

f · ϕdx

for ϕ ∈ D(R). By integration by parts, it is equal to +∞

f · ϕ − −∞

+∞ −∞

f ϕ dx.

Since the support of ϕ is bounded, the first term is 0. Hence

f (ϕ) = −

R

f ϕ dx.

(*)

This shows how to evaluate values of the distribution defined by the derivative of a continuously differentiable function f . It is easy to generalize this reasoning to the concept of the derivative of a distribution. For a distribution T on R, we define a new linear functional S on D(Ω) by Sϕ = −T (ϕ );

ϕ ∈ D(R).

Then S is also a distribution. We call it the generalized derivative or distributional derivative of T , and denote it by T . We next evaluate the second distributional derivative T of T as follows:

C.4 Differentiation of Distributions

393

T ϕ = −T (ϕ ) = +T (ϕ ). More generally, the p-th distributional derivative D p T of T is given by D p T (ϕ) = (−1)p T (D p ϕ). Summing up: Theorem C.7 Any distribution T ∈ D(R) has distributional derivatives of all orders, and the p-th distributional derivative D p T (p ∈ N) is given by D p T (ϕ) = (−1)p T (D p ϕ);

ϕ ∈ D(R).

Example C.6 Let Y (x) be the Heaviside function; i.e. + Y (x) =

0 if

x < 0,

1 if

x > 0.

It is not necessary to define a value at x = 0. Y can be regarded as a distribution, which is evaluated as

+∞ Y (ϕ) = ϕ(x)dx; ϕ ∈ D(R). 0

The distributional derivative DY of Y is evaluated as follows:

+∞

+∞ +∞ DY (ϕ) = Y (Dϕ) = − Y (x)ϕ (x)dx = − ϕ (x)dx = −ϕ(x)0 −∞

0

= − lim (ϕ(a) − ϕ(0)) = ϕ(0) = δ(ϕ) ; ϕ ∈ D(R). a→+∞

Thus DY = Y is exactly equal to the Dirac distribution δ. The repeated distributional derivatives of Y are given by Y (ϕ) = δ (ϕ) = −δ(ϕ ) = −ϕ (0). D p+1 Y (ϕ) = D p δ(ϕ) = (−1)p δ(D p ϕ) = (−1)p D p ϕ(0). Example C.7 Suppose that countable points {xν } = {· · · , x−2 , x−1 , x0 , x1 , x2 , · · · } are given on R so that lim xν = ±∞ and intervals [xν−1 , xν ] are nondegenerate. ν→±∞

Let f be “piecewise smooth”, that is, infinitely differentiable (in the usual sense) on each (xν−1 , xν ). Assume further that each of the successive derivatives (in the usual sense) of f have discontinuities of the first kind at each xν . σνp = D p f (xν + 0) − D p f (xν − 0)

394

C Theory of Distributions

Fig. C.1 Graph of f

Fig. C.2 Computation of f

is the “jump” of the p-th derivative of f at xν . (The values of f at xν s are not required to be defined. cf. Fig. C.1.) We denote by f , f , · · · , f (p) , · · ·

(*)

the successive distributional derivatives of f , and by [f ], [f ], · · · , [f (p) ], · · ·

(**)

the distributions defined by the (locally integrable) functions which coincide with the successive derivatives (in the usual sense) of f on each (xν−1 , xν ) and are not necessarily defined at xν ’s. We should carefully distinguish (*) and (**). (For instance, if f is the Heaviside function, f = δ and [f ] = 0.) We now evaluate the distributional derivatives of such a function f . A couple of points, a and b, are fixed as in Fig. C.2. We try to integrate f · ϕ (∈ D) on the interval [a, b].

a

xν −ε

x −ε f · ϕ dx = f · ϕ aν −

xν −ε

f · ϕdx

a

= f (xν − ε) · ϕ(xν − ε) − f (a) · ϕ(a) − a

xν −ε

f · ϕdx,

C.4 Differentiation of Distributions

395

where f in the integral is in the usual sense. Letting ε → 0, we obtain

xν

f · ϕ dx = f (xν − 0)ϕ(xν ) − f (a)ϕ(a) −

a

xν

f · ϕdx.

a

Similarly, we have

b

f · ϕ dx = f (b) · ϕ(b) − f (xν + 0) · ϕ(xν ) −

xν

b

f · ϕdx.

xν

Hence it follows that

b

f · ϕ dx = f (b) · ϕ(b) − f (a) · ϕ(a)

a

b

− [f (xν + 0) − f (xν − 0)]ϕ(xν ) −

f · ϕdx.

a

If we divide R into intervals like [a, b], integrate f · ϕ on each of these small intervals and sum up all, we obtain

+∞

−∞

f · ϕ dx = −

σν0 ϕ(xν ) −

ν

+∞

−∞

f · ϕdx,

taking account of the support of ϕ is bounded. Consequently, f (ϕ) = −f (ϕ ) =

ν

σν0 ϕ(xν ) +

+∞

f · ϕdx =

−∞

σν0 δ(xν ) (ϕ) + [f ](ϕ).

ν

Thus we conclude that the distributional derivative of f is given in the form f = [f ] +

σν0 δ(xν ) .

ν

We should note that the discontinuities of f appear as masses at xν ’s in the distributional derivative. More generally, the p-th distributional derivative f (p) is given by f (p) = [f (p) ] +

ν

σν(p−1) δ(xν ) +

ν

(1) σν(p−2) δ(x + ··· + ν)

ν

(p−1)

σν0 δ(xν ) .

396

C Theory of Distributions

C.5 Topologies on the Space D(Ω) of Distributions Simply applying the general theory discussed in Appendix B.2 to D(Ω) , we obtain several basic results. Theorem C.8 Endow the space D of distributions with the strong topology. (i) D is complete. (ii) D is a Montel space. (iii) D and D are reflexive, and they are the dual spaces of each other. Lemma C.4 If a sequence {Tn } of distributions simply converges to a distribution T ; i.e. lim Tn (ϕ) = T (ϕ)

n→∞

for any

ϕ ∈ D,

then {D p Tn } also simply converges to D p T for any differential operator D p . Proof It is immediate, since D p Tn (ϕ) = (−1)|p| Tn (D p ϕ) → (−1)|p| T (D p ϕ) = D p T (ϕ) for any ϕ ∈ D.

The next theorem is easily derived from the Banach–Steinhaus theorem (Theorem B.12, p. 371) and Lemma C.4. Theorem C.9 Suppose that a sequence {Tn } of distributions has a limit T (ϕ) ≡ lim Tn (ϕ) n→∞

for each ϕ ∈ D. Then the following results hold good: (i) T is a distribution and {Tn } converges to T strongly. (ii) For any differential operator D p , D p T = lim D p Tn . n→∞

Corollary C.2 (series of distributions) Suppose that Tn (n = 1, 2, · · · ) are distributions and the limit T (ϕ) ≡

∞ n=1

exists for each ϕ ∈ D.

Tn (ϕ)

C.5 Topologies on the Space D(Ω) of Distributions

(i) T =

397

Tn is a distribution.

n

(ii) For any differential operator D p , Dp T =

∞

D p Tn .

n=1

Definition C.2 Let T be a distribution on Ω ⊂ R and U an open subset of Ω. T is said to be 0 in U (or vanish in U ) if T (ϕ) = 0 for all ϕ ∈ D(Ω) with suppϕ ⊂ U . Theorem C.10 Let {Ui }i∈I be a family of open sets in Ω. If a distribution T Ui . vanishes in all Ui ’s, then T vanishes in U = i∈I

Proof Let ϕ ∈ D with suppϕ ⊂ U . If we add Ω \ suppϕ to {Ui }i∈I , we obtain an open covering of Ω. We denote it by {Vj }j ∈J . Let {βj }j ∈J be a smooth partition of unity subordinate to {Vj }j ∈J ; i.e. 1◦ 2◦ 3◦ 4◦

βj : Ω → [0, 1], suppβj ⊂ Vj , for any compact set K ⊂ Ω, all but finite βj , are 0 on K, βj (x) = 1 for all x ∈ Ω. j ∈J

Then ϕ=

βj ϕ,

j ∈J

and this is actually a finite sum, since suppϕ is compact and the property 3◦ of {βj }. Hence T (βj ϕ). T (ϕ) = j ∈J

If suppβj is contained in some Ui (i ∈ I ), T (βj ϕ) = 0 by assumption. In the case of suppβj ⊂ Ω \ suppϕ, it holds good that βj ϕ = 0, and so clearly T (βj ϕ) = 0. Therefore T (ϕ) = 0. According to Theorem C.10, a distribution T is 0 in U≡

{V ⊂ Ω | T vanishes in V }.

U is the largest open set in which T is 0. Ω \ U is called the support of T . Definition C.3 The support of a distribution T ∈ D(Ω) is the smallest closed set F such that T vanishes in Ω \ F .

398

C Theory of Distributions

Example C.8 If T is a continuous function,14 the support of T as a distribution and that as a function coincide. Example C.9 The support of the Dirac distribution δ(a) is the singleton {a}. We insert here some basic results concerning boundedness in topological vector spaces: 1◦ Let X and Y be topological vector spaces and T : X → Y a continuous linear operator. Then the image T (B) of a bounded set B in X is bounded in Y. 2◦ Let X be a locally convex topological vector space. The following two statements are equivalent: a. Any balanced convex set which absorbs every bounded set is a neighborhood of 0 in X. b. Any seminorm which is bounded on every bounded set in X is continuous. A locally convex topological vector space is called bornologic if it satisfies a (⇔ b). 3◦ A metrizable locally convex topological vector space is bornologic. Suppose that A ⊂ X is a balanced, convex set which absorbs every bounded set. Since X is metrizable, 0 has a countable neighborhood base {Un }. We may assume, without loss of generality, that Un ⊂ Un−1 . If A is not a neighborhood of 0, there exists a sequence {xn } in X such that xn ∈ (1/n)Un , xn A. {nxn } is bounded, since nxn → 0. Hence nxn ∈ αA for all n for some α 0. Since xn ∈ (α/n)A, we have xn ∈ A for large n, a contradiction. 4◦ immediately follows from 3◦ . 4◦ D(Ω) and E(Ω) are bornologic. 5◦ Let X and Y be locally convex topological vector spaces. X is assumed, in addition, to be bornologic. A linear operator T : X → Y is continuous, if the image T (B) of any bounded set B in X is bounded in Y. The next result immediately follows. Theorem C.11 A linear functional on D(Ω) (resp. E(Ω)) is continuous if and only if the image T (B) of any bounded set B in D(Ω) (resp. E(Ω)) is bounded in C. Corollary C.3 The following two statements are equivalent for a linear functional T on D(Ω): (i) T is a distribution. (ii) For any compact set K in Ω, there exist some C > 0 and k ∈ N such that |T (ϕ)| C sup |D p ϕ(x)| |p|k x∈K

14 This means that T

suppf .

for all

ϕ ∈ DK (Ω).

is a distribution defined by a continuous function, say f . In this case suppT =

C.5 Topologies on the Space D(Ω) of Distributions

399

Proof (i)⇒(ii): If T is a distribution, T is continuous on DK for each compact set K in Ω. (ii) immediately follows. (ii)⇒(i): If we assume (ii) conversely, T maps any bounded set of D(Ω) into a bounded set. Hence by Theorem C.11, T is continuous. Lemma C.515 (i) The topology of D(Ω) is stronger than the relative topology induced by E(Ω), (ii) D(Ω) is dense in E(Ω). Based upon these preparations, we can prove the following important result. A distribution T ∈ D(Ω) can be extended uniquely to an element of E(Ω) . Proof We denote by K the support of T ; i.e. K = suppT (compact). Let ψ be an element of D(Ω), which is identically 1 in a neighborhood of K.16 (Such a ψ certainly exists.) Define T˜ : E(Ω) → C by

15 We

can prove Lemma C.5 in the following way:

(i) Choose a sequence {Kn } of compact sets such that Kn ⊂ int.Kn+1 and Ω =

∞

Kn .

n=1

Consider a basic neighborhood of 0 in E(Ω), say Vm,ε,K ≡ {ϕ ∈ E |

|D p ϕ(x)| < ε} (K is compact).

sup pm,x∈K

For Kn ⊃ K, we have Vm,ε,K

B

DKn (Ω) = {ϕ ∈ DKn (Ω)| ⊃ {ϕ ∈ DKn (Ω)|

sup

|D p ϕ(x)| < ε}

pm,x∈K

sup

|D p ϕ(x)| < ε}.

pm,x∈Kn

The lastDterm is a neighborhood of 0 in DKn (Ω). Hence, by definition of the topology of D(Ω), Vm,ε,K DKn (Ω) is a neighborhood of 0 in D(Ω). (ii) Let ϕn ∈ D(Ω) satisfy ϕn (Kn ) = {1} and ϕn (Ω \ Kn+1 ) = {0}. For ϕ ∈ E(Ω), we have ϕ · ϕn ∈ D(Ω). We have to show that ϕ · ϕn → ϕ in E(Ω). 16 We write K = suppT (compact). For any ε > 0, there exist finite points x , x , · · · , x such that 1 2 n K⊂

n

Bε (xj ).

j =1

Let {βj } be a smooth partition of unity subordinate to {Bε (xj )|j = 1, 2, · · · n}. Let H be a compact n neighborhood of K such that H ⊂ Bε (xj ). Define a function ψ : Ω → C by j =1

ψ(x) =

βj (x).

(suppβj )∩H ∅

Then ψ ∈ D(Ω) and ψ is identically 1 in a neighborhood of K.

400

C Theory of Distributions

T˜ (η) = T (ψη);

η ∈ E(Ω).

T˜ is well-defined in the sense that it is determined independently of the choice of ψ.17 T˜ is clearly linear. The continuity of T˜ is proved as follows. For any bounded set B in E(Ω), the set {ψη|η ∈ B} is bounded in D(Ω). Hence {T˜ (η)|η ∈ B} = {T (ψη)|η ∈ B} is bounded by Theorem C.11. Hence, again by Theorem C.11. T˜ is continuous. T˜ is an extension of T . In fact, if ψ ∈ D(Ω), then ϕ(1 − ψ) ∈ D(Ω) and suppϕ(1 − ψ)

B

K = ∅.

Hence T (ϕ(1 − ψ)) = 0, which implies T˜ ϕ = T (ψϕ) + T (ϕ(1 − ψ)) = T ϕ. Thus T˜ is an extension of T . D(Ω) is dense in E(Ω) (Fréchet space) and T˜ |D(Ω) = T is continuous (and so uniformly continuous). Hence, by the principle of extension by continuity, an extension of T to E(Ω) is uniquely defined. The converse assertion is as follows. An element of E(Ω) restricted to D(Ω) is a distribution with compact support. Proof Let T˜ ∈ E(Ω) . The restriction of T˜ to D(Ω) is denoted by T ; i.e. T = T˜ |D(Ω) . It is obvious that T is a linear functional on D(Ω). Let {ϕα } be a net in D(Ω) which converges to 0. Then there exists some compact set K ⊂ Ω such that ϕα → 0

in DK (Ω),

θ ∈ D(Ω) is also identically 1 in a neighborhood of K, then (ψ − θ)η is an element of D(Ω), which is identically 0 in a neighborhood of K. Hence it follows that T (ψη) − T (θη) = T ((ψ − θ)η) = 0.

17 If

C.5 Topologies on the Space D(Ω) of Distributions

401

and hence ϕα → 0 in E(Ω). Since T˜ is an element of E(Ω) , we have T (ϕα ) = T˜ (ϕα ) → 0. Hence T is continuous on D(Ω); i.e. T ∈ D(Ω) . Finally, we have to show that suppT is compact. If not, there exists a sequence {ϕn } in D(Ω) such that ⎧ ⎪ ⎪ ⎨suppϕn ⊂ Ω \ Bn (0), (i.e. ϕn (x) = 0 if ⎪ ⎪ ⎩T (ϕ ) = 1 n

x n),

for each n ∈ N. Since suppϕn goes away farther, ϕn → 0 in E(Ω). Hence T (ϕn ) = T˜ (ϕn ) → 0. However, it contradicts T (ϕn ) = 1 (for all n). suppT must be compact.

Combining the two propositions above, we can observe that a distribution with compact support corresponds to an element of E(Ω) one-to-one. If we identify corresponding T and T˜ , we have the following result. Theorem C.12 The space of distributions with compact support coincides with E(Ω) . Remark C.3 A linear functional T˜ on E(Ω) is continuous if and only if there exist a compact set H in Ω, C > 0 and K ∈ N such that |T˜ (η)| C ·

|D p η(x)| for all

sup

η ∈ E(Ω).

pk,x∈H

Proof We have only to prove necessity, since the converse is easy. T = T˜ |D(Ω) is a distribution if and only if there exist, for any compact set K ⊂ Ω, some C > 0 and k ∈ N such that |T (ϕ)| C

sup

pk ,x∈K

|D p ϕ(x)|

It holds good, in particular, for H = suppT˜ .

for all

ϕ ∈ DK (Ω).

402

C Theory of Distributions

Let ψ be an element of D(Ω) which is identically 1 in a neighborhood of H . If H = suppϕ, ψη ∈ DH (Ω) for any η ∈ D(Ω). Hence |T˜ (η)| = |T (ψη)| C

sup

pk ,x∈H

|D p (ψη)(x)| C {C

sup

pk ,x∈H

By setting C = C · C and k = k , we obtain the desired result.

|D p η(x)|}.

References 1. Bourbaki, N.: Eléments de mathématique; Espaces vectoriels topologiques. Hermann, Paris, Chaps.1–2 (seconde édition) (1965), Chaps. 3–5, (1964). (English edn.) Elements of Mathematics; Topological Vector Spaces. Springer, Berlin/Heidelberg/New York (1987) 2. Choquet, G.: Lectures on Analysis, vol. I. Benjamin, London (1969) 3. Hölmander, L.: The Analysis of Linear Partial Differential Operators, I, II. Springer, Berlin/Heidelberg (1983) 4. Kolmogorov, A.N., Fomin, S.V.: (English edn.) Introductory Real analysis, Prentice-Hall, New York (1970) (Japanese edn., translated from 4th edn.) Kansu-kaiseki no Kiso, 4th edn. Iwanami Shoten, Tokyo (1979) (Originally published in Russian) I use the Japanese edition for quotations. 5. Maruyama, T.: Sekibun to Kansu-kaiseki (Integration and Functional Analysis). Springer, Tokyo (2006) (Originally published in Japanese) 6. Schwartz, L.: Méthods mathématique pour les sciences physiques. Hermann, Paris (1965) 7. Schwartz, L.: Théorìe des distributions. Nouvelle édition, entièrement corrigée, refondue et augmentée, Hermann, Paris (1973) 8. Yosida, K.: Functional Analysis, 3rd edn. Springer, New York (1971)

Addendum

My basic policy in this book is “to restrict myself to a rather narrow field of applications familiar to many economists, and to give full details of mathematical materials instead”, as I wrote in the preface. However, I would like to add here a brief memorandom of a topic from recent developments in economic theory. As is well-known, it is Solow [5] who pointed out the significance of the “vintage” structure of a stock of capital. He wrote that any technological innovations realize their effect on output “only to the extent that they are carried into practice either by net capital formation or by the replacement of old-fashioned equipment by the latest models”. Solow’s vintage models of the embodied technological innovation exerted a remarkable influence and aroused much controversy in the theory of capital. We can now observe a renaissance of Solow’s model in a new framework. In the so-called “neo-classical” macro-economic theory, the aggregate output (say, gross domestic product, GDP) is produced by an aggregate stock of capital and labor services under certain technological conditions represented by a production function. The gross investment must be equal to the saving at each time. The surviving fraction of a unit of capital depends upon its “vintage”, that is, the date when the machine was produced. Goldman–Kato–Mui [4] presented and analyzed an integral equation

t

k(t) =

g(k(t − θ ))ϕ(θ )dθ + h(t),

(†)

o

which describes a dynamic behavior of the capital stock k(t) per laborer. Here g, ϕ and h are given functions. The equation (†) may be regarded as a nonlinear version of Volterra’s integral equation of the second kind, involving a “nonlinear convolution”. Comparing it with the integral equation of the form

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8

403

404

Addendum

t

k(t) =

k(t − θ )ϕ(θ )dθ + h(t),

(‡)

o

which is called Feller’s renewal equation (cf. Feller [3]), we may view the equation (†) as a nonlinear version of it. A similar but distinct equation is discussed on pp. 75–78 of this book. In the same year, Benhabib–Rustichini [1] studied an optimal growth problem subject to the capital accumulation described by a nonlinear integral equation in the same category. From a mathematical viewpoint, one of the deepest analyses was provided by Diekmann and Gils [2]. They considered an integral equation of the form

x(t) =

t

B(τ )g(x(t − τ ))dτ + f (t),

t 0,

(∗)

0

where B is a (n × n)-matrix-valued L2 -function and g : Rn → Rn is a smooth function. Both B and g are given. By inserting real parameters, they also examined the existence of periodic solutions having recourse to Hopf’s bifurcation theorem (cf. Chap. 11). Integral equations of this type sometimes appear in epidemiology. (See Thieme [6].)

References 1. Benhabib, J., Rustichini, A.: Vintage capital, investment, and growth. J. Econ. Theory 55, 323– 339 (1991) 2. Diekmann, O., van Gils, S.A.: Invariant manifolds for Volterra integral equations of convolution type. J. Differential Equations 54, 139–180 (1984) 3. Feller, W.: On the integral equation of renewal theory. Ann. Math. Stat. 12, 243–267 (1941) 4. Goldman, S.M., Kato, T., Mui, V.-L.: Economic growth and general depreciation. J. Development Economics 34, 397–400 (1991) 5. Solow, R.M.: Investment and technical progress. In: Arrows, K.J., Karlin, S., Suppes, P. (eds.) Mathematical Methods in the Social Sciences 1959. Stanford University Press, Stanford (1960) 6. Thieme, H.R.: Renewal theorems for some mathematical models in epidemiology. J. Integral Equations 8, 185–216 (1985)

Name Index

A Amann, H., 103n, 115n, 119 Ambrosetti, A., 301–303n, 306, 312n, 344 Arveson, W., 171n, 191

B Billingsley, P., 277 Bochner, S., 115, 121, 150n, 163, 245, 270, 277 Bohr, H., 245n, 246, 277 Bourbaki, N., 67n, 99, 357n, 377, 391n, 402 Brémaud, P., 193n, 243

C Carathéodory, C., 133n, 163 Carleson, L., 37n, 44, 228n, 233, 243, 302, 315, 317, 344 Cartan, H., 7n, 21, 55n, 62, 347n, 356 Cauchy, 115 Chang, W.W., 333, 344 Choquet, G., 382n, 391n, 402 Cramér, H., 221, 227, 229, 231, 233, 236 Crandall, M.G., 301n, 344 Crum, M.M., 207n, 243

D Dixmier, J., 171n, 191 Doob, J.L., 227n, 243 du Bois Reymond, P., 34 Dudley, R.M., 20, 21, 194n, 243, 250n, 278 Dunford, N., 36n, 44, 74n, 99, 187n, 191, 245n, 278, 282n, 299 Dym, H., 51n, 62

E Escher, J., 103n, 115n, 119 Evans, L., 166n, 191, 314n, 344

F Folland, G.B., 5n, 15n, 21, 97n, 99 Fomin, S.V., 29n, 34n, 39n, 45, 51n, 62n, 63, 388n, 402 Frisch, R., 193n, 243, 330, 344

G Goldberg, R.R., 51n, 62 Goodwin, R., 193 Granger, C.W.J., 193n, 239n, 243 Grothendieck, A., 67n, 99, 357n, 377

H Halmos, P.R., 1n, 21 Hamilton, J.D., 193n, 243 Hardy, G.H., 36, 37n, 44 Helson, H., 112n, 119 Herglotz, G., 121, 133n, 163 Hicks, J.R., 193n, 243, 330, 344 Hölmander, L., 385n, 402 Hopf, E., 301n, 302, 305, 306, 309, 328, 330, 335, 338, 341, 344 Hunt, R.A., 37n, 44, 302, 344

I Igari, S., 157n, 163 Itô, K., 213n, 214n, 221n, 243

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8

405

406 J Jørsboe, O.G., 37n, 44 K Kahane, J.P., 37n, 44 Kaldor, N., 193n, 243, 303, 330, 333, 344 Kato, T., 38n, 45, 86n, 93, 173n, 191, 279n, 299 Katznelson, Y., 37n, 44, 45, 105n, 111n, 119, 126n, 133n, 158n, 163, 245n, 263n, 278 Kawata, T., 15n, 21, 29n, 32n–34n, 45, 51n, 62, 63, 71n, 78n, 99, 112n, 119, 193n, 194n, 213n, 227n, 235n, 243, 245n, 275n, 278 Keynes, J.M., 193n, 243, 330, 344 Kolmogorov, A.N., 29n, 31, 34n, 37n, 39n, 45, 51n, 62n, 63, 194, 221, 227, 229n, 231, 233, 236, 388n, 402 Kondrachov, V.I., 314 Kuroda, N., 279n, 299 Kusuoka, S., 238n, 315 L Lax, P.D., 1n, 21, 97n, 99, 134n, 138n, 151n, 155n, 163, 172n, 183n, 187n, 191 Ljapunov, A., 303, 306, 335, 337 Loève, M., 194n, 243 Loomis, L., 245n, 278 Lotka, A.J., 342 Lusin, N.N., 36 M Malliavin, P., 112n, 118n, 119, 122n, 154n, 163, 277, 278 Maruyama, G., 193, 243 Maruyama, T., 1n, 21, 36n, 45, 68n, 71n, 76, 92n, 99, 103n, 115n, 119, 122n, 134n, 138n, 151n, 155n, 163, 171n, 179n, 191, 193n, 222n, 227n, 243, 277n, 278, 282n, 285n, 291n, 298n, 299, 303n, 305n, 344, 354n, 356, 358n, 364n, 372n, 377, 387n, 402 Masuda, K., 303n, 344 McKean, H.P., 51n Melbro, L., 37n, 44 N Naimark, M.A., 155n, 163 Newbold, P., 193n, 239n, 243 P Plancherel M., 65 Prodi, G., 301–303n, 306, 312n, 344

Name Index R Rabinowitz, P.H., 301n, 344 Rellich, F., 314 Riesz, F., 3 Rogozinski, W.W., 44 Rudin, W., 245n, 278, 354n, 356, 374n, 377 S Samuelson, P.A., 193n, 243, 330, 345 Sargent, T.J., 193n, 216, 242n, 243 Schwartz, J.T., 36n, 99, 245n, 278, 282n, 299 Schwartz, L., 1n, 21, 62n, 63, 67n, 74n, 78, 99, 100, 136n, 163, 357n–359n, 367n, 369n, 378, 379n, 388n, 391n, 402 Shinasi, G.J., 333, 345 Shinkai, Y., 242n, 243 Slutsky, E., 193n, 243, 244 Smyth, D.J., 333, 344 Stone, M.H., 187n, 191 Stromberg, K.R., 32n, 38n–40n, 45, 61n, 63, 250n, 278, 329n, 345, 349n, 350n, 356 T Takagi, T., 7n, 21, 32n, 39n, 40n, 45, 55n, 61n, 63, 69n, 100, 329n, 345, 347n, 350n, 356 Terasawa, K., 5n, 21 Titchmarsh, E.C., 51n, 63 Toeplitz, O., 133n, 163 Treves, J.F., 71n, 100 V van der Waerden, B.L., 347n, 356 Volterra, V., 342 von Neumann, J., 245n, 277, 278 W Wold, H., 193n, 244 Y Yamaguti, M., 342n, 345 Yasui, T., 330, 333, 345 Yosida, K., 5n, 15n, 20, 21, 36n, 45, 71n, 86n, 87n, 93, 100, 179n, 189n, 191, 285n, 291n, 299, 364n, 378, 379n, 402 Z Zeidler, E., 279n, 299 Zygmund, A., 37n, 45

Subject Index

A Abel summability kernel, 112n Absorbing, 358 Adjoint operator, 171 Almost periodic —function, 246 —weakly stationary process, 274 Argument (of a complex number), 353 Ascoli–Arzelà theorem, 368 autocorrelation, 200 autocovariance, 200

B Backward operator, 239 Baire space, 364 Balanced, 358 Banach–Mackey Resonance Theorem, 372 Banach–Steinhaus theorem, 36n, 371 Barrel, 363 Barreled space, 363 Bessel’s inequality, 14 Beta function, 7n Bifurcation equation, 305 Bifurcation point, 305, 311 Biorthogonal system, 283 Bochner’s theorem, 150 Bornologic space, 398 Branch (of log), 353

C (C, 1)-sum, 40 (C, 1)-summable, 40 Carleson’s theorem, 37

Cauchy’s principal value, 51 Compact topology, 366 Complete orthonormal system, 15 Conjugate operator, 171 Continuous measure, 157 Convergence of Fourier series almost everywhere—, 38 Dini test for—, 31 Jordan test for—, 31 uniform—, 37 —at a point, 27 —in norm, 16 Convolution —of a measure and a function, 137 —of measures, 156 —on L1 , 56 —on [−π, π ], 109 Correlation function, 195 Covariance function, 195 Cramér–Kolmogorov Representation Theorem, 221 Crum’s theorem, 207

D Dini condition, 31 Dini test, 31 Dirac distribution, 390 Direct sum, 279 topological—, 282 —on Hilbert space, 3 —on vector space, 279 Dirichlet integral, 25 Dirichlet kernel, 26 Discrete measure, 157

© Springer Nature Singapore Pte Ltd. 2018 T. Maruyama, Fourier Analysis of Economic Phenomena, Monographs in Mathematical Economics 2, https://doi.org/10.1007/978-981-13-2730-8

407

408 Distribution, 79 periodic—, 93 tempered—, 65 —with compact support, 82, 400 Distributional derivative —defined by a locally integrable function, 389 Dirac—, 390 positive—, 391 Distributional derivative, 79, 392 Doob–Kawata formulas (D–K formulas), 220 Dual operator, 83

E ε-almost period, 245

F Fejér integral, 42 Fejér kernel, 42 Fejér sum, 41 Fejér’s summability, 44 Fejér summability kernel, on R, 115 First integral, 335 First summability in the sense of Cesàro, 40 Fourier coefficient, 266 —of almost periodic function, 266 —on AP, 267 —on Hilbert space, 12 Fourier index, 267 Fourier integral formula, 50 Fourier inversion formula —on L1 , 52 —on L2 , 72 —on S, 68 Fourier series, 12 —in complex form, 13 —on AP, 267 —on Hilbert space, 11 —on L1 , 24 Fourier–Stieltjes coefficients, 123 Fourier–Stieltjes transform, 136 Fourier transform, 52 derivative of—, 57, 59 higher derivatives of—, 58, 60 —of convolution, 56 —of tempered distribution, 83 —on L1 , 52 —on L2 , 72, 75 —on S, 68 Fréchet-differentiable, 314

Subject Index Fredholm operator, 290 Frequency, 256

G Gamma function, 7n Gâteaux-differentiable, 313 Gauss summability kernel, 112 Generalized derivative, 79, 392 Generalized function, 79

H Hamiltonian system, 336 Heat equation, 60 Heaviside function, 393 Herglotz’s theorem, 133 Hermite polynomial, 8 Hermitian operator, 172 Hilbert space, 2 separable—, 19 Hopf bifurcation theorem, 301 H-weakly continuous, 198

I Idempotency, 174 Index, of Fredholm operator, 290 Inductive limit topology, 358 strictly—, 358 Inner product, 1 Inverse Fourier transform, 71 —on L2 , 72 —on S, 72 —on S , 84

J Jordan test, 31

K Khintchine orthogonal measure, 218 Kolmogorov’s theorem, 194

L Lag operator, 239 Laguerre polynomial, 9 Lax–Milgram theorem, 166 Left parametrix, 295 Legendre polynomial, 6

Subject Index Linear stochastic process, 205 Ljapunov center theorem, 337 Ljapunov–Schmidt reduction method, 303 Local property, 31 Lusin’s problem, 37

M Mackey theorem, 374 Mean convolution, 267 Mean value on AP, 261 Minkowski functional, 364 Modulus of continuity, 258 Montel space, 365 Moving average process, 205

N Nemyckii operator, 312 Norm spectrum, 256

O One-parameter group, 187 —of class-C0 , 187 Orthogonal, 2 Orthogonal complement, 3 Orthogonal measure, 217 finite—, 217 Khintchine—, 218 Orthogonal system, 5 Orthonormal system, 5

P Paralleogram law, 2 Parametrix, 293 parametrix left—, 295 right—, 295 Parseval’s equality, 17 Parseval’s theorem, 91 Periodic distribution, 92 Periodic weakly stationary process, 214 Plancherel’s theorem, 72 Poisson formula (of heat equation), 62 Poisson summability kernel, 112n Polar set, 367 Positive distribution, 391 Positive semi-definite, 131 Pre-Hilbert space, 2 Projection (projection operator), 173 Pythagorean theorem, 2

409 Q Quotient seminorm, 371 R Radon measure, 122, 390 Rapidly decreasing function, 65 Reflexive, 374 Rellich–Kondrachov Compactness Theorem, 314n Resolution of the identity, 177 Riemann–Lebesgue lemma, 58 —of Fourier transform, 58 —on Hilbert space, 14 Riesz–Markov–Kakutani theorem, 122 Riesz’s representation theorem, 3 Right parametrix, 295 S Sample function, 194 Schmidt’s orthogonalization, 9 Schwarz inequality, 2 Second mean-valued theorem of integration, 32 —of Bonnet-type, 32 —of Weierstrass-type, 32 Self-adjoint operator, 172n Semi-group, 187n Semi-reflexive, 374 Serially uncorrelatedness, 206 Sesquilinear functional, 165 Sesquilinear functional bounded—, 169 Shift operator, 101 Sign theorem, 132 Simple eigenvalue, 312 Simple topology, 366 Skew-symmetric, 165 Slowly increasing function, 82 tempered distribution defined by—, 81 Spectral density function, 211 Spectral distribution function, 211 Spectral measure, 211 Spectral projection, 323 Spectral representation —of covariance, 210 —of unitary operator, 187 —of weakly stationary stochastic process, 221 Spectral synthesis, 111 —on L12π , 111 —on [−π, π ], 109 —on Wiener algebra, 118

410 Steinhaus’ theorem, 250n Stochastic process, 194, 200 strongly continuous—, 196 strongly stationary—, 200 weakly continuous—, 198 weakly stationary—, 200 —of p-th order, 195 Stone’s theorem (on one parameter group of unitary operators), 189 Strictly inductive limit topology, 360 Strongly continuous, 187n Strongly continuous stochastic process, 196 uniformly—, 196 Strongly stationary stochastic process, 200 Strong topology, 366 Summability kernel Abel—, 112 Fejér—, 115 Gauss—, 112 Poisson—, 112 —on [−π, π ], 105 —on R, 112 Support of a distribution, 397 Symmetricity, 174 Symmetric operator, 173 T Tchebichev’s inequality, 241 Tempered distribution, 65

Subject Index Tempered measure, 81 Test function, 79, 384 Topological complements, 282, 361 Topological direct sum, 282 Torus, 352 Translation closed convex hull, 250 Triangular inequality, 2 Trigonometric polynomial, 256

U Uniformly strongly continuous, 196 Unitary group, 175 Unitary operator, 175

V Vanish at infinity, 121n Vanish in an open set, 397 Variance, 195 Volterra–Lotka differential equation, 342

W Weakly continuous stochastic process, 198 Weakly stationary stochastic process, 200 White noise, 206 Wiener algebra, 118 Wiener’s theorem, 160, 162