Scaling, Fractals and Wavelets

Author: Patrice Abry | Paulo Gonçalves | Jacques Lévy Véhel

Edited by Patrice Abry, Paulo Gonçalves and Jacques Lévy Véhel

First published in France in 2 volumes in 2002 by Hermes Science/Lavoisier entitled: Lois d’échelle, fractales et ondelettes © LAVOISIER, 2002

First published in Great Britain and the United States in 2009 by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd 27-37 St George’s Road London SW19 4EU UK

John Wiley & Sons, Inc. 111 River Street Hoboken, NJ 07030 USA

www.iste.co.uk

www.wiley.com

© ISTE Ltd, 2009

The rights of Patrice Abry, Paulo Gonçalves and Jacques Lévy Véhel to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Cataloging-in-Publication Data
Lois d’échelle, fractales et ondelettes. English
Scaling, fractals and wavelets/edited by Patrice Abry, Paulo Gonçalves, Jacques Lévy Véhel.
p. cm.
Includes bibliographical references.
ISBN 978-1-84821-072-1
1. Signal processing--Mathematics. 2. Fractals. 3. Wavelets (Mathematics) I. Abry, Patrice. II. Gonçalves, Paulo. III. Lévy Véhel, Jacques, 1960- IV. Title.
TK5102.9.L65 2007 621.382'20151--dc22 2007025119

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN: 978-1-84821-072-1

Printed and bound in Great Britain by CPI Antony Rowe Ltd, Chippenham, Wiltshire.

Table of Contents

Preface

Chapter 1. Fractal and Multifractal Analysis in Signal Processing
Jacques LÉVY VÉHEL and Claude TRICOT

1.1. Introduction
1.2. Dimensions of sets
1.2.1. Minkowski-Bouligand dimension
1.2.2. Packing dimension
1.2.3. Covering dimension
1.2.4. Methods for calculating dimensions
1.3. Hölder exponents
1.3.1. Hölder exponents related to a measure
1.3.2. Theorems on set dimensions
1.3.3. Hölder exponent related to a function
1.3.4. Signal dimension theorem
1.3.5. 2-microlocal analysis
1.3.6. An example: analysis of stock market price
1.4. Multifractal analysis
1.4.1. What is the purpose of multifractal analysis?
1.4.2. First ingredient: local regularity measures
1.4.3. Second ingredient: the size of point sets of the same regularity
1.4.4. Practical calculation of spectra
1.4.5. Refinements: analysis of the sequence of capacities, mutual analysis and multisingularity
1.4.6. The multifractal spectra of certain simple signals
1.4.7. Two applications
1.4.7.1. Image segmentation
1.4.7.2. Analysis of TCP traffic
1.5. Bibliography

Chapter 2. Scale Invariance and Wavelets
Patrick FLANDRIN, Paulo GONÇALVES and Patrice ABRY

2.1. Introduction
2.2. Models for scale invariance
2.2.1. Intuition
2.2.2. Self-similarity
2.2.3. Long-range dependence
2.2.4. Local regularity
2.2.5. Fractional Brownian motion: paradigm of scale invariance
2.2.6. Beyond the paradigm of scale invariance
2.3. Wavelet transform
2.3.1. Continuous wavelet transform
2.3.2. Discrete wavelet transform
2.4. Wavelet analysis of scale invariant processes
2.4.1. Self-similarity
2.4.2. Long-range dependence
2.4.3. Local regularity
2.4.4. Beyond second order
2.5. Implementation: analysis, detection and estimation
2.5.1. Estimation of the parameters of scale invariance
2.5.2. Emphasis on scaling laws and determination of the scaling range
2.5.3. Robustness of the wavelet approach
2.6. Conclusion
2.7. Bibliography

Chapter 3. Wavelet Methods for Multifractal Analysis of Functions
Stéphane JAFFARD

3.1. Introduction
3.2. General points regarding multifractal functions
3.2.1. Important definitions
3.2.2. Wavelets and pointwise regularity
3.2.3. Local oscillations
3.2.4. Complements
3.3. Random multifractal processes
3.3.1. Lévy processes
3.3.2. Burgers’ equation and Brownian motion
3.3.3. Random wavelet series
3.4. Multifractal formalisms
3.4.1. Besov spaces and lacunarity
3.4.2. Construction of formalisms
3.5. Bounds of the spectrum
3.5.1. Bounds according to the Besov domain
3.5.2. Bounds deduced from histograms
3.6. The grand-canonical multifractal formalism
3.7. Bibliography

Chapter 4. Multifractal Scaling: General Theory and Approach by Wavelets
Rudolf RIEDI

4.1. Introduction and summary
4.2. Singularity exponents
4.2.1. Hölder continuity
4.2.2. Scaling of wavelet coefficients
4.2.3. Other scaling exponents
4.3. Multifractal analysis
4.3.1. Dimension based spectra
4.3.2. Grain based spectra
4.3.3. Partition function and Legendre spectrum
4.3.4. Deterministic envelopes
4.4. Multifractal formalism
4.5. Binomial multifractals
4.5.1. Construction
4.5.2. Wavelet decomposition
4.5.3. Multifractal analysis of the binomial measure
4.5.4. Examples
4.5.5. Beyond dyadic structure
4.6. Wavelet based analysis
4.6.1. The binomial revisited with wavelets
4.6.2. Multifractal properties of the derivative
4.7. Self-similarity and LRD
4.8. Multifractal processes
4.8.1. Construction and simulation
4.8.2. Global analysis
4.8.3. Local analysis of warped FBM
4.8.4. LRD and estimation of warped FBM
4.9. Bibliography

Chapter 5. Self-similar Processes
Albert BENASSI and Jacques ISTAS

5.1. Introduction
5.1.1. Motivations
5.1.2. Scalings
5.1.2.1. Trees
5.1.2.2. Coding of R
5.1.2.3. Renormalizing Cantor set
5.1.2.4. Random renormalized Cantor set
5.1.3. Distributions of scale invariant masses
5.1.3.1. Distribution of masses associated with Poisson measures
5.1.3.2. Complete coding
5.1.4. Weierstrass functions
5.1.5. Renormalization of sums of random variables
5.1.6. A common structure for a stochastic (semi-)self-similar process
5.1.7. Identifying Weierstrass functions
5.1.7.1. Pseudo-correlation
5.2. The Gaussian case
5.2.1. Self-similar Gaussian processes with r-stationary increments
5.2.1.1. Notations
5.2.1.2. Definitions
5.2.1.3. Characterization
5.2.2. Elliptic processes
5.2.3. Hyperbolic processes
5.2.4. Parabolic processes
5.2.5. Wavelet decomposition
5.2.5.1. Gaussian elliptic processes
5.2.5.2. Gaussian hyperbolic process
5.2.6. Renormalization of sums of correlated random variable
5.2.7. Convergence towards fractional Brownian motion
5.2.7.1. Quadratic variations
5.2.7.2. Acceleration of convergence
5.2.7.3. Self-similarity and regularity of trajectories
5.3. Non-Gaussian case
5.3.1. Introduction
5.3.2. Symmetric α-stable processes
5.3.2.1. Stochastic measure
5.3.2.2. Ellipticity
5.3.3. Censov and Takenaka processes
5.3.4. Wavelet decomposition
5.3.5. Process subordinated to Brownian measure
5.4. Regularity and long-range dependence
5.4.1. Introduction
5.4.2. Two examples
5.4.2.1. A signal plus noise model
5.4.2.2. Filtered white noise
5.4.2.3. Long-range correlation
5.5. Bibliography

Chapter 6. Locally Self-similar Fields
Serge COHEN

6.1. Introduction
6.2. Recap of two representations of fractional Brownian motion
6.2.1. Reproducing kernel Hilbert space
6.2.2. Harmonizable representation
6.3. Two examples of locally self-similar fields
6.3.1. Definition of the local asymptotic self-similarity (LASS)
6.3.2. Filtered white noise (FWN)
6.3.3. Elliptic Gaussian random fields (EGRP)
6.4. Multifractional fields and trajectorial regularity
6.4.1. Two representations of the MBM
6.4.2. Study of the regularity of the trajectories of the MBM
6.4.3. Towards more irregularities: generalized multifractional Brownian motion (GMBM) and step fractional Brownian motion (SFBM)
6.4.3.1. Step fractional Brownian motion
6.4.3.2. Generalized multifractional Brownian motion
6.5. Estimate of regularity
6.5.1. General method: generalized quadratic variation
6.5.2. Application to the examples
6.5.2.1. Identification of filtered white noise
6.5.2.2. Identification of elliptic Gaussian random processes
6.5.2.3. Identification of MBM
6.5.2.4. Identification of SFBMs
6.6. Bibliography

Chapter 7. An Introduction to Fractional Calculus
Denis MATIGNON

7.1. Introduction
7.1.1. Motivations
7.1.1.1. Fields of application
7.1.1.2. Theories
7.1.2. Problems
7.1.3. Outline
7.2. Definitions
7.2.1. Fractional integration
7.2.2. Fractional derivatives within the framework of causal distributions
7.2.2.1. Motivation
7.2.2.2. Fundamental solutions
7.2.3. Mild fractional derivatives, in the Caputo sense
7.2.3.1. Motivation
7.2.3.2. Definition
7.2.3.3. Mittag-Leffler eigenfunctions
7.2.3.4. Fractional power series expansions of order α (α-FPSE)
7.3. Fractional differential equations
7.3.1. Example
7.3.1.1. Framework of causal distributions
7.3.1.2. Framework of fractional power series expansion of order one half
7.3.1.3. Notes
7.3.2. Framework of causal distributions
7.3.3. Framework of functions expandable into fractional power series (α-FPSE)
7.3.4. Asymptotic behavior of fundamental solutions
7.3.4.1. Asymptotic behavior at the origin
7.3.4.2. Asymptotic behavior at infinity
7.3.5. Controlled-and-observed linear dynamic systems of fractional order
7.4. Diffusive structure of fractional differential systems
7.4.1. Introduction to diffusive representations of pseudo-differential operators
7.4.2. General decomposition result
7.4.3. Connection with the concept of long memory
7.4.4. Particular case of fractional differential systems of commensurate orders
7.5. Example of a fractional partial differential equation
7.5.1. Physical problem considered
7.5.2. Spectral consequences
7.5.3. Time-domain consequences
7.5.3.1. Decomposition into wavetrains
7.5.3.2. Quasi-modal decomposition
7.5.3.3. Fractional modal decomposition
7.5.4. Free problem
7.6. Conclusion
7.7. Bibliography

Chapter 8. Fractional Synthesis, Fractional Filters
Liliane BEL, Georges OPPENHEIM, Luc ROBBIANO and Marie-Claude VIANO

8.1. Traditional and less traditional questions about fractionals
8.1.1. Notes on terminology
8.1.2. Short and long memory
8.1.3. From integer to non-integer powers: filter based sample path design
8.1.4. Local and global properties
8.2. Fractional filters
8.2.1. Desired general properties: association
8.2.2. Construction and approximation techniques
8.3. Discrete time fractional processes
8.3.1. Filters: impulse responses and corresponding processes
8.3.2. Mixing and memory properties
8.3.3. Parameter estimation
8.3.4. Simulated example
8.4. Continuous time fractional processes
8.4.1. A non-self-similar family: fractional processes designed from fractional filters
8.4.2. Sample path properties: local and global regularity, memory
8.5. Distribution processes
8.5.1. Motivation and generalization of distribution processes
8.5.2. The family of linear distribution processes
8.5.3. Fractional distribution processes
8.5.4. Mixing and memory properties
8.6. Bibliography

Chapter 9. Iterated Function Systems and Some Generalizations: Local Regularity Analysis and Multifractal Modeling of Signals
Khalid DAOUDI

9.1. Introduction
9.2. Definition of the Hölder exponent
9.3. Iterated function systems (IFS)
9.4. Generalization of iterated function systems
9.4.1. Semi-generalized iterated function systems
9.4.2. Generalized iterated function systems
9.5. Estimation of pointwise Hölder exponent by GIFS
9.5.1. Principles of the method
9.5.2. Algorithm
9.5.3. Application
9.6. Weak self-similar functions and multifractal formalism
9.7. Signal representation by WSA functions
9.8. Segmentation of signals by weak self-similar functions
9.9. Estimation of the multifractal spectrum
9.10. Experiments
9.11. Bibliography

Chapter 10. Iterated Function Systems and Applications in Image Processing
Franck DAVOINE and Jean-Marc CHASSERY

10.1. Introduction
10.2. Iterated transformation systems
10.2.1. Contracting transformations and iterated transformation systems
10.2.1.1. Lipschitzian transformation
10.2.1.2. Contracting transformation
10.2.1.3. Fixed point
10.2.1.4. Hausdorff distance
10.2.1.5. Contracting transformation on the space H(R²)
10.2.1.6. Iterated transformation system
10.2.2. Attractor of an iterated transformation system
10.2.3. Collage theorem
10.2.4. Finally contracting transformation
10.2.5. Attractor and invariant measures
10.2.6. Inverse problem
10.3. Application to natural image processing: image coding
10.3.1. Introduction
10.3.2. Coding of natural images by fractals
10.3.2.1. Collage of a source block onto a destination block
10.3.2.2. Hierarchical partitioning
10.3.2.3. Coding of the collage operation on a destination block
10.3.2.4. Contraction control of the fractal transformation
10.3.3. Algebraic formulation of the fractal transformation
10.3.3.1. Formulation of the mass transformation
10.3.3.2. Contraction control of the fractal transformation
10.3.3.3. Fisher formulation
10.3.4. Experimentation on triangular partitions
10.3.5. Coding and decoding acceleration
10.3.5.1. Coding simplification suppressing the research for similarities
10.3.5.2. Decoding simplification by collage space orthogonalization
10.3.5.3. Coding acceleration: search for the nearest neighbor
10.3.6. Other optimization diagrams: hybrid methods
10.4. Bibliography

Chapter 11. Local Regularity and Multifractal Methods for Image and Signal Analysis
Pierrick LEGRAND

11.1. Introduction
11.2. Basic tools
11.2.1. Hölder regularity analysis
11.2.2. Reminders on multifractal analysis
11.2.2.1. Hausdorff multifractal spectrum
11.2.2.2. Large deviation multifractal spectrum
11.2.2.3. Legendre multifractal spectrum
11.3. Hölderian regularity estimation
11.3.1. Oscillations (OSC)
11.3.2. Wavelet coefficient regression (WCR)
11.3.3. Wavelet leaders regression (WL)
11.3.4. Limit inf and limit sup regressions
11.3.5. Numerical experiments
11.4. Denoising
11.4.1. Introduction
11.4.2. Minimax risk, optimal convergence rate and adaptivity
11.4.3. Wavelet based denoising
11.4.4. Non-linear wavelet coefficients pumping
11.4.4.1. Minimax properties
11.4.4.2. Regularity control
11.4.4.3. Numerical experiments
11.4.5. Denoising using exponent between scales
11.4.5.1. Introduction
11.4.5.2. Estimating the local regularity of a signal from noisy observations
11.4.5.3. Numerical experiments
11.4.6. Bayesian multifractal denoising
11.4.6.1. Introduction
11.4.6.2. The set of parameterized classes S(g, ψ)
11.4.6.3. Bayesian denoising in S(g, ψ)
11.4.6.4. Numerical experiments
11.4.6.5. Denoising of road profiles
11.5. Hölderian regularity based interpolation
11.5.1. Introduction
11.5.2. The method
11.5.3. Regularity and asymptotic properties
11.5.4. Numerical experiments
11.6. Biomedical signal analysis
11.7. Texture segmentation
11.8. Edge detection
11.8.1. Introduction
11.8.1.1. Edge detection
11.9. Change detection in image sequences using multifractal analysis
11.10. Image reconstruction
11.11. Bibliography

Chapter 12. Scale Invariance in Computer Network Traffic . . . . . . . . . 413 Darryl V EITCH 12.1. Teletrafﬁc – a new natural phenomenon . . . . . . . 12.1.1. A phenomenon of scales . . . . . . . . . . . . . 12.1.2. An experimental science of “man-made atoms” 12.1.3. A random current . . . . . . . . . . . . . . . . . 12.1.4. Two fundamental approaches . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

413 413 415 416 417

14

Scaling, Fractals and Wavelets

12.2. From a wealth of scales arise scaling laws . . 12.2.1. First discoveries . . . . . . . . . . . . . . 12.2.2. Laws reign . . . . . . . . . . . . . . . . . 12.2.3. Beyond the revolution . . . . . . . . . . 12.3. Sources as the source of the laws . . . . . . . 12.3.1. The sum or its parts . . . . . . . . . . . . 12.3.2. The on/off paradigm . . . . . . . . . . . 12.3.3. Chemistry . . . . . . . . . . . . . . . . . 12.3.4. Mechanisms . . . . . . . . . . . . . . . . 12.4. New models, new behaviors . . . . . . . . . . 12.4.1. Character of a model . . . . . . . . . . . 12.4.2. The fractional Brownian motion family 12.4.3. Greedy sources . . . . . . . . . . . . . . 12.4.4. Never-ending calls . . . . . . . . . . . . 12.5. Perspectives . . . . . . . . . . . . . . . . . . . 12.6. Bibliography . . . . . . . . . . . . . . . . . . .


Chapter 13. Research of Scaling Law on Stock Market Variations
Christian WALTER
13.1. Introduction: fractals in finance
13.2. Presence of scales in the study of stock market variations
13.2.1. Modeling of stock market variations
13.2.1.1. Statistical apprehension of stock market fluctuations
13.2.1.2. Profit and stock market return operations in different scales
13.2.1.3. Traditional financial modeling: Brownian motion
13.2.2. Time scales in financial modeling
13.2.2.1. The existence of characteristic time
13.2.2.2. Implicit scaling invariances of traditional financial modeling
13.3. Modeling postulating independence on stock market returns
13.3.1. 1960–1970: from Pareto’s law to Lévy’s distributions
13.3.1.1. Leptokurtic problem and Mandelbrot’s first model
13.3.1.2. First emphasis of Lévy’s α-stable distributions in finance
13.3.2. 1970–1990: experimental difficulties of iid-α-stable model
13.3.2.1. Statistical problem of parameter estimation of stable laws
13.3.2.2. Non-normality and controversies on scaling invariance
13.3.2.3. Scaling anomalies of parameters under iid hypothesis
13.3.3. Unstable iid models in partial scaling invariance
13.3.3.1. Partial scaling invariances by regime switching models
13.3.3.2. Partial scaling invariances as compared with extremes
13.4. Research of dependency and memory of markets
13.4.1. Linear dependence: testing of H-correlative models on returns
13.4.1.1. Question of dependency of stock market returns
13.4.1.2. Problem of slow cycles and Mandelbrot’s second model


13.4.1.3. Introduction of fractional differentiation in econometrics
13.4.1.4. Experimental difficulties of H-correlative model on returns
13.4.2. Non-linear dependence: validating H-correlative model on volatilities
13.4.2.1. The 1980s: ARCH modeling and its limits
13.4.2.2. The 1990s: emphasis of long dependence on volatility
13.5. Towards a rediscovery of scaling laws in finance
13.6. Bibliography

Chapter 14. Scale Relativity, Non-differentiability and Fractal Space-time
Laurent NOTTALE


14.1. Introduction
14.2. Abandonment of the hypothesis of space-time differentiability
14.3. Towards a fractal space-time
14.3.1. Explicit dependence of coordinates on spatio-temporal resolutions
14.3.2. From continuity and non-differentiability to fractality
14.3.3. Description of non-differentiable process by differential equations
14.3.4. Differential dilation operator
14.4. Relativity and scale covariance
14.5. Scale differential equations
14.5.1. Constant fractal dimension: “Galilean” scale relativity
14.5.2. Breaking scale invariance: transition scales
14.5.3. Non-linear scale laws: second order equations, discrete scale invariance, log-periodic laws
14.5.4. Variable fractal dimension: Euler-Lagrange scale equations
14.5.5. Scale dynamics and scale force
14.5.5.1. Constant scale force
14.5.5.2. Scale harmonic oscillator
14.5.6. Special scale relativity – log-Lorentzian dilation laws, invariant scale limit under dilations
14.5.7. Generalized scale relativity and scale-motion coupling
14.5.7.1. A reminder about gauge invariance
14.5.7.2. Nature of gauge fields
14.5.7.3. Nature of the charges
14.5.7.4. Mass-charge relations
14.6. Quantum-like induced dynamics
14.6.1. Generalized Schrödinger equation
14.6.2. Application in gravitational structure formation
14.7. Conclusion
14.8. Bibliography

List of Authors
Index


Preface

It is a common scheme in many sciences to study systems or signals by looking for characteristic scales in time or space. These are then used as references for expressing all measured quantities. Physicists may for instance employ the size of a structure, while signal processors are often interested in correlation lengths: (blocks of) samples whose distance is several times the correlation length are considered statistically independent.

The concept of scale invariance may be considered to be the converse of this approach: it means that there is no characteristic scale in the system. In other words, all scales contribute to the observed phenomenon. This “non-property” is also loosely referred to as scaling law or scaling behavior. Note that we may reverse the perspective and consider scale invariance as the signature of a strong organization in the system. Indeed, it is well known in physics that invariance laws are associated with fundamental properties.

It is remarkable that phenomena where scaling laws have been observed cover a wide range of fields, both in natural and artificial systems. In the first category, these include for instance hydrology, in relation to the variability of water levels, hydrodynamics and the study of turbulence, statistical physics with the study of long-range interactions, electronics with the so-called 1/f noise in semiconductors, geophysics with the distribution of faults, biology, physiology and the variability of human body rhythms such as the heart rate. In the second category, we may mention geography with the distribution of population in cities or in continents, Internet traffic and financial markets.

From a signal processing perspective, the aim is then to study transfer mechanisms between scales (also called “cascades”) rather than to identify relevant scales. We are thus led to forget about scale-based models (such as Markov models), and to focus on models allowing us to study correspondences between many scales.
The central notion behind scaling laws is that of self-similarity. Loosely speaking, this means that each part is (statistically) the same as the whole object. In particular, information gathered from observing the data should be independent of the scale of observation.


There is considerable variety in observed self-similar behaviors. They may for instance appear through scaling laws in the Fourier domain, either at all frequencies or in a finite but large range of frequencies, or even in the limit of high or low frequencies. In many cases, studying second-order quantities such as spectra will prove insufficient for describing scaling laws. Higher-order moments are then necessary.

More generally, the fundamental model of self-similarity has to be adapted in many settings, and to be generalized in various directions, so that it becomes useful in real-world situations. These include self-similar stochastic processes, 1/f processes, long memory processes, multifractal and multifractional processes, locally self-similar processes and more. Multifractal analysis, in particular, has developed as a method allowing us to study complex objects which are not necessarily “fractal”, by describing the variations of local regularity. The recent change of paradigm consisting of using fractal methods rather than studying fractal objects is one of the reasons for the success of the domain in applications.

We are delighted to invite our reader for a promenade in the realm of scaling laws, its mathematical models and its real-world manifestations. The 14 chapters have all been written by experts. The first four chapters deal with the general mathematical tools allowing us to measure fractional dimensions, local regularity and scaling in its various disguises. Wavelets play a particular role for this purpose, and their role is emphasized. Chapters 5 and 6 describe advanced stochastic models relevant in our area. Chapter 7 deals with fractional calculus, and Chapter 8 explains how to synthesize certain fractal models. Chapter 9 gives a general introduction to IFS, a powerful tool for building and describing fractals and other complex objects, while Chapter 10, of applied nature, considers the application of IFS to image compression. The four remaining chapters also deal with applications: various signal and image processing tasks are considered in Chapter 11. Chapter 12 deals with Internet traffic, and Chapter 13 with financial data analysis. Finally, Chapter 14 describes a fractal space-time in the frame of cosmology.

It is a great pleasure for us to thank all the authors of this volume for the quality of their contribution. We believe they have succeeded in presenting advanced concepts with great pedagogical clarity.

Chapter 1

Fractal and Multifractal Analysis in Signal Processing

1.1. Introduction

The aim of this chapter is to describe some of the fundamental concepts of fractal analysis in view of their application. We will thus present a simple introduction to the concepts of fractional dimension, regularity exponents and multifractal analysis, and show how they are used in signal and image processing. Since we are interested in applications, most theoretical results are given without proofs. These are available in the references mentioned where appropriate. In contrast, we will pay special attention to the practical aspects. In particular, almost all the notions explained below are implemented in the FracLab toolbox. This toolbox is freely available from the following site: http://complex.futurs.inria.fr/FracLab/, so that interested readers may perform hands-on experiments.

Before we start, we wish to emphasize the following point: recent successes of fractal analysis in signal and image processing do not generally stem from the fact that they are applied to fractal objects (in a more or less strict sense). Indeed, most real-world signals are neither self-similar nor display the characteristics usually associated with fractals (except for the irregularity at each scale). The relevance of fractal analysis instead results from the progress made in the development of fractal methods. Such methods have lately become more general and reliable, and they now make it possible to describe precisely the singular structure of complex signals,

Chapter written by Jacques LÉVY VÉHEL and Claude TRICOT.


without any assumption of “fractality”: as a rule, performing a fractal analysis will be useful as soon as the considered signal is irregular and this irregularity contains meaningful information. There are numerous examples of such situations, ranging from image segmentation (where, for instance, contours are made of singular points; see section 1.4.7 and Chapter 11) to vocal synthesis [DAO 02] or financial analysis.

This chapter roughly follows the chronological order in which the various tools have been introduced. We first describe several notions of fractional dimensions. These provide a global characterization of a signal. We then introduce Hölder exponents, which supply local measures of irregularity. The last part of the chapter is devoted to multifractal analysis, a most refined tool that describes the local as well as the overall singular structure of signals. All the concepts presented here are more fully developed in [TRI 99, LEV 02].

1.2. Dimensions of sets

The concept of dimension applies to objects more general than signals. To simplify, we shall consider sets in a metric space, although the notion of dimension makes sense for more complex entities such as measures or classes of functions [KOL 61]. Several interesting notions of dimension exist. This might look like a drawback for the mathematical analysis of fractal sets. However, it is actually an advantage, since each dimension emphasizes a different aspect of an object. It is thus worthwhile to determine the specificity of each dimension. As a general rule, none of these tools outperforms the others. Let us first give a general definition of the notion of dimension.
DEFINITION 1.1.– We call dimension a mapping $d$ defined on the family of bounded sets of $\mathbb{R}^n$, with values in $\mathbb{R}^+ \cup \{-\infty\}$, such that:
1) $d(\emptyset) = -\infty$, and $d(\{x\}) = 0$ for any point $x$;
2) $E_1 \subset E_2 \Rightarrow d(E_1) \le d(E_2)$ (monotonicity);
3) if $E$ has non-zero $n$-dimensional volume, then $d(E) = n$;
4) if $T$ is a diffeomorphism of $\mathbb{R}^n$ (such as, in particular, a similarity with non-zero ratio, or a non-singular affine mapping), then $d(T(E)) = d(E)$ (invariance).

Moreover, we will say that $d$ is stable if $d(E_1 \cup E_2) = \max\{d(E_1), d(E_2)\}$. It is said to be σ-stable if, for any countable collection of sets $(E_n)$:
\[ d\Big(\bigcup_n E_n\Big) = \sup_n d(E_n). \]
σ-stable dimensions may be extended in a natural way to characterize unbounded sets of $\mathbb{R}^n$.


1.2.1. Minkowski-Bouligand dimension

The Minkowski-Bouligand dimension was invented by Bouligand [BOU 28], who named it the Cantor-Minkowski order. It is now commonly referred to as the box dimension. Let us cover a bounded set $E$ of $\mathbb{R}^n$ with cubes of side $\varepsilon$ and disjoint interiors. Let $N_\varepsilon(E)$ be the number of these cubes. When $E$ contains an infinite number of points (i.e. if it is a curve, a surface, etc.), $N_\varepsilon(E)$ tends to $+\infty$ when $\varepsilon$ tends to 0. The box dimension $\Delta$ characterizes the rate of this growth. Roughly speaking, $\Delta$ is the real number such that:
\[ N_\varepsilon(E) \simeq \Big(\frac{1}{\varepsilon}\Big)^{\Delta}, \]
assuming this number exists. More generally, we define, for all bounded $E$, the number:
\[ \Delta(E) = \limsup_{\varepsilon \to 0} \frac{\log N_\varepsilon(E)}{|\log \varepsilon|} \tag{1.1} \]
A lower limit may also be used:
\[ \delta(E) = \liminf_{\varepsilon \to 0} \frac{\log N_\varepsilon(E)}{|\log \varepsilon|} \tag{1.2} \]
Note that some authors refer to the box dimension only when both indices coincide, that is, when the limit exists. Both indices $\Delta$ and $\delta$ are dimensions in the sense previously defined. However, $\Delta$ is stable, unlike $\delta$, so that $\Delta$ is more commonly used. Let us mention an important property: if $\bar{E}$ denotes the closure of $E$ (the set of all limit points of sequences in $E$), then:
\[ \Delta(\bar{E}) = \Delta(E) \]
This property shows that $\Delta$ is not sensitive to the topological type of $E$. It only characterizes the density of a set. For example, the (countable) set of the rational numbers of the interval $[0, 1]$ has dimension 1, which is the dimension of the interval itself. Even discrete sequences may have non-zero dimension: let, for instance, $E$ be the set of numbers $n^{-\alpha}$ ($n = 1, 2, \ldots$) with $\alpha > 0$. Then $\Delta(E) = 1/(\alpha + 1)$.

Equivalent definitions

It is not mandatory to use cubes to calculate $\Delta$. The original definition of Bouligand is as follows:


– in $\mathbb{R}^n$, let us consider the Minkowski sausage:
\[ E(\varepsilon) = \bigcup_{x \in E} B_\varepsilon(x) \]
which is the union of all the balls of radius $\varepsilon$ centered at points of $E$. Denote its volume by $\mathrm{Vol}_n(E(\varepsilon))$. This volume is approximately of the order of $N_\varepsilon(E)\,\varepsilon^n$. This allows us to give the equivalent definition:
\[ \Delta(E) = \limsup_{\varepsilon \to 0} \Big( n - \frac{\log \mathrm{Vol}_n(E(\varepsilon))}{\log \varepsilon} \Big); \tag{1.3} \]
– we may also define $N'_\varepsilon(E)$, which is the smallest number of balls of radius $\varepsilon$ covering $E$; or $N''_\varepsilon(E)$, the largest number of disjoint balls of radius $\varepsilon$ centered on $E$. Replacing $N_\varepsilon(E)$ by any of these values in equation (1.1) still gives $\Delta(E)$.

Discrete values of $\varepsilon$

In these definitions, the variable $\varepsilon$ is continuous. The results remain the same if we use a discrete sequence such as $\varepsilon_n = 2^{-n}$. More generally we may replace $\varepsilon$ with any sequence which does not converge too quickly towards 0. More precisely, we require that:
\[ \lim_{n \to \infty} \frac{\log \varepsilon_n}{\log \varepsilon_{n+1}} = 1. \]

This remark is important, as it allows us to perform numerical estimations of $\Delta$. Let us now give some well-known examples of calculating dimensions.

EXAMPLE 1.1.– Let $(a_n)$ be a sequence of real numbers such that $a_0 = 1$ and $0 < 2a_{n+1} < a_n$ for all $n$. Let $E_0 = [0, 1]$. We construct by induction a sequence of sets $(E_n)$ such that $E_n$ is made of $2^n$ closed disjoint intervals of length $a_n$, each containing exactly two intervals of $E_{n+1}$. The sets $E_n$ are nested, and the sequence $(E_n)$ converges to a compact set $E$ such that:
\[ E = \bigcap_n E_n. \]
Let us consider a particular case. When all the endpoints of the intervals of $E_n$ are also endpoints of intervals of $E_{n+1}$, $E$ is called a perfect symmetric set [KAH 63] or sometimes, more loosely, a Cantor set. Assume that the ratio $\log a_n / \log a_{n+1}$ tends to 1. According to the previous comment on discrete sequences, we obtain the following values:
\[ \delta(E) = \liminf_{n \to \infty} \frac{n \log 2}{|\log a_n|}, \qquad \Delta(E) = \limsup_{n \to \infty} \frac{n \log 2}{|\log a_n|}. \]


However, these results are true for any sequence $(a_n)$. Even more specifically, consider the case where $a_n = a^n$, with $0 < a < 1/2$. The ratios $a_n/a_{n+1}$ are then constant and both dimensions take the common value $\log 2/|\log a|$. This is the case of the self-similar set which satisfies the following relation:
\[ E = f_1(E) \cup f_2(E) \]
with $f_1(x) = a\,x$ and $f_2(x) = a\,x + 1 - a$. This set is the attractor of the iterated function system $\{f_1, f_2\}$ (see Chapters 9 and 10). It is also called a perfect symmetric set with constant ratio.

EXAMPLE 1.2.– We construct a planar self-similar curve with extremities $A$ and $B$, $A \ne B$, as follows: take $N + 1$ distinct points $A_1 = A, A_2, \ldots, A_{N+1} = B$, such that $\mathrm{dist}(A_i, A_{i+1}) < \mathrm{dist}(A, B)$. For each $i = 1, \ldots, N$, define a similarity $f_i$ (that is, a composition of a homothety, an orthogonal transformation and a translation), such that $f_i(AB) = A_i A_{i+1}$. The ratio of $f_i$ is $a_i = \mathrm{dist}(A_i, A_{i+1})/\mathrm{dist}(A, B)$. Starting from the segment $\Gamma_0 = AB$, define by induction the polygonal curves $\Gamma_n = \bigcup_i f_i(\Gamma_{n-1})$. This sequence $(\Gamma_n)$ converges to a curve $\Gamma$ which satisfies the following relation:
\[ \Gamma = \bigcup_i f_i(\Gamma). \]
In other words, $\Gamma$ is the attractor of the IFS $\{f_1, \ldots, f_N\}$. When $\Gamma$ is simple, the dimensions $\delta$ and $\Delta$ assume a common value, which is also the similarity dimension, i.e. the unique solution $x$ of the equation
\[ \sum_{i=1}^{N} a_i^x = 1. \]
In the particular case where all distances $\mathrm{dist}(A_i, A_{i+1})$ are the same, the ratios $a_i$ are equal to a value $a$ such that $Na > 1$ (necessary condition for the continuity of $\Gamma$) and $Na^2 < 1$ (necessary condition for the simplicity of $\Gamma$). Clearly, $\delta(\Gamma) = \Delta(\Gamma) = \log N/|\log a|$.

Figure 1.1. Von Koch curve, the attractor of a system of four similarities with common ratio 1/3
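The similarity dimension of Example 1.2 can be computed numerically even when the ratios differ. The following Python sketch is only an illustration: the function name `similarity_dimension`, the bisection scheme and the tolerance are assumptions made here, not part of the original text or of FracLab. It solves the equation "sum of the ratios raised to the power x equals 1" and checks it on the von Koch curve (four similarities of ratio 1/3) and on the perfect symmetric set with constant ratio 1/3.

```python
import math

def similarity_dimension(ratios, tol=1e-12):
    """Solve sum(a_i ** x) == 1 for x by bisection.

    Assumes every ratio satisfies 0 < a_i < 1, so that the left-hand side
    is strictly decreasing in x and the solution is unique.
    """
    if not ratios or any(not (0.0 < a < 1.0) for a in ratios):
        raise ValueError("ratios must be a non-empty sequence of values in (0, 1)")

    def g(x):
        return sum(a ** x for a in ratios) - 1.0

    lo, hi = 0.0, 1.0
    # g(0) = N - 1 >= 0; enlarge the bracket until g(hi) <= 0.
    while g(hi) > 0.0:
        hi *= 2.0
    # Standard bisection on the bracket [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

koch = similarity_dimension([1 / 3] * 4)     # von Koch curve: N = 4, a = 1/3
cantor = similarity_dimension([1 / 3] * 2)   # constant-ratio Cantor set: N = 2, a = 1/3
print(koch, math.log(4) / math.log(3))
print(cantor, math.log(2) / math.log(3))
```

With equal ratios the numerical root agrees with the closed form log N/|log a|; with unequal ratios no closed form is available in general, which is where a root-finder of this kind becomes useful.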


Function scales

The previous definitions all involve ratios of logarithms. This is an immediate consequence of the fact that a dimension is defined as an order of growth related to the scale of functions $\{t^\alpha, \alpha > 0\}$. In general, a scale of functions $\mathcal{F}$ in the neighborhood of 0 is a family of functions which are all comparable in the Hardy sense, that is, for any $f$ and $g$ in $\mathcal{F}$, the ratio $f(x)/g(x)$ tends to a limit (possibly $+\infty$ or $-\infty$) when $x$ tends to 0. Function scales are defined in a similar way in the neighborhood of $+\infty$. Scales other than $\{t^\alpha\}$ will yield other types of dimensions. A dimension must be considered as a Dedekind cut in a given scale of functions. The following expressions will make this clearer:
\[ \Delta(E) = \inf\{\alpha \text{ such that } \varepsilon^\alpha N_\varepsilon(E) \to 0\} \tag{1.4} \]
\[ \delta(E) = \sup\{\alpha \text{ such that } \varepsilon^\alpha N_\varepsilon(E) \to +\infty\} \tag{1.5} \]
These are equivalent to equations (1.1) and (1.2) (see [TRI 99]).

Complementary intervals on the line

In the particular case where the compact set $E$ lies in an interval $J$ of the line, the complementary set of $E$ in $J$ is a union of disjoint open intervals, whose lengths will be denoted by $c_n$. Let $|E|$ be the Lebesgue measure of $E$ (which means, for an interval, its length). The dimension of $E$ may be written as:
\[ \Delta(E) = \limsup_{\varepsilon \to 0} \Big( 1 - \frac{\log |E(\varepsilon)|}{\log \varepsilon} \Big) \]
If $|E| = 0$, the sum of the $c_n$ is equal to the length of $J$. The dimension is then equal to the convergence exponent of the series $\sum c_n$:
\[ \Delta(E) = \inf\Big\{\alpha \text{ such that } \sum_n c_n^\alpha < +\infty\Big\} \tag{1.6} \]

Proof. This result may be obtained by calculating an approximation of the length of the Minkowski sausage $E(\varepsilon)$. Let us assume that the complementary intervals are ranked by decreasing length:
\[ c_1 \ge c_2 \ge \cdots \ge c_n \ge \cdots \]
If $|E| = 0$ and if $c_n \ge \varepsilon > c_{n+1}$, then:
\[ |E(\varepsilon)| \simeq n\varepsilon + \sum_{i > n} c_i \]
thus
\[ \varepsilon^{\alpha-1} |E(\varepsilon)| \simeq n\varepsilon^{\alpha} + \varepsilon^{\alpha-1} \sum_{i > n} c_i. \]
It may be shown that both values
\[ \inf\{\alpha \text{ such that } n\varepsilon^{\alpha} < +\infty\} \quad \text{and} \quad \inf\Big\{\alpha \text{ such that } \varepsilon^{\alpha-1} \sum_{i > n} c_i < +\infty\Big\} \]
are equal to the convergence exponent. It is therefore equal to $\Delta(E)$.

EXERCISE 1.1.– Verify formula (1.6) for the perfect symmetric sets of Example 1.1.

If $|E| \ne 0$, then the convergence exponent of $\sum c_n$ still makes sense. It characterizes a degree of proximity of the exterior with the set $E$. More precisely, we obtain
\[ \inf\Big\{\alpha \text{ such that } \sum_n c_n^\alpha < +\infty\Big\} = \limsup_{\varepsilon \to 0} \Big( 1 - \frac{\log |E(\varepsilon) - E|}{\log \varepsilon} \Big) \tag{1.7} \]
where the set $E(\varepsilon) - E$ refers to the Minkowski sausage of $E$ deprived of the points of $E$.

How can we generalize the study of the complementary set in $\mathbb{R}^n$ with $n \ge 2$? The open intervals must be replaced with an appropriate paving. The results connecting the elements of this paving to the dimension depend both on the geometry of the tiles and on their respective positions. The topology of the complementary set must be investigated more deeply [TRI 87]. The index that generalizes (1.7) (replacing the 1 of the space dimension by $n$) is the fat fractal exponent, studied in [GRE 85, TRI 86b]. In the case of a zero area curve in $\mathbb{R}^2$, this also leads to the notion of lateral dimension. Note that the dimensions corresponding to each side of the curve are not necessarily equal [TRI 99].

1.2.2. Packing dimension

The packing dimension is, to some extent, a regularization of the box dimension [TRI 82]. Indeed, $\Delta$ is not σ-stable, but we may derive a σ-stable dimension from any index thanks to the operation described below.

PROPOSITION 1.1.– Let $\mathcal{B}$ be the family of all bounded sets of $\mathbb{R}^n$ and $\alpha : \mathcal{B} \to \mathbb{R}^+$. Then, the function $\hat{\alpha}$ defined for any subset $E$ of $\mathbb{R}^n$ as:
\[ \hat{\alpha}(E) = \inf\Big\{ \sup_i \alpha(E_i) \;:\; E = \bigcup_i E_i,\; E_i \in \mathcal{B} \Big\} \]
is monotonic and σ-stable.


Proof. Any subset $E$ of $\mathbb{R}^n$ is a union of bounded sets. If $E_1 \subset E_2$, then any covering of $E_1$ may be completed into a covering of $E_2$. This entails monotonicity. Now, let $\varepsilon > 0$ and let $(E_k)_{k \ge 1}$ be a sequence of sets whose union is $E$. For any $k$, there exists a decomposition $(E_{i,k})$ of $E_k$ such that $\sup_i \alpha(E_{i,k}) \le \hat{\alpha}(E_k) + \varepsilon 2^{-k}$. Since $E = \bigcup_{i,k} E_{i,k}$, we deduce that:
\[ \hat{\alpha}(E) \le \sup_k \hat{\alpha}(E_k) + \varepsilon \sum_k 2^{-k} = \sup_k \hat{\alpha}(E_k) + \varepsilon \]
Thus, the inequality $\hat{\alpha}(E) \le \sup_k \hat{\alpha}(E_k)$ holds. The converse inequality stems from monotonicity.

The packing dimension is the result of this operation on $\Delta$. We set
\[ \mathrm{Dim} = \hat{\Delta} \]
The term packing will be explained later. The new index Dim is indeed a dimension, and it is σ-stable. Therefore, unlike $\Delta$, it vanishes for countable sets. The inequality:
\[ \mathrm{Dim}(E) \le \Delta(E) \]
is true for any bounded set. This becomes an equality when $E$ presents a homogeneous structure in the following sense:

THEOREM 1.1.– Let $E$ be a compact set such that, for all open sets $U$ intersecting $E$, $\Delta(E \cap U) = \Delta(E)$. Then $\Delta(E) = \mathrm{Dim}(E)$.

Proof. Let $(E_i)$ be a decomposition of $E$. Since $E$ is compact, a Baire theorem entails that the $E_i$ are not all nowhere dense in $E$. Therefore, there exist an index $i_0$ and an open set $U$ intersecting $E$ such that $\bar{E}_{i_0} \cap U = \bar{E} \cap U$, which yields:
\[ \Delta(E_{i_0}) = \Delta(\bar{E}_{i_0}) \ge \Delta(\bar{E}_{i_0} \cap U) = \Delta(\bar{E} \cap U) \ge \Delta(E \cap U) = \Delta(E) \]
As a result, $\Delta(E) \le \sup_i \Delta(E_i)$, and thus $\Delta(E) \le \mathrm{Dim}(E)$. The converse inequality is always true.

EXAMPLE 1.3.– All self-similar sets are of this type, including those presented above: Cantor sets and curves. For these sets, the packing dimension has the same value as $\Delta(E)$.


EXAMPLE 1.4.– Dense sets in $[0, 1]$, when they are not compact, do not necessarily have a packing dimension equal to 1. Let us consider, for any real $p$, $0 < p < 1$, the set $E_p$ of $p$-normal numbers, that is, those numbers whose frequency of zeros in their dyadic expansion is equal to $p$. Any dyadic interval of $[0, 1]$, however small it may be, contains points of $E_p$, so $E_p$ is dense in $[0, 1]$. As a consequence, $\Delta(E_p) = 1$. In contrast, the value of $\mathrm{Dim}(E_p)$ is:
\[ \mathrm{Dim}(E_p) = -\frac{1}{\log 2} \big[ p \log p + (1 - p) \log(1 - p) \big]. \]
This result will be derived in section 1.3.2.

1.2.3. Covering dimension

The covering dimension was introduced by Hausdorff [HAU 19]. Here we adopt the traditional approach through Hausdorff measures; a direct approach, using Vitali’s covering convergence exponent, may be used to calculate the dimension without using measures [TRI 99].

Covering measures

Originally, the covering measures were defined to generalize and, most of all, to precisely define the concepts of length, surface, volume, etc. They constitute an important tool in geometric measure theory. Firstly, let us consider a determining function $\varphi : \mathbb{R}^+ \to \mathbb{R}^+$, which is increasing and continuous in the neighborhood of 0, and such that $\varphi(0) = 0$. Let $E$ be a set in a metric space (that is, a space where a distance has been defined). For every $\varepsilon > 0$, we consider all the coverings of $E$ by bounded sets $U_i$ of diameter $\mathrm{diam}(U_i) \le \varepsilon$. Let
\[ H^\varphi_\varepsilon(E) = \inf\Big\{ \sum_i \varphi\big(\mathrm{diam}(U_i)\big) \;:\; E \subset \bigcup_i U_i,\; \mathrm{diam}(U_i) \le \varepsilon \Big\}. \]
When $\varepsilon$ tends to 0, this quantity (possibly infinite) cannot decrease. The limit corresponds to the $\varphi$-Hausdorff measure:
\[ H^\varphi(E) = \lim_{\varepsilon \to 0} H^\varphi_\varepsilon(E) \]

In this definition, the covering sets $U_i$ can be taken in a more restricted family. If we suppose that the $U_i$ are open, or convex, the result remains unchanged. The main properties are those of any Borel measure:
– $E_1 \subset E_2 \Longrightarrow H^\varphi(E_1) \le H^\varphi(E_2)$;


– if $(E_i)$ is a countable collection of sets, then
\[ H^\varphi\Big(\bigcup_i E_i\Big) \le \sum_i H^\varphi(E_i); \]
– if $E_1$ and $E_2$ are at non-zero distance from each other, any $\varepsilon$-covering of $E_1$ is disjoint from any $\varepsilon$-covering of $E_2$ when $\varepsilon$ is sufficiently small. Then $H^\varphi(E_1 \cup E_2) = H^\varphi(E_1) + H^\varphi(E_2)$.

This implies that $H^\varphi$ is a metric measure. The Borel sets are $H^\varphi$-measurable and, for any countable collection $(E_i)$ of disjoint Borel sets, $H^\varphi(\bigcup_i E_i) = \sum_i H^\varphi(E_i)$.

The scale of functions $t^\alpha$

In the case where $\varphi(t) = t^\alpha$ with $\alpha > 0$, we use the simple notation $H^\varphi = H^\alpha$. Consider the case $\alpha = 1$. For any curve $\Gamma$ the value $H^1(\Gamma)$ is equal to the length of $\Gamma$. Therefore $H^1$ is a generalization of the concept of length: it may be applied to any subset of the metric space. Now let $\alpha = 2$. For any plane surface $S$, the value of $H^2(S)$ is proportional to the area of $S$. For non-plane surfaces, $H^2$ provides an appropriate mathematical definition of area; using a triangulation of $S$ is not acceptable from a theoretical point of view. More generally, when $\alpha$ is an integer, $H^\alpha$ is proportional to the $\alpha$-dimensional volume. However, $\alpha$ can also take non-integer values, which makes it possible to define the dimension of any set. The use of the term dimension is justified by the following property: if $aE$ is the image of $E$ by a homothety of ratio $a$, then
\[ H^\alpha(aE) = a^\alpha H^\alpha(E) \]

Measures estimated using boxes

If we want to restrict the class of sets from which coverings are taken even more, one option would be to cover $E$ with centered balls or dyadic boxes. In each case, the result is a measure $H^{*\alpha}$ which is generally not equal to $H^\alpha$; nevertheless, it is an equivalent measure in the sense that we can find two non-zero constants, $c_1$ and $c_2$, such that for any $E$:
\[ c_1 H^\alpha(E) \le H^{*\alpha}(E) \le c_2 H^\alpha(E) \]
Clearly the $H^{*\alpha}$ measures give rise to the same dimension.


Dimension

For every $E$, the measures $H^\alpha(E)$ satisfy:
– if $H^\alpha(E) > 0$ and $\beta < \alpha$, then $H^\beta(E) = +\infty$;
– if $H^\alpha(E) < +\infty$ and $\beta > \alpha$, then $H^\beta(E) = 0$.

There is therefore a unique critical value of $\alpha$, which defines the dimension:
\[ \dim(E) = \inf\{\alpha \text{ such that } H^\alpha(E) = 0\} = \sup\{\alpha \text{ such that } H^\alpha(E) = +\infty\} \tag{1.8} \]

NOTE 1.1.– This approach is not very different from the one which leads to the box dimension. Compare equation (1.8) with equations (1.4) and (1.5). Once again, it may be generalized by using function scales other than $t^\alpha$.

Properties

The properties of dim directly stem from those of the $H^\alpha$ measures. It is a σ-stable dimension, like Dim. To compare all these dimensions, let us observe that $\delta$ can be defined in the same manner as the covering dimension, by using coverings made up of sets of equal diameter. This implies the inequality $\dim(E) \le \delta(E)$ for any $E$. The σ-stability property then implies the following:
\[ \dim(E) \le \hat{\delta}(E) \le \hat{\Delta}(E) = \mathrm{Dim}(E) \le \Delta(E). \]
These inequalities may be strict. However, the equality $\dim(E) = \mathrm{Dim}(E)$, and even $\dim(E) = \Delta(E)$, occur in cases where $E$ is sufficiently regular. Examples include rectifiable curves and self-similar sets.

Packing measures

By considering packings of $E$, that is, families of disjoint sets at zero distance from $E$, and by switching inf and sup in the definitions, it is possible to define packing measures which are symmetric to the covering measures and whose critical index is precisely equal to Dim. This explains why Dim is called a packing dimension.

1.2.4. Methods for calculating dimensions

Since a dimension is an index of irregularity, it may be used for the classification of irregular sets and for characterizing phenomena showing erratic behaviors. Here we focus on signals. In practice, we may assume that the signal $\Gamma$ is given in axes $Oxy$ by discrete data $(x_i, f_i)$, with $x_i < x_{i+1}$: for example, a time series. Notice that other sets, such as geographical curves obtained by aerial photography, are not of this type, so that the analysis tools can be different.


The algorithms used for estimating a dimension rely on the theoretical formula which defines the dimension, with the usual limitation of a minimal scale, roughly equal to the distances $x_{i+1} - x_i$. Indeed, it is impossible to go further into the structure of the dataset without a reconstruction at very small scales, which amounts to adding new data arbitrarily. The evaluation of a dimension whose theoretical definition requires the finest possible coverings is therefore difficult to justify. This is why we do not propose algorithms for the estimation of σ-stable dimensions such as the Hausdorff or packing dimensions.

Calculations may be performed to estimate Δ or related dimensions that are naturally adapted to signal analysis. They are usually carried out using logarithmic diagrams. A quantity $Q(f, \varepsilon)$ which characterizes the irregularity is estimated for a certain number of values of the resolution ε between two limit values $\varepsilon_{\min}$ and $\varepsilon_{\max}$. If $Q(f, \varepsilon)$ follows a power law of the type $c\,\varepsilon^{\Delta}$, then $\log Q(f, \varepsilon)$ is an affine function of $\log \varepsilon$, with slope Δ. The idea is to seek functions $Q(f, \varepsilon)$ which provide appropriate logarithmic diagrams, i.e. diagrams that allow us to estimate the slopes precisely. Here are some examples.

Boxes and disks

After counting the number $N_\varepsilon$ of squares of a network of side ε that meet Γ, we draw the diagram $(|\log \varepsilon|, \log N_\varepsilon)$. Although it is very easy to program, the method presents an obvious disadvantage: the quantities $N_\varepsilon$ are integers, which makes the diagram chaotic for large values of ε. We could try a method using the Minkowski sausage of Γ:

$$\left(|\log \varepsilon|,\ \log \frac{1}{\varepsilon^2}\,\mathrm{Area}\bigl(\Gamma(\varepsilon)\bigr)\right)$$

However, this method is more difficult to program than the previous one and also lacks precision: the diagram shows, in general, a strong concavity, even for a curve as simple as a straight line segment.

These methods are very popular, in spite of numerical difficulties. Unfortunately, there is a major drawback for signals. The coordinates $(x_i, f_i)$ are not, in general, of the same nature. If they refer, for example, to stock exchange data, $x_i$ is a time value and $f_i$ an exchange value. In this case, it makes no sense to give the units the same value on the axes Ox and Oy. The covering of Γ by squares or disks is therefore meaningless. It is preferable to use algorithms which provide a slope independent of changes of unit. For this purpose, the calculated quantity $Q(f, \varepsilon)$ should satisfy the following property: for any real a, there exists $c(a)$ such that, for any ε:

$$Q(af, \varepsilon) = c(a)\, Q(f, \varepsilon) \tag{1.9}$$

as is the case for the methods which are described below.
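As a minimal sketch of the logarithmic-diagram idea, the following Python fragment fits the slope of $\log Q$ against $\log \varepsilon$ by ordinary least squares. The function name `loglog_slope` and the synthetic power law are illustrative assumptions, not part of the original text; the second fit simply illustrates that multiplying Q by a constant, as in (1.9), leaves the slope unchanged.

```python
import math

def loglog_slope(eps_values, q_values):
    """Least-squares slope of log Q against log eps (hypothetical helper)."""
    xs = [math.log(e) for e in eps_values]
    ys = [math.log(q) for q in q_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic data following Q = c * eps**Delta exactly (assumed values).
eps = [2.0 ** -k for k in range(2, 10)]
delta_true, c = 1.5, 3.0
q = [c * e ** delta_true for e in eps]

slope = loglog_slope(eps, q)
# A constant factor only shifts the diagram vertically.
slope_scaled = loglog_slope(eps, [7.0 * v for v in q])
```

On real data the points are only approximately aligned, so the choice of the range of ε used for the fit matters at least as much as the fitting procedure.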

Fractal and Multifractal Analysis in Signal Processing


Variation method

Here we anticipate section 1.3.4. The oscillation of a function f on any set F is defined as

$$\beta(f, F) = \sup\{f(t) - f(t') : t, t' \in F\}$$

The ε-variation on an interval J is the arithmetic mean of the oscillation over intervals of length 2ε:

$$\mathrm{var}_\varepsilon(f) = \frac{1}{|J|} \int_J \beta\bigl(f, [t - \varepsilon, t + \varepsilon] \cap J\bigr)\, dt$$

The variation method [DUB 89, TRI 88] consists of finding the slope of the diagram:

$$\left(|\log \varepsilon|,\ \log \frac{1}{\varepsilon^2}\,\mathrm{var}_\varepsilon(f)\right)$$

Since $\beta(af, [x - \varepsilon, x + \varepsilon]) = |a|\, \beta(f, [x - \varepsilon, x + \varepsilon])$ for all x and ε, we obtain through integration $\mathrm{var}_\varepsilon(af) = |a|\, \mathrm{var}_\varepsilon(f)$, so equation (1.9) is satisfied. In this case we obtain diagrams which present an almost perfect alignment for a large class of known signals. Furthermore, this method is easy to program and fast to execute.

$L^p$ norms method

Let us define the $L^p$ norm of a function $f : D \subset \mathbb{R}^n \to \mathbb{R}$ by the relation:

$$L^p(f) = \left(\frac{1}{\mathrm{Vol}_n(D)} \int_D |f(x)|^p\, dx\right)^{1/p}$$

It is a functional norm when $p \ge 1$. When $p \to +\infty$, the expression $L^p(f)$ tends to the norm:

$$L^\infty(f) = \sup_{x \in D} |f(x)|$$
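The variation method described above can be sketched as follows for a uniformly sampled signal. This is only an illustration under stated assumptions: the window $[t - \varepsilon, t + \varepsilon] \cap J$ is approximated by k samples on each side, the integral by an average over samples, and the names `eps_variation` and `variation_dimension` are hypothetical. The estimate is 2 minus the slope of $\log \mathrm{var}_\varepsilon(f)$ against $\log \varepsilon$, which is the same as the slope of the diagram given in the text.

```python
import math

def eps_variation(values, k):
    """Mean oscillation over windows of k samples on each side, clipped to the signal."""
    n = len(values)
    total = 0.0
    for i in range(n):
        window = values[max(0, i - k):min(n - 1, i + k) + 1]
        total += max(window) - min(window)  # oscillation beta on the window
    return total / n

def variation_dimension(values, dx, ks):
    """Estimate Delta as 2 - slope of log var_eps against log eps, with eps = k * dx."""
    xs = [math.log(k * dx) for k in ks]
    ys = [math.log(eps_variation(values, k)) for k in ks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return 2.0 - sxy / sxx

# Sanity check on a straight line segment, whose graph has dimension 1.
dx = 1.0 / 1024
line = [i * dx for i in range(1025)]
ks = [1, 2, 4, 8, 16, 32]
d_line = variation_dimension(line, dx, ks)
```

Because the oscillation scales with the amplitude of the signal, rescaling the values only shifts the diagram vertically; the estimated slope does not depend on the unit chosen for the ordinate.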

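Given a signal f defined on $[a, b]$ and the values $x \in [a, b]$ and $\varepsilon > 0$, we apply this tool at any x to the local function difference defined by $f(x) - f(x')$ where $|x - x'| \le \varepsilon$. Using the norm $L^\infty$, this gives $\sup_{x' \in [x-\varepsilon, x+\varepsilon]} |f(x) - f(x')|$. This quantity is equivalent to the ε-oscillation, since

$$\frac{1}{2}\, \beta\bigl(f, [x - \varepsilon, x + \varepsilon]\bigr) \le \sup_{x' \in [x-\varepsilon, x+\varepsilon]} |f(x) - f(x')| \le \beta\bigl(f, [x - \varepsilon, x + \varepsilon]\bigr)$$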


It is therefore possible to replace the ε-variation of f by the integral over J of $\sup_{x' \in [x-\varepsilon, x+\varepsilon]} |f(x) - f(x')|$, without altering the theoretical result for Δ. However, it is also possible to use $L^p$ norms. Indeed, the oscillation (or the local norm $L^\infty$) only takes into account the peaks of the function. In practice, it can happen that these peaks are measured with a significant error, or even destroyed in the process of data acquisition (profiles of rough surfaces, for example). It is then preferable to use all the intermediate values and replace the ε-variation with the quantity:

$$\int_J \left(\frac{1}{2\varepsilon} \int_{x-\varepsilon}^{x+\varepsilon} |f(x) - f(x')|^p\, dx'\right)^{1/p} dx$$

In this expression, large values of p allow us to emphasize the effect of local peaks, whereas if p = 1, all the values of the function f have equal importance. These integrals make it possible to rectify the corresponding logarithmic diagram and to calculate the slope with precision. We can also replace the above integral on J by a norm $L^q$, with q > 1. If q is large, this will take into account the more irregular parts of the signal. We can also change the integral in the window $[x - \varepsilon, x + \varepsilon]$ into a convolution product with a kernel of type $K(x'/\varepsilon)$, so that the results are even smoother. However, it should be noted that except for particular cases (Weierstrass functions, for example), we do not exactly calculate the dimension Δ with these methods, but rather an index smaller than Δ [TRI 99], which nevertheless remains relevant to the signal irregularity.

Let us develop an example of the index just referred to. Let K be a kernel belonging to the Schwartz class, with integral equal to 1. Let $K_a(t) = \frac{1}{a} K\bigl(\frac{t}{a}\bigr)$ for a > 0. For a function f defined on a compact set, let $f_a$ be the convolution of f with $K_a$. Since $f_a$ is regular, the length $\Lambda_a$ of its graph is finite. We define the regularization dimension $\dim_R(f)$ of the graph of f as:

$$\dim_R(f) = 1 + \lim_{a \to 0} \frac{\log(\Lambda_a)}{-\log a} \tag{1.10}$$
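A rough numerical sketch of definition (1.10) is given below. It is not the FracLab estimator: the Gaussian kernel, its truncation at four standard deviations, the renormalization of the kernel near the ends of the interval, the chosen scales and all function names are assumptions made for illustration. The limit is replaced by a least-squares slope of $\log \Lambda_a$ against $-\log a$ over a few scales.

```python
import math
import random

def gaussian_smooth(values, dx, a):
    """Convolve with a Gaussian kernel of width a (truncated at 4a, renormalized at the ends)."""
    half = int(math.ceil(4.0 * a / dx))
    weights = [math.exp(-0.5 * (j * dx / a) ** 2) for j in range(-half, half + 1)]
    n = len(values)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - half), min(n - 1, i + half) + 1):
            w = weights[j - i + half]
            num += w * values[j]
            den += w
        out.append(num / den)
    return out

def graph_length(values, dx):
    """Length of the polygonal line through the sampled graph."""
    return sum(math.hypot(dx, values[i + 1] - values[i]) for i in range(len(values) - 1))

def regularization_dimension(values, dx, scales):
    """1 + least-squares slope of log(Lambda_a) against -log(a)."""
    xs = [-math.log(a) for a in scales]
    ys = [math.log(graph_length(gaussian_smooth(values, dx, a), dx)) for a in scales]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return 1.0 + sxy / sxx

dx = 1.0 / 512
scales = [1.0 / 16, 1.0 / 32, 1.0 / 64, 1.0 / 128]
line = [i * dx for i in range(513)]      # smooth signal: bounded length
rng = random.Random(0)
walk = [0.0]
for _ in range(512):                      # discrete random walk: irregular signal
    walk.append(walk[-1] + rng.gauss(0.0, math.sqrt(dx)))

dim_line = regularization_dimension(line, dx, scales)
dim_walk = regularization_dimension(walk, dx, scales)
```

For the smooth signal the length of the regularized graph barely changes with a, so the estimate stays near 1; for the random walk the length grows as a decreases and the estimate is clearly larger.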

This dimension measures the speed at which the lengths of less and less regularized versions of the graph of f tend to infinity. It is easily proved that if f is continuous, the inequality $\dim_R(f) \le \Delta$ always holds. An interesting aspect of the regularization dimension is that it is a well-adapted estimation tool. Results obtained on usual signals (Weierstrass function, iterated function systems and fractional Brownian motion) are generally satisfactory, even for small samples (a few hundred points). Moreover, the simple analytical form of $\dim_R$ allows us to easily obtain an estimator for data corrupted by additive noise, which is particularly useful in signal processing (see [ROU 98] and the FracLab manual for more details).


1.3. Hölder exponents

1.3.1. Hölder exponents related to a measure

The dimensional analysis of a set is related to its local properties. To go further into this study, it is convenient to use a measure μ supported by the set. In many cases (self-similar sets, for example), E is defined at the same time as μ. If E is a curve, constructing a measure on E is called a parameterization. Without a parameterization it is impossible to analyze the curve. However, a given set can support very different measures. Particularly interesting ones are the well-balanced measures, in a sense we will explain.

Given a measure μ of $\mathbb{R}^n$, let us first define the Hölder exponent of μ over any measurable set F by

$$\alpha_\mu(F) = \frac{\log \mu(F)}{\log \mathrm{diam}(F)}$$

By convention, 0/0 = 1 and 1/0 = +∞. Given a set E, we use this notion in a local manner, i.e. on arbitrarily small intervals or cubes intersecting E. A pointwise Hölder exponent is then defined using centered balls $B_\varepsilon(x)$ whose radius tends to 0:

$$\underline{\alpha}_\mu(x) = \liminf_{\varepsilon \to 0} \alpha_\mu\bigl(B_\varepsilon(x)\bigr)$$

The symmetric exponent can also be useful:

$$\overline{\alpha}_\mu(x) = \limsup_{\varepsilon \to 0} \alpha_\mu\bigl(B_\varepsilon(x)\bigr)$$

In addition, the geometric context sometimes induces a specific analysis framework. If a measure is defined by its value on the dyadic cubes, it will be easier to use the following Hölder exponents:

$$\alpha^*_\mu(x) = \limsup_{n \to +\infty} \alpha_\mu\bigl(u_n(x)\bigr), \qquad \alpha_{*\mu}(x) = \liminf_{n \to +\infty} \alpha_\mu\bigl(u_n(x)\bigr)$$

where $u_n(x)$ is the cube of the form $\prod_{i=1}^{N} [k_i 2^{-n}, (k_i + 1) 2^{-n}[$ that contains x. Other covering nets can obviously be used, but dyadic cubes are well suited for calculations.

1.3.2. Theorems on set dimensions

The first theorem can be used as a basis for understanding the subsequent, more technical results.


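THEOREM 1.2.– Let μ be a finite measure such that μ(E) > 0. Assume that there exists a real α such that, for any $x \in E$:

$$\alpha^*_\mu(x) = \alpha_{*\mu}(x) = \alpha$$

Then $\alpha = \dim(E) = \mathrm{Dim}(E)$.

Proof. Let ε > 0. Let $E_n$ be the subset of E consisting of all points x such that, for $k \ge n$:

$$\alpha - \varepsilon \le \alpha_\mu\bigl(u_k(x)\bigr) \le \alpha + \varepsilon$$

If $\rho < 2^{-n}$, any ρ-covering $\{u_i\}$ of $E_n$ by dyadic cubes of rank ≥ n is such that:

$$|u_i|^{\alpha+\varepsilon} \le \mu(u_i) \le |u_i|^{\alpha-\varepsilon}$$

for all i. Therefore $\mu(E_n) \le \sum_i |u_i|^{\alpha-\varepsilon}$ and $\sum_i |u_i|^{\alpha+\varepsilon} \le \|\mu\|$.

First, we deduce that $H^{*(\alpha-\varepsilon)}(E_n) \ge \mu(E_n)$. If n is large enough then $\mu(E_n) > 0$. This gives $\dim(E_n) \ge \alpha - \varepsilon$. Since $E_n \subset E$, then $\dim(E) \ge \alpha - \varepsilon$.

Secondly, by using the covering of $E_n$ formed by dyadic cubes of rank $k \ge n$, we obtain:

$$N_{2^{-k}}(E_n)\, 2^{-k(\alpha+\varepsilon)} \le \|\mu\|$$

Therefore $N_{2^{-k}}(E_n) \le \|\mu\|\, 2^{k(\alpha+\varepsilon)}$, which implies $\Delta(E_n) \le \alpha + \varepsilon$. As a consequence, $\mathrm{Dim}(E) \le \alpha + \varepsilon$ by σ-stability. By making ε tend to 0, we obtain the desired result.

An analogous theorem can be stated with balls $B_\varepsilon(x)$ centered at E. We can develop the arguments of the preceding proof to obtain more general results as follows.

THEOREM 1.3.– Assume that μ is a finite measure such that μ(E) > 0. Then, denoting by $\underline{\alpha}_\mu$ and $\overline{\alpha}_\mu$ the pointwise exponents defined with a lower and an upper limit respectively:

$$\inf_{x \in E} \underline{\alpha}_\mu(x) \le \dim(E) \le \sup_{x \in E} \underline{\alpha}_\mu(x) \tag{1.11}$$

$$\inf_{x \in E} \overline{\alpha}_\mu(x) \le \mathrm{Dim}(E) \le \sup_{x \in E} \overline{\alpha}_\mu(x) \tag{1.12}$$

$$\inf_{x \in E} \overline{\alpha}_\mu(x) \le \Delta(E) \le \limsup_{\varepsilon \to 0}\, \sup_{x \in E} \alpha_\mu\bigl(B_\varepsilon(x)\bigr) \tag{1.13}$$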


Inequality (1.13) seems more complex than the others. Nevertheless, we can derive from it a simple result: if 0 < μ(E) < +∞ and if $\alpha_\mu(B_\varepsilon(x))$ converges uniformly on E to a number α, then α = Δ(E). The same results hold if, for example, we replace the network of balls centered at E with that of dyadic cubes.

EXAMPLE 1.5.– A perfect symmetric set (see Example 1.1) is the support of a natural or canonical measure: each of the $2^n$ covering intervals of rank n is assigned the weight $2^{-n}$. In the case where the set has constant ratio a, the intervals of rank n have length $a^n$ and their Hölder exponent assumes the value $\log 2 / |\log a|$ uniformly. This grid of intervals allows the computation of dimensions. Indeed

$$\dim(E) = \mathrm{Dim}(E) = \Delta(E) = \frac{\log 2}{|\log a|}$$
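As a small numerical check of Example 1.5, the sketch below evaluates the Hölder exponent $\log \mu(F) / \log \mathrm{diam}(F)$ on covering intervals of rank n, taking weight $2^{-n}$ and length $a^n$. The ratio a = 1/3 (the triadic Cantor set) and the function name are choices made for illustration only.

```python
import math

def holder_exponent(mass, diameter):
    """Hoelder exponent of a measure on a set: log(mass) / log(diameter)."""
    return math.log(mass) / math.log(diameter)

a = 1.0 / 3.0  # constant ratio of the perfect symmetric set (assumed value)

# Covering intervals of rank n carry weight 2**-n and have length a**n.
exponents = [holder_exponent(2.0 ** -n, a ** n) for n in range(1, 11)]

dimension = math.log(2.0) / abs(math.log(a))
```

Every rank gives the same exponent, which is why the three dimensions coincide with this common value in the constant-ratio case.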

By making the successive ratios $a_n$ vary, it is also possible to construct a set such that:

$$\dim(E) = \liminf \frac{n \log 2}{|\log a_n|}, \qquad \mathrm{Dim}(E) = \Delta(E) = \limsup \frac{n \log 2}{|\log a_n|}$$

EXAMPLE 1.6.– The set $E_p$ of p-normal numbers (see Example 1.4) supports a measure which makes it possible to estimate its dimension and which is known as the Besicovitch measure. It is defined on the dyadic intervals of [0, 1]. Set $\mu([0, \frac{1}{2}]) = p$ and $\mu([\frac{1}{2}, 1]) = 1 - p$, with $p \in (0, 1)$. The weights p and 1 − p are then distributed similarly at the following stages: since each dyadic interval $u_n$ of rank n is the union of the intervals $v_n$ (on the left) and $v'_n$ (on the right) of rank n + 1, we put

$$\mu(v_n) = p\, \mu(u_n), \qquad \mu(v'_n) = (1 - p)\, \mu(u_n)$$

It is easy to calculate the exact measure of each dyadic interval $u_n(x)$ containing the point x by using the base 2 expansion of x. Denote:

– $N_0(x, n)$ = number of 0s in the expansion of x between ranks 1 and n;
– $N_1(x, n)$ = number of 1s in the expansion of x between ranks 1 and n.

Thus, $N_0(x, n) + N_1(x, n) = n$ and:

$$\mu\bigl(u_n(x)\bigr) = p^{N_0(x,n)} (1 - p)^{N_1(x,n)} \tag{1.14}$$


First, let us show that $E_p$ has full measure. The easiest way to proceed is to use the language of probability. Each point x can be viewed as the result of a process which is a sequence of independent Bernoulli random variables $X_i$ taking the values 0 and 1 with probabilities p and 1 − p. The frequency $N_1(x, n)/n = (\sum_{i=1}^{n} X_i)/n$ has mean 1 − p and variance p(1 − p)/n. We may apply here the strong law of large numbers: with probability 1, $N_1(x, n)/n$ tends to 1 − p when n → +∞, and $N_0(x, n)/n$ tends to p. Coming back to the language of measure, this result tells us that the set of x for which $N_0(x, n)/n$ tends to p has measure 1. Such x are in $E_p$. Thus $\mu(E_p) = 1$.

Secondly, to compute the dimension, we first need to determine the value of the Hölder exponent. Equation (1.14) implies the following result:

$$\alpha\bigl(u_n(x)\bigr) = \frac{1}{n} N_0(x, n)\, \frac{|\log p|}{\log 2} + \frac{1}{n} N_1(x, n)\, \frac{|\log(1 - p)|}{\log 2}$$

for any x of [0, 1]. If $x \in E_p$, then $\alpha(u_n(x))$ tends to the value:

$$\alpha^*_\mu(x) = \alpha_{*\mu}(x) = p\, \frac{|\log p|}{\log 2} + (1 - p)\, \frac{|\log(1 - p)|}{\log 2} \tag{1.15}$$
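Equations (1.14) and (1.15) can be checked numerically with a short sketch. The function names, the value p = 0.25 and the periodic digit sequence are assumptions chosen so that the frequency of zeros is exactly p; the measure is handled through its logarithm to avoid underflow at large ranks.

```python
import math

def log_dyadic_measure(digits, p):
    """log of mu(u_n(x)) from (1.14), given the first n binary digits of x."""
    n0 = digits.count(0)
    n1 = len(digits) - n0
    return n0 * math.log(p) + n1 * math.log(1.0 - p)

def dyadic_holder_exponent(digits, p):
    """alpha(u_n(x)) = log mu(u_n(x)) / log(2**-n)."""
    n = len(digits)
    return log_dyadic_measure(digits, p) / (-n * math.log(2.0))

def limiting_exponent(p):
    """Right-hand side of (1.15)."""
    return (p * abs(math.log(p)) + (1.0 - p) * abs(math.log(1.0 - p))) / math.log(2.0)

p = 0.25
digits = [0, 1, 1, 1] * 50           # 200 digits, frequency of zeros exactly p
alpha_n = dyadic_holder_exponent(digits, p)
alpha_limit = limiting_exponent(p)   # about 0.811 for p = 0.25

# A point outside E_p: all digits equal to 0 gives the exponent |log p| / log 2.
alpha_zero = dyadic_holder_exponent([0] * 20, p)
```

The last value illustrates that points whose digit frequencies differ from p have a different exponent, here $|\log p| / \log 2 = 2$.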

Thus, the value is the same for $\dim(E_p)$ and $\mathrm{Dim}(E_p)$ according to equations (1.11) and (1.12). We observe that in equation (1.13), the left-hand side is also equal to this value. Moreover, $\sup_{x \in E_p} \alpha(u_n(x))$ is equal to the largest Hölder exponent of dyadic intervals of rank n. If, for example, $p < \frac{1}{2}$, this largest exponent is equal to $|\log p| / \log 2$, which is larger than 1. Therefore, the right-hand side of (1.13) is larger than 1. In fact, equation (1.13) gives no indication on the value of $\Delta(E_p)$. A density argument yields $\Delta(E_p) = 1$ for any value of p.

1.3.3. Hölder exponent related to a function

The Hölder exponents of a function give much more information than those of measures. Firstly, let us generalize the notion of measure of a set F, by using the notion of oscillation of the function f in F:

$$\beta(f, F) = \sup\{f(t) - f(t') : t, t' \in F\}$$

This allows us to define a Hölder exponent:

$$\alpha(F) = \frac{\log \beta(f, F)}{\log \mathrm{diam}(F)}$$


Given an interval J and a function $f : J \to \mathbb{R}$, we may use this notion locally in arbitrarily small neighborhoods of $t \in J$. The pointwise Hölder exponent of f at t is obtained as

$$\alpha_p^f(t) = \liminf_{\varepsilon \to 0} \alpha\bigl([t - \varepsilon, t + \varepsilon] \cap J\bigr)$$

According to this definition, the exponent of an increasing function f is the same as that of the measure μ defined on any $[c, d] \subset J$ by $\mu([c, d]) = f(d) - f(c)$. Indeed, $f(d) - f(c)$ is also the oscillation of f in [c, d]. However, in general, f is not monotonic and it is therefore necessary to carry out a more accurate analysis, as we will see below.

As in the case of measures, we may also consider the "symmetric" exponent defined with an upper limit, and also the exponents obtained as lower and upper limits by using particular grids of intervals, like the dyadic intervals.

Oscillation considered as a measurement of the local variability of a function possesses many advantages. In particular, it is closely related to the box dimension. However, there are some drawbacks: it is not simple to use in a theoretical context, it is sometimes difficult to estimate with precision under experimental conditions and, finally, it is sensitive to various disturbances (discretization, noise, etc.). It is possible to replace the oscillation with other set functions $v(f, F)$ showing more robustness. However, most alternatives no longer verify the important triangle inequality: $v(f, F_1 \cup F_2) \le v(f, F_1) + v(f, F_2)$ for all $F_1, F_2 \subset J$ (see [TRI 99]). We can simplify the analysis by restricting the general F to the class of intervals and by setting: $v(f, [a, b]) = |f(b) - f(a)|$.

We may even consider only dyadic intervals, and take:

$$v\bigl(f, [k 2^{-n}, (k + 1) 2^{-n}]\bigr) = |c_{j,k}|$$

where $c_{j,k}$ is the wavelet coefficient of f at scale j and position k (see also Chapters 2, 8 and 9). Let us now give an alternative and useful definition of the Hölder exponent.


DEFINITION 1.2.– Let $f : \mathbb{R} \to \mathbb{R}$ be a function, s > 0 and $x_0$ a real number. Then $f \in C^s(x_0)$ if and only if there is a real number η > 0, a polynomial P of degree ≤ s and a constant C, such that

$$\forall x \in B(x_0, \eta), \qquad |f(x) - P(x - x_0)| \le C |x - x_0|^s. \tag{1.16}$$

The pointwise Hölder exponent of f at $x_0$, denoted $\alpha_p^f(x_0)$ or $\alpha_p(x_0)$, is given by

$$\sup\{s : f \in C^s(x_0)\}$$

If 0 < s < 1, then the polynomial P is simply the constant $f(x_0)$ and the increments of f over $[x_0 - \varepsilon, x_0 + \varepsilon]$ are indeed compared to ε. Considering P allows us to take into account the higher order singularities, i.e. those in the derivatives of f. We remove the "regular part" of f to exhibit the singular behavior. Consider for example $f(x) = x + |x|^{3/2}$ at $x_0 = 0$. Using P allows us to find $\alpha_p = \frac{3}{2}$, whereas the simple increment would give $\alpha_p = 1$, a non-significant value in this case.

The pointwise exponent, whose definition is natural, has a geometric interpretation. To begin with, let us remove the regular part of the signal by forming the difference $f(x) - P(x - x_0)$. Around $x_0$, the signal thus obtained is entirely contained in a hull of the form $C|x - x_0|^{\alpha_p - \varepsilon}$ for any ε > 0, and this hull is optimal, i.e. any smaller hull $C|x - x_0|^{\alpha_p + \varepsilon}$ fails to contain infinitely many points of the signal. We observe that the smaller $\alpha_p$ is, the more irregular f is in the neighborhood of $x_0$, and vice versa. In addition, $\alpha_p > 1$ implies that f is differentiable at $x_0$, and a function which is discontinuous at $x_0$ is such that $\alpha_p = 0$.

In many applications, the regularity of a signal is as important as, or more important than, its amplitude. For example, the edges of an image do not vary if an affine transformation of the gray levels is carried out. This will modify the amplitude, but not the exponents. The pointwise Hölder exponent will thus be one of the main ingredients for the fractal analysis of a signal or image. Generally speaking, measuring and studying the pointwise exponent is particularly useful in the processing of strongly irregular signals whose irregularity is expected to contain relevant information. Examples include biomedical signals and images (ECG, EEG, echography, scintigraphy), Internet traffic logs, radar images, etc.

The pointwise exponent, however, presents some drawbacks.
For example, it is stable neither under integro-differentiation nor, more generally, under the action of pseudo-differential operators. This means that it is not possible to predict the exponent at x of a primitive F of f knowing $\alpha_p^f(x)$. It is only guaranteed that $\alpha_p^F(x) \ge \alpha_p^f(x) + 1$. In the same way, the exponent of the analytic signal associated with f is not necessarily the same as that of f. This is a problem in signal processing, since this type of operator is often used.

The second disadvantage, related to the first, is that $\alpha_p(x)$ does not provide all the information on the regularity of f at x. A common example is that of the chirp $|x|^\gamma \sin(1/|x|^\delta)$, with γ and δ positive. We verify that the pointwise exponent at 0 is equal to γ. In particular, it is independent of δ: $\alpha_p(0)$ is not sensitive to "oscillating singularities", i.e. to situations where the local frequency of the signal tends to infinity in the neighborhood of 0. It is therefore necessary to introduce at least one more exponent to fully describe the local irregularity. During the last few years, several quantities have been proposed, including the chirp exponent, the oscillation exponent, etc. Here we will focus on the local Hölder exponent, which we now define (for more details on the properties of this exponent, see [LEV 98a, SEU 02]).

Let us first recall that, given a function $f : \Omega \to \mathbb{R}$, where $\Omega \subset \mathbb{R}$ is an open set, we say that f belongs to the global Hölder space $C_l^s(\Omega)$, with 0 < s < 1, if there is a constant C such that, for any x, y in Ω:

$$|f(x) - f(y)| \le C |x - y|^s \tag{1.17}$$

If m < s < m + 1 ($m \in \mathbb{N}$), then $f \in C_l^s(\Omega)$ means that there exists a constant C such that, for any x, y in Ω:

$$|\partial^m f(x) - \partial^m f(y)| \le C |x - y|^{s-m} \tag{1.18}$$

Now let $\alpha_l(\Omega) = \sup\{s : f \in C_l^s(\Omega)\}$. Notice that if $\Omega' \subset \Omega$, then $\alpha_l(\Omega') \ge \alpha_l(\Omega)$. To define the local Hölder exponent, we will use the following lemma.

LEMMA 1.1.– Let $(O_i)_{i \in I}$ be a family of decreasing open sets (i.e. $O_i \subset O_j$ if i > j), such that:

$$\bigcap_i O_i = \{x_0\} \tag{1.19}$$

Let:

$$\alpha_l(x_0) = \sup_{i \in I} \{\alpha_l(O_i)\} \tag{1.20}$$

Then, $\alpha_l(x_0)$ does not depend on the choice of the family $(O_i)_{i \in I}$.

This result makes it possible to define the local exponent by using any family of intervals containing $x_0$.


DEFINITION 1.3.– Let f be a function defined in a neighborhood of $x_0$. Let $\{I_n\}_{n \in \mathbb{N}}$ be a decreasing sequence of open intervals converging to $x_0$. By definition, the local Hölder exponent of f at $x_0$, denoted $\alpha_l(x_0)$, is:

$$\alpha_l(x_0) = \sup_{n \in \mathbb{N}} \alpha_l(I_n) = \lim_{n \to +\infty} \alpha_l(I_n) \tag{1.21}$$

Let us briefly note that the local exponent is related to a notion of critical exponent of fractional differentiation [KOL 01]. We may understand the difference between $\alpha_p$ and $\alpha_l$ as follows: let us suppose that there exists a single pair (y, z) such that $\beta(f, B(x, \varepsilon)) = f(y) - f(z)$. Then $\alpha_p$ results from the comparison between $\beta(f, B(x, \varepsilon))$ and ε, whereas for $\alpha_l$, we compare $\beta(f, B(x, \varepsilon))$ to $|y - z|$. This is particularly clear in the case of the chirp, where the distance between the points (y, z) realizing the oscillation tends to zero much faster than the size of the ball around 0. Accordingly, it is easy to demonstrate that $\alpha_l(0) = \gamma/(1 + \delta)$ for the chirp. The exponent $\alpha_l$ thus "sees" oscillations around 0: for fixed γ, the chirp is more irregular (in the sense of $\alpha_l$) when δ is larger.

The local exponent possesses an advantage over $\alpha_p$: it is stable under the action of pseudo-differential operators. However, like $\alpha_p$, $\alpha_l$ cannot by itself completely characterize the irregularity around a point. Moreover, $\alpha_l$ is, in a certain sense, less "precise" than $\alpha_p$. The results presented below support this assertion.

PROPOSITION 1.2.– For any continuous f and for all x:

$$\alpha_l^f(x) \le \min\Bigl(\alpha_p^f(x),\ \liminf_{t \to x} \alpha_p^f(t)\Bigr)$$
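The heuristic comparison between $\alpha_p$ and $\alpha_l$ given above for the chirp can be illustrated numerically. The sketch below is an illustration under assumptions, not a general estimator: it samples the chirp on $(0, \varepsilon]$ only (the function is even, so the oscillation over the ball around 0 is attained by a pair on one side), uses a fixed number of samples per radius, and takes γ = 0.5 and δ = 1. The oscillation is regressed against ε and against the distance $|y - z|$ between the points realizing it.

```python
import math

def chirp(x, gamma, delta):
    """|x|**gamma * sin(1 / |x|**delta), extended by 0 at x = 0."""
    if x == 0.0:
        return 0.0
    ax = abs(x)
    return ax ** gamma * math.sin(1.0 / ax ** delta)

def extreme_pair(gamma, delta, eps, samples=40000):
    """Oscillation on [0, eps] and distance between the sampled points attaining it."""
    best_max, best_min = 0.0, 0.0   # value of the chirp at 0
    y, z = 0.0, 0.0
    for i in range(1, samples + 1):
        x = eps * i / samples
        v = chirp(x, gamma, delta)
        if v > best_max:
            best_max, y = v, x
        if v < best_min:
            best_min, z = v, x
    return best_max - best_min, abs(y - z)

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

gamma, delta = 0.5, 1.0
radii = [2.0 ** -j for j in range(6, 13)]
pairs = [extreme_pair(gamma, delta, eps) for eps in radii]
log_beta = [math.log(b) for b, _ in pairs]

# Pointwise exponent: oscillation compared with the radius of the ball.
alpha_p_est = slope([math.log(e) for e in radii], log_beta)
# Local exponent: oscillation compared with the distance |y - z|.
alpha_l_est = slope([math.log(d) for _, d in pairs], log_beta)
```

With these parameters the two regressions should give values near γ = 0.5 and γ/(1 + δ) = 0.25 respectively, reflecting the fact that $|y - z|$ shrinks like $\varepsilon^{1+\delta}$.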

The following two theorems describe the structure of the Hölder functions, i.e. the functions which associate with any x the exponents of f at x.

THEOREM 1.4.– Let $g : \mathbb{R} \to \mathbb{R}^+$ be a function. The two assertions below are equivalent:
– g is the lower limit of a sequence of continuous functions;
– there exists a continuous function f whose pointwise Hölder function $\alpha_p(x)$ satisfies $\alpha_p(x) = g(x)$ for all x.

THEOREM 1.5.– Let $g : \mathbb{R} \to \mathbb{R}^+$ be a function. The following two assertions are equivalent:
– g is a lower semi-continuous (LSC) function;
– there exists a continuous function f whose local Hölder function $\alpha_l(x)$ satisfies $\alpha_l(x) = g(x)$ for any x.


NOTE 1.2.– Let us recall that a function $f : D \subset \mathbb{R} \to \mathbb{R}$ is LSC if, for any $x \in D$ and for any sequence $(x_n)$ in D tending to x:

$$\liminf_{n \to \infty} f(x_n) \ge f(x) \tag{1.22}$$

Figure 1.2 shows a generalization of the Weierstrass function defined on [0, 1] for which $\alpha_p(x) = \alpha_l(x) = x$ for any x. This function is defined as $W_g(x) = \sum_{n=0}^{\infty} \omega^{-nx} \cos(\omega^n x)$, with ω > 1.

Figure 1.2. Generalized Weierstrass function for which αp (x) = αl (x) = x

Since the class of lower limits of continuous functions is much larger than that of lower semi-continuous functions, we observe that $\alpha_p$ generally supplies more information than $\alpha_l$. For example, $\alpha_p$ can vary much "faster" than $\alpha_l$. In particular, it is possible to construct a continuous function whose pointwise Hölder function coincides with the indicator function of the set of rational numbers. It is everywhere discontinuous, and thus its local Hölder function is constantly equal to 0. The following results describe more precisely the relations between $\alpha_l$ and $\alpha_p$.

PROPOSITION 1.3.– Let $f : I \to \mathbb{R}$ be a continuous function, and assume that there exists γ > 0 such that $f \in C^\gamma(I)$. Then, there exists a subset D of I such that:
– D is dense, uncountable and has Hausdorff dimension zero;
– for any $x \in D$, $\alpha_p(x) = \alpha_l(x)$.


Moreover, this result is optimal, i.e. there exists a function of global regularity γ > 0 such that $\alpha_p(x) \ne \alpha_l(x)$ for all x outside a set of zero Hausdorff dimension.

THEOREM 1.6.– Let 0 < γ < 1 and $f : [0, 1] \to [\gamma, 1]$ be a lower limit of continuous functions. Let $g : [0, 1] \to [\gamma, 1]$ be a lower semi-continuous function. Assume that for all $t \in [0, 1]$, $f(t) \ge g(t)$. Then, there exists a continuous function $F : [0, 1] \to \mathbb{R}$ such that:
– for all x, $\alpha_l(x) = g(x)$;
– for all x outside a set of zero Hausdorff dimension, $\alpha_p(x) = f(x)$.

This theorem shows that, when the "compatibility condition" $f(t) \ge g(t)$ is satisfied, we can simultaneously and independently prescribe the local and pointwise regularity of a function outside a "small" set. These two measures of irregularity are thus to some extent independent and provide complementary information.

1.3.4. Signal dimension theorem

Let us investigate the relationships between the dimension of a signal and its Hölder exponents. There is no general result concerning the Hausdorff dimension, apart from obvious upper bounds resulting from the inequalities $\dim(\Gamma) \le \mathrm{Dim}(\Gamma) \le \Delta(\Gamma)$. Here is a result for Dim(Γ) [TRI 86a].

THEOREM 1.7.– If Γ is the graph of a continuous function f, then:

$$2 - \sup_{x \in J} \underline{\alpha}_\mu(x) \le \mathrm{Dim}(\Gamma) \le 2 - \inf_{x \in J} \underline{\alpha}_\mu(x). \tag{1.23}$$

The same inequalities are true if we use the grid of the dyadic intervals:

$$2 - \sup_{x \in J} \alpha_{*\mu}(x) \le \mathrm{Dim}(\Gamma) \le 2 - \inf_{x \in J} \alpha_{*\mu}(x). \tag{1.24}$$

We do not provide the proof of these results, which requires an evaluation of the packing measure of the graph. In the same context, we could show that if the Hölder exponents $\alpha(u_n(x))$ tend uniformly to a real α, then $\Delta(\Gamma) = 2 - \alpha$. However, a much more interesting equality, both simple and general, may be given for the Minkowski-Bouligand dimension of the graph.

THEOREM 1.8.– Let f be a continuous function defined on an interval J, and non-constant on J. For any ε > 0, let us call ε-variation of f on J the arithmetic mean of the ε-oscillations:

$$\mathrm{var}_\varepsilon(f) = \frac{1}{|J|} \int_J \beta\bigl(f, [x - \varepsilon, x + \varepsilon] \cap J\bigr)\, dx$$

then:

$$\Delta(\Gamma) = \limsup_{\varepsilon \to 0} \left(2 - \frac{\log \mathrm{var}_\varepsilon(f)}{\log \varepsilon}\right) \tag{1.25}$$

The assumption that f is not constant is necessary, as otherwise the oscillations are all zero and $\mathrm{var}_\varepsilon(f) = 0$. In this case, the graph is a horizontal segment and the value of its dimension is 1.

Proof. A proof [TRI 99] using geometric arguments consists of estimating the area of the Minkowski ε-sausage Γ(ε). We show that this area is equivalent to that of the union of the horizontal segments $\bigcup_{t \in J} [t - \varepsilon, t + \varepsilon] \times \{f(t)\}$ centered on the graph. The area of this union equals $|J|\, \mathrm{var}_\varepsilon(f)$.

EXAMPLE 1.7.– The graph of a self-affine function defined on J = [a, b] may be obtained as the attractor of an iterated function system, like the self-similar curves of Example 1.2 (see also Chapters 9 and 10). For this, it is sufficient to define:
– an integer N ≥ 2;
– N + 1 points in the plane $A_1 = (x_1, y_1), \ldots, A_{N+1} = (x_{N+1}, y_{N+1})$, such that $x_1 = a < \cdots < x_i < \cdots < x_{N+1} = b$;
– N affine triangular maps of the plane $T_1, \ldots, T_N$, such that, for each $T_i$, the image of the segment $A_1 A_{N+1}$ is the segment $A_i A_{i+1}$.

These may be written as:

$$T_i = \begin{pmatrix} \rho_i & 0 \\ h_i & \delta_i \end{pmatrix} + \begin{pmatrix} \xi_i \\ \eta_i \end{pmatrix}$$

where $0 < \rho_i = (x_{i+1} - x_i)/(b - a) < 1$ and $|\delta_i| < 1$. The five parameters of $T_i$ are related by the relations $T_i(A_1) = A_i$ and $T_i(A_{N+1}) = A_{i+1}$. In the particular case where $\rho_i = 1/N$ for any i and $\sum_i |\delta_i| \ge 1$, we may verify that if $\varepsilon = N^{-k}$, the quantity $\mathrm{var}_\varepsilon(f)$ is of the order of $\bigl((\sum_i |\delta_i|)/N\bigr)^k$. We then obtain:

$$\Delta(\Gamma) = 2 - \frac{\bigl|\log\bigl((\sum_i |\delta_i|)/N\bigr)\bigr|}{\log N} = 1 + \frac{\log \sum_i |\delta_i|}{\log N}$$

(see the classical example of Figure 1.3 where N = 4 and $\delta_i = \frac{1}{2}$ for any i). The Hölder exponent is calculated using 4-adic intervals. Its uniform value is $\frac{1}{2}$. Therefore $\Delta(\Gamma) = \mathrm{Dim}(\Gamma) = \frac{3}{2}$. Let us note that the Hausdorff dimension, strictly lower than $\frac{3}{2}$, is much more difficult to estimate [MCM 84].
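The closed-form expression of Example 1.7 is easy to evaluate. The following sketch, with a hypothetical function name, computes $\Delta(\Gamma) = 1 + \log(\sum_i |\delta_i|)/\log N$ in the equal-ratio case $\rho_i = 1/N$ and checks the conditions under which the formula is stated.

```python
import math

def self_affine_box_dimension(deltas):
    """Delta(Gamma) = 1 + log(sum |delta_i|) / log N, for rho_i = 1/N (Example 1.7)."""
    n_maps = len(deltas)
    total = sum(abs(d) for d in deltas)
    if n_maps < 2:
        raise ValueError("at least two affine maps are required")
    if any(abs(d) >= 1.0 for d in deltas):
        raise ValueError("each |delta_i| must be smaller than 1")
    if total < 1.0:
        raise ValueError("the formula is stated for sum |delta_i| >= 1")
    return 1.0 + math.log(total) / math.log(n_maps)

# Case of Figure 1.3: N = 4 maps with delta_i = 1/2.
dim_fig = self_affine_box_dimension([0.5] * 4)

# Same value through the first form of the formula, 2 - |log(sum/N)| / log N.
dim_alt = 2.0 - abs(math.log(2.0 / 4.0)) / math.log(4.0)
```

Both expressions give 3/2 in this case, in agreement with the value quoted for Δ(Γ) and Dim(Γ).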


EXAMPLE 1.8.– Lacunary series, such as the Weierstrass function, provide other examples of signals:

$$f(x) = \sum_{n=0}^{\infty} \omega^{-nH} \cos(\omega^n x + \varphi_n)$$

where ω > 1 and 0 < H < 1. The values of the "phases" $\varphi_n$ are arbitrary. We can directly prove [TRI 99] that, restricted to any bounded interval J, the box dimension of the graph Γ is equal to 2 − H. By homogeneity, Dim(Γ) = 2 − H. This confirms the fact, although it is more difficult to prove (see [BOU 00]), that the ε-oscillations are uniformly of the order of $\varepsilon^H$. In other words, there exist two constants $c_1$ and $c_2 > 0$ such that, for any $x \in J$ and for any ε, with 0 < ε < 1:

$$c_1 \varepsilon^H \le \beta\bigl(f, [x - \varepsilon, x + \varepsilon]\bigr) \le c_2 \varepsilon^H$$

Figure 1.3. Graph of a nowhere differentiable function, attractor of a system of four affine maps, such that $\rho_i = \frac{1}{4}$ and $\delta_i = \frac{1}{2}$. Dimensions Δ and Dim are equal to $\frac{3}{2}$. The Hausdorff dimension lies between 1 and $\frac{3}{2}$


Finding the value of the Hausdorff dimension of such graphs is still an open problem. In the case where the $\varphi_n$ are independent random variables with the same distribution, it is known that dim(Γ) = 2 − H with probability 1.

1.3.5. 2-microlocal analysis

In this section, we briefly present an analysis of local regularity which is much finer than the Hölder exponents described above. As was observed previously, neither $\alpha_p$ nor $\alpha_l$ allows us to completely describe the local irregularity of a function. To obtain an exhaustive description, we need to consider an infinite number of exponents. This is the purpose of 2-microlocal analysis. This powerful tool was defined in [BON 86], where it is introduced in the framework of PDEs, using Littlewood-Paley analysis. A definition based on wavelets is developed in [JAF 96]. We present here a time-domain characterization [KOL 02].

DEFINITION 1.4.– A function $f : I \subset \mathbb{R} \to \mathbb{R}$ belongs to the 2-microlocal space $C_{x_0}^{s,s'}$ if there exist $0 < \delta < \frac{1}{4}$ and C > 0 such that, for all (x, y) satisfying $|x - x_0| < \delta$ and $|y - x_0| < \delta$:

$$|f(x) - f(y)| \le C |x - y|^{s+s'} \bigl(|x - y| + |x - x_0|\bigr)^{-s'/2} \bigl(|x - y| + |y - x_0|\bigr)^{-s'/2} \tag{1.26}$$

Figure 1.4. Graph of the Weierstrass function for ω = 3 and $H = \frac{1}{2}$. Phases $\varphi_n$ are independent and identically distributed random variables. Dimensions Δ and Dim are equal to $\frac{3}{2}$. The Hausdorff dimension is almost surely close to $\frac{3}{2}$


This definition is valid only for 0 < s < 1 and 0 < s + s′ < 1. The general case is slightly more complex and will not be dealt with here. 2-microlocal spaces, as opposed to Hölder spaces, require two exponents (s, s′) in their definition. While $\alpha_p$ is defined as the supremum of the exponents α such that f belongs to $C_{x_0}^{\alpha}$, we cannot proceed in the same way to define "2-microlocal exponents". Instead, we define in the abstract plane (s, s′) the 2-microlocal frontier of f at $x_0$ as the curve $(s, \sup\{s' : f \in C_{x_0}^{s,s'}\})$. It is not hard to show that this curve is well defined, concave and decreasing. Its intersection with the s axis is exactly $\alpha_l$. Moreover, under the hypothesis that f possesses a minimum global regularity, $\alpha_p$ is the intersection of the frontier with the line s + s′ = 0. The 2-microlocal frontier thus allows us to re-interpret the two exponents within a unified framework. The main advantage of the frontier is that it completely describes the evolution of $\alpha_p$ under integro-differentiation of arbitrary order: indeed, an integro-differentiation of order ε simply shifts the frontier by ε along the s axis. Thus, 2-microlocal analysis provides extremely rich information on the regularity of a function around a point.

To conclude this brief presentation, let us mention that algorithms exist which make it possible to numerically estimate the 2-microlocal frontier. They often allow us to calculate the values of $\alpha_p$ and $\alpha_l$ more precisely than a direct method (various estimation methods for the exponents and the 2-microlocal frontier are proposed in FracLab). Furthermore, it is possible to develop a 2-microlocal formalism [LEV 04a] which presents strong analogies with the multifractal formalism (see below).

1.3.6. An example: analysis of stock market price

As an illustration of some of the notions introduced above, we use this section to detail a simplified example of the fractal analysis of a signal based on Hölder exponents.
Our purpose is not to develop a complete application (this would require a whole chapter) but instead to demonstrate how we calculate and use the information provided by local regularity analysis in a practical scenario. The signal is a stock market log. Financial analysis offers an interesting area of application for fractal methods (see Chapter 13 for a detailed study). We will consider the evolution of the Nikkei index between January 1, 1980 and November 5, 2000. These signals comprise 5,313 values and are presented in Figure 1.5. As with many stock market accounts, it is extremely irregular. We calculate the local Hölder exponent of the logarithm of this signal, which is the quantity on which ﬁnancial analysts work. The exponent is calculated by a direct application of Deﬁnition 1.3: at each point x, we ﬁnd, for increasing values of ε, one couple (y, z) for which the signal oscillation in a ball centered in x of radius ε is attained. A bilogarithmic regression between the vector of the values found for the oscillation and the distances |y − z| is then performed (see the FracLab manual for more details on the procedure). As Figure 1.6 shows, most local exponents are comprised between 0 and 1, with

Fractal and Multifractal Analysis in Signal Processing

47

Figure 1.5. The Nikkei index between 1st January 1980 and 5th November 2000

some peaks above 1 and up to 3. The exponent values in [0, 1], which imply that the signal is continuous but not differentiable, confirm the visual impression of great irregularity. We can go beyond this qualitative comment by observing that the periods where “something occurs” have a definite signature in terms of the exponents: they are characterized by a dramatic increase of αl followed by very small values, below 0.2. Let us examine some examples. The most remarkable point of the local regularity graph is its maximum at abscissa 2,018, with an amplitude of 3. The most singular points, i.e. those with the smallest exponent, are situated just after this maximum: the exponent is around 0.2 for the abscissae between 2,020 and 2,050, and of the order of 0.05 between points 2,075 and 2,100. These two values are distinctly below the exponent average, which is 0.4 (the variance being 0.036). Calculations show that fewer than 10% of the points of the log possess an exponent smaller than 0.2. This remarkable behavior corresponds to the crash of October 19, 1987, which occurs at abscissa 2,036, in the middle of the first zone of points with low regularity after the maximum: the most “irregular” days of the entire signal are thus, as expected, situated in the weeks which followed the crash. It is worth noting that this fact is much more apparent on the regularity signal than on the original log, where only the crash appears clearly, with the subsequent period not displaying remarkable features. Let us now consider another area which contains many points of low regularity along with some isolated regular points (i.e. having αl > 1).
It corresponds to the zone between abscissae 4,450 and 4,800: this period approximately corresponds to the Asian crisis that took place between January 1997 and June 1998 (analysts do not agree upon the exact dates of the beginning and the end of this crisis: some of them date its beginning in mid-1997 or its end towards the end of 1999, or much later). On the graph of the log, we can observe that this period seems slightly more


Figure 1.6. Local Hölder function of the Nikkei index

irregular than others. In terms of exponents, we notice that it contains two maxima, with values greater than 1, both followed by low regularity points: this area comprises a high proportion of irregular points, since 12% of its points have an exponent lower than 0.15. This proportion is three times higher than that observed in the whole log. The analysis just performed has remained at an elementary level. However, it has allowed us to show that major events have repercussions on the evolution of the local Hölder exponent and that the graph of αl emphasizes features not easily visible on the original log.
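The procedure used in this section (oscillation in a ball of growing radius around each point, followed by a log-log regression against |y − z|) can be sketched as follows. This is only a minimal illustration in Python with NumPy, not the FracLab implementation: the function name `local_holder_exponent`, the choice of radii and the use of sample indices as abscissae are assumptions made for the example.

```python
import numpy as np

def local_holder_exponent(signal, index, radii):
    """Slope of log(oscillation) versus log|y - z| around `index`.

    For each radius r, (y, z) are the positions of the maximum and the
    minimum of the signal in the window [index - r, index + r].
    """
    log_dist, log_osc = [], []
    for r in radii:
        lo = max(0, index - r)
        hi = min(len(signal), index + r + 1)
        window = signal[lo:hi]
        i_max = int(np.argmax(window))
        i_min = int(np.argmin(window))
        osc = window[i_max] - window[i_min]   # oscillation in the ball
        dist = abs(i_max - i_min)             # |y - z| in samples
        if osc > 0 and dist > 0:
            log_osc.append(np.log(osc))
            log_dist.append(np.log(dist))
    if len(set(log_dist)) < 2:
        return np.nan                         # not enough points to regress
    slope, _intercept = np.polyfit(log_dist, log_osc, 1)
    return slope
```

For a price series stored in a hypothetical array `prices`, the exponent at sample `i` would be obtained with `local_holder_exponent(np.log(prices), i, [2, 4, 8, 16, 32])`. On short or noisy windows the regression is fragile, so the values should be read as rough estimates.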

1.4. Multifractal analysis

1.4.1. What is the purpose of multifractal analysis?

In the previous section, it has been observed that the Hölder functions provide precise information on the regularity at each point of a signal. In applications, this information is often useful as such, but there exist many cases where it is necessary to go further. Here are three examples that highlight this necessity. In image segmentation, we expect that edges correspond to low regularity points and hence to small exponents. However, the precise value of the Hölder exponents of contour points cannot be universal: non-linear transformations of an image, for instance, might preserve edges while modifying the exponent value. In this first situation, we see that the pointwise regularity does not provide all the relevant information and that it is necessary to supplement it with structural information. Let us now consider the issue of image texture classification. It is clear that classifying a pixel based on its


local exponent would not give satisfactory results. A more relevant approach would be to use the statistical distribution of exponents within zones. The same comment applies to the characterization of Internet traffic according to its degree of sporadicity. In this second situation, the Hölder function provides information that is too rich, and which we would like to balance in a certain sense. The last situation is when the exponents are too difficult to calculate: there exist, in particular, continuous signals, easy to synthesize, whose Hölder function is everywhere discontinuous. In this case, the pointwise regularity information is too complex to be used in its original form. In all these examples, we would like to use “higher level” information, which would be extracted from the Hölder function or would sum up, in some sense, its relevant features. Several ways of doing this exist. The idea that comes immediately to mind is simply to calculate histograms of exponents. This approach, however, is not appropriate, both for mathematical reasons that go beyond the scope of this chapter and because, in fractal analysis, we always try to deal with quantities that are scale-independent. The most relevant way to extract information from the Hölder function and to describe it globally is to perform a multifractal analysis. There are many variants and we will concentrate on two popular examples: the first is geometric and consists of calculating the dimension of the set of points possessing the same exponent. The second is statistical: we study the probability of finding, at a fixed resolution, a given exponent and how this probability evolves when the resolution tends to infinity. The following two sections are devoted to developing these notions.

1.4.2. First ingredient: local regularity measures

Before giving a detailed description of the local regularity variations of a signal, it is necessary to determine what method will be used to measure this regularity. It has been explained in section 1.3 that many characterizations are equally relevant and that the choice of one or the other is dictated by practical considerations and the type of application. Likewise, we may base multifractal analysis on various measures of local regularity. However, there are a certain number of advantages in using the pointwise exponent, which is reasonably simple while leading to a rich enough analysis. Therefore, the following text will use this measure of regularity, which corresponds to the most common choice.

Grain exponents

For reasons which will be explained below, we need to define a new class of exponents called grain exponents. These are simply approximations, at each finite resolution, of the usual exponents. For simplicity, let us assume that our signal X is defined on [0, 1], and let u denote an interval in [0, 1]. We first choose a descriptor $V_X(u)$ of the relevant information of X in u: if X is a measure μ, we will most often take $V_\mu(u) = \mu(u)$. If X is a function f, then $V_f(u)$ may for instance be


the absolute value of the increment of f in u, that is $|f(u_{\max}) - f(u_{\min})|$ (where $u = [u_{\min}, u_{\max}]$). A more precise descriptor is obtained by considering the oscillation instead, and setting $V_f(u) = \beta(f, u)$. Finally, in cases where the intervals u are dyadic, of the form $I_n^k = [(k-1)\,2^{-n}, k\,2^{-n}]$, a third possibility is to choose $V_f(I_n^k) = |c_n^k|$, where $c_n^k$ is the wavelet coefficient of f at scale n and position k (note that, in this case, most quantities defined below will additionally depend on the wavelet). Once $V_X(u)$ has been defined, the grain exponent may be calculated as follows:

\[ \alpha(u) = \frac{\log V_X(u)}{\log |u|} \]

Filtration

It is then necessary to define a sequence of partitions $(I_n^k)_{n \geq 0,\ k = 1 \ldots \nu_n}$ of [0, 1]. For each fixed n, the collection of the $\nu_n$ intervals $(I_n^k)_k$ constitutes a partition of [0, 1] and we require that, when n tends to infinity, $\max_k |I_n^k|$ tends to 0 (this implies, of course, that $\nu_n$ has to tend to infinity). A common (but not neutral) choice is to consider the dyadic intervals (and thus $\nu_n = 2^n$). The grain exponent $\alpha_n^k$ is then defined by:

\[ \alpha_n^k = \frac{\log V_X(I_n^k)}{\log |I_n^k|} \]

Intuitively, $\alpha_n^k$ does indeed measure the “singularity” of X in the (small) interval $I_n^k$: the smaller $\alpha_n^k$, the greater the variation of X in $I_n^k$, and vice versa.
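As an illustration of the definitions above, the following sketch computes grain exponents on the dyadic intervals of [0, 1], taking the oscillation as descriptor. It rests on assumptions that are not in the text: the function f is known through 2^N + 1 equally spaced samples, the oscillation is approximated by the difference between the largest and smallest sample in each closed interval, the 0-based index k of the code corresponds to the interval with index k + 1 in the text, and an interval with zero oscillation is given the exponent +∞.

```python
import numpy as np

def grain_exponents(samples, n):
    """Grain exponents log V / log |I| on the 2**n dyadic intervals of [0, 1].

    `samples` holds f(i / 2**N) for i = 0 .. 2**N, with N >= n, and V is
    the oscillation of the samples inside each closed interval.
    """
    N = (len(samples) - 1).bit_length() - 1
    step = 2 ** (N - n)                      # samples per dyadic interval
    alphas = np.full(2 ** n, np.inf)         # zero oscillation -> +inf
    for k in range(2 ** n):
        block = samples[k * step:(k + 1) * step + 1]
        osc = block.max() - block.min()
        if osc > 0:
            alphas[k] = np.log(osc) / np.log(2.0 ** -n)
    return alphas
```

For f(x) = |x − 1/2|^0.3, the two intervals adjacent to x = 1/2 have grain exponent 0.3 at every resolution, while the exponents of intervals away from 1/2 drift slowly towards 1 as n grows.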

Abstract function A

It is important to observe at this point that we can carry out a more general version of multifractal analysis by replacing the grain exponent α(u) with a function A defined on the metric space of closed intervals of [0, 1], and ranging in R+ ∪ {+∞} [LEV 04b]. In this context, more general results can be obtained.

1.4.3. Second ingredient: the size of point sets of the same regularity

In the same way as the local regularity can be defined through different approaches, there exist many ways of extracting high-level information.

Geometric method

Conceptually, the simplest method consists of considering the sets Eα of those points of [0, 1] possessing a given exponent α and then describing the size of Eα (to simplify notations and as the focus of this section is on the pointwise exponent,


we write α instead of $\alpha_p$). In many cases of practical and theoretical interest, these sets have zero Lebesgue measure for most values of α. In addition, they are often dense in [0, 1], with a box dimension equal to 1. To distinguish them, it is thus necessary to measure them by either their Hausdorff or their packing dimension. Here, only the first of these dimensions will be considered. We set:

\[ f_h(\alpha) = \dim_H(E_\alpha) \]

where $E_\alpha = \{x : \alpha(x) = \alpha\}$. The function $f_h$ is called the Hausdorff multifractal spectrum of X. Since the dimension of the empty set is −∞, we see that $f_h$ will take values in {−∞} ∪ [0, n] for an n-dimensional signal. Even though a strict definition of a multifractal object does not exist, it seems reasonable to talk of multifractality when $f_h(\alpha)$ is positive or zero for several values of α: indeed, this means that X will display different singular behaviors on different subsets of [0, 1]. Often, we will require that $f_h$ be strictly positive on an interval in order to consider X as truly multifractal. While the Hausdorff spectrum is simple to define, it requires an extremely delicate calculation in theoretical as well as numerical studies. The next subsection presents another multifractal spectrum, which is easier to calculate and which also approximates $f_h$ from above.

Statistical method

The second method used to globally describe the variations of regularity is to adopt a statistical approach (as opposed to $f_h$, which is a geometric spectrum): we first choose a value of α, and count the number of intervals $I_n^k$, at a given resolution n, where X possesses a grain exponent approximately equal to α. We then let the resolution tend to infinity and observe how this number evolves. More precisely, the large deviation spectrum $f_g$ is defined as:

\[ f_g(\alpha) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \frac{\log N_n^\varepsilon(\alpha)}{\log \nu_n} \]

where:

\[ N_n^\varepsilon(\alpha) = \#\{k : \alpha - \varepsilon \leq \alpha_n^k \leq \alpha + \varepsilon\} \]

We may heuristically understand this spectrum by letting $P_n$ denote the uniform probability law on $\{1, \ldots, \nu_n\}$, i.e. $P_n(k) = 1/\nu_n$ for $k = 1, \ldots, \nu_n$. Then, neglecting ε and assuming that the lower limit is a limit:

\[ P_n(\alpha_n^k \simeq \alpha) \simeq \nu_n^{f_g(\alpha) - 1} \tag{1.27} \]


In other words, if an interval $I_n^k$ at resolution n is drawn randomly (for a sufficiently large value of n), then the probability of observing a singularity exponent approximately equal to α is proportional to $\nu_n^{f_g(\alpha) - 1}$. From the definition, it is clear that $f_g$, like $f_h$, ranges in {−∞} ∪ [0, 1] (in one dimension). As a consequence, whenever $f_g(\alpha)$ is not equal to 1 (and thus is strictly smaller than 1), the probability of observing a grain exponent close to α decays exponentially fast to 0 when the resolution tends to infinity: only those α such that $f_g(\alpha) = 1$ will occur in the limit of infinite resolution. The study of this type of behavior and the determination of the associated exponential convergence speed (here, $f_g(\alpha) - 1$) are the topic of the branch of probability called “large deviation theory”, from which $f_g$ takes its name.

Nature of the variable ε

Let us consider the role of ε: at each finite resolution, the number of intervals $I_n^k$ is finite. As a consequence, $N_n^0(\alpha)$ will be zero except for a finite number of values of α. The variable ε then represents a “tolerance” on the exponent value, which is made necessary by the fact that we work at finite resolutions. Once the limit on n has been taken, this tolerance is no longer needed and we can let ε tend to 0 (note that the limit in ε always exists, as we are dealing with a monotonic function of ε). From a more general point of view, we may understand the difference between $f_h$ and $f_g$ as an inversion of the limits: in the case of $f_h$, we first let the resolution tend to infinity to calculate the exponents and then “count” how many points are characterized by a given exponent. In the case of $f_g$, we start by counting, at a given resolution, the number of intervals having a prescribed exponent, and then let the resolution tend to infinity. This second procedure renders the calculation of the spectrum easier, but in general it will obviously lead to a different result.
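To give an order of magnitude for (1.27), consider dyadic partitions ($\nu_n = 2^n$) and a purely hypothetical value $f_g(\alpha) = 1/2$, chosen only for illustration and not taken from any signal studied here:

```latex
\[
N_n^{\varepsilon}(\alpha) \simeq \nu_n^{f_g(\alpha)} = 2^{n/2},
\qquad
P_n(\alpha_n^k \simeq \alpha) \simeq \nu_n^{f_g(\alpha)-1} = 2^{-n/2}.
\]
```

At n = 20 this corresponds to roughly $2^{10} = 1{,}024$ intervals out of $2^{20} = 1{,}048{,}576$, i.e. a probability of about $10^{-3}$. Doubling n squares this probability, which is the exponential decay referred to above.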
In section 1.4.7, two examples are given that illustrate the difference between $f_h$ and $f_g$.

NOTE 1.3.– Historically, the large deviation spectrum was introduced as an easy means to calculate $f_h$ through arguments developed in section 1.4.4. It was then gradually realized that $f_g$ contains information sometimes more relevant than $f_h$, particularly in signal processing applications. For more details on this topic, see [LEV 98b], where the denominations “Hausdorff spectrum”, “large deviation spectrum” and “Legendre spectrum” were introduced.

The next section tackles the problem of calculating the multifractal spectra.

1.4.4. Practical calculation of spectra

Let us begin with the Hausdorff spectrum. It is clear that the calculation of the exponents at each point, and then of all the associated dimensions of the Hausdorff spectrum, is an extremely difficult task. It is thus desirable to look for indirect methods to evaluate $f_h$. Two of them are described below.


Multifractal formalism

In some cases, $f_h$ may be obtained as the Legendre transform of a function that is easily calculated. When this is the case, we say that the (strong) multifractal formalism holds. Define:

\[ S_n(q) = \sum_{k=1}^{2^n} V_X(I_n^k)^q \]

and set:

\[ \tau_n(q) = -\frac{1}{n} \log_2 S_n(q), \qquad \tau(q) = \liminf_{n \to \infty} \tau_n(q) \]

To understand the link between $f_h$ and τ heuristically, let us evaluate $\tau_n$ by grouping together the terms that have the same order of magnitude in the definition of $S_n$. To this end, fix a “Hölder exponent” α and consider those intervals for which, when n is sufficiently large, $V_X(I_n^k) \sim |I_n^k|^\alpha$. Assume that this approximation is true uniformly in k. We may then roughly estimate the number of such intervals by $2^{n f_h(\alpha)}$. Then:

\[ S_n(q) = \sum_\alpha \ \sum_{k \,:\, V_X(I_n^k) \sim |I_n^k|^\alpha} 2^{-nq\alpha} = \sum_\alpha 2^{-n(q\alpha - f_h(\alpha))} \]

Factorizing $2^{-n \inf_\alpha (q\alpha - f_h(\alpha))}$, we obtain for $\tau_n$:

\[ \tau_n(q) = \inf_\alpha \big(q\alpha - f_h(\alpha)\big) - \frac{1}{n} \log_2 \sum_\alpha 2^{-n[q\alpha - f_h(\alpha) - \inf_\alpha (q\alpha - f_h(\alpha))]} \]

and, taking the “limit” when n tends to infinity:

\[ \tau(q) = \inf_\alpha \big(q\alpha - f_h(\alpha)\big) =: f_h^*(q) \]

The transformation $f_h \to f_h^*$ is called the “Legendre transform” (for concave functions). Provided we can justify all the manipulations above, we thus expect that τ will be the Legendre transform of $f_h$. If, in addition, $f_h$ is concave, a well-known property of the Legendre transform guarantees that $f_h = \tau^*$. As announced above, $f_h$ is then obtained at once from τ. The important point is that τ is itself much easier to evaluate than $f_h$: there is no need to calculate any Hausdorff dimension and furthermore the definition of τ involves only average quantities, and no evaluations of pointwise Hölder exponents. The only difficulty lies in the estimation of the limit. Therefore, the estimation of τ will be in general much easier and more robust than that of $f_h$. For this reason, even though the multifractal formalism does not hold in


general, it is interesting to consider a new spectrum, called the Legendre spectrum and defined as¹ $f_l = \tau^*$. The study of the validity domain of the equality $f_h = f_l$ is probably one of the most examined issues in multifractal analysis, for obvious theoretical and practical reasons. However, this is an extremely complex problem, which has only been partially answered to this day, even in “elementary” cases such as the one of multiplicative processes (see section 1.4.6). It is easy to find counter-examples to the equality $f_h = \tau^*$. For instance, since a Legendre transform is always concave, the formalism certainly fails as soon as $f_h$ is not concave. There is no reason to expect that the spectrum of real signals will have this property. In particular, it is not preserved by addition, so that it will not be very stable in practice.

Other dimensional spectra

Instead of using the Legendre transform approach, another path consists of defining spectra which are close in spirit to $f_h$, but are easier to calculate. Since the major difficulty arises from the estimation of the Hausdorff dimension, we may try to replace it with the box dimension, which is simpler to evaluate. Unfortunately, merely replacing the Hausdorff dimension by Δ in the definition of the spectrum leads to uninteresting results. Indeed, as mentioned above, the sets $E_\alpha$ have, in many cases of interest, a box dimension equal to 1 (see section 1.4.6 for examples). In this situation, the spectrum obtained by replacing $\dim_H$ by Δ, being constant, will not supply any information. A more promising approach consists of defining dimension spectra as in [LEV 04b, TRI 99]. To do so, let us consider a general function of intervals A, as in section 1.4.2. For any x in [0, 1], set:

\[ \alpha_n(x) = A\big(u_n(x)\big) \]

where $u_n(x)$ is the dyadic interval of length $2^{-n}$ containing x (take for instance the right interval if two such intervals exist). Then let:

\[ E_\alpha(\varepsilon, N) = \{x : n \geq N \Rightarrow |\alpha_n(x) - \alpha| \leq \varepsilon\} \]

$E_\alpha(\varepsilon, N)$ is an increasing function of N, and thus $\bigcup_N E_\alpha(\varepsilon, N) = \sup_N E_\alpha(\varepsilon, N)$. Let:

\[ E_\alpha(\varepsilon) = \sup_N E_\alpha(\varepsilon, N) = \{x : \exists N \text{ such that } n \geq N \Rightarrow |\alpha_n(x) - \alpha| \leq \varepsilon\} \]

1. The inequalities $f_h \leq f_g \leq f_l$ are always true (see below). Thus, it would be more accurate to say that the Legendre spectrum is an approximation (by excess) of the large deviation spectrum, rather than that of the Hausdorff spectrum.


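Since the sets $E_\alpha(\varepsilon)$ decrease with ε, we may define:

\[ E_\alpha = \lim_{\varepsilon \to 0} E_\alpha(\varepsilon) = \{x : \alpha_n(x) \to \alpha \text{ when } n \to \infty\} \]

DEFINITION 1.5.– Let d be any dimension. We define the following spectra:

\[ f_d(\alpha) = d(E_\alpha) = d\Big(\lim_{\varepsilon \to 0} \sup_N E_\alpha(\varepsilon, N)\Big) \tag{1.28} \]

\[ f_d^{\lim}(\alpha) = \lim_{\varepsilon \to 0} d\big(E_\alpha(\varepsilon)\big) = \lim_{\varepsilon \to 0} d\Big(\sup_N E_\alpha(\varepsilon, N)\Big) \tag{1.29} \]

\[ f_d^{\lim\sup}(\alpha) = \lim_{\varepsilon \to 0} \sup_N d\big(E_\alpha(\varepsilon, N)\big) \tag{1.30} \]

When d is the Hausdorff dimension, then $f_d$ is just the Hausdorff spectrum $f_h$. Let $D = \overline{\mathrm{Im}(A)}$ be the closure of the image of A. Then D is the support of the spectra. Indeed, for any $\alpha \notin D$, $E_\alpha(\varepsilon, N) = \emptyset$ for sufficiently small ε, and thus $f_d(\alpha) = f_d^{\lim}(\alpha) = f_d^{\lim\sup}(\alpha) = -\infty$ (obviously this also applies to $f_g$). The following inequalities are easily proved:

\[ f_d \leq f_d^{\lim} \tag{1.31} \]

\[ f_d^{\lim\sup} \leq f_d^{\lim} \tag{1.32} \]

There is no relation in general between $f_d$ and $f_d^{\lim\sup}$. However, if d is σ-stable:

\[ f_d(\alpha) \leq f_d^{\lim}(\alpha) = f_d^{\lim\sup}(\alpha) \tag{1.33} \]

Besides, if $d_1$ and $d_2$ are two dimensions such that, for any $E \subset [0, 1]$, $d_1(E) \leq d_2(E)$, then:

\[ f_{d_1}(\alpha) \leq f_{d_2}(\alpha), \qquad f_{d_1}^{\lim}(\alpha) \leq f_{d_2}^{\lim}(\alpha), \qquad f_{d_1}^{\lim\sup}(\alpha) \leq f_{d_2}^{\lim\sup}(\alpha) \]

It is not hard to prove the following sequence of inequalities.

PROPOSITION 1.4.– For any A:

\[ f_h(\alpha) \leq f_h^{\lim\sup}(\alpha) = f_h^{\lim}(\alpha) \leq f_\Delta^{\lim\sup}(\alpha) \leq \min\big(f_\Delta^{\lim}(\alpha), f_g(\alpha)\big) \tag{1.34} \]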


This is an improvement on the traditional result $f_h(\alpha) \leq f_g(\alpha)$. In particular, as soon as $f_h(\alpha) = f_g(\alpha)$, all the above spectra coincide, except perhaps $f_\Delta^{\lim}$. If, on the contrary, $f_h$ is smaller than $f_g$, then we may hope that $f_\Delta^{\lim\sup}(\alpha)$ will be a better approximation of $f_h$ than $f_g$. The important point here is that the calculation of $f_\Delta^{\lim\sup}(\alpha)$ only involves box dimensions and that it is of the same order of complexity as that of $f_g$. The spectrum $f_\Delta^{\lim\sup}(\alpha)$ is thus a good substitute when we want to numerically estimate $f_h$. For example, the practical calculation of $f_\Delta^{\lim\sup}(\alpha)$ on a multinomial measure (see section 1.4.6) yields good results. Since d([0, 1]) = 1, all the spectra have a maximum lower than or equal to 1. In certain cases, a more precise result concerning $f_d^{\lim\sup}$, $f_d^{\lim}$ and $f_g$ is available:

PROPOSITION 1.5.– Let K be the set of x in [0, 1] such that the sequence $(\alpha_n(x))$ converges. Let us suppose that |K| > 0. Then, there exists $\alpha_0$ in D such that:

\[ f_d^{\lim\sup}(\alpha_0) = f_d^{\lim}(\alpha_0) = f_g(\alpha_0) = 1 \tag{1.35} \]

Let us note that such a constraint is typically not satisfied by $f_h$. This shows that $f_h$ contains, in general, more information. A more precise and more general result may be found in [LEV 04b]. The three spectra $f_d^{\lim\sup}$, $f_d^{\lim}$ and $f_g$ also obey a structural constraint, expressed by the following proposition.

PROPOSITION 1.6.– The functions $f_g$, $f_d^{\lim}$ and $f_d^{\lim\sup}$ are upper semi-continuous.

NOTE 1.4.– Recall that a function $f : D \subset \mathbb{R} \to \mathbb{R}$ is upper semi-continuous (USC) if, for any $x \in D$ and for any sequence $(x_n)$ of D converging to x:

\[ \limsup_{n \to \infty} f(x_n) \leq f(x) \tag{1.36} \]

Let us define the upper semi-continuous envelope of a function f by:

\[ \tilde{f}(\alpha) = \lim_{\varepsilon \to 0} \sup\{f(\beta) : |\beta - \alpha| \leq \varepsilon\} \]

Then, the above results imply that:

\[ \tilde{f}_d(\alpha) \leq f_d^{\lim}(\alpha) \tag{1.37} \]

Let us also mention the following result.

PROPOSITION 1.7.– Let A and B be two interval functions and C = max{A, B}. Let $f_g(\alpha, A)$, $f_g(\alpha, B)$ and $f_g(\alpha, C)$ denote the corresponding spectra. Then, for any α:

\[ f_g(\alpha, C) \leq \max\{f_g(\alpha, A), f_g(\alpha, B)\} \tag{1.38} \]


PROPOSITION 1.8.– If d is stable, then inequality (1.38) is true for $f_d$, $f_d^{\lim}$ and $f_d^{\lim\sup}$.

A significant result concerning the “inverse problem” for the spectra serves as a conclusion for this section.

PROPOSITION 1.9.– Let f be a USC function ranging in [0, 1] ∪ {−∞}. Then, there exists an interval function A whose $f_d^{\lim\sup}$ or $f_d^{\lim}$ spectrum is exactly f as soon as d is σ-stable or d = Δ.

Let us note that $f_d$, when d is σ-stable, is not necessarily USC. This shows once more that this spectrum is richer than the other ones (see [LEV 98b] for a study of the structural properties of $f_d$ with d = h).

Weak multifractal formalism

Let us now consider the numerical estimation of $f_g$. As in the case of $f_h$, two approaches exist: either we resort to a multifractal formalism with $f_g$ as the Legendre transform of a simple function, or we analyze in detail the definition of $f_g$ and deduce estimation methods from this. In the first case, the heuristic justification is the same as for $f_h$ and we expect that, under certain conditions, $f_g = \tau^*$. Since we avoid an inversion of limits as compared to the case of $f_h$, this formalism (sometimes called weak multifractal formalism, as opposed to the strong formalism which ensures the equality of $\tau^*$ and $f_h$) will be satisfied more often. However, the necessary condition that $f_g$ be concave, associated once again with the lack of stability, still limits its applicability. An important difference between the strong and weak formalisms is that, in the latter case, a precise and reasonably general criterion ensuring its validity is known. We are referring to a version of the Ellis theorem, one of the fundamental results in the theory of large deviations, which is recalled below in a simplified form.

THEOREM 1.9.– If $\tau(q) = \lim_{n \to \infty} \tau_n(q)$ exists as a limit (rather than a lower limit), and if it is differentiable, then $f_g = f_l$.

When $f_g$ is not concave, it cannot equal $f_l$, but the following result holds.

THEOREM 1.10.– If $f_g$ is equal to −∞ outside a compact set, then $f_l$ is its concave hull, i.e.:

\[ f_l = (f_g)^{**} \]

This theorem makes it possible to measure precisely the information which is lost when $f_g$ is replaced by $f_l$. See [LEV 98b] for related results.
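The route through τ and its Legendre transform can be sketched numerically as follows. The example is a rough illustration only: it takes the descriptor to be the mass of a binomial measure on dyadic intervals (with a hypothetical weight p), evaluates $\tau_n$ at a single finite resolution instead of taking the limit, and replaces the infimum over q by a minimum over a finite grid. The function names are assumptions, and skipping intervals of zero mass in the sum is a convention of this sketch rather than part of the definition.

```python
import numpy as np

def binomial_masses(p, n):
    """Masses of the 2**n dyadic intervals for a binomial cascade of weight p."""
    masses = np.array([1.0])
    for _ in range(n):
        # each interval passes a fraction p to its left half, 1 - p to its right half
        masses = np.column_stack((p * masses, (1.0 - p) * masses)).ravel()
    return masses

def tau_n(masses, qs):
    """tau_n(q) = -(1/n) log2 sum_k V_k**q, ignoring intervals of zero mass."""
    n = np.log2(len(masses))
    positive = masses[masses > 0]
    return np.array([-np.log2(np.sum(positive ** q)) / n for q in qs])

def legendre_transform(qs, taus, alphas):
    """f_l(alpha) = inf_q (q * alpha - tau(q)), with the infimum over the grid qs."""
    return np.array([np.min(qs * a - taus) for a in alphas])
```

A typical call would be `qs = np.linspace(-5, 5, 101)`, `taus = tau_n(binomial_masses(0.3, 10), qs)` and `legendre_transform(qs, taus, np.linspace(0.5, 1.8, 50))`. Because the grid in q is bounded, the result is only meaningful for exponents α between the slopes of τ at the two ends of the grid, and by construction it is always concave, whatever the shape of $f_g$.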


Continuous spectra

It is possible to prove that the previous relation is still valid in the more subtle case of continuous spectra [LEV 04b, TRI 99]. These continuous spectra constitute a generalization of $f_g$ that allows us to avoid choosing a partition. As already mentioned, this choice is not neutral and different partitions will, in general, lead to different spectra. To begin with, interval families are defined:

\[ R_\eta = \{u \text{ interval of } [0, 1] \text{ such that } |u| = \eta\} \]

\[ R_\eta(\alpha) = \{u : |u| = \eta,\ A(u) = \alpha\} \]

\[ R_\eta^\varepsilon(\alpha) = \{u : |u| = \eta,\ |A(u) - \alpha| \leq \varepsilon\} \]

DEFINITION 1.6 (continuous large deviation spectra).–

\[ f_g^c(\alpha) = \lim_{\varepsilon \to 0} \limsup_{\eta \to 0} \frac{\log\big(\frac{1}{\eta} |\cup R_\eta^\varepsilon(\alpha)|\big)}{-\log \eta} \]

\[ \tilde{f}_g^c(\alpha) = \limsup_{\eta \to 0} \frac{\log\big(\frac{1}{\eta} |\cup R_\eta(\alpha)|\big)}{-\log \eta} \]

Note that $f_g^c(\alpha)$ is defined similarly to $f_g$, except that all the intervals of a given length η where the variation of X is of the order of $\eta^{\alpha \pm \varepsilon}$ are considered, rather than only dyadic intervals. Since the number of these intervals may be infinite, we replace $N_n^\varepsilon(\alpha)$ with a measure of the average length, i.e. $\frac{1}{\eta} |\cup R_\eta^\varepsilon(\alpha)|$. Within this continuous framework, $R_\eta(\alpha)$ will, in general, be non-empty for infinitely many values of α and not only for at most $2^n$ values, as is the case for $N_n^0(\alpha)$: it is thus possible to get rid of ε and define the new spectrum $\tilde{f}_g^c$.

Legendre transform of continuous spectra

For any family R of intervals, ∪R denotes the union of all the intervals of R. A packing of R is a sub-family composed of disjoint intervals.

DEFINITION 1.7 (Legendre continuous spectrum).– For any real q, let:

\[ H^q(R) = \sup\Big\{\sum_{u \in R'} |u|^{qA(u)} : R' \text{ is a packing of } R\Big\} \]

and:

\[ J_\eta^q = \sup_\alpha H^q\big(R_\eta(\alpha)\big) \]

Set:

\[ \tau^c(q) = \liminf_{\eta \to 0} \frac{\log H^q(R_\eta)}{\log \eta}, \qquad \tilde{\tau}^c(q) = \liminf_{\eta \to 0} \frac{\log J_\eta^q}{\log \eta} \]

and finally $f_l^c = (\tau^c)^*$ and $\tilde{f}_l^c = (\tilde{\tau}^c)^*$. Here are the main properties of $f_g^c$, $\tilde{f}_g^c$, $f_l^c$ and $\tilde{f}_l^c$.

PROPOSITION 1.10.–
– $f_l^c$ and $\tilde{f}_l^c$ are concave functions;
– for any α, $\tilde{f}_g^c(\alpha) \leq f_g^c(\alpha)$ and $f_g(\alpha) \leq f_g^c(\alpha)$;
– if μ is a multinomial measure (see section 1.4.6), then $f_g^c(\alpha) = \tilde{f}_g^c(\alpha) = f_l^c(\alpha) = \tilde{f}_l^c(\alpha) = f_g(\alpha)$.

THEOREM 1.11.– If $f_g^c$ (respectively $\tilde{f}_g^c$) is equal to −∞ outside a compact interval, then, for any α:

\[ f_l^c(\alpha) = (f_g^c)^{**}(\alpha), \qquad \tilde{f}_l^c(\alpha) = (\tilde{f}_g^c)^{**}(\alpha) \]

PROPOSITION 1.11.–
– $\tau^c$ and $\tilde{\tau}^c$ are increasing and concave functions;
– $\tau^c(0) = \tilde{\tau}^c(0) = -\Delta(\mathrm{supp}(X))$;
– if X is a probability measure, then $\tau^c(1) = \tilde{\tau}^c(1) = 0$;
– $\tau^c(q) = \liminf_{n \to \infty} \log H^q(R_{\eta_n}) / \log \eta_n$, where $(\eta_n)$ is a sequence tending to zero such that $\log \eta_n / \log \eta_{n+1} \to 1$ when $n \to \infty$. The same is true for $\tilde{\tau}^c$.

The last property is important in numerical applications: it means that $\tau^c$ and $\tilde{\tau}^c$ may be estimated by using discrete sequences of the type $\eta_n = 2^{-n}$.

Kernel method

A second method to estimate $f_g$, which does not assume that the weak formalism is true (and thus in particular allows us to obtain non-concave spectra), is based on the following. Let K denote the “rectangular kernel”, i.e. K(x) = 1 for x ∈ [−1, 1] and K(x) = 0 elsewhere. Let $K_\varepsilon(t) = \frac{1}{\varepsilon} K(\frac{t}{\varepsilon})$. Then, by definition, $N_n^\varepsilon(\alpha) = 2^{n+1} \varepsilon\, K_\varepsilon * \rho_n(\alpha)$, where the symbol ∗ represents convolution. It is not hard to check that replacing K by any positive kernel with compact support whose


integral equals 1 in the definition of $N_n^\varepsilon(\alpha)$ will not change the value of $f_g$. A basic idea is then to use a more regular kernel than the rectangular one to improve the estimation. A more elaborate approach is to use ideas from density estimation to try and remove the double limit in the definition of $f_g$: this is performed by choosing ε to be a function of n in such a way that appropriate convergence properties are obtained [LEV 96b]. We may, for instance, show the following result:

PROPOSITION 1.12.– Assume that the studied signal is a finite sum of multinomial measures (see section 1.4.6). Let:

\[ f_n^\varepsilon(\alpha) = \frac{\log N_n^\varepsilon(\alpha)}{\log \nu_n} \]

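Then, if $\varepsilon_n$ is a sequence such that $\lim_{n \to \infty} \frac{\varepsilon_n}{\log(\nu_n)} = c$, where c is a positive constant:

\[ \lim_{n \to \infty} \sup_\alpha \big|f_n^{\varepsilon_n}(\alpha) - f_g(\alpha)\big| = 0 \]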
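The finite-resolution estimate and the idea of swapping the rectangular kernel for a more regular one can be sketched as follows. This is an illustrative sketch and not the estimator of [LEV 96b]: the test signal is a binomial measure with a hypothetical weight p, the tolerance `eps` is left as a free parameter, and the kernels are applied as plain weights on the count of grain exponents. The rectangular kernel is left unnormalized so that the weighted count equals the number of exponents within `eps` of α, while the triangular kernel integrates to 1. A constant factor in the kernel only shifts the estimate by a term of order 1 / log ν_n.

```python
import numpy as np

def binomial_grain_exponents(p, n):
    """Grain exponents log mu(I) / log |I| on the 2**n dyadic intervals."""
    masses = np.array([1.0])
    for _ in range(n):
        masses = np.column_stack((p * masses, (1.0 - p) * masses)).ravel()
    return np.log2(masses) / (-n)            # |I| = 2**-n

def rectangular(t):
    return (np.abs(t) <= 1).astype(float)

def triangular(t):
    return np.clip(1.0 - np.abs(t), 0.0, None)

def kernel_spectrum(grain_alphas, alpha_grid, eps, kernel):
    """log(weighted count near alpha) / log(nu_n) for each alpha of the grid."""
    nu_n = len(grain_alphas)
    out = np.full(len(alpha_grid), -np.inf)  # no exponent nearby -> -inf
    for i, a in enumerate(alpha_grid):
        weight = kernel((grain_alphas - a) / eps).sum()
        if weight > 0:
            out[i] = np.log(weight) / np.log(nu_n)
    return out
```

With p = 0.3, n = 12 and `eps = 0.05`, the rectangular estimate at the most probable exponent is log 924 / log 4096 ≈ 0.82, still well below the limiting value of 1. This shows how slowly the finite-resolution quantity converges and why the choice of ε as a function of n matters.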

Even without making such a strong assumption on the signal structure, it is still possible, in certain cases, to obtain convergence results, as above, with ε = ε(n), using more sophisticated statistical techniques.

To conclude our presentation of multifractal spectra, let us emphasize that no spectrum is “better” than the others in all respects. All the spectra give similar but different information. As was already observed for the local regularity measures, each one has advantages and drawbacks, and the choice has to be made in view of the application considered. If we are interested in the geometric aspects, a dimension spectrum should be favored. The large deviation spectrum will be used in statistical and most signal processing applications. When the amount of data is small or if it appears that the estimations are not reliable, we will resort to the Legendre spectrum. To be able to compare the different information and also to assess the quality of the estimations, it is important to have general theoretical relations between the spectra at our disposal. It is remarkable that, as seen above, such relations exist under rather weak hypotheses.

1.4.5. Refinements: analysis of the sequence of capacities, mutual analysis and multisingularity

In this section, some refinements of multifractal analysis, useful in applications, will be briefly discussed. The first refinement stems from the following consideration. Assume we are interested in the analysis of road traffic and that the signal X(k) at hand is the flow


per hour, i.e. the number of vehicles crossing a given small section of road during a fixed time interval (often of the order of a minute).² In this case, each point of the signal is not a pointwise value but corresponds to a space-time integral. We would thus be inclined to model such data by a measure and carry out multifractal analysis of this measure. However, it appears that, if we want to anticipate congestions, the relevant quantity to consider for a given time interval $T_n$ whose length n is large as compared to one minute is not the sum of the individual flows X(k) for k in $T_n$, but the maximum of these flows. Hence, instead of one signal, we would rather consider a sequence of signals, $X_n$, each yielding the maximum flow at the time scale n: $X_n(j) = \max_{k \in T_n} X(k)$. At each scale n, the signal $X_n$ is a set function (i.e. it does not give pointwise values but averages). However, the $X_n$ are not measures, as they are not additive: the maximum on the union of two disjoint intervals $T_1$ and $T_2$ is not, in general, the sum of the maxima on $T_1$ and $T_2$. Nonetheless, each $X_n$ possesses some regularity properties which allow us to model it as a Choquet capacity. Thus, we are led to generalize multifractal analysis so as to no longer process one signal, which would be a pointwise function or a measure, but a sequence of signals which are Choquet capacities. A number of other examples exist, particularly in image analysis, where such a generalization is found necessary (see [LEV 96a]). We will not give here the definition of a Choquet capacity, but we stress that the generalization of multifractal analysis to sequences of Choquet capacities, at first seemingly abstract, is in fact very simple. Indeed, nothing in the definition of multifractal analysis implies that the same signal has to be considered at each resolution, nor that the signal must be a function or a measure. In particular, the relations between the spectra are preserved in this generalization [LEV 98b].
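The sequence of signals $X_n$ described above, and the failure of additivity that prevents them from being measures, can be illustrated with a few lines of Python. The function name `block_maxima` and the convention of dropping an incomplete final block are assumptions of this sketch, and the flow values are made up.

```python
import numpy as np

def block_maxima(flow, n):
    """X_n(j): maximum of the flow over the j-th block of n consecutive samples."""
    usable = len(flow) - len(flow) % n       # drop an incomplete final block
    return flow[:usable].reshape(-1, n).max(axis=1)

flow = np.array([3, 1, 4, 1, 5, 9, 2, 6])    # made-up flows X(k)
x2 = block_maxima(flow, 2)                   # maxima over blocks of length 2
x4 = block_maxima(flow, 4)                   # maxima over blocks of length 4
```

Here `x2` is `[3, 4, 9, 6]` and `x4` is `[4, 9]`: the maximum over the union of the first two blocks is 4, the larger of the two maxima, and not their sum 7, which is what additivity would require.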
A different generalization consists of noting that, in the usual formulation of multifractal analysis, the Lebesgue measure L plays a particular role which may not be desirable. Let us for instance consider the definition of the grain exponent. The logarithm of the measure of the interval $I_n^k$ is compared to the logarithm of $|I_n^k|$, which is nothing but the Lebesgue measure of $I_n^k$. In the same way, when we define $f_h$, the Hausdorff dimension is calculated with respect to L. However, it is a classical fact that we may define an s-dimensional Hausdorff measure with respect to an arbitrary non-atomic³ measure μ by replacing the sum $\sum |U_j|^s$ with $\sum \mu(U_j)^s$. As a matter of fact, once a non-atomic reference measure μ has been chosen, we may rewrite all the definitions of multifractal analysis (Hölder exponent, grain exponent, spectra, etc.) by replacing L with μ. If this analysis is applied to a signal X (which may be a function, a measure or a capacity), we obtain the multifractal analysis of X with respect to μ. This type of analysis is called mutual multifractal analysis. There are several benefits to this

2. A multifractal analysis of traffic in Paris is presented in [VOJ 94].
3. A measure is non-atomic if it attributes zero mass to all singletons. The reasons for which one has to restrict to such measures go beyond the scope of this chapter; see [LEV 98b] for more details.


generalization. Let us briefly illustrate some of them through two examples. Assume that the signal to be studied X is equal to Y + Z, where Y and Z are two measures supported respectively on $[0, \frac{1}{2}]$ and on $[\frac{1}{2}, 1]$. Suppose that $f_{h,L}^Y(\alpha) \leq f_{h,L}^Z(\alpha)$ for all α, where $f_{h,L}^T$ is the Hausdorff spectrum of T with respect to L. Then, it is easy to see that $f_{h,L}^X(\alpha) = f_{h,L}^Z(\alpha)$ for any α. The Y component is not detected by the analysis. On the contrary, if we calculate the spectra with respect to the reference measure Z, then, in general, $f_{h,Z}^X(\alpha) \neq f_{h,Z}^Z(\alpha)$, since $f_{h,Z}^Z(\alpha)$ will be equal to 1 for α = 1 and to −∞ everywhere else. A change in the reference measure thus allows us to carry out a more accurate analysis, by shedding light on possibly hidden components. As a second example, consider an application in image analysis. It is possible to use mutual multifractal analysis to selectively detect certain types of changes in image sequences (for example, the appearance of manufactured objects in sequences of aerial images [CAN 96]). The idea consists of choosing the first image of the sequence as the reference measure and then analyzing each following image with respect to it. In the absence of any change, the mutual spectrum will be equal to 1 for α = 1 and to −∞ everywhere else. A change will lead to a spreading of this spectrum and it is possible to classify changes according to the corresponding values of the pair (α, f(α)) (see Chapter 11).

Finally, a third refinement consists of carrying out a multifractal analysis “at a point”, using 2-microlocal analysis: the so-called 2-microlocal formalism allows us to define a function, corresponding to $f_g$, which completely describes the singular behavior of a signal in the vicinity of any point. Such an analysis provides in particular ways to modify pointwise regularity in a powerful manner.

1.4.6. The multifractal spectra of certain simple signals

The paradigmatic example of multifractal measures is the Besicovitch measure, often called, in this context, the “binomial measure”. For x in [0, 1], write:

\[ x = \sum_{i=1}^{\infty} x_i 2^{-i}, \qquad \text{where } x_i \in \{0, 1\} \]

\[ \Phi_0^n(x) = \frac{1}{n} \sum_{i=1}^{n} (1 - x_i), \qquad \Phi_1^n(x) = \frac{1}{n} \sum_{i=1}^{n} x_i \]

According to (1.13):

\[ \alpha\big(u_n(x)\big) = -\Phi_0^n(x) \log_2 p - \Phi_1^n(x) \log_2 (1 - p) \]

Write $\liminf_{n \to \infty} \Phi_0^n(x) =: \Phi_0(x)$ and $\Phi_1(x) := 1 - \Phi_0(x)$. We obtain:

\[ \alpha_*(x) = -\Phi_0(x) \log_2 p - \Phi_1(x) \log_2 (1 - p) \]


The sets $E_\alpha$ are the sets of points of [0, 1] having a given proportion $\Phi_0(x)$ of 0 in their base-2 expansion. Calculations similar to those in Example 1.6 lead to:

$$\dim E_\alpha = -\Phi_0 \log_2 \Phi_0 - \Phi_1 \log_2 \Phi_1$$

The Hausdorff spectrum is thus given in parametric form by:

$$\alpha(\Phi_0) = -\Phi_0 \log_2 p - \Phi_1 \log_2 (1 - p)$$
$$f_h(\alpha) = -\Phi_0 \log_2 \Phi_0 - \Phi_1 \log_2 \Phi_1$$

Note also that the set of points in (0, 1) for which $\Phi_0^n(x)$ does not converge has Hausdorff dimension equal to 1, although it has zero Lebesgue measure: indeed, the strong law of large numbers entails that, for L-almost every x, $\Phi_0(x) = \Phi_1(x) = \frac{1}{2}$. An immediate consequence is that, for $\alpha_m := -\frac{1}{2} \log_2 \big(p(1 - p)\big)$, $f_h(\alpha_m) = 1$.

Now consider the grain exponents. We observe that, L-a.s., $\alpha_n^k \to \alpha_m$. The function $f_g(\alpha)$ will measure the speed at which $\Pr(|\alpha_n^k - \alpha_m| > \varepsilon)$ tends to 0 for ε > 0 when n → ∞. For fixed n there are exactly $C_n^k$ intervals $I_n^j$ such that $\Phi_0^n(x) = \frac{k}{n}$ for $x \in I_n^j$. This makes it possible to evaluate $N_n^\varepsilon(\alpha)$, which is equal to $C_n^k$ for ε sufficiently small and α close to $\alpha(\Phi_0 = \frac{k}{n})$. Using Stirling’s formula to estimate $C_n^k$, we can then obtain $f_g(\alpha)$. However, these calculations are somewhat tedious, in particular due to the double limit, and it is much faster to evaluate $f_l(\alpha)$ and use the general results on the relationships between the three spectra. By definition:

$$\tau(q) = \liminf_{n \to \infty} \frac{\log_2 \sum_{j=0}^{2^n - 1} \mu(I_n^j)^q}{-n}$$

Now, $\mu(I_n^j) = p^k (1 - p)^{n-k}$ exactly $C_n^k$ times, for k = 0 . . . n. As a consequence:

$$\sum_{j=0}^{2^n - 1} \mu(I_n^j)^q = \sum_{k=0}^{n} C_n^k \, p^{kq} (1 - p)^{(n-k)q} = \big(p^q + (1 - p)^q\big)^n$$

and:

$$\tau(q) = -\log_2 \big(p^q + (1 - p)^q\big)$$

A simple Legendre transform calculation then shows that $f_l(\alpha) = \tau^{*}(\alpha)$ is equal to $f_h(\alpha)$. Since $f_h \le f_g \le f_l$ is always true, it follows that the strong multifractal formalism holds in the case of the binomial measure: $f_h = f_g = f_l$.
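
The partition-sum identity and the parametric spectrum above can be checked numerically. The following is only a minimal sketch: the function names (`binomial_masses`, `tau`, `spectrum_point`) are hypothetical, and it assumes that the digit 0 (left child interval) carries the weight p, which is the convention suggested by the expression of the exponent in terms of $\Phi_0$.

```python
from math import log2


def binomial_masses(p, n):
    """Masses of the 2**n dyadic intervals of the binomial measure at depth n.

    Assumption: the left child (digit 0) receives weight p, the right child 1 - p.
    """
    masses = [1.0]
    for _ in range(n):
        masses = [m * w for m in masses for w in (p, 1.0 - p)]
    return masses


def tau(p, q):
    """Closed form tau(q) = -log2(p**q + (1 - p)**q)."""
    return -log2(p ** q + (1.0 - p) ** q)


def spectrum_point(p, phi0):
    """Parametric point (alpha, f) for a proportion 0 < phi0 < 1 of zeros."""
    phi1 = 1.0 - phi0
    alpha = -phi0 * log2(p) - phi1 * log2(1.0 - p)
    f = -phi0 * log2(phi0) - phi1 * log2(phi1)
    return alpha, f


if __name__ == "__main__":
    p, n, q = 0.3, 10, 2.0
    masses = binomial_masses(p, n)
    partition_sum = sum(m ** q for m in masses)
    # Both columns should agree up to rounding error.
    print(partition_sum, (p ** q + (1.0 - p) ** q) ** n)
    print(log2(partition_sum) / (-n), tau(p, q))
    # Top of the spectrum: phi0 = 1/2 gives f = 1 at alpha_m.
    print(spectrum_point(p, 0.5))
```

Because the measure is exactly multiplicative, the finite-depth ratio $\log_2(\sum_j \mu(I_n^j)^q)/(-n)$ already equals τ(q) for every n, so no limit has to be approximated in this particular case.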

Figure 1.7. Spectrum of the binomial measure

Figure 1.8. Spectrum of the sum of two binomial measures

Let us note, however, that simple operations will break the formalism: for instance, the sum of two Besicovitch measures on [0, 1], or the lumping of two such measures with disjoint supports, will in general no longer satisfy $f_h = f_g = f_l$ (see Figures 1.7 and 1.8).

There are many generalizations of the Besicovitch measure. We can replace base 2 by base b > 2, i.e. partition [0, 1] into b sub-intervals. We then speak of multinomial measures. We may also distribute the measure on b' < b intervals, in which case the support of the measure will be a Cantor set. Another variation is to consider stochastic constructions by choosing the partitioning and/or p as random variables following certain probability laws. An example of such a stochastic Besicovitch measure, where, at each scale, p is chosen as an iid lognormal random variable, is represented in Figure 1.9. This type of measure can serve as a starting point for the modeling of certain Internet traffic.

Figure 1.9. A random binomial measure

Essentially the same analysis as above applies to the self-affine functions defined in Example 1.7. Similar parametric formulae are obtained for the spectra, which also coincide in this case. Let us mention as a final note that such measures and functions have a Hölder function which is everywhere discontinuous and has level lines which are everywhere dense. We have $\alpha_l(x) = \alpha_0$ for all x, where $\alpha_0 = \min_t \alpha_p(t)$.

Let us now consider the Weierstrass function:

$$W(x) = \sum_{n=0}^{\infty} \omega^{-nH} \cos(\omega^n x)$$

As was already mentioned, α(t) = H for any t for W. As a consequence:

$$f_h(H) = 1, \qquad f_h(\alpha) = -\infty \ \text{ if } \alpha \neq H$$

The value of the large deviation spectrum depends on the choice of function $V_W(u)$. Taking $V_W(u) = \beta(W, u)$ leads to $f_g = f_h$. However, if $V_W(u)$ is defined to be the increment of W in u, then, for certain values of ω:

$$f_g(\alpha) = \begin{cases} -\infty & \text{if } \alpha < H \\ H + 1 - \alpha & \text{if } H \le \alpha \le H + 1 \\ -\infty & \text{if } \alpha > H + 1 \end{cases}$$

The heuristic explanation of the fact that $f_g$ is positive for some values of α larger than H is as follows: at each finite resolution ε, the increments of W in intervals of size ε are at most of the order of $\varepsilon^H$, because the exponent is defined as a lim inf. However, there will also exist intervals for which the increments are smaller, yielding a larger observed grain exponent. The $f_g$ spectrum does in fact measure the speed at which the probability of finding one of these smoother increments tends to 0 when ε tends to 0. To conclude, we note that, in both cases (oscillations and increments), $f_g$ is concave and coincides with $f_l$.

Similar results hold for fractional Brownian motion. Of course, since fractional Brownian motion is a stochastic process, its spectrum is a priori a random function. However, we can show that, with probability 1, $f_h$ and $f_g$ (defined either using oscillations or increments) are exactly the same as those given above for the Weierstrass function.

1.4.7. Two applications

To conclude this chapter, we briefly mention two applications of multifractal analysis to signal and image processing. Our intention is not to go into the details of these applications (they are developed in Chapters 11 and 12), but to indicate, in a simplified way, how the tools introduced above are put into practice.

1.4.7.1. Image segmentation

The issue of edge detection in images allows us to illustrate in a concrete way the relevance of multifractal spectra, as well as the difference between $f_h$ and $f_g$. We have observed above that edge points are irregular points, but that they cannot be characterized by a universal value of $\alpha_p$, since non-linear transformations of an image may leave the contours unchanged while modifying the exponents. To characterize edges, it is necessary to include higher level information.
This information may be obtained from the following observation: (smooth) edges of an image form a set of lines, which is of dimension one. Looking for contours thus means looking for sets of points which are characterized by specific values of $\alpha_p$ (local regularity criterion), and such that their associated dimension is 1 (global criterion). In other words, we will characterize edges as those points possessing an exponent which belongs to $f_h^{-1}(1)$. This provides a geometric characterization of edges: on the one hand, we rely on pointwise exponents, which measure the regularity at infinite resolution; on the other hand, we use $f_h$, which is a dimension spectrum.

However, it is also possible to follow a statistical approach: assume we consider a very simple image which contains only a black line, the “edge”, drawn on a white background. If we draw a point uniformly at random in the image at infinite resolution, the probability of hitting the line is zero. However, at any finite resolution, where the image is made of, say, $2^n \times 2^n$ pixels, the probability of hitting the edge is of the order $2^{-n}$, since the edge contains $2^n$ pixels. According to the definition of $f_g$ (recall in particular (1.27)), we see that, on the black line, $f_g(\alpha) = 1$. In this approach, we thus characterize an edge point as a point possessing a singularity α whose probability of occurrence decreases as $2^{-n}$ when the resolution tends to infinity. When the multifractal formalism holds, the geometric and statistical approaches yield the same result. For more details, see Chapter 11 and [LEV 96a].

1.4.7.2. Analysis of TCP traffic

Our second application deals with the modeling and analysis of TCP traffic. Here the situation is a little different from that of the previous application: contrary to typical images, TCP traffic possesses, under certain conditions, a multifractal structure. In the language of the discussion at the end of the introduction to this chapter, we here use fractal methods to study a fractal object. This will entail few changes as far as the analysis of the data is concerned. However, dealing with a multifractal signal brings up the question of the source of this multifractality, and thus of a model capable of explaining this phenomenon. This issue will not be tackled here. See Chapter 12 and [LEV 97, LEV 01, RIE 97, BLV 01].

What is the use of carrying out a multifractal analysis of TCP?
First of all, the range of values taken by the Hölder exponents provides important information on the small-scale behavior of traffic. The smaller α is, the more sporadic traffic will be, which means that variations on short time intervals will be significant. The spectrum also allows us to elucidate what is typical behavior, i.e. the value $\alpha_0$ such that $f_g(\alpha_0) = 1$: with high probability, the variation of traffic between two close time instants $(t_1, t_2)$ will be of the order $|t_2 - t_1|^{\alpha_0}$. While this typical behavior is important for the understanding and the management of the network, it is also useful to know which other variations may occur, and with what probabilities. This is exactly the information provided by $f_g$. Thus, the whole large deviation spectrum is useful in this application.

Let us note that, in contrast, the Hausdorff spectrum is probably less adapted here. First, the relevant physical quantities are increments at different time scales, small but finite; there is no notion of regularity at infinite resolution, as is the case with images. Second, the relevant information is statistical in nature rather than geometric.

To conclude, let us mention that the large deviation spectrum of certain TCP traces, as estimated by the kernel method, displays a shape reminiscent of that of the sum of two binomial measures. This provides useful information on the fine structure of these traces. In particular, $f_g$ is not concave. This shows the advantage of having procedures available to estimate spectra without making the assumption that a multifractal formalism is satisfied.

1.5. Bibliography

[BLV 01] BARRAL J., LÉVY VÉHEL J., “Multifractal analysis of a class of additive processes with correlated nonstationary increments”, Electronic Journal of Probability, vol. 9, p. 508–543, 2001.
[BON 86] BONY J., “Second microlocalization and propagation of singularities for semilinear hyperbolic equations”, in Hyperbolic equations and related topics (Katata/Kyoto, 1984), Academic Press, Boston, Massachusetts, p. 11–49, 1986.
[BOU 28] BOULIGAND G., “Ensembles impropres et nombre dimensionnel”, Bull. Soc. Math., vol. 52, p. 320–334 and 361–376, 1928 (see also Les définitions modernes de la dimension, Hermann, 1936).
[BOU 00] BOUSCH T., HEURTEAUX Y., “Caloric measure on the domains bounded by Weierstrass-type graphs”, Ann. Acad. Sci. Fenn. Math., vol. 25, p. 501–522, 2000.
[CAN 96] CANUS C., LÉVY VÉHEL J., “Change detection in sequences of images by multifractal analysis”, in ICASSP’96 (Atlanta, Georgia), 1996.
[DAO 02] DAOUDI K., LÉVY VÉHEL J., “Signal representation and segmentation based on multifractal stationarity”, Signal Processing, vol. 82, no. 12, p. 2015–2024, 2002.
[DUB 89] DUBUC B., TRICOT C., ROQUES-CARMES C., ZUCKER S., “Evaluating the fractal dimension of profiles”, Physical Review A, vol. 39, p. 1500–1512, 1989.
[GRE 85] GREBOGI C., MCDONALD S., OTT E., YORKE J., “Exterior dimension of large fractals”, Physics Letters A, vol. 110, p. 1–4, 1985.
[HAR 16] HARDY G., “Weierstrass’s non-differentiable function”, Transactions of the American Mathematical Society, vol. 17, p. 301–325, 1916.
[HAU 19] HAUSDORFF F., “Dimension und äusseres Mass”, Math. Ann., vol. 79, p. 157–179, 1919.
[JAF 96] JAFFARD S., MEYER Y., “Wavelet methods for pointwise regularity and local oscillations of functions”, Mem. Amer. Math. Soc., vol. 123, 1996.
[KAH 63] KAHANE J., SALEM R., Ensembles parfaits et séries trigonométriques, Hermann, 1963.
[KOL 61] KOLMOGOROV A., TIHOMIROV V., “Epsilon-entropy and epsilon-capacity of sets in functional spaces”, American Mathematical Society Translations, vol. 17, p. 277–364, 1961.
[KOL 01] KOLWANKAR K., LÉVY VÉHEL J., “Measuring functions smoothness with local fractional derivatives”, Frac. Calc. Appl. Anal., vol. 4, no. 3, p. 285–301, 2001.
[KOL 02] KOLWANKAR K., LÉVY VÉHEL J., “A time domain characterization of the fine local regularity of functions”, J. Fourier Anal. Appl., vol. 8, no. 4, p. 319–334, 2002.
[LEV 96a] LÉVY VÉHEL J., “Introduction to the multifractal analysis of images”, in FISHER Y. (Ed.), Fractal Image Encoding and Analysis, Springer-Verlag, 1996.
[LEV 96b] LÉVY VÉHEL J., “Numerical computation of the large deviation multifractal spectrum”, in CFIC (Rome, Italy), 1996.
[LEV 97] LÉVY VÉHEL J., RIEDI R., “Fractional Brownian motion and data traffic modeling: The other end of the spectrum”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering, Springer-Verlag, 1997.
[LEV 98a] LÉVY VÉHEL J., GUIHENEUF B., “2-Microlocal analysis and applications in signal processing”, in International Wavelets Conference (Tangier), 1998.
[LEV 98b] LÉVY VÉHEL J., VOJAK R., “Multifractal analysis of Choquet capacities”, Advances in Applied Mathematics, vol. 20, no. 1, p. 1–43, 1998.
[LEV 01] LÉVY VÉHEL J., SIKDAR B., “A multiplicative multifractal model for TCP traffic”, in ISCC’2001 (Tunisia), 2001.
[LEV 02] LÉVY VÉHEL J., “Multifractal processing of signals”, forthcoming.
[LEV 04a] LÉVY VÉHEL J., SEURET S., “The 2-microlocal formalism”, in Fractal Geometry and Applications: A Jubilee of Benoit Mandelbrot, Proc. Sympos. Pure Math., PSPUM, vol. 72, Part 2, p. 153–215, 2004.
[LEV 04b] LÉVY VÉHEL J., “On various multifractal spectra”, in BANDT C., MOSCO U., ZÄHLE M. (Eds.), Fractal Geometry and Stochastics III, Progress in Probability, Birkhäuser Verlag, vol. 57, p. 23–42, 2004.
[MCM 84] MCMULLEN C., “The Hausdorff dimension of general Sierpinski carpets”, Nagoya Mathematical Journal, vol. 96, p. 1–9, 1984.
[RIE 97] RIEDI R., LÉVY VÉHEL J., TCP traffic is multifractal: a numerical study, Technical Report RR-3129, INRIA, 1997.
[ROU 98] ROUEFF F., LÉVY VÉHEL J., “A regularization approach to fractional dimension estimation”, in Fractals’98 (Malta), 1998.
[SEU 02] SEURET S., LÉVY VÉHEL J., “The local Hölder function of a continuous function”, Appl. Comput. Harmon. Anal., vol. 13, no. 3, p. 263–276, 2002.
[TRI 82] TRICOT C., “Two definitions of fractal dimension”, Math. Proc. Camb. Phil. Soc., vol. 91, p. 57–74, 1982.
[TRI 86a] TRICOT C., “Dimensions de graphes”, Comptes rendus de l’Académie des sciences de Paris, vol. 303, p. 609–612, 1986.
[TRI 86b] TRICOT C., “The geometry of the complement of a fractal set”, Physics Letters A, vol. 114, p. 430–434, 1986.
[TRI 87] TRICOT C., “Dimensions aux bords d’un ouvert”, Ann. Sc. Math. Québec, vol. 11, no. 1, p. 205–235, 1987.
[TRI 88] TRICOT C., QUINIOU J., WEHBI D., ROQUES-CARMES C., DUBUC B., “Evaluation de la dimension fractale d’un graphe”, Rev. Phys. Appl., vol. 23, p. 111–124, 1988.
[TRI 99] TRICOT C., Courbes et dimension fractale, Springer-Verlag, 2nd edition, 1999.
[VOJ 94] VOJAK R., LÉVY VÉHEL J., DANECH-PAJOUH M., “Multifractal description of road traffic structure”, in Seventh IFAC/IFORS Symposium on Transportation Systems: Theory and Application of Advanced Technology (Tianjin, China), p. 942–947, 1994.

Chapter 2

Scale Invariance and Wavelets

2.1. Introduction

Processes presenting “power law” spectra (often regrouped under the restrictive, but generic, term of “1/f ” processes) appear in various domains: hydrology [BER 94], ﬁnance [MAN 97], telecommunications [LEL 94, PAR 00], turbulence [CAS 96, FRI 95], biology [TEI 00] and many more [WOR 96]. The characteristics of these processes are based upon concepts such as fractality, self-similarity or long-range dependence and, even though these different notions are not equivalent, they all possess a common characteristic: that of replacing the idea of a structure related to a preferred time scale with that of an invariant relationship between different scales.

The study of scale invariant processes presents several difﬁculties ranging from modeling to analysis and processing, for which few tools were available until recently [BER 94]. The effective possibility of appropriately manipulating these processes has recently been reinforced by the appearance of adequate multiresolution techniques: the tools which are referred to here have been developed for this purpose. These tools are explicitly based on the theoretical as well as algorithmic potentialities offered by wavelet transforms.

Chapter written by Patrick FLANDRIN, Paulo GONÇALVES and Patrice ABRY.


2.2. Models for scale invariance

2.2.1. Intuition

From a qualitative point of view, the idea behind a 1/f spectrum involves situations where, given a signal observed in the time domain, its empirical power spectrum density behaves as $S(f) = C|f|^{-\alpha}$ with α > 0. From a practical viewpoint, it is evidently the equivalent form:

$$\log S(f) = \log C - \alpha \log |f|$$

which is the most significant, since the “1/f” character is translated by a straight line in doubly logarithmic coordinates.

Generally, when dealing with physical observations, referring to some 1/f spectral behavior is only meaningful with respect to a frequency analysis band. Therefore, the introduction, on the half-line of positive frequencies, of two (adjustable) frequencies $f_{bf}$ and $f_{hf}$ such that $0 < f_{bf} < f_{hf} < +\infty$ leads to a classification into three regimes, according to which of the three domains defined by this partition exhibits the 1/f behavior. Let us consider each of these cases:

1) $f_{bf} \le f \le f_{hf}$: this is the context of a bandpass domain where we simply observe an algebraic decrease of the spectrum density, without a predominant frequency;
2) $f_{hf} \le f < +\infty$: the 1/f character is dominant in the high frequency limit and highlights the local regularity of the sample paths, their variability and their fractal nature;
3) $0 < f \le f_{bf}$: the power law of the spectrum intervenes here in the limit of low frequencies, resulting in a divergence at the origin of the spectrum density. If 0 < α < 1, this divergence corresponds to an algebraic decrease of the correlation function, which is slow enough for the latter not to be summable; there is long-range dependence or long memory.

In fact, these three regimes represent three different properties; hence, they have no reason to exist at the same time. However, they possess the common denominator of being linked to an idea of scale invariance according to which, within a scale range and up to some renormalization, the properties of the whole are the same as those of the parts (self-similarity). Indeed, a power law spectrum belongs to the class of homogeneous functions. Its form therefore remains invariant under scaling in the sense that, for any $c \in \mathbb{R}^{+}$:

$$S(f) = C|f|^{-\alpha} \quad \Longrightarrow \quad S(cf) = C|cf|^{-\alpha} = c^{-\alpha} S(f)$$
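
The straight-line reading of a power-law spectrum in doubly logarithmic coordinates can be sketched as an ordinary least-squares fit. This is only an illustration on exact, noise-free values; the helper names `power_law` and `loglog_slope` are hypothetical, and on measured spectra the fit should of course be restricted to the frequency band in which the 1/f behavior is assumed to hold.

```python
from math import exp, log


def power_law(f, C, alpha):
    """Model spectrum S(f) = C * |f|**(-alpha)."""
    return C * abs(f) ** (-alpha)


def loglog_slope(freqs, spectrum):
    """Least-squares slope and intercept of log S(f) against log |f|."""
    xs = [log(abs(f)) for f in freqs]
    ys = [log(s) for s in spectrum]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


if __name__ == "__main__":
    C, alpha = 2.0, 0.8
    freqs = [0.01 * k for k in range(1, 101)]
    spectrum = [power_law(f, C, alpha) for f in freqs]
    slope, intercept = loglog_slope(freqs, spectrum)
    # The slope estimates -alpha and exp(intercept) estimates C.
    print(slope, exp(intercept))
    # Homogeneity: S(c f) = c**(-alpha) * S(f).
    c, f0 = 3.0, 0.25
    print(power_law(c * f0, C, alpha), c ** (-alpha) * power_law(f0, C, alpha))
```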


Given that, through Fourier transformation, a dilation or compression in the frequency domain is translated by a corresponding compression or dilation in the time domain, it is thus legitimate to expect “1/f” processes to be closely coupled with self-similar processes.

2.2.2. Self-similarity

To be more precise [BER 94, SAM 94] (see also Chapter 5), we introduce the following definition.

DEFINITION 2.1.– A process $X = \{X(t), t \in \mathbb{R}\}$ is said to be self-similar of index H > 0 if, for any $c \in \mathbb{R}^{+}$, it satisfies the following equality in a distributional sense:

$$\{X(t), t \in \mathbb{R}\} \overset{\mathcal{L}}{=} \{c^H X(t/c), t \in \mathbb{R}\}, \qquad \forall c > 0. \qquad (2.1)$$

According to this definition, a self-similar process does not possess any characteristic scale, insofar as it remains (statistically) identical to itself after any scale change. If, from a theoretical point of view, self-similarity is likely to extend from the largest to the finest scales, the above-mentioned definition must, in general, go with a scale domain (i.e., with a variation domain of the factor c) for which the invariance has a meaning. For instance, the finite duration of an observation sets the maximum attainable scale, in the same way as the finite resolution of a sensor limits the finest scale.

It is noteworthy that, if a (second-order) process is self-similar, it is necessarily non-stationary. Indeed, assuming that, at some arbitrary time instant $t_1 := 1$, the condition $\operatorname{var} X(t_1) \neq 0$ holds, it stems from Definition 2.1 that $\operatorname{var} X(t) = \operatorname{var} X(t \times 1) = t^{2H} \operatorname{var} X(t_1)$ and, as a consequence, the variance of the process depends on time. This behavior applies to all finite moments of X:

$$E|X(t)|^q = E|X(1)|^q \, |t|^{qH}. \qquad (2.2)$$
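
As a worked step, relation (2.2) can be obtained directly from Definition 2.1. The lines below are a sketch restricted to t > 0 and to moments that are assumed finite; they use the equality in law (2.1) at a single time instant only.

```latex
\begin{align*}
X(t) &\overset{\mathcal{L}}{=} t^{H} X(1)
  && \text{(take } c = t > 0 \text{ in (2.1), evaluated at time } t\text{)} \\
E|X(t)|^{q} &= t^{qH}\, E|X(1)|^{q}
  && \text{(equal laws have equal moments)} \\
\operatorname{var} X(t) &= t^{2H} \operatorname{var} X(1)
  && \text{(second-order case)}
\end{align*}
```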

Therefore, in a strict sense, an ordinary spectrum cannot be attached to a self-similar process. Nevertheless, there exists an interesting sub-class of the class of self-similar processes, which, in a sense, could be paralleled with that of stationary processes: it is that of processes with stationary increments, defined as follows [BER 94, SAM 94].

DEFINITION 2.2.– A process $X = \{X(t), t \in \mathbb{R}\}$ is said to have stationary increments if and only if, for any $\theta \in \mathbb{R}$, the law of its increment process:

$$X^{(\theta)} := \{X^{(\theta)}(t) := X(t + \theta) - X(t), \ t \in \mathbb{R}\}$$

does not depend on t.


Figure 2.1. Path of a self-similar process. When we simultaneously apply to the sample path of a self-similar process a dilation of the time axis by a factor c and a dilation of the amplitude axis by a factor $c^{-H}$, we obtain a new sample path that is (statistically) indistinguishable from the original

In this definition, the parameter θ plays the role of a time scale according to which we study the process X. Indeed, the self-similarity of the latter is translated on its increments by:

$$\{X^{(\theta)}(t), t \in \mathbb{R}\} \overset{\mathcal{L}}{=} \{c^H X^{(\theta/c)}(t/c), t \in \mathbb{R}\}, \qquad \forall c > 0. \qquad (2.3)$$

It is evidently possible to extend this definition to higher order increments (“increments of increments”). Coupling self-similarity of index H and the stationary increments property implies that the parameter H remains in the range 0 < H < 1. Moreover, the covariance function of such a process (assumed centered and zero at the origin) must be of the form:

$$E\,X(t)X(s) = \frac{\sigma^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right) \qquad (2.4)$$
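
The covariance structure (2.4) is simple enough to be written down and sanity-checked directly. The sketch below is illustrative only; the function name `fbm_cov` and its `sigma2` argument (standing for $\sigma^2$) are hypothetical names introduced here.

```python
def fbm_cov(t, s, H, sigma2=1.0):
    """Covariance (2.4) of a self-similar process with stationary increments."""
    return 0.5 * sigma2 * (abs(t) ** (2 * H) + abs(s) ** (2 * H) - abs(t - s) ** (2 * H))


if __name__ == "__main__":
    H = 0.7
    # Variance grows as |t|**(2H).
    print(fbm_cov(2.0, 2.0, H), 2.0 ** (2 * H))
    # H = 1/2 gives the covariance min(t, s) of ordinary Brownian motion (t, s > 0).
    print(fbm_cov(2.0, 5.0, 0.5), min(2.0, 5.0))
    # Self-similarity: cov(c t, c s) = c**(2H) * cov(t, s).
    c = 3.0
    print(fbm_cov(c * 2.0, c * 5.0, H), c ** (2 * H) * fbm_cov(2.0, 5.0, H))
```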


with the identification $\sigma^2 := E|X(1)|^2$. Indeed, if we adopt the convention according to which at time t = 0, X(0) = 0, it follows from the assumptions made that:

$$E\,X(t)X(s) = \frac{1}{2} \left[ E\,X^2(t) + E\,X^2(s) - E\big(X(t) - X(s)\big)^2 \right]$$
$$= \frac{1}{2} \left[ E\,X^2(t) + E\,X^2(s) - E\big(X(t - s) - X(0)\big)^2 \right]$$
$$= \frac{E|X(1)|^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right),$$

which explains the structure of relation (2.4). Moreover, the correlation function of the increment processes $X^{(\theta)}$ reads:

$$E\,X^{(\theta)}(t) X^{(\theta)}(t + \tau) = \frac{\sigma^2}{2} \left( |\tau + \theta|^{2H} + |\tau - \theta|^{2H} - 2|\tau|^{2H} \right) = \frac{\sigma^2}{2} |\tau|^{2H} \left( \left| 1 + \frac{\theta}{\tau} \right|^{2H} + \left| 1 - \frac{\theta}{\tau} \right|^{2H} - 2 \right). \qquad (2.5)$$

It is now possible to study in depth its asymptotic behaviors in both limits of large and small τ. For instance, we show that in the limit τ → +∞ (i.e., $\tau \gg \theta$), the autocorrelation function decreases asymptotically as $\tau^{2(H-1)}$:

$$E\,X^{(\theta)}(t) X^{(\theta)}(t + \tau) \sim \frac{\sigma^2}{2} \, 2H(2H - 1) \, \theta^2 \, \tau^{2(H-1)}. \qquad (2.6)$$
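
The passage from (2.5) to (2.6) is a second-order expansion in θ/τ, and it can be checked numerically. The sketch below compares the exact increment correlation with its large-lag equivalent; the function names are hypothetical and the parameter values are arbitrary.

```python
def increment_cov(tau, theta, H, sigma2=1.0):
    """Exact correlation (2.5) of the increment process X^(theta) at lag tau."""
    return 0.5 * sigma2 * (
        abs(tau + theta) ** (2 * H)
        + abs(tau - theta) ** (2 * H)
        - 2.0 * abs(tau) ** (2 * H)
    )


def increment_cov_large_lag(tau, theta, H, sigma2=1.0):
    """Large-lag equivalent (2.6), meant for tau much larger than theta > 0."""
    return 0.5 * sigma2 * 2 * H * (2 * H - 1) * theta ** 2 * tau ** (2 * (H - 1))


if __name__ == "__main__":
    H, theta = 0.8, 1.0
    for tau in (10.0, 100.0, 1000.0):
        exact = increment_cov(tau, theta, H)
        approx = increment_cov_large_lag(tau, theta, H)
        # The ratio tends to 1 as tau / theta grows.
        print(tau, exact, approx, exact / approx)
```

For H = 1/2 both expressions vanish as soon as τ ≥ θ, which is the decorrelation of the increments of ordinary Brownian motion.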

By Fourier duality, this behavior induces an algebraic spectral divergence with exponent 1 − 2H at the origin. Self-similar processes with stationary increments are hence closely related to long-range dependent processes. In the other limit, τ → 0 (i.e., $\tau \ll \theta$), we show that for $H > \frac{1}{2}$:

$$E\,X^{(\theta)}(t) X^{(\theta)}(t + \tau) \sim \sigma^2 \theta^{2H} \left( 1 - \theta^{-2H} |\tau|^{2H} \right). \qquad (2.7)$$

This behavior characterizes the local regularity of each sample path of the process X. The following sections explain the notions associated with each of these limits: long-range dependence on the one hand and local regularity on the other hand.

2.2.3. Long-range dependence

DEFINITION 2.3.– A second order stationary process $X = \{X(t), t \in \mathbb{R}\}$ is said to be “long-range dependent” (or to have “long memory”) if its correlation function $c_X(\tau) := E\,X(t)X(t + \tau)$ is such that [BER 94, SAM 94]:

$$c_X(\tau) \sim c_r |\tau|^{-\beta}, \qquad \tau \to +\infty \qquad (2.8)$$


with 0 < β < 1. In the same way, the power spectrum density:

$$\Gamma_X(f) := \int_{-\infty}^{+\infty} c_X(\tau) \, e^{-i 2\pi f \tau} \, d\tau$$

of a long-range dependent process is such that:

$$\Gamma_X(f) \sim c_f |f|^{-\gamma}, \qquad f \to 0 \qquad (2.9)$$

with 0 < γ = 1 − β < 1 and $c_f = 2 (2\pi)^{-\gamma} \sin\big((1 - \gamma)\pi/2\big) \, \Gamma(\gamma) \, c_r$, where Γ denotes the usual Gamma function.

Under its form (2.8), long-range dependence is related to the fact that, for large lags, the (algebraic) decrease of the correlation function is so slow that the function is not summable: hence, there is a long memory effect, in the sense that significant statistical relations are maintained between very distant samples. Obviously, this situation is in contrast with that of Markovian processes with short memory, which are characterized by an asymptotically exponential decay of the correlations. By definition, the existence of an exponential decrease involves a characteristic time scale, whereas this is no longer the case for an algebraic decrease: hence, it is a matter of scaling law behavior. By Fourier duality, long-range dependence implies that $\Gamma_X(0) = \infty$, in accordance with the power law divergence expressed by (2.9).

Finally, although the definition of long-range dependence is independent of that of self-similarity, relation (2.6) demonstrates that a strong bond exists between these two notions, since it indicates that the increment process of a self-similar process with stationary increments presents, if $H > \frac{1}{2}$, long-range dependence.

2.2.4. Local regularity

The main issue of this section, rather than the long-term behavior of the autocorrelation function, is its short-term behavior. Let X be a second order stationary random process whose autocorrelation function is such that, near the origin:

$$E\,X(t)X(t + \tau) \sim \sigma^2 \left( 1 - C|\tau|^{2h} \right), \qquad \tau \to 0, \qquad 0 < h < 1. \qquad (2.10)$$

It is easy to prove that this covariance structure at the origin is equivalent to an algebraic behavior of the increments variance in the limit of short increments:

$$E|X(t + \tau) - X(t)|^2 \sim C|\tau|^{2h}, \qquad \tau \to 0, \qquad 0 < h < 1.$$

This relation provides information on the local regularity of each sample path of the process X. For Gaussian processes, for instance, it indicates that these sample paths are continuous of order h' < h. When 0 < h < 1, this means that the trajectories of X are everywhere continuous but nowhere differentiable.


To describe this local regularity more precisely, one can use the notion of Hölder exponent, according to the following definition.

DEFINITION 2.4.– A signal X(t) is of Hölder regularity h ≥ 0 in $t_0$ if there exists a local polynomial $P_{t_0}(t)$ of degree $n = \lfloor h \rfloor$ and a constant C such that:

$$|X(t) - P_{t_0}(t)| \le C|t - t_0|^h. \qquad (2.11)$$

In the case where 0 ≤ h < 1, the regular part of X(t) is reduced to $P_{t_0}(t) = X(t_0)$, thus leading to the characterization of the Hölder regularity of X(t) in $t_0$ by the relation:

$$|X(t_0 + \theta) - X(t_0)| \le C|\theta|^h. \qquad (2.12)$$

The Hölder exponent heuristically supplies a measure of the roughness of the sample path of X: the closer it is to 1, the smoother and more regular the path; the closer it is to 0, the rougher and the more variable the path. The asymptotic algebraic behavior of the increments variance thus highlights a Hölder regularity h' < h of the sample paths of the process X. This correspondence between the asymptotic algebraic behavior of increments and the local regularity remains valid even if the process X is no longer stationary, but only has stationary increments. The processes whose sample paths possess a uniform and constant local regularity are said to be monofractal.

As far as self-similar processes with stationary increments are concerned, it is easy to observe that, on the one hand, starting from (2.12), the increments present an algebraic behavior for all θ in general, hence in particular in the limit θ → 0:

$$E|X(t + \theta) - X(t)|^2 = E|X(\theta) - \underbrace{X(0)}_{=0}|^2 = \sigma^2 |\theta|^{2H} \qquad (2.13)$$

whereas, on the other hand, relation (2.13) indicates that the increment process presents an autocovariance as in (2.10). Self-similar processes with stationary increments, as well as their increment processes, thus present uniform local regularities (i.e., on average and everywhere) h < H.

2.2.5. Fractional Brownian motion: paradigm of scale invariance

The simplest and most commonly used model of a self-similar process is that of fractional Brownian motion (FBM) [MAN 68], which is characterized by its real exponent 0 < H < 1, called the Hurst exponent.

DEFINITION 2.5.– FBM $B_H = \{B_H(t), t \in \mathbb{R}; B_H(0) = 0\}$ is the only zero-mean Gaussian process which is self-similar and possesses stationary increments.


The self-similarity and the stationary nature of the increments guarantee that the covariance function of FBM is of the form (2.4). As regards the Gaussian character, it implies that the probability law of FBM is entirely determined by this covariance structure. FBM can be considered as a generalization of ordinary Brownian motion. In the case of ordinary Brownian motion, we know that the increments possess the particularity of being decorrelated (and therefore independent because of Gaussianity). The generalization offered by FBM consists of introducing a possibility of correlation between the increments. In fact, we show that:

$$E\big(B_H(t + \theta) - B_H(t)\big)\big(B_H(t) - B_H(t - \theta)\big) = \sigma^2 \left( 2^{2H-1} - 1 \right) |\theta|^{2H}$$

which confirms the decorrelation between the increments when $H = \frac{1}{2}$ (i.e. ordinary Brownian motion) but induces a positive correlation (persistence) or a negative correlation (antipersistence) depending on whether $H > \frac{1}{2}$ or $H < \frac{1}{2}$.

DEFINITION 2.6.– We call fractional Gaussian noise (FGN) the increments process $G_{H;\theta} := \{G_{H;\theta}(t), t \in \mathbb{R}\}$ defined by:

$$G_{H;\theta}(t) := \frac{1}{\theta} B_H^{(\theta)}(t), \qquad \theta > 0 \qquad (2.14)$$

where $B_H$ is FBM. It is, by construction, a stationary process, everywhere continuous but nowhere differentiable, that can be considered as an extension of white Gaussian noise. Hence, we must be very careful when we decide to take the limit of definition (2.14) when θ → 0. Nevertheless, if we are interested in the behavior of FGN with “small” increments, we observe according to (2.6) and (2.14) that:

$$c_{G_{H;\theta}}(\tau) := E\,G_{H;\theta}(t) G_{H;\theta}(t + \tau) \sim \frac{\sigma^2}{2} \, 2H(2H - 1) \, \tau^{2(H-1)}, \qquad \tau \gg \theta.$$

On the one hand, this behavior highlights that FGN presents some long memory and, on the other hand, that the power spectrum density of FGN is proportional to $|f|^{-(2H-1)}$. It is therefore possible to prove on the basis of several arguments (integration/differentiation type) [FLA 92] that the FBM itself possesses an “average spectrum” of the form $\Gamma_{B_H}(f) \propto |f|^{-(2H+1)}$. Along with its role of spectral exponent, the parameter H also controls the Hölder regularity of the sample paths of FBM and FGN, which is h < H at any point. To this regularity (or irregularity), a notion of fractality is naturally associated for Gaussian processes, since the Hausdorff dimension of the sample paths is equal to $\dim_H \operatorname{graph}(B_H) = 2 - H$ (for a precise definition of the Hausdorff dimension, see Chapter 1).
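
Since FBM is Gaussian and entirely specified by its covariance, a short path can be synthesized from the covariance of its increments. The following is a minimal sketch, not a recommended simulation method: it assumes a unit time step (θ = 1), uses a textbook Cholesky factorization written in pure Python whose cost grows as the cube of the path length, and all function names are hypothetical.

```python
import random
from math import sqrt


def fgn_autocov(k, H, sigma2=1.0):
    """Autocovariance at integer lag k of unit-step FBM increments, from (2.5) with theta = 1."""
    k = abs(k)
    return 0.5 * sigma2 * ((k + 1) ** (2 * H) + abs(k - 1) ** (2 * H) - 2.0 * k ** (2 * H))


def cholesky(a):
    """Lower-triangular factor L with L * L^T = a, for a symmetric positive definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_fbm(n, H, seed=0):
    """Return (increments, path): n correlated Gaussian increments and their cumulative sum."""
    rng = random.Random(seed)
    cov = [[fgn_autocov(i - j, H) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [sum(low[i][j] * white[j] for j in range(i + 1)) for i in range(n)]
    path = [0.0]  # convention B_H(0) = 0
    for g in noise:
        path.append(path[-1] + g)
    return noise, path


if __name__ == "__main__":
    H = 0.7
    # Lag-1 correlation of the increments: 2**(2H - 1) - 1, positive for H > 1/2.
    print(fgn_autocov(1, H), 2 ** (2 * H - 1) - 1)
    noise, path = sample_fbm(64, H, seed=1)
    print(path[:5])
```

The lag-1 value reproduces the persistence/antipersistence dichotomy discussed above: it is positive for H > 1/2, zero for H = 1/2 and negative for H < 1/2.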


As a result, FBM has the advantage (or the disadvantage) of being globally self-similar over the entire frequency axis, the single parameter $H$ controlling, as required, one or another of the three regimes cited before: self-similarity, long memory and local regularity. In terms of modeling, FBM is a particularly interesting starting point (as white Gaussian noise can be in stationary contexts). This simplicity (FBM is the only Gaussian process that is self-similar with stationary increments, and it is entirely determined by the single parameter $H$) is of course not without drawbacks when it comes to applications, i.e. as soon as real data have to be considered. Numerous variations on this theme can be considered; they are merely mentioned here and are studied in detail in the other chapters of this volume. In all cases, the idea is to replace the single exponent $H$ by a collection of exponents.

2.2.6. Beyond the paradigm of scale invariance

To begin with, we can consider modifying relation (2.13) by allowing the exponent to depend on time:

$$\mathbb{E}\,|X(t+\theta) - X(t)|^{2} \sim C(t)\,|\theta|^{2h(t)}, \qquad \theta \to 0.$$

When $0 < h(t) < 1$ is a sufficiently regular deterministic function, the process $X$ is said to be multifractional or, when it is Gaussian, locally self-similar: locally around $t$, $X(t)$ resembles an FBM of parameter $H = h(t)$ (for more details, see Chapter 6). The local regularity is no longer a uniform or global quantity along the sample path; on the contrary, it varies in time according to $h(t)$, which makes it possible to model time variations of the roughness. When $h(t)$ is itself a strongly irregular function, possibly a random process in the sense that, for fixed $t$, $h(t)$ depends on the observed realization of $X$, the process $X$ is said to be multifractal. The fluctuations of regularity are then no longer described by $h(t)$ itself, but by a multifractal spectrum $D(h)$ which gives the Hausdorff dimension of the set of points $t$ where $h(t) = h$ (see Chapter 1 and Chapter 3). One of the major consequences of multifractality is that the quantities usually called partition functions behave as power laws in the small-scale limit:

$$\frac{1}{n}\sum_{k=1}^{n} \bigl|X^{(\tau)}(t+k\tau)\bigr|^{q} \simeq c_q\,|\tau|^{\zeta(q)}, \qquad |\tau| \to 0. \tag{2.15}$$

For processes with stationary increments, the time averages $(1/n)\sum_{k=1}^{n} |X^{(\tau)}(t+k\tau)|^{q}$ can be regarded as estimates of the ensemble averages $\mathbb{E}\,|X^{(\tau)}(t)|^{q}$. Relation (2.15) thus recalls equation (2.2), which is a consequence of self-similarity. However, a fundamental difference exists: the exponents $\zeta(q)$ do not possess a priori any


reason to be linear in $q$, i.e. of the form $qH$. In other words, the description of scaling laws in data cannot be carried out with a single exponent but requires a whole collection of them. Measuring the exponents $\zeta(q)$ offers a way, through a Legendre transform, of estimating the multifractal spectrum. However, a detailed discussion of multifractal processes is beyond the scope of this chapter; see Chapter 1 and Chapter 3. Multifractal processes provide a rich and natural extension of the self-similar model insofar as a single exponent is replaced by a set; nevertheless, they remain fundamentally tied to the existence of power law behaviors. In the analysis of experimental data, such behaviors might not be observed. To account for these situations, the infinitely divisible cascade model exploits an additional degree of freedom: the constraint of a strict power law behavior for the moments is relaxed and replaced by a simple behavior that is separable in the variables $q$ (order of the moment) and $\tau$ (analysis scale). The equations below summarize the three cases:

$$\text{self-similar:} \qquad \mathbb{E}\,|X^{(\tau)}(t)|^{q} = c_q\,|\tau|^{qH} = c_q \exp\bigl(qH \log\tau\bigr); \tag{2.16}$$

$$\text{multifractal:} \qquad \mathbb{E}\,|X^{(\tau)}(t)|^{q} = c_q\,|\tau|^{\zeta(q)} = c_q \exp\bigl(\zeta(q) \log\tau\bigr); \tag{2.17}$$

$$\text{inf. divisible cascade:} \qquad \mathbb{E}\,|X^{(\tau)}(t)|^{q} = c_q \exp\bigl(H(q)\,n(\tau)\bigr). \tag{2.18}$$
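The self-similar case (2.16) can be checked numerically. The sketch below estimates the exponents $\zeta(q)$ as the slope of $\log_2$ of a time-averaged $q$-th order increment against $\log_2 \tau$. It rests on assumptions that are not in the text: an ordinary Gaussian random walk is used as a discrete stand-in for FBM with $H = 1/2$, overlapping increments are averaged rather than the non-overlapping ones of (2.15), and all function names are illustrative.

```python
import math
import random


def random_walk(n, seed=0):
    """Ordinary Gaussian random walk, a discrete stand-in for FBM with H = 1/2."""
    rng = random.Random(seed)
    x, path = 0.0, [0.0]
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def partition_function(path, tau, q):
    """Time average of |X(t + tau) - X(t)|**q over all available t."""
    incs = [abs(path[i + tau] - path[i]) ** q for i in range(len(path) - tau)]
    return sum(incs) / len(incs)


def scaling_exponent(path, q, taus):
    """Least-squares slope of log2 S_q(tau) against log2 tau, an estimate of zeta(q)."""
    xs = [math.log2(t) for t in taus]
    ys = [math.log2(partition_function(path, t, q)) for t in taus]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    path = random_walk(2 ** 15)
    taus = [1, 2, 4, 8, 16, 32]
    for q in (1, 2, 3):
        # For a self-similar process the estimates should lie close to q * H = q / 2.
        print(q, round(scaling_exponent(path, q, taus), 3))
```

A departure of the estimated $\zeta(q)$ from a straight line in $q$ would point towards the multifractal case (2.17) rather than (2.16).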

In this scenario, the function $n(\tau)$ is no longer fixed a priori to be $\log\tau$, just as the function $H(q)$ is no longer a priori linear in $q$ (i.e. of the form $qH$). The concept of an infinitely divisible cascade was initially introduced by Castaing in the context of turbulence [CAS 90, CAS 96]. The complete definition of this notion is beyond the scope of this chapter and can be found in [VEI 00]. It is nonetheless important to indicate that a quantity called the propagator of the cascade plays a central role here: it links the probability densities of the process increments at two different scales $\tau$ and $\tau'$. Infinite divisibility formally expresses the absence of any preferred time scale and requires this propagator to consist of an elementary function $G_0$ convolved with itself a number of times that depends only on the scales $\tau$ and $\tau'$, hence the following functional form:

$$G_{\tau,\tau'}(\log\alpha) = \bigl[G_0(\log\alpha)\bigr]^{*(n(\tau)-n(\tau'))}.$$
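The idea of a propagator built by repeated self-convolution of an elementary step can be sketched on a discrete grid. The example below is only an illustration under strong assumptions: $G_0$ is replaced by a hypothetical probability mass function on $\{0, 1, 2\}$ and the number of steps is taken to be an integer, whereas $n(\tau) - n(\tau')$ need not be one in general.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions given as lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def convolution_power(g0, n_steps):
    """n_steps-fold self-convolution of g0 (n_steps = 0 gives the unit mass at 0)."""
    out = [1.0]
    for _ in range(n_steps):
        out = convolve(out, g0)
    return out


def mean(p):
    """Mean of a probability mass function supported on 0, 1, 2, ..."""
    return sum(i * pi for i, pi in enumerate(p))


if __name__ == "__main__":
    g0 = [0.2, 0.5, 0.3]  # hypothetical elementary step
    for n_steps in (1, 2, 4):
        g = convolution_power(g0, n_steps)
        # Total mass stays 1 and the mean grows linearly with the number of steps.
        print(n_steps, round(sum(g), 6), round(mean(g), 6))
```

The linear growth of the mean with the number of steps mirrors the way the exponent of the moments in (2.18) is proportional to $n(\tau)$.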

A possible interpretation of this relation is to read the function $G_0$ as the elementary step, i.e. the building block of the cascade, whereas the quantity $n(\tau) - n(\tau')$ measures how many times this elementary step must be carried out to evolve between the scales $\tau$ and $\tau'$. The derivative of $n$ with respect to $\log\tau$ thus describes, in a sense, the size of the cascade. The term infinitely divisible cascade is ascribed to situations where the function $n$ possesses the specific form $n(\tau) = \log\tau$; otherwise, we only refer to a scaling law behavior. The infinitely divisible scale invariant cascades correspond to multiscaling or multifractality when the scaling law


exists in the small scale limit. The exponents $\zeta(q)$ associated with the multifractal spectrum are then connected to the propagator of the cascade by $\zeta(q) = H(q)$. When the functions $H$ and $n$ simultaneously take the forms $H(q) = qH$ and $n(\tau) = \log\tau$, the infinitely divisible cascade reduces to self-similarity, which thus appears as a particular case. The propagator then becomes a Dirac function, $G_{\tau,\tau'}(\log\alpha) = \delta\bigl(\log\alpha - H\log(\tau/\tau')\bigr)$. The fundamental characteristic of infinitely divisible cascades, namely the separation of the variables $q$ and $\tau$, induces the following relations, which are essential for the analysis [VEI 00]:

$$\log \mathbb{E}\,|X^{(\tau)}|^{q} = H(q)\,n(\tau) + K_q; \tag{2.19}$$

$$\log \mathbb{E}\,|X^{(\tau)}|^{q} = \frac{H(q)}{H(p)}\,\log \mathbb{E}\,|X^{(\tau)}|^{p} + \kappa_{q,p}. \tag{2.20}$$
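Relation (2.20) follows from (2.19) by eliminating $n(\tau)$ between two orders $q$ and $p$. The short derivation below assumes $H(p) \neq 0$; the explicit expression obtained for the constant $\kappa_{q,p}$ is a by-product of this elimination and is not spelled out in the text.

```latex
\begin{align*}
n(\tau) &= \frac{\log \mathbb{E}\,|X^{(\tau)}|^{p} - K_p}{H(p)}
  && \text{from (2.19) written at order } p,\\[4pt]
\log \mathbb{E}\,|X^{(\tau)}|^{q}
  &= H(q)\,n(\tau) + K_q
   = \frac{H(q)}{H(p)}\,\log \mathbb{E}\,|X^{(\tau)}|^{p}
     + \underbrace{K_q - \frac{H(q)}{H(p)}\,K_p}_{\kappa_{q,p}} .
\end{align*}
```

The function $n(\tau)$ has disappeared from the final line, which is why a plot of one log-moment against another is expected to be linear whatever the form of $n$.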

These equations indicate that the moments behave as power laws with respect to each other, a property that is exploited in the analysis. Further definitions, interpretations and applications of infinitely divisible cascades can be found in [VEI 00].

2.3. Wavelet transform

2.3.1. Continuous wavelet transform

The continuous wavelet decomposition of a signal $X(t) \in L^2(\mathbb{R}; dt)$ is a linear transformation from $L^2(\mathbb{R}; dt)$ to $L^2(\mathbb{R}\times\mathbb{R}^{+*}; dt\,da/a^2)$, defined by [DAU 92, MAL 98]:

$$T_X(a,t) := \frac{1}{\sqrt{a}} \int X(u)\,\psi^{*}\!\left(\frac{u-t}{a}\right) du. \tag{2.21}$$

This is the inner product between the analyzed signal $X$ and a set of analyzing waveforms obtained from a prototype wavelet (or mother wavelet) $\psi$ by dilations with a scale factor $a \in \mathbb{R}^{+*}$ and shifts in time $t \in \mathbb{R}$. For the wavelet transform to be a joint time-frequency representation of the information contained in $X$, in other words for the coefficients $T_X(a,t)$ to account for $X$ around a given instant and in a given frequency range, the mother wavelet must be well localized in both time and frequency. For the wavelet transform to be invertible, the mother wavelet must also satisfy a closure relation:

$$\iint \psi(u)\,\psi\!\left(u - \frac{t-t'}{a}\right) du\,\frac{da}{a^{2}} = \delta(t-t'),$$

which induces a condition called admissibility:

$$c_\psi = \int_{-\infty}^{\infty} |\Psi(f)|^{2}\,\frac{df}{|f|} < \infty.$$
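Definition (2.21) can be evaluated numerically at a single point $(a, t)$ of the time-scale plane. The following sketch uses a plain Riemann sum and, as an assumed example of mother wavelet, the real-valued "Mexican hat" $\psi(u) = (1-u^2)e^{-u^2/2}$; the function names, the integration window and the step are illustrative choices, not prescriptions from the text.

```python
import math


def mexican_hat(u):
    """Second derivative of a Gaussian (up to sign); an assumed example wavelet."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt_point(signal, a, t, psi=mexican_hat, half_width=8.0, step=0.01):
    """Riemann-sum approximation of T_X(a, t) = a**-0.5 * integral X(u) psi((u - t)/a) du.

    With the substitution u = t + a*s this is sqrt(a) * integral X(t + a*s) psi(s) ds;
    psi is real here, so the complex conjugate is omitted.
    """
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        s = -half_width + i * step
        total += signal(t + a * s) * psi(s)
    return math.sqrt(a) * total * step


if __name__ == "__main__":
    bump = lambda u: math.exp(-0.5 * (u - 3.0) ** 2)  # test signal centred at u = 3
    for t in (0.0, 3.0, 6.0):
        print(t, round(cwt_point(bump, 1.0, t), 6))
    # A zero-mean wavelet gives a coefficient close to 0 on a constant signal.
    print(cwt_point(lambda u: 5.0, 2.0, 0.0))
```

The coefficient is largest in magnitude where the dilated and shifted wavelet overlaps the bump, which is the localization property required of $\psi$.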


Given the admissibility condition, it is possible to reconstruct the signal $X$ by inverting the wavelet transform according to:

$$X(u) = c_\psi^{-1} \iint T_X(a,t)\,\psi\!\left(\frac{u-t}{a}\right) dt\,\frac{da}{a^{2}}.$$

From the admissibility constraint, it also follows that $\psi$ must satisfy:

$$\int \psi(t)\,dt = 0.$$

Such a waveform $\psi$ is therefore an oscillating function, localized on a short temporal support, hence the name wavelet. This oscillating behavior implies that the wavelet transform does not detect the DC component (average value) of the analyzed signal $X$. For certain mother wavelets, this property extends to higher orders:

$$\int t^{k}\,\psi(t)\,dt = 0, \qquad \forall\, 0 \le k < n_\psi,$$

which means that the analyzing wavelet is orthogonal to the polynomial components of $X$ of degree strictly lower than its number of vanishing moments $n_\psi$. In other words, the wavelet coefficients obtained from a mother wavelet with $n_\psi$ vanishing moments are insensitive to those behaviors of the signal that are more regular, i.e. smoother, than a polynomial of degree strictly lower than $n_\psi$; on the other hand, they account for the information relative to behaviors that are more irregular than such polynomial trends.

2.3.2. Discrete wavelet transform

One of the fundamental characteristics of the continuous wavelet transform is its redundancy: the information contained in a signal, i.e. in a one-dimensional space, is represented through the wavelet transform in a two-dimensional space, the time-scale plane $(t,a) \in \mathbb{R}\times\mathbb{R}^{+*}$; neighboring coefficients thus share part of the same information. To reduce this redundancy, we define the discrete wavelet transform by the set of coefficients:

$$d_X(j,k) := 2^{j/2}\int X(u)\,\psi(2^{j}u - k)\,du \tag{2.22}$$

defined using a critical discrete¹ sampling of the time-scale plane, usually called the dyadic grid:

$$t \to 2^{-j}k, \qquad a \to 2^{-j}, \qquad (k,j) \in \mathbb{Z}\times\mathbb{Z},$$

1. In [DAU 92] a detailed study can be found of the frames or oblique bases which correspond to the sub-critical sampling of the time-scale plane.


thus ending up with the correspondence: $d_X(j,k) = T_X(t = 2^{-j}k,\ a = 2^{-j})$. In this case, the collection of dilated and shifted versions of the mother wavelet $\{\psi_{j,k}(t),\ j\in\mathbb{Z},\ k\in\mathbb{Z}\}$ may constitute a basis of $L^2(\mathbb{R})$. For simplicity, we restrict ourselves here to orthonormal wavelet bases. However, a discrete wavelet transform does not necessarily, or a priori, imply the existence of an orthonormal basis. The rigorous definition of the discrete wavelet transform leading to (orthonormal) bases relies on multiresolution analysis. A multiresolution analysis consists of a collection of nested subspaces of $L^2(\mathbb{R})$:

$$\ldots \subset V_{j-1} \subset V_{j} \subset V_{j+1} \subset \ldots$$

Each $V_j$, $j\in\mathbb{Z}$, possesses its own orthonormal basis $\{2^{j/2}\phi(2^{j}\cdot - k),\ k\in\mathbb{Z}\}$ constructed, as for the wavelets, from a prototype scaling function² (or father wavelet) $\phi$ to which dyadic dilations and integer shifts are applied. The nested structure requires the function $\phi$ to satisfy a two-scale relation:

$$\phi(t/2) = \sqrt{2}\,\sum_{n} u_n\,\phi(t-n).$$

The projection of a signal $X\in L^2(\mathbb{R})$ onto this basis supplies the approximation coefficients at scale $j$:

$$a_X(j,k) := 2^{j/2}\int X(u)\,\phi(2^{j}u - k)\,du.$$

To complete these approximations, it is necessary to project the signal $X$ onto the complement of $V_j$ in $V_{j+1}$; we therefore define $W_j$ by:

$$V_j \oplus W_j = V_{j+1}, \qquad \bigcap_{j\in\mathbb{Z}} W_j = \{0\}, \qquad V_j := \bigoplus_{j'=-\infty}^{j-1} W_{j'}.$$

For each fixed scale $j$, the wavelet family $\{2^{j/2}\psi(2^{j}\cdot - k),\ k\in\mathbb{Z}\}$ thus forms an (orthonormal) basis of the corresponding subspace:

$$W_j := \Bigl\{ X : X(t) = \sum_{k=-\infty}^{+\infty} d_X(j,k)\,2^{j/2}\,\psi(2^{j}t - k) \Bigr\}.$$

2. To deﬁne a multiresolution analysis, this function φ has to satisfy a certain number of constraints which are not detailed here [DAU 92].


There again, the nested structure imposes a two-scale relation on the wavelet:

$$\psi(t/2) = \sqrt{2}\,\sum_{n} v_n\,\phi(t-n).$$

The wavelet coefficients, or detail coefficients, therefore correspond to the projections of $X$ onto the subspaces $W_j$. The signal $X$ can thus be represented as a sum of approximations and details:

$$X(t) = \sum_{k} a_X(j,k)\,2^{j/2}\,\phi(2^{j}t - k) + \sum_{j'=j}^{+\infty}\,\sum_{k} d_X(j',k)\,2^{j'/2}\,\psi(2^{j'}t - k). \tag{2.23}$$
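The two-scale relations for $\phi$ and $\psi$ can be checked on the simplest example, the Haar system, which is assumed here purely for illustration: $\phi$ is the indicator of $[0,1)$, $\psi$ equals $+1$ on $[0,1/2)$ and $-1$ on $[1/2,1)$, and the generating sequences are $u = (1/\sqrt{2}, 1/\sqrt{2})$ and $v = (1/\sqrt{2}, -1/\sqrt{2})$.

```python
import math


def haar_phi(t):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def haar_psi(t):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


# Generating sequences for the Haar case, indexed by n = 0, 1.
U = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
V = [1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]


def two_scale(seq, t):
    """Right-hand side sqrt(2) * sum_n seq[n] * phi(t - n) of the two-scale relations."""
    return math.sqrt(2.0) * sum(c * haar_phi(t - n) for n, c in enumerate(seq))


if __name__ == "__main__":
    grid = [i / 8.0 for i in range(-8, 25)]
    # Both checks print True: phi(t/2) and psi(t/2) are reproduced on the grid.
    print(all(abs(haar_phi(t / 2) - two_scale(U, t)) < 1e-12 for t in grid))
    print(all(abs(haar_psi(t / 2) - two_scale(V, t)) < 1e-12 for t in grid))
```

In this example the sequence $u$ sums to $\sqrt{2}$ (a low-pass behavior) and $v$ sums to zero (a high-pass behavior).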

[Figure 2.2: block diagram in the time-scale plane. Labels: signal, details, approximation, high-pass filter + decimation, low-pass filter + decimation.]

Figure 2.2. Fast pyramidal algorithm with filter structure for discrete wavelet decompositions. An approximation $a_X(0,k)$ (at scale 0) of the continuous-time signal $X$ is initially calculated (only this stage involves a continuous-time evaluation from $X$). The “signal” represented in the figure is the sequence $a_X(0,k)$. In multiresolution analysis, this approximation $a_X(0,k)$ is decomposed into a series of details $d_X(-1,k)$ and a new, coarser approximation $a_X(-1,k)$. This procedure is then iterated from the sequence $a_X(-1,k)$. The impulse responses of the discrete-time filters depend on the generating sequences $u$ and $v$ which define the scaling function and the wavelet. In the case of orthonormal bases, they are exactly equal to them.


Finally, thanks to the nested structure specific to multiresolution analysis, there exist very fast algorithms with a pyramidal structure, which enable efficient calculation of the discrete decomposition coefficients. From the sequences $u$ and $v$, the generators of the multiresolution analysis, we can show that the approximation and detail coefficients at octave $j$ can be calculated from the approximation coefficients at octave $j+1$:

$$\begin{aligned}
a_X(j,k) &= \int X(t)\,2^{j/2}\,\phi(2^{j}t - k)\,dt \\
&= \int X(t)\,2^{j/2}\,\sqrt{2}\,\sum_{n} u_n\,\phi\bigl(2(2^{j}t - k) - n\bigr)\,dt \\
&= \sum_{n} u_n \int X(t)\,2^{(j+1)/2}\,\phi(2^{j+1}t - 2k - n)\,dt \\
&= \sum_{n} u_n\,a_{j+1,2k+n} \\
&= \sum_{n} u^{\vee}_{n}\,a_{j+1,2k-n} \\
&= \bigl(u^{\vee}_{\cdot} * a_{j+1,\cdot}\bigr)(2k)
\end{aligned}$$

and, in an identical manner:

$$d_X(j,k) = \bigl(v^{\vee}_{\cdot} * a_X(j+1,\cdot)\bigr)(2k),$$

where $*$ denotes the discrete-time convolution operator, i.e. $(x_{\cdot} * y_{\cdot})(k) = \sum_{n} x(n)\,y(k-n)$, with $u^{\vee}_{n} = u_{-n}$ and $v^{\vee}_{n} = v_{-n}$. The two previous relations can be rewritten using the decimation operator $\downarrow_2$ ($y = \downarrow_2 x$ means $y_k = x_{2k}$, i.e. every other sample of $x$ is discarded):

$$a_X(j,k) = \bigl[\downarrow_2 (u^{\vee}_{\cdot} * a_{j+1,\cdot})\bigr](k); \qquad d_X(j,k) = \bigl[\downarrow_2 (v^{\vee}_{\cdot} * a_{j+1,\cdot})\bigr](k).$$

Thanks to this recursive structure, the computational cost of the discrete wavelet decomposition of a signal uniformly sampled on $N$ points is $O(N)$.

2.4. Wavelet analysis of scale invariant processes

The aim of this section is to study how the fundamental principles (scale change operator) and essential properties (multiresolution structure, number of vanishing moments, localization) of wavelet decomposition can be exploited to characterize and easily measure the scale invariance phenomena described previously.
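The recursion $a_X(j,k) = \sum_n u_n\,a_{j+1,2k+n}$ and its counterpart with $v$ translate directly into a filter-and-decimate loop. The sketch below is a minimal illustration under assumptions that are not in the text: the generating sequence $u$ is taken to be the four-tap Daubechies low-pass filter (two vanishing moments), $v$ is derived from it by the usual quadrature-mirror rule $v_n = (-1)^n u_{3-n}$, and the finite sequence is extended periodically at its boundaries.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Assumed example: Daubechies 4-tap low-pass sequence u (two vanishing moments).
U = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Quadrature-mirror high-pass sequence v_n = (-1)**n * u_{3-n}.
V = [(-1) ** n * U[3 - n] for n in range(4)]


def analysis_step(a_fine, u=U, v=V):
    """One pyramid step: coarse[k] = sum_n u[n] * fine[2k + n] (periodic), same with v."""
    n_fine = len(a_fine)
    half = n_fine // 2
    a_coarse = [sum(u[n] * a_fine[(2 * k + n) % n_fine] for n in range(len(u)))
                for k in range(half)]
    d_coarse = [sum(v[n] * a_fine[(2 * k + n) % n_fine] for n in range(len(v)))
                for k in range(half)]
    return a_coarse, d_coarse


def pyramid(a0, levels):
    """Iterate the step on the approximations; returns (coarsest approximation, details)."""
    details = []
    a = list(a0)
    for _ in range(levels):
        a, d = analysis_step(a)
        details.append(d)
    return a, details


if __name__ == "__main__":
    ramp = [float(i) for i in range(16)]
    _, d = analysis_step(ramp)
    # Away from the periodic wrap-around, the details of a linear trend vanish.
    print([round(x, 12) for x in d[:7]])
```

Each step halves the length of the sequence it processes, so the total number of operations is bounded by a geometric series, consistent with the $O(N)$ cost stated above.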


Let us note that all the results mentioned below can be formulated in the same way for continuous wavelet decompositions. However, for the sake of simplicity and conciseness, we only consider the discrete random fields of orthogonal wavelet coefficients arising from the decomposition of scale invariant processes.

2.4.1. Self-similarity

PROPOSITION 2.1.– The wavelet coefficients resulting from the decomposition of a self-similar process of index $H$ satisfy the equality:

$$\bigl(d_X(j,0),\ldots,d_X(j,N_j-1)\bigr) \stackrel{\mathcal{L}}{=} 2^{-j(H+\frac{1}{2})}\,\bigl(d_X(0,0),\ldots,d_X(0,N_j-1)\bigr).$$

This result, initially demonstrated for FBM [FLA 92] and then generalized to all self-similar processes [AVE 98], relies on the scale invariance principle built into the dilation/compression operator which defines wavelet analysis. To outline the proof, it suffices to write down the main argument: when $X(2^{j}u) \stackrel{\mathcal{L}}{=} 2^{jH}X(u)$:

$$\begin{aligned}
d_X(j,k) &= \int X(u)\,\psi(2^{j}u - k)\,2^{j/2}\,du = 2^{-j/2}\int X(2^{-j}u)\,\psi(u-k)\,du \\
&\stackrel{\mathcal{L}}{=} 2^{-j(H+\frac{1}{2})}\int X(u)\,\psi(u-k)\,du = 2^{-j(H+\frac{1}{2})}\,d_X(0,k).
\end{aligned}$$

The principal consequence of self-similarity is that, when they exist, the $q$-th order moments of the wavelet coefficients satisfy the equality:

$$\mathbb{E}\,|d_X(j,k)|^{q} = 2^{-jq(H+\frac{1}{2})}\,\mathbb{E}\,|d_X(0,k)|^{q}.$$

PROPOSITION 2.2.– The wavelet coefficients resulting from the decomposition of a process with stationary increments are stationary at each scale $2^{j}$.

To understand the origin of this result, let us note that the sampled increment process $X^{(\theta=2^{-j})}[2^{-j}k] := X\bigl((k+1)2^{-j}\bigr) - X\bigl(k\,2^{-j}\bigr)$ can be identified with a wavelet decomposition (2.22) according to:

$$X^{(2^{-j})}[2^{-j}k] = 2^{j}\int X(u)\,\bigl[\delta(2^{j}u - k - 1) - \delta(2^{j}u - k)\bigr]\,du = 2^{j/2}\,d_X(j,k),$$


with $\psi(t) = \delta(t-1) - \delta(t)$ as the analyzing wavelet (an elementary wavelet sometimes referred to as the poor man’s wavelet). In fact, it is the oscillating structure of the wavelets ($\int \psi(t)\,dt = 0$), which follows naturally from admissibility, that guarantees this stationarity in the case of processes with stationary increments. Heuristically, and emphasizing the main argument (the number of vanishing moments is at least 1), the proof reads (on the coefficients of the discrete decomposition and with $j = 0$ to simplify the notation):

$$\begin{aligned}
d_X(0,k+k_0) &= \int X(u)\,\psi(u-k-k_0)\,du = \int X(u+k_0)\,\psi(u-k)\,du \\
&= \int \bigl[X(u+k_0) - X(k_0)\bigr]\,\psi(u-k)\,du \\
&\stackrel{\mathcal{L}}{=} \int \bigl[X(u) - X(0)\bigr]\,\psi(u-k)\,du = \int X(u)\,\psi(u-k)\,du = d_X(0,k).
\end{aligned}$$

This proof highlights the role played in this stationarization by the fact that $\psi$ has zero mean (i.e. that its number of vanishing moments is at least 1). This result was obtained for FBM, directly from the form of the covariance, in [FLA 92, TEW 92]; it was extended to stable cases independently by different authors [DEL 00, PES 99] and proved in a general context in [CAM 95, MAS 93]. For processes with stationary increments of order $p$, the mere admissibility condition of the wavelet is no longer sufficient. It is then necessary to choose an analyzing wavelet $\psi$ possessing $n_\psi \ge p$ vanishing moments so that the coefficient series $d_X(j,k)$ are stationary at each scale. The complete proof of this result is given in [AVE 98]. However, the issue can be clarified by noting that the wavelet plays a role similar to that of a differentiation operator, insofar as the number of vanishing moments controls, by time-frequency duality, the behavior of the spectrum magnitude $|\Psi(f)|$ in the vicinity of the zero frequency. Indeed, for a wavelet $\psi$ with $n_\psi$ vanishing moments, we have $|\Psi(f)| \sim |f|^{n_\psi}$, $f \to 0$, which to a first approximation can be identified with a differentiation operator of order $n_\psi$.
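The behavior $|\Psi(f)| \sim |f|^{n_\psi}$ near the zero frequency can be observed numerically. The sketch below approximates the Fourier transform by a midpoint rule and measures the log-log slope between two small frequencies for two assumed example wavelets: the Haar wavelet, which has one vanishing moment, and the Mexican hat $(1-t^2)e^{-t^2/2}$, which has two. The function names and the numerical parameters are illustrative.

```python
import math


def haar_psi(t):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1); one vanishing moment."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def mexican_hat(t):
    """Mexican hat wavelet; two vanishing moments."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def spectrum_magnitude(psi, f, lo, hi, n=4000):
    """Midpoint-rule approximation of |integral psi(t) exp(-i 2 pi f t) dt| over [lo, hi]."""
    h = (hi - lo) / n
    re = im = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * h
        w = psi(t) * h
        re += w * math.cos(2 * math.pi * f * t)
        im -= w * math.sin(2 * math.pi * f * t)
    return math.hypot(re, im)


def loglog_slope(psi, lo, hi, f1=1e-3, f2=1e-2):
    """Slope of log|Psi(f)| against log f between two small frequencies."""
    m1 = spectrum_magnitude(psi, f1, lo, hi)
    m2 = spectrum_magnitude(psi, f2, lo, hi)
    return math.log(m2 / m1) / math.log(f2 / f1)


if __name__ == "__main__":
    print("Haar slope:", round(loglog_slope(haar_psi, 0.0, 1.0), 3))
    print("Mexican hat slope:", round(loglog_slope(mexican_hat, -10.0, 10.0), 3))
```

The two slopes should come out close to 1 and 2 respectively, in line with the interpretation of the wavelet as a differentiation operator of order $n_\psi$.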
PROPOSITION 2.3.– The wavelet coefficients resulting from the decomposition of a process $X$ which is zero-mean, self-similar of index $H$, of finite variance and with stationary increments ($H$-ASAS) possess, when they exist, moments of order $q$ satisfying the following scaling law:

$$\mathbb{E}\,|d_X(j,k)|^{q} = \mathbb{E}\,|d_X(0,0)|^{q}\;2^{-jq(H+\frac{1}{2})}.$$


This last result stems directly from combining the two previous propositions. For processes with finite variance (i.e. whose moment of order 2 exists), such as Gaussian processes and FBM in particular, this relation takes on the following specific form:

$$\mathbb{E}\,|d_X(j,k)|^{2} = \mathbb{E}\,|d_X(0,0)|^{2}\;2^{-j(2H+1)}. \tag{2.24}$$
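The scaling law (2.24) can be made concrete on a case where everything is computable exactly. The sketch below assumes a unit-variance random walk as a discrete stand-in for FBM with $H = 1/2$ and the Haar wavelet, and computes the exact variance of a detail coefficient after a given number of pyramid steps. The octave used in the code counts coarsening steps from the sampled data, so it plays the role of $-j$ in (2.24): the variance is then expected to grow by a factor approaching $2^{2H+1} = 4$ per octave. Function names are illustrative.

```python
def haar_detail_filter(octave):
    """Equivalent filter mapping 2**octave input samples to one Haar detail coefficient.

    It results from iterating the Haar pyramid step 'octave' times on the samples.
    """
    m = 2 ** (octave - 1)
    c = 2.0 ** (-octave / 2.0)
    return [c] * m + [-c] * m


def detail_variance_random_walk(octave):
    """Exact variance of a Haar detail coefficient of a unit-variance random walk.

    Writing x[i] = x[0] + e[1] + ... + e[i] with independent unit-variance steps e[l],
    the coefficient sum_i h[i] * x[i] equals sum_l (sum_{i >= l} h[i]) * e[l]; the weight
    of x[0] is sum_i h[i] = 0, so the starting level plays no role.
    """
    h = haar_detail_filter(octave)
    variance, tail = 0.0, 0.0
    for l in range(len(h) - 1, 0, -1):  # l = len(h) - 1, ..., 1
        tail += h[l]
        variance += tail * tail
    return variance


if __name__ == "__main__":
    previous = None
    for octave in range(1, 9):
        var = detail_variance_random_walk(octave)
        ratio = None if previous is None else round(var / previous, 4)
        print(octave, var, ratio)
        previous = var
```

The ratio is 3 between the first two octaves, where discretization effects dominate, and tends to 4 at coarser octaves, which is the regime described by (2.24).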

Given that these are second order statistics, the particular form (2.4) of the covariance structure of an $H$-ASAS process makes it possible to deduce the asymptotic behavior of the dependence structure of the wavelet coefficients [FLA 92, TEW 92].

PROPOSITION 2.4.– The asymptotic covariance structure of the wavelet coefficients of a process $X$ which is zero-mean, self-similar of index $H$, of finite variance and with stationary increments ($H$-ASAS) takes on the form:

$$\mathbb{E}\,d_X(j,k)\,d_X(j',k') \approx |2^{-j}k - 2^{-j'}k'|^{2(H-n_\psi)}, \qquad |2^{-j}k - 2^{-j'}k'| \to \infty,$$

which shows, on the one hand, that the larger the number of vanishing moments, the shorter the range of the correlation and, on the other hand, that if $n_\psi > H + \frac{1}{2}$, the long-range dependence which exists for the increment process if $H > \frac{1}{2}$ is transformed into short-range dependence [ABR 95, FLA 92, TEW 92]. The results which have just been presented can be made more precise when the distribution underlying the self-similar process with stationary increments is specified. The Gaussian case, illustrated by FBM, has been widely studied [FLA 92, MAS 93]; its wavelet coefficients are Gaussian at all scales. More recently, interest in the non-Gaussian case has led to developments for self-similar α-stable processes (or α-stable motions) [ABR 00a, DEL 00]. From the wavelet decomposition of such processes it can be deduced that the series of their coefficients $d_X(j,k)$, in addition to the above-mentioned properties, is itself α-stable with the same index.

2.4.2. Long-range dependence

As specified in section 2.2.3, stationary processes with “long-range dependence” are characterized by a slow decrease of their correlation function $c_X(\tau) \sim c_r|\tau|^{-\beta}$, $0 < \beta < 1$. The strong statistical dependence that persists even between distant samples $X(t)$ and $X(t+\tau)$ makes the study and analysis of such processes much more complex, for example by impairing the convergence of algorithms relying on empirical moment estimators. It will be shown hereafter that the wavelet decomposition of a process with long-range dependence makes it possible to circumvent this difficulty since, under certain conditions, the series of coefficients $d_X(j,k)$ exhibit


short-term dependence. The covariance function of the wavelet coefficients possesses the following form:

$$\begin{aligned}
\mathbb{E}\,d_X(j,k)\,d_X(j',k') &\sim 2^{\frac{j+j'}{2}} \iint c_r\,|\tau|^{-\beta}\,\psi(2^{j}u - k)\,\psi\bigl(2^{j'}(u-\tau) - k'\bigr)\,du\,d\tau \\
&= 2^{-\frac{j+j'}{2}}\,c_f \int \frac{\Psi(2^{-j}f)\,\Psi(2^{-j'}f)}{|f|^{\gamma}}\;e^{-i2\pi f(2^{-j}k - 2^{-j'}k')}\,df,
\end{aligned} \tag{2.25}$$

indicating that its asymptotic behavior, i.e. for large values of the lag $|2^{-j}k - 2^{-j'}k'|$, is governed by the behavior of its Fourier transform at the origin, and hence by the relation:

$$\frac{|\Psi(2^{-j}f)\,\Psi(2^{-j'}f)|}{|f|^{\gamma}} \underset{f\to 0}{\sim} \frac{2^{-(j+j')n_\psi}\,|f|^{2n_\psi}}{|f|^{\gamma}} = \frac{2^{-(j+j')n_\psi}}{|f|^{\gamma-2n_\psi}}.$$

We thus observe the effect of the number of vanishing moments $n_\psi$ of the wavelet, which may compensate for the original divergence of the spectral density of the process. By choosing a wavelet such that $n_\psi \ge \gamma/2$, the long-range dependence of the process is no longer preserved in the coefficient sequences of the decomposition. Hence, the larger $n_\psi$ is, the faster the residual correlation decreases:

$$\mathbb{E}\,d_X(j,k)\,d_X(j',k') \approx |2^{-j}k - 2^{-j'}k'|^{\gamma-2n_\psi-1}, \qquad |2^{-j}k - 2^{-j'}k'| \to \infty.$$

From equation (2.25) we can also show that the variance of the wavelet coefficients follows a power law as a function of scale:

$$\mathbb{E}\,|d_X(j,k)|^{2} = 2^{-j(1-\beta)}\,c_f \int \frac{|\Psi(f)|^{2}}{|f|^{\gamma}}\,df = c_0\,2^{-j\gamma}. \tag{2.26}$$

This relation will be at the core of the estimation procedure of the parameter γ (see the following section). Finally, it is important to specify that, since it is possible to invert the wavelet decompositions (see equation (2.23)), the non-stationarity of the studied processes does not disappear from the analysis (no more than the long-range dependence does); all the information is preserved but redistributed differently amongst the coefﬁcients. Thus, long-range dependence and non-stationarity are related to the approximation coefﬁcients aX (j, k) of the decomposition, whereas self-similarity is observed through the scales, by an algebraic progression of the moments of order q of the detail coefﬁcients dX (j, k).
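The step from (2.25) to (2.26) can be spelled out as follows. The derivation assumes $\gamma = 1 - \beta$, as the two exponents written in (2.26) imply, and reads the product of the two wavelet spectra at equal scales as $|\Psi|^2$; the integration variable $\nu$ is introduced here for the change of variable only.

```latex
\begin{align*}
\mathbb{E}\,|d_X(j,k)|^{2}
  &= 2^{-j}\,c_f \int \frac{|\Psi(2^{-j}f)|^{2}}{|f|^{\gamma}}\,df
  && \text{(2.25) with } j' = j,\ k' = k,\\
  &= 2^{-j}\,c_f \int \frac{|\Psi(\nu)|^{2}}{2^{j\gamma}\,|\nu|^{\gamma}}\;2^{j}\,d\nu
  && f = 2^{j}\nu,\quad df = 2^{j}\,d\nu,\\
  &= 2^{-j\gamma}\,c_f \int \frac{|\Psi(\nu)|^{2}}{|\nu|^{\gamma}}\,d\nu
   \;=\; c_0\,2^{-j\gamma},
  && c_0 := c_f \int \frac{|\Psi(\nu)|^{2}}{|\nu|^{\gamma}}\,d\nu .
\end{align*}
```

The integral defining $c_0$ converges at the origin as soon as $2n_\psi > \gamma - 1$, which always holds here since $\gamma < 1$.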


2.4.3. Local regularity

The local regularity properties of process sample paths were introduced in section 2.2.4. Their “wavelet” counterparts are most often stated for the orthogonal discrete wavelet transform, although they can be extended, usually quite easily, to the continuous (surface) versions (see Chapter 3).

THEOREM 2.1.– Let $X$ be a signal with Hölder regularity $h \ge 0$ at $t_0$ and $\psi$ a sufficiently regular wavelet ($n_\psi \ge h$). Then there exists a constant $c > 0$ such that for any $(j,k) \in \mathbb{Z}\times\mathbb{Z}$:

$$|d_X(j,k)| \le c\,2^{-(\frac{1}{2}+h)j}\,\bigl(1 + |2^{j}t_0 - k|^{h}\bigr).$$

Conversely, if for any $(j,k) \in \mathbb{Z}\times\mathbb{Z}$:

$$|d_X(j,k)| \le c\,2^{-(\frac{1}{2}+h)j}\,\bigl(1 + |2^{j}t_0 - k|^{h'}\bigr)$$

for some $h' < h$, then $X$ has Hölder regularity $h$ at $t_0$.

The proof of the theorem was established independently by Jaffard [JAF 89] and by Holschneider and Tchamitchian [HOL 90]. In the light of this result, we note once again that it is the decrease of the wavelet coefficients across scales which characterizes the local regularity of the sample path of $X$. Furthermore, this result is not surprising since the Hölder regularity of a function is one particular cause of $1/f$ spectral behavior at high frequencies. The second part of Theorem 2.1 also shows that knowledge of the coefficients located “vertically” above the singular point ($|2^{j}t_0 - k| = 0$) is not in itself sufficient to determine the local regularity of $X$ at $t_0$. Strictly speaking, the decomposition would have to be considered in its entirety, which implies that an isolated singularity can affect all the coefficients $d_X(j,k)$ inside a cone, called the cone of influence. For a wavelet with finite temporal support, this cone is also bounded at each scale. From the estimation point of view, the direct implication of Theorem 2.1 is to highlight the practical limits of (discrete) orthogonal wavelet transforms, because it is quite unlikely that the abscissa $t_0$ of the singularity coincides with a line of coefficients on the dyadic grid.
Hence, in practice, a continuous analysis scheme is more often preferred, for which a less precise and incomplete version (direct implication only) of Theorem 2.1 is available (see the following proposition).

PROPOSITION 2.5.– If $X$ has Hölder regularity $n < h < n+1$ at $t_0$, then for an analyzing wavelet $\psi$ possessing $n_\psi \ge h$ vanishing moments, we have the following asymptotic behavior:

$$T_X(t_0,a) \sim O\bigl(a^{h+\frac{1}{2}}\bigr), \qquad a \to 0^{+}.$$


Proof. Let the continuous wavelet transform (2.21) be constructed with $n_\psi > n$:

$$T_X(t_0,a) = \sqrt{a}\int \psi(u)\,X(t_0+au)\,du = \sqrt{a}\int \psi(u)\,\Bigl[X(t_0+au) - \sum_{r=0}^{n} c_r\,a^{r}u^{r}\Bigr]\,du,$$

where the $c_r$ are the Taylor expansion coefficients of $X$ in the vicinity of $t_0$. The signal $X$ has regularity $n < h < n+1$ at $t_0$ and $\psi$ is a function localized in time. Thus, in the limit of infinitely fine resolutions ($a \to 0^{+}$):

$$\lim_{a\to 0^{+}} |T_X(t_0,a)| \le C\,\sqrt{a}\int |\psi(u)|\,|au|^{h}\,du = C\,a^{h+\frac{1}{2}}\int |u|^{h}\,|\psi(u)|\,du = C_\psi\,a^{h+\frac{1}{2}}.$$
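The scaling in $a^{h+1/2}$ can be observed on a simple example. The sketch below assumes a cusp $X(t) = |t - t_0|^{h}$ with $h = 0.3$ as the test signal and the Mexican hat as analyzing wavelet (two vanishing moments, hence $n_\psi \ge h$); it evaluates $T_X(t_0, a)$ by a Riemann sum at dyadic scales and fits the slope of $\log|T_X(t_0,a)|$ against $\log a$. Names and numerical parameters are illustrative.

```python
import math


def mexican_hat(u):
    """Assumed analyzing wavelet with two vanishing moments."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt_at(signal, t0, a, half_width=8.0, step=0.005):
    """Riemann sum for T_X(t0, a) = sqrt(a) * integral psi(u) X(t0 + a*u) du."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        u = -half_width + i * step
        total += mexican_hat(u) * signal(t0 + a * u)
    return math.sqrt(a) * total * step


def decay_exponent(signal, t0, scales):
    """Least-squares slope of log|T_X(t0, a)| against log a."""
    xs = [math.log(a) for a in scales]
    ys = [math.log(abs(cwt_at(signal, t0, a))) for a in scales]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


if __name__ == "__main__":
    h = 0.3
    cusp = lambda t: abs(t - 1.0) ** h  # isolated singularity at t0 = 1
    scales = [2.0 ** (-m) for m in range(1, 8)]
    # Theory predicts a slope of h + 1/2 for this signal.
    print(round(decay_exponent(cusp, 1.0, scales), 4))
```

For this particular signal the power law holds at every scale, not only asymptotically, because the cusp is exactly homogeneous around $t_0$; for real data the scaling is only expected in the limit of fine scales.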

It is important to underline that if the wavelet $\psi$ is not sufficiently regular, it is the term of degree $n_\psi$ in the Taylor polynomial which dominates at fine scales, and it is then the regularity of the wavelet which dictates the decrease of the coefficients across scales. However, one should not be misled by the interpretation of Proposition 2.5. It is only because we focus on the limiting case of infinite resolution that the influence of the singularity seems to be perfectly localized at $t = t_0$. In reality, it is shown in [MAL 92] that, in the case of non-oscillating singularities (see Chapter 3), it is necessary and sufficient to consider the lines of local maxima of the wavelet coefficients situated inside the cone of influence, $\{T_X(a,t) : |t - t_0| < c\,a\}$, to be able to characterize the local regularity of the process. In addition, the practical use of this property is made more difficult by the necessarily finite resolution imposed by the sampling of the data, which does not permit scrutiny of the data below a minimum scale, denoted by convention $a = 1$. Furthermore, the different aspects of the study of the local regularity of a function are an important topic in other chapters of this work. This is true of Chapter 3, which tackles the characterization of functional regularity spaces, of Chapter 5 and Chapter 6, which deal with multifractional processes and the regularity of their sample paths, and finally of Chapter 1, which presents multifractal spectra as statistical and geometric measures of the distribution of pointwise singularities of a process. Finally, let us note that, as previously indicated in section 2.2.4, the increments of stochastic processes with stationary increments for which the Hölder


exponent is constant along the sample paths satisfy the asymptotic relation:

$$\mathbb{E}\,|X(t+\tau) - X(t)|^{2} \sim C\,|\tau|^{2h}, \qquad |\tau| \to 0.$$

The latter can be rewritten identically in terms of the wavelet coefficients, either continuous or discrete:

$$\mathbb{E}\,|T_X(a,t)|^{2} \sim a^{2h+1}, \qquad a \to 0; \tag{2.27a}$$

$$\mathbb{E}\,|d_X(j,k)|^{2} \sim 2^{-j(2h+1)}, \qquad j \to +\infty. \tag{2.27b}$$

These relations should be compared with those obtained for self-similarity (2.24) and long-range dependence (2.26); they will serve as a starting point for the construction of estimators (see the following section).

2.4.4. Beyond second order

In this chapter, the detailed presentation is limited to the wavelet analysis of scaling laws existing at second statistical order (self-similarity, long-range dependence, constant local regularity). Nevertheless, the study of scaling law models which involve all statistical orders (multifractal processes, infinitely divisible cascades) can be carried out with wavelet analysis in the same way and benefits from the same qualities and advantages. The wavelet analysis of multifractal processes is developed by S. Jaffard and R. Riedi in Chapter 3 and Chapter 4 respectively. The wavelet analysis of infinitely divisible cascades is detailed in [VEI 00].

2.5. Implementation: analysis, detection and estimation

This section is devoted to the implementation of wavelet analysis in the study of scale invariance phenomena, whether for detecting and highlighting them or for estimating the parameters which describe them. The previous sections have brought out the power law behavior, as a function of scale, of the variance of the wavelet coefficients:

$$\mathbb{E}\,|d_X(j,k)|^{2} \simeq c\,C\,2^{j\alpha} \tag{2.28}$$

see, for self-similarity, equation (2.24); for long-range dependence, equation (2.26); for monofractality of sample paths, equation (2.27); these equations are summarized in Table 2.1. This relation crystallizes the potential of the “wavelet” tool for the analysis of scaling laws and is therefore at the core of this section. In a real situation, we must begin by validating the relevance of a model before estimating its parameters. In other words, we must first establish the existence of scale invariance phenomena and identify a scale range within which the above-mentioned

                              α         c                                    C
Self-sim. with stat. incr.    $2H+1$    $\sigma^2 = \mathbb{E}\,|X(1)|^2$    $\int |\Psi(f)|^2 / |f|^{2H+1}\,df$
Long-range dependence         $\gamma$  $c_f$                                $\int |\Psi(f)|^2 / |f|^{\gamma}\,df$
Uniform local regularity      $2h+1$    –                                    –

Table 2.1. Summary of scaling laws existing in the second statistical order

relation supplies an appropriate description of the data, then measure the exponent $\alpha$. A simple situation serves as an introduction: we suppose that an octave range $j_1 \le j \le j_2$ has already been identified, for which the fundamental relation is satisfied exactly:

$$j_1 \le j \le j_2, \qquad \mathbb{E}\,|d_X(j,k)|^{2} = c_f\,C\,2^{j\alpha}$$

and we concentrate on the estimation of the parameter $\alpha$.

2.5.1. Estimation of the parameters of scale invariance

To estimate the exponent $\alpha$, a simple idea consists of measuring the slope of $\log_2 \mathbb{E}\,|d_X(j,k)|^{2}$ against $\log_2 2^{j} = j$. In practice, of course, this requires estimating the quantity $\mathbb{E}\,|d_X(j,k)|^{2}$ from a single observed realization of finite length. Given the properties of the wavelet coefficients (stationarity, weak statistical dependence) put forth in the previous section, we simply propose to estimate the ensemble average by the time average [ABR 95, ABR 98, FLA 92]:

$$\frac{1}{n_j}\sum_{k} d_X(j,k)^{2},$$

where $n_j$ designates the number of wavelet coefficients available at octave $j$.

DEFINITION 2.7.– Let us begin by recalling the main characteristics of linear regression. Let $y_j$ be random variables such that $\mathbb{E}\,y_j = \alpha j + b$ and let us define $\sigma_j^{2} = \operatorname{var} y_j$. The weighted linear regression estimator of $\alpha$ reads:

$$\hat{\alpha} = \frac{\sum_{j=j_1}^{j_2} y_j\,(S_0\,j - S_1)/a_j}{S_0 S_2 - S_1^{2}} \equiv \sum_{j=j_1}^{j_2} w_j\,y_j \tag{2.29}$$


with:

$$S_p = \sum_{j=j_1}^{j_2} j^{p}/a_j, \qquad p = 0, 1, 2,$$

where the $a_j$ are arbitrary quantities, acting as weights associated with the $y_j$. With these definitions, the weights $w_j$ satisfy the usual constraints, i.e. $\sum_{j=j_1}^{j_2} j\,w_j = 1$ and $\sum_{j=j_1}^{j_2} w_j = 0$. It follows easily that the estimator is unbiased: $\mathbb{E}\,\hat{\alpha} = \alpha$. Moreover, in the case of uncorrelated variables $y_j$, the variance of this estimator reads:

$$\operatorname{var}\hat{\alpha} = \sum_{j=j_1}^{j_2} w_j^{2}\,\sigma_j^{2}.$$
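The weighted regression just recalled can be sketched in a few lines. The code below assumes the weights $w_j = (S_0 j - S_1)/\bigl(a_j (S_0 S_2 - S_1^2)\bigr)$, which is the form for which the two constraints $\sum_j w_j = 0$ and $\sum_j j\,w_j = 1$ hold for arbitrary positive $a_j$; the function names are illustrative.

```python
def regression_weights(octaves, a):
    """Weights w_j = (S0*j - S1) / (a_j * (S0*S2 - S1**2)) with S_p = sum_j j**p / a_j."""
    s0 = sum(1.0 / aj for aj in a)
    s1 = sum(j / aj for j, aj in zip(octaves, a))
    s2 = sum(j * j / aj for j, aj in zip(octaves, a))
    det = s0 * s2 - s1 * s1
    return [(s0 * j - s1) / (aj * det) for j, aj in zip(octaves, a)]


def slope_estimate(octaves, y, a):
    """Weighted least-squares estimate of alpha in E[y_j] = alpha * j + b."""
    return sum(w * yj for w, yj in zip(regression_weights(octaves, a), y))


def slope_variance(octaves, a, sigma2):
    """Variance sum_j w_j**2 * sigma_j**2 of the estimate for uncorrelated y_j."""
    return sum(w * w * s for w, s in zip(regression_weights(octaves, a), sigma2))


if __name__ == "__main__":
    octaves = [3, 4, 5, 6, 7]
    a = [2.0 ** j for j in octaves]           # hypothetical weights, larger at coarse octaves
    y = [1.7 * j - 0.4 for j in octaves]      # noiseless data on a line of slope 1.7
    w = regression_weights(octaves, a)
    print(round(sum(w), 12), round(sum(j * wj for j, wj in zip(octaves, w)), 12))
    print(round(slope_estimate(octaves, y, a), 12))
```

Because the weights sum to zero and satisfy $\sum_j j\,w_j = 1$, any data lying exactly on a line $\alpha j + b$ return $\alpha$ whatever the intercept $b$, which is the unbiasedness property in its simplest form.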

The choice of the weights remains to be specified. We know that the variance of $\hat{\alpha}$ is minimal if we take into consideration the covariance structure of $y_j$, $y_{j'}$ in the definition of $a_j$. Once again, in the case of uncorrelated variables $y_j$, this leads us to choose $a_j \equiv \sigma_j^{2}$. In the case of scale invariance, we use the estimator defined earlier with the variables:

$$y_j = \log_2\Bigl(\frac{1}{n_j}\sum_{k} d_X(j,k)^{2}\Bigr) - g(j),$$

where the $g(j)$ are correction terms which account for the fact that $\mathbb{E}(\log(\cdot))$ is not $\log(\mathbb{E}(\cdot))$ and which ensure that $\mathbb{E}\,y_j = \alpha j + b$. Hence, this estimator simply consists of a weighted linear regression carried out in the diagram of $y_j$ against $j$, referred to as the log-scale diagram [VEI 99]. To implement this estimator easily, it is further necessary to determine $g(j)$ and $\sigma_j^{2}$ and to choose $a_j$. To begin with, we assume that the $d_X(j,k)$ are Gaussian random variables, i.e. that they result from the wavelet decomposition of a process which is itself jointly Gaussian. Moreover, if we idealize the weak correlation property of the wavelet coefficients as exact independence, then we can calculate $g(j)$ and $\sigma_j^{2}$ analytically:

$$g(j) = \frac{\Gamma'(n_j/2)}{\Gamma(n_j/2)\,\log 2} - \log_2(n_j/2) \sim \frac{-1}{n_j \log 2}, \qquad n_j \to \infty; \tag{2.30a}$$

$$\sigma_j^{2} = \zeta(2, n_j/2)/\log^{2} 2 \sim \frac{2}{n_j \log^{2} 2}, \qquad n_j \to \infty, \tag{2.30b}$$


where $\Gamma$ and $\Gamma'$ respectively designate the Gamma function and its derivative, and where $\zeta(2, z) = \sum_{n=0}^{\infty} 1/(z + n)^2$ defines a function called the generalized Riemann zeta function. Let us note that these analytical expressions, which depend on the known $n_j$ alone, can be easily evaluated in practice. The numerical simulations presented in depth in [VEI 99] indicate that, for Gaussian processes, this analytical calculation is an excellent approximation of reality, which justifies a posteriori the idealization of exact independence. Thus, for Gaussian processes, we obtain an estimator which is remarkably simple to implement, since the quantities $g(j)$ and $\sigma_j^2$ can be calculated analytically and do not need to be estimated from data, and which gives excellent statistical performance. From these analytical expressions, we obtain $\mathbb{E}\,\hat{\alpha} = \alpha$, which indicates that the estimator is unbiased, and this remains valid for observations of finite duration. With the choice $a_j = \sigma_j^2$, its variance reads:

$$\operatorname{Var} \hat{\alpha} = \sum_{j=j_1}^{j_2} \sigma_j^2 w_j^2 = \frac{S_0}{S_0 S_2 - S_1^2}$$

and it attains the Cramér-Rao lower bound, which is calculated under the same hypotheses (Gaussian process and exact independence) [ABR 95, VEI 99, WOR 96]. In addition, if we use the form $n_j = 2^{-j} n$ (with $n$ the number of coefficients of the initial process) induced by the construction of the multiresolution analysis, we obtain the following expression for the variance:

$$\operatorname{Var} \hat{\alpha} = \sum_{j=j_1}^{j_2} \sigma_j^2 w_j^2 \approx \frac{1}{n} \cdot \frac{1 - 2^{-J}}{2^{1-j_1}\,F}, \tag{2.31}$$

where $F = F(j_1, J) = \log^2 2 \cdot \big(1 - (J^2/2 + 2)\,2^{-J} + 2^{-2J}\big)$ and where $J = j_2 - j_1 + 1$ denotes the number of octaves involved in the linear regression. This analytical result shows that the variance of the estimator decreases as $1/n$, in spite of the possible presence of long-range dependence in the analyzed process. It is noteworthy that, in practice, the relation $n_j = 2^{-j} n$ is not exactly satisfied because of boundary effects, which are systematically excluded a priori from the measurements.

In the case of non-Gaussian processes, the implementation of the estimator is more subtle, since we cannot use analytical expressions for $g(j)$ and $\sigma_j^2$. Nevertheless, in the case of finite variance processes, the variables $(1/n_j) \sum_k d_X(j,k)^2$ are asymptotically Gaussian and we can show that correction terms can be added [ABR 00b] to the Gaussian case:

$$g(j) \sim -\frac{1 + C_4(j)/2}{n_j \log 2}; \qquad \sigma_j^2 \sim \frac{2}{\log^2 2} \cdot \frac{1 + C_4(j)/2}{n_j},$$


where $C_4$ denotes the normalized fourth-order cumulant of the wavelet coefficients at octave $j$:

$$C_4(j) = \frac{\mathbb{E}\,d_X(j,k)^4 - 3\big(\mathbb{E}\,d_X(j,k)^2\big)^2}{\big(\mathbb{E}\,d_X(j,k)^2\big)^2}.$$

The practical use of these relations requires the estimation of $C_4$, which can be difficult, as well as the guarantee that each octave contains enough points for the above form, which results from an asymptotic expansion, to be valid. An approximate yet simple practical choice, regularly implemented, consists of using: $g(j) \equiv 0$; $\sigma_j^2 \equiv 2/(n_j \log^2 2)$; $a_j \equiv \sigma_j^2$. The numerical simulations reported in [ABR 98, VEI 99] show that the performance of these choices is very satisfactory for the analysis of long-range dependence. Indeed, such a choice implies that the weight of $y_j$ in the linear regression is half that of $y_{j-1}$, which is a priori a realistic point of view for the study of long-range dependence (Gaussianization effect at large scales), but less obviously so for the study of local regularity. This choice is all the more delicate as it affects the bias and the variance of the estimator at the same time; an alternative choice, $a_j$ constant, can also be considered.

In equation (2.28), from which the study of scaling law behavior stems, the focus has, until now, been on the exponent $\alpha$ of the power law, since it defines the phenomenon. However, the measurement of the multiplicative parameter $c_f$ can be fruitful for certain applications. This estimation is detailed in [VEI 99].

Finally, the case of self-similar processes with stationary increments of infinite variance (and/or mean value) will not be tackled here. It is developed in [ABR 00a].

2.5.2. Emphasis on scaling laws and determination of the scaling range

The previous section relied on the hypothesis that the quantity $\mathbb{E}\,d_X(j,k)^2$ behaves exactly as a power law of the scale. This ideal situation is rarely observed, for two reasons. On the one hand, real experimental data are likely to be only approximately described by the proposed models of scale invariance. On the other hand, certain models themselves induce only an approximate or asymptotic power-law behavior, as is the case for long-range dependence or processes with fractal sample paths; in fact, only the self-similar model induces a strictly satisfied power law:

$$\begin{array}{lll}
\text{LRD} & j \to +\infty & \mathbb{E}\,d_X(j,k)^2 \sim c_f\,C\,2^{j\alpha}; \\
\text{Fractal} & j \to -\infty & \mathbb{E}\,d_X(j,k)^2 \sim c_f\,C\,2^{j\alpha}; \\
H\text{-ss} & \forall j & \mathbb{E}\,d_X(j,k)^2 = c_f\,C\,2^{j\alpha}.
\end{array}$$
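Before turning to the choice of the octave range, the Gaussian-case quantities (2.30a), (2.30b) and the variance (2.31) of Section 2.5.1 can be evaluated numerically. The sketch below is an illustration only: it uses the Python standard library, home-made approximations of $\Gamma'/\Gamma$ and $\zeta(2, z)$ (recurrence followed by an asymptotic series), and hypothetical function names. The variance comparison assumes $\sigma_j^2 = 2/(n_j \log^2 2)$, $n_j = 2^{-j} n$ and $J \geq 2$.

```python
import math

LN2 = math.log(2.0)


def digamma(x):
    """Gamma'(x)/Gamma(x) for x > 0 (recurrence, then asymptotic series)."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))


def hurwitz_zeta2(z):
    """zeta(2, z) = sum_{n>=0} 1/(z+n)^2 for z > 0 (recurrence, then asymptotic series)."""
    acc = 0.0
    while z < 6.0:
        acc += 1.0 / (z * z)
        z += 1.0
    inv = 1.0 / z
    inv2 = inv * inv
    return acc + inv + 0.5 * inv2 + inv * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))


def g_gauss(nj):
    """Correction term g(j) of (2.30a) for n_j coefficients."""
    return digamma(nj / 2.0) / LN2 - math.log2(nj / 2.0)


def sigma2_gauss(nj):
    """Variance sigma_j^2 of (2.30b) for n_j coefficients."""
    return hurwitz_zeta2(nj / 2.0) / LN2 ** 2


def variance_from_sums(n, j1, j2):
    """S0 / (S0 S2 - S1^2) with a_j = 2 / (n_j log^2 2) and n_j = n 2^-j."""
    inv_a = {j: n * 2.0 ** (-j) * LN2 ** 2 / 2.0 for j in range(j1, j2 + 1)}
    s0 = sum(inv_a.values())
    s1 = sum(j * v for j, v in inv_a.items())
    s2 = sum(j * j * v for j, v in inv_a.items())
    return s0 / (s0 * s2 - s1 ** 2)


def variance_closed_form(n, j1, j2):
    """Right-hand side of (2.31); requires J = j2 - j1 + 1 >= 2."""
    J = j2 - j1 + 1
    F = LN2 ** 2 * (1.0 - (J * J / 2.0 + 2.0) * 2.0 ** (-J) + 2.0 ** (-2 * J))
    return (1.0 - 2.0 ** (-J)) / (n * 2.0 ** (1 - j1) * F)


n = 2 ** 14  # hypothetical number of coefficients
for nj in (16, 256, 4096):
    print(nj, g_gauss(nj), -1.0 / (nj * LN2), sigma2_gauss(nj), 2.0 / (nj * LN2 ** 2))
for j1 in (1, 2, 3, 4):
    print(j1, variance_from_sums(n, j1, 12), variance_closed_form(n, j1, 12))
```

The second loop prints the variance for increasing $j_1$ at fixed $j_2$, which makes the dependence on the lower octave directly visible.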


Hence, in the implementation of the estimator described earlier, it is necessary to choose an octave range over which the measurement is carried out, i.e., to select the octaves $j_1$ and $j_2$. Making this choice does not necessarily mean extracting the theoretical values of $j_1$ and $j_2$ from the data, since these are not always defined by the model, but rather optimizing the statistical performance of the estimator. Widening the octave range $[j_1, j_2]$ implies the use of a larger fraction of the available wavelet coefficients, resulting in a reduction of the estimation variance, as indicated by relation (2.31); conversely, it can also mean an increase in the estimation bias if the measurement is carried out over a range where the behavior departs notably from a power law. The choice of the range is thus guided by the optimization of a bias-variance trade-off.

Let us take the example of long-range dependence: according to the model, we wish to choose $j_2 = +\infty$, i.e., in practice, $j_2$ as large as possible, the upper limit of the wavelet decomposition being fixed by the number of coefficients of the analyzed process and the importance of boundary effects; as for $j_1$, it is not imposed by the model. Choosing a larger $j_1$ makes it possible to work in a zone where the asymptotic behavior is satisfied, and thus means a small estimation bias but a large variance (the variance is essentially governed by the factor $2^{j_1}$, see equation (2.31), and thus doubles each time $j_1$ is increased by 1). Qualitatively, we are led to tolerate a little bias (i.e., to widen the measurement range towards the small octaves) in order to reduce the variance and thus minimize the mean square error (MSE = (bias)$^2$ + variance).

In the case of the measurement of local regularity, the situation is different. The model tends to impose a small $j_1$ (in practice, $j_1 = 1$) without fixing $j_2$. As in the preceding case, increasing $j_2$ may induce a bias through a departure from the ideal behavior, but will only slightly reduce the variance: indeed, it suffices to examine the form of the variance of $\hat{\alpha}$ to note the reduced influence of $j_2$, in accordance with the fact that fewer and fewer wavelet coefficients are available at the coarsest octaves. In practice, we are led to choose the narrowest possible range, confined to the lowest octaves.

To move towards more quantitative arguments, it is necessary to completely specify the models of the processes studied. We prefer to keep to an approach which imposes the fewest possible a priori assumptions on the model, and simply postulate that scale invariance phenomena are present. We resort to the quantity:

$$G(j_1, j_2) = \sum_{j=j_1}^{j_2} \frac{(y_j - \hat{\alpha} j - \hat{b})^2}{\sigma_j^2} \tag{2.32}$$

which conveys a usual measure of the mean-square error between data and model. For Gaussian $y_j$, the variable $G(j_1, j_2)$ follows a chi-squared law $\chi^2_{J-2}$ with $J - 2$ degrees of freedom. The dependence in $j_1$, $j_2$ of the quantity:

$$Q(j_1, j_2) = 1 - \int_0^{G(j_1, j_2)} \chi^2_{J-2}(u)\,du$$

Figure 2.3. Examples of log-scale diagram ($y_j$ plotted against octave $j$). Right: second order log-scale diagram for a long-range dependent process that also possesses a highly pronounced correlation structure of short memory type (visible at the small octaves); specifically, an ARFIMA(0, d, 2) process with $d = 0.25$ and a second order moving average $\Psi(B) = 1 + 2B + B^2$, implying $(\gamma, c_f) = (0.50, 6.38)$. The vertical error bars at each $j$ represent 95% confidence intervals for $Y_j^{(2)}$. A linear behavior is observed over the octaves $[j_1, j_2] = [4, 10]$, which excludes the small octaves (short range memory) but includes the larger ones. A weighted linear regression enables, in spite of the strong presence of short-term dependencies, a precise estimation of $\gamma$: $\hat{\gamma} = 0.53 \pm 0.07$, $\hat{c}_f = 6.0$ with $4.5 < \hat{c}_f < 7.8$. Left: second order log-scale diagram for a self-similar process (FBM) of parameter $H = 0.8$. The linear behavior extends over all scales and allows a precise estimation of $H$: $\hat{H} = 0.79$

makes it possible to guide the choice of the analysis range. A value of $Q$ close to 1 indicates the adequacy of the model, as opposed to $Q$ close to 0. One proposed approach consists of looking for break points in the behavior of $Q$ as a function of $j_1$, $j_2$.

2.5.3. Robustness of the wavelet approach

One of the great difficulties in the analysis of scale invariance phenomena is linked to the fact that their qualitative manifestations are close to those induced by non-stationarities. For a long time, the modeling of data by scale invariance was rejected because it was considered, sometimes correctly, to be an artefact due to non-stationarity. The difficulties are of two types: on the one hand, as just mentioned, identifying scale invariance where there are, in fact, only non-stationarities; on the other hand, failing to detect scale invariance, or to correctly estimate its parameters, when non-stationary effects are superimposed. Wavelet analysis has made it possible to find solutions to these two problems.

For the first type of problem, it was proposed [VEI 01] to split the signal under analysis into $L$ non-overlapping segments. For each segment, we carry out an estimation $\hat{\alpha}_l$ of the scale invariance parameter. We then validate the relevance of a scale invariance model by testing the consistency of the $\hat{\alpha}_l$ across blocks. Hence, this is not a stationarity test in the general sense, but more simply a

Figure 2.4. Robustness with respect to superimposed trends. Left (signal against time): fractional Gaussian noise with $H = 0.80$ (above) and with superimposed sinusoidal and linear trends (below). Right ($y_j$ against octave $j$): log-scale diagrams of the signal corrupted by the trends, as computed with a Daubechies 2 wavelet (i.e., $N = 2$) (above) and a Daubechies 6 wavelet (i.e., $N = 6$) (below). We observe that increasing $N$ cancels the effects of the superimposed trends and allows for a reliable estimation of $H$: $\hat{H} = 0.795$

test aimed at detecting abnormally large fluctuations of the estimates $\hat{\alpha}_l$, which would lead us to reject the presence of scale invariance. In practice, the properties of the wavelet coefficients (weak statistical dependence among coefficients) and the definition and theoretical study of the estimator $\hat{\alpha}$, presented earlier, make it possible to conduct this test as the detection of a change in mean value within independent Gaussian variables of unknown but identical mean values and of possibly different but known variances [VEI 01].

For the second type of problem, the number of vanishing moments of the wavelet plays a fundamental role. By definition, the wavelet coefficients of a polynomial $p(t)$ of degree $P$ strictly smaller than the number of vanishing moments of the mother wavelet, $P < n_\psi$, are exactly zero. This means that if the observed signal $Z$ is made up of a signal to analyze $X$ on which a polynomial trend $p$ is superimposed, the wavelet analysis of the scale invariance phenomena that are likely to be present in $X$ will be, given its linear nature, insensitive to the presence of $p$ as soon as


$n_\psi$ is sufficiently large. In practice, we do not necessarily know a priori the order of the corrupting polynomial, if any; we can thus simply carry out a series of wavelet analyses with an increasing number of vanishing moments $n_\psi$. When the results no longer change with $n_\psi$, this is an indication that $P$ has been exceeded. It is noteworthy that this procedure is made fully practicable by the low computational cost of the discrete wavelet decomposition. Admittedly, in practice, the trends superimposed on the data are not polynomial. However, in the case where they are sufficiently regular (e.g., quasi-sinusoidal oscillations) or only slightly irregular (e.g., in $t^{-\beta}$, $\beta > 0$), the preceding argument remains valid: when $n_\psi$ increases, the magnitude of the wavelet coefficients of the trend decreases, whereas that of $X$ remains identical; the effect of the trend thus becomes quasi-negligible [ABR 98].

The superposition of a deterministic trend on the process $X$ can be interpreted as a non-stationarity of the mean; this situation can be made more complex by considering that the variance of the process itself evolves. Hence, we can write the observation as $Z(t) = a(t) + b(t)X(t)$, where $X(t)$ is a process presenting scale invariance in the form of one of the models referred to earlier (self-similarity, long-range dependence, etc.) and where $a(t)$ and $b(t)$ are sufficiently regular deterministic functions. It has been shown that varying the number of vanishing moments of the mother wavelet makes it possible to overcome the drift effects of $a$ and $b$ and to carry out reliable estimations of the scale invariance parameters associated only with $X$ [ROU 99].

Finally, the analysis of the signal plus noise situation, which is usual in signal processing when we write an observation $Z = Y + X$ (where $X$ is the process with scale invariance to be studied and $Y$ some additive random noise), has been considered in [WOR 96] by maximum likelihood approaches and will not be developed here.

2.6. Conclusion

In this chapter, we have focused on a qualitative description rather than a rigorous formalization of the concepts, models and analyses. The main concern was to offer the reader who is not specialized in this field, but is eager to apply certain principles of fractal analysis to real data, some entry points that are as much theoretical as practical. In the first part especially, emphasis has been put on the relations between the different models used to describe scaling laws, their similarities and differences, so that this notion becomes accessible. Similarly, in the second part, the presentation of wavelet tools enabling the characterization and analysis of scaling laws has been structured around the different practical aspects essential for their implementation (selection of the scale range, estimation of the parameters, robustness and algorithm). All the analysis techniques described here, as well as some extensions and variations, have been implemented in Matlab, with toolboxes that are freely


accessible on the websites http://www.ens-lyon.fr/pabry and http://perso.ens-lyon.fr/paulo.goncalves.

2.7. Bibliography

[ABR 95] ABRY P., GONÇALVES P., FLANDRIN P., “Wavelets, spectrum estimation, and 1/f processes”, in ANTONIADIS A., OPPENHEIM G. (Eds.), Wavelets and Statistics, Springer-Verlag, Lecture Notes in Statistics 103, New York, p. 15–30, 1995.

[ABR 98] ABRY P., VEITCH D., “Wavelet analysis of long-range dependent traffic”, IEEE Trans. on Info. Theory, vol. 44, no. 1, p. 2–15, 1998.

[ABR 00a] ABRY P., PESQUET-POPESCU B., TAQQU M.S., “Wavelet based estimators for self similar α-stable processes”, in International Conference on Signal Processing: Sixteenth World Computer Congress (Beijing, China, 2000), August 2000.

[ABR 00b] ABRY P., TAQQU M.S., FLANDRIN P., VEITCH D., “Wavelets for the analysis, estimation, and synthesis of scaling data”, in PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons, p. 39–88, 2000.

[AVE 98] AVERKAMP R., HOUDRÉ C., “Some distributional properties of the continuous wavelet transform of random processes”, IEEE Trans. on Info. Theory, vol. 44, no. 3, p. 1111–1124, 1998.

[BER 94] BERAN J., Statistics for Long-memory Processes, Chapman and Hall, New York, 1994.

[CAM 95] CAMBANIS S., HOUDRÉ C., “On the continuous wavelet transform of second-order random processes”, IEEE Trans. on Info. Theory, vol. 41, no. 3, p. 628–642, 1995.

[CAS 90] CASTAING B., GAGNE Y., HOPFINGER E., “Velocity probability density functions of high Reynolds number turbulence”, Physica D, vol. 46, p. 177, 1990.

[CAS 96] CASTAING B., “The temperature of turbulent flows”, Journal de physique II France, vol. 6, p. 105–114, 1996.

[DAU 92] DAUBECHIES I., Ten Lectures on Wavelets, SIAM, 1992.

[DEL 00] DELBEKE L., ABRY P., “Stochastic integral representation and properties of the wavelet coefficients of linear fractional stable motion”, Stochastic Processes and their Applications, vol. 86, p. 177–182, 2000.

[FLA 92] FLANDRIN P., “Wavelet analysis and synthesis of fractional Brownian motion”, IEEE Trans. on Info. Theory, vol. IT-38, no. 2, p. 910–917, 1992.

[FRI 95] FRISCH U., Turbulence: The Legacy of A.N. Kolmogorov, Cambridge University Press, Cambridge, 1995.

[HOL 90] HOLSCHNEIDER M., TCHAMITCHIAN P., “Régularité locale de la fonction non différentiable de Riemann”, in LEMARIÉ P.G. (Ed.), Les ondelettes en 1989, Springer-Verlag, 1990.

[JAF 89] JAFFARD S., “Exposants de Hölder en des points donnés et coefficients d’ondelettes”, Comptes rendus de l’Académie des sciences de Paris, vol. 308, 1989.

[LEL 94] LELAND W.E., TAQQU M.S., WILLINGER W., WILSON D.V., “On the self-similar nature of Ethernet traffic (extended version)”, IEEE/ACM Trans. on Networking, vol. 2, p. 1–15, 1994.

[MAL 92] MALLAT S.G., HWANG W.L., “Singularity detection and processing with wavelets”, IEEE Trans. on Info. Theory, vol. 38, no. 2, p. 617–643, 1992.

[MAL 98] MALLAT S.G., A Wavelet Tour of Signal Processing, Academic Press, San Diego, California, 1998.

[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.

[MAN 97] MANDELBROT B.B., Fractals and Scaling in Finance, Springer, New York, 1997.

[MAS 93] MASRY E., “The wavelet transform of stochastic processes with stationary increments and its application to fractional Brownian motion”, IEEE Trans. on Info. Theory, vol. 39, no. 1, p. 260–264, 1993.

[PAR 00] PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons (Interscience Division), 2000.

[PES 99] PESQUET-POPESCU B., “Statistical properties of the wavelet decomposition of certain non-Gaussian self-similar processes”, Signal Processing, vol. 75, no. 3, 1999.

[ROU 99] ROUGHAN M., VEITCH D., “Measuring long-range dependence under changing traffic conditions”, in IEEE INFOCOM’99 (Manhattan, New York), IEEE Computer Society Press, Los Alamitos, California, p. 1513–1521, March 1999.

[SAM 94] SAMORODNITSKY G., TAQQU M.S., Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman and Hall, New York and London, 1994.

[TEI 00] TEICH M., LOWEN S., JOST B., VIBE-RHEYMER K., HENEGHAN C., “Heart rate variability: measures and models”, Nonlinear Biomedical Signal Processing, vol. II, Dynamic Analysis and Modeling (M. Akay, Ed.), Ch. 6, p. 159–213, IEEE Press, 2001.

[TEW 92] TEWFIK A.H., KIM M., “Correlation structure of the discrete wavelet coefficients of fractional Brownian motions”, IEEE Trans. on Info. Theory, vol. IT-38, no. 2, p. 904–909, 1992.

[VEI 99] VEITCH D., ABRY P., “A wavelet based joint estimator of the parameters of long-range dependence”, IEEE Transactions on Information Theory (special issue on “Multiscale statistical signal analysis and its applications”), vol. 45, no. 3, p. 878–897, 1999.

[VEI 00] VEITCH D., ABRY P., FLANDRIN P., CHAINAIS P., “Infinitely divisible cascade analysis of network traffic data”, in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (Istanbul, Turkey), June 2000.

[VEI 01] VEITCH D., ABRY P., “A statistical test for the constancy of scaling exponents”, IEEE Trans. on Sig. Proc., vol. 49, no. 10, p. 2325–2334, 2001.

[WOR 96] WORNELL G.W., Signal Processing with Fractals – A Wavelet-based Approach, Prentice-Hall, 1996.

Chapter 3

Wavelet Methods for Multifractal Analysis of Functions

Chapter written by Stéphane JAFFARD.

3.1. Introduction

A large number of signals are very irregular. In the most complex situations, the irregularity manifests itself in different forms and may change its form almost instantaneously. The most widely studied example in physics is the velocity signal of a turbulent flow. During the 1980s, precise recordings of the velocity of a turbulent flow were made in the ONERA wind tunnel at Modane (see Gagne et al. [GAG 87]). A thin wire, heated at one point, is placed in the turbulent air flow; the rate at which the temperature decreases is directly proportional to the component of the flow velocity orthogonal to the wire at the heated point. We obtain incomplete information as the signal is 1D; however, it is more precise than the best numerical simulations currently being carried out. The study of this signal showed that it is not statistically homogeneous; its regularity varies a lot from one point to another [ARN 95b, FRI 95]. Such signals cannot be modeled by processes such as fractional Brownian motion, for example.

The techniques of multifractal analysis were developed in order to model and analyze such behaviors. Originally introduced for fully developed turbulence, these techniques began to be used, within a few years, in various scientific fields: traffic analysis (road and Internet traffic) [ABR 98, LEV 97, TAQ 97, WIL 96, WIL 00], modeling of economic signals [MAN 97], texture analysis [BIS 98], electrocardiograms [AMA 00], etc.

The mathematical theory of multifractal functions has expanded considerably: not only were numerous heuristic arguments, used in the numerical analysis of multifractal signals, studied and justified under certain assumptions that delimit their domain of validity but, most importantly, these mathematical results had various consequences for applications; they led researchers to introduce new tools that made it possible to refine and enrich the techniques of multifractal analysis. Within mathematics, multifractal analysis has acquired an extremely original position, since functions taken from very different domains prove to be multifractal:

– in probability, the sample paths of Lévy processes [JAF 99];

– in analytic number theory, trigonometric series linked to theta functions [JAF 96a];

– in analysis, geometric constructions such as “Peano functions” [JAF 97c, JAF NIC];

– in arithmetic, functions where Diophantine approximation properties play a role [JAF 97b];

– etc.

Multifractal analysis thus provides us with a vocabulary and a set of methods that help establish connections and find analogies between diverse fields of science and mathematics.

Almost immediately after their appearance, wavelet analysis techniques were applied to the multifractal analysis of signals by Arneodo and his team at the CRPP in Bordeaux [ARN 95b]. These techniques proved to be extremely powerful for the mathematical analysis of problems, as well as for the construction of robust numerical algorithms. In this chapter, we explain in detail certain fundamental results related to the wavelet methods of multifractal analysis. We also briefly describe the vast scientific panorama of recent times and conclude with a review of specialized articles.

In order to simplify the presentation and the notations, all results will be stated in dimension 1. Most of the results extend easily to functions defined on $\mathbb{R}^d$ (see [JAF 04b, JAF 06]). The reader can find a more detailed presentation of the different fields of application of multifractal analysis in [ARN 95a, JAF 01a, MAN 98, LAS 08, WEN 07].

3.2. General points regarding multifractal functions

3.2.1. Important definitions

Multifractal functions help in modeling signals whose regularity varies from one point to another. Thus, the first problem is to mathematically define the regularity of a function at every point. What is “pointwise regularity”? It is a way of quantifying,


with the help of a positive real number $\alpha$, the fact that the graph of a function is more or less rough at a given point $x_0$ (the picture is not merely superficial; the concepts we introduce have, in fact, been used in the measurement of roughness [DUB 89, TRI 97]). Hölder regularity generalizes familiar concepts: the “minimum level” of regularity is continuity. A function $f$ is continuous at $x_0$ if $|f(x) - f(x_0)| \to 0$ when $x \to x_0$; continuity will correspond to a regularity index $\alpha = 0$. Similarly, $f$ is differentiable at $x_0$ if there is an affine function $P$ such that $|f(x) - P(x - x_0)| \to 0$ faster than $|x - x_0|$ when $x \to x_0$; differentiability will correspond to a regularity index $\alpha = 1$. The following definition is a direct generalization of these two cases.

DEFINITION 3.1.– Let $\alpha$ be a positive real number and $x_0 \in \mathbb{R}$; a function $f : \mathbb{R} \to \mathbb{R}$ is $C^\alpha(x_0)$ if there exist a constant $C$ and a polynomial $P$ of degree less than $\alpha$ such that, in a neighborhood of $x_0$:

$$|f(x) - P(x - x_0)| \leq C|x - x_0|^\alpha \tag{3.1}$$
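Condition (3.1) can be explored numerically by evaluating the ratio $|f(x) - P(x - x_0)| / |x - x_0|^\alpha$ on a sequence of points approaching $x_0$. The following Python sketch is an illustration only: the helper `holder_ratios` and the two test functions are hypothetical, and sampling finitely many points can suggest, but never prove, that the ratio stays bounded.

```python
def holder_ratios(f, x0, alpha, P, max_m=30):
    """Ratios |f(x) - P(x - x0)| / |x - x0|**alpha at x = x0 +/- 2**-m, m = 1..max_m."""
    ratios = []
    for m in range(1, max_m + 1):
        for h in (2.0 ** (-m), -(2.0 ** (-m))):
            ratios.append(abs(f(x0 + h) - P(h)) / abs(h) ** alpha)
    return ratios


def cusp(x):
    """f(x) = |x|^0.5: expected to satisfy (3.1) at 0 for alpha <= 0.5 only."""
    return abs(x) ** 0.5


def line_plus_cusp(x):
    """f(x) = x + |x|^1.5: expected to satisfy (3.1) at 0 for alpha = 1.5 with P(u) = u."""
    return x + abs(x) ** 1.5


def zero_poly(u):
    return 0.0


def identity_poly(u):
    return u


# Ratio stays bounded for alpha = 0.5 and grows for alpha = 0.7.
print(max(holder_ratios(cusp, 0.0, 0.5, zero_poly)))
print(max(holder_ratios(cusp, 0.0, 0.7, zero_poly)))
# With the first-degree polynomial P(u) = u removed, alpha = 1.5 is reached.
print(max(holder_ratios(line_plus_cusp, 0.0, 1.5, identity_poly)))
```

The second test function shows the role of the polynomial $P$: without subtracting the first-degree term, the ratio for $\alpha = 1.5$ would blow up near $x_0$.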

NOTE 3.1.– The polynomial $P$ is unique (if $Q$ were also admissible, applying Definition 3.1 with $P$ and $Q$ would give $|P(x) - Q(x)| \leq C|x|^\alpha$ and, since $P - Q$ has degree at most $[\alpha]$, we would have $P - Q = 0$). The constant term of the polynomial $P(x - x_0)$ is always $f(x_0)$; similarly, the first degree term, if present, is always $(x - x_0) f'(x_0)$ (by the definition of the derivative). Also, if $f$ is $C^{[\alpha]}$ in a neighborhood of $x_0$, the uniqueness of $P$ just noted implies that $P$ is the Taylor expansion of $f$ at $x_0$ of order $[\alpha]$. However, equation (3.1) can hold for large values of $\alpha$ without $f$ being even twice differentiable at $x_0$ (in which case the Taylor expansion in the usual sense stops at the first-order term); consider, for example, the “chirp” $x^n \sin(x^{-n})$ at 0 for large $n$ (which does not have a second derivative at 0 since its first derivative is not continuous). We see that $P$ gives a generalization of the notion of Taylor expansion (see also Chapter 4; see [GIN 00, MEY 98] for extensions of these concepts in more general contexts). We finally note that equation (3.1) implies that $f$ is bounded in the vicinity of $x_0$; therefore, we suppose that the functions we consider are locally bounded (see [MEY 98], where an exponent is introduced which makes it possible to define a weak notion of Hölder regularity for functions that are not a priori locally bounded, and likewise for distributions).

DEFINITION 3.2.– The Hölder exponent of $f$ at $x_0$ is:

$$h_f(x_0) = \sup\{\alpha : f \text{ is } C^\alpha(x_0)\}$$

The Hölder exponent is a function that is defined point by point and describes the local variations of the regularity of $f$. Certain functions have a constant Hölder exponent. Thus, the Weierstrass series:

$$W_{b,H}(x) = \sum_{j} b^{-Hj} \sin(b^j x)$$


has a Hölder exponent that is constant and equal to $H$; similarly, the sample paths of Brownian motion almost surely satisfy $h_B(x) = 1/2$ for all $x$. More generally, a fractional Brownian motion of exponent $H$ has at every point a Hölder exponent equal to $H$. Such functions are irregular everywhere.

Our objective is to study functions whose Hölder exponent can jump from one point to another. In such a situation, the numerical calculation of the function $h_f(x_0)$ is completely unstable and of little significance. We rather try to extract less precise information: whether or not the function $h_f$ takes a certain given value $H$ and, if it does, what is the size of the set of points where $h_f$ takes this value? Here we are faced with a new problem: what is the “right” notion of “size” in this context? We will not be able to fully justify the answer to this question, because it results from the study of numerous mathematical examples. Let us just keep in mind that “size” cannot mean “Lebesgue measure” because, in general, there exists a “most probable” Hölder exponent, which is taken almost everywhere. The other exponents thus appear on sets of measure zero, and the Lebesgue measure does not make it possible to differentiate between them. Besides, the “right” notion of size cannot be the box dimension either, because these sets are usually dense. In fact, we expect them to be fractal. A traditional mathematical method to measure the size of such dense sets of zero measure is to calculate their Hausdorff dimension. Let us recall its definition.

DEFINITION 3.3.– Let $A$ be a subset of $\mathbb{R}$. For $\varepsilon > 0$, let us denote:

$$M_\varepsilon^d = \inf_R \sum_i \varepsilon_i^d$$

where $R$ denotes a generic covering of $A$ by intervals $]x_i, x_i + \varepsilon_i[$ of length $\varepsilon_i \leq \varepsilon$. The infimum is thus taken over all such coverings. For all $d \in [0, 1]$, the $d$-dimensional Hausdorff measure of $A$ is:

$$\operatorname{mes}_d(A) = \lim_{\varepsilon \to 0} M_\varepsilon^d$$

This measure takes the value $+\infty$ or $0$ except for, at most, one value of $d$, and the Hausdorff dimension of $A$ is:

$$\dim(A) = \sup\Big\{d : \lim_{\varepsilon \to 0} M_\varepsilon^d = +\infty\Big\} = \inf\Big\{d : \lim_{\varepsilon \to 0} M_\varepsilon^d = 0\Big\}$$
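The dichotomy in Definition 3.3 can be visualized on the triadic Cantor set, a standard example that is not taken from this chapter. The Python sketch below sums $\varepsilon_i^d$ over the natural covering by the $2^n$ intervals of length $3^{-n}$; the function names are hypothetical. Since $M_\varepsilon^d$ is an infimum over all coverings, these sums are only upper bounds (closed intervals are used here, whereas the definition uses open ones), so the code illustrates the switch from divergence to decay around $d = \log 2/\log 3$ rather than computing the Hausdorff measure itself.

```python
import math


def cantor_intervals(n):
    """The 2**n intervals of length 3**-n left after n steps of the triadic Cantor construction."""
    intervals = [(0.0, 1.0)]
    for _ in range(n):
        refined = []
        for left, right in intervals:
            third = (right - left) / 3.0
            refined.append((left, left + third))      # keep the left third
            refined.append((right - third, right))    # keep the right third
        intervals = refined
    return intervals


def covering_sum(intervals, d):
    """Sum of (length)**d over the intervals of a covering."""
    return sum((right - left) ** d for left, right in intervals)


d_critical = math.log(2.0) / math.log(3.0)
for n in (2, 6, 10):
    cover = cantor_intervals(n)
    print(n,
          covering_sum(cover, 0.5),         # d below the critical value: grows with n
          covering_sum(cover, d_critical),  # critical value: stays close to 1
          covering_sum(cover, 0.8))         # d above the critical value: decays with n
```

For this particular family of coverings the sum equals $2^n 3^{-nd}$, which explains the three regimes printed by the loop.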

DEFINITION 3.4.– Let $f$ be a function and $H \geq 0$. If $H$ is a value taken by the function $h_f(x)$, let us denote by $E_H$ the set of points $x$ where $h_f(x) = H$. The singularity spectrum (or Hölder spectrum) of the signal being studied is:

$$f_H(H) = \dim(E_H)$$

(we use the convention $f_H(H) = -\infty$ if $H$ is not a Hölder exponent of $f$).


The concept of multifractal function is not precisely defined (just like the concept of a fractal set). For us, a multifractal function is a function whose spectrum of singularities is “non-trivial”, i.e. not reduced to a point. In the examples, $f_H(H)$ takes positive values on an entire interval $[H_{\min}, H_{\max}]$. Its determination thus requires the study of an infinity of fractal sets $E_H$, hence the term “multifractal”.

3.2.2. Wavelets and pointwise regularity

For many reasons that we shall gradually discover, wavelet methods are a tool of choice for studying multifractal functions. The first reason is that there is a simple criterion which characterizes the value of the Hölder exponent by a decay condition on the wavelet coefficients of the function. Let us begin by recalling certain points related to wavelet analysis. An orthonormal basis of wavelets of $L^2(\mathbb{R})$ has a particularly simple algorithmic form: we start from a function $\psi$ (the “mother” wavelet) that is regular and well-localized; the technical assumptions are:

$$\forall i = 0, \ldots, N, \quad \forall m \in \mathbb{N}, \qquad |\psi^{(i)}(x)| \leq \frac{C(i, m)}{(1 + |x|)^m}$$

for a sufficiently large $N$. Such functions $\psi$ can be chosen so that, moreover, the translates and dilates of $\psi$:

$$\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k), \qquad j, k \in \mathbb{Z}$$

form an orthonormal basis of $L^2(\mathbb{R})$ (see [MEY 90]) (we will choose $N$ larger than the maximum regularity that we expect to find in the analyzed signal; we can also take $N = +\infty$, in which case the wavelet $\psi$ belongs to the Schwartz class). It can be verified that the wavelet then has a corresponding number of vanishing moments:

$$\int \psi(x)\,x^i\,dx = 0, \qquad \forall i = 0, \ldots, N$$

Thus, every function $f \in L^2(\mathbb{R})$ can be written as:

$$f(x) = \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} d_f(k, j)\,\psi(2^j x - k)$$

where $d_f(k, j)$ are the wavelet coefficients of $f$:

$$d_f(k, j) = 2^j \int f(t)\,\psi(2^j t - k)\,dt$$


We should note that we do not choose an $L^2$ normalization for the wavelet coefficients, but an $L^1$ normalization, which is better adapted to the study of Hölder regularity.

Let us first study the wavelet characterization of uniform regularity, beginning with its definition. A function $f$ belongs to $C^\alpha(\mathbb{R})$ if condition (3.1) holds for all $x_0$, with a constant $C$ that can be chosen uniformly, i.e. independently of $x_0$. If $\alpha < 1$, since then $P(x - x_0) = f(x_0)$, this condition can be rewritten as:

\[ f \in C^\alpha(\mathbb{R}) \iff \forall x, y \in \mathbb{R}, \quad |f(x) - f(y)| \le C |x - y|^\alpha \]

This uniform regularity is characterized by a uniform decay condition on the wavelet coefficients of $f$ (see [MEY 90]).

PROPOSITION 3.1.– If $\alpha \in \,]0, 1[$, we have the following characterization:

\[ f \in C^\alpha(\mathbb{R}) \iff \exists C > 0 : \ \forall j, k, \quad |d_f(k, j)| \le C\, 2^{-\alpha j} \]
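The decay stated in Proposition 3.1 can be observed numerically. The sketch below is only an illustration under stated assumptions: it uses the Haar wavelet (which is not regular, so it illustrates only the direct implication, which needs just the zero integral and the localization of $\psi$), the test function $f(x) = |x - 1/3|^{1/2}$ on $[0, 1]$, and the $L^1$ normalization of the text. The names `primitive`, `haar_coefficient`, `max_coefficient` and `estimated_exponent` are hypothetical helpers, not part of the original.

```python
import math

ALPHA = 0.5      # Hölder exponent of the test function (chosen for the example)
X0 = 1.0 / 3.0   # location of the cusp (chosen for the example)

def primitive(x):
    """Exact primitive of the test function f(x) = |x - X0|**ALPHA."""
    u = x - X0
    return math.copysign(abs(u) ** (ALPHA + 1.0), u) / (ALPHA + 1.0)

def haar_coefficient(j, k):
    """d_f(k, j) = 2^j * integral of f(t) psi(2^j t - k) dt for the Haar wavelet
    (psi = +1 on [0, 1/2), -1 on [1/2, 1)), i.e. with the L^1 normalization."""
    a = k * 2.0 ** (-j)
    m = a + 2.0 ** (-j - 1)
    b = a + 2.0 ** (-j)
    return 2.0 ** j * ((primitive(m) - primitive(a)) - (primitive(b) - primitive(m)))

def max_coefficient(j):
    """Largest |d_f(k, j)| over the dyadic intervals of [0, 1] at scale j."""
    return max(abs(haar_coefficient(j, k)) for k in range(2 ** j))

def estimated_exponent(j_lo=4, j_hi=12):
    """Average slope of -log2(max_k |d_f(k, j)|) between two scales."""
    lo = math.log2(max_coefficient(j_lo))
    hi = math.log2(max_coefficient(j_hi))
    return (lo - hi) / (j_hi - j_lo)
```

Calling `estimated_exponent()` should return a value close to the exponent `ALPHA` of the cusp, in line with the bound $|d_f(k, j)| \le C\, 2^{-\alpha j}$.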

Proof. Let us assume that $f \in C^\alpha(\mathbb{R})$; then, for all $j, k$, we have:

\[ d_f(k, j) = 2^j \int f(x)\, \psi(2^j x - k) \, dx = 2^j \int \bigl( f(x) - f(k 2^{-j}) \bigr)\, \psi(2^j x - k) \, dx \]

(because the wavelets have zero integral); thus:

\[ |d_f(k, j)| \le C\, 2^j \int \frac{|x - k 2^{-j}|^\alpha}{(1 + |2^j x - k|)^2} \, dx \le C\, 2^{-\alpha j} \]

(by the change of variable $t = 2^j x - k$).

Let us now prove the converse. Let us assume that $|d_f(k, j)| \le C\, 2^{-\alpha j}$. Let $j_0$ be defined by $2^{-j_0 - 1} \le |x - x_0| < 2^{-j_0}$ and let:

\[ f_j(x) = \sum_k d_f(k, j)\, \psi(2^j x - k) \]

From the localization assumption on $\psi$, we deduce that:

\[ |f_j(x)| \le C \sum_k \frac{2^{-\alpha j}}{(1 + |2^j x - k|)^2} \le C\, 2^{-\alpha j} \]


and similarly, using the localization of $\psi'$, we deduce that $|f_j'(x)| \le C\, 2^{(1-\alpha) j}$. We obtain:

\[ |f(x) - f(x_0)| \le \sum_{j \le j_0} |f_j(x) - f_j(x_0)| + \sum_{j > j_0} |f_j(x)| + \sum_{j > j_0} |f_j(x_0)| \]

By the mean value theorem, the first term is bounded by:

\[ \sum_{j \le j_0} |x - x_0| \sup_{[x, x_0]} |f_j'(t)| \le C |x - x_0| \sum_{j \le j_0} 2^{(1-\alpha) j} \le C |x - x_0|\, 2^{(1-\alpha) j_0} \]

(because we have $\alpha < 1$). Coming back to the definition of $j_0$, we see that the first term is bounded by $C |x - x_0|^\alpha$. The second and the third terms are bounded by:

\[ C \sum_{j > j_0} 2^{-\alpha j} \le C\, 2^{-\alpha j_0} \le C |x - x_0|^\alpha \]

thus, the converse estimate holds. The reader will easily be able to extend this result to the case where $\alpha > 1$ and $\alpha \notin \mathbb{N}$; see [MEY 90] for the case $\alpha \in \mathbb{N}$.

If a function $f$ belongs to one of the spaces $C^\alpha$ for some $\alpha > 0$, we will say that $f$ is uniformly Hölderian. The following theorem is similar to Proposition 3.1, but gives a result of pointwise regularity.

THEOREM 3.1.– Let $\alpha \in \,]0, 1[$. If $f$ is $C^\alpha(x_0)$, then we have:

\[ |d_f(k, j)| \le C\, 2^{-\alpha j} \bigl( 1 + |2^j x_0 - k|^\alpha \bigr) \tag{3.2} \]

Conversely, if the wavelet coefficients of $f$ satisfy (3.2) and if $f$ is uniformly Hölderian, then, for $|x - x_0| \le 1$, we obtain:

\[ |f(x) - f(x_0)| \le C |x - x_0|^\alpha \log \frac{2}{|x - x_0|} \tag{3.3} \]

($f$ is “nearly” $C^\alpha(x_0)$, up to a small logarithmic correction).

Proof. Let us assume that $f$ is $C^\alpha(x_0)$; then we have:

\[ d_f(k, j) = 2^j \int f(x)\, \psi(2^j x - k) \, dx = 2^j \int \bigl( f(x) - f(x_0) \bigr)\, \psi(2^j x - k) \, dx \]

(because the wavelets have zero integral). Thus, $|d_f(k, j)|$ is bounded by:

\[ 2^j \int \frac{C |x - x_0|^\alpha}{(1 + |2^j x - k|)^2} \, dx \le 2 C\, 2^j \int \frac{|x - k 2^{-j}|^\alpha + |k 2^{-j} - x_0|^\alpha}{(1 + |2^j x - k|)^2} \, dx \]


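(because, for $a, b > 0$, we have $(a + b)^\alpha \le 2 a^\alpha + 2 b^\alpha$). By once again using the change of variable $t = 2^j x - k$, we obtain $|d_f(k, j)| \le C\, 2^{-\alpha j} (1 + |2^j x_0 - k|^\alpha)$.

Let us now prove the converse. Assuming that there exists an $\varepsilon > 0$ such that $f \in C^\varepsilon(\mathbb{R})$, let $j_0$ and $j_1$ be defined by:

\[ 2^{-j_0 - 1} \le |x - x_0| < 2^{-j_0} \qquad \text{and} \qquad j_1 = \frac{\alpha}{\varepsilon}\, j_0 \]

From (3.2), we deduce that, for all $x$, we have:

\[ |d_f(k, j)| \le C \bigl( 2^{-\alpha j} + |x_0 - k 2^{-j}|^\alpha \bigr) \le 2 C \bigl( 2^{-\alpha j} + |x - x_0|^\alpha + |x - k 2^{-j}|^\alpha \bigr) \]

and thus:

\[ |f_j(x)| \le C \sum_k \frac{2^{-\alpha j} + |x - x_0|^\alpha + 2^{-\alpha j} |2^j x - k|^\alpha}{(1 + |2^j x - k|)^2} \]

We have:

\[ \sum_k \frac{1}{(1 + |2^j x - k|)^2} \le C \]

and similarly:

\[ \sum_k \frac{1}{(1 + |2^j x - k|)^{2 - \alpha}} \le C \]

because we have $\alpha < 1$. We get:

\[ |f_j(x)| \le C \bigl( 2^{-\alpha j} + |x - x_0|^\alpha \bigr) = C\, 2^{-\alpha j} \bigl( 1 + 2^{\alpha j} |x - x_0|^\alpha \bigr) \tag{3.4} \]

Using the localization of $\psi'$, we obtain, in the same manner:

\[ |f_j'(x)| \le C\, 2^{(1-\alpha) j} \bigl( 1 + 2^{\alpha j} |x - x_0|^\alpha \bigr) \]

in particular, if $j \le j_0$, we obtain $|f_j'(x)| \le C\, 2^{(1-\alpha) j}$. We can write:

\[ |f(x) - f(x_0)| \le \sum_{j \le j_0} |f_j(x) - f_j(x_0)| + \sum_{j > j_0} |f_j(x)| + \sum_{j > j_0} |f_j(x_0)| \]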


The first term is bounded by:

\[ C |x - x_0| \sum_{j \le j_0} \sup_{[x, x_0]} |f_j'(t)| \le C |x - x_0| \sum_{j \le j_0} 2^{(1-\alpha) j} \le C |x - x_0|\, 2^{(1-\alpha) j_0} \le C |x - x_0|^\alpha \]

As far as the second term is concerned, if $j > j_0$, bound (3.4) becomes $|f_j(x)| \le C |x - x_0|^\alpha$ and we thus have:

\[ \sum_{j_0 < j < j_1} |f_j(x)| \le C \sum_{j_0 < j < j_1} |x - x_0|^\alpha \le C (j_1 - j_0) |x - x_0|^\alpha \le C |x - x_0|^\alpha \log \frac{2}{|x - x_0|} \]

because we have $j_1 - j_0 \le (\alpha / \varepsilon)\, j_0$. Moreover, since $f$ belongs to $C^\varepsilon(\mathbb{R})$, we have:

\[ \sum_{j \ge j_1} |f_j(x)| \le C \sum_{j \ge j_1} 2^{-\varepsilon j} \le C\, 2^{-\varepsilon j_1} \le C |x - x_0|^\alpha \]

(because of the choice of $j_1$). As far as the third term is concerned, bound (3.4) at $x = x_0$ becomes $|f_j(x_0)| \le C\, 2^{-\alpha j}$ and thus:

\[ \sum_{j > j_0} |f_j(x_0)| \le C \sum_{j > j_0} 2^{-\alpha j} \le C |x - x_0|^\alpha \]

which proves the converse part of Theorem 3.1.

Here again, the reader will easily be able to extend this result to exponents $\alpha > 1$ (see [JAF 91]). As far as the second part of the theorem is concerned, we sometimes use the slight variation below.

PROPOSITION 3.2.– Let $\alpha \in \,]0, 1[$. If the wavelet coefficients of $f$ satisfy:

\[ |d_f(k, j)| \le C\, 2^{-\alpha j} \bigl( 1 + |2^j x_0 - k|^{\alpha'} \bigr) \tag{3.5} \]

for some $\alpha' < \alpha$, then $f$ is $C^\alpha(x_0)$.


The proof of this proposition is very similar to that of the second part of the theorem, so we only outline it. This time we show that (3.5) implies:

\[ |f_j(x)| \le C\, 2^{-\alpha j} \bigl( 1 + 2^{\alpha' j} |x - x_0|^{\alpha'} \bigr) \]

and:

\[ |f_j'(x)| \le C\, 2^{(1-\alpha) j} \bigl( 1 + 2^{\alpha' j} |x - x_0|^{\alpha'} \bigr) \]

and we conclude by using the estimate on $f_j'$ for $j \le j_0$ and the estimate on $f_j$ for $j \ge j_0$. This proposition also extends to the case where $\alpha > 1$ (see [JAF 91]).

NOTE 3.2.– We could get the impression that Proposition 3.2, contrary to Theorem 3.1, makes no assumption of global regularity. This is not the case because, if (3.5) holds and if $|k 2^{-j} - x_0| \le 1$, then we can deduce that $|d_f(k, j)| \le C\, 2^{-(\alpha - \alpha') j}$, i.e. that $f \in C^{\alpha - \alpha'}$ uniformly close to $x_0$. We could nevertheless refer to [JAF 00c], where pointwise Hölder estimates are obtained from wavelet coefficients in the absence of any assumption of uniform regularity.

From the theorem, we can immediately deduce the following corollary, which characterizes the Hölder exponent by the local decay of the wavelet coefficients.

COROLLARY 3.1.– If $f$ is uniformly Hölder, the Hölder exponent of $f$ at every point $x_0$ is given by:

\[ h_f(x_0) = \liminf_{j \to \infty} \, \inf_k \, \frac{\log |d_f(k, j)|}{\log \bigl( 2^{-j} + |k 2^{-j} - x_0| \bigr)} \tag{3.6} \]

This corollary is used in all mathematical results where the singularity spectrum of a function $f$ is derived from its wavelet coefficients. On the other hand, it can be used to obtain the numerical value of $h_f(x)$ only if the function $h_f$ is constant, or if it varies slowly relative to the discretization scale of the signal.

3.2.3. Local oscillations

Using the Hölder exponent as a measure of pointwise regularity makes it a powerful signal and image analysis tool (see [DAO 95, LEV 95]). However, characterizing pointwise regularity by the Hölder exponent alone has several disadvantages:

– inability to measure the oscillatory character of the local behavior of $f$ close to $x_0$;

– lack of stability under “traditional” operators, such as differential operators, pseudo-differential operators or the Hilbert transform $\mathcal{H}$ (the convolution operator with $1/x$ in dimension one, which maps a real signal to the associated analytic signal in signal analysis). Thus, we can construct, for instance, locally bounded functions $f$ with $f \in C^\alpha(x_0)$ and $\mathcal{H} f \notin C^\beta(x_0)$ for any $\beta > 0$ (see [JAF 91]).

The 2-microlocal spaces $C^{s,s'}_{x_0}$ provide a substitute for the notion of pointwise regularity which does not have these disadvantages; moreover, they are defined even if $f$ is not locally bounded. We have already seen this condition in (3.5).

DEFINITION 3.5.– A distribution $f$ belongs to the 2-microlocal space $C^{s,s'}_{x_0}$ if its wavelet coefficients satisfy, for sufficiently small $|x_0 - k 2^{-j}|$:

\[ \exists C > 0 : \quad |d_f(k, j)| \le C\, 2^{-j(s + s')} \bigl( 2^{-j} + |k 2^{-j} - x_0| \bigr)^{-s'} \tag{3.7} \]

This condition does not depend on the (sufficiently regular) wavelet basis chosen (see [JAF 91]).

DEFINITION 3.6.– The 2-microlocal domain of $f$ at $x_0$, denoted by $E(f(x_0))$, is the set of couples $(s, s')$ such that $f \in C^{s,s'}(x_0)$.

By interpolation between conditions (3.7), we find that the 2-microlocal domain is a convex set and, moreover, by using the trivial lower bound $2^{-j} + |k 2^{-j} - x_0| \ge 2^{-j}$, we can see that its boundary is a curve whose slope is everywhere larger than $-1$. Conversely, we can check that these conditions characterize the boundaries of the 2-microlocal domains at one point $x_0$ (see [GUI 98, MEY 98]) (a more difficult and unresolved issue is to determine the compatibility conditions between the different 2-microlocal domains $E(f(x))$ of a function when the point $x$ varies). The 2-microlocal domain provides very accurate information on the behavior of $f$ close to $x_0$; specifically, we can derive from it the Hölder exponents of the primitives or of the fractional derivatives of $f$ (see [ARN 97]).

However, this complete information is superfluous in practice. In fact, we need to retain only a few parameters, not a convex function at each point (the boundary of the 2-microlocal domain): at least the Hölder exponent, and often a second parameter $\beta$, which measures the oscillating character of $f$ close to $x_0$. Indeed, the same Hölder exponent can, at a given point, correspond to very different behaviors, such as “cusps” $|x - x_0|^H$ or, on the contrary, highly oscillating functions, such as “chirps”:

\[ g_{H,\beta}(x) = |x - x_0|^H \sin \left( \frac{1}{|x - x_0|^\beta} \right) \tag{3.8} \]

for $\beta > 0$. In signal processing, the notion of chirp models functions whose instantaneous frequency increases rapidly at a given time (see [CHA 99]). The exponent $\beta$ measures the speed at which the instantaneous frequency of (3.8) diverges at $x_0$.
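The role of $\beta$ in the chirp (3.8) can be made concrete with a small sketch. It assumes the usual convention that the instantaneous frequency is the derivative of the phase divided by $2\pi$ (with phase $|x - x_0|^{-\beta}$ here); the function names `chirp`, `instantaneous_frequency` and `zero_count` are hypothetical and chosen for this illustration only.

```python
import math

def chirp(x, x0=0.0, H=0.5, beta=1.0):
    """g_{H,beta}(x) = |x - x0|^H * sin(1 / |x - x0|^beta), with g(x0) = 0."""
    r = abs(x - x0)
    if r == 0.0:
        return 0.0
    return r ** H * math.sin(r ** (-beta))

def instantaneous_frequency(x, x0=0.0, beta=1.0):
    """|derivative of the phase| / (2 pi), the phase being |x - x0|^(-beta)."""
    r = abs(x - x0)
    return beta * r ** (-beta - 1.0) / (2.0 * math.pi)

def zero_count(r, beta=1.0):
    """Number of zeros of the chirp with r/2 <= x - x0 <= r (one side of x0):
    zeros sit where |x - x0|^(-beta) is a multiple of pi."""
    lo, hi = r ** (-beta), (r / 2.0) ** (-beta)
    return math.floor(hi / math.pi) - math.ceil(lo / math.pi) + 1
```

Halving the distance to $x_0$ multiplies the instantaneous frequency by $2^{\beta + 1}$ and the number of oscillations in a dyadic shell by roughly $2^\beta$, which is one way to read the statement that $\beta$ measures how fast the frequency diverges.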


An additional motivation for the analysis of chirps is that this type of behavior causes the first versions of the multifractal formalism to fail, as we will see in section 3.4. We presently have two possible mathematical definitions of the exponent $\beta$ in a general framework. Of course, both give the same value for the functions (3.8). We will define them and discuss their respective advantages with regard to signal analysis.

Let $f$ be locally bounded and let $f^{(-n)}$ denote an $n$-times iterated primitive of $f$. As shown by a sequence of integrations by parts, a consequence of the oscillations of (3.8) close to $x_0$ is that $g_{H,\beta}^{(-n)}$ belongs to $C^{H + n(\beta + 1)}(x_0)$ (the Hölder exponent at $x_0$ increases not by 1 at each integration, as expected for an arbitrary function, but by $\beta + 1$). This observation has led to the following definition, given by Meyer [JAF 96b].

DEFINITION 3.7.– Let $H \ge 0$ and $\beta > 0$. A function $f \in L^\infty(\mathbb{R})$ is a chirp of type $(H, \beta)$ at $x_0$ if, for any $n \ge 0$, $f$ can be written as $f = g_n^{(n)}$, with $g_n \in C^{H + n(1 + \beta)}(x_0)$.

The chirp type can be derived from the 2-microlocal domain at $x_0$. In fact, if we have $f \in C^H(x_0)$, we also have $f \in C^{H, -H}(x_0)$, etc. and, in general, the condition $f = g_n^{(n)}$ with $g_n \in C^{H + n(1 + \beta)}(x_0)$ implies that $f \in C^{H + n\beta,\, -H - n(\beta + 1)}(x_0)$. The other characterization of chirps, given below (see [JAF 96b]), shows that their definition appropriately reflects the oscillatory phenomenon present in the functions $g_{H,\beta}$.

PROPOSITION 3.3.– A function $f \in L^\infty$ is a chirp of type $(H, \beta)$ at $x_0$ if and only if there exist a function $r(x)$, $C^\infty$ close to $x_0$, and $\varepsilon > 0$ such that, if $0 < x - x_0 < \varepsilon$, then:

\[ f(x) = r(x - x_0) + (x - x_0)^H \, g_+\bigl( (x - x_0)^{-\beta} \bigr) \]

and if $-\varepsilon < x - x_0 < 0$, then:

\[ f(x) = r(x - x_0) + |x - x_0|^H \, g_-\bigl( |x - x_0|^{-\beta} \bigr) \]

where the functions $g_+$ and $g_-$ are indefinitely oscillating, i.e. they have bounded primitives of any order.

It is very easy to check that the interior of the set of points $(H, \beta)$ such that $f$ is a chirp of type $(H, \beta)$ at $x_0$ is always of the form $H < h_f(x_0)$, $\beta < \beta_f(x_0)$ [JAF 00a]. The positive number $\beta_f(x_0)$ is called the chirp exponent at $x_0$.


A highly oscillating local behavior such as (3.8) is remarkable, and it was long believed that it could be observed only at isolated points. This is why Meyer's result, proving that the Riemann series $\sum n^{-2} \sin(\pi n^2 x)$ has a dense set of chirps of type $(\tfrac{3}{2}, 1)$, was unexpected (see [JAF 96b, JAF 01a]). Since then, we know how to construct functions with chirps almost everywhere (see [JAF 00a]).

Definition 3.7 is not well suited to signal analysis. In fact, we will see that it is not stable under the addition of an arbitrary regular (but not $C^\infty$) function. We will therefore introduce another definition of a local exponent measuring the oscillation, which does not have this disadvantage. Let us consider the following example. Let $B(x)$ be a Brownian motion and:

\[ C(x) = x^{1/3} \sin \left( \frac{1}{x} \right) + B(x) \tag{3.9} \]

The Hölder exponent of $B(x)$ being $\tfrac{1}{2}$ everywhere, the strongest singularity at 0 is the chirp $x^{1/3} \sin(\tfrac{1}{x})$, and we effectively observe this oscillating behavior in the graph of $C(x)$ zoomed around 0. However, after integration, the random term dominates (in fact, an integration by parts shows that the first term is $O(|x|^{7/3})$, whereas a primitive of Brownian motion has Hölder exponent $\tfrac{3}{2}$ everywhere); the oscillating behavior then disappears in the primitive. The oscillations of the graph of $C(x)$, which do not exist in the primitive, are not taken into account in Definition 3.7: we see that $C(x)$ is not a chirp at 0 (it is actually a chirp of exponents $(\tfrac{1}{3}, 0)$). Nevertheless, this strongly oscillating behavior should be reflected in the chirp exponent: $C(x)$ “should” have exponents $(\tfrac{1}{3}, 1)$.

Let us now show how to define an oscillating exponent which takes the value $\beta$ for (3.8) and is not changed by adding a “regular noise” (it will take the value 1 for $C(x)$ and no longer 0). It is clear, from the previous example, that the oscillating exponent should not be determined by considering primitives of $f$.

An “infinitesimal” fractional integration should be used so that the order of importance of the terms in (3.9) is not disturbed. Let $h^t(x_0)$ be the Hölder exponent of a fractional integral of order $t$ of the function $f$ at $x_0$. To be more precise, if $f$ is a locally bounded function, let $h^t(x_0)$ denote the Hölder exponent at $x_0$ of:

\[ I^t(f) = (\mathrm{Id} - \Delta)^{-t/2} (\varphi f) \tag{3.10} \]

where $\varphi$ is a $C^\infty$ function with compact support such that $\varphi(x_0) = 1$ and $(\mathrm{Id} - \Delta)^{-t/2}$ is the convolution operator which, on the Fourier side, is simply multiplication by $(1 + |\xi|^2)^{-t/2}$ (see Chapter 7 for more details). In the example of the function $C(x)$, with $x_0 = 0$, we find $h^t(x_0) = \tfrac{1}{3} + 2t$ after a fractional integration of sufficiently small order $t$, i.e. as long as $\tfrac{1}{3} + 2t < \tfrac{1}{2} + t$. In this example, the increase of the Hölder exponent at $x_0$ after a fractional integration of small order $t$ is $2t$; we can therefore recover the oscillating exponent $\beta$ in this way. In general, the function $t \mapsto h^t(x_0)$ is concave, so that its right derivative exists at 0 (with the possible value $+\infty$). The following definition is derived from it.

DEFINITION 3.8.– Let $f : \mathbb{R}^d \to \mathbb{R}$ be a locally bounded function. The oscillating exponent of $f$ at $x_0$ is:

\[ \beta = \left. \frac{\partial}{\partial t} h^t(x_0) \right|_{t = 0} - 1 \tag{3.11} \]

This exponent belongs to $[0, +\infty]$. Note that if $h^t(x_0) = +\infty$, then $\beta$ is not defined.

The following proposition, taken from [AUB 99], shows that this definition appropriately reflects the oscillating phenomenon present in (3.8).

PROPOSITION 3.4.– If $f$ is uniformly Hölderian, then for all $H < h(x_0)$ and all $\beta < \beta(x_0)$, $f$ can be written as:

\[ f(x) = |x - x_0|^H \, g \left( \frac{1}{|x - x_0|^\beta} \right) + r(x) \]

with $r \in C^\alpha(x_0)$ for some $\alpha > h(x_0)$, and where $g$ is indefinitely oscillating.

3.2.4. Complements

We discuss here the known results on the construction of functions having a prescribed Hölder exponent (and possibly a prescribed oscillating exponent). This problem was first encountered in the context of speech simulation. A speech signal has a Hölder exponent which varies drastically (particularly in the case of consonants), and this led to the idea of storing such a signal efficiently by keeping only the information contained in the Hölder exponent. The following theorem characterizes the functions which are Hölder exponents (see [AYA 08]).

THEOREM 3.2.– A positive function $h(x)$ is the Hölder exponent of a bounded function $f$ if and only if $h$ can be written as the lim inf of a sequence of continuous functions.

When $h(x)$ has a minimal Hölder regularity, a natural construction is provided by the multifractional Brownian motion (see [BEN 97, PEL 96]).

Contrary to the previous result, a couple of functions $(h(x), \beta(x))$ must satisfy very specific conditions to be a couple (Hölder exponent, chirp exponent): $\beta(x)$ must vanish on a dense set [GUI 98].
We also have a constructive result: if this condition is satisfied, we can prescribe this couple almost everywhere (see [JAF 05]); unfortunately, we do not know how to characterize the couples of functions which are of the form (Hölder exponent, chirp exponent).


3.3. Random multifractal processes

We present two examples of random multifractal processes. The first is that of Lévy processes. Their importance stems, on the one hand, from the central place these processes hold in probability theory and, on the other hand, from their growing importance in physics and financial modeling, particularly in situations where Gaussian models are inadequate (see, for instance, [MAN 97, SCH 95] and Chapter 5 and Chapter 6). Our second example involves random wavelet series, that is, processes whose wavelet coefficients are independent and, at a given scale, have the same law (these laws can be chosen arbitrarily at each scale). In addition to the intrinsic interest of this model, it makes it possible to introduce new concepts and it enriches the possible variants of the multifractal formalism.

3.3.1. Lévy processes

A Lévy process $X_t$ ($t \ge 0$) with values in $\mathbb{R}$ can be defined as a process with stationary independent increments: $X_{t+s} - X_t$ is independent of $(X_v)_{0 \le v \le t}$ and has the same law as $X_s$. The characteristic function of a Lévy process is written as $E(e^{i\lambda X_t}) = e^{-t\varphi(\lambda)}$, where:

\[ \varphi(\lambda) = i a \lambda + C \lambda^2 + \int \bigl( 1 - e^{i\lambda x} + i \lambda x\, 1_{|x| < 1} \bigr)\, \pi(dx) \tag{3.12} \]

where $\pi(dx)$ is the Lévy measure of $X_t$, i.e. a positive Radon measure on $\mathbb{R} - \{0\}$ satisfying:

\[ \int \inf(1, |x|^2)\, \pi(dx) < \infty \tag{3.13} \]

Let:

\[ C_j = \int_{2^{-j} \le |x| \le 2 \cdot 2^{-j}} \pi(dx) \]

We can measure the size of $\pi$ close to the origin with the help of the “inferior” exponent of Blumenthal and Getoor, defined by:

\[ \alpha = \inf \left\{ \gamma \ge 0 : \int_{|x| \le 1} |x|^\gamma \, \pi(dx) < \infty \right\} = \sup \left( 0, \ \limsup_{j \to \infty} \frac{\log C_j}{j \log 2} \right) \]

This exponent satisfies $0 \le \alpha \le 2$ and, if $X_t$ is a stable Lévy process, coincides with the stability index.
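As a quick check that the Blumenthal and Getoor exponent coincides with the stability index, the sketch below evaluates $\log C_j / (j \log 2)$ for the model measure $\pi(dx) = |x|^{-1-\alpha} dx$. This choice of measure, and the closed form of $C_j$ obtained by direct integration, are assumptions made for the illustration; the function names are hypothetical.

```python
import math

def log2_shell_mass(j, alpha):
    """log2 of C_j = pi({2^-j <= |x| <= 2 * 2^-j}) for pi(dx) = |x|^(-1-alpha) dx.
    Direct integration gives C_j = (2 / alpha) * (1 - 2^-alpha) * 2^(alpha * j)."""
    return math.log2(2.0 / alpha * (1.0 - 2.0 ** (-alpha))) + alpha * j

def blumenthal_getoor_estimate(alpha, j):
    """Finite-scale version of sup(0, limsup log C_j / (j log 2))."""
    return max(0.0, log2_shell_mass(j, alpha) / j)
```

For instance, `blumenthal_getoor_estimate(1.5, 200)` is within about $10^{-3}$ of 1.5, and the gap shrinks like $1/j$.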


In general, a Lévy process has a dense set of discontinuities (it jumps everywhere!) and can be written as a superposition of particularly simple processes: compensated compound Poisson processes, whose sample paths are piecewise linear with jumps. The sample paths of a compound Poisson process with Lévy measure $\pi(dx)$ (which is finite) are generated in the following manner: $X(t)$ remains at zero for $0 \le t < t_1$; $t_1$ is the time of the first jump, whose law is exponential with intensity $C = \int \pi(dx)$ (i.e., the law of the first jump time has density $C e^{-Ct}$). At $t_1$, the process jumps; the amplitude of the jump is a random variable independent of the value taken by $t_1$, and its probability law is $\pi(dx)/C$. Then, we start again: if $t_2$ is the second jump time, $t_2 - t_1$ is independent of $t_1$ and of the jump at $t_1$; $t_2 - t_1$ has the same law as $t_1$ and, finally, the jump at $t_2$ has the same law as the jump at $t_1$ and is independent of the previous choices, etc. Thus, we generate a process $X(t)$ with independent stationary increments which is piecewise constant; this is a compound Poisson process. If $D = \int x \, \pi(dx)$, the expectation of $Y(t) = X(t) - Dt$ is zero for all $t$; this is the compensated compound Poisson process with measure $\pi(dx)$.

If a Lévy measure $\pi(dx)$ satisfies (3.13), then each measure $\pi_0(dx) = 1_{|x| < 1}\, \pi(dx)$ and $\pi_j(dx) = 1_{2^{-j} \le |x| \le 2 \cdot 2^{-j}}\, \pi(dx)$ is bounded and is the Lévy measure of a compensated compound Poisson process $X_j(t)$. The Lévy process (without Brownian component) associated with $\pi$ is then:

\[ X(t) = \sum_{j=0}^{+\infty} X_j(t) \]
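The construction of a compound Poisson process described above translates almost line by line into a simulation. The following is a minimal sketch, assuming a finite Lévy measure of total mass 3 whose normalized jump law is uniform on $[1, 2]$ (an arbitrary choice for the example); all function names are hypothetical.

```python
import random

def compound_poisson_jumps(intensity, jump_sampler, t_max, rng):
    """Jump times and amplitudes on [0, t_max]: independent exponential waiting
    times of rate `intensity` (C in the text), amplitudes with law pi(dx) / C."""
    jumps, t = [], rng.expovariate(intensity)
    while t <= t_max:
        jumps.append((t, jump_sampler(rng)))
        t += rng.expovariate(intensity)
    return jumps

def path_value(jumps, t):
    """X(t): sum of the amplitudes of the jumps that occurred up to time t."""
    return sum(amp for (s, amp) in jumps if s <= t)

def compensated_value(jumps, t, drift):
    """Y(t) = X(t) - D t, where D = integral of x pi(dx) = intensity * mean jump."""
    return path_value(jumps, t) - drift * t

def mean_compensated(n_paths=2000, t=1.0, seed=0):
    """Monte Carlo average of Y(t); it should be close to zero."""
    rng = random.Random(seed)
    intensity = 3.0          # total mass C of the (assumed) Lévy measure
    drift = intensity * 1.5  # D, since the mean of a uniform jump on [1, 2] is 1.5
    total = 0.0
    for _ in range(n_paths):
        jumps = compound_poisson_jumps(
            intensity, lambda r: r.uniform(1.0, 2.0), t, rng)
        total += compensated_value(jumps, t, drift)
    return total / n_paths
```

A superposition of such compensated processes over the dyadic shells of jump sizes would approximate the series defining $X(t)$; that step is not implemented here.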

If $X(t)$ has a Brownian component, we add a Brownian motion starting from 0 to the process that we have just constructed. The following theorem confirms that the sample paths of Lévy processes are multifractal.

THEOREM 3.3.– Let $X_t$ be a Lévy process without Brownian component ($C = 0$ in (3.12)) which satisfies:

\[ \alpha > 0 \qquad \text{and} \qquad \sum_j 2^{-j} \sqrt{C_j \log(1 + C_j)} < \infty \tag{3.14} \]

With probability 1, the singularity spectrum of $X_t$ is:

\[ d_\alpha(H) = \begin{cases} \alpha H & \text{if } H \in [0, 1/\alpha] \\ -\infty & \text{otherwise} \end{cases} \]

NOTE 3.3.– We notice that:

– conditions (3.14) hold as soon as $0 < \alpha < 2$, and in particular for all stable Lévy processes;


– Theorem 3.3 may seem to contradict intuition: we normally think of multifractal functions as functions whose “behavior” can vary greatly from point to point, which seems to contradict the stationarity of the increments of Lévy processes. The paradox disappears if we remember that this hypothesis concerns the law of a Lévy process, and thus allows one particular sample path to exhibit such “changes of behavior”. Thus, the sets $E_H$ are, for a given sample path, remarkable points, but the law of $E_H$ remains invariant under translation;

– the assertion of Theorem 3.3 is stronger than simply stating that $f_H(H)$ has a given value with probability 1 for each $H$, which would not be enough to determine the singularity spectrum of almost every sample path;

– the Hölder exponent of a Lévy process without Brownian component is $1/\alpha$ almost everywhere (see [PRU 81]), which, of course, agrees with Theorem 3.3 (case $H = 1/\alpha$);

– a lot of work has been devoted to the fractal nature of the set of values taken by a Lévy process (see [MAN 95] for “Lévy flights” and [BER 96] and its references).

Let us now outline the proof of the theorem. The Lévy process $X_t$ is written as the sum of a compound Poisson process with Lévy measure $1_{|x| > 1}\, \pi(dx)$ (we can “forget” this first process, whose addition does not modify the spectrum) and the series $\sum_{j \ge 0} X_t^j$, where the $X_t^j$ are independent compensated compound Poisson processes with Lévy measure $\pi_j(dx) = 1_{2^{-j} \le |x| < 2 \cdot 2^{-j}}\, \pi(dx)$ (the amplitude of the jumps of $X^j$ is therefore of the order of magnitude of $2^{-j}$).

The following remark enables us to bound the regularity of $X_t$. Let $f$ be a function having a dense set of discontinuities and having, at each discontinuity, a limit on the right and on the left. Let $r_n$ be a sequence of discontinuities of $f$ converging towards $x$. Then, the Hölder exponent of $f$ satisfies:

\[ h_f(x) \le \liminf_{n \to +\infty} \frac{\log |f(r_n^+) - f(r_n^-)|}{\log |r_n - x|} \tag{3.15} \]

The proof follows from the simple remark that one of the two numbers $|f(x) - f(r_n^+)|$ or $|f(x) - f(r_n^-)|$ must be at least $|f(r_n^+) - f(r_n^-)|/2$, by the triangle inequality. Lévy processes are of this type and thus we can apply this remark to them. Let $A_\delta^j$ denote the union of the intervals of diameter $2^{-\delta j}$ centered at the jumps of $X_t^j$, and let $E_\delta = \limsup A_\delta^j$; if $t \in E_\delta$, there exist a sequence $j_n$ of integers and times $t_{j_n}$ such that:

\[ |t_{j_n} - t| \le 2^{-\delta j_n} \qquad \text{and} \qquad \bigl| X_{t_{j_n}^+} - X_{t_{j_n}^-} \bigr| \ge 2^{-j_n} \]

By applying (3.15), we then obtain:

\[ h_X(t) \le \frac{1}{\delta} \tag{3.16} \]


We check that the accumulation of jumps close to $t$ is the only cause of irregularity in a Lévy process and that equality actually holds in (3.16) (see [JAF 99]). The proof of the theorem therefore consists of showing that the dimension of $E_\delta$ is $\alpha/\delta$ if $\alpha \le \delta$. The upper bound on the dimension of $E_\delta$ is immediate: the average number of jumps of $X_t^j$ on $[0, 1]$ is $C_j$ and thus, almost surely, all the $X_t^j$ have fewer than $2 C_j$ jumps from a certain rank on. Hence, $A_\delta^j$ is covered by $2 C_j$ intervals of diameter $2^{-\delta j}$, and the upper bound on the dimension follows. The lower bound is obtained using a traditional technique, by constructing particular measures which “load” the sets $E_\delta$ as uniformly as possible (see [JAF 99]). We also refer to [JAF 99] for the case where $X_t$ has a Brownian component.

3.3.2. Burgers’ equation and Brownian motion

Multifractal analysis was introduced in the context of turbulence; however, the mathematical equations governing the velocity of a turbulent flow (the Navier-Stokes equations in the limit where the viscosity tends towards 0) are at present very poorly understood, and a mathematical result regarding the multifractal nature of their solutions is not in sight. At most, we can expect precise results only for simpler non-linear evolution equations, hoping that they retain some physical characteristics of fluid evolution. The simplest equation proposed for this purpose is Burgers’ equation:

\[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} \left( \frac{u^2}{2} \right) = \nu \frac{\partial^2 u}{\partial x^2}, \qquad x \in \mathbb{R}, \ t \in \mathbb{R}_+ \tag{3.17} \]

which we shall consider in the limit of small viscosities $\nu \to 0$. One reason to study this equation is that explicit formulae provide the solution $u(x, t)$ at any time $t > 0$ in terms of the initial condition $u_0(x) = u(x, 0)$; in fact, if $U$ is a primitive of $u$, then $U$ satisfies:

\[ \frac{\partial U}{\partial t} + \frac{1}{2} \left( \frac{\partial U}{\partial x} \right)^2 = \nu \frac{\partial^2 U}{\partial x^2} \]

Then, we carry out the Cole-Hopf transformation, which consists of setting $\varphi = e^{-U/2\nu}$; $\varphi$ then satisfies the linear heat equation, which we solve explicitly, hence the expression of $U$. By passing to the limit $\nu \to 0$ in this expression, we obtain, using a standard technique (Laplace's method), the following result (see [EVA 98] for the details of the calculations).

Let us suppose, to simplify things, that the initial condition $u_0$ is zero on $]-\infty, 0[$ and that $u_0(s) + s \ge 0$ for $s$ large enough. To fix ideas, let us look at the solution at time $t = 1$. First, we consider, for each $x \ge 0$, the function of the variable $s \ge 0$:

\[ F_x(s) = \int_0^s \bigl( u_0(r) + r - x \bigr) \, dr \]

and we denote by $a(x, 1)$ the largest point where its minimum is attained:

\[ a(x, 1) = \max \{ s \ge 0 : F_x(s) \le F_x(s') \ \text{for all } s' \} \tag{3.18} \]

The limit solution when $\nu \to 0$ at time $t = 1$ is then given by:

\[ u(x, 1) = x - a(x, 1) \tag{3.19} \]

Formula (3.18) shows that the random process $a(x, 1)$ obtained when the initial condition is a Brownian motion on $\mathbb{R}_+$ (and zero on $\mathbb{R}_-$) is a subordinator. A subordinator is an increasing Lévy process $(\sigma_x, x \ge 0)$. A classical example, which plays an important role below, is that of the first passage times of a Brownian motion with drift. More specifically, let us consider a real standard Brownian motion $(B_s, s \ge 0)$, let $X_s = B_s + s$ and, for $x \ge 0$, let:

\[ \tau_x = \inf \{ s \ge 0, \ X_s > x \} \]

Because $\tau_x$ is a stopping time and since $X_{\tau_x} = x$, the strong Markov property applied to the Brownian motion implies that the process $X'_s = X_{s + \tau_x} - x$ is also a Brownian motion with drift, which is independent of the portion of the trajectory before $\tau_x$, $(X_r, 0 \le r \le \tau_x)$. For any $z \in [0, x]$, the first passage time $\tau_z$ clearly depends only on the trajectory before $\tau_x$ and hence $(X'_s, s \ge 0)$ is independent of $(\tau_z, 0 \le z \le x)$. The identity:

\[ \tau_{x+y} - \tau_x = \inf \{ s \ge 0, \ X_{s + \tau_x} > x + y \} = \inf \{ s \ge 0, \ X'_s > y \} \]

then highlights the independence and the homogeneity of the increments of $\tau$, which is hence a subordinator. Similar arguments apply to the increments of the function (3.18) for the inviscid Burgers’ equation (3.17) with Brownian initial condition, i.e. when $u(x, 0) = B_x$ for $x \ge 0$. Indeed, we verify that $(a(x, 1) - a(0, 1), x \ge 0)$ and $(\tau_x, x \ge 0)$ have the same law (see [BER 98]).

The isolated example of Burgers’ equation with Brownian initial data can make us hope that more general results hold, and in particular that large classes of non-linear partial differential equations generically develop multifractal solutions. However, at present, there are no proven results of this type (the reader can nevertheless consult [VER 94] concerning Burgers’ equation in several space dimensions).
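Formulas (3.18) and (3.19) lend themselves to a direct numerical experiment. The sketch below is a rough discretization under stated assumptions: the Brownian initial condition is replaced by a Gaussian random walk on a grid of step `ds`, the integral defining $F_x$ by a left Riemann sum, and the half-line $s \ge 0$ by a finite horizon. All function names are hypothetical.

```python
import math
import random

def brownian_initial_data(n_steps, ds, rng):
    """Samples u0(s_i) = B(s_i) on the grid s_i = i * ds (random-walk approximation)."""
    values, b = [0.0], 0.0
    for _ in range(n_steps):
        b += rng.gauss(0.0, math.sqrt(ds))
        values.append(b)
    return values

def cumulative_G(u0, ds):
    """G(s_i) = integral over [0, s_i] of (u0(r) + r) dr, by a left Riemann sum,
    so that F_x(s) = G(s) - x * s."""
    G, acc = [0.0], 0.0
    for i in range(len(u0) - 1):
        acc += (u0[i] + i * ds) * ds
        G.append(acc)
    return G

def a_of_x(G, ds, x):
    """Largest grid point minimizing F_x(s) = G(s) - x * s, as in (3.18)."""
    best_i, best_val = 0, G[0]
    for i, g in enumerate(G):
        val = g - x * i * ds
        if val <= best_val:  # '<=' keeps the largest minimizer
            best_i, best_val = i, val
    return best_i * ds

def inviscid_solution(G, ds, xs):
    """u(x, 1) = x - a(x, 1), formula (3.19), on a list of abscissae."""
    return [x - a_of_x(G, ds, x) for x in xs]
```

On such a grid, $x \mapsto a(x, 1)$ is a non-decreasing step function whose jumps correspond to the shocks of $u(\cdot, 1)$, which is consistent with its description as a subordinator.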


3.3.3. Random wavelet series

Since we are interested in the local properties of functions, it is equivalent, and also easier, to work with periodic wavelets, which are obtained by periodization of a usual wavelet basis (see [MEY 90]) and are defined on the torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. The periodic wavelets:

\[ \psi_{j,k}(x) = 2^{j/2} \sum_{l \in \mathbb{Z}} \psi\bigl( 2^j (x - l) - k \bigr), \qquad j \in \mathbb{N}, \ 0 \le k < 2^j \tag{3.20} \]

form an orthonormal basis of $L^2(\mathbb{T})$ (after adding the constant function equal to 1, see [MEY 90]; we use the same notation as for the wavelets on $\mathbb{R}$, which will not lead to any confusion). We also assume that $\psi$ has enough regularity and vanishing moments. Any periodic function $f$ is hence written as follows:

\[ f(x) = \sum_{j,k} d_f(k, j)\, 2^{-j/2}\, \psi_{j,k}(x) \tag{3.21} \]

where the wavelet coefficients of $f$ are given by:

\[ d_f(k, j) = \int_0^1 2^{j/2}\, \psi_{j,k}(t)\, f(t) \, dt \]

We assume that all the coefficients are independent and have the same law at each scale. Let $\rho_j$ be the common probability law of the $2^j$ random variables $X_{j,k} = -(\log_2 |d_f(k, j)|)/j$ (the signs of the wavelet coefficients have no influence on the Hölder regularity, which is why we make no assumption about them). Thus, the measure $\rho_j$ satisfies:

\[ P\bigl( |d_f(k, j)| \ge 2^{-a j} \bigr) = \rho_j\bigl( (-\infty, a] \bigr) \]

We will make the following assumption on $\rho_j$:

\[ \exists \varepsilon > 0 : \quad \operatorname{supp}(\rho_j) \subset [\varepsilon, +\infty] \]

It means that the sample paths of the process are uniformly Hölder. We need to define the logarithmic density $\tilde\rho(\alpha)$ of the coefficients; i.e.:

\[ \bar\rho(\alpha) = \lim_{\varepsilon \to 0} \, \limsup_{j \to +\infty} \frac{\log_2 \bigl( 2^j \rho_j([\alpha - \varepsilon, \alpha + \varepsilon]) \bigr)}{j} \]

The reader should note that this density is, in fact, a large deviation spectrum, but computed from the wavelet coefficients (see Chapter 4, where general results concerning large deviation spectra are established). Then, let:

\[ \tilde\rho(\alpha) = \begin{cases} \bar\rho(\alpha) & \text{if } \bar\rho(\alpha) \ge 0 \\ 0 & \text{otherwise} \end{cases} \]


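and:

\[ H_{\max} = \left( \sup_{\alpha > 0} \frac{\tilde\rho(\alpha)}{\alpha} \right)^{-1} \]

THEOREM 3.4.– Let $f$ be a random wavelet series satisfying the uniform regularity assumption. The singularity spectrum of almost every sample path of $f$ is supported by $[\varepsilon, H_{\max}]$ and, on this interval:

\[ f_H(H) = H \sup_{\alpha \in [0, H]} \frac{\tilde\rho(\alpha)}{\alpha} \tag{3.22} \]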
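Formula (3.22) is easy to evaluate once a logarithmic density is given. The sketch below assumes a toy density supported on two exponents, passed as a dictionary mapping $\alpha$ to $\tilde\rho(\alpha)$, and takes the lower end `eps` of the support interval as a parameter; the density, the names `h_max` and `spectrum`, and the value of `eps` are illustrative assumptions, not data from the text.

```python
import math

def h_max(density):
    """H_max = (sup over alpha > 0 of rho_tilde(alpha) / alpha)^(-1),
    for a density given as a dict {alpha: rho_tilde(alpha)} with alpha > 0."""
    return 1.0 / max(rho / alpha for alpha, rho in density.items())

def spectrum(H, density, eps):
    """f_H(H) = H * sup over alpha <= H of rho_tilde(alpha) / alpha on [eps, H_max],
    and -infinity outside this interval."""
    if H < eps or H > h_max(density) * (1.0 + 1e-12):
        return -math.inf
    ratios = (rho / alpha for alpha, rho in density.items() if alpha <= H)
    return H * max(ratios, default=0.0)

# Toy density: about 2^(j/2) coefficients of size 2^(-0.2 j) at scale j,
# all the others of size 2^(-0.8 j).
EXAMPLE_DENSITY = {0.2: 0.5, 0.8: 1.0}
```

With this toy density, `h_max(EXAMPLE_DENSITY)` is 0.4 and the spectrum is the linear function $2.5\,H$ on $[0.2, 0.4]$, reaching 1 at $H_{\max}$.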

The function $\tilde\rho(\alpha)$ being essentially arbitrary, we note that the singularity spectrum of a random wavelet series is not necessarily concave.

3.4. Multifractal formalisms

Even if the singularity spectrum of many mathematical functions can be determined directly from Definition 3.4, for multifractal signals it is not practical to determine the regularity at each point, because the Hölder exponent can be discontinuous everywhere; it is even less practical to compute the infinitely many corresponding Hausdorff dimensions! Frisch and Parisi introduced a formula which allows us to deduce the singularity spectrum of a signal from quantities that are easily measurable. It is the first example of what we now call multifractal formalisms. Several variants have been proposed since then; we shall describe some of them and compare their respective performances. Additional results are found in Chapter 4.

The formula proposed by Frisch and Parisi is based on the knowledge of the Besov spaces to which the function belongs. This is why we begin with a few reminders regarding these spaces and their characterization by wavelets.

3.4.1. Besov spaces and lacunarity

One of the reasons for the success of wavelet decompositions in applications is that they often provide very lacunary representations of signals (few coefficients are numerically non-negligible). This lacunarity is often quantified by determining to which Besov spaces the function considered belongs. For the wavelet characterization of Besov spaces, see [MEY 90]:

\[ \forall s \in \mathbb{R}, \ \forall p > 0, \qquad f \in B^{s,p}(\mathbb{R}) \iff \left( \sum_{j,k} \bigl| d_f(k, j)\, 2^{(s - \frac{1}{p}) j} \bigr|^p \right)^{1/p} < +\infty \tag{3.23} \]


When $p\ge 1$, the Besov spaces are very close to the Sobolev spaces; indeed, if $L^{p,s}$ is the space of the functions of $L^p$ whose fractional derivatives of order $s$ still belong to $L^p$, we have the following injections:

$$\forall\varepsilon>0,\quad\forall p\ge 1,\qquad L^{p,s+\varepsilon}\hookrightarrow B^{s,p}\hookrightarrow L^{p,s-\varepsilon}$$

However, a determining advantage over Sobolev spaces is that the Besov spaces are deﬁned for any p > 0. It is precisely these spaces for p close to 0 that enable us to measure the lacunarity of the representation in wavelets of f . We illustrate this point with an example. Let us consider the function:

$$H(x)=\begin{cases}1 & \text{if } |x|\le 1\\ 0 & \text{otherwise}\end{cases}$$

and let us assume that the chosen wavelet has compact support, contained in the interval $[-A,A]$. Because $\psi$ has zero integral, for each $j$ there are fewer than $4A$ non-zero wavelet coefficients, so that the wavelet decomposition of $H$ is very lacunary. Because $H(x)$ is bounded, we have $|d_H(k,j)|\le C$ for any $j,k$. By using (3.23), $H(x)$ belongs to $B^{s,p}(\mathbb{R})$ as soon as $s<1/p$. Let us now show that such a statement is a way to quantify the fact that the wavelet decomposition of a function is very lacunary. Let us suppose that $f$ is a bounded function satisfying:

$$\forall p>0,\quad\forall s<\frac{1}{p},\qquad f\in B^{s,p}(\mathbb{R})$$

We will verify that for any $D>0$ and for any $\varepsilon>0$, at each scale $j$, there are fewer than $C(\varepsilon,D)\,2^{\varepsilon j}$ coefficients of size larger than $2^{-Dj}$. Indeed, if this were not the case, by taking $p=\varepsilon/(2D)$, we would obtain $\sum_k|d_f(k,j)|^p\to+\infty$ when $j\to+\infty$, which is a contradiction.

Here is another illustration of the relation between the lacunarity of the wavelet decomposition and the Besov regularity. We assume that $f$ belongs to $\bigcap_{p>0}B^{1/p,p}$. Coming back to (3.23), we observe that this condition means exactly that the sequence $d_f(k,j)$ belongs to $l^p$ for any $p>0$. Let us then denote by $d_n$ the decreasing rearrangement of the sequence of wavelet coefficient moduli $|d_f(k,j)|$; the sequence $d_n$ also belongs to $l^p$ for any $p>0$. Thus:

$$\forall p\quad\exists C_p\quad\text{such that}\quad\sum_{n=1}^{\infty}d_n^p\le C_p$$


Because the sequence $d_n$ is decreasing:

$$\forall N,\qquad N\,d_N^p\le\sum_{n=1}^{N}d_n^p\le\sum_{n=1}^{\infty}d_n^p\le C_p$$

and thus $d_N\le(C_p)^{1/p}N^{-1/p}$. Since we can take $p$ arbitrarily close to 0, we observe that the decreasing rearrangement of the sequence $|d_f(k,j)|$ has fast decay, which is, once again, a way to express the lacunarity (the converse is immediate: if the sequence $d_n$ has fast decay, it belongs to all $l^p$ and thus this is also the case for the sequence $d_f(k,j)$).

The Besov spaces for $p<1$ are not locally convex, which partly explains the difficulties in their use. Before the introduction of wavelets, these spaces were characterized either by the order of approximation of $f$ by rational fractions whose numerator and denominator have a fixed degree, or by the order of approximation by splines “with free nodes” (which means that we are free to choose the points where the piecewise polynomials are connected) (see [DEV 98, JAF 01a]). However, these characterizations are difficult to handle and hence do not have any real numerical applications.

Characterization (3.23) shows that the knowledge of the Besov spaces to which $f$ belongs is clearly linked to the asymptotic behavior (when $j\to+\infty$) of the moments of the distribution of the wavelet coefficients of $f$; see (3.26). Generally, more information is available; indeed, these moments are deduced from the histogram of the coefficients at each scale $j$. It is therefore natural to ask which information regarding the pointwise regularity of $f$ can be deduced from the knowledge of these histograms. We present a study of this problem below. We note that cascade-type models for the evolution of the distribution function of wavelet coefficients through the scales have been proposed to model the velocity of turbulent flows [ARN 98].

To start with, let us point out a limitation of multifractal analysis: functions having the same histograms of wavelet coefficients at each scale can have completely different singularity spectra [JAF 97a].
In multifractal analysis, it is not only the histogram of the coefficients which is important, but also their positions. This is why no formula deducing the singularity spectrum from the knowledge of the histograms can be valid in general. However, we can hope that some formulae are “more valid than others”. Indeed, we have observed that if the coefficient values are independent random variables, there is an almost sure spectrum; we shall see that the formula that yields this spectrum differs from the formulae proposed until now. Another approach consists of refining the information on the function and considering function spaces that take the positions of the large wavelet coefficients into account (see [JAF 05]).
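The rearrangement argument of this section can be checked numerically. The following Python sketch is an illustration only: the coefficient values are synthetic (an assumption, not the wavelet coefficients of an actual signal) and the function name `rearrangement_bound_holds` is hypothetical. It sorts the moduli in decreasing order and verifies the bound $d_N\le(C_p/N)^{1/p}$ for several values of $p$.

```python
import random


def rearrangement_bound_holds(coeffs, p):
    """Check d_N <= (C_p / N)**(1/p) for the decreasing rearrangement d_n.

    C_p is the sum of |c|**p over all coefficients.
    """
    d = sorted((abs(c) for c in coeffs), reverse=True)
    c_p = sum(x ** p for x in d)
    # Small relative slack so that rounding errors cannot produce a false alarm.
    return all(
        d[n - 1] <= (c_p / n) ** (1.0 / p) * (1.0 + 1e-12)
        for n in range(1, len(d) + 1)
    )


# Synthetic, fast-decaying coefficients in arbitrary order (illustrative assumption).
rng = random.Random(0)
coeffs = [2.0 ** (-n) for n in range(1, 31)]
rng.shuffle(coeffs)

for p in (0.1, 0.5, 1.0):
    print(p, rearrangement_bound_holds(coeffs, p))
```

The inequality holds for every $p>0$; it is the freedom to take $p$ small that turns it into a fast decay statement.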


3.4.2. Construction of formalisms

The construction of a multifractal formalism can be based on two types of considerations:
– counting arguments: we consider the increments (or wavelet coefficients) having a certain size; we estimate their number and deduce their contribution to some “calculable” quantities;
– more mathematical arguments: we prove that a bound of the spectrum, according to the “calculable” quantities, is generally true and that this bound is “generically” an equality. The term “generically” is to be understood in the sense of “Baire classes” if the information at the start is of functional type, or as “almost sure” if the information at the start is a probability.

We begin by describing the first approach; we do not exactly recapitulate the initial argument of Frisch and Parisi, but rather its “translation” in wavelets, as found in [ARN 95b, JAF 97a]. This approach admits two variants, according to the information on the function that we have. Indeed, we can start from:
– the partition function $\tau(q)$, defined from knowledge of the Besov spaces to which $f$ belongs:

$$\tau(q)=\sup\{s: f\in B^{s/q,q}\}=1+\liminf_{j\to+\infty}\frac{\log\left(\sum_k|d_f(k,j)|^q\right)}{\log 2^{-j}}$$

– histograms of wavelet coefficients. Generally, for each $j$, let $N_j(\alpha)=\#\{k: |d_f(k,j)|\ge 2^{-\alpha j}\}$. Thus, we have $E(N_j(\alpha))=2^j\rho_j([0,\alpha])$. If:

$$\rho(\alpha,\varepsilon)=\limsup_{j\to\infty}\frac{\log\big(N_j(\alpha+\varepsilon)-N_j(\alpha-\varepsilon)\big)}{\log(2^j)}\tag{3.24}$$

then, we define:

$$\rho(\alpha)=\inf_{\varepsilon>0}\rho(\alpha,\varepsilon)\tag{3.25}$$

(there are of the order of $2^{\rho(\alpha)j}$ coefficients of size $\sim 2^{-\alpha j}$).
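As an illustration of these two starting points, the following Python sketch computes, at a single finite scale $j$, the quantity whose lower limit defines $\tau(q)$ and the count $N_j(\alpha)$. It is a sketch under stated assumptions: the coefficients are synthetic (products of two hypothetical weights `m0` and `m1` along the dyadic tree, in the spirit of a binomial cascade), and the names `cascade_coefficients`, `tau_at_scale` and `count_large` are illustrative. For this particular model $\sum_k|d(k,j)|^q=(m_0^q+m_1^q)^j$, so the finite-scale value does not depend on $j$ and can be compared with a closed form.

```python
import math


def cascade_coefficients(j, m0=0.7, m1=0.3):
    """Synthetic moduli |d(k, j)|, k = 0..2**j - 1 (assumed model, not real data)."""
    coeffs = [1.0]
    for _ in range(j):
        coeffs = [c * m for c in coeffs for m in (m0, m1)]
    return coeffs


def tau_at_scale(coeffs, j, q):
    """Finite-scale version of tau(q) = 1 + log(sum_k |d|^q) / log(2^-j)."""
    s = sum(abs(c) ** q for c in coeffs)
    return 1.0 + math.log(s) / math.log(2.0 ** (-j))


def count_large(coeffs, j, alpha):
    """N_j(alpha) = #{k : |d(k, j)| >= 2^(-alpha j)}."""
    threshold = 2.0 ** (-alpha * j)
    return sum(1 for c in coeffs if abs(c) >= threshold)


j = 12
coeffs = cascade_coefficients(j)
for q in (0.5, 1.0, 2.0):
    closed_form = 1.0 - math.log2(0.7 ** q + 0.3 ** q)
    print(q, tau_at_scale(coeffs, j, q), closed_form)
print(count_large(coeffs, j, 1.0))
```

On real data the sum over $k$ would be evaluated at several scales and the slope in $j$ estimated, since a single scale only gives a finite-size approximation of the limit.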

It is important to note that the information provided by $\rho(\alpha)$ is richer than that provided by $\tau(q)$; indeed, $\tau(q)$ can be deduced from the histograms with:

$$\tau(q)=\liminf_{j\to+\infty}\ \frac{-1}{j}\log_2\left(2^{-j}\int 2^{-\alpha qj}N_j(\alpha)\,d\alpha\right)\tag{3.26}$$

because, by definition of $N_j$, we have $\sum_k|d_f(k,j)|^q=\int 2^{-\alpha qj}\,dN_j(\alpha)$. It is easy to deduce that:

$$\tau(q)=\inf_{\alpha\ge 0}\big(\alpha q-\rho(\alpha)+1\big)\tag{3.27}$$

On the other hand, we cannot reconstitute $\rho(\alpha)$ from $\tau(q)$; indeed, it is clear from (3.27) that two functions $\rho(\alpha)$ and $\rho'(\alpha)$ having the same concave envelope lead to the same function $\tau(q)$. Based on $\tau(q)$, we can thus only obtain the concave envelope of $\rho(\alpha)$, by carrying out a Legendre transformation once again.

We will now describe the heuristic arguments on which the construction of these multifractal formalisms is based. We divide them into four steps, highlighting the implicit assumptions that we make for each of them.

STEP 1. The first assumption, common to both approaches, is that the Hölder exponent at each point $x_0$ is given by the order of magnitude of the wavelet coefficients of $f$ in a cone $|k2^{-j}-x_0|\le C2^{-j}$. With respect to (3.6), if they decrease as $2^{-Hj}$, we then have $h_f(x_0)=H$. This assumption is verified if $f$ has only “cusp” type singularities ([MEY 98]), i.e., if the oscillation exponent is zero everywhere. If we start from the data of $\rho(H)$, we have, as an assumption, $2^{\rho(H)j}$ wavelet coefficients of size $2^{-Hj}$. By using the supports of the corresponding wavelets to cover $E_H$, we expect to obtain $f_H(H)=\rho(H)$. Thus, we obtain a first form of multifractal formalism: the formalism said to be “of large deviation”, which simply states that:

$$f_H(H)=\rho(H)$$

Let us briefly justify this name as well as that of large deviation spectrum that we sometimes give to the function $\rho$. The theory of large deviations deals with the calculation of probabilities which are so small that we can only correctly estimate them on logarithmic scales. The basic example is as follows: if $X_1,\dots,X_n$ are independent standard (centered, unit variance) Gaussian variables, and if $\tilde S_n=\frac{1}{n}\sum_{i=1}^{n}X_i$, then $\frac{1}{n}\log P(|\tilde S_n|\ge\delta)\sim-\delta^2/2$.
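The Gaussian example can be checked directly, since $\tilde S_n$ is itself Gaussian with variance $1/n$, so that $P(|\tilde S_n|\ge\delta)=\operatorname{erfc}(\delta\sqrt{n/2})$. The following Python sketch (the function name `log_tail_rate` is illustrative) evaluates $\frac{1}{n}\log P$ for increasing $n$. It relies only on `math.erfc` from the standard library, and is limited to moderate $n$ because the probability eventually underflows in double precision.

```python
import math


def log_tail_rate(n, delta):
    """Return (1/n) * log P(|S_n| >= delta) for the mean S_n of n standard Gaussians.

    S_n is N(0, 1/n), hence P(|S_n| >= delta) = erfc(delta * sqrt(n / 2)).
    """
    prob = math.erfc(delta * math.sqrt(n / 2.0))
    return math.log(prob) / n


delta = 1.0
for n in (10, 100, 400):
    # Second printed value is the large deviation rate -delta^2 / 2.
    print(n, log_tail_rate(n, delta), -delta ** 2 / 2.0)
```

The slow convergence visible here, with a correction of order $\log(n)/n$, is a small-scale instance of the finite size effects that affect logarithmic estimates.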
The analogy with (3.24) and (3.25) is striking since, in the common law of the wavelet coefficients, the parts of very small probability, which we measure with the help of a logarithmic scale, are those that provide the relevant information $\rho(\alpha)$ for calculating the spectrum (see Chapter 4 for a more detailed study).

The effective calculation of the function $\rho(\alpha)$ is numerically delicate because its definition involves a double limit, which generally leads to so-called “finite size” problems. In theory, we must go “completely” to the limit in $j$ in (3.24) before taking the limit in $\varepsilon$ in (3.25). In practice, the two limits must be taken “together”. The problem is then to know how to take $j$ sufficiently large according to $\varepsilon$, which creates significant numerical stability problems. In any case, a numerically reliable calculation of $\rho(\alpha)$ requires us to know the signal on a large number of scales, i.e., with an excellent precision. This is why we often prefer to work from averages, such as $\sum_k|d_f(k,j)|^q$, i.e., finally, from the partition function, whose definition involves only one limit. From now on, this is the point of view that we shall adopt (however, let us note that the direct method introduced by Chhabra and Jensen in [CHH 89] is a method for calculating $\rho(\alpha)$ without going through a double limit, or through a Legendre transformation; a mathematical discussion of this method adapted to the wavelet framework can be found in [JAF 04a]).

STEP 2. We will estimate, for each $H$, the contribution of the Hölder singularities of exponent $H$ to:

$$\sum_k|d_f(k,j)|^q\tag{3.28}$$

Each singularity of this type brings a contribution of $C2^{-Hqj}$ and there must be $\sim 2^{f_H(H)j}$ intervals of length $2^{-j}$ to cover these singularities; the total contribution of the Hölder singularities of exponent $H$ to (3.28) is thus as follows:

$$2^{f_H(H)j}\,2^{-Hqj}=2^{-(Hq-f_H(H))j}\tag{3.29}$$

This is a critical step of the reasoning; it contains an inversion of limits that implicitly assumes that all Hölder singularities have coefficients $\sim 2^{-Hj}$ simultaneously from a certain scale $J$ onwards, and that the Hausdorff dimension can be estimated as if it were a box dimension. It is remarkable that the multifractal formalism leads to the correct singularity spectrum in several situations where these two assumptions are not verified.

STEP 3. The third step is an argument of the “Laplace method” type. When $j\to+\infty$, we note that, among the terms (3.29), the one that brings the main contribution to (3.28) is that for which the exponent $H$ achieves the minimum of $Hq-f_H(H)$, from which comes the heuristic formula:

$$\tau(q)-1=\inf_H\big(Hq-f_H(H)\big)$$

STEP 4. If $f_H(H)$ is concave, then $-f_H(H)$ and $-\tau(q)+1$ are conjugate convex functions and each of them can be deduced from the other by a Legendre transformation. Thus, if we define the Legendre spectrum by:

$$f_L(H)=\inf_q\big(Hq-\tau(q)+1\big)$$

we deduce a first formulation of the multifractal formalism:

$$f_H(H)=f_L(H)=\inf_q\big(Hq-\tau(q)+1\big)\tag{3.30}$$
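Steps 3 and 4 can be illustrated numerically with a discrete Legendre transform. The following Python sketch is an illustration under stated assumptions: the two test spectra are hypothetical closed-form functions chosen for the example (they are not spectra of any signal discussed in this chapter), the grids are arbitrary, and the function names are illustrative. It computes $\tau(q)$ from a given spectrum by the formula of step 3, then $f_L$ by (3.30), and compares the result with the spectrum it started from, once for a concave spectrum and once for a non-concave one.

```python
def tau_from_spectrum(h_grid, f_vals, q_grid):
    """Step 3 on a grid: tau(q) = 1 + min_H (H q - f(H))."""
    return [1.0 + min(h * q - f for h, f in zip(h_grid, f_vals)) for q in q_grid]


def legendre_spectrum(h_grid, q_grid, tau_vals):
    """Formula (3.30) on a grid: f_L(H) = min_q (H q - tau(q) + 1)."""
    return [min(h * q - t + 1.0 for q, t in zip(q_grid, tau_vals)) for h in h_grid]


# Arbitrary grids (assumptions): H in [0, 1], q in [-10, 10].
h_grid = [i / 100.0 for i in range(101)]
q_grid = [k / 20.0 for k in range(-200, 201)]

# Hypothetical concave spectrum, and a spectrum with a dip around H = 0.5.
concave = [1.0 - 4.0 * (h - 0.5) ** 2 for h in h_grid]
dented = [1.0 - 16.0 * ((h - 0.5) ** 2 - 1.0 / 16.0) ** 2 for h in h_grid]

fl_concave = legendre_spectrum(h_grid, q_grid, tau_from_spectrum(h_grid, concave, q_grid))
fl_dented = legendre_spectrum(h_grid, q_grid, tau_from_spectrum(h_grid, dented, q_grid))

# Largest gap between the concave spectrum and its double transform.
print(max(abs(a - b) for a, b in zip(fl_concave, concave)))
# Value of the dented spectrum at H = 0.5, and of its Legendre spectrum there.
print(dented[50], fl_dented[50])
```

In the second case the Legendre spectrum lies above the original function inside the dip: the double transform fills the non-concave region with a straight segment.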


The concavity assumption on $f_H(H)$ has no reason to hold in general. We can then:
– stop at step 3 and only state that $\tau(q)$ is the Legendre transform of $f_H(H)$. However, this weaker form of multifractal formalism is of little interest because the quantity that we would like to calculate is $f_H(H)$ and the quantity that we know is generally $\tau(q)$;
– state that (3.30) provides, in fact, the concave envelope of the spectrum. This information is, however, particularly ambiguous when the function calculated in this manner contains straight line segments (which is often the case, see [JAF 97b, JAF 99]): do they correspond to actual values of the spectrum or only to its envelope in a region where it is not concave?

3.5. Bounds of the spectrum

We will obtain bounds of $f_H(H)$ valid in general. Once again, we have here two points of view, depending on whether the information that we have concerns the function spaces to which $f$ belongs or the histograms of the wavelet coefficients. We can make an observation before any calculation: since $\rho(\alpha)$ contains more information than $\tau(q)$, we expect that the bounds obtained from $\rho(\alpha)$ will be better; we shall see that this is indeed the case.

3.5.1. Bounds according to the Besov domain

We start by bounding the singularity spectrum of functions belonging to a fixed Besov space.

PROPOSITION 3.5.– Let $s>1/p$, $\alpha\in[s-\frac{1}{p},s]$ and $d=1-p(s-\alpha)$. Then, for any function $f\in B^{s,p}$, the set of points $x_0$ where $f\notin C^{s-\frac{1-d}{p}}(x_0)$ has a $d$-dimensional Hausdorff measure equal to zero.

Proof. Let $s>0$ and $p>0$. Let us note:

$$e_f(k,j)=d_f(k,j)\,2^{j(s-1/p)}$$

With respect to (3.23), the condition $f\in B^{s,p}$ can then be rewritten:

$$\sum_{j,k}|e_f(k,j)|^p<\infty\tag{3.31}$$

Let us denote by $I_{j,k}$ the interval centered at $k2^{-j}$ and of length $|e_f(k,j)|^{p/d}$. We can now rewrite (3.31):

$$\sum_{j,k}\big(\operatorname{diam}(I_{j,k})\big)^d<\infty$$


For all $J$, the intervals $(I_{j,k})_{j\ge J}$ form a covering of the set of points belonging to an infinity of intervals $I_{j,k}$, so that the $d$-dimensional Hausdorff measure of this set is zero. If a point $x$ belongs to only a finite number of intervals $I_{j,k}$, there exists $J$ $(=J(x))$ such that:

$$\forall j\ge J,\ \forall k,\qquad\left|\frac{k}{2^j}-x\right|\ge|e_f(k,j)|^{p/d}$$

and thus:

$$|d_f(k,j)|\le\left|\frac{k}{2^j}-x\right|^{d/p}2^{-(s-\frac{1}{p})j}=2^{-(s-\frac{1}{p}+\frac{d}{p})j}\,|2^jx-k|^{d/p}$$

Since we have $s-1/p>0$, Proposition 3.2 implies that $f\in C^{s-\frac{1}{p}+\frac{d}{p}}(x)$, from which we obtain Proposition 3.5, since:

$$s-\frac{1}{p}+\frac{d}{p}=\alpha.$$

NOTE 3.4.– The condition $s>1/p$ is necessary because there are functions of $B^{1/p,p}$ that are nowhere locally bounded (see [JAF 00c]).

We will now deduce from Proposition 3.5 a bound of the spectrum $f_H(H)$. Let us assume that $\tau(q)$ is known. Let $q$ and $\varepsilon>0$ be fixed. By definition of $\tau(q)$, for any $\varepsilon>0$, $f$ belongs to $B^{(\tau(q)-\varepsilon)/q,q}$. We can then apply Proposition 3.5 for all $q$ such that:

$$\exists\varepsilon>0:\qquad\frac{\tau(q)-\varepsilon}{q}>\frac{1}{q}$$

which is equivalent to $\tau(q)>1$. If $f$ is uniformly Hölder, $\tau(q)$ is increasing and continuous, and verifies:

$$\lim_{q\to 0}\tau(q)=0\qquad\text{and}\qquad\tau(q)\to+\infty\quad\text{when}\quad q\to+\infty$$

There is a unique value $q_c$ such that $\tau(q_c)=1$ and, for any $q>q_c$, we thus have:

$$f_H(H)\le 1-q\left(\frac{\tau(q)-\varepsilon}{q}-H\right)$$

thus:

$$f_H(H)\le qH-\tau(q)+1$$

Since this result is true for any $q>q_c$, we have shown the following proposition.


PROPOSITION 3.6.– If $f$ is uniformly Hölder, the singularity spectrum of $f$ verifies the bound:

$$f_H(H)\le\inf_{q>q_c}\big(qH-\tau(q)+1\big)\tag{3.32}$$

The quasi-sure results that we now describe show that bound (3.32) is optimal. Proposition 3.6 suggests that, in formula (3.30), the domain of $q$ on which the Legendre transform must be calculated is the interval $[q_c,+\infty)$. Let us assume that $\tau(q)$ is an admissible partition function, i.e., it is the partition function of a uniformly Hölder function $f$ (this will be the case if $s(q)=q\tau(1/q)$ is concave and verifies $0\le s'(q)\le 1$ and $s(0)>0$; see [JAF 00b]). To say that $f$ has $\tau(q)$ as partition function implies, by definition, that $f$ belongs to the space $V$ $(=V_{\tau(q)})$ defined by:

$$V=\bigcap_{\varepsilon>0,\ q>0}B^{(\tau(q)-\varepsilon)/q,q}\tag{3.33}$$

Space $V$ is a Baire space, i.e. it has the following property: any countable intersection of dense open subsets is dense. In a Baire space, a property which is true at least on a countable intersection of dense open subsets is said to be quasi-sure. Hence, it is natural to wonder if the multifractal formalism holds quasi-surely in $V$. The following theorem, taken from [JAF 00b], answers this question.

THEOREM 3.5.– Let $\tau(q)$ be admissible and $V$ the space defined by (3.33). The definition domain of the singularity spectrum of quasi-all functions of $V$ is the interval $[s(0),1/q_c]$ and, on this interval, we have:

$$f_H(H)=\inf_{q\ge q_c}\big(Hq-\tau(q)+1\big)\tag{3.34}$$

Formula (3.34) states that the singularity spectrum of quasi-all functions is made up of two parts: if we have $H<\tau'(q_c)$, the infimum in (3.34) is reached for $q>q_c$ and the spectrum can be calculated using the “usual” Legendre transformation of $\tau(q)$:

$$f_H(H)=\inf_{q>0}\big(Hq-\tau(q)+1\big)$$

If we have $\tau'(q_c)\le H\le 1/q_c$, the infimum in (3.34) is reached for $q=q_c$ and the spectrum is the straight line segment $f_H(H)=Hq_c$. The study of regularity properties that are quasi-sure goes back to Banach (see [BAN 31]). Buczolich and Nagy have shown in [BUC 01] that quasi-all monotone continuous functions are multifractal with spectrum $f_H(H)=H$ for $H\in[0,1]$.


3.5.2. Bounds deduced from histograms

The following proposition provides the optimal bound of the Hölder spectrum which can be deduced in general from the histograms of wavelet coefficients [AUB 00].

PROPOSITION 3.7.– If we have $f\in C^{\varepsilon}(\mathbb{R})$ for an $\varepsilon>0$, then:

$$f_H(H)\le H\sup_{\alpha\in[0,H]}\frac{\rho(\alpha)}{\alpha}\tag{3.35}$$

This becomes an equality for random wavelet series. Indeed, they verify $\tilde\rho(\alpha)=\rho(\alpha)$, which shows that this bound is optimal. We can easily verify that it implies (3.32). However, (3.35) clearly provides a better bound if $\rho(\alpha)$ is not concave. Once again, we see that the histogram of the coefficients contains strictly more “useful” information than the partition function. We note that, although (3.35) is more precise than (3.32), the fact still remains that (3.32) is optimal when the only information available is the partition function (as shown by the quasi-sure results). The optimal bounds (3.35) and (3.32) suggest variants of the multifractal formalism. We say that the almost sure multifractal formalism is verified if (3.35) is saturated, i.e. if:

$$f_H(H)=H\sup_{\alpha\in[0,H]}\frac{\rho(\alpha)}{\alpha}\tag{3.36}$$

and that the quasi-sure multifractal formalism is verified if (3.32) is saturated, i.e. if:

$$f_H(H)=\inf_{q\ge q_c}\big(qH-\tau(q)+1\big)\tag{3.37}$$
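Bound (3.35) is straightforward to evaluate once $\rho(\alpha)$ has been tabulated. The following Python sketch assumes a hypothetical table of values (`alphas`, `rhos`) with $\alpha>0$; both the table and the function name `histogram_bound` are illustrative assumptions, not data from this chapter.

```python
def histogram_bound(alphas, rhos, h):
    """Right-hand side of (3.35): H * sup over alpha in (0, H] of rho(alpha) / alpha.

    Returns None when no tabulated alpha lies in (0, H].
    """
    ratios = [r / a for a, r in zip(alphas, rhos) if 0.0 < a <= h]
    if not ratios:
        return None
    return h * max(ratios)


# Hypothetical tabulated large deviation spectrum (illustrative values only).
alphas = [0.2, 0.4, 0.6, 0.8]
rhos = [0.1, 0.4, 0.3, 0.8]

for h in (0.3, 0.5, 0.8):
    print(h, histogram_bound(alphas, rhos, h))
```

Because the supremum runs over all $\alpha\le H$, a large ratio $\rho(\alpha)/\alpha$ attained at a small $\alpha$ determines the bound for every larger $H$ until a larger ratio is met.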

3.6. The grand-canonical multifractal formalism

The aim of the grand-canonical multifractal formalism is to calculate the spectrum of oscillating singularities $d(H,\beta)$ which, by definition, provides the Hausdorff dimension of the set of points where the Hölder exponent is $H$ and the oscillation exponent is $\beta$. This formalism is based on new function spaces. To define them, we will use the following more geometric notations: $\lambda$ and $\lambda'$ will designate the dyadic intervals $\lambda_{j,k}=k2^{-j}+[0,2^{-j}]$ and $\lambda_{j',k'}=k'2^{-j'}+[0,2^{-j'}]$ respectively, $C_\lambda$ will designate the coefficient $d_f(k,j)$ and $\psi_\lambda$ the wavelet $\psi(2^jx-k)$.

DEFINITION 3.9.– Let $p>0$ and $s,s'\in\mathbb{R}$. A function $f$ belongs to $O_p^{s,s'}(\mathbb{R})$ if its wavelet coefficients satisfy:

$$\sup_{j\in\mathbb{Z}}\ 2^{sj}\left(\sum_k\ \sup_{\lambda'\subset\lambda}\left|C_{\lambda'}2^{s'j'}\right|^p\right)^{1/p}<\infty\tag{3.38}$$


This definition is independent of the wavelet basis chosen [JAF 98]. We now derive a grand-canonical multifractal formalism, which will enable us to obtain $d(H,\beta)$ from the knowledge of the wavelet coefficients of $f$. This formalism is based on a reasoning similar to that of section 3.4. It is motivated by two concerns: obtaining more complete information on the Hölder singularities of the signal, and explicitly taking into consideration chirp-type behaviors, which eliminates one of the causes of failure of the multifractal formalism. Indeed, we have seen, in step 1 of section 3.4.2, that the previous multifractal formalisms assume that the signal does not contain any chirps. Let:

$$\zeta(p,s')=\liminf_{j\to+\infty}\frac{\log\left(\sum_k\sup_{\lambda'\subset\lambda}|C_{\lambda'}|^p\,2^{s'j'}\right)}{\log 2^{-j}}=\sup\left\{s: f\in O_p^{s/p,\,s'/p}\right\}$$

If $f$ has a chirp with exponents $(H,\beta)$ at $x_0$, its wavelet coefficients are of the order of magnitude of $|k2^{-j}-x_0|^H$ close to the curve $2^{-j}\sim|k2^{-j}-x_0|^{1+\beta}$. For each pair $(H,\beta)$, we shall estimate the contribution of the chirps of exponents $(H,\beta)$ to the quantity:

$$\sum_{\lambda\in\Lambda_j}\ \sup_{\lambda'\subset\lambda}|C_{\lambda'}|^p\,2^{s'j'}\tag{3.39}$$

Let us consider an interval $\lambda$ of side $2^{-j}$ that contains a chirp with exponents $(H,\beta)$. The wavelet coefficients $C_{\lambda'}$ for $\lambda'\subset\lambda$ are negligible as long as we have $2^{-j'}\ge(2^{-j})^{1+\beta}$, i.e., as long as we have $j'\le j(1+\beta)$. When $j'\sim j(1+\beta)$, we have, for certain values of $k'$:

$$|C_{\lambda'}|\sim(2^{-j})^H\sim 2^{-j'\frac{H}{1+\beta}}$$

and hence:

$$\sup_{\lambda'\subset\lambda}|C_{\lambda'}|^p\,2^{s'j'}\sim 2^{-j'\left(\frac{Hp}{1+\beta}-s'\right)}\sim 2^{-j(Hp-(1+\beta)s')}$$

(as long as we have $s'\le pH/(1+\beta)$). The contribution of the chirps of exponents $(H,\beta)$ to (3.39) is thus:

$$2^{d(H,\beta)j}\,2^{-j(Hp-(1+\beta)s')}=2^{-j(Hp-(1+\beta)s'-d(H,\beta))}$$

When $j\to+\infty$, the main contribution is provided by the pair $(H,\beta)$ for which the infimum of $Hp-(1+\beta)s'-d(H,\beta)$ is attained, from which we obtain the heuristic formula:

$$\zeta(p,s')=\inf_{H,\beta}\big(Hp-(1+\beta)s'-d(H,\beta)\big)$$


If $d(H,\beta)$ is a concave function, then:

$$d(H,\beta)=\inf_{s',p}\big(Hp-(1+\beta)s'-\zeta(p,s')\big)\tag{3.40}$$

If we are interested in the Hölder spectrum $f_H(H)$, we can deduce it from the following argument: for a fixed $H$, the value of $\beta$ which brings the largest contribution to (3.40) will provide the correct dimension $f_H(H)$, hence the formula:

$$f_H(H)=\sup_\beta d(H,\beta)=\sup_\beta\ \inf_{s',p}\big(-(1+\beta)s'+Hp-\zeta(p,s')\big)\tag{3.41}$$

Of course, this formula is not at all equivalent to the standard multifractal formalisms, as we can easily verify in the example of lacunary wavelet series.

3.7. Bibliography

[ABR 98] ABRY P., VEITCH D., “Wavelet analysis of long-range-dependent traffic”, IEEE Trans. Inf. Theory, vol. 44, no. 1, p. 2–15, 1998.
[AMA 00] AMANN A., MAYR G., STROHMENGER H.U., “N(α) histogram analysis of the ventricular fibrillation ECG-signal as predictor of countershock success”, Chaos, Solitons, and Fractals, vol. 11, p. 1205–1212, 2000.
[ARN 95a] ARNEODO A., ARGOUL F., BACRY E., ELEZGARAY J., MUZY J.F., Ondelettes, multifractales et turbulence : de l’ADN aux croissances cristallines, Diderot, 1995.
[ARN 95b] ARNEODO A., BACRY E., MUZY J.F., “The thermodynamics of fractals revisited with wavelets”, Physica A, vol. 213, p. 232–275, 1995.
[ARN 97] ARNEODO A., BACRY E., JAFFARD S., MUZY J.F., “Oscillating singularities on Cantor sets: A grand canonical multifractal formalism”, Journal of Statistical Physics, vol. 87, p. 179–209, 1997.
[ARN 98] ARNEODO A., BACRY E., MUZY J.F., “Random cascades on wavelet dyadic trees”, Journal of Mathematical Physics, vol. 39, no. 8, p. 4142–4164, 1998.
[AUB 99] AUBRY J.M., “Representation of the singularities of a function”, Appl. Comput. Harmon. Anal., vol. 6, no. 2, p. 282–286, 1999.
[AUB 00] AUBRY J.M., JAFFARD S., “Random wavelet series”, Comm. Math. Phys., vol. 227, p. 483–514, 2002.
[AYA 08] AYACHE A., JAFFARD S., Hölder Exponents of Arbitrary Functions (preprint), 2008.
[BAN 31] BANACH S., “Über die Baire’sche Kategorie gewisser Funktionenmengen”, Studia Math., vol. 3, p. 174–179, 1931.
[BEN 97] BENASSI A., JAFFARD S., ROUX D., “Elliptic Gaussian random processes”, Revista Mathematica Iberoamericana, vol. 13, no. 1, p. 19–90, 1997.
[BER 96] BERTOIN J., An Introduction to Lévy Processes, Cambridge University Press, 1996.


[BER 98] BERTOIN J., “The inviscid Burgers equation with Brownian initial velocity”, Communications in Mathematical Physics, vol. 193, no. 2, p. 397–406, 1998.
[BIS 98] BISWAS M.K., GHOSE T., GUHA S., BISWAS P.K., “Fractal dimension estimation for texture images: A parallel approach”, Pattern Recognition Letters, vol. 19, no. 3–4, p. 309–313, 1998.
[BUC 01] BUCZOLICH Z., NAGY J., “Hölder spectrum of typical monotone continuous functions”, Real Anal. Exch., vol. 26, no. 1, p. 133–156, 2000–2001.
[CHA 99] CHASSANDE-MOTTIN E., FLANDRIN P., “On the time-frequency detection of chirps”, Appl. Comput. Harmon. Anal., vol. 6, no. 2, p. 252–281, 1999.
[CHH 89] CHHABRA A.B., JENSEN R.V., “Direct determination of the f(α) singularity spectrum”, Physical Review Letters, vol. 62, p. 1327–1330, 1989.
[DAO 95] DAOUDI K., LÉVY VÉHEL J., “Speech signal modeling based on local regularity analysis”, in IASTED/IEEE, International Conference on Signal and Image Processing (SIP’95, Las Vegas, New Mexico), 1995.
[DEV 98] DEVORE R., “Nonlinear approximation”, Acta Numerica, p. 1–99, 1998.
[DUB 89] DUBUC B., ZUCKER S.W., TRICOT C., QUINIOU J.F., WEHBI D., “Evaluating the fractal dimension of surfaces”, Proceedings of the Royal Society of London A, vol. 425, p. 113–127, 1989.
[EVA 98] EVANS L.C., “Partial differential equations”, American Mathematical Society, Graduate studies in mathematics, 1998.
[FRI 95] FRISCH U., Turbulence, Cambridge University Press, 1995.
[GAG 87] GAGNE Y., Etude expérimentale de l’intermittence et des singularités dans le plan complexe en turbulence développée, PhD Thesis, Grenoble University, 1987.
[GIN 00] GINCHEV I., ROCCA M., “On Peano and Riemann derivatives”, Rend. Circ. Mat. Pal. Ser. 2, vol. 49, p. 463–480, 2000.
[GUI 98] GUIHENEUF B., JAFFARD S., LÉVY VÉHEL J., “Two results concerning chirps and 2-microlocal exponents prescription”, Appl. Comp. Harm. Anal., vol. 5, no. 4, p. 487–492, 1998.
[JAF 91] JAFFARD S., “Pointwise smoothness, two-microlocalization and wavelet coefficients”, Publicacions Mathematiques, vol. 35, p. 155–168, 1991.
[JAF 96a] JAFFARD S., “The spectrum of singularities of Riemann’s function”, Revista Mathematica Iberoamericana, vol. 12, no. 2, p. 441–460, 1996.
[JAF 96b] JAFFARD S., MEYER Y., “Wavelet methods for pointwise regularity and local oscillations of functions”, Memoirs of the AMS, vol. 123, no. 587, 1996.
[JAF 97a] JAFFARD S., “Multifractal formalism for functions”, SIAM Journal of Mathematical Analysis, vol. 28, no. 4, p. 944–998, 1997.
[JAF 97b] JAFFARD S., “Old friends revisited. The multifractal nature of some classical functions”, J. Four. Anal. App., vol. 3, no. 1, p. 1–22, 1997.


[JAF 97c] JAFFARD S., MANDELBROT B., “Peano-Polya motion, when time is intrinsic or binomial (uniform or multifractal)”, The Mathematical Intelligencer, vol. 19, no. 4, p. 21–26, 1997.
[JAF 98] JAFFARD S., “Oscillation spaces: Properties and applications to fractal and multifractal functions”, Journal of Mathematical Physics, vol. 39, no. 8, p. 4129–4141, 1998.
[JAF 99] JAFFARD S., “The multifractal nature of Lévy processes”, Probability Theory and Related Fields, vol. 114, no. 2, p. 207–227, 1999.
[JAF 00a] JAFFARD S., “On lacunary wavelet series”, Ann. Appl. Proba., vol. 10, no. 1, p. 313–329, 2000.
[JAF 00b] JAFFARD S., “On the Frisch-Parisi conjecture”, J. Math. Pures Appl., vol. 76, no. 6, p. 525–552, 2000.
[JAF 00c] JAFFARD S., MEYER Y., “On the pointwise regularity of functions in critical Besov spaces”, J. Funct. Anal., vol. 175, p. 415–434, 2000.
[JAF 01a] JAFFARD S., “Functions with prescribed Hölder and chirps exponents”, Revista Mathematica Iberoamericana, 2001.
[JAF 01b] JAFFARD S., MEYER Y., RYAN R., Wavelets: Tools for Science and Technology, SIAM, 2001.
[JAF 04a] JAFFARD S., “Beyond Besov spaces part 1: Distributions of wavelet coefficients”, J. Four. Anal. Appl., vol. 10, no. 3, p. 221–246, 2004.
[JAF 04b] JAFFARD S., “Wavelet techniques in multifractal analysis”, in LAPIDUS M., VAN FRANKENHUIJSEN M. (Eds.), Fractal Geometry and Applications: A Jubilee of Benoît Mandelbrot, Proceedings of Symposia in Pure Mathematics, AMS, vol. 72, Part 2, p. 91–152, 2004.
[JAF 05] JAFFARD S., “Beyond Besov spaces part 2: Oscillation spaces”, J. Constr. Approx., vol. 21, no. 1, p. 29–61, 2005.
[JAF 06] JAFFARD S., LASHERMES B., ABRY P., “Wavelet leaders in multifractal analysis”, in QIAN T. et al. (Eds.), Wavelet Analysis and Applications, Birkhäuser, vol. 72, Part 2, p. 219–264, 2006.
[JAF NIC] JAFFARD S., NICOLAY S., “Pointwise smoothness of space-filling functions”, to appear in Appl. Comp. Harmon. Anal.
[LAS 08] LASHERMES S., ROUX S., ABRY P., JAFFARD S., “Comprehensive multifractal analysis of turbulent velocity using wavelet leaders”, Eur. Phys. J. B., vol. 61, no. 2, p. 201–215, 2008.
[LEV 95] LÉVY VÉHEL J., “Fractal approaches in signal processing”, Fractals: Symposium in Honor of Benoît Mandelbrot (Curaçao), vol. 3, no. 4, p. 755–775, 1995.
[LEV 97] LÉVY VÉHEL J., RIEDI R., “Fractional Brownian motion and data traffic modeling: the other end of the spectrum”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering, Springer-Verlag, 1997.
[MAN 95] MANDELBROT B., Les Objets Fractals, Flammarion, 1995.


[MAN 97] MANDELBROT B., Fractals and Scalings in Finance, Springer, 1997.
[MAN 98] MANDELBROT B., Multifractals and 1/f Noise, Springer, 1998.
[MEY 90] MEYER Y., Ondelettes et opérateurs, Hermann, 1990.
[MEY 98] MEYER Y., “Wavelets, vibrations, and scalings”, in CRM Series AMS, University of Montreal Press, vol. 9, 1998.
[PEL 96] PELTIER R., LÉVY VÉHEL J., “Multifractional Brownian Motion: Definitions and Preliminary Results”, Technical Report, 1996.
[PRU 81] PRUITT W., “The growth of random walks and Lévy processes”, Annals of Probability, vol. 9, no. 6, p. 948–956, 1981.
[SCH 95] SCHLESINGER, ZASLAVSKY, FRISCH (Eds.), Lévy Flights and Related Topics (Proceedings of the Nice Workshop, 1994), Springer-Verlag, 1995.
[TAQ 97] TAQQU M., TEVEROVSKY V., WILLINGER W., “Is network traffic self-similar or multifractal?”, Fractals, vol. 5, no. 1, p. 63–73, 1997.
[TRI 97] TRICOT C., “Function norms and fractal dimension”, SIAM J. Math. Anal., vol. 28, p. 189–212, 1997.
[VER 94] VERGASOLA M., DUBRULLE B., FRISCH U., NOULLEZ A., “Burgers’ equation, devil’s staircase and the mass distribution for large-scale structures”, Astronomy and Astrophysics, vol. 289, p. 325–356, 1994.
[WEN 07] WENDT H., ABRY P., JAFFARD S., “Bootstrap for empirical multifractal analysis”, IEEE Signal Processing Magazine, vol. 24, no. 4, p. 38–48, 2007.
[WIL 96] WILLINGER W., TAQQU M., ERRAMILLI A., “A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks”, in KELLY F.P. et al. (Eds.), Stochastic Networks: Theory and Applications (Selected Papers of the Royal Statistical Society Research Workshop, August 1995), Royal Statistical Society Lecture Notes Series, vol. 4, Clarendon Press, Oxford, p. 339–366, 1996.
[WIL 00] WILLINGER W., RIEDI R., TAQQU M., “Long-range dependence and data network traffic”, in DOUKHAN O., TAQQU M. (Eds.), Long-range Dependence: Theory and Applications, Birkhäuser, 2000.


Chapter 4

Multifractal Scaling: General Theory and Approach by Wavelets

Chapter written by Rudolf RIEDI.

4.1. Introduction and summary

Fractal processes have been successfully applied in various fields such as the theory of fully developed turbulence [MAN 74, FRI 85, BAC 93], stock market modeling [EVE 95, MAN 97, MAN 99], and more recently in the study of network data traffic [LEL 94, NOR 94]. In networking, models using fractional Brownian motion (FBM) have helped advance the field through their ability to capture fractal features such as statistical self-similarity and long-range dependence (LRD). It has been recognized, however, that multifractal features need to be accounted for further, so as to gain a better understanding of network traffic, but also of stock exchange [RIE 97a, RIE 99, RIE 00, FEL 98, MAN 97]. In short, there is a call for more versatile models which can, for example, incorporate LRD and multifractal properties independently of each other.

Roughly speaking, a fractal entity is characterized by the inherent, ubiquitous occurrence of irregularities, which govern its shape and complexity. The most prominent example is certainly FBM $B_H(t)$ [MAN 68]. Its paths are almost surely continuous but not differentiable. Indeed, the oscillation of FBM in any interval of size $\delta$ is of the order $\delta^H$, where $H\in(0,1)$ is the self-similarity parameter:

$$B_H(at)\stackrel{fd}{=}a^HB_H(t).\tag{4.1}$$

140

Scaling, Fractals and Wavelets

Real world signals, on the other hand, often possess an erratically changing oscillation exponent, limiting the appropriateness of FBM as a model. Due to the various exponents present in such signals, they have been termed multifractals.

This chapter’s main objective is to present the framework for describing and detecting such a multifractal scaling structure. In doing so we survey local and global multifractal analysis and relate them via the multifractal formalism in a stochastic setting. Thereby, the importance of higher order statistics will become evident. It might be especially appealing to the reader to see wavelets put to novel use. We focus mainly on the analytical computation of the so-called multifractal spectra, and on their mutual relations, dwelling extensively on variations of binomial cascades. Statistical properties of estimators of multifractal quantities, as well as modeling issues, are addressed elsewhere (see [GON 98, ABR 00, GON 99, MAN 97, RIE 99, RIB 06]). The remainder of this introduction provides a summary of the contents of the paper, roughly following its structure.

4.2. Singularity exponents

For simplicity we consider processes $Y$ over a probability space $(\Omega, \mathcal{F}, P_\Omega)$ and defined on a compact interval, which we assume without loss of generality to be $[0, 1]$. Generalization to higher dimensions is straightforward, and the extension to processes defined on $\mathbb{R}$ is simple and will be indicated.

4.2.1. Hölder continuity

The erratic behavior or, more precisely, the degree of local Hölder regularity of a continuous process $Y(t)$ at a fixed given time $t$ can be characterized to a first approximation by comparison with an algebraic function: $Y$ is said to be in $C^h_t$ if there is a polynomial $P_t$ such that $|Y(s) - P_t(s)| \le C|s - t|^h$ for $s$ sufficiently close to $t$. If $P_t$ is a constant, i.e. $P_t(s) = Y(t)$ for all $s$, then $Y$ is in $C^h_t$ for all $h < \underline{h}(t)$ and not in $C^h_t$ for all $h > \underline{h}(t)$, where

$$\underline{h}(t) := \liminf_{\varepsilon \to 0} \frac{1}{\log_2(2\varepsilon)} \log_2 \sup_{|s-t|<\varepsilon} |Y(s) - Y(t)|. \tag{4.2}$$

On the other hand, it is easy to prove the following:

LEMMA 4.1.– If $\underline{h}(t) \notin \mathbb{N}$ then $P_t$ is a constant, and $\underline{h}(t) = \sup\{h : Y \in C^h_t\}$.

As the example $Y(s) = s^2 + s^{2.4}$ with $t = 0$ shows, the conclusion does not necessarily hold when $\underline{h}(t) \in \mathbb{N}$. Here, $|Y(s) - Y(0)| \sim s^2$ for $s \sim 0$, thus $\underline{h}(0) = 2$, while $P_0(s) = s^2$, $Y(s) - P_0(s) = s^{2.4}$, and thus $\sup\{h : Y \in C^h_0\} = 2.4$.
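The example accompanying Lemma 4.1 can be checked numerically. The following Python sketch is an illustration only: the helper names `oscillation` and `scaling_slope`, the grid size and the scale `eps` are assumptions made here, and the exponent is read off as the slope of the logarithmic oscillation between the scales eps/2 and eps rather than through the limit (4.2) itself, which converges slowly.

```python
import numpy as np

def oscillation(Y, t, eps, num=2001):
    """Approximate sup |Y(s) - Y(t)| over |s - t| < eps on a uniform grid."""
    s = np.linspace(t - eps, t + eps, num)
    return float(np.max(np.abs(Y(s) - Y(t))))

def scaling_slope(Y, t, eps):
    """Slope of log2(oscillation) between the scales eps/2 and eps."""
    return float(np.log2(oscillation(Y, t, eps) / oscillation(Y, t, eps / 2)))

def Y(s):
    # |s| keeps the non-integer power real for s < 0 (an assumption of this sketch).
    return s**2 + np.abs(s) ** 2.4

def residual(s):
    # Y minus the polynomial P_0(s) = s^2 of the example.
    return Y(s) - s**2

eps = 2.0**-20
slope_raw = scaling_slope(Y, 0.0, eps)              # governed by the s^2 term
slope_residual = scaling_slope(residual, 0.0, eps)  # governed by the s^2.4 term
```

The first slope comes out close to 2 and the second close to 2.4, which matches the two exponents contrasted in the example.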


Proof. Assume there is $h > \underline{h}(t)$ and $P_t(s)$ such that $Y \in C^h_t$. We will argue that $\underline{h}(t)$ must be an integer in this case. Note first that $P_t$ is not constant, by definition of $\underline{h}(t)$, and we may write $P_t(s) = Y(t) + (s - t)^m \cdot Q(s)$ for some integer $m \ge 1$ and some polynomial $Q$ without zero at $t$. Assume first that $m < \underline{h}(t)$ and choose $h'$ such that $m < h' < \underline{h}(t)$. Writing $Y(s) - P_t(s) = (Y(s) - Y(t)) - (P_t(s) - Y(t))$, the first term is smaller than $|s - t|^{h'}$ and the second term, decaying as $C|s - t|^m$, governs. Whence $h = m < \underline{h}(t)$, against the assumption. Assuming $m > \underline{h}(t)$, choose $h'$ such that $m > h' > \underline{h}(t)$ and a sequence $s_n$ such that $|Y(s_n) - Y(t)| \ge |s_n - t|^{h'}$, whence $|Y(s_n) - P_t(s_n)| \ge (1/2)|s_n - t|^{h'}$ for large $n$ and $h \le h'$. Letting $h' \to \underline{h}(t)$ we again obtain a contradiction. We conclude that $\underline{h}(t)$ equals $m$.

For reasons of symmetry we define

$$\overline{h}(t) := \limsup_{\varepsilon \to 0} \frac{1}{\log_2(2\varepsilon)} \log_2 \sup_{|s-t|<\varepsilon} |Y(s) - Y(t)|. \tag{4.3}$$

If $\underline{h}(t)$ and $\overline{h}(t)$ coincide we denote the common value by $h(t)$. We note first that the continuous limit in (4.2) may be replaced by a discrete limit. To this end we introduce $k_n(t) := \lfloor t 2^n \rfloor$, an integer defined uniquely by

$$t \in I^n_{k_n} := [k_n 2^{-n}, (k_n + 1) 2^{-n}[. \tag{4.4}$$

As $n$ increases the intervals $I^n_{k_n}$ form a nested decreasing sequence (compare Figure 4.1). Provided $n$ is chosen such that $2^{-n+1} \le \varepsilon < 2^{-n+2}$ we have

$$[(k_n - 1)2^{-n}, (k_n + 2)2^{-n}[ \;\subset\; [t - \varepsilon, t + \varepsilon[ \;\subset\; [(k_{n-2} - 1)2^{-n+2}, (k_{n-2} + 2)2^{-n+2}[$$

from which it follows immediately that

$$\underline{h}(t) = \liminf_{n \to \infty} h^n_{k_n}, \qquad \overline{h}(t) = \limsup_{n \to \infty} h^n_{k_n},$$

where

$$h^n_{k_n} := -\frac{1}{n} \log_2 \sup\left\{ |Y(s) - Y(t)| : s \in [(k_n - 1)2^{-n}, (k_n + 2)2^{-n}[ \right\}. \tag{4.5}$$

It is essential to note that the countable set of numbers $h^n_{k_n}$ contains all the scaling information of interest to us. Being defined pathwise, they are random variables.
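As an illustration of (4.4) and (4.5), the sketch below evaluates the coarse exponent for a signal with a known cusp. It is a minimal example under stated assumptions: the function names, the test signal $|s - t_0|^h$ with $t_0 = 1/3$ and $h = 0.5$, and the grid-based approximation of the supremum are choices made here and are not taken from the text.

```python
import math
import numpy as np

def coarse_hoelder_exponent(Y, t, n, num=4097):
    """h^n_{k_n} of (4.5), with the supremum approximated on a uniform grid."""
    k = math.floor(t * 2**n)                       # k_n(t) from (4.4)
    left, right = (k - 1) * 2.0**-n, (k + 2) * 2.0**-n
    s = np.linspace(left, right, num)
    osc = float(np.max(np.abs(Y(s) - Y(t))))
    return -math.log2(osc) / n

def cusp(s, t0=1.0 / 3.0, h=0.5):
    """Hypothetical test signal |s - t0|^h with a cusp of exponent h at t0."""
    return np.abs(s - t0) ** h

levels = (10, 20, 30)
exponents = [coarse_hoelder_exponent(cusp, 1.0 / 3.0, n) for n in levels]
```

For this signal the oscillation over the enlarged dyadic interval lies between $2^{-nh}$ and $2^{-(n-1)h}$, so each computed value lies between $h(1 - 1/n)$ and $h$ and approaches $h$ as $n$ grows.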


4.2.2. Scaling of wavelet coefficients

A convenient tool for scaling analysis is found in the wavelet transform, both in its discrete and continuous forms. The discrete transform, for example, allows us to represent a 1D process $Y(t)$ in terms of shifted and dilated versions of a prototype bandpass wavelet function $\psi(t)$, and shifted versions of a low-pass scaling function $\phi(t)$ [DAU 92, VET 95]. While such representations exist also in the framework of continuous wavelet transforms, we use the latter mainly as a “microscope” in this chapter. In the vocabulary of Hilbert spaces, the discrete wavelet and scaling functions

$$\psi_{j,k}(t) := 2^{j/2}\,\psi(2^j t - k), \qquad \phi_{j,k}(t) := 2^{j/2}\,\phi(2^j t - k), \qquad j, k \text{ integer} \tag{4.6}$$

form an orthonormal basis and we have the representations [DAU 92, VET 95]

$$Y(t) = \sum_k D_{J_0,k}\,\phi_{J_0,k}(t) + \sum_{j=J_0}^{\infty} \sum_k C_{j,k}\,\psi_{j,k}(t), \tag{4.7}$$

with

$$C_{j,k} := \int Y(t)\,\psi^*_{j,k}(t)\,dt, \qquad D_{j,k} := \int Y(t)\,\phi^*_{j,k}(t)\,dt. \tag{4.8}$$

The wavelet coefficient $C_{j,k}$ measures the signal content around time $2^{-j}k$ and frequency $2^j f_0$, provided that the wavelet $\psi(t)$ is centered at time zero and frequency $f_0$. The scaling coefficient $D_{j,k}$ measures the local mean around time $2^{-j}k$. In the wavelet transform, $j$ indexes the scale of analysis: $J_0$ can be chosen freely and indicates the coarsest scale or lowest resolution available in the representation.

The simplest example of an orthonormal wavelet basis is given by the Haar scaling and wavelet functions (see Figure 4.1a). Here, $\phi$ is the indicator function of the unit interval, while $\psi = \phi(2\,\cdot) - \phi(2\,\cdot - 1)$. For a process supported on the unit interval a convenient choice is thus $J_0 = 0$. The supports of the fine-scale scaling functions nest inside the supports of those at coarser scales; this can be neatly represented by the binary tree structure of Figure 4.1b. Row (scale) $j$ of this scaling coefficient tree contains an approximation to $Y(t)$ of resolution $2^{-j}$. Row $j$ of the complementary wavelet coefficient tree (not shown) contains the details in scale $j + 1$ of the scaling coefficient tree that are suppressed in scale $j$. In fact, for the Haar wavelet we have

$$\begin{aligned} D_{j,k} &= 2^{-1/2}\,(D_{j+1,2k} + D_{j+1,2k+1}), \\ C_{j,k} &= 2^{-1/2}\,(D_{j+1,2k} - D_{j+1,2k+1}). \end{aligned} \tag{4.9}$$
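The recursion (4.9) translates directly into a fine-to-coarse pyramid. The sketch below is a minimal illustration, assuming that the finest-scale scaling coefficients are given as an array whose length is a power of two; the names `haar_step` and `haar_pyramid` and the sample values are hypothetical.

```python
import numpy as np

def haar_step(d_fine):
    """One application of (4.9): from D_{j+1,.} to (D_{j,.}, C_{j,.})."""
    d_fine = np.asarray(d_fine, dtype=float)
    even, odd = d_fine[0::2], d_fine[1::2]      # D_{j+1,2k} and D_{j+1,2k+1}
    d_coarse = (even + odd) / np.sqrt(2.0)
    c_coarse = (even - odd) / np.sqrt(2.0)
    return d_coarse, c_coarse

def haar_pyramid(d_finest):
    """Return D_{0,0} and the wavelet coefficients C_{j,.}, ordered j = 0, 1, ..."""
    d = np.asarray(d_finest, dtype=float)
    details = []
    while d.size > 1:
        d, c = haar_step(d)
        details.append(c)
    return float(d[0]), details[::-1]           # reverse: coarse to fine

# Hypothetical scaling coefficients at scale j = 2
d0, details = haar_pyramid([4.0, 2.0, 5.0, 7.0])
```

Because the Haar basis is orthonormal, the sum of squares of the input equals the sum of squares of the coarsest scaling coefficient and all wavelet coefficients, which gives a simple sanity check on any implementation.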

Figure 4.1. (a) The Haar scaling and wavelet functions $\phi_{j,k}(t)$ and $\psi_{j,k}(t)$. (b) Binary tree of scaling coefficients from coarse to fine scales

Wavelet decompositions contain considerable information on the singularity behavior of a process $Y$. Indeed, adapting the argument of [JAF 95, p. 291] and

correcting for the $L^2$ wavelet normalization used here (as opposed to $L^1$ in [JAF 95]), it is easily shown that $|Y(s) - Y(t)| = O(|s - t|^h)$ implies that

$$2^{n/2}\,|C_{n,k_n}| = O\!\left(2^{-nh}\right). \tag{4.10}$$

This holds for any $h > 0$ and any compactly supported wavelet. As a matter of fact only $\int \psi = 0$ is needed to obtain this result since the Taylor polynomial of $Y$ is implicitly assumed to be constant. If we are interested in an analysis only, we may thus consider analyzing wavelets $\psi$ such as derivatives of the Gaussian $\exp(-x^2)$ which do not necessarily form a basis. To distinguish them from the orthogonal wavelets we will call them “analyzing wavelets”. In order to invert (4.10), however, we need representation (4.7) as well as some knowledge concerning the decay of the maximum of the wavelet coefficients in the vicinity of $t$ and sufficient wavelet regularity. For a precise statement, see [JAF 95] and [DAU 92, Theorem 9.2].

All this suggests that replacing the oscillation in the definition (4.5) of $h^n_{k_n}$ by the left-hand side of (4.10) would produce an alternative description of the local behavior of $Y$. Consequently, we set

$$\underline{w}(t) := \liminf_{n \to \infty} w^n_{k_n}, \qquad \overline{w}(t) := \limsup_{n \to \infty} w^n_{k_n}, \tag{4.11}$$

where

$$w^n_{k_n} := -\frac{1}{n} \log_2 \left| 2^{n/2}\, C_{n,k_n} \right|. \tag{4.12}$$

If $\underline{w}(t)$ and $\overline{w}(t)$ coincide we denote the common value by $w(t)$. Using wavelets has the advantage of yielding an analysis which is largely unaffected by polynomial trends in $Y$ due to vanishing moments $\int t^m \psi(t)\,dt = 0$ which are typically built into wavelets [DAU 92]. In this context recall Lemma 4.1. It has the disadvantage of complicating the analysis since the maxima of wavelet coefficients have to be considered for a reliable estimation of true Hölder continuity [JAF 95, DAU 92, JAF 97, BAC 93]. In any case, the decay of wavelet coefficients is interesting in itself as it relates to LRD (compare [ABR 95]) and regularity spaces such as Besov spaces [RIE 99].

4.2.3. Other scaling exponents

Traditional multifractal analysis of a singular measure $\mu$ on the line constitutes a study of the singularity structure of its primitive $\mathcal{M}$ given by

$$\mathcal{M}(t) = \int_0^t \mu(ds) = \mu([0, t]). \tag{4.13}$$

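Since $\mathcal{M}$ is an almost surely increasing process, the coarse exponents $h^n_{k_n}$ (see (4.5)) simplify to $h^n_{k_n} = -\frac{1}{n} \log_2 |\mathcal{M}((k_n + 2)2^{-n}) - \mathcal{M}((k_n - 1)2^{-n})|$. We are therefore motivated (some would say “seduced”) to study an even simpler notion of a coarse exponent:

$$\underline{\alpha}(t) := \liminf_{n \to \infty} \alpha^n_{k_n}, \qquad \overline{\alpha}(t) := \limsup_{n \to \infty} \alpha^n_{k_n}, \tag{4.14}$$

where

$$\alpha^n_{k_n} := -\frac{1}{n} \log_2 \left| \mathcal{M}((k_n + 1)2^{-n}) - \mathcal{M}(k_n 2^{-n}) \right| = -\frac{1}{n} \log_2 \mu\!\left(I^n_{k_n}\right). \tag{4.15}$$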

If $\underline{\alpha}(t)$ and $\overline{\alpha}(t)$ coincide we denote the common value by $\alpha(t)$. This exponent $\alpha(t)$ has attracted considerable attention in the multifractal community, largely because of its simplicity. In [LV 98] various examples of more general exponents were introduced, all of which are so-called Choquet capacities, a notion which is not needed to develop the multifractal formalism. As an interesting alternative, [PEY 98] considers an arbitrary function $\xi(I)$ from the space of all intervals to $\mathbb{R}^+$ (instead of only the $I^n_k$) and develops a multifractal formalism similar to ours. There, it is suggested that we consider the oscillations of $Y$ around the mean, i.e.

$$\xi(I) := \int_I \left| Y(t) - \frac{\int_I Y(s)\,ds}{|I|} \right| dt. \tag{4.16}$$

Proceeding as with $h^n_{k_n}$, we are led to the singularity exponent $-(1/n) \log_2(\xi(I^n_{k_n}))$, which is of particular interest since it can be used to define oscillation spaces, such as Sobolev spaces and Besov spaces. Another useful choice consists of interpolating $Y$ in the interval $I$ by the linear function $a_I + b_I t$ and considering

$$\xi(I) := \left( \int_I \left( Y(t) - (a_I + b_I t) \right)^2 dt \right)^{1/2}. \tag{4.17}$$

This exponent measures the variability of $Y$ and is related to the dimension of the paths of $Y$. Deducting constant, resp. linear terms in formulae (4.16) and (4.17) reminds us of the use of wavelets with one, resp. two vanishing moments.

4.3. Multifractal analysis

Multifractal analysis has been discovered and developed in [MAN 74, FRI 85, KAH 76, GRA 83, HEN 83, HAL 86, CUT 86, CAW 92, BRO 92, BAC 93, MAN 90b, HOL 92, FAL 94, OLS 94, ARB 96, JAF 97, PES 97, RIE 95a, MAN 02, BAC 03, BAR 04, BAR 02, CHA 05, JAF 99], to give only a short list of some relevant work done in this area. The main insight consisted of the fact that local scaling exponents on fractals as measured by $h(t)$, $\alpha(t)$ or $w(t)$ are not uniform or continuous as a function of $t$, in general. In other words, $h(t)$, $\alpha(t)$ and $w(t)$ typically change in an erratic way as a function of $t$, thus imprinting a rich structure on the object of interest. This structure can be captured either in geometric terms, making use of the concept of dimensions, or in statistical terms based on sample moments. A useful connection between these two descriptions emerges from the multifractal formalism.

As we will see, as far as the multifractal formalism is concerned there is no restriction in choosing a singularity exponent which seems fit for describing the scaling behavior of interest. To express this fact we consider in this section the arbitrary scaling exponents

$$\underline{s}(t) := \liminf_{n \to \infty} s^n_{k_n} \qquad \text{and} \qquad \overline{s}(t) := \limsup_{n \to \infty} s^n_{k_n}, \tag{4.18}$$

where $s^n_k$ ($k = 0, \ldots, 2^n - 1$, $n \in \mathbb{N}$) is any sequence of random variables. To keep a connection with what was said before, think of $s^n_k$ as representing a coarse scaling exponent of $Y$ over the dyadic interval $I^n_k$.

4.3.1. Dimension based spectra

A geometric description of the erratic behavior of a multifractal’s scaling exponents can be achieved using a quantification of the prevalence of particular exponents in terms of fractal dimensions as follows. We consider the sets $K_a$ which are defined pathwise in terms of the limiting behavior of $s^n_{k_n}$ as $n \to \infty$, as

$$E_a := \{t : \underline{s}(t) = a\}, \qquad \overline{E}_a := \{t : \overline{s}(t) = a\}, \qquad K_a := \{t : s(t) = a\}. \tag{4.19}$$

These sets $K_a$ are typically “fractal”, meaning loosely that they have a complicated geometric structure and more precisely that their dimensions are non-integer. A compact description of the singularity structure of $Y$ is therefore in terms of the following so-called Hausdorff spectrum

$$d(a) := \dim(K_a), \tag{4.20}$$

where $\dim(E)$ denotes the Hausdorff dimension of the set $E$ [TRI 82].


The sets $E_a$ ($a \in \mathbb{R}$), and also $\overline{E}_a$, form a multifractal decomposition of the support of $Y$. We will loosely address $Y$ as a multifractal if this decomposition is rich, i.e. if the sets $E_a$ ($a \in \mathbb{R}$) are highly interwoven or even dense in the support of $Y$. However, the study of singular measures (deterministic and random) has often been restricted to the simpler sets $K_a$ and their spectrum $d(a)$ [KAH 76, CAW 92, FAL 94, ARB 96, OLS 94, RIE 98, RIE 95a, RIE 95b, BAR 97]. With the theory developed here (Lemma 4.2) it becomes clear that most of these results extend to provide formulae for $\dim(E_a)$ and $\dim(\overline{E}_a)$ as well. This aspect of multifractal analysis has been of much interest to the mathematical community.

4.3.2. Grain based spectra

An alternative description of the prevalence of singularity exponents, statistical in nature due to the counting involved, is

$$f(a) := \lim_{\varepsilon \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} \log_2 N^n(a, \varepsilon), \tag{4.21}$$

where¹

$$N^n(a, \varepsilon) := \#\{k = 0, \ldots, 2^n - 1 : a - \varepsilon \le s^n_k < a + \varepsilon\}. \tag{4.22}$$
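The counting in (4.22) is straightforward to carry out on a finite set of coarse exponents. The following sketch is only an illustration: the function names and the eight sample exponents are hypothetical, and the quantity returned by `coarse_grain_spectrum` is the pre-limit expression $(1/n) \log_2 N^n(a, \varepsilon)$ at one fixed resolution, not the double limit (4.21).

```python
import math
import numpy as np

def grain_count(s, a, eps):
    """N^n(a, eps) of (4.22): number of k with a - eps <= s_k < a + eps."""
    s = np.asarray(s, dtype=float)
    return int(np.count_nonzero((s >= a - eps) & (s < a + eps)))

def coarse_grain_spectrum(s, a, eps):
    """(1/n) log2 N^n(a, eps) at the single resolution n = log2(len(s))."""
    n = int(round(math.log2(len(s))))
    count = grain_count(s, a, eps)
    return math.log2(count) / n if count > 0 else float("-inf")

# Hypothetical coarse exponents at level n = 3 (eight dyadic intervals)
exponents = [0.4, 0.9, 0.9, 1.1, 0.9, 1.1, 1.1, 1.6]
value_center = coarse_grain_spectrum(exponents, 1.0, 0.2)
value_empty = coarse_grain_spectrum(exponents, 3.0, 0.1)
```

An empty bin yields $-\infty$, which is the convention used for $f(a)$ when no exponents near $a$ are observed.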

This notion has grown out of the difficulty faced by any real world application, that the calculation of actual Hausdorff dimensions is often hard, if not impossible. Using a mesh of given grain size as in (4.22) instead of arbitrary coverings as in $\dim(K_a)$ leads generally to simpler notions. However, $f$ should not be regarded as an auxiliary vehicle but recognized for its own merit, which will become apparent in the remainder of this section. Our first remark on $f(a)$ concerns the fact that the counting used in its definition, i.e. $N^n(a, \varepsilon)$, may be used to estimate box dimensions. Based on this fact it was shown in [RIE 98] that

$$\dim(K_a) \le f(a). \tag{4.23}$$

Here, we state a slightly improved version:

LEMMA 4.2.–

$$\dim(E_a) \le f(a), \qquad \dim(\overline{E}_a) \le f(a) \tag{4.24}$$

and

$$\dim(K_a) \le \underline{f}(a) := \lim_{\varepsilon \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} \log_2 N^n(a, \varepsilon). \tag{4.25}$$

1. More generally, using $c$-ary intervals in Euclidean space $\mathbb{R}^d$, $k_n$ will range from 0 to $c^{nd} - 1$. Logarithms will have to be taken to the base $c$ since we seek the asymptotics of $N^n(a, \varepsilon)$ in terms of a power law of resolution at stage $n$, i.e. $N^n(a, \varepsilon) \simeq c^{nf(a)}$. The maximum value of $f(a)$ will be $d$.

It follows immediately that $\dim(K_a) \le \dim(E_a) \le f(a)$, but $\dim(E_a)$ is not necessarily smaller than $\underline{f}(a)$.

4.3.3. Partition function and Legendre spectrum

The second comment regarding the grain spectrum $f(a)$ concerns its interpretation as a large deviation principle (LDP). We may consider $N^n(a, \varepsilon)/2^n$ to be the probability of finding (for a fixed realization of $Y$) a number $k_n \in \kappa_n := \{0, \ldots, 2^n - 1\}$ such that $s^n_{k_n} \in [a - \varepsilon, a + \varepsilon]$. Typically, there will be one value of $s(t)$ that appears most frequently, denoted $\hat{a}$, and $f(a)$ will reach its maximum 1 at $a = \hat{a}$. However, by definition, for $a \ne \hat{a}$ the chance to observe coarse exponents $s^n_{k_n}$ which lie in $[a - \varepsilon, a + \varepsilon]$ will decrease exponentially fast with a rate given by $f(a)$. Appealing to the theory of LDPs we consider the random variable $A_n = -n s^n_K \ln(2)$, where $K$ is randomly picked from $\kappa_n = \{0, \ldots, 2^n - 1\}$ with uniform distribution $U_n$ (recall that we study one fixed realization or path of $Y$), and define its “logarithmic moment generating function” or partition function

$$\tau(q) := \liminf_{n \to \infty} -\frac{1}{n} \log_2 S^n(q), \tag{4.26}$$

where

$$S^n(q) := \sum_{k=0}^{2^n - 1} \exp\left(-q n s^n_k \ln(2)\right) = \sum_{k=0}^{2^n - 1} 2^{-nq s^n_k} = 2^n\, E_n\!\left[ 2^{-nq s^n_K} \right]. \tag{4.27}$$

Here, $E_n$ stands for expectation with respect to $U_n$. The Gärtner-Ellis theorem [ELL 84] then applies and yields the following result (see [RIE 95a] for a slightly stronger version):

THEOREM 4.1.– If the limit

$$\tau(q) = \lim_{n \to \infty} -\frac{1}{n} \log_2 S^n(q) \tag{4.28}$$

exists and is finite for all $q \in \mathbb{R}$, and if $\tau(q)$ is a differentiable function of $q$, then the double limit

$$f(a) = \lim_{\varepsilon \downarrow 0} \lim_{n \to \infty} \frac{1}{n} \log_2 N^n(a, \varepsilon) \tag{4.29}$$

exists, in particular $f(a) = \underline{f}(a)$, and

$$f(a) = \tau^*(a) := \inf_{q \in \mathbb{R}} \left( qa - \tau(q) \right) \tag{4.30}$$

for all $a$.

Proof. Applying [ELL 84, Theorem II] to our situation immediately gives

$$\limsup_{n \to \infty} \frac{1}{n} \log_2 N^n(a, \varepsilon) \le \limsup_{n \to \infty} \frac{1}{n} \log_2 \#\{k : |s^n_k - a| \le \varepsilon\} \le \sup_{|a' - a| \le \varepsilon} \tau^*(a')$$

and

$$\liminf_{n \to \infty} \frac{1}{n} \log_2 N^n(a, \varepsilon) \ge \liminf_{n \to \infty} \frac{1}{n} \log_2 \#\{k : |s^n_k - a| < \varepsilon\} \ge \sup_{|a' - a| < \varepsilon} \tau^*(a').$$

By continuity of $\tau^*(a)$ these two bounds coincide and (4.29) is established. Now, letting $\varepsilon \to 0$ shows that $f(a) = \tau^*(a)$.

Sometimes, the differentiability assumptions of this theorem are too restrictive. Before dwelling more on the relation between $\tau$ and $f$ in section 4.4 let us note a simple fact, also providing a simple reason why the Legendre transform appears in this context.

LEMMA 4.3.– We always have

$$f(a) \le \tau^*(a). \tag{4.31}$$

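Proof. Fix $q \in \mathbb{R}$ and consider $a$ with $f(a) > -\infty$. Let $\gamma < f(a)$ and $\varepsilon > 0$. Then, we take $n$ arbitrarily large such that $N^n(a, \varepsilon) \ge 2^{n\gamma}$. For such $n$ we bound $S^n(q)$ by noting

$$\sum_{k=0}^{2^n - 1} 2^{-nq s^n_k} \ge \sum_{|s^n_k - a| < \varepsilon} 2^{-nq s^n_k} \ge N^n(a, \varepsilon)\, 2^{-n(qa + |q|\varepsilon)} \ge 2^{-n(qa - \gamma + |q|\varepsilon)} \tag{4.32}$$

and hence $\tau(q) \le qa - \gamma + |q|\varepsilon$. Letting $\varepsilon \to 0$ and $\gamma \to f(a)$, we find $\tau(q) \le qa - f(a)$. Since this is trivial if $f(a) = -\infty$ we find

$$\tau(q) \le qa - f(a) \qquad \text{and} \qquad f(a) \le qa - \tau(q) \qquad \text{for all } a \text{ and } q. \tag{4.33}$$

From this it follows trivially that $\tau(q) \le f^*(q)$ and $f(a) \le \tau^*(a)$.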
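The inequality $f(a) \le \tau^*(a)$ can be observed at finite resolution on a simple example. The sketch below is an illustration under explicit assumptions: it uses a deterministic binomial measure with a hypothetical multiplier `M0 = 0.3`, for which exactly $\binom{n}{j}$ dyadic intervals carry the mass $M_0^{n-j} M_1^{j}$, and it replaces all limits by their values at one fixed level. The Legendre transform is approximated by a minimum over a finite grid of $q$ values, which can only overestimate the infimum.

```python
import math
import numpy as np

M0, N_LEVEL = 0.3, 16            # hypothetical multiplier and resolution level
M1 = 1.0 - M0

# Exponent shared by the C(n, j) intervals of mass M0^(n-j) * M1^j
j = np.arange(N_LEVEL + 1)
a_j = -((N_LEVEL - j) * math.log2(M0) + j * math.log2(M1)) / N_LEVEL
counts = np.array([math.comb(N_LEVEL, int(i)) for i in j], dtype=float)
f_n = np.log2(counts) / N_LEVEL  # finite-n grain spectrum at the exponents a_j

def tau_n(q):
    """-(1/n) log2 S^n(q), with the sum S^n(q) grouped by exponent value."""
    s_n = float(np.sum(counts * 2.0 ** (-N_LEVEL * q * a_j)))
    return -math.log2(s_n) / N_LEVEL

qs = np.linspace(-10.0, 10.0, 801)
tau_vals = np.array([tau_n(q) for q in qs])

def legendre(a):
    """Grid approximation of tau^*(a) = inf_q (q a - tau(q))."""
    return float(np.min(qs * a - tau_vals))

tau_star = np.array([legendre(a) for a in a_j])
```

For this example $S^n(q) = (M_0^q + M_1^q)^n$, so `tau_n(q)` agrees with $-\log_2(M_0^q + M_1^q)$ up to rounding, and every entry of `f_n` stays below the corresponding entry of `tau_star`.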


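With the special choice $s^n_k = \alpha^n_k$ for the distribution function $\mathcal{M}$ of a measure $\mu$, $S^n(q)$ becomes

$$S^n_\alpha(q) = \sum_{k=0}^{2^n - 1} \left| \mathcal{M}((k + 1)2^{-n}) - \mathcal{M}(k 2^{-n}) \right|^q = \sum_{k=0}^{2^n - 1} \left( \mu(I^n_k) \right)^q. \tag{4.34}$$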

This is the original form in which $\tau(q)$ has been introduced in multifractal analysis [HAL 86, HEN 83, FRI 85, MAN 74]. Note that there is a close connection to the thermodynamic formalism [TEL 88].

4.3.4. Deterministic envelopes

An analytical approach is often useful in order to gain an intuition on the various spectra of a typical path of $Y$, or at least some estimate of it. To establish such an approach, we consider the position, i.e. $t$ or $k_n$, as well as the path $Y$ to be random simultaneously. We then apply the LDP to the larger probability space. More precisely, the exponents $s^n_K$ are now random variables over $(\Omega \times \kappa_n, P_\Omega \times U_n)$. The “deterministic partition function” corresponding to this setting reads as

$$T(q) := \liminf_{n \to \infty} -\frac{1}{n} \log_2 E_\Omega[S^n(q)]. \tag{4.35}$$

NOTE 4.1 (Ergodic processes).– So far, we have assumed in the definitions of $\tau(q)$ and $T(q)$ that $Y$ is defined on a compact interval. Without loss of generality, this interval was assumed to be $[0, 1]$. In order to allow for processes defined on $\mathbb{R}$ we modify $S^n(q)$ to

$$S^n(q) := \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N 2^n - 1} 2^{-nq s^n_k}$$

and $N^n(a, \varepsilon)$ similarly. For ergodic processes this becomes $S^n(q) = 2^n E_\Omega[2^{-nq s^n_k}]$ almost surely. Thus, $E_\Omega[S^n(q)] = S^n(q)$ a.s. and

$$T(q) \overset{a.s.}{=} \tau(q, \omega). \tag{4.36}$$

We refer to (4.74) for an account of the extent to which marginal distributions may be reflected in multifractal spectra in general. For processes on $[0, 1]$ we cannot expect to have (4.36) in all generality. Nevertheless, we will point out scenarios where (4.36) holds. Notably $T(q)$ always serves as a deterministic envelope of $\tau(q, \omega)$:

LEMMA 4.4.– With probability one²

$$\tau(q, \omega) \ge T(q) \qquad \text{for all } q \text{ with } T(q) < \infty. \tag{4.37}$$

2. For clarity, we make the randomness of $\tau$ explicit here.


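Proof. Consider any $q$ with finite $T(q)$ and let $\varepsilon > 0$. Let $n_0$ be such that $E_\Omega[S^n(q)] \le 2^{-n(T(q) - \varepsilon)}$ for all $n \ge n_0$. Then,

$$E\!\left[ \limsup_{n \to \infty} 2^{n(T(q) - 2\varepsilon)} S^n(q, \omega) \right] \le E\!\left[ \sum_{n \ge n_0} 2^{n(T(q) - 2\varepsilon)} S^n(q, \omega) \right] \le \sum_{n \ge n_0} 2^{-n\varepsilon} < \infty.$$

Thus, almost surely $\limsup_{n \to \infty} 2^{n(T(q) - 2\varepsilon)} S^n(q, \omega) < \infty$, and $\tau(q) \ge T(q) - 2\varepsilon$. Consequently, this estimate holds with probability one simultaneously for all $\varepsilon = 1/m$ ($m \in \mathbb{N}$) and some countable, dense set of $q$ values with $T(q) < \infty$. Since $\tau(q)$ and $T(q)$ are always concave due to Corollary 4.2 below (see section 4.4), they are continuous on open sets and the claim follows.

Along the same lines we may define the corresponding deterministic grain spectrum. By analogy, we will replace probability over $\kappa_n = \{0, \ldots, 2^n - 1\}$ in (4.21), i.e. $N^n(a, \varepsilon)$, by probability over $\Omega \times \kappa_n$, i.e.

$$\sum_{k=0}^{2^n - 1} P_\Omega[a - \varepsilon \le s^n_k < a + \varepsilon] = 2^n\, E_{\Omega \times \kappa_n}\!\left[ \mathbf{1}_{[a - \varepsilon, a + \varepsilon)}(s^n_K) \right] = E_\Omega[N^n(a, \varepsilon)] \tag{4.38}$$

and define

$$F(a) := \lim_{\varepsilon \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} \log_2 E_\Omega[N^n(a, \varepsilon)]. \tag{4.39}$$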

Replacing $N^n(a, \varepsilon)$ with (4.38) in the proof of Theorem 4.1 and taking expectations in (4.32) we find properties analogous to the pathwise spectra $\tau$ and $f$:

THEOREM 4.2.– For all $a$

$$F(a) \le T^*(a). \tag{4.40}$$

Furthermore, under conditions on $T(q)$ analogous to those on $\tau(q)$ in Theorem 4.1

$$F(a) = T^*(a) = \underline{F}(a) := \lim_{\varepsilon \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} \log_2 E_\Omega[N^n(a, \varepsilon)]. \tag{4.41}$$

It follows from Lemma 4.4 that with probability one $\tau^*(a, \omega) \le T^*(a)$ for all $a$. Similarly, the deterministic grain spectrum $F(a)$ is an upper bound to its pathwise defined random counterpart $f(a, \omega)$, however, only pointwise. On the other hand, we have here almost sure equality under certain conditions.

NOTE 4.2 (Negative dimensions).– Defined through counting, $f(a)$ is always non-negative, or $-\infty$. The envelopes $T^*$ and $F$, being defined through expectations of counts and sums, may assume negative values. Consequently, the negative values of $T^*$ and $F$ are not very useful in the estimation of $f$; however, they do contain further information and can be “observed”. Negative $F(a)$ and $T^*(a)$ have been termed negative dimensions [MAN 90b]. They correspond to probabilities of observing a coarse Hölder exponent $a$ which decay faster than the $2^n = \#\kappa_n$ “samples” $s^n_k$ available in one realization. Oversampling the process, i.e. analyzing several independent realizations, will increase the number of samples, so that rarer $s^n_k$ may be observed. In loose terms, in $\exp(-n \ln(2) F(a))$ independent traces we have a fair chance of seeing at least one $s^n_k$ of size $a$. Thereby, it is essential not to average the spectra $f(a)$ of the various realizations but the numbers $N^n(a, \varepsilon)$. This way, negative “dimensions” $F(a)$ become visible.

4.4. Multifractal formalism

In the previous section, various multifractal spectra were introduced along with some simple relations between them. These can be summarized as follows:

COROLLARY 4.1 (Multifractal formalism).– For every $a$

$$\dim(K_a) \le \dim(E_a) \le f(a) \le \tau^*(a) \overset{a.s.}{\le} T^*(a) \tag{4.42}$$

where the first relations hold pathwise and the last one (the two terms on both sides of the last inequality) with probability one. Similarly

$$\dim(K_a) \le \underline{f}(a) \le f(a) \overset{a.s.}{\le} F(a) \le T^*(a). \tag{4.43}$$

The spectra on the left end have stronger implications on the local scaling structure while the ones on the right end are easier to estimate or calculate.

This set of inequalities could fairly be called the “multifractal formalism”. However, in the mathematical community a slightly different terminology is already established which goes “the multifractal formalism holds” and means that for a particular process (or one of its paths, according to context) $\dim(K_a)$ can be calculated using some adequate partition function (such as $\tau(q)$) and taking its Legendre transform. Consequently, when “the multifractal formalism holds” for a path or process, then we often find that equality holds between several or all spectra appearing in (4.42), depending on the context of the formalism that had been established.

This property (“the multifractal formalism holds”) is a very strong one and suggests the presence of one single underlying multiplicative structure in $Y$. This intuition is supported by the fact that the multifractal formalism is known to “hold” up to now only for objects with strong rescaling properties where multiplication is involved such as self-similar measures, products of processes and infinitely divisible cascades (see [CAW 92, FAL 94, ARB 96, RIE 95a, PES 97, HOL 92], respectively [MAN 02, BAC 03, BAR 04, BAR 02, CHA 05] as well as references therein). A notable exception of processes without injected multiplicative structure are Lévy processes, the multifractal properties of which are well understood due to [JAF 99].

Though we pointed out some conditions for equality between $f$, $\tau^*$ and $T^*$ we must note that in general we may have strict inequality in some or all parts of (4.42). Such cases have been presented in [RIE 95a, RIE 98]. There is, however, one equality which holds under mild conditions and connects the two spectra in the center of (4.42).

THEOREM 4.3.– Consider a realization or path of $Y$. If the sequence $s^n_k$ is bounded, then

$$\tau(q) = f^*(q) \qquad \text{for all } q \in \mathbb{R}. \tag{4.44}$$

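Proof. Note that $\tau(q) \le f^*(q)$ from Lemma 4.3. Now, to estimate $\tau(q)$ from below, choose $\bar{a}$ larger than $|s^n_k|$ for all $n$ and $k$ and group the terms in $S^n(q)$ conveniently, i.e.

$$S^n(q) \le \sum_{i = -\lceil \bar{a}/\varepsilon \rceil}^{\lceil \bar{a}/\varepsilon \rceil} \; \sum_{(i-1)\varepsilon \le s^n_k < (i+1)\varepsilon} 2^{-nq s^n_k} \le \sum_{i = -\lceil \bar{a}/\varepsilon \rceil}^{\lceil \bar{a}/\varepsilon \rceil} N^n(i\varepsilon, \varepsilon)\, 2^{-n(qi\varepsilon - |q|\varepsilon)}. \tag{4.45}$$

Next, we need uniform estimates on $N^n(a, \varepsilon)$ for various $a$. Fix $q \in \mathbb{R}$ and let $\eta > 0$. Then, for every $a \in [-\bar{a}, \bar{a}]$ there is $\varepsilon_0(a)$ and $n_0(a)$ such that $N^n(a, \varepsilon) \le 2^{n(f(a) + \eta)}$ for all $\varepsilon < \varepsilon_0(a)$ and all $n > n_0(a)$. We would like to have $\varepsilon_0$ and $n_0$ independent from $a$ for our uniform estimate. To this end note that $N^n(a', \varepsilon') \le N^n(a, \varepsilon)$ for all $a' \in [a - \varepsilon/2, a + \varepsilon/2]$ and all $\varepsilon' < \varepsilon/2$. By compactness we may choose a finite set of $a_j$ ($j = 1, \ldots, m$) such that the collection $[a_j - \varepsilon_0(a_j)/2, a_j + \varepsilon_0(a_j)/2]$ covers $[-\bar{a}, \bar{a}]$. Set $\varepsilon_1 = (1/2) \min_{j=1,\ldots,m} \varepsilon_0(a_j)$ and $n_1 = \max_{j=1,\ldots,m} n_0(a_j)$. Then, for all $\varepsilon < \varepsilon_1$ and $n > n_1$, and for all $a \in [-\bar{a}, \bar{a}]$ we have $N^n(a, \varepsilon) \le 2^{n(f(a) + \eta)}$ and, thus,

$$S^n(q) \le \sum_{i = -\lceil \bar{a}/\varepsilon \rceil}^{\lceil \bar{a}/\varepsilon \rceil} 2^{-n(qi\varepsilon - f(i\varepsilon) - \eta - |q|\varepsilon)} \le \left( 2\lceil \bar{a}/\varepsilon \rceil + 1 \right) \cdot 2^{-n(f^*(q) - \eta - |q|\varepsilon)}. \tag{4.46}$$

Letting $n \to \infty$ we find $\tau(q) \ge f^*(q) - \eta - |q|\varepsilon$ for all $\varepsilon < \varepsilon_1$. Now we let $\varepsilon \to 0$ and finally $\eta \to 0$ to find the desired inequality.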
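The grouping step behind (4.45) can be checked on synthetic data. The sketch below is an illustration only: the bounded exponents are drawn at random as a stand-in for a real sequence $s^n_k$, the bound `a_bar`, the bin width `eps` and the function names are assumptions made here, and the summation range over $i$ is taken symmetric with a rounded-up end point.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 10
s = rng.uniform(0.5, 1.5, size=2**n)   # hypothetical bounded exponents s^n_k
a_bar, eps = 2.0, 0.05                 # a_bar exceeds |s^n_k| for every k

def partition_sum(q):
    """S^n(q) of (4.27), computed term by term."""
    return float(np.sum(2.0 ** (-n * q * s)))

def grouped_bound(q):
    """Right-hand side of the grouping: sum_i N^n(i eps, eps) 2^{-n(q i eps - |q| eps)}."""
    i_max = math.ceil(a_bar / eps)
    total = 0.0
    for i in range(-i_max, i_max + 1):
        in_bin = (s >= (i - 1) * eps) & (s < (i + 1) * eps)
        total += np.count_nonzero(in_bin) * 2.0 ** (-n * (q * i * eps - abs(q) * eps))
    return float(total)

q_values = (-3.0, -1.0, 0.5, 2.0, 4.0)
gaps = [grouped_bound(q) - partition_sum(q) for q in q_values]
```

Each exponent falls into at least one of the overlapping bins, and within a bin $2^{-nq s^n_k}$ is at most $2^{-n(qi\varepsilon - |q|\varepsilon)}$, so every entry of `gaps` is non-negative.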


Due to the properties of Legendre transforms³ it follows:

COROLLARY 4.2 (Properties of the partition function).– If the sequence $s^n_k$ is bounded, then the partition function $\tau(q)$ is concave and monotone. Consequently, $\tau(q)$ is continuous on $\mathbb{R}$, and differentiable in all but a countable number of exceptional points.

3. For a tutorial on the Legendre transform see [RIE 99, App. A].

In order to efficiently invert Theorem 4.3 we need:

LEMMA 4.5 (Lower semi-continuity of $f$ and $F$).– Let $a_m$ converge to $a_*$. Then

$$f(a_*) \ge \limsup_{m \to \infty} f(a_m) \tag{4.47}$$

and analogous for $F$.

Proof. For all $\varepsilon > 0$ we can find $m_0$ such that $a_* - \varepsilon < a_m - \varepsilon/2 < a_m + \varepsilon/2 < a_* + \varepsilon$ for all $m > m_0$. Then, $N^n(a_*, \varepsilon) \ge N^n(a_m, \varepsilon/2)$ and $E[N^n(a_*, \varepsilon)] \ge E[N^n(a_m, \varepsilon/2)]$. We find

$$\limsup_{n \to \infty} \frac{1}{n} \log_2 N^n(a_*, \varepsilon) \ge \limsup_{n \to \infty} \frac{1}{n} \log_2 N^n(a_m, \varepsilon/2) \ge f(a_m)$$

for any $m > m_0(\varepsilon)$ and similar for $F$. Now, let first $m \to \infty$ and then $\varepsilon \to 0$.

COROLLARY 4.3 (Central multifractal formalism).– We always have

$$f(a) \le f^{**}(a) = \tau^*(a). \tag{4.48}$$

Furthermore, denoting by $\tau'(q\pm)$ the right- (resp. left-)sided limits of derivatives we have

$$f(a) = \tau^*(a) = q\tau'(q\pm) - \tau(q\pm) \qquad \text{at } a = \tau'(q\pm). \tag{4.49}$$

Proof. The graph of $f^{**}$ is the concave hull of the graph of $f$ which implies (4.48). It is an easy task to derive (4.49) under assumptions suitable to make the tools of calculus available such as continuous second derivatives. To prove it in general let us first assume that $\tau$ is differentiable at a fixed $q$. In particular, $\tau(q')$ is then finite for $q'$ close to $q$. Since $\tau(q) = f^*(q)$ there is a sequence $a_m$ such that $\tau(q) = \lim_m (q a_m - f(a_m))$. Since $\tau(q') \le q'a - f(a)$ for all $q'$ and $a$ by (4.33), and since $\tau$ is differentiable at $q$ this sequence $a_m$ must converge to $a_* := \tau'(q)$. From the definition of $a_m$ we conclude that $f(a_m)$ converges to $q a_* - \tau(q)$. Applying Lemma 4.5 we find that $f(a_*) \ge q a_* - \tau(q)$. Recalling (4.33) implies the desired equality. Now, for an arbitrary $q$ the concave shape of $\tau$ implies that there is a sequence of numbers $q_m$ larger than $q$ in which $\tau$ is differentiable and which converges down to $q$. Consequently, $\tau'(q+) = \lim_m \tau'(q_m)$. Formula (4.49) being established at all $q_m$, Lemma 4.5 applies with $a_m = \tau'(q_m)$ and $a_* = \tau'(q+)$ to yield $f(\tau'(q+)) \ge q\tau'(q+) - \tau(q+)$. Again, (4.33) furnishes the opposite inequality. A similar argument applies to $\tau'(q-)$.

COROLLARY 4.4.– If $T(q)$ is finite for an open interval of $q$-values then $|s^n_k|$ is bounded for almost all paths, and

$$T(q) = F^*(q) \qquad \text{for all } q. \tag{4.50}$$

Moreover,

$$F(a) = T^*(a) = qT'(q\pm) - T(q\pm) \qquad \text{at } a = T'(q\pm). \tag{4.51}$$

Proof. Assume for a moment that $s^n_k$ is unbounded from above with positive probability. Then, grouping (4.45) requires an additional term collecting the $s^n_k > a$. In fact, for any number $a$ we can find $n$ arbitrarily large such that $s^n_k > a$ for some $k$. This implies that for any negative $q$ we have $S^n(q) \ge 2^{-nqa}$ and $\tau(q) \le qa$. Letting $a \to \infty$ shows that $\tau(q) = -\infty$. By Lemma 4.4 we must have $T(q) = -\infty$, a contradiction. Similarly, we show that $s^n_k$ is bounded from below. The remaining claims can be established analogously to those for $\tau(q)$ by taking expectations in (4.45).

NOTE 4.3 (Estimation and unbounded moments).– In order to apply Corollary 4.4 in a real world situation, but also for the purpose of estimating $\tau(q)$, it is of great importance to possess a method to estimate the range of $q$-values for which the moments of a stationary process (such as the increments or the wavelet coefficients of $Y$) are finite. Such a procedure is proposed in [GON 05] (see also [RIE 04]).

4.5. Binomial multifractals

The binomial measure has a long-standing tradition in serving as the paradigm of multifractal scaling [MAN 74, KAH 76, MAN 90a, CAW 92, HOL 92, BEN 87, RIE 95a, RIE 97b]. We present it here with an eye on possible generalizations of use in modeling.

4.5.1. Construction

To be consistent in notation we denote the binomial measure by $\mu_b$ and its distribution function by $\mathcal{M}_b(t) := \mu_b(]-\infty, t[)$. Note that $\mu_b$ is a measure or (probability) distribution, i.e. not a function in the usual sense, while $\mathcal{M}_b$ is a right-continuous and increasing function by definition.

In order to define $\mu_b$ we again use the notation (4.4): for any fixed $t$ there is a unique sequence $k_1, k_2, \ldots$ such that the dyadic intervals $I^n_{k_n} = [k_n 2^{-n}, (k_n + 1)2^{-n}[$ contain $t$ for all integer $n$. So, the $I^n_{k_n}$ form a decreasing sequence of half open intervals which shrink down to $\{t\}$. Moreover, $I^{n+1}_{2k_n}$ is the left subinterval of $I^n_{k_n}$ and $I^{n+1}_{2k_n + 1}$ the right subinterval (see Figure 4.1). Note that the first $n$ elements of such a sequence, i.e. $(k_1, k_2, \ldots, k_n)$, are identical for all points $t \in I^n_{k_n}$. We call this a nested sequence and it is uniquely defined by the value of $k_n$. We set

$$\mu_b(I^n_{k_n}) = \mathcal{M}_b((k_n + 1)2^{-n}) - \mathcal{M}_b(k_n 2^{-n}) = M^n_{k_n} \cdot M^{n-1}_{k_{n-1}} \cdots M^1_{k_1} \cdot M^0_0. \tag{4.52}$$

In other words, the mass lying in $I^n_{k_n}$ is redistributed among its two dyadic subintervals $I^{n+1}_{2k_n}$ and $I^{n+1}_{2k_n+1}$ in the proportions $M^{n+1}_{2k_n}$ and $M^{n+1}_{2k_n+1}$. For consistency we require $M^{n+1}_{2k_n} + M^{n+1}_{2k_n+1} = 1$. Having defined the mass of dyadic intervals we obtain the mass of any interval $]-\infty, t[$ by writing it as a disjoint union of dyadic intervals $J^n$ and noting $M_b(t) = \mu_b(]-\infty, t[) = \sum_n \mu_b(J^n)$. Therefore, integrals (expectations) with respect to $\mu_b$ can be calculated as
$$\int g(t)\,\mu_b(dt) = \lim_{n\to\infty} \sum_{k=0}^{2^n-1} g(k2^{-n})\,\mu_b(I^n_k) \tag{4.53}$$
$$= \int g(t)\,dM_b(t) = \lim_{n\to\infty} \sum_{k=0}^{2^n-1} g(k2^{-n}) \left( M_b((k+1)2^{-n}) - M_b(k2^{-n}) \right). \tag{4.54}$$

Alternatively, the measure $\mu_b$ can be defined using its distribution function $M_b$. Indeed, as a distribution function, $M_b$ is monotone and continuous from the right. Since (4.52) defines $M_b$ in all dyadic points it can be obtained in any other point as the right-sided limit. Note that $M_b$ is continuous at a given point $t$ unless $M^n_{k_n}(t) = 1$ for all $n$ large.

To generate randomness in $M_b$, we choose the various $M^n_k$ to be random variables. The above properties then hold pathwise. We will make the following assumptions on the multiplier distributions $M^n_k$:

i) Conservation of mass. Almost surely for all $n$ and $k$, $M^n_k$ is positive and
$$M^{n+1}_{2k_n} + M^{n+1}_{2k_n+1} = 1. \tag{4.55}$$

Figure 4.2. Iterative construction of the binomial cascade

As we have seen, this guarantees that $M_b$ is well defined.

ii) Nested independence. All multipliers of a nested sequence are mutually independent. Analogously to (4.52) we have for any nested sequence
$$E_\Omega[M^n_{k_n} \cdots M^0_0] = E_\Omega[M^n_{k_n}] \cdots E_\Omega[M^0_0] \tag{4.56}$$
and similarly for other moments. This will allow for simple calculations in what follows.

iii) Identical distributions. For all $n$ and $k$
$$M^n_k \overset{fd}{=} \begin{cases} M_0 & \text{if } k \text{ is even,} \\ M_1 & \text{if } k \text{ is odd.} \end{cases} \tag{4.57}$$
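As a minimal sketch of assumptions i)-iii), the following Python fragment generates the masses $\mu_b(I^n_k)$ of one realization by repeated splitting. The symmetric Beta(p, p) law for the left multiplier, the choice $M^0_0 = 1$, and the function name `binomial_cascade` are assumptions made for illustration only; they are not prescribed by the text.

```python
import random

def binomial_cascade(levels, p=1.66, seed=0):
    """Masses mu_b(I^n_k), k = 0..2^levels - 1, of one random binomial cascade.

    Assumptions of this sketch: the left multiplier of every split is drawn
    from a symmetric Beta(p, p) law and the right one is 1 - left, so mass is
    conserved exactly (i); draws are independent along nested sequences (ii)
    and follow the same law at every scale (iii). M^0_0 is set to 1.
    """
    rng = random.Random(seed)
    masses = [1.0]                              # level 0: total mass M^0_0 = 1
    for _ in range(levels):
        refined = []
        for mass in masses:
            left = rng.betavariate(p, p)        # multiplier of the left child
            refined.append(mass * left)         # mass of I^{n+1}_{2k}
            refined.append(mass * (1.0 - left)) # mass of I^{n+1}_{2k+1}
        masses = refined
    return masses

masses = binomial_cascade(levels=10)
total = sum(masses)   # should equal M^0_0 = 1 up to rounding
```

Partial sums of `masses` give $M_b(k2^{-n})$ at the dyadic points of level $n$.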

A more general version of iii) was given in [RIE 99] to allow for more flexibility in model matching. The theory of cascades or, more properly, $T$-martingales⁴ [KAH 76, BEN 87, HOL 92, BAR 97], provides a wealth of possible generalizations. Most importantly, it allows us to soften the almost sure conservation condition i) to

i’) Conservation in the mean.
$$E_\Omega[M_0 + M_1] = 1. \tag{4.58}$$

In this case, $M_b$ is well defined since (4.52) forms a martingale due to the nested independence (4.56). The main advantage of such an approach is that we can use unbounded multipliers $M_0$ and $M_1$ such as log-normal random variables. Then, the marginals of the increment process, i.e. $\mu_b(I^n_k)$, are exactly log-normal on all scales. For general binomials, always assuming ii), it can be argued that the marginals $\mu_b(I^n_k)$ are at least asymptotically log-normal by applying a central limit theorem to the logarithm of (4.52).

4. For any ﬁxed t sequence (4.52) forms a martingale due to the nested independence (4.56).


4.5.2. Wavelet decomposition

The scaling coefficients of $\mu_b$ using the Haar wavelet are simply
$$D_{n,k}(\mu_b) = \int \phi^*_{n,k}(t)\,\mu_b(dt) = 2^{n/2} \int_{k2^{-n}}^{(k+1)2^{-n}} \mu_b(dt) = 2^{n/2}\,\mu_b(I^n_k) \tag{4.59}$$
from (4.8) and (4.53). With (4.9) and (4.52) we derive the explicit expression for the Haar wavelet coefficients:
$$2^{-n/2}\,C_{n,k_n}(\mu_b) = \mu_b(I^{n+1}_{2k_n}) - \mu_b(I^{n+1}_{2k_n+1}) = \left( M^{n+1}_{2k_n} - M^{n+1}_{2k_n+1} \right) \prod_{i=0}^{n} M^i_{k_i}. \tag{4.60}$$
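The identity (4.60) can be checked numerically. The sketch below stores the multipliers of every level, then computes the normalized Haar coefficient once from the masses of the two children and once from the product of multipliers along the nested sequence. The uniform multiplier law on [0.2, 0.8], the convention $M^0_0 = 1$, and all function names are hypothetical choices for this illustration.

```python
import random

def cascade_with_multipliers(levels, seed=1):
    """Return mult[n][k] = M^n_k and mass[n][k] = mu_b(I^n_k) for n = 0..levels."""
    rng = random.Random(seed)
    mult = [[1.0]]                      # assumed convention: M^0_0 = 1
    mass = [[1.0]]
    for n in range(levels):
        m_next, mu_next = [], []
        for k in range(2 ** n):
            left = rng.uniform(0.2, 0.8)        # hypothetical multiplier law
            for m in (left, 1.0 - left):        # left child first, then right
                m_next.append(m)
                mu_next.append(mass[n][k] * m)
        mult.append(m_next)
        mass.append(mu_next)
    return mult, mass

def haar_from_masses(mass, n, k):
    # 2^{-n/2} C_{n,k}: mass of the left child minus mass of the right child
    return mass[n + 1][2 * k] - mass[n + 1][2 * k + 1]

def haar_from_multipliers(mult, n, k):
    # right-hand side of (4.60); k_i = floor(k / 2^(n-i)) is the ancestor at level i
    prod = 1.0
    for i in range(n + 1):
        prod *= mult[i][k >> (n - i)]
    return (mult[n + 1][2 * k] - mult[n + 1][2 * k + 1]) * prod

mult, mass = cascade_with_multipliers(levels=6)
n, k = 4, 11
lhs = haar_from_masses(mass, n, k)
rhs = haar_from_multipliers(mult, n, k)
```

Both values agree up to rounding, which reflects that the tree of Haar coefficients inherits the multiplicative structure of the tree of masses.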

Similar scaling properties hold when using arbitrary, compactly supported wavelets, provided the distributions of the multipliers are scale independent. This comes about from (4.52) and (4.53), which give the following rule for substituting $t' = 2^n t - k_n$:
$$2^{-n/2}\,C_{n,k_n}(\mu_b) = \int_{I^n_{k_n}} \psi(2^n t - k_n)\,\mu_b(dt) = M^n_{k_n} \cdots M^1_{k_1} \cdot \int_0^1 \psi(t')\,\mu_b^{(n,k_n)}(dt'). \tag{4.61}$$

Here $\mu_b^{(n,k_n)}$ is a binomial measure constructed with the same method as $\mu_b$ itself, however, with multipliers taken from the subtree which has its root at the node $k_n$ of level $n$ of the original tree. More precisely, for any nested sequence $i_1, \ldots, i_m$
$$\mu_b^{(n,k_n)}(I^m_{i_m}) = M^{n+1}_{2k_n+i_1} \cdot M^{n+2}_{4k_n+i_2} \cdots M^{n+m}_{2^m k_n+i_m}.$$

From nested independence (4.56) we infer that this measure $\mu_b^{(n,k_n)}$ is independent of $M^i_{k_i}$ ($i = 1, \ldots, n$). Furthermore, the identical distributions of the multipliers iii) imply that for arbitrary, compactly supported wavelets
$$\int_0^1 \psi(t)\,\mu_b^{(n,k_n)}(dt) \overset{d}{=} C_{0,0}(\mu_b) = \int_0^1 \psi(t)\,\mu_b(dt) \tag{4.62}$$
where $\overset{d}{=}$ denotes equality in distribution. In particular, for the Haar wavelet we have
$$\int_0^1 \psi_{\mathrm{Haar}}(t)\,\mu_b^{(n,k_n)}(dt) = M^{n+1}_{2k_n} - M^{n+1}_{2k_n+1} \overset{d}{=} M_0 - M_1 = C^{\mathrm{Haar}}_{0,0}(\mu_b) \tag{4.63}$$


(the deterministic analog has also been observed in [BAC 93]). Finally, note that if $\psi$ is supported on $[0, 1]$, then $\psi(2^n(\cdot) - k)$ is supported on $I^n_k$. So, the tree of wavelet coefficients $C_{n,k}$ of $\mu_b$ possesses a structure similar to the tree of increments of $M_b$ (compare (4.52)).

With a little more effort we calculate the wavelet coefficients of $M_b$ itself, provided $\psi$ is admissible and supported on $[0, 1]$. Indeed,
$$M_b(t) - M_b(k_n 2^{-n}) = \mu_b([k_n 2^{-n}, t]) = M^n_{k_n} \cdots M^1_{k_1}\, M_b^{(n,k_n)}(2^n t - k_n), \tag{4.64}$$
where $M_b^{(n,k_n)}(t') := \mu_b^{(n,k_n)}([0, t'])$. Using $\int \psi = 0$ and substituting $t' = 2^n t - k_n$ this yields
$$2^{-n/2}\,C_{n,k_n}(M_b) = \int_{I^n_{k_n}} \psi(2^n t - k_n) \left( M_b(t) - M_b(k_n 2^{-n}) \right) dt = 2^{-n} \cdot M^n_{k_n} \cdots M^1_{k_1} \cdot \int_0^1 \psi(t')\,M_b^{(n,k_n)}(t')\,dt'. \tag{4.65}$$
Again, we have
$$\int_0^1 \psi(t)\,M_b^{(n,k_n)}(t)\,dt \overset{d}{=} C_{0,0}(M_b) = \int_0^1 \psi(t)\,M_b(t)\,dt. \tag{4.66}$$

LEMMA 4.6.– Let $\psi$ be a wavelet supported on $[0, 1]$. Let $M_b$ be a binomial with i)-iii). Then, $C_{n,k_n}(\mu_b)$ is given by (4.61), and if $\psi$ is admissible then $C_{n,k_n}(M_b)$ is given by (4.65). Furthermore, (4.62) and (4.66) hold.

It is obvious that the dyadic structure present in both the construction of the binomial measure and the wavelet transform is responsible for the simplicity of the calculation above. It is, however, standard by now to extend the procedure to more general multinomial cascades such as $M_c$, introduced in section 4.5.5 (see [ARB 96, RIE 95a]).

4.5.3. Multifractal analysis of the binomial measure

In the light of Lemma 4.6 it becomes clear that the singularity exponent $\alpha(t)$ is most easily accessible for $M_b$ while $w(t)$ is readily available for both $M_b$ and $\mu_b$. On the other hand, since $\alpha(t)$ is built from increments, it is not well defined for $\mu_b$. Thus, it is natural to calculate the spectra of both $M_b$ and $\mu_b$ with appropriate singularity exponents, i.e. $f_{\alpha,M_b}$, $f_{w,M_b}$ and $f_{w,\mu_b}$.


Now, Lemma 4.6 indicates that the singularity structures of $\mu_b$ and $M_b$ are closely related. Indeed, $\mu_b$ is the distributional derivative of $M_b$ in the sense of (4.52) and (4.54). Since taking a derivative “should” simply reduce the scaling exponent by one, we would expect that their spectra are identical up to a shift in $a$ by $-1$. Indeed, this is true for increasing processes, such as $M_b$, as we will elaborate in section 4.6.2. However, it has to be pointed out that this rule cannot be correct for oscillating processes. This is effectively demonstrated by the example $t^a \cdot \sin(t^{-b})$ with $b > 0$. Though this example has the exponent $a$ at zero, its derivative behaves like $t^{a-b-1}$ there. This is caused by the strong oscillations, also called chirp, at zero. In order to deal with such situations the 2-microlocalization theory has to be employed [JAF 91].

Let us first dwell on the well known multifractal analysis of $M_b$ based on $\alpha^n_k$. Recall that $M_b((k_n+1)2^{-n}) - M_b(k_n 2^{-n})$ is given by (4.52), and use the nested independence (4.56) and identical distributions (4.57) to obtain
$$\begin{aligned} E[S^n_{\alpha,M_b}(q)] &= \sum_{k_n=0}^{2^n-1} E\left[ (M^n_{k_n})^q \cdots (M^1_{k_1})^q (M^0_0)^q \right] \\ &= E\left[ (M^0_0)^q \right] \cdot \sum_{i=0}^{n} \binom{n}{i} E[M_0^q]^i\, E[M_1^q]^{n-i} \\ &= E\left[ (M^0_0)^q \right] \cdot \left( E[M_0^q] + E[M_1^q] \right)^n. \end{aligned} \tag{4.67}$$
From this, it follows immediately that
$$T_{\alpha,M_b}(q) = -\log_2 E\left[ (M_0)^q + (M_1)^q \right]. \tag{4.68}$$
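For a quick sanity check of (4.67) and (4.68), consider the degenerate case of deterministic multipliers $M_0 = m_0$, $M_1 = 1 - m_0$ (an assumption made here so that the expectations are trivial). The partition sum over all $2^n$ intervals then equals $(m_0^q + m_1^q)^n$ exactly. The names below are illustrative only.

```python
import math

def deterministic_masses(levels, m0):
    """Masses of a deterministic binomial cascade with multipliers m0 and 1 - m0."""
    masses = [1.0]
    for _ in range(levels):
        # each interval splits into a left child (times m0) and a right child
        masses = [mass * m for mass in masses for m in (m0, 1.0 - m0)]
    return masses

def partition_sum(masses, q):
    """S^n(q): sum of the q-th powers of the interval masses."""
    return sum(mass ** q for mass in masses)

def T(q, m0):
    """Closed form (4.68) for deterministic multipliers."""
    return -math.log2(m0 ** q + (1.0 - m0) ** q)

m0, n, q = 0.3, 12, 2.5
S = partition_sum(deterministic_masses(n, m0), q)
closed_form = (m0 ** q + (1.0 - m0) ** q) ** n   # last line of (4.67) with M^0_0 = 1
estimate = -math.log2(S) / n                     # recovers T(q) from the partition sum
```

Here `estimate` coincides with `T(q, m0)` for every `n`, because the scaling of a deterministic binomial is exact at all levels.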

Note that this value may be $-\infty$ for some $q$.

THEOREM 4.4.– Assume that i’), ii) and iii) hold. Assume furthermore that $M_0$ and $M_1$ have at least some finite moment of negative order. Then, with probability one
$$\dim(K_a) = f(a) = \tau^*(a) = T^*_{\alpha,M_b}(a) \tag{4.69}$$
for all $a$ such that $T^*_{\alpha,M_b}(a) > 0$. Thereby, all the spectra are related to the singularity exponents $\alpha^n_k$ or $h^n_k$ of $M_b$.

NOTE 4.4 (Wavelet analysis).– In what follows we will show that we obtain the same spectra for $M_b$ replacing $\alpha^n_k$ with $w^n_k$ for certain analyzing wavelets. We will also mention the changes which become necessary when studying distribution functions of measures with fractal support (see section 4.5.5).


Proof. Inspecting [BAR 97] we find that $\dim(K_a) = T^*(a)$ for $\alpha^n_k$ under the given assumptions. Earlier results, such as [FAL 94, ARB 96], used more restrictive assumptions but are somewhat easier to read. Though weaker than [BAR 97] they are sufficient in some situations.

4.5.4. Examples

Example 1 (β binomial).– Consider multipliers $M_0$ and $M_1$ that follow a β distribution, which has the density $c_p\, t^{p-1}(1-t)^{p-1}$ for $t \in [0, 1]$ and 0 elsewhere. Thus, $p > 0$ is a parameter and $c_p$ is a normalization constant. Note that the conservation of mass i) imposes a symmetric distribution since $M_0$ and $M_1$ are set to be equally distributed. The β distribution has finite moments of order $q > -p$ which can be expressed explicitly using the Γ-function. We obtain
$$\text{β-Binomial:}\quad T_\alpha(q) = -1 - \log_2 \frac{\Gamma(p+q)\,\Gamma(2p)}{\Gamma(2p+q)\,\Gamma(p)} \qquad (q > -p), \tag{4.70}$$

and $T(q) = -\infty$ for $q \le -p$. For a typical shape of these spectra, see Figure 4.3.

Figure 4.3. The spectrum of a binomial measure with β distributed multipliers with $p = 1.66$. Trivially, $T(0) = -1$, where the maximum of $T^*$ is 1. In addition, every positive increment process has $T(1) = 0$, where $T^*$ touches the bisector. Finally, the LRD parameter is $H_{var} = (T(2) + 1)/2 = 0.85$ (see (4.90) below)
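The values quoted in the caption can be reproduced from (4.70). The sketch below evaluates the formula through log-Gamma functions from the Python standard library; the function name `T_beta` is chosen here for illustration.

```python
import math

def T_beta(q, p):
    """T_alpha(q) of the beta binomial, formula (4.70); -inf for q <= -p."""
    if q <= -p:
        return float("-inf")
    # log of Gamma(p+q) Gamma(2p) / (Gamma(2p+q) Gamma(p)), i.e. log E[M^q]
    log_moment = (math.lgamma(p + q) + math.lgamma(2.0 * p)
                  - math.lgamma(2.0 * p + q) - math.lgamma(p))
    return -1.0 - log_moment / math.log(2.0)

p = 1.66
T0 = T_beta(0.0, p)                      # -1, the maximum of T* is 1
T1 = T_beta(1.0, p)                      # 0 for any positive increment process
H_var = (T_beta(2.0, p) + 1.0) / 2.0     # LRD parameter of the caption, about 0.85
```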

An application of the β binomial for the modeling of data traffic on the Internet can be found in [RIE 99].

Example 2 (Uniform binomial).– As a special case of the β binomial we obtain uniform distributions for the multipliers when setting $p = 1$. Formula (4.70) simplifies to $T_\alpha(q) = -1 + \log_2(1 + q)$ for $q > -1$. Applying the formula for the Legendre transform (4.51) yields the explicit expression
$$\text{uniform binomial:}\quad T^*_\alpha(a) = 1 - a + \log_2(e) + \log_2\!\left( \frac{a}{\log_2(e)} \right) \tag{4.71}$$
for $a > 0$ and $T^*_\alpha(a) = -\infty$ for $a \le 0$.
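The closed form (4.71) can be compared with a brute-force Legendre transform $T^*(a) = \inf_q (qa - T(q))$. The sketch below minimizes over a finite grid of $q$ values; the grid bounds and resolution are arbitrary choices for this illustration, so the agreement is only up to discretization error.

```python
import math

def T_uniform(q):
    """T_alpha(q) = -1 + log2(1 + q) for the uniform binomial, q > -1."""
    return -1.0 + math.log2(1.0 + q)

def legendre_closed_form(a):
    """Formula (4.71), valid for a > 0."""
    log2e = math.log2(math.e)
    return 1.0 - a + log2e + math.log2(a / log2e)

def legendre_numeric(a, q_max=200.0, steps=200_000):
    """Brute-force infimum of q*a - T(q) over a grid of q in (-1, q_max]."""
    best = float("inf")
    for j in range(1, steps + 1):
        q = -1.0 + j * (q_max + 1.0) / steps
        best = min(best, q * a - T_uniform(q))
    return best

numeric_1 = legendre_numeric(1.0)
exact_1 = legendre_closed_form(1.0)
peak = legendre_closed_form(math.log2(math.e))   # maximum of T*, attained at a = log2(e)
```

The maximum value 1 at $a = \log_2(e)$ matches $-T(0) = 1$.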

Example 3 (Log-normal binomial).– Another very interesting case is log-normal distributions for the multipliers $M_0$ and $M_1$. Note that we have to replace i) with i’) in this case since log-normal variables can be arbitrarily large, i.e. larger than 1. Recall that the log-normal binomial enjoys the advantage of having exactly log-normal marginals $\mu_b(I^n_k)$ since the product of independent log-normal variables is again a log-normal variable. Having mass conservation only in the mean, however, may cause problems in simulations since the sample mean of the process $\mu_b(I^n_k)$ ($k = 0, \ldots, 2^n - 1$) is not $M^0_0$ as in case i), but depends on $n$. Indeed, the negative (virtual) $a$ appearing in the log-normal binomial spectrum reflects the possibility that the sample average may increase locally (see [MAN 90a]).

The calculation of its spectrum starts by observing that the exponential $M = e^G$ of a $N(m, \sigma^2)$ variable $G$, i.e. a Gaussian with mean $m$ and variance $\sigma^2$, has the $q$-th moment $E[M^q] = E[\exp(qG)] = \exp(qm + q^2\sigma^2/2)$. Assuming that $M_0$ and $M_1$ are equally distributed as $M$ their mean must be 1/2. Hence $m + \sigma^2/2 = -\ln(2)$, and
$$\text{log-normal binomial:}\quad T_\alpha(q) = (q - 1)\left( 1 - \frac{\sigma^2}{2\ln(2)}\,q \right) \tag{4.72}$$
for all $q \in \mathbb{R}$ such that $E[(M_b(1))^q]$ is finite. Note that the parabola in (4.72) has two zeros: 1 and $q_{crit} = 2\ln(2)/\sigma^2$. It follows from [KAH 76] that $E[(M_b(1))^q] < \infty$ exactly for $q < q_{crit}$. Since $T(q)$ is differentiable for $q < q_{crit}$ we may obtain its Legendre transform implicitly from (4.51) for $a = T'(q)$ with $q < q_{crit}$, i.e., for all $a > a_{crit} = T'(q_{crit}) = \sigma^2/(2\ln(2)) - 1$. Eliminating $q$ from (4.51) yields the explicit form
$$T^*_\alpha(a) = 1 - \frac{\ln(2)}{2\sigma^2}\left( a - 1 - \frac{\sigma^2}{2\ln(2)} \right)^2 \qquad (a \ge a_{crit}). \tag{4.73}$$
For $a \le a_{crit}$ the Legendre transform yields $T^*(a) = a \cdot q_{crit}$. Thus, at $a_{crit}$ the spectrum $T^*$ crosses over from parabola (4.73) to its tangent through the origin with slope $q_{crit}$ (the other tangent through the origin is the bisector).
It should be remembered that only the positive part of this spectrum can be estimated from one realization of Mb . The negative part corresponds to events so rare that they can only be observed in a large array of realizations (see Note 4.2).
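The log-normal formulas can be cross-checked numerically. The sketch below assumes the normalization $E[M] = 1/2$, i.e. $m = -\ln 2 - \sigma^2/2$, computes $T(q)$ once from the moment formula and once from the parabola (4.72), and evaluates the two branches of $T^*$. The value $\sigma^2 = 0.5$ and all names are illustrative assumptions.

```python
import math

LN2 = math.log(2.0)

def lognormal_spectrum(sigma2):
    """Return T, T_from_moments, T_star, q_crit, a_crit for variance sigma2."""
    c = sigma2 / (2.0 * LN2)
    q_crit = 1.0 / c                 # second zero of the parabola (4.72)
    a_crit = c - 1.0                 # T'(q_crit)
    m = -LN2 - sigma2 / 2.0          # ensures E[M] = exp(m + sigma2/2) = 1/2

    def T(q):                        # parabola (4.72)
        return (q - 1.0) * (1.0 - c * q)

    def T_from_moments(q):           # -log2(E[M0^q] + E[M1^q]) with E[M^q] = exp(qm + q^2 sigma2/2)
        return -1.0 - (q * m + q * q * sigma2 / 2.0) / LN2

    def T_star(a):                   # (4.73) for a >= a_crit, tangent a * q_crit below
        if a >= a_crit:
            return 1.0 - (a - 1.0 - c) ** 2 / (4.0 * c)
        return a * q_crit

    return T, T_from_moments, T_star, q_crit, a_crit

T, T_mom, T_star, q_crit, a_crit = lognormal_spectrum(sigma2=0.5)
gap_at_a_crit = T_star(a_crit) - a_crit * q_crit   # the two branches meet at a_crit
```

Since $T^*(a_{crit}) = a_{crit}\,q_{crit}$ is negative whenever $\sigma^2 < 2\ln 2$, the crossover to the tangent lies in the negative part of the spectrum discussed above.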


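The log-normal framework also allows us to calculate $F(a)$ explicitly, demonstrating which rescaling properties of the marginal distributions of the increment processes of $M_b$ are captured in the multifractal spectra. Indeed, if all $\ln(M^n_k)$ are $N(m, \sigma^2)$ then $-\ln(2) \cdot \alpha^n_k$ is $N(m, \sigma^2/n)$. The mean value theorem of integration gives
$$\begin{aligned} P_\Omega[|\alpha^n_k - a| < \varepsilon] &= \frac{1}{\sqrt{2\pi\sigma^2/n}} \int_{\ln(2)(-a-\varepsilon)}^{\ln(2)(-a+\varepsilon)} \exp\left( -\frac{(x - m)^2}{2\sigma^2/n} \right) dx \\ &= \frac{1}{\sqrt{2\pi\sigma^2/n}}\, \ln(2) \cdot 2\varepsilon \cdot \exp\left( -\frac{(-\ln(2)\,x_{a,n} - m)^2}{2\sigma^2/n} \right) \end{aligned}$$
with $x_{a,n} \in [a - \varepsilon, a + \varepsilon]$ for all $n$. Keeping only the exponential term in $n$ and substituting $m = -\sigma^2/2 - \ln(2)$ we find
$$\frac{1}{n} \log_2\left( 2^n P_\Omega[|\alpha^n_k - a| < \varepsilon] \right) \simeq 1 - \frac{\ln(2)}{2\sigma^2}\left( x_{a,n} - 1 - \frac{\sigma^2}{2\ln(2)} \right)^2. \tag{4.74}$$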

Comparing with (4.73) we see that $T^*(a) = F(a)$, as stated in Theorem 4.2. The above computation shows impressively how well adapted a multiplicative iteration with log-normal multipliers is to multifractal analysis (or vice versa): $F$ extracts, basically, the exponent of the Gaussian kernel. Since the multifractal formalism holds for $M_b$ these features can be measured or estimated using the re-normalized histogram, i.e. the grain based multifractal spectrum $f(a)$. This is a property which could be labeled with the term ergodicity. Note, however, that classical ergodic theory deals with observations along an orbit of increasing length, while $f(a)$ concerns a sequence of orbits.

4.5.5. Beyond dyadic structure

We elaborate here generalizations of the binomial cascade.

Statistically self-similar measures: a natural generalization of the random binomial, denoted here by $M_c$, is obtained by splitting intervals $J^n_k$ iteratively into $c$ subintervals $J^{n+1}_{ck}, \ldots, J^{n+1}_{ck+c-1}$ with length $|J^{n+1}_{ck+i}| = L^{n+1}_{ck+i}\,|J^n_k|$ and mass $\mu_c(J^{n+1}_{ck+i}) = M^{n+1}_{ck+i}\,\mu_c(J^n_k)$. In the most simple case, we will require mass conservation, i.e. $M^{n+1}_{ck} + \cdots + M^{n+1}_{ck+c-1} = 1$, but also $L^{n+1}_{ck} + \cdots + L^{n+1}_{ck+c-1} = 1$ which guarantees that $\mu_c$ lives everywhere. Assuming the analogous properties of ii) and iii) to hold for both the length- as well as the mass-multipliers we find that $T_{M_c}(q)$ is the unique solution of
$$E\left[ (M_0)^q (L_0)^{-T(q)} + \cdots + (M_{c-1})^q (L_{c-1})^{-T(q)} \right] = 1. \tag{4.75}$$
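Equation (4.75) defines $T(q)$ only implicitly. As a minimal sketch, assume deterministic mass and length multipliers (so the expectation disappears); the left-hand side is then increasing in $T$ because every length multiplier is smaller than 1, and $T(q)$ can be found by bisection. The function name, the bracket $[-50, 50]$, and the example multipliers are assumptions of this illustration.

```python
import math

def solve_T(q, masses, lengths, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve sum_i masses[i]^q * lengths[i]^(-T) = 1 for T by bisection.

    Deterministic multipliers are assumed; the left-hand side increases
    with T since 0 < lengths[i] < 1.
    """
    def f(T):
        return sum(m ** q * l ** (-T) for m, l in zip(masses, lengths)) - 1.0

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid          # sum too large: the root lies below mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Dyadic check: with lengths 1/2 the solution reduces to -log2(m0^q + m1^q).
T_dyadic = solve_T(2.0, [0.3, 0.7], [0.5, 0.5])
expected_dyadic = -math.log2(0.3 ** 2 + 0.7 ** 2)

# A three-way split (c = 3) with unequal lengths.
masses3, lengths3 = [0.2, 0.5, 0.3], [0.25, 0.5, 0.25]
T_at_0 = solve_T(0.0, masses3, lengths3)   # -1 because the lengths sum to 1
T_at_1 = solve_T(1.0, masses3, lengths3)   # 0 because the masses sum to 1
```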


This formula of $T(q)$ can be derived rigorously by taking expectations where appropriate in the proof of [RIE 95a, Prop 14]. Doing so shows, moreover, that $T(q)$ assumes a limit in these examples.

Multifractal formalism: it is notable that the multifractal formalism “holds” for the class of statistically self-similar measures described above in Theorem 4.4 (see [ARB 96]). However, if $L^{n+1}_{ck} + \cdots + L^{n+1}_{ck+c-1} = \lambda < 1$, e.g. choosing $L^n_k = (1/c')$ almost surely with $c' > c$, then the measure $\mu_c$ lives on a set of fractal dimension and its distribution function $M_c(t) = \mu_c([0, t))$ is constant almost everywhere. In this case, equality in the multifractal formalism will fail: indeed, unless the scaling exponents $s^n_k$ are modified to account for boundary effects caused by the fractal support, the partition function will be unbounded for negative $q$, e.g. $\tau_\alpha(q) = -\infty$ for $q < 0$ (see [RIE 95a]). As a consequence, $T_\alpha(q) = -\infty$ and (4.75) is no longer valid for $q < 0$. Interestingly, the fine spectrum $\dim(K_a)$ is still known, however, due to [ARB 96].

Stationary increments: however, an entirely different and novel way of introducing randomness in the geometry of multiplicative cascades which leads to perfectly stationary increments has been given recently in [MAN 02] and in [BAR 02, BAR 03, BAR 04, MUZ 02, BAC 03, CHA 02, CHA 05, RIE 07b, RIE 07a]. The description of these models is, unfortunately, beyond the scope of this work.

Binomial in the wavelet domain: in concluding this section we should mention that, with regard to (4.61), we may choose to directly model the wavelet coefficients of a process in a multiplicative fashion in order to obtain a desired multifractal structure. Some early steps in this direction have been taken in [ARN 98].

4.6. Wavelet based analysis

4.6.1. The binomial revisited with wavelets

The deterministic envelope is the most simple wavelet-based spectrum of $\mu_b$ to calculate. Taking into account the normalization factors in (4.12) when using Lemma 4.6, the calculation of (4.67) carries over to give
$$E_\Omega[S^n_{w,\mu_b}(q)] = 2^{nq}\, E[|C_{0,0}|^q] \cdot \left( E_\Omega[M_0^q] + E_\Omega[M_1^q] \right)^n,$$
and similarly for $M_b$. Provided $E[|C_{0,0}|^q]$ is finite this immediately gives
$$T_{w,\mu_b}(q) + q = T_{w,M_b}(q) = T_{\alpha,M_b}(q), \tag{4.76}$$
$$T^*_{w,\mu_b}(a - 1) = T^*_{w,M_b}(a) = T^*_{\alpha,M_b}(a). \tag{4.77}$$


Imposing additional assumptions on the distributions of the multipliers we may also control the $w^n_{k_n}(\mu_b)$ themselves and not only their moments. To this end, we should be able to guarantee that the wavelet coefficients do not decay too fast (compare (4.10)), i.e. that the random factor (4.62) which appears in (4.61) does not become too small. Indeed, it is sufficient to assume that there is some $\varepsilon > 0$ such that $|C_{0,0}(\mu_b)| \ge \varepsilon$ almost surely. Then for all $t$, $(1/n) \log \left| \int \psi(t)\,\mu_b^{(n,k_n)}(dt) \right| \to 0$, and with (4.61)
$$\underline{w}_{\mu_b}(t) = \liminf_{n\to\infty} -\frac{1}{n} \log_2\left( 2^{n/2} |C_{n,k_n}| \right) = \underline{\alpha}_{M_b}(t) - 1, \tag{4.78}$$
and similarly $\overline{w}_{\mu_b}(t) = \overline{\alpha}_{M_b}(t) - 1$. Observe that this is precisely the relation we expect between the scaling exponents of a process and its (distributional) derivative, at least in nice cases, and that it is in agreement with (4.77). In summary (first observed for deterministic binomials in [BAC 93]):

COROLLARY 4.5.– Assume that $\mu_b$ is a random binomial measure satisfying i)-iii). Assume that the random variables $\left| \int \psi(t)\,\mu_b^{(n,k)}(dt) \right|$ resp. $\left| \int \psi(t)\,M_b^{(n,k)}(t)\,dt \right|$ are uniformly bounded away from 0. Then, the multifractal formalism “holds” for the wavelet based spectra of $\mu_b$, resp. $M_b$, i.e.
$$\dim(E_a^{w,\mu_b}) \overset{a.s.}{=} f_{w,\mu_b}(a) \overset{a.s.}{=} \tau^*_{w,\mu_b}(a) \overset{a.s.}{=} T^*_{w,\mu_b}(a), \tag{4.79}$$
respectively
$$\dim(E_a^{w,M_b}) \overset{a.s.}{=} f_{w,M_b}(a) \overset{a.s.}{=} \tau^*_{w,M_b}(a) \overset{a.s.}{=} T^*_{w,M_b}(a). \tag{4.80}$$

Requiring that $\left| \int \psi(t)\,\mu_b^{(n,k)}(dt) \right|$ resp. $\left| \int \psi(t)\,M_b^{(n,k)}(t)\,dt \right|$ should be bounded away from zero in order to ensure (4.78), though satisfied in some simple cases, seems unrealistically restrictive to be of practical use. A few comments are in order here.

First, this condition can be weakened to allow arbitrarily small values of these integrals, as long as all their negative moments exist. This can be shown by an argument using the Borel-Cantelli lemma.

Second, the condition may simplify in two ways. For iid multipliers we know that these integrals are equal in distribution to $C_{0,0}$, thus only $n = k = 0$ has to be checked. Further, for the Haar wavelet and symmetric multipliers, it becomes simply the condition that $M_0$ be uniformly bounded away from zero (see (4.60)), or at least that $E[|M_0 - 1/2|^q] < \infty$ for all negative $q$.

Third, if we drop iii) and allow the multiplier distributions to depend on scale (see [RIE 99]), then $\left| \int \psi(t)\,\mu_b^{(n,k)}(dt) \right|$ resp. $\left| \int \psi(t)\,M_b^{(n,k)}(t)\,dt \right|$ has to be bounded away from zero only for large $n$. In applications such as network traffic modeling we find that on fine scales $M^{n+1}_{2k} - M^{n+1}_{2k+1}$ is best modeled by discrete distributions on $[0, 1]$ with large variance, i.e. without mass around 1/2.


Fourth, another way out is to avoid small wavelet coefficients entirely in a multifractal analysis. More precisely, we would follow [BAC 93, JAF 97] and replace $C_{n,k_n}$ in the definition of $w^n_{k_n}$ (4.12) by the maximum over certain wavelet coefficients “close” to $t$. Of course, the multifractal formalism of section 4.4 still holds. [JAF 97] gives conditions under which the spectrum $\tau^*_{w,\mu_b}(a)$ based on this modified $w^n_{k_n}$ agrees with the “Hölder” spectrum $\dim(E_a)$ based on $h^n_k(M_b)$.

4.6.2. Multifractal properties of the derivative

Corollary 4.5 establishes for the binomial what intuition suggests in general, i.e. that the multifractal spectra of processes and their derivative should be related in a simple fashion, at least for certain classes of processes. As we will show, increasing processes have this property, at least for the wavelet based multifractal spectra. However, the order of Hölder regularity in the sense of the spaces $C^h_t$ (see Lemma 4.1) might decrease under differentiation by an amount different from 1. This is particularly true in the presence of highly oscillatory behavior such as “chirps”, as the example $t^a \sin(1/t^2)$ demonstrates. In order to assess the proper space $C^h_t$ a 2-microlocalization has to be employed. For good surveys see [JAF 95, JAF 91].

In order to establish a general result on derivatives we place ourselves in the framework whereby we care less for a representation of a process in terms of wavelet coefficients and are interested purely in an analysis of oscillatory behavior. A typical example of an analyzing mother wavelet $\psi$ are the derivatives of the Gaussian kernel $\exp(-t^2/2)$ which were used to produce Figure 4.4.

The idea is to use integration by parts. For a continuous measure $\mu$ on $[0, 1]$ with distribution function $M(t) = \mu([0, t))$ and a continuously differentiable function $g$ this reads as
$$\begin{aligned} \int g(t)\,\mu(dt) &= \lim_{n\to\infty} \sum_{k=0}^{2^n-1} g(k2^{-n}) \left( M((k+1)2^{-n}) - M(k2^{-n}) \right) \\ &= \lim_{n\to\infty} \left[ \sum_{k=0}^{2^n-1} M(k2^{-n}) \left( g((k-1)2^{-n}) - g(k2^{-n}) \right) + M(1)\,g(1 - 2^{-n}) - M(0)\,g(-2^{-n}) \right] \\ &= M(1)g(1) - M(0)g(0) - \int M(t)\,g'(t)\,dt \end{aligned} \tag{4.81}$$
where we alluded to (4.53) and regrouped terms. As a matter of fact, $M(0) = 0$ and $M(1) = 1$. A similar calculation can be performed for a more general, not necessarily increasing process $Y$, provided it has a derivative $Y'$, by replacing $\mu(dt)$ with $Y'(t)\,dt$.


Figure 4.4. Demonstration of the multifractal behavior of a binomial measure $\mu_b$ (left) and its distribution function $M_b$ (right). On the top a numerical simulation, i.e. (4.52) on the left and $M_b(k2^{-n})$ on the right for $n = 20$. In the middle the moduli of a continuous wavelet transform [DAU 92] where the second Gaussian derivative was taken as the analyzing wavelet $\psi(t)$ for $\mu_b$, resp. the third derivative $\psi'$ for $M_b$. The dark lines indicate the “lines of maxima” [JAF 97, BAC 93], i.e. the locations where the modulus of $\int \psi(2^j t - s)\,\mu_b(dt)$ has a local maximum as a function of $s$ with $j$ fixed. On the bottom a multifractal analysis in three steps. First, a plot of $\log S^n_w(q)$ against $n$ tests for linear behavior for various $q$. Second, the partition function $\tau(q)$ is computed as the slopes of a least square linear fit of $\log S^n$. Finally, the Legendre transform $\tau^*(a)$ of $\tau(q)$ is calculated following (4.49). Indicated with dashes in the plots of $\tau(q)$ and $\tau^*(a)$ of $\mu_b$ are the corresponding functions for $M_b$, providing empirical evidence for (4.76), (4.77), and (4.83)

Now, setting $g(t) = 2^{n/2}\psi(2^n t - k)$ for a smooth analyzing wavelet $\psi$ we have $g'(t) = 2^{3n/2}\psi'(2^n t - k)$ and obtain
$$C_{n,k}(\psi, \mu) = 2^{n/2}\psi(2^n - k) - 2^n \cdot C_{n,k}(\psi', M). \tag{4.82}$$
Estimating $2^n - k_n = 2^n - \lfloor t2^n \rfloor \simeq (1 - t)2^n$ and assuming exponential decay of $\psi(t)$ at infinity allows us to conclude
$$\underline{w}_{\psi,\mu}(t) = -1 + \underline{w}_{\psi',M}(t), \tag{4.83}$$
and similarly for $\overline{w}(t)$.

COROLLARY 4.6.–
$$f_{\psi,\mu}(a) = f_{\psi',M}(a + 1), \qquad \tau^*_{\psi,\mu}(a) = \tau^*_{\psi',M}(a + 1). \tag{4.84}$$

This is impressively demonstrated in Figure 4.4. We should note that $\psi'$ has one more vanishing moment than $\psi$ which is easily seen by integrating by parts. Thus, it is natural to analyze the integral of a process, here the distribution function $M$ of the measure $\mu$, using $\psi'$ since the degree of the Taylor polynomials typically grows by 1 under integration.

NOTE 4.5 (Visibility of singularities and regularity of the wavelet).– It is notable that the Haar wavelet yields the full spectra of the binomial $M_b$ (and also of its distributional derivative $\mu_b$). This fact is in some discord with the folklore saying that a wavelet cannot detect degrees of regularity larger than its own. In other words, a signal will rarely be more regular than the basis elements it is composed of. To resolve the apparent paradox, recall the peculiar property of multiplicative measures which is to have constant Taylor polynomials. So, it will reveal its scaling structure to any analyzing wavelet with $\int \psi = 0$. No higher regularity, i.e. $\int t^k \psi(t)\,dt = 0$, is required. The correct reading of the literature is, indeed, that wavelets are only guaranteed to detect singularities smaller than their own regularity.

4.7. Self-similarity and LRD

The statistical self-similarity as expressed in (4.1) makes FBM, or rather its increment process, a paradigm of long range dependence (LRD). To be more explicit let $\delta$ denote a fixed lag and define fractional Gaussian noise (FGN) as
$$G(k) := B_H((k + 1)\delta) - B_H(k\delta). \tag{4.85}$$

Possessing the LRD property means that the auto-correlation $r_G(k) := E_\Omega[G(n + k)G(n)]$ decays so slowly that $\sum_k r_G(k) = \infty$. The presence of such strong dependence bears an important consequence on the aggregated processes
$$G^{(m)}(k) := \frac{1}{m} \sum_{i=km}^{(k+1)m-1} G(i). \tag{4.86}$$
They have a much higher variance, and variability, than would be the case for a short range dependent process. Indeed, if $X$ is a process with iid values $X(k)$, then $X^{(m)}(k)$ has variance $(1/m^2)\,\mathrm{var}(X_0 + \cdots + X_{m-1}) = (1/m)\,\mathrm{var}(X)$. For $G$ we find, due to (4.1) and $B_H(0) = 0$,
$$\mathrm{var}(G^{(m)}(0)) = \mathrm{var}\left( \frac{1}{m} B_H(m\delta) \right) = \mathrm{var}\left( \frac{m^H}{m} B_H(\delta) \right) = m^{2H-2}\,\mathrm{var}(B_H(\delta)). \tag{4.87}$$
Indeed, for $H > 1/2$ this expression decays much slower than $1/m$. As is shown in [COX 84], $\mathrm{var}(X^{(m)}) \simeq m^{2H-2}$ is equivalent to $r_X(k) \simeq k^{2H-2}$ and so, $G(k)$ is indeed LRD for $H > 1/2$.

Let us demonstrate with FGN how to relate LRD with multifractal analysis based only on the fact that it is a zero-mean process, not (4.1). To this end let


$\delta = 2^{-n}$ denote the finest resolution we will consider, and let 1 be the largest. For $m = 2^i$ ($0 \le i \le n$) the process $mG^{(m)}(k)$ becomes simply $B_H((k + 1)m\delta) - B_H(km\delta) = B_H((k + 1)2^{i-n}) - B_H(k2^{i-n})$. However, the second moment of this expression, which is also the variance, is exactly what determines $T_\alpha(2)$. More precisely, using stationarity of $G$ and substituting $m = 2^i$, we obtain
$$2^{-(n-i)T_\alpha(2)} \simeq E_\Omega\left[ S^{n-i}_\alpha(2) \right] = \sum_{k=0}^{2^{n-i}-1} E_\Omega\left[ |mG^{(m)}(k)|^2 \right] = 2^{n-i}\, 2^{2i}\, \mathrm{var}\left( G^{(2^i)} \right). \tag{4.88}$$
This should be compared with the definition of the LRD parameter $H$ using
$$\mathrm{var}(G^{(m)}) \simeq m^{2H-2} \quad\text{or}\quad \mathrm{var}(G^{(2^i)}) \simeq 2^{i(2H-2)}. \tag{4.89}$$

At this point a conceptual difficulty arises. Multifractal analysis is formulated in the limit of small scales ($i \to -\infty$) while LRD is a property for large scales ($i \to \infty$). Thus, the two exponents $H$ and $T_\alpha(2)$ can in theory only be related when assuming that the scaling they represent is actually exact at all scales, and not only asymptotically. In any real world application, however, we will determine both $H$ and $T_\alpha(2)$ by finding a scaling region $\underline{i} \le i \le \overline{i}$ in which (4.88) and (4.89) hold up to satisfactory precision. Comparing the two scaling laws in $i$ yields $T_\alpha(2) + 1 - 2 = 2H - 2$, or
$$H = \frac{T_\alpha(2) + 1}{2}. \tag{4.90}$$
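The scaling law (4.89) suggests a simple variance-based estimate of $H$: aggregate the series over blocks of size $m = 2^i$, regress $\log_2 \mathrm{var}(G^{(m)})$ on $i$, and read $2H - 2$ from the slope. The sketch below applies this to iid Gaussian noise, for which $H = 1/2$ and hence $T_\alpha(2) = 0$ by (4.90). The sample size, the range of scales, and the function names are arbitrary choices of this illustration, and the estimator is the plain variance method, not the more robust wavelet estimator of [ABR 95].

```python
import math
import random

def aggregated_variance(x, m):
    """Sample variance of the block means G^(m)(k) of the series x, as in (4.86)."""
    blocks = [sum(x[j:j + m]) / m for j in range(0, len(x) - m + 1, m)]
    mean = sum(blocks) / len(blocks)
    return sum((b - mean) ** 2 for b in blocks) / (len(blocks) - 1)

def estimate_H(x, max_log2_m=6):
    """Least-squares slope of log2 var(G^(2^i)) against i, converted to H via (4.89)."""
    xs = list(range(max_log2_m + 1))
    ys = [math.log2(aggregated_variance(x, 2 ** i)) for i in xs]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    slope = (sum((u - xbar) * (v - ybar) for u, v in zip(xs, ys))
             / sum((u - xbar) ** 2 for u in xs))
    return 1.0 + slope / 2.0          # slope = 2H - 2

rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]   # iid noise: H = 1/2 expected
H = estimate_H(noise)
T2 = 2.0 * H - 1.0                                      # T_alpha(2) from (4.90)
```

For an LRD series the same regression would return a slope above $-1$, i.e. $H > 1/2$.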

This formula expresses most pointedly how multifractal analysis goes beyond second order statistics: with $T(q)$ we capture the scaling of all moments. The relation (4.90), here derived for zero-mean processes, can be put on more solid grounds using wavelet estimators of the LRD parameter [ABR 95] which are more robust than the estimators through variance. The same formula (4.90) also reappears for certain multifractals (see (4.100)).

In this context it is worthwhile pointing forward to (4.96), from which we conclude that $T_{B_H}(q) = qH - 1$ if $q > -1$. The fact to note here is that FBM requires indeed only one parameter to capture its scaling while multifractal scaling, in principle, is described by an array of parameters $T(q)$.

4.8. Multifractal processes

The most prominent examples where we find coinciding, strictly concave multifractal spectra are the distribution functions of cascade measures [MAN 74, KAH 76, CAW 92, FAL 94, ARB 96, OLS 94, HOL 92, RIE 95a, RIE 97b, PES 97] for which $\dim(K_a)$ and $T^*(a)$ are equal and have the form of a ∩ (see Figure 4.3 and also 4.5(e)). These cascades are constructed through a multiplicative iteration scheme such as the binomial cascade, which is presented in detail earlier in this chapter with special emphasis on its wavelet decomposition. Having positive increments, this class of processes is, however, sometimes too restrictive. FBM, as noted, has the disadvantage of a poor multifractal structure and does not contribute to a larger pool of stochastic processes with multifractal characteristics.

It is also notable that the first “natural”, truly multifractal stochastic process to be identified was the Lévy motion [JAF 99]. This example is particularly appealing since scaling is not injected into the model by an iterative construction (this is what we mean by the term natural). However, its spectrum is degenerate, though it shows a non-trivial range of scaling exponents $h(t)$, in the sense that it is linear.

4.8.1. Construction and simulation

With the formalism presented here, the stage is set for constructing and studying new classes of truly multifractal processes. The idea, to speak in Mandelbrot’s own words, is inevitable after the fact. The ingredients are simple: a multifractal “time warp”, i.e. an increasing function or process $M(t)$ for which the multifractal formalism is known to hold, and a function or process $V$ with strong monofractal scaling properties such as fractional Brownian motion (FBM), a Weierstrass process or self-similar martingales such as Lévy motion. We then form the compound process
$$\mathcal{V}(t) := V(M(t)). \tag{4.91}$$

To fix the ideas, let us recall the method of midpoint displacement which can be used to define a simple Brownian motion $B_{1/2}$ which we will also call the Wiener motion (WM) for a clear distinction from FBM. This method constructs $B_{1/2}$ iteratively at dyadic points. Having constructed $B_{1/2}(k2^{-n})$ and $B_{1/2}((k + 1)2^{-n})$ we define $B_{1/2}((2k + 1)2^{-n-1})$ as $(B_{1/2}(k2^{-n}) + B_{1/2}((k + 1)2^{-n}))/2 + X_{k,n}$. The offsets $X_{k,n}$ are independent zero-mean Gaussian variables with variance such as to satisfy (4.1) with $H = 1/2$, hence the name of the method. One way to obtain Wiener motion in multifractal time WM(MF) is then to keep the offset variables $X_{k,n}$ as they are but to apply them at the time instances $t_{k,n}$ defined by $t_{k,n} = M^{-1}(k2^{-n})$, i.e. $M(t_{k,n}) = k2^{-n}$:
$$B_{1/2}(t_{2k+1,n+1}) := \frac{B_{1/2}(t_{k,n}) + B_{1/2}(t_{k+1,n})}{2} + X_{k,n}. \tag{4.92}$$

This amounts to a randomly located random displacement, the location being determined by $\mathcal{M}$. Indeed, (4.91) is nothing but a time warp. An alternative construction of “warped Wiener motion” WM(MF) which yields equally spaced sampling, as opposed to the samples $B_{1/2}(t_{k,n})$ provided by (4.92), is desirable. To this end, note first that the increments of WM(MF) become independent Gaussians once the path of $\mathcal{M}(t)$ is realized. To be more precise, fix $n$ and let

$$G(k) := \mathcal{B}((k+1)2^{-n}) - \mathcal{B}(k2^{-n}) = B_{1/2}(\mathcal{M}((k+1)2^{-n})) - B_{1/2}(\mathcal{M}(k2^{-n})). \tag{4.93}$$
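Definition (4.93) suggests a direct way to simulate WM(MF) on an equally spaced grid: realize the increments of the time warp first, then draw independent centered Gaussians whose variances are those increments. The sketch below assumes a conservative binomial cascade with weights `m0` and `1 - m0` assigned at random to the two children of each dyadic interval; this particular cascade and the function names are illustrative choices, not necessarily the exact model used for Figure 4.5.

```python
import math
import random

def binomial_time(levels, m0, rng):
    """Increments of an (assumed) binomial cascade M over the 2^levels dyadic intervals."""
    mass = [1.0]
    for _ in range(levels):
        finer = []
        for m in mass:
            # Randomly decide which child receives the share m0.
            left = m0 if rng.random() < 0.5 else 1.0 - m0
            finer.extend((m * left, m * (1.0 - left)))
        mass = finer
    return mass

def warped_wiener(levels, m0, rng):
    """Return (increments of M, increments G, sampled path of the warped motion)."""
    dM = binomial_time(levels, m0, rng)
    # Conditionally on M, the G(k) are independent N(0, dM[k]).
    G = [rng.gauss(0.0, math.sqrt(v)) for v in dM]
    path = [0.0]
    for g in G:
        path.append(path[-1] + g)
    return dM, G, path

dM, G, path = warped_wiener(8, 0.3, random.Random(1))
```

Because the cascade is conservative, the increments `dM` sum to 1, so the warped path lives on $[0, 1]$ in both clock time and warped time.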

For a sample path of $G$ we start by producing the random variables $\mathcal{M}(k2^{-n})$. Once this is done, the $G(k)$ are simply independent zero-mean Gaussian variables with variance $|\mathcal{M}((k+1)2^{-n}) - \mathcal{M}(k2^{-n})|$. This procedure has been used in Figure 4.5.

4.8.2. Global analysis

To calculate the multifractal envelope $T(q)$ we need only know that $V$ is an $H$-sssi process, i.e. that the increment $V(t+u) - V(t)$ is equal in distribution to $u^H V(1)$ (see (4.1)). Assuming independence between $V$ and $\mathcal{M}$, a simple calculation reads as

$$\begin{aligned}
E_\Omega \sum_{k=0}^{2^n-1} \left|\mathcal{V}((k+1)2^{-n}) - \mathcal{V}(k2^{-n})\right|^q
&= \sum_{k=0}^{2^n-1} E\Big[ E\Big[ \left|V(\mathcal{M}((k+1)2^{-n})) - V(\mathcal{M}(k2^{-n}))\right|^q \,\Big|\, \mathcal{M}(k2^{-n}), \mathcal{M}((k+1)2^{-n}) \Big] \Big] \\
&= \sum_{k=0}^{2^n-1} E\Big[ \left|\mathcal{M}((k+1)2^{-n}) - \mathcal{M}(k2^{-n})\right|^{qH} \Big]\, E\big[ |V(1)|^q \big].
\end{aligned} \tag{4.94}$$

With little more effort the increments $|\mathcal{V}((k+1)2^{-n}) - \mathcal{V}(k2^{-n})|$ can be replaced by suprema, i.e. by $2^{-nh_k^n}$, or even by certain wavelet coefficients under appropriate assumptions (see [RIE 88]). It follows that

$$\text{Warped } H\text{-sssi:}\quad T_{\mathcal{V}}(q) = \begin{cases} T_{\mathcal{M}}(qH) & \text{if } E_\Omega\big[\,|\sup_{0\le t\le 1} V(t)|^q\,\big] < \infty \\ -\infty & \text{otherwise.} \end{cases} \tag{4.95}$$

Simple $H$-sssi process: when choosing the deterministic warp time $\mathcal{M}(t) = t$ we have $T_{\mathcal{M}}(q) = q - 1$ since $S^n_{\mathcal{M}}(q) = 2^n \cdot 2^{-nq}$ for all $n$. Also, $\mathcal{V} = V$. We obtain $T_{\mathcal{M}}(qH) = qH - 1$, which has to be inserted into (4.95) to obtain

$$\text{Simple } H\text{-sssi:}\quad T_{V}(q) = \begin{cases} qH - 1 & \text{if } E_\Omega\big[\,|\sup_{0\le t\le 1} V(t)|^q\,\big] < \infty \\ -\infty & \text{otherwise.} \end{cases} \tag{4.96}$$

4.8.3. Local analysis of warped FBM

Let us now turn to the special case where $V$ is FBM. Then, we use the term FB(MF) to abbreviate fractional Brownian motion in multifractal time: $\mathcal{B}(t) = B_H(\mathcal{M}(t))$. First, to obtain an idea of what to expect from the spectra of $\mathcal{B}$, let us note that the moments appearing in (4.95) are finite for all $q > -1$ (see [RIE 88, lem 7.4] for a detailed discussion). Applying the Legendre transform easily yields

$$T^*_{\mathcal{B}}(a) = \inf_q \big(qa - T_{\mathcal{M}}(qH)\big) = T^*_{\mathcal{M}}(a/H). \tag{4.97}$$
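The identity (4.97) follows from the substitution $p = qH$ in the infimum, and it can be checked numerically. The sketch below assumes, purely for illustration, the deterministic binomial measure with weights $m_0$ and $1 - m_0$, for which $T_{\mathcal{M}}(q) = -\log_2(m_0^q + (1-m_0)^q)$; the values of `M0` and `H` are hypothetical. Since $T_{\mathcal{M}}$ is concave, $q \mapsto qa - T(q)$ is convex and a ternary search locates the infimum.

```python
import math

M0 = 0.3   # hypothetical weight of the binomial measure
H = 0.7    # hypothetical Hurst parameter

def T_M(q):
    """Partition function of the deterministic binomial measure (assumed model)."""
    return -math.log2(M0 ** q + (1.0 - M0) ** q)

def T_B(q):
    """Warped FBM: T_B(q) = T_M(qH), as in (4.95)."""
    return T_M(q * H)

def legendre(T, a, lo=-50.0, hi=50.0, iters=200):
    """inf_q (q*a - T(q)) for concave T, by ternary search on [lo, hi]."""
    f = lambda q: q * a - T(q)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2          # the minimum lies in [lo, m2]
        else:
            lo = m1          # the minimum lies in [m1, hi]
    return f(0.5 * (lo + hi))

a = 0.7
lhs = legendre(T_B, a)        # T*_B(a)
rhs = legendre(T_M, a / H)    # T*_M(a/H)
```

The two values `lhs` and `rhs` agree up to numerical error as long as $a/H$ lies strictly inside the range of slopes of $T_{\mathcal{M}}$, so that the infimum is attained within the search interval.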

[Figure 4.5: panels (a), (b) and (c) are plotted against time on [0, 1]; panel (d) against time lag on [−0.5, 0.5]; panel (e) against a on [0, 2].]

Figure 4.5. Left: simulation of Brownian motion in binomial time: (a) sampling of $\mathcal{M}_b((k+1)2^{-n}) - \mathcal{M}_b(k2^{-n})$ ($k = 0, \dots, 2^n - 1$), indicating distortion of dyadic time intervals; (b) $\mathcal{M}_b(k2^{-n})$: the time warp; (c) Brownian motion warped with (b): $\mathcal{B}(k2^{-n}) = B_{1/2}(\mathcal{M}_b(k2^{-n}))$. Right: estimation of $\dim E_a^{\mathcal{B}}$ using $\tau^*_{w,\mathcal{B}}$: (d) empirical correlation of the Haar wavelet coefficients; (e) dot-dashed: $T^*_{\mathcal{M}_b}$ (from theory), dashed: $T^*_{\mathcal{B}}(a) = T^*_{\mathcal{M}_b}(a/H)$, solid: the estimator $\tau^*_{w,\mathcal{B}}$ obtained from (c). (Reproduced from [GON 99])

Second, towards the local analysis we recall the uniform and strict Hölder continuity of the paths of FBM⁵, which reads roughly as

$$\sup_{|u|\le\delta} |\mathcal{B}(t+u) - \mathcal{B}(t)| = \sup_{|u|\le\delta} |B_H(\mathcal{M}(t+u)) - B_H(\mathcal{M}(t))| \simeq \sup_{|u|\le\delta} |\mathcal{M}(t+u) - \mathcal{M}(t)|^H.$$

5. For a precise statement see Adler [ADL 81] or [RIE 88, Theorem 7.4].


This is the key to concluding that $B_H$ simply squeezes the Hölder regularity exponents by a factor $H$. Thus, $h_{\mathcal{B}}(t) = H \cdot h_{\mathcal{M}}(t)$, etc. and

$$K^{\mathcal{M}}_{a/H} = K^{\mathcal{B}}_a,$$

and, consequently, analogous to (4.97), $d_{\mathcal{B}}(a) = d_{\mathcal{M}}(a/H)$. Figure 4.5(d)-(e) displays an estimation of $d_{\mathcal{B}}(a)$ using wavelets which agrees very closely with the form $d_{\mathcal{M}}(a/H)$ predicted by theory (for statistics on this estimator see [GON 99, GON 98]). In conclusion:

COROLLARY 4.7 (Fractional Brownian motion in multifractal time).– Let $B_H$ denote FBM of Hurst parameter $H$. Let $\mathcal{M}(t)$ have almost surely continuous paths and be independent of $B_H$. Then, the multifractal warp formalism

$$\dim(K^{\mathcal{B}}_a) = f_{\mathcal{B}}(a) = \tau^*_{\mathcal{B}}(a) = T^*_{\mathcal{B}}(a) = T^*_{\mathcal{M}}(a/H) \tag{4.98}$$

holds for $\mathcal{B}(t) = B_H(\mathcal{M}(t))$ for any $a$ such that the multifractal formalism holds for $\mathcal{M}$ at $a/H$, i.e. for which $\dim(K^{\mathcal{M}}_{a/H}) = T^*_{\mathcal{M}}(a/H)$.

This means that the local, or fine, multifractal structure of $\mathcal{B}$ captured in $\dim(K^{\mathcal{B}}_a)$ on the left can be estimated through grain-based, simpler and numerically more robust spectra on the right side, such as $\tau^*_{\mathcal{B}}(a)$ (compare Figure 4.5(e)).

The “warp formula” (4.98) is appealing since it allows us to separate the LRD parameter of FBM and the multifractal spectrum of the time change $\mathcal{M}$. Indeed, provided that $\mathcal{M}$ is almost surely increasing, we have $T_{\mathcal{M}}(1) = 0$ since $S^n_{\mathcal{M}}(1) = \mathcal{M}(1)$ for all $n$. Thus, $T_{\mathcal{B}}(1/H) = 0$ reveals the value of $H$. Alternatively, the tangent to $T^*_{\mathcal{B}}$ through the origin has slope $1/H$. Once $H$ is known, $T^*_{\mathcal{M}}$ follows easily from $T^*_{\mathcal{B}}$.

Simple FBM: when choosing the deterministic warp time $\mathcal{M}(t) = t$ we have $\mathcal{B} = B_H$ and $T_{\mathcal{M}}(q) = q - 1$ since $S^n_{\mathcal{M}}(q) = 2^n \cdot 2^{-nq}$ for all $n$. We conclude that

$$T_{B_H}(q) = qH - 1 \tag{4.99}$$

for all $q > -1$. This confirms (4.90) for FGN. With (4.98) it shows that all spectra of FBM consist of the one point $(H, 1)$ only, making the monofractal character of this process most explicit.
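The remark that $T_{\mathcal{B}}(1/H) = 0$ reveals $H$ can be turned into a small numerical recipe: locate the zero of $T_{\mathcal{B}}$ and invert it. The sketch below is hedged accordingly. It uses the closed-form partition function of a deterministic binomial measure as a stand-in for an estimated $T_{\mathcal{B}}$, and the names `T_M`, `estimate_H` and the value `H_true` are illustrative assumptions. It also assumes that $T_{\mathcal{B}}$ is increasing on the search interval, which holds for this model.

```python
import math

def T_M(q, m0=0.3):
    """Partition function of the deterministic binomial measure (assumed model)."""
    return -math.log2(m0 ** q + (1.0 - m0) ** q)

def estimate_H(T_B, lo=0.5, hi=50.0, iters=100):
    """Solve T_B(q) = 0 by bisection (T_B assumed increasing) and return H = 1/q."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if T_B(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 1.0 / (0.5 * (lo + hi))

H_true = 0.8                                  # hypothetical Hurst parameter
H_hat = estimate_H(lambda q: T_M(q * H_true)) # T_B(q) = T_M(qH)
```

With an empirical partition function in place of the closed form, the same bisection applies, but the accuracy of `H_hat` is then limited by the estimation error of $T_{\mathcal{B}}$ near its zero.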


4.8.4. LRD and estimation of warped FBM

Let $G(k) := \mathcal{B}((k+1)2^{-n}) - \mathcal{B}(k2^{-n})$ be FGN in multifractal time (see (4.93) for the case $H = 1/2$). Calculating auto-correlations explicitly shows that $G$ is second-order stationary under mild conditions with

$$H_G = \frac{T_{\mathcal{M}}(2H) + 1}{2}. \tag{4.100}$$
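Formula (4.100) is easy to evaluate once $T_{\mathcal{M}}$ is known. As a hedged illustration, the sketch below again assumes the deterministic binomial measure with weights `m0` and `1 - m0` (an assumption made here, not a model prescribed by the text) and evaluates $H_G$ for a few values of $H$.

```python
import math

def T_M(q, m0=0.3):
    """Partition function of the deterministic binomial measure (assumed model)."""
    return -math.log2(m0 ** q + (1.0 - m0) ** q)

def H_G(H, m0=0.3):
    """Second-order scaling exponent of FGN in multifractal time, formula (4.100)."""
    return 0.5 * (T_M(2.0 * H, m0) + 1.0)

for H in (0.3, 0.5, 0.8):
    print(H, H_G(H))
```

For the trivial warp `m0 = 0.5`, i.e. $\mathcal{M}(t) = t$, the function returns $H_G = H$, as expected from $T_{\mathcal{M}}(q) = q - 1$.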

Let us discuss some special cases. For example, for a continuous, increasing warp time $\mathcal{M}$, we always have $T_{\mathcal{M}}(0) = -1$ and $T_{\mathcal{M}}(1) = 0$. Exploiting the concave shape of $T_{\mathcal{M}}$ we find that $H < H_G < 1/2$ for $0 < H < 1/2$, and $1/2 < H_G < H$ for $1/2 < H < 1$. Thus, multifractal warping cannot create LRD and it seems to weaken the dependence as measured through second-order statistics.

In particular, in the case $H = 1/2$ (“white noise in multifractal time”) $G(k)$ becomes uncorrelated. This follows from (4.100). Notably, this is a different statement from the observation that the $G(k)$ are independent conditionally on $\mathcal{M}$ (see section 4.8.1). As a particular consequence, wavelet coefficients will decorrelate fast for the entire process $G$, not only when conditioning on $\mathcal{M}$ (see Figure 4.5(d)). This is favorable for estimation purposes as it reduces the error variance.

Of greater importance, however, is the warning that the vanishing correlations should not lead us to assume the independence of the $G(k)$. After all, $G$ becomes Gaussian only when we condition on knowing $\mathcal{M}$. A strong, higher-order dependence in $G$ is hidden in the dependence of the increments of $\mathcal{M}$, which determine the variance of $G(k)$ as in (4.93). Indeed, Figure 4.5(c) shows clear phases of monotonicity of $\mathcal{B}$, indicating positive dependence in its increments $G$, despite vanishing correlations. Mandelbrot calls this the “blind spot of spectral analysis”.

4.9. Bibliography

[ABR 95] ABRY P., GONÇALVES P., FLANDRIN P., “Wavelets, spectrum analysis and 1/f processes”, in ANTONIADIS A., OPPENHEIM G. (Eds.), Lecture Notes in Statistics: Wavelets and Statistics, vol. 103, p. 15–29, 1995.
[ABR 00] ABRY P., FLANDRIN P., TAQQU M., VEITCH D., “Wavelets for the analysis, estimation and synthesis of scaling data”, Self-similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000.
[ADL 81] ADLER R., The Geometry of Random Fields, John Wiley & Sons, New York, 1981.
[ARB 96] ARBEITER M., PATZSCHKE N., “Self-similar random multifractals”, Math. Nachr., vol. 181, p. 5–42, 1996.
[ARN 98] ARNEODO A., BACRY E., MUZY J., “Random cascades on wavelet dyadic trees”, Journal of Mathematical Physics, vol. 39, no. 8, p. 4142–4164, 1998.


[BAC 93] BACRY E., MUZY J., ARNEODO A., “Singularity spectrum of fractal signals from wavelet analysis: exact results”, J. Stat. Phys., vol. 70, p. 635–674, 1993.
[BAC 03] BACRY E., MUZY J., “Log-infinitely divisible multifractal processes”, Comm. in Math. Phys., vol. 236, p. 449–475, 2003.
[BAR 97] BARRAL J., Continuity, moments of negative order, and multifractal analysis of Mandelbrot’s multiplicative cascades, PhD thesis no. 4704, Paris-Sud University, 1997.
[BAR 02] BARRAL J., MANDELBROT B., “Multiplicative products of cylindrical pulses”, Probability Theory and Related Fields, vol. 124, p. 409–430, 2002.
[BAR 03] BARRAL J., “Poissonian products of random weights: Uniform convergence and related measures”, Rev. Mat. Iberoamericano, vol. 19, p. 1–44, 2003.
[BAR 04] BARRAL J., MANDELBROT B., “Random multiplicative multifractal measures, Part II”, Proc. Symp. Pures Math., AMS, Providence, RI, vol. 72, no. 2, p. 17–52, 2004.
[BEN 87] BEN NASR F., “Mandelbrot random measures associated with substitution”, C. R. Acad. Sc. Paris, vol. 304, no. 10, p. 255–258, 1987.
[BRO 92] BROWN G., MICHON G., PEYRIERE J., “On the multifractal analysis of measures”, J. Stat. Phys., vol. 66, p. 775–790, 1992.
[CAW 92] CAWLEY R., MAULDIN R.D., “Multifractal decompositions of Moran fractals”, Advances Math., vol. 92, p. 196–236, 1992.
[CHA 02] CHAINAIS P., RIEDI R., ABRY P., “Compound Poisson cascades”, Proc. Colloque “Autosimilarité et Applications” Clermont-Ferrand, France, May 2002, 2002.
[CHA 05] CHAINAIS P., RIEDI R., ABRY P., “On non-scale invariant infinitely divisible cascades”, IEEE Trans. Information Theory, vol. 51, no. 3, p. 1063–1083, 2005.
[COX 84] COX D., “Long-range dependence: a review”, Statistics: An Appraisal, p. 55–74, 1984.
[CUT 86] CUTLER C., “The Hausdorff dimension distribution of finite measures in Euclidean space”, Can. J. Math., vol. 38, p. 1459–1484, 1986.
[DAU 92] DAUBECHIES I., Ten Lectures on Wavelets, SIAM, New York, 1992.
[ELL 84] ELLIS R., “Large deviations for a general class of random vectors”, Ann. Prob., vol. 12, p. 1–12, 1984.
[EVE 95] EVERTSZ C.J.G., “Fractal geometry of financial time series”, Fractals, vol. 3, p. 609–616, 1995.
[FAL 94] FALCONER K.J., “The multifractal spectrum of statistically self-similar measures”, J. Theor. Prob., vol. 7, p. 681–702, 1994.
[FEL 98] FELDMANN A., GILBERT A.C., WILLINGER W., “Data networks as cascades: Investigating the multifractal nature of Internet WAN traffic”, Proc. ACM/Sigcomm 98, vol. 28, p. 42–55, 1998.
[FRI 85] FRISCH U., PARISI G., “Fully developed turbulence and intermittency”, Proc. Int. Summer School on Turbulence and Predictability in Geophysical Fluid Dynamics and Climate Dynamics, p. 84–88, 1985.


[GON 98] GONÇALVES P., RIEDI R., BARANIUK R., “Simple statistical analysis of wavelet-based multifractal spectrum estimation”, Proc. 32nd Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA, Nov. 1998.
[GON 99] GONÇALVES P., RIEDI R., “Wavelet analysis of fractional Brownian motion in multifractal time”, Proceedings of the 17th Colloquium GRETSI, Vannes, France, September 1999.
[GON 05] GONÇALVES P., RIEDI R., “Diverging moments and parameter estimation”, J. Amer. Stat. Assoc., vol. 100, no. 472, p. 1382–1393, December 2005.
[GRA 83] GRASSBERGER P., PROCACCIA I., “Characterization of strange attractors”, Phys. Rev. Lett., vol. 50, p. 346–349, 1983.
[HAL 86] HALSEY T., JENSEN M., KADANOFF L., PROCACCIA I., SHRAIMAN B., “Fractal measures and their singularities: the characterization of strange sets”, Phys. Rev. A, vol. 33, p. 1141–1151, 1986.
[HEN 83] HENTSCHEL H., PROCACCIA I., “The infinite number of generalized dimensions of fractals and strange attractors”, Physica D, vol. 8, p. 435–444, 1983.
[HOL 92] HOLLEY R., WAYMIRE E., “Multifractal dimensions and scaling exponents for strongly bounded random cascades”, Ann. Appl. Prob., vol. 2, p. 819–845, 1992.
[JAF 91] JAFFARD S., “Pointwise smoothness, two-microlocalization and wavelet coefficients”, Publicacions Matemàtiques, vol. 35, p. 155–168, 1991.
[JAF 95] JAFFARD S., “Local behavior of Riemann’s function”, Contemporary Mathematics, vol. 189, p. 287–307, 1995.
[JAF 97] JAFFARD S., “Multifractal formalism for functions, Part 1: Results valid for all functions”, SIAM J. of Math. Anal., vol. 28, p. 944–970, 1997.
[JAF 99] JAFFARD S., “The multifractal nature of Lévy processes”, Prob. Th. Rel. Fields, vol. 114, p. 207–227, 1999.
[KAH 76] KAHANE J.-P., PEYRIÈRE J., “Sur Certaines Martingales de Benoit Mandelbrot”, Adv. Math., vol. 22, p. 131–145, 1976.
[LV 98] LÉVY VÉHEL J., VOJAK R., “Multifractal analysis of Choquet capacities: preliminary results”, Adv. Appl. Math., vol. 20, p. 1–34, 1998.
[LEL 94] LELAND W., TAQQU M., WILLINGER W., WILSON D., “On the self-similar nature of Ethernet traffic (extended version)”, IEEE/ACM Trans. Networking, p. 1–15, 1994.
[MAN 68] MANDELBROT B.B., NESS J.W.V., “Fractional Brownian motion, fractional noises and applications”, SIAM Reviews, vol. 10, p. 422–437, 1968.
[MAN 74] MANDELBROT B.B., “Intermittent turbulence in self similar cascades: divergence of high moments and dimension of the carrier”, J. Fluid. Mech., vol. 62, p. 331, 1974.
[MAN 90a] MANDELBROT B.B., “Limit lognormal multifractal measures”, Physica A, vol. 163, p. 306–315, 1990.
[MAN 90b] MANDELBROT B.B., “Negative fractal dimensions and multifractals”, Physica A, vol. 163, p. 306–315, 1990.


[MAN 97] MANDELBROT B.B., Fractals and Scaling in Finance, Springer, New York, 1997.
[MAN 99] MANDELBROT B.B., “A multifractal walk down Wall Street”, Scientific American, vol. 280, no. 2, p. 70–73, February 1999.
[MAN 02] MANNERSALO P., NORROS I., RIEDI R., “Multifractal products of stochastic processes: construction and some basic properties”, Advances in Applied Probability, vol. 34, no. 4, p. 888–903, December 2002.
[MUZ 02] MUZY J., BACRY E., “Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling laws”, Phys. Rev. E, vol. 66, 2002.
[NOR 94] NORROS I., “A storage model with self-similar input”, Queueing Systems, vol. 16, p. 387–396, 1994.
[OLS 94] OLSEN L., “Random geometrically graph directed self-similar multifractals”, Pitman Research Notes Math. Ser., vol. 307, 1994.
[PES 97] PESIN Y., WEISS H., “A multifractal analysis of equilibrium measures for conformal expanding maps and Moran-like geometric constructions”, J. Stat. Phys., vol. 86, p. 233–275, 1997.
[PEY 98] PEYRIÈRE J., An Introduction to Fractal Measures and Dimensions, Paris, 11th Edition, k 159, 1998, ISBN 2-87800-143-5.
[RIB 06] RIBEIRO V., RIEDI R., CROUSE M.S., BARANIUK R.G., “Multiscale queuing analysis of long-range-dependent network traffic”, IEEE Trans. Networking, vol. 14, no. 5, p. 1005–1018, October 2006.
[RIE 88] RIEDI R.H., “Multifractal processes”, in DOUKHAN P., OPPENHEIM G., TAQQU M.S. (Eds.), Long Range Dependence: Theory and Applications, p. 625–715, Birkhäuser 2002, ISBN: 0817641688.
[RIE 95a] RIEDI R.H., “An improved multifractal formalism and self-similar measures”, J. Math. Anal. Appl., vol. 189, p. 462–490, 1995.
[RIE 95b] RIEDI R.H., MANDELBROT B.B., “Multifractal formalism for infinite multinomial measures”, Adv. Appl. Math., vol. 16, p. 132–150, 1995.
[RIE 97a] RIEDI R.H., LÉVY VÉHEL J., “Multifractal properties of TCP traffic: a numerical study”, Technical Report No 3129, INRIA Rocquencourt, France, February, 1997, see also: LÉVY VÉHEL J., RIEDI R.H., “Fractional Brownian motion and data traffic modeling”, in Fractals in Engineering, p. 185–202, Springer, 1997.
[RIE 97b] RIEDI R.H., SCHEURING I., “Conditional and relative multifractal spectra”, Fractals. An Interdisciplinary Journal, vol. 5, no. 1, p. 153–168, 1997.
[RIE 98] RIEDI R.H., MANDELBROT B.B., “Exceptions to the multifractal formalism for discontinuous measures”, Math. Proc. Cambr. Phil. Soc., vol. 123, p. 133–157, 1998.
[RIE 99] RIEDI R.H., CROUSE M.S., RIBEIRO V., BARANIUK R.G., “A multifractal wavelet model with application to TCP network traffic”, IEEE Trans. Info. Theory, Special issue on multiscale statistical signal analysis and its applications, vol. 45, p. 992–1018, April 1999.


[RIE 00] RIEDI R.H., WILLINGER W., “Toward an improved understanding of network traffic dynamics”, in PARK K., WILLINGER W. (Eds.), Self-similar Network Traffic and Performance Evaluation, p. 507–530, Wiley, 2000.
[RIE 04] RIEDI R.H., GONÇALVES P., Diverging moments, characteristic regularity and wavelets, Rice University, Dept. of Statistics, Technical Report, vol. TR2004-04, August 2004.
[RIE 07a] RIEDI R.H., GERSHMAN D., “Infinitely divisible shot-noise: modeling fluctuations in networking and finance”, Proceedings ICNF 07, Tokyo, Japan, September 2007.
[RIE 07b] RIEDI R.H., GERSHMAN D., Infinitely divisible shot-noise, Report, Dept. of Statistics, Rice University, TR2007-07, August 2007.
[TEL 88] TEL T., “Fractals, multifractals and thermodynamics”, Z. Naturforsch. A, vol. 43, p. 1154–1174, 1988.
[TRI 82] TRICOT C., “Two definitions of fractal dimension”, Math. Proc. Cambr. Phil. Soc., vol. 91, p. 57–74, 1982.
[VET 95] VETTERLI M., KOVAČEVIĆ J., Wavelets and Subband Coding, Prentice-Hall, Englewood Cliffs, NJ, 1995.


Chapter 5

Self-similar Processes

Chapter written by Albert BENASSI and Jacques ISTAS.

5.1. Introduction

5.1.1. Motivations

Invariance properties constitute the basis of major laws in physics. For example, conservation of energy results from the invariance of these laws under time translations. Mandelbrot was the first to relate scale invariance to complex objects, for which he coined the term “fractals”. Using the concept of scale invariance, different notions of fractal dimension can be discussed.

A particular class of complex objects presenting scale invariance is that of random media, on which we mainly focus here. Let us begin with the example of percolation (see, for example, [GRI 89]). On a regular network, some connections are randomly and abruptly removed. The resulting network is itself random and contains “cracks”, “bottlenecks”, “culs-de-sac”, etc. However, a regularity of statistical nature is often seen. For example, let us think of a network of spins on $\mathbb{Z}^2$ at critical temperature (see [GUY 94]): an “island” of + signs will be found within a “lake” of − signs, which itself is an island, etc. At each scale, we statistically see “the same thing”.

Over this mathematical medium, physicists imagine the circulation of a fluid (or particles) and are hence interested, for example, in the position $X_n$ of a particle after $n$ time steps or in statistical characteristics such as its average position $E X_n$. By symmetry, this average position is often zero. Therefore, we will study the corresponding second moment $E X_n^2$. Let us consider a case where, for large $n$:

$$E X_n^2 \sim \sigma^2 n^{2H}$$

with $\sigma^2$ the variance. When $H = 1/2$, the random walk $X_n$ is of Brownian nature. It is said to be anomalous, and superdiffusive (respectively subdiffusive), when $H > 1/2$ (respectively $H < 1/2$).

Let us now consider the case where the dilated random walk $(X(\lambda n)/\lambda^H)$, with $n \ge 0$, is statistically indistinguishable from the initial walk $(X_n)$, $n \ge 0$:

$$\left(\frac{X(\lambda n)}{\lambda^H},\ n \ge 0\right) \stackrel{\mathcal{L}}{=} \big(X_n,\ n \ge 0\big) \tag{5.1}$$

The walk is then said to be self-similar when the equality in law (5.1) is valid for all $\lambda > 0$. A comprehensive survey of random walks on fractal media, oriented toward physicists, can be found in [HAV 87]. A more mathematically grounded framework can be found in [BAR 95].

Apart from the framework of random walks, the reasons why a physical quantity exhibits power-law invariance are generally very difficult to uncover. Indications of a physical nature can be found in [HER 90], particularly in Duxburg’s contribution, which elaborates a scale renormalization theory for crack dynamics. In [DUB 97], a number of contributions in various fields such as financial markets, avalanches, metallurgy, etc. provide illustrations of scale invariance. In particular, [DUR 97] proposes an analysis of invariance phenomena in avalanches, which is both experimental and theoretical.

One major source of inspiration for the definition and study of the property of scale invariance is hydrodynamic turbulence and, more precisely, Kolmogorov’s work (see, for example, [FRI 97]) which, building on Richardson’s work on energy cascades, established the famous $-5/3$ law in 1941, based on the modeling of energy transfers in turbulent flows. This theory provides a powerful means to define self-similar stochastic processes and to study their properties. The reader is referred to [FRI 97] and references therein.
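The diffusion law $E X_n^2 \sim \sigma^2 n^{2H}$ introduced above can be checked numerically in the simplest case. The following sketch simulates the simple symmetric random walk on $\mathbb{Z}$, for which $E X_n^2 = n$ and hence $H = 1/2$, and estimates $H$ from the second moment at two time horizons. The function names, sample sizes and seed are illustrative assumptions.

```python
import math
import random

def walk_endpoint(n_steps, rng):
    """Position after n_steps fair +1/-1 steps: each random bit equal to 1 is a +1 step."""
    ups = bin(rng.getrandbits(n_steps)).count("1")
    return 2 * ups - n_steps

def second_moment(n_steps, n_walks, rng):
    """Monte Carlo estimate of E[X_n^2]."""
    return sum(walk_endpoint(n_steps, rng) ** 2 for _ in range(n_walks)) / n_walks

def diffusion_exponent(n1, n2, n_walks, rng):
    """Estimate H from E[X_n^2] ~ sigma^2 n^(2H) using two horizons n1 < n2."""
    ratio = second_moment(n2, n_walks, rng) / second_moment(n1, n_walks, rng)
    return 0.5 * math.log(ratio) / math.log(n2 / n1)

H_hat = diffusion_exponent(50, 200, 2000, random.Random(42))
```

For this walk the estimate should be close to 1/2. A walk on a disordered medium would be simulated differently, and a markedly larger or smaller value would signal anomalous diffusion.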
Scale invariance, or self-similarity, sometimes leads to a correlation property referred to as long-range dependence. Generally, for processes satisfying (5.1), the correlation of the increments decreases as a power law, a slow decay that indicates long-term persistence. Mandelbrot and van Ness [MAN 68] popularized fractional Brownian motion, historically introduced in [KOL 40], precisely to model long-range correlation. These processes have since enjoyed extraordinary success, and numerous extensions provide a large supply of models that can generally be identified. Let us briefly describe the article [WIL 98], in which the authors give a “physical” theory of fractional Brownian motion. A typical machine is either active or inactive; the durations of activity and inactivity are independent. An infinity of machines is then considered. At time $tT$, we consider all the active machines or, more exactly, the fluctuations around the average number of active machines at time $tT$. Then, we renormalize in $T$ to obtain the law of this phenomenon. The set of all durations of active machines at a given time is distributed in a rather similar way to that of the balls of the non-renormalized mass distribution model which we will study later. The reader is also directed to Chapter 12.

The goal of this chapter is to present a set of stochastic processes which have partial self-similarity and stationarity properties. Unfortunately, we cannot claim to present an exhaustive study; moreover, the available space prohibits it. We had to make choices. We preferred to challenge the reader with questions whose answers appeared surprising to us. The intention is to show that the concept of scale invariance remains, to a great extent, misunderstood. We thus hope, with the physics-centered discussion above and with what we present below, to have opened paths which other researchers will perhaps follow.

Trees will be used here as the guiding thread to scale invariance. It seemed to us that such a simple geometric structure, with such great flexibility, is well adapted to the study of self-similarity. In order to become familiar with trees and with invariance under dilation and translation, we begin our presentation with a study of purely geometric scaling, where trees and spaces are mixed. From this, we present random or non-random fractals with scale invariance. This leads us to the model of mass distribution, which provides a convenient means to generate a variety of processes, stochastic or not, with scale and translation invariance.

It is remarkable that, through a suitable wavelet decomposition, all the self-similar stochastic processes with stationary increments relate to a “layer-type” model, except perhaps for Takenaka’s process. These models of scale invariance enable us to question the difference between two concepts of equal importance: long-range correlation and sample path regularity. Using examples, we show that these two concepts are independent.

This chapter consists of four sections. The first is mostly an introduction. The second clarifies the Gaussian case, a quasi-solved problem. The third section turns to some non-Gaussian cases, mostly that of stable processes. The last section studies correlation and regularity through counterexamples. Generally, certain technical difficulties, sometimes even major ones, are overlooked. Consequently, the results may give the impression of a lack of rigor, and returning to the original work remains a necessity.


5.1.2. Scalings

5.1.2.1. Trees

To understand the geometric aspects of scaling, we will mainly use trees. Let us start by studying the interval $[0, 1[$. Let $q \ge 2$ be an integer and $A_q = \{0, 1, \dots, q-1\}$. Let us denote by $T_q$ (respectively $\tilde{T}_q$) the set of unilateral (respectively bilateral) sequences $(a_1, a_2, \dots, a_n, \dots)$ (respectively $(\dots, a_{-n}, \dots, a_{-1}, a_0, a_1, \dots, a_n, \dots)$) where the $a_n$ take values in $A_q$. We will denote by $\mathbf{a}$ a sequence $(a_1, a_2, \dots, a_n, \dots)$ and by $\mathbf{a}_n$ the finite sub-sequence $(a_1, a_2, \dots, a_n)$. We can consider the set $T_q$ from various points of view:
– $T_q$ is the set of real numbers of $[0, 1[$ written in base $q$:

$$x = \sum_{k=1}^{+\infty} \frac{a_k(x)}{q^k}$$

Let us recall that this decomposition is unique except for a countable set of real numbers;
– $T_q$ is the $q$-ary tree. Each father has $q$ sons. This tree is provided with a root or ancestor, denoted by $\emptyset$, that has no antecedent and is associated with the empty sequence. The vertices of $T_q$ are the finite sequences $\mathbf{a}_n$, with $n \ge 1$, and the edges are the couples $(\mathbf{a}_{n-1}, \mathbf{a}_n)$, with $n \ge 2$;
– $T_q$ is the set of $q$-adic cells $\{\Delta^n_k,\ 0 \le k < q^n\}$, with $\Delta^n_k = [k/q^n, (k+1)/q^n[$. The lexicographic order makes it possible to associate each finite sequence $\mathbf{a}_n$ with exactly one of the $q^n$ cells $\Delta^n_k$. The (lexicographic) order number of this cell will be denoted by $k_n(\mathbf{a}_n)$.
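The three points of view listed above (base-$q$ digits, branch of the tree, $q$-adic cell) can be related by a short computation. The sketch below uses exact rational arithmetic; the function names `digits` and `cell_index` are illustrative, and `cell_index` plays the role of the order number $k_n(\mathbf{a}_n)$.

```python
from fractions import Fraction

def digits(x, q, n):
    """First n base-q digits a_1..a_n of a rational x in [0, 1)."""
    x = Fraction(x)
    out = []
    for _ in range(n):
        x *= q
        d = int(x)        # integer part (x is non-negative)
        out.append(d)
        x -= d
    return out

def cell_index(a, q):
    """Index k of the q-adic cell [k/q^n, (k+1)/q^n[ coded by the digits a."""
    k = 0
    for d in a:
        k = k * q + d
    return k

a = digits(Fraction(5, 8), 2, 2)   # 5/8 lies in [0.5, 0.75[
k = cell_index(a, 2)               # digits (1, 0) give the cell k = 2 at level n = 2
```

Reading the digits from left to right amounts to walking down the tree from the root: each digit selects one of the $q$ sons, and the cell reached after $n$ steps contains $x$.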

Figure 5.1. Coding from cell [0.5, 0.75]

Figure 5.2. Coding from cell [0.0, 0.5]

5.1.2.2. Coding of R

This section may seem complex, but is actually not difficult: it amounts to extending the previous coding to $\mathbb{R}$. Let $\mathbf{a}$ be an infinite branch of the tree $T_q$. For $n \ge 0$, let us dilate $\mathbb{R}$ by $q^n$, then perform a translation by $k_n(\mathbf{a}_n)$ so as to bring the cell $\Delta^n_{k_n(\mathbf{a}_n)}$, multiplied by $q^n$, into coincidence with $[0, 1[$. The tree $T_q(\mathbf{a}_n)$, of root $\mathbf{a}_n$, allows us to code all the $q$-adic cells included in $[-2^n, 2^n]$, with their position in $\mathbb{R}$:

$$\big\{\Delta^k_m,\ -m \le n < +\infty,\ -k_n(\mathbf{a}_n) \le k < q^{m+n} - k_n(\mathbf{a}_n)\big\} \tag{5.2}$$

As $n \to +\infty$, the trees $T_q(\mathbf{a}_n)$ extend one another and tend towards the complete $q$-adic tree $\tilde{T}_q$, which is provided with a distinguished bilateral path $\tilde{\mathbf{a}}$ and a root $\tilde{\emptyset}$; the result is denoted by $(\tilde{T}_q, \tilde{\mathbf{a}}, \tilde{\emptyset})$. This triplet allows us to code all $q$-adic cells of $\mathbb{R}$. Another possibility is to base the analysis on the decomposition of arbitrary real numbers in base $q$, as previously done for the interval $[0, 1[$. This coding can be extended to $\mathbb{R}^d$. The reader is referred to [BEN 00] for more details.

5.1.2.3. Renormalizing Cantor set

Let $E$ be a subset of $[0, 1]$. With the set $E$, we can associate a sub-tree $T_{E,q}$ of $T_q$ in the following way: for any $x \in E$, we keep the branch $\mathbf{a}(x)$.

Let us now assume that $E$ is the triadic Cantor set. The natural choice is $q = 3$. Up to countable sets, $E$ is the set of real numbers of $[0, 1]$ that admit no 1 in their decomposition in base 3. Thus, the set $E$ is simply defined by a condition on the branches of $T_{E,3}$. As previously, by means of dilation and translation, we define, from the Cantor set $E$, a set $\tilde{E}$ on $\mathbb{R}$: in other words, $\tilde{E}$ is the set of real numbers that admit no 1 in their decomposition in base 3. This set $\tilde{E}$ is, of course, invariant under dilation by a factor $3^p$, with $p \in \mathbb{Z}$. However, it is not invariant under other factors, as can be easily verified, for example with 2.
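The invariance under dilation by powers of 3, and its failure for the factor 2, can be verified on an explicit point. The sketch below tests whether the base-3 expansion of a non-negative rational contains the digit 1, using the fact that such expansions are eventually periodic. It is an illustration under one stated simplification: only the greedy expansion is examined, so the countably many points with two expansions (such as 1/3 = 0.1 = 0.0222...) are not treated correctly. The function name is hypothetical.

```python
from fractions import Fraction

def has_no_digit_one(x):
    """True if the greedy base-3 expansion of the rational x >= 0 contains no digit 1."""
    whole, frac = divmod(Fraction(x), 1)
    whole = int(whole)
    while whole:                         # digits of the integer part
        whole, d = divmod(whole, 3)
        if d == 1:
            return False
    seen = set()
    while frac and frac not in seen:     # fractional digits until the period closes
        seen.add(frac)
        frac *= 3
        d = int(frac)
        if d == 1:
            return False
        frac -= d
    return True

x = Fraction(1, 4)                       # 1/4 = 0.020202... in base 3
in_set = has_no_digit_one(x)             # True
after_3 = has_no_digit_one(3 * x)        # True: 3/4 = 0.202020...
after_2 = has_no_digit_one(2 * x)        # False: 1/2 = 0.111111...
```

Multiplying by 3 merely shifts the base-3 digits, which is why the digit condition is preserved, whereas multiplying by 2 rewrites them.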


5.1.2.4. Random renormalized Cantor set

We put a uniform law on the set of binary sub-trees of the ternary tree $T_3$. Following the intuition behind the construction of the traditional Cantor set, let us define the random compact set $K(T)$ by:

$$K(T) \stackrel{\text{def}}{=} \bigcap_{n \ge 0}\ \bigcup_{b \in T_n} \Delta(b) \tag{5.3}$$

where $\Delta(b)$ is the unique triadic cell associated with the branch $b \in T_n$. Let $K(T, \mathbf{a})$ denote the set $K(T)$ renormalized along the branch $\mathbf{a}$, as we did previously. We verify that the law of $K(T, \mathbf{a})$ is equal to the law of $3^p K(T, \mathbf{a})$, with $p \in \mathbb{Z}$. We will say that $K(T, \mathbf{a})$ is semi-self-similar, the prefix semi indicating that the renormalization factors under which the law of $K(T, \mathbf{a})$ remains invariant form a strict subset of $\mathbb{R}$, namely the multiplicative sub-group of the powers of 3. The law of $K(T, \mathbf{a})$ is also stable under translation by integers. Combining translations by integers and multiplications by powers of 3, we find that, for any triadic number $d$ (a number with a finite expansion in base 3), the law of $K(T, \mathbf{a})$ is equal to the law of $K(T, \mathbf{a}) + d$: there is an invariance under translation. This will be referred to as a quasi-stationarity property, the term stationarity being used only when invariance holds for all translation parameters.

5.1.3. Distributions of scale invariant masses

Inspired by the preceding construction, we now propose that of a stationary scale invariant phenomenon. More precisely, we aim at building a random measure $M(dx)$ satisfying the following properties of stationarity and semi-self-similarity associated with a multiplicative sub-group $G$ of $\mathbb{R}^+$:

$$M(dx - y) \stackrel{\mathcal{L}}{=} M(dx), \quad \forall y \in \mathbb{R}^d \qquad \text{(stationarity)} \tag{5.4a}$$

$$M(\lambda\, dx) \stackrel{\mathcal{L}}{=} \lambda^H M(dx), \quad \forall \lambda \in G \qquad \text{($H$-semi-self-similarity)} \tag{5.4b}$$

In sections 5.1.3.1 and 5.1.3.2 two brief examples are presented.

5.1.3.1. Distribution of masses associated with Poisson measures

Let $(P_n)$ denote an infinite sequence of independent Poisson measures on $\mathbb{R}^d$, with the Lebesgue measure as intensity and identical parameter. Let $(x_i^n, i \in I_n)$ be a realization of $P_n$ indexed by the set $I_n$. Let us denote by $B$ the ball of $\mathbb{R}^d$ with center 0 and radius 1. Let us define the measure $M_0$ by its density $m(x)$:

$$m(x) \stackrel{\text{def}}{=} \sum_{n \ge 0} 2^{-nH} \sum_{i \in I_n} \mathbf{1}_B(2^n x - x_i^n). \tag{5.5}$$


Hence, to the point $x_i^n / 2^n$ we have allotted a mass proportional to $2^{-n(H+d)}$, the proportionality coefficient being equal to the volume of $B$. The contribution of these masses at scale $n$ is proportional to $2^{-nH}$ per unit volume.

5.1.3.2. Complete coding

If we define a Cantor set on $[0, 1[$ only, the resulting set is not stable under dilation by a factor 3. We must define this set on $\mathbb{R}$ to make it stable under dilation by any power of 3. In the same way, the measure $M_0$ defined above cannot be stable under dilation by a factor 2. However, the approach used for Cantor sets can be adopted. We outline it only briefly. For any $n \ge 0$, we define the measure $M_n$ by $M_n(dx) = 2^{-nH}(M_0(x + 2^{-n}) - M_0(x))(dx)$. As distributions, the sequence $M_n$ converges weakly towards a limit $M$. We then verify that $M$ is semi-self-similar for the multiplicative sub-group of powers of 2. It is also stationary. Let us note that our construction seems to ascribe a specific role to the number 2. This is not the case and we can replace 2 by any $b > 0$ in (5.5). We then obtain a semi-self-similar measure for the multiplicative sub-group of powers of $b$.

5.1.4. Weierstrass functions

With Weierstrass functions, we have a deterministic mass distribution model with properties analogous to equation (5.4). If $b > 1$ and $0 < H < 1$, for $x \in \mathbb{R}$, the Weierstrass functions $W_{b,H}$ are defined as (see [WEI 72]):

$$W_{b,H}(x) \stackrel{\text{def}}{=} \sum_{n \in \mathbb{Z}} b^{-nH} \sin(b^n x). \tag{5.6}$$

We can easily verify the semi-self-similarity property:

$$W_{b,H}(bx) = b^H W_{b,H}(x)$$

We should note that the preceding constructions, intended to extend Cantor sets and renormalized sums of Poisson measures to $\mathbb{R}$, have their counterpart for Weierstrass functions, obtained by writing:

$$W^0_{b,H}(x) = \sum_{n \ge 0} b^{-nH} \sin(b^n x)$$

and noticing that $\lim_{p \to +\infty} b^{pH} W^0_{b,H}(b^{-p} x) = W_{b,H}(x)$.
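The semi-self-similarity of $W_{b,H}$ is a one-line index shift, $n \mapsto n + 1$, in the defining series, and it can also be observed numerically on a truncated sum. The sketch below truncates the series to $|n| \le N$; the truncation level and the parameter values are illustrative choices, and the identity then holds only up to the two boundary terms, which are of order $b^{-NH}$ and $b^{-N(1-H)}$.

```python
import math

def weierstrass(x, b=2.0, H=0.5, N=60):
    """Truncation to |n| <= N of W_{b,H}(x) = sum over n in Z of b^(-nH) sin(b^n x)."""
    return sum(b ** (-n * H) * math.sin(b ** n * x) for n in range(-N, N + 1))

x = 0.37
lhs = weierstrass(2.0 * x)             # W(bx) with b = 2
rhs = 2.0 ** 0.5 * weierstrass(x)      # b^H W(x)
```

The same experiment with a dilation factor that is not a power of $b$ shows no such agreement, which is the deterministic analog of the semi-self-similarity discussed for Cantor sets.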

The question that naturally arises is: are there probabilistic models which are self-similar and stationary? The traditional answer is positive, provided the stationarity condition is replaced by the condition that the increments be stationary. Therefore, we proceed with the introduction of self-similar stochastic processes whose increments are stationary.


5.1.5. Renormalization of sums of random variables

In this section, we present the results of Lamperti’s article [LAM 62] on self-similar processes obtained as renormalization limits of other stochastic processes. Let us first recall Lamperti’s definition of a “semi-stable”¹ stochastic process.

DEFINITION 5.1.– A stochastic process $X(x)$, with $x \in \mathbb{R}$, is called semi-stable if, for any $a > 0$, there is a renormalization function $b(a) > 0$ such that:

$$\big(X(ax),\ x \in \mathbb{R}\big) \stackrel{\mathcal{L}}{=} \big(b(a) X(x),\ x \in \mathbb{R}\big)$$

When the function $b(a)$ is of the form $a^H$, the process $X$ is called self-similar.

Lamperti’s fundamental result shows that the possible choices for the renormalization function $b(a)$ are actually limited.

THEOREM 5.1 ([LAM 62, Theorem 1, p. 63]).– Any semi-stable stochastic process is self-similar.

Note already that this result is not in contradiction with the existence of locally self-similar² processes (see [BEN 98]).

Let $X$ and $Y$ be two real stochastic processes indexed by $\mathbb{R}^d$. We say that hypothesis R holds if there exists a function $f : \mathbb{R}^+ \to \mathbb{R}^+$ such that:

$$\lim_{a \to +\infty} \left(\frac{X(ax)}{f(a)},\ x \in \mathbb{R}^d\right) \stackrel{\mathcal{L}}{=} \big(Y(x),\ x \in \mathbb{R}^d\big)$$

Moreover, let us recall that a function $L$ is a slowly varying function if, for any $y > 0$, we have $\lim_{x \to +\infty} L(xy)/L(x) = 1$.

THEOREM 5.2 ([LAM 62, Theorem 2, p. 64]).– Let $X$ and $Y$ be two stochastic processes such that there is a function $f$ for which hypothesis R is satisfied. Then, $f$ necessarily has the following structure, with $H > 0$:

$$f(a) = a^H L(a)$$

where $L$ is a slowly varying function.

1. Not to be confused with the definition of a stable process.
2. These processes are presented in Chapter 6.


As illustrations of this theorem, we offer the two traditional examples:

– Brownian motion:

$$X(x) = \sum_{k=1}^{[x]} \xi_k, \qquad f(n) = \sqrt{n}$$

where the $\xi_k$ are independent Bernoulli random variables on $\{-1, 1\}$. Brownian motion can be defined as the limit of $\frac{X(nx)}{f(n)}$ when $n \to +\infty$;

– Lévy’s symmetric α-stable motion:

$$X(x) = \sum_{k=1}^{[x]} \xi_k, \qquad f(n) = n^{1/\alpha}$$

where the $\xi_k$ are independent, identically distributed, symmetric stable random variables (see Chapter 14 in [BRE 68]). Lévy’s symmetric α-stable motion can be defined as the limit of $\frac{X(nx)}{f(n)}$ when $n \to +\infty$.

Thus, self-similar stochastic processes appear as natural limits of renormalization procedures. Theorem 5.8 provides another example, which is neither Gaussian nor stable.

5.1.6. A common structure for a stochastic (semi-)self-similar process

We now wish to propose a unified structure for the known (semi-)self-similar processes. The basic ingredients are as follows:

– a self-similarity parameter $H > 0$;
– a “vaguelette” type basis (see [MEY 90]);
– a sequence of independent and identically distributed random variables.

Let $E_d = \{0, 1\}^d$ be the set of binary sequences of length $d$ and $E'_d = E_d \setminus \{(1, 1, \dots, 1)\}$. Let $\Lambda_d$ denote the set $\{(n, k),\ n \in \mathbb{Z},\ k \in \mathbb{Z}^d\}$. Let us write $\psi_\lambda(x) = 2^{nd/2}\, \psi^{u}(2^{n} x - k)$, with $x \in \mathbb{R}^d$ and $\lambda = (n, k, u)$, where $n$ is the scale parameter, $k$ the localization parameter and $u$ the orientation parameter: if we have $u = 0$, $\psi^{u}$ is the mother wavelet; if we have $u = 1$, $\psi^{u}$ is the father wavelet. This dilation and translation structure in fact relies on an underlying binary tree structure. The function $\psi$ is assumed to decrease rapidly at infinity, and to vanish and be Lipschitz at zero. Let $\xi_\lambda$, with $\lambda = (n, k, u)$, be a sequence of random variables such that, for any $n, n', k, k'$:

$$\xi_{n',k',u} \stackrel{\mathcal{L}}{=} \xi_{n,k,u}$$


Let us then define the process $X$ by:

$$X(x) \stackrel{\mathrm{def}}{=} \sum_{\lambda} 2^{-nH}\, \psi_\lambda(x)\, \xi_\lambda \qquad (5.7)$$

The process is semi-self-similar for the multiplicative group of powers of 2. Again, the number 2 does not play a crucial role. Later on, we will see that fractional Brownian motions have this structure. We observe that Weierstrass functions fit into this framework by taking $\psi(x) = \sin(x)\, 1_{[0,2\pi]}(x)$. In fact:

$$W_{b,H}(x) = \sum_{n \in \mathbb{Z},\, k \in \mathbb{Z}} b^{-nH}\, \psi(b^{n} x - 2k\pi)$$

The question of parameter identifiability for a semi-self-similar process is natural. We will be dealing with it in the next section.

5.1.7. Identifying Weierstrass functions

5.1.7.1. Pseudo-correlation

Subject to existence, let us define the pseudo-correlation function of a deterministic function $f$ by (see [BAS 62]):

$$\gamma_{f(\cdot)}(\tau) \stackrel{\mathrm{def}}{=} \lim_{T \to +\infty} \frac{1}{2T} \int_{-T}^{T} f(x)\, f(x + \tau)\, dx$$

A function is said to be pseudo-random if $\lim_{\tau \to +\infty} \gamma_{f(\cdot)}(\tau) = 0$, and pseudo-stationary when $\gamma_{f(\cdot - r)} = \gamma_{f(\cdot)}$ for any $r$. It can be shown that Weierstrass functions are pseudo-random and pseudo-stationary.

The example of Weierstrass functions shows that a semi-self-similar phenomenon is not solely determined by the self-similarity parameter $H$. Is it possible to identify the parameters that generate these phenomena? We will see later that, generally, it is possible. To conclude this introduction, we focus on Weierstrass functions, whose identification requires general tools, although the proofs are elementary. To this end, let us introduce the quadratic variations of a function $f$:

$$V_N(f) = \frac{1}{N} \sum_{k=0}^{N-1} \left( f\!\left(\frac{k+1}{N}\right) - f\!\left(\frac{k}{N}\right) \right)^{2}$$

Let us define:

$$R_N = \frac{1}{2} \log_2 \frac{V_{N/2}}{V_N}$$

By using the pseudo-random character of Weierstrass functions, we can show that $R_N$ converges to $H$ when $N \to +\infty$.
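The estimator $R_N$ can be tried out numerically. The Python sketch below assumes the reading $R_N = \frac{1}{2}\log_2(V_{N/2}/V_N)$ of the definition above and the two-sided series $W_{b,H}(x) = \sum_{n \in \mathbb{Z}} b^{-nH}\sin(b^{n}x)$, truncated to finitely many terms. The function names, the truncation bounds `n_min` and `n_max`, and the choice $b = 2$ are illustrative assumptions.

```python
import math

def weierstrass(x, b=2.0, H=0.5, n_min=-20, n_max=40):
    """Truncated sum over n = n_min..n_max of b^(-nH) * sin(b^n * x)."""
    return sum(b ** (-n * H) * math.sin(b ** n * x)
               for n in range(n_min, n_max + 1))

def quadratic_variation(f, N):
    """V_N(f) = (1/N) * sum over k of (f((k+1)/N) - f(k/N))^2 on [0, 1]."""
    values = [f(k / N) for k in range(N + 1)]
    return sum((values[k + 1] - values[k]) ** 2 for k in range(N)) / N

def estimate_H(f, N):
    """R_N = (1/2) * log2(V_{N/2} / V_N); N is assumed to be even."""
    return 0.5 * math.log2(quadratic_variation(f, N // 2) / quadratic_variation(f, N))

if __name__ == "__main__":
    for H in (0.3, 0.5, 0.7):
        print(H, estimate_H(lambda x: weierstrass(x, 2.0, H), 2048))
```

The truncation must resolve scales finer than $1/N$ (here $b^{n\_max}$ is much larger than $N$); otherwise the sampled function is smooth at the sampling scale and the estimate drifts towards 1.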


5.2. The Gaussian case

As always, the Gaussian case is the best understood. The structure of Gaussian self-similar stochastic processes that also possess stationary increments is well known. In his article [DOB 79], Dobrushin presents contemporary results, including his own works, in a definitive style and in the framework of generalized stochastic processes. Here, we give a “stochastic processes” version.

5.2.1. Self-similar Gaussian processes with r-stationary increments

To describe Dobrushin’s results, we need some notations.

5.2.1.1. Notations

Let $\mathbb{R}^d$, $d \ge 1$, be the usual Euclidean space, with $x = (x_1, \dots, x_d)$ and $|x|^2 = \sum_{1}^{d} x_k^2$. Let $xy = \sum_{1}^{d} x_k y_k$ denote the scalar product of vectors $x$ and $y$.

Let $k = (k_1, \dots, k_d) \in \mathbb{N}^d$ be a multi-index of length $|k| = k_1 + \dots + k_d$. We define $D^{k} = \left(\frac{\partial}{\partial x_1}\right)^{k_1} \circ \dots \circ \left(\frac{\partial}{\partial x_d}\right)^{k_d}$.

Let $f$ be a function from $\mathbb{R}^d$ to $\mathbb{R}$. Let $TF(f)$ denote its Fourier transform and $TF^{-1}(f)$ its inverse Fourier transform. When $f$ is a distribution, the same notations are used for the (inverse) Fourier transform. Let us recall that, for $k \in \mathbb{N}^d$:

$$TF(D^{k} f)(\xi) = i^{|k|}\, \xi^{k}\, TF(f)(\xi)$$

Let us denote by $S(\mathbb{R}^d)$ (or $S$ where there is no ambiguity) the Schwartz space of $C^{\infty}$ functions which decrease rapidly, together with all their derivatives; $S'(\mathbb{R}^d)$ (or $S'$) then denotes the space of tempered distributions. Let $T \in S'$. The translation operator $\tau_k$ is defined by $\langle \tau_k T, \varphi \rangle \stackrel{\mathrm{def}}{=} \langle T, \tau_{-k}\varphi \rangle$ for $\varphi \in S$, with $\tau_{-k}\varphi(x) = \varphi(x + k)$.

Let $f$ be a function from $\mathbb{R}^d$ to $\mathbb{R}$ and $n$ an integer. Let $f^{\otimes n}$ denote the function on $(\mathbb{R}^d)^n$ defined by $f^{\otimes n}(x_1, \dots, x_n) \stackrel{\mathrm{def}}{=} (f(x_1), \dots, f(x_n))$. Finally, for $k \in \mathbb{R}^d$, let us introduce the translation operator $\tau_k f^{\otimes n} = (\tau_k f(x_1), \dots, \tau_k f(x_n))$.

5.2.1.2. Definitions

In this section, the same notations are used for distributions and functions.

DEFINITION 5.2.– Let $(X(x))$, with $x \in \mathbb{R}^d$, be a (possibly) generalized stochastic process:

– $X$ is said to be stationary if, for any integer $n$ and any $h \in \mathbb{R}^d$:

$$\tau_h X^{\otimes n} \stackrel{\mathcal{L}}{=} X^{\otimes n}$$


– let $r$ be a non-zero integer. $X$ is said to possess stationary $r$-increments if $D^{k}X$ is stationary for any $k$ such that $|k| = r$;

– let $r$ be a non-zero integer and $H > 0$. $X$ is said to be $(r, H)$ self-similar if there is a polynomial $P$ of degree $r$ such that $P(D)X$ is self-similar with parameter $H$.

NOTE 5.1.– $X$ is said to have stationary increments if it has stationary 1-increments. $X$ is said to be self-similar with parameter $H$ if it is $(1, H)$ self-similar.

5.2.1.3. Characterization

THEOREM 5.3 ([DOB 79, Theorem 3.2, p. 9]).– Let $X$ be a Gaussian $(r, H)$ self-similar process, with $H < r$, $r$-stationary increments and a polynomial $P$. Then, there is a function $S$ from the unit sphere $\Sigma_d$ of $\mathbb{R}^d$ to $\mathbb{R}$ such that, for all functions $\varphi$ of $S$:

$$\langle X, P(D)\varphi \rangle = \int_{\mathbb{R}^d} \frac{P(i\xi)\, TF(\varphi)(\xi)}{|\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W)(d\xi)$$

Let us now introduce the pseudo-differential^3 operator $L$, with symbol $\rho(\xi) = |\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)$:

$$Lf(x) = \int_{\mathbb{R}^d} TF(f)(\xi)\, \rho(\xi)\, e^{ix\xi}\, d\xi \qquad (5.8)$$

Then, the following weak stochastic differential equation can be deduced (see [BEN 97]):

$$LX = W^{\circ}$$

where $W^{\circ}$ is a Gaussian white noise. Now, let us give some examples.

5.2.2. Elliptic processes

Let $L$ be the pseudo-differential operator defined in (5.8). $L$ is called elliptic if two constants $0 < a \le A < +\infty$ exist such that $a \le S \le A$ on the sphere $\Sigma_d$. By analogy, the corresponding process $X$ will be called elliptic [BEN 97]. Generally, $(r, H)$ self-similar Gaussian processes with $r$-stationary increments, with $0 < H < 1$, admit the following representation.

3. See [MEY 90] for traditional results on operators.


THEOREM 5.4 ([BEN 97]).– Let $X$ be a Gaussian $(r, H)$ self-similar process, with $r$-stationary increments, with $0 < H < 1$:

– $X$ admits the following harmonic representation:

$$X(x) = \int_{\mathbb{R}^d} \frac{e^{ix\xi} - \sum_{k=0}^{r-1} \frac{(ix\xi)^{k}}{k!}}{|\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W)(d\xi)$$

– $X$ is the unique solution of the following stochastic elliptic differential equation:

$$LX = W^{\circ}, \qquad Q(D)X(0) = 0 \ \text{ for any } Q \text{ such that } d^{\circ}Q < r$$

As a particular case, we can mention the harmonic representation [MAN 68] of fractional Brownian motion of parameter $H$, which corresponds to $r = 1$ and $S \equiv 1$:

$$B_H(t) = \int_{\mathbb{R}} \frac{e^{it\xi} - 1}{|\xi|^{\frac{1}{2}+H}}\, TF(W)(d\xi)$$

5.2.3. Hyperbolic processes

DEFINITION 5.3.– The operator $L$ defined in (5.8) is called hyperbolic if its symbol is of the form $\rho(\xi) = \prod_{i=1}^{d} |\xi_i|^{H_i + \frac{1}{2}}$.

THEOREM 5.5 (FRACTIONAL BROWNIAN SHEET [LEG 99]).– The fractional Brownian sheet, defined as:

$$X(x) = \int_{\mathbb{R}^d} \prod_{i=1}^{d} \frac{e^{ix_i\xi_i} - 1}{|\xi_i|^{H_i + \frac{1}{2}}}\, TF(W)(d\xi)$$

satisfies the following equality:

$$\{X(\lambda_1 x_1, \dots, \lambda_d x_d)\}_{x \in \mathbb{R}^d} \stackrel{\mathcal{L}}{=} \left\{ \prod_{i=1}^{d} \lambda_i^{H_i}\, X(x_1, \dots, x_d) \right\}_{x \in \mathbb{R}^d}$$

COROLLARY 5.1.– When $\lambda_1 = \dots = \lambda_d$ and $H = H_1 + \dots + H_d$, the hyperbolic process $X$ obtained is self-similar with parameter $H$, with $H$ between 0 and $d$. In contrast with the elliptic case, $H > 1$ is hence allowed, though the fractional Brownian sheet is non-differentiable.
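The scaling relation of Theorem 5.5 can be checked at the level of covariances. The Python sketch below assumes that, up to a multiplicative constant, the covariance of the fractional Brownian sheet is the product over coordinates of fractional Brownian motion covariances $\frac{1}{2}(|s|^{2H_i} + |t|^{2H_i} - |s - t|^{2H_i})$, which is what the product form of the harmonic representation suggests. The function names and the numerical values are illustrative assumptions.

```python
import math

def fbm_cov(s, t, H):
    """Covariance of fractional Brownian motion of parameter H (up to a constant)."""
    return 0.5 * (abs(s) ** (2 * H) + abs(t) ** (2 * H) - abs(s - t) ** (2 * H))

def sheet_cov(x, y, hursts):
    """Product-form covariance assumed here for the fractional Brownian sheet."""
    return math.prod(fbm_cov(xi, yi, Hi) for xi, yi, Hi in zip(x, y, hursts))

def rescale(x, lambdas):
    """Coordinate-wise dilation (lambda_1 * x_1, ..., lambda_d * x_d)."""
    return tuple(l * xi for l, xi in zip(lambdas, x))

if __name__ == "__main__":
    hursts = (0.8, 0.7)
    x, y = (1.0, 2.0), (0.5, 3.0)
    lambdas = (3.0, 3.0)
    lhs = sheet_cov(rescale(x, lambdas), rescale(y, lambdas), hursts)
    rhs = 3.0 ** (2 * sum(hursts)) * sheet_cov(x, y, hursts)
    print(lhs, rhs)  # equal up to rounding: self-similarity with H = H_1 + H_2
```

With equal dilation factors the covariance is multiplied by $\lambda^{2(H_1 + \dots + H_d)}$, which is the second-order counterpart of Corollary 5.1; here $H_1 + H_2 = 1.5 > 1$.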


5.2.4. Parabolic processes

Let $A$ be a pseudo-differential operator of dimension $n - 1$, i.e., its symbol is a function from $\mathbb{R}^{n-1}$ to $\mathbb{R}$. Let $L$ be the pseudo-differential operator of dimension $n$, whose symbol is a function from $\mathbb{R} \times \mathbb{R}^{n-1}$ to $\mathbb{R}$, defined by $L = \partial_t - A$. Let us consider the stochastic differential equation $LX = W^{\circ}$. By analogy with the classification of operators, $X$ is said to be parabolic. The most prominent example is the Ornstein-Uhlenbeck process:

$$OU(t, x) = \int_{0}^{t} \int_{\mathbb{R}^d} e^{-(t-s)A}\, W(ds, dxy)$$

The operator $\partial_t$ is renormalized with a factor $\frac{1}{2}$; the operator $A$ can be renormalized with an arbitrary factor. Generally, a parabolic process is not self-similar.

5.2.5. Wavelet decomposition

In this section, we expand Gaussian self-similar processes on a wavelet basis, which hence also constitutes a basis for the reproducing kernel Hilbert space of the process^4.

5.2.5.1. Gaussian elliptic processes

Let $\psi^{u}$, with $u \in E_d$, be a Lemarié-Meyer generating system [MEY 90]. Let $\psi_\lambda$, with $\lambda \in \Lambda_d$, be the generated orthonormal basis of wavelets. Let us assume that $X$ satisfies the hypotheses of Theorem 5.4, with the same notations. Let us define $\varphi^{u}$, with $u \in E_d$, by the harmonic representation:

$$\varphi^{u}(x) = \int_{\mathbb{R}^d} \frac{e^{ix\xi} - \sum_{k=0}^{r-1} \frac{(ix\xi)^{k}}{k!}}{|\xi|^{\frac{d}{2}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(\psi^{u})(\xi)\, d\xi$$

Let us then define the associated family of wavelets $\varphi_\lambda$, with $\lambda \in \Lambda_d$.

THEOREM 5.6 ([BEN 97]).– There is a sequence of normalized Gaussian random variables $\eta_\lambda$ such that:

$$X(x) = \sum_{\lambda \in \Lambda_d} 2^{-j(r-1+H)}\, \eta_\lambda\, \varphi_\lambda(x)$$

If $X$ is self-similar in the usual sense, then this decomposition is a renormalized distribution of mass, as defined in section 5.1.3.

4. See [NEV 68] for reproducing kernel Hilbert spaces.


5.2.5.2. Gaussian hyperbolic process

THEOREM 5.7.– With the same notations as those of Theorem 5.6, we obtain the decomposition:

$$X(x) = \sum_{(\lambda_1, \dots, \lambda_d) \in (\Lambda_1)^d} 2^{-(n_1 H_1 + \dots + n_d H_d)}\, \varphi_{\lambda_1}(x_1) \times \dots \times \varphi_{\lambda_d}(x_d)\, \eta_{\lambda_1, \dots, \lambda_d}$$

where the sequence $\eta_{\lambda_1, \dots, \lambda_d}$ consists of normalized Gaussian random variables. Hyperbolic processes enable us to model multiscale random structures with preferred directions.

5.2.6. Renormalization of sums of correlated random variables

Let $B_H$ denote fractional Brownian motion of parameter $H$. Let us consider the increments of size $h > 0$: $X_k = h^{-H}\left(B_H((k+1)h) - B_H(kh)\right)$. The following properties are well known:

– $X_0$ is a normalized Gaussian random variable;
– the sequence $(X_k)$ is stationary;
– a law of large numbers can be written as:

$$\lim_{n \to +\infty} \frac{1}{n+1} \sum_{k=0}^{n} X_k^{2} = 1 \quad \text{(a.s.)} \qquad (5.9)$$
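The law of large numbers (5.9) can be observed on simulated data. The following Python sketch draws the normalized increments $X_k$ exactly, through a Cholesky factorization of their covariance matrix. It relies on the standard covariance of fractional Brownian motion, $E[B_H(s)B_H(t)] = \frac{1}{2}(|s|^{2H} + |t|^{2H} - |t - s|^{2H})$, which is not restated in this section and is therefore an assumption of the sketch. The function names, the sample size and the seed are illustrative, and the cubic cost of the factorization restricts this approach to short samples.

```python
import math
import random

def fgn_autocov(k, H):
    """Covariance E[X_0 X_k] of the normalized increments of fractional Brownian motion."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + abs(k - 1) ** (2 * H))

def cholesky(a):
    """Lower-triangular factor L with L * L^T = a, for a symmetric positive definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def sample_increments(n, H, rng):
    """Draw (X_0, ..., X_{n-1}) as L * z, with z a vector of independent N(0, 1) variables."""
    cov = [[fgn_autocov(i - j, H) for j in range(n)] for i in range(n)]
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(L[i][m] * z[m] for m in range(i + 1)) for i in range(n)]

def mean_square(xs):
    """Empirical mean of the squares, i.e. the left-hand side of (5.9) before the limit."""
    return sum(x * x for x in xs) / len(xs)

if __name__ == "__main__":
    rng = random.Random(0)
    print(mean_square(sample_increments(200, 0.6, rng)))  # expected to be close to 1
```

The size of the fluctuations around 1 is exactly the question of the convergence speed in (5.9).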

Difficulties only start when trying to estimate the convergence speed in (5.9). Before presenting the results on these questions, let us give an outline of the Rosenblatt process and law. The Rosenblatt law is defined by its characteristic function, which can be found in [TAQ 75, p. 299]. We can define a stochastic process $(Z_D(t))_{t>0}$, called the Rosenblatt process, whose law at every time is a Rosenblatt law of parameter $D$. The characteristic functional of $(Z_D(t_1), \dots, Z_D(t_k))$, with $k \ge 1$, can be found in [TAQ 75].

5.2.7. Convergence towards fractional Brownian motion

5.2.7.1. Quadratic variations

Many statistical estimation procedures are based on quadratic variations. It is hence useful to recall the behavior of the quadratic variations of fractional Brownian motion.

THEOREM 5.8.– Let:

$$X_{k,N} = N^{H} \left( B_H\!\left(\frac{k+1}{N}\right) - B_H\!\left(\frac{k}{N}\right) \right)$$


be the renormalized increments of a fractional Brownian motion of order $H$. The following results are proven in [GUY 89, TAQ 75]:

– when $0 < H < \frac{3}{4}$:

$$\sqrt{N}\; \frac{1}{N} \sum_{k=0}^{[Nt]} \left( X_{k,N}^{2} - 1 \right)$$

converges in law, when $N \to +\infty$, towards $\sigma_H B_H(t)$, where $B_H$ is fractional Brownian motion of order $H$;

– for $\frac{3}{4} < H < 1$:

$$N^{2(1-H)}\; \frac{1}{N} \sum_{k=0}^{[Nt]} \left( X_{k,N}^{2} - 1 \right)$$

converges in law, when $N \to +\infty$, towards a Rosenblatt process $Z_{(1-H)}(t)$.

Theorem 5.8 admits a generalization to functions $G$ of $L^{2}(\mathbb{R}, \mu)$, where $\mu$ is the Gaussian density $(2\pi)^{-1/2} \exp\left(-\frac{x^{2}}{2}\right)$. It is known (see, for example, [NEV 68]) that Hermite’s polynomials $H_k$, with $k \ge 0$, form an orthonormal basis for $L^{2}(\mathbb{R}, \mu)$. Let us expand $G$ on this basis: $G(x) = \sum_{k \ge 0} g_k H_k(x)$. If we have $g_0 = g_1 = 0$ and $g_2 \ne 0$, Theorem 5.8 remains valid, up to changes in the variances of the limit processes (see [GUY 89, TAQ 75]). Theorem 5.8 thus provides examples of non-Gaussian self-similar processes with stationary increments. Other examples are discussed later.

5.2.7.2. Acceleration of convergence

Instead of standard increments $X_{k,N}$, let us now consider the second order increments:

$$Y_{k,N} = N^{H} \sigma \left( B_H\!\left(\frac{k+1}{N}\right) - 2 B_H\!\left(\frac{k}{N}\right) + B_H\!\left(\frac{k-1}{N}\right) \right)$$

where $\sigma$ is chosen such that $Y_{k,N}$ is of variance 1. Then, the following result can be obtained (cf. [IST 97]).

THEOREM 5.9.– The quantity:

$$\sqrt{N}\; \frac{1}{N} \sum_{k=0}^{[Nt]} \left( Y_{k,N}^{2} - 1 \right)$$

converges in law, when $N \to +\infty$, towards a fractional Brownian motion $B_H$ of order $H$. The frontier $H = \frac{3}{4}$ disappears, thanks to the introduction of the generalized variations $Y_{k,N}$.
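One heuristic way to read the frontier $H = \frac{3}{4}$, and its disappearance for second order increments, is through the decay of the covariance of the increments, since Gaussian-type limits for quadratic variations are tied to the square summability of this covariance. The Python sketch below estimates the decay exponents numerically, starting from the standard covariance of fractional Brownian motion (an assumption, as it is not restated here). The function names and the lag used are illustrative, and the second order covariance is left unnormalized.

```python
import math

def cov_first(k, H):
    """Covariance at lag k of the unit-step first order increments of fractional Brownian motion."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + abs(k - 1) ** (2 * H))

def cov_second(k, H):
    """Covariance at lag k of B(j+1) - 2B(j) + B(j-1), not normalized to unit variance."""
    coeffs = {-2: 1, -1: -4, 0: 6, 1: -4, 2: 1}
    return -0.5 * sum(c * abs(k + j) ** (2 * H) for j, c in coeffs.items())

def decay_exponent(cov, H, k=100):
    """Local log-log slope of |cov| between lags k and 2k."""
    return math.log(abs(cov(2 * k, H)) / abs(cov(k, H))) / math.log(2)

if __name__ == "__main__":
    for H in (0.6, 0.9):
        print(H, decay_exponent(cov_first, H), decay_exponent(cov_second, H))
```

For first order increments the exponent is close to $2H - 2$, so the squared covariances are summable only when $4H - 4 < -1$, that is $H < \frac{3}{4}$; for second order increments it is close to $2H - 4$, and the squared covariances are summable for every $0 < H < 1$.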


5.2.7.3. Self-similarity and regularity of trajectories

It is generally admitted in the literature, following the works by Mandelbrot, that self-similarity necessarily goes with sample path irregularity, for example, of Hölderian type. From a simple example, we now show that such an association does not hold in general. To construct such a process, let us start with an infinitely differentiable function $\varphi$, with compact support, such that there exists a neighborhood of 0 which does not meet the support of $\varphi$. Let us then define a stochastic process as:

$$X(t) = \int_{\mathbb{R}} \frac{\varphi(t\xi)}{|\xi|^{\frac{1}{2}+H}}\, TF(W)(d\xi) \qquad (5.10)$$

for which the following original result can be obtained.

THEOREM 5.10.– $X$, as defined in (5.10), is a zero-mean Gaussian process that possesses the following properties:

– $X$ is self-similar with parameter $H \in \mathbb{R}$;
– the trajectories of $X$, for $t \ne 0$, are infinitely differentiable.

Let us mention that $X$ does not have stationary increments. It is precisely this loss of increment stationarity that allows the regularity of trajectories.

5.3. Non-Gaussian case

5.3.1. Introduction

In this section, we study processes whose laws are either subordinate to the law of the Brownian measure (cf. Dobrushin [DOB 79]), or symmetric α-stable laws (hereafter SαS). Samorodnitsky and Taqqu’s book [SAM 94] is one of the main references for stable processes and is used extensively here.

Two classes of processes are studied: processes represented by moving averages and those defined by a harmonizable representation. These two classes are not equivalent (cf. Theorem 5.11 below). Censov’s process and a variant, i.e. Takenaka processes, are also analyzed. Let us mention that Takenaka processes do not belong to either of the two aforementioned classes. However, all these processes have a point in common: they are elliptic, i.e., they are the solutions of a stochastic elliptic equation of non-integer degree.

Ellipticity makes it possible to expand such processes on an appropriate wavelet basis, as in [BEN 97]. In the Gaussian case, this decomposition is of Karhunen-Loève type. Hence the question, still unanswered: are the SαS self-similar


processes with stationary increments elliptic? Finally, ellipticity enables us to construct a wide variety of self-similar processes with stationary increments subordinated to the Brownian measure.

5.3.2. Symmetric α-stable processes

5.3.2.1. Stochastic measure

Let $d$ be a non-zero integer and $\mu$ a measure on $\mathbb{R}^d \times E$. A stochastic measure $M_\alpha$ on $\mathbb{R}^d$ is SαS with control measure $\mu$ if, for any $p < \alpha$, there is a constant $c_{p,\alpha}$ such that, for any test function $\varphi$, we obtain (see [SAM 94, section 10.3.3]):

$$E|M_\alpha(\varphi)|^{p} = c_{p,\alpha} \left( \int_{\mathbb{R}^d} |\varphi(x)|^{\alpha}\, \mu(dx) \right)^{\frac{p}{\alpha}} \qquad (5.11)$$

With the previous notations, the measure $M_\alpha$ above possesses the following properties:

– stationarity. For any $x \in \mathbb{R}^d$: $\tau_x(M_\alpha) \stackrel{\mathcal{L}}{=} M_\alpha$;
– unitarity. Let $U_x f(y) = e^{ixy} f(y)$. For any $x \in \mathbb{R}^d$: $U_x(M_\alpha) \stackrel{\mathcal{L}}{=} M_\alpha$;
– homogeneity. For any $\lambda > 0$: $r_\lambda M \stackrel{\mathcal{L}}{=} \lambda^{\frac{d}{\alpha}} M$;
– symmetry: $-M_\alpha \stackrel{\mathcal{L}}{=} M_\alpha$.

NOTE 5.2.– Taken as distributions, let $TF(M_\alpha)$ be the Fourier transform of $M_\alpha$. If $M_\alpha$ is stationary, $TF(M_\alpha)$ is unitary. If we have $\alpha > 1$ and if $M_\alpha$ is $d/\alpha$-homogeneous, then $TF(M_\alpha)$ is $d/\alpha'$-homogeneous, with $\frac{1}{\alpha} + \frac{1}{\alpha'} = 1$.

5.3.2.2. Ellipticity

Let $Q$ and $S$ be two functions from the unit sphere $\Sigma_{d-1}$ of $\mathbb{R}^d$ to $\mathbb{R}^+$. Let $M_\alpha$ be a stochastic measure and $\mu$ be the Lebesgue measure on $\mathbb{R}^d$. Let us consider, when they exist, the following stochastic processes:

$$X(x) = \int_{\mathbb{R}^d} \left( |x - y|^{H - \frac{d}{\alpha}}\, Q\!\left(\frac{x - y}{|x - y|}\right) - |y|^{H - \frac{d}{\alpha}}\, Q(y) \right) M_\alpha(dy)$$

$$Z(x) = \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{\alpha}+H}\, S\!\left(\frac{\xi}{|\xi|}\right)}\, M_\alpha(d\xi) \qquad (5.12)$$

THEOREM 5.11 ([SAM 94], p. 358).– Processes $X$ and $Z$ are well defined whenever $0 < H < 1$ and $0 < \alpha < 2$. They then have stationary increments and are self-similar of order $H$. There exists no pair $(Q, S)$ such that $X$ and the real part of $Z$ have proportional laws.


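NOTE 5.3.– When $\alpha = 2$, the processes $X$ and $Z$ can have proportional laws for conveniently chosen pairs $(Q, S)$, particularly for $Q = S \equiv 1$.

Let us define the operator $\tau_{d,\alpha}$ by:

$$TF\!\left( |x|^{H - \frac{d}{\alpha}}\, Q\!\left(\frac{x}{|x|}\right) \right)(\xi) = \frac{1}{|\xi|^{\frac{d}{\alpha}+H}\, \tau_{d,\alpha}S\!\left(\frac{\xi}{|\xi|}\right)}$$

where the Fourier transform is taken in the sense of distributions. Then, the following Plancherel formula holds.

THEOREM 5.12 ([BEN 99]).– Let $0 < \alpha < 1$. There then exist two constants $c$ and $c'$ such that the processes defined in (5.12) admit the following harmonizable representation:

$$X(x) = c \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{\alpha}+H}\, \tau_{d,\alpha}S\!\left(\frac{\xi}{|\xi|}\right)}\, TF(M_\alpha)(d\xi) \qquad (5.13)$$

$$Z(x) = c' \int_{\mathbb{R}^d} \left[ |x - y|^{H - \frac{d}{\alpha}}\, \tau_{d,\alpha}^{-1}S(x - y) - |y|^{H - \frac{d}{\alpha}}\, \tau_{d,\alpha}^{-1}S(y) \right] TF(M_\alpha)(dy)$$

Let $\Theta$ be a symmetric function from $\Sigma_{d-1}$ to $\mathbb{R}^+$. Let us consider the symbol:

$$\rho_{\beta,\Theta}(\xi) = |\xi|^{\frac{d}{\beta}+H}\, \Theta\!\left(\frac{\xi}{|\xi|}\right)$$

and the corresponding pseudo-differential operator:

$$L_{\beta,\Theta} f(x) = \int_{\mathbb{R}^d} \rho_{\beta,\Theta}(\xi)\, TF(f)(\xi)\, e^{ix\xi}\, d\xi \qquad (5.14)$$

THEOREM 5.13 ([BEN 99]).– Processes $X$ and $Z$ of Theorem 5.11 are the unique solutions of the following systems:

$$L_{\alpha',\Theta} X = M_\alpha \ \text{(a.s.)}, \qquad X(0) = 0$$

with $\Theta = \tau_{d,\alpha}S$ and:

$$L_{\alpha',S} Z = TF(M_\alpha) \ \text{(a.s.)}, \qquad Z(0) = 0$$

NOTE 5.4.– Generally, the sample paths of $X$ and $Z$ are not continuous, hence $X(0)$ and $Z(0)$ are defined in an average sense.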


5.3.3. Censov and Takenaka processes

Let us now consider two examples of self-similar processes with stationary increments, which are neither Gaussian nor stable. Let us consider the affine space $E_d$ of dimension $d$. Let $V_t$, $t \in \mathbb{R}^d$, denote the set of hyperplanes separating the origin 0 from $t$. Let $N_\alpha$ stand for a SαS measure, with control measure $d\sigma(s)\,d\rho$, where $d\sigma(s)$ is the surface measure of the unit sphere $\Sigma_{d-1}$. The Censov process is defined by:

$$C(t) = N_\alpha(V_t)$$

Let us now consider the set $B_t$ of spheres of $E_d$ that separate the points 0 and $t$. Each of these spheres is determined by its center $x$ and radius $r$. Let $M_\alpha^{\beta}$ be a SαS measure, with control measure $\mu(dx, dr) = r^{\beta - d - 1}\, dx\, dr$. The Takenaka process with exponent $\beta$ is defined as:

$$T^{\beta}(t) = M_\alpha^{\beta}(B_t)$$

THEOREM 5.14 ([SAM 94]).– When $\alpha > 1$ and $\beta < 1$, processes $C$ and $T^{\beta}$ are well defined. They are self-similar processes (of order $H = \frac{1}{\alpha}$ for $C(t)$ and $H = \frac{\beta}{\alpha}$ for $T^{\beta}(t)$), with stationary increments.

THEOREM 5.15 ([SAM 94]).– For $\alpha < 2$, when projected in any arbitrary direction, processes $C$ and $T^{\beta}$ have non-proportional laws.

5.3.4. Wavelet decomposition

Let $M$ be a stochastic measure verifying (5.11) and possessing the stationarity, unitarity, homogeneity and symmetry properties. Let $\psi_\lambda$ be a family of wavelets as defined in section 5.2.5. Let $1 < \beta \le 2$. We denote by $\psi_\lambda^{\beta}$ the family $2^{j/\beta}\, \psi^{u}(2^{j} x - k)$. A simple observation leads to the following result.

LEMMA 5.1.– We have:

$$M(dx) = \sum_{\Lambda_d} \psi_\lambda^{\alpha}\, M(\psi_\lambda^{\alpha'}) \stackrel{\mathrm{def}}{=} \sum_{\Lambda_d} \psi_\lambda^{\alpha}\, \xi_\lambda^{\alpha}$$

Moreover, let us define:

$$\varphi_\lambda^{\alpha}(x) = \int \frac{e^{ixy} - 1}{\rho(y)}\, TF(\psi_\lambda^{\alpha})(y)\, dy$$


where the function $\rho$ is the denominator of (5.12) or (5.13). From this, we deduce the following decomposition:

$$X(x) = \sum_{\lambda \in \Lambda_d} 2^{-jH}\, \varphi_\lambda^{\alpha}(x)\, \xi_\lambda^{\alpha}$$

where the $(\xi_\lambda^{\alpha})$ verify the following stationarity properties. For any $j$:

$$\forall \ell: \ \xi_{j+\ell,k,u}^{\alpha} \stackrel{\mathcal{L}}{=} \xi_{j,k,u}^{\alpha}, \qquad \forall r: \ \xi_{j,k+r,u}^{\alpha} \stackrel{\mathcal{L}}{=} \xi_{j,k,u}^{\alpha} \qquad (5.15)$$

This property can be compared with that given in [CAM 95] (for second order processes) or in [AVE 98] (general case) for the wavelet coefficients of self-similar processes with stationary increments.

5.3.5. Process subordinated to Brownian measure

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $W_d$ the Brownian measure on $L^{2}(\mathbb{R}^d)$. The space $L^{2}(\Omega, \mathcal{F}, P)$ is characterized by its chaos decomposition [NEV 68]. Let us briefly recall this theory. Let $\Sigma_n$ be the symmetric group of order $n$. For any function $f$ of $n$ variables, we define:

$$f^{\circ n}(x_1, \dots, x_n) \stackrel{\mathrm{def}}{=} \frac{1}{n!} \sum_{\sigma \in \Sigma_n} f(x_{\sigma(1)}, \dots, x_{\sigma(n)})$$

Let us also define the symmetric stochastic measure of order $n$, i.e. $W_d^{(n)}(dx_1, \dots, dx_n)$ on $L^{2}((\mathbb{R}^d)^n)$, by:

$$W_d^{(n)}(A_1 \times \dots \times A_n) \stackrel{\mathrm{def}}{=} W_d(A_1) \times \dots \times W_d(A_n)$$

where any two $A_i$ are disjoint. In addition, it is imposed that the expectation of $W_d^{(n)}(f_1 \dots f_n)$ is always zero. As an example, with $n = 2$, we obtain $W_d^{(2)}(fg) = W_d(f)\, W_d(g) - \langle f, g \rangle$. We then set:

$$W_d^{(n)}(f) \stackrel{\mathrm{def}}{=} W_d^{(n)}(f^{\circ n})$$

The following properties are established:

$$E\left[ W_d^{(n)}(f)\, W_d^{(m)}(g) \right] = \delta_{n,m}\, \langle f, g \rangle$$

For any $F \in L^{2}(\Omega, \mathcal{F}, P)$, there exists a sequence $f_n \in L^{2}((\mathbb{R}^d)^n)$, $n \ge 0$, with:

$$F = \sum_{n \ge 0} W_d^{(n)}(f_n)$$

and this decomposition is unique. Moreover, we have $EF = W_d^{(0)}$.


THEOREM 5.16.– Let $0 < H < 1$:

– the process $Y^{n}$ defined by:

$$Y^{n}(x_1, \dots, x_n) \stackrel{\mathrm{def}}{=} \int_{(\mathbb{R}^d)^n} \frac{e^{ix \cdot \xi} - 1}{|\xi|^{\frac{dn}{2}+H}}\, TF(W_d^{(n)})(d\xi)$$

is self-similar (of order $H$), with stationary increments;

– the process $X^{n}$ defined by:

$$X^{n}(x) \stackrel{\mathrm{def}}{=} Y^{n}(x, \dots, x)$$

is self-similar (of order $H$), with stationary increments;

– if $a_n$ is a square-summable sequence, the process $X$ defined by:

$$X(x) \stackrel{\mathrm{def}}{=} \sum_{n \ge 0} a_n X^{n}(x)$$

is self-similar (of order $H$), with stationary increments.

This theorem shows how difficult a general classification of self-similar processes with stationary increments can be. Let us note, moreover, that we considered the elliptic case only. We could, for example, also think of combinations of hyperbolic and elliptic cases. This difficulty is clearly indicated by Dobrushin in the issues raised in the comments of [DOB 79, Theorem 6.2, p. 24].

5.4. Regularity and long-range dependence

5.4.1. Introduction

As opposed to what the title of this section may suggest, we will address this question only by means of filtered white noise and for the Gaussian case. Despite its restricted character, this class of examples allows us to question the connection between the regularity of trajectories on the one hand and the long-range correlation on the other hand (the analysis of decorrelation and mixing properties of processes is a long-standing subject; see, for example, [DOU 94, IBR 78]).

To begin with, let us once again consider fractional Brownian motion $B_H$, with parameter $H$. The sample paths of $B_H$ are Hölderian with exponent $h$ (a.s.), for any


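$h < H$, but are not Hölderian with exponent $H$ (a.s.). In addition, we can verify that, for $\Delta > 0$, $k \in \mathbb{N}$:

$$E\left[ B_H(\Delta) \left( B_H((k+1)\Delta) - B_H(k\Delta) \right) \right] = c\, \frac{|\Delta|^{2H}}{|k|^{2(1-H)}}$$

The decrease, with respect to the lag $k\Delta$, of the correlation of the increments of $B_H$ is slow. It is often wrongly assumed that the Hölderian character and the slow decrease of the correlation of the increments are tied together.

5.4.2. Two examples

5.4.2.1. A signal plus noise model

Let $H$ and $K$ be such that $0 < K < H < 1$. Let $S_1$ and $S_2$ be two functions on the sphere $\Sigma_{d-1}$ with values in $[a, b]$, with $0 < a \le b < +\infty$. Then, let us consider the process $X$ defined by:

$$X(x) \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{2}+H}\, S_1\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W_1)(d\xi) + \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{2}+K}\, S_2\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W_2)(d\xi) \stackrel{\mathrm{def}}{=} Y_H(x) + Z_K(x)$$

where it is assumed that $W_1$ and $W_2$ are two independent Wiener processes. The process $X$ can be viewed as a signal $Y_H$ corrupted by a noise $Z_K$. Indeed, $Z_K$ is more irregular than $Y_H$. It is shown in Chapter 6 that $X$ is locally self-similar, with parameter $K$. The local behavior is, indeed, dominated by $K$:

$$\lim_{\lambda \to 0} \left\{ \frac{X(\lambda x)}{\lambda^{K}} \right\}_{x} \stackrel{\mathcal{L}}{=} \{Z_K(x)\}_{x}$$

However, we can also verify:

$$\lim_{\lambda \to +\infty} \left\{ \frac{X(\lambda x)}{\lambda^{H}} \right\}_{x} \stackrel{\mathcal{L}}{=} \{Y_H(x)\}_{x}$$

The global behavior is hence dominated by $H$.

5.4.2.2. Filtered white noise

Now, let us consider the process:

$$T(x) \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}^d} \frac{e^{ix\xi} - 1}{|\xi|^{\frac{d}{2}+H}\, S_1\!\left(\frac{\xi}{|\xi|}\right) + |\xi|^{\frac{d}{2}+K}\, S_2\!\left(\frac{\xi}{|\xi|}\right)}\, TF(W)(d\xi)$$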


It satisfies the following properties:

$$\lim_{\lambda \to 0} \left\{ \frac{T(\lambda x)}{\lambda^{H}} \right\}_{x} \stackrel{\mathcal{L}}{=} \{Y_H(x)\}_{x}$$

$$\lim_{\lambda \to +\infty} \left\{ \frac{T(\lambda x)}{\lambda^{K}} \right\}_{x} \stackrel{\mathcal{L}}{=} \{Z_K(x)\}_{x}$$

5.4.2.3. Long-range correlation

Previous results show that $X$ is Hölderian with exponent $k < K$ (a.s.) and is not Hölderian with exponent $K$, and also that $T$ is Hölderian with exponent $h < H$ (a.s.) and is not Hölderian with exponent $H$. The long-range correlation of $X$ is dominated by $H$:

$$\lim_{|h| \to \infty} E\left[ \frac{X(kh) \left( X((k+1)h) - X(kh) \right)}{|h|^{2H}} \right] = \frac{c}{|k|^{2(1-H)}}$$

while the long-range correlation of $T$ is dominated by $K$:

$$\lim_{|h| \to \infty} E\left[ \frac{T(kh) \left( T((k+1)h) - T(kh) \right)}{|h|^{2K}} \right] = \frac{c}{|k|^{2(1-K)}}$$

Thus, from these generic examples, we can see that long-range correlation and Hölderian regularity are two distinct concepts.

5.5. Bibliography

[AVE 98] AVERKAMP R., HOUDRÉ C., “Some distributional properties of the continuous wavelet transform of random processes”, IEEE Trans. on Info. Theory, vol. 44, no. 3, p. 1111–1124, 1998.
[BAR 95] BARLOW M., “Diffusion on fractals”, in Ecole d’été de Saint-Flour, Springer, 1995.
[BAS 62] BASS J., “Les fonctions pseudo-aléatoires”, in Mémorial des sciences mathématiques, fascicule 153, Gauthier-Villars, Paris, 1962.
[BEN 97] BENASSI A., JAFFARD S., ROUX D., “Elliptic Gaussian random processes”, Rev. Math. Iber., vol. 13, p. 19–90, 1997.
[BEN 98] BENASSI A., COHEN S., ISTAS J., “Identifying the multifractional function of a Gaussian process”, Stat. Proba. Let., vol. 39, p. 337–345, 1998.
[BEN 99] BENASSI A., ROUX D., “Elliptic self-similar stochastic processes”, in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer, 1999.
[BEN 00] BENASSI A., COHEN S., DEGUY S., ISTAS J., “Self-similarity and intermittency”, in Wavelets and Time-frequency Signal Analysis (Cairo, Egypt), EPH, 2000.


[BRE 68] BREIMAN L., Probability, Addison-Wesley, 1968.
[CAM 95] CAMBANIS S., HOUDRÉ C., “On the continuous wavelet transform of second-order random processes”, IEEE Trans. on Info. Theory, vol. 41, no. 3, p. 628–642, 1995.
[DOB 79] DOBRUSHIN R.L., “Gaussian and their subordinated self-similar random fields”, Ann. Proba., vol. 7, no. 3, p. 1–28, 1979.
[DOU 94] DOUKHAN P., “Mixing: properties and examples”, in Lecture Notes in Statistics 85, Springer-Verlag, 1994.
[DUB 97] DUBRULLE B., GRANER F., SORNETTE D. (Eds.), Scale Invariance and Beyond: Proceedings of the CNRS School (Les Houches, France), EDP Sciences and Springer, 1997.
[DUR 97] DURAND J., Sables, poudres et grains, Eyrolles Sciences, Paris, 1997.
[FRI 97] FRISCH U., Turbulence, Cambridge University Press, 1997.
[GRI 89] GRIMMETT G., Percolation, Springer-Verlag, 1989.
[GUY 89] GUYON X., LEON J., “Convergence en loi des H-variations d’un processus Gaussien stationnaire”, Annales de l’Institut Henri Poincaré, vol. 25, p. 265–282, 1989.
[GUY 94] GUYON E., TROADEC J.P., Du sac de billes au tas de sable, Editions Odile Jacob, 1994.
[HAV 87] HAVLIN S., BEN-AVRAHAM D., “Diffusion in disordered media”, Advances in Physics, vol. 36, no. 6, p. 695–798, 1987.
[HER 90] HERRMANN H., ROUX S., Statistical Models for the Fracture of Disordered Media – Random Materials and Processes, North-Holland, 1990.
[IBR 78] IBRAGIMOV I., ROZANOV Y., “Gaussian random processes”, in Applications of Mathematics 9, Springer-Verlag, 1978.
[IST 97] ISTAS J., LANG G., “Quadratic variations and estimation of the Hölder index of a Gaussian process”, Annales de l’Institut Henri Poincaré, vol. 33, p. 407–436, 1997.
[KOL 40] KOLMOGOROV A., “Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum”, Comptes rendus (Dokl.) de l’Académie des sciences de l’URSS, vol. 26, p. 115–118, 1940.
[LAM 62] LAMPERTI J., “Semi-stable stochastic processes”, Trans. Am. Math. Soc., vol. 104, p. 62–78, 1962.
[LEG 99] LÉGER S., PONTIER M., “Drap brownien fractionnaire”, Note aux Comptes rendus de l’Académie des sciences, S. I, vol. 329, p. 893–898, 1999.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MEY 90] MEYER Y., Ondelettes et opérateurs, Hermann, Paris, 1990.
[NEV 68] NEVEU J., Processus Gaussiens, Montreal University Press, 1968.
[SAM 94] SAMORODNITSKY G., TAQQU M.S., Stable Non-Gaussian Random Processes, Stochastic Models with Infinite Variance, Chapman and Hall, New York and London, 1994.


[TAQ 75] TAQQU M.S., “Weak convergence to fractional Brownian motion and the Rosenblatt process”, Z.W.G., vol. 31, p. 287–302, 1975.
[WEI 72] WEIERSTRASS K., “Ueber continuirliche Functionen eines reellen Arguments, die fuer keinen Werth des letzteren einen bestimmten Differentialquotienten besitzen”, Koenigl. Akad. Wiss., Mathematische Werke II, vol. 31, p. 71–74, 1872.
[WIL 98] WILLINGER W., PAXSON V., TAQQU M.S., “Self-similarity and heavy tails: structural modeling of network traffic”, in ADLER R.J., FELDMAN R.E., TAQQU M.S. (Eds.), A Practical Guide to Heavy Tails: Statistical Techniques and Applications, Springer-Verlag, p. 27–53, 1998.

Chapter 6

Locally Self-similar Fields

6.1. Introduction

Engineers and mathematicians interested in applications have to use many different models to describe reality. The objective of this chapter is to explain the usefulness of locally self-similar fields. First, we will show how the traditional concept of self-similarity often proves too narrow to model certain phenomena. Then, given the diversity of existing locally self-similar models, we will present an overview of the relations that exist among them. Finally, we will familiarize the reader with the techniques used in this field.

In order to understand the genesis of locally self-similar fields, it is necessary to go back to their common ancestor: fractional Brownian motion (FBM). Historically, the popularity of simple random models having self-similarity properties can be traced back to [MAN 68]. In particular (restricting ourselves to Gaussian processes), Mandelbrot and Van Ness show that there exists a single process with stationary increments that is self-similar of order $H$ (for $0 < H < 1$). This property implies that a change of scale on the index amounts to a scaling of the process value:

$$B_H(\varepsilon x) \stackrel{\mathcal{L}}{=} \varepsilon^{H} B_H(x)$$

See Chapter 5 for more precise explanations. We will thereafter denote by $B_H$ the FBM of order $H$. One of the most interesting properties of fractional Brownian motion of order $H$ is the Hölderian regularity of order $H$, denoted by $C^{H}$ (up to

Chapter written by Serge COHEN.


logarithmic factors) of the trajectories. Indeed, FBM is a good candidate for modeling phenomena which, using a statistical processing, are found to have C H trajectories and are supposed, for theoretical reasons, to be self-similar. The importance of the identiﬁcation of H, starting from the samples of the phenomenon necessarily taken in discrete time, is thus crucial. At the same time, it is necessary to remember that FBMs are processes with stationary increments, which simpliﬁes the spectral study of the process but is too restrictive for certain applications. Indeed, in many ﬁelds (when we want to simulate textures in an image), it is expected, a priori, that order H depends on the point at which the process is observed. For example, if, using a random ﬁeld, we want to model the aerial photographing of a forest, we would like to have a model where the parameter of self-similarity around the point x, noted by h(x), depends on the geological nature of the ground in the vicinity of x. However, a spatial modulation of the ﬁeld law is generally incompatible with the property of self-similarity, which is the overall property. Consequently, the problem consists of arriving at a concept of a sufﬁciently ﬂexible, locally self-similar ﬁeld, so that the parameters which deﬁne the ﬁeld law could vary with the position, yet be simple enough to enable the identiﬁcation of these parameters. Unfortunately, the simple approach, which consists of reproducing H by a function h(x) in the formula giving the covariance of a FBM, is not satisfactory: generally, we can show that there does not exist any Gaussian ﬁeld having this generalized covariance. We will thus have to introduce the mathematical tools which will make it possible to build models generalizing FBMs and also to identify the functional parameters of these models. 
These theoretical reminders are given in section 6.2, where each concept is illustrated on the example of fractional Brownian motion. In this context, we discuss traditional techniques for the study of Gaussian fields as well as wavelet analysis tools. Using this theoretical framework, in section 6.3 we formally define the property of local self-similarity and present two examples which form the basis of all the later models. Having established that these models are not sufficiently general for applications, we enter the multifractional world in section 6.4. For each of the preceding models, specific attention is given to the regularity of the trajectories and, in section 6.5, we develop the statistical methods which make it possible to estimate this regularity. This is what we call model identifiability. At this point, it is necessary to clarify that the term “fields” is used for families of random variables indexed by $\mathbb{R}^d$ for $d \geq 1$. In applications, the most interesting cases correspond to $d > 1$ (for example, $d = 2$ for images). However, certain statements, particularly those concerning identification, relate only to processes (i.e., fields with $d = 1$).

Locally Self-similar Fields


6.2. Recap of two representations of fractional Brownian motion

We begin by presenting tools for the study of locally self-similar fields based on the concept of reproducing kernel Hilbert space. From this we derive a Karhunen-Loeve expansion of the FBM. We then find a spectral representation of fractional Brownian motion, called harmonizable. This will be an occasion to recall some concepts of multiresolution analysis.

6.2.1. Reproducing kernel Hilbert space

The study of Gaussian fields is greatly facilitated by an analysis tool traditionally associated with these fields: the reproducing kernel Hilbert space. From a physical point of view, the reproducing kernel Hilbert space can be regarded as a space which describes the energy associated with a Gaussian field, in the sense of a spectral energy. Mathematically, it is a Hilbert space of deterministic functions whose norm characterizes all the properties of the field. See [NEV 68] for a detailed study. Let us now recall its formal definition.

DEFINITION 6.1.– Let $(X_x)_{x\in\mathbb{R}^d}$ be a centered Gaussian field (i.e., $E(X_x) = 0$). We call Gaussian space associated with $X$ the subspace of the space of square integrable random variables (denoted by $L^2(\Omega, \mathcal{A}, P)$) made up of the linear combinations of the variables $X_x$ and their limits, that is:

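$$H_X = \operatorname{adh}\Big\{ Z \ \text{such that}\ \exists n \in \mathbb{N}\ \text{and}\ \exists \lambda_i\ \text{for}\ i = 1\ \text{to}\ n\ \text{and}\ Z = \sum_{i=1}^{n} \lambda_i X_{x_i} \Big\} \tag{6.1}$$

where adh means that we take the closure of the set for the topology defined by $L^2(\Omega, \mathcal{A}, P)$. The space:

$$\mathcal{H}_X = \big\{ h_Z \ \text{such that}\ \exists Z \in H_X,\ h_Z(x) = E(Z X_x^*) \big\}$$

equipped with the Hermitian product:

$$\forall Z_1, Z_2 \in H_X, \qquad \langle h_{Z_1}, h_{Z_2} \rangle = E(Z_1 Z_2^*) \tag{6.2}$$

is the reproducing kernel Hilbert space associated with the field $X$. It is verified, according to (6.2), that the map:

$$h : H_X \longrightarrow \mathcal{H}_X, \qquad Z \longmapsto h_Z \tag{6.3}$$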


is an isometry between the Gaussian space and the reproducing kernel Hilbert space, the Hermitian product on $\mathcal{H}_X$ being defined precisely for this purpose. In particular, this map is bijective: to each function $h \in \mathcal{H}_X$ there corresponds exactly one random variable $Z$ of $H_X$. Moreover, $\mathcal{H}_X$ contains the functions of $y$, $h_{X_x}(y) = R(x, y)$, resulting from the covariance of $X$, $R(x, y) = E(X_x X_y^*)$, and we can describe the reproducing kernel Hilbert space as the closure of the finite linear combinations of the functions $R(x, \cdot)$ for the norm of $\mathcal{H}_X$. Lastly, the name reproducing kernel Hilbert space comes from the property verified by its scalar product:

$$\langle R(x, \cdot), R(y, \cdot) \rangle_{\mathcal{H}_X} = R(x, y) \tag{6.4}$$
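As a small numerical illustration of (6.4), the following Python sketch takes the one-dimensional FBM covariance (given later in Definition 6.2) as the kernel $R$ and computes the inner product of finite combinations of the functions $R(x, \cdot)$ by bilinearity. The function names `fbm_cov`, `rkhs_inner` and `evaluate`, as well as the numerical values, are illustrative assumptions and not notation from the text.

```python
def fbm_cov(x, y, H):
    """Covariance of a one-dimensional FBM of order H (formula (6.6) with d = 1)."""
    return 0.5 * (abs(x) ** (2 * H) + abs(y) ** (2 * H) - abs(x - y) ** (2 * H))


def rkhs_inner(coeffs_f, points_f, coeffs_g, points_g, H):
    """Inner product of f = sum_i a_i R(x_i, .) and g = sum_j b_j R(y_j, .).

    By bilinearity and property (6.4), it equals sum_{i,j} a_i b_j R(x_i, y_j).
    """
    return sum(
        a * b * fbm_cov(x, y, H)
        for a, x in zip(coeffs_f, points_f)
        for b, y in zip(coeffs_g, points_g)
    )


def evaluate(coeffs, points, y, H):
    """Value at y of the function f = sum_i a_i R(x_i, .)."""
    return sum(a * fbm_cov(x, y, H) for a, x in zip(coeffs, points))


# Illustrative values (assumptions): three kernel sections and one test point.
H = 0.7
coeffs, points = [1.0, -2.0, 0.5], [0.3, 1.0, 2.5]
y = 1.7

# Reproducing property: the inner product of f with R(y, .) returns f(y).
print(rkhs_inner(coeffs, points, [1.0], [y], H), evaluate(coeffs, points, y, H))

# Squared norm of f in the reproducing kernel Hilbert space.
print(rkhs_inner(coeffs, points, coeffs, points, H))
```

The two numbers printed on the first line coincide, which is exactly the reproducing property restricted to finite combinations; the squared norm is non-negative because $R$ is a covariance.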

However, the most important aspect of the reproducing kernel Hilbert space is that the choice of an orthonormal basis of this space yields a series representation of the field, which is often called the Karhunen-Loeve expansion.

THEOREM 6.1.– Any orthonormal basis $(e_n(x))_{n\in\mathbb{N}}$ of $\mathcal{H}_X$ is associated with an orthonormal basis $(\eta_n)_{n\in\mathbb{N}}$ of $H_X$ by the relation:

$$h_{\eta_n} = e_n$$

The random variables $(\eta_n)_{n\in\mathbb{N}}$ are independent centered Gaussian variables of variance 1 and the field can be represented by:

$$X_x(\omega) = \sum_{n=0}^{+\infty} \eta_n(\omega)\, e_n(x) \tag{6.5}$$

where $\omega$ denotes an outcome in the probability space $\Omega$ and the convergence in (6.5) is in the $L^2(\Omega)$ sense.

The preceding theorem is true for any field and for any orthonormal basis of $\mathcal{H}_X$. In fact, a martingale-type argument shows that the convergence is almost sure, which is important, particularly when simulating the fields of interest here. Nevertheless, a judicious choice of the orthonormal basis of $\mathcal{H}_X$ is necessary for conveniently studying the regularity of the trajectories of these fields. We will illustrate these ideas on the fundamental example of FBM in the next section.

6.2.2. Harmonizable representation

By way of example, let us seek the reproducing kernel Hilbert space of the FBM: we will deduce from it a Karhunen-Loeve expansion which will form the basis for


studying the trajectorial regularity of the generalizations of FBM. Let us begin with the definition of the FBM, starting from its covariance.

DEFINITION 6.2.– We call FBM of order $H$ the real centered Gaussian field $B_H$ given by the covariance:

$$E\big(B_H(x) B_H(y)\big) = \frac{1}{2}\Big\{ \|x\|^{2H} + \|y\|^{2H} - \|x - y\|^{2H} \Big\} \tag{6.6}$$

where $0 < H < 1$ and where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$.

Let us begin with some elementary comments explaining this presentation. To simplify our study, we assume that the field is real-valued. In addition, it is sometimes more telling to define the FBM through the variance of its increments, which is:

$$E\big(B_H(x) - B_H(y)\big)^2 = \|x - y\|^{2H}$$

In fact, this property characterizes the FBM if we additionally impose $B_H(0) = 0$ a.s. If $H = \frac{1}{2}$ and $d = 1$, the FBM is a standard Brownian motion and the increments are then independent if they are taken on disjoint intervals; however, this case is exceptional, and most of the methods used for standard Brownian motion do not apply to other values of $H$. On the other hand, we note that the FBM is a field with stationary increments whatever the value of $H$, which constitutes the starting point for representing its covariance. Indeed, it is traditional to represent fields with stationary increments through a spectral measure. Following the example of stationary processes (see [YAG 87] for a general presentation), we obtain:

$$R(x, y) = \int_{\mathbb{R}^d} (e^{ix\cdot\xi} - 1)(e^{-iy\cdot\xi} - 1)\, \mu(d\xi) \tag{6.7}$$

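where $\mu$ is the spectral measure. In the case of FBM, we can guess the spectral measure from the formula:

$$\int_{\mathbb{R}^d} \frac{|e^{ix\cdot\xi} - 1|^2}{\|\xi\|^{d+2H}} \frac{d\xi}{(2\pi)^{d/2}} = C_H^2\, \|x\|^{2H} \tag{6.8}$$

where $C_H$ is a strictly positive constant; the preceding formula gives:

$$R(x, y) = \frac{1}{C_H^2} \int_{\mathbb{R}^d} \frac{(e^{ix\cdot\xi} - 1)(e^{-iy\cdot\xi} - 1)}{\|\xi\|^{d+2H}} \frac{d\xi}{(2\pi)^{d/2}} = \langle k_x, k_y \rangle_{L^2(\mathbb{R}^d)} \tag{6.9}$$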


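where we have:

$$\langle f, g \rangle_{L^2(\mathbb{R}^d)} = \int_{\mathbb{R}^d} f(\xi)\, g^*(\xi) \frac{d\xi}{(2\pi)^{d/2}}$$

The covariance can also be written by using Parseval's formula, which expresses that the Fourier transform is an isometry of $L^2$:

$$R(x, y) = \langle \hat{k}_x, \hat{k}_y \rangle_{L^2(\mathbb{R}^d)} \tag{6.10}$$

where $\hat{f}$ is the Fourier transform of $f$. By using the Fourier inversion theorem:

$$k_x(\xi) = \frac{e^{ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}$$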
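The passage from (6.8) to the covariance (6.6) through (6.9) relies on an elementary polarization identity. A short derivation may help; the shorthand $a = x\cdot\xi$ and $b = y\cdot\xi$ is introduced only for this remark.

```latex
\begin{align*}
(e^{ia}-1)(e^{-ib}-1) &= 1 - e^{ia} - e^{-ib} + e^{i(a-b)},\\
\operatorname{Re}\big[(e^{ia}-1)(e^{-ib}-1)\big]
  &= 1 - \cos a - \cos b + \cos(a-b)\\
  &= \tfrac{1}{2}\Big(|e^{ia}-1|^2 + |e^{ib}-1|^2 - |e^{i(a-b)}-1|^2\Big),
  \qquad\text{since } |e^{i\theta}-1|^2 = 2 - 2\cos\theta .
\end{align*}
```

The imaginary part, $\sin(a-b) - \sin a + \sin b$, is odd in $\xi$ and therefore integrates to zero against the symmetric density $\|\xi\|^{-d-2H}$. Applying (6.8) to each of the three squared moduli then gives $R(x, y) = \frac{1}{2}\{\|x\|^{2H} + \|y\|^{2H} - \|x-y\|^{2H}\}$, which is (6.6).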

Equation (6.9) is an attempt to associate the covariance with a functional scalar product and thus with (6.4). This will give us, following [BEN 97b], a convenient description of the reproducing kernel Hilbert space of the FBM. To this end, let us define the isometry operator $J$ between $\mathcal{H}_{B_H}$ and $L^2(\mathbb{R}^d)$.

DEFINITION 6.3.– We define the linear operator $J$ from $L^2(\mathbb{R}^d)$ to $\mathcal{H}_{B_H}$ by setting:

$$J(k_x) = R(x, \cdot)$$

For $\psi \in L^2(\mathbb{R}^d)$:

$$J(\psi)(y) = \int_{\mathbb{R}^d} \frac{e^{-iy\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}\, (\hat{\psi})^*(\xi) \frac{d\xi}{(2\pi)^{d/2}} \tag{6.11}$$

The reproducing kernel Hilbert space of the FBM can be written:

$$\mathcal{H}_{B_H} = \big\{ f,\ \exists \psi \in L^2(\mathbb{R}^d)\ \text{such that}\ f = J(\psi) \big\}$$

Moreover, $J$ is an isometry:

$$\langle J(\psi_1), J(\psi_2) \rangle_{\mathcal{H}_{B_H}} = \langle \psi_1, \psi_2 \rangle_{L^2(\mathbb{R}^d)} \tag{6.12}$$

The properties of $J$ contained in Definition 6.3 are proved on pages 24 and 25 of [BEN 97b]. This presentation of the reproducing kernel Hilbert space makes it easy to build an orthonormal basis of $\mathcal{H}_{B_H}$ such that the associated Karhunen-Loeve expansion converges almost surely. The first stage consists of choosing an


orthonormal basis of $L^2(\mathbb{R}^d)$ which is adapted to our problem. For this, let us start from a multiresolution analysis of $L^2(\mathbb{R}^d)$ (see [MEY 90a]): it is known that there are functions $\psi^{(l)} \in L^2(\mathbb{R}^d)$, for $l$ belonging to $L = \{0, 1\}^d \setminus \{(0, \ldots, 0)\}$, such that their Fourier transforms $\hat{\psi}^{(l)}(\xi)$ are $C^\infty$ and vanish outside the region $\frac{2\pi}{3} \leq \|\xi\| \leq \frac{8\pi}{3}$. We then set:

$$\psi^{(l)}_{j,k}(x) = 2^{\frac{dj}{2}}\, \psi^{(l)}(2^j x - k), \qquad j, k \in \mathbb{Z} \tag{6.13}$$

Below, we write $\lambda = (j, k, l)$, $\Lambda = \mathbb{Z}^2 \times L$ and $\psi^{(l)}_{j,k} = \psi_\lambda$. Conventionally, in a multiresolution analysis, $(\psi_\lambda)_{\lambda\in\Lambda}$ is an orthonormal basis of $L^2(\mathbb{R}^d)$ and the function $\psi_\lambda$ is localized around $k 2^{-j}$, which we will identify, by abuse of language, with $\lambda$. Here, $\psi_\lambda$ in particular decreases fast:

$$|\psi_\lambda(x)| \leq \frac{C\, 2^{dj/2}}{1 + |2^j x - k|^K} \qquad \forall K \in \mathbb{N} \tag{6.14}$$

From the basis $(\psi_\lambda)_{\lambda\in\Lambda}$, we build an orthonormal basis of the reproducing kernel Hilbert space of the FBM by setting:

$$\varphi_\lambda(x) = \int_{\mathbb{R}^d} \frac{e^{-ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}\, (\hat{\psi}_\lambda)^*(\xi) \frac{d\xi}{(2\pi)^{d/2}} \tag{6.15}$$

When $d = 1$, the functions $\varphi_\lambda$ are “morally” the fractional integrals of the functions $\psi_\lambda$, in the sense of Chapter 7. To be convinced of this, it is necessary to express the fractional integration operator of that chapter in terms of the Fourier transform of the function to which it is applied. It will be noted, however, that the correspondence is not exact. Nevertheless, the main interest of the functions $\varphi_\lambda$ is that they “inherit”, in a certain manner, the localization properties of the functions $\psi_\lambda$: they are “wavelet” type functions in the terminology of [MEY 90b], which results in:

$$|\varphi_\lambda(x)| \leq C(K)\, 2^{-Hj} \left( \frac{1}{1 + |2^j x - k|^K} + \frac{1}{1 + |k|^K} \right) \tag{6.16}$$

$$|\varphi_\lambda(x) - \varphi_\lambda(y)| \leq C(K)\, 2^{-Hj}\, \frac{2^j |x - y|}{1 + |2^j x - k|^K} \qquad \forall K \in \mathbb{N} \tag{6.17}$$

where we suppose that $j \geq 0$ and $(j, l) \neq (0, 0)$. Consequently, the series expansion (6.5) of the FBM becomes:

$$B_H(x) = \sum_{\lambda\in\Lambda} \eta_\lambda\, \varphi_\lambda(x) \tag{6.18}$$

where $(\eta_\lambda)_{\lambda\in\Lambda}$ is a sequence of independent centered Gaussian random variables of variance 1. Thanks to the localization (6.16), it is possible to say that, roughly, the


behavior of the FBM $B_H$ in the vicinity of $x$ depends mainly on the random variables $\eta_\lambda$ for $\lambda$ close to $x$; in particular, the series in (6.18) converges almost surely.

In this chapter we will continue to use the representation of the FBM of order $H$ known as harmonizable. In this representation, the FBM appears as a white noise to which a filter is applied, in the sense of signal theory. Although the FBM can be defined by its harmonizable representation (see Chapter 7 of [SAM 94]), we will here deduce it from (6.18). From this point of view, it is necessary to set:

$$\hat{W}^*(d\xi) = \sum_{\lambda\in\Lambda} \eta_\lambda\, \hat{\psi}_\lambda^*(\xi) \frac{d\xi}{(2\pi)^{d/2}} \tag{6.19}$$

This is a Gaussian random measure: it integrates the deterministic, possibly complex-valued, functions $f$ of $L^2(\mathbb{R}^d)$ and provides a Gaussian random variable:

$$\int_{\mathbb{R}^d} f(\xi)\, \hat{W}^*(d\xi) = \sum_{\lambda\in\Lambda} \eta_\lambda \int_{\mathbb{R}^d} f(\xi)\, \hat{\psi}_\lambda^*(\xi) \frac{d\xi}{(2\pi)^{d/2}} \tag{6.20}$$

The left-hand side of (6.20) must be understood as a notation for the Gaussian random variable defined by the series of the right-hand side, which converges in $L^2(\Omega)$. Since the variables $\eta_\lambda$ are independent, it is deduced that:

$$\left\| \int_{\mathbb{R}^d} f(\xi)\, \hat{W}^*(d\xi) \right\|^2_{L^2(\Omega)} = \|f\|^2_{L^2(\mathbb{R}^d)} \tag{6.21}$$

On the basis of (6.18) and (6.20), we obtain:

$$B_H(x) = \int_{\mathbb{R}^d} \frac{e^{-ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}}\, \hat{W}^*(d\xi) \tag{6.22}$$

The FBM is thus a white noise filtered through the filter:

$$g(x, \xi) = \frac{e^{-ix\cdot\xi} - 1}{C_H\, \|\xi\|^{\frac{d}{2}+H}} \tag{6.23}$$

It should be noted that there are other filters leading to fields which have the same law as that defined in (6.22). In the next section, we will see that it is possible to define generalizations of FBM which do not have stationary increments, starting either from the reproducing kernel Hilbert space of the FBM or from its harmonizable representation.
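To make the "filtered white noise" reading of (6.22) concrete, here is a minimal Python sketch for $d = 1$. It replaces the integral by a truncated midpoint sum over positive frequencies and keeps only the real part, so it reproduces the FBM only up to a multiplicative constant and up to truncation error. The grid size, the cut-off `xi_max` and all function names are assumptions made for this illustration; this is not a simulation method taken from the text.

```python
import math
import random


def frequency_grid(n_freq=20000, xi_max=200.0):
    """Midpoints of n_freq equal cells covering (0, xi_max], and the cell width."""
    d_xi = xi_max / n_freq
    return [(k + 0.5) * d_xi for k in range(n_freq)], d_xi


def increment_variance(tau, H, n_freq=20000, xi_max=200.0):
    """Variance of B(x + tau) - B(x) for the truncated sum used in sample_fbm.

    It is the midpoint sum of 2 (1 - cos(tau xi)) / xi^(1 + 2H), which should
    behave like a constant times |tau|^(2H), as in formula (6.8).
    """
    grid, d_xi = frequency_grid(n_freq, xi_max)
    return sum(
        2.0 * (1.0 - math.cos(tau * xi)) * xi ** (-1.0 - 2.0 * H) * d_xi
        for xi in grid
    )


def sample_fbm(times, H, n_freq=2000, xi_max=200.0, rng=None):
    """One approximate, unnormalized FBM path at the given times."""
    rng = rng or random.Random(0)
    grid, d_xi = frequency_grid(n_freq, xi_max)
    # Independent Gaussian weights for the cosine and sine parts (the white noise).
    a = [rng.gauss(0.0, 1.0) for _ in grid]
    b = [rng.gauss(0.0, 1.0) for _ in grid]
    # Filter amplitude |xi|^-(1/2 + H), weighted by the square root of the cell width.
    amp = [math.sqrt(d_xi) * xi ** (-0.5 - H) for xi in grid]
    path = []
    for t in times:
        # Real part of (e^{-i t xi} - 1) (a + i b), summed over the grid.
        path.append(sum(
            w * ((math.cos(t * xi) - 1.0) * ak + math.sin(t * xi) * bk)
            for w, xi, ak, bk in zip(amp, grid, a, b)
        ))
    return path


# The ratio of increment variances at lags 2 and 1 should be close to 2^(2H).
H = 0.5
print(increment_variance(2.0, H) / increment_variance(1.0, H), 2.0 ** (2 * H))
print(sample_fbm([0.0, 0.25, 0.5, 0.75, 1.0], H))
```

The high-frequency cut-off removes part of the small-scale roughness, so the scaling law is only approximate; refining the grid and raising `xi_max` reduces the discrepancy.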


6.3. Two examples of locally self-similar fields

6.3.1. Definition of the local asymptotic self-similarity (LASS)

In this section, we define precisely the property of local self-similarity which we are seeking. To this end, let us recall the property of self-similarity verified by the FBM:

$$\forall \lambda \in \mathbb{R}^+, \quad \forall x \in \mathbb{R}^d, \qquad B_H(\lambda x) \overset{\mathcal{L}}{=} \lambda^H B_H(x) \tag{6.24}$$

where $\overset{\mathcal{L}}{=}$ means that, for all $n \in \mathbb{N}$ and any choice of $(x_1, \ldots, x_n)$ in $(\mathbb{R}^d)^n$, the vector $(B_H(\lambda x_1), \ldots, B_H(\lambda x_n))$ has the same law as $\lambda^H (B_H(x_1), \ldots, B_H(x_n))$. As we saw in the introduction, it is not easy to localize this global property while preserving an identifiable model for which the trajectories of the process have locally, in the vicinity of a point, the desired Hölderian regularity. The asymptotic definition, presented initially in [BEN 97b], meets these objectives.

DEFINITION 6.4.– Let $h: \mathbb{R}^d \to\, ]0, 1[$ be a function. A field $X$ is called locally asymptotically self-similar (LASS) with multifractional function $h$ if:

$$\forall x \in \mathbb{R}^d, \qquad \lim_{\lambda\to 0^+} \left( \frac{X(x + \lambda u) - X(x)}{\lambda^{h(x)}} \right)_{u\in\mathbb{R}^d} \overset{\mathcal{L}}{=} a(x)\, \big(B_H(u)\big)_{u\in\mathbb{R}^d} \tag{6.25}$$

where $a$ is a strictly positive function and $B_H$ is a FBM of order $H = h(x)$. The topology with which we equip the trajectory space to define the convergence in law is that of uniform convergence on each compact set.

This definition can be reformulated qualitatively by saying that a locally self-similar process admits at each point $x \in \mathbb{R}^d$, up to a normalization of the variance given by $a(x)$, a tangent fractional Brownian motion $B_H$. It is a satisfactory generalization of the self-similarity property. Indeed, it is easy to verify that a FBM is locally self-similar with a constant multifractional function equal to its order. In the case of FBM, we have:

$$X(x + \lambda u) - X(x) \overset{\mathcal{L}}{=} X(\lambda u)$$

because of the stationarity of the increments, and it follows, by applying (6.24), that for a FBM the term:

$$\frac{X(x + \lambda u) - X(x)}{\lambda^H}$$

is constant in law. This elementary verification explains the denominator of (6.25) as well as the localizing role of the asymptotic $\lambda \to 0^+$. The last advantage of Definition 6.4 is that it enables the construction of non-trivial examples of locally self-similar processes, as we will see in the following section.
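The elementary verification above can be checked numerically at the level of covariances. The sketch below, for $d = 1$, computes the covariance of two rescaled increments of a FBM directly from (6.6) and compares it with the FBM covariance itself, for several scales $\lambda$. Since the fields are centered and Gaussian, equality of covariances is equality in law. The function names and the numerical values are illustrative assumptions.

```python
def fbm_cov(x, y, H):
    """Covariance of a one-dimensional FBM of order H (formula (6.6) with d = 1)."""
    return 0.5 * (abs(x) ** (2 * H) + abs(y) ** (2 * H) - abs(x - y) ** (2 * H))


def rescaled_increment_cov(x, u, v, lam, H):
    """Covariance of (X(x + lam u) - X(x)) / lam^H and (X(x + lam v) - X(x)) / lam^H.

    X is a FBM of order H; the expression is expanded by bilinearity.
    """
    a, b = x + lam * u, x + lam * v
    raw = fbm_cov(a, b, H) - fbm_cov(a, x, H) - fbm_cov(x, b, H) + fbm_cov(x, x, H)
    return raw / lam ** (2 * H)


# Illustrative values (assumptions): base point x, directions u and v, order H.
x, u, v, H = 1.5, 0.4, -0.9, 0.3
for lam in (1.0, 0.1, 0.001):
    print(lam, rescaled_increment_cov(x, u, v, lam, H), fbm_cov(u, v, H))
```

For every $\lambda$ the two printed covariances agree up to rounding, which is the statement that the rescaled increment field of a FBM is constant in law.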


6.3.2. Filtered white noise (FWN)

In [BEN 98b], we propose to generalize the harmonizable representation (6.22) by calling filtered white noise (FWN) any process corresponding to a filter $g(x, \xi)$ of the form:

$$g(x, \xi) = (e^{-ix\cdot\xi} - 1) \left( \frac{a(x)}{\|\xi\|^{\frac{d}{2}+H_1}} + \frac{b(x)}{\|\xi\|^{\frac{d}{2}+H_2}} + R(x, \xi) \right) \tag{6.26}$$

for $0 < H_1 < H_2 < 1$. The term in parentheses in (6.26) is an asymptotic expansion at high frequency and the following definition gives the precise assumptions which express that $R(x, \xi)$ is negligible compared with:

$$\frac{b(x)}{\|\xi\|^{\frac{d}{2}+H_2}}$$

when $\|\xi\| \to +\infty$.

DEFINITION 6.5.– We call filtered second order white noise a process $X$ which admits the harmonizable representation:

$$\forall t \in \mathbb{R}, \qquad X(t) = \int_{\mathbb{R}} (e^{-it\xi} - 1) \left( \frac{a(t)}{|\xi|^{\frac{1}{2}+H_1}} + \frac{b(t)}{|\xi|^{\frac{1}{2}+H_2}} + R(t, \xi) \right) \hat{W}^*(d\xi) \tag{6.27}$$

where the two following hypotheses hold.

HYPOTHESIS 6.1.– In the preceding definition, we have $0 < H_1 < H_2 < 1$ and $R(t, \xi) \in C^{1,2}([0, 1] \times \mathbb{R})$ is such that:

$$\left| \frac{\partial^{m+n}}{\partial t^m \partial \xi^n} R(t, \xi) \right| \leq \frac{C}{|\xi|^{\frac{1}{2}+\eta+n}}$$

for $m = 0, 1$ and $n = 0, 1, 2$ with $\eta > H_2$. The symbol $C$ denotes a generic constant.

HYPOTHESIS 6.2.– In the preceding definition, we have $a, b \in C^1([0, 1])$ and, for every $t \in [0, 1]$, we have $a(t) b(t) \neq 0$.

Limiting ourselves to an expansion of order 2 is a convention adopted with the aim of facilitating the presentation of the identification algorithms. For the same reason, we suppose that filtered white noises are processes indexed by $t \in \mathbb{R}$. To better understand the relationship between filtered white noises and FBM, it is enough to suppose that $R(t, \xi)$ is identically zero. We find $X_t = C_{H_1} a(t) B_{H_1}(t) + C_{H_2} b(t) B_{H_2}(t)$, where $B_{H_1}$, $B_{H_2}$ are two fractional Brownian motions and $C_H$ is the constant defined in (6.8). It should, however, be noted that, even in this simplified example, $B_{H_1}$ and $B_{H_2}$ are not independent and therefore


the law of $X$ is not trivially deduced from that of the FBM. This last example illustrates an additional virtue of filtered white noises: their definition not only allows the functional parameters $a(t)$ and $b(t)$ to vary with the position, but also allows the superposition of self-similar phenomena of orders $H_1$ and $H_2$. An interpretation of $H_2$ in terms of long-range dependence can be found in Chapter 5. On the other hand, starting from formula (6.27), it is difficult to find a convenient expression of the reproducing kernel Hilbert space of a filtered white noise. Moreover, the filtered white noises do not retain the global properties of FBMs, that is to say, self-similarity and stationarity of the increments, which is an advantage when modeling certain phenomena. Since the filter of a filtered white noise is asymptotically equivalent at high frequency to that of a FBM, only the local properties remain. For example, the filtered white noises verify a property of local self-similarity of the following type.

PROPOSITION 6.1.– A filtered second order white noise $X$ associated with a filter of the form (6.26) is locally self-similar with constant multifractional function equal to $H_1$:

$$\lim_{\lambda\to 0^+} \left( \frac{X(t + \lambda u) - X(t)}{\lambda^{H_1}} \right)_{u\in\mathbb{R}} \overset{\mathcal{L}}{=} C_{H_1}\, a(t)\, \big(B_{H_1}(u)\big)_{u\in\mathbb{R}} \tag{6.28}$$
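The dominance of the exponent $H_1$ in (6.28) can be seen on variances in a deliberately simplified setting: constant coefficients $a$ and $b$, and $R \equiv 0$ (both are assumptions made for this sketch, not the general case of Definition 6.5). By the isometry (6.21) and formula (6.8) with $d = 1$, the increment of lag $\tau$ then has variance $a^2 C_{H_1}^2 \tau^{2H_1} + 2ab\, C_{\bar{H}}^2 \tau^{2\bar{H}} + b^2 C_{H_2}^2 \tau^{2H_2}$ with $\bar{H} = (H_1 + H_2)/2$. The closed form used for $C_H^2$ comes from the standard integral $\int_0^\infty (1 - \cos u)\, u^{-1-2H}\, du = \pi / (2\,\Gamma(1+2H) \sin(\pi H))$; it is not stated in the text and should be read as an assumption of this example, as are the function names.

```python
import math


def c_squared(H):
    """C_H^2 of formula (6.8) for d = 1, from a standard closed-form integral (assumption)."""
    return math.sqrt(2.0 * math.pi) / (math.gamma(1.0 + 2.0 * H) * math.sin(math.pi * H))


def rescaled_variance(u, lam, a, b, H1, H2):
    """Variance of (X(t + lam u) - X(t)) / lam^H1 for constant a, b and R = 0."""
    tau = lam * abs(u)
    h_mid = 0.5 * (H1 + H2)  # exponent of the cross term between the two filters
    var = (
        a * a * c_squared(H1) * tau ** (2 * H1)
        + 2.0 * a * b * c_squared(h_mid) * tau ** (2 * h_mid)
        + b * b * c_squared(H2) * tau ** (2 * H2)
    )
    return var / lam ** (2 * H1)


# Illustrative values (assumptions).
a, b, H1, H2, u = 1.5, 0.7, 0.3, 0.8, 2.0
limit = a * a * c_squared(H1) * abs(u) ** (2 * H1)  # variance of C_H1 a B_H1(u)
for lam in (1.0, 1e-2, 1e-4, 1e-8):
    print(lam, rescaled_variance(u, lam, a, b, H1, H2), limit)
```

As $\lambda$ decreases, the rescaled variance approaches the variance of $C_{H_1} a B_{H_1}(u)$; the correction terms are of order $\lambda^{H_2 - H_1}$, so the convergence is slow when the two exponents are close.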

A proof of this result can be found in [BEN 98a], in the setting of multifractional processes.

6.3.3. Elliptic Gaussian random fields (EGRP)

Another way of generalizing FBM, which is the approach adopted in [BEN 97b], consists of starting from the reproducing kernel Hilbert space. Returning to Definition 6.3, we can already represent the reproducing kernel Hilbert space norm by means of the operator $J$ and the formula:

$$\langle J(\psi_1), J(\psi_2) \rangle_{\mathcal{H}_{B_H}} = \langle \psi_1, \psi_2 \rangle_{L^2(\mathbb{R}^d)}$$

However, this formula can be presented differently. For all functions $f, g$ of the space $D_0$ of $C^\infty$ functions with compact support which vanish at 0:

$$\langle f, g \rangle_{\mathcal{H}_{B_H}} = \langle A_H f, g \rangle_{L^2(\mathbb{R}^d)} \tag{6.29}$$

where $A_H$ is a pseudo-differential operator of symbol $C_H^2 \|\xi\|^{1+2H}$ ($C_H$ denotes the constant defined in (6.8)), i.e.:

$$A_H(f) = \int_{\mathbb{R}^d} e^{ix\cdot\xi}\, C_H^2\, \|\xi\|^{1+2H}\, \hat{f}(\xi) \frac{d\xi}{(2\pi)^{d/2}} \tag{6.30}$$


A demonstration of (6.29) is found in Lemma 1.1 of [BEN 97b]. This equation is also equivalent to:

$$A_H = J^{-2} \tag{6.31}$$

It should be noted that the symbol of the operator $A_H$, $\sigma(\xi) = C_H^2 \|\xi\|^{1+2H}$, is homogenous in $\xi$ and does not depend on $x$, which corresponds respectively to the self-similarity and to the stationarity of the increments of the process. Consequently, it is natural to consider Gaussian processes associated with symbols $\sigma(x, \xi)$ which also depend on the position. The stationarity of the increments is then lost. However, if we impose that $\sigma(x, \xi)$ is elliptic of order $H$, i.e., controlled by $\|\xi\|^{1+2H}$ when $\|\xi\| \to +\infty$, in the precise sense that there exists $C > 0$ such that, for all $x, \xi \in \mathbb{R}^d$:

$$C (1 + \|\xi\|)^{2H+1} \leq \sigma(x, \xi) \leq \frac{1}{C} (1 + \|\xi\|)^{2H+1} \tag{6.32}$$

we then obtain processes called elliptic Gaussian random processes (EGRP), which locally preserve many properties of the FBM. In this chapter, we define the elliptic Gaussian random processes in a less general manner than in [BEN 97b].

DEFINITION 6.6.– Let $A_X$ be the pseudo-differential operator defined by:

$$\forall f \in D_0, \qquad A_X(f) = \int_{\mathbb{R}} e^{it\xi}\, \sigma(t, \xi)\, \hat{f}(\xi) \frac{d\xi}{(2\pi)^{1/2}} \tag{6.33}$$

of symbol $\sigma(t, \xi)$ verifying, for $0 < H < 1$, the following hypothesis:

HYPOTHESIS 6.3 (H).– There exists $R > 0$:
– for every $t \in \mathbb{R}$ and for $i = 0$ to 3:

$$\left| \frac{\partial^i \sigma(t, \xi)}{\partial \xi^i} \right| \leq C_i (1 + |\xi|)^{2H+1-i} \qquad \text{for } |\xi| > R$$

– > such that: |σ(s, ξ) − σ(t, ξ)| (1 + |ξ|)2α+1+ |s − t|
– it is elliptic of order $H$ (see (6.32)).

We then call elliptic Gaussian random processes of order $H$ the Gaussian processes whose reproducing kernel Hilbert space is given by the closure of $D_0$ for the norm $(\langle A_X f, f \rangle_{L^2(\mathbb{R})})^{1/2}$, equipped with the Hermitian product:

$$\langle f, g \rangle_{\mathcal{H}_X} = \langle A_X f, g \rangle_{L^2(\mathbb{R})}$$


Let us make some comments to clarify Definition 6.6. First, we restrict ourselves to dimension one, mainly for the same reasons of simplicity as for the filtered white noises. In addition, let us note that if $\sigma$ does not depend on $t$, then $A_X$ still verifies:

$$A_X = J_X^{-2} \tag{6.34}$$

for the isometry:

$$J_X(\psi)(y) \overset{\text{def}}{=} \int_{\mathbb{R}} \frac{e^{-iy\xi} - 1}{\sqrt{\sigma(\xi)}}\, (\hat{\psi})^*(\xi) \frac{d\xi}{(2\pi)^{1/2}} \qquad \forall \psi \in L^2(\mathbb{R})$$

This amounts, in fact, to saying that we have a harmonizable representation of $X$:

$$X(t) = \int_{\mathbb{R}} \frac{e^{-it\xi} - 1}{\sqrt{\sigma(\xi)}}\, \hat{W}^*(d\xi)$$

It is therefore enough to have an asymptotic expansion of $(e^{-it\xi} - 1)/\sqrt{\sigma(\xi)}$ of the type (6.26) for $X$ to be a filtered white noise. On the other hand, the FBMs are not elliptic Gaussian random processes of the type defined earlier, since the lower inequality of ellipticity (6.32) is not verified. Moreover, if the symbol depends on $t$, relation (6.34) is no longer true and the elliptic Gaussian random processes are no longer filtered white noises.

Let us reconsider Hypothesis 6.3. The first two points are necessary so that the symbol of $A_X$ behaves asymptotically at high frequency “as if” it did not depend on $t$. If we want to distinguish the two models roughly, we can consider that elliptic Gaussian random processes have more regular trajectories, whereas filtered white noises lend themselves better to identification. Consequently, let us reconsider the manner of determining the local regularity of an elliptic Gaussian random process and very briefly summarize the reasoning of [BEN 97b]. The starting point for the study of the regularity of the elliptic Gaussian random processes is, as for FBM, a Karhunen-Loeve expansion in adapted bases. The selected orthonormal basis is built starting from the basis (6.13) of $L^2(\mathbb{R})$ by setting:

$$\phi_\lambda = (A_X)^{-\frac{1}{2}}(\psi_\lambda)$$

where the fractional power of $A_X$ is defined by means of a symbolic calculus on the operators. This leads to:

$$X = \sum_{\lambda} \eta_\lambda\, \phi_\lambda \qquad \text{a.s. and in } L^2(\Omega) \tag{6.35}$$

The regularity of $X$ is the consequence of “wavelet” type estimates which relate to $\phi_\lambda$ and its first derivative and which resemble (6.16). A precise statement is


found in Theorem 1.1 of [BEN 97b]; the essential point is the decay in $2^{-Hj}$ of the numerator with respect to the scale factor $j$: it is indeed this exponent which governs the almost sure regularity of the process. Thanks to the traditional techniques on random series (see Chapters 15 and 16 of [KAH 85]), we find that, “morally”, the trajectories are Hölderian of exponent $H$. On page 34 of [BEN 97b], we find a great number of results describing very precisely the properties of the local and global moduli of continuity of the elliptic Gaussian random processes; we only mention here, by way of example, a local law of the iterated logarithm.

THEOREM 6.2.– If $X$ is an elliptic Gaussian random process of order $H$, then, for all $t \in \mathbb{R}$, we have:

$$\limsup_{\varepsilon\to 0} \frac{|X(t + \varepsilon) - X(t)|}{|\varepsilon|^H \sqrt{\log\log(\frac{1}{\varepsilon})}} = C(t) \qquad \text{(a.s.)} \tag{6.36}$$

with $0 < C(t) < +\infty$.

Thus, considering that “the trajectories are Hölderian of exponent $H$” is equivalent to forgetting the iterated logarithm factor. On the other hand, it should be noted that if we are interested only in the modulus of continuity of the elliptic Gaussian random processes, without wanting to specify the limit $C(t)$, then metric entropy techniques (see [LIF 95]), valid for all Gaussian processes, are applicable. Lastly, elliptic Gaussian processes are locally self-similar, subject to a convergence property of their symbol at high frequency.

PROPOSITION 6.2.– If an elliptic Gaussian random process $X$ of order $H$ is associated with a symbol verifying:

$$\lim_{|\xi|\to+\infty} \frac{\sigma(t, \xi)}{|\xi|^{1+2H}} = a(t) \qquad \forall t \in \mathbb{R}$$

then $X$ is locally self-similar with constant multifractional function equal to $H$:

$$\lim_{\lambda\to 0^+} \left( \frac{X(t + \lambda u) - X(t)}{\lambda^{H}} \right)_{u\in\mathbb{R}} \overset{\mathcal{L}}{=} a(t)\, \big(B_H(u)\big)_{u\in\mathbb{R}} \tag{6.37}$$

6.4. Multifractional fields and trajectorial regularity

The examples of the preceding section lead to the principal objection concerning both elliptic Gaussian processes and filtered white noises: the property of local self-similarity shows that, in spite of the modulations introduced by the symbol or filter, the multifractional function remains constant. The multifractional Brownian motion (MBM), introduced independently by [BEN 97b, PEL 96], is a model where a non-trivial multifractional function appears. This can be defined by its harmonizable representation.


DEFINITION 6.7.– Let $h: \mathbb{R}^d \to\, ]0, 1[$ be a measurable function. We call MBM of function $h$ any field admitting the harmonizable representation:

$$B_h(x) = \frac{1}{v\big(h(x)\big)} \int_{\mathbb{R}^d} \frac{e^{-ix\cdot\xi} - 1}{\|\xi\|^{\frac{d}{2}+h(x)}}\, W(d\xi) \tag{6.38}$$

where $W(d\xi)$ is a Brownian measure and, for every $s \in\, ]0, 1[$:

$$v(s) = \left( \int_{\mathbb{R}^d} \frac{2\big(1 - \cos(\xi_1)\big)}{\|\xi\|^{d+2s}} \frac{d\xi}{(2\pi)^{d/2}} \right)^{1/2} \tag{6.39}$$

where $\xi_1$ is the first coordinate of $\xi$.

As in (6.19), we define a general Brownian measure starting from an orthonormal basis $(g_n(x))_{n\in\mathbb{N}}$ of $L^2(\mathbb{R}^d)$ and a sequence $(\eta_n)_{n\in\mathbb{N}}$ of independent centered Gaussian variables of variance 1, by setting:

$$\int_{\mathbb{R}^d} f(\xi)\, W(d\xi) = \sum_{n\in\mathbb{N}} \eta_n \int_{\mathbb{R}^d} f(\xi)\, g_n^*(\xi) \frac{d\xi}{(2\pi)^{d/2}} \tag{6.40}$$

for any function $f$ of $L^2(\mathbb{R}^d)$. In addition, the normalization function $v^2$ corresponds to the constant $C_H^2$ of (6.8) and it is noted immediately that, if the function $h$ is a constant equal to $H$, the MBM is a fractional Brownian motion of order $H$ (a unifractional Brownian motion). Before studying the properties of the MBM, we will establish the link between its harmonizable representation and the representation in the form of a moving average obtained by [PEL 96].

6.4.1. Two representations of the MBM

To summarize the link between the harmonizable representation and the moving average representation of the MBM, we will say that they are Fourier transforms of each other. To be more precise, let us start from Definition 6.7 for a process indexed by $\mathbb{R}$ (which is the case considered in [PEL 96]). Let us suppose that the Brownian measure of (6.38) is $\hat{W}^*(d\xi)$; we have a series expansion of the MBM:

$$B_h(t) = \frac{1}{v\big(h(t)\big)} \sum_{\lambda\in\Lambda} \eta_\lambda \left\langle \frac{e^{-it\cdot} - 1}{|\cdot|^{\frac{1}{2}+h(t)}},\ \hat{\psi}_\lambda \right\rangle_{L^2(\mathbb{R})} \tag{6.41}$$

However, Parseval's identity leads to:

$$\left\langle \frac{e^{-it\cdot} - 1}{|\cdot|^{\frac{1}{2}+h(t)}},\ \hat{\psi}_\lambda \right\rangle_{L^2} = \left\langle \widehat{\left( \frac{e^{it\cdot} - 1}{|\cdot|^{\frac{1}{2}+h(t)}} \right)},\ \psi_\lambda \right\rangle_{L^2} \tag{6.42}$$


The Fourier transform of $(e^{it\cdot} - 1)/|\cdot|^{\frac{1}{2}+h(t)}$ is calculated by noticing that the Fourier transform of a homogenous distribution is also homogenous:

$$\widehat{\left( \frac{e^{it\cdot} - 1}{|\cdot|^{\frac{1}{2}+h(t)}} \right)}(s) = C\big(h(t)\big) \left( |t - s|^{h(t)-\frac{1}{2}} - |s|^{h(t)-\frac{1}{2}} \right) \tag{6.43}$$

We deduce from it the following theorem, whose proof is found in [COH 99].

THEOREM 6.3.– Let the Brownian measure be $\hat{W}^*(d\xi)$ of (6.19). The MBM given by the harmonizable representation:

$$B_h(t) = \frac{1}{v\big(h(t)\big)} \int_{\mathbb{R}} \frac{e^{-it\xi} - 1}{|\xi|^{\frac{1}{2}+h(t)}}\, \hat{W}^*(d\xi) \tag{6.44}$$

is almost surely equal, up to a deterministic multiplicative function, to the symmetric moving average:

$$\int_{-\infty}^{+\infty} \left[ |t - s|^{h(t)-\frac{1}{2}} - |s|^{h(t)-\frac{1}{2}} \right] W(ds) \tag{6.45}$$

where the Brownian measure is given by:

$$W(ds) = \sum_{\lambda\in\Lambda} \eta_\lambda\, \psi_\lambda(s)\, ds \tag{6.46}$$

This theorem calls for several comments. First of all, when $h(t) = \frac{1}{2}$:

$$\left[ |t - s|^{h(t)-\frac{1}{2}} - |s|^{h(t)-\frac{1}{2}} \right]$$

is not clearly defined, but the proof of the theorem shows that we must set:

$$\left[ |t - s|^{0} - |s|^{0} \right] \overset{\text{def}}{=} \log\frac{1}{|t - s|} - \log\frac{1}{|s|}$$

in (6.45). Now that we know that there is essentially only one MBM, we can state the local self-similarity associated with its multifractional function. On this subject, let us recall Theorem 1.7 of [BEN 97b], which finds its counterpart in Proposition 5 of [PEL 96].

PROPOSITION 6.3.– A MBM of function $h$ of Hölderian class $C^r$, with $r > \sup_t h(t)$, is locally self-similar with multifractional function $h$.
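The kernel of the moving average (6.45), including the logarithmic convention used when $h(t) = \frac{1}{2}$, can be written down directly. The sketch below is only a transcription of these two formulas for illustration; the function name, the argument order and the choice to raise an error at the singular points are assumptions of this example.

```python
import math


def moving_average_kernel(t, s, h):
    """Kernel of the moving average (6.45) at the point s, for time t and h = h(t).

    For h = 1/2 the power form reduces to 1 - 1 = 0, so the logarithmic
    convention stated after Theorem 6.3 is used instead.
    """
    if h <= 0.5 and (s == 0.0 or s == t):
        # Negative powers and logarithms are singular at s = 0 and s = t.
        raise ValueError("kernel is singular at s = 0 and s = t when h <= 1/2")
    if h == 0.5:
        return math.log(1.0 / abs(t - s)) - math.log(1.0 / abs(s))
    p = h - 0.5
    return abs(t - s) ** p - abs(s) ** p


# Illustrative evaluations (assumed values of t, s and h).
print(moving_average_kernel(3.0, 1.0, 0.9))   # 2^0.4 - 1
print(moving_average_kernel(3.0, 1.0, 0.5))   # log(1/2) - log(1)
print(moving_average_kernel(3.0, 1.0, 0.2))   # 2^-0.3 - 1
```

Because the exponent $h(t) - \frac{1}{2}$ changes with $t$, the kernel changes shape along the trajectory, which is how the moving average representation encodes a varying multifractional function.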


6.4.2. Study of the regularity of the trajectories of the MBM

This section recalls the known results about the regularity of the trajectories of the MBM. To carry out this study, both in [BEN 97b] and in [PEL 96], a hypothesis of regularity is made on the multifractional function $h$ itself. In this section, we assume that the following hypothesis is verified:

HYPOTHESIS 6.4.– The function $h$ is Hölderian of exponent $r$ (denoted $h \in C^r$) with:

$$r > \sup_{t\in\mathbb{R}} h(t)$$

This hypothesis, whose formulation is surprising, was long considered to be related to the technique of the proof. In fact, we will see by outlining the proof of [BEN 97b] that we cannot do better for a MBM and that the obstruction comes from the “low frequencies”. Let us begin with the random series representation of the MBM (6.41), which we present differently:

$$B_h(t) = \frac{1}{v\big(h(t)\big)} \sum_{\lambda\in\Lambda} \eta_\lambda\, \chi_\lambda\big(t, h(t)\big) \tag{6.47}$$

where the function:

$$\chi_\lambda(x, y) = \int_{\mathbb{R}} \frac{e^{-ix\xi} - 1}{|\xi|^{\frac{1}{2}+y}}\, \hat{\psi}_\lambda^*(\xi) \frac{d\xi}{(2\pi)^{1/2}} \tag{6.48}$$

is analytic in its two variables (the fact that the support of the functions $\hat{\psi}_\lambda$ does not contain 0 is used here). Similarly, the normalization function $v$ is analytic and does not vanish on $]0, 1[$. It follows that, if we truncate the series (6.47) by considering only a finite number of dyadics $\lambda$, then the resulting random function has the same regularity as the multifractional function $h$. In addition, the irregularity of the trajectories of the MBM is a consequence of high frequency phenomena (i.e., dependent on $\chi_\lambda(t, h(t))$ for $|\lambda| \to +\infty$). We can find in [BEN 97b] the high frequency estimates for the MBM, which are generalizations of (6.16). For every $K \in \mathbb{N}$:

$$\big|\chi_\lambda\big(t, h(t)\big)\big| \leq C(K)\, 2^{-h(t)j} \left( \frac{1}{1 + |2^j t - k|^K} + \frac{1}{1 + |k|^K} \right) \tag{6.49}$$

$$\big|\chi_\lambda\big(t, h(t)\big) - \chi_\lambda\big(s, h(s)\big)\big| \leq C(K)\, 2^{-h(s,t)j} \left( \frac{2^j |t - s| + j |h(t) - h(s)|}{1 + |2^j t - k|^K} + \frac{j |h(t) - h(s)|}{1 + |k|^K} \right) \tag{6.50}$$

with $h(s, t) = \min(h(s), h(t))$. We notice, in particular, the factor $2^{-h(t)j}$ which leads, for reasons identical to those of section 6.3.3, to the conclusion that the


MBM is, up to a logarithmic factor, almost surely Hölderian of exponent $h(t)$. If Hypothesis 6.4 is omitted, it is not difficult to see that “the Hölder exponent” of the MBM at $t$ is given by $\min(h(t), r)$, which amounts to saying that it is the most irregular of the high and low frequency parts of the MBM which imposes the overall regularity. Let us recall one of the results of [BEN 97b].

THEOREM 6.4.– If $X$ is a MBM whose multifractional function verifies Hypothesis 6.4, then, for all $t \in \mathbb{R}$, we have:

$$\limsup_{\varepsilon\to 0} \frac{|X(t + \varepsilon) - X(t)|}{|\varepsilon|^{h(t)} \sqrt{\log\log(\frac{1}{\varepsilon})}} = C(t) \qquad \text{a.s.} \tag{6.51}$$

with $0 < C(t) < +\infty$.

For the simulation of the MBM, see [AYA 00], which gives some indications on this question. We also note the existence of the FracLab toolbox in this field.

6.4.3. Towards more irregularities: generalized multifractional Brownian motion (GMBM) and step fractional Brownian motion (SFBM)

We saw in the preceding section that the MBM provides a model for locally self-similar processes with varying multifractional functions and pointwise exponents. However, Hypothesis 6.4, which is essential within the strict framework of the MBM, is cumbersome for certain applications. Let us quote two examples where we would wish for regularities which are worse than Hölderian.

First, in the rupture models which are important in image segmentation, we would like the Hölder exponent to have discontinuities. Let us make this problem clearer through an example. Suppose that we have an aerial image on which we want to distinguish the boundary between a field and a forest. It is usual to model the texture of a forest by a FBM of Hölder exponent $H_1$. In the same way, for the portion covered by the field, we can think of a FBM of exponent $H_2$. The question arises as to how “to connect” the two processes at the border. One possibility is to consider a MBM in the sense of Definition 6.7 for which the function $h$ takes the two values $H_1$ and $H_2$. However, this MBM does not correspond to the image: it shows a discontinuity at the place where the function $h$ jumps, whereas on an image only the regularity changes suddenly and, most often, the field itself remains continuous. To model this type of rupture, we recall below the construction of the step fractional Brownian motion (SFBM) of [BEN 00].

In addition, the Hölder exponent of the MBM varies too slowly for applications to so-called developed turbulence (see [FRI 97] for an introduction to this subject). Indeed, the physics of turbulence teaches us that the data accessible from measurements are not the Hölder exponents of the studied quantities, but their multifractal spectrum.

The description of the multifractal spectrum is beyond the scope of this chapter (see Chapter 1, Chapter 3 and Chapter 4), but it is enough to know that, for a function whose Hölder exponent is itself $C^r$, this spectrum is trivial. We can thus be convinced that the MBM is not a realistic model for developed turbulence. In order to obtain processes whose trajectories have Hölder exponents which vary abruptly, Ayache and Lévy Véhel have proposed a model called the generalized multifractional Brownian motion (GMBM) in [AYA 99]. We present their model in the second part of this section.

6.4.3.1. Step fractional Brownian motion

The multifractional functions associated with SFBMs are very simple, which makes it possible to have a reasonable model for the identification of ruptures. We limit ourselves to step multifractional functions:

$$h(t) = \sum_{i=0}^{K} \mathbf{1}_{[a_i, a_{i+1}[}(t)\, H_i \tag{6.52}$$

with $a_0 = -\infty$ and $a_{K+1} = +\infty$, and where $(a_i)$ is an increasing sequence of real numbers. Starting from (6.47), we arrive at the following definition.

DEFINITION 6.8.– Let $\Lambda^+ = \{ k/2^j,\ k \in \mathbb{Z},\ j \in \mathbb{N} \}$. By SFBM with multifractional function $h$ given by (6.52), we mean the process:

$$Q_h(t) = \sum_{\lambda \in \Lambda^+} \eta_\lambda\, \chi_\lambda\big(t, h(\lambda)\big) \tag{6.53}$$

where the functions $\chi_\lambda$ are defined by (6.48). This definition differs from (6.47) in several respects. Some differences are technical, like the suppression of the normalization $v(h(t))$ or the absence of negative frequencies. The essential difference is that the SFBM has continuous trajectories, whereas the MBM which corresponds to a piecewise constant multifractional function is discontinuous. This comes from the replacement of $\chi_\lambda(t, h(t))$ by $\chi_\lambda(t, h(\lambda))$: the first function is discontinuous, like $h$, at the points $a_i$, and this jump disappears in $\chi_\lambda(t, h(\lambda))$. However, the fast decay of the functions $t \mapsto \chi_\lambda(t, h(\lambda))$ when $|t - \lambda| \to +\infty$ causes the SFBMs to have local properties very close to those of the MBM away from the jump times of the multifractional function. The following theorem, which describes the regularity of the SFBM more precisely, can be found in [BEN 00].

THEOREM 6.5.– For any open interval $I$ of $\mathbb{R}$, let:

$$H^*(I) = \inf\{h(t),\ t \in I\}$$

If $Q_h$ is a SFBM with multifractional function $h$, then $Q_h$ is globally Hölderian of exponent $H$, for all $0 < H < H^*(I)$, on any compact interval $J \subset I$.

Thus, in terms of regularity, the SFBM is a satisfactory model. We will see, in section 6.5, that we can completely identify the multifractional function, that is, the times and amplitudes of its jumps.

6.4.3.2. Generalized multifractional Brownian motion

Let us now outline the work of [AYA 99]. The authors propose to circumvent the “low frequency” problems encountered in the definition of the MBM by replacing the multifractional function $h$ with a sequence of regular functions $(h_n)$ whose limit, which will play the role of the multifractional function, can be very irregular. Let us first specify the technical conditions on the sequence $(h_n)_{n \in \mathbb{N}}$.

DEFINITION 6.9.– A function $h$ is said to be locally Hölderian of exponent $r$ and of constant $c > 0$ on $\mathbb{R}$ if, for all $t_1$ and $t_2$ such that $|t_1 - t_2| \le 1$, we have:

$$|h(t_1) - h(t_2)| \le c\,|t_1 - t_2|^r$$

Such a function will be called $(r, c)$-Hölderian.

We can now define the multifractional sequences, which generalize the multifractional functions for the GMBM.

DEFINITION 6.10.– We will call a multifractional sequence a sequence $(h_n)_{n \in \mathbb{N}}$ of $(r, c_n)$-Hölderian functions with values in an interval $[a, b] \subset\, ]0, 1[$, and we will call its lower limit a generalized multifractional function (GMF):

$$h(t) = \liminf_{n \to +\infty} h_n(t)$$

if $(h_n)_{n \in \mathbb{N}}$ verifies the following properties:
– for all $\varepsilon > 0$ and all $t_0$, there exist $n_0(t_0, \varepsilon)$ and $h_0(t_0, \varepsilon) > 0$ such that, for all $n > n_0$ and all increments $|h| < h_0$ (here $h$ denotes a real number), we have $h_n(t_0 + h) > h(t_0) - \varepsilon$;
– for all $t$, we have $h(t) < r$, and $c_n = O(n)$.

In the preceding definition, it is essential that the generalized multifractional function is a limit as the index $n$ tends towards $+\infty$; we will see that this reflects the high-frequency part of the information contained in the multifractional sequence. In addition, the set of GMFs contains very irregular functions like, for example, for $0 < a < b < 1$:

$$t \longmapsto b + (a - b)\,\mathbf{1}_F(t)$$

where $F$ is a set of the Cantor type. A proof of this result, as well as the beginning of a discussion of the set of GMFs, is found in [AYA 99]. Lastly, a process can be associated with a multifractional sequence in the following manner.

DEFINITION 6.11.– We will call a GMBM associated with a multifractional sequence $(h) = (h_n)_{n \in \mathbb{N}}$ any process admitting the harmonizable representation:

$$Y_{(h)}(t) = \int_{|\xi| < 1} \frac{e^{-it\cdot\xi} - 1}{|\xi|^{\frac12 + h_0(t)}}\, W(d\xi) + \sum_{n=1}^{+\infty} \int_{2^{n-1} \le |\xi| < 2^n} \frac{e^{-it\cdot\xi} - 1}{|\xi|^{\frac12 + h_n(t)}}\, W(d\xi) \tag{6.54}$$

The comparison of this definition with that of the MBM is instructive: if all the functions $(h_n)$ of the multifractional sequence are equal to a function $h$ verifying Hypothesis 6.4, then the GMBM is a MBM up to normalization:

$$B_h(t) = \frac{1}{v(h(t))}\, Y_{(h)}(t)$$

We also note that the law of a GMBM depends on the whole multifractional sequence and not only on its limit, the generalized multifractional function. In fact, writing the GMBM in series form suggests that the $n$th function of the multifractional sequence “governs” the behavior of the GMBM at scale $1/2^{n-1}$. To clarify this idea, let us suppose that the white noise in formula (6.54) is $\widehat{W}^*(d\xi)$ and let us consider the series expansion of the GMBM which is deduced from it:

$$Y_{(h)}(t) \approx \sum_{\lambda \in \Lambda^+} \eta_\lambda \sum_{n=1}^{+\infty} \int_{2^{n-1} \le |\xi| < 2^n} \frac{e^{-it\cdot\xi} - 1}{|\xi|^{\frac12 + h_n(t)}}\, \widehat{\psi}^*_\lambda(\xi)\, d\xi \tag{6.55}$$

The expression above is only approximate (hence the symbol $\approx$) because it does not take low-frequency phenomena into account: we have omitted the integral over $\{|\xi| < 1\}$ in (6.54). This minor inaccuracy is not detrimental to the heuristic reasoning to come, which seeks to explain why the regularity of the GMBM depends on $h(t) = \liminf_{n \to +\infty} h_n(t)$. To further eliminate technical problems, let us suppose that the functions $\widehat{\psi}_\lambda$ of the definition of $\widehat{W}^*(d\xi)$ vanish outside $[2^j, 2^{j+1}]$ when $\lambda = k/2^j$ (this means neglecting the constants $\frac{2\pi}{3}$ and $\frac{8\pi}{3}$ which appear when describing the support of the function $\widehat{\psi}^{(l)}$ in (6.13)).

Under these assumptions, formula (6.55) simplifies, because the integrals are zero unless the scale index $j$ of $\lambda = k/2^j$ is equal to $n - 1$; we obtain:

$$Y_{(h)}(t) \approx \sum_{j=0}^{+\infty} \sum_{k} \eta_{k/2^j}\, \chi_{k/2^j}\big(t, h_{j+1}(t)\big) \tag{6.56}$$

This clarifies the natural correspondence between $h_{j+1}$ and the scale $j$. In particular, we understand why the regularity of the GMBM depends on the behavior of $h_n(t)$ when $n$ becomes large. This result is Theorem 2 of [AYA 99], the precise statement of which we now recall.

THEOREM 6.6.– Let:

$$\alpha_{Y_{(h)}}(t) \overset{\text{def}}{=} \sup\left\{\alpha,\ \lim_{\varepsilon \to 0} \frac{Y_{(h)}(t+\varepsilon) - Y_{(h)}(t)}{|\varepsilon|^{\alpha}} = 0\right\}$$

be the pointwise Hölder exponent at $t \in \mathbb{R}$ of a GMBM $Y_{(h)}$. Then:

$$\forall t \in \mathbb{R}, \quad \alpha_{Y_{(h)}}(t) = h(t) \quad \text{(a.s.)}$$

where $h$ is the generalized multifractional function of the GMBM.

To conclude this section on the GMBM, it should be added that, under additional assumptions on the multifractional sequence of a GMBM, the generalized multifractional function of a GMBM has been identified in [AYA 04a, AYA 04b].

6.5. Estimate of regularity

In this section, we will estimate the regularity of the processes by means of quadratic variations. It is of course not the only method: see Chapter 2 and Chapter 9 for alternative approaches.

6.5.1. General method: generalized quadratic variation

First, we fix the general framework of the identification methods. In this section, we identify processes indexed by $\mathbb{R}$. The hypothesis that the processes are Gaussian allows us to carry out the identification from a single trajectory of the process, which we suppose to have been observed at discrete times: we assume that $X(p/N)$ is known for $0 \le p \le N$. The various estimators which we use are built from the generalized quadratic variation of the observations $X(p/N)$, which we write:

$$V_N(w) = \sum_{p=0}^{N-2} w\!\left(\frac{p}{N}\right) \left[ X\!\left(\frac{p+2}{N}\right) - 2X\!\left(\frac{p+1}{N}\right) + X\!\left(\frac{p}{N}\right) \right]^2 \tag{6.57}$$

where $w$ is a weight function which serves to localize the quadratic variation when we seek to estimate functions (for example, the multifractional function of a MBM). In what follows, we write $V_N$ for $V_N(w)$ when $w$ is the constant 1.
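As an illustration of (6.57), here is a minimal Python sketch. It assumes that the observations are stored in an array `x` with `x[p]` equal to $X(p/N)$ for $p = 0, \ldots, N$; the function name and the NumPy-based layout are illustrative assumptions, not part of the original text.

```python
import numpy as np

def generalized_quadratic_variation(x, w=None):
    """Sketch of V_N(w) in (6.57).

    Assumption: x[p] holds the observation X(p/N) for p = 0..N, and
    w (if given) is a vectorized weight function on [0, 1].
    """
    x = np.asarray(x, dtype=float)
    n = x.size - 1                                  # N in the text
    # Second-order increments X((p+2)/N) - 2 X((p+1)/N) + X(p/N), p = 0..N-2.
    second_diff = x[2:] - 2.0 * x[1:-1] + x[:-2]
    if w is None:
        weights = np.ones(n - 1)                    # w = 1 gives V_N
    else:
        weights = w(np.arange(n - 1) / n)           # w(p/N)
    return float(np.sum(weights * second_diff**2))
```

Calling the function without `w` returns $V_N$; passing, for instance, the indicator function of an interval localizes the variation to that interval.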

We will try to bring out, in the method of identification, the techniques which apply to all the models already presented. All these models have trajectories of Hölderian regularity, and it is this which will guide us in building estimators. From this point of view, the introduction of variations to quantify the irregularity of the trajectories is natural. Let us outline an example of the use of formula (6.57) in the simplest of cases: the identification of the order $H$ of a FBM. Since the order is a global parameter, we can take $w$ constantly equal to 1. Remembering that the trajectories of our processes are nearly $C^H$ Hölderian, we deduce from (6.57) that:

$$V_N \approx N^{1-2H} \left[ \frac{1}{N} \sum_{p=0}^{N-2} C\!\left(\frac{p}{N}, \omega\right) \right] \tag{6.58}$$

where $C(p/N, \omega)$ is the random Hölder constant associated with the trajectory $X$ at the point $p/N$. Indeed, the term:

$$X\!\left(\frac{p+2}{N}\right) - 2X\!\left(\frac{p+1}{N}\right) + X\!\left(\frac{p}{N}\right) = \left[ X\!\left(\frac{p+2}{N}\right) - X\!\left(\frac{p+1}{N}\right) \right] - \left[ X\!\left(\frac{p+1}{N}\right) - X\!\left(\frac{p}{N}\right) \right]$$

is a difference of two increments over a step $1/N$, and thus, when $N \to +\infty$:

$$\left[ X\!\left(\frac{p+2}{N}\right) - 2X\!\left(\frac{p+1}{N}\right) + X\!\left(\frac{p}{N}\right) \right]^2 \approx N^{-2H}$$

which gives (6.58), since the sum contains of the order of $N$ terms. We would like to apply a law of large numbers to the term between brackets in (6.58). However, the random variables $C(p/N, \omega)$ are not, in general, independent, and only the asymptotic decorrelation of the fractional Brownian increments enables the use of a principle of the “law of large numbers” type. Formula (6.58) explains the expression of the estimator of $H$:

$$\hat{H}_N = \frac12 \left( \log_2 \frac{V_{N/2}}{V_N} + 1 \right) \tag{6.59}$$

Thanks to a central limit theorem for the term between brackets in (6.58), we obtain the rate of convergence of the estimators as a function of the discretization step $1/N$. It is at this point of the reasoning that we find the reason which forces us to choose a generalized quadratic variation rather than a traditional quadratic variation:

$$\tilde{V}_N(w) = \sum_{p=1}^{N-1} w\!\left(\frac{p}{N}\right) \left[ X\!\left(\frac{p+1}{N}\right) - X\!\left(\frac{p}{N}\right) \right]^2$$
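The estimator (6.59) can be tried on a case where $H$ is known. The following Python sketch is only an illustration under stated assumptions: it uses a standard Brownian motion (a FBM with $H = 1/2$) simulated by cumulative sums, computes $V_{N/2}$ by keeping every other sample, and uses hypothetical function names.

```python
import numpy as np

def second_order_variation(x):
    """V_N with w = 1: sum of squared second-order increments of x."""
    x = np.asarray(x, dtype=float)
    return float(np.sum((x[2:] - 2.0 * x[1:-1] + x[:-2]) ** 2))

def estimate_hurst(x):
    """Sketch of the estimator (6.59).

    Assumption: x[p] = X(p/N), p = 0..N with N even, so that x[::2]
    holds the samples X(p/(N/2)) needed for V_{N/2}.
    """
    v_n = second_order_variation(x)
    v_half = second_order_variation(x[::2])
    return 0.5 * (np.log2(v_half / v_n) + 1.0)

# Illustrative data: Brownian motion on the grid p/N, i.e. a FBM with H = 1/2.
rng = np.random.default_rng(0)
n = 2**14
increments = rng.normal(scale=np.sqrt(1.0 / n), size=n)
x = np.concatenate(([0.0], np.cumsum(increments)))
h_hat = estimate_hurst(x)   # expected to be close to 0.5 for large N
```

The estimate is random; for a single trajectory it only approaches $H$ as $N$ grows, at the rate discussed in the text.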

In [LEO 89], the authors note that, for $H > \frac34$, no central limit theorem exists for the traditional quadratic variation. We will see that the methodology presented for a FBM remains valid for the other models.

6.5.2. Application to the examples

The starting point, and the model best suited to identification, is that of filtered white noises. For these processes, it is possible to identify not only the first-order parameters, $H_1$ and the modulation function $a$ (thanks to weighted quadratic variations), but also the parameters $(H_2, b(x))$, which are of second order as regards their influence on the local regularity of the trajectories. First, we discuss in detail the arguments and the estimators valid for filtered white noises; then we explain how the principles developed within this framework apply to more sophisticated models.

6.5.2.1. Identification of filtered white noise

Let us quickly describe the stages of the identification of the parameters of a filtered white noise given by the formula of Definition 6.5:

$$X(t) = \int \left(e^{-it\xi} - 1\right) \left[ \frac{a(t)}{|\xi|^{\frac12 + H_1}} + \frac{b(t)}{|\xi|^{\frac12 + H_2}} + R(t, \xi) \right] \widehat{W}^*(d\xi)$$

By using the isometry property (6.21) of the Brownian measure $\widehat{W}^*(d\xi)$, we show that:

$$\lim_{N \to +\infty} N^{2H_1 - 1}\, E\, V_N(w) = F(2H_1) \int_0^1 a^2(t)\, w(t)\, dt \tag{6.60}$$

where:

$$F(x) = 16 \int_{\mathbb{R}} \frac{\sin^4(t/2)}{|t|^{x+1}}\, dt$$

is defined for $x \in\, ]0, 2[$. The estimation of the first-order parameters, which relies on a fine study of the variance of $V_N(w)$, is described by the following theorem, taken from [BEN 98b].

THEOREM 6.7.– Let $X$ be a filtered white noise given by formula (6.27). If the weight function $w$ is of class $C^2$ with support in $]0, 1[$, then the estimators:

$$\hat{H}_N = \frac12 \left( \log_2 \frac{V_{N/2}}{V_N} + 1 \right) \tag{6.61}$$

and:

$$\hat{I}_N(w) = \frac{N^{2\hat{H}_N - 1}\, V_N(w)}{F(2\hat{H}_N)} \tag{6.62}$$

converge almost surely, when $N \to +\infty$, towards $H_1$ and:

$$I(w) = \int_0^1 a^2(t)\, w(t)\, dt$$

respectively. Moreover:
– if $H_2 - H_1 > \frac12$, then:

$$\sqrt{N}\left(\hat{H}_N - H_1\right) \quad \text{and} \quad \frac{\sqrt{N}}{\log N}\left(\hat{I}_N(w) - I(w)\right)$$

converge in distribution towards a centered Gaussian random variable;
– if $H_2 - H_1 \le \frac12$, then:

$$E\left(\hat{H}_N - H_1\right)^2 \le C N^{2(H_1 - H_2)}$$

and:

$$E\left(\hat{I}_N(w) - I(w)\right)^2 \le C \log^2(N)\, N^{2(H_1 - H_2)}$$

As regards the estimation of the functional parameter, the preceding theorem can seem disappointing, since it only provides estimates of integrals of $a^2$ against weight functions $w$. A general method for rebuilding a pointwise estimator of $a(t)$ from these integrals will be found in [IST 96].

To understand the convergence rates given by Theorem 6.7, it is necessary to know that the estimators are subject to two types of error. The first comes from the central limit theorem and arises in the estimation of the first-order parameters; we will call it stochastic error. The second comes from the second-order perturbation in the filter defining the filtered white noise, which creates a distortion. If $H_2 - H_1 > \frac12$, the stochastic error dominates the distortion. Otherwise, the convergence rate is imposed by the distortion.

The estimation of the second-order factors $(b, H_2)$ is more difficult because it requires building functionals that do not depend asymptotically on the first-order factors. An example of such a functional is given by:

$$V_{N/2} - 2^{2H_1 - 1}\, V_N$$

The factor $2^{2H_1 - 1}$ is necessary to compensate exactly for the influence of the first-order parameters. However, it must be estimated, and for this we use the convergence:

$$\lim_{N \to +\infty} \frac{V_{N^2/2}}{V_{N^2}} = 2^{2H_1 - 1}$$

which is sufficiently rapid for the compensation to always take place. An estimator of the parameter $H_2$ is thus obtained.

THEOREM 6.8.– Let:

$$W_N \overset{\text{def}}{=} V_{N/2} - \frac{V_{N^2/2}}{V_{N^2}}\, V_N \tag{6.63}$$

Then the estimator:

$$\hat{H}_2(N) = \frac12 - \frac12 \log_2 \frac{V_{N^2/2}}{V_{N^2}} + \log_2 \frac{W_{N/2}}{W_N} \tag{6.64}$$

converges a.s. towards $H_2$ when $N \to +\infty$.

6.5.2.2. Identification of elliptic Gaussian random processes

It is possible to identify directly the symbol of an elliptic Gaussian random process of the form:

$$\sigma(t, \xi) = a(t)\,|\xi|^{\frac12 + H_1} + b(t)\,|\xi|^{\frac12 + H_2} + p(\xi) \tag{6.65}$$

when $0 < H_2 < H_1 < 1$, for $a$ and $b$ two strictly positive $C^1$ functions, and for $p$ a $C^\infty$ function such that $p(\xi) = 1$ if $|\xi| \le 1$ and $p(\xi) = 0$ if $|\xi| > 2$ (see [BEN 94]). However, a comparison carried out in [BEN 97a] between filtered white noises and elliptic Gaussian random processes makes it possible to obtain the result more easily. Let us make several comments on the symbols which we identify. The symbols of form (6.65) verify Hypothesis 6.3 ($H_1$). In particular, the function $p$ was introduced so that the elliptic inequality of order $H_1$ is satisfied at $\xi = 0$. In fact, (6.65) should be understood as the high-frequency ($|\xi| \to +\infty$) expansion in fractional powers of a general symbol. The identification of the symbol of an elliptic Gaussian random process $X$ comes from the comparison of $X$ with the filtered white noise:

$$Y_t = \int \frac{e^{-it\xi} - 1}{\sigma(t, \xi)}\, \widehat{W}^*(d\xi) \tag{6.66}$$

This explains why the order of the powers for an elliptic Gaussian random process is reversed compared to that of a filtered white noise. The identification results for elliptic Gaussian processes are summarized in the following theorem.

THEOREM 6.9.– If $X$ is an elliptic Gaussian random process whose symbol verifies (6.65) and:

$$\sup\left(0, \frac{3H_1 - 1}{2}\right) < H_2$$

then the estimators:

$$\tilde{H}_N = \frac12 \left( \log_2 \frac{V_{N/2}}{V_N} + 1 \right) \tag{6.67}$$

$$\tilde{J}_N(w) = \frac{N^{2\tilde{H}_N - 1}\, V_N(w)}{F(2\tilde{H}_N)} \tag{6.68}$$

for $w \in C^2[0, 1]$ with support included in $]0, 1[$, and:

$$\widehat{(H_2)}_N = \frac12 + \frac12 \log_2 \frac{V_{N^2/2}}{V_{N^2}} - \log_2 \frac{W_{N/2}}{W_N} \tag{6.69}$$

where $W_N$ is defined by (6.63), converge almost surely, when $N \to +\infty$, towards respectively:

$$H_1, \qquad J(w) = \int_0^1 \frac{w(t)}{a(t)}\, dt, \qquad H_2$$

It should be noted that, for elliptic Gaussian random processes, an additional condition is required for the identification of the parameter $H_2$, namely $(3H_1 - 1)/2 < H_2$. This hypothesis is not only technical; a similar hypothesis is found in [INO 76] for Markovian Gaussian fields of order $p$: only the monomials of highest degree of a polynomial symbol are identifiable. We can thus only hope, within our framework, to identify the principal part of the symbol $\sigma$.

6.5.2.3. Identification of MBM

To identify the multifractional function of a MBM, the generalized quadratic variations must be suitably localized. A pointwise estimator of $h$ obtained in this way is proposed in [BEN 98a]. We must insist, however, on the fact that we can prove the convergence of the estimators only for regular multifractional functions: in this section, we suppose that the following hypothesis is verified.

HYPOTHESIS 6.5.– The function $h$ is of class $C^1$.

This hypothesis is, of course, more restrictive than Hypothesis 6.4. Let us specify the principles of the localization of the generalized quadratic variations. A natural method consists of using the weight function:

$$w = \mathbf{1}_{[t_0, t_1]} \quad \text{for } 0 < t_0 < t_1 < 1$$

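We thus obtain a variation localized in the interval $[t_0, t_1]$, which we denote:

$$V_N(t_0, t_1) = V_N(w) \tag{6.70}$$

$$V_N(t_0, t_1) = \sum_{\{p \in \mathbb{Z},\ t_0 \le p/N \le t_1\}} \left[ X\!\left(\frac{p+2}{N}\right) - 2X\!\left(\frac{p+1}{N}\right) + X\!\left(\frac{p}{N}\right) \right]^2 \tag{6.71}$$

We can now define an estimator:

$$h_N(t_0, t_1) = \frac12 \left( \log_2 \frac{V_{N/2}(t_0, t_1)}{V_N(t_0, t_1)} + 1 \right)$$

which converges, when $N \to +\infty$, towards:

$$\inf\{h(s),\ s \in\, ]t_0, t_1[\} \quad \text{(a.s.)} \tag{6.72}$$

Indeed, it is the worst Hölderian regularity which dominates in this estimate. We deduce from this intermediate stage that we must reduce the size of the observation interval $[t_0, t_1]$ as $N$ increases if we want to estimate $h(t)$. Let:

$$V_{N,\varepsilon}(t) \overset{\text{def}}{=} V_N(t - \varepsilon, t + \varepsilon) \tag{6.73}$$

$$\hat{h}_{\varepsilon,N}(t) \overset{\text{def}}{=} \frac12 \left( \log_2 \frac{V_{N/2,\varepsilon}(t)}{V_{N,\varepsilon}(t)} + 1 \right) \tag{6.74}$$

and let us apply the general principles of section 6.5 to $V_{N,\varepsilon}(t)$:

$$V_{N,\varepsilon}(t) \approx \frac{1}{N} \sum_{p/N \in [t-\varepsilon, t+\varepsilon]} C^2\!\left(\frac{p}{N}, \omega\right) N^{1 - 2h(p/N)} \tag{6.75}$$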
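The localized variation and the resulting pointwise estimator can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the samples are assumed to be `x[p]` $= X(p/N)$ for $p = 0, \ldots, N$ with $N$ even, and the estimator is assumed to carry the same “$+1$” normalization as the global estimator $h_N(t_0, t_1)$.

```python
import numpy as np

def localized_variation(x, t, eps):
    """Sketch of V_{N,eps}(t): squared second-order increments
    restricted to the indices p with p/N in [t - eps, t + eps].

    Assumption: x[p] = X(p/N) for p = 0..N.
    """
    x = np.asarray(x, dtype=float)
    n = x.size - 1
    p = np.arange(n - 1)
    d2 = (x[2:] - 2.0 * x[1:-1] + x[:-2]) ** 2
    mask = (p / n >= t - eps) & (p / n <= t + eps)
    return float(np.sum(d2[mask]))

def local_exponent(x, t, eps):
    """Sketch of a pointwise estimator of h(t) built from V_{N,eps}(t)
    and V_{N/2,eps}(t); x[::2] provides the samples on the grid of step 2/N.
    """
    v_n = localized_variation(x, t, eps)
    v_half = localized_variation(x[::2], t, eps)
    return 0.5 * (np.log2(v_half / v_n) + 1.0)
```

In line with Theorem 6.10, one would typically call `local_exponent(x, t, n ** (-1.0 / 3.0))` with `n = len(x) - 1`, so that the window shrinks as $N$ grows.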

It is clear that the larger $\varepsilon$ is, the smaller the stochastic error, by a law of large numbers, since a great number of variables are added up; however, for a large $\varepsilon$, we introduce a significant distortion by having replaced:

$$N^{1 - 2h(t)} \quad \text{by} \quad N^{1 - 2h(p/N)}$$

The rate at which $\varepsilon$ must tend to 0 is given by the following theorem, taken from [BEN 98a].

THEOREM 6.10.– Let $X$ be a MBM with harmonizable representation:

$$B_h(t) = \frac{1}{v(h(t))} \int \frac{e^{-it\cdot\xi} - 1}{|\xi|^{\frac12 + h(t)}}\, W(d\xi)$$

associated with a multifractional function $h$ verifying Hypothesis 6.5. For $\varepsilon = N^{-\alpha}$ with $0 < \alpha < \frac12$ and $N \to \infty$:

$$\hat{h}_{\varepsilon,N}(t) \longrightarrow h(t) \quad \text{(a.s.)}$$

For $\varepsilon = N^{-1/3}$:

$$E\left(\hat{h}_{\varepsilon,N}(t) - h(t)\right)^2 = O\!\left(\log^2(N)\, N^{-2/3}\right)$$

In the preceding statement, we used the normalization $1/v(h(t))$, but the result is unchanged if this factor is replaced by any $C^1$ function of $t$. In addition, the choice $\varepsilon = N^{-1/3}$ makes the contributions of the stochastic error and of the distortion of the same order and thus, in a certain sense, asymptotically minimizes the upper bound obtained for the quadratic risk.

6.5.2.4. Identification of SFBMs

In the case of a SFBM, the multifractional function to estimate is a piecewise constant function $h$ given by (6.52), and we build an estimator of:

$$\Theta_0 = (a_1, \ldots, a_K;\ H_0, \ldots, H_K)$$

starting from the quadratic variation:

$$\tilde{V}_N(s, t) = \frac{1}{N} \sum_{s \le p/N \le t} \left[ Q_h\!\left(\frac{p+2}{N}\right) - 2Q_h\!\left(\frac{p+1}{N}\right) + Q_h\!\left(\frac{p}{N}\right) \right]^2 \tag{6.76}$$

The principles evoked for the MBM about the estimator $h_N(t_0, t_1)$ also apply here:

$$\lim_{N \to +\infty} \frac{1}{-2\log(N)} \log \tilde{V}_N(s, t) = \inf\{h(u),\ u \in\, ]s, t[\} \quad \text{(a.s.)} \tag{6.77}$$

We will denote:

$$f_N(s, t) \overset{\text{def}}{=} \frac{1}{-2\log(N)} \log \tilde{V}_N(s, t)$$

Following a technique of [BER 00] for rupture detection, we form the difference between the estimate $f_N$ on an interval of length $A > 0$ to the right of $t$ and the estimate on an interval of the same length to the left of $t$:

$$D_N(A, t) = f_N(t, t + A) - f_N(t - A, t) \tag{6.78}$$
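The quantities $f_N$ and $D_N$ can be sketched in Python as follows. The function names are hypothetical and the code assumes that the trajectory is stored as `x[p]` $= Q_h(p/N)$ for $p = 0, \ldots, N$, with the normalized variation of (6.76) computed over the indices $s \le p/N \le t$.

```python
import numpy as np

def f_n(x, s, t):
    """Sketch of f_N(s, t) = log(V~_N(s, t)) / (-2 log N).

    Assumption: x[p] = Q_h(p/N), p = 0..N, and V~_N(s, t) is the sum of
    squared second-order increments over s <= p/N <= t, divided by N.
    """
    x = np.asarray(x, dtype=float)
    n = x.size - 1
    p = np.arange(n - 1)
    d2 = (x[2:] - 2.0 * x[1:-1] + x[:-2]) ** 2
    mask = (p / n >= s) & (p / n <= t)
    v_tilde = np.sum(d2[mask]) / n
    return float(np.log(v_tilde) / (-2.0 * np.log(n)))

def d_n(x, a, t):
    """Sketch of D_N(A, t) = f_N(t, t + A) - f_N(t - A, t), as in (6.78)."""
    return f_n(x, t, t + a) - f_n(x, t - a, t)
```

For a trajectory whose regularity is the same on both sides of $t$, `d_n` should be close to zero; a value well away from zero suggests a jump of $h$ near $t$.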

Let us suppose as known $\nu_0 = \min_{i=1,\ldots,K-1} |a_{i+1} - a_i|$, the minimal distance between two jumps of $h$, as well as a lower bound $\eta_0$ of the absolute value of the magnitude of the jumps $\delta_i = H_i - H_{i-1}$. By taking $A < \nu_0$, we obtain the convergence of $D_N(A, t)$ towards:

$$D_\infty(A, t) = \sum_{i \text{ such that } \delta_i > 0} \delta_i\, \mathbf{1}_{[a_i, a_i + A[}(t) + \sum_{i \text{ such that } \delta_i < 0} \delta_i\, \mathbf{1}_{[a_i - A, a_i[}(t) \tag{6.79}$$
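To visualize the limit function $D_\infty(A, t)$, the following Python sketch evaluates it for a given step function. The function name and argument layout are illustrative assumptions: `jump_times` holds $(a_1, \ldots, a_K)$ and `hurst_values` holds $(H_0, \ldots, H_K)$.

```python
import numpy as np

def d_infinity(t, jump_times, hurst_values, a):
    """Sketch of D_infinity(A, t) in (6.79).

    Assumption: jump_times = (a_1, ..., a_K), hurst_values = (H_0, ..., H_K),
    so that delta_i = H_i - H_{i-1} is the jump of h at a_i.
    """
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    deltas = np.diff(hurst_values)
    for a_i, delta in zip(jump_times, deltas):
        if delta > 0:
            # Positive jump: pulse of height delta on [a_i, a_i + A[.
            out += delta * ((t >= a_i) & (t < a_i + a))
        elif delta < 0:
            # Negative jump: pulse of height delta on [a_i - A, a_i[.
            out += delta * ((t >= a_i - a) & (t < a_i))
    return out
```

For example, with jumps at 0.3 (from 0.4 up to 0.7) and at 0.7 (from 0.7 down to 0.5) and $A = 0.1$, the function is positive just after 0.3, negative just before 0.7 and zero elsewhere.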

Since $A < \nu_0$, the various ruptures appear separately in the function $D_\infty(A, t)$, each as a rectangular pulse of width $A$ on the right of the rupture time $a_i$ in the case of a positive jump (i.e., $\delta_i > 0$), or on the left of the rupture time $a_i$ in the case of a negative jump (i.e., $\delta_i < 0$). Consequently, for any threshold $\eta \in\, ]\eta_0/2, \eta_0[$ and any window size $A \le \nu_0$, we estimate the time of the first positive jump of $D_\infty(A, \cdot)$ by the first time $l/N$ such that $D_N(A, l/N) \ge \eta$, then the second by the same method, moving at least $A$ away from the first time found. More formally, we set:

$$\hat{\tau}_1^{(N)} = \frac{1}{N} \min\{l \in \mathbb{Z},\ D_N(A, l/N) \ge \eta\}$$

where $\hat{\tau}_1^{(N)} = +\infty$ when $D_N(A, l/N) < \eta$ for all $l \in \mathbb{Z}$. If $\hat{\tau}_\ell^{(N)} < +\infty$:

$$\hat{\tau}_{\ell+1}^{(N)} = \frac{1}{N} \min\{l \in \mathbb{Z},\ l/N \ge \hat{\tau}_\ell^{(N)} + A \text{ and } D_N(A, l/N) \ge \eta\}$$

and:

$$\hat{\varsigma}_1^{(N)} = \frac{1}{N} \max\{l \in \mathbb{Z},\ D_N(A, l/N) \le -\eta\}$$

where $\hat{\varsigma}_1^{(N)} = -\infty$ when $D_N(A, l/N) > -\eta$ for all $l \in \mathbb{Z}$. If $\hat{\varsigma}_m^{(N)} > -\infty$:

$$\hat{\varsigma}_{m+1}^{(N)} = \frac{1}{N} \max\{l \in \mathbb{Z},\ l/N \le \hat{\varsigma}_m^{(N)} - A \text{ and } D_N(A, l/N) \le -\eta\}$$
By uniting the two preceding families and then sorting them in ascending order, we obtain a family of estimators of the jump times of $h$: $(\hat{a}_1^{(N)}, \ldots, \hat{a}_{\kappa_N}^{(N)})$, and we estimate the values of $h$ between the jump times by setting $\hat{H}_0^{(N)} = f_N(\hat{a}_1^{(N)} - 10A,\ \hat{a}_1^{(N)} - 5A)$, $\hat{H}_\iota^{(N)} = f_N(\hat{a}_\iota^{(N)} + A/3,\ \hat{a}_{\iota+1}^{(N)} - A/3)$ for $\iota = 1, \ldots, \kappa_N - 1$ and $\hat{H}_{\kappa_N}^{(N)} = f_N(\hat{a}_{\kappa_N}^{(N)} + 5A,\ \hat{a}_{\kappa_N}^{(N)} + 10A)$. To finish, we build the estimator $\Theta_N = (\hat{a}_1^{(N)}, \ldots, \hat{a}_{\kappa_N}^{(N)};\ \hat{H}_0^{(N)}, \ldots, \hat{H}_{\kappa_N}^{(N)})$, whose consistency is established in [BEN 00].

THEOREM 6.11.– Let $Q_h$ be a step fractional Brownian motion whose multifractional function $h(\cdot)$ verifies (6.52). If, moreover, $A < \nu_0$ and $\eta \in\, ]\eta_0/2, \eta_0[$, then we have $\lim_{N \to +\infty} \Theta_N = \Theta_0$ a.s., with $\Theta_0 = (a_1, \ldots, a_K;\ H_0, \ldots, H_K)$.

6.6. Bibliography

[AYA 99] AYACHE A., LÉVY VÉHEL J., “Generalised multifractional Brownian motion: definition and preliminary results”, in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer-Verlag, p. 17–32, 1999.
[AYA 00] AYACHE A., COHEN S., LÉVY VÉHEL J., “The covariance structure of multifractional Brownian motion, with application to long range dependence”, in Proceedings of ICASSP (Istanbul, Turkey), 2000.
[AYA 04a] AYACHE A., BENASSI A., COHEN S., LÉVY VÉHEL J., “Regularity and identification of generalized multifractional Gaussian processes”, in Séminaire de Probabilités XXXVIII – Lecture Notes in Mathematics, Springer-Verlag Heidelberg, vol. 1857, p. 290–312, 2004.
[AYA 04b] AYACHE A., LÉVY VÉHEL J., “On the identification of the pointwise Hölder exponent of the generalized multifractional Brownian motion”, in Stoch. Proc. Appl., vol. 111, p. 119–156, 2004.
[BEN 94] BENASSI A., COHEN S., JAFFARD S., “Identification de processus Gaussiens elliptiques”, C. R. Acad. Sc. Paris, series, vol. 319, p. 877–880, 1994.
[BEN 97a] BENASSI A., COHEN S., ISTAS J., JAFFARD S., “Identification of elliptic Gaussian random processes”, in LÉVY VÉHEL J., TRICOT C. (Eds.), Fractals and Engineering, Springer-Verlag, p. 115–123, 1997.
[BEN 97b] BENASSI A., JAFFARD S., ROUX D., “Gaussian processes and pseudodifferential elliptic operators”, Revista Matemática Iberoamericana, vol. 13, no. 1, p. 19–89, 1997.
[BEN 98a] BENASSI A., COHEN S., ISTAS J., “Identifying the multifractional function of a Gaussian process”, Statistics and Probability Letters, vol. 39, p. 337–345, 1998.
[BEN 98b] BENASSI A., COHEN S., ISTAS J., JAFFARD S., “Identification of filtered white noises”, Stoch. Proc. Appl., vol. 75, p. 31–49, 1998.
[BEN 00] BENASSI A., BERTRAND P., COHEN S., ISTAS J., “Identification of the Hurst exponent of a step multifractional Brownian motion”, Statistical Inference for Stochastic Processes, vol. 3, p. 101–110, 2000.
[BER 00] BERTRAND P., “A local method for estimating change points: the hat-function”, Statistics, vol. 34, no. 3, p. 215–235, 2000.

[COH 99] COHEN S., “From self-similarity to local self-similarity: the estimation problem”, in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer-Verlag, p. 3–16, 1999.
[FRI 97] FRISCH U., Turbulence, Cambridge University Press, 1997.
[INO 76] INOUÉ K., “Equivalence of measures for some class of Gaussian random fields”, J. Multivariate Anal., vol. 6, p. 295–308, 1976.
[IST 96] ISTAS J., “Estimating the singularity function of a Gaussian process with applications”, Scand. J. Statist., vol. 23, no. 5, p. 581–596, 1996.
[KAH 85] KAHANE J.P., Some Random Series of Functions, Cambridge University Press, second edition, 1985.
[LEO 89] LEON J.R., ORTEGA J., “Weak convergence of different types of variation for biparametric Gaussian processes”, in Colloquia Math. Soc. J. Bolyai no. 57, 1989.
[LIF 95] LIFSHITS M.A., Gaussian Random Functions, Kluwer Academic Publishers, 1995.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, p. 422–437, 1968.
[MEY 90a] MEYER Y., Ondelettes et opérateurs, Hermann, Paris, vol. 1, 1990.
[MEY 90b] MEYER Y., Ondelettes et opérateurs, Hermann, Paris, vol. 2, 1990.
[NEV 68] NEVEU J., Processus aléatoires Gaussiens, Montreal University Press, SMS, 1968.
[PEL 96] PELTIER R.F., LÉVY VÉHEL J., “Multifractional Brownian motion: definition and preliminary results”, 1996 (available at http://www-syntim.inria.fr/fractales).
[SAM 94] SAMORODNITSKY G., TAQQU M.S., Stable Non-Gaussian Random Processes, Chapman & Hall, 1994.
[YAG 87] YAGLOM A.M., Correlation Theory of Stationary and Related Random Functions. Volume I: Basic Results, Springer, 1987.

Chapter 7

An Introduction to Fractional Calculus

7.1. Introduction

7.1.1. Motivations

We give some traditional examples of applications of fractional calculus and then briefly point out the theoretical references.

7.1.1.1. Fields of application

The modeling of certain physical phenomena, described as having long memory, can be carried out by introducing integro-differential terms with weakly singular kernels (i.e., locally integrable but not necessarily continuous, like $t^{\alpha-1}$ when $0 < \alpha < 1$) in the equations of the dynamics of materials. This is very frequent, for example, in linear viscoelasticity with long memory, where a fractional stress-strain dynamic relation can be proposed: see [BAG 86] for viscoelasticity; [KOE 84, KOE 86] for a somewhat more formal presentation; [BAG 91] for a rich and quite detailed example; [BAG 83a] for a modal analysis in the forced regime or [BAG 85] for a modal analysis in the transient regime; and finally [CAP 76] for a model which uses a partial differential equation with a fractional derivative in time. There are also applications to the modeling of polymers in chemistry [BAG 83b] and to the modeling of dynamics at the interface of fractal structures: see [LEM 90] for the applied physical aspect and [GIO 92] for the theoretical physical aspect.

Chapter written by Denis MATIGNON.

Moreover, fractional derivatives can appear naturally when a dynamic phenomenon is strongly conditioned by the geometry of the problem: a simple and very instructive example is presented in [TOR 84]. See in particular [CARP 97] for examples in continuum mechanics and [POD 99] for many applications in engineering sciences.

7.1.1.2. Theories

A detailed historical overview of the theory of fractional derivatives is given in [OLD 74]; moreover, this work is undoubtedly one of the first attempts to assemble scattered results. More recently, a theoretical synthesis was proposed in [MIL 93], where certain algebraic aspects of fractional differential equations of rational order are fully developed. In mathematics, the Russian work [SAM 87] is authoritative; it compiles a unique set of definitions and theories. Pseudo-differential operators are mentioned in Chapter 7 of [TAY 96], and the first article on the concept of diffusive representation was, as far as we know, section 5 of [STA 94]. During the last 10 years, a number of themes have developed: see, in particular, the book [MAT 98c] for a general theoretical framework [MON 98] and for several applications derived from it.

7.1.2. Problems

From a mathematical point of view, these integro-differential relations, or convolutions with locally integrable kernels ($L^1_{loc}$, i.e., absolutely integrable on any interval $[a, b]$), are not simple to treat. Analytically, the singular character of the kernel $t^{\alpha-1}$ (with $0 < \alpha < 1$) complicates the use of theorems based on the regularity of the kernel (in [DAUT 84a], for example, the kernels are always supposed to be continuous). Numerically, it is not simple to treat this singularity at the time origin (although this is a priori possible by carrying out an integration by parts, which artificially increases the order of derivation of the unknown function while yielding a convolution with a more regular kernel). In the theories mentioned in section 7.1.1.2, several problems appear.

First, the definition of fractional derivatives poses problems for orders higher than 1 (in particular, fractional derivatives do not commute, which is extremely awkward and, in addition, the composition of a fractional integration and a fractional derivative of the same order does not necessarily give the identity); this leads, in practice, to the use of rather strict calculations and a reintroduction a posteriori of the formal solution in the

starting equation, to check the coherence of the result. Second, the question of initial conditions for fractional differential equations is not truly solved: we are obliged to define zero or infinite initial values. Lastly, the true analytical nature of the solutions can be masked by closed-form solutions using a great number of special functions, which facilitates neither the characterization of important analytical properties of these solutions nor their numerical simulation.

The focus of our work is the theory of fractional differential equations (FDE): first, we clarify various definitions by using the framework of causal distributions (i.e., generalized functions whose support is the positive real axis) and by interpreting results on functions expandable in fractional power series of order $\alpha$ ($\alpha$-FPSE); second, we clarify problems related to fractional differential equations by formulating solutions in a compact general form; and third, we establish a strong link with diffusive representations of pseudo-differential operators (DR of PDO), a concept which is nearly unavoidable when derivation orders are arbitrary. Finally, we study the extension to several variables by treating a fractional partial differential equation (FPDE), which in fact constitutes a modal analysis of fractional order.

7.1.3. Outline

This chapter is composed of four distinct sections.

First, in section 7.2, we give definitions of the fundamental concepts necessary for the study and handling of the fractional formalism. We recall the definition of fractional integration in section 7.2.1. We show in section 7.2.2 that the inversion of this functional relation can be correctly defined within the framework of causal distributions and we examine the fundamental solutions directly connected to this operator. Lastly, we adopt a definition which is easier to handle, i.e., a “mild” fractional derivative, so as to be able to use fractional derivatives on regular causal functions. We examine in section 7.2.3 the eigenfunctions of this new operator and show its structural relationship with a generalization of Taylor expansions for functions which are not differentiable at the time origin, like the functions expandable in fractional power series.

Then, in section 7.3, we turn to fractional differential equations. These are linear relations in a fractional derivative operator and its successive powers; it appears naturally that rational orders play an important role, since certain powers are directly related to the usual derivatives of integer order. We thus examine fractional differential equations in the context of causal distributions (in section 7.3.2) and of functions expandable in fractional power series (in section 7.3.3). We then tackle, in section 7.3.4, the asymptotic behavior of the fundamental solutions of these fractional differential equations, i.e., the divergence in modulus, the pseudo-periodicity or the convergence towards zero of the eigenfunctions

of fractional derivatives (which play a role similar to that of the exponential function in the case of integer order). Finally, in section 7.3.5, we examine a class of controlled-and-observed linear dynamic systems of fractional order and address some typical issues of automatic control.

Then, in section 7.4, we consider fractional differential equations in one variable whose orders of derivation are not commensurate: there are no simple algebraic tools at our disposal in the frequency domain, and the work carried out in the case of commensurate orders no longer applies. We further examine the strong link which exists with diffusive representations of pseudo-differential operators. We give some simple ideas and elementary properties and then we present a general result of decomposition of the solutions of fractional differential equations into a localized (or integer-order) part and a diffusive part.

Finally, in section 7.5, we show that the preceding theory in the time variable (which appeals, in the commensurate case, to polynomials and rational fractions in the frequency domain) can be extended to several variables in the case of fractional partial differential equations (we obtain more general meromorphic functions which are not rational fractions). To this end, we treat one example in full: that of the wave equation with viscothermal losses at the walls of acoustic pipes, i.e. a partial differential equation which involves a time derivative of order three halves.

Throughout this chapter, we treat the half-order as an example, in order to make our purpose clear. This chapter has been inspired by several articles, particularly [AUD 00, MAT 95a]. This personal work is also the fruit of collaborations with various researchers including d'Andréa-Novel, Audounet, Dauphin, Heleschewitz and Montseny. More recently, new co-authors have helped enlarge the perspective of our work: let us mention Hélie, Haddar, Prieur and Zwart.

7.2. Definitions

7.2.1. Fractional integration

The primitive of an integrable function $f$ which vanishes at the initial time $t = 0$, iterated an integer number $n$ of times and denoted $I^n f$, is nothing other than the convolution of $f$ with the polynomial kernel $Y_n(t) = t_+^{n-1}/(n-1)!$:

$$I^n f(t) \stackrel{\text{def}}{=} \int_0^t d\tau_1 \int_0^{\tau_1} d\tau_2 \cdots \int_0^{\tau_{n-1}} f(\tau_n)\, d\tau_n = (Y_n \star f)(t).$$

By extension, we define [MIL 93] the primitive $I^\alpha f$ of any order $\alpha > 0$ by using Euler's function Γ, which extends the factorial.

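DEFINITION 7.1.– The primitive of order $\alpha > 0$ of a causal, locally integrable function $f$ is given by:

$$I^\alpha f(t) \stackrel{\text{def}}{=} (Y_\alpha \star f)(t) \quad \text{where we have set } Y_\alpha(t) = \frac{t_+^{\alpha-1}}{\Gamma(\alpha)}. \tag{7.1}$$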

PROPOSITION 7.1.– The property $Y_\alpha \star Y_\beta = Y_{\alpha+\beta}$ makes it possible to write the fundamental composition law $I^\alpha \circ I^\beta = I^{\alpha+\beta}$ for $\alpha > 0$ and $\beta > 0$.

Proof. To establish the property, it is enough to check that the exponents coincide, the numerical coefficient coming from the properties of function Γ:

$$(Y_\alpha \star Y_\beta)(t) \propto \int_0^t (t-\tau)^{\alpha-1}\,\tau^{\beta-1}\, d\tau = t^{\alpha+\beta-1} \int_0^1 (1-x)^{\alpha-1}\, x^{\beta-1}\, dx \propto Y_{\alpha+\beta}(t).$$

To establish the fundamental composition law, it is enough to use the associativity of the convolution of functions, whence:

$$I^{\alpha+\beta} f \stackrel{\text{def}}{=} Y_{\alpha+\beta} \star f = (Y_\alpha \star Y_\beta) \star f = Y_\alpha \star (Y_\beta \star f) = I^\alpha\{I^\beta f\}.$$

PROPOSITION 7.2.– The Laplace transform of $Y_\alpha$ for $\alpha > 0$ is:

$$\mathcal{L}[Y_\alpha](s) = s^{-\alpha} \quad \text{for } \Re e(s) > 0 \tag{7.2}$$

i.e., with the right-half complex plane as a convergence strip.

Proof. A direct calculation for $s > 0$ provides:

$$\mathcal{L}[Y_\alpha](s) \stackrel{\text{def}}{=} \int_0^{+\infty} \frac{t^{\alpha-1}}{\Gamma(\alpha)}\, e^{-st}\, dt = s^{-\alpha}\, \frac{1}{\Gamma(\alpha)} \int_0^{+\infty} x^{\alpha-1} e^{-x}\, dx = s^{-\alpha}$$

according to the definition of Γ; the result is continued to $\Re e(s) > 0$ by analyticity.

NOTE 7.1.– We see in particular that the delicate meaning given to a fractional power of the complex variable $s$ is perfectly defined: $s \mapsto s^\alpha$ indicates the analytical continuation of the power function on positive reals. It is the principal determination of the multiform function $s \mapsto s^\alpha$; it has Hermitian symmetry.
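The kernel identity behind Proposition 7.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`y`, `beta_midpoint`, `convolve_kernels`) are hypothetical, the Beta-type integral from the proof is estimated with a plain midpoint rule, and the orders are taken greater than 1 so that the integrand has no endpoint singularity.

```python
import math


def y(alpha, t):
    """Kernel Y_alpha(t) = t_+^(alpha-1) / Gamma(alpha), for alpha > 0."""
    return t ** (alpha - 1) / math.gamma(alpha) if t > 0 else 0.0


def beta_midpoint(alpha, beta, n=20000):
    """Midpoint estimate of the integral of (1-x)^(alpha-1) x^(beta-1) on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (1.0 - x) ** (alpha - 1) * x ** (beta - 1)
    return h * total


def convolve_kernels(alpha, beta, t):
    """(Y_alpha * Y_beta)(t), using the change of variable tau = t x of the proof."""
    scale = t ** (alpha + beta - 1) / (math.gamma(alpha) * math.gamma(beta))
    return scale * beta_midpoint(alpha, beta)


# Illustrative values only (assumption: alpha, beta > 1 keeps the integrand bounded).
alpha, beta, t = 1.5, 2.5, 0.7
lhs = convolve_kernels(alpha, beta, t)   # numerical convolution
rhs = y(alpha + beta, t)                 # closed form Y_{alpha+beta}(t)
```

The two values agree up to the quadrature error, which reflects the fact that the Beta integral equals Γ(α)Γ(β)/Γ(α+β).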

PROPOSITION 7.3.– For a causal function $f$ which has a Laplace transform in $\Re e(s) > a_f$, we have:

$$\mathcal{L}[I^\alpha f](s) = s^{-\alpha}\, \mathcal{L}[f](s) \quad \text{for } \Re e(s) > \max(0, a_f). \tag{7.3}$$

Proof. This follows from Proposition 7.2 and from the fact that the Laplace transform turns a convolution into a product of Laplace transforms. In particular, Proposition 7.1 can be proved very simply by noting that $s^{-\alpha-\beta} = s^{-\alpha}\, s^{-\beta}$ for $\Re e(s) > 0$.
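For α = 1/2 the kernel of Definition 7.1 is $Y_{1/2}(\tau) = 1/\sqrt{\pi\tau}$, and the half-order primitive can be evaluated by ordinary quadrature once the integrable singularity is removed. The following is a minimal sketch, not a reference implementation: the function name `half_integral` and the midpoint rule are choices made here, and the substitution τ = u² is an assumption of the sketch rather than something stated in the text.

```python
import math


def half_integral(f, t, n=2000):
    """Approximate I^{1/2} f(t) = int_0^t f(t - tau) / sqrt(pi * tau) dtau.

    With tau = u**2 the singular kernel disappears:
    I^{1/2} f(t) = (2 / sqrt(pi)) * int_0^{sqrt(t)} f(t - u**2) du.
    """
    if t <= 0:
        return 0.0
    upper = math.sqrt(t)
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h          # midpoint of the i-th sub-interval
        total += f(t - u * u)
    return 2.0 / math.sqrt(math.pi) * h * total


def y(alpha, t):
    """Kernel Y_alpha(t) = t_+^(alpha-1) / Gamma(alpha)."""
    return t ** (alpha - 1) / math.gamma(alpha) if t > 0 else 0.0


t = 1.3
const_case = half_integral(lambda s: 1.0, t)   # f = Y_1, so the result should be Y_{3/2}(t)
ramp_case = half_integral(lambda s: s, t)      # f = Y_2, so the result should be Y_{5/2}(t)
```

Both test functions are themselves kernels $Y_n$, so Proposition 7.1 predicts the results $Y_{3/2}(t)$ and $Y_{5/2}(t)$ respectively.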

EXAMPLE 7.1.– For $f$ locally integrable, i.e. $f \in L^1_{loc}$, we thus obtain:

$$I^{1/2} f(t) = \int_0^t \frac{1}{\sqrt{\pi\tau}}\, f(t-\tau)\, d\tau.$$

7.2.2. Fractional derivatives within the framework of causal distributions

7.2.2.1. Motivation

The idea of fractional derivatives of causal functions (or signals) is to obtain an inverse formula to that of fractional integration defined by (7.1), i.e.:

$$f = I^\alpha (D^\alpha f)$$

This is a rather delicate problem; it can be solved by calling upon the theory of Volterra integral equations (for example, see [KOE 84]). However, one of the major difficulties is the composition law, or law of exponents, in particular because fractional derivatives and integrals do not always commute, which poses delicate practical problems. That is why we propose to carry out the inversion of (7.1) within the more general framework of causal distributions (i.e., $\mathcal{D}'_+$, the distributions whose support is the positive real axis of the time variable), while referring to [SCH 65] in particular, even if it means returning later, in section 7.2.3, to an interpretation in terms of internal operation in a class of particular functions.

7.2.2.1.1. Passage to the distributions

By following Definition 7.1, we naturally set $I^0 f \stackrel{\text{def}}{=} f$, which gives, according to (7.1), $f = Y_0 \star f$. It is clear that no locally integrable function $Y_0$ can be a solution of the preceding convolution equation; on the other hand, the Dirac distribution is the neutral element of the convolution of distributions [SCH 65]. Hence, necessarily:

$$Y_0 \stackrel{\text{def}}{=} \delta \tag{7.4}$$

and the convolution in equation (7.1) is to be taken in the sense of distributions.

7.2.2.1.2. Framework of causal distributions

We could place ourselves within the general framework of distributions, but convolution (which is the basic functional relation for invariant linear systems) is not, in general, associative. When the supports are limited from below (usually the case when we are interested in causal signals), we obtain the property known as convolutive supports, which makes the convolution associative.

Therefore, in $\mathcal{D}'_+$, which is a convolution algebra, the convolution product is associative, and the convolutive inverse of a distribution, if it exists, is unique (see lesson 32 of [GAS 90]), which allows a direct use of the impulse response $h$ of a causal linear system. Indeed, let us consider the general convolution equation in the unknown $y \in \mathcal{D}'_+$:

$$P \star y = x \tag{7.5}$$

where $P \in \mathcal{D}'_+$ represents the system and $x \in \mathcal{D}'_+$ the known causal input; let $h$ be the impulse response of the system, defined by:

$$P \star h = \delta.$$

Then, $y = h \star x$ is the solution of equation (7.5); indeed:

$$P \star y = P \star (h \star x) = (P \star h) \star x = \delta \star x = x$$

thanks to the associativity of the convolution of causal distributions. We thus follow [GUE 72] to define fractional derivatives $D^\alpha$.

DEFINITION 7.2.– The derivative in the sense of causal distributions of $f \in \mathcal{D}'_+$ is:

$$D^\alpha f \stackrel{\text{def}}{=} Y_{-\alpha} \star f \quad \text{where we have } Y_{-\alpha} \star Y_\alpha = \delta \tag{7.6}$$

i.e., $Y_{-\alpha}$ is the convolutive inverse of $Y_\alpha$ in $\mathcal{D}'_+$.

At this stage, the problem is thus to identify the causal distribution $Y_{-\alpha}$, which we know cannot be a function belonging to $L^1_{loc}$. Let us give its characterization by Laplace transform.

PROPOSITION 7.4.– The Laplace transform of $Y_{-\alpha}$ for $\alpha > 0$ is:

$$\mathcal{L}[Y_{-\alpha}](s) = s^{\alpha} \quad \text{for } \Re e(s) > 0 \tag{7.7}$$

i.e., with the right-half complex plane as the convergence strip.

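Proof. We first use the fact that, within the framework of causal distributions, the Laplace transform of a convolution product is the product of the Laplace transforms. We apply it to the definition of $Y_{-\alpha}$ given by (7.6), taking into account Proposition 7.2 and $\mathcal{L}[\delta](s) = 1$, i.e.:

$$s^{-\alpha}\, \mathcal{L}[Y_{-\alpha}](s) = 1 \quad \forall s,\ \Re e(s) > 0$$

which proves, on the one hand, the existence and, on the other hand, the stated result.

NOTE 7.2.– The behavior at infinity of the Laplace transform shows that $Y_{-\alpha}$ is less regular the larger $\alpha$ is.

PROPOSITION 7.5.– The property $Y_{-\alpha} \star Y_{-\beta} = Y_{-\alpha-\beta}$ makes it possible to write the fundamental composition law $D^\alpha \circ D^\beta = D^{\alpha+\beta}$ for $\alpha > 0$ and $\beta > 0$.

PROPOSITION 7.6.– For a causal distribution $f$ which has a Laplace transform in $\Re e(s) > a_f$, we have:

$$\mathcal{L}[D^\alpha f](s) = s^{\alpha}\, \mathcal{L}[f](s) \quad \text{for } \Re e(s) > \max(0, a_f). \tag{7.8}$$

Finally, we obtain the following fundamental result.

PROPOSITION 7.7.– For $\alpha$ and $\beta$ two real numbers, we have:
– the property $Y_\alpha \star Y_\beta = Y_{\alpha+\beta}$;
– the fundamental composition law $I^\alpha \circ I^\beta = I^{\alpha+\beta}$;
by taking as notation convention $I^\alpha = D^{-\alpha}$ when $\alpha < 0$.

EXAMPLE 7.2.– We seek to clarify the half-order derivation: we first calculate the distribution $Y_{-1/2}$, then we calculate $D^{1/2}[f Y_1]$ where $f$ is a regular function. From the point of view of distributions, we can write $Y_{-1/2} = D^1 Y_{1/2}$, where $D^1$ is the derivative in the sense of distributions; that is, taking a test function $\varphi \in C_0^\infty$:

$$\begin{aligned}
\langle Y_{-1/2}, \varphi \rangle = \langle D^1 Y_{1/2}, \varphi \rangle = -\langle Y_{1/2}, \varphi' \rangle
&= -\frac{1}{\Gamma(1/2)} \int_0^\infty \frac{1}{\sqrt{t}}\, \varphi'(t)\, dt \\
&= -\frac{1}{\Gamma(1/2)} \lim_{\varepsilon \to 0} \left\{ \left[ \frac{1}{\sqrt{t}}\, \varphi(t) \right]_\varepsilon^\infty + \frac{1}{2} \int_\varepsilon^\infty \frac{1}{t^{3/2}}\, \varphi(t)\, dt \right\} \\
&= -\frac{1}{2}\, \frac{1}{\Gamma(1/2)} \lim_{\varepsilon \to 0} \left\{ \int_\varepsilon^\infty \frac{1}{t^{3/2}}\, \varphi(t)\, dt - \frac{2}{\sqrt{\varepsilon}}\, \varphi(\varepsilon) \right\} \\
&= \left\langle \frac{1}{\Gamma(-1/2)}\, \mathrm{pf}(t_+^{-3/2}),\ \varphi \right\rangle
\end{aligned}$$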

where pf indicates the finite part, in the sense of Hadamard, of a divergent integral. We thus obtain the result, which is not very easy to handle in practice:

$$Y_{-1/2} = \frac{\mathrm{pf}(t_+^{-3/2})}{\Gamma(-1/2)}.$$

Let us now calculate the half-order derivative of a causal distribution $f Y_1$, where $Y_1$ is the Heaviside distribution and $f \in C^1$. We have $D^1[f Y_1] = f' Y_1 + f(0)\delta$, whence, by taking into account $D^{1/2} = I^{1/2} \circ D^1$:

$$\begin{aligned}
D^{1/2}[f Y_1] \stackrel{\text{def}}{=} Y_{-1/2} \star [f Y_1] = Y_{1/2} \star D^1[f Y_1]
&= Y_{1/2} \star [f' Y_1] + f(0)\, Y_{1/2} \\
&= \int_0^t \frac{1}{\sqrt{\pi\tau}}\, f'(t-\tau)\, d\tau + f(0)\, \frac{1}{\sqrt{\pi t}}
\end{aligned}$$

where two terms appear: the first is a convolution of $L^1_{loc}$ functions and is a regular term, i.e., continuous at $t = 0^+$; the second is a function which diverges at $t = 0^+$, while remaining $L^1_{loc}$. Moreover, the preceding formulation remains valid if we have $f \in C^0$ and $f' \in L^1_{loc}$: for example, for $t \mapsto Y_{3/2}(t) \propto \sqrt{t}$, for which it is easy to check that $D^{1/2} Y_{3/2} = Y_1$, in other words the constant 1 for $t > 0$.

PROPOSITION 7.8.– In general, for $f \in C^0$ such that $f' \in L^1_{loc}$ and $0 < \alpha < 1$:

$$D^\alpha [f Y_1] = Y_{1-\alpha} \star [f' Y_1] + f(0)\, Y_{1-\alpha}.$$

7.2.2.2. Fundamental solutions

We defined the operator $D^\alpha$ in the space $\mathcal{D}'_+$ of causal distributions. Let us now seek the fundamental solution of the operator $D^\alpha - \lambda$.¹

DEFINITION 7.3.– The quantity $\mathcal{E}_\alpha(\lambda, t)$ is the fundamental solution of $D^\alpha - \lambda$ for the complex value $\lambda$; it fulfills by definition:

$$D^\alpha \mathcal{E}_\alpha(\lambda, t) = \lambda\, \mathcal{E}_\alpha(\lambda, t) + \delta. \tag{7.9}$$

PROPOSITION 7.9.– The quantity $\mathcal{E}_\alpha(\lambda, t)$ is given by:

$$\mathcal{E}_\alpha(\lambda, t) = \mathcal{L}^{-1}\!\left[ (s^\alpha - \lambda)^{-1},\ \Re e(s) > a_\lambda \right] = \sum_{k=0}^{\infty} \lambda^k\, Y_{(1+k)\alpha}(t). \tag{7.10}$$

1. It is the extension of the property $D^1[e^{\lambda t} Y_1(t)] = \lambda e^{\lambda t} Y_1(t) + \delta$ of the integer-order case.

Proof. Let us take the Laplace transform of (7.9); it reads:

$$(s^\alpha - \lambda)\, \mathcal{L}[\mathcal{E}_\alpha(\lambda, t)](s) = 1 \quad \text{for } \Re e(s) > 0$$

whence, for $\Re e(s) > a_\lambda$:

$$\begin{aligned}
\mathcal{L}[\mathcal{E}_\alpha(\lambda, t)](s) = (s^\alpha - \lambda)^{-1} &= s^{-\alpha} (1 - \lambda s^{-\alpha})^{-1} \\
&= s^{-\alpha} \sum_{k=0}^{+\infty} (\lambda s^{-\alpha})^k \quad \text{for } |s| > |\lambda|^{1/\alpha} \\
&= \sum_{k=0}^{+\infty} \lambda^k\, s^{-(1+k)\alpha}.
\end{aligned}$$

By taking the inverse Laplace transform term by term (see [KOL 69]) and by using Proposition 7.2, we obtain the stated result in the time domain (the series of functions (7.10) is normally convergent on every compact subset).

EXAMPLE 7.3.– Let us examine the particular cases of integer and half-integer orders. On the one hand, for $\alpha = 1$, we obtain the causal exponential $\mathcal{E}_1(\lambda, t) = e^{\lambda t}\, Y_1(t)$ as fundamental solution of the operator $D^1 - \lambda$ within the framework of causal distributions. On the other hand, for $\alpha = \frac{1}{2}$, we obtain:

$$\mathcal{E}_{1/2}(\lambda, t) = \sum_{k=0}^{+\infty} \lambda^k\, Y_{\frac{1+k}{2}}(t) = \sum_{k=0}^{+\infty} \lambda^k\, \frac{t^{\frac{k-1}{2}}}{\Gamma\!\left(\frac{k+1}{2}\right)} = Y_{1/2}(t) + \lambda \sum_{k=0}^{+\infty} \frac{(\lambda\sqrt{t})^k}{\Gamma\!\left(1 + \frac{k}{2}\right)}$$

which is the sum of an $L^1_{loc}$ function (i.e. $Y_{1/2}$) and of a power series in the variable $\lambda\sqrt{t}$, which is thus a continuous function.

7.2.3. Mild fractional derivatives, in the Caputo sense

7.2.3.1. Motivation

For $0 < \alpha < 1$, we saw, according to Proposition 7.8, that $D^\alpha f$ was not continuous at $t = 0^+$ and that, even when we have $f \in C^1$, which might at first seem slightly
paradoxical; it would be preferable for $D^\alpha f$ to be defined, in a certain sense, between $f$ and $f'$. Moreover, we have just seen that the fundamental solutions $\mathcal{E}_\alpha(\lambda, t)$ are not continuous at the origin $t = 0^+$; from the analytical point of view, this is likely to be awkward when initial values are given in a fractional differential equation. For $\alpha > 1$, the analytical situation worsens, since the objects handled very rapidly become distributions which move away from regular functions; for example:

$$Y_{-n} = \delta^{(n)} \quad \text{for } n \text{ a natural integer.} \tag{7.11}$$

These considerations justify the description "mild" applied to the fractional derivatives $d^\alpha$, which we now define.

7.2.3.2. Definition

We are naturally led to extract from the preceding definitions the more regular, or milder, parts, following the example introduced in [BAG 91]. The definition we propose coincides with that given by Caputo in [CAP 76].

DEFINITION 7.4.– For a causal function $f$, continuous from the right at $t = 0$:

$$d^\alpha f \stackrel{\text{def}}{=} D^\alpha f - f(0^+)\, Y_{1-\alpha}. \tag{7.12}$$

In particular, if $f' \in L^1_{loc}$, then $d^\alpha f = Y_{1-\alpha} \star [f' Y_1]$. For $0 < \alpha < 1$, to some extent, we extract from $f$ (continuous but non-differentiable) an intermediate degree of regularity (connected to the Hölder exponent of $f$ at 0). It appears that $d^\alpha f$ can be a continuous function, for example when $f \in C^0$ and $f' \in L^1_{loc}$, which was not the case for the fractional derivative in the sense of distributions $D^\alpha f$. Moreover, we can note that, in the case of integer order, the second derivative of a function $f$ is defined as the derivative of $f'$, i.e., exactly as the iteration of the derivation operator; in other words, $d^2 \stackrel{\text{def}}{=} (d^1)^{\circ 2}$. In view of these remarks, we propose the following definition.

DEFINITION 7.5.– For $0 < \alpha \le 1$, we say that $f$ is of class $C_\alpha^n$ if all the $n$ sequentially mild derivatives of order $\alpha$ of $f$ exist and are continuous, even at $t = 0$, i.e.: $(d^\alpha)^{\circ k} f \in C^0$ for $0 \le k \le n$.

The idea of sequentiality is introduced in a completely formal manner in Chapter 6 of [MIL 93], more as a curiosity than as something fundamentally coherent. Moreover, it is not the same $d^\alpha$ which is used there, but a definition which coincides with $D^\alpha$ for certain classes of functions, for $0 < \alpha < 1$. However, one of the inherent difficulties in the definition used is that the fundamental composition law is lost, whereas $D^{n\alpha} = (D^\alpha)^{\circ n}$ is obtained immediately according to Proposition 7.5.
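The formula $d^\alpha f = Y_{1-\alpha} \star [f' Y_1]$ given after Definition 7.4 lends itself to a simple numerical evaluation. The sketch below is an illustration only: the name `mild_derivative`, the midpoint rule and the change of variable $\tau = u^{1/(1-\alpha)}$ (which absorbs the singular kernel) are assumptions of this example, and the derivative $f'$ must be supplied by the caller.

```python
import math


def mild_derivative(df, alpha, t, n=4000):
    """Approximate d^alpha f(t) = (Y_{1-alpha} * f')(t) for 0 < alpha < 1.

    `df` is the ordinary derivative f'. With tau = u**p and p = 1/(1-alpha):
    d^alpha f(t) = (1 / Gamma(2 - alpha)) * int_0^{t**(1-alpha)} f'(t - u**p) du.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if t <= 0:
        return 0.0
    p = 1.0 / (1.0 - alpha)
    upper = t ** (1.0 - alpha)
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h          # midpoint of the i-th sub-interval
        total += df(t - u ** p)
    return h * total / math.gamma(2.0 - alpha)


def y(alpha, t):
    """Kernel Y_alpha(t) = t_+^(alpha-1) / Gamma(alpha)."""
    return t ** (alpha - 1) / math.gamma(alpha) if t > 0 else 0.0


t = 0.8
half_of_ramp = mild_derivative(lambda s: 1.0, 0.5, t)        # f(t) = t = Y_2(t)
third_of_square = mild_derivative(lambda s: 2.0 * s, 0.3, t)  # f(t) = t**2 = 2 Y_3(t)
```

Since $f(0^+) = 0$ in both cases, the composition law predicts $d^{1/2} Y_2 = Y_{3/2}$ and $d^{0.3}(2 Y_3) = 2\, Y_{2.7}$, which the quadrature reproduces to within its discretization error.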

EXAMPLE 7.4.– For $\alpha = \frac{1}{2}$, let us apply $D^{1/2}$ successively to the causal function $f$ given by:

$$f = b_0 + b_1 \sqrt{t} + b_2\, t + b_3\, t^{3/2} + b_4\, t^2.$$

Let us reformulate this expansion on the basis of the $Y_{k/2}$; it becomes:

$$\begin{aligned}
f &= a_0 Y_1 + a_1 Y_{3/2} + a_2 Y_2 + a_3 Y_{5/2} + a_4 Y_3 \\
D^{1/2} f &= a_0 Y_{1/2} + \underbrace{a_1 Y_1 + a_2 Y_{3/2} + a_3 Y_2 + a_4 Y_{5/2}}_{d^{1/2} f} \\
D^{1} f &= a_0 Y_0 + \underbrace{a_1 Y_{1/2} + \underbrace{a_2 Y_1 + a_3 Y_{3/2} + a_4 Y_2}_{(d^{1/2})^2 f}}_{d^1 f} \\
D^{3/2} f &= a_0 Y_{-1/2} + a_1 Y_0 + a_2 Y_{1/2} + \underbrace{a_3 Y_1 + a_4 Y_{3/2}}_{(d^{1/2})^3 f} \\
D^{2} f &= a_0 Y_{-1} + a_1 Y_{-1/2} + a_2 Y_0 + \underbrace{a_3 Y_{1/2} + \underbrace{a_4 Y_1}_{(d^{1/2})^4 f}}_{(d^1)^2 f \ \text{iff}\ a_1 = 0}
\end{aligned}$$

The interest of the operator $d^{1/2}$ and its successive powers (denoted from now on by $(d^{1/2})^k$ instead of $(d^{1/2})^{\circ k}$ to lighten the notation) is manifest here. Indeed, with the choice of $f$ considered, the $(d^{1/2})^k f$ are continuous functions at $t = 0^+$; according to Definition 7.5, we see that $f \in C^4_{1/2}$, and even $f \in C^\infty_{1/2}$ since we have $(d^{1/2})^5 f \equiv 0$! Moreover, we obtain the property $a_k = [(d^{1/2})^k f](t = 0^+)$ and thus the following formula:

$$f = \sum_{k=0}^{4} a_k\, Y_{1+\frac{k}{2}} \quad \text{with } a_k = [(d^{1/2})^k f](t = 0^+).$$

This is a kind of fractional Taylor expansion of the causal function $f$, in the vicinity of 0, which we will generalize in section 7.2.3.4.

7.2.3.3. Mittag-Leffler eigenfunctions

We defined the operator $d^\alpha$ and noted that it acts internally in the class of $C_\alpha^\infty$ functions: we can naturally seek the eigenfunctions of this operator in this class of functions.

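DEFINITION 7.6.– For $0 < \alpha \le 1$, $E_\alpha(\lambda, t)$ is the eigenfunction of $d^\alpha$ for the complex eigenvalue $\lambda$, initialized at 1; it fulfills, by definition:

$$\begin{cases} d^\alpha E_\alpha(\lambda, t) = \lambda\, E_\alpha(\lambda, t), \\ E_\alpha(\lambda, 0^+) = 1. \end{cases} \tag{7.13}$$

PROPOSITION 7.10.– The quantity $E_\alpha(\lambda, t)$ is given by:

$$E_\alpha(\lambda, t) = I^{1-\alpha}\, \mathcal{E}_\alpha(\lambda, t) = \mathcal{L}^{-1}\!\left[ s^{\alpha-1} (s^\alpha - \lambda)^{-1},\ \Re e(s) > a_\lambda \right] = \sum_{k=0}^{\infty} \lambda^k\, Y_{1+\alpha k}(t) = E_\alpha(\lambda t_+^\alpha) \tag{7.14}$$

where $E_\alpha(z)$ is the Mittag-Leffler monogenic function defined by the power series:

$$E_\alpha(z) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \frac{z^k}{\Gamma(1 + \alpha k)}. \tag{7.15}$$

Proof. It is enough to express (7.13) by using Definition 7.4 of $d^\alpha$, i.e.:

$$D^\alpha E_\alpha(\lambda, t) = \lambda\, E_\alpha(\lambda, t) + 1 \cdot Y_{1-\alpha}$$

whence, by Definition 7.3 of $\mathcal{E}_\alpha(\lambda, t)$ as the fundamental solution of the operator $(D^\alpha - \lambda)$, we obtain as the solution of the preceding equation with right-hand side the a priori causal distribution:

$$\begin{aligned}
E_\alpha(\lambda, t) &= \left( Y_{1-\alpha}(\cdot) \star \mathcal{E}_\alpha(\lambda, \cdot) \right)(t) \\
&= \left( Y_{1-\alpha} \star \sum_{k=0}^{\infty} \lambda^k\, Y_{(1+k)\alpha} \right)(t) \\
&= \sum_{k=0}^{\infty} \lambda^k\, Y_{1+\alpha k}(t) \\
&= \sum_{k=0}^{\infty} \frac{\lambda^k\, t_+^{\alpha k}}{\Gamma(1 + \alpha k)} \\
&= E_\alpha(z = \lambda t_+^\alpha).
\end{aligned}$$

The result sought among the causal distributions is thus, as stated, a continuous function directly connected to the Mittag-Leffler monogenic functions [MIT 04].

EXAMPLE 7.5.– Let us examine the particular cases of the integer and half-integer orders.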

On the one hand, for $\alpha = 1$, we obtain the causal exponential $E_1(\lambda, t) = e^{\lambda t}\, Y_1(t)$ as eigenfunction of the usual derivation $d^1$ (which indeed belongs to the class of $C_1^\infty$ functions). On the other hand, for $\alpha = \frac{1}{2}$, we obtain:

$$E_{1/2}(\lambda, t) = \sum_{k=0}^{+\infty} \lambda^k\, Y_{1+\frac{k}{2}}(t) = \sum_{k=0}^{+\infty} \frac{(\lambda\sqrt{t})^k}{\Gamma\!\left(1 + \frac{k}{2}\right)} = \exp(\lambda^2 t)\, \left[ 1 + \mathrm{erf}(\lambda\sqrt{t}) \right]$$

where erf is the error function (to be evaluated in the whole complex plane).

7.2.3.4. Fractional power series expansions of order α (α-FPSE)

The series expansion of the functions $E_\alpha(\lambda, t)$ highlighted in Proposition 7.10 naturally suggests the following definition.

DEFINITION 7.7.– For $0 < \alpha \le 1$, the sequence $(a_k)_{k \ge 0}$ of complex numbers makes it possible to define the formal series:

$$f(t) = \sum_{k=0}^{\infty} a_k\, Y_{1+\alpha k}(t) \tag{7.16}$$

which takes an analytical meaning of fractional power series expansion of order α (α-FPSE) as soon as, for example, the $|a_k|$ are bounded from above by a geometric sequence; the series of functions then converges uniformly on every compact subset of $[0, +\infty[$.

PROPOSITION 7.11.– Any function expandable in fractional power series of order α is of class $C_\alpha^\infty$; and it fulfills, in particular:

$$a_k = [(d^\alpha)^k f](0^+) \quad \text{for } k \ge 0. \tag{7.17}$$

Proof. The proof is similar to the calculation in the example studied in section 7.2.3.2 for a series comprising a finite number of terms. There is no problem of commutation between the operator $d^\alpha$ and the infinite summation, since the series of functions of class $C_\alpha^\infty$ converges uniformly on every compact subset: in other words, term-by-term derivation $d^\alpha$ is perfectly legitimate.
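The half-order eigenfunction of Example 7.5 is an α-FPSE with coefficients $a_k = \lambda^k$, and its closed form offers a convenient check of the series (7.15). The sketch below assumes a real argument $z$ and a fixed truncation; the function names are hypothetical, and the closed form is written with `erfc` (using $1 + \mathrm{erf}(z) = \mathrm{erfc}(-z)$) only to limit cancellation for negative $z$.

```python
import math


def mittag_leffler(alpha, z, terms=200):
    """Truncated series (7.15): sum over k of z**k / Gamma(1 + alpha*k), real z."""
    total = 0.0
    for k in range(terms):
        arg = 1.0 + alpha * k
        if arg > 170.0:            # math.gamma overflows slightly above this value
            break
        total += z ** k / math.gamma(arg)
    return total


def half_order_closed_form(z):
    """exp(z**2) * (1 + erf(z)) for real z, written as exp(z**2) * erfc(-z)."""
    return math.exp(z * z) * math.erfc(-z)


# Compare both expressions on a few moderate real arguments (illustrative values).
samples = [-1.5, -0.3, 0.0, 0.5, 1.0, 2.0]
pairs = [(mittag_leffler(0.5, z), half_order_closed_form(z)) for z in samples]
```

For large $|z|$ a plain truncated series loses accuracy through cancellation or overflow, so this check is only meaningful for moderate arguments.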

NOTE 7.3.– Just as a function of class $C^\infty$ is not necessarily expandable in power series (PSE), a function of class $C_\alpha^\infty$ is not necessarily expandable in fractional power series of order α. We will later introduce the function:

$$\psi_1(t) = \mathcal{L}^{-1}\!\left[ e^{-\sqrt{s}},\ \Re e(s) > 0 \right] \propto t^{-3/2} \exp\!\left(-\frac{1}{4t}\right) Y_1(t),$$

which is of class $C^\infty_{1/2}$ but which is not expandable in fractional power series of half-order.

7.3. Fractional differential equations

In Chapter 5 of [MIL 93], examples of fractional differential equations are examined and we note, in particular, problems of initial value (0 or +∞). In Chapter 6 of [MIL 93], the vectorial aspect is considered (we will start with that) and the idea of sequentiality is present, from a rather formal point of view about which we have already expressed reservations of an analytical nature. In section 7.3, we begin by treating and analyzing an example, which justifies the resolution of fractional differential equations within two quite distinct frameworks: causal distributions in section 7.3.2 and functions with a fractional power series expansion of order α in section 7.3.3; finally, in section 7.3.4, we are concerned with the asymptotic behavior of the solutions of fractional differential equations, which is a question connected with the basic concept of stability.

7.3.1. Example

An integro-differential equation (in $y$, $Y_{1/2} \star y'$, $y'$ for example, where $y$ of class $C^1$ is sought) can be written either with derivatives in the sense of distributions ($D^{1/2} y$, $D^1 y$), or with mild derivatives ($d^{1/2} y$, $(d^{1/2})^2 y$), which makes $d^1 y$ disappear. Let us clarify the passage in a particular framework for the integro-differential equation with right-hand side:

$$y'(t) + c_1\, (Y_{1/2} \star y')(t) + c_2\, y(t) = x(t) \quad \text{for } t > 0, \tag{7.18}$$

$$\text{with } y(0) = a_0. \tag{7.19}$$

7.3.1.1. Framework of causal distributions

In $\mathcal{D}'_+$, (7.18)-(7.19) is written as a single equation which incorporates the initial condition, i.e.:

$$D^1[y Y_1] + c_1\, D^{1/2}[y Y_1] + c_2\, y Y_1 = x Y_1 + a_0 \{ Y_0 + c_1 Y_{1/2} \} \tag{7.20}$$

which can be vectorially formulated in the following manner:

$$D^{1/2} \begin{pmatrix} y \\ D^{1/2} y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -c_2 & -c_1 \end{pmatrix} \begin{pmatrix} y \\ D^{1/2} y \end{pmatrix} + \begin{pmatrix} 0 \\ x Y_1 + a_0 \{ Y_0 + c_1 Y_{1/2} \} \end{pmatrix} \tag{7.21}$$

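and is thus solved simply by defining $\mathcal{E}_{1/2}(\Lambda, t)$ by the power series in the square matrix $\Lambda$ (exactly as for the matrix exponential):

$$\mathcal{E}_{1/2}(\Lambda, t) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{(k+1)/2}(t) = Y_{1/2}(t)\, I + \Lambda \sum_{k=0}^{+\infty} \frac{(\Lambda\sqrt{t})^k}{\Gamma\!\left(1 + \frac{k}{2}\right)}$$

from which the solution of (7.21) in $\mathcal{D}'_+$, which we will develop in section 7.3.2, is obtained:

$$D^{1/2} \mathbf{y} = \Lambda \mathbf{y} + \mathbf{x}_D \iff \mathbf{y}(t) = \left( \mathcal{E}_{1/2}(\Lambda, \cdot) \star \mathbf{x}_D \right)(t). \tag{7.22}$$

The notations are obvious; let us specify only that the index $D$ of $\mathbf{x}_D$ means that the vector contains not only the right-hand side $x$, but also distributions related to the initial condition $a_0$.

7.3.1.2. Framework of fractional power series expansion of order one half

We now work with the mild derivative of order one half and seek a function $y$ of class $C^2_{1/2}$ which is also of class $C^1$; equation (7.20) is then written in an equivalent manner:

$$(d^{1/2})^2 y + c_1\, d^{1/2} y + c_2\, y = x \quad \text{for } t > 0, \tag{7.23}$$

$$\text{with } \begin{cases} y(0) = a_0 \\ [d^{1/2} y](0) = 0 \end{cases} \tag{7.24}$$

which can be vectorially formulated in the following way:

$$d^{1/2} \begin{pmatrix} y \\ d^{1/2} y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -c_2 & -c_1 \end{pmatrix} \begin{pmatrix} y \\ d^{1/2} y \end{pmatrix} + \begin{pmatrix} 0 \\ x \end{pmatrix} \quad \text{for } t > 0, \tag{7.25}$$

$$\text{with } \begin{pmatrix} y \\ d^{1/2} y \end{pmatrix}(0) = \begin{pmatrix} a_0 \\ 0 \end{pmatrix} \tag{7.26}$$

and is solved by defining $E_{1/2}(\Lambda, t)$ by the power series in the square matrix $\Lambda$:

$$E_{1/2}(\Lambda, t) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{1+\frac{k}{2}}(t) = E_{1/2}(\Lambda\sqrt{t})$$

from which, with obvious notations, the solution of (7.25) and (7.26), which we will develop in section 7.3.3, is obtained:

$$d^{1/2} \mathbf{y} = \Lambda \mathbf{y} + \mathbf{x} \iff \mathbf{y}(t) = E_{1/2}(\Lambda, t)\, \mathbf{y}(0) + \left( \mathcal{E}_{1/2}(\Lambda, \cdot) \star \mathbf{x} \right)(t). \tag{7.27}$$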
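The matrix series $E_\alpha(\Lambda, t) = \sum_k \Lambda^k Y_{1+\alpha k}(t)$ used above can be summed directly for small matrices. The following sketch is illustrative only: it uses plain Python lists, a fixed truncation and hypothetical helper names (`mat_mul`, `matrix_eigenfunction`), and it assumes moderate values of $\|\Lambda\| t^\alpha$ so that the truncated series is accurate.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_eigenfunction(lam, alpha, t, terms=120):
    """Truncated series sum_k Lambda**k * t**(alpha*k) / Gamma(1 + alpha*k), t >= 0."""
    n = len(lam)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # Lambda**0
    result = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        arg = 1.0 + alpha * k
        if arg > 170.0:                                # math.gamma would overflow
            break
        weight = t ** (alpha * k) / math.gamma(arg)    # Y_{1+alpha k}(t)
        for i in range(n):
            for j in range(n):
                result[i][j] += weight * power[i][j]
        power = mat_mul(power, lam)
    return result


# Companion matrix of (7.25) for the illustrative values c1 = 3, c2 = 2.
companion = [[0.0, 1.0], [-2.0, -3.0]]
at_origin = matrix_eigenfunction(companion, 0.5, 0.0)   # should be the identity
integer_case = matrix_eigenfunction(companion, 1.0, 1.0)  # reduces to exp(Lambda)
half_case = matrix_eigenfunction(companion, 0.5, 0.5)
```

With α = 1 the series reduces to the matrix exponential, which gives an independent check; for this companion matrix (eigenvalues −1 and −2) the first row of $\exp(\Lambda t)$ is $(2e^{-t} - e^{-2t},\ e^{-t} - e^{-2t})$.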

7.3.1.3. Notes

Under the vectorial initial condition (7.26), or under the two scalar initial conditions (7.24), we set the non-integer order initial condition to zero to ensure the $C^1$ regularity of the solution of the physical starting problem (7.18), which has only one physical initial condition, given by (7.19). We obtain a response to the initial condition which is of class $C^\infty_{1/2}$, but of class $C_1^k$ for $k = 0, 1$ only; the same holds for the impulse response $h$ to the input $x$.

However, in presenting a general theory of fractional differential equations, nothing prevents us from considering $[d^{1/2} y](0) = a_1$ as an independent parameter, which will make it possible to speak about the response to the integer or non-integer initial conditions $a_i$. Thus, the problem, in general, would be, instead of (7.23)-(7.24):

$$(d^{1/2})^2 y + c_1\, d^{1/2} y + c_2\, y = x \quad \text{for } t > 0, \qquad \text{with } \begin{cases} y(0) = a_0 \\ [d^{1/2} y](0) = a_1 \end{cases}$$

or, instead of (7.20):

$$D^1[y Y_1] + c_1\, D^{1/2}[y Y_1] + c_2\, y Y_1 = x Y_1 + a_0 \{ Y_0 + c_1 Y_{1/2} \} + a_1 Y_{1/2}.$$

NOTE 7.4.– In terms of application to physics, the rational case $\alpha = \frac{1}{p}$ is interesting; the relations between $(d^{1/p})^{np} y$ and $(d^1)^n y$ will indeed have to be clarified. However, it is rather the case of commensurate orders of derivation which lends itself to a general algebraic treatment, in every respect analogous to that carried out for $\alpha = \frac{1}{2}$.

PROPOSITION 7.12.– Any scalar fractional differential equation of orders commensurate with α, of degree $n$, can be reduced to a vectorial fractional differential equation of order α, of degree 1, in dimension $n$.

Proof. It is valid for a fractional differential equation in $D^\alpha$ as for a fractional differential equation in $d^\alpha$, since we have the crucial property of sequentiality. We have just seen it on an example of order $\frac{1}{2}$ and degree 2; the proof, in general, is straightforward.

We thus give results directly in vectorial form later on; the solution of the scalar problem is obtained by extracting the first component of the vector solution.

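7.3.2. Framework of causal distributions

DEFINITION 7.8.– By definition, we have:

$$\mathcal{E}_\alpha(\Lambda, t) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{(1+k)\alpha}(t). \tag{7.28}$$

The matrix $\mathcal{E}_\alpha(\Lambda, t)$ is expressed as a power series in the matrix $\Lambda$ which, after reduction of the latter (eigenvalues $\lambda_i$ of multiplicity $m_i$), is made explicit on the basis of the fundamental solutions and their successive convolutions $\mathcal{E}_\alpha^{\star j}(\lambda_i, t)$, with $1 \le j \le m_i$. It is a first extension to the fractional case of the matrix exponential concept.

PROPOSITION 7.13.– For the $j$-fold convolution of fundamental solutions, we have:

$$\begin{aligned}
\mathcal{E}_\alpha^{\star j}(\lambda, t) &= \mathcal{L}^{-1}\!\left[ (s^\alpha - \lambda)^{-j},\ \Re e(s) > a_\lambda \right] \\
&= \frac{1}{(j-1)!} \left( \frac{\partial}{\partial\lambda} \right)^{j-1} \mathcal{L}^{-1}\!\left[ (s^\alpha - \lambda)^{-1},\ \Re e(s) > a_\lambda \right] \\
&= \frac{1}{(j-1)!} \left( \frac{\partial}{\partial\lambda} \right)^{j-1} \mathcal{E}_\alpha(\lambda, t) \\
&= \sum_{k=0}^{+\infty} C_{j-1+k}^{\,j-1}\, \lambda^k\, Y_{(j+k)\alpha}(t).
\end{aligned}$$

Proof. It is a formal calculation without much interest; let us note the use of the parametric derivative with respect to the complex parameter $\lambda$. In the integer case ($\alpha = 1$), it reads simply $\mathcal{E}_1^{\star j}(\lambda, t) = Y_j(t)\, \mathcal{E}_1(\lambda, t)$, which can be proved directly by using the following convolution property of causal functions $f$ and $g$:

$$\left[ \left( f(\tau)\, e^{\lambda\tau} \right) \star \left( g(\tau)\, e^{\lambda\tau} \right) \right](t) = \left[ f(\tau) \star g(\tau) \right](t)\ e^{\lambda t}.$$

PROPOSITION 7.14.– We have:

$$D^\alpha \mathbf{y} = \Lambda \mathbf{y} + \mathbf{x}_D \iff \mathbf{y}(t) = \left( \mathcal{E}_\alpha(\Lambda, \cdot) \star \mathbf{x}_D \right)(t) \tag{7.29}$$

where the vector $\mathbf{x}_D$ contains, on the one hand, distributions related to the initial conditions of the vector $\mathbf{y}$ and, on the other hand, a regular function $\mathbf{x}$, or right-hand side.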

Proof. It derives from the fact that $\mathcal{E}_\alpha(\Lambda, t)$ is the fundamental solution of the matrix operator $(D^\alpha I - \Lambda)$; this can be shown by Laplace transform, as for Proposition 7.9. The fundamental relation is established (see Definition 7.3):

$$D^\alpha \mathcal{E}_\alpha(\Lambda, t) = \Lambda\, \mathcal{E}_\alpha(\Lambda, t) + I\, \delta \tag{7.30}$$

from which the stated result is obtained.

7.3.3. Framework of functions expandable into fractional power series (α-FPSE)

DEFINITION 7.9.– By definition, we have:

$$E_\alpha(\Lambda, t) \stackrel{\text{def}}{=} \sum_{k=0}^{+\infty} \Lambda^k\, Y_{1+k\alpha}(t). \tag{7.31}$$

The matrix $E_\alpha(\Lambda, t)$ is expressed as a power series in the matrix $\Lambda$ which, in the case where $\Lambda$ is diagonalizable (eigenvalues $\lambda_i$), is made explicit on the basis of the eigenfunctions $E_\alpha(\lambda_i, t)$, with $1 \le i \le n$. It is the other extension to the fractional case of the matrix exponential concept.

PROPOSITION 7.15.– We have:

$$\begin{cases} d^\alpha \mathbf{y} = \Lambda \mathbf{y} + \mathbf{x} \\ \mathbf{y}(0^+) = \mathbf{y}_0 \end{cases} \iff \mathbf{y}(t) = E_\alpha(\Lambda, t)\, \mathbf{y}_0 + \left( \mathcal{E}_\alpha(\Lambda, \cdot) \star \mathbf{x} \right)(t) \tag{7.32}$$

where, this time, the vector $\mathbf{x}$ is a continuous function (or input) which controls the fractional differential system.

Proof. By using Definition 7.4, the left-hand side of (7.32) becomes:

$$D^\alpha \mathbf{y} = \Lambda \mathbf{y} + \mathbf{x} + Y_{1-\alpha}\, \mathbf{y}_0.$$

We then use Proposition 7.14, taking:

$$\mathbf{x}_D(t) = \mathbf{x}(t) + Y_{1-\alpha}(t)\, \mathbf{y}_0$$

and noting that:

$$E_\alpha(\Lambda, t) = \left( Y_{1-\alpha}(\cdot) \star \mathcal{E}_\alpha(\Lambda, \cdot) \right)(t). \tag{7.33}$$

We then find the right-hand side of (7.32) to be the solution. By reinterpreting this result on a scalar fractional differential equation of degree $n$, it appears that $\mathbf{y}_0$ is the vector of the first $n$ coefficients of the expansion in fractional power series of order α of the solution $y$; in other words, the vector of the fractional order initial conditions.

To establish the link with physics, when $\alpha = \frac{1}{p}$, it is advisable to initialize to 0 the fractional order initial conditions and to give the values of traditional initial position and velocity to the integer terms (on the example in $y$, $Y_{1/2} \star y'$, $y'$, of order $\frac{1}{2}$ and degree 2, we took $y(0) = y_0$ and $d^{1/2} y(0) = 0$; thus, the response to the initial conditions alone is a function of class $C^\infty_{1/2}$ which is $C^1$ without being $C^2$).

Within this framework, it is then possible to treat fractional differential equations of order α in an entirely algebraic way, by introducing the characteristic polynomial in the variable $\sigma$:

$$P(\sigma) = \sigma^n + c_{n-1}\, \sigma^{n-1} + \cdots + c_0 = \prod_{i=1}^{r} (\sigma - \lambda_i)^{m_i}$$

of the fractional differential equation with right-hand side:

$$(d^\alpha)^n y + c_{n-1}\, (d^\alpha)^{n-1} y + \cdots + c_0\, y = x.$$

The responses $h_k(t)$ to the various initial conditions $a_k = [(d^\alpha)^k y](0)$ for $0 \le k \le n-1$ and the impulse response $h(t)$ of the system are linear combinations of the $\mathcal{E}_\alpha^{\star j}(\lambda_i, t)$, with $1 \le j \le m_i$ and $1 \le i \le r$, which can be made explicit by the method of undetermined coefficients, or by algebraic means; in the generic case of distinct roots, for example, we obtain:

$$h(t) \stackrel{\text{def}}{=} \mathcal{L}^{-1}\!\left[ \frac{1}{P(s^\alpha)} \right] = \left( \mathcal{E}_\alpha(\lambda_1, \cdot) \star \cdots \star \mathcal{E}_\alpha(\lambda_n, \cdot) \right)(t) = \sum_{i=1}^{n} \frac{1}{P'(\lambda_i)}\, \mathcal{E}_\alpha(\lambda_i, t).$$

EXAMPLE 7.6.– Let us take again the example stated in a general way in section 7.3.1. Let $\lambda_1$, $\lambda_2$ be the roots of $P(\sigma) = \sigma^2 + c_1 \sigma + c_2$. The general solution of the system is given as:

$$y(t) = (h \star x)(t) + a_1\, h_1(t) + a_0\, h_0(t)$$

with the impulse response:

$$h(t) = \mathcal{L}^{-1}\!\left[ \frac{1}{P(s^{1/2})} \right] = \sum_{i=1}^{2} \frac{1}{P'(\lambda_i)}\, \mathcal{E}_{1/2}(\lambda_i, t) = \sum_{i=1}^{2} \frac{\lambda_i}{P'(\lambda_i)}\, E_{1/2}(\lambda_i \sqrt{t}),$$

the response to the (half-integer) initial condition $a_1$:

$$h_1(t) = I^{1/2} h = \sum_{i=1}^{2} \frac{1}{P'(\lambda_i)}\, E_{1/2}(\lambda_i \sqrt{t})$$

and the response to the (integer) initial condition $a_0$:

$$h_0(t) = D^{1/2} h_1 + c_1\, h_1 = h + c_1\, h_1 = \sum_{i=1}^{2} \frac{\lambda_i + c_1}{P'(\lambda_i)}\, E_{1/2}(\lambda_i \sqrt{t}).$$
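The modal formulas of Example 7.6 can be evaluated directly when the two roots are real and distinct. The sketch below is an illustration under that assumption: the names `e_half` and `responses` are hypothetical, the closed form $E_{1/2}(z) = e^{z^2}\,\mathrm{erfc}(-z)$ is used for real $z$, and no attempt is made to handle complex or repeated roots, or arguments large enough to overflow.

```python
import math


def e_half(z):
    """Mittag-Leffler function E_{1/2}(z) for real z: exp(z**2) * (1 + erf(z))."""
    return math.exp(z * z) * math.erfc(-z)


def responses(c1, c2, t):
    """Return (h, h1, h0) at time t >= 0 for P(sigma) = sigma**2 + c1*sigma + c2.

    Assumption of this sketch: P has two distinct real roots.
    """
    disc = c1 * c1 - 4.0 * c2
    if disc <= 0:
        raise ValueError("this sketch only handles distinct real roots")
    roots = [(-c1 + math.sqrt(disc)) / 2.0, (-c1 - math.sqrt(disc)) / 2.0]
    h = h1 = h0 = 0.0
    for lam in roots:
        dp = 2.0 * lam + c1                  # P'(lambda_i)
        e = e_half(lam * math.sqrt(t))
        h += lam / dp * e                    # impulse response
        h1 += e / dp                         # response to a1
        h0 += (lam + c1) / dp * e            # response to a0
    return h, h1, h0


# Illustrative coefficients: P(sigma) = (sigma + 1)(sigma + 2).
h_start, h1_start, h0_start = responses(3.0, 2.0, 0.0)
h_later, h1_later, h0_later = responses(3.0, 2.0, 0.5)
```

At $t = 0$ the formulas give $h_0 = 1$ and $h_1 = 0$, consistent with $h_0$ carrying the initial value $y(0) = a_0$ and $h_1$ contributing nothing to it; the identity $h_0 = h + c_1 h_1$ holds at every $t$.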

NOTE 7.5.– When there is a double root $\lambda_1$, the preceding expressions use $\mathcal{E}_{1/2}^{\star j}(\lambda_1, t)$ for $j = 1, 2$; they also give rise to algebraic simplifications which finally reveal $E_{1/2}(\lambda_1, t)$ and $\sqrt{t}\, E_{1/2}(\lambda_1, t)$.

7.3.4. Asymptotic behavior of fundamental solutions

7.3.4.1. Asymptotic behavior at the origin

We saw that $\mathcal{E}_\alpha(\lambda, t)$ has an integrable singularity at the origin; more generally:

PROPOSITION 7.16.– When $t \to 0^+$:

$$\mathcal{E}_\alpha^{\star j}(\lambda, t) \sim Y_{j\alpha}(t) = \frac{t_+^{j\alpha-1}}{\Gamma(j\alpha)} \in L^1_{loc}.$$

Proof. This follows from Proposition 7.13; the equivalent at $0^+$ is deduced from it immediately.

7.3.4.2. Asymptotic behavior at infinity

At the beginning of the 20th century, the mathematician Mittag-Leffler was interested in the functions $E_\alpha(z)$ (for reasons unconnected with fractional calculus [MIT 04]): concerning the asymptotic behavior when $|z| \to +\infty$ for $\alpha < 2$, he established exponential divergence in the sector $|\arg z| < \alpha\frac{\pi}{2}$ of the complex plane and convergence towards 0 outside it; the nature of the convergence towards 0 and the asymptotic behavior on the boundary were not examined. The precise nature of the convergence towards 0 for $|\arg z| > \alpha\frac{\pi}{2}$ was found later, in the middle of the last century [BAT 54]. We reuse similar results on the fundamental solutions and extend them, on the one hand, to the boundary $|\arg \lambda| = \alpha\frac{\pi}{2}$ and, on the other hand, to the successive convolutions $\mathcal{E}_\alpha^{\star j}(\lambda, t)$.

PROPOSITION 7.17.– The asymptotic behavior (when $t \to +\infty$) of the fundamental solutions of $D^\alpha - \lambda$ and of their convolutions, which structurally appear in the solutions of fractional differential equations of order α (like the basis {polynomials in $t$} × {$\exp(\lambda t)$} in the case of integer order $\alpha = 1$), is given by the position of the eigenvalues $\lambda$ in the complex plane, which plays the role of a fractional spectral domain:
– for $|\arg(\lambda)| < \alpha\frac{\pi}{2}$, $\mathcal{E}_\alpha^{\star j}(\lambda, t)$ diverges in an exponential way (more precisely, as {polynomial in $t^\alpha$} × {$\exp(\lambda^{1/\alpha} t)$});
– for $|\arg(\lambda)| = \alpha\frac{\pi}{2}$, $\mathcal{E}_\alpha^{\star 1}(\lambda, t)$ is asymptotically oscillatory and $\mathcal{E}_\alpha^{\star j}(\lambda, t)$ with $j \ge 2$ diverges in an oscillating polynomial way (in $t^\alpha$);

258

Scaling, Fractals and Wavelets

– for |arg(λ)| > α π2 , we obtain Eαj (λ, t) ∼ kj,α λ−1−j t−1−α . In this latter case, we note that Eαj (λ, t) ∈ L1 (]0, +∞[), which is crucial for the impulse responses (the notion of a bounded input-bounded output (BIBO) system is related to the integrable character of the impulse response; in short, L1 L∞ ⊂ L∞ ). Proof. See [MAT 96b, MAT 98a] for these tricky calculations of residues and asymptotic behavior of indeﬁnite integrals depending on a parameter. Let us note that the analysis in the Laplace plane provides only one pole when | arg(λ)| < απ and the latter (if it exists) is accompanied by an integral term – or aperiodic multimode according to [OUS 83] – resulting from the cut on the negative real semi-axis imposed by the multiform character of s → sα : this is our ﬁrst engagement with diffusive representation, which will be detailed further in section 7.4.1. EXAMPLE 7.7.– In the half-integer case, we can illustrate the asymptotic behavior in the two sides of Figure 7.1: the eigenvalue λ describes the plane in σ (which is only √ the “unfolded” Riemann surface s). That is translated in the Laplace plane either by a pole and a cut (or a “pole” in the ﬁrst layer of the Riemann surface), or by the cut alone (or a “pole” in the second layer of the Riemann surface). 2

[Figure 7.1: two panels on the square [−2, 2] × [−2, 2], with regions labeled "stable" and "unstable"; axes Re(s), Im(s) in (a) and Re(σ), Im(σ) in (b).]

Figure 7.1. Half-integer case: (a) Laplace plane in s; (b) plane in σ
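The three sectors of Proposition 7.17 can be sketched as a small classification routine. The function names below are hypothetical and the tolerance used to detect the boundary case is an arbitrary choice; the eigenvalues in the demonstration are those quoted in the captions of Figures 7.2 to 7.8.

```python
import cmath
import math

def asymptotic_regime(lam: complex, alpha: float, tol: float = 1e-12) -> str:
    """Classify the large-time behavior of E_alpha(lam, t) from |arg(lam)|.

    Follows the three sectors of Proposition 7.17 (0 < alpha < 2 assumed).
    """
    if lam == 0:
        raise ValueError("arg(lam) is undefined for lam = 0")
    theta = abs(cmath.phase(lam))           # |arg(lam)| in [0, pi]
    boundary = alpha * math.pi / 2
    if abs(theta - boundary) <= tol:
        return "oscillatory"                # on the stability limit
    if theta < boundary:
        return "divergent"                  # exponential growth
    return "decaying"                       # algebraic decay, integrable

def has_pole(lam: complex, alpha: float) -> bool:
    """True when a pole exists in the Laplace plane, i.e. |arg(lam)| < alpha*pi."""
    return abs(cmath.phase(lam)) < alpha * math.pi

if __name__ == "__main__":
    # Half-integer case alpha = 1/2, eigenvalues of Figures 7.2 to 7.8.
    for lam in (1, 3 * (1 + 0.9j), 4 * (1 + 1j), 4 * (0.8 + 1j), 5j, 4 * (-1 + 1j), -10):
        print(lam, asymptotic_regime(lam, 0.5), has_pole(lam, 0.5))
```

With $\alpha = 1/2$, the value $\lambda = 4(0.8 + i)$ is classified as decaying while still having a pole, which corresponds to the two-stage convergence of Figure 7.5; for $\lambda = 5i$ and beyond, only the cut contributes.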

In Figures 7.2 to 7.8, we represent the eigenfunctions $E_{1/2}(\lambda\sqrt{t})$ (whose integral term decreases like $t^{-1/2}$) in real and imaginary parts. Note the asymptotically oscillatory character in Figure 7.4 and the absence of an oscillatory term (or residue) in Figures 7.6 to 7.8, where only the integral or diffusive part is present.

Figure 7.2. For $\lambda = 1$, $E_{1/2}(\lambda\sqrt{t})$. Exponentially divergent real behavior

Figure 7.3. For $\lambda = 3(1 + 0.9i)$, (a): $\mathrm{Re}[E_{1/2}(\lambda\sqrt{t})]$, (b): $\mathrm{Im}[E_{1/2}(\lambda\sqrt{t})]$. Oscillatory exponentially divergent behavior

Figure 7.4. For $\lambda = 4(1 + i)$, (a): $\mathrm{Re}[E_{1/2}(\lambda\sqrt{t})]$, (b): $\mathrm{Im}[E_{1/2}(\lambda\sqrt{t})]$. Asymptotically oscillating behavior

Figure 7.5. For $\lambda = 4(0.8 + i)$, (a): $\mathrm{Re}[E_{1/2}(\lambda\sqrt{t})]$, (b): $\mathrm{Im}[E_{1/2}(\lambda\sqrt{t})]$. Convergence in two stages: exponentially damped oscillation, then diffusive decay in $t^{-1/2}$

Figure 7.6. For $\lambda = 5i$, (a): $\mathrm{Re}[E_{1/2}(\lambda\sqrt{t})]$, (b): $\mathrm{Im}[E_{1/2}(\lambda\sqrt{t})]$. Diffusive behavior only in $t^{-1/2}$

Figure 7.7. For $\lambda = 4(-1 + i)$, (a): $\mathrm{Re}[E_{1/2}(\lambda\sqrt{t})]$, (b): $\mathrm{Im}[E_{1/2}(\lambda\sqrt{t})]$. Diffusive behavior in $t^{-1/2}$

Figure 7.8. For $\lambda = -10$, $E_{1/2}(\lambda\sqrt{t})$. Pure diffusive behavior in $t^{-1/2}$
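Curves such as those of Figures 7.2 to 7.8 can be reproduced approximately with the following minimal sketch. It assumes two classical facts that are not stated in the text: the power series $E_{1/2}(z) = \sum_{k \geq 0} z^k / \Gamma(k/2 + 1)$ and, for real argument, the closed form $E_{1/2}(x) = e^{x^2}\,\mathrm{erfc}(-x)$. The function names and the number of series terms are illustrative choices.

```python
import cmath
import math

def mittag_leffler_half(z: complex, terms: int = 200) -> complex:
    """Power series E_{1/2}(z) = sum_{k>=0} z**k / Gamma(k/2 + 1).

    Adequate for moderate |z| only: for large |z| with Re(z) < 0 the
    terms cancel massively and double precision is lost.
    """
    if z == 0:
        return 1.0 + 0.0j
    log_z = cmath.log(z)
    total = 0.0 + 0.0j
    for k in range(terms):
        # exp(k*log z - lgamma(.)) avoids overflowing Gamma for large k
        total += cmath.exp(k * log_z - math.lgamma(k / 2 + 1))
    return total

def e_half_real(lam: float, t: float) -> float:
    """Closed form for real lam: exp(lam^2 t) * erfc(-lam*sqrt(t))."""
    x = lam * math.sqrt(t)
    return math.exp(x * x) * math.erfc(-x)

if __name__ == "__main__":
    # lambda = 1 (Figure 7.2): exponential growth on [0, 1]
    print(e_half_real(1.0, 1.0), mittag_leffler_half(1.0).real)
    # lambda = -10 (Figure 7.8): compare with the diffusive tail -1/(lam*sqrt(pi*t))
    print(e_half_real(-10.0, 1.0), 1.0 / (10.0 * math.sqrt(math.pi)))
    # complex lambda (Figure 7.5), small t so that the series stays accurate
    print(mittag_leffler_half(4 * (0.8 + 1j) * math.sqrt(0.05)))
```

For $\lambda = -10$ and $t = 1$ the closed form is already within about 1% of $1/(|\lambda|\sqrt{\pi t})$, which is the $t^{-1/2}$ diffusive decay mentioned in the captions.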

7.3.5. Controlled-and-observed linear dynamic systems of fractional order

Let us consider the following class of fractional linear dynamic systems:

$$ d^\alpha x = Ax + Bu, \qquad y = Cx + Du. $$

We can study them from the angle of asymptotic stability, controllability, observability, stabilization by state feedback, construction of an asymptotic observer and stabilization by an observer-based controller. The results which relate to controlled-and-observed linear dynamic systems of integer order [DAN 94, SON 90] can be generalized to the fractional order. In particular, a system in this class will have the property of:
– stability if and only if $|\arg \operatorname{spec}(A)| > \alpha\frac{\pi}{2}$;
– stabilizability by state feedback if and only if $\exists K$ such that $|\arg \operatorname{spec}(A + BK)| > \alpha\frac{\pi}{2}$, which is fulfilled if the "uncontrollable" modes of $(A, B)$ in the traditional sense are stable in the $\alpha$-sense and thus in particular if the pair $(A, B)$ is controllable in the traditional sense;
– construction of an asymptotic observer if and only if $\exists L$ such that $|\arg \operatorname{spec}(A + LC)| > \alpha\frac{\pi}{2}$,

which is fulfilled if the "unobservable" modes of $(C, A)$ in the traditional sense are stable in the $\alpha$-sense and thus in particular if the pair $(C, A)$ is observable in the traditional sense;
– stabilizability by observer-based controller if and only if it is stabilizable by state feedback and an asymptotic observer can be built, which is specifically the case when the triplet $(C, A, B)$ is minimal in the traditional sense.

Further details can be found in [MAT 96c] for the concepts of observability and controllability and in [MAT 97] for observer-based control.

NOTE 7.6.– We should, however, be aware that the application range of the preceding approach is rather limited because it relies heavily on the commensurate character of the derivation orders and therefore makes a distinction between rational orders and others, which is theoretically restrictive and completely impracticable for numerical simulation, for example.

7.4. Diffusive structure of fractional differential systems

We now approach the study of fractional differential systems of incommensurate orders, linear and with constant coefficients in time, i.e., the pseudo-differential input ($u$)-output ($y$) systems of the form:

$$ \sum_{k=0}^{K} a_k D^{\alpha_k} y(t) = \sum_{l=0}^{L} b_l D^{\beta_l} u(t) $$

corresponding, by Laplace transform, to the symbol:

$$ H(s) = \frac{\sum_{l=0}^{L} b_l\, s^{\beta_l}}{\sum_{k=0}^{K} a_k\, s^{\alpha_k}}. \qquad (7.34) $$
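As an illustration, the symbol (7.34) can be evaluated numerically for any list of orders, commensurate or not. The sketch below is a minimal one: the function name and the coefficient values in the demonstration are arbitrary and not taken from the text, and the non-integer powers are taken on the principal branch (cut along the negative real axis), which is the convention of Python's complex power.

```python
import cmath

def symbol(s: complex, a, alpha, b, beta) -> complex:
    """Evaluate H(s) = sum_l b_l s**beta_l / sum_k a_k s**alpha_k, as in (7.34).

    Non-integer powers use the principal branch, i.e. a cut along the
    negative real axis.
    """
    s = complex(s)
    num = sum(bl * s ** bt for bl, bt in zip(b, beta))
    den = sum(ak * s ** al for ak, al in zip(a, alpha))
    return num / den

if __name__ == "__main__":
    # Hypothetical strictly proper system: D^1.3 y + 2 D^0.4 y + y = u
    a, alpha = [1.0, 2.0, 1.0], [0.0, 0.4, 1.3]
    b, beta = [1.0], [0.0]
    for omega in (0.1, 1.0, 10.0):
        h = symbol(1j * omega, a, alpha, b, beta)
        print(omega, abs(h), cmath.phase(h))
```

Evaluating on $s = i\omega$ gives the frequency response; evaluating just above and just below the negative real axis exposes the discontinuity across the cut.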

NOTE 7.7.– Strictly speaking, the term fractional should be reserved for systems of commensurate orders ($\beta_l = l\alpha_1$ and $\alpha_k = k\alpha_1$), whereas the term non-integer would be, in truth, more suitable; we conform here to the English-language usage (fractional calculus).

In section 7.4.2 we give a general structure result which shows to what extent fractional differential systems are also diffusive pseudo-differential systems. In section 7.4.3, a characterization of the concept of long memory is given. Finally, in section 7.4.4, we recall the particular case of fractional differential systems of commensurate orders, to which the general structure result naturally applies, but which allows, moreover, an explicit characterization of stability (in the BIBO sense). However, first of all, in section 7.4.1, we recall some basic ideas on what diffusive representations of pseudo-differential operators are.

7.4.1. Introduction to diffusive representations of pseudo-differential operators

A first-order system, or autoregressive filter of order 1 (AR-1) in other contexts, is undoubtedly the simplest linear dynamic system imaginable which does not oscillate, but has a behavior of pure relaxation. A discrete superposition of such systems, for various time constants $\tau_k$, or in an equivalent way for various relaxation constants $\xi_k = \tau_k^{-1}$ and various weights $\mu_k$, gives a simple idea, without being simplistic (see footnote 2), of the diffusive pseudo-differential operators required to simulate fractional differential equations.

When the superposition is discrete and finite, the resulting system is a system of integer order with poles (real negative, $s_k = -\xi_k$) and zeros; on the other hand, if the superposition is either discrete and infinite, or continuous over all the relaxation constants $\xi > 0$ with a weight function $\mu(\xi)$, we obtain a pseudo-differential system said to be of the diffusive type, the function $\mu$ being called the diffusive representation of the associated pseudo-differential operator. In the sense of systems theory, a realization of such a system will be:

$$ \partial_t \psi(t, \xi) = -\xi\, \psi(t, \xi) + u(t) \qquad (7.35) $$

$$ y(t) = \int_0^{+\infty} \mu(\xi)\, \psi(t, \xi)\, d\xi \qquad (7.36) $$

which is mathematically meaningful within a suitable functional framework (see e.g. [STA 94, MON 98, MAT 08] for technical details; the latter reference makes the link with the class of well-posed linear systems). A simple calculation thus shows that the impulse response of the input $u$-output $y$ system is:

$$ h_\mu(t) = \int_0^{+\infty} \mu(\xi)\, e^{-\xi t}\, d\xi. \qquad (7.37) $$

Its transfer function or its symbol is then, for $\mathrm{Re}(s) > 0$:

$$ H_\mu(s) = \int_0^{+\infty} \frac{\mu(\xi)}{s + \xi}\, d\xi. \qquad (7.38) $$

EXAMPLE 7.8.– A simple case of a diffusive pseudo-differential operator is that of the fractional integrator $I^\alpha$, whose diffusive representation is $\mu_\alpha(\xi) = \frac{\sin(\alpha\pi)}{\pi}\, \xi^{-\alpha}$ for $0 < \alpha < 1$.
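Example 7.8 can be checked numerically with a finite superposition of first-order systems, which is also the principle of a simple simulation scheme. The sketch below is a minimal one under stated assumptions: the logarithmic grid $\xi = e^x$, its bounds and its size are illustrative choices and are not prescribed by the text, and the function names are hypothetical. It verifies that (7.38) gives $s^{-\alpha}$ and that (7.37) gives $t^{\alpha-1}/\Gamma(\alpha)$.

```python
import math

def mu_integrator(xi: float, alpha: float) -> float:
    """Diffusive representation of I^alpha (Example 7.8)."""
    return math.sin(alpha * math.pi) / math.pi * xi ** (-alpha)

def log_grid(x_max: float = 80.0, n: int = 4000):
    """Nodes xi_k = exp(x_k) and trapezoidal weights for integrals over (0, +inf).

    The change of variable xi = exp(x) turns d(xi) into xi*dx; the grid
    bounds and size are illustrative choices, not values from the text.
    """
    h = 2 * x_max / n
    nodes = [math.exp(-x_max + k * h) for k in range(n + 1)]
    weights = [h * xi for xi in nodes]
    weights[0] *= 0.5
    weights[-1] *= 0.5
    return nodes, weights

def transfer(s: float, alpha: float) -> float:
    """H_mu(s) = int mu(xi)/(s + xi) d(xi), equation (7.38), for real s > 0."""
    nodes, weights = log_grid()
    return sum(w * mu_integrator(xi, alpha) / (s + xi) for xi, w in zip(nodes, weights))

def impulse(t: float, alpha: float) -> float:
    """h_mu(t) = int mu(xi) exp(-xi*t) d(xi), equation (7.37), for t > 0."""
    nodes, weights = log_grid()
    return sum(w * mu_integrator(xi, alpha) * math.exp(-xi * t)
               for xi, w in zip(nodes, weights))

if __name__ == "__main__":
    print(transfer(2.0, 0.3), 2.0 ** -0.3)                        # symbol s**(-alpha)
    print(impulse(2.0, 0.3), 2.0 ** (0.3 - 1) / math.gamma(0.3))  # kernel of I^alpha
```

Each node $\xi_k$ with weight $w_k\,\mu(\xi_k)$ is one first-order system of the realization (7.35)-(7.36); a practical simulation would use far fewer, carefully chosen nodes.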

2. Indeed, on the one hand, the space of diffusive pseudo-differential operators is obtained by completion of this family within a suitable topological framework; on the other hand, it is ultimately these simple systems that are programmed numerically, by standard numerical approximation procedures; see e.g. [HÉL 06b].

We see that one of the advantages of diffusive representations is to transform non-local problems of hereditary nature, in time, into local problems, which specifically enables a standard and effective numerical approximation (see e.g. [HEL 00]). On the other hand, when the diffusive representation $\mu$ is positive, the realization suggested has the important property of dissipativity of the pseudo-differential operator (a natural energy functional is then given by $E_\psi(t) = \int_0^{+\infty} \mu(\xi)\, |\psi(t, \xi)|^2\, d\xi$), which is in this case of the positive type; this has important consequences, particularly for the study of the stability of coupled systems (see [MON 97] and also [MON 00] for non-linear systems, time-varying, with hysteresis, etc.). Now, as far as stability is concerned, it is important to notice that some technicalities must be taken care of in an infinite-dimensional setting (namely, LaSalle's invariance principle does not apply when the pre-compactness of trajectories in the energy space has not been proved a priori: this is the reason why we have to analyze the spectrum of the infinitesimal generator of the semigroup of the augmented system and resort to the Arendt-Batty stability theorem, as has been done recently in [MAT 05]).

7.4.2. General decomposition result

By re-using the notation $Y_\alpha(t) = \frac{1}{\Gamma(\alpha)}\, t_+^{\alpha-1}$ and by limiting ourselves to the case of strictly proper systems ($\beta_L < \alpha_K$), the following significant result is obtained (see [MAT 98a, AUD 00]).

THEOREM 7.1 (DECOMPOSITION RESULT).– The impulse response $h$ of system (7.34) of symbol $H$ has the structure:

$$ h(t) = \sum_{i=1}^{r} \sum_{j=1}^{\nu_i} r_{ij}\, Y_j(t)\, e^{s_i t} + \int_0^{+\infty} \mu(\xi)\, e^{-\xi t}\, d\xi \qquad (7.39) $$

where the $s_i$ are complex poles in $\mathbb{C} \setminus \mathbb{R}^-$ and where $\mu$ is a distribution. Moreover, when $\mu$ is a density, its analytical form is given by:

$$ \mu(\xi) = \frac{1}{\pi}\, \frac{\sum_{k=0}^{K} \sum_{l=0}^{L} a_k b_l \sin\big((\alpha_k - \beta_l)\pi\big)\, \xi^{\alpha_k + \beta_l}}{\sum_{k=0}^{K} a_k^2\, \xi^{2\alpha_k} + \sum_{0 \leq k < l \leq K} 2\, a_k a_l \cos\big((\alpha_k - \alpha_l)\pi\big)\, \xi^{\alpha_k + \alpha_l}}. \qquad (7.40) $$

For the proof, the idea is to apply the residue theorem to the function $H(s)$, which is meromorphic in the cut plane $\mathbb{C} \setminus \mathbb{R}^-$. The diffusive term then follows naturally from the discontinuity of $H$ across the cut on $\mathbb{R}^-$; precisely, it is shown that:

$$ \mu(\xi) = \lim_{\varepsilon \to 0^+} \frac{1}{2i\pi} \big[ H(-\xi - i\varepsilon) - H(-\xi + i\varepsilon) \big]. $$
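The characterization of $\mu$ as the jump of $H$ across the cut lends itself to a direct numerical check. The following minimal sketch compares a finite-$\varepsilon$ approximation of the jump with a closed form for real coefficients, in which the denominator is written as $|\sum_k a_k (\xi e^{i\pi})^{\alpha_k}|^2$ (a form derived for this sketch); the function names and the value of $\varepsilon$ are illustrative assumptions.

```python
import cmath
import math

def mu_from_cut(H, xi: float, eps: float = 1e-9) -> float:
    """Approximate mu(xi) = [H(-xi - i*eps) - H(-xi + i*eps)] / (2*i*pi)."""
    jump = H(complex(-xi, -eps)) - H(complex(-xi, eps))
    return (jump / (2j * math.pi)).real

def mu_closed_form(xi: float, a, alpha, b, beta) -> float:
    """Density of the diffusive part for H = sum b_l s^beta_l / sum a_k s^alpha_k.

    The denominator is |D(-xi)|^2, with D evaluated on the upper lip of the cut.
    """
    num = sum(ak * bl * math.sin((al - bt) * math.pi) * xi ** (al + bt)
              for ak, al in zip(a, alpha) for bl, bt in zip(b, beta))
    den = abs(sum(ak * xi ** al * cmath.exp(1j * math.pi * al)
                  for ak, al in zip(a, alpha))) ** 2
    return num / (math.pi * den)

if __name__ == "__main__":
    # Fractional integrator of order 1/2: expect sin(pi/2)/pi * xi**-0.5
    print(mu_from_cut(lambda s: s ** -0.5, 2.0), 2.0 ** -0.5 / math.pi)
    # H(s) = 1/(1 + sqrt(s)): expect sqrt(xi) / (pi * (1 + xi))
    print(mu_from_cut(lambda s: 1 / (1 + s ** 0.5), 3.0),
          mu_closed_form(3.0, [1.0, 1.0], [0.0, 0.5], [1.0], [0.0]))
```

Python's complex power uses the principal branch, so the two evaluation points automatically sit on opposite lips of the cut along the negative real axis.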

In other words, the impulse response $h$ of a fractional differential system breaks up into a localized part $h_n$ of integer order $n = \sum_{i=1}^{r} \nu_i$ and a part $h_\mu$ of purely diffusive nature. We can find in [DAUP 00, HEL 00] a great number of examples illustrating this decomposition result on some non-standard oscillators.

7.4.3. Connection with the concept of long memory

Finally, let us recall that such systems are said to have long memory in so far as the decrease of the impulse response (in the stable case) is not of exponential type. This is determined by a generalized expansion at $\xi = 0$ of the distribution $\mu$, which is followed by the application of the following lemma.

LEMMA 7.1 (Watson).– For $-1 < \gamma_1 < \gamma_m < \gamma_{m+1}$, we have:

$$ \mu(\xi) = \sum_{m=1}^{M-1} \mu_m\, \frac{\xi^{\gamma_m}}{\Gamma(1 + \gamma_m)} + O(\xi^{\gamma_M}) \;\Longrightarrow\; h_\mu(t) = \sum_{m=1}^{M-1} \mu_m\, \frac{1}{t^{1+\gamma_m}} + O(t^{-1-\gamma_M}) \qquad (7.41) $$
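As a worked instance of Lemma 7.1 (an illustration added here, using only Example 7.8 and the reflection formula for the Gamma function), consider the fractional integrator $I^\alpha$ with $0 < \alpha < 1$, whose expansion reduces to a single term:

```latex
% Worked instance of Lemma 7.1: fractional integrator I^alpha, 0 < alpha < 1.
\mu_\alpha(\xi) = \frac{\sin(\alpha\pi)}{\pi}\,\xi^{-\alpha}
  = \mu_1\,\frac{\xi^{\gamma_1}}{\Gamma(1+\gamma_1)},
\qquad \gamma_1 = -\alpha, \qquad
\mu_1 = \frac{\sin(\alpha\pi)}{\pi}\,\Gamma(1-\alpha) = \frac{1}{\Gamma(\alpha)} ,
% reflection formula used above: Gamma(alpha) Gamma(1 - alpha) = pi / sin(alpha pi)
\qquad\Longrightarrow\qquad
h_{\mu_\alpha}(t) = \mu_1\, t^{-1-\gamma_1} = \frac{t^{\alpha-1}}{\Gamma(\alpha)} = Y_\alpha(t).
```

Here $\gamma_1 = -\alpha$ lies in $]-1, 0[$, so the decay in $t^{\alpha-1}$ is slower than $1/t$ and the impulse response is not integrable at infinity, a typical long-memory behavior.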

Thus, by juxtaposing the decomposition result (7.39), the expression of $\mu$ (7.40) and the asymptotic analysis (7.41), the following characterization of stability is obtained.

THEOREM 7.2.– System (7.34) is BIBO stable if and only if the two following conditions are verified:
– in (7.39), we have $\mathrm{Re}(s_i) < 0$ for all $i$;
– the first exponent $\gamma_1$ in (7.41) is strictly positive.

It should be noted that, a priori, the poles $s_i$, although finite in number (see [BON 00]), are not known in a simple way in the general case. The situation is quite different when the system is more structured, as is emphasized below.

7.4.4. Particular case of fractional differential systems of commensurate orders

The general result given before can then be expressed differently, by using a strong algebraic structure induced by the commensurate character of the derivation orders. Setting $\sigma = s^\alpha$, we define $R$ such that $H(s) = R(\sigma)$. It is then enough to decompose the rational fraction $R$ into partial fractions $(\sigma - \lambda)^{-m}$, to define by inverse Laplace transform the corresponding basic elements $E_\alpha^{\star m}(\lambda, t)$ and to characterize their stability by using an asymptotic analysis similar to the preceding one. The function $E_\alpha^{\star m}(\lambda, t)$ is the fundamental solution of the operator $(D^\alpha - \lambda)^m$; it belongs to the family of the Mittag-Leffler functions, which is a subset of the hypergeometric special functions. Using these functions, we obtain the following structure result.

PROPOSITION 7.18.– We have:

$$ h(t) = \sum_{n=1}^{N} \sum_{m=1}^{m_n} r_{nm}\, E_\alpha^{\star m}(\lambda_n, t) \quad \text{with} \quad R(\sigma) = \sum_{n=1}^{N} \sum_{m=1}^{m_n} r_{nm}\, (\sigma - \lambda_n)^{-m}. $$

A refined asymptotic analysis of the functions $E_\alpha^{\star m}(\lambda_n, t)$ makes it possible to deduce the following fundamental result for BIBO stability when $R = Q/P$, with $P, Q$ two coprime polynomials and $0 < \alpha < 1$.

THEOREM 7.3.– We have:

$$ \text{BIBO stability} \iff |\arg \sigma| > \alpha\frac{\pi}{2}, \quad \forall \sigma \in \mathbb{C},\ P(\sigma) = 0. \qquad (7.42) $$

In this latter case, the impulse response has the asymptotic behavior:

$$ h(t) \sim K\, t^{-1-\alpha} \quad \text{when } t \to +\infty. \qquad (7.43) $$
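Criterion (7.42) is easy to test once the roots of $P$ are known. The sketch below is a minimal one restricted to a quadratic $P$ (so that the roots are available in closed form with the standard library); the function names and the example polynomials are hypothetical. It also computes the Laplace-plane poles $\lambda_n^{1/\alpha}$ for the roots lying in the sector $|\arg \lambda_n| < \alpha\pi$.

```python
import cmath
import math

def quadratic_roots(c2: complex, c1: complex, c0: complex):
    """Roots of P(sigma) = c2*sigma**2 + c1*sigma + c0, with c2 != 0."""
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return ((-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2))

def is_bibo_stable(roots, alpha: float) -> bool:
    """Criterion (7.42): every root sigma of P satisfies |arg(sigma)| > alpha*pi/2."""
    return all(abs(cmath.phase(r)) > alpha * math.pi / 2 for r in roots)

def laplace_poles(roots, alpha: float):
    """Poles lambda**(1/alpha), kept only for roots with |arg(lambda)| < alpha*pi."""
    return [r ** (1 / alpha) for r in roots if abs(cmath.phase(r)) < alpha * math.pi]

if __name__ == "__main__":
    roots = quadratic_roots(1, -2, 5)            # sigma = 1 +/- 2i
    print(is_bibo_stable(roots, 0.5))            # True: |arg| ~ 1.107 > pi/4
    print(is_bibo_stable(roots, 1.0))            # False: roots in the right half-plane
    print(laplace_poles(roots, 0.5))             # close to -3 +/- 4i
```

The example shows how the stability sector widens when $\alpha$ decreases: the roots $1 \pm 2i$ lie in the right half-plane of the $\sigma$ variable, yet for $\alpha = 1/2$ the corresponding poles in $s$ have negative real part.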

NOTE 7.8.– In this case, the poles of the system appearing in decomposition (7.39) are known analytically; they are exactly $s_n = \lambda_n^{1/\alpha}$, but only for those of the preceding $\lambda_n$ which verify $|\arg \lambda_n| < \alpha\pi$.

NOTE 7.9.– In the integer case $\alpha = 1$, we recover with (7.42) the traditional stability result: absence of poles in the closed right-half plane.

7.5. Example of a fractional partial differential equation

An example of a propagation phenomenon with long memory, very similar to the one we will now deal with, is mentioned in [DAUT 84b]. This refers to the original Russian articles [LOK 78a, LOK 78b], but the fundamental difference between the case presented there and ours is that the space there is unbounded; hence there are no discrete spectra or resonance modes of the physical system; moreover, no relationship with the eigenfunctions of fractional derivation appears.

We thus examine the example of an acoustic pipe of finite length (consequently, space is bounded), as studied in [MAT 94] and summarized in [MAT 95b]. We present the physical problem in section 7.5.1 and begin by studying the controlled problem: the perturbation of the classical 1D wave equation by a fractional derivative term, in time, is examined from the perspective of its spectral consequences in section 7.5.2 and from the point of view of its time-domain consequences in section 7.5.3. Lastly, we examine the response to the initial conditions in section 7.5.4, i.e., the free problem.

7.5.1. Physical problem considered

The propagation of pressure waves in air, regarded as a real medium with viscous and thermal losses, has already been studied in acoustics, either in a closed space (bounded domain) or in an open space (unbounded domain): the approach generally adopted is in the frequency domain. A fractional partial differential equation (FPDE) was proposed in [POL 91] for the so-called wide-pipe approximation and is found again, within a very general framework, as a high-frequency approximation in e.g. [FEL 00]; it is a wave equation where a fractional derivative term appears as a perturbation of the classical wave equation: the perturbation parameter is inversely proportional to the radius of the cylindrical tube considered. We normalize the speed of sound to 1 and the length of the tube to 1 and we denote by $\varepsilon$ the perturbation parameter.

Within this framework, we consider the following linear dynamic system written in the sense of distributions, where $u(t)$ is the input or boundary control (the pressure signal introduced at the left end of the tube at $x = 0$), $X(t, x)$ the internal state of infinite dimension (since it is in fact a function of the abscissa $x \in [0, 1]$) and $y(t)$ the output or observation (the pressure signal that we listen to at the output of the tube at $x = 1$):

$$ \big( \partial_t^2 + 2\varepsilon\, \partial_t^{3/2} + \varepsilon^2\, \partial_t^{1} \big) X - \partial_x^2 X = 0, \qquad t > 0, \quad x \in\, ]0, 1[ \qquad (7.44) $$

The initial conditions are identically zero: we have $X(t = 0, x) = 0$ and $\partial_t X(t = 0, x) = 0$, and the dynamic boundary conditions are of absorbing type (with $a_0 b_0 > 0$, $a_1 b_1 > 0$):

$$ \begin{cases} \big[ a_0 (\partial_t + \varepsilon\, \partial_t^{1/2}) + b_0\, \partial_{-x} \big] X(t, x = 0) = a_0 (\partial_t + \varepsilon\, \partial_t^{1/2})\, u(t) \\ \big[ a_1 (\partial_t + \varepsilon\, \partial_t^{1/2}) + b_1\, \partial_x \big] X(t, x = 1) = 0 \end{cases} \qquad (7.45) $$

where $\{a_1, b_1\}$ define a reflection coefficient of waves at the output of the tube, $r_1 = (b_1 - a_1)/(a_1 + b_1)$, and where $\{a_0, b_0\}$ define a reflection coefficient of waves at its input, $r_0 = (a_0 - b_0)/(a_0 + b_0)$. The absorbing property is given by $|r_i| < 1$, i.e., $a_i b_i > 0$; in the limiting case $|r_i| = 1$, these are traditional boundary conditions of Dirichlet or Neumann type. The total reflection coefficient is given by $\rho = -r_0 r_1$. Lastly, the output of the system is:

$$ y(t) = X(t, x = 1). \qquad (7.46) $$

7.5.2. Spectral consequences

Following [KOL 69], we apply the Laplace transform in the sense of causal distributions to (7.44); we are then led to the following characteristic equation for the poles of the system:

$$ s_n^\varepsilon + \varepsilon \sqrt{s_n^\varepsilon} = s_n^0 \qquad (7.47) $$

with $\mathrm{Re}(\sqrt{s}) > 0$, where $s_n^0 = -\alpha^0 + i\omega_n^0$ are the poles of the uncontrolled ($u(t) = 0$) system without loss ($\varepsilon = 0$), with the boundary conditions (7.45): $\alpha^0 = -\ln \sqrt{|\rho|}$ fulfills the relation $\alpha^0 > 0$ when $a_i b_i > 0$, i.e., when there is loss of energy at the boundaries. Moreover, the $\omega_n^0 = \omega_0^0 + n\pi$ are equally spaced with interval $\pi$, symmetrically distributed with respect to 0, and include 0 or not according to whether the modes are even (if $\rho > 0$, $\omega_0^0 = 0$) or odd (if $\rho < 0$, $\omega_0^0 = \frac{\pi}{2}$). This classical result can be found in [RUS 78], for example. At this stage, we can draw the following three lessons:
– damping is more significant and depends upon the frequency: $\alpha_n^\varepsilon > \alpha^0$, and the angular eigenfrequencies are lowered: $|\omega_n^\varepsilon| < |\omega_n^0|$;
– the negative real axis becomes an integral part of the "spectrum", with a weight $o(\varepsilon)$ (see section 7.5.3.2);
– a finite number of poles located at low frequency can even disappear if the perturbation parameter becomes too large, with $\alpha^0$ fixed.

In Figure 7.9, we represent in the Laplace plane the poles $s_n^0$ without loss ($\varepsilon = 0$) and the poles $s_n^\varepsilon$ with losses ($\varepsilon = 0.25$), for the configurations of odd modes (o: $\rho = -1$) and of even modes (∗: $\rho = +0.8$).

7.5.3. Time-domain consequences

Now, we calculate the impulse response of the system, using three methods which take the transfer function of the system as a starting point, i.e., up to a multiplicative factor:

$$ H(s) = \frac{e^{-\Gamma(s)}}{1 - \rho\, e^{-2\Gamma(s)}} \qquad (7.48) $$

where $\Gamma(s) = s + \varepsilon\sqrt{s}$ is the propagation constant and $\rho$ the global reflection coefficient.

[Figure 7.9: poles plotted in the plane (Re(s), Im(s)), for Re(s) between −1 and 0.5 and Im(s) between −20 and 20.]

Figure 7.9. Position in the Laplace plane of the poles of the models with losses (ε = 0.25) and without loss (ε = 0). Legend: o: ρ = −1, ∗: ρ = +0.8
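Pole positions of the kind plotted in Figure 7.9 can be obtained by solving the characteristic equation (7.47) in the variable $\sigma = \sqrt{s}$, where it becomes the quadratic $\sigma^2 + \varepsilon\sigma - s_n^0 = 0$. The following is a minimal sketch with hypothetical function names; it assumes $\alpha^0 = -\frac{1}{2}\ln|\rho|$, which follows from setting the denominator of (7.48) to zero with $\varepsilon = 0$.

```python
import cmath
import math

def lossless_pole(n: int, rho: float) -> complex:
    """s_n^0 = -alpha0 + i*omega_n^0, with omega_n^0 = omega_0^0 + n*pi."""
    alpha0 = -0.5 * math.log(abs(rho))
    omega00 = 0.0 if rho > 0 else math.pi / 2
    return complex(-alpha0, omega00 + n * math.pi)

def lossy_pole(n: int, rho: float, eps: float):
    """Solve s + eps*sqrt(s) = s_n^0 with Re(sqrt(s)) > 0, equation (7.47).

    Returns None when no root sigma = sqrt(s) has Re(sigma) > 0, i.e. when
    the pole has disappeared. Intended for eps > 0; for eps = 0 use
    lossless_pole directly.
    """
    s0 = lossless_pole(n, rho)
    disc = cmath.sqrt(eps * eps + 4 * s0)
    for sigma in ((-eps + disc) / 2, (-eps - disc) / 2):
        if sigma.real > 0:
            return sigma * sigma
    return None

if __name__ == "__main__":
    for n in range(-2, 3):
        print(n, lossless_pole(n, 0.8), lossy_pole(n, 0.8, 0.25))
```

For $\rho = 0.8$ and $\varepsilon = 0.25$, the mode $n = 1$ moves from about $-0.11 + 3.14i$ to about $-0.39 + 2.82i$ (more damping, lower frequency), while the mode $n = 0$ has no admissible root and disappears, in line with the three lessons of section 7.5.2.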

7.5.3.1. Decomposition into wavetrains

This is undoubtedly the most physically meaningful decomposition. By expanding the fraction in (7.48) into a power series, which is legitimate for $\mathrm{Re}(s) > 0$ in the case $|\rho| < 1$, then by applying the inverse Laplace transform term by term, we obtain the following wavetrain decomposition of the system:

$$ h(t) = \sum_{k=0}^{\infty} \rho^k\, \psi^{(2k+1)\varepsilon}\big( t - (2k+1) \big) \qquad (7.49) $$

where $\psi^\varepsilon(t)$ is the fundamental solution of a 3D diffusion process (i.e., parabolic heat equation):

$$ \psi^\varepsilon(t) \stackrel{\mathrm{def}}{=} \mathcal{L}^{-1}\big[ e^{-\varepsilon\sqrt{s}},\ \mathrm{Re}(s) > 0 \big] \propto \varepsilon\, t^{-3/2}\, e^{-\varepsilon^2/(4t)} \quad \text{for } t > 0. \qquad (7.50) $$

In Figure 7.10a, we represent the function $\psi^1$, that is, the elementary lossy wave: it is a function of class $C^\infty$, all of whose derivatives are zero at $t = 0^+$, and which decreases like $t^{-3/2}$ at infinity; it is integrable and it even belongs to $L^1, L^2, \ldots, L^\infty$, with norm $\|\psi^\varepsilon\|_p = \varepsilon^{-2(1 - \frac{1}{p})}\, \|\psi^1\|_p$. An interesting property is that we recover the lossless case of the classical wave equation: $\psi^0(t) = \delta(t)$ (as a limit in the sense of distributions), which does not have any of the regularity properties of $\psi^\varepsilon$ when $\varepsilon > 0$. Moreover, the family of functions obeys the following scaling law:

$$ \psi^\varepsilon(t) = \varepsilon^{-2}\, \psi^1\!\left( \frac{t}{\varepsilon^2} \right) \qquad (7.51) $$
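The wavetrain decomposition (7.49) and the scaling law (7.51) can be sketched in a few lines. The normalizing constant $1/(2\sqrt{\pi})$ used below is the classical one for the inverse Laplace transform of $e^{-\varepsilon\sqrt{s}}$; the text gives (7.50) only up to proportionality, so this constant, like the function names, is an assumption of the sketch.

```python
import math

def psi(t: float, eps: float) -> float:
    """Elementary lossy wave psi^eps(t) for eps > 0, zero for t <= 0.

    Closed form eps/(2*sqrt(pi)) * t**(-3/2) * exp(-eps**2/(4*t)), evaluated
    through logarithms to stay finite for very small t.
    """
    if t <= 0:
        return 0.0
    return eps / (2 * math.sqrt(math.pi)) * math.exp(-1.5 * math.log(t) - eps * eps / (4 * t))

def impulse_response(t: float, eps: float, rho: float) -> float:
    """Wavetrain sum (7.49): only the arrivals with 2k+1 < t contribute."""
    total, k = 0.0, 0
    while 2 * k + 1 < t:
        total += rho ** k * psi(t - (2 * k + 1), (2 * k + 1) * eps)
        k += 1
    return total

if __name__ == "__main__":
    eps, rho = 0.25, 0.8
    # Scaling law (7.51): both columns should agree
    for t in (0.01, 0.1, 1.0):
        print(psi(t, eps), eps ** -2 * psi(t / eps ** 2, 1.0))
    # Successive arrivals at t = 1, 3, 5, ...
    for t in (0.5, 1.02, 3.2, 5.5):
        print(t, impulse_response(t, eps, rho))
```

Because the $k$-th wave uses the parameter $(2k+1)\varepsilon$, the scaling law implies that its peak is lower by a factor $(2k+1)^{-2}$ and its duration longer by a factor $(2k+1)^{2}$, independently of $\rho$.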

Figure 7.10. (a) Fundamental solution $\psi^1$ of a 3D diffusion process; (b) scaling law of functions $\psi^\varepsilon$ for ε = 0.25 (continuous line), 0.75 (dotted line) and 1.25 (dashed line)

Figure 7.11. Impulse response of the model with losses: (a) for ε = 0.01, (b) for ε = 0.25

This scaling law is shown in Figure 7.10b. Thus, it is clear that the waves which appear successively in $h(t)$ in decomposition (7.49) have an increasingly low amplitude, but a temporal support (or width at half-height) which is increasingly large. We illustrate this phenomenon as follows. In Figure 7.11a, $\varepsilon = 0.01$ is very small and the supports of successive waves remain separate. To some extent this resembles the case without loss, but with successive amplitudes which decrease in a way nearly independent of the total reflection coefficient $\rho$, whereas in Figure 7.11b, $\varepsilon = 0.25$ can no longer be regarded as a perturbation, and the increasing spreading of the supports is confirmed.

7.5.3.2. Quasi-modal decomposition

To apply the inverse Laplace transform to (7.48) directly by using residue calculus, we must take into account the cut along the negative real axis, which is imposed by the multivalued character of $s \mapsto \sqrt{s}$.

It is precisely this cut which creates an integral term in the modal decomposition, sometimes called aperiodic multimode, since we can regard it as the superposition of a continuous infinity of damped exponentials; the structure of the impulse response is thus the following:

$$ h(t) = \sum_{n \in S^\varepsilon} c_n^\varepsilon\, e^{s_n^\varepsilon t} + \int_0^{\infty} \mu_\varepsilon(\xi)\, e^{-\xi t}\, d\xi \qquad (7.52) $$

where some of the $s_n^\varepsilon$ have possibly disappeared from the discrete sum over the indices in $S^\varepsilon$ (this occurs exactly when equation (7.47) does not have solutions such that $\mathrm{Re}(\sqrt{s}) > 0$). In the diffusive part, $\mu_\varepsilon = o(\varepsilon)$ shows that, in a certain way, the integral term is a perturbation of order $\varepsilon$.

NOTE 7.10.– It should be noted that we observe here a generalization of the decomposition result (7.39), the only difference being the countably infinite number of poles. For a detailed study of the family of diffusive representations $\xi \mapsto \mu_\varepsilon(\xi)$ indexed by $\varepsilon$, see Chapter 9 of [HEL 00].

7.5.3.3. Fractional modal decomposition

In (7.48), we decompose the meromorphic function into a series of elementary meromorphic functions which converges normally on every compact subset of the complex plane (see Chapter 5 of [CART 61]), by using:

$$ \frac{1}{\sinh(z)} = \sum_{n=-\infty}^{+\infty} \frac{(-1)^n}{z - in\pi} $$

which gives us:

$$ H(s) = \frac{1}{2\sqrt{\rho}} \sum_{n=-\infty}^{+\infty} \frac{(-1)^n}{\Gamma(s) + \alpha^0 - i(n\pi + \omega_0^0)}. $$

Lastly, we break up each term of the series into rational fractions of the variable $\sqrt{s}$:

$$ \frac{1}{s + \varepsilon\sqrt{s} - s_n^0} = \frac{1}{\sigma_n^{\varepsilon+} - \sigma_n^{\varepsilon-}} \left[ \frac{1}{\sqrt{s} - \sigma_n^{\varepsilon+}} - \frac{1}{\sqrt{s} - \sigma_n^{\varepsilon-}} \right] $$

where the $\sigma_n^{\varepsilon\pm}$ are the solutions of (7.47) written in the variable $\sigma = \sqrt{s}$: $\sigma^2 + \varepsilon\sigma = s_n^0$; then, we carry out an inverse Laplace transform, i.e.:

$$ h(t) = \sum_{n=-\infty}^{+\infty} c_n^\varepsilon \left[ E_{\frac{1}{2}}(\sigma_n^{\varepsilon+}, t) - E_{\frac{1}{2}}(\sigma_n^{\varepsilon-}, t) \right]. \qquad (7.53) $$
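The two algebraic steps leading to (7.53) can be verified numerically. The following minimal sketch, restricted to the case $0 < \rho < 1$ (so that $\omega_0^0 = 0$ and $\sqrt{\rho}$ is real), compares the closed form (7.48) with a symmetric partial sum of the modal series, and checks the splitting into simple fractions in $\sqrt{s}$. Function names and the truncation index are illustrative assumptions.

```python
import cmath
import math

def gamma_prop(s: complex, eps: float) -> complex:
    """Propagation constant Gamma(s) = s + eps*sqrt(s), principal square root."""
    return s + eps * cmath.sqrt(s)

def transfer_closed(s: complex, eps: float, rho: float) -> complex:
    """H(s) = exp(-Gamma) / (1 - rho*exp(-2*Gamma)), equation (7.48)."""
    g = gamma_prop(s, eps)
    return cmath.exp(-g) / (1 - rho * cmath.exp(-2 * g))

def transfer_modal(s: complex, eps: float, rho: float, n_max: int = 2000) -> complex:
    """Symmetric partial sum of the modal series, written for 0 < rho < 1 only."""
    z = gamma_prop(s, eps) - 0.5 * math.log(rho)      # Gamma(s) + alpha0
    total = 1 / z                                     # n = 0 term
    for n in range(1, n_max + 1):                     # pair the terms n and -n
        sign = -1 if n % 2 else 1
        total += sign * (1 / (z - 1j * n * math.pi) + 1 / (z + 1j * n * math.pi))
    return total / (2 * math.sqrt(rho))

def partial_fraction_gap(s: complex, eps: float, s0: complex) -> float:
    """|LHS - RHS| of the splitting of 1/(s + eps*sqrt(s) - s0) in sqrt(s)."""
    disc = cmath.sqrt(eps * eps + 4 * s0)
    sp, sm = (-eps + disc) / 2, (-eps - disc) / 2     # roots of sigma^2 + eps*sigma = s0
    r = cmath.sqrt(s)
    lhs = 1 / (s + eps * r - s0)
    rhs = (1 / (r - sp) - 1 / (r - sm)) / (sp - sm)
    return abs(lhs - rhs)

if __name__ == "__main__":
    s = 0.3 + 1.0j
    print(transfer_closed(s, 0.25, 0.8), transfer_modal(s, 0.25, 0.8))
    print(partial_fraction_gap(s, 0.25, complex(-0.1, math.pi)))
```

Pairing the terms $n$ and $-n$ makes the partial sums converge like $1/n^2$; summed one-sidedly, the series would only converge conditionally.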

This is exactly the fractional modal decomposition of order one half of the impulse response $h(t)$; it has the impulse response structure of a fractional differential equation, with the difference that finite sums become series in the case of our fractional partial differential equation (strictly speaking, it is advisable to examine in which sense the series of functions (7.53) converges, so as to give a precise analytical meaning to our formal expression).

7.5.4. Free problem

In addition, to completely analyze the model suggested, we considered the free problem, i.e., with zero control ($u \equiv 0$), but with non-zero initial conditions, i.e.:

$$ \Big[ \big(\partial_t^{1/2}\big)^4 + 2\varepsilon \big(\partial_t^{1/2}\big)^3 + \varepsilon^2 \big(\partial_t^{1/2}\big)^2 \Big] X - \partial_x^2 X = 0 \quad \text{for } t > 0 \text{ and } x \in\, ]0, 1[ $$

with given integer initial conditions (physical):

$$ X(t = 0, x) = f(x) \quad \text{and} \quad \big(\partial_t^{1/2}\big)^2 X(t = 0, x) = g(x) $$

half-integer initial conditions (abstract) equal to zero:

$$ \big(\partial_t^{1/2}\big)^1 X(t = 0, x) = \big(\partial_t^{1/2}\big)^3 X(t = 0, x) = 0 $$

and homogenous absorbing boundary conditions (7.45) ($u \equiv 0$). In [MAT 96a], we carried out an extension to the infinite dimension of the results established on fractional differential equations; we obtained the following half-integer modal decomposition of the solution:

$$ X(t, x) = \sum_{n=-\infty}^{+\infty} \Big[ c_n^{\varepsilon+}\, E_{1/2}(\sigma_n^{\varepsilon+}, t) + c_n^{\varepsilon-}\, E_{1/2}(\sigma_n^{\varepsilon-}, t) \Big]\, \psi_n(x) $$

where $(\sigma_n^{\varepsilon\pm})_n$ constitutes the half-integer temporal spectrum and where the $\psi_n$ are the spatial modes which, as for the equation without loss (i.e., with $\varepsilon = 0$), are given by $\psi_n(x) = r_0 \exp(-s_n^0 x) - \exp(s_n^0 x)$. In other words, the solution of the problem is a sum of space × time products, but contrary to the case of the integer temporal order, the time evolution of each spatial vibration mode is not exponential, but monogenic in nature and of half-order.

Lastly, with regard to the temporal decrease of the energy of the waves, it is, once again, by using the equivalent diffusive formulations that we can give a positive answer to this question (see [MAT 98b]). We would like to give some more insight on the use of diffusive representations for this type of FPDE:
– in the case of a varying cross-section, that is, when curvature is present, the above model becomes a bit more complex; it is sometimes called the Webster-Lokshin model (see [HÉL 06a] for the model itself and its resolution in the Laplace domain in the case of piecewise-constant coefficients);
– using the Hille-Yosida theory, the well-posedness of this system was first proved in [HAD 03], with full technical details in Chapter 2 of [HAD 08b];
– numerical schemes taking advantage of this equivalent reformulation have been proposed and analyzed in [HAD 08a], with full technical details in Chapter 3 of [HAD 08b];
– using the Arendt-Batty stability theorem, the asymptotic stability of this system was first proved in [MAT 06].

7.6. Conclusion

This chapter is only an introduction to fractional calculus and its multiple applications. Through the meticulous examination of the connections which exist between two definitions of fractional derivatives and the detailed asymptotic analysis of the fundamental solutions of these operators (the decrease at infinity in $t^{-1-\alpha}$ of stable solutions is the analytical expression of the physical phenomenon of long memory), we have shown the unquestionable contribution of a complex analytical approach, which always establishes the link between the time and frequency domains. We also hope to have illustrated the simplicity and the elegance of the diffusive representations which present, in our eyes, promising, if not decisive, advantages in the fields of modeling, analysis, approximation and identification, without being restricted to the fractional case.

7.7. Bibliography

[AUD 00] AUDOUNET J., MATIGNON D., MONTSENY G., “Diffusive representations of fractional and pseudo-differential operators”, in Research Trends in Science and Technology (Beirut, Lebanon), Lebanese American University, p. 171–180, March 2000.
[BAG 83a] BAGLEY R.L., TORVIK P.J., “Fractional calculus – A different approach to the analysis of viscoelastically damped structures”, AIAA J., vol. 21, no. 5, p. 741–748, 1983.

[BAG 83b] BAGLEY R.L., TORVIK P.J., “A theoretical basis for the application of fractional calculus to viscoelasticity”, J. Rheology, vol. 27, no. 3, p. 201–210, 1983.
[BAG 85] BAGLEY R.L., TORVIK P.J., “Fractional calculus in the transient analysis of viscoelastically damped structures”, AIAA J., vol. 23, no. 6, p. 918–925, 1985.
[BAG 86] BAGLEY R.L., TORVIK P.J., “On the fractional calculus model of viscoelastic behavior”, J. Rheology, vol. 30, no. 1, p. 133–155, 1986.
[BAG 91] BAGLEY R.L., CALICO R.A., “Fractional order state equations for the control of viscoelastically damped structures”, J. Guidance, Control, and Dynamics, vol. 14, no. 2, p. 304–311, 1991.
[BAT 54] BATEMAN H., Higher Transcendental Functions, McGraw-Hill, New York, vol. 3, chap. XVIII, p. 206–212, 1954.
[BON 00] BONNET C., PARTINGTON J.R., “Stabilization and nuclearity of fractional differential systems”, in Mathematical Theory of Networks and Systems Symposium (Perpignan, France), MTNS, June 2000.
[CAP 76] CAPUTO M., “Vibrations of an infinite plate with a frequency independent Q”, J. Acoust. Soc. Amer., vol. 60, no. 3, p. 634–639, 1976.
[CARP 97] CARPINTERI A., MAINARDI F. (Eds.), Fractals and Fractional Calculus in Continuum Mechanics, Springer-Verlag, CISM Courses and Lectures 378, 1997.
[CART 61] CARTAN H., Théorie élémentaire des fonctions analytiques d’une ou plusieurs variables complexes, Collection Enseignement des sciences, Hermann, Paris, 1961.
[DAN 94] D’ANDRÉA-NOVEL B., COHEN DE LARA M., Commande linéaire des systèmes dynamiques, Masson, Paris, Collection MASC, 1994.
[DAUP 00] DAUPHIN G., HELESCHEWITZ D., MATIGNON D., “Extended diffusive representations and application to non-standard oscillators”, in Mathematical Theory of Networks and Systems Symposium (Perpignan, France), MTNS, June 2000.
[DAUT 84a] DAUTRAY R., LIONS J.L., Analyse mathématique et calcul numérique pour les sciences et les techniques, vol. 8, chap. XVIII, p. 774–785, Masson, Paris, 1984.
[DAUT 84b] DAUTRAY R., LIONS J.L., Analyse mathématique et calcul numérique pour les sciences et les techniques, vol. 7, chap. XVI, p. 333–337, Masson, Paris, 1984.
[FEL 00] FELLAH Z.E.A., DEPOLLIER C., “Transient acoustic wave propagation in rigid porous media: A time-domain approach”, J. Acoust. Soc. Amer., vol. 107, no. 2, p. 683–688, 2000.
[GAS 90] GASQUET G., WITOMSKI P., Analyse de Fourier et applications. Filtrage, calcul numérique, ondelettes, Masson, Paris, 1990.
[GIO 92] GIONA M., ROMAN H.E., “Fractional diffusion equation on fractals: one-dimensional case and asymptotic behaviour”, Journal of Physics A: Mathematical and General, vol. 25, p. 2093–2105, 1992.
[GUE 72] GUELFAND I.M., CHILOV G.E., Les distributions, volume 1, Dunod, Paris, Monographies universitaires de mathématiques 8, 1972.

[HAD 03] HADDAR H., HÉLIE T., MATIGNON D., “A Webster-Lokshin model for waves with viscothermal losses and impedance boundary conditions: strong solutions”, Mathematical and Numerical Aspects of Wave Propagation, p. 66–71, Springer Verlag, 2003.
[HAD 08a] HADDAR H., LI J.-R., MATIGNON D., “Efficient solution of a wave equation with fractional order dissipative terms”, J. Comput. & Appl. Maths, 2008, forthcoming.
[HAD 08b] HADDAR H., MATIGNON D., Theoretical and numerical analysis of the Webster-Lokshin model, Report, Institut National de la Recherche en Informatique et Automatique (INRIA), 2008, Research Report no. 6558.
[HEL 00] HELESCHEWITZ D., Analyse et simulation de systèmes différentiels fractionnaires et pseudo-différentiels linéaires sous représentation diffusive, PhD Thesis, ENST, December 2000.
[HÉL 06a] HÉLIE T., MATIGNON D., “Diffusive representations for the analysis and simulation of flared acoustic pipes with visco-thermal losses”, Math. Models Meth. Appl. Sci., vol. 16, p. 503–536, January 2006.
[HÉL 06b] HÉLIE T., MATIGNON D., “Representations with poles and cuts for the time-domain simulation of fractional systems and irrational transfer functions”, Signal Processing, vol. 86, p. 2516–2528, July 2006.
[KOE 84] KOELLER R.C., “Applications of fractional calculus to the theory of viscoelasticity”, J. Appl. Mech., vol. 51, p. 299–307, 1984.
[KOE 86] KOELLER R.C., “Polynomial operators, Stieltjes convolution, and fractional calculus in hereditary mechanics”, Acta Mech., vol. 58, p. 251–264, 1986.
[KOL 69] KÖLBIG K.S., Laplace transform, Lectures in the academic training programme, CERN, Geneva, Switzerland, 1968–1969.
[LEM 90] LE MEHAUTÉ A., Les géométries fractales, Hermes, Paris, 1990.
[LOK 78a] LOKSHIN A.A., “Wave equation with singular retarded time”, Dokl. Akad. Nauk SSSR, vol. 240, p. 43–46, 1978.
[LOK 78b] LOKSHIN A.A., ROK V.E., “Fundamental solutions of the wave equation with retarded time”, Dokl. Akad. Nauk SSSR, vol. 239, p. 1305–1308, 1978.
[MAT 94] MATIGNON D., Représentations en variables d’état de modèles de guides d’ondes avec dérivation fractionnaire, PhD Thesis, University of Paris XI, November 1994.
[MAT 95a] MATIGNON D., D’ANDRÉA-NOVEL B., “Décomposition modale fractionnaire de l’équation des ondes avec pertes viscothermiques”, in Journées d’études : les systèmes d’ordre non entier en automatique, Groupe de recherche Automatique du CNRS, April 1995.
[MAT 95b] MATIGNON D., D’ANDRÉA-NOVEL B., “Spectral and time-domain consequences of an integro-differential perturbation of the wave PDE”, in Third International Conference on Mathematical and Numerical Aspects of Wave Propagation Phenomena (Mandelieu, France), INRIA, SIAM, p. 769–771, April 1995.


[MAT 96a] MATIGNON D., “Fractional modal decomposition of a boundary-controlled-and-observed infinite-dimensional linear system”, in Mathematical Theory of Networks and Systems (Saint-Louis, Missouri), MTNS, June 1996.

[MAT 96b] MATIGNON D., “Stability results for fractional differential equations with applications to control processing”, in Computational Engineering in Systems Applications (Lille, France), IMACS, IEEE-SMC, vol. 2, p. 963–968, July 1996.

[MAT 96c] MATIGNON D., D’ANDRÉA-NOVEL B., “Some results on controllability and observability of finite-dimensional fractional differential systems”, in Computational Engineering in Systems Applications (Lille, France), IMACS, IEEE-SMC, vol. 2, p. 952–956, July 1996.

[MAT 97] MATIGNON D., D’ANDRÉA-NOVEL B., “Observer-based controllers for fractional differential systems”, in Conference on Decision and Control (San Diego, California), IEEE-CSS, SIAM, p. 4967–4972, December 1997.

[MAT 98a] MATIGNON D., “Stability properties for generalized fractional differential systems”, in ESAIM: Proceedings, vol. 5, p. 145–158, December 1998 (available at http://www.edpsciences.org/articlesproc/Vol.5/).

[MAT 98b] MATIGNON D., AUDOUNET J., MONTSENY G., “Energy decay for wave equations with damping of fractional order”, in Fourth International Conference on Mathematical and Numerical Aspects of Wave Propagation Phenomena (Golden, Colorado), INRIA, SIAM, p. 638–640, June 1998.

[MAT 98c] MATIGNON D., MONTSENY G. (Eds.), Fractional differential systems: Models, methods, and applications, ESAIM, Proceedings 5, December 1998 (available at http://www.edpsciences.org/articlesproc/Vol.5/).

[MAT 05] MATIGNON D., PRIEUR C., “Asymptotic stability of linear conservative systems when coupled with diffusive systems”, ESAIM Control Optim. Calc. Var., vol. 11, p. 487–507, July 2005.

[MAT 06] MATIGNON D., “Asymptotic stability of the Webster-Lokshin model”, Mathematical Theory of Networks and Systems (MTNS), Kyoto, Japan, 11 p. (CD-Rom), July 2006, (invited session).

[MAT 08] MATIGNON D., ZWART H., “Standard diffusive systems as well-posed linear systems”, 2008, submitted.

[MIL 93] MILLER K.S., ROSS B., An Introduction to the Fractional Calculus and Fractional Differential Equations, John Wiley & Sons, 1993.

[MIT 04] MITTAG-LEFFLER G., “Sur la représentation analytique d’une branche uniforme d’une fonction monogène”, Acta Math., vol. 29, p. 101–168, 1904.

[MON 97] MONTSENY G., AUDOUNET J., MATIGNON D., “Fractional integro-differential boundary control of the Euler-Bernoulli beam”, in Conference on Decision and Control (San Diego, California), IEEE-CSS, SIAM, p. 4973–4978, December 1997.

[MON 98] MONTSENY G., “Diffusive representation of pseudo-differential time-operators”, in ESAIM: Proceedings, vol. 5, p. 159–175, December 1998 (available at http://www.edpsciences.org/articlesproc/Vol.5/).


[MON 00] MONTSENY G., AUDOUNET J., MATIGNON D., “Diffusive representation for pseudo-differentially damped non-linear systems”, in ISIDORI A., LAMNABHI-LAGARRIGUE F., RESPONDEK W. (Eds.), Nonlinear Control in the Year 2000 (Paris, France), Springer-Verlag, vol. 2, p. 163–182, 2000.

[OLD 74] OLDHAM K.B., SPANIER J., The Fractional Calculus, Academic Press, New York and London, 1974.

[OUS 83] OUSTALOUP A., Systèmes asservis linéaires d’ordre fractionnaire, Masson, Paris, Série Automatique, 1983.

[POD 99] PODLUBNY I., Fractional Differential Equations, Academic Press, Mathematics in Science and Engineering 1998, 1999.

[POL 91] POLACK J.D., “Time domain solution of Kirchhoff’s equation for sound propagation in viscothermal gases: A diffusion process”, J. Acoustique, vol. 4, p. 47–67, 1991.

[RUS 78] RUSSELL D.L., “Controllability and stabilizability theory for linear partial differential equations: Recent progress and open questions”, SIAM Rev., vol. 20, no. 4, p. 639–739, 1978.

[SAM 87] SAMKO S.G., KILBAS A.A., MARICHEV O.I., Fractional Integrals and Derivatives: Theory and Applications, Gordon and Breach, 1987.

[SCH 65] SCHWARTZ L., Méthodes mathématiques pour les sciences physiques, Hermann, Paris, Collection Enseignement des sciences, 1965.

[SON 90] SONTAG E.D., Mathematical Control Theory. Deterministic Finite Dimensional Systems, Springer-Verlag, Texts in Applied Mathematics 6, 1990.

[STA 94] STAFFANS O.J., “Well-posedness and stabilizability of a viscoelastic equation in energy space”, Trans. Amer. Math. Soc., vol. 345, no. 2, p. 527–575, 1994.

[TAY 96] TAYLOR M.E., Partial Differential Equations. II: Qualitative Studies of Linear Equations, Springer-Verlag, Applied Mathematical Sciences 116, 1996.

[TOR 84] TORVIK P.J., BAGLEY R.L., “On the appearance of the fractional derivative in the behavior of real materials”, J. Appl. Mech., vol. 51, p. 294–298, 1984.


Chapter 8

Fractional Synthesis, Fractional Filters

Chapter written by Liliane BEL, Georges OPPENHEIM, Luc ROBBIANO and Marie-Claude VIANO.

8.1. Traditional and less traditional questions about fractionals

Linear finite impulse response filters enable the design of the well-known moving average (MA) processes. The corresponding inverse filters are used to construct autoregressive (AR) processes. In this chapter, we study a family of filters receiving growing interest: fractional filters. They enable the definition of fractional processes as well as that of fractional Brownian motion.

8.1.1. Notes on terminology

The word “fractional” is applied to both filters and processes. The term is at once apt and inapt. On the one hand, it is apt because it refers to the interesting class of non-integer derivative differential equations, which are constant-coefficient equations with non-integer derivatives of orders $n_1, n_2, \ldots, n_p$ that are multiples of a basic fraction [MIL 93, OLD 74]. On the other hand, the term is inappropriate because, for most mathematical or filtering issues, the exponents need not be fractions: they can be irrational real numbers, or even complex numbers. Nevertheless, as the term is commonly used in the literature, and as it is of no major consequence, we will continue using it.

8.1.2. Short and long memory

Many stochastic models with short memory are known, e.g., independent variables, m-dependent variables, certain Markov processes, moving average, most of the autoregressive moving average (ARMA) processes and many linear processes. The key consequence of short memory is that many limit theorems often hold, such as laws of large numbers, central limit theorems and large deviation theorems.

With long memory, the situation is different. It has been more than a century since astronomers first noticed the existence of empirical series with persistent memory. Since then, similar phenomena have been observed, especially in chemistry, hydrology, climatology and economics. Such series obviously pose interesting statistical problems and their modeling as well as statistical processing have always been burning issues. See [BER 94] for historical examples modeled with fractional Brownian motion. Let us mention the existence of two reviews of long-range dependence, one written by Cox [COX 84] and the other by Beran [BER 94]. Moreover, a bibliographical guide has been created in [TAQ 92].

We focus on second order stationary processes, assuming the existence of a spectral density $f$. In this case, long memory is customarily characterized by the non-summability of the autocovariance or by the unboundedness of the spectral density at the origin. However, the two phenomena are not equivalent. Indeed, there exist fractional ARMA family processes whose spectral density is finite and non-zero, although their autocorrelation is non-summable. Furthermore, some authors have recently emphasized how useful spectral density models with singularities away from the origin are for explaining periodicity persistence. This is why we define long memory as the existence of a frequency $\lambda_0$ in the neighborhood of which the spectral density $f(\lambda)$ behaves as $|\lambda - \lambda_0|^{2d}$ where $d$ is negative. Additionally, the case where $2d$ is a positive non-integer defines the intermediate memory situation.

8.1.3. From integer to non-integer powers: filter based sample path design

Let us begin with discrete time.
The well-known and often studied ARMA processes are defined as solutions of recurrence equations whose right-hand side is a simple process:

$$X(t+1) - a_0 X(t) - a_1 X(t-1) + \cdots - a_p X(t-p) = b_0\,\varepsilon(t) - b_1\,\varepsilon(t-1) + \cdots - b_q\,\varepsilon(t-q)$$

where $\varepsilon$ is an iid white noise, the innovation of the process. By defining two polynomials and a rational fraction:

$$A(z) = \sum_{i=0}^{p} a_i z^i, \qquad B(z) = \sum_{j=0}^{q} b_j z^j, \qquad F(z) = \frac{B(z)}{A(z)}, \qquad z \in \mathbb{C}$$


we can rewrite the above equation as $X(t+1) = F(B)\varepsilon(t)$ where $B$ is the usual delay operator: $B\varepsilon(t) = \varepsilon(t-1)$. By factorizing polynomials, we obtain $F(z) = \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j}$ where the $d_j$ are integers of either sign. This is how we can represent autoregressive moving average family processes. The filters $F$ have impulse responses with an exponential decay towards 0.

To design long memory processes, a straightforward approach consists of introducing filters whose impulse response decays slowly towards 0. Granger and Joyeux [GRAN 80], Hosking [HOS 81] and Gonçalves [GON 87] showed in their articles that we can obtain long memory by letting $d$ in $F(z) = (1 - \alpha z)^d$ take appropriate negative non-integer values and $\alpha = \pm 1$. Then the impulse responses do not decay exponentially and are not summable, but only square summable, which corresponds to a slow decay. These processes have unbounded spectral densities at $\lambda = 0$ and $\lambda = \pi$. We can combine $F$ with another autoregressive moving average filter. The autocovariance function of the corresponding process behaves like $k^{-2d-1}$ when $k$ tends towards $+\infty$ and its spectral density behaves like $\lambda^{2d}$ close to 0, which implies long memory when $d < 0$.

If the exponents $d_j$ in $F(z) = \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j}$ are not all integers, we end up with the family of fractional filters. The processes obtained by filtering a white noise with $F$, when $F$ belongs to this family, are called fractional ARMA processes. Their behavior, as far as memory properties, sample path regularity (for continuous time) and singularities of spectral density are concerned, is richer and more complex than that of traditional ARMA processes. This chapter studies this family, both in discrete and continuous time. An extension to distribution processes is also proposed.

8.1.4. Local and global properties

In continuous time, the preferred prototype for a fractional process is fractional Brownian motion. Introduced by Mandelbrot in 1968, this process is non-stationary, Gaussian, centered and dependent on a single parameter $H$ in $]0, 1[$. The autocovariance kernel reads $\frac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t - s|^{2H}\right)$. If $H = \frac{1}{2}$, we obtain ordinary Brownian motion. Several properties account for the success of these processes: they are $H$-self-similar in law and almost all their sample paths have a Hausdorff dimension equal to $2 - H$. Parameter $H$ plays a central role. It regulates both global and local properties:
– on the one hand, memory properties: if $H > \frac{1}{2}$, the increment process has long memory; if $H \le \frac{1}{2}$, memory is short;
– on the other hand, $H$ also determines the sample path regularity: the larger $H$ is, the more regular the sample path.

For continuous time, there exists a fractional ARMA family whose definition is identical to that of discrete time. Its parametric richness enables us to disconnect


regularity properties from memory properties: the parameters that regulate trajectories and memory range are no longer the same. Continuous time fractional ARMAs are stationary, but are not self-similar.

8.2. Fractional filters

8.2.1. Desired general properties: association

The family of ARMA filters, i.e. “rational” filters, is closed under association in series, in parallel and even in feedback loops. It is one of their main properties. Fractional filters offer a less advantageous situation. Two fractional filters associated in series yield a fractional filter. However, if we move to parallel association, the resulting filter no longer belongs to the fractional filter family. In general, the sum of any two power law transfer functions (of the type $z^d$) is not a power law function, except if all the exponents $d_j$ are integers. In other words, as opposed to the family of ARMA filters, that of fractional ARMA filters is not stable under parallel association.

We wish to compensate for this drawback. We can extend the fractional filter family to a class of filters which remains stable under parallel or series associations. A way of achieving this consists of adding to fractional filters all the sums of these filters. In fact, a good way of extending the family is to ensure that the two following properties are satisfied. The first relates to the localization of the filter’s singular points, which must be situated in a well-chosen area. The second property relates to the growth at infinity of the filter’s transfer function in the complex plane, which must not be too fast.

In the discrete case, if they exist, the singularities are required not to be too unstable and, if they are oscillating, $F$ should be regular in their vicinity. This family contains fractional filters and shares a common property with that of traditional ARMA filters: they are closed under both serial and parallel associations. In continuous time, $F$ must be holomorphic in a conical area containing the right half-plane and its growth at infinity must be approximately that of a power law function.

8.2.2. Construction and approximation techniques

The approximation of the filter $F$ by polynomials or rational fractions of the complex variable $z$ is a traditional problem. The approximate filter $F_a$ is used to filter a white noise in order to create an MA type approximation of the initial process. As for polynomial approximations, authors resort to truncated series expansions of $F(z)$. The expansions are made in the basis of the $z^n$ or in the basis of Gegenbauer


polynomials [GRAY 89]. Expansion coefficients are easily calculated by recurrence. Because of this property, it is possible to calculate hundreds of thousands of terms (290,000 in [GRAY 89] to approximate a filter having two roots on the unit circle). Moreover, linear recurrences still exist for the coefficients of the impulse response of general fractional filters. They remain linear, but the coefficients are affine functions of time. In simple cases, the quadratic upper bound for the remainder of the truncated series is easy to determine. Nevertheless, the number of terms increases with the memory range. To construct a very long memory process, a moving average process of a gigantic order is necessary. However, the simplicity of this procedure largely accounts for its success.

Traditionally, the analyticity of $F$ and an $L^\infty$-norm criterion are both used. Other approximations are studied to solve stochastic problems that involve the properties of $F$ on the imaginary axis in continuous time or on the unit circle in discrete time.

Rational approximations are rarer although more promising. The principle lies in the search for an ARMA filter which minimizes a certain criterion, in general of type $L^2$. In [WHI 86], Whitfield brings together several ideas of approximations by a rational fraction. Certain criteria include a weighting function which essentially ensures the approximation in a frequency band. The procedures are built on linear or non-linear least square algorithms, recursive or not. The integral is replaced by a sum, or approximated by a trapezoid method.

Various authors carry out approximations with an interesting intuitive sense. Let us quote Oustaloup [OUS 91], who chooses $2n+1$ zeros $z_j$ and $2n+1$ real poles $p_j$, so that the ratios $\frac{p_j}{z_j}$ and $\frac{z_{j+1}}{p_j}$ do not depend on $j$. In [BON 92, CUR 86] approximations of Hankel matrices $H$ were studied, whose first line consists of the desired impulse response. Calculations are carried out satisfactorily for the processes with intermediate memory. We then have access to an $H^\infty$ upper bound for the approximation error.

Baratchart et al. [BARA 91] perfected a powerful algorithm, which we describe briefly. A linear system is considered, with constant coefficients, strictly causal, single input and single output. Let $(f_1, f_2, \ldots, f_m, \ldots)$ be its impulse response and $f$ defined in the complex plane by $f(z) = \sum_{m=1}^{+\infty} f_m z^{-m}$. We assume that $f$ belongs to the Hardy space $H_2^-$, i.e., it is square integrable on the unit circle. We then seek a rational fraction $r$ of maximum order $n$ (to be determined) in $H_2^-$ that minimizes the criterion $\|f - r\|_2^2$. The problem at hand is that of minimizing a functional $\Psi_n(q)$, where $q$ is a polynomial of $P_n^1$, the space of the real polynomials of maximum degree $n$ whose roots are inside the unit disc and such that the coefficient of the highest degree is equal to one.

It is shown that if $f$ is holomorphic in the vicinity of the unit circle, then $\Psi_n$ can extend to a regular function on $\Delta_n$, the closure of $P_n^1$ in $\mathbb{R}^n$. The closure $\Delta_n$ is the set of polynomials of degree $n$ whose roots are in the closed unit disc. A polynomial on the boundary $\partial\Delta_n$ of $\Delta_n$ can then be factorized into a polynomial of degree $k$, every root of which is of modulus 1, and a polynomial $q_i$ interior to $\Delta_{n-k}$. Moreover, if $q_i$ is a critical point of $\Psi_{n-k}$, then $\nabla_n(q)$, the gradient of $\Psi_n$ at point $q$, is orthogonal to $\partial\Delta_n$ and points outwards. By supposing, moreover, that $\nabla_k$ is non-zero on $\partial\Delta_k$ and that the critical points of $\Psi_k$ in $\Delta_k$ are non-degenerate for $1 \le k \le n$, the following method can give a local minimum:
1) we choose a point $q_0$ interior to $\Delta_n$ as initial condition and integrate the vector field $-\nabla_n$;
2) either a critical point is reached: if it is a local minimum, the procedure is completed; otherwise, since it is non-degenerate, it is unstable under small disturbances and the procedure can continue;
3) or the boundary $\partial\Delta_n$ is reached at $q_b$: then $q_b$ is decomposed into $q_b = q_u q_i$ and we go back to stage 1) with $q_i$ and $\Delta_{n-k}$.

We end up reaching a minimum $q_m$ of $\Psi_m$ ($1 \le m \le n$) that gives, by a simple transformation, a local minimum of $\Psi_n$. This algorithm never meets the same point twice and convergence towards a local minimum is guaranteed. It was extended to the multivariable case and to time-varying coefficients (see [BARA 98]).

8.3. Discrete time fractional processes

8.3.1. Filters: impulse responses and corresponding processes

Let us consider the fractional filter (of variable $z$) parameterized by $\alpha_j$ and $d_j$:

$$F(z) = \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j}, \qquad |z| < a \tag{8.1}$$

where $a$ is defined below. The $\alpha_j$ are non-zero complex numbers. Let us note that $\alpha_j^{-1}$ is a singular point of $F$ if $d_j \notin \mathbb{N}$. The following notations will be used:
1) $E^{*}$ is the set of the singular points of $F$;
2) $a = \min\{|\alpha_j|^{-1},\ j \in E^{*}\}$;
3) $E^{**}$ is the set of the indices of the singular points whose modulus is equal to $a$;
4) $d = \min\{\mathrm{Re}(d_j),\ j \in E^{**}\}$;
5) $E^{***}$ is the subset of $E^{**}$ corresponding to the singular points for which $\mathrm{Re}(d_j) = d$.

In (8.1), each factor is selected to satisfy $(1 - \alpha_j z)^{d_j} = 1$ when $z = 0$. In the domain $|z| < a$, $F$ admits the series expansion:

$$F(z) = 1 + \sum_{j=1}^{+\infty} a_j z^j$$

where the sequence $(a_j)_{j \ge 1}$ is the convolution product of the expansions of the $J$ factors in (8.1). When $n$ tends to infinity and if $F$ is not a polynomial, i.e., if $E^{*} \ne \emptyset$, then:

$$a_n = \Biggl(\sum_{j \in E^{***}} C_0^{(j)}\, \alpha_j^{n}\, \frac{\Gamma(n - d_j)}{\Gamma(-d_j)\, n!}\Biggr) \bigl(1 + o(1)\bigr) \quad \text{when } n \to +\infty \tag{8.2}$$

where $C_0^{(j)} = \prod_{m \ne j} \Bigl(1 - \frac{\alpha_m}{\alpha_j}\Bigr)^{d_m}$.
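The asymptotic equivalent (8.2) can be checked numerically. The following is only a sketch under stated assumptions: the filter $(1 - z)^{-0.3}(1 - 0.5z)^{0.7}$ and the truncation order are hypothetical choices made for illustration, and the helper names `binomial_series` and `impulse_response` are ours. With these values $a = 1$, $E^{***}$ contains the single index of $\alpha = 1$, and $C_0 = (1 - 0.5)^{0.7}$.

```python
import math
import numpy as np

def binomial_series(alpha, d, n_terms):
    """Coefficients of (1 - alpha*z)**d in powers of z, equal to 1 at z = 0."""
    c = np.empty(n_terms, dtype=complex)
    c[0] = 1.0
    for n in range(1, n_terms):
        # ratio of consecutive terms of Gamma(n - d) / (Gamma(-d) n!) times alpha
        c[n] = c[n - 1] * alpha * (n - 1 - d) / n
    return c

def impulse_response(alphas, ds, n_terms):
    """Coefficients a_n of F(z) = prod_j (1 - alpha_j z)**d_j by convolving the factors."""
    a = np.zeros(n_terms, dtype=complex)
    a[0] = 1.0
    for alpha, d in zip(alphas, ds):
        a = np.convolve(a, binomial_series(alpha, d, n_terms))[:n_terms]
    return a

# Hypothetical example: one singularity on the unit circle, one outside it.
alphas = [1.0, 0.5]
ds = [-0.3, 0.7]
N = 2001
a = impulse_response(alphas, ds, N)

# Leading term of (8.2) for the only index in E***, i.e. alpha = 1.
n = N - 1
c0 = (1 - alphas[1] / alphas[0]) ** ds[1]
leading = (c0 * alphas[0] ** n
           * math.exp(math.lgamma(n - ds[0]) - math.lgamma(n + 1))
           / math.gamma(-ds[0]))
ratio = (a[n] / leading).real
print("a_n / leading term of (8.2) at n = %d: %.6f" % (n, ratio))
```

The ratio should approach 1 as the index grows; the log-gamma function is used only to avoid overflow in $\Gamma(n - d_j)/n!$.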

From this we deduce that, if $F$ is not a polynomial, then:
1) $(a_j)$ belongs to $l^1(\mathbb{N})$ if and only if $a > 1$ or ($a = 1$ and $d > 0$);
2) $(a_j)$ belongs to $l^2(\mathbb{N})$ if and only if $a > 1$ or ($a = 1$ and $d > -\frac{1}{2}$).

Consequently, $\sum_{1}^{\infty} |a_j| = +\infty$ and $\sum_{1}^{\infty} |a_j|^2 < +\infty$ if and only if $a = 1$ and $d \in\, ]-\frac{1}{2}, 0]$.

Let us consider a transfer function of the form (8.1) with $\sum_{1}^{\infty} |a_j|^2 < +\infty$, meaning that either $F$ is a polynomial, or:

$$a > 1 \quad \text{or} \quad \Bigl(a = 1 \ \text{and}\ d > -\tfrac{1}{2}\Bigr) \tag{8.3}$$

Let us suppose now that if $(\alpha_k, d_k) \notin \mathbb{R}^2$, then the “conjugated” factor $(1 - \bar{\alpha}_k z)^{\bar{d}_k}$ also appears in the right-hand side of (8.1). Then $(a_j)$ is a real sequence. If $(\varepsilon(n))$ is a white noise, the process defined by:

$$X(n) = \varepsilon(n) + \sum_{j=1}^{\infty} a_j\, \varepsilon(n - j) \tag{8.4}$$

is a second order stationary process, zero-mean, linear and regular, with a spectral density proportional to $f(\lambda) = \bigl|\prod_{j=1}^{J} (1 - \alpha_j \exp(i\lambda))^{d_j}\bigr|^2$. Moreover, if $(\varepsilon(n))$ is an iid sequence, $(X(n))$ is strictly stationary and ergodic. This process admits an autoregressive expansion of infinite order:

$$X(n) + \sum_{j=1}^{\infty} b_j X(n - j) = \varepsilon(n) \quad \text{with} \quad \sum_{j=1}^{\infty} b_j^2 < +\infty$$

Under the following conditions:
1) $|\alpha_j| \le 1$ for all $j \in \{1, \ldots, J\}$;
2) if $|\alpha_j| = 1$, then $\mathrm{Re}(d_j) \in\, ]-\frac{1}{2}, \frac{1}{2}]$,
when $n$ tends towards infinity, we have:
1) $a_n = O(a^{-n})$ and $b_n = O(b^{-n})$ when $a > 1$;
2) $a_n = O(n^{-d-1})$ and $b_n = O(n^{\delta - 1})$ when $a = 1$,
with $b = \min\{|\alpha_j|^{-1},\ d_j \notin \mathbb{N}\}$ and $\delta = \max\{\mathrm{Re}(d_j),\ |\alpha_j| = 1\}$.

The coefficients $(a_j)$ are the solution of the linear difference equation of order $J$, whose coefficients are affine in $n$:

$$n\, a_n + \sum_{k=1}^{J} \bigl[(n - k)\, q_k - p_{k-1}\bigr]\, a_{n-k} = 0, \qquad n \ge 1 \tag{8.5}$$

with $a_j = 0$ if $j < 0$ and $a_0 = 1$. The $p_j$ and $q_j$ are respectively the coefficients of the two relatively prime polynomials $P$ and $Q$ defined by $Q(0) = 1$ and:

$$\frac{P(z)}{Q(z)} = \frac{F'(z)}{F(z)} = \sum_{j=1}^{J} \frac{-\alpha_j d_j}{1 - \alpha_j z}$$
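The recurrence (8.5) translates directly into a short routine. The sketch below is an illustration, not the authors' implementation: it assumes distinct $\alpha_j$ and non-zero $d_j$ (so that $Q(z) = \prod_j (1 - \alpha_j z)$ and the corresponding $P$ have no common factor), the function names are ours, and the three-factor filter used as an example is hypothetical. The result is compared with the direct convolution of the binomial expansions of the factors.

```python
import numpy as np

def log_derivative_polys(alphas, ds):
    """Ascending coefficients of P and Q such that P/Q = F'/F and Q(0) = 1."""
    q = np.array([1.0 + 0.0j])
    for alpha in alphas:
        q = np.convolve(q, [1.0, -alpha])            # Q(z) = prod_j (1 - alpha_j z)
    p = np.zeros(len(alphas), dtype=complex)
    for j, (alpha_j, d_j) in enumerate(zip(alphas, ds)):
        term = np.array([-alpha_j * d_j], dtype=complex)
        for m, alpha_m in enumerate(alphas):
            if m != j:
                term = np.convolve(term, [1.0, -alpha_m])
        p += term                                    # every term has degree J - 1
    return p, q

def impulse_response_recurrence(alphas, ds, n_terms):
    """Coefficients a_0, ..., a_{n_terms-1} obtained from the recurrence (8.5)."""
    p, q = log_derivative_polys(alphas, ds)
    order = len(alphas)
    a = np.zeros(n_terms, dtype=complex)
    a[0] = 1.0
    for n in range(1, n_terms):
        ks = np.arange(1, min(order, n) + 1)         # a_{n-k} = 0 when n - k < 0
        a[n] = -np.sum(((n - ks) * q[ks] - p[ks - 1]) * a[n - ks]) / n
    return a

def impulse_response_direct(alphas, ds, n_terms):
    """Same coefficients, by convolving the binomial series of each factor."""
    a = np.zeros(n_terms, dtype=complex)
    a[0] = 1.0
    for alpha, d in zip(alphas, ds):
        c = np.ones(n_terms, dtype=complex)
        for n in range(1, n_terms):
            c[n] = c[n - 1] * alpha * (n - 1 - d) / n
        a = np.convolve(a, c)[:n_terms]
    return a

# Hypothetical filter: a conjugate pair on the unit circle and one real factor.
theta = 2 * np.pi * 0.231
alphas = [np.exp(1j * theta), np.exp(-1j * theta), 0.5]
ds = [-0.2, -0.2, 1.5]
a_rec = impulse_response_recurrence(alphas, ds, 200)
a_dir = impulse_response_direct(alphas, ds, 200)
max_gap = np.max(np.abs(a_rec - a_dir))
print("largest gap between the two computations:", max_gap)
```

Each new coefficient costs $O(J)$ operations with the recurrence, against $O(n)$ per coefficient for the convolution, which is what makes very long expansions affordable.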

The coefficients $b_j$ are the solution of the same difference equation, replacing $P$ by $-P$. These equations are useful to calculate the coefficients $a_j$ and $b_j$, which enables us to obtain simulations or forecasts for the process $X(n)$.

8.3.2. Mixing and memory properties

We study two characterizations of the memory structure: the rate of decay of the covariance and the mixing coefficients. The covariance of the process $X(n)$ is given by $\sigma(n) = \int_{-\pi}^{\pi} \exp(in\lambda) f(\lambda)\, d\lambda$. When $n$ tends to $+\infty$, if $F$ is a polynomial of degree $k$, then $\sigma(n) = 0$ for $n > k$, otherwise:
1) if $a > 1$, then $\sigma(n) = O(a^{-n})$;
2) if $a = 1$, then $\sigma(n) = \bigl(\sum_{j \in E^{***}} \gamma_j\, \alpha_j^{n}\, n^{-1-2d_j}\bigr)(1 + o(1))$.

Consequently, the covariance $(\sigma(n))$ is not absolutely summable, and thus $X(n)$ has long memory, if and only if $a = 1$ and $d \in\, ]-\frac{1}{2}, 0]$, i.e., if a singularity is located on the unit circle.

Another approach is to study mixing coefficients. Mixing coefficients measure a certain form of dependence between sigma-algebras. More precisely, if $\mathcal{A}$ and $\mathcal{B}$ are two sigma-algebras, the strong mixing coefficient $\alpha$ is defined by:

$$\alpha(\mathcal{A}, \mathcal{B}) = \sup_{A \in \mathcal{A},\, B \in \mathcal{B}} \bigl|P(A \cap B) - P(A)P(B)\bigr|$$

the mixing coefficient of absolute regularity $\beta$ by:

$$\beta(\mathcal{A}, \mathcal{B}) = \mathbb{E}\Bigl[\sup_{B \in \mathcal{B}} \bigl|P(B \mid \mathcal{A}) - P(B)\bigr|\Bigr]$$

and the mixing coefficient of maximum correlation $\rho$ by:

$$\rho(\mathcal{A}, \mathcal{B}) = \sup_{X \in L^2(\mathcal{A}),\, Y \in L^2(\mathcal{B})} \bigl|\mathrm{corr}(X, Y)\bigr|$$

It is said that a process $X$ is mixing if the sigma-algebras generated by $\{X(t), t \in I\}$ and $\{X(t), t \in J\}$ have a mixing coefficient which tends towards 0 when $d(I, J)$ tends towards infinity. Here, we will use the mixing coefficient:

$$r(n) = \sup\bigl\{|\mathrm{cov}(u, v)|,\ u \in H_{-\infty}^{0},\ v \in H_{n}^{+\infty},\ \mathrm{Var}(u) = \mathrm{Var}(v) = 1\bigr\}$$

where $H_r^s$ is the subspace of $L^2$ generated by $(X(j), j \in \{r, \ldots, s\})$. Then, $(X(n))$ is $r$-mixing if and only if $F$ is a polynomial or if $a > 1$ and, in the latter case, $r(n) = O(a^{-n})$. If, moreover, $(X(n))$ is Gaussian, then $(X(n))$ is strongly mixing if and only if $F$ is a polynomial or if $a > 1$ and, in this last case, $(X(n))$ is $\beta$-mixing with $\beta(n) = O(a^{-n})$. If $(\varepsilon(n))$ is an iid series with a probability density $p$ such that:

$$\int |p(x) - p(x + h)|\, dx \le C_1 |h|$$

then, if $a > 1$, the process $(X(n))$ is $\beta$-mixing with $\beta(n) = O(a^{-2n/3})$. According to the values of the parameters $a$ and $d$, there are thus cases where the process $X(n)$ is:
– with long memory and not mixing;
– with short memory and not mixing;
– with short memory and mixing.

8.3.3. Parameter estimation

In the context of long memory processes, there are two facets to estimation problems. The first is when long memory behaves like a parasitic phenomenon that tends to invalidate traditional results. Examples of this kind are: regression parameter estimation with long memory noise, estimation of the marginal distributions of a long memory process and change-point detection in a long memory process. The other aspect relates to the estimation of the parameters quantifying the memory length, i.e., in fact, spectral density parameters. Within this last framework, three types of methods were largely studied, depending on whether the model was completely or partially parameterized:


1) methods related to maximum likelihood, which, by nature, aim at estimating all the parameters of the model;
2) the estimation of the memory exponent $d$ in $\varphi(\lambda) = f^{*}(\lambda)|\lambda|^{-2d}$ where $\varphi$ is the spectral density;
3) the estimation of the self-similarity parameter in fractional Brownian motion.

For fractional ARMA processes whose transfer function has the form:

$$F(z) = (1 - z)^{d} \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j},$$

where the $d_j$ are integers and where $-\frac{1}{2} < d < \frac{1}{2}$ (referred to as ARFIMA), parametric methods give satisfactory results, provided that the order of the model is known a priori (see, for example, [GON 87]). When the order of the model is not known, semi-parametric methods, as in point 2), are fruitfully used. We can schematically describe these methods as follows. Let us use the flexible framework of models with spectral density:

$$\varphi(e^{i\lambda}) = |1 - e^{i\lambda}|^{-2d}\, g(\lambda),$$

where $g$ can be written $g(\lambda) = \exp\bigl(\sum_{k=0}^{\infty} \theta_k \cos k\lambda\bigr)$ and is a very regular function. The unique singularity of $\varphi$ is located at $\lambda = 0$. Then, the natural tool to estimate the spectral density is the periodogram:

$$I_n^X(x) = \frac{1}{2\pi} \Bigl|\sum_{t=1}^{n} X(t) e^{itx}\Bigr|^2$$

The log of the periodogram estimates $\log \varphi$:

$$\log \varphi(e^{i\lambda}) = -2d \log|1 - e^{i\lambda}| + \sum_{k=0}^{\infty} \theta_k \cos k\lambda$$
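A regression built on this relation, with the sum truncated after $q$ cosine terms, can be sketched as follows. Everything specific here is an assumption made for illustration: the test signal is a truncated moving average with spectral density proportional to $|1 - e^{i\lambda}|^{-2d}$ and $d = 0.3$, the sample size, truncation order and $q = 2$ are arbitrary, and the periodogram is divided by $n$ as is usual in practice (a constant factor only shifts $\theta_0$). The function names are ours.

```python
import numpy as np

def simulate_fractional_noise(d, n, n_ma, rng):
    """Truncated MA approximation of (1 - B)^(-d) applied to Gaussian white noise."""
    psi = np.empty(n_ma)
    psi[0] = 1.0
    for k in range(1, n_ma):
        psi[k] = psi[k - 1] * (k - 1 + d) / k       # series of (1 - z)^(-d)
    eps = rng.standard_normal(n + n_ma - 1)
    return np.convolve(eps, psi, mode="valid")       # n samples

def log_periodogram_regression(x, q):
    """Least squares fit of log I(lambda) on -2 log|1 - e^{i lambda}| and cos(k lambda), k < q."""
    n = len(x)
    j = np.arange(1, (n - 1) // 2 + 1)               # Fourier frequencies in (0, pi)
    lam = 2 * np.pi * j / n
    periodogram = np.abs(np.fft.fft(x)[j]) ** 2 / (2 * np.pi * n)
    design = np.column_stack(
        [-2 * np.log(np.abs(1 - np.exp(1j * lam)))]
        + [np.cos(k * lam) for k in range(q)]
    )
    coef, *_ = np.linalg.lstsq(design, np.log(periodogram), rcond=None)
    return coef[0], coef[1:]                         # estimate of d, then theta_0..theta_{q-1}

rng = np.random.default_rng(0)
x = simulate_fractional_noise(d=0.3, n=4096, n_ma=5000, rng=rng)
d_hat, theta_hat = log_periodogram_regression(x, q=2)
print("estimated d:", d_hat)
```

The constant regressor ($k = 0$) absorbs both the normalization of the periodogram and the mean of the log-periodogram error, so only the coefficient of $-2\log|1 - e^{i\lambda}|$ is read as an estimate of $d$.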

After a truncation of the sum at the order $q - 1$, the parameters $d, \theta_0, \theta_1, \ldots, \theta_{q-1}$ are estimated by a regression of $\log I(\lambda)$ against $(-2 \log|1 - e^{i\lambda}|, \cos k\lambda)$. The difficulty resides in two points:
– the choice of $q$;
– the study of the properties of the estimators.

In [MOU 00], recent procedures based on penalization techniques give automatic methods for the choice of $q$, that can be tuned to the function $g$. The asymptotic properties ensure the convergence of the estimator of $d$, for example towards a Gaussian distribution. Many estimators that rely on this principle were proposed, some of them being compared in [BARD 01, MOU 01].

When the singularities of the transfer function are not located at $z = 1$, another model is used. Let us assume that the transfer function is written:

$$F(z) = (1 - e^{i\lambda_0} z)^{d}\, (1 - e^{-i\lambda_0} z)^{d} \prod_{j=1}^{J} (1 - \alpha_j z)^{d_j},$$

with $|\alpha_j| < 1$, $\lambda_0 \ne 0$. The spectral density takes the form $\varphi(\lambda) = |1 - e^{i(\lambda + \lambda_0)}|^{2d} g(\lambda)$ with a very regular $g$. If $\lambda_0$ is known, various authors [OPP 00] consider that $d$ can be estimated along the lines developed for $\alpha = 1$. When $\lambda_0$ is not known, it has to be estimated. The idea is to use the frequency at which the periodogram reaches its maximum. Although more sophisticated, the procedures elaborated by Yajima [YAJ 96], Hidalgo [HID 99] and Giraitis et al. [GIR 01] provide convergence in probability and in distribution of the estimators towards the normal distribution.

8.3.4. Simulated example

Several methods were proposed to simulate trajectories of fractional processes. Granger and Joyeux [GRAN 80] use an autoregressive approximation of order 100 obtained by truncating the AR(∞) representation combined with an initialization procedure based on the Cholesky decomposition. Geweke and Porter-Hudak [GEW 85] or Hosking [HOS 81] elaborate on the autocovariance and use a Levinson-Durbin-Whittle algorithm to generate an autoregressive approximation. In both cases, quality is not quantified, although it can be. Gray et al. [GRAY 89] approximate $(X(n))$ by a long MA obtained by truncating the MA(∞) representation.

The second method seems inadequate in our case because there is no expression for the autocovariance function of fractional ARMA processes. These methods can easily be implemented thanks to the difference equations (8.5). However, when $(X(n))$ is long-ranged, these methods require very long trajectories of white noise because of the slow decay of $a_n$; for example, Gray et al. [GRAY 89] use moving averages of order around 290,000.

Another idea consists of approximating $F$ by a rational fraction $\frac{B}{A}$ and simulating the ARMA process with representation $A(L)Y(n) = B(L)\varepsilon(n)$. This is the chosen approach for the example below. The algorithm used to calculate the polynomials $A$ and $B$ is developed by Baratchart et al. [BARA 91, BARA 98]. In principle, this algorithm, as with the theoretical results, is only valid when $F$ has no singular point on the unit disc. However, it provides satisfactory results, from the perspective detailed below. The studied filter $F$ reads:

$$F(z) = \bigl(z - \exp(2i\pi\, 0.231)\bigr)^{-0.2}\, \bigl(z - \exp(-2i\pi\, 0.231)\bigr)^{-0.2}$$
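For comparison with the rational approximation used in this example, the truncated MA route mentioned above can be sketched for the same filter. This is not the Baratchart algorithm. The sketch assumes the normalization $F(0) = 1$, under which the product of the two conjugate factors equals $(1 - 2\cos\theta\, z + z^2)^{-0.2}$ with $\theta = 2\pi \cdot 0.231$, so that its coefficients follow the three-term Gegenbauer recurrence (which is also what (8.5) gives for $J = 2$). The truncation order 2000 and the function names are hypothetical.

```python
import numpy as np

LAM = 0.2                        # each factor of F carries the exponent -0.2
THETA = 2 * np.pi * 0.231        # angle of the two singularities on the unit circle

def example_ma_coefficients(n_terms, lam=LAM, theta=THETA):
    """Power series coefficients of (1 - 2 cos(theta) z + z^2)^(-lam)."""
    c = np.cos(theta)
    a = np.zeros(n_terms)
    a[0] = 1.0
    if n_terms > 1:
        a[1] = 2 * lam * c
    for n in range(2, n_terms):
        # Gegenbauer three-term recurrence
        a[n] = (2 * c * (n + lam - 1) * a[n - 1] - (n + 2 * lam - 2) * a[n - 2]) / n
    return a

def simulate_truncated_ma(a, n_samples, rng):
    """Filter Gaussian white noise with the truncated impulse response a."""
    eps = rng.standard_normal(n_samples + len(a) - 1)
    return np.convolve(eps, a, mode="valid")

a = example_ma_coefficients(2000)
rng = np.random.default_rng(0)
x = simulate_truncated_ma(a, 1024, rng)

# Modulus of the truncated transfer function on the upper half of the unit circle.
freqs = np.linspace(0, np.pi, 2001)
response = np.abs(np.polyval(a[::-1], np.exp(1j * freqs)))
peak = freqs[np.argmax(response)]
print("frequency of the largest modulus: %.4f (theta = %.4f)" % (peak, THETA))
```

The truncated response stays bounded at $\theta$, where $|F|$ itself is unbounded; the height of the peak grows only slowly with the truncation order, which illustrates why very long moving averages are needed.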


The filter $F$ is approximated by a rational fraction whose numerator and denominator are of degrees 8 and 9, respectively. Figure 8.1 shows the localization of the poles and the zeros of this rational fraction and the singularities of $F$.

Figure 8.1. Localization of the singularities of the filter $F(z) = (z - \exp(2i\pi\, 0.231))^{-0.2} (z - \exp(-2i\pi\, 0.231))^{-0.2}$ and of the poles (◦) and zeros (∗) of the rational fraction $\frac{B}{A}$ approximating $F$ (polar plot; legend: singularities, zeros, poles)

It is worth noting that the rational fraction $\frac{B}{A}$ is stable; however, its poles and zeros are almost superimposed on the two singularities of the function $F$. In Figure 8.2, the amplitude of $F$ on the unit circle (continuous line) is compared to that of the rational fraction (dotted line). The approximation is excellent except in the immediate vicinity of the singularities of $F$. Figure 8.3 shows a simulated sample path of the ARMA process obtained with the rational approximation of the transfer function. We compare the impulse response of the process, its empirical covariance and spectral density (continuous line) with the impulse response, theoretical covariance and spectral density (dotted line), calculated directly from the function $F$. The impulse response of the ARMA process satisfactorily matches the theoretical impulse response. The estimated spectral density for the ARMA process reasonably matches the theoretical spectral density. For small time lags, the autocorrelation calculated on the simulated process and the autocorrelation calculated as the inverse Fourier transform of the spectral density are very similar. This is no longer true for larger lags, which comes as no surprise because we have, on the one hand, a short memory process, and on the other, a long memory process.

Figure 8.2. Amplitude of $F(z)$ (continuous line) and of the rational fraction $\frac{B}{A}(z)$ (dotted line) approximating $F$ on the unit circle. Panels: transfer function (imaginary part versus real part); modulus of the transfer function (modulus in dB versus frequency in radians/second); Black diagram (modulus in dB versus phase in degrees); phase of the transfer function (phase in degrees versus frequency in radians/second)

Figure 8.3. Simulation of a sample path for a fractional ARMA process obtained from an approximating rational fraction. Comparison between simulated and theoretical impulse responses, spectral densities and autocorrelation functions. Panels: simulated trajectories; impulse response; spectral density; autocorrelation
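The amplitude of the fractional filter of Figure 8.1 on the unit circle can be evaluated directly from its two factors. The following Python sketch is only an illustration: the function names are hypothetical, the parameters 0.231 and −0.2 are those read from the caption of Figure 8.1, and the rational approximation $B/A$ itself is not reproduced since its coefficients are not given here.

```python
import cmath
import math

# Parameters read from the caption of Figure 8.1.
NU0 = 0.231      # normalized frequency of the two singularities
D_EXP = -0.2     # common exponent of both factors


def amplitude(omega, nu0=NU0, d=D_EXP):
    """Return |F(exp(i*omega))| for F(z) = (z - p)^d (z - conj(p))^d."""
    z = cmath.exp(1j * omega)
    p = cmath.exp(2j * math.pi * nu0)
    # |w^d| = |w|^d for real d, so no branch of the complex power is needed.
    return abs(z - p) ** d * abs(z - p.conjugate()) ** d


def amplitude_db(omega):
    """Amplitude in decibels, the unit used on the modulus axes of Figure 8.2."""
    return 20.0 * math.log10(amplitude(omega))


if __name__ == "__main__":
    omega0 = 2 * math.pi * NU0
    # The amplitude grows without bound as omega approaches the singularity.
    for delta in (1e-1, 1e-2, 1e-3):
        print(f"omega0 + {delta:g}: {amplitude_db(omega0 + delta):.2f} dB")
```

Because the exponent is negative, the amplitude diverges at $\omega = \pm 2\pi\,0.231$, which is the region where a finite-order rational fraction cannot follow $F$.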

8.4. Continuous time fractional processes

8.4.1. A non-self-similar family: fractional processes designed from fractional filters

FBMs were the first continuous time processes characterized by a fractional parameter to be introduced [MAN 68]. They are interesting because they generalize ordinary Brownian motion while maintaining its Gaussian nature and self-similarity. The price to pay is that they lose the independence of their increments. The key properties of these processes are governed by the single parameter $H$. This simplicity has its advantages and constraints. The design of continuous time processes controlled by several parameters, which make it possible to uncouple the local properties from the long-range memory properties, is the subject of the present section. However, these new processes are not self-similar.

Fractional ARMA processes are defined, as in the discrete case, by a fractional filter $F$ with complex parameters $a_k$ and $d_k$:

$$F(s) = \prod_{k=1}^{K} (s - a_k)^{d_k} \quad \text{for } \operatorname{Re}(s) > a \tag{8.6}$$

The following notations are used:
1) $D = \sum_{k=1}^{K} d_k$;
2) $E^*$ is the set of the singular points of $F$;
3) $a = \max\{\operatorname{Re}(a_k),\ k \in E^*\}$;
4) $E^{**}$ is the set of the indices of the singular points whose real part is equal to $a$;
5) $d = \min\{\operatorname{Re}(d_k),\ k \in E^{**}\}$.

It is assumed that, if $(a_k, d_k) \notin \mathbb{R}^2$, then the factor $(s - \bar{a}_k)^{\bar{d}_k}$ is also present in the right-hand side of (8.6). Under this hypothesis, $D$ is real. Let us moreover assume, in this section, that $D < 0$; the case $D > 0$ is studied in the next section. Then, the set $E^*$ is not empty and the impulse response $f$, given by the inverse Laplace transform of $F$, is well-defined, real and locally integrable on $\mathbb{R}^+$:

$$f(t) = \frac{1}{2i\pi} \int_{c-i\infty}^{c+i\infty} \exp(st)\,F(s)\,ds \quad \text{for } t > 0,\ c \in\ ]a, +\infty[ \tag{8.7}$$

No closed-form expression for $f$ is available except when $K = 1$. Its behavior in the vicinity of $0^+$ is described by:

$$f(t) \sim \frac{t^{-(1+D)}}{\Gamma(-D)}, \quad t \to 0^+ \tag{8.8}$$

and, in the vicinity of $+\infty$, by:

$$f(t) \sim \sum_{k \in E^{**}} \lambda_k\, t^{-(1+d_k)} \exp(a_k t), \quad t \to +\infty \tag{8.9}$$

where the $\lambda_k$ are non-zero complex numbers depending on the parameters $a_k$ and $d_k$.
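As an illustration of the single-factor case mentioned above, here is a worked instance under the assumption $K = 1$, with $a_1$ real and $D = d_1 < 0$ real. It relies only on the classical Laplace transform pair for $t^{\nu-1} e^{a_1 t}$ and is offered as a consistency check on (8.8) and (8.9), not as a result quoted from the chapter.

```latex
\[
\int_0^{+\infty} e^{-st}\,\frac{t^{\nu-1} e^{a_1 t}}{\Gamma(\nu)}\,dt = (s - a_1)^{-\nu},
\qquad \nu > 0,\ \operatorname{Re}(s) > a_1 ,
\]
so that, taking $\nu = -D$,
\[
F(s) = (s - a_1)^{D}
\quad\Longrightarrow\quad
f(t) = \frac{t^{-(1+D)}}{\Gamma(-D)}\, e^{a_1 t}, \qquad t > 0 .
\]
```

In this case, (8.8) is recovered because $e^{a_1 t} \to 1$ when $t \to 0^+$, and (8.9) holds exactly with the single coefficient $\lambda_1 = 1/\Gamma(-D)$.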


As in the discrete case, if $D < 1 - K$, then $f$ is the solution of a linear differential equation of order $K$ whose coefficients are affine functions of $t$:

$$\sum_{j=0}^{K} (\nu_j + t\,\psi_j)\, f^{(j)}(t) = 0 \tag{8.10}$$

where $\nu_j$ and $\psi_j$ are constants depending on the parameters $a_k$ and $d_k$.

The impulse responses then enable us to define the processes $(X(t))$ as stochastic integrals with respect to Brownian motion $W(s)$:

$$X(t) = \int_{-\infty}^{t} f(t - s)\,dW(s) \tag{8.11}$$

when $f$, given by (8.7), belongs to $L^2(\mathbb{R}^+)$. Then, $(X(t))$ is a zero-mean, stationary Gaussian process with spectral density $g(\lambda) = |F(i\lambda)|^2$ and with covariance function $\sigma(t) = \int_{-\infty}^{+\infty} g(\lambda) \exp(i\lambda t)\,d\lambda$.

8.4.2. Sample path properties: local and global regularity, memory

Let us begin with the properties of the covariance function $\sigma$, from which the other properties are deduced. The memory properties of $(X(t))$ are given by the behavior of the covariance at infinity:
1) when $a < 0$, $\sigma(t) = o(t^{-n})$ when $t \to +\infty$, for all $n \in \mathbb{N}$;
2) when $a = 0$, $\sigma(t) \sim \sum_{j \in E^{**}} \gamma_j \exp(a_j t)\, t^{-2d_j - 1}$ when $t \to +\infty$, where the $\gamma_j$ are constants depending on $F$;
3) $\sigma$ is non-integrable if and only if $a = 0$ and $d \in\ ]-\frac{1}{2}, 0[$. In this latter case, $(X(t))$ is a long memory process.

From the behavior of the covariance at infinity, we can also deduce that the process $(X(t))$ is strongly mixing if and only if $a < 0$. As in the discrete case, there are various memory-mixing scenarios.

The regularity of the sample paths is determined by the behavior of the covariance at the origin:
1) if $D < -\frac{3}{2}$, then $\sigma(t) = \sigma(0) + \gamma_1 t^2 + o(t^{2+\varepsilon})$, where $\gamma_1$ is a non-zero constant and $\varepsilon$ a positive number;
2) if $D > -\frac{3}{2}$, then $\sigma(t) = \sigma(0) + \gamma_2 t^{-2D-1} + o(t^{-2D-1})$, where $\gamma_2$ is a non-zero constant;
3) if $D = -\frac{3}{2}$, then $\sigma(t) = \sigma(0) + \frac{t^2}{2} \log t\,(1 + o(1))$.
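The two classifications above can be summarized in a small helper. The following Python sketch is illustrative only: the function names are hypothetical, and the code simply encodes the conditions stated in the two lists (long memory if and only if $a = 0$ and $d \in\ ]-\frac{1}{2}, 0[$; leading power of $t$ in $\sigma(t) - \sigma(0)$ near the origin as a function of $D$).

```python
def has_long_memory(a, d):
    """Long memory holds if and only if a == 0 and -1/2 < d < 0."""
    return a == 0 and -0.5 < d < 0


def origin_exponent(D):
    """Describe the leading term of sigma(t) - sigma(0) near t = 0.

    Returns (power, has_log_factor):
      D < -3/2  -> (2, False)        term in t^2
      D = -3/2  -> (2, True)         term in t^2 log t
      D > -3/2  -> (-2D - 1, False)  term in t^(-2D-1)
    """
    if D < -1.5:
        return 2.0, False
    if D == -1.5:
        return 2.0, True
    return -2.0 * D - 1.0, False


if __name__ == "__main__":
    print(has_long_memory(0, -0.25))   # a = 0 and d in ]-1/2, 0[
    print(has_long_memory(-1, -0.25))  # a < 0: short memory
    print(origin_exponent(-1.0))
```

Keeping the two functions separate mirrors the point of this section: the memory is driven by $(a, d)$, whereas the behavior at the origin is driven by $D$ alone.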

From this, we deduce that there is a process $(Y(t))$ equivalent to $(X(t))$ whose trajectories are all Hölderian of exponent $\gamma$, for all $\gamma \in\ ]0, \min(1, -\frac{1}{2} - D)[$. Moreover, if $D < -\frac{3}{2}$, these trajectories are of class $C^1$. The Hausdorff dimension of the sample paths of the process $Y(t)$ is then equal to 1 if $D < -\frac{3}{2}$ and to $\frac{5}{2} + D$ if $D \in [-\frac{3}{2}, -\frac{1}{2}[$.

In the family studied here, there exist long memory processes $X(t)$ ($a = 0$, $d \in\ ]-\frac{1}{2}, 0[$) which can be as regular ($D < -\frac{3}{2}$) or as irregular ($D$ close to $-\frac{1}{2}$) as desired. Conversely, there exist processes $X(t)$ with short memory ($a < 0$) having arbitrary regularity, in contrast to fractional Brownian motion. However, only the processes obtained from transfer functions of the form $F(s) = s^d$ are self-similar.

8.5. Distribution processes

8.5.1. Motivation and generalization of distribution processes

To complement the survey of continuous time fractional ARMA processes, the natural idea is to study the consequences of relaxing the constraint $D < 0$. Then, the impulse response $f$ is no longer an ordinary function belonging to $L^2$, but can be defined as the inverse Laplace transform of $F$ in the space $\mathcal{D}'$ of distributions with support in $\mathbb{R}^+$. Expression (8.11) no longer makes sense and it is necessary to define the process $X$ differently.

Distribution processes were introduced independently by Ito [ITO 54], and Gelfand and Vilenkin [GEL 64]. They can be defined in the following way: a second order distribution process $X$ is a continuous linear map from the space of $C^\infty$ test functions with compact support in $\mathbb{R}$ to the space of random variables whose second order moment exists. We write:

$$\forall \varphi \in C_0^\infty(\mathbb{R}), \quad X(\varphi) = \langle X, \varphi \rangle \in L^2(\Omega)$$

and we have, for every compact set $K$ of $\mathbb{R}$:

$$\exists C_K,\ \exists k,\ \forall \varphi \in C_0^\infty(K), \quad \|\langle X, \varphi \rangle\|_{L^2(\Omega)} \le C_K\, \|\varphi\|_k$$

Derivation, time shift, convolution and Fourier transform (denoted $\mathcal{F}(X)$ or $\hat{X}$) are defined for distribution processes as they are for distributions. Likewise, expectation, covariance and stationarity for distribution processes are defined as they are for continuous time processes. In particular (see [GEL 64]), if $X$ is a second order stationary distribution process, then there exists $\eta \in \mathbb{C}$ such that, for all $\varphi \in C_0^\infty(\mathbb{R})$, the expectation $m$ of $X$ is written $m(\varphi) = E(X(\varphi)) = \eta \int \varphi(t)\,dt$, and there exists a positive tempered measure $\mu$, called the spectral measure, such that the covariance $B$ of $X$ is written:

$$B(\varphi, \psi) = E\big(X(\varphi)\,\overline{X(\psi)}\big) = \int \hat{\varphi}(\xi)\,\overline{\hat{\psi}(\xi)}\,d\mu(\xi) = \langle \sigma, \varphi * \bar{\psi} \rangle$$

If $\mu$ admits a density $g$ with respect to the Lebesgue measure, $g$ is called the spectral density of $X$ and then $\sigma = \mathcal{F}^{-1}(g)$.

8.5.2. The family of linear distribution processes

Let us begin with the definition of this family. Let $f$ be a function of $L^2$ and $X(t)$ the Gaussian process defined by:

$$X(t) = \int f(t - s)\,dW(s)$$

where $W(s)$ is ordinary Brownian motion. Let $\check{f}(t) = f(-t)$. Then, $X(t)$ is a distribution process if, for $\varphi \in C_0^\infty(\mathbb{R})$, we set:

$$\langle X, \varphi \rangle = \int \left( \int f(t - s)\,dW(s) \right) \varphi(t)\,dt$$

It is easy to see that we can, in this case, interchange the integrals and that, for all $\varphi \in C_0^\infty(\mathbb{R})$:

$$\int \left( \int f(t - s)\,dW(s) \right) \varphi(t)\,dt = \int \left( \int f(t - s)\,\varphi(t)\,dt \right) dW(s)$$

It is then natural to introduce the following process:

$$\langle X, \varphi \rangle = \int \check{f} * \varphi(s)\,dW(s) \tag{8.12}$$

with $f \in \mathcal{D}'(\mathbb{R})$ such that $\check{f} * \varphi \in L^2(\mathbb{R}^n)$ for all $\varphi \in C_0^\infty(\mathbb{R}^n)$. Let $H^{-\infty}(\mathbb{R}) = \cup_{s \in \mathbb{R}} H^s(\mathbb{R})$ be the Sobolev space, where $H^s(\mathbb{R}) = \{f \in \mathcal{S}'(\mathbb{R}),\ (1 + \xi^2)^{s/2} |\hat{f}| \in L^2(\mathbb{R})\}$ and $\mathcal{S}'$ is the space of tempered distributions. It can be shown that (8.12) properly defines a distribution process when the distribution $f$ belongs to $H^{-\infty}(\mathbb{R})$. The process $X$, with impulse response $f$, is then a zero-mean stationary Gaussian distribution process, with spectral density $|\hat{f}(\xi)|^2$ and covariance $\sigma = \mathcal{F}^{-1}(|\hat{f}|^2)$.

As for continuous time processes, the regularity of a process $X$ with impulse response $f \in H^{-\infty}(\mathbb{R}^n)$ is characterized by the regularity of $f$: if $f \in H^{-\infty}(\mathbb{R})$, the corresponding distribution process $X$ belongs to the Hölder space $C^s(\mathbb{R}, L^2(\Omega))$ if and only if $f$ belongs to the Besov space $B^s_{2,\infty}(\mathbb{R}^n)$ (see [TRI 92] or Chapter 3 for the definition of these spaces).

8.5.3. Fractional distribution processes

We can now define fractional ARMA processes for $D > 0$, i.e. when the inverse Laplace transform $f$ of $F$ does not belong to $L^2$. In fact, a more general framework is available. Let us suppose that $F$ satisfies the two following assumptions:
1) $F$ is holomorphic in the domain $\mathcal{D} = \mathbb{C} \setminus \{z,\ \operatorname{Re}(z) \le a \text{ and } |\operatorname{Im}(z)| \le K |\operatorname{Re}(z)|\}$;
2) there exist $N > 0$ and $C$ such that $|F(z)| \le C (1 + |z|)^N$ in $\mathcal{D}$.

This includes the functions $F$ which verify (8.6). Under assumptions 1) and 2) above, $f = \mathcal{L}^{-1}(F)$ exists [SCH 66], where $\mathcal{L}$ is defined by:

$$\mathcal{L}(f)(s) = F(s) = \langle f(t), e^{-st} \rangle \quad \text{for } s \in \mathbb{C},\ \operatorname{Re}(s) > a$$

The function $f$ belongs to the space $H^{-\infty}(\mathbb{R})$ if $a < 0$ or ($a = 0$ and $F(i\xi) \in L^2_{loc}(\mathbb{R})$), this condition being satisfied in the case of fractional filters as long as $d > -\frac{1}{2}$ if $a = 0$, whatever the value of $D$. In fact, $f$ belongs to the Besov space $B^{-N-1/2}_{2,\infty}(\mathbb{R})$ and the associated process has regularity of order $-N - \frac{1}{2}$. Moreover, if $F$ verifies hypotheses 1) and 2) and if $a < 0$, then $f$ is an analytic function for $t > 0$. More precisely, there exists $K > 1$ such that $f$ has a holomorphic extension to $\{|\operatorname{Im}(t)| \le \frac{1}{K} \operatorname{Re}(t)\}$.

If $F$ is a fractional filter, these impulse responses are simple. They are zero on $\mathbb{R}^-$ and, except at $\{0\}$, they are very regular functions. The only serious irregularity is at 0, as can be seen by examining the following formulae. Let $\delta$ denote the Dirac mass at $t = 0$, $\delta^{(j)}$ its $j$th derivative in the sense of distributions and $\operatorname{pv}(t^\lambda)$ the principal value of $t^\lambda$. If $D \in \mathbb{N}$, the function $f$ reads:

$$f(t) = \delta^{(D)}(t) + \sum_{j=1}^{D} \gamma_j\, \delta^{(D-j)}(t) + \sum_{j=D+1}^{\infty} \gamma_j\, \frac{t^{-(D-j+1)}}{\Gamma(j - D)}$$

and if $D \in \mathbb{R} \setminus \mathbb{N}$:

$$f(t) = \frac{\operatorname{pv}(t^{-(D+1)})}{\Gamma(-D)} + \sum_{j=1}^{\infty} \gamma_j\, \frac{\operatorname{pv}(t^{-(D-j+1)})}{\Gamma(-D + j)}$$

The function $f$ is characterized by the same asymptotic behavior at infinity as in the continuous case and satisfies the same differential equation of order $K$ with coefficients affine in $t$. Also, the regularity index of the distribution process $X$ is $-D - \frac{1}{2}$.

8.5.4. Mixing and memory properties

For distribution processes, memory properties also derive from the summability of the covariance function. As in the continuous case, a fractional ARMA distribution process has long memory if and only if $a = 0$ and $d \in\ ]-\frac{1}{2}, 0[$. Regarding mixing properties, it is necessary to redefine the various coefficients by extending the usual definitions [DOU 94] to distribution processes. To this end, we replace the concept of distance in time by that of time-lag between the supports of test functions.

Let $X$ be a stationary distribution process; let us denote by $H^0_{-\infty}$ and $H^{+\infty}_T$ the vector subspaces generated by $\{X(\varphi),\ \varphi \in C_0^\infty(]-\infty, 0])\}$ and $\{X(\psi),\ \psi \in C_0^\infty([T, +\infty[)\}$. The linear mixing coefficient of the process $X$ can then be defined by:

$$r_T = \sup\{|\operatorname{corr}(Y, Z)|,\ Y \in H^0_{-\infty},\ Z \in H^{+\infty}_T\}$$

This coefficient coincides with the usual linear mixing coefficient when $X$ is a time process. The other mixing coefficients can be defined similarly. In particular, the linear mixing and $\rho$-mixing coefficients are equal and satisfy the following bounding relation:

$$\alpha_T \le \rho_T \le 2\pi\,\alpha_T$$

This implies that, for Gaussian distribution processes, linear mixing, $\rho$-mixing and $\alpha$-mixing are equivalent notions.

Let us now assume that $X$ is a linear distribution process with transfer function $F$. A sufficient condition for the process $X$ to be $\rho$-mixing reads: if $F$ verifies assumptions 1) and 2), if $F$ is bounded from below for large $|z|$, i.e. there exist $C$, $A$ such that, for $|z| > A$, $C|z|^N \le |F(z)|$, and if, moreover, $a < 0$, then the $\rho$-mixing coefficient of the distribution process $X$ tends towards 0 when $T$ tends towards infinity. In this case, we obtain:

$$\rho_T = O(e^{bT}) \quad \text{for all } b \in\ ]a, 0[.$$

However, if $F$ has a singularity on the imaginary axis, i.e. if $F$ can be written as $F(z) = (z - i\alpha)^d\, G(z)$ with $d \in \mathbb{C} \setminus \mathbb{N}$, $\alpha \in \mathbb{R}$, $G$ continuous close to $i\alpha$ and $G(i\alpha) \ne 0$, then the distribution process is not $\rho$-mixing.

These two conditions yield the following result for fractional ARMA processes: the fractional distribution process with transfer function $F$ is $\rho$-mixing if and only if $a < 0$. Then, $\rho_T = O(e^{bT})$ for all $b \in\ ]a, 0[$. This result complements the result obtained for continuous time processes by providing an explicit mixing rate.

Many authors [DOM 92, HAY 81, IBR 74, ROZ 63] studied the relation between the mixing properties and the spectral density of continuous time stationary processes. Their results either rely on more restrictive hypotheses than ours, but give better convergence speeds, or rely on hypotheses about membership of functional spaces, which are difficult to verify in our case, but give necessary and sufficient conditions for the mixing coefficient to tend towards 0.

8.6. Bibliography

[BARA 91] BARATCHART L., CARDELLI M., OLIVI M., “Identification and rational L2 approximation: a gradient algorithm”, Automatica, vol. 27, no. 2, p. 413–418, 1991.
[BARA 98] BARATCHART L., GRIMM J., LEBLOND J., OLIVI M., SEYFERT F., WIELONSKY F., Identification d’un filtre hyperfréquences par approximation dans le domaine complexe, Technical Report 219, INRIA, 1998.
[BARD 01] BARDET J.M., LANG G., OPPENHEIM G., TAQQU M., PHILIPPE A., STOEV S., “Semi-parametric estimation of long-range dependence parameter: a survey”, in Long-range Dependence: Theory and Applications, Birkhäuser, Boston, Massachusetts, 2001.
[BER 94] BERAN J., Statistics for Long-memory Processes, Chapman & Hall, New York, 1994.


[BON 92] BONNET C., “Réduction de systèmes linéaires discrets de dimension infinie: étude de filtres fractionnaires”, RAIRO Automat-Prod. Inform. Ind., vol. 26, no. 5-6, p. 399–422, 1992.
[COX 84] COX D.R., “Long-range dependence: a review”, in DAVID H.A., DAVID H.T. (ed.), Proceedings of the Fiftieth Anniversary Conference Iowa State, Iowa State University Press, p. 55–74, 1984.
[CUR 86] CURTAIN R.F., GLOVER K., “Balanced realisation for infinite-dimensional systems”, in Operator Theory and Systems, Birkhäuser, Boston, Massachusetts, 1986.
[DOM 92] DOMINGUEZ M., “Mixing coefficient, generalized maximal correlation coefficients, and weakly positive measures”, J. Multivariate Anal., vol. 43, no. 1, p. 110–124, 1992.
[DOU 94] DOUKHAN P., Mixing Properties and Examples, Springer-Verlag, Lecture Notes in Statistics, 1994.
[GEL 64] GELFAND I.M., VILENKIN N.Y., Generalized Functions, vol. 4, Academic Press, New York, 1964.
[GEW 85] GEWEKE J., PORTER-HUDAK S., “The estimation and application of long time series models”, J. Time Series Anal., vol. 4, p. 221–238, 1985.
[GIR 01] GIRAITIS L., HIDALGO J., ROBINSON P.M., “Gaussian estimation of parametric spectral density with unknown pole”, Annals of Statistics, vol. 29, no. 4, p. 987–1023, 2001.
[GON 87] GONÇALVÈS E., “Une généralisation des processus ARMA”, Annales d’économie et de statistiques, vol. 5, p. 109–146, 1987.
[GRAN 80] GRANGER C.W.J., JOYEUX R., “An introduction to long-memory time series models and fractional differencing”, J. Time Ser. Anal., vol. 1, p. 15–29, 1980.
[GRAY 89] GRAY H.L., ZHANG N.F., WOODWARD W.A., “On generalized fractional processes”, J. Time Ser. Anal., vol. 10, no. 3, p. 233–256, 1989.
[HAY 81] HAYASHI E., “The spectral density of a strongly mixing stationary Gaussian process”, Pacific Journal of Mathematics, vol. 96, no. 2, p. 343–359, 1981.
[HID 99] HIDALGO J., “Estimation of the pole of long memory processes”, Mimeo, London School of Economics, 1999.
[HOS 81] HOSKING J.R.M., “Fractional differencing”, Biometrika, vol. 68, p. 165–176, 1981.
[IBR 74] IBRAGIMOV I., ROZANOV Y., Processus aléatoires Gaussiens, Mir, Moscou, 1974.
[ITO 54] ITO K., “Stationary random distributions”, Mem. Coll. Sci. Kyoto Univ. Series A, vol. 28, no. 3, p. 209–223, 1954.
[MAN 68] MANDELBROT B.B., VAN NESS J.W., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MIL 93] MILLER K.S., ROSS B., An Introduction to Fractional Calculus and Fractional Differential Equations, John Wiley & Sons, 1993.
[MOU 00] MOULINES E., SOULIER P., “Confidence sets via empirical likelihood: broadband log-periodogram regression of time series with long-range dependence”, Annals of Statistics, vol. 27, no. 4, p. 1415–1439, 2000.


[MOU 01] MOULINES E., SOULIER P., “Semiparametric spectral estimation for fractional processes”, in Long-range Dependence: Theory and Applications, Birkhäuser, Boston, Massachusetts, 2001.
[OLD 74] OLDHAM K.B., SPANIER J., The Fractional Calculus, Academic Press, 1974.
[OPP 00] OPPENHEIM G., OULD HAYE M., VIANO M.C., “Long-memory with seasonal effects”, Statistical Inference for Stochastic Processes, vol. 3, p. 53–68, 2000.
[OUS 91] OUSTALOUP A., La commande Crone, commande robuste d’ordre non entier, Hermes, Paris, 1991.
[ROZ 63] ROZANOV Y., Stochastic Random Processes, Holdenday, 1963.
[SCH 66] SCHWARTZ L., Théorie des distributions, Hermann, Paris, 1966.
[TAQ 92] TAQQU M.S., “A bibliographical guide to self-similar processes and long-range dependence”, in Dependence in Probability and Statistics, Birkhäuser, 1992.
[TRI 92] TRIEBEL H., Theory of Function Spaces, Birkhäuser, 1992.
[WHI 86] WHITFIELD A.H., “Transfer function synthesis using frequency response data”, Int. J. Control, vol. 43, no. 5, p. 1413–1426, 1986.
[YAJ 96] YAJIMA Y., “Estimation of the frequency of unbounded spectral densities”, ASA Proc. Business and Economic Statistics, Section 4-7, Amer. Statist. Assoc., Alexandria, VA.


Chapter 9

Iterated Function Systems and Some Generalizations: Local Regularity Analysis and Multifractal Modeling of Signals

Chapter written by Khalid DAOUDI.

9.1. Introduction

There are many ways of carrying out the fractal analysis of a signal: evaluation and comparison of various measures and dimensions (for example, Hausdorff [FAL 90] or packing [TRIC 82], lacunarity [MAND 93], etc.). The objective of this chapter is to describe in detail two types of fractal characterizations:
– analysis of the pointwise Hölderian regularity;
– multifractal analysis and modeling.

The first characterization enables us to describe the irregularities of a function $f(t)$ by associating it with its Hölder function $\alpha_f(t)$ which gives, at each point $t$, the value of the Hölder exponent of $f$. The smaller $\alpha_f(t)$ is, the more irregular the function $f$ is at $t$. A negative exponent indicates a discontinuity, whereas if $\alpha_f(t)$ is strictly greater than 1, $f$ is differentiable at least once at $t$. The characterization of signals through their Hölderian regularity has been studied by many authors, both from a theoretical point of view, for instance in relation to wavelet decompositions [JAF 89, JAF 91, JAF 92, MEY 90a], and in signal processing applications [LEV 95, MAL 92] such as denoising, turbulence analysis [BAC 91] and image segmentation [LEV 96]. This approach is particularly relevant when the information resides in the signal irregularity rather than, for example, in its amplitude or in its Fourier transform (this is notably the case for edge detection in image processing).

The first part of this chapter is thus devoted to studying the properties of the Hölder function of signals. The question that naturally arises is the following: given a continuous function $f$ on $[0, 1]$, what is the most general form that can be taken by $\alpha_f$? By generalizing the notion of iterated function systems (IFS), we answer this question by characterizing the class of functions $\alpha_f$ and by giving an explicit method to construct a function whose Hölder function is prescribed in this class. This generalization enables us to define a new class of functions, that of generalized iterated function systems (GIFS). It also leads to a new approach for estimating the Hölder function of a given signal.

An interesting feature of the Hölder function is that it can be very simple while the signal is irregular. For example, although they are nowhere differentiable, the Weierstrass function [WEI 95] and fractional Brownian motion (FBM) [MAND 68] have a constant Hölder function. However, there are signals with a very irregular appearance for which the Hölder function is even more irregular, e.g. continuous signals $f$ such that $\alpha_f$ is discontinuous everywhere. The canonical example is that of IFS. In such cases, it turns out to be more interesting to use another description of the signal: the multifractal spectrum. Instead of attributing to each $t$ the value of the Hölder function, all the points with the same exponent $\alpha$ are aggregated in a subset $E_\alpha$ and the irregularity is characterized in a global manner by calculating, for each value of $\alpha$, the Hausdorff dimension of the set $E_\alpha$. This yields a geometric estimation of the “size” of the subparts of the support of $f$ where a given singularity appears.
This type of analysis, first referenced in [MAND 72, MAND 74] and in the context of turbulence [FRI 95], has since been used often. It has been studied at a theoretical level (analysis of self-similar measures or functions in a deterministic [BAC 93, OLS 02, RIE 94, RIE 95] and random [ARB 02, FAL 94, GRA 87, MAND 89, MAU 86, OLS 94] context, extension to capacities, higher order spectra [VOJ 95]) and directly applied (study of DLA sequences [EVE 92, MAND 91], analysis of earthquake distribution [HIR 02], signal processing [LEV 95] and traffic analysis).

The second part of this chapter deals with multifractal analysis. Self-similar functions constitute the paradigm of “multifractal signals”, as most of the quantities of interest can be explicitly calculated. In particular, it has been demonstrated that the multifractal formalism, which connects the Hausdorff multifractal spectrum to the Legendre transform of a partition function, holds for self-similar measures and functions, and various extensions of the latter were considered [RIE 95]. The multifractal formalism enables us to reduce the calculation of the Hausdorff spectrum (which is very complex, as it entails not only the determination of all Hölder exponents but also the computation of infinitely many Hausdorff dimensions) to a simple calculation of the limit of a partition function, which offers less difficulty, both theoretically and numerically.

In practice, self-similar functions and their immediate extensions are, most of the time, too rigid to model real signals, such as speech signals, in an appropriate way. In this chapter, we present a generalization, weak self-affine (WSA) functions, that offers a satisfactory trade-off between flexibility and complexity in modeling. Weak self-affine functions are essentially self-similar functions for which the renormalization parameters can differ from one scale to another, while verifying certain conditions which enable us to preserve the multiplicative structure. The Hausdorff multifractal spectrum of WSA functions is calculated and the validity of the multifractal formalism proven. We then explain how to use WSA functions to model and segment real signals. We also show how modeling through WSA functions can be used to estimate non-concave multifractal spectra.

This chapter is organized as follows. In the following section, we give the definition of the Hölder exponent. Then, we describe the concept of iterated function systems (IFS) and analyze the local regularity behavior of affine iterated function systems which generate graphs of continuous functions. In section 9.4, which constitutes the core of this chapter, we propose some generalizations of IFS which enable us to solve the problem of characterizing Hölder functions. In section 9.5, we introduce a method to estimate the Hölder exponent, based on GIFS, and evaluate its performance on numerical simulations. In section 9.6, we address multifractal analysis and modeling. We introduce WSA functions and prove their multifractal formalism. In sections 9.7 and 9.8, we show how to represent and segment real signals by WSA functions. Section 9.9 is devoted to the estimation of the multifractal spectrum through the WSA approach. Finally, in section 9.10, we present some numerical experiments.

9.2. Definition of the Hölder exponent

The Hölder exponent is a parameter which quantifies the local (or pointwise) regularity of a function around a point¹. Let us first define the pointwise Hölder space.

Let $I$ be an interval of $\mathbb{R}$, $f$ a continuous function from $I$ to $\mathbb{R}$ and $\beta \in \mathbb{R}_+^* \setminus \mathbb{N}$.

1. The Hölder exponent can also be defined for measures or functions of sets in general.


DEFINITION 9.1.– Let $t_0 \in I$. The function $f$ belongs to the pointwise Hölder space $C^\beta(t_0)$ if and only if there exist a polynomial $P$ of degree less than or equal to the integer part of $\beta$ and a positive constant $C$ such that, for any $t$ in a neighborhood of $t_0$:

$$|f(t) - P(t - t_0)| \le C\,|t - t_0|^\beta$$

Let us note that if $\beta \in \mathbb{N}^*$, the space $C^\beta$ has to be replaced by the Zygmund $\beta$-class [MEY 90a, MEY 90b].

DEFINITION 9.2.– A function $f$ is said to be of Hölder exponent $\beta$ at $t_0$ if and only if:
1) for any scalar $\gamma < \beta$:

$$\lim_{h \to 0} \frac{|f(t_0 + h) - P(h)|}{|h|^\gamma} = 0$$

2) if $\beta < +\infty$, for any scalar $\gamma > \beta$:

$$\limsup_{h \to 0} \frac{|f(t_0 + h) - P(h)|}{|h|^\gamma} = +\infty$$

where $P$ is a polynomial of degree less than or equal to the integer part of $\beta$.

If $\beta < +\infty$, this is equivalent to:

$$f \in \bigcap_{\epsilon > 0} C^{\beta - \epsilon}(t_0) \quad \text{but} \quad f \notin \bigcup_{\epsilon > 0} C^{\beta + \epsilon}(t_0)$$

This is also equivalent to:

$$\beta = \sup\{\theta > 0 : f \in C^\theta(t_0)\}$$

9.3. Iterated function systems (IFS)

Let $K$ be a complete metric space, with distance $d$. Given $m$ continuous functions $S_n$ from $K$ to $K$, we call the family $\{K, S_n : n = 1, 2, \ldots, m\}$ an iterated function system (IFS). Let $H$ be the set of all non-empty closed parts of $K$. Then the set $H$ is a compact metric space for the Hausdorff distance $h$ [HUT 81] defined, for any $A$, $B$ in $H$, by:

$$h(A, B) = \max\left( \sup_{x \in A} \inf_{y \in B} d(x, y),\ \sup_{x \in B} \inf_{y \in A} d(x, y) \right)$$
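For finite point sets, the suprema and infima in the definition of $h$ become maxima and minima, which gives a direct way to compute it. The following Python sketch is a minimal illustration under that assumption (finite sets of points in Euclidean space); the function name is hypothetical and the default distance is the Euclidean one.

```python
import math


def hausdorff_distance(A, B, d=math.dist):
    """Hausdorff distance between two non-empty finite point sets A and B.

    d is the underlying distance; math.dist is the Euclidean distance
    between two points given as coordinate sequences.
    """
    if not A or not B:
        raise ValueError("both sets must be non-empty")

    def directed(X, Y):
        # sup over x in X of the distance from x to the set Y
        return max(min(d(x, y) for y in Y) for x in X)

    return max(directed(A, B), directed(B, A))


if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 0.0)]
    B = [(0.0, 0.0)]
    print(hausdorff_distance(A, B))
```

The two directed terms are generally different, which is why the maximum of both is needed to obtain a symmetric distance.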


Let us consider the operator $W : H \to H$ defined by:

$$W(G) = \bigcup_{n=1}^{m} S_n(G) \quad \text{for all } G \in H$$

We call attractor of the IFS $\{K, S_n : n = 1, 2, \ldots, m\}$ any set $A \in H$ which is a fixed point of $W$, i.e. which verifies:

$$W(A) = A$$

An IFS always possesses at least one attractor. Indeed, given any set $G \in H$, the closure of the accumulation set of points $\{W^{(j)}(G)\}_{j=1}^{\infty}$, with $W^{(j)}(G) = W(W^{(j-1)}(G))$, is a fixed point of $W$. If all the functions $S_n$ are contractions, then the IFS is said to be hyperbolic. In this case, $W$ is also a contraction for the Hausdorff metric; thus, it possesses a single fixed point which is the single attractor of the IFS.

When the IFS is hyperbolic, the attractor can be obtained in the following manner [BAR 85a]: let $p = (p_1, \ldots, p_m)$ be a probability vector with $p_n > 0$ and $\sum_n p_n = 1$. Starting from the fixed point $x_0$ of $S_1$, let us define the sequence $x_i$ by successively choosing $x_i \in \{S_1(x_{i-1}), \ldots, S_m(x_{i-1})\}$, where $x_i = S_n(x_{i-1})$ is selected with probability $p_n$. Then, the attractor is the closure of the trajectory $\{x_i\}_{i \in \mathbb{N}}$.

In this chapter, we focus on IFS which make it possible to generate graphs of continuous functions [BAR 85a]. Given a set of points $\{(x_n, y_n) \in [0; 1] \times [u; v],\ n = 0, 1, \ldots, m\}$, with $(u, v) \in \mathbb{R}^2$, let us consider the IFS given by $m$ contractions $S_n$ ($n = 1, \ldots, m$) which are defined on $[0; 1] \times [u; v]$ by:

$$S_n(x, y) = \big(L_n(x);\ F_n(x, y)\big)$$

where $L_n$ is a contraction which transforms $[0; 1]$ into $[x_{n-1}; x_n]$ and where $F_n : [0; 1] \times [u; v] \to [u; v]$ is a contraction with respect to the second variable, which satisfies:

$$F_n(x_0, y_0) = y_{n-1}; \quad F_n(x_m, y_m) = y_n \tag{9.1}$$

Then, the attractor of this IFS is the graph of a continuous function which interpolates the points $(x_n, y_n)$. In general, this type of function is called a fractal interpolation function [BAR 85a]. The most studied class of IFS is that of affine iterated function systems, i.e. IFS for which $L_n$ and $F_n$ are affine functions. We study this class in what follows. We also assume that the interpolation points are equally spaced. Then, $S_n$ ($0 \le n < m$) can be written in matrix form as:

$$S_n \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1/m & 0 \\ a_n & c_n \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} + \begin{pmatrix} n/m \\ b_n \end{pmatrix}$$
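The random iteration described above can be sketched for this affine case. The following Python code is a minimal illustration, not the authors' implementation: the function names are hypothetical, the maps are drawn with equal probabilities $p_n = 1/m$, and the coefficients $a_n$, $b_n$ are obtained by writing the endpoint conditions (9.1) with the indexing $0 \le n < m$, i.e. $F_n(0, y_0) = y_n$ and $F_n(1, y_m) = y_{n+1}$ with $F_n(t, x) = a_n t + c_n x + b_n$.

```python
import random


def affine_fif_maps(y, c):
    """Return the list of coefficients (a_n, b_n, c_n) for n = 0..m-1.

    y: ordinates y_0..y_m at the equally spaced abscissae n/m.
    c: vertical factors c_0..c_{m-1}, each with |c_n| < 1.
    """
    m = len(c)
    if len(y) != m + 1:
        raise ValueError("need m + 1 ordinates for m contraction factors")
    if any(abs(cn) >= 1 for cn in c):
        raise ValueError("each |c_n| must be < 1")
    maps = []
    for n in range(m):
        b = y[n] - c[n] * y[0]                       # from F_n(0, y_0) = y_n
        a = y[n + 1] - y[n] - c[n] * (y[m] - y[0])   # from F_n(1, y_m) = y_{n+1}
        maps.append((a, b, c[n]))
    return maps


def apply_map(n, maps, point):
    """Apply S_n to the point (t, x)."""
    m = len(maps)
    a, b, c = maps[n]
    t, x = point
    return (t + n) / m, a * t + c * x + b


def chaos_game(y, c, n_points=5000, seed=0):
    """Random iteration started at (0, y_0), the fixed point of S_0."""
    rng = random.Random(seed)
    maps = affine_fif_maps(y, c)
    point = (0.0, float(y[0]))
    points = [point]
    for _ in range(n_points - 1):
        point = apply_map(rng.randrange(len(maps)), maps, point)
        points.append(point)
    return points


if __name__ == "__main__":
    pts = chaos_game([0.0, 1.0, 0.0], [0.5, 0.5], n_points=10)
    for t, x in pts:
        print(f"{t:.6f} {x:.6f}")
```

Since the starting point lies on the attractor, every generated point lies on the graph of the interpolation function, up to rounding; plotting the points for a large `n_points` gives a picture of this graph.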


Let $f$ be the function whose graph is the attractor of the corresponding IFS. Let us note that once $c_n$ is fixed, $a_n$ and $b_n$ are uniquely determined by (9.1) so as to ensure the continuity of $f$. We are now going to calculate the Hölder function of $f$ and see whether we can control the local regularity with these affine iterated function systems.

PROPOSITION 9.1 ([DAO 98]).– Let $t \in [0; 1)$ and $0.i_1 \ldots i_k \ldots$ be its base $m$ decomposition (when $t$ possesses two decompositions, we select the one with a finite number of digits). Then:

$$\alpha_f(t) = \min\left( \liminf_{k \to +\infty} \frac{\log(c_{i_1} \ldots c_{i_k})}{\log(m^{-k})},\ \liminf_{k \to +\infty} \frac{\log(c_{j_1} \ldots c_{j_k})}{\log(m^{-k})},\ \liminf_{k \to +\infty} \frac{\log(c_{l_1} \ldots c_{l_k})}{\log(m^{-k})} \right)$$

where, for any integer $k$, if we note $t_k = m^{-k}[m^k t]$, the $k$-tuples $(j_1, \ldots, j_k)$ and $(l_1, \ldots, l_k)$ are given by:

$$t_k^+ = t_k + m^{-k} = \sum_{p=1}^{k} j_p\, m^{-p}$$

$$t_k^- = t_k - m^{-k} = \sum_{p=1}^{k} l_p\, m^{-p}$$

COROLLARY 9.1 ([DAO 98]).– Let $t \in [0; 1)$. If, for every $i \in \{0, \ldots, m - 1\}$, the proportion $\phi_i(t)$ of $i$ in the base $m$ decomposition of $t$ exists, then:

$$\alpha_f(t) = -\sum_{i=0}^{m-1} \phi_i(t) \log_m c_i$$
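The formula of Corollary 9.1 is easy to evaluate numerically. The following Python sketch is illustrative only: the function names are hypothetical, exact rational arithmetic is used to extract base $m$ digits, and the code takes $|c_i|$ inside the logarithm, which is an assumption made so that negative factors can be handled. The last function computes the finite-$k$ ratio $\log(c_{i_1} \ldots c_{i_k}) / \log(m^{-k})$ appearing in Proposition 9.1 for the digits of $t$ itself.

```python
import math
from fractions import Fraction


def base_m_digits(t, m, k):
    """First k digits of t in base m, for a rational t with 0 <= t < 1."""
    t = Fraction(t)
    if not 0 <= t < 1:
        raise ValueError("t must lie in [0, 1)")
    digits = []
    for _ in range(k):
        t *= m
        digit = int(t)       # integer part, in {0, ..., m-1}
        digits.append(digit)
        t -= digit
    return digits


def holder_from_proportions(proportions, c):
    """Corollary 9.1: -sum_i phi_i * log_m |c_i|, with m = len(c)."""
    m = len(c)
    return -sum(phi * math.log(abs(ci), m) for phi, ci in zip(proportions, c))


def finite_k_ratio(digits, c):
    """log(|c_{i_1}| ... |c_{i_k}|) / log(m^{-k}) for the given digits."""
    m = len(c)
    k = len(digits)
    return sum(math.log(abs(c[i])) for i in digits) / (-k * math.log(m))


if __name__ == "__main__":
    c = [0.6, 0.8]
    digits = base_m_digits(Fraction(1, 3), 2, 20)
    print(digits)
    print(finite_k_ratio(digits, c))
    print(holder_from_proportions([0.5, 0.5], c))
```

For $t = 1/3$ in base 2, the digits alternate between 0 and 1, so both proportions equal $1/2$ and the finite-$k$ ratio agrees with the value given by the corollary for every even $k$.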

This corollary clearly shows that we cannot control the regularity at each point using IFS. Indeed, the almost sure value of $\phi_i(t)$ with respect to the Lebesgue measure is $\frac{1}{m}$, hence almost all the points have the same Hölder exponent. However, we can easily construct a continuous function whose Hölder function is not constant almost everywhere. In the next section, we propose a generalization of IFS that offers more flexibility in the choice of the Hölder function.

9.4. Generalization of iterated function systems

The principal idea behind the generalization of IFS lies in the following question: what happens, in terms of regularity, if the contractions $S_i$ are allowed to vary at every iteration in the process of attractor generation? However, raising this question requires that we first settle a preliminary issue: does an “attractor” exist in this case? Andersson [AND 92] studied this problem and found sufficient conditions for the existence and uniqueness of an attractor when the $S_i$ vary.


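Formally, let us consider the collection of sets $(F^k)_{k \in \mathbb{N}^*}$ where each $F^k$ is a non-empty finite set of contractions $S_i^k$ in $K$ for $i = 0, \ldots, N_k - 1$, $N_k \ge 1$ being the cardinality of $F^k$, while $c_i^k$ denotes the contraction factor of $S_i^k$. For $n \in \mathbb{N}^*$, let $\Sigma^n_{N_i}$ be the set of sequences of length $n$ defined by:

$$\Sigma^n_{N_i} = \big\{\sigma = (\sigma_1, \ldots, \sigma_n) : \sigma_i \in \{0, \ldots, N_i - 1\},\ i \in \mathbb{N}^*\big\}$$

and let:

$$\Sigma^\infty_{N_i} = \big\{\sigma = (\sigma_1, \sigma_2, \ldots) : \sigma_i \in \{0, \ldots, N_i - 1\},\ i \in \mathbb{N}^*\big\}$$

For any $k$, let us consider the operator $W^k : H \to H$ defined by:

$$W^k(A) = \bigcup_{i=0}^{N_k - 1} S_i^k(A) \quad \text{for } A \in H$$

Let us define the conditions:

$$(c) \quad \lim_{n \to \infty}\ \sup_{(\sigma_1, \ldots, \sigma_n) \in \Sigma^n_{N_i}}\ \prod_{k=1}^{n} c^k_{\sigma_k} = 0$$

$$(c') \quad \lim_{n \to \infty}\ \sup_{(\sigma_1, \sigma_2, \ldots) \in \Sigma^\infty_{N_i}}\ \sum_{j=n}^{\infty} d\big(S^{j+1}_{\sigma_{j+1}} x, x\big) \prod_{k=1}^{j} c^k_{\sigma_k} = 0 \quad \forall x \in K$$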

Andersson proved the following result.

PROPOSITION 9.2 ([AND 92]).– If conditions $(c)$ and $(c')$ are satisfied, then there exists a unique compact set $A \subset K$ such that:

$$\lim_{k \to \infty} W^k \circ \ldots \circ W^1(G) = A \quad \text{for all } G \in H$$

$A$ is called the attractor of the IFS $(K, \{F^k\}_{k \in \mathbb{N}^*})$.

9.4.1. Semi-generalized iterated function systems

We now consider the case where the $S_i^k$ are affine and where the $N_k$ are constant. Let $F^k$ be a set of affine contractions $S_i^k$ ($0 \le i < m$) whose matrix representation reads:

$$S_i^k \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1/m & 0 \\ a_i^k & c_i^k \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} + \begin{pmatrix} i/m \\ b_i^k \end{pmatrix}$$

Let us assume that conditions $(c)$ and $(c')$ are satisfied. Then, if $a_i^k$ and $b_i^k$ satisfy relations similar to (9.1), we can show, by using the same techniques as


in [BAR 85a], that the attractor of the semi-generalized IFS $(K, \{F^k\}_{k\in\mathbb{N}^*})$ is the graph of a continuous function $f$. As for typical IFS, let us now verify whether the expression of the Hölder function for semi-generalized IFS allows us to control the local regularity.

PROPOSITION 9.3 ([DAO 98]).– Let $t\in[0;1)$. Then:

$$\alpha_f(t) = \min\left(\liminf_{k\to+\infty}\frac{\log(c^1_{i_1}\cdots c^k_{i_k})}{\log(m^{-k})},\ \liminf_{k\to+\infty}\frac{\log(c^1_{j_1}\cdots c^k_{j_k})}{\log(m^{-k})},\ \liminf_{k\to+\infty}\frac{\log(c^1_{l_1}\cdots c^k_{l_k})}{\log(m^{-k})}\right)$$

where $i_p$, $j_p$ and $l_p$ are defined as in Proposition 9.1.

Although the Hölder function of semi-generalized IFS describes a broader class than that of standard IFS, it still remains very restrictive (as far as the problem at hand is concerned). Indeed, it is easy to observe that two points whose base-$m$ expansions differ in only a finite number of digits have the same Hölder exponent, whereas it is easy to construct a continuous function whose Hölder function does not satisfy this constraint. It thus remains impossible to control the local regularity at each point by using semi-generalized IFS.

9.4.2. Generalized iterated function systems

Let us now consider a more flexible extension than semi-generalized IFS, by allowing the number and support of the $S_i^k$ to vary through iterations. More precisely, let $F^k$ be the set of affine contractions $S_i^k$ ($0\le i\le m^k-1$), where each $S_i^k$ only operates on $\bigl[[\tfrac{i}{m}]m^{-k+1};\ ([\tfrac{i}{m}]+1)m^{-k+1}\bigr]$ and has values in $[im^{-k};\ (i+1)m^{-k}]$. Then, the matrix representation of $S_i^k$ becomes:

$$S_i^k\begin{pmatrix} t\\ x\end{pmatrix} = \begin{pmatrix} 1/m & 0\\ a_i^k & c_i^k\end{pmatrix}\begin{pmatrix} t\\ x\end{pmatrix} + \begin{pmatrix} i/m^k\\ b_i^k\end{pmatrix}$$

We call $(K, (F^k))$ a GIFS. Given the $c_i^k$, the following construction yields an attractor which is the graph of a continuous function $f$ that interpolates a set of given points $\{(\tfrac{i}{m}, y_i),\ i = 0,\dots,m\}$ (for simplicity, we consider the case $m = 2$, although the general case can be treated in a similar way). Consider the graph of a non-affine continuous function $\phi$ on $[0;1]$, and write:

$$\phi(0) = u, \qquad \phi(1) = v$$

We then choose $a_i^k$ and $b_i^k$ so that the following conditions hold. For $i = 0, 1$:

$$S_i^1(0,u) = \Bigl(\frac{i}{m},\,y_i\Bigr), \qquad S_i^1(1,v) = \Bigl(\frac{i+1}{m},\,y_{i+1}\Bigr)$$

$$S_0^2(0,y_0) = (0,y_0), \qquad S_0^2(1/2,y_1) = S_1^2(0,y_0), \qquad S_1^2(1/2,y_1) = (1/2,y_1)$$

$$S_2^2(1/2,y_1) = (1/2,y_1), \qquad S_2^2(1,y_2) = S_3^2(1/2,y_1), \qquad S_3^2(1,y_2) = (1,y_2)$$


For k > 2 and i = 0, . . . , 2k − 1: 1) if i is even, then: a) if i < 2k−1 : Sik ◦ S k−1 ◦ S[k−2 ◦ . . . ◦ S[2 i i ]

i 2k−2

22

2

] (0, y0 )

◦ S[k−2 ◦ . . . ◦ S[2 = S k−1 i i ]

i 2k−2

22

2

◦ S[k−2 ◦ . . . ◦ S[2 Sik ◦ S k−1 i i ]

i 2k−2

22

2

] (0, y0 )

] (1/2, y1 )

k−2 k 2 ◦ S[k−1 = Si+1 i+1 ◦ S i+1 ◦ . . . ◦ S[ i+1 ] (0, y0 ) ] ] [ 22

2

2k−2

b) if i ≥ 2k−1 : ◦ S[k−2 ◦ . . . ◦ S[2 Sik ◦ S k−1 i i ]

i 2k−2

22

2

] (1/2, y1 )

◦ S[k−2 ◦ . . . ◦ S[2 = S k−1 i i ]

i 2k−2

22

2

◦ S[k−2 ◦ . . . ◦ S[2 Sik ◦ S k−1 i i ]

i 2k−2

22

2

] (1/2, y1 )

] (1, y2 )

k−2 k 2 ◦ S[k−1 = Si+1 i+1 ◦ S i+1 ◦ . . . ◦ S[ i+1 ] (1/2, y1 ) [ ] ] 22

2

2k−2

2) if i is odd, then: a) if i < 2k−1 : Sik ◦ S[k−1 ◦ S[k−2 ◦ . . . ◦ S[2 i i ] ]

i 2k−2

22

2

] (1/2, y1 )

◦ S[k−2 ◦ . . . ◦ S[2 = S[k−1 i i ] ]

i 2k−2

22

2

] (1/2, y1 )

b) if i ≥ 2k−1 : ◦ S[k−2 ◦ . . . ◦ S[2 Sik ◦ S[k−1 i i ] ]

i 2k−2

22

2

] (1, y2 )

◦ S[k−2 ◦ . . . ◦ S[2 = S[k−1 i i ] ] 2

22

i 2k−2

] (1, y2 )

This set of conditions, which we call continuity conditions, ensures that $f$ is a continuous function that interpolates the points $(\tfrac{i}{m}, y_i)$. The Hölder function of $f$ is given by the following proposition.

PROPOSITION 9.4.– Let us assume that conditions (c) and (c′) are satisfied. Then, the attractor of the GIFS defined above is the graph of a continuous function $f$ such that:

$$f\Bigl(\frac{i}{m}\Bigr) = y_i \quad \forall i = 0,\dots,m$$

and:

$$\alpha_f(t) = \min(\alpha_1, \alpha_2, \alpha_3)$$


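where:

$$\begin{cases}
\alpha_1 = \displaystyle\liminf_{k\to+\infty}\frac{\log\bigl(c^k_{m^{k-1}i_1+m^{k-2}i_2+\dots+mi_{k-1}+i_k}\cdots c^2_{mi_1+i_2}\,c^1_{i_1}\bigr)}{\log(m^{-k})}\\[2ex]
\alpha_2 = \displaystyle\liminf_{k\to+\infty}\frac{\log\bigl(c^k_{m^{k-1}j_1+m^{k-2}j_2+\dots+mj_{k-1}+j_k}\cdots c^2_{mj_1+j_2}\,c^1_{j_1}\bigr)}{\log(m^{-k})}\\[2ex]
\alpha_3 = \displaystyle\liminf_{k\to+\infty}\frac{\log\bigl(c^k_{m^{k-1}l_1+m^{k-2}l_2+\dots+ml_{k-1}+l_k}\cdots c^2_{ml_1+l_2}\,c^1_{l_1}\bigr)}{\log(m^{-k})}
\end{cases} \qquad (9.2)$$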

and where $i_p$, $j_p$ and $l_p$ are defined as in Proposition 9.1.

NOTE 9.1.– Given $m$ real numbers $u_1,\dots,u_m \in\ ]\tfrac{1}{m};1[$, let us define, for any $k\ge 1$ and for any $i\in\{0,\dots,m^k-1\}$, $c_i^k$ as:

$$c_i^k = u_{i+1-m[\frac{i}{m}]}$$

In this case, we recover the original construction of the usual IFS, considered in section 9.3.

We now prove that GIFS allow us to solve the problem of characterizing the Hölder functions. Indeed, we have the following main result.

THEOREM 9.1.– Let $s$ be a function from $[0;1]$ to $[0;1]$. The following conditions are equivalent:
1) $s$ is the Hölder function of a continuous function $f$ from $[0;1]$ to $\mathbb{R}$;
2) there is a sequence $(s_n)_{n\ge1}$ of continuous functions such that:

$$s(x) = \liminf_{n\to+\infty} s_n(x), \quad \forall x\in[0;1]$$

The implication 1) ⇒ 2) is relatively easy and can be found in [DAO 98]. Hereafter, we present a constructive proof of the converse implication. To do so, let $H$ denote the set of functions from $[0;1]$ to $[0;1]$ which are inferior limits of continuous functions. We need the following lemma.

LEMMA 9.1 ([DAO 98]).– Let $s\in H$. There exists a sequence $\{R_n\}_{n\ge1}$ of piecewise polynomials such that:

$$\begin{cases}
s(t) = \liminf_{n\to+\infty} R_n(t) \quad \forall t\in[0;1]\\[1ex]
\|R_n'^{+}\|_\infty \le n, \quad \|R_n'^{-}\|_\infty \le n \quad \forall n\ge1\\[1ex]
\|R_n\|_\infty \ge \dfrac{1}{\log n}
\end{cases} \qquad (9.3)$$

where $R_n'^{+}$ and $R_n'^{-}$ are the right and left derivatives of $R_n$, respectively.


Let $\{R_n\}_{n\ge1}$ be the sequence given by (9.3) and $M$ be the set of $m$-adic points of $[0;1]$. Now let us consider the sequence $\{r_k\}_{k\ge1}$ of functions from $M$ to $\mathbb{R}$ defined, for any $t\in M$ such that $t = \sum_{p=1}^{k_0} i_p m^{-p}$, by:

$$r_1(t) = R_1(i_1 m^{-1})$$

$$r_k(t) = kR_k(t) - (k-1)R_{k-1}\Bigl(\sum_{p=1}^{k-1} i_p m^{-p}\Bigr) \quad \text{for } k = 2,\dots,k_0$$

$$r_k(t) = kR_k(t) - (k-1)R_{k-1}(t) \quad \text{for } k > k_0$$

Thanks to the continuity conditions, finding a GIFS whose attractor satisfies 1) amounts to determining the double sequence $(c_i^k)_{i,k}$. The latter is given by the following result.

PROPOSITION 9.5 ([DAO 98]).– Let $s\in H$ and $\{r_k\}_{k\ge1}$ be the previously defined sequence. Then, the attractor of the GIFS whose contraction factors are given by:

$$c_i^k = m^{-r_k(im^{-k})}$$

is the graph of a continuous function $f$ satisfying:

$$\alpha_f(t) = s(t) \quad \forall t\in[0;1]$$
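The construction of Proposition 9.5 can be illustrated numerically. The following is only a minimal sketch under simplifying assumptions that are ours, not the authors': we take $m = 2$ and $f(0) = f(1) = 0$, we use the Schauder form of the $m = 2$ attractor discussed in section 9.5.1 (so that $s_k^j = c_k^j\, s^{j-1}_{[k/2]}$), and we replace $r_k$ by $s$ itself, which corresponds to taking $R_n = s$ for every $n$. The names `tent` and `synthesize_gifs` are hypothetical.

```python
import numpy as np

def tent(x):
    """Schauder mother function: 1 - |2x - 1| on [0, 1], zero elsewhere."""
    return np.where((x >= 0.0) & (x <= 1.0), 1.0 - np.abs(2.0 * x - 1.0), 0.0)

def synthesize_gifs(s, depth, apex=1.0):
    """Sample a function with target Hoelder function s on a dyadic grid.

    Assumptions (ours): m = 2, f(0) = f(1) = 0, f(1/2) = apex, and
    c_k^j = 2 ** (-s(k 2^-j)), i.e. r_j is replaced by s itself.
    """
    x = np.linspace(0.0, 1.0, 2 ** (depth + 1) + 1)
    coeff = np.array([apex])                  # level-0 Schauder coefficient
    f = coeff[0] * tent(x)
    for j in range(1, depth + 1):
        k = np.arange(2 ** j)
        c = 2.0 ** (-s(k * 2.0 ** (-j)))      # contraction factors at level j
        coeff = c * coeff[k // 2]             # s_k^j = c_k^j * s_[k/2]^(j-1)
        f += coeff @ tent(2 ** j * x[None, :] - k[:, None])
    return x, f
```

For instance, `synthesize_gifs(lambda t: t, depth=10)` produces a graph whose roughness decreases from left to right, in the spirit of the case $s(t) = t$; it is not meant to reproduce the figures of this chapter exactly.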

This result provides an explicit method, fast and easy to execute, for constructing interpolating continuous functions whose Hölder function is prescribed in the class of inferior limits of continuous functions (situated between the first and second Baire classes). Let us underline that there are two other constructive approaches to prescribing Hölder functions. One is based on a generalization of the Weierstrass function [DAO 98] and the other on the wavelet decomposition [JAF 95]. This section concludes with some numerical simulations. Figures 9.1 and 9.2 show the attractors of GIFS with prescribed Hölder functions. Figure 9.1 (respectively 9.2) shows the graph obtained when $s(t) = t$ (respectively $s(t) = |\sin(5\pi t)|$). In both cases, the set of interpolation points is:

$$\Bigl\{(0,0);\ \bigl(\tfrac15,1\bigr);\ \bigl(\tfrac25,1\bigr);\ \bigl(\tfrac35,1\bigr);\ \bigl(\tfrac45,1\bigr);\ (1,0)\Bigr\}$$

9.5. Estimation of pointwise Hölder exponent by GIFS

In this section, we address the problem of the estimation of the Hölder exponent for a given discrete time signal. Our approach, based on GIFS, is to be compared

Figure 9.1. Attractor of a GIFS whose Hölder function is s(t) = t

Figure 9.2. Attractor of a GIFS whose Hölder function is s(t) = | sin(5πt)|

with two other methods which make it possible to obtain satisfactory estimations. The first method is based on the wavelet transform and is called the wavelet transform modulus maxima (WTMM) method (cf. Chapter 3 for a detailed description of this method). The second method [GON 92b] uses Wigner-Ville distributions.

9.5.1. Principles of the method

For the sake of simplicity, our study is limited to continuous functions on $[0;1]$. The algorithm for calculating the Hölder exponent is based on Proposition 9.4. To apply this proposition to the calculation of the Hölder exponents of a real continuous signal $f$, we have to begin by calculating the coefficients $c_k^j$ of a GIFS whose attractor is $f$. This amounts to solving the "inverse problem" for GIFS, which is a generalization of the ordinary inverse problem for IFS. The latter problem was studied by many authors, either from a theoretical point of view [ABE 92, BAR 88, BAR 85a, BAR 85b, BAR 86, CEN 93, DUV 92, FOR 94, FOR 95, VRS 90] or in


physics applications [MANT 89, VRS 91b], image compression [BAR 93a, BAR 93b, CAB 92, FIS 93, JAC 93a, JAC 93b, VRS 91a] or signal processing [MAZ 92]. The inverse problem is defined as follows: "given a signal $f$, find a (generalized) iterated function system whose attractor approximates $f$ as well as possible, in the sense of a fixed norm". This problem is extremely complex in the case of IFS, yet becomes much easier for GIFS. In particular, it is shown in [DAO 98] that, for a function $f\in C^0([0;1])$, we can find a GIFS whose attractor is as close to $f$ as wished, in the $\|.\|_\infty$ sense. This is obviously not the case for IFS, which have a finite number of functions. However, the price of simplifying and improving the approximation is that we move from a finite to an infinite number of parameters in the modeling. A practical solution to the inverse problem for GIFS is detailed in the next section. For now, let us note that in the particular case where:

$$\begin{cases} m = 2\\[1ex] L_i^k(t) = \dfrac{t}{m} + \dfrac{i}{m^k}\end{cases}$$

if we write:

$$s_k^j = f\bigl((k+\tfrac12)2^{-j}\bigr) - \tfrac12\Bigl[f(k2^{-j}) + f\bigl((k+1)2^{-j}\bigr)\Bigr] \quad \forall k\in\{0,\dots,2^j-1\}$$

and:

$$c_k^j = \frac{s_k^j}{s^{j-1}_{[\frac{k}{2}]}} \quad \forall k = 0,\dots,2^j-1 \qquad (9.4)$$

then, since $m$ and $L_i^k$ are fixed, the inverse problem is solved if the $c_k^j$ satisfy conditions (c) and (c′). Indeed, in this case, the attractor of the GIFS defined by the interpolation points $\{(0,0);\ (\tfrac12, f(\tfrac12));\ (1,0)\}$ and the coefficients $c_k^j$ is the graph of $f$. To calculate the Hölder function of $f$, we note:

$$S = \Bigl\{f\in C^0([0;1]) : |c_k^j| \in \bigl[\tfrac12+\varepsilon;\ 1-\varepsilon\bigr]\Bigr\}$$

where the $c_k^j$ are determined by (9.4) and where $\varepsilon$ is a strictly positive fixed scalar (as small as we wish). If $f\in S$, we can apply Proposition 9.4 and obtain the pointwise Hölder exponents of $f$.

NOTE 9.2.– If we have $f(0) = f(1) = 0$, then:

$$f(x) = \sum_{j\ge0}\ \sum_{0\le k<2^j} s_k^j\,\theta_{j,k}(x)$$


where:

$$\theta_{j,k}(x) = \theta(2^jx - k)$$

and:

$$\theta(x) = \begin{cases} 1 - |2x-1| & \text{if } x\in[0;1]\\ 0 & \text{if } x\notin[0;1]\end{cases}$$

which is simply the decomposition of $f$ in the Schauder basis.

9.5.2. Algorithm

Let $f\in C^0([0;1])$. To simplify notations, let us suppose that $f(0) = f(1) = 0$. The function $f$ is then written as:

$$f(x) = \sum_{j\ge0}\ \sum_{0\le k<2^j} s_k^j\,\theta_{j,k}(x)$$

The method described in the previous section allows us to calculate the Hölder exponents only when $f$ belongs to $S$. In the general case, it is not possible to calculate $\alpha_f$ exactly, and we can only obtain an approximation. For this purpose, we construct a function $\tilde f$ satisfying:

$$\tilde f\in S, \qquad \forall x\in[0;1],\ \forall g\in S : |\alpha_f(x) - \alpha_{\tilde f}(x)| \le |\alpha_f(x) - \alpha_g(x)|$$

and we calculate $\alpha_{\tilde f}$ instead of $\alpha_f$. This method is reminiscent, in a sense, of filtering a signal before sampling it: if the sampling frequency $\omega_e$ is less than $2\omega_{\max}$, where $\omega_{\max}$ designates the equivalent signal bandwidth, then direct sampling of the signal leads to information loss. When $\omega_e < 2\omega_{\max}$, the only part of the signal that can be used is the "low frequency" part, and the best way to exploit it is to consider a frequency domain approximation of the signal prior to carrying out the sampling. In the present case, we search for an approximation $\tilde f$ of $f$ in the sense described earlier and then calculate $\alpha_{\tilde f}$. Obviously, this approach will only give appropriate results if $f$ can really be approximated according to the criteria that have been defined. Applications show that this is often the case. Let us emphasize the fact that $\tilde f$ is, in general, a poor approximation of $f$ in the sense of the $\|.\|_{L^2}$ or $\|.\|_\infty$ norms and that only $\alpha_{\tilde f}$ is close to $\alpha_f$.


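We can easily verify, according to the above-mentioned hypothesis, that $\tilde f$ is obtained as the attractor of the GIFS defined by:

$$m = 2, \qquad L_i^k(t) = \frac{t}{m} + \frac{i}{m^k}$$

$$\tilde c_k^j = \begin{cases}
c_k^j & \text{if } |c_k^j| \in \bigl[\tfrac12+\varepsilon;\ 1-\varepsilon\bigr]\\[1ex]
1-\varepsilon & \text{if } |c_k^j| > 1-\varepsilon\\[1ex]
\tfrac12+\varepsilon & \text{if } |c_k^j| < \tfrac12+\varepsilon
\end{cases}$$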
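The quantities used so far, namely the Schauder coefficients $s_k^j$, the ratios $c_k^j$ of (9.4) and the clipped coefficients $\tilde c_k^j$, can be computed from $2^J+1$ uniform samples of $f$. The sketch below is one possible implementation, not the authors' code: the function names are hypothetical, and setting $c_k^j = +\infty$ when the parent coefficient vanishes is a convention of ours.

```python
import numpy as np

def schauder_coefficients(samples):
    """Return [s^0, ..., s^(J-1)] from the 2**J + 1 samples f(i 2^-J).

    s^j[k] = f((k + 1/2) 2^-j) - (f(k 2^-j) + f((k + 1) 2^-j)) / 2.
    """
    samples = np.asarray(samples, dtype=float)
    n = len(samples) - 1
    J = n.bit_length() - 1
    if n < 2 or 2 ** J != n:
        raise ValueError("expected 2**J + 1 samples with J >= 1")
    levels = []
    for j in range(J):
        step = n >> j                          # grid spacing 2^(J-j) in samples
        left = samples[0:n:step]               # f(k 2^-j)
        right = samples[step:n + 1:step]       # f((k + 1) 2^-j)
        mid = samples[step // 2:n:step]        # f((k + 1/2) 2^-j)
        levels.append(mid - 0.5 * (left + right))
    return levels

def gifs_coefficients(levels):
    """c^j[k] = s^j[k] / s^(j-1)[k // 2] for j >= 1, as in (9.4)."""
    out = []
    for j in range(1, len(levels)):
        parent = np.repeat(levels[j - 1], 2)
        c = np.full(parent.shape, np.inf)      # our convention when the parent is 0
        np.divide(levels[j], parent, out=c, where=parent != 0)
        out.append(c)
    return out

def clip_coefficients(c, eps=1e-3):
    """Clipped coefficients: keep c when |c| is in [1/2 + eps, 1 - eps]."""
    mod = np.abs(c)
    inside = (mod >= 0.5 + eps) & (mod <= 1.0 - eps)
    return np.where(inside, c, np.clip(mod, 0.5 + eps, 1.0 - eps))
```

As a sanity check, for $f(x) = x(1-x)$ every Schauder coefficient equals $4^{-j}/4$, so all ratios are $1/4$ and are clipped to $\tfrac12+\varepsilon$: the function is too regular to belong to $S$.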

9.5.3. Application

We are now going to explain how to solve the inverse problem in practice. Let $F = \{f(i),\ i = 0,\dots,2^J\}$ be a given discrete signal. For $j = 1,\dots,J$, let us consider the set $P_j$ defined by:

$$P_j = \{f(i2^{J-j}),\ i = 0,\dots,2^j\}$$

The set $P_j$ is simply the signal sub-sampled with step $2^{J-j}$. A geometric interpretation of the calculation of $c_k^j$ is the following: for $j\in\{1,\dots,J-1\}$, the set of coefficients $\{c_k^j,\ k = 0,\dots,2^j-1\}$, which will determine the GIFS, is obtained as the set of slopes of the $2^j$ affine functions which enable us to transform the polygon defined by $P_j$ into the polygon defined by $P_{j+1}$. To make the point clearer, an example with $J = 3$ is proposed in Figure 9.3. The signal samples are represented by dots. For $j = 1, 2$, the coefficients of the GIFS are:

$$c_k^j = \frac{u_k^j}{u^{j-1}_{[\frac{k}{2}]}} \quad \forall k = 0,\dots,2^j-1$$

The estimation procedure of the pointwise Hölder exponent is as follows: let $i\in\{0,\dots,2^J-1\}$, and let $(i_1,\dots,i_{J-1})$ be the tuple such that $i2^{-J} = \sum_{p=1}^{J-1} i_p 2^{-p}$. Then let $k_j = \sum_{p=0}^{j-1} i_{j-p}2^p$ and let us consider the sequence $(C^j)_{j=1,\dots,J-1}$ defined by:

$$C^j = -\log_2\bigl(|c^j_{k_j}|\cdots|c^2_{k_2}|\,|c^1_{k_1}|\bigr)$$

We define:

$$\tilde C^j = -\log_2\bigl(|\tilde c^j_{k_j}|\cdots|\tilde c^2_{k_2}|\,|\tilde c^1_{k_1}|\bigr)$$

Figure 9.3. Example of the calculation of $c_k^j$

The exponent $\alpha_f(i)$ is then given as the result of the linear regression of $\tilde C^j$ with respect to $j$. Let us observe that if $\tilde c_k^j = c_k^j$ for any $k$ and $j$, i.e., if all the GIFS coefficients have their amplitude in $]\tfrac12;1[$, then we are in the framework of the estimation of the Hölder exponent from the Schauder coefficients $s_k^j$. Indeed, in this case, we have $\tilde C^j = -\log_2|s^j_{k_j}|$ and it is known [TRIE 78] that the Schauder basis characterizes the local regularity of continuous and nowhere differentiable functions (and also of more regular functions, under certain conditions [JAF 02b]).

We now present some numerical tests on synthetic signals. We compare the three methods described earlier (wavelets, Wigner-Ville and GIFS) applied to the generalized Weierstrass functions:

$$W(t) = \sum_{k=0}^{\infty} 2^{-ks(t)}\sin(2^kt)$$

It is shown in [DAO 98] that its Hölder function is $\alpha_W(t) = s(t)$, under certain conditions on $s$. The choice of these functions is motivated by several reasons:
– they are simple to synthesize and it is easy to vary $\alpha_W(t)$;
– they have already been studied by many authors [GON 92a, TRIC 93] and provide a reliable generic model for different phenomena [JAG 90];
– they are synthesized neither from wavelets nor from GIFS, which makes fair comparisons possible.

Figure 9.4 (respectively 9.6) shows the function $W(t)$ on $[0;1]$ obtained by taking $s(t) = t$ (respectively $s(t) = |\sin(5\pi t)|$). Figure 9.5 (respectively 9.7) shows its Hölder function estimation using the methods mentioned earlier, the theoretical curve being represented by a thick line. The standard line represents the GIFS estimation, the thin line represents the Schauder basis estimation and the dashed line represents the WTMM (Morlet wavelet) estimation. Finally, the mixed line represents the time-scale energy distribution (pseudo-Wigner) estimation.
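The test signal and the GIFS estimator of section 9.5.3 can be sketched as follows. This is an illustration under assumptions of ours: the Weierstrass series is truncated to `n_terms` terms, the ratio $|c_k^j|$ is set to $+\infty$ (hence clipped to $1-\varepsilon$) when the parent Schauder coefficient vanishes, and the regression is an ordinary least-squares fit over all scales $j = 1,\dots,J-1$. The function names are hypothetical.

```python
import numpy as np

def generalized_weierstrass(s, J, n_terms=30):
    """Samples W(i 2^-J), i = 0..2^J, of the series truncated to n_terms terms."""
    t = np.linspace(0.0, 1.0, 2 ** J + 1)
    k = np.arange(n_terms)[:, None]
    w = np.sum(2.0 ** (-k * s(t)[None, :]) * np.sin(2.0 ** k * t[None, :]), axis=0)
    return t, w

def estimate_holder(samples, eps=1e-3):
    """Slope of the regression of C~^j against j at each point i 2^-J, i < 2^J."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples) - 1
    J = n.bit_length() - 1
    if J < 3 or 2 ** J != n:
        raise ValueError("expected 2**J + 1 samples with J >= 3")
    idx = np.arange(n)
    running = np.zeros(n)                      # C~^j accumulated along each path
    rows, prev = [], None
    for j in range(J):
        step = n >> j
        coeff = samples[step // 2:n:step] - 0.5 * (
            samples[0:n:step] + samples[step:n + 1:step])   # Schauder s^j
        if j >= 1:
            parent = np.repeat(prev, 2)
            ratio = np.full(parent.shape, np.inf)
            np.divide(np.abs(coeff), np.abs(parent), out=ratio, where=parent != 0)
            c_tilde = np.clip(ratio, 0.5 + eps, 1.0 - eps)  # |c~_k^j|
            # k_j is the index of the dyadic interval of level j containing i 2^-J
            running = running - np.log2(c_tilde[idx >> (J - j)])
            rows.append(running)
        prev = coeff
    scales = np.arange(1, J)
    return np.polyfit(scales, np.vstack(rows), 1)[0]
```

A possible usage is `t, w = generalized_weierstrass(lambda u: u, J=10)` followed by `estimate_holder(w)`, whose output can be compared with $s(t) = t$; we make no claim that this sketch reproduces the accuracy reported in the figures.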

Figure 9.4. Generalized Weierstrass function with s(t) = t

Figure 9.5. Estimation of the Hölder function

Figure 9.6. Generalized Weierstrass function with s(t) = | sin(5πt)|

Figure 9.7. Estimation of the Hölder function

Based on these examples, we conclude that the GIFS estimates are more precise than those of the other methods. Thus, GIFS are not only capable of prescribing any Hölder function, but they can also provide accurate estimations of these functions in practice, even for very irregular signals. This makes GIFS a well-adapted framework for the analysis of local function regularity, which deserves further study.

9.6. Weak self-similar functions and multifractal formalism

With this section we embark upon the second part of the chapter, devoted to multifractal analysis and modeling of signals. Self-similar functions constitute the paradigm of "multifractal signals" (see Chapter 3 or Chapter 4 for the definition and the multifractal properties of these functions). However, in practice, self-similar functions and their immediate extensions are most of the time too rigid to properly model real-world signals, such as speech signals. In what follows, we consider a generalization, the weak self-affine (WSA) functions, which offers a good compromise between flexibility and complexity in modeling. WSA functions are defined as a generalization of the self-similar functions, in the sense of Jaffard [JAF 02a], where the renormalization factors are allowed to vary through scales. Formally:

DEFINITION 9.3.– A function $f : [0,1]\to\mathbb{R}$ is said to be weak self-affine if and only if:

1) there exists an open set $\Omega\subset[0,1]$ and $d$ ($d\ge2$) contracting similitudes $S_0,\dots,S_{d-1}$ with contraction factor $\frac1d$, such that:

$$S_i(\Omega)\subset\Omega \quad \forall i\in\{0,\dots,d-1\}$$

$$S_i(\Omega)\cap S_j(\Omega) = \emptyset \quad \text{if } i\ne j$$


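2) there exist $d$ positive sequences $(\lambda_0^j)_{j\in\mathbb{N}^*},\dots,(\lambda_{d-1}^j)_{j\in\mathbb{N}^*}$ satisfying $0<\lambda_i^j<1$ for any $i\in\{0,\dots,d-1\}$ and $j\in\mathbb{N}^*$, and there exists a continuous function $g$ with compact support, such that $f$ verifies:

$$f(x) = g(x) + \sum_{n=1}^{\infty}\ \sum_{(i_1,\dots,i_n)\in\{0,\dots,d-1\}^n}\Biggl(\prod_{j=1}^{n}\varepsilon^j_{\sum_{p=1}^{j}i_p2^{j-p}}\,\lambda^j_{i_j}\Biggr)\, g\bigl(S_{i_n}^{-1}\circ\dots\circ S_{i_1}^{-1}(x)\bigr) \qquad (9.5)$$

where, for any $j\ge1$ and $k\in\{0,\dots,d^j-1\}$, we have $\varepsilon_k^j = \pm1$. If there exist $d$ scalars $\lambda_0,\dots,\lambda_{d-1}$ such that:

$$\varepsilon_k^j\lambda_i^j = \lambda_i, \quad \forall i\in\{0,\dots,d-1\},\ \forall j\ge1 \text{ and } \forall k\in\{0,\dots,d^j-1\}$$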

then we recover the traditional self-similar functions, in the sense of [JAF 02a]. The (weak) self-affinity of $f$ is made clear by the fact that Definition 9.3 implies that $f$ can be obtained as the limit of the sequence $(f_j)_{j\in\mathbb{N}}$, where $f_0(x) = g(x)$ and, for $j\ge1$, $f_j$ is recursively calculated as follows:

$$f_j(x) = \sum_{i=0}^{d-1}\varepsilon_i^j\,\lambda_i^j\, f_{j-1}\bigl(S_i^{-1}(x)\bigr) + g(x)$$
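The recursion above can be iterated exactly on a dyadic grid. The following sketch applies it as printed, under assumptions of ours: $d = 2$, $S_i(x) = (x+i)/2$ (so that $S_0^{-1}(x) = 2x$ and $S_1^{-1}(x) = 2x-1$), one sign per branch $i$ and step $j$, and a function $g$ that vanishes at 0 and 1, so that $f_{j-1}\circ S_i^{-1}$ can be treated as zero outside $S_i([0,1])$. The name `wsa_iterates` is hypothetical.

```python
import numpy as np

def wsa_iterates(g, lambdas, signs, level):
    """Return the grid i / 2**level and f_n sampled on it, n = len(lambdas).

    lambdas[j - 1] = (lambda_0^j, lambda_1^j) and signs[j - 1] = (eps_0^j, eps_1^j).
    Assumes d = 2, S_i(x) = (x + i) / 2 and g(0) = g(1) = 0 (our assumptions).
    """
    n = 2 ** level
    x = np.linspace(0.0, 1.0, n + 1)
    gx = g(x)
    f = gx.copy()                        # f_0 = g
    half = n // 2
    for lam, eps in zip(lambdas, signs):
        coarse = f[::2]                  # f_{j-1} read at 2x - i, exact on the grid
        new = gx.copy()
        new[:half + 1] += eps[0] * lam[0] * coarse   # branch i = 0 on [0, 1/2]
        new[half:] += eps[1] * lam[1] * coarse       # branch i = 1 on [1/2, 1]
        f = new
    return x, f
```

Because $f_{j-1}(2x-i)$ is read at even grid indices, no interpolation is involved; with constant $\lambda_i^j$ and $\varepsilon_i^j$ the iterates are those of an ordinary self-similar function.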

The following theorem, proven in [DAO 96], enables us to calculate the multifractal spectrum $d(\alpha)$ of WSA functions. The corresponding theorem for ordinary self-similar functions can be found in [JAF 02a] (see also Chapter 3 in this volume). Let us define, for any $j\ge1$, the $d$-tuple $(u_0^j,\dots,u_{d-1}^j)$ by:

$$(u_0^j,\dots,u_{d-1}^j) = (\lambda^j_{i_0},\dots,\lambda^j_{i_{d-1}})$$

where $(i_0,\dots,i_{d-1})$ is the permutation of $(0,\dots,d-1)$ which gives:

$$\lambda^j_{i_0}\le\dots\le\lambda^j_{i_{d-1}}$$

In other words, for any $j$, $(u_0^j,\dots,u_{d-1}^j)$ is the $d$-tuple $(\lambda_0^j,\dots,\lambda_{d-1}^j)$ rearranged in ascending order. Then, the following theorem holds.

THEOREM 9.2.– Let us suppose that there exist two scalars $a>0$ and $b>0$ such that, for any $i\in\{0,\dots,d-1\}$ and $j\ge1$, we have:

$$0 < a \le u_i^j \le b < 1$$


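Let us also suppose that:

$$p(x_0,\dots,x_{d-1}) = \lim_{n}\frac{\operatorname{card}\bigl\{j\in\{1,\dots,n\} : u_i^j\le x_i;\ \forall i = 0,\dots,d-1\bigr\}}{n}$$

exists for any $(x_0,\dots,x_{d-1})\in[a;b]^d$. Finally, let us suppose that $g$ is uniformly more regular than $f$. We denote by $d(\alpha)$ the (Hausdorff) multifractal spectrum of $f$, i.e., $d(\alpha) = \dim_H\{x : \alpha(x) = \alpha\}$, where $\alpha(x)$ is the Hölder exponent of $f$ at point $x$. Then:

– $d(\alpha) = -\infty$ if $\alpha\notin[\alpha_{\min};\alpha_{\max}]$, where:

$$\begin{cases}
\alpha_{\min} = \displaystyle\lim_{n} -\frac{\log_d(u^1_{d-1})+\dots+\log_d(u^n_{d-1})}{n}\\[2ex]
\alpha_{\max} = \displaystyle\lim_{n} -\frac{\log_d(u^1_0)+\dots+\log_d(u^n_0)}{n}
\end{cases}$$

– if $\alpha\in[\alpha_{\min};\alpha_{\max}]$, then:

$$d(\alpha) = \inf_{q\in\mathbb{R}}\bigl(q\alpha - \tau(q)\bigr)$$

where:

$$\tau(q) = \liminf_{n\to\infty} -\frac{\displaystyle\sum_{j=1}^{n}\log_d\bigl((\lambda_0^j)^q+\dots+(\lambda_{d-1}^j)^q\bigr)}{n}$$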
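The spectrum of Theorem 9.2 can be approximated numerically from a finite table of renormalization factors. The sketch below relies on assumptions of ours: the limits over $n$ are replaced by averages over the $n$ available scales, and the infimum over $q\in\mathbb{R}$ is taken over a finite grid `q_grid`. The name `wsa_spectrum` is hypothetical.

```python
import numpy as np

def wsa_spectrum(lambdas, alphas, q_grid=None):
    """Approximate d(alpha), alpha_min and alpha_max from an (n, d) array.

    lambdas[j - 1, i] holds lambda_i^j; limits are replaced by finite averages
    and the infimum over q is restricted to q_grid (both are our assumptions).
    """
    lam = np.asarray(lambdas, dtype=float)
    n, d = lam.shape
    if q_grid is None:
        q_grid = np.linspace(-20.0, 20.0, 401)
    log_d = lambda v: np.log(v) / np.log(d)
    a_min = -np.mean(log_d(lam.max(axis=1)))       # uses u_{d-1}^j, the largest factor
    a_max = -np.mean(log_d(lam.min(axis=1)))       # uses u_0^j, the smallest factor
    # tau(q) = -(1/n) sum_j log_d( sum_i (lambda_i^j)^q )
    sums = np.sum(lam[None, :, :] ** q_grid[:, None, None], axis=2)
    tau = -np.mean(log_d(sums), axis=1)
    alphas = np.asarray(alphas, dtype=float)
    legendre = np.min(q_grid[None, :] * alphas[:, None] - tau[None, :], axis=1)
    inside = (alphas >= a_min - 1e-12) & (alphas <= a_max + 1e-12)
    return np.where(inside, legendre, -np.inf), a_min, a_max
```

For constant factors $\lambda_0 = 2^{-0.3}$ and $\lambda_1 = 2^{-0.7}$ with $d = 2$, this gives $\alpha_{\min} = 0.3$, $\alpha_{\max} = 0.7$ and a maximum $d(0.5) = 1$, as expected for a binomial cascade.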

This theorem shows that the multifractal formalism is valid for WSA functions. Therefore, the Hausdorff, large deviation and Legendre multifractal spectra (see Chapter 1 or Chapter 4 for definitions) coincide, are concave and can be easily calculated.

9.7. Signal representation by WSA functions

The method we use to represent a given signal is based on its approximation by one or many WSA functions. In this section, we develop a practical technique to approximate, in the $L^2$ sense, a signal by a WSA function. In practice, only discrete data are available; thus, in what follows, we will suppose that we have a signal $\{f(m),\ m = 0,\dots,2^J-1\}$. Our purpose is to find the parameters $(d, g, (S_i)_i, (\varepsilon_k^j)_{k,j}, (\lambda_i^j)_{i,j})$ of the WSA function which provides the best $L^2$-approximation of $f$. In its general form, this problem is difficult to solve. However, it is possible to consider a simplified and less general form, for which a solution can be found by using a fast algorithm based on the wavelet decomposition of $f$. Let us describe this sub-optimal solution. Let $\phi$


denote the scaling function, $\psi$ the corresponding wavelet and $w_k^j$ the resulting wavelet coefficients of the signal $f$:

$$f(x) = a_0\phi(x) + \sum_{j=0}^{J-1}\ \sum_{0\le k<2^j} w_k^j\,\psi(2^jx-k)$$

Let us assume that, for any $(k,j)$, we have $w_k^j\ne0$ and let us define for $j\ge1$:

$$c_k^j = \left|\frac{w_k^j}{w^{j-1}_{[\frac{k}{2}]}}\right|$$

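Let us assume that $S_i(x) = \frac{x+i}{2}$ for $i = 0, 1$. Then, a simple calculation leads to:

$$f(x) = a_0\phi(x) + w_0^0\psi(x) + w_0^0\sum_{n=1}^{J-1}\ \sum_{(i_1,\dots,i_n)\in\{0,1\}^n}\operatorname{sgn}\Bigl(w_0^0\,w^n_{\sum_{p=1}^{n}i_p2^{n-p}}\Bigr)\Biggl(\prod_{j=1}^{n}c^j_{\sum_{p=1}^{j}i_p2^{j-p}}\Biggr)\,\psi\bigl(S_{i_n}^{-1}\circ\dots\circ S_{i_1}^{-1}(x)\bigr) \qquad (9.6)$$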

where $\operatorname{sgn}(x)$ denotes the sign function. This latter equality suggests the following sub-optimal choice for the parameters: under the additional constraint $d = 2$, $(f(x)-a_0\phi(x))/w_0^0$ takes a form similar to (9.5) if we assume that $g = \psi$, $S_i(x) = \frac{x+i}{2}$ for $i = 0, 1$ and $\varepsilon_k^j = \operatorname{sgn}\bigl(w_k^j/w^{j-1}_{[\frac{k}{2}]}\bigr)$. In the following, we thus fix the values of $d$, $S_0$ and $S_1$ as above and try to find the optimal $g$ and $(\lambda_i^j)$. For now, let us assume that $g$ is known. Hence, our problem is to find, for any $j$, two positive scalars $\lambda_0^j$ and $\lambda_1^j$ such that, if we replace all $(c^j_{2k})$ (respectively $(c^j_{2k+1})$) with $\lambda_0^j$ (respectively $\lambda_1^j$) in (9.6), then we obtain the best $L^2$-approximation of the original signal $f$. In other words, for a given couple $(\phi,\psi)$, we want to find, at each scale $j$, one scalar which "best represents" the wavelet coefficient ratios of $f$ with even indices $k$, and another for the coefficients with odd indices $k$. By using a gradient descent in the time-scale wavelet space, we have shown in [DAO 96] that the two sequences $(\lambda_0^n)_{n\ge1}$ and $(\lambda_1^n)_{n\ge1}$, which are solutions of this simplified inverse problem, are given by:

$$\lambda_i^1 = c_i^1 \quad\text{and}\quad \lambda_i^n = \sum_{0\le k<2^{n-1}} P_k^n\,c^n_{2k+i} \qquad (9.7)$$

for any n > 1 where, for any k ∈ {0, . . . , 2n−1 − 1}: n−1 j n−j cn−1 j=1 λi (k) k n Pk = n−1 j j j 2 2 j=1 |λ0 | + |λ1 |


where the sequence $(i_1(k),\dots,i_{n-1}(k))$ is the unique sequence of $\{0,1\}^{n-1}$ such that $k = \sum_{j=1}^{n-1} i_j(k)2^{n-j-1}$. Moreover, for any $n>1$, we have $\sum_{0\le k<2^{n-1}} P_k^n = 1$.

Unfortunately, such a procedure does not provide us with appropriate representations in practice, and the reasons for this can be analyzed as follows: each $c_k^j$ is defined as the ratio of two wavelet coefficients. While it is assumed that all $w_k^j$ are non-zero, some arbitrarily small values are likely to exist in most real applications. This yields both very large and very small values of $c_k^j$. Obviously, such a large range of variation is an issue for the proposed modeling, since all the coefficients $\{c_k^j,\ k = 0,\dots,2^j-1\}$ are replaced by only two values, and thus a control on the dispersion of the $c_k^j$ is needed. However, what draws our interest is the representation of irregular signals (otherwise, a fractal approach would not make sense). For irregular signals, energy is present at most scales, and consequently the majority of the $c_k^j$ should vary within an intermediate range. Moreover, from a fractal analysis viewpoint, large $c_k^j$ are not interesting as they do not contribute to the regularity of $f$. In addition, if we assume that $f$ is nowhere differentiable, then the Hölder exponent at each point is less than 1; thus, "many" $c_k^j$, including those which control the multifractal properties of $f$, will belong to $[\tfrac12, 1]$ (see [DAO 98] for details). Thus, it appears reasonable to ignore, in our representation, the "large" $c_k^j$, and consider only those that are less than 1. More precisely, we keep the $c_k^j$ unchanged when they are greater than or equal to 1 and we calculate the $(\lambda_i^j)$ that yield the best $L^2$-approximation by considering only the remaining $c_k^j$. Evidently, for this strategy to make sense, it is necessary that the cardinality of $\{c_k^j : c_k^j\ge1\}$ be negligible compared with that of $\{c_k^j : c_k^j<1\}$. This depends on the nature of the signal and on the choice of $g = \psi$.

These constraints lead us to the following criteria for the sub-optimal choice of the analyzing wavelet:

(C1) $\{w_k^j = 0\} = \emptyset$

(C2) the cardinality of $\{c_k^j : c_k^j<1\}$ is maximal.

In practice, because of edge artifacts, wavelet decomposition is often limited to a certain scale $j_0>0$. As a consequence, the $(c_k^j)$ are defined only for $j>j_0$. If we write the signal $f$ as:

$$f(x) = \sum_{0\le k<2^{j_0}} a_k^{j_0}\,\phi(2^{j_0}x-k) + \sum_{n=j_0}^{J-1}\ \sum_{0\le k<2^n} w_k^n\,\psi(2^nx-k)$$

then the problem at hand is to find, for any $j>j_0$, two positive scalars $\lambda_0^j$ and $\lambda_1^j$ such that, when we replace all $(c^j_{2k})$ satisfying $c^j_{2k}<1$ (respectively $(c^j_{2k+1})$ satisfying


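$c^j_{2k+1}<1$) by $\lambda_0^j$ (respectively $\lambda_1^j$), we obtain the best $L^2$-approximation of the original signal $f$. The resulting approximate signal $\tilde f$ is hence a WSA function defined by:

$$\begin{aligned}
\tilde f(x) ={}& \sum_{0\le k<2^{j_0}} a_k^{j_0}\,\phi(2^{j_0}x-k) + \sum_{0\le k<2^{j_0}} w_k^{j_0}\,\psi(2^{j_0}x-k)\\
&+ \sum_{n=j_0+1}^{J-1}\ \sum_{(i_1,\dots,i_n)\in\{0,1\}^n}\operatorname{sgn}\Bigl(w^n_{\sum_{p=1}^{n}i_p2^{n-p}}\Bigr)\,\Bigl|w^{j_0}_{\sum_{p=1}^{j_0}i_p2^{j_0-p}}\Bigr|\,\Biggl(\prod_{j=j_0+1}^{n}\tilde c^j_{\sum_{p=1}^{j}i_p2^{j-p}}\Biggr)\,\psi\bigl(S_{i_n}^{-1}\circ\dots\circ S_{i_1}^{-1}(x)\bigr)
\end{aligned}$$

where:

$$\tilde c^j_{\sum_{p=1}^{j}i_p2^{j-p}} = \begin{cases}
c^j_{\sum_{p=1}^{j}i_p2^{j-p}} & \text{if } c^j_{\sum_{p=1}^{j}i_p2^{j-p}}\ge1\\[1ex]
\lambda^j_{i_j} & \text{otherwise}
\end{cases} \qquad (9.8)$$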

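Let us observe that, for $n>j_0$ and $k\in\{0,\dots,2^n-1\}$, the wavelet coefficient $\tilde w_k^n$ of $\tilde f$ is given by:

$$\tilde w_k^n = \operatorname{sgn}(w_k^n)\,\Bigl|w^{j_0}_{[\frac{k}{2^{n-j_0}}]}\Bigr|\prod_{j=j_0+1}^{n}\tilde c^j_{[\frac{k}{2^{n-j}}]}$$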

Since the study is restricted to orthogonal wavelet transforms, which preserve energy, the goal is to find two positive sequences $(\lambda_0^n)_{n=j_0+1,\dots,J-1}$ and $(\lambda_1^n)_{n=j_0+1,\dots,J-1}$ that satisfy:

$$\operatorname{argmin}\ \sum_{n=j_0+1}^{J-1}\ \sum_{0\le k<2^n}|w_k^n-\tilde w_k^n|^2$$

However, finding this global minimum is a difficult problem. A local minimum can nevertheless be obtained by successively calculating, for $n = j_0+1,\dots,J-1$, the pair $(\lambda_0^n,\lambda_1^n)$ that satisfies:

$$\operatorname{argmin}\ \sum_{0\le k<2^n}|w_k^n-\tilde w_k^n|^2 \qquad (9.9)$$

The solution to problem (9.9) is given in the following proposition (see [DAO 02] for the proof).


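PROPOSITION 9.6.– For $i = 0, 1$ and $n>j_0$, the $(\lambda_i^n)$ solution of (9.9) are recursively given by:

$$\lambda_i^{j_0+1} = \frac{\displaystyle\sum_{\substack{0\le k<2^{j_0}\\ |c^{j_0+1}_{2k+i}|<1}}\bigl|w_k^{j_0}\bigr|^2\,c^{j_0+1}_{2k+i}}{\displaystyle\sum_{\substack{0\le k<2^{j_0}\\ |c^{j_0+1}_{2k+i}|<1}}\bigl|w_k^{j_0}\bigr|^2} \qquad (9.10)$$

$$\lambda_i^n = \frac{\displaystyle\sum_{\substack{0\le k<2^{n-1}\\ |c^n_{2k+i}|<1}}\Bigl|w^{j_0}_{[\frac{2k+i}{2^{n-j_0}}]}\Bigr|^2\,\prod_{j=j_0+1}^{n-1}\tilde c^j_{[\frac{2k+i}{2^{n-j}}]}\,c^j_{[\frac{2k+i}{2^{n-j}}]}\ \,c^n_{2k+i}}{\displaystyle\sum_{\substack{0\le k<2^{n-1}\\ |c^n_{2k+i}|<1}}\Bigl|w^{j_0}_{[\frac{2k+i}{2^{n-j_0}}]}\Bigr|^2\,\prod_{j=j_0+1}^{n-1}\Bigl|\tilde c^j_{[\frac{2k+i}{2^{n-j}}]}\Bigr|^2} \qquad (9.11)$$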

for $n = j_0+2,\dots,J-1$.

Formulae (9.8), (9.10) and (9.11) define the approximation of $f$ by a WSA function.

NOTE 9.3.– Obviously, it is possible to develop a similar algorithm for ordinary self-similar functions. However, such an algorithm would restrict the search to a class of functions smaller than that of WSA functions, and would therefore yield less precise representations in general. Moreover, since ordinary self-similar functions are a particular case of WSA functions, the proposed algorithm is also able to model signals that are properly represented by ordinary self-similar functions. Indeed, the $\lambda_i^n$ given by (9.11) would then be equal for all $n$.

9.8. Segmentation of signals by weak self-similar functions

In many scenarios, a single WSA function cannot alone represent a signal. A typical example is the concatenation of two IFS. Consider a signal $X$ on $[0,1]$ whose restrictions to $[0,\tfrac12]$ and $[\tfrac12,1]$ consist of the attractors of two different IFS. The modeling of $X$ by a single WSA function would result in a significant global $L^2$ error, whereas two WSA functions would yield a perfect approximation (using a Schauder basis as wavelet decomposition). It is therefore important to design a procedure that segments a given signal into several parts, each one being appropriately represented by a WSA function. As in the previous section, it is worth mentioning that constructing an optimal algorithm is a difficult task. In what follows, we present a segmentation method which yields good results in practice.

Let us consider the dyadic tree of depth $J$ for which, by convention, the root node lives at level zero and the leaves at level $J-1$. The nodes at each level $j\in\{0,\dots,J-1\}$ are numbered from left to right by $(l,j)$ with $l = 0,\dots,2^j-1$. The coefficient $c_l^j$ is associated with each node $(l,j)$


such that $j>j_0$. The segmentation algorithm is based on the fact that the sub-tree extending from node $(l,j)$ completely determines the restriction of $f$ to $I(l,j) = \{l2^{J-j},\dots,(l+1)2^{J-j}-1\}$. The aim is to define and measure the error associated with each node $(l,j)$ by the $L^2$ distance between the restriction of the original signal $f$ to $I(l,j)$ and the representation of this restriction by a single WSA function. In order to account for the fact that large scale errors have more impact than small scale ones, errors are weighted (see below). Starting from the root node, which obviously corresponds to a single WSA function for the whole signal, we recursively divide the tree until the error falls, for each sub-tree, below a given threshold defined a priori. Each resulting sub-tree is "appropriately" represented by a (single) WSA function and the union of the corresponding $I(l,j)$ defines the segmentation of $f$.

More precisely, the set of integers $I(l,j,n) = \{l2^{n-j},\dots,(l+1)2^{n-j}-1\}$ is associated with each node $(l,j)$ and each $n>j$. Let $\lambda_i^n(l,j)$ denote the ratio of sums similar to those in (9.11), but where the indices are determined by ($k\in I(l,j,n-1)$, $c^n_{2k+i}<1$). Let us denote:

$$e_i^n(l,j) = \frac{1}{2^{n-j}}\sum_{\substack{k\in I(l,j,n-1)\\ c^n_{2k+i}<1}}\bigl(\lambda_i^n(l,j)-|c^n_{2k+i}|\bigr)^2$$

$$e_i(l,j) = \sum_{n=j+1}^{J-1} e_i^n(l,j)$$

and:

$$e(l,j) = \frac{e_0(l,j)+e_1(l,j)}{\sigma(j)}$$

where $\sigma$ is a positive increasing function. The quantity $e(l,j)$ is the error function, and $\sigma$ is introduced to account for the fact that errors made at coarse scales have more impact than those made at fine scales. The proposed segmentation algorithm can now be formulated (see Algorithm 9.1). The result of this algorithm is the segmentation of $f$ into consecutive parts, each being properly represented by a WSA function. From this point of view, this segmentation approach is a new type of tool. Instead of dividing the original signal into homogeneous parts according to usual criteria such as local average or fractal dimension, we use a criterion based on multifractal stationarity. Indeed, each segment has a well-defined multiplicative structure, with a multifractal spectrum given by Theorem 9.2. As an application of this new segmentation scheme, we will examine in the next section how it enables us to estimate non-concave multifractal spectra.


Fix ε > 0; node = root node; (this is the initialization)
function segmentation(node)
Begin
    (l, j) = number of the node;
    If e(l, j) < ε, then:
        {f(m), m ∈ I(l, j, J)} is approximated by the weak self-similar function defined by {λ_i^n(l, j), n = j + 1, . . . , J − 1, i = 0, 1}
    Otherwise:
        segmentation(left child of the node);
        segmentation(right child of the node);
End

Algorithm 9.1. Segmentation algorithm
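The recursion of Algorithm 9.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the error function `error(l, j)` is supplied by the caller (here a toy, hypothetical function stands in for $e(l, j)$), the children of node $(l, j)$ are taken to be $(2l, j + 1)$ and $(2l + 1, j + 1)$, and a stopping test at the finest level is added as a safeguard.

```python
from typing import Callable, List, Tuple

Node = Tuple[int, int]  # (l, j): position l at scale j in the dyadic tree


def interval(node: Node, J: int) -> Tuple[int, int]:
    """First and last sample index of I(l, j) = {l*2^(J-j), ..., (l+1)*2^(J-j) - 1}."""
    l, j = node
    width = 2 ** (J - j)
    return l * width, (l + 1) * width - 1


def segmentation(node: Node, error: Callable[[int, int], float],
                 eps: float, J: int) -> List[Node]:
    """Return the nodes whose intervals form the segmentation below `node`."""
    l, j = node
    # Stop when the weighted error is below the threshold, or when the
    # finest level is reached (safeguard added for this sketch).
    if error(l, j) < eps or j >= J - 1:
        return [node]
    left, right = (2 * l, j + 1), (2 * l + 1, j + 1)
    return segmentation(left, error, eps, J) + segmentation(right, error, eps, J)


# Toy error (hypothetical): large when the interval straddles a change at sample 2.
J = 3


def toy_error(l: int, j: int) -> float:
    start, end = interval((l, j), J)
    return 1.0 if start < 2 <= end else 0.0


leaves = segmentation((0, 0), toy_error, eps=0.5, J=J)
segments = [interval(node, J) for node in leaves]
```

With this toy error, the eight samples are split into the dyadic intervals that isolate the change point; a real use would replace `toy_error` by the weighted error $e(l, j)$ computed from the wavelet coefficients.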

NOTE 9.4.– Our algorithm suffers from a weakness: segmentation can only occur at dyadic points, which causes a significant loss of accuracy when the “real” segments are not aligned with dyadic points. This difficulty frequently arises when using dyadic wavelets and it can be solved using standard techniques such as non-decimated wavelet transforms.

9.9. Estimation of the multifractal spectrum

Representation by means of weak self-similar functions offers a semi-parametric approach to the estimation of the spectrum $d(\alpha)$. First, $f$ is segmented into homogeneous parts by using the algorithm described in the previous section. Each subpart $P_i$, with $i = 1, \ldots, p$, is represented by a single WSA function $F_i$ whose spectrum $d_i$ can be calculated using Theorem 9.2. Because the number of segments is finite, the dimension $d(\alpha)$ associated with the Hölder exponent $\alpha$ for the signal is thus the maximum of the $d_i(\alpha)$, with $i = 1, \ldots, p$. Therefore, the semi-parametric estimation of $d(\alpha)$ is:

$$\hat{d}(\alpha) = \max_{i=1,\ldots,p} d_i(\alpha)$$
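As a small illustration of the estimator above, the following Python sketch takes the pointwise maximum of per-segment spectra on a grid of exponents. The two parabolic spectra are toy, hypothetical stand-ins for the $d_i$ given by Theorem 9.2 (which is not reproduced here); $-\infty$ is used to mark exponents outside the support of a spectrum.

```python
import math
from typing import Callable, List, Sequence

Spectrum = Callable[[float], float]


def parabolic_spectrum(center: float, peak: float, curvature: float) -> Spectrum:
    """Toy concave spectrum; -inf marks exponents outside its support."""
    def d(alpha: float) -> float:
        value = peak - curvature * (alpha - center) ** 2
        return value if value >= 0.0 else -math.inf
    return d


def estimated_spectrum(spectra: Sequence[Spectrum],
                       alphas: Sequence[float]) -> List[float]:
    """Pointwise maximum of the per-segment spectra over a grid of exponents."""
    return [max(d(a) for d in spectra) for a in alphas]


d1 = parabolic_spectrum(center=0.3, peak=1.0, curvature=25.0)
d2 = parabolic_spectrum(center=0.9, peak=0.8, curvature=25.0)
d_hat = estimated_spectrum([d1, d2], [0.3, 0.6, 0.9])
```

Each toy spectrum is concave on its support, but their maximum has two separate modes (around 0.3 and 0.9) with a gap in between, so it is not concave.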

Obviously, each $d_i$ is concave, although, in general, this is not the case for $\hat{d}$. The example of the concatenation of two IFS shows that the estimated spectrum $\hat{d}$ coincides exactly with the theoretical spectrum and exhibits two modes, as is characteristic of a phase transition.

It is important to note that, for a given $\delta > 0$, it is easy to construct two functions $f_1$ and $f_2$ such that $\|f_1 - f_2\|_{L^2} < \delta$ or $\|f_1 - f_2\|_\infty < \delta$, yet with very different spectra. Therefore, we cannot conclude that, in general, the spectrum of the original signal is close to that of the approximating WSA function. However, when suitable criteria confirm that the physical properties of the original signal and of the approximating signal are close, this conclusion can make sense. For example, in the case of speech signals, studied later, an obvious criterion is that of auditory comparison. As far as Internet traffic applications are concerned, the chosen criterion will be the comparison between our estimation of the spectrum and that proposed by other approaches. Indeed, since these approaches are qualitatively different and yield more or less equivalent spectra (as we will see later), this lends credit to the quality of the proposed estimation.

9.10. Experiments

The first example consists of the representation of the word “welcome” pronounced by a male speaker. The signal contains $2^{15}$ samples, we take $j_0 = 8$ and we use the Daubechies-16 wavelet (as explained before, the choice of wavelet is based on the criteria (C1) and (C2)). In this experiment, as in the following one, we take $\sigma(j) = j^2$. By using a threshold $\varepsilon = 50$, we obtain a representation with seven WSA functions, where 64% of the coefficients are processed (the remaining 36% correspond to the tree levels coarser than $j_0$ or to values of $c^j_k$ larger than 1). Figure 9.8 shows that the original and the approximating signals are visually almost identical. More significantly, we cannot distinguish the two signals by ear, as can be checked at: http://www-rocq.inria.fr/fractales. In addition, the segmentation (see the crosses in Figure 9.8) is phonetically consistent, since it coincides almost perfectly with the sounds: silence, /w/, /l/, silence, /k/, /om/, silence. The slight difference between the positions of the segmentation marks and the exact transition points between phonetic units is due to the fact that, in our current implementation, the segmentation is restricted to the dyadic points.

Figure 9.8. The word “welcome” pronounced by a male speaker (in black) with its approximation (superimposed in gray) and the segmentation marks (the crosses)

In the second example, we present an application to Internet traffic signals. We use a signal of 512 traffic samples coming out of Berkeley, measured in bytes per time step. The analyzing wavelet is Daubechies-4, $j_0 = 4$ and 65% of the coefficients have been processed. With a threshold $\varepsilon = 30$, we obtain two segments. Figure 9.9 shows the original signal (in black), its approximation (in gray), the segmentation marks (the crosses) and the estimated spectrum on each segment (Figure 9.9b). It is interesting to compare these spectra with those estimated in [LEV 97]: the results are very similar, whereas if we use more segments (or a single segment), a clear difference appears (see Figure 9.10 for the result obtained with four segments). Since the method used here and that of [LEV 97] are very different, this agreement suggests that this particular signal possesses two parts, each being stationary in a multifractal sense. Such information can be useful for a better understanding of the traffic structure.

[Figure 9.9: (a) signal plotted against the sample index, amplitude axis scaled by $10^4$; (b) “Singularity Spectrum” plotted against the Hölder exponent]

Figure 9.9. (a) Original Internet traffic signal (in black), its approximation (in gray) using two segments and the segmentation marks (the crosses), (b) estimated spectrum for each segment. The estimated spectrum for the whole signal is the upper envelope of the two curves

[Figure 9.10: (a) signal plotted against the sample index, amplitude axis scaled by $10^4$; (b) “Singularity Spectrum” plotted against the Hölder exponent]

Figure 9.10. (a) Original Internet traffic signal (in black), its approximation (in gray) using four segments and the segmentation marks (the crosses), (b) estimated spectrum for each segment. The estimated spectrum for the whole signal is the convex hull of the two curves


9.11. Bibliography

[ABE 92] ABENDA S., DEMKO S., TURCHETTI G., “Local moments and inverse problem for fractal measures”, Inverse Problems, vol. 8, p. 739–750, 1992.
[AND 92] ANDERSSON L.M., “Recursive construction of fractals”, Annales Academiae Scientiarum Fennicae, 1992.
[ARB 02] ARBEITER M., PATZSCHKE N., “Random self-similar multifractals”, Advances in Mathematics, 2002.
[BAC 91] BACRY E., ARNÉODO A., FRISCH U., GAGNE Y., HOPFINGER E., Wavelet Analysis of Fully Developed Turbulence Data and Measurement of Scaling Exponents, Kluwer Academic Publishers, 1991.
[BAC 93] BACRY E., MUZY J.F., ARNÉODO A., “Singularity spectrum of fractal signal from wavelet analysis: exact results”, J. Stat. Phys., vol. 70, no. 3-4, p. 635–674, 1993.
[BAR 88] BARNSLEY M.F., DEMKO S., ELTON J., GERONIMO J., “Invariant measures for Markov processes arising from iterated function systems with place-dependent probabilities”, Annales de l’IHP, Probabilité et Statistique, vol. 24, no. 3, p. 367–394, 1988.
[BAR 85a] BARNSLEY M.F., “Fractal functions and interpolation”, Constructive Approximation, 1985.
[BAR 85b] BARNSLEY M.F., DEMKO S., “Iterated function system and the global construction of fractals”, Proceedings of the Royal Society, vol. A399, p. 243–245, 1985.
[BAR 86] BARNSLEY M.F., ERVIN V., HARDIN D., LANCASTER J., “Solution of an inverse problem for fractals and other sets”, Proc. Natl. Acad. Sci. USA, vol. 83, 1986.
[BAR 93a] BARNSLEY M.F., Fractal Image Compression, A.K. Peters, 1993.
[BAR 93b] BARNSLEY M.F., Fractals Everywhere, Academic Press, 1993.
[CAB 92] CABRELLI C.A., FORTE B., MOLTER U.M., VRSCAY E.R., “Iterated fuzzy set systems: A new approach to the inverse problem for fractal and other sets”, Math. Analysis and Applications, vol. 171, no. 1, p. 79–100, 1992.
[CEN 93] CENTORE P.M., VRSCAY E.R., “Continuity of attractors and invariant measures for iterated functions systems”, Canadian Math. Bull., vol. 37, p. 315–329, 1993.
[DAO 96] DAOUDI K., Généralisations des IFS: Applications au Traitement du Signal, PhD Thesis, Paris 9 University, 1996.
[DAO 98] DAOUDI K., LÉVY VÉHEL J., MEYER Y., “Construction of continuous functions with prescribed local regularity”, Constructive Approximation, vol. 14, no. 3, p. 349–386, 1998.
[DAO 02] DAOUDI K., LÉVY VÉHEL J., “Signal representation and segmentation based on multifractal stationarity”, Signal Processing, vol. 82, no. 12, p. 2015–2024, 2002.
[DUV 92] DUVALL P.F., HUSCH L.S., “Attractors of iterated function systems”, Proc. of the Amer. Math. Society, vol. 116, no. 1, 1992.
[EVE 92] EVERTSZ C.J.G., MANDELBROT B.B., “Self-similarity of the harmonic measure on DLA”, Physica A, vol. 185, p. 77–86, 1992.


[FAL 90] FALCONER K.J., Fractal Geometry: Mathematical Foundations and Applications, John Wiley & Sons, 1990.
[FAL 94] FALCONER K.J., “The multifractal spectrum of statistically self-similar measures”, Journal of Theoretical Probability, vol. 7, no. 3, p. 681–702, 1994.
[FIS 93] FISHER Y., JACOBS E.W., BOSS R.D., “Fractal image compression using iterated transforms”, in Data Compression, Kluwer Academic Publishers, 1993.
[FOR 94] FORTE B., LO SCHIAVO M., VRSCAY E.R., “Continuity properties of attractors for iterated fuzzy set systems”, J. Australian Math. Soc., vol. B 36, p. 175–193, 1994.
[FOR 95] FORTE B., VRSCAY E.R., “Solving the inverse problem for measures using iterated function systems: a new approach”, Adv. Appl. Prob, vol. 27, p. 800–820, 1995.
[FRI 95] FRISCH U., PARISI G., “Fully developed turbulence and intermittency”, in Proceedings of the International Summer School on “Turbulence and Predictability in Geophysical Fluid Dynamics and Climate Dynamics”, p. 84–88, 1985.
[GON 92a] GONÇALVES P., FLANDRIN P., “Bilinear time-scale analysis applied to local scaling exponents estimation”, in Progress in Wavelet Analysis and Application (Toulouse, France), p. 271–276, June 1992.
[GON 92b] GONÇALVES P., FLANDRIN P., “Scaling exponents estimation from time-scale energy distributions”, in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1992.
[GRA 87] GRAF S., “Statistically self-similar fractals”, Probability Theory and Related Fields, vol. 74, p. 357–392, 1987.
[HIR 02] HIRATA T., IMOTO M., “Multifractal analysis of spatial distribution of microearthquakes in the Kanto region”, Geophys. J. Int., vol. 107, p. 155–162, 2002.
[HUT 81] HUTCHINSON J., “Fractals and self-similarity”, Indiana University Journal of Mathematics, vol. 30, p. 713–747, 1981.
[JAC 93a] JACOBS E.W., BOSS R.D., FISHER Y., Fractal based image compression, Technical Report, Naval Ocean System Center, 1993.
[JAC 93b] JACOBS E.W., BOSS R.D., FISHER Y., “Image compression: a study of the iterated transform method”, Signal Processing, vol. 29, 1993.
[JAF 89] JAFFARD S., “Exposants de Hölder en des points donnés et coefficients d’ondelettes”, Comptes rendus de l’Académie des sciences de Paris, vol. 308, no. 1, p. 79–81, 1989.
[JAF 91] JAFFARD S., “Pointwise smoothness, two-microlocalization and wavelet coefficients”, Publications Mathématiques, vol. 35, p. 155–168, 1991.
[JAF 92] JAFFARD S., MEYER Y., Pointwise Regularity of Functions and Wavelet Coefficients, Masson, 1992.
[JAF 95] JAFFARD S., “Functions with prescribed Hölder exponent”, Applied and Computational Harmonic Analysis, vol. 2, no. 4, p. 400–401, 1995.
[JAF 02a] JAFFARD S., “Multifractal formalism functions: Parts 1 and 2”, SIAM Journal of Mathematical Analysis, 2002.


[JAF 02b] JAFFARD S., MANDELBROT B.B., “Local regularity of non-smooth wavelet expansions and applications to the Polya function”, Advances in Mathematics, 2002.
[JAG 90] JAGGARD D.L., “On fractal electrodynamics”, in Recent Advances in Electromagnetic Research, Springer-Verlag, 1990.
[LEV 95] LÉVY VÉHEL J., “Fractal approaches in signal processing”, Fractals, vol. 3, no. 4, p. 755–775, 1995.
[LEV 96] LÉVY VÉHEL J., “Introduction to the multifractal analysis of images”, in FISHER Y. (Ed.), Fractal Image Encoding and Analysis, Springer-Verlag, 1996.
[LEV 97] LÉVY VÉHEL J., RIEDI R., “Fractional Brownian motion and data traffic modeling: the other end of the spectrum”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering, Springer-Verlag, 1997.
[LEV 01] LÉVY VÉHEL J., “Weakly self-affine functions and applications in signal processing”, Cuadernos del Instituto Matematica Beppo Levi, vol. 30, p. 35–49, 2001.
[MAL 92] MALLAT S., HWANG W.L., “Singularity detection and processing with wavelets”, IEEE Trans. on Information Theory, vol. 38, no. 2, 1992.
[MAND 68] MANDELBROT B.B., “Fractional Brownian motions, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MAND 72] MANDELBROT B.B., “Possible refinement of the lognormal hypothesis concerning the distribution of energy dissipation in intermittent turbulence”, in ROSENBLATT M., VAN ATTA C. (Eds.), Statistical Models and Turbulence (La Jolla, California), Springer, Lecture Notes in Physics 12, p. 331–351, 1972.
[MAND 74] MANDELBROT B.B., “Intermittent turbulence in self-similar cascades: Divergence of high moments and dimension of the carrier”, J. Fluid Mech., vol. 62, p. 331–358, 1974.
[MAND 89] MANDELBROT B.B., “A class of multinomial multifractal measures with negative (latent) values for the dimension f(α)”, in PIETRONERO L. (Ed.), Fractals’ Physical Origin and Properties (Erice), Plenum, New York, p. 3–29, 1989.
[MAND 91] MANDELBROT B.B., EVERTSZ C.J.G., “Multifractality of the harmonic measure on fractal aggregates and extended self-similarity”, Physica A, vol. 177, p. 386–393, 1991.
[MAND 93] MANDELBROT B.B., “A fractal’s lacunarity, and how it can be tuned and measured”, Fractals in Biology and Medicine, p. 8–21, 1993.
[MANT 89] MANTICA G., SLOAN A., “Chaotic optimization and the construction of fractals: Solution of an inverse problem”, Complex Systems, vol. 3, p. 37–62, 1989.
[MAU 86] MAULDIN R.D., WILLIAMS S.C., “Random recursive constructions: Asymptotic geometry and topological properties”, Trans. Amer. Math. Soc., vol. 295, p. 325–346, 1986.
[MAZ 92] MAZEL D.S., HAYES M.H., “Using iterated function systems to model discrete sequences”, IEEE Trans. on Signal Processing, vol. 40, no. 7, 1992.
[MEY 90a] MEYER Y., Ondelettes et opérateurs. Vol. 1: ondelettes, Hermann, 1990.
[MEY 90b] MEYER Y., Ondelettes et opérateurs. Vol. 2: opérateurs de Calderon-Zygmund, Hermann, 1990.


[OLS 94] OLSEN L., “Random geometrically graph directed self-similar multifractals”, in Pitman Research Notes in Mathematics, Series 307, 1994.
[OLS 02] OLSEN L., “A multifractal formalism”, Advances in Mathematics, 2002.
[RIE 94] RIEDI R., “Explicit bounds for the Hausdorff dimension of certain self-similar sets”, in NOVAK M.M. (Ed.), Fractals in the Natural and Applied Sciences, North-Holland, IFIP Transactions, p. 313–324, 1994.
[RIE 95] RIEDI R., MANDELBROT B.B., “Multifractal formalism for infinite multinomial measures”, Advances in Applied Mathematics, vol. 16, p. 132–150, 1995.
[TRIC 82] TRICOT C., “Two definitions of fractional dimension”, Math. Proc. Camb. Phil. Soc., vol. 91, p. 54–74, 1982.
[TRIC 93] TRICOT C., Courbes et dimension fractale, Springer-Verlag, 1993.
[TRIE 78] TRIEBEL H., Spaces of Besov-Hardy-Sobolev Type, Teubner, Texte zur Mathematik, 1978.
[VOJ 95] VOJAK R., LÉVY VÉHEL J., “Higher order multifractal analysis”, SIAM Journal on Mathematical Analysis, 1995.
[VRS 90] VRSCAY E.R., “Moment and collage methods for the inverse problem of fractal construction with iterated function systems”, in Proceedings of the Fractal’90 Conference (Lisbon, Portugal), 1990.
[VRS 91a] VRSCAY E.R., “Iterated function systems: Theory, applications, and the inverse problem”, in Fractal Geometry and Analysis, p. 405–468, 1991.
[VRS 91b] VRSCAY E.R., WEIL D., ““Missing moment” and perturbative methods for polynomial iterated function systems”, Physica D, vol. 50, p. 478–492, 1991.
[WEI 95] WEIERSTRASS K., “On continuous function of a real argument that do not have a well-defined differential quotient”, Mathematische Werke, p. 71–74, 1895.

Chapter 10

Iterated Function Systems and Applications in Image Processing

10.1. Introduction

An iterated function system (IFS) makes it possible to generate fractal images from a set of contracting transformations [BARN 88]. Conversely, we can start from a simple fractal image and determine the parameters of the contracting transformations that synthesize it (this process is referred to as the inverse problem). The first part of this chapter reviews some basic concepts necessary for understanding IFS theory. An adaptation of this theory, initially proposed in [JACQ 92], makes it possible to generate non-fractal images of natural scenes from a set of local contracting transformations. The purpose of the second part of this chapter is to introduce this method, which can be used for analyzing or coding images. We then describe the principles of natural image coding by fractals, which consists of automatically calculating the transformation parameters that generate a given image. Finally, we present various solutions which speed up the automatic calculation of the contracting transformations and improve the quality of reconstructed images.

10.2. Iterated transformation systems

In this section, we recall the main results of the theory of iterated transformation systems, which enables the coding and synthesis of binary fractal images

Chapter written by Franck DAVOINE and Jean-Marc CHASSERY.


and of gray-level images. The concept of IFS is also discussed, from a broader perspective, in Chapter 9 for the generation of continuous function graphs and the characterization of Hölder functions.

10.2.1. Contracting transformations and iterated transformation systems

10.2.1.1. Lipschitzian transformation

Let $\omega: \mathbb{R}^2 \to \mathbb{R}^2$ be a transformation defined on the metric space $(\mathbb{R}^2, d)$. The symbol $d$ indicates the distance between two points of $\mathbb{R}^2$. The transformation $\omega$ is called Lipschitzian with strictly positive real Lipschitz factor $s$, if:

$$d\bigl(\omega(x), \omega(y)\bigr) \le s \cdot d(x, y) \quad \forall x, y \in \mathbb{R}^2 \qquad (10.1)$$

10.2.1.2. Contracting transformation

Let $\omega: \mathbb{R}^2 \to \mathbb{R}^2$ be a transformation defined on the metric space $(\mathbb{R}^2, d)$. The transformation $\omega$ is said to be contracting with real contraction factor $s$, $0 < s < 1$, if:

$$d\bigl(\omega(x), \omega(y)\bigr) \le s \cdot d(x, y) \quad \forall x, y \in \mathbb{R}^2 \qquad (10.2)$$

10.2.1.3. Fixed point

A contracting transformation $\omega$ possesses a single fixed point $x_f \in \mathbb{R}^2$, such that $\omega(x_f) = x_f$. Let $\omega^{\circ n}(\cdot)$ denote the application of $\omega$ iterated $n$ times. For any point $x$ of $\mathbb{R}^2$, the sequence $\{\omega^{\circ n}(x) : n = 0, 1, 2, \ldots\}$ converges towards $x_f$:

$$\lim_{n \to \infty} \omega^{\circ n}(x) = x_f \quad \forall x \in \mathbb{R}^2 \qquad (10.3)$$
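The convergence stated in (10.3) is easy to observe numerically. The sketch below iterates a hypothetical affine contraction of $\mathbb{R}^2$ (factor 0.5, chosen only for illustration) from two different starting points; both sequences approach the same fixed point.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def iterate(omega: Callable[[Point], Point], x: Point, n: int) -> Point:
    """Apply omega n times to x, i.e. compute the n-th iterate of omega at x."""
    for _ in range(n):
        x = omega(x)
    return x


def omega(p: Point) -> Point:
    """Affine contraction with factor 0.5; solving omega(p) = p gives (2, -4)."""
    x, y = p
    return 0.5 * x + 1.0, 0.5 * y - 2.0


a = iterate(omega, (100.0, 100.0), 60)
b = iterate(omega, (-7.0, 3.0), 60)
```

After 60 iterations the distance to the fixed point has been multiplied by $0.5^{60}$, so `a` and `b` agree with $(2, -4)$ to within floating-point accuracy.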

10.2.1.4. Hausdorff distance

Let us consider the metric space $(\mathbb{R}^2, d)$. The symbol $H(\mathbb{R}^2)$ indicates the space whose elements are the non-empty compact subsets of $\mathbb{R}^2$. The distance [TRI 93] from the point $x$ of $\mathbb{R}^2$ to the set $B$ of $H(\mathbb{R}^2)$, noted $d(x, B)$, is defined by:

$$d(x, B) = \min\{d(x, y) : y \in B\}$$

The distance from the set $A$ of $H(\mathbb{R}^2)$ to the set $B$ of $H(\mathbb{R}^2)$, noted $d(A, B)$, is defined by:

$$d(A, B) = \max\{d(x, B) : x \in A\}$$

The Hausdorff distance between two sets $A$ and $B$ of $H(\mathbb{R}^2)$, noted $h_d(A, B)$, is defined by:

$$h_d(A, B) = \max\{d(A, B), d(B, A)\} \qquad (10.4)$$
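For finite point sets, the three definitions above translate directly into code. The following sketch assumes the Euclidean distance for $d$ and uses finite lists of points as stand-ins for compact sets; the function names are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_to_set(x: Point, B: Sequence[Point]) -> float:
    """d(x, B) = min{d(x, y) : y in B}, with d the Euclidean distance."""
    return min(math.dist(x, y) for y in B)


def directed(A: Sequence[Point], B: Sequence[Point]) -> float:
    """d(A, B) = max{d(x, B) : x in A}; not symmetric in general."""
    return max(point_to_set(x, B) for x in A)


def hausdorff(A: Sequence[Point], B: Sequence[Point]) -> float:
    """h_d(A, B) = max{d(A, B), d(B, A)}."""
    return max(directed(A, B), directed(B, A))


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (3.0, 0.0)]
d_ab, d_ba, h = directed(A, B), directed(B, A), hausdorff(A, B)
```

In this example $d(A, B) = 1$ while $d(B, A) = 2$, which shows why the maximum of the two directed distances is needed to obtain a symmetric quantity.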

Only when applied to closed and bounded sets (also referred to as compacts) does the Hausdorff distance satisfy all the properties of a distance (in particular, symmetry). It should not be confused with the concept of Hausdorff dimension presented in Chapter 1 and Chapter 3 of this volume.

10.2.1.5. Contracting transformation on the space $H(\mathbb{R}^2)$

Let $\omega: \mathbb{R}^2 \to \mathbb{R}^2$ be a contracting transformation defined on the metric space $(\mathbb{R}^2, d)$ with the real $s$ as contraction factor. The transformation $\omega: H(\mathbb{R}^2) \to H(\mathbb{R}^2)$ defined by:

$$\omega(B) = \{\omega(x) : x \in B\}, \quad \forall B \in H(\mathbb{R}^2) \qquad (10.5)$$

is contracting on $(H(\mathbb{R}^2), h_d)$, with contraction factor $s$. The symbol $h_d$ indicates the Hausdorff distance.

10.2.1.6. Iterated transformation system

An IFS defined on the complete metric space $(\mathbb{R}^2, d)$ is composed of a set of $N$ transformations $\omega_i: \mathbb{R}^2 \to \mathbb{R}^2$ ($i = 1, \ldots, N$), each of them associated with a Lipschitz factor $s_i$. From now on, in this section, the $N$ transformations are assumed to be contracting: the transformation system is in this case called a hyperbolic IFS. The contraction factor of the hyperbolic IFS, noted $s$, is equal to $\max\{s_i : i = 1, \ldots, N\}$.

10.2.2. Attractor of an iterated transformation system

Let us consider an IFS $\{\mathbb{R}^2; \omega_i, i = 1, \ldots, N\}$. It has been demonstrated [BARN 93] that the operator $W: H(\mathbb{R}^2) \to H(\mathbb{R}^2)$ defined by:

$$W(B) = \bigcup_{i=1}^{N} \omega_i(B), \quad \forall B \in H(\mathbb{R}^2) \qquad (10.6)$$

is contracting and that its contraction factor is that of the IFS. The operator $W$ possesses a single fixed point $A_t \in H(\mathbb{R}^2)$ given by:

$$A_t = W(A_t) = \lim_{n \to \infty} W^{\circ n}(X), \quad \forall X \in H(\mathbb{R}^2) \qquad (10.7)$$

The object $A_t$ is also called the attractor of the IFS. It is invariant under the transformation $W$ and is equal to the union of $N$ copies of itself transformed by $\omega_1, \ldots, \omega_N$. This invariant object is called self-similar, or “self-affine” when the elementary transformations $\omega_i$ are affine.


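EXAMPLE 10.1.– Let us consider the IFS $\{\mathbb{R}^2; \omega_i, i = 1, \ldots, 3\}$ composed of the following affine transformations:

$$\begin{aligned}
\omega_1\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0 \\ \frac{y_0}{2} \end{pmatrix} \\
\omega_2\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -\frac{x_0}{2} \\ -\frac{y_0}{2} \end{pmatrix} \\
\omega_3\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} \frac{x_0}{2} \\ -\frac{y_0}{2} \end{pmatrix}
\end{aligned} \qquad (10.8)$$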

Its contraction factor is equal to 0.5, since each $\omega_i$ halves all distances. The attractor coded by the IFS, called the Sierpinski triangle, is represented in Figure 10.1.

[Figure 10.1: diagram in the $(x, y)$ plane showing the origin $(0, 0)$, the corner $(-x_0, -y_0)$ and the three images labeled $\omega_1$, $\omega_2$ and $\omega_3$]

Figure 10.1. Attractor of the iterated transformation system of Example 10.1. The initial square, originally centered and with sides of length $2x_0$, $2y_0$, is transformed into three homothetic squares by contracting transformations $\omega_1$, $\omega_2$ and $\omega_3$. This process is then iterated
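The iteration of the operator $W$ of (10.6) on the maps of Example 10.1 can be sketched as follows. This is a minimal illustration, assuming $x_0 = y_0 = 1$ and starting from the single point $(0, 0)$ rather than from a square; the helper names are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

Point = Tuple[float, float]
Map = Callable[[Point], Point]


def affine_half(tx: float, ty: float) -> Map:
    """Return the map (x, y) -> (x/2 + tx, y/2 + ty)."""
    def omega(p: Point) -> Point:
        return 0.5 * p[0] + tx, 0.5 * p[1] + ty
    return omega


def sierpinski_maps(x0: float, y0: float) -> List[Map]:
    """The three affine maps of Example 10.1."""
    return [affine_half(0.0, y0 / 2),
            affine_half(-x0 / 2, -y0 / 2),
            affine_half(x0 / 2, -y0 / 2)]


def hutchinson(maps: List[Map], points: Iterable[Point]) -> FrozenSet[Point]:
    """W(B): union of the images of the point set B under every map."""
    return frozenset(w(p) for w in maps for p in points)


def approximate_attractor(maps: List[Map], start: Iterable[Point],
                          n: int) -> FrozenSet[Point]:
    """Apply W n times to the starting set."""
    points = frozenset(start)
    for _ in range(n):
        points = hutchinson(maps, points)
    return points


attractor = approximate_attractor(sierpinski_maps(1.0, 1.0), [(0.0, 0.0)], 5)
```

Each application of $W$ triples the number of points, and all points remain inside the initial square $[-1, 1] \times [-1, 1]$; plotting `attractor` for larger `n` gives an increasingly fine picture of the Sierpinski triangle.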

10.2.3. Collage theorem

This theorem, proven in [BARN 93], provides an upper bound on the Hausdorff distance $h_d$ between a point $A$ of $H(\mathbb{R}^2)$ and the attractor $A_t$ of an IFS.

THEOREM 10.1.– We consider the complete metric space $(\mathbb{R}^2, d)$. Let $A$ be a point of $H(\mathbb{R}^2)$ and $\{\mathbb{R}^2; \omega_1, \omega_2, \ldots, \omega_N\}$ an IFS with a real contraction factor $0 \le s < 1$. The following relation holds:

$$h_d(A, A_t) \le \frac{1}{1 - s}\, h_d\Bigl(A, \bigcup_{i=1}^{N} \omega_i(A)\Bigr) \qquad (10.9)$$
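The bound (10.9) follows in a few lines from the triangle inequality and the contractivity of $W$. The derivation below is a standard argument, given here as a reading aid rather than as the proof of [BARN 93]:

```latex
\begin{align*}
h_d(A, A_t) &\le h_d\bigl(A, W(A)\bigr) + h_d\bigl(W(A), A_t\bigr)
  && \text{(triangle inequality)} \\
&= h_d\bigl(A, W(A)\bigr) + h_d\bigl(W(A), W(A_t)\bigr)
  && \text{(since } A_t = W(A_t)\text{)} \\
&\le h_d\bigl(A, W(A)\bigr) + s\, h_d(A, A_t)
  && \text{($W$ is contracting with factor } s\text{)}
\end{align*}
% Moving the last term to the left-hand side and dividing by 1 - s > 0:
\[
h_d(A, A_t) \le \frac{1}{1 - s}\, h_d\bigl(A, W(A)\bigr),
\qquad W(A) = \bigcup_{i=1}^{N} \omega_i(A).
\]
```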

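The theorem shows that if it is possible to transform an object $A$ so as to satisfy the relation $A \approx W(A)$ while ensuring that $W$ is contracting, then the fixed point $A_t$ of the operator $W$ is close to $A$. In this case, the operator $W$, defined in section 10.2.2, fully characterizes the approximation¹ $A_t$ of object $A$ and codes the latter exactly if $A = W(A)$ [BARN 86].

EXAMPLE 10.2.– Let us consider the application of four contracting transformations to a square noted $A$. If the resulting four subsquares cover the initial square $A$ exactly, the collage theorem applies with $h_d(A, W(A)) = 0$. The attractor of the IFS is therefore a square identical to the square $A$. Figure 10.2 illustrates the attractor coded by the IFS $\{\mathbb{R}^2; \omega_i, i = 1, \ldots, 4\}$ composed of the following affine transformations:

$$\begin{aligned}
\omega_1\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -\frac{x_0}{2} \\ \frac{y_0}{2} \end{pmatrix} \\
\omega_2\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} \frac{x_0}{2} \\ \frac{y_0}{2} \end{pmatrix} \\
\omega_3\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -\frac{x_0}{2} \\ -\frac{y_0}{2} \end{pmatrix} \\
\omega_4\begin{pmatrix} x \\ y \end{pmatrix} &= \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} \frac{x_0}{2} \\ -\frac{y_0}{2} \end{pmatrix}
\end{aligned} \qquad (10.10)$$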

The iterative process initialized on a circle converges towards the square $A$. The same result would be obtained by initializing the process on any other shape.

1. The more self-similar the object A, the more effective the coding.

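[Figure 10.2: diagram in the $(x, y)$ plane showing the square with corners $(x_0, y_0)$ and $(-x_0, -y_0)$, centered at $(0, 0)$, and its four subsquares labeled $\omega_1$, $\omega_2$, $\omega_3$ and $\omega_4$]

Figure 10.2. Attractor of the iterated transformation system of Example 10.2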

10.2.4. Finally contracting transformation

Let us consider a Lipschitzian transformation $\omega$. If there is an integer $n$ such that the transformation $\omega^{\circ n}$ is contracting, then $\omega$ is termed finally contracting. The integer $n$ is called a contraction exponent. The operator $W$, defined by equation (10.6), can be finally contracting even if a limited number of transformations $\omega_i$ are not contracting. In this case the operator $W$ is not contracting, but it can become so at iteration $n$, since products of transformations $\omega_i$ appear progressively as the transformations are composed with one another².

Generalized collage theorem

Let us consider the finally contracting operator $W$ with integer contraction exponent $n$. Then there is a single fixed point $x_f \in \mathbb{R}^2$ such that:

$$x_f = W(x_f) = \lim_{k \to \infty} W^{\circ k}(x) \quad \forall x \in \mathbb{R}^2$$

In this case:

$$h_d(A, A_t) \le \frac{1}{1 - s}\, \frac{1 - \sigma^n}{1 - \sigma}\, h_d\bigl(A, W(A)\bigr) \qquad (10.11)$$

where $\sigma$ is the Lipschitz factor of $W$ and $s$ the contraction factor of $W^{\circ n}$ [FIS 91, LUN 92].
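The factor $(1 - \sigma^n)/(1 - \sigma)$ in (10.11) can be understood by applying Theorem 10.1 to the contracting operator $W^{\circ n}$ and then bounding $h_d(A, W^{\circ n}(A))$ with a telescoping sum. The sketch below assumes $\sigma \neq 1$ and uses the fact that the fixed point of $W$ is also the fixed point of $W^{\circ n}$; it is a reading aid, not the proof given in [FIS 91, LUN 92]:

```latex
\begin{align*}
h_d(A, A_t) &\le \frac{1}{1 - s}\, h_d\bigl(A, W^{\circ n}(A)\bigr)
  && \text{(Theorem 10.1 applied to } W^{\circ n}\text{)} \\
h_d\bigl(A, W^{\circ n}(A)\bigr)
  &\le \sum_{k=0}^{n-1} h_d\bigl(W^{\circ k}(A), W^{\circ (k+1)}(A)\bigr)
  && \text{(triangle inequality)} \\
  &\le \sum_{k=0}^{n-1} \sigma^{k}\, h_d\bigl(A, W(A)\bigr)
  && \text{($W$ is Lipschitzian with factor } \sigma\text{)} \\
  &= \frac{1 - \sigma^{n}}{1 - \sigma}\, h_d\bigl(A, W(A)\bigr)
  && \text{(geometric sum)}
\end{align*}
```

Combining the first line with the last one gives (10.11).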

2. For example, $W^{\circ 2} = \omega_1 \circ \omega_1 \cup \omega_1 \circ \omega_2 \cup \omega_2 \circ \omega_1 \cup \omega_2 \circ \omega_2$.


10.2.5. Attractor and invariant measures

This section merely introduces the main concepts for generating gray-level fractal objects. For more information see Chapter 9 in [BARN 88]. Let $p_i$ be a probability associated with each of the $N$ transformations $\omega_i$ of an IFS:

$$\forall i,\ p_i \ge 0 \quad \text{and} \quad \sum_{i=1}^{N} p_i = 1$$

A gray-level fractal object induces a measure³ $\mu$ on its support, which is associated with a Markov operator $M$ of the following form:

$$\mu_n(B) = M\mu_{n-1}(B) = \sum_{i=1}^{N} p_i\, \mu_{n-1}\bigl(\omega_i^{-1}(B)\bigr) \qquad (10.12)$$

In this expression, $B$ is a Borel subset of $\mathbb{R}^2$ and $\mu_n(B)$ the probability of $B$ at iteration $n$. It is shown that such an operator $M$ is contracting [BARN 88] (with respect to the Hutchinson metric on the space of measures) and thus there exists a single measure $\mu$, called the invariant measure of the IFS, that satisfies:

$$M\mu = \mu = \lim_{k \to \infty} M^{\circ k}(\mu_0), \quad \forall \mu_0 \qquad (10.13)$$

Moreover, the support of the invariant measure $\mu$ is the IFS attractor. Let us now consider the practical case in which the fractal object is a digital image. The normalized value of a pixel $B$ of the image corresponds to the probability of the Borel subset $B$ of $\mathbb{R}^2$. According to (10.12), the value of a pixel $B$ of the image $\mu_n$ is equal to the sum of the values of pixels $\omega_i^{-1}(B)$ in $\mu_{n-1}$, multiplied by the probabilities $p_i$. The invariant measure associated with the IFS attractor can also be obtained by iterating the three following operations a great number of times, initialized on an arbitrary point $x_0$ of $\mathbb{R}^2$:
– choose a transformation $\omega_i$ with the probability $p_i$;
– calculate $x_1 = \omega_i(x_0)$;
– replace $x_0$ by $x_1$.

3. Recall that a measure is, in the physical sense of the term, a measurable quantity (e.g. the light intensity) that allows us to associate weights with the different points of its support (the support of the measure is the set of the points on which it is defined).


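[Figure 10.3: diagram showing the three transformations $\omega_1$, $\omega_2$ and $\omega_3$ labeled with their probabilities $p_1$, $p_2$ and $p_3$]

Figure 10.3. Calculation of the measure $\mu_n$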
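The three-step random iteration described above can be sketched as follows. This is a minimal illustration, assuming the three maps of Example 10.1 with $x_0 = y_0 = 1$, a hypothetical choice of probabilities, a fixed random seed and a short burn-in; the visit frequencies on a pixel grid give a rough picture of the invariant measure.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Translations of the maps (x, y) -> (x/2 + tx, y/2 + ty) of Example 10.1
# with x0 = y0 = 1 (assumed values for this sketch).
TRANSLATIONS = [(0.0, 0.5), (-0.5, -0.5), (0.5, -0.5)]


def chaos_game(probabilities: Sequence[float], n_points: int,
               seed: int = 0, burn_in: int = 20) -> List[Point]:
    """Random iteration: pick a map with probability p_i, apply it, repeat."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    points: List[Point] = []
    for step in range(n_points + burn_in):
        tx, ty = rng.choices(TRANSLATIONS, weights=probabilities, k=1)[0]
        x, y = 0.5 * x + tx, 0.5 * y + ty
        if step >= burn_in:  # discard the first points, still far from the attractor
            points.append((x, y))
    return points


def density(points: Sequence[Point], size: int) -> List[List[float]]:
    """Visit frequency of each pixel of a size x size grid covering [-1, 1]^2."""
    counts = [[0] * size for _ in range(size)]
    for x, y in points:
        col = min(int((x + 1.0) / 2.0 * size), size - 1)
        row = min(int((y + 1.0) / 2.0 * size), size - 1)
        counts[row][col] += 1
    total = len(points)
    return [[c / total for c in row] for row in counts]


points = chaos_game([0.5, 0.25, 0.25], 20000)
mu = density(points, 64)
top_fraction = sum(1 for _, y in points if y > 0) / len(points)
```

With these probabilities, about half of the mass falls in the upper sub-triangle (the image of $\omega_1$), which illustrates how the $p_i$ shape the measure without changing its support.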

When the number of iterations is sufficiently high, the points are distributed over a compact subset of $\mathbb{R}^2$ defining the attractor of the IFS. The density (frequency of visits) of each pixel of the attractor defines the invariant measure $\mu$ (see [ELT 87] and the “chaos game” in [BARN 88]), whose form is controlled by the set of predefined probabilities $p_i$.

10.2.6. Inverse problem

The inverse problem can be stated as follows: given an object $A$ belonging to $H(\mathbb{R}^2)$ and a measure $\mu$ on $A$, how can we find the IFS and the set of probabilities $p_i$ for which $A$ is the attractor and $\mu$ the invariant measure? Various works have attempted to solve this constrained optimization problem, the difficulty of which lies in its large dimensionality and in the irregularity of the function to be minimized. The proposed solutions use different techniques based on genetic algorithms [LUT 93], wavelets [RIN 94] and other approaches [BARN 86, CAB 92, KRO 92, LEV 90, MAN 89, VRS 90].

10.3. Application to natural image processing: image coding

10.3.1. Introduction

The goal of this section is to introduce the basic methods that make it possible to associate a natural image with a finally contracting transformation $W$ whose attractor is an approximation of the image itself. If this inverse problem is solved, we can then talk about image compression, since storing the coefficients of $W$ requires “less information” than storing the original image. This is lossy compression, since the attractor is only an approximation of the original image. The compression of an image by fractals relies on a transformation called the fractal transformation, which consists of transforming the image by a finally contracting operator, so that its visual aspect remains almost unchanged. To this end, the image transformation is made up of $N$ elementary sub-transformations, each one operating on a block of the image, in the following way (see Figure 10.4). The image is partitioned into $N$ blocks $r_n$ called destination blocks:

$$A = \bigcup_{n=1}^{N} r_n \qquad (10.14)$$

[Figure 10.4: diagram showing a source block $d_{\alpha(n)}$ mapped by $\omega_n$ onto a destination block $r_n$]

Figure 10.4. Destination blocks $r_n$ and source blocks $d_{\alpha(n)}$. The source block transformed by $\omega_n$ must resemble the smaller size destination block. The set of destination blocks forms a partition of the image

We call $R$ the partition of the image support into destination blocks. Each destination block is then matched with a transformed block $\omega_n(d_{\alpha(n)})$ that “resembles” it with respect to a gray-level based error measure. The block $d_{\alpha(n)}$, called a source block, is sought in a library made up of $Q$ blocks belonging to the image: $\alpha(n)$ is thus a mapping from $\{1, \ldots, N\}$ to $\{1, \ldots, Q\}$. The $Q$ blocks do not necessarily form a partition of the image but are representative of the entire image.


The transformation of image $A$ by $W$ is formulated using the following equation:

$$W(A) = \bigcup_{n=1}^{N} \omega_n\bigl(d_{\alpha(n)}\bigr) = \bigcup_{n=1}^{N} \hat{r}_n \qquad (10.15)$$

where $\hat{r}_n$ is the approximation of the destination block $r_n$, obtained by transforming the source block $d_{\alpha(n)}$ by $\omega_n$ (the mapping between block $d_{\alpha(n)}$ and block $\hat{r}_n$ is called a “collage” operation). The calculation of the parameters of the transformations $\omega_n$ and of the positions of the blocks $d_{\alpha(n)}$ is detailed in the following section. It should be noted that the problem described here differs from the inverse problem introduced in section 10.2.6, since the spatial transformations considered do not apply to the whole image directly but to subparts of it, as the image is not fractal. Moreover, no probability $p_i$ is assigned to the transformations $\omega_i$ defining $W$. We will see that, among the proposed methods, the Dudbridge method is the one which comes closest to it.

10.3.2. Coding of natural images by fractals

Jacquin [JACQ 92] proposed in 1989 an approach based on a regular partition $R$ with square geometry. The image is partitioned into square destination blocks⁴ of fixed size equal to $B^2$ pixels ($B = 8$). The algorithm seeks, for each destination block $r_n$, the source block $d_{\alpha(n)}$ of size $D^2$ ($D = 2B$) that minimizes the error $d(r_n, \hat{r}_n)$, where $\hat{r}_n$ is the approximation of $r_n$ calculated from the source block $d_{\alpha(n)}$. The error measure $d$ is given by:

$$d(r_n, \hat{r}_n) = \sum_{j=1}^{B^2} \bigl(r_n^j - \hat{r}_n^j\bigr)^2 \qquad (10.16)$$

where $r_n^j$ and $\hat{r}_n^j$ are the values of the pixel of index $j$ inside the original block $r_n$ and the collage block $\hat{r}_n$, respectively. The collage operation, called parent collage, is detailed in the following section.

10.3.2.1. Collage of a source block onto a destination block

The collage operation of a source block $d_{\alpha(n)}$ onto a destination block $r_n$, realized by the transformation $\omega_n$, decomposes into two parts:
– a spatial transformation deforms the support of block $d_{\alpha(n)}$;
– a “mass” transformation acts on the pixel luminance of the deformed block $d_{\alpha(n)}$.

4. In the formulation of Jacquin, these blocks are called parent blocks.


These two points are further detailed in this section.

The spatial transformation shrinks the source block $d_{\alpha(n)}$ of size $D^2$ to the scale of the destination block $r_n$ of size $B^2$ and superimposes it on that block. The block thus transformed, noted $b_2^{(n)}$, is obtained by decimating the pixels of the source block: a pixel of coordinates $(x_i, y_j)$ in $b_2^{(n)}$ is given by the following equation:

$$b_2^{(n)}(x_i, y_j) = \frac{1}{4}\Big[ d_{\alpha(n)}(x_k, y_l) + d_{\alpha(n)}(x_k, y_{l+1}) + d_{\alpha(n)}(x_{k+1}, y_l) + d_{\alpha(n)}(x_{k+1}, y_{l+1}) \Big] \tag{10.17}$$

where $(x_k, y_l)$ are the coordinates of a pixel of intensity $d_{\alpha(n)}(x_k, y_l)$ belonging to the block $d_{\alpha(n)}$.
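As an illustration of equation (10.17), the following minimal sketch averages each 2 × 2 group of pixels of a source block. It assumes $D = 2B$, so that pixel $(i, j)$ of the decimated block collects the source pixels with indices $(2i, 2j)$ to $(2i+1, 2j+1)$; the function name `decimate` and the sample values are purely illustrative.

```python
def decimate(source):
    """Average each 2x2 group of pixels of a square source block (cf. eq. 10.17)."""
    size = len(source)
    if size % 2 != 0 or any(len(row) != size for row in source):
        raise ValueError("source block must be square with an even side length")
    half = size // 2
    return [
        [
            (source[2 * i][2 * j] + source[2 * i][2 * j + 1]
             + source[2 * i + 1][2 * j] + source[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(half)
        ]
        for i in range(half)
    ]


# Illustrative 4x4 source block (D = 4) reduced to a 2x2 block (B = 2).
d = [[0, 4, 8, 8],
     [4, 8, 8, 8],
     [1, 1, 2, 2],
     [1, 1, 2, 6]]
b2 = decimate(d)  # [[4.0, 8.0], [1.0, 3.0]]
```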

The mass transformation acts on block $b_2^{(n)}$ to approximate the destination block $r_n$. The complexity of this transformation depends on the nature of the block $r_n$ under consideration. Jacquin therefore proposes classifying the square blocks using the method developed by Ramamurthi and Gersho [RAM 86]: all blocks of an image are grouped into three classes, namely homogeneous blocks, textured blocks and blocks with contours (simple and divided). Depending on the class to which the destination block $r_n$ belongs, a more or less complex mass transformation is associated with it. This transformation depends on the decimated block $b_2^{(n)}$ and/or on a constant block $b_1^{(n)}$ formed of pixels all equal to one. The block $b_2^{(n)}$ is associated with a scale coefficient denoted $\beta_2^{(n)}$ and the block $b_1^{(n)}$ with a shift coefficient denoted $\beta_1^{(n)}$. The choice of transformation type follows this procedure:

– if the block $r_n$ is homogeneous: absorption of the gray-levels of $r_n$. No search for source blocks $d_{\alpha(n)}$ is carried out. The transformation of $r_n$, coded with $I_s$ bits, reads:

$$\hat{r}_n = \beta_1^{(n)} b_1^{(n)}$$

where the integer $\beta_1^{(n)}$ lies between 0 and 255;

– if the block $r_n$ is textured: search for the source block $d_{\alpha(n)}$, then perform contrast change and apply shifts. The transformation of $d_{\alpha(n)}$, coded with $I_m$ bits, reads:

$$\hat{r}_n = \beta_2^{(n)} b_2^{(n)} + \beta_1^{(n)} b_1^{(n)}$$

where $\beta_2^{(n)}$ belongs to the set $\{0.7, 0.8, 0.9, 1.0\}$ and the integer $\beta_1^{(n)}$ lies between −255 and 255;

– if the block $r_n$ contains contours: search for the source block $d_{\alpha(n)}$, then perform contrast change, apply shifts and discrete isometries $\imath_n$ (rotations of 0, +90, −90 and +180 degrees, reflections along the vertical and horizontal symmetry axes, and reflections along the two diagonal axes). The transformation of $d_{\alpha(n)}$, coded with $I_e$ bits, reads:

$$\hat{r}_n = \imath_n\big(\beta_2^{(n)} b_2^{(n)}\big) + \beta_1^{(n)} b_1^{(n)}$$

where $\beta_2^{(n)}$ belongs to the set $\{0.5, 0.6, 0.7, 0.8, 0.9, 1.0\}$ and the integer $\beta_1^{(n)}$ lies between −255 and 255.
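The eight discrete isometries listed above (four rotations and four reflections) can be enumerated as follows. This is a minimal sketch operating on a square block stored as a list of rows; the function names and the ordering of the isometries are illustrative choices, not a convention taken from Jacquin.

```python
def rotate_90(block):
    """Rotate a square block by 90 degrees clockwise."""
    n = len(block)
    return [[block[n - 1 - j][i] for j in range(n)] for i in range(n)]


def mirror(block):
    """Reflect a block about its vertical symmetry axis (left-right flip)."""
    return [row[::-1] for row in block]


def isometries(block):
    """Return the eight isometries of a square block.

    The four rotations (0, 90, 180, 270 degrees) are followed by their
    mirrored versions, which give the reflections about the vertical axis,
    the horizontal axis and the two diagonals.
    """
    rotations = [[row[:] for row in block]]
    for _ in range(3):
        rotations.append(rotate_90(rotations[-1]))
    return rotations + [mirror(r) for r in rotations]


block = [[1, 2],
         [3, 4]]
candidates = isometries(block)
```

For a block with distinct pixel values the eight results are all different, which is why each source block contributes eight candidates when isometries are considered during the search.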

When the destination block is textured or overlaps with contours, the scale coefficient $\beta_2^{(n)}$ is calculated so that the standard deviations of the scaled block $\beta_2^{(n)} b_2^{(n)}$ and of $r_n$ are equal. It is then rounded to a coefficient belonging to a set of predefined values, all real positive and less than one. The shift coefficient $\beta_1^{(n)}$ is calculated so that the pixel averages of the transformed block and of $r_n$ are equal. It is not quantized.

The exhaustive search of the source block $d_{\alpha(n)}$ is carried out by shifting a square block over the image support with a step of $\delta_h = \delta_v = 4$ pixels in the horizontal and vertical directions. When two blocks are compared, each of the eight discrete isometries is considered. For an image of size 256 × 256 (respectively 512 × 512), such a search is thus carried out through a library made up of 29,768 (respectively 125,000) source blocks.

10.3.2.2. Hierarchical partitioning

In a second step, Jacquin proposes the division of the collage parent blocks $\hat{r}_n$ into four destination sub-blocks of size 4 × 4 pixels called child blocks (see Figure 10.5). The obtained blocks are compared with their equivalents in the original image, through the error measure given by formula (10.16), with $B = 4$. If the error is higher than a given threshold, they are coded separately by seeking the best source block of size 8 × 8 available in the image. The collage process is, in this case, called a child collage.

[Figure 10.5 panels: 1 parent, no child (1 configuration); 1 parent, 1 child (4 configurations); 1 parent, 2 children (6 configurations); no parent, 4 children (1 configuration). Legend: parent collage, child collage.]

Figure 10.5. Partitioning formed by parent and child blocks
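Two counts quoted in this section can be checked by direct enumeration. The sketch below assumes that the source library of section 10.3.2.1 is built from 16 × 16 blocks ($D = 2B$ with $B = 8$) shifted by 4 pixels, and that each of the eight isometries is counted as a separate library entry; under these assumptions it reproduces the figures 29,768 and 125,000. The function name `library_size` is illustrative. The second part counts the ways of choosing which child blocks of a parent are coded separately, as in Figure 10.5.

```python
from math import comb


def library_size(image_side, block_side=16, step=4, n_isometries=8):
    """Number of candidate source blocks for an exhaustive search.

    Assumes square source blocks of side `block_side` shifted by `step`
    pixels in both directions, each tested under `n_isometries` isometries.
    """
    positions_per_axis = (image_side - block_side) // step + 1
    return positions_per_axis ** 2 * n_isometries


sizes = (library_size(256), library_size(512))  # (29768, 125000)

# Number of ways to pick 0, 1 or 2 child blocks among the 4 children of a parent.
configurations = [comb(4, k) for k in (0, 1, 2)]  # [1, 4, 6]
```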


If, for a parent block, three or four child collages are necessary, only the four child collages are coded. If one or two child collages are necessary, the parent block is coded by parent collage complemented with child collages.

10.3.2.3. Coding of the collage operation on a destination block

The storing of the collage of a source block (parent or child) $d_{\alpha(n)}$ on a destination block (parent or child) $r_n$ includes:

– the index of the source block $d_{\alpha(n)}$ retained among the $Q$ blocks of the library, provided that those are arranged in a block list and that their organization on the image support is known. Otherwise, it is necessary to store the coordinates $(x_k, y_l)$ of a reference pixel in block $d_{\alpha(n)}$ (for example, the upper left corner in the case of a square block);

– the isometry used during collage (one among eight);

– the coefficients $\beta_1$ and $\beta_2$ of the mass transformation.

This information is associated with each of the $N$ destination blocks of the partition $R$. It is coded on a variable number of bits since it is not always necessary to store all three components: this depends on the mass transformation that is used.

10.3.2.4. Contraction control of the fractal transformation

The contraction control of the fractal transformation is a difficult problem. Jacquin shows that the contraction factor of a mass transformation depends on the scale factor $\beta_2^{(n)}$. The contraction factor of the other mass transformations (shift, absorption) is equal to 1. The author thus ensures the contraction of the fractal transformation by imposing the condition $(\beta_2^{(n)})^2 < 1$ for every $n$ in $[1, \ldots, N]$. This constraint is too stringent and limits the quality of the collage. A more detailed study of the contraction control of the fractal transformation will be presented in section 10.3.3.2 and we shall see in section 10.3.3.3 that it is possible to slightly relax the constraint while preserving the final contraction of the fractal transformation.

10.3.3. Algebraic formulation of the fractal transformation

Immediately following the work by Jacquin, Lundheim proposed an algebraic formulation aimed at facilitating the comprehension of various theoretical and practical problems raised by the extension of IFS theory to the coding of natural images [LUN 92]. His formulation, applied to 1D signals, was then extended to the case of 2D signals by Øien [ØIE 93] and Lepsøy [LEP 93].

The formulation that is provided here is used in the simple case of a fractal transformation operating on source and destination blocks of fixed size and simple geometry (square, rectangular or triangular). The source blocks do not overlap. A block is seen as a vector by supposing that the pixels which make it up are connected inside the original image.

Let us consider an image as a column vector $x$ made up of $M^2$ pixels. The fractal transformation $T$ of the image $x$, composed of a linear term $L$ and of a translation vector $t$, takes on the following form:

$$T x = L x + t \tag{10.18}$$

By specifying this transformation at the level of each destination block of the partition $R$, equation (10.18) can be written as follows:

$$T x = \Big( \sum_{n=1}^{N} L_n \Big) x + \sum_{n=1}^{N} t_n \tag{10.19}$$

The elementary transformations associated with each matrix $L_n$ and each vector $t_n$ are detailed below.

The transformation associated with the elementary matrix $L_n : \mathbb{R}^{M^2} \to \mathbb{R}^{M^2}$ operates on the source vector $d_{\alpha(n)}$ to attach it to the destination vector $r_n$. The source and destination vectors belong respectively to $\mathbb{R}^{D^2}$ and $\mathbb{R}^{B^2}$, where $D > B$. The matrix $L_n$ reads:

$$L_n = \beta_2^{(n)} P_n \underbrace{D F_{\alpha(n)}}_{b_2^{(n)}} \tag{10.20}$$

where:

1) $F_{\alpha(n)} : \mathbb{R}^{M^2} \to \mathbb{R}^{D^2}$ selects a source block $d_{\alpha(n)}$ of size $D^2$ pixels of the image. The isometries can be applied inside the block;

2) $D : \mathbb{R}^{D^2} \to \mathbb{R}^{B^2}$ brings the selected source block back to the size of a destination block by sub-sampling or by averaging pixels. The block thus obtained is called block $b_2^{(n)}$. It should be thought of as the decimated source block described by Jacquin;

3) $P_n : \mathbb{R}^{B^2} \to \mathbb{R}^{M^2}$ positions the decimated source block $b_2^{(n)}$ on the destination block $r_n$ and cancels the other pixels of the image.

A column vector $L_n x$ is primarily composed of zeros, except for the part corresponding to the considered destination block of index $n$. The matrix $L$ given by formula (10.21) is thus composed of sub-matrices associated with each block $r_n$ of the image and of zeros elsewhere:

$$L = \begin{bmatrix} \cdots & \beta_2^{(1)} D & \cdots \\ \cdots & \cdots & \beta_2^{(n)} D \\ \beta_2^{(N)} D & \cdots & \cdots \end{bmatrix} \tag{10.21}$$

The vertical position of a sub-matrix $\beta_2^{(n)} D$ in the matrix $L$ corresponds to the index $n$ of the destination block considered. The horizontal position corresponds to the position of the source block which is associated with it.

The elementary translation vector $t_n$ reads:

$$t_n = \beta_1^{(n)} P_n b_1^{(n)} \tag{10.22}$$

where $\beta_1^{(n)}$ is a real coefficient. The constant block $b_1^{(n)}$ of size $B^2$ pixels, not stemming from image $x$, is composed of pixels all equal to one ($b_1^{(n)} = [1, 1, \ldots, 1]^T$). The vector $t$ operating on the whole image is written as follows:

$$t = \Big[ \beta_1^{(1)}, \beta_1^{(1)}, \ldots, \beta_1^{(1)}, \beta_1^{(2)}, \beta_1^{(2)}, \ldots, \beta_1^{(2)}, \ldots, \beta_1^{(N)}, \beta_1^{(N)}, \ldots, \beta_1^{(N)} \Big]^T \tag{10.23}$$

In short, the fractal transformation $T$ of image $x$ reads as follows:

$$T x = \sum_{n=1}^{N} \beta_2^{(n)} P_n\, D F_{\alpha(n)}\, x + \sum_{n=1}^{N} \beta_1^{(n)} P_n\, b_1^{(n)} \tag{10.24}$$

where $D F_{\alpha(n)} x = b_2^{(n)}$ is the decimated source block.
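The action of $T$ in equation (10.24) can be sketched on a very small image. The example below is only an illustration under stated assumptions: source blocks have side $2B$ and are decimated by 2 × 2 averaging, no isometry is applied, and the dictionary `code` (destination corner mapped to source corner, $\beta_2$ and $\beta_1$) as well as all numerical values are hypothetical. Iterating $T$ from two different starting images shows that, for this contracting choice of coefficients, both sequences reach the same fixed point.

```python
def decimate(block):
    """D: average each 2x2 group of pixels of a square block."""
    half = len(block) // 2
    return [[(block[2 * i][2 * j] + block[2 * i][2 * j + 1]
              + block[2 * i + 1][2 * j] + block[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(half)] for i in range(half)]


def apply_T(x, code, B):
    """One application of T x = sum_n beta2 P_n D F x + sum_n beta1 P_n b1."""
    M = len(x)
    out = [[0.0] * M for _ in range(M)]
    for (row, col), (src_row, src_col, beta2, beta1) in code.items():
        # F: select the source block of side 2B; D: decimate it to side B.
        source = [r[src_col:src_col + 2 * B] for r in x[src_row:src_row + 2 * B]]
        b2 = decimate(source)
        # P_n: write the collage onto the destination block.
        for i in range(B):
            for j in range(B):
                out[row + i][col + j] = beta2 * b2[i][j] + beta1
    return out


def decode(code, M, B, start_value, iterations=40):
    """Iterate T from a constant image of gray-level `start_value`."""
    x = [[float(start_value)] * M for _ in range(M)]
    for _ in range(iterations):
        x = apply_T(x, code, B)
    return x


# Hypothetical code for a 4x4 image: four 2x2 destination blocks, each
# approximated from the single 4x4 source block located at (0, 0).
code = {(0, 0): (0, 0, 0.5, 10.0), (0, 2): (0, 0, 0.5, 40.0),
        (2, 0): (0, 0, 0.5, 80.0), (2, 2): (0, 0, 0.5, 120.0)}
from_black = decode(code, M=4, B=2, start_value=0)
from_white = decode(code, M=4, B=2, start_value=255)
```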

10.3.3.1. Formulation of the mass transformation

In what follows, and for the sake of clarity, the operator $P_n$ and index $n$ will be omitted in the expressions of the fractal transformation $T$. Thanks to this simplified writing, the approximation of the block $r$, noted $\hat{r}$, is provided by a linear combination of two blocks, one of which is obtained from a block $d$ extracted from the image itself. It reads:

$$\hat{r} = \beta_2 b_2 + \beta_1 b_1 \tag{10.25}$$

The resulting transformation is that first proposed by Jacquin in 1989. It is of course possible to elaborate expression (10.25) so as to improve the approximation of the destination block. In a more general way, the approximation of $r$ is expressed as follows:

$$\hat{r} = \beta_2 b_2 + \beta_1 b_1 + \sum_{i=3}^{K} \beta_i b_i \tag{10.26}$$

The blocks $b_i$ are constant and known by the coder and the decoder. This type of expression is in particular used in [GHA 93, MON 94].

[Figure 10.6 labels: source block $d_{\alpha(n)}$; isometry; decimation; decimated source block $b_2^{(n)}$ weighted by $\beta_2$; constant block $b_1$ weighted by $\beta_1^{(n)}$; destination block $r_n$.]

Figure 10.6. Fractal transformation of the image

[Figure 10.7 labels: blocks $b_1$, $b_2$, $b_3$ and $b_4$, weighted by $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$, combined to approximate block $r$.]

Figure 10.7. Adjunction of fixed vectors for the approximation of a destination block

Vines [VIN 93] proposes another scheme based on square destination blocks and decimated source blocks, of size equal to 8 × 8 pixels. He builds an orthonormal basis made up of the three fixed vectors illustrated in Figure 10.7 and of 61 other vectors obtained from the decimated source vectors of the image. A destination block is then approximated by a linear combination of some of the basis vectors. The number of vectors considered depends on the complexity of the destination block.


10.3.3.2. Contraction control of the fractal transformation

The Lipschitz factor $s$ of the affine operator $T$ (equation (10.24)) is equal to the norm of the matrix $L$ and thus to the square root of the largest eigenvalue of $L^T L$, if the $L^2$ norm is considered. Based on this remark, Lundheim defines sufficient conditions which ensure the contraction of the operator $T$ [LUN 92], by considering that the source blocks do not overlap. If the collages derive from sub-sampling the square source blocks, then $s$ reads:

$$s = \sqrt{\max_{l = 1:Q} \sum_{\alpha(n) = l} \big(\beta_2^{(n)}\big)^2} \tag{10.27}$$

where $Q$ is the number of source blocks used. The sum above encompasses the scale coefficients $\beta_2^{(n)}$ associated with the set of destination blocks $r_n$ which depend on the source block $d_{\alpha(n)}$. If the collages result from averaging the pixels of the source blocks, $s$ reads:

$$s = \frac{B}{D} \sqrt{\max_{l = 1:Q} \sum_{\alpha(n) = l} \big(\beta_2^{(n)}\big)^2} \tag{10.28}$$

[Figure 10.8 labels: a source block $d$ and the destination blocks $r_i$, $r_j$ and $r_k$ that depend on it, with their scale coefficients.]

Figure 10.8. Illustration of equations (10.27) and (10.28): for each source block $d_{\alpha(n)}$, we calculate the sum of the scale coefficients $\beta_2^{(n)}$ associated with the destination blocks $r_n$ which depend on $d_{\alpha(n)}$

Equation (10.28) shows that the Lipschitz factor of the operator $T$ is reduced by a factor $B/D$ when the pixels of the source blocks are averaged. It depends on the scale coefficients $\beta_2^{(n)}$ but also on the size difference of the compared blocks. Hence, the “spatial contraction” of the blocks influences, in this case, the contraction factor of $T$.
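Equations (10.27) and (10.28) are straightforward to evaluate once the assignment $\alpha(n)$ and the scale coefficients are known. The following sketch groups the squared scale coefficients by source block and takes the largest sum; the function name, argument names and sample values are illustrative, and the formulas are only valid under the non-overlapping source block assumption stated above.

```python
from collections import defaultdict
from math import sqrt


def lipschitz_factor(alpha, beta2, B, D, averaging=True):
    """Lipschitz factor of T for non-overlapping source blocks.

    alpha[n] is the index of the source block used by destination block n,
    beta2[n] the corresponding scale coefficient. With `averaging` the
    factor B/D of equation (10.28) is applied; otherwise equation (10.27).
    """
    squared_sums = defaultdict(float)
    for source_index, scale in zip(alpha, beta2):
        squared_sums[source_index] += scale ** 2
    s = sqrt(max(squared_sums.values()))
    return (B / D) * s if averaging else s


# Six destination blocks; four of them share source block 0.
alpha = [0, 0, 0, 0, 1, 2]
beta2 = [0.9, 0.9, 0.9, 0.9, 1.0, 0.7]
s_subsampling = lipschitz_factor(alpha, beta2, B=8, D=16, averaging=False)
s_averaging = lipschitz_factor(alpha, beta2, B=8, D=16, averaging=True)
```

In this example the sub-sampling variant is not guaranteed to be contracting (the factor is about 1.8), whereas the averaging variant is (about 0.9), which illustrates the role of the spatial contraction $B/D$.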


10.3.3.3. Fisher formulation

Fisher [FIS 95a] described the collage operation of a source block onto a destination block using a single formula:

$$\omega_n \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_n & b_n & 0 \\ c_n & d_n & 0 \\ 0 & 0 & \beta_2^{(n)} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} e_n \\ f_n \\ \beta_1^{(n)} \end{pmatrix} \tag{10.29}$$

where $(x, y)$ are the coordinates of a pixel pertaining to the source block $d_{\alpha(n)}$ and $z$ is its gray-level. The symbols $a_n$, $b_n$, $c_n$, $d_n$, $e_n$ and $f_n$ are the coefficients of the affine spatial transformation which brings the pixels of source block $d_{\alpha(n)}$ inside the destination block $r_n$. The quantities $\beta_1^{(n)}$ and $\beta_2^{(n)}$ are the transformation coefficients of the pixels’ gray-level.

Whereas the mass transformation coefficients used by Jacquin are chosen from a predefined set of values, Fisher uses a uniform scalar quantizer. Jacobs et al. showed that for this type of quantizer, the quantization of the translation coefficients and of the scale coefficients, with seven and five bits, respectively, is optimal in terms of visual quality of the reconstructed images [JACO 92].

Optimal coefficients calculation of the mass transformation

For a destination block $r$, the approximation $\hat{r}$ is given by:

$$\hat{r} = \beta_2 b_2 + \beta_1 b_1$$

The calculation of the “mass” transformation coefficients is an optimization problem within a linear subspace $X$ of the vector space $\mathbb{R}^{B^2}$. The purpose is to find, for a destination block $r$ and for a decimated source block $b_2$, the optimal coefficients $\beta_1$ and $\beta_2$ minimizing the least square distance $d$ between $r$ and its collage $\hat{r}$.

THEOREM 10.2 (PROJECTION).– The optimal approximation in the $L^2$ norm of a vector $r$ of $\mathbb{R}^{B^2}$ in the linear subspace $X$ is the vector $\hat{r}$ of $X$ for which the residual vector $r - \hat{r}$ is orthogonal to all vectors spanning the subspace $X$.

Let us consider the mean square error (MSE) between two vectors $x$ and $y$ in the space $\mathbb{R}^{B^2}$, defined by the following expression:

$$\mathrm{Err}(x, y) = \sum_{j=1}^{B^2} (x_j - y_j)^2 \qquad \forall x, y \in \mathbb{R}^{B^2}$$


Determining the two optimal coefficients $\beta_1$ and $\beta_2$ leading to the best approximation $\hat{r}$ of vector $r$ in the basis $(b_1, b_2)$ amounts, by Theorem 10.2, to making the residual orthogonal to both basis vectors:

$$\langle r - \beta_1 b_1 - \beta_2 b_2,\; b_1 \rangle = 0 \qquad \langle r - \beta_1 b_1 - \beta_2 b_2,\; b_2 \rangle = 0$$

The optimal coefficients $\beta_1$ and $\beta_2$ thus calculated are not constrained and the contraction of the collage operator is not guaranteed. Jacobs et al. empirically show that limiting the magnitude of the $\beta_2$ coefficient to 1.5 ensures the eventual convergence of the fractal transformation. Hürtgen proposes a detailed study of the contraction control of the fractal transformation by considering particular cases based on square partitioning [HURT 93a, HURT 94] and on the spectral radius associated with the linear term $L$ introduced by Lundheim (see equation (10.18)).

10.3.4. Experimentation on triangular partitions

It has been shown in section 10.3 that the fractal coding of a natural image consists of approximating each area of the image from other areas of the same image, by means of local transformations. The image is thus partitioned into $N$ blocks $r_n$, forming a partition $R$. Each block is then put in correspondence with another block $d_n$ of the image, from which it is possible to approximate, by an elementary transformation $\omega_n$, the gray-level function of $r_n$. The operator $W$ defining a fractal transformation of the image is composed of the $N$ transformations $\omega_n$. It is eventually contracting provided that a sufficient number of transformations $\omega_n$ are contracting.

Fractal transformation coding requires the coefficients of the $N$ transformations $\omega_n$ to be stored. It is thus all the more efficient as the partition $R$ contains fewer blocks. The first works of Jacquin, mentioned in section 10.3, showed the advantage of using square regular partitions to build the operator $W$. However, they did not reach large compression rates because the partition $R$ entailed too large a number of blocks $r_n$. This issue could be circumvented by calculating the fractal transformation on square or rectangular partitionings [FIS 95a] adapted to the image contents (quadtree, HV).

We shall now present an approach that consists of calculating the fractal transformation of the image on a triangular partitioning, which is flexible and adapted to the contents of the image. Various algorithms [CHAS 93, DAVO 97] allow us to build such a triangulation.

[Figure 10.9 labels: source blocks $D_k$ and $D_{i,j}$ in partition $D$; destination blocks $R_i$, $R_j$ and $R_k$ in partition $R$.]

Figure 10.9. Fractal transformation calculation. The left hand partition (D) encloses the source blocks and the right hand partition (R) contains the destination blocks r

Let us assume that the triangulation $R$ (adapted to the image contents) has $N$ destination blocks $r_i$ and that the (regular) triangulation $D$ contains $Q$ source blocks (see Figure 10.9). The algorithm consists of associating each block $r_i$ with the block $d_j$ that minimizes the error $d$ between the gray-level function of block $r_i$ and that of block $d_j$ transformed by $\omega$.

The mass transformation used to perform the collage of the source block $d_j$ onto the destination block $r_i$ is the same as the one proposed by Jacquin and Fisher. It uses only two coefficients: the shift coefficient $\beta_1^{(i)}$ and the scale coefficient $\beta_2^{(i)}$.

The decoding algorithm amounts to an iteration of operator $W$, after the partitions $R$ and $D$ have been rebuilt, starting from an arbitrary image $f_0$. After $k$ iterations of operator $W$, the gray-level $f_k(x_i, y_i)$ of a pixel in block $r$ reads:

$$f_k(x_i, y_i) = \beta_2\, f_{k-1}\big(v^{-1}(x_i, y_i)\big) + \beta_1 \qquad \forall (x_i, y_i) \in r \tag{10.30}$$

In practice, the result converges towards the reconstructed image, an attractor of the fractal transformation, after five to ten iterations (see Figure 10.10). The number of iterations depends primarily on the surface ratio between the blocks of partition $D$ and those of partition $R$. As the number of iterations increases, the collage of a block $d_{\alpha(n)}$ (covering several blocks $r$) onto its corresponding block $r_n$ reduces the size of the details within blocks of partition $R$.

Figure 10.10. Decoding of Lena image 512 × 512. At iteration 15: $T_c$ = 11.2:1, PSNR = $-10 \log_{10}\big(\mathrm{MSE}/255^2\big)$ = 32.29 dB

10.3.5. Coding and decoding acceleration

10.3.5.1. Coding simplification suppressing the search for similarities

Dudbridge proposed in 1995 [DUD 95b] a fast fractal compression method for images, based on a regular square partitioning. The speed of this compression algorithm is due to the fact that no search for similar blocks elsewhere in the image is made. The image is partitioned into a set of square blocks of fixed size, and each block is coded individually by a fractal transformation. According to the author, the method gives less efficient results than, for example, Jacquin’s method. The reasons for this will be explained at the end of the section.


Coding

An image⁵ is coded using a set of contracting spatial transformations (IFS) $\{\omega_1, \ldots, \omega_N\}$ defined on $\mathbb{R}^2$, associated with a contracting transformation $G$ acting on the pixels’ luminance. At resolution $m$, the square support of the IFS transformed image reads:

$$A = \bigcup_{k=1}^{N} \omega_k(A) = \bigcup_{k_1=1}^{N} \cdots \bigcup_{k_m=1}^{N} \underbrace{\omega_{k_1} \circ \cdots \circ \omega_{k_m}(A)}_{A_{k_1 \ldots k_m}} \tag{10.31}$$

It is noteworthy that the spatial transformation is applied to $A$ and not to a subpart of $A$, as is the case in the traditional coding approach defined by Jacquin. The quantity $p = A_{k_1 \ldots k_m}$ denotes an “element” of the image support at resolution $m$, which may contain several pixels of the original image. At the maximum resolution, the size of element $p$ is equal to that of an image pixel. The set $P_m = \{A_{k_1 \ldots k_m};\ k_1, \ldots, k_m = 1, \ldots, N\}$ contains all the elements of the image at resolution $m$. In the following, we will consider that the IFS is composed of $N = 4$ affine transformations defined by equations (10.10). Under these conditions, equation (10.31) is illustrated in Figure 10.11.

Figure 10.11. The square image $A$ is divided into four square elements by four affine contracting transformations $\omega_{k_1}$ ($k_1 = 1 \ldots 4$). In the center, $A_{k_1} = \omega_{k_1}(A)$ corresponds to one of the four elements of the image at resolution 1. On the right, $A_{k_1 k_2} = \omega_{k_1} \circ \omega_{k_2}(A)$ corresponds to one of the 16 elements of the image at resolution 2

5. Within this section, the term image stands for a square block resulting from the regular partitioning of the original image.

The transformation $G$ abides by the following equation [DUD 95b]:

$$G f(p) = \iint_p \big(a_{k_1} x + b_{k_1} y + t_{k_1}\big)\, dx\, dy + s_{k_1} v(p) \tag{10.32}$$

in which the function $f : P_m \to \mathbb{R}$ gives the gray-level of element $p$ and $v(p)$ is the sum of the gray-levels of the elements included in the block $\omega_{k_1}^{-1}(p)$ (see Figure 10.12):

$$v(p) = \sum_{i=1}^{N} f\big(A_{k_2 \ldots k_m i}\big)$$

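[Figure 10.12 labels: elements $A_{1111}$, $A_{1112}$, $A_{1113}$ and $A_{1114}$; the element $A_{1441}$, its preimage $\omega_1^{-1}(A_{1441})$ and the sum $v(A_{1441})$.]

Figure 10.12. Example for $k_1 = 1$ (upper left quadrant), $m = 4$ and $N = 4$: $v(A_{1441}) = \sum_{i=1}^{4} f(A_{441i})$. In this particular case, the size of element $p = A_{1441}$ corresponds to that of an image pixel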
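The addressing of elements and the sum $v(p)$ can be made concrete with a short sketch. It assumes a 16 × 16 image ($m = 4$, $N = 4$) and a quadrant numbering in which 1 is the upper left quadrant, 2 the upper right, 3 the lower right and 4 the lower left; only the position of quadrant 1 is stated in the text, so the rest of this layout, like the names `QUADRANT`, `pixel_of` and `v`, is an assumption made for illustration.

```python
# Assumed (row, column) offset of each quadrant inside its parent square.
QUADRANT = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}


def pixel_of(address):
    """(row, column) of the pixel A_{k1...km}; k1 selects the coarsest quadrant."""
    row = col = 0
    for k in address:
        d_row, d_col = QUADRANT[k]
        row, col = 2 * row + d_row, 2 * col + d_col
    return row, col


def v(image, address):
    """Sum of f(A_{k2...km i}), i = 1..4, for the element p = A_{k1 k2...km}."""
    tail = tuple(address[1:])
    pixels = (pixel_of(tail + (i,)) for i in QUADRANT)
    return sum(image[r][c] for r, c in pixels)


# Synthetic 16x16 image whose gray-level encodes the pixel position.
image = [[16 * r + c for c in range(16)] for r in range(16)]
position = pixel_of((1, 4, 4, 1))   # pixel addressed by A_1441
total = v(image, (1, 4, 4, 1))      # sum over A_4411 .. A_4414
```

With this layout the pixel $A_{1441}$ lies in the upper left quadrant, while the four pixels summed in $v(A_{1441})$ form the element $A_{441}$ in the lower left quadrant.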

In contrast to the mass transformation of Jacquin, which contains only one scale factor and one shift factor on the gray-levels, equation (10.32) contains two additional coefficients, related to the position $(x, y)$ in the image of the element to be approximated. Dudbridge demonstrates that the transformation $G$ is eventually contracting at all resolutions $m$, with respect to the Euclidean distance, if $\big|\sum_{k_1=1}^{N} s_{k_1}\big|$ is smaller than 1.


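The calculation of the coefficients $a_{k_1}$, $b_{k_1}$, $t_{k_1}$ and $s_{k_1}$ for all $k_1$ of the set $[1 \ldots N]$ is carried out so as to minimize the least squares error at resolution $m$ between the original image $f$ and its $G$-transform. This way, for the collage theorem to be satisfied, it suffices to minimize, for all $k_1$ in the set $[1 \ldots N]$, the following function:

$$\sum_{p \in \omega_{k_1}(P_{m-1})} \left[ \iint_p \big(a_{k_1} x + b_{k_1} y + t_{k_1}\big)\, dx\, dy + s_{k_1} v(p) - f(p) \right]^2 \tag{10.33}$$

The $N$ summations are performed on the sub-block $k_1$ of the original image, noted $\omega_{k_1}(P_{m-1})$ (recall that this is one of the four quadrants of the original image). The sum $v(p)$ depends on the resolution $m$ of approximation of the luminance of element $p$. The minimization of function (10.33) amounts to solving the following system of equations [DUD 95b]:

$$\begin{bmatrix}
\sum_p \big(\int_p x\big)^2 & \sum_p \int_p x \int_p y & \sum_p \int_p x \int_p 1 & \sum_p v(p) \int_p x \\
\sum_p \int_p x \int_p y & \sum_p \big(\int_p y\big)^2 & \sum_p \int_p y \int_p 1 & \sum_p v(p) \int_p y \\
\sum_p \int_p x \int_p 1 & \sum_p \int_p y \int_p 1 & \sum_p \big(\int_p 1\big)^2 & \sum_p v(p) \int_p 1 \\
\sum_p v(p) \int_p x & \sum_p v(p) \int_p y & \sum_p v(p) \int_p 1 & \sum_p v(p)^2
\end{bmatrix}
\begin{bmatrix} a_{k_1} \\ b_{k_1} \\ t_{k_1} \\ s_{k_1} \end{bmatrix}
=
\begin{bmatrix} \sum_p f(p) \int_p x \\ \sum_p f(p) \int_p y \\ \sum_p f(p) \int_p 1 \\ \sum_p f(p)\, v(p) \end{bmatrix}$$

where $\int_p x$ abbreviates $\iint_p x\, dx\, dy$ (and similarly for $y$ and 1) and each sum runs over the elements $p$ of $\omega_{k_1}(P_{m-1})$.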
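The system above is the set of normal equations of a four-parameter least squares fit. The sketch below builds and solves it for one quadrant, given for each element $p$ the four quantities $\int_p x$, $\int_p y$, $\int_p 1$ and $v(p)$ together with the target $f(p)$. It is a minimal illustration rather than Dudbridge's implementation: the function names, the plain Gaussian elimination and the synthetic data are all assumptions.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x


def fit_coefficients(features, targets):
    """Return (a, b, t, s) minimizing the squared error of function (10.33).

    Each row of `features` holds (int_p x, int_p y, int_p 1, v(p)) for one
    element p; `targets` holds the corresponding gray-levels f(p).
    """
    size = 4
    normal = [[sum(row[i] * row[j] for row in features) for j in range(size)]
              for i in range(size)]
    rhs = [sum(row[i] * f for row, f in zip(features, targets)) for i in range(size)]
    return solve(normal, rhs)


# Synthetic check: targets generated from known coefficients are recovered.
features = [(0.5, 0.5, 1.0, 10.0), (1.5, 0.5, 1.0, 12.0), (0.5, 1.5, 1.0, 7.0),
            (1.5, 1.5, 1.0, 20.0), (2.5, 0.5, 1.0, 3.0)]
true_coefficients = (2.0, -1.0, 5.0, 0.25)
targets = [sum(c * value for c, value in zip(true_coefficients, row)) for row in features]
estimate = fit_coefficients(features, targets)
```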

An image (a square block of the original image partition) is then coded by a sequence of 4 × 4 real coefficients.

Decoding

The decoding algorithm allows for a fast and non-iterative reconstruction of the invariant function $g$ associated with the operator $G$. It is only necessary to know the coefficients $a_{k_1}$, $b_{k_1}$, $t_{k_1}$ and $s_{k_1}$ associated with each of the $N$ spatial transformations $\omega_{k_1}$.


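Dudbridge demonstrates that the gray-level sum, noted $g_{k_1}$, of the sub-elements included in the element $A_{k_1}$ can be decomposed as follows [DUD 95b, MON 95a]:

$$g_{k_1} = a_{k_1} \iint_{A_{k_1}} x\, dx\, dy + b_{k_1} \iint_{A_{k_1}} y\, dx\, dy + t_{k_1} \iint_{A_{k_1}} 1\, dx\, dy + s_{k_1} \sum_{k=1}^{N} g_k \tag{10.34}$$

and that, consequently, the sum of the gray-levels of the $N$ elements $A_{k_1}$ reads:

$$\sum_{k=1}^{N} g_k = \frac{\displaystyle\sum_{k=1}^{N} \Big( a_k \int_{A_k} x + b_k \int_{A_k} y + t_k \int_{A_k} 1 \Big)}{\displaystyle 1 - \sum_{k=1}^{N} s_k} \tag{10.35}$$

Similarly, $g_{k_1 k_2}$ stands for the gray-level sum of the sub-elements of $A_{k_1 k_2}$. Applying equation (10.32) to $p = A_{k_1 k_2}$, it reads:

$$g_{k_1 k_2} = a_{k_1} \iint_{A_{k_1 k_2}} x\, dx\, dy + b_{k_1} \iint_{A_{k_1 k_2}} y\, dx\, dy + t_{k_1} \iint_{A_{k_1 k_2}} dx\, dy + s_{k_1} \sum_{k=1}^{N} g_{k_2 k} \tag{10.36}$$

with $\sum_{k=1}^{N} g_{k_2 k} = g_{k_2}$.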
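The recursion defined by equations (10.34) to (10.36) can be sketched as follows. The example assumes that the image support is the unit square, that quadrant 1 is the upper left one with the others numbered clockwise, and that the coefficient of an element is selected by the first index $k_1$ of its address, as in equation (10.32); the coefficient values, the names `QUADRANT`, `moments` and `decode` are hypothetical.

```python
from itertools import product

# Assumed (x, y) offset of each quadrant inside its parent square (y pointing down).
QUADRANT = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}


def moments(address):
    """Integrals of x, y and 1 over the square element A_address of the unit square."""
    x0 = y0 = 0.0
    h = 1.0
    for k in address:          # k1 selects the coarsest quadrant
        h /= 2.0
        dx, dy = QUADRANT[k]
        x0 += dx * h
        y0 += dy * h
    area = h * h
    return area * (x0 + h / 2.0), area * (y0 + h / 2.0), area


def decode(coeffs, depth):
    """Return one dictionary {address: gray-level sum} per resolution 0..depth.

    coeffs[k] = (a_k, b_k, t_k, s_k). Resolution 0 holds the total sum of
    equation (10.35); deeper levels follow equations (10.34) and (10.36).
    """
    def affine_part(k, address):
        ix, iy, area = moments(address)
        a, b, t, _ = coeffs[k]
        return a * ix + b * iy + t * area

    scale_sum = sum(c[3] for c in coeffs.values())
    total = sum(affine_part(k, (k,)) for k in coeffs) / (1.0 - scale_sum)
    levels = [{(): total}]
    for m in range(1, depth + 1):
        previous = levels[-1]
        levels.append({
            address: affine_part(address[0], address)
            + coeffs[address[0]][3] * previous[address[1:]]
            for address in product(coeffs, repeat=m)
        })
    return levels


# Hypothetical coefficients; the scale factors sum to 0.5, below 1 in magnitude.
coeffs = {1: (1.0, 2.0, 10.0, 0.2), 2: (-1.0, 0.5, 20.0, 0.1),
          3: (0.0, 1.0, 30.0, -0.1), 4: (2.0, 0.0, 5.0, 0.3)}
levels = decode(coeffs, depth=3)
```

Each level is obtained from the previous one in a single pass, and the sum of the four sub-elements of any element equals the value of that element at the coarser resolution, which gives a simple consistency check.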

The decoding procedure decomposes as follows:

– the sum $\sum_{k=1}^{N} g_k$ is directly calculated from the coefficients of $G$ (formula (10.35)). The result is equal to the gray-level sum of the pixels in the original image;

– according to (10.34), $g_{k_1}$ ($k_1 = 1 \ldots N$) is a function of the variables $a_{k_1}$, $b_{k_1}$, $t_{k_1}$, $s_{k_1}$ and of the value $\sum_{k=1}^{N} g_k$ previously calculated;

– according to (10.36), $g_{k_1 k_2}$ is a function of the variables $a_{k_1}$, $b_{k_1}$, $t_{k_1}$, $s_{k_1}$ and of the previously calculated $g_{k_2}$;

– etc.

This way, each element of the invariant function $g$ at resolution $m$ can be recursively calculated. The reconstruction procedure is not iterative, in contrast to most algorithms of fractal decoding.

The method presented in this section approximates the luminance function $f$ in each square block of an original image partition. Each block is approximated by an invariant function $g$, independently of the rest of the image. At a given resolution $m$, the approximation is performed in the least squares sense, using an IFS associated with an eventually contracting transformation $G$ in the luminance space. The expression of $G$ (equation (10.32)) is comparable with that of the mass transformation suggested by Jacquin, since it also relies on a scale factor $s_{k_1}$ and on a shift factor $t_{k_1}$. It also contains two additional coefficients $a_{k_1}$ and $b_{k_1}$ which act on the coordinates of the approximated elements inside the block: a first order approximation, in that case. Equation (10.32) shares similarities with equation (10.26): the coefficients $a_{k_1}$ and $b_{k_1}$ define weights associated with two inclined planes in the luminance space.

[Figure 10.13 labels: values $g_1$, $g_2$, $g_3$, $g_4$ at resolution 1 and $g_{12}$, $g_{22}$, $g_{32}$, $g_{42}$ at resolution 2.]

Figure 10.13. Illustration of Dudbridge algorithm for decoding at resolution $m = 3$, an 8 × 8 image. The four values $g_{k_1 k_2}$ ($k_1 = 1 \ldots 4$) at resolution 2 depend on the value $g_{k_2}$ at resolution 1 and on their respective position in the image

The reason why this method is not as efficient as the basic scheme of Jacquin is that a block is approximated from itself and not from another block of the image. The transformation $G$ must then be sufficiently complex to achieve a good block approximation. That is why Dudbridge added two more parameters to the mass transformation expression. However, saving these extra parameters results in a lower compression ratio. Nonetheless, the method has the advantage of being very fast. Moreover, the coding-decoding algorithm is evenly balanced regarding calculation time.

Dudbridge initially presented this coding technique in his thesis in 1992 [DUD 92]. Since then, the approach has been generalized to square blocks issued from a quadtree partitioning [DUD 95a]. The additional weights of the terms $x^2$, $y^2$, $x^3$ and $y^3$ in the expression of $G$ were studied by Monro et al. [MON 93a, MON 93b, MON 94, WOO 94]. The authors also extended this approach to the compression of video sequences [WIL 94].

10.3.5.2. Decoding simplification by collage space orthogonalization

We now consider that vector $\hat{r}$, an approximation of the destination vector $r$, reads:

$$\hat{r} = \beta_2 b_2 + \beta_1 b_1 \tag{10.37}$$

The vector $\hat{r}$ belongs to the vector subspace spanned by $b_1$ and $b_2$, where $b_1$ is a constant basis vector ($b_1 = [1, 1, \ldots, 1]^T$) and $b_2$ a vector extracted from the image to be coded. Each of the vectors $\hat{r}$, $r$, $b_1$ and $b_2$ is of dimension $B^2$.


De-correlation of the mass transformation coefficients

Øien proposed in [ØIE 93] a solution designed to accelerate the decompression phase by orthogonalization of the basis vectors $b_1$ and $b_2$ that span the collage vector subspace. The second advantage of this approach is that we do not have to impose constraints on the scale coefficients of the fractal transformation.

In signal processing applications, such as data compression, the handled vectors are generally represented by a linear combination of orthogonal functions. The most usual example is that of the Fourier transform, where the functions are complex exponentials. We can resort to the Gram-Schmidt procedure to orthogonalize the decimated block $b_2$ with respect to the constant basis vector $b_1$ of the vector subspace. This procedure amounts to multiplying the vector $b_2$ by the orthogonalizing matrix $O = I - b_1 b_1^T$, where $I$ is the identity matrix of dimension $B^2 \times B^2$. The orthogonalized vector $\tilde{b}_2$ reads:

$$\tilde{b}_2 = O b_2$$

Practically, this amounts to removing the component of $b_2$ that belongs to the subspace spanned by vector $b_1$, and thus to forcing the mean value of block $b_2$ to zero. The new collage, now performed in the vector subspace spanned by the orthogonal vectors $b_1$ and $\tilde{b}_2$, becomes:

$$\hat{r}_o = \alpha_2 \tilde{b}_2 + \alpha_1 b_1$$

In this case, the coefficients $\alpha_i$ are independent of each other. Øien and Lepsøy show in [ØIE 94b] that it is not necessary to formally orthogonalize the vectors to calculate the coefficients $\alpha_1$ and $\alpha_2$. They are directly obtained from the expressions:

$$\alpha_1 = \langle r, b_1 \rangle = \frac{1}{B^2} \sum_{j=1}^{B^2} r_j \qquad \text{and} \qquad \alpha_2 = \frac{\langle r, b_2 \rangle - \alpha_1 \langle b_1, b_2 \rangle}{\|b_2\|^2 - \langle b_1, b_2 \rangle^2} \tag{10.38}$$

in which $r_j$ and $b_j$ are the pixels inside blocks $r$ and $b_2$, respectively. The coefficients $\alpha_1$ and $\alpha_2$ are related to the initial coefficients $\beta_1$ and $\beta_2$ through the relations:

$$\beta_2 = \alpha_2 \qquad \text{and} \qquad \beta_1 = \alpha_1 - \langle b_1, b_2 \rangle \beta_2$$

Øien and Lepsøy show in [ØIE 94b] that with the orthogonalization of vectors $b_1$ and $b_2$ prior to the coding phase, it becomes possible to guarantee the exact convergence of the decoder in a finite number of iterations.
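The expressions (10.38) and the relations between $(\alpha_1, \alpha_2)$ and $(\beta_1, \beta_2)$ can be checked numerically. The sketch below assumes that the inner product is normalized by the number of pixels, $\langle x, y \rangle = \frac{1}{B^2} \sum_j x_j y_j$, which is the reading suggested by $\alpha_1 = \frac{1}{B^2} \sum_j r_j$ and under which the constant block of ones has unit norm; blocks are stored as flat lists, and all names and values are illustrative.

```python
def inner(x, y):
    """Inner product normalized by the number of pixels (assumed convention)."""
    return sum(a * b for a, b in zip(x, y)) / len(x)


def collage_coefficients(r, b2):
    """Return (alpha1, alpha2, beta1, beta2) for the collage of b2 onto r."""
    b1 = [1.0] * len(r)
    alpha1 = inner(r, b1)                      # mean gray-level of r
    mean_b2 = inner(b1, b2)                    # mean gray-level of b2
    variance_b2 = inner(b2, b2) - mean_b2 ** 2
    if variance_b2 == 0.0:
        raise ValueError("b2 is constant: the scale coefficient is undefined")
    alpha2 = (inner(r, b2) - alpha1 * mean_b2) / variance_b2
    beta2 = alpha2
    beta1 = alpha1 - mean_b2 * beta2
    return alpha1, alpha2, beta1, beta2


b2 = [1.0, 2.0, 3.0, 4.0]
r = [8.0, 12.0, 13.0, 17.0]
alpha1, alpha2, beta1, beta2 = collage_coefficients(r, b2)
residual = [rj - beta2 * bj - beta1 for rj, bj in zip(r, b2)]
```

The residual is orthogonal to both the constant block and $b_2$, which is the condition stated by the projection theorem for the optimal least squares coefficients.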


10.3.5.3. Coding acceleration: search for the nearest neighbor In [SAU 95], Saupe proposes a fractal coding procedure of complexity in O(log Q), where Q is the number of source blocks, which is worthy comparing with the usual schemes whose complexity is in O(Q). The traditional procedure, using an afﬁne transformation in the luminance spaces, searches for each destination block 2 2 r ∈ RB , of the Q source block b2 ∈ RB that minimizes: E(ˆ r, r) = min r − β1 b1 − β2 b2 2 β1 ,β2

Calculation of the optimal coefﬁcients β1 and β2 and of the error is costly in terms of calculation time. 2

Saupe considers the orthonormal basis of the vector subspace of RB , spanned by the normalized vectors b1 (b1 = B1 (1, . . . , 1)) and φ(b2 ). The symbol φ represents the projection operator making φ(b2 ) orthonormal to b1 . It is shown, in this case [SAU 95], that the error E(ˆ r, r) is proportional to an increasing monotonic function of distance D given by: D(ˆ r, r) = min d φ(b2 ), φ(r) , d −φ(b2 ), φ(r) Minimizing E(ˆ r, r) amounts to minimizing D(ˆ r, r) and thus, to seeking for the closest neighbor of φ(r) among the 2Q vectors ±φ(b2 ). Different fast search algorithms of the nearest neighbor are proposed in the literature. In [FRI 77], the authors build a tree of dimension B 2 and deﬁne a search method of complexity in O(log Q). The results presented in [SAU 95] show that it is possible to gain a factor 1.3 to 11.5 over the compression time, without degrading the quality of the reconstructed image signiﬁcantly. The gain depends of course on the nature of the image and on the number of source blocks considered. 10.3.6. Other optimization diagrams: hybrid methods Seminal work by Jacquin, based on the use of a local iterated contracting functions system, launched a great deal of research into the approximation or coding of real 1D, 2D and 3D signals by fractals [FIS 95a, JACQ 93, SAU 94, WOH 99]. These mainly concern: – the construction of an optimal partition to calculate the fractal transformation [DAVO 97, FIS 95b, FIS 95c, HURT 93c, NOV 93, REU 94, THO 95]. They are composed of square, rectangular or polygonal surface blocks locally adapted to the texture of the images (see Figure 10.14); – the acceleration of the coding algorithm (see [DAVO 96, DUD 92, LEP 93, TRU 00]);


– the use of constant vectors issued from a known dictionary, to approximate the destination blocks from the source blocks [GHA 93, VIN 93];
– the use of non-affine elementary functions $\omega_n$ that allow for coding the spatial redundancy of the images [LIN 94, POP 97];
– the acceleration of the decoding algorithm: iterative, non-iterative, hierarchical [BARA 95, CHAN 00, ØIE 93, ØIE 94a];
– the theoretical study of the fractal transformation [VRS 99] and of the convergence of the decoder [HURT 93a, HURT 94, LUN 92, MUK 00];
– the extension of the method to the coding of video sequences [BART 95, BEA 91, BOG 94, FIS 94, HURD 92, HURT 93b, LAZ 94, MON 95b, WIL 94];
– the use of fractals in hybrid coding-decoding schemes, based either on a discrete cosine transform [BART 94a, BART 94b, BART 95b] or on a wavelet transform of the image [BEL 98, DAVI 98, KRU 95, RIN 95, SIM 95, WALLE 96]. The fractal code is, in this case, calculated on another representation of the original image, which can be more favorable to the search for local similarities.

Figure 10.14. Illustration of different partitionings: quadtree (square), HV (rectangular) and Voronoï (polygonal), used for the search of local similarities in the image and for the calculation of the fractal code

Figure 10.15 compares the compression performance of fractal coding on square, rectangular and polygonal partitions with that of standard JPEG compression [WALLA 91]. It shows that, beyond a certain compression rate, fractal compression outperforms JPEG compression, at least in terms of the visual quality of the images.


[Plot: reconstruction SNR (vertical axis, 26 to 38) versus compression rate (horizontal axis, 10 to 100). Curves: JPEG, Delaunay fractal, Voronoi fractal, HV fractal, Quadtree 1 fractal, Quadtree 2 fractal.]

Figure 10.15. Signal-to-noise ratios versus the compression rate, calculated on the 512 × 512 Lena image. These ratios compare the quality of the images reconstructed after fractal compression on blocks of variable geometries with that obtained with a standard JPEG compressor

10.4. Bibliography

[BARA 95] BARAHAV Z., MALAH D., KARNIN E., “Hierarchical interpretation of fractal image coding and its applications”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 91–117, 1995.
[BARN 86] BARNSLEY M.F., ERVIN V., HARDIN D., LANCASTER J., “Solution of an inverse problem for fractals and other sets”, Proc. Natl. Acad. Sci. USA, vol. 83, p. 1975–1977, 1986.
[BARN 88] BARNSLEY M.F., Fractals Everywhere, Academic Press, New York, 1988.
[BARN 93] BARNSLEY M.F., HURD L.P., Fractal Image Compression, A.K. Peters, Wellesley, 1993.
[BART 94a] BARTHEL K.U., SCHÜTTEMEYER J., VOYÉ T., NOLL P., “A new image coding technique unifying fractal and transform coding”, in IEEE International Conference on Image Processing (Austin, Texas), p. 112–116, November 1994.
[BART 94b] BARTHEL K.U., VOYÉ T., “Adaptive fractal image coding in the frequency domain”, in Proceedings of International Workshop on Image Processing: Theory, Methodology, Systems, and Applications (Budapest, Hungary), June 1994.


[BART 95] BARTHEL K.U., VOYÉ T., “Three-dimensional fractal video coding”, in ICIP (Washington DC, USA), vol. 3, p. 260–263, 1995.
[BART 95b] BARTHEL K.U., “Entropy constrained fractal image coding”, Fractals, in NATO ASI on Fractal Image Coding, Trondheim, Norway, July 1995.
[BEA 91] BEAUMONT J.M., “Image data compression using fractal techniques”, BT Technol. J., vol. 9, no. 4, p. 93–109, 1991.
[BEL 98] BELLOULATA K., BASKURT A., BENOIT-CATIN H., PROST R., “Fractal coding of subbands with an oriented partition”, Signal Processing: Image Communication, vol. 12, 1998.
[BOG 94] BOGDAN A., “Multiscale (inter/intra frame) fractal video coding”, in Proceedings of the IEEE International Conference on Image Processing (ICIP’94, Austin, Texas), November 1994.
[CAB 92] CABRELLI C.A., FORTE B., MOLTER U.M., VRSCAY E.R., “Iterated fuzzy set systems: A new approach to the inverse problem for fractals and other sets”, Journal of Mathematical Analysis and Applications, vol. 171, no. 1, p. 79–100, 1992.
[CHAN 00] CHANG H.T., KUO C.J., “Iteration-free fractal image coding based on efficient domain pool design”, IEEE Transactions on Image Processing, vol. 9, no. 3, p. 329–339, 2000.
[CHAS 93] CHASSERY J.M., DAVOINE F., BERTIN E., “Compression fractale par partitionnements de Delaunay”, in Quatorzième colloque GRETSI (Juan-les-Pins, France), vol. 2, p. 819–822, 1993.
[DAVI 98] DAVIS G., “A wavelet-based analysis of fractal image compression”, IEEE Transactions on Image Processing, vol. 7, p. 141–154, 1998.
[DAVO 96] DAVOINE F., ANTONINI M., CHASSERY J.M., BARLAUD M., “Fractal image compression based on Delaunay triangulation and vector quantization”, IEEE Transactions on Image Processing: Special Issue on Vector Quantization, February 1996.
[DAVO 97] DAVOINE F., ROBERT G., CHASSERY J.M., “How to improve pixel-based fractal image coding with adaptive partitions”, in LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals in Engineering, Springer-Verlag, p. 292–307, 1997.
[DUD 92] DUDBRIDGE F., Image approximation by self-affine fractals, PhD Thesis, University of London, 1992.
[DUD 95a] DUDBRIDGE F., “Fast image coding by a hierarchical fractal construction”, University of California, San Diego, 1995.
[DUD 95b] DUDBRIDGE F., “Least-squares block coding by fractal functions”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 229–241, 1995.
[ELT 87] ELTON J.H., “An ergodic theorem for iterated maps”, Ergodic Theory and Dynamical Systems, vol. 7, p. 481–488, 1987.
[FIS 91] FISHER Y., JACOBS E.W., BOSS R.D., Iterated transform image compression, Technical Report 1408, Naval Ocean Systems Center, San Diego, California, April 1991.


[FIS 94] FISHER Y., ROGOVIN D., SHEN T.P., “Fractal (self-VQ) encoding of video sequences”, in Proceedings of the SPIE: Visual Communications and Image Processing (Chicago, Illinois), September 1994.
[FIS 95a] FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, 1995.
[FIS 95b] FISHER Y., “Fractal image compression with Quadtrees”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 55–77, 1995.
[FIS 95c] FISHER Y., MENLOVE S., “Fractal encoding with HV partitions”, in FISHER Y. (Ed.), Fractal Image Compression: Theory and Application to Digital Images, Springer-Verlag, New York, p. 119–136, 1995.
[FRI 77] FRIEDMAN J.H., FINKEL J.L., “An algorithm for finding best matches in logarithmic expected time”, ACM Trans. Math. Software, vol. 3, no. 3, p. 209–226, 1977.
[GHA 93] GHARAVI-ALKHANSARI M., HUANG T.S., “A fractal-based image block-coding algorithm”, in Proceedings of ICASSP, p. 345–348, 1993.
[HURD 92] HURD L.P., GUSTAVUS M.A., BARNSLEY M.F., “Fractal video compression”, in Compcon Spring. Conference 37, p. 41–42, 1992.
[HURT 93a] HÜRTGEN B., “Contractivity of fractal transforms for image coding”, Electronics Letters, vol. 29, no. 20, p. 1749–1750, 1993.
[HURT 93b] HÜRTGEN B., BÜTTGEN P., “Fractal approach to low rate video coding”, in Proceedings of SPIE, vol. 2094, p. 120–131, 1993.
[HURT 93c] HÜRTGEN B., MÜLLER F., STILLER C., “Adaptive fractal coding of still pictures”, in Proceedings of the Picture Coding Symposium, 1993.
[HURT 94] HÜRTGEN B., HAIN T., “On the convergence of fractal transforms”, in Proceedings of ICASSP, p. 561–564, 1994.
[JACO 92] JACOBS E.W., FISHER Y., BOSS R.D., “Image compression: a study of the iterated transform method”, Signal Processing, vol. 29, p. 251–263, 1992.
[JACQ 92] JACQUIN A.E., “Image coding based on a fractal theory of iterated contractive image transformations”, IEEE Transactions on Image Processing, vol. 1, no. 1, p. 18–30, 1992.
[JACQ 93] JACQUIN A.E., “Fractal image coding: a review”, Proceedings of the IEEE, vol. 81, no. 10, p. 1451–1465, 1993.
[KRO 92] KROPATSCH W.G., NEUHAUSSER M.A., LEITGEB I.J., BISCHOF H., “Combining pyramidal and fractal image coding”, in Proceedings of the Eleventh ICPR (The Hague, Netherlands), vol. 3, p. 61–64, 1992.
[KRU 95] KRUPNIK H., MALAH D., KARNIN E., “Fractal representation of images via the discrete wavelet transform”, in IEEE Eighteenth Conference of EE in Israel (Tel Aviv), March 1995.
[LAZ 94] LAZAR M.S., BRUTON L.T., “Fractal block coding of digital video”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 3, p. 297–308, 1994.


[LEP 93] LEPSØY S., Attractor image compression – Fast algorithms and comparisons to related techniques, PhD Thesis, Norwegian Institute of Technology, Trondheim, June 1993.
[LEV 90] LÉVY VÉHEL J., GAGALOWICZ A., Fractal approximation of 2-D object, Technical Report 1187, INRIA, Rocquencourt, France, 1990.
[LIN 94] LIN H., VENETSANOPOULOS A.N., “Incorporating nonlinear contractive functions into the fractal coding”, in Proceedings of the International Workshop on Intelligent Signal Processing and Communication Systems (Seoul, Korea), p. 169–172, October 1994.
[LUN 92] LUNDHEIM L., Fractal signal modelling for source coding, PhD Thesis, Norwegian Institute of Technology, Trondheim, September 1992.
[LUT 93] LUTTON E., LÉVY VÉHEL J., “Optimization of fractal functions using genetic algorithms”, in Fractal’93 (London, Great Britain), Springer, 1993.
[MAN 89] MANTICA G., SLOAN A., “Chaotic optimization and the construction of fractals: Solution of an inverse problem”, Complex Systems, vol. 3, p. 37–62, 1989.
[MON 93a] MONRO D.M., “Class of fractal transforms”, Electronics Letters, vol. 29, no. 4, p. 362–363, 1993.
[MON 93b] MONRO D.M., “Fractal transforms: Complexity versus fidelity”, in VERNAZZA G., VENETSANOPOULOS A.N., BRACCINI C. (Eds.), Image Processing: Theory and Applications, Elsevier Science Publishers, p. 45–48, 1993.
[MON 94] MONRO D.M., WOOLLEY S.J., “Fractal image compression without searching”, in Proceedings of ICASSP, vol. 5, p. 557–560, 1994.
[MON 95a] MONRO D.M., DUDBRIDGE F., “Rendering algorithms for deterministic fractals”, IEEE Computer Graphics and Applications, p. 32–41, January 1995.
[MON 95b] MONRO D.M., NICHOLLS J.A., “Low bit rate colour fractal video”, in ICIP (Washington DC, USA), vol. 3, p. 264–267, 1995.
[MUK 00] MUKHERJEE J., KUMAR P., GHOSH S.K., “A graph-theoretic approach for studying the convergence of fractal encoding algorithm”, IEEE Transactions on Image Processing, vol. 9, no. 3, p. 366–377, 2000.
[NOV 93] NOVAK M., Attractor coding of images, PhD Thesis, Department of Electrical Engineering, Linköping University, 1993.
[ØIE 93] ØIEN G.E., L2-optimal attractor image coding with fast decoder convergence, PhD Thesis, Norwegian Institute of Technology, Trondheim, April 1993.
[ØIE 94a] ØIEN G.E., BAHARAV Z., LEPSØY S., KARNIN E., MALAH D., “A new improved collage theorem with applications to multiresolution fractal image coding”, in International Conference on Acoustics, Speech, and Signal Processing, 1994.
[ØIE 94b] ØIEN G.E., LEPSØY S., “Fractal-based image coding with fast decoder convergence”, Signal Processing, vol. 40, p. 105–117, 1994.
[POP 97] POPESCU D.C., DIMCA A., YAN H., “A nonlinear model for fractal image coding”, IEEE Transactions on Image Processing, vol. 6, no. 3, 1997.
[RAM 86] RAMAMURTHI B., GERSHO A., “Classified vector quantization of images”, IEEE Transactions on Communications, vol. 34, no. 11, p. 1105–1115, 1986.


[REU 94] REUSENS E., “Partitioning complexity issue for iterated functions systems based image coding”, in Proceedings of the Seventh European Signal Processing Conference (Edinburgh, Scotland), vol. 1, p. 171–174, September 1994.
[RIN 94] RINALDO R., ZAKHOR A., “Inverse and approximation problem for two-dimensional fractal sets”, IEEE Transactions on Image Processing, vol. 3, no. 6, p. 802–820, 1994.
[RIN 95] RINALDO R., CALVAGNO G., “Image coding by block prediction of multiresolution subimages”, IEEE Transactions on Image Processing, p. 909–920, July 1995.
[SAU 94] SAUPE D., HAMZAOUI R., “A review of the fractal image compression literature”, Computer Graphics, vol. 28, no. 4, p. 268–276, 1994.
[SAU 95] SAUPE D., “Accelerating fractal image compression by multi-dimensional nearest neighbor search”, in STORER J.A., COHN M. (Eds.), Proceedings of the Data Compression Conference (DCC’95, Institute for Information Technology, Freiburg University), IEEE Computer Society Press, March 1995.
[SIM 95] SIMON B., “Explicit link between local fractal transform and multiresolution transform”, in ICIP (Washington DC, USA), vol. 1, p. 278–281, 1995.
[THO 95] THOMAS L., DERAVI F., “Region-based fractal image compression using heuristic search”, IEEE Transactions on Image Processing, vol. 4, no. 6, p. 832–838, 1995.
[TRI 93] TRICOT C., Courbes et dimension fractale, Springer-Verlag, 1993.
[TRU 00] TRUONG T.K., JENG J.H., REED I.S., LEE P.C., LI A.Q., “A fast encoding algorithm for fractal image compression using the DCT inner product”, IEEE Transactions on Image Processing, vol. 9, no. 4, p. 529–535, 2000.
[VIN 93] VINES G., Signal modelling with iterated function systems, PhD Thesis, Georgia Institute of Technology, May 1993.
[VRS 90] VRSCAY E.R., “Moment and collage methods for the inverse problem of fractal construction with iterated function systems”, in Fractal’90 conference, June 1990.
[VRS 99] VRSCAY E.R., SAUPE D., “Can one break the collage barrier in fractal image coding?”, in Fractals in Engineering, Springer-Verlag, 1999.
[WALLA 91] WALLACE G.K., “The JPEG still picture compression standard”, Communications of the ACM, vol. 34, no. 4, p. 30–44, 1991.
[WALLE 96] VAN DE WALLE A., “Merging fractal image compression and wavelet transform methods”, Fractals, 1996.
[WIL 94] WILSON D.L., NICHOLLS J.A., MONRO D.M., “Rate buffered fractal video”, in Proceedings of the ICASSP, vol. V, p. 505–508, 1994.
[WOH 99] WOHLBERG B., DE JAGER G., “A review of the fractal image coding literature”, IEEE Transactions on Image Processing, vol. 8, no. 12, p. 1716–1729, 1999.
[WOO 94] WOOLLEY S.J., MONRO D.M., “Rate-distortion performance of fractal transforms for image compression”, Fractals, vol. 2, no. 6, p. 395–398, 1994.

Chapter 11

Local Regularity and Multifractal Methods for Image and Signal Analysis

11.1. Introduction

In this chapter, we shall review some of the important and recent applications of local regularity and multifractal analysis to signal/image processing. Obviously, we will not aim at a complete coverage of the field, which would require a book of its own: (multi)fractal processing of signals and images is indeed now present in numerous applications. Rather, we will concentrate on a few topics, and try to explain in a very concrete manner how tools developed for the study of irregular functions may be applied to solve typical signal processing problems.

This chapter is organized as follows: in section 11.2, we briefly recall the notions that will be used in what follows, namely regularity exponents and multifractal spectra. For more details on these, see Chapters 1 and 3. In section 11.3, we explain how to estimate regularity exponents on numerical data and compare various methods for doing so (we do not tackle the problem of multifractal spectrum estimation, which will not be needed here; see Chapters 1 and 3 for more information on this). Section 11.4 gives a detailed explanation of how to use fractal tools to perform signal and image denoising. We first recall a traditional wavelet-based denoising method, and explain why it is not adapted to processing irregular signals. We then

Chapter written by Pierrick LEGRAND.


present three different methods based on Hölder exponents and large deviation multifractal spectra that give good results on signals such as fractal functions, road profiles and SAR (radar) images. Section 11.5 explains how Hölder exponents may be used to perform data interpolation: the idea is to refine the resolution in such a way that local regularity is preserved at each point. Again, this approach is well adapted to processing irregular signals and images, as we show in examples. Section 11.6 gives an account of the remarkable applications of fractal tools to ECG analysis: links between the condition of the heart and some features of the multifractal spectrum of its ECG, relation between RR signals and their local regularity, etc. In section 11.7, we briefly describe an application of multifractal analysis to texture classification, and describe an example of well logs. Section 11.8 is devoted to the presentation of an image segmentation method based on characterizing edges through multifractal analysis. The issue of change detection in a sequence of images is dealt with in section 11.9. As in the contour segmentation application, the idea is to characterize relevant changes through their signatures in the multifractal spectrum. As a final image processing application, we describe in section 11.10 a method for reconstructing an image from a specific subset of pixels selected through multifractal analysis. To end this introduction, we should also mention that many of the methods described in this chapter are implemented in the free software toolbox FracLab [FracLab].

11.2. Basic tools

11.2.1. Hölder regularity analysis

This section focuses on the Hölder characterizations of regularity. To simplify notations, we assume that our signals are nowhere differentiable. Generalization to other signals simply requires the introduction of polynomials in the definitions (see Chapters 1 and 3).


DEFINITION 11.1 (Pointwise Hölder exponent).– Let $\alpha \in (0, 1)$ and $x_0 \in K \subset \mathbb{R}$. A function $f : K \to \mathbb{R}$ is in $C^{\alpha}_{x_0}$ if, for all $x$ in a neighborhood of $x_0$,

$$|f(x) - f(x_0)| \le c\,|x - x_0|^{\alpha} \qquad (11.1)$$

where $c$ is a constant. The pointwise Hölder exponent of $f$ at $x_0$, denoted $\alpha_p(x_0)$, is the supremum of the $\alpha$ for which inequality (11.1) holds.

Let us now introduce the local Hölder exponent: let $\alpha \in (0, 1)$, $\Omega \subset \mathbb{R}$. We say that $f \in C^{\alpha}_l(\Omega)$ if:

$$\exists C : \forall x, y \in \Omega : \frac{|f(x) - f(y)|}{|x - y|^{\alpha}} \le C$$

Let:

$$\alpha_l(f, x_0, \rho) = \sup\{\alpha : f \in C^{\alpha}_l(B(x_0, \rho))\}$$

and notice that $\alpha_l(f, x_0, \rho)$ is non-increasing as a function of $\rho$. We may thus set the following definition:

DEFINITION 11.2.– Let $f$ be a continuous function. The local Hölder exponent of $f$ at $x_0$ is the real number:

$$\alpha_l(x_0) = \alpha_l(f, x_0) = \lim_{\rho \to 0} \alpha_l(f, x_0, \rho)$$
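As a small numerical illustration of inequality (11.1), consider the cusp $f(x) = |x - x_0|^{\gamma}$, whose pointwise Hölder exponent at $x_0$ is $\gamma$. The sketch below recovers $\gamma$ as the slope of $\log|f(x_0 + h) - f(x_0)|$ against $\log h$. The helper names are hypothetical, and using increments at single points instead of a supremum over a neighborhood is a simplification that happens to be exact for a cusp.

```python
import math


def regression_slope(xs, ys):
    """Slope of the least-squares line fitted to the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def pointwise_exponent(f, x0, scales):
    """Regress log|f(x0 + h) - f(x0)| on log h over the given scales h.

    Assumes every increment is non-zero, otherwise the logarithm is undefined.
    """
    log_h = [math.log(h) for h in scales]
    log_inc = [math.log(abs(f(x0 + h) - f(x0))) for h in scales]
    return regression_slope(log_h, log_inc)


gamma = 0.3


def cusp(x):
    return abs(x - 0.5) ** gamma


scales = [2.0 ** -j for j in range(3, 12)]
estimate = pointwise_exponent(cusp, 0.5, scales)
```

For signals that are irregular everywhere, single increments fluctuate strongly, which is one reason for preferring oscillations or wavelet coefficients as the measured quantity.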

11.2.2. Reminders on multifractal analysis

We briefly state in this section some basic facts about multifractal analysis. Multifractal analysis is concerned with the study of the regularity structure of functions or processes, both from a local and a global perspective. More precisely, we start by measuring in some way the pointwise regularity, usually with some kind of Hölder exponent. The second step is to give a global description of this regularity. This can be done either in a geometric fashion using Hausdorff dimension, or in a statistical manner using a large deviation analysis.

Formally, let $X(t)$, $t \in I \subset \mathbb{R}$, be a deterministic function or a stochastic process on a probability space $(\Omega, \mathcal{F}, P)$. For ease of notation, we shall assume without loss of generality that $I = [0, 1]$. We define the following functions (these are random functions in general when $X$ is itself random).

11.2.2.1. Hausdorff multifractal spectrum

To simplify notations, set $\alpha(t) = \alpha_p(t)$. The Hausdorff spectrum describes the structure of the function $t \mapsto \alpha(t)$ by evaluating the size of its level sets. More precisely, let:

$$T_{\alpha} = \{t \in I,\ \alpha(t) = \alpha\}$$


The Hausdorff multifractal spectrum is the function:

$$f_h(\alpha) = \dim_H(T_{\alpha})$$

where $\dim_H(T)$ denotes the Hausdorff dimension of the set $T$.

11.2.2.2. Large deviation multifractal spectrum

Let:

$$N_n^{\varepsilon}(\alpha) = \#\{k : \alpha - \varepsilon \le \alpha_n^k \le \alpha + \varepsilon\}$$

where $\alpha_n^k$ is the coarse-grained exponent corresponding to the dyadic interval $I_n^k = [k2^{-n}, (k+1)2^{-n}]$, i.e.:

$$\alpha_n^k = \frac{\log |Y_n^k|}{-\log n}$$

Here, $Y_n^k$ is a quantity that measures the variation of $X$ in the interval $I_n^k$. The choice $Y_n^k := X((k+1)2^{-n}) - X(k2^{-n})$ leads to the simplest analytical calculations. Another possibility is to set $Y_n^k = \mathrm{osc}_X(I_n^k)$, i.e. the oscillation of $X$ inside $I_n^k$. A third choice is to take $Y_n^k$ to be the wavelet coefficient $x_{n,k}$ of $X$ at scale $n$ and location $k$ (note that, in this case, the spectrum will depend on the chosen wavelet). The large deviation spectrum $f_g(\alpha)$ is defined as follows:

$$f_g(\alpha) = \lim_{\varepsilon \to 0} \liminf_{n \to \infty} \frac{\log N_n^{\varepsilon}(\alpha)}{\log n}$$

Note that, whatever the choice of $Y_n^k$, $f_g$ always ranges in $\mathbb{R}^+ \cup \{-\infty\}$. The intuitive meaning of $f_g$ is as follows. For $n$ large enough, one has roughly:

$$P_n(\alpha_n^k \simeq \alpha) \simeq 2^{-n(1 - f_g(\alpha))}$$

where $P_n$ denotes the uniform distribution over $\{0, 1, \ldots, 2^n - 1\}$. Thus, for all $\alpha$ such that $f_g(\alpha) < 1$, $1 - f_g(\alpha)$ measures the exponential decay rate of the probability of finding an interval $I_n^k$ with coarse-grained exponent equal to $\alpha$, when $n$ tends to infinity.

When $X$ is a stochastic process, $f_g$ is in general a random function. In applications, it is convenient to consider in this case a “deterministic version” of $f_g$, defined as follows:

$$F_g(\alpha) = 1 + \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{\log \pi_n(\alpha)}{\log(n)}$$

where:

$$\pi_n(\alpha) := P \times P_n[\alpha_n^k \in (\alpha - \varepsilon, \alpha + \varepsilon)]$$

and, unlike $f_g$, $F_g$ may assume non-trivial negative values.
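A finite-resolution version of the counting that defines $f_g$ can be sketched as follows. The code assumes that the interval is split into $n$ pieces and that `Y` holds the $n$ corresponding variations, which is the reading under which the $\log n$ normalization applies directly. The function names are hypothetical, and no limit in $n$ or $\varepsilon$ is taken.

```python
import math


def coarse_grained_exponents(Y):
    """alpha_k = log|Y_k| / (-log n), where n = len(Y).

    Zero variations are skipped: their exponent would be +infinity.
    """
    n = len(Y)
    return [math.log(abs(y)) / (-math.log(n)) for y in Y if y != 0]


def large_deviation_estimate(Y, alpha, eps):
    """log N / log n, N being the number of exponents within eps of alpha.

    Returns -infinity when no exponent falls in the window.
    """
    n = len(Y)
    count = sum(1 for a in coarse_grained_exponents(Y) if abs(a - alpha) <= eps)
    return math.log(count) / math.log(n) if count > 0 else float("-inf")


# Monofractal toy example: every variation equals n^(-H), so every exponent is H.
n, H = 1024, 0.6
Y = [n ** -H] * n
value_at_H = large_deviation_estimate(Y, 0.6, 0.05)
value_elsewhere = large_deviation_estimate(Y, 0.3, 0.05)
```

In this toy case the estimate is 1 at $\alpha = H$ and $-\infty$ elsewhere, in line with the range $\mathbb{R}^+ \cup \{-\infty\}$ stated above.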


11.2.2.3. Legendre multifractal spectrum

It is natural to interpret the spectrum $f_g$ as a rate function in a large deviation principle (LDP). Large deviations theorems provide conditions under which such rate functions may be calculated as the Legendre transform of a limiting moment generating function. When applicable, this procedure provides a more robust estimation of $f_g$ than a direct calculation. Define, for $q \in \mathbb{R}$:

$$S_n(q) = \sum_{k=0}^{n-1} |Y_n^k|^q$$

with the convention $0^q := 0$ for all $q \in \mathbb{R}$. Let:

$$\tau(q) = \liminf_{n \to \infty} \frac{\log S_n(q)}{-\log(n)}$$

The Legendre multifractal spectrum of $X$ is defined as ($\tau^*$ denotes the Legendre transform of $\tau$):

$$f_l(\alpha) := \tau^*(\alpha) = \inf_{q \in \mathbb{R}} (q\alpha - \tau(q))$$
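The two steps above, a structure function followed by a Legendre transform, can be sketched at a single finite resolution. The code assumes that `Y` holds the $n$ variations $Y_n^k$, replaces the $\liminf$ by the value at that resolution, and replaces the infimum over $q \in \mathbb{R}$ by a minimum over a finite grid. The function names are hypothetical.

```python
import math


def tau(Y, q):
    """Finite-n version of tau(q) = log S_n(q) / (-log n), with 0^q := 0."""
    n = len(Y)
    s = sum(abs(y) ** q for y in Y if y != 0)
    return math.log(s) / (-math.log(n))


def legendre_spectrum(Y, alpha, qs):
    """Minimum of q*alpha - tau(q) over the finite grid qs."""
    return min(q * alpha - tau(Y, q) for q in qs)


# Monofractal toy example: |Y_k| = n^(-H) gives tau(q) = qH - 1.
n, H = 1024, 0.6
Y = [n ** -H] * n
qs = [k / 10 for k in range(-50, 51)]
f_at_H = legendre_spectrum(Y, 0.6, qs)
f_elsewhere = legendre_spectrum(Y, 0.3, qs)
```

Here $q\alpha - \tau(q) = q(\alpha - H) + 1$, so the spectrum equals 1 at $\alpha = H$ and decreases without bound elsewhere as the grid in $q$ widens.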

Being defined through a Legendre transform, $f_l$ is a concave function. The two spectra $f_g$ and $f_l$ are related as follows. Define the sequence of random variables $Z_n := \log |Y_n^k|$, where the randomness is through a choice of $k$ uniformly in the set $\{0, \ldots, n-1\}$. Consider the corresponding moment generating functions:

$$c_n(q) := -\frac{\log E_n[\exp(qZ_n)]}{\log(n)}$$

where $E_n$ denotes expectation with respect to $P_n$. A version of the Gärtner-Ellis theorem ensures that if $\lim c_n(q)$ exists (in which case it equals $1 + \tau(q)$), and is differentiable, then $c^* = f_g - 1$. We then say that the weak multifractal formalism $f_g = f_l$ holds.

11.3. Hölderian regularity estimation

11.3.1. Oscillations (OSC)

The most natural way to estimate regularity is to use the “oscillation” method. This method is a direct application of the definition of the Hölder exponent (see [TRI 95] for more on this topic).


As seen above, a function $f(t)$ is Hölderian with exponent $\alpha \in (0, 1)$ at $t$ if there exists a constant $c$ such that, for all $t'$ in a neighborhood of $t$:

$$|f(t) - f(t')| \le c\,|t - t'|^{\alpha}$$

In terms of oscillations, this condition may be written as:

$$\exists c,\ \forall \tau, \quad \mathrm{osc}_{\tau}(t) \le c\,\tau^{\alpha}$$

where $\mathrm{osc}_{\tau}(t) = \sup_{|t - t'| \le \tau} f(t') - \inf_{|t - t'| \le \tau} f(t') = \sup_{t', t'' \in [t - \tau, t + \tau]} |f(t') - f(t'')|$. The estimator is then simply defined as the slope in the least-squares linear regression of the logarithms of the oscillations versus the logarithms of the size $\tau$ of the balls used to calculate the oscillations.

11.3.2. Wavelet coefficient regression (WCR)

A method using wavelet coefficients is described in this section. It relies on a theorem by S. Jaffard. This theorem shows how we can estimate the regularity at the point $t_0$ using the wavelet coefficients (provided the wavelets verify some regularity properties [JAF 04]).

THEOREM 11.1 (S. Jaffard).– Let $f$ be a uniformly Hölderian function and $\alpha$ the pointwise Hölder exponent of $f$ at $t_0$. Then there exists a constant $c$ such that the wavelet coefficients verify:

$$|c_{j,k}| \le c\,2^{-j(\alpha + \frac{1}{2})} (1 + |2^j t_0 - k|)^{\alpha} \qquad \forall (j, k) \in \mathbb{Z}^2$$

Conversely, if, for all $(j, k) \in \mathbb{Z}^2$, we obtain $|c_{j,k}| \le c\,2^{-j(\alpha + \frac{1}{2})} (1 + |2^j t_0 - k|)^{\alpha'}$ for an $\alpha' < \alpha$, then the Hölder exponent of $f$ at $t_0$ is $\alpha$.

From this theorem, a traditional local regularity estimator is obtained if we consider only the indexes $j, k$ such that $|k - 2^j t_0| < \mathrm{const}$. This amounts to making the assumption that the local and pointwise Hölder exponents of $f$ at $t_0$ coincide [LEV 04b]. Under this hypothesis, an estimator is obtained through the regression slope $p$ of $\log_2 |c_{j,k}|$ versus $j$. More precisely, at each point $t_0$ of a signal decomposed on $n$ scales we estimate the regularity by:

$$\tilde{\alpha}(n, t_0) = -\frac{1}{2} - K_n \sum_{j=1}^{n} s_j \log_2 |c_{j,k}| \qquad (11.2)$$

with $K_n = \frac{12}{n(n-1)(n+1)}$ and $s_j = j - \frac{n+1}{2}$. The $c_{j,k}$ are the wavelet coefficients “above” $t_0$, i.e. $k = \frac{t_0 + 1}{2^{n-j+1}}$.
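Formula (11.2) is the closed form of a least-squares slope and can be evaluated directly once the coefficients above $t_0$ are known. The sketch below assumes that they have already been extracted into a list with one modulus per scale $j = 1, \ldots, n$. The function name `wcr_exponent` is hypothetical and the wavelet transform itself is not shown.

```python
import math


def wcr_exponent(abs_coeffs):
    """Evaluate -1/2 - K_n * sum_j s_j * log2|c_j| for j = 1..n (n >= 2).

    abs_coeffs[j - 1] is the modulus of the coefficient above t0 at scale j;
    all moduli are assumed to be strictly positive.
    """
    n = len(abs_coeffs)
    k_n = 12.0 / (n * (n - 1) * (n + 1))
    total = 0.0
    for j, c in enumerate(abs_coeffs, start=1):
        s_j = j - (n + 1) / 2.0          # centered scale index
        total += s_j * math.log2(c)
    return -0.5 - k_n * total


# Synthetic coefficients decaying exactly like 2^(-j(alpha + 1/2)), alpha = 0.3.
alpha_true = 0.3
coeffs = [2.0 ** (-j * (alpha_true + 0.5)) for j in range(1, 9)]
alpha_hat = wcr_exponent(coeffs)
```

On these synthetic coefficients the estimate matches the prescribed exponent up to rounding, since $K_n \sum_j s_j\, j = 1$.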

11.3.3. Wavelet leaders regression (WL)

This method is very similar to the previous one, but often provides better results. For more information on wavelet leaders see [JAF 04].

A dyadic cube at scale $j$ is given by $\lambda = \left[\frac{k}{2^j}, \frac{k+1}{2^j}\right)$. In $d$ dimensions, this interval becomes a cube in $\mathbb{R}^d$.

DEFINITION 11.3.– The wavelet leaders are $d_{\lambda} = \sup_{\lambda' \subset \lambda} |c_{\lambda'}|$. $\lambda_j(t_0)$ is the dyadic cube of size $2^{-j}$ at scale $j$ containing the point $t_0$. Let $d_j(t_0) = \sup_{\lambda' \in \mathrm{adj}(\lambda_j(t_0))} |c_{\lambda'}|$, with $\mathrm{adj}(\lambda_j(t_0))$ the set of dyadic cubes adjacent to $\lambda_j(t_0)$.

PROPOSITION 11.1 (S. Jaffard).– If $f \in C^{\alpha}(t_0)$, then:

$$\exists c > 0,\ \forall j \ge 0, \quad d_j(t_0) \le c\,2^{-(\alpha + \frac{1}{2})j} \qquad (11.3)$$

Conversely, if equation (11.3) holds, and if $f$ is uniformly Hölderian, then $f \in C^{\alpha}(t_0)$.

From this proposition, and with the simplification $\mathrm{adj}(\lambda_j(t_0)) = \lambda_j(t_0)$ only, the new estimator is determined. At each point $t_0$ of the signal $X$ decomposed on $n$ scales, the regularity is estimated by the following formula:

$$\tilde{\alpha}_{WL}(n, t_0) = -\frac{1}{2} - K_n \sum_{j=1}^{n} s_j \log_2 \max_{\lambda' \subset \lambda_j(t_0)} (|x_{\lambda'}|)$$

11.3.4. Limit inf and limit sup regressions

The definition of the Hölder exponent makes use of a lower limit, which allows the exponent to exist without conditions. The three estimators presented above, however, calculate the exponent through a linear regression. Typically, they will not converge to the true exponent when the resolution tends to infinity if the expression defining the exponent does not converge, i.e. when the lower limit is not a plain limit. Indeed, if the upper and lower limits are different, the slope given by a linear regression has no relevance. However, as shown in [LEG 04b, LEV 04a], it is still possible to use regressions to obtain the exponent. The method is general, as it applies to the estimation of upper and lower limits through a modified regression scheme, which we now explain. The use of these liminf and limsup regression methods is of great practical importance, as it allows us to estimate, on arbitrary signals, various fractal quantities such as dimensions, exponents, and multifractal spectra. Note that, even for fractal signals, the Hölder exponents are often obtained as genuine lower limits (i.e. the limit does not exist).

Let $(l_j)_{j \ge 1}$ be an arbitrary sequence of real numbers, and denote $u_j = \frac{l_j}{j}$. Let $a = \liminf_{j \to \infty} u_j$. In our framework, think for instance of $l_j$ as the logarithm of the sizes of balls used in the oscillation calculations. Define, for all $n \ge 1$:

$$V_n^0 = \{1, \ldots, n\}, \qquad L_n^0 = \{l_1, \ldots, l_n\}$$

Let $(a_n^0, b_n^0)$ be the parameters of the least-squares linear regression of $L_n^0$ with respect to $V_n^0$, i.e. the real numbers that minimize $\sum_{j=1}^{n} (l_j - aj - b)^2$ over all couples $(a, b)$. We write:

$$(a_n^0, b_n^0) = \mathrm{Reg}(V_n^0, L_n^0)$$

Now let:

$$V_n^1 = \{j \in V_n^0,\ l_j \le a_n^0 j + b_n^0\}, \qquad L_n^1 = \{l_j,\ j \in V_n^1\}, \qquad (a_n^1, b_n^1) = \mathrm{Reg}(V_n^1, L_n^1)$$

and define recursively:

$$V_n^i = \{j \in V_n^{i-1},\ l_j \le a_n^{i-1} j + b_n^{i-1}\}, \qquad L_n^i = \{l_j,\ j \in V_n^i\}, \qquad (a_n^i, b_n^i) = \mathrm{Reg}(V_n^i, L_n^i)$$

for all $i = 2, \ldots, N_n$, where $N_n$ is defined as the first index such that $\#V_n^{N_n + 1} < 2$. The geometric interpretation of the sequence $(a_n^i, b_n^i)$ is simple: in the first step, we keep in $(V_n^1, L_n^1)$ those points that are “below” the regression line of $L_n^0$ with respect to $V_n^0$. We then calculate the regression line of $L_n^1$ with respect to $V_n^1$ to obtain $(a_n^1, b_n^1)$, and iterate the process until at most one point remains below the regression line. The slope of the liminf regression is then defined as $a_n^{N_n}$ (the method is similar for the limsup: just keep the points above the regression line). In many cases of interest, $a_n^{N_n}$ will tend to $a$ when $n$ tends to infinity.
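The iteration above can be sketched in a few lines. The code follows the recursion literally, with one added safeguard that is an assumption of this sketch and not part of the definition: it also stops when a pass removes no point, which happens when all remaining points lie exactly on the current line. The function names are hypothetical.

```python
def regress(js, ls):
    """Least-squares line l ~ a*j + b through the points (js[i], ls[i])."""
    n = len(js)
    mean_j = sum(js) / n
    mean_l = sum(ls) / n
    sjj = sum((j - mean_j) ** 2 for j in js)
    a = sum((j - mean_j) * (l - mean_l) for j, l in zip(js, ls)) / sjj
    return a, mean_l - a * mean_j


def liminf_regression_slope(l):
    """Slope of the liminf regression of the sequence l_1, ..., l_n (n >= 2).

    l[0] holds l_1, l[1] holds l_2, and so on.
    """
    V = list(range(1, len(l) + 1))
    a, b = regress(V, [l[j - 1] for j in V])
    while True:
        # Keep only the points lying on or below the current regression line.
        keep = [j for j in V if l[j - 1] <= a * j + b]
        # Stop when fewer than two points remain, or when nothing was removed
        # (safeguard against an endless loop on collinear points).
        if len(keep) < 2 or len(keep) == len(V):
            return a
        V = keep
        a, b = regress(V, [l[j - 1] for j in V])


# Toy sequence: l_j = 0.5 j on even j and 0.5 j + 3 on odd j.
seq = [0.5 * j + (3.0 if j % 2 else 0.0) for j in range(1, 21)]
slope_liminf = liminf_regression_slope(seq)
slope_plain, _ = regress(list(range(1, 21)), seq)
```

On this toy sequence the odd-index points are discarded in the first pass, and the liminf slope is 0.5, while the plain regression over all points gives a slightly smaller slope.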

11.3.5. Numerical experiments

In this section we compare the three methods WCR, WL and OSC on different kinds of signals. For more experiments, see [LEG 04b]. Figure 11.1 represents a generalized Weierstrass function with regularity $H(t) = t$ and the estimates of its Hölder function obtained by regression of the wavelet coefficients (WCR), by wavelet leaders (WL) and by oscillations (OSC). This experiment shows that the best results are obtained by the WL method in this case.


Figure 11.1. (a) Generalized Weierstrass function (4096 points), regularity $H(t) = t$; (b) regularity estimation by WCR; (c) regularity estimation by WL; (d) regularity estimation by OSC

The second comparison deals with multifractional Brownian motion (MBM). MBM is an extension of fractional Brownian motion in which the local regularity may be controlled; see Chapter 6 and [PEL 95, AYA 99, AYA 00b, AYA 00a]. For the experiment, 10 MBMs with a regularity evolving like a sine function are built. The three estimation methods are applied to each signal and the results are displayed in Figure 11.2 (mean and variance). We see that, in this case, the oscillation-based method provides the best results both in terms of bias and variance.

In conclusion, the methods described in this section generally provide decent estimates of the Hölderian regularity. Nevertheless, there is no “best” estimator among them. The estimation quality depends on the signal class.
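To make the oscillation-based estimator of section 11.3.1 concrete, here is a minimal sketch on a sampled signal. It assumes a uniformly sampled signal stored in a list, window radii given in samples, and strictly positive oscillations. The function names are hypothetical and this is not the FracLab implementation.

```python
import math


def oscillation(signal, i, r):
    """max - min of the samples within r samples of index i (clipped at 0)."""
    window = signal[max(0, i - r): i + r + 1]
    return max(window) - min(window)


def osc_exponent(signal, i, radii):
    """Least-squares slope of log2(oscillation) against log2(radius)."""
    xs = [math.log2(r) for r in radii]
    ys = [math.log2(oscillation(signal, i, r)) for r in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Test signal: a cusp |t - 1/2|^0.4 sampled on 4097 points of [0, 1].
N = 4096
signal = [abs(k / N - 0.5) ** 0.4 for k in range(N + 1)]
h_center = osc_exponent(signal, N // 2, [2, 4, 8, 16, 32, 64])
```

At the cusp the oscillation over a radius $\tau$ is exactly $\tau^{0.4}$, so the regression recovers 0.4. Measuring radii in samples instead of time units only shifts the intercept, not the slope.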


Figure 11.2. Estimation of the regularity of a set of ten MBMs with a regularity evolving like a sine function. The three estimation methods are tested. For each method, 10 Hölder functions are thus obtained. The empirical mean and the variance on these 10 functions are then calculated. Abscissa: time. Ordinates: mean estimated regularity (white), and error bars corresponding to two times the standard deviation on each side (gray). The theoretical regularity is displayed in black. (a) WCR method; (b) WL method; (c) OSC method

11.4. Denoising

11.4.1. Introduction

Signal/image denoising is an important task in many areas including biology, medicine, astronomy, geophysics, and many more. For such applications and others, it is important to denoise the observed data in such a way that the features of interest to the practitioner are preserved. The basic framework is as follows. We observe a signal (or an image) $Y$ which is a combination $F(X, B)$ of the signal of interest $X$ and a “noise” $B$. Making various assumptions on the noise, the structure of $X$ and

Local Regularity and Multifractal Methods for Image and Signal Analysis

377

@ of the original the function F , we then try to derive a method to obtain an estimate X image which is in some sense optimal. F usually amounts to convolving X with a low pass ﬁlter and adding noise. Assumptions on X are almost always related to its regularity, e.g. X is supposed to be piecewise C n for some n ≥ 1. In this section, B is assumed to be independent of X, white, Gaussian and centered. 11.4.2. Minimax risk, optimal convergence rate and adaptivity A useful way to compare denoising methods is to analyze their convergence properties. In this section, we recall some basic facts (see [HAR 98] for more details). DEFINITION 11.4.– The minimax risk in LP is given by ˆ n − X||pp Rn (V, p) = inf sup E||X @ n ∈E X∈V X

where E is the set of measurable estimators and V a ball in a functional space. 1

DEFINITION 11.5.– rn Rn (V, p) p is called the optimal convergence rate or minimax convergence rate on the class V for the risk Lp . We say that the estimator @n − X p Rn (V, p). @n of X reaches the optimal convergence rate if supX∈V E X X p Typical function spaces that are considered in this framework are the so-called Besov Spaces (for a complete description of Besov spaces see [PEE 76, POP 88]). s and that the Lp loss is used. Suppose for instance that X belongs to a ball in Br,q Then, we can show that: sn • If r ≥ p (homogenous area) the optimal rate is 2− 2s+1 . sn p ≤ r ≤ p (intermediate area) the optimal rate is 2− 2s+1 and • If 2s+1 (s− r1 + p1 )n − 2(s− 1 + 1 )+1 r p 2 for linear estimators. (s− r1 + p1 ) 1 1 2 p (sparse area), the optimal rate is (n2−n ) (s− r + p )+1 for non-linear • If r ≤ 2s+1 estimators. In the following sections, the L2 loss is used, and as a consequence, there is no sparse zone. The corresponding optimal convergence rates are as follows. Non-linear estimator sn 1 r ≥ 2 Rn (V, 2) 2 = 2− 2s+1 1

r < 2 Rn (V, 2) 2 = 2− 2s+1 sn

Linear estimator sn 1 Rnlin (V, 2) 2 = 2− 2s+1 (s− r1 + 12 )n − 1 1 1 2 Rnlin (V, 2) 2 = 2 (s− r + 2 )+1

Table 11.1. Convergence rates
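The entries of Table 11.1 can be compared numerically with a small sketch such as the one below. It only evaluates the exponents read from the table for the $L^2$ loss; the function names are hypothetical, and the code assumes $s - \frac{1}{r} + \frac{1}{2} > 0$ so that the linear rate is meaningful.

```python
def l2_rate_exponents(s, r):
    """Return (nonlinear, linear) exponents e such that the L2 rate is 2**(-e * n).

    Hypothetical helper: it only transcribes the two rows of Table 11.1.
    """
    nonlinear = s / (2.0 * s + 1.0)
    if r >= 2:
        # Homogeneous case: linear estimators reach the same rate.
        linear = nonlinear
    else:
        # r < 2: linear estimators only see the reduced smoothness s - 1/r + 1/2.
        s_eff = s - 1.0 / r + 0.5
        linear = s_eff / (2.0 * s_eff + 1.0)
    return nonlinear, linear


def l2_rates(n, s, r):
    """Rates 2**(-e * n) for a signal sampled on 2**n points."""
    nonlinear, linear = l2_rate_exponents(s, r)
    return 2.0 ** (-nonlinear * n), 2.0 ** (-linear * n)
```

For $r < 2$ the linear exponent is smaller than the non-linear one, which is the gap that motivates non-linear (thresholding) estimators.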


For some estimators, reaching the optimal rate of convergence requires prior knowledge about the signal, such as its regularity. This constraint is a drawback in applications. In this context, we try to develop adaptive estimators.

DEFINITION 11.6.– $X^*$ is an adaptive estimator for the loss $L^p$ and the set $\{F_\alpha, \alpha \in A\}$ if, for all $\alpha \in A$, there exists a constant $c_\alpha > 0$ such that:

$$\sup_{X \in F_\alpha} \mathbb{E}\|X^* - X\|_p^p \leq c_\alpha R_n(\alpha, p)$$

For general results about adaptivity, see [LEP 90, LEP 91, LEP 92, BIR 97].

11.4.3. Wavelet based denoising

A popular set of denoising methods is based on decomposing the corrupted signal in a wavelet basis, processing the wavelet coefficients, and then going back to the time domain. In the case of additive white noise, this is justified by two fundamental facts: first, many real-world signals have a sparse structure in the wavelet domain, i.e. a few coefficients are significant, and most are small or zero. Second, for an orthonormal wavelet transform, all wavelet coefficients of a white noise are iid random variables. Denoising in the wavelet domain thus allows us to separate in an easy way “large”, significant coefficients from “small” coefficients due mainly to noise.

Throughout this section, the wavelet coefficients of a signal $X$ are denoted by $x_{j,k}$, where $j$ is scale and $k$ is location. $X$ is the original signal, $Y$ the observed noisy signal and $\hat{X}$ an estimator of $X$. We assume that $Y = X + B$, where $B$ is a centered Gaussian white noise with variance $\sigma^2$, independent of the original signal $X$. Thus, we have $y_{j,k} = x_{j,k} + b_{j,k}$. Since the wavelet basis is supposed to be orthonormal, the $b_{j,k}$ are also Gaussian and iid.

The first and simplest methods for denoising based on the above principles are the so-called hard and soft thresholding [DEV 92, DON 94]. Since these methods were introduced, a huge number of improvements have been proposed, ranging from block thresholding [HAR 98] to Bayesian approaches [VID 99] and many more. We briefly recall the basics of hard thresholding, and show why a different method is needed for the processing of irregular signals.

DEFINITION 11.7.– Let $Y_n$ be a sample of $Y$ on $2^n$ points. The estimator of $X$ by hard thresholding is $\hat{X}^{HT}$, a signal with the following wavelet coefficients:

$$\{\hat{x}^{HT}_{j,k}\}_{j,k} = \{y_{j,k} \cdot \mathbf{1}_{|y_{j,k}| \geq \lambda_n}\}_{j,k}$$

where $\lambda_n$ is a given threshold.


Traditional choices for $\lambda_n$ include the so-called universal, SURE and Bayesian thresholds [VID 99]. In this section, the universal threshold $\lambda_n = \sigma 2^{-\frac{n}{2}} \sqrt{2n}$ will be used throughout.

THEOREM 11.2 (Risk for hard thresholding (D. Donoho)).– Let $X \in B^s_{p,q}$ and $X_n$ its sampled version on $2^n$ points. Then

$$R_{HT} := \mathbb{E}[(X_n - \hat{X}^{HT})^2] \leq C \, n \, 2^{-\frac{2sn}{2s+1}}$$

Thus, hard thresholding is near-minimax. A limitation of hard thresholding, as well as of most wavelet-based methods, is that they are not well adapted to denoising highly textured or everywhere irregular signals, in particular (multi)fractal or multifractional signals, with potentially rapid variations in local regularity. It is well known, in particular, that when the original signal is itself irregular, most wavelet-based denoising methods will typically produce an oversmoothed signal and/or so-called “ringing” effects. Indeed, as recalled above, the basic idea behind wavelet thresholding is that many real-world signals have a sparse wavelet representation, with few large wavelet coefficients. Setting small coefficients to 0 in the noisy signal will then in general do no harm, since these are mainly due to noise. Everywhere irregular signals, on the other hand, have significant coefficients scattered all over the time-frequency plane. At high frequencies, these significant but relatively small coefficients in the signal crucially determine the local irregularity. Zeroing small coefficients will thus typically destroy the regularity information. As a consequence, it is no surprise that a specific method has to be designed for such signals. Figure 11.4 illustrates some of the drawbacks just mentioned.

A theoretical result on a particular class of signals also allows us to measure precisely the oversmoothing effect of hard thresholding. Define the set $PART(\alpha)$ as follows:

$$PART(\alpha) := \left\{ X, \ \{x_{j,k}\}_{j,k} = \{\varepsilon_{j,k} \cdot 2^{-j(\alpha + \frac{1}{2})}\}_{j,k}, \ \varepsilon_{j,k} \text{ iid in } \{-1, 1\} \right\} \qquad (11.4)$$

PROPOSITION 11.2.– Let $\tilde{\alpha}_{\hat{X}^{HT}}(n, t)$ denote the regularity of the signal after hard thresholding, estimated by the WCR method. Then, for a signal $X \in PART(\alpha)$, at each point $t$,

$$\lim_{n \to \infty} \mathbb{E}[\tilde{\alpha}_{\hat{X}^{HT}}(n, t)] = -\frac{1}{2} - \frac{6\alpha + 1}{2(2\alpha + 1)^2} + p \, \frac{8\alpha^3 + 12\alpha^2 + 12\alpha}{(2\alpha + 1)^3}$$

where $p$ is the number of vanishing moments of the wavelet.

This result means that the regularity of the denoised signal is essentially controlled by $p$.
In particular, if we use a wavelet with an inﬁnite number of vanishing moments, the estimated regularity of the hard thresholded signal will be equal to inﬁnity.
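As an illustration of Definition 11.7, the following minimal sketch applies hard thresholding with the universal threshold to a list of wavelet coefficients. The function names are hypothetical, and the coefficients are assumed to come from an orthonormal wavelet transform computed elsewhere, normalized so that the noise coefficients have standard deviation $\sigma 2^{-n/2}$.

```python
import math


def universal_threshold(sigma, n):
    """Universal threshold lambda_n = sigma * 2**(-n/2) * sqrt(2n) for 2**n samples."""
    return sigma * 2.0 ** (-n / 2.0) * math.sqrt(2.0 * n)


def hard_threshold(coeffs, lam):
    """Keep a coefficient when |y| >= lam, set it to zero otherwise."""
    return [y if abs(y) >= lam else 0.0 for y in coeffs]


# Example with illustrative (made-up) coefficients, sigma = 1 and n = 8.
lam = universal_threshold(1.0, 8)
denoised = hard_threshold([0.3, -0.25, 0.1, -0.05], lam)
```

Every coefficient below the threshold is discarded regardless of its scale, which is precisely what removes the small high-frequency coefficients that carry the regularity information of an irregular signal.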


The following sections describe three denoising methods that are well fitted to the processing of extremely irregular signals such as (multi)fractal ones. The first and second methods both make it possible to control the local Hölder regularity. They differ in the way this regularity is estimated. In the first method, this is done through linear regression over all available scales. In the second one, an “exponent between scales” is used. The third method allows a control of the multifractal spectrum.

11.4.4. Non-linear wavelet coefficients pumping

In this section, a refinement of hard thresholding is presented. It is called non-linear wavelet pumping (NLP); it is near-minimax and adaptive, and allows us to control the regularity of the denoised signal through a parameter $\delta \in \mathbb{R}_+$. For a theoretical study, proofs and numerical experiments, see [LEG 04b].

DEFINITION 11.8.– Let $Y_n$ be a sample of $Y$ on $2^n$ points. The estimator of $X$ by the NLP method is $\hat{X}^{NLP}$, a signal with the following wavelet coefficients:

$$\{\hat{x}^{NLP}_{j,k}\}_{j,k} = \{y_{j,k} \cdot \mathbf{1}_{|y_{j,k}| \geq \lambda_n} + 2^{-j\delta} y_{j,k} \cdot \mathbf{1}_{|y_{j,k}| < \lambda_n}\}_{j,k}$$

The decision law of this method is displayed in Figure 11.3 for a given scale j and compared to the decision law of hard thresholding (which is the same for every scale).

Figure 11.3. Decision law for hard thresholding (left) and for NLP (right). Abscissa: value of the noisy wavelet coefﬁcient. Ordinate: value of the estimator of the original wavelet coefﬁcient
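The NLP decision law of Definition 11.8 can be sketched as follows. The function names and the dictionary layout (scale $j$ mapped to the list of coefficients at that scale) are assumptions made for illustration only.

```python
def nlp_estimate(y, j, lam, delta):
    """NLP decision law for one noisy coefficient y at scale j.

    Large coefficients are kept as in hard thresholding; small ones are
    attenuated by 2**(-j * delta) instead of being set to zero.
    """
    if abs(y) >= lam:
        return y
    return 2.0 ** (-j * delta) * y


def nlp_denoise(coeffs_by_scale, lam, delta):
    """Apply the decision law to a dict {scale j: list of coefficients}."""
    return {
        j: [nlp_estimate(y, j, lam, delta) for y in coeffs]
        for j, coeffs in coeffs_by_scale.items()
    }
```

With $\delta = 0$ the noisy coefficients are returned unchanged, while a very large $\delta$ drives the small coefficients towards zero, so that the rule approaches hard thresholding.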

11.4.4.1. Minimax properties

The NLP method and hard thresholding have similar convergence properties.

THEOREM 11.3.– Let $X \in B^s_{p,q}$. If $\delta > \frac{s}{2s+1}$, then $R_{NLP} \leq R_{HT} + O(R_{HT})$.

Thus, NLP is near-minimax. Additionally, if $\delta > \frac{1}{2}$, the estimator is adaptive.

11.4.4.2. Regularity control

The advantage of NLP is that it allows a control over the local regularity through the parameter $\delta$.

PROPOSITION 11.3 (Increase of regularity).– Let $\alpha_Y(n, t)$ and $\alpha_{\hat{X}^{NLP}}(n, t)$ denote respectively the regularity of the noisy signal $Y$ and of the estimator $\hat{X}^{NLP}$ at the point $t$, estimated by WCR. Then at each point $t$:

$$\alpha_{\hat{X}^{NLP}}(n, t) = \alpha_Y(n, t) + K_n \delta \sum_{\substack{j = 1 \\ |y^{(n)}_{j,k}(t)| < \lambda_n}}^{n} j s_j$$

In other words, NLP increases the Hölder regularity proportionally to the $\delta$ parameter. This result sometimes allows us to find an optimal value for $\delta$. This is particularly the case for the set of functions $PART(\alpha)$ (defined in (11.4)).

PROPOSITION 11.4.– For a signal $X \in PART(\alpha)$, at each $t$:

$$\lim_{n \to \infty} \left( \mathbb{E}[\alpha_{\hat{X}^{NLP}}(n, t)] - \mathbb{E}[\alpha_Y(n, t)] \right) = \delta \left( 1 + \frac{6\alpha - 1}{(2\alpha + 1)^3} \right)$$

i.e.

$$\lim_{n \to \infty} \mathbb{E}[\alpha_{\hat{X}^{NLP}}(n, t)] = \frac{\alpha - 2\alpha^2}{(1 + 2\alpha)^2} + \delta \left( 1 + \frac{6\alpha - 1}{(2\alpha + 1)^3} \right)$$

Using this proposition, we may calculate the value $\delta_{ideal}$ that ensures that the denoised signal will have the same average regularity as the original signal.

PROPOSITION 11.5.– For a signal $X \in PART(\alpha)$, the “optimal” parameter $\delta$ is

$$\delta_{ideal} = \frac{\alpha (1 + 2\alpha)(2\alpha + 3)}{2(2\alpha^2 + 3\alpha + 3)}$$
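Propositions 11.4 and 11.5 can be cross-checked numerically: substituting $\delta_{ideal}$ into the limit of Proposition 11.4 should return $\alpha$ itself. The sketch below does this with hypothetical function names; it only evaluates the closed-form expressions stated above and does not simulate any signal.

```python
def limit_noisy_regularity(alpha):
    """Limit of the expected WCR regularity of the noisy signal Y (Proposition 11.4)."""
    return (alpha - 2.0 * alpha ** 2) / (1.0 + 2.0 * alpha) ** 2


def limit_nlp_regularity(alpha, delta):
    """Limit of the expected WCR regularity after NLP with parameter delta."""
    gain = 1.0 + (6.0 * alpha - 1.0) / (2.0 * alpha + 1.0) ** 3
    return limit_noisy_regularity(alpha) + delta * gain


def delta_ideal(alpha):
    """Value of delta given in Proposition 11.5."""
    numerator = alpha * (1.0 + 2.0 * alpha) * (2.0 * alpha + 3.0)
    return numerator / (2.0 * (2.0 * alpha ** 2 + 3.0 * alpha + 3.0))


# The denoised signal should recover the original regularity alpha on average.
for alpha in (0.2, 0.5, 0.9):
    assert abs(limit_nlp_regularity(alpha, delta_ideal(alpha)) - alpha) < 1e-12
```

The check relies on the identity $(2\alpha + 1)^3 + 6\alpha - 1 = 4\alpha(2\alpha^2 + 3\alpha + 3)$, which is what makes the expression of $\delta_{ideal}$ simplify.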


Figure 11.4. Denoising of a lacunary wavelet series: (a) original signal (regularity: 0.2, lacunarity: 0.7); (b) noisy version; (c) denoising by the NLP method; (d) denoising by hard thresholding

11.4.4.3. Numerical experiments

Lacunary wavelet series

We present an example of denoising with NLP, along with a comparison with hard thresholding, on a lacunary wavelet series [JAF 00]. The regularity is equal to 0.2, and the lacunarity parameter is 0.7. Figure 11.4 represents the original signal, the noisy signal and the two denoisings. The NLP method provides a reasonable result, while hard thresholding clearly oversmooths the signal.

SAR images

As a second illustration, we display an original synthetic aperture radar (SAR) image along with its hard thresholding and NLP denoisings in Figure 11.5. As we can see, the original image appears very noisy, and does not seem to hold any useful information. The hard thresholded image is not very readable either. However, on the image processed with NLP, we can clearly see a river flowing from the top of the image and assuming roughly an inverted “Y” shape. Denoising is used in this application as a pre-processing step that enhances the image so that it will be possible to automatically detect the river. Such a procedure is used by IRD, a French agency, which, in this particular application, is interested in monitoring water resources in a region of Africa.


Figure 11.5. Left: original SAR image. Middle: denoising by HT. Right: denoising by NLP

11.4.5. Denoising using exponent between scales

11.4.5.1. Introduction

In [ECH 07], Echelard presents a denoising method that is similar in spirit to that just described, and is thus also well fitted to the processing of irregular signals. The proposed approach consists of extrapolating the unknown, small coefficients by imposing a local regularity constraint. More precisely, the small coefficients are reconstructed in such a way that the local regularity at each point of the denoised signal matches the regularity of the original signal. Of course, since the original signal is unknown, so is its regularity. Thus, we first need to estimate the local regularity of the original signal from the noisy observations.

As in the previous section, a difficulty arises from working on discrete signals. Indeed, the very definition of Hölder exponents requires us to let the resolution tend to $\infty$, which cannot be done here. We require an adapted definition of $\alpha$ that both makes sense at finite resolution and allows us to capture the visual impression of regularity on sampled signals. In the previous section, a regression of the wavelet coefficients was used for this purpose. Here, a different path is taken. In view of the fact that the perceived regularity depends on the considered range of scales, an “exponent between two scales” is defined as follows:

$$\alpha_g(j_1, j_2, X) = \min_{j \in [j_1, j_2]} \min_{k \in \mathbb{Z}} \frac{\log |x_{j,k}|}{-j} - \frac{1}{2}$$

In order to maintain some information at small scale, the proposed method follows the steps below:

• Estimate the critical scale $c_n$, defined as the scale where the coefficients of the white noise become predominant as compared to the ones of the signal.

• Estimate the regularity $s_n$ of the original signal at the considered point, using coefficients at scales larger than $c_n$.


• Assign to the small scale coefficients a value that is “coherent” with the ones of the coefficients at larger scales. More precisely, the wavelet coefficients of the denoised signal $\tilde{X}_n$ are set as follows:

$$\{\tilde{x}_{j,k}\}_{j,k} = \left\{ \min\left( |y_{j,k}|, 2^{K_n - j(s_n + 1/2)} \right) \mathrm{sgn}(y_{j,k}) \right\}_{j,k} \qquad (11.5)$$

for $j > c_n$, where $K_n$ and $s_n$ are estimated from the noisy wavelet coefficients $(y_{j,k})$ at scales $j < c_n$.

This means that, at small scales, we do not accept overly large coefficients, that is, coefficients which would not be compatible with the estimated Hölder regularity of the signal (statistically, there will always be such coefficients, since noise has no regularity given that its coefficients do not decrease with scale). On the other hand, “small” coefficients (those not exceeding $2^{K_n - j(s_n + 1/2)}$) are left unchanged. Note that both the estimated regularity $s_n$ and the critical scale $c_n$ depend on the considered point. Note also that this procedure may be seen as a location-dependent shrinkage of the coefficients.

We can prove the following property, which essentially says that the above method does a good job in recovering the regularity of the original signal, as measured by the exponent between two scales, provided that we are able to estimate with good accuracy its Hölder exponent at any given point $t$:

PROPOSITION 11.6.– Let $X$ belong to $C^{\varepsilon_0}(\mathbb{R})$ for some $\varepsilon_0 > 0$, and let $\alpha$ denote its Hölder exponent at point $t$. Let $(s_n)_n$ be a sequence of real numbers tending almost surely (resp. in probability) to $\alpha$. Let $c(n) = \frac{n}{1 + 2 s_n}$. Let $\tilde{X}_n$ be defined as above. Then, for any function $h$ tending to infinity with $h(n) \leq n$, $\alpha_g(h(n), n, \tilde{X}_n)$ tends almost surely (resp. in probability) to $\alpha$.

For this method to be put to practical use, it thus remains to estimate the critical scale and the Hölder exponent from the noisy observations. This is the topic of the next section.

11.4.5.2. Estimating the local regularity of a signal from noisy observations

The main result in [ECH 07] concerning the estimation of the critical scale is the following.

THEOREM 11.4.– Let $(x_i)_{i \in \mathbb{N}}$ denote the wavelet coefficients of $X \in C^{\varepsilon_0}(\mathbb{R})$ “above” a point $t$ where the local and pointwise Hölder exponents of $X$ coincide. Let $\beta = \liminf_{i \to \infty} -\frac{\log |x_i|}{i}$. Assume that there exists a decreasing sequence $(\varepsilon_n)$ such that $\varepsilon_n = o\left(\frac{1}{n}\right)$ when $n \to \infty$ and $-\frac{\log |x_i|}{i} \geq \beta - \varepsilon_i$ for all $i$. Let $(y_i)$ denote the noisy coefficients corresponding to the $x_i$. Let:

$$L_n(p) = \frac{1}{(n - p + 1)^2} \sum_{i = p}^{n} y_i^2$$

and denote $p^* = p^*(n)$ an integer such that:

$$L_n(p^*) = \min_{p : 1 \leq p \leq n - b \log(n)} L_n(p)$$

where $b > 1$ is a fixed number. Finally, let $q(n) = \frac{n}{2(\beta - \frac{1}{n})}$. Then, almost surely:

$$\forall a > 1, \quad p^*(n) \leq q(n) + a \log(n), \quad n \to \infty$$

In addition, if the sequence (xi ) veriﬁes the following condition: there exists a sequence of positive integers (θn ) such that, for all n large enough and all θ ≥ θn : q−1 1 − δβ∗ 1 2 xi > bσn2 , ∗ θ (1 − δβ )2 i=q−θ

where δ∗ ∈ (0, 12 ) and δ ∗ ∈ ( 12 , β). Then, almost surely: ∀a > 1,

$$p^*(n) \geq q(n) - \max(a \log(n), \theta_n), \quad n \to \infty$$

In other words, when the conditions of the theorem are met, any minimizer of $L_n$ is, within an error of $O(\log(n))$, approximately equal to the sought critical scale. This in turn allows us to estimate the Hölder exponent using the following corollary:

COROLLARY 11.1.– With the same notations and assumptions as in the theorem above, and with the additional condition that $\theta_n$ is not larger than $b \log(n)$ for all sufficiently large $n$, define $\hat{\beta}(n) = \frac{n}{2 p^*(n)} + \frac{1}{n}$. Then the following inequality holds almost surely for all large enough $n$:

$$|\hat{\beta}(n) - \beta| \leq 2 b \beta^2 \frac{\log(n)}{n}$$

The value $s_n = \hat{\beta}(n) + \frac{1}{2}$ is used in (11.5). $K_n$ is estimated as the offset in the linear least squares regression of the logarithm of the absolute value of the wavelet coefficients with respect to scale, at scales larger than $p^*(n)$.
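The estimation chain of this section (minimize $L_n$, deduce $\hat{\beta}(n)$, then clip the small-scale coefficients as in (11.5)) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [ECH 07]: the function names are hypothetical, the list `y` is assumed to hold the noisy coefficients above one point for scales $1, \dots, n$, the logarithm in the bound $n - b \log(n)$ is taken to be the natural logarithm, and $K_n$ and $s_n$ are passed in as already estimated values.

```python
import math


def criterion(y, p):
    """L_n(p) = sum_{i=p}^{n} y_i**2 / (n - p + 1)**2, with y[i - 1] the coefficient at scale i."""
    n = len(y)
    return sum(v * v for v in y[p - 1:]) / (n - p + 1) ** 2


def critical_scale(y, b=1.5):
    """A minimizer p* of L_n over 1 <= p <= n - b*log(n) (natural log assumed)."""
    n = len(y)
    p_max = int(math.floor(n - b * math.log(n)))
    return min(range(1, p_max + 1), key=lambda p: criterion(y, p))


def beta_hat(n, p_star):
    """Estimate of beta from Corollary 11.1."""
    return n / (2.0 * p_star) + 1.0 / n


def clip_coefficient(y, j, K_n, s_n):
    """Rule (11.5): limit |y| to 2**(K_n - j*(s_n + 1/2)) and keep the sign of y."""
    bound = 2.0 ** (K_n - j * (s_n + 0.5))
    return math.copysign(min(abs(y), bound), y)
```

In use, `critical_scale` would be run on the coefficients above each point $t$, and `clip_coefficient` would then be applied only to the scales $j$ beyond the estimated critical scale, leaving coarser scales untouched.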


Figure 11.6. Top: original Weierstrass function. Middle: noisy version. Bottom: signal obtained with the regularity preserving method

11.4.5.3. Numerical experiments

Figure 11.6 shows the original, noisy and denoised versions of a Weierstrass function.

11.4.6. Bayesian multifractal denoising

11.4.6.1. Introduction

In [LEG 04b, LEV 03], a denoising method is presented that assumes a minimal local regularity. This assumption translates into constraints on the multifractal spectrum of the signals. Such constraints are used in turn in a Bayesian framework to estimate the wavelet coefficients of the original signal from the noisy ones. An assumption is made that the original signal belongs to a certain set of parameterized classes $S$ described below. Functions belonging to such classes have a minimal local regularity, but may have wildly varying pointwise Hölder exponent. Along with possible additional conditions, this yields a parametric form for the prior distribution of the wavelet coefficients of $X$. These coefficients are estimated using a traditional maximum a posteriori technique. As a consequence, the estimate is defined as the signal “closest” to the observation which has the desired multifractal spectrum (or a degenerate version of it, see below). Because the multifractal spectrum subsumes information about the pointwise Hölder regularity, this procedure is naturally adapted to signals which have sudden changes in regularity.


11.4.6.2. The set of parameterized classes $S(g, \psi)$

The denoising technique described below is based on the multifractal spectrum rather than on the Hölder exponent alone. This will in general allow for more robust estimates, since we use a higher level description subsuming information on the whole signal. For such an approach to be practical, however, we need to make the assumption that the considered signals belong to a given set of parameterized classes, as we will now describe.

Let $\mathcal{F}$ be the set of lower semi-continuous functions from $\mathbb{R}_+$ to $\mathbb{R} \cup \{-\infty\}$. We consider classes of random functions $X(t)$, $t \in [0, 1]$, defined on $(\Omega, F, P)$ and specified by (11.6) below (see footnote 1). Each class $S(g, \psi)$ is characterized by the functional parameter $g \in \mathcal{F}$ and a wavelet $\psi$ such that the set $\{\psi_{j,k}\}_{j,k}$ forms a basis of $L^2$. Let:

– $K$ be a positive constant;

– $P^\varepsilon_j(\alpha, K) = P \times P_j\left( \alpha - \varepsilon < \frac{\log_2(K|x_{j,k}|)}{-j} < \alpha + \varepsilon \right)$.

Define:

$$S(g, \psi) = \left\{ X : \exists K > 0, j_0 \in \mathbb{Z} : \forall j > j_0, \ x_{j,k} \text{ and } x_{j,k'} \text{ are identically distributed for } (k, k') \in \{0, 1, \dots, 2^j - 1\} \text{ and } \frac{\log_2 P^\varepsilon_j(\alpha, K)}{j} = g(\alpha) + R_{n,\varepsilon}(\alpha) \right\} \qquad (11.6)$$

where $R_{n,\varepsilon}(\alpha)$ is such that $\lim_{\varepsilon \to 0} \lim_{n \to \infty} R_{n,\varepsilon}(\alpha) = 0$ uniformly in $\alpha$.

The assumption that, for large enough $j$, the wavelet coefficients $(x_{j,k})_k$ at scale $j$ are identically distributed entails that:

$$\begin{aligned}
\pi^\varepsilon_j(\alpha, K) &:= P \times P_j\left( \alpha - \varepsilon < \frac{\log_2(K|x_{j,k}|)}{-j} < \alpha + \varepsilon \right) \\
&= 2^{-n} \sum_{k=0}^{2^n} P\left( \alpha - \varepsilon < \frac{\log_2(K|x_{j,k}|)}{-j} < \alpha + \varepsilon \right) \\
&= P\left( \alpha - \varepsilon < \frac{\log_2(K|x_{j,k}|)}{-j} < \alpha + \varepsilon \right)
\end{aligned}$$

Consequently, definition (11.6) has a simple interpretation in terms of multifractal analysis: for a given wavelet $\psi$, we consider the set of random signals $X$ such that the

1. Extension to functions deﬁned on Rn requires only minor adaptations.


normalized signal $KX$ has a deterministic multifractal spectrum $F_g(\alpha)$ (with respect to $\psi$) equal to $1 + g$, with the following additional condition: $F_g$ is obtained as a limit in $j$ rather than a lim inf, this limit being attained uniformly with respect to $\alpha$. This condition ensures that, for sufficiently large $j$, the rescaled statistics of the wavelet based coarse-grained exponents $\alpha_{j,k}$ are close enough to their limit, allowing meaningful inference.

The classes $S(g, \psi)$ encompass a fairly wide variety of signals. Most models of (multi)fractal processes and certain other “traditional” processes belong to such classes. These include IFS, multiplicative cascades, fractional Brownian motion and stable processes. Such processes have been used in the modeling of Internet traffic, financial records, speech signals, medical images and more.

11.4.6.3. Bayesian denoising in $S(g, \psi)$

The main steps in the traditional maximum a posteriori (MAP) approach in a Bayesian framework are given in this section. The MAP estimate $\hat{x}_{j,k}$ of $x_{j,k}$ from the observation $y_{j,k}$ is defined to be an argument that maximizes $P(x_{j,k} / y_{j,k})$. Using Bayes' rule, and since $P(y_{j,k})$ does not depend on $x_{j,k}$, maximizing $P(x_{j,k} / y_{j,k})$ amounts to maximizing the product $P(y_{j,k} / x_{j,k}) P(x_{j,k})$. The MAP estimate is thus obtained as follows:

$$\hat{x}_{j,k} = \mathrm{argmax}_x \left[ P(y_{j,k} / x) P(x) \right]$$

The term $P(y_{j,k} / x)$ is easily computed from the law of $B$ if one assumes that $B$ is white, since the $b_{j,k}$ then have the same law as $B$ (recall that orthonormal wavelets are used). The prior $P(x_{j,k})$ is deduced from our assumption that $X$ belongs to $S(g, \psi)$ in the following way. For $x > 0$, set $\alpha_j(x) = \frac{\log_2(Kx)}{-j}$. Then:

$$\begin{aligned}
P(|x_{j,k}| = x) &= P(K|x_{j,k}| = Kx) \\
&= P\left( \frac{\log_2(K|x_{j,k}|)}{-j} = \alpha_j(x) \right) \\
&\simeq 2^{j(g(\alpha_j(x)) - 1)}
\end{aligned}$$

This leads us to define the approximate Bayesian MAP estimate as:

$$\hat{x}_{j,k} = \mathrm{argmax}_{x > 0} \left[ j g\left( \frac{\log_2(\hat{K} x)}{-j} \right) + \log_2(P(y_{j,k} / x)) \right] \times \mathrm{sgn}(y_{j,k}) \qquad (11.7)$$

where $\mathrm{sgn}(y)$ denotes the sign of $y$ and $\hat{K}$ is given by:

$$\hat{K} = \left( \sup_{j > j_0} \sup_k (x_{j,k}) \right)^{-1}$$


The estimate for $K$ can be heuristically justified as follows: writing $\frac{\log_2(K|x_{j,k}|)}{-j} = \alpha$ with $\alpha > 0$ implies that $K|x_{j,k}| < 1$ for all couples $(j, k)$. $\hat{K}$ is chosen as the smallest normalizing factor that entails the latter inequality.

In the numerical experiments, we shall deal with the case where the noise is centered, Gaussian, with variance $\sigma^2$. The MAP estimate then reads:

$$\hat{x}_{j,k} = \mathrm{argmax}_{x > 0} \left[ j g\left( \frac{\log_2(\hat{K} x)}{-j} \right) - \frac{(y_{j,k} - x)^2}{2\sigma^2} \right] \times \mathrm{sgn}(y_{j,k})$$

While (11.7) gives an explicit formula for denoising $Y$, it is often of little practical use. Indeed, in most applications, we do not know the multifractal spectrum of $X$, so that without an evaluation of $g$, it is not possible to use (11.7) to obtain $\hat{x}_{j,k}$. In addition, we should recall that $F_g$ depends in general on the analyzing wavelet. We would thus need to know the spectrum shape for the specific wavelet in use. Furthermore, a major aim of this approach is to be able to extract the multifractal features of $X$ from the denoised signal $\hat{X}$. A strong justification of the multifractal Bayesian approach is to estimate $F_g^X$ as follows: a) denoise $Y$, b) evaluate numerically the spectrum $F_g^{\hat{X}}$ of $\hat{X}$, c) set $\hat{F}_g^X = F_g^{\hat{X}}$. Obviously, from this point of view, it does not make sense to require the prior knowledge of $F_g^X$ in the Bayesian approach.

Thus, a “degenerate” version of (11.7) is presented which uses a single real parameter as input instead of the whole spectrum. The heuristic is as follows: from a regularity point of view, important information contained in the spectrum is its support, i.e. the set of all possible “Hölder exponents”. More precisely, let $\alpha_0 = \inf\{\alpha, F_g(\alpha) > -\infty\}$. While the spectra shapes obtained with different analyzing wavelets depend on the wavelet, their supports are always included in $[\alpha_0, \infty)$. The “flat” spectrum $\mathbf{1}_{[\alpha_0, \infty)}$ thus contains intrinsic information. Furthermore, it only depends on the positive real $\alpha_0$. Rewriting (11.7) with a flat spectrum yields the following explicit simple expression for $\hat{x}_{j,k}$:

$$\hat{x}_{j,k} = \begin{cases} y_{j,k} & \text{if } K|y_{j,k}| < 2^{-j\alpha_0} \\ \mathrm{sgn}(y_{j,k}) 2^{-j\alpha_0} & \text{otherwise} \end{cases} \qquad (11.8)$$

Although α0 is really prior information, it can be estimated from the noisy observations [LEG 04b]. In this respect, it is comparable to the threshold used in the traditional hard or soft wavelet thresholding scheme. Furthermore, in applications, it is useful to think of α0 rather as a tuning parameter, whereby increasing α0 yields a smoother estimate (since the original signal is assumed to have a larger minimal exponent). Note that (11.8) has a ﬂavor reminiscent of the method described in section 11.4.5.
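To make the link between (11.7) and (11.8) concrete, the sketch below implements the flat-spectrum rule (11.8) as printed, together with a brute-force grid search of the Gaussian MAP criterion for an arbitrary prior $g$. The function names, the grid search and its search interval $(0, 2|y|]$ are illustrative assumptions, not the procedure of [LEG 04b]. The code also uses $|y_{j,k}|$ in the quadratic term, since the maximization runs over $x > 0$ and the sign is restored afterwards, and it assumes $j \geq 1$.

```python
import math


def flat_spectrum_estimate(y, j, alpha0, K=1.0):
    """Rule (11.8): keep y when K*|y| < 2**(-j*alpha0), otherwise clip its magnitude."""
    bound = 2.0 ** (-j * alpha0)
    if K * abs(y) < bound:
        return y
    return math.copysign(bound, y)


def flat_prior(alpha0):
    """Flat prior: g = 0 on [alpha0, inf) and -inf elsewhere."""
    return lambda alpha: 0.0 if alpha >= alpha0 else -math.inf


def map_estimate_gaussian(y, j, g, sigma, K=1.0, grid_size=20001):
    """Grid search of j*g(log2(K*x)/(-j)) - (|y| - x)**2 / (2*sigma**2) over x in (0, 2|y|]."""
    if y == 0.0:
        return 0.0
    x_max = 2.0 * abs(y)
    best_x, best_val = 0.0, -math.inf
    for i in range(1, grid_size + 1):
        x = x_max * i / grid_size
        alpha = math.log2(K * x) / (-j)
        val = j * g(alpha) - (abs(y) - x) ** 2 / (2.0 * sigma ** 2)
        if val > best_val:
            best_x, best_val = x, val
    return math.copysign(best_x, y)
```

With $K = 1$ and the flat prior, the grid search agrees with the closed-form rule up to the grid resolution, which illustrates why $\alpha_0$ can be treated as a single tuning parameter.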


11.4.6.4. Numerical experiments

As a test signal for numerical experiments, we shall consider fractional Brownian motion (FBM). As is well known, FBM is the zero-mean Gaussian process $X(t)$ with covariance function:

$$R(t, s) = \mathbb{E}(X(t)X(s)) = \frac{\sigma^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right)$$

where $H$ is a real number in $(0, 1)$ and $\sigma$ is a real number. FBM reduces to Brownian motion when $H = 1/2$. In all other cases, it has stationary correlated increments. At all points, the local and pointwise Hölder exponents of FBM equal $H$ almost surely. The Hausdorff multifractal spectrum of FBM is degenerate, as we have $f_h(\alpha) = -\infty$ almost surely for $\alpha \neq H$, and $f_h(H) = 1$. The large deviation spectrum, however, depends on the definition of $Y_n^k$: if we consider oscillations, then $f_g = f_h$. Taking increments, we get that, almost surely, for all $\alpha$:

$$f_g(\alpha) = f_l(\alpha) = \begin{cases} -\infty & \text{if } \alpha < H \\ H + 1 - \alpha & \text{if } H \leq \alpha \leq H + 1 \\ -\infty & \text{if } \alpha > H + 1 \end{cases}$$

Moreover, in both cases (oscillations and increments), $f_g(\alpha)$ is given by

$$\lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{\log N_n^\varepsilon(\alpha)}{\log n}$$

(i.e. the lim inf in $n$ is really a plain limit). Together with the stationarity property of the increments (or the wavelet coefficients), this entails that FBM belongs to a class $S(g, \psi)$. If we define the $Y_n^k$ to be wavelet coefficients, the spectrum will depend on the analyzing wavelet $\psi$. All spectra with upper envelope equal to the characteristic function of $[H, \infty)$ may be obtained with an adequate choice of $\psi$. The result of the denoising procedure will thus in principle be wavelet-dependent. The influence of the wavelet is controlled through the prior choice, i.e. the multifractal spectrum among all admissible ones. In practice, few variations are observed if we use a Daubechies wavelet with length between 2 and 20, and a non-increasing spectrum supported on $[H, H + 1]$ with $f_g(H) = 1$.

A graphical comparison of results obtained through Bayesian multifractal denoising and traditional hard and soft thresholding is displayed in Figure 11.7. For each method, the parameters were manually set so as to obtain the best fit to the known original signal. By and large, the following conclusions may be drawn from these experiments. First, it is seen that, for irregular signals such as FBM, which belong to $S(g, \psi)$, the Bayesian method yields more satisfactory results


Figure 11.7. First line: FBM with H = 0.6 (left) and noisy version with Gaussian white noise (right). Second line: Denoised versions with a traditional wavelet thresholding; hard thresholding (left), soft thresholding (right). Third line: Bayesian denoising with increments’ spectrum (left), Bayesian denoising with ﬂat spectrum (right)

than traditional wavelet thresholding (we should however recall that hard and soft thresholding were not designed for stochastic signals). In particular, this method preserves a roughly correct regularity along the path, while wavelet thresholding yields a signal with both too smooth and too irregular regions. Second, it appears that using the degenerate information provided by the “flat” spectrum does not significantly decrease the denoising quality.

11.4.6.5. Denoising of road profiles

An important problem in road engineering is to understand the mechanisms of friction between rubber and the road. This is a difficult problem, since friction depends on many parameters: the type of rubber, the type of road, the speed, etc. Several authors have shown that most road profiles are fractal [RAD 94, HEI 97, GUG 98] on given ranges of scales. Such a property has obvious consequences on friction, some of which have been investigated for instance in [RAD 94, KLU 00]. The main idea is that, in the presence of fractal roads, all irregularity scales contribute to friction [DO 01].


In [LEG 04a], it is verified that road profiles finely sampled using tactile and laser sensors are indeed fractal. More precisely, it is shown that they have well-defined correlation exponents and regularization dimensions over a wide range of scales. However, various classes of profiles which have different friction coefficients are not discriminated by these global fractal parameters. This means that friction may have relatively low correlation with fractional dimensions or correlation exponents. In contrast, experiments show that the pointwise Hölder exponent allows us to separate road profiles which have different friction coefficients.

The laser acquisition system developed at LCPC (Laboratoire central des ponts et chaussées), based on an Imagine optics sensor, allows us to obtain (at the finest resolution) road profiles with a sampling step of 2.5 microns. These signals are very noisy. As a consequence, when they are used for computing a theoretical friction ([DO 01]), a very low correlation with the real friction (0.48) is obtained. Since local regularity is related to friction, it seems natural to use one of the regularity-based denoising methods presented above in this case. It appears that Bayesian denoising is well fitted: after this denoising, the correlation between theoretical and real friction increases to 0.86 (see Figure 11.8). For more on this topic, see [LEG 04b].

Figure 11.8. Theoretical friction versus measured friction. Each star represents a given class of road profiles. We calculate the friction for each profile in a class and take the average: (a) original profiles (correlation 0.4760); (b) denoised profiles (correlation 0.8565)


11.5. Hölderian regularity based interpolation 11.5.1. Introduction A ubiquitous problem in signal and image processing is to obtain data sampled with the best possible resolution. At the acquisition step, the resolution is limited by various factors such as the physical properties of the sensors or the cost. It is therefore desirable to seek methods which would allow us to increase the resolution after acquisition. This is useful, for instance, in medical imaging or target recognition. At ﬁrst sight, this might appear hopeless, since we cannot “invent” information which has not been recorded. Nevertheless, by making reasonable assumptions on the underlying signal (typically, a priori smoothness information), we may design various methods (note that we do not consider here the situation where several low resolution overlapping signals are available). However, most techniques developed so far suffer from a number of problems. While the interpolated image is usually too smooth, it also occurs sometimes that on the contrary too many details are added, in particular in smooth regions. In addition, the creation of details is not well controlled, so that we can neither predict how the high resolution image will look like, nor describe the theoretical properties of the interpolation scheme. The main idea of [LEV 06] is to perform interpolation in such a way that smooth regions as well as irregular regions (i.e. sharp edges or textures) remain so after zooming. This can be interpreted as a constraint on the local regularity; the interpolation method should preserve local regularity. 11.5.2. The method Let Xn be the signal obtained by sampling the signal X on 2n points. 
The proposed method is strongly related to the WCR estimator described in section 11.3.2 and follows the steps below:
– estimate the regularity by the WCR method: the regression of the logarithm of the wavelet coefficients versus scale is calculated above each point t;
– the wavelet coefficients at scale n + 1 are obtained by the following formula:

$$\log_2 |\tilde{x}_{n+1,k}| = \frac{2}{n(n-1)} \sum_{j=1}^{n} \log_2 |x_{j,k}| \, (3j - n - 2)$$

with $x_{j,k}$, j = 1 . . . n, the wavelet coefficients “above” t. This means that the regression slope (i.e. the regularity estimated by the WCR method) remains the same after interpolation (see Figure 11.9 left, second row);
– perform an inverse wavelet transform.
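The weights in the formula for the new coefficient are those obtained by extending the least-squares regression line of $\log_2 |x_{j,k}|$ against j to scale n + 1, which is why the slope is preserved. The following Python sketch checks this numerically. The function names and the synthetic coefficients are illustrative assumptions and are not taken from [LEV 06].

```python
import numpy as np

def extrapolate_log_coefficient(log_coeffs):
    """Closed-form prediction of log2|x_{n+1,k}| from log2|x_{j,k}|, j = 1..n."""
    y = np.asarray(log_coeffs, dtype=float)
    n = y.size
    if n < 2:
        raise ValueError("at least two scales are needed for a regression")
    j = np.arange(1, n + 1)
    weights = 2.0 * (3 * j - n - 2) / (n * (n - 1))
    return float(np.dot(weights, y))

def extrapolate_by_regression(log_coeffs):
    """Same prediction, obtained by fitting a line and evaluating it at n + 1."""
    y = np.asarray(log_coeffs, dtype=float)
    n = y.size
    slope, intercept = np.polyfit(np.arange(1, n + 1), y, 1)
    return float(slope * (n + 1) + intercept)

# Hypothetical log2-coefficients above one point t (not real data).
rng = np.random.default_rng(0)
scales = np.arange(1, 7)
log_coeffs = -0.8 * scales + 0.1 * rng.standard_normal(scales.size)

closed_form = extrapolate_log_coefficient(log_coeffs)
regression = extrapolate_by_regression(log_coeffs)

# Slope before and after appending the extrapolated coefficient.
slope_before = np.polyfit(scales, log_coeffs, 1)[0]
slope_after = np.polyfit(np.arange(1, 8), np.append(log_coeffs, closed_form), 1)[0]
```

Because the added point lies exactly on the fitted line, refitting with n + 1 points returns the same slope, which is the property stated in the second step.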


With this method, the estimated local regularity of the signal/image remains unchanged because the high-frequency content is added in a manner coherent with lower scales. From an algorithmic point of view, we note that only one computation is needed, whatever the number of added scales.

11.5.3. Regularity and asymptotic properties

Let $\tilde{X}_{n+m}$ be the signal after m interpolations, $\tilde{\alpha}(n,t)$ be the estimated regularity at t given by (11.2) and $\log_2 \tilde{K}_{n,k}$ be the ordinate at zero of the WCR regression.

PROPOSITION 11.7.– If $X \in C^\alpha$ then, whatever the number m of added scales:

$$\|X - \tilde{X}_{n+m}\|_2^2 \le \frac{c^2}{2^{2\alpha}-1}\, 2^{-2\alpha n} + \frac{\hat{K}_n}{2^{2\hat{\alpha}_n}-1}\, 2^{-2\hat{\alpha}_n n}$$

with $(\hat{K}_n, \hat{\alpha}_n)$ such that:

$$\hat{K}_n\, 2^{-2j\hat{\alpha}_n} = \max_{(\tilde{K}_{n,k},\, \tilde{\alpha}(n,t))} \tilde{K}_{n,k}\, 2^{-2j\tilde{\alpha}(n,t)}$$

PROPOSITION 11.8.– Assume that $X \in C^\alpha$ and that at each point the local and the pointwise Hölder exponents of X coincide. Then, $\forall \varepsilon > 0$, $\exists N$:

$$n > N \Rightarrow \|X - \tilde{X}_{n+m}\|_2 = O\left(2^{-(n+m)(\alpha-\varepsilon)}\right)$$

In addition, $\|X - \tilde{X}_{n+m}\|_{B^s_{p,q}} = O\left(2^{-(n+m)(\alpha-s-\varepsilon)}\right)$ for all $s < \alpha - \varepsilon$.

See [LEG 04b] for more results.

11.5.4. Numerical experiments

We show a comparison between the regularity-based interpolation method and a traditional bicubic method on a scene containing a Japanese gate (torii). Figure 11.9 displays the original 128×128 pixel image, and eight-times bicubic and regularity-based interpolations on a detail of the door image.

11.6. Biomedical signal analysis

Fractal analysis has long been applied with success in the biomedical field. A particularly interesting example is provided by the study of ECGs. ECGs and signals derived from them, such as RR intervals, are an important source of information in the detection of various pathologies, e.g. congestive heart failure and sleep apnea, among others. The fractality of such data has been reported in numerous works over the years. Several fractal parameters, such as the box dimension, have been found to correlate well with the heart condition in certain situations ([PET 99, TU 04]).

Figure 11.9. Left, first row: original door image; second row: regression of the logarithm of the wavelet coefficients versus scale above the point t (the added wavelet coefficient is the rightmost one). Right: eight-times bicubic (top) and regularity-based (bottom) interpolations of the door image (detail)


More precise information on ECGs is provided by multifractal analysis, because their local regularity varies wildly from point to point. In the specific case of RR intervals, several studies have shown that the multifractal spectrum correlates well with the heart condition ([IVA 99, MEY 03, GOL 02]). Roughly speaking, we observe two notable phenomena:
– On average, the local regularity of healthy RR is smaller than that in the presence of, e.g., congestive heart failure. In other words, pathologies increase the local regularity.
– Healthy RR have much more variability in terms of local regularity; congestive heart failure reduces the range of observed regularities.

These results may be traced back to the fact that congestive heart failure is associated with profound abnormalities in both the sympathetic and parasympathetic control mechanisms that regulate beat-to-beat variability [GOL 02].

A precise view on the mechanisms leading to multifractality is important if we want to understand the purposes it serves and how it will be modified in response to external changes or in case of abnormal behavior. As of today, there is no satisfactory multifractal model for RR intervals ([AMA 99]). As a preliminary step toward this goal, we shall describe in this section a remarkable feature of the time evolution of the local regularity. Obviously, calculating the time evolution of the local regularity gives far more information than the multifractal spectrum alone. Indeed, the latter may be calculated from the former, while the reverse is not true. In addition, inspecting the variations of local regularity yields new insights which cannot be deduced from a multifractal spectrum, since all time-dependent information is lost in a spectrum. This is crucial for RR intervals, since, as we will see, the evolution of local regularity is strongly (negatively) correlated with the RR signals.
This fact prompts the development of new models that would account for the fact that, when the RR intervals are larger, the RR signal is more irregular, and vice versa. With that in view, we shall briefly describe a new mathematical model that goes beyond the usual multifractional Brownian motion (MBM). Recall that the MBM is the following random process, which depends on the functional parameter H(t), where $H : [0, \infty) \to [a, b] \subset (0, 1)$ is a $C^1$ function:

$$W_{H(t)}(t) = \int_{-\infty}^{0} \left[ (t-s)^{H(t)-1/2} - (-s)^{H(t)-1/2} \right] dW(s) + \int_{0}^{t} (t-s)^{H(t)-1/2}\, dW(s).$$

The main feature of MBM is that its Hölder exponent may be easily prescribed; at each point t0 , it is equal to H(t0 ) with a probability of one. Thus, MBM allows us to describe phenomena whose regularity evolves in time/space. For more details on MBM, see Chapter 6.


Figure 11.10 shows two paths of MBM with a linear function H(t) = 0.2 + 0.6t and a periodic H(t) = 0.5 + 0.3 sin(4πt). We clearly see how regularity evolves over time.
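Paths of this kind can be approximated crudely by replacing the two stochastic integrals in the definition of the MBM with Riemann sums over a fixed grid of Gaussian increments. The sketch below does this; it is not the synthesis method used for Figure 11.10 (which the text does not specify), and the grid size, the truncation of the lower limit at `-past_length` and the function name are assumptions made for illustration.

```python
import numpy as np

def simulate_mbm(hurst, n_points=256, past_length=4.0, seed=0):
    """Crude Riemann-sum approximation of an MBM path on (0, 1].

    hurst: callable t -> H(t) with values in (0, 1).
    The integral over (-inf, 0) is truncated to (-past_length, 0).
    """
    rng = np.random.default_rng(seed)
    ds = 1.0 / n_points
    # Left endpoints of the noise cells covering [-past_length, 1).
    s = np.arange(-int(past_length * n_points), n_points) * ds
    dW = rng.standard_normal(s.size) * np.sqrt(ds)  # Brownian increments
    t = np.arange(1, n_points + 1) * ds
    past = s < 0
    path = np.empty(n_points)
    for i, ti in enumerate(t):
        e = hurst(ti) - 0.5                  # kernel exponent H(t) - 1/2
        present = (s >= 0) & (s < ti)
        kernel = np.zeros(s.size)
        kernel[past] = (ti - s[past]) ** e - (-s[past]) ** e
        kernel[present] = (ti - s[present]) ** e
        path[i] = np.dot(kernel, dW)
    return t, path

# The two regularity functions quoted in the text.
t, linear_path = simulate_mbm(lambda u: 0.2 + 0.6 * u)
_, periodic_path = simulate_mbm(lambda u: 0.5 + 0.3 * np.sin(4 * np.pi * u))
```

With a constant H = 1/2 the first kernel vanishes and the second equals one, so the sketch reduces to a cumulative sum of the increments, i.e. a discretized Brownian motion. This gives a simple sanity check of the discretization.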

Figure 11.10. MBM paths with linear and periodic H functions

Estimates of the H functions from the traces in Figure 11.10, obtained using the so-called generalized quadratic variations, are shown in Figure 11.11 (the theoretical regularity is in gray and the estimated regularity is in black).

Figure 11.11. Estimation of the local regularity of MBM paths. Left: linear H function. Right: periodic H function

24-hour interbeat (RR) interval time series obtained from the PhysioNet database [PHY], along with their estimated local regularity (assuming that the processes may be modeled as MBM), are shown in Figure 11.12. These signals were derived from long-term ECG recordings of adults between the ages of 20 and 50 who have no known cardiac abnormalities; the recordings typically begin and end in the early morning (within an hour or two of the subject waking). They are composed of around 100,000 points.

Figure 11.12. RR interval time series (upper curves) and estimation of the local regularity (lower curves)

As we can see from Figure 11.12, there is a clear negative correlation between the value of the RR interval and its local regularity: when the black curve moves up, the gray tends to move down. In other words, slower heartbeats (higher RR values) are typically more irregular (smaller Hölder exponents) than faster ones.

In order to account for this striking feature, the modeling based on MBM must be refined. Indeed, while MBM allows us to tune regularity at each time, it does so in an “exogenous” manner. This means that the values of H and of $W_H$ are independent. A better model for RR time series requires us to define a modified MBM where the regularity would be a function of $W_H$ at each time. Such a process is called a self-regulating multifractional process (SRMP). It is defined as follows. We take a deterministic, smooth, one-to-one function g with values in (0, 1), and we seek a process X such that, at each t, $\alpha_X(t) = g(X(t))$ almost surely. It is not possible to write an explicit expression for such a process. Rather, a fixed point approach is used, which we now briefly describe (see [BAR 07] for more details). Start from an MBM $W_H$ with an arbitrary function H (for instance a constant). At the second step, set $H = g(W_H)$. Then iterate this process, i.e. calculate a new $W_H$ with this updated H function, and so on. We may prove that these iterations will almost surely converge to a well-defined SRMP X with the desired property, namely the regularity of X at any given time t is equal to g(X(t)).

For such a process, there is a functional relation between the amplitude and the regularity. However, this does not make precise control of the Hölder exponent possible. Let us explain this through an example. For definiteness, take g(t) = t for all t. Then, a given realization might result in a low value of X at, say, t = 0.5 and thus high irregularity at this point, while another realization might give a large X(0.5), resulting in a path that is smooth at 0.5. See Figure 11.13 for an example of this fact. In order to gain more control, the definition of an SRMP is modified as follows.
Figure 11.13. Paths of SRMPs with g(Z) = Z

First define a “shape function” s, which is a deterministic smooth function. Then, at each step, calculate $W_H$, and set $H = g(W_H + ms)$, where m is a positive number. The function s thus serves two purposes. First, it allows us to tune the shape of X: when m is large, X and s will essentially have the same form. Second, because of the first property, it allows us to decide where the process will be irregular and where it will be smooth. Figure 11.14 displays an example of SRMP with controlled shapes.
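The fixed point construction with a shape function can be sketched as follows. This is only an illustration under several assumptions that are not stated in the text: the MBM is approximated by a Riemann sum, the same Gaussian increments are reused at every iteration, the number of iterations is fixed rather than driven by a convergence test, and the logistic function g and the sinusoidal shape are arbitrary choices. See [BAR 07] for the actual construction.

```python
import numpy as np

def mbm_from_noise(h_values, dW, s, t):
    """Riemann-sum MBM driven by fixed increments dW; h_values[i] is H(t[i])."""
    past = s < 0
    path = np.empty(t.size)
    for i, ti in enumerate(t):
        e = h_values[i] - 0.5
        present = (s >= 0) & (s < ti)
        kernel = np.zeros(s.size)
        kernel[past] = (ti - s[past]) ** e - (-s[past]) ** e
        kernel[present] = (ti - s[present]) ** e
        path[i] = np.dot(kernel, dW)
    return path

def simulate_srmp(g, shape_fn, m=0.0, n_points=128, past_length=4.0,
                  n_iter=20, seed=0):
    """Iterate H <- g(W_H + m * s) starting from a constant H = 1/2."""
    rng = np.random.default_rng(seed)
    ds = 1.0 / n_points
    s = np.arange(-int(past_length * n_points), n_points) * ds
    dW = rng.standard_normal(s.size) * np.sqrt(ds)   # fixed realization
    t = np.arange(1, n_points + 1) * ds
    h = np.full(n_points, 0.5)                       # arbitrary initial H
    w = mbm_from_noise(h, dW, s, t)
    for _ in range(n_iter):
        h = g(w + m * shape_fn(t))                   # update the regularity
        w = mbm_from_noise(h, dW, s, t)              # recompute W_H
    return t, w, h

# Hypothetical choices: a smooth one-to-one g with values in (0.1, 0.9)
# and a sinusoidal shape function.
g = lambda z: 0.1 + 0.8 / (1.0 + np.exp(-z))
shape_fn = lambda u: np.sin(2 * np.pi * u)
t, w, h = simulate_srmp(g, shape_fn, m=2.0)
```

The function returns the last iterate of $W_H$ together with the regularity function used to compute it. Monitoring the change in h between successive iterations would be a natural way to decide when to stop.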


Figure 11.14. (a) SRMP with g(Z) = Z (black), and controlling shape function (gray); (b) the same SRMP (black) with estimated Hölder exponent (gray)


It is then possible to obtain a fine model for RR traces based on the following ingredients:
– an “s” function that describes the overall shape of the trace, and in particular the nycthemeral cycle;
– a g function whose role is to ensure that the correct relation between the heart rate and its regularity is maintained at all times.

The shape s is estimated from the data in the following way: for each RRi time series, histograms of both the signal and its exponent are plotted, and modeled as a sum of two Gaussians, as represented in Figure 11.15.

Figure 11.15. Histogram of RRi time series, modeled as a sum of two Gaussians

From these signals the shape functions are inferred. They are based on splines and parameterized by:
– $D_n$, duration of the night: $D_n \in [6, 10]$
– $D_m$, duration of the beginning of the measurement: $D_m \in [2, 4]$
– $D_s$, duration of the sleeping phase: $D_s \in [0.5, 1.5]$
– $D_a$, duration of the awakening phase: $D_a \in [0.5, 1.5]$
– $RRi_d$, mean interbeat interval during the day: $RRi_d \in [0.6018, 0.7944]$
– $RRi_n$, mean interbeat interval during the night: $RRi_n \in [0.7739, 1.0531]$

These parameters are chosen randomly, each in its respective interval, with uniform probability (see Figure 11.16 for a representation).
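Drawing one set of shape parameters can be sketched as below. Only the intervals come from the text; the dictionary keys, the function name and the seed are illustrative assumptions, and the spline construction of the shape function itself is not reproduced because the text does not detail it.

```python
import numpy as np

# Intervals copied from the list above; units are not stated in the text.
PARAMETER_RANGES = {
    "D_n": (6.0, 10.0),          # duration of the night
    "D_m": (2.0, 4.0),           # duration of the beginning of the measurement
    "D_s": (0.5, 1.5),           # duration of the sleeping phase
    "D_a": (0.5, 1.5),           # duration of the awakening phase
    "RRi_d": (0.6018, 0.7944),   # mean interbeat interval during the day
    "RRi_n": (0.7739, 1.0531),   # mean interbeat interval during the night
}

def draw_shape_parameters(rng):
    """Draw each parameter independently and uniformly in its interval."""
    return {name: float(rng.uniform(low, high))
            for name, (low, high) in PARAMETER_RANGES.items()}

rng = np.random.default_rng(42)
params = draw_shape_parameters(rng)
```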


Figure 11.16. Shape function of RR intervals

The g function is estimated in the phase space. More precisely, for each trace, the value of H as a function of the RR interval is plotted. Representing all these graphs on a single plot, a histogram is obtained, as in Figure 11.17.

Figure 11.17. Histogram in the phase space

The ridge line of this histogram, seen as a surface over the (RR, α) plane, is then extracted (see Figure 11.17). This ridge line is roughly a straight line, which is fitted by least squares minimization in order to obtain an equation of the form α = g(RR) = aRR + b.

The last step is to synthesize an SRMP with shape function s and regularity function g, as explained above. Paths obtained in this way are shown in Figure 11.18. Compare this with the graphs shown in Figure 11.12, displaying true RR traces.

11.7. Texture segmentation

We will now briefly explain how multifractal analysis may be used for texture segmentation. We present an application to 1D signals, namely well logs. For an application to images, see [MUL 94, SAU 99].


Figure 11.18. Two synthetic RR interval traces based on SRMP (upper curves) and estimated regularity (lower curves)

The characterization of geological strata with the help of well logs can be used for the interpretation of the sedimentary environment of an area of interest, such as a reservoir. Recent progress in electrofacies analysis [SER 87] has enabled us to relate well logs to the sedimentary environment and to extrapolate information coming from the core of any vertical well span. Electrofacies predictions are based on multivariate correlations and cluster analysis for which the input data are conventional well logs (including sonic, density and gamma logs) as well as the information extracted from the core analysis.

The microresistivity log (ML) measures the local resistivity of the rock wall of the well. The measurement is obtained by passing an electric current through the rock, to a lateral depth of approximately 1 cm. The resistivity varies as a function of the local porosity and of the connectivity of the pores (normally, the rock is a non-conductor and thus the current passes through the fluid contained in the pores). MLs contain information not only on the inclination of geological strata but also on the texture of these strata.

To analyze the irregular variations of MLs, we may calculate texture parameters locally and at different depths in the well. These can be used to obtain a segmentation of the well [SAU 97]. Let $r(x_i)$ denote the resistivity signal, where the $x_i$ are equidistant coordinates which measure the depth in the well. In order to emphasize the vertical variations of $r(x_i)$, a transformed signal $s_\omega(x_i)$ is first defined:

$$s_\omega(x_i) = |r(x_{i+1}) - r(x_i)|^\omega$$

where ω > 0. This transformation amplifies the small scales and eliminates any constant component of the signal $r(x_i)$.

Analysis of well logs from the Oseberg reservoir in the North Sea shows that, typically, a fractal behavior is observed for length scales between about 2 cm and 20 cm. It has been found that relevant textural indices are given by the information dimension D(1) and the curvature parameter $\alpha_c = 2|D'(1)|/(D(0) - D(1))$, where D(q) is defined by $D(q) = \tau(q)/(q - 1)$, with an obvious modification for q = 1. In particular, these indices allow us to separate the three strata present in these logs. For instance, D(1) is correlated with the degree of heterogeneity: a formation which is more heterogeneous translates into a smaller D(1) [SAU 97].

11.8. Edge detection

11.8.1. Introduction

In the multifractal approach to edge detection, an image is modeled by a measure μ, or, more precisely, a Choquet capacity [LEV 98]. A Choquet capacity is roughly a measure which does not need to satisfy the additivity requirement. This distinction will not be essential for our discussion below, and the reader may safely assume that we are dealing with measures. See Chapter 1 for more precise information on this.

The basic assumptions underlying the multifractal approach to image segmentation are as follows:
• The relevant information for the analysis can be extracted from the Hölder regularity of μ.
• Three levels contribute to the perception of the image: the pointwise Hölder regularity of μ at each point, the variation of the Hölder regularity of μ in local neighborhoods and the global distribution of the regularity in the whole scene.
• The analysis should be translation and scale invariant.

Let us briefly compare the multifractal approach to traditional methods such as mathematical morphology (MM), and gradient based methods, or more generally image multiscale analysis (IMA):
• As in MM and IMA, translation and scale invariance principles are fulfilled.
• There is no so-called “local comparison principle” or “local knowledge principle”, i.e. the decision of classifying a point as an edge point is not based only on local information. On the contrary, it is considered useful to analyze information about whole parts of the image at each point.
• The most important difference between “traditional” and multifractal methods lies in the way they deal with regularity. While the former aim at obtaining smoother versions of the image (possibly at different scales) in order to remove irregularities, the latter try to extract information directly from the singularities. Edges, for instance, are not considered as points where large variations of the signal still exist after smoothing, but as points whose regularity is different from the “background” regularity in the raw data. Such an approach makes sense for “complex” images, in which the relevant structures are themselves irregular. Indeed, an implicit assumption of MM and IMA is that the useful information lies at boundaries between originally smooth regions, so that it is natural to filter the image. However, there are cases (e.g. in medical imaging, satellite or radar imaging) where the meaningful features are essentially singular.
• As in MM, and contrary to IMA, the multifractal approach does not assume that there is a universal scheme for image analysis. Rather, depending on what we are looking for, different measures μ may be used to describe the image.
• Both MM and IMA consider the relative values of the gray levels as the basic information. Here the Hölder regularity is considered instead. This again is justified in situations where the important information lies in the singularity structure of the image.

Throughout the rest of this section, we make the following assumption:

$$f := f_h = f_g$$

The simplest approach then consists of defining a measure (or, often, a sequence of Choquet capacities) on the image, calculating its multifractal spectrum, and classifying each point according to the corresponding value of (α, f(α)), both in a geometric and a probabilistic fashion. The value of α gives local information about the pointwise regularity: for a fixed capacity, an ideal step edge point in an image without noise is characterized by a given value. The value of f(α) yields global information: a point on a smooth contour belongs to a set $T_\alpha$ whose dimension is 1, a point contained in a homogeneous region has f(α) = 2, etc. The probabilistic interpretation of f(α) corresponds to the fact that a point in a homogeneous region is a frequent event, an edge point is a rare event, and, for instance, a corner is an even rarer event (see Figures 11.19 and 11.20). Indeed, if too many “edge points” are detected, it is in general more appropriate to describe these points as belonging to a homogeneous (textured) zone.

Figure 11.19. Three edges, a texture

Figure 11.20. Three corners, a texture


In other words, the assumption that $f_g = f_h$ allows us to link the geometric and probabilistic interpretations of the spectrum. Points on a smooth contour have an α such that:
• $f_h(\alpha) = 1$ because a smooth contour fills the space as a line.
• $f_g(\alpha) = 1$ because a smooth contour has a given probability to appear.

In fact, we may define the point type (i.e. edge, corner, smooth region, etc.) through its associated f(α) value: a point x with f(α(x)) = 1 is called an edge point, a point x with f(α(x)) = 2 is called a smooth point, and, more generally, for t ∈ [0, 2], x is called a point of type t if f(α(x)) = t. A benefit of the multifractal approach is thus that it allows us to define not only edge points, but a continuum of various types of points.

An important issue lies in the choice of a relevant sequence of capacities for describing the scene. The problem of finding an optimal c in a general setting is unsolved. In practice, we often use the following. Assume the image is defined on [0, 1] × [0, 1]. Let $P := ((I_k^n)_{0 \le k < \nu_n})_{1 \le n \le N}$ be a sequence of partitions of [0, 1] × [0, 1] and $(x_k^n, y_k^n)$ be any point in $I_k^n$. Each $I_k^n$ is made of an integer number of pixels. Let $L(I_k^n)$ denote the sum of the gray levels in $I_k^n$. Let (x, y) denote a generic pixel in the image and L(x, y) denote the gray level at (x, y). Let $(p_n)_{1 \le n \le N}$ be a fixed sequence of positive integers and Ω be a region in the image.

Sum measure:

$$c^s(\Omega) = \sum_{(x,y) \in \Omega} L(x, y)$$

Max capacities:

$$c_n^m(\Omega) = \max_{I_k^{n+p_n} \,/\, (x_k^{n+p_n},\, y_k^{n+p_n}) \in \Omega} L(I_k^{n+p_n})$$

$$c^M(\Omega) = \max_{(x,y) \in \Omega} L(x, y)$$

Iso capacities:

$$c_n^i(\Omega) = \max_l \# \{ k \,/\, L(I_k^{n+p_n}) = l,\ (x_k^{n+p_n}, y_k^{n+p_n}) \in \Omega \}$$

$$c^I(\Omega) = \max_l \# \{ (x, y) \,/\, L(x, y) = l,\ (x, y) \in \Omega \}$$

It is easy to see that:
• $c^s(\Omega)$ depends on both the gray level values and their distribution in Ω,
• $c_n^m(\Omega)$ and $c^M(\Omega)$ only depend on the gray level values,
• $c_n^i(\Omega)$ and $c^I(\Omega)$ only depend on the gray level distribution.

Thus, $(c_n^m, c^M)$ and $(c_n^i, c^I)$ give in some loose sense “orthogonal” information about the image. Furthermore, it can be shown that they are more robust to noise than $c^s$.
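The pixel-level versions of the three capacities (the sum measure, the max capacity over pixels and the iso capacity over pixels) are straightforward to compute on a rectangular region Ω. The sketch below assumes an integer-valued gray level array; the function names and the small test image are illustrative only.

```python
import numpy as np

def sum_measure(region):
    """Sum of the gray levels L(x, y) over the region."""
    return int(np.sum(region))

def max_capacity(region):
    """Largest gray level in the region."""
    return int(np.max(region))

def iso_capacity(region):
    """Size of the largest set of pixels sharing the same gray level."""
    _, counts = np.unique(region, return_counts=True)
    return int(counts.max())

# Hypothetical 4x4 image with a flat background, one outlier and a bright column.
image = np.array([[10, 10, 10, 200],
                  [10, 10, 10, 200],
                  [10, 10, 50, 200],
                  [10, 10, 10, 200]], dtype=np.int64)

omega = image[0:3, 1:4]            # a 3x3 region of the image
cs = sum_measure(omega)            # 700
cM = max_capacity(omega)           # 200
cI = iso_capacity(omega)           # 5 (five pixels have gray level 10)
```

Changing the value 50 to 60 alters the sum measure but leaves the max and iso capacities unchanged, which illustrates why the three quantities respond differently to perturbations of the gray levels.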

11.8.1.1. Edge detection

The simplest procedure for extracting edges using multifractal analysis is as follows:
• Choose a sequence c of capacities.
• Calculate the Hölder exponent of c at each point of the image.
• Calculate the multifractal spectrum of c.
• Declare as “smooth” edge points those belonging to the set(s) $T_\alpha$ whose dimension is 1.
• Declare as “irregular” edge points those belonging to the set(s) $T_\alpha$ whose dimension is between two fixed values, typically 1.1 and 1.5.

A result of segmentation using this approach is shown in Figure 11.21.
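The second step of this procedure can be sketched for the max capacity: at each pixel, the capacity of centered square windows of increasing size is computed, and the Hölder exponent is estimated as the slope of the regression of the logarithm of the capacity against the logarithm of the window size. The window sizes, the edge padding and the shift of the gray levels by one (to avoid taking the logarithm of zero) are assumptions of this sketch; estimating the spectrum and selecting the sets $T_\alpha$ are not covered.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def holder_exponents_max_capacity(image, radii=(0, 1, 2, 3)):
    """Pixelwise regression slope of log(max capacity) versus log(window size)."""
    img = np.asarray(image, dtype=float) + 1.0   # shift so that logs are defined
    log_sizes = np.log([2 * r + 1 for r in radii])
    log_caps = np.empty((len(radii),) + img.shape)
    for idx, r in enumerate(radii):
        size = 2 * r + 1
        padded = np.pad(img, r, mode="edge")
        windows = sliding_window_view(padded, (size, size))
        log_caps[idx] = np.log(windows.max(axis=(2, 3)))
    centered = log_sizes - log_sizes.mean()
    # Least-squares slope computed independently at every pixel.
    return np.tensordot(centered, log_caps, axes=(0, 0)) / np.sum(centered ** 2)

# Hypothetical test images: a flat image and an isolated bright pixel.
flat = np.full((8, 8), 100)
spike = np.zeros((9, 9))
spike[4, 4] = 255

alpha_flat = holder_exponents_max_capacity(flat)
alpha_spike = holder_exponents_max_capacity(spike)
```

On the flat image the max capacity does not depend on the window size, so the estimated exponent is zero everywhere. Around the isolated bright pixel the capacity jumps as soon as the window reaches it, which produces positive exponents at the neighboring pixels.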


Figure 11.21. (a) Original image; (b) Hölder exponents with max capacity; (c) smooth edges; (d) irregular edges


The sum measure, max and iso capacities are the basic tools used for analyzing images. There are cases where specific capacities, and/or more elaborate schemes than the one just described, must be designed in order to get robust results. See [LEV 98] for more details.

11.9. Change detection in image sequences using multifractal analysis

In [CAN 96], the authors propose a multifractal approach to the problem of change detection in image sequences. Multifractal analysis proves to be useful for detecting changes without any a priori knowledge of the object to be extracted. If a change occurs in an incoming image, it is reflected in the global description provided by the multifractal spectrum graph. The abscissa α of the spectrum part $(\alpha, f_h(\alpha))$ that has changed makes it possible to extract binary images corresponding to the changes. As can be seen in Figure 11.22c and Figure 11.23b, the changes extracted using multifractal analysis are much more relevant than the absolute difference between the two images.


Figure 11.22. (a) First image, (b) second image and (c) absolute difference (pixel to pixel) between the two images


Figure 11.23. (a) Local Hölder exponents image and (b) changes detected using multifractal analysis


11.10. Image reconstruction

In [TUR 02], the authors describe an interesting method for image reconstruction that mixes multifractal analysis and more traditional diffusion-like techniques. The image I (or, more precisely, the modulus of its gradient) is first modeled as a so-called log-Poisson multifractal measure. This means that its spectrum f admits the following parametric form:

$$f(\alpha) = f_\infty + \frac{\alpha - \alpha_\infty}{\gamma} \left[ 1 - \log \frac{\alpha - \alpha_\infty}{\gamma (2 - f_\infty)} \right],$$

where $\alpha_\infty$ is the lowest observed exponent, $f_\infty$ the associated spectrum value and $\gamma := -\log[1 + \alpha_\infty/(2 - f_\infty)]$. The justification for considering such a model is that it seems to describe many natural images with reasonable accuracy. For most images, it is observed that $f_\infty \approx 1$, and this is the value chosen for the rest of the method.

The points in the image having exponent $\alpha_\infty$ are the most singular ones. In many cases, the set comprising all these points, which is called the most singular manifold (MSM), is strongly related to the edges of the images. This observation and the fact that the whole spectrum may be computed once $\alpha_\infty$ is known suggest that the MSM contains the most relevant information, and that the whole image may be reconstructed from it.

In order to implement this idea, a linear operator is applied to the MSM. This operator must satisfy the following natural constraints: it should be translation-invariant, isotropic, and allow us to recover the original power (Fourier) spectrum of the image. Under these constraints, we may show that there is essentially only one possible operator, and that the image I may be reconstructed through the following formula:

$$\hat{c}(\mathbf{f}) = \frac{i\, \mathbf{f} \cdot \hat{v}_0(\mathbf{f})}{f^2},$$

where
– $\hat{c}$ is the Fourier transform of $c := I - I_0$,
– $I_0$ is the mean of I,
– $\mathbf{f}$ denotes frequency,
– $v_0(x) = \nabla c(x)$ if x belongs to the MSM, and $v_0(x) = 0$ otherwise.

Reconstruction of real-world images, such as outdoor scenes or natural textures, is surprisingly good considering the simplicity of the method.
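The reconstruction formula can be sketched with a discrete Fourier transform. The code below is an illustration under stated assumptions and is not the implementation of [TUR 02]: the gradient is computed spectrally, the MSM is replaced by an arbitrary boolean mask supplied by the caller, and NumPy's Fourier convention is used, in which the gradient corresponds to multiplication by $2\pi i \mathbf{f}$, so that the constant factor and sign in front of $\mathbf{f} \cdot \hat{v}_0 / f^2$ differ from those of the formula above.

```python
import numpy as np

def spectral_gradient(c):
    """Gradient of a periodic image computed in the Fourier domain."""
    ny, nx = c.shape
    fy = np.fft.fftfreq(ny)[:, None]
    fx = np.fft.fftfreq(nx)[None, :]
    c_hat = np.fft.fft2(c)
    gx = np.real(np.fft.ifft2(2j * np.pi * fx * c_hat))
    gy = np.real(np.fft.ifft2(2j * np.pi * fy * c_hat))
    return gx, gy

def reconstruct_from_masked_gradient(gx, gy, mask):
    """Apply c_hat = f . v0_hat / (2*pi*i*|f|^2), with v0 the masked gradient."""
    ny, nx = gx.shape
    fy = np.fft.fftfreq(ny)[:, None]
    fx = np.fft.fftfreq(nx)[None, :]
    vx_hat = np.fft.fft2(np.where(mask, gx, 0.0))
    vy_hat = np.fft.fft2(np.where(mask, gy, 0.0))
    f2 = fx ** 2 + fy ** 2
    f2[0, 0] = 1.0                    # avoid division by zero at f = 0
    c_hat = (fx * vx_hat + fy * vy_hat) / (2j * np.pi * f2)
    c_hat[0, 0] = 0.0                 # c has zero mean by construction
    return np.real(np.fft.ifft2(c_hat))

# Hypothetical image of odd size (no Nyquist frequency to handle).
rng = np.random.default_rng(1)
image = rng.random((31, 31))
c = image - image.mean()

gx, gy = spectral_gradient(c)
full_mask = np.ones(c.shape, dtype=bool)
recovered = reconstruct_from_masked_gradient(gx, gy, full_mask)
```

When the mask covers the whole image, the operator inverts the gradient exactly and c is recovered. Restricting the mask to a thin set of highly singular pixels gives the kind of approximation described in the text, whose quality then depends on how well the mask captures the MSM.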


11.11. Bibliography

[AMA 99] AMARAL L.A.N., GOLDBERGER A.L., IVANOV P.C., STANLEY H.E., “Modeling heart rate variability by stochastic feedback”, Computer Phys. Comm., vol. 121-122, p. 126–128, 1999.
[AYA 99] AYACHE A., LÉVY VÉHEL J., “Generalized multifractional Brownian motion: definition and preliminary results”, in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer, 1999.
[AYA 00a] AYACHE A., “The generalized multifractional field: a nice tool for the study of the generalized multifractional Brownian motion”, Journal of Fourier Analysis and Applications, vol. 8, p. 581–601, 2000.
[AYA 00b] AYACHE A., LÉVY VÉHEL J., “The generalized multifractional Brownian motion”, Statistical Inference for Stochastic Processes, vol. 3, p. 7–18, 2000.
[BAR 07] BARRIÈRE O., Synthèse et estimation de mouvements Browniens multifractionnaires et autres processus à régularité prescrite. Définition du processus auto-régulé multifractionnaire et applications, PhD Thesis, University of Nantes, 2007.
[BIR 97] BIRGÉ L., MASSART P., “From model selection to adaptive estimation”, in TORGERSEN E., POLLARD D., YANG G. (Eds.), Festschrift for Lucien Le Cam, Springer, New York, p. 55–87, 1997.
[CAN 96] CANUS C., LÉVY VÉHEL J., “Change detection in sequences of images by multifractal analysis”, in Proc. ICASSP-96, May 7-10, Atlanta, 1996.
[DEV 92] DEVORE R.A., LUCIER B., “Fast wavelet techniques for near optimal image processing”, IEEE Military Communications Conference, vol. 2, no. 12, 1992.
[POP 88] DEVORE R.A., POPOV V.A., “Interpolation of Besov spaces”, Transactions of the American Mathematical Society, vol. 305, no. 1, p. 397–414, 1988.
[DO 01] DO M.T., ZAHOUANI H., Frottement pneumatique/chaussée, influence de la microtexture des surfaces de chaussée, JIFT, 2001.
[DON 94] DONOHO D.L., “De-noising by soft-thresholding”, IEEE Trans. Inf. Theory, vol. 41, no. 3, p. 613–627, 1994.
[ECH 07] ECHELARD A., “Analyse 2-microlocale et application au débruitage”, PhD Thesis, University of Nantes, December 2007.
[FracLab] FracLab: a software toolbox for fractal processing of signals, http://apis.saclay.inria.fr/FracLab/.
[GOL 02] GOLDBERGER A.L., AMARAL L.A.N., HAUSDORFF J.M., IVANOV P.C., PENG C.K., STANLEY H.E., “Fractal dynamics in physiology: alterations with disease and aging”, PNAS, vol. 99, p. 2466–2472, 2002.
[GUG 98] GUGLIELMI M., LÉVY VÉHEL J., Analysis and simulation of road profile by means of fractal model, Conference on Advances in Vehicle Control and Safety (AVCS 98), Amiens, 1998.
[HAR 98] HÄRDLE W., KERKYACHARIAN G., PICARD D., TSYBAKOV A., Wavelets, Approximation and Statistical Applications, Lecture Notes in Statistics, Springer, 1998.


[HEI 97] HEINRICH G., “Hysteresis friction of sliding rubbers on rough and fractal surfaces”, Rubber Chemistry and Technology, vol. 70, no. 1, p. 1–14, 1997.
[IVA 99] IVANOV C., AMARAL L.A.N., GOLDBERGER A.L., HAVLIN S., ROSENBLUM M.G., STRUZIK Z.R., STANLEY H.E., “Multifractality in human heartbeat dynamics”, Nature, vol. 399, June 1999.
[JAF 00] JAFFARD S., “On lacunary wavelet series”, The Annals of Applied Probability, vol. 10, no. 1, p. 313–319, 2000.
[JAF 04] JAFFARD S., “Wavelet techniques in multifractal analysis, fractal geometry and applications: a jubilee of Benoit Mandelbrot”, in LAPIDUS M., VAN FRANKENHUIJSEN M. (Eds.), Proceedings of Symposia in Pure Mathematics (AMS), vol. 72, part 2, p. 91–151, 2004.
[KLU 00] KLÜPPEL M., HEINRICH G., “Rubber friction on self-affine road tracks”, Rubber Chemistry and Technology, vol. 73, no. 4, p. 578–606, 2000.
[LEG 04a] LEGRAND P., LÉVY VÉHEL J., DO M.-T., “Fractal properties and characterization of road profiles”, in NOVAK M. (Ed.), Thinking in Patterns: Fractals and Related Phenomena in Nature, World Scientific, New Jersey, p. 189–198, 2004.
[LEG 04b] LEGRAND P., Débruitage et interpolation par analyse de la régularité Hölderienne. Application à la modélisation du frottement pneumatique/chaussée, PhD Thesis, University of Nantes, Ecole Centrale de Nantes, December 2004.
[LEP 90] LEPSKII O., “On a problem of adaptative estimation in Gaussian white noise”, Theory Prob. Appl., vol. 35, p. 454–466, 1990.
[LEP 91] LEPSKII O., “Asymptotically minimax adaptative estimation I: upper bounds. Optimal adaptive estimates”, Theory Prob. Appl., vol. 36, p. 682–697, 1991.
[LEP 92] LEPSKII O., “Asymptotically minimax adaptative estimation II: statistical models without optimal adaptation. Adaptive estimates”, Theory Prob. Appl., vol. 37, p. 433–468, 1992.
[LEV 95] LÉVY VÉHEL J., “Fractal approaches in signal processing”, Fractals, vol. 3, no. 4, p. 755–775, 1995.
[LEV 98] LÉVY VÉHEL J., “Introduction to the multifractal analysis of images”, in FISHER Y. (Ed.), Fractal Image Encoding and Analysis, Springer Verlag, 1998.
[LEV 03] LÉVY VÉHEL J., LEGRAND P., “Bayesian multifractal signal denoising”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP03), Hong Kong, April 6-10, 2003.
[LEV 04a] LÉVY VÉHEL J., LEGRAND P., “Signal and image processing with FracLab, FRACTAL04”, Complexity and Fractals in Nature, 8th International Multidisciplinary Conference, Vancouver, Canada, April 4-7, 2004.
[LEV 04b] LÉVY VÉHEL J., SEURET S., “The 2-microlocal formalism”, Fractal Geometry and Applications: A Jubilee of Benoit Mandelbrot, Proc. Sympos. Pure Math, 2004.
[LEV 06] LÉVY VÉHEL J., LEGRAND P., “Hölderian regularity-based image interpolation”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP06), May 14-19, 2006, Toulouse, France.


[MEY 03] MEYERS M., STIEDL O., KERMAN B., “Discrimination by multifractal spectrum estimation of human heartbeat interval dynamics”, Fractals, vol. 11, part 2, p. 195–204, 2003.
[MUL 94] MULLER J., “Characterization of the North Sea chalk by multifractal analysis”, Journal of Geophysical Research, vol. 99, p. 7275–7280, 1994.
[PEL 95] PELTIER R.F., LÉVY VÉHEL J., Multifractional Brownian Motion: definition and preliminary results, tech. report INRIA, No. 2645, 1995.
[PEE 76] PEETRE J., New Thoughts on Besov Spaces, Duke University Mathematics Series, Duke University Press, Durham, 1976.
[PET 99] PETERS R., “The fractal dimension of atrial fibrillation: a new method to predict left atrial dimension from the surface electrocardiogram”, Cardiology, vol. 92, no. 1, p. 17–20, 1999.
[PHY] www.physionet.org.
[RAD 94] RADO Z., A study of road texture and its relationship to friction, PhD Thesis, Pennsylvania State University, 1994.
[SAU 97] SAUCIER A., HUSEBY O., MULLER J., “Electrical texture characterization of dipmeter microresistivity signals using multifractal analysis”, Journal of Geophysical Research, vol. 102, p. 10327–10337, 1997.
[SAU 99] SAUCIER A., MULLER J., “A generalization of multifractal analysis based on polynomial expansions of the generating function”, in DEKKING M., LÉVY VÉHEL J., LUTTON E., TRICOT C. (Eds.), Fractals: Theory and Applications in Engineering, Springer-Verlag, London, p. 81–91, 1999.
[SER 87] SERRA O., “Sedimentological applications of wireline logs to reservoir studies”, in GRAHAM T. (Ed.), Proceedings of the North Sea Oil and Gas Reservoirs, Norwell, Massachusetts, p. 277–299, 1987.
[TRI 95] TRICOT C., Curves and Fractal Dimension, Springer-Verlag, 1995.
[TU 04] TU C., ZENG Y., YANG X., “Nonlinear processing and analysis of ECG data”, Technology and Health Care, vol. 12, no. 1, p. 1–9, 2004.
[TUR 02] TURIEL A., DEL POZO A., “Reconstructing images from their most singular manifold”, IEEE Trans. Image Processing, vol. 11, p. 345–350, 2002.
[VID 99] VIDAKOVIC B., Statistical Modeling by Wavelets, Wiley, 1999.


Chapter 12

Scale Invariance in Computer Network Traffic

Chapter written by Darryl Veitch.

12.1. Teletraffic – a new natural phenomenon

12.1.1. A phenomenon of scales

From the appearance of the first telephone exchanges, before the beginning of the 20th century, we have spoken of a kind of current which, whether it be realized through electrons or light, consists fundamentally of a flow of information: we call it teletraffic. Like its cousin, vehicular traffic, this current is a phenomenon encompassing calm and fluctuating episodes, predictable rhythms and unpleasant surprises, with periods of free circulation, but also of monumental traffic jams that bear witness to the limited size both of carriageways and intersections. What is more remarkable, however, is that, in common with numerous natural phenomena such as those of thermodynamics, and despite its man-made nature, teletraffic is in practice too complicated to be fully understood, even though such an understanding is theoretically possible. Instead, it is necessary to observe it, and to know how to neglect peripheral details in order to better understand its core nature. Moreover, it is in most cases necessary to impose a statistical framework on what is, at a more fundamental level, a deterministic entity. We are led, finally, to treat this object, which is engineered, highly structured and ordered, even managed, as if it were a true natural phenomenon, and to seek to understand it as if it were created by unknown processes, and equipped with an obscure and unfamiliar nature.

The complexity of the phenomenon is not due solely to the size of contemporary networks, though this is clearly an important factor. Already in traditional telephone networks, highly developed though dedicated to the transmission of voice, there were significant heterogeneities in the circulation of traffic. In terms of temporal irregularities, we can include the diurnal nature of human activity, and other factors which, while equally prosaic, are not without importance, like the end of the working week, lunch periods and even coffee breaks. Traffic also displays spatial heterogeneity, an effect measured not in kilometers but in the degree of multiplexing, by which we understand the degree of call mixing, which increases towards the heart of the network where links grow thicker, like tributaries merging downstream. The higher the degree of multiplexing, the further the character of the traffic is removed from that of individual calls, showing that heterogeneity is found not only in the topology of links but also in the distribution of their size or capacity. Multiplexing is closely associated with the hierarchical nature of the network, which can itself take different forms.

To these fundamental effects we can add, ever since the arrival of the facsimile, the complications resulting from different services provided over the same network. However, fax traffic is quite simple. It differs from telephone calls only in the specific geographical distribution of the fax machines, and in the characteristic durations of fax transmissions. In contrast, other services provided over more modern networks, which freely combine images, computer files or data in the broadest sense, and which, moreover, increasingly allow user mobility, produce a very rich traffic mixture whose individual characteristics vary widely. When we add to this the growth in use of all kinds of computing devices, computers and telephones being the obvious examples today, coupled with the astonishing prospect of their quasi-universal connection to a single global network, the full complexity of the teletraffic phenomenon is strikingly revealed.
Apart from the factors of a “structural” nature mentioned above, there is another central point of particular interest here: the presence in traffic of a very wide range of time scales, spanning many orders of magnitude. This property constitutes the necessary basis of any scaling laws which may possibly govern traffic. In the next section, we will see that such laws do in fact exist. Even if we ignore very large scale effects, such as seasonal cycles and the evolution of traffic itself, the number of scales involved is very impressive. If we go from the traditional time scale of a day to the millisecond, being the transmission time of a thousand bytes across the world, then to the microsecond necessary to transfer them over a high speed local area network (say at 1 Gbit per second), we cover 23 octaves (see footnote 1), or 23 orders of magnitude: teletraffic is indeed a phenomenon of scales. Rarely do we find a quantity, of any kind, spread across such a broad range. What is even more significant is that this range will continue to grow, with the inevitable improvements, closely following those of computing, in the bandwidth of connections and the capacity of switches, and all this without any limit known to science.

1. It is convenient to take logarithms to the base 2, hence octave = log₂(scale).


12.1.2. An experimental science of “man-made atoms”

If the properties of teletraffic generally need to be studied and discovered, those of its underlying components, the sources, the atoms of the network gas, are available to us. These components can be studied closely, modified or even replaced. The three principal parameters of a source are its duration, its average rate and their product, the total quantity of data sent during a “transfer”. Instead of “transfer”, we can talk about a “session” or “call”, flexible terms which vary according to the context. Often it is necessary to go beyond these three parameters, which give only a rough idea of the nature of traffic, in order to consider the internal structure of sources, which defines their character. Taking audio, an example from telephony, we notice that in conversations there are many silences where there is no useful information to be transmitted, which immediately gives an idea of the jerky or bursty nature of this traffic. The details of these periods of silence and activity constitute the particular nature of burstiness in telephony. On the other hand, we imagine that the transmission of a computerized text will be accomplished in a uniform way, at as high a speed as possible given the capacity of the available connection. In this latter case, the traffic has no clear structure of its own; it is the capacity of the network itself, homogeneous neither in time nor in transmission resources, which leads to its irregular behavior.

Thus, the concept of source cannot be defined so easily. To emphasize this point, let us consider, for example, a certain number of sources of the same type carried over a local area network, whose traffic passes through a multiplexer, that is, a traffic concentrator, which alleviates traffic congestion in the network. The output of this multiplexer, which is, from the point of view of the original sources, a superposition, constitutes a source for a multiplexer or a switch located downstream, at a “regional” level in the hierarchy of the total network.

Sources can be divided into two main categories based on their sensitivity to transmission delays. Among those which are not sensitive, that is, “data” sources, we include computer files, images and sound recordings. The complementary class, of “real-time” type, is sensitive to delays and includes video transmission, distributed games, and real-time systems in general, including audio (telephony). It should be clarified that by “delay” we mean the variable delays suffered during transmission and not the total duration of transfer. It is essential, for example, in the case of audio that the pieces into which words are divided arrive in time to be rebuilt without any audible gap, whereas for a simple data transfer, only the total transfer duration matters, which amounts to the average data rate. On the other hand, as far as the reliability of reception is concerned, the sensitivities are reversed. It is the flow corresponding to audio (for example) which can tolerate the loss of a certain percentage of data without sacrificing adequate quality, whereas for sources of “data” type, we require a loss rate equal to zero: corrupted files are not acceptable!


12.1.3. A random current

Even if sources can be individually described in a detailed, or even exact way (in principle), we generally take a stochastic point of view, and thus the models proposed for traffic and traffic superpositions are random processes. The main reasons for this choice are the impossibility of managing within a deterministic framework the innumerable factors and details which intervene in practice, and the need to take into account the unpredictability of traffic. For example, the fact that the length of calls and the precise arrival times of new connections are not known in advance demands the adoption of a probabilistic approach. Such models can be defined in discrete or continuous time, according to the interpretation which is given to the flow, the time scale considered, and the questions that are of interest. The choice also depends on the analytic tools available, which can favor one approach or the other for technical reasons.

For example, let us take a process $X(k)$ that models audio traffic in discrete time, giving the number of bytes to be transmitted in interval $k$ of width 40 ms. This time series corresponds to a traffic rate on a discrete time grid whose resolution is close to that of an established mechanism of audio transmission, namely the packet based TCP/IP (we revisit this shortly). It also constitutes a traffic representation based on a fairly fine time scale. Beginning from a rate series $X(k)$, the nature of traffic is described by statistical quantities, starting with the mean $\mu_X(k) = E[X(k)]$, the variance $\sigma_X^2(k) = E[(X(k) - \mu_X(k))^2]$ (assuming it exists, which is not always the case), or even the entire distribution function of $X(k)$ for every $k = 1, 2, 3$, etc. Regarding its temporal structure, the most important quantity is the auto-correlation function, defined by $c_X(k, l) = E[(X(k) - \mu_X(k))(X(l) - \mu_X(l))]/(\sigma_X(k)\,\sigma_X(l))$. Typically, we assume that a flow is stationary, that is, its statistics do not depend on the time origin; we therefore obtain $\mu_X(k) = \mu$, $\sigma_X^2(k) = \sigma_X^2$ and $c_X(k, l) = c_X(\tau = k - l) = E[(X(k) - \mu)(X(l) - \mu)]/\sigma_X^2$.

In this context, let us now examine the nature of a traffic superposition as mentioned earlier. Let two processes $X_1$ and $X_2$ have the same (statistical) characteristics as $X$; in other words, they are identical but independent copies of $X$. Assuming that their superposition reduces to a simple addition, $Y(k) = X_1(k) + X_2(k)$, we can easily show that the mean $\mu_Y = \mu_{X_1} + \mu_{X_2} = 2\mu_X$ and the variance $\sigma_Y^2(k) = \sigma_{X_1}^2(k) + \sigma_{X_2}^2(k) = 2\sigma_X^2$ behave in a linear fashion, whereas the correlation structure remains unchanged: $c_Y(\tau) = c_X(\tau)$. Although the covariance amplitude (the variance) has increased, the ratio of the size of variations to the mean, $\sigma/\mu$, is reduced by a factor of $\sqrt{2}$. The simple fact that multiplexing two independent flows results in another flow of reduced variability explains why larger links are more effective; for the same probability of traffic congestion, they can transport traffic with a mean rate which is closer to their capacity: this is the well known multiplexing gain. However, when it comes to peak rates, if $X_1$ and $X_2$ are transported by links of capacity $C$ which supply a concentrator, the latter’s output must have a capacity of at least $2C$ to ensure a zero loss rate.
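The effect of superposition on the mean, the variance and the ratio σ/μ can be checked numerically. The following is a minimal sketch rather than a traffic model: the on/off source, its activity probability of 0.4 and its 160 bytes per 40 ms slot are assumptions chosen purely for illustration, as are all function names.

```python
import math
import random
import statistics

def simulate_source(n, rng, p_active=0.4, bytes_per_slot=160):
    """Hypothetical on/off audio source: in each 40 ms slot it emits either
    bytes_per_slot bytes (activity) or 0 bytes (silence), independently."""
    return [bytes_per_slot if rng.random() < p_active else 0 for _ in range(n)]

def mean_and_std(x):
    """Sample mean and (population) standard deviation of a series."""
    return statistics.fmean(x), statistics.pstdev(x)

rng = random.Random(1)                    # fixed seed, for reproducibility
n = 50_000
x1 = simulate_source(n, rng)
x2 = simulate_source(n, rng)              # independent copy of the same source
y = [a + b for a, b in zip(x1, x2)]       # superposition Y(k) = X1(k) + X2(k)

mu_x, sd_x = mean_and_std(x1)
mu_y, sd_y = mean_and_std(y)

mean_ratio = mu_y / mu_x                  # expected to be close to 2
var_ratio = (sd_y / sd_x) ** 2            # expected to be close to 2
cv_gain = (sd_x / mu_x) / (sd_y / mu_y)   # expected to be close to sqrt(2)

print(f"mean ratio {mean_ratio:.3f}, variance ratio {var_ratio:.3f}, "
      f"sigma/mu reduced by a factor of {cv_gain:.3f}")
```

With independent sources the mean and variance both roughly double, so the relative variability falls by about √2; this is the multiplexing gain in its simplest form. The sketch says nothing about peak rates, for which the capacity requirement of 2C still applies.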


In teletraffic, we are often concerned with the irregular character of traffic, as it has long been known, thanks to queueing theory (a major tool for the modeling of multiplexers), that the burstier a flow is, the larger the queue content. In this probabilistic framework, the question is what “bursty” traffic actually means. We will answer this key question later, but it is appropriate to note here that both the time structure, described by $c_X(\tau)$, and the probability distribution play a part. Often, the distribution is taken as Gaussian, in which case only the mean and variance count, but reality frequently forces us far from this convenient choice.

12.1.4. Two fundamental approaches

Until now, we have not dealt in detail with data transmission, which is a complex subject in itself, incorporating numerous logical and physical layers built on top of each other. However, it is hardly possible to have a clear vision of what teletraffic is without talking about certain essential aspects, such as the paradigms of circuit switching and packet switching and the idea of protocols and their hierarchical organization. Of course, the physical level, that is, the electronic or optical functioning of systems, does not concern us, though it is an integral part of this system hierarchy. To know more about these core concepts of telecommunications engineering, consult, for example, [TAN 88].

In order to introduce circuit switching, let us examine a telephone network during the age of switchboards. When a call was accepted and connected, an actual line of copper wire, a circuit, connected the parties. Throughout the duration of the call, they enjoyed full use of the capacity of this circuit, even during silent periods when they had no need of it. Furthermore, the circuit bandwidth was at their disposal regardless of the state of other circuits. In more modern networks, we can no longer trace a physical circuit so explicitly, due to the diversity of technologies developed for the physical layer, although the essential characteristics of circuit switching remain the same. Let us carry this a little further. The idea of dividing the total bandwidth into equal portions, or circuits, leads to link characteristics which are predictable and constant. Once a call is accepted, there is no doubt about the infrastructure provided: a fixed quantity of bandwidth capable of being used by a service designed to function under such conditions. If, on the contrary, the system is overloaded then there is no available circuit, a situation which is easily and unambiguously verified. Therefore, this system is simple in principle, but suffers from the disadvantage of wasting resources; it is entirely possible that the bandwidth is highly under-used even though most or even all of the circuits are occupied.

To address this disadvantage, packet switching proposes a sharing of resources whereby all flows are subdivided and the pieces sent independently over the same connection, without any detailed reservation. Each packet is transported separately, without any direct link to its fellow packets from the same connection. We rely on the principle of statistical multiplexing described earlier to reduce the variability of this unmanaged superposition, thereby allowing a higher number of simultaneous calls. The chief disadvantage is the possibility that the capacity of the link might be reached, leading to a loss of packets. Furthermore, the decision to admit a new flow, to accept a new connection, is not a simple matter but depends on a statistical judgment of the probability of a future link overload. It is also necessary to plan how to manage packet losses, perhaps by retransmitting them, which leads to more complicated systems of sending and receiving. Nevertheless, packet switching offers the advantage of high efficiency and also a vast range of services, such as telemetry, whose very low rate would not justify the allocation of an entire circuit, and the possibility, if a link is weakly loaded, of using a high proportion of the capacity to quickly transfer a huge file.

Evidently, there are hybrid systems existing between these two extremes. For example, we can enrich circuit switching by means of priority classes to ensure, or at least to increase, the probability that certain traffic types (very costly ones, for example) always find circuits available. These two paradigms are the traditional conceptual extremes, which do not depend on the technology of the day; on the contrary, they will always play a role even if, as systems, technologies, services and profitability criteria vary, one of the two dominates. They can even exist simultaneously, at different levels in the protocol hierarchy.

A protocol is a language and a system by which agents can communicate. In the case of packet switching, it relates on the one hand to the fact that each packet consists of two parts: the header containing, among other things, the destination and origin addresses, and the payload where the actual data resides. On the other hand, it relates to the computing infrastructure (software or hardware) that understands this structure and language down to the bit level.
However, it is often true that a single language is not sufficient; this is certainly the case when a connection traverses different networks and therefore different technologies and protocols, as for example with the Internet. Moreover, a system in which a single language is expected to manage the demands of all the functions and constraints of each level in the hierarchy would be unworkable; the task is therefore broken up. An example of some importance is the relationship between IP (Internet Protocol) and TCP (Transmission Control Protocol) in the functioning of today’s Internet. The former allows transfer across different networks. The IP headers contain globally meaningful addresses, sufficient information to steer each packet independently, without any knowledge of its fellow packets or their purpose. At this level, a connection, that is, a flow established between two end points having certain properties such as reliable communication, does not exist. Since it is often necessary to guarantee the arrival of all packets, it is essential to have another mechanism that carries the connection concept, capable of verifying the safe arrival of packets and of managing any necessary retransmissions. This is the role of TCP, in which packets containing their own headers and data payloads are transported inside the IP packet payload. Generally, there are several such layers which provide the link between “high level” applications and the technological, and then physical, layers. To know more about the TCP/IP tag-team, the reader can refer to [STE 94].
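The division of labor between the two layers can be illustrated by a toy encapsulation. This is a hedged sketch only: the classes `TcpSegment` and `IpPacket`, their fields, the port numbers and the documentation addresses are hypothetical simplifications introduced here, and bear no relation to the real header layouts of either protocol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TcpSegment:
    """Simplified TCP-like segment: a small header plus a data payload."""
    src_port: int
    dst_port: int
    seq: int            # byte offset of this payload within the connection
    data: bytes

@dataclass(frozen=True)
class IpPacket:
    """Simplified IP-like packet: addresses plus an opaque payload."""
    src_addr: str
    dst_addr: str
    payload: TcpSegment   # the whole segment travels inside the IP payload

def packetize(message, size, src_addr, dst_addr, src_port=5000, dst_port=80):
    """Cut a message into segments and wrap each one in its own packet."""
    return [
        IpPacket(src_addr, dst_addr,
                 TcpSegment(src_port, dst_port, seq, message[seq:seq + size]))
        for seq in range(0, len(message), size)
    ]

def reassemble(packets, total_length, size):
    """Rebuild the message from packets arriving in any order.
    Returns (data, missing), where missing lists the offsets to retransmit."""
    received = {p.payload.seq: p.payload.data for p in packets}
    data, missing = bytearray(), []
    for seq in range(0, total_length, size):
        if seq in received:
            data += received[seq]
        else:
            missing.append(seq)
    return bytes(data), missing

message = b"scale invariance in network traffic"
packets = packetize(message, 8, "192.0.2.1", "192.0.2.2")
in_disorder = list(reversed(packets))     # the network may reorder packets
with_loss = packets[:2] + packets[3:]     # ... or lose one (offset 16 here)

print(reassemble(in_disorder, len(message), 8))
print(reassemble(with_loss, len(message), 8)[1])
```

The outer layer needs only the addresses to forward each packet on its own, while the sequence offsets carried by the inner layer are what allow the receiver to restore order and to identify what must be retransmitted.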


12.2. From a wealth of scales arise scaling laws

12.2.1. First discoveries

During the first decades of its evolution, traffic modeling grew considerably and came to constitute, towards the end of the 1980s, a diverse and abundant literature. In parallel, the field of network performance analysis, essentially a branch of queueing theory, grew in conjunction with this work, unveiling the positive and negative effects of each new model. The advent of packet switching stimulated new developments and founded new classes of models, leading to a significant broadening in the class of systems capable of being treated. However, this corpus of knowledge remained deeply rooted in the intuitive and technical foundation developed for circuit switching, a thoroughly mastered field where models enjoyed a real and acknowledged success. In fact, there was even a sense that we already knew this new concept of packet traffic, so much so that despite major changes in network structure, we hardly felt the need to compare model predictions with reality. It was noted [PAW 88] that among several thousand papers published between 1966 and 1987 dealing with performance analysis, only around 50 dealt with actual network measurements. In defence of the research community of the time, it is important to note that it was far from easy to obtain measurements, particularly high quality measurements. In general, collection systems lost packets and could only operate at low time resolution. On the other hand, it must be recognized that it was, above all, tractability concerns which guided researchers towards models which did not necessarily have any solid link with the actual properties of the data they were supposed to describe.

It has long been taken as a given, and no less so today, that from the statistical point of view the fundamental characteristic of packet traffic is its bursty nature. By this we refer implicitly to a comparison with traditional telephone traffic, and more precisely, to a Poisson process, the keystone model of the circuit switching world. Let us recall some of the basic properties of this point process. Let $N(t)$ be the number of points falling within the interval $[0, t]$. The fact that the variance to mean ratio of this quantity is equal to 1 points to the mild variability of this process. As far as its time dependent structure is concerned, recall that the Poisson process is the traditional example of a memoryless process: the conditional probability density of the number of points in $(t_1, t_2]$, given the past history, is the same as the unconditional density, which takes account of neither the number of points $N(t_1)$ in $[0, t_1]$ nor of their positions. Compared with this timid standard, the nature of a more boisterous traffic seemed clear in the minds of traffic engineers: it was sufficient to expect high variance to mean ratios, maybe equal to 2 or even 3, and to replace exponentially decreasing covariance functions by others whose correlations are larger, but still short range, naturally! Such generalizations, often in the form of Markov chains, were developed and were taken to constitute an adequate response to the question: “what is bursty traffic?” At the same time, however, the signs of a forthcoming revolution were already on the horizon.
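The two properties of the Poisson process recalled above, a variance to mean ratio of 1 and the absence of memory, are easy to observe in simulation. The sketch below is illustrative only: the rate of 5 points per unit time, the unit window and the function names are assumptions, and the lag-1 correlation of the counts is used as a simple proxy for memorylessness rather than a full test of it.

```python
import random
import statistics

def poisson_counts(rate, window, n_windows, rng):
    """Counts of a Poisson process of the given rate in consecutive windows,
    built from independent exponential inter-arrival times."""
    counts = [0] * n_windows
    horizon = window * n_windows
    t = rng.expovariate(rate)
    while t < horizon:
        index = min(int(t / window), n_windows - 1)   # guard against rounding
        counts[index] += 1
        t += rng.expovariate(rate)
    return counts

def lag1_correlation(x):
    """Sample correlation between successive values of a series."""
    mean = statistics.fmean(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den

rng = random.Random(7)                    # fixed seed, for reproducibility
counts = poisson_counts(rate=5.0, window=1.0, n_windows=20_000, rng=rng)

dispersion = statistics.pvariance(counts) / statistics.fmean(counts)
memory = lag1_correlation(counts)

print(f"variance/mean = {dispersion:.3f}, lag-1 correlation = {memory:.3f}")
```

A ratio near 1 and a correlation near 0 are the signature of the “timid standard”; the bursty models of the period raised the first to 2 or 3 while keeping correlations short range.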


In a 1988 survey paper, Ramaswami [RAM 88] spoke about the future of traffic modeling. He emphasized the inadequacy of traditional approaches, particularly the real danger of neglecting certain deterministic traffic features which are poorly modeled by accepted approaches, yet influential on performance. An example of such an effect is that one loss can provoke a series of losses. In the context of asymptotically stationary models, he also mentioned very long transient periods. Through these observations we can glimpse an emerging appreciation of the fact that the high variability of real traffic, as well as the far-reaching effects of “deterministic” events, must be modeled and can have a serious impact, and we even touch on the idea of non-stationary modeling. The author also spoke of the necessity of defining both Quality of Service metrics and flow statistics that were no longer universal but relative to time scale. Here we recognize an understanding of the fact that simple mean measures are too coarse for traffic which has access to, and uses, a wide range of scales.

In 1991, Meier-Hellstern, Wirth, Yan and Hoeflin [MEI 91] analyzed ISDN (Integrated Services Digital Network) data from a low speed packet network. They observed strong variability which was difficult to explain using traditional models. On the other hand, they successfully used a model with three components, of which two contained statistical ingredients that were unusual, if not unheard of, in the field, namely infinite variances and even infinite means. Such discoveries clearly showed that reality had many things to say regarding the nature of bursty traffic.

Also in 1991, Leland and Wilson [LEL 91], in a study of traffic on Ethernet, the well-known local network technology (at that time operating at 10 Mbit per second), reported astonishing discoveries. They were obtained from the analysis of a dataset which was exceptional in its high resolution as well as its length and reliability. Essentially, instead of finding a preferential time scale that could serve as a basis for a definition of burstiness, as in the Poisson case, they observed a chaotic behavior over 20 orders of magnitude; in other words, burstiness at all time scales! We could not find a result more in contradiction with the spirit, and even the hopes, of traditional modeling, without abandoning stationarity itself.

12.2.2. Laws reign

With the appearance of the celebrated article by Leland, Taqqu, Willinger and Wilson [LEL 93] in 1993 (see also the review article [LEL 94] and also [ERR 93]), this new, poorly understood behavior finally made sense and at the same time was given a name: scale invariance. It happened that these “bizarre” properties had already been met elsewhere, and even constituted a canonical mode of behavior in nature, and a modeling paradigm well known to science in physics, biology, chemistry, geology and, under the name of fractal geometry, in mathematics. To the apparently complex phenomenon described above corresponded a corpus of knowledge capable of describing and characterizing it: fractal traffic was born.

Using the same system of Ethernet data collection as that of Leland and Wilson [LEL 91], large measurement traces were taken, some of which subsequently became freely available and formed unofficial reference datasets. Part of one of these public series, “pAug”, appears in Figure 12.1. From the data a time series $X(k)$ was extracted, corresponding to the number of bytes crossing the network during intervals of width δ = 12 ms. We denote by $X^{(m)}$ the series $X$ averaged over blocks of length $m$, a procedure called aggregation of level $m$; for example, $X^{(3)}(1) = (X(1) + X(2) + X(3))/3$. Using this smoothing operator, an illustration as simple as it is fundamental to scale invariance is presented in Figure 12.1. From top to bottom, we trace the first 512 points of the four series $X(k) = X^{(1)}(k)$, $X^{(8)}(k)$, $X^{(64)}(k)$ and $X^{(512)}(k)$, with δ varying from δ = 12 ms to δ = 12 × 8 × 8 × 8 ms, or 6.144 s. Apart from a reduction of variance, fortuitously compensated for by the automated setting of the graph scale, these four series present a quasi-uniform statistical face. A smoothing of this kind, across nine octaves (a factor of 8 × 8 × 8 = 512), if carried out on short memory data, would have revealed a strong evolution towards a constant traffic rate.

[Figure 12.1: four stacked panels, for δ = 12 ms, 12 × 8 ms, 12 × 8 × 8 ms and 12 × 8 × 8 × 8 ms, each plotting bytes per bin against bin index from 0 to 500]
Figure 12.1. Visual demonstration of scale invariance. Each plot shows the first 512 points, in bytes per bin, of “pAug” Ethernet data. From top to bottom, the resolution passes from δ = 12 ms to δ = 6.144 s. The visual appearance of variability remains the same; no smoothing effects are seen
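The aggregation operator used to build Figure 12.1 takes only a few lines. The sketch below is an illustration under stated assumptions: it does not use the “pAug” trace, but independent random draws standing in for short memory data, in order to show the smoothing that the real data fail to exhibit. The function names and the synthetic series are hypothetical.

```python
import random
import statistics

def aggregate(x, m):
    """Aggregation of level m: average x over consecutive, non-overlapping
    blocks of length m (an incomplete final block is dropped)."""
    return [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]

def variability(series):
    """Ratio of standard deviation to mean."""
    return statistics.pstdev(series) / statistics.fmean(series)

# The example from the text: X^(3)(1) = (X(1) + X(2) + X(3)) / 3.
example = aggregate([3, 6, 9, 1, 2, 3, 7], 3)   # blocks (3, 6, 9) and (1, 2, 3)

# Short memory contrast: independent draws standing in for bytes per 12 ms bin.
rng = random.Random(0)
x = [rng.randint(0, 10_000) for _ in range(64 * 2_000)]

ratios = {m: variability(aggregate(x, m)) for m in (1, 8, 64)}
smoothing = ratios[1] / ratios[64]   # about sqrt(64) = 8 for independent data

print(example)
print({m: round(r, 4) for m, r in ratios.items()}, round(smoothing, 2))
```

For independent data each aggregation by a factor of 8 divides the relative variability by about √8, so the plots would flatten rapidly towards a constant rate. For scale invariant traffic the reduction is much slower, which is what the four panels show.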

This kind of scale invariance takes its pure form in the canonical process called fractional Gaussian noise (FGN) $Z(t)$. This discrete time stationary process, which satisfies $E[Z(t)] = 0$, has a parametric correlation function with a single parameter,


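the exponent β, 0 < β < 1:

\[
c^*_Z(\tau) = \frac{1}{2}\left[(\tau + 1)^{2-\beta} - 2\tau^{2-\beta} + (\tau - 1)^{2-\beta}\right] \sim \frac{1}{2}(1 - \beta)(2 - \beta)\,\tau^{-\beta} \equiv c^*_\gamma\,\tau^{-\beta} \tag{12.1}
\]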

where $\tau = 1, 2, 3, \ldots$, and the asymptotic relation is valid for large $\tau$. The function $c^*_Z(\tau)$ is a fixed point of the aggregation operator, that is, each of the $Z^{(m)}$ has this same correlation function. This perfect second order invariance is closely related to the power law decrease of $c^*_Z(\tau)$, a decrease so slow that the correlation sum $\sum_{\tau=1}^{\infty} c^*_Z(\tau)$ is infinite. Moreover, if a process is not a fractional Gaussian noise, but has an asymptotic decrease of the same form:

\[
c_X(k) \sim c_\gamma\,k^{-\beta}, \qquad 0 < \beta < 1 \tag{12.2}
\]

then under aggregation the correlation functions of $X^{(m)}$ tend towards $c^*_Z(\tau)$. Thus, such processes form a second order asymptotically self-similar class. We say that they exhibit long-range dependence, giving, at second order, a precise meaning to the long memory concept. If, on the other hand, we were to examine the effect of successive aggregations on exponentially decreasing correlations, for which the sum of the correlations is finite, we would converge towards the trivial fixed point, white noise. A property equivalent to definition (12.2) [COX 84], sometimes named slowly decreasing variance, arises from the fact that

\[
v^{(m)} = \mathrm{var}\left[X^{(m)}\right] \sim \frac{2c_\gamma}{(1 - \beta)(2 - \beta)}\,m^{-\beta} \tag{12.3}
\]

which should be compared against the case of an exponential decrease, for which $v^{(m)}$ goes as $O(1/m)$. Such a slow decrease explains the visible difference between the four variance to mean ratios of Figure 12.1. Aggregation implies a normalization of $1/m$ as it is based on taking means, whereas $m^{-\beta}$ is the factor necessary to exactly compensate for the (asymptotic) invariance present. For the specific case of FGN, we have $v^{(m)} = m^{-\beta}$ for each $m \geq 1$. Note that we can also interpret $v^{(m)}$ as the asymptotic variance of an estimate of the mean of $X$, which is larger for a process with long-range dependence and depends on both β and $c_\gamma$, but not on $\sigma_X$.

These two fractal properties, long-range dependence and the slow decrease of variance, were rapidly verified in Ethernet networks across the world. In 1994, Duffy et al. [DUF 94] also observed them in SS7 traffic (a signaling protocol), collected from a packet switched network, CCSN (Common Channel Signaling Network), which is used to control other networks. In 1994, Paxson and Floyd, in the aptly named “The failure of Poisson modeling” [PAX 94b] (see also [PAX 94a]), showed that in traffic that circulates between regional networks, in other words on the Internet, we often find scale invariance, and almost never find traffic obeying the traditional rules. Certain types of digitized video traffic also exhibited the phenomenon (see Beran et al. [BER 95]), though over a less spectacular range than for Ethernet. In 1995, Willinger et al. [WIL 95] returned to Ethernet data for a more refined analysis, examining not only the traffic itself but also its components, and once again found scale invariance. We will consider this example and its theoretical offshoots in section 12.3. Numerous discoveries followed (for a more complete list, see the bibliography in [WIL 96]) and almost always scale invariance was found, as well as heavy tails for many quantities such as the length of bursts. By the qualifier “heavy”, we understand a slow decrease in probability density at infinity, which gives rise to infinite variances or even infinite means.

Despite the weight of abundant empirical evidence, resistance against this new wave in traffic was not slow in showing itself. Essentially, this resistance was divided between those who preferred to believe that the evidence was in fact merely the artifact of corruptive non-stationarities, and others who thought that a continuum of scales could be effectively approximated by a sufficiently large number of discrete scales, obviating any need to talk about fractals. Although it is true that, in many cases, such objections were merely a reflex against the shock of new ideas, it is nevertheless true that non-stationarities can very well be confused with the increased variability of scale invariant processes.

One of the first methods used to detect scale invariance and to estimate its exponent β was based on equation (12.3). From $X^{(m)}$, we estimate $v^{(m)}$ using the standard variance estimator, and then plot its logarithm against $\log(m)$: the variance-time plot. The slope of this plot gives an estimate of $-\beta$.
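The variance-time method can be sketched directly from equations (12.1) and (12.3). The code below is a minimal illustration under stated assumptions: instead of synthesizing FGN sample paths, it computes the exact variance of the aggregated process from the correlation function (12.1), and it applies the data-driven estimator only to white noise, for which the slope should be close to −1. The value β = 0.4, the dyadic aggregation levels and all function names are choices made here.

```python
import math
import random
import statistics

def fgn_corr(tau, beta):
    """Correlation function (12.1) of fractional Gaussian noise at lag tau."""
    p = 2.0 - beta
    return 0.5 * (abs(tau + 1) ** p - 2 * abs(tau) ** p + abs(tau - 1) ** p)

def aggregated_variance(corr, m):
    """Exact variance of the mean of m consecutive samples of a unit-variance
    stationary process with correlation function corr(tau)."""
    s = sum((m - tau) * corr(tau) for tau in range(1, m))
    return (m + 2.0 * s) / (m * m)

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def variance_time_beta(x, levels):
    """Estimate beta as minus the slope of log var[X^(m)] against log m."""
    log_m, log_v = [], []
    for m in levels:
        blocks = [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]
        log_m.append(math.log(m))
        log_v.append(math.log(statistics.pvariance(blocks)))
    return -slope(log_m, log_v)

beta = 0.4
levels = [2 ** j for j in range(8)]       # m = 1, 2, 4, ..., 128

# Exact check on FGN: v(m) = m^(-beta), hence a slope of exactly -beta.
exact_v = [aggregated_variance(lambda t: fgn_corr(t, beta), m) for m in levels]
beta_from_exact = -slope([math.log(m) for m in levels],
                         [math.log(v) for v in exact_v])

# Data-driven check on short memory data: white noise behaves like beta = 1.
rng = random.Random(3)
white = [rng.gauss(0.0, 1.0) for _ in range(2 ** 15)]
beta_white = variance_time_beta(white, levels)

print(f"FGN: beta = {beta_from_exact:.4f}; white noise: beta = {beta_white:.3f}")
```

The exact computation confirms that FGN is a fixed point of aggregation at second order, while the white noise case shows the $O(1/m)$ decay of short memory data. On real traces the estimator inherits the weaknesses of the standard variance estimator under long-range dependence and non-stationarity.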
This method, although simple and cheap to compute, suffers from many statistical defects: a notable bias, a quite large variance, and in particular poor robustness with respect to non-stationarity. This last defect is shared, at least partly, by many other methods (see [TAQ 95] for a comparison of several methods). The need to measure β in a reliable way stimulated further contributions to the already sizable statistical literature. In fact, the high volume of “teletraffic” data excluded the use of the estimators typically in use, which had good statistical performance, but very high computational costs. A semi-parametric method based on wavelets, introduced into the field by Abry and Veitch [ABR 98], solved these problems: its complexity is low, only $O(n)$, yet it sacrifices neither robustness nor statistical performance, thanks to the natural match between wavelet bases and scale invariant processes. Wavelet analysis operates jointly in time and scale. It replaces the time signal $X(t)$ with a set of coefficients, the details $d_X(j, k)$, $j, k \in \mathbb{Z}$, where $2^j$ denotes the scale, and $2^j k$ the instant, around which the analysis is carried out. In the wavelet domain, equation (12.3) is replaced with $\mathrm{var}[d_X(j, k)] = c_f\,C\,2^{(1-\beta)j}$, where the role of $m$ is played by the scale, of which $j$ is the logarithm, $c_f$ is the frequency analog of $c_\gamma$ and is proportional to it, and $C$ is independent of $j$. The analog of the variance-time plot, the graph of the estimates of $\log(\mathrm{var}[d_X(j, \cdot)])$ against $j$, is called the logscale diagram and constitutes a spectrum estimate of the process, where low frequencies correspond to large scales (on the

[Figure 12.2: two logscale diagrams (panel title “Logscale Diagram, N=2”; annotation “(j1,j2) = (3,15), Q = 0.011384”), each plotting $y_j$ against octave $j$.]

Figure 12.2. Wavelet analysis of scale invariance. The logscale diagram, an estimated “wavelet spectrum” (in log-log coordinates), is shown for the data. Left: Ethernet “pAug”. The slope gives an estimate of $1 - \beta$ with good properties, which is reliable from $j = 3$. We obtain $1 - \beta = 0.59 \pm 0.01$. Right: the number of new TCP connections in 10 ms bins on an Internet link. We see two scale ranges, from $j = 1$ to 8 and $j = 8$ to 19. Daubechies 2 wavelets were used.

right side in Figure 12.2). For more details on this method, its use and robustness, see [ABR 00, VEI 99]. In the left plot of Figure 12.2, the wavelet method is applied to “pAug”. We see a general alignment across all scales in this time series of length $n = 2^{18}$ which, starting from $j = 3$, justifies the estimate of the slope (a weighted regression is used, which gives more weight to small scales where there is more data, as indicated by the confidence intervals displayed). Thus, the self-similarity visible in Figure 12.1 can be objectively revealed and quantitatively measured.

12.2.3. Beyond the revolution

Despite considerable resistance, the concept of fractality spread quickly. In fact, after a short span of time, the opposite attitude was encountered as frequently as resistance: the idea that we had only to measure the value of $\beta$ to capture the essence of fractal traffic. In fact, a model based on relation (12.3), comprising only three parameters, $\mu_X$, $\sigma_X$ and $\beta$, cannot claim, except in particular cases, to describe the essence of an object as rich as traffic. Even if we are ready to accept a Gaussian hypothesis, which eliminates the need for dealing with statistics of orders higher than two, models that allow more flexibility are required. For example, it is necessary to think about the constant $c_\gamma$ of equation (12.2), which gives the “size” of the law of which $\beta$ expresses the nature. There is no reason, a priori, to assume that its value is equal to that of standard fractional Gaussian noise, where it takes


the parametric form $c_\gamma^* = \frac{1}{2}(1 - \beta)(2 - \beta)$. The same comments are valid for short-range correlations, which are very important for sources such as video. For traffic which presents long-range dependence, but which also deviates notably from fractional Gaussian noise, to be well modeled by the latter, it is necessary to have a high value of $m$; however, the time scales thereby neglected can have significant impacts on performance.

In 1997, Lévy Véhel and Riedi [LEV 97] observed that in TCP data, not only can the behavior over small scales be far from that of a fractional Gaussian noise, but we can even find another form of scale invariance, that of multifractality. These observations were confirmed on other TCP data by Feldmann et al. [FEL 98]. In Figure 12.2, the plot on the right side shows a second order analysis of a time series which corresponds to the number of new TCP connections arriving in bins of width 10 ms. In addition to the slope on the right, corresponding to long-range dependence, we observe a second slope at small scales, a second zone in which an invariance lies. However, multifractality means much more than the simple fact that there are two different regimes, each with its own invariance. To understand this difference, let us imagine that we defined, not a second order logscale diagram on the basis of the quantities $|d_X(j, k)|^2$, but, in a similar manner, a logscale diagram of $q$th order based on $|d_X(j, k)|^q$, from which we estimated, over the same range of scales, an exponent $\beta_q$. In the case of fractional Gaussian noise, though the $\{\beta_q\}$ (here positive real $q$ are taken) are all different, they are connected in a simple, linear way; essentially there is only one “true” exponent. On the other hand, in the multifractal case the $\{\beta_q\}$ enjoy considerable freedom.
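A $q$th order logscale diagram can be sketched as follows. Two simplifications are made and should be read as assumptions: the orthonormal Haar wavelet stands in for the Daubechies 2 wavelet used for Figure 12.2, and an unweighted least-squares fit replaces the weighted regression mentioned earlier. All function names are hypothetical.

```python
import math
import random


def haar_details(x, n_octaves):
    """Orthonormal Haar details d(j, k) for octaves j = 1..n_octaves."""
    approx = list(x)
    details = {}
    root2 = math.sqrt(2.0)
    for j in range(1, n_octaves + 1):
        pairs = len(approx) // 2
        details[j] = [(approx[2 * k] - approx[2 * k + 1]) / root2
                      for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / root2
                  for k in range(pairs)]
    return details


def logscale_points(x, n_octaves, q):
    """Points (j, log2 of the sample mean of |d(j, k)|^q)."""
    details = haar_details(x, n_octaves)
    points = []
    for j in range(1, n_octaves + 1):
        moment = sum(abs(d) ** q for d in details[j]) / len(details[j])
        points.append((j, math.log2(moment)))
    return points


def fitted_slope(points):
    """Unweighted least-squares slope (a simplification of the weighted fit)."""
    mx = sum(p[0] for p in points) / len(points)
    my = sum(p[1] for p in points) / len(points)
    num = sum((u - mx) * (v - my) for u, v in points)
    den = sum((u - mx) ** 2 for u, _ in points)
    return num / den


# Uncorrelated Gaussian noise: the diagram should be flat at every order q.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
slope_q2 = fitted_slope(logscale_points(noise, 8, 2.0))
slope_q1 = fitted_slope(logscale_points(noise, 8, 1.0))
```

For this test signal both slopes should be close to zero, which at second order corresponds to $1 - \beta = 0$. Repeating the fit for several values of $q$ on real data is what would reveal whether the exponents are tied together linearly or vary freely.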
The invariance is fundamentally different for each moment, and it is therefore necessary to know the entire spectrum of exponents, instead of just one, to fully understand the nature of the invariance present. A detailed treatment of multifractals is beyond the scope of this chapter (see [RIE 95, RIE 99], Chapter 1, Chapter 3 and Chapter 4 for a more in-depth analysis) but it is nevertheless relevant to describe a simple example: a deterministic multiplicative cascade. We begin with a unit mass, uniformly spread out over $[0, 1]$. We then distribute a fraction $p \in (0, 1)$ of the mass on the first half $[0, \frac{1}{2}]$ and the remainder on $(\frac{1}{2}, 1]$. The total mass is thus preserved and we say that the cascade is conservative. Now, let us repeat this procedure to obtain masses $\{p^2,\ p(1-p),\ p(1-p),\ (1-p)^2\}$ on the quarters of the interval, then $\{p^3,\ p^2(1-p),\ p^2(1-p),\ p(1-p)^2,\ p^2(1-p),\ p(1-p)^2,\ p(1-p)^2,\ (1-p)^3\}$ on the eighths, and so on. For example, with $p = 0.7$ the quarters carry masses $\{0.49,\ 0.21,\ 0.21,\ 0.09\}$, which again sum to 1. This procedure, repeated indefinitely, creates a mass distribution, that is a measure, which is singular: it is multifractal. Less rigid and random variants can be easily defined and taken as traffic models by rescaling $[0, 1]$ to the duration $[0, t]$, and by adjusting the number of iterations, which controls the time resolution $\delta$. The singularity and positivity of these measures are apt to describe the astonishing variability observed in small scale traffic, where a Gaussian hypothesis is far from reasonable. Thus, multifractal modeling offers the hope of venturing into the difficult


terrain of non-Gaussianity, with the strong structured support of scale invariance as a conceptual and mathematical tool.

12.3. Sources as the source of the laws

12.3.1. The sum or its parts

Following the surprise generated by the discovery of fractal traffic, the underlying cause of the phenomenon soon became an issue. A relevant though elementary observation, already discussed in section 12.1, is that there is no shortage of the raw material, namely scales. A second observation is that there are certainly a great number of characteristic scales present in traffic, due either to perturbations of the network or inherent in the nature of the sources themselves.

On the network side we have the queues inside switches, which control the finest scales, measured at the beginning of the 21st century in fractions of a microsecond. From there we move on to millisecond scales, strongly influenced by queues at the input and output of large switches, which average out the high frequency fluctuations of flows. Next, we find the scales of control mechanisms for admission and congestion, which have their own internal structures and associated time scales. On a much larger scale, say of 1 minute, we can cite the regular updating of address tables in switches, and finally, major changes of packet routes and link capacities.

As for the sources, the picture is even richer. Each protocol in the hierarchy imposes its own characteristic scales, and then the nature of the traffic itself enters in: audio, video, files, and others. Finally, we must include scales that are associated with human activities, such as the frequency of telephone calls, the dynamics of hyperlink navigation in web pages, and patterns of working hours, to mention only a few. Thus, there is no lack of characteristic scales. We could even imagine that every scale is a characteristic scale for at least one traffic component.
Unfortunately, even this does not in any way resolve the question of scale invariance we are concerned with, which deals with the relative structure of behavior at different scales, and not simply with the fact that effects exist at each scale. It must not be forgotten that traditional trafﬁc also contains correlations at all scales even though their relative sizes, summarized in the form of cX , were such that a single one dominated. In contrast, with scale invariance they are connected to each other in a very particular way. To answer the question concerning the origin of this mysterious connection, two logical extremes come naturally to mind: scale invariance is found in every source and the trafﬁc superposition is simply the inheritor of this fact, or it is an emergent property, either linearly – the whole is the sum of its parts – or non-linearly – the whole is more than the sum of its parts. Of course, the sources are combined and controlled by the network, the origin of the non-linear effects in question. However, this does not make the situation simple. Although it is the network which combines different ﬂows, it is also the nature of other ﬂows which, through the mechanisms provided by the network, largely controls the statistical evolution of a given ﬂow. The network


is thus as much an agent as the cause of non-linear effects. As for the source, its characteristics are defined only in their broadest outlines by its underlying fundamental nature (real-time or not, etc.). The majority of its characteristics are determined by the protocol (or protocols) which interpret it, and hence it is necessary to understand “source” as meaning the “fundamental data and protocol(s)”. In this section, we explore partial responses to these questions, starting with a description of the on/off source, which is a natural and commonly used model with both theoretical and practical advantages.

12.3.2. The on/off paradigm

The motivation of the idea is as physical as it is simple: a source alternates between periods of silence, where the rate is zero, and activity, where the rate is constant, say $h$. Such simplicity models a source which is highly spasmodic in that whenever it has anything to transmit, it sends it at its peak rate. This model was originally aimed at the “burst scale” situated at medium scales, a concept which is now rather old-fashioned. However, it remains true that its structure is not useful at very small scales, where the rate varies rapidly. In an on/off source, periods are generated by (positive) random variables which are mutually independent, distributed as a variable $A$ for silences and $B$ for activity. Thus, the transition points obey an alternating renewal process. Let $E[B] = 1/\mu$ and $E[A] = 1/\nu$; it follows that the average rate is $\lambda = h\nu/(\nu + \mu)$. As for the process $X(t)$, its 1D marginal is Bernoulli and we obtain $E[X(t)] = \lambda$ and $\sigma_X^2 = \lambda(1 - \lambda)$. Thanks to the renewal structure, the structure of the correlations of $X(t)$ is not difficult to understand: a link between instants $t_1$ and $t_2$ exists only if they fall within the same period (silence or burst); as such, correlations are of short range, unless the probability of a long period is in itself large.
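The average-rate formula $\lambda = h\nu/(\nu + \mu)$ can be checked with a small simulation of the alternating renewal process. This is a sketch only: exponential period lengths are an assumption made here so that the check converges quickly (the text allows general positive variables $A$ and $B$), and the function name and parameter values are hypothetical.

```python
import random


def on_off_mean_rate(h, mu, nu, n_cycles, rng):
    """Time-averaged rate over n_cycles silence/activity cycles.

    Assumption for this sketch: silences A are exponential with mean 1/nu
    and bursts B are exponential with mean 1/mu.
    """
    on_time = 0.0
    total_time = 0.0
    for _ in range(n_cycles):
        silence = rng.expovariate(nu)   # rate 0 during a silence
        burst = rng.expovariate(mu)     # rate h during a burst
        on_time += burst
        total_time += silence + burst
    return h * on_time / total_time


rng = random.Random(1)
h, mu, nu = 2.0, 1.0, 0.5
estimate = on_off_mean_rate(h, mu, nu, 100_000, rng)
expected = h * nu / (nu + mu)   # equals 2/3 for these values
```

The simulated value should agree with `expected` to within about a percent. Only the two means $1/\mu$ and $1/\nu$ enter the result, which is why the same formula holds whatever the laws of $A$ and $B$.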
Let $A$ be a law of finite variance, but let $B$ be one with a heavy tail:

$$\bar{F}_B(t) \sim h_B\, t^{-\alpha}, \qquad t \gg 1, \qquad 1 < \alpha < 2 \qquad (12.4)$$

where $\bar{F}_B$ is the complementary distribution function of $B$, that is $\bar{F}_B(t) = P(B > t)$. The range of $\alpha$ corresponds to infinite variance for $B$ and it is not difficult to verify [BRI 96] that it leads to long-range dependence for $X(t)$, with parameters

$$\beta = 3 - \alpha, \qquad c_\gamma = \frac{h_B\, \nu (1 - \lambda)^3}{\alpha - 1} \qquad (12.5)$$
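A burst length with the tail behavior of (12.4) can be drawn by inverse transform sampling. The sketch below uses an exact Pareto law, $P(B > t) = (t/t_{\min})^{-\alpha}$ for $t \ge t_{\min}$, which is one convenient choice satisfying (12.4) with $h_B = t_{\min}^{\alpha}$. The Pareto form, the parameter `t_min` and the function names are assumptions made for illustration, not something specified in the text.

```python
import random


def sample_pareto(alpha, t_min, rng):
    """Draw B with P(B > t) = (t / t_min) ** (-alpha) for t >= t_min."""
    u = 1.0 - rng.random()            # uniform on (0, 1], never zero
    return t_min * u ** (-1.0 / alpha)


def empirical_tail(samples, t):
    """Fraction of samples strictly larger than t."""
    return sum(1 for s in samples if s > t) / len(samples)


rng = random.Random(2)
alpha, t_min = 1.5, 1.0
bursts = [sample_pareto(alpha, t_min, rng) for _ in range(200_000)]
tail_at_10 = empirical_tail(bursts, 10.0)
theory_at_10 = 10.0 ** (-alpha)       # about 0.0316
```

With $1 < \alpha < 2$ the mean of $B$ is finite but its variance is not. In practice this shows up as sample variances that keep growing with the sample size instead of settling.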

One of the principal reasons why the on/off approach, as well as others based on renewal processes [RYU 96], has been so often used is the ease with which LRD can be introduced and controlled. This also brings a key advantage in terms of the generation of trajectories: we are not obliged to take the past into account in great detail. On the contrary, it is enough to draw independent samples from a random variable with infinite variance. The use of this method in Monte Carlo simulations is


thus widespread, although it suffers from a subtle but significant problem of slow convergence which is under-appreciated [ROU 99].

A variance which is infinite may appear unrealistic, and be seen as a weak device for generating long-range dependence, unrelated to empirical observations. How can we claim to observe an infinite variance when in practice one can only measure and handle finite quantities? The answer, as elsewhere in science, lies in the fact that a model does not claim absolute truth, but elegant utility. If, when measuring the distribution of values of a quantity, we observe that they follow relation (12.4) across a wide range of $t$, up to and including the largest available, an infinite variance model becomes entirely relevant as an idealization. It was in this spirit that Cunha, Bestavros and Crovella reported in 1995 [CUN 95] that they observed heavy tails, with infinite variances, in many characteristics of web documents, particularly their sizes. They observed this same property in the sizes of UNIX files, thus revealing an orebody rich in power laws, capable of contributing to the existence of on/off type sources with infinite variance. In [WIL 95], Willinger, Taqqu and Sherman went further in their analysis of local Ethernet traffic and also external Ethernet traffic, which consists of traffic offered to (and received from) the Internet. Not only did they observe evidence of on/off behavior in individual flows, defined as traffic flowing between unique emission and reception address pairs, but in most cases the estimated values of $\alpha$ were indeed well within the infinite variance range.

12.3.3. Chemistry

In general, the addition of independent processes induces the addition of their temporal structures. More precisely, if the independent processes $X_i(t)$ have $\gamma_{X_i}(\tau)$ as their covariance functions, then the covariance function of the process $X = \sum_i X_i$ is just $\gamma_X(\tau) = \sum_i \gamma_{X_i}(\tau)$.
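For two independent stationary processes the additivity can be written out explicitly. The last line rests on an assumption about the form of equation (12.2), which is not reproduced in this passage, namely $\gamma_{X_i}(\tau) \sim c_i\, \tau^{-\beta_i}$ for large $\tau$.

```latex
\begin{aligned}
\gamma_X(\tau)
  &= \mathrm{Cov}\bigl(X_1(t) + X_2(t),\; X_1(t+\tau) + X_2(t+\tau)\bigr) \\
  &= \gamma_{X_1}(\tau) + \gamma_{X_2}(\tau)
     + \mathrm{Cov}\bigl(X_1(t), X_2(t+\tau)\bigr)
     + \mathrm{Cov}\bigl(X_2(t), X_1(t+\tau)\bigr) \\
  &= \gamma_{X_1}(\tau) + \gamma_{X_2}(\tau)
     \qquad \text{(the cross terms vanish by independence)} \\
  &\sim c_1\, \tau^{-\beta_1} + c_2\, \tau^{-\beta_2}
   \;\sim\; c_1\, \tau^{-\beta_1}
     \qquad \text{as } \tau \to \infty, \text{ if } \beta_1 < \beta_2 .
\end{aligned}
```

Under that assumption the slowest decaying component dominates at large lags, so the smallest exponent is the one that survives in the sum.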
It follows that the presence of long-range dependence in at least one of the $X_i$ induces long-range dependence in the superposition, with an exponent equal to the minimum of those of the long-range dependent components. This is reminiscent of rules governing the fractal dimension of a union of fractal sets [FAL 90]. If, for example, the $X_i(t)$ are $m$ independent identically distributed (iid) copies of an LRD process of parameters $(c_\gamma, \beta)$, the parameters of the superposition are simply $(c_\gamma m, \beta)$. The persistence of long-range dependence applies in particular to a finite superposition of $N$ independent and identically distributed on/off sources. As for infinite superpositions, models as interesting as they are significant emerge, depending on the precise way in which the normalization is carried out.

Let us initially examine a normalization relating to the instantaneous rate $h$ (during the on states), leaving the structure of individual sources untouched (notably $\nu$ and $\mu$). By increasing $N$, we can show [KUR 96, TAQ 97] that there is convergence, both in the distributional and weak senses, to a Gaussian process. It is not surprising that, to obtain this result, it is necessary to impose a normalization proportional to $\sqrt{N}$ after having first subtracted the mean. This limiting process has long memory and, by aggregating, we


can obtain fractional Gaussian noise in a second limit operation, this time operating on time. This bond between on/off sources and the canonical scale invariant process will be explored further in the next section. Here we emphasize that the order of the limit operations, ﬁrst in rate and then over time, is of central importance. If we try reversing these we obtain a completely different result, the Lévy ﬂight, a stable process with stationary and independent increments which does not, for the moment, correspond to a model applicable to trafﬁc. The interpretation of fractional Gaussian noise to be a combination of on/off sources reveals it to be an example of a process where the source of scale invariance lies in the linear primitive components themselves, to which a linear superposition does not add anything essential. On the other hand, there is another normalization which provides an example of where scale invariance is emergent. In this case, we leave the peak rate h constant, and lengthen the silent periods with N so as to maintain the total arrival rate constant. More precisely, we set ν = ρ/N , with ﬁxed ρ, and obtain in the limit N → ∞ λ = hρ/μ and an arrival process of bursts that obeys a Poisson process of parameter ρ. In this limit, each source contributes only a single active period because silences before and after extend to inﬁnity. The sole burst that remains to every source can be interpreted as the transfer of a single ﬁle at constant rate. The number of simultaneously active sources is described by a Poisson law of parameter ρ/μ. For ﬁnite N , as well as inﬁnite limit N , the rate process has long memory. In contrast, for the limit process, long memory is no longer ascribable to individual sources but to heavy tailed distribution of the size of the transfers which remain individually constant, without any scale invariance of their own. 
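The Poisson law for the number of simultaneously active sources in this limit can be illustrated numerically. In the sketch below, bursts start at the points of a Poisson process of rate $\rho$ and last for independent durations. Exponential durations of mean $1/\mu$ are an assumption made purely so that the check converges quickly; the case of interest in the text is the heavy-tailed one. The function name, the observation time and the parameter values are hypothetical.

```python
import random


def active_sources_at(t_obs, rho, draw_duration, rng):
    """Count the bursts still in progress at time t_obs.

    Bursts start at the points of a Poisson process of rate rho on
    [0, t_obs]; each lasts an independent duration draw_duration(rng).
    """
    count = 0
    t = rng.expovariate(rho)
    while t < t_obs:
        if t + draw_duration(rng) > t_obs:
            count += 1
        t += rng.expovariate(rho)
    return count


rng = random.Random(3)
rho, mu = 5.0, 1.0
counts = [active_sources_at(30.0, rho, lambda r: r.expovariate(mu), rng)
          for _ in range(2000)]
mean_active = sum(counts) / len(counts)   # theory: close to rho / mu = 5
```

The observation time is chosen much larger than $1/\mu$ so that the system has effectively forgotten its empty start. With heavy-tailed durations that warm-up would need to be far longer, which is one face of the slow convergence mentioned earlier.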
An aggregate of such sources can be regarded as a random and independent model of file transfers across a network.

The question of non-linear mixtures is, not unexpectedly, much more complicated, little studied, and beyond the scope of this chapter. In its broad sense, it implies that sources can influence each other, in other words that there exists feedback between the superposition and its components. We will briefly return to this later.

12.3.4. Mechanisms

A coherent way of explaining scale invariance has already emerged: long-range dependence is generated by the heavy tailed property, is preserved by superposition, and is well idealized by a canonical Gaussian process. Are there network mechanisms capable of carrying out these linear mathematical operations? The answer is yes. Owing to the discrete nature of packets, flows can inter-penetrate in switches and multiplexers (traffic concentrators), thereby effectively adding, in an approximate sense, their instantaneous rates. Moreover, since the packets remain identifiable in the mixture, this quasi-additivity also acts in the inverse direction, in the demultiplexers where flows leave a large link to move away from the core of the network, or in switches where flows are extracted from one superposition to be integrated into others. The


same logic is also valid for various methods of multiplexing relevant to circuit switching. A second question lies in the possibility that multiplicative, rather than additive, mechanisms exist in the network, potentially allowing the realization of one mathematical path (following cascades) for the generation of multifractal properties. It has been suggested that the hierarchy of protocols can fulfill this role [FEL 98] by recursive subdivision of source data. However, the true cause (or causes) of the multifractality observed remains to be determined.

If, at low load, multiplexing, switching and demultiplexing operations are well understood in terms of linear operations, at high load non-linearities, mainly due to buffers (electronic queues), are inevitable. From strong smoothing we expect elimination of scale invariance over a certain scale range; at large scales, however, the influence of heavy tails, a property of great robustness, will persist. The non-linear mechanisms potentially involved are nevertheless richer than a simple truncation of what is otherwise a simple linear superposition. If a control mechanism regulates a flow resulting from a given source, as for example in TCP connections, there is a coupling between the source and the network, a feedback, which modifies the transmission depending on the state of the network, as indicated for example by the level of loss detected. Thus, network queues generate an indirect coupling between different sources, producing highly non-linear dynamics capable of very significantly modifying the nature of traffic. This effect is stronger when the proportion of sources thus regulated is large. Such dynamics, and their potential capacity to generate scale invariance such as self-similarity and multifractality, have begun to generate considerable excitement in the networking research community.
Finally, it is interesting to note a return to dynamic system approaches, which were considered by Erramilli and Singh early in the history of fractal traffic [ERR 90, ERR 95] but which did not evolve thereafter.

12.4. New models, new behaviors

12.4.1. Character of a model

By a “good model” we understand, first of all, that the statistics of the data are well captured by the random structure of the model. It is imperative to insist on the principle of parsimony, in other words, that only the minimal number of parameters necessary to cover the essential degrees of freedom be used. An excess of parameters is the sign of a model which is over-fitted to a specific dataset, and which therefore does not capture any generality or hold structural validity. In this case, the majority of the parameters lack physical significance, and as a result their estimation is likely to be difficult and arbitrary. Finally, to measure the degree of adequacy of a model, good metrics should be chosen. In the context of telecommunications, these will not only refer to the statistics of a flow, but to the system as a whole. Thus, the model


should be judged by its capacity to predict some measure of quality of service, of which there are a number to choose from. Among the metrics which are precisely defined, and yet reasonably close to the concerns of users, we count loss rate and average packet delay, whereas an example which is more focused on engineering questions of network dimensioning is the distribution function of queue contents, which is the marginal of the “waiting process.” However, we also work under the constraint of considering problems for which we can hope to find solutions. Often, we impose simple idealizations, for example queues with infinite waiting rooms. We then commonly use the fact that the probability $Q(x)$ that the level of an infinite queue exceeds $x$ bounds from above the probability of a loss in a corresponding system where the queue is of finite size $x$.

From the first studies on the impact of fractal traffic we have seen that the behavior of systems can deviate notably from traditional intuition. In 1993, Veitch [VEI 93] emphasized this fact by presenting a simple system in which a fractal renewal process, with an average incoming rate of zero, could produce a non-trivial dynamic queue. In this section, we consider three model classes representing the state of the art and the corresponding performance studies, essentially the form of $Q(x)$ for large $x$. Each class considered itself exhibits untraditional behavior, though of very different types. Each of the models allows an interpretation in terms of a linear superposition of on/off sources, though they were not necessarily proposed in that light, and in each case other motivations are possible.

12.4.2. The fractional Brownian motion family

Often, instead of studying traffic via its rate $X(k)$, we turn to the series $Y(k) = \sum_{i=1}^{k} X(i)$, measuring the mass of data accumulated over the interval $(0, k]$.
Passing over to continuous time, if $X(t)$ is stationary, $Y(t)$ has stationary increments, that is, the distributions of the increments $\{Y(t + \delta) - Y(t),\ t \in \mathbb{R}\}$ do not depend on $t$. We can decompose this process as

$$Y(t) = \mu_Y\, t + \sigma_Y\, W(t) \qquad (12.6)$$

where $W(t)$ has zero mean. If the rate process exhibits long-range dependence, the natural choice to model $W(t)$ is the fractional Brownian motion (FBM) $B_H(t)$, $t \in \mathbb{R}$, $0 < H < 1$. This canonical process is the unique self-similar Gaussian process with stationary increments. Thus, it has a perfect scale invariance simultaneously in all its statistics across all scales; for example, its variance obeys $\mathrm{Var}[B_H(t)] = |t|^{2H}$. If we difference fractional Brownian motion ($H < 1$) with step $\delta = 1$, we obtain fractional Gaussian noise with $\beta = 2(1 - H)$, which has long memory if $H > \frac{1}{2}$.

In 1994, Norros [NOR 94] examined a system called fractional Brownian storage consisting of an infinite reservoir with a constant drainage rate of $C$, fed by $Y(t)$. This type of system is known as a fluid queue, as the data flows into the reservoir which


is emptied in a continuous manner, and the queue state is given simply by the storage level. In terms of a limit of on/off sources, the idealization $W(t) = B_H(t)$ is valid when we have many of them, each with $h \ll C$, which corresponds to many traffic streams flowing in a high capacity link. The main result is that $Q(x)$ is asymptotically close to a Weibull law, namely

$$\log Q(x) \sim \kappa\, x^{2(1-H)} \qquad (12.7)$$

where $\kappa$ denotes a known constant. The slow decrease of this probability with $x$ implies that loss probabilities are more significant than in the common exponential case. This result was confirmed by Brichet et al. [BRI 96], who began by studying on/off sources themselves and examined the system in a limit leading to fractional Brownian motion. The logarithmic asymptotic equivalence was refined by Narayan [NAR 98] and by Simonian and Massoulie [SIM 99]. The asymptotic form of $Q(x)$ is now known up to a constant.

12.4.3. Greedy sources

In contrast to the fractional Brownian motion model, which idealizes a mixture of a great number of small sources, we can imagine a finite, even a small, number of sources, each of which keeps an appreciable rate $h$. Such a scenario can model a link with a low level of aggregation, far from the center of the network, close to user access links. Indeed, even a single heavy on/off source with $h > C$, flowing into a reservoir being drained at rate $C$ (with $\lambda < C$, naturally), generates remarkable statistics in the queue: the tail $Q(x)$ of the queue is so heavy that its mean does not even exist [CHO 97]! Such a tail decay, $Q(x) = O(x^{-(\alpha - 1)})$ with $1 < \alpha < 2$, is slower than that of a Weibullian queue, for which all moments exist. There are many generalizations of this system sharing the same fundamental property: systems made up of a mixture of several sources of which some have long-range dependence while others do not, and some are characterized by $h < C$, and others by $h > C$. To learn more about these systems, see the survey article [BOX 97]. Intuitively, the property necessary for such behavior is that the time intervals during which the total rate of the superposition exceeds $C$ must have a heavy tail, with infinite variance.

12.4.4. Never-ending calls

A large fraction of the queueing theory literature concerns systems for which the arriving work is not fluid, but a point process.
For example, the notation M/G/1 denotes a Poisson process as the arrival process (“M” for Markov), of which each point, upon reaching the server, is allocated a service time distributed according to a random variable of general (“G”) distribution, that is without restriction. Here “1” denotes a single server and, by convention, the waiting room is assumed to be unlimited. Point process models readily adapt themselves to the modeling of circuit switching: points represent call requests, whose durations are determined by independent copies of B.
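The waiting time in a single-server queue with Poisson arrivals and first-come first-served order can be simulated with the Lindley recursion. The sketch below is illustrative only: the function name and parameter values are hypothetical, and exponential service times (the M/M/1 special case) are chosen so that the result can be compared with the known mean wait $\lambda / (\mu(\mu - \lambda))$, where $\lambda$ is the arrival rate and $1/\mu$ the mean service time.

```python
import random


def mean_fifo_wait(arrival_rate, draw_service, n_customers, rng):
    """Average waiting time before service in a single-server FIFO queue.

    Lindley recursion: W[n+1] = max(0, W[n] + S[n] - T[n+1]), where S[n] is
    the service time of customer n and T[n+1] the next inter-arrival time.
    """
    wait = 0.0
    total = 0.0
    for _ in range(n_customers):
        total += wait
        service = draw_service(rng)
        gap = rng.expovariate(arrival_rate)   # Poisson arrivals ("M")
        wait = max(0.0, wait + service - gap)
    return total / n_customers


rng = random.Random(4)
lam, mu = 0.5, 1.0
w_exp = mean_fifo_wait(lam, lambda r: r.expovariate(mu), 200_000, rng)
w_theory = lam / (mu * (mu - lam))   # equals 1.0 for these values
```

Because `draw_service` is passed in as a function, any general ("G") service law can be substituted, including a heavy-tailed sampler for the durations $B$.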


Equipping B with a heavy tail is interpreted as modeling long calls, for example, those generated by people connected to the Internet through the telephone network. It is evident that such heavy connections will weigh upon the size of the queue. In fact, thanks to Cohen [COH 73], since 1973 it has been known that if B is characterized by an exponent α > 1, then the exponent of Q(x) is α − 1. Therefore, in particular, if service durations do not have a variance, the average waiting time before receiving service is inﬁnite! This recalls certain ﬂuid queue results, and in fact there are strong connections between the two types of system, the main difference being that, for point arrival, the incoming mass is instantaneously rather than progressively deposited into the queue, therefore further aggravating its load. A considerable number of results for such systems are now available. For a survey we can consult Boxma and Cohen [BOX 00]. Among the most important generalizations is the replacement of “M” by “GI”, indicating a renewal process whose inter-arrivals are distributed according to a variable A which is not exponential but arbitrary. This law can also have a heavy tail, in which case, depending on the ratio of the exponents of A and B, different behaviors are possible, especially when the system is heavily loaded: λ ≈ C. Another major factor lies in the choice of service discipline of the queue. In [COH 73] the traditional choice is made: arrivals are serviced in the order of arrival. However, there is no shortage of alternatives which are commonly employed in switches. For example, with processor sharing, where the server divides its capacity equally over all the customers present, we recover a ﬁnite average waiting time even when B is without variance, essentially because no arrival is forced to wait behind earlier arrivals which may have very long service times. 12.5. 
Perspectives

Even if the fractal nature of teletraffic is now accepted, and a new understanding of its impact has been, to some extent, reached, the list of open questions remains long. In reality, we are in the early stages of studying this phenomenon, observing its evolution, and appreciating its implications.

As far as long-range dependence is concerned, one category of outstanding questions concerns the details of these effects on various aspects of performance. In some sense it is necessary to “redo everything” in the queueing literature and other fields, to take into account this invariance at large scales. Despite considerable progress, our knowledge falls far short of that necessary to design, with confidence, networks capable of minimizing the harmful effects and efficiently exploiting the beneficial properties of long memory.

A second category of questions that appears essential for the future is to understand the origin (or origins) of the apparent scaling invariance over small scales, that is, multifractal behavior. Understanding these origins will be essential to predict whether this behavior will persist, not only in the sense of not disappearing, but also in the sense of its extension towards ever smaller scales, as they are progressively activated by advances in technology. Even if small scale invariance is influenced, or even entirely determined, by network design, the study of its impact on performance will remain


relevant. Though it appears obvious that, like any variability, its presence will be negative overall, we have yet to evaluate the cost of any impact against the cost of the actions that may be required to suppress it.

The third category of questions concerns protocol dynamics in closed loop control, such as in TCP, which configures the global network as an immense distributed dynamic system, from which the generation of scale invariances may be only one of the important consequences. The richness of non-linear and non-local interactions in this system merits that this new field be studied in full depth. The next phase in the evolution of the fractal teletraffic phenomenon, as unpredictable as it is fascinating, could very well come from determinism rather than randomness.

12.6. Bibliography

[ABR 98] ABRY P., VEITCH D., “Wavelet analysis of long-range dependent traffic”, IEEE Transactions on Information Theory, vol. 44, no. 1, p. 2–15, 1998.
[ABR 00] ABRY P., TAQQU M.S., FLANDRIN P., VEITCH D., “Wavelets for the analysis, estimation, and synthesis of scaling data”, in PARK K., WILLINGER W. (Eds.), Self-Similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000.
[BER 95] BERAN J., SHERMAN R., TAQQU M.S., WILLINGER W., “Variable-bit-rate video traffic and long range dependence”, IEEE Transactions on Communications, vol. 43, p. 1566–1579, 1995.
[BOX 97] BOXMA O.J., DUMAS V., Fluid queues with long-tailed activity period distributions, Technical Report PNA-R9705, CWI, Amsterdam, The Netherlands, April 1997.
[BOX 00] BOXMA O.J., COHEN J.W., “The single server queue: Heavy tails and heavy traffic”, in PARK K., WILLINGER W. (Eds.), Self-Similar Network Traffic and Performance Evaluation, John Wiley & Sons, 2000.
[BRI 96] BRICHET F., ROBERTS J., SIMONIAN A., VEITCH D., “Heavy traffic analysis of a storage model with long range dependent on/off sources”, Queueing Systems, vol. 23, p. 197–225, 1996.
[CHO 97] C HOUDHURY G.L., W HITT W., “Long-tail buffer-content distributions in broadband networks”, Performance Evaluation, vol. 30, p. 177–190, 1997. [COH 73] C OHEN J.W., “Some results on regular variation for the distributions in queueing and ﬂuctuation theory”, Journal of Applied Probability, vol. 10, p. 343–353, 1973. [COX 84] C OX D.R., “Statistics: an appraisal”, in DAVID H.A., DAVID H.T. (Eds.), Long-Range Dependence: A Review, Iowa State University Press, Ames, USA, p. 55–74, 1984. [CUN 95] C UNHA C., B ESTAVROS A., C ROVELLA M., Characteristics of WWW client-based traces, Technical Report, Boston University, Boston, Massachusetts, July 1995. [DUF 94] D UFFY D.E., M CINTOSH A.A., ROSENSTEIN M., W ILLINGER W., “Statistical analysis of CCSN/SS7 trafﬁc data from working CCS subnetworks”, IEEE Journal on Selected Areas in Communications, vol. 12, no. 3, 1994.

Scale Invariance in Computer Network Trafﬁc

435

[ERR 90] E RRAMILLI A., S INGH R.P., “Application of deterministic chaotic maps to characterize broadband trafﬁc”, in Proceedings of the Seventh ITC Specialist Seminar (Livingston, New Jersey), 1990. [ERR 93] E RRAMILLI A., W ILLINGER W., “Fractal properties in packet trafﬁc measurements”, in Proceedings of the ITC Specialist Seminar (Saint Petersburg, Russia), 1993. [ERR 95] E RRAMILLI A., S INGH R.P., P RUTHI P., “An application of deterministic chaotic maps to model packet trafﬁc”, Queueing Systems, vol. 20, p. 171–206, 1995. [FAL 90] FALCONER K., Fractal Geometry: Mathematical Foundations and Applications, John Wiley & Sons, 1990. [FEL 98] F ELDMANN A., G ILBERT A., W ILLINGER W., “Data networks as cascades: Explaining the multifractal nature of internet WAN trafﬁc”, in ACM/Sigcomm’98 (Vancouver, Canada), 1998. [KUR 96] K URTZ T.G., “Limit theorems for workload input models”, in K ELLY F.P., Z ACHARY S., Z IEDINS I. (Eds.), Stochastic Networks: Theory and Applications, Clarendon Press, Oxford, p. 119–140, 1996. [LEL 91] L ELAND W.E., W ILSON D.V., “High time-resolution measurement and analysis of LAN trafﬁc: Implications for LAN interconnection”, in Proceedings of the IEEE Infocom’91 (Bal Harbour, Florida), p. 1360–1366, 1991. [LEL 93] L ELAND W.E., TAQQU M.S., W ILLINGER W., W ILSON D.V., “On the self-similar nature of Ethernet trafﬁc”, Computer Communications Review, vol. 23, p. 183–193, 1993. [LEL 94] L ELAND W.E., TAQQU M.S., W ILLINGER W., W ILSON D.V., “On the self-similar nature of Ethernet trafﬁc (extended version)”, IEEE/ACM Transactions on Networking, vol. 2, no. 1, p. 1–15, 1994. [LEV 97] L ÉVY V ÉHEL J., R IEDI R.H., “Fractional Brownian motion and data trafﬁc modeling: The other end of the spectrum”, in L ÉVY V ÉHEL J., L UTTON E., T RICOT C. (Eds.), Fractals in Engineering’97, Springer, 1997. 
[MEI 91] M EIER -H ELLSTERN K., W IRTH P.E., YAN Y.L., H OEFLIN D.A., “Trafﬁc models for ISDN data users: Ofﬁce automation application”, in Proceedings of the Thirteenth ITC (Copenhagen, Denmark), p. 167–172, 1991. [NAR 98] NARAYAN O., “Exact asymptotic queue length distribution for fractional Brownian trafﬁc”, Adv. Perf. Analysis, vol. 1, no. 39, 1998. [NOR 94] N ORROS I., “A storage model with self-similar input”, Queueing Systems, vol. 16, p. 387–396, 1994. [PAW 88] PAWLITA P.F., “Two decades of data trafﬁc measurements: A survey of published results, experiences, and applicability”, in Proceedings of the Twelfth International Teletrafﬁc Congress (ITC 12, Turin, Italy), 1988. [PAX 94a] PAXSON V., F LOYD S., “Wide-area trafﬁc: The failure of Poisson modeling”, IEEE/ACM Transactions on Networking, vol. 3, no. 3, p. 226–244, 1994. [PAX 94b] PAXSON V., F LOYD S., “Wide-area trafﬁc: The failure of Poisson modeling”, in Proceedings of SIGCOMM’94, 1994.

436

Scaling, Fractals and Wavelets

[RAM 88] R AMASWAMI V., “Trafﬁc performance modeling for packet communication – Whence, where, and whither?”, in Proceedings of the Third Australian Teletrafﬁc Research Seminar, vol. 31, November 1988. [RIE 95] R IEDI R.H., “An improved multifractal formalism and self-similar measures”, J. Math. Anal. Appl., vol. 189, p. 462–490, 1995. [RIE 99] R IEDI R.H., C ROUSE M.S., R IBEIRO V.J., BARANIUK R.G., “A multifractal wavelet model with application to network trafﬁc”, IEEE Transactions on Information Theory (special issue on “Multiscale Statistical Signal Analysis and its Applications”), vol. 45, no. 3, p. 992–1018, 1999. [ROU 99] ROUGHAN M., YATES J., V EITCH D., “The mystery of the missing scales: Pitfalls in the use of fractal renewal processes to simulate LRD processes”, in ASA-IMA Conference on Applications of Heavy Tailed Distributions in Economics, Engineering, and Statistics (American University, Washington, USA), June 1999. [RYU 96] RYU B.K., L OWEN S.B., “Point process approaches to the modeling and analysis of self-similar trafﬁc. Part I: Model construction”, in IEEE INFOCOM’96: The Conference on Computer Communications (San Francisco, California), IEEE Computer Society Press, Los Alamitos, California, vol. 3, p. 1468–1475, March 1996. [SIM 99] S IMONIAN A., M ASSOULIÉ L., “Large buffer asymptotics for the queue with FBM input”, Journal of Applied Probability, vol. 36, no. 3, 1999. [STE 94] S TEVENS W., TCP/IP Illustrated. Volume 1: The Protocols, Addison-Wesley, 1994. [TAN 88] TANNENBAUM A.S., Computer Networks, Prentice Hall, Second Edition, 1988. [TAQ 95] TAQQU M.S., T EVEROVSKY V., W ILLINGER W., “Estimators for long-range dependence: An empirical study”, Fractals, vol. 3, no. 4, p. 785–798, 1995. [TAQ 97] TAQQU M.S., W ILLINGER W., S HERMAN R., “Proof of a fundamental result in self-similar trafﬁc modeling”, Computer Communication Review, vol. 27, p. 5–23, 1997. 
[VEI 93] V EITCH D., “Novel models of broadband trafﬁc”, in IEEE Globecom’93 (Houston, Texas), p. 1057, November 1993. [VEI 99] V EITCH D., A BRY P., “A wavelet based joint estimator of the parameters of long-range dependence”, IEEE Transactions on Information Theory (special issue on “Multiscale Statistical Signal Analysis and its Applications”), vol. 45, no. 3, p. 878–897, 1999. [WIL 95] W ILLINGER W., TAQQU M.S., S HERMAN R., W ILSON D.V., “Self-similarity through high-variability: Statistical analysis of the Ethernet LAN trafﬁc at the source level”, in Proceedings of the ACM/SIGCOMM’95 conference, 1995 (available at the address: http://www.acm.org/sigcomm/sigcomm95/sigcpapers.html). [WIL 96] W ILLINGER W., TAQQU M.S., E RRAMILLI A., “A bibliographical guide to self-similar trafﬁc and performance modeling for modern high-speed networks”, in K ELLY F.P., Z ACHARY S., Z IEDINS I. (Eds.), Stochastic Networks: Theory and Applications, Clarendon Press, Oxford, p. 339–366, 1996.

Chapter 13

Research of Scaling Law on Stock Market Variations

13.1. Introduction: fractals in finance

Stock market graphs representing changes in the prices of securities over a period of time appear as irregular forms that seem to be reproduced and repeated at all scales of analysis: rising periods follow periods of decline. However, the rises are broken up by intermediate falling phases and the falls are interspersed with partial rises, and this goes on until the natural limit of the quotation scale is reached. This entanglement of repetitive patterns of rising and falling waves at all scales was discovered in the 1930s by Ralph Elliott, to whom the idea occurred while observing the ebb and flow of tides on the sands of a seashore. From this, he formulated a financial symbolization known as “stock market waves” or “Elliott’s waves,” which he broke up into huge tides, normal waves and wavelets, and also “tsunami”, from the name given in Japan to huge waves arising from earthquakes. The theory called “Elliott’s waves” [ELL 38] presents a deterministic fractal description of the stock market based on self-similar geometric figures found at all scales of observation, and compiles a toolbox, in the form of graphic analysis of stock market fluctuations, used by certain market professionals: technical analysts. Elliott’s figures propose a calibration of rises and falls using a Pythagorean numerology based on the golden ratio and the Fibonacci sequence; the resulting predictions are strongly tinged with subjectivity, in so far as the detection and positioning of waves depend on the graphic analyst’s view of the

Chapter written by Christian WALTER.


market which he examines. For lack of an appropriate mathematical tool, this conceptualization of stock market variations remained confined, like alchemy before chemistry, to the pre-scientific field until the emergence of fractals. The fractals of Benoît Mandelbrot, though developed through a radically different approach, fit this understanding of stock market variations and share with Elliott’s waves the aim of untangling the inextricable interlacing of stock market fluctuations at all scales. In stock market language: are we in a downward correction of a rising phase, or in a falling period interrupted by a temporary rise? Fractals provide an adequate conceptualization, allowing the intuitions of graphic analysts to be translated into rigorous mathematical representations.

However, the adventure of fractals in finance has not had a smooth history. It is rather an eventful progression of Mandelbrot’s assumptions through the evolution of finance theory over 40 years, from 1960 until today, which stirred up a vehement controversy over modeling with infinite variance or infinite memory. The connecting thread of Mandelbrot’s works, followed by others (including his critics), was the search for scaling laws in stock market fluctuations, whether this search followed the direction of scaling invariance, the pure fractal approach to markets proposed by Mandelbrot, or that of a multiscale analysis of markets, such as mixed processes or ARCH-type processes, which emerged in the 1980s, or regime-change models in the 1990s.

The starting point of this controversy was the existence of leptokurtic (or non-Gaussian) distributions of stock market variations.
This distributional anomaly with respect to the Brownian hypothesis of traditional financial modeling led Mandelbrot to propose, in 1962, replacing the Gaussian distribution with Paul Lévy’s α-stable distributions, which have infinite variance, for modeling periodic returns. However, this new hypothesis very soon provoked a relatively fierce controversy concerning the existence of the variance, and other candidate processes appeared, all the more easily as the scaling invariance of α-stable laws, the cardinal property of Mandelbrot’s fractal hypothesis, was not validated experimentally, or only with difficulty. The attempt to resolve the leptokurtic problem by keeping the iid hypothesis and proposing α-stable distributions did not remove all the anomalies, since a new type of anomaly, scaling anomalies, appeared. Theoretical research therefore sought other ways of modeling leptokurticity and turned towards the second pillar of financial modeling: the hypothesis of independence of successive returns, which was equally challenged. Hence, the cause of the observed leptokurticity was sought in different forms of dependence between returns (linear and then non-linear dependence). This is the second round of empirical investigations. After the absence of short memory in returns was highlighted, research turned towards the detection of long memory in returns.


This attempt did not succeed either. The focus then shifted to the volatility process: first with the formalization of short memory in volatilities, an approach that led to ARCH modeling, and then with the highlighting of long memory in volatilities (or long-range dependence), a trend that led to the rediscovery of scaling laws in finance. Finally, the fractal hypothesis was validated on the generating process of stock market volatilities. Today, the long memory of volatilities (i.e., a hyperbolic law for the correlations among volatilities) has become a recognized fact of financial markets, and financial modeling seeks to reconcile the absence of memory in returns with the presence of long memory in volatilities.

After a first part, which briefly outlines the quantities followed in finance and traditional financial modeling, we present a review of theoretical works and research results on scaling laws for the description of stock market variations¹. This review clearly shows several distinct periods, and we thus propose a chronology of this search for scaling laws, i.e. a periodization illustrating the conceptual transfers to which finance has been subject for 40 years.

The chronology is as follows:
– during the first period, from 1960 to 1970, Mandelbrot’s proposals and the first promising discoveries of scaling invariance on the markets launched a debate in the university community, by introducing iid-α-stable and H-correlative models;
– this debate developed during the period 1970-1990 and seemed to end with the experimental rejection of fractals in finance on stock market returns;
– however, parallel to fractals, developments in time series econometrics, with fractionally integrated processes of ARFIMA type from the 1980s and then FIGARCH in the 1990s, led, in 1990-2000, to a rediscovery of scaling laws in the process of stock market volatilities, using long memory concepts;
– finally, the measurement of time itself has become an object of research, with the recent developments of modeling in time deformation.

13.2. Presence of scales in the study of stock market variations

13.2.1. Modeling of stock market variations

13.2.1.1. Statistical apprehension of stock market fluctuations

When we want to statistically describe the behavior of a stock market between two dates 0 and T, with the aim of proposing a probabilistic model for it, two “natural”

1. Mathematical aspects of fractal modeling in general, developed in many works, are not dealt with here. Speciﬁc aspects of fractal modeling in ﬁnance and examples of application of iid-α-stable model are presented in detail in [LEV 02].


interpretations of the available data (price quotations) are possible. We may consider the prices quoted between 0 and T at a fixed frequency of observation, which can be a day, a month or a quarter, but also an hour or five minutes. This amounts to subdividing the interval [0, T] into n periods of equal length τ = T/n, this duration τ defining a “characteristic time” of market observation. Alternatively, we may consider the price quoted in every transaction that took place between 0 and T, which means splitting up the interval [0, T] as $0 = t_0 < t_1 < \dots < t_n = T$ and working in “time deformation”, or “transaction-time”, $t_j$ being the moment of the j-th transaction.

The first approach appears the most immediate but, because of the discontinuous and irregular nature of stock market quotations, it is possible that the price recorded on date t does not correspond to a real exchange on the market, i.e., to an equilibrium of supply and demand at the time of quoting: in that case, the economic significance of the statistical analysis could be very weak. Moreover, when the frequency of quoting is higher than daily, the variations between the previous day’s closing price and the following day’s opening price are treated as intra-daily variations. Finally, quoting in physical time assumes that market activity is broadly uniform during the observation period, which is generally not the case. Hence the interest of the second approach, which follows the succession of equilibrium prices between supply and demand.

These two approaches coexist in financial modeling, and this alternative leads us to consider the issue of the adequate time in which to measure stock market fluctuations, raised for the first time by Mandelbrot and Taylor [MAND 67b] and by Clark [CLA 73], who introduced the concept of “time-information”, where information is associated with the volume of transactions².

The first analysis (calendar time) is widely used, but the second research trend (time deformation) has begun to attract new interest. This interest results from (and reflects) the change in the computing environment of stock markets, which translates into an increasing abundance of available price data: whereas prices have been recorded daily since the 19th century, during the 1980s they were quoted every minute and then, in the 1990s, after each transaction. Thus, sample sizes increased by several powers of 10. Statistical tests carried out on markets during the 1960s used approximately $10^3$ data points. Those in the early 1990s processed nearly $10^5$. The most recent investigations examine nearly $10^7$.

2. We can observe that this approach is similar to that of Maurice Allais who introduced the concept of “psychological time” in economics (see for example [ALL 74]).


In calendar time, the basic modeling is as follows. Let S(t) be the price of asset S on date t. The variation of price between 0 and T is:

$$S(T) = S(0) + \sum_{k=1}^{n} \Delta S(k), \qquad n = \frac{T}{\tau} \tag{13.1}$$

The notation ΔS(k) represents the price variation of asset S between the dates t − τ and t, where t is expressed in multiples of the basic time step τ:

$$\Delta S(t, \tau) = S(t) - S(t - \tau) = S(k\tau) - S\big((k-1)\tau\big) = \Delta S(k) \tag{13.2}$$

In transaction-time, prices are quoted at every transaction made and price variations between two successive transactions are taken into consideration. Let N(t) be the number of transactions³ made between dates 0 and t. The variation of price between 0 and T in this case is:

$$S(T) = S(0) + \sum_{j=1}^{N(T)} \Delta S(j) \tag{13.3}$$

The notation ΔS(j) represents the price variation of the asset S between transactions j − 1 and j: ΔS(j) = S(j) − S(j − 1). The pricing process {S(j)} is therefore indexed by a time of stock market operation, or “transaction-time,” denoted by θ(t): S(j) = S(θ(t)).

Finally, market professionals generally say that the value of a quoted price (and thus the relevance of the measurement) is not the same depending on whether this price corresponds to a transaction of 500,000 securities or of 5 securities. Therefore, the concepts of market “depth” and exchange “weight” come into play. We measure the intensity of the exchange, or “activity level” of the market, by the quantity of securities exchanged, or volume of transactions. Financial modeling has taken this element into account, and the role of the volume of transactions in the evolution of prices is analyzed⁴ today by introducing a volume process in financial modeling.
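The two ways of reading the same quotations, (13.1) in calendar time and (13.3) in transaction-time, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction dates and prices are made up, and on the calendar grid the price at date kτ is taken to be the last transacted price at or before that date (a "previous-tick" convention that the text does not prescribe).

```python
from bisect import bisect_right

def calendar_increments(times, prices, T, n):
    """Increments ΔS(k) on a fixed grid of step tau = T/n, as in (13.1).

    Assumption: S(k*tau) is the last transacted price at or before k*tau.
    """
    tau = T / n
    sampled = []
    for k in range(n + 1):
        i = bisect_right(times, k * tau) - 1  # last transaction at or before k*tau
        sampled.append(prices[i])
    return [sampled[k] - sampled[k - 1] for k in range(1, n + 1)]

def transaction_increments(prices):
    """Increments ΔS(j) between successive transactions, as in (13.3)."""
    return [prices[j] - prices[j - 1] for j in range(1, len(prices))]

# Hypothetical data: transaction dates 0 = t_0 < t_1 < ... < t_N = T and prices.
times = [0.0, 0.4, 0.5, 2.7, 3.1, 6.0]
prices = [100.0, 100.5, 100.25, 101.0, 100.75, 102.0]
T, n = 6.0, 3

d_cal = calendar_increments(times, prices, T, n)
d_tx = transaction_increments(prices)

# Both decompositions telescope to the same total variation S(T) - S(0).
print(d_cal, sum(d_cal))
print(d_tx, sum(d_tx))
```

The calendar grid produces n increments whatever the market activity, whereas the transaction-time series has one increment per trade, N(T) in total; only their sums coincide.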

3. The process of transaction arrivals N(t) was studied by Hasbrouck and Ho [HAS 87] and more recently by Ghysels et al. [GHY 97]. Evertsz [EVE 95b] showed that the distribution of waiting times between two quotations follows a Pareto power law, whose exponent value implies an infinite mathematical expectation.
4. See, for example, Lamoureux and Lastrapes [LAM 94] or Gouriéroux and Le Fol [GOU 97b], who give a synthetic view of this question. Maillet and Michel [MAI 97] showed that the distribution of volumes follows a Pareto power law.


Let V(t) be the total volume of securities exchanged between 0 and t. The total volume of securities exchanged between 0 and T is:

$$V(T) = \sum_{j=1}^{N(T)} \upsilon(j) \tag{13.4}$$

The notation υ(j) represents the volume of securities exchanged during transaction j. The volume process {V(j)} is indexed by transaction-time.

The price quoted on date T is therefore the result of the simultaneous effect of three factors or processes between 0 and T: the transaction process N(t), the process of price variations between two transactions ΔS(j) and the volume process υ(j).

13.2.1.2. Profit and stock market return operations in different scales

From basic data such as the prices quoted over the period [0, T], three quantities are of interest in finance:

– the profit realized on the security during the period [0, T], defined by:

$$G(T) = S(T) - S(0) \tag{13.5}$$

– the rate of return of the security over the period [0, T], defined by:

$$R(T) = \frac{S(T) - S(0)}{S(0)} = \frac{G(T)}{S(0)} \tag{13.6}$$

– the continuous rate of return of the security over the period [0, T], defined by:

$$r(T) = \ln\big(1 + R(T)\big) = \ln S(T) - \ln S(0) \tag{13.7}$$

We are interested in the evolution of these quantities over successive sub-periods [t − τ, t]. The periodic gain realized on the security during the sub-period [t − τ, t] is:

$$\Delta G(t, \tau) = G(t) - G(t - \tau) = S(t) - S(t - \tau) = \Delta S(t, \tau) \tag{13.8}$$

The periodic rate of return realized on the security during the sub-period [t − τ, t] is:

$$\Delta R(t, \tau) = \frac{S(t) - S(t - \tau)}{S(t - \tau)} = \frac{\Delta S(t, \tau)}{S(t - \tau)}$$

or:

$$1 + \Delta R(t, \tau) = \frac{S(t)}{S(t - \tau)} = \frac{1 + R(t)}{1 + R(t - \tau)} \tag{13.9}$$


The periodic continuous rate of return realized on the security during the sub-period [t − τ, t] is:

$$\Delta r(t, \tau) = \ln\big(1 + \Delta R(t, \tau)\big) = \ln S(t) - \ln S(t - \tau) = r(t) - r(t - \tau) \tag{13.10}$$

When data are obtained at high frequency, ΔR(t, τ) is “small” and we have ln(1 + ΔR(t, τ)) ≈ ΔR(t, τ). Expressions (13.9) and (13.10) are then very close and periodic security returns can be measured by one or the other. The temporal aggregation of returns is obtained by combining expressions (13.6) and (13.9); we have (t = kτ):

$$1 + R(T) = \prod_{t=\tau}^{T} \big(1 + \Delta R(t, \tau)\big) = \prod_{k=1}^{n} \big(1 + \Delta R(k)\big) \tag{13.11}$$

In the same way, (13.7) and (13.10) lead to:

$$r(T) = \sum_{t=\tau}^{T} \Delta r(t, \tau) = \sum_{k=1}^{n} \Delta r(k) \tag{13.12}$$
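The aggregation rules (13.11) and (13.12) are easy to check numerically. The following minimal sketch uses a made-up price series sampled at step τ (the numbers are illustrative only) and verifies that periodic rates of return compound multiplicatively while continuous returns simply add up.

```python
import math

def periodic_returns(prices):
    """ΔR(k) = S(kτ)/S((k-1)τ) - 1, cf. (13.9)."""
    return [prices[k] / prices[k - 1] - 1.0 for k in range(1, len(prices))]

def periodic_log_returns(prices):
    """Δr(k) = ln S(kτ) - ln S((k-1)τ), cf. (13.10)."""
    return [math.log(prices[k]) - math.log(prices[k - 1])
            for k in range(1, len(prices))]

# Hypothetical quotations S(0), S(τ), ..., S(nτ) = S(T).
prices = [100.0, 101.0, 99.5, 102.0, 103.5]

R_T = prices[-1] / prices[0] - 1.0                   # (13.6)
r_T = math.log(prices[-1]) - math.log(prices[0])     # (13.7)

compounded = math.prod(1.0 + dR for dR in periodic_returns(prices))  # (13.11)
summed = sum(periodic_log_returns(prices))                           # (13.12)

print(1.0 + R_T, compounded)
print(r_T, summed)

# For small variations, ln(1 + ΔR) is close to ΔR.
for dR, dr in zip(periodic_returns(prices), periodic_log_returns(prices)):
    print(f"ΔR = {dR:+.5f}   Δr = {dr:+.5f}")
```

The additivity of Δr(k) is what makes the continuous rate of return the natural quantity for studying how distributions change with the scale τ.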

13.2.1.3. Traditional financial modeling: Brownian motion

Modeling stock market variations leads us to assume that S(t) is a random variable: the sequence of values S(1), S(2), S(3), etc. is then considered as the values, on certain dates t, of a process in continuous time. The analysis of stock market variations thus leads to the study of the stochastic process {S(t), t ≥ 0}, or of the associated processes {R(t), t ≥ 0} or {r(t), t ≥ 0}, and of their increments. The usual hypothesis of financial theory assumes that these random processes have independent and identically distributed (iid) increments of finite variance, which we abbreviate as “iid-Gaussian” modeling.

The iid-Gaussian hypothesis has been the subject of much controversy for the last 50 years. The iid sub-hypothesis has been intensively tested in theoretical works⁵. Today, it is admitted that, for the computation of usual valuation and hedging models, this assumption is valid as a first approximation when τ is more than 10 minutes (see, for example, [BOU 97]). It is more convenient to work with the distribution of returns at scale τ than with that of returns at scale T = nτ because, in this case, if P(Δr(t, τ)) is the probability distribution of periodic

5. See [CAM 97] for a complete review of the different ways to statistically test this and also the results obtained.


returns Δr(t, τ), then:

$$P\big(\Delta r(t, n\tau)\big) = \Big[P\big(\Delta r(t, \tau)\big)\Big]^{\otimes n} \tag{13.13}$$

where ⊗ represents the convolution operator. From the probabilistic point of view, the advantage of this hypothesis is purely computational. From the economic point of view, the independence of returns means that the available and relevant information for the valuation of financial assets is correctly passed on into quoted prices, which is the beginning of a concept of informational market efficiency; stationarity means that the economic characteristics of the observed phenomenon do not change much in the course of time. The existence of the variance limits the fluctuation of returns: no stock market crash, no stock market boom.

The first formal model representing stock market variations was proposed [BAC 00] in 1900 by Louis Bachelier⁶ and based on profits (13.8), then modified in 1959 by Osborne for returns (13.9) and (13.10):

$$\frac{dS(t)}{S(t)} = dr(t) = \mu\,dt + \sigma\,dW(t) \tag{13.14}$$

with coefficients μ ∈ R and σ > 0, where W(t) is a standard Brownian motion⁷, i.e., E(W(1)) = 0 and E(W(1)²) = 1. The coefficient μ represents the expectation of instantaneous returns on the share purchased. The risk of a financial asset is generally measured by the coefficient σ of the Brownian motion, called “volatility” by market professionals: this is a measure of the potential dispersion of stock market returns. There are other risk measures, all based on this idea of conditional variability of returns over a given time (see [GOU 97b]).

The solution of (13.14) is obtained by setting X(t) = ln S(t) and applying Itô’s differentiation formula to dX(t). We obtain:

$$S(t) = S(0) \exp\left(\left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W(t)\right), \qquad t \in [0, T] \tag{13.15}$$

which is considered as the standard model of stock market variations.
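The step from (13.14) to (13.15) can be written out as follows. This is a sketch of the standard Itô calculation; it relies on the usual rule $(dW(t))^2 = dt$ and on $W(0) = 0$, neither of which is stated explicitly in the text.

```latex
\begin{align*}
X(t) &= \ln S(t), \\
dX(t) &= \frac{dS(t)}{S(t)} - \frac{1}{2}\,\frac{\big(dS(t)\big)^{2}}{S(t)^{2}}
       = \big(\mu\,dt + \sigma\,dW(t)\big) - \frac{1}{2}\,\sigma^{2}\,dt
       = \Big(\mu - \frac{\sigma^{2}}{2}\Big)\,dt + \sigma\,dW(t), \\
X(t) &= X(0) + \Big(\mu - \frac{\sigma^{2}}{2}\Big)\,t + \sigma\,W(t)
\quad\Longrightarrow\quad
S(t) = S(0)\,\exp\!\Big(\Big(\mu - \frac{\sigma^{2}}{2}\Big)\,t + \sigma\,W(t)\Big).
\end{align*}
```

The second-order term of Itô's formula is what produces the correction $-\sigma^{2}/2$ in the exponent of (13.15).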

6. A biography of Louis Bachelier has been compiled by Courtault et al. [COU 00]. For a description of the financial aspects of Bachelier’s work and their impact on the finance industry, see [WAL 96]. For an understanding of Bachelier’s probabilistic work in the context of his time, see [TAQ 00].
7. Let us note that Bachelier did not know Brownian motion in the strict sense of its definition, since this definition was only given in 1905 by Einstein and then in 1923 by Wiener. However, Bachelier assumes that the successive differences of the form ΔS(t, τ) are independent, Gaussian and of variance proportional to the time interval τ, which amounts to describing the Wiener process.


13.2.2. Time scales in financial modeling

13.2.2.1. The existence of characteristic time

If we choose to model in physical time, with a fixed time step, the first question that arises is that of selecting the time step τ, i.e., the resolution scale of market analysis: should we examine daily, weekly or monthly variations? Which observation scale is the most appropriate for capturing the statistical structure of stock market variations? A question of a financial nature thus appears: should the probability law which governs stock market variations be the same at all scales?

If we understand each time scale as representing an investment horizon for a given category of operators, there is apparently no particular reason for variations corresponding to a short trading horizon and those corresponding to the long horizon of a portfolio manager to be modeled by the same probability law. Equation (13.12) shows that, in the iid case, the return at scale T is the sum of returns at scale τ. Generally, when we add iid random variables, the resulting probability law is different from the initial probability law. Thus, a multiscale analysis seems, at first sight, inevitable if we do not wish to lose information on market behavior at each characteristic time scale of a given economic phenomenon.

The first analyses of market behavior used only one observation frequency, often monthly. Mandelbrot was the first, in 1962, to introduce the concept of simultaneous analysis on several scales, in order to compare the distributions of periodic returns Δr(t, τ) across different scales τ. Mandelbrot sought to establish an invariance under change of scale for periodic returns (i.e., a fractal structure of the market).

If P(Δr(t, τ)) is the probability distribution of periodic returns Δr(t, τ), relation (13.13) simplifies to:

$$\Big[P\big(\Delta r(t, \tau)\big)\Big]^{\otimes n} = n^{H}\, P\big(\Delta r(t, \tau)\big) \tag{13.16}$$

where H is a self-similarity exponent, which means that the process of returns {r(t), t ≥ 0} is self-similar with exponent H:

$$r(T) = r(n\tau) \overset{\mathcal{L}}{=} n^{H}\, r(\tau), \qquad T = n\tau \tag{13.17}$$

where $\overset{\mathcal{L}}{=}$ symbolizes equality in distribution.

In such a market model, an important consequence of the fractal hypothesis is the absence of a preferential observation scale, or of a discriminating characteristic time, for statistical observation. It then becomes possible to estimate the probability law for a long horizon from the study of stock market fluctuations over a short horizon: the distribution of returns at the long-term horizon T is obtained from the distribution of returns at the short-term horizon τ by means of relation (13.17). In other words, by observing the market at any scale, we can access its fundamental behavioral structure: the probability law which characterizes stock market fluctuations is independent of the scale of these fluctuations.
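Relation (13.17) can be illustrated on simulated data. The sketch below assumes iid Gaussian returns at scale τ, a case in which self-similarity is known to hold with H = 1/2; the sample size, the seed and the dispersion at scale τ are arbitrary choices made for the illustration. It compares the dispersion observed at scale nτ with the one predicted from scale τ through the factor n^H.

```python
import random
import statistics

def aggregate(increments, n):
    """Sum consecutive blocks of n increments: returns at scale n*tau, cf. (13.12)."""
    blocks = len(increments) // n
    return [sum(increments[b * n:(b + 1) * n]) for b in range(blocks)]

def rescale(quantity, n, H):
    """Carry a scale-tau quantity to scale n*tau with the factor n**H of (13.17)."""
    return quantity * n ** H

rng = random.Random(0)   # fixed seed so that the run is reproducible
sigma_tau = 0.01         # hypothetical standard deviation of returns at scale tau
r_tau = [rng.gauss(0.0, sigma_tau) for _ in range(120_000)]

n, H = 12, 0.5           # H = 1/2 is the Gaussian (Brownian) case assumed here
r_T = aggregate(r_tau, n)

observed = statistics.pstdev(r_T)
predicted = rescale(statistics.pstdev(r_tau), n, H)
print(f"dispersion at scale n*tau: observed {observed:.4f}, predicted {predicted:.4f}")
```

With real quotations, the same comparison carried out over several values of n is one simple way of examining whether a single exponent H describes all scales.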


13.2.2.2. Implicit scaling invariances of traditional financial modeling

However, traditional financial modeling also has fractal properties: Brownian motion is a self-similar process of exponent H = 1/2. In particular, its increment Δr over a time τ follows a scaling law such that:

$$\Delta r(t, \tau) \sim \tau^{1/2} \tag{13.18}$$

The distribution of the ratio $\Delta r / \tau^{1/2}$ is independent of time. Translated into financial terms, the order of magnitude of a security’s return over a given time is proportional to the square root of this time. In the theory of finance, it is stated that returns and the associated risk are proportional to the time spent. Relation (13.18) gives this proportionality, irrespective of the time scale (duration) considered. Hence, there is an invariance of the law of returns under change of scale: the law of security returns does not depend on the duration for which the security is held. Thus, the theoretical risk of a financial asset grows as the square root (exponent 1/2) of the holding time of this asset. Market participants constantly apply this fractal property when they “annualize” volatility by means of formula (13.18). Thus, for example, the volatility over 12 months is equal to the volatility over one month multiplied by the square root of 12. This calculation of a long-term risk level from short-term risk is also at the base of the banking industry’s prudential thinking on the control of risks in market operations (see [BAL 94, BAL 96, BAL 98, BAL 99, IOS 94]).

13.3. Modeling postulating independence on stock market returns

13.3.1. 1960-1970: from Pareto’s law to Lévy’s distributions

13.3.1.1. Leptokurtic problem and Mandelbrot’s first model

The scaling character of quoted price fluctuations in stock markets was first established by Mandelbrot through the study of variations in cotton prices between 1880 and 1958. This is the first trace of an explicit comment about the existence of scaling phenomena in stock market variations. This existence was highlighted by the study of distribution tails, which brought out the connection between the discovery of scaling laws and the appropriate treatment of large stock market variations.

From the first statistical studies of stock market fluctuations, it was established that the empirical distributions of successive returns contained too many tail points to be fitted by Gaussian densities: the empirical distributions obtained were all leptokurtic. This problem of large stock market variations was not solved and was temporarily abandoned by research for lack of appropriate means to model it. Moreover, this problem of abnormal distribution tails was an old one and dated back to Pareto, who had invented the law which carries his name precisely to give an account of the distribution


of revenues in an economy on a given date, and which is a power law. However, Pareto’s law did not seem to have the status of a limit law in probability and was not used in finance.

Thus, Mandelbrot tackled the problem of the large values of the empirical distribution functions of returns Δr(t, τ) = ln S(t) − ln S(t − τ), where S(t) is the closing price of cotton on date t, with two values for τ: a month and a day. By calculating the expressions Fr(Δr(t, τ) > u) for positive values and Fr(Δr(t, τ) < −u) for negative values, where Fr designates the cumulative frequency of variations Δr(t, τ), a double fit to a Pareto law of exponent α is obtained:

$$\ln \mathrm{Fr}\big(\Delta r(t, \tau) > u\big) \approx -\alpha \ln u + \ln C(\tau) \tag{13.19a}$$

$$\ln \mathrm{Fr}\big(\Delta r(t, \tau) < -u\big) \approx -\alpha \ln u + \ln C(\tau) \tag{13.19b}$$

Noting that the fitted lines corresponding to the distributions for τ = a day and τ = a month are parallel, Mandelbrot deduces that the distribution laws of Δr for τ = a day and τ = a month differ only by a change of scale, and hence proposes a new model of price variation: keeping the iid assumptions, the stability of the phenomenon between a day and a month is interpreted as a trace of stability in Lévy’s sense.

A random variable X is called stable in Lévy’s sense, or α-stable, if, for any pair $c_1, c_2 > 0$, there exist α ∈ ]0, 2] and d ∈ R such that:

$$c_1 X_1 + c_2 X_2 \equiv cX + d, \qquad c^{\alpha} = c_1^{\alpha} + c_2^{\alpha} \tag{13.20}$$

where ≡ symbolizes equality in distribution and where $X_1$ and $X_2$ are two independent copies of X. In the case where d = 0, X is said to be strictly stable. The exponent α ∈ ]0, 2] is the characteristic exponent of stable laws.

Mandelbrot’s inference comes from the following property of stable laws. If X follows a stable law of characteristic exponent α, then it can be shown that:

$$P(X \geq x) = 1 - F(x) = x^{-\alpha}\left[\frac{A_1}{\pi\alpha} + \frac{A_2}{2\pi\alpha}\,x^{-\alpha} + O\big(x^{-2\alpha}\big)\right] \tag{13.21}$$

where $A_1$ and $A_2$ are quantities independent of α. In addition, by definition, a random variable follows Pareto’s law in its upper tail if:

$$P\big(X \geq x \mid x \geq x(0)\big) = 1 - F(x) = x^{-\alpha}\, h(x) \tag{13.22}$$

where α is called Pareto’s index and where h(x) is a slowly varying function, i.e., lim h(tx)/h(x) = 1 when x → ∞ for any t > 0. When h(x) is a constant function, the law is said to be Pareto in the strict sense. By writing $h(x) = [A_1/(\pi\alpha) + (A_2/(2\pi\alpha))\,x^{-\alpha} + O(x^{-2\alpha})]$, relations (13.21) and (13.22) show that α-stable laws are asymptotically Paretian with tail

index α: this is the reason why, in his 1962 communication, Mandelbrot concludes that “Paretian character [...] is “predicted” or “confirmed” by stability”. The second important fact of this empirical evidence concerns the value of α, equal to 1.7. No moment of order higher than α exists. Thus, since α is lower than 2, the variance of Δr(t, τ) is infinite.

13.3.1.2. First emphasis of Lévy’s α-stable distributions in finance

Mandelbrot [MAND 63] developed and summarized the modeling of price variations proposed in 1962: “this was the first model that I have elaborated to describe the price variation practiced on certain stock exchanges of raw materials in a realistic way,” (see [MAND 97a], French edition, p. 128). We can describe this first model as “iid-α-stable”, insofar as the iid hypotheses are kept and the characteristic exponent α of stable laws goes from 2 (Gauss) to a value α < 2. This model, which made it possible to account for the sudden, unforeseen “floods” that hit stock markets, was named “Noah’s effect” by Mandelbrot, in reference to the biblical flood (see [MAND 73b]). Fama [FAM 65] and then Mandelbrot [MAND 67a] followed up the initial investigations and validated the model. Finally, in 1968, the first tabulations of symmetric α-stable laws were carried out by Fama and Roll [FAM 68], which made it possible to construct the first estimators of the parameters of these distributions.

13.3.2. 1970–1990: experimental difficulties of iid-α-stable model

13.3.2.1. Statistical problem of parameter estimation of stable laws

As Fama indicated in 1965, this first evidence for Lévy’s distributions was fragile because the methods for estimating the characteristic exponent α were not very reliable: fitting the distribution tails on a log-log graph is very sensitive to the subjective choice of the point from which the distribution tail is taken to begin.
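
The sensitivity of the log-log tail fit to the chosen starting point can be illustrated with a minimal sketch, which is not taken from the studies cited. It assumes synthetic data (Student t variates with 3 degrees of freedom, whose theoretical tail index is 3) in place of cotton returns, and a hypothetical helper `loglog_tail_slope` that fits relation (13.19a) by ordinary least squares above a chosen quantile.

```python
import numpy as np

def loglog_tail_slope(x, tail_quantile):
    """Estimate the Pareto exponent alpha by a least-squares fit of
    ln Fr(X > u) against ln u, as in (13.19a), for u above a threshold.

    `tail_quantile` is the (subjective) point where the tail is taken to begin.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Empirical survival frequency Fr(X > x_i) for each sorted observation.
    survival = 1.0 - np.arange(1, n + 1) / n
    threshold = np.quantile(x, tail_quantile)
    # Keep points above the threshold; drop the largest one, where Fr = 0.
    mask = (x > threshold) & (survival > 0)
    slope, _intercept = np.polyfit(np.log(x[mask]), np.log(survival[mask]), 1)
    return -slope

rng = np.random.default_rng(0)
# Assumption: heavy-tailed synthetic "returns" (Student t, 3 degrees of freedom).
returns = rng.standard_t(df=3, size=100_000)

# The estimate changes with the chosen starting point of the tail.
estimates = {q: loglog_tail_slope(returns, q) for q in (0.90, 0.95, 0.99)}
for q, alpha_hat in estimates.items():
    print(f"tail starts at quantile {q:.2f}: alpha_hat = {alpha_hat:.2f}")
```

On such data the fitted exponent typically moves noticeably as the threshold is pushed further into the tail, which is the fragility Fama pointed out; the choice of least squares here is only one of several possible fitting conventions.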
Fama [FAM 65] had proposed two other estimators based on the property of invariance under addition, one using a measure of the interquantile range and the other the scaling law of the empirical variance. However, these two estimators were equally fragile: the former presumed the independence of increments and the latter was very sensitive to the choice of sample size. A new stage was reached in 1971: Fama and Roll, using properties of quantiles detected with the help of their earlier tabulations of symmetric stable distributions, proposed new methods for estimating the parameters α and γ of symmetric stable laws [FAM 71]. These first statistical tools allowed the first tests of the iid-α-stable model to be carried out at the beginning of the 1970s. Then, a second generation of estimators appeared during the 1970s. Successively, Press [PRE 72], DuMouchel [DUM 73, DUM 75], Paulson et al. [PAU 75], Arad

[ARA 80], Koutrouvélis [KOU 80] and McCulloch [MCC 81] developed new methods of parameter estimation using the characteristic function of stable laws (see footnote 8). At the same time, generators of stable random variables were designed by Chambers et al. [CHA 76], whose algorithms improved the possibilities for simulating financial markets. These new theoretical steps made it possible to improve the tests of the scale invariance hypothesis. However, DuMouchel [DUM 83] studied the ability of the preceding methods to separate Lévy-stable distributions from Pareto-type non-stable distributions (i.e., those converging towards a normal law). He showed that these methods are good when the “true” distribution is stable but biased when this is not the case, which leaves a doubt regarding the validity of scale invariance when this invariance is verified by means of these methods. In addition, the sample size can affect the results of estimations made with Koutrouvélis’s method and, a fortiori, with older methods (see footnote 9). For example, Walter [WAL 99] verifies that α increases as the sample size decreases but remains nearly constant when tests are carried out on sub-samples of constant size. More generally, the difficulties of estimating the characteristic exponent make it very delicate to reach a definitive position. Thus, we find the following remark in a recent textbook: “we think that estimate methods of parameter α are not precise enough to infer a clear conclusion on the real nature of distributions from estimates made on various time scales” (see [EMB 97], p. 406). When it occurs, the rejection of the stability of α will therefore not appear as “conclusive” as Taylor affirms (see later on). In addition, other more recent studies, like those of Belkacem et al. [BEL 00], have shown partial validations of this invariance.

13.3.2.2. Non-normality and controversies on scaling invariance

In a general way, all the work undertaken on stock markets not only confirmed the non-normality of the distributions of returns on various scales, and the possible fit of Lévy’s distributions on each scale, but also the difficulty of validating the fractal hypothesis. In fact, a scaling anomaly quickly appeared: a tendency towards a systematic increase in the value of α(τ) with τ. The differences between these works lie in the choice of the replacement process used to give

8. For a review of these methods, see the works of Akgiray and Lamoureux [AKG 89] or Walter [WAL 94], who arrived at the same conclusion on the best estimation method: that of Koutrouvélis [KOU 80].
9. See the work of Koutrouvélis [KOU 80] and Akgiray and Lamoureux [AKG 89] for illustrations of this sample size problem, which has been known since the first works of Mandelbrot and Fama.

an account of this failure, by means of non-fractal modeling, i.e., of a multi-scale market analysis. Here we present the main articles relating to this evidence. Teichmoeller [TEI 71], Officer [OFF 72], Fielitz and Smith [FIE 72], Praetz [PRA 72] and Barnea and Downes [BAR 73] all obtain values of α which increase on average from 1.6 at high frequency to 1.8 at low frequency. This increase led Hsu et al. [HSU 74], who also verified it, to consider that “in an economy where factors affecting price levels (technical developments, government policies, etc.) can undergo movements on a great scale, it seems unreasonable (our emphasis) to want to try to represent price variations by a single probability distribution” (see [HSU 74], p. 1). Brenner [BRE 74], Blattberg and Gonedes [BLA 74] and Hagerman [HAG 78] continued the investigations and observed the same phenomenon. Hagerman concludes that “the symmetric stable model cannot reasonably (our emphasis) be regarded as a suitable description of stock market returns” (see [HAG 78], p. 1220). We can see a similarity of arguments between Hagerman and Hsu et al. [HSU 74], for whom it does not seem “reasonable” to retain a model with infinite variance. This argument had been used by Bienaymé against Cauchy as early as 1853. Zajdenweber [ZAJ 76] verifies the fit to Lévy’s distribution but does not test scale invariance. Upton and Shannon [UPT 79] take up the question in a different way, seeking to estimate the degree of violation of normality as a function of the observation scale by using the Kolmogorov-Smirnov (KS) method and by calculating the kurtosis K and skewness S coefficients. Scale invariance is not retained. A new study by Fielitz and Rozelle [FIE 83] confirms the scaling anomaly. Other investigations are carried out on foreign exchange markets. Wasserfallen and Zimmermann [WAS 85], Boothe and Glassman [BOO 87], Akgiray and Booth [AKG 88a], Tucker and Pond [TUC 88] and Hall et al. [HAL 89] observed, for their part, the increase of α as the observation frequency decreases. At the end of the 1980s, the iid-α-stable model of stock market returns appeared to be rejected by all the research in this field. In 1986, a survey of the analysis of stock market variations stated: “many researchers estimated that the hypothesis of infinite variance was not acceptable. Detailed studies on stock market variations rejected Lévy’s distributions in a conclusive way. [...] Ten years after his article in 1965, Fama himself preferred to use a normal distribution for monthly variations and thus to give up stable distributions for daily variations” (see [TAY 86], p. 46). However, we can observe that the theoretical scale invariance of Gaussian modeling (scaling law in square root of time) is not validated by real markets in all cases, and that its generalization by the iid-α-stable model represents a good compromise between modeling

power and the statistical cost of estimation. We find such an argument, for example, in McCulloch [MCC 78], who stresses the small number of parameters required by stable laws, as compared with the five parameters necessary for jump models such as those proposed by Merton [MER 76]. In other words, the question remains open, even if it is probable that the “true” process of returns is more complex than iid-α-stable modeling. Certain works show that the values of α can change over time (see footnote 10), which is a problem of stationarity of Δr; this leads us to raise the question of dependence between the increments of the price process and to look for other forms of scaling laws in financial series.

13.3.2.3. Scaling anomalies of parameters under iid hypothesis

The systematic increase of the characteristic exponent α(τ) of stable laws with τ constitutes what is called a “scaling anomaly”. Indeed, in iid-α-stable modeling, the following relation must hold:

$$\alpha(T) = \alpha(n\tau) = \alpha(\tau), \qquad T = n\tau \qquad (13.23)$$

The fact that this relation is not found for all values of n shows that scale invariance does not hold on all time scales, or that the iid hypothesis is not valid. More generally, a way of highlighting the invariance of a probability law under a change of scale, and thus of testing the fractal hypothesis, is to examine whether its characteristic parameters have a scaling behavior, i.e., to look for a dilation (or contraction) law of the parameters as a function of the time scale. This idea is at the origin of an important trend in theoretical research in finance. Let λ(τ) be a statistical parameter of the distribution of Δr(t, τ): λ(τ) is a function of τ, and searching for scaling laws on a market between 0 and T therefore amounts to estimating the parameter for each value of τ, then studying the scale relation, or function λ: τ → λ(τ). All the statistical parameters of a distribution are a priori usable in the search for scaling laws on distributions. The parameters most analyzed in research works are either a scale parameter or the kurtosis coefficient K. In the Gaussian case, the scale parameter is the standard deviation and, in the case of iid increments, we must have the relation:

$$\sigma(T) = \sigma(n\tau) = n^{1/2}\,\sigma(\tau), \qquad T = n\tau \qquad (13.24)$$

This scaling relation, already postulated on variance by Bachelier [BAC 00], was introduced into research during the 1980s, and is known under the name of “test of

10. See an example in [WAL 94].

variance ratio” (see footnote 11). Relation (13.24) shows that in the case of iid returns, we must have the proportionality $\sigma(\tau) \sim \tau^{1/2}$. Some works have highlighted a slight violation of this relation, bringing to light a proportionality of the type $\sigma(\tau) \sim \tau^{H}$ with H > 0.5. For example, Mantegna [MANT 91] and Mantegna and Stanley [MANT 00] report values close to 0.53 or 0.57. In the case of non-Gaussian α-stable laws, the scale parameter, denoted by γ, is tested and we must have the relation (see footnote 12):

$$\gamma(T) = \gamma(n\tau) = n^{1/\alpha}\,\gamma(\tau), \qquad T = n\tau \qquad (13.25)$$
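
Relation (13.25) can be checked numerically on simulated data. The sketch below is an illustration under stated assumptions, not a method from the works cited: it draws symmetric α-stable variates with the Chambers-Mallows-Stuck construction (the family of generators attributed above to [CHA 76]), aggregates them n at a time, and uses the interquartile range as a stand-in for the scale parameter γ, to which it is proportional. The function names and the values α = 1.7 and n = 16 are chosen for the example only.

```python
import numpy as np

def symmetric_stable(alpha, size, rng):
    """Draw symmetric alpha-stable variates (alpha != 1) with the
    Chambers-Mallows-Stuck construction."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform angle
    w = rng.exponential(1.0, size)                  # unit exponential
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def interquartile_range(x):
    """Robust scale measure, finite even when the variance is infinite."""
    q25, q75 = np.quantile(x, [0.25, 0.75])
    return q75 - q25

rng = np.random.default_rng(1)
alpha, n = 1.7, 16
short = symmetric_stable(alpha, n * 50_000, rng)   # returns on scale tau
long_ = short.reshape(-1, n).sum(axis=1)            # returns on scale n * tau

ratio = interquartile_range(long_) / interquartile_range(short)
alpha_hat = np.log(n) / np.log(ratio)
print(f"scale ratio = {ratio:.3f}, n**(1/alpha) = {n ** (1 / alpha):.3f}")
print(f"alpha implied by the scale ratio = {alpha_hat:.3f}")
```

Repeating the comparison for several values of n and regressing the logarithm of the scale on ln n gives an estimate of 1/α (or of H in the Gaussian case), which is the kind of scaling diagnostic discussed in this section.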

An important parameter is Pearson’s coefficient, or kurtosis K, defined by

$$K_X = \frac{E\bigl[(X - E(X))^4\bigr]}{E\bigl[(X - E(X))^2\bigr]^2} - 3$$

as it makes it possible to highlight a departure of the observed distribution from normality. For a normal distribution, we have $K_X = 0$. In the case of iid-Gaussian returns, we must have:

$$K(T) = K(n\tau) = \frac{K(\tau)}{n}, \qquad T = n\tau \qquad (13.26)$$
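
The 1/n decay in (13.26) reflects the additivity of cumulants for sums of independent variables. As a hedged numerical illustration, the sketch below uses Laplace-distributed iid variates (an assumption made for the example, chosen because their excess kurtosis is 3 rather than the trivial value 0 of the Gaussian case) and compares the kurtosis of n-step sums with K(τ)/n.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample version of K_X = E[(X - E X)^4] / E[(X - E X)^2]^2 - 3."""
    centered = x - x.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2 - 3.0

rng = np.random.default_rng(2)
n = 4
# Assumption: Laplace-distributed iid "returns", whose excess kurtosis is 3.
short = rng.laplace(size=n * 500_000)        # returns on scale tau
long_ = short.reshape(-1, n).sum(axis=1)      # returns on scale n * tau

k_short = excess_kurtosis(short)
k_long = excess_kurtosis(long_)
print(f"K(tau) = {k_short:.3f}, K(n tau) = {k_long:.3f}, K(tau)/n = {k_short / n:.3f}")
```

A kurtosis that decays more slowly than this benchmark on real data is then a sign that the independence assumption fails.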

Yet, for example, Cont [CON 97] finds that the kurtosis coefficient K(τ) does not decrease as 1/n but rather as $1/n^{\alpha}$ with α ≈ 0.5, indicating the presence of a possible non-linear dependence between variations (see section 13.4). Generally, the more we improve our knowledge of the scaling behaviors of the various parameters, the more it becomes possible to choose between the two alternatives, scale invariance or characteristic scales. The study of the scaling behaviors of parameters thus helps in the modeling of stock market fluctuations. The existence of a scaling anomaly on the parameter α in investigations carried out between 1970 and 1980, then on the parameter K during the following decade, led certain authors to try to modify Mandelbrot’s model by limiting scale invariance, either to certain time scales, by introducing regime changes (cross-overs), or to certain parts of the distributions, namely the extreme values only. In both of these transformations of the fractal model, this led to the introduction of a multiscale market analysis.

13.3.3. Unstable iid models in partial scaling invariance

13.3.3.1. Partial scaling invariances by regime switching models

The question of regime changes, or of partial scaling invariance on a given frequency band, had already been dealt with by Mandelbrot [MAND 63], who assumed the

11. For example, see the work of Lo and MacKinlay [LO 88], who gave a list of previous works on the calculation of the variance ratio.
12. This relation is verified by Walter [WAL 91, WAL 94, WAL 99] and Belkacem et al. [BEL 00].

existence of upper and lower limits (cut-offs) in the fractality of markets (see also [MAND 97a], p. 51 and pp. 64–66) and introduced the concept of scaling range. Akgiray and Booth [AKG 88b] used this idea to reinforce McCulloch’s argument [MCC 78] on the cost-advantage ratio of a scale-invariant model. Using stable distributions between two cut-offs is appropriate because it is less costly in parameter estimation than other models, which are perhaps finer (like mixtures of normal laws or mixed diffusion-jump processes) but also more complex and therefore a source of a greater number of estimation errors. The issue to be solved is therefore the detection of the points where the change of regime occurs. Bouchaud and Potters [BOU 97] and Mantegna and Stanley [MANT 00] propose such a model, combining Lévy’s distributions with an exponential law beyond a given value.

13.3.3.2. Partial scaling invariances as compared with extremes

DuMouchel [DUM 83] suggests, without making any a priori hypothesis on scaling invariance as a whole, “letting distribution tails speak for themselves” (see [DUM 83], p. 1025). For this, he uses the generalized Pareto distribution introduced by Pickands [PIC 75], whose distribution function is:

$$F(x) = \begin{cases} 1 - (1 - kx/\sigma)^{1/k}, & k \neq 0 \\ 1 - \exp(-x/\sigma), & k = 0 \end{cases} \qquad (13.27)$$

where σ > 0 and k are the form parameters: the bigger k is, the thicker the distribution tail. In the case where the distribution is stable with characteristic exponent α < 2 (scaling invariance), we have α ≅ 1/k. We can observe that, while Pareto’s laws had been Mandelbrot’s initial step in his introduction of the concept of scaling invariance on stock market variations, DuMouchel proceeded in a manner similar to his predecessors and rediscovered Pareto’s law without the invariance sought by Mandelbrot. Mittnik and Rachev [MIT 89] propose to replace scale invariance under the summation of iid-α-stable variables by another invariance structure, invariance with respect to the minimum:

$$X_{(1)} \overset{L}{=} a_n \min_{1 \le i \le n} X_{(i)} + b_n \qquad (13.28)$$

in which the property of stability under addition is replaced by a property of stability for an extreme value, i.e. the minimum. Weibull’s distribution corresponds to this structure. This was the beginning of a research trend that would lead to the rediscovery in finance, during the 1990s, of the theory of extreme values (see footnote 13), which depicts another form of invariance: invariance with respect to taking maxima and minima.
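
As a brief illustration of why Weibull’s distribution fits structure (13.28), the following derivation is a sketch under assumed notation: a shape parameter c > 0 and a scale parameter λ > 0 (symbols that do not appear in the text), with X_1, ..., X_n independent copies of a Weibull variable X.

```latex
\begin{align*}
P(X > x) &= \exp\bigl[-(x/\lambda)^{c}\bigr], \qquad x \ge 0, \\
P\Bigl(\min_{1 \le i \le n} X_i > x\Bigr) &= \bigl[P(X > x)\bigr]^{n}
  = \exp\bigl[-n\,(x/\lambda)^{c}\bigr]
  = P\bigl(X > n^{1/c} x\bigr), \\
\text{hence}\quad X &\overset{L}{=} n^{1/c} \min_{1 \le i \le n} X_i .
\end{align*}
```

In this case the normalizing constants of (13.28) would be a_n = n^{1/c} and b_n = 0, so the minimum of n copies has the same law as a single copy up to a power-law rescaling in n.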

13. For the application of the theory of extreme values in ﬁnance, see [LON 96, LON 00].

13.4. Research of dependency and memory of markets

13.4.1. Linear dependence: testing of H-correlative models on returns

13.4.1.1. Question of dependency of stock market returns

The standard model of stock market variations assumed that the returns Δr(t, τ) = ln S(t) − ln S(t − τ) were iid and followed a normal law of variance $\sigma^2\tau$. The question of validating the independence hypothesis emerged very early in the empirical works dealing with the characterization of stock market fluctuations. Generally, the dependency between two random variables X and Y is measured by the quantity $C_{f,g}(X, Y) = E[f(X)g(Y)] - E[f(X)]\,E[g(Y)]$ and we have the relation:

$$X \text{ and } Y \text{ independent} \iff C_{f,g}(X, Y) = 0 \ \text{ for all functions } f \text{ and } g$$

The case f(x) = g(x) = x corresponds to the usual covariance. The other cases cover all the possible (non-linear) correlations between the variables X and Y. Applied to stock market variations, this measure implies that the returns Δr(t, τ) are independent only if we have $C(h) = C_{f,f}(\Delta r(t), \Delta r(t+h)) = 0$ for any function f. Therefore, studying the independence of stock market variations leads to the analysis of the function:

$$C(h) = E\bigl[f(\Delta r(t))\, f(\Delta r(t+h))\bigr] - E\bigl[f(\Delta r(t))\bigr]\, E\bigl[f(\Delta r(t+h))\bigr] \qquad (13.29)$$

The chronology of the studies follows the successive choices made for the function f(·). The earliest works (1930-1970) on the verification of the independence of increments considered only f(x) = x. In this case, C(h) becomes the usual autocovariance function:

$$C(h) = \gamma(h) = E\bigl[\Delta r(t)\,\Delta r(t+h)\bigr] - E\bigl[\Delta r(t)\bigr]\, E\bigl[\Delta r(t+h)\bigr] \qquad (13.30)$$

and the independence of increments corresponds to a zero linear correlation coefficient. Overall, the conclusions of the initial works established the absence of serial autocorrelation and contributed to the formation of the concept of informational efficiency of stock markets (see footnote 14).
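
The role of the function f in (13.29) can be made concrete with a small sketch. It is an illustration only: the series is a toy construction (independent Gaussian shocks multiplied by a persistent log-volatility, an assumption that is not part of the text), and the helper names are hypothetical. The point is that the linear measure (13.30) can be close to zero while the measures built on f(x) = |x| or f(x) = x² are not.

```python
import numpy as np

def dependence_measure(returns, h, f=lambda x: x):
    """Sample version of C(h) in (13.29) for a transformation f and a lag h >= 1."""
    fx = f(np.asarray(returns, dtype=float))
    head, tail = fx[:-h], fx[h:]
    return np.mean(head * tail) - np.mean(head) * np.mean(tail)

def dependence_correlation(returns, h, f=lambda x: x):
    """C(h) normalized by C(0), i.e. the autocorrelation of f(returns)."""
    fx = f(np.asarray(returns, dtype=float))
    return dependence_measure(returns, h, f) / np.var(fx)

rng = np.random.default_rng(3)
size = 200_000
# Assumption: a toy series with persistent log-volatility (AR(1) with
# coefficient 0.98) multiplied by independent Gaussian shocks.
log_vol = np.zeros(size)
shocks = 0.1 * rng.standard_normal(size)
for t in range(1, size):
    log_vol[t] = 0.98 * log_vol[t - 1] + shocks[t]
returns = np.exp(log_vol) * rng.standard_normal(size)

rho_linear = dependence_correlation(returns, h=1)               # f(x) = x
rho_abs = dependence_correlation(returns, h=1, f=np.abs)        # f(x) = |x|
rho_square = dependence_correlation(returns, h=1, f=np.square)  # f(x) = x^2
print(f"f(x)=x: {rho_linear:.3f}  f(x)=|x|: {rho_abs:.3f}  f(x)=x^2: {rho_square:.3f}")
```

A vanishing linear autocorrelation is therefore compatible with strong dependence, which is why the absence of serial correlation alone does not establish independence.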

14. See, for example, [CAM 97, TAY 86] for a review of this form of independence and [WAL 96] for the historical formation of the efﬁciency concept from initial works.

13.4.1.2. Problem of slow cycles and Mandelbrot’s second model

However, by the end of the 1970s, certain results contrary to this relation came up in the study of the behavior of returns over a long-term horizon, which led to so-called “long memory” tests. Denoting by $\gamma(h) = E[\Delta r(t)\Delta r(t+h)] - E[\Delta r(t)]\,E[\Delta r(t+h)]$ the usual autocovariance function and by $\rho(h) = \gamma(h)/\gamma(0)$ the associated autocorrelation function, the standard model of stock market variations implies that ρ(h) must decrease geometrically, i.e., $\rho(h) \le c\,r^{-h}$ with c > 0 and r > 1. However, it seemed that, in some cases, a hyperbolic decay $\rho(h) \sim c\,h^{2H-2}$ with c > 0 and 0 < H < 1 is obtained, corresponding to a phenomenon of “long memory” or “long dependence”. This phenomenon of long memory was observed in the 1960s by Adelman [ADE 65] and Granger [GRA 66]; the latter described it as “the characteristic of fluctuating economic variables”. Besides, this led Mandelbrot [MAND 65] to rediscover Hurst’s law [HUR 51] by introducing the concept of “self-similar process”, which later became [MAND 68] fractional Brownian motion (FBM), whose increments are self-similar with exponent H and have autocovariance function $\gamma(h) = \tfrac{1}{2}\bigl[|h+1|^{2H} - 2|h|^{2H} + |h-1|^{2H}\bigr]$. Hence, Mandelbrot’s model can be described as an “H-correlative” model. Mandelbrot called it “Joseph’s effect”, with reference to the slow and aperiodic cycles evoked in the biblical story of Joseph and the fluctuations of the Egyptian harvests [MAND 73a]. Summers [SUM 86], Fama and French [FAM 88], Poterba and Summers [POT 88] and DeBondt and Thaler [DEB 89] highlighted phenomena of “mean reversion” in successive returns, introducing the concept of long-term horizon on markets. Although divergent, the interpretations of these autocorrelation phenomena over a long horizon tended to question the usual independence hypothesis and to find a form of long memory in stock market returns.

13.4.1.3. Introduction of fractional differentiation in econometrics

Since the 1970s, the econometric limits of ARMA(p, q) and ARIMA(p, d, q) stationary processes in the description of financial series had progressively led to a generalization of these models by introducing a non-integer degree of differentiation 0 < d < 1/2 with the ARFIMA process, which met with a wide response in finance in the 1980s. The fractional differentiation operator $\nabla^d = (1 - L)^d$, where ∇ is defined by ∇X(t) = X(t) − X(t − 1) = (1 − L)X(t) and:

$$\nabla^d = (1 - L)^d = \sum_{k=0}^{\infty} (-1)^k\, C_d^k\, L^k$$

where $C_d^k$ is the binomial coefficient, made it possible to obtain “long memory” on the economic series studied and met the demand for a new characterization of some of the properties observed in these series. Baillie [BAI 96] presents a complete synthesis

of the use of these processes in financial econometrics. The ARFIMA and FBM strands converged and led to the search for long memory in returns.

13.4.1.4. Experimental difficulties of H-correlative model on returns

The tests carried out in the search for these anomalies used Hurst’s R/S statistic, improved by Mandelbrot [MAND 72]. This statistic makes it possible to find the value of the self-similarity exponent H, insofar as the ratio R/S is asymptotically proportional to $n^H$: H ≈ ln(R/S)/ln n. Thus, between 1980 and 1990, several works revealed values of H greater than 0.5, indicating the presence of long memory on markets, which seemed to corroborate the observations concerning the “abnormal” behavior of returns over long periods. However, Lo [LO 91] showed that this statistic is also sensitive to short memory effects: in the case of an AR(1) process, the R/S result can be biased upwards by 73%. Lo proposed a modified R/S statistic, obtained by adding weighted autocovariance terms to the denominator. The new values of H obtained in this way turned out to be close to 0.5. Thus, for example, Corazza et al. [COR 97], Batten et al. [BAT 99] and Howe et al. [HOW 99] verify that the traditional R/S analysis gives values of H greater than 0.5 but that Lo’s modified R/S statistic [LO 91] makes the values of H drop towards 0.5: “what is more astonishing in this result is not the absence of long memory but rather the radical change in judgment that we are led to implement when we use Lo’s modified statistics” (see [HOW 99], p. 149). Later studies of independence consider, in the function C(h) defined in (13.29), the cases f(x) = x² and f(x) = |x|. Absolute price variations and their squares represent a measure of price “volatility”. It is on this form of dependence, i.e. dependence on volatility, that scaling laws in finance will appear.

13.4.2. Non-linear dependence: validating H-correlative model on volatilities

13.4.2.1. The 1980s: ARCH modeling and its limits

A common starting point of all the studies conducted in the 1990s is the observation of the limits of the iid-α-stable and H-correlative models applied to stock market returns. This observation led to the search for a form of dependence in their volatility, first by introducing short memory on variances with ARCH modeling (see footnote 15), a trend that created a great number of models of this family, developing the initial logic of conditioning the variance in various directions (for a survey, see [BOL 92]). However, in 1997, we could read this comment on

15. Auto-regressive conditional heteroscedasticity: modeling introduced by Engle [ENG 82]. See an ARCH presentation in [GOU 97a, GUE 94].

the ARCH trend: “Yet, the recent inflation of basic model varieties and terminology GARCH, IGARCH, EGARCH, TARCH, QTARCH, SWARCH, ACD-ARCH reveals that this approach appears to have reached its limits, cannot adequately answer to some questions, or does not make it possible to reproduce some stylized facts” (see [GOU 97b], p. 8). These “stylized facts” relate in particular to the hyperbolic decline in the correlation of volatilities, i.e., long memory, or a scaling law on volatility.

13.4.2.2. The 1990s: emphasis of long dependence on volatility

Baillie [BAI 96] and Bollerslev et al. [BOL 00] present a review of the evidence for long memory in volatility. This scaling law on volatility makes it possible to understand the scaling anomalies observed on the kurtosis K. In fact, as Cont [CON 97] shows, if we assume that the correlations of volatility follow a power law of the type $g(k) \cong g(0)\,k^{-\alpha}$, then we obtain a scaling relation for the kurtosis K:

$$K(n\tau) = \frac{K(\tau)}{n} + \frac{6\bigl(K(\tau) + 2\bigr)}{(2 - \alpha)(1 - \alpha)\,n^{\alpha}}$$

which explains the phenomenon of abnormal decrease of kurtosis. Mandelbrot [MAND 71] showed the importance of taking the horizon into consideration in markets whose variations can be modeled by long dependence processes: in particular, the probability of huge losses decreases less rapidly than in an iid-Gaussian world. Financiers often say that “patience reduces risk”: what long dependence shows is that this decrease is much slower than it appears and that it is necessary to be very patient.

13.5. Towards a rediscovery of scaling laws in finance

After 40 years of financial modeling of stock market prices, we can observe that one of the new intellectual aspects of the 1990s in the description of stock market variations was a change of perspective on markets in financial research. A trace of this change can be found in the emergence of a new vocabulary.
Although, since Zajdenweber [ZAJ 76], all reference to fractals had disappeared from articles on finance (fractals developed in other research fields), Peters [PET 89], who estimated the value of Hurst’s exponent H on the S&P 500 index, and Walter [WAL 89, WAL 90] reintroduced Mandelbrot’s terminology and the concept of fractal structure of markets by considering the “Noah” and “Joseph” effects simultaneously in their implications for understanding the nature of stock market variations. It is especially with the long memory of volatilities that the concept of fractal structure reappeared, and Baillie [BAI 96] could recall the relation between Mandelbrot’s terminology and recent econometric studies. Richards [RIC 00] is directly interested in the fractal dimension of the market.

This is, in fact, the beginning of a progressive rediscovery of scaling laws and of a growing interest in these laws. However, following Peters’ works, we can draw attention to the confusion that may arise in the professional financial community between the concept of fractals and that of chaos. Peters [PET 91] associated these two concepts in an approach that is more spontaneous than rigorous and consolidated this association in his second work [PET 94], in which fractals and chaos are mistakenly unified in the title, by presenting the application of chaos theory to investment policies on the basis of a fractal description of stock market variations. Insofar as a great number of studies have highlighted the non-applicability of chaos-based approaches to the description of stock market variations (see footnote 16), this confusion, introduced by Peters, contributed (and perhaps still contributes) to obscuring the fractal hypothesis for the professional community, which often associates the concept of chaos with fractals. Notwithstanding this conceptual hesitation, we can conclude that, faced with the success of fractal modeling of volatility and with recent attempts to apply Brownian motion in deformed time [DAC 93, EVE 95a, MUL 90, MUL 93], the financial modeling of stock market variations must make way in the coming years for a partial rediscovery of scaling invariances, no longer in the context of a unique fractal dimension (as in the case of the iid-α-stable and H-correlative models) but through the introduction of deformed time models, which make it possible to understand market time by replacing physical time with an intrinsic stock market time. The most recent modeling explores this promising approach (see, for example, [ANE 00, MAND 97b]).

13.6. Bibliography

[ADE 65] ADELMAN I., “Long cycles: facts or artefacts?”, American Economic Review, vol. 50, p. 444–463, 1965.
[AKG 88a] AKGIRAY V., BOOTH G., “Mixed diffusion-jump process modeling of exchange rate movements”, The Review of Economics and Statistics, p. 631–637, 1988.
[AKG 88b] AKGIRAY V., BOOTH G., “The stable-law model of stock returns”, Journal of Business and Economic Statistics, vol. 6, no. 1, p. 51–57, 1988.
[AKG 89] AKGIRAY V., LAMOUREUX C., “Estimation of stable-law parameters: a comparative study”, Journal of Business and Economic Statistics, vol. 7, no. 1, p. 85–93, 1989.
[ALL 74] ALLAIS M., “The psychological rate of interest”, Journal of Money, Credit, and Banking, vol. 3, p. 285–331, 1974.
[ANE 00] ANÉ T., GEMAN H., “Order flow, transaction clock, and normality of asset returns”, Journal of Finance, vol. 55, no. 4, 2000.

16. For a synthesis, see, for example, [MIG 98].

[ARA 80] ARAD R., “Parameter estimation for symmetric stable distributions”, International Economic Review, vol. 21, no. 1, p. 209–220, 1980.
[BAC 00] BACHELIER L., Théorie de la spéculation, PhD Thesis in Mathematical Sciences, Ecole normale supérieure, 1900.
[BAI 96] BAILLIE R., “Long memory processes and fractional integration in econometrics”, Journal of Econometrics, vol. 73, p. 5–59, 1996.
[BAL 94] BÂLE, Risk management guidelines for derivatives, Basle Committee on Banking Supervision, July 1994.
[BAL 96] BÂLE, Amendment to the capital accord to incorporate market risks, Basle Committee on Banking Supervision, January 1996.
[BAL 98] BÂLE, Framework for supervisory information about derivatives and trading activities, Joint report, Basle Committee on Banking Supervision and Technical Committee of the IOSCO, September 1998.
[BAL 99] BÂLE, Trading and derivatives disclosures of banks and securities firms, Joint report, Basle Committee on Banking Supervision and Technical Committee of the IOSCO, December 1999.
[BAR 73] BARNEA A., DOWNES D., “A reexamination of the empirical distribution of stock price changes”, Journal of the American Statistical Association, vol. 68, no. 342, p. 348–350, 1973.
[BAT 99] BATTEN J., ELLIS C., MELLOR R., “Scaling laws in variance as a measure of long-term dependence”, International Review of Financial Analysis, vol. 8, no. 2, p. 123–138, 1999.
[BEL 00] BELKACEM L., LÉVY VÉHEL J., WALTER C., “CAPM, risk, and portfolio selection in α-stable markets”, Fractals, vol. 8, no. 1, p. 99–115, 2000.
[BLA 74] BLATTBERG R., GONEDES N., “A comparison of the stable and Student distributions as statistical models for stock prices”, Journal of Business, vol. 47, p. 244–280, 1974.
[BOL 92] BOLLERSLEV T., CHOU R., KRONER K., “ARCH modeling in finance: A review of the theory and empirical evidence”, Journal of Econometrics, vol. 52, no. 1-2, p. 5–59, 1992.
[BOL 00] B OLLERSLEV T., C AI J., S ONG F., “Intraday periodicity, long memory volatility, and macroeconomic announcements effects in the US treasury bond market”, Journal of Empirical Finance, vol. 7, p. 37–55, 2000. [BOO 87] B OOTHE P., G LASSMAN D., “The statistical distribution of exchange rates: Empirical evidence and economic implications”, Journal of International Economics, vol. 22, p. 297–319, 1987. [BOU 97] B OUCHAUD J.P., P OTTERS M., Théorie des risques ﬁnanciers, Collection Aléas, Saclay, 1997. [BRE 74] B RENNER M., “On the stability of the distribution of the market component in stock price changes”, Journal of Financial and Quantitative Analysis, vol. 9, p. 945–961, 1974.

[CAM 97] CAMPBELL J., LO A., MACKINLAY A.C., The Econometrics of Financial Markets, Princeton University Press, 1997.
[CHA 76] CHAMBERS J., MALLOWS C., STUCK B., “A method for simulating stable random variables”, Journal of the American Statistical Association, vol. 71, no. 354, p. 340–344, 1976.
[CLA 73] CLARK P., “A subordinated stochastic process model with finite variance for speculative prices”, Econometrica, vol. 41, no. 1, p. 135–155, 1973.
[CON 97] CONT R., “Scaling properties of intraday price changes”, Science and Finance Working Paper, June 1997.
[COR 97] CORAZZA M., MALLIARIS A.G., NARDELLI C., “Searching for fractal structure in agricultural futures markets”, The Journal of Futures Markets, vol. 17, no. 4, p. 433–473, 1997.
[COU 00] COURTAULT J.M., KABANOV Y., BRU B., CRÉPEL P., LEBON I., LE MARCHAND A., “Louis Bachelier on the centenary of Théorie de la spéculation”, Mathematical Finance, vol. 10, no. 3, p. 341–353, 2000.
[DAC 93] DACOROGNA M., MÜLLER U., NAGLER R., OLSEN R., PICTET O., “A geographical model for the daily and weekly seasonal volatility in the foreign exchange market”, Journal of International Money and Finance, vol. 12, p. 413–438, 1993.
[DEB 89] DE BONDT W., THALER R., “Anomalies: A mean-reverting walk down Wall Street”, Journal of Economic Perspectives, vol. 3, no. 1, p. 189–202, 1989.
[DUM 73] DUMOUCHEL W., “Stable distributions in statistical inference: 1. Symmetric stable distributions compared to other long-tailed distributions”, Journal of the American Statistical Association, vol. 68, no. 342, p. 469–477, 1973.
[DUM 75] DUMOUCHEL W., “Stable distributions in statistical inference: 2. Information from stably distributed samples”, Journal of the American Statistical Association, vol. 70, no. 350, p. 386–393, 1975.
[DUM 83] DUMOUCHEL W., “Estimating the stable index in order to measure tail thickness: A critique”, The Annals of Statistics, vol. 11, no. 4, p. 1019–1031, 1983.
[ELL 38] ELLIOTT R., The Wave Principle, Collins, New York, 1938.
[EMB 97] EMBRECHTS P., KLÜPPELBERG C., MIKOSCH T., Modelling Extremal Events for Insurance and Finance, Springer, 1997.
[ENG 82] ENGLE R., “Autoregressive conditional heteroskedasticity with estimates of the variance in United Kingdom inflation”, Econometrica, vol. 50, p. 987–1008, 1982.
[EVE 95a] EVERTSZ C.G., “Fractal geometry of financial time series”, Fractals, vol. 3, no. 3, p. 609–616, 1995.
[EVE 95b] EVERTSZ C.G., “Self-similarity of high-frequency USD-DEM exchange rates”, in Proceedings of the First International Conference on High Frequency Data in Finance (Zurich, Switzerland), vol. 3, March 1995.
[FAM 65] FAMA E., “The behavior of Stock Market prices”, Journal of Business, vol. 38, no. 1, p. 34–195, 1965.

[FAM 68] FAMA E., ROLL R., “Some properties of symmetric stable distributions”, Journal of the American Statistical Association, vol. 63, p. 817–836, 1968.
[FAM 71] FAMA E., ROLL R., “Parameter estimates for symmetric stable distributions”, Journal of the American Statistical Association, vol. 66, no. 334, p. 331–336, 1971.
[FAM 88] FAMA E., FRENCH K., “Permanent and temporary components of stock prices”, Journal of Political Economy, vol. 96, no. 2, p. 246–273, 1988.
[FIE 72] FIELITZ B., SMITH E., “Asymmetric stable distributions of stock price changes”, Journal of the American Statistical Association, vol. 67, no. 340, p. 813–814, 1972.
[FIE 83] FIELITZ B., ROZELLE J., “Stable distributions and the mixture of distributions hypotheses for common stock returns”, Journal of the American Statistical Association, vol. 78, no. 381, p. 28–36, 1983.
[GHY 97] GHYSELS E., GOURIÉROUX C., JASIAK J., “Market time and asset price movements: Theory and estimation”, in HAND D., JACKA S. (Eds.), Statistics in Finance, Arnold, London, p. 307–322, 1997.
[GOU 97a] GOURIÉROUX C., ARCH Models and Financial Applications, Springer-Verlag, 1997.
[GOU 97b] GOURIÉROUX C., LE FOL G., “Volatilités et mesures du risque”, Journal de la Société de statistique de Paris, vol. 38, no. 4, p. 7–32, 1997.
[GRA 66] GRANGER C.W.J., “The typical spectral shape of an economic variable”, Econometrica, vol. 34, p. 150–161, 1966.
[GUE 94] GUEGAN D., “Séries chronologiques non linéaires à temps discret”, Economica, 1994.
[HAG 78] HAGERMAN R., “More evidence of the distribution of security returns”, Journal of Finance, vol. 33, p. 1213–1221, 1978.
[HAL 89] HALL J., BRORSEN B., IRWIN S., “The distribution of futures prices: a test of the stable paretian and mixture of normal hypotheses”, Journal of Financial and Quantitative Analysis, vol. 24, no. 1, p. 105–116, 1989.
[HAS 87] HASBROUCK J., HO T., “Order arrival, quote behavior, and the return generating process”, Journal of Finance, vol. 42, p. 1035–1048, 1987.
[HOW 99] HOWE J.S., MARTIN D., WOOD B., “Much ado about nothing: long-term memory in Pacific Rim equity markets”, International Review of Financial Analysis, vol. 8, no. 2, p. 139–151, 1999.
[HSU 74] HSU D.A., MILLER R., WICHERN D., “On the stable paretian behavior of stock-market prices”, Journal of the American Statistical Association, vol. 69, no. 345, p. 108–113, 1974.
[HUR 51] HURST H.E., “Long term storage capacity of reservoirs”, Transactions of the American Society of Civil Engineers, vol. 116, p. 770–799, 1951.
[IOS 94] IOSCO, Operational and financial risk management control mechanisms for over-the-counter derivatives activities of regulated securities firms, Technical Committee of the International Organization of Securities Commissions, July 1994.

[KOU 80] KOUTROUVÉLIS I., “Regression-type estimation of the parameters of stable laws”, Journal of the American Statistical Association, vol. 75, no. 372, p. 918–928, 1980.
[LAM 94] LAMOUREUX C., LASTRAPES W., “Endogenous trading volume and momentum in stock-return volatility”, Journal of Business and Economic Statistics, vol. 12, no. 2, p. 225–234, 1994.
[LEV 02] LÉVY VÉHEL J., WALTER C., Les marchés fractals, PUF, Paris, 2002.
[LO 88] LO A.W., MACKINLAY A., “Stock prices do not follow random walks: evidence from a simple specification test”, Review of Financial Studies, vol. 1, p. 41–66, 1988.
[LO 91] LO A.W., “Long-term memory in stock market prices”, Econometrica, vol. 59, no. 5, p. 1279–1313, 1991.
[LON 96] LONGIN F., “The asymptotic distribution of extreme stock market returns”, Journal of Business, vol. 69, no. 3, p. 383–408, 1996.
[LON 00] LONGIN F., “From value at risk to stress testing approach: the extreme value theory”, Journal of Banking and Finance, p. 1097–1130, 2000.
[MAI 97] MAILLET B., MICHEL T., “Mesures de temps, information et distribution des rendements intrajournaliers”, Journal de la Société de statistique de Paris, vol. 138, no. 4, p. 89–120, 1997.
[MAND 63] MANDELBROT B., “The variation of certain speculative prices”, Journal of Business, vol. 36, p. 394–419, 1963.
[MAND 65] MANDELBROT B., “Une classe de processus stochastiques homothétiques à soi ; application à la loi climatologique de H.E. Hurst”, Comptes rendus de l’Académie des sciences, vol. 260, p. 3274–3277, 1965.
[MAND 67a] MANDELBROT B., “The variation of some other speculative prices”, Journal of Business, vol. 40, p. 393–413, 1967.
[MAND 67b] MANDELBROT B., TAYLOR H., “On the distribution of stock price differences”, Operations Research, vol. 15, p. 1057–1062, 1967.
[MAND 68] MANDELBROT B., VAN NESS J.W., “Fractional Brownian motion, fractional noises, and applications”, SIAM Review, vol. 10, no. 4, p. 422–437, 1968.
[MAND 71] MANDELBROT B., “When can price be arbitraged efficiently? A limit to the validity of random walk and martingale models”, Review of Economics and Statistics, vol. 53, p. 225–236, 1971.
[MAND 72] MANDELBROT B., “Statistical methodology for non-periodic cycles: from the covariance to R/S analysis”, Annals of Economic and Social Measurement, vol. 1, p. 259–290, 1972.
[MAND 73a] MANDELBROT B., “Le problème de la réalité des cycles lents et le syndrome de Joseph”, Economie appliquée, vol. 26, p. 349–365, 1973.
[MAND 73b] MANDELBROT B., “Le syndrome de la variance infinie et ses rapports avec la discontinuité des prix”, Economie appliquée, vol. 26, p. 349–365, 1973.

[MAND 97a] MANDELBROT B., Fractals and Scaling in Finance, Springer, New York, 1997 (abridged French version: Fractales, Hasard et Finances, Flammarion, Paris).
[MAND 97b] MANDELBROT B., FISHER A., CALVET L., “A multifractal model of asset returns”, Cowles Foundation Discussion Paper, no. 1164, September 1997.
[MANT 91] MANTEGNA R., “Lévy walks and enhanced diffusion in Milan stock exchange”, Physica A, vol. 179, p. 232–242, 1991.
[MANT 00] MANTEGNA R., STANLEY E., An Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge University Press, 2000.
[MCC 78] MCCULLOCH J.H., “Continuous time processes with stable increments”, Journal of Business, vol. 51, no. 4, p. 601–619, 1978.
[MCC 81] MCCULLOCH J.H., “Simple consistent estimators of the stable distributions”, in Proceedings of the Annual Meeting of the Econometric Society, 1981.
[MER 76] MERTON R., “Option pricing when underlying stock returns are discontinuous”, Journal of Financial Economics, vol. 3, p. 125–144, 1976.
[MIG 98] MIGNON V., “Marchés financiers et modélisation des rentabilités boursières”, Economica, 1998.
[MIT 89] MITTNIK S., RACHEV S., “Stable distributions for asset returns”, Applied Mathematics Letters, vol. 2, no. 3, p. 301–304, 1989.
[MUL 90] MÜLLER U., DACOROGNA M., MORGENEGG C., PICTET O., SCHWARZ M., OLSEN R., “Statistical study of foreign exchange rates: empirical evidence of a price change scaling law and intraday pattern”, Journal of Banking and Finance, vol. 14, p. 1189–1208, 1990.
[MUL 93] MÜLLER U., DACOROGNA M., DAVÉ R., PICTET O., OLSEN R., WARD J., Fractals and intrinsic time – A challenge to econometricians, Olsen and Associates Research Group, UAM 1993-08-16, 1993.
[OFF 72] OFFICER R., “The distribution of stock returns”, Journal of the American Statistical Association, vol. 67, no. 340, p. 807–812, 1972.
[PAU 75] PAULSON A., HOLCOMB E., LEITCH R., “The estimation of the parameters of the stable laws”, Biometrika, vol. 62, no. 1, p. 163–170, 1975.
[PET 89] PETERS E., “Fractal structure in the capital markets”, Financial Analysts Journal, p. 32–37, July-August 1989.
[PET 91] PETERS E., Chaos and Order in the Capital Markets: A New View of Cycles, Prices, and Market Volatility, John Wiley & Sons, New York, 1991.
[PET 94] PETERS E., Fractal Market Analysis: Applying Chaos Theory to Investment and Economics, John Wiley & Sons, New York, 1994.
[PIC 75] PICKANDS J., “Statistical inference using extreme order statistics”, Annals of Statistics, vol. 3, p. 119–131, 1975.
[POT 88] POTERBA J.M., SUMMERS L., “Mean reversion in stock prices: Evidence and implications”, Journal of Financial Economics, vol. 22, p. 27–59, 1988.

[PRA 72] PRAETZ P., “The distribution of share price changes”, Journal of Business, vol. 45, p. 49–55, 1972.
[PRE 72] PRESS S.J., “Estimation in univariate and multivariate stable distributions”, Journal of the American Statistical Association, vol. 67, no. 340, p. 842–846, 1972.
[RIC 00] RICHARDS G., “The fractal structure of exchange rates: Measurement and forecasting”, Journal of International Financial Markets, Institutions, and Money, vol. 10, p. 163–180, 2000.
[SUM 86] SUMMERS L., “Does the stock market rationally reflect fundamental values?”, Journal of Finance, vol. 41, no. 3, p. 591–601, 1986.
[TAQ 00] TAQQU M., “Bachelier et son époque: une conversation avec Bernard Bru”, in Proceedings of the First World Congress of the Bachelier Finance Society (Paris, France), June 2000.
[TAY 86] TAYLOR S., Modelling Financial Time Series, John Wiley & Sons, 1986.
[TEI 71] TEICHMOELLER J., “A note on the distribution of stock price changes”, Journal of the American Statistical Association, vol. 66, no. 334, p. 282–284, 1971.
[TUC 88] TUCKER A., POND L., “The probability distribution of foreign exchange price changes: test of candidate processes”, Review of Economics and Statistics, p. 638–647, 1988.
[UPT 79] UPTON D., SHANNON D., “The stable paretian distribution, subordinated stochastic processes, and asymptotic lognormality: an empirical investigation”, Journal of Finance, vol. 34, no. 4, p. 1031–1039, 1979.
[WAL 89] WALTER C., “Les risques de marché et les distributions de Lévy”, Analyse financière, vol. 78, p. 40–50, 1989.
[WAL 90] WALTER C., “Mise en évidence de distributions Lévy-stables et d’une structure fractale sur le marché de Paris”, in Actes du premier colloque international AFIR (Paris, France), vol. 3, p. 241–259, 1990.
[WAL 91] WALTER C., “L’utilisation des lois Lévy-stables en finance: une solution possible au problème posé par les discontinuités des trajectoires boursières”, Bulletin de l’IAF, vol. 349-350, p. 3–32 and 4–23, 1991.
[WAL 94] WALTER C., Les structures du hasard en économie: efficience des marchés, lois stables et processus fractals, PhD Thesis, IEP Paris, 1994.
[WAL 96] WALTER C., “Une histoire du concept d’efficience sur les marchés financiers”, Annales HSS, vol. 4, p. 873–905, July-August 1996.
[WAL 99] WALTER C., “Lévy-stability-under-addition and fractal structure of markets: Implications for the investment management industry and emphasized examination of MATIF notional contract”, Mathematical and Computer Modelling, vol. 29, no. 10-12, p. 37–56, 1999.
[WAS 85] WASSERFALLEN W., ZIMMERMANN H., “The behavior of intra-daily exchange rates”, Journal of Banking and Finance, vol. 9, p. 55–72, 1985.
[ZAJ 76] ZAJDENWEBER D., “Hasard et prévision”, Economica, 1976.

Chapter 14

Scale Relativity, Non-differentiability and Fractal Space-time

14.1. Introduction

The theory of scale relativity [NOT 93] applies the principle of relativity to scale transformations (particularly to transformations of spatio-temporal resolutions). In Einstein’s [EIN 16] formulation, the principle of relativity requires that the laws of nature be valid in every coordinate system, whatever its state. Since Galileo, this principle has been applied to the states of position (origin and orientation) and of motion (velocity and acceleration) of the coordinate system, i.e. to states which can never be defined in an absolute way, but only in a relative way: the state of one reference system can be defined only with regard to another system.

The same holds for changes of scale. The scale of one system can be defined only with regard to another system, and it therefore possesses the fundamental property of relativity: only scale ratios have a meaning, never an absolute scale. In the new approach, we reinterpret the resolutions not only as a property of the measuring device and/or of the measured system, but more generally as an intrinsic property of space-time, characterizing the state of scale of the reference system in the same way as velocity characterizes its state of motion. The principle of scale relativity requires that the fundamental laws of nature apply whatever the state of scale of the coordinate system.

Chapter written by Laurent NOTTALE.

What is the motivation for adding such a first principle to fundamental physics? It becomes imperative from the very moment we want to generalize the current description of space and time. The present description is usually restricted to differentiable manifolds (even though singularities are possible at certain particular points). One way to generalize current physics therefore consists of abandoning the hypothesis of differentiability of the spatio-temporal coordinates. As we will see, the main consequence of such an abandonment is that space-time becomes fractal, i.e. it acquires an explicit scale dependence (more precisely, it becomes scale-divergent) in terms of the spatio-temporal resolutions.

14.2. Abandonment of the hypothesis of space-time differentiability

If we analyze the state of physics based on the principle of relativity up to Einstein, we note that the whole of classical physics, including the theory of gravitation via the general relativity of motion, is based on this principle. Quantum physics, although compatible with the Galilean relativity of motion, does not seem to rely on it in its foundations. We may ask whether a new generalization of relativity that includes quantum effects (or, at least, some of them) as its consequences remains possible. In order to generalize relativity, however, it is necessary to generalize the possible transformations between coordinate systems, as well as the definition of what the possible coordinate systems are and, finally, the concepts of space and space-time. Einstein’s general relativity is based on the hypothesis that space-time is Riemannian, i.e. describable by a manifold that is at least twice differentiable: in other words, we can define a continuum of spatio-temporal events, then velocities, which are their derivatives, and then accelerations, by a further differentiation.
Within this framework, Einstein’s equations are the most general among the simplest equations that are covariant under twice-differentiable coordinate transformations. Just as the passage from special relativity to general relativity is made possible by abandoning a restrictive hypothesis (that of the flatness of space-time, by considering curved space-times), a new opening is then possible by abandoning the assumption of differentiability. The issue now is to describe a space-time continuum which is no longer necessarily differentiable everywhere or almost everywhere.

14.3. Towards a fractal space-time

The second stage of the construction consists of “recovering” a mathematical tool that seems to be lost in such a generalization. The essential tool of physics since Galileo, Leibniz and Newton is the differential equation. Does abandoning the assumption of differentiability of space-time, and therefore of the coordinate systems and of the transformations between them, amount to abandoning differential equations?

This crucial problem can be circumvented by introducing the concept of fractal geometry into space-time physics. Through it, non-differentiability can be treated using differential equations.

14.3.1. Explicit dependence of coordinates on spatio-temporal resolutions

This possibility results from the following theorem [NOT 93, NOT 96a, NOT 97a], which is itself a consequence of a theorem of Lebesgue. It can be proved that a continuous and almost nowhere differentiable curve has a length that depends explicitly on the resolution at which it is considered and that tends to infinity when the resolution interval tends to zero. In other words, such a curve is fractal in the general sense given to this term by Mandelbrot [MAN 75, MAN 82]. Applied to the coordinate system of a non-differentiable space-time, this theorem implies a fractal geometry for this space-time [ELN 95, NOT 84, ORD 83], as well as for the reference frame. Moreover, it is this very dependence on resolution which solves the problem posed. Indeed, let us consider the definition of the derivative applied, for example, to a coordinate (which defines the velocity):

$$v(t) = \lim_{dt \to 0} \frac{x(t + dt) - x(t)}{dt} \tag{14.1}$$

Non-differentiability is the non-existence of this limit. Since the limit is, in any case, physically unattainable (infinite energy would be required to reach it, according to the Heisenberg time-energy relation), v is redefined as v(t, dt), a function of time t and of the differential element dt, which is identified with a resolution interval and regarded as a new variable. The issue is no longer the description of what occurs in the limit, but the behavior of this function during successive zooms on the interval dt.

14.3.2. From continuity and non-differentiability to fractality

It can be proved [BEN 00, NOT 93, NOT 96a] that the length L of a continuous and nowhere (or almost nowhere) differentiable curve depends explicitly on the resolution ε at which it is considered and, further, that L(ε) is strictly increasing and tends to infinity when ε → 0. In other words, this curve is fractal (we will use the word “fractal” in this general sense throughout this chapter).

Let us consider a curve (chosen as a function f(x) for the sake of simplicity) in the Euclidean plane, which is continuous but nowhere differentiable between two points $A_0\{x_0, f(x_0)\}$ and $A_\Omega\{x_\Omega, f(x_\Omega)\}$. Since f is non-differentiable, there is a point $A_1$ of coordinates $\{x_1, f(x_1)\}$, with $x_0 < x_1 < x_\Omega$, such that $A_1$ is not on the segment $A_0 A_\Omega$. Thus, the total length becomes $L_1 = L(A_0 A_1) + L(A_1 A_\Omega) > L_0 = L(A_0 A_\Omega)$. We can now iterate the argument and find two coordinates $x_{01}$ and $x_{11}$, with $x_0 < x_{01} < x_1$ and $x_1 < x_{11} < x_\Omega$, such that $L_2 = L(A_0 A_{01}) + L(A_{01} A_1) + L(A_1 A_{11}) + L(A_{11} A_\Omega) > L_1 > L_0$. By iteration we finally construct successive approximations $f_0, f_1, \ldots, f_n$ of the function f(x) studied, whose lengths $L_0, L_1, \ldots, L_n$ increase monotonically when the resolution $\varepsilon \approx (x_\Omega - x_0) \times 2^{-n}$ tends to zero. In other words, continuity and non-differentiability imply a monotonic scale dependence of f in terms of the resolution ε.

However, the function L(ε) could be increasing and yet converge when ε → 0. This is not the case for such a continuous and non-differentiable curve: the second stage of the demonstration, which establishes the divergence of L(ε), is a consequence of Lebesgue’s theorem (1903), which states that a curve of finite length is differentiable almost everywhere (see for example [TRI 93]). Consequently, a non-differentiable curve necessarily has infinite length. These two results, taken together, establish the above theorem on the scale divergence of non-differentiable continuous functions. A direct demonstration, using non-standard analysis, was given in [NOT 93], p. 82. This theorem can easily be generalized to curves, surfaces, volumes and, more generally, to spaces of any dimension.

Regarding the converse proposition, the question is whether a continuous function whose length is scale-divergent between any two points such that $\delta x = x_A - x_B$ is finite (i.e. a function that is everywhere or nearly everywhere scale-divergent) is non-differentiable. To prepare the answer, let us remark that the scale-dependent length L(δx) can easily be related to the average value of the scale-dependent slope v(δx). Indeed, we have $L(\delta x) = \langle \sqrt{1 + v^2(\delta x)} \rangle$. Since we consider curves such that L(δx) → ∞ when δx → 0, this means that $L(\delta x) \approx \langle v(\delta x) \rangle$ at sufficiently fine resolution, so that L(δx) and v(δx) share the same kind of divergence when δx → 0. On the basis of this simple result, the answer to the question of the non-differentiability of scale-divergent curves is as follows (correcting and updating here previously published results [NOT 08]):

1) Homogeneous divergence.
Let us first consider the case where the slopes diverge in the same way at all points of the curve, which we call “homogeneous divergence”. In other words, we assume that, for any pair of points, the absolute values $v_1$ and $v_2$ of their scale-dependent slopes satisfy: there exist finite $K_1$ and $K_2$ such that, for all δx, $K_1 < v_2(\delta x)/v_1(\delta x) < K_2$. Then the mode of divergence of the mean slope is the same as that of the slope at the various points, and it is also the mode of divergence of the length. In this case the converse theorem holds: for homogeneous divergence, the length of a continuous curve f is such that L infinite (i.e. L = L(δx) → ∞ when δx → 0) ⇔ f non-differentiable.

2) Inhomogeneous divergence. In this case there may exist curves such that only a subset of zero measure of their points have divergent slopes, in such a way that the length is nevertheless infinite in the limit δx → 0. Such a function may therefore be almost everywhere differentiable and at the same time be characterized by a genuine fractal law of scale dependence of its length, i.e. by a power-law divergence characterized by a fractal dimension $D_F$. The same reasoning may be applied to other types of divergence (logarithmic, exponential, etc.). Therefore, when the divergence is inhomogeneous, an infinite curve may be either differentiable or non-differentiable, whatever its divergence mode: there is no converse theorem in this case.

When applied to physics, this result means that fractal behavior may result from the action of singularities (infinite in number, even though forming a subset of zero measure) in a space or space-time that nevertheless remains almost everywhere differentiable (such as, for example, the Riemannian manifolds of Einstein’s general relativity). This supports Mandelbrot’s view about the origin of fractals, which are known to be extremely frequent in natural phenomena that nevertheless seem to be well described by standard differential equations: they could come from the existence of singularities in differentiable physics (see e.g. [MAN 82], Chapter 11).

However, the viewpoint of scale relativity theory is more radical, since the main problem we aim to solve in its framework is not the (albeit very interesting) question of the origin of fractals, but the foundation of quantum theory and of gauge fields from geometric first principles. As we shall recall, a fractal space-time is not sufficient to reach this goal (specifically concerning the emergence of complex numbers). We need to work in the framework of non-differentiable manifolds, which are indeed fractal (i.e. scale-divergent), as has been shown above.
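The scale divergence of length described in section 14.3.2 can be illustrated numerically. The following is a minimal sketch under stated assumptions, not a construction taken from the text: a discretized Brownian path stands in for the continuous non-differentiable function f, the resolutions are the dyadic values ε = 2^(-n), and the helper names `polyline_length` and `slopes_at` are hypothetical.

```python
import numpy as np

def polyline_length(x, f, stride):
    """Length of the broken line joining every `stride`-th sample of (x, f)."""
    xs, fs = x[::stride], f[::stride]
    return float(np.sum(np.hypot(np.diff(xs), np.diff(fs))))

def slopes_at(x, f, stride):
    """Scale-dependent slopes v(dx) measured between every `stride`-th sample."""
    xs, fs = x[::stride], f[::stride]
    return np.diff(fs) / np.diff(xs)

# Assumption: a sampled Brownian path on [0, 1] plays the role of f.
rng = np.random.default_rng(0)
n_max = 14
n_samples = 2 ** n_max
x = np.linspace(0.0, 1.0, n_samples + 1)
increments = rng.normal(0.0, np.sqrt(1.0 / n_samples), n_samples)
f = np.concatenate(([0.0], np.cumsum(increments)))

levels = list(range(n_max + 1))
resolutions = [2.0 ** -n for n in levels]
# Stride 2**(n_max - n) keeps 2**n segments, i.e. resolution eps = 2**-n.
lengths = [polyline_length(x, f, 2 ** (n_max - n)) for n in levels]
mean_slopes = [float(np.mean(np.abs(slopes_at(x, f, 2 ** (n_max - n)))))
               for n in levels]

for eps, length, slope in zip(resolutions, lengths, mean_slopes):
    print(f"eps = {eps:.6f}   L(eps) = {length:9.3f}   <|v|> = {slope:9.3f}")

# Log-log fit over the finer levels: the slope estimates the divergence exponent.
exponent = np.polyfit(np.log(resolutions[6:]), np.log(lengths[6:]), 1)[0]
print("fitted exponent of L(eps):", exponent)
```

Because each coarse polygon uses a subset of the vertices of the finer one, the triangle inequality ensures that the measured length can only grow as the resolution is refined. On the unit interval the length also equals the mean of sqrt(1 + v²), so length and mean slope diverge together. For this particular stand-in path the exponent is expected to be close to -1/2, although a finite sample only approximates it.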
However, fractality is not central in this context: it mainly appears as a derived (and very useful) geometric property of such continuous non-differentiable manifolds.

14.3.3. Description of non-differentiable processes by differential equations

This result is the key to a description of non-differentiable processes in terms of differential equations. We introduce the resolutions explicitly in the expressions of the main physical quantities and, as a consequence, in the fundamental equations of physics. This means that a quantity f, usually expressed in terms of space-time variables x, i.e. f = f(x), must now be described as also depending on the resolutions ε, i.e. f = f(x, ε). In other words, rather than considering only the strictly non-differentiable mathematical object f(x), we shall consider its various “approximations” obtained by smoothing or averaging it at various resolutions ε:

$$f(x, \varepsilon) = \int_{-\infty}^{+\infty} \Phi(x, y, \varepsilon)\, f(x + y)\, dy \tag{14.2}$$

where Φ(x, y, ε) is a smoothing function centered on x, for example a Gaussian function of standard deviation ε. More generally, we can use wavelet transformations based on a filter that is not necessarily conservative. Such a point of view is particularly well adapted to applications in physics: any real measurement is always performed at finite resolution (see [NOT 93] for additional comments on this point). In this framework, f(x) becomes the limit for ε → 0 of the family of functions $f_\varepsilon(x)$, i.e. of the function of two variables f(x, ε). However, whereas f(x, 0) is non-differentiable (in the sense of the non-existence of the limit df/dx when ε tends to zero), f(x, ε), which we call a “fractal function” (and which is, in fact, defined using an equivalence class that takes into account the fact that ε is a resolution, see [NOT 93]), is now differentiable for all ε ≠ 0.

The problem of the physical description of the various processes in which such a function f intervenes is now posed differently. In standard differentiable physics, it amounts to finding differential equations involving the derivatives of f with respect to the space-time coordinates, i.e. ∂f/∂x, ∂²f/∂x², namely the derivatives which intervene in the laws of displacement and motion. The integro-differential method amounts to performing such a local description of elementary space-time displacements and of their effect on physical quantities, and then integrating in order to obtain the large-scale properties of the system under consideration. Such a method has often been called “reductionist”, and it was indeed adapted to traditional problems where no new information appears at different scales.

The situation is completely different for systems characterized by a fractal geometry and/or non-differentiability. Such behaviors are found towards very small and very large scales but also, more generally, in chaotic and/or turbulent systems and probably in essentially all living systems.
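The smoothing of equation (14.2) can be sketched in discrete form. The example below is illustrative only and rests on several assumptions that are not in the text: the kernel Φ is taken to be a Gaussian depending only on y and ε, the integral is truncated at ±4ε, the boundaries are handled by repeating the end values, and a sampled random walk stands in for the non-differentiable f. The names `smooth` and `rms_slope` are hypothetical.

```python
import numpy as np

def smooth(f, dx, eps):
    """Discrete analogue of eq. (14.2) with a Gaussian kernel of standard deviation eps."""
    half_width = int(np.ceil(4.0 * eps / dx))       # truncate the kernel at about 4 eps
    y = np.arange(-half_width, half_width + 1) * dx
    kernel = np.exp(-0.5 * (y / eps) ** 2)
    kernel /= kernel.sum()                          # normalize: constants are preserved
    padded = np.pad(f, half_width, mode="edge")     # assumption: repeat the end values
    return np.convolve(padded, kernel, mode="valid")

def rms_slope(g, dx):
    """Root-mean-square finite-difference derivative of the sampled function g."""
    return float(np.sqrt(np.mean(np.gradient(g, dx) ** 2)))

# Assumption: a sampled random walk on [0, 1] stands in for f(x).
rng = np.random.default_rng(1)
n = 4096
dx = 1.0 / n
f = np.cumsum(rng.normal(0.0, np.sqrt(dx), n))

for eps in (0.02, 0.01, 0.005, 0.002, 0.001):
    f_eps = smooth(f, dx, eps)
    print(f"eps = {eps:.3f}   rms slope of f(x, eps) = {rms_slope(f_eps, dx):8.3f}")

coarse = smooth(f, dx, 0.01)
fine = smooth(f, dx, 0.001)
```

Each smoothed version has a finite slope, but the slope grows as ε decreases, which is the behavior the text attributes to the family f(x, ε) as ε → 0.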
In these cases, new, original information exists at different scales, and the project of reducing the behavior of a system at one scale (in general, a large scale) to its description at another scale (in general, the smallest possible scale, δx → 0) seems to lose its meaning and to become hopeless. Our suggestion consists precisely of giving up such a hope and introducing a new frame of thought in which all scales coexist simultaneously inside a single scale space and are connected together by scale differential equations acting in this scale space.

Indeed, in non-differentiable physics, ∂f(x)/∂x = ∂f(x, 0)/∂x no longer exists. However, the physics of the given process will be completely described provided we succeed in knowing f(x, ε), which is differentiable (with respect to x and ε) for all finite (non-zero) values of ε. Such a function of two variables (which, to be complete, is written more precisely as f[x(ε), ε]) can be the solution of differential equations involving ∂f(x, ε)/∂x but also ∂f(x, ε)/∂ ln ε. More generally, with non-linear laws, the equations of physics take the form of second-order differential equations, which will then contain, in addition to the previous first derivatives, operators like ∂²/∂x² (laws of motion) and ∂²/∂(ln ε)² (laws of scale), but also ∂²/∂x ∂ ln ε, which corresponds to a coupling between motion and scale (see below).

What is the physical meaning of the derivative ∂f(x, ε)/∂ ln ε? It is simply the variation of the physical quantity f under an infinitesimal scale transformation, i.e. a dilation of the resolution. More precisely, let us consider the length of a non-differentiable curve L(ε), which can represent more generally a fractal curvilinear coordinate L(x, ε). Such a coordinate generalizes to a non-differentiable and fractal space-time the concept of curvilinear coordinates introduced for curved Riemannian space-time in Einstein’s general relativity [NOT 89].

14.3.4. Differential dilation operator

Let us apply an infinitesimal dilation ε → ε′ = ε(1 + dρ) to the resolution. We omit the dependence on x to simplify the notation in what follows, since for the moment we are interested in pure scale laws. We obtain:

$$L(\varepsilon') = L(\varepsilon + \varepsilon\, d\rho) = L(\varepsilon) + \frac{\partial L(\varepsilon)}{\partial \varepsilon}\, \varepsilon\, d\rho = (1 + \tilde{D}\, d\rho)\, L(\varepsilon) \tag{14.3}$$

˜ is by deﬁnition the dilation operator. The comparison of the last two members where D of this equation thus yields: ˜ =ε ∂ = ∂ D ∂ε ∂ ln ε

(14.4)
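As a quick numerical illustration of (14.4), the following Python sketch compares $\varepsilon\,\partial L/\partial\varepsilon$ with $\partial L/\partial\ln\varepsilon$ by finite differences. The test function `example_length` (a power law with exponent −0.5), the evaluation point and the step size `h` are assumptions chosen for the example; they do not come from the text.

```python
import math

def dilation_operator(L, eps, h=1e-6):
    """Approximate dL/d(ln eps) by a central difference taken in ln eps."""
    log_eps = math.log(eps)
    return (L(math.exp(log_eps + h)) - L(math.exp(log_eps - h))) / (2.0 * h)

def eps_times_derivative(L, eps, h=1e-6):
    """Approximate eps * dL/d(eps) by a central difference taken in eps."""
    step = eps * h
    return eps * (L(eps + step) - L(eps - step)) / (2.0 * step)

def example_length(eps):
    # Hypothetical test function: a power law with exponent -0.5.
    return 2.0 * eps ** -0.5

eps = 0.01
lhs = dilation_operator(example_length, eps)      # dL / d ln(eps)
rhs = eps_times_derivative(example_length, eps)   # eps * dL / d(eps)
expected = -0.5 * example_length(eps)             # analytic value for a power law

print(lhs, rhs, expected)
```

Both finite-difference estimates should agree with each other and with the analytic value, which for a power law is simply the exponent times the function.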

This well-known form of the infinitesimal dilation operator, obtained by an application of the Gell-Mann-Lévy method (see [AIT 82]), shows that the “natural” variable for resolution changes is $\ln\varepsilon$ and that the scale differential equations to be built will indeed involve expressions such as $\partial L(x, \varepsilon)/\partial\ln\varepsilon$. What form will these equations take? In fact, equations describing the scale dependence of physical quantities have already been introduced in physics: these are the renormalization group equations, particularly developed in the framework of Wilson’s “multiple-scale-of-length” approach [WIL 83]. In its simplest form, a “renormalization group”-like equation for a physical quantity $L$ can be interpreted as stating that the variation of $L$ under an infinitesimal scale transformation $d\ln\varepsilon$ depends only on $L$ itself; in other words, $L$ determines the whole physical behavior, including the behavior under scale transformations. This is written:

$$\frac{\partial L(x, \varepsilon)}{\partial\ln\varepsilon} = \beta(L) \tag{14.5}$$

Such an equation (and its generalization), the behavior of which we will analyze in more detail later on, is the differential equivalent of the generators in the case of fractal objects built by iterations (for example, the von Koch curve). However, instead of passing from one stage of the construction to another by means of discrete finite dilations (successive factors of 3 in the case of the von Koch curve), we pass from $\ln\varepsilon$ to $\ln\varepsilon + d\ln\varepsilon$. In other words, differential calculus performed in the scale-space allows us to describe a non-differentiable behavior (in the limit) by differential equations.

14.4. Relativity and scale covariance

We complete the current description, which is made in terms of space (positions), space-time or phase-space, with a scale-space. We now consider that resolutions characterize the state of scale of the coordinate system, just as velocities characterize its state of motion. The relative nature of temporal and spatial resolution intervals is a universal law of nature: only a ratio of length or time intervals can be defined, never their absolute value, as is reflected in the constant need to appeal to units. This allows us to set out the principle of scale relativity, according to which the fundamental laws of nature apply whatever the state of scale of the reference system. In this framework, we shall call scale covariance the invariance of the equations of physics under transformations of spatio-temporal resolutions (let us note that this expression was introduced by other authors in a slightly different sense, as a generalization of scale invariance). Care is also needed because a multiple covariance must be implemented in such an attempt: it will be necessary to combine the covariance of motion and the new scale covariance, as well as a covariance under scale-motion coupling.
We shall thus develop different types of covariant derivatives, which should be clearly distinguished: one acting strictly on the scales, then a “quantum-covariant” derivative which describes the effects induced on the dynamics by the internal scale structures (which transforms traditional mechanics into quantum mechanics) and finally a covariant derivative which is identified with that of gauge theories and which describes non-linear effects of scale-motion coupling.

14.5. Scale differential equations

We now pass on to the next stage and construct scale differential equations with a physical significance, then look at their solutions. For this we shall be guided by an analogy with the construction of the laws of motion and by the constraint that such equations must satisfy the scale relativity principle. We shall find, at first, the self-similar fractal behavior at constant dimension. Under a scale transformation, such a law possesses the mathematical structure of the Galileo group and thus satisfies, in a simple way, the relativity principle.

The analogy with motion can be pushed further. We know, on the one hand, that the Galileo group is only an approximation of the Lorentz group (corresponding to the limit $c \to \infty$) and, on the other hand, that both remain a description of an inertial behavior, whereas it is with dynamics that the physics of motion finds its complexity. The same is true for scale laws. Fractals with constant dimension constitute for scales the counterpart of what Galilean inertia is for motion. We can then suggest generalizing the usual dilation and contraction laws in two ways:

1) one way is to introduce a Lorentz group of scale transformations [NOT 92]. In its framework, there appears a finite resolution scale, minimal or maximal, invariant under dilation, which replaces zero or infinity while maintaining their physical properties. We have suggested identifying these scales, respectively, with the Planck length and with the scale of the cosmological constant [NOT 92, NOT 93, NOT 96a]. This situation, however, still corresponds to a linear transformation of scale on the resolutions;

2) another way is to take into account non-linear transformations of scale, i.e., to move to a “scale dynamics” and if possible to a generalized scale relativity [NOT 97a].

We shall consider in what follows some examples of these kinds of generalized laws, after finding the standard fractal (scale-invariant) behavior (and the breaking of this symmetry) as a solution of the simplest possible first-order scale differential equation.

14.5.1. Constant fractal dimension: “Galilean” scale relativity

Power laws, which are typical of self-similar fractal behavior, can be identified as the simplest of the laws sought. Let us consider the simplest possible scale equation, which is written in terms of an eigenvalue equation for the dilation operator:

$$\tilde{D}L = bL \tag{14.6}$$

Its solution is a standard divergent fractal law:

$$L = L_0\,(\lambda_0/\varepsilon)^{\delta} \tag{14.7}$$

where $\delta = -b = D - D_T$, $D$ being the fractal dimension, assumed to be constant, and $D_T$ the topological dimension. The variable $L$ can indicate, for example, the length measured on a fractal curve (which will describe in particular a coordinate in the fractal reference system). Such a law corresponds, with regard to scales, to inertia from the point of view of motion. We can verify this easily by applying a resolution transformation to it. Under such a transformation $\varepsilon \to \varepsilon'$, we obtain:

$$\ln(L'/\lambda) = \ln(L/\lambda) + \delta\,\ln(\varepsilon/\varepsilon'), \qquad \delta' = \delta \tag{14.8}$$

where we recognize the mathematical structure of the Galileo transformation group between inertial systems: the substitution (motion → scale) results in the correspondences $x \to \ln(L/\lambda)$, $t \to \delta$ and $v \to \ln(\varepsilon/\varepsilon')$. Let us note the manifestation of the relativity of resolutions from the mathematical point of view: $\varepsilon$ and $\varepsilon'$ intervene only through their ratio, while the reference scale $\lambda_0$ has disappeared from relation (14.8). In agreement with the preceding analysis of the status of resolutions in physics, the scale exponent $\delta$ plays for scales the role played by time with regard to motion, and the logarithm of the ratio of resolutions plays the role of velocity. The composition law of dilations, written in logarithmic form, confirms this identification with the Galileo group:

$$\ln(\varepsilon''/\varepsilon) = \ln(\varepsilon''/\varepsilon') + \ln(\varepsilon'/\varepsilon) \tag{14.9}$$

formally identical to the Galilean composition of velocities, $w = u + v$.

14.5.2. Breaking scale invariance: transition scales

Statement (14.7) is scale invariant. This invariance is spontaneously broken by the existence of displacement and motion. Let us change the origin of the coordinate system. We obtain:

$$L = L_0\,(\lambda_0/\varepsilon)^{\delta} + L_1 = L_1\left[1 + (\lambda_1/\varepsilon)^{\delta}\right] \tag{14.10}$$

where $\lambda_1 = \lambda_0\,(L_0/L_1)^{1/\delta}$. Whereas the scale $\lambda_0$ remains arbitrary, the scale $\lambda_1$ (which remains relative in terms of position and motion relativity) marks a breaking of scale symmetry (in other words, a fractal to non-fractal transition in the space of scales). Indeed, it is easy to establish that, for $\varepsilon \gg \lambda_1$, we have $L \approx L_1$ and $L$ no longer depends on resolution, whereas for $\varepsilon \ll \lambda_1$, we recover the scale dependence given by (14.7), which is asymptotically scale invariant.

Now, this behavior (equation (14.10)), which thus satisfies the double principle of relativity of motion and scale, is precisely obtained as the solution of the simplest scale differential equation that can be written (a first-order equation, depending only on $L$ itself, this dependence being expandable in a Taylor series; the preceding case corresponds to the simplification $a = 0$):

$$\frac{dL}{d\ln\varepsilon} = \beta(L) = a + bL + \cdots \tag{14.11}$$
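The link between (14.11), truncated at first order, and the closed form (14.10) can be checked numerically. The sketch below, given only as an illustration, integrates $dL/d\ln\varepsilon = a + bL$ with a classical fourth-order Runge-Kutta scheme and compares the result with (14.10), using $\delta = -b$ and $L_1 = -a/b$. The parameter values, the integration range and the helper names are hypothetical choices made for the example.

```python
import math

def length(eps, L1, lam1, delta):
    """Closed form (14.10): L = L1 * (1 + (lam1 / eps)**delta)."""
    return L1 * (1.0 + (lam1 / eps) ** delta)

def integrate_scale_equation(a, b, L_start, ln_eps_start, ln_eps_end, steps=20000):
    """Integrate dL/d ln(eps) = a + b*L with classical RK4 steps in ln(eps)."""
    h = (ln_eps_end - ln_eps_start) / steps
    L = L_start
    for _ in range(steps):
        k1 = a + b * L
        k2 = a + b * (L + 0.5 * h * k1)
        k3 = a + b * (L + 0.5 * h * k2)
        k4 = a + b * (L + h * k3)
        L += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return L

# Hypothetical parameters chosen for the illustration.
delta, L1, lam1 = 0.5, 1.0, 1.0
b = -delta          # delta = -b
a = -b * L1         # L1 = -a/b

eps_start, eps_end = 1e-3, 1e3
numeric = integrate_scale_equation(
    a, b, length(eps_start, L1, lam1, delta),
    math.log(eps_start), math.log(eps_end))
exact = length(eps_end, L1, lam1, delta)

# Log-log slope deep in the fractal regime (eps << lam1): should approach -delta.
small_scale_slope = (
    (math.log(length(1e-7, L1, lam1, delta)) - math.log(length(1e-8, L1, lam1, delta)))
    / (math.log(1e-7) - math.log(1e-8)))

print(numeric, exact, small_scale_slope)
```

At resolutions much larger than `lam1` the computed length settles on `L1`, while at resolutions much smaller than `lam1` the log-log slope tends to `-delta`, which is the fractal to non-fractal transition described in the text.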

The solution of (14.11) is effectively given by expression (14.10), with $\delta = -b$ and $L_1 = -a/b$, $\lambda_1$ being an integration constant. Let us note that, if we push the Taylor series further, we obtain a solution yielding several transition scales, in agreement with the behaviors observed for many natural fractal objects [MAN 82]. In particular, going up to second order, we find fractal structures with both a lower and an upper cut-off. We can also obtain behaviors which are scale-dependent toward the small and large scales, but which become scale-independent at intermediate scales.

14.5.3. Non-linear scale laws: second order equations, discrete scale invariance, log-periodic laws

Among the corrections to scale invariance (characterized by power laws), one is likely to play a potentially important role in many domains, not limited to physics: the log-periodic laws, which can be defined by the appearance of complex scale exponents or complex fractal dimensions. Sornette et al. (see [SOR 97, SOR 98] and references therein) have shown that such behavior provides a very satisfactory and possibly predictive model of some earthquakes and market crashes. Chaline et al. [CHA 99] used such scale laws to model the chronology of major jumps in the evolution of species, and Nottale et al. [NOT 01a] showed that they also apply to the chronology of the main economic crises since the Neolithic era (see [NOT 00c] for more details). More recently, Cash et al. [CAS 02] showed that these laws describe the chronology of the main steps of embryogenesis and child development. This may be a first step towards a description of the temporal evolution of “crises” (in the general sense of this word), which could prove to be very general, all the more so as recent works have validated these first results [SOR 01]. An intermittency model of this behavior was recently proposed [QUE 00]. Let us show how to obtain a log-periodic correction to power laws [NOT 97b] using scale covariance [NOT 89], i.e. conservation of the form of scale-dependent equations (see also [POC 97]). Let us consider a quantity $\Phi$ explicitly dependent on resolution, $\Phi(\varepsilon)$.
In the application under consideration, the scale variable is identified with a time interval $\varepsilon = T - T_c$, where $T_c$ is the date of the crisis. Let us assume that $\Phi$ satisfies a renormalization-group-like first-order differential equation:

$$\frac{d\Phi}{d\ln\varepsilon} - D\,\Phi = 0 \tag{14.12}$$

whose solution is a power law, $\Phi(\varepsilon) \propto \varepsilon^{D}$.

In the quest to correct this law, we note that directly introducing a complex exponent is not enough, since it would lead to large log-periodic fluctuations rather than to a controllable correction to the power law. So let us assume that the cancellation of the difference in (14.12) is only approximate and that the right-hand side of this equation actually differs from zero:

$$\frac{d\Phi}{d\ln\varepsilon} - D\,\Phi = \chi \tag{14.13}$$

We require that the new function $\chi$ be a solution of an equation that keeps the same form as the initial equation:

$$\frac{d\chi}{d\ln\varepsilon} - D'\,\chi = 0 \tag{14.14}$$

Setting $D' = D + \delta$, we find that $\Phi$ is a solution of a general second-order equation:

$$\frac{d^2\Phi}{(d\ln\varepsilon)^2} - B\,\frac{d\Phi}{d\ln\varepsilon} + C\,\Phi = 0 \tag{14.15}$$

where we have $B = 2D + \delta$ and $C = D(D + \delta)$. Its solution is written $\Phi(\varepsilon) = a\,\varepsilon^{D}(1 + b\,\varepsilon^{\delta})$, where $b$ can now be arbitrarily small. Finally, the choice of an imaginary exponent $\delta = i\omega$ yields a solution whose real part includes a log-periodic correction:

$$\Phi(\varepsilon) = a\,\varepsilon^{D}\,[1 + b\cos(\omega\ln\varepsilon)] \tag{14.16}$$
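The chain (14.13) to (14.16) can be illustrated with a short Python sketch. It checks by finite differences that $\Phi = a\,\varepsilon^{D}(1 + b\,\varepsilon^{\delta})$ satisfies (14.15) with $B = 2D + \delta$ and $C = D(D + \delta)$, then verifies that an imaginary exponent $\delta = i\omega$ gives the real part (14.16) and that the correction repeats under dilations by the factor $e^{2\pi/\omega}$. All numerical values and function names are assumptions made for the example.

```python
import math

def phi(eps, a, D, b, delta):
    """a * eps**D * (1 + b * eps**delta); delta may be real or complex."""
    return a * eps ** D * (1.0 + b * eps ** delta)

def residual(eps, a, D, b, delta, h=1e-4):
    """Left-hand side of (14.15), with derivatives taken in ln(eps)
    by central finite differences."""
    def f(t):
        return phi(math.exp(t), a, D, b, delta)
    t = math.log(eps)
    first = (f(t + h) - f(t - h)) / (2.0 * h)
    second = (f(t + h) - 2.0 * f(t) + f(t - h)) / h ** 2
    B = 2.0 * D + delta
    C = D * (D + delta)
    return second - B * first + C * f(t)

def log_periodic(eps, a, D, b, omega):
    """Real form (14.16): a * eps**D * (1 + b * cos(omega * ln eps))."""
    return a * eps ** D * (1.0 + b * math.cos(omega * math.log(eps)))

# Hypothetical values chosen for the illustration.
a, D, b, omega = 1.0, 1.5, 0.1, 6.0
eps = 2.0

res_real = residual(eps, a, D, b, 0.3)              # real exponent delta = 0.3
complex_value = phi(eps, a, D, b, 1j * omega)       # imaginary exponent delta = i*omega
real_form = log_periodic(eps, a, D, b, omega)

# Discrete scale invariance: the bracket in (14.16) is unchanged when
# eps is multiplied by exp(2*pi/omega).
ratio = math.exp(2.0 * math.pi / omega)
bracket_1 = log_periodic(eps, a, D, b, omega) / eps ** D
bracket_2 = log_periodic(eps * ratio, a, D, b, omega) / (eps * ratio) ** D

print(abs(res_real), complex_value.real, real_form, bracket_1, bracket_2)
```

The preferred dilation factor $e^{2\pi/\omega}$ is what is usually meant by discrete scale invariance in this context.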

Log-periodic fluctuations were also obtained within the approach of scale relativity through a reinterpretation of gauge invariance and of the nature of electromagnetism which can be proposed in this framework (see below and [NOT 96a, NOT 06]).

14.5.4. Variable fractal dimension: Euler-Lagrange scale equations

Let us now consider the case of “scale dynamics”. As we have indicated earlier, the strictly scale-invariant behavior with constant fractal dimension corresponds to a free behavior from the point of view of scale physics. Thus, just as forces cause departures from inertial motion, we also expect natural fractal systems to display distortions compared with self-similar behavior. By analogy, such distortions can, in a first stage, be attributed to the effect of a “scale force” or even a “scale field”.

Before introducing this concept, let us recall how we should reverse the viewpoint as regards the meaning of scale variables, in comparison with the usual description of fractal objects. This reversal is parallel, with respect to scales, to that which was operated for the laws of motion in the passage from “Aristotelian” laws to Galilean laws. From the Aristotelian viewpoint, time is the measurement of motion: it is thus defined by taking space and velocity as primary concepts. In the same way, fractal dimension is defined, generally, from the “measure” of the fractal object (for example, curve length, surface area, etc.) and from the resolution:

$$\text{“}t = x/v\text{”} \quad\longleftrightarrow\quad \delta = D - D_T = \frac{d\ln L}{d\ln(\lambda/\varepsilon)} \tag{14.17}$$

With Galileo, time becomes a primary variable and velocity is derived from a ratio of space over time, which are now considered on the same footing, in terms of a space-time (which remains, however, degenerate, since the speed limit $c$ is implicitly infinite there). This involves the vectorial character of velocity and its local aspect (finally implemented by its definition as the derivative of the position with respect to time). The same reversal can be applied to scales. The scale dimension $\delta$ itself becomes a primary variable, treated on the same footing as space and time, and the resolutions are therefore defined as derivatives of the fractal coordinate with respect to $\delta$ (i.e. as a “scale-velocity”):

$$V = \ln\frac{\lambda}{\varepsilon} = \frac{d\ln L}{d\delta} \tag{14.18}$$

This new and fundamental meaning given to the scale exponent $\delta = D - D_T$, now treated as a variable, makes it necessary to give it a new name. Henceforth, we will call it djinn (in preceding articles, we had proposed the word zoom, but this already applies more naturally to the scale transformations themselves, $\ln(\varepsilon'/\varepsilon)$). This will lead us to work in terms of a generalized 5D space, the “space-time-djinn”. In analogy with the vectorial character of velocity, the vectorial character of the zoom (i.e., of the scale transformations) is then apparent, because the four spatio-temporal resolutions can now be defined starting from the four coordinates of space-time and from the djinn:

$$v^{i} = \frac{dx^{i}}{dt} \quad\longleftrightarrow\quad \ln\frac{\lambda^{\mu}}{\varepsilon^{\mu}} = \frac{d\ln L^{\mu}}{d\delta} \tag{14.19}$$

Note however that, in more recent works, a new generalization of the physical nature of the resolutions is introduced, which attributes a tensorial nature to them, analogous to that of a variance-covariance error matrix [NOT 06, NOT 08].

We could object to this reversal of the meaning of the scale variables that, from the point of view of measurements, it is only through $L$ and $\varepsilon$ that we have access to the djinn $\delta$, which is deduced from them. However, we notice that the same is true of the time variable, which, though being a primary variable, is always measured in an indirect way (through changes of position or state in space).

A final advantage of this inversion will appear later on in the attempts to construct a generalized scale relativity. It allows the definition of a new concept, namely that of scale-acceleration $\Gamma^{\mu} = d^2\ln L^{\mu}/d\delta^2$, which is necessary for the passage to non-linear scale laws and to a scale “dynamics”.

The introduction of this concept makes it possible to further reinforce the identification of fractals of constant fractal dimension with “scale inertia”. Indeed, the free scale equation can be written (in one dimension to simplify the writing):

$$\Gamma = \frac{d^2\ln L}{d\delta^2} = 0 \tag{14.20}$$

It integrates as:

$$\frac{d\ln L}{d\delta} = \ln\frac{\lambda}{\varepsilon} = \text{constant} \tag{14.21}$$

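The constancy of the resolution means here that it is independent of the djinn $\delta$. The solution therefore takes the expected form $L = L_0\,(\lambda/\varepsilon)^{\delta}$.

More generally, we can then make the assumption that the scale laws can be constructed from a least action principle. A scale Lagrange function, $L(\ln L, V, \delta)$, with $V = \ln(\lambda/\varepsilon)$, is introduced (the Lagrange function is not to be confused with the length $L$ appearing in its first argument), and then a scale action:

$$S = \int_{\delta_1}^{\delta_2} L(\ln L, V, \delta)\,d\delta \tag{14.22}$$

The principle of stationary action then leads to Euler-Lagrange scale equations:

$$\frac{d}{d\delta}\,\frac{\partial L}{\partial V} = \frac{\partial L}{\partial\ln L} \tag{14.23}$$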

14.5.5. Scale dynamics and scale force

The simplest possible form of these equations corresponds to a vanishing right-hand side (absence of scale force) and to the case where the Lagrange function takes the Newtonian form $L \propto V^2$. We once again recover, in this other way, the “scale inertia” power law behavior. Indeed, the Lagrange equation becomes in this case:

$$\frac{dV}{d\delta} = 0 \quad\Rightarrow\quad V = \text{constant} \tag{14.24}$$

The constancy of $V = \ln(\lambda/\varepsilon)$ means here, as we have already noticed, that it is independent of $\delta$. Equation (14.23) can therefore be integrated in the usual fractal form $L = L_0\,(\lambda/\varepsilon)^{\delta}$. However, the principal advantage of this representation is that it makes it possible to pass to the following order, i.e., to non-linear scale dynamic behaviors. We consider that the resolution $\varepsilon$ can now become a function of the djinn $\delta$. The fact of having identified the resolution logarithm with a “scale-velocity”, $V = \ln(\lambda/\varepsilon)$, then results naturally in defining a scale acceleration:

$$\Gamma = \frac{d^2\ln L}{d\delta^2} = \frac{d\ln(\lambda/\varepsilon)}{d\delta} \tag{14.25}$$

The introduction of a scale force then makes it possible to write a scale analog of Newton’s dynamic equation (which is simply the preceding Lagrange equation (14.23)):

$$F = \mu\,\Gamma = \mu\,\frac{d^2\ln L}{d\delta^2} \tag{14.26}$$

where $\mu$ is a “scale-mass” which measures how the system resists the scale force.

14.5.5.1. Constant scale force

Let us first consider the case of a constant scale-force. Continuing the analogy with the laws of motion, such a force derives from a “scale-potential” $\varphi = F\ln L$. We can write equation (14.26) in the form:

$$\frac{d^2\ln L}{d\delta^2} = G \tag{14.27}$$

where $G = F/\mu = \text{constant}$. This is the scale equivalent of parabolic motion in constant gravity. Its solution is a parabolic behavior:

$$V = V_0 + G\,\delta, \qquad \ln L = \ln L_0 + V_0\,\delta + \frac{1}{2}\,G\,\delta^2 \tag{14.28}$$

The physical meaning of this result is not clear in this form. Indeed, from the experimental point of view, $\ln L$ and possibly $\delta$ are functions of $V = \ln(\lambda/\varepsilon)$. After redefinition of the integration constants, this solution is therefore expressed in the form:

$$\delta = \frac{1}{G}\,\ln\frac{\lambda}{\varepsilon}, \qquad \ln\frac{L}{L_0} = \frac{1}{2G}\,\ln^2\frac{\lambda}{\varepsilon} \tag{14.29}$$

Thus, the fractal dimension, usually constant, becomes a linear function of the log-resolution, and the logarithm of the length no longer varies linearly, but in a parabolic way. This result is potentially applicable to many situations, in all the fields where fractal analysis prevails (physics, chemistry, biology, medicine, geography, etc.). Frequently, after careful examination of the scale dependence of a given quantity, the power law model is rejected because of the variation of the slope in the plane $(\ln L, \ln\varepsilon)$. In such a case, the conclusion that the phenomenon considered is not fractal could be premature. It could, on the contrary, be a non-linear fractal behavior relevant to scale dynamics, in which case the identification and the study of the scale force responsible for the distortion would be most interesting.
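To make the statement about the varying slope concrete, the following sketch evaluates (14.29) and checks numerically that the local slope of $\ln(L/L_0)$ against $\ln(\lambda/\varepsilon)$ equals the djinn $\delta = \ln(\lambda/\varepsilon)/G$, so that it grows linearly with the log-resolution. The values of `lam` and `G` and the function names are hypothetical and serve only the illustration.

```python
import math

def djinn(eps, lam, G):
    """First relation of (14.29): delta = ln(lam/eps) / G."""
    return math.log(lam / eps) / G

def log_length_ratio(eps, lam, G):
    """Second relation of (14.29): ln(L/L0) = ln^2(lam/eps) / (2 G)."""
    return math.log(lam / eps) ** 2 / (2.0 * G)

def local_slope(eps, lam, G, h=1e-5):
    """Numerical d ln(L/L0) / d ln(lam/eps), by a central difference
    in the scale-velocity V = ln(lam/eps)."""
    v = math.log(lam / eps)
    def f(u):
        # eps expressed from V: eps = lam * exp(-V)
        return log_length_ratio(lam * math.exp(-u), lam, G)
    return (f(v + h) - f(v - h)) / (2.0 * h)

# Hypothetical parameters chosen for the illustration.
lam, G = 1.0, 4.0
resolutions = [0.5, 0.1, 0.01]
slopes = [local_slope(e, lam, G) for e in resolutions]
djinns = [djinn(e, lam, G) for e in resolutions]

for e, s, d in zip(resolutions, slopes, djinns):
    print(e, s, d)
```

A log-log plot of such data would show a curved rather than straight line, which is the situation in which, according to the text, a pure power law would be rejected although the behavior remains describable by scale dynamics.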

14.5.5.2. Scale harmonic oscillator

Another interesting case of scale potential is that of the harmonic oscillator. In the case where it is “attractive”, the scale equation is written as:

$$(\ln L)'' + \alpha^2\,\ln L = 0 \tag{14.30}$$

where the notation $''$ indicates the second derivative with respect to the variable $\delta$. Setting $\alpha = \ln(\lambda/\Lambda)$, the solution is written as:

$$\ln\frac{L}{L_0} = \left[1 - \frac{\ln^2(\lambda/\varepsilon)}{\ln^2(\lambda/\Lambda)}\right]^{1/2} \tag{14.31}$$

Thus, there is a minimal or maximal scale $\Lambda$ for the system considered, whereas the slope $d\ln L/d\ln\varepsilon$ (which can no longer be identified with the djinn $\delta$ in this non-linear situation) varies between zero and infinity in the range of resolutions allowed between $\lambda$ and $\Lambda$.

More interesting still is the “repulsive” case, corresponding to a potential which we can write as $\varphi = -(\ln L/\delta_0)^2/2$. The solution is written as:

$$\ln\frac{L}{L_0} = \delta_0\,\sqrt{\ln^2\frac{\lambda}{\varepsilon} - \ln^2\frac{\lambda}{\Lambda}} \tag{14.32}$$

This solution is more general than that given in previous publications, where we had considered only the case $\ln(\lambda/\Lambda) = \delta_0^{-1}$. The interest of this solution is that it yields again, as the asymptotic behavior at very large or very small scales ($\varepsilon \ll \lambda$ or $\varepsilon \gg \lambda$), the standard solution $L = L_0\,(\lambda/\varepsilon)^{\delta_0}$, of constant fractal dimension $D = 1 + \delta_0$. On the other hand, this behavior undergoes increasing distortions when the resolution approaches a maximum scale $\varepsilon_{\max} = \Lambda$, for which the slope (which we can identify with an effective fractal dimension minus the topological dimension) becomes infinite. In physics, we suggested that such a behavior could shed new light on quark confinement: indeed, within the framework of the reinterpretation of gauge symmetries as symmetries on the spatio-temporal resolutions (see below), the gauge group of quantum chromodynamics is SU(3), which is precisely the dynamic symmetry group of the harmonic oscillator.

Solutions of this type could also be of interest in the biological field, because we can interpret the existence of a maximum scale where the effective fractal dimension becomes infinite as that of a wall, which could provide models, for example, of cell walls. At scales lower than this maximum scale (for small components which evolve inside the system considered), we tend either towards scale-independence (zero slope) in the first case, or towards “free” fractal behavior with constant slope in the second case, which is still in agreement with this interpretation.
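The two regimes of the repulsive solution (14.32) can be explored with a small Python sketch. Differentiating (14.32) with respect to $V = \ln(\lambda/\varepsilon)$ gives the effective slope $\delta_0 V/\sqrt{V^2 - \ln^2(\lambda/\Lambda)}$, which tends to $\delta_0$ far from $\Lambda$ and diverges as $\varepsilon \to \Lambda$. The numerical values below (with $\lambda > \Lambda$, so that $\Lambda$ acts as a maximum resolution scale) and the function names are assumptions made for the example.

```python
import math

def log_length_ratio(eps, lam, Lam, delta0):
    """Repulsive solution (14.32): ln(L/L0) = delta0 * sqrt(V^2 - V_Lam^2),
    with V = ln(lam/eps) and V_Lam = ln(lam/Lam)."""
    v2 = math.log(lam / eps) ** 2 - math.log(lam / Lam) ** 2
    if v2 < 0.0:
        raise ValueError("resolution outside the allowed range")
    return delta0 * math.sqrt(v2)

def effective_slope(eps, lam, Lam, delta0):
    """Analytic derivative of (14.32) with respect to V = ln(lam/eps)."""
    v = math.log(lam / eps)
    v2 = v ** 2 - math.log(lam / Lam) ** 2
    if v2 <= 0.0:
        raise ValueError("slope is infinite or undefined at this resolution")
    return delta0 * v / math.sqrt(v2)

# Hypothetical parameters: transition scale lam, maximum scale Lam < lam.
lam, Lam, delta0 = 100.0, 10.0, 0.5

slope_far = effective_slope(1e-12, lam, Lam, delta0)                  # eps << Lam
slope_near = effective_slope(Lam * (1.0 - 1e-9), lam, Lam, delta0)    # eps just below Lam

print(slope_far, slope_near)
```

Far from the maximum scale the slope is close to `delta0`, the "free" fractal behavior of constant dimension, while just below `Lam` it becomes very large, which is the wall-like behavior mentioned in the text.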

14.5.6. Special scale relativity: log-Lorentzian dilation laws, invariant scale limit under dilations

It is with special scale relativity that the concept of “space-time-djinn” takes its full meaning. However, this has only been developed, until now, in two dimensions: one space-time dimension and one for the djinn. A complete treatment in five dimensions remains to be made.

The previous remark, according to which the standard fractal laws (at constant fractal dimension) have the structure of the Galileo group, immediately implies the possibility of generalizing these laws. Indeed, we have known since the work of Poincaré [POI 05] and Einstein [EIN 05] that, as regards motion, this group is a particular and degenerate case of the Lorentz group. Now, we can show [NOT 92, NOT 93] that, in two dimensions, assuming only that the transformation law sought is linear, internal and invariant under reflection (hypotheses deducible from the principle of special relativity alone), we find the Lorentz group as the only physically acceptable solution: namely, it corresponds to a Minkowskian metric. The other possible solution is the Euclidean metric, which correctly yields a relativity group (that of rotations in space), but is excluded in the space-time and space-djinn cases since it contradicts the experimental ordering found for velocities (the sum of two positive velocities yields a larger positive velocity) and for scale transformations (two successive dilations yield a larger dilation, not a contraction).

In what follows, let us indicate by $L$ the asymptotic part of the fractal coordinate. In order to take into account the fractal to non-fractal transition, it can be replaced in all equations by a difference of the type $L - L_0$. The new log-Lorentzian scale transformation is written, in terms of the ratio of dilation $\rho$ between the resolution scales $\varepsilon \to \varepsilon'$ [NOT 92]:

$$\ln\frac{L'}{L_0} = \frac{\ln(L/L_0) + \delta\,\ln\rho}{\sqrt{1 - \ln^2\rho/\ln^2(\lambda/\Lambda)}} \tag{14.33}$$

$$\delta' = \frac{\delta + \ln\rho\,\ln(L/L_0)/\ln^2(\lambda/\Lambda)}{\sqrt{1 - \ln^2\rho/\ln^2(\lambda/\Lambda)}} \tag{14.34}$$

The law of composition of dilations takes the form:

$$\ln\frac{\varepsilon'}{\lambda} = \frac{\ln(\varepsilon/\lambda) + \ln\rho}{1 + \dfrac{\ln\rho\,\ln(\varepsilon/\lambda)}{\ln^2(\lambda/\Lambda)}} \tag{14.35}$$
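The composition law (14.35) has the same algebraic form as the relativistic addition of velocities, with $\ln(\lambda/\Lambda)$ playing the role of $c$. The sketch below, offered as an illustration under assumed numerical values, checks three consequences: the scale $\Lambda$ is left unchanged by any dilation, two large contractions composed together never cross $\Lambda$, and the Galilean law (14.9) is recovered when $\ln(\lambda/\Lambda)$ becomes very large. The function name and the chosen scales are hypothetical.

```python
import math

def compose_log_dilation(log_eps_over_lam, log_rho, C):
    """Composition law (14.35).

    log_eps_over_lam : ln(eps / lam), initial log-resolution
    log_rho          : ln(rho), logarithm of the applied dilation ratio
    C                : ln(lam / Lam), the invariant log-scale
    Returns ln(eps' / lam).
    """
    return (log_eps_over_lam + log_rho) / (1.0 + log_rho * log_eps_over_lam / C ** 2)

# Hypothetical scales: transition scale lam and invariant scale Lam < lam.
lam, Lam = 1.0, 1e-3
C = math.log(lam / Lam)

# 1) Starting exactly at eps = Lam, i.e. ln(eps/lam) = -C, a contraction leaves it unchanged.
at_limit = compose_log_dilation(-C, -2.0, C)

# 2) Two strong contractions compose to something that stays short of Lam.
composed = compose_log_dilation(-0.9 * C, -0.9 * C, C)

# 3) Galilean limit: for a very large C the law reduces to simple addition.
galilean = compose_log_dilation(-1.0, -2.0, 1e6)

print(at_limit, -C, composed, galilean)
```

The second check is the scale analog of the fact that composing two velocities close to $c$ never exceeds $c$.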

Let us specify that these laws are valid only at scales smaller than the transition scale $\lambda$ (respectively, at scales larger than it when this law is applied to very large scales). As can be established from these formulae, the scale $\Lambda$ is a resolution scale invariant under dilations, unattainable (we would need an infinite dilation from any finite scale to reach it) and uncrossable. We proposed to identify it, towards very small scales, with the space and time Planck scale, $l_P = (\hbar G/c^3)^{1/2} = 1.616\,05(10) \times 10^{-35}$ m and $t_P = l_P/c$, which would then possess all the physical properties of the zero point while remaining finite. In the macroscopic case, it is identified with the cosmic length scale given by the inverse of the square root of the cosmological constant, $L_U = \Lambda^{-1/2}$ [NOT 93, NOT 96a, NOT 03]. We have theoretically predicted this scale to be $L_U = (2.7761 \pm 0.0004)$ Gpc [NOT 93], and the now observed value, $L_U(\text{obs}) = (2.72 \pm 0.10)$ Gpc, is in very good agreement with this prediction (see [NOT 08] for more details). This type of “log-Lorentzian” law was also used by Dubrulle and Graner [DUB 96] in turbulence models, but with a different interpretation of the variables.

To what extent does this new dilation law change our view of space-time? At a certain level, it implies a complication because of the need to introduce the fifth dimension. Thus, the scale metric is written in terms of two variables:

$$d\sigma^2 = d\delta^2 - \frac{(d\ln L)^2}{C_0^2}, \qquad \text{with } C_0 = \ln\frac{\lambda_0}{\Lambda} \tag{14.36}$$

The invariant $d\sigma$ defines a “proper djinn”, which means that, although the effective fractal dimension, given by $D = 1 + \delta$ according to (14.34), becomes variable, the fractal dimension remains constant in the proper reference system. However, we can also note that the fractal dimension now tends to infinity when the resolution interval tends to the Planck scale. While going to increasingly small resolutions, a fractal dimension will thus successively pass the values 2, 3, 4, which would make it possible to cover a surface, then space, then space-time using a single coordinate.
It is thus possible to define a Minkowskian space-time-djinn requiring, in adequate fractal reference systems, only two dimensions at very small scales. When going towards large resolutions, the space-time-djinn of metric signature (+, −, −, −, −) sees its fifth dimension vary less and less, to become almost constant at scales currently accessible to accelerators (see [NOT 96a, Figure 4]). It finally vanishes beyond the Compton scale of the system under consideration, which is identified with the fractal to non-fractal transition in the rest frame. At this scale the temporal metric coefficient also changes sign, which generates the traditional Minkowskian space-time of metric signature (+, −, −, −).

14.5.7. Generalized scale relativity and scale-motion coupling

This is a vast field of study. We saw how we could introduce non-linear scale transformations and a scale dynamics. This approach is, however, only a first step towards a deeper, “entirely geometric” level in which scale forces are but manifestations of the fractal and non-differentiable geometry. This level of description also implies taking into consideration resolutions which would in turn depend on space and time variables.

The first aspect leads to the new concept of scale field, which corresponds to a distortion in scale space compared with the usual self-similar laws [NOT 97b]. It can also be represented in terms of a curved scale space. It is intended that this approach will be developed in more detail in future research.

The second aspect, of which we now point out some of the principal results, leads to a new interpretation of gauge invariance and thus of gauge fields themselves. This in turn implies the existence of general relations between mass scales and coupling constants (generalized charges) in particle physics [NOT 96a]. One of these relations makes it possible, as we will see, to predict theoretically the value of the electron mass (considered as primarily of electromagnetic origin, in this approach) as a function of its charge.

Lastly, to be complete, let us point out that even these two levels are only transitory stages from the perspective of the theory we intend to build. A more comprehensive version will deal with motion and scales on the same footing and thus see the principles of scale relativity and of motion relativity unified into a single principle. This will be done by working in a 5D space-time-djinn provided with a metric, in which all the transformations between reference systems identify with rotations: in the planes (xy, yz, zx), they are ordinary rotations of 3D space; in the planes (xt, yt, zt), they are motion effects (which are reduced to Lorentz boosts when the space-time-djinn is reduced to 4D space-time at macroscopic scales); finally, four rotations in the planes (xδ, yδ, zδ, tδ) identify with changes of space-time resolutions.

14.5.7.1. A reminder about gauge invariance

At the outset, let us recall briefly the nature of the problem set by gauge invariance in current physics. This problem already appears in traditional electromagnetic theory.
This theory, starting from experimental constraints, has led to the introduction of a four-vector potential, $A_\mu$, then of a tensorial field given by the derivative of the potential, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. However, Maxwell’s field equations (contrary to what occurs in Einstein’s general relativity for motion in a gravitational field) are not enough to characterize the motion of a charge in an electromagnetic field. It is necessary to add the expression for the Lorentz force, which is written in 4D form $f^\mu = (e/c)\,F^{\mu\nu} u_\nu$, where $u_\nu$ is the four-velocity. It is seen that only the fields intervene in this expression, not the potentials. This implies that the motion will be unaffected by any transformation of the potentials which leaves the fields invariant. This is obviously the case if we add to the four-potential the gradient of any function of the coordinates: $A'_\mu = A_\mu + \partial_\mu\chi(x, y, z, t)$. This transformation is called, following Weyl, a gauge transformation, and the invariance law which results from it is gauge invariance.

What was apparently only a simple latitude left in the choice of the potentials takes a deeper meaning within the framework of quantum mechanics. Indeed, gauge invariance in quantum electrodynamics becomes an invariance under phase transformations of the wave functions and is linked to current conservation through Noether’s theorem. It is known that this theorem connects fundamental symmetries to the appearance of conserved quantities, which are manifestations of these symmetries (thus the existence of energy results from the uniformity of time, that of momentum from the homogeneity of space, etc.). In the case of electrodynamics, it appears that the existence of the electric charge itself results from gauge symmetry. This fact is apparent in the writing of the Lagrangian which describes Dirac’s electron field coupled to an electromagnetic field. This Lagrangian is not invariant under the gauge transformation of the electromagnetic field $A'_\mu = A_\mu + \partial_\mu\chi(x)$, but becomes invariant provided it is completed by a local gauge transformation on the phase of the electron wave function, $\psi \to e^{-ie\chi(x)}\psi$. This result can be interpreted by saying that the existence of the electromagnetic field (and its gauge symmetry) implies that of the electric charge.

However, although impressive (particularly through its capacity for generalization to non-Abelian gauge theories, which include weak and strong fields and allow a description of the electroweak field), this progress in comprehending the nature of the electromagnetic field and of the charge remains incomplete, in our opinion. Indeed, the gauge transformation keeps an arbitrary nature. The essential point is that no explicit physical meaning is given to the function $\chi(x)$: however, this is the conjugate variable of the charge in the electron phase (just as energy is the conjugate of time and momentum that of space), so that it is from an understanding of its nature that an authentic comprehension of the nature of charge could arise. Moreover, the quantization of charge remains unexplained within the framework of the current theory. Here again, its conjugate variable holds the key to this problem.
The example of angular momentum is clear in this regard: its conjugate quantity is the angle, so that its conservation results from the isotropy of space. Moreover, the fact that angle variations cannot exceed $2\pi$ implies that differences in angular momentum are quantized in units of $\hbar$. In the same way, we can expect that the existence of a limitation on the variable $\chi(x)$, once its nature is elucidated, would imply charge quantization and lead to new quantitative results. As we will see, scale relativity does make it possible to put forward proposals in this direction.

14.5.7.2. Nature of gauge fields

Let us consider an electron or any other charged particle. In scale relativity, we identify “particles” with the geodesics of a non-differentiable space-time. These paths are characterized by internal (fractal) structures (beyond the Compton scale $\lambda_c = \hbar/mc$ of the particle in its rest frame). Now consider any one of these structures (which is defined only in a relative way), lying at a resolution $\varepsilon < \lambda_c$. In a displacement of the electron, the relativity of scales will imply the appearance of a field induced by this displacement.

Scale Relativity, Non-differentiability and Fractal Space-time


To understand it, we can take as a model an aspect of the construction of Einstein’s gravitation theory from the general relativity of motion. In this theory, gravitation is identified with the manifestation of the curvature of space-time, which results in a rotation of vectors of geometric origin. However, this general rotation of any vector during a translation can result simply from the generalized relativity of motion alone. Indeed, since space-time is relative, a vector $V^\mu$ subjected to a displacement $dx^\rho$ cannot remain identical to itself (the reverse would mean absolute space-time). It will thus undergo a rotation, which is written, using the Einstein summation convention on identical lower and upper indices, $\delta V^\mu = \Gamma^\mu_{\nu\rho} V^\nu dx^\rho$. The Christoffel symbols $\Gamma^\mu_{\nu\rho}$, which emerge naturally in this transformation, can then be calculated, in the course of this construction, in terms of derivatives of the metric potentials $g_{\mu\nu}$, which makes it possible to regard them as components of the gravitational field generalizing Newton’s gravitational force. Similarly, in the case of fractal electron structures, we expect that a structure which was initially characterized by a certain scale jumps to another scale after the electron displacement (if not, the scale space would be absolute, which would be in contradiction with the principle of scale relativity). A dilation field of resolution induced by the translations is then expected to appear, which is written:

$$e\,\frac{\delta\varepsilon}{\varepsilon} = -A_\mu\,\delta x^\mu \qquad (14.37)$$

This effect can be described in terms of the introduction of a covariant derivative:

$$e\,D_\mu \ln(\lambda/\varepsilon) = e\,\partial_\mu \ln(\lambda/\varepsilon) + A_\mu \qquad (14.38)$$

Now, this field of dilation must be defined irrespective of the initial scale from which we started, i.e., whatever the substructure considered. Therefore, starting from another scale $\varepsilon'$, with $\varepsilon = \rho\,\varepsilon'$ (here we take into account, as a first step, only the Galilean scale relativity law in which the product of two dilations is the standard one), we get during the same translation of the electron:

$$e\,\frac{\delta\varepsilon'}{\varepsilon'} = -A'_\mu\,\delta x^\mu \qquad (14.39)$$

The two expressions for the potential $A_\mu$ are then connected by the relation:

$$A'_\mu = A_\mu + e\,\partial_\mu \ln\rho \qquad (14.40)$$

where $\ln\rho(x) = \ln(\varepsilon/\varepsilon')$ is the relative scale state (it depends only on the ratio between the resolutions $\varepsilon$ and $\varepsilon'$), which now depends explicitly on the coordinates. In this regard, this approach already comes under the framework of general scale relativity and of non-linear scale transformations, since the “scale velocity” has been redefined as a first derivative of the djinn, $\ln\rho = d\ln L/d\delta$, so that equation (14.40) involves a second-order derivative of the fractal coordinate, $d^2 \ln L/dx^\mu\,d\delta$.
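The step from (14.37) and (14.39) to (14.40) can be spelled out. The following is only a sketch, under the assumption that the two resolutions are related by $\varepsilon = \rho\,\varepsilon'$ (which is what the definition $\ln\rho = \ln(\varepsilon/\varepsilon')$ suggests) and that $\rho$ varies with position:

```latex
\begin{align*}
e\,\frac{\delta\varepsilon'}{\varepsilon'}
  &= e\,\delta\ln\varepsilon'
   = e\,\delta\ln\varepsilon - e\,\delta\ln\rho \\
  &= -A_\mu\,\delta x^\mu - e\,(\partial_\mu \ln\rho)\,\delta x^\mu
   = -\bigl(A_\mu + e\,\partial_\mu \ln\rho\bigr)\,\delta x^\mu .
\end{align*}
```

Comparing with $e\,\delta\varepsilon'/\varepsilon' = -A'_\mu\,\delta x^\mu$ gives $A'_\mu = A_\mu + e\,\partial_\mu \ln\rho$: the potential changes by a gradient, as in a gauge transformation.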


If we consider a translation along two different coordinates (or, equivalently, a displacement on a closed loop), we may write a commutator relation:

$$e\,(\partial_\mu D_\nu - \partial_\nu D_\mu)\ln\rho = \partial_\mu A_\nu - \partial_\nu A_\mu \qquad (14.41)$$

This relation defines a tensor field $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, which, unlike $A_\mu$, is independent of the initial scale from which we started. We recognize in $F_{\mu\nu}$ the analog of an electromagnetic field, in $A_\mu$ that of an electromagnetic potential, in $e$ that of the electric charge and in equation (14.40) the property of gauge invariance which, in accordance with Weyl’s initial ideas and their development by Dirac [DIR 73], recovers its initial status of scale invariance. However, equation (14.40) represents progress compared with these early attempts and with the status of gauge invariance in today’s physics. Indeed, the gauge function $\chi(x, y, z, t)$ which intervenes in the standard formulation of gauge invariance, $A'_\mu = A_\mu + e\,\partial_\mu \chi$, and which has, up to now, been considered as arbitrary, is identified with the logarithm of internal resolutions, $\chi = \ln \rho(x, y, z, t)$. Another advantage with respect to Weyl’s theory is that we are now allowed to define four different and independent dilations along the four space-time resolutions instead of only one global dilation. Therefore, we expect that the field above (which corresponds to a group U(1) of electromagnetic field type) is embedded into a larger field, in accordance with the electroweak theory and grand unification attempts. In the same way, we expect that the charge $e$ is an element of a more complicated, “vectorial” charge. These early remarks have now developed into a full theory of non-Abelian gauge fields [NOT 06], in which the main tools and results of Yang-Mills theories can be recovered as a manifestation of fractal geometry. Moreover, this generalized approach makes it possible to suggest a new and more completely unified preliminary version of electroweak theory [NOT 00b], in which the Higgs boson mass can be predicted theoretically (we find $m_H = \sqrt{2}\,m_W = 113.73 \pm 0.06$ GeV, where $m_W$ is the W gauge boson mass).
Moreover, our interpretation of gauge invariance yields new insights about the nature of the electric charge and, when it is combined with the Lorentzian structure of dilations of special scale-relativity, it makes it possible to obtain new relations between the charges and the masses of elementary particles [NOT 94, NOT 96a], as recalled in what follows.

14.5.7.3. Nature of the charges

In a gauge transformation $A'_\mu = A_\mu - \partial_\mu \chi$, the wave function of an electron of charge $e$ becomes:

$$\psi' = \psi\,e^{ie\chi} \qquad (14.42)$$


In this expression, the essential role played by the gauge function is clear. It is the conjugate variable of the electric charge, in the same way as position, time and angle are the conjugate variables of momentum, energy and angular momentum, respectively, in the expressions of the action and/or the quantum phase of a free particle, $\theta = i(px - Et + \sigma\varphi)/\hbar$. Our knowledge of what constitutes energy, momentum and angular momentum comes from our understanding of the nature of space, time, angles and their symmetries (translations and rotations), using Noether’s theorem. Conversely, the fact that we still do not really know what an electric charge is, despite all the development of gauge theories, comes, in our view, from the fact that the gauge function $\chi$ is considered devoid of physical meaning. In the previous section, we interpreted the gauge transformation as a scale transformation of resolution, $\varepsilon \to \varepsilon'$, $\ln\rho = \ln(\varepsilon/\varepsilon')$. In such an interpretation, the specific property that characterizes a charged particle is the explicit scale dependence of its action, and therefore of its wave function, as a function of resolution. The result is that the electron’s wave function is written:

$$\psi' = \psi\,\exp\!\left(i\,\frac{e^2}{\hbar c}\,\ln\rho\right) \qquad (14.43)$$

Since, by definition (in the system of units where the permittivity of vacuum is 1):

$$e^2 = 4\pi\alpha\,\hbar c \qquad (14.44)$$

equation (14.43) becomes:

$$\psi' = \psi\,e^{i\,4\pi\alpha\ln\rho} \qquad (14.45)$$

Now considering the wave function of the electron as a function that depends explicitly on resolution ratios, we can write the scale differential equation of which $\psi$ is a solution as:

$$-i\hbar\,\frac{\partial\psi}{\partial\left(\frac{e}{c}\ln\rho\right)} = e\,\psi \qquad (14.46)$$

We recognize in $\tilde{D} = -i\hbar\,\partial/\partial\left(\frac{e}{c}\ln\rho\right)$ a dilation operator. Equation (14.46) can then be read as an eigenvalue equation:

$$\tilde{D}\psi = e\,\psi \qquad (14.47)$$

In such a framework, the electric charge is understood as the conserved quantity that comes from the new scale symmetry, namely, the uniformity of the resolution variable $\ln\varepsilon$.


14.5.7.4. Mass-charge relations

In the previous section, we have stated the wave function of a charged particle in the form:

$$\psi' = \psi\,e^{i\,4\pi\alpha\ln\rho} \qquad (14.48)$$

In the Galilean case such a relation leads to no new result, since $\ln\rho$ is unlimited. However, in the special scale-relativistic framework (see previous section), scale laws become Lorentzian below the Compton scale $\lambda_c$ of the particle, and $\ln\rho$ then becomes limited by the fundamental constant $C = \ln(\lambda_c/l_P)$, which characterizes the particle considered (where $l_P = (\hbar G/c^3)^{1/2}$ is the Planck length scale). This implies a quantization of the charge, which amounts to the relation $4\pi\alpha C = 2k\pi$, i.e.:

$$\alpha\,C = \frac{k}{2} \qquad (14.49)$$

where $k$ is an integer. This equation defines a general form for relations between masses and charges (coupling constants) of elementary particles.

For example, in the case of the electron, the ratio of its Compton length $\hbar/m_e c$ to the Planck length is equal to the ratio of the Planck mass ($m_P = (\hbar c/G)^{1/2}$) to the electron mass. Moreover, within the framework of the electroweak theory, it appears that the coupling constant of electrodynamics at low energy (i.e., the fine structure constant) results from a “running” electroweak coupling dependent on the energy scale. This running coupling is decreased by a factor 3/8 owing to the fact that the gauge bosons W and Z become massive and no longer contribute to the interaction at energies lower than their mass energy. We thus obtain a mass-charge relation for the electron which is written:

$$\frac{8}{3}\,\alpha\,\ln\frac{m_P}{m_e} = 1 \qquad (14.50)$$
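Relation (14.50) is easy to evaluate numerically. The sketch below uses reference values of the fine structure constant, the Planck mass and the electron mass; these inputs are assumptions taken from standard tables rather than from the chapter, and the function name is purely illustrative. Threshold effects at the Compton transition are not included.

```python
import math

# Reference values (assumed inputs from standard tables, not from the chapter).
ALPHA = 7.2973525693e-3          # low-energy fine structure constant
M_PLANCK_GEV = 1.220890e19       # Planck mass energy (hbar*c/G)^(1/2) * c^2, in GeV
M_ELECTRON_GEV = 0.51099895e-3   # electron mass energy, in GeV


def mass_charge_product(alpha: float, m_planck: float, m_particle: float) -> float:
    """Left-hand side of (14.50): (8/3) * alpha * ln(m_P / m)."""
    return (8.0 / 3.0) * alpha * math.log(m_planck / m_particle)


product = mass_charge_product(ALPHA, M_PLANCK_GEV, M_ELECTRON_GEV)
print(f"(8/3) * alpha * ln(m_P/m_e) = {product:.4f}")
```

With these inputs the product comes out close to, but not exactly, 1; only the mass ratio enters, so any consistent unit can be used for the two masses.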

Such a theoretical relation between the mass and the charge of the electron is supported by the experimental data, which lead to a value of 1.0027 for this product, and to 1.00014 when the threshold effects at the Compton transition are taken into account. Such a relation accounts for many other structures observed in particle physics and suggests solutions to the questions of the origin of the masses of certain particles, of the coupling values and of the hierarchy problem between electroweak and grand unification scales [NOT 96a, NOT 00a, NOT 00b].

14.6. Quantum-like induced dynamics

14.6.1. Generalized Schrödinger equation

In scale relativity, as we have seen, it is necessary to generalize the concept of space-time and once again to work within the framework of fractal space-time. We


consider the coordinate systems (and paths, particularly the geodesics of fractal space) which are themselves fractal, i.e., having internal structures at all scales. We concentrated, in the preceding sections, on possible descriptions of such structures, which relate to scale space. To finish, we will now briefly consider their induced effects on displacements in ordinary space. The combination of these effects leads to the introduction of a description tool of the quantum mechanical type. In its framework, we give up the traditional description in terms of initial conditions and deterministic individual trajectories, in favor of a statistical description in terms of probability amplitudes. Let us point out the essence of the method used within the framework of scale relativity to pass from a traditional dynamics to a quantum-like dynamics. The three minimal conditions which make it possible to transform the fundamental equation of dynamics into a Schrödinger equation are as follows:
1) there is an infinity of potential paths; this first condition is a natural outcome of non-differentiability and of the fractality of space, if the paths can be identified with the geodesics of this space;
2) the paths are fractal curves (the dimension D = 2, which corresponds to a complete loss of information on elementary displacements, plays a special role here). In the case of a space and its geodesics, the fractal character of the space directly implies the fractality of its geodesics;
3) there is irreversibility at the infinitesimal level, i.e., non-invariance under the reflection of the time differential element, $dt \to -dt$. Again, this condition is an immediate consequence of the abandonment of the differentiability hypothesis.
Let us recall that one of the fundamental tools which enable us to manage non-differentiability consists of reinterpreting differential elements as variables.
Thus, the space coordinate becomes a fractal function $X(t, dt)$ and its velocity, although becoming undefined at the limit $dt \to 0$, is now also defined as a fractal function. The difference is that there are two definitions instead of one (which are transformed one into the other by the reflection $dt \leftrightarrow -dt$), and thus the velocity concept becomes two-valued:

$$V_+(t, dt) = \frac{X(t + dt, dt) - X(t, dt)}{dt} \qquad (14.51)$$

$$V_-(t, dt) = \frac{X(t, dt) - X(t - dt, dt)}{dt} \qquad (14.52)$$

The first condition leads us to use a “fluid”-like description, where we no longer consider only the velocity of an individual path, but rather the mean velocity field $v[x(t), t]$ of all the potential paths. The second condition brings us back to preceding works concerning scale laws satisfying the relativity principle. We saw that, in the simplest “scale-Galilean” case,


the coordinate (which is a solution of a scale differential equation) decomposes in the form of a traditional, scale-independent, differentiable part and of a fractal, non-differentiable part. We use this result here, after having differentiated the coordinate. This leads us to decompose the elementary displacements $dX = dx + d\xi$ in the form of a scale-independent mean, $dx = v\,dt$, and of a fluctuation $d\xi$ characterized by a law of fractal behavior, $d\xi \propto dt^{1/D_F}$, where $D_F$ is the fractal dimension of the path. The third condition implies, as we have seen, a two-valuedness of the velocity field. Defined by $V = dX/dt = v + d\xi/dt$, it decomposes, in the case of both $V_+$ and $V_-$, in terms of a non-fractal component $v$ (thus differentiable in the ordinary sense) and of a divergent fractal component $d\xi/dt$, of zero mean. We are thus led to introduce a 3D twin process:

$$dX^i_\pm = dx^i_\pm + d\xi^i_\pm \qquad (14.53)$$

in which $dx^i_\pm = v^i_\pm\,dt$, $\langle d\xi^i_\pm \rangle = 0$ and:

$$\left\langle \frac{d\xi^i_\pm}{dt}\,\frac{d\xi^j_\pm}{dt} \right\rangle = \pm\,\delta^{ij}\left(\frac{2\mathcal{D}}{dt}\right)^{2-2/D_F} \qquad (14.54)$$

($c = 1$ is used here to simplify the writing; $\delta^{ij}$ is a Kronecker symbol). The symbol $\mathcal{D}$ is a fundamental scale parameter, which characterizes the behavior of fractal trajectories (it is nothing other than a different notation for the fractal to non-fractal transition scale introduced previously). This parameter determines the essential transition which appears in such a process between the fractal behavior on small scales (where the fluctuations dominate) and the non-fractal behavior on large scales (where the mean traditional motion dominates). A natural representation of the two-valuedness of variables due to irreversibility consists of using complex numbers (we can show that this choice is “covariant” in the sense that it preserves the form of the equations [CEL 04]). A complex time derivative operator is defined (which relates to the scale-independent differentiable parts):

$$\frac{\hat{d}}{dt} = \frac{1}{2}\left(\frac{d_+ + d_-}{dt} - i\,\frac{d_+ - d_-}{dt}\right) \qquad (14.55)$$

Then we define an average complex velocity which results from the action of this operator on the position variable:

$$\mathcal{V}^i = \frac{\hat{d}}{dt}\,x^i = V^i - i\,U^i = \frac{v^i_+ + v^i_-}{2} - i\,\frac{v^i_+ - v^i_-}{2} \qquad (14.56)$$

Note that, in more recent works, we have constructed such an operator from the whole velocity ﬁeld, including the non-differentiable part, and still obtained


the standard Schrödinger equation as the equation of motion [NOT 05, NOT 07], therefore allowing for the possible existence of fractal and non-differentiable wave functions in quantum mechanics [BER 96]. After having defined the laws of elementary displacements in such a fractal and locally irreversible process, we now need to analyze the effects of these displacements on other physical functions. Let us consider a differentiable function $f(X(t), t)$. Its total derivative with respect to time is written:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \nabla f\cdot\frac{dX}{dt} + \frac{1}{2}\,\frac{\partial^2 f}{\partial X^i\,\partial X^j}\,\frac{dX^i\,dX^j}{dt} \qquad (14.57)$$

We may now calculate the (+) and (−) derivatives of $f$. In this procedure, the mean value of $dX/dt$ amounts to $d_\pm x/dt = v_\pm$, while $\langle dX^i\,dX^j \rangle$ reduces to $\langle d\xi^i_\pm\,d\xi^j_\pm \rangle$. Finally, in the particular case when the fractal dimension of the paths is $D_F = 2$, we have $\langle d\xi^2 \rangle = 2\mathcal{D}\,dt$, and the last term of equation (14.57) is transformed into a Laplacian. We obtain in this case:

$$\frac{d_\pm f}{dt} = \left(\frac{\partial}{\partial t} + v_\pm\cdot\nabla \pm \mathcal{D}\Delta\right) f \qquad (14.58)$$
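The survival of the second-order term can be illustrated numerically. The following is a minimal one-dimensional sketch of the (+) case of (14.58) for a time-independent function; the Gaussian law chosen for the fluctuation, the function name and the test values are assumptions made for illustration (the text only fixes the mean and the variance $2\mathcal{D}\,dt$ of the fluctuation).

```python
import random


def forward_mean_derivative(f, x, v, D, dt, n_pairs=100_000, seed=0):
    """Estimate <f(x + dX) - f(x)> / dt for dX = v*dt + dxi.

    dxi is drawn as a Gaussian with zero mean and variance 2*D*dt
    (an illustrative choice; only these two moments matter here).
    Each draw is used with both signs, so that the contribution that is
    odd in dxi, which has zero mean, cancels instead of adding noise.
    """
    rng = random.Random(seed)
    sigma = (2.0 * D * dt) ** 0.5
    total = 0.0
    for _ in range(n_pairs):
        dxi = rng.gauss(0.0, sigma)
        for sign in (1.0, -1.0):
            total += f(x + v * dt + sign * dxi) - f(x)
    return total / (2 * n_pairs * dt)


# Hypothetical test case: f(x) = x**2, so df/dx = 2x and d2f/dx2 = 2.
x0, v0, D0, dt0 = 1.0, 0.5, 0.3, 1e-3
estimate = forward_mean_derivative(lambda x: x * x, x0, v0, D0, dt0)
predicted = v0 * 2.0 * x0 + D0 * 2.0   # (v d/dx + D d2/dx2) f, from (14.58)
print(f"estimate = {estimate:.3f}, predicted = {predicted:.3f}")
```

Without the fluctuation the estimate would tend to $v\,\partial f/\partial x$ alone; the extra contribution $\mathcal{D}\,\partial^2 f/\partial x^2$ does not vanish as $dt \to 0$ because $d\xi^2$ is of first order in $dt$.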

Although we consider here only the fractal dimension $D_F = 2$, we recall that all the results obtained can be generalized to other values of the dimension [NOT 96a]. By combining these two derivatives, we obtain the complex derivation operator with respect to time:

$$\frac{\hat{d}}{dt} = \frac{\partial}{\partial t} + \mathcal{V}\cdot\nabla - i\,\mathcal{D}\Delta \qquad (14.59)$$
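The combination can be checked term by term. As a sketch, inserting the two derivatives (14.58) into the definition (14.55), and writing $V = (v_+ + v_-)/2$ and $U = (v_+ - v_-)/2$ as in (14.56), the $\pm\mathcal{D}\Delta$ terms cancel in the sum and add in the difference:

```latex
\begin{align*}
\frac{1}{2}\,\frac{d_+ + d_-}{dt}
  &= \frac{\partial}{\partial t} + \frac{v_+ + v_-}{2}\cdot\nabla
   = \frac{\partial}{\partial t} + V\cdot\nabla , \\
-\frac{i}{2}\,\frac{d_+ - d_-}{dt}
  &= -i\,\frac{v_+ - v_-}{2}\cdot\nabla - i\,\mathcal{D}\Delta
   = -i\,U\cdot\nabla - i\,\mathcal{D}\Delta , \\
\frac{\hat{d}}{dt}
  &= \frac{\partial}{\partial t} + (V - i\,U)\cdot\nabla - i\,\mathcal{D}\Delta .
\end{align*}
```

The last line is the complex operator written with $\mathcal{V} = V - i\,U$.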

It has two imaginary terms, $-iU\cdot\nabla$ and $-i\mathcal{D}\Delta$, in addition to the standard Eulerian total derivative operator, $d/dt = \partial/\partial t + V\cdot\nabla$. We can now rewrite the fundamental equation of dynamics using this derivative operator: it then automatically takes into account the new effects considered. It keeps the Newtonian form:

$$m\,\frac{\hat{d}^2}{dt^2}\,x = -\nabla\phi \qquad (14.60)$$

where φ is an exterior potential. If the potential is either zero or a gravitational potential, this equation is nothing other than a geodesic equation. We have therefore implemented a generalized equivalence principle, thanks to which the motion (gravitational and quantum) remains locally described under an inertial form: indeed, as we will see now, this equation can be integrated under the form of a Schrödinger equation.


More generally, we can generalize Lagrangian mechanics with this new tool (see [NOT 93, NOT 96a, NOT 97a, NOT 07]). The complex character of the velocity $\mathcal{V}$ implies that the same is true of the Lagrange function and therefore of the action $\mathcal{S}$. The wave function $\psi$ is then introduced very simply as a re-expression of this complex action:

$$\psi = e^{i\mathcal{S}/2m\mathcal{D}} \qquad (14.61)$$

It is related to the complex velocity in the following manner:

$$\mathcal{V} = -2i\,\mathcal{D}\,\nabla(\ln\psi) \qquad (14.62)$$

We can now change the descriptive tool and write the Euler-Newton equation (14.60) in terms of this wave function:

$$2im\mathcal{D}\,\frac{\hat{d}}{dt}(\nabla\ln\psi) = \nabla\phi \qquad (14.63)$$

After some calculations, this equation can be integrated in the form of a Schrödinger equation [NOT 93]:

$$\mathcal{D}^2\Delta\psi + i\,\mathcal{D}\,\frac{\partial}{\partial t}\psi - \frac{\phi}{2m}\,\psi = 0 \qquad (14.64)$$

We find the standard quantum mechanics equation by selecting $\mathcal{D} = \hbar/2m$. By setting $\psi\psi^\dagger = \rho$, we find that the imaginary part of this equation is the continuity equation:

$$\frac{\partial\rho}{\partial t} + \mathrm{div}(\rho V) = 0 \qquad (14.65)$$

which justifies the interpretation of $\rho$ as a probability density [NOT 05, NOT 07].
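The identification with standard quantum mechanics can be made explicit. As a short check, substituting $\mathcal{D} = \hbar/2m$ into (14.64) and multiplying by $2m$ gives:

```latex
\begin{align*}
\frac{\hbar^2}{4m^2}\,\Delta\psi
  + i\,\frac{\hbar}{2m}\,\frac{\partial\psi}{\partial t}
  - \frac{\phi}{2m}\,\psi &= 0 \\
\Longrightarrow\quad
i\hbar\,\frac{\partial\psi}{\partial t}
  &= -\frac{\hbar^2}{2m}\,\Delta\psi + \phi\,\psi ,
\end{align*}
```

which is the usual Schrödinger equation for a particle of mass $m$ in the potential $\phi$.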

14.6.2. Application in gravitational structure formation

Physics has long been confronted with the problem of the highly inhomogeneous spatial distribution of matter in the universe. This distribution of spatial structures often shows a hierarchy of organization, whether in the microscopic domain (quarks in the nucleons, nucleons in the nucleus, nucleus and electrons in the atom, atoms in the molecule, etc.) or in the macroscopic domain (stars and their planetary systems, star groups and clusters gathering with the interstellar matter, itself fractal, in galaxies, which form groups and clusters of galaxies, which belong to superclusters of galaxies, themselves subsets of the large scale structures of the universe). What is striking, in both cases, is that it is vacuum rather than matter which dominates, even on very large scales where we thought we would find a homogeneous distribution.


The theory of scale relativity was built, among other aims, to deal with questions of scale structuring. We take into account an explicit intervention of the observation scales (which amounts to working within the framework of a fractal geometry), or more generally of the scales which are characteristic of the phenomena under consideration, as well as relations between these scales, by the introduction of a resolution space. As we saw, such a description of structures over all the scales (or on a broad range) induces a new dynamics whose behavior becomes quantum rather than traditional. However, the conditions under which Newton’s equation is integrated in the form of a Schrödinger equation (which correspond to a complete loss of information on individual trajectories, from the viewpoint of angles, space and time) do not manifest themselves only on microscopic scales. Certain macroscopic systems, such as protoplanetary nebulae, which created our solar system, could satisfy these conditions and thus be described statistically by a Schrödinger-type equation (but with, of course, an interpretation different from that of standard quantum mechanics). Such a dynamics leads naturally to a “morphogenesis” [DAR 03] since it generates organized structures in a hierarchical way, dependent on the external conditions (forces and boundary conditions). A great example of the application of such an approach is in planetary system formation. It is fascinating that theoretical predictions about it could be made [NOT 93], then validated in our solar system, several years before the discovery [MAY 95, WOL 94] of the ﬁrst extrasolar planets. 
We theoretically predicted that the distribution of the semi-major axes of planetary orbits should show probability peaks for values $a_n/GM = (n/w_0)^2$, where $M$ is the mass of the star, $w_0 = 144$ km/s is a universal constant which characterizes the structuring of the inner solar system and which is observed (with its multiples and sub-multiples) from the planetary scales to extragalactic scales, and $n$ is an integer. It is also expected that the eccentricities show probability peaks for values $e = k/n$, where $k$ is an integer ranging between 0 and $n - 1$. Since then, more than 250 exoplanets have been discovered, whose observed distributions of semi-major axes (Figure 14.1) and eccentricities (Figure 14.2) show a highly significant statistical agreement with the theoretically expected probability distributions [NOT 96b, NOT 97c, NOT 00d, NOT 01b, DAR 03].

14.7. Conclusion

The present contribution has mainly focused on the detailed principle and theoretical development of the “scale-relativistic” approach. However, we have not been able to touch on everything. For example, the construction of an equation of the Schrödinger type starting from the abandonment of differentiability, explicitly shown above in the case of Newton’s fundamental equation of motion, can be generalized to all cases where the equations of traditional physics can be put in the form of Euler-Lagrange equations. This was done explicitly for the equations of rotational


[Figure 14.1: histogram. Vertical axis: Number. Lower horizontal axis: $4.83\,(a/M)^{1/2}$. Upper horizontal axis: $a/M$ (AU/$M_\odot$), with tick values 0.043, 0.171, 0.385, 0.685, 1.07, 1.54 and 2.09]

Figure 14.1. Histogram of the observed distribution of the variable $\tilde{n} = 4.83\,(a/M)^{1/2}$, where $a$ indicates the semi-major axis and $M$ the star mass (in solar system units, i.e., astronomical unit AU and solar mass $M_\odot$), for the recently discovered exoplanets and the planets of our inner solar system. We theoretically expect probability peaks for integer values of this variable. The probability of obtaining such a statistical agreement by chance is lower than $4 \times 10^{-5}$
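The predicted peak positions follow directly from $a_n/GM = (n/w_0)^2$. The sketch below evaluates them for a star of one solar mass; the values of the solar gravitational parameter and of the astronomical unit are assumed reference constants that do not appear in the chapter, and the function name is illustrative.

```python
G_M_SUN = 1.32712440018e20   # solar gravitational parameter G*M_sun, m^3/s^2 (assumed reference value)
AU = 1.495978707e11          # astronomical unit, m
W0 = 144.0e3                 # w0 = 144 km/s from the text, in m/s


def predicted_semi_major_axis_au(n: int, mass_in_solar_masses: float = 1.0) -> float:
    """Peak position a_n = G M (n / w0)^2, returned in astronomical units."""
    return G_M_SUN * mass_in_solar_masses * (n / W0) ** 2 / AU


peaks = [predicted_semi_major_axis_au(n) for n in range(1, 8)]
for n, a in enumerate(peaks, start=1):
    print(f"n = {n}: a/M = {a:.3f} AU per solar mass")

# Coefficient that turns (a/M)^(1/2) into the rank n (quoted as 4.83 in the caption).
coefficient = 1.0 / peaks[0] ** 0.5
print(f"coefficient = {coefficient:.2f}")
```

For $n = 1, 2, 3$ this gives approximately 0.043, 0.171 and 0.385 AU per solar mass, and the coefficient is simply $w_0/(GM_\odot/\mathrm{AU})^{1/2}$, so both axes of the figure carry the same information.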

[Figure 14.2: histogram. Vertical axis: Number (ticks from 0 to 16). Horizontal axis: $n \cdot e$ (ticks from 1 to 6)]

Figure 14.2. Histogram of the distribution of $\tilde{k} = n \cdot e$, where $n$ is the principal quantum number (which characterizes the semi-major axes) and $e$ the eccentricity, for the exoplanets and the planets of the inner solar system. The theory predicts probability peaks for integer values of this variable. The probability of obtaining such an agreement by chance is lower than $10^{-4}$. The combined probability of obtaining the two distributions (semi-major axes and eccentricities) by chance is $3 \times 10^{-7}$, i.e., a level of statistical significance reaching 5σ


motion of a solid, for the equation of motion with a dissipation function, for the Euler and Navier-Stokes equations, and even for scalar field equations [NOT 97a, NOT 08]. Among the possible generalizations of the theory, we can also mention abandoning differentiability, not only in the usual space (which leads, as we saw, to the introduction of a scale-space governed by differential equations acting on scale variables, in particular on the spatio-temporal resolutions), but also in the space of scales itself. The whole of the previous construction can again be applied to this deeper level of description. This leads to the introduction of a “scale-quantum mechanics” [NOT 08]. In this framework, which is equivalent to a “third quantization,” fractal “objects” of a new type can be defined: rather than having structures at well-defined scales (the case of ordinary fractal objects), or having variable scale structures described by traditional laws (the case of the “scale-relativistic” fractals considered in this chapter), they are now characterized by a probability amplitude for scale ratios (“quantum” fractals). With regard to the applications of this approach, we gave only two examples, concerning the electron mass and planetary systems. Let us recall, nevertheless, that it has been applied successfully to a large number of problems of physics and astrophysics which were unresolved with the usual methods, and that it has also allowed the theoretical prediction of structures and of new relations [NOT 96a, NOT 08]. Thus, the transformation of the fundamental equation of dynamics into a Schrödinger equation under very general conditions (loss of information on the individual paths and irreversibility) leads to a renewed comprehension of the formation and evolution of gravitational structures.
This method, besides the semi-major axes and eccentricities of planets discovered around solar-type stars, briefly considered earlier, was also applied successfully to the three planets observed around pulsar PSR 1257+12 [NOT 96b], to obliquities and inclinations of planets and satellites of the solar system [NOT 98a], satellites of giant planets [HER 98], double stars, double galaxies, distribution of the galaxies on a very large scale and other gravitational structures [NOT 96a, NOT 98b, DAR 03].

14.8. Bibliography

[AIT 82] AITCHISON I., An Informal Introduction to Gauge Field Theories, Cambridge University Press, Cambridge, 1982.
[BEN 00] BEN ADDA F., CRESSON J., “Divergence d’échelle et différentiabilité”, Comptes rendus de l’Académie des sciences de Paris, série I, vol. 330, p. 261–264, 2000.
[BER 96] BERRY M.V., “Quantum fractals in boxes”, J. Phys. A: Math. Gen., vol. 29, p. 6617–6629, 1996.
[CAS 02] CASH R., CHALINE J., NOTTALE L., GROU P., “Human development and log-periodic laws”, C.R. Biologies, vol. 325, p. 585, 2002.


[CEL 04] CÉLÉRIER M.N., NOTTALE L., “Quantum-classical transition in scale relativity”, J. Phys. A: Math. Gen., vol. 37, p. 931, 2004.
[CHA 99] CHALINE J., NOTTALE L., GROU P., “Is the evolutionary tree a fractal structure?”, Comptes rendus de l’Académie des sciences de Paris, vol. 328, p. 717, 1999.
[DAR 03] DA ROCHA D., NOTTALE L., “Gravitational structure formation in scale-relativity”, Chaos, Solitons & Fractals, vol. 16, p. 565, 2003.
[DIR 73] DIRAC P.A.M., “Long range forces and broken symmetries”, Proc. Roy. Soc. Lond., vol. A333, p. 403–418, 1973.
[DUB 96] DUBRULLE B., GRANER F., “Possible statistics of scale invariant systems”, J. Phys. (Fr.), vol. 6, p. 797–816, 1996.
[EIN 05] EINSTEIN A., “Zur Elektrodynamik bewegter Körper”, Annalen der Physik, vol. 17, p. 891–921, 1905.
[EIN 16] EINSTEIN A., “Die Grundlage der allgemeinen Relativitätstheorie”, Annalen der Physik, vol. 49, p. 769–822, 1916.
[ELN 95] EL NASCHIE M.S., ROSSLER O.E., PRIGOGINE I. (Eds.), Quantum Mechanics, Diffusion, and Chaotic Fractals, Pergamon, Cambridge, p. 93, 1995.
[HER 98] HERMANN R., SCHUMACHER G., GUYARD R., “Scale relativity and quantization of the solar system. Orbit quantization of the planet’s satellites”, Astronomy and Astrophysics, vol. 335, p. 281, 1998.
[MAN 75] MANDELBROT B., Les objets fractals, Flammarion, Paris, 1975.
[MAN 82] MANDELBROT B., The Fractal Geometry of Nature, Freeman, San Francisco, 1982.
[MAY 95] MAYOR M., QUELOZ D., “A Jupiter-mass companion to a solar-type star”, Nature, vol. 378, p. 355–359, 1995.
[NOT 84] NOTTALE L., SCHNEIDER J., “Fractals and non-standard analysis”, Journal of Mathematical Physics, vol. 25, p. 1296, 1984.
[NOT 89] NOTTALE L., “Fractals and the quantum theory of space-time”, International Journal of Modern Physics, vol. A4, p. 5047, 1989.
[NOT 92] NOTTALE L., “The theory of scale relativity”, International Journal of Modern Physics, vol. A7, p. 4899, 1992.
[NOT 93] NOTTALE L., Fractal Space-Time and Microphysics: Towards a Theory of Scale Relativity, World Scientific, Singapore, 1993.
[NOT 94] NOTTALE L., “Scale relativity: first steps toward a field theory”, in DIAZ ALONSO J., LORENTE PARAMO M. (Eds.), Relativity in General, E.R.E.’93, Spanish relativity meetings, Editions Frontières, p. 121, 1994.
[NOT 96a] NOTTALE L., “Scale relativity and fractal space-time: application to quantum physics, cosmology and chaotic systems”, Chaos, Solitons, and Fractals, vol. 7, p. 877, 1996.
[NOT 96b] NOTTALE L., “Scale-relativity and quantization of extrasolar planetary systems”, Astronomy and Astrophysics Letters, vol. 315, p. L9, 1996.


[NOT 97a] NOTTALE L., “Scale relativity and quantization of the universe. I. Theoretical framework”, Astronomy and Astrophysics, vol. 327, p. 867, 1997.
[NOT 97b] NOTTALE L., “Scale relativity”, in DUBRULLE B., GRANER F., SORNETTE D. (Eds.), Scale Invariance and Beyond, Les Houches workshop, EDP Sciences and Springer, p. 249, 1997.
[NOT 97c] NOTTALE L., SCHUMACHER G., GAY J., “Scale relativity and quantization of the solar system”, Astronomy and Astrophysics, vol. 322, p. 1018, 1997.
[NOT 98a] NOTTALE L., “Scale relativity and quantization of planet obliquities”, Chaos, Solitons, and Fractals, vol. 9, p. 1035, 1998.
[NOT 98b] NOTTALE L., SCHUMACHER G., “Scale relativity, fractal space-time, and gravitational structures”, in NOVAK M.M. (Ed.), Fractals and Beyond: Complexities in the Sciences, World Scientific, p. 149, 1998.
[NOT 00a] NOTTALE L., “Scale relativity and non-differentiable fractal space-time”, in SIDHARTH B.G., ALTAISKY M. (Eds.), Frontiers of Fundamental Physics 4, Kluwer Academic and Plenum Publishers, International Symposia on Frontiers of Fundamental Physics, 2000.
[NOT 00b] NOTTALE L., “Scale relativity, fractal space-time, and morphogenesis of structures”, in DIEBNER H., DRUCKREY T., WEIBEL P. (Eds.), Sciences of the Interface, Genista, Tübingen, p. 38, 2000.
[NOT 00c] NOTTALE L., CHALINE J., GROU P., Les arbres de l’évolution, Hachette, Paris, 2000.
[NOT 00d] NOTTALE L., SCHUMACHER G., LEFÊVRE E.T., “Scale relativity and quantization of exoplanet orbital semi-major axes”, Astronomy and Astrophysics, vol. 361, p. 379, 2000.
[NOT 01a] NOTTALE L., CHALINE J., GROU P., “On the fractal structure of evolutionary trees”, in LOSA G. (Ed.), Fractals in Biology and Medicine, Birkhäuser Press, Mathematics and Biosciences in Interaction, 2001.
[NOT 01b] NOTTALE L., TRAN MINH N., “Theoretical prediction of orbits of planets and exoplanets”, Scientific News, Paris Observatory, 2002, http://www.obspm.fr/actual/nouvelle/nottale/nouv.fr.shtml.
[NOT 03] NOTTALE L., “Scale-relativistic cosmology”, Chaos, Solitons & Fractals, vol. 16, p. 539, 2003.
[NOT 05] NOTTALE L., “Origin of complex and quaternionic wavefunctions in quantum mechanics: the scale-relativistic view”, in ANGLÈS P. (Ed.), Proceedings of 7th International Colloquium on Clifford Algebra and their Applications, 19-29 May 2005, Toulouse, Birkhäuser.
[NOT 06] NOTTALE L., CÉLÉRIER M.N., LEHNER T., “Non-Abelian gauge field theories in scale relativity”, J. Math. Phys., vol. 47, p. 032303, 2006.
[NOT 07] NOTTALE L., CÉLÉRIER M.N., “Derivation of the postulates of quantum mechanics from the first principles of scale relativity”, J. Phys. A: Math. Theor., vol. 40, p. 14471, 2007.
[NOT 08] NOTTALE L., The Theory of Scale Relativity, 528 pp., 2008, forthcoming.

498

Scaling, Fractals and Wavelets

[ORD 83] O RD G.N., “Fractal space-time: a geometric analogue of relativistic quantum mechanics” Journal of Physics A: Mathematical and General, vol. 16, p. 1869, 1983. [POC 97] P OCHEAU A., “From scale-invariance to scale covariance”, in D UBRULLE B., G RANER F., S ORNETTE D. (Eds.), Scale Invariance and Beyond, Les Houches workshop, EDP Sciences and Springer, p. 209, 1997. [POI 05] P OINCARÉ H., “Sur la dynamique de l’electron”, Comptes Rendus de l’Académie des Sciences de Paris, vol. 140, p. 1504–1508, 1905. [QUE 00] Q UEIROS -C ONDÉ D., “Principle of ﬂux entropy conservation for species evolution Principe de conservation du ﬂux d’entropie pour l’évolution des espèces”, Comptes Rendus de l’Académie des Sciences de Paris, vol. 330, p. 445–449, 2000. [SOR 97] S ORNETTE D., “Discrete scale invariance”, in D UBRULLE B., G RANER F., S ORNETTE D. (Eds.), Scale Invariance and Beyond, Les Houches workshop, EDP Sciences and Springer, p. 235, 1997. [SOR 98] S ORNETTE D., “Discrete scale invariance and complex dimensions”, Physics Reports, vol. 297, p. 239–270, 1998. [SOR 01] S ORNETTE D., J OHANSEN A., “Finite-time singularity in the dynamics of the world population and economic indices”, Physica, vol. A 294, p. 465–502, 2001. [TRI 93] T RICOT C., Courbes et dimensions fractales, Springer-Verlag, Paris, 1993. [WIL 83] W ILSON K.G., “The Renormalization Group and critical phenomena”, American Journal of Physics, vol. 55, p. 583–600, 1983. [WOL 94] W OLSZCZAN A., “Conﬁrmation of Earth-mass planets orbiting the millisecond pulsar PSR B1257+12” Science, vol. 264, p. 538, 1994.

List of Authors

Patrice ABRY, Laboratoire de physique de l’ENS, CNRS, Lyon, France
Liliane BEL, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Albert BENASSI, Department of Mathematics, Blaise Pascal University, Clermont-Ferrand, France
Jean-Marc CHASSERY, LIS, CNRS, Grenoble, France
Serge COHEN, LSP, Paul Sabatier University, Toulouse, France


Khalid DAOUDI, INRIA, Nancy, France
Franck DAVOINE, Heudiasyc, University of Technology of Compiègne, France
Patrick FLANDRIN, Laboratoire de physique de l’ENS, CNRS, Lyon, France
Paulo GONÇALVES, Laboratoire de l’Informatique du Parallélisme de l’ENS, INRIA, Lyon, France
Jacques ISTAS, Département IMSS, Pierre Mendès France University, Grenoble, France
Stéphane JAFFARD, Department of Mathematics, University of Paris XII, Créteil, France
Pierrick LEGRAND, IMB, University of Bordeaux 1, Talence, France
Jacques LÉVY VÉHEL, INRIA, Centre de recherche Saclay - Île-de-France, Orsay, France


Denis MATIGNON, LTCI, CNRS, Ecole nationale supérieure des télécommunications, Paris, France
Laurent NOTTALE, Observatoire de Paris-Meudon, CNRS, Meudon, France
Georges OPPENHEIM, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Rudolf RIEDI, Department of Telecommunications, University of Applied Sciences of Western Switzerland, Fribourg, Switzerland
Luc ROBBIANO, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Claude TRICOT, LMP, Blaise Pascal University, Clermont-Ferrand, France
Darryl VEITCH, Department of Electrical and Electronic Engineering, University of Melbourne, Victoria, Australia


Marie-Claude VIANO, Laboratoire de mathématiques, Paris-Sud University, Orsay, France
Christian WALTER, PricewaterhouseCoopers, Paris, University of Evry, France

Index

A
aggregation, 421, 422, 432, 443

B
binomial measure, 62–65, 68, 154, 157, 158, 160, 164, 166

C
cascade, 80, 81, 92, 125, 156, 158, 425, 430
  binomial, 140, 156, 162, 169
computer network traffic
  bursty, 415, 417, 419, 420
  fractal, 420, 424, 426, 430, 431

D
diffusive representation, 238–240, 258, 262–264, 271, 273
dimension
  fractal, 145, 163, 428, 457, 458, 473, 475–477, 479–482
  Hausdorff, 41–45, 53–55, 63, 78, 79, 106, 123, 128, 132, 145, 146, 281, 294, 302, 303, 335, 369, 370
distribution
  processes, 294–297
  Weibull, 453

E
equation
  fractional differential, 239, 240, 251–273
  fractional partial differential, 239, 240, 266–273
  Schrödinger, 488–493, 495
exponent
  Hölder, 33–48, 61, 67, 77, 105–107, 112–116, 119, 123, 127, 132, 222, 223, 226, 301, 303, 304, 306, 308, 311–316, 320, 322, 326, 328, 368, 369, 372, 373, 383–387, 396, 398, 406, 407
    pointwise, 33, 37, 38, 53, 226, 311, 315, 369, 372, 384, 386, 390, 392, 394
  Hurst, 77
  oscillation, 39, 127, 132, 140

F
fractal space-time, 466, 488
fractional
  derivative, 113, 124, 237–240, 242–251, 267, 273
  filter, 279, 281–284, 296
  integration, 115, 211, 240–242, 439
function
  partition, 79, 126, 128, 131, 132, 147–149, 151, 153, 163, 166, 302
  Weierstrass, 32, 41, 44, 65, 66, 185, 188, 302, 316, 317, 336

G
gravitational structure, 492, 495

H
Hausdorff measure, 27, 61
heavy tail, 423, 427–430, 432, 433

I
increment
  r-stationary, 189–191
  stationary, 73–77, 79, 86–88, 91, 96, 163, 181, 189, 190, 194–196, 198–200, 205, 209, 212, 431

L
large deviations, 57, 127, 320, 371
long range dependence, 75, 76, 88, 89, 92, 93, 95–97, 139, 167, 200, 422, 425, 427–429, 431–433

M
motion
  fractional Brownian, 77, 103, 116, 169, 170, 172, 191, 193, 194, 205, 207, 214, 222, 223, 280, 302, 388, 431, 432, 455
  multifractional Brownian, 116, 218–226, 232, 375, 396–398
multiplexing, 414, 416, 417, 430
multiresolution analysis, 83–85, 95, 207, 211

N
noise
  filtered white, 201, 214, 215, 217, 218, 228, 230
  fractional Gaussian, 78, 99, 167, 173, 421, 424, 425

P
process
  α-stable, 88, 181, 186, 196, 197
  Censov, 198
  distribution, 294–297
  increments, 73, 75–77, 88, 156, 160, 167, 281
  Lévy, 104, 117–120
  non-differentiable, 469
  self-similar, 73–75, 86, 88, 96, 179–202, 213, 222, 446
  semi-stable, 186
  Takenaka, 195, 198
pseudo-differential operator, 38, 40, 112, 190, 192, 197, 215, 216, 238–240, 263, 273

Q
quadratic variations, 188, 193, 194, 226–228, 231, 233

R
renormalization of sums of random variables, 186, 187

S
sample path regularity, 91, 181, 281
scale
  change of, 205, 465
  dynamics, 473, 476, 478–480, 482
  equation, 473, 478, 480
  invariance, 71–81, 92–100, 179, 180, 420–434
  relativity, 465–493
scaling law, 76, 80, 87, 92, 93, 96, 100, 168, 269, 270, 419–430, 439
segmentation, 66, 67, 222, 301, 324–328
self-similarity
  local, 206, 213, 215, 218, 220
source
  on/off, 427–429, 431, 432
space
  Besov, 123–126, 129, 144, 295, 377
  Sobolev, 124, 144, 295
spectrum
  large deviation, 51, 52, 54, 60, 65, 67, 127, 370
  Legendre, 54, 60, 128, 147–149
  singularity, 106, 112, 118, 119, 123, 125, 128, 129, 131
system
  iterated function, 23, 32, 301–320
    hyperbolic, 335

W
wavelet analysis, 85–92, 98, 107, 159, 322