COMPUTATIONAL INVERSE TECHNIQUES in NONDESTRUCTIVE EVALUATION
© 2003 by CRC Press LLC
G.R. Liu
X. Han
CRC PRESS
Boca Raton London New York Washington, D.C.
Library of Congress Cataloging-in-Publication Data
Liu, G.R.
Computational inverse techniques in nondestructive evaluation / G.R. Liu, X. Han.
p. cm.
Includes bibliographical references and index.
ISBN 0-8493-1523-9 (alk. paper)
1. Non-destructive testing--Mathematics. I. Han, X. II. Title.
TA417.2.L58 2003
620.1′127--dc21 2003043554
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher. The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying. Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.
Visit the CRC Press Web site at www.crcpress.com © 2003 by CRC Press LLC No claim to original U.S. Government works International Standard Book Number 0-8493-1523-9 Library of Congress Card Number 2003043554 Printed in the United States of America 1 2 3 4 5 6 7 8 9 0 Printed on acid-free paper
Dedication
To Zuona Yun, Kun, Run, and my family for the time they gave to me
G. R. Liu
To Zhenglin, Weiqi and my family for their support
To my mentor, Dr. Liu for his guidance
X. Han
Preface
In the past two decades, inverse problems have been one of the most important focal areas of research in engineering. Advances in computational methods, in computer hardware and software, and in so-called soft computing have made inverse techniques a powerful tool for practical engineering problems. However, for many researchers and engineers, inverse analysis remains a distant topic because its concepts have a frightening and mysterious reputation for being difficult to grasp and apply.

In the early 1980s, G.R. Liu encountered his first inverse problem, the characterization of composite laminates, during his postgraduate studies. He was alarmed and confused by the flood of unfamiliar terminology related to inverse problems, such as ill-posedness, regularization, stability, and uniqueness. In fact, he worried so much about doing things improperly that he finally gave up pursuing the problem in the context of inverse analysis. Instead, he solved it by cutting the composite laminates into pieces and measuring the mechanical and thermal properties using traditional tensile machines and thermal measurement equipment, a destructive, time-consuming, and problematic but conventionally accepted approach. Back then, he wished for a book like this one to guide his research so that he could conduct it in a more advanced manner.

Dr. Liu next found the courage to face the same inverse problem in 1997, when he had a set of good forward solvers for waves in composite laminates. This time he decided to put the terminology aside and go straight ahead to formulate the problem and solve it with optimization tools. He managed to obtain the solution without too much difficulty, but with many mistakes. With the confidence built on this first trial study, he then turned back to the terminology and found that these terms were, in fact, walls that scared people away.
The best way to break through these walls is to solve an inverse problem first using general knowledge, and then deal with the issues as they occur. Slowly, experience will accumulate and tricks and techniques will be learned, so that an increasing number of inverse problems can be solved. Such learning or training was difficult before but is much easier and particularly useful now, because the practice can be performed on a PC, to which almost everyone has access. The authors learned some inverse techniques the hard way described above, and they decided to put their experience into this book: how inverse problems of mechanics can be formulated and solved, and which issues are important for successful inverse analysis. They are committed to presenting all these materials in a very simple and easy-to-understand form, as well
as using the simplest examples to reveal the true meanings of these abstruse terms and the mechanisms of some important phenomena. Many example problems and practical engineering problems are presented, together with many numerical tests as well as some experimental verification. The authors hope that this book can help readers face inverse problems comfortably and tackle them with ease, without being frightened off. The truth is that many engineering inverse problems are not that difficult, because they can be well posed if they are properly formulated with a sound experimental strategy.

Properly formulating and solving an inverse problem demands that the analyst have (1) a very good understanding of the physical problem, (2) a good experimental strategy and quality measurement data, and (3) most importantly, effective computational techniques.

Without a good understanding of the physics of the problem, basically nothing can be done. This book will not help much in this respect, except to emphasize the importance of this understanding.

Quality measurement data are essential because they determine the quality of the solution of the inverse problem. Quality here includes not only the accuracy of the experimental (or test or observational) data, but also precise knowledge of the characteristics of the measurement data in terms of noise content (noise level, frequency, etc.). Apart from modern high-tech experimental equipment, acquiring such quality experimental data depends highly on understanding the physics involved in the problem and the process of measurement. Although this book covers some of the issues in measuring wave and vibration responses of structures, they are not its focus.

This book emphasizes the key to the solution of any practical and complex inverse problem: computational techniques. These techniques concern how to obtain what is needed from given experimental data efficiently and accurately.
Without the computer and effective computational techniques, it is not possible to perform a decent inverse analysis of a complex engineering problem.

A forward solver is also very important, but this book generally assumes that a reliable forward solver for the physical problem is available. Thus, only sources of and a brief introduction to forward solvers are provided here. It is the task of the analyst to use these forward solvers properly and produce reliable results for the inverse analysis, which is by no means an easy task but is not the focus of this volume. Readers may refer to earlier books by Dr. Liu or other related literature.

The authors' work in the area of inverse analysis has been profoundly influenced and guided by many existing works reported in the open literature, which are partially listed in the references. Without those significant contributions to this area, this book would not exist. The authors would like to thank all the authors of the excellent papers and books published in areas related to this book's topic.

Many colleagues and students have supported and contributed to the writing of this book. Dr. Liu expresses sincere thanks to all of them, with special appreciation to Y.G. Xu, Z.L. Yang, Z.P. Wu, S.I. Ishak, H.M. Shang,
S.P. Lim, W.B. Ma, Irwan Bin Karim, S.C. Chen, and H.J. Ma. Many of them have contributed examples to this book in addition to their hard work in carrying out a number of projects related to inverse problems. Finally, the authors would also like to thank A*STAR, Singapore, for its partial financial sponsorship for research projects related to the topic of this book that were undertaken by the authors and their teams.
G.R. Liu and X. Han
Authors
Dr. G.R. Liu received his Ph.D. from Tohoku University, Japan, in 1991. He was a postdoctoral fellow at Northwestern University, Evanston, Illinois. He is currently the director of the Center for Advanced Computations in Engineering Science (ACES), National University of Singapore, and an associate professor in the Department of Mechanical Engineering, National University of Singapore. He is also currently the president of the Association for Computational Mechanics (Singapore). Dr. Liu has provided consultation services to many national and international organizations and has authored more than 300 technical publications, including more than 180 international journal papers. He has written five books, including the popular book Mesh-Free Method: Moving beyond the Finite Element Method. He serves as an editor and a member of the editorial boards of five scientific journals. Dr. Liu is the recipient of the Outstanding University Researchers Award, the Defense Technology Prize, and the Silver Award at CrayQuest (a nationwide competition). His research interests include computational mechanics, mesh-free methods, nanoscale computation, microbiosystem computation, vibration and wave propagation in composites, mechanics of composites and smart materials, inverse problems, and numerical analysis.

Dr. X. Han obtained his bachelor's and master's degrees in engineering mechanics from Harbin Institute of Technology, China, in 1990 and 1997, respectively, and his doctorate in mechanical engineering from National University of Singapore in 2001. He was a research fellow at the School of Mechanical and Production Engineering, Nanyang Technological University, Singapore. Dr. Han has been working on the development of numerical analysis techniques for wave propagation problems and computational inverse techniques. He is currently the manager of the Center for Advanced Computations in Engineering Science (ACES), Department of Mechanical Engineering, National University of Singapore. Dr. Han's research interests include structural dynamics of advanced composite and smart materials, inverse problems, and numerical analysis. He is the author or co-author of approximately 30 refereed journal papers.
Contents
1 Introduction
1.1 Forward and Inverse Problems Encountered in Structural Systems
1.2 General Procedures to Solve Inverse Problems
1.3 Outline of the Book
2 Fundamentals of Inverse Problems
2.1 A Simple Example: A Single Bar
2.1.1 Forward Problem
2.1.2 Inverse Problem
2.2 A Slightly Complex Problem: A Composite Bar
2.2.1 Forward Problem
2.2.2 Inverse Problem Case I-1: Load/Force Identification with Unique Solution (Even-Posed System)
2.2.3 Inverse Problem Case I-2: Load/Force Identification with No Unique Solution (Under-Posed System)
2.2.4 Inverse Problem Case II-1: Material Property Identification with Unique Solution (Even-Posed System)
2.2.5 Inverse Problem Case II-2: Material Property Identification with No Unique Solution (Over-Posed System)
2.2.6 Inverse Problem Case III: Geometry Identification with Unique Solution
2.2.7 Inverse Problem Case IV, Boundary Condition Identification
2.2.8 Points to Note
2.3 Type III Ill-Posedness
2.3.1 Forward Problem
2.3.2 Differential Operation: Magnification of Error
2.3.3 Definition of Type III Ill-Posedness
2.3.4 A Simple Solution for Type III Ill-Posed Inverse Problems
2.3.5 Features of Ill-Posedness
2.4 Types of Ill-Posed Inverse Problems
2.5 Explicit Matrix Systems
2.6 Inverse Solution for Systems with Matrix Form
2.6.1 General Inversion of System Matrix
2.6.2 Under-Posed Problems: Minimum Length Solution
2.6.3 Even-Posed Problems: Standard Inversion of Matrix
2.6.4 Over-Posed Problems: Least-Squares Solution
2.7 General Inversion by Singular Value Decomposition (SVD)
2.7.1 Property of Transformation and Type II Ill-Posedness
2.7.2 SVD Procedure
2.7.3 Ill Conditioning
2.7.4 SVD Inverse Solution
2.8 Systems in Functional Forms: Solution by Optimization
2.9 Choice of the Outputs or Effects
2.10 Simulated Measurement
2.11 Examination of Ill-Posedness
2.12 Remarks
3 Regularization for Ill-Posed Problems
3.1 Tikhonov Regularization
3.1.1 Regularizing the Norm of the Solution
3.1.2 Regularization Using Regularization Matrix
3.1.3 Determination of the Regularization Matrix
3.1.4 Tikhonov Regularization for Complex Systems
3.2 Regularization by SVD
3.3 Iterative Regularization Method
3.4 Regularization by Discretization (Projection)
3.4.1 Exact Solution of the Problem
3.4.2 Revealing the Ill-Posedness
3.4.3 Numerical Method of Discretization for Inverse Problem
3.4.3.1 Finite Element Solution
3.4.3.2 Inverse Force Estimation
3.4.4 Definition of the Errors
3.4.5 Property of Projection Regularization
3.4.6 Selecting the Best Mesh Density
3.5 Regularization by Filtering
3.5.1 Example I: High-Frequency Sine Noise
3.5.2 Example II: Gaussian Noise
3.6 Remarks
4 Conventional Optimization Techniques
4.1 The Role of Optimization in Inverse Problems
4.2 Optimization Formulations
4.3 Direct Search Methods
4.3.1 Golden Section Search Method
4.3.2 Hooke and Jeeves’ Method
4.3.2.1 Exploratory Moves
4.3.2.2 Pattern Moves
4.3.2.3 Algorithm
4.3.2.4 Example
4.3.3 Powell’s Conjugate Direction Method
4.3.3.1 Conjugate Directions
4.3.3.2 Example
4.4 Gradient-Based Methods
4.4.1 Cauchy’s (Steepest Descent) Method
4.4.2 Newton’s Method
4.4.3 Conjugate Gradient Method
4.5 Nonlinear Least Squares Method
4.5.1 Derivations of Objective Functions
4.5.2 Newton’s Method
4.5.3 The Gauss–Newton Method
4.5.4 The Levenberg–Marquardt Method
4.5.5 Software Packages
4.6 Root Finding Methods
4.6.1 Newton’s Root Finding Method
4.6.2 Levenberg–Marquardt Method
4.7 Remarks
4.8 Some References for Optimization
5 Genetic Algorithms
5.1 Introduction
5.2 Basic Concept of GAs
5.2.1 Coding
5.2.2 Genetic Operators
5.2.2.1 Selection
5.2.2.2 Crossover
5.2.2.3 Mutation
5.2.3 A Simple Example
5.2.3.1 Solution
5.2.3.2 Representation (Encoding)
5.2.3.3 Initial Generation and Evaluation Function
5.2.3.4 Genetic Operations
5.2.3.5 Results
5.2.4 Features of GAs
5.2.5 Brief Reviews on Improvements of GAs
5.3 Micro GAs
5.3.1 Uniform µGA
5.3.2 Real Parameter Coded µGA
5.3.2.1 Four Crossover Operators
5.3.2.2 Test Functions
5.3.2.3 Performance of the Test Functions
5.4 Intergeneration Projection Genetic Algorithm (IP-GA)
5.4.1 Modified µGA
5.4.2 Intergeneration Projection (IP) Operator
5.4.3 Hybridization of Modified µGA with IP Operator
5.4.4 Performance Tests and Discussions
5.4.4.1 Convergence Performance of the IP-GA
5.4.4.2 Effect of Control Parameters α and β
5.4.4.3 Effect of the IP Operator
5.4.4.4 Comparison with Hybrid GAs Incorporated with Hill-Climbing Method
5.5 Improved IP-GA
5.5.1 Improved IP Operator
5.5.2 Implementation of the Improved IP Operator
5.5.3 Performance Test
5.5.3.1 Performance of the Improved IP-GA
5.5.3.2 Effect of the Mutation Operation
5.5.3.3 Effect of the Coefficients α and β
5.5.3.4 Effect of the Random Number Seed
5.6 IP-GA with Three Parameters (IP3-GA)
5.6.1 Three-Parameter IP Operator
5.6.2 Performance Comparison
5.7 GAs with Search Space Reduction (SR-GA)
5.8 GA Combined with the Gradient-Based Method
5.8.1 Combined Algorithm
5.8.2 Numerical Example
5.9 Other Minor Tricks in Implementation of GAs
5.10 Remarks
5.11 Some References for Genetic Algorithms
6 Neural Networks
6.1 General Concepts of Neural Networks
6.2 Role of Neural Networks in Solving Inverse Problems
6.3 Multilayer Perceptrons
6.3.1 Topology
6.3.2 Back-Propagation Training Algorithm
6.3.3 Modified BP Training Algorithm
6.4 Performance of MLP
6.4.1 Number of Neurons in Hidden Layers
6.4.2 Training Samples
6.4.3 Normalization of Training Data Set
6.4.4 Regularization
6.5 A Progressive Learning Neural Network
6.6 A Simple Application of NN
6.6.1 Inputs and Outputs of the NN Model
6.6.2 Architecture of the NN Model
6.6.3 Training and Performance of the NN Model
6.7 Remarks
6.8 References on Neural Networks
7 Inverse Identification of Impact Loads
7.1 Introduction
7.2 Displacement as System Effects
7.3 Identification of Impact Loads on the Surface of Beams
7.3.1 Finite Element Model
7.3.2 Validation of FE Model with Experiment
7.3.3 Estimation of Loading Time History
7.3.4 Boundary Effects
7.4 Line Loads on the Surface of Composite Laminates
7.4.1 Hybrid Numerical Method
7.4.2 Why HNM?
7.4.3 TransWave©
7.4.4 Comparison between HNM and FEM
7.4.5 Kernel Functions
7.4.6 Identification of Time History of Load Using Green’s Functions
7.4.7 Identification of Line Loads
7.4.7.1 Identification of Time Function
7.4.7.2 Identification of the Spatial Function
7.4.7.3 Identification of the Time and Spatial Functions
7.4.8 Numerical Verification
7.5 Point Loads on the Surface of Composite Laminates
7.5.1 Inversion Operation
7.5.2 Concentrated Point Load
7.6 Ill-Posedness Analysis
7.7 Remarks
8 Inverse Identification of Material Constants of Composites
8.1 Introduction
8.2 Statement of the Problem
8.3 Using the Uniform µGA
8.3.1 Solving Strategy
8.3.2 Parameter Coding
8.3.3 Parameter Settings in µGA
8.3.4 Example I: Engineering Elastic Constants in Laminates
8.3.4.1 Laminate [G0/+45/–45]s
8.3.4.2 Laminate [C0/+45/–45/90/–45/+45]s
8.3.4.3 Regularization by Projection
8.3.4.4 Regularization by Filtering
8.3.4.5 Discussion
8.3.5 Example II: Fiber Orientation in Laminates
8.3.5.1 Eight-Ply Symmetrical Composite Laminates
8.3.5.2 Ten-Ply Symmetrical Composite Laminates
8.3.5.3 Further Investigations
8.3.6 Example III: Engineering Constants of Laminated Cylindrical Shells
8.3.6.1 [G0/+45/–45/90/–45/+45]s Cylindrical Shell
8.3.6.2 [G0/–30/30/90/–60/+60]s Cylindrical Shell
8.3.6.3 [C0/30/–30/90/–60/60]s Cylindrical Shell
8.4 Using the Real µGA
8.5 Using the Combined Optimization Method
8.6 Using the Progressive NN for Identifying Elastic Constants
8.6.1 Solving Strategy and Statement of the Problem
8.6.2 Inputs of the NN Model
8.6.3 Training Samples
8.6.4 Results and Discussion
8.6.5 A More Complicated Case Study
8.7 Remarks
9 Inverse Identification of Material Property of Functionally Graded Materials
9.1 Introduction
9.2 Statement of the Problem
9.3 Rule of Mixture
9.4 Use of Gradient-Based Optimization Methods
9.4.1 Example 1: Transversely Isotropic FGM Plate
9.4.1.1 Approach I: Identification of Material Property at Discrete Locations
9.4.1.2 Approach II: Identification of Parameterized Values
9.4.2 Example 2: SiC-C FGM Plate
9.4.2.1 Identification of Parameterized Values
9.4.2.2 Approach III: Identification of Volume Fractions
9.5 Use of Uniform µGA
9.5.1 Material Characterization of FGM Plate
9.5.1.1 Parameters Used in the Uniform µGA
9.5.1.2 Test of GAs’ Performance
9.5.1.3 Search Range
9.5.2 Material Characterization of FGM Cylinders
9.6 Use of Combined Optimization Method
9.7 Use of Progressive NN Model
9.7.1 Material Characterization of SiC-C FGM Plate
9.7.1.1 Inputs of the NN Model
9.7.1.2 Training Samples
9.7.1.3 Results and Discussion
9.7.2 Material Characterization of SS-SN FGM Cylinders
9.7.2.1 Inputs of the NN Model
9.7.2.2 Training Samples
9.7.2.3 Results and Discussions
9.8 Remarks
10 Inverse Detection of Cracks in Beams Using Flexural Waves
10.1 Introduction
10.2 Beams with Horizontal Delamination
10.2.1 The SEM
10.2.2 Why SEM?
10.2.3 Brief on SEM Formulation
10.2.4 Experimental Study
10.2.5 Sensitivity Study and Rough Estimation of Crack in Isotropic Beams
10.2.5.1 Crack Length
10.2.5.2 Crack Depth
10.3 Beam Model of Flexural Wave
10.3.1 Basic Assumptions
10.3.2 Homogeneous Solution
10.3.3 Particular Solution
10.3.4 Continuity Conditions
10.3.5 Comparison between SEM and Beam Model
10.3.6 Experimental Verification
10.4 Beam Model for Transient Response to an Impact Load
10.4.1 Beam Model Solution
10.4.2 Experimental Study on Impact Response
10.4.3 Comparison Study
10.5 Extensive Experimental Study
10.5.1 Test Specimens
10.5.2 Test Setup
10.5.3 Effect of Crack Depth
10.5.4 Effect of Crack Length
10.5.5 Effect of Excitation Frequency
10.5.6 Effect of Location of the Excitation Point
10.5.7 Study on Beams of Anisotropic Material
10.6 Inverse Crack Detection Using Uniform µGAs
10.6.1 Use of Simulated Data from SEM
10.6.2 Use of Experimental Data
10.7 Inverse Crack Detection Using Progressive NN
10.7.1 Procedure Outline
10.7.2 Composite Specimen
10.7.3 Delamination Detection
10.7.4 Effect of Different Training Data
10.7.5 Use of Beam Model and Harmonic Excitation
10.7.6 Use of Beam Model and Impact Excitation
10.7.7 FEM as Forward Solver
10.8 Discussion on Ill-Posedness
10.9 Remarks
11 Inverse Detection of Delaminations in Composite Laminates
11.1 Introduction
11.2 Statement of the Problem
11.3 Delamination Detection Using Uniform µGA
11.3.1 Horizontal Delamination
11.3.2 Vertical Crack
11.4 Delamination Detection Using the IP-GA
11.5 Delamination Detection Using the Improved IP-GA
11.6 Delamination Detection Using the Combined Optimization Method
11.6.1 Implementation of the Combined Technique
11.6.1.1 Formulations of Objective Functions
11.6.1.2 Switch from µGA to BCLSF
11.6.1.3 Effect of Noise
11.6.1.4 Ill-Posedness Analysis
11.6.1.5 Regularization by Filtering
11.6.2 Horizontal Delamination in [C90/G45/G–45]s Laminate
11.6.2.1 Noise-Free Cases
11.6.2.2 Noisy Cases
11.6.2.3 Discussion
11.7 Delamination Detection Using the Progressive NN
11.7.1 Implementation
11.7.2 Noise-Free Case
11.7.3 Noise-Contaminated Case
11.7.4 Discussion
11.8 Remarks
12 Inverse Detection of Flaws in Structures
12.1 Introduction
12.2 Inverse Identification Formulation
12.2.1 Damaged Element Identification
12.2.2 Stiffness Factor Identification
12.2.2.1 Objective Function with Weight
12.2.2.2 Direct Formulation
12.3 Use of Uniform µGA
12.3.1 Example I: Sandwich Beam
12.3.2 Example II: Sandwich Plate
12.4 Use of Newton’s Root Finding Method
12.4.1 Calculation of Jacobian Matrix
12.4.2 Iteration Procedure
12.4.3 Example I: Cantilever Beam
12.4.3.1 Stiffness of Cantilever Beam
12.4.3.2 Performance Comparison with µGA
12.4.3.3 Noise-Contaminated Case
12.4.4 Example II: Plate
12.5 Use of Levenberg–Marquardt Method
12.6 Remarks
13 Other Applications
13.1 Coefficient Identification for Electronic Cooling Systems
13.1.1 Using the Golden Section Search Method
13.1.1.1 Natural Convection Problem
13.1.1.2 Numerical Results
13.1.1.3 Summary
13.1.2 Using GAs
13.1.2.1 Forward Modeling
13.1.2.2 Inverse Analysis of a PCB Board
13.1.2.3 A Complex Example
13.1.2.4 Summary
13.1.3 Using NNs
13.1.3.1 Coefficient Identification of a Telephone Switch Model
13.1.3.2 Coefficient Identification for IC Chips
13.1.3.3 Summary
13.2 Identification of the Material Parameters of a PCB
13.2.1 Introduction
13.2.2 Problem Definition
13.2.3 Objective Functions
13.2.4 Finite Element Representation
13.2.5 Numerical Results and Discussion
13.2.5.1 Sensitivity Analysis
13.2.5.2 Identification Using Natural Frequencies
13.2.5.3 Identification Using Frequency Response
13.2.6 Summary
13.3 Identification of Material Property of Thin Films
13.3.1 Noise-Free Cases
13.3.2 Noisy Cases
13.3.3 Discussion
13.4 Crack Detection Using Integral Strain Measured by Optic Fibers
13.4.1 Introduction
13.4.2 Numerical Calculation of Integral Strain
13.4.3 Inverse Procedure
13.4.3.1 Crack Expression
13.4.3.2 Remesh Technique
13.4.3.3 Definition of Objective Functions
13.4.4 Numerical Results
13.4.4.1 Different Dimensions of Cracks
13.4.4.2 Different Locations of Cracks (Case C)
13.4.4.3 Different Materials (Case D)
13.4.4.4 Different Applied Loads (Case E)
13.4.4.5 Different Boundary Conditions (Case F)
13.4.5 Summary
13.5 Flaw Detection in Truss Structure
13.6 Protein Structure Prediction
13.6.1 Protein Structural Prediction
13.6.2 Parameters for Protein Structures
13.6.3 Confirmation Energy
13.6.4 Lattice Model
13.6.4.1 Cubic Lattice Model
13.6.4.2 Random Energy Model
13.6.4.3 Lattice Structure Prediction by IP3-GA
13.6.5 Results and Discussion
13.6.6 Summary
13.7 Fitting of Interatomic Potentials
13.7.1 Introduction
13.7.2 Fitting Model
13.7.3 Numerical Result
13.7.4 Summary
13.8 Parameter Identification in Valveless Micropumps
13.8.1 Introduction
13.8.2 Valveless Micropump
13.8.3 Flow-Pressure Coefficient Identification
13.8.4 Numerical Examples
13.8.5 Summary
13.9 Remarks
14 Total Solution for Engineering Systems: A New Concept
14.1 Introduction
14.2 Approaching a Total Solution
14.2.1 Procedure for a Total Solution
14.2.2 Forward Solver
14.2.3 System Parameters
14.2.4 Mathematical Representation
14.3 Inverse Algorithms
14.3.1 Sensitivity Matrix-Based Method (SMM)
14.3.1.1 Sensitivity-Based Equations
14.3.1.2 Algorithms
14.3.1.3 Solution Procedure
14.3.1.4 Comments on SMM
14.3.2 Neural Network
14.4 Numerical Examples
14.4.1 Vibration Analysis of a Circular Plate
14.4.1.1 SMM Solution
14.4.1.2 Progressive NN Solution
14.4.2 Identification of Material Properties of a Beam
14.4.2.1 SMM Solution
14.4.2.2 Progressive NN Solution
14.5 Remarks
References
1 Introduction
1.1 Forward and Inverse Problems Encountered in Structural Systems
In engineering, computer-aided design (CAD) tools are used to design advanced structural systems. Computational simulation techniques are often used in such tools to calculate the displacements, deflections, strains, stresses, natural frequencies, vibration modes, etc. in the structural system for given loading, initial and boundary conditions, geometrical configuration, material properties, etc. of the structure. These types of problems are called forward problems and are often governed by ordinary or partial differential equations (ODE or PDE) with unknown field variables. For structural mechanics problems, the field variable is basically the displacement; the constants in the ODE or PDE and the problem domain are known a priori. The source or cause of the problem or phenomenon governed by the ODE or PDE and the relevant initial and boundary conditions are also known. To solve a forward problem is, in fact, to solve the ODE or PDE subject to these initial and boundary conditions. Many solution procedures, especially computational procedures, have been developed, such as:
• Finite difference method (FDM; see, e.g., Hirsch, 1988; Anderson, 1995)
• Finite element method (FEM; see, e.g., Zienkiewicz and Taylor, 2000; Liu and Quek, 2003)
• Strip element method (Section 10.3)
• Boundary element method (BEM; see, e.g., Brebbia et al., 1984)
• FEM/BEM (see, e.g., Liu, Achenbach et al., 1992)
• Mesh-free methods (see, e.g., Liu, 2002a; Liu and Liu, 2003)
• Wave propagation solvers (see, e.g., Liu and Xi, 2001)
These methods for solving forward problems are well established, although the mesh-free methods are still in a stage of rapid development. Using these methods, the displacements in the structure and then the strains and stresses (outputs) can be obtained, as long as the material property, the geometric configuration of the structure, and the loading, initial, and boundary conditions (inputs) are given.

Another class of often encountered practical problems is called inverse problems. In an inverse problem, the effects or outputs (displacement, velocity, acceleration, natural frequency, etc.) of the system may be known (by experiments, for example), but the parameters of the loading profile (inputs), material property, geometric feature of the structure, boundary conditions, or a combination of these may need to be determined. Solving this class of problem is extremely useful for many engineering applications.

One of the earliest inverse problems in mechanical engineering is the inverse problem in wave propagation. These problems are formulated based on the fact that mechanical (elastic) waves (Achenbach, 1973; Liu and Xi, 2001) traveling in materials are scattered from the boundaries and interfaces of materials, and propagate over distance to "encode" information on their path, such as the domain boundaries, material properties, and the wave source (loading excitation, etc.). It should therefore be possible to "decode" some of the information encoded in the waves that are recorded as wave responses. A systematic method to decode the information is to formulate and solve inverse problems. Problems of this nature arise from nondestructive evaluation (NDE) using waves and ultrasound, ocean acoustics, earth and space exploration, biomedical examination, radar guidance and detection, solar astrophysics, and many other areas of science, technology, and engineering.

The nature of inverse problems requires proper formulations and solution techniques in order to perform the decoding successfully. In this book, approaches to formulating inverse problems, inverse analysis procedures, and computational techniques are discussed.
Many engineering inverse problems are formulated and investigated using these techniques, and many important issues related to inverse problems are examined and revealed using simple examples. Methods for dealing with these issues are also presented.

Note that many types of inverse problems exist in engineering. Some of them can only be formulated in an under-posed form (see Chapter 2), due to the difficulty or cost of obtaining more experimental data or observations. Solving this class of under-posed inverse problems will be discussed but is not the major focus of this book. This book focuses on even- and over-posed inverse problems because, for many engineering systems, sufficient experimental readings can be produced, at least in number, to formulate the problem in even- or over-posed form.
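To make the distinction between forward and inverse problems concrete, the following is a minimal sketch, not taken from the book's own derivations. It assumes the simplest possible structure: a uniform elastic bar fixed at one end and loaded by an axial force at the other, for which the tip displacement is u = FL/(EA). The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def forward_tip_displacement(force, length, youngs_modulus, area):
    """Forward problem: inputs (load, geometry, material) -> output (tip displacement).

    Assumed model: uniform bar fixed at one end, axial tip load, u = F L / (E A).
    """
    return force * length / (youngs_modulus * area)


def identify_force(measured_u, length, youngs_modulus, area):
    """Inverse problem (load identification): measured displacement -> applied force."""
    return measured_u * youngs_modulus * area / length


def identify_modulus(measured_u, force, length, area):
    """Inverse problem (material identification): measured displacement -> Young's modulus."""
    return force * length / (area * measured_u)


# Illustrative (assumed) values: 1 kN load, 2 m bar, E = 70 GPa, A = 1 cm^2.
F, L, E, A = 1000.0, 2.0, 70e9, 1e-4

u = forward_tip_displacement(F, L, E, A)         # simulate the "measurement"
print("tip displacement [m]:", u)
print("identified force [N]:", identify_force(u, L, E, A))
print("identified modulus [Pa]:", identify_modulus(u, F, L, A))
```

In this one-output, one-unknown setting each inverse problem is even-posed and has a unique solution. Real structures rarely allow such a closed-form inversion, which is why the later chapters rely on general inversion, optimization, and neural networks.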
1.2
General Procedures to Solve Inverse Problems
The general procedure of solving an inverse problem is illustrated in Figure 1.1. The details are as follows:
FIGURE 1.1 General procedure to solve inverse problems. [Flowchart: Define the problem → Create the forward model → Sensitivity analysis between the inputs and outputs → Experiment design → Minimize measurement error (e.g., filtering) → Inverse analysis (general inversion, optimization, or NN, Chapters 4-6; regularization techniques, Chapter 3, may be used) → Solution verification. If the verification answer is No, earlier steps are repeated; if Yes, END.]
• Define the problem: define the purpose and objectives of the project, with an analysis of the available budget, resources, and timeframe. An overall strategy and a feasible schedule should be determined for effective execution later. Efforts must be made at all times to (1) reduce the number of unknowns to be inversely identified and (2) confine all the parameters to the smallest possible region. Made at the very first step, these two efforts can often reduce the possibility of an ill-posed inverse problem and thus drastically increase the chance of success and improve the efficiency and accuracy of the inversion.
• Create the forward model: a physical model should be established to capture the physics of the defined problem. The outputs or effects of the system should be as sensitive as possible to the system parameters to be inversely identified. The parameters should influence the outputs or effects of the system independently of one another. Enforcing more conditions can help make the inverse problem well-posed. Mathematical and computational models should be developed for the underlying forward problem. Possible standard computational methods are FEM, FDM, FVM (finite volume method), meshfree methods, wave solvers, etc.
• Analyze sensitivity between the effects or outputs and the parameters: make sure that the outputs of the problem and the parameters (including the inputs) to be inversely identified are well correlated. Ensuring high sensitivity of the outputs to the parameters is one of the most effective approaches to reducing ill-posedness in the later stage of inverse analysis. The analysis should be done using the forward model, without the need for experiments that may be expensive. Based on the sensitivity analysis, the forward model and the choice of parameters may be modified.
• Design the experiment: decide on proper measurement methods, type of equipment for testing and recording, and data analysis. The number of measurements or readings should be at least equal to the number of unknowns to be inversely identified, which leads to at least an even-posed problem. An over-posed system (using more outputs) is usually preferred, so as to improve the property of the system equation and reduce the ill-posedness of the problem. An over-posed formulation can usually accommodate higher levels of noise contamination in the experimental data. However, too heavily over-posed systems may result in poor output reproducibility, which can be checked later by computing the output reproducibility after obtaining the inverse solution.
• Minimize measurement noise (e.g., through filtering): errors in the measurement data should be eliminated as much as possible because they can trigger the ill-posedness of the problem and can be magnified in the inverse solution, or even result in an unstable solution. Properly designed filters can be used to remove the errors before the measurement data are used for the inverse analysis. The principle is to use a low-pass filter to remove all noise whose frequency is higher (or whose wavelength is shorter) than that of the effects of the problem. The frequency or the wavelength of the effects of the problem can often be estimated by the forward solver. Details will be covered in Chapter 3.
• Apply the inverse solver: if the system can be formulated in an explicit matrix form, general inversion of the system (or transformation) matrix can be performed to obtain the inverse solution. For complex systems that cannot be formulated in an explicit matrix form, a functional of error can always be established using a proper norm, and optimization/minimization techniques should be used to search for the solution that minimizes the error norm. These optimization techniques will be discussed in detail in Chapter 4 and Chapter 5. Proper regularization techniques may be used for ill-posed inverse problems. The regularization techniques are very important for obtaining stable solutions for ill-posed inverse problems. Note, however, that some of the regularization techniques should be the last resort to remedy the ill-posedness of the problem. Many regularization methods have side effects, and misuse of regularization techniques can also lead to erroneous results. Regularization methods will be detailed in Chapter 3.
• Verify the solution: this is important to ensure that the inverse solution obtained is physically meaningful. All possible methods, with proper engineering judgment, should be employed to make sure that the solution obtained is reliable. Checking the output and input reproducibility matrices can give some indication of the quality of the solution. Modifications of the inverse and experimental strategy may be needed, and the preceding steps may be repeated until the inverse solution is satisfactory. Note that many of the verifications can be done computationally, and experimental verifications need to be done at the final stage.
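The procedure above can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the book: the two-parameter linear forward model, the function names `forward`, `sensitivity`, and `invert`, the noise level, and the use of a closed-form least-squares fit as the inverse solver. It walks through a forward model, a finite-difference sensitivity check, an over-posed set of noisy readings, the inversion, and a simple output-reproducibility check.

```python
import random


def forward(params, t):
    """Hypothetical forward model: output depends linearly on two parameters."""
    a, b = params
    return a + b * t


def sensitivity(params, t, index, step=1e-6):
    """Finite-difference sensitivity of the output to one parameter."""
    bumped = list(params)
    bumped[index] += step
    return (forward(bumped, t) - forward(params, t)) / step


def invert(ts, ys):
    """Least-squares estimate of (a, b) from over-posed readings."""
    n = len(ts)
    st, sy = sum(ts), sum(ys)
    stt = sum(t * t for t in ts)
    sty = sum(t * y for t, y in zip(ts, ys))
    det = n * stt - st * st          # nonzero when the sample points differ
    b = (n * sty - st * sy) / det
    a = (sy - b * st) / n
    return a, b


# Synthetic "experiment": 20 readings for 2 unknowns (over-posed), small noise.
rng = random.Random(0)
true_params = (2.0, -1.0)
ts = [i / 19 for i in range(20)]
ys = [forward(true_params, t) + rng.gauss(0.0, 0.01) for t in ts]

estimate = invert(ts, ys)

# Verification: how well does the identified model reproduce the readings?
residual = max(abs(forward(estimate, t) - y) for t, y in zip(ts, ys))
```

In this toy setting the sensitivities are 1 and t, so both parameters influence the output independently, which is the situation the sensitivity step is meant to confirm before any experiment is run.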
1.3
Outline of the Book
This book details the theory, principles, computational methods and algorithms, and practical techniques for inverse analyses using elastic waves propagating in solids and structures or the dynamic responses of solids and structures. These computational inverse methods and procedures will be examined and tested numerically via a large number of examples of force/ source reconstructions, crack detection, flaw characterization, material characterization, heat transfer coefficients identification, protein structure prediction, interatomic potential construction, and many other applications. Some of these techniques have been confirmed with experiments conducted by the authors and co-workers in the past years. Discussions of regularization methods for the treatment of ill-posed inverse problems will be easy to understand. The book also discusses many robust and practical optimization algorithms that are very efficient for inverse analysis and optimization, especially algorithms developed through the combination of different types of the existing optimization methods such as gradient based methods with genetic algorithms, intergeneration projection genetic algorithms, real number coded microgenetic algorithms, and progressive neural networks. The efficiency and features of all these optimization algorithms will be demonstrated using benchmark objective functions as well as actual inverse problems. Table 1.1 gives a concise summary of the applications of those computational inverse techniques for actual inverse problems studied in this book.
TABLE 1.1
Summary of Applications of Computational Inverse Techniques for Actual Inverse Problems Studied in This Book
(Each computational inverse technique is followed by its applications.)

Conventional optimization techniques (Chapter 4)
  Golden section search method (Section 4.3.1)
    Section 13.1: Coefficient identification of electronic cooling system
  Conjugate gradient method (Section 4.4.3)
    Section 7.4.7: Identification of the time history of the force
  Nonlinear least square method (Section 4.5)
    1. Section 7.4.6: Identification of the time function of the force
    2. Section 9.4: Identification of material property of functionally graded materials
       2.1 Section 9.4.1: Transversely FGM plates
       2.2 Section 9.4.2: SiC FGM plates
    3. Used frequently in the combined optimization methods
  Newton's root finding method (Section 4.6.1)
    1. Section 12.4.3: Identification of stiffness factors of cantilever beams
    2. Section 12.4.4: Identification of stiffness factors of plates
  Levenberg-Marquardt root finding method (Section 4.6.2)
    1. Section 12.5: Flaw detection in cantilever beams
    2. Section 13.5: Flaw detection in truss structures

Genetic algorithms (GA) (Chapter 5)
  Binary micro-GA (µGA) (Section 5.3.1)
    1. Section 8.3.4: Identification of material constants of laminates
       1.1 Glass/epoxy [0/45/-45]s laminate
       1.2 Carbon/epoxy [0/45/-45/90/-45/45]s laminate
    2. Section 8.3.5: Identification of fiber orientation in laminates
       2.1 Eight-ply symmetrical laminates
       2.2 Ten-ply symmetrical laminates
       2.3 Complex case study
    3. Section 8.3.6: Identification of material constants of laminated cylinders
       3.1 Glass/epoxy [0/45/-45/90/-45/45]s laminate
       3.2 Glass/epoxy [0/-30/30/90/-60/60]s laminate
       3.3 Carbon/epoxy [0/-30/30/90/-60/60]s laminate
    4. Section 9.5: Material characterization of FGMs
       4.1 Section 9.5.1: FGM plates
       4.2 Section 9.5.2: FGM cylinders
    5. Section 10.6: Crack detection in beams
       5.1 Section 10.6.1: Using SEM simulated displacement
       5.2 Section 10.6.2: Using experimental displacement
    6. Section 11.3: Delamination detection in laminates
       6.1 Section 11.3.1: Horizontal delamination
       6.2 Section 11.3.2: Vertical crack
    7. Section 12.3: Flaw detection in sandwich structures
       7.1 Sandwich beams
       7.2 Sandwich plates
    8. Section 13.4: Crack detection using integral strain measured by optical fibers
  Real µGA (Section 5.3.2)
    1. Section 8.4: Identification of material constants of composite laminate
  Intergeneration projection GA (IP-GA) (Section 5.4)
    1. Section 11.4: Delamination detection in laminates
    2. Section 13.3: Identification of material property of thin films
    3. Section 13.8: Parameter identification in valveless micropumps
  Improved IP-GA (Section 5.5)
    1. Section 11.5: Delamination detection in laminates
    2. Section 13.7: Fitting of interatomic potentials
  IP-GA with three parameters (IP3-GA) (Section 5.6)
    1. Section 13.2: Identification of the material parameters of PCBs
    2. Section 13.6: Protein structure prediction
  GA with search space reduction (SR-GA) (Section 5.7)
    1. Section 13.1.2.2: Thermal coefficient identification of electronic cooling system
    2. Section 13.1.2.3: Thermal coefficient identification of PCB
  Genetic algorithm combined with the gradient-based methods (Section 5.8)
    1. Section 8.5: Identification of material constants of composite laminate
    2. Section 9.6: Identification of material property of functionally graded materials
    3. Section 11.6: Delamination detection in composite laminate

Neural network (Chapter 6)
  Plain neural network
    1. Section 13.1.3: Coefficient identification of a telephone switch model
  Progressive neural network
    1. Section 8.6: Identification of material constants of composite laminate
    2. Section 9.7: Identification of material property of functionally graded materials
       2.1 Section 9.7.1: FGM plate
       2.2 Section 9.7.2: FGM cylinder
    3. Section 10.7: Crack detection in beams
       3.1 Sections 10.7.2-10.7.4: Using SEM model
       3.2 Section 10.7.5: Using beam model and harmonic excitation
       3.3 Section 10.7.6: Using beam model and impact excitation
       3.4 Section 10.7.7: Using FEM model
    4. Section 11.7: Delamination detection in laminates
The book is organized as follows:
• Chapter 1 provides a general description and procedure of inverse analysis, together with the background and motivation that led to the development of these methods for nondestructive evaluation and to the writing of this book.
• Chapter 2 presents the general definitions of forward and inverse problems. Ill-posed inverse problems are classified into three types. Issues related to these three types of ill-posedness are revealed and discussed using very simple examples. The formulation of inverse problems is presented, and the general procedure to solve inverse problems that can be formulated in explicit matrix forms is provided.
• Chapter 3 offers a brief introduction to five regularization methods for ill-posed inverse problems: Tikhonov regularization, regularization by singular value decomposition, iterative regularization methods, regularization by projection, and regularization by filtering.
• Chapter 4 introduces some conventional optimization techniques, including direct search algorithms and gradient-based algorithms, because engineering inverse problems are usually formulated and solved as optimization problems. These techniques are presented concisely with the help of simple examples.
• Chapter 5 describes the basic concept of genetic algorithms (GAs) and some modified GAs, with an emphasis on the intergeneration projection GA (IP-GA) as well as the method that combines GAs with gradient-based methods.
• Chapter 6 briefly introduces the basic terminology, concepts, and procedures of the neural network (NN). A typical NN model and multilayer perceptrons (MLP), along with the back-propagation learning algorithm, are detailed. Some practical computational issues on NNs as well as the progressive NN model are also discussed.
• Chapter 7 through Chapter 12 present a number of computational inverse techniques using elastic waves propagating in composite structures or dynamic responses of structures. Practical complex nondestructive evaluation problems of force function reconstruction, material property identification, and crack (delamination, flaw) detection are examined in detail in the following order:
  • Chapter 7 presents inverse procedures for identification of impact loads in composite laminates. Traditional optimization methods are employed for the inverse analysis, and numerical examples of identification of impact loads applied on beam and plate types of structures are presented. Experimental studies are also presented for the verification of the inverse solution.
  • In Chapter 8, material constants (the elastic constants or the engineering constants required in the constitutive law for composites) and the fiber orientation of composite laminates are inversely identified from the dynamic displacement responses recorded at only one receiving point on the surface of composite laminated structures.
  • Chapter 9 discusses the computational inverse techniques for material property characterization of functionally graded materials (FGMs) from the dynamic displacement response recorded on the surface of the FGM structures.
  • Chapter 10 introduces numerical analysis and experimental studies on the use of flexural waves for nondestructive detection of cracks and delaminations in beams of isotropic and anisotropic materials. Computational inverse procedures employing GAs and NNs are detailed for determining the geometrical parameters of the cracks and delaminations.
  • Chapter 11 introduces computational inverse techniques using elastic wave responses of displacement for delamination detection in composite laminates. Horizontal delaminations as well as vertical cracks are considered. GAs and NNs are employed for the inverse analysis; the strip element method is used as the forward solver to compute the wave response. Examples of practical applications are presented to demonstrate the efficiency of computational inverse techniques for delamination detection in composite laminates.
  • Chapter 12 considers the detection of flaws in beams or plates; special considerations and treatment for the detection of flaws in sandwich structures are also provided. The finite element model is used for forward analysis. GAs and Newton's root finding method, as well as the Levenberg-Marquardt method, are used for the inverse analysis. A number of numerical examples are provided to demonstrate the application of these computational inverse techniques.
• Chapter 13 presents several other application examples of the computational inverse techniques. The topics range from electronic systems (heat transfer coefficient identification), the use of integral optical fibers, MEMS, and interatomic potentials to protein structure. These applications give a broad view of the range of applications of the inverse techniques.
• Chapter 14 introduces a concept of total solution for engineering mechanics problems as an extension of the inverse analysis. The approach for obtaining a total solution is to formulate practical engineering problems as a parameter identification problem. All the parameterized unknown information is determined through an iterative procedure of alternately conducting forward and inverse (or mixed) analyses. This chapter suggests a new approach to formulating and dealing with practical engineering problems.
The background of this book and much of the terminology used in it are defined in Chapter 1 through Chapter 3. These chapters will be useful in understanding Chapter 7 through Chapter 14, and therefore should be read before proceeding to other chapters. Chapter 4 through Chapter 6 can be read separately. In fact, these materials are useful not only for inverse problems but also for general optimization problems. Readers who are familiar with these optimization techniques may skip these chapters. Chapter 7 through Chapter 14 can be read in any order, based on the interest of the reader, because proper cross references for commonly used materials are provided.
The book is written primarily for senior university students, postgraduate students, and engineers in civil, mechanical, geographical and aeronautical engineering, and engineering mechanics. Students in mathematics and computational science may also find the book useful. Anyone with an elementary knowledge of matrix algebra and the basics of mechanics should be able to understand its contents fairly easily.
2 Fundamentals of Inverse Problems
Using simple examples that can largely be treated manually, this chapter reveals some important and fundamental issues in formulating and solving inverse problems. This will prepare readers for dealing with the complex inverse problems presented in later chapters. The general definitions of frequently used terminology for forward problems, as well as inverse problems, are presented in this chapter. Issues related to the ill-posedness of problems are discussed, and ill-posed inverse problems are classified into three types. Features of these three types are examined, and general approaches and steps to deal with them are then discussed. Detailed methods are introduced for dealing with a class of inverse problems whose input, output, and system can be expressed explicitly in matrix forms. The properties of this class of inverse problems as well as the general procedure to solve them are then provided. Formulations of other complex inverse problems are also introduced. This chapter is written with reference to works by Santamarina and Fratta (1998), Tosaka et al. (1999), and Engl et al. (2000).
2.1
A Simple Example: A Single Bar
Consider now a simple mechanical system of a straight bar with uniform cross-sectional area $A$ and length $l$, as shown in Figure 2.1. The bar is made of elastic material with Young's modulus $E$. It is subjected to force $f_1$ at node 1 and $f_2$ at node 2. The axial displacement of the bar is denoted by $u_1$ at node 1 and $u_2$ at node 2. The governing equation for the bar member can be written as
$$\begin{bmatrix} \dfrac{EA}{l} & -\dfrac{EA}{l} \\ -\dfrac{EA}{l} & \dfrac{EA}{l} \end{bmatrix} \begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix} = \begin{Bmatrix} f_1 \\ f_2 \end{Bmatrix} \tag{2.1}$$
FIGURE 2.1 A straight bar of uniform cross-sectional area A and length l, shown in its initial status and its stressed status. The bar is made of elastic material with Young's modulus of E. The bar is subjected to forces f1 at node 1 and f2 at node 2. The axial displacement of the bar is denoted by u1 at node 1 and u2 at node 2.
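or
$$\begin{bmatrix} k & -k \\ -k & k \end{bmatrix} \begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix} = \begin{Bmatrix} f_1 \\ f_2 \end{Bmatrix} \tag{2.2}$$
where
$$k = \frac{EA}{l} \tag{2.3}$$
is the tensional stiffness of the bar. Equation 2.2 can be written in a standard matrix form of
$$\mathbf{K}_{2\times 2}\,\mathbf{U}_{2\times 1} = \mathbf{F}_{2\times 1} \tag{2.4}$$
where
$$\mathbf{K} = \begin{bmatrix} k & -k \\ -k & k \end{bmatrix} \tag{2.5}$$
is called the stiffness matrix, $\mathbf{U}$ is the nodal displacement vector that collects the displacements at the two nodes of the bar:
$$\mathbf{U} = \begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix} \tag{2.6}$$
and $\mathbf{F}$ is the nodal force vector that collects the forces acting at the two nodes of the bar:
$$\mathbf{F} = \begin{Bmatrix} f_1 \\ f_2 \end{Bmatrix} \tag{2.7}$$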
For complex engineering systems, a set of discrete system equations like Equation 2.4 can always be created using the standard and well-established finite element method (see, for example, Zienkiewicz and Taylor, 2000; Liu and Quek, 2003), as well as finite difference methods, element free methods (Liu, 2002a), or any other type of numerical method. If the total degrees of freedom (DOF) are $N$, the standard discrete system equation can be given in the form of
$$\mathbf{K}_{N\times N}\,\mathbf{U}_{N\times 1} = \mathbf{F}_{N\times 1} \tag{2.8}$$
2.1.1
Forward Problem
In forward problems, it is assumed that the following parameters are known:
Geometrical parameters:
$$A = \hat{A}, \quad l = \hat{l} \tag{2.9}$$
Material property parameter:
$$E = \hat{E} \tag{2.10}$$
External force:
$$f_2 = \hat{f}_2 \tag{2.11}$$
where the hat ("^") marks parameters whose values are specified. This notation is used intentionally, particularly in this chapter, to distinguish explicitly the knowns from the unknowns while establishing the concepts of forward and inverse problems. For the forward problem, the unknowns are the displacements $u_1$ and $u_2$, and it is only necessary to solve the linear algebraic Equation 2.4 for the unknowns. However, because the stiffness matrix $\mathbf{K}$ is singular, the solution will not be unique. To obtain a unique solution, the bar must be properly supported or constrained, which provides additional conditions called boundary conditions. Consider the problem shown in Figure 2.2. The bar is now fixed at one end, so that a boundary condition exists:
$$u_1 = \hat{u}_1 = 0 \tag{2.12}$$
FIGURE 2.2 A straight bar of uniform cross-sectional area A and length l clamped at node 1. The bar is made of elastic material with Young's modulus of E and is subjected to force f2 at node 2.
The displacement $u_2$ is now the only unknown. Using the second equation in Equation 2.4, $u_2$ can then be obtained easily:
$$-\hat{k} \times \underbrace{\hat{u}_1}_{0} + \hat{k} \times u_2 = \hat{f}_2 \tag{2.13}$$
where
$$\hat{k} = \frac{\hat{E}\hat{A}}{\hat{l}} \tag{2.14}$$
Equation 2.13 gives
$$u_2 = \frac{\hat{f}_2}{\hat{k}} \tag{2.15}$$
After $u_2$ is obtained, $f_1$, termed the reaction force, can be obtained using the first equation in Equation 2.2:
$$f_1 = \hat{k} \times \underbrace{\hat{u}_1}_{0} - \hat{k} \times u_2 = -\hat{k} \times \frac{\hat{f}_2}{\hat{k}} = -\hat{f}_2 \tag{2.16}$$
This simple example has demonstrated that a forward problem can be solved for the unique solution provided the boundary conditions are given sufficiently for the problem to be well-defined or well-posed. Otherwise, the forward problem can be nonunique or ill-posed.
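The forward solution of the clamped bar can be sketched in a few lines of Python. This is only an illustration of Equations 2.3, 2.12, 2.15, and 2.16; the function names and the numerical values (a steel-like bar) are assumptions chosen for the example and do not come from the book.

```python
def bar_stiffness(E, A, l):
    """Axial stiffness k = E*A/l (Equation 2.3)."""
    return E * A / l


def stiffness_determinant(k):
    """Determinant of the unconstrained matrix K = [[k, -k], [-k, k]]."""
    return k * k - (-k) * (-k)


def solve_clamped_bar(E, A, l, f2):
    """Forward problem for the bar clamped at node 1 (u1 = 0).

    Returns the free-end displacement u2 (Equation 2.15) and the
    reaction force f1 at the clamp (Equation 2.16).
    """
    k = bar_stiffness(E, A, l)
    u1 = 0.0                 # boundary condition, Equation 2.12
    u2 = f2 / k              # second row of K U = F
    f1 = k * u1 - k * u2     # first row of K U = F
    return u2, f1


# Illustrative values only (SI units); not taken from the book.
k = bar_stiffness(E=200e9, A=1e-4, l=2.0)
u2, f1 = solve_clamped_bar(E=200e9, A=1e-4, l=2.0, f2=1000.0)
```

The determinant of the unconstrained stiffness matrix is zero for any k, which is the singularity mentioned above; only after the boundary condition is imposed does the problem have a unique solution.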
2.1.2
Inverse Problem
Consider now that, somehow (e.g., via experiment), the value of $u_2 = \hat{u}_2$ is known, and the geometrical information of the bar (Equation 2.9), the boundary condition (Equation 2.12), and the external force at node 2 (Equation 2.11) are still known. However, the material property parameter, Young's modulus $E$ (Equation 2.10), is not known. Using the second equation in Equation 2.2, it can then be obtained easily:
$$-k \times \underbrace{\hat{u}_1}_{0} + k \times \hat{u}_2 = \hat{f}_2 \tag{2.17}$$
which gives
$$k = \frac{\hat{f}_2}{\hat{u}_2} \tag{2.18}$$
which is simply an inverse expression of Equation 2.15. Using Equation 2.14,
$$E = \frac{k\hat{l}}{\hat{A}} = \frac{\hat{l}\hat{f}_2}{\hat{A}\hat{u}_2} \tag{2.19}$$
The problem of solving for the unknown of material property using the measured displacement is an often encountered inverse problem. This example, in fact, is the standard procedure used in practice for determining the Young’s modulus of materials. Because the problem is very simple, no special techniques are usually needed to resolve it. The fact that this is an inverse problem may not even be obvious. Other inverse problems related to this example could be those of finding force applied on the bar, area, or length of the bar. They are all equally trivial and can all be solved very easily for this simple example without any difficulty. The terms of forward problems and inverse problems are naturally used following the physics of the problem or the convention of looking at the problem.
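As a small illustration of Equations 2.18 and 2.19, the sketch below identifies Young's modulus from a "measured" tip displacement. The function name and all numbers are hypothetical; the measurement is synthesized from the forward relation so that the round trip can be checked.

```python
def identify_youngs_modulus(A, l, f2, u2_measured):
    """Inverse problem: recover E from the measured displacement u2."""
    if u2_measured == 0.0:
        raise ValueError("zero displacement carries no stiffness information")
    k = f2 / u2_measured      # Equation 2.18
    return k * l / A          # Equation 2.19


# Synthetic measurement generated from the forward relation u2 = f2 / k.
E_true = 70e9                      # assumed value, used only to create data
A, l, f2 = 2e-4, 1.5, 500.0        # illustrative geometry and load (SI units)
u2_synthetic = f2 / (E_true * A / l)

E_identified = identify_youngs_modulus(A, l, f2, u2_synthetic)
```

With noise-free synthetic data the identified modulus agrees with the assumed one up to rounding; with real measurements the error in the displacement carries directly into the identified value.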
2.2
A Slightly Complex Problem: A Composite Bar
Consider now the slightly more complex problem shown in Figure 2.3. The governing equation of this system can be easily obtained by assembling these two bar members using Equation 2.2:
$$\begin{bmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1 + k_2 & -k_2 \\ 0 & -k_2 & k_2 \end{bmatrix} \begin{Bmatrix} u_1 \\ u_2 \\ u_3 \end{Bmatrix} = \begin{Bmatrix} f_1 \\ f_2 \\ f_3 \end{Bmatrix} \tag{2.20}$$
where
$$k_1 = \frac{E_1 A_1}{l_1} \tag{2.21}$$
and
$$k_2 = \frac{E_2 A_2}{l_2} \tag{2.22}$$
FIGURE 2.3 A straight bar made of two uniform cross-sectional bar members (E1, A1, l1 and E2, A2, l2) clamped at node 1. The bar is subjected to forces f2 at node 2 and f3 at node 3.
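Details about the assembly of the matrices of the members can be found in any FEM textbook (e.g., Liu and Quek, 2003). Using the boundary condition, $u_1 = 0$, Equation 2.20 becomes
$$\begin{bmatrix} \hat{k}_1 + \hat{k}_2 & -\hat{k}_2 \\ -\hat{k}_2 & \hat{k}_2 \end{bmatrix} \begin{Bmatrix} u_2 \\ u_3 \end{Bmatrix} = \begin{Bmatrix} \hat{f}_2 \\ \hat{f}_3 \end{Bmatrix} \tag{2.23}$$
2.2.1
Forward Problem
First examine the conventional forward problem with the conditions given in Table 2.1. Solving Equation 2.23 for the displacements gives
$$\begin{Bmatrix} u_2 \\ u_3 \end{Bmatrix} = \begin{bmatrix} \hat{k}_1 + \hat{k}_2 & -\hat{k}_2 \\ -\hat{k}_2 & \hat{k}_2 \end{bmatrix}^{-1} \begin{Bmatrix} \hat{f}_2 \\ \hat{f}_3 \end{Bmatrix} = \begin{bmatrix} \dfrac{1}{\hat{k}_1} & \dfrac{1}{\hat{k}_1} \\ \dfrac{1}{\hat{k}_1} & \dfrac{\hat{k}_1 + \hat{k}_2}{\hat{k}_1 \hat{k}_2} \end{bmatrix} \begin{Bmatrix} \hat{f}_2 \\ \hat{f}_3 \end{Bmatrix} = \begin{Bmatrix} \dfrac{1}{\hat{k}_1}\left(\hat{f}_2 + \hat{f}_3\right) \\ \dfrac{1}{\hat{k}_1}\hat{f}_2 + \dfrac{\hat{k}_1 + \hat{k}_2}{\hat{k}_1 \hat{k}_2}\hat{f}_3 \end{Bmatrix} \tag{2.24}$$
Using the first equation in Equation 2.20, the reaction force at node 1 is found to be
$$f_1 = -\hat{k}_1 u_2 = -\left(\hat{f}_2 + \hat{f}_3\right) \tag{2.25}$$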
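TABLE 2.1 Cases of Problems for the Composite Bar
Case | Boundary Conditions | Geometry Parameters | Material Property Parameters | External Causes (Forces) | Effects (Displacements, Natural Frequency/Modes)
Forward problem | $u_1 = 0$ | $A_1 = \hat{A}_1$, $l_1 = \hat{l}_1$; $A_2 = \hat{A}_2$, $l_2 = \hat{l}_2$ | $E_1 = \hat{E}_1$; $E_2 = \hat{E}_2$ | $f_2 = \hat{f}_2$; $f_3 = \hat{f}_3$ | $u_2 = ?$; $u_3 = ?$
Inverse problem Case I-1 (even-posed) | $u_1 = 0$ | $A_1 = \hat{A}_1$, $l_1 = \hat{l}_1$; $A_2 = \hat{A}_2$, $l_2 = \hat{l}_2$ | $E_1 = \hat{E}_1$; $E_2 = \hat{E}_2$ | $f_2 = ?$; $f_3 = ?$ | $u_2 = \hat{u}_2$; $u_3 = \hat{u}_3$
Inverse problem Case I-2 (under-posed) | $u_1 = 0$ | $A_1 = \hat{A}_1$, $l_1 = \hat{l}_1$; $A_2 = \hat{A}_2$, $l_2 = \hat{l}_2$ | $E_1 = \hat{E}_1$; $E_2 = \hat{E}_2$ | $f_2 = ?$; $f_3 = ?$ | $u_3 = \hat{u}_3$
Inverse problem Case II-1 (even-posed) | $u_1 = 0$ | $A_1 = \hat{A}_1$, $l_1 = \hat{l}_1$; $A_2 = \hat{A}_2$, $l_2 = \hat{l}_2$ | $E_1 = ?$; $E_2 = ?$ | $f_2 = \hat{f}_2$; $f_3 = \hat{f}_3$ | $u_2 = \hat{u}_2$; $u_3 = \hat{u}_3$
Inverse problem Case II-2 (over-posed) | $u_1 = 0$ | $A_1 = \hat{A}_1$, $l_1 = \hat{l}_1$; $A_2 = \hat{A}_2$, $l_2 = \hat{l}_2$ | $E_1 = \hat{E}_1$; $E_2 = ?$ | $f_2 = \hat{f}_2$; $f_3 = \hat{f}_3$ | $u_2 = \hat{u}_2$; $u_3 = \hat{u}_3$
Inverse problem Case III-1 (even-posed) | $u_1 = 0$ | $A_1 = ?$, $l_1 = \hat{l}_1$; $A_2 = ?$, $l_2 = \hat{l}_2$ | $E_1 = \hat{E}_1$; $E_2 = \hat{E}_2$ | $f_2 = \hat{f}_2$; $f_3 = \hat{f}_3$ | $u_2 = \hat{u}_2$; $u_3 = \hat{u}_3$
Inverse problem Case III-2 (even-posed) | $u_1 = 0$ | $A_1 = \hat{A}_1$, $l_1 = ?$; $A_2 = \hat{A}_2$, $l_2 = ?$ | $E_1 = \hat{E}_1$; $E_2 = \hat{E}_2$ | $f_2 = \hat{f}_2$; $f_3 = \hat{f}_3$ | $u_2 = \hat{u}_2$; $u_3 = \hat{u}_3$
Inverse problem Case IV (even-posed) | $u_1 = ?$; $f_1 = ?$ | $A_1 = \hat{A}_1$, $l_1 = \hat{l}_1$; $A_2 = \hat{A}_2$, $l_2 = \hat{l}_2$ | $E_1 = \hat{E}_1$; $E_2 = \hat{E}_2$ | $f_2 = \hat{f}_2$; $f_3 = \hat{f}_3$ | $u_2 = \hat{u}_2$; $u_3 = \hat{u}_3$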
It is shown again that, for given conditions Equation 2.9 through Equation 2.12, the forward problem can be solved, and the displacements can be uniquely determined. The solution given in Equation 2.24 can be written in the following general form:
$$\mathbf{Y}_{2\times 1} = \hat{\mathbf{S}}_{2\times 2}\,\hat{\mathbf{X}}_{2\times 1} \tag{2.26}$$
From the mechanics point of view, the vector $\mathbf{X}$ in this case is the force vector given by
$$\hat{\mathbf{X}} = \begin{Bmatrix} \hat{f}_2 \\ \hat{f}_3 \end{Bmatrix} \tag{2.27}$$
and $\mathbf{S}$ is the system matrix (known as the flexibility matrix) obtained as
$$\hat{\mathbf{S}} = \begin{bmatrix} \dfrac{1}{\hat{k}_1} & \dfrac{1}{\hat{k}_1} \\ \dfrac{1}{\hat{k}_1} & \dfrac{\hat{k}_1 + \hat{k}_2}{\hat{k}_1 \hat{k}_2} \end{bmatrix} \tag{2.28}$$
which depends on the material property and the geometrical parameters of the system. The vector $\mathbf{Y}$ in this case is the displacement vector (effect of the system) given by
$$\mathbf{Y} = \begin{Bmatrix} u_2 \\ u_3 \end{Bmatrix} \tag{2.29}$$
Mathematically, the vector X is viewed as an input vector, S is termed a transformation matrix, and Y is an output vector, as illustrated in Figure 2.4. Note that, if Equation 2.20 is to be solved without using the boundary condition, this forward problem is also ill-posed and cannot be solved for a unique solution. Also, if k1 or k2 is zero, Equation 2.24 still cannot provide a unique solution. This seems unlikely to happen in this example but, mathematically, it can always be argued that k1 or k2 could be zero, and the solution could be nonunique. In fact, there are such problems in practice. The so-called "locking" problem in mechanics, e.g., "shear locking" (Zienkiewicz and Taylor, 2000), is exactly of this nature. The point here is that the forward problem can also be ill-posed, which is usually said to be not well-defined.
FIGURE 2.4 A simple schematic illustration on forward and inverse problems. [Schematic: input X → system formulated as a transformation matrix S (smoothing operator) → output Y. Forward problem: Y = SX (smoothing operator on X). Inverse problem: Y = S^{-g}X (harshening operator on X).]
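The forward model of the composite bar can be checked numerically. The sketch below builds the flexibility matrix of Equation 2.28 and evaluates Equations 2.24 and 2.25; the function names and the stiffness and force values are illustrative assumptions, chosen so that the arithmetic is easy to follow by hand.

```python
def flexibility_matrix(k1, k2):
    """System (flexibility) matrix S of Equation 2.28."""
    return [[1.0 / k1, 1.0 / k1],
            [1.0 / k1, (k1 + k2) / (k1 * k2)]]


def forward_displacements(k1, k2, f2, f3):
    """Forward problem Y = S X: displacements (u2, u3) from forces (f2, f3)."""
    S = flexibility_matrix(k1, k2)
    u2 = S[0][0] * f2 + S[0][1] * f3
    u3 = S[1][0] * f2 + S[1][1] * f3
    return u2, u3


def reaction_force(k1, u2):
    """Reaction at the clamped node 1 (Equation 2.25)."""
    return -k1 * u2


# Illustrative values only; not taken from the book.
k1, k2 = 2.0, 4.0
f2, f3 = 1.0, 3.0
u2, u3 = forward_displacements(k1, k2, f2, f3)
f1 = reaction_force(k1, u2)
```

For these values the reaction force balances the two applied forces, which is the equilibrium statement contained in Equation 2.25.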
2.2.2
Inverse Problem Case I-1: Load/Force Identification with Unique Solution (Even-Posed System)
Consider now the first case of the inverse problem of load/force identification. The conditions are given in Table 2.1. In this case, the outputs or the effects (displacement) of the system as well as other conditions, such as the boundary condition, geometrical parameters, and material properties, are somehow known, but not the input (load/force). Using the boundary condition, $\hat{u}_1 = 0$, Equation 2.20 becomes
$$\begin{Bmatrix} f_2 \\ f_3 \end{Bmatrix} = \begin{bmatrix} \hat{k}_1 + \hat{k}_2 & -\hat{k}_2 \\ -\hat{k}_2 & \hat{k}_2 \end{bmatrix} \begin{Bmatrix} \hat{u}_2 \\ \hat{u}_3 \end{Bmatrix} = \begin{Bmatrix} \left(\hat{k}_1 + \hat{k}_2\right)\hat{u}_2 - \hat{k}_2 \hat{u}_3 \\ -\hat{k}_2 \hat{u}_2 + \hat{k}_2 \hat{u}_3 \end{Bmatrix} \tag{2.30}$$
These are the two nodal forces input to the system to produce the outputs of nodal displacements $\hat{u}_2$ and $\hat{u}_3$. The Case I-1 inverse problem is therefore successfully solved, and the solution is unique. Equation 2.30 can be written in the general form of
$$\mathbf{X}_{2\times 1} = \hat{\mathbf{S}}^{-1}_{2\times 2}\,\hat{\mathbf{Y}}_{2\times 1} \tag{2.31}$$
where $\hat{\mathbf{S}}_{2\times 2}$ is the system matrix of the forward problem model given in Equation 2.26. Therefore, when the model of the forward problem is given, the output of the system is somehow obtained (via measurement, for example), and the forward transformation matrix is given and invertible, the solution of the inverse problem is obtained by simple matrix inversion. Because the number of unknowns and knowns is the same, this problem is said to be even-posed.
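A minimal sketch of this even-posed inversion follows. It applies Equation 2.30 directly, which is the same as multiplying the measured displacements by the inverse of the flexibility matrix. The function name and the numbers are illustrative assumptions; the "measured" displacements are the ones a forward analysis would give for forces (1, 3) with k1 = 2 and k2 = 4.

```python
def identify_forces(k1, k2, u2_measured, u3_measured):
    """Case I-1: recover (f2, f3) from both displacements (Equation 2.30)."""
    f2 = (k1 + k2) * u2_measured - k2 * u3_measured
    f3 = -k2 * u2_measured + k2 * u3_measured
    return f2, f3


# Illustrative values only; not taken from the book.
k1, k2 = 2.0, 4.0
u2_hat, u3_hat = 2.0, 2.75
f2, f3 = identify_forces(k1, k2, u2_hat, u3_hat)
```

Because two unknowns are determined from two independent equations, the recovered pair of forces is the only one consistent with the data.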
2.2.3
Inverse Problem Case I-2: Load/Force Identification with No Unique Solution (Under-Posed System)
Consider again the first case of the inverse problem of load/force identification. The conditions, knowns, and unknowns are also listed in Table 2.1. In this case, the boundary condition and the geometrical and material properties are known, but only a partial output of the system, that is, $\hat{u}_3$, is known. The input (load/force) must be identified based on Equation 2.30. Because $\hat{u}_2$ is not known, it must be removed from these equations. To do this, Equation 2.30 is first changed to
$$\begin{Bmatrix} \dfrac{1}{\hat{k}_1 + \hat{k}_2} f_2 \\ \dfrac{1}{\hat{k}_2} f_3 \end{Bmatrix} = \begin{Bmatrix} \hat{u}_2 - \dfrac{\hat{k}_2}{\hat{k}_1 + \hat{k}_2}\hat{u}_3 \\ -\hat{u}_2 + \hat{u}_3 \end{Bmatrix} \tag{2.32}$$
Eliminating $\hat{u}_2$ by adding the preceding two equations together and solving for $\hat{u}_3$ yields
$$\underbrace{\hat{u}_3}_{\hat{\mathbf{Y}}} = \underbrace{\begin{bmatrix} \dfrac{1}{\hat{k}_1} & \dfrac{\hat{k}_1 + \hat{k}_2}{\hat{k}_1 \hat{k}_2} \end{bmatrix}}_{\hat{\mathbf{S}}} \underbrace{\begin{Bmatrix} f_2 \\ f_3 \end{Bmatrix}}_{\mathbf{X}} \tag{2.33}$$
In this case the corresponding forward model becomes
$$\hat{\mathbf{Y}}_{1\times 1} = \hat{\mathbf{S}}_{1\times 2}\,\mathbf{X}_{2\times 1} \tag{2.34}$$
where the force (input) vector $\mathbf{X}$ has the form of
$$\mathbf{X} = \begin{Bmatrix} f_2 \\ f_3 \end{Bmatrix} \tag{2.35}$$
The system transformation matrix $\mathbf{S}$ is given by
$$\hat{\mathbf{S}} = \begin{bmatrix} \dfrac{1}{\hat{k}_1} & \dfrac{\hat{k}_1 + \hat{k}_2}{\hat{k}_1 \hat{k}_2} \end{bmatrix} \tag{2.36}$$
and the displacement (output) vector becomes
$$\hat{\mathbf{Y}} = \hat{u}_3 \tag{2.37}$$
In this inverse problem, the input $\mathbf{X}$ must be found from the given output $\mathbf{Y}$ based on the forward model given by Equation 2.34. From Equation 2.33, it is clear that multiple solutions for the two inputs $f_2$ and $f_3$ exist, because two unknowns must be determined with only one equation. It is necessary to obtain the inverse of the system transformation matrix $\mathbf{S}$, which is "fat" with dimension $1 \times 2$, so that the solution can be given by
$$\mathbf{X}_{2\times 1} = \hat{\mathbf{S}}^{-g}_{2\times 1}\,\hat{\mathbf{Y}}_{1\times 1} \tag{2.38}$$
where Sˆ 2−×g1 is a generalized inverse matrix of Sˆ 2×1 that can be obtained using the so-called minimum length (ML) method as follows (see the details later in Section 2.6.2).
(
Sˆ 2−×g1 = Sˆ T2×1 Sˆ 1×2Sˆ T2×1
)
−1
=
( (
kˆ1 + kˆ2 + kˆ22 kˆ1kˆ2 kˆ1 + kˆ2 2 kˆ1 + kˆ2 + kˆ22 kˆ1kˆ22
)
(
2
)
)
(2.39)
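The solution of this inverse problem to determine the input of forces becomes

\[
X_{2\times 1} = \begin{Bmatrix} f_2 \\ f_3 \end{Bmatrix}
= \hat{S}^{-g}_{2\times 1}\, \hat{Y}_{1\times 1}
= \begin{Bmatrix}
\dfrac{\hat{k}_1 \hat{k}_2^{2}}{(\hat{k}_1 + \hat{k}_2)^2 + \hat{k}_2^{2}} \\[2.5ex]
\dfrac{\hat{k}_1 \hat{k}_2 (\hat{k}_1 + \hat{k}_2)}{(\hat{k}_1 + \hat{k}_2)^2 + \hat{k}_2^{2}}
\end{Bmatrix} \hat{u}_3
\tag{2.40}
\]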
Because the number of unknowns is more than the number of knowns, this problem is said to be under-posed. Note that, for this kind of under-posed system, many other types of generalized inverse matrix for the system transformation matrix can be defined based on other criteria. The key point here is that the solution is nonunique, and will not always be reliable. In fact, this is one of the causes of the so-called ill-posedness of inverse problems. This book classifies this type of problem as a Type I ill-posed inverse problem. Trying to obtain additional information from the system to make the problem even- or over-posed is the most reliable approach to obtain an accurate solution for this type of ill-posed inverse problem. For example, if the relation between $f_2$ and $f_3$ is somehow known, the inverse problem will be well defined because a unique solution can be obtained. Note that Type I ill-posedness is also seen in the forward problem defined in Equation 2.2 and Equation 2.23, if one of the nodal forces is unknown. An under-posed problem is always ill-posed.
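The nonuniqueness described above can be checked numerically. The following is a minimal sketch, assuming illustrative values for the stiffnesses and the measured displacement (the numbers and function names are not from the text). It evaluates the ML solution of Equation 2.40 and compares it with another load pair that reproduces the same displacement through Equation 2.33.

```python
# Minimum length (ML) solution of the under-posed system of Equation 2.33.
# All numerical values below are illustrative assumptions.

def ml_forces(k1, k2, u3):
    """Return (f2, f3) from Equation 2.40."""
    denom = (k1 + k2) ** 2 + k2 ** 2
    f2 = k1 * k2 ** 2 / denom * u3
    f3 = k1 * k2 * (k1 + k2) / denom * u3
    return f2, f3

def predict_u3(k1, k2, f2, f3):
    """Forward model of Equation 2.33."""
    return f2 / k1 + (k1 + k2) / (k1 * k2) * f3

k1, k2, u3 = 2.0, 1.0, 0.5
f2_ml, f3_ml = ml_forces(k1, k2, u3)

# A different load pair that gives the same u3: all load applied at node 2.
f2_alt, f3_alt = k1 * u3, 0.0

for f2, f3 in [(f2_ml, f3_ml), (f2_alt, f3_alt)]:
    # Columns: f2, f3, predicted u3, squared length of the force vector
    print(f2, f3, predict_u3(k1, k2, f2, f3), f2 ** 2 + f3 ** 2)
```

Both load pairs reproduce the measured displacement, and the ML pair simply has the smaller length. Nothing in the data indicates which pair was actually applied.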
2.2.4
Inverse Problem Case II-1: Material Property Identification with Unique Solution (Even-Posed System)
Consider now the second case of an inverse problem of material property identification. The conditions are given in Table 2.1. In this case, everything but the two Young's moduli of the two bar members is known. These Young's moduli must be identified using the outputs of the system obtained experimentally. Rewrite Equation 2.23 in the following form with $k_1$ and $k_2$ the unknowns:

\[
\begin{aligned}
\hat{u}_2 k_1 + (\hat{u}_2 - \hat{u}_3) k_2 &= \hat{f}_2 \\
(-\hat{u}_2 + \hat{u}_3) k_2 &= \hat{f}_3
\end{aligned}
\tag{2.41}
\]

The matrix form of the forward model is then obtained as

\[
\underbrace{\begin{bmatrix} \hat{u}_2 & \hat{u}_2 - \hat{u}_3 \\ 0 & -\hat{u}_2 + \hat{u}_3 \end{bmatrix}}_{\hat{S}}
\underbrace{\begin{Bmatrix} k_1 \\ k_2 \end{Bmatrix}}_{X}
=
\underbrace{\begin{Bmatrix} \hat{f}_2 \\ \hat{f}_3 \end{Bmatrix}}_{\hat{Y}}
\tag{2.42}
\]
In this type of problem, the system transformation matrix depends on the measured displacements. The output vector is the external forces, and the input vector refers to the material property and the geometrical parameters of the system. This fact reveals an important feature of inverse problems: the system matrix is not limited to representing the characteristics of the structure system, and it can be formed using the field variables of the mechanics problem. In addition, the input vector X for this model is the stiffness of the structure system related to the material property and the geometrical parameters of the system. Solving the above equation for input X gives

\[
X_{2\times 1} = \hat{S}^{-1}_{2\times 2}\, \hat{Y}_{2\times 1}
\tag{2.43}
\]

where

\[
\hat{S}^{-1}
= \begin{bmatrix} \hat{u}_2 & \hat{u}_2 - \hat{u}_3 \\ 0 & -\hat{u}_2 + \hat{u}_3 \end{bmatrix}^{-1}
= \begin{bmatrix} \dfrac{1}{\hat{u}_2} & \dfrac{1}{\hat{u}_2} \\[2ex] 0 & \dfrac{1}{-\hat{u}_2 + \hat{u}_3} \end{bmatrix}
\tag{2.44}
\]
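Therefore,

\[
X = \begin{Bmatrix} k_1 \\ k_2 \end{Bmatrix}
= \hat{S}^{-1}_{2\times 2}\, \hat{Y}_{2\times 1}
= \begin{bmatrix} \dfrac{1}{\hat{u}_2} & \dfrac{1}{\hat{u}_2} \\[2ex] 0 & \dfrac{1}{\hat{u}_3 - \hat{u}_2} \end{bmatrix}
\begin{Bmatrix} \hat{f}_2 \\ \hat{f}_3 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{1}{\hat{u}_2}\left(\hat{f}_2 + \hat{f}_3\right) \\[2ex] \dfrac{1}{\hat{u}_3 - \hat{u}_2}\,\hat{f}_3 \end{Bmatrix}
\tag{2.45}
\]

Using the preceding equation, the Young's modulus of these two bar members can be determined:

\[
\begin{Bmatrix} E_1 \\ E_2 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{l}_1}{\hat{A}_1}\, k_1 \\[2ex] \dfrac{\hat{l}_2}{\hat{A}_2}\, k_2 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{l}_1}{\hat{A}_1 \hat{u}_2}\left(\hat{f}_2 + \hat{f}_3\right) \\[2ex] \dfrac{\hat{l}_2}{\hat{A}_2 (\hat{u}_3 - \hat{u}_2)}\,\hat{f}_3 \end{Bmatrix}
\tag{2.46}
\]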
If $\hat{u}_2 = \hat{u}_3$, $E_2$ cannot be determined, and if $\hat{u}_2 = 0$, $E_1$ cannot be determined. This reveals another very important feature of inverse problems: situations can exist in which the solution process fails. In addition, when $\hat{u}_2$ (or $\hat{u}_2 - \hat{u}_3$) is very small and contains error, the error in the estimated $E_1$ (or $E_2$) is magnified and the estimate can even become unstable (a small change in $\hat{u}_2$ could result in a big change in $E_1$). This reveals another very important feature of inverse problems: the error in the solution can be magnified or the solution can be unstable. This instability is responsible for the ill-posedness of inverse problems. This book classifies this type of problem as a Type II ill-posed inverse problem. Note that this instability or ill-posedness is caused mathematically by the rank of the system transformation matrix S defined in Equation 2.42. It is clearly seen that when $\hat{u}_2 = 0$ or $\hat{u}_2 = \hat{u}_3$, S has only a rank of 1. The physical cause of this ill-posedness is that $E_2$ is not sensitive to any measurement that produces $\hat{u}_2 = \hat{u}_3$, because such a measurement will not cause any deformation in bar number 2. Therefore, there is no way to determine $E_2$ from such a measurement. Similarly, $E_1$ is not sensitive to any measurement that produces $\hat{u}_2 = 0$ because such a measurement will not cause any deformation in bar number 1. Because the unknowns and knowns are equal in number, this problem is said to be even-posed. Note that an even-posed system does not necessarily guarantee a stable solution for the inverse problem due to the possible Type II ill-posedness of the problem mentioned previously. Even-posed problems can also be ill-posed. Note that the Type II ill-posedness in the forward problem has been observed with a solution of Equation 2.24 when $k_1$ or $k_2$ is zero (see the discussion in the last paragraph of Section 2.2.1).
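The loss of sensitivity can be illustrated with a short numerical experiment. This is a sketch under assumed values (unit forces, unit stiffness for bar 1, and an absolute error of 0.001 added to the measured tip displacement); the function names are hypothetical. It generates displacements from Equation 2.23, perturbs them, and recovers the stiffnesses with Equation 2.45.

```python
def forward_displacements(k1, k2, f2, f3):
    """Solve Equation 2.23 for (u2, u3) with node 1 clamped."""
    u2 = (f2 + f3) / k1
    return u2, u2 + f3 / k2

def identify_stiffness(u2, u3, f2, f3):
    """Equation 2.45: k1 = (f2 + f3)/u2 and k2 = f3/(u3 - u2)."""
    return (f2 + f3) / u2, f3 / (u3 - u2)

f2 = f3 = 1.0
noise = 1e-3                      # assumed absolute error added to u3
rel_errors = []
for k2_true in (1.0, 100.0):      # a stiff bar 2 makes u3 - u2 small
    u2, u3 = forward_displacements(1.0, k2_true, f2, f3)
    _, k2_est = identify_stiffness(u2, u3 + noise, f2, f3)
    rel_errors.append(abs(k2_est - k2_true) / k2_true)
    # Columns: true k2, elongation of bar 2, relative error of estimated k2
    print(k2_true, u3 - u2, rel_errors[-1])
```

The same measurement error produces a much larger relative error in the estimated stiffness when the elongation of bar 2 is small, which is the Type II behavior described above.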
2.2.5
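Inverse Problem Case II-2: Material Property Identification with No Unique Solution (Over-Posed System)
Consider again case II-1, but assume that $E_1 = \hat{E}_1$, as shown in Table 2.1. In this case, Young's modulus $E_2$ can be posed using the following two equations:

\[
\begin{aligned}
\hat{u}_2 \frac{\hat{E}_1 \hat{A}_1}{\hat{l}_1} + (\hat{u}_2 - \hat{u}_3)\frac{\hat{A}_2}{\hat{l}_2} E_2 &= \hat{f}_2 \\
(-\hat{u}_2 + \hat{u}_3)\frac{\hat{A}_2}{\hat{l}_2} E_2 &= \hat{f}_3
\end{aligned}
\tag{2.47}
\]

or in the matrix form of

\[
\underbrace{\begin{bmatrix} (\hat{u}_2 - \hat{u}_3)\dfrac{\hat{A}_2}{\hat{l}_2} \\[2ex] (-\hat{u}_2 + \hat{u}_3)\dfrac{\hat{A}_2}{\hat{l}_2} \end{bmatrix}}_{\hat{S}}
\underbrace{\{E_2\}}_{X}
=
\underbrace{\begin{Bmatrix} \hat{f}_2 - \hat{u}_2 \dfrac{\hat{E}_1 \hat{A}_1}{\hat{l}_1} \\[2ex] \hat{f}_3 \end{Bmatrix}}_{\hat{Y}}
\tag{2.48}
\]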
It is seen that this system is over-posed because, for one unknown, there are two equations. Two different contradicting solutions for $E_2$ could exist. Therefore, strictly speaking, no solutions satisfy both equations in Equation 2.48. To obtain the input X, it is necessary to perform the inversion of the system transformation matrix S that is "slim" with dimension of 2 × 1, and the solution can be given by

\[
X_{1\times 1} = \hat{S}^{-g}_{1\times 2}\, \hat{Y}_{2\times 1}
\tag{2.49}
\]

where $\hat{S}^{-g}_{1\times 2}$ is a generalized inverse matrix of $\hat{S}_{2\times 1}$ that can be obtained using the least square method (LSM). (See details later in Section 2.6.4.)

\[
\hat{S}^{-g}_{1\times 2} = \left( \hat{S}^{T}_{1\times 2}\, \hat{S}_{2\times 1} \right)^{-1} \hat{S}^{T}_{1\times 2}
\tag{2.50}
\]
Because the number of unknowns is less than the number of knowns, this problem is said to be over-posed. It should be emphasized that over-posed problems can also be ill-posed. This can be clearly observed from Equation 2.48, when $\hat{u}_2 = \hat{u}_3$. In such a case, $E_2$ is not defined and will be unstable if noisy data of $\hat{u}_2$ and $\hat{u}_3$ are used, which is a typical Type II ill-posedness.
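Because S has a single column here, Equation 2.50 reduces to a ratio of two scalar products. The sketch below assumes unit areas and lengths, a true modulus of 2 for bar 2, and a 10% error on one force; these values and the function name are illustrative only.

```python
def ls_modulus(u2, u3, f2, f3, E1, A1, l1, A2, l2):
    """Least-squares E2 from Equations 2.48 to 2.50 (S is 2 x 1)."""
    s = [(u2 - u3) * A2 / l2, (u3 - u2) * A2 / l2]   # the column of S
    y = [f2 - u2 * E1 * A1 / l1, f3]                 # the output vector Y
    sts = s[0] ** 2 + s[1] ** 2                      # S^T S, a scalar here
    if sts == 0.0:
        raise ZeroDivisionError("u2 == u3: E2 cannot be identified")
    return (s[0] * y[0] + s[1] * y[1]) / sts

# Displacements for E1 = 1, E2 = 2, unit areas and lengths, f2 = f3 = 1
u2, u3 = 2.0, 2.5
E2_exact = ls_modulus(u2, u3, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
# With f3 measured as 1.1 the two equations disagree (2.0 versus 2.2)
E2_noisy = ls_modulus(u2, u3, 1.0, 1.1, 1.0, 1.0, 1.0, 1.0, 1.0)
print(E2_exact, E2_noisy)
```

With consistent data the true modulus is recovered. With the perturbed force the least-squares value falls between the two contradicting single-equation estimates.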
2.2.6
Inverse Problem Case III: Geometry Identification with Unique Solution
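Consider now the third case of inverse problems of geometrical parameter identification, as specified in Table 2.1. In this case, everything is known but the areas of these two bar members. To determine these two geometrical parameters, use Equation 2.45 and Equation 2.3:

\[
\begin{Bmatrix} A_1 \\ A_2 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{l}_1}{\hat{E}_1}\, k_1 \\[2ex] \dfrac{\hat{l}_2}{\hat{E}_2}\, k_2 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{l}_1}{\hat{E}_1 \hat{u}_2}\left(\hat{f}_2 + \hat{f}_3\right) \\[2ex] \dfrac{\hat{l}_2}{\hat{E}_2 (\hat{u}_3 - \hat{u}_2)}\,\hat{f}_3 \end{Bmatrix}
\tag{2.51}
\]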
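Note that if $\hat{u}_2 = \hat{u}_3$, $A_2$ cannot be determined, and if $\hat{u}_2 = 0$, $A_1$ and $A_2$ cannot be determined. Therefore, Type II ill-posedness could exist. Exactly the same procedure is applicable in determining the length of the bar members, with the result:

\[
\begin{Bmatrix} l_1 \\ l_2 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{E}_1 \hat{A}_1}{k_1} \\[2ex] \dfrac{\hat{E}_2 \hat{A}_2}{k_2} \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{E}_1 \hat{A}_1}{\hat{f}_2 + \hat{f}_3}\,\hat{u}_2 \\[2ex] \dfrac{\hat{E}_2 \hat{A}_2}{\hat{f}_3}\,(\hat{u}_3 - \hat{u}_2) \end{Bmatrix}
\tag{2.52}
\]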
For this particular example, the case III inverse problem has the same characteristics as case II-1.
2.2.7
Inverse Problem Case IV, Boundary Condition Identification
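Now consider an inverse problem for the identification of boundary conditions. The conditions are given in Table 2.1. In this case everything but the boundary conditions is known. Using the first two equations in Equation 2.20 yields

\[
\begin{aligned}
\hat{k}_1 u_1 - f_1 &= \hat{k}_1 \hat{u}_2 \\
-\hat{k}_1 u_1 &= -\left(\hat{k}_1 + \hat{k}_2\right)\hat{u}_2 + \hat{k}_2 \hat{u}_3 + \hat{f}_2
\end{aligned}
\tag{2.53}
\]

or

\[
\underbrace{\begin{bmatrix} \hat{k}_1 & -1 \\ -\hat{k}_1 & 0 \end{bmatrix}}_{\hat{S}}
\underbrace{\begin{Bmatrix} u_1 \\ f_1 \end{Bmatrix}}_{X}
=
\underbrace{\begin{Bmatrix} \hat{k}_1 \hat{u}_2 \\ -\left(\hat{k}_1 + \hat{k}_2\right)\hat{u}_2 + \hat{k}_2 \hat{u}_3 + \hat{f}_2 \end{Bmatrix}}_{\hat{Y}}
\tag{2.54}
\]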
Note in this case that the input and the output vectors consist of both components of displacement and force.
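Solving Equation 2.54 for the unknown input of boundary values yields

\[
X = \hat{S}^{-1}\hat{Y}
= \begin{bmatrix} \hat{k}_1 & -1 \\ -\hat{k}_1 & 0 \end{bmatrix}^{-1}
\begin{Bmatrix} \hat{k}_1 \hat{u}_2 \\ -\left(\hat{k}_1 + \hat{k}_2\right)\hat{u}_2 + \hat{k}_2 \hat{u}_3 + \hat{f}_2 \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\hat{k}_1 + \hat{k}_2}{\hat{k}_1}\,\hat{u}_2 - \dfrac{\hat{k}_2 \hat{u}_3 + \hat{f}_2}{\hat{k}_1} \\[2ex] \hat{k}_2 (\hat{u}_2 - \hat{u}_3) - \hat{f}_2 \end{Bmatrix}
\tag{2.55}
\]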
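Equation 2.55 can be checked against data generated for a bar that is clamped at node 1. The following sketch assumes illustrative stiffness and force values and a hypothetical function name; for such data the recovered displacement at node 1 should vanish and the recovered reaction should balance the applied forces.

```python
def boundary_values(k1, k2, u2, u3, f2):
    """Equation 2.55: displacement u1 and nodal force f1 at node 1."""
    u1 = (k1 + k2) / k1 * u2 - (k2 * u3 + f2) / k1
    f1 = k2 * (u2 - u3) - f2
    return u1, f1

# Assumed data: k1 = 1, k2 = 2, f2 = f3 = 1, bar clamped at node 1
k1, k2, f2, f3 = 1.0, 2.0, 1.0, 1.0
u2 = (f2 + f3) / k1          # displacement of node 2 from Equation 2.23
u3 = u2 + f3 / k2            # displacement of node 3 from Equation 2.23
u1, f1 = boundary_values(k1, k2, u2, u3, f2)
print(u1, f1)
```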
This shows that the boundary conditions can be determined if all the other parameters of the system are given and the inputs and the effects of the system are known. Because the number of unknowns is the same as the number of knowns, this problem is even-posed.
2.2.8
Points to Note
• As shown in many of the preceding cases, not all the inverse problems are ill-posed. In fact, careful formulation of inverse problems and better planning of experiment strategy can always help to well-pose an inverse problem.
• Forward problems and inverse problems can be expressed in a discrete matrix form based on the physics (mechanics) of the problem.
• Both the forward problem and the inverse problem can have Type I and Type II ill-posedness.
• An under-posed problem is always ill-posed; an even- or over-posed problem may or may not be ill-posed.
Based on these points, it may be argued that forward and inverse problems are apparently mutually reciprocal mathematically. The inverse problem of an inverse problem returns to the forward problem, if the viewpoint is reversed. It seems acceptable to call the previously defined inverse problem a forward problem, as long as the previously defined forward problem is redefined as an inverse problem. This argument is not wrong if the system is discrete in nature. Continuous systems need a better and more precise definition. Otherwise, someone could easily conclude that all the existing techniques for the conventional forward problems should be applicable to all the inverse problems, and that special methods to solve inverse problems are not needed. The question then is what the decisive property of the inverse problems of continuous systems is. The answer is the differential operator. To illustrate this clearly and explicitly, a one-dimensional continuous problem is used to reveal an important property of the inverse problems: Type III ill-posedness.
2.3
Type III Ill-Posedness
A very simple problem is used to reveal explicitly and to examine clearly the Type III ill-posedness.
2.3.1
Forward Problem
Consider the following simple continuous system equation that governs the static state of the simple bar problem examined in Section 2.1.

\[
\frac{du(x)}{dx} = \frac{f_2}{EA}
\tag{2.56}
\]

The boundary condition for this problem is given by Equation 2.12, as shown in Figure 2.2. The detailed procedure that leads to Equation 2.56 can be found in Section 1.2 in a textbook by Liu and Xi (2001). The conventional forward problem is to solve the axial displacement u via the following integral operation:

\[
u(x) = \int \frac{f_2}{EA}\, dx + c_0
\tag{2.57}
\]

where $c_0$ is the integral constant to be determined by the boundary condition Equation 2.12.
2.3.2
Differential Operation: Magnification of Error
Now estimate $f_2$ using displacement u(x) measured at x that is an internal point in the bar. In a practical situation of measurement, there will be an error or noise that can be expressed in the form of

\[
u^{m} = u^{a} + u^{noise} = u^{a} + u^{a} e \sin(\omega_{noise} x) = u^{a}\left[1 + e \sin(\omega_{noise} x)\right]
\tag{2.58}
\]

where the superscript m stands for the measurement data, a stands for the result being analytical or exact, e is the noise level that is usually much smaller than 1, and $\omega_{noise}$ is the frequency of the noise distribution along x. In the inverse analysis, $u^{a}$ is not known; only $u^{m}$ is known. To view the error more clearly in graphs, simplify the problem by assuming

\[
l = 1.0, \quad EA = 1.0
\tag{2.59}
\]

and

\[
\omega_{noise} = 10\pi, \quad e = 0.01
\tag{2.60}
\]
which implies a 1% measurement error relative to u(x), which is, in fact, a very good measurement. However, the frequency of the noise in the measurement data is high, which is also very common in measurement errors. Using the parameters given in Equation 2.59, the exact solution of u becomes

\[
u^{a} = x
\tag{2.61}
\]

The simulated measurement is

\[
u^{m} = u^{a} + u^{noise} = x\left[1 + e \sin(10\pi x)\right]
\tag{2.62}
\]
which is plotted in Figure 2.5 together with the exact displacement. It is shown that the measured displacement is indeed very accurate. Use of this erroneous result in Equation 2.56 for the inverse prediction of force can result in a magnification of errors. This magnification of error causes the instability in the inverse analysis procedure. To demonstrate this, substitute Equation 2.58 into Equation 2.56 to obtain

\[
\frac{du^{m}}{dx} = \frac{du^{a}}{dx}\left[1 + e \sin(\omega_{noise} x)\right] + u^{a}\,\omega_{noise}\, e \cos(\omega_{noise} x) = \frac{f_2^{e}}{EA}
\tag{2.63}
\]

where the superscript e stands for the estimated value. Therefore, $f_2^{e}$ can be inversely determined as

\[
f_2^{e} = \underbrace{EA\,\frac{du^{a}}{dx}}_{f^{a}}\left[1 + e \sin(\omega_{noise} x)\right] + EA\, u^{a}\, e \underbrace{\omega_{noise}}_{\text{factor of magnification}} \cos(\omega_{noise} x)
\tag{2.64}
\]
It is seen clearly that the error in the inverse solution is magnified drastically, by a factor of $\omega_{noise}$. For parameters given by Equation 2.59 and Equation 2.60, using Equation 2.64 yields

\[
f_2^{e} = f_2^{a}\left[1 + 0.01 \times \sin(10\pi x)\right] + \underbrace{10\pi}_{\text{factor of magnification}} \times\, 0.01 \times x \cos(10\pi x)
\tag{2.65}
\]
The results of the estimated force for the unit true force are plotted in Figure 2.6. The magnification of errors is clearly evidenced. This magnification is obviously caused by the differential operation in the system equation.
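The size of the magnification can be reproduced with a few lines of code. This is a minimal sketch that evaluates Equation 2.62 and Equation 2.65 on a grid, assuming a unit true force (so that the exact displacement is x) and a sampling grid of 1000 points; the constant and function names are illustrative.

```python
import math

E_NOISE = 0.01            # noise level e, Equation 2.60
OMEGA = 10.0 * math.pi    # noise frequency, Equation 2.60
F_TRUE = 1.0              # assumed unit true force, so that u_a = x

def u_measured(x):
    """Simulated measurement of Equation 2.62."""
    return x * (1.0 + E_NOISE * math.sin(OMEGA * x))

def f_estimated(x):
    """Force estimated by differentiating the measurement, Equation 2.65."""
    return (F_TRUE * (1.0 + E_NOISE * math.sin(OMEGA * x))
            + OMEGA * E_NOISE * x * math.cos(OMEGA * x))

xs = [i / 1000.0 for i in range(1, 1001)]          # grid on (0, 1]
max_u_err = max(abs(u_measured(x) - x) / x for x in xs)
max_f_err = max(abs(f_estimated(x) - F_TRUE) / F_TRUE for x in xs)
print(max_u_err, max_f_err)
```

The relative error of the displacement stays at about 1%, while the relative error of the estimated force reaches roughly 10π × 0.01, that is, about 31%.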
FIGURE 2.5 The displacement in a clamped uniform bar subject to a force at the free end. Comparison of the exact result (dashed line) and the simulated measurement (solid line) with 1% oscillatory error. (Axes: x, u.)

FIGURE 2.6 Inverse solution of the force (dashed line) applied on the clamped uniform bar using the simulated measurement of displacement with 1% oscillatory error compared with the true force (solid line). (Horizontal axis: x.)
2.3.3
Definition of Type III Ill-Posedness
In practice, the situation can get worse because the measurement error may not be expressible in a continuous form such as Equation 2.58, which means that the measured displacement may not be differentiable. This differentiation-induced magnification of error in the inverse solution is an often encountered ill-posedness of inverse problems. This type of ill-posed problem is classified here as the Type III ill-posed inverse problem.
The preceding example clearly shows that the root of the Type III ill-posed problem is that the differential operator is applied to an erroneous measured displacement. Even though the measurement error in displacement is very small, the error in the inverse solution of force is drastically magnified, in proportion to the frequency of the error distribution. The higher the frequency, the larger the error of the inverse solution is. The differential operator is therefore termed a harshening operator in this book. Why is there no such magnification in the forward solution? The answer is that the forward solution is obtained via Equation 2.57, that is, an integral operation. Even if an oscillatory error is in the measurement data of the force used in Equation 2.57, the integral operation can smear out the error. The integral operator is therefore called a smoothing operator. Note that, for the majority of engineering problems, the field variables (unknowns) are governed by partial or ordinary differential equations. The solution for this kind of problem is obtained through a series of smoothing integral operators; therefore, the solution will be stable with respect to perturbations or oscillatory errors in the inputs. Following this argument, the forward problems for continuous systems may be defined from the mathematical viewpoint as problems whose unknowns (field variables) are governed by ordinary or partial differential equations. On the other hand, the inverse problem for continuous systems may be defined as problems whose knowns (effects) are subjected to differential operations. These definitions of forward and inverse problems are illustrated in Figure 2.4; they make good sense for continuous systems. For discrete systems like the ones discussed in Section 2.1 and Section 2.2, definitions based on the physics and convention make more sense.
2.3.4
A Simple Solution for Type III Ill-Posed Inverse Problems
If Equation 2.57 is integrated first to establish a relationship between knowns and unknowns (u and $f_2$) that does not contain any differential operator, the Type III ill-posedness can be entirely removed. To illustrate this in detail, performing the integration of Equation 2.57 and applying the boundary condition at x = 0 yields

\[
u(x) = \frac{f_2\, x}{EA}
\tag{2.66}
\]

or

\[
f_2 = \frac{EA\, u(x)}{x}
\tag{2.67}
\]

This relation between u and $f_2$ does not contain any differential operator. Now Equation 2.62 is substituted into Equation 2.67:

\[
f_2 = \frac{EA}{x}\, u^{a}\left[1 + e \sin(\omega_{noise} x)\right] = f_2^{a}\left[1 + e \sin(\omega_{noise} x)\right]
\tag{2.68}
\]
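Equation 2.67 and Equation 2.68 can be evaluated directly. The sketch below assumes the parameters of Equation 2.59 and Equation 2.60, a unit true force, and a grid of 1000 sampling points; the function name is hypothetical.

```python
import math

def f_from_integral_form(x, e=0.01, omega=10.0 * math.pi, EA=1.0, f_true=1.0):
    """Estimate f2 from a noisy displacement with Equation 2.67."""
    u_a = f_true * x / EA                          # exact displacement, Eq. 2.66
    u_m = u_a * (1.0 + e * math.sin(omega * x))    # noisy measurement, Eq. 2.58
    return EA * u_m / x                            # no differentiation involved

xs = [i / 1000.0 for i in range(1, 1001)]          # grid on (0, 1]
max_rel_err = max(abs(f_from_integral_form(x) - 1.0) for x in xs)
print(max_rel_err)
```

The largest relative error of the estimated force equals the noise level of the measurement, about 1%, with no dependence on the noise frequency.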
Clearly, the error of measurement of the displacement is not magnified, as shown in Figure 2.7; it is simply transmitted to the estimated force. Therefore, no Type III ill-posed problem is observed. This finding has revealed a very important technique to remove Type III ill-posedness, which is to solve the differential (or partial differential) equations analytically first before introducing the noisy data for inverse analysis. This seems to be the most effective way to solve the Type III ill-posed inverse problems. However, not many governing partial differential equations in engineering problems can be solved analytically. Most engineering problems must be solved using numerical methods of domain or time discretization. The differential operator is therefore changed to a discrete operator of discrete transformation matrix. This operator is smoother than the original continuous differential operator; therefore, the Type III ill-posedness will be reduced. This discrete effect is termed the projection regularization and will be discussed in great detail in Chapter 3.

FIGURE 2.7 Inverse solution of the force (dashed line) applied on the clamped uniform bar using the simulated measurement of displacement with 1% oscillatory error compared with the true force (solid line). The inversion is based on Equation 2.67. It is clearly shown that the Type III ill-posedness has been removed from the problem. (Axes: x, Force.)

2.3.5
Features of Ill-Posedness
Equation 2.64 clearly shows that the error magnification factor is proportional to the frequency (or wavenumber) of the noise. If the frequency is low or the wavelength is large, the ill-posedness will be proportionally reduced. Therefore, Type III ill-posedness is sensitive only to high frequency (or larger wavenumber) of the noise in the measurement. In extreme cases when the measurement contains an error of constant shift (with zero frequency), the error in the inverse solution will be zero. This can be clearly shown as follows, assuming that

\[
u^{m} = u^{a} + u^{noise} = u^{a} + c
\tag{2.69}
\]

where c is a constant, yielding

\[
f^{e} = EA\,\frac{\partial u^{m}}{\partial x} = EA\,\frac{\partial}{\partial x}\left(u^{a} + c\right) = EA\,\frac{\partial u^{a}}{\partial x} = f^{a}
\tag{2.70}
\]

This feature of Type III ill-posedness is the basis of the regularization by discretization or projection.
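The proportionality between the noise frequency and the error can be seen by evaluating Equation 2.64 for several frequencies. This is a sketch assuming EA = 1, a unit true force, a 1% noise level, and frequencies of π, 10π, and 100π; the function name and the sampling density are illustrative.

```python
import math

def max_force_error(omega, e=0.01, samples=10000):
    """Largest |f_e - f_a| on 0 < x <= 1 from Equation 2.64 with EA = f_a = 1."""
    worst = 0.0
    for i in range(1, samples + 1):
        x = i / samples
        err = e * math.sin(omega * x) + e * omega * x * math.cos(omega * x)
        worst = max(worst, abs(err))
    return worst

# Same noise level, three different noise frequencies
errors = {m: max_force_error(m * math.pi) for m in (1, 10, 100)}
for m, err in errors.items():
    print(m, err)
```

Each tenfold increase of the noise frequency raises the largest force error by about a factor of ten, although the displacement error stays at 1% in all three cases.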
2.4
Types of Ill-Posed Inverse Problems
Based on discussions in the preceding sections, inverse problems can be defined in two different ways:
• For discrete systems the inverse problem is defined based on the physical nature or the conventional way of formulating problems. The forward and inverse problems are interchangeable following the change of the viewpoint.
• For continuous systems, the inverse problem is defined as a problem whose knowns (effects) are subjected to differential operations.
The definition of the inverse problem is less important compared to the ill-posedness of inverse problems. Summarizing the discussions in the previous sections, three types of ill-posed inverse problems are listed in Table 2.2. Type I ill-posedness is caused by the fact that the unknowns are more than the knowns, which leads to an under-posed system, as shown in case I-2. The best way to remove this ill-posedness is to perform more tests to make the problem even- or over-posed or use other additional information to improve the uniqueness. Type II ill-posedness is caused by the fact that the unknowns are not sensitive to the knowns, leading to a rank deficiency in the transformation matrix, as shown in case II-1 when $\hat{u}_2 = \hat{u}_3$. The best way to remove this type of ill-posedness is to modify the experiments to improve the sensitivity. Use of regularization method can often solve this problem (discussed in detail in Chapter 3). Type III ill-posedness is caused by the harshening differential operator, as demonstrated in Section 2.3.2. The best way to remove this ill-posedness is to solve the ordinary or partial differential equations first to obtain the relationship between knowns and unknowns that do not contain differential operators. However, this may not be possible for many complex engineering problems. Use of regularization method can also solve this problem.

TABLE 2.2 Types of Ill-Posed Inverse Problems

| Types | Cause of Ill-Posedness | Solution |
| --- | --- | --- |
| Type I | Under-posed | (1) Perform more tests to make the problem even- or over-posed; use additional information. (2) ML solution or other type of solution suitable for the particular problem. |
| Type II | Lack of sensitivity | (1) To change the experimental strategy to increase the sensitivity between the knowns and unknowns; use additional information. (2) To increase the over-posedness. (3) Regularization. |
| Type III | Harshening operation on noisy data | (1) To solve the PDE analytically before inversion. (2) Regularization. |

Note: Type I and Type II are shared with forward problems; Type III is unique to the inverse problem, following the definition of the inverse problem in this book.

Note that the use of some of the regularization methods should be the last resort, when direct information that can be used to well-pose the problem or to make the problem sufficiently over-posed cannot be obtained. Care must be taken in the use of regularization methods, because their side effects can lead to the reduction of accuracy or even nonphysical solutions. For many regularization methods, additional efforts are often needed just to determine the regularization parameters. A good understanding of the regularization method used is always essential. Type I and II ill-posedness exists also in forward problems and therefore is not unique to inverse problems. Type III ill-posedness is unique to inverse problems, based on the definition of inverse problem in this book.
2.5
Explicit Matrix Systems
Table 2.3 shows four types of inverse problems in the area of mechanics of solids and structures. This classification is based on physics of the application problems and is far from complete. For example, the input or output vector or the system matrix could have all sorts of combinations of parameters; the eigenvalues and eigenvectors can also be involved in the problem. It is therefore not possible to list all cases of inverse problems exhaustively, even in the area of mechanics of solids and structures. The important point here is that the inverse problem is diversified. The system matrix S can be practically formed with any kind of combination of parameters of the system, as can input and output vectors. Note also that all types of problems can be under-posed, even-posed, or over-posed, depending on the numbers of components in the input and output vectors.

TABLE 2.3 Cases of Inverse Problems for Mechanics of Solids and Structures

| Cases | Objective X | Forward Operator S Property | Output Vector Y |
| --- | --- | --- | --- |
| Case I | Force identification | Material property, geometry of the structure | Displacements |
| Case II | Material property identification | Displacements | Forces |
| Case III | Geometrical parameters identification | Material property, geometry of the structure | Forces |
| Case IV | Boundary conditions identification | Material property, geometry of the structure | Forces and displacements |

Note: All cases of problems can be under-posed, even-posed, or over-posed. An under-posed problem is always ill-posed; even- and over-posed problems may or may not be ill-posed. Forward model: $Y = SX$, and inverse solution: $X = S^{-g} Y$.

To present methods to all these types of problems systematically, assume that the forward model of the problem can be expressed in the form of

\[
Y^{p}_{M\times 1} = S_{M\times N}\, X_{N\times 1}
\tag{2.71}
\]
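where the superscript p stands for the predicted values of effects using a forward solver, and the output vector is formed using M predicted values:

\[
Y^{p} = \begin{Bmatrix} y_1^{p} \\ y_2^{p} \\ \vdots \\ y_M^{p} \end{Bmatrix}
\tag{2.72}
\]

Vector X contains N inputs given in the form of

\[
X = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{Bmatrix}
\tag{2.73}
\]

and S is the system matrix or transformation matrix with the form of

\[
S = \begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1N} \\
s_{21} & s_{22} & \cdots & s_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
s_{M1} & s_{M2} & \cdots & s_{MN}
\end{bmatrix}
\tag{2.74}
\]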
The next section presents a set of systematic ways to solve this class of inverse problems that have a forward prediction model in the form of Equation 2.71.
2.6
Inverse Solution for Systems with Matrix Form
2.6.1
General Inversion of System Matrix
Consider, in general, a system whose forward problem can be formulated in the form of Equation 2.71. The solution of the corresponding inverse problem for the estimation of the input X can be obtained using

\[
X^{e}_{N\times 1} = S^{-g}_{N\times M}\, Y^{m}_{M\times 1}
\tag{2.75}
\]

where the superscripts e and m stand for the estimated and measured values, respectively. The output vector Y is now known as the effects of the system obtained via usual measurements, records, or observations, and $S^{-g}$ is a general inverse of matrix S created using the corresponding forward problem model of Equation 2.71. The procedure of calculating the general inverse matrix depends on the numbers of the measured knowns and unknowns to be estimated in the inverse problems. As illustrated in Section 2.2 these problems are classified into three categories:
• Under-posed problems: when M < N (unknowns are more than knowns)
• Even-posed problems: when M = N (unknowns are equal to knowns)
• Over-posed problems: when M > N (unknowns are less than knowns)
The next section will discuss the methods to obtain the solution for these inverse problems. In the meantime, assume that $S^{-g}$ can be somehow obtained, and therefore X can be estimated using Equation 2.75. It is necessary to examine the quality of the prediction after the inversion. Using the estimated X, output Y can be predicted using the forward model of Equation 2.71, which gives

\[
Y^{p}_{M\times 1} = S_{M\times N}\, X^{e}_{N\times 1} = \left[ S_{M\times N}\, S^{-g}_{N\times M} \right] Y^{m}_{M\times 1}
\tag{2.76}
\]
This equation clearly shows that, if

\[
\left[ S_{M\times N}\, S^{-g}_{N\times M} \right] = I
\tag{2.77}
\]

where I is an identity matrix, the inverse procedure can reproduce the measurement data. If an $S^{-g}$ obtained does not satisfy Equation 2.77, the measurement data will not be reproduced. Therefore the matrix

\[
R_{o(M\times M)} = \left[ S_{M\times N}\, S^{-g}_{N\times M} \right]
\tag{2.78}
\]
is defined as an output reproducibility matrix. Computing $R_o$ can give an indication of the quality of an inverse procedure in terms of reproducing the measurement or output data. When $R_o = I$, the inversion is output reproducible. Next, it is argued that the true X should satisfy the forward model for the measured Y that is obtained for the real event:

\[
Y^{p}_{M\times 1} = S_{M\times N}\, X^{t}_{N\times 1}
\tag{2.79}
\]

where the superscript t stands for the true values. Substituting this equation into Equation 2.75 gives

\[
X^{e}_{N\times 1} = \left[ S^{-g}_{N\times M}\, S_{M\times N} \right] X^{t}_{N\times 1}
\tag{2.80}
\]

The preceding equation clearly shows that, if

\[
\left[ S^{-g}_{N\times M}\, S_{M\times N} \right] = I
\tag{2.81}
\]

the inverse procedure can provide the exact estimate of the inputs. If an $S^{-g}$ somehow obtained does not satisfy Equation 2.81, the estimation will not be the true input. Therefore the matrix

\[
R_{I(N\times N)} = \left[ S^{-g}_{N\times M}\, S_{M\times N} \right]
\tag{2.82}
\]
is defined as an input reproducibility matrix. Computing RI can give an indication on the quality of an inverse procedure in terms of estimating the input of the system. When RI = I, the inversion is input reproducible.
2.6.2
Under-Posed Problems: Minimum Length Solution
When the number of measured knowns, M, is less than the number of unknowns to be estimated for the system, N, the problem is under-posed. The example given in Section 2.2.3 is such a case. The under-posed problem will have an infinite number of solutions that satisfy exactly the equation of the corresponding forward model without any error. Therefore, it is necessary to choose a physically meaningful solution. The most reliable means of solving this kind of under-posed system is to add more information to make the system even- or over-posed. However, this cannot always be done. A common mathematical choice is the so-called minimum length (ML) solution. The process that leads to the ML solution is as follows. The function Π is first defined as

\[
\Pi = X^{T} X + \lambda^{T}\left\{ Y^{m} - SX \right\}
\tag{2.83}
\]

where the first term on the right-hand side is, in fact, the squared Pythagorean length of the vector X:

\[
X^{T} X = x_1^2 + x_2^2 + \cdots + x_N^2
\tag{2.84}
\]

and the second term on the right-hand side is the constraint that forces the unknown vector X to satisfy the equation system of the forward problem, Equation 2.71. λ is a vector of the so-called Lagrange multipliers. Therefore, the function Π defined in Equation 2.83 is, in fact, a constrained Pythagorean length of the unknown vector X. To seek the minimum length,

\[
\frac{\partial \Pi}{\partial X} = 0
\tag{2.85}
\]

and

\[
\frac{\partial \Pi}{\partial \lambda} = 0
\tag{2.86}
\]

are required, leading to

\[
\frac{\partial \Pi}{\partial X} = 2X - S^{T}\lambda = 0
\tag{2.87}
\]

and

\[
\frac{\partial \Pi}{\partial \lambda} = Y^{m} - SX = 0
\tag{2.88}
\]
Equation 2.87 gives

\[
X = \frac{1}{2} S^{T}\lambda
\tag{2.89}
\]

Substituting the preceding equation into Equation 2.88 yields

\[
2Y^{m} = SS^{T}\lambda
\tag{2.90}
\]

Note that matrix $SS^{T}$ is surely symmetric as

\[
\left(SS^{T}\right)^{T} = \left(S^{T}\right)^{T}\left(S\right)^{T} = SS^{T}
\tag{2.91}
\]

Assuming $SS^{T}$ is invertible, the vector of the Lagrange multipliers λ can be found as

\[
\lambda = 2\left[SS^{T}\right]^{-1} Y^{m}
\tag{2.92}
\]

Substituting Equation 2.92 into Equation 2.89 yields

\[
X^{e} = S^{T}\left[SS^{T}\right]^{-1} Y^{m}
\tag{2.93}
\]
Comparison with Equation 2.75 gives the definition of the generalized inverse matrix for the under-posed inverse problem as

\[
S^{-g} = S^{T}\left[SS^{T}\right]^{-1}
\tag{2.94}
\]

The output reproducibility matrix is

\[
R_o = SS^{-g} = SS^{T}\left[SS^{T}\right]^{-1} = I
\tag{2.95}
\]

Therefore the ML solution is output reproducible. On the other hand, the input reproducibility matrix is

\[
R_I = S^{-g}S = S^{T}\left[SS^{T}\right]^{-1} S
\tag{2.96}
\]
which, in general, may not be an identity matrix. Therefore, the ML solution is not input reproducible and will not, in general, give the true estimation of the input of the system. Note that $SS^{T}$ may not be invertible, and the ML solution may not exist. The solution to this situation is to
• Modify the experiment strategy to improve the sensitivity in the equation system.
• Perform more tests to provide more equations to at least even-pose the problem.
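The reproducibility properties of the ML solution can be verified on a small system. The following sketch uses plain Python lists, an assumed 2 × 3 system matrix, and hypothetical helper names; the 2 × 2 inverse is written out explicitly, so the helper is limited to M = 2.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inv2(m):
    """Inverse of a 2 x 2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("S S^T is singular: the ML solution does not exist")
    return [[d / det, -b / det], [-c / det, a / det]]

def ml_inverse(s):
    """Equation 2.94: S^-g = S^T [S S^T]^-1 for a 2 x N matrix S."""
    st = transpose(s)
    return matmul(st, inv2(matmul(s, st)))

S = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]            # assumed system: M = 2 knowns, N = 3 unknowns
Sg = ml_inverse(S)
R_o = matmul(S, Sg)              # Equation 2.95, expected to be the identity
R_I = matmul(Sg, S)              # Equation 2.96, not the identity in general

X_true = [[1.0], [2.0], [0.0]]   # assumed true input
Y_m = matmul(S, X_true)          # noise-free "measurement"
X_e = matmul(Sg, Y_m)            # ML estimate, Equation 2.93
print(R_o, R_I, X_e)
```

For this example the estimate reproduces the measurement exactly but differs from the assumed true input; it is the shortest vector among all inputs that fit the data.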
2.6.3
Even-Posed Problems: Standard Inversion of Matrix
When the number of knowns of measured outputs, M, is the same as N, the unknowns to be estimated for the system, the problem is even-posed and the system matrix $S_{N\times N}$ is a square matrix with N rows and columns. The example given in Subsection 2.2.2 is such a case. If the system matrix S is invertible (with a full rank of N), the even-posed problem should have a unique solution and the generalized inverse matrix is the standard inverse of a square matrix.

\[
S^{-g} = S^{-1}
\tag{2.97}
\]
In this case, it is very easy to confirm that the input reproducibility matrix and the output reproducibility matrix are both the identity matrix, implying that the solutions are output and input reproducible. Note that the even-posed formulation does not guarantee the full rank of S, and thus the existence of the inversion of the system matrix can be questionable. In fact, in many cases S can be singular, meaning the rank of S is smaller than the dimension of S, i.e., Rank(S) < N. This is due to the diverse and complex nature of the inverse problems. The rows in the system matrix can often be linearly dependent, leading to the so-called rank deficiency of the system matrix, as seen in Section 2.2.4 when $\hat{u}_2 = \hat{u}_3$. In engineering practice, it is often difficult to ensure a full-rank system matrix. In such cases, some kind of regularization method, such as the singular value decomposition method (see Section 2.7), should be used to obtain a solution mathematically. Note that these kinds of mathematical regularization treatments do not necessarily ensure that the solution will be physically meaningful. The most reliable method, however, is to modify the experiment strategy or increase the number of measurements M to increase the rows of the system matrix so as to make the system over-posed. It is hoped that this will lead to Rank(S) = N. The solution methods for over-posed systems are detailed next.
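For a 2 × 2 system such as Equation 2.42, rank deficiency can be detected from the determinant. This is a minimal sketch; the relative tolerance and the function names are assumptions chosen for illustration, not a recommended general-purpose rank test.

```python
def system_matrix(u2, u3):
    """System matrix S of Equation 2.42 for measured displacements u2, u3."""
    return [[u2, u2 - u3], [0.0, u3 - u2]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_rank_deficient(m, tol=1e-12):
    """True when the determinant vanishes relative to the matrix scale."""
    scale = max(abs(v) for row in m for v in row)
    return scale == 0.0 or abs(det2(m)) <= tol * scale ** 2

print(is_rank_deficient(system_matrix(2.0, 2.0)))   # u2 == u3
print(is_rank_deficient(system_matrix(0.0, 1.0)))   # u2 == 0
print(is_rank_deficient(system_matrix(2.0, 2.5)))   # regular case
```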
2.6.4
Over-Posed Problems: Least-Squares Solution
When the number of measured knowns, M, is larger than the number of unknowns to be estimated for the system, N, the problem is over-posed. The example given in Section 2.2.5 is such a case. The over-posed problem could have a number of solutions that satisfy some of the equations of the corresponding forward model. Therefore, it is necessary to find a solution that is physically meaningful and satisfies all the equations of the forward model in a properly compromised manner. A common method is the so-called least-squares (LS) method. The process of deriving the LS solution is as follows. First define the functional Π as
$$\Pi = (Y^m - SX)^T (Y^m - SX) \tag{2.98}$$
which is, in fact, the L2 norm of the error between the prediction of the forward model and the measurements. To seek the minimum error,
$$\frac{\partial \Pi}{\partial X} = 0 \tag{2.99}$$
is required, which leads to
$$\frac{\partial \Pi}{\partial X} = -2 S^T (Y^m - SX) = 0 \tag{2.100}$$
or
$$S^T Y^m = S^T S X \tag{2.101}$$
Note that matrix $S^T S$ is symmetric because
$$(S^T S)^T = S^T (S^T)^T = S^T S \tag{2.102}$$
Assuming $S^T S$ is invertible, the estimated X can be found as
$$X^e = [S^T S]^{-1} S^T Y^m \tag{2.103}$$
The definition of the generalized inverse matrix for the over-posed inverse problem is
$$S^{-g} = [S^T S]^{-1} S^T \tag{2.104}$$
The output reproducibility matrix is
$$R_O = S S^{-g} = S [S^T S]^{-1} S^T \tag{2.105}$$
which, in general, may not be an identity matrix. Therefore, the LS solution will not, in general, be output reproducible. On the other hand, the input reproducibility matrix is
$$R_I = S^{-g} S = [S^T S]^{-1} S^T S = I \tag{2.106}$$
which implies that the LS solution is input reproducible and gives the true estimation of the input of the system. Note that, for some engineering systems, even if M > N, $S^T S$ may still not have a full rank, and the estimation based on Equation 2.103 can fail. In such cases, some kind of regularization method, such as the singular value decomposition method (Section 2.7) or the Tikhonov regularization (Chapter 3), should be used. However, the most reliable way to ensure a physically meaningful solution is to modify the experiment strategy to improve the sensitivity, or to increase the number of measurements M and hence the rows of the system matrix so as to make the system even more over-posed. It is hoped that this will lead to a full rank for matrix $S^T S$.
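As a minimal numerical sketch of Equations 2.103 through 2.106, assuming NumPy and a made-up 4 × 2 system (the matrix `S`, the input `X_true`, and the measurement errors are illustrative values only):

```python
import numpy as np

# Hypothetical over-posed system: M = 4 measurements, N = 2 unknowns
S = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
X_true = np.array([0.5, 2.0])
Y_m = S @ X_true + np.array([0.01, -0.02, 0.015, -0.005])  # made-up errors

# Generalized inverse of Equation 2.104: S^{-g} = [S^T S]^{-1} S^T
# (computed with a linear solve rather than an explicit matrix inverse)
S_g = np.linalg.solve(S.T @ S, S.T)
X_e = S_g @ Y_m                      # Equation 2.103

R_O = S @ S_g                        # output reproducibility, Equation 2.105
R_I = S_g @ S                        # input reproducibility, Equation 2.106

# Cross-check against NumPy's own least-squares routine
X_lstsq, *_ = np.linalg.lstsq(S, Y_m, rcond=None)
```

Here `R_I` comes out as the 2 × 2 identity, while `R_O` is a 4 × 4 projection matrix that differs from the identity, in line with the statements above.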
2.7
General Inversion by Singular Value Decomposition (SVD)
2.7.1
Property of Transformation and Type II Ill-Posedness
Consider a system with matrix S, written in the following matrix form:
$$Y_{M \times 1} = S_{M \times N} X_{N \times 1} \tag{2.107}$$
Mathematically, the system matrix can be viewed as an operator that transforms the input vector X into the output vector Y. Considering a system with M ≥ N, if the rank of S is less than N, i.e., Rank(S) < N, there is a set of nonzero vectors X that are normal to the rows of S, which leads to
$$S_{M \times N} X_{N \times 1} = Y_{M \times 1} = 0_{M \times 1} \tag{2.108}$$
This set of X that leads to Y = 0 forms a subspace of the space of X. This subspace is called the null space or kernel of S, as shown in Figure 2.8. In the context of inverse problems, the null space is the subset of the space of the unknowns (inputs) X that is mapped onto Y = 0. In other words, this subset of inputs contributes nothing to the output, or the outputs are not sensitive at all to this set of inputs. The measurements conducted for Y cannot be used for identifying any X in the null space. The use of such a measurement with errors in the inverse estimation process can lead to solutions with magnified errors or even unstable solutions, which are symptoms of Type II ill-posedness of inverse problems. A typical example is given in Section 2.2.4 for the case of $\hat{u}_2 = \hat{u}_3$.
FIGURE 2.8 Null space in X and the image of X in Y for the forward transformation SX = Y. The dimension of the image in Y is r = Rank(S), and the dimension of the null space in X is Dim Ker(S) = N − Rank(S). The same property can be observed in the inverse transformation: null space in Y, and the image of Y in X for the inverse transformation $X = S^{-g} Y$. The dimension of the image in X is $r = \mathrm{Rank}(S^{-g})$, and the dimension of the null space in Y is $\mathrm{Dim\,Ker}(S^{-g}) = M - \mathrm{Rank}(S^{-g})$.
The subspace of Y reachable by the transformation is the image of S in the space of Y. The rank of S indicates the maximum number of independent columns in S, which determines the image of the transformation. Therefore, the dimension of the image is the rank of S:
$$\mathrm{Dim}(\mathrm{Im}(S)) = \mathrm{Rank}(S) \tag{2.109}$$
In the context of inverse problems, the image of the transformation is the subset of the measurements (outputs) of Y that can be reached by the inputs X through the transformation of S. Once the image is found, the dimension of the null space can be given by
$$\mathrm{Dim}(\text{null space}) = N - \mathrm{Dim}(\mathrm{Im}(S)) = N - \mathrm{Rank}(S) \tag{2.110}$$
The same analysis can be done for the inverse transformation:
$$X = S^{-g} Y \tag{2.111}$$
The details are illustrated in Figure 2.8. This similarity shows that the same mathematical problem, Type II ill-posedness, can appear in both the forward and the inverse problems. The singular value decomposition (SVD) procedure can be used to deal with these problems.
2.7.2
SVD Procedure
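If the rank or the null space of the transform matrices can be identified, then it can somehow be regularized. The SVD is an efficient tool that can be used to determine the null space of S. In the SVD, any real matrix $S_{M \times N}$ with M ≥ N and Rank(S) ≤ N can be factorized into three component matrices (see, for example, Golub and Van Loan, 1996):
$$S_{M \times N} = U_{M \times M} \Lambda_{M \times N} V_{N \times N}^T \tag{2.112}$$
where
$$\Lambda_{M \times N} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_r & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \Lambda_{r \times r} & 0_{r \times (N-r)} \\ 0_{(M-r) \times r} & 0_{(M-r) \times (N-r)} \end{bmatrix} \tag{2.113}$$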
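in which $\lambda_i$ (i = 1, 2, …, r = Rank(S)) are the singular values of S, i.e., the positive square roots of the nonzero eigenvalues of matrix $SS^T$ or $S^T S$. In Equation 2.112, U is an orthogonal matrix
$$U_{M \times M} = [\Phi_1 \;\; \Phi_2 \;\; \cdots \;\; \Phi_r \;\; \cdots \;\; \Phi_M] \tag{2.114}$$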
where $\Phi_i$ (i = 1, 2, …, M) are the eigenvectors of matrix $SS^T$, of which $\Phi_i$ (i = 1, 2, …, r = Rank(S)) correspond to $\lambda_i$ (i = 1, 2, …, r). These orthogonal vectors $\Phi_i$ (i = 1, 2, …, M) span the space of the measurement (output) Y. The image of S is spanned by the orthogonal vectors $\Phi_i$ (i = 1, 2, …, r). This is illustrated graphically in Figure 2.9. In Equation 2.112, V is also an orthogonal matrix:
$$V_{N \times N} = [\Psi_1 \;\; \Psi_2 \;\; \cdots \;\; \Psi_r \;\; \cdots \;\; \Psi_N] \tag{2.115}$$
where $\Psi_i$ (i = 1, 2, …, N) are the eigenvectors of matrix $S^T S$, of which $\Psi_i$ (i = 1, 2, …, r = Rank(S)) correspond to $\lambda_i$ (i = 1, 2, …, r = Rank(S)). These orthogonal vectors $\Psi_i$ (i = 1, 2, …, N) span the space of the input X. The null space of S is spanned by the vectors $\Psi_i$ (i = r + 1, r + 2, …, N). This is also illustrated graphically in Figure 2.9. It is seen clearly that the SVD provides a very useful tool to reveal these important properties of the transformation matrix. Note that the SVD of Equation 2.112 can be performed numerically via standard routines (see Press et al., 1989). It is, however, computationally very expensive compared to solving a system of linear equations. Note also that if S is a matrix of real numbers, the three component matrices will also be of real numbers. The preceding SVD process can also be extended to complex matrices. In such cases, the Hermitian (conjugate) transpose should be used instead of the transpose.
FIGURE 2.9 SVD-decomposed transformation matrix and the related spaces.
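The way the SVD exposes the image and the null space can be sketched numerically. The example below assumes NumPy's `numpy.linalg.svd` as the "standard routine" and uses a made-up 4 × 3 matrix whose third column is the sum of the first two; the tolerance rule for the numerical rank is a common convention, not something prescribed by the text.

```python
import numpy as np

# Hypothetical 4x3 system matrix: column 3 = column 1 + column 2,
# so Rank(S) = 2 and the null space in X has dimension 3 - 2 = 1
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])

U, lam, Vt = np.linalg.svd(S)           # full SVD: U is 4x4, Vt is 3x3
tol = max(S.shape) * np.finfo(float).eps * lam[0]
r = int(np.sum(lam > tol))              # numerical rank

image_basis = U[:, :r]                  # Phi_1..Phi_r span Im(S) in the space of Y
null_basis = Vt[r:].T                   # Psi_{r+1}..Psi_N span Ker(S) in the space of X

# The squared lam_i are the eigenvalues of S^T S (ascending order from eigvalsh)
eig_StS = np.linalg.eigvalsh(S.T @ S)
```

Any multiple of the single column of `null_basis` can be added to X without changing Y, which is exactly the insensitivity described for Type II ill-posedness.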
2.7.3
Ill Conditioning
For many ill-posed problems, the operator matrix S is numerically ill-conditioned or rank deficient, which can be measured using the so-called condition number. For a square and symmetric matrix of n × n, the condition number is defined as (see Press et al., 1989):
$$\kappa = \frac{\max(\lambda_1, \lambda_2, \ldots, \lambda_n)}{\min(\lambda_1, \lambda_2, \ldots, \lambda_n)} \tag{2.116}$$
If the matrix is singular, the smallest eigenvalue will be zero and the condition number will become infinite. In practical numerical analysis, the smallest computed eigenvalue for a singular matrix will usually be a very small number, which leads to a large condition number. Therefore, the larger the condition number is, the worse the conditioning of the matrix. A very large condition number in a system indicates that the outputs of the system are not sensitive to at least one of the inputs. Because the SVD provides all the $\lambda_i$ of the matrix, the condition number can be obtained very easily once the SVD is performed. The sensitivity of the system can also be examined.
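A small sketch of the condition number check, assuming NumPy and computing κ from singular values (for a symmetric matrix these are the absolute values of the eigenvalues). The two 2 × 2 matrices and the helper `condition_number` are hypothetical.

```python
import numpy as np

def condition_number(S):
    """Ratio of largest to smallest singular value (infinite if S is singular)."""
    lam = np.linalg.svd(S, compute_uv=False)   # sorted in descending order
    return np.inf if lam[-1] == 0.0 else lam[0] / lam[-1]

S_well = np.array([[2.0, 0.0],
                   [0.0, 1.0]])
S_ill = np.array([[1.0, 1.0],
                  [1.0, 1.0 + 1e-10]])         # nearly dependent rows

kappa_well = condition_number(S_well)           # modest value
kappa_ill = condition_number(S_ill)             # very large value
```

NumPy's built-in `np.linalg.cond` returns the same ratio by default and could be used instead of the helper.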
2.7.4
SVD Inverse Solution
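Using Equation 2.112 as well as the largest r values of $\lambda_i$ and the corresponding eigenvectors, the general inverse of the operator matrix S can be defined as
$$S_{N \times M}^{-g} = V_{N \times r} \Lambda_{r \times r}^{-1} U_{r \times M}^T \tag{2.117}$$
recalling that
$$S_{M \times N} = U_{M \times M} \Lambda_{M \times N} V_{N \times N}^T \tag{2.118}$$
where r ≤ min(M, N), and $V_{N \times r}$ and $U_{r \times M}^T$ collect the first r columns of V and the first r rows of $U^T$, respectively.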
Once the general inverse matrix is obtained, the input X can be obtained explicitly using (Golub and Van Loan, 1996)
$$X = S^{-g} Y = \sum_{i=1}^{r} \frac{\Phi_i^T Y}{\lambda_i} \Psi_i \tag{2.119}$$
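It is clearly seen that the inclusion of small values of $\lambda_i$ will magnify the measurement (output) error in Y. Therefore, choosing a proper cut-off r is very important, and engineering experience and judgment and/or trial and error are needed. Choosing a cut-off r is, in practice, exercising a regularization via SVD. Now check the output reproducibility matrix of the SVD inversion:
$$\begin{aligned}
R_O = S S^{-g} &= U_{M \times M} \Lambda_{M \times N} V_{N \times N}^T V_{N \times r} \Lambda_{r \times r}^{-1} U_{r \times M}^T \\
&= U_{M \times M} \Lambda_{M \times N} \begin{bmatrix} \Psi_1^T \\ \Psi_2^T \\ \vdots \\ \Psi_r^T \\ \vdots \\ \Psi_N^T \end{bmatrix} [\Psi_1 \;\; \Psi_2 \;\; \cdots \;\; \Psi_r] \, \Lambda_{r \times r}^{-1} U_{r \times M}^T \\
&= U_{M \times M} \Lambda_{M \times N} \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{bmatrix} \Lambda_{r \times r}^{-1} U_{r \times M}^T
= U_{M \times M} \Lambda_{M \times N} \begin{bmatrix} I_{r \times r} \\ 0_{(N-r) \times r} \end{bmatrix} \Lambda_{r \times r}^{-1} U_{r \times M}^T \\
&= U_{M \times M} \begin{bmatrix} \Lambda_{r \times r} & 0_{r \times (N-r)} \\ 0_{(M-r) \times r} & 0_{(M-r) \times (N-r)} \end{bmatrix} \begin{bmatrix} I_{r \times r} \\ 0_{(N-r) \times r} \end{bmatrix} \Lambda_{r \times r}^{-1} U_{r \times M}^T \\
&= U_{M \times M} \begin{bmatrix} \Lambda_{r \times r} \\ 0_{(M-r) \times r} \end{bmatrix} \Lambda_{r \times r}^{-1} U_{r \times M}^T
= U_{M \times M} \begin{bmatrix} I_{r \times r} \\ 0_{(M-r) \times r} \end{bmatrix} U_{r \times M}^T \\
&= [\Phi_1 \;\; \Phi_2 \;\; \cdots \;\; \Phi_M] \begin{bmatrix} I_{r \times r} \\ 0_{(M-r) \times r} \end{bmatrix} \begin{bmatrix} \Phi_1^T \\ \Phi_2^T \\ \vdots \\ \Phi_r^T \end{bmatrix} \\
&= [\Phi_1 \;\; \Phi_2 \;\; \cdots \;\; \Phi_r] \begin{bmatrix} \Phi_1^T \\ \Phi_2^T \\ \vdots \\ \Phi_r^T \end{bmatrix}
\;\Rightarrow\; \begin{cases} \neq I_{M \times M} & \text{when } r < M \\ = I_{M \times M} & \text{when } r = M \end{cases}
\end{aligned} \tag{2.120}$$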
Therefore, the SVD solution will not, in general, be output reproducible unless r = M. On the other hand, the input reproducibility matrix is
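$$\begin{aligned}
R_I = S^{-g} S &= V_{N \times r} \Lambda_{r \times r}^{-1} U_{r \times M}^T U_{M \times M} \Lambda_{M \times N} V_{N \times N}^T \\
&= V_{N \times r} \Lambda_{r \times r}^{-1} \begin{bmatrix} \Phi_1^T \\ \Phi_2^T \\ \vdots \\ \Phi_r^T \end{bmatrix} [\Phi_1 \;\; \Phi_2 \;\; \cdots \;\; \Phi_r \;\; \cdots \;\; \Phi_M] \, \Lambda_{M \times N} V_{N \times N}^T \\
&= V_{N \times r} \Lambda_{r \times r}^{-1} \left[ I_{r \times r} \;\; 0_{r \times (M-r)} \right] \Lambda_{M \times N} V_{N \times N}^T \\
&= V_{N \times r} \Lambda_{r \times r}^{-1} \left[ I_{r \times r} \;\; 0_{r \times (M-r)} \right] \begin{bmatrix} \Lambda_{r \times r} & 0_{r \times (N-r)} \\ 0_{(M-r) \times r} & 0_{(M-r) \times (N-r)} \end{bmatrix} V_{N \times N}^T \\
&= V_{N \times r} \Lambda_{r \times r}^{-1} \left[ \Lambda_{r \times r} \;\; 0_{r \times (N-r)} \right] V_{N \times N}^T
= V_{N \times r} \left[ I_{r \times r} \;\; 0_{r \times (N-r)} \right] V_{N \times N}^T \\
&= [\Psi_1 \;\; \Psi_2 \;\; \cdots \;\; \Psi_r] \left[ I_{r \times r} \;\; 0_{r \times (N-r)} \right] \begin{bmatrix} \Psi_1^T \\ \Psi_2^T \\ \vdots \\ \Psi_r^T \\ \vdots \\ \Psi_N^T \end{bmatrix} \\
&= [\Psi_1 \;\; \Psi_2 \;\; \cdots \;\; \Psi_r] \begin{bmatrix} \Psi_1^T \\ \Psi_2^T \\ \vdots \\ \Psi_r^T \end{bmatrix}
\;\Rightarrow\; \begin{cases} \neq I_{N \times N} & \text{when } r < N \\ = I_{N \times N} & \text{when } r = N \end{cases}
\end{aligned} \tag{2.121}$$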
This implies that the SVD solution is not input reproducible and does not give the true estimation of the input of the system unless r = N.
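The effect of the cut-off r can be illustrated with a small sketch, assuming NumPy. The 3 × 3 system below is constructed artificially with singular values 1, 0.5, and 1e-8, and the measurement error is deliberately placed along the third left singular vector; all of these values, and the helper `svd_inverse`, are hypothetical choices made only to show the magnification.

```python
import numpy as np

def svd_inverse(S, r):
    """Generalized inverse V_r diag(1/lam_1..1/lam_r) U_r^T keeping r singular values."""
    U, lam, Vt = np.linalg.svd(S, full_matrices=False)
    return Vt[:r].T @ np.diag(1.0 / lam[:r]) @ U[:, :r].T

# Build orthogonal factors from arbitrary nonsingular matrices via QR
U0, _ = np.linalg.qr(np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]]))
V0, _ = np.linalg.qr(np.array([[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]))
S = U0 @ np.diag([1.0, 0.5, 1e-8]) @ V0.T     # one very small singular value

X_true = V0 @ np.array([1.0, 1.0, 0.0])       # lies in the span of Psi_1, Psi_2
noise = 1e-4 * U0[:, 2]                       # made-up measurement error
Y_m = S @ X_true + noise

X_full = svd_inverse(S, 3) @ Y_m              # keeps 1/1e-8: the error is magnified
X_cut = svd_inverse(S, 2) @ Y_m               # cut-off r = 2 discards that direction

R_I = svd_inverse(S, 2) @ S                   # input reproducibility for r = 2 < N
```

With r = 3 the small error of size 1e-4 is amplified by a factor of about 1e8, whereas with r = 2 the estimate stays close to `X_true`; the price is that `R_I` becomes a projection rather than the identity, as in Equation 2.121.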
2.8
Systems in Functional Forms: Solution by Optimization
For the majority of engineering inverse problems, however, the forward model cannot be expressed in the form of Equation 2.71 due to the diverse nature of the inverse problems, which are demonstrated in Section 2.2. The elegant form of solution shown in Equation 2.75 will not often be possible. The reasons could be that:
• The problem is too complex to express the relationships of inputs, outputs, and the system in an explicit matrix form.
• The number of unknowns (inputs) is so large that the matrix form of solution may not be the most effective way to obtain the solution.
These types of inverse problems are therefore often formulated using functional forms. In general, it can always be assumed that the forward model of the problem can be expressed in the form of
$$Y^p = S(P_1, P_2, \ldots, P_k, X) \tag{2.122}$$
where $Y^p$ is a vector that collects all the predicted outputs or effects, S is the system matrix of functions of vectors of all kinds of parameters, $P_1, P_2, \ldots, P_k$, and X is the vector that collects the inputs. Note that X need not be expressed explicitly in the forward model Equation 2.122. Considering now that the outputs or effects of the system, $Y^m$, can be obtained by means of measurement, the purpose is to determine the input of the system. To this end, construct a function in the form of the so-called L2 norm defined as
$$\Pi(X) = (Y^p - Y^m)^T (Y^p - Y^m) = \sum_{i=1}^{ns} \left( y_i^p(X) - y_i^m(X^t) \right)^2 \tag{2.123}$$
which accounts for the sum of the squares of the errors of the predicted outputs, based on the forward model Equation 2.122 and an assumed X, with respect to the measured outputs for the true system with input $X^t$; ns is the number of sampling points of the experiment or measurement. It is clear that if
$$X = X^t \tag{2.124}$$
then (assuming the prediction is exact),
$$\Pi(X) = 0 \tag{2.125}$$
For all other X, we have
$$\Pi(X) \geq 0 \tag{2.126}$$
and an X can possibly be found that leads to
$$\Pi(X) \rightarrow \text{minimum} \tag{2.127}$$
It can then be argued that if the minimum of the functional Π can be found, at least one approximation of $X^t$ can be obtained. The inverse problem therefore becomes an optimization problem that seeks the X minimizing the error functional Π. Chapter 4, Chapter 5, and Chapter 6 will discuss a number of often used optimization methods for solving inverse problems, including direct search methods, gradient-based methods, iterative methods, genetic algorithms, and neural networks. Chapter 7 through Chapter 13 will discuss a number of practical inverse problems that are solved by minimizing such error functions. Other often used forms of the error functional are
• The L1 norm, defined as
$$\Pi(X) = \sum_{i=1}^{ns} \left| y_i^p(X) - y_i^m(X^t) \right| \tag{2.128}$$
which accounts for the sum of the absolute errors of the predicted outputs, based on the forward model defined by Equation 2.122 and an assumed X, with respect to the measured outputs for the true system.
• The L∞ norm, defined as
$$\Pi(X) = \max\left( \left| y_i^p(X) - y_i^m(X^t) \right|,\; i = 1, 2, \ldots, ns \right) \tag{2.129}$$
which takes the maximum absolute error among all the errors of the predicted outputs, based on the forward model and an assumed X, with respect to the measured outputs from the true system. Error functions constructed with these three norm forms, as well as the effects of the norm forms, will be discussed in detail in Section 11.6 using practical inverse problems of crack detection.
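The three error functionals, and the idea of recovering X by minimizing one of them, can be sketched as follows. This assumes NumPy; the forward model `forward` (an exponential decay with one unknown rate), the true value `X_t`, and the brute-force scan over candidates are hypothetical stand-ins for a real forward solver and the optimization methods of later chapters.

```python
import numpy as np

def error_functionals(y_pred, y_meas):
    """Return the L2 (Eq. 2.123), L1 (Eq. 2.128) and L-infinity (Eq. 2.129) errors."""
    e = np.asarray(y_pred, dtype=float) - np.asarray(y_meas, dtype=float)
    return {"L2": float(e @ e),
            "L1": float(np.sum(np.abs(e))),
            "Linf": float(np.max(np.abs(e)))}

# Hypothetical forward model: a decaying response sampled at ns = 5 points,
# with a single unknown input X (the decay rate)
t = np.linspace(0.0, 1.0, 5)

def forward(X):
    return np.exp(-X * t)

X_t = 1.3                         # "true" input used to simulate the measurement
y_m = forward(X_t)

# Crude direct search over candidate inputs, minimizing the L2 functional
candidates = np.linspace(0.0, 3.0, 301)
Pi = [error_functionals(forward(X), y_m)["L2"] for X in candidates]
X_best = candidates[int(np.argmin(Pi))]
```

For instance, `error_functionals([1, 2, 3], [1, 0, 4])` gives an L2 value of 5, an L1 value of 3, and an L∞ value of 2, which shows how differently the three forms weight a single large error.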
2.9
Choice of the Outputs or Effects
In performing an inverse analysis, one needs to use some kind of output or effect Y of the system to construct the objective function of error. For mechanics problems, the effects can be the displacement, velocity, and acceleration at points of the structure that are excited by harmonic or transient forces. They can also be the eigenvalues or eigenvectors of the structure obtained by modal analysis techniques. The type of effect used should be decided based on the particular problem. Three important considerations in making such a decision are:
• Sensitivity ensures that the effects Y chosen are sufficiently sensitive to the parameters $P_i$ and/or the inputs X to be identified.
• Accuracy ensures that the noise in the effects to be measured and computed can be well controlled so that the effects obtained are accurate.
• Ease of acquisition ensures that the effects can be obtained easily: experimentally at low cost, and computationally with an available, efficient forward solver.
Different types of effects can be combined in different ways as the system outputs to form the objective function. One way is simply to add all the different types of effects together with proper weightings to form a single objective function. Another way is to formulate multiple objective functions that can be minimized in stages or simultaneously using optimization tools to obtain the inverse solution. An example of the multistage minimization of objective functions will be given in Section 13.2. The methods of simultaneous optimization for multiple objective functions will not be discussed, however.
2.10
Simulated Measurement
In the stage of developing or examining an inverse procedure, simulated measurements generated by the forward solver with the actual parameters should be used instead of carrying out actual experiments. For instance, for a crack identification problem using displacement responses, the measured displacement responses are simulated using a forward solver (HNM code, SEM code, FEM code, etc.) with the actual impact loads. Noise-contaminated displacement, obtained by adding some "artificial" noise to the computer-generated displacement, can be used to simulate experimentally recorded data that are contaminated with measurement noise. Two types of artificial noise are employed in the inverse procedures in this book. The first one is Gaussian noise. To generate such noise, a vector of pseudorandom numbers is drawn from a Gaussian distribution with mean a and standard deviation b using the Box-Muller method (Press et al., 1989). In the cases studied in this book, the mean a is set to zero, and the standard deviation b is defined as (D'Cruz et al., 1992):
$$b = p_e \times \left[ \frac{1}{ns} \sum_{i=1}^{ns} \left( u_i^m \right)^2 \right]^{1/2} \tag{2.130}$$
where $u_i^m$ is the displacement response sampled at point i, ns is the total number of sampling points, and $p_e$ is the value that controls the level of the noise contamination. For example, $p_e$ = 0.05 means a 5% noise level.
Another type of artificial noise simulation is to use white noise, $\Gamma(u_j^{noise})$, obtained using the following formulations (Priestley, 1981; Xu and Liu, 2002d):
$$\sum_{j=1}^{ns} \Gamma\left( u_j^{noise} \right) = 0 \tag{2.131}$$
$$\sum_{j=1}^{ns} \Gamma\left( u_j^{noise} \right) \Gamma\left( u_j^{noise} - \tau \right) = 2 \pi D \delta(\tau) \tag{2.132}$$
where
$$D = p_e \left[ \frac{1}{ns - 1} \sum_{j=1}^{ns} \left( u_j^m \right)^2 \right]^{1/2} \tag{2.133}$$
in which $p_e$ is the parameter that controls the level of the noise contamination for the white noise. The simulated measurements with artificial noise are very useful for testing the stability of an inverse procedure. Testing with different levels of simulated noise can provide a good gauge of how robust an inverse procedure is in accommodating noise contamination. If an inverse procedure cannot pass the simulated measurement test, it is not recommended for any actual inverse analysis, unless the measurement data are perfect. On the other hand, if the inverse procedure passes the test using a simulated measurement with a certain level of artificial noise contamination, the procedure should be ready for practical use, provided
• The forward solver is reliable and experimentally validated.
• Experimental data are validated and accurate, containing noise of lower level than the artificial noise used to test the inverse procedure.
Based on this argument, computational inverse procedures can always be developed and tested on computers before they are applied in actual NDE practice.
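A minimal sketch of the Gaussian noise contamination of Equation 2.130 is given below. It assumes NumPy and draws the samples with NumPy's random generator instead of the Box-Muller routine cited in the text; the sinusoidal "displacement" `u_clean` and the helper `add_gaussian_noise` are hypothetical.

```python
import numpy as np

def add_gaussian_noise(u_m, pe, rng):
    """Contaminate u_m with zero-mean Gaussian noise whose standard deviation
    b follows Equation 2.130: b = pe * sqrt(mean(u_m**2))."""
    u_m = np.asarray(u_m, dtype=float)
    b = pe * np.sqrt(np.mean(u_m ** 2))
    return u_m + rng.normal(loc=0.0, scale=b, size=u_m.shape), b

rng = np.random.default_rng(seed=0)             # fixed seed for repeatability
t = np.linspace(0.0, 1.0, 2000)
u_clean = np.sin(2.0 * np.pi * 5.0 * t)         # stand-in for a forward-solver output
u_noisy, b = add_gaussian_noise(u_clean, pe=0.05, rng=rng)   # 5% noise level
```

Repeating the inversion with several values of `pe` (for example 0.01, 0.05, 0.10) is one simple way to gauge the robustness discussed above.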
2.11
Examination of Ill-Posedness
In carrying out an inverse analysis, the following procedures can help to examine the types of ill-posedness of the inverse problem:
• Type I ill-posedness can be easily identified from the numbers of knowns and unknowns.
• Use the simulated noise-free data generated from the forward solver to perform an inversion to make sure the inverse procedure works and produces the true solution within acceptable error. This will confirm that the inverse procedure does not have Type II ill-posedness. Otherwise, the inverse problem is likely to have Type II ill-posedness, and modifying the inversion and experiment strategy or the use of regularization is required.
• Use the simulated noise-contaminated data, generated by adding some random noise to the simulated data, to perform an inversion to examine the stability of the inverse procedure. An inverse solution that is reasonably accurate, with errors of roughly the same level as the noise, can safely confirm that the inverse procedure does not have Type III ill-posedness. Otherwise, the inverse problem is likely to have Type III ill-posedness, and modifying the inversion and experiment strategy or the use of regularization is required.
If the solution is not stable (the preceding test fails), proper regularization techniques may be used for ill-posed inverse problems. The regularization techniques are then very important for obtaining stable inverse solutions (detailed in Chapter 3).
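The noise-free and noise-contaminated tests in the list above can be sketched for a linear forward model Y = SX. The code assumes NumPy; the two small system matrices, the helper names `inversion_test` and `least_squares`, and the 1% noise level are illustrative assumptions, and a real study would replace `S @ X_true` with the actual forward solver.

```python
import numpy as np

def inversion_test(S, X_true, invert, pe=0.0, seed=0):
    """Simulate measurements with Y = S X, optionally add Gaussian noise of
    level pe, invert, and return the relative error of the recovered X."""
    rng = np.random.default_rng(seed)
    Y = S @ X_true
    if pe > 0.0:
        Y = Y + rng.normal(0.0, pe * np.sqrt(np.mean(Y ** 2)), size=Y.shape)
    X_est = invert(S, Y)
    return np.linalg.norm(X_est - X_true) / np.linalg.norm(X_true)

def least_squares(S, Y):
    # rcond discards numerically zero singular values (minimum-norm solution)
    return np.linalg.lstsq(S, Y, rcond=1e-12)[0]

X_true = np.array([1.0, 2.0])
S_ok = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])     # full rank
S_bad = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])    # identical columns

err_ok_clean = inversion_test(S_ok, X_true, least_squares)            # noise-free test
err_bad_clean = inversion_test(S_bad, X_true, least_squares)          # noise-free test
err_ok_noisy = inversion_test(S_ok, X_true, least_squares, pe=0.01)   # noisy test
```

For `S_bad` the noise-free test already fails to return the true input, because the outputs cannot distinguish the two unknowns; for `S_ok` the noise-free test passes and the noisy test gives an error comparable to the noise level.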
2.12
Remarks
• Remark 2.1: For all the inverse problems discussed in this book, the assumption is that a forward model is available, meaning that the forward operation matrix or forward operator/solver can be formed and the output of the system can be obtained accurately for a given set of inputs. The forward solver should be more accurate than the experiments.
• Remark 2.2: The forward model can be provided in two ways: explicit matrix formulation and functional formulation. For explicit matrix form systems, the solution can be sought via a general matrix inversion. For systems of functional formulation, optimization techniques are used to obtain the solution. Quadratic functional formulation can lead to an explicit matrix formulation.
• Remark 2.3: Three considerations for choosing the outputs or effects are: (1) sensitivity, (2) accuracy, and (3) ease of acquisition.
• Remark 2.4: Inverse problems in discrete form are presented with respect to forward problems based on physical and conventional considerations. The forward and inverse problems are interchangeable depending on the viewpoint of the analyst. Forward problems for continuous systems are defined from the mathematics point of view as problems whose field variables are governed by ordinary or partial differential equations. On the other hand, inverse problems for continuous systems may be defined as problems whose knowns (effects) are subjected to differential operations.
• Remark 2.5: Inverse problems are not necessarily ill-posed. Three types of ill-posed inverse problems have been summarized and examined in this chapter. Type I ill-posedness is due to the under-posed formulation; Type II ill-posedness is due to the insensitivity between the inputs and outputs; and Type III ill-posedness is due to the harshening differential operation on noisy data. The first two types of ill-posedness are basically common to forward and inverse problems, and Type III ill-posedness is unique to inverse problems by the definition of this book.
• Remark 2.6: The inversion using noise-free simulated data can provide an indication of Type II ill-posedness. The inversion using noise-contaminated simulated data can provide an indication of Type III ill-posedness.
• Remark 2.7: The diverse nature of inverse problems makes them difficult to deal with. Proper formulation of the forward model of an inverse problem is very important for developing an effective solution procedure. The authors strongly believe that most engineering inverse problems can be well posed.
• Remark 2.8: The regularization methods presented in Chapter 3 stabilize ill-posed inverse problems at a price in accuracy, and are therefore the last resort in solving them.
• Remark 2.9: The authors are confident that, with the advances in computer and computational technology, most engineering inverse problems can be properly formulated and effectively solved; nondestructive evaluation (NDE) techniques can be drastically improved by equipping them with advanced inverse techniques.
3 Regularization for Ill-Posed Problems
Chapter 2 showed a class of inverse problems that can be ill-posed, leading to unstable solutions. Therefore, the issue is how to stabilize the solution and the cost of stabilization in terms of efficiency and accuracy. To stabilize the solution at the lowest possible cost is the task of regularization. This chapter presents a number of often used methods of regularization, including
• Tikhonov regularization (stabilization)
• Regularization by singular value decomposition (null space removal)
• Iterative regularization methods (discrepancy principle)
• Regularization by discretization or projection (operator smoothing)
• Regularization by filtering (noise removal)
The first four methods are standard and described well by Engl et al. (2000); practical techniques for implementation of the first two regularization methods are well presented by Santamarina and Fratta (1998). This chapter is written in reference to these works and offers detailed discussion and examination on the regularization by discretization or projection and regularization by filtering. Filtering is a common practice to remove noise in measurement data in experiments for all purposes. Because all types of ill-posedness are often triggered by the presence of noise, the removal of the noise is naturally the most effective and practical method to stabilize the solution or mitigate the ill-posedness and is effective for all types of ill-posedness. Therefore, this book treats it as a regularization method. Examples will be presented to demonstrate the effectiveness of the regularization by filtering. For Type I and Type II ill-posed problems, the root of the ill-posedness is insufficient or insensitive information used in the inversion (see Table 2.2). Therefore, the regularization should try to make use of additional information to supplement the information. The Tikhonov regularization provides a way to make use of the information to stabilize the solution. For Type III ill-posedness, the instability is triggered by noise or error in the measurement data and a priori information of the noise must be used to restore stability. The property of the noise consists of two important factors:
noise level and the frequency (in time domain) or wavenumber (in spatial domain) of the noise. Any regularization method makes use of the noise level or the frequency (or wavenumber) of the noise or both. The regularization by discretization (projection) makes use of the frequency (or wavenumber) of the noise and iterative regularization methods make use of the noise level. The Tikhonov regularization, which is regarded as the most popular method for ill-posed problems, uses the frequency and the level of the noise. Regularization by filtering requires the use of the frequency of the noise to design the filter properly.
3.1
Tikhonov Regularization
3.1.1
Regularizing the Norm of the Solution
A simple solution to the ill-posed problem is to use the so-called damped least-squares (DLS) method, which is one of the methods of Tikhonov regularization. The process of deriving this Tikhonov regularization method is as follows. First define the following functional Π in the form of
$$\Pi = \{Y^m - SX\}^T \{Y^m - SX\} + \alpha X^T X \tag{3.1}$$
The first term in this function is the L2 norm used in deriving the LS method for over-posed problems (see Section 2.6.4). The second term is the Pythagorean length used for deriving the minimum length method for under-posed problems (see Section 2.6.2). The nonnegative α is the regularization parameter, called the damping factor, that is used to penalize the Pythagorean length. The second term prevents the solution from having too large a length, and thus controls the stability of the solution. Seeking the minimum error with the penalty on the solution length requires
$$\frac{\partial \Pi}{\partial X} = 0 \tag{3.2}$$
which leads to
$$\frac{\partial \Pi}{\partial X} = -2 S^T \{Y^m - SX\} + 2 \alpha X = 0 \tag{3.3}$$
or
$$S^T Y^m = [S^T S + \alpha I] X \tag{3.4}$$
Note that matrix $[S^T S + \alpha I]$ is symmetric and, for α > 0, invertible due to the presence of $\alpha I$. This clearly shows that the effect of the damping factor α is to increase the positive definiteness of the matrix and improve the condition of the system matrix. If a very large damping factor is used in comparison with the diagonal terms of matrix $S^T S$, matrix $[S^T S + \alpha I]$ will be very well conditioned, and the condition number will approach 1. The solution will be very stable at this extreme. The estimated X can then be found using Equation 3.4 as
$$X^e = [S^T S + \alpha I]^{-1} S^T Y^m \tag{3.5}$$
The definition of the generalized inverse matrix for the damped even- or over-posed inverse problem becomes
$$S^{-g} = [S^T S + \alpha I]^{-1} S^T \tag{3.6}$$
and the output reproducibility matrix is
$$R_O = S S^{-g} = S [S^T S + \alpha I]^{-1} S^T \tag{3.7}$$
which, in general, may not be an identity matrix. Therefore, the DLS solution will generally not be output reproducible. On the other hand, the input reproducibility matrix is
$$R_I = S^{-g} S = [S^T S + \alpha I]^{-1} S^T S \tag{3.8}$$
which also will not be the identity matrix, unless α vanishes. This implies that the DLS solution is not input reproducible and does not give the true estimation of the input of the system. The accuracy of the estimation depends on the damping factor used. Note that a large damping factor is preferred when trying to improve the condition of the equation system of the inverse problem. To improve the accuracy of the estimation, however, a small damping factor must be used; therefore, the analyst needs to compromise. The guideline should be to use the smallest damping factor that is just enough to prevent the ill conditioning of the matrix $[S^T S + \alpha I]$. Therefore, the damping factor used is often very small. The so-called L-curve method (Hansen, 1992) has been used for determining α. Note also that the presence of a small α in matrix $[S^T S + \alpha I]$ effectively increases the smallest eigenvalues of the matrix. It therefore reduces the largest eigenvalues of the matrix $[S^T S + \alpha I]^{-1}$ in Equation 3.5 that gives the solution of the estimation. In this way, α effectively acts to attenuate the high frequencies in the solution and has earned the name of damping factor.
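The trade-off between conditioning and accuracy can be seen in a small sketch of Equation 3.5, assuming NumPy. The 3 × 2 system with two nearly identical columns, the measurement errors, and the value α = 1e-4 are made-up illustrative choices, not recommendations.

```python
import numpy as np

def damped_least_squares(S, Y_m, alpha):
    """Equation 3.5: X_e = [S^T S + alpha I]^{-1} S^T Y_m."""
    N = S.shape[1]
    return np.linalg.solve(S.T @ S + alpha * np.eye(N), S.T @ Y_m)

# Hypothetical ill-conditioned system: two nearly identical columns
S = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-6],
              [1.0, 1.0 - 1e-6]])
X_true = np.array([1.0, 1.0])
Y_m = S @ X_true + np.array([1e-3, -1e-3, 5e-4])    # made-up measurement errors

cond_plain = np.linalg.cond(S.T @ S)                       # extremely large
cond_damped = np.linalg.cond(S.T @ S + 1e-4 * np.eye(2))   # far smaller

X_ls = damped_least_squares(S, Y_m, alpha=0.0)      # plain LS, Equation 2.103
X_dls = damped_least_squares(S, Y_m, alpha=1e-4)    # damped solution
```

In this example the undamped estimate is thrown far from `X_true` by errors of order 1e-3, while the damped estimate stays within about 1e-4 of it; a much larger α would stabilize the system further but bias the estimate toward zero.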
3.1.2
Regularization Using Regularization Matrix
Next, the so-called regularized least-squares (RLS) method is formulated. The process to derive the RLS method is as follows. First define the functional Π in the form of
$$\Pi = \{Y^m - SX\}^T \{Y^m - SX\} + \alpha [RX]^T [RX] \tag{3.9}$$
The first term in this function is the L2 norm of the error between the prediction of the forward model and the measurements, which is the same as that used in deriving the LS method for over-posed problems. The second term is the regularization term, where R is the regularization matrix and α is the regularization factor. The regularization matrix contains the a priori information about the unknown X, and the regularization factor controls the degree of the regularization. The formation of the regularization matrix will be detailed in the next section. Seeking the minimization condition of the regularized function, Equation 3.9 leads to
$$\frac{\partial \Pi}{\partial X} = -2 S^T \{Y^m - SX\} + 2 \alpha R^T R X = 0 \tag{3.10}$$
which gives
$$S^T Y^m = [S^T S + \alpha R^T R] X \tag{3.11}$$
It is clear that matrix $[S^T S + \alpha R^T R]$ is symmetric and usually will be invertible due to the presence of $\alpha R^T R$. The addition of the matrix $\alpha R^T R$ increases the positive definiteness of the matrix and hence improves the condition of the system matrix. It can be easily seen that, if the regularization matrix is an identity matrix, the RLS method becomes the DLS method, and if α = 0, the RLS method becomes the LS method. From Equation 3.11, the estimated X can then be found as
$$X^e = [S^T S + \alpha R^T R]^{-1} S^T Y^m \tag{3.12}$$
and the definition of the generalized inverse matrix for the regularized even- or over-posed inverse problem becomes
$$S^{-g} = [S^T S + \alpha R^T R]^{-1} S^T \tag{3.13}$$
The output reproducibility matrix is
$$R_O = S S^{-g} = S [S^T S + \alpha R^T R]^{-1} S^T \tag{3.14}$$
which, in general, may not be an identity matrix. Therefore, the RLS solution will generally not be output reproducible. On the other hand, the input reproducibility matrix is
$$R_I = S^{-g} S = [S^T S + \alpha R^T R]^{-1} S^T S \tag{3.15}$$
which also will not be the identity matrix, unless α vanishes. This implies that the RLS solution is not input reproducible and does not give the true estimation of the input of the system. The accuracy of the estimation depends on the regularization factor used. The guideline for selecting the regularization factor should be to use the smallest regularization factor that is just enough to prevent the ill conditioning of the matrix $[S^T S + \alpha R^T R]$; therefore, the regularization factor used is often very small. The so-called L-curve method (Hansen, 1992) may be used for determining α.
3.1.3
Determination of the Regularization Matrix
Using Equation 3.9 basically means regularizing RX to the degree controlled by α. Therefore, it is necessary to decide which feature of the unknowns should be regularized, for which information about X is needed. Often information on neighboring components of X is used, so finite difference formulations are commonly used to create the matrix R (Press et al., 1989; Santamarina and Fratta, 1998).
• Case 1: if X should be more or less constant, its gradient should be regularized. In this case the ith row $r_i$ of the regularization matrix R should be
$$r_i = [\,0 \;\cdots\; 0 \;\; \underbrace{-1}_{i-1} \;\; \underbrace{1}_{i} \;\; 0 \;\cdots\; 0\,] \tag{3.16}$$
so that
$$r_i X = x_i - x_{i-1} \tag{3.17}$$
This implies that the difference between two neighboring components will be regularized.
• Case 2: if the gradient of X should be more or less constant, the second derivative should be regularized. In this case, the ith row $r_i$ of the regularization matrix R should be
$$r_i = [\,0 \;\cdots\; 0 \;\; \underbrace{1}_{i-1} \;\; \underbrace{-2}_{i} \;\; \underbrace{1}_{i+1} \;\; 0 \;\cdots\; 0\,] \tag{3.18}$$
so that
$$r_i X = x_{i+1} - 2x_i + x_{i-1} \tag{3.19}$$
This implies that the second derivative will be regularized.
• Case 3: if the second derivative of X should be more or less constant, the third derivative should be regularized. In this case, the ith row $r_i$ of the regularization matrix R should be
$$r_i = [\,0 \;\cdots\; 0 \;\; \underbrace{1}_{i-2} \;\; \underbrace{-3}_{i-1} \;\; \underbrace{3}_{i} \;\; \underbrace{-1}_{i+1} \;\; 0 \;\cdots\; 0\,] \tag{3.20}$$
so that
$$r_i X = x_{i-2} - 3x_{i-1} + 3x_i - x_{i+1} \tag{3.21}$$
This implies that the third derivative will be regularized.
• Case 4: the same method can be extended to form regularization matrices for two-dimensional cases on regular grids, for example, to regularize the Laplacian
$$\frac{\partial^2 x}{\partial \xi^2} + \frac{\partial^2 x}{\partial \eta^2} \tag{3.22}$$
In this case, the ith row $r_i$ of the regularization matrix should be
$$r_i = [\,0 \;\cdots\; 1 \;\cdots\; \underbrace{1}_{i-1} \;\; \underbrace{-4}_{i} \;\; \underbrace{1}_{i+1} \;\cdots\; 1 \;\cdots\; 0\,] \tag{3.23}$$
which corresponds to the five-point stencil on the grid (ξ running horizontally, η running vertically)
$$\begin{array}{ccc} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{array}$$
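The four cases above can be assembled numerically. The following is a minimal sketch, assuming NumPy; `difference_matrix` and `laplacian_matrix` are hypothetical helper names, the grid is assumed to be stored row by row, and only interior grid nodes receive a Laplacian row. Note that `np.diff` yields the third-order stencil with the opposite sign to Equation 3.20, which does not change $R^T R$.

```python
import numpy as np

def difference_matrix(n, order):
    """Finite-difference regularization matrix for n unknowns.

    order=1 gives rows [-1, 1] (Case 1), order=2 gives rows [1, -2, 1]
    (Case 2), order=3 gives rows [-1, 3, -3, 1] (Case 3, up to sign).
    """
    return np.diff(np.eye(n), n=order, axis=0)

def laplacian_matrix(n_xi, n_eta):
    """Five-point Laplacian rows (Case 4) for the interior nodes of a grid
    with n_xi columns and n_eta rows, unknowns numbered row by row."""
    rows = []
    for j in range(1, n_eta - 1):
        for i in range(1, n_xi - 1):
            r = np.zeros(n_xi * n_eta)
            c = j * n_xi + i
            r[c] = -4.0
            r[c - 1] = r[c + 1] = 1.0          # neighbors along xi
            r[c - n_xi] = r[c + n_xi] = 1.0    # neighbors along eta
            rows.append(r)
    return np.array(rows)

R1 = difference_matrix(5, 1)
R2 = difference_matrix(5, 2)
R3 = difference_matrix(6, 3)
RL = laplacian_matrix(4, 4)
```

Each matrix annihilates the kind of solution it is meant to favor: `R1` a constant vector, `R2` a linear one, `R3` a quadratic one.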
3.1.4
Tikhonov Regularization for Complex Systems
The Tikhonov regularization method can also be implemented for complex systems that cannot be formulated explicitly in matrix form, by adding the regularization functional terms to the objective function of error. Optimization methods are then used as usual to minimize the modified objective function and obtain the inverse solution. Tikhonov regularization is the most effective and powerful regularization method for many inverse problems arising from complex systems. The choice of the regularization parameter α, however, is not always straightforward.
3.2
Regularization by SVD
The method of regularization by singular value decomposition (SVD) is a very straightforward method applicable when no additional information on the solution is available. It is a purely mathematical means of obtaining a stable solution; therefore, there is no guarantee on the quality of the solution. It is used when the SVD method described in Section 2.7 is used to solve the inverse problem. The solution is then given by Equation 2.119:
$$X = S^{-g} Y = \sum_{i=1}^{r} \frac{\Psi_i^T Y}{\lambda_i}\, \Phi_i \tag{3.24}$$
where $\lambda_i$ (i = 1, 2, …, r = Rank(S)) are the nonzero eigenvalues of the matrix $SS^T$ or $S^T S$. In the actual practice of solving ill-posed inverse problems, there is an issue of how to define "nonzero" because a number of eigenvalues can be very close to zero without being exactly zero. It is clearly seen from Equation 3.24 that the inclusion of small values of $\lambda_i$ will magnify the measurement (output) error in Y, which is the source of the ill-posedness of the problem. Therefore, choosing a sufficiently large cut-off eigenvalue can effectively stabilize the inverse solution. However, it is often difficult to decide where to cut off, because the decision should be related to information on the noise. Engineering experience, judgment, and trial and error are needed to make a proper decision. Computation of the output and input reproducibility matrices can reveal the reproducibility properties and hence help in making the decision. In summary, Table 3.1 gives several different representations of $S^{-g}$ for different cases. In this table, [S Y] is an expanded matrix formed by adding Y as the last column of S, and R is a regularization matrix.
TABLE 3.1 Formulation and Solution for Ill-Posed Inverse Problems with Regularization
Type | Criterion | Formulation
Under-posed (Rank[S] = Rank[S Y]) | Minimum length solution | S^{-g} = S^T (S S^T)^{-1}
Over-posed | Least-squares solution | S^{-g} = (S^T S)^{-1} S^T
Mixed-posed | Damped least-squares solution | S^{-g} = (S^T S + αI)^{-1} S^T
Information about the solution is available | Regularized least-squares solution | S^{-g} = (S^T S + αR^T R)^{-1} S^T
General | Singular value decomposition | S^{-g} = V Λ^{-1} U^T (S = U Λ V^{-1})
Functional of error | Discrepancy principle | Minimize (Π(X) + αδ)
Note: For well-posed problems, α = 0.
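The cut-off idea of Section 3.2 can be illustrated with a truncated SVD. The following is a minimal sketch, assuming NumPy; `tsvd_solve` is a hypothetical helper name and the diagonal test matrix is invented for illustration. `numpy.linalg.svd` returns singular values (the square roots of the eigenvalues of $S^T S$), so the cut-off here is applied to singular values, which is an assumption about how the truncation would be implemented.

```python
import numpy as np

def tsvd_solve(S, Y, cutoff):
    """Solve S X = Y keeping only singular values larger than cutoff."""
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    keep = s > cutoff
    coeffs = (U[:, keep].T @ Y) / s[keep]   # projected data over singular values
    return Vt[keep].T @ coeffs, int(keep.sum())

# Hypothetical ill-conditioned system: one singular value is nearly zero.
S = np.diag([1.0, 0.5, 0.1, 1.0e-8])
X_true = np.array([1.0, 1.0, 1.0, 0.0])
noise = np.array([0.0, 0.0, 0.0, 1.0e-4])   # small error in the last output
Y = S @ X_true + noise

X_naive = np.linalg.solve(S, Y)             # noise divided by 1e-8
X_tsvd, kept = tsvd_solve(S, Y, cutoff=1.0e-4)
```

In this example the direct solution amplifies the small output error by the reciprocal of the near-zero singular value, while the truncated solution discards that component.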
3.3
Iterative Regularization Method
The iterative regularization method can be used when iterative methods (see, for example, Santamarina and Fratta, 1998; Engl et al., 2000) are used to solve inverse problems by minimizing the functional of errors. The method is based on the so-called discrepancy principle and is applicable to systems with or without an explicit matrix form. It yields a stable solution when information on the level of the measurement noise is available. The goal of solving an inverse problem is to solve
$$SX = Y \tag{3.25}$$
for X, where X is a vector of inputs and Y is the vector of outputs. A functional can then be defined:
$$\Pi(X) = \| SX - Y \| \tag{3.26}$$
where $\|\cdot\|$ denotes a norm. In practice, Y is measured and hence noisy; denoting the measured output by $Y^m$, it is possible to have
$$\| Y - Y^m \| \le \delta \tag{3.27}$$
where δ represents the magnitude or level of the noise or discrepancy. The functional of error given in Equation 3.26 can be rewritten as
$$\Pi(X) = \| SX - Y^m \| \tag{3.28}$$
Because the measurement has an error of δ, there is no point, when seeking the minimum of the preceding functional, in looking for an X that drives Π(X) below the order of δ. Therefore, when an iterative method is used to minimize Π(X), the iteration should stop as soon as Π(X) reaches αδ, where α > 1 (e.g., α = 1.1). Seeking any Π(X) smaller than αδ is physically meaningless. The discrepancy principle therefore states that when an iterative method is used to minimize the functional of error, the iteration process should stop when
$$\Pi(X) = \alpha\delta \tag{3.29}$$
This is the stopping criterion for the iteration process; it prevents triggering the ill-posedness of the problem, which would lead to an unstable solution. The iterative regularization method is, in principle, also applicable to inverse problems that are formulated in the form of a functional of errors, as shown in Table 3.1. The iterative method based on the sensitivity matrix will be addressed in detail in Section 14.3, where a complete solution of engineering problems will be presented.
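The stopping rule of Equation 3.29 can be sketched with a simple gradient-type (Landweber) iteration. The following is a minimal sketch, assuming NumPy; the choice of the Landweber iteration, the function name `landweber_discrepancy`, and the test system are assumptions made for illustration, since the text does not prescribe a particular iterative method here.

```python
import numpy as np

def landweber_discrepancy(S, Ym, delta, alpha=1.1, max_iter=10000):
    """Iterate X <- X - step * S^T (S X - Ym) and stop as soon as
    ||S X - Ym|| <= alpha * delta (discrepancy principle)."""
    step = 1.0 / np.linalg.norm(S, 2) ** 2   # ensures a decreasing residual
    X = np.zeros(S.shape[1])
    for k in range(max_iter):
        residual = S @ X - Ym
        if np.linalg.norm(residual) <= alpha * delta:
            return X, k
        X = X - step * (S.T @ residual)
    return X, max_iter

# Hypothetical ill-conditioned system with a known noise level delta.
S = np.diag([1.0, 0.5, 0.1, 1.0e-3])
X_true = np.ones(4)
noise = 0.005 * np.ones(4)                   # ||noise|| = 0.01
delta = np.linalg.norm(noise)
Ym = S @ X_true + noise

X_reg, n_iter = landweber_discrepancy(S, Ym, delta)
X_naive = np.linalg.solve(S, Ym)             # fits the noise exactly
err_reg = np.linalg.norm(X_reg - X_true)
err_naive = np.linalg.norm(X_naive - X_true)
```

In this example the early-stopped iterate leaves the poorly determined component almost untouched instead of fitting the noise in it, so its error stays well below that of the direct solution.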
3.4
Regularization by Discretization (Projection)
For the detailed theoretical background of this regularization, readers are referred to the book by Engl et al. (2000). This section demonstrates how regularization by discretization or projection works with a simple problem of the force estimation of a clamped bar.
3.4.1
Exact Solution of the Problem
Consider a clamped bar of length L and cross-sectional area A, subjected to a body force b(x) continuously acting along the bar, as illustrated in Figure 3.1. Suppose that the material of the bar is homogeneous and the cross section A is constant.
FIGURE 3.1 A clamped straight uniform bar subjected to a distributed body force b(x). The bar is made of a homogeneous material with the elastic Young's modulus of E and constant cross section of A; the length of the bar is L.
The governing differential equation for this static problem is given as
$$EA\frac{d^2 u}{dx^2} + b(x) = 0 \tag{3.30}$$
where E is the elastic Young's modulus, and u denotes the displacement. The boundary conditions are
$$u(x = 0) = 0; \qquad EA\left.\frac{du}{dx}\right|_{x=L} = 0 \tag{3.31}$$
In this study, the body force b(x) is assumed to be one quarter of a sine wave along the bar:
$$b(x) = \sin(\omega_f x) = \sin\left(\frac{\pi}{2L}x\right), \qquad 0 \le x \le L \tag{3.32}$$
Also, L = 0.4 and EA = 1.0 are used for obtaining the numerical results. The exact solution of Equation 3.30, considering the boundary conditions of Equation 3.31, can easily be obtained as
$$u^a(x) = \frac{1}{\omega_f^2}\sin(\omega_f x), \qquad 0 \le x \le L \tag{3.33}$$
Correspondingly, the member force Q can be derived from Equation 3.33 as
$$Q^a(x) = -EA\frac{du}{dx} = -\frac{1}{\omega_f}\cos(\omega_f x), \qquad 0 \le x \le L \tag{3.34}$$
3.4.2
Revealing the Ill-Posedness
In order to reveal the ill-posedness clearly and for further comparison with the result after the regularization, the process described in Section 2.3.2 is repeated here. Consider the inverse problem of estimating the force of the bar. In this case, the distribution of the displacements of the bar is known somehow, as well as the boundary conditions in Equation 3.31, the cross section, and the elastic modulus. Assume now that the measurement of the displacements contains a noise of low level but high frequency:
$$u^m = u^a + e\sin(\omega_{noise}\, x), \qquad 0 \le x \le L \tag{3.35}$$
where $\omega_{noise} = 30\pi$, e = 2 / umax * 0.01 and umax is the maximum value of the displacement in the bar. Note that the magnitude of the noise is constant in 0 ≤ x ≤ L. The simulated measurement given in Equation 3.35, as well as the exact displacement given in Equation 3.33, is plotted in Figure 3.2. This figure shows that the measured displacement (dashed line) is very accurate compared to the actual value (solid line). This simulated measurement with very low noise level can be used to estimate the force. Substituting Equation 3.35 into Equation 3.34 gives the estimated internal force analytically:
$$Q^{ae}(x) = \underbrace{-\frac{1}{\omega_f}\cos(\omega_f x)}_{Q^a(x)} - e\,\omega_{noise}\cos(\omega_{noise}\, x), \qquad 0 \le x \le L \tag{3.36}$$
or
$$Q^{ae}(x) = Q^a(x) - e \underbrace{\omega_{noise}}_{\text{factor of magnification}} \cos(\omega_{noise}\, x), \qquad 0 \le x \le L \tag{3.37}$$
Clearly, the error in the inverse solution for the internal force is magnified drastically by the frequency $\omega_{noise}$ of the measurement noise. This magnification is shown in Figure 3.3, which compares the exact internal force with the estimate obtained from the noise-contaminated displacement.
FIGURE 3.2 Comparison of the actual displacements and the simulated one by adding a noise of low level (1%) but high frequency (ωnoise = 30π) for the clamped bar. (Axes: displacement u versus x; curves: actual displacement, noise-contaminated displacement.)
FIGURE 3.3 Inverse solution of the internal force using the simulated measurement with 1% oscillatory noise contaminating the displacement data. (Axes: internal force Q versus x; curves: inverse solution with noise-free displacement, inverse solution with noise-contaminated displacement.)
The estimated external body force can also be obtained in the following closed form:
$$b^{ae}(x) = \underbrace{\sin(\omega_f x)}_{b(x)} + e \underbrace{\omega_{noise}^2}_{\text{factor of magnification}} \sin(\omega_{noise}\, x), \qquad 0 \le x \le L \tag{3.38}$$
The error in the inverse body force is magnified even more, by a factor of $\omega_{noise}^2$, as specified in Equation 3.38. This inverse problem has the Type III ill-posedness, as was observed in Section 2.3.
3.4.3
Numerical Method of Discretization for Inverse Problem
As illustrated in Section 3.4.2, the inverse problem of force estimation for the bar can be solved analytically using the known displacements; the Type III ill-posedness has also been revealed in this example. However, most engineering problems cannot be solved in closed form. They are normally solved using a method of discretization, such as the finite element method (FEM), the mesh-free method, the finite difference method, etc. The same inverse problem studied in the previous subsection is now solved using the FEM.
3.4.3.1 Finite Element Solution
The finite element equation can be written as
$$ku = f \tag{3.39}$$
where k is the stiffness matrix, f is the nodal force vector, and u is the nodal displacement vector for the element. The detailed procedure that leads to Equation 3.39 can be found in any textbook on the FEM, such as the one by Liu and Quek (2003). Assume the bar is divided into N elements, with a typical element defined in $\Omega_i = (x_i, x_{i+1})$, whose nodes are at $x = x_i$ and $x = x_{i+1}$. The choice of linear shape functions,
$$\psi_{i1}(x) = \frac{x_{i+1} - x}{x_{i+1} - x_i}; \qquad \psi_{i2}(x) = \frac{x - x_i}{x_{i+1} - x_i}, \qquad x_i \le x \le x_{i+1} \tag{3.40}$$
yields
$$k = \frac{EA}{l}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad f = \begin{Bmatrix} \int_{x_i}^{x_{i+1}} f(x)\,\psi_{i1}(x)\,dx \\ \int_{x_i}^{x_{i+1}} f(x)\,\psi_{i2}(x)\,dx \end{Bmatrix} \tag{3.41}$$
where $l = x_{i+1} - x_i$. The assembled equations for all elements in the whole bar can be obtained as
$$KU = F \tag{3.42}$$
The matrices K, U, and F are obtained by assembling k, u, and f according to the connectivity of the elements. The displacements at the nodes of the bar can be obtained by solving Equation 3.42 after the imposition of the displacement boundary condition. The FEM results have been compared with the exact solution given in Equation 3.33. The results are plotted in Figure 3.4, in which the FEM results are obtained using 24 elements of equal length. The figure shows that the FEM results are very accurate.
FIGURE 3.4 The finite element solution with 24 elements for the clamped bar as illustrated in Figure 3.1. (Axes: displacement u versus x; curves: exact solution, FEM solution.)
3.4.3.2 Inverse Force Estimation
Now the nodal force F can be "inversely" obtained from the known displacements U using Equation 3.42. From the nodal force F, the internal force Q(x) can be estimated using
$$Q_f^e(x_1 = 0) = F_f^e(x_1 = 0)$$
$$Q_f^e(x_j) = Q_f^e(x_{j-1}) + F_f^e(x_j), \qquad j = 2, \ldots, N \tag{3.43}$$
where the subscript f stands for the value estimated from the FEM solution. Using the FEM model, the continuous domain of the problem is projected into a discretized space. The so-called regularization by discretization or projection is expected to mitigate the ill-posedness of the inverse problem to a certain degree and to reduce the factor of magnification of error. The next section reveals the effect of the discretization regularization via the simple force estimation of the bar from the displacements.
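The procedure of Equations 3.41 to 3.43 can be sketched in a few lines. The following is a minimal sketch, assuming NumPy and equal-length elements; the function names are hypothetical, and the stiffness matrix is deliberately left without the displacement boundary condition so that the first nodal force contains the reaction at the clamped end, which is an assumption about how Equation 3.43 is started.

```python
import numpy as np

L, EA = 0.4, 1.0                      # values given in Section 3.4.1
omega_f = np.pi / (2.0 * L)

def assemble_stiffness(n_elem):
    """Assemble K from the two-node element matrix of Eq. 3.41."""
    l = L / n_elem
    k = (EA / l) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    K = np.zeros((n_elem + 1, n_elem + 1))
    for e in range(n_elem):
        K[e:e + 2, e:e + 2] += k
    return K

def estimate_internal_force(U):
    """Nodal forces F = K U (Eq. 3.42 used inversely), then the running
    sum of Eq. 3.43. For this element the sum at node j telescopes to
    -EA (u[j+1] - u[j]) / l, and the sum over all nodes is zero."""
    F = assemble_stiffness(U.size - 1) @ U
    return np.cumsum(F)

n_elem = 24
x = np.linspace(0.0, L, n_elem + 1)
U_exact = np.sin(omega_f * x) / omega_f**2      # Eq. 3.33 at the nodes
Q_est = estimate_internal_force(U_exact)

# Compare with Eq. 3.34 evaluated at the element midpoints.
x_mid = 0.5 * (x[:-1] + x[1:])
Q_mid = -np.cos(omega_f * x_mid) / omega_f
max_diff = np.max(np.abs(Q_est[:-1] - Q_mid))
```

With noise-free nodal displacements, the estimated force agrees closely with the analytical member force of Equation 3.34.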
3.4.4
Definition of the Errors
To investigate the accuracy of the internal force inversely estimated using the FEM model, and to compare it with the direct analytical solution, the following error norms between the estimated and the exact solutions are defined:
$$\varepsilon_a = \frac{\max\left|Q_i^a - Q_{ai}^e\right|}{\varepsilon}, \qquad \varepsilon_t = \frac{\max\left|Q_i^a - Q_{fi}^e\right|}{\varepsilon}, \qquad \text{for all } i$$
$$\varepsilon_m = \frac{\max\left|Q_i^a - Q_{fi}^a\right|}{\varepsilon}, \qquad \varepsilon_f = \frac{\max\left|Q_{fi}^a - Q_{fi}^e\right|}{\varepsilon}, \qquad \text{for all } i \tag{3.44}$$
where
$Q_i^a$: the internal force at the ith node obtained using the analytical Equation 3.34 without noise in the displacement
$Q_{ai}^e$: the estimated internal force at the ith node obtained using the analytical Equation 3.36 with noise in the displacement
$Q_{fi}^a$: the estimated internal force at the ith node obtained using the FEM model without noise in the displacement
$Q_{fi}^e$: the estimated internal force at the ith node obtained using the FEM model with noise in the displacement
ε: the reference error defined by
$$\varepsilon = e\,\omega_{noise} \tag{3.45}$$
which is the maximum error of the estimated internal force in the bar obtained using Equation 3.36. Using Equation 3.34 and Equation 3.36, the error of the internal force estimated from the analytical formula using the noisy displacement becomes
$$\varepsilon_a = 1 \tag{3.46}$$
For convenience, $\varepsilon_a$ is termed the analytical inverse error. All the other errors are expressed in relation to $\varepsilon_a$:
$\varepsilon_f$ stands for the maximum error of the estimated force in the bar using the FEM model and noisy displacement (FEM inverse error for short).
$\varepsilon_m$ stands for the maximum error from the FEM model (FEM model error for short).
$\varepsilon_t$ stands for the total maximum error of the estimated force and combines the FEM model error and the FEM inverse error. Thus
$$\varepsilon_t = \varepsilon_m + \varepsilon_f \tag{3.47}$$
FIGURE 3.5 Errors involved in the inverse analysis of the internal force in a clamped bar. εa: analytical inverse error; εf: FEM inverse error; εm: FEM model error; εt = εm + εf: total error. The frequency of the noise in the displacement is ωnoise = 60π; that is the factor of error magnification. The FEM discretization can always reduce the inverse error compared to the analytical inverse error εa. This error reduction effect is termed regularization by discretization or projection and is more effective when the number of divisions of the bar is small; the effect reduces with the increase of the number of elements. The FEM model error, however, increases with the decrease of the number of elements. The best division of elements should correspond to the minimum total error point. In a special case the FEM inverse error drops to zero when the element number is 24 due to the special element division that picks up no noise. (Upper panel: error versus number of elements, with the best number of element divisions marked; lower panel: displacement noise versus x.)
3.4.5
Property of Projection Regularization
Figure 3.5 shows the errors involved in the inverse analysis of the internal force in a clamped bar for different numbers of element divisions. The frequency of the noise is $\omega_{noise} = 60\pi$.
• The FEM discretization can always reduce the inverse error compared to the analytical inverse error $\varepsilon_a$. This error reduction effect is termed regularization by discretization or projection.
• Regularization by discretization is more effective when the number of divisions of the bar is small, and the effect reduces as the number of elements increases.
• The FEM model error, however, increases as the number of elements decreases.
• The best division of elements should correspond to the minimum total error point. Therefore, in the context of inverse analysis, it is not always true that the use of more elements leads to a more accurate solution.
In a special case the FEM inverse error drops to zero when the element number is 24, because this particular element division places every node at a zero of the noise and so picks up no noise. Generally, this will not happen. Figure 3.6 gives the force estimated from the noisy displacement using the FEM with different numbers of elements. In this estimation, the frequency of the contaminating noise is $\omega_{noise} = 60\pi$. Two divisions, of 20 equal elements and 100 equal elements, are used in the analysis. The figure clearly shows that using fewer elements (20) gives a better inverse solution than using more elements (100).
FIGURE 3.6 Estimated force using FEM with different numbers of elements. The frequency of the contaminating noise is ωnoise = 60π. Clearly, using fewer elements leads to a better solution. The estimated results with too many elements will approach the analytical estimation. This figure shows that the projection regularization only works well at coarse discretization. (Axes: internal force Q versus x; curves: true solution, analytical estimation, FEM estimation with 20 elements, FEM estimation with 100 elements.)
3.4.6
Selecting the Best Mesh Density
The FEM inverse error increases with the density of discretization. In contrast, the FEM model error decreases when more elements are used in the analysis. Thus the question now is how to select the best division in an inverse analysis to achieve the least total error. Figure 3.5 shows that the number of elements should be at the lowest point (marked with a dot) of the total error curve, where the total error is minimum. At this point, the total error is around one fifth of the analytical error $\varepsilon_a$. For this problem of force estimation, the best density of discretization can be determined easily because the force is given. However, for practical engineering problems, this will be very difficult because the forces are not known and the error curves cannot be drawn. The best density of discretization can only be found if some information about the force is known in terms of its rough distribution and frequency (wavenumber). This is often possible, as shown in examples in later chapters.
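The dependence of the FEM inverse error on the mesh density can be reproduced with a short calculation. The following is a minimal sketch, assuming NumPy, equal-length elements, and the telescoped form of the force estimate, -EA (u[j+1] - u[j]) / l; because that estimate is linear in the displacements, the FEM inverse error depends only on the noise term. The function name `fem_inverse_error` and the noise amplitude `e` are illustrative assumptions (the amplitude cancels in the ratio).

```python
import numpy as np

L, EA = 0.4, 1.0

def fem_inverse_error(n_elem, omega_noise, e=1.0e-4):
    """FEM inverse error of Eq. 3.44, normalized by e * omega_noise (Eq. 3.45)."""
    l = L / n_elem
    x = np.linspace(0.0, L, n_elem + 1)
    noise = e * np.sin(omega_noise * x)       # noise term of Eq. 3.35 at the nodes
    dQ = -EA * np.diff(noise) / l             # noise-induced part of the estimate
    return np.max(np.abs(dQ)) / (e * omega_noise)

omega_noise = 60.0 * np.pi
err_20 = fem_inverse_error(20, omega_noise)
err_24 = fem_inverse_error(24, omega_noise)
err_100 = fem_inverse_error(100, omega_noise)
```

Under these assumptions the 20-element mesh roughly halves the analytical inverse error, the 100-element mesh leaves it almost unchanged, and the 24-element mesh places every node at a zero of the noise, which is the special case noted for Figure 3.5. The FEM model error is not computed in this sketch.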
3.5
Regularization by Filtering
Filtering is the most effective and straightforward method of regularization. Because all the ill-posedness is triggered by the presence of noise, removing the noise naturally works best. In practice, the data should always be filtered before they are used for the inverse analysis. The basic idea is to remove noise whose frequency is higher than that of the true outputs (effects). The filtering can be carried out simply by selecting a proper filter from a filter bank (see Strang and Nguyen, 1996) or by directly using the signal processing toolbox provided in the commercial software MATLAB. The following describes smoothing by the moving average method to treat noisy data for inverse analysis. High-frequency noise can be reduced by running a moving average. The ith smoothed value of the signal is computed from the original noisy signal u as a weighted average of the neighboring values around i. A kernel h of m values (m is an odd number) contains the weights to be applied to the neighboring elements. The sum of all weights in a smoothing kernel equals 1. The filtered data $u^F$ can be written as
$$u_i^F = \sum_{k=-(m-1)/2}^{(m-1)/2} h_p\, u_{i+k} \qquad (i = 1, 2, \ldots, N) \tag{3.48}$$
where h is the kernel with m parameters and p = k + (m + 1)/2 indexes its weights. A kernel
$$h = \left[\frac{1}{20} \;\; \frac{2}{20} \;\; \frac{4}{20} \;\; \frac{6}{20} \;\; \frac{4}{20} \;\; \frac{2}{20} \;\; \frac{1}{20}\right]$$
with m = 7 is selected for the study. Two kinds of assumed noise will be considered in the following application to the force estimation: high-frequency sine noise and Gaussian noise along the bar.
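Equation 3.48 with the seven-point kernel can be written compactly as a convolution. The following is a minimal sketch, assuming NumPy; the function name `moving_average` is hypothetical, and repeating the end values to pad the signal is an assumption, since the text does not state how the ends of the data are treated.

```python
import numpy as np

# Seven-point smoothing kernel of Section 3.5; the weights sum to 1.
KERNEL = np.array([1.0, 2.0, 4.0, 6.0, 4.0, 2.0, 1.0]) / 20.0

def moving_average(u, h=KERNEL):
    """Weighted moving average of Eq. 3.48.

    The signal is padded by repeating its end values (an assumption) so
    that the output has the same length as the input.
    """
    half = (h.size - 1) // 2
    padded = np.pad(u, half, mode="edge")
    # The kernel is symmetric, so convolution equals the weighted sum.
    return np.convolve(padded, h, mode="valid")

# Hypothetical test signal: a pure high-frequency oscillation.
n = np.arange(200)
noise = np.sin(0.6 * np.pi * n)
smoothed = moving_average(noise)
```

A constant signal passes through unchanged because the weights sum to 1, while the rapidly oscillating test signal is strongly attenuated away from the ends.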
3.5.1
Example I: High-Frequency Sine Noise
Consider the force estimation of the clamped bar studied in Section 3.4.1, with the noise given in Equation 3.35. Figure 3.7 shows the estimation of the force using FEM with 20 elements. In this figure, the results estimated with the filtered displacement are plotted as a dashed line, the results estimated with the noise-contaminated displacement are plotted as a long dashed-dot line, and the true results are plotted as a solid line for reference. The figure shows that the results estimated with the filtered displacement are very close to the exact ones and that the oscillation of the estimated results has been mitigated by the filtering.
FIGURE 3.7 Comparison of estimated force with and without filtering for the displacement containing a sine function noise as specified by Equation 3.35. The FEM solution is obtained using 20 elements. (Axes: internal force Q versus x; curves: with filtering, without filtering, analytical estimation.)
3.5.2
Example II: Gaussian Noise
Gaussian noise (see Chapter 2) is directly added to the computer-generated displacements to simulate the noise contamination. The Gaussian noise is generated from a Gaussian distribution with a zero mean and a standard deviation of $b = (\sqrt{2}/2)\,u_{max} \times 0.01$, as shown in Figure 3.8. The noise-contaminated displacement, obtained by adding the Gaussian noise to the displacement obtained using FEM with 100 elements, is then used to simulate the measurement. Figure 3.8 shows that the Gaussian noise is random in nature and much more complex in distribution along the bar than the sine function noise. The filter is applied to the noisy displacement, and the inverse analysis is then performed. Figure 3.9 gives the estimated force together with that obtained using the unfiltered noisy displacements. The results estimated with the filtered displacements (dashed line) are much more accurate and very close to the exact results.
FIGURE 3.8 Gaussian noise generated with a zero mean and standard deviation of $b = (\sqrt{2}/2)\,u_{max} \times 0.01$ for simulating the measured displacement of the clamped bar. (Axes: Gaussian noise, scale ×10⁻³, versus x.)
FIGURE 3.9 Comparison of estimated force of the bar with and without filtering for the displacement containing the Gaussian noise. The FEM solution is obtained using 100 elements. (Curves: with filter, true solution, without filter.)
Designing a good filter requires a good understanding of the nature of the noise in the measurement data, as well as some information on the frequency (or wavenumber) of the solution. If the possible frequency of the noise is known from the experimental setup, and it is also known that the solution will not contain such a frequency, a filter can always be designed to filter out the noise before the data are used for the inverse analysis.
3.6
Remarks
Remark 3.1: a set of regularization methods for ill-posed inverse problems has been presented. The Tikhonov regularization is considered the most popular regularization method for ill-posed problems.
Remark 3.2: projection regularization works well when a coarse mesh is used in the discrete methods. In contrast, the model error decreases when the density of discretization increases. The best density of discretization should lead to the least total error.
Remark 3.3: practical engineering inverse problems are often nonlinear and of high dimension, so it is difficult to develop a "magic" regularization method that works for all ill-posed problems. Performing a proper regularization requires a good understanding of the nature of the ill-posedness of the inverse problem, and prevention is always better than cure. Some ways of prevention that cannot be overemphasized include:
• The reduction of the number of parameters to be inversely determined in an inverse procedure is always the first effort to make. The search bounds of these parameters should be as narrow as possible.
• To reduce the Type I ill-posedness, the number of knowns (measurements) should be at least equal to the number of unknowns to be inversely determined, leading to at least an even-posed problem. Efforts should be made to make the problem over-posed.
• To reduce the Type II ill-posedness, ensure a high sensitivity of the effects to the parameters (including the inputs) to be inversely identified. The parameters should be influential on the effects and as independent as possible.
• When a forward solver based on discretization is used, a coarse mesh is preferred as long as it gives results slightly more accurate than the experiments. This can reduce the Type III ill-posedness.
• Filter the measurement data before they are fed to the inverse analyses. This can very effectively reduce the ill-posedness.
• Use the discrepancy principle as the stopping criterion to avoid triggering the ill-posedness.
• As the last resort, use the Tikhonov regularization to restore the stability of the inverse solution.
4 Conventional Optimization Techniques
As discussed in Chapter 2, inverse problems in engineering are usually formulated and solved as optimization problems; therefore, optimization techniques are, in general, applicable to inverse problems. This chapter introduces some important conventional optimization techniques as well as some root finding methods for nonlinear system equations. These conventional optimization techniques can be roughly classified into two categories: direct search algorithms and gradient-based algorithms. In direct search algorithms, only the function values are used in the search process. In gradient-based methods, both derivative and function values are used to achieve high efficiency. Note that many conventional optimization techniques rest on the basic assumption that only one minimum exists in the search region. Therefore, these methods are also called local search methods.
4.1
The Role of Optimization in Inverse Problems
It can be understood from Chapter 2 that an inverse problem can be formulated as a minimization problem to find an inverse solution X leading to
$$\text{Minimize } \Pi(X) \tag{4.1}$$
where Π is a functional of errors or discrepancies. When the so-called L2 norm is used to define the functional, then
$$\Pi(X) = \sum_{i=1}^{n_s} \left( y_i^p(X) - y_i^m(X_t) \right)^2 \tag{4.2}$$
where ns is the number of sampling points of experiment readings or measurements. The functional Π(X) is the sum of the ns squared errors of the predicted outputs $y_i^p$, based on the forward model Equation 2.112 and an assumed input X, with respect to the measured outputs $y_i^m$ for the system with the true input Xt. When L1 and L∞ norms are used, Π is then given by Equation 2.128 and Equation 2.129, respectively. This clearly shows that an inverse problem can be formulated as a minimization problem and that any optimization method can be applied to solve the inverse problem so formulated. Many works have been reported using optimization methods to solve practical inverse problems such as the estimation of the impact location, as well as the time–history of dynamic forces. Law et al. (1997) derived the solution for the vertical dynamic interaction force between a moving vehicle and bridge deck, which was modeled as one point or two point loads moving at a constant speed on a simply supported beam with viscous damping. They used acceleration measurements for their inverse prediction, and their numerical prediction is reported to work well. Möller (1999) proposed a method for load identification in an attempt to approach the inverse problem in a general manner for homogeneous solid plates. Without solving any equation system, a large number of trial load cases with assumed distributions were evaluated to determine the respective magnitudes. An optimization scheme with added discrete mass as design variables was adopted to reproduce a structure such that each set of response generated by each one of the previously identified loads was clearly distinguishable at the transducer positions. Using elastic waves, a minimization procedure has also been proposed for identifying the concentrated and the extended load from the displacement response measured on the surface of composite laminate (Liu et al., 2002e, f). Conjugate gradient optimization methods were used to deconvolute the integral, which leads to the inverse solution. Optimization methods were also employed to reconstruct the material property of structures.
Rokhlin and co-workers (Rokhlin et al., 1992; Chu et al., 1994; Chu and Rokhlin, 1994a, b) used the ultrasonic technique to determine the elastic constants of composites. In most cases, the optimization methods such as simple search methods, as well as gradient-based methods, are used to determine the elastic constants from the measured bulk wave velocity. Mota Soares et al. (1993) solved a constrained minimization of an error functional expressing the difference between measured higher frequencies of a plate specimen and the corresponding numerical ones to find the mechanical properties of composite specimens. Liu et al. (2001a) inversely characterized the material property of functionally graded material using the nonlinear least squares method. Some researchers have used genetic algorithms (GAs) as the inversion technique to reconstruct the elastic constants. Balasubramaniam and Rao (1998) reconstructed material stiffness properties of unidirectional fiber-reinforced composites from obliquely incident ultrasonic bulk wave velocity data. Recently, Liu et al. (2002a) used a combined optimization method for material characterization of composite from given surface displacements. In this method, the nonlinear least square algorithm as well as the GA is combined for the inversion using an overposed data set.
Optimization methods were also used for flaw detection and crack identification in engineering materials and structures. Doyle (1995) utilized the spectral element method combined with a stochastic genetic algorithm to locate and size cracks in structural components. Bicanic and Chen (1997) proposed a procedure for the damage identification of framed structures, using only a limited number of measured natural frequencies. Based on the characteristic equation of the original and damaged structure, a set of equations is formulated corresponding to the stiffness matrix, and is solved by the direct iteration and Gauss–Newton techniques. Wu et al. (2002) used the GA to inversely detect the crack location and crack length for anisotropic laminated plates from the scattered wave fields in the frequency domain. Different stiffness distributions with respect to damage detection have also been investigated using the structural dynamic response in the frequency domain (Liu and Chen, 2002). In this technique, element stiffness factors of the finite element model of a structure are taken to be parameters, and explicitly expressed in a linear form in the system equation for forward analysis of the harmonic response of the structure. This offers great convenience in applying Newton's method to search for the stiffness factors inversely because the Jacobian matrix can be obtained simply by solving sets of linear algebraic equations derived from the system equation. Yang et al. (2002b) proposed a nondestructive detection method using an integral strain that can be measured by an optic fiber. In their method, the GA is employed to solve an optimization problem formulated from the calculated and the measured data sets. There are many applications of optimization techniques to other inverse problems. For instance, heat transfer coefficients in electronic system cooling have been determined by an improved genetic algorithm (Liu et al., 2002h).
Also, protein structure stability analysis and native structure prediction using GAs have been reported (Dandekar and Argos, 1992; Unger and Moult, 1993; Yang et al., 2002a, c). A new procedure for fitting interatomic potentials is suggested by Xu and Liu (2003) using molecular dynamics simulations and the GA. The flow-pressure characteristic parameters of valve-less micropumps have also been identified from the flow-membrane coupling vibration model with trial pressure-loss coefficients (Xu and Liu, 2002e), where the GA is used to solve the optimization problem and identify the actual pressure-loss coefficients. These practical examples clearly demonstrate the importance of optimization techniques as well as the innovative applications of these techniques in solving inverse problems. The next section describes the commonly used conventional optimization techniques that are useful for inverse problems.
4.2
Optimization Formulations
Optimization is a process of determining the “best” solutions of a desired objective function f(x) while satisfying the prevailing constraints. The majority of engineering problems involving minimization can be mathematically expressed as:

$$
\begin{aligned}
\text{Minimize} \quad & f(\mathbf{x}) \\
\text{Subject to} \quad & g_i(\mathbf{x}) \le 0, \quad i = 1, 2, \ldots, m \\
& h_j(\mathbf{x}) = 0, \quad j = 1, 2, \ldots, l
\end{aligned}
\tag{4.3}
$$

$$
x_k^L \le x_k \le x_k^U, \quad k = 1, 2, \ldots, n \tag{4.4}
$$
where x = {x1, x2, …, xn}T is a column vector of the n design variables, f(x) is the objective function, gi(x) is the ith inequality constraint, and hj(x) is the jth equality constraint. Note that the maximization of f(x) is equivalent to the minimization of –f(x). Therefore, minimization and maximization can essentially be treated as the same problem.

Most optimization methods use an iterative approach that generates a sequence of points, x(1), x(2), … (the superscripts denoting the iteration number), converging to the point x* that is the solution of the problem (or the minimizer of the objective function). If f(x) can be written in the form of Equation 4.2, the objective function is a sum of ns squared terms; this is the formulation of the so-called nonlinear least squares problem. Special numerical treatments that are effective for this type of problem exist and will be introduced in Section 4.5.

If the objective function f(x) is smooth or continuously differentiable, the vector of first partial derivatives of the function at any point x, or the gradient vector, can be written as

$$
\nabla f(\mathbf{x}) = \left\{ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right\}^T \tag{4.5}
$$
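If f(x) is twice continuously differentiable, then the matrix of second partial derivatives, the so-called Hessian matrix, can be given as

$$
\nabla^2 f(\mathbf{x}) =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1 \partial x_1} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
 & \dfrac{\partial^2 f}{\partial x_2 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
 & & \ddots & \vdots \\
\text{symmetric} & & & \dfrac{\partial^2 f}{\partial x_n \partial x_n}
\end{bmatrix}
\tag{4.6}
$$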
However, for real engineering problems the expressions for the partial derivatives often cannot be obtained in an explicit form. In this case, the derivatives can be obtained by numerical methods, such as the finite difference method. Several such numerical methods for the evaluation of the first-order derivatives can be found in the book by Rao (1996).

Next, several traditional optimization methods for dealing with unconstrained minimization problems will be presented. Though practical engineering problems are rarely unconstrained, the methods for solving unconstrained minimization problems are still important because, first, they are the foundation of the methods for constrained problems and, second, many practical engineering problems can be solved using unconstrained minimization methods. In fact, many engineering inverse problems provided in the following chapters are defined as minimization problems without constraints or as nonlinear least squares problems with simple bounds. For instance, the material property of a composite structure (Chapter 8) subjected to a known impact load can be inversely determined by solving a nonlinear least squares problem, with simple bounds on the parameters, formulated using dynamic displacement responses.

The techniques for unconstrained minimization problems can be roughly classified into two categories: direct search algorithms and gradient-based algorithms. Both assume that there is only one optimum or minimum in the region of searching. In the direct search algorithms, only the function values are used in the search process. In the gradient-based methods, the function and its derivative values are used. Gradient-based methods are generally more efficient than direct search methods because the gradient is used to guide the direction of searching. Therefore, as long as the derivatives of the objective function can be obtained at reasonable cost, gradient-based methods are preferred.
The direct search methods are useful if the derivatives of the objective function cannot be obtained or are too costly to obtain. In describing the algorithms of these conventional optimization methods, emphasis is placed on understanding the concepts and procedures of these methods rather than on the algorithm coding. This is because many source codes are available in the open literature or in standard math libraries. In most situations, users need only call the subroutines provided by the standard libraries. Moreover, simple examples that can be worked out by hand for easy comprehension have been used in describing the concepts and detailing the procedures of these methods.
4.3
Direct Search Methods
A number of direct search methods exist, including

• Grid search method (the simplest direct search method)
• Fibonacci search as well as the golden section search method
• Simplex search method
• Hooke and Jeeves pattern search method
• Powell’s conjugate direction method, etc.

FIGURE 4.1 Schematic presentation of the golden section search method. (Plot of f(x) against x, marking the bounds xL and xU, the intermediate points x1, x2, and x3, and the corresponding function values fL, fU, f1, f2, and f3.)
All these methods assume only one optimum in the search region. This section presents only the golden section search method, which is a well-known classic direct search method; the Hooke and Jeeves method, which is one of the most commonly used direct search methods; and Powell’s method, which is regarded as one of the most efficient and powerful direct search methods. For practical applications, the golden section search method is a very simple, robust, and useful method. It has been employed for the cooling system identification problem that is examined in detail in Chapter 13.
4.3.1
Golden Section Search Method
The golden section search method is a single-variable optimization algorithm. It is assumed that there is only one optimum in the region of search. Figure 4.1 illustrates this method for determining the minimum of a function f(x). The lower and upper bounds on x are xL and xU, respectively, and their corresponding function values are fL and fU. In this method, two intermediate points are picked according to the following equations:

$$
x_1 = \tau x_L + (1 - \tau) x_U \tag{4.7}
$$

$$
x_2 = (1 - \tau) x_L + \tau x_U \tag{4.8}
$$

where $\tau = (\sqrt{5} - 1)/2 \approx 0.618$. The function is evaluated at these points to provide f1 and f2. Due to the assumption that the function is unimodal, it follows that either x1 or x2 will form a new bound:

• If f1 is greater than f2, x1 becomes the new lower bound, and the new bounds, x1 and xU, are formed as illustrated in Figure 4.1.
• If f2 is greater than f1, x2 becomes the new upper bound, and the minimum of f falls between xL and x2.
• In the case shown in Figure 4.1, an additional point, x3, is then picked according to Equation 4.8, and f3 is evaluated. Comparing f2 and f3 shows that f3 is greater than f2. Thus x3 replaces xU as the upper bound.
• Repeating this process, the bounds are eventually narrowed to the desired tolerance, leading to the minimum solution.
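The steps above can be sketched in a few lines of code. The following is a minimal sketch under the unimodality assumption stated in the text; the function name `golden_section_search`, the tolerance `tol`, and the test function are illustrative assumptions. Because τ² = 1 − τ, one of the two interior points can be reused after each bound update, so only one new function evaluation is needed per iteration.

```python
import math


def golden_section_search(f, lower, upper, tol=1e-6):
    """Narrow [lower, upper] around the minimum of a unimodal function f."""
    tau = (math.sqrt(5.0) - 1.0) / 2.0           # about 0.618
    x1 = tau * lower + (1.0 - tau) * upper       # Equation 4.7
    x2 = (1.0 - tau) * lower + tau * upper       # Equation 4.8
    f1, f2 = f(x1), f(x2)
    while upper - lower > tol:
        if f1 > f2:
            # x1 becomes the new lower bound; the old x2 is reused as x1.
            lower, x1, f1 = x1, x2, f2
            x2 = (1.0 - tau) * lower + tau * upper
            f2 = f(x2)
        else:
            # x2 becomes the new upper bound; the old x1 is reused as x2.
            upper, x2, f2 = x2, x1, f1
            x1 = tau * lower + (1.0 - tau) * upper
            f1 = f(x1)
    return 0.5 * (lower + upper)


# Hypothetical unimodal test function with its minimum at x = 2.
x_min = golden_section_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each iteration shrinks the interval by the factor τ, so the number of function evaluations needed for a given tolerance is known in advance.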
4.3.2
Hooke and Jeeves’ Method
Hooke and Jeeves’ method is one of the most commonly used direct search methods. It is assumed that there is only one minimum in the region of search. In this method, an initial step size is chosen and the search is initiated from a given starting point. A combination of exploratory and pattern moves is made iteratively to find the most profitable search directions. An exploratory move is employed first to find the best point around the initial point. If the exploratory move leads to a decrease in the value of the function, it is regarded as a success; otherwise, it is considered a failure. Then a pattern move is made to find the next point.

4.3.2.1 Exploratory Moves

An exploratory move acquires information about the function values around the current base point. The current base point is denoted by x(0), and the initial step size is denoted by ∆0. The reduction factor is denoted by α.

• Evaluate f(x(0) + ∆0). If the move from x(0) to x(0) + ∆0 is successful, replace the base point x(0) with x(0) + ∆0; otherwise, retain the original base point. Repeat the same process for the point x(0) – ∆0.
• Find a new base point x(1).
• If x(1) = x(0), reduce the step length (by half if the reduction factor is α = 2) and return to the first step. The search terminates if the step length has been reduced to the prescribed level. If x(1) ≠ x(0), make a pattern move from these two points.
The exploratory move will be employed in the hybrid genetic algorithms (Chapter 5) to construct the local operator.

4.3.2.2 Pattern Moves

A pattern move attempts to speed up the search. It seems sensible to move from x(1) in the direction x(1) – x(0) because a move in this direction has already led to a decrease in the function value. The procedure for a pattern move is expressed in the iterative form

$$
\mathbf{x}_p^{(k+1)} = \mathbf{x}^{(k)} + \left( \mathbf{x}^{(k)} - \mathbf{x}^{(k-1)} \right) = 2\mathbf{x}^{(k)} - \mathbf{x}^{(k-1)} \tag{4.9}
$$
A new sequence of exploratory moves about xp(k+1) is then performed. If the lowest function value is reached during the pattern move and the exploratory move about xp(k+1), then a new base point can be established, and the current base point becomes x(k+1). Otherwise, a new sequence of exploratory moves about x(k) should be performed.

4.3.2.3 Algorithm

Based on the preceding illustration of the exploratory and pattern moves, the algorithm of Hooke and Jeeves’ method (also called Hooke and Jeeves’ pattern search method) is outlined in Figure 4.2. Next, an example of a simple minimization problem that can be analyzed by hand will be presented to illustrate the detailed procedure of this method.

FIGURE 4.2 Flow chart of Hooke and Jeeves’ method. (Flow: start with the initial point x(0), k = 0, initial increment ∆(0), termination parameter ε, and reduction factor α ≥ 2; perform the exploratory move with x(k); on failure, stop if ∆(k) ≤ ε, otherwise set ∆(k+1) = ∆(k)/α and repeat the exploratory move; on success, set k = k + 1, find x(k+1), perform the pattern move xp(k+2) = x(k+1) + (x(k+1) − x(k)) = 2x(k+1) − x(k), and perform the exploratory move with xp(k+2).)
4.3.2.4 Example

Use Hooke and Jeeves’ method to solve the following problem:

$$
\text{Minimize} \quad f(x_1, x_2) = x_1^2 + x_2^2 - 4x_1 - 4x_2 - 8 - x_1 x_2 \tag{4.10}
$$

Set the initial point at (0, 0)T. The initial step is chosen as (0.5, 0.5)T, the reduction factor is α = 2, and the termination parameter is ε = 0.1.

Iteration 1: First, an exploratory move using (0, 0)T as the base point is performed. Thus, x(0) = (0, 0)T and k = 0 are set. Consider the first variable, x1:

$$
\begin{aligned}
f(0, 0) &= -8 \\
f(-0.5, 0) &= -5.75 \\
f(0.5, 0) &= -9.75
\end{aligned}
\tag{4.11}
$$

The minimum of the function values at these three points is –9.75 at the point (0.5, 0.0)T. Now the exploratory move on the second variable, x2, is performed:

$$
\begin{aligned}
f(0.5, 0) &= -9.75 \\
f(0.5, 0.5) &= -11.75 \\
f(0.5, -0.5) &= -7.25
\end{aligned}
\tag{4.12}
$$

The exploratory move is successful because a smaller function value is found at the corresponding point x(1) = (0.5, 0.5)T. Next, perform the pattern move:

$$
\mathbf{x}_p^{(2)} = 2\mathbf{x}^{(1)} - \mathbf{x}^{(0)} = (1.0, 1.0)^T \tag{4.13}
$$

Then, perform another exploratory move based on the point (1.0, 1.0)T:

$$
\begin{aligned}
f(1.0, 1.0) &= -15 \\
f(1.5, 1.0) &= -16.25 \\
f(0.5, 1.0) &= -13.25 \\
f(1.5, 1.5) &= -17.75 \\
f(1.5, 0.5) &= -14.25
\end{aligned}
\tag{4.14}
$$
This exploratory search is again successful, and the new base point is found to be x(2) = (1.5, 1.5)T. This completes one iteration of the Hooke and Jeeves’ search.

Iteration 2: First, the exploratory move using (1.5, 1.5)T as the base point is performed:

$$
\begin{aligned}
f(1.5, 1.5) &= -17.75 \\
f(2.0, 1.5) &= -18.75 \\
f(1.0, 1.5) &= -16.25 \\
f(2.0, 2.0) &= -20 \\
f(2.0, 1.0) &= -17
\end{aligned}
\tag{4.15}
$$

The exploratory move is successful, and the corresponding point becomes x(3) = (2.0, 2.0)T. Next, perform the pattern move:

$$
\mathbf{x}_p^{(4)} = 2\mathbf{x}^{(3)} - \mathbf{x}^{(2)} = (2.5, 2.5)^T \tag{4.16}
$$

which is followed by another exploratory move based on the point (2.5, 2.5)T:

$$
\begin{aligned}
f(2.5, 2.5) &= -21.75 \\
f(2.0, 2.5) &= -20.75 \\
f(3.0, 2.5) &= -22.25 \\
f(3.0, 2.0) &= -21 \\
f(3.0, 3.0) &= -23
\end{aligned}
\tag{4.17}
$$
This search is also successful, and the new base point is x(4) = (3.0, 3.0)T.

Iteration 3: Repeating the same process as described in iteration 2 yields x(6) = (4.0, 4.0)T, with a function value of –24. This new point happens to be the true optimal point of the problem. It should be pointed out that even after the optimal point has been found, the algorithm will proceed until the increment vector ∆ is less than the termination parameter ε. The following iteration shows how the algorithm decreases the increment and finally terminates at the optimal point.

Iteration 4: First, the exploratory move using (4.0, 4.0)T as the base point is performed, considering the first variable:

$$
\begin{aligned}
f(4.0, 4.0) &= -24 \\
f(4.5, 4.0) &= -23.75 \\
f(3.5, 4.0) &= -23.75
\end{aligned}
\tag{4.18}
$$

Also, performing the exploratory move on the second variable:

$$
\begin{aligned}
f(4.0, 4.0) &= -24 \\
f(4.0, 4.5) &= -23.75 \\
f(4.0, 3.5) &= -23.75
\end{aligned}
\tag{4.19}
$$

Note that the exploratory move has failed because all the newly found function values are larger than –24. Thus the increment should be reduced to ∆ = ∆(0)/2 = (0.25, 0.25)T, and the exploratory move is performed again, using the reduced increment, for the first variable as

$$
\begin{aligned}
f(4.0, 4.0) &= -24 \\
f(4.25, 4.0) &= -23.9375 \\
f(3.75, 4.0) &= -23.9375
\end{aligned}
\tag{4.20}
$$

Also, for the second variable,

$$
\begin{aligned}
f(4.0, 4.0) &= -24 \\
f(4.0, 4.25) &= -23.9375 \\
f(4.0, 3.75) &= -23.9375
\end{aligned}
\tag{4.21}
$$

Again the exploratory move fails, and the increment should be reduced to ∆ = (0.125, 0.125)T. The algorithm will continue the exploratory move using the new increment until the increment is smaller than the desired termination parameter. The final solution is pinpointed at (4.0, 4.0)T with a function value of –24.
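The hand calculation above can be reproduced with a short program. The following is a minimal sketch that follows the sequence used in this example (an exploratory move about the base point, one pattern move, and an exploratory move about the pattern point); the names `explore`, `hooke_jeeves`, and `max_iter` are assumptions introduced for illustration, and other published variants of the method order the moves slightly differently.

```python
def explore(f, base, delta):
    """Exploratory move: try +delta and -delta on each variable in turn."""
    point = list(base)
    best = f(point)
    for i in range(len(point)):
        for candidate in (point[i] + delta[i], point[i] - delta[i]):
            trial = list(point)
            trial[i] = candidate
            value = f(trial)
            if value < best:          # keep only strict improvements
                point, best = trial, value
    return point, best


def hooke_jeeves(f, x0, step, eps=0.1, alpha=2.0, max_iter=1000):
    base, f_base = list(x0), f(x0)
    delta = list(step)
    for _ in range(max_iter):
        new_point, f_new = explore(f, base, delta)
        if f_new < f_base:
            # Pattern move (Equation 4.9) followed by an exploratory move.
            pattern = [2.0 * n - b for n, b in zip(new_point, base)]
            candidate, f_candidate = explore(f, pattern, delta)
            if f_candidate < f_new:
                base, f_base = candidate, f_candidate
            else:
                base, f_base = new_point, f_new
        elif max(delta) <= eps:
            break                      # increment small enough: terminate
        else:
            delta = [d / alpha for d in delta]
    return base, f_base


def objective(x):
    # Objective function of Equation 4.10.
    return x[0] ** 2 + x[1] ** 2 - 4.0 * x[0] - 4.0 * x[1] - 8.0 - x[0] * x[1]


solution, minimum = hooke_jeeves(objective, [0.0, 0.0], [0.5, 0.5])
```

With the settings of the example (initial step 0.5, α = 2, ε = 0.1), this sketch visits the same base points (1.5, 1.5)T, (3.0, 3.0)T, and (4.0, 4.0)T before the increment is reduced.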
4.3.3
Powell’s Conjugate Direction Method
Powell’s conjugate direction method is one of the most efficient and powerful direct search methods because the conjugate directions method can speed up the convergence for general nonlinear objective functions. This method has been proven (Powell, 1964) convergent for the quadratic objective functions.
4.3.3.1 Conjugate Directions

Consider the following minimization problem with a quadratic objective function:

$$
\text{Minimize} \quad f(\mathbf{x}) = \frac{1}{2} \mathbf{x}^T \mathbf{A} \mathbf{x} + \mathbf{B}^T \mathbf{x} + c \tag{4.22}
$$

The direction vectors Di and Dj are conjugate with respect to A if they satisfy the following relationship:

$$
\mathbf{D}_i^T \mathbf{A} \mathbf{D}_j = 0, \quad i \ne j \tag{4.23}
$$
Powell developed a novel method for constructing the conjugate direction without use of the derivatives of the objective functions. The basic idea is to set a number of independent search directions and correspondingly perform searches along each of this set of directions, starting from the previous best point. For a given direction D and two initial points xA and xB , if yA is the minimum solution of the function f(x) along the direction D from the initial point xA, and yB is the minimum solution of f(x) along the direction D from the initial point xB, then the direction (yB – yA) is conjugate to D. That is,
$$
(\mathbf{y}_B - \mathbf{y}_A)^T \mathbf{A} \mathbf{D} = 0 \tag{4.24}
$$
This defines the parallel subspace property, which states that if two minima are obtained along parallel directions, then the direction specified by the line joining the minima is conjugate with respect to the parallel direction (Belegundu and Chandrupatla, 1999). Figure 4.3 illustrates this property.

FIGURE 4.3 Parallel subspace for generating a conjugate direction. (Schematic: line searches along D1 from xA and xB give the minima yA and yB; the direction D2 = (yA − yB) joins the two minima.)

Also, this parallel subspace property can be extended to the so-called extended parallel subspace property. Assume that the minimum yA is the result after unidirectional searches along each of a number of conjugate (usually coordinate) directions from a given initial point xA, and the minimum yB is the result after unidirectional searches along each of a number of conjugate (usually coordinate) directions from a given initial point xB. Then the direction (yB – yA) is conjugate to all these search directions.

To illustrate the extended parallel subspace property clearly, consider a two-variable minimization problem as illustrated in Figure 4.4. D1 and D2 are the coordinate directions. First, make line searches along D1, D2, and D1. This leads to point 4. Then the direction D3 = (x(4) − x(2)) is conjugate with D1. In the next iteration, the search starts from point 4 along D3, D2, and D3 to reach point 7. Then the direction D4 = (x(7) − x(5)) is conjugate with D3 as well as with D1.

FIGURE 4.4 Extended parallel subspace for generating conjugate direction. (Schematic: search points 1 to 10 linked by line searches along D1, D2, D3, and D4, with the generated directions D3 = (x(4) − x(2)), D4 = (x(7) − x(5)), and D5 = (x(10) − x(8)).)

4.3.3.2 Example

Next, a simple minimization problem that can be worked out by hand is presented to illustrate the procedure of Powell’s conjugate direction method. Consider again the example studied in Section 4.3.2.4. The initial point is x(0) = (0, 0)T. The initial search directions are along the coordinates, D1 = (±1, 0)T and D2 = (0, ±1)T. The termination parameter is ε = 0.001, which means that the search will terminate if the magnitude of the newly generated direction is less than 0.001.

Solution: First, search for the minimum of f(x) along the search direction D1. To find the descending direction (D1 = (1, 0)T or D1 = (–1, 0)T) for f(x), probe tests should be performed as specified in Equation 4.11. It is then found that f decreases along the direction D1 = (1, 0)T. Any point along this direction from the initial point can be denoted as xp = x(0) + λD1, where λ is the step length along
D1. Thus the point xp can be simply expressed as a function of λ as xp = (λ, 0)T; also, the function f(x) can be expressed in terms of λ as

$$
f(\lambda) = \lambda^2 - 4\lambda - 8 \tag{4.25}
$$

To find the optimal step length, λ*, minimize f(λ). As df/dλ = 0 at λ* = 2, x(1) = (2, 0)T is obtained. Next, minimize f(x) along the second direction D2 from x(1). Proceeding in exactly the same way yields x(2) = (2, 3)T. Third, search for the minimum of f(x) along the first search direction D1 from the point x(2) = (2, 3)T, and obtain the minimum x(3) = (3.5, 3)T. Then, according to the extended parallel subspace property, the new conjugate search direction is

$$
\mathbf{D}_3 = \frac{\mathbf{x}^{(3)} - \mathbf{x}^{(1)}}{\left\| \mathbf{x}^{(3)} - \mathbf{x}^{(1)} \right\|} = \frac{(1.5, 3)^T}{\sqrt{11.25}} \tag{4.26}
$$

The magnitude of the search vector is larger than ε = 0.001, and the search continues. The new search directions are D3 and D2 for the next iteration. A minimizing search is carried out along the search direction D3 from the point x(3) = (3.5, 3)T. A new point, x(4) = (4, 4)T, is obtained. One more single-variable search along the direction D2 from the point x(4) = (4, 4)T leads to the point x(5) = (4, 4)T. Minimizing along D3, the point x(6) = (4, 4)T is found. The new conjugate direction is

$$
\mathbf{D}_4 = \mathbf{x}^{(6)} - \mathbf{x}^{(4)} = (0, 0)^T \tag{4.27}
$$

The magnitude of this newly generated conjugate search direction is zero, which is less than ε = 0.001; thus, the search terminates.
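The first cycle of this example, and the conjugacy of the generated direction, can be checked numerically. The following is a minimal sketch rather than a full implementation of Powell’s method: the helper `line_minimum` is an assumption introduced here, and it locates the minimum along a line from three function values by a parabolic fit, which is exact only because the objective of Equation 4.10 is quadratic. A general objective would need a one-dimensional search such as the golden section method.

```python
def objective(x):
    # Objective function of Equation 4.10.
    return x[0] ** 2 + x[1] ** 2 - 4.0 * x[0] - 4.0 * x[1] - 8.0 - x[0] * x[1]


def line_minimum(f, x, d):
    """Minimum of f along x + t*d using three function values only.

    Exact for quadratic f, since f(x + t*d) is then a parabola in t.
    """
    def phi(t):
        return f([xi + t * di for xi, di in zip(x, d)])

    f_minus, f_zero, f_plus = phi(-1.0), phi(0.0), phi(1.0)
    t_star = 0.5 * (f_minus - f_plus) / (f_plus - 2.0 * f_zero + f_minus)
    return [xi + t_star * di for xi, di in zip(x, d)]


d1, d2 = [1.0, 0.0], [0.0, 1.0]           # coordinate directions
x1 = line_minimum(objective, [0.0, 0.0], d1)
x2 = line_minimum(objective, x1, d2)
x3 = line_minimum(objective, x2, d1)
d3 = [x3[0] - x1[0], x3[1] - x1[1]]       # unnormalized direction of Equation 4.26
x4 = line_minimum(objective, x3, d3)

# Hessian of Equation 4.10, used only to verify conjugacy (Equation 4.23).
A = [[2.0, -1.0], [-1.0, 2.0]]
A_d1 = [A[0][0] * d1[0] + A[0][1] * d1[1], A[1][0] * d1[0] + A[1][1] * d1[1]]
conjugacy = d3[0] * A_d1[0] + d3[1] * A_d1[1]
```

The sketch reproduces x(1) = (2, 0)T, x(2) = (2, 3)T, x(3) = (3.5, 3)T, and x(4) = (4, 4)T, and the product D3TAD1 evaluates to zero, as the extended parallel subspace property requires. Normalizing D3 changes only the step length, not the point reached.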
4.4
Gradient-Based Methods
Gradient-based methods are the search methods that make use of the derivative information of the objective functions. Because the gradient information of the objective function is used to determine the search direction efficiently, these methods are usually much faster than the direct search methods. Gradient-based methods include Cauchy’s (steepest descent) method and Newton’s method, as well as Marquardt’s method, etc.
4.4.1
Cauchy’s (Steepest Descent) Method
Cauchy’s method is often called the steepest descent method. The search direction used in this method is the negative of the gradient vector at every iteration point. It requires an initial estimated solution x(0). In the searching process, at the kth iteration, x(k) is replaced by x(k+1), which is a better estimate of the solution. The method earns its name because it uses the steepest descent direction, –∇f(x), as the search direction:

$$
\begin{aligned}
\mathbf{D}_k &= -\nabla f(\mathbf{x}^{(k)}) \\
\mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} + \alpha_k \mathbf{D}_k = \mathbf{x}^{(k)} - \alpha_k \nabla f(\mathbf{x}^{(k)})
\end{aligned}
\tag{4.28}
$$

where αk is the optimal step-length. The basics of the steepest descent method are summarized in the schematic flowchart shown in Figure 4.5. At each iteration, the search direction as well as the optimal step-length in the search direction is calculated.

FIGURE 4.5 Flow chart for the steepest descent algorithm. (Flow: start with the initial point x(0), k = 0, and a termination parameter; calculate ∇f(x(k)) to obtain Dk = –∇f(x(k)); find αk by minimizing f(x(k) + αDk); set x(k+1) = x(k) + αkDk; if the convergence criteria are met, x* = x(k) and stop; otherwise set k = k + 1 and repeat.)

The detailed procedure can be better understood by going through the following simple example:

$$
\text{Minimize} \quad f(x_1, x_2) = (x_1 + 1)^2 + (x_2 + 1)^2 \tag{4.29}
$$
Obviously, the minimum solution of this problem is (−1, −1)T. Here the steepest descent method, starting from the initial guess of (0, 0)T, is used to find it.

Iteration 1: The gradient of f can be derived as

$$
\nabla f = \begin{Bmatrix} \dfrac{\partial f}{\partial x_1} \\ \dfrac{\partial f}{\partial x_2} \end{Bmatrix} = \begin{Bmatrix} 2(x_1 + 1) \\ 2(x_2 + 1) \end{Bmatrix} \tag{4.30}
$$

Thus,

$$
\nabla f_0 = \nabla f(\mathbf{x}^{(0)}) = \begin{Bmatrix} 2 \\ 2 \end{Bmatrix} \tag{4.31}
$$

or

$$
\mathbf{D}_0 = -\nabla f_0 = \begin{Bmatrix} -2 \\ -2 \end{Bmatrix} \tag{4.32}
$$

The optimal step-length can be found by minimizing

$$
f\left( \mathbf{x}^{(0)} + \alpha_0 \mathbf{D}_0 \right) = 2(1 - 2\alpha_0)^2 \tag{4.33}
$$

with respect to α0, which gives α0 = 0.5. Thus,

$$
\mathbf{x}^{(1)} = \mathbf{x}^{(0)} + \alpha_0 \mathbf{D}_0 = \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} \tag{4.34}
$$

To check whether or not the point x(1) is the minimum point, evaluate the gradient ∇f1. The result is ∇f1 = ∇f(x(1)) = (0, 0)T, which confirms that x(1) is indeed the minimum point. Only one iteration is needed to reach the minimum for this simple example. However, for complex practical engineering problems, more iterations are usually required. Because the steepest descent direction uses only the gradient information at each current step, the method is not always truly “steepest descent” on the overall path that leads to the minimum of the problem.
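The iteration of Equation 4.28 can be sketched as follows for the example of Equation 4.29. This is a minimal sketch under stated assumptions: the step-length αk is found by a golden section search on the interval [0, `alpha_max`], where `alpha_max`, the tolerances, and all function names are choices made here for illustration rather than part of the method as described.

```python
import math


def golden_section(phi, lower, upper, tol=1e-8):
    """One-dimensional golden section search used as the line search."""
    tau = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = tau * lower + (1.0 - tau) * upper
    x2 = (1.0 - tau) * lower + tau * upper
    f1, f2 = phi(x1), phi(x2)
    while upper - lower > tol:
        if f1 > f2:
            lower, x1, f1 = x1, x2, f2
            x2 = (1.0 - tau) * lower + tau * upper
            f2 = phi(x2)
        else:
            upper, x2, f2 = x2, x1, f1
            x1 = tau * lower + (1.0 - tau) * upper
            f1 = phi(x1)
    return 0.5 * (lower + upper)


def steepest_descent(f, grad, x0, alpha_max=1.0, tol=1e-6, max_iter=100):
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break                                   # convergence criterion
        d = [-gi for gi in g]                       # D_k = -grad f(x_k)
        alpha = golden_section(
            lambda a: f([xi + a * di for xi, di in zip(x, d)]), 0.0, alpha_max)
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x


def objective(x):
    return (x[0] + 1.0) ** 2 + (x[1] + 1.0) ** 2    # Equation 4.29


def gradient(x):
    return [2.0 * (x[0] + 1.0), 2.0 * (x[1] + 1.0)]  # Equation 4.30


x_star = steepest_descent(objective, gradient, [0.0, 0.0])
```

For this example the line search returns a step-length close to 0.5, and the gradient test is satisfied after the first step. The bracket [0, `alpha_max`] must contain the optimal step-length, which has to be ensured separately for other problems.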
4.4.2
Newton’s Method
Newton’s method uses the first- and second-order derivatives to build the search operators. Consider the Taylor series expansion of the function f(x) at x = x(k):

$$
f(\mathbf{x}) = f(\mathbf{x}^{(k)}) + \nabla f_k^T (\mathbf{x} - \mathbf{x}^{(k)}) + \frac{1}{2} (\mathbf{x} - \mathbf{x}^{(k)})^T \nabla^2 f_k (\mathbf{x} - \mathbf{x}^{(k)}) + O\left( \left\| \mathbf{x} - \mathbf{x}^{(k)} \right\|^3 \right) \tag{4.35}
$$

where the superscript T stands for the transpose of the vector. f(x) can be minimized by setting the first-order derivatives of Equation 4.35 equal to zero:

$$
\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = 0 \tag{4.36}
$$

Thus

$$
\nabla f = \nabla f_k + \nabla^2 f_k (\mathbf{x} - \mathbf{x}^{(k)}) = 0 \tag{4.37}
$$

and

$$
\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \left[ \nabla^2 f_k \right]^{-1} \nabla f_k \tag{4.38}
$$
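Equation 4.38 is used iteratively to find the minimum point. Newton’s method may converge to saddle points (see, for example, Rao, 1996); in order to avoid these problems, the equation is modified as

$$
\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \alpha_k \left[ \nabla^2 f_k \right]^{-1} \nabla f_k \tag{4.39}
$$

The algorithm of Newton’s method is the same as that of the steepest descent method, except that αk is found by

$$
\text{Minimizing} \quad f\left( \mathbf{x}^{(k)} - \alpha_k \left[ \nabla^2 f_k \right]^{-1} \nabla f_k \right) \tag{4.40}
$$

which can be better understood by going through the same simple example studied in Section 4.4.1.

Iteration 1: The Hessian for the function defined in Equation 4.29 is found as

$$
\nabla^2 f(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\ \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dfrac{\partial^2 f}{\partial x_2^2} \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \tag{4.41}
$$

Therefore,

$$
\left[ \nabla^2 f_0 \right]^{-1} = \frac{1}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{4.42}
$$

and Equation 4.31 gives

$$
\nabla f_0 = \nabla f(\mathbf{x}^{(0)}) = \begin{Bmatrix} 2 \\ 2 \end{Bmatrix} \tag{4.43}
$$

Using Equation 4.38 gives

$$
\mathbf{x}^{(1)} = \mathbf{x}^{(0)} - \left[ \nabla^2 f_0 \right]^{-1} \nabla f_0 = \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} \tag{4.44}
$$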
As shown in Section 4.4.1, the point found is the minimum point. The value [∇2fk]–1∇fk must be evaluated at each step. This is impractical for problems with a complex objective function and a large number of variables, because the computation of both the gradient and the Hessian can be very expensive.

Cauchy’s method works well if the initial point is far away from the minimum point, and Newton’s method works well when the initial point is close to the minimum point. Marquardt’s method (1963) was proposed to combine the advantages of Cauchy’s method and Newton’s method. In Marquardt’s method, Cauchy’s method is used first, followed by Newton’s method. This method modifies the diagonal elements of the Hessian matrix as

$$
\left[ \nabla^2 f_k \right]' = \nabla^2 f_k + \lambda^2 \mathbf{I} \tag{4.45}
$$

Then the iteration algorithm of Marquardt’s method can be written, following Newton’s method, as

$$
\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \alpha_k \left[ \nabla^2 f_k + \lambda^2 \mathbf{I} \right]^{-1} \nabla f_k \tag{4.46}
$$

Initially, a sufficiently large λ is employed, and the search is similar to that of Cauchy’s method. After a number of iterations, λ is gradually reduced, and the search is performed as in Newton’s method to pinpoint the minimum.
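A single step of the Newton iteration, with an optional diagonal shift of the Hessian in the spirit of Marquardt’s modification, can be sketched as follows for the two-variable example of Equation 4.29. This is an illustrative sketch only: the names `solve_2x2`, `shifted_newton_step`, and `shift` are assumptions introduced here, the linear solve is written out for two variables to keep the block self-contained, and `shift` simply stands for whatever term is added to the diagonal of the Hessian.

```python
def solve_2x2(matrix, rhs):
    """Solve a 2-by-2 linear system by Cramer's rule."""
    det = matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    return [(rhs[0] * matrix[1][1] - matrix[0][1] * rhs[1]) / det,
            (matrix[0][0] * rhs[1] - rhs[0] * matrix[1][0]) / det]


def shifted_newton_step(x, grad, hess, shift=0.0, alpha=1.0):
    """One step x - alpha * [H + shift*I]^(-1) * grad.

    shift = 0 gives the Newton step of Equation 4.38 (with alpha = 1);
    a large shift gives a short step along the steepest descent direction.
    """
    g = grad(x)
    h = hess(x)
    shifted = [[h[0][0] + shift, h[0][1]],
               [h[1][0], h[1][1] + shift]]
    p = solve_2x2(shifted, g)
    return [x[0] - alpha * p[0], x[1] - alpha * p[1]]


def gradient(x):
    return [2.0 * (x[0] + 1.0), 2.0 * (x[1] + 1.0)]   # Equation 4.30


def hessian(x):
    return [[2.0, 0.0], [0.0, 2.0]]                    # Equation 4.41


newton_point = shifted_newton_step([0.0, 0.0], gradient, hessian)
damped_point = shifted_newton_step([0.0, 0.0], gradient, hessian, shift=1000.0)
```

With no shift, the step lands on (−1, −1)T as in Equation 4.44. With a large shift, the step is short and points along −∇f, which is the Cauchy-like behavior described above for large λ.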
4.4.3
Conjugate Gradient Method
The concept of the conjugate gradient method is to obtain the set of conjugate gradient directions at each iteration from an orthogonalization of the successive gradients. It uses only the first derivative of the objective function. Fletcher and Reeves (1963) provided the following conjugate search direction:

$$
\mathbf{D}_k = -\nabla f_k + \frac{\left| \nabla f_k \right|^2}{\left| \nabla f_{k-1} \right|^2} \mathbf{D}_{k-1} \tag{4.47}
$$

where D0 = −∇f(x(0)). The conjugate gradient algorithm is outlined in Figure 4.6.

FIGURE 4.6 Flow chart of the conjugate gradient algorithm. (Flow: start with the initial point x(0), k = 0, and the termination parameters; calculate ∇f(x(0)) to obtain D0 = –∇f(x(0)); find α0 by minimizing f(x(0) + α0D0); set x(1) = x(0) + α0D0 and k = 1; compute Dk from Equation 4.47; find αk by minimizing f(x(k) + αkDk); set x(k+1) = x(k) + αkDk; if the convergence criteria are met, x* = x(k) and stop; otherwise set k = k + 1 and repeat.)

Consider the following test function, examined by Ronald (2000), which reveals the detailed procedure of the conjugate gradient method:

$$
\text{Minimize} \quad f(x_1, x_2) = x_1^2 + x_2^2 - 4x_1 - 5x_2 - 5 - x_1 x_2 \tag{4.48}
$$
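Here it is necessary to find the minimum solution using the conjugate gradient method from the initial point (0, 0)T.

Solution

Iteration 1: The gradient of f can be derived as

$$
\nabla f = \begin{Bmatrix} \dfrac{\partial f}{\partial x_1} \\ \dfrac{\partial f}{\partial x_2} \end{Bmatrix} = \begin{Bmatrix} 2x_1 - x_2 - 4 \\ 2x_2 - x_1 - 5 \end{Bmatrix} \tag{4.49}
$$

Thus,

$$
\nabla f_0 = \nabla f(\mathbf{x}^{(0)}) = \begin{Bmatrix} -4 \\ -5 \end{Bmatrix} \tag{4.50}
$$

or

$$
\mathbf{D}_0 = -\nabla f_0 = \begin{Bmatrix} 4 \\ 5 \end{Bmatrix} \tag{4.51}
$$

The search direction of the first iteration is taken as D0. To obtain the optimal step-length α0 along the direction D0, carry out the following minimization with respect to α0:

$$
f\left( \mathbf{x}^{(0)} + \alpha_0 \mathbf{D}_0 \right) = 21(\alpha_0)^2 - 41\alpha_0 - 5 \tag{4.52}
$$

Using

$$
\frac{\partial f}{\partial \alpha_0} = 42\alpha_0 - 41 = 0 \tag{4.53}
$$

yields α0 = 41/42. Thus,

$$
\mathbf{x}^{(1)} = \mathbf{x}^{(0)} + \alpha_0 \mathbf{D}_0 = \begin{Bmatrix} 3.90476 \\ 4.88095 \end{Bmatrix} \tag{4.54}
$$

To check whether or not the point x(1) is the minimum point, ∇f1 can be evaluated. Because ∇f1 = ∇f(x(1)) = (−1.07143, 0.85714)T ≠ (0, 0)T, x(1) is not yet the minimum point.

Iteration 2: Equation 4.47 provides the search direction of this iteration, yielding

$$
\mathbf{D}_1 = -\nabla f_1 + \frac{\left| \nabla f_1 \right|^2}{\left| \nabla f_0 \right|^2} \mathbf{D}_0 \tag{4.55}
$$

where the following values can be obtained:

$$
\left| \nabla f_0 \right|^2 = 41 \quad \text{and} \quad \left| \nabla f_1 \right|^2 = 1.88266 \tag{4.56}
$$

Thus,

$$
\mathbf{D}_1 = \begin{Bmatrix} 1.07143 \\ -0.85714 \end{Bmatrix} + \frac{1.88266}{41} \begin{Bmatrix} 4 \\ 5 \end{Bmatrix} = \begin{Bmatrix} 1.25510 \\ -0.62755 \end{Bmatrix} \tag{4.57}
$$

α1 is found by minimizing the objective function with respect to α1:

$$
\frac{\partial f\left( \mathbf{x}^{(1)} + \alpha_1 \mathbf{D}_1 \right)}{\partial \alpha_1} = 0 \tag{4.58}
$$

which gives α1 = 0.34146. Thus,

$$
\mathbf{x}^{(2)} = \mathbf{x}^{(1)} + \alpha_1 \mathbf{D}_1 = \begin{Bmatrix} 4.33333 \\ 4.66667 \end{Bmatrix} \tag{4.59}
$$

To check whether or not the point x(2) is the minimum point, evaluate ∇f2. Because ∇f2 = ∇f(x(2)) = (0, 0)T, x(2) is indeed the minimum point.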
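The two iterations above can be reproduced with a short program. The following is a minimal sketch of the Fletcher–Reeves update of Equation 4.47 applied to Equation 4.48; the names `line_step` and `fletcher_reeves` are assumptions introduced here, and the step-length is obtained from a three-point parabolic fit that is exact only for quadratic objectives such as this one. A general objective would need a proper one-dimensional search.

```python
def line_step(f, x, d):
    """Step length minimizing f(x + t*d); exact when f is quadratic."""
    def phi(t):
        return f([xi + t * di for xi, di in zip(x, d)])

    f_minus, f_zero, f_plus = phi(-1.0), phi(0.0), phi(1.0)
    return 0.5 * (f_minus - f_plus) / (f_plus - 2.0 * f_zero + f_minus)


def fletcher_reeves(f, grad, x0, tol=1e-8, max_iter=50):
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                          # D_0 = -grad f(x_0)
    for _ in range(max_iter):
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq < tol ** 2:
            break                                  # gradient is (nearly) zero
        alpha = line_step(f, x, d)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        beta = sum(gi * gi for gi in g_new) / g_norm_sq
        d = [-gn + beta * di for gn, di in zip(g_new, d)]   # Equation 4.47
        g = g_new
    return x


def objective(x):
    # Objective function of Equation 4.48.
    return x[0] ** 2 + x[1] ** 2 - 4.0 * x[0] - 5.0 * x[1] - 5.0 - x[0] * x[1]


def gradient(x):
    return [2.0 * x[0] - x[1] - 4.0, 2.0 * x[1] - x[0] - 5.0]   # Equation 4.49


x_star = fletcher_reeves(objective, gradient, [0.0, 0.0])
```

For a quadratic function of two variables with exact line searches, the method reaches the minimum (13/3, 14/3)T in two iterations, in agreement with Equation 4.59.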
4.5
Nonlinear Least Squares Method
As mentioned in Section 4.1, common engineering inverse problems can always be formulated as minimization problems with properly defined objective functions, such as a sum of the squares of other nonlinear functions. Note that such an objective function can never take a negative value. The minimization of this kind of objective function can be formulated as a nonlinear least squares problem. Fletcher (1987) has conducted detailed theoretical investigations on this type of problem and provided a number of numerical treatments. This section discusses only some of the effective methods available for solving nonlinear least squares problems and introduces some useful packages for practical applications.
4.5.1
Derivations of Objective Functions
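The least squares problem can be written as

$$
\text{Minimize} \quad f(\mathbf{x}) = \mathbf{R}^T(\mathbf{x}) \mathbf{R}(\mathbf{x}) \tag{4.60}
$$

where R(x) = Yp(x) − Ym(x) is the vector of errors, discrepancies, or residuals, and can be formed in the following manner:

$$
\mathbf{R}(\mathbf{x}) = \begin{bmatrix} R_1(\mathbf{x}) & R_2(\mathbf{x}) & \cdots & R_{n_s}(\mathbf{x}) \end{bmatrix} \tag{4.61}
$$

The first partial derivative of the objective function is given as

$$
\nabla f(\mathbf{x}) = 2 (\nabla \mathbf{R}(\mathbf{x}))^T \mathbf{R}(\mathbf{x}) \tag{4.62}
$$

Similarly, the second derivative can be obtained by differentiating Equation 4.62 with respect to x:

$$
\nabla^2 f(\mathbf{x}) = 2 (\nabla \mathbf{R}(\mathbf{x}))^T \nabla \mathbf{R}(\mathbf{x}) + 2\mathbf{B}(\mathbf{x}) \tag{4.63}
$$

where $\mathbf{B}(\mathbf{x}) = \sum_{i=1}^{n_s} R_i(\mathbf{x}) \nabla^2 R_i(\mathbf{x})$.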
However, for real engineering problems the objective function is often not differentiable, the expressions for the partial derivatives cannot be explicitly defined, or the partial derivatives cannot be easily computed. The most straightforward remedy is to estimate the derivatives from the objective function by a numerical method (for instance, the finite difference method is commonly used to estimate the Jacobian).
4.5.2
Newton’s Method
Newton’s method is first applied to solve the nonlinear least squares problem, the basic iteration of which is given via Equation 4.38 as

$$
\begin{aligned}
\left( (\nabla \mathbf{R}(\mathbf{x}))^T \nabla \mathbf{R}(\mathbf{x}) + \mathbf{B}(\mathbf{x}) \right) \mathbf{p}^{(k)} &= -(\nabla \mathbf{R}(\mathbf{x}))^T \mathbf{R}(\mathbf{x}) \\
\mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} + \mathbf{p}^{(k)}
\end{aligned}
\tag{4.64}
$$

The iterations based on Equation 4.64 can converge very fast for the nonlinear least squares problem. The problem with the Newton approach is that $\mathbf{B}(\mathbf{x}) = \sum_{i=1}^{n_s} R_i(\mathbf{x}) \nabla^2 R_i(\mathbf{x})$ is usually unavailable or inconvenient to obtain; also, it is too expensive to approximate by finite difference methods (Dennis and Schnabel, 1996). Therefore, simplification of B(x) is important to achieve better computational efficiency. There are largely two classes of algorithms for nonlinear least squares problems: those that ignore B(x), which are called small residual algorithms, and those that approximate it in some way, which are called large residual algorithms. Because R(x) is minimized in the least squares sense, the components of B(x) are often small. Next, two small residual algorithms that make use of this property of B(x) are presented.
4.5.3
The Gauss–Newton Method
Newton’s method can first be used to minimize the sum-of-squares function defined by Equation 4.60. Using the special form of the gradient vector and the Hessian matrix specified in Equation 4.62 and Equation 4.63, and neglecting B(x), the direction of search Dk can be obtained from the equivalent form

$$
\left( (\nabla \mathbf{R}(\mathbf{x}))^T \nabla \mathbf{R}(\mathbf{x}) \right) \mathbf{D}_k = -(\nabla \mathbf{R}(\mathbf{x}))^T \mathbf{R}(\mathbf{x}) \tag{4.65}
$$

which, together with the step iteration

$$
\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k \mathbf{D}_k \tag{4.66}
$$

is called the Gauss–Newton method. The simple algorithm of this method is illustrated in Figure 4.7. More discussion of the advantages as well as the convergence of the Gauss–Newton method can be found in Scales (1985).

FIGURE 4.7 Flow chart of the Gauss–Newton algorithm for nonlinear least squares problems. (Flow: start with the initial point x(0), k = 0, and the termination parameters; calculate ∇R(x(k)) and obtain Dk from Equation 4.65; find αk by minimizing f(x(k) + αkDk); set x(k+1) = x(k) + αkDk; if the convergence criteria are met, x* = x(k) and stop; otherwise set k = k + 1 and repeat.)

Consider the following problem provided in Fletcher’s book (1987):

$$
\begin{aligned}
R_1(x) &= x + 1 \\
R_2(x) &= 0.1x^2 + x - 1
\end{aligned}
\tag{4.67}
$$

for which the minimum solution is x* = 0.
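Solution
Start from $x^{(0)} = 1$. For numerical simplicity, α is fixed at 1.
Iteration 1: First, the function value at $x^{(0)} = 1$ is
$$\mathbf{R}(x^{(0)}) = \begin{bmatrix} 2 \\ 0.1 \end{bmatrix}$$
The Jacobian matrix is
$$\nabla\mathbf{R}(x) = \begin{bmatrix} 1 \\ 0.2x + 1 \end{bmatrix} \tag{4.68}$$
Thus, Equation 4.65 reduces to $2.44\,D_0 = -2.12$, with
$$\nabla\mathbf{R}(x^{(0)}) = \begin{bmatrix} 1 \\ 1.2 \end{bmatrix} \quad\text{and}\quad D_0 = -0.86885$$
and
$$x^{(1)} = x^{(0)} + D_0 = 0.13115 \tag{4.69}$$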
Iteration 2: From $x^{(1)}$, obtain
$$\nabla\mathbf{R}(x^{(1)}) = \begin{bmatrix} 1 \\ 1.02623 \end{bmatrix} \quad\text{and}\quad D_1 = -0.11751$$
and
$$x^{(2)} = x^{(1)} + D_1 = 0.01364 \tag{4.70}$$
Continuing in the same way, four iterations give $x^{(4)} = 0.00014$. Gill and Murray (1978) modified the Gauss–Newton method to improve its efficiency by improving the Hessian approximation.
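The iterations of this example can be reproduced with a few lines of code. The following is a minimal sketch, assuming the residuals of Equation 4.67 and a fixed step length α = 1; the function names are illustrative and do not come from the text.

```python
def residuals(x):
    """Residual vector R(x) of Equation 4.67."""
    return [x + 1.0, 0.1 * x * x + x - 1.0]


def jacobian(x):
    """Jacobian of R(x) (a single column), Equation 4.68."""
    return [1.0, 0.2 * x + 1.0]


def gauss_newton(x0, iterations, alpha=1.0):
    """Return the iterates x(0), x(1), ... using a fixed step length alpha."""
    xs = [x0]
    x = x0
    for _ in range(iterations):
        r = residuals(x)
        j = jacobian(x)
        jtj = sum(ji * ji for ji in j)               # J^T J (a scalar here)
        jtr = sum(ji * ri for ji, ri in zip(j, r))   # J^T R (a scalar here)
        d = -jtr / jtj                               # search direction, Equation 4.65
        x = x + alpha * d                            # update, Equation 4.66
        xs.append(x)
    return xs


iterates = gauss_newton(1.0, 4)
for k, xk in enumerate(iterates):
    print(f"x({k}) = {xk:.5f}")
```

Because the problem has a single unknown, the normal equations reduce to a scalar division; for several unknowns a linear solver would be needed in place of that division.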
4.5.4
The Levenberg–Marquardt Method
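Equation 4.65 can be modified as
$$\left((\nabla\mathbf{R}(\mathbf{x}))^T\,\nabla\mathbf{R}(\mathbf{x}) + \mu_k\mathbf{I}\right)\mathbf{D}_k = -(\nabla\mathbf{R}(\mathbf{x}))^T\,\mathbf{R}(\mathbf{x}) \tag{4.71}$$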
where $\mu_k \ge 0$, and I is the unit matrix of order $n_s$. Equation 4.66 is still used in each iteration. This method was first suggested by Levenberg and Marquardt (1963) and is known as the Levenberg–Marquardt method. It is effective for dealing with (ill-posed) problems related to singularity in the matrix and is also an effective algorithm for small residual problems. Many versions of the Levenberg–Marquardt method have been implemented using various schemes to select $\mu_k$. Among them, the modified Levenberg–Marquardt method developed by Moré (1977) has proven to be one of the most successful schemes in practice; it is recommended as a general solver for nonlinear least squares problems. Dennis and Schnabel (1996) identified several preferable features of the Levenberg–Marquardt method: the step is close to being in Cauchy’s direction and is often superior to the Gauss–Newton step, and it is well defined even when the Jacobian matrix ∇R(x) is not of full column rank. The use of $\mu_k$ in Equation 4.71 is quite similar to the use of α in the Tikhonov regularization method (Chapter 3) for solving ill-posed problems. Thus the Levenberg–Marquardt method is a very efficient inverse operator for solving ill-posed engineering inverse problems formulated as nonlinear least squares problems.
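The effect of the damping term on a rank-deficient Jacobian can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the two-parameter Jacobian and residual values are made up for the example, and the 2-by-2 system of Equation 4.71 is solved by Cramer’s rule rather than by a library routine.

```python
def lm_direction(jac, res, mu):
    """Solve (J^T J + mu I) D = -J^T R for a two-parameter problem."""
    # Entries of the symmetric matrix A = J^T J + mu I
    a11 = sum(row[0] * row[0] for row in jac) + mu
    a12 = sum(row[0] * row[1] for row in jac)
    a22 = sum(row[1] * row[1] for row in jac) + mu
    # Right-hand side b = -J^T R
    b1 = -sum(row[0] * r for row, r in zip(jac, res))
    b2 = -sum(row[1] * r for row, r in zip(jac, res))
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ZeroDivisionError("J^T J + mu I is singular")
    # Cramer's rule for the 2-by-2 system
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Hypothetical data: the second column of J is twice the first, so J^T J is singular.
J = [[1.0, 2.0], [2.0, 4.0]]
R = [1.0, 1.0]

try:
    lm_direction(J, R, mu=0.0)       # Gauss-Newton step: not defined here
except ZeroDivisionError as error:
    print("mu = 0:", error)

print("mu = 0.1:", lm_direction(J, R, mu=0.1))
```

With $\mu_k = 0$ the step is the Gauss–Newton step and cannot be computed for this Jacobian; any $\mu_k > 0$ makes the matrix positive definite, so the direction is well defined.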
4.5.5
Software Packages
Many inverse problems can be defined in the functional form of the L2 norm and solved as a nonlinear least squares problem subject to simple bounds on the variables. The problem can be stated as
$$\text{Minimize } f(\mathbf{x}) = \mathbf{R}^T(\mathbf{x})\,\mathbf{R}(\mathbf{x}) \quad \text{subject to } \mathbf{x}^L \le \mathbf{x} \le \mathbf{x}^U \tag{4.72}$$
The routines BCLSF and BCLSJ provided by the IMSL mathematical and statistical libraries can be used to solve this problem. Both routines use a modified Levenberg–Marquardt method. Routine BCLSF uses a finite-difference method to estimate the Jacobian. Whenever the exact Jacobian can be easily provided, routine BCLSJ should be used. The subroutine BCLSF is employed in the combined optimization method presented in Chapter 5, where it is used as the gradient-based method at the second stage of the combination algorithm; in Chapter 8 and Chapter 9 for the determination of material properties; and in Chapter 11 for crack detection in laminates.
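To illustrate what estimating the Jacobian by finite differences involves, the following sketch uses a simple forward difference. It is not the IMSL implementation; the function name, the step size h, and the test residuals (taken from Equation 4.67) are assumptions made for illustration.

```python
def forward_difference_jacobian(residuals, x, h=1e-6):
    """Approximate the Jacobian dR_i/dx_j by forward differences."""
    r0 = residuals(x)
    jac = [[0.0] * len(x) for _ in r0]
    for j in range(len(x)):
        x_perturbed = list(x)
        x_perturbed[j] += h                  # perturb one parameter at a time
        r1 = residuals(x_perturbed)
        for i in range(len(r0)):
            jac[i][j] = (r1[i] - r0[i]) / h
    return jac


def example_residuals(x):
    """Residuals of Equation 4.67 written for a one-element parameter list."""
    return [x[0] + 1.0, 0.1 * x[0] ** 2 + x[0] - 1.0]


approx = forward_difference_jacobian(example_residuals, [1.0])
print(approx)   # close to the exact Jacobian column [1, 1.2] of Equation 4.68
```

Each column costs one extra evaluation of the residual vector, which is why supplying an exact Jacobian is preferable when it is easy to derive.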
4.6
Root Finding Methods
Root finding methods for systems of nonlinear equations can also be used as the inverse analysis operator to find an inverse solution that makes the error function vanish. This is just a slightly different viewpoint of the inverse problem. In the following sections, two root finding methods are briefly introduced; their practical applications will be addressed in Chapter 12 for flaw detection in sandwich structures.
4.6.1
Newton’s Root Finding Method
Newton’s root finding method is used directly to solve the nonlinear system for the parameters. Newton’s method uses an iterative process to approach a root of a function f(x). Beginning with an initial trial value $x^{(0)}$, each successive solution is obtained through
$$x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})} \tag{4.73}$$
where $x^{(k)}$ is the solution obtained in the previous iteration, $f(x^{(k)})$ and $f'(x^{(k)})$ represent the value of the function and its derivative at $x^{(k)}$, respectively, and $x^{(k+1)}$ is the current iteration result. When $x^{(k+1)}$ converges to a value, it will be a root of the function. For the nonlinear equation system f(x) = 0 ($f_i(x_1, \ldots, x_n) = 0$, i = 1, …, n), the similar iteration formula is given as
$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - (\nabla\mathbf{f}_k)^{-1}\,\mathbf{f}(\mathbf{x}^{(k)}) \tag{4.74}$$
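The scalar iteration of Equation 4.73 can be sketched as follows. The test function $f(x) = x^2 - 2$, the tolerance, and the iteration limit are assumptions chosen only to make the example concrete.

```python
def newton_root(f, fprime, x0, tol=1e-10, max_iter=50):
    """Iterate Equation 4.73 until successive estimates differ by at most tol."""
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished; the Newton step is undefined")
        x_new = x - f(x) / slope
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical test function with a known root at the square root of 2
root = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```

For a system of equations the division by the derivative becomes the solution of a linear system with the Jacobian matrix, as in Equation 4.74.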
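[Figure 4.8 flow chart: initial guess of parameters $\mathbf{x}^{(0)}$ → calculation of the Jacobian matrix at $\mathbf{x}^{(k)}$ → update parameters from $(\nabla\mathbf{f}_k + \Delta_k)(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}) = -\mathbf{f}(\mathbf{x}^{(k)})$ → check whether $\|\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\| <$ tolerance: if no, set k = k + 1 and recalculate the Jacobian matrix; if yes, output the parameters.]
FIGURE 4.8 Flow chart of Newton’s root finding method.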
In the numerical implementation, the iteration stops when the specified accuracy, $\|\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\| \le \varepsilon$, is reached. Note that, to ensure that the Jacobian matrix is invertible, the number of measurements must equal the number of parameters and the columns of the Jacobian matrix must be linearly independent. In this case, Newton’s root finding method can reach the solution very quickly if it converges. However, it has only local convergence properties: depending on the initial guess, it may fail to converge or may converge to values outside the physically valid region. To improve the performance of Newton’s method while retaining the fast convergence rate, a modification is made to correct the iteration step size when necessary:
$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \left(\nabla\mathbf{f}_k + \Delta_k\right)^{-1}\mathbf{f}(\mathbf{x}^{(k)}) \tag{4.75}$$
where $\Delta_k$ is a diagonal matrix chosen to ensure that f(x) → 0 so that the solution converges. To ensure that the solution falls into the physically feasible region, upper and lower bounds are applied to constrain the parameters, $\mathbf{x}^L \le \mathbf{x} \le \mathbf{x}^U$. Figure 4.8 shows the flow chart of Newton’s root finding method.
4.6.2
Levenberg–Marquardt Method
Despite its high convergence rate and accuracy, Newton’s root finding method is not practically useful for inverse problems because it is sensitive to random measurement error. The Gauss–Newton method can be applied to get a robust approximate solution using measurement data that may contain noise.
For the case in which the number of measurement data is greater than the number of parameters to be detected, it is impossible to solve Equation 4.74 because the Jacobian matrix is not square. An approximate solution can be given as
$$\nabla\mathbf{f}_k^T\,\nabla\mathbf{f}_k\,\Delta\mathbf{x}_{k+1} = -\nabla\mathbf{f}_k^T\,\mathbf{f}(\mathbf{x}^{(k)}) \tag{4.76}$$
where
$$\Delta\mathbf{x}_{k+1} = \mathbf{x}^{(k+1)} - \mathbf{x}^{(k)} \tag{4.77}$$
Similarly to the procedure for Newton’s method, the Gauss–Newton method is modified to improve the convergence performance:
$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \left[\nabla\mathbf{f}_k^T\,\nabla\mathbf{f}_k + \mu_k\Delta_k\right]^{-1}\nabla\mathbf{f}_k^T\,\mathbf{f}(\mathbf{x}^{(k)}) \tag{4.78}$$
where $\mu_k$ is a positive scalar called the damping parameter, which is gradually reduced in magnitude as the iteration proceeds and serves to improve the condition number and to regularize the iteration process, and $\Delta_k$ is a diagonal matrix. The iteration terminates if $\|\mathbf{f}(\mathbf{x})\| \le \delta$ or if $\|\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\| \le \varepsilon$. This is the iterative Levenberg–Marquardt method. It is a slightly different form of the method described in Section 4.5.4 and gives the estimation of the parameters based on the minimization of the least squares of the error norm. It is expected to be robust to random measurement errors. To avoid ill-posedness, it is often effective to increase the number of measurements or to select measurements that are sensitive to the parameter variation. The number of measurements must at least equal the number of parameters. As discussed in Section 4.5.4, the use of $\mu_k$ in Equation 4.78 is quite similar to the use of α in the Tikhonov regularization method (Chapter 3) for solving ill-posed problems. Thus the Levenberg–Marquardt method is a very efficient and stable inverse operator when noisy measurements are used. The numerical procedure is the same as that of Newton’s method given in Section 4.6.1, except that Equation 4.75 is replaced by Equation 4.78.
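A small overdetermined example may help to make Equation 4.78 concrete. The sketch below is an illustration only: the forward model $y(t) = e^{-pt}$, the synthetic noise-free measurements, the choice $\Delta_k = 1$, and the halving schedule for $\mu_k$ are all assumptions, not prescriptions from the text.

```python
import math


def lm_fit_decay(times, data, p0, mu0=1.0, eps=1e-10, max_iter=100):
    """Estimate p in the hypothetical forward model y(t) = exp(-p t)."""
    p, mu = p0, mu0
    for _ in range(max_iter):
        # Error vector f(x) and its Jacobian (one column, one row per measurement)
        f = [math.exp(-p * t) - y for t, y in zip(times, data)]
        jac = [-t * math.exp(-p * t) for t in times]
        jtj = sum(j * j for j in jac)
        jtf = sum(j * fi for j, fi in zip(jac, f))
        step = -jtf / (jtj + mu)     # Equation 4.78 with Delta_k = 1 (assumed)
        p += step
        mu *= 0.5                    # damping reduced as the iteration proceeds (assumed schedule)
        if abs(step) <= eps:         # stopping test on the parameter change
            break
    return p


times = [0.5, 1.0, 1.5, 2.0, 2.5]
data = [math.exp(-0.7 * t) for t in times]   # five measurements, one unknown
estimate = lm_fit_decay(times, data, p0=0.2)
print(estimate)
```

Five measurements are used to determine one parameter, so the Jacobian is not square and Equation 4.74 cannot be applied directly; the damped normal equations still give a well-defined update.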
4.7
Remarks
• Remark 4.1: The golden section method for estimating the maximum and minimum of a one-variable function is a popular classic technique. Hooke and Jeeves’ method is one of the most commonly used direct search methods. In this method, a combination of exploratory and pattern moves is made iteratively to find the most profitable search directions. Powell’s conjugate direction method is the most efficient and powerful direct search method, as the use of conjugate directions can speed up convergence for general nonlinear objective functions. However, the gradient-based methods are generally more efficient than the direct search methods.
• Remark 4.2: Gradient-based methods generate the search directions iteratively using the derivatives of the objective function. Cauchy’s method (or the steepest descent method) is the foundation of the gradient-based methods, but it is not an efficient method for many engineering problems due to its poor rate of convergence, especially for problems whose system matrix has a large condition number. Newton’s method is the most rapidly convergent method because it uses the Hessian matrix of the objective function to determine the search directions. However, this method does not guarantee a descent direction when the Hessian matrix is indefinite or singular. Marquardt’s method was proposed to combine the advantages of Cauchy’s method and Newton’s method. The conjugate gradient method obtains a set of conjugate directions from an orthogonalization of the successive gradients; it is one of the most efficient methods.
• Remark 4.3: The special structure of the derivatives of the sum-of-squares function paves the way for specialized methods to solve this type of objective function. The Gauss–Newton method is the fundamental method, and the Levenberg–Marquardt method is a very efficient and practical method for solving engineering even- and over-posed inverse problems formulated as nonlinear least squares problems.
• Remark 4.4: The direct root finding methods for systems of nonlinear equations can also be used to solve inverse problems. Newton’s root finding method and the Levenberg–Marquardt method have been introduced in this chapter. The Levenberg–Marquardt method is a very efficient and practical method for solving engineering inverse problems for which the parameters are explicitly expressed in a linear form in the forward system equations.
• Remark 4.5: All the algorithms are so-called local methods, are available as standard subroutines in libraries of mathematics and scientific computing, and can be directly used for inverse analyses.
4.8
Some References for Optimization
Belegundu, A.D. and T.R. Chandrupatla, Optimization Concepts and Applications in Engineering, Prentice Hall, Inc., Englewood Cliffs, NJ, 1999.
Deb, K., Optimization for Engineering Design: Algorithms and Examples, Prentice Hall of India, New Delhi, 1998.
Dennis, J.E. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Society for Industrial and Applied Mathematics, Philadelphia, 1996.
Fletcher, R., Practical Methods of Optimization (2nd ed.), John Wiley & Sons, New York, 1987.
Moré, J.J., The Levenberg–Marquardt algorithm: implementation and theory, in Numerical Analysis, G.A. Watson, Ed., Lecture Notes in Mathematics, 630, Springer-Verlag, Berlin, 105–116, 1977.
Morris, A.J., Foundations of Structural Optimization: A Unified Approach, John Wiley & Sons, New York, 1982.
Rao, S.S., Engineering Optimization: Theory and Practice (3rd ed.), John Wiley & Sons, Inc., New York, 1996.
Scales, L.E., Introduction to Nonlinear Optimization, Macmillan, U.K., 1985.
Walsh, G.R., Methods of Optimization, John Wiley & Sons, New York, 1975.
5 Genetic Algorithms
Chapter 4 gave a concise description of traditional optimization methods that have a long history of development and application. This chapter introduces a “nontraditional” search or optimization method, known as the genetic algorithm (GA), that has become a promising search algorithm for complex engineering problems. The word “nontraditional” may not be appropriate, because GAs are already widely used; however, the word is used here to distinguish GAs from the methods discussed in Chapter 4. Over the past two decades, many different versions of GAs have been developed. Combinations of GAs with traditional optimization methods, or hybrid GAs, have also been proposed by many and have proved to be very effective for a large number of problems. This chapter describes the basic concept of the GA, and then some modified GAs with an emphasis on versions of the intergeneration projection GAs (IP-GAs), as well as the method that combines GAs with gradient-based methods. A large portion of this book is devoted to GAs because they are particularly useful for inverse problems, which are usually very complex in nature, for which the global optimum is always required, and for which forward solvers are often expensive. Most of the methods will be employed in solving the inverse problems presented in Chapter 7 through Chapter 13.
5.1
Introduction
Genetic algorithms are computational techniques for searching for the optimum of complex objective (fitness) functions, based on a process that simulates Darwin’s theory of natural evolution. In 1975, Holland established the theoretical foundation that initiated most contemporary developments of GAs. Since then, extensive research has been carried out on the theoretical investigation and engineering application of GAs. Due mainly to their applicability to problems with very complex objective functions, GAs have been successful in a wide variety of scientific fields such as computational search algorithms, optimization, and machine learning. As effective optimization techniques, GAs have also been extremely successful in structural optimization problems and appear to be promising in dealing with complex, nonlinear multimodal optimization problems, including inverse problems.
5.2
Basic Concept of GAs
GAs emulate the survival-of-the-fittest principle of nature to perform the search and are naturally formulated for maximization problems. They are also applicable to minimization problems because these can easily be converted into maximization problems. Consider the following optimization problem to present the basic concept of the GA:
$$\text{Maximize } f(\mathbf{x}) \quad \text{subject to } \mathbf{x}^L \le \mathbf{x} \le \mathbf{x}^U \tag{5.1}$$
where x is the vector of parameters, $\mathbf{x} = \{x_1, x_2, \ldots, x_N\}^T$. The superscripts L and U represent the lower and upper bounds of the parameters, respectively.
5.2.1
Coding
In a plain GA program, each parameter, $x_i$ (i = 1, 2, …, N), of a given problem should be coded into a finite-length string according to one of the coding methods, among which binary coding is the simplest and most popular. A so-called chromosome is formed as a super string that combines all these finite-length strings and represents an individual (a candidate solution to the given problem). After the optimal individual is found, it is decoded back to the physical parameters. It should be noted that binary coding of the parameters is not absolutely necessary. As will be illustrated in Section 5.3.2, the parameters can be directly used in the so-called real parameter coded GA. Here, the popular binary coding is used to illustrate the process of GAs. The objective function is often defined using continuous parameters. GAs, however, operate on the (binary-encoded) discrete parameters. Therefore, these parameters in a continuous space should first be discretized and then encoded in binary form. The mathematical formulation for the binary encoding and decoding of the ith parameter can be given as (Haupt and Haupt, 1998)
Encoding
$$\bar{x}_i = \frac{x_i - x_i^L}{x_i^U - x_i^L} \tag{5.2}$$
$$\text{gene}[m] = \text{round}\left\{\bar{x}_i - 2^{-m} - \sum_{k=1}^{m-1}\text{gene}[k]\,2^{-k}\right\} \tag{5.3}$$
Decoding
$$x_i^{qn} = \sum_{m=1}^{N_{gene}}\text{gene}[m]\,2^{-m} + 2^{-(N_{gene}+1)}, \qquad x_i^q = x_i^{qn}\left(x_i^U - x_i^L\right) + x_i^L \tag{5.4}$$
where
$\bar{x}_i$: normalized ith parameter, $0.0 \le \bar{x}_i \le 1.0$
$x_i^L$: smallest value of the ith parameter
$x_i^U$: highest value of the ith parameter
gene[m]: binary version of $\bar{x}_i$
round[]: round to nearest integer
$N_{gene}$: number of bits in the gene
$x_i^{qn}$: quantized version of $\bar{x}_i$
$x_i^q$: quantized version of $x_i$
A plain GA program starts with a generation of chromosomes (individuals) that are randomly selected from the entire search space. The fitness value of each chromosome is evaluated by computing the fitness function (objective function). The following simulated genetic operators are then employed to simulate the natural evolution process, which leads to the fittest chromosome or individual, that is, the solution or the optimizer of the optimization problem.
5.2.2
Genetic Operators
Three basic genetic operators (selection, crossover, and mutation) are performed in that order on the chromosomes of the current generation to produce child generations that become fitter in the simulated evolution process. The details of these operators are given next.
5.2.2.1
Selection
Selection is a process in which a mating pool of individual chromosomes of the current generation is chosen for reproduction of the child generation according to the fitness values of the chromosomes of the current generation. This operator is designed to improve the average quality of the population by giving individuals of higher fitness a higher probability of being copied to produce the new individuals in the child generation. The quality of an individual in the current generation is measured by its fitness value through the evaluation of the fitness function; therefore, the selection can focus on more promising regions in the search space. A number of selection schemes, such as proportionate selection, ranking selection, and tournament selection, are widely used in GA programs. Once a chromosome has been selected for reproduction, it enters a mating pool, which is a tentative new population ready for further genetic operations. The selection operation is an artificial emulation of natural selection in the Darwinian theory of survival.
5.2.2.2
Crossover
After the selection operation is completed and the mating pool is formed, the so-called crossover operator may proceed. Crossover is an operation that exchanges part of the genes in the chromosomes of two parents in the mating pool to create new individuals for the child generation; it is the most important operator in a GA. A simple crossover proceeds in two steps. First, members of the chromosomes in the mating pool are mated at random. Next, each pair of the randomly selected chromosomes undergoes a crossover using one of the following schemes to generate new chromosomes (Davis, 1991; Goldberg, 1989; Lawrence, 1987; Syswerda, 1989):
• One-point crossover scheme
• Multipoint crossover scheme
• Uniform crossover scheme
5.2.2.2.1
One-Point Crossover Scheme
A crossover operator randomly selects a crossover point within a chromosome and then interchanges the two parent chromosomes at this point to produce two new offspring. For example, consider the following two parents that have been selected for crossover. The “|” symbol indicates the randomly chosen crossover point:
Parent #1: 011101|0001
Parent #2: 100111|0101 (5.5)
The first part of the gene segment of the first parent is hooked up with the second part of the gene segment of the second parent to make the first offspring. The second offspring is built from the first part of the second parent and the second part of the first parent:
Offspring #1: 011101|0101
Offspring #2: 100111|0001 (5.6)
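The one-point scheme can be sketched in a few lines. This is a minimal illustration, assuming chromosomes are stored as strings of "0" and "1"; the function name and the optional fixed crossover point are conveniences introduced here, not part of the text.

```python
import random


def one_point_crossover(parent1, parent2, point=None, rng=random):
    """Swap the tails of two equal-length chromosomes after the crossover point."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    if point is None:
        # Choose a point strictly inside the chromosome so both parents contribute
        point = rng.randint(1, len(parent1) - 1)
    offspring1 = parent1[:point] + parent2[point:]
    offspring2 = parent2[:point] + parent1[point:]
    return offspring1, offspring2


# Parents of Equation 5.5 with the crossover point after the sixth bit
print(one_point_crossover("0111010001", "1001110101", point=6))
```

With the point fixed after the sixth bit, the two returned strings are the offspring of Equation 5.6 without the “|” marker.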
5.2.2.2.2
Multipoint Crossover Scheme
A crossover operator randomly selects a number of crossover points within a chromosome and then interchanges the gene segments in the chromosomes of the two parents between these points to produce two new offspring. In the following, a two-point crossover scheme is used to illustrate the process of the multipoint crossover operator. For example, consider two parents that have been selected for crossover:
Parent 1: 1101|010|101
Parent 2: 0010|001|110 (5.7)
After interchanging the genes in the parent chromosomes between the crossover points, the following offspring are produced:
Offspring 1: 1101|001|101
Offspring 2: 0010|010|110 (5.8)
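A two-point version of the operator can be sketched in the same way. The function below is an illustration only; it assumes string chromosomes and takes the two crossover points as arguments so that the example of Equations 5.7 and 5.8 can be reproduced.

```python
def two_point_crossover(parent1, parent2, first, second):
    """Exchange the gene segment lying between two crossover points."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    if not 0 < first < second < len(parent1):
        raise ValueError("crossover points must satisfy 0 < first < second < length")
    offspring1 = parent1[:first] + parent2[first:second] + parent1[second:]
    offspring2 = parent2[:first] + parent1[first:second] + parent2[second:]
    return offspring1, offspring2


# Parents of Equation 5.7 with crossover points after the fourth and seventh bits
print(two_point_crossover("1101010101", "0010001110", first=4, second=7))
```

In a GA run the two points would be drawn at random for each pair of parents.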
In the multipoint crossover scheme, more than one crossover point is selected in a pair of parent chromosomes. The crossover operator works at the gene bit level. The number of crossover points and the crossover positions, distinct from each other, are selected randomly for each pair of chromosomes.
5.2.2.2.3
Uniform Crossover Scheme
A uniform crossover operator decides, with a given probability, which parent will contribute each of the genes in the offspring chromosomes. This allows the parent chromosomes to be mixed at the gene bit level rather than at the gene segment level (as in the one-point and multipoint crossover schemes). The uniform crossover operation provides flexibility, but also destroys the building blocks in the chromosomes. However, for some problems, this additional flexibility outweighs the disadvantage of destroying building blocks. In the uniform crossover strategy, the crossover positions are predefined in a mask. This mask determines from which parent the genetic material is taken for each gene. All the chromosomes in a population are uniformly crossed over in the same positions. Note that, in the multipoint crossover strategy, each pair of chromosomes is crossed over at different points because no predefined mask is used. For example, consider the following two parents that have been selected for crossover with a mask:
Parent 1: ABCDEFGH
Parent 2: IJKLMNOP (5.9)
If the probability is 0.5, approximately half of the gene bits in the offspring will come from parent 1 and the other half will come from parent 2. With the mask 1 0 1 0 1 0 1 0, the offspring after uniform crossover are:
Offspring 1: AJCLENGP
Offspring 2: IBKDMFOH (5.10)
With the mask 0 1 0 1 0 1 0 1, the offspring after uniform crossover are:
Offspring 1: IBKDMFOH
Offspring 2: AJCLENGP (5.11)
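The mask-based exchange can be written compactly. In the sketch below, a mask entry of 1 means that the first offspring takes that gene from parent 1; this convention is inferred from Equations 5.10 and 5.11, and the function names are illustrative.

```python
import random


def uniform_crossover(parent1, parent2, mask):
    """Build two offspring gene by gene according to a 0/1 mask."""
    if not len(parent1) == len(parent2) == len(mask):
        raise ValueError("parents and mask must have the same length")
    # mask[i] == 1: offspring 1 takes gene i from parent 1, offspring 2 from parent 2
    offspring1 = "".join(a if m == 1 else b for a, b, m in zip(parent1, parent2, mask))
    offspring2 = "".join(b if m == 1 else a for a, b, m in zip(parent1, parent2, mask))
    return offspring1, offspring2


def random_mask(length, probability=0.5, rng=random):
    """Draw a mask whose entries are 1 with the given probability."""
    return [1 if rng.random() < probability else 0 for _ in range(length)]


print(uniform_crossover("ABCDEFGH", "IJKLMNOP", [1, 0, 1, 0, 1, 0, 1, 0]))
print(uniform_crossover("ABCDEFGH", "IJKLMNOP", [0, 1, 0, 1, 0, 1, 0, 1]))
```

The helper random_mask shows one way to generate a mask for a given probability; with a probability of 0.5, about half of the genes of each offspring come from each parent.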
In addition to these standard crossover operators, offspring can also be generated using other crossover operators, such as the arithmetic crossover operator and the heuristic crossover operators (Davis, 1991). Section 5.3.2 gives some crossover operators for real parameter coded GAs, and Section 9.5.1.1 investigates in detail the influence of the probability of the uniform crossover operator.
5.2.2.3
Mutation
The mutation operator is designed so that one or more of the chromosome’s genes will be mutated with a small probability. The goal of the mutation operator is to prevent the genetic population from converging to a local optimum and to introduce some new possible solutions into the generation. Without mutation, the population would rapidly become uniform under the combined effect of the selection and crossover operators. There are a number of mutation methods (OptiGA: http://www.optwater.com/optiga): flip bit, random, and min–max. For example, consider the following parent that has been selected for mutation. The bit at a selected point is mutated from 0 to 1:
Parent: 1101010101
Offspring: 1101011101 (5.12)
Now the basic operators of GAs have been briefly introduced. Contemporary developments of GAs have introduced many new techniques to improve the performance of GA operators, as well as the way of performing the coding. For additional details as well as the mathematical foundations, readers are referred to the references listed in Section 5.11. The next section demonstrates application of the basic operations of the genetic algorithm via a simple example of optimization.
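The flip-bit mutation of Equation 5.12 can be sketched as follows. Both helpers are illustrative: flip_bit changes one chosen position (counted from 1), while mutate applies an independent flip to every bit with a small probability; the names and the string representation are assumptions.

```python
import random


def flip_bit(chromosome, position):
    """Flip the bit at a 1-based position of a binary string."""
    index = position - 1
    flipped = "1" if chromosome[index] == "0" else "0"
    return chromosome[:index] + flipped + chromosome[index + 1:]


def mutate(chromosome, probability, rng=random):
    """Flip each bit independently with the given (small) probability."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < probability else bit
        for bit in chromosome
    )


# Parent of Equation 5.12 mutated at the seventh bit
print(flip_bit("1101010101", 7))
```

Flipping the seventh bit of the parent reproduces the offspring of Equation 5.12.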
5.2.3
A Simple Example
To demonstrate how the plain GA works, solve the minimization problem that was considered in Section 4.4.1:
$$\text{Minimize } f(x_1, x_2) = (x_1 + 1)^2 + (x_2 + 1)^2 \quad \text{subject to } -2 \le x_1 \le 2,\; -2 \le x_2 \le 2 \tag{5.13}$$
Obviously, the optimum solution of the problem is $(-1, -1)^T$ with the function value of zero.
5.2.3.1
Solution
Because GAs are often coded for maximization problems, first transform the minimization problem specified by Equation 5.13 into a maximization problem. A number of such transformations can be used. Here, the following fitness function F is employed in the GA, according to the transformation given by Deb (1998):
$$F(x_1, x_2) = \frac{1.0}{1.0 + f(x_1, x_2)} \tag{5.14}$$
5.2.3.2
Representation (Encoding)
A binary vector is used to represent the real values of parameters x1 and x2. The GA search space has been limited to a region of the parameter space, as listed in Table 5.1. These two parameters are discretized and translated into a chromosome of length 24 bits according to the binary coding procedure given in Section 5.2.1, with 12 bits for each parameter. In the entire search space, a total of $2^{24}$ (≈ 1.678 × 10^7) possible combinations of these two parameters exists.

TABLE 5.1 GA Search Space for Numerical Test of Problem Defined by Equation 5.13
Parameter    Search Range    Possibilities #    Binary Digit
x1           –2.0 to 2.0     4096               12
x2           –2.0 to 2.0     4096               12
Total population is $2^{24}$ (1.678 × 10^7).

TABLE 5.2 Initial Generation of 15 Randomly Generated Chromosomes and the Corresponding Real Parameters and Fitness Value
No.   Binary Code                 x1         x2         Fitness
1     100011111101010001001000    0.2476     –0.9294    0.39039
2     001101000110010010001101    –1.1814    –0.8620    0.95061
3     001110011100110010101011    –1.0974    1.1678     0.17517
4     001110110101011010010100    –1.0730    –0.3551    0.70360
5     111000101110111111010001    1.5458     1.9551     0.06168
6     111011111100000100100111    1.7470     –1.7118    0.11046
7     110101010101110010010111    1.3338     1.1482     0.09040
8     001101010100101011011110    –1.1678    0.7175     0.25139
9     001100100011101111101101    –1.2156    0.9822     0.20098
10    010100000010111101010000    –0.7477    1.8291     0.11029
11    001000110010110101110110    –1.4510    1.3661     0.14702
12    011101101110111111111011    –0.1421    1.9961     0.09335
13    111010110101110111111111    1.6777     1.4999     0.06935
14    001010100111010010101010    –1.3368    –0.8337    0.87638
15    101111100000011010000011    0.9695     –0.3717    0.18962

5.2.3.3
Initial Generation and Evaluation Function
The GA starts from an initial generation that is usually created in a random manner. Table 5.2 shows the initial generation of the 15 chromosomes (individuals) created randomly for this example. In this table, the binary coding, real parameters, and corresponding fitness value are explicitly listed.
5.2.3.4
Genetic Operations
Selection is first performed on the individuals of the initial generation; several selection operators have been proposed. The following is the simplest one. All 15 individuals in the generation are evaluated and ranked in descending order based on their fitness values. Only a number (usually about one half; seven in this case) of the best individuals with the highest fitness values in the generation are retained as seven new individuals for the next generation, and the rest are discarded based on the rule of survival of the fittest. These seven best individuals are also used to form the mating pool to produce the remaining eight new individuals for the next generation. Two individuals from the mating pool are paired in a random fashion. Pairing chromosomes in a GA can be carried out by various methods, such as pairing from top to bottom, random pairing, rank weighting, etc. The approaches often used are based on the probabilities of individuals. The probability $p_i$ of the ith individual being selected for pairing is proportional to its fitness value and can be computed using
$$p_i = \frac{f_i}{\sum_{j=1}^{n_{bi}} f_j} \tag{5.15}$$
where $n_{bi}$ is the number of the best individuals (equal to seven in this example). A simple algorithm can be coded to pair up eight pairs of parents from these seven individuals, based on the probability values obtained from Equation 5.15. Using the crossover operators, these parents are then mated to produce the remaining eight children for the next generation. In the crossover operation, the crossover points must be determined first. In this example, one crossover point is randomly selected. The gene segments in the chromosomes of the paired parent individuals are then exchanged. Assuming the following two individuals are selected from the mating pool for pairing:
Chromosome 1 (C1): 100011111101 (x1) 010001001000 (x2)
Chromosome 2 (C2): 001101000110 (x1) 010010001101 (x2) (5.16)
the corresponding parameter values of these two chromosomes are
$$\text{C1: } x_1^{C1} = 0.2476,\; x_2^{C1} = -0.9294; \qquad \text{C2: } x_1^{C2} = -1.1814,\; x_2^{C2} = -0.8620 \tag{5.17}$$
These chromosomes are evaluated to arrive at their fitness values of
$$F(C1) = 0.39039, \qquad F(C2) = 0.95061 \tag{5.18}$$
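The parameter and fitness values quoted for C1 and C2 can be checked with a short script. One assumption should be stated loudly: the decoding below scales the integer value of each 12-bit gene by $2^{12} - 1$, a mapping inferred from the values listed in Table 5.2 rather than taken from the decoding formula of Section 5.2.1; the function names are illustrative.

```python
def decode(bits, lower=-2.0, upper=2.0):
    """Map a binary string to a real value in [lower, upper].

    Assumption: the integer value is scaled by 2**n - 1, which reproduces Table 5.2.
    """
    return lower + (upper - lower) * int(bits, 2) / (2 ** len(bits) - 1)


def objective(x1, x2):
    """Function to be minimized, Equation 5.13."""
    return (x1 + 1.0) ** 2 + (x2 + 1.0) ** 2


def fitness(chromosome, bits_per_parameter=12):
    """Fitness of a 24-bit chromosome according to Equation 5.14."""
    x1 = decode(chromosome[:bits_per_parameter])
    x2 = decode(chromosome[bits_per_parameter:])
    return 1.0 / (1.0 + objective(x1, x2))


c1 = "100011111101010001001000"
c2 = "001101000110010010001101"
for name, chromosome in (("C1", c1), ("C2", c2)):
    x1, x2 = decode(chromosome[:12]), decode(chromosome[12:])
    print(f"{name}: x1 = {x1:.4f}, x2 = {x2:.4f}, fitness = {fitness(chromosome):.5f}")
```

The printed values can be compared with rows 1 and 2 of Table 5.2.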
Assume that the crossover point was randomly selected after the 12th gene bit:
Chromosome 1 (C1): 100011111101 | 010001001000
Chromosome 2 (C2): 001101000110 | 010010001101 (5.19)
After the crossover operation, the two resulting offspring are
Offspring 1 (O1): 100011111101 010010001101
Offspring 2 (O2): 001101000110 010001001000 (5.20)
and the corresponding values for these two offspring are
$$\text{O1: } x_1^{O1} = 0.2476,\; x_2^{O1} = -0.8620; \qquad \text{O2: } x_1^{O2} = -1.1814,\; x_2^{O2} = -0.9294 \tag{5.21}$$
These offspring are evaluated to obtain their fitness values
$$F(O1) = 0.3883, \qquad F(O2) = 0.9635 \tag{5.22}$$
The crossover will produce a total of eight children. Together with the seven best individuals from the parent generation, a tentative generation of 15 individuals is now ready for the next genetic operation: mutation, which is a random alteration of a small percentage of the gene bits in the chromosomes. Mutation points are randomly selected for individual chromosomes in the population pool. For example, for the mutation operator on chromosome 1 in Equation 5.16, if the mutation point is at the 20th bit, the bit at point 20 is mutated from 0 to 1 as
Chromosome 3 (C3): 100011111101 010001001000
Offspring 3 (O3): 100011111101 010001011000 (5.23)
The corresponding values for the mutated offspring are
$$\text{O3: } x_1^{O3} = 0.2476,\; x_2^{O3} = -0.9138 \tag{5.24}$$
and this chromosome evaluates to
$$F(O3) = 0.3900 \tag{5.25}$$
After the mutation operation on all 15 individuals, a new generation of 15 is finally born, and the next cycle of evolution begins. The evolution is repeated until the best individual in the entire search space is found or the prescribed maximum number of generations is reached.
5.2.3.5
Results
For this numerical example, the following GA operational parameters have been used:
• Population size is 15.
• Probability of crossover is 0.4.
• Probability of mutation is 0.02.
• Maximum generation is 100.
From the calculation result, the best chromosome after 100 generations is found to be 001111111110010000000000; the corresponding parameter values are x1 = –1.0017 and x2 = –0.9998, and the corresponding fitness value is 0.999993. The convergence of the fitness value against the number of generations for a GA run is plotted in Figure 5.1. It can be observed from the convergence curve that the GA converges very fast at the beginning and slows down significantly at the final stage of the search.
[Figure 5.1 plot: fitness value f (vertical axis, 0.95 to 1) versus generation (horizontal axis, 0 to 100).]
FIGURE 5.1 Convergence of a GA for the simple problem defined in Equation 5.14. The GA converges very fast at the beginning and very slowly at the later stage.
5.2.4 Features of GAs
Genetic algorithms are stochastic global search methods and differ in their fundamental concepts from traditional gradient-based search techniques. One important feature of genetic algorithms is that they work on groups (generations) of points in the whole search space, while most gradient-based search techniques handle only one point at a time. For this reason, gradient-based search techniques have the drawback of depending heavily on the initial guess point and are more likely to be trapped at a local optimum in some complex problems. Genetic algorithms work on a group of points and proceed in a more globally exploratory manner, and thus work well in many complex search problems where gradient-based search techniques fail. This feature gives GAs an edge in dealing with complicated, nonlinear, and multimodal optimization problems, including inverse problems. Furthermore, GAs require only objective function information, while many other search techniques usually require auxiliary information in order to work properly. For example, gradient-based techniques need knowledge of the derivatives of the objective function in order to climb in the right direction to the current (local) peak. GAs can work well for those types of problems to which gradient-based search techniques are not applicable, such as problems whose objective function is not differentiable. This characteristic makes GAs more canonical than many other search schemes for many complex engineering problems. Table 5.3 summarizes the differences between genetic algorithms and the traditional gradient-based optimization and search procedures. One major disadvantage of the GA is its higher computational cost; generally, more evaluations of the objective function are required by a GA than by a traditional gradient-based search method. This drawback is very critical for expensive forward solvers, but becomes less critical with faster computers or simple objective functions that can be computed very fast.
TABLE 5.3 Comparison between GA and Gradient-Based Optimization and Search Procedures

| Items | GA | Gradient-Based Optimization |
|---|---|---|
| Search bases | Groups of points | Single point |
| Initial guess | No | Yes |
| Function information | Objective function only | Objective function and its derivatives |
| Search rule | Probabilistic in nature | Deterministic laws |
| Convergence | Fast at beginning, slow at the later stage | Relatively slow at initial stage, very fast at later stage |
| Applicability | Global search for complex problems with many local optima | Local search for simple problem with single optimum |
| Computing efficiency | Computationally expensive | Efficient |

For solving an inverse problem using GAs, exploring a faster forward computation solver is very important to reduce the computer time because GAs require a large number of calls of the forward solver. A small saving in a single run of the forward calculation can significantly reduce the total running time of the inverse problem.

Another major disadvantage of a GA is its deficiency for problems with too many variables because of the exponential growth rate of the search space with respect to the increase of the number of variables. Gradient-based methods are far superior to GAs in this regard.

GAs sometimes demonstrate very poor convergence performance, especially when the search has found a good individual very close to the global optimum. Because of the probabilistic nature of the GA, once a very good individual is found, finding a better one in the entire search space becomes more difficult. In addition, their performance near the global solutions appears to be relatively imprecise when compared with the conventional gradient-based optimization techniques that use deterministic translation rules (Gen and Chen, 1997; Krishnan and Navin, 1998). The next section gives a brief review of developments in improving the GA’s performance.

5.2.5 Brief Reviews on Improvements of GAs
To improve convergence performance and enhance searching capability, it has been recommended to incorporate GAs with conventional optimization techniques (Bosworth et al., 1972; Bethke, 1981; Goldberg, 1983; Angelo, 1996; Back et al., 1997; Dozier et al., 1998). GAs are good at global searching but slow at converging, while some of the conventional optimization techniques are good at fine-tuning but lack a global perspective, so a hybrid algorithm can be an ideal alternative. Such a hybrid algorithm can combine the global explorative power of GAs with the local exploitation behaviors of conventional optimization techniques, complement their individual weak points, and thus outperform either one individually (Gen and Chen, 1997). Various hybrid algorithms have been proposed so far (Davis, 1991; Gen and Chen, 1997; Cheng et al., 1999; Magyar et al., 2000). Basically, they can be classified into three categories (Xu et al., 2001c):

1. Inject the problem-specific information into the existing genetic operators in order to reproduce the offspring that possess higher fitness values. For example, Davidor (1991) defined the Lamarckian probability for mutations in order to enable mutation operators to be more controllable. Yamada and Nakano (1992) designed a new crossover operator based on Giffler and Thompson’s algorithm. Cheng et al. (1996) designed a new mutation operator based on a neighborhood search mechanism.
2. Design new heuristic-inspired operators in order to guide genetic search more directly toward better solutions. For example, Bosworth et al. (1972) used the Fletcher-Reeves method together with the golden section search method as a new mutation operator. Grefenstette et al. (1985) developed a greedy, heuristic crossover operator and Grefenstette (1991) introduced a Lamarckian operator. Davis (1991) and Miller et al. (1993) proposed an extra move operator and a local improvement operator. Magyar et al. (2000) proposed an adaptively fired hill-climber operator. Goldberg (1989) developed a G-bit improvement operation for binary strings and Whitley et al. (1994) proposed a Baldwinian strategy to change the value of the fitness function based on the hill-climbing operation.
3. Incorporate conventional optimization methods into GAs. This can be done in two typical ways. The first is to take the conventional optimization methods as an add-on extra to the basic loop of genetic algorithms. That is, apply a conventional optimization method (typically the hill-climbing method) to each newly generated offspring to move it to a local optimum, and then replace the current individuals with these locally optimal solutions before putting the offspring back into the population. This approach is commonly called Lamarckian evolution as explained by Kennedy (1993), or memetic algorithms introduced by Moscato and Norman (1992) and Radcliffe and Surry (1994). The second approach is to run the GA and then apply a conventional optimization method to obtain the final solution (Levine, 1996; Yang et al., 1995; Mohammed and Uler, 1997; Xiao and Yabe, 1998; Liu et al., 2002a; Xu and Liu, 2002d).

Incorporating conventional optimization methods into the GAs is the most common form of hybrid genetic algorithms in engineering practice so far because these kinds of algorithms are relatively simple in implementation (Gen and Chen, 1997; Levine, 1996). However, they usually require high computation cost because a large number of function evaluations must be conducted in the local optimization process.
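The first way of incorporating a conventional method (local improvement of each new offspring before it re-enters the population) can be sketched as follows. This is a minimal illustration, not the implementation used in any of the cited works: the stochastic hill climber, the `toy_fitness` objective, and all parameter names are assumptions introduced only for the example.

```python
import random


def hill_climb(x, fitness, step=0.1, iters=50, rng=random):
    """Stochastic hill climbing (maximization).

    A perturbed point replaces the current one only if it is fitter,
    so the returned point is never worse than the starting point.
    """
    best, best_f = list(x), fitness(x)
    for _ in range(iters):
        cand = [v + rng.uniform(-step, step) for v in best]
        cand_f = fitness(cand)
        if cand_f > best_f:
            best, best_f = cand, cand_f
    return best, best_f


def lamarckian_refine(offspring, fitness, **kwargs):
    """Replace every offspring by its locally improved version."""
    return [hill_climb(child, fitness, **kwargs)[0] for child in offspring]


def toy_fitness(x):
    # Hypothetical objective with its maximum at (1, 1).
    return -sum((v - 1.0) ** 2 for v in x)


rng = random.Random(0)
offspring = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(4)]
refined = lamarckian_refine(offspring, toy_fitness, rng=rng)
```

Each call to `hill_climb` spends `iters + 1` function evaluations per offspring, which illustrates why this class of hybrid becomes costly when a single evaluation is expensive.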
Most conventional optimization methods used in hybrid algorithms are the hill-climbing methods for maintaining the flexibility of algorithms (Ackley, 1987; Gorges-Schleuter, 1989; Davis, 1991; Kennedy, 1993; Whitley et al., 1994; Levine, 1996; Gong et al., 1996; Dozier et al., 1998; Magyar et al., 2000; Xu et al., 2001a). This usually results in an expensive computation cost in each of the local optimization processes for realistic problems where the number of decision variables is large and/or a single function evaluation takes considerable computation time (Xu et al., 2001c), making the implementation of hybrid algorithms difficult or even impossible in these cases.

Recently, a novel hybrid genetic algorithm has been proposed by Xu et al. (2001c). This GA uses an additional operator called intergeneration projection (IP), and hence the algorithm is termed an intergeneration projection GA (IP-GA). In conventional or micro GAs (see the next section), the child generation is produced using the genes of parent generation based on the fitness of the parent individuals. In the IP-GA, however, some of the individuals in the child generation are produced using genes of the parent and the grandparent generations. This intergeneration operator drastically improves the efficiency of searching for all the problems tested so far. The IP-µGA was later further improved by Liu’s group (Xu et al., 2001c, 2002; Xu and Liu, 2002a, e; Yang et al., 2001) and the latest version of IP-µGA is about 20 times faster than the µGA (Yang et al., 2001; Xu and Liu, 2002e). In the next sections, several improved GAs will be introduced, the IP-GA will be emphasized and detailed results will be provided.
5.3 Micro GAs
As mentioned before, one main disadvantage of using GAs is that a relatively large number of forward evaluations are generally required. Hence, various other versions of GAs have been developed to improve performance, such as micro GA (Krishnakumar, 1989), messy GA (Goldberg et al., 1989), nontraditional GA (Eshelman, 1989), etc. The micro GA (µGA) is an extension of the “plain” GA. It is capable of avoiding the premature convergence and of performing better in reaching the optimal region than the traditional GA (Krishnakumar, 1989; Carroll, 1996a). Recently, the µGA has been widely applied in engineering practice due to these advantages (Carroll, 1996b; Johnson and Abushagur, 1997; Xiao and Yabe, 1998; Abu-Lebdeh and Benekohal, 1999; Liu and Chen, 2001; Liu et al., 2002c, f; Wu et al., 2002, etc.). Basically, the µGA uses a similar evolutionary strategy to that used in traditional GAs. Selection and crossover are still the basic genetic operations in the µGA, while mutation is usually omitted. Other operations, such as niching, elitism, etc., are also often recommended (Carroll, 1996a). Niching means that the multidimensional phenotypic sharing scheme with a triangular sharing function is implemented (Goldberg and Richardson, 1987). Elitism means that the best individual must be replicated in the next generation. These operations have been found effective in improving the convergence performance of the µGA (Carroll, 1996a; Sareni and Krahenbuhl, 1998), although they are not absolutely necessary. The main differences of the µGA from traditional GAs are in the population size for each generation and the mechanism to introduce and maintain the genetic diversity (Abu-Lebdeh and Benekohal, 1999). Generally, the µGA operates on a very small population size (typically 5 ~ 8). The small population size very often allows fast convergence to a local optimum in the encoded space in a few generations. 
To maintain the genetic diversity in the population, the µGA uses a restart strategy, not the conventional mutation operation. That is, once the current generation converges, a new generation will be generated that has the same population size and consists of the best individual from the previously converged generation and other new randomly generated individuals from the entire space. This evolutionary process will be sequentially conducted until the global optimum is found (or the predesignated number of generations is reached) and is schematically depicted in Figure 5.2, where P(j) and C(j) denote the parents and child (offspring) in the jth generation, respectively.

FIGURE 5.2 Flow chart of the µGA. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
[Flow chart steps: set j = 0; randomly initialize P(0) with binary encoding, decode, and evaluate P(0); check the convergence criterion; if yes, apply elitism, with the other individuals in P(j) randomly selected; if no, set j = j + 1 and apply tournament selection, uniform crossover, decoding, evaluation of P(j), and elitism.]

The key strategy of the µGA is to divide the GA search into many cycles, each of which will find a local optimum in the encoded space. To do this efficiently, it uses a small population size for each “micro” generation to achieve fast convergence to a local optimum in one cycle, and restarts the global exploration by randomly generating a relatively large number of individuals in the microgeneration of a new cycle. Elitism is always used from generation to generation and from cycle to cycle. By introducing the micro technique, the µGA guarantees its robustness in a different way: whenever the microgeneration is reborn, new chromosomes are randomly generated, so new genetic information keeps flowing in. Krishnakumar’s 1989 study pointed out that a µGA can avoid premature convergence and demonstrates faster convergence to the near optimal region than does a plain GA for many multimodal problems.
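The restart cycle described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' program: niching is omitted, the crossover produces one child per mating, the toy `onemax` fitness is hypothetical, and the parameter names (`pop_size`, `conv`, `max_gen`) are chosen only for this example. Convergence is taken to mean that fewer than 5% of the bits of the other individuals differ from the best individual.

```python
import random


def micro_ga(fitness, n_bits, pop_size=5, max_gen=200, conv=0.05, seed=0):
    """Maximize `fitness` over bit lists with a restart-based micro GA."""
    rng = random.Random(seed)

    def rand_ind():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    def uniform_child(p1, p2):
        # Each bit is taken from either parent with equal probability.
        return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]

    def converged(pop, best):
        # Share of bits in the other individuals that differ from the best.
        others = [ind for ind in pop if ind is not best]
        diff = sum(a != b for ind in others for a, b in zip(ind, best))
        return diff < conv * n_bits * len(others)

    pop = [rand_ind() for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(max_gen):
        if converged(pop, best):
            # Restart: keep the elite and refill with random individuals.
            pop = [best] + [rand_ind() for _ in range(pop_size - 1)]
        else:
            children = [uniform_child(tournament(pop), tournament(pop))
                        for _ in range(pop_size - 1)]
            pop = [best] + children  # elitism, no mutation
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history


def onemax(bits):
    # Hypothetical toy fitness: the number of 1 bits.
    return sum(bits)


best, history = micro_ga(onemax, n_bits=16)
```

Because the elite individual is carried into every new generation and every restart, the recorded best fitness in `history` can never decrease.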
5.3.1 Uniform µGA
The uniform crossover operator, which was developed by Syswerda in 1989, generally works better than the one-point and two-point crossovers. Following the reproduction process, in which pairs of chromosomes have been chosen for mating and stored in the mating pool, the uniform crossover operator proceeds. For each bit at the same position of two mated chromosomes, a random number is generated and compared with a preset crossover probability; if the random number is larger than the crossover probability, the crossover operator swaps the two bits of the mated chromosomes. On the other hand, if the random number is smaller, the two bits remain unchanged and the crossover operation on this bit is finished. This crossover operation is performed on every bit of the mated chromosomes in sequence. When the crossover operation completes, two new chromosomes are created for the next GA operation. A uniform µGA program combines the two improved techniques of the µGA and the uniform crossover operator. Carroll’s study (1996b) has shown that the uniform µGA generally exhibits more robustness in handling an order-3 deceptive function than traditional GA methods, and pointed out that the robustness of the uniform µGA lies in the constant infusion of new genetic information as the micropopulation restarts, as well as the uniform crossover operator’s characteristic of being unbiased to position.
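The bitwise procedure described above can be sketched as follows. The function name and the `p_cross` parameter are illustrative assumptions; the comparison direction follows the description in the text (swap when the random number exceeds the preset value), and with a preset value of 0.5 either direction gives the same swap rate.

```python
import random


def uniform_crossover(parent1, parent2, p_cross=0.5, rng=random):
    """Bitwise uniform crossover on two equal-length bit strings.

    At each position a random number is drawn; the two bits are swapped
    when that number exceeds `p_cross`, otherwise they are left as is.
    """
    child1, child2 = [], []
    for b1, b2 in zip(parent1, parent2):
        if rng.random() > p_cross:
            b1, b2 = b2, b1  # swap this bit between the two mates
        child1.append(b1)
        child2.append(b2)
    return "".join(child1), "".join(child2)


rng = random.Random(1)
mate1, mate2 = "111111111111", "000000000000"
kid1, kid2 = uniform_crossover(mate1, mate2, rng=rng)
# With p_cross = 1.0 no random number in [0, 1) can exceed it: no swaps.
same1, same2 = uniform_crossover(mate1, mate2, p_cross=1.0, rng=rng)
```

Since every position is treated independently, the operator has no preference for bits that sit close together on the string, which is the position-unbiased property mentioned in the text.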
5.3.2 Real Parameter Coded µGA
As summarized by Man et al. (1999), in general, binary encoding is the most classic method used by GA researchers because of its simplicity and traceability. The conventional GA operations and theory (schemata theory) are also developed on the basis of this fundamental structure. However, direct manipulation of real-value chromosomes (Janikow and Michalewicz, 1991; Wright, 1991) has also raised considerable interest. The study by Janikow and Michalewicz (1991) indicates that the floating point representation would be faster in computation and more consistent from run to run. A real parameter coded microgenetic algorithm (real µGA) is constructed by Liu and Ma (2003) based on the concept of the µGA. The flow chart of the real µGA is presented in Figure 5.3. Comparing Figure 5.3 with Figure 5.2, it can be seen that these two algorithms are basically the same; both consist of many subcycles. At the beginning of every subcycle, a new generation is formed using randomly generated individuals together with the best individual of the last generation. Because of the small population size in the µGA and real µGA run, it will converge quickly to a local optimum. After the convergence occurs, this subcycle ends and the next subcycle starts. Each subcycle typically consists of several generations. In every generation, tournament selection, elitism, and crossover operators are included. A mutation operator is not present in the process.

FIGURE 5.3 Flowchart of the real µGA.
[Flow chart steps: set j = 0; randomly initialize P(0); evaluate P(0); check convergence; if yes, apply elitism and restart; if no, set j = j + 1 and apply selection, crossover, evaluation of P(j), and elitism.]

Although many similarities have been mentioned, some differences do exist. Two of them are
• A different crossover operator must be constructed in the real µGA due to the different coding scheme. The crossover operator used in the µGA operates on a binary string, so it cannot be used directly in the real µGA. Types of crossover operators will be detailed in the following subsections.
• Convergence has different meanings in the µGA and the real µGA. In the µGA, convergence means that less than a certain percentage of the total bits of other individuals in a generation are different from the best individual. In the real µGA, convergence means that all the individuals in a generation are very near to each other in physical space. In other words, the convergence occurs in real physical space in the real µGA, but it occurs in bit space in the µGA. In the real µGA, the search covers a large proportion of the entire search space in the beginning of every subcycle. The search range covered reduces as the search progresses, until all the candidates in a generation are crowded in a very small area and the convergence criterion is reached. Once that happens, new randomly generated individuals will flow in and the next subcycle starts. In the µGA, no clear physical interpretation of the convergence can be provided.

FIGURE 5.4 Schematic representation of different crossover operators: (a) simple crossover; (b) uniform arithmetical crossover; (c) uniform heuristic crossover; and (d) uniform extended arithmetical crossover. (Each panel marks the parents x1 and x2 and the possible locations of the child x.)

5.3.2.1 Four Crossover Operators

Four crossover operators have been introduced for the real µGA (Liu and Ma, 2003). For all the crossover operators here, it is assumed that two parents generate one child. Assume the two parents and one child can be written as:

$$\mathbf{x}_1 = \{x_{11}, x_{12}, \ldots, x_{1n}\} \quad \text{(parent 1)} \qquad (5.26)$$

$$\mathbf{x}_2 = \{x_{21}, x_{22}, \ldots, x_{2n}\} \quad \text{(parent 2)} \qquad (5.27)$$

$$\mathbf{x} = \{x_1, x_2, \ldots, x_n\} \quad \text{(the child)} \qquad (5.28)$$
where xij stands for the jth parameter in the ith parent individual. The four crossover operators are plotted schematically in Figure 5.4 (a–d). They can be expressed mathematically as:
• Simple crossover

$$\mathbf{x} = \{x_{11}, x_{12}, \ldots, x_{1i}, x_{2,i+1}, x_{2,i+2}, \ldots, x_{2n}\} \qquad (5.29)$$

in which crossover occurs at the randomly selected ith position. Using this operator, the child is located at one of the corner points of the rectangle whose diagonal line is $\mathbf{x}_1\mathbf{x}_2$, as shown in Figure 5.4 (a). This operator has been used by Wright (1991).

• Uniform arithmetical crossover

$$x_i = a_i x_{1i} + (1 - a_i) x_{2i}, \quad i = 1, 2, \ldots, n \qquad (5.30)$$

in which $a_i \in [0, 1]$ is randomly selected. Using this operator, the child $\mathbf{x}$ lies in the rectangle whose diagonal line is $\mathbf{x}_1\mathbf{x}_2$, as is shown in Figure 5.4 (b), while in the arithmetical crossover operator used by Wright (1991), the child $\mathbf{x}$ can only lie on the diagonal line $\mathbf{x}_1\mathbf{x}_2$.

• Uniform heuristic crossover

$$x_i = a_i (x_{2i} - x_{1i}) + x_{2i} \qquad (5.31)$$

in which $a_i \in [0, 1]$ is randomly selected, and the fitness value at $\mathbf{x}_2$ is larger than that at $\mathbf{x}_1$. This operator is called uniform heuristic crossover because it uses the fitness value of the function in determining the direction of the search, as is shown in Figure 5.4 (c). It is different from the heuristic crossover operator used by Wright (1991), in which the child $\mathbf{x}$ can only lie on the line segment extended from $\mathbf{x}_1\mathbf{x}_2$.

• Uniform extended arithmetical crossover

$$x_i = 2x_{1i} - x_{2i} + 3a_i (x_{2i} - x_{1i}) \qquad (5.32)$$
in which $a_i$ is also randomly selected. This operator is named uniform extended arithmetical crossover here because it extends the search range of the uniform arithmetical crossover and the uniform heuristic crossover, as shown in Figure 5.4 (d). This crossover has been used by Liu and Ma (2003).

With these four crossover operators, four versions of real µGAs are constructed. In order to compare the performance of different algorithms meaningfully, all the parameters and operations are set the same, except for the coding scheme, crossover operator, and convergence criteria. The details of the operations and parameters used here for the five algorithms, i.e., the uniform µGA and the four real µGAs, are
• Tournament selection
• Uniform simple, uniform arithmetical, uniform heuristic, or uniform extended arithmetical crossover; for uniform crossover in uniform µGA, the probability of crossover is set to 0.5
• Elitism operator
• No mutation operation
• Population size of each generation set to 5
• The population convergence criterion for the real µGA is 2%, which means that when all the candidates in a generation are located very near to each other in physical space so that the maximum distance between two candidates is less than 2% of the search range, the convergence occurs. In the uniform µGA, the population convergence criterion is set to be 5%, which means that when less than 5% of the total bits of other individuals in a generation are different from the best individual, the convergence occurs.

5.3.2.2 Test Functions

To examine the effectiveness of the real-coded GAs and other GAs, six typical benchmarking functions listed in Table 5.4 are used to test the performance of various modified GAs in searching for their optima. These functions are selected from the examples used in the programme on Advanced Genetic Algorithm in Engineering, School in Computer Science, Sophia Antipolis, France (available at website: http://www.essi.fr/~parisot/GA200/ga.html). They have been especially designed to have many local optima and one or more global optima. For visualization of their features, the two two-dimensional functions F1 and F2 are plotted in Figure 5.5 and Figure 5.6.

5.3.2.3 Performance of the Test Functions

These six test functions are used in this section to compare the performance of the real µGAs and the uniform µGA. Five different algorithms are examined: real-µGAs, with four different crossover operators, and the uniform µGA.
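The four real-coded crossover operators of Equations 5.29 to 5.32 and the real µGA convergence test can be sketched as below. The function names are illustrative. The convergence check is one possible reading of the 2% criterion: it compares the largest pairwise Euclidean distance with 2% of the diagonal of the search box, which is an assumption, since the text does not say how the search range is measured in several dimensions.

```python
import math
import random


def simple_crossover(x1, x2, rng=random):
    """Eq. 5.29: head of parent 1 up to a random position i, tail of parent 2.

    Requires at least two parameters per individual.
    """
    i = rng.randint(1, len(x1) - 1)
    return x1[:i] + x2[i:]


def uniform_arithmetical(x1, x2, rng=random):
    """Eq. 5.30: independent convex combination for every component."""
    child = []
    for u, v in zip(x1, x2):
        a = rng.random()
        child.append(a * u + (1 - a) * v)
    return child


def uniform_heuristic(x1, x2, rng=random):
    """Eq. 5.31: step past x2 away from x1; x2 must be the fitter parent."""
    return [rng.random() * (v - u) + v for u, v in zip(x1, x2)]


def uniform_extended_arithmetical(x1, x2, rng=random):
    """Eq. 5.32: each component ranges from 2*x1 - x2 to 2*x2 - x1."""
    return [2 * u - v + 3 * rng.random() * (v - u) for u, v in zip(x1, x2)]


def real_converged(pop, lower, upper, tol=0.02):
    """Assumed reading of the 2% rule for the real-coded population."""
    diagonal = math.dist(lower, upper)
    widest = max(math.dist(p, q) for p in pop for q in pop)
    return widest < tol * diagonal


rng = random.Random(0)
p1, p2 = [0.0, 0.0, 0.0], [1.0, 2.0, 3.0]  # p2 is assumed to be the fitter parent
c1 = simple_crossover(p1, p2, rng)
c2 = uniform_arithmetical(p1, p2, rng)
c3 = uniform_heuristic(p1, p2, rng)
c4 = uniform_extended_arithmetical(p1, p2, rng)
```

Per component, operator 2 stays between the parents, operator 3 explores only beyond the fitter parent, and operator 4 covers both of those intervals plus the mirror region beyond the weaker parent.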
Because the most significant operator in the real µGAs is the crossover operator, the tests are designed to examine the performance of the different crossover operators. The generation numbers required to achieve different best fitness values by the different algorithms are tabulated in Table 5.5 to Table 5.10 for the six test functions, respectively. The convergence processes of all these GAs are plotted in Figure 5.7 to Figure 5.12. From these tables and figures, the performance of each algorithm can be observed:
TABLE 5.4 Test Functions

| No. | Objective Function | Variable Bound | Global Optima | Fitness Value |
|---|---|---|---|---|
| F1 | $f(x_1, x_2) = \prod_{i=1}^{2} [\sin(5.1\pi x_i + 0.5)]^{6} \exp\left[\dfrac{-4(\log 2)(x_i - 0.0667)^2}{0.64}\right]$, with $\pi = 3.14159$ | 0 < xi < 1.0, i = 1, 2 | (0.0669, 0.0669) | 1.0 (maximum) |
| F2 | $f(x_1, x_2) = \sum_{i=1}^{5} i \cos((i+1)x_1 + i) \cdot \sum_{i=1}^{5} i \cos((i+1)x_2 + i)$ | –10 < xi < 10, i = 1, 2 | (4.8581, –7.0835), (–1.4251, –0.8003), (–0.8003, –1.4251) | –186.7309 (minimum) |
| F3 | $f(x_1, x_2) = x_1^4/4 - x_1^2/2 + x_1/10 + x_2^2/2$ | –10 < xi < 10, i = 1, 2 | (–1.0467, 0.0) | –0.3524 (minimum) |
| F4 | $f(x_1, x_2, x_3) = \sum_{i=1}^{3} \left((x_1 - x_i^2)^2 + (x_i - 1)^2\right)$ | –5 < xi < 5, i = 1, 2, 3 | (1.0, 1.0, 1.0) | 0.0 (minimum) |
| F5 | $f(x_1, x_2, x_3) = \sum_{i=1}^{3} \left((a x_1 - b x_i^2)^2 + (c x_i - d)^2\right)$, with $0.999 \le a, b, c, d \le 1.001$ chosen randomly | –5 < xi < 5, i = 1, 2, 3 | (1.0, 1.0, 1.0) | 0.0 (minimum) |
| F6 | $f(x_1, x_2, x_3, x_4) = -\sum_{i=1}^{5} \left[\sum_{j=1}^{4} (x_j - d(j, i))^2 + c(i)\right]^{-1}$, with d[4, 5] = (4, 4, 4, 4; 1, 1, 1, 1; 8, 8, 8, 8; 6, 6, 6, 6; 3, 7, 3, 7) and c[5] = (0.1, 0.2, 0.2, 0.4, 0.4) | 0 < xi < 10.0, i = 1, 2, 3, 4 | (4.0, 4.0, 4.0, 4.0) | –10.1532 (minimum) |

Source: Xu, Y.G., et al., Appl. Artificial Intelligence, 15(7), 601–631, 2001. With permission.
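As a check on the definitions in Table 5.4, five of the six test functions can be coded directly and evaluated at the listed optima. This is a sketch under the reading of the table given here: F5 is left out because its coefficients are random, F2 is taken as the product of the two sums, and F6 is taken with a leading minus sign, which is what the listed negative minimum requires. The function names are illustrative.

```python
import math


def f1(x):
    """F1 (to be maximized)."""
    return math.prod(
        math.sin(5.1 * math.pi * xi + 0.5) ** 6
        * math.exp(-4 * math.log(2) * (xi - 0.0667) ** 2 / 0.64)
        for xi in x
    )


def _cos_sum(t):
    return sum(i * math.cos((i + 1) * t + i) for i in range(1, 6))


def f2(x):
    """F2 (to be minimized): product of two one-dimensional cosine sums."""
    return _cos_sum(x[0]) * _cos_sum(x[1])


def f3(x):
    """F3 (to be minimized)."""
    return x[0] ** 4 / 4 - x[0] ** 2 / 2 + x[0] / 10 + x[1] ** 2 / 2


def f4(x):
    """F4 (to be minimized)."""
    return sum((x[0] - xi ** 2) ** 2 + (xi - 1) ** 2 for xi in x)


D = [(4, 4, 4, 4), (1, 1, 1, 1), (8, 8, 8, 8), (6, 6, 6, 6), (3, 7, 3, 7)]
C = (0.1, 0.2, 0.2, 0.4, 0.4)


def f6(x):
    """F6 (to be minimized); the leading minus sign is an assumption."""
    return -sum(
        1.0 / (sum((xj - dj) ** 2 for xj, dj in zip(x, d)) + c)
        for d, c in zip(D, C)
    )


checks = {
    "F1": f1([0.0669, 0.0669]),
    "F2": f2([-1.4251, -0.8003]),
    "F3": f3([-1.0467, 0.0]),
    "F4": f4([1.0, 1.0, 1.0]),
    "F6": f6([4.0, 4.0, 4.0, 4.0]),
}
for name, value in checks.items():
    print(name, round(value, 4))
```

The printed values can be compared with the Fitness Value column of Table 5.4; small differences in the last digit are expected because the listed optima are rounded.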
FIGURE 5.5 Test function F1 has a number of local optima in the search space (surface plotted over X1 and X2). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.6 Test function F2 with a number of local optima in the search space (surface plotted over X1 and X2). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
TABLE 5.5 Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F1
(C1 to C4 denote the real-µGA with crossover operators 1 to 4; each value column is followed by the generation number at which that value was reached.)

| C1: Best Function Value | Generation Number | C2: Best Function Value | Generation Number | C3: Best Function Value | Generation Number | C4: Best Function Value | Generation Number | Binary-µGA: Best Function Value | Generation Number |
|---|---|---|---|---|---|---|---|---|---|
| 0.456 | 50 | 0.5113 | 170 | 0.5139 | 44 | 0.5655 | 100 | 0.58124 | 5 |
| 0.8696 | 191 | 0.8208 | 103 | 0.8398 | 114 | 0.8471 | 138 | 0.8423 | 242 |
| 0.95347 | 1101 | 0.9785 | 977 | 0.9785 | 977 | 0.9500 | 280 | 0.8423 | (up to 2000) |
| 0.9989 | 1879 | 0.9996 | 693 | 0.9996 | 1020 | 0.9996 | 450 | 0.8423 | (up to 2000) |

TABLE 5.6 Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F2

| C1: Best Function Value | Generation Number | C2: Best Function Value | Generation Number | C3: Best Function Value | Generation Number | C4: Best Function Value | Generation Number | Binary-µGA: Best Function Value | Generation Number |
|---|---|---|---|---|---|---|---|---|---|
| –41.895 | 12 | –38.82 | 12 | –48.497 | 7 | –33.211 | 13 | –44.04 | 26 |
| –140.68 | 57 | –120.87 | 83 | –147.94 | 183 | –171.39 | 14 | –114.4 | 52 |
| –182.58 | 109 | –182.30 | 140 | –185.60 | 230 | –186.51 | 33 | –186.06 | 151 |
| –186.11 | 1753 | –186.47 | 157 | –186.50 | 257 | –186.70 | 58 | –186.19 | 221 |
TABLE 5.7 Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F3
(C1 to C4 denote the real-µGA with crossover operators 1 to 4; each value column is followed by the generation number at which that value was reached.)

| C1: Best Function Value | Generation Number | C2: Best Function Value | Generation Number | C3: Best Function Value | Generation Number | C4: Best Function Value | Generation Number | Binary-µGA: Best Function Value | Generation Number |
|---|---|---|---|---|---|---|---|---|---|
| –0.2134 | 16 | –0.1508 | 53 | –0.1536 | 50 | –0.1009 | 46 | –0.1345 | 28 |
| –0.3437 | 128 | –0.3514 | 146 | –0.3502 | 77 | –0.3455 | 47 | –0.3279 | 86 |
| –0.3509 | 1698 | –0.3524 | 187 | –0.35236 | 112 | –0.35237 | 132 | –0.35238 | 225 |

TABLE 5.8 Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F4

| C1: Best Function Value | Generation Number | C2: Best Function Value | Generation Number | C3: Best Function Value | Generation Number | C4: Best Function Value | Generation Number | Binary-µGA: Best Function Value | Generation Number |
|---|---|---|---|---|---|---|---|---|---|
| 2.043 | 23 | 0.9847 | 16 | 1.298 | 62 | 0.9928 | 136 | 0.4703 | 22 |
| 1.3573 | (up to 2000) | 0.00133 | 96 | 0.00087 | 122 | 0.000202 | 331 | 0.0092245 | 1180 |
| 1.3573 | (up to 2000) | 0.000 | 131 | 0.000 | 183 | 0.000 | 1464 | 0.0092245 | 2000 |
TABLE 5.9 Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F5
(C1 to C4 denote the real-µGA with crossover operators 1 to 4; each value column is followed by the generation number at which that value was reached.)

| C1: Best Function Value | Generation Number | C2: Best Function Value | Generation Number | C3: Best Function Value | Generation Number | C4: Best Function Value | Generation Number | Binary-µGA: Best Function Value | Generation Number |
|---|---|---|---|---|---|---|---|---|---|
| 5.8747 | 16 | 0.9927 | 16 | 0.1946 | 63 | 0.9965 | 136 | 0.47 | 22 |
| 1.3795 | (up to 2000) | 0.836e-4 | 130 | 0.9488e-4 | 157 | 0.2369e-3 | 352 | 0.999e-2 | (up to 2000) |
| 1.3795 | (up to 2000) | 0.153e-4 | 131 (up to 2000) | 1.529e-5 | 467 | 1.529e-5 | 452 | 0.999e-2 | (up to 2000) |

TABLE 5.10 Generation Number Required to Achieve Best Function Values by Different GAs for Test Function F6

| C2: Best Function Value | Generation Number | C3: Best Function Value | Generation Number | C4: Best Function Value | Generation Number | Binary-µGA: Best Function Value | Generation Number |
|---|---|---|---|---|---|---|---|
| –1.019 | 39 | –2.119 | 29 | –1.0557 | 67 | –1.208 | 14 |
| –5.3675 | 129 | –5.0255 | 63 | –5.0183 | 271 | –2.6301 | (up to 1168, failed) |
| –10.150 | 855 | –10.152 | 156 | –10.151 | 1302 | –2.6301 | (up to 1168, failed) |
FIGURE 5.7 Convergence of the real µGA for test function F1.

FIGURE 5.8 Convergence of the real µGA for test function F2.

FIGURE 5.9 Convergence of the real µGA for test function F3.

FIGURE 5.10 Convergence of the real µGA for test function F4.

FIGURE 5.11 Convergence of the real µGA for test function F5.

FIGURE 5.12 Convergence of the real µGA for test function F6.

[Figures 5.7 to 5.12 each plot the best function value against the generation number, with one curve for the binary-coded µGA and one for each of the real-coded µGAs with crossovers 1 to 4.]
• The real-µGA with crossover operator 1 performs poorly for every test function, and the search even failed for test function F6, as shown in Figure 5.12. The poor performance of this algorithm may be due to the limitation of the simple crossover operator, where only the corner points of the rectangle (see Figure 5.4a) can be explored by the new generations.
• The real-µGA with crossover operator 2 performs reasonably well for test functions F1, F2, F3, F4 and F6, while it performs badly for test function F5. This phenomenon can be explained by the biased nature of crossover operator 2. Using the uniform arithmetical crossover (see Figure 5.4b), points outside the rectangle have no chance to be tested in new generations, resulting in a bias against points near the boundary of the search range. The points in the middle of the search range are given a higher probability of being tried.
• The real-µGA with crossover operator 3 or 4 performs reasonably well for all the functions. The algorithm with crossover operator 3 outperforms that with crossover operator 4 for test functions F3, F4 and F6. However, it underperforms crossover operator 4 for the multimodal test functions F1 and F2 (cf. Figure 5.5 and Figure 5.6), because it may be deceived at some stage in the search process. It is expected that with increasing deception level of the function, the performance of the algorithm with crossover operator 3 may deteriorate.
• The real-µGA with crossover operator 4 performs consistently well for all the test functions. In order to clearly compare the performance of the real-µGA with crossover operator 4 with the binary uniform µGA, the convergence results on the six test functions are collected in Table 5.11. It can be found that the real-µGA with crossover operator 4 consistently converges faster than the uniform µGA. By using the uniform extended arithmetical crossover operator, the search process is faster, more accurate, and not easily deceived.
TABLE 5.11 Convergence Comparison between Real-µGA with Crossover 4 and the Binary-µGA

| Test function (maximum or minimum) | Real-µGA, crossover 4: convergence point | Real-µGA, crossover 4: generation number | Binary-µGA: convergence point | Binary-µGA: generation number |
|---|---|---|---|---|
| F1 (1.000) | 0.9996 | 450 | 0.834 | Up to 2000 (failed to find the solution) |
| F2 (–186.7309) | –186.70 | 58 | –186.19 | 221 |
| F3 (–0.3524) | –0.3524 | 132 | –0.3524 | 225 |
| F4 (0.0) | 0.000 | 1464 | 0.00922 | 2000 |
| F5 (0.0) | 1.529e-5 | 452 | 0.96e-2 | Up to 2000 (failed to find the solution) |
| F6 (–10.1532) | –10.151 | 1302 | –2.6301 | Up to 2000 (failed to find the solution) |
Summarizing the above observations, the real-µGA with crossover operator 4 is recommended because of its consistently good performance on all six test functions studied. These findings are very similar to those reported by Liu and Ma (2003).
5.4 Intergeneration Projection Genetic Algorithm (IP-GA)

The IP-GA was proposed by Xu et al. (2001c). In the IP-GA, the child generation is produced using information from the parent and grandparent generations. The IP-GA was originally developed on the basis of the µGA, to make use of its small population size per generation and so maximize efficiency. It was therefore termed IP-µGA. The concept of the IP is, of course, applicable to all other versions of GAs. In this book, only the IP-µGA is used, but for simplicity the abbreviation IP-GA will be used to refer to the IP-µGA. The IP-GA starts from the modified µGA.

5.4.1 Modified µGA
It is obvious that the population size and the criterion used to define population convergence have a great influence on the performance of µGAs. The issue of population size was examined, and a corresponding procedure to determine the best population size was developed, by Abu-Lebdeh and Benekohal (1999). The criterion for population convergence was described by Carroll (1996a) as having less than 5% of the genes (or bits) of the other individuals different from those of the best individual in one generation; it has been successfully applied in engineering practice (Xiao and Yabe, 1998; Carroll, 1996a, b). Improvement of this criterion is still possible, however, because it takes into account only the number of "different genes," not their positions in the compared chromosome strings. In fact, if two individuals have the same number of genes that differ from the best individual, but the differing genes are at different positions, their Euclidean distances from the best individual in solution space (or real-value parameter space) may be significantly different. This can be demonstrated by the following example (see Table 5.12).

TABLE 5.12 Comparison of Euclidean Distances between Two Chromosomes in µGAs

| Chromosome | Binary String | Real Value | Euclidean Distance |
|---|---|---|---|
| A | 1101 1001 0001 | 13, 9, 1 | ‖A – C‖₂ = 2 |
| B | 0101 1001 0011 | 5, 9, 3 | ‖B – C‖₂ = 8 |
| C | 1101 1001 0011 | 13, 9, 3 | |

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

Chromosomes A, B, and C are each constructed as a 12-bit string coded from three real-valued parameters. Chromosome A and chromosome B each have one gene that differs from chromosome C. Because the differing gene in chromosome A is at a different position from that in chromosome B, the Euclidean distance between chromosomes A and C is significantly different from that between chromosomes B and C in the solution space. The solution space is the actual space in which population convergence is defined; therefore, a criterion that takes into account only the number of different genes in coding space, without considering the differences in their Euclidean distances in solution space, is insufficient. A modified criterion has been introduced to overcome this problem (Xu et al., 2001c), in which a weight w_i is introduced to take into account the position of the differing gene, i.e., the difference in Euclidean distance in solution space for the compared individuals. The weight w_i is given as:

$$ w_i = \frac{2i}{(N-1)\sum_{j=1}^{n_p} n_j (n_j + 1)} \tag{5.33} $$
where i is the position of the differing gene, counted from right to left within the substring representing the jth real parameter; n_j is the number of genes (or bits) in the jth substring; n_p is the number of parameters to be optimized; and N is the population size. Equation 5.33 shows that the further to the left the differing gene lies in the compared substrings, the larger the Euclidean distance between the two chromosomes in the solution space and, thus, the larger the weight w_i. A differing gene further to the left therefore has more influence on the population convergence of the µGA.

Consider the two extreme cases in which every gene in the N – 1 compared chromosomes is either identical to (complete convergence) or different from the corresponding gene in the best individual. Summing the weights w_i of Equation 5.33 over the differing genes gives, respectively (Xu et al., 2001c):

$$ \sum_{k=1}^{N-1} \sum_{j=1}^{n_p} \sum_{i=1}^{n_j} w_i = 0 \tag{5.34} $$

and

$$ \sum_{k=1}^{N-1} \sum_{j=1}^{n_p} \sum_{i=1}^{n_j} w_i = 1 \tag{5.35} $$

Here k indexes the N – 1 compared chromosomes, and a term contributes to the sum only when the corresponding gene differs from that of the best individual. In the first case no term contributes; in the second every term does, and the sum equals 1 because the sum of 2i over i = 1, …, n_j is n_j(n_j + 1). Therefore, the criterion for defining the population convergence of the µGA is set as:

$$ \sum_{k=1}^{N-1} \sum_{j=1}^{n_p} \sum_{i=1}^{n_j} w_i \le \gamma \tag{5.36} $$

With reference to the convergence criterion of Carroll (1996a), it is recommended that γ = 5 ~ 10%.
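The example of Table 5.12 and the weighted criterion of Equation 5.36 can be checked with a short Python sketch. This is only an illustration under stated assumptions: the function names `decode` and `convergence_measure` are hypothetical, each substring is read as an unsigned integer, and chromosome C is treated as the best individual of a three-member population purely for the sake of the example.

```python
import math

def decode(chrom, n_bits):
    """Split a bit string into substrings and read each as an unsigned integer."""
    values, start = [], 0
    for n in n_bits:
        values.append(int(chrom[start:start + n], 2))
        start += n
    return values

def convergence_measure(best, others, n_bits):
    """Position-weighted share of differing genes (left side of Eq. 5.36)."""
    # Denominator of Eq. 5.33: (N - 1) * sum_j n_j (n_j + 1)
    denom = len(others) * sum(n * (n + 1) for n in n_bits)
    total = 0
    for chrom in others:
        start = 0
        for n in n_bits:
            for offset in range(n):
                if chrom[start + offset] != best[start + offset]:
                    total += 2 * (n - offset)  # i is counted from the right, 1-based
            start += n
    return total / denom

# Chromosomes of Table 5.12; C plays the role of the best individual (assumption).
A, B, C = "110110010001", "010110010011", "110110010011"
bits = [4, 4, 4]
print(math.dist(decode(A, bits), decode(C, bits)))  # 2.0
print(math.dist(decode(B, bits), decode(C, bits)))  # 8.0
print(convergence_measure(C, [A, B], bits))         # 0.1
```

Both A and B differ from C in a single gene, yet B contributes twice the weight of A (8 versus 4 in the numerator) because its differing gene sits further to the left.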
5.4.2 Intergeneration Projection (IP) Operator
The intergeneration projection (IP) operator aims to find a better individual by jumping along the move direction of the best individual over two consecutive generations, so as to improve the convergence rate. It usually requires no additional function evaluations. Constructing the move direction of the best individual is the key to implementing the IP operator.

Optimization methods based on the heuristic pattern move are a kind of direct search method. Generally, they are less efficient than traditional gradient-based methods; however, they have usually been the preferred choice in hybrid genetic algorithms. This is because of their simplicity and because many real optimization problems require computationally expensive simulation packages to calculate the values of the objective functions, so that computing derivatives of the objective functions is very difficult or extremely expensive. In addition, some objective functions formulated from the real world may be nondifferentiable or discontinuous, making gradient-based methods inapplicable.

Intergeneration projection is performed using the best individuals in the current (parent) and the previous (grandparent) generations, denoted by p_j^b and p_{j–1}^b, respectively. The IP operator produces two new child individuals, c1 and c2, around p_j^b, based on the formulas (Xu et al., 2001c):

$$ c_1 = p_j^b + \alpha \left( p_j^b - p_{j-1}^b \right) \tag{5.37} $$

$$ c_2 = p_{j-1}^b + \beta \left( p_j^b - p_{j-1}^b \right) \tag{5.38} $$
where α and β are the control parameters of the IP operator; both are recommended to be within the range from 0.3 to 0.7. The effect of the control parameters on the evolutionary process is addressed in the following examples.
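Equations 5.37 and 5.38 translate directly into code. The following Python sketch assumes real-valued parameter vectors stored as lists; the function name `ip_operator` and the default values of `alpha` and `beta` (chosen inside the recommended 0.3 to 0.7 range) are illustrative choices, not part of the original formulation.

```python
def ip_operator(p_best, p_best_prev, alpha=0.5, beta=0.5):
    """Return the two IP children c1 (Eq. 5.37) and c2 (Eq. 5.38).

    p_best      -- best individual of the current (parent) generation
    p_best_prev -- best individual of the previous (grandparent) generation
    """
    # Move direction of the best individual between two consecutive generations.
    move = [a - b for a, b in zip(p_best, p_best_prev)]
    c1 = [a + alpha * m for a, m in zip(p_best, move)]       # extrapolation beyond p_best
    c2 = [b + beta * m for b, m in zip(p_best_prev, move)]   # interpolation between the two
    return c1, c2

c1, c2 = ip_operator([2.0, 4.0], [1.0, 2.0])
print(c1, c2)  # [2.5, 5.0] [1.5, 3.0]
```

With positive α, c1 lies beyond the current best individual along the move direction, while c2 with β between 0 and 1 lies between the two best individuals. The sketch does not clip the children to the search range; a practical implementation would probably need to.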
The two newly obtained individuals, c1 and c2, are used to replace the two worst individuals in the present offspring. Because a kind of gradient between the generations is used, Equation 5.37 and Equation 5.38 are expected to lead to a better individual. This feature is especially important when the search has entered the local region around the global optimum, where the best individual is close to the global optimum.

5.4.3 Hybridization of Modified µGA with IP Operator
Based on the preceding discussion, the IP-GA can be outlined as follows:

1. Letting j = 0, initialize the population of individuals, P(j) = (p_j1, p_j2, …, p_jN).
2. Evaluate the fitness values of P(j).
3. Check the termination condition. If it is satisfied, the process ends. Otherwise, set j = j + 1 and go to the next step.
4. Conduct the genetic operations (selection, crossover, etc.) to generate the initial offspring C(j) = (c_j1, c_j2, …, c_jN).
5. Evaluate the fitness values of offspring C(j), and find the two worst individuals.
6. Perform the IP operation using the two best individuals, p_j^b and p_{j–1}^b.
7. Generate two new individuals, c1 and c2, by conducting the interpolation and extrapolation along the direction of pattern move, and evaluate their fitness values.
8. Replace the two worst individuals in the initial C(j) with c1 and c2 to obtain the updated offspring, Ch(j) = (c_j1, c_j2, …, c_jN–2, c1, c2), which is used in the next round of evolution.
9. Check whether population convergence occurs in offspring Ch(j). If it does, implement the restarting strategy. Otherwise, go back to step 3.

The flowchart of the IP-GA is depicted in Figure 5.13.

FIGURE 5.13 Flow chart of the IP-GA: initiate and evaluate P(j); check the stop criterion; apply niching, selection, crossover, and elitism to generate and evaluate C(j); apply the IP operator to obtain and evaluate c1 and c2; generate Ch(j); restart if the population has converged. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

When compared with the conventional µGA shown in Figure 5.2, some features of the IP-GA can be observed as follows:

• The main difference between the IP-GA and the conventional µGA is the addition of a local intergeneration projection (IP) operator to the evolution process. Because this IP operator is a simple heuristic operator, the IP-GA can basically be regarded as the second kind of hybrid algorithm mentioned in Section 5.2.5.
• The IP-GA differs from the conventional Lamarckian approach in its hybridization principle. The Lamarckian approach uses the incorporated local operator to move all the newly generated offspring C(j) to their local optima in each of the generations (Moscato and Norman, 1992; Radcliffe and Surry, 1994), which usually results in expensive computation. The IP-GA uses the IP operator only to find a better individual near the present best individual; it does not require the individuals c1 and c2 to be local optima. This greatly simplifies the local search process and reduces the computation cost of the hybridization.
• The IP operator in the IP-GA affects the evolution process in a self-adaptive manner. At the early stage of evolution, the subspace Sp{c_j^b : f(c_j^b) ≥ f(c^b)} is larger (see Figure 5.14), where c_j^b ∈ C(j), f(c_j^b) = max{f(c_j1), f(c_j2), …, f(c_jN)}, f(·) is the fitness function, c^b ∈ (c1, c2), and f(c^b) = max{f(c1), f(c2)}. This means that the probability p(f(c_j^b) ≥ f(c^b)) is larger. In other words, the conventional genetic operations based on the stochastic model are more likely to generate an individual c_j^b better than the individual c^b generated by the IP operator. As a result, the conventional genetic operators dominate at this stage. At the later stage, as the subspace Sp{c_j^b : f(c_j^b) ≥ f(c^b)} becomes smaller, the probability p(f(c_j^b) ≥ f(c^b)) becomes correspondingly smaller. The best individual in one generation is then generated mainly by the IP operator rather than by the conventional genetic operations, which means that the IP operator plays a more important role. This self-adaptive feature of the IP operator is very beneficial to the whole evolution process. The lesser effect of the IP operator at the early stage helps to avoid the pitfall of becoming stuck at a local optimum, because the search at this stage focuses on finding the promising areas, which is mainly achieved by the conventional genetic operations. The larger effect of the IP operator at the later stage can greatly speed up the convergence of the evolution process, because the search at this stage mostly focuses on finding a better solution in the neighborhood of the present individual until the global optimum is reached.
• The IP operator always shifts its starting point for the search so that it is the best individual in the present population, no matter how this best individual was obtained (by the conventional genetic operations or by the IP operator in the previous generation). This allows the IP operator to be inserted without the pitfall of trapping the evolution process at a local optimum.
• Generating the two new individuals c1 and c2 is computationally cheap because constructing them requires no evaluation of the objective function. The computation cost for the IP-GA to reproduce each new generation hardly increases compared with the conventional GAs. Thus, time saving is a remarkable advantage of the IP-GA over other hybrid algorithms, such as those incorporating the hill-climbing method.
• Integrating the IP operator into the basic loop of GAs is simple and straightforward, so the hybrid algorithm is convenient to use in engineering practice. In addition, because the IP operator can be programmed as an independent subroutine to be called in the computation process, this idea of hybridization is also easy to incorporate into any existing GA software package.

FIGURE 5.14 Effect of intergeneration projection operation on the evolution process (fitness value versus individuals; one panel where the genetic operators dominate the evolution and one where the IP operation dominates). At the early stage of evolution, the subspace Sp{c_j^b : f(c_j^b) ≥ f(c^b)} is larger, where c_j^b ∈ C(j), f(c_j^b) = max{f(c_j1), f(c_j2), …, f(c_jN)}, c^b ∈ (c1, c2), f(c^b) = max{f(c1), f(c2)}. This means the conventional genetic operations based on the stochastic model are more likely to generate the individual c_j^b better than c^b generated from the IP operator. As a result, the conventional genetic operators have great dominance in this stage. At the later stage, with the subspace Sp{c_j^b : f(c_j^b) ≥ f(c^b)} growing smaller, the probability p(f(c_j^b) ≥ f(c^b)) also becomes correspondingly smaller. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
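The nine-step outline of the IP-GA can be sketched in Python as follows. This is a simplified illustration rather than the implementation of Xu et al. (2001c), and it rests on several assumptions that are not in the original: individuals are real-valued vectors instead of binary strings, niching is omitted, termination is a fixed generation budget, and the restart test is a simple spread measure standing in for the weighted criterion of Equation 5.36. The names `ip_ga`, `tol`, and `max_gen` are hypothetical.

```python
import random

def ip_ga(fitness, bounds, pop_size=7, alpha=0.5, beta=0.5,
          max_gen=100, tol=1e-3, seed=0):
    """Maximize `fitness` over the box `bounds`; returns (best_x, best_f, history)."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def tournament():
        a, b = rng.sample(range(pop_size), 2)
        return pop[a] if fit[a] >= fit[b] else pop[b]

    # Steps 1-2: initialize and evaluate the population.
    pop = [random_individual() for _ in range(pop_size)]
    fit = [fitness(p) for p in pop]
    best = max(zip(fit, pop))      # (fitness, individual) of generation j
    prev_best = best               # best of generation j - 1
    history = [best[0]]

    for _ in range(max_gen):       # Step 3: fixed generation budget (assumption)
        # Step 4: tournament selection, uniform crossover, elitism, no mutation.
        children = [list(best[1])]
        while len(children) < pop_size:
            pa, pb = tournament(), tournament()
            children.append([x if rng.random() < 0.5 else y
                             for x, y in zip(pa, pb)])
        child_fit = [fitness(c) for c in children]           # Step 5

        # Steps 6-7: project along the move of the best individual.
        move = [a - b for a, b in zip(best[1], prev_best[1])]
        c1 = clip([a + alpha * m for a, m in zip(best[1], move)])      # Eq. 5.37
        c2 = clip([b + beta * m for b, m in zip(prev_best[1], move)])  # Eq. 5.38

        # Step 8: c1 and c2 replace the two worst offspring.
        worst_two = sorted(range(pop_size), key=child_fit.__getitem__)[:2]
        for k, c in zip(worst_two, (c1, c2)):
            children[k], child_fit[k] = c, fitness(c)
        pop, fit = children, child_fit
        new_best = max(zip(fit, pop))

        # Step 9: restart when the population has collapsed onto the best one.
        spread = max(abs(v - b) / (hi - lo)
                     for p in pop
                     for v, b, (lo, hi) in zip(p, new_best[1], bounds))
        if spread <= tol:
            pop = [list(new_best[1])] + [random_individual()
                                         for _ in range(pop_size - 1)]
            fit = [fitness(p) for p in pop]
            new_best = max(zip(fit, pop))

        prev_best, best = best, new_best
        history.append(best[0])

    return best[1], best[0], history
```

A call such as `ip_ga(lambda x: -((x[0] - 1) ** 2 + (x[1] + 2) ** 2), [(-5, 5), (-5, 5)])` returns the best point found, its fitness, and the best fitness per generation. Because of elitism, the recorded best fitness never decreases from one generation to the next.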
5.4.4 Performance Tests and Discussions

To examine the effectiveness of the IP-GA, the six typical benchmark functions listed in Table 5.4 are tested to see how fast their global optima can be obtained using the IP-GA.

5.4.4.1 Convergence Performance of the IP-GA

For each test function, 18 cases are studied in order to test the convergence performance of the IP-GA fully. These 18 cases use the same genetic operators but different combinations of α and β. The genetic operators are set as follows: a population size of 7, tournament selection, no mutation, niching, elitism, a uniform crossover probability of 0.5, one child, and γ = 5%. The 18 combinations of α and β are created by setting β = 0.5 and varying α from 0.1 to 0.9 with an increment of 0.1, and by setting α = 0.6 and varying β from 0.1 to 0.9 with the same increment. Table 5.13 and Table 5.14 show the convergence results in terms of the number of generations, nIP-GA, that the IP-GA takes to reach the global optimum. For comparison, the conventional µGA with the same genetic operators but without the IP operator is also run for the six test functions. Its results are also shown in Table 5.13 and Table 5.14, where nµGA is the number of generations to convergence when using the µGA and fn is the best fitness value at generation n.
TABLE 5.13 Comparison of Numbers of Generations to Convergence Using µGA and IP-GA for Different α with β = 0.5

| No. | nIP-GA^a, α = 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | nµGA^b (fn) | nIP-GA/nµGA (%), Min | nIP-GA/nµGA (%), Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 189 | 325 | 213 | 209 | 44 | 151 | 96 | 103 | 293 | >500 (0.9998) | <8.8 | <65 |
| F2 | 177 | 229 | 164 | 58 | 72 | 187 | 68 | 104 | 94 | >500 (–185.83) | <11.6 | <45.8 |
| F3 | 261 | 353 | 236 | 278 | 202 | 80 | 198 | 286 | 326 | 493 (–0.3524) | 16.2 | 71.6 |
| F4 | 326 | 446 | 249 | 278 | 309 | 266 | 247 | 363 | 418 | >1000 (–0.0090) | <24.7 | <44.6 |
| F5 | 337 | 458 | 235 | 274 | 265 | 188 | 279 | 389 | 437 | >1000 (–0.0093) | <18.8 | <45.8 |
| F6 | 1759 | 959 | 746 | 532 | 436 | 596 | 873 | 682 | 1232 | >3000 (–5.0556) | <14.5 | <58.6 |

a nIP-GA = the number of generations to convergence using the IP-GA (β = 0.5, α varies from 0.1 ~ 0.9).
b nµGA = the number of generations to convergence using the µGA.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.
TABLE 5.14 Comparison of Numbers of Generations to Convergence Using µGA and IP-GA for Different β with α = 0.6

| No. | nIP-GA^a, β = 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | nµGA^b (fn) | nIP-GA/nµGA (%), Min | nIP-GA/nµGA (%), Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 302 | 32 | 97 | 54 | 151 | 203 | 55 | 66 | 87 | >500 (0.9998) | <6.4 | <60.4 |
| F2 | 205 | 71 | 93 | 110 | 187 | 105 | 91 | 40 | 97 | >500 (–185.83) | <8 | <41 |
| F3 | 351 | 367 | 198 | 234 | 80 | 135 | 186 | 259 | 301 | 493 (–0.3524) | 16.2 | 74.4 |
| F4 | 333 | 482 | 388 | 198 | 266 | 231 | 206 | 589 | 342 | >1000 (–0.0090) | <19.8 | <58.9 |
| F5 | 392 | 513 | 401 | 201 | 188 | 303 | 257 | 556 | 312 | >1000 (–0.0093) | <18.8 | <55.6 |
| F6 | 770 | 1214 | 768 | 486 | 596 | 723 | 512 | 780 | 1106 | >3000 (–5.0556) | <17.1 | <40.5 |

a nIP-GA = the number of generations to convergence for the IP-GA (α = 0.6, β varies from 0.1 ~ 0.9).
b nµGA = the number of generations to convergence for the µGA.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.
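The percentage columns of Table 5.13 and Table 5.14 follow from the generation counts. The Python sketch below recomputes them as a check; the variable and function names are illustrative. Where the µGA did not converge within its generation limit, that limit is used as nµGA, so the computed ratios are upper bounds (hence the "<" signs in the tables).

```python
# Generation limits or counts for the conventional µGA (nµGA column).
N_MUGA = {"F1": 500, "F2": 500, "F3": 493, "F4": 1000, "F5": 1000, "F6": 3000}

# nIP-GA for alpha = 0.1 ... 0.9 with beta = 0.5 (Table 5.13).
TABLE_5_13 = {
    "F1": [189, 325, 213, 209, 44, 151, 96, 103, 293],
    "F2": [177, 229, 164, 58, 72, 187, 68, 104, 94],
    "F3": [261, 353, 236, 278, 202, 80, 198, 286, 326],
    "F4": [326, 446, 249, 278, 309, 266, 247, 363, 418],
    "F5": [337, 458, 235, 274, 265, 188, 279, 389, 437],
    "F6": [1759, 959, 746, 532, 436, 596, 873, 682, 1232],
}
# nIP-GA for beta = 0.1 ... 0.9 with alpha = 0.6 (Table 5.14).
TABLE_5_14 = {
    "F1": [302, 32, 97, 54, 151, 203, 55, 66, 87],
    "F2": [205, 71, 93, 110, 187, 105, 91, 40, 97],
    "F3": [351, 367, 198, 234, 80, 135, 186, 259, 301],
    "F4": [333, 482, 388, 198, 266, 231, 206, 589, 342],
    "F5": [392, 513, 401, 201, 188, 303, 257, 556, 312],
    "F6": [770, 1214, 768, 486, 596, 723, 512, 780, 1106],
}

def ratios(table, cols=slice(None)):
    """Percentage nIP-GA / nµGA per test function for the selected columns."""
    return {f: [100 * n / N_MUGA[f] for n in row[cols]]
            for f, row in table.items()}

def pooled(cols=slice(None)):
    """All ratios from both tables for the selected parameter columns."""
    return [r for t in (TABLE_5_13, TABLE_5_14)
            for rs in ratios(t, cols).values() for r in rs]

full = pooled()                  # alpha, beta in 0.1 ~ 0.9
narrow = pooled(slice(2, 7))     # alpha, beta in 0.3 ~ 0.7
print(round(min(full), 1), round(max(full), 1), round(max(narrow), 1))
# 6.4 74.4 56.4
```

The three printed values are the overall minimum and maximum ratios for the full parameter range and the maximum ratio when α and β are restricted to 0.3 ~ 0.7.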
It can be seen that the IP-GA converges much faster than the conventional µGA. It takes only 6.4 ~ 74.4% of the number of generations required by the µGA to obtain the global optimum for any α and β in the range of 0.1 ~ 0.9. This means that, in these tests, the IP-GA performs better than the conventional µGA even with the worst combination of α and β. If the parameters α and β are further limited to a smaller range of 0.3 ~ 0.7, the maximal ratio nIP-GA/nµGA decreases from 74.4 to 56.4%.

The computation time for reproducing the same number of generations using the IP-GA is almost equivalent to that using the conventional µGA for each test function. For example, both take about 1 minute to complete the first 500 generations for test function F1 on an SGI/Cray workstation. This results from the fact that no function evaluation is required in the added IP operator.

To reveal the evolution process, Figure 5.15 to Figure 5.20 show the convergence processes of test functions F1 ~ F6, respectively, using the IP-GA (α = 0.6, β = 0.5) against the conventional µGA.

FIGURE 5.15 Comparison of convergence processes for test function F1 (fitness value versus number of generations; curves: µGA and IP-GA with α = 0.6, β = 0.5). The IP-GA provides a quick convergence. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
FIGURE 5.16 Comparison of convergence processes for test function F2 (fitness value versus number of generations; curves: µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.17 Comparison of convergence processes for test function F3 (fitness value versus number of generations; curves: µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.18 Comparison of convergence processes for test function F4 (fitness value versus number of generations; curves: µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
FIGURE 5.19 Comparison of convergence processes for test function F5 (fitness value versus number of generations; curves: µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.20 Comparison of convergence processes for test function F6 (fitness value versus number of generations; curves: µGA and IP-GA with α = 0.6, β = 0.5). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

5.4.4.2 Effect of Control Parameters α and β

Table 5.13 and Table 5.14 show that different combinations of α and β result in different nIP-GA for each of the test functions. The selection of α and β has a significant effect on the evolution process of the IP-GA. To show this more clearly, Figure 5.21 and Figure 5.22 plot the convergence processes of test function F1 for the 18 different combinations of α and β. Further observation leads to the following findings:

• For every combination of α and β tested, the IP-GA converges significantly faster than the conventional µGA when the same genetic operators are used.
• Further improvement in the convergence performance of the IP-GA depends on a better combination of α and β. Extreme values of α and β (too small or too large, such as 0.1 or 0.9) usually result in less improvement.
• It is difficult to specify values of α and β that produce the best convergence performance for all the test functions. For example, α = 0.6, β = 0.2 is the best choice for test function F1, resulting in the fastest convergence (only 32 generations required). However, this choice does not generate the best results for the other test functions. This means that the best selection of α and β depends on the fitness function.
FIGURE 5.21 Evolution processes using different α for function F1 (β = 0.5); fitness value versus number of generations, one curve for each α from 0.1 to 0.9. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

FIGURE 5.22 Evolution processes using different β for function F1 (α = 0.6); fitness value versus number of generations, one curve for each β from 0.1 to 0.9. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
Based on the preceding analysis, parameters α and β are recommended to be within the range of 0.3 ~ 0.7. Table 5.15 shows that the mean of nIP-GA when using α and β within this recommended range is clearly lower than that when using α and β within 0.1 ~ 0.9. A selection of α and β within 0.3 ~ 0.7 may not be the optimal choice for a specific fitness function; however, in these tests it always gives a significantly better result than the conventional µGA.

TABLE 5.15 Means of the Numbers of Generations (nIP-GA^a) of IP-GA to Convergence When Using Different Ranges of α and β

| α ~ β | F1 | F2 | F3 | F4 | F5 | F6 |
|---|---|---|---|---|---|---|
| 0.3 ~ 0.7 | 127 | 113 | 183 | 264 | 259 | 627 |
| 0.1 ~ 0.9 | 151 | 120 | 248 | 338 | 342 | 845 |

a nIP-GA = the number of generations to convergence for the IP-GA.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

5.4.4.3 Effect of the IP Operator

It is interesting to reveal quantitatively the influence of the IP operator on the evolution process in the IP-GA. A simple way to do this is to count the number of best individuals, nb, generated by the IP operator during the evolution process. The larger the number nb is, the stronger the influence of the IP operator.

TABLE 5.16 Numbers of Best Individuals (nb) Generated by the IP Operator Using Different α for Function F1 (β = 0.5)

| α | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|
| nb | 144 | 278 | 173 | 168 | 30 | 121 | 67 | 73 | 260 |
| nb/nIP-GA^a (%) | 76.2 | 85.5 | 81.2 | 80.4 | 75.0 | 80.1 | 69.8 | 70.9 | 88.7 |

a nIP-GA = the number of generations to convergence for the IP-GA.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

TABLE 5.17 Numbers of Best Individuals (nb) Generated by the IP Operator Using Different β for Function F1 (α = 0.6)

| β | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|
| nb | 249 | 18 | 75 | 35 | 121 | 157 | 29 | 40 | 67 |
| nb/nIP-GA^a (%) | 82.5 | 56.3 | 77.7 | 64.8 | 80.1 | 77.3 | 52.7 | 60.6 | 77.0 |

a nIP-GA = the number of generations to convergence for the IP-GA.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

Table 5.16 and Table 5.17 show the ratio nb/nIP-GA for test function F1. The ratio ranges from about 53 to 89% over all 18 cases. This means that the IP operator plays a very important role in the whole evolution process; the same is true for the other test functions. Table 5.18 shows the mean ratios of nb/nIP-GA for functions F1 through F6. All of them are over 69%. Figure 5.23 shows the convergence processes of test function F1 when using the IP-GA with three different sets of α and β. The "•" mark indicates the best individuals generated by the IP operator. This mark becomes denser as the generation number increases in the three convergence curves, meaning that the IP operator plays a more important role as the global optimum is approached. This self-adaptive feature of the IP operator is highly desirable. It is this feature that enables the IP-GA to explore the promising
areas containing the global optima at the early stage and to converge quickly to the final solution in the later generations.

TABLE 5.18 Mean Ratios of nb^a/nIP-GA^b for All Test Functions

| No. | F1 | F2 | F3 | F4 | F5 | F6 |
|---|---|---|---|---|---|---|
| nb/nIP-GA (%) | 69.3 | 75.3 | 80.1 | 73.2 | 75.6 | 78.1 |

a nb = number of best individuals generated by the IP operator.
b nIP-GA = the number of generations to convergence for the IP-GA.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

FIGURE 5.23 Effect of IP operator on evolution process of function F1 (fitness value versus number of generations; curves: α = 0.3, β = 0.5; α = 0.5, β = 0.5; α = 0.6, β = 0.6). (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)

5.4.4.4 Comparison with Hybrid GAs Incorporated with Hill-Climbing Method

Hybrid GAs incorporating the hill-climbing method have been widely applied in engineering practice, so it is worthwhile to compare this kind of hybrid GA with the IP-GA. In this study, two typical schemes for hybrid GAs incorporating the hill-climbing method are used. The first integrates the hill-climbing method with the µGA (denoted as GA-HC(1)). This is called the Lamarckian evolution algorithm (Kennedy, 1993) or memetic algorithm (Moscato and Norman, 1992; Radcliffe and Surry, 1994). The second runs the µGA for a predesignated number of generations, npre, then applies a hill-climbing method to all the obtained individuals to get the best solution (denoted as GA-HC(2)). The minimal step length for variable p_i in the hill-climbing method is set to [p_i^max – p_i^min]/32768 so as to be in accordance with the computation accuracy in the IP-GA, where the number of possibilities for each variable is 32,768. The comparison study is done for all six test functions.

Table 5.19 shows the results obtained from the hill-climbing method in the GA-HC(1) for test function F2. Function F2 is selected as a representative function for detailed discussion because it has many local optima (see Figure 5.6). In Table 5.19, the starting points are the offspring in the present generation, obtained from the present parents using the conventional genetic operations. The ending points are the local optima obtained using the hill-climbing method from the corresponding starting points; they are the parents of the next generation.

TABLE 5.19 Results Obtained from Hill-Climbing Method in GA-HC(1)^a for Function F2

| Generation | Individual | Starting point: variables | Starting point: function value | Ending point: variables | Ending point: function value | nf-hill^b |
|---|---|---|---|---|---|---|
| 1 | 1 | –9.6728, 5.6255 | 22.0841 | –9.7803, 5.4827 | 38.2959 | 69 |
| 1 | 2 | –7.2533, 2.8965 | 18.8253 | –7.0837, 2.7860 | 38.2959 | 63 |
| 1 | 3 | 0.6772, 5.3959 | 30.6487 | 0.8219, 5.4827 | 54.4048 | 57 |
| 1 | 4 | –7.1868, –9.9219 | 19.5499 | –7.0837, –9.7803 | 38.2959 | 60 |
| 1 | 5 | 0.0003, –9.9169 | –7.1568 | –0.1956, –0.0342 | 3.0050 | 91 |
| 1 | 6 | –9.9774, –9.6802 | –1.3402 | –10.0165, –0.780 | 0.0946 | 53 |
| 1 | 7 | –9.3262, 0.3177 | –13.4106 | –9.7803, 0.3342 | 10.1556 | 112 |
| 2 | 1 | –9.4775, 6.1065 | 12.8347 | –9.2865, 6.0875 | 30.7807 | 71 |
| 2 | 2 | –6.9249, 2.7525 | 27.1076 | –7.0837, 2.7860 | 38.2959 | 65 |
| 2 | 3 | –8.2366, –9.8401 | 14.1813 | –8.2904, –9.7803 | 16.2861 | 60 |
| 2 | 4 | 0.8219, 5.4827 | 54.4048 | 0.8219, 5.4827 | 54.4048 | 16 |
| 2 | 5 | 2.8336, –9.8804 | –5.1788 | 2.7860, –10.0366 | 1.0463 | 85 |
| 2 | 6 | –9.9756, –9.5239 | 0.4181 | –9.7803, –9.2865 | 9.5388 | 49 |
| 2 | 7 | –9.0576, 0.4739 | –1.2259 | –8.7939, 0.3342 | 13.8031 | 74 |
| 3 | 1 | 1.1429, 6.1065 | 9.5043 | 1.3199, 6.0875 | 24.9369 | 71 |
| 3 | 2 | –9.5868, 8.2409 | –0.9176 | –9.2865, 8.0889 | 9.8608 | 79 |
| 3 | 3 | –9.4903, 5.4729 | –18.4041 | –9.7803, 5.4827 | 38.2959 | 81 |
| 3 | 4 | –9.2816, 0.9116 | 11.5597 | –9.2865, 0.8219 | 13.5513 | 55 |
| 3 | 5 | –7.1618, –7.3705 | –32.5859 | –7.0837, –7.7081 | 186.7307 | 92 |
| 3 | 6 | 0.8219, 5.4827 | 54.4048 | 0.8219, 5.4827 | 54.4048 | 16 |
| 3 | 7 | –6.5575, 0.4788 | 16.8553 | –6.4788, 0.3342 | 32.7709 | 55 |

a GA-HC(1) = method integrating the hill-climbing method with the µGA.
b nf-hill = the number of function evaluations taken in the hill-climbing process.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.

Table 5.19 shows that, in the first three generations, the conventional genetic operations in the µGA failed to discover the global optimum. The best fitness value in the first, second, and third generations is 30.6478 (individual 3), 54.4048 (individual 4), and 54.4048 (individual 6), respectively. However, starting from individual 5 in the offspring of generation 3, the hill-climbing method successfully discovered the global optimum (fitness value = 186.7307). The total number of function evaluations, nf, taken in this process is 1395 (the sum of the numbers of function evaluations taken by the hill-climbing method and by the genetic operators in the first three generations). To visualize the search process of the hill-climbing method, Figure 5.24 and Figure 5.25 show how this method moved from the initial points provided by the µGA at generation 1 and generation 3, respectively, to the corresponding ending points.
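The text does not specify which hill-climbing variant is used, so the following Python sketch is only one plausible reading: a coordinate search that halves its step whenever no neighbor improves, stopping once the step falls below the minimal step length (p_i^max – p_i^min)/32768 stated above. The function name `hill_climb`, the initial step of one eighth of the range, and the evaluation counter are assumptions made for illustration.

```python
def hill_climb(f, x0, bounds, levels=32768, first_fraction=0.125):
    """Coordinate hill climbing (maximization) with step halving.

    Returns (x, f(x), n_eval), where n_eval counts the calls to f,
    playing the role of nf-hill in Table 5.19.
    """
    x, fx, n_eval = list(x0), f(x0), 1
    fraction = first_fraction            # step as a fraction of each range (assumption)
    while fraction >= 1.0 / levels:      # minimal step: range / 32768
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            step = fraction * (hi - lo)
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] = min(max(x[i] + sign * step, lo), hi)
                f_trial = f(trial)
                n_eval += 1
                if f_trial > fx:         # accept the first improving move
                    x, fx, improved = trial, f_trial, True
                    break
        if not improved:
            fraction /= 2.0              # refine the step when stuck
    return x, fx, n_eval
```

In a GA-HC(1) style scheme, each offspring of a generation would be passed through `hill_climb` and the resulting ending points would become the parents of the next generation; in a GA-HC(2) style scheme, `hill_climb` would be applied once to the individuals obtained after npre generations. As with any local search, the ending point depends on the starting point, which is why most rows of Table 5.19 end at local rather than global optima.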
FIGURE 5.24 Hill-climbing searching starting from offspring in generation 1 in the GA-HC(1) for function F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
FIGURE 5.25 Hill-climbing searching starting from offspring in generation 3 in the GA-HC(1) for function F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
TABLE 5.20 Best Individuals Obtained from Hill-Climbing Method in GA-HC(2)^a for Function F2

| Generation (npre)^b | Ending point (best individual): variables | Ending point (best individual): function value | No. of optima | nf-hill^c |
|---|---|---|---|---|
| 5 | 0.8219, 5.4827 | 54.4048 | — | 743 |
| 10 | –6.4788, 5.4827 | 123.5766 | — | 619 |
| 20 | –6.4788, 5.4827 | 123.5766 | — | 801 |
| 30 | –1.4249, 5.4827 | 186.7307 | 1 | 1087 |
| 40 | –1.4249, 5.4827 | 186.7307 | 3 | 1060 |

a GA-HC(2) = GA is run for a predesignated number of generations (npre), then the hill-climbing method is applied to get the final solution.
b npre = predesignated number of generations from which the hill-climbing method starts.
c nf-hill = number of function evaluations taken in the hill-climbing process.

Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.
Table 5.20 shows the best individuals and their fitness values obtained by the hill-climbing method in the GA-HC(2). The hill-climbing searching starts from the offspring in generations 5, 10, 20, 30, and 40, respectively. It can be seen that the hill-climbing searching starting from the offspring at generations 5, 10, and 20 fails to obtain the global optimum (Figure 5.26 for the first case). It is not successful until npre increases to 30, when one of the offspring leads the hill-climbing searching to the global optimum (Figure 5.27). With a further increase of npre, more offspring become close to the global optimum, and the number of offspring that can lead the hill-climbing searching to the global optimum increases correspondingly. For example, when npre increases to 40, three offspring lead the hill-climbing searching to the global optimum. However, a larger npre usually results in more function evaluations.
FIGURE 5.26 Hill-climbing searching starting from offspring in generation 5 in the GA-HC(2) for function F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
FIGURE 5.27 Hill-climbing searching starting from offspring in generation 30 in the GA-HC(2) for function F2. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.)
Comparing the GA-HC(1) with the GA-HC(2), the total number of function evaluations, nf, taken to reach the global optimum for test function F2 is 1395 and 1297 (npre = 30), respectively. The GA-HC(2) costs less computationally than the GA-HC(1), which is also true for the other test functions. However, it is usually difficult to designate the number npre properly in the GA-HC(2) (Yang et al., 1995; Xiao and Yabe, 1998; Xu et al., 2001c). Improper selection of npre usually results in either an overuse of function evaluations or a failure to reach the global optimum.
Table 5.21 shows the number nf for the six test functions when using the GA-HC(1), GA-HC(2), IP-GA, and the conventional µGA, respectively. It can be found that the IP-GA incurs the least computation cost among all four algorithms. The advantage is more obvious for test functions with more decision variables, such as functions F3, F4, and F5. In addition, the IP-GA avoids the difficulty of choosing npre that arises in the GA-HC(2). Nevertheless, there is a situation in which this IP-GA does not perform particularly well: when the individual $p_j^b$ is identical to $p_{j-1}^b$ at the jth generation in the evolution process, the IP operator fails to find new individuals different from those in C(j). This decreases the population diversity and also causes unnecessary evaluations of the same individuals in one generation.
TABLE 5.21 Comparison of Numbers of Function Evaluations (nf) Using Different Algorithms for Test Functions

Algorithm     F1       F2       F3      F4       F5       F6
GA-HC(1)^a    1476     1395     6199    10004    10012    9615
GA-HC(2)^b    1373     1297     3375    6736     6765     8004
IP-GA         1150     1024     1654    2383     2338     5650
µGA           >3500    >3500    3451    >7000    >7000    >21,000

a GA-HC(1) = method integrating the hill-climbing method with the µGA.
b GA-HC(2) = GA is run for a predesignated number of generations (npre), then the hill-climbing method is applied to get the final solution.
Source: Xu, Y.G. et al., Appl. Artif. Intelligence, 15(7), 601–631, 2001. With permission.
5.5
Improved IP-GA
Improvements to the previous IP-GA are made to overcome the problem mentioned above and to further increase the searching efficiency of the algorithm (Xu et al., 2002). These include:
• The IP operator is improved by using an alternative way to construct the move direction of the best individual, so that it can find better individuals different from those in C(j) with a significantly increased probability. That is, the move direction of the best individual is constructed using $p_j^b$ and $p_{j-1}^b$ when they differ, or using $p_j^b$ and $p_j^s$ (the second-best individual at the jth generation) when $p_j^b$ is identical to $p_{j-1}^b$.
• Only the better of the two new individuals obtained from the IP operator is used to replace the worst individual in the current C(j) to implement the hybridization process. This avoids the decrease of population diversity caused by inserting two new individuals that are close to each other, and it also helps to decrease the population size.
• The mutation operation is employed in the evolution process to increase the population diversity. This is especially beneficial when the improved IP operator fails to find new individuals different from those in C(j) because $p_j^b$, $p_{j-1}^b$, and $p_j^s$ are identical.
This new IP-GA is termed the improved IP-GA.
5.5.1
Improved IP Operator
The IP operator is improved so that $p_j^b$ and $p_{j-1}^b$ are used only if they are not identical; otherwise, $p_j^b$ and $p_j^s$ are used. This means that the better individual c is obtained by (Xu et al., 2002):
\[
\begin{aligned}
& c \in \{c_1, c_2\}, \qquad f(c) = \max\{f(c_1), f(c_2)\} \\
& c_1 = p_j^b + \alpha\,(p_j^b - p) \\
& c_2 = p_{j-1}^b + \beta\,(p_j^b - p) \\
& p = \begin{cases} p_{j-1}^b, & p_j^b \neq p_{j-1}^b \\ p_j^s, & p_j^b = p_{j-1}^b \end{cases}
\end{aligned}
\tag{5.39}
\]
where α and β are recommended to be within 0.1 ~ 0.5 and 0.3 ~ 0.7, respectively. It is clear that α and β decide how far the newly generated individual c is from the present best individual $p_j^b$. The selection of α and β has obvious effects on the convergence process of the IP-GA. This point is discussed in detail in Section 5.4.4 for the IP-GA; the following examples address it further.
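The projection step of Equation 5.39 can be sketched in a few lines. This is only a minimal illustration under stated assumptions: individuals are real-valued parameter lists, the fitness is to be maximized, and c2 is taken as $p_{j-1}^b + \beta(p_j^b - p)$. The function name improved_ip, its argument names, and the quadratic fitness in the usage lines are hypothetical and do not come from Xu et al. (2002).

```python
from typing import Callable, List, Sequence


def improved_ip(
    p_best: Sequence[float],       # p_j^b: best individual of generation j
    p_prev_best: Sequence[float],  # p_{j-1}^b: best individual of generation j-1
    p_second: Sequence[float],     # p_j^s: second-best individual of generation j
    fitness: Callable[[Sequence[float]], float],
    alpha: float = 0.2,
    beta: float = 0.5,
) -> List[float]:
    """Return the better of the two projected individuals c1 and c2."""
    # Reference point p: the previous best, or the second best when the
    # best individual did not change between generations.
    p = p_prev_best if list(p_best) != list(p_prev_best) else p_second
    direction = [b - q for b, q in zip(p_best, p)]
    c1 = [b + alpha * d for b, d in zip(p_best, direction)]
    # Assumed form of c2: base point p_{j-1}^b, same move direction as c1.
    c2 = [q + beta * d for q, d in zip(p_prev_best, direction)]
    return c1 if fitness(c1) >= fitness(c2) else c2


def demo_fitness(x: Sequence[float]) -> float:
    """Hypothetical fitness with its maximum at (1, 1)."""
    return -((x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2)


# Best individual changed between generations: p = p_{j-1}^b.
c_changed = improved_ip([0.8, 0.8], [0.4, 0.4], [0.6, 0.9], demo_fitness)
# Best individual unchanged: p = p_j^s, so a new direction is still available.
c_stalled = improved_ip([0.8, 0.8], [0.8, 0.8], [0.6, 0.9], demo_fitness)
print(c_changed, c_stalled)
```

In the second call the previous IP operator would have produced a zero move direction; falling back to the second-best individual is what keeps the projected individuals different from those already in the population.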
5.5.2
Implementation of the Improved IP Operator
The implementation of the improved IP operator can be carried out exactly as described in Section 5.4.3, except that only the following steps are adopted:
• Carry out the conventional genetic operations: niching, selection, crossover, mutation, and elitism, which result in a new generation. Details can be found in Section 5.2.2. The mutation operator is employed in the improved IP-GA, as highlighted in Figure 5.28.
• Generate offspring C(j) = (cj1, cj2, …, cjN), and evaluate their fitness values. They are expected to be closer to the global optimum than those in P(j).
• Carry out the projection operation:
  • Using Equation 5.39, construct the move direction of the best individual.
  • Generate the individuals c1 and c2, and evaluate their fitness values.
  • Select the better individual.
This process is depicted in Figure 5.28.
FIGURE 5.28 Flow chart of the improved IP-GA. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)
Basically, the improved IP-GA takes the same strategy for incorporating the IP operator into the basic loop of the µGA as that used in the previous IP-GA. It thus maintains the main advantages of IP-GAs:
• Very little computation effort is required in the projection operator.
• The incorporated projection operator affects the evolution process in a self-adaptive manner so as to ensure global searching and fast convergence.
• The implementation of the improved IP operator is straightforward. Therefore, it is convenient to use in engineering practice.
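The loop described above can be sketched as follows. This is a deliberately simplified, real-coded illustration and not the authors' program: niching and the restart on population convergence are omitted, the binary coding is replaced by real-valued genes, and c2 is assumed to be $p_{j-1}^b + \beta(p_j^b - p)$. All names (improved_ip_ga, sphere_fitness, the default parameter values) are hypothetical.

```python
import random


def improved_ip_ga(fitness, bounds, generations=100, pop_size=5,
                   alpha=0.2, beta=0.5, p_cross=0.5, p_mut=0.02, seed=0):
    """Simplified micro-GA loop with an intergeneration projection step."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    prev_best = max(pop, key=fitness)
    for _ in range(generations):
        # Conventional operations: elitism, selection, uniform crossover, mutation.
        children = [list(max(pop, key=fitness))]
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            child = [a if rng.random() < p_cross else b for a, b in zip(p1, p2)]
            child = [rng.uniform(lo, hi) if rng.random() < p_mut else v
                     for v, (lo, hi) in zip(child, bounds)]
            children.append(child)
        # Projection: use the previous best, or the second best if the best stalled.
        ranked = sorted(children, key=fitness, reverse=True)
        best, second = ranked[0], ranked[1]
        p = prev_best if best != prev_best else second
        d = [b - q for b, q in zip(best, p)]
        c1 = clip([b + alpha * di for b, di in zip(best, d)])
        c2 = clip([q + beta * di for q, di in zip(prev_best, d)])
        # Only the better projected individual replaces the worst offspring.
        ranked[-1] = max((c1, c2), key=fitness)
        prev_best, pop = best, ranked
    return max(pop, key=fitness)


def sphere_fitness(x):
    """Hypothetical fitness with its maximum at (1, -2)."""
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)


demo_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
result = improved_ip_ga(sphere_fitness, demo_bounds)
print(result, sphere_fitness(result))
```

Because the elite is copied into every generation and only the worst offspring is replaced, the best fitness in the population never decreases; the projection adds two function evaluations per generation.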
5.5.3
Performance Test
The test functions listed in Table 5.4 have been used for the performance test of the improved IP-GA.
5.5.3.1 Performance of the Improved IP-GA
The convergence performance of the improved IP-GA is investigated in terms of the number of generations (or the number of function evaluations) required to obtain the global optimum for the preceding benchmark functions. To make the results statistically meaningful, each benchmark function is tested 40 times using the improved IP-GA with different combinations of α and β and different initial random number seeds idum. With a different negative number idum, Knuth's algorithm generates a different series of random numbers. The combinations are sampled at α = 0.1, 0.3, β = 0.5, 0.618, and idum = –1000, –5000, –10,000, –15,000, –20,000, –30,000, –35,000, –40,000, –45,000, and –50,000, using the full factorial combination method (2 × 2 × 10 = 40 runs). The genetic operators and other operation parameters used are: probability of uniform crossover of 0.5, probability of mutation of 0.02, tournament selection, one child, niching, elitism, population size of 5, and γ = 5%. In order to have a fair and meaningful comparison, the IP-GA and µGA are also run 40 times with genetic operators and operation parameters similar to those of the improved IP-GA (except for population size N = 7) for all six test functions listed in Table 5.4. The results are shown in Table 5.22, where nIIP-GA, nIP-GA, and nµGA are the mean numbers of generations needed to obtain the global optimum when using the improved IP-GA, IP-GA, and µGA, respectively. It can be found that the improved IP-GA converges much faster than the conventional µGA as well as the previous IP-GA. To show the comparison clearly, two relative ratios are defined as RatioIP = (5 × nIIP-GA)/(7 × nIP-GA) and RatioµGA = (5 × nIIP-GA)/(7 × nµGA).
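The factors 5 and 7 in the two ratios are the population sizes, so that each ratio compares numbers of function evaluations rather than numbers of generations. As a small worked check, the snippet below recomputes both ratios from the mean numbers of generations reported in Table 5.22; the function and variable names are illustrative only.

```python
# Mean numbers of generations for F1..F6, as reported in Table 5.22.
n_iip = [158, 185, 175, 261, 229, 652]      # improved IP-GA, population size 5
n_ip = [189, 348, 437, 544, 583, 1195]      # IP-GA, population size 7
n_mu = [984, 1136, 983, 1561, 1648, 3271]   # µGA, population size 7


def evaluation_ratio(n_improved, n_reference, pop_improved=5, pop_reference=7):
    """Ratio of function evaluations in percent, one value per test function."""
    return [
        100.0 * (pop_improved * a) / (pop_reference * b)
        for a, b in zip(n_improved, n_reference)
    ]


ratio_ip = evaluation_ratio(n_iip, n_ip)
ratio_mu = evaluation_ratio(n_iip, n_mu)
for name, r1, r2 in zip(["F1", "F2", "F3", "F4", "F5", "F6"], ratio_ip, ratio_mu):
    print(f"{name}: Ratio_IP = {r1:.1f}%  Ratio_muGA = {r2:.1f}%")
```

For F1, for example, this gives (5 × 158)/(7 × 189) ≈ 59.7% and (5 × 158)/(7 × 984) ≈ 11.5%.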
These two relative ratios are calculated and shown in Table 5.22. From this table, it can be found that the improved IP-GA takes only 9.9 ~ 14.2% (or 28.1 ~ 59.7%) of the number of function evaluations required by the µGA (or the IP-GA) to obtain the global optimum for these benchmark functions. Figure 5.29 shows the convergence processes of functions F3 and F6 using the improved IP-GA against the IP-GA and µGA, from which the outstanding convergence performance of the improved IP-GA can be seen clearly.

TABLE 5.22 Convergence Performance of Improved IP-GA and Comparison with IP-GA and µGA^a

No.   Global Optimum            Function Value   nIIP-GA^b   nIP-GA^c   nµGA^d   RatioIP (%)^e   RatioµGA (%)^f
F1    (0.0669, 0.0669)          1.0              158         189        984      59.7            11.5
F2    (1.4251, –0.8003)^g       –186.73          185         348        1136     38.0            11.6
F3    (–1.0467, 0.0)            –0.352           175         437        983      28.6            12.7
F4    (1.0, 1.0, 1.0)           0.0              261         544        1561     34.3            11.9
F5    (1.0, 1.0, 1.0)           0.0              229         583        1648     28.1            9.9
F6    (4.0, 4.0, 4.0, 4.0)      –10.153          652         1195       3271     39.0            14.2

a Obtained from 40 independent runs.
b nIIP-GA = number of generations to convergence for the improved IP-GA.
c nIP-GA = number of generations to convergence for the IP-GA.
d nµGA = number of generations to convergence for the µGA.
e RatioIP (%) = (5 × nIIP-GA)/(7 × nIP-GA) × 100%.
f RatioµGA (%) = (5 × nIIP-GA)/(7 × nµGA) × 100%.
g One of global optima.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

FIGURE 5.29 Convergence processes of the improved IP-GA for benchmark functions F3 and F6: (a) function F3; (b) function F6. Each panel plots fitness value against number of generations for the µGA, IP-GA, and improved IP-GA. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)

5.5.3.2 Effect of the Mutation Operation
Traditionally, the mutation operation is not used in the µGA (Krishnakumar, 1989; Carroll, 1996a, b). However, it is recommended in the improved IP-GA to increase the population diversity. To test the effect of the mutation operation, the preceding benchmark functions are investigated using the improved IP-GA with and without mutation operation,
TABLE 5.23 Effect of Mutation Operator on Performance of Improved IP-GA for the Case with α = 0.2, β = 0.5, and idum = –10,000

                F1      F2      F3      F4      F5      F6
nIIP-GA(a)^a    97      126     104     173     170     412
nIIP-GA(b)^b    107     156     116     352     417     >1000
Ratio (%)^c     90.7    80.8    89.7    49.1    40.8    <41.2

a nIIP-GA(a) = number of generations to convergence for the improved IP-GA with the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA without the mutation operator.
c Ratio (%) = nIIP-GA(a) / nIIP-GA(b) × 100%.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

TABLE 5.24 Effect of Coefficient α on Performance of Improved IP-GA for Function F1 Where β = 0.5 and idum = –10,000

α               0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8
nIIP-GA(a)^a    147     86      50      119     38      81      63      56
nIIP-GA(b)^b    149     109     86      130     44      91      74      91
Ratio (%)^c     98.7    78.9    89.3    91.5    86.4    89.0    85.1    61.5

a nIIP-GA(a) = number of generations to convergence for the improved IP-GA with the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA without the mutation operator.
c Ratio (%) = nIIP-GA(a) / nIIP-GA(b) × 100%.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.
respectively. Table 5.23 shows that the improved IP-GA with the mutation operation finds the global optimum faster than that without the mutation operation for all the benchmark functions. Further investigations on the effect of the mutation operation associated with the variation of α, β, and idum are also performed, and the results are given in Table 5.24 to Table 5.26.
5.5.3.3 Effect of the Coefficients α and β
To study the effect of α and β on the improved IP-GA, benchmark function F1 is investigated again using different α and β, with the same genetic operators and other operation parameters (idum = –10,000). Two schemes of the improved IP-GA, with and without the mutation operation, are used. Table 5.24 and Table 5.25 show that nIIP-GA corresponding to the different α (α = 0.1 ~ 0.8) with β = 0.5 ranges from 38 to 147, while that corresponding to the different β (β = 0.2 ~ 0.9) with α = 0.2 ranges from 70 to 486. Figure 5.30 shows the convergence processes. Further investigation has shown that it is difficult to specify exactly the values of α and β that give the best convergence performance for all
TABLE 5.25 Effect of Coefficient β on Performance of Improved IP-GA

β               0.2      0.3      0.4     0.5     0.6     0.7      0.8      0.9
nIIP-GA(a)^a    486      217      70      86      80      164      372      243
nIIP-GA(b)^b    121      106      98      109     94      74       109      99
Ratio (%)^c     401.6    204.7    71.4    78.9    85.1    221.6    341.3    245.5

a nIIP-GA(a) = number of generations to convergence for the improved IP-GA with the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA without the mutation operator.
c Ratio (%) = nIIP-GA(a) / nIIP-GA(b) × 100%.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.

TABLE 5.26 Effect of the Random Number Seed idum on Performance of Improved IP-GA and µGA

idum (× 10^2)   1      50      100     150     200     300    350     400    450    500
nIIP-GA(a)^a    245    164     97      150     196     229    60      175    67     197
nIIP-GA(b)^b    279    230     107     77      186     238    185     128    395    199
nµGA^c          823    2741    1229    1512    1105    240    1043    415    526    208

a nIIP-GA(a) = number of generations to convergence for the improved IP-GA with the mutation operator.
b nIIP-GA(b) = number of generations to convergence for the improved IP-GA without the mutation operator.
c nµGA = number of generations to convergence for the µGA.
Source: Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.
the benchmark functions. However, it has been found that any combination of α and β results in the improved IP-GA converging significantly faster than the µGA using the same genetic operators and operation parameters. It is found from this study that α and β should be within 0.1 ~ 0.5 and 0.3 ~ 0.7, respectively. The recommended choice is α = 0.1 ~ 0.3 and β = 0.5, which generally gives good results in numerical experiments.
5.5.3.4 Effect of the Random Number Seed
A total of 10 different random number seeds idum have been used to investigate their influence on the convergence performance of the improved IP-GA. All are selected to be negative according to the suggestion made by Carroll (1996a, b) in a public version (1.7) of the GA program. To show the effect of idum, Table 5.26 presents the corresponding nIIP-GA and nµGA for function F1 when the improved IP-GA and µGA use the different idum. In these investigations, the genetic operators and the operation parameters remain the same (α = 0.2, β = 0.5). From Table 5.26, it can be found that the improved IP-GA is not as sensitive as the µGA to idum. This feature makes the improved IP-GA more robust to use in practice.
FIGURE 5.30 Effects of coefficients α and β on the convergence processes of the improved IP-GA for test function F1: (a) effect of α (α = 0.1 to 0.8) with β = 0.5 and idum = –10000; (b) effect of β (β = 0.2 to 0.9) with α = 0.2 and idum = –10000. Each panel plots fitness value against number of generations. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)
5.6
IP-GA with Three Parameters (IP3-GA)
In the IP-GA, as well as the improved IP-GA, two new individuals are generated in each generation using forward and internal interpolations based on the two best individuals in the neighboring generations. Compared to the µGA, the IP-GA has shown great success in reducing the time needed to search for the global optimum; it consistently outperforms the µGA in the preceding tests. The only drawback of this method is that the searching performance depends greatly on the interpolation parameters, and the improvement may not be significant for some discrete or singular functions. In order to overcome this shortcoming, a further improvement has been implemented in the IP-GA using three parameters (Yang et al., 2002a). For convenience of description, this improved IP-GA is termed IP3-GA for the use of three parameters.
5.6.1
Three-Parameter IP Operator
The main idea of this further modification is that the best individuals in the adjacent generations are selected as two basic individuals; three new individuals near the two original ones are generated through forward, internal, and backward interpolations. The distance between the new individuals and the original ones can be adjusted by changing the corresponding interpolation parameter. These newly produced individuals are evaluated and the best one is inherited by the next generation. Because only three additional new individuals are introduced in each generation, the extra evaluation work is small, but the speed of searching for the true solution is significantly improved. In order to find out whether better individuals near $p_j^b$ and $p_{j-1}^b$ exist, three new individuals, c1, c2, and c3, are generated through the forward, internal, and backward interpolations, respectively, which can be expressed in the following equations (Yang et al., 2001):
\[ c_1 = p_j^b + \alpha\,(p_j^b - p_{j-1}^b) \tag{5.40} \]
\[ c_2 = p_{j-1}^b + \beta\,(p_j^b - p_{j-1}^b) \tag{5.41} \]
\[ c_3 = p_j^b - \gamma\,(p_j^b - p_{j-1}^b) \tag{5.42} \]
where α, β, and γ are three non-negative decimal parameters, whose values can be changed to adjust the distances of these new individuals to the original individuals $p_j^b$ and $p_{j-1}^b$. To achieve stable convergence, the ranges of these three parameters are generally 0 ≤ β ≤ 1.0, 0 ≤ α ≤ 1.0, and 0 ≤ γ ≤ 1.0. The procedure of this IP3-GA is shown in Figure 5.31.
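The three interpolations can be written compactly as below. This is a minimal sketch assuming real-valued parameter lists, a fitness to be maximized, and the forms c1 = $p_j^b + \alpha d$, c2 = $p_{j-1}^b + \beta d$, c3 = $p_j^b - \gamma d$ with $d = p_j^b - p_{j-1}^b$; the function names and the example vectors are hypothetical.

```python
from typing import Callable, List, Sequence


def ip3_candidates(
    p_best: Sequence[float],       # p_j^b
    p_prev_best: Sequence[float],  # p_{j-1}^b
    alpha: float = 0.2,
    beta: float = 0.5,
    gamma: float = 0.2,
) -> List[List[float]]:
    """Return [c1, c2, c3] built from the two best individuals."""
    d = [b - q for b, q in zip(p_best, p_prev_best)]
    c1 = [b + alpha * di for b, di in zip(p_best, d)]       # forward
    c2 = [q + beta * di for q, di in zip(p_prev_best, d)]   # internal
    c3 = [b - gamma * di for b, di in zip(p_best, d)]       # backward
    return [c1, c2, c3]


def ip3_select(
    p_best: Sequence[float],
    p_prev_best: Sequence[float],
    fitness: Callable[[Sequence[float]], float],
    **params: float,
) -> List[float]:
    """Evaluate the three candidates and keep the fittest one."""
    return max(ip3_candidates(p_best, p_prev_best, **params), key=fitness)


candidates = ip3_candidates([1.0, 2.0], [0.0, 0.0])
print(candidates)
```

With the default values α = 0.2, β = 0.5, and γ = 0.2 (the values quoted in Section 5.6.2), each generation costs three extra function evaluations.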
5.6.2
Performance Comparison
Table 5.27 gives results for the test functions listed in Table 5.4. The performance comparison between the IP3-GA and IP-GA is presented, and the ratios of the numbers of generations required by the IP-GA and by the IP3-GA to that required by the µGA are also listed in this table. This table is not exactly consistent with the preceding studies for the µGA, the IP-GA, and the improved IP-GA. This is due to the difference in the parameters used in these GAs; the number of binary digits used for parameter discretization is also different. For all the test functions, the results show that the IP3-GA performs much better than the µGA in terms of accuracy of results or convergence speed, even though the parameters are not yet optimized (α = 0.2, β = 0.5, and γ = 0.2 are arbitrarily used). The more variables in each individual, the greater the improvement. The results also show that the IP3-GA performed a little better than the IP-GA for some cases (F1, F2, F4, F5).
FIGURE 5.31 Flowchart of the IP3-GA.
For other cases (such as F3, F6), the IP3-GA performed substantially better than the IP-GA.
5.7
GAs with Search Space Reduction (SR-GA)
Several modified GAs have been presented. All these improvements are aimed at speeding up the local search with the help of local operators or at getting out of stagnation using IP operators. As mentioned in the last paragraph of Section 5.2.4, the stagnation in the later stage of the GA search process is due to the fact that, once a very good individual is found, it is very difficult to find a better individual from the entire original searching space. The main reason is that the space containing the better individuals has shrunk significantly, while the original search space remains unchanged. Therefore, the best approach to solve this problem is to shrink
the searching space while the GA is advancing so as to increase the chance of getting better individuals.

TABLE 5.27 Performance of IP3-GA and Comparison with Other Methods

Test                                          µGA                     IP-GA                    IP3-GA                    RatioIP^i/
Function  (x1, …, xn)opt^a       fopt^b       nµGA^c   fµGA^d         nIP-GA^e   fIP-GA^f      nIP3-GA^g   fIP3-GA^h     RatioIP3^j (%)
F1        (0.0669, 0.0669)       1.0          984      1.0            189        1.0           221         1.0           19.2/22.5
F2        (–1.4251, –0.8003)     –186.73      1136     –186.73        348        –186.73       277         –186.73       30.6/24.4
F3        (–1.0467, 0.0)         –0.3524      983      –0.3524        437        –0.3524       137         –0.3524       44.5/13.9
F4        (1.0, 1.0, 1.0)        0.0          1561     –2.235E-8      544        –2.235E-8     331         –2.235E-8     34.8/21.2
F5        (1.0, 1.0, 1.0)        0.0          1648     –0.917E-7      583        1.254E-5      503         1.254E-5      35.4/30.5
F6        (4.0, 4.0, 4.0, 4.0)   –10.153      3271     –10.1532       1195       –10.1532      330         –10.1532      36.5/10.1

a (x1, …, xn)opt = optimal point of the test functions.
b fopt = function value at the optimal point.
c nµGA = number of generations to convergence for the µGA.
d fµGA = function value at the convergence point of the µGA.
e nIP-GA = number of generations to convergence for the IP-GA.
f fIP-GA = function value at the convergence point of the IP-GA.
g nIP3-GA = number of generations to convergence for the IP3-GA.
h fIP3-GA = function value at the convergence point of the IP3-GA.
i RatioIP = nIP-GA / nµGA × 100%.
j RatioIP3 = nIP3-GA / nµGA × 100%.

A technique has been proposed by Liu et al. (2002h) to narrow the search domain after a number of generations of GA runs. It is termed space-reduction GA (SR-GA) here and works as follows. After a number of generations, the maximum and minimum values, PMAXj and PMINj, of each parameter can be found from the M best individuals up to this stage, where j refers to the jth parameter to be identified. A new reduced search space is defined as follows:
\[ PMAX_j^{new} = PMAX_j + \alpha\,(PMAX_j^{old} - PMIN_j^{old}) \tag{5.43} \]
\[ PMIN_j^{new} = PMIN_j - \alpha\,(PMAX_j^{old} - PMIN_j^{old}) \tag{5.44} \]
where α is a predefined factor and PMAX jold , PMIN jold are the maximum and minimum values of each parameter in the previous search domain. This procedure is depicted in Figure 5.32. In the SR-GA, a sufficient number of generations is first carried out to ensure that the recorded M best individuals are covering the space that contains the global optimum of the objective function. The parameter M should be so chosen to avoid trapping at any local optimal point when the objective error function is not unimodal or not continuous. The parameter α is used to ensure the local best individual is not excluded from the new GA search process. The combination of M and α ensures that the GAs can find the best individual for complicated objective functions, even when the searching space is reduced.
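The bound update of Equations 5.43 and 5.44 can be sketched as follows. This is an illustration only: the function name, the data layout (a list of the M best parameter vectors and a list of old bounds), and the example numbers are hypothetical, and the final clamping of the new bounds to the previous domain is an added assumption that is not stated in the text.

```python
from typing import List, Sequence, Tuple


def reduce_search_space(
    best_individuals: Sequence[Sequence[float]],  # the M best individuals so far
    old_bounds: Sequence[Tuple[float, float]],    # (PMIN_old, PMAX_old) per parameter
    alpha: float = 0.05,
) -> List[Tuple[float, float]]:
    """Return (PMIN_new, PMAX_new) for every parameter j."""
    new_bounds = []
    for j, (pmin_old, pmax_old) in enumerate(old_bounds):
        values = [individual[j] for individual in best_individuals]
        margin = alpha * (pmax_old - pmin_old)
        pmax_new = max(values) + margin   # Equation 5.43
        pmin_new = min(values) - margin   # Equation 5.44
        # Assumption (not in the text): never extend beyond the previous domain.
        new_bounds.append((max(pmin_new, pmin_old), min(pmax_new, pmax_old)))
    return new_bounds


# Hypothetical example with M = 3 best individuals of a two-parameter problem.
best_three = [[2.9, 1.8], [3.1, 2.1], [3.2, 1.9]]
bounds = reduce_search_space(best_three, [(-6.0, 6.0), (-6.0, 6.0)], alpha=0.05)
print(bounds)
```

In this example the margin is 0.05 × 12 = 0.6, so the first parameter is searched in roughly [2.3, 3.8] and the second in roughly [1.2, 2.7] in the next stage. A larger M or α gives a more conservative reduction.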
FIGURE 5.32 Schematic drawing of the search space reduction in the SR-GA, plotting fitness value against individuals for M = 3. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002h. With permission.)
The SR-GA has been applied to engineering problems and has proved to be very efficient compared to the plain GA (see Section 13.1.2). The authors strongly believe that this idea of search space reduction is one of the most effective ways to solve the convergence stagnation problem in GAs. Therefore, much more effort should be made in this direction to further improve GAs by developing more efficient ways to reduce the search space while the individuals are approaching the global optimum.
5.8
GA Combined with the Gradient-Based Method
As discussed in Chapter 4, gradient-based optimization methods are likely to converge to a local optimum, depending on the given initial guess. The advantage of gradient-based optimization is that it converges very fast to the local optimum, especially when the initial guess is close to the optimum. However, the search for suitable initial points for a locally convergent optimization method often proves to be difficult. GAs, on the other hand, offer complementary strengths: they search for the global optimum, they are capable of escaping from local optima, and they need no initial guess. GAs are, however, computationally expensive; their convergence slows down significantly at the later stage of searching. This can often be observed from the convergence curve of a GA, which converges very fast at the beginning and very slowly at the later stage, as shown in Figure 5.1. It is therefore natural to combine a GA with a traditional optimization method so as to obtain good overall performance, which is often vital in nonlinear optimization problems. In this way, the global optimum can be found with high probability, and results can be obtained at a reasonably fast speed.
5.8.1
Combined Algorithm
As reviewed in Section 5.2.5, several kinds of combined algorithms have been proposed. One of them has been used by Liu et al. (2002a) for determining the material properties of composites. This combined optimization method combines the µGA with the modified Levenberg–Marquardt method, which is efficient for solving nonlinear least squares problems. The subroutine BCLSF of IMSL, in which the modified Levenberg–Marquardt method is employed and the Jacobian is obtained using the finite-difference method, is directly used in the combined method. This combined algorithm performs in three steps:
1. The µGA is used to determine the initial points. The main purpose is to select a set of better solutions close to the optima. The selection criterion is imposed to limit the function value below a required value.
2. Each of these solutions is used as the initial point in searching for the individual local optimum using the BCLSF (refer to the gradient-based method).
3. All solutions from the BCLSF searching are considered the local optima of the function. The global optimum is found from these solutions simply by comparing their corresponding objective function values.
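Steps 2 and 3 can be sketched as follows, using the modified Himmelblau residuals of Section 5.8.2 as the example problem. This is an illustration under clearly stated substitutions: IMSL's BCLSF is not called; a small damped Gauss-Newton (Levenberg–Marquardt style) iteration with a forward-difference Jacobian, written for two parameters only, stands in for it. The starting points are hypothetical stand-ins for the solutions that the µGA would select in step 1, and all function names are illustrative.

```python
from typing import List, Sequence, Tuple


def residuals(x: Sequence[float]) -> List[float]:
    """Residuals f1..f4 of the modified Himmelblau problem (Section 5.8.2)."""
    x1, x2 = x
    return [x1 * x1 + x2 - 11.0, x1 + x2 * x2 - 7.0,
            0.316 * (x1 - 3.0), 0.316 * (x2 - 2.0)]


def cost(x: Sequence[float]) -> float:
    return sum(r * r for r in residuals(x))


def local_search(x0, lo=-6.0, hi=6.0, iters=200, lam=1e-2, h=1e-6):
    """Stand-in for BCLSF: damped Gauss-Newton search for two bounded parameters."""
    x, fx = list(x0), cost(x0)
    for _ in range(iters):
        r = residuals(x)
        # Forward-difference Jacobian columns: cols[k][i] = d r_i / d x_k.
        cols = []
        for k in range(2):
            xh = list(x)
            xh[k] += h
            cols.append([(a - b) / h for a, b in zip(residuals(xh), r)])
        a11 = sum(v * v for v in cols[0]) + lam
        a22 = sum(v * v for v in cols[1]) + lam
        a12 = sum(u * v for u, v in zip(cols[0], cols[1]))
        g1 = sum(u * v for u, v in zip(cols[0], r))
        g2 = sum(u * v for u, v in zip(cols[1], r))
        det = a11 * a22 - a12 * a12
        # Solve (J^T J + lam I) dx = -J^T r by Cramer's rule.
        dx1 = (-g1 * a22 + g2 * a12) / det
        dx2 = (-g2 * a11 + g1 * a12) / det
        trial = [min(max(x[0] + dx1, lo), hi), min(max(x[1] + dx2, lo), hi)]
        f_trial = cost(trial)
        if f_trial < fx:      # accept the step and relax the damping
            x, fx, lam = trial, f_trial, max(lam / 10.0, 1e-12)
        else:                 # reject the step and increase the damping
            lam *= 10.0
        if max(abs(dx1), abs(dx2)) < 1e-10:
            break
    return x, fx


def combined_search(starts) -> Tuple[Tuple[List[float], float], list]:
    """Refine every selected start (step 2), then keep the best local optimum (step 3)."""
    results = [local_search(s) for s in starts]
    return min(results, key=lambda item: item[1]), results


# Hypothetical starting points standing in for the solutions selected by the µGA.
starts = [(3.2, 1.9), (-3.0, 3.0), (-3.5, -3.0), (3.5, -2.0)]
(best_x, best_f), all_results = combined_search(starts)
print(best_x, best_f)
```

The design point is that the GA only has to deliver one start per promising zone; the expensive final convergence is left to the local method, and the comparison in step 3 separates the global optimum from the local ones.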
5.8.2
Numerical Example
In order to demonstrate this combined method clearly, consider the Himmelblau function as the benchmark test problem. The Himmelblau function can be written in the form of a nonlinear least squares problem:

\[ \text{Minimize } F(x_1, x_2) = \sum_{i=1}^{2} \left( f_i(x_1, x_2) \right)^2 \quad \text{subject to } -6 \le x_1, x_2 \le 6 \tag{5.45} \]

where

\[ f_1(x_1, x_2) = x_1^2 + x_2 - 11; \qquad f_2(x_1, x_2) = x_1 + x_2^2 - 7 \tag{5.46} \]

Note that the preceding function has four minimum points that can be obtained by solving the following equations:

\[ x_1^2 + x_2 - 11 = 0; \qquad x_1 + x_2^2 - 7.0 = 0 \tag{5.47} \]

The solutions to these equations are (3.0, 2.0)^T, (–2.805, 3.131)^T, (–3.779, –3.283)^T, and (3.584, –1.848)^T. For this study, a function with only one global minimum point is needed. Two terms are therefore added to the Himmelblau function to form the following nonlinear least squares problem:

\[ \text{Minimize } g(x_1, x_2) = \sum_{i=1}^{4} \left( f_i(x_1, x_2) \right)^2 \quad \text{subject to } -6 \le x_1, x_2 \le 6 \tag{5.48} \]

where

\[ f_3(x_1, x_2) = 0.316\,(x_1 - 3); \qquad f_4(x_1, x_2) = 0.316\,(x_2 - 2) \tag{5.49} \]
The additional terms do not alter the location of the optimum or the function value at the global optimal point (3.0, 2.0)^T. They alter the locations and function values of the other three minimum points, thereby making them local minimum points. Therefore, the global minimum of Equation 5.48 is still at (3.0, 2.0)^T and has a function value of zero. The other three local minima have higher function values of 3.498, 7.386, and 1.515, respectively. This problem has been studied by Deb (1998). He found that, on average, one out of four simulations of the steepest descent algorithm solves the preceding problem to the global optimum, and a successful run takes 215 function evaluations on average to converge. This finding is typical for many traditional gradient-based optimization algorithms. If they do not begin with a sufficiently good point, the algorithms may converge to a wrong solution at a local minimum. In contrast, the GA could find the global minimum of the function most of the time; the average number of function evaluations required to achieve the global minimum is 520. As a comparison, a uniform µGA with binary parameter coding, tournament selection, uniform crossover, and elitism is adopted to solve the problem. The population size of each generation is set to 5 and the probability of uniform crossover is set to 0.6. The population convergence criterion is 5%, i.e., convergence occurs when less than 5% of the total bits of the other individuals in a generation are different from those of the best individual. A new population, in which the best individual of the last generation is replicated, is then randomly generated and the evolution process restarts. Knuth's subtractive method is used to generate random numbers. The search space defined for this numerical test is listed in Table 5.28. The two parameters are described and translated into chromosomes. In the whole search space, a total of 2^14 (16,384) possible combinations of these two parameters exists.
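The binary coding and the 5% population convergence criterion described above can be illustrated with a short sketch. The mapping of a 7-bit string to equally spaced values over the search range is an assumption made here for illustration (the text does not give the decoding formula), and the function names are hypothetical.

```python
from typing import Sequence


def decode(bits: str, lo: float = -6.0, hi: float = 6.0) -> float:
    """Map a binary string to one of 2**len(bits) equally spaced values in [lo, hi]."""
    levels = 2 ** len(bits)
    return lo + (hi - lo) * int(bits, 2) / (levels - 1)


def has_converged(best: str, others: Sequence[str], threshold: float = 0.05) -> bool:
    """True when fewer than `threshold` of the bits of the other individuals differ from the best."""
    differing = sum(a != b for other in others for a, b in zip(best, other))
    total = sum(len(other) for other in others)
    return differing / total < threshold


n_bits = 7                         # binary digits per parameter
combinations = 2 ** (2 * n_bits)   # two parameters, 128 possibilities each
print(combinations)                # 16384
print(decode("0000000"), decode("1111111"))  # -6.0 6.0

best = "0" * 14                    # chromosome holding both parameters
nearly_equal = [best, best, best, "11" + "0" * 12]
print(has_converged(best, nearly_equal))     # True: 2 of 56 bits differ
```

With a population of 5 and 14-bit chromosomes, the other four individuals carry 56 bits in total, so under this reading the restart is triggered once no more than two of those bits differ from the best individual.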
It has been found that the uniform µGA can find the global minimum of the function; the average number of function evaluations required to achieve the global minimum is 480, which is fewer than with the plain GA (520). The value of the error function against the number of generations for a GA run is plotted in Figure 5.33. From this figure, it can be seen that the GA reaches the “better” points fast, but its convergence to the “best” is very slow.

TABLE 5.28 Uniform µGA Search Space for Numerical Test Defined by Equation 5.48
Parameter    Search Range    Possibilities #    Binary Digit
x1           –6.0, 6.0       128                7
x2           –6.0, 6.0       128                7
Note: Total population in the entire search space is 2^14 (16,384). Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.

It has been found from this example that it can reach
FIGURE 5.33 Convergence of a µGA for the numerical test problem defined by Equation 5.48 (vertical axis: function value; horizontal axis: number of generations). (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.)
the point (3.176, 1.906)^T at 51 generations, but it does not converge to the global minimum point (3.0, 2.0)^T until 96 generations. Now, the combined optimization method is used to solve the same problem. At the first step, the uniform µGA is used to isolate the best zones in the parameter space. In other words, the uniform µGA is employed as a tool to determine the initial estimates of the parameters. Four sets of parameters are selected based on the results generated from the first five generations. The selection criterion is that the function value must be below 200. The values of these selected sets and their corresponding function values are listed in Table 5.29. When studying the features of all the parents from the five generations of the uniform µGA, only four sign patterns of the parameters, (+, +), (–, +), (+, –), and (–, –), are found among these parents. These are selected as the better zones in the parameter space. At the second step, these four sets of parameters are taken as four initial points. The BCLSF is applied four times, each time starting from a different initial point. The results from the BCLSF are shown in Table 5.30. All these solutions can be considered local minima of the function; the global minimum can be found by comparing their corresponding function values. The global solution is easily found to be (3.0, 2.0)^T, with a function value equal to zero, as shown in Table 5.30. The number of function evaluations required for the BCLSF to converge is very small (about six function evaluations for each run of the BCLSF) because the initial points are very close to these minima. For the present method, 25 function evaluations in the GA runs and 26 function evaluations in the BCLSF are performed. Therefore, 51 function evaluations in total are required in the combined method, significantly fewer than the 215
TABLE 5.29 Selected Sets of Better Solutions Close to Optima from Uniform µGA for the Numerical Test Defined by Equation 5.48^a
Set Number    Point (x1, x2)        Function Value
1             (3.929, 0.635)        33.45
2             (–4.447, –4.635)      128.73
3             (–4.165, 4.259)       167.90
4             (2.329, –2.706)       78.36
^a Results obtained at fifth generation.
Note: Total number of function evaluations in the GA search stage is 5 × 5 = 25. Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.
TABLE 5.30 Results from Gradient-Based Method (BCLSF) for Numerical Test Defined by Equation 5.48
Set Number    Initial Point         Corresponding Solution    Function Evaluations    Function Value at Solution Point
1             (3.929, 0.635)        (3.0, 2.0)                8                       0.000
2             (–4.447, –4.635)      (–3.763, –3.266)          6                       7.367
3             (–4.165, 4.259)       (–2.787, 3.128)           6                       3.487
4             (2.329, –2.706)       (3.581, –1.821)           6                       1.504
Note: Total number of function evaluations in the gradient-based method search stage is 26. Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.
function evaluations in the steepest descent method (even for the successful runs), 480 function evaluations of the uniform µGA, and 520 function evaluations of the plain GA. This numerical test demonstrates the high efficiency of the combined method.
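The second stage of the combined method can be illustrated with a short sketch. Equation 5.48 is not reproduced in this passage, so the function below is an assumption: it is the modified Himmelblau function of Deb, which is consistent with the minima discussed here. The BCLSF routine is also not available, so a plain gradient descent with backtracking (the hypothetical `refine`) stands in for it; its evaluation counts will therefore not match those in Table 5.30. The starting points are the four sets of Table 5.29.

```python
def f(x1, x2):
    """Assumed form of Equation 5.48 (modified Himmelblau function)."""
    return ((x1**2 + x2 - 11.0)**2 + (x1 + x2**2 - 7.0)**2
            + 0.1 * ((x1 - 3.0)**2 + (x2 - 2.0)**2))


def grad(x1, x2):
    """Analytical gradient of f."""
    a = x1**2 + x2 - 11.0
    b = x1 + x2**2 - 7.0
    return (4.0 * x1 * a + 2.0 * b + 0.2 * (x1 - 3.0),
            2.0 * a + 4.0 * x2 * b + 0.2 * (x2 - 2.0))


def refine(start, max_iter=500, tol=1e-8):
    """Local search from one GA candidate (stand-in for BCLSF, an assumption)."""
    x1, x2 = start
    fx = f(x1, x2)
    for _ in range(max_iter):
        g1, g2 = grad(x1, x2)
        gg = g1 * g1 + g2 * g2
        if gg < tol**2:
            break
        step = 1.0
        while step > 1e-12:
            n1, n2 = x1 - step * g1, x2 - step * g2
            fn = f(n1, n2)
            if fn <= fx - 1e-4 * step * gg:   # sufficient-decrease test
                x1, x2, fx = n1, n2, fn
                break
            step *= 0.5                        # backtrack
        else:
            break                              # no acceptable step found
    return (x1, x2), fx


# Stage 1 output: the four candidate points listed in Table 5.29.
starts = [(3.929, 0.635), (-4.447, -4.635), (-4.165, 4.259), (2.329, -2.706)]
# Stage 2: refine each candidate locally, then keep the lowest function value.
results = [refine(s) for s in starts]
best_point, best_value = min(results, key=lambda r: r[1])
```

The final comparison over `results` mirrors the last step described in the text: every local solution is retained, and the global minimum is identified only by comparing function values.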
5.9 Other Minor Tricks in Implementation of GAs
The implementation of a GA in an inverse procedure is schematically outlined in Figure 5.34. In the applications presented in Chapter 7 through Chapter 13, the µGAs and IP-GAs play a very important role in solving a wide range of inverse problems. Special implementation techniques will be addressed separately for each of the practical applications. The
FIGURE 5.34 Flowchart of the computational inverse technique using the GA for the inverse analysis. Flowchart elements: inputs for GA (search range, trial parameters); GA; forward solver; computed results; measurement data or simulated measurements generated by adding random noise to the computer-generated results; error function (fitness function); stopping criterion; output results (identified parameters). (From Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.)
following addresses some common minor tricks useful in achieving better efficiency. The first minor improvement of the µGA is to record the best individual of the current generation, together with its fitness, and use it directly in the next generation. For a population size of 5, in each generation (except the initial generation, in which all five individuals must be evaluated), only four individuals need to be newly evaluated with the forward solver. This reduces the forward computation by one fifth. This technique has been implemented in Chapter 8 and Chapter 9 for the material property identification of composites. The improvement is clearly worthwhile in cases in which the forward computation is expensive. Another improvement for the convergence rate of the GA is the two-stage search method, which consists of a global search at the first stage and a local search at the second stage. The local search is performed by reducing the search space after the global search has located the likely region of the optima. This method has been employed in Chapter 12 and Chapter 13. For the application of the combined optimization method, Chapter 12 provides the detailed implementation of the combined technique, as well as the technical issue of switching from the GA to the gradient-based optimization algorithm for inversely detecting cracks in composite structures.
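The saving from carrying the best individual forward can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `run_micro_ga`, the 14-bit chromosome, and the simplified reproduction step (uniform crossover between the elite and a random mate, with no tournament selection or restart) are all assumptions. The point shown is only the bookkeeping: the elite keeps its stored error and is never passed to the forward solver again.

```python
import random


def run_micro_ga(forward_error, n_bits=14, pop_size=5, generations=10, seed=0):
    """Toy micro-GA loop that reuses the stored error of the elite individual."""
    rng = random.Random(seed)

    def new_individual():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    population = [new_individual() for _ in range(pop_size)]
    errors = [forward_error(ind) for ind in population]   # all five evaluated once
    n_evals = pop_size

    for _ in range(generations - 1):
        best = min(range(pop_size), key=errors.__getitem__)
        elite, elite_error = population[best], errors[best]
        # Simplified reproduction (assumption): uniform crossover with a random mate.
        children = []
        for _ in range(pop_size - 1):
            mate = new_individual()
            children.append([e if rng.random() < 0.5 else m
                             for e, m in zip(elite, mate)])
        child_errors = [forward_error(c) for c in children]  # only four new solves
        n_evals += len(children)
        population = [elite] + children
        errors = [elite_error] + child_errors                # elite error reused

    best = min(range(pop_size), key=errors.__getitem__)
    return population[best], errors[best], n_evals
```

With a population of 5, each generation after the first costs four forward solves instead of five, which is the one-fifth reduction stated above.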
5.10 Remarks
• Remark 5.1: Genetic algorithms are stochastic global search methods and differ in their fundamental concept from traditional gradient-based search techniques. For those complicated optimization problems in which the derivatives of the objective functions are difficult or impossible to obtain, the GA can work well. This characteristic makes GAs more canonical than many other search schemes.
• Remark 5.2: The micro GA (µGA) is a variation of traditional GAs, able to avoid premature convergence and perform better in reaching the optimal region than traditional GAs for many problems. By introducing the microtechnique, the µGA guarantees its robustness in a different way: whenever the micropopulation is reborn, new chromosomes are randomly generated to ensure that new genetic information keeps flowing throughout the entire search process.
• Remark 5.3: The IP-GA uses one additional operator called intergeneration projection (IP). In the IP-GA, the child generation is produced using genes of the parent and grandparent generations to achieve much better convergence. The concept of the intergeneration projection is applicable to all other versions of GAs.
• Remark 5.4: A method combining the GA with the gradient-based optimization algorithm has been suggested as an effective optimization method. In this method, the genetic algorithm is first used to select a set of better solutions that are close to the optima; then the gradient-based optimization algorithm is applied using these better solutions as the initial guesses. Finally, the optima can be determined from the solutions of the gradient-based optimization algorithm by comparing their corresponding fitness values. This method takes advantage of the global operation of the GA and the fast convergence of the gradient-based optimization algorithm.
5.11 Some References for Genetic Algorithms
Ackley, D., A Connectionist Machine for Genetic Hillclimbing, Kluwer Academic Publishers, Boston, 1987.
Bethke, A.D., Genetic algorithms as function optimizers, University of Michigan, Diss. Abst. Int., 41(9), 3503B, 1981.
Coley, D.A., An Introduction to Genetic Algorithms for Scientists and Engineering, World Scientific, Singapore, 1999.
Davis, L., Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.
Gen, M. and Chen, R.W., Genetic Algorithms and Engineering Design, John Wiley & Sons, New York, 1997.
Goldberg, D.E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989.
Haupt, L.R. and Haupt, S.E., Practical Genetic Algorithms, John Wiley & Sons, Inc., New York, 1998.
Holland, J.H., Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.
Krishnakumar, K., Micro-Genetic Algorithms for Stationary and Non-Stationary Function Optimization, SPIE: Intelligent Control and Adaptive Systems, Philadelphia, PA, 1196, 1989, 289.
Lawrence, D., Genetic Algorithms and Simulated Annealing, Morgan Kaufmann Publishers, London, 1987.
Levine, D., Users Guide to the PGA Pack Parallel Genetic Algorithm Library, U.S. Argonne National Laboratory, Argonne, 1996.
Man, K.F., Tang, K.S. and Kwong, S., Genetic Algorithms: Concepts and Designs, Springer-Verlag, London, 1999.
6 Neural Networks
The neural network (NN) is a useful tool for information processing and many other applications. Many textbooks detailing the basic principles, fundamentals, and applications of NNs are available. Owing to their unique features, NNs can be used to solve complex problems that cannot be handled by analytical approaches, even problems whose underlying physical and mathematical models are not well known. NN techniques have also been used to model the nonlinear and complex relationship between structural parameters and dynamic characteristics, and hence are very useful for solving inverse problems related to NDE of material and structural systems. This chapter introduces the basic terminology, concepts, and procedures used in creating an NN. A brief review of the role of NNs in solving inverse problems is first provided. The properties of NNs that make them suitable for solving inverse problems are also discussed. As a typical NN, the multilayer perceptron (MLP), along with the back-propagation learning algorithm, is detailed. Some practical computational issues concerning NNs, especially the progressive NN model, are also discussed.
6.1 General Concepts of Neural Networks
The concept of artificial neural networks is different from the traditional concept of computation that performs programmed tasks because NNs can be trained using samples (teaching signals) to perform certain tasks. In an artificial neural network, the artificially simulated neuron is an information-processing unit fundamental to the operation of the network. Figure 6.1 illustrates the schematic representation of a neuron. Usually, by a simple summation, the neuron combines the weighted values of its inputs; it also includes an applied bias. The bias has the effect of increasing or decreasing the net input of the activation function, depending on whether it is positive or negative, respectively. The combined value is then modified by an activation function. This function may be a simple linear threshold function or
FIGURE 6.1 Schematic diagram of a typical simulated neuron for the jth output: the inputs x_1, x_2, ..., x_m, with weights w_1j, w_2j, ..., w_mj and bias b_j, enter the summing junction Σ. The result of the nodal summing junction is passed through the activation function f(·) to obtain an output y_j.
may be a continuous function such as piecewise linear functions, sigmoidal functions, etc. Figure 6.2 gives a number of commonly used activation functions. The output value of the activation function is generally passed directly to the output of the neuron. Neurons are usually organized into groups called layers. A typical network consists of a number of layers with full or patterned connections between the layers. Neural networks are specified by their network topology, node characteristics, and training algorithms (Lippmann, 1987). The learning algorithms define an initial set of weights and determine how the weights should
FIGURE 6.2 Various types of activation functions used in the artificial neural network (panels: tan-sigmoid function, linear function, log-sigmoid function, hard-limit function).
be adjusted to further improve the neural network performance in terms of rate of convergence and accuracy in results.
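The operation of a single simulated neuron, as described above, can be sketched in a few lines. The function and dictionary names are illustrative assumptions, and the four activation functions are common textbook forms corresponding to the panel titles of Figure 6.2; the convention that the hard-limit function returns 1 at zero net input is also an assumption.

```python
import math

# Common forms of the four activation functions named in Figure 6.2.
ACTIVATIONS = {
    "tan-sigmoid": math.tanh,
    "log-sigmoid": lambda v: 1.0 / (1.0 + math.exp(-v)),
    "linear": lambda v: v,
    "hard-limit": lambda v: 1.0 if v >= 0.0 else 0.0,
}


def neuron_output(x, w, b, activation="tan-sigmoid"):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return ACTIVATIONS[activation](net)
```

A positive bias shifts the net input upward and a negative bias shifts it downward before the activation function is applied, as noted in the text.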
6.2 Role of Neural Networks in Solving Inverse Problems
Neural networks can be used to solve many problems that cannot be handled by analytical approaches, providing an effective approach for a broad spectrum of applications (Bishop, 1994; Sumpter and Noid, 1996; Ootao et al., 1999a, b; Wu et al., 1992; Levin and Lieven, 1998). As mentioned by Bishop (1994), the NN should be used to solve problems that have some, or all, of the following characteristics: …(i) there is ample data for network training; (ii) it is difficult to provide a simple first-principles or model-based solution which is adequate; (iii) new data must be processed at high speed, either because a large volume of data must be analyzed, or because of some real-time constraint; and (iv) the data processing method needs to be robust to modest levels of noise on the input data.
Many of the problems that arise in data analysis to which neural networks may be applied are inverse problems. A number of publications using NNs in NDE have been reported in the last decades. To date, the use of NNs in this research area has been demonstrated for source location, defect classification, and material characterization. Examples include the reconstruction of constitutive properties using the depth-load response (Huber and Tsakmakis, 1999), using group velocities, phase velocities, or slowness measurements (Sribar, 1994), and using displacement response (Liu et al., 2001b, 2002b); the estimation of contact forces from impact-induced strains (Chandrashekhara et al., 1998); and the prediction of impact wave force (Mase and Kitano, 1999). Currently, interest in employing NN techniques to detect structural damage is increasing. Wu et al. (1992) adopted an NN model to portray the structural behavior before and after damage in terms of the frequency response function, and then used this trained model to detect the location and extent of damage by feeding in measured dynamic responses. Klenke and Paez (1994) used two probabilistic techniques to detect damage in aerospace housing components, one of which involved a probabilistic neural network model. Rhim and Lee (1995) used an MLP model to identify damage in a composite cantilevered beam in which the damage was modeled as delamination in the FEM model of the beam. An MLP model was used by Masri et al. (1996) to detect changes in the dynamic characteristics of a structure-unknown system. Luo and Hanagud (1997) employed an NN model with the dynamic learning rate steepest descent (DSD) method to
carry out a real-time flaw detection of composite materials. Using a counterpropagation NN model, Zhao et al. (1998) identified damage in beams and frames. Liu et al. (1999) used the NN model to detect impact damage in carbon fiber reinforced polymer composite laminates. More applications of the NN model in the area of damage detection have been reviewed by Doebling et al. (1996).
A well-defined forward problem usually has a (stable) solution; the inverse problem is often ill-posed with the Type III ill-posedness that can lead to unstable solutions when noisy data are used. Ill-posed inverse problems pose two difficulties. First, it is often difficult to perform real-time inverse analysis because of the computation time demanded when a conventional computational approach is used to find a solution. Second, proper techniques are required to ensure the stability of the inverse solution. The NN approach offers the advantage of a highly efficient inversion that avoids invoking an expensive forward solver thousands of times online. In addition, the NN model is not sensitive to the ill-posedness of the inverse problem because the underlying partial differential equation (PDE) is not used in the inverse process. Instead, the inverse process uses the NN model, which is a “projected” model of the analytical model governed by the PDEs in the space created by the training samples; these samples are usually very small in number and definitely discrete in nature. Therefore, the regularization by projection (see Chapter 3) works automatically. Moreover, the ill-posedness will be further damped by properly setting the error criteria in the NN training based on the discrepancy principle (see Chapter 3), corresponding to the noise level in the training samples. There are, however, problems of so-called “overfitting” when noisy training samples are used.
In dealing with the overfitting problem, it is found that a slight modification on the feedback learning algorithm by adding a regularization term in the performance function should effectively solve the problem (see Section 6.4.4).
6.3 Multilayer Perceptrons
6.3.1 Topology
Multilayer feed-forward networks typically consist of a set of sensory neurons that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes, as shown in Figure 6.3. The input signals propagate through the network layer by layer; such networks are commonly referred to as multilayer perceptrons (MLPs). Nodes receive their inputs exclusively from the outputs of nodes in the previous layer, and the outputs from these nodes are passed exclusively to nodes in the following layer. Mathematically, the MLP represents a nonlinear mapping between the system inputs $X = \{x_i,\ i = 1, \ldots, N\}$ and outputs $Y = \{y_i,\ i = 1, \ldots, M\}$ via the equation:
FIGURE 6.3 Typical two-hidden-layer MLP with N inputs and M outputs (input layer x_1, x_2, ..., x_N; first hidden layer with weights w_ij^1; second hidden layer with weights w_ij^2; output layer y_1, y_2, ..., y_M with weights w_ij^3). (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
\[ Y = g(W, X) \tag{6.1} \]
where $W = \{ w_{ij}^{k},\ i = 1, \ldots, N_i;\ j = 1, \ldots, N_j;\ k = 1, \ldots, N_l \}$ is a matrix of weights (and biases) corresponding to the connections between the layers. In the following, W refers to the weight matrix that also contains the bias matrix, k denotes the kth layer of the MLP, and $N_l$ is the number of layers. $N_i$ and $N_j$ are the numbers of neurons in the ith and jth layers, respectively. Training of the NN model is achieved by the calculation and update of the weight matrix W using the training data set. Once the training is completed, the NN calculation is relatively fast regardless of the complexity of the actual physics of the problem.
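The mapping of Equation 6.1 amounts to repeated application of the single-neuron operation, layer by layer. A minimal sketch is given below; the data layout (each layer as a tuple of a weight matrix, a bias vector, and an activation function) and the name `mlp_forward` are assumptions chosen for illustration.

```python
import math


def mlp_forward(x, layers):
    """Evaluate Y = g(W, X) for a feed-forward MLP.

    layers: list of (W, b, f) tuples, one per computation layer, where
    W[j][i] is the weight from node i of the previous layer to node j,
    b[j] is the bias of node j, and f is the layer's activation function.
    """
    o = list(x)
    for W, b, f in layers:
        # Each node sees only the outputs of the previous layer.
        o = [f(sum(w_ji * o_i for w_ji, o_i in zip(row, o)) + b_j)
             for row, b_j in zip(W, b)]
    return o


# Example: 2 inputs, one tanh hidden layer of 2 nodes, 1 linear output node.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], math.tanh),
    ([[2.0, 1.0]], [0.5], lambda v: v),
]
y = mlp_forward([1.0, 1.0], layers)
```

Once W is fixed by training, evaluating the network involves only these few sums and function calls, which is why the trained model is fast regardless of the cost of the forward solver that generated the training samples.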
6.3.2 Back-Propagation Training Algorithm
Standard back-propagation (BP) is a gradient descent algorithm in which the network weights are moved along the negative of the gradient of the performance function. Back-propagation refers to the manner in which the gradient is computed for nonlinear multilayer networks. A number of variations on the basic algorithm are based on other standard optimization techniques, such as the conjugate gradient and the Newton methods. Properly trained BP networks tend to give reasonable answers when presented with inputs that they have never seen. Typically, a new input leads to an output similar to the correct output for input vectors used in training that are similar to the new input being presented. This generalization property makes it possible to train a network on a representative set of input/target pairs and get good results without training the network on all possible input/output pairs. The MLP is trained with a back-propagation training algorithm. In its original form, it is an iterative gradient algorithm designed to minimize the
squares of differences between the actual and target outputs of an MLP. In the following, only an abbreviated description of the basic BP is given. The error norm E between the predicted output vector $Y^p$ and the targeted output vector $Y^t$ is defined as
\[ E(W) = \left\| Y^p - Y^t \right\|_2 \tag{6.2} \]
The operator $\|\cdot\|_2$ represents the L2 norm as described in Chapter 2. The weight matrix is adjusted iteratively based on the following equations. The adjustment for W can be written as (Luo and Hanagud, 1997):
\[ W^{r+1} = W^{r} + \Delta W^{r} \tag{6.3} \]
\[ \Delta W^{r} = -\eta \left. \frac{\partial E(W)}{\partial W} \right|_{W = W^{r}} + \alpha\eta \left. \frac{\partial E(W)}{\partial W} \right|_{W = W^{r-1}} \tag{6.4} \]
where η, α, and r are defined as the learning rate, the momentum rate, and the iteration number, respectively. The derivatives in Equation 6.4 are matrices whose elements can be found as
\[ \frac{\partial E(W)}{\partial w_{ji}^{k}} = \frac{\partial E(W)}{\partial net_{j}^{k}} \frac{\partial net_{j}^{k}}{\partial w_{ji}^{k}} = -\delta_{j}^{k} \cdot o_{i}^{k-1} \tag{6.5} \]
$net_{j}^{k}$ and $o_{j}^{k}$ represent the input and the output of the jth neuron in the kth layer, respectively:
\[ net_{j}^{k} = \sum_{i} w_{ji}^{k}\, o_{i}^{k-1}, \qquad o_{j}^{k} = f\left( net_{j}^{k} \right) \tag{6.6} \]
$\delta_{j}^{k}$ can be expressed as
\[ \delta_{j}^{k} = \left( y_{j}^{t} - y_{j} \right) f'\left( net_{j}^{k} \right) \quad \text{when } k \text{ is the output layer} \tag{6.7} \]
or
\[ \delta_{j}^{k} = \sum_{i} \delta_{i}^{k+1}\, w_{ij}^{k+1}\, f'\left( net_{j}^{k} \right) \quad \text{when } k \text{ is a hidden layer} \tag{6.8} \]
$f'(net_{j}^{k})$ is the first derivative of the activation function f(·) with respect to $net_{j}^{k}$.
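The delta recursion of Equations 6.5 through 6.8 can be sketched for a small network with one tanh hidden layer and a linear output layer. Two assumptions are made so that the signs in Equation 6.5 hold exactly: the error is taken as half the sum of squared output errors (rather than the norm itself), and only weight gradients are returned, with bias gradients omitted for brevity. All function names are illustrative.

```python
import math


def forward(W1, b1, W2, b2, x):
    """Forward pass: tanh hidden layer, linear output layer (Eq. 6.6 form)."""
    net1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    o1 = [math.tanh(n) for n in net1]
    y = [sum(w * oi for w, oi in zip(row, o1)) + b for row, b in zip(W2, b2)]
    return net1, o1, y


def half_sse(y, t):
    """Assumed error measure: 0.5 * sum of squared output errors."""
    return 0.5 * sum((tj - yj) ** 2 for yj, tj in zip(y, t))


def backprop(W1, b1, W2, b2, x, t):
    """Weight gradients dE/dW via the delta recursion."""
    _, o1, y = forward(W1, b1, W2, b2, x)
    # Eq. 6.7 form: output-layer deltas (f' = 1 for a linear output node).
    d2 = [(tj - yj) * 1.0 for yj, tj in zip(y, t)]
    # Eq. 6.8 form: hidden deltas, with f'(net) = 1 - tanh(net)^2.
    d1 = [sum(d2[i] * W2[i][j] for i in range(len(d2))) * (1.0 - o1[j] ** 2)
          for j in range(len(o1))]
    # Eq. 6.5 form: dE/dw_ji = -delta_j * (output i of the previous layer).
    gW2 = [[-d2[j] * o1[i] for i in range(len(o1))] for j in range(len(d2))]
    gW1 = [[-d1[j] * x[i] for i in range(len(x))] for j in range(len(d1))]
    return gW1, gW2
```

A useful check on any such implementation is to compare the returned gradients with central finite differences of the error with respect to individual weights.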
6.3.3 Modified BP Training Algorithm
The majority of training algorithms for NNs are based on the BP learning algorithm. The convergence speed of this algorithm is usually slow. Vogl et al. (1988) proposed a modified method to accelerate the convergence of the BP algorithm. One of the improvements in their algorithm is that the learning rate η is varied according to whether or not an epoch decreases the system error norm E(W). If an update results in a reduced E(W), η is multiplied by a factor larger than 1 for the next epoch. If a step produces a network with the error norm E(W) more than a few percent above the previous value, all changes to the weights are rejected, η is multiplied by a factor less than 1, and α is set equal to zero. This procedure is repeated. This concept was later adopted by Luo and Hanagud (1997) in the dynamic learning rate steepest descent method they proposed. Here, a modified BP learning algorithm with a dynamically adjusted learning rate and a jump factor is adopted to improve the performance of the BP learning algorithm. The dynamic adjustment of the learning rate is used first; however, the learning rate is adjusted once every $n_e$ epochs instead of at every epoch. Assuming that the learning rate for the nth epoch is represented by η(n), this learning rate will be adjusted at the $(n + n_e)$th epoch based on the following criterion
\[ \eta(n + n_e) = con \times \eta(n) \tag{6.9} \]
where the range of con is based on some numerical studies (Xu, 1996). It can be seen from Equation 6.4 that the change ∆W depends not only on the learning rate η, but also on the partial derivative ∂E/∂W. Such a change is directly related to the training of an NN model. As indicated by Riedmiller and Braun (1993), it is possible that the effect of the carefully adapted η can be drastically disturbed by the unforeseeable behavior of the derivative. In fact, this problem mainly comes from the possible saturation of the sigmoid function, i.e., $f'(net_{j}^{k}) \to 0$, which leads to $\delta_{j}^{k} \to 0$ and causes the weight matrix to stagnate. To solve this problem, Riedmiller and Braun (1993) proposed a resilient propagation scheme that directly adapts the weight step based on local gradient information. Recently, Chang et al. (2000) proposed a jump factor γ to be used as
\[ \delta_{j}^{k} = \left( y_{j}^{t} - y_{j} \right) \left[ f'\left( net_{j}^{k} \right) + \gamma \right] \quad \text{when } k \text{ is the output layer} \tag{6.10} \]
or
\[ \delta_{j}^{k} = \sum_{i} \delta_{i}^{k+1}\, w_{ij}^{k+1} \left[ f'\left( net_{j}^{k} \right) + \gamma \right] \quad \text{when } k \text{ is a hidden layer} \tag{6.11} \]
Numerical studies have recommended that the value of γ be selected between 0 and 0.15; it can be varied during the training process. The purpose of adding this small positive value to $f'(net_{j}^{k})$ is to maintain a nonzero $\delta_{j}^{k}$ value and to prevent the weight matrix from stagnating.
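The two modifications can be sketched as small helper functions. The numerical factors below (`inc`, `dec`, and the tolerance `tol` standing for "a few percent") are hypothetical placeholders, since the text gives no values for them, and the function names are illustrative. The first helper follows the rule attributed to Vogl et al.; the second applies the jump factor of Equation 6.10.

```python
def adapt_learning_rate(eta, alpha, e_new, e_old, inc=1.05, dec=0.7, tol=0.04):
    """Return (eta, alpha, accept_step) after comparing two epoch errors.

    inc, dec and tol are assumed values, not taken from the text.
    """
    if e_new < e_old:
        # Error reduced: enlarge the learning rate for the next epoch.
        return eta * inc, alpha, True
    if e_new > e_old * (1.0 + tol):
        # Error grew by more than the tolerance: reject the weight changes,
        # shrink the learning rate, and switch the momentum off.
        return eta * dec, 0.0, False
    # Small increase: keep the step and leave the rates unchanged.
    # (When a zeroed momentum rate is restored is left to the caller.)
    return eta, alpha, True


def output_delta(y_target, y, f_prime, gamma=0.1):
    """Output-layer delta with jump factor gamma (Equation 6.10 form)."""
    return (y_target - y) * (f_prime + gamma)
```

Even when the activation saturates and `f_prime` is essentially zero, `output_delta` stays proportional to the output error, which is the stated purpose of the jump factor.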
6.4 Performance of MLP
Many factors can influence the performance of an MLP, including how many hidden layers are needed for a specified problem and how many neurons per layer; how well a model can generalize; how training samples should be selected; the information content of the inputs for an NN model, etc. Some of these problems have already been solved definitively; others are still under intensive investigation. Hertz et al. (1991) reviewed the number of hidden units, input representation, and generalization, based on several solid theorems (Lippmann, 1987; Denker, 1987; Lapedes and Farber et al., 1988). The following subsections discuss computational treatments of some of these issues.
6.4.1 Number of Neurons in Hidden Layers
Back-propagation can train multilayer feed-forward networks with differentiable transfer functions to perform function approximation, pattern association, and pattern classification. The term back-propagation refers to the process by which derivatives of the network error, with respect to network weights and biases, can be computed. This process can be used with a number of different optimization strategies. The architecture of a multilayer network is not completely constrained by the problem to be solved. The number of inputs to the network is constrained by the problem and the number of neurons in the output layer is constrained by the number of outputs required by the problem. However, the designer decides the number of layers between network input and the output layers as well as the sizes of the layers. The two-layer sigmoid/linear network can represent any functional relationship between inputs and outputs if the sigmoid layer has a sufficient number of neurons. Two hidden layers are usually recommended for most structural problems (Masri et al., 1996), although one hidden layer theoretically has been demonstrated to be sufficient to model an arbitrary complex nonlinear relationship (Chen and Chen, 1996). Once the numbers of layers are defined, the most difficult task related to the MLP architecture is to determine the number of neurons in each hidden layer; this is usually completed by using numerical experiments (trial and error), and is often tedious, with some uncertainty. One effective method has been proposed to tackle this problem (Xu et al., 2001b). The basic idea is that, for a hidden layer being adjusted, first a larger
neuron number is selected; the correlativity $\gamma_{ij}$ between the output of the ith neuron and the output of the jth neuron, and a criterion parameter $\alpha_i$, are then calculated as
\[ \gamma_{ij} = \frac{o_{ij}}{\sqrt{o_i\, o_j}}, \qquad \alpha_i = \sum_{j=1}^{m} \gamma_{ij}, \qquad i, j = 1, \ldots, m \tag{6.12} \]
where
\[ o_e = \sum_{k=1}^{p} o_{ek}^{2} - \frac{1}{p} \left( \sum_{k=1}^{p} o_{ek} \right)^{2} \ (e = i, j), \qquad o_{ij} = \sum_{k=1}^{p} o_{ik}\, o_{jk} - \frac{1}{p} \sum_{k=1}^{p} o_{ik} \sum_{k=1}^{p} o_{jk} \tag{6.13} \]
m is the current number of neurons, p is the total number of training samples, and $o_{ik}$ is the output of the ith neuron for the kth sample. If $\gamma_{\max} = \max(\gamma_{ij}) \geq \gamma_c$ in this hidden layer (in general, $\gamma_c$ = 0.8 ~ 0.9), the neurons with the larger $\alpha_i$ values are removed first. Then, $\gamma_{\max}$ and $\alpha_i$ are calculated again with the adjusted number of neurons. This process is repeated until $\gamma_{\max} < \gamma_c$ in this hidden layer. In this way, an overabundance of neurons in a hidden layer can be eliminated.
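The pruning loop described above can be sketched as follows. Several points are assumptions rather than statements from the text: the correlativity is computed as a sample correlation coefficient (covariance-type sum divided by the square root of the product of the two variance-type sums), the maximum is taken over pairs of distinct neurons only (the correlation of a neuron with itself is always 1), one neuron is removed per pass, and no neuron has a constant output (which would give a zero denominator). The function names are illustrative.

```python
import math


def correlation_matrix(outputs):
    """outputs[i][k] is the output of hidden neuron i for training sample k."""
    m, p = len(outputs), len(outputs[0])
    sums = [sum(o) for o in outputs]

    def s(i, j):
        # Covariance-type sum over the p training samples.
        return (sum(a * b for a, b in zip(outputs[i], outputs[j]))
                - sums[i] * sums[j] / p)

    return [[s(i, j) / math.sqrt(s(i, i) * s(j, j)) for j in range(m)]
            for i in range(m)]


def prune_neurons(outputs, gamma_c=0.85):
    """Return indices of the neurons kept after removing redundant ones."""
    kept = list(range(len(outputs)))
    while len(kept) > 1:
        g = correlation_matrix([outputs[i] for i in kept])
        n = len(kept)
        gamma_max = max(g[i][j] for i in range(n) for j in range(n) if i != j)
        if gamma_max < gamma_c:
            break
        alpha = [sum(row) for row in g]
        # Remove the neuron with the largest alpha, then recompute.
        kept.pop(max(range(n), key=alpha.__getitem__))
    return kept
```

For example, if two hidden neurons produce outputs that are scaled copies of each other over all training samples, one of them is removed, while a neuron whose output is uncorrelated with the others is retained.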
6.4.2 Training Samples
Rogers (1994) indicated that, in addition to the proper network architecture and the efficient learning algorithm, the selection of training samples is another key factor in obtaining a reliable MLP model for the studied problem. Generally, an ideal set of training samples should be complete, i.e., able to represent the total sample space. One common method is to use a complete combination (Manson et al., 1989). For cases where there are p parameters and each parameter comes with q discrete values, the number of all the possible combinations is q^p. The completeness of samples is guaranteed by these q^p samples; however, the number is prohibitively large and it is impractical to include all the samples for complex engineering problems. Another simplified method with a similar idea is to use a hypercube to cover the sample space (Manson et al., 1989), but the number of required samples is still large. The linear method (Rogers, 1994) generates the training samples by starting at the lower bound of each parameter and then stepping through the sample space at a given increment until reaching the upper bound. Recently, Atalla and Inman (1998) have suggested that the random generation of the characteristic parameters within their variation ranges could produce a good training result. Levin and Lieven (1998) proposed a two-part scheme. The first part consisted of assigning each parameter in turn to one of the q discrete values while setting
all the other parameters to their respective nominal values. The second part consisted of generating a given number of training samples by adjusting a random selection of the p values by a random amount. The orthogonal array (OA) method (Besterfield et al., 1995) has been adopted to generate representative training samples (Liu et al., 2001b; Xu et al., 2001b). The OA was originally developed for experimentalists to reduce the number of experimental trials normally required in a full factorial experimental design (Manson et al., 1989). With this OA method, only p(q – 1) + 1 combinations (rather than q^p) are required for representing the total sample space for the case stated previously, if no interaction takes place among the p parameters. The number p(q – 1) + 1, which is determined by the corresponding orthogonal array L_A(q^p) where A = p(q – 1) + 1, is significantly smaller than the complete sample number of q^p, especially when the number of parameters and/or their discrete values to be considered is large. This method is able to guarantee the completeness of samples. In addition, because each sample is orthogonal to the others among these p(q – 1) + 1 samples, the effect of each parameter on the trained MLP model will tend to be accurate and reproducible. The orthogonal array can be found in any engineering mathematics handbook; as an example, Table 6.1 gives the orthogonal array of samples with six parameters with five levels.
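The sample counts quoted above, and the balance property of an orthogonal array, can be checked with a short sketch. The modular construction below is an assumption introduced for illustration (a standard way to build a five-level array with 25 runs); it is not stated in the text, although it reproduces the level pattern listed in Table 6.1. The function names are hypothetical.

```python
from itertools import combinations, product


def five_level_oa():
    """Build a 25-run, six-column, five-level orthogonal array."""
    q = 5
    rows = []
    for a, b in product(range(q), repeat=2):
        # Column 1 is the block index; the remaining five columns are
        # (m * a + b) mod q for m = 0, ..., 4, shifted to levels 1..5.
        rows.append([a + 1] + [(m * a + b) % q + 1 for m in range(q)])
    return rows


def is_orthogonal(rows, q=5):
    """True if every pair of columns contains all q*q level pairs."""
    for c1, c2 in combinations(range(len(rows[0])), 2):
        if len({(r[c1], r[c2]) for r in rows}) != q * q:
            return False
    return True


p, q = 6, 5
oa = five_level_oa()
n_full = q ** p             # complete combination: 15625 samples
n_oa = p * (q - 1) + 1      # orthogonal array: 25 samples
```

Each row of `oa` gives the level (1 to 5) assigned to each of the six parameters for one training sample; mapping the levels to physical parameter values is problem-specific.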
6.4.3 Normalization of Training Data Set
The NN model requires normalization of input and output data. The sigmoid transfer function is commonly used in the BP algorithm, so the system cannot actually reach its extreme values of 0 and 1 without infinitely large weights. However, it is appropriate in practice to normalize the input as well as output patterns between 0.1 and 0.9 (Topping and Bahreininejad, 1997). The inputs of the training samples are normalized linearly based on the following formula:
\[ \bar{x}_i = \frac{x_i - x_{i\min} + \varepsilon_1}{x_{i\max} - x_{i\min} + \varepsilon_2} \tag{6.14} \]
where $x_{i\min}$ and $x_{i\max}$ are the minimum and maximum values of the ith input value $x_i$ in the sample data set, respectively; $\bar{x}_i$ is the normalized value of parameter $x_i$, ranging between 0 and 1.0. $\varepsilon_1, \varepsilon_2$ ($0 \leq \varepsilon_1 < \varepsilon_2 \ll 1$) are the scaling factors to ensure that the normalized values are not close to 0 or to 1. The outputs can be normalized in exactly the same way.
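Equation 6.14 translates directly into a few lines of code. The default values of the two scaling factors below are arbitrary placeholders chosen only to satisfy the stated ordering; suitable values depend on the range of the data, and the function name is illustrative.

```python
def normalize(values, eps1=0.01, eps2=0.02):
    """Linear normalization of one input variable following Equation 6.14.

    eps1 and eps2 are assumed values satisfying 0 <= eps1 < eps2.
    """
    lo, hi = min(values), max(values)
    return [(v - lo + eps1) / (hi - lo + eps2) for v in values]


normalized = normalize([2.0, 4.0, 6.0])
```

The smallest sample maps to eps1 / (range + eps2) and the largest to (range + eps1) / (range + eps2), so with eps1 > 0 all normalized values lie strictly inside the interval (0, 1).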
6.4.4 Regularization
Generalization is an attribute of the NN whose output for a new input vector tends to be close to outputs for similar input vectors in its training set. One
TABLE 6.1 Five-Level Orthogonal Array
                 Row Number
Sample Number    1    2    3    4    5    6
1                1    1    1    1    1    1
2                1    2    2    2    2    2
3                1    3    3    3    3    3
4                1    4    4    4    4    4
5                1    5    5    5    5    5
6                2    1    2    3    4    5
7                2    2    3    4    5    1
8                2    3    4    5    1    2
9                2    4    5    1    2    3
10               2    5    1    2    3    4
11               3    1    3    5    2    4
12               3    2    4    1    3    5
13               3    3    5    2    4    1
14               3    4    1    3    5    2
15               3    5    2    4    1    3
16               4    1    4    2    5    3
17               4    2    5    3    1    4
18               4    3    1    4    2    5
19               4    4    2    5    3    1
20               4    5    3    1    4    2
21               5    1    5    4    3    2
22               5    2    1    5    4    3
23               5    3    2    1    5    4
24               5    4    3    2    1    5
25               5    5    4    3    2    1
of the problems that occurs during neural network training is called overfitting. Even a very small error in the training set can generate a large error of the output of the NN. The network has memorized the training examples, but it has not learned to generalize for new situations. In order to illustrate overfitting clearly, consider the following example consisting of inputs p and targets t to solve with a neural network. This simple example involves fitting a noisy sine wave. The problem is stated as t = −1.0 + 0.05 * (i − 1), i = 1, 2, … , 41 x = sin( πt) + 0.1 * random(t)
(6.15)
Here a two-layer feed-forward network provided in MATLAB is used. The network's input range is [–1, 1]. The first layer has 20 tan-sigmoid neurons, and the second layer has one linear (purelin) neuron. Figure 6.4 shows the response of the 1–20–1 neural network that has been trained to approximate a sine function using noisy input (measurement) data. The sine function without noise is shown by the solid line, the noisy measurements are given by the “+” symbols, and the output of the neural network is given by the dashed line. Clearly this network has overfitted the data. Because the noise in a measurement is random in nature, feeding the overfitted network any noisy input that is not in the training sample can lead to serious error; the overfitted network will therefore not adapt well to general cases.

FIGURE 6.4 Outputs of an NN trained to approximate a sine function using a set of noisy data; x is plotted against t (solid line: sine function without noise; “+”: noisy measurements; dashed line: outputs of the NN).

One method to improve network adaptability for general cases is to use a network that is just large enough to provide an adequate fit. The larger the network, the more complex the functions it can emulate. Therefore, a sufficiently small network will not have the capacity to overfit the data. Unfortunately, it is often difficult to know beforehand how large a network should be for a particular engineering application.

Regularization is one efficient method for improving the adaptability of the network. As specified by Equation 6.2, the typical performance function for training a feed-forward NN is the mean sum of squares of the network errors. It is possible to modify the performance function by adding a term that consists of the mean of the sum of squares of the weights and biases, e.g.,

$$E_r(\mathbf{W}) = \gamma \left\| \mathbf{Y}^p - \mathbf{Y}^t \right\|^2 + (1 - \gamma) \left\| \mathbf{W} \right\|^2 \tag{6.16}$$

where γ is the regularization parameter. Using this modified performance function causes the network to have smaller weights and biases, which forces the network response to be smoother and less likely to overfit.
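The modified performance function of Equation 6.16 can be evaluated as in the following Python sketch. The function name is hypothetical, and plain sums of squares are used exactly as the equation is written; an implementation based on means, as described in the text, would differ only by constant factors.

```python
import numpy as np


def regularized_performance(y_pred, y_target, weights, gamma):
    """Equation 6.16: gamma * ||Yp - Yt||^2 + (1 - gamma) * ||W||^2.

    `weights` is a flat array collecting all weights and biases.
    """
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_target, dtype=float)
    w = np.asarray(weights, dtype=float)
    return gamma * np.sum(err ** 2) + (1.0 - gamma) * np.sum(w ** 2)


# gamma = 1 recovers the unregularized sum of squared errors.
E_plain = regularized_performance([1.0, 2.0], [1.0, 1.0], [3.0, 4.0], gamma=1.0)
E_reg = regularized_performance([1.0, 2.0], [1.0, 1.0], [3.0, 4.0], gamma=0.5)
```

Lowering γ shifts the emphasis from fitting the data toward keeping the weights small.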
FIGURE 6.5 Results for the regularized outputs of the 1–20–1 NN; x is plotted against t (solid line: sine function without noise; “+”: noisy measurements; dashed line: outputs of the NN).
The problem with this type of regularization is that it is difficult to determine the optimal value of the regularization parameter (performance ratio) γ. Making this parameter too large may result in overfitting. If it is too small, the network will not adequately fit the training data. This issue is quite similar to that encountered in using the Tikhonov regularization (see Chapter 3). MacKay (1992) has proposed an automated process to determine the optimal regularization parameters based on the Bayesian concept. In this work, the weights and biases of the network are assumed to be random variables with specified distributions. The regularization parameters are related to the unknown variances associated with these distributions and can then be estimated using statistical techniques. Figure 6.5 shows the response of the trained 1–20–1 network with the preceding automated Bayesian regularization (trainbr algorithm in MATLAB). In contrast to Figure 6.4, in which the unregularized network overfitted the data, the network response (dashed line) is very close to the sine function given by the solid line; therefore, the network will adapt well to new noisy inputs.
6.5
A Progressive Learning Neural Network
The relationship between the specified inputs and the outputs for practical engineering problems can be extremely complex and highly nonlinear. It can be very difficult, even impossible, to train a conventional NN model for such a complex relationship so that it is valid over a wide range of parameters; therefore, more sophisticated neural networks are required. Many adaptive neural networks have been reported for various applications (Bai and Farhat, 1992; Leung and Payandeh, 1996; Barro et al., 1998). Next, a strategy for creating a neural network that is trained in a progressive manner is introduced. The progressive learning neural network was used for model updating of dynamic structures by Chang et al. (2000), for crack detection of composite laminates by Xu et al. (2001b), and for identification of material properties of composites by Liu et al. (2001b). The progressive learning NN model is shown schematically in Figure 6.6, where a forward solver is employed to generate training samples. After the initial training of the NN model, the outputs $Y^p$ can be predicted by feeding the inputs (for instance, measured values) $X^m$ into the initially trained NN model. These outputs are then fed into the forward solver to produce a set of calculated inputs $X^c$. A comparison between the calculated inputs $X^c$ and the measured inputs $X^m$ is made based on a given criterion. If these two sets differ so much that the criterion is not satisfied, the NN model is retrained online using adjusted training samples that contain $X^c$ and $Y^p$. The retrained NN model is then used to generate the outputs again by feeding in the measured inputs $X^m$. This online retraining procedure is repeated until the difference between the calculated and measured inputs satisfies the given criterion. At the end of the progressive training, the final output, when fed into the forward solver, produces inputs very close to the measured ones. This is the basic framework of a progressive neural network.
Retraining of the NN model is achieved by adding new samples to the original pool of samples in order to enforce a more stringent convergence criterion. It has been pointed out that it could be difficult to achieve the same level of convergence while maintaining the same NN architecture when the number of samples increases. To tackle this problem, a dynamic adjustment method for selecting samples for retraining is used. While adding a new sample to retrain the NN model, one sample from the original sample set can be removed so as to maintain the same number of samples. The sample to be removed is the one that has the largest distance norm from the measured inputs $X^m$. The distance norm of the inputs between the ith sample $X_i$ and the measured inputs $X^m$ is defined as

$$d = \left\| X^m - X_i \right\|^2 \tag{6.17}$$
By replacing the remote sample with a new sample, the sample density around the measured displacement responses increases as the process progresses. As a result, the modeling accuracy of the NN model in the neighborhood of the measured displacement responses could be improved.
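The retraining loop described above can be sketched in Python as follows. This is only a toy illustration under loud assumptions: the forward solver is a hypothetical scalar function, and a least-squares straight line stands in for the NN model so that the example stays short and deterministic. Only the loop structure (predict, run the forward solver, compare, replace the most remote sample according to Equation 6.17, retrain) follows the text.

```python
import numpy as np


def forward_solver(y):
    """Hypothetical forward model (a stand-in for, e.g., an FEM solver)."""
    return y + y ** 3


def train(X, Y):
    """Stand-in for NN training: least-squares line Y ~ a + b*X."""
    A = np.column_stack([np.ones_like(X), X])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef


def predict(coef, x):
    return coef[0] + coef[1] * x


def progressive_inverse(x_m, y_samples, tol=1e-4, max_iter=50):
    """Progressively retrain the inverse model around the measured input x_m."""
    Y = np.asarray(y_samples, dtype=float).copy()
    X = forward_solver(Y)                  # initial training samples
    errors = []
    y_p = None
    for _ in range(max_iter):
        coef = train(X, Y)                 # (re)train the inverse model
        y_p = predict(coef, x_m)           # characterized output
        x_c = forward_solver(y_p)          # calculated input
        errors.append(abs(x_c - x_m))      # comparison criterion
        if errors[-1] < tol:
            break
        far = np.argmax((X - x_m) ** 2)    # most remote sample, Eq. 6.17
        X[far], Y[far] = x_c, y_p          # replace it with the new sample
    return y_p, errors, X


x_measured = forward_solver(1.2)           # synthetic "measurement"
y_found, mismatch, X_final = progressive_inverse(
    x_measured, np.linspace(0.0, 2.0, 5))
```

Because the most remote sample is replaced at each cycle, the number of samples stays fixed while the samples gradually cluster around the measured input.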
[Flowchart: the NN model architecture and the initial training samples are used for the initial training of the NN model; the measured inputs $X^m$ are fed into the NN model to give the characterized outputs $Y^p$; the forward solver converts $Y^p$ into the calculated inputs $X^c$; $X^c$ and $X^m$ are compared; if the criterion is not satisfied (No), the training samples are adjusted and the NN model is retrained; otherwise (Yes), the results are output.]

FIGURE 6.6 Flowchart of the progressive NN. (From Liu, G.R. et al., Composites Sci. Technol., 61, 1401–1411, 2001. With permission.)
The accuracy of the output from the NN model increases with the number of retraining cycles. The desired accuracy can therefore be obtained by repeating the retraining process. This progressive NN is adopted as the computational inverse operator in the following chapters for solving various inverse problems.
6.6
A Simple Application of NN
Several general applications of neural networks are provided by the neural network toolbox 4.0.1 of MATLAB (2001); among them, the recognition of the alphabet is quite useful for machine-performed recognition. Thus, this application is presented here as a typical instance of an NN.
6.6.1
Inputs and Outputs of the NN Model
A network is to be designed and trained to recognize the 26 letters of the alphabet. An imaging system that digitizes each letter centered in the system's field of vision is defined in MATLAB. The result is that each letter is represented as a 5 × 7 grid of Boolean values. The twenty-six 35-element input vectors are defined as a matrix of input vectors. Each letter is defined by the 5 × 7 bit maps. For instance, the 5 × 7 bit map input vector for letter E is given as
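$$\text{LetterE}_{\text{input}} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}^{T} \tag{6.18}$$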
The target vectors are also defined. Each target vector is a 26-element vector with a 1 in the position of the letter it represents, and 0s everywhere else. For example, the letter E is to be represented by a 1 in the fifth element (because E is the fifth letter of the alphabet), and 0s in the other elements:

$$\text{LetterE}_{\text{output}} = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \end{bmatrix}^{T}_{26 \times 1} \tag{6.19}$$
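The input and target vectors for the letter E can be assembled as in the following Python sketch. The variable and function names are hypothetical, and the row-by-row flattening order is an assumption; the text does not state how the bit map is ordered within the 35-element vector.

```python
import numpy as np

# Rows of the array in Equation 6.18 (before the transpose).
LETTER_E = np.array([
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
])


def input_vector(bitmap):
    """Flatten a bit map into a 35-element input vector (row-major, assumed)."""
    return np.asarray(bitmap, dtype=float).reshape(-1)


def target_vector(letter):
    """26-element target with a 1 at the position of `letter` (Equation 6.19)."""
    t = np.zeros(26)
    t[ord(letter.upper()) - ord("A")] = 1.0
    return t


x_E = input_vector(LETTER_E)
t_E = target_vector("E")
```

Stacking the 26 input vectors column by column would give the 35 × 26 matrix of input vectors mentioned above.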
In addition, the network should be able to handle noise. In practice, the network does not receive a perfect Boolean vector as input. Specifically, the network should make as few mistakes as possible when classifying vectors with noise-contaminated inputs. Similar to Equation 6.8, the input vector of letter E with Gaussian noise (defined in Equation 2.130) of mean 0 and standard deviation 0.2 is given as

$$\begin{bmatrix} 1.113 & 0.9912 & 1.2564 & 0.7314 & 0.8742 & 1.0910 & 1.1736 \\ 0.8004 & -0.1337 & -0.3144 & 0.9304 & 0.5271 & 0.1512 & 0.8581 \\ 0.5785 & -0.1398 & 0.2520 & 1.0035 & 0.0209 & 0.2270 & 0.8097 \\ 0.8918 & 0.0419 & 0.019 & 1.0447 & -0.0868 & 0.1283 & 0.7145 \\ 0.7874 & 0.0169 & 0.1463 & -0.2483 & 0.0216 & 0.2258 & 0.9901 \end{bmatrix}^{T} \tag{6.20}$$
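A noise-contaminated input of this kind can be produced as in the Python sketch below. The function name and the use of a seeded NumPy generator are assumptions for illustration; the particular values in Equation 6.20 come from the authors' own random draw and will not be reproduced.

```python
import numpy as np

# Ideal bit map of the letter E (rows of Equation 6.18 before the transpose).
LETTER_E = np.array([
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
], dtype=float)


def add_gaussian_noise(x, std=0.2, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `std`."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, std, size=x.shape)


clean_E = LETTER_E.reshape(-1)
noisy_E = add_gaussian_noise(clean_E, std=0.2, seed=1)
```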
6.6.2
Architecture of the NN Model
The network receives the 35 Boolean values as a 35-input vector and is then required to identify the letter by responding with a 26-element output vector. Each of the 26 elements of the output vector represents a letter. To operate correctly, the network should respond with a 1 in the position of the letter being presented to the network. All other values in the output vector should be 0. Thus the NN needs 35 inputs and 26 neurons in its output layer to identify the letters. The network is a two-layer log-sigmoid/log-sigmoid network (MATLAB, 2001). The hidden layer has 10 neurons.
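The 35–10–26 log-sigmoid/log-sigmoid architecture can be written out as a forward pass in a few lines of Python. The sketch below uses randomly initialized, untrained weights and hypothetical function names; it shows only the layer sizes and the flow of data, not the MATLAB toolbox implementation.

```python
import numpy as np


def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))


def init_network(n_in=35, n_hidden=10, n_out=26, seed=0):
    """Small random weights and zero biases (illustrative initialization)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_hidden, n_in)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_out, n_hidden)), "b2": np.zeros(n_out),
    }


def forward(net, x):
    """Hidden layer of 10 log-sigmoid neurons, output layer of 26."""
    a1 = logsig(net["W1"] @ x + net["b1"])
    return logsig(net["W2"] @ a1 + net["b2"])


def classify(net, x):
    """Return the letter whose output element is largest."""
    return chr(ord("A") + int(np.argmax(forward(net, x))))


net = init_network()
y = forward(net, np.ones(35))
```

After training, the largest element of the output vector indicates the recognized letter, which is what `classify` returns.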
6.6.3
Training and Performance of the NN Model
To create a network that can handle noise-contaminated input vectors, it is best to train the network on both ideal and noise-contaminated vectors. The network is first trained on ideal vectors until it has a low error, e.g., below 0.1. Figure 6.7 shows the convergence process during training. The network is then trained on noise-contaminated vectors. Noise with mean 0.1 and 0.2 is added to the ideal vectors to simulate the noise-contaminated inputs. This forces the network to learn how to identify noise-contaminated letters properly, while requiring that it still respond well to ideal vectors. To train with noise, the maximum number of epochs is reduced to 300 and the target error is increased to 0.6, reflecting the higher error expected because more vectors are being presented. All training is done using back-propagation with adaptive learning rate and with momentum. Once the network is trained with noise, it is recommended to train it again without noise to ensure that ideal input vectors are always classified correctly. To test the system, a letter with noise must be created and presented to the network. Figure 6.8(a) illustrates an input with noise; the identified result for letter E from the NN is shown in Figure 6.8(b).

FIGURE 6.7 Training error norm of the NN model without noise for the character recognition (training error, on a logarithmic scale, versus training numbers from 0 to 247).

FIGURE 6.8 The performance of the NN model for character recognition; (a) the letter E with noise, (b) the recognized letter E from NN.
6.7
Remarks
• Remark 6.1: NN techniques are well known for their ability to model nonlinear and complex relationships between the structure parameters and the dynamic characteristics. The NN approach offers the advantages of a highly efficient inversion solution procedure that avoids thousands of hours of computation online using the forward solver. The NN model is not very sensitive to the ill-posedness of the inverse problem.
• Remark 6.2: MLP is the common NN model and its performance can be improved via several treatments. Regularization is an effective method to solve overfitting, as well as an efficient method for improving the adaptability of NN models.
• Remark 6.3: The relationship between the specified inputs and the outputs for practical engineering problems can be extremely complex. It is very difficult and even impossible to train an NN model for such a complex relationship valid in a wide range of parameters. A progressive NN can be used for such purposes to improve the results of the network in specified ranges of parameters for complex practical problems.
6.8
References on Neural Networks
Anderson, J.A., Introduction to Neural Networks, MIT Press, Cambridge, MA, 1995.
Anderson, J.A. and Rosenfeld, E., Neurocomputing: Foundations of Research, MIT Press, Cambridge, MA, 1988.
Caudill, M., Neural Networks Primer, Miller Freeman Publications, San Francisco, 1989.
Caudill, M. and Butler, C., Understanding Neural Networks: Computer Explorations, Vols. 1 and 2, MIT Press, Cambridge, MA, 1992.
DARPA Neural Network Study, MIT Lincoln Laboratory, Lexington, MA, 1988.
Grossberg, S., Studies of the Mind and Brain, Reidel Press, Dordrecht, Holland, 1982.
Hagan, M.T., Demuth, H.B., and Beale, M.H., Neural Network Design, PWS Publishing, Boston, 1996.
Haykin, S., Neural Networks: A Comprehensive Foundation, 2nd ed., Prentice Hall Inc., New York, 1999.
Hertz, J., Krogh, A., and Palmer, R.G., Introduction to the Theory of Neural Computation, Addison-Wesley Publishing Company, Reading, MA, 1991.
Jolliffe, I.T., Principal Component Analysis, Springer-Verlag, New York, 1986.
Kohonen, T., Self-Organization and Associative Memory, 2nd ed., Springer-Verlag, Berlin, 1987.
7 Inverse Identification of Impact Loads
In this chapter, inverse procedures for identification of impact loads are presented. In these procedures, the dynamic displacement response to a load with an arbitrary spatial or time function is expressed in a form of convolution. These continuous convolution functions are then spatially and temporally discretized. The discretized force functions are recovered by minimizing the formulated functionals of error. Traditional optimization methods such as the algorithm for nonlinear least squares problems, as well as the conjugate gradient method, are adopted to solve this minimization problem. Numerical examples of identification of impact loads applied on beam and on plate structures are presented to demonstrate the force inversion procedures. These procedures of impact force identification are generally useful for many engineering inverse problems related to source identification, such as wave source identification problems, as long as the outputs (effects) can be expressed in a form of convolution in terms of the inputs (sources, forces, etc.).
7.1
Introduction
Due to dynamic effects, impacts can often cause extensive delaminations within structures (especially composite structures) that can severely degrade their load-carrying capability. Considering the great effects of impact loading on the integrity and serviceability of a structure, it is obviously valuable to have a better understanding of the loading profile, such as its spatial distribution and time history. The loading profile is also very important for the design optimization of practical structures; however, it is difficult to obtain in an actual engineering application. The traditional ways of direct sensing require access to the impact location. Direct sensing is therefore technically difficult, and sometimes impossible, because of the complex nature of the impact loading or the danger that arises in the actual situation. On the other hand, it is usually convenient to measure the structural response at points remote from the impact location. Therefore, a properly formulated inverse procedure can be very promising for the reconstruction of the impact loading profile using the structural responses measured at a point distant from the impact location.
Early attempts to determine the source function based on the response data include work by Goodier et al. (1959); in their study, an integral equation was solved to calculate the time history of a vertical force applied to a half-space using the far-field response. Hsu et al. (1977) and Michaels et al. (1981) discretized and inverted a time convolution integral to determine the time history of a vertical force applied to a plate from the near-field response. The important factor in the success of their work was the availability of an artificial source, which is generated by fracturing a glass capillary against the structure surface (Michaels and Pao, 1986). Doyle (1984a, b, 1987) presented a series of papers on determining the impact force for beam and plate types of structures subjected to transverse impacts. Strain gauges were used in his experiments to sense the strain responses at selected locations on the specimens. The force/strain relations for the transverse impact of beams and plates were established in the time domain for the beam, and in the frequency domain for both the beam and the plate. Inversion by the use of the fast Fourier transform (FFT) was shown to determine the force history. Michaels and Pao (1985) proposed a double iterative method to solve an inverse source problem for an oblique impact force on an elastic plate. The orientation and time function of the oblique force were recovered from the transient response of the plate surface, with a minimum of two receivers required to sense the wave motions. Numerical and experimental verifications were performed to validate the proposed method.
Later, Chang and Sun (1989) proposed a modified method to recover the transverse impact force on a composite laminate by using experimentally generated Green’s function and signal deconvolutions. No governing equations for the structure were needed because all the boundary conditions, material properties, and structure geometry were accounted for through the use of the experimental Green’s function found from the actual structure. Chang and Sachse (1985) studied the forward and inverse problems of an extended, finite source of elastic waves in a thick plate. The signals received at a point near the source were computed by a superposition of the signals found with a generalized ray algorithm from point sources of variable strength arranged along a straight line. An iterative procedure of deconvolution was developed to utilize the signals detected at only one receiver point to obtain the solution for the source inverse problem. Yen and Wu (1995) proposed a method to identify the impact location and the time history of a transverse impact force from the strain responses at certain points on a rectangular plate. The governing equations for the plate were obtained by applying the Reissner–Mindlin plate theory. The strain response was related to the impact force by solving these equations using the eigenmode expansion method. A mutual relationship between any pairs of strain responses was used to find the impact locations without knowing the impact force history in advance. The force history was subsequently
determined after the impact location was identified. The conjugate gradient method was adopted to search for the optimal impact location as well as the force history. Numerical and experimental verifications were performed. Law et al. (1997) solved an identification problem of the vertical dynamic interaction force between a moving vehicle and a bridge deck, which was modeled as one- or two-point loads moving at a constant speed on a simply supported beam with viscous damping. Acceleration measurements were adopted, and the numerical predictions appeared good. Möller (1999) provided an outline of work on load identification and proposed a new method that attempts to approach load identification problems in a general manner for homogeneous solid plates. He employed the Betti reciprocal theorem together with a set of reference load cases to calculate the required magnitude of the unknown load with assumed spatial shape and position of the load. Without solving any equations, a large number of trial load cases with assumed distribution were evaluated to determine the respective magnitudes. Then an optimization scheme with added discrete mass as a design variable was adopted to reproduce a structure such that each set of responses generated by each one of the previously identified loads was clearly distinguishable at the transducer positions. Investigations on the force inverse problem for composite laminates with many anisotropic layers have also been reported. An inverse procedure using elastic waves (Lamb waves) has been proposed to deal with the force identification problem of composite laminates for two-dimensional cases (Liu and Ma, 1999; Liu et al., 2002e), as well as for three-dimensional cases (Liu et al., 2002f). The time history and the spatial distribution function of line loads or point loads applied on the surface of composite laminates are reconstructed from the dynamic displacement response at one receiving point.
In these methods, by employing the HNM method (see Section 7.4.1) as a forward wave solver, the time history is reconstructed for concentrated loads as well as for distributed loads on the surface of composite laminates. Based on work performed by Liu and co-workers, this chapter introduces inverse procedures for force identification using Lamb wave response in laminates. Identification of impact loads applied on the surface of beams is examined first; identification of loads on composite laminates of many anisotropic layers is then studied with these inverse procedures for two-dimensional (line load) and three-dimensional (point load) cases.
7.2
Displacement as System Effects
Inverse identification problems in mechanics need to use effects of the system to construct the objective function of error. The choice of the effects should be based on the three considerations detailed in Section 2.9: the effects should be sensitive, accurate, and easy to acquire. The effects for the force identification problem can be the displacement, velocity, or acceleration at points of the structure excited by harmonic or transient forces. All of these can be obtained without much difficulty, both experimentally and computationally. In most of the cases discussed in this book, displacement is chosen over velocity and acceleration because
• The direct output of a sensor used in the experiment is often in the form of velocity or acceleration. The displacement can be obtained by simple integration of the time-history output of the velocity or acceleration. Integration is a smoothing operation, so it helps to remove some of the noise that may be included in the measurement of velocity or acceleration. Therefore, the accuracy of the measurements can be improved once they are converted to displacements.
• The direct output of a forward solver is often in the form of displacements. If the velocity or the acceleration is used in the evaluation, a differential operation must be applied to the displacement output. Because differentiation roughens the data, it will magnify the errors in the solution of the forward solver. Therefore, it should be avoided when possible.
The disadvantage of using displacement as the effect for computing the objective function is that the displacement can be less sensitive to changes in the system than velocity and acceleration. Therefore, if sensitivity is a concern in the inverse problem, the use of velocity or acceleration should be considered.
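The contrast between integration and differentiation of noisy data can be checked numerically. The Python sketch below uses a synthetic 5 Hz sinusoidal response, a 1% noise level, and a sampling interval of 0.1 ms, all of which are hypothetical choices; it integrates a noisy velocity record and differentiates a noisy displacement record, then compares the relative errors.

```python
import numpy as np


def integrate(v, dt):
    """Cumulative trapezoidal integration with zero initial value."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (v[1:] + v[:-1]) * dt)))


def differentiate(u, dt):
    """Finite-difference derivative (central differences in the interior)."""
    return np.gradient(u, dt)


def relative_rms(estimate, truth):
    return np.sqrt(np.mean((estimate - truth) ** 2)) / np.sqrt(np.mean(truth ** 2))


def compare(noise_level=0.01, seed=0):
    """Return (error of integrated velocity, error of differentiated displacement)."""
    rng = np.random.default_rng(seed)
    dt = 1e-4
    t = np.arange(0.0, 1.0, dt)
    omega = 2.0 * np.pi * 5.0
    u_true = np.sin(omega * t)             # displacement, u(0) = 0
    v_true = omega * np.cos(omega * t)     # velocity
    v_meas = v_true + noise_level * omega * rng.standard_normal(t.size)
    u_meas = u_true + noise_level * rng.standard_normal(t.size)
    err_int = relative_rms(integrate(v_meas, dt), u_true)
    err_diff = relative_rms(differentiate(u_meas, dt), v_true)
    return err_int, err_diff


err_int, err_diff = compare()
```

For these settings the differentiated signal is dominated by amplified noise, whereas the integrated signal stays close to the true displacement.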
7.3
Identification of Impact Loads on the Surface of Beams
The purpose in this section is to remotely detect the impact loads that are acting on beams or beam-like structures. The task includes the identification of the time history as well as the spatial distribution of the loading. This section discusses only the identification of time history of impact loads using the finite element method (FEM) as the forward solver. Spatially, the load is assumed to be a point load.
7.3.1
Finite Element Model
The finite element model of a beam is created using two-dimensional quadrilateral solid elements; a plane stress problem is considered because the beam being investigated has a two-dimensional profile with the width much smaller than the height. The mesh is made finer around the point of excitation to ensure the accuracy of the modeling in the area of the point loading. The displacements at the nodes on both ends of the beam are fixed to simulate the clamped boundary condition. The FEM model is fairly simple and easy to create. The analysis can be performed using any commercially available code, such as ABAQUS (2000). Other possible software packages can be found in Table 14.1. Details on how to create the FEM model and use an FEM code such as ABAQUS can be found in a book by Liu and Quek (2003).

FIGURE 7.1 Schematic drawing of the setup for the impact experiment (components labeled in the drawing: mechanical pendulum with impact tip and force transducer, conditioning amplifier, test piece with a flaw clamped at both ends, laser scanning head, scanner supply and controller, video control box, A/D board, DAC, GPIB, and PC). The impact point is maintained at 100 mm away from the left edge of the crack. Impact on the beam is achieved by swinging a mechanical pendulum whose end is a steel ball. Steel balls with diameters of 15.21 and 9.98 mm are used, and the angle of swing is 30°. The input impact force is measured using a Kistler transducer type 8720A500 attached to the steel ball, and the response of the beam in terms of velocity on its surface is measured using a Polytec scanning laser vibrometer. The signals from the force transducer and the laser vibrometer are then discretized and analyzed using a computer. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343, 2002. With permission.)
7.3.2
Validation of FE Model with Experiment
Before the inverse analysis, the forward solver, FEM, is verified with the experiments. The schematic drawing of the experiment setup is shown in Figure 7.1. Impact force is applied at the midspan of the specimen in order to obtain maximum time of observation before the arrival of the waves reflected by the boundary of the two ends of the beam. Impact on the beam is achieved by swinging a mechanical pendulum whose end is a small steel ball. The angle of swing is 30°. The impact force is measured using a Kistler transducer type 8720A500 attached to the steel ball, and the response of the beam in terms of velocity on its surface is measured using a Polytec scanning laser vibrometer. The signals from the force transducer and the laser vibrometer are then discretized, processed, and analyzed using a PC.
FIGURE 7.2 Estimation of the impact load on an aluminum beam with dimensions of 1000 × 4 × 25 mm (length × width × height) from the dynamic response measured at one point away from the impact point. (Labels in the drawing: unknown impact on the beam; response at a point away from the impact location.)
TABLE 7.1 Material Properties of the Beam

Material              Aluminum
Young’s modulus, E    69 GPa
Poisson’s ratio       0.33
Density               2750 kg/m³
An aluminum beam with dimensions of 1000 × 4 × 25 mm (length × width × height), illustrated in Figure 7.2, is examined (Irwan, 2001). The material properties of the beam are listed in Table 7.1. An experiment is conducted using the setup shown in Figure 7.1, and the impact force on the beam is obtained, as shown in Figure 7.3. The measured time history of the force is sampled and used in the FEM model to compute the dynamic velocity response of the aluminum beam using ABAQUS. The FEM results obtained are plotted in Figure 7.4, together with those obtained from the experiments. The experimental and finite element results have been normalized for comparison purposes. This is possible because the normalization will not alter the time-domain profile of the response, and only the profile of the time-history curve is used for the inverse analysis. Figure 7.4 shows excellent agreement between the finite element and experimental results before the arrival of the reflected waves from the two ends of the beam. The arrival time of the reflected waves can be estimated from a response curve recorded using a sensor mounted at a point near the end of the beam (the thick solid lines). This experimental comparison shows that the finite element model has been successfully created to produce accurate results for later inverse analysis.

FIGURE 7.3 Profile of the measured impact force (normalized), plotted as force (N) against time (s). The forces are sampled and used in the FEM model to compute the dynamic velocity response of the aluminum beam.

FIGURE 7.4 Comparison of the simulated (FEM) and experimental velocity responses, plotted as velocity (m/s) against time (s), for the aluminum beam subjected to an impact force, at points 40, 117.5, and 155 mm from the impact point. The thick solid line is the response recorded by a sensor mounted at a point near the beam end for the purpose of estimating the time needed for the waves to travel a distance of half a beam length.

Figure 7.5 shows the velocity responses computed with ABAQUS at 117.5 mm from the impact point for forces of various amplitudes and the same time-history profile. It is observed that the velocity response is directly proportional to the magnitude of the force due to the linear nature of the problem. This indicates that the flexural wave response can be normalized without loss of generality. Therefore, the pattern of variation of the time history of the response can be used in determining the pattern of variation of the time history of the dynamic load.

FIGURE 7.5 Velocity response (m/s) versus time (s) for impact forces with amplitudes of 1 N, 2 N, and 4 N. The results are obtained at 117.5 mm from the impact point for the aluminum beam. The response is proportional to the impact force due to the linear nature of the problem.
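The proportionality argument can be illustrated with a small linear model in Python. In the sketch below the beam is replaced by a hypothetical damped-sinusoid impulse response and the impact by a half-sine pulse; neither is taken from the experiment. Scaling the force scales the response by the same factor, so the normalized profiles coincide.

```python
import numpy as np


def response(force, impulse_response, dt):
    """Response of a linear system as a discrete convolution."""
    return np.convolve(force, impulse_response)[:force.size] * dt


def normalize(x):
    """Scale a time history by its peak absolute value."""
    return x / np.max(np.abs(x))


dt = 2e-6
t = np.arange(200) * dt
# Hypothetical impulse response and force pulse, for illustration only.
g = np.exp(-t / 1e-4) * np.sin(2.0 * np.pi * 2e4 * t)
pulse = np.where(t < 1e-4, np.sin(np.pi * t / 1e-4), 0.0)

responses = {amp: response(amp * pulse, g, dt) for amp in (1.0, 2.0, 4.0)}
```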
7.3.3
Estimation of Loading Time History
The finite element model validated in Section 7.3.2 is first used to compute the displacement response of a beam subjected to a Heaviside time-step
excitation. This response is then used in the following inverse procedure to predict the time history of a concentrated load from the displacement response obtained at only one receiving point on the surface, as shown in Figure 7.2. The identification of the time history of loads using a Heaviside step response is based on a simple formulation. By introducing the Heaviside step function H(t), an arbitrary force or source f(t) can be approximately expressed as:

$$ f(t) = f_0 H(t) + (f_1 - f_0) H(t - t_1) + \cdots + (f_i - f_{i-1}) H(t - t_i), \quad t_i < t < t_{i+1} \tag{7.1} $$

where $f_j$ is the impact force value at time $t = j\Delta t$. Let h(t) denote the displacement response of the beam at time t to a Heaviside step force under zero initial conditions. According to the superposition principle, the response to the impact force f(t) can be expressed as:

$$ u(t) = f_0 h(t) + (f_1 - f_0) h(t - t_1) + \cdots + (f_i - f_{i-1}) h(t - t_i) + \cdots, \quad t_i < t < t_{i+1} \tag{7.2} $$

or

$$ u(t) = f_0 [h(t) - h(t - t_1)] + f_1 [h(t - t_1) - h(t - t_2)] + \cdots + f_{i-1} [h(t - t_{i-1}) - h(t - t_i)] + f_i h(t - t_i) + \cdots \tag{7.3} $$
where $h(t - t_j)$ is the displacement at time $t - t_j$ due to a Heaviside time-step excitation applied at time t = 0. To simplify the notation, let $u_j$ and $h_j$ denote the displacements at time $t = j\Delta t$ due to the impact force and to the Heaviside step force applied at time t = 0, respectively. Equation 7.3 can then be transformed into:

$$ u_{m+1} = \sum_{i=0}^{m} f_i \left[ h_{m-i+1} - h_{m-i} \right] \tag{7.4} $$
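and

$$ g_i = h_i - h_{i-1} \tag{7.5} $$

is introduced, where $g_i$ is called the discrete Green's function. Substituting Equation 7.5 into Equation 7.4 yields

$$ u_{m+1} = \sum_{i=0}^{m} f_i \, g_{m-i+1} \tag{7.6} $$

which can be written in the matrix form

$$ U = \begin{Bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n_s} \end{Bmatrix} = \begin{bmatrix} g_1 & 0 & \cdots & 0 \\ g_2 & g_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ g_{n_s} & g_{n_s-1} & \cdots & g_1 \end{bmatrix} \begin{Bmatrix} f_0 \\ f_1 \\ \vdots \\ f_{n_s-1} \end{Bmatrix} = GF \tag{7.7} $$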
where G is the matrix of the discrete Green's function, and $n_s$ is the total number of sampling points of the measurements. The objective function is defined as:

$$ f_{err} = \| GF - U_m \|^2 \tag{7.8} $$
where $U_m$ is the vector of displacements measured at the same point where $h_i$ is recorded. The task is now to find a force (source) function F in discrete form that minimizes $f_{err}$. The modified Levenberg–Marquardt method for nonlinear least-squares problems introduced in Chapter 4 is used for this purpose. The results obtained for the aluminum beam are shown in Figure 7.6, together with the measured results. The inverse procedure has reproduced the time history of the force very well. The force prediction is more accurate when the response is obtained at the point closer to the impact point, as seen in Figure 7.6.
7.3.4
Boundary Effects
The boundary conditions at the two ends of the beam play an important role in the inverse recovery of impact forces. The Green's function in the convolution must be computed using the same boundary conditions as those of the beam in the experimental measurement. If the boundary conditions of the actual experimental setup are uncertain, any type of boundary condition can be used to compute the Green's function, provided that the receiving point is far enough from the boundary that the measured Um used in Equation 7.8 does not contain waves reflected from the boundary.
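The step-response formulation of Section 7.3.3 (Equation 7.5 to Equation 7.8) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the step response h is taken from a hypothetical single-degree-of-freedom oscillator rather than from the finite element beam model, the "measurement" is synthetic and noise-free, and an ordinary linear least-squares solve (`numpy.linalg.lstsq`) stands in for the modified Levenberg–Marquardt method used in the text, which is possible because the misfit is linear in F.

```python
import numpy as np

def discrete_green(h):
    """g_i = h_i - h_{i-1} for i = 1..ns, given samples h_0..h_ns (Equation 7.5)."""
    return np.diff(h)

def toeplitz_lower(g):
    """Lower-triangular matrix of Equation 7.7: row m+1, column i holds g_{m-i+1}."""
    ns = len(g)
    G = np.zeros((ns, ns))
    for i in range(ns):
        G[i:, i] = g[: ns - i]
    return G

def identify_force(h, u_measured):
    """Least-squares F minimizing ||G F - U_m||^2 (Equation 7.8)."""
    G = toeplitz_lower(discrete_green(h))
    F, *_ = np.linalg.lstsq(G, u_measured, rcond=None)
    return F

# Hypothetical step response of an undamped oscillator (not the beam model).
ns, dt, omega = 80, 0.1, 3.0
t = np.arange(ns + 1) * dt
h = (1.0 - np.cos(omega * t)) / omega**2

# Hypothetical true force: a half-sine pulse sampled at t_0 .. t_{ns-1}.
f_true = np.where(t[:-1] < 2.0, np.sin(np.pi * t[:-1] / 2.0), 0.0)

# Synthetic noise-free "measurement" u_1 .. u_ns generated with Equation 7.7.
u_measured = toeplitz_lower(discrete_green(h)) @ f_true
f_est = identify_force(h, u_measured)
```

With measured data that contain noise, a plain least-squares solve of this kind can amplify the noise, which is one reason an iterative or regularized solver is normally preferred.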
7.4
Line Loads on the Surface of Composite Laminates
The purpose of this section is to remotely detect impact loads that are acting on plates or plate-like structures. The task includes the identification of the time history as well as the spatial distribution of the loading.
FIGURE 7.6 Identified time history of the impact force using responses obtained at points 20 and 80 mm away from the impact point of the aluminum beam. (Plot of force in N against time in s, on a ×10^-4 scale; curves: experiment, force predicted 20 mm from impact point, force predicted 80 mm from impact point.)
Consider now a composite laminate with any number of anisotropic linear elastic layers and an overall thickness H, excited by a line load independent of the y-axis, as shown in Figure 7.7. Because both the load and the displacement field are independent of y, the problem is a two-dimensional plane-strain problem. The composite laminate is assumed to be stationary before the excitation of the dynamic loading.
FIGURE 7.7 Composite laminate subjected to a line impact load on the upper surface. (Diagram labels: x, y, and z axes, thickness H, fiber orientation.)
7.4.1
Hybrid Numerical Method
The hybrid numerical method (HNM) is used in this study as the forward solver. Because the HNM is also used for inverse problems in other chapters, it is briefly described here. The HNM is one of the most efficient numerical tools for two-dimensional or three-dimensional transient wave analysis in composite laminates. In the HNM, the finite element method (FEM), the fast Fourier transform (FFT), and the modal analysis technique for the time integration are combined to achieve high efficiency in computing the transient response in the time domain for a laminate of many anisotropic layers. As part of the basic process of the HNM, following the standard procedure of the finite element method, the laminate is first divided into a number of layer elements in the thickness direction, and a set of approximate partial differential equations for an element is obtained using a weak form. By assembling all the elements, a set of approximate governing partial differential equations (PDEs) for the whole laminate is obtained. Using the standard Fourier transformation technique to perform the spatial-wavenumber transformation, a set of ordinary differential equations (ODEs) with respect to time is obtained in the wavenumber domain. This set of ODEs is then solved using modal analysis techniques (Liu et al., 1991a; Liu and Xi, 2001). The frequencies and the corresponding left and right eigenvectors are calculated by solving the eigenvalue equations corresponding to the ODEs for various wavenumbers. The time history of the displacement vector in the wavenumber domain can then be obtained analytically in a closed form of modal summation for transient excitations of Heaviside time step, delta pulse, and many other simple functions.
The displacement in the wavenumber domain for arbitrary loading can also be calculated using the Duhamel integral, which is also of convolution form. Finally, the time history of the displacement vector in the spatial-time domain is obtained through numerical inverse Fourier transformation using the well-known FFT or other numerical integration schemes. The HNM is computationally efficient for transient wave response calculation, because it avoids singular integrations by integrating over time first, before the inverse wavenumber-to-space Fourier transform. The inverse Fourier transform can be easily performed numerically because no pole lies on the axis of integration. In addition, the computation can be performed in real numbers for most problems. The HNM was developed by Liu, Tani et al. (1991a) based on works by Waas (1972), Dong and Nelson (1972), Nelson and Dong (1973), Weaver and Pao (1982), Kausel (1981, 1986), Santosa and Pao (1989), etc. The HNM was improved (Liu et al., 1997) for efficient time integration, and extended to functionally graded materials (Liu et al., 1991c) and functionally graded piezoelectric materials (Liu and Tani, 1994), as well as to cylindrical structures (Han et al., 2001b).
Many other methods for wave analysis exist. Readers are advised to refer to the book by Liu and Xi (2001) and references in this book.
7.4.2
Why HNM?
The HNM is chosen because it is particularly suitable for inverse analysis. In the HNM, the time integration for the displacements is performed analytically for delta function pulse excitation, Heaviside step function excitation, and many other excitations of simple time functions. Therefore, inverse analyses that use the transient wave response sampled on the time-history curves do not have Type III ill-posedness (see the discussion in Section 2.3). Type I ill-posedness can be removed by sampling more points on the time-history curves of the transient wave response, so that the problem is at least even-posed; this can be done easily, both computationally and experimentally, at no increased cost. To remove Type II ill-posedness, the transient wave response sampled on the time-history curve must be sensitive to the parameters to be inversely determined, which is confirmed on a case-by-case basis in the following numerical and experimental studies.
7.4.3
TransWave©
Based on the formulation of the HNM, a simple desktop software package, TransWave, has been developed by Liu's group. This software can be used to analyze the transient waves propagating in composite laminates and cylinders excited by impact line loads. TransWave has been further upgraded to investigate laminated composite structures as well as functionally graded structures. TransWave consists of three modules: preprocessor, wave solver, and postprocessor. The preprocessor and postprocessor are designed as graphical user interfaces. The preprocessor is used to input the parameters required for computation, and a database of commonly used material constants is built in for the user's selection. The wave solver performs the numerical analysis and produces the results, which are then passed to the postprocessor for graphic display and presentation. The following example briefly shows the performance of TransWave. This software package can currently be downloaded for free from the website of CRC Press: http://www.crcpress.com/e_products/downloads/download.asp?cat_no=1523. As an example, consider a composite laminate as shown in Figure 7.7, consisting of two carbon/epoxy layers and four glass/epoxy layers. The stacking sequence of the laminate is denoted by [C90/G+45/G–45]S, where C and G stand for the carbon/epoxy and the glass/epoxy layers, respectively, and 90, +45, and –45 stand for the angle of fiber orientation to the x-axis. The subscript S means that the laminate is symmetrically stacked. The
TABLE 7.2 Material Properties of Glass/Epoxy and Carbon/Epoxy
Material        E1 (GPa)   E2 (GPa)   G12 (GPa)   ν12      ν23      ρ (g/cm3)
Glass/epoxy     38.49      9.367      3.414       0.2912   0.5071   2.66
Carbon/epoxy    142.17     9.255      4.795       0.3340   0.4862   1.90
FIGURE 7.8 Material selection and modification in the TransWave.
material properties of the carbon/epoxy and glass/epoxy are given by Takahashi and Chou (1987) and listed in Table 7.2. As illustrated in Figure 7.8, the material properties are first selected or modified from the database of the package. After the number of layers of the laminate is specified, the material property, the number of sublayers, the type of load, and the selection of the input as well as the output can be easily defined in the input window shown in Figure 7.9. Figure 7.10, Figure 7.11, and Figure 7.12 give some typical results from TransWave. Figure 7.10 shows the horizontal displacement response on the upper surface of the [C90/G+45/G–45]S composite laminate subjected to a vertical line load whose time history is one cycle of a sine function. Figure 7.11 shows the horizontal displacement response on the upper surface of the same laminate subjected to a vertical line load of time-pulse excitation, and Figure 7.12 shows the horizontal displacement response on the upper surface of the laminate subjected to a Heaviside step line load.
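The stacking-sequence notation can be made concrete with a short sketch. The code below is purely illustrative and is not part of TransWave: the dictionary keys and the function name `expand_symmetric` are hypothetical, and the second modulus column of Table 7.2 is read here as E2 and the last Poisson ratio as ν23, which is an assumption about the table layout.

```python
# Material constants from Table 7.2 (moduli in GPa, density in g/cm^3).
# Reading the second modulus column as E2 and the second Poisson ratio as
# nu23 is an assumption about the table layout.
MATERIALS = {
    "C": {"name": "carbon/epoxy", "E1": 142.17, "E2": 9.255, "G12": 4.795,
          "nu12": 0.3340, "nu23": 0.4862, "rho": 1.90},
    "G": {"name": "glass/epoxy", "E1": 38.49, "E2": 9.367, "G12": 3.414,
          "nu12": 0.2912, "nu23": 0.5071, "rho": 2.66},
}

def expand_symmetric(half_stack):
    """Mirror a half stack about the mid-plane: [a/b/c]S -> a, b, c, c, b, a."""
    return list(half_stack) + list(reversed(half_stack))

# [C90/G+45/G-45]S: (material key, fiber angle in degrees to the x-axis).
half_stack = [("C", 90), ("G", 45), ("G", -45)]
layers = expand_symmetric(half_stack)

# Number of layers of each material in the full laminate.
counts = {key: sum(1 for mat, _ in layers if mat == key) for key in MATERIALS}
```

The expanded list has six layers, two of carbon/epoxy and four of glass/epoxy, which matches the laminate described in the text.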
7.4.4
Comparison between HNM and FEM
The result from the HNM code is also compared with that from the FEM model. An aluminum plate subjected to an excitation of one cycle of a sine function is considered. ABAQUS is employed as the FEM solver.
FIGURE 7.9 Properties of layer, type of load, and output specification in the TransWave input window.
FIGURE 7.10 Horizontal displacement response on the upper surface of the [C90/G+45/G–45]S composite laminate subjected to a line load of one cycle sine function. (Plot of dimensionless displacement against dimensionless time from 0 to 40.)
The response at x = 10H on the upper surface of the plate is computed (see Figure 7.7). Figure 7.13 shows the results obtained from the FEM and the HNM, with the FEM result shown as the dashed line and the HNM result as the solid line. The responses generated by the FEM and by the HNM agree well before the arrival of the reflected waves at t = 15. The FEM model is 1 m in length with fixed displacements at the ends, while the HNM result is obtained for an infinite plate. The comparison between the FEM and the HNM, together with the comparison between the FEM and the experiment (see Figure 7.4), confirms that the HNM is an accurate forward solver.
FIGURE 7.11 Horizontal displacement response on the upper surface of the [C90/G+45/G–45]S composite laminate subjected to a line load of time-pulse excitation. (Plot of dimensionless displacement against dimensionless time from 0 to 40.)
FIGURE 7.12 Horizontal displacement response on the upper surface of the [C90/G+45/G–45]S composite laminate subjected to a line load of Heaviside step excitation. (Plot of dimensionless displacement against dimensionless time from 0 to 40.)
7.4.5
Kernel Functions
Next, the HNM will be used as a forward solver for inverse analysis, as well as to compute the kernel functions, including Green’s functions. Two types of kernel functions can be used for transient wave analysis for sources of excitation of arbitrary time functions. The first is Green’s function and the
FIGURE 7.13 The vertical response at x = 10H to the line load at x = 0 on the upper surface of the aluminum plate computed by HNM and FEM (curves: HNM, FEM). Good agreement is observed between the responses generated by the FEM and by the HNM before the arrival of the reflected waves at t = 15. The FEM model used is 1 m in length with fixed displacements at the ends, while the HNM result is obtained using an infinite plate.
other is the wave response function of Heaviside step excitation used in Section 7.3.3. Consider two-dimensional cases in which the Green's function $G_{ij}(x|x',t)$ is the displacement response in the ith direction at location x and time t due to an impulsive concentrated force of unit magnitude in the jth direction at location x′ and time t = 0. Assume the time-pulse load acts at x = 0, so that it can be expressed as $F = F_0\,\delta(x)\,\delta(t)$, where δ(·) is the Dirac delta function and $F_0$ is a unit amplitude vector of the applied load. Following the HNM procedure,

$$ \tilde{d} = \sum_{m=1}^{M} \frac{\psi_m^L F_0 \, \psi_m^R \sin \omega_m t}{\omega_m \left( \psi_m^L \mathbf{M} \psi_m^R \right)} \tag{7.9} $$

is obtained, where $\tilde{d}$ is the displacement vector on the nodal planes in the wavenumber domain, $\mathbf{M}$ is the mass matrix, $\omega_m$ (m = 1, 2, …, M) are the eigenfrequencies, and $\psi_m^R$ and $\psi_m^L$ are the corresponding right and left eigenvectors, respectively. For details, readers can refer to Liu et al. (1991a) and Liu and Xi (2001). Applying the inverse Fourier transformation, the Green's function

$$ G(x, t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \sum_{m=1}^{M} \frac{\psi_m^L F_0 \, \psi_m^R \sin \omega_m t}{\omega_m \left( \psi_m^L \mathbf{M} \psi_m^R \right)} \, e^{-i k_x x} \, dk_x \tag{7.10} $$

is obtained, where $k_x$ is the real wavenumber in the x direction.
The wave response function of the composite laminate subjected to a Heaviside time-step force can also be obtained. The external load can be expressed as $F = F_0\,\delta(x)\,H(t)$, where H(t) is the Heaviside step function of time t:

$$ \tilde{d} = \sum_{m=1}^{M} \frac{\psi_m^L F_0 \, \psi_m^R \, (1 - \cos \omega_m t)}{\omega_m^2 \left( \psi_m^L \mathbf{M} \psi_m^R \right)} \tag{7.11} $$

Applying the inverse Fourier transformation, the response function of the Heaviside step force can be written as (Liu et al., 1991a; Liu and Xi, 2001):

$$ h(x, t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \sum_{m=1}^{M} \frac{\psi_m^L F_0 \, \psi_m^R \, (1 - \cos \omega_m t)}{\omega_m^2 \left( \psi_m^L \mathbf{M} \psi_m^R \right)} \, e^{-i k_x x} \, dk_x \tag{7.12} $$

7.4.6
Identification of Time History of Load Using Green's Functions
The displacement response at any receiving point can be expressed as a convolution integral of the force function and a Green's function:

$$ u(x, t) = \int_0^t G(x, t - \tau) \, f(\tau) \, d\tau \tag{7.13} $$

where u(x,t) is the displacement in the x direction at point x, f(t) is the time history of the line load, and G(x,t) is the Green's function. By discretizing this convolution integral into $n_s$ evenly spaced sample points in the time domain, Equation 7.13 is transformed into the matrix form:

$$ \begin{Bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n_s} \end{Bmatrix} = \Delta t \begin{bmatrix} G_1 & 0 & \cdots & 0 \\ G_2 & G_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ G_{n_s} & G_{n_s-1} & \cdots & G_1 \end{bmatrix} \begin{Bmatrix} f_0 \\ f_1 \\ \vdots \\ f_{n_s-1} \end{Bmatrix} = GF \tag{7.14} $$

where $u_j$, $G_j$, and $f_j$ are the displacement, the Green's function, and the impact force at time $t = j\Delta t$, respectively, and ∆t is the time interval. Because the laminate is stationary before impact, $u_0$ and $G_0$ are equal to zero. The special form of the Green's function matrix reflects the characteristic of the convolution integral. In order to recover the time history f(t) from u(x, t) in real service situations, the displacement response at the receiver point is obtained by measurement, and the Green's function is computed using the HNM. Because Equation 7.14 has the same convolution form as Equation 7.7, the algorithm for nonlinear least-squares problems is used to minimize the objective function defined in Equation 7.8 and so recover F (see Section 7.3.3). Identification of the time history of the load on laminates using the Heaviside step response can be carried out in the same way as for Equation 7.14; the only difference is that the Heaviside step response function is used instead of the Green's function.
FIGURE 7.14 A composite laminate subjected to a line load distributed along the x-axis from ξ1 to ξ2 on the upper surface of the laminate. (Diagram labels: x and z axes, ξ1, ξ2.)
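Equation 7.7 and Equation 7.14 share the same form because the Heaviside step response is the time integral of the Green's function, so that $g_i = h_i - h_{i-1} \approx G_i \Delta t$ for small ∆t. The sketch below checks this numerically. It assumes a hypothetical single-mode kernel with the time factors sin(ωt)/ω and (1 − cos ωt)/ω², which are the forms appearing in Equation 7.9 and Equation 7.11, and is not a full HNM computation; the values of `omega`, `dt`, and `ns` are illustrative.

```python
import numpy as np

omega = 2.0   # hypothetical modal frequency
dt = 0.01     # time interval
ns = 500      # number of sampling points
t = np.arange(ns + 1) * dt

# Single-mode time factors (amplitude coefficients omitted).
green = np.sin(omega * t) / omega                 # response to a delta pulse
step = (1.0 - np.cos(omega * t)) / omega**2       # response to a Heaviside step

# Discrete Green's function of Equation 7.5 versus the entries of Equation 7.14.
g_from_step = np.diff(step)       # g_i = h_i - h_{i-1}, i = 1..ns
g_from_green = green[1:] * dt     # G_i * dt,            i = 1..ns

# The two differ by the error of a one-point quadrature rule, of order dt^2.
max_mismatch = np.max(np.abs(g_from_step - g_from_green))
```

Differencing the step response integrates the kernel exactly over each time interval, whereas the product of the Green's function and ∆t is a one-point approximation, so the step-response form can tolerate a somewhat coarser ∆t for the same kernel.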
7.4.7
Identification of Line Loads
The same form of Equation 7.7 and Equation 7.14 implies that the Green's function method and the Heaviside step response method can both be used in practice to identify the spatial distribution of line loads (Liu et al., 2002e). Consider now a laminate subjected to a line load distributed along the x axis from $\zeta_1$ to $\zeta_2$, as shown in Figure 7.14. Denote the loading function as F(x, t) and assume that the time and spatial dependencies of F(x, t) are separable, i.e., F(x, t) = f(t)p(x), where f(t) is the loading time function and p(x) is the loading distribution function. The displacement at receiving point x is given by Chang and Sachse (1985):

$$ u(x, t) = \int_0^t f(t') \, dt' \int_{\zeta_1}^{\zeta_2} p(x') \, g(x|x', t - t') \, dx' \tag{7.15} $$
The displacement response at a receiving point is a superposition of contributions from all the point loads. Discretize the loading region $(\zeta_1, \zeta_2)$ into $m_s$ point loads at equal intervals of $\Delta x'$ so that $x_n = n\Delta x'$. The time domain is also discretized evenly as described in Section 7.4.6. Equation 7.15 can then be rewritten in the discretized form

$$ u_m(x) = \sum_{k=0}^{m-1} f(k\Delta t) \, \Delta t \sum_{n=1}^{m_s} p(x_n) \, \Delta x' \, g[x|x_n, (m-k)\Delta t] \quad (m = 1, 2, \ldots, n_s) \tag{7.16} $$
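where $u_m(x)$ is the displacement response at the receiver point x and time $t_m = m\Delta t$. The inverse problem for an extended source is to recover the time function f(t) as well as the distribution function p(x). Three different situations are considered next.
7.4.7.1 Identification of Time Function
For the first situation, assume that the distribution function p(x) is known and find the time function f(t). Considering the inner sum on the right-hand side of Equation 7.16, introduce

$$ G_{m-k}(x) = \sum_{n=1}^{m_s} p(x_n) \, \Delta x' \, g[x|x_n, (m-k)\Delta t] $$

which can be written in the matrix form

$$ G = \begin{Bmatrix} G_1 \\ G_2 \\ \vdots \\ G_{n_s} \end{Bmatrix} = \begin{bmatrix} \begin{Bmatrix} g_1 \\ g_2 \\ \vdots \\ g_{n_s} \end{Bmatrix}_{\Delta x'} & \begin{Bmatrix} g_1 \\ g_2 \\ \vdots \\ g_{n_s} \end{Bmatrix}_{2\Delta x'} & \cdots & \begin{Bmatrix} g_1 \\ g_2 \\ \vdots \\ g_{n_s} \end{Bmatrix}_{m_s \Delta x'} \end{bmatrix} \begin{Bmatrix} p(\Delta x') \\ p(2\Delta x') \\ \vdots \\ p(m_s \Delta x') \end{Bmatrix} \Delta x' \tag{7.17} $$

Substituting Equation 7.17 into Equation 7.16 gives

$$ U = \begin{Bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n_s} \end{Bmatrix} = \begin{bmatrix} G_1 & 0 & \cdots & 0 \\ G_2 & G_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ G_{n_s} & G_{n_s-1} & \cdots & G_1 \end{bmatrix} \begin{Bmatrix} f_0 \\ f_1 \\ \vdots \\ f_{n_s-1} \end{Bmatrix} \Delta t \tag{7.18} $$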
which can be expressed as U = GF. The objective function can again be defined as $f_{err} = \| GF - U_m \|^2$, and an optimization method can be used to solve for F. The conjugate gradient method introduced in Chapter 4 can be chosen to find the F that minimizes $f_{err}$.
7.4.7.2 Identification of the Spatial Function
For the second situation, assume that the time function f(t) is known; the distribution function p(x) is then recovered through the following procedure. Equation 7.16 can be rewritten as

$$ u_m(x) = \sum_{n=1}^{m_s} p(x_n) \, \Delta x' \sum_{k=0}^{m-1} f(k\Delta t) \, g[x|x_n, (m-k)\Delta t] \, \Delta t \quad (m = 1, 2, \ldots, n_s) \tag{7.19} $$
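Then introduce the vector $R_n$ with entries

$$ r_m = \sum_{k=0}^{m-1} f(k\Delta t) \, g[x|x_n, (m-k)\Delta t] \, \Delta t \quad (m = 1, 2, \ldots, n_s) $$

and express it in matrix form:

$$ R_n = \begin{Bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{n_s} \end{Bmatrix}_n = \begin{bmatrix} g_1 & 0 & \cdots & 0 \\ g_2 & g_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ g_{n_s} & g_{n_s-1} & \cdots & g_1 \end{bmatrix}_{x_n = n\Delta x'} \begin{Bmatrix} f_0 \\ f_1 \\ \vdots \\ f_{n_s-1} \end{Bmatrix} \Delta t \tag{7.20} $$

Substituting Equation 7.20 into Equation 7.19 gives

$$ \begin{Bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n_s} \end{Bmatrix} = \begin{bmatrix} \begin{Bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{n_s} \end{Bmatrix}_{\Delta x'} & \begin{Bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{n_s} \end{Bmatrix}_{2\Delta x'} & \cdots & \begin{Bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{n_s} \end{Bmatrix}_{m_s \Delta x'} \end{bmatrix} \begin{Bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{m_s} \end{Bmatrix} \Delta x' \tag{7.21} $$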
or U = RP. Again, the objective function can be formulated as $f_{err} = \| RP - U_m \|^2$, and an optimization method such as the conjugate gradient method can be used to search for the P that minimizes $f_{err}$.
7.4.7.3 Identification of the Time and Spatial Functions
If both the time function and the distribution function are unknown and need to be recovered from the displacement measured at one receiving point, Equation 7.19 should be used in a double iterative procedure. First, an initial guess of the distribution function p(x) is made to start the double iterative process. Based on the procedure described in Section 7.4.7.1, a time function f(t) is obtained. This f(t) is then used as input to evaluate a new p(x) following the procedure in Section 7.4.7.2. These iterations are repeated until the desired accuracy is attained. It can be seen from Equation 7.15 that the recovered source characteristics are represented as a product of f(t) and p(x). The recovered functions may therefore differ from the true ones by a constant factor, i.e., the recovered f(t) may be the true time function multiplied by an arbitrary constant c, with the recovered p(x) being the true one multiplied by $c^{-1}$. In the numerical computation, the maximum value of the source time function is always set to 1.0.
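The double iterative procedure can be sketched as an alternating least-squares loop. The code below is a minimal illustration under stated assumptions and is not the authors' implementation: the kernel array `K` (with `K[n, j]` standing for $g[x|x_n,(j+1)\Delta t]\,\Delta x'\,\Delta t$) is a hypothetical decaying oscillation rather than an HNM Green's function, `numpy.linalg.lstsq` replaces the conjugate gradient method of the text, and all function names are made up for the example.

```python
import numpy as np

def conv_matrix(kernel):
    """Lower-triangular Toeplitz matrix with entry (m, k) = kernel[m - k]."""
    ns = len(kernel)
    T = np.zeros((ns, ns))
    for k in range(ns):
        T[k:, k] = kernel[: ns - k]
    return T

def solve_time(K, p, u):
    """Section 7.4.7.1: p known, least-squares solve of U = G F for f."""
    G = conv_matrix(p @ K)                 # inner sum over the point loads
    f, *_ = np.linalg.lstsq(G, u, rcond=None)
    return f

def solve_space(K, f, u):
    """Section 7.4.7.2: f known, least-squares solve of U = R P for p."""
    R = np.column_stack([conv_matrix(row) @ f for row in K])
    p, *_ = np.linalg.lstsq(R, u, rcond=None)
    return p

def double_iteration(K, u, p0, n_iter=20):
    """Alternate the two solves; the peak of |f| is set to 1 in every pass."""
    p = np.asarray(p0, dtype=float)
    history = []
    for _ in range(n_iter):
        f = solve_time(K, p, u)
        f = f / np.max(np.abs(f))          # remove the arbitrary constant c
        p = solve_space(K, f, u)
        history.append(np.linalg.norm(conv_matrix(p @ K) @ f - u))
    return f, p, history

# Hypothetical kernels for ms = 5 point loads and ns = 60 time samples.
ns, ms, dt = 60, 5, 0.05
t = np.arange(1, ns + 1) * dt
K = np.array([np.exp(-t) * np.sin((3.0 + n) * t) for n in range(ms)]) * dt
f_true = np.where(t - dt <= 2.0, np.sin(np.pi * (t - dt)), 0.0)
p_true = np.array([0.0, 0.5, 1.0, 0.5, 0.0])
u = conv_matrix(p_true @ K) @ f_true       # synthetic noise-free measurement

f_est, p_est, history = double_iteration(K, u, np.ones(ms))
```

Each half-step is an exact least-squares minimization, so the misfit stored in `history` cannot increase from one double iteration to the next. Whether the loop reaches the true f and p depends on the kernels and on the initial guess, and this sketch does not guarantee it.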
FIGURE 7.15 Inversely identified time history of a concentrated line load applied on the composite laminate [C0/G45/G–45]s. (Plot of f(t) against t.)
7.4.8
Numerical Verification
Having established the formulation for the inverse problems of force identification, a numerical investigation is now conducted to verify its effectiveness. The [C0/G+45/G–45]S composite laminate is considered. In the numerical computation, the following dimensionless parameters are used:

$$ \bar{x} = x / H, \quad \bar{t} = t c_s / H, \quad c_s = \left( c(4,4) / \rho \right)^{1/2}, \quad \bar{u} = u / u_0, \quad u_0 = q_0 / c(4,4) \tag{7.22} $$

where u is the displacement in the x direction, and ρ, c(4,4), and $c_s$ are, respectively, the density, the elastic constant, and the velocity of the shear wave (in the fiber orientation) in the carbon/epoxy layer of the laminate. $q_0$ is a constant related to the amplitude of the force vector. A concentrated line load with the time history $f(\bar{t}) = 2\sin(\pi \bar{t})$, $0 \le \bar{t} \le 2$, i.e., one cycle of a sine function, is first considered. The Green's function method is used to identify the time history of the load. The experimental measurements are simulated using the HNM with the actual time function of the load, and the conjugate gradient method is adopted to perform the optimization. The identification results are shown in Figure 7.15, from which it is observed that the identified time history agrees well with the input function. In the next step, the extended line load with $p(\bar{x}) = 5(\sin((10\bar{x} + 1)\pi/2) + 1)$, $-0.2 < \bar{x} < 0.2$, and $f(\bar{t}) = 2\sin(\pi \bar{t})$, $0 \le \bar{t} \le 2$, is considered as the true loading distribution and time function. If the distribution function p(x) is known, the time function $f(\bar{t}) = 2\sin(\pi \bar{t})$, $0 \le \bar{t} \le 2$, is the quantity to be identified. The conjugate gradient method is adopted to perform the optimization following the procedure given in Section 7.4.7.1. The identified results agree well with the input function.
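A conjugate gradient solve of the least-squares problem U = GF can be sketched as follows. This is a minimal illustration and an assumption about the variant: it is the standard conjugate gradient method applied to the normal equations (often called CGLS), which may differ in detail from the method introduced in Chapter 4. The kernel is a hypothetical decaying exponential rather than an HNM Green's function, and the function name `cgls` is chosen for the example.

```python
import numpy as np

def cgls(A, b, n_iter=500, tol=1e-20):
    """Conjugate gradient on the normal equations A^T A x = A^T b."""
    x = np.zeros(A.shape[1])
    r = b - A @ x              # residual in data space
    s = A.T @ r                # negative half-gradient of ||A x - b||^2
    d = s.copy()               # search direction
    gamma = s @ s
    for _ in range(n_iter):
        if gamma <= tol:       # gradient is (numerically) zero: converged
            break
        q = A @ d
        alpha = gamma / (q @ q)
        x += alpha * d
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        d = s + (gamma_new / gamma) * d
        gamma = gamma_new
    return x

# Hypothetical discretized problem in the form of Equation 7.14.
ns, dt = 60, 0.05
t = np.arange(ns) * dt
kernel = np.exp(-2.0 * (t + dt))           # stand-in for G_1 .. G_ns
G = dt * np.array([[kernel[m - k] if m >= k else 0.0 for k in range(ns)]
                   for m in range(ns)])

# One cycle of sine, f(t) = 2 sin(pi t) for 0 <= t <= 2, as in the text.
f_true = np.where(t <= 2.0, 2.0 * np.sin(np.pi * t), 0.0)
u = G @ f_true                             # synthetic noise-free measurement
f_est = cgls(G, u)
```

The iteration only needs products with G and its transpose, so the Toeplitz matrix does not have to be factorized. With noisy data, stopping the iteration early also acts as a form of regularization.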
FIGURE 7.16 Inversely identified and true functions for a distributed line load. (Plot of p(x) against x from -0.2 to 0.2; curves: true, identified.)
Figure 7.11 shows that the Green's function oscillates rapidly in the main response domain. Therefore, in the time discretization, the sampling rate should be high enough to avoid aliasing: if the time interval ∆t is too large, the high-frequency content of the Green's function appears as lower-frequency content. In this work, ∆t = 0.04 ~ 0.05 is used and satisfactory results are obtained. If the loading time function is of short duration, the main response period is also of short duration. The time period chosen for computation must lie within the main response period to reduce the noise effect and increase the sensitivity. If f(t) is known and p(x) is to be recovered, it is necessary to choose an approximate initial region that includes the entire true force region. In the numerical computation, a larger region is chosen, for example, (–0.4, 0.4). p(x) is recovered following the procedure given in Section 7.4.7.2; Figure 7.16 shows the comparison between the identified and the true functions. If both the time and distribution functions are to be recovered, the double iterative procedure given in Section 7.4.7.3 is performed. The comparison of the identified and the true distribution functions is shown in Figure 7.17. In this study, the conjugate gradient method is used in the double iterative process. The numerical testing shows that the algorithm converges quickly when recovering the time function of the load. It gives satisfactory and stable results for the time function after several double iterations.
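The aliasing effect mentioned above can be illustrated with a short sketch. The signal below is a hypothetical pure sinusoid (not an HNM Green's function), and the frequency and the two time intervals are illustrative values chosen so that one sampling rate is above and the other below twice the signal frequency.

```python
import numpy as np

def dominant_frequency(signal, dt):
    """Frequency of the largest peak in the one-sided amplitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs[np.argmax(spectrum[1:]) + 1]   # skip the zero-frequency bin

true_freq = 9.0     # hypothetical frequency, cycles per unit time
duration = 10.0     # length of the record

def sample(dt):
    """Sample the sinusoid at interval dt over the whole record."""
    n = int(round(duration / dt))
    t = np.arange(n) * dt
    return np.sin(2.0 * np.pi * true_freq * t)

fine = dominant_frequency(sample(0.01), 0.01)   # Nyquist frequency 50 > 9
coarse = dominant_frequency(sample(0.1), 0.1)   # Nyquist frequency 5 < 9
```

With the fine interval the spectral peak sits at the true frequency, while with the coarse interval the same signal shows up at a lower, aliased frequency. A practical check is therefore to halve ∆t and confirm that the sampled kernel and the identified load do not change appreciably.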
7.5
Point Loads on the Surface of Composite Laminates
7.5.1
Inversion Operation
Consider a composite laminate with any number of anisotropic layers and the overall thickness of H. The laminate is excited by a point dynamic load,
FIGURE 7.17 Inversely identified and true functions for a distributed line load whose time and distribution functions are both unknown. The result is obtained after 30 double iterations. (Plot of p(x) against x from -0.2 to 0.2; curves: true, identified.)
FIGURE 7.18 A point load on the surface of a composite laminate. (Diagram labels: x, y, and z axes, thickness H, fiber orientation.)
as shown in Figure 7.18. The HNM for the three-dimensional case is now used to derive the displacement response of the composite laminate subjected to a point step-impact load. The external load is applied on the upper surface of the laminate in the vertical direction, with the force function $F = F_0\,\delta(x)\,\delta(y)\,H(t)$, where H(t) is the Heaviside step function of time t, and δ(x) and δ(y) are the Dirac delta functions of x and y, respectively. $F_0$ is a constant vector. Applying the inverse Fourier transformation, the response function for the Heaviside step load can be written as (Liu et al., 1991a; Liu and Xi, 2001):

$$ d(x, y, t) = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \sum_{m=1}^{M} \frac{\psi_m^L F_0 \, \psi_m^R \, (1 - \cos \omega_m t)}{\omega_m^2 \left( \psi_m^L \mathbf{M} \psi_m^R \right)} \, e^{-i k_x x} \, e^{-i k_y y} \, dk_x \, dk_y \tag{7.23} $$
The identification procedure follows that described in Section 7.4 and is very similar to the procedure for the two-dimensional case of concentrated line loads. The only difference is that the corresponding three-dimensional kernel functions, instead of the two-dimensional kernel functions, are employed for the present problem.
7.5.2
Concentrated Point Load
Numerical calculations are conducted in this section to verify the effectiveness of the proposed algorithm for force inversion problems of composite laminates. The composite laminate [C0/G+45/G–45]s, made of two carbon/epoxy and four glass/epoxy layers, is again considered. The composite laminate is excited at (0, 0) by a point load with the time function

$$ f(\bar{t}) = \begin{cases} 2\sin(\pi \bar{t}) & 0 \le \bar{t} \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{7.24} $$
The time history is one cycle of a sine function. A receiving point is chosen at (3H, 3H) on the upper surface of the laminate. The dynamic displacement response at the receiving point is computed using the HNM and shown in Figure 7.19. This is taken as the noise-free input for the load identification. Noise effects are also investigated by adding Gaussian noise (see Equation 2.130) directly to the computer-generated displacement readings. In this numerical example, pe = 0.2 is chosen to generate the noise. The noise-contaminated displacement is also shown in Figure 7.19. The time histories of the force identified from the noise-free and the noise-contaminated displacements are shown in Figure 7.20; the identified time-history function agrees well with the true force function. Comparing the two, the result from the noise-contaminated displacement is naturally less accurate. However, the shape and the magnitude of the identified curve agree satisfactorily with the true force history. Numerical verification is also performed for a point load with a Gaussian pulse time history, which can be expressed as

$$ f(\bar{t}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -(\bar{t} - t_0)^2 / 2\sigma^2 \right) \quad (0 < \bar{t} < \infty) $$

where $\bar{t}$ is time, σ is a parameter that controls the duration of the pulse, and $t_0$ determines the time delay of the pulse. In this numerical computation, $t_0$ and σ are taken as 1.5 and 0.4, respectively. The identification result based on the noise-free displacement response is shown in Figure 7.21, from which a satisfactory agreement of the identified result with the true time history can
FIGURE 7.19 Noise-free and noise-contaminated (Gaussian noise) displacement response at (3H, 3H) on the upper surface of composite laminate [C0/G+45/G–45]S subjected to a point load at (0, 0) defined by Equation 7.24. (Plot of dimensionless displacement against dimensionless time; curves: noise free, noise added.) (From Liu, G.R. et al., Acta Mech., 157(1–4), 223, 2002. With permission.)
FIGURE 7.20 Comparison of the point excitation with history of sine function and identified results based on noise-free and noise-contaminated displacements. (Plot of amplitude against dimensionless time; curves: noise-added, noise-free, true.) (From Liu, G.R. et al., Acta Mech., 157(1–4), 223, 2002. With permission.)
FIGURE 7.21 Identified force of Gaussian pulse on the composite laminate [C0/G45/G–45]s compared with the true time function of the force. (Plot of amplitude against dimensionless time; curves: noise-added, noise-free, true.) (From Liu, G.R. et al., Acta Mech., 157(1–4), 223, 2002. With permission.)
be observed. The noise effect is also considered by adding Gaussian noise with pe = 0.1; the inversely identified result is also shown in Figure 7.21. The identified results are satisfactory. Clearly, the inverse procedure is stable: it does not increase the error contained in the original input data. Note also the issue of the boundary conditions at the left and right edges of the laminate. The discussion in Section 7.3.4 is also applicable to the case of laminates.
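The two load time histories and the noise contamination used in this section can be generated with a few lines of code. This is a sketch under stated assumptions: the Gaussian pulse is taken with the normalization 1/(√(2π)σ), and the noise model (zero-mean Gaussian noise whose standard deviation is pe times the root-mean-square value of the signal) is an assumption made here for illustration and is not necessarily the definition of Equation 2.130. The "displacement" used in the demonstration is the load curve itself, standing in for an HNM response.

```python
import numpy as np

def sine_cycle(t):
    """One cycle of sine (Equation 7.24): 2 sin(pi t) for 0 <= t <= 2, else 0."""
    return np.where((t >= 0.0) & (t <= 2.0), 2.0 * np.sin(np.pi * t), 0.0)

def gaussian_pulse(t, t0=1.5, sigma=0.4):
    """Gaussian pulse with delay t0 and duration parameter sigma."""
    return np.exp(-((t - t0) ** 2) / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def add_noise(u, pe, rng):
    """ASSUMED noise model: Gaussian noise with std = pe * RMS of the signal."""
    std = pe * np.sqrt(np.mean(u**2))
    return u + rng.normal(0.0, std, size=u.shape)

t = np.arange(0, 160) * 0.05            # dimensionless time, interval 0.05
f_sine = sine_cycle(t)
f_gauss = gaussian_pulse(t)

rng = np.random.default_rng(0)          # fixed seed for repeatability
u_clean = f_sine                        # stand-in for a computed displacement
u_noisy = add_noise(u_clean, pe=0.2, rng=rng)
```

With $t_0$ = 1.5 and σ = 0.4 the pulse peaks at 1/(√(2π) × 0.4), which is close to 1 and consistent with the amplitude range of Figure 7.21.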
7.6
Ill-Posedness Analysis
In the preceding examples, the inverse procedure is free from ill-posedness. Type I ill-posedness is removed because the problems are all at least even-posed, and Type III ill-posedness is removed or mitigated by the use of the HNM, which performs the time integration analytically. When the FEM is used, the displacement responses sampled at discrete time points are used in the inverse analysis, and this discrete sampling functions as a regularization (see Chapter 3). Therefore, as observed from these examples, the solution is stable as long as the displacement responses are sampled in a proper region of the time-history curve, so that they are sensitive to the unknowns and Type II ill-posedness is removed.
7.7 Remarks
• Remark 7.1: The dynamic displacement response measured at only one receiving point on the structure surface can be used to recover the time history as well as the distribution function of dynamic loads. The identification problem can be at least even-posed and formulated as an optimization problem. The distribution function of loads applied on the surface of structures is reconstructed by solving the optimization problem.
• Remark 7.2: The FEM has been used as a forward solver for impact force identification. The results agree well with the experiments. The FEM's advantage is its capability to model structures of complex geometry and boundary conditions. The projection regularization helps to stabilize the inverse solution.
• Remark 7.3: The HNM can be used as an efficient forward solver to investigate the response of composite laminated structures generated by external excitations. Two kernel functions (Green function and time-step response function) are evaluated using the HNM. Because the HNM deals with the time integration analytically, inverse problems formulated using the time-history response and the HNM as forward solver will not have Type III ill-posedness.
• Remark 7.4: The time history of the load acting on the surface of the beam has been successfully identified from the displacement response measured on the surface. For two-dimensional problems, concentrated and distributed line loads are investigated, while for three-dimensional problems, point loads as well as loads with small loading areas are studied. Numerical verifications are conducted, and the identification results are found to be in good agreement with the true ones.
8 Inverse Identification of Material Constants of Composites
In this chapter, several computational inverse techniques are introduced to inversely identify the material constants from the dynamic displacement response recorded at only one receiving point on the surface of composite laminated structures. Material constants include the elastic constants or the engineering constants, which are required in the constitutive law, and the fiber orientation of composite laminates. The material in this chapter is largely based on work by Liu and co-workers (Liu et al., 2002a–d; Liu and Ma, 2000, 2003; Han et al., 2002b; Han and Liu, 2002c).
8.1 Introduction
Accurate knowledge of the material constants of composites is important in many engineering applications, e.g., the design or quality assurance of advanced composite structural systems. The identification of material parameters for complex materials such as fiber-reinforced composites is much more complicated than for isotropic materials. The number of material constants is larger, and additional complications are caused by the anisotropic nature of the materials. Conventional static test methods based on simple tensile tests meet difficulties when employed on composite materials. Specific problems such as boundary effects, sample size dependence, and the presence of nonuniform stress/strain fields often occur in such a test, and the testing results often exhibit large scatter. In addition, material properties obtained from the specimens used in the experiments can deviate severely from the material properties of the actual structure. Therefore, new effective techniques that are applicable to the actual structures have been sought. Among the proposed methods, the nondestructive evaluation (NDE) of composite material properties by employing computational inverse techniques appears very promising.
Computational inverse techniques for material characterization of composites utilize the complex relationship between the structure behavior and the material properties. This relationship is often represented by a known mathematical and computational model, which defines the forward problem. Thus, if a set of reasonably accurate experimental data is available for the structure behavior, the material properties of the composite may be determined by solving a properly formulated inverse problem. The material properties can be characterized by minimizing the sum of the squares of the errors, or discrepancies, between the experimental and the computed structure behavior data. Mignogna et al. (1990, 1991) have reviewed the use of ultrasonic wave velocity as the structure behavior data for the determination of elastic constants of many anisotropic composites. Rokhlin and co-workers (Rokhlin et al., 1992; Chu et al., 1994; Chu and Rokhlin, 1994a, b) proposed several modifications of the immersion ultrasonic technique to determine the elastic constants of composites. In these techniques, the Christoffel equation was adopted to establish the relationship between material properties and bulk wave velocity. Comparatively complicated techniques were needed to measure the phase velocity of ultrasonic bulk waves in anisotropic materials. Other ultrasonic techniques, such as the guided-wave technique (Bratton and Datta, 1989), the surface-wave technique (Rose et al., 1990), and the resonance measurement technique (Nakamura and Kimura, 1991), have also been proposed for the determination of elastic constants. Sachse and Kim (1986) described in detail the principle of a point-source/point-receiver (PS/PR) technique, which is a procedure to measure the group velocity. This technique makes use of the fact that the group velocities of waves traveling in a direction between the PS and the PR in a specimen can be determined from the measured wave arrival times.
Balasubramaniam and Rao (1998) investigated the reconstruction of material stiffness properties of unidirectional fiber-reinforced composites from obliquely incident ultrasonic bulk wave data. Genetic algorithms (GAs) were used as the inversion technique, and the advantages and disadvantages of GAs over conventional methods for the identification problem were discussed in detail. Sribar (1994) estimated the elastic constants of composites from group velocity data using an artificial neural network.

Attempts to reconstruct the material constants of composite materials from the eigenfrequencies of plates have also been reported. Soares et al. (1993) presented a technique to predict the mechanical properties of composite plate specimens that made use of experimental eigenfrequencies, the corresponding numerical eigenvalue evaluation, sensitivity analysis, and optimization. A finite element model based on Mindlin plate theory was used for the laminate analysis. The constrained minimization of an error functional expressing the difference between the measured higher frequencies of a plate specimen and the corresponding numerical ones was then carried out to find the desired optimum parameters. Rikards and Chate (1998) and Rikards et al. (1999) used a laminate analysis model similar to that in the study by Soares et al. (1993), but instead of using direct minimization of identification functions, a method of experiment design was introduced. Physical experiments were performed on the sample plates to measure the eigenfrequencies by real-time television holography. The finite element method was used to obtain the numerical data at the reference points, and then response surfaces were determined. Based on the response surfaces and the experimental eigenfrequency data, the identification of material properties was performed. Frederiksen (1997) identified the elastic constants of thick orthotropic plates. His study focused on the experimental technique and application of the method to real tests. A mathematical model based on the higher-order shear deformation theory was used, and higher mode frequencies were used to obtain reliable estimates of the two transverse shear moduli. The elastic constants of orthotropic cylindrical composite shells were characterized using the method of Bayesian estimation based on their natural frequencies, obtained from a free–free configuration model (Lp et al., 1998).

Recently, dynamic wave responses have been employed as structure behavior data to determine the elastic constants of anisotropic laminates (Liu et al., 2002a–d; Han and Liu, 2002c; Liu and Ma, 2003) and the fiber orientation of laminates (Liu and Ma, 2000), as well as the elastic constants of laminated cylindrical structures (Han and Liu, 2002a, b; Han et al., 2002b). Furthermore, the material properties of functionally graded materials have also been characterized based on the dynamic displacement response of the structures (Liu et al., 2001b, 2002a; Han et al., 2002c).
In these techniques, the relationship between the dynamic responses and the material constants of the composite structures is established using a number of well-established, efficient forward solvers such as the HNM (Liu et al., 1991a; Liu and Xi, 2001), the strip element method for composite laminates (Liu and Achenbach, 1994, 1995; Liu and Xi, 2001), and LS_DYNA (Hallquist, 1998). These efficient and accurate forward solvers pave the way for successful implementation of the inverse procedures because thousands of forward computations may be required in an inverse procedure. Several types of optimization algorithms, such as gradient-based optimization, µGAs, and IP-GAs, as well as an identification technique based on the neural network, are employed as the inverse operators. This chapter introduces computational inverse techniques for identifying the material constants of composites using dynamic displacement responses.
8.2 Statement of the Problem
The aim now is to identify the material constants of a composite laminate made of anisotropic materials with many layers of different fiber orientations. Material constants include the elastic constants (as well as engineering constants) at macro scale and the fiber orientation of laminates. The measurement should be as simple as possible, nondestructive, and easy to implement on the actual laminated structures. The following procedure is used in this chapter: The laminate is first excited by a dynamic load with a given time history, and the wave signal (displacement, velocity, or acceleration response) at a receiving point on the laminate surface is measured. An error function is then defined as the objective function using the computed and measured wave responses. Finally, an inverse procedure is employed to look for the minimum of the error function, which leads to the material constants of the laminate.

In this chapter, the displacement response is used to construct the error function. It is, of course, possible to use the velocity or acceleration, which is often the direct output of the sensors used in the experiment. A receiving or sampling point is arbitrarily chosen on the laminate surface near the point of excitation or loading, and the displacement response to the load can be easily measured. The displacement reading sampled at the jth time point is denoted as $u_j^m$. These readings are used as input to inverse procedures to identify the material constants of the composite laminates. Details of the inverse procedures are given in the next section.
8.3 Using the Uniform µGA
8.3.1 Solving Strategy
Using the forward solver, the displacement response $u_j^c$ of the laminate can be computed for a set of assumed material constants. The computed response differs, in general, from the response $u_j^m$ measured on the actual laminate. The inverse procedure can then be formulated as an optimization problem that minimizes the error function, which is the sum of squares of the differences between the computed and measured responses. The optimization problem becomes: minimize the objective function of error defined in the L2 norm form as

$$ f_{err}(\mathbf{p}) = \sum_{j=1}^{n_s} \left( u_j^m(\mathbf{p}_t) - u_j^c(\mathbf{p}) \right)^2 \qquad (8.1) $$
where $\mathbf{p}_t$ is the vector of the true material constants, $\mathbf{p}$ is the vector of the assumed material constants, and $n_s$ is the total number of sample points on the time-history curve of the displacement response. In the examples discussed in this section, $n_s$ = 75 is used. The number of material constants to be identified is less than 10 for all the examples studied in this chapter. Therefore, this inverse problem is heavily over-posed; thus, no chance for Type I ill-posedness exists, and Type II ill-posedness is very unlikely. Note that it is not difficult to sample 75 points from the transient response curves computationally or experimentally. The objective function $f_{err}(\mathbf{p})$ is a very complex function of $\mathbf{p}$ and can have multiple minima. Thus the GA is used to search for the global minimum. The implementation of a GA for the determination of the material properties of a composite laminate can be carried out following the procedure outlined in Figure 5.34.
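As an illustration of Equation 8.1, the error function can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: `toy_forward` is a hypothetical stand-in for the forward solver (it is not the HNM), and the parameter vector and time window are arbitrary choices made only so that the example runs.

```python
import math


def f_err(u_measured, u_computed):
    """Sum of squared differences between measured and computed responses (Eq. 8.1)."""
    if len(u_measured) != len(u_computed):
        raise ValueError("measured and computed responses must have the same length")
    return sum((um - uc) ** 2 for um, uc in zip(u_measured, u_computed))


def toy_forward(p, times):
    """Hypothetical forward solver: a sine response with amplitude p[0] and frequency p[1]."""
    amplitude, frequency = p
    return [amplitude * math.sin(frequency * t) for t in times]


NS = 75  # number of time samples used in this section
times = [4.0 + 2.0 * j / (NS - 1) for j in range(NS)]  # arbitrary time window

p_true = (1.0, 2.0)               # plays the role of p_t
u_m = toy_forward(p_true, times)  # simulated "measurement"

error_at_truth = f_err(u_m, toy_forward(p_true, times))
error_off_truth = f_err(u_m, toy_forward((1.2, 2.0), times))
```

The error vanishes when the assumed parameters equal the true ones and is positive otherwise; the inverse operator searches for the parameters that drive it to its minimum.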
8.3.2 Parameter Coding
In a GA run, each individual chromosome represents a candidate combination of the parameters to be identified, as described in Chapter 5. For each individual combination of parameters, a forward calculation must be performed to compute $u_j^c$, the displacement response at the jth time sample point. These computed displacements are used to obtain the fitness value of the individual combination. The fitness value, which is defined using Equation 8.1, determines the probability of the candidate being chosen as a parent of the next generation.
8.3.3 Parameter Settings in µGA
A uniform µGA with binary parameter coding, tournament selection, uniform crossover, and elitism is adopted as the inverse operator in this study. Knuth's subtractive method (Press et al., 1989; Carroll, 1996b) is used to generate random numbers because it is regarded as one of the best random number generators. Initialized with a different negative number, Knuth's algorithm generates a different series of random numbers. The elitism operator is adopted to replicate the best individual of the current generation into the next generation. No mutation operation is used in the uniform µGA. The population size of each generation is set to 5, and the probability of uniform crossover is set to 0.5. The population convergence criterion is 5%, which means that convergence is declared when less than 5% of the total bits of the other individuals in a generation differ from those of the best individual found. A simple stopping criterion is imposed to limit each GA run to a maximum of 500 generations. The stopping criterion can also be set based on the error function value, with proper consideration of the discrepancy principle (see Chapter 3), to suppress the effects of the ill-posedness, if any.
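The settings above can be put together in a short Python sketch. It is only an illustration under stated assumptions: Python's `random.Random` replaces Knuth's subtractive generator, the restart after convergence (keep the elite, redraw the other individuals at random) follows common µGA practice and is assumed here, and `toy_error` with its `TARGET` pattern is a hypothetical objective standing in for Equation 8.1.

```python
import random


def micro_ga(error, n_bits, pop_size=5, p_cross=0.5, max_gen=500,
             conv_frac=0.05, seed=1):
    """Minimise error(bits) over bit lists of length n_bits (illustrative sketch)."""
    rng = random.Random(seed)  # stand-in for Knuth's subtractive generator

    def random_individual():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    def tournament(scored):
        # Binary tournament: draw two individuals, keep the one with lower error.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    pop = [random_individual() for _ in range(pop_size)]
    best, best_err = None, float("inf")
    for _ in range(max_gen):
        # For brevity every individual is re-evaluated here; caching the
        # elite's error would save one forward call per generation.
        scored = [(error(ind), ind) for ind in pop]
        gen_err, gen_best = min(scored, key=lambda s: s[0])
        if gen_err < best_err:
            best, best_err = gen_best[:], gen_err
        # Bits of the other individuals that differ from the best one
        # (the best individual itself contributes zero).
        n_diff = sum(x != y for ind in pop for x, y in zip(ind, best))
        if n_diff < conv_frac * (pop_size - 1) * n_bits:
            # Converged: keep the elite, redraw the rest (assumed restart rule).
            pop = [best[:]] + [random_individual() for _ in range(pop_size - 1)]
            continue
        children = [best[:]]  # elitism
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            # Uniform crossover: each bit comes from p2 with probability p_cross.
            children.append([y if rng.random() < p_cross else x
                             for x, y in zip(p1, p2)])
        pop = children
    return best, best_err


# Hypothetical toy objective: number of bits differing from a hidden pattern.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]


def toy_error(bits):
    return sum(b != t for b, t in zip(bits, TARGET))


best_bits, best_error = micro_ga(toy_error, n_bits=len(TARGET))
```

Changing `seed` plays the role of re-initializing the random number generator, so repeated runs with different seeds give independent GA runs whose results can be averaged.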
8.3.4 Example I: Engineering Elastic Constants in Laminates
The first example is to identify the engineering constants of a composite laminate composed of many layers with different fiber orientations, as shown in Figure 7.7. Two composite laminates made of glass/epoxy and carbon/epoxy materials are studied. The ply orientations of the composite laminate are arbitrarily chosen. The excitation applied to the laminate is a vertical line load on the upper (outer) surface, as illustrated in Figure 7.7. The line load is a function of time defined as

$$ f(t) = \begin{cases} \sin(2\pi t / t_d), & 0 < t < t_d \\ 0, & t \le 0 \ \text{and} \ t \ge t_d \end{cases} \qquad (8.2) $$

where $t_d$ is the time duration of the incident wave. Equation 8.2 implies that the time history of the incident wave is one cycle of a sine function. The HNM is used as the forward solver for the reasons given in Section 7.4.

FIGURE 8.1 Displacement response at x = 2.0H on the upper surface of the laminate [G0/+45/–45]s excited by a line force at the origin. The time history of the excitation force is defined by Equation 8.2. The letter P marks the estimated arrival time of the longitudinal wave, and S marks the estimated arrival time of the shear wave. Axes: dimensionless displacement vs. dimensionless time; the selected time range is marked on the curve.

8.3.4.1 Laminate [G0/+45/–45]s
First, consider a composite laminate [G0/+45/–45]s consisting of six glass/epoxy layers symmetrically stacked. The composite laminate is excited at x = 0 by a unit dynamic line load with a time function as specified by Equation 8.2. The receiving point is set at x = 2.0H, where H denotes the thickness of the laminate. The dimensionless parameters defined by Liu et al. (1991a) are used in the analysis. The displacement response in the z direction at the receiving point is computed using the HNM code and is shown in Figure 8.1. The letters P and S mark the estimated arrival times of the longitudinal and shear waves. In this figure, the dilatational wave has a small magnitude and is easily contaminated by noise, while the shear wave has a large magnitude and dominates the main response curve of the laminate. Figure 8.1 also designates the time range selected for the inverse operation in this study, which lies in the shear wave-dominated area. This computer-simulated displacement response sampled at 75 time points in the selected time range is employed as noise-free input to the inverse algorithm for the identification of the engineering constants.

The GA search space defined for this numerical example is listed in Table 8.1. The bounds on the five parameters (two Young's moduli, one shear modulus, and two Poisson's ratios) are set approximately ±30% off from the true values. The five parameters are discretized and translated into a chromosome of length 41 bits according to the binary coding procedure. In the whole discretized search space of five dimensions (parameters), there is a total of approximately 2^41 (≈ 2.2 × 10^12) possible combinations of individuals.

Because genetic algorithms use random numbers to identify trial individuals, the result of a single GA run can be a matter of chance. It can be expected that the statistical mean of many GA runs provides higher confidence in the global solution obtained. Therefore, four GA runs with different random numbers generated by Knuth's algorithm are performed; the mean values of the identified material properties are shown in Table 8.2. It is found that the solution is very stable and accurate for the noise-free case. No ill-posedness is observed.

TABLE 8.1 GA Search Space for the Material Property of Glass/Epoxy Material

Parameter    True Data    Search Range    Possibilities    Binary Digits
E1 (GPa)     38.48        26.00–50.00     512              9
E2 (GPa)     9.38         6.50–12.00      512              9
G12 (GPa)    3.41         2.40–4.40       512              9
ν12          0.292        0.200–0.40      128              7
ν23          0.507        0.350–0.660     128              7

Note: Total chromosome length: 41; total population: 2^41 (2.2 × 10^12).
Source: Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.

TABLE 8.2 Identified Results for the Glass/Epoxy Laminate [G0/45/–45]s

Material     True     Noise Free            Gaussian Noise (5%)
Constant     Data     Mean      % Error     Mean      % Error
E1 (GPa)     38.48    38.47     0.00        40.24     4.6
E2 (GPa)     9.38     9.349     0.3         8.87      5.4
G12 (GPa)    3.41     3.42      0.3         3.65      2.93
ν12          0.292    0.295     0.7         0.285     3.08
ν23          0.507    0.493     2.8         0.537     5.96
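The translation between a 41-bit chromosome and the five engineering constants can be sketched as follows. The bit counts and search ranges are taken from Table 8.1; the linear mapping with both range endpoints on the grid is an assumption of this sketch, since the exact discretization rule of the GA code is not given here, and the names `SEARCH_SPACE` and `decode` are illustrative.

```python
# (name, lower bound, upper bound, number of bits), following Table 8.1.
SEARCH_SPACE = [
    ("E1", 26.00, 50.00, 9),
    ("E2", 6.50, 12.00, 9),
    ("G12", 2.40, 4.40, 9),
    ("nu12", 0.200, 0.400, 7),
    ("nu23", 0.350, 0.660, 7),
]

TOTAL_BITS = sum(n_bits for _, _, _, n_bits in SEARCH_SPACE)  # 9+9+9+7+7 = 41


def decode(chromosome):
    """Map a list of 41 bits to the five material constants (assumed linear grid)."""
    if len(chromosome) != TOTAL_BITS:
        raise ValueError("chromosome must contain %d bits" % TOTAL_BITS)
    params, pos = {}, 0
    for name, lo, hi, n_bits in SEARCH_SPACE:
        gene = chromosome[pos:pos + n_bits]
        k = int("".join(str(b) for b in gene), 2)  # integer in 0 .. 2**n_bits - 1
        params[name] = lo + k * (hi - lo) / (2 ** n_bits - 1)
        pos += n_bits
    return params


lower_corner = decode([0] * TOTAL_BITS)  # every parameter at its lower bound
upper_corner = decode([1] * TOTAL_BITS)  # every parameter at its upper bound
search_space_size = 2 ** TOTAL_BITS      # about 2.2e12 candidate combinations
```

With 9 bits, the 24 GPa range of E1 is resolved in steps of about 0.047 GPa, which bounds the accuracy any GA run can reach on this grid.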
Because the elitism technique is adopted, the best individual is automatically copied into the next generation, so in each generation (except the initial generation, in which all five individuals must be evaluated), only four individuals need to call the forward solver to evaluate their fitness.

To check the stability of the solution further, noise effects are investigated by adding Gaussian noise directly to the computer-generated displacement readings. The Gaussian noise generated according to Equation 2.130 with pe = 0.05 is employed, which means that an average 5% noise level is used to simulate the measurement noise. The noise-contaminated displacement, obtained by adding the Gaussian noise to the computer-generated displacement, is used to simulate experimentally recorded data with measurement error. The mean values of four GA runs using the error function with these noise-contaminated displacement responses are also listed in Table 8.2. Even with a 5% noise level in the input response, the result is still accurate. This implies that Type III ill-posedness is not evident in this example, and the solution is stable to perturbation of the inputs. Note that performing multiple GA runs helps not only to increase confidence in the solutions, but also to check for ill-posedness. Another method to control the ill-posedness is to use the discrepancy principle and stop the GA run when the error function value drops to the noise level of the measurement.

8.3.4.2 Laminate [C0/+45/–45/90/–45/+45]s
Numerical verification is also performed on a composite laminate made of carbon/epoxy layers, whose anisotropy is much stronger than that of the glass/epoxy layers. The GA search space defined for this carbon/epoxy material is listed in Table 8.3. The bounds on the five parameters are again set approximately ±30% off from the true values. The five parameters are discretized and translated into a chromosome of length 41 bits according to the binary coding scheme.

The effects of different noise levels on the identified results are also investigated for the carbon/epoxy laminate. Table 8.4 gives the identified results based on the computer-simulated response, with the Gaussian noise generated using Equation 2.130 at noise levels of 1, 3, and 10%, respectively. It is found from the table that Young's modulus E1 and shear modulus G12 are insensitive to noise effects and remain accurate even at high levels of noise; E2 is more sensitive to the noise, although the identified results remain stable and satisfactory. The identified results of the two Poisson's ratios are satisfactory at low noise levels, but deteriorate with increasing noise level because the transient wave response is not very sensitive to the Poisson's ratios.

TABLE 8.3 GA Search Space for the Material Property of the Carbon/Epoxy Composite Material

Parameter    True Data    Search Range    Possibilities    Binary Digits
E1 (GPa)     142.2        99.00–185.0     512              9
E2 (GPa)     9.26         6.50–12.00      512              9
G12 (GPa)    4.8          3.30–6.20       512              9
ν12          0.33         0.230–0.430     128              7
ν23          0.49         0.343–0.640     128              7

Note: Total chromosome length: 41; total population: 2^41 (2.2 × 10^12).
Source: Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.

TABLE 8.4 Identified Results for the Carbon/Epoxy Laminate [C0/45/–45/90/–45/45]s Based on Computer-Simulated Responses with Different Noise Levels

Material     True     Noise Free          1% Noise            3% Noise            10% Noise
Constant     Data     Mean     % Error    Mean     % Error    Mean     % Error    Mean     % Error
E1 (GPa)     142.2    142.7    0.35       140.2    1.41       140.0    1.57       137.91   3.02
E2 (GPa)     9.26     9.24     0.22       9.44     1.94       9.66     4.32       10.09    8.96
G12 (GPa)    4.80     4.85     1.04       4.83     0.62       4.53     –5.60      4.69     –2.29
ν12          0.33     0.333    0.91       0.35     6.06       0.39     18.2       0.49     48.40
ν23          0.49     0.491    0.20       0.46     5.51       0.45     –8.16      0.55     12.24

Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 3543, 2002. With permission.

8.3.4.3 Regularization by Projection
From the preceding two examples, it is seen that, to a certain degree, the present inverse procedure is immune to the ill-posedness of the problem. Besides the effect of using the HNM and the over-posed formulation, another reason may be that only displacements sampled at 75 discrete points are used in the inverse analysis. The discrete sampling functions as the projection regularization (see Chapter 3). Therefore, the solution is stable, as observed from these two examples. Table 8.2 and Table 8.4 show that the Poisson's ratios are less accurate than the other parameters because the wave response is less sensitive to the Poisson's ratios. There is some Type II ill-posedness. It seems that the over-posed formulation helps considerably in resisting noise contamination, but does not help in improving the sensitivity.

8.3.4.4 Regularization by Filtering
Consider again the problem described in Section 8.3.4.2. Gaussian noise with a noise level of 10% is added. The moving average smoothing method introduced in Section 3.5 is then employed to filter the noise-contaminated displacements. The filtered displacements are then used in the inverse analysis. Table 8.5 gives the identified results based on the filtered response. The errors are now below the noise level of the contaminated displacements, and the regularization by filtering works very effectively.
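The filtering step can be illustrated with a simple centred moving average. This is a sketch under assumptions, not the method of Section 3.5 itself: the window length, the treatment of the end points, and the noise model (zero-mean Gaussian noise with standard deviation equal to pe times the RMS of the clean signal) are all illustrative choices and are not taken from Equation 2.130.

```python
import math
import random


def moving_average(signal, window=5):
    """Centred moving average; the window shrinks symmetrically near the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        h = min(half, i, n - 1 - i)  # largest symmetric half-width available
        segment = signal[i - h:i + h + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical test signal: one sine cycle sampled at 75 points.
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * j / 74) for j in range(75)]
rms = math.sqrt(sum(u * u for u in clean) / len(clean))

pe = 0.10  # 10% noise level; the scaling by the RMS value is an assumption
noisy = [u + rng.gauss(0.0, pe * rms) for u in clean]
filtered = moving_average(noisy, window=5)
```

The filtered samples would then replace the raw noisy samples as the measured response in the error function. A wider window removes more noise but also flattens genuine peaks in the response, so the window length is a trade-off.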
TABLE 8.5 Identified Results for Carbon/Epoxy Laminate [C0/45/–45/90/–45/45]s Using Filtered Inputs

Material     True     Identified Results
Constant     Data     Mean      % Error
E1 (GPa)     142.2    145.36    2.2
E2 (GPa)     9.26     9.33      0.1
G12 (GPa)    4.80     5.11      6.45
ν12          0.33     0.32      3.0
ν23          0.49     0.45      8.2

Note: The original noise-contaminated displacement contains 10% noise.
8.3.4.5 Discussion
The displacement response recorded at x = 2.0H is used in the preceding inverse analysis. Theoretically, the receiving point can be arbitrarily chosen on the laminate surface; different receiving points should not result in much difference in the identified results. However, in a real structure, the signals attenuate. If the receiving point is too far away from the point of excitation, an accurate attenuation model must be included in the forward solver.

The same analysis has been performed for composite laminates of different materials, stacking sequences, layer numbers, and thicknesses. Noise effects are also investigated by adding Gaussian noise to the computer-simulated displacement readings. We state without showing the detailed results that satisfactory results have been achieved for all these cases; hence, the general effectiveness and robustness of the procedure were confirmed.

To avoid boundary effects, the receiver should be far enough away from the boundary that the full range of the shear wave-dominated response can be recorded before the waves reflected from the boundary arrive at the receiver. The arrival time of the reflected waves can be calculated easily by wave propagation theory. A testing setup that takes boundary effects into consideration is presented in Chapter 7. Note that if the boundary condition is known and the forward solver is able to model it using, for example, the FEM, the boundary effects will not be a concern.

In practice, the material properties for composite laminates are measured by testing unidirectional composite specimens made in the same manufacturing process as the composite laminated structure. Due to the anisotropic nature of the materials, some specific problems, such as boundary effects, stacking sequence effects, sample size dependence, and the presence of nonuniform stress/strain fields, often occur in such a test. The testing results often exhibit large scatter and can be different from those of the actual structure. Using the present procedure, the material properties can be directly measured on the actual composite laminates with any stacking sequence. It is not necessary to prepare separate unidirectional specimens for material property measurement, and the results can be much more accurate.
8.3.5 Example II: Fiber Orientation in Laminates
The stacking sequence of composite laminates plays an essential role in determining the mechanical properties of the laminates, especially for those made of fiber-reinforced composites. When designing composite laminates, the typical design variables are the fiber orientations and the number of layers, as well as the layer thickness. Manufacturing constraints often limit the choice of fiber orientations to certain discrete angles and fix the basic ply thickness. By tailoring the stacking sequence of composite laminates, engineers have great design freedom to achieve high performance for a given application. Moreover, as one of several factors concerning the damage resistance capability of composite laminates, the stacking sequence of plies affects the amount of damage resulting from an impact. Because of the importance of the ply orientations to the mechanical performance of composite laminates, it is valuable, for both theoretical research and practical applications, to develop a reliable and nondestructive method for identifying the ply orientations of composite laminates.

An example is presented here in which the ply orientations of composite laminates with many anisotropic layers are inversely identified using the wave response to a dynamic line load with the time history defined by Equation 8.2. First, define the search space of parameters for the GA, based on the principle that the number of parameters to be inversely identified should be kept to a minimum. In this particular example, the ply orientations that may be used in real structures of composite laminates are often limited to a small set of angles such as 0, ±30, ±45, ±60, and 90° for ease of fabrication. Ply angles of ±15 and ±75° are also used, but less frequently. The thickness of the layers is also limited to integer multiples of the lamina thickness.
Based on these practical considerations, assume that the possible ply orientations of the laminates under investigation are restricted to a set of discrete angles from –90 to 90° at 15° intervals, i.e., the fiber orientation of each layer can only be chosen from 12 discrete angles: ±75, ±60, ±45, ±30, ±15, 0, and 90°. The search space thus defined includes all the ply angles that may be adopted in producing real structures of composite laminates for practical engineering applications.

Next, these parameters (ply orientations) are coded into binary strings or chromosomes. Generally, standard GA programs require the number of possible values of each parameter to be a power of 2, e.g., 2^3 (8), 2^4 (16), 2^5 (32), etc. Also, each binary string should have a one-to-one correspondence with one of the possible parameter values. For example, 16 possibilities can be precisely represented by binary strings of four digits, while 32 possibilities can be represented by five-digit binary strings. In our application, each ply has 12 possible orientations, which is not a power of 2. The number of possibilities therefore does not match the number of binary strings in the standard GA decoding for any integer n.

This mismatch is resolved in the following simple manner. Four-digit binary strings, which can represent up to 16 possibilities, are used to represent the 12 possible ply orientations, and an additional checking subroutine is employed. Whenever a binary string is created by GA operations, the subroutine is run to determine whether the corresponding parameter value of this string falls within the specified parameter range. If not, the binary string is randomly reassigned within the range. This approach can result in a slight performance reduction in GA searching, but it is found to be effective for handling this kind of mismatch, in the sense that the mismatching strings are not really wasted because their share of chances is redistributed to the matching strings in a random manner. It is also ensured that the chances for each possible candidate to be evaluated are still the same (no bias).

Numerical investigations are conducted to evaluate the effectiveness of the present method. Composite laminates made of glass/epoxy and carbon/epoxy materials are chosen for the numerical calculation. Note also that, in this example, the material constants of these two kinds of composite material are known and are listed in Table 7.2. The laminate is excited at the origin, and the response is registered at x = 1.5H, that is, 1.5 times the laminate thickness from the origin, which is the point of excitation.
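The four-bit coding with random reassignment of out-of-range strings can be sketched as below. The ordering of the angles in `PLY_ANGLES` (string 0000 mapped to –75°, and so on) and the function names are assumptions made for illustration; only the 12 admissible angles and the reassignment rule come from the text.

```python
import random

# The 12 admissible ply angles in degrees; the index order is an assumed convention.
PLY_ANGLES = [-75, -60, -45, -30, -15, 0, 15, 30, 45, 60, 75, 90]


def gene_index(bits):
    """Integer value (0 to 15) of a four-bit gene given as a list of 0/1."""
    return int("".join(str(b) for b in bits), 2)


def repair_gene(bits, rng):
    """Keep a gene whose index is below 12; otherwise redraw it uniformly in range."""
    if gene_index(bits) < len(PLY_ANGLES):
        return list(bits)
    new_index = rng.randrange(len(PLY_ANGLES))
    return [int(c) for c in format(new_index, "04b")]


def decode_ply(bits):
    """Ply angle represented by a valid four-bit gene."""
    return PLY_ANGLES[gene_index(bits)]


rng = random.Random(0)
valid_gene = repair_gene([1, 0, 1, 1], rng)     # index 11, already valid
repaired_gene = repair_gene([1, 1, 1, 1], rng)  # index 15, reassigned at random
repaired_angle = decode_ply(repaired_gene)

# Three unknown ply angles give 12**3 = 1728 candidate stacking sequences.
search_space_size = len(PLY_ANGLES) ** 3
```

Because the replacement index is drawn uniformly from the 12 valid values, the four unused strings do not favor any particular angle, which matches the no-bias argument made above.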
8.3.5.1 Eight-Ply Symmetrical Composite Laminates
First consider a composite laminate consisting of eight glass/epoxy layers. The stacking sequence of the laminate is denoted by [–45/45/–75/75]s. The laminate has eight layers but only four independent ply orientation angles because of its symmetrical sequence. In most practical situations, the fibers of the upper and lower layers are exposed, and their angles can be easily measured. With this practical consideration, the number of parameters can be reduced further to lessen the complexity of the problem. In this case, the fiber angle of the upper layer can be assumed to be known as –45°; therefore, three angles are left unknown and should be identified from the displacement wave response. A search space with three parameters is defined. Because each parameter has 12 possible values, the total number of possibilities in the whole search space is 12^3 (1728). The displacement response in the y-axis direction at the recording point due to the line load is computed using the HNM code and is shown in Figure 8.2. In the figure, the letters P and S mark the estimated arrival times of the longitudinal wave and the shear wave, respectively. Instead of experimentally recorded data, this computer-simulated displacement response is employed as noise-free input to the inverse algorithm to identify the ply orientations.
FIGURE 8.2 Displacement response at x = 1.5H on the upper surface of the laminate [G45/–45/75/–75]s excited by the line load at the origin (dimensionless displacement versus dimensionless time). The time history of the excitation force is defined by Equation 8.2. The letter P marks the estimated arrival time of the longitudinal wave, and S marks the estimated arrival time of the shear wave.
TABLE 8.6 Identified Results of Ply Orientation Using Noise-Free Displacement for Eight-Layer Laminates

| Stacking Sequence | Generation No. for Finding True Solution | No. of Forward Solver Calls | Exploration Rate (a) |
|---|---|---|---|
| [G45/–45/75/–75]s | 134 | 537 | 31.08% |
| [C0/–45/45/75]s | 50 | 201 | 11.63% |
| [C90/45/G–30/60]s | 126 | 505 | 29.22% |
| [C45/–60/G15/75]s | 93 | 373 | 21.59% |

(a) Percentage of the number of performed forward computations to the number of the total population size (1728) of the whole search space.
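The exploration rate defined in the footnote is a simple ratio. The short sketch below recomputes it from the numbers of forward solver calls in Table 8.6; the function name `exploration_rate` is introduced here for illustration only.

```python
def exploration_rate(forward_calls, total_population):
    """Forward-solver calls as a percentage of the whole search space."""
    return 100.0 * forward_calls / total_population


TOTAL_POPULATION = 1728  # size of the eight-ply search space quoted in the footnote

# Numbers of forward solver calls taken from Table 8.6.
TABLE_8_6_CALLS = {
    "[G45/-45/75/-75]s": 537,
    "[C0/-45/45/75]s": 201,
    "[C90/45/G-30/60]s": 505,
    "[C45/-60/G15/75]s": 373,
}

for sequence, calls in TABLE_8_6_CALLS.items():
    print(sequence, round(exploration_rate(calls, TOTAL_POPULATION), 2))
```

Rounded to two decimals, these ratios agree with the Exploration Rate column of Table 8.6.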
The identified results for four different laminates are listed in Table 8.6, which gives the generation number at which the true solution is found, the total number of calls to the forward solver (subprogram) until the true solution is found, and the ratio of this number of calls to the total population of the whole search space, expressed as a percentage. This percentage is used as a performance indicator for the inverse procedure: a high percentage means a high computational cost. For these four laminates, 12 to 31% of the total possible candidates were evaluated to find the true solution.

Noise contamination effects are investigated by adding Gaussian noise directly to the computer-simulated displacement response. Figure 8.3 plots the Gaussian noise with a 2% level that is generated for this example. This Gaussian noise is added to the computer-simulated displacement to form the input data for the inverse identification operation. The identified results are listed in Table 8.7. For these four laminates, 47 to 75% of the total possible candidates were evaluated to find the true solution. Comparing the results listed in Table 8.6 and Table 8.7 shows that the performance is significantly reduced by the noise contamination. Table 8.6 and Table 8.7 contain the results for composite laminates made of weakly anisotropic (glass) and strongly anisotropic (carbon) materials. For all these cases, satisfactory identification results have been obtained, thus demonstrating the general effectiveness of the inverse procedure.

FIGURE 8.3 Gaussian noise added to the displacement response of the composite laminate [G45/–45/75/–75]s for inverse analysis (dimensionless noise versus dimensionless time).

TABLE 8.7 Identified Results of Ply Orientation Using Noise-Contaminated Displacement (2% Level) for Eight-Layer Laminates

| Stacking Sequence | Generation No. for Finding True Solution | No. of Forward Solver Calls | Exploration Rate (a) |
|---|---|---|---|
| [G45/–45/75/–75]s | 229 | 917 | 53.06% |
| [C0/–45/45/75]s | 201 | 805 | 46.59% |
| [C90/45/G–30/60]s | 238 | 953 | 55.15% |
| [C45/–60/G15/75]s | 325 | 1301 | 75.29% |

(a) Percentage of the number of performed forward computations to the number of the total population size (1728) of the whole search space.

8.3.5.2 Ten-Ply Symmetrical Composite Laminates

Numerical verifications are also performed for ten-layer symmetrical laminates, which initially have five independent parameters. As the orientations of the upper and lower layers are known, the number of parameters is reduced to four. The total population of possible candidates for this example is $12^4 = 20{,}736$, which is 12 times that of the eight-ply cases. The identified results from computer-generated noise-free and noise-contaminated displacements are given in Table 8.8 and Table 8.9, respectively.
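The growth of the search space with the number of plies can be checked with a few lines of code. This is a sketch under the assumptions stated in the text (symmetric laminate, outer ply angle known, 12 admissible angles per ply); the function name `search_space_size` is hypothetical.

```python
def search_space_size(n_plies, levels_per_angle, n_known=1):
    """Number of candidate stacking sequences for a symmetric laminate.

    Only half of the plies are independent, and n_known of those
    (here the exposed outer ply) are assumed to be measured directly.
    """
    if n_plies % 2 != 0:
        raise ValueError("a symmetric laminate is assumed to have an even number of plies")
    n_unknown = n_plies // 2 - n_known
    return levels_per_angle ** n_unknown


eight_ply = search_space_size(8, 12)
ten_ply = search_space_size(10, 12)
print(eight_ply, ten_ply, ten_ply // eight_ply)
```

Each additional unknown ply multiplies the number of candidates by the number of admissible angles, which is why the ten-ply search space is 12 times the eight-ply one.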
TABLE 8.8 Identified Results of Ply Orientation Using Noise-Free Displacement for Ten-Layer Laminates

| Stacking Sequence | Generation No. for Finding True Solution | No. of Forward Subprogram Calls | Exploration Rate (a) |
|---|---|---|---|
| [C90/–45/45/–60/60]s | 1972 | 7889 | 38.04% |
| [C0/–30/60/–75/45]s | 682 | 2729 | 13.16% |
| [G0/30/–60/45/90]s | 1543 | 6173 | 29.76% |
| [C0/–45/45/G–75/75]s | 372 | 1489 | 7.18% |

(a) Percentage of the number of performed forward computations to the number of the total population size (20,736) of the whole search space.
TABLE 8.9 Identified Results Using Noise-Contaminated Displacement (2% Level) for Ten-Layer Laminates

| Stacking Sequence | Generation No. for Finding True Solution | No. of Forward Subprogram Calls | Exploration Rate (a) |
|---|---|---|---|
| [C90/–45/45/–60/60]s | 1034 | 4137 | 19.95% |
| [C0/–30/60/–75/45]s | 570 | 2281 | 11.00% |
| [G0/30/–60/45/90]s | 140 | 561 | 2.71% |
| [C0/–45/45/G–75/75]s | 862 | 3449 | 16.63% |

(a) Percentage of the number of performed forward computations to the number of the total population size (20,736) of the whole search space.
Comparing the identified results in Table 8.6 and Table 8.7 with those in Table 8.8 and Table 8.9, it can be concluded that, in general, the genetic algorithm improves its performance, as measured by the exploration rate, as the total population of the problem increases. The problem under investigation is a combinatorial identification problem, whose search space is a combination of multiple discrete ply angles. When the number of plies increases, the search space becomes larger and the inverse problem becomes more difficult to solve. The number of searches required usually goes up with the ply number, but the “exploration rate”, that is, the percentage of the number of performed forward computations to the total population of the whole search space, decreases.

Another observation is that the effect of noise contamination on the exploration rate does not demonstrate an obvious trend. For the eight-layer cases, the exploration rates increase when noise contamination is taken into consideration, but for the ten-layer cases this phenomenon does not appear. Considering that random numbers are used in the GA operation, the generation numbers at which the true solution is found inevitably show a degree of randomness.

8.3.5.3 Further Investigations

As specified earlier, the fiber orientations of layers have been limited to 12 discrete values, and the preceding discussions have been conducted based
on this specification. As the number of possibilities increases, the problem becomes more complicated. Next, the identification problem is investigated further by increasing the number of possible ply orientations. Now assume that the ply orientations of the laminates under investigation are restricted to a set of discrete angles from –90 to 90° at 5° intervals. The number of possible angles per parameter is thus 36, compared with the 12 possibilities in the previous discussion. With the smaller interval, the search space becomes much larger and the inverse problem becomes more complicated, with more near-global optima.

First, the effectiveness of the procedure with the newly defined search space is investigated by considering eight-layer composite laminates made of glass/epoxy material. Each real parameter is coded into a six-digit binary string. The total population of possible candidates in the search space is $36^3 = 46{,}656$. Compared with the cases of eight-layer laminates with 15° intervals, the total population has increased by 27 times. A higher computational cost is expected. The identified results, based on the computer-simulated noise-free and noise-contaminated displacement responses, are given in Table 8.10 and Table 8.11, respectively. From the tables, it can be concluded that the proposed inverse procedure can accurately locate the true solution, but at a high computational cost. The accurate results, even after the effect of noise contamination is taken into consideration, demonstrate the robustness of the procedure in the presence of the given noise. Comparing Table 8.10 with Table 8.7 shows that, for the case of 5° intervals, the numbers of searches required to locate the true solution are generally much higher, but the exploration rate is much lower than in the case of 15° intervals. This phenomenon is in accordance with previous examples and discussions.

TABLE 8.10 Identified Results of Ply Orientation Using Noise-Free Displacement for Glass/Epoxy Laminates with 5° Intervals

| Stacking Sequence | Generation No. for Finding True Solution | No. of Forward Solver Calls | Exploration Rate (a) |
|---|---|---|---|
| [30/15/–60/90]s | 45 | 226 | 0.48% |
| [30/–30/60/–60]s | 744 | 2977 | 6.38% |
| [0/–45/45/–75]s | 1135 | 4541 | 9.73% |
| [90/–30/75/0]s | 593 | 2373 | 5.09% |

(a) Percentage of the number of performed forward computations to the number of the total population size (46,656) of the whole search space.

TABLE 8.11 Identified Results of Ply Orientation Using Noise-Contaminated (10% Level) Displacement for Glass/Epoxy Laminates with 5° Intervals

| Stacking Sequence | Generation No. for Finding True Solution | No. of Forward Solver Calls | Exploration Rate (a) |
|---|---|---|---|
| [30/15/–60/90]s | 232 | 929 | 1.99% |
| [30/–30/60/–60]s | 534 | 2137 | 4.58% |
| [0/–45/45/–75]s | 739 | 2957 | 6.34% |
| [90/–30/75/0]s | 906 | 3625 | 7.77% |

(a) Percentage of the number of performed forward computations to the number of the total population size (46,656) of the whole search space.

The procedure is also employed for the composite laminate of 10 layers whose fiber orientations can range from –90 to +90° at 5° intervals. In this case the total population is $36^4 = 1{,}679{,}616$; Table 8.12 lists the identified results. For some cases the GA search failed to locate the true solution but converged to an optimum close to the true solution at the prescribed maximum number of generations (5000). The exploration rate at the maximum number of generations is about 0.3%. For some cases the true solution has already been found at this low rate of exploration; for the other cases, further exploration is needed to find the true solution.

TABLE 8.12 Identified Results of Ply Orientation for Ten-Layer Laminates Based on 5° Intervals

| Stacking Sequence | Identified Result (at 5000 Generations) | True Result Found? |
|---|---|---|
| [G70/–30/15/–60/45]s | [70/–30/15/–60/45]s | Yes |
| [G90/–40/0/–55/75]s | [90/–35/5/–50/75]s | No |
| [G90/–60/60/–15/35]s | [90/–55/55/–15/35]s | No |
| [C30/–60/75/–45/0]s | [30/–60/75/–45/0]s | Yes |
| [C0/–50/50/–15/60]s | [0/–50/45/–25/55]s | No |

Note: Total population size of the whole search space was 1,679,616.

Some analyses of the search space have been carried out. Table 8.13 compares the fitness of the neighboring points of the true solution with that of the identified result for the laminate [G90/–40/0/–55/75]s.

TABLE 8.13 Comparison of Fitness of Neighborhood of Global Optima to Identified Optima

| Identified Results | Fitness Value |
|---|---|
| Global optimal | 0 |
| Identified result | 0.005046 |
| [90/–40/5/–55/75]s | 0.037727 |
| [90/–40/0/–50/75]s | 0.014919 |
| [90/–40/0/–55/70]s | 0.017461 |
| [90/–35/0/–55/75]s | 0.005133 |
| [90/–40/0/–60/75]s | 0.018623 |
| [90/–40/0/–55/80]s | 0.009707 |
| [90/–45/0/–55/75]s | 0.007391 |

FIGURE 8.4 Configuration of a laminated cylindrical shell. (From Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.)

From Table 8.13, it can be seen that the neighboring points of the global optimum have worse fitness values than the identified point; that is, the global optimum is an isolated point surrounded by points of worse fitness. This phenomenon shows that the identification problem possesses the characteristic of deception. It is not an easy task for a search technique to solve these so-called “deceptive problems.” Even GA search methods, which are generally robust in dealing with complicated search problems, may become inefficient with such complicated problems, especially when the search space is very large. To solve this problem effectively, a modification of the experimental strategy needs to be considered.
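The neighborhood comparison of Table 8.13 can be sketched in code. The snippet below enumerates the grid neighbors of the true solution (one free ply changed by one 5° step, outer ply fixed) and checks them against the tabulated fitness values. The helper names `wrap_angle` and `grid_neighbors`, and the convention that –90° wraps to 90°, are assumptions made for this illustration.

```python
def wrap_angle(angle_deg):
    """Map an angle onto the 36-value grid -85, -80, ..., 90 (degrees)."""
    wrapped = (angle_deg + 90) % 180 - 90  # lies in [-90, 90)
    return 90 if wrapped == -90 else wrapped


def grid_neighbors(half_stack, step=5, fixed=(0,)):
    """Sequences that differ from half_stack by one grid step in one free ply."""
    neighbors = []
    for i, angle in enumerate(half_stack):
        if i in fixed:
            continue  # the exposed outer ply is assumed known
        for delta in (-step, step):
            candidate = list(half_stack)
            candidate[i] = wrap_angle(angle + delta)
            neighbors.append(tuple(candidate))
    return neighbors


TRUE_SOLUTION = (90, -40, 0, -55, 75)
IDENTIFIED_FITNESS = 0.005046  # fitness of the identified result [90/-35/5/-50/75]s

# Fitness values of neighboring points, copied from Table 8.13.
TABLE_8_13 = {
    (90, -40, 5, -55, 75): 0.037727,
    (90, -40, 0, -50, 75): 0.014919,
    (90, -40, 0, -55, 70): 0.017461,
    (90, -35, 0, -55, 75): 0.005133,
    (90, -40, 0, -60, 75): 0.018623,
    (90, -40, 0, -55, 80): 0.009707,
    (90, -45, 0, -55, 75): 0.007391,
}

neighbors = grid_neighbors(TRUE_SOLUTION)
print(len(neighbors), "grid neighbors of the true solution")
print(all(fitness > IDENTIFIED_FITNESS for fitness in TABLE_8_13.values()))
```

Every tabulated neighbor has a larger (worse) fitness value than the identified result, so a search that only improves locally has no gradient leading it from the identified point toward the true solution.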
8.3.6 Example III: Engineering Constants of Laminated Cylindrical Shells

In this section, numerical results are presented for the determination of the engineering constants of laminated cylindrical shells, as shown in Figure 8.4. The inverse technique outlined in Figure 5.34 is employed. The HNM is adopted as the forward solver, and the uniform µGA is employed as the inverse operator. In a uniform µGA run, each individual chromosome represents a candidate combination of the parameters (engineering constants). For each candidate combination, the dynamic displacement responses on the outer surface of the shell can be calculated using the forward solver. As a necessary condition for successfully utilizing this inverse approach, the selected displacement responses must be sensitive to the material constants. Therefore, in the first step, the effect of varying the engineering constants on the displacement responses is studied. All the results are based on the present analytical–numerical method, and the dimensionless operator introduced by Han et al. (2001b) is used.

Figure 8.5(a) through Figure 8.5(e) offer examples of the displacement responses at x = 2H on the outer surface of the [G0/+30/–30/90/–60/+60]s laminated cylindrical shell excited by an incident wave of one cycle of sine function at x = 0.0. In each panel of this figure, only one engineering constant varies, and the others are kept at the actual values listed in Table 8.3. Each of the material constants displays an appreciable influence on the response curve. It is now necessary to decide what specific information from the displacement responses to include. Considering the features of Figure 8.5(a) through Figure 8.5(e), the displacement responses at 75 points in the time range from 1.5 to 8.0 are selected.

FIGURE 8.5 Time history of displacement response in the x direction at x = 2.0H on the outer surface of the [G0/+30/–30/90/–60/+60]s laminated cylindrical shell excited by an incident wave of one cycle of sine function at x = 0.0 (dimensionless displacement versus dimensionless time). Panels: (a) effect of E1 (38.48, 25.6, 12.8 GPa); (b) effect of E2 (9.38, 6.25, 3.13 GPa). (From Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.)

8.3.6.1 [G0/+45/–45/90/–45/+45]s Cylindrical Shell
First, consider a composite laminated cylindrical shell consisting of 12 glass/epoxy layers. The stacking sequence of the laminated layers is denoted by [G0/+45/–45/90/–45/+45]s. Though a symmetrical laminate is used here for the numerical verification, the method is applicable to laminates with an arbitrary layer stacking sequence. The receiving point is set at x = 2.0H. The displacement response in the x direction at the receiving point is computed using the forward solver. This computer-simulated displacement response is considered noise-free data and used to identify the material constants. For the noise-contaminated case, the noise-added displacement response is generated using the Gaussian noise created using Equation 2.130; the noise-added response is then used as input data for the inverse identification operation. The GA search space defined in Table 8.1 is still used. Four GA runs with different random number series generated by Knuth’s algorithm are performed; the mean values of the identified results are shown in Table 8.14. The variation of the error value against the number of generations for a GA run is plotted in Figure 8.6, showing the convergence in the identification process.

FIGURE 8.5 (continued) Panels: (c) effect of G12 (3.41, 2.28, 1.14 GPa); (d) effect of ν12 (0.292, 0.195, 0.097); (e) effect of ν23 (0.507, 0.34, 0.17).

FIGURE 8.6 Convergence of the GA in the process of determining the engineering constants of the [G0/+45/–45/90/–45/+45]s laminated cylindrical shell (function value versus number of generations). (From Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.)

Table 8.14 shows that the proposed inverse technique gives quite accurate results for the noise-free case, where the maximum error of the results is less than 4%. For the noise-contaminated case, the error is less than 7% with a noise level of 5%; the results are not good with a noise level of 10%. Note that the characterized results are stable regardless of the presence of noise of up to a level of 5% because of the use of HNM and the projection regularization, as described in Section 8.3.4.3.

TABLE 8.14 Characterized Material Constants of Glass/Epoxy [G0/+45/–45/90/–45/+45]s Laminated Cylindrical Shell

| Material Constant | Original Data | Noise Free: Mean | Noise Free: % Error | Gauss Noise (5%): Mean | Gauss Noise (5%): % Error | Gauss Noise (10%): Mean | Gauss Noise (10%): % Error |
|---|---|---|---|---|---|---|---|
| E1 (GPa) | 38.48 | 39.62 | 2.96 | 38.60 | 0.30 | 38.02 | 1.2 |
| E2 (GPa) | 9.38 | 9.15 | 2.45 | 9.01 | 3.94 | 8.96 | 4.5 |
| G12 (GPa) | 3.41 | 3.28 | 2.05 | 3.24 | 4.93 | 3.21 | 5.87 |
| ν12 | 0.292 | 0.274 | 3.81 | 0.307 | 5.08 | 0.318 | 9.02 |
| ν23 | 0.507 | 0.520 | 2.56 | 0.536 | 5.65 | 0.568 | 12.01 |

Source: Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.

8.3.6.2 [G0/–30/30/90/–60/+60]s Cylindrical Shell

A laminated cylindrical shell made of the same material but with different layer orientations is also considered. The stacking sequence is arbitrarily chosen as [G0/–30/30/90/–60/+60]s. The same search space defined in Table 8.1 is used for the GA computation, and identified results based on the noise-free response and the Gaussian noise-contaminated (5%) response are obtained (see Table 8.15). The identified results are as satisfactory as those obtained using [G0/+45/–45/90/–45/+45]s. This fact is important because it shows that the identification procedure is independent of the stacking sequence of the laminate. Therefore, this technique can be applied directly to actual laminated shell structures.

TABLE 8.15 Characterized Material Constants of Glass/Epoxy [G0/+30/–30/90/–60/+60]s Laminated Cylindrical Shell

| Material Constant | Original Data | Noise Free: Mean | Noise Free: % Error | Gauss Noise (5%): Mean | Gauss Noise (5%): % Error |
|---|---|---|---|---|---|
| E1 (GPa) | 38.48 | 38.88 | 1.04 | 38.69 | 0.54 |
| E2 (GPa) | 9.38 | 9.28 | 0.54 | 9.16 | 2.34 |
| G12 (GPa) | 3.41 | 3.37 | 1.24 | 3.31 | 3.06 |
| ν12 | 0.292 | 0.301 | 3.01 | 0.278 | 4.73 |
| ν23 | 0.507 | 0.488 | 3.78 | 0.480 | 5.41 |

Source: Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.

8.3.6.3 [C0/30/–30/90/–60/60]s Cylindrical Shell

Numerical verification is also performed on a laminated cylindrical shell made of carbon/epoxy layers, whose anisotropy is much stronger than that of the glass/epoxy layers. The stacking sequence of the carbon/epoxy laminate is [C0/30/–30/90/–60/60]s. The GA search space defined for this numerical example is listed in Table 8.3. Table 8.16 lists the identified results based on the noise-free response and the Gaussian noise-contaminated response (5%). The results indicate that the proposed technique works well even when the materials have very strong anisotropy. These tables show that the identified results for the Young’s moduli and the shear modulus are accurate and stable with respect to the added noise, while the results for the Poisson’s ratios show a relatively large variation when the noise is added. In general, the results obtained are satisfactory.
TABLE 8.16 Characterized Material Constants of Carbon/Epoxy [C0/+30/–30/90/–60/+60]s Laminated Cylindrical Shell

| Material Constant | Original Data | Noise Free: Mean | Noise Free: % Error | Gauss Noise (5%): Mean | Gauss Noise (5%): % Error |
|---|---|---|---|---|---|
| E1 (GPa) | 142.2 | 142.97 | 0.54 | 144.36 | 1.52 |
| E2 (GPa) | 9.26 | 9.24 | 0.23 | 9.03 | 2.44 |
| G12 (GPa) | 4.80 | 4.70 | 2.06 | 4.75 | 1.03 |
| ν12 | 0.33 | 0.326 | 1.06 | 0.312 | 5.37 |
| ν23 | 0.49 | 0.506 | 3.34 | 0.516 | 5.22 |

Source: Han, X. et al., Inverse Probl. Eng., 10(4), 309, 2002. With permission.
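The noise-contaminated inputs used in the examples of Section 8.3 can be imitated with a few lines of code. Equation 2.130 is not reproduced in this section, so the sketch below rests on an assumed convention: zero-mean Gaussian noise whose standard deviation equals the noise level times the root-mean-square value of the clean signal. The function name `add_gaussian_noise` and the test signal are hypothetical.

```python
import math
import random


def add_gaussian_noise(signal, level, rng):
    """Add zero-mean Gaussian noise with standard deviation level * RMS(signal).

    ASSUMPTION: this scaling is an illustrative convention, not necessarily
    the definition given by Equation 2.130.
    """
    rms = math.sqrt(sum(u * u for u in signal) / len(signal))
    sigma = level * rms
    return [u + rng.gauss(0.0, sigma) for u in signal]


rng = random.Random(2003)  # fixed seed so the run is repeatable
clean = [0.04 * math.sin(2.0 * math.pi * k / 200) for k in range(2000)]
noisy = add_gaussian_noise(clean, 0.05, rng)  # 5% noise level
print(max(abs(n - c) for n, c in zip(noisy, clean)))
```

Fixing the seed makes each noise realization reproducible; repeating an identification with several seeds shows how much the identified constants scatter at a given noise level.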
8.4 Using the Real µGA

In this study the FEM is used as the forward solver to compute the dynamic response of composite laminates. Consider a four-layer carbon/epoxy laminate with a stacking sequence of [C0/+45/–45/0]. The ply orientation of the laminate is arbitrarily chosen. The laminate has a length of 1 m, a width of 1 m, and a thickness of 0.12 m. It is discretized into eight-node thick shell elements with a one-point reduced integration formulation. The laminate is loaded by a half-cycle sine function dynamic load in the z direction over one element on the upper surface of the laminate, as shown in Figure 8.7. The duration of the load is chosen as 1 ms; the loading location and the receiving point are chosen arbitrarily. An explicit, three-dimensional finite-element software package, LS_DYNA (Hallquist, 1998), is used as the forward solver to calculate the displacement response. In the thickness direction, one through-thickness integration point is used for each material layer.

FIGURE 8.7 Composite laminate [C0/45/–45/0] (1 m × 1 m × 0.12 m) subjected to a dynamic load f(t) at an arbitrary location. Displacement response can be received at another location. LS_DYNA is used as the forward solver to calculate displacement of the laminate.

FIGURE 8.8 Convergence of real µGA for identification of the material constants of the composite laminate [C0/45/–45/0] (error function ferr, in units of 10^-4, versus generation number).

An explicit solver that employs a central difference integration algorithm is used for the numerical calculation of the equations of motion. The computer-simulated displacement response with the actual material constants is used as the experimental input data. The GA search space defined for this numerical example is listed in Table 8.3. The real µGA with crossover operator 4 (Chapter 5) is employed as the inverse operator. The relationship between the error function defined by Equation 8.1 and the generation number using the real µGA is plotted in Figure 8.8. It can be seen from this figure that the error function decreases rapidly using the real µGA. The search results at the 300th generation are listed in Table 8.17. The results for the Poisson’s ratios have a comparatively large variation; this may be caused by the fact that variation of the Poisson’s ratios has very little effect on the dynamic response of the laminate. In general, the results obtained are stable and satisfactory for engineering applications.
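The central difference scheme mentioned above can be illustrated on a single-degree-of-freedom system. The sketch below is not the LS_DYNA implementation; it is a minimal illustration of the explicit update under a half-cycle sine load of 1 ms duration, and the mass, stiffness, time step, and function names are hypothetical values chosen for the example.

```python
import math


def half_sine_load(t, amplitude=1.0, duration=1.0e-3):
    """Half-cycle sine pulse of the given duration in seconds; zero afterwards."""
    if 0.0 <= t <= duration:
        return amplitude * math.sin(math.pi * t / duration)
    return 0.0


def central_difference(mass, stiffness, load, dt, n_steps, u0=0.0, v0=0.0):
    """Integrate m*u'' + k*u = load(t) with the explicit central difference scheme."""
    omega = math.sqrt(stiffness / mass)
    if dt >= 2.0 / omega:
        raise ValueError("time step exceeds the stability limit 2/omega")
    a0 = (load(0.0) - stiffness * u0) / mass
    u_prev = u0 - dt * v0 + 0.5 * dt * dt * a0  # fictitious displacement at t = -dt
    u_curr = u0
    history = [u0]
    for n in range(n_steps):
        accel = (load(n * dt) - stiffness * u_curr) / mass
        u_next = 2.0 * u_curr - u_prev + dt * dt * accel
        u_prev, u_curr = u_curr, u_next
        history.append(u_curr)
    return history  # history[i] is the displacement at t = i * dt


omega = 2.0 * math.pi * 500.0  # assumed natural frequency of 500 Hz
response = central_difference(1.0, omega ** 2, half_sine_load, dt=1.0e-6, n_steps=3000)
print(max(abs(u) for u in response))
```

The scheme is only conditionally stable, which is why the sketch rejects time steps at or above 2/ω; explicit finite-element codes enforce an analogous limit based on the highest frequency of the mesh.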
TABLE 8.17 Identified Results for Four-Layer Carbon/Epoxy Laminate [C0/45/–45/0] Using Real µGA with Crossover Operator 4

| Material Constant | Actual Value | Search Result |
|---|---|---|
| E1 (GPa) | 142.170 | 141.9 (–0.19%) |
| E2 (GPa) | 9.26 | 9.330 (+0.81%) |
| G12 (GPa) | 4.8 | 4.760 (–0.83%) |
| ν12 | 0.33 | 0.342 (+3.6%) |
| ν23 | 0.49 | 0.471 (–3.9%) |

8.5 Using the Combined Optimization Method

Consider the problem of determining the material constants of the glass/epoxy laminate [0/45/–45/90/–45/45]s. The combined optimization method described in Chapter 5 is employed for the inverse analysis. The uniform µGA and the subroutine BCLSF of IMSL are used for the combination. BCLSF is a gradient-based method based on the modified Levenberg–Marquardt algorithm and uses the finite-difference method for evaluating the Jacobian.

At the first step, the uniform µGA is used to select a set of better solutions close to the optima. The selection criterion requires the function value to be below a prescribed value, say 0.03. The error function defined by Equation 8.1 is more complex than that in Equation 5.48; thus the better solutions for this problem should be selected from more generations than in the numerical test presented in Section 5.8.2. Within the first 50 generations, three distinct parameter sets arise among the parents generated by the uniform µGA. Therefore, three sets of parameters are selected; the values of these selected sets of parameters and the corresponding fitness function values are listed in Table 8.18.

TABLE 8.18 Selected Sets of Better Solutions Close to Optima from Uniform µGA for Material Characterization of Glass/Epoxy Laminate [0/45/–45/90/–45/45]s

| Set Number | (E1, E2, G12, ν12, ν23) | Fitness Function Value |
|---|---|---|
| 1 | (39.7, 9.7, 3.2, 0.36, 0.5) | 5.334e–4 |
| 2 | (38.9, 9.65, 3.2, 0.31, 0.503) | 1.520e–4 |
| 3 | (42.8, 8.09, 3.79, 0.12, 0.31) | 1.026e–2 |

Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.

At the second step, these sets of better solutions are used as the initial points for the gradient-based method. BCLSF is applied three times, each time starting from a different initial point. The final converged results are shown in Table 8.19. This table shows that different initial guesses for BCLSF lead to different solutions. All these solutions can be considered local optima of the function, and the global optimum is found by comparing their function values; in Table 8.19 this is set 2, which has the smallest function value.

TABLE 8.19 Results from the Gradient-Based Optimization Method for Material Characterization of Glass/Epoxy Laminate [0/45/–45/90/–45/45]s Using Initial Values from the µGA

| Set Number | Initial Value | Corresponding Solution | Function Evaluations | Function Value |
|---|---|---|---|---|
| 1 | (39.7, 9.7, 3.2, 0.36, 0.5) | (39.6, 9.64, 3.19, 0.358, 0.497) | 8 | 6.889e–7 |
| 2 | (38.9, 9.65, 3.2, 0.31, 0.503) | (38.9, 9.61, 3.205, 0.305, 0.504) | 9 | 7.291e–8 |
| 3 | (42.8, 8.09, 3.79, 0.12, 0.31) | (37.96, 8.82, 3.38, 0.15, 0.396) | 14 | 5.625e–7 |

Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.

The maximum error of the identified result is less than 6%. Furthermore, each BCLSF run needs very few function evaluations to converge; a total of 31 function evaluations are performed in all the BCLSF runs. For this material characterization problem using the combined optimization method, 250 function evaluations are performed in the uniform µGA runs and 31 in the BCLSF runs. Therefore, a total of 281 function evaluations is required in the combined method, which is significantly fewer than the 2001 function evaluations required using the uniform µGA alone for the same problem. It can be clearly concluded that the combined optimization method for material characterization is very efficient.
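The bookkeeping behind these figures can be reproduced directly from Table 8.19. The sketch below only post-processes the tabulated results (it does not call BCLSF or any optimizer); the variable names are introduced here for illustration.

```python
# Rows of Table 8.19: (set number, converged solution, function evaluations, function value).
REFINED = [
    (1, (39.6, 9.64, 3.19, 0.358, 0.497), 8, 6.889e-7),
    (2, (38.9, 9.61, 3.205, 0.305, 0.504), 9, 7.291e-8),
    (3, (37.96, 8.82, 3.38, 0.15, 0.396), 14, 5.625e-7),
]

GA_EVALUATIONS = 250        # uniform µGA stage, as stated in the text
GA_ALONE_EVALUATIONS = 2001  # uniform µGA used alone, as stated in the text

# The global optimum is the local optimum with the smallest function value.
best = min(REFINED, key=lambda row: row[3])
gradient_evaluations = sum(row[2] for row in REFINED)
total_evaluations = GA_EVALUATIONS + gradient_evaluations
saving = 1.0 - total_evaluations / GA_ALONE_EVALUATIONS

print("best set:", best[0], best[1])
print("gradient-stage evaluations:", gradient_evaluations)
print("combined evaluations:", total_evaluations)
print("saving relative to µGA alone: {:.0%}".format(saving))
```

Starting the gradient-based stage from several µGA candidates costs only a few dozen extra evaluations here, while guarding against convergence to a single poor local optimum.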
8.6 Using the Progressive NN for Identifying Elastic Constants
The neural network is used for identification of elastic constants of laminated structures. The outputs of the NN model are elastic constants and the inputs are the time history of displacement responses on the surface of the laminate, which can be easily measured using conventional experimental techniques. Only one receiving point is chosen on the surface of the laminates, and the responses in the time domain for displacement components in z and y directions are selected as the inputs for the NN model.
8.6.1 Solving Strategy and Statement of the Problem

An NN model is trained using initial training data containing a set of assumed elastic constants, which represent various possible elastic constants of laminates, and their corresponding displacement responses calculated with the HNM solver. The trained NN model is used to identify the elastic constants by feeding in the measured displacement responses. The identified elastic constants are then used in the HNM solver to calculate the dynamic displacement responses. The NN model goes through a retraining process if the calculated displacement responses deviate unacceptably from the actual ones. The progressive NN process for identification of the elastic constants of laminates follows the same procedure illustrated in Figure 6.6.

This NN process for identification of the elastic constants of laminates is demonstrated using one actual laminate consisting of 10 glass/epoxy layers. The stacking sequence of the laminated layers is denoted by [0/+45/–45/60/–60]s. The glass/epoxy material is a transversely isotropic material. The on-principal-axis elastic constants $c_{ij}$ ($i, j = 1, 2, \ldots, 6$) of the glass/epoxy material are related to the engineering constants in the form (Vinson and Sierakowski, 1987):

$$
\begin{aligned}
c_{11} &= E_1\left(1-\nu_{23}^2\right)/\Delta, & c_{22} &= E_2\left(1-\nu_{21}\nu_{12}\right)/\Delta,\\
c_{12} &= \left(\nu_{12}+\nu_{23}\nu_{12}\right)E_2/\Delta, & c_{13} &= \left(\nu_{12}+\nu_{12}\nu_{23}\right)E_2/\Delta,\\
c_{23} &= \left(\nu_{23}+\nu_{12}\nu_{21}\right)E_2/\Delta, & c_{44} &= \left(c_{22}-c_{23}\right)/2,\\
c_{55} &= G_{12}, & c_{66} &= G_{12},\\
\nu_{21} &= E_2\nu_{12}/E_1, & \Delta &= 1-2\nu_{12}\nu_{21}-\nu_{23}^2-2\nu_{21}\nu_{23}\nu_{12}
\end{aligned}
\tag{8.3}
$$

where the actual engineering constants are provided in Table 8.3. Five parameters, named c11, c12, c13, c33, and c44, need to be identified, and their search range is listed in Table 8.20.

TABLE 8.20 Search Range for Elastic Constants for Glass/Epoxy Laminate

| Elastic Constant | Actual Data (GPa) | Search Range (GPa) |
|---|---|---|
| c11 | 42.020 | 30–54 |
| c12 | 6.067 | 4–8 |
| c22 | 13.500 | 10–18 |
| c23 | 7.277 | 5–9 |
| c55 | 3.410 | 2–4 |

Source: Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.
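As a numerical check of Equation 8.3, the sketch below evaluates the elastic constants from the glass/epoxy engineering constants quoted as original data in Table 8.14 (E1 = 38.48 GPa, E2 = 9.38 GPa, G12 = 3.41 GPa, ν12 = 0.292, ν23 = 0.507). The function name `stiffness_from_engineering` is introduced here for illustration.

```python
def stiffness_from_engineering(E1, E2, G12, nu12, nu23):
    """Evaluate Equation 8.3 for a transversely isotropic ply (moduli in GPa)."""
    nu21 = E2 * nu12 / E1
    delta = 1.0 - 2.0 * nu12 * nu21 - nu23 ** 2 - 2.0 * nu21 * nu23 * nu12
    c22 = E2 * (1.0 - nu21 * nu12) / delta
    c23 = (nu23 + nu12 * nu21) * E2 / delta
    return {
        "c11": E1 * (1.0 - nu23 ** 2) / delta,
        "c12": (nu12 + nu23 * nu12) * E2 / delta,
        "c13": (nu12 + nu12 * nu23) * E2 / delta,
        "c22": c22,
        "c23": c23,
        "c44": (c22 - c23) / 2.0,
        "c55": G12,
        "c66": G12,
    }


constants = stiffness_from_engineering(38.48, 9.38, 3.41, 0.292, 0.507)
for name in ("c11", "c12", "c22", "c23", "c55"):
    print(name, round(constants[name], 3))
```

With these inputs the computed c11, c12, c22, c23, and c55 agree with the Actual Data column of Table 8.20 to within rounding.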
8.6.2 Inputs of the NN Model

The displacement response data on the surface of the laminate are selected as the inputs. As a necessary condition for successfully utilizing the NN model, the input data must depend significantly on the sought outputs (the elastic constants). Therefore, in the first step, an intensive sensitivity study is carried out to examine the effect of the elastic constants on the displacement responses. All the results are obtained using the HNM, and the dimensionless variables defined by Liu et al. (1991a) are used.
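A one-at-a-time sensitivity study of this kind can be sketched as follows. The HNM solver is not available here, so `toy_forward` is a purely hypothetical stand-in that happens to depend on only two constants; the parameter names, actual values, and search ranges follow Table 8.20, and the four sweep values per constant span the search range.

```python
import math

# Actual values and search ranges (GPa) taken from Table 8.20.
ACTUAL = {"c11": 42.020, "c12": 6.067, "c22": 13.500, "c23": 7.277, "c55": 3.410}
RANGES = {"c11": (30, 54), "c12": (4, 8), "c22": (10, 18), "c23": (5, 9), "c55": (2, 4)}


def sweep_values(low, high, n=4):
    """n equally spaced values from low to high inclusive."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def toy_forward(params, times):
    """HYPOTHETICAL stand-in for the forward solver; uses c11 and c55 only."""
    freq = math.sqrt(params["c11"] / params["c55"])
    return [math.exp(-0.3 * t) * math.sin(freq * t) for t in times]


def sensitivity(name, forward, times, n=4):
    """Largest spread between response curves when only one constant is varied."""
    curves = []
    for value in sweep_values(*RANGES[name], n):
        params = dict(ACTUAL)  # all other constants stay at their actual values
        params[name] = value
        curves.append(forward(params, times))
    return max(max(column) - min(column) for column in zip(*curves))


times = [1.0 + 0.1 * k for k in range(91)]
for name in ACTUAL:
    print(name, round(sensitivity(name, toy_forward, times), 4))
```

A constant whose spread is zero or negligible for a given response component cannot be identified from that component alone, which motivates checking more than one displacement component.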
FIGURE 8.9 Time history of displacement response in the x direction at x = 3.0H on the upper surface of the single-ply glass/epoxy laminate excited by a vertical line load of one cycle of sine function at x = 0.0 (displacement u versus time t). Panels: (a) c11 = 30, 38, 46, 54 GPa; (b) c12 = 4.0, 5.33, 6.66, 8.0 GPa; (c) c33 = 10.0, 12.67, 15.33, 18.0 GPa; (d) c13 = 5.0, 6.33, 7.66, 9.0 GPa; (e) c44 = 2.0, 2.67, 3.33, 4.0 GPa. (From Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.)

Examples of the displacement response in the x direction (see Figure 7.7) at x = 3.0H on the upper surface of the single-ply glass/epoxy laminate excited by an incident vertical line load of one cycle of sine function at x = 0.0 are displayed in Figure 8.9(a) through Figure 8.9(e). These figures show that each of c11, c12, c33, and c44 has an appreciable influence on the response curve when the rest of the elastic constants are set to their actual values. However, the effect of c13 cannot be seen clearly (there is even no effect) in the response curve, as shown in Figure 8.9(d). To reflect the effect of c13, it is necessary to consider the component of displacement response in the y direction subjected to a shear load in the y direction acting at x = 0.0. The influences of the change of the elastic constants on this displacement component are shown in Figure 8.10(a) through Figure 8.10(e). The influence of the change of the elastic constant c13 can be clearly observed in Figure 8.10(d), but the effect of the changes of c11 and c12 cannot be clearly observed in the response of the displacement component in the y direction, as shown in Figure 8.10(a) and Figure 8.10(b), respectively.

Figure 8.11 and Figure 8.12 show the effect of the changes of the elastic constants on the response curves for the 10-layer glass/epoxy [0/+45/–45/60/–60]s laminate. As with the single-ply glass/epoxy laminate, Figure 8.11 shows that the response of the displacement component in the x direction is sensitive to changes in some of the elastic constants but insensitive to changes in the others. The same phenomenon can be found in Figure 8.12 for the displacement component in the y direction. It is natural to expect that a combination of these two components of displacement responses should be sensitive to the change of all the elastic constants of the laminate. From Figure 8.11 and Figure 8.12, it can be observed that there is a “special region” that results from the change of the elastic constants. Significant change occurs in the amplitude and in the pattern of the response curve within this region. The effect of the change of the elastic constants on the displacement response is thus clearly reflected.

This minor modification of the inverse strategy effectively solves the sensitivity problem. It is now necessary to decide what specific information from the displacement response is to be included in the input training sample. Here two components of displacement responses are used as the input of the NN model: one is displacement
FIGURE 8.10 Time history of displacement response in the y direction at x = 3.0H on the upper surface of the single-ply glass/epoxy laminate excited by a shear line load in the y direction of one cycle of sine function at x = 0.0. Each panel plots the displacement v against time t for four values of one elastic constant: (a) c11 = 30, 38, 46, 54 GPa; (b) c12 = 4.0, 5.33, 6.66, 8.0 GPa; (c) c33 = 10.0, 12.67, 15.33, 18.0 GPa; (d) c13 = 5.0, 6.33, 7.66, 9.0 GPa; (e) c44 = 2.0, 2.67, 3.33, 4.0 GPa. (From Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.)
FIGURE 8.11 Time history of displacement response in the x direction at x = 3.0H on the upper surface of the [G0/+45/–45/60/–60]s laminate excited by a vertical line load in the z direction of one cycle of sine function at x = 0.0. Each panel plots the displacement u against time t for four values of one elastic constant: (a) c11 = 30, 38, 46, 54 GPa; (b) c12 = 4.0, 5.33, 6.66, 8.0 GPa; (c) c33 = 10.0, 12.67, 15.33, 18.0 GPa; (d) c13 = 5.0, 6.33, 7.66, 9.0 GPa; (e) c44 = 2.0, 2.67, 3.33, 4.0 GPa. (From Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.)
FIGURE 8.12 Time history of displacement response in the y direction at x = 3.0H on the upper surface of the [G0/+45/–45/60/–60]s laminate excited by a shear line load in the y direction of one cycle of sine function at x = 0.0. Each panel plots the displacement v against time t for four values of one elastic constant: (a) c11 = 30, 38, 46, 54 GPa; (b) c12 = 4.0, 5.33, 6.66, 8.0 GPa; (c) c33 = 10.0, 12.67, 15.33, 18.0 GPa; (d) c13 = 5.0, 6.33, 7.66, 9.0 GPa; (e) c44 = 2.0, 2.67, 3.33, 4.0 GPa. (From Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.)
responses in the x direction under a vertical line load in the z direction, and the other is displacement responses in the y direction under a shear load in the y direction. Considering the features of Figure 8.11 and Figure 8.12, the special region lies in the time interval [2.35, 3.25]. The two displacement components sampled at five instants within this interval, namely t = 2.35, 2.89, 2.98, 3.16, and 3.25, are therefore selected as the inputs, giving ten input values in total.
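The assembly of the input vector described above can be sketched as follows. This is only an illustration: `u_x` and `v_y` are hypothetical callables standing in for the two computed (or measured) displacement histories, and the ordering of the ten entries is an assumption, since the text does not specify it.

```python
# Sampling instants inside the special region [2.35, 3.25], as listed in the text.
SAMPLE_TIMES = [2.35, 2.89, 2.98, 3.16, 3.25]


def build_nn_input(u_x, v_y, times=SAMPLE_TIMES):
    """Concatenate the two displacement components sampled at the given instants.

    u_x: hypothetical callable, x-direction response under the vertical line load.
    v_y: hypothetical callable, y-direction response under the shear line load.
    The ordering (all u values, then all v values) is an assumption.
    """
    return [u_x(t) for t in times] + [v_y(t) for t in times]
```

With five instants and two components, the vector has ten entries, which matches the ten input neurons quoted for the NN model.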
8.6.3 Training Samples
The training samples for the initial training of the NN model consist of a number of sets of inputs and outputs; ideally they should cover all possible values of the elastic constants. It is impossible to generate all combinations of elastic constants, so a good cross section of the possible variations is required. The orthogonal array method (Chapter 6) is adopted for the selection of part of the training samples. To reinforce the sample set, another sample group is created from a random selection of the parameters. For this problem, the range defined in Table 8.20 is used. To formulate the initial training samples, it is assumed that each of the five elastic constants c11, c12, c13, c33, and c44 takes four discrete levels within its search range. Based on the orthogonal array method, these five four-level parameters require only 16 samples to cover the whole domain. In addition, another 21 samples, created randomly, are added to the training data set, giving 37 initial samples in total. This combined strategy covers a good cross section of all possible elastic constant variations.
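One way to obtain a 16-run orthogonal array for five four-level parameters is the standard finite-field construction sketched below. This is a minimal illustration and not necessarily the array used by the authors (Chapter 6 and Table 8.20 are authoritative). The level values in `LEVELS` are read from the legends of Figure 8.10 through Figure 8.12 and are assumed here to coincide with the four levels of each constant.

```python
from itertools import combinations

# Multiplication table of the finite field GF(4); addition in GF(4) is bitwise XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def l16_array():
    """Return a 16-run, 5-column, 4-level orthogonal array of strength 2."""
    rows = []
    for x in range(4):
        for y in range(4):
            # Columns: x, y, x + y, x + a*y, x + a^2*y, with arithmetic in GF(4).
            rows.append((x, y, x ^ y, x ^ GF4_MUL[2][y], x ^ GF4_MUL[3][y]))
    return rows


def is_orthogonal(rows, levels=4):
    """Check that every pair of columns contains each level pair exactly once."""
    if len(rows) != levels * levels:
        return False
    for a, b in combinations(range(len(rows[0])), 2):
        if len({(r[a], r[b]) for r in rows}) != levels * levels:
            return False
    return True


# Illustrative levels in GPa (assumption: taken from the figure legends).
LEVELS = {
    "c11": [30.0, 38.0, 46.0, 54.0],
    "c12": [4.0, 5.33, 6.66, 8.0],
    "c13": [5.0, 6.33, 7.66, 9.0],
    "c33": [10.0, 12.67, 15.33, 18.0],
    "c44": [2.0, 2.67, 3.33, 4.0],
}


def orthogonal_samples():
    """Map each row of level indices to a dictionary of elastic constants."""
    names = list(LEVELS)
    return [{n: LEVELS[n][lv] for n, lv in zip(names, row)} for row in l16_array()]
```

Each level of each constant appears in exactly four of the 16 samples, and every pair of constants sees all 16 level combinations once, which is the sense in which 16 runs "cover the whole domain".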
The inputs and outputs of the training samples are normalized linearly based on Equation 6.14. The NN model used in this example has two hidden layers; the input layer, output layer, first hidden layer, and second hidden layer have 10, 5, 30, and 16 neurons, respectively.
8.6.4 Results and Discussion
Two sets of elastic constants of the [0/+45/–45/60/–60]s glass/epoxy laminates are identified using the present procedure. One set is the actual value listed in the second column in Table 8.21. The other is an arbitrary set of elastic constants chosen from the search range given for the first set, which is listed in the second column in Table 8.22. An NN model is built for the identification of elastic constants of the actual [0/+45/–45/60/–60]s glass/epoxy laminate using training samples generated from the range of the actual elastic constants. To validate the stability of the present procedure, the same NN model is then used to identify both the actual set and the arbitrary set of elastic constants. The displacement responses at the sampling points on the upper surface for these two sets are calculated using the HNM and used as inputs to the NN model. In order to simulate measured displacement responses, noise-contaminated inputs are also used. The inputs are the two displacement components at the five points in the time history, calculated using the HNM. Table 8.21 summarizes the identified results of the elastic constants for the first case. The results for six progressions are listed. The result at the first progression is not accurate: the maximum error is high, and the displacement responses corresponding to these identified elastic constants are quite different from those simulated using the actual values of the elastic constants. Retraining of the NN model is therefore required. A new sample is created from the first identified result and the corresponding displacement responses calculated from the HNM. The new sample is added to the original sample pool to replace the sample with the largest distance norm.
The retraining process is repeated until the displacement responses corresponding to the identified elastic constants are sufficiently close to the simulated measurements. The results at each stage of progressive training are listed in Table 8.21. It can be seen that the accuracy of the identified results generally increases as the progression number increases, and the identified result is very accurate after six progressions. The maximum error of the elastic constants at the sixth progression is below 5%. The identified result remains stable regardless of the presence of noise, and the required number of progressions does not change when the noise is added. Another set of elastic constants of the glass/epoxy [0/+45/–45/60/–60]s laminate is also identified. The result for this case is shown in Table 8.22, which shows that very accurate results can be obtained after six progressions, even though the maximum error of the first identified elastic constants is as large
TABLE 8.21 Identified Results of Elastic Constants of the [0/+45/–45/60/–60]s Glass/Epoxy Laminate (Case 1)

Elastic    Original     Result (Error) at Progressions
Constants  Value (GPa)  1               2               3               4               5               6

(a) Noise Free
c11        42.020       41.311(–1.7%)   41.642(–0.9%)   42.991(2.3%)    42.990(2.3%)    42.930(2.2%)    42.930(2.2%)
c12        6.067        5.395(–11.1%)   6.695(10.4%)    6.590(8.6%)     6.520(7.5%)     6.470(6.6%)     6.275(3.4%)
c22        13.500       12.950(–6.8%)   14.230(5.4%)    14.020(3.9%)    13.600(0.7%)    13.871(2.7%)    13.760(1.9%)
c23        7.277        7.920(12.3%)    7.871(8.2%)     7.883(8.2%)     7.635(4.9%)     7.655(5.2%)     7.535(3.6%)
c55        3.410        3.745(9.8%)     3.490(2.4%)     3.535(3.7%)     3.620(6.2%)     3.571(4.6%)     3.573(4.8%)

(b) Noise Added (1%)
c11        42.020       40.980(–2.5%)   43.564(3.7%)    42.720(1.7%)    42.720(1.7%)    42.510(1.2%)    42.360(0.8%)
c12        6.067        5.355(–11.7%)   6.985(15.1%)    6.555(8.0%)     6.545(7.9%)     6.440(6.2%)     6.235(2.8%)
c22        13.500       12.980(–3.9%)   13.930(3.2%)    14.010(3.8%)    13.51(0.0%)     13.910(3.0%)    13.780(2.1%)
c23        7.277        7.995(9.9%)     7.565(4.0%)     7.943(9.1%)     7.455(2.5%)     7.760(6.6%)     7.640(4.9%)
c55        3.410        3.750(10.0%)    3.390(–0.6%)    3.550(4.1%)     3.575(4.9%)     3.573(4.9%)     3.578(4.9%)
Source: Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.
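The error figures quoted for Table 8.21 can be checked with a few lines of arithmetic. The sketch below assumes that the tabulated error is the signed relative deviation, (identified - original) / original, and uses the noise-free values at the sixth progression; it confirms that the largest deviation at that stage stays below 5%.

```python
# Original values and noise-free results at the sixth progression (Table 8.21), in GPa.
ORIGINAL = {"c11": 42.020, "c12": 6.067, "c22": 13.500, "c23": 7.277, "c55": 3.410}
SIXTH_NOISE_FREE = {"c11": 42.930, "c12": 6.275, "c22": 13.760, "c23": 7.535, "c55": 3.573}


def relative_errors(identified, original):
    """Signed relative error in percent (assumed definition of the tabulated error)."""
    return {k: 100.0 * (identified[k] - original[k]) / original[k] for k in original}


errors = relative_errors(SIXTH_NOISE_FREE, ORIGINAL)
max_error = max(abs(e) for e in errors.values())

for name, err in errors.items():
    print(f"{name}: {err:+.1f}%")
print(f"maximum absolute error: {max_error:.1f}%")
```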
TABLE 8.22 Identified Results of Elastic Constants of the [0/+45/–45/+60/–60]s Glass/Epoxy Laminate (Case 2)

Elastic    Original     Result (Error) at Progressions
Constants  Value (GPa)  1               2               3               4               5               6

(a) Noise Free
c11        50.00        51.48(3.0%)     50.430(0.9%)    49.022(2.0%)    48.990(2.0%)    49.680(0.6%)    50.040(0.0%)
c12        5.00         4.167(16.7%)    4.995(–0.1%)    4.560(8.8%)     4.612(7.8%)     4.650(7.0%)     4.843(3.2%)
c22        12.00        11.370(–5.3%)   13.000(8.3%)    12.370(3.1%)    12.070(0.6%)    12.011(0.1%)    12.256(2.1%)
c23        5.50         5.710(3.8%)     5.971(8.9%)     5.810(5.6%)     5.525(0.5%)     5.525(0.5%)     5.710(3.8%)
c55        2.50         2.803(12.1%)    2.420(3.2%)     2.513(0.5%)     2.440(2.4%)     2.518(0.7%)     2.528(1.1%)

(b) Noise Added (2%)
c11        50.00        51.300(2.6%)    50.220(0.4%)    48.810(2.4%)    48.810(2.4%)    49.590(1.2%)    50.04(0.0%)
c12        5.00         4.140(–17.2%)   4.935(1.3%)     4.510(9.8%)     4.585(8.3%)     4.635(6.2%)     4.805(3.9%)
c22        12.00        11.390(–5.1%)   13.050(8.8%)    12.321(2.7%)    12.100(0.8%)    12.070(3.0%)    12.260(2.2%)
c23        5.50         5.800(5.5%)     6.115(11.2%)    5.880(6.9%)     5.610(2.0%)     5.625(2.3%)     5.780(5.0%)
c55        2.50         2.815(12.6%)    2.438(2.5%)     2.525(1.0%)     2.453(1.9%)     2.525(1.0%)     2.358(1.5%)
Source: Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.
as 17.2%. Compared with the first case, the maximum error of the first identified elastic constants is larger, because the training samples were selected from a range of elastic constants centered on the first case. Although this set of samples is not the most suitable sample set for the second case, accurate results can still be obtained: after several progressive trainings, the sample density around the simulated measurement of displacement responses increases until the desired accuracy is reached. For the NN model with six progressions, the forward HNM solver is called only 42 times, compared with about 2000 calls when the genetic algorithm is used to solve the same problem (see Section 8.3). This NN model is therefore very efficient for the identification of elastic constants, and the advantage becomes even more significant when a single run of the forward solver requires a long CPU time. It should be noted that the training algorithm is still slow and computationally expensive even though it has been modified. Using modern network architectures and training algorithms may give further improvements in the presented NN procedures. As an example, Levin and Lieven (1998) have shown that the radial basis function (RBF) network with the orthogonal least squares algorithm can be trained significantly faster than a multilayer network with back-propagation. Also, the level of noise used in the inputs for the NN model is only 1%, not the 5% noise level added in the GA process. Noise at the 5% level has been examined for the NN model, but the true solution was not obtained. The presence of noise triggers the overfitting phenomenon and causes the procedure to break down. The use of more sampling points should effectively solve the problem, as shown in Section 11.7.3.
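The progressive retraining loop described in this section can be sketched as follows. This is a schematic illustration under stated assumptions, not the authors' implementation: `forward` stands in for the HNM solver, `train` stands in for NN training, and the "distance norm" used to pick the sample to replace is assumed to be the Euclidean distance between a sample's response and the measured response. A one-parameter least-squares line (`train_linear_inverse`) is used as a plainly named stand-in for the neural network so that the sketch runs without external libraries.

```python
import math


def distance(a, b):
    """Euclidean distance between two response vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def progressive_identify(forward, train, samples, measured, tol, max_progressions=6):
    """Identify parameters by repeated training, checking, and sample replacement.

    forward(params) -> response        (hypothetical forward solver)
    train(samples)  -> model, with model(response) -> params
    samples: list of (params, response) pairs, updated in place.
    """
    history = []
    for progression in range(1, max_progressions + 1):
        model = train(samples)
        params = model(measured)
        response = forward(params)  # one additional forward-solver call
        misfit = distance(response, measured)
        history.append((progression, params, misfit))
        if misfit <= tol:
            break
        # Replace the sample whose response lies farthest from the measurement.
        worst = max(range(len(samples)), key=lambda i: distance(samples[i][1], measured))
        samples[worst] = (params, response)
    return history


def train_linear_inverse(samples):
    """Stand-in for NN training: least-squares line from response[0] to the parameter."""
    r = [response[0] for _, response in samples]
    p = [params[0] for params, _ in samples]
    r_mean = sum(r) / len(r)
    p_mean = sum(p) / len(p)
    var = sum((ri - r_mean) ** 2 for ri in r)
    slope = sum((ri - r_mean) * (pi - p_mean) for ri, pi in zip(r, p)) / var
    return lambda response: [p_mean + slope * (response[0] - r_mean)]


def toy_forward(params):
    """Toy nonlinear forward model, for illustration only."""
    return [params[0] ** 2, params[0]]


initial = [([p], toy_forward([p])) for p in (1.0, 2.0, 3.0)]
measured = toy_forward([2.5])
history = progressive_identify(toy_forward, train_linear_inverse, initial, measured, tol=1e-3)
for progression, params, misfit in history:
    print(progression, params, misfit)
```

The size of the sample pool stays constant, and each progression costs exactly one extra forward-solver call. This is why the total number of HNM calls remains close to the size of the initial sample set.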
8.6.5 A More Complicated Case Study
The presented NN procedure is now extended to a more complicated case with more parameters to be identified. Consider the same laminate as that described in Section 8.6.1, but made of graphite/epoxy (Rokhlin and Wang, 1992). This material is orthotropic; its nine elastic constants are listed in the second column in Table 8.23. Thus, there are nine parameters, c11, c12, c22, c13, c23, c33, c44, c55, and c66, that need to be identified. The search range is also given in Table 8.23. Following the process described in Section 8.6.2 and Section 8.6.3, the inputs of the model training samples are obtained. The NN model used in this application has two hidden layers; the input layer, output layer, first hidden layer, and second hidden layer have 14, 9, 36, and 28 neurons, respectively. There are 36 training samples in total. Table 8.24 summarizes the identified results of the elastic constants for this complicated case, in which very accurate results are obtained after three progressions. As shown in Table 8.23, a search range of ±20% off from the actual value is used in this case, a range smaller than the range
TABLE 8.23 Search Range for Elastic Constants to be Identified for Graphite/Epoxy Laminate

Elastic Constants   Actual Data (GPa)   Search Range (GPa)
c11                 144.00              115.0–172.0
c12                 5.47                4.3–6.5
c22                 13.60               10.0–16.0
c13                 5.00                4.0–6.0
c23                 7.00                5.6–8.4
c33                 12.00               9.6–14.4
c44                 3.70                2.9–4.4
c55                 6.00                4.8–7.2
c66                 6.50                5.2–7.8
Source: Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.
TABLE 8.24 Identified Results of Elastic Constants of the [0/+45/–45/+60/–60]s Graphite/Epoxy Laminate

Elastic    Original     Result (Error) at Progressions
Constants  Value (GPa)  1                2                3
c11        144.00       149.762(4.0%)    145.295(0.9%)    144.010(0.0%)
c12        5.47         5.934(8.5%)      5.404(–1.2%)     5.306(–3.0%)
c22        13.60        12.988(–4.5%)    13.288(–2.3%)    13.437(–1.2%)
c13        5.00         5.3244(6.5%)     5.078(1.6%)      5.072(1.4%)
c23        7.00         6.607(–5.6%)     7.302(4.3%)      6.891(–1.6%)
c33        12.00        12.289(2.4%)     12.384(3.2%)     12.147(1.2%)
c44        3.70         3.570(–3.5%)     3.715(0.4%)      3.859(4.3%)
c55        6.00         5.543(–7.6%)     6.396(6.6%)      6.040(0.7%)
c66        6.50         6.096(–6.2%)     6.227(–4.2%)     6.543(0.7%)
Source: Liu, G.R. et al., J. Sound Vib., 252(2), 239, 2002. With permission.
of ±30% off from the actual value used in Section 8.6.1. In this study, a convergent solution cannot be obtained when the search range of ±30% is used. The accuracy of the characterized results does not always increase as the progression number increases; this is most evident for the elastic constants relating to shear deformation, which may be due to the strongly anisotropic material. In addition, accurate results cannot be obtained using the noise-contaminated displacement responses, even with a search range of ±20% off from the actual value, possibly because of the small number of sampling points used. For extension of the presented procedure to more complicated cases, with a wider range for the parameters and more parameters, the above procedure is directly applicable. However, the inputs, training samples, and structure of the NN model, as well as the learning algorithm, should be carefully considered based on the nature of the application. It is often a very challenging task to use an NN for inverse problems that have many parameters with wide ranges.
8.7 Remarks
• Remark 8.1: The HNM and FEM can compute the response of composite laminated structures very efficiently, which paves the way for the inverse procedures. Several computational inverse techniques are applied here for material characterization of composite laminated structures using the HNM and FEM as the forward solvers.
• Remark 8.2: The conventional GA and the µGA are straightforward and robust regardless of noise contamination, although their convergence is slow. The method combining the GA with the gradient-based optimization method has the global search capability of the genetic algorithm and the fast convergence of the gradient-based optimization method. The average number of forward solver calls is reduced to about one ninth of that required when using the GA alone. Comparing the GA and the NN models, the GA is more robust in practical application because it is more stable with respect to noise contamination in the input displacement response data.
• Remark 8.3: These computational inverse techniques for material characterization of composite laminated structures have several advantages. One is their feasibility for application to engineering problems: by simply applying an impact load on the outer surface of the structure and measuring the surface displacement responses, the material property of the structure can be determined. Another is their robustness in the presence of noise in the measured data, which is critical for practical detection.
• Remark 8.4: The ill-posedness of the problem can be partly overcome through the use of discrete sampling data, use of the proper forward solver, projection regularization, use of sensitive data, and filtering regularization.
9 Inverse Identification of Material Property of Functionally Graded Materials
This chapter discusses computational inverse techniques for material characterization of functionally graded materials (FGMs) from the dynamic displacement response recorded on the surface of FGM structures. The procedures used are the same as those described in Chapter 8 for layered structures. Because the material property of FGMs varies continuously along the thickness direction of the structures, three identification approaches (at discrete locations, through parameterized values, and through volume fractions) are suggested to reduce the number of parameters to be identified. The use of the rule of mixture can significantly reduce the number of parameters to be inversely identified; therefore, the volume fractions are mainly used as the parameters representing the material property of FGMs. The inverse identification is performed for plates and cylindrical shells using the gradient-based optimization method, GAs, and combined optimization methods, as well as the progressive NN.
9.1 Introduction
Functionally graded materials are relatively new engineered composites in which the proportions of the constituent materials vary continuously in space. This is achieved through the reinforcement of the matrix with inclusions of different properties, sizes, and shapes, as well as by interchanging the roles of the reinforcement and matrix phases in a continuous manner (Hirai, 1996; Schwartz, 1994). Figure 9.1 illustrates the characteristics of a typical FGM (Koizumi, 1997). To date, several main manufacturing processes have been developed for fabricating various types of FGMs, such as powder metallurgy (Watanabe and Kawasaki, 1990; Stoloff, 1999), plasma spraying (Shimoda et al., 1990), vapor deposition (Sasaki et al., 1989), self-propagating high-temperature synthesis (Yanagisawa et al., 1990), thin film lamination (Takemura et al., 1990), sintering synthesis (Takahashi et al., 1990; Yamaoka et al., 1993), and laser cladding (Pei et al., 2000).
FIGURE 9.1 Composition of a typical functionally graded material (FGM). The material property of FGM varies continuously in the thickness direction. (From Koizumi, K., Composites Part B, 28(1–2), 1–4, 1997. With permission.)
Continuous changes in their microstructure, and hence their properties, distinguish FGMs from conventional composite materials in three ways (Schwartz, 1994; Hirai, 1996):
• FGMs can be designed with a capability to endure steep temperature gradients and protracted exposure to high temperature.
• FGMs can reduce the drastic mismatch in material property observed between differently oriented, adjacent plies in an anisotropic laminated structure.
• FGMs enhance the fracture toughness of ceramic matrix composites through a tailored interface and the introduction of a second phase that creates compressive stress fields in critical and crack-prone regions.
The gradual change of material property can be tailored to different applications and working environments. Major applications of FGMs have been in high-temperature environments, from advanced aircraft and aerospace vehicles to computer circuit boards. For example, a high-speed aircraft traveling at Mach 2.4 will experience a temperature in the vicinity of 175°C at the leading edge and skin of the structure, and a temperature of about 1600°C at the engine (Nadeau and Ferrari, 1999). To address the structural and thermal issues associated with high-speed aircraft, an FGM has been proposed that uses heat-resistant ceramics on the high-temperature side and tough metals with high thermal conductivity on the low-temperature side, with a gradual compositional variation from ceramic to metal. FGMs can be used not only in thermal-protection systems but also in many other applications. Many more applications of FGMs have been reported (Shiota and Miyamoto, 1997; Ilschner and Cherradi, 1995; Gasik, 1996; Matsumura et al., 1993), including the recent focus on solar energy conversion devices (Koizumi, 1997), dental implants (Watari et al., 1997), and naturally occurring biological FGMs (Nogata and Takahashi, 1995; Amada et al., 1997).
With the gradually increasing applications of FGMs, there is a need for a reliable method to measure the material property of FGMs nondestructively. It is important to have an efficient and effective technique to provide input data of material property for FGM design, and even more important to have a technique to evaluate FGMs after fabrication and in service. This will enable verification of whether the actual material property matches the design requirement, to ensure safety and quality. Utilizing elastic wave fields in structures together with computational inverse procedures is promising for characterizing FGMs because reasonably good forward solvers are available for analyzing elastic waves that propagate in FGM plates (Liu and Tani, 1990; Liu et al., 1991c, 1999, 2001a; Han et al., 2000, 2001a) and FGM cylinders (Han et al., 2001b, 2002a). These numerical models and forward solvers provide the relationship between the material property and the dynamic displacement responses for FGMs. In this chapter, several computational inverse techniques based on these forward solvers are introduced to characterize the material property of FGMs, using the dynamic displacement responses on the surface of the FGM structures.
9.2 Statement of the Problem
The goal is to inversely determine the material property of FGMs from the measured displacement responses on the surface of the structures. As in Chapter 8, the measured responses are simulated by computer-generated displacement responses using the actual material property. The incident excitation wave to the FGM structure is assumed to be a line load acting at x = 0 on the upper/outer surface, as given by Equation 8.2. The receiving point can be arbitrarily chosen on the surface of the FGM structures away from the excitation point by a distance of several thicknesses of the structures.
9.3 Rule of Mixture
The material property of the FGM can be obtained using methods of the rule of mixture derived from micromechanics, using the material property of the matrix and inclusion for a given volume fraction. Here, the step-by-step (SBS) method proposed by Liu (1984, 1998) is used for predicting the material property of FGM for given volume fractions. In the SBS method, the composite material is composed through a hypothetical process in which one component is treated as the matrix and the other as the inclusion; the inclusion is mixed into the matrix in small increments, one step at a time. The material property of the composite is obtained step by step using the cylinder model and the sphere model (Christensen, 1979) for fiber- and particle-reinforced materials, respectively. The SBS procedure is briefly described next.

Consider a composite made of a particle mixture with a volume fraction of $V_P$ for the inclusion. A sufficiently small volume fraction added at one step is denoted by $V_{P1}$. Starting from the matrix values $K_0 = K_m$ and $G_0 = G_m$, the property obtained after n steps of mixing is the final property of the composite (Liu, 1984, 1998):

$$K = K_n = K_{n-1} + \frac{(3K_{n-1} + 4G_{n-1})(K_P - K_{n-1})V_{P1}}{(3K_{n-1} + 4G_{n-1}) + 3(K_P - K_{n-1})} \tag{9.1}$$

$$G = G_n = G_{n-1} + \frac{5G_{n-1}(3K_{n-1} + 4G_{n-1})(G_P - G_{n-1})V_{P1}}{9K_{n-1}G_{n-1} + 8G_{n-1}^2 + 6(K_{n-1} + 2G_{n-1})G_P} \tag{9.2}$$

$$E = \frac{9KG}{3K + G} \tag{9.3}$$

$$\nu = \frac{E}{2G} - 1 \tag{9.4}$$

where K, G, E, and ν are the bulk modulus, shear modulus, Young's modulus, and Poisson's ratio, respectively. The material property of the inclusion is denoted with subscript P, and n is given by

$$n = \frac{\ln(1 - V_P)}{\ln(1 - V_{P1})} \tag{9.5}$$

Furthermore, the mass density is obtained as

$$\rho = \rho_m + (\rho_P - \rho_m)V_P \tag{9.6}$$

where $\rho_m$ is the mass density of the matrix. As discussed in Chapter 3, reduction of the number of parameters to be inversely determined is always important before applying any inverse procedure. The use of the rule of mixture can reduce the number of unknowns significantly, as will be shown in the following examples; therefore, it should be used wherever possible.
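The SBS recursion of Equation 9.1 through Equation 9.6 can be sketched in a few lines. This is a minimal illustration and not the authors' code. The shear update is written in the dimensionally consistent dilute-suspension form, 5G(3K + 4G)(G_P - G)V_P1 divided by the denominator of Equation 9.2, which is an assumption about the intended form of that equation. The step size `V_p1`, the rounding of n to an integer, and all function names are likewise assumptions made for the example.

```python
import math


def moduli_from_E_nu(E, nu):
    """Bulk and shear moduli of an isotropic material from E and Poisson's ratio."""
    return E / (3.0 * (1.0 - 2.0 * nu)), E / (2.0 * (1.0 + nu))


def sbs_properties(K_m, G_m, K_p, G_p, V_p, V_p1=1.0e-3):
    """Step-by-step mixing of particles (subscript p) into a matrix (subscript m).

    Returns (K, G, E, nu) of the mixture for inclusion volume fraction V_p.
    """
    if not 0.0 <= V_p < 1.0:
        raise ValueError("V_p must satisfy 0 <= V_p < 1")
    # Number of mixing steps from Eq. 9.5, rounded to the nearest integer.
    n = int(round(math.log(1.0 - V_p) / math.log(1.0 - V_p1)))
    K, G = K_m, G_m
    for _ in range(n):
        dK = (3 * K + 4 * G) * (K_p - K) * V_p1 / ((3 * K + 4 * G) + 3 * (K_p - K))
        dG = (5 * G * (3 * K + 4 * G) * (G_p - G) * V_p1
              / (9 * K * G + 8 * G * G + 6 * (K + 2 * G) * G_p))
        K, G = K + dK, G + dG
    E = 9 * K * G / (3 * K + G)   # Eq. 9.3
    nu = E / (2 * G) - 1          # Eq. 9.4
    return K, G, E, nu


def mixture_density(rho_m, rho_p, V_p):
    """Linear rule of mixture for the mass density, Eq. 9.6."""
    return rho_m + (rho_p - rho_m) * V_p
```

As a usage example, the monolith data quoted later in Table 9.3 (E = 28 GPa for C and E = 320 GPa for SiC, both with ν = 0.3) can be converted with `moduli_from_E_nu` and passed to `sbs_properties`. A single volume fraction then determines K, G, E, ν, and ρ at a point, which is how the rule of mixture reduces the number of unknowns.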
9.4 Use of Gradient-Based Optimization Methods
The inverse procedure can be formulated in exactly the same way as defined in Equation 8.1 for the characterization of the material property of FGMs. The modified Levenberg–Marquardt algorithm (Chapter 5) for solving nonlinear least squares problems is employed as the inverse operator to determine the distribution of the material property in FGMs. The IMSL subroutine BCLSF, which is based on the modified Levenberg–Marquardt method and evaluates the Jacobian by finite differences, is used directly for the inverse analysis. The following examples demonstrate this procedure.
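The idea behind this class of solver can be illustrated with a small, self-contained sketch. The code below is not BCLSF and does not reproduce its interface; it is a basic bounded, damped Gauss-Newton (Levenberg–Marquardt type) iteration with a forward-difference Jacobian, written only to show the mechanics. The toy forward model is a quadratic profile evaluated at six positions and stands in for the much more expensive wave-propagation solver. All names and tuning constants are assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def levenberg_marquardt(residual, x0, lower, upper, max_iter=100, tol=1e-12):
    """Bounded damped Gauss-Newton iteration with a forward-difference Jacobian."""
    def clip(x):
        return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

    def cost_of(r):
        return sum(ri * ri for ri in r)

    x = clip(list(x0))
    r = residual(x)
    cost = cost_of(r)
    lam = 1e-3  # damping parameter
    n, m = len(x), len(r)
    for _ in range(max_iter):
        if cost < tol:
            break
        # Forward-difference Jacobian, stored column by column.
        cols = []
        for j in range(n):
            step = 1e-6 * max(1.0, abs(x[j]))
            xp = x[:]
            xp[j] += step
            rp = residual(xp)
            cols.append([(rp[k] - r[k]) / step for k in range(m)])
        JtJ = [[sum(cols[i][k] * cols[j][k] for k in range(m)) for j in range(n)]
               for i in range(n)]
        grad = [sum(cols[i][k] * r[k] for k in range(m)) for i in range(n)]
        improved = False
        while lam < 1e12:  # raise the damping until the cost decreases
            A = [[JtJ[i][j] + (lam * JtJ[i][i] if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
            delta = solve_linear(A, [-g for g in grad])
            x_try = clip([xi + di for xi, di in zip(x, delta)])
            r_try = residual(x_try)
            if cost_of(r_try) < cost:
                x, r, cost = x_try, r_try, cost_of(r_try)
                lam = max(lam / 10.0, 1e-9)
                improved = True
                break
            lam *= 10.0
        if not improved:
            break
    return x, cost


# Toy problem: recover three coefficients of a quadratic profile from six samples.
Z_NODES = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
TRUE = (62.17, 20.0, 60.0)


def toy_forward(p):
    """Stand-in for the forward solver: a quadratic profile sampled at Z_NODES."""
    a, b, c = p
    return [a + b * z + c * z * z for z in Z_NODES]


MEASURED = toy_forward(TRUE)


def residual(p):
    return [s - m for s, m in zip(toy_forward(p), MEASURED)]


x0 = [1.2 * v for v in TRUE]      # initial guess 20% above the true values
lower = [0.5 * v for v in TRUE]   # illustrative bounds
upper = [1.5 * v for v in TRUE]
solution, final_cost = levenberg_marquardt(residual, x0, lower, upper)
print(solution, final_cost)
```

In a real application the residual would compare computed and measured displacement responses, so each Jacobian column costs one additional call of the forward solver. This is one reason why keeping the number of unknown parameters small matters so much.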
9.4.1 Example 1: Transversely Isotropic FGM Plate
For FGM plates, the modified HNM (Liu et al., 2001a), which accounts for the material property variation even at the element level, is employed as the forward solver to calculate wave responses in terms of displacement. Consider an FGM plate with an overall thickness of H, as shown in Figure 9.2. The HNM was modified to accommodate a linear variation of material properties within an element in the thickness direction (z direction):

$$(c_{ij})_n = \left[(c_{ij}^u)_n - (c_{ij}^l)_n\right]\frac{z}{h_n} + (c_{ij}^l)_n \tag{9.7}$$

$$\rho_n = \left(\rho_n^u - \rho_n^l\right)\frac{z}{h_n} + \rho_n^l \tag{9.8}$$

where z is measured from the lower surface of the element, $h_n$ is the thickness of the nth element, and $\mathbf{c}_n^l = (c_{ij}^l)_n$, $\rho_n^l$ and $\mathbf{c}_n^u = (c_{ij}^u)_n$, $\rho_n^u$ ($i, j = 1, \ldots, 6$) are the elastic coefficient matrix and the mass density on the lower and upper surfaces of the nth element, respectively. The use of inhomogeneous elements can reduce the number of elements needed to model the material variation of FGM. This is especially important for inverse problems because it leads to a reduction of the parameters to be inversely determined (see Chapter 3). It has also been confirmed by Liu et al. (1999) that using inhomogeneous elements for FGM plates can provide more accurate results than using homogeneous layered elements. This is also important for inverse analysis because it leads to more accurate evaluation of the error function defined by Equation 8.1. Consider now a transversely isotropic FGM plate; transversely isotropic materials have five independent engineering constants. Assume that (Liu et al., 2001a)

$$E_1 = A_{E1} + B_{E1}z + C_{E1}z^2 \tag{9.9}$$
FIGURE 9.2 An FGM plate of thickness H, with axes x (u), y (v), and z (w), is divided into a number of layered elements in the thickness direction. The nth element has thickness h_n, with material properties (c_ij^u)_n, ρ_n^u on its upper surface and (c_ij^l)_n, ρ_n^l on its lower surface. The material property in each layered element is assumed as a linear function in the thickness direction. This reduces the number of elements needed to model the material variation of FGM and also provides more accurate results than a homogeneous layered element.
and the other four elastic constants and the mass density are constants. The values of these material constants can be found in Table 8.3. In this example, $A_{E1} = 62.17$, $B_{E1} = 20.0$, and $C_{E1} = 60.0$ are set. This assumption is hypothetical, but serves the purpose of testing the inverse procedure.

9.4.1.1 Approach I: Identification of Material Property at Discrete Locations
If the value of E1 at each element surface is found, the variation of E1 through the thickness of each element follows from the assumed linear variation; thus, all the material constants can be determined using Equation 9.7 and Equation 9.8. If the plate is divided into five elements, there are six discrete values of E1: four at the element interfaces and two at the upper and lower surfaces of the plate. Therefore, six parameters are to be identified. Performing the inverse procedure using BCLSF, the distribution of the material property E1 can be determined; the result is very accurate, as shown in Table 9.1.
TABLE 9.1 Identification Results of Discrete Values of E1 in Thickness Direction of a Transversely Isotropic FGM Plate

Position  Original   Initial Guess                Results (Error)
z         Value E1   +10%     +20%     –20%       +10%             +20%             –20%
0.0       62.17      68.39    74.60    49.74      62.12 (0%)       62.12 (0%)       62.13 (0%)
0.2       68.57      75.43    82.28    54.86      68.59 (0%)       68.59 (0%)       68.59 (0%)
0.4       79.77      87.75    95.72    63.82      79.78 (0%)       79.77 (0%)       79.77 (0%)
0.6       95.77      105.35   114.92   76.62      95.67 (–0.1%)    95.68 (–0.1%)    95.68 (–0.1%)
0.8       116.57     128.23   139.88   93.26      116.61 (0%)      116.60 (0%)      116.60 (0%)
1.0       142.17     156.39   170.60   113.74     142.24 (0.05%)   142.26 (0.05%)   142.24 (0.05%)
Source: Liu, G.R. et al., J. Composites Mater., 35(11), 954–971, 2001. With permission.
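The original values of E1 in Table 9.1 follow directly from Equation 9.9 with the coefficients of this example, evaluated at the six element surfaces of a five-element plate. The sketch below reproduces them and also shows the linear interpolation inside an element in the spirit of Equation 9.7. A unit plate thickness and equal element thicknesses are assumed, and the function names are illustrative.

```python
A_E1, B_E1, C_E1 = 62.17, 20.0, 60.0  # coefficients used in this example


def e1_quadratic(z):
    """Quadratic through-thickness profile of E1 (Eq. 9.9)."""
    return A_E1 + B_E1 * z + C_E1 * z * z


def nodal_values(n_elements=5, thickness=1.0):
    """E1 at the element surfaces: the unknowns of Approach I."""
    h = thickness / n_elements
    return [(i * h, e1_quadratic(i * h)) for i in range(n_elements + 1)]


def e1_in_element(z, nodes):
    """Linear interpolation between the lower and upper surface values of an element."""
    for (z_l, e_l), (z_u, e_u) in zip(nodes, nodes[1:]):
        if z_l <= z <= z_u:
            return (e_u - e_l) * (z - z_l) / (z_u - z_l) + e_l
    raise ValueError("z lies outside the plate thickness")


nodes = nodal_values()
for z, e1 in nodes:
    print(f"z = {z:.1f}: E1 = {e1:.2f}")
```

Five elements give six nodal unknowns. Doubling the number of elements would raise this to eleven, which is the growth in unknowns that motivates the parameterized approach.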
In this approach, the number of parameters in the inverse procedure grows with the number of elements. In many cases, the number of elements required is determined by the frequency content of the excitation or wave source (Liu and Xi, 2001). The general rule is that the element thickness should be smaller than one quarter of the wavelength at the highest frequency of the excitation. Therefore, if the frequency of excitation is very high, a very large number of elements must be used. In this case, the inverse procedure may fail because too many parameters must be identified, and an alternative approach is required.

9.4.1.2 Approach II: Identification of Parameterized Values

Note that the gradient of the material properties does not change very drastically within the FGM plate; in many cases, the variation of the material properties can be expressed as a quadratic function of the thickness coordinate, as defined in Equation 9.9. The use of Equation 9.9 parameterizes E1 and further reduces the number of parameters that need to be inversely identified. In this particular case, only three parameters, $A_{E1}$, $B_{E1}$, and $C_{E1}$, need to be identified for each property, regardless of the number of elements used in the forward solver. Now consider the problem of identifying the three parameters $A_{E1}$, $B_{E1}$, and $C_{E1}$. For this problem, the plate is again divided into five elements. The material properties of each layer element can be calculated using Equation 9.9 and Table 8.3. The inverse solution has been obtained using BCLSF and is shown in Table 9.2. The identification gives excellent results in terms of accuracy.
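The element-count rule quoted under Approach I, and the resulting difference in the number of unknowns between the two approaches, can be illustrated with a short calculation. The sketch assumes equal-thickness elements and takes the wavelength as wave speed divided by frequency; which wave speed governs (typically the slowest one) is left to the user. The function names are illustrative.

```python
import math


def min_elements(thickness, wave_speed, f_max):
    """Smallest number of equal elements thinner than a quarter wavelength at f_max."""
    quarter_wavelength = wave_speed / f_max / 4.0
    return math.floor(thickness / quarter_wavelength) + 1


def unknowns_approach_1(n_elements):
    """Approach I: one unknown per element surface for each graded property."""
    return n_elements + 1


def unknowns_approach_2():
    """Approach II: three quadratic coefficients for each graded property."""
    return 3


for f_max in (2.0, 20.0):  # illustrative, nondimensional frequencies
    n = min_elements(thickness=1.0, wave_speed=3.0, f_max=f_max)
    print(f_max, n, unknowns_approach_1(n), unknowns_approach_2())
```

As the highest frequency rises, the number of unknowns in Approach I grows with the element count, while it stays at three in Approach II.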
TABLE 9.2 Identified Results of E1 as a Function of Thickness in a Transversely Isotropic FGM Plate

Coefficients of     Original   Initial Guess              Results (Error)
Material Property   Value      +10%    +20%    –20%       +10%              +20%             –20%
in Equation 9.9
AE1                 62.17      68.39   74.60   49.74      62.16 (–0.02%)    62.17 (0%)       62.18 (0.02%)
BE1                 20.00      22.00   24.00   16.00      20.13 (0.65%)     19.98 (–0.1%)    19.92 (0.4%)
CE1                 60.00      66.00   72.00   48.00      59.77 (–0.4%)     59.93 (0.1%)     59.99 (0.02%)
Source: Liu, G.R. et al., J. Composites Mater., 35(11), 954–971, 2001. With permission.
TABLE 9.3 Material Properties of SiC and C Monolith

Material   E (GPa)   ν     ρ (g/cm3)
SiC        320       0.3   3.22
C          28        0.3   1.78
Source: Sasaki, M. et al., J. Ceramic Soc. Jpn., 95(5), 539, 1989.
9.4.2 Example 2: SiC-C FGM Plate
The nonlinear LSM is now applied to the actual SiC-C plate. The SiC-C material is developed by combining SiC and C using a chemical vapor deposition (CVD) technique (Sasaki et al., 1989). The material properties of the SiC and C monoliths are given in Table 9.3. Young’s modulus E, shear modulus G, and Poisson’s ratio v of the SiC-C FGM plate are obtained using the method given by Kerner (1956) and shown in Figure 9.3. The distribution of the content of SiC and C plotted in this figure is obtained by

$$\rho = \rho_c v_c + \rho_{sic} v_{sic} \qquad (9.10)$$
where ρc and vc are the density and the volume fraction of the C monolith, respectively, and ρsic and vsic are those of the SiC monolith. The true material properties are shown in Table 9.4. These values are obtained using Kerner’s simple formula. More sophisticated expressions for the rule of mixture of composite materials can be found in the textbook by Tsai and Hahn (1980).

9.4.2.1 Identification of Parameterized Values
The SiC–C FGM plate is divided into five layer elements, each modeled as an isotropic material for which the constitutive equations involve three independent material properties (E, ρ, v). Table 9.4 shows that the Poisson’s
[Figure 9.3: plot of the dimensionless moduli (E, G), Poisson’s ratio v, dimensionless density ρ, and SiC content against the thickness coordinate z, from 0 to 1.]
FIGURE 9.3 Distribution of material properties along the thickness direction of an SiC–C FGM plate. Young’s modulus E, shear modulus G, Poisson’s ratio v, and mass density of the SiC-C FGM plate are obtained using the method given by Kerner (1956). This figure shows that the material properties of the SiC-C FGM vary continuously along the thickness of the plate. (From Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.)
TABLE 9.4 True Material Properties of SiC-C FGM Plate

z      ρ (g/cm3)   E (GPa)      v
0.00   1.78000     25.32438     0.30
0.10   1.79424     28.11006     0.30
0.20   1.83696     36.46710     0.30
0.30   1.90816     50.39551     0.30
0.40   2.00784     69.89528     0.30
0.50   2.13600     94.96641     0.30
0.60   2.29264     125.60890    0.30
0.70   2.47776     161.82276    0.30
0.80   2.69136     203.60798    0.30
0.90   2.93344     250.96456    0.30
1.00   3.20400     303.89251    0.30

Source: Liu, G.R. et al., J. Composites Mater., 35(11), 954–971, 2001. With permission.
ratio can be taken as a constant in each layer element. The Young’s modulus and mass density can be approximated by quadratic functions of thickness coordinate using the data set in Table 9.4 and the least squares method. The functions are obtained as (Liu et al., 2001a)
$$E = A_E + B_E z + C_E z^2 \qquad (9.11)$$

$$\rho = A_\rho + B_\rho z + C_\rho z^2 \qquad (9.12)$$
where AE = 2.62, BE = 0.00, CE = 28.78; Aρ = 1.80, Bρ = 0.01, Cρ = 1.42. Here E in Equation 9.11 is the dimensionless Young’s modulus, that is, the modulus listed in Table 9.4 divided by const = 9.68 GPa. The task of the inverse analysis is thus to identify the six parameters AE, BE, CE, Aρ, Bρ, and Cρ using the transient wave response and the modified HNM solver. Following the procedure described in Section 9.4.1.2, E and ρ can be identified as functions of the thickness coordinate, as shown in Table 9.5. The identified coefficients agree with the true values to within 0.06%.

9.4.2.2 Approach III: Identification of Volume Fractions
The preceding problems are based on some assumptions about the material properties of FGMs. In fact, FGMs are usually microscopically heterogeneous and are typically made from two isotropic components, such as metals and ceramics. The material properties of the FGM can be obtained using rule-of-mixture methods derived from micromechanics, from the material properties of the matrix and inclusion for given volume fractions. Because the material properties of the components are usually available, the volume fraction and its variation in the thickness direction of FGM structures are the key to characterizing the material properties of the FGM. As long as the volume fraction is known, the material properties can be obtained easily using Equation 9.1 to Equation 9.6. Therefore, the characterization of the material properties of an FGM is equivalent to the characterization of its volume fractions. If the structure is divided into m layered elements in the thickness direction, then a total of m + 1 discrete values of volume fraction are defined at the element interfaces and the surfaces of the structure. Compared to the number of unknown material constants, this approach leads to a significant reduction in the number of parameters to be inversely identified. The task now is to inversely characterize the volume fractions of FGMs using the GA and NN techniques.
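The least-squares fit of Section 9.4.2.1 can be checked with a short sketch. It fits a quadratic to the Young’s modulus column of Table 9.4 after dividing by const = 9.68 GPa; the use of `numpy.polyfit` is a choice made here for illustration and is not necessarily how the coefficients were originally computed.

```python
import numpy as np

# Thickness coordinate and Young's modulus (GPa) from Table 9.4
z = np.linspace(0.0, 1.0, 11)
E_gpa = np.array([25.32438, 28.11006, 36.46710, 50.39551, 69.89528,
                  94.96641, 125.60890, 161.82276, 203.60798,
                  250.96456, 303.89251])
CONST_GPA = 9.68  # normalizing constant quoted with Equation 9.11

# np.polyfit returns coefficients from the highest power down: [C, B, A]
C_E, B_E, A_E = np.polyfit(z, E_gpa / CONST_GPA, 2)
print(f"A_E = {A_E:.3f}, B_E = {B_E:.3f}, C_E = {C_E:.3f}")
```

The fitted coefficients round to the values quoted for Equation 9.11, which supports reading E there as a dimensionless modulus.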
9.5 Use of Uniform µGA
The uniform µGA is employed as the inverse procedure to determine the material property of FGM structures based on the displacement response on the surface of the structures. The process of parameter coding follows exactly as described in Section 9.3.2, but the volume fractions of FGM replace the material constants of laminated structures. Forward solvers used here are the modified HNM for FGM plates (Liu et al., 2001a) and numerical analytical method for FGM cylinders (Han et al., 2001b). The volume fractions of two different FGMs are identified. The first is the SiC-C FGM examined in Section 9.4.2; the other is the SS-SN that is composed
TABLE 9.5 Identified Results of Distribution of Young’s Modulus E and Mass Density ρ of SiC-C FGM Plate

Initial Guess
Coefficient           True    +10%    +20%    –20%    +40%    –40%
(Eqs. 9.11, 9.12)     Value
AE                    2.62    2.88    3.14    2.09    3.67    1.57
BE                    0.00    0.00    0.00    0.00    0.00    0.00
CE                    28.78   31.65   34.53   23.02   40.29   17.27
Aρ                    1.80    1.98    2.16    1.44    1.98    1.08
Bρ                    0.01    0.01    0.01    0.01    0.014   0.006
Cρ                    1.42    1.56    1.70    1.13    1.99    0.85

Identified Results (Error)
Coefficient   +10%            +20%            –20%            +40%            –40%
AE            2.62 (0%)       2.62 (0%)       2.62 (0%)       2.62 (0%)       2.62 (0%)
BE            0.00 (0%)       0.00 (0%)       0.00 (0%)       0.00 (0%)       0.00 (0%)
CE            28.77 (0.03%)   28.77 (0.03%)   28.77 (0.03%)   28.76 (0.06%)   28.77 (0.03%)
Aρ            1.80 (0%)       1.80 (0%)       1.80 (0%)       1.80 (0%)       1.79 (0.05%)
Bρ            0.01 (0%)       0.01 (0%)       0.01 (0%)       0.01 (0%)       0.01 (0%)
Cρ            1.42 (0%)       1.42 (0%)       1.42 (0%)       1.42 (0%)       1.42 (0%)

Source: Liu, G.R. et al., J. Composites Mater., 35(11), 954–971, 2001. With permission.
TABLE 9.6 Material Properties of Stainless Steel and Silicon Nitride (a)

Material          E (GPa)   v        ρ (kg/m3)
Stainless steel   207.82    0.3177   8166
Silicon nitride   322.4     0.24     2370

(a) Touloukian, Y.S., Thermophysical Properties of High Temperature Solid Materials, Macmillan, New York, 1967.
of stainless steel and silicon nitride. The material properties of stainless steel and silicon nitride are listed in Table 9.6 (Touloukian, 1967). In the SiC-C FGM, C is considered the inclusion material; in the SS-SN FGM, silicon nitride is considered the inclusion material.
9.5.1 Material Characterization of FGM Plate
In considering the SiC-C FGM plate, the following issues will be discussed:
• Parameters used in the uniform µGA
• Search range of the parameters
• Performance of the characterization
• Sensitivity and stability to noise contamination
The SiC-C FGM plate is divided into five elements in the thickness direction. Figure 9.4 gives a schematic view of this division. Six discrete volume fractions of the inclusion (carbon material) at the nodal lines of the entire plate need to be identified. The actual values of the volume fractions are listed in the third column of Table 9.7. The displacement responses in the x direction at the receiving point x = 2H are computed using the modified HNM and shown in Figure 9.5, where the time range selected for the inverse operation is marked. The selection of the time range from the wave-response curve is based on a sensitivity study, to ensure that the wave response is sensitive to the parameters to be identified. The results are obtained using the modified HNM, and the dimensionless parameters defined by Liu et al. (2001a) are used. These computer-generated displacement responses are considered noise-free data and are used to identify the volume fractions of the SiC-C FGM plate. For noise-contaminated cases, the inputs can be simulated by directly adding Gaussian noise generated from Equation 2.130 to the computer-generated noise-free inputs.

[Figure 9.4: schematic of the plate with volume fractions v1 to v6 on the nodal lines.]
FIGURE 9.4 The schematic view of the discrete volume fractions on the nodal lines. The plate is divided into five layered elements; only six volume fractions need to be identified for the whole structure.

TABLE 9.7 Nine Uniform µGA Runs for the Identification of Volume Fractions of Carbon in the SiC–C FGM Plate

Position   Volume     Original   Results Obtained Using Different Negative Number Initialization in µGA      Mean (Error)
           Fraction   Data       –1000   –2000   –3000   –4000   –5000   –6000   –7000   –8000   –9000
z = 0.0    v1         1.000      1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000       1.000 (0%)
z = 0.2    v2         0.961      0.964   0.996   0.962   0.953   0.971   0.961   0.968   0.952   0.965       0.965 (0.4%)
z = 0.4    v3         0.842      0.812   0.824   0.799   0.900   0.769   0.799   0.822   0.899   0.825       0.828 (1.7%)
z = 0.6    v4         0.634      0.653   0.637   0.601   0.556   0.627   0.698   0.645   0.533   0.630       0.620 (2.2%)
z = 0.8    v5         0.367      0.343   0.347   0.356   0.386   0.334   0.312   0.331   0.389   0.345       0.349 (4.9%)
z = 1.0    v6         0.000      0.008   0.005   0.002   0.003   0.005   0.007   0.001   0.003   0.008       0.005 (–)

Note: The material properties can be computed easily using the volume fraction and the rule of mixture (e.g., Liu, 1984, 1998).

[Figure 9.5: dimensionless displacement in the x direction against dimensionless time (0 to 10), with the selected time range marked.]
FIGURE 9.5 Displacement response at x = 2.0H on the upper surface of an SiC-C FGM plate excited by a line load at x = 0.0 with a time history of one cycle of a sine function. The time range selected for the inverse operation is designated based on the sensitivity analysis.

9.5.1.1 Parameters Used in the Uniform µGA
Parameters such as population size, chromosome string length, and probability of crossover are chosen based on the work of others and of the authors. As recommended by Krishnakumar (1989), the population size of each generation is set to 5; tournament selection and elitism are used. Because the probability of uniform crossover is one of the most important parameters in the uniform µGA, it has been studied through numerical experiments on material characterization of the SiC–C FGM plate (Han and Liu, 2002d). The influence of the probability of uniform crossover, pcross, on the convergence of the uniform µGA is shown in Figure 9.6. It appears that the probability of uniform crossover controls the convergence speed of the solution. The detailed results of the numerical tests are shown in Table 9.8.
• For a smaller value of crossover probability, e.g., pcross = 0.1, the convergence is slow.
• For a larger value of crossover probability, e.g., pcross = 0.9, the convergence is fast initially but slower at the later stage.
[Figure 9.6: err(p) (× 10^-3) against the number of generations (0 to 500) for pcross = 0.1, 0.3, 0.5, 0.6, and 0.9.]
FIGURE 9.6 Influence of the probability of the uniform crossover on the performance of the uniform µGA. The SiC-C FGM plate is used for this examination. pcross is the value of the probability of the uniform crossover. For a smaller value of crossover probability, pcross = 0.1, the convergence is slow. For a larger value of crossover probability, pcross = 0.9, the convergence is fast initially but slower at the later stage. Compared to the case of pcross = 0.6, the convergence for pcross = 0.5 is faster at the later stage. (From Han, X. and Liu, G.R., AIAA J., 41(2), 288–295, 2003. With permission.)
TABLE 9.8 Detailed Results of Influence of pcross in Uniform µGA

pcross   Generation Number   Fitness Value   Maximum Error
0.1      493                 1.5 × 10^–2     11.5%
0.3      496                 5.0 × 10^–3     7.5%
0.5      326                 2.0 × 10^–3     4.5%
0.6      367                 2.0 × 10^–3     3.8%
0.9      448                 1.0 × 10^–2     8.6%

Source: Han, X. and Liu, G.R., AIAA J., 41(2), 288–295, 2003. With permission.
• Compared to the case of pcross = 0.6, the convergence for pcross = 0.5 is faster. The final value of the fitness function is the same for pcross = 0.5 and pcross = 0.6, and smaller than in all other cases.
• From the maximum deviation of an individual parameter shown in Table 9.8, the result is more accurate when pcross = 0.6.
Therefore, pcross = 0.6 is used in this example. This value is particularly applicable to the problems studied in this work; it can also be used as a reference for other types of problems.
9.5.1.2 Test of GAs’ Performance
Because genetic algorithms use random numbers, the result of a single GA run can be a matter of chance. To examine the performance of the present uniform µGA reliably, nine uniform µGA runs were made, each with a different random number series generated by Knuth’s algorithm, to characterize the material properties of this SiC-C FGM plate. The results are shown in Table 9.7: the means of the nine runs are within 5% of the original values for v1 to v5.
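The run-averaging behind the last column of Table 9.7 can be reproduced with a few lines. The sketch below takes the nine run results from the table and computes the mean and the relative error of the mean for each volume fraction; the variable names are illustrative, and the computed means may differ from the tabulated ones in the last digit.

```python
# Results of the nine uniform µGA runs from Table 9.7 (rows: runs, columns: v1..v6)
runs = [
    [1.000, 0.964, 0.812, 0.653, 0.343, 0.008],
    [1.000, 0.996, 0.824, 0.637, 0.347, 0.005],
    [1.000, 0.962, 0.799, 0.601, 0.356, 0.002],
    [1.000, 0.953, 0.900, 0.556, 0.386, 0.003],
    [1.000, 0.971, 0.769, 0.627, 0.334, 0.005],
    [1.000, 0.961, 0.799, 0.698, 0.312, 0.007],
    [1.000, 0.968, 0.822, 0.645, 0.331, 0.001],
    [1.000, 0.952, 0.899, 0.533, 0.389, 0.003],
    [1.000, 0.965, 0.825, 0.630, 0.345, 0.008],
]
original = [1.000, 0.961, 0.842, 0.634, 0.367, 0.000]

# Column-wise mean over the nine runs
means = [sum(col) / len(col) for col in zip(*runs)]

for k, (m, v) in enumerate(zip(means, original), start=1):
    # The relative error is undefined where the original value is zero (v6)
    err = f"{abs(m - v) / v * 100:.1f}%" if v else "n/a"
    print(f"v{k}: mean = {m:.3f}, error = {err}")
```

Any single run can be several percent off for an individual volume fraction (for example, v4 ranges from 0.533 to 0.698), which is why the mean over runs is the more informative measure.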
9.5.1.3 Search Range
Bounds on the parameters are required to define a finite search space for the uniform µGA. In engineering practice, a narrower range is always preferred for accuracy in the inverse solution and for computational efficiency. The volume fraction is usually controlled in the process of fabricating an FGM; therefore, a rough distribution of the volume fraction is known when the FGM is fabricated, and a range for the distribution of the volume fractions can be determined. In this work, ranges from ±20% to ±50% off from the actual value of the volume fraction are assumed. To investigate the sensitivity to noise and the stability of the inverse procedure, Gaussian noise with noise levels of 2, 5, and 10% is considered. Table 9.9 summarizes the inversely characterized volume fractions of carbon in the SiC–C FGM plate and shows that
• The accuracy of the characterization decreases as the search range increases, and the characterized results are very accurate when the search range is up to ±30% off from the actual value. This finding suggests the importance of narrowing the search range of the parameters to obtain a more accurate inverse solution.
• The error in the characterized volume fractions increases as the noise level increases. When the search range is smaller than ±30% off from the actual value, the characterized results remain stable regardless of the noise level, even at a level of 10%. This finding reinforces the suggestion that a narrower search range helps to obtain stable and accurate inverse solutions when noisy data are used.
Clearly, the present characterization procedure is very reliable if the search range is within ±30% off from the actual value. Current fabrication technology can usually control the volume fraction to within 20% error. A search range of ±30% off from the actual value for the volume fractions should therefore be robust enough for most engineering applications.
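The noise contamination used in these tests can be sketched as follows. Equation 2.130 is not reproduced in this section, so the sketch assumes zero-mean Gaussian noise whose standard deviation is the noise level times the root mean square of the noise-free response; the function name `add_gaussian_noise` and the sine-shaped stand-in response are hypothetical, and the exact definition in Equation 2.130 may differ.

```python
import numpy as np

def add_gaussian_noise(u_clean, level, rng):
    """Return u_clean plus zero-mean Gaussian noise.

    Assumption: the standard deviation is level * RMS(u_clean);
    this is one common convention, not a quotation of Equation 2.130.
    """
    u_clean = np.asarray(u_clean, dtype=float)
    sigma = level * np.sqrt(np.mean(u_clean ** 2))
    return u_clean + rng.normal(0.0, sigma, size=u_clean.shape)

rng = np.random.default_rng(0)          # fixed seed for repeatability
t = np.linspace(0.0, 10.0, 200)         # dimensionless time samples
u = 0.1 * np.sin(t)                     # stand-in for a computed response

# Noise levels of 2, 5, and 10%, as considered in Table 9.9
noisy = {level: add_gaussian_noise(u, level, rng) for level in (0.02, 0.05, 0.10)}
for level, u_noisy in noisy.items():
    print(f"{level:.0%} noise: RMS deviation = {np.sqrt(np.mean((u_noisy - u) ** 2)):.4f}")
```

In a real study the stand-in array would be replaced by the forward-solver output sampled in the selected time range.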
TABLE 9.9 Characterized Volume Fractions of Carbon in the SiC–C FGM Plate for Different Search Ranges and Noise Levels of Contamination

Position   Volume     Original   Results (Error) for Different Search Ranges
           Fraction   Data       ±20%             ±30%             ±40%              ±50%

(a) Noise Free
z = 0.0    v1         1.000      1.000 (0%)       1.000 (0%)       1.000 (0%)        1.000 (0%)
z = 0.2    v2         0.961      0.966 (0.5%)     0.959 (–0.2%)    0.952 (–0.9%)     0.969 (0.8%)
z = 0.4    v3         0.842      0.835 (–0.8%)    0.824 (–2.0%)    0.896 (6.4%)      0.809 (–3.9%)
z = 0.6    v4         0.634      0.629 (–0.8%)    0.665 (4.9%)     0.604 (4.7%)      0.532 (–16%)
z = 0.8    v5         0.367      0.364 (–0.8%)    0.361 (1.6%)     0.421 (15%)       0.314 (14.4%)
z = 1.0    v6         0.000      0.008 (–)        0.005 (–)        0.009 (–)         0.009 (–)

(b) 2% Noise
z = 0.0    v1         1.000      1.000 (0%)       1.000 (0%)       1.000 (0%)        1.000 (0%)
z = 0.2    v2         0.961      0.954 (–0.7%)    0.957 (–0.4%)    0.963 (0.2%)      0.960 (–0.1%)
z = 0.4    v3         0.842      0.862 (2.3%)     0.853 (1.3%)     0.750 (–11%)      0.806 (–4.3%)
z = 0.6    v4         0.634      0.613 (–3%)      0.601 (–5%)      0.758 (19.6%)     0.719 (13.4%)
z = 0.8    v5         0.367      0.357 (–2.7%)    0.390 (6.3%)     0.365 (–0.5%)     0.334 (–9.0%)
z = 1.0    v6         0.000      0.006 (–)        0.003 (–)        0.007 (–)         0.009 (–)

(c) 5% Noise
z = 0.0    v1         1.000      1.000 (0%)       1.000 (0%)       0.999 (0.1%)      1.000 (0%)
z = 0.2    v2         0.961      0.956 (–0.6%)    0.959 (–0.2%)    0.993 (3%)        0.934 (2.8%)
z = 0.4    v3         0.842      0.859 (2%)       0.812 (3.6%)     0.886 (5.2%)      0.961 (14.1%)
z = 0.6    v4         0.634      0.605 (–4.6%)    0.692 (9.1%)     0.412 (–35%)      0.394 (–38%)
z = 0.8    v5         0.367      0.361 (–1.6%)    0.335 (–8.7%)    0.440 (19.9%)     0.527 (44%)
z = 1.0    v6         0.000      0.008 (–)        0.006 (–)        0.007 (–)         0.008 (–)

(d) 10% Noise
z = 0.0    v1         1.000      1.000 (0%)       1.000 (0%)       0.999 (–0.1%)     1.000 (0%)
z = 0.2    v2         0.961      0.952 (–0.9%)    0.955 (–0.6%)    0.995 (0.6%)      0.950 (–1%)
z = 0.4    v3         0.842      0.856 (1.7%)     0.862 (2.3%)     0.876 (3.9%)      0.930 (11%)
z = 0.6    v4         0.634      0.643 (1.4%)     0.599 (–5.5%)    0.445 (–29.8%)    0.375 (–41%)
z = 0.8    v5         0.367      0.336 (8%)       0.395 (7.6%)     0.402 (9.5%)      0.533 (45%)
z = 1.0    v6         0.000      0.008 (–)        0.007 (–)        0.005 (–)         0.007 (–)

Note: Results are obtained at the 500th generation.
Source: Han, X. and Liu, G.R., AIAA J., 41(2), 288–295, 2003. With permission.
9.5.2 Material Characterization of FGM Cylinders
The uniform µGA has also been applied to characterize the volume fractions for SS-SN FGM cylinders (Liu and Han, 2001). This process is described next. Consider an FGM cylinder with varying material properties in the thickness direction. The thickness, inner radius and outer radius of the cylinder are denoted by H, R1, and R2, respectively, as shown in Figure 9.7. Let x and
[Figure 9.7: cylinder of inner radius R1, outer radius R2, and thickness H, with axial coordinate x and radial coordinate z; the nth annular element spans r = rn to r = rn + hn.]
FIGURE 9.7 Configuration of an FGM cylinder and annular element subdivision. The FGM cylinder is divided into N cylindrical elements with three nodal lines in the wall thickness. The elemental material properties are assumed to vary linearly in the thickness direction to better model the spatial variation of material properties of the FGM. (From Han, X. et al., Neurocomputing, 51, 341–360, 2003. With permission.)
z denote the axial and radial coordinates, respectively. The cylinder is subjected to a radial line load uniformly distributed along the circumferential direction. The receiving point is chosen on the outer surface of the FGM cylinder, and the displacement responses at sampled points in the time domain are selected as the inputs for the inverse analysis. In developing the forward solver, the cylinder is divided into a number of layered cylindrical elements, as shown in Figure 9.7. The thickness and inner radius of the nth element are denoted by $h_n$ and $r_n$, respectively, and the outer radius of the nth element is equal to $r_n + h_n$. The elastic coefficient matrix and the mass density on the inner and outer surfaces of the nth element are denoted by $c_n^I = (c_{ij})_n^I$, $\rho_n^I$, $c_n^O = (c_{ij})_n^O$, and $\rho_n^O$ $(i, j = 1, 2, \ldots, 6)$, respectively, where the superscripts I and O represent the inner and outer surfaces. It is assumed that the material properties of the nth element change linearly in the radial direction (Han et al., 2001b):
$$c_n = \left(c_n^O - c_n^I\right)\frac{r - r_{n-1}}{r_n - r_{n-1}} + c_n^I = \Delta c_n \hat{r} + c_n^I \qquad (9.13)$$

$$\rho_n = \left(\rho_n^O - \rho_n^I\right)\frac{r - r_{n-1}}{r_n - r_{n-1}} + \rho_n^I = \Delta\rho_n \hat{r} + \rho_n^I \qquad (9.14)$$
where $\hat{r} = (r - r_{n-1})/(r_n - r_{n-1})$, $r_{n-1} \le r \le r_n$. The cylinder is divided into six elements. It is assumed that this SS-SN FGM cylinder is made in such a way that the outer surface is pure stainless steel and the inner surface is pure silicon nitride. The volume fraction of the stainless steel (SS) is to be characterized in this example. Therefore, the volume fractions on the outer and inner surfaces are known as 1.0 and 0.0, respectively. Thus five parameters need to be characterized (see Figure 9.4). The original values of the volume fraction of SS are assumed to vary according to the following function (see Figure 9.7):

$$V_P = 1 - \left(\frac{z - R_1}{R_2 - R_1}\right)^3 \qquad (9.15)$$
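Equation 9.15 can be evaluated directly to obtain the original volume fractions used in this example. The sketch below assumes that the positions listed in Table 9.10 are values of the normalized coordinate (z − R1)/(R2 − R1), which is why R1 = 0 and R2 = 1 are used; the function name is illustrative.

```python
def volume_fraction(z, r1=0.0, r2=1.0):
    """Cubic volume-fraction profile of Equation 9.15."""
    z_hat = (z - r1) / (r2 - r1)      # normalized position through the wall
    return 1.0 - z_hat ** 3

positions = [0.3, 0.5, 0.7, 0.8, 0.9]   # positions listed in Table 9.10
values = [round(volume_fraction(z), 3) for z in positions]
print(values)   # compare with the "Original Data" column of Table 9.10
```

Under that assumption the five values agree with the original data listed in Table 9.10.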
Based on the study carried out for the SiC-C FGM plate, the bounds on the five parameters for the µGA search are set to ±30% off from the actual value. The search space for these parameters is listed in Table 9.10. The five volume fractions are encoded as binary chromosomes. In the whole search space, there are a total of $2^{38}$ (≈ $2.75 \times 10^{11}$) possible combinations of the five parameters. The operation parameters of the uniform µGA run are set as follows. The stopping criterion limits each GA run to a maximum of 500 generations. The population size of each generation is set to 5, and the probability of uniform crossover is set to 0.6. The characterized volume fractions of stainless steel, based on noise-free input data and on input data contaminated with Gaussian noise at levels of 2, 5, and 10%, are listed in Table 9.11, which shows that the present inverse procedure gives accurate results: the errors are within 8% at all noise levels. The total number of function evaluations in this case is 5 × 500 = 2500. The characterized results are also stable in the presence of noise, even for a noise level of 10%. This is due again to the effects of the projection regularization achieved using discrete sampling for the input of dynamic displacement responses (see Section 8.3.4.3).

TABLE 9.10 Uniform µGA Search Space for Volume Fractions of SS in SS-SN FGM Cylinder

Position   Volume     Original   Search Range   Possibilities   Binary
           Fraction   Data                                      Digits
z = 0.3    v1         0.973      0.681–1.000    256             8
z = 0.5    v2         0.875      0.613–1.000    256             8
z = 0.7    v3         0.657      0.456–0.854    256             8
z = 0.8    v4         0.488      0.342–0.634    128             7
z = 0.9    v5         0.271      0.100–0.352    128             7

Note: Total population = $2^{38}$ ≈ $2.75 \times 10^{11}$.

TABLE 9.11 Characterized Volume Fractions of SS in SS-SN Cylinder

Position   Volume     Original   Results (Error) for Different Noise Levels
           Fraction   Data       Noise Free       2% Noise         5% Noise         10% Noise
z = 0.3    v1         0.973      0.968 (–0.6%)    0.976 (0.3%)     0.977 (0.4%)     0.977 (0.4%)
z = 0.5    v2         0.875      0.860 (–1.7%)    0.866 (–1.0%)    0.852 (–2.6%)    0.860 (–1.7%)
z = 0.7    v3         0.657      0.664 (1.1%)     0.670 (2.0%)     0.685 (4.3%)     0.687 (3.5%)
z = 0.8    v4         0.488      0.486 (0.4%)     0.501 (2.5%)     0.508 (4.0%)     0.506 (3.8%)
z = 0.9    v5         0.271      0.276 (2.0%)     0.259 (–4.3%)    0.259 (–4.3%)    0.250 (–7.7%)
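The run settings quoted in this section (population of 5, tournament selection, elitism, uniform crossover with pcross = 0.6, and a limit of 500 generations) can be put together in a minimal µGA loop. This is a sketch under stated assumptions, not the authors' code: the restart rule and its 5% bit-difference threshold are assumptions (the restart step is not described in this section), every parameter is given 8 bits for simplicity, and a simple squared-difference function stands in for the mismatch computed with the forward solver.

```python
import random

def decode(bits, lo, hi):
    """Map a list of 0/1 bits linearly onto the interval [lo, hi]."""
    n = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * n / (2 ** len(bits) - 1)

def micro_ga(error, bounds, n_bits=8, pop_size=5, pcross=0.6,
             generations=500, seed=0):
    rng = random.Random(seed)
    length = n_bits * len(bounds)

    def random_chromosome():
        return [rng.randint(0, 1) for _ in range(length)]

    def to_params(chrom):
        return [decode(chrom[i * n_bits:(i + 1) * n_bits], lo, hi)
                for i, (lo, hi) in enumerate(bounds)]

    pop = [random_chromosome() for _ in range(pop_size)]
    best, best_cost, n_eval, history = None, float("inf"), 0, []
    for _ in range(generations):
        costs = [error(to_params(c)) for c in pop]   # one forward solve each
        n_eval += pop_size
        i_min = min(range(pop_size), key=costs.__getitem__)
        if costs[i_min] < best_cost:
            best, best_cost = pop[i_min][:], costs[i_min]
        history.append(best_cost)

        def tournament():
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if costs[a] <= costs[b] else pop[b]

        children = [best[:]]                         # elitism: keep the best
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Uniform crossover: each bit comes from the second parent
            # with probability pcross, otherwise from the first
            children.append([g2 if rng.random() < pcross else g1
                             for g1, g2 in zip(p1, p2)])
        # Assumed restart rule: if the children are nearly identical to the
        # elite, keep the elite and re-randomize the rest (no mutation used)
        differing = sum(g != e for c in children[1:] for g, e in zip(c, best))
        if differing < 0.05 * length * (pop_size - 1):
            children[1:] = [random_chromosome() for _ in range(pop_size - 1)]
        pop = children
    return to_params(best), best_cost, n_eval, history

# Toy stand-in for the error function, using the data of Table 9.10
target = [0.973, 0.875, 0.657, 0.488, 0.271]
bounds = [(0.681, 1.000), (0.613, 1.000), (0.456, 0.854),
          (0.342, 0.634), (0.100, 0.352)]
toy_error = lambda v: sum((a - b) ** 2 for a, b in zip(v, target))

params, best_cost, n_eval, history = micro_ga(toy_error, bounds)
print(n_eval)   # 5 x 500 = 2500 function evaluations
```

With an expensive forward solver, each call to `error` is one forward analysis, so the evaluation count is the dominant cost of the method.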
9.6 Use of Combined Optimization Method
The combined optimization method is also applied to characterize the volume fractions of the SiC-C FGM plate (Liu et al., 2002a). The uniform µGA and the subroutine BCLSF of IMSL are used in the combination. The BCLSF is based on the modified Levenberg–Marquardt method and uses a finite-difference method to evaluate the Jacobian. The plate is divided into five elements. Because the volume fractions of carbon on the upper and lower surfaces are known as 1.0 and 0.0, respectively, four volume fractions of carbon on the nodal lines of the plate are to be characterized. These values are denoted as v1, v2, v3, and v4, respectively. The four volume fractions are encoded as binary chromosomes; Table 9.12 lists the search space for these parameters. The parameters of the uniform µGA run are set as follows: the population size of each generation is 5, and the probability of uniform crossover is 0.6.

TABLE 9.12 Uniform µGA Search Space for Volume Fraction of Carbon in the SiC-C FGM Plate

Position   Volume     Original   Search Range   Possibilities   Binary
           Fraction   Data                                      Digits
z = 0.2    v1         0.961      0.673–1.000    256             8
z = 0.4    v2         0.842      0.589–1.000    256             8
z = 0.6    v3         0.634      0.444–0.824    256             8
z = 0.8    v4         0.367      0.257–0.477    128             7

Note: Total population = $2^{31}$ ≈ $2.15 \times 10^{9}$.
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.
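The search space of Table 9.12 follows from two simple rules, sketched below: a ±30% band around each original value, clipped to the physical interval [0, 1], and a binary string per parameter whose length sets the number of possibilities. The linear decoding in `decode` is an assumption about how the bit strings are mapped to values; the function names are illustrative.

```python
def search_range(v, spread=0.30):
    """±spread band around v, clipped to the physical range [0, 1]."""
    lo = max(0.0, v * (1.0 - spread))
    hi = min(1.0, v * (1.0 + spread))
    return round(lo, 3), round(hi, 3)

def decode(bits, lo, hi):
    """Assumed linear mapping of a bit string onto [lo, hi]."""
    return lo + (hi - lo) * int(bits, 2) / (2 ** len(bits) - 1)

original = [0.961, 0.842, 0.634, 0.367]   # original data, Table 9.12
n_bits = [8, 8, 8, 7]                     # binary digits per parameter

ranges = [search_range(v) for v in original]
total = 2 ** sum(n_bits)                  # size of the whole search space

for v, (lo, hi), b in zip(original, ranges, n_bits):
    step = (hi - lo) / (2 ** b - 1)       # resolution of the encoding
    print(f"{v:.3f}: range {lo:.3f} to {hi:.3f}, {2 ** b} possibilities, step {step:.4f}")
print(f"total combinations = 2^{sum(n_bits)} = {total:.3e}")
```

The resolution line shows why 7 or 8 bits suffice here: the step between neighboring candidate values is already of the order of 0.001 to 0.002.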
TABLE 9.13 Selected Sets of Outstanding Individuals Close to Optima Generated from Uniform µGA at 50th Generation for Volume Fractions of Carbon in the SiC-C FGM Plate (a)

Set Number   (v1, v2, v3, v4)                Fitness Function Value
1            (0.716, 0.864, 0.686, 0.338)    2.781e–2
2            (0.886, 0.858, 0.638, 0.449)    4.872e–3
3            (0.938, 0.897, 0.584, 0.388)    1.066e–4

(a) Using noise-free displacements.
Note: True solution = (0.961, 0.842, 0.634, 0.367); total number of function evaluations at this stage = 5 × 50 = 250.
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.
TABLE 9.14 Results from BCLSF for Volume Fractions of Carbon in SiC-C FGM Plate (a)

Set      Initial Value                   Corresponding Solution                  Function Value       Function
Number                                   (Maximum Error)                         at Solution Point    Evaluations
1        (0.716, 0.864, 0.686, 0.338)    (0.921, 0.878, 0.654, 0.400) (9.0%)     2.85e–3              14
2        (0.886, 0.858, 0.638, 0.449)    (0.913, 0.871, 0.637, 0.401) (9.3%)     1.25e–6              10
3        (0.938, 0.897, 0.584, 0.388)    (0.940, 0.889, 0.596, 0.386) (5.2%)     3.60e–9              33

(a) Using noise-free displacements.
Note: True solution = (0.961, 0.842, 0.634, 0.367); total number of function evaluations at this stage = 14 + 10 + 33 = 57.
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.
At the first stage, the uniform µGA is used to select a set of better solutions close to the optima. The selection criterion requires the function value to be below a prescribed value, e.g., 0.03. The error function defined by Equation 8.1 is more complex, so the better solutions for this problem should be selected from more generations than in the numerical test carried out in Section 5.8.2. After the run of the first 50 generations, three outstanding individuals are generated by the uniform µGA. The parameter sets of these three outstanding individuals are then selected. The values of these selected sets of parameters and the corresponding fitness function values are listed in Table 9.13. The total number of function evaluations at this stage is 50 × 5 = 250. At the second stage, these three sets of outstanding individuals are used as the initial points for the BCLSF runs. The BCLSF runs three times, each time starting from a different one of the three initial points selected by the µGA. Table 9.14 shows that
• Three different solutions correspond to the three different initial guesses. These solutions are considered local optima of the function; the global optimum can be found among them by comparing their corresponding fitness function values. The global optimum set of parameters is the solution for set 3 in Table 9.14, which has the smallest function value.
• The maximum error of the four parameters in the final identified result is less than 6%.
• The three BCLSF runs require 14, 10, and 33 function evaluations, respectively, a total of 57.
In order to simulate measured displacement responses, noise-contaminated displacement responses are also used for characterization of the volume fractions. Gaussian noise defined by Equation 2.130 is directly added to the computer-generated displacement responses, and the noise-contaminated responses are then used as inputs for the identification. To investigate the sensitivity and stability of the present inverse procedure with respect to the noise level, two noise levels of 5 and 10% are considered. The characterization results with added noise are listed in Table 9.15 and Table 9.16. The presence of noise can make the identification much more complicated than in the noise-free case. If the noise is too large, a local optimum could be taken as the true result instead of the global optimum. It has been found that when the noise is larger than 10%, the true results are not identified. If the noise is less than 5%, the true results can still be found, as shown in Table 9.15 and Table 9.16, and the characterized result remains stable regardless of the presence of the noise. For this material characterization problem using the combined µGA and gradient-based optimization method, there are 250 function evaluations in the uniform µGA runs and 57 function evaluations in the BCLSF. Therefore, a total of 307 function evaluations is required in the combined method.
TABLE 9.15 Selected Sets of Outstanding Individuals Close to Optima Generated from Uniform µGA at 50th Generation for Volume Fractions of Carbon in the SiC-C FGM Plate (a)

5% Noise
Set Number   (v1, v2, v3, v4)                Fitness Function Value
1            (0.727, 0.919, 0.573, 0.470)    2.881e–2
2            (0.876, 0.968, 0.720, 0.381)    2.667e–3
3            (0.936, 0.898, 0.580, 0.388)    1.274e–4

10% Noise
Set Number   (v1, v2, v3, v4)                Fitness Function Value
1            (0.851, 0.968, 0.580, 0.401)    4.101e–3
2            (0.867, 0.947, 0.657, 0.382)    7.710e–3
3            (0.936, 0.898, 0.579, 0.402)    2.268e–3

(a) Using noise-contaminated displacements.
Note: True solution = (0.961, 0.842, 0.634, 0.367).
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.

TABLE 9.16 Results from BCLSF for Volume Fractions of Carbon in SiC-C FGM Plate (a)

5% Noise
Set Number   Initial Value                   Corresponding Solution          Function Value at Solution Point
1            (0.727, 0.919, 0.573, 0.470)    (0.727, 0.919, 0.573, 0.470)    2.55e–2
2            (0.876, 0.968, 0.720, 0.381)    (0.871, 0.963, 0.716, 0.404)    2.88e–3
3            (0.936, 0.898, 0.580, 0.388)    (0.938, 0.895, 0.589, 0.389)    9.96e–5

10% Noise
Set Number   Initial Value                   Corresponding Solution          Function Value at Solution Point
1            (0.851, 0.968, 0.580, 0.401)    (0.851, 0.969, 0.579, 0.402)    4.087e–3
2            (0.867, 0.947, 0.657, 0.382)    (0.866, 0.946, 0.657, 0.384)    7.684e–3
3            (0.936, 0.898, 0.579, 0.402)    (0.937, 0.897, 0.579, 0.402)    2.264e–3

(a) Using noise-contaminated displacements.
Note: True solution = (0.961, 0.842, 0.634, 0.367).
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 1909–1921, 2002. With permission.

In contrast, using the uniform µGA alone to solve the same problem needs 5 × 500 = 2500 function evaluations, as illustrated in Section 9.5.1. It can be concluded that the combined method for material characterization is more efficient than the µGA alone, but more complex to implement. For problems with expensive forward solvers, the combined optimization method is strongly recommended. The findings in this section are largely the same as those in Section 8.5.
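The second stage of the combined method can be sketched as a bounded Levenberg–Marquardt iteration with a finite-difference Jacobian, started from each candidate delivered by the first stage. This is an illustration only: the routine below is a simplified stand-in written for this sketch and is not the IMSL BCLSF subroutine, bounds are enforced by simple clipping, and a toy linear residual replaces the forward solver, so all three starts reach the same optimum and the numbers will not match Table 9.14.

```python
import numpy as np

TRUE = np.array([0.961, 0.842, 0.634, 0.367])     # true solution
LO = np.array([0.673, 0.589, 0.444, 0.257])       # bounds from Table 9.12
HI = np.array([1.000, 1.000, 0.824, 0.477])

# Toy forward model (assumption): response is a weighted sum of sinusoids
T = np.linspace(0.0, 10.0, 50)
BASIS = np.array([np.sin((i + 1) * T) for i in range(4)]).T

def residual(p):
    """Difference between the toy response for p and for the true solution."""
    return BASIS @ (np.asarray(p, dtype=float) - TRUE)

def levenberg_marquardt(residual, p0, lo, hi, max_iter=50, lam=1e-3, h=1e-6):
    p = np.clip(np.asarray(p0, dtype=float), lo, hi)
    r = residual(p)
    n_eval = 1
    for _ in range(max_iter):
        # Forward-difference Jacobian: one extra evaluation per parameter
        jac = np.empty((r.size, p.size))
        for j in range(p.size):
            shifted = p.copy()
            shifted[j] += h
            jac[:, j] = (residual(shifted) - r) / h
            n_eval += 1
        jtj = jac.T @ jac
        step = np.linalg.solve(jtj + lam * np.diag(np.diag(jtj)), -jac.T @ r)
        trial = np.clip(p + step, lo, hi)          # keep within the bounds
        r_trial = residual(trial)
        n_eval += 1
        if r_trial @ r_trial < r @ r:
            p, r, lam = trial, r_trial, lam / 10.0  # accept, relax damping
        else:
            lam *= 10.0                             # reject, increase damping
        if np.linalg.norm(step) < 1e-9:
            break
    return p, float(r @ r), n_eval

# First-stage candidates: the three sets listed in Table 9.13
starts = [(0.716, 0.864, 0.686, 0.338),
          (0.886, 0.858, 0.638, 0.449),
          (0.938, 0.897, 0.584, 0.388)]
results = [levenberg_marquardt(residual, s, LO, HI) for s in starts]

# The candidate with the smallest function value is taken as the optimum
best_p, best_value, _ = min(results, key=lambda item: item[1])
print(np.round(best_p, 3), sum(item[2] for item in results))
```

The design point is the division of labor: the population search only has to land in the right basin, and the gradient-based stage then converges in a few tens of evaluations instead of hundreds of generations.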
9.7 Use of Progressive NN Model
The progressive NN model described in Chapter 6 is adopted for the characterization of the material properties of the FGM. The outputs of the NN model are volume fractions representing the distribution of the material properties of the FGM. The inputs of the NN model are displacement responses sampled on the time-history curve of the response on the surface of FGM structures, which can be easily measured using conventional experimental techniques.

9.7.1 Material Characterization of SiC-C FGM Plate
The progressive NN has been applied to characterize the volume fractions of the SiC-C FGM plate (Liu et al., 2001b). The SiC-C FGM plate is divided into five layered elements; it is assumed that this plate is made so that the upper surface is pure carbon and the lower surface is pure silicon carbide. Therefore, the volume fractions of carbon on the upper and lower surfaces are 1.0 and 0.0, respectively. Thus, four volume fractions of carbon at the nodal lines of the plate, named v1, v2, v3, and v4, need to be characterized (cf. Figure 9.4). Two key factors, however, govern the success of this method in practical application. The first is that the inputs of the NN model should be carefully chosen so that variation in the outputs (volume fractions) is truthfully reflected by changes in these inputs. The second is that the training samples used in the initial training and the retraining should be carefully selected in order to describe the characteristics of the FGM.

9.7.1.1 Inputs of the NN Model
Inputs for the NN model should be easy to measure accurately and sensitive to changes in the volume fractions of the FGM plate. The displacement response data on the surface of the FGM plate are selected as the inputs. In the first step, the effect of varying material properties on the displacement responses is studied. All results are based on the modified HNM, and the dimensionless parameters defined by Liu et al. (2001a) are used. Examples of the displacement responses at x = 3.0 on the upper surface of the SiC-C FGM plate excited by a vertical line load of one cycle of a sine function at x = 0.0 are displayed in Figure 9.8(a) through Figure 9.8(d). It can be seen that each of the volume fractions v1, v2, v3, and v4 has an appreciable influence on the response curve. It is now necessary to decide what information from the displacement responses should be included in the input training sample. Considering the features of Figure 9.8(a) through Figure 9.8(d), displacement responses at six discrete points on the time-history curve, namely, t = 2.65, 3.15, 3.51, 3.58, 3.65, and 5.02, are selected as the inputs. The NN model used here has two hidden layers; the numbers of neurons in the input layer, output layer, and first and second hidden layers are 6, 4, 20, and 12, respectively.

9.7.1.2 Training Samples
Training samples for the initial training of the NN model consist of a number of sets of inputs and outputs; these samples should cover all possible values of the volume fractions of interest. Obviously, it is impossible to generate every combination of volume fractions, so a good cross section of possible alterations is required. The volume fraction is usually controlled in the process of fabricating an FGM; therefore, a rough distribution of the volume fraction is known when the FGM is fabricated. A range of ±30% off from the actual value of the volume fractions is assumed. The search range for this problem is shown in Table 9.12. The training samples are created in the following three steps (Liu et al., 2001b):
• Step 1: All volume fractions are set to the maximum and then to the minimum values, which generates two training samples. Each volume fraction in turn is then set to its maximum and minimum values, while all the rest are set to the average value. This generates eight training samples and ensures that the NN model receives teaching signals from the key volume fractions. A total of 10 training samples is generated in the first step.
• Step 2: The training samples created in the first step do not give the NN teaching signals from other values of the volume fractions. Therefore, five training samples are created from a random selection of the volume fractions.
• Step 3: Figure 9.8(d) shows that the influence of volume fraction v4 on the response curve is very irregular. Eleven sets of samples are thus added in the following manner. The search range of the fourth volume fraction is divided equally into 10 intervals, and the resulting 11 values are used for the fourth volume fraction. For each of these 11 values of v4, the other three volume fractions v1, v2, and v3 are set to randomly generated values in the search range. This generates an additional 11 sets of training samples.
This combined strategy covers a good cross section of all possible volume fraction variations. Normalization of training data sets is processed
FIGURE 9.8 Time history of displacement response in the x direction at x = 3.0 on the upper surface of the SiC-C FGM plate excited by a vertical line load of one cycle sine function at x = 0.0 (horizontal axis: dimensionless time; vertical axis: dimensionless displacement; curves for volume fraction values of 0.0, 0.2, 0.4, 0.6, 0.8, and 1.0). (a) Effect of volume fraction v1 on the displacement response (v2 = v3 = v4 = 0.5); (b) effect of volume fraction v2 on the displacement response (v1 = v3 = v4 = 0.5); (c) effect of volume fraction v3 on the displacement response (v1 = v2 = v4 = 0.5); and (d) effect of volume fraction v4 on the displacement response (v1 = v2 = v3 = 0.5). (From Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.)
TABLE 9.17 Characterized Volume Fractions of Carbon in the SiC–C FGM Plate (Case 1)

                                             Result (Error) Obtained from NN Trained at Each Progression
Position   Volume Fraction   Original Data   1               2               3               4

(a) Noise Free
z = 0.2    v1                0.961           0.966(0.5%)     0.961(0.0%)     0.974(1.3%)     0.972(1.1%)
z = 0.4    v2                0.842           0.869(3.2%)     0.832(–1.2%)    0.822(–2.4%)    0.835(–0.8%)
z = 0.6    v3                0.634           0.591(–6.8%)    0.613(3.3%)     0.625(–1.5%)    0.632(–2.9%)
z = 0.8    v4                0.367           0.412(12.3%)    0.395(7.8%)     0.389(5.8%)     0.385(5.0%)

(b) Noise Added (1% Gaussian Noise)
z = 0.2    v1                0.961           0.952(–0.8%)    0.954(–0.7%)    0.964(0.3%)     0.969(0.8%)
z = 0.4    v2                0.842           0.905(7.5%)     0.888(5.5%)     0.857(1.8%)     0.832(–1.2%)
z = 0.6    v3                0.634           0.556(–12.2%)   0.569(10%)      0.598(–5.7%)    0.616(–2.8%)
z = 0.8    v4                0.367           0.423(15.2%)    0.401(9.3%)     0.397(8.2%)     0.378(3.0%)

Source: Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.
according to Equation 6.14. The NN is then trained using these training samples.

9.7.1.3 Results and Discussion

Two sets of volume fractions of the SiC-C FGM plate are identified using the progressive NN, which is detailed in Chapter 6. One set is the actual volume fractions of the SiC-C FGM, listed in the third column of Table 9.17 (case 1). The other is an arbitrary set of volume fractions assumed within the search range given for the actual set, listed in the third column of Table 9.18 (case 2). The NN model is used to characterize the volume fractions of case 1 and case 2 to validate the stability of the present procedure. The displacement responses at the sample points on the upper surface in the x direction for these two sets are calculated using the modified HNM and then used as inputs to the NN model. To simulate measured displacement responses, noise-contaminated inputs are generated by adding Gaussian noise to the noise-free displacements calculated using the modified HNM.

9.7.1.3.1 Case 1

Table 9.17 summarizes the characterized results for the first case. The results of four progressions are listed. The characterization result at the first iteration is not accurate: the maximum error is high, and the displacement responses corresponding to these characterized volume fractions are quite different from those simulated from the actual values of the volume fractions. The distance norm calculated using Equation 6.17 is large, as shown in Table
TABLE 9.18 Characterized Volume Fractions of Carbon in the SiC–C FGM Plate (Case 2)

                                             Result (Error) Obtained from NN Trained at Each Progression
Position   Volume Fraction   Original Data   1               2               3               4               5               6

(a) Noise Free
z = 0.2    v1                0.850           0.898(5.7%)     0.883(3.8%)     0.840(–1.1%)    0.862(1.5%)     0.854(0.5%)     0.848(–0.2%)
z = 0.4    v2                0.750           0.685(8.6%)     0.708(5.6%)     0.771(2.8%)     0.731(2.6%)     0.737(–1.8%)    0.752(0.3%)
z = 0.6    v3                0.650           0.720(10.8%)    0.694(6.7%)     0.651(0.1%)     0.676(3.9%)     0.670(3.1%)     0.659(1.4%)
z = 0.8    v4                0.350           0.274(–21.7%)   0.301(–14%)     0.321(–8.3%)    0.324(–7.5%)    0.325(7.1%)     0.349(0.3%)

(b) Noise Added (1% Gaussian Noise)
z = 0.2    v1                0.850           0.878(3.3%)     0.859(1.1%)     0.850(0.0%)     0.822(–3.3%)    0.833(–2.0%)    0.827(–2.7%)
z = 0.4    v2                0.750           0.679(–9.5%)    0.740(–1.2%)    0.749(–0.1%)    0.795(6.1%)     0.771(2.8%)     0.787(4.9%)
z = 0.6    v3                0.650           0.724(11.4%)    0.689(6.0%)     0.668(2.8%)     0.634(–2.5%)    0.650(0.0%)     0.634(–2.5%)
z = 0.8    v4                0.350           0.275(–21.5%)   0.279(–20.3%)   0.294(–15.9%)   0.310(–11.4%)   0.315(–10.0%)   0.332(–5.0%)

Source: Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.
TABLE 9.19 Comparisons of Displacement Response Corresponding with Progressively Characterized Volume Fractions

                       Displacement Response
Volume Fractions       (1)       (2)       (3)       (4)       (5)       (6)       Distance Norm d
Actual values          0.23610   0.27178   0.31408   0.30981   0.30410   0.43105   -
First progression      0.21494   0.26668   0.31183   0.30504   0.29439   0.48345   0.05781
Second progression     0.24734   0.27294   0.32020   0.31873   0.31304   0.42731   0.01840
Third progression      0.23122   0.27218   0.30912   0.31218   0.29006   0.41843   0.02026
Final result           0.22553   0.27288   0.31446   0.31426   0.29242   0.43528   0.01695

Source: Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.
FIGURE 9.9 Summary of training or retraining error norm of the progressive NN model for characterizing the volume fractions of the SiC-C FGM plate (horizontal axis: training numbers; vertical axis: error). Four progressions are shown in this figure, which also shows that the convergence of the error norm improves as the progression number increases. (From Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.)
9.19. Retraining of the NN model is therefore required. A new sample is created from the result of the first iteration, and the corresponding displacement responses are calculated using the modified HNM. The new sample is added to the original sample pool, replacing the sample with the largest distance norm. The retraining process is repeated until the displacement responses corresponding to the characterized volume fractions are sufficiently close to the simulated measurements. Figure 9.9 summarizes the training and retraining error norms of the progressive NN model for characterizing the volume fractions of the SiC-C FGM plate. The results at each stage of progressive training are also listed in Table 9.17, where it can be seen that

• The accuracy of the characterized results increases as the progression number increases, and the characterized result is very accurate after four progressions.
• The displacement responses corresponding to the volume fractions characterized at the fourth progression are very close to those corresponding to the actual volume fractions.
• The maximum error of the volume fractions after the fourth progression is as low as 5%.
• The characterized result remains stable in the presence of noise at a level of 1%, and the required number of progressions does not change when the noise is added.

9.7.1.3.2 Case 2

The second set of volume fractions for the SiC-C FGM plate is also characterized, and the result for this case is shown in Table 9.18. Very accurate results are obtained after six progressions, even though the maximum error of the first characterized volume fractions is as large as 21.7%. Compared with the first case, the maximum error of the first characterized volume fractions is larger, so more progressive training is required. This can be explained as follows. Because the training samples are selected based on a range of volume fractions for the first case, this set of samples is not the most suitable one for case 2. After several progressive trainings, the sample density around the inputs of case 2 increases enough to achieve the desired accuracy.

9.7.1.3.3 Efficiency Analysis

For the NN model progressively trained six times, the modified HNM solver is called a total of 31 times, and six trainings of the NN model are carried out. Each modified HNM run needs about 25 seconds on an SGI Origin 2000 computer; a single retraining of the presented NN model needs about 1200 seconds (or 20 minutes). Therefore, the six-time progressive NN model needs about 7975 seconds in total (25 × 31 + 1200 × 6) to perform the material characterization once. In contrast, using the genetic algorithm (uniform µGA) to solve the same problem needs about 62,500 seconds (see Section 9.5.1) because significantly more calls to the forward solver are needed. A detailed comparison of CPU time is shown in Table 9.20.
TABLE 9.20 Comparison of CPU Time for Material Characterization of SiC-C FGM Plate Using the Progressive NN Model and the µGA

                       CPU Time (s)
Item                   µGA          Progressive NN Model
Modified HNM           25 × 2500    25 × 31
Training NN model      -            1200 × 6
Total                  62,500       7975

Source: Liu, G.R. et al., Composites Sci. Tech., 61, 1401–1411, 2001. With permission.

It can be clearly concluded that this NN model for material characterization is very efficient compared with using GAs. The CPU time required by the progressive NN is about one eighth of that required by the µGA for this example. The advantage of the NN model becomes even clearer if the forward solver requires a longer CPU time for a single call. On the other hand, the implementation of the NN is more tedious, especially in the initial training stage.
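The arithmetic behind this comparison can be retraced with a short script. The following is a minimal sketch in Python that uses only the figures quoted above (about 25 s per modified HNM run, about 1200 s per NN training, and 31 versus 2500 forward-solver calls); the function and variable names are illustrative and are not taken from the original work.

```python
# CPU-time bookkeeping for the SiC-C FGM plate example (values quoted in the text).
HNM_SECONDS_PER_CALL = 25        # one modified HNM forward run
NN_SECONDS_PER_TRAINING = 1200   # one training or retraining of the NN model


def progressive_nn_cpu_time(forward_calls: int, trainings: int) -> int:
    """Total CPU time (s): forward-solver calls plus NN trainings."""
    return HNM_SECONDS_PER_CALL * forward_calls + NN_SECONDS_PER_TRAINING * trainings


def ga_cpu_time(forward_calls: int) -> int:
    """Total CPU time (s) for a GA run, which needs forward-solver calls only."""
    return HNM_SECONDS_PER_CALL * forward_calls


nn_total = progressive_nn_cpu_time(forward_calls=31, trainings=6)
ga_total = ga_cpu_time(forward_calls=2500)
speedup = ga_total / nn_total  # how many times cheaper the progressive NN is

print(nn_total, ga_total, round(speedup, 2))
```

The ratio comes out slightly below eight, consistent with the statement that the progressive NN needs about one eighth of the CPU time of the µGA. Because the NN training cost is fixed, the ratio grows as the cost of a single forward-solver call increases.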
9.7.2 Material Characterization of SS-SN FGM Cylinders
The progressive NN has also been applied to characterize the volume fractions of SS-SN FGM cylinders (Han et al., 2003). The cylinder is composed of stainless steel (SS) and silicon nitride (SN), with stainless steel on its outer surface and silicon nitride on its inner surface. The material properties of stainless steel and silicon nitride are listed in Table 9.6. The original values of the volume fraction of SS are assumed to vary according to the following function (see Figure 9.7):

$$V_P = 1 - \left( \frac{z - R_1}{R_2 - R_1} \right)^2 \qquad (9.16)$$
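Equation 9.16 can be evaluated directly at the interior nodal positions. The sketch below assumes a normalized thickness coordinate, (z − R1)/(R2 − R1), and six layers of equal thickness, so that the five interior nodal lines sit at 1/6, 2/6, ..., 5/6. The function name and the equal-thickness layering are illustrative assumptions. Under these assumptions the computed values agree, to within rounding, with the original data listed in Table 9.21.

```python
def volume_fraction_ss(z: float, r1: float = 0.0, r2: float = 1.0) -> float:
    """Volume fraction of stainless steel according to Equation 9.16."""
    xi = (z - r1) / (r2 - r1)  # normalized position through the thickness
    return 1.0 - xi ** 2


N_LAYERS = 6  # assumed equal-thickness layers, giving five interior nodal lines
nodes = [k / N_LAYERS for k in range(1, N_LAYERS)]
fractions = [volume_fraction_ss(z) for z in nodes]

for k, (z, v) in enumerate(zip(nodes, fractions), start=1):
    print(f"v{k}: z = {z:.3f}, V_P = {v:.3f}")
```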
This FGM cylinder is divided into six layered elements. Because stainless steel is on its outer surface and silicon nitride is on its inner surface, the volume fractions of SS on the outer and inner surfaces are 1.0 and 0.0, respectively. Thus, there are five volume fractions of SS along the thickness of the cylinder, named v1, v2, v3, v4, and v5, that need to be characterized (see Figure 9.4).

9.7.2.1 Inputs of the NN Model

Following the process described in Figure 6.6, the first step is to investigate the sensitivity of the displacement responses to the variation of material properties. Figure 9.10(a) through Figure 9.10(e) display examples of the displacement responses at x = 2.0 on the upper surface of the FGM cylinder composed of stainless steel and silicon nitride, excited by an incident wave of one cycle of a sine function at x = 0.0. It can be seen that each of the volume fractions v1, v2, v3, v4, and v5 has an appreciable influence on the response curve. It is now necessary to decide which information from the displacement responses to include in the input training sample. The choice of the dynamic response is based on its sensitivity to changes in the parameters; the time range should be chosen where the response is most sensitive to these changes. Accordingly, the range between 3.9 and 5.9 shown in Figure 9.10(a) through Figure 9.10(e) is selected as the time range. Displacement responses at eight points in the
FIGURE 9.10 Time history of displacement response in the axial direction at x = 2.0 on the outer surface of the FGM cylinder excited by a vertical line load of one cycle sine function at x = 0.0 (horizontal axis: dimensionless time; vertical axis: dimensionless axial displacement; curves for volume fraction values of 0.0, 0.25, 0.5, 0.75, and 1.0). (a) Effect of volume fraction v1 on the displacement response (v2 = v3 = v4 = v5 = 0.5); (b) effect of volume fraction v2 on the displacement response (v1 = v3 = v4 = v5 = 0.5); (c) effect of volume fraction v3 on the displacement response (v1 = v2 = v4 = v5 = 0.5); (d) effect of volume fraction v4 on the displacement response (v1 = v2 = v3 = v5 = 0.5); and (e) effect of volume fraction v5 on the displacement response (v1 = v2 = v3 = v4 = 0.5). (From Han, X. et al., Neurocomputing, 51, 341–360, 2003. With permission.)
time-history curve in this specified range, namely t = 3.93, 4.18, 4.53, 4.79, 4.97, 5.23, 5.49, and 5.92, are selected as the inputs. The NN model used for this example has two hidden layers; the numbers of neurons in the input layer, output layer, first hidden layer, and second hidden layer are 8, 5, 24, and 16, respectively.

9.7.2.2 Training Samples

In this example, the orthogonal array method combined with random selection is adopted to generate the training samples. A search range of ±30% around the actual value of the volume fractions is used (see Table 9.21). To formulate the initial training samples, it is assumed that there are four levels in the search range for each of the five volume fractions v1, v2, v3, v4, and v5, which correspond to their discrete values. Based on the orthogonal array method, these five four-level parameters require only 16 samples to cover the whole domain. In addition, another 20 randomly created samples are added to the training data set. This combined strategy covers a good cross section of all possible volume fraction variations. The training samples are listed in Table 9.22.

TABLE 9.21 Search Range for Characterization of Volume Fractions of SS in SS-SN FGM Cylinder

Position    Volume Fraction   Original Data   Search Range
z = 0.167   v1                0.972           0.68–1.00
z = 0.333   v2                0.889           0.60–1.00
z = 0.5     v3                0.750           0.53–1.00
z = 0.667   v4                0.555           0.38–0.72
z = 0.833   v5                0.306           0.21–0.40

Source: Han, X. et al., Neurocomputing, 51, 341–360, 2003. With permission.

9.7.2.3 Results and Discussions

The displacement responses, sampled at the points on the upper surface in the x direction of the FGM cylinder with the actual volume fractions, are calculated using the forward solver. To simulate measured displacement responses, noise-contaminated inputs are used. The displacement responses at the eight points in the time history are used as inputs, the five volume fractions are used as outputs, and all input and output values in the training samples are normalized according to Equation 6.14. Table 9.23 summarizes the characterized results of three progressions. The first characterization is not accurate because the maximum error is high. The displacement responses corresponding to these characterized volume fractions differ from those simulated from the actual values of the volume fractions; the distance norm defined by Equation 6.17 is large, as shown in Table 9.24. Retraining of the NN model is therefore required. A new sample is created from the first iteration result, and the corresponding displacement responses are calculated using the forward solver. The new sample is added to the original sample pool, replacing the sample with the largest distance norm.
The retraining process is repeated until the displacement responses corresponding to the characterized volume fractions are sufficiently close to the simulated measurements. Table 9.23 also lists the results at each stage of progressive training. It can be seen from this table that the accuracy of the characterized results increases as the progression number increases, and the characterized result is very accurate after three progressions. The maximum error of the volume fractions at the final progression is very low. It can also be found from Table 9.24 that the displacement responses corresponding to the volume fractions at the final progression are very close to those corresponding to the actual volume fractions. From the numerical examples, it is seen that the accuracy of the output from the NN model increases with the number of retraining cycles; therefore, the required accuracy may be obtained by repeating the retraining process. However, when noisy input data are used, retraining beyond a certain point is meaningless. Once the error falls within the range of the measurement error, the retraining should end, according to the discrepancy principle (Chapter 3). In fact, excessive retraining could lead to overfitting (Chapter 6).
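The stopping test described above can be illustrated numerically. Equation 6.17 is not reproduced in this section, so the sketch below assumes that the distance norm is the Euclidean norm of the difference between the computed and the reference displacement responses. With that assumption it reproduces, to the printed precision, the distance norms listed in Table 9.19 for the plate example of Section 9.7.1. The tolerance used in the stopping check is a hypothetical value chosen only for illustration.

```python
import math


def distance_norm(response, reference):
    """Euclidean norm of the difference between two sampled responses
    (assumed form of the distance norm; Equation 6.17 is not shown here)."""
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(response, reference)))


# Displacement responses copied from Table 9.19 (sample points 1 to 6).
actual = [0.23610, 0.27178, 0.31408, 0.30981, 0.30410, 0.43105]
progressions = {
    "first":  [0.21494, 0.26668, 0.31183, 0.30504, 0.29439, 0.48345],
    "second": [0.24734, 0.27294, 0.32020, 0.31873, 0.31304, 0.42731],
    "third":  [0.23122, 0.27218, 0.30912, 0.31218, 0.29006, 0.41843],
    "final":  [0.22553, 0.27288, 0.31446, 0.31426, 0.29242, 0.43528],
}

norms = {name: distance_norm(resp, actual) for name, resp in progressions.items()}

# Hypothetical tolerance standing in for the measurement-error level:
# retraining stops once the distance norm drops below it.
TOLERANCE = 0.017
for name, d in norms.items():
    status = "stop" if d < TOLERANCE else "retrain"
    print(f"{name:>6}: d = {d:.5f} -> {status}")
```

In practice the tolerance would be tied to the estimated measurement error, in the spirit of the discrepancy principle, rather than fixed by hand.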
TABLE 9.22 Training Samples

      Displacement Response                                                           Volume Fraction
No.   (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       v1       v2       v3       v4       v5
  1   –0.00394  –0.00294  –0.00458  –0.00746  –0.00784  –0.00547  –0.00223   0.00149  0.69900  0.65600  0.66000  0.53100  0.37100
  2   –0.00274  –0.00295  –0.00633  –0.00794  –0.00687  –0.00354  –0.00065   0.00134  0.79200  0.68100  0.62300  0.69700  0.31000
  3   –0.00185  –0.00337  –0.00677  –0.00737  –0.00633  –0.00388  –0.00180   0.00037  0.94000  0.68000  0.53700  0.53800  0.24800
  4   –0.00224  –0.00354  –0.00759  –0.00867  –0.00605  –0.00166   0.00086   0.00074  0.68300  0.84200  0.88100  0.52200  0.33800
  5   –0.00371  –0.00705  –0.00713  –0.00392  –0.00202  –0.00034  –0.00010   0.00071  0.91446  0.95932  0.68463  0.66512  0.32607
  6   –0.00181  –0.00489  –0.00886  –0.00663  –0.00417  –0.00147  –0.00024  –0.00049  0.81510  0.90184  0.75475  0.53354  0.22353
  7   –0.00494  –0.00789  –0.00587  –0.00261  –0.00122  –0.00052  –0.00071   0.00118  0.98765  0.91644  0.81068  0.70524  0.26972
  8   –0.00209  –0.00286  –0.00719  –0.00840  –0.00684  –0.00359  –0.00100   0.00049  0.70307  0.92600  0.60586  0.43005  0.32560
  9   –0.00335  –0.00671  –0.00851  –0.00403  –0.00147   0.00027  –0.00023  –0.00030  0.85709  0.86800  0.91986  0.67577  0.24325
 10   –0.00245  –0.00397  –0.00787  –0.00766  –0.00493  –0.00086   0.00114   0.00070  0.77344  0.68036  0.97937  0.64160  0.32799
 11   –0.00232  –0.00524  –0.00801  –0.00614  –0.00394  –0.00159  –0.00069   0.00011  0.95456  0.70924  0.80988  0.53103  0.25674
 12   –0.00208  –0.00334  –0.00739  –0.00761  –0.00611  –0.00309  –0.00053   0.00072  0.78746  0.85048  0.54349  0.59100  0.32161
 13   –0.00370  –0.00673  –0.00780  –0.00357  –0.00128   0.00030   0.00001   0.00023  0.89766  0.81476  0.91169  0.70358  0.30616
 14   –0.00338  –0.00271  –0.00497  –0.00809  –0.00807  –0.00491  –0.00129   0.00164  0.69709  0.62380  0.81675  0.59760  0.29831
 15   –0.00220  –0.00292  –0.00695  –0.00832  –0.00678  –0.00328  –0.00064   0.00073  0.79414  0.63560  0.85970  0.46408  0.31287
 16   –0.00167  –0.00329  –0.00767  –0.00786  –0.00613  –0.00303  –0.00093   0.00013  0.81901  0.76364  0.72970  0.44389  0.27494
 17   –0.00264  –0.00318  –0.00646  –0.00754  –0.00653  –0.00378  –0.00127   0.00114  0.83946  0.70852  0.57333  0.49992  0.38904
 18   –0.00192  –0.00431  –0.00802  –0.00703  –0.00497  –0.00197  –0.00033   0.00022  0.86000  0.78960  0.70653  0.54680  0.28634
 19   –0.00211  –0.00575  –0.00791  –0.00580  –0.00408  –0.00187  –0.00058  –0.00007  0.87731  0.96360  0.60811  0.51916  0.26846
 20   –0.00194  –0.00389  –0.00850  –0.00834  –0.00519  –0.00099   0.00091   0.00015  0.71626  0.83848  0.92160  0.53759  0.28820
 21   –0.00371  –0.00697  –0.00686  –0.00445  –0.00278  –0.00099  –0.00045   0.00060  0.99600  0.92000  0.63100  0.66200  0.25100
 22   –0.00283  –0.00235  –0.00502  –0.00838  –0.00837  –0.00550  –0.00217   0.00094  0.68000  0.73300  0.68700  0.49300  0.27300
 23   –0.00234  –0.00371  –0.00766  –0.00847  –0.00569  –0.00119   0.00118   0.00077  0.68000  0.86600  0.84300  0.60700  0.33700
 24   –0.00346  –0.00613  –0.00831  –0.00482  –0.00063   0.00250   0.00180  –0.00065  0.68000  1.00000  1.00000  0.72000  0.40000
 25   –0.00350  –0.00308  –0.00557  –0.00754  –0.00698  –0.00401  –0.00092   0.00167  0.78700  0.60000  0.68700  0.60700  0.40000
 26   –0.00253  –0.00281  –0.00637  –0.00798  –0.00706  –0.00393  –0.00103   0.00117  0.78700  0.73300  0.53000  0.72000  0.27300
 27   –0.00226  –0.00530  –0.00931  –0.00637  –0.00336  –0.00066  –0.00016  –0.00081  0.78700  0.86600  1.00000  0.38000  0.27300
 28   –0.00240  –0.00607  –0.00926  –0.00550  –0.00274  –0.00070  –0.00044  –0.00099  0.78700  1.00000  0.84300  0.49300  0.21000
 29   –0.00230  –0.00414  –0.00783  –0.00715  –0.00480  –0.00144   0.00034   0.00064  0.89300  0.60000  0.84300  0.72000  0.27300
 30   –0.00267  –0.00602  –0.00879  –0.00520  –0.00250  –0.00026  –0.00025  –0.00048  0.89300  0.73300  1.00000  0.60700  0.21000
 31   –0.00277  –0.00652  –0.00745  –0.00501  –0.00355  –0.00182  –0.00070   0.00021  0.89300  1.00000  0.68700  0.38000  0.33700
 32   –0.00227  –0.00467  –0.00738  –0.00637  –0.00498  –0.00256  –0.00036   0.00081  0.89300  0.86600  0.53000  0.49300  0.40000
 33   –0.00305  –0.00596  –0.00769  –0.00500  –0.00278  –0.00094  –0.00036   0.00040  1.00000  0.60000  1.00000  0.49300  0.33700
 34   –0.00300  –0.00603  –0.00718  –0.00519  –0.00354  –0.00150  –0.00043   0.00079  1.00000  0.73300  0.83300  0.38000  0.40000
 35   –0.00361  –0.00700  –0.00713  –0.00428  –0.00243  –0.00091  –0.00062   0.00048  1.00000  0.86600  0.68700  0.72000  0.21000
 36   –0.00361  –0.00685  –0.00672  –0.00468  –0.00329  –0.00137  –0.00030   0.00063  1.00000  1.00000  0.53000  0.60700  0.27300

Source: Han, X. et al., Neurocomputing, 51, 341–360, 2003. With permission.
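The volume-fraction sets for such a training pool can be generated programmatically. The sketch below builds a four-level, five-factor orthogonal array of strength two using one standard construction over the finite field GF(4), maps the levels onto equally spaced values in the search ranges of Table 9.21, and appends 20 uniformly random samples. The construction, the equal spacing of levels, the uniform random draw, the seed, and all function names are assumptions made for illustration; the source does not state which orthogonal array or random generator was used.

```python
import itertools
import random

# Multiplication table of GF(4) with elements 0, 1, 2, 3; addition in GF(4) is XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def orthogonal_array_l16():
    """16 runs x 5 factors, 4 levels each: columns a, b, a+b, a+2b, a+3b over GF(4)."""
    rows = []
    for a, b in itertools.product(range(4), repeat=2):
        rows.append([a, b] + [a ^ GF4_MUL[k][b] for k in (1, 2, 3)])
    return rows


# Search ranges for v1..v5 taken from Table 9.21.
SEARCH_RANGES = [(0.68, 1.00), (0.60, 1.00), (0.53, 1.00), (0.38, 0.72), (0.21, 0.40)]


def level_value(bounds, level, n_levels=4):
    """Map a level index to an equally spaced value in the search range (assumed spacing)."""
    low, high = bounds
    return low + (high - low) * level / (n_levels - 1)


def training_outputs(n_random=20, seed=0):
    """Volume-fraction sets: 16 orthogonal-array samples plus n_random random ones."""
    rng = random.Random(seed)
    samples = [[level_value(b, lv) for b, lv in zip(SEARCH_RANGES, row)]
               for row in orthogonal_array_l16()]
    samples += [[rng.uniform(low, high) for low, high in SEARCH_RANGES]
                for _ in range(n_random)]
    return samples


samples = training_outputs()
print(len(samples))  # 36 volume-fraction sets in total
```

Each volume-fraction set would then be passed to the forward solver to obtain the eight displacement responses that form the corresponding NN inputs.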
TABLE 9.23 Characterized Volume Fractions of SS in SS-SN FGM Cylinder

                                              Result (Error) Obtained from NN Trained at Each Progression
Position    Volume Fraction   Original Data   1               2               3

(a) Noise Free
z = 0.167   v1                0.972           0.976(0.5%)     0.978(0.6%)     0.978(0.6%)
z = 0.333   v2                0.889           0.858(–3.5%)    0.896(0.8%)     0.900(1.2%)
z = 0.5     v3                0.750           0.726(3.2%)     0.714(4.8%)     0.713(–5.0%)
z = 0.667   v4                0.555           0.601(8.3%)     0.584(5.3%)     0.581(4.8%)
z = 0.833   v5                0.306           0.296(3.3%)     0.296(3.2%)     0.297(2.9%)

(b) Noise Contaminated (1%)
z = 0.167   v1                0.972           0.941(–3.2%)    0.982(1.0%)     0.983(1.1%)
z = 0.333   v2                0.889           0.884(–0.5%)    0.902(1.5%)     0.905(1.9%)
z = 0.5     v3                0.750           0.701(–6.5%)    0.700(6.7%)     0.708(–5.5%)
z = 0.667   v4                0.555           0.607(9.4%)     0.572(3.1%)     0.569(2.5%)
z = 0.833   v5                0.306           0.292(4.6%)     0.296(3.2%)     0.297(2.8%)

Source: Han, X. et al., Neurocomputing, 51, 341–360, 2003. With permission.
It should also be noted that the level of noise used in the inputs of the NN model for the examples in Section 9.7.1 and Section 9.7.2 is only 1%. A higher level of noise was tried but failed to produce a good result. The reason could be that the number of sampling points used in this case is only eight, which is too small. The presence of noise triggers the overfitting phenomenon and causes the procedure to break down. The use of more sampling points should effectively solve the problem, as shown in Section 10.7 and Section 11.7.3.
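Noise contamination of this kind is straightforward to simulate. The sketch below adds zero-mean Gaussian noise whose standard deviation is a given fraction of the root-mean-square value of the noise-free signal. This definition of the noise level, the function name, and the fixed seed are assumptions made for illustration, since this section does not state how the 1% level is defined. The sample values are the noise-free responses listed as actual values in Table 9.24.

```python
import math
import random


def add_gaussian_noise(signal, level, seed=None):
    """Return a copy of signal with zero-mean Gaussian noise added.

    The noise standard deviation is level times the RMS of the signal
    (an assumed definition of the noise level).
    """
    rng = random.Random(seed)
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    sigma = level * rms
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Noise-free displacement responses at the eight sampling points (Table 9.24).
clean = [-0.00365, -0.00705, -0.00688, -0.00425, -0.00257, -0.00090, -0.00049, 0.00072]
noisy = add_gaussian_noise(clean, level=0.01, seed=1)

for c, n in zip(clean, noisy):
    print(f"{c: .5f} -> {n: .5f}")
```

With only eight sampling points, a single noisy value carries considerable weight in the NN input, which is consistent with the remark that more sampling points help at higher noise levels.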
9.8 Remarks
• Remark 9.1: FGMs are usually microscopically heterogeneous and are typically made from two isotropic components. The material property of the FGM structure can be obtained using rule-of-mixture methods derived from micromechanics, using the material properties of the matrix and inclusion for given volume fractions. Because the material properties of the components are usually available, the volume fractions in the thickness direction of FGM structures are the key parameters for characterizing the material property of an FGM.
• Remark 9.2: The HNM is modified to accommodate a linear variation of material properties within an element in the thickness direction. This modified HNM is employed for the forward calculation to
TABLE 9.24 Comparisons of Displacement Response Corresponding with Iterative Characterized Volume Fractions

                    Displacement Response                                                           Distance Norm
Volume Fractions    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (Defined by Equation 6.17)
Actual values       –0.00365  –0.00705  –0.00688  –0.00425  –0.00257  –0.00090  –0.00049   0.00072  -
First iteration     –0.00361  –0.00700  –0.00713  –0.00428  –0.00243  –0.00091  –0.00062   0.00048  4.502e–4
Second iteration    –0.00389  –0.00707  –0.00670  –0.00428  –0.00266  –0.00089  –0.00035   0.00072  3.445e–4
Final result        –0.00366  –0.00703  –0.00684  –0.00433  –0.00266  –0.00090  –0.00045   0.00073  1.352e–4

Source: Han, X. et al., Neurocomputing, 51, 341–360, 2003. With permission.
obtain the displacement response for a given material property of an FGM structure. The modified HNM can further reduce the number of elements and can obtain more accurate results efficiently.
• Remark 9.3: Several computational inverse techniques, including the nonlinear least squares method, GAs, and the progressive NN, are employed for the material characterization of FGMs. The input data used for the inverse procedure are the time-history curve of the displacement response recorded at one point on the surface of the FGM structure. Numerical tests on these inverse procedures are performed. The noise effect is also examined by adding Gaussian noise directly to the computer-generated displacement to simulate the measured displacement. For all these cases, good agreement between the identified results and the true ones has been observed.
10 Inverse Detection of Cracks in Beams Using Flexural Waves
Numerical analysis and experimental studies on the use of flexural waves for nondestructive evaluation of cracks and delaminations in beams of isotropic and anisotropic materials are introduced in this chapter. Computational inverse procedures employing genetic algorithms (GAs) and neural networks (NNs) for determining the geometrical parameters of the crack and delamination are detailed. In these procedures, the strip element method (SEM) for composite laminates, methods based on the beam model of wave propagation, and the finite element method are used as forward solvers. The study is conducted according to the following procedure:

• Forward solvers (SEM code and beam model) are introduced first.
• Intensive experiments are carried out to reveal the sensitivity of the measured responses to the geometric parameters of the cracks and delaminations. The experiments are also used to verify the forward solver to be used for inverse analysis. Furthermore, the experiments are used to estimate roughly the geometrical parameters of the cracks and delaminations.
• Inverse procedures are established using GAs as well as NNs to provide systematic ways to perform nondestructive identification of cracks and delaminations in isotropic and anisotropic materials.

Examples of practical applications are presented, and frequency domain (harmonic excitation) and time domain (impact excitation) analyses are conducted. Results indicate that these computational inverse techniques are efficient for nondestructive evaluation of cracks or delaminations in beams made of isotropic or anisotropic materials.
10.1 Introduction

Delaminations and cracks are typical examples of flaws commonly found in beam structures during fabrication or service. The presence of flaws in structural members and machine parts undermines their strength, the integrity of the structure, and their dynamic behavior. Flaws are also often the source of failures or engineering disasters. Effectively identifying and assessing flaw size and location has become increasingly important in NDE. Ultrasonic testing techniques are presently the most popular NDE means for crack evaluation or detection. The principle of this technique is to send an acoustic pulse through a probe into a specimen and infer the internal conditions, such as flaws and cracks, from the induced reflected and refracted waves at the outer boundary. The advantage of this method is that access to only one surface is sufficient to measure the depths of the cracks. However, most ultrasonic systems require the specimen to be immersed in liquid or impinged upon by a jet of fluid; proper contact between the probe and the test surface must be maintained during testing. These requirements, together with the necessity of point-by-point scanning of a surface, frequently restrict the application of ultrasonic testing. In addition, the frequency of ultrasound is usually very high, so it can only be used to detect cracks near the surface of the material or structure. NDE techniques using elastic waves and vibrations have also been developed. Unlike conventional ultrasonic methods that employ high-frequency signals, these techniques use elastic waves and vibrations of low frequency. Hence, the testing can be carried out very fast and does not require coupling fluids. Low-frequency signals with wavelengths of the same order as the thickness of the specimen are often used, which enables the wave to penetrate deeper and detect cracks in thick materials or structures.
Many studies on vibration and elastic wave techniques have been reported; the following gives a brief review of recent studies. Lange et al. (1971, 1972) and Lagerkvist and Lundberg (1982) developed techniques for the design and operation of an impedance transducer as a sensor for vibration measurement. Subsequently, Cawley (1984, 1985, 1987) and Brownjohn et al. (1980) expanded and implemented the method by carrying out theoretical and experimental studies to investigate impedance changes caused by the presence of delamination, and analyzed the sensitivity of this method in detecting the delamination. The delamination was modeled as a spring underneath the rest of the structure, whose properties were unaltered. This model worked well when the delamination was shallowly embedded and the base structure was relatively stiff; however, the sensitivity of this method was limited by the stiffness of the dry-point contact between the transducer and the structure. Cawley et al. (1985) also used this method for production quality control of fiber composites. Adams et al. (1978, 1988) and Ni and Adams (1984) found that a crack can be detected by assessing the reduction
in stiffness and the increase in damping. Changes in stiffness led to changes in the natural frequencies of a vibrating system; therefore, the location and dimensions of the crack could be estimated from the measurement of the natural frequency. Rizos et al. (1990) used the measured amplitudes at two points of the structure vibrating at one of its natural modes to locate transverse surface cracks in a cantilever beam and also to estimate their respective depths. The method is simple to perform because only the natural frequencies are used, but it lacks the accuracy for very small cracks as compared with ultrasonic techniques. Narkis (1994) developed a method for calculating the natural frequencies of a simply supported cracked beam, with the crack simulated by an equivalent linear spring. A model was developed to relate the equivalent spring flexibility to beam and crack parameters; it was found that the only information required for accurate crack identification was the variation of the first two natural frequencies. This method was highly dependent on the accuracy of the frequency measurement and resolution, which are quite difficult to fulfill in practice. Boltezar et al. (1998) and Montalvao et al. (1990) proposed an experimental method by measuring the axial and flexural vibration response in order to identify the existence of a crack in a specimen. The method gave reliable and accurate results for crack depths larger than 5% of its thickness. Introducing the coupling effect between longitudinal and flexural rigidity, Wang et al. (1982) established a systematic procedure for determining the natural frequencies and the corresponding mode shapes of delaminated beams. Good correlation was observed between calculated and experimental results. 
Another model, by Mujumdar and Suryanarayan (1988), was based on Euler beam theory and assumed that, in the delaminated region, the beam sections on both sides of the delamination were constrained to have identical transverse deformations and remained in contact throughout the vibration. Mace (1984) proposed reflection and transmission coefficients of near-field waves for the cases of a point support and a change in section area. Using this method, the effects of the interaction of the near fields with neighboring discontinuities were included. The finite element method (FEM) was used to model the disbond, and an impedance measurement system was used to verify the theoretical results. These results showed that this method could be used to detect planar defects such as disbonds in composite materials, voids in laminated materials, and defects in honeycomb structures. Doyle (1991, 1997), Doyle et al. (1985, 1987, 1995), Rizzi and Doyle (1992), and Al-Hunaidi (1996) developed experimental and theoretical approaches to characterize dispersive flexural waves in a stepped beam and at the free end of a beam. Liu and Achenbach (1995) utilized the strip element method (SEM) to investigate wave scattering by cracks in anisotropic laminates. In solving the wave scattering problem, the SEM needed far fewer equations than the FEM. In practical applications of crack estimation, Liu and Lam (1994) and Lam et al. (1995) used the SEM for the characterization of horizontal and vertical cracks in anisotropic laminates. Ishak et al. (1999, 2001a) used the
SEM together with experimental studies to characterize horizontal cracks in isotropic beams. Numerical and experimental studies of crack detection in beams using transverse impact have also been conducted (Ishak, 2001; Ishak et al., 2002). These theoretical and experimental studies have found that elastic wave techniques are a very promising alternative for detecting and characterizing cracks in beams. The key issue in applying the elastic wave technique is to determine the relationship between the characteristics of the crack and the wave response of the structure. In parallel with this progress, interest has grown considerably in inverse procedures that employ numerical methods such as GAs and NNs for crack assessment. During the last three decades, there has been growing interest in solving problems with methods based on the principles of evolution and heredity, which maintain a population of potential solutions and apply genetic operators to it. GAs have also been used widely in NDE. Doyle (1995) combined the spectral element method with a stochastic genetic algorithm to locate and size cracks in structural components. The results indicated that the GA was a good scheme for arriving at improved estimates from a given set of initial random estimates of an objective function. Recently, Ishak (2001) developed a procedure employing GAs for inverse determination of the length and location of a crack in an isotropic beam from its dynamic displacement responses. Wu et al. (1992) reported that the NN was a promising method for solving the inverse problem of damage detection in a simple structure. They used NNs to represent the structural frequency response before and after damage and used the trained model to detect the location of the structural damage. Zgnoc and Achenbach (1996) used an NN and an ultrasonic technique for the detection and sizing of cracks emanating from rivet holes.
The network was trained with a combination of experimental data and synthetic data generated by the finite element method. It was shown that the FEM results could be used to train the NN and that the ultrasonic data processing system estimated the crack sizes well. Ishak et al. (2001c, 2002) used an NN model and the SEM code to detect the location and size of a delamination in composite laminated beams. Displacement responses calculated using the SEM for laminated beams containing predetermined delamination parameters (i.e., delamination location, depth, and length) were used as training data for the NN. Once trained, the NN was employed for inverse determination of the delamination from experimental displacement responses measured with a scanning laser vibrometer. Motivated by this recent progress, this chapter introduces numerical and experimental methods that use elastic waves and computational inverse techniques to determine the location and size of cracks in beams. The material presented here is largely from works by Ishak et al. (2000, 2001a, b, c, 2002).
10.2 Beams with Horizontal Delamination
This section introduces numerical and experimental methods for rough estimation of horizontal delamination or cracks in beams of isotropic and anisotropic materials. The SEM is used for numerical analysis of waves scattered by the delamination or crack in the beams in the frequency domain.
10.2.1 The SEM
In the SEM, the composite laminated beam with a delamination or crack is first divided into subdomains of layered structures. Each of the domains is divided further into strip elements in the thickness direction. The partial differential equations that govern the wave motion in each layer are converted into a dimension-reduced set of ordinary differential equations with respect to the horizontal coordinate only. This set of ordinary differential equations can be solved analytically to obtain the frequency wave response by combining the complementary solutions with the particular solutions. The final solution for the laminated beams with delamination is obtained by assembling all the equations of solutions for the subdomains. The SEM was developed by Liu and Achenbach (1994, 1995) for composite laminates based on works by Waas (1972), Dong and Nelson (1972), Nelson and Dong (1973), Kausel (1981, 1986), Liu et al. (1991b), etc. Some of the formulations of the SEM are similar to those presented earlier by Kausel and Roësset (1977) in the context of semianalytic hyperelement for layered strata of isotropic materials. Experimental studies will also be presented in this section. The experiments serve a very important purpose of verifying the SEM code as a forward solver for later inverse analysis.
10.2.2 Why SEM?
The SEM is chosen because it is particularly suitable for inverse analysis for the following reasons. In the SEM, the partial differential equations that govern the wave motion in the beams are analytically solved in the horizontal direction. Therefore, using wave response sampled on the surface of the beam along the axial (horizontal) direction to perform inverse analyses will not have Type III ill-posedness (see discussion in Section 2.3). Type I ill-posedness can be removed by sampling more points on the surface of the beam to have the problem over-posed, which can be very easily done in both computation and experiment. To remove Type II ill-posedness, it is necessary to ensure that the frequency wave response is sensitive to the presence, locations, and dimensions of the crack or delamination in the beams, which will be confirmed by the following intensive numerical and experimental studies.
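The idea of over-posing the problem by sampling more surface points than there are unknowns can be sketched as a least-squares search. The snippet below is only an illustration under stated assumptions: `toy_forward` is a hypothetical stand-in for a forward solver (it is not the SEM), the noise level and the search grid are arbitrary, and the "measured" data are synthetic.

```python
import numpy as np

def toy_forward(x, a_c, l_c):
    """HYPOTHETICAL stand-in for a forward solver (not the SEM): a decaying
    baseline plus a smooth bump centred on the crack span [a_c, a_c + l_c]."""
    centre, width = a_c + 0.5 * l_c, 0.5 * l_c
    return np.exp(-0.02 * x) + 0.5 * np.exp(-((x - centre) / width) ** 2)

def misfit(measured, x, a_c, l_c):
    """Sum of squared differences over all sampling points."""
    return float(np.sum((toy_forward(x, a_c, l_c) - measured) ** 2))

# 35 sampling points (1 cm apart) for only two unknowns: an over-posed problem.
x = np.arange(1.0, 36.0)

# Synthetic "measurement": toy response for a_c = 24.0, l_c = 2.5 plus small noise.
rng = np.random.default_rng(0)
measured = toy_forward(x, 24.0, 2.5) + 1e-3 * rng.standard_normal(x.size)

# Brute-force search over an assumed grid of candidate crack parameters.
candidates = [(a, l) for a in np.arange(20.0, 30.5, 0.5)
                     for l in np.arange(1.0, 5.5, 0.5)]
best_a, best_l = min(candidates, key=lambda c: misfit(measured, x, *c))
print("estimated a_c, l_c:", best_a, best_l)
```

With many more samples than unknowns, small measurement noise changes the misfit only slightly, so the minimizer stays at the parameters used to generate the data. A practical inverse procedure would replace the toy model with the actual forward solver and the grid search with a more efficient optimizer.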
FIGURE 10.1 Division of a beam with a horizontal crack into domains I to IV, with vertical boundaries b1 to b4 meeting at the junctions and the load F(x) applied on the upper surface. The beam lies in the region −∞ ≤ (x, y) ≤ ∞, −H/2 ≤ z ≤ H/2. The length of the horizontal crack and the depth of the crack from the upper surface of the specimen are represented as lc and dc, respectively. The distance from the origin to the left edge of the crack is denoted by ac. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
10.2.3 Brief on SEM Formulation
Consider an infinite anisotropic beam with a thickness H and a horizontal delamination as shown in Figure 10.1. It is assumed that the beam lies in the region −∞ ≤ (x, y) ≤ ∞, −H/2 ≤ z ≤ H/2. The length of the horizontal crack and the depth of the crack from the upper surface of the beam are represented as lc and dc, respectively. To simplify the problem, assume that the crack covers the whole width of the beam and that the harmonic loading is uniform over the width in the y direction. The analytical study then reduces to a two-dimensional case. The excitation force acting on the upper surface of the beam is a time-harmonic load fixed at x = 0 and can be represented as

\[
F(t) = F_0 \exp(i\omega t) \tag{10.1}
\]

where F0 is the amplitude of the harmonic load. It is expected that the excited flexural waves in the beam will be scattered by the delamination and carry information on the presence and geometrical parameters of the delamination. The beam with a horizontal delamination is divided into four subdomains, denoted by Roman numerals in Figure 10.1:
• Domain I is bounded by boundaries b1 and b2 and the upper and lower surfaces of the beam.
• Domain II is bounded by boundaries b3 and b4 and the upper and lower surfaces of the beam.
• Domain III is bounded by boundaries b1 and b3, the upper surface of the beam, and the upper surface of the delamination.
• Domain IV is bounded by boundaries b2 and b4, the lower surface of the crack, and the lower surface of the beam.
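The relationship between displacements and external traction at the nodal points on the vertical boundaries of the subdomains can be expressed as (Liu and Achenbach, 1995):

\[
\begin{Bmatrix} \mathbf{R}^{\mathrm{I}}_{b1} \\ \mathbf{R}^{\mathrm{I}}_{bL} \\ \mathbf{R}^{\mathrm{I}}_{b2} \end{Bmatrix}
=
\begin{bmatrix}
\mathbf{K}^{\mathrm{I}}_{11} & \mathbf{K}^{\mathrm{I}}_{12} & \mathbf{K}^{\mathrm{I}}_{13} \\
\mathbf{K}^{\mathrm{I}}_{21} & \mathbf{K}^{\mathrm{I}}_{22} & \mathbf{K}^{\mathrm{I}}_{23} \\
\mathbf{K}^{\mathrm{I}}_{31} & \mathbf{K}^{\mathrm{I}}_{32} & \mathbf{K}^{\mathrm{I}}_{33}
\end{bmatrix}
\begin{Bmatrix} \mathbf{V}^{\mathrm{I}}_{b1} \\ \mathbf{V}^{\mathrm{I}}_{bL} \\ \mathbf{V}^{\mathrm{I}}_{b2} \end{Bmatrix}
+
\begin{Bmatrix} \mathbf{S}^{\mathrm{I}}_{b1} \\ \mathbf{S}^{\mathrm{I}}_{bL} \\ \mathbf{S}^{\mathrm{I}}_{b2} \end{Bmatrix}
\tag{10.2}
\]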
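where the subscripts b1 and b2 stand for the vectors on the upper and lower right vertical boundary of domain I, respectively, excluding the left delamination tip. The subscript bL stands for the vectors at the left delamination tip. Similarly for domain II, the relationship between displacements and external traction can be expressed as:

\[
\begin{Bmatrix} \mathbf{R}^{\mathrm{II}}_{b3} \\ \mathbf{R}^{\mathrm{II}}_{bR} \\ \mathbf{R}^{\mathrm{II}}_{b4} \end{Bmatrix}
=
\begin{bmatrix}
\mathbf{K}^{\mathrm{II}}_{11} & \mathbf{K}^{\mathrm{II}}_{12} & \mathbf{K}^{\mathrm{II}}_{13} \\
\mathbf{K}^{\mathrm{II}}_{21} & \mathbf{K}^{\mathrm{II}}_{22} & \mathbf{K}^{\mathrm{II}}_{23} \\
\mathbf{K}^{\mathrm{II}}_{31} & \mathbf{K}^{\mathrm{II}}_{32} & \mathbf{K}^{\mathrm{II}}_{33}
\end{bmatrix}
\begin{Bmatrix} \mathbf{V}^{\mathrm{II}}_{b3} \\ \mathbf{V}^{\mathrm{II}}_{bR} \\ \mathbf{V}^{\mathrm{II}}_{b4} \end{Bmatrix}
+
\begin{Bmatrix} \mathbf{S}^{\mathrm{II}}_{b3} \\ \mathbf{S}^{\mathrm{II}}_{bR} \\ \mathbf{S}^{\mathrm{II}}_{b4} \end{Bmatrix}
\tag{10.3}
\]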
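where the subscripts b3 and b4 stand for the vectors on the upper and lower right vertical boundary of domain II, respectively, excluding the right delamination tip. The subscript bR stands for the vectors at the right delamination tip. For domain III and domain IV,

\[
\begin{Bmatrix} \mathbf{R}^{\mathrm{III}}_{b1} \\ \mathbf{R}^{\mathrm{III}}_{bL} \\ \mathbf{R}^{\mathrm{III}}_{b3} \\ \mathbf{R}^{\mathrm{III}}_{bR} \end{Bmatrix}
=
\begin{bmatrix}
\mathbf{K}^{\mathrm{III}}_{11} & \mathbf{K}^{\mathrm{III}}_{12} & \mathbf{K}^{\mathrm{III}}_{13} & \mathbf{K}^{\mathrm{III}}_{14} \\
\mathbf{K}^{\mathrm{III}}_{21} & \mathbf{K}^{\mathrm{III}}_{22} & \mathbf{K}^{\mathrm{III}}_{23} & \mathbf{K}^{\mathrm{III}}_{24} \\
\mathbf{K}^{\mathrm{III}}_{31} & \mathbf{K}^{\mathrm{III}}_{32} & \mathbf{K}^{\mathrm{III}}_{33} & \mathbf{K}^{\mathrm{III}}_{34} \\
\mathbf{K}^{\mathrm{III}}_{41} & \mathbf{K}^{\mathrm{III}}_{42} & \mathbf{K}^{\mathrm{III}}_{43} & \mathbf{K}^{\mathrm{III}}_{44}
\end{bmatrix}
\begin{Bmatrix} \mathbf{V}^{\mathrm{III}}_{b1} \\ \mathbf{V}^{\mathrm{III}}_{bL} \\ \mathbf{V}^{\mathrm{III}}_{b3} \\ \mathbf{V}^{\mathrm{III}}_{bR} \end{Bmatrix}
+
\begin{Bmatrix} \mathbf{S}^{\mathrm{III}}_{b1} \\ \mathbf{S}^{\mathrm{III}}_{bL} \\ \mathbf{S}^{\mathrm{III}}_{b3} \\ \mathbf{S}^{\mathrm{III}}_{bR} \end{Bmatrix}
\tag{10.4}
\]

and

\[
\begin{Bmatrix} \mathbf{R}^{\mathrm{IV}}_{bL} \\ \mathbf{R}^{\mathrm{IV}}_{b2} \\ \mathbf{R}^{\mathrm{IV}}_{bR} \\ \mathbf{R}^{\mathrm{IV}}_{b4} \end{Bmatrix}
=
\begin{bmatrix}
\mathbf{K}^{\mathrm{IV}}_{11} & \mathbf{K}^{\mathrm{IV}}_{12} & \mathbf{K}^{\mathrm{IV}}_{13} & \mathbf{K}^{\mathrm{IV}}_{14} \\
\mathbf{K}^{\mathrm{IV}}_{21} & \mathbf{K}^{\mathrm{IV}}_{22} & \mathbf{K}^{\mathrm{IV}}_{23} & \mathbf{K}^{\mathrm{IV}}_{24} \\
\mathbf{K}^{\mathrm{IV}}_{31} & \mathbf{K}^{\mathrm{IV}}_{32} & \mathbf{K}^{\mathrm{IV}}_{33} & \mathbf{K}^{\mathrm{IV}}_{34} \\
\mathbf{K}^{\mathrm{IV}}_{41} & \mathbf{K}^{\mathrm{IV}}_{42} & \mathbf{K}^{\mathrm{IV}}_{43} & \mathbf{K}^{\mathrm{IV}}_{44}
\end{bmatrix}
\begin{Bmatrix} \mathbf{V}^{\mathrm{IV}}_{bL} \\ \mathbf{V}^{\mathrm{IV}}_{b2} \\ \mathbf{V}^{\mathrm{IV}}_{bR} \\ \mathbf{V}^{\mathrm{IV}}_{b4} \end{Bmatrix}
+
\begin{Bmatrix} \mathbf{S}^{\mathrm{IV}}_{bL} \\ \mathbf{S}^{\mathrm{IV}}_{b2} \\ \mathbf{S}^{\mathrm{IV}}_{bR} \\ \mathbf{S}^{\mathrm{IV}}_{b4} \end{Bmatrix}
\tag{10.5}
\]

can be obtained. The continuity conditions at the junctions, which divide the beam into subdomains, are as follows: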
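\[
\begin{aligned}
\mathbf{R}_{b1} &= \mathbf{R}^{\mathrm{I}}_{b1} - \mathbf{R}^{\mathrm{III}}_{b1}, &
\mathbf{V}_{b1} &= \mathbf{V}^{\mathrm{I}}_{b1} = \mathbf{V}^{\mathrm{III}}_{b1} \\
\mathbf{R}_{bL} &= \mathbf{R}^{\mathrm{I}}_{bL} - \mathbf{R}^{\mathrm{III}}_{bL} - \mathbf{R}^{\mathrm{IV}}_{bL}, &
\mathbf{V}_{bL} &= \mathbf{V}^{\mathrm{I}}_{bL} = \mathbf{V}^{\mathrm{III}}_{bL} = \mathbf{V}^{\mathrm{IV}}_{bL} \\
\mathbf{R}_{b2} &= \mathbf{R}^{\mathrm{I}}_{b2} - \mathbf{R}^{\mathrm{IV}}_{b2}, &
\mathbf{V}_{b2} &= \mathbf{V}^{\mathrm{I}}_{b2} = \mathbf{V}^{\mathrm{IV}}_{b2} \\
\mathbf{R}_{b3} &= \mathbf{R}^{\mathrm{III}}_{b3} - \mathbf{R}^{\mathrm{II}}_{b3}, &
\mathbf{V}_{b3} &= \mathbf{V}^{\mathrm{II}}_{b3} = \mathbf{V}^{\mathrm{III}}_{b3} \\
\mathbf{R}_{bR} &= \mathbf{R}^{\mathrm{III}}_{bR} + \mathbf{R}^{\mathrm{IV}}_{bR} - \mathbf{R}^{\mathrm{II}}_{bR}, &
\mathbf{V}_{bR} &= \mathbf{V}^{\mathrm{II}}_{bR} = \mathbf{V}^{\mathrm{III}}_{bR} = \mathbf{V}^{\mathrm{IV}}_{bR} \\
\mathbf{R}_{b4} &= \mathbf{R}^{\mathrm{IV}}_{b4} - \mathbf{R}^{\mathrm{II}}_{b4}, &
\mathbf{V}_{b4} &= \mathbf{V}^{\mathrm{II}}_{b4} = \mathbf{V}^{\mathrm{IV}}_{b4}
\end{aligned}
\tag{10.6}
\]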
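By assembling the set of the equations for all the domains, the relationship between the displacement and external traction at the nodal points on the junction could be expressed as:

\[
\mathbf{R}_J = \mathbf{K}\mathbf{V}_J + \mathbf{S}_J \tag{10.7}
\]

where RJ is the external force vector acting on the junctions given by

\[
\mathbf{R}_J = \{\mathbf{R}_{b1} \;\; \mathbf{R}_{bL} \;\; \mathbf{R}_{b2} \;\; \mathbf{R}_{b3} \;\; \mathbf{R}_{bR} \;\; \mathbf{R}_{b4}\}^{T} \tag{10.8}
\]

and VJ is the displacement vector acting on the junctions given by

\[
\mathbf{V}_J = \{\mathbf{V}_{b1} \;\; \mathbf{V}_{bL} \;\; \mathbf{V}_{b2} \;\; \mathbf{V}_{b3} \;\; \mathbf{V}_{bR} \;\; \mathbf{V}_{b4}\}^{T} \tag{10.9}
\]

Moreover, SJ in Equation 10.7 is given by

\[
\mathbf{S}_J =
\begin{Bmatrix}
\mathbf{S}^{\mathrm{I}}_{b1} - \mathbf{S}^{\mathrm{III}}_{b1} \\
\mathbf{S}^{\mathrm{I}}_{bL} \\
\mathbf{S}^{\mathrm{I}}_{b2} - \mathbf{S}^{\mathrm{IV}}_{b2} \\
-\mathbf{S}^{\mathrm{II}}_{b3} + \mathbf{S}^{\mathrm{III}}_{b3} \\
-\mathbf{S}^{\mathrm{II}}_{bR} \\
-\mathbf{S}^{\mathrm{II}}_{b4} + \mathbf{S}^{\mathrm{IV}}_{b4}
\end{Bmatrix}
\tag{10.10}
\]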
In Equation 10.7, K is the stiffness matrix for the delaminated beam given by
I III K 11 − K 11 I III K − K 21 21 I K 31 K= III K 31 K III 41 0
I III K 12 − K 12
I K 13
III −K 13
III −K 14
I III IV K 22 − K 22 − K 11
I IV K 23 − K 12
III −K 23
III IV −K 24 − K 13
I IV K 32 − K 21 III K 32
I IV K 33 − K 22
0
K
III IV K 42 + K 31
IV K 32
K
IV K 41
IV K 42
III 33 III 43
0 II − K 11
IV −K 23 II K − K 12
II − K 21
III II IV K 44 − K 22 + K 33
II −K 31
III 34
IV II K 43 − K 32
IV −K 24 II −K 13 IV II K 34 − K 23 IV II K 44 − K 33 0 IV −K 14
(10.11)
By solving Equation 10.7, the displacement at the junctions can be obtained and then the whole displacement field can be computed. More detailed formulation for wave scattering analysis by delamination or crack in composite beams can be found in a book by Liu and Xi (2001). An SEM code has been developed following the formulations and will often be used in this book.
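The assembly and solution step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SEM code referred to in the text: the domain matrices are random placeholders, every junction is assumed to carry the same number of degrees of freedom `n`, and the names `DOMAINS` and `assemble` are hypothetical. The signs follow the junction force balance of Equation 10.6.

```python
import numpy as np

JUNCTIONS = ["b1", "bL", "b2", "b3", "bR", "b4"]

# For each domain: its junction ordering (Eqs. 10.2 to 10.5) and the sign with
# which its traction enters the junction balance (Eq. 10.6).
DOMAINS = {
    "I":   (["b1", "bL", "b2"],       {"b1": +1, "bL": +1, "b2": +1}),
    "II":  (["b3", "bR", "b4"],       {"b3": -1, "bR": -1, "b4": -1}),
    "III": (["b1", "bL", "b3", "bR"], {"b1": -1, "bL": -1, "b3": +1, "bR": +1}),
    "IV":  (["bL", "b2", "bR", "b4"], {"bL": -1, "b2": -1, "bR": +1, "b4": +1}),
}

def assemble(K_dom, n=1):
    """Assemble the junction matrix K of Eq. 10.11 from per-domain matrices.
    n = degrees of freedom per junction (assumed equal for all junctions)."""
    K = np.zeros((6 * n, 6 * n), dtype=complex)
    for name, (labels, sign) in DOMAINS.items():
        Kd = K_dom[name]
        for p, row in enumerate(labels):
            r = JUNCTIONS.index(row)
            for q, col in enumerate(labels):
                c = JUNCTIONS.index(col)
                K[r*n:(r+1)*n, c*n:(c+1)*n] += (
                    sign[row] * Kd[p*n:(p+1)*n, q*n:(q+1)*n])
    return K

# Placeholder data standing in for the output of a real forward model.
n = 1
rng = np.random.default_rng(1)
K_dom = {name: rng.standard_normal((len(labels) * n, len(labels) * n))
         for name, (labels, _) in DOMAINS.items()}
R_J = rng.standard_normal(6 * n)   # external forces at the junctions
S_J = rng.standard_normal(6 * n)   # load-dependent vector of Eq. 10.10

K = assemble(K_dom, n)
V_J = np.linalg.solve(K, R_J - S_J)   # Eq. 10.7 rearranged for V_J
print(np.round(V_J, 3))
```

Keeping the sign pattern in one table makes it straightforward to check individual blocks of the assembled matrix against Equation 10.11.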
10.2.4 Experimental Study
A series of experimental investigations has been carried out on two types of specimens (Ishak et al., 2001b). The type 1 specimen comes in two configurations (A and B) and the type 2 specimen in three crack configurations (A, B, and C). The dimensions and configurations of these specimens are shown in Figure 10.2. Type 1 specimens are made from perspex 800 mm long, 20 mm wide, and 5 mm thick. The problem is treated as plane strain because the out-of-plane dimension (the 20 mm width) is large compared with the height of the beam. Two specimens are made: beam 1A (no crack) and beam 1B (horizontal crack). The height of the beam is H = 5 mm. The parameters for the crack are lc = 25 mm and dc = 2.25 mm. The crack is artificially created with a milling cutter and is actually a very thin rectangular void with a thickness of 0.5 mm. Therefore, the two surfaces of the crack will not come into contact at the level of excitation used for nondestructive evaluation. Because the rectangular void is sufficiently thin, it is still referred to as a crack here. Note that the sharpness of the crack tip is not very important in this study because it affects the displacements and stresses only near the crack tip. This has been confirmed in a study using a method that combines the SEM and FEM (Liu, 2002b), where the FEM is used for the cracked region and crack tip elements can be used to model the singularity at the crack tip accurately (see, for example, Liu and Quek, 2003). In the inverse analysis, the displacement responses used are sampled on the surface of the beam, far away from the crack tip. Therefore, the sharpness of the crack tip need not cause much concern. Type 2 specimens are also made from perspex 800 mm long, 20 mm wide, and 5 mm thick. The height of the beam is H = 20 mm.
The problem is treated as plane stress because the out-of-plane dimension (the 5 mm thickness) is small compared with the height of the beam. The horizontal crack is artificially created with a milling cutter and is also a very thin rectangular void with a thickness of 0.5 mm. Three specimens are made. The cracks in all the specimens are 25 mm long and 5 mm wide, at a distance of 482.5 mm from the left end. The depths of the cracks in these three specimens are:
• Beam 2A: 4.75 mm below the beam surface
• Beam 2B: 9.75 mm below the beam surface
• Beam 2C: 14.75 mm below the beam surface
FIGURE 10.2 Specimens used in the first experimental study of crack estimation (unit: mm). Each panel marks the excitation point, the crack (where present), and the two end portions inside the sand box: (a) specimen 1A (without crack), H = 5 mm; (b) specimen 1B (with crack of 2.25 mm depth), H = 5 mm; (c) specimen 2A (with crack of 4.75 mm depth), H = 20 mm; (d) specimen 2B (with crack of 9.75 mm depth), H = 20 mm; (e) specimen 2C (with crack of 14.75 mm depth), H = 20 mm. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
The shaded areas in Figure 10.2 show that 200 mm at each end of the specimens is immersed in a box of sand. This is done to simulate a specimen of infinite length and to minimize the effect of the boundaries. The sandbox is tapered to give a smoother transition in impedance (see Figure 10.3), so that outgoing waves traveling along the beam axis into the sandbox are damped gradually without being reflected back. As a result, a nonreflecting boundary, or an infinite beam, is simulated experimentally. A schematic drawing of the experimental setup is shown in Figure 10.3. The response of the beam along its surface is measured using a Kistler accelerometer (type 8614A), which is attached to the measurement points using a cementing stud. A swept sine signal with frequencies from 0 to 20 kHz is generated by an LDS oscillator (type TPO 25) and applied to the specimen via a small exciter connected to a push rod. In this experimental study, the excitation point is fixed and the response is measured along the beam surface. The distance between the excitation and measurement points is denoted by x. The measured response signal from the accelerometer is then amplified using a power coupler. Subsequently, the output signals from the power coupler are recorded and analyzed using a Hewlett Packard type 35670A dynamic signal analyzer.
FIGURE 10.3 Schematic drawing of the experiment setup, showing the oscillator, the small exciter and push rod, the test piece with measurement points 1 to 35, the small accelerometer, the power coupler, the fast Fourier transform analyzer, and the sand in the supporting structure. The accelerometer is attached to the measurement points. A sine signal is generated by an LDS oscillator type TPO 25 and applied to the specimen via an LDS electrodynamic exciter type V101 connected to a coupling rod. The beam's response along its surface is measured using a Kistler accelerometer (type 8614A). The measured response signal from the accelerometer is then amplified using the power coupler. Subsequently, the output signals from the power coupler are recorded and analyzed using a Hewlett Packard type 35670A dynamic signal analyzer. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
10.2.5 Sensitivity Study and Rough Estimation of Crack in Isotropic Beams
Sensitivity studies can help to reduce the ill-posedness of inverse problems. In many cases, they can also help to estimate the bounds of the parameters to be identified, and even to estimate these parameters roughly. For beams with no crack, the displacement response is a smooth curve with no sudden change in its pattern, because there is no boundary that could scatter the incident wave. For beams with a horizontal crack, however, the pattern of the displacement response should change along the axial direction of the beam from region I to region II to region III (see Figure 10.1). This pattern change can be used to estimate the crack length and location approximately by simple observation. Accurate estimation of the crack parameters can be done using an inverse procedure, as long as the response is sensitive to these parameters.
10.2.5.1 Crack Length
The calculated and measured frequency responses (acceleration) of specimen 1A along its surface are shown in Figure 10.4(a) and Figure 10.4(b) at frequencies of 6515 and 6666 Hz, respectively. Both calculated and measured results are normalized. Figure 10.4(a) shows that the SEM results agree reasonably well with the experimental ones. Neither curve shows a significant pattern change along the length of the beam. Similar findings can be obtained from Figure 10.4(b), which is calculated at a higher frequency. In the specimen with no crack, there is no boundary that may reflect or scatter the incident wave, so the wave propagates smoothly. The different numbers of peaks in Figure 10.4(a) and Figure 10.4(b) indicate that the response of the beam to the harmonic load depends on the excitation frequency. The calculated and measured frequency responses on the surface of specimen 1B are shown in Figure 10.5. Again, the SEM results agree reasonably well with the experimental results for the beam with a crack.
From Figure 10.5 it is noted that the crack length can be approximately estimated from the pattern change of the beam response curve. Also, it can be observed clearly that the response starts to die off at about x = 26.5 cm, which is the right crack tip. In this case the left tip of the crack cannot be observed, but it can be found easily by moving the loading point to the right side of the crack.
FIGURE 10.4 Calculated and measured response of beam 1A to harmonic excitations: (a) specimen 1A at a frequency of 6.515 kHz; (b) specimen 1A at a frequency of 6.666 kHz. Each panel plots the normalized theoretical and experimental responses (0 to 1) against the distance from the excitation point (1 to 35 cm). The SEM results agree well with the experimental results for a beam without crack. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
The calculated and measured frequency responses along the surface of specimen 2A are shown in Figure 10.6. Compared with the response of beam 1B, the response of specimen 2A in Figure 10.6 indicates the crack region more clearly. The growth of the beam response at about x = 24 cm indicates the beginning of the crack region. The response approaches its peak value at about x = 25.5 cm, a point very close to the center of the crack region. After that, the response declines and begins to die off at x = 26.5 cm. From the curves in Figure 10.6, it can be concluded that the crack region lies roughly between x = 24 and 26.5 cm. Both computed and measured responses are very sensitive to the presence and the location of the crack.
10.2.5.2 Crack Depth
In a further investigation, the crack depth is correlated with the frequency that gives a response sensitive to the crack. For that purpose, the SEM code and the experimental study are applied to beam 2B and beam 2C, which have crack depths of 9.75 and 14.75 mm, respectively. The calculated and measured responses of beam 2B and beam 2C are shown in Figure 10.7 and Figure 10.8.
FIGURE 10.5 Calculated and measured response of beam 1B to harmonic excitations: (a) specimen 1B at a frequency of 6.515 kHz; (b) specimen 1B at a frequency of 6.666 kHz. Each panel plots the normalized theoretical and experimental responses (0 to 1) against the distance from the excitation point (1 to 35 cm). The SEM results agree reasonably well with the experimental results for a beam with a horizontal crack. The crack length can be approximately determined from the pattern change of the beam response. It can be observed clearly that the starting point of the decaying response occurs at about x = 26.5 cm, which corresponds to the right tip of the crack. The harmonic response is very sensitive to the presence and location of the crack. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
As in the response of beam 2A, the response of beam 2B in Figure 10.7 clearly indicates the starting and ending points of the crack region. The response starts to grow at about x = 24 cm, peaks near x = 25.5 cm, and then decreases to a minimum at about x = 26.5 cm before dying down. The response of beam 2C shows the same pattern as that of beam 1B, clearly indicating only the ending point of the crack region. In this case, the crack is deeper and is located closer to the bottom surface of the beam. Thus, if the pattern change is not very clear, it can be improved by exciting the beam on its other side. The responses of beams 2A, 2B, and 2C indicate that, to enhance the sensitivity of the response to deeper cracks, it is necessary to excite the beam at a higher frequency. For detecting a crack of 4.75 mm depth, a frequency of 6515 Hz is needed, and for a crack of 9.75 mm depth, a frequency of 10,000 Hz is required. For detecting a crack of 14.75 mm depth, an even higher frequency of 12,250 Hz is needed.
FIGURE 10.6 Calculated and measured response of beam 2A to harmonic excitations: (a) specimen 2A at a frequency of 6.515 kHz; (b) specimen 2A at a frequency of 6.666 kHz. Each panel plots the normalized theoretical and experimental responses (0 to 1) against the distance from the excitation point (1 to 35 cm). The SEM results agree reasonably well with the experimental results. The crack region is clearly shown in this figure. Increased beam response at x = 24 cm indicates the beginning of the crack region. The peak value of the beam response is at about x = 25.5 cm, very close to the center of the crack region. After this, the response starts to decline and decays at about x = 26.5 cm. From the curves, it can be concluded that the crack lies roughly between x = 24 and 26.5 cm. The harmonic response is very sensitive to the presence and location of the crack. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
FIGURE 10.7 Calculated and measured response of beam 2B to harmonic excitations: (a) specimen 2B at a frequency of 10 kHz; (b) specimen 2B at a frequency of 10.25 kHz. Each panel plots the normalized theoretical and experimental responses (0 to 1) against the distance from the excitation point (1 to 35 cm). There is an indication of the starting and ending points of the crack region. The response starts to increase at about x = 24 cm, reaches a peak near x = 25.5 cm, and then decreases to a minimum at about x = 26.5 cm before decaying. The harmonic response is very sensitive to the presence and location of the crack. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
10.3 Beam Model of Flexural Wave
Theoretical and experimental studies on flexural waves in beams have long been established (Graff, 1975; Doyle, 1991, 1997; Doyle et al., 1985, 1987, 1995; Mujumdar and Suryanarayan, 1988). The formulation presented here is largely from that given by Ishak et al. (2001a), based on work by Graff (1975) for beams, Doyle (1997) for cracked beams, and Liu et al. (1991b, 1995b) and Liu and Lam (1994) for composite laminates with infinite boundaries. Compared with the SEM, the beam model employs a smaller number of equations; consequently, much shorter computing time is needed for solutions of comparable accuracy. In the beam model, the differential equations that govern the wave motion in the beam are solved analytically. Therefore, using the wave response sampled on the surface of the beam along the axial (horizontal) direction to perform inverse analyses will not have Type III ill-posedness (see discussion in Section 2.3). This reasoning is the same as in Section 10.2.2. Next, the formulation of the beam model is introduced. Experimental studies are also conducted on beams containing a simulated crack to validate the beam model. The results of the beam model are verified by comparing them with those obtained by the SEM code. The effects of excitation frequencies and their associated wavelengths, material properties, and crack sizes on the scattering of flexural waves are investigated using the beam model.
FIGURE 10.8 Calculated and measured response of beam 2C to harmonic excitations: (a) specimen 2C at a frequency of 12.25 kHz; (b) specimen 2C at a frequency of 12.50 kHz. Each panel plots the normalized theoretical and experimental responses (0 to 1) against the distance from the excitation point (1 to 35 cm). The SEM results agree reasonably well with the experimental results. The ending point at x = 26.5 cm of the crack region is indicated by the oscillation suddenly dying off. The harmonic response is very sensitive to the presence and location of the crack. (From Ishak, S.I. et al., J. Sound Vib., 238(4), 661–671, 2000. With permission.)
10.3.1 Basic Assumptions
In the beam model of wave propagation, the beam is divided into four regions along its span: two crack regions separated by the crack and two infinite regions, one on each side of the crack regions, as shown in Figure 10.9. Each region is modeled as an Euler beam. The beam is assumed to be thin, and the effects of rotary inertia and shear deformation are not taken into consideration. Under the Euler beam assumptions, the model applies only to beams whose cross-sectional dimensions are much smaller than their length. Therefore, the model introduced here can only be applied to thin beams subjected to low-frequency excitation. The solution for the entire delaminated beam is obtained in terms of the solutions for all the regions, based on Euler beam theory, by satisfying the appropriate boundary conditions at the junctions between these regions.
FIGURE 10.9 Geometry and modeling of a beam with a delamination for wave propagation analysis. The drawing shows the load F(x) = δ(x + ξ) exp(−iωt) applied at a distance ξ from the delamination, the delamination length lc and depth dc, the beam height H, the local coordinates (xi, zi) and displacements (ui, wi) of regions 1 to 4, and the internal forces Pd, Vi, and Mi at the junctions. The beam is divided into four regions in terms of spans: two delamination regions separated by the delamination and two infinite regions, one on each side of the delamination regions. Each region is modeled as an Euler beam. (From Ishak, S.I. et al., ASME, J. Vib. Acoust., 123, 421–427, 2001. With permission.)
10.3.2 Homogeneous Solution
The governing equation of motion for a region can be presented as

\[
\frac{\partial^4 w_i}{\partial x_i^4} + \frac{\rho A_{ci}}{EI}\,\frac{\partial^2 w_i}{\partial t^2} = 0, \qquad i = 1, 4 \tag{10.12}
\]

where I is the second moment of area of a beam region; ρ and E are the density and Young's modulus of the beam material. Aci, xi, and wi are the cross-sectional area, the axial coordinate, and the transverse displacement of the region i. For the delaminated regions, the governing equations can be written as

\[
-EI_2\,\frac{\partial^4 w_2}{\partial x_2^4} - P_d\,\frac{\partial^2 w_2}{\partial x_2^2} - \rho A_{c2}\,\frac{\partial^2 w_2}{\partial t^2} - p = 0 \tag{10.13}
\]

\[
-EI_3\,\frac{\partial^4 w_3}{\partial x_3^4} + P_d\,\frac{\partial^2 w_3}{\partial x_3^2} - \rho A_{c3}\,\frac{\partial^2 w_3}{\partial t^2} + p = 0 \tag{10.14}
\]
where p is the normal contact pressure distribution between the two regions, and $P_d$ is the magnitude of the axial load in each region. Figure 10.10 shows the pattern of normal contact pressure distribution between the delaminated regions.

FIGURE 10.10 Modeling of the contact pressure between the layers of the delamination region. Segments 2 and 3 at the delamination region are set to have identical transverse displacements and, except at their ends, assumed to be free to slide over each other in the axial direction.

Segments 2 and 3 at the crack region are set to have identical transverse displacements and are assumed to be free to slide over each other in the axial direction, except at their ends. At any section ($x_2 = x_3$) through the crack region, $w_2$ is equal to $w_3$; by replacing $w_3$ and $x_3$ by $w_2$ and $x_2$ in Equation 10.14, and adding it to Equation 10.13, the governing equation can be written as
$$\frac{\partial^4 w_2}{\partial x_2^4} + \left\{\frac{\rho (A_{c2} + A_{c3})}{E(I_2 + I_3)}\right\}\frac{\partial^2 w_2}{\partial t^2} = 0 \tag{10.15}$$
For harmonic motion $w_{hi} = \tilde{w}_{hi}(x_i)\exp(-i\omega t)$, Equation 10.12 and Equation 10.15 can be written as

$$\frac{\partial^4 \tilde{w}_{hi}}{\partial x_i^4} - \lambda_i^4\,\tilde{w}_{hi} = 0, \qquad i = 1, 2, 3, 4 \tag{10.16}$$
where the subscript h denotes the homogeneous solution, and $\lambda_i$ are the frequency parameters given by

$$\lambda_1^4 = \lambda_4^4 = \frac{\rho A_{c1}\,\omega^2}{EI} \tag{10.17}$$

$$\lambda_2^4 = \lambda_3^4 = \frac{\rho (A_{c2} + A_{c3})\,\omega^2}{E(I_2 + I_3)} \tag{10.18}$$
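As a rough numerical illustration of Equation 10.17, the following sketch evaluates the frequency parameter and the corresponding flexural wavelength 2π/λ for a rectangular section. The function names, the density of 2700 kg/m³, and the section dimensions are assumptions made only for illustration; the Young's modulus of 69 GPa is the value quoted later in this chapter.

```python
import math


def frequency_parameter(rho, area, E, I, omega):
    """Frequency parameter lambda = (rho*A*omega^2 / (E*I))**(1/4)."""
    return (rho * area * omega ** 2 / (E * I)) ** 0.25


def flexural_wavelength(rho, area, E, I, freq_hz):
    """Wavelength 2*pi/lambda of the propagating term exp(i*lambda*x)."""
    omega = 2.0 * math.pi * freq_hz
    return 2.0 * math.pi / frequency_parameter(rho, area, E, I, omega)


E = 69e9          # Pa, Young's modulus quoted in the chapter
RHO = 2700.0      # kg/m^3, ASSUMED typical aluminum density (not from the text)
WIDTH = 0.0254    # m, ASSUMED section width
DEPTH = 0.0254    # m, ASSUMED section depth in the bending direction
AREA = WIDTH * DEPTH
INERTIA = WIDTH * DEPTH ** 3 / 12.0   # second moment of area of a rectangle

for freq in (7.5e3, 12.5e3, 17.5e3):
    wavelength = flexural_wavelength(RHO, AREA, E, INERTIA, freq)
    print(f"{freq / 1e3:5.1f} kHz -> wavelength {wavelength * 1e3:7.2f} mm")
```

Because λ grows with the square root of ω, the flexural wavelength scales as the inverse square root of the frequency: quadrupling the frequency halves the wavelength.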
The general homogeneous solution of Equation 10.16 is given by

$$\tilde{w}_{hi}(x_i) = A_i \exp(i\lambda_i x_i) + B_i \exp(-i\lambda_i x_i) + C_i \exp(\lambda_i x_i) + D_i \exp(-\lambda_i x_i), \qquad i = 1, 2, 3, 4 \tag{10.19}$$
The first two parts are the propagating terms, while the last two parts are the stationary terms. The radiation condition at the infinite region 1 states that waves can only propagate away in the negative direction to infinity. This requires $A_1 = 0$ because that term, in conjunction with the time variation term, corresponds to incoming waves (see, for example, Chapter 1 in Liu and Xi, 2001). In addition, the solution must be bounded when x approaches negative infinity. Therefore $D_1$ must also be zero. Thus, the homogeneous solution in the infinite region on the left side can be simplified as

$$\tilde{w}_{h1}(x_1) = B_1 \exp(-i\lambda_1 x_1) + C_1 \exp(\lambda_1 x_1) \tag{10.20}$$
In contrast with infinite region 1, the wave in infinite region 4 propagates only in the positive x direction. Thus, for this region, $B_4 = 0$ is required. Obtaining a bounded solution in this region as x → ∞ requires $C_4 = 0$. Therefore, the solution for region 4 takes the form

$$\tilde{w}_{h4}(x_4) = A_4 \exp(i\lambda_4 x_4) + D_4 \exp(-\lambda_4 x_4) \tag{10.21}$$

10.3.3 Particular Solution
The inhomogeneous equation of motion for a beam subjected to a harmonic concentrated load at x = –ξ can be expressed as (Ishak et al., 2001a)

$$\frac{\partial^4 w}{\partial x^4} + \frac{\rho A_{c1}}{EI}\,\frac{\partial^2 w}{\partial t^2} = \frac{\delta(x + \xi)\exp(-i\omega t)}{EI} \tag{10.22}$$
Considering a solution of the form $w_p = \tilde{w}_p(x)\exp(-i\omega t)$,

$$\frac{\partial^4 \tilde{w}_p}{\partial x^4} - \left(\frac{\rho A_{c1}\,\omega^2}{EI}\right)\tilde{w}_p = \frac{\delta(x + \xi)}{EI} \tag{10.23}$$
can be obtained, where the subscript p denotes the particular solution. Applying the Fourier transform to Equation 10.23, the particular solution is given by

$$\tilde{w}_p(x) = \frac{1}{2\pi EI}\int_{-\infty}^{\infty} \frac{\exp\{-i\lambda(x + \xi)\}}{\lambda^4 - \lambda_0^4}\,d\lambda \tag{10.24}$$
There are four poles in the integrand, two on the real axis and two on the imaginary axis. They are

$$\lambda = \pm\lambda_0,\ \pm i\lambda_0 \tag{10.25}$$

where $\lambda_0 = (\rho A_{c1}\,\omega^2 / EI)^{1/4}$.
FIGURE 10.11 Contour for analyzing waves in the infinite beam x > –ξ subjected to a harmonic concentrated load. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)
For x > –ξ, it is necessary to choose the semicircular contour in the lower half of the complex λ-plane to carry out the integration in Equation 10.24. The imaginary root in the upper half-plane is excluded by the contour closure because it would result in an unbounded solution when x approaches positive infinity. The pole at λ = –λ0, in conjunction with the term exp(–iωt), represents the wave propagating rightwards when x > –ξ. It is therefore included in the contour loop. The pole at λ = +λ0 corresponds to the wave propagating leftwards. It must be excluded from the contour loop by an indentation below it. The resulting contour of integration is shown in Figure 10.11. By applying the same analysis for the case of x < –ξ, it is found that the semicircular contour in the upper half of the complex λ-plane must be chosen. Employing the residue theorem for the integration in Equation 10.24 yields the particular solution as

$$\tilde{w}_p(x) = \begin{cases} \dfrac{i}{4\lambda_0^3 EI}\exp\{i\lambda_0 (x + \xi)\} - \dfrac{1}{4\lambda_0^3 EI}\exp\{-\lambda_0 (x + \xi)\}, & \text{for } x \ge -\xi \\[2ex] \dfrac{i}{4\lambda_0^3 EI}\exp\{-i\lambda_0 (x + \xi)\} - \dfrac{1}{4\lambda_0^3 EI}\exp\{\lambda_0 (x + \xi)\}, & \text{for } x < -\xi \end{cases} \tag{10.26}$$
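The properties expected of the particular solution can be checked numerically. The sketch below assumes the outgoing-wave form $\tilde{w}_p = [\,i\exp(i\lambda_0 s) - \exp(-\lambda_0 s)\,]/(4\lambda_0^3 EI)$ with $s = |x + \xi|$, which is what closing the contour of Equation 10.24 around the poles at –λ0 and –iλ0 gives. It then verifies three conditions implied by Equation 10.23: zero slope at the load point, a jump of 1/EI in the third derivative across the load, and a vanishing residual away from the load. The function names and the numerical values of λ0, EI, and ξ are illustrative assumptions.

```python
import cmath


def particular_derivative(n, s, lam0, EI):
    """n-th derivative with respect to s = x + xi of the assumed particular
    solution on the side s >= 0 (n = 0 returns the displacement itself)."""
    c = 1.0 / (4.0 * lam0 ** 3 * EI)
    propagating = 1j * (1j * lam0) ** n * cmath.exp(1j * lam0 * s)
    evanescent = -((-lam0) ** n) * cmath.exp(-lam0 * s)
    return c * (propagating + evanescent)


def particular_solution(x, xi, lam0, EI):
    """Displacement at x for a unit harmonic load at x = -xi.
    The assumed solution is an even function of x + xi."""
    return particular_derivative(0, abs(x + xi), lam0, EI)


LAM0 = 35.0   # 1/m, illustrative frequency parameter (assumption)
EI = 100.0    # N*m^2, illustrative bending stiffness (assumption)
XI = 0.2      # m, illustrative load offset (assumption)

# Symmetry about the load implies zero slope at s = 0.
SLOPE_AT_LOAD = particular_derivative(1, 0.0, LAM0, EI)
# The third derivative is odd in s, so its jump is twice the one-sided value.
SHEAR_JUMP = 2.0 * particular_derivative(3, 0.0, LAM0, EI)
# Away from the load the solution satisfies w'''' - lam0^4 * w = 0.
RESIDUAL = (particular_derivative(4, 0.05, LAM0, EI)
            - LAM0 ** 4 * particular_derivative(0, 0.05, LAM0, EI))

print("slope at load:", abs(SLOPE_AT_LOAD))
print("jump in third derivative times EI:", (SHEAR_JUMP * EI).real)
print("residual away from load:", abs(RESIDUAL))
```

The magnitude of this solution has a single peak at the load point and declines symmetrically on both sides, in line with the behavior described for the particular solution later in the chapter.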
The general solution for a beam can finally be written as

$$\tilde{w} = \tilde{w}_h + \tilde{w}_p \tag{10.27}$$

This equation is applicable to each of the four regions.

10.3.4 Continuity Conditions
In order to satisfy the compatibility of displacements and equilibrium of forces at the junctions between the infinite and crack regions, the conditions of continuity must be applied at these junctions. The continuity conditions at the junction of $x_1 = 0$ and $x_2 = 0$ are as follows:

• Continuity of transverse displacement

$$\tilde{w}_1 = \tilde{w}_2 \tag{10.28}$$

• Continuity of normal slope

$$\frac{\partial \tilde{w}_1}{\partial x_1} = \frac{\partial \tilde{w}_2}{\partial x_2} \tag{10.29}$$

The compatibility of transverse displacement and slopes between regions 1 and 3 is also automatically satisfied with the assumption of $\tilde{w}_2 = \tilde{w}_3$.

• Continuity of shear forces

$$-EI\,\frac{\partial^3 \tilde{w}_1}{\partial x_1^3} = -(EI_2 + EI_3)\,\frac{\partial^3 \tilde{w}_2}{\partial x_2^3} \tag{10.30}$$

• Continuity of bending moments

$$-EI\,\frac{\partial^2 \tilde{w}_1}{\partial x_1^2} = -(EI_2 + EI_3)\,\frac{\partial^2 \tilde{w}_2}{\partial x_2^2} + P_d\,(H/2) \tag{10.31}$$
The continuity conditions at the junction $x_2 = l_c$, $x_4 = 0$ are obtained by replacing $\tilde{w}_1$ and $x_1$ by $\tilde{w}_4$ and $x_4$, respectively, in Equation 10.28 through Equation 10.31. Apart from the continuity of transverse displacement, slopes, bending moments, and shear forces, the continuity of axial displacement and forces must be satisfied at each junction. As shown in Figure 10.12, an additional axial load system of equal and opposite forces on the delaminated regions is needed to maintain geometrical compatibility at the ends of these regions in the same plane. The continuity of axial displacement on the junctions satisfies the following conditions

$$(H/2)\left[\frac{\partial \tilde{w}_1}{\partial x_1}\right]_{x_1 = 0} = \left[\tilde{u}_2 - \tilde{u}_3\right]_{x_2 = 0} \tag{10.32}$$

$$(H/2)\left[\frac{\partial \tilde{w}_4}{\partial x_4}\right]_{x_4 = 0} = \left[\tilde{u}_2 - \tilde{u}_3\right]_{x_2 = l_c} \tag{10.33}$$

Subtracting Equation 10.33 from Equation 10.32 yields

$$(H/2)\left[\frac{\partial \tilde{w}_1(0)}{\partial x_1} - \frac{\partial \tilde{w}_4(0)}{\partial x_4}\right] = \left[\tilde{u}_3(l_c) - \tilde{u}_3(0)\right] - \left[\tilde{u}_2(l_c) - \tilde{u}_2(0)\right] \tag{10.34}$$
FIGURE 10.12 Deformation and stresses when the axial displacements are satisfied.
By considering the axial equilibrium of the two regions of the crack region and neglecting the longitudinal inertia terms, the total axial extension of the middle plane of each of these regions can be obtained as

$$\tilde{u}_i(l_c) - \tilde{u}_i(0) = \frac{P_i\,l_c}{E A_{ci}} - \frac{1}{2}\int_0^{l_c}\left(\frac{\partial \tilde{w}_i}{\partial x_i}\right)^2 dx_i, \qquad i = 2, 3 \tag{10.35}$$
where $P_2 = -P_d$ and $P_3 = P_d$. The integral term in Equation 10.35 represents the axial shortening caused by bending of the beam regions. Substituting Equation 10.35 for the axial displacements into Equation 10.34 and noting that the integral terms cancel because $\tilde{w}_2$ equals $\tilde{w}_3$, Equation 10.34 can be rewritten as

$$P_d = (H/2)\,\frac{E A_{c2} A_{c3}}{l_c\,(A_{c2} + A_{c3})}\left[\frac{\partial \tilde{w}_1(0)}{\partial x_1} - \frac{\partial \tilde{w}_4(0)}{\partial x_4}\right] \tag{10.36}$$
Using this expression for the axial force term, Equation 10.28 through Equation 10.31 and the corresponding equations at the second junction can be written as a set of eight simultaneous linear equations with eight unknown constants. By solving this set of equations for these constants, the frequency response of the beam can then be easily determined.
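The axial-force relation of Equation 10.36 can be sketched in a few lines. The example below evaluates $P_d$ from the end slopes and then checks it against the compatibility condition of Equation 10.34, using the extensions of Equation 10.35 with the bending terms cancelled. The function names and all numerical values (section areas, beam height, slopes) are hypothetical and chosen only for illustration.

```python
def delamination_axial_force(E, area2, area3, height, crack_length, slope1, slope4):
    """Axial force Pd of Equation 10.36 from the slopes at the two junctions."""
    stiffness = E * area2 * area3 / (crack_length * (area2 + area3))
    return 0.5 * height * stiffness * (slope1 - slope4)


def extension_mismatch(E, area2, area3, crack_length, axial_force):
    """[u3(lc) - u3(0)] - [u2(lc) - u2(0)] with P2 = -Pd and P3 = +Pd.
    The bending (integral) terms cancel because w2 = w3."""
    extension3 = axial_force * crack_length / (E * area3)
    extension2 = -axial_force * crack_length / (E * area2)
    return extension3 - extension2


# Illustrative values only (assumptions, not taken from the text).
E = 69e9                 # Pa
WIDTH = 0.0254           # m
AREA2 = WIDTH * 0.005    # m^2, layer above the delamination
AREA3 = WIDTH * 0.020    # m^2, layer below the delamination
HEIGHT = 0.025           # m, total height H
CRACK_LENGTH = 0.035     # m, lc
SLOPE1, SLOPE4 = 2.0e-6, -1.0e-6   # slopes dw1/dx1 and dw4/dx4 at the junctions

PD = delamination_axial_force(E, AREA2, AREA3, HEIGHT, CRACK_LENGTH, SLOPE1, SLOPE4)
LHS = 0.5 * HEIGHT * (SLOPE1 - SLOPE4)                       # left side of Eq. 10.34
RHS = extension_mismatch(E, AREA2, AREA3, CRACK_LENGTH, PD)  # right side of Eq. 10.34
print(f"Pd = {PD:.3f} N, Eq. 10.34 mismatch = {abs(LHS - RHS):.3e}")
```

Because $P_d$ is linear in the slopes, it can be eliminated from Equation 10.31, which is what keeps the eight junction equations linear in the unknown constants.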
10.3.5 Comparison between SEM and Beam Model

The relationship between the characteristics of the crack and the beam displacements subjected to harmonic excitations has been investigated (Ishak, 2001) analytically and experimentally. The comparison of beam model results with those obtained using the SEM code has also been conducted. For a nondelaminated beam subjected to point harmonic excitation on its surface, the displacement distribution on the beam surface is a smooth curve. However, for a delaminated beam, the corresponding displacement distribution will be perturbed, especially over the region of the crack. Numerical calculations are made on an aluminum beam of 25.4 mm width and 4.5 mm thickness containing a simulated through-width crack located at 4.75 mm below the beam surface at 242.5 mm from the excitation point (see Figure 10.13). The height of the beam is H = 25.4 mm. The lengths of the crack are

• Beam 3A: lc = 35 mm
• Beam 3B: lc = 45 mm
• Beam 3C: lc = 55 mm

The Young’s modulus of the aluminum used in this calculation is 69 GPa. Based on the assumptions of the Euler beam model, the analysis on beams 3A, 3B, and 3C is limited to frequencies up to 21.6, 13.1, and 8.7 kHz, respectively. These frequencies correspond to the first natural frequency of crack region 2 on beams 3A, 3B, and 3C, respectively.

FIGURE 10.13 Specimens used in the second experimental study (unit: mm). Panels: (a) Specimen 3A, (b) Specimen 3B, (c) Specimen 3C; H = 25.4 mm in each case. Both ends of each specimen are inside the sand box.
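The three frequency limits quoted above can be cross-checked against each other. For a uniform Euler beam with a fixed cross section and fixed end conditions, the natural frequencies scale with the inverse square of the span, so the limits for beams 3B and 3C should follow from that of beam 3A by the factor (35/lc)². The sketch below performs this check; the function name is hypothetical, and the scaling law is the only ingredient beyond the quoted numbers.

```python
# Frequency limits (kHz) quoted in the text, keyed by crack length (mm).
FREQUENCY_LIMIT_KHZ = {35.0: 21.6, 45.0: 13.1, 55.0: 8.7}


def scaled_limit(reference_length, reference_limit, length):
    """Scale a natural frequency with the inverse square of the span."""
    return reference_limit * (reference_length / length) ** 2


for length, quoted in FREQUENCY_LIMIT_KHZ.items():
    predicted = scaled_limit(35.0, FREQUENCY_LIMIT_KHZ[35.0], length)
    print(f"lc = {length:4.0f} mm: quoted {quoted:5.1f} kHz, scaled {predicted:5.2f} kHz")
```

The scaled values agree with the quoted limits to within the rounding of the quoted figures.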
For beam 3A, the magnitudes of the homogeneous and particular solutions of beam displacement subjected to a harmonic concentrated excitation of 17.5 kHz are shown in Figure 10.14. Figure 10.14(a) shows that the homogeneous solution comprises two identical peaks at the left- and right-hand edges of the crack. The U-curve area on this solution indicates approximately the crack region. Figure 10.14(b) indicates that the particular solution has a single peak at the excitation point and then symmetrically declines to its left and right sides.

FIGURE 10.14 Homogeneous and particular solutions of displacement of beam 3A subjected to a point harmonic excitation. (a) The homogeneous solution comprises two identical peaks at the left- and right-hand edges of the delamination. The U-curve area on this solution indicates the delamination region. (b) The particular solution has a single peak at the excitation point and then symmetrically declines to its left and right sides. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)

Figure 10.15 presents the general solutions of the vertical displacement of beam 3A computed using the beam model and SEM code at a frequency of 17.5 kHz. It shows that the results of the beam model agree reasonably well with the SEM results. The transverse displacement of beam 3A begins to increase at a distance of 235 mm from the excitation point (x1 = –7.5 mm) and then approaches its peak value at about 256 mm. After that, the displacement declines and begins to die off at about 284 mm away from the excitation point. Mujumdar and Suryanarayan (1988) mention that the abrupt change in the transverse displacement at the crack region is mainly due to the reduction in flexural rigidity.

Figure 10.15 shows that the crack is located in the region where the pattern of the beam displacement curve changes significantly. Approaching the left-hand edge of the crack region, the beam response begins to increase and achieves its peak value near the middle of the crack. Beyond that, the response begins to decay and attains a steady-state value at the right-hand edge of the delaminated region. The length of this area is related to the crack length, which means that for a longer crack, the pattern change will occur in a wider region. The left side of the crack edge cannot always be clearly seen from the displacement distribution curve, but the right-hand side is always quite clearly shown by the point where the response curve starts to die off. This important phenomenon was reported by Liu et al. (1996) based on their numerical analysis. This evidence indicates that the left-hand edge of the crack can be roughly estimated by applying the excitation force in the infinite region 4. From these results, the length of the crack on beam 3A is estimated at about 35 mm.

FIGURE 10.15 Calculated response of displacement amplitude of beam 3A to a point harmonic excitation at the origin with a frequency of 17.5 kHz. It shows that results of the beam model agree reasonably well with those obtained using the SEM code. The transverse displacement of beam 3A begins to increase at 235-mm distance from the excitation point (x1 = –0.75 cm) and then approaches its peak value at 25.6 cm. Axes: transverse displacement (x10e-7 m) versus distance from excitation point (cm); curves: beam model and SEM. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)

The calculated beam 3A and beam 3B displacements at a frequency of 12.5 kHz are presented in Figure 10.16. Again, it is seen that the beam model agrees reasonably well with the SEM results. Figure 10.16 shows that the beam displacement begins to increase at 220 and 230 mm from the excitation point and approaches its peak at 256 and 261 mm for beam 3A and beam 3B, respectively. After that, this displacement begins to decay and attains a steady value at 280 and 290 mm for beam 3A and beam 3B, respectively. The displacements for beam 3B and beam 3C at a frequency of 7.5 kHz are shown in Figure 10.17. These results also show that the beam model agrees reasonably well with SEM results at lower frequency. The disadvantage of using a lower excitation frequency is that the pattern change of the displacement curve at the crack region becomes less significant. Relating the crack length to the excitation frequency as presented in Table 10.1, it is found that a suitable excitation frequency for detecting the presence of a crack should generate waves with a wavelength longer than the crack length. In contrast with the ultrasonic method, which requires signals whose dominant wavelength is much shorter than the crack length, this evidence shows the feature of using dispersive Lamb waves. For the cases presented here, the wavelength of the excitation force is within the range of 2.93 lc to 3.78 lc.
It may be noted that the beam model is applicable only for relatively low frequency excitation whose wavelength is much longer than the beam thickness due to the use of Euler beam theory. SEM can be used for excitation of any frequency.

10.3.6 Experimental Verification
Experiments are carried out on the cracked aluminum beam 3B. At both ends of the beam, 200 mm are immersed in sand to ensure that the outgoing waves are damped gradually without being reflected back. As a result, a nonreflecting boundary, and hence an infinite beam, is experimentally simulated. The schematic drawing of the experiment setup is shown in Figure 10.18. The beam displacement along its surface is measured and analyzed using the scanning laser vibrometer. Sinusoidal signals with frequencies of 7.5 and 12.5 kHz are generated using a waveform generator and applied to the beam via an electromagnetic exciter connected to a push rod. In the experiments, the excitation point is fixed but the displacement is measured along the beam surface.
FIGURE 10.16 Calculated response of displacement amplitude of beams 3A and 3B to a point harmonic excitation at the origin with a frequency of 12.5 kHz. Panels: (a) Beam 3A; (b) Beam 3B. The beam model agrees reasonably well with the SEM results. The beam displacement begins to increase at 22 and 23 cm from the excitation point and approaches its peak at about 25.6 and 26.1 cm for beams 3A and 3B, respectively. After that, this displacement begins to decay and attains steady value at 28 cm and 29 cm for beam 3A and beam 3B, respectively. Axes: transverse displacement (x10e-7 m) versus distance from excitation point (cm); curves: beam model and SEM. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)
FIGURE 10.17 Response of displacement amplitude of beams 3B and 3C to a point harmonic excitation at the origin with a frequency of 7.5 kHz. Panels: (b) Beam 3B; (c) Beam 3C. Reasonable agreement between the beam model results and the SEM ones is achieved. Axes: transverse displacement (x10e-7 m) versus distance from excitation point (cm); curves: beam model and SEM. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)
TABLE 10.1 Correlation between Length of Delamination and Excitation Frequency

Frequency (kHz)   Wavelength (mm)   Wavelength/Delamination-Length
                                    Beam 3A   Beam 3B   Beam 3C
 7.50             170.28            N/A       3.78      3.10
10.00             147.47            N/A       3.28      N/A
12.50             131.89            3.77      2.93      N/A
15.00             120.41            3.44      N/A       N/A
17.50             111.48            3.19      N/A       N/A

Source: Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.
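The ratios in Table 10.1 are simply the tabulated wavelengths divided by the delamination lengths of beams 3A, 3B, and 3C (35, 45, and 55 mm). The short sketch below recomputes them, and also checks that the tabulated wavelengths follow the inverse-square-root dependence on frequency expected from the Euler beam dispersion relation. The dictionary and function names are illustrative.

```python
import math

# Wavelengths (mm) from Table 10.1, keyed by excitation frequency (kHz).
WAVELENGTH_MM = {7.5: 170.28, 10.0: 147.47, 12.5: 131.89, 15.0: 120.41, 17.5: 111.48}
# Delamination lengths (mm) of the three specimens.
CRACK_LENGTH_MM = {"3A": 35.0, "3B": 45.0, "3C": 55.0}


def wavelength_ratio(freq_khz, beam):
    """Wavelength divided by delamination length for one table entry."""
    return WAVELENGTH_MM[freq_khz] / CRACK_LENGTH_MM[beam]


def scaled_wavelength(freq_khz, reference_khz=7.5):
    """Wavelength predicted from the reference row by f**(-1/2) scaling."""
    return WAVELENGTH_MM[reference_khz] * math.sqrt(reference_khz / freq_khz)


for freq_khz, beam in [(12.5, "3A"), (15.0, "3A"), (17.5, "3A"),
                       (7.5, "3B"), (10.0, "3B"), (12.5, "3B"), (7.5, "3C")]:
    print(f"beam {beam} at {freq_khz:4.1f} kHz: ratio = {wavelength_ratio(freq_khz, beam):.2f}")
```

All seven tabulated ratios lie between about 2.9 and 3.8, which is the range of wavelength-to-delamination-length ratios covered by the cases in this section.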
FIGURE 10.18 Schematic drawing of experiment setup. The beam displacement along its surface is measured and analyzed using a scanning laser vibrometer. Sinusoidal signals with desired frequencies are generated using a waveform generator and applied to the specimen via an electromagnetic exciter connected to a push rod. The excitation point is fixed but the displacement is measured along the beam surface using the scanning head. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)
Figure 10.19 shows the comparison between the beam model and experiments. The discrepancy at the location next to the excitation point is attributed to the effect caused by the actuator and discontinuity in displacement at the excitation point. Larger discrepancies are observed at the infinite region 4, where the experimental curves do not decay completely in accordance with the calculated curves. This is because the sand could not fully damp all the incoming waves, resulting in the generation of a small amount of reflected wave that interferes with the incident wave.

FIGURE 10.19 Response of displacement amplitude of beam 3B to harmonic excitations at various frequencies. Panels: (a) 12.5 kHz; (b) 7.5 kHz. Good agreement between the beam model results and experimental ones is achieved. The discrepancy at the location next to the excitation point is attributed to the stiffening effect caused by the actuator and discontinuity in displacement at the excitation point. Larger discrepancies are observed at the infinite region 4 where the experimental curves do not decay completely in accordance with the calculated curves. Axes: transverse displacement (x10e-7 m) versus distance from excitation point (cm); curves: beam model and laser measurement. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)

For further verification of the beam model, a perspex beam is used to compare the beam model and the experiment. The Young’s modulus of the perspex used is 2.65 GPa, which is about 26 times less stiff than the aluminum. The perspex beam is 20 mm wide and 5 mm thick and contains a simulated through-width crack at 5 mm below the beam surface. The height of the beam is 20 mm. The length of the crack is 25 mm, located 242.5 mm from the excitation point. The calculated and measured perspex beam displacement at a frequency of 6.66 kHz is illustrated in Figure 10.20, which shows that the results of the beam model agree reasonably well with the experiment. Generally, the perspex beam displacement shows the same trend as the aluminum beam; it oscillates and has a peak at the excitation point. The displacement peaks at the crack region and then decays after passing through the right edge of the crack.

FIGURE 10.20 Response of displacement amplitude of perspex beam excited by a concentrated harmonic load. The result of the beam model agrees reasonably well with the experimental result. The displacement peaks at the excitation point and oscillates; it also peaks at the delamination region and then decays after passing through the right edge of the delamination. From this figure, the location as well as the region of the delamination can roughly be determined. Axes: normalized response versus distance from excitation point (cm); curves: beam model and experiment. (From Ishak, S.I. et al., ASME J. Vib. Acoust., 123, 421–427, 2001. With permission.)

The comparison study on the aluminum and perspex beams shows that the beam model is a valid forward solver for beams with cracks. The comparison studies indicate that the results from SEM codes agree reasonably well with those of the beam model of flexural wave and the experimental study. Note that the SEM code is applicable to beams of any thickness because it is based on equations of two-dimensional solid mechanics.
10.4 Beam Model for Transient Response to an Impact Load

10.4.1 Beam Model Solution
Using the frequency response obtained with the beam model introduced in Section 10.3, the transient wave response can be obtained by a frequency-to-time Fourier transformation.
The continuous Fourier transform pair is defined as

$$w(x, t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{w}(x, \omega)\exp(i\omega t)\,d\omega \tag{10.37}$$

$$\tilde{w}(x, \omega) = \int_{-\infty}^{\infty} w(x, t)\exp(-i\omega t)\,dt \tag{10.38}$$

where ω is the frequency, and “~” represents the value in the frequency domain. The discrete Fourier transform pair over a single period of N is given as

$$w(x, t_n) = \frac{1}{N}\sum_{k=0}^{N-1} \tilde{w}(x, \omega_k)\exp(i 2\pi n k / N); \qquad n = 0, 1, \ldots, N - 1 \tag{10.39}$$

$$\tilde{w}(x, \omega_k) = \sum_{n=0}^{N-1} w(x, t_n)\exp(-i 2\pi n k / N); \qquad k = 0, 1, \ldots, N - 1 \tag{10.40}$$

The discretization of the frequencies is given by

$$\omega_k = \frac{2\pi k}{N} \tag{10.41}$$
Note that in performing the Fourier transformation, material or structural damping should be included to avoid singularity in the integration.
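The discrete transform pair of Equation 10.39 and Equation 10.40 can be written out directly. The sketch below is a plain O(N²) implementation in standard-library Python, assuming both sums run over the N samples with indices 0 to N − 1; it is meant to make the sign and normalization conventions explicit rather than to replace an FFT routine. The function names and the test signal are illustrative.

```python
import cmath
import math


def dft(samples):
    """Time -> frequency: sum over n of w(t_n) * exp(-i*2*pi*n*k/N)."""
    count = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * math.pi * n * k / count)
                for n in range(count))
            for k in range(count)]


def idft(spectrum):
    """Frequency -> time: (1/N) * sum over k of w~(omega_k) * exp(+i*2*pi*n*k/N)."""
    count = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * n * k / count)
                for k in range(count)) / count
            for n in range(count)]


# Illustrative damped oscillation sampled at 32 points (assumed test signal).
SIGNAL = [math.exp(-0.1 * n) * math.cos(2.0 * math.pi * 3.0 * n / 32.0) for n in range(32)]
SPECTRUM = dft(SIGNAL)
RECOVERED = idft(SPECTRUM)
ROUND_TRIP_ERROR = max(abs(r - s) for r, s in zip(RECOVERED, SIGNAL))
print(f"round-trip error: {ROUND_TRIP_ERROR:.2e}")
```

In the beam model, the frequency-domain values would be the computed responses multiplied by the spectrum of the excitation force, and the inverse transform would return the time history at the chosen point.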
10.4.2 Experimental Study on Impact Response

Experiments are conducted on four aluminum beams (4A, 4B, 4C, and 4D): 1000 mm in length, 25.4 mm in width, and 4.5 mm in thickness, with the same structure as that shown in Figure 10.13. The height of the beam is H = 25.4 mm. Beam 4A has no crack and is used as the reference beam. Beam 4B, beam 4C, and beam 4D have a crack that is artificially created with a 0.25-mm diameter milling cutter 600 mm from the left end of the beam and 5 mm below the surface. Beam 4B has a 35-mm long crack, beam 4C has a crack 45 mm long, and beam 4D has a crack 55 mm long. The experiment setup illustrated in Figure 10.18 is used in this study.

The measured time history of the excitation force generated by a 15.21-mm diameter steel ball on the flawless beam 4A and flawed beam 4B is shown in Figure 10.21. It is observed that the time period of the generated impact force is about 96 µs. Figure 10.21 also shows that the shape of the recorded force time history for the flawless beam 4A is almost the same as that of the flawed beam 4B. The frequency spectrum of the excitation force generated by the steel ball can be obtained by using the widely used technique of fast Fourier transform (FFT). The results are shown in Figure 10.22. Because the dominant frequency of the excitation is below 10 kHz, the frequency of the transient wave response should be within 10 kHz. All the noise with frequencies of more than 10 kHz in the measured response can be safely removed by filtering before use for an inverse analysis.

FIGURE 10.21 Measured time history of impact forces using a steel ball of 15.21 mm diameter. The time period of the generated impact force is 96 µs. The shape of the recorded force time history for the flawless beam 4A is almost the same as that of the flawed beam 4B. Axes: normalized force versus time (s); curves: flawless and flawed beams. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
FIGURE 10.22 Frequency spectrum of the force generated using a steel ball of 15.21 mm diameter. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
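The removal of noise above 10 kHz mentioned above can be sketched with an ideal frequency-domain low-pass filter: transform the record, zero every bin whose frequency exceeds the cutoff, and transform back. This is only one possible filter and is not necessarily the one used in the experiments; the sample rate, record length, and synthetic test signal below are assumptions chosen so that both tones fall exactly on frequency bins.

```python
import cmath
import math


def dft(samples):
    """Forward discrete Fourier transform (time -> frequency)."""
    count = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * math.pi * n * k / count)
                for n in range(count))
            for k in range(count)]


def idft(spectrum):
    """Inverse discrete Fourier transform (frequency -> time)."""
    count = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * n * k / count)
                for k in range(count)) / count
            for n in range(count)]


def low_pass(samples, sample_rate_hz, cutoff_hz):
    """Zero all frequency bins above cutoff_hz and return the real time signal."""
    count = len(samples)
    spectrum = dft(samples)
    for k in range(count):
        # Bins above N/2 represent negative frequencies.
        signed_k = k if k <= count // 2 else k - count
        if abs(signed_k) * sample_rate_hz / count > cutoff_hz:
            spectrum[k] = 0.0
    return [value.real for value in idft(spectrum)]


SAMPLE_RATE = 200e3   # Hz, assumed sampling rate
COUNT = 200           # samples, giving 1 kHz frequency resolution
CLEAN = [math.sin(2.0 * math.pi * 5e3 * n / SAMPLE_RATE) for n in range(COUNT)]
NOISY = [c + 0.3 * math.sin(2.0 * math.pi * 50e3 * n / SAMPLE_RATE)
         for n, c in enumerate(CLEAN)]
FILTERED = low_pass(NOISY, SAMPLE_RATE, 10e3)
MAX_ERROR = max(abs(f - c) for f, c in zip(FILTERED, CLEAN))
print(f"largest deviation from the 5 kHz component: {MAX_ERROR:.2e}")
```

For measured records, a tapered filter or windowing would normally be preferred to the abrupt cutoff used here, which can introduce ringing when the signal is not periodic in the record.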
With the use of the 15.21-mm diameter steel ball, the time history responses of the beam at points located at 80, 117.5, and 155 mm from the point of impact are obtained and shown in Figure 10.23(a), (b), and (c), respectively. As shown in Figure 10.23(a) and (c), the responses of the flawless beam 4A and the flawed beam 4B at the left-hand side and right-hand side of the crack do not show significant differences. However, when the response is measured at a point above the crack, as shown in Figure 10.23(b), some differences, especially at times after 300 µs, are observed. The small difference in response between the flawless and the flawed beams might be because the frequency of the excitation force is too low (Figure 10.22).

FIGURE 10.23 Measured time histories of beam responses of the transverse displacement to the impact of a steel ball of 15.21 mm diameter. Panels: (a) beam response at 80 mm from impact point (left-hand side of the crack); (b) beam response at 117.5 mm from impact point (above the crack); (c) beam response at 155 mm from impact point (right-hand side of the crack). Axes: normalized response versus time (s); curves: flawless and flawed beams. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
FIGURE 10.24 Measured-time history of impact forces using a steel ball of 9.98 mm diameter. The time period of the generated impact force is 75.8 µs. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
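The link between the duration of the impact and its frequency content can be illustrated with an idealized pulse. The sketch below assumes a half-sine force pulse, which is a common idealization and not the measured force history, and estimates the frequency at which the spectrum magnitude drops to half of its zero-frequency value for the two measured contact durations of 96 µs and 75.8 µs. The function names and the half-amplitude bandwidth measure are illustrative choices.

```python
import cmath
import math


def half_sine_spectrum(duration, freq_hz, steps=1000):
    """Magnitude of the Fourier integral of sin(pi*t/T) over 0..T,
    evaluated with the midpoint rule."""
    dt = duration / steps
    total = 0j
    for j in range(steps):
        t = (j + 0.5) * dt
        total += math.sin(math.pi * t / duration) * cmath.exp(-2j * math.pi * freq_hz * t)
    return abs(total * dt)


def half_amplitude_bandwidth(duration, iterations=40):
    """Frequency where the magnitude falls to half of its value at 0 Hz,
    found by bisection on the interval [0, 1/T]."""
    target = 0.5 * half_sine_spectrum(duration, 0.0)
    low, high = 0.0, 1.0 / duration
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if half_sine_spectrum(duration, mid) > target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


BW_LARGE_BALL = half_amplitude_bandwidth(96e-6)     # 15.21 mm ball, 96 us contact
BW_SMALL_BALL = half_amplitude_bandwidth(75.8e-6)   # 9.98 mm ball, 75.8 us contact
print(f"96.0 us pulse: half-amplitude bandwidth {BW_LARGE_BALL / 1e3:.1f} kHz")
print(f"75.8 us pulse: half-amplitude bandwidth {BW_SMALL_BALL / 1e3:.1f} kHz")
```

For a fixed pulse shape the bandwidth is inversely proportional to the duration, so shortening the contact time from 96 µs to 75.8 µs widens the excited frequency band by the factor 96/75.8, or about 1.27.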
To increase the sensitivity of the time-domain response to the parameters of the crack in the beam, a smaller steel ball (9.98 mm in diameter) is used. The measured time history and the calculated frequency spectrum of the excitation force generated by this steel ball for the flawless beam 4A and the flawed beam 4B are shown in Figure 10.24 and Figure 10.25, respectively. It is seen that the time period of the generated impact force is about 75.8 µs, and the frequency becomes as high as 15 kHz. This indicates that using a smaller ball will generate a sharper impact force with higher frequency. Thus, the use of a smaller ball might improve the sensitivity of the model for crack identification. With the use of the 9.98-mm diameter steel ball, the time histories of the beam response at points located at 80, 117.5, and 155 mm from the point of impact are shown in Figure 10.26(a), (b), and (c), respectively. The results in Figure 10.23 and Figure 10.26 also indicate that the beam response is dispersive. As shown in Figure 10.26(a) and Figure 10.26(c), the presence of the crack can generate additional small oscillations on the beam response curve at the left-hand side and the right-hand side of the crack after reaching the minimum point of the response at 208 and 274 µs, respectively. Responses at the middle of the cracked region, shown in Figure 10.26(b), indicate a very significant difference between the flawless and the flawed beams. The response of the flawed beam 4B shows a pronounced oscillation that can be utilized to estimate the presence and the parameters of the crack.

10.4.3 Comparison Study
Figure 10.27 presents numerical and experimental time history of flawless beam 4A responses recorded at points 80, 117.5, and 155 mm from the point of impact. The numerical responses are calculated using beam model and FEM, and the experimental responses are from the laser measurement. The
© 2003 by CRC Press LLC
1523_Frame_C10.fm Page 343 Thursday, August 28, 2003 5:03 PM
Frequency contents of the excitation force
Linear Magnitude
50 40 30 20 10 0 0
0. 5
1
1. 5
2
2. 5
3
3. 5
4
Frequency, Hz
4. 5
x 10
5
The zoom in view of the frequency contents
Linear Magnitude
50 40 30 20 10 0 0
0. 5
1
1. 5
2
Frequency, Hz
2. 5
3
3. 5
4
x 10
4
FIGURE 10.25 Frequency domain of force generated using a steel ball of 9.98 mm diameter. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
flawless beam 4A responses calculated using the beam model show that results of the beam model agree reasonably well with the FEM and experimental results until the wave reflected from the boundary interferes with the incidence wave. The interference between the reflected and incidence waves is evidenced by the oscillation on the beam responses beyond 400 µs. These oscillations do not appear in the response curve obtained using the beam model due to the use of infinite beams that allow waves propagating to infinity without being reflected back. As shown in Figure 10.27, a response curve of a crackfree beam does oscillate. Numerical and experimental time history of cracked beam responses recorded at points located at 80, 117.5, and 155 mm from the point of impact are shown in Figure 10.28. Results obtained from the beam model agree reasonably well with the FEM and the experimental results for flawed beams. The interference of the incidence and reflected waves from the right end of the beam are evidenced in the later part of the curves. Besides the oscillation due to interferences, a small oscillation is on the response curve at the lefthand side and the right-hand side of the crack, as shown in Figure 10.28(a) and Figure 10.28(c). The oscillation becomes more pronounced when the
[Figure 10.26 plots omitted. Panels: (a) Beam response at 80 mm from the impact point (left-hand side of the crack); (b) Beam response at 117.5 mm from the impact point (above the crack); (c) Beam response at 155 mm from the impact point (right-hand side of the crack). Axes: Normalized Response versus time, s (0 to 0.0008). Curves: flawless, flaw.]
FIGURE 10.26 Measured-time histories of beam responses to the impact of a steel ball of 9.98 mm diameter. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
response is sampled at the point above the crack, as shown in Figure 10.28(b). These results can be used to estimate the presence and the parameters of a crack.
10.5 Extensive Experimental Study
To validate the analytical method, an extensive experimental study has been conducted (Ishak et al., 2001b). The test beam is of infinite length and a crack
[Figure 10.27 plots omitted. Panels: (a) Time history of flawless beam response at 80 mm from the impact point; (b) Time history of flawless beam response at 117.5 mm from the impact point. Time axis: 0 to 8 × 10^-4. Curves: Beam model, FEM, Laser.]
FIGURE 10.27 Comparison of time history curves of flawless beam response obtained from the beam model, FEM and laser measurement. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
of prescribed length is artificially created at predetermined locations. Using two different types of excitations and measurements, the effects on the beam response of crack length and depth, the beam material, excitation frequencies, and location of excitation are analyzed.
10.5.1 Test Specimens
Experimental studies are conducted on isotropic and anisotropic beams. The isotropic beams are made of perspex and aluminum. Beam 5A, beam 5B, and beam 5C are made of perspex with dimensions of 800 × 20 × 5 mm
[Figure 10.27 (continued) plot omitted. Panel: (c) Time history of flawless beam response at 155 mm from the impact point. Time axis: 0 to 8 × 10^-4. Curves: Beam model, FEM, Laser.]
FIGURE 10.27 Continued.
(length × width × thickness). The height of the beam is H = 20 mm. The crack is 25 mm long and is artificially created with a 0.25-mm diameter milling cutter 485 mm from the left-hand side of the beam. The depths of the cracks are 5 mm (beam 5A), 10 mm (beam 5B), and 15 mm (beam 5C) below the surface of the beams. These beams are used to study the sensitivity of the beam response to different crack depths.
Aluminum is used to investigate the applicability of the current technique to detecting cracks in a material stiffer than perspex and also the sensitivity of the beam response to different crack lengths. Beam 6A, beam 6B, beam 6C, and beam 6D are made of aluminum with dimensions of 800 × 25.4 × 4.5 mm (length × width × thickness). The height of the beam is H = 25.4 mm. Crack lengths are 25 mm (beam 6A), 35 mm (beam 6B), 45 mm (beam 6C), and 55 mm (beam 6D). The cracks are artificially created 5 mm below the surface with a 0.25-mm diameter milling cutter at 485 mm from the left-hand side of the beams.
Experimental studies on anisotropic material are conducted on a laminate containing cracks at different depths from the surface. Four composite beams (7A, 7B, 7C, and 7D) are prepared with dimensions of 390 × 20 × 7 mm (length × width × thickness) and are made of [graphite/epoxy: 0°/90°]. The height of the beam is H = 7 mm. The composite is made of 20 plies of laminate and is cured using autoclave molding. Beam 7A, which contains no delamination, is used as a reference. A 25-mm long delamination is created in beam 7B, beam 7C, and beam 7D, 205 mm from the left-hand side of each beam. The depths of the delamination are 1.75 mm (beam 7B), 3.5 mm (beam 7C), and 5.25 mm (beam 7D) below the beam surface. The delamination is simulated as a rectangular flaw region. It is made by placing a 0.1-mm thick stainless steel insert between layers during fabrication. This insert, which creates a delamination between the top and bottom plies while the composite is being cured, is removed after curing. The insert must be rigid and stiff so that it keeps its shape during curing and can be removed by force afterwards. To prevent the insert from bonding to the epoxy resin, it is covered with a very thin aluminum foil.
[Figure 10.28 plots omitted. Panels: (a) Time history of flawed beam response at 80 mm from the impact point; (b) Time history of flawed beam response at 117.5 mm from the impact point. Time axis: 0 to 8 × 10^-4. Curves: Beam model, FEM, Laser.]
FIGURE 10.28 Comparison of the time history of flawed beam responses at points located at 80, 117.5 and 155 mm from the point of impact. Results are obtained using the beam model, FEM and laser measurement. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
10.5.2 Test Setup
In order to simulate a beam of infinite length and to ensure that outgoing waves are damped gradually without being reflected back, both ends of the
[Figure 10.28 (continued) plot omitted. Panel: (c) Time history of flawed beam response at 155 mm from the impact point. Time axis: 0 to 8 × 10^-4. Curves: Beam model, FEM, Laser.]
FIGURE 10.28 Continued.
test beam are embedded in sand boxes. The embedded lengths are 200 mm for the aluminum and perspex beams and 95 mm for the composite beam. Different types of measurement and excitation on the testing beams with various depths and lengths of delamination are shown in Table 10.2. Before comparing the results of different types of measurement with the SEM result, the data need to be presented in a nondimensional form by normalizing the data with respect to their highest value.
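The normalization described above can be sketched as follows. This is a minimal illustration, assuming that "highest value" means the largest magnitude in each data set; the function name and the sample readings are hypothetical and are not data from the experiments.

```python
def normalize_by_peak(values):
    """Scale a response so that its largest magnitude becomes 1 (nondimensional)."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        raise ValueError("cannot normalize an all-zero response")
    return [v / peak for v in values]

# Hypothetical readings in different physical units (illustrative only).
laser = [0.2, 0.5, 1.0, 0.4]             # e.g., displacement amplitude
accelerometer = [4.0, 10.0, 20.0, 8.0]   # e.g., acceleration amplitude

laser_nd = normalize_by_peak(laser)
accelerometer_nd = normalize_by_peak(accelerometer)
```

After this step the two measurement types share a common 0-to-1 scale, so they can be compared directly with each other and with a normalized computed response.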
10.5.3 Effect of Crack Depth
The effect of crack depth on the sensitivity of the response is studied using the perspex beams subjected to harmonic excitation; the calculated and measured nondimensional responses on the surface of beam 5A, beam 5B, and beam 5C are shown in Figure 10.29(a), (b), and (c), respectively. The response distribution curves in this figure show oscillations between the excitation point and the right-hand edge of the crack. The oscillations start from the point of excitation and end at the right-hand edge of the crack. Figure 10.29 also shows that the results from two measurement techniques agree well with the SEM results. To examine the agreement between the experimental and theoretical results, the correlation coefficient is used. The correlation coefficient Cr(x, y) between two column vectors x and y is defined in the following standard form:
\[
C_r(\mathbf{x}, \mathbf{y}) = \frac{\mathrm{Cov}(\mathbf{x}, \mathbf{y})}{\sqrt{\mathrm{Cov}(\mathbf{x}, \mathbf{x})\,\mathrm{Cov}(\mathbf{y}, \mathbf{y})}} \tag{10.42}
\]
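TABLE 10.2 Different Types of Measurements, Excitations and Beam Specimens
Type of Measurement | Specimen | Excitation | Delamination Depth (mm) | Delamination Length (mm) | Distance from Excitation to Left Tip of Delamination (mm)
Laser and Accelerometer | Aluminum | Electromagnetic | 5 | 25 | 245
Laser and Accelerometer | Aluminum | Electromagnetic | 5 | 35 | 245
Laser and Accelerometer | Aluminum | Electromagnetic | 5 | 45 | 245
Laser and Accelerometer | Aluminum | Electromagnetic | 5 | 55 | 245
Laser and Accelerometer | Perspex | Electromagnetic | 5 | 25 | 245
Laser and Accelerometer | Perspex | Electromagnetic | 10 | 25 | 245
Laser and Accelerometer | Perspex | Electromagnetic | 15 | 25 | 245
Laser and Accelerometer | Composite | Electromagnetic | 3.5 | 25 | 80
Laser | Aluminum | Electromagnetic | 5 | 55 | 245, 195, 145, 95, 45, left, middle, and right edges of delamination
Laser | Aluminum | Piezoelectric | 5 | 35 | 245
Source: Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.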
where Cov represents the covariance of the displacement vectors. In this case, each row of these vectors corresponds to a point of observation on the beam surface, and the entry in that row is the normalized beam response at that point. The correlation coefficient lies between –1 and +1, with zero meaning uncorrelated vectors. Correlation analysis is conducted on the beam response over the entire span of measurement and over only the delaminated region. The delaminated region for the calculation of the correlation coefficient begins 15 mm before the left-hand edge of the crack and ends 20 mm after the right-hand edge of the crack. Plots of the correlation coefficient between the measured and calculated results are shown in Figure 10.30. The high correlation coefficients (more than 0.85) between the two experimental results and between the experimental results and SEM results indicate that the measurements are in good agreement with the SEM results. In Figure 10.29(a), which shows the response of a perspex beam containing a crack at 5 mm depth, the delaminated region can be roughly located by the area with abrupt changes in the beam response. Approaching the left-hand edge of the crack, the beam response begins to increase and reaches its peak value near the middle of the crack. The response then decays and attains a relatively constant value after the right-hand edge of the delaminated region. This observation may be used to estimate the crack
[Figure 10.29 plots omitted. Panels: (a) 5 mm delamination depth, excited at frequency of 6.67 kHz; (b) 10 mm delamination depth, excited at frequency of 10 kHz; (c) 15 mm delamination depth, excited at frequency of 12.5 kHz. Axes: Response (0 to 1.2) versus Position (0 to 34), with the delamination region marked. Curves: SEM, Laser, Accelerometer.]
FIGURE 10.29 The frequency response of a perspex beam with a delamination at different depth (harmonic excitation). (From Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.)
length. In the case of the perspex beam, Figure 10.29(a) shows that this region corresponds to a crack length of approximately 25 mm. In the perspex beam responses for crack depths of 10 mm (Figure 10.29(b)) and 15 mm (Figure 10.29(c)), only the decay at the right-hand edge of the crack can be observed clearly. The deeper the crack, the less significant the change in the amplitude of the beam response. In this case, the left-hand edge of the crack can be approximately determined by drawing a horizontal line touching all the peak points. This line will intersect the beam response curve approximately at the left-hand edge of the crack. A
[Figure 10.30 bar charts omitted. Panels: (a) On the overall response (annotated values 0.888, 0.874, 0.858); (b) Over the delamination region (annotated values 0.893, 0.904, 0.888). Axes: Correlation Coefficient (0 to 1.2) versus Cases (Laser-Acc, Laser-Sem, Sem-Acc). Bars: 5 mm, 10 mm, 15 mm, Average.]
FIGURE 10.30 Correlation coefficient of perspex beam responses obtained by experiment and computation using SEM code. (From Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.)
better approach to estimating the left-hand edge of the crack is to apply the excitation force on the right-hand side. A systematic way would be to use an inverse procedure, which will be done in Section 10.6 and Section 10.7. The results in Figure 10.29 also indicate that the effects of the crack depth can be enhanced by adjusting the frequency. For beam 5A (Figure 10.29(a)), the use of 6.67 kHz yields a significant change in the beam response. On beam 5B (Figure 10.29(b)) and beam 5C (Figure 10.29(c)), frequencies of 10 and 12.5 kHz, respectively, are needed to show the change in the response over the crack region. Thus, the deeper the crack is, the higher the frequency required to increase the sensitivity.
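The correlation analysis of this section (Equation 10.42, applied to the whole measured span and to a window from 15 mm before the left-hand crack edge to 20 mm after the right-hand edge) can be sketched as below. The function names, the sample covariance convention, and the response values are assumptions made for illustration; they are not the measured or SEM data.

```python
from math import sqrt

def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Correlation coefficient in the form of Equation 10.42."""
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))

def window(positions, values, left_edge, right_edge, before=15.0, after=20.0):
    """Keep samples from `before` mm ahead of the crack to `after` mm past it."""
    return [v for p, v in zip(positions, values)
            if left_edge - before <= p <= right_edge + after]

# Hypothetical normalized responses (illustrative values only).
positions = [200, 220, 240, 250, 260, 270, 280, 300]  # mm from excitation point
sem = [0.30, 0.32, 0.55, 0.90, 1.00, 0.80, 0.45, 0.35]
laser = [0.28, 0.35, 0.50, 0.85, 1.00, 0.84, 0.50, 0.33]

overall = correlation(sem, laser)
local = correlation(window(positions, sem, 245, 270),
                    window(positions, laser, 245, 270))
```

The choice between dividing by n or n - 1 in the covariance does not affect the coefficient, because the factor cancels between numerator and denominator.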
10.5.4 Effect of Crack Length
In this study, aluminum beams (6A, 6B, 6C, and 6D) containing different crack lengths are used; the calculated and measured nondimensional responses of these beams are shown in Figure 10.31(a) to Figure 10.31(f). Figure 10.31 also shows that the experimental results are in good agreement with the SEM results of these beams under different excitation frequencies. It is also observed that with a shorter length of crack, a higher
[Figure 10.31 plots omitted. Panels: (a) 25 mm delamination length, excited at frequency of 20 kHz; (b) 35 mm delamination length, excited at frequency of 14.1 kHz. Axes: Response (0 to 1.2) versus Position (0 to 34), with the delamination region marked. Curves: SEM, Laser, Accelerometer.]
FIGURE 10.31 Aluminum beam responses at different lengths of a crack. (From Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.)
excitation frequency is needed for the estimation; a crack of 55-mm length can be well estimated at 6.4 kHz, but to detect cracks of 45-, 35-, and 25-mm length, the excitation frequency increases from 9.2 to 20 kHz. This issue will be less critical when an inverse procedure is used to determine the crack length. The sensitivity of the response to the crack length is more than sufficient for a stable inversion. Figure 10.31(d) and Figure 10.31(f) show that, for beams with 45- and 55-mm crack lengths, two peaks are observed at the delaminated region at 19 and 15 kHz, respectively. It is also found from Figure 10.31(a) to Figure 10.31(f) that, at regions beyond the crack, the calculated results decline and attain the steady-state value. However, experimental results show that the beam there is still oscillating, and the oscillations in aluminum beams are more obvious than in perspex beams. This may be because the incident wave in the aluminum beam is reflected back from the sand boxes at a higher rate. As with the case of perspex beams, the crack in aluminum beams is located at the area where the beam displacement curve changes significantly, and the width of this area is related to the crack length. This means that, for a longer crack, the pattern change will occur over a wider region. Relating the crack length to the excitation frequency as presented in Table 10.3, the minimum frequency required to observe the crack region clearly should correspond to a wavelength of the same order as the crack length. For the cases presented here, the wavelength of the excitation force is about 3.4 lc, where lc is the length of the crack. Figure 10.32 shows the correlation coefficient between the response obtained from experiments and the SEM code. Excellent correlation is obtained, which suggests that the SEM code can be used as a good forward solver for inverse analyses to locate and size the crack precisely.
[Figure 10.31 (continued) plots omitted. Panels: (c) 45 mm delamination length, excited at frequency of 9.2 kHz; (d) 45 mm delamination length, excited at frequency of 19 kHz; (e) 55 mm delamination length, excited at frequency of 6.4 kHz; (f) 55 mm delamination length, excited at frequency of 15 kHz. Axes: Response (0 to 1.2) versus Position (0 to 34), with the delamination region marked. Curves: SEM, Laser, Accelerometer.]
FIGURE 10.31 Continued.
[Figure 10.32 bar charts omitted. Panels: (a) On the overall response (annotated values 0.926, 0.839, 0.813); (b) Over the delamination region (annotated values 0.946, 0.936, 0.922). Axes: Correlation Coefficient (0 to 1.2) versus Cases (Laser-Acc, Laser-Sem, Sem-Acc). Bars: 25 mm, 35 mm, 45 mm, 55 mm, Average.]
FIGURE 10.32 Correlation coefficients of aluminum beam responses obtained by experiments and simulated result from SEM. (From Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.)
TABLE 10.3 Correlation between Length of Delamination and Excitation Frequency
Delamination Length (mm) | Frequency (kHz) | Theoretical Wave Length (mm) | Experimental Dist. pk to pk (mm) | Experimental Wave Length (mm) | Wave Length / Del. Length
35 | 14.1 | 120.41 | 60 | 120 | 3.44
45 | 9.2 | 153.75 | 76 | 152 | 3.42
55 | 6.4 | 184.33 | 92 | 184 | 3.35
Source: Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.
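The figure of about 3.4 lc can be checked directly from the values in Table 10.3. The short sketch below recomputes the ratio of theoretical wavelength to delamination length; the factor of two between the peak-to-peak distance and the experimental wavelength is inferred from the tabulated values (60 and 120 mm, 76 and 152 mm, 92 and 184 mm) and is not a formula quoted from the text.

```python
# Rows of Table 10.3: (delamination length in mm, frequency in kHz,
# theoretical wavelength in mm, measured peak-to-peak distance in mm)
rows = [
    (35, 14.1, 120.41, 60),
    (45, 9.2, 153.75, 76),
    (55, 6.4, 184.33, 92),
]

# Ratio of theoretical wavelength to delamination length, to two decimals.
ratios = [round(wavelength / length, 2) for length, _, wavelength, _ in rows]

# Experimental wavelength taken as twice the peak-to-peak distance
# (an inference from the tabulated numbers).
measured_wavelengths = [2 * peak_to_peak for *_, peak_to_peak in rows]

for (length, freq, _, _), ratio, measured in zip(rows, ratios, measured_wavelengths):
    print(f"{length} mm at {freq} kHz: wavelength/length = {ratio}, "
          f"measured wavelength = {measured} mm")
```

The recomputed ratios reproduce the last column of the table (3.44, 3.42, and 3.35).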
10.5.5 Effect of Excitation Frequency
The effect of excitation frequency on the sensitivity of response is analyzed using aluminum beam 6B excited at various frequencies. For excitation frequencies up to 20 kHz, two types of excitations, using electromagnetic and piezoelectric exciters, are employed. For excitation frequencies between 20 and 40 kHz, only the piezoelectric exciter is used. The measured responses of beam 6B at frequencies of 14.1 and 33.1 kHz are shown in Figure 10.33. At 14.1 kHz, a very high correlation coefficient of 0.981 is obtained between the responses measured with the two exciters. The high correlation coefficient indicates that excitation using electromagnetic and piezoelectric exciters is suitable for frequencies up to 20 kHz. In using a higher frequency of excitation, it is found that, at 33.1 kHz, two peaks occur at the crack region, a drastic change from 14.1 kHz. This finding suggests that a wide range of test frequencies can be used for estimating crack length from the displacement response curve. When an inverse procedure is used, the parameters of the crack are expected to be determined accurately in a rigorous manner.
[Figure 10.33 plots omitted. Panels: (a) At frequency of 14.1 kHz (curves: Piezoelectric, Electromagnetic; annotated correlation coefficient 0.981); (b) At frequency of 33.1 kHz (curve: Piezoelectric). Axes: Response versus Position (0 to 34), with the delamination region marked.]
FIGURE 10.33 Normalized experiment results of response of the aluminum beam excited at different frequencies. (From Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.)
10.5.6 Effect of Location of the Excitation Point
The effect of the location of the excitation point on the sensitivity of response is analyzed by varying the excitation point relative to the location of the crack. The analysis is conducted on the aluminum beam 6D with eight different points of excitation; these correspond to points at 0, 5, 10, 15, 20, 24.5, 27.25, and 30 in Figure 10.34. The first five points are 245, 195, 145, 95, and 45 mm away from the left crack tip, respectively, and the last three points are at the left crack tip, the middle, and the right-hand edge of the crack. Figure 10.34 shows that, regardless of the point of excitation, the maximum displacement response always occurs at the delaminated region, with the starting point and the width of the maximum response equal for all cases. This implies that the location and size of the crack can be estimated regardless of the location of the excitation point.
[Figure 10.34 plots omitted. Panels: (a) Excitation point at 95 up to 245 mm distance from the left-hand edge of the delamination (curves: 245 mm, 195 mm, 145 mm, 95 mm); (b) Excitation point at 45 mm from the left crack tip and above the crack region (curves: 45 mm, left tip, middle, right tip). Axes: Normalized Response versus Position (0 to 34).]
FIGURE 10.34 Normalized amplitude of the vertical displacement response measured on the upper surface of the aluminum beam excited by a harmonic vertical force at different excitation points. (From Ishak, S.I. et al., Exp. Mech., 41(2), 157–164, 2001. With permission.)
10.5.7 Study on Beams of Anisotropic Material
The SEM is further applied to crack identification in anisotropic material. Figure 10.35 shows the displacement response along the beam surface, obtained from the SEM code and from the laser vibrometer, for beam 7B, beam 7C, and beam 7D. It is seen that the SEM results agree reasonably well with the experiments. The displacement response of beam 7A has been shown by Liu et al. (1996). Their results indicate that the response of beam 7A shows no perturbations because, without a delamination, the incoming wave is not scattered or reflected. The displacement response in Figure 10.35 for beam 7B, beam 7C, and beam 7D is perturbed due to the presence of delamination, implying that displacement response may be used to roughly locate and size delaminations
[Figure 10.35 plots omitted. Panels: Beam 7B, 12 kHz; Beam 7C, 15 kHz; Beam 7D, 31 kHz. Axes: Normalized Response (0 to 1.2) versus Distance from excitation point, cm (0 to 16), with the delamination region marked. Curves: SEM, Laser.]
FIGURE 10.35 Normalized amplitude of vertical displacement responses of laminated beams 7B, 7C and 7D. (From Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.)
in beams. The results in Figure 10.35 show the general trend that the magnitude of displacement begins to increase to attain the peak value in approaching the delaminated region, after which the displacement attains a steady-state value.
10.6 Inverse Crack Detection Using Uniform µGAs
The studies presented in Section 10.1 through Section 10.5 show that the response of the beam is very sensitive to the location and dimensions of the cracks and delaminations in beams subjected to harmonic excitation or impact excitation, especially when the response is sampled in the region over the crack. In many cases the cracks or delaminations can be roughly estimated from the displacement or acceleration response curves. In addition, these experiments have also confirmed that the numerical simulation results from the forward solvers agree well with the experimental ones. These findings suggest that an inverse analysis can be carried out for systematically detecting the location and dimensions of the cracks, flaws, and delaminations in beams. An inverse procedure using a uniform µGA is employed here to determine the parameters of the crack from the dynamic displacement responses on the surface of a cracked beam. The SEM code is used as the forward solver that establishes the relationship between the displacement responses and the crack parameters. The reason for choosing SEM as the forward solver is given in Section 10.2.2. A µGA is used in the inverse procedure to find the parameters of cracks or delaminations that minimize the error between the measured and calculated beam displacement responses. The error function is defined with the L2 norm as given by Equation 2.123. The number of sampling points used for constructing the error functional is 130, which makes the inverse problem heavily over-posed.
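An L2-norm error function of the kind referred to above can be sketched as follows. This is a minimal illustration, assuming a plain sum of squared differences over the sampling points; the exact form of Equation 2.123 (for example, any weighting or normalization) is not reproduced here, and the function name and sample values are hypothetical.

```python
def l2_error(measured, calculated):
    """Sum of squared differences between measured and calculated responses."""
    if len(measured) != len(calculated):
        raise ValueError("responses must be sampled at the same points")
    return sum((m - c) ** 2 for m, c in zip(measured, calculated))

# Hypothetical responses at 130 sampling points (illustrative values only).
N_SAMPLES = 130
measured = [0.5 + 0.001 * i for i in range(N_SAMPLES)]
calculated = [value + 0.01 for value in measured]  # a trial solution offset by 0.01

error_exact = l2_error(measured, measured)      # zero for a perfect match
error_trial = l2_error(measured, calculated)    # about 130 * 0.01**2 = 0.013
```

In the inverse procedure, `calculated` would come from the forward solver for each trial set of crack parameters, and the search would seek the parameters that minimize this error.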
10.6.1 Use of Simulated Data from SEM
The applicability of the inverse procedure to locate and size the crack is first tested using simulated data. The frequency response data of the vertical displacement are computed using the SEM code for aluminum beams with a known crack length and location. The aluminum beams are 25 mm in width and 4.5 mm in thickness, and the crack is located 5 mm below the surface. The height of the beam is H = 25 mm. The length of the crack in beam 8A is 25 mm; it is 35 mm in beam 8B and 45 mm in beam 8C. The crack is a through-width crack located 242.5 mm from the excitation point. The analysis is conducted at excitation frequencies of 20 and 14.1 kHz for beam 8A and beam 8B, respectively. For beam 8C, the analysis is conducted at frequencies of 9.2 and 19 kHz, which correspond to optimum frequencies that generate beam responses most sensitive to the location and dimensions of the cracks. A uniform µGA with binary parameter coding, tournament selection, uniform crossover, and elitism is adopted. Details of the µGA are given in Chapter 5. In the following case, it is assumed that the depth of the crack is known, and only the crack location and length are to be determined. The bounds on the location of the cracks are set within the range of 0.5 ≤ ac ≤ 1.5, while the length of the cracks is bounded within the range of 0.01 ≤ lc ≤ 3.0. These bounds are wide enough because these two parameters can be roughly estimated from the displacement response curve on the surface of the beam, as seen previously for many cases.
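The ingredients named above (binary coding, tournament selection, uniform crossover, elitism) can be combined as in the generic micro-GA sketch below. It is not the implementation of Chapter 5: the 12-bit resolution, the population of five, the 5% restart threshold, the absence of mutation, and the quadratic `toy_error` that stands in for the SEM-based error are all assumptions made for illustration. The bounds are those quoted in the text, taken here as normalized values.

```python
import random

BITS = 12                            # bits per parameter (assumed resolution)
BOUNDS = [(0.5, 1.5), (0.01, 3.0)]   # bounds on location a_c and length l_c

def decode(chrom):
    """Map a bit list to real parameters inside BOUNDS."""
    params = []
    for k, (lo, hi) in enumerate(BOUNDS):
        bits = chrom[k * BITS:(k + 1) * BITS]
        value = int("".join(map(str, bits)), 2)
        params.append(lo + (hi - lo) * value / (2 ** BITS - 1))
    return params

def toy_error(params, target=(1.0, 1.4)):
    """Stand-in for the forward-solver error; NOT the SEM code."""
    return sum((p - t) ** 2 for p, t in zip(params, target))

def random_chrom(rng):
    return [rng.randint(0, 1) for _ in range(BITS * len(BOUNDS))]

def tournament(pop, errors, rng):
    """Pick two distinct individuals at random and keep the better one."""
    i, j = rng.sample(range(len(pop)), 2)
    return pop[i] if errors[i] <= errors[j] else pop[j]

def uniform_crossover(a, b, rng):
    """Take each bit from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def micro_ga(error_fn, generations=200, pop_size=5, seed=0):
    rng = random.Random(seed)
    pop = [random_chrom(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        errors = [error_fn(decode(c)) for c in pop]
        best = pop[errors.index(min(errors))]
        history.append(min(errors))
        # Restart around the elite once the small population has nearly converged.
        diff = sum(sum(x != y for x, y in zip(c, best)) for c in pop)
        if diff < 0.05 * len(best) * (pop_size - 1):
            pop = [best] + [random_chrom(rng) for _ in range(pop_size - 1)]
            continue
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1 = tournament(pop, errors, rng)
            p2 = tournament(pop, errors, rng)
            children.append(uniform_crossover(p1, p2, rng))
        pop = children
    errors = [error_fn(decode(c)) for c in pop]
    best = pop[errors.index(min(errors))]
    return decode(best), min(errors), history

params, err, hist = micro_ga(toy_error, generations=100, seed=1)
```

Because the best individual is carried into every new population, the best error recorded in `hist` never increases from one generation to the next. Each generation costs `pop_size` forward evaluations, which is why a small population matters when the forward solver is expensive.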
The convergence of the objective functional of error using the simulated displacement data for beam 8A, beam 8B, and beam 8C is shown in Figure 10.36(a). This part of the figure shows that the inverse procedure finds the optimal crack length and location at generation 60 for beam 8A, generation 60 for beam 8B, and generations 74 and 68 for beam 8C. In most cases, the location and length of the cracks are found after about 300 forward calculations using the SEM code. The convergence for the location and length of the crack is shown in Figure 10.36(b) and Figure 10.36(c), respectively. The actual location of the left edge of the crack is normalized with respect to 242.5 mm in all cases; thus its value is 1.0. Figure 10.36(b) shows that the inversely identified crack locations are very close to the actual location of the crack. The crack lengths are normalized with respect to 25 mm; therefore, they converge to 1.0, 1.4, and 1.8 for beam 8A, beam 8B, and beam 8C, respectively, as shown in Figure 10.36(c). The results for the simulated SEM data indicate that the inverse procedure can locate and size different crack lengths accurately using the simulated measurement data. The computed displacement response using the identified crack parameters is shown in Figure 10.37 together with that computed using the true parameters. Very good agreement is observed, indicating that Type I and Type II ill-posedness do not arise in this inverse procedure: the large number of sampling points makes the problem sufficiently over-posed, and the displacement response is very sensitive to the parameters of the cracks. In the next section, this procedure is applied to determine these crack parameters using actual experimental data that contain noise in order to examine whether the problem has Type III ill-posedness.
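The normalization used for the convergence plots can be written out as a short worked example. The reference values (242.5 mm and 25 mm) and the crack lengths are from the text; the variable and function names are hypothetical.

```python
REF_LOCATION_MM = 242.5  # distance from the excitation point to the left crack edge
REF_LENGTH_MM = 25.0     # crack length of beam 8A, used as the reference length

crack_lengths_mm = {"8A": 25.0, "8B": 35.0, "8C": 45.0}

# Normalized target values that the search should converge to.
normalized_lengths = {beam: length / REF_LENGTH_MM
                      for beam, length in crack_lengths_mm.items()}
normalized_location = 242.5 / REF_LOCATION_MM

def to_millimetres(norm_location, norm_length):
    """Convert identified normalized parameters back to physical units."""
    return norm_location * REF_LOCATION_MM, norm_length * REF_LENGTH_MM

location_mm, length_mm = to_millimetres(1.0, 1.4)
```

Under this normalization the targets are 1.0, 1.4, and 1.8 for beams 8A, 8B, and 8C, and a converged result of (1.0, 1.4) maps back to a crack located 242.5 mm from the excitation point with a length of 35 mm.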
10.6.2 Use of Experimental Data
For further verification, the inverse procedure is tested using experimental data that contain noise. The beams under study are made of aluminum and are 800 mm in length, 25 mm in width, and 4.5 mm in thickness. The height of the beam is H = 25 mm. The crack is artificially created with a 0.25-mm diameter milling cutter, 482.5 mm from the left end of the beam and 5 mm below the surface. Therefore, the crack is, in fact, a rectangular void. In the experiment, both ends (200 mm on each end) of the specimens are immersed in sand in order to simulate an infinite beam and to ensure that the outgoing waves are damped gradually without being reflected back (see Figure 10.18). The beam displacement along its surface is measured and analyzed using the scanning laser vibrometer at the same frequencies as in the previous case using the simulated data. The sinusoidal signals are generated using a waveform generator and applied to the specimen via an electromagnetic exciter connected to a push rod. In the experiments, the excitation point is fixed but the displacement is measured along the beam surface. The measurement is
[Figure 10.36 plots omitted. Panels: (a) Convergence of the error function (Abs. Best Fitness Obj. Func., logarithmic scale from 1.0E-02 to 1.0E+06); (b) Convergence of crack location (Normalized left edge of the crack); (c) Convergence of the crack length (Normalized length of the crack). Horizontal axis: Generations (0 to 70). Curves: 25 mm (20 kHz), 35 mm (14.1 kHz), 45 mm (9.2 kHz), 45 mm (19 kHz).]
FIGURE 10.36 Convergence of the µGA search for the location and length of the crack for different length and different frequency of the excitation load. The crack length is normalized with respect to 25 mm and the location of the crack tip is normalized with respect to 242.5 mm. Simulated displacement responses from SEM code are used.
[Figure 10.37 plots omitted. Panels: (a) Beam with 25 mm crack at frequency 20 kHz; (b) Beam with 35 mm crack at frequency 14.1 kHz. Axes: Normalized Displacement versus Measurement Point (0 to 35). Curves: theo. true disp., inv. calc. disp.]
FIGURE 10.37 Displacement responses computed using the actual geometry parameters of the crack and the inversely detected geometry parameters of the crack. The excellent agreement shows that a loop is well closed.
taken at 130 sampling points over a distance of 350 mm; therefore, the problem is heavily over-posed. The GA configurations used in the previous case are also applied to this case, but the simulated dynamic displacement is now replaced by the experimental data. The absolute value of the best fitness objective function for the experimental displacement data of beam 8A, beam 8B, and beam 8C is shown in Figure 10.38(a). As in the previous verification, the results in this part of the figure also indicate that the absolute value of the objective function of the best individual decreases with increasing number of generations. The objective function in Figure 10.38(a) shows that the inverse procedure finds the crack length
[Figure 10.37 (continued) plots omitted. Panels: (c) Beam with 45 mm crack at frequency 9.2 kHz; (d) Beam with 45 mm crack at frequency 19 kHz. Axes: Normalized Displacement versus Measurement Point (0 to 35). Curves: theo. true disp., inv. calc. disp.]
FIGURE 10.37 Continued.
and location at generation 78 for beam 8A, generation 80 for beam 8B, and generations 56 and 45 for beam 8C. Compared with results from the previous case using the simulated data from the SEM code, the objective error function for the experimental data is generally larger because the data contain experimental noise caused by an imperfect experimental setup, such as an infinite boundary condition that cannot be fully realized in the experiment. The convergence of the location and length of the cracks is shown in Figure 10.38(b) and Figure 10.38(c), respectively. Figure 10.38(b) shows that the procedure can locate the crack very well (very close to its actual location). The inversely calculated crack lengths in beam 8A, beam 8B, and beam 8C also converge to the actual crack length as shown in Figure 10.38(c). The
FIGURE 10.38 Convergence of the µGA search for the location and length of the crack in an aluminum beam excited by a harmonic load with different frequency: (a) convergence of the error function (absolute best-fitness objective function versus generations); (b) convergence of crack location (normalized left edge of the crack versus generations); (c) convergence of crack length (normalized length of the crack versus generations). Curves are given for 25 mm (20 kHz), 35 mm (14.1 kHz), 45 mm (9.2 kHz), and 45 mm (19 kHz). The crack length is normalized with respect to 25 mm; the location of the crack tip is normalized with respect to 242.5 mm. Experimental data are used.
inversely identified location and length of the crack using the experimental data are slightly shifted away from the actual ones. Overall, the results are very stable, and no Type III ill-posedness is found in this inverse problem due to the use of the SEM code as the forward solver (see the argument in Section 10.2.2). The actual experimental data of vertical displacement response and those computed from the identified parameters of the crack are shown in Figure 10.39. The results in this figure show good agreement, which indicates that this inverse procedure can locate and size cracks with reasonable accuracy.
10.7 Inverse Crack Detection Using Progressive NN
The progressive NN model discussed in Chapter 6 is now adopted here for inversely detecting the crack parameters in composite laminated beams. The outputs of the NN are the crack parameters that specify the location, depth, and length of the crack in the beam. The inputs of the NN are the displacement response measured on the surface of beams excited by a harmonic load with a certain frequency, which is the same as the case discussed in Section 10.6.
10.7.1 Procedure Outline
Figure 10.40 illustrates the procedure of using the progressive NN for detecting delamination in beams. The purpose of initial training is to establish a preliminary nonlinear mapping relationship between the amplitudes of the displacement response on the surface of the beam and the delamination parameters. The training samples comprise a set of values of delamination parameters that represent the geometry of delaminations and the corresponding displacement responses calculated from the SEM code. After the initial training of the NN model using the SEM code, identification of delamination parameters begins by feeding the experimental displacement response measured from the scanning laser vibrometer. The outputs of the NN model are the predicted delamination parameters, which are then input to the SEM code to produce a set of calculated displacement responses. If the distance norm defined between the computed and measured responses as defined in Equation 6.17 exceeds the permissible error, then the NN will be retrained using the adjusted training samples that contain the currently predicted parameters and corresponding calculated displacement responses using the SEM code. The retrained NN model is then used to identify the delamination parameters again by feeding in the measured displacement responses. The number of surface nodes at the input and output layers of the NN correspond, respectively, to the displacement responses and the number of
FIGURE 10.39 Experimental displacement and that calculated using the inversely detected geometry parameters of the crack. (a) Beam with 25 mm crack at frequency 20 kHz. (b) Beam with 35 mm crack at frequency 14.1 kHz. (c) Beam with 45 mm crack at frequency 9.2 kHz. (d) Beam with 45 mm crack at frequency 19 kHz. (Each panel plots normalized displacement against measurement point for the measured displacement and the inversely calculated displacement.)
FIGURE 10.40 Flow chart of the progressive NN model for crack detection. Training stage: predetermined delamination parameters of a laminated beam with a horizontal delamination are passed to the forward solver; the calculated displacement responses of the beam surface serve as inputs of the training data and the delamination parameters as outputs; the NN model is adjusted until convergence. Detection stage: measured responses are input to the NN model; the identified delamination parameters are passed to the forward solver; if the calculated responses are not acceptable, the samples are adjusted and the NN model is retrained; otherwise the procedure stops. (From Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.)
delamination parameters. The method for approximately determining the proper neuron number in hidden layers and the modified back-propagation algorithm with a dynamically adjusted learning rate and an additional jump factor to avoid stagnation in the learning process are employed. The orthogonal array method is adopted for the combination of discrete parameters in order to reduce the number of training samples. The inputs as well as outputs of the samples are normalized. These issues have been detailed in Chapter 6.
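The train, identify, check, and retrain cycle outlined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: `toy_forward` is a hypothetical one-parameter stand-in for the SEM code, a least-squares linear map stands in for the NN, and one sample is replaced per round (the study replaces several). All function names are chosen here for illustration only.

```python
import numpy as np

def toy_forward(p):
    """Hypothetical stand-in for the SEM forward solver: parameters -> response."""
    return np.array([p[0] + 0.5 * p[0] ** 2])

def fit_inverse_model(responses, params):
    """Stand-in for NN training: least-squares linear map, response -> parameters."""
    X = np.hstack([responses, np.ones((len(responses), 1))])  # append bias column
    W, *_ = np.linalg.lstsq(X, params, rcond=None)
    return W

def identify(W, response):
    """Apply the trained inverse model to one response vector."""
    return np.append(response, 1.0) @ W

def progressive_identify(forward, init_params, measured, tol=1e-3, max_rounds=10):
    params = np.array(init_params, dtype=float)           # (n_samples, n_params)
    responses = np.array([forward(p) for p in params])    # (n_samples, n_points)
    for round_no in range(1, max_rounds + 1):
        W = fit_inverse_model(responses, params)          # (re)train the inverse model
        p_hat = identify(W, measured)                     # identify from the measurement
        r_hat = forward(p_hat)                            # forward check of the prediction
        if np.linalg.norm(r_hat - measured) <= tol:       # distance norm acceptable?
            break
        # Adjust samples: overwrite the sample whose response is farthest
        # from the measurement with the newly predicted pair.
        worst = np.argmax(np.linalg.norm(responses - measured, axis=1))
        params[worst], responses[worst] = p_hat, r_hat
    return p_hat, round_no

# Synthetic "measurement" generated with a true parameter value of 0.8.
measured = toy_forward(np.array([0.8]))
p_hat, rounds = progressive_identify(toy_forward, [[0.0], [0.5], [1.0]], measured)
```

Because each retraining concentrates the sample set around the current estimate, the inverse map only has to be accurate locally, which is the motivation for the progressive scheme.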
10.7.2 Composite Specimen
The SEM code is first employed to compute the numerical dynamic responses of laminated composites containing some predetermined delamination parameters. These numerical responses are then used as the training data for the NN. The performance of the NN is tested on experimental data obtained by scanning laser vibrometer for the inverse determination of delamination parameters in laminated composites. Carbon/epoxy composite beams of 390 mm in length, 20 mm in width, and 7 mm in thickness are considered. The height of the beam is H = 7 mm. The composite beam is made of 20 plies and is cured using autoclave molding; the stacking sequence is [0°/90°]s. Four beam specimens (9A, 9B, 9C, and 9D) are made. Beam 9A, which is free of delamination, is used as a
reference. For beam 9B through beam 9D, the delamination is 25 mm long. The delamination lies 1.75 mm below the beam surface in beam 9B, 3.5 mm in beam 9C, and 5.25 mm in beam 9D. The delamination, a rectangular void that spans the width of the beam, is prepared by placing a stainless steel insert (of 0.1-mm thickness) between layers. This insert creates a delamination between the top and bottom plies while the composite is cured and is removed after curing. Beam displacement along its surface is measured and analyzed using a commercial scanning laser vibrometer. Sinusoidal excitation with different frequencies is generated using a waveform generator and applied to the specimen via an electromagnetic exciter. At each measurement point, five measurements are taken and averaged using complex averaging. This averaging acts in practice as a filter that suppresses random noise and improves the quality of the measured data. In this measurement, the excitation point is fixed and the response is measured along the beam surface.
10.7.3 Delamination Detection
The input for the NN is taken from beam responses computed using the SEM code at 21 points spaced equally over the length of 105 mm along the beam; hence the number of neurons at the input layer of the NN is 21. For this application, the training data are only taken from the displacement (amplitude) responses before the delamination region (region I) until the tip of the delamination that can be identified as the displacement response begins to decay. The training data in these regions are also chosen in order to minimize the effect of small oscillation on the experimental results at locations beyond the delaminated region. The discrete values of delamination parameters are presented in Table 10.4. Two hidden layers are used; the numbers of neurons are assigned to be 24 and 9 for first and second hidden layers, respectively. Based on the orthogonal array method and considering the completeness of the sample space, 15 combinations of delamination parameters are used as the initial training data for the NNs. The actual and normalized training data of delamination parameters are presented in Table 10.5. For the initial training process, the NN converges after 16,653 iterations of training. In order to test the performance of the neural networks, the experimental data are input to the trained networks. As the result of the first training

TABLE 10.4
Discrete Values of Delamination Parameters (mm)

Location    0    20      40     60      80    92.5
Depth       0    1.75    3.5    5.25
Length      0    25

Source: Ishak, S.I. et al., Composites, Part B, 32, 287–298, 2001. With permission.
TABLE 10.5
Arrays of Training Data for Initial Training of NN from Composite Laminated Beam [C0/90]

            Location                 Depth                    Length
No.   Actual (mm)  Normalized  Actual (mm)  Normalized  Actual (mm)  Normalized
  1       0.000       0.000       0.000       0.000       0.000       0.000
  2      40.000       0.393       0.000       0.000       0.000       0.000
  3      92.500       0.909       0.000       0.000       0.000       0.000
  4       0.000       0.000       1.750       0.303      25.000       0.909
  5      20.000       0.197       1.750       0.303      25.000       0.909
  6      60.000       0.590       1.750       0.303      25.000       0.909
  7      92.500       0.909       1.750       0.303      25.000       0.909
  8       0.000       0.000       3.500       0.606      25.000       0.909
  9      20.000       0.197       3.500       0.606      25.000       0.909
 10      40.000       0.393       3.500       0.606      25.000       0.909
 11      60.000       0.590       3.500       0.606      25.000       0.909
 12      20.000       0.197       5.250       0.909      25.000       0.909
 13      40.000       0.393       5.250       0.909      25.000       0.909
 14      60.000       0.590       5.250       0.909      25.000       0.909
 15      92.500       0.909       5.250       0.909      25.000       0.909

Note: Normalization is performed using Equation 6.14.
Source: Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.
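Equation 6.14 is not reproduced in this section. The normalized values tabulated above are, however, consistent with a min-max scaling that leaves a 10% margin at both ends of each parameter range. The sketch below uses that inferred form; it is an assumption drawn from the tabulated numbers, and the function names are illustrative.

```python
def normalize(x, x_min, x_max):
    """Scale x with a 10% margin at both ends of [x_min, x_max].

    Inferred from the tabulated values, not quoted from Equation 6.14.
    """
    lo, hi = 0.9 * x_min, 1.1 * x_max
    return (x - lo) / (hi - lo)

def denormalize(y, x_min, x_max):
    """Inverse of normalize(): map a scaled value back to physical units."""
    lo, hi = 0.9 * x_min, 1.1 * x_max
    return lo + y * (hi - lo)

# Delamination location 40 mm within the range 0 to 92.5 mm (Table 10.5)
location_norm = normalize(40.0, 0.0, 92.5)
# Delamination depth 1.75 mm within the range 0 to 5.25 mm (Table 10.5)
depth_norm = normalize(1.75, 0.0, 5.25)
```

With these ranges the two values round to 0.393 and 0.303, matching the corresponding table entries. Keeping the scaled targets away from 0 and 1 is a common choice when the output neurons saturate at those limits.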
process, the distance norm defined by Equation 6.17 between the calculated displacement responses and the measured ones is large. For example, in the case of ac = 80/H, dc = 1.75/H, and lc = 25/H, the errors for the identified ac, dc, and lc are as high as 23.56, –16.71, and –3.00%, respectively. The NNs are then retrained. Three new samples are generated using the initially identified delamination parameters, and the corresponding displacement responses are calculated using the SEM code. These new samples are then used to replace three samples with longer distance norms. The NNs are retrained with these adjusted sample sets and then used again to identify the delamination parameters. The convergence processes of the error norm within the first 4000 epochs for the NNs during the first three trainings are shown in Figure 10.41. Through the retrainings, the system error norm is reduced significantly. For the second and third training, the NN converges after 13,522 and 9719 iterations, respectively. As the retraining process progresses, the identified delamination parameters get closer to the measured ones: the errors for ac, dc, and lc can be as low as 3.12, –9.91, and –0.18%, respectively. Figure 10.42(a) and Figure 10.42(b) show the convergence of the delamination parameters predicted by the NNs for the cases of ac = 80/H, dc = 1.75/H, lc = 25/H and ac = 80/H, dc = 3.5/H, lc = 25/H, respectively. The comparison between the actual and predicted delamination parameters for the first 15 training data and 9 testing data is presented in Table 10.6. It is shown that the NN can correctly determine the location,
FIGURE 10.41 The convergence process of the NN model against the number of iterations in the training process (system norm error, in percent, over the first 4000 iterations of the 1st, 2nd, and 3rd training). (From Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.)

FIGURE 10.42 The convergence of the error of the delamination parameters in the identification process using the NN (training data from region I and region III): (a) case of ac = 80/H, dc = 1.75/H, lc = 25/H; (b) case of ac = 80/H, dc = 3.5/H, lc = 25/H. Each panel plots the percentage of error in location, depth, and length against the number of training. (From Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.)
TABLE 10.6
Comparison between Actual and Identified Delamination Parameters of Composite Laminated Beam [C0/90] Using Progressive NN

          Location (mm)           Depth (mm)             Length (mm)            Percentage of Error
No.    Measured  Calculated    Measured  Calculated    Measured  Calculated    Location       Depth      Length
  1        0.00        0.89        0.00        0.00        0.00        0.00        0.00        0.00        0.00
  2       20.00       19.10        0.00        0.00        0.00        0.00       –4.51        0.00        0.00
  3       40.00       39.92        0.00        0.00        0.00        0.00       –0.21        0.00        0.00
  4       60.00       59.58        0.00        0.00        0.00        0.00       –0.69        0.00        0.00
  5       80.00       70.32        0.00        0.00        0.00        0.00      –12.10        0.00        0.00
  6       92.50       92.41        0.00        0.00        0.00        0.01       –0.10        0.00        0.00
  7        0.00        0.02        1.75        1.83       25.00       25.03        0.00        4.45        0.10
  8       20.00       19.06        1.75        1.68       25.00       25.04       –4.71       –3.74        0.17
  9       40.00       38.69        1.75        1.80       25.00       24.99       –3.29        2.70       –0.05
 10       60.00       59.39        1.75        1.69       25.00       25.01       –1.01       –3.48        0.03
 11       80.00       82.50        1.75        1.58       25.00       24.96        3.12       –9.91       –0.17
 12       92.50       92.23        1.75        1.77       25.00       25.04       –0.30        0.91        0.16
 13        0.00        0.00        3.50        2.95       25.00       25.10        0.00      –15.78        0.39
 14       20.00       17.25        3.50        3.46       25.00       25.03      –13.77       –1.08        0.13
 15       40.00       35.71        3.50        3.52       25.00       25.05      –10.71        0.68        0.19
 16       60.00       55.31        3.50        3.19       25.00       24.96       –7.81       –8.94       –0.15
 17       80.00       79.60        3.50        3.29       25.00       25.03       –0.50       –6.10        0.11
 18       92.50       92.73        3.50        3.52       25.00       24.97        0.25        0.50       –0.12
 19        0.00        0.00        5.25        5.25       25.00       24.99        0.00        0.01       –0.02
 20       20.00       20.52        5.25        5.26       25.00       24.99        2.61        0.18       –0.02
 21       40.00       38.27        5.25        5.26       25.00       25.02       –4.33        0.21        0.07
 22       60.00       63.37        5.25        5.25       25.00       24.97        5.62        0.08       –0.13
 23       80.00       81.27        5.25        5.25       25.00       24.98        1.58        0.07       –0.10
 24       92.50       92.37        5.25        5.25       25.00       25.02       –0.14        0.07        0.07

Note: Case 1: training data from regions I and III.
Source: Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.
depth, and length of the delamination over a wide range of delamination parameters for the laminated composite beam. For all the cases discussed, the percentage of errors between the measured and calculated delamination parameters is less than 16%.
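The tabulated percentage errors appear to follow the usual relative-error definition, (identified – actual)/actual × 100; for instance, an identified location of 70.32 mm against an actual 80.00 mm gives –12.10%. The helper below is a small sketch of that calculation. The definition is inferred from the table rows rather than quoted from the text, and returning NaN for a zero reference value is a choice made here (Table 10.6 lists 0.00 in such rows).

```python
import math

def percent_error(identified, actual):
    """Relative error in percent; undefined (NaN) when the actual value is zero."""
    if actual == 0.0:
        return math.nan
    return 100.0 * (identified - actual) / actual

# Two location entries from Table 10.6 (case 1)
err_row5 = percent_error(70.32, 80.00)    # delamination at the surface, location 80 mm
err_row11 = percent_error(82.50, 80.00)   # depth 1.75 mm, location 80 mm

# Largest error magnitude among a set of identified values
worst = max(abs(e) for e in (err_row5, err_row11))
```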
10.7.4 Effect of Different Training Data
The effect of training data on the convergence process and accuracy of the NN is studied by inputting the displacement response obtained from different regions of the beam. Three cases are investigated:
• Case 1: training data are taken from region I and region III. This case has been studied already and the results presented (Table 10.6; Section 10.7.3).
• Case 2: training data are taken from the entire range of the beam response (regions I, II, and III). A total of 33 equally spaced points from the beam response is used as the input for the NNs. For the initial training process, NNs have converged after 16,909 iterations.
• Case 3: training data are taken from the first half of the beam response (only region I). A total of 17 equally spaced points from the beam response is used as the input for the NNs. For the initial training process, NNs have converged after 12,365 iterations.
The performance of the NNs is also examined on 24 different configurations of delaminations listed in Table 10.6. When the predicted results are not acceptable, retraining the NNs is required using the adjusted sample set, as done for case 1. It is found that (1) for the second training, the NN for case 2 and case 3 converges after 10,150 and 6182 iterations, respectively, and (2) for the third training, the NN for case 2 and case 3 converges after 8421 and 4868 iterations, respectively. For case 2, some identified delamination parameters are still not acceptable after three trainings, so a fourth training of the NN is performed. The NN then satisfies the given convergence criterion after 3856 iterations. The number of iterations required to converge for these three cases is summarized in Table 10.7. Figure 10.43 and Figure 10.44 show samples of the convergence process of these delamination parameters for case 2 and case 3, respectively.
These results indicate that the error of the identified delamination parameters decreases significantly with the progression of the retraining process. The errors, in percentages, for case 2 and case 3 are presented in Table 10.8 and Table 10.9, respectively. Comparing Table 10.6, Table 10.8, and Table 10.9, it is found that
• For most delamination configurations, the percentage of the error for case 3 is smallest compared to case 1 and case 2.
TABLE 10.7
Number of Iterations Required to Converge for Three Cases of Different Training Data for Detection of Delamination of Composite Laminated Beam [C0/90]

                                           Satisfied Convergence Criterion after No. of Iterations
Case No.   Training Data from Region(s)    1st Training   2nd Training   3rd Training   4th Training
1          I and III                       16,653         13,522         9,719          -
2          I, II, and III                  16,909         10,150         8,421          3856
3          I                               12,365         6,182          4,868          -

Source: Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.

FIGURE 10.43 The convergence of the error of the delamination parameters in the identification process using the NN (case 2, training data from regions I, II, and III): (a) ac = 80/H, dc = 1.75/H, lc = 25/H; (b) ac = 80/H, dc = 3.5/H, lc = 25/H. Each panel plots the percentage of error in location, depth, and length against the number of training. (From Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.)
• The largest percentage error is found in case 2, in which data in region II are included in the training data.
• Compared with the other two cases, case 1 shows a better error distribution for the 24 delamination configurations.
These results indicate that, by choosing more sensitive and accurate training sets of data, the convergence can be achieved faster, and the delamination parameters can be predicted much more accurately. These results indicate
FIGURE 10.44 The convergence of the error of the delamination parameters in the identification process using the NN (case 3, training data from region I): (a) ac = 80/H, dc = 1.75/H, lc = 25/H; (b) ac = 80/H, dc = 3.5/H, lc = 25/H. Each panel plots the percentage of error in location, depth, and length against the number of training. (From Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.)
that the accuracy of identification from the NN model depends significantly on the sensitivity and errors of measured displacement response of delaminated beams.
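As a quick check on the training effort, the iteration counts reported in Table 10.7 can be summed per case. The snippet below only tallies the published numbers; the dictionary keys are labels chosen here.

```python
# Iterations to convergence per training round, as reported in Table 10.7
iterations = {
    "case 1 (regions I and III)": [16653, 13522, 9719],
    "case 2 (regions I, II, and III)": [16909, 10150, 8421, 3856],
    "case 3 (region I)": [12365, 6182, 4868],
}

# Total iterations spent over all training rounds of each case
totals = {case: sum(counts) for case, counts in iterations.items()}

# Case with the smallest total training effort
cheapest = min(totals, key=totals.get)
```

The totals are 39,894 for case 1, 39,336 for case 2 (which needs a fourth training), and 23,415 for case 3, so the region I data require the fewest iterations overall.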
10.7.5 Use of Beam Model and Harmonic Excitation
A beam model analyzing transverse impact-wave propagation in beams for the detection and assessment of cracks in beams is presented in Section 10.3. In this model, beam responses are investigated and analyzed in the frequency domain. The major advantage of the beam model for solving wave scattering problems is the much smaller number of equations compared with the FEM and even the SEM. Consequently, much shorter computing time is needed for solutions of comparable accuracy. Therefore, it is used here for inverse detection of delamination in beams. The experimental study is conducted using a laser vibrometer to verify the numerical results of the beam model. Comparison studies between the beam model and laser measurement results are conducted on aluminum beams prior to the application of the NN. The aluminum beams under study are 800 mm in length, 25.4 mm in width,
TABLE 10.8
Comparison between Actual and Identified Delamination Parameters for Detection of Delamination Parameters of Composite Laminated Beam [C0/90]

          Location (mm)           Depth (mm)             Length (mm)            Percentage of Error
No.    Measured  Calculated    Measured  Calculated    Measured  Calculated    Location       Depth      Length
  1        0.00        0.64        0.00        0.00        0.00        0.00        0.00        0.00        0.00
  2       20.00       17.19        0.00        0.00        0.00        0.01      –14.07        0.00        0.00
  3       40.00       44.17        0.00        0.00        0.00        0.01       10.43        0.00        0.00
  4       60.00       67.78        0.00        0.00        0.00        0.01       12.96        0.00        0.00
  5       80.00       88.45        0.00        0.00        0.00        0.02       10.56        0.00        0.00
  6       92.50       94.07        0.00        0.00        0.00        0.01        1.69        0.00        0.00
  7        0.00        1.43        1.75        1.61       25.00       24.85        0.00       –8.05       –0.62
  8       20.00       23.48        1.75        1.96       25.00       24.81       17.42       12.28       –0.75
  9       40.00       41.48        1.75        1.84       25.00       25.00        3.71        5.28       –0.01
 10       60.00       69.09        1.75        1.73       25.00       25.01       15.15       –1.32        0.05
 11       80.00       85.96        1.75        1.54       25.00       23.86        7.45      –11.88       –4.58
 12       92.50       93.83        1.75        1.76       25.00       29.45        1.44        0.76       –0.21
 13        0.00        0.37        3.50        3.12       25.00       24.70        0.00      –10.97       –1.21
 14       20.00       22.76        3.50        3.05       25.00       24.73       13.81      –12.19       –1.09
 15       40.00       42.09        3.50        3.57       25.00       25.06        5.24        1.95        0.23
 16       60.00       67.20        3.50        2.95       25.00       24.51       11.99      –15.59       –1.95
 17       80.00       81.51        3.50        3.12       25.00       24.12        1.89      –10.87       –3.51
 18       92.50       94.25        3.50        3.50       25.00       25.00        1.89        0.13        0.00
 19        0.00        0.05        5.25        4.57       25.00       23.55        0.00      –13.02       –5.81
 20       20.00       20.37        5.25        5.26       25.00       24.95        1.85        0.14       –0.21
 21       40.00       42.71        5.25        5.02       25.00       23.96        6.79       –4.48       –4.18
 22       60.00       53.57        5.25        5.21       25.00       24.73      –10.71       –0.69       –1.09
 23       80.00       68.75        5.25        5.29       25.00       24.85      –14.06        0.78       –0.62
 24       92.50       94.53        5.25        5.27       25.00       24.98        2.19        0.30       –0.10

Note: Case 2: training data from regions I, II, and III.
Source: Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.
TABLE 10.9
Comparison between Actual and Identified Delamination Parameters for Detection of Delamination Parameters of Composite Laminated Beam [C0/90]

          Location (mm)           Depth (mm)             Length (mm)            Percentage of Error
No.    Measured  Calculated    Measured  Calculated    Measured  Calculated    Location       Depth      Length
  1        0.00       68.84        0.00        0.00        0.00        0.00        0.00        0.00        0.00
  2       20.00       21.03        0.00        0.00        0.00        0.01        5.16        0.00        0.00
  3       40.00       40.75        0.00        0.00        0.00        0.01        1.88        0.00        0.00
  4       60.00       53.62        0.00        0.00        0.00        0.01      –10.63        0.00        0.00
  5       80.00       74.69        0.00        0.00        0.00        0.01       –6.63        0.00        0.00
  6       92.50       91.03        0.00        0.00        0.00        0.01       –1.59        0.00        0.00
  7        0.00        0.15        1.75        1.64       25.00       24.89        0.00       –6.02       –0.44
  8       20.00       20.97        1.75        1.76       25.00       24.94        4.85        0.29       –0.23
  9       40.00       37.58        1.75        1.74       25.00       25.06       –6.06       –0.67        0.23
 10       60.00       59.84        1.75        1.83       25.00       25.00       –0.27        4.68        0.00
 11       80.00       79.53        1.75        1.56       25.00       24.89       –0.59       10.80       –0.44
 12       92.50       92.69        1.75        1.77       25.00       25.01        0.21        0.91        0.02
 13        0.00        0.01        3.50        3.46       25.00       24.94        0.00       –1.10       –0.23
 14       20.00       16.52        3.50        3.36       25.00       25.02      –17.38       –3.99        0.08
 15       40.00       38.78        3.50        3.43       25.00       24.97       –3.06       –2.12       –0.11
 16       60.00       60.75        3.50        3.51       25.00       25.05        1.26        0.34        0.20
 17       80.00       79.98        3.50        3.48       25.00       25.06       –1.03       –1.46        1.23
 18       92.50       91.70        3.50        3.50       25.00       25.00       –0.87        0.11        0.00
 19        0.00        0.02        5.25        5.24       25.00       24.93        0.00       –0.21       –0.27
 20       20.00       20.21        5.25        5.25       25.00       25.03        1.04        0.07        0.12
 21       40.00       37.08        5.25        5.14       25.00       25.01       –7.31       –2.10        0.06
 22       60.00       58.76        5.25        5.28       25.00       29.96       –2.07        0.55       –0.22
 23       80.00       81.74        5.25        5.31       25.00       24.96        2.17        1.09       –0.15
 24       92.50       92.32        5.15        5.24       25.00       25.01       –0.20       –0.16        0.03

Note: Case 3: training data from region I.
Source: Ishak, S.I. et al., Composites: Part B, 32, 287–298, 2001. With permission.
and 4.5 mm in thickness and located 5 mm below the surface. The height of the beam is H = 25.4 mm. The crack is 35 mm long in beam 10A, 45 mm long in beam 10B, and 55 mm long in beam 10C. The results are presented in Figure 10.45, which shows that the beam model results agree reasonably well with the experimental results. The results indicate that the displacement response of beam 10A, beam 10B, and beam 10C is perturbed due to the presence of the delamination, implying that the displacement response may be used to locate and size cracks in beams. The results in Figure 10.45 also indicate the general trend of the magnitude

FIGURE 10.45 Normalized displacement responses of beams 10A, 10B, and 10C excited by a harmonic concentrated load at the origin. (Three panels plot normalized response against distance from the excitation point, in cm, for the beam model and the laser measurement.)
TABLE 10.10
Discrete Values of Crack Parameters for Crack Detection of Aluminum Beam (mm)

Location    245    200    150    100    50
Depth       5      20
Length      35     45     55
of displacement beginning to increase to attain the peak value in approaching the delaminated region, after which the displacement attains a steady-state value. The input for the NN is taken from beam model responses of 30 points spaced equally over the length of 280, 290, and 300 mm along beam 10A, beam 10B, and beam 10C, respectively. The training data of the displacement responses are sampled before and on the crack region (region 1 and region 2 in Figure 10.9). The training data in these two regions are chosen in order to minimize the effect of small oscillation on the experimental results at locations beyond the cracked region. The discrete values of the crack parameters are presented in Table 10.10. Following the procedure described in Section 10.7.3, the actual and normalized training data of crack parameters are presented in Table 10.11. In the initial training process, the NNs have satisfied the given convergence criterion after 25,731 iterations. In order to test the performance of the

TABLE 10.11
Arrays of Training Data for NN for Crack Detection of Aluminum Beam

      Location                 Depth                    Length
Actual (mm)  Normalized  Actual (mm)  Normalized  Actual (mm)  Normalized
   245.00       0.89         5.00       0.03        35.00       0.12
   100.00       0.24         5.00       0.03        35.00       0.12
   245.00       0.89        20.00       0.89        35.00       0.12
   200.00       0.69        20.00       0.89        35.00       0.12
   100.00       0.24        20.00       0.89        35.00       0.12
   245.00       0.89         5.00       0.03        45.00       0.47
   200.00       0.69         5.00       0.03        45.00       0.47
   150.00       0.47         5.00       0.03        45.00       0.47
    50.00       0.02         5.00       0.03        45.00       0.47
   200.00       0.69        20.00       0.89        45.00       0.47
   150.00       0.47        20.00       0.89        45.00       0.47
   100.00       0.24        20.00       0.89        45.00       0.47
   245.00       0.89         5.00       0.03        55.00       0.81
   150.00       0.47         5.00       0.03        55.00       0.81
   100.00       0.24         5.00       0.03        55.00       0.81
    50.00       0.02         5.00       0.03        55.00       0.81
   200.00       0.69        20.00       0.89        55.00       0.81
   100.00       0.24        20.00       0.89        55.00       0.81

Note: Normalization is performed using Equation 6.14.
FIGURE 10.46 The convergence of the error of the crack parameters in the identification process using the progressive NN: (a) aluminum beam case of ac = 150/H, dc = 5.0/H, lc = 35/H; (b) aluminum beam case of ac = 245/H, dc = 20/H, lc = 45/H; (c) aluminum beam case of ac = 245/H, dc = 20/H, lc = 55/H. Each panel plots the percentage of error in location, depth, and length against the number of training. The height of the beam is H = 25.4 mm.
initially trained NNs, 30 configurations, including 12 new configurations of crack parameters, are input to the trained networks. These samples have not been included in the initial training samples. The displacement responses for these cases are taken from experimental measurements. The NNs are retrained with the adjusted sample sets and then are used again for identifying the crack parameters. For the second and third training, the NNs have satisfied the given convergence criterion after 14,856 and 9193 iterations, respectively. Figure 10.46 shows three samples of convergence process of crack parameters identified from the NNs. These results indicate that the error of the
identified crack parameters decreases significantly with the progression of the retraining process. The comparison results between the actual and identified crack parameters for 30 testing data are listed in Table 10.12, which shows that the beam model of wave propagation and NNs can correctly determine the location, depth, and length of the crack over a wide range of crack parameters contained in the cracked aluminum beam. For these cases, the percentage of errors between the actual and identified crack parameters is less than 17%. Compared with the case using SEM code as the forward solver, use of the beam model gives larger errors because the beam model of wave propagation is based on the Euler beam theory and SEM is based on the exact theory of elasticity.
10.7.6 Use of Beam Model and Impact Excitation
The comparison studies carried out in Section 10.4 indicate that the beam model results agree reasonably well with the experiment. These results indicate that the time domain response is perturbed due to the presence of a crack, implying that the time domain response may be used to locate and size cracks in beams. After extensive comparison studies between the beam model results and experiments for the cases of beam 4A, beam 4B, beam 4C, and beam 4D, the beam models are used together with the progressive NNs for the inverse identification of crack parameters in the beams. The true crack parameters are presented in Table 10.13. The input for the NN is taken from beam model responses sampled at 40 points, starting from the first peak response and ending with the response at 350 µs. These training data correspond to the part of the displacement response with which the reflected wave has not interfered; thus, good correlation between beam model results and experiment is maintained. The actual and normalized training data of crack parameters are presented in Table 10.14. For the initial training process, the NNs have satisfied the given convergence criterion after 14,751 iterations. After the initial training of the NN model, identification of crack parameters begins by feeding into the NN the experimental displacement response measured from the scanning laser vibrometer. For the second and third retraining, the NNs have satisfied the given convergence criterion after 12,929 and 10,097 iterations, respectively. Figure 10.47 shows three samples of the convergence process of the crack parameters identified from the NNs. These results indicate that the error of the identified crack parameters decreases significantly with the progress of the retraining process.
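The input window described above (40 samples between the first peak and 350 µs) can be extracted from a sampled time history as sketched below. This is an illustration under assumptions: the text does not state how the 40 points are spaced, so equal spacing with linear interpolation is assumed, the first peak is taken as the first local maximum of the response magnitude, and the function name and synthetic signal are hypothetical.

```python
import numpy as np

def window_features(t_us, u, t_end_us=350.0, n_points=40):
    """Return n_points samples of u between its first peak and t_end_us."""
    t_us = np.asarray(t_us, dtype=float)
    u = np.asarray(u, dtype=float)
    mag = np.abs(u)
    # Interior samples larger than the previous one and not smaller than the next.
    # Noisy measurements would need smoothing before this step.
    peaks = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:]))[0] + 1
    if peaks.size == 0:
        raise ValueError("no peak found in the response")
    t_start = t_us[peaks[0]]
    if t_start >= t_end_us:
        raise ValueError("first peak lies beyond the end of the window")
    t_new = np.linspace(t_start, t_end_us, n_points)   # assumed equal spacing
    return t_new, np.interp(t_new, t_us, u)

# Synthetic time history sampled every 1 µs (illustrative only)
t = np.arange(0.0, 500.0, 1.0)
u = np.sin(2.0 * np.pi * t / 200.0)
t_win, u_win = window_features(t, u)
```

Cutting the window at a fixed end time is what keeps the reflected wave out of the NN input, as noted in the text; the end time would have to be chosen again for a different beam length or wave speed.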
The comparison between the actual and identified crack parameters for 21 testing data is presented in Table 10.15, which shows that the beam model of wave propagation and NNs can correctly determine the location, depth, and length of the crack over a wide range of crack parameters for the
TABLE 10.12
Comparison between Actual and Detected Crack Parameters from Progressive NN for Aluminum Beam

     Location (mm)         Depth (mm)            Length (mm)           Percentage of Error
No.  Measured  Calculated  Measured  Calculated  Measured  Calculated  Location  Depth   Length
1    245.00    251.94      5.00      5.11        35.00     35.24       2.83      2.15    0.68
2    200.00    196.67      5.00      4.97        25.00     35.27       -1.66     -0.55   0.78
3    150.00    124.61      5.00      4.97        35.00     35.27       -16.93    -0.55   0.78
4    100.00    85.70       5.00      5.03        35.00     34.44       -14.30    0.57    -1.61
5    50.00     47.76       5.00      4.98        35.00     35.35       -4.48     -0.31   1.01
6    245.00    218.63      20.00     20.32       35.00     34.87       -10.76    1.61    -0.37
7    200.00    190.39      20.00     19.99       35.00     35.12       -4.81     -0.04   0.35
8    150.00    159.45      20.00     19.99       35.00     34.67       6.30      -0.06   -0.94
9    100.00    91.22       20.00     19.94       35.00     34.66       -8.78     -0.30   0.96
10   50.00     45.34       20.00     18.93       35.00     35.61       -9.33     -5.33   1.74
11   245.00    248.53      5.00      4.99        45.00     45.68       1.44      -0.17   1.51
12   200.00    209.20      5.00      4.99        45.00     43.89       4.60      -0.24   -2.47
13   150.00    162.73      5.00      4.96        45.00     44.32       8.49      -0.86   -1.52
14   100.00    104.87      5.00      4.96        45.00     44.55       4.87      -0.90   -0.99
15   50.00     48.01       5.00      5.01        45.00     47.20       -3.98     0.15    4.88
16   245.00    249.63      20.00     20.16       45.00     43.39       1.89      0.78    -3.57
17   200.00    172.85      20.00     19.96       45.00     38.53       -13.57    -0.18   -14.38
18   150.00    139.25      20.00     19.98       45.00     46.45       -7.17     -0.11   3.22
19   100.00    98.92       20.00     19.98       45.00     43.17       -1.08     -0.09   -4.06
20   50.00     46.21       20.00     20.12       45.00     47.84       -7.58     0.60    6.30
21   245.00    247.90      5.00      5.02        55.00     55.94       1.18      0.36    1.70
22   200.00    205.72      5.00      4.88        55.00     54.48       2.86      -2.33   -0.94
23   150.00    143.22      5.00      4.95        55.00     55.91       -4.52     -0.94   1.65
24   100.00    104.29      5.00      4.97        55.00     55.70       4.29      -0.66   1.28
25   50.00     47.58       5.00      5.01        55.00     55.84       -4.84     0.29    1.53
26   245.00    240.67      20.00     19.98       55.00     55.41       -1.77     -0.08   0.75
27   200.00    190.41      20.00     20.05       55.00     54.93       -4.80     0.24    -0.13
28   150.00    130.20      20.00     20.01       55.00     55.10       -13.20    0.03    0.18
29   100.00    108.65      20.00     19.97       55.00     54.89       8.65      -0.17   -0.21
30   50.00     47.51       20.00     19.97       55.00     55.79       -4.97     -0.16   1.44
TABLE 10.13
Crack Parameters for Crack Detection of Aluminum Beam (mm)

Location  Depth  Length
20        5      35
-17.5     20     45
-55              55

Source: Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.
TABLE 10.14
Arrays of Training Data for NN for Crack Detection of Aluminum Beam

Location                  Depth                     Length
Actual (mm)  Normalized   Actual (mm)  Normalized   Actual (mm)  Normalized
20.00        0.98         0.00         0.00         0.00         0.00
-55.00       0.07         0.00         0.00         0.00         0.00
-17.50       0.52         5.00         0.23         35.00        0.58
-55.00       0.07         5.00         0.23         35.00        0.58
20.00        0.98         20.00        0.91         35.00        0.58
-55.00       0.07         20.00        0.91         35.00        0.58
20.00        0.98         5.00         0.23         45.00        0.74
-17.50       0.52         5.00         0.23         45.00        0.74
-55.00       0.07         5.00         0.23         45.00        0.74
20.00        0.98         20.00        0.91         45.00        0.74
-55.00       0.07         20.00        0.91         45.00        0.74
20.00        0.98         5.00         0.23         55.00        0.91
-17.50       0.52         5.00         0.23         55.00        0.91
20.00        0.98         20.00        0.91         55.00        0.91
-17.50       0.52         20.00        0.91         55.00        0.91

Note: Normalization is performed using Equation 6.14.
Source: Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.
cracked aluminum beam. For these cases, the percentage error between the actual and identified crack parameters is less than 20%. Compared with the case of using the frequency response (Section 10.7.5), the results obtained using the transient response seem to be less accurate. This is due to the larger difference between the transient responses obtained from the experiment and from the beam models.
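The error percentages in the comparison tables appear to follow the usual signed relative error, (identified - actual) / actual × 100. The short sketch below assumes that convention, which is inferred from the tabulated numbers rather than stated in the text, and checks it against three location entries of Table 10.15. The function name is hypothetical, and entries with a zero actual value are returned as NaN because the source does not say how they were handled.

```python
import numpy as np

def percentage_error(actual, identified):
    """Signed relative error in percent; NaN where the actual value is zero."""
    actual = np.atleast_1d(np.asarray(actual, dtype=float))
    identified = np.atleast_1d(np.asarray(identified, dtype=float))
    out = np.full(actual.shape, np.nan)
    nonzero = actual != 0
    out[nonzero] = 100.0 * (identified[nonzero] - actual[nonzero]) / actual[nonzero]
    return out

# Location entries of rows 4 to 6 in Table 10.15 (mm).
actual_location = [20.00, -17.50, -55.00]
identified_location = [20.22, -16.36, -54.41]
errors = percentage_error(actual_location, identified_location)

# The 20% bound quoted in the text can be checked the same way.
within_bound = np.all(np.abs(errors) < 20.0)
```

Small differences from the tabulated percentages are expected, since the table entries were presumably computed before the identified values were rounded to two decimals.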
10.7.7 FEM as Forward Solver
The use of FEM has the big advantage of dealing with structures of complex geometry; therefore, using FEM as a forward solver has also been examined (Irwan, 2001). The finite element (FE) model for dynamic analysis of beams described in Chapter 7 has been adopted in this section for the cracked beams. The crack in the beam is modeled by removing a row of elements from the mesh. Therefore, it is actually a rectangular void. A fine mesh is
[Figure 10.47: three panels plotting percentage of error (location, depth, length) against number of training (0 to 4): (a) aluminum beam case of ac = −17.5H, dc = 5.0H, lc = 35H; (b) aluminum beam case of ac = −17.5H, dc = 20H, lc = 45H; (c) aluminum beam case of ac = 20H, dc = 5H, lc = 55H.]
FIGURE 10.47 The convergence of the error of the crack parameters in the identification process using the progressive NN and transient response. The height of the beam is H = 25.4 mm. (From Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.)
used around the crack regions for more accurate modeling. The dimensions of a cracked aluminum beam are listed in Table 10.16; the actual model and the FE model of the beam are illustrated in Figure 10.48. The FE analysis is performed using the commercial software package ABAQUS (2000), and the result has been validated against the experimental result obtained using the experimental setup illustrated in Figure 10.18. Excellent
TABLE 10.15
Comparison between Actual and Identified Crack Parameters from Progressive NN for Crack Detection of Aluminum Beam Using Transient Response

     Location (mm)         Depth (mm)            Length (mm)           Percentage of Error
No.  Measured  Calculated  Measured  Calculated  Measured  Calculated  Location  Depth   Length
1    20.00     19.91       0.00      0.04        0.00      0.07        -0.44     0.00    0.00
2    -17.50    -17.30      0.00      0.00        0.00      0.01        -1.13     0.00    0.00
3    -55.00    -54.58      0.00      0.00        0.00      0.01        -0.75     0.00    0.00
4    20.00     20.22       5.00      4.11        35.00     28.44       1.09      -17.72  -18.76
5    -17.50    -16.36      5.00      4.20        35.00     31.91       -6.50     -15.92  -8.84
6    -55.00    -54.41      5.00      4.38        35.00     35.60       -1.07     -12.40  1.73
7    20.00     20.10       20.00     17.73       35.00     34.52       0.51      -11.33  -1.38
8    -17.50    -16.22      20.00     19.66       35.00     33.49       -7.30     -1.70   -4.32
9    -55.00    -55.22      20.00     20.39       35.00     39.03       0.40      1.95    11.53
10   20.00     20.12       5.00      4.11        45.00     46.83       0.59      -17.72  4.06
11   -17.50    -18.22      5.00      4.63        45.00     48.97       4.11      -7.34   8.82
12   -55.00    -54.72      5.00      4.18        45.00     46.16       -0.51     -16.40  2.57
13   20.00     20.18       20.00     16.50       45.00     45.31       0.88      -17.50  0.70
14   -17.50    -17.20      20.00     19.75       45.00     45.01       -1.69     -1.23   0.03
15   -55.00    -55.23      20.00     20.15       45.00     45.39       0.42      0.73    0.87
16   20.00     19.96       5.00      4.49        55.00     55.27       -0.19     -10.28  0.49
17   -17.50    -17.83      5.00      5.35        55.00     56.94       1.89      6.96    3.52
18   -55.00    -55.10      5.00      5.93        55.00     54.68       0.18      18.67   -0.58
19   20.00     20.16       20.00     16.12       55.00     55.71       0.80      -19.40  1.29
20   -17.50    -17.48      20.00     20.02       55.00     54.94       -0.14     0.10    -0.11
21   -55.00    -54.89      20.00     19.85       55.00     54.88       -0.20     -0.76   -0.22

Source: Ishak, S.I. et al., J. Sound Vib., 252(2), 343–360, 2002. With permission.
TABLE 10.16
Dimensions of Cracked Aluminum Beam

Length of beam              1000 mm
Width of beam               4 mm
Height of beam (H)          25 mm
Flaw location (horizontal)  100 mm from impact point
Flaw depth                  5 mm below surface
Flaw length                 35 mm
Flaw thickness              1 mm
[Figure labels: impact point, horizontal crack, axes x and y.]
FIGURE 10.48 FE model of the cracked aluminum beam. (The dimensions of the beam are listed in Table 10.16.)
correlation between FE simulation and experimental results has been observed, as shown in Figure 10.49. Motivated by this good agreement with the experiment, an attempt is made to detect the crack parameters using the NN model, as in the previous subsection, but based on this FE model. FEM is first employed to compute the dynamic responses of cracked beams. These numerical responses are then used as the training data for the NN. The performance of the NN is tested on experimental data obtained with the scanning laser vibrometer for the inverse detection of crack parameters in the aluminum beams. Next, the crack parameters of the aluminum beam with the dimensions listed in Table 10.16 are inversely detected. A total of 28 data points from the transient response curve are sampled at equal intervals between 80 and 296 µs to act as the inputs for the NN model. This sampling window falls after the first peak and before the arrival of the waves reflected from the two ends of the beam. The bounds on the location and depth of the cracks are set to 100 mm ≤ ac ≤ 140 mm and 3 mm ≤ dc ≤ 7 mm, while the length of the cracks is bounded within the range 40 mm ≤ lc ≤ 56 mm. To create the training and test samples, a total of 81 sample combinations have been simulated using finite element simulation. The samples are arranged as a 3 × 9 × 3 combination, as shown in Table 10.17. Using the concept of the orthogonal array, 27 combinations are extracted from the sample space. Three more randomly generated samples are added to reinforce the sample set. These 30
[Figure 10.49: three panels plotting velocity (m/s) against time (s) from 0 to 0.0005 s for the flawed beam, each with experimental, simulated, and "vel at a pt near the edge" curves: comparison of velocity responses at a point before the flaw (40 mm from impact), at a point around the flaw (117.5 mm from impact), and at a point after the flaw (155 mm from impact).]
FIGURE 10.49 Comparisons of finite element simulation with experimental results of transient response of the aluminum beam. The thick curves are recorded by a sensor located near the beam end.
TABLE 10.17
Flaw Combinations in Sample Space for Aluminum Beam

Level  Location (mm)  Depth (mm)  Length (mm)
1      100            3           40
2      120            3.5         48
3      140            4           56
4      -              4.5         -
5      -              5           -
6      -              5.5         -
7      -              6           -
8      -              6.5         -
9      -              7           -
samples are then used to train the NN model. For a more detailed description of the implementation of the NN model, see Chapter 6. The trained network was subsequently fed with the transient wave responses of 51 samples, and the predicted crack parameters are listed in Table 10.18. The results were generally good, with 40 out of the 51 predictions having errors below 10%. Most of the large errors were found in the reconstruction of the depth of the crack. To improve the effectiveness of this network, more training data should be used. Overall, the use of this NN has been successful in identifying crack parameters from the given transient wave response. The inverse solution is much better than that obtained in Section 10.7.3, largely due to the improvement of the forward solver. The price paid for this accuracy is the CPU time and the modeling time needed to create the FEM model. With a more elaborate network encompassing a larger sample space, there is a very high potential for using the NN together with the transient wave response as a practical tool for NDE.
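The sample space and sampling window described for this example can be laid out as in the sketch below. The level values are those of Table 10.17 and the time window is the stated 80 to 296 µs. The 27-run selection rule `(i + j) % 3` is a hypothetical construction that yields a strength-2 orthogonal array for a 3 × 9 × 3 design; the text does not say which orthogonal array was actually used, and the three extra random samples are not reproduced.

```python
import itertools
from collections import Counter

import numpy as np

# Factor levels from Table 10.17 (mm).
locations = [100, 120, 140]
depths = [3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7]
lengths = [40, 48, 56]

# Full factorial sample space: 3 x 9 x 3 = 81 combinations.
full_space = list(itertools.product(locations, depths, lengths))

# Hypothetical 27-run orthogonal selection (assumption, not the authors' array):
# every (location, depth) and (depth, length) pair appears once,
# every (location, length) pair appears three times.
oa_samples = [
    (locations[i], depths[j], lengths[(i + j) % 3])
    for i in range(3)
    for j in range(9)
]

# 28 equally spaced sampling times between 80 and 296 microseconds.
sample_times = np.linspace(80.0, 296.0, 28)

# Balance checks for the selected runs.
loc_len_counts = Counter((a, l) for a, _, l in oa_samples)
dep_len_counts = Counter((d, l) for _, d, l in oa_samples)
```

The 28 sampling times work out to a constant spacing of 8 µs, which follows directly from the stated window and number of points.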
10.8 Discussion on Ill-Posedness

In the inverse analyses conducted in this chapter, no ill-posedness has been observed for the inverse problems. Type I ill-posedness is suppressed by using far more sampling points than crack/delamination parameters to be inversely identified. At least 21 sampling points were used and at most 3 parameters were identified, leading to the formulation of over-posed inverse problems. Because elastic waves are used for the inversion, obtaining more sampling points is not a problem and will not result in any increased cost in terms of computation and experiments. Type II ill-posedness is well suppressed by making sure that the response, whether harmonic or transient, is sensitive to the parameters to be inversely identified. This is achieved by careful computational and experimental investigation. The sensitivity study can usually be performed effectively by computation using the forward solver. Type III ill-posedness is mitigated by using an analytical type of forward solver (SEM and beam models) or, when the FEM model is used, by the effects of projection regularization and the regularization of filtering (see Chapter 3 for details). The projection regularization is achieved at two levels. One is the discretization using finite elements; the other is the discrete sampling of
TABLE 10.18
Detection of Crack Parameters for 51 Test Samples Using the Progressive NN

     Location (mm)         Depth (mm)            Length (mm)           Percentage error
#    Actual    Predicted   Actual    Predicted   Actual    Predicted   Location  Depth   Length
1    100.0     98.8        3.5       3.0         40.0      40.5        -1.16     -12.93  1.33
2    100.0     99.7        3.5       3.5         56.0      56.3        -0.28     -0.56   0.53
3    100.0     98.6        4.0       3.4         40.0      40.7        -1.40     -15.99  1.71
4    100.0     100.2       4.0       4.2         48.0      47.6        0.21      3.89    -0.74
5    100.0     99.2        4.5       4.1         40.0      40.5        -0.85     -8.16   1.36
6    100.0     100.0       4.5       4.4         56.0      56.0        -0.05     -2.92   0.07
7    100.0     99.7        5.0       5.0         40.0      40.2        -0.27     -0.96   0.44
8    100.0     99.6        5.0       5.0         48.0      48.2        -0.37     0.93    0.40
9    100.0     99.6        5.5       5.8         48.0      48.4        -0.44     4.88    0.76
10   100.0     100.0       5.5       5.6         56.0      55.9        0.05      2.61    -0.24
11   100.0     100.0       6.0       6.0         40.0      39.9        0.00      0.08    -0.26
12   100.0     99.6        6.0       6.4         48.0      48.4        -0.36     7.13    0.89
13   100.0     99.9        6.5       6.8         48.0      48.3        -0.14     4.80    0.64
14   100.0     99.8        6.5       6.1         56.0      56.4        -0.19     -6.38   0.80
15   100.0     100.2       7.0       6.8         40.0      40.2        0.20      -2.62   0.53
16   100.0     99.6        7.0       6.0         56.0      57.2        -0.42     -13.92  2.17
17   120.0     120.5       3.0       3.6         40.0      40.1        0.39      19.50   0.32
18   120.0     124.8       3.0       3.4         56.0      50.5        3.98      12.63   -9.86
19   120.0     120.2       3.5       3.7         40.0      40.1        0.15      6.60    0.23
20   120.0     118.6       3.5       3.2         48.0      48.8        -1.19     -7.51   1.64
21   120.0     118.4       4.0       3.7         48.0      49.0        -1.32     -7.24   2.08
22   120.0     118.8       4.0       4.1         56.0      56.5        -1.04     1.40    0.95
23   120.0     120.0       4.5       4.5         40.0      39.9        -0.02     -1.06   -0.16
24   120.0     119.1       4.5       4.2         48.0      48.9        -0.75     -6.00   1.98
25   120.0     119.7       5.0       4.8         48.0      48.6        -0.25     -3.59   1.18
26   120.0     121.3       5.0       4.9         56.0      56.0        1.04      -2.01   -0.02
27   120.0     120.2       5.5       5.6         40.0      40.0        0.14      1.10    -0.09
28   120.0     121.2       5.5       5.5         56.0      56.6        1.01      -0.34   1.07
29   120.0     120.0       6.0       6.1         48.0      47.6        0.03      2.07    -0.88
30   120.0     120.5       6.0       6.1         56.0      56.9        0.40      2.33    1.61
31   120.0     119.4       6.5       6.3         40.0      40.2        -0.51     -2.86   0.57
32   120.0     120.0       6.5       6.6         56.0      56.7        0.01      2.25    1.23
33   120.0     118.6       7.0       6.4         40.0      41.0        -1.20     -8.55   2.58
34   120.0     120.3       7.0       6.7         48.0      49.5        0.23      -3.95   3.14
35   140.0     138.1       3.0       3.4         40.0      39.8        -1.35     12.20   -0.61
36   140.0     140.6       3.0       3.5         48.0      49.0        0.46      15.77   2.11
37   140.0     140.6       3.5       3.7         48.0      48.3        0.41      4.37    0.63
38   140.0     139.7       3.5       3.0         56.0      56.9        -0.19     -14.07  1.62
39   140.0     140.4       4.0       3.9         40.0      40.1        0.28      -3.42   0.17
40   140.0     140.0       4.0       3.2         56.0      57.0        0.02      -20.31  1.70
41   140.0     139.8       4.5       4.5         48.0      48.1        -0.15     1.04    0.17
42   140.0     140.3       4.5       3.8         56.0      56.5        0.18      -15.50  0.96
43   140.0     139.8       5.0       5.4         40.0      39.8        -0.13     7.53    -0.58
44   140.0     140.2       5.0       4.8         56.0      56.2        0.12      -4.91   0.29
45   140.0     139.6       5.5       6.1         40.0      39.6        -0.26     10.85   -0.95
46   140.0     140.0       5.5       5.5         48.0      47.9        -0.03     0.59    -0.19
47   140.0     139.7       6.0       6.6         40.0      39.6        -0.23     9.48    -1.12
48   140.0     140.0       6.0       6.1         56.0      56.0        0.00      1.02    -0.08
49   140.0     139.8       6.5       6.8         40.0      39.7        -0.13     5.32    -0.84
50   140.0     140.0       6.5       6.5         48.0      48.7        -0.02     -0.66   1.51
51   140.0     140.0       7.0       6.8         48.0      49.8        -0.03     -3.05   3.81
the response used for the inverse analysis. The two levels of discretization have made the problem behave more like a discrete type of inverse problem.
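The role of over-posing can be illustrated with a linear toy analogue. The sketch below assumes a hypothetical linearized sensitivity matrix S that maps 3 parameters to 21 sampled responses; it is not the beam problem itself, whose forward model is nonlinear. With more independent samples than unknowns and a full-rank S, the least-squares solution is unique.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_params = 21, 3   # counts quoted in the discussion above

# Hypothetical linearized sensitivity matrix (assumption for illustration only).
S = rng.normal(size=(n_samples, n_params))
p_true = np.array([1.0, -0.5, 2.0])

# Noise-free "measured" response generated by the toy forward model.
y = S @ p_true

# Over-posed system: 21 equations, 3 unknowns, solved in the least-squares sense.
p_est, residuals, rank, singular_values = np.linalg.lstsq(S, y, rcond=None)
```

The singular values of S also give a rough view of sensitivity: a very small singular value would indicate a parameter combination to which the sampled response is nearly insensitive, which is the situation described as Type II ill-posedness.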
10.9 Remarks

• Remark 10.1: Theoretical and experimental studies of inverse procedures employing flexural wave response in beams have been presented in this chapter as an alternative nondestructive technique for the characterization of cracks or delaminations in isotropic and anisotropic beams. Comparison among the SEM, beam model of wave propagation, FEM, and experimental results in frequency and time domains indicates that these three methods agree reasonably well with the experiments; all can be used as forward solvers. Both analytical methods and experiments show that the presence of a crack or delamination in a beam causes the scattering of waves, resulting in a very significant change in the response curves. The study also shows that the parameters of cracks or delaminations can be approximately estimated from the response curves.
• Remark 10.2: An inverse procedure to locate and size cracks in beams based on the SEM code as the forward solver and the GA as the inverse operator has been presented. The GA has been successfully implemented to find the global minimum of the error function. Verification with simulation and experimental data indicates that the inverse operation is an effective and accurate approach for locating and sizing cracks.
• Remark 10.3: Studies using the SEM code, the beam model, and the FEM code together with the progressive NN have also been conducted to inversely determine crack or delamination parameters in laminated composite beams. The results indicate that the progressive NN is effective for detecting the crack parameters in beams.
• Remark 10.4: The ill-posedness of the inverse problems was well suppressed by careful formulation and solution of the inverse problems, as well as by the use of regularization by discretization/projection and filtering.
• Remark 10.5: It is necessary to re-emphasize that accurate and efficient forward solvers as well as quality experiments are essential for a successful inverse analysis.
11 Inverse Detection of Delaminations in Composite Laminates
In this chapter, computational inverse techniques using elastic wave responses of displacement for delamination detection in composite laminates are introduced. Both horizontal and vertical cracks are considered. The genetic algorithm (GA) and neural network (NN) are employed as the inverse operators, and the strip element method (SEM) is used as the forward solver to calculate the wave response. Examples of practical applications are presented to demonstrate the efficiency of the computational inverse techniques. No experiments are provided; however, the stability of the inverse solution in the presence of noise, simulated by Gaussian noise and white noise, is examined and discussed in detail.
11.1 Introduction

Many new delamination detection techniques using elastic waves for composite laminates have been proposed (Karim et al., 1989; Liu and Lam, 1994; Doebling et al., 1996; Luo and Hanagud, 1997). One of them numerically infers the location and size of delaminations from the elastic low-frequency waves scattered by the delaminations. Kundu et al. (1987, 1988) investigated the dynamic interaction between two interface delaminations and the transient behavior of an interfacial delamination in composite plates. Karim et al. (1989, 1992a, b) studied the scattering of elastic waves due to delaminations and flaws in plates using the combination of the finite element method (FEM) and guided wave expansions. Karunasena et al. (1991) investigated the scattering of plane-strain waves due to delaminations using the combined FEM and Lamb wave modal expansion method. Liu, S.W. et al. (1991) studied the transient scattering of Rayleigh-Lamb waves by a surface-breaking delamination using the FEM and boundary element method (BEM). The location and size of delaminations have been investigated from elastic low-frequency waves scattered by these delaminations or flaws (Liu and Lam, 1994; Liu et al., 1995b, 1996). Their calculated results were also compared with measured ones. Datta et al. (1992) and Liu and Datta (1993) later applied the method to investigate the scattering of impact and ultrasonic waves due to delaminations in a composite plate. In order to improve computational efficiency, Liu and his co-workers (Liu and Lam, 1994; Liu et al., 1995b, 1996; Lam et al., 1997; Wang et al., 1998) used the SEM to investigate horizontal and vertical cracks in composite laminates subjected to scanning or fixed source loads. The numerical analyses were carried out in time and frequency domains, and the calculated results were compared with those of the cases without delamination. These works demonstrated that scattered elastic waves are significantly related to the location and size of delaminations or flaws. However, they could not provide a systematic procedure for the detection of delaminations or flaws from the scattered waves, although some polynomial formulae were proposed for approximately estimating the larger delaminations in laminates (Liu and Lam, 1994; Lam et al., 1997). This is because the analysis of scattered waves for a given delamination configuration is still basically a forward analysis, while the detection of delaminations from the measured scattered waves needs to be formulated as an inverse problem. The inversion for this kind of problem is mathematically nonlinear and analytically intractable; thus, computational methods are required. Theoretical wave analysis studies have found a sensitive correlation between the scattered wave field and the delamination characteristics.
This implies that elastic wave techniques can be a very promising alternative means for detecting and characterizing delaminations in laminates, if a proper inverse procedure can be formulated. This fact has also been confirmed by experiments for the case of beams (see Chapter 10). Thus, considerable interest has grown in developing inverse procedures employing optimization methods, including genetic algorithms, for delamination detection. Wu et al. (2002) identified horizontal and vertical cracks in laminates using the µGA. Xu and Liu (2002d) detected flaws in a composite laminate from the scattered elastic wave field using the modified µGA combined with a gradient-based optimizer. In their treatment, the detection problem is formulated as an optimization problem of minimizing the difference between the measured and calculated surface displacement responses derived from the scattered elastic wave fields. Using Lamb waves whose wavelength is of the same order as the laminate thickness, Xu and Liu (2002d) recently employed the IP-GA to detect damage in composite laminates. With the development of artificial intelligence techniques, NNs have provided an effective tool for solving this kind of inverse problem. Wu et al. (1992) adopted an NN model to portray the structural behavior before and after damage in terms of the frequency response function, and then used this trained model to detect the location and extent of damage by feeding
in measured dynamic response. Klenke and Paez (1994) used two probabilistic techniques to detect the damages in aerospace housing components, one of which involved a probabilistic neural network model. Rhim and Lee (1995) used the NN model to identify damages in a composite cantilevered beam; the damage was modeled as delamination in the FEM model of beam. Using an NN model, Masri et al. (1996) detected changes in the dynamic characteristics of a structure-unknown system. Luo and Hanagud (1997) used an NN model with the dynamic learning rate steepest descent method to carry out the real-time flaw detection of composite materials, while Zhao et al. (1998) used a counter-propagation NN model to identify the damages in beams and frames. Liu, N. et al. (1999) used the NN model to detect the impact damages in carbon fiber-reinforced polymer composite laminates. In their study, the transient acoustic emission waveforms detected from the surface of materials were used as the input of the NN model. More applications of the NN model in the area of damage detection can be found in Bishop (1994) and Doebling et al. (1996). Recently, an NN technique has been proposed for the detection of delaminations in anisotropic laminates (Xu et al., 2001b). Computer-generated displacement responses on the surface of plate, excited by a time-harmonic line load, are used as the input of the NN. The delamination parameters that specify the location and size of the delaminations in the anisotropic laminates are taken as the output of the NN. This chapter introduces several computational inverse techniques using wave responses of the composite laminates to determine the location and size of the delaminations. The materials presented here are largely from Liu and his co-workers (Wu et al., 2002; Xu et al., 2001b, 2002; Xu and Liu, 2002d).
11.2 Statement of the Problem

Consider a composite laminate that consists of a number of anisotropic layers. A horizontal or vertical crack is located inside this laminate. A time-harmonic line load of excitation q, which does not vary in the y direction (see Figure 7.7), is applied on the upper surface of the laminate. This load can be expressed as

$$q = q_0 e^{-i\omega t} \qquad (11.1)$$

where q0 and ω are the load amplitude and frequency, respectively. Unless otherwise specified, the exciting load defined by Equation 11.1 has a unit amplitude (q0 = 1) and frequency $\omega = 3.14\sqrt{c_{44}/\rho}/H$ (c44 and ρ are one of the elastic constants and the mass density of glass/epoxy, respectively). The excited displacement response on the surface of this laminate consists of a
number of modes of Lamb waves scattered from the delamination. The wave response is used as the known information for the delamination detection. Optimization methods including the µGA, IP-GA, and NN can be employed to inversely detect the geometry parameters of the delaminations. In these procedures, the SEM (see Section 10.2.1) has been used as the forward solver. For a detailed description of the SEM implementation for wave scattering in composite laminates with horizontal delaminations and/or vertical cracks, readers are referred to a monograph by Liu and Xi (2001). In the following sections, horizontal delaminations and vertical cracks will be considered.
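The excitation of Equation 11.1 can be evaluated as in the sketch below. It assumes that the frequency expression reads ω = 3.14·sqrt(c44/ρ)/H, which is the dimensionally consistent reading (sqrt(c44/ρ) is a wave speed). The material values are placeholders chosen only to make the example run; the actual constants are those of glass/epoxy in Table 7.2, which is not reproduced here.

```python
import numpy as np

def excitation_frequency(c44, rho, H):
    """Angular frequency omega = 3.14 * sqrt(c44 / rho) / H (assumed reading)."""
    return 3.14 * np.sqrt(c44 / rho) / H

def line_load(t, omega, q0=1.0):
    """Time-harmonic line load q(t) = q0 * exp(-i * omega * t), Equation 11.1."""
    return q0 * np.exp(-1j * omega * np.asarray(t, dtype=float))

# Placeholder values (assumptions, NOT the Table 7.2 constants).
c44 = 4.0e9    # Pa
rho = 1.8e3    # kg/m^3
H = 6.0e-3     # m, laminate thickness

omega = excitation_frequency(c44, rho, H)
t = np.linspace(0.0, 2.0 * np.pi / omega, 50)   # one period
q = line_load(t, omega)
```

With q0 = 1 the load has unit modulus at all times, so only its phase varies over the period.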
11.3 Delamination Detection Using Uniform µGA

In general, the displacement field calculated by the SEM with a set of assumed delamination parameters for a laminate is different from that measured on the laminate with the actual delamination. The inverse procedure can then be formulated as an optimization procedure that minimizes the sum of squares of the difference between the calculated and measured responses, defined with the L2 norm as given by Equation 2.123. To carry out the minimization of the error function, the uniform µGA method is employed. Consider a laminate [C90/G45/G–45]s with horizontal and vertical cracks that has been investigated using the µGA by Wu et al. (2002). The material constants of carbon/epoxy and glass/epoxy are assumed to be known and are given in Table 7.2. In the numerical analysis using the SEM, every layer in the laminate is divided into four strip elements in the thickness direction. The total number of strip elements is thus 24 along the thickness direction.
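The sum-of-squares error function described above can be sketched as follows. Equation 2.123 is not reproduced here, so the plain L2 form below is an assumption about its shape, and `toy_forward` is a hypothetical stand-in for the SEM solver: it simply raises the response amplitude over the delaminated range, in the spirit of the amplitude change discussed for Figure 11.2.

```python
import numpy as np

def l2_error(params, forward, x, u_measured):
    """Sum of squared differences between calculated and measured responses."""
    u_calculated = forward(params, x)
    return float(np.sum(np.abs(u_calculated - u_measured) ** 2))

def toy_forward(params, x):
    """Hypothetical stand-in for the SEM forward solver (not the real model).

    params = (a_c, l_c): left tip and length of the delamination, in units of H.
    """
    a_c, l_c = params
    over_delamination = (x >= a_c) & (x <= a_c + l_c)
    return np.exp(1j * x) * (1.0 + 0.5 * over_delamination)

x = np.linspace(0.0, 10.0, 101)            # surface points, in units of H
true_params = (5.0, 2.0)                   # a_c = 5.0H, l_c = 2.0H
u_measured = toy_forward(true_params, x)   # simulated "measurement"

error_at_truth = l2_error(true_params, toy_forward, x, u_measured)
error_at_guess = l2_error((4.0, 2.0), toy_forward, x, u_measured)
```

In the actual procedure each evaluation of the error function requires one forward SEM analysis, which is why the number of function evaluations is the main cost measure for the search.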
11.3.1 Horizontal Delamination
Figure 11.1 shows the composite laminate [C90/G45/G–45]s with a horizontal delamination. The thickness of the laminate in the z direction is denoted by H. The length and the depth of the laminate in the x and y directions are considered to be infinite. A horizontal delamination is located inside the laminate; its location is defined by the distance ac (from x = 0 to the left tip of the delamination) and the depth dc (from the upper surface of the laminate to the center of the delamination in the z direction), and its length is denoted by lc. The following dimensionless parameters are used in the numerical analysis (Wu et al., 2002):

$$\bar{x} = \frac{x}{H}, \quad \bar{a}_c = \frac{a_c}{H}, \quad \bar{d}_c = \frac{d_c}{H}, \quad \bar{l}_c = \frac{l_c}{H} \qquad (11.2)$$
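[Figure labels: line load q(ω) on the upper surface; axes x and z; delamination located by ac and dc with length lc; laminate thickness H; layers from the 1st to the 6th: C90, G+45, G-45, G-45, G+45, C90.]
FIGURE 11.1 A six-layer composite laminate with a horizontal delamination. The laminate is subjected to harmonic excitation of a line load on the surface.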
FIGURE 11.2 Sudden change of displacement amplitudes above the delamination. A rough indication of the starting and ending point of the delamination region can be observed by comparison with the curve of laminate without delamination. (From Wu, Z.P. et al., Eng. Comput., 18, 116–123, 2002. With permission.)
Consider now a laminate with a horizontal delamination of ac = 5.0H, lc = 2.0H, and dc = 0.5H. The wave response on the surface of the laminate is computed using the SEM and plotted in Figure 11.2, which shows that a noticeable change of the displacement amplitudes takes place over the delamination range. Although the location and length of the delamination cannot be obtained exactly by observing the amplitude change, the observation is sufficient to determine the range of the response data to be used for constructing the error function before the search process starts. The µGA uses a population size of 5, tournament selection, no mutation, niching, elitism, and uniform crossover with pcross = 0.5. Assume that the depth of the delamination is known, and that the delamination location and length are to be determined. Therefore, two delamination parameters are to be identified. Two cases of searching range for the delamination parameters are given as
• Case (a): lc within [0.5H, 3.5H] and ac within [2.0H, 8.0H]
• Case (b): lc within [1.0H, 3.0H] and ac within [3.0H, 7.0H]

Case (a) is a wider search than case (b). The convergence process of the µGA search is given in Figure 11.3(a) and Figure 11.3(b) for case (a) and case (b), respectively. The points to be noted are:
• In case (a), the true delamination length was found at the 51st generation of evolution while the delamination location was found at the 261st generation, implying that searching for the delamination location is more difficult than searching for the delamination length.
[Figure 11.3: two panels showing the evolution of crack length, crack location, and fitness error: (a) searching range: crack length 0.5 to 3.5 and crack location 2.0 to 8.0; (b) searching range: crack length 1.0 to 3.0 and crack location 3.0 to 7.0.]
FIGURE 11.3 Evolution of the searching process for detecting the delamination length and its location for the laminate [C90/G+45/G–45]s using the genetic algorithm (true delamination length = 2.0 and true location = 5.0). (From Wu, Z.P. et al., Eng. Comput., 18, 116–123, 2002. With permission.)
TABLE 11.1
Identified Results Using µGA for Horizontal Delamination in the Laminate [C90/G45/G–45]s Excited by a Vertical Harmonic Line Load on the Surface

          True Delamination Parameters  Converged Delamination Parameters
          Length    Location            Length    Location                 Fitness (%)
Case (a)  2.0H      5.0H                2.03H     5.02H                    1.5
Case (b)  2.0H      5.0H                2.01H     4.98H                    1.8

Note: H = thickness of the laminate.
Source: Wu, Z.P. et al., Eng. Comput., 18, 116–123, 2002. With permission.
TABLE 11.2
Efficiency of µGA Search for Horizontal Delamination in the Laminate [C90/G45/G–45]s Excited by a Vertical Harmonic Line Load on the Surface

          Searching Range               Number of Generations at Convergence
          Length        Location        Length    Location
Case (a)  [0.5H, 3.5H]  [2.0H, 8.0H]    51        261
Case (b)  [1.0H, 3.0H]  [3.0H, 7.0H]    35        99

Note: H = thickness of the laminate.
Source: Wu, Z.P. et al., Eng. Comput., 18, 116–123, 2002. With permission.
In other words, the delamination location is less sensitive to the type of error function used.
• In case (b), similar phenomena were observed. Because the search spaces for the delamination location and delamination length are narrower, the search process converged much faster for both parameters, as summarized in Table 11.1 and Table 11.2. Case (b) requires 35 and 99 generations for the delamination length and delamination location, respectively.
• Comparing the results for case (a) and case (b) shows that, although the population sizes are the same, the finer discretization of the GA search space helps to achieve faster convergence.

In practical applications, it is always possible to fix a range for the search space. However, the smallest possible search ranges should always be used in order to reduce the computation cost, achieve faster convergence, and increase the reliability of the solution.

11.3.2 Vertical Crack
Figure 11.4 illustrates schematically a composite laminate [C90/G45/G–45]s with a vertical crack located inside the laminate. The location is defined by the distance ac (from the plane x = 0 to the left tip of the delamination in the x direction) and the depth dc (from the upper surface of the laminate to the center of the delamination in the z direction). The length of the delamination is denoted by lc.

[Figure 11.4 schematic: line load q(ω) on the surface, axes x and z, laminate thickness H, flaw parameters ac, dc, and lc, and layers C90/G+45/G–45/G–45/G+45/C90 from the 1st to the 6th layer.]
FIGURE 11.4 A six-layer composite laminate with a vertical crack. The laminate is subjected to harmonic excitation of a line load on the surface. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15, 601–631, 2001. With permission.)

Two cases of different vertical cracks are examined:
• Case a: ac = 4.0H, dc = 0.166H, and lc = 0.667H
• Case b: ac = 5.0H, dc = 0.125H, and lc = 0.75H
The searching ranges for the three delamination parameters ac, dc, and lc are [0, 10H], [0, H], and [0, H], respectively, for both cases. Figure 11.5(a) and Figure 11.5(b) show the convergence of the fitness value, together with the errors of the three delamination parameters detected by the µGA, for these two cases. These figures show that, for both cases, satisfactory results are obtained at the 40th generation. Nearly exact values of the delamination parameters are thus found with a small number of function evaluations, which shows that the µGA is quite effective for the detection of delaminations in composite laminates.
11.4 Delamination Detection Using the IP-GA

Consider again the problem in Section 11.3.1, but here all three parameters (ac, dc, lc) for the horizontal delamination are to be considered. The delaminations are assumed to be located at the junctions of two adjacent strip elements and within the region from x = 0 to x = 10H to simulate possible ideal delamination between layers. The measured displacement response is sampled at 250 points evenly spaced on the surface of the laminate within the region specified from x = 0 to x = 10H; therefore, this inverse problem is heavily over-posed. Two horizontal delamination cases are considered:
• Case I: ac = 4H, dc = 4H/24, lc = H
• Case II: ac = 5.4H, dc = 6H/24, lc = 1.5H
[Figure 11.5 plots, panels case (a) and case (b): horizontal axis: number of generations (0 to 40); vertical axis: error function (fitness) and parameter error (%); curves: ac, lc, dc, and Error.]
FIGURE 11.5 Evolution of the searching process for detecting the vertical crack of the laminate [C90/G+45/G–45]s using the µGA. (From Wu, Z.P. et al., Eng. Comput., 18, 116–123, 2002. With permission.)
The displacement responses on the surface of the laminate, calculated using the SEM code with the given delamination parameters, are shown in Figure 11.6. They are used as the assumed noise-free measurements u^m_ns (ns = 1, …, 250) to inversely determine the delamination parameters using the IP-GA. The IP-GA uses a population size of 7, tournament selection, no mutation, niching, elitism, uniform crossover with pcross = 0.5, one child, and α = 0.382, β = 0.5. The searching ranges of the three delamination parameters, ac, dc, and lc, are [0, 10H], [1H, 12H], and [0.2H, 5H]; their numbers of possibilities are set to 32768, 24, and 32768, respectively. The total number of possibilities is therefore about 2.58 × 10^10, so the search space for the actual delamination parameters is very large.
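The size of the discretized search space quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the binary encoding (bit counts) and the evenly spaced `decode` mapping are assumptions about how such a GA might represent the parameters, not a description of the IP-GA code used in the study.

```python
import math

# Numbers of discrete possibilities for (ac, dc, lc), as stated in the text.
possibilities = {"ac": 32768, "dc": 24, "lc": 32768}

# Total number of candidate parameter sets in the discretized search space.
total = math.prod(possibilities.values())

# Bits needed per parameter if a binary chromosome is used (assumed encoding).
bits = {name: (n - 1).bit_length() for name, n in possibilities.items()}

def decode(index, lower, upper, n_levels):
    """Map an integer gene in [0, n_levels - 1] onto an evenly spaced grid."""
    return lower + (upper - lower) * index / (n_levels - 1)

# A population of 7 evolved for 30 generations gives 210 function evaluations.
evaluations = 7 * 30
fraction_sampled = evaluations / total

print(f"total possibilities = {total:.3e}")
print("bits per parameter  =", bits)
print(f"fraction of space evaluated = {fraction_sampled:.2e}")
```

The point of the comparison is that the 210 evaluations reported below touch only a vanishing fraction of the candidate sets, which is why an efficient search strategy matters.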
[Figure 11.6 plot: displacement response curves for Cases I and II; horizontal axis: horizontal location X/H; vertical axis from 0.0 to 0.5.]
FIGURE 11.6 Displacement responses on the surface of the laminate [C90/G+45/G–45]s with a delamination used as the input of the IP-GA. The laminate is excited by a vertical line load on the surface. Case I: ac = 4H, dc = 4H/24, lc = H; Case II: ac = 5.4H, dc = 6H/24, lc = 1.5H. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15, 601–631, 2001. With permission.)
[Figure 11.7 plot: horizontal axis: number of generations (0 to 25); vertical axis from –60 to 60; curves: ac, lc, dc, and Error.]
FIGURE 11.7 Convergence process for the three parameters of the horizontal delamination in laminate [C90/G45/G–45]s. The actual values of the delamination parameters are ac = 4H, dc = 4H/24, lc = H. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15, 601–631, 2001. With permission.)
Figure 11.7 and Figure 11.8 show the convergence of the error function value, together with the errors of the three detected delamination parameters for the best individual, for the two simulated delamination cases, respectively. From these two figures, it can be observed that:
• Satisfactory results are obtained for both cases at the 30th generation, after 210 function evaluations (a population of 7 over 30 generations).
• The maximal error of the detected delamination parameters is 0.17% and 2.36%, corresponding to best error function values of 0.23 and 2.31, respectively, for the two delamination cases.
[Figure 11.8 plot: horizontal axis: number of generations (0 to 25); vertical axis from –60 to 60; curves: ac, lc, dc, and Error.]
FIGURE 11.8 Convergence process for the three parameters of the horizontal delamination in laminate [C90/G45/G–45]s. The actual values of the delamination parameters are ac = 5.4H, dc = 6H/24, lc = 1.5H. (From Xu, Y.G. et al., Appl. Artif. Intelligence, 15, 601–631, 2001. With permission.)
• The same inverse analysis is conducted using the conventional µGA with the same genetic operators. At the same generation, the maximal errors of the detected parameters are –15.75% and 18.34%, respectively, for the two cases. The corresponding best error function values are 7.566 and 7.854, respectively.
The present IP-GA is therefore much more efficient for these cases of delamination detection in composite laminates. With only 30 generations of evolution, the IP-GA identified the delamination parameters with a maximal error of 2.36% in the simulated cases.
11.5 Delamination Detection Using the Improved IP-GA

The improved IP-GA (see Section 5.5) is also applied to the inverse analysis of delamination detection. Consider the same laminate [C90/G45/G–45]s. Horizontal delaminations are located at the junction of two adjacent strip elements, while vertical cracks extend across several successive strip elements. Four cases with true delamination parameters are examined:
• Case I, horizontal delamination: ac = 6.0H, dc = 0.25H, lc = 1.0H
• Case II, horizontal delamination: ac = 4.0H, dc = 0.333H, lc = 1.2H
• Case III, vertical crack: ac = 4.0H, dc = 0.166H, lc = 0.667H
• Case IV, vertical crack: ac = 5.0H, dc = 0.125H, lc = 0.75H
[Figure 11.9 plot: noisy displacement response curves for Cases I to IV; horizontal axis: horizontal location X/H; vertical axis from 0.0 to 0.4.]
FIGURE 11.9 Simulated erroneous measurement of displacement responses on the surface of laminate [C90/G45/G–45]s for four cases. They are generated by adding a white noise with 5% noise level to the simulated displacement with actual values of the delaminations using SEM. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)
Surface displacement responses sampled at 250 points on the surface of these laminates are simulated using the SEM. White noise at a 5% level is added to the computer-generated displacements to simulate noise-contaminated measurements, as shown in Figure 11.9. The improved IP-GA is used to detect the three delamination parameters from these measured responses. The search ranges of the three delamination parameters, ac, dc, and lc, are set to [0, 10H], [0.042H, 0.5H], and [0.2H, 5H] for the horizontal delamination case, and to [0, 10H], [0, 0.5H], and [0.042H, 0.96H] for the vertical crack case. The numbers of possibilities for the three parameters are 32768, 12, and 32768 for the horizontal delamination case, and 32768, 32768, and 24 for the vertical crack case. The total numbers of possible individuals in the inverse search spaces are approximately 1.29 × 10^10 and 2.58 × 10^10 for the horizontal delamination and vertical crack cases, respectively.

As an example, Figure 11.10 presents a two-dimensional sketch of the error function for the laminate [C90/G45/G–45]s with true delamination parameters of ac = 4H, dc = 4H, and lc = 2H. A huge number of local optima are clearly present in this error function.

[Figure 11.10 plot: error function surface over ac/H and lc/H.]
FIGURE 11.10 Search space for two delamination parameters for a horizontal delamination in laminate [C90/G45/G–45]s excited by a line load on the surface. A huge number of local optima are in the search space. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)

The improved IP-GA uses the following operation parameters: population size of 5, uniform crossover at pcross = 0.5, mutation at pmutate = 0.02, and α = 0.2 and β = 0.5. The first generation of five sets of delamination parameters is generated randomly. The generations then evolve until the minimal error function is sufficiently small. Figure 11.11 shows the convergence processes of the minimal error function and the corresponding errors of the three delamination parameters for the four delamination cases. It can be seen that
• The improved IP-GA converges to satisfactory detection results very fast.
• The maximal error of the detected delamination parameters with respect to their true values is –4.1, –4.3, –0.36, and –0.35% at the 60th generation of the evolution process for the four delamination cases, respectively.
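The noise-contaminated inputs used in this section can be imitated with a short sketch. The text does not define how the 5% level is scaled, so the version below assumes zero-mean Gaussian noise whose standard deviation is the stated fraction of the RMS amplitude of the noise-free response. The function name `add_white_noise` and the sinusoidal stand-in for the SEM response are hypothetical.

```python
import math
import random

def add_white_noise(signal, level, seed=0):
    """Return the signal plus zero-mean Gaussian noise.

    Assumption: `level` is the noise standard deviation expressed as a
    fraction of the RMS amplitude of the noise-free signal.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    rms = math.sqrt(sum(u * u for u in signal) / len(signal))
    sigma = level * rms
    return [u + rng.gauss(0.0, sigma) for u in signal]

# Hypothetical stand-in for an SEM-computed response at 250 surface points.
n_points = 250
x_over_h = [10.0 * i / (n_points - 1) for i in range(n_points)]
clean = [0.2 + 0.1 * math.sin(3.0 * x) for x in x_over_h]

# 5% noise level, as used for the simulated measurements of Figure 11.9.
noisy = add_white_noise(clean, level=0.05)
```

Other definitions of the noise level (for example, relative to the peak amplitude) would change only the line that computes `sigma`.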
[Figure 11.11 plots, panels (a) Case I and (b) Case II: horizontal axis: number of generations (15 to 60); vertical axis: error (%); curves: ac, lc, dc, and Error.]
FIGURE 11.11 Convergence processes of the improved IP-GA for detection of four cases of delaminations or cracks in the laminate [C90/G45/G–45]s. (From Xu, Y.G. et al., AIAA J., 40(9), 1860–1866, 2002. With permission.)

11.6 Delamination Detection Using the Combined Optimization Method

The combination of a µGA and a gradient-based optimization method can often provide an ideal performance in efficiency and accuracy for the optimization procedure needed for nonlinear optimization problems, as shown in Section 5.8. Following a technique similar to that described in Chapter 5, the modified µGA (see Section 5.4.1) is employed in the first stage of the inverse procedure to determine the potential candidates. The BCLSF, based on the modified Levenberg–Marquardt method (see Chapter 4) and using the finite-difference method for evaluating the Jacobian, is then used in the second stage to find the global minimum of the objective function of error.
11.6.1 Implementation of the Combined Technique
11.6.1.1 Formulations of Objective Functions

The objective error function for delamination detection problems is defined by the difference between the measured displacement response on the surface of a laminate and the calculated response from the corresponding computational model with a trial set of delamination parameters, ac, dc, and lc. The error functions can be defined using three norms as shown in Chapter 2, e.g., the L1, L2, and L∞ norms. The L1 norm is the sum of the absolute values of the individual errors; it is less sensitive to a few large errors but more sensitive to the spatial distribution of data. The L2 norm is based on the sum of the squares of the individual errors and leads to a least-squares solution widely applied in engineering. The L∞ norm considers only the single worst error; it is most sensitive to the maximal outlier but least sensitive to the spatial distribution of data. To examine the effects of the forms of error functions, these three norms are implemented and examined in this section.

[Figure 11.11, continued, panels (c) Case III and (d) Case IV: horizontal axis: number of generations (15 to 60); vertical axis from –80 to 40; curves: ac, lc, dc, and Error.]
FIGURE 11.11 Continued.

Consider the laminate [C90/G45/G–45]s with a horizontal delamination of ac = 4H, dc = 4H/24, and lc = 1.8H. The modified µGA is applied to find the near-optimal solutions for each of these three types of error functions. In using the µGA, the genetic operators and parameters are selected as follows: population size of 5, random number seed of –20,000, tournament selection, uniform crossover with pcross = 0.5, no mutation, elitism, and niching operation.

[Figure 11.12 plots, three panels (L1 norm, L2 norm, L∞ norm): horizontal axis: generation (0 to 30); vertical axis from –60 to 60; curves: ac, dc, lc, and fitness.]
FIGURE 11.12 Convergence processes of the modified µGA for detection of horizontal delamination in the laminate [C90/G45/G–45]s using different error norms. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)

Figure 11.12 shows the searching processes of the modified µGA for the delamination parameters with respect to the three types of objective functions. It can be observed from this figure that
• The L1 norm demonstrates the fastest convergence among the three norms. The µGA found a near-optimal solution as early as generation 8, where the errors of the three delamination parameters, ac, dc, and lc, are 0.925, 0, and –6.67%, respectively.
• The L2 norm shows similar results with slightly slower convergence. The µGA found a near-optimal solution at generation 12, where the errors of the three delamination parameters are –11.60, 0, and 6.5%, respectively.
• The L∞ norm gives the worst result and is therefore not recommended for this type of delamination detection problem.

11.6.1.2 Switch from µGA to BCLSF

In the combined optimization technique, the µGA provides an initial guess for the BCLSF. The µGA search stops at the so-called switch point. In general, a switch point at a later generation increases the chance that the µGA provides a better initial guess. However, it also increases the computation cost of the searching process. For the present problem, a single computation of the objective function takes approximately 80 seconds of CPU time, and one generation takes about 7 minutes; thus, avoiding unnecessary µGA runs is very important. On the other hand, if the switch point is selected too early, the µGA will provide a lower-quality initial guess that may not be close enough to the global optimum, causing the BCLSF to fail to converge to the global optimum. A proper selection of the switch point from the µGA to the BCLSF is therefore important.

To investigate the effect of the switch point at different generations, the same case stated earlier is investigated again, with the objective function formulated using the L1 norm. The problem is first solved using the µGA alone for 80 generations. In the combined algorithm, the switch points are set at the 8th, 23rd, 29th, and 39th generations, respectively. Figure 11.13 shows the convergence process of the µGA search during the first 80 generations. Table 11.3 shows the results of optimal computations using the combined technique with different switch points. Figure 11.13 and Table 11.3 show that
• The µGA alone converges to the true solution of the delamination parameters more slowly than the combined optimization technique with any switch point, using 582 minutes of computation time and reducing the errors of the three delamination parameters to 0.902, 0, and –2.278%, respectively. Even with the worst choice of switch point, the combined technique takes only about half of this time and gives a more accurate result.
• Most µGA runs at later generations do not improve the best fitness value effectively. For example, running from generation 8 to 22, generation 23 to 28, generation 29 to 38, and generation 39 to 80 does not result in any improvement of the error function value, but takes about 110, 40, 70, and 300 minutes, respectively. Moreover, the best fitness value is improved by only 4.4% from generation 22 to 39, and no improvement is made from generation 39 to 80. This demonstrates that it is necessary to replace the later searching in the µGA with a gradient-based algorithm such as the BCLSF.
• A switch from the µGA to the BCLSF at a relatively early generation gives better overall performance than a switch at a later generation.
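The CPU-time figures quoted in these observations follow directly from the entries of Table 11.3. The short check below reproduces them; the dictionary names are illustrative, and the 582-minute entry is taken as the µGA time accumulated by generation 80.

```python
# Cumulative µGA CPU time (min) at each generation, from Table 11.3.
ga_minutes = {8: 52, 23: 164, 29: 206, 39: 279, 80: 582}
# BCLSF CPU time (min) when switching at each generation, from Table 11.3.
bclsf_minutes = {8: 21, 23: 21, 29: 21, 39: 16}

# µGA time spent between successive switch points.
gens = sorted(ga_minutes)
increments = {(a, b): ga_minutes[b] - ga_minutes[a] for a, b in zip(gens, gens[1:])}

# Total time of the combined method and its ratio to the µGA-only run.
totals = {g: ga_minutes[g] + bclsf_minutes[g] for g in bclsf_minutes}
ratio = {g: t / ga_minutes[80] for g, t in totals.items()}

for g in sorted(totals):
    print(f"switch at generation {g:2d}: total {totals[g]:3d} min, "
          f"{100 * ratio[g]:.0f}% of the µGA-only time")
print("µGA minutes between switch points:", increments)
```

The increments come out at 112, 42, 73, and 303 minutes, consistent with the rounded values of about 110, 40, 70, and 300 minutes cited above, and the latest switch point (generation 39) costs about half of the µGA-only run.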
[Figure 11.13 plot: horizontal axis: generation (0 to 80); vertical axis: parameter error (%) and fitness value (–60 to 60); curves: ac, dc, lc, and fitness.]
FIGURE 11.13 Convergence processes of the modified µGA for detection of horizontal delamination in the laminate [C90/G45/G–45]s during the first 80 generations. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)
TABLE 11.3 Results of Horizontal Delamination Identification Using Different Optimization Schemes (µGA and Combined Optimization Method with µGA and BCLSF) for Laminate [C90/G45/G–45]s

                     Parameter Error (%)         CPU Time (min)
Switch Point (a)     ac      dc      lc          µGA     BCLSF   Total
µGA (no switch)      0.91    0       –2.28       582     n/a     582
Generation 8         0.43    0.07    0.81        52      21      73
Generation 23        0.32    0.06    0.69        164     21      185
Generation 29        0.38    0.06    0.78        206     21      227
Generation 39        0.16    0.05    0.73        279     16      295

(a) Generation number at which the µGA is switched to the gradient-based optimization method. (The subroutine BCLSF in IMSL is used in this study.)
Source: Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.
It is difficult to suggest an explicit criterion for selecting the switch point suitable for all the cases, for it generally depends on the complexity of the problem to be investigated. If a GA takes N generations to find the optimum by itself, switching at about generation N/4 should be a safe and efficient choice for most of the problems.
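The two-stage idea can be illustrated on a toy problem. Everything in the sketch below is a stand-in chosen for brevity and should not be read as the method used in the study: `forward` is a hypothetical two-parameter response rather than the SEM, the first stage is a plain random search rather than the µGA, and the second stage is a simple bounded descent with a finite-difference gradient rather than the IMSL routine BCLSF. The misfit is a sum of squared residuals, as appropriate for a least-squares second stage.

```python
import math
import random

def forward(params, xs):
    """Hypothetical stand-in for the SEM forward solver (not the real model)."""
    ac, lc = params
    return [0.2 + 0.3 * math.exp(-((x - ac) / lc) ** 2) for x in xs]

def l2_error(params, xs, measured):
    """Sum of squared residuals between measured and calculated responses."""
    return sum((m - c) ** 2 for m, c in zip(measured, forward(params, xs)))

def coarse_search(xs, measured, bounds, n_trials, rng):
    """Stage 1 stand-in: best of n random trials (a real run would use the µGA)."""
    trials = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_trials)]
    return min(trials, key=lambda p: l2_error(p, xs, measured))

def refine(start, xs, measured, bounds, n_iter=200, h=1e-6):
    """Stage 2 stand-in: bounded descent using a finite-difference gradient."""
    p, f, step = list(start), l2_error(start, xs, measured), 0.1
    for _ in range(n_iter):
        grad = []
        for i in range(len(p)):
            q = list(p)
            q[i] += h
            grad.append((l2_error(q, xs, measured) - f) / h)
        # Take a gradient step and clip it back into the search range.
        trial = [min(max(pi - step * gi, lo), hi)
                 for pi, gi, (lo, hi) in zip(p, grad, bounds)]
        f_trial = l2_error(trial, xs, measured)
        if f_trial < f:      # accept only improving steps
            p, f, step = trial, f_trial, step * 1.5
        else:                # otherwise shrink the step and retry
            step *= 0.5
    return p, f

xs = [10.0 * i / 49 for i in range(50)]
true_params = [4.0, 1.8]                    # (ac/H, lc/H) of the toy flaw
measured = forward(true_params, xs)         # noise-free synthetic "measurement"
bounds = [(0.5, 9.5), (0.5, 5.0)]

rng = random.Random(1)
guess = coarse_search(xs, measured, bounds, n_trials=40, rng=rng)
coarse_err = l2_error(guess, xs, measured)
refined, refined_err = refine(guess, xs, measured, bounds)
print("initial guess:", guess, "error:", coarse_err)
print("refined      :", refined, "error:", refined_err)
```

Because the refinement accepts only improving steps, it can never do worse than the initial guess; whether it reaches the global optimum depends on how close that guess is, which is exactly the switch-point trade-off discussed above.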
[Figure 11.14 plot: noise-free and 15% noise-contaminated displacement responses; horizontal axis: X/H (0 to 10); vertical axis from 0.0 to 0.8.]
FIGURE 11.14 Noisy displacement response on the surface of the laminate used for inverse detection of a horizontal delamination in the laminate [C90/G+45/G–45]s. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)
11.6.1.3 Effect of Noise

Noise is inevitably present in measured data in practice. To investigate the effect of noise, white noise is intentionally introduced into the simulated measurements of the surface displacement responses. The noise level is set at 5, 10, and 15%, respectively, in this investigation. Figure 11.14 shows the 15% noise-contaminated displacement responses. The L1 norm and L2 norm are used to formulate the objective function. Figure 11.15 shows the evolution processes of the modified µGA corresponding to different noise amplitudes and error norms. Table 11.4 shows the corresponding errors of the delamination parameters detected at generation 20. It can be seen that the modified µGA can tolerate relatively weak noise contamination (η = 5%) with both the L1 and L2 norms. The L2 norm seems to show greater stability under higher noise levels. As the noise amplitude increases, the performance of the µGA becomes worse. When η = 15%, the modified µGA fails to reach the near-optimum solution with either norm.

[Figure 11.15 plot: horizontal axis: generation (0 to 20); vertical axis: best fitness value (0 to 100); curves: L1 norm at η = 5, 10, and 15% and L2 norm at η = 5, 10, and 15%.]
FIGURE 11.15 Convergence processes of the µGA with respect to different noise amplitudes and error norms for inverse detection of a horizontal delamination in the laminate [C90/G+45/G–45]s. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)

TABLE 11.4 Errors (%) of Delamination Parameters Detected Using Simulated Measurement Data with Different Noise Levels and Error Norms for Horizontal Delamination in Laminate [C90/G45/G–45]s

              L1 Norm                        L2 Norm
Noise         ac        dc      lc           ac        dc      lc
pe = 5%       1.72      0       –5.78        1.15      0       –4.37
pe = 10%      7.62      –25     –16.89       2.93      0       –6.71
pe = 15%      –36.65    50      9.06         –34.18    –25     –8.78

Note: µGA, at generation 20.
Source: Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.

11.6.1.4 Ill-Posedness Analysis

These examples show that GA search solutions are very stable. This is because this inverse formulation is immune from the ill-posedness of the problem, due to the heavily over-posed formulation and the employment of an SEM that solves the wave equation analytically in the horizontal direction (see Section 10.2.2). The projection regularization is also at work because discrete sampling is used in the error function formulation. Note that in the presence of noise, forcing the GA to converge to below the error range of the experiments is meaningless. If the solution is not stable because of ill-posedness, the discrepancy principle should be used to stop the search process once the error gets close enough to the noise level (see Chapter 3).
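The error functions of Section 11.6.1.1 and the stopping rule just mentioned can be written compactly. The sketch below is illustrative only: following the wording of the text, the L2 value is the plain sum of squared residuals, and the stopping test assumes independent noise of known standard deviation, which is one common way of applying the discrepancy principle rather than the specific form given in Chapter 3. The names `error_norms` and `should_stop` are hypothetical.

```python
def error_norms(measured, calculated):
    """Return the (L1, L2, Linf) errors between two sampled responses.

    L1 is the sum of absolute residuals, L2 the sum of squared residuals
    (as worded in the text), and Linf the single largest residual.
    """
    residuals = [abs(m - c) for m, c in zip(measured, calculated)]
    return sum(residuals), sum(r * r for r in residuals), max(residuals)

def should_stop(l2_error, noise_sigma, n_points, safety=1.0):
    """Discrepancy-style test: stop once the misfit reaches the noise floor.

    Assumption: the noise is independent with standard deviation noise_sigma,
    so the expected sum of squared residuals at the true solution is about
    n_points * noise_sigma**2.
    """
    return l2_error <= safety * n_points * noise_sigma ** 2

# One large outlier dominates Linf and L2 far more than it dominates L1.
l1, l2, linf = error_norms([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 2.0])
print(l1, l2, linf)
```

In a search loop, `should_stop` would be evaluated on the best individual of each generation, so that the GA is not driven to fit the noise itself.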
[Figure 11.16 plot: noise-free and filtered (15% noise) displacement responses; horizontal axis: X/H (0 to 10); vertical axis from 0.0 to 0.8.]
FIGURE 11.16 Filtered displacement response on the surface of the laminate. The simulated measurement data are first added with 15% white noise. A filter based on the smoothing with moving average method is applied to obtain the filtered curve. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)
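The caption above refers to smoothing with a moving average. A minimal sketch of such a filter is given below; the exact scheme of Section 3.5 is not reproduced here, so the centered window, its width, and the shrinking window at the two ends are all assumptions made for illustration.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks symmetrically near the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Largest symmetric half-width that stays inside the data.
        k = min(half, i, len(values) - 1 - i)
        segment = values[i - k:i + k + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

# A single spike is spread over its neighbours and reduced in height.
spike = moving_average([0.0, 0.0, 3.0, 0.0, 0.0], window=3)
# A linear trend passes through unchanged.
ramp = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

A wider window removes more noise but also flattens the short-wavelength oscillations caused by the delamination, so the width has to be chosen with the expected flaw size in mind.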
11.6.1.5 Regularization by Filtering

To suppress the effect of noise effectively, the noise is filtered out of the measured displacement responses before they are used in the objective function. The filter could be selected in a number of ways from the filter bank, as provided by Strang and Nguyen (1996). The smoothing with moving average method introduced in Section 3.5 is employed here to treat the noise-contaminated inputs. Figure 11.16 shows one of the noisy displacement responses after filtering. The filtered responses are used again for the inverse analysis using the modified µGA; all demonstrate a satisfactory convergence towards the global optimum. The maximum error of the delamination parameters detected at generation 30 is within ±8% for the cases studied. In addition, the L1 norm is likely to work better than the L2 norm, as in the noise-free case discussed previously. Figure 11.17 shows the evolution process of the µGA for the objective error function formulated using the L1 norm with the filtered displacement responses in Figure 11.16. The investigations show that the µGA with the objective function formulated using the L1 norm is still valid in a noisy environment, provided that the noise filter is applied a priori.

11.6.2 Horizontal Delamination in [C90/G45/G–45]s Laminate
The composite laminate [C90/G45/G–45]s with a horizontal delamination is considered. Three delamination cases are defined:
[Figure 11.17 plot: horizontal axis: generation (0 to 30); vertical axis: parameter error (%) and fitness value (–60 to 60); curves: ac, dc, lc, and error.]
FIGURE 11.17 Convergence process of the modified µGA using the filtered response for the inverse detection of a horizontal delamination in the laminate [C90/G+45/G–45]s. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)
• Case I: ac = 4.0H, dc = 4H/24, lc = 1.8H
• Case II: ac = 5.0H, dc = 6H/24, lc = 1.5H
• Case III: ac = 6.0H, dc = 4H/24, lc = 1.0H
These delamination parameters are to be inversely identified. Only the surface displacement responses resulting from the elastic wave field scattered by these delaminations are measured and used for carrying out the delamination detection. The measured surface displacement responses are simulated using the SEM code with the true delamination parameters. For calculating the surface displacement response corresponding to the given delamination parameters using SEM, each layer in this laminate is divided into four strip elements along the thickness direction. The total number of strip elements in the thickness direction is thus 24. The displacement response is picked up at 250 sampling points on the surface of the laminate.

11.6.2.1 Noise-Free Cases

Figure 11.18 shows three noise-free responses of surface displacements corresponding to the three simulated flaw cases given earlier. The L1 norm is used to formulate the objective function. The modified µGA with the same operators and parameters specified in Section 11.6.1 is used to provide an initial guess for the BCLSF. The search ranges of the three delamination parameters, ac, dc, and lc, are [0.5, 9.5], [1/24, 12/24], and [0.5, 5.0], respectively (in units of H). The switch point is selected at generation 10, which means that the µGA runs for only 10 generations and the best individual at that generation is then chosen as the initial guess for the BCLSF. Figure 11.19 shows the evolution processes of the modified µGA for the first 10 generations for the three simulated flaw cases. Table 11.5 shows the results of optimal computations using the combined technique. The three initial guesses generated by the µGA, as well as the total computation times of the three optimization processes, are also shown in Table 11.5. It can be seen that the combined optimization technique using the modified µGA and BCLSF is quite effective for solving delamination detection problems using noise-free inputs.

[Figure 11.18 plot: horizontal axis: X/H (0 to 10); vertical axis: amplitude of displacement response (0.0 to 0.8); curves: Case 1 (4-4/24-1.8), Case 2 (5-6/24-1.5), and Case 3 (6-4/24-1.0).]
FIGURE 11.18 Displacement responses on surface of the laminate [C90/G+45/G–45]s with a horizontal delamination (for three cases) used as the input for the combined optimization method. The laminate is excited by a vertical line load on the surface. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)

11.6.2.2 Noisy Cases

White noise at levels of 5, 10, and 15% is added to the three simulated surface displacement responses in order to test the stability and robustness of the detection technique. Figure 11.20 shows the noise-contaminated responses. A µGA with the same operators and parameters as given earlier is used to provide the initial guess. The switch point from the µGA to the BCLSF is moved to generation 12 in order to take into account the effect of noise. Table 11.6 shows the optimization results, which are satisfactory for the 5% noise level case.
[Figure 11.19 plot: horizontal axis: generation (0 to 10); vertical axis: best fitness value (0 to 16); curves: Case 1 (4-4/24-1.8), Case 2 (5-6/24-1.5), and Case 3 (6-4/24-1.0).]
FIGURE 11.19 Convergence process of the modified µGA for three cases of horizontal delaminations in the laminate [C90/G+45/G–45]s. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)
TABLE 11.5 Identified Results Using Combined Optimization Method for Three Cases of Horizontal Delaminations in Laminate [C90/G–45/G+45]s Using Noise-Free Input Data of Displacement Response

         Parameter Error (%)       Initial Guess by µGA          CPU Time (min)
Case     ac      dc      lc        ac/H     dc/H     lc/H        µGA    BCLSF   Total
1        0.34    0.09    0.53      4.037    0.167    1.680       72     28      100
2        0.52    0.06    0.88      4.538    0.250    1.962       72     28      100
3        0.82    0.15    0.93      6.436    0.167    0.836       72     32      104

Source: Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.
11.6.2.3 Discussion

The preceding examples numerically validate the detection technique using three idealized cases of horizontal delaminations in laminates. For more complicated problems, where the composites are not simple laminates or more than one flaw is present simultaneously, the detection technique is still applicable in principle. The only difference is that the surface displacement response derived from the scattered elastic wave field should be calculated using a computational model suited to the particular problem.
[Figure 11.20 plot: horizontal axis: X/H (0 to 10); vertical axis: amplitude of displacement response (0.0 to 0.8); curves: Case 1 (pe = 10%), Case 2 (pe = 5%), and Case 3 (pe = 15%).]
FIGURE 11.20 Noise-contaminated displacement responses on the surface of laminate [C90/G+45/G–45]s for three cases of horizontal delaminations. (From Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.)
TABLE 11.6 Identified Results Using Combined Optimization Method for Three Cases of Horizontal Delaminations in Laminate [C90/G–45/G+45]s Using 5% Noise-Contaminated Input Data of Displacement Response

         Parameter Error (%)       Initial Guess by µGA          CPU Time (min)
Case     ac      dc      lc        ac/H     dc/H     lc/H        µGA    BCLSF   Total
1        5.92    0.59    3.19      4.037    0.167    1.680       72     32      104
2        4.13    0.79    5.92      4.538    0.250    1.971       72     32      104
3        5.34    1.92    7.73      6.605    0.167    0.754       72     38      110

Source: Xu, Y.G. and Liu, G.R., Comput. Methods Appl. Mech. Eng., 191, 3929–3946, 2002. With permission.
The detection technique assumes that the computational errors in the surface displacement responses resulting from model errors (inaccurate laws of physics, discretization errors, etc.) are negligible in comparison with the variations of the displacement responses due to the presence of delaminations. The assumption is generally acceptable in engineering applications due to advances in modeling accuracy using computational techniques. The studies have also shown that delaminations that are larger or closer to the surface of the composite laminate are more easily detected. This is because these flaws produce more significant changes in the magnitude and pattern of the surface displacement responses compared with the response of the delamination-free case.
11.7 Delamination Detection Using the Progressive NN

The progressive NN model described in Chapter 6 is adopted for inverse characterization of delaminations in composite laminates. The outputs of the NN model are the locations and sizes of the delaminations. The inputs of the NN model are the displacement responses on the surface of laminates, which can be easily measured using conventional experimental techniques, as seen in Chapter 10. The [C90/G45/G–45]s composite laminate with a horizontal delamination is considered here.
11.7.1 Implementation
Before the NN model is created, the effect of the location and size of the delamination on the displacement response is investigated. Figure 11.21 through Figure 11.23 give the displacement response amplitudes in the vertical direction on the upper surface of the laminate for different delamination situations. Comparing Figure 11.21 and Figure 11.22 shows that a “special region” exists, which results from waves scattered by the delamination; within this region, the amplitude and pattern of the response change significantly. This special region shifts horizontally in the same direction as the delamination moves, which exhibits the effect of the delamination location ac on the displacement response. Comparing Figure 11.21 and Figure 11.23 shows that the maximal amplitude of the response within the special region decreases as the delamination depth dc increases. The effect of the delamination size lc on the displacement response can be observed in each of these three figures: as lc decreases, the amplitude of the oscillating displacement response within the special region is clearly reduced. When lc = 0, i.e., the delamination-free case, the special region disappears. These observations demonstrate that the information on the delamination is indeed naturally encoded in the surface response of the laminate, which makes it possible to use the surface response as the input of the NN model to detect delaminations in the laminate. In order to detect the shortest possible delamination in the horizontal direction while avoiding an overly complex NN architecture, the response amplitudes at 34 selected sampling points on the surface of the laminate within the region from x = 0 to x = 10H were used as the input of the NN model.
Consequently, the number of neurons in the NN input layer is 34, and the minimal length, xmin , for the detectable delamination is therefore about 10H/(34 – 1) = 0.3H. The delamination parameters ac , dc , and lc were used as the output of the NN model, so the number of neurons in the output layer of the NN is three. Two hidden layers were employed. The
FIGURE 11.21 Amplitudes of the displacement responses on the surface of laminate [C90/G+45/G–45]s with a horizontal delamination (ac = 4H, dc = H/6). (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
[Figure 11.22 legend: no delamination (noncrack), lc = 0.4H, lc = 0.8H, lc = 1.0H]
FIGURE 11.22 Amplitudes of the displacement responses on the surface of laminate [C90/G+45/G–45]s with a horizontal delamination (ac = 6H, dc = H/6). (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
[Figure 11.23 legend: no delamination (noncrack), lc = 0.4H, lc = 0.8H, lc = 1.0H]
FIGURE 11.23 Amplitudes of the displacement responses on the surface of laminate [C90/G+45/G–45]s with a horizontal delamination (ac = 4H, dc = H/3). (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
numbers of neurons for the first and second hidden layers are initially set to 45 and 20, respectively. To formulate the initial training samples, it was assumed that there were six levels of discrete values for each of the three delamination parameters ac, dc, and lc (see Table 11.7). A total of 6^3 = 216 combinations would be needed to cover all the delamination possibilities with the complete combination method. However, based on the orthogonal array method and employing the corresponding orthogonal array L16(6^3), only 3 × (6 – 1) + 1 = 16 combinations were required to cover the whole sample space. To further reinforce the sample set, six samples were added that result from varying each delamination parameter to its extreme values in turn while keeping the other two delamination parameters at their reference values. The SEM code was used to calculate the displacement response on the surface of the laminate for each of these 22 combinations so as to generate 22 initial samples. These samples were then normalized and used to train the initially designed NN model. Following Section 6.4.1, 24 and 9 neurons were obtained for the first and second hidden layers, respectively. For this optimized NN architecture, the given convergence criterion was fulfilled after 7032 training iterations. The convergence of the error norm of the NN within the first 5000 iterations is shown in Figure 11.24.
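The sample counts quoted above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the levels are taken from Table 11.7, but the reference values (here the Level 3 entries) and all variable names are hypothetical, and the actual L16 orthogonal array is not reproduced.

```python
from itertools import product

# Six discrete levels per delamination parameter (Table 11.7).
LEVELS = {
    "ac": [0, 1, 3, 5, 6, 8],
    "dc": [2 / 24, 4 / 24, 8 / 24, 12 / 24, 16 / 24, 20 / 24],
    "lc": [0.3, 0.6, 0.9, 1.2, 1.5, 1.8],
}
n_params = len(LEVELS)
n_levels = 6

# Complete combination method: every level of every parameter.
n_full = len(list(product(*LEVELS.values())))

# Count quoted in the text for the orthogonal array.
n_orthogonal = n_params * (n_levels - 1) + 1

# Six reinforcing samples: each parameter pushed to its lowest and highest
# level in turn, the others held at a reference value.
# ASSUMPTION: the reference values are the Level 3 entries (not given in the text).
reference = {name: values[2] for name, values in LEVELS.items()}
extreme_samples = []
for name, values in LEVELS.items():
    for extreme in (values[0], values[-1]):
        sample = dict(reference)
        sample[name] = extreme
        extreme_samples.append(sample)

n_initial = n_orthogonal + len(extreme_samples)

# Spatial resolution of 34 sampling points over 0 <= x <= 10H, in units of H.
n_inputs = 34
x_min_over_H = 10 / (n_inputs - 1)

print(n_full, n_orthogonal, n_initial, round(x_min_over_H, 2))
```

The orthogonal array reduces the number of forward SEM runs from 216 to 16, and the six extra samples bring the initial training set to 22.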
TABLE 11.7 Discrete Values of Delamination Parameters
Parameter   Level 1   Level 2   Level 3   Level 4   Level 5   Level 6
ac          0         1         3         5         6         8
dc          2/24      4/24      8/24      12/24     16/24     20/24
lc          0.3       0.6       0.9       1.2       1.5       1.8
Source: Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.
[Figure 11.24 plot: error versus training iterations; curves labeled initial training, 2nd retraining, 3rd retraining, and 4th retraining]
FIGURE 11.24 Convergence of the NN model in the progressive training process using SEM as the teacher. (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
11.7.2
Noise-Free Case
Four cases of horizontal delaminations in the [C90/G45/G–45]s laminate are examined:
• Case I: ac = 4.5, dc = 4/24, lc = 0.5
• Case II: ac = 7, dc = 4/24, lc = 1.0
• Case III: ac = 4.5, dc = 8/24, lc = 0.5
• Case IV: ac = 7, dc = 8/24, lc = 1.0
FIGURE 11.25 Displacement response amplitude on the surface of laminate [C90/G+45/G–45] s for four cases of horizontal delaminations used as the input of the NN model to inversely identify the delamination parameters. (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
These cases have not been used in the training samples; therefore, they can be used for examining the detecting capability of the trained NN model. The displacement responses on the surface of laminate calculated using the SEM code with these delaminations are used as the simulated measurement responses, as shown in Figure 11.25. These simulated responses were used as the inputs of the trained NN model to reconstruct the delamination parameters. The first reconstructions of delamination parameters were immediately obtained by feeding the simulated responses into the trained NN model for these four delamination cases. In order to examine the accuracy of the output from the NN model, these reconstructed delamination parameters were then put into the SEM model to calculate the surface displacement responses of the laminate. Figure 11.26 shows that these computed responses using the delamination parameters predicted by the initially trained NN are significantly different from the simulated ones in terms of the distance norm as defined by Equation 6.17 for the four cases. In fact, the maximal error at the initial step for the reconstructed ac , dc , and lc was as high as –23.44, –22.34, –24.94, and –22.98% for the four cases, respectively. Obviously, the reconstructions were not acceptable, so the retraining procedure for the NN model was required.
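The accuracy check described above can be sketched as follows. This is a minimal illustration, not the authors' code: Equation 6.17 is not reproduced in this section, so a plain Euclidean distance is assumed as a stand-in, and the "predicted" parameter values are hypothetical numbers chosen only to show the arithmetic.

```python
import math

def percent_errors(predicted, actual):
    """Signed relative error of each reconstructed parameter, in percent."""
    return [100.0 * (p - a) / a for p, a in zip(predicted, actual)]

def distance_norm(computed, measured):
    """Euclidean distance between two sampled responses.
    ASSUMPTION: used here in place of Equation 6.17, which is not shown."""
    return math.sqrt(sum((c - m) ** 2 for c, m in zip(computed, measured)))

# Case I from the text: (ac, dc, lc).
actual = (4.5, 4 / 24, 0.5)
# Hypothetical first reconstruction from the NN (illustrative values only).
predicted = (4.0, 0.15, 0.4)

errors = percent_errors(predicted, actual)
worst = max(errors, key=abs)
print([round(e, 2) for e in errors], round(worst, 2))
```

In the procedure of the text, the reconstructed parameters would additionally be fed to the forward solver and the resulting response compared with the measured one through the distance norm.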
[Figure 11.26 plot: difference of the displacement amplitude (vertical axis, 0.0 to 3.0) versus number of iterations of training for the NN (1 to 4); one curve each for case 1, case 2, case 3, and case 4]
FIGURE 11.26 Changes of the difference in displacement amplitude between the actual displacement and the computed displacement using delamination parameters predicted by the NN at different iterations. (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
Four new samples were consequently generated using these reconstructed delamination parameters and the resulting displacement responses calculated using the SEM code. These new samples were then put into the original sample set to replace four selected samples with largest distance norms. The NN model was retrained with the adjusted sample set, and then used to reconstruct the delamination parameters again. With the progress of this retraining process, the displacement responses calculated using the SEM code with the reconstructed delamination parameters became closer to the true ones. As an example, Figure 11.27 shows the evolution process of calculated displacement response amplitudes for case (2) using the delamination parameters predicted by the NN at different iterations. After three episodes of retraining, the displacement responses calculated from the reconstructed delamination parameters were very close to the actual ones for all four cases. The maximum error of the reconstructed delamination parameters with respect to their assumed values decreases to –5.03, –4.92, –6.84, and –5.43%, respectively. Figure 11.28 shows the convergence of these delamination parameters during the progressive reconstruction process. The convergence of the error norm for the NN model during these three retrainings is also shown in Figure 11.24. The numerically simulated results demonstrate that the progressively trained NN model can correctly detect the locations ( ac and dc ) and the
[Figure 11.27 legend: 1st, 2nd, 3rd, and 4th reconstruction; target]
FIGURE 11.27 Progressive process of the calculated displacement response amplitudes. (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
length (lc) of a delamination hidden in the anisotropic laminate using noise-free response data.
11.7.3
Noise-Contaminated Case
To further examine the applicability of the technique to practical problems, in which noise in the measured data is inevitable, the displacement response data obtained from the SEM code were contaminated with Gaussian noise. To compare the output of the NN model for these noisy inputs with that of the noise-free cases, the location and length of the delaminations assumed in case (1) and case (4) examined previously were reconstructed again. Gaussian noise (see Equation 2.130) with levels of 5% (pe = 0.05) and 10% (pe = 0.1) was applied to each of these two cases by adding it to the response data simulated by the SEM code with the actual parameters. These noisy simulated measurement data were then fed into the trained NN model to reconstruct the delamination parameters. As in Section 11.7.2, the first reconstructions from the NN model for case (1) and case (4) were not satisfactory, so the NN model was retrained. After four episodes of successive retraining, the maximal error of the delamination parameters reconstructed from the NN model for case (1) converged to –5.03 and
[Figure 11.28 plot: four panels, (1) ac = 4.5, dc = 4/24, lc = 0.5; (2) ac = 7.0, dc = 4/24, lc = 1.0; (3) ac = 4.5, dc = 8/24, lc = 0.5; (4) ac = 7.0, dc = 8/24, lc = 1.0. Vertical axis: errors of crack parameters detected by NN (%), from –25 to 5; horizontal axis: number of iterations of training for the NN model, 1 to 4; curves: location, length, depth]
FIGURE 11.28 Errors of delamination parameters detected by the trained NN model at different iterations using the noise-free input data of displacement response. (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
–6.14% from –27.61 and –28.42% for the response data with 5 and 10% noise, respectively. For case (4), it converged to –6.36 and –6.81% from –27.89 and –29.83% for the same two noise levels, respectively. The training was therefore stopped because the errors had fallen within the range of the measurement errors; any further improvement would be practically meaningless. These reconstructed results are shown in Figure 11.29. The results show that satisfactory reconstruction of the delamination parameters is possible from the trained NN model even if the input contains certain levels of noise. This example shows that the NN model is immune to the ill-posedness of the problem because the use of the SEM code (see Section 10.2.2) and the discrete sampling provide projection regularization. Because
[Figure 11.29 plot: four panels, (1) ac = 4.5, dc = 4/24, lc = 0.5, pe = 5%; (2) ac = 4.5, dc = 4/24, lc = 0.5, pe = 10%; (3) ac = 7.0, dc = 8/24, lc = 1.0, pe = 5%; (4) ac = 7.0, dc = 8/24, lc = 1.0, pe = 10%. Vertical axis: errors of crack parameters detected by NN (%), from –30 to 10; horizontal axis: number of trainings of the NN model, 1 to 5; curves: location, length, depth]
FIGURE 11.29 Errors of delamination parameters detected by the progressively trained NN model at different iterations using the noise-contaminated input data of the displacement response. (From Xu, Y.G. et al., Int. J. Solids Struct., 38, 5625–5645, 2001. With permission.)
34 sampling points are used in this study, the NN built can accommodate higher levels of noise compared with the case discussed in Section 9.7.2.
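The noise contamination step can be sketched as below. Equation 2.130 is not reproduced in this section, so the scaling used here (zero-mean Gaussian noise whose standard deviation is pe times the RMS of the clean response) is an assumption, as are the function name and the synthetic 34-point response.

```python
import math
import random

def add_gaussian_noise(response, pe, seed=0):
    """Return a noisy copy of a sampled response.
    ASSUMPTION: noise std = pe * RMS(response); Equation 2.130 may differ."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    rms = math.sqrt(sum(v * v for v in response) / len(response))
    return [v + pe * rms * rng.gauss(0.0, 1.0) for v in response]

# Hypothetical clean response at 34 sampling points (not SEM output).
clean = [1.0 + 0.5 * math.sin(0.6 * i) for i in range(34)]

noisy_5 = add_gaussian_noise(clean, pe=0.05)
noisy_10 = add_gaussian_noise(clean, pe=0.10)

# With the same seed, doubling pe doubles every noise increment.
max_dev_5 = max(abs(n - c) for n, c in zip(noisy_5, clean))
max_dev_10 = max(abs(n - c) for n, c in zip(noisy_10, clean))
print(round(max_dev_5, 4), round(max_dev_10, 4))
```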
11.7.4
Discussion
The accuracy of reconstructions from the trained NN model depends significantly on the sensitivity of the displacement response sampled on the laminate surface to variations in the location and length of the delaminations. The longer and shallower the delamination is, the more significant the distortion in the curve of the surface displacement response. Consequently, better accuracy can be obtained in reconstructing the delamination parameters. This is seen in case (2) and case (3): case (2) has the highest accuracy and case (3) the lowest among the four delamination cases. The numerical examinations show that the accuracy of the output from the NN model improves as the retraining proceeds. If noise-free training and input data are used, the required accuracy can be reached by increasing the number of retrainings, although this requires more computational effort. Judgment should be exercised for each particular delamination detection problem in practical applications. When noisy input data are used, however, retraining beyond a certain point is meaningless (see Section 9.7.2.3).
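The sample-set update at the heart of the progressive retraining in Section 11.7.2 can be sketched as follows. This is an illustration under assumptions: the distance norm of Equation 6.17 is replaced by a Euclidean distance to the measured response, the function names are hypothetical, and the toy samples are not SEM results.

```python
import math

def distance_norm(a, b):
    """Euclidean distance (ASSUMED stand-in for Equation 6.17)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def replace_worst_samples(samples, new_samples, measured):
    """Replace the samples whose responses lie farthest from the measured
    response with the newly generated (parameters, response) samples."""
    ranked = sorted(
        range(len(samples)),
        key=lambda i: distance_norm(samples[i][1], measured),
        reverse=True,
    )
    updated = list(samples)
    for index, new in zip(ranked[: len(new_samples)], new_samples):
        updated[index] = new
    return updated

# Toy sample set: (parameters, response) pairs.
samples = [((0.0,), [0.0, 0.0]), ((1.0,), [1.0, 1.0]), ((2.0,), [5.0, 5.0])]
measured = [1.2, 1.1]
# New sample built from the NN reconstruction and a forward solve (hypothetical).
new_samples = [((1.1,), [1.15, 1.05])]

updated = replace_worst_samples(samples, new_samples, measured)
print(updated)
```

Keeping the sample count fixed while moving samples toward the region of the true solution is what lets the retrained network sharpen its prediction without growing the training cost.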
11.8 Remarks
• Remark 11.1: Delaminations in laminates can be characterized from the displacement measured on the surface of the laminate. The advantage of this technique is its feasibility in real engineering applications. All that is necessary is to apply a time-harmonic load on the surface of the laminate and simultaneously measure the surface displacement response. The delaminations hidden in the laminate can then be detected using the inverse techniques presented in this chapter.
• Remark 11.2: The delamination detection problem in a laminate is formulated as an optimization problem that minimizes the difference between the measured and calculated surface displacement responses. The scattered wave fields in the frequency domain for laminates with horizontal or vertical cracks can be computed effectively using the SEM code. Characteristic parameters of delaminations can be inversely detected by combining genetic algorithms with the forward solver of the SEM code. Uniform µGAs and IPGAs, as well as the combination of the modified µGA and a gradient-based optimization method, can be used.
• Remark 11.3: The delamination detection problem in a laminate is also solved as an identification problem using the progressive NN. The excited displacement response on the surface of the laminate is used as the input of the NN model, and the delamination parameters are used as its output. The results indicate that the progressive NN is effective for inversely identifying the delamination parameters in composite laminates.
• Remark 11.4: Apart from the SEM code, powerful software packages such as LS-DYNA, NASTRAN, and ANSYS have been developed. They are commercially available and can be used to calculate elastic wave fields for complex composite structures. These packages provide forward solvers that allow the detection technique to be applied to complex structures.
12 Inverse Detection of Flaws in Structures
Computational inverse techniques for flaw detection in beams or plates with applications to sandwich structures are introduced in this chapter. Sandwich structures are a special class of laminates that require special considerations and treatments. Two approaches are introduced for the characterization of flaws in sandwich structures; in both the finite element model is used for the forward analysis. The first approach is the parameter identification to determine the presence, location, size, and degree of the damage of flaws in the core layer; the GA is used for the inverse analysis. The second approach is to relate the flaw with the element stiffness factors; gradient-based methods, such as Newton’s root finding method as well as the Levenberg–Marquardt method, are used for the inverse analysis. A number of numerical examples are provided to demonstrate the application of these computational inverse techniques. It is also revealed that gradient-based methods are much more efficient in dealing with inverse problems with large numbers of parameters.
12.1 Introduction
Damage and flaws are always a major concern in structural systems. Sandwich structures have been widely used in various industrial applications, especially in the aerospace and shipbuilding industries. Sandwich structures are a special class of laminates that consist of three layers: two thin, high-strength, stiff face (outer) layers and one thick, low-density, flexible inner core layer. Structural efficiency in terms of economy, high stiffness, and low weight is achieved by combining the stronger facings with a thicker, lightweight core material. Structural reliability is highly dependent on the support of the core material, and the stiffness and strength of sandwich structures are affected greatly by failure of the core material. Usually, the core material is more easily damaged than the facings during the manufacturing process and in practical use by local concentrated loads and/or
impact loads. It is thus very important to locate the flaw and detect its degree of damage. Nondestructive testing using ultrasonic techniques plays a very important role in detecting flaws in structures; the flaw is determined from the echogram of the ultrasonic bulk waves. This approach is often inefficient for sandwich plates because of their inhomogeneous layered structure. It also requires scanning the whole surface of the structure to locate the flaws. Liu and his co-workers (Liu and Lam, 1994; Liu and Achenbach, 1995; Liu et al., 1995b, 1996; Lam et al., 1997; Wang et al., 1998) applied the SEM (see Section 10.2.1) to investigate the scattering of Lamb waves by rectangular flaws in sandwich plates and anisotropic laminated plates, where the plates were treated as a plane problem and the scattered wave field in the frequency domain was computed for characterization of the flaws. Another very important direct methodology is to determine the damage quantities from changes in the dynamic properties of a structure, using measurements of the shifts in natural frequencies and the changes in mode shapes (Salawu, 1997; Doebling et al., 1996). However, for small local flaws in a sandwich structure, the changes in the lower natural frequencies and mode shapes are too small to be detected, especially in the presence of measurement error. The harmonic response of a plate structure excited with a load of a certain frequency can be significantly different for damaged and undamaged plates. Practically, it is not too difficult to excite the structure with a harmonic load and measure the corresponding response at certain points using the available testing equipment. The responses recorded at certain positions of the plates differ for plates with different flaws.
However, it is very difficult and complicated to find the direct relationship between the flaw parameters and the responses to the excitation. Therefore, computational inverse techniques are required to determine the quantities (parameters) of the flaw based on the response of structures to the dynamic excitation. In recent years, Liu and Chen (2000, 2001, 2002) have proposed several computational inverse techniques to detect flaws in sandwich structures. The genetic algorithm and Newton’s root finding methods are employed in the inverse procedure. Based on their study, several computational inverse techniques are presented in this chapter to detect flaws in sandwich beams and plates quantitatively using the response of structures to the given harmonic excitation.
12.2 Inverse Identification Formulation
Consider a general finite element model of a linear-elastic structure. The dynamic governing equation found in many textbooks on solid mechanics or FEM (see, e.g., Liu and Quek, 2003) is given for a harmonic excitation:
$$[\mathbf{K} - \omega^2 \mathbf{M}]\,\mathbf{w} = \mathbf{F} \qquad (12.1)$$
where K and M are the stiffness and mass matrices, respectively, ω is the frequency of the harmonic excitation, w is the vector of nodal displacement amplitudes, and F is the vector of the amplitudes of the forces. In the next sections, two approaches for the characterization of flaws in structures, i.e., damaged element identification and stiffness factor identification, are introduced.
12.2.1
Damaged Element Identification
Based on the finite element model of a sandwich structure, the unknown parameters are defined as
• A reference element number, $n_r$, indicating the reference location of the damaged area
• Damaged element numbers and/or damage profile cases, $n_d$, determining the type of the damaged area
• The damage factor $\beta_f$, indicating that the Young’s modulus of the damaged layer in these elements is reduced to
$$E_k^f = (1 - \beta_f)\,E_k \qquad (12.2)$$
The input of the inverse analysis is taken as the responses (deflections) sampled at certain points of the structure (Liu and Chen, 2001) excited by a time-harmonic point load with frequency ω. To predict the response of the damaged plate, the stiffness matrix of the entire structure is updated according to the changes in the element stiffness matrices arising from the reduction of the modulus of the damaged layer. However, an internal change of a structure generally does not result in a loss of material; therefore, it is assumed that the mass matrix does not change. In the forward analysis, the stiffness matrix is updated for each trial set of flaw parameters. The responses are then solved to construct the error function for the inverse analysis. The objective function is the sum of squares of the differences between the responses of the plate with the actual damage parameters and those computed by the FEM model with the assumed parameters, in the form of the L2 norm defined in Equation 2.123:
$$f_{err}(\mathbf{p}) = \sum_{i=1}^{n_s} \left\{ w_i^c(\mathbf{p}) - w_i^m(\mathbf{p}_t) \right\}^2 \qquad (12.3)$$
where p is the vector of unknown parameters ($n_r$, $n_d$, $\beta_f$) describing the flaw characteristics defined previously, $\mathbf{p}_t$ is the vector of true damage parameters, $n_s$ is the number of points where deflections are sampled for inverse analysis, $w_i^c$ is the computed deflection of the structure with trial damage parameters, and $w_i^m$ is the deflection of the plate with true damage parameters at the sampling point. Constraints on the parameters are given for each finite element model accordingly and form the search space. To determine the unknown parameters of the flaw in the inverse analysis, the objective function is minimized computationally, with the FEM used for the forward analysis.
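The damaged element formulation can be illustrated with a toy model. The sketch below is not the sandwich plate FEM of the text: it assumes a fixed-free chain of six springs with unit masses (all numerical values and names are hypothetical), applies Equation 12.2 to the spring stiffness, solves Equation 12.1, and evaluates Equation 12.3 over a small discrete search space by brute force.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

# Hypothetical toy values: 6 springs, stiffness 1000, unit masses, omega = 2.
N_ELEM, K0, MASS, OMEGA = 6, 1000.0, 1.0, 2.0
SAMPLED = [1, 3, 5]  # 0-based nodes where the response is "measured"

def response(nr, nd, beta):
    """Forward analysis: solve [K - omega^2 M] w = F for trial (nr, nd, beta)."""
    k = [K0] * N_ELEM
    for e in range(nr, min(nr + nd, N_ELEM)):
        k[e] = (1.0 - beta) * K0          # Eq. 12.2 applied to the spring stiffness
    a = [[0.0] * N_ELEM for _ in range(N_ELEM)]
    for e in range(N_ELEM):               # element e joins node e-1 (or ground) to node e
        a[e][e] += k[e]
        if e > 0:
            a[e - 1][e - 1] += k[e]
            a[e - 1][e] -= k[e]
            a[e][e - 1] -= k[e]
    for i in range(N_ELEM):
        a[i][i] -= OMEGA ** 2 * MASS      # mass matrix is not changed by damage
    f = [0.0] * N_ELEM
    f[-1] = 1.0                           # unit harmonic load at the free end
    return solve(a, f)

def f_err(trial, measured):
    """Eq. 12.3: sum of squared differences at the sampling points."""
    w = response(*trial)
    return sum((w[i] - measured[i]) ** 2 for i in SAMPLED)

true_params = (2, 2, 0.5)                 # elements 2 and 3 damaged, beta_f = 0.5
measured = response(*true_params)         # simulated measurement
candidates = [(nr, nd, b / 10) for nr in range(N_ELEM)
              for nd in (1, 2, 3) for b in range(1, 9)]
best = min(candidates, key=lambda p: f_err(p, measured))
print(best, f_err(best, measured))
```

Exhaustive enumeration is affordable only because this toy search space has 144 candidates; the GA discussed in Section 12.3 is meant for cases where each evaluation is an expensive FEM solve.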
12.2.2
Stiffness Factor Identification
The global stiffness matrix of a structure is the assembly of all the element stiffness matrices. For isotropic elastic material, the element stiffness matrix is always proportional to the elastic modulus of the material and the geometric coefficient, which are unknown parameters in inverse analysis. Thus the global stiffness matrix can often be expressed as (see, e.g., Liu and Chen, 2002)
$$\mathbf{K} = \sum_{i=1}^{n_e} e_i \mathbf{K}_i^e \qquad (12.4)$$
where $n_e$ is the total number of elements, $e_i$ ($i = 1, 2, \ldots, n_e$) are the unknown parameters of elastic modulus or element stiffness factors, and the element stiffness matrix $\mathbf{K}_i^e$ is obtained by assuming that the element is perfect with a unit factor. Therefore, the element stiffness factor $e_i$ ($i = 1, 2, \ldots, n_e$) reflects the degree of damage of the element in the damaged structure. Substituting Equation 12.4 into Equation 12.1 yields
$$\left[ \sum_{i=1}^{n_e} e_i \mathbf{K}_i^e - \omega^2 \mathbf{M} \right] \mathbf{w} = \mathbf{F} \qquad (12.5)$$
Note that ω and F are known for a given excitation force. The mass matrix M is known for the given material of the structure and the FE mesh. $\mathbf{K}_i^e$ is also known once the finite element mesh is given. For an assumed set of $e_i$, w can therefore be computed without difficulty. It is also clear that the coefficient matrix in Equation 12.5 is an explicit linear function of the parameters $e_i$. This form facilitates efficient computation of the gradients required in gradient-based search techniques.
• In the forward analysis, the displacement response of a finite element system can be predicted using Equation 12.5 with a given set of parameters $e_i$ ($i = 1, 2, \ldots, n_e$).
• In the inverse analysis, the parameters need to be identified using the measured values of the displacement response or modal parameters. That is, the parameters are chosen to best fit the experimental data.
Two methods are used to fit these data:
• The least squares method to minimize the error
• The sensitivity-based analysis method, which has different formulations for different problems
12.2.2.1 Objective Function with Weight
The objective function is defined as the weighted sum of squared differences between the measured data and the corresponding simulated values of the dynamic properties of the structure:
$$f_{err}(\mathbf{e}) = \sum_{i=1}^{n_s} W_i \left( f_i^c(\mathbf{e}) - f_i^m(\mathbf{e}_t) \right)^2 \qquad (12.6)$$
where $\mathbf{e} = (e_1, e_2, \ldots, e_{n_e})^T$ is the vector of unknown parameters, $f_i^m$ is the measured value, $f_i^c$ is the corresponding simulated value using a trial e, and $W_i$ is the weight factor used to give some measurements more or less weight. The measured values of a structure can be responses, natural frequencies, and values of the modal assurance criterion (MAC) (Ewins, 1985) related to the mode shapes.
12.2.2.2 Direct Formulation
A direct formulation is employed by Liu and Chen (2002) in determining stiffness factors using the harmonic response. For a finite element model with $n_e$ elements, $n_s$ displacements at different nodes of the structure can be measured and collected in a vector $\mathbf{w}^m$. The identification problem is to determine the element stiffness factor vector e in Equation 12.5 using the measured response $\mathbf{w}^m$, i.e., to find e that satisfies the following equation based on a simple matrix operation:
$$\mathbf{Q}\mathbf{w} = \mathbf{w}^m \qquad (12.7)$$
where Q is a constant selection matrix of zeros and ones, with one row per measured displacement component, that can always be formed to select the degrees of freedom corresponding to the measured components. For example, if the displacements $w_3$, $w_5$, and $w_8$ are measured in a model with ten degrees of freedom, Q has a single one in columns 3, 5, and 8 of its three rows:
$$\mathbf{Q} = \begin{bmatrix} 0&0&1&0&0&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0&0&0 \\ 0&0&0&0&0&0&0&1&0&0 \end{bmatrix} \qquad (12.8)$$
Vector w is solved from Equation 12.5 for a given e.
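Define an error function, with $\mathbf{w}^m$ denoting the vector of measured displacements, as
$$\mathbf{f}(\mathbf{e}) = \mathbf{Q}\mathbf{w} - \mathbf{w}^m = \mathbf{0} \qquad (12.9)$$
where
$$\mathbf{f}(\mathbf{e}) = \begin{Bmatrix} f_1(\mathbf{e}) \\ f_2(\mathbf{e}) \\ \vdots \\ f_{n_s}(\mathbf{e}) \end{Bmatrix}, \qquad \mathbf{e} = \begin{Bmatrix} e_1 \\ e_2 \\ \vdots \\ e_{n_e} \end{Bmatrix} \qquad (12.10)$$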
Here, f(e) is a set of nonlinear implicit equations with respect to the parameters. The value of f(e) and its derivatives can be evaluated by making use of the linear property of Equation 12.5. Thus, the solution of Equation 12.9 can be found directly and numerically using a root finding method.
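The way the linearity of Equation 12.5 yields the derivatives can be written out explicitly. The following derivation is a sketch rather than a quotation from the source: the symbols $\mathbf{A}(\mathbf{e})$ for the coefficient matrix and $\mathbf{J}$ for the Jacobian are introduced here for convenience, the force vector F is assumed independent of e, and the Newton update in the last line assumes $n_s = n_e$ with a nonsingular Jacobian.

```latex
\begin{align}
\mathbf{A}(\mathbf{e}) &= \sum_{i=1}^{n_e} e_i \mathbf{K}_i^e - \omega^2 \mathbf{M},
  \qquad \mathbf{A}(\mathbf{e})\,\mathbf{w} = \mathbf{F}
  && \text{(Equation 12.5)} \\
\frac{\partial \mathbf{A}}{\partial e_j}\,\mathbf{w}
  + \mathbf{A}\,\frac{\partial \mathbf{w}}{\partial e_j} &= \mathbf{0},
  \qquad \frac{\partial \mathbf{A}}{\partial e_j} = \mathbf{K}_j^e
  && \text{(differentiate with respect to } e_j\text{)} \\
\frac{\partial \mathbf{w}}{\partial e_j} &= -\mathbf{A}^{-1}\mathbf{K}_j^e\,\mathbf{w} \\
\mathbf{J}_{:,j} = \frac{\partial \mathbf{f}}{\partial e_j}
  &= \mathbf{Q}\,\frac{\partial \mathbf{w}}{\partial e_j}
  = -\mathbf{Q}\,\mathbf{A}^{-1}\mathbf{K}_j^e\,\mathbf{w} \\
\mathbf{e}^{(k+1)} &= \mathbf{e}^{(k)} - \mathbf{J}^{-1}\,\mathbf{f}\big(\mathbf{e}^{(k)}\big)
  && \text{(Newton update, } n_s = n_e\text{)}
\end{align}
```

Each column of the Jacobian therefore costs one additional solve with the already assembled matrix $\mathbf{A}$, with no finite differencing required.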
12.3 Use of Uniform µGA
The forward analysis for obtaining the response of the sandwich plate with different flaws is conducted using an FEM code developed based on the Mindlin theory for laminated plates. The flaw in the core layer of the sandwich plate is characterized at the finite element level, with damage represented as a reduction in the modulus of the material in the damaged layer. In the following inverse analysis, a uniform µGA is employed to solve the optimization problem defined by Equation 12.3 to find the actual parameters of the flaws in sandwich structures. The µGA uses a population size of 5, tournament selection, no mutation, niching, elitism, and uniform crossover with pcross = 0.5.
12.3.1
Example I: Sandwich Beam
Consider a simply supported sandwich beam shown in Figure 12.1. The material constants and size of the beam are given in Table 12.1. The beam is discretized into 20 elements, and it is assumed that the elements 14, 15, 16, and 17 are the damaged elements containing a flaw in the core layer with a damage factor βf = 0.5. The Young’s modulus of the core material for the elements is computed using Equation 12.2. A time-harmonic load with frequency ω = 3000 rad/s is applied to excite the beam. The responses are sampled at five points (marked with dots in the figure). The unknown
[Figure 12.1 sketch: beam divided into elements i along x, with elements 14, 15, 16, and 17 marked; coordinate axes x, y, z]
FIGURE 12.1 Simply supported sandwich beam with a flaw. The beam is divided into 20 elements. The elements numbered 14, 15, 16, and 17 are damaged elements containing a flaw in the core layer. A time-harmonic load with frequency ω = 3000 rad/s is applied to excite the beam. The responses are sampled at five points marked with dots in the figure. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
TABLE 12.1 Material Constants and Geometric Size of Sandwich Beam
Layer            E          ν     Density      Length   Width    Thickness
Layer 1          16.7 GPa   0.3   1760 kg/m3   -        -        0.005 m
Layer 2 (core)   13.0 GPa   0.3   1000 kg/m3   1 m      0.05 m   0.01 m
Layer 3          16.7 GPa   0.3   1760 kg/m3   -        -        0.005 m
Source: Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.
parameters set to be identified are the reference element number to indicate the location of the flaw, the numbers of the damaged element to represent the size of the flaw, and degree of the damage to describe the severity of the damage, respectively. The range of the reference element number is set to be 1 to 16, having 16 possibilities. The number of damaged elements ranges from 1 to 4 with four possibilities. The degree of the damage is in the range of 0.1 to 0.8, with eight possibilities. Thus, the searching space has a total of 512 possibilities. The characteristics of the actual flaw were successfully detected after 30 generations with a total of 150 function evaluations that invoke the FEM code. Figure 12.2 shows the convergence process of the fitness of the best individual of the population in each generation.
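The size of the search space and the cost of the µGA run quoted above can be verified directly. The sketch below only does the bookkeeping; it assumes that the reference element is the first damaged element (so the true flaw is encoded as (14, 4, 0.5)), which the text does not state, and the variable names are illustrative.

```python
from itertools import product

reference_elements = range(1, 17)                            # 16 possibilities
damaged_counts = range(1, 5)                                 # 4 possibilities
damage_factors = [round(0.1 * g, 1) for g in range(1, 9)]    # 0.1 ... 0.8

search_space = list(product(reference_elements, damaged_counts, damage_factors))
n_candidates = len(search_space)

# Cost reported in the text: 30 generations with a population of 5.
population_size = 5
generations = 30
n_evaluations = population_size * generations

# ASSUMPTION: reference element = first damaged element (elements 14 to 17).
true_flaw = (14, 4, 0.5)

print(n_candidates, n_evaluations, n_evaluations / n_candidates)
```

The counts 16, 4, and 8 are powers of two, which would correspond to 4 + 2 + 3 = 9 bits in a binary-coded chromosome; that coding is an inference here, not something stated in this section.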
12.3.2
Example II: Sandwich Plate
The second example considered is a simply supported square sandwich plate. The material constants and size of the plate are given in Table 12.2. Figure 12.3 shows the finite element model (discretized into 100 elements) of the plate. The plate is assumed to contain a flaw in shaded elements with a damage factor βf = 0.5 in the core layer. A time-harmonic load with frequency ω = 3000 rad/s is applied at the center of the plate (marked with a circle) to excite the plate. The responses are sampled at six points (marked with dots in the figure). The reference element number that indicates the
[Figure 12.2 plot: fitness (vertical axis, –3.00E-01 to 0.00E+00) versus generation (horizontal axis, 0 to 50)]
FIGURE 12.2 Fitness value of the best individual for the simply supported beam shown in Figure 12.1. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
TABLE 12.2 Material Constants and Geometric Size of Sandwich Square Plate
Layer            E          ν     Density      Length of Sides   Thickness
Layer 1          16.7 GPa   0.3   1760 kg/m3   -                 0.005 m
Layer 2 (core)   13.0 GPa   0.3   1000 kg/m3   0.6 m             0.01 m
Layer 3          16.7 GPa   0.3   1760 kg/m3   -                 0.005 m
Source: Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.
FIGURE 12.3 Simply supported square sandwich plate with a flaw (shaded) in the core layer. The plate is discretized into 100 plate elements. A time-harmonic load with frequency ω = 3000 rad/s is applied at the center of the plate. The frequency responses are sampled at six points marked with dots in the figure. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
location of the damage can vary from 1 to 128. Figure 12.4 gives four possible damage patterns. The degree of the damage is in the range of 0.1 to 0.8, discretized into eight grades. Individuals whose parameters fall outside the domain of the problem, such as an element number greater than 100 in this example, are not evaluated; instead, they are assigned a small fitness value so that they are discarded. It takes 244 generations, or 244 × 5 = 1220 function evaluations that invoke the FEM code, to converge to the true solution. The fitness of the best individual in each generation is shown in Figure 12.5. Due to the random nature of the GA, it converges to the true solution slowly after the best individual in the population falls into the region near the true optimum. To improve the convergence rate, a two-stage searching method (Liu and Chen, 2001) is suggested. This method consists of global searching at the first stage and local searching at the second. The local searching is performed in a reduced search space after the global searching has located the likely region of the optimum. For this example, global searching means that the reference element number ranges over all the elements, i.e., the flaw may be anywhere in the plate. The reference element number can be approximately located after a certain number of generations, for example, 50 generations at the first stage; global searching is then stopped and the
[Figure panels: Case 1, Case 2, Case 3, Case 4]
FIGURE 12.4 Four possible damage patterns. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
[Plot: error (−1.00E-03 to 2.00E-04) versus generation number (0 to 300)]
FIGURE 12.5 µGA convergence in searching for flaws in the simply supported plate. Error function is defined by Equation 12.3. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
FIGURE 12.6 Domain for local search at the second stage. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
FIGURE 12.7 Possible damage patterns in a 3 × 3 mesh. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
damage pattern is redefined for the reference elements, as shown in Figure 12.6. When the GA search is restarted in this local domain, it takes only 10 generations to obtain the true solution. The two-stage searching method improves search efficiency greatly because the local search is performed in a much smaller parameter space that still includes the true solution. For the same example with a larger flaw area, the preceding procedure can also be applied by extending the damage profiles to include more cases, as shown in Figure 12.7. The same plate with a larger flaw area, shown in Figure 12.8, is analyzed. The global search is stopped at the 100th generation to determine the subregion that contains the flaw. After the GA search is restarted in this subregion, it takes only another 18 generations to obtain the true solution.
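The saving from the two-stage method can be estimated from the generation counts quoted above. The following is a minimal sketch of that arithmetic. It assumes the population size of 5 implied by 244 × 5 = 1220 also applies to the local stage, and that the global stage of the first example is stopped after the 50 generations given as an example in the text; the function name is illustrative only.

```python
POPULATION_SIZE = 5  # individuals per generation, from 244 x 5 = 1220 (assumed for both stages)

def fem_calls(generations, population=POPULATION_SIZE):
    """Upper bound on forward (FEM) evaluations for a number of generations."""
    return generations * population

single_stage = fem_calls(244)                  # global search only
two_stage = fem_calls(50) + fem_calls(10)      # global stage + local stage
larger_flaw = fem_calls(100) + fem_calls(18)   # larger-flaw example

print(single_stage, two_stage, larger_flaw)
```

These counts are upper bounds, because individuals outside the problem domain are assigned a small fitness without invoking the FEM code.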
12.4 Use of Newton’s Root Finding Method
In this section, the element stiffness factors defined in Equation 12.4, for all the elements of the finite element model of a structure, are taken as the parameters and are expressed explicitly in a linear form in the system equation for forward analysis of the harmonic response of the structure, as shown in Equation 12.5. This offers great convenience in applying Newton’s root finding method to search inversely for the stiffness factors, because the Jacobian matrix can be obtained by solving sets of linear algebraic equations derived from the system equation. Newton’s root finding method is briefly discussed in Chapter 4.
FIGURE 12.8 Simply supported square sandwich plate with a larger rectangular flaw in the core layer. (From Liu, G.R. and Chen, S.C., Comput. Methods Appl. Mech. Eng., 190, 5505–5514, 2001. With permission.)
12.4.1 Calculation of Jacobian Matrix
Newton’s root finding method and the modified Newton’s method require calculation of the Jacobian matrix, which contains the derivatives of the displacements with respect to the unknown parameters, the element stiffness factors $\mathbf{e}$. The Jacobian matrix can be obtained efficiently by taking advantage of the linear dependence on $e_i$ in Equation 12.5. Differentiating both sides of Equation 12.5 with respect to each parameter $e_i$ leads to
\[ \mathbf{K}_i^e \mathbf{w} + (\mathbf{K} - \omega^2 \mathbf{M}) \frac{\partial \mathbf{w}}{\partial e_i} = \mathbf{0} \quad (i = 1, 2, \ldots, n_e) \tag{12.11} \]
In Equation 12.11, the vector $\mathbf{w}$ is solved from Equation 12.5 in the forward analysis, so Equation 12.11 can be written as
\[ (\mathbf{K} - \omega^2 \mathbf{M}) \frac{\partial \mathbf{w}}{\partial e_i} = -\mathbf{K}_i^e \mathbf{w} \quad (i = 1, 2, \ldots, n_e) \tag{12.12} \]
Thus, the derivative $\partial \mathbf{w} / \partial e_i$ can be solved from the preceding linear algebraic equation system, which has the same form as Equation 12.5; the only difference is on the right-hand side. For $i = 1, 2, \ldots, n_e$, the Jacobian matrix is obtained by multiplying the resulting derivative vectors by the matrix $\mathbf{Q}$ defined by Equation 12.7.
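The pseudo-load calculation of Equation 12.12 can be sketched on a toy problem. The example below is only an illustration under stated assumptions: a fixed-free chain of three springs with unit masses stands in for the finite element model, the global stiffness is taken as the sum of the stiffness factors times unit-factor element matrices (the linear form the text attributes to Equation 12.5), and Q is the identity (every node measured). All names and numerical values are hypothetical. The analytical Jacobian is compared with central finite differences.

```python
import numpy as np

def element_matrices(n, k0=1000.0):
    """Unit-factor stiffness matrix K_i^e of each spring in a fixed-free chain."""
    mats = []
    for i in range(n):
        Ki = np.zeros((n, n))
        Ki[i, i] += k0                    # spring i acts on node i
        if i > 0:                         # and on node i-1 (spring 0 is tied to the wall)
            Ki[i - 1, i - 1] += k0
            Ki[i - 1, i] -= k0
            Ki[i, i - 1] -= k0
        mats.append(Ki)
    return mats

def response_and_jacobian(e, Ks, M, F, omega, Q):
    """Solve the forward problem for w, then Eq. 12.12 for every dw/de_i."""
    A = sum(ei * Ki for ei, Ki in zip(e, Ks)) - omega**2 * M
    w = np.linalg.solve(A, F)
    pseudo_loads = np.column_stack([-Ki @ w for Ki in Ks])   # one column per parameter
    dw_de = np.linalg.solve(A, pseudo_loads)
    return w, Q @ dw_de

n = 3
Ks = element_matrices(n)
M, Q = np.eye(n), np.eye(n)               # assumed: unit masses, all nodes measured
F = np.array([0.0, 0.0, 1.0])             # load amplitude at the free tip
omega = 2.0                               # well below the first natural frequency
e = np.array([1.0, 0.8, 0.9])
w, J = response_and_jacobian(e, Ks, M, F, omega, Q)

# Central finite differences as an independent check of the analytical Jacobian
h = 1e-6
J_fd = np.zeros_like(J)
for i in range(n):
    de = np.zeros(n)
    de[i] = h
    w_plus, _ = response_and_jacobian(e + de, Ks, M, F, omega, Q)
    w_minus, _ = response_and_jacobian(e - de, Ks, M, F, omega, Q)
    J_fd[:, i] = Q @ (w_plus - w_minus) / (2 * h)
max_diff = np.max(np.abs(J - J_fd))
print(max_diff)
```

The sketch calls a dense solver twice for brevity; a production code would factorize the coefficient matrix once and reuse the factors for all right-hand sides.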
12.4.2 Iteration Procedure
Following the procedure given schematically in Figure 4.8, the stiffness factors $\mathbf{e}$ can be identified. Starting with an initial guess $\mathbf{e}^{(0)}$, the iteration proceeds as follows:
• Step 1: Solve Equation 12.5 at $\mathbf{e}^{(k)}$ for $\mathbf{w}$, and then compute the value of the error function
\[ \mathbf{f}(\mathbf{e}^{(k)}) = \mathbf{Q}\mathbf{w} - \mathbf{w}^m \tag{12.13} \]
where $\mathbf{w}^m$ denotes the measured response.
• Step 2: Solve Equation 12.12 at $\mathbf{e}^{(k)}$ for $\{\partial \mathbf{w} / \partial e_i\}$ and obtain the Jacobian matrix. The right-hand side of Equation 12.12 is first formed as a “pseudo load vector” using the response obtained in Step 1. Because the coefficient matrix has already been factorized in Step 1, the forward analysis can be reused and each derivative vector is obtained by back-substitution alone with the pseudo load vector.
• Step 3: Find $\mathbf{e}^{(k+1)}$ by Newton’s method using Equation 4.74. In practice, $\mathbf{e}^{(k+1)}$ is obtained by solving the linear equation system
\[ \nabla \mathbf{f}_k \, (\mathbf{e}^{(k+1)} - \mathbf{e}^{(k)}) = -\mathbf{f}(\mathbf{e}^{(k)}) \tag{12.14} \]
• Step 4: Repeat Step 1 through Step 3 until the required tolerance is satisfied.
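The four steps above can be sketched as a short loop. This is a minimal illustration, not the implementation used for the examples in this chapter: a fixed-free chain of three springs with unit masses replaces the finite element model, all nodes are assumed measured (Q is the identity), the measurement is simulated without noise, and every name and value is hypothetical.

```python
import numpy as np

def element_matrices(n, k0=1000.0):
    """Unit-factor stiffness matrix of each spring in a fixed-free chain."""
    mats = []
    for i in range(n):
        Ki = np.zeros((n, n))
        Ki[i, i] += k0
        if i > 0:
            Ki[i - 1, i - 1] += k0
            Ki[i - 1, i] -= k0
            Ki[i, i - 1] -= k0
        mats.append(Ki)
    return mats

def forward(e, Ks, M, F, omega):
    """Forward analysis: return the dynamic stiffness matrix and the response w."""
    A = sum(ei * Ki for ei, Ki in zip(e, Ks)) - omega**2 * M
    return A, np.linalg.solve(A, F)

def newton_identify(e0, w_meas, Ks, M, F, omega, Q, tol=1e-10, max_iter=30):
    e = np.array(e0, dtype=float)
    for k in range(max_iter):
        A, w = forward(e, Ks, M, F, omega)                  # Step 1
        f = Q @ w - w_meas
        if np.linalg.norm(f) <= tol * np.linalg.norm(w_meas):
            return e, k                                     # Step 4: tolerance met
        pseudo = np.column_stack([-Ki @ w for Ki in Ks])    # Step 2: pseudo loads
        J = Q @ np.linalg.solve(A, pseudo)
        e = e + np.linalg.solve(J, -f)                      # Step 3: Newton update
    return e, max_iter

n = 3
Ks = element_matrices(n)
M, Q = np.eye(n), np.eye(n)                 # assumed: unit masses, all nodes measured
F = np.array([0.0, 0.0, 1.0])
omega = 2.0                                 # kept away from the natural frequencies
e_true = np.array([1.0, 0.7, 0.9])          # hypothetical "damaged" stiffness factors
_, w_true = forward(e_true, Ks, M, F, omega)
w_meas = Q @ w_true                         # simulated, noise-free measurement
e_est, n_iter = newton_identify(np.ones(n), w_meas, Ks, M, F, omega, Q)
print(n_iter, e_est)
```

Because the number of measurements equals the number of parameters here, the Jacobian is square and the update is a plain linear solve; with more measurements than parameters a least-squares update would be needed instead.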
12.4.3 Example I: Cantilever Beam
12.4.3.1 Stiffness of Cantilever Beam
In order to verify the technique, the cantilever beam shown in Figure 12.9 is considered. It is discretized into 20 beam elements; therefore, 20 unknown parameters represent the stiffness factors of the elements. The stiffness factor is related to the material constant and/or the second moment of area of the section. The element stiffness factors to be identified are given in Table 12.3 through Table 12.5. The mass density is $\rho = 7.8 \times 10^{3}$ kg/m$^3$, and the second moment of area of the section is $I_z = 0.8 \times 10^{-8}$ m$^4$. The excitation is a time-harmonic load at the free tip of the beam with a frequency of $\omega = 100$ rad/s. The measured deflection amplitudes at the 20 nodes are simulated using computational analysis results for the given true parameters.
FIGURE 12.9 Nonuniform stiffness beam and its finite element model. The beam is discretized into 20 beam elements and the excitation is a time-harmonic load at the free tip of the beam with a frequency of ω = 100 rad/s. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
TABLE 12.3 Element Stiffness Factors for the Cantilever Beam as Shown in Figure 12.9 (Case 1, e0 = 2.1)
Element number    Stiffness factor e
1–5               2.1
6–10              1.5
11–15             2.1
16–20             1.5
Source: Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.
TABLE 12.4 Element Stiffness Factors for the Cantilever Beam as Shown in Figure 12.9 (Case 2, e0 = 2.1)
Element number    Stiffness factor e
1–2               2.1
3–4               1.8
5–6               2.1
7–9               1.05
10–20             2.1
Source: Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.
TABLE 12.5 Element Stiffness Factors for the Cantilever Beam as Shown in Figure 12.9 (Case 3, e0 = 2.1)
Element number    Stiffness factor e
1–2               2.1
3                 1.8
4                 1.5
5                 1.2
6–20              2.1
21–25             1.5
26–50             2.1
Source: Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.
The damage factor $\beta_i^f$ of the ith element is defined as the reduction of the element stiffness and can be obtained from the stiffness factor $e_i$:
\[ \beta_i^f = 1 - \frac{e_i}{e_0} \tag{12.15} \]
where $e_0$ is the undamaged stiffness factor.
• In case 1, a beam with a piecewise uniform stiffness distribution is considered; the true stiffness factors are given in Table 12.3. The iteration is started from an initial guess $\mathbf{e}^{(0)}$ that takes the uniform value of 2.1 (the undamaged stiffness factor) for all the parameters. It converges
[Plot: stiffness factor (0 to 2.5) versus element number (0 to 20)]
FIGURE 12.10 Inversely determined stiffness distribution in elements (Case 1) using Newton’s root finding method. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
to the solution very fast; the results are shown in Figure 12.10 and are in very good agreement with the true values given in Table 12.3. The same results can be obtained for different values of the angular frequency ω.
• In case 2, a beam with two damaged locations is considered. The true stiffness factors are given in Table 12.4. The damage in element 3 and element 4 and in element 7 through element 9 is successfully detected, as shown in Figure 12.11.
• In case 3, the same beam is discretized into 50 elements and the stiffness distribution is represented by 50 parameters. Table 12.5 shows the true element stiffness factors. The results are obtained very quickly and accurately, as shown in Figure 12.12. The results also clearly indicate the damage status, or the stiffness distribution, of the beam in terms of the stiffness factor.
The examples show that the inverse technique is suitable for problems with large numbers of parameters to be identified. It takes only seconds of CPU time to obtain a very accurate result for the beam structure considered. The technique can be applied accurately to damage detection problems involving multiple distributed defects with arbitrary degrees of damage. However, as with any gradient-based algorithm, the initial guess affects the progress of the iteration. For suitable initial guesses, the identical result is obtained; otherwise, the solution converges to a local minimum. It should also be pointed out that the frequency of the exciting force should not be too close to a natural frequency of the structure: because no damping terms are considered, Equation 12.12 becomes singular in such a case and the inverse procedure fails.
[Plot: stiffness factor (0 to 2.5) versus element number (0 to 20)]
FIGURE 12.11 Inversely determined stiffness distribution in elements (Case 2) using Newton’s root finding method. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
[Plot: stiffness factor (0 to 2.5) versus element number (0 to 50)]
FIGURE 12.12 Inversely determined stiffness distribution in elements (Case 3) using Newton’s root finding method. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
12.4.3.2 Performance Comparison with µGA
In order to compare the performance of the direct root searching method with that of the genetic algorithm, the same beam as shown in Figure 12.9 is re-examined. A µGA is first applied to the stiffness factor identification (damage detection). The population size is 5 in each generation, and the probability of uniform crossover is set to 0.5. The objective function (Equation 12.6) is employed for fitness evaluation, with the harmonic response of deflection taken as input. The stiffness factors of the 20 elements are taken as the parameters to be identified. In the GA, these parameters must be discretized according to the accuracy needed. When all the parameters are discretized into eight grades in the range of 0.63 to 2.1, the discrete search space contains $2^{60}$ ($\approx 1.15 \times 10^{18}$) candidates. With such a great number of possibilities, the CPU time is excessively long because of the random nature of GAs and the time-consuming forward analysis of the FEM code. In order to decrease the number of parameters, it is assumed that the beam has only one damaged location, which lies within the first four elements. The degrees of damage of these elements are discretized into eight grades, thus decreasing the discrete search space to $2^{12}$ (4096) candidates and making the GA search possible.
As an example, the beam with one damaged location shown in Figure 12.13 is considered. The damage is located in element 3 and element 4, with stiffness reduction factor $\beta^f = 0.5$; it takes 30 generations to obtain the solution. The CPU time consumed is 40 seconds. Using Newton’s root finding technique to solve the same problem takes only about 1.5 seconds. This simple test demonstrates the efficiency of Newton’s root finding technique compared with GAs for problems with multiple continuous variables as parameters. Another advantage of the technique over GAs is that the accuracy of GAs is limited by the resolution with which the parameters are discretized. To increase the accuracy of GAs, the parameters must be discretized into more grades, and more generations are then required to search for the solution.
FIGURE 12.13 Beam with one damaged location. The beam is discretized into 20 elements; the damage is located in element 3 and element 4. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
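The sizes of the two discrete search spaces quoted in the comparison with the µGA follow directly from the number of grades per parameter. The short check below is only a worked calculation; the variable names are illustrative.

```python
GRADES = 8  # discrete levels per stiffness factor (3 bits each)

full_space = GRADES ** 20      # all 20 element stiffness factors are unknown
reduced_space = GRADES ** 4    # damage restricted to the first four elements

print(full_space == 2 ** 60, f"{full_space:.3e}")
print(reduced_space == 2 ** 12, reduced_space)
```

With the timings reported above (40 seconds for the µGA on the reduced problem against about 1.5 seconds for Newton’s method), the ratio is roughly 27 to 1 even after the search space has been cut to 4096 candidates.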
12.4.3.3 Noise-Contaminated Case
The efficiency and accuracy of the technique have been demonstrated through the preceding examples without considering random measurement errors. To study the effect of measurement noise on parameter identification, Gaussian noise with zero mean and constant standard deviation is added to the measured values. Consider again the three beam cases investigated in Section 12.4.3.1. With this noise added to the measured responses, a good result is obtained for case 2, as shown in Figure 12.14. For the other cases, the method fails to give good results, implying that Newton’s root finding method is very sensitive to noise.
[Plot: stiffness factor (0 to 2.5) versus element number (0 to 20)]
FIGURE 12.14 Inversely determined stiffness distribution in elements (Case 2 with measurement noise) using Newton’s root finding method. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
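The noise contamination described in Section 12.4.3.3 can be sketched as follows. The text specifies only zero mean and a constant standard deviation; scaling that standard deviation as a fraction of the root-mean-square amplitude of the clean response is an assumption made here for illustration, as are the function name and the sample data.

```python
import math
import random

def add_gaussian_noise(response, level, seed=None):
    """Return response plus zero-mean Gaussian noise.

    Assumption: the constant standard deviation is level * RMS(response).
    """
    rng = random.Random(seed)
    rms = math.sqrt(sum(x * x for x in response) / len(response))
    sigma = level * rms
    return [x + rng.gauss(0.0, sigma) for x in response]

# Hypothetical deflection amplitudes at 20 nodes (not data from the book)
clean = [0.1 * j for j in range(1, 21)]
noisy = add_gaussian_noise(clean, level=0.05, seed=0)
print(max(abs(a - b) for a, b in zip(clean, noisy)))
```

Fixing the seed makes a noisy data set reproducible, which helps when comparing how different inverse procedures respond to the same contaminated measurements.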
12.4.4 Example II: Plate
Newton’s root finding method is extended to plate structures modeled with finite elements. The plate is simply supported, and several elements have reduced stiffness, as shown in Figure 12.15. The plate is divided into 100 eight-node isoparametric quadratic plate elements. The numerically simulated nodal deflection response to a harmonic excitation is used as the measurement. The true element stiffness factors are given in Table 12.6. Using the inverse procedure described previously, the stiffness factors of the plate elements are identified accurately, as shown in Figure 12.16.
To apply the method to practical problems, some considerations and modifications are required. First, the forward analysis model of the structural system should be constructed carefully so that it simulates the practical system as accurately as possible; otherwise, corrections should be made for the difference between the simulated and practical responses. For example, damping terms and the support stiffness at the boundary should be considered. Second, measurement error must be taken into account. To do so, the Gauss–Newton method should be used, with more measured data than parameters (an over-posed problem; see Chapter 2). The Gauss–Newton method estimates the parameters by minimizing the least squares of the error norm and is expected to be more robust to random measurement errors.
FIGURE 12.15 Simply supported square plate modeled using 100 eight-node isoparametric quadratic elements. The six elements deducted in stiffness are shaded; actual element stiffness factors are listed in Table 12.6. A time-harmonic load with a frequency of ω = 3000 rad/s is applied at the center of the plate. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
TABLE 12.6 Element Stiffness Factors for the Plate Simply Supported as Shown in Figure 12.15
Element number    Stiffness factor e    Damage factor β
27–28             1.5                   0.286
52                1.6                   0.238
53                1.8                   0.143
62                1.2                   0.429
63                1.5                   0.286
Remainder         2.1                   0
Source: Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.
FIGURE 12.16 Damage factor identified for the plate using Newton’s root finding method. (From Liu, G.R. and Chen, S.C., J. Sound Vib., 254(5), 823–835, 2002. With permission.)
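The damage factors listed in Table 12.6 follow from the stiffness factors through Equation 12.15 with an undamaged stiffness factor of 2.1. The snippet below is a small worked check of that relation; the dictionary layout and names are illustrative only.

```python
E0 = 2.1  # undamaged stiffness factor

def damage_factor(e_i, e0=E0):
    """Equation 12.15: relative reduction of the element stiffness."""
    return 1.0 - e_i / e0

# Stiffness factors from Table 12.6, keyed by element number(s)
table_12_6 = {"27-28": 1.5, "52": 1.6, "53": 1.8, "62": 1.2, "63": 1.5, "remainder": 2.1}
betas = {elems: round(damage_factor(e), 3) for elems, e in table_12_6.items()}
print(betas)
```

The same relation can be inverted, e_i = e0 (1 − β), to convert an identified damage factor back into a stiffness factor.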
FIGURE 12.17 Finite element model of a cantilever beam with damage, divided into 20 elements. The damage is assumed at element 4 and element 5 and at element 9 through element 11, as marked by the shaded areas.
12.5 Use of Levenberg–Marquardt Method
The Levenberg–Marquardt method (described in Chapter 4) is employed for flaw detection in the cantilever beam shown in Figure 12.17. The beam is discretized into 20 beam elements. The damage is located at element 4 and element 5, with damage factor $\beta = 14.3\%$, and at element 9 through element 11, with damage factor $\beta = 28.6\%$. There are 20 unknown parameters that represent the stiffness factors of the elements. The mass density, Young’s modulus, and Poisson’s ratio are $\rho = 7.8 \times 10^{3}$ kg/m$^3$, $E = 2.1 \times 10^{11}$ N/m$^2$, and $\nu = 0.3$, respectively, and the second moment of area of the section is $I_z = 0.8 \times 10^{-8}$ m$^4$. The element stiffness factor before damage is taken to be 1 as the reference; equivalently, the element damage factor before damage is 0. The excitation is a time-harmonic load at the free tip of the beam with frequencies of $\omega_1 = 100$ rad/s and $\omega_2 = 220$ rad/s. The deflection amplitudes at 16 selected nodes (marked by dots) are used for the inverse analysis; these measurements are simulated by forward analysis using the true parameters corresponding to the damage status given earlier. The initial element stiffness factors are taken to be 0.48 for all the elements. The results obtained using the Levenberg–Marquardt method are shown in Figure 12.18, where the damage locations are clearly indicated and the damage factors are in good agreement with the true values.
To simulate practical measurements, Gaussian noise with zero mean and constant standard deviation is added to the computed responses. Using the cantilever beam as an example, different levels of Gaussian noise are considered. The results are shown in Figure 12.19 to Figure 12.21 for noise levels from 2 to 10%. These figures show that the damage locations are clearly indicated at element 4 and element 5 and at element 9 through element 11, while the degrees of damage agree well with the true values.
The results are thus stable and robust despite the random measurement noise, for noise levels up to 10%. The same beam with two ends fixed is considered next, as shown in Figure 12.22. The size and damage status of the beam are the same as those of the cantilever beam. In this case, the harmonic load is applied at the center of the beam. The frequencies of the load adopted are $\omega_1 = 100$ rad/s and
[Plot: damage factor (0 to 0.35) versus element number (0 to 20); legend: without noise]
FIGURE 12.18 Detected damage factors for the cantilever beam using simulated noise-free measurements using the Levenberg–Marquardt method.
[Plot: damage factor (0 to 0.4) versus element number (0 to 20); legend: 2% noise]
FIGURE 12.19 Detected damage factors for the cantilever beam using simulated measurements with 2% noise using the Levenberg–Marquardt method.
[Plot: damage factor (0 to 0.35) versus element number (0 to 20); legend: 5% noise]
FIGURE 12.20 Detected damage factors for the cantilever beam using simulated measurements with 5% noise using the Levenberg–Marquardt method.
[Plot: damage factor (0 to 0.4) versus element number (0 to 20); legend: 10% noise]
FIGURE 12.21 Detected damage factors for the cantilever beam using simulated measurements with 10% noise using the Levenberg–Marquardt method.
FIGURE 12.22 Beam with two ends fixed.
[Plot: damage factor (0 to 0.35) versus element number (0 to 20)]
FIGURE 12.23 Damage status detected in the fix–fix beam using simulated noise-free measurements using the Levenberg–Marquardt method.
$\omega_2 = 220$ rad/s. Only the deflections at the selected nodes indicated are measured and used for the inverse analysis. Starting from initial trial parameters of 1.0 for all element stiffness factors (that is, without any damage initially), the result shown in Figure 12.23 is obtained. The damage locations and damage factors are in good agreement with the true values.
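The use of responses at two excitation frequencies in a damped least-squares update can be sketched on a toy problem. The example below is an illustration only: it uses a common textbook form of the Levenberg–Marquardt step, (JᵀJ + λI) Δ = −Jᵀr with a simple accept/reject rule for λ, which may differ in detail from Equation 4.78. A fixed-free chain of three springs with unit masses replaces the beam model, all nodes are assumed measured, the data are noise-free, and all names and values are hypothetical.

```python
import numpy as np

def element_matrices(n, k0=1.0):
    """Unit-factor stiffness matrix of each spring in a fixed-free chain."""
    mats = []
    for i in range(n):
        Ki = np.zeros((n, n))
        Ki[i, i] += k0
        if i > 0:
            Ki[i - 1, i - 1] += k0
            Ki[i - 1, i] -= k0
            Ki[i, i - 1] -= k0
        mats.append(Ki)
    return mats

def residual_and_jacobian(e, Ks, M, F, omegas, w_meas):
    """Stack residuals and Jacobians over all excitation frequencies."""
    r_blocks, J_blocks = [], []
    for omega, wm in zip(omegas, w_meas):
        A = sum(ei * Ki for ei, Ki in zip(e, Ks)) - omega**2 * M
        w = np.linalg.solve(A, F)
        dw = np.linalg.solve(A, np.column_stack([-Ki @ w for Ki in Ks]))
        r_blocks.append(w - wm)
        J_blocks.append(dw)
    return np.concatenate(r_blocks), np.vstack(J_blocks)

def levenberg_marquardt(fun, x0, lam=1e-2, max_iter=50, tol=1e-20):
    x = np.array(x0, dtype=float)
    r, J = fun(x)
    cost = 0.5 * float(r @ r)
    history = [cost]
    for _ in range(max_iter):
        step = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -J.T @ r)
        r_new, J_new = fun(x + step)
        cost_new = 0.5 * float(r_new @ r_new)
        if cost_new < cost:               # accept the step and relax the damping
            x, r, J, cost = x + step, r_new, J_new, cost_new
            lam = max(lam / 10.0, 1e-12)
        else:                             # reject the step and increase the damping
            lam *= 10.0
        history.append(cost)
        if cost < tol:
            break
    return x, history

n = 3
Ks = element_matrices(n)
M = np.eye(n)
F = np.array([0.0, 0.0, 1.0])
omegas = (0.05, 0.10)                       # two frequencies below the first resonance
e_true = np.array([1.0, 0.857, 0.714])      # damage factors of 0, 14.3% and 28.6%
w_meas = [np.linalg.solve(sum(ei * Ki for ei, Ki in zip(e_true, Ks)) - om**2 * M, F)
          for om in omegas]
fun = lambda e: residual_and_jacobian(e, Ks, M, F, omegas, w_meas)
e_est, history = levenberg_marquardt(fun, np.ones(n))
print(e_est, history[-1])
```

The damping term keeps the update well defined when the stacked Jacobian is poorly conditioned, which is one reason a damped method can tolerate noise better than a plain Newton update; the choice of the initial λ and its update rule remains problem dependent.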
12.6 Remarks
• Remark 12.1: Two approaches are introduced for the characterization of flaws in structures; in both, the finite element model is used for forward analysis. The first approach is parameter identification to determine the presence, location, size, and degree of flaws in the core layer, with a GA used for inverse analysis. A uniform µGA has been employed to detect the location, area, and degree of the flaw in the core layer of sandwich structures from the time-harmonic response of the structure. The characteristics of the flaw are represented at the element level by a set of discrete and continuous parameters. The search efficiency of the GA is improved greatly by using a two-stage searching method.
• Remark 12.2: The second approach relates the flaw to the element stiffness factors, which can be formulated in a linear form in the system equation, leading to a very efficient way to compute the gradients. Newton’s root finding method is applied to find the inverse solution. It converges to the true solution much faster than GAs. The method is very accurate and efficient for problems with a large number of parameters; however, it is sensitive to measurement noise.
• Remark 12.3: The Levenberg–Marquardt method is successfully applied to damage assessment of a finite element model using time-harmonic responses. The examples demonstrate that the location and degree of damage can be identified simultaneously with a large number of parameters. However, the success of the identification depends strongly on the value of the damping factors (see Equation 4.78), which need to be selected properly. The choice of the initial values of the parameters is another issue.
• Remark 12.4: Although not tested with an example in this chapter, a combined procedure using the GA and the Levenberg–Marquardt method is expected to work well because the GA can be used to find a good initial guess, as demonstrated in Section 8.5, Section 9.6, and Section 11.6.
13 Other Applications
Chapter 7 through Chapter 12 presented a number of computational inverse techniques using elastic waves propagating in composite structures or dynamic responses of structures. Practical complex nondestructive evaluation problems of crack detection, force function reconstruction, and material property identification have been examined in detail. In this chapter, several further applications to engineering problems are presented. Section 13.1 gives the coefficient identification for an electronic cooling system, Section 13.2 identifies the material parameters of a printed circuit board, and Section 13.3 identifies the material property of thin films. Crack detection in structures using integral strain measured by optical fibers is discussed in Section 13.4; Section 13.5 detects the flaw in a truss structure. Section 13.6 introduces computational methods for predicting protein structures, Section 13.7 introduces an algorithm for fitting interatomic potentials using molecular dynamics and the IP-GA, and Section 13.8 identifies the dynamic flow-pressure characteristics in a valveless micropump. These applications give a broad view of the range of problems to which inverse techniques can be applied. Some of the forward solvers used in this chapter are very expensive because of the complexity of real-life systems; a single run can take days or even weeks. Therefore, reducing the number of parameters to be inversely identified is very important, and improving the efficiency of the inverse procedure is also crucial for practical applications.
13.1 Coefficient Identification for Electronic Cooling Systems
In situ evaluation of a cooling system is important to modern electronic systems. Experimental approaches have been widely used in practice to understand how these cooling systems work, and empirical approaches are utilized in their design. One disadvantage of such approaches is the costly and time-consuming process of establishing the empirical formulations. In addition, the experiments need to be carried out again whenever the system is modified, which happens very often in today’s competitive environment. Moreover, an in situ test often lacks flexibility when a sensitivity study is required at the design stage. For example, it is quite difficult to determine experimentally the most suitable locations of fans and vents when an electronic system package is designed.
In designing any cooling system, the thermal properties of the materials are always very important. Heat transfer coefficients have been identified using the temperature distributions in an electronic cooling system (Truffart et al., 1993; Blanc et al., 1996; Liu et al., 2002h; Zhou, 2001). Computational inverse techniques, including the golden section search method for one-variable problems and improved genetic algorithms and NNs for multivariable problems, have been applied as the inverse procedures. The commercial codes ESC and TMG in the I-deas master series package (I-deas, 1995) are used to perform the forward computation.
In electronic system cooling analysis, both fluid flow and heat transfer must be analyzed. The I-deas software provides two solvers, ESC and TMG, for fluid flow and heat transfer, respectively. The flow solver deals with the nonlinear, coupled partial differential equations for conservation of mass, energy, and momentum in general three-dimensional geometry. It uses an element-based finite volume method (FVM) and a coupled algebraic multigrid method to discretize and solve the governing equations. The physical models include laminar or turbulent incompressible flow, natural convection, and general boundary conditions for fluid flow and heat transfer in ducts and enclosures in the electronic cooling system.
13.1.1 Using the Golden Section Search Method
The golden section search method introduced in Chapter 4 is applied to a one-variable identification problem of the cooling system.
13.1.1.1 Natural Convection Problem
Natural convection occurs in a cage containing 10 cards, as shown in Figure 13.1. Vents are located symmetrically near the top and near the bottom of the housing. A high-power-dissipation chip is mounted on one of the cards. The temperature distribution at some points on the chip and its board is obtained by measurement. The golden section search method is applied to inversely determine the heat transfer coefficient between the chip and the card on which the chip is mounted.
Natural convective cooling in such systems requires special attention from the analyst. The external heat transfer from the sheet metal housing to the environment should be taken into account, so the heat convection and radiation from the housing to the environment should be modeled. One further level of complexity is added by modeling the conduction interface between the cards and the card guides and between the card guides and the housing. Taking advantage of the symmetry of the system by imposing symmetric boundary conditions in I-deas ESC, the model size can be reduced, which saves modeling effort and solution time. This significantly improves the efficiency of the forward solver, which is very important for the inverse analysis.
[Figure labels: Vent, Card]
FIGURE 13.1 Natural convection occurs in a cage containing 10 cards. Vents are located symmetrically at upper top and lower bottom of the housing.
13.1.1.2 Numerical Results
The golden section search algorithm is applied in the inverse analysis, with I-deas as the forward solver, using the one-quarter model of the natural convection system shown in Figure 13.1. The boundary conditions applied are:
• The heat load of the five printed circuit boards (PCBs) is 48 W. (The total heat load of the 10 PCBs is 48 × 4 = 192 W.)
• The heat load of the high-power-dissipation chip is 1.28 W.
• The vent boundary conditions are defined at the top and bottom vent areas.
• Convection from the steel housing to the environment, whose temperature is defined as 20°C, is also set up; the heat transfer coefficient is 5 W/(°C·m²).
• Radiation from the steel housing to the environment is modeled with a radiative coefficient equal to 1.
• Because the temperature of the PCBs at both sides of the PCB array is much higher than that of the steel housing, the calculation should consider the radiation from these PCBs to the steel housing. That radiation is modeled with a radiative coefficient equal to 1.
The heat transfer between the high-power-dissipation chip and the board to which it is attached is modeled as a thermal coupling boundary condition. The heat transfer coefficient of the thermal coupling is identified inversely from the measured temperature distribution. The temperatures at 27 nodes located in the most sensitive area of the chip and its board are chosen for constructing the fitness (error) function, given in the L2 norm as:
\[ f_{err} = \sum_{i=1}^{n_s} \left( T_i^m - T_i^c \right)^2 \tag{13.1} \]
where $n_s$ is the number of points at which the temperatures have been recorded. The number $n_s$ should be sufficiently larger than the number of coefficients to be inversely identified (which is 1 in this study) so that the problem is over-posed. In this problem, $n_s = 27$. $T_i^c$ is the temperature at the ith point from the I-deas calculation, and $T_i^m$ is the measured temperature at the ith point. In real applications of electronic product development, $T_i^m$ can be measured from the system. In this study, $T_i^m$ is simulated by I-deas using the actual heat transfer coefficient in place of an actual measurement. For the golden section search algorithm shown in Figure 4.1, the convergence value is set to 0.001. The range of the coefficient is [100, 9000]. A simple interface program, golden.prg, is coded using the I-deas developing language, and the search is performed using I-deas. The convergence process is listed in Table 13.1. A very good result is obtained after only 14 calculations.
TABLE 13.1 Convergence Process of the Inverse Search for Heat Transfer Coefficient Using Golden Section Search Method
Calculation Number    Heat Transfer Coefficient    Convergence Value
1                     100.00                       58,421.52
2                     9000.00                      210.40
3                     3499.50                      83.78
4                     5600.50                      153.95
5                     2201.00                      17.1158
6                     1398.49                      12.1776
7                     902.51                       219.0022
8                     1705.02                      0.0651
9                     1894.47                      3.8943
10                    1587.94                      1.0143
11                    1777.39                      0.9102
12                    1660.30                      0.0424
13                    1632.66                      0.2537
14                    1677.38                      0.0007
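The sequence of trial coefficients in Table 13.1 is what a standard golden section search produces on the interval [100, 9000]. The sketch below is not the golden.prg program; it is a minimal illustration in which the expensive forward solver is replaced by a hypothetical quadratic error function with its minimum at 1680 (a value chosen for illustration, not taken from the text). It assumes that the two end points are evaluated first, as the table suggests, and that the search stops once an error value falls below the convergence value of 0.001.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0      # 0.618..., the golden section ratio

def golden_section_search(f, a, b, f_tol=1e-3, max_evals=100):
    """Minimize a unimodal f on [a, b]; return ((x_best, f_best), history)."""
    history = [(a, f(a)), (b, f(b))]        # end points are evaluated first
    x1 = b - INV_PHI * (b - a)
    x2 = a + INV_PHI * (b - a)
    f1, f2 = f(x1), f(x2)
    history += [(x1, f1), (x2, f2)]
    while min(fx for _, fx in history) >= f_tol and len(history) < max_evals:
        if f1 < f2:                         # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - INV_PHI * (b - a)
            f1 = f(x1)
            history.append((x1, f1))
        else:                               # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + INV_PHI * (b - a)
            f2 = f(x2)
            history.append((x2, f2))
    return min(history, key=lambda p: p[1]), history

def surrogate_error(h):
    """Hypothetical stand-in for the error function of Equation 13.1."""
    return 1.0e-4 * (h - 1680.0) ** 2

(best_h, best_err), history = golden_section_search(surrogate_error, 100.0, 9000.0)
for number, (h, err) in enumerate(history, start=1):
    print(number, round(h, 2), err)
```

Each iteration after the first four evaluations costs a single new forward analysis, because one interior point is always reused; this is what makes the method attractive when one run of the forward solver is expensive.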
FIGURE 13.2 Distribution of air velocity computed using the inversely identified heat transfer coefficient.
Using the identified coefficient (1677.38 W/(°C·m²)), the natural convection cooling system is then simulated using I-deas. The distributions of the air velocity and of the temperature on the structure are shown in Figure 13.2 and Figure 13.3, respectively, and the distribution of the air temperature at cutting planes is shown in Figure 13.4. The detailed temperature comparison is listed in Table 13.2. The computed temperatures listed in the third column agree well with the simulated measurements, obtained using the actual coefficient, in the second column.
13.1.1.3 Summary
The golden section search method is a useful optimization method for problems with only one parameter. In this application, the golden section search algorithm and its program, golden.prg, are developed and interfaced with I-deas using its development language. This algorithm is used to inversely determine a heat transfer coefficient for the cooling analysis of an electronic system with natural convection. The commercial software I-deas ESC is employed as the forward solver to determine the temperature
FIGURE 13.3 Distribution of structural temperature computed using the inversely identified heat transfer coefficient.
distribution for a given coefficient; the golden section search method has found the heat transfer coefficient with good accuracy. The calculation shows that this procedure converges very quickly for problems of one variable and that no gradient information of the error function is required.
13.1.2 Using GAs
Heat transfer coefficients between contacting surfaces are now to be inversely identified using GAs. In these parameter identification problems, the parameters could be the heat transfer coefficient between the motherboard and the CPU chip, through which the high-density heat of the CPU chip is dissipated, or the heat convection coefficient between the bottom PCB and the duct water flow at the edges of the PCB. Temperatures on the chip and motherboard are sampled for the inverse analysis (Liu et al., 2002h).
FIGURE 13.4 Distribution of air temperature at two sectional planes computed using the inversely identified heat transfer coefficient.
13.1.2.1 Forward Modeling
In electronic system cooling, most heat generated in the chips transfers to the printed circuit board on which the chips are mounted. However, some heat is convected directly from the chip surfaces to the air. Therefore, the forward solver should be able to handle both heat conduction and heat convection. Heat conduction is modeled based on the finite difference method, which can be tightly coupled with a flow solver that uses the finite volume method for heat convection simulation. Energy is transferred at the fluid–solid interface between the domains simulated by these two solvers. Control volumes are established on the convection faces, and the convective rate is calculated from the heat conduction model to the faces of the flow model. As shown in Figure 13.5, the heat flow through a contact area can be expressed as:
$$q = hA(T_1 - T_2) \qquad (13.2)$$
TABLE 13.2 Comparison of Temperature Computed Using Identified Heat Transfer Coefficient with that of Simulated Measurements
Node Number    “Measured” Temperature    Computed Temperature
1881           71.602                    71.610
1880           72.169                    72.176
1879           72.456                    72.464
1878           72.713                    72.721
1877           72.590                    72.597
1876           72.619                    72.626
1875           72.306                    72.314
1869           72.233                    72.241
1868           71.620                    71.629
1865           72.404                    72.412
1862           72.588                    72.595
1858           72.555                    72.562
260            63.509                    63.509
245            57.574                    57.574
244            60.352                    60.352
243            62.516                    60.515
240            62.282                    62.282
239            64.160                    64.160
238            63.497                    63.497
237            60.195                    60.195
236            62.775                    62.775
235            64.758                    64.758
234            63.848                    63.848
233            63.307                    63.307
109            58.338                    58.338
108            60.977                    60.977
107            62.700                    62.700
FIGURE 13.5 Heat transfer at interfaces. Labels in the figure: chip dissipates heat; heat convected to air; filling material; PCB; heat transfer from chip to PCB. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
where A is the contact surface area and h is the heat transfer coefficient. T1 and T2 are the temperatures at the two contacting surfaces, and q is the transferred heat. Although the temperatures of the chips and printed circuit board can be measured, h is unknown. Once h is identified, the heat convection can be determined using Equation 13.2 without solving the fluid–thermal coupling problem again in future designs. The modeling of heat transfer using Equation 13.2 is also applicable to interfaces between two components, such as IC-chip/PCB, PCB/water-duct, etc.
13.1.2.2 Inverse Analysis of a PCB Board
Figure 13.6 gives the flowchart for the identification of parameters in an electronic system cooling analysis. In the procedure, the I-deas ESC module is used for the forward calculation. A small model shown in Figure 13.7 is used as an example to study the feasibility of the procedure. The model contains 330 shell elements (to simulate structures like PCBs) and 30 water material beam elements (to simulate water flow). Two small surfaces dissipate heat into the motherboard. The heat coefficients for the two small interfaces
FIGURE 13.6 Flowchart for the inverse procedure for thermal coefficient identification (forward solver: I-deas, inverse operator: µGA). GA side (GA code in FORTRAN): GA initialization; produce an initial generation of 5 individuals randomly; get two coefficients from each individual; get temperature values from I-deas; calculate fitness; stop? (yes: end; no: selection, crossover, mutation, generate 5 individuals for next generation). I-deas side (I-deas developing language): get two coefficients from GA; update boundary condition; call I-deas ESC solver; get results from solver; output temperature data for GA input. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
FIGURE 13.7 Simple model for a two-interface heat transfer problem. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
are expressed by h1 and h2, which are to be identified. The heat is conducted to the two edges of the surface and taken away by the water inside the water pipe. To minimize the error function defined by Equation 13.1, the uniform µGA method is employed in this procedure. The µGA uses a population size of 5, tournament selection, no mutation, niching, elitism, and uniform crossover with pcross = 0.5. Both coefficients are assumed to lie within the range of 400 to 900. The best result is achieved at the 63rd generation, which gives h1 = 660.51 and h2 = 587.19 with an error function value of 1.76. Although convergence of the GA is fast for this simple model, the forward computation time is still very large because of the coupled nature of the problem. A single I-deas run takes about 18 hours in total on an SGI Indigo2 (IMPACT 10000) workstation. Detailed analysis reveals that convergence becomes much slower as the error decreases and the identified coefficients approach the true values. This is a typical feature of GAs. The SR-GA introduced in Section 5.7 has therefore also been employed in this application. The combinations of the two SR-GA parameters (M and α) shown in Figure 13.8 all converge much faster than the plain µGA. The advantage of the SR-GA over the plain GA is the reduction of the search space, which saves computational cost; this is very important for expensive forward solvers.
13.1.2.3 A Complex Example
A more complicated electronic system model is studied here to examine the feasibility of the inverse procedure further. This system includes one fan, one vent, and five printed circuit boards. A CPU is installed on PCB 1 (Figure 13.9). In addition to fan cooling, water duct flow cooling is introduced to prevent extreme temperature conditions. The water duct is used so that more heat is transferred by convection from the bottom PCB to the water.
Two coefficients to be identified are the interface coefficient between the CPU and PCB 1 and a
FIGURE 13.8 Comparison of the convergence processes of using µGAs and SR-GAs for the identification of cooling system. The plot shows the error function for the µGA and for SR-GAs with M = 3, α = 0.1; M = 2, α = 0.2; and M = 1, α = 0.3. (M: number of best individuals used for determining the reduced search domain; α: the factor used to ensure that a certain fraction of the previous domain be added into the new domain.) It is found that the combination M = 3, α = 0.1 is the best for this problem. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
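The roles of M and α described in the caption can be illustrated with a small sketch of one search-domain reduction step. The exact SR-GA update rule is given in Section 5.7 and is not reproduced here; the function `reduce_domain` below is a hypothetical reading of the caption in which the new domain is the bounding box of the M best individuals, widened on each side by α times the previous domain width and clipped to the original bounds.

```python
def reduce_domain(population, errors, prev_bounds, orig_bounds, m=3, alpha=0.1):
    """Return reduced (lo, hi) bounds per parameter (assumed rule, see lead-in)."""
    # Rank individuals by error and keep the m best.
    ranked = sorted(zip(errors, population), key=lambda pair: pair[0])
    best = [individual for _, individual in ranked[:m]]

    new_bounds = []
    for k, ((p_lo, p_hi), (o_lo, o_hi)) in enumerate(zip(prev_bounds, orig_bounds)):
        lo = min(individual[k] for individual in best)
        hi = max(individual[k] for individual in best)
        margin = alpha * (p_hi - p_lo)        # fraction of the previous domain
        new_bounds.append((max(o_lo, lo - margin), min(o_hi, hi + margin)))
    return new_bounds


# Illustrative population of 5 individuals (h1, h2) with made-up error values.
population = [(660.0, 590.0), (650.0, 600.0), (700.0, 580.0),
              (450.0, 850.0), (880.0, 420.0)]
errors = [1.0, 2.0, 3.0, 50.0, 60.0]
bounds = [(400.0, 900.0), (400.0, 900.0)]
reduced = reduce_domain(population, errors, bounds, bounds, m=3, alpha=0.1)
```

With these inputs the domain shrinks from 400 to 900 in each parameter to (600, 750) and (530, 650), so subsequent generations spend their forward solves in a much smaller region.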
FIGURE 13.9 Cooling system for a PCB box (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
multiplier in the forced convection heat transfer between PCB and duct water flow. In the I-deas TMG module, this multiplier is used to simulate the convection through fins. Figure 13.10 shows the finite element mesh for forward computation; it includes 280 shell elements, 45 beam elements, and 8006 air fluid elements. The fan is defined by air velocity and the fan cover is meshed by shell elements. All PCBs have different heat loads, especially the PCB in the middle of the assembly, which has a concentrated heat load due to the CPU. Forced air convection is defined by two-dimensional thin shell elements. The
FIGURE 13.10 Finite element model for the PCB. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
small components attached on the PCBs are simulated by surface roughness, which can be specified on the convection surfaces, thereby creating drag on the surrounding fluid flow. Two kinds of heat convection are in the model: PCBs to airflow and PCBs to water duct flow. The beam element is used to represent the forced water convection. The section of the beam element and the velocity of the water flow define the pump characteristics. Boundary conditions are specified as
• Air velocity through the fan is 8 m/s.
• Heat loads of the five PCBs are 60, 10, 5, 6, and 5 W, respectively.
• The heat load of the CPU is 6 W.
• The roughness of the flow surfaces is defined as 1 mm.
• The water velocity through the pump is 0.2 m/s.
The plain µGA and the SR-GA are applied to this problem. The maximum number of generations is limited to 150 for both methods, and the ranges of the two coefficients are given as 0.8 to 2.5 and 1000 to 3000, respectively. The plain µGA needs a total of 76 hours on an SGI Indigo2 (IMPACT 10000) workstation to locate these two parameters. The best result is obtained at the 137th generation; the multiplier is found to be 1.80, and the heat transfer coefficient is 2000.98. The error function reaches 0.0. Figure 13.11 compares the convergence of the µGA and the SR-GA, which reduces the search domain during the search; the convergence rate increases by about 50%. The effect of the SR-GA is the same as in the simple example given in Section 13.1.2.2.
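The plain µGA settings quoted in Section 13.1.2.2 (population of 5, tournament selection, elitism, uniform crossover with pcross = 0.5, no mutation) can be sketched as follows. This is an illustrative sketch, not the FORTRAN code used in the study: the binary encoding with 12 bits per parameter, the restart threshold, and the quadratic `toy_error` are assumptions, and niching is omitted.

```python
import random

def decode(bits, bounds, nbits):
    """Map a bit string (list of 0/1) to real parameters, nbits per parameter."""
    params = []
    for k, (lo, hi) in enumerate(bounds):
        chunk = bits[k * nbits:(k + 1) * nbits]
        level = int("".join(str(b) for b in chunk), 2)
        params.append(lo + (hi - lo) * level / (2 ** nbits - 1))
    return params

def micro_ga(error, bounds, nbits=12, pop_size=5, p_cross=0.5,
             generations=150, seed=0):
    rng = random.Random(seed)
    length = nbits * len(bounds)

    def random_individual():
        return [rng.randint(0, 1) for _ in range(length)]

    pop = [random_individual() for _ in range(pop_size)]
    best, best_err = None, float("inf")
    for _ in range(generations):
        errs = [error(decode(ind, bounds, nbits)) for ind in pop]
        i_best = min(range(pop_size), key=errs.__getitem__)
        if errs[i_best] < best_err:
            best, best_err = pop[i_best][:], errs[i_best]

        # Restart: if the population has nearly collapsed onto the elite,
        # keep the elite and redraw the other individuals at random.
        differing = sum(a != b for ind in pop for a, b in zip(ind, best))
        if differing < 0.05 * length * pop_size:
            pop = [best[:]] + [random_individual() for _ in range(pop_size - 1)]
            continue

        def tournament():
            i, j = rng.sample(range(pop_size), 2)
            return pop[i] if errs[i] <= errs[j] else pop[j]

        children = [best[:]]                      # elitism
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Uniform crossover: each bit comes from p2 with probability p_cross.
            children.append([b if rng.random() < p_cross else a
                             for a, b in zip(p1, p2)])
        pop = children                            # no mutation
    return decode(best, bounds, nbits), best_err


# Hypothetical error surface standing in for the expensive forward solver.
def toy_error(p):
    return (p[0] - 660.51) ** 2 + (p[1] - 587.19) ** 2

params, err = micro_ga(toy_error, [(400.0, 900.0), (400.0, 900.0)])
```

Because there is no mutation, new genetic material enters only through the random restarts, which is the usual way a very small population avoids premature convergence.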
FIGURE 13.11 Convergence comparison of the SR-GA with the µGA for the identification problem of PCB. The plot shows error versus generation for the µGA and the SR-GA. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
FIGURE 13.12 Velocity distribution of airflow in the PCB computed using the inversely identified parameters. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
The temperature distribution is computed using the identified coefficients. Figure 13.12 gives the distribution of air velocity, and the temperature distribution on the structure is given in Figure 13.13. The air temperature at cutting planes is shown in Figure 13.14. They are all reasonable under the working conditions of PCBs.
13.1.2.4 Summary
An inverse procedure using the GA is presented to determine heat transfer coefficients in electronic system cooling analysis. This procedure uses the commercial software I-deas (ESC and TMG) as the forward solver to compute
FIGURE 13.13 Temperature distribution of the PCB computed using the inversely identified parameters. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
FIGURE 13.14 Temperature distribution of air on cutting planes computed using the inversely identified parameters. (From Liu, G.R. et al., Comput. Struct., 80, 23–31, 2002. With permission.)
the temperature and stress distributions. It is believed that this procedure can be used not only in electronic system cooling simulation, but also in other CAE applications whose forward problems can be solved using any other codes.
13.1.3 Using NNs
13.1.3.1 Coefficient Identification of a Telephone Switch Model
In this application, parameters in a telephone switch model are identified from the temperature distribution in the most relevant area. The telephone switch model is shown in Figure 13.15. It includes three fan exits, one vent, one air filter, 15 boards, and one power supply unit. Four parameters are to be identified. One is the coefficient of the fan; the other three parameters are the heat transfer coefficients between the three chips and
FIGURE 13.15 Computation model for thermal analysis of a telephone switch. There are three fan exits, one vent, one air filter, fifteen boards, and one power supply unit.
their attached boards. The temperatures sampled at six points on the chips and the boards are used as the inputs for the inverse analysis. The ESC module in the I-deas Master Series package is used to prepare the training data for the neural network. The NN is trained using samples, each consisting of a set of four values (the fan characteristic value and three heat transfer coefficients) and the corresponding temperature distribution calculated by I-deas ESC. The telephone switch structure is meshed first. The finite element model shown in Figure 13.16 uses 1582 shell elements, 10,471 air fluid elements, and 2333 nodes. The fan can be defined by the mass flux or the air velocity, and the fan-covered area is meshed by shell elements. The pressure head loss coefficient of the air and the flow angle normal to the fan can also be specified. In this application, the air velocity through the fan is used as its characteristic parameter. This air velocity is one of the four coefficients to be determined inversely because the real fan is installed at the outlet, some distance away from the telephone switch. The software can define the vent characteristics by area, head loss coefficient, and atmospheric conditions. The air filter introduces a resistance to air flow and is modeled as a screen in I-deas ESC. A head loss coefficient of 5.0, based on the approach velocity, is specified for the air filter. The power supply unit influences the model in two ways. First, it generates heat while the system is working; a heat load of 400 W is defined on the fluid elements within the power supply unit region. Second, it introduces a resistance to the air flow, which is modeled using an isotropic porous blockage. The pressure loss through the blockage is calculated, and a head loss per unit length of 100 (1/m) is defined for all fluid elements within the power supply unit region (Zhou, 2001). The 15 boards have a total heat load of 1125 W.
Three kinds of chips with high power dissipation are on the boards; their heat loads are 2.88, 2.58, and 2.88 W, respectively. However, the heat transfer coefficients between these chips and the mounting board, which are the other three coefficients to be identified inversely, are unknown. Forced air convection is defined on the surfaces that have been previously meshed with two-dimensional thin shell elements. Roughness can be specified on the surfaces, thereby creating drag on the surrounding fluid flow. The small components attached on the boards can be represented by the surface roughness. During model simulation, heat paths are established from the surfaces to the nearest three-dimensional air fluid elements. The ranges of the fan coefficient and the three heat transfer coefficients are 3000 to 10,000 mm/s and 400 to 1800 W/(m²·°C), respectively. The first group of 16 training samples is created by setting each of the four coefficients to its maximum or minimum value in all combinations. The other 32 training samples are randomly generated. The I-deas ESC calculations for the 48 training samples need about 8 hours in total on an SGI Indigo2 (IMPACT 10000) workstation. These samples are normalized based on Equation 6.14.
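The sampling plan described above (16 samples at the corners of the parameter ranges plus 32 random samples) can be sketched as below. The function name `build_parameter_sets`, the uniform random distribution, and the seed are assumptions; each parameter set would still have to be run through the forward solver to obtain the six temperatures used as network inputs, and the normalization of Equation 6.14 is not reproduced here.

```python
import itertools
import random

def build_parameter_sets(bounds, n_random=32, seed=0):
    """Return corner samples (all min/max combinations) plus random samples."""
    rng = random.Random(seed)
    # Every combination of lower/upper bound: 2**len(bounds) corner points.
    corners = [list(corner) for corner in itertools.product(*bounds)]
    # Uniform random points inside the ranges (the distribution is assumed).
    randoms = [[rng.uniform(lo, hi) for lo, hi in bounds]
               for _ in range(n_random)]
    return corners + randoms


# Fan velocity range in mm/s, then three heat transfer coefficient ranges.
bounds = [(3000.0, 10000.0)] + [(400.0, 1800.0)] * 3
parameter_sets = build_parameter_sets(bounds)
```

With four parameters the corner set has 2^4 = 16 members, which matches the first group of training samples, and the total is 48.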
FIGURE 13.16 Mesh of the phone switch for numerical analysis.
The neuron numbers for the input, output, and first and second hidden layers are 6, 4, 20, and 15, respectively. After training, the outputs of the four coefficients from neural networks are 6799.25, 808.8, 1200.1, and 993.8. Based on this set of coefficients, Table 13.3 compares the temperatures computed using the identified parameters at the indicated locations and the simulated measurements. Good agreement is observed. The distributions of the air velocity and structural temperature are shown in Figure 13.17 and Figure 13.18, respectively; all are reasonably accurate.
13.1.3.2 Coefficient Identification for IC Chips
Figure 13.19 shows the structure of an IC chip. The chip is sealed by epoxy molding compound (IC case), mounted onto the die bond pad through
TABLE 13.3 Comparison of Temperature Computed Using Identified Coefficients with that of Simulated Measurements at Selected Nodes in FE Model of a Telephone Switch
Node    Calculated Value    True Value
3363    90.354              90.170
3354    85.945              85.942
3345    87.027              87.374
133     74.967              74.963
113     74.321              74.317
93      78.360              78.355
FIGURE 13.17 Velocity distribution of the air flow computed using the inversely identified parameters.
FIGURE 13.18 Boards’ temperature distribution computed using the inversely identified parameters.
conductive epoxy, and connected with leadframe through wire. The chip is a main heat source when it is working. The IC case is modeled with solid hexahedral mesh and three sets of shell mesh for the top, middle, and bottom surfaces, as shown in Figure 13.20. The top surface mesh is used to convect heat to the fluid; the bottom surface mesh is used to couple the IC case with the PCB. The middle surface mesh is used to couple the chip, the leadframe, and the die bond pad to the IC case. The detailed mesh for die bond pad, leadframe, and chip is shown in Figure 13.21. An assembly model of the PCB and its wind tunnel, in which the whole component is installed, is illustrated in Figure 13.22. In the numerical simulation, the heat transfer through each part and its heat transfer coefficients are considered as follows (Zhou, 2001):
• Chip to die bond pad: material is attached between the chip and die bond pad. Because the filling material and patterns in this layer vary with processing, the heat transfer coefficient is difficult to estimate accurately. An effective way is to identify this heat transfer coefficient inversely.
FIGURE 13.19 Configuration of a chip. The chip is sealed by epoxy molding compound and mounted onto the die bond pad through conductive epoxy.
FIGURE 13.20 Mesh for IC case.
FIGURE 13.21 Mesh for die bond pad, leadframe, and chip.
FIGURE 13.22 PCB and its wind tunnel.
• Chip to IC case: this heat transfer coefficient can be obtained using the thermal conductivity of the chip and half of its thickness. The final heat transfer coefficient can be obtained as:
$$\frac{K}{l} = \frac{149\ \mathrm{W/(m \cdot {}^{\circ}C)}}{0.5 \times 0.000508\ \mathrm{m}} \approx 586{,}000\ \mathrm{W/(m^2 \cdot {}^{\circ}C)} \qquad (13.3)$$
• Die bond pad to IC case: the coefficient is obtained by taking the thermal conductivity of the die bond pad and half of its thickness into account. A multiplying factor is used to take into account heat dissipation from both sides (the chip side is partially exposed). Similarly, the heat transfer coefficient can be obtained as 6,640,000 W/(m²·°C).
• Leadframe to IC case: the heat transfer coefficient can be obtained as 8,000,000 W/(m²·°C) by means of the thermal conductivity of the leadframe, its thickness, and the condition of heat dissipation.
• Die bond pad to leadframe: most of the heat transfer here occurs through conduction in the plastic. Therefore, an absolute heat transfer coefficient is calculated as:
$$\frac{KA}{l} = \frac{0.63 \times (4 \times 0.014) \times 0.00015}{0.001} = 5.29 \times 10^{-3}\ \mathrm{W/{}^{\circ}C} \qquad (13.4)$$
• Leads to PCB: most of the heat transfer occurs through heat conduction along the lead length outside the IC case. Therefore, the absolute heat transfer coefficient is 6E−3 W/°C for 1 lead and 1.536 W/°C for 256 leads.
• IC case to PCB: most of the heat transfer occurs through the contacting case bottom surface and the air filling the gap to the PCB. Heat transfer also occurs through radiation between these two surfaces. Because of this complexity, the heat transfer coefficient is inversely determined.
• Other conditions include a 1.5-W heat load on the chip, heat convection on the PCB surface and the IC case top surface, a 2 m/s outlet fan, and a vent to ambient.
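The directly computed coefficients in the list above can be checked with a few lines of arithmetic. The variable names are illustrative, and the reading of the factors in Equation 13.4 (a conductivity of 0.63 W/(m·°C), a perimeter of 4 × 0.014 m, a thickness of 0.00015 m, and a conduction path of 0.001 m) is an assumption inferred from the numbers, since the text does not label them.

```python
# Chip to IC case (Equation 13.3): conductivity over half the chip thickness.
k_chip = 149.0                  # W/(m*degC), thermal conductivity of the chip
t_chip = 0.000508               # m, chip thickness
h_chip_case = k_chip / (0.5 * t_chip)            # W/(m^2*degC)

# Die bond pad to leadframe (Equation 13.4): absolute conductance K*A/l.
k_plastic = 0.63                # W/(m*degC), assumed to be the plastic conductivity
area = (4 * 0.014) * 0.00015    # m^2, assumed perimeter times thickness
path = 0.001                    # m, assumed conduction path length
g_pad_leadframe = k_plastic * area / path        # W/degC

# Leads to PCB: conductance of one lead scaled to 256 parallel leads.
g_one_lead = 6e-3               # W/degC
g_all_leads = g_one_lead * 256  # W/degC
```

The first result is about 586,600 W/(m²·°C), which the text quotes as 586,000; the other two reproduce 5.29E−3 W/°C and 1.536 W/°C.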
FIGURE 13.23 Temperature distribution of electronics component computed using the identified heat transfer coefficients.
The two heat transfer coefficients are assumed to lie in the ranges 2000 to 10,000 W/(m²·°C) and 50 to 500 W/(m²·°C), respectively. The four input neurons are the temperatures at locations on the top surface of the IC case, the PCB, and the leads. The two output neurons are the heat transfer coefficient between the chip and the die bond pad and that between the IC case and the PCB. The neuron numbers for the input, output, and first and second hidden layers are 4, 2, 12, and 8, respectively. A total of 36 training samples for the neural network are generated using I-deas. In the computation, the learning rate and momentum rate are set to 0.2 and 0.1, respectively. Finally, the trained NN is applied to predict the coefficients from temperature distributions. One set of normalized outputs is 0.5861 and 0.2520, compared with true coefficients of 6000 W/(m²·°C) and 244.9 W/(m²·°C), respectively, which is a very accurate prediction. The temperature distribution computed using I-deas with these identified coefficients is given in Figure 13.23.
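The roles of the learning rate (0.2) and momentum rate (0.1) can be illustrated with the standard gradient-descent-with-momentum update. It is an assumption that the network in this study is trained with exactly this rule; the one-dimensional quadratic error used to exercise it is purely illustrative and stands in for the backpropagated gradients of a real network.

```python
def momentum_step(weights, grads, prev_delta, lr=0.2, momentum=0.1):
    """One update: delta = -lr * gradient + momentum * previous delta."""
    delta = [-lr * g + momentum * d for g, d in zip(grads, prev_delta)]
    new_weights = [w + dw for w, dw in zip(weights, delta)]
    return new_weights, delta


# Illustrative error E(w) = 0.5 * (w - 3)^2, whose gradient is (w - 3).
weights, delta = [0.0], [0.0]
for _ in range(200):
    grads = [weights[0] - 3.0]
    weights, delta = momentum_step(weights, grads, delta)
```

The momentum term reuses a fraction of the previous step, which smooths the weight trajectory; here the weight settles at the minimizer w = 3.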
13.1.3.3 Summary
An inverse procedure based on the neural network technique is employed to determine the coefficients in the electronic system cooling analysis. The I-deas ESC module is employed to set up the training data set for the neural network. Coefficient identifications for a telephone switch and for electronic IC components have been successfully carried out to demonstrate the inverse approach.
13.2 Identification of the Material Parameters of a PCB
13.2.1 Introduction
Electronic devices are normally subjected to qualification tests. A typical test may consist of subjecting the electronic device to an enforced sinusoidal acceleration and measuring the response, over a range of excitation frequencies at a few locations on the PCB and components. Making prototypes and conducting physical tests can take a long time. From a business point of view, it is always preferable to keep the time to market as short as possible. Use of computer-aided engineering (CAE) tools results in faster evaluation of relative performance of various designs, thereby reducing the number of physical prototypes and tests required. One of the problems faced by a CAE analyst is the lack of availability of exact material properties. One way of overcoming this difficulty is to work in the opposite way. Instead of looking at the forward problem in which all the material properties, loading, and boundary conditions are given and the response is calculated, look at the inverse problem in which the response is given (from physical tests), and seek the material properties that result in the given response. Once the material properties are obtained, relative performance of competing designs using similar components can be evaluated virtually on a computer without the need to make too many physical prototypes for all the designs and to perform physical tests with all the prototypes. Next, the inverse identification of material parameters of PCB using the improved IP3-GA is presented. This section is based on work done by Yang, Liu, Venkalasubramanian, and Lam.
13.2.2 Problem Definition
A PCB with two heat sinks and two components mounted on it is considered; Figure 13.24 shows a geometric model of the system. The physical dimensions are
• PCB: 200 × 100 × 1.5 mm
• Heat sinks: 140 × 50 × 1.4 mm
• Component 1: 50 × 20 × 40 mm
• Component 2: 40 × 20 × 30 mm
The PCB is partially supported along two opposite edges and at a few locations on the other pair of opposite edges. This corresponds to the PCB being housed in a casing. The system is subjected to an enforced acceleration of 1.0 g over a frequency range of 20 to 140 Hz. The frequency response (acceleration levels) at a few locations on the PCB and on the components
FIGURE 13.24 Geometrical model of the PCB with heat sinks and components (labels in the figure: PCB, heat sink 1, heat sink 2, component 1, component 2, and x, y, z axes). Two heat sinks and two components are mounted onto the PCB, which is partially supported along two opposite edges and at a few locations on the other pair of opposite edges. The system is subjected to an enforced acceleration of 1.0 g over a frequency range of 20 to 140 Hz. The frequency response at some locations of the PCB and on the components can be measured.
is obtained for a given set of material properties. In the actual case, this response, obtained for a given set of material properties, will correspond to test results. Now, the problem is to determine a set of material properties whose bounds are available.
13.2.3 Objective Functions
The inverse analysis procedure used to determine the material properties in this work is performed by a two-step procedure. First, matching of natural frequencies is performed to determine parameters that affect only the natural frequency of the system. The objective for this step is to
$$\text{Minimize} \quad \sum_{i=1}^{n_{sf}} \left( f_i^m - f_i^c \right)^2 \qquad (13.5)$$
where fm and fc are the natural frequencies from test and simulation, respectively, and nsf is the number of natural frequencies sampled. After matching the natural frequencies, the identified parameters are not allowed to vary in the second step, in which matching of frequency response is performed. The remaining parameters that are sensitive to the frequency response are then identified in a second step by
$$\text{Minimizing} \quad \sum_{i=1}^{n_{sa}} \left( A_i^m - A_i^c \right)^2 \qquad (13.6)$$
where Am and Ac denote the frequency response of acceleration from test and simulation, respectively, and nsa denotes the number of sampling points. The IP3-GA (see Chapter 5) is used in the procedures of finding the minimums in Equation 13.5 and Equation 13.6. In the forward calculation the finite element method (MSC/NASTRAN) is used as the forward solver.
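Equation 13.5 and Equation 13.6 share the same sum-of-squares form, so a single helper can serve both steps. The sketch below is illustrative: the function name is an assumption, and the frequency values are the "test" and "client" natural frequencies within the 20 to 140 Hz excitation range as reported in Section 13.2.5.

```python
def sum_squared_error(measured, computed):
    """Sum over sampling points of (measured - computed)^2."""
    if len(measured) != len(computed):
        raise ValueError("measured and computed must have the same length")
    return sum((m - c) ** 2 for m, c in zip(measured, computed))


# Step 1 (Equation 13.5): natural frequencies in Hz that fall in the
# excitation range, from the "test" model and from the client's initial data.
f_test = [95.39, 118.54]
f_client = [102.07, 125.84]
frequency_error = sum_squared_error(f_test, f_client)

# Step 2 (Equation 13.6) would reuse the same function on acceleration
# responses A_m and A_c at the nsa sampling locations.
```

For the client's initial properties this gives a frequency error of about 97.9 Hz², which is the quantity the first search step drives toward zero.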
13.2.4 Finite Element Representation
Based on the geometric model shown in Figure 13.24, a finite element model is created. The PCB and heat sinks are modeled using four-node shell elements (CQUAD4 in MSC/NASTRAN) while the components are modeled using eight-node brick elements (CHEXA). The heat sinks and the components are connected to the PCB using rigid elements (RBE2 in MSC/NASTRAN). The finite element representation is shown in Figure 13.25. The model consists of 340 CQUAD4 elements and 64 CHEXA elements. For the purpose of specifying an enforced acceleration, the large mass method is used. A large mass (1.0 × 10^7 ton) is connected to the nodal points where the system is supported. An appropriate force on the heavy mass in the required direction will result in an enforced acceleration. The next section describes the numerical “experiments” and the results obtained from them.
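The force needed in the large mass method follows from Newton's second law, since the large mass dominates the inertia of the system. The sketch below assumes a ton, mm, s unit system (suggested by the ton/mm3 densities and MPa moduli used in this section), in which force comes out in newtons, and takes 1 g as 9806.65 mm/s²; the function name is illustrative.

```python
G_MM_PER_S2 = 9806.65   # standard gravity in mm/s^2

def large_mass_force(mass_ton, accel_g):
    """Force (N) that gives the large mass the target acceleration, F = M * a."""
    return mass_ton * accel_g * G_MM_PER_S2


# Large mass of 1.0e7 ton driven at the enforced acceleration of 1.0 g.
force_newton = large_mass_force(1.0e7, 1.0)
```

Under these assumptions the applied force is roughly 9.8 × 10^10 N; because the attached structure is negligible next to the large mass, the support points then move with very nearly the prescribed acceleration.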
13.2.5 Numerical Results and Discussion
Before performing a frequency response analysis, knowledge of the natural frequencies of the system is required. A normal mode analysis is performed first with the finite element model described in the previous section. For the purpose of obtaining “test” results, the set of material properties given in Table 13.4 is used. The first three natural frequencies are found to be f1 =
FIGURE 13.25 Finite element representation of PCB and components. The software package used is MSC/NASTRAN. Four-node plate elements (CQUAD4), eight-node brick elements (CHEXA), and rigid elements (RBE2) are used to mesh the PCB and components.
TABLE 13.4 Material Properties for Obtaining “Test” Results
Property                          PCB            Heat Sinks 1&2    Components 1&2
Young’s modulus, MPa              13,500         71,000            5,000
Mass density, kg/mm3              1.5 × 10^-6    2.7 × 10^-6       1.0 × 10^-6
Poisson’s ratio                   0.3            0.3               0.2
Structural damping coefficient    0.08           0.08              0.08
FIGURE 13.26 Frequency spectrum of the acceleration response at PCB center. Axes: acceleration (g) versus frequency (Hz), 20 to 140 Hz.
95.39 Hz, f2 = 118.54 Hz, and f3 = 173.82 Hz. The fundamental mode is observed to be predominantly a bending mode. Following the normal mode analysis, a modal frequency response analysis is performed with excitation frequencies in the range from 20 to 140 Hz. Figure 13.26 shows the response at the PCB center; the peak response corresponds to the fundamental frequency. Table 13.5 lists the acceleration responses at some specific locations. After the test results are obtained, attention is focused on the inverse problem, in which a set of material properties is determined that minimizes the error between the test and computed responses, with each property constrained to a given range. To begin, a set of values that corresponds to the material
TABLE 13.5 Acceleration Response from “Test”
No    Location               Acceleration (g)
1     PCB center             7.567
2     PCB front              8.100
3     PCB back               7.324
4     Below component 1      5.956
5     Top of component 1     6.545
6     Below component 2      5.999
7     Top of component 2     6.350
TABLE 13.6 Material Properties Provided by “Client”
Material Property                 PCB            Heat Sinks 1&2    Components 1&2
Young’s modulus, MPa              15,000         68,000            5,000
Mass density, kg/mm3              1.4 × 10^-6    2.5 × 10^-6       1.0 × 10^-6
Poisson’s ratio                   0.3            0.3               0.2
Structural damping coefficient    0.07           0.07              0.08
TABLE 13.7 Design Variable Range for Frequency Matching
Material Property            PCB                                Heat Sinks 1 & 2                   Components 1 & 2
Young’s modulus, E, MPa      12,000 ≤ E ≤ 18,000                54,400 ≤ E ≤ 81,600                4,000 ≤ E ≤ 6,000
                             Initial value: 17,000              Initial value: 75,000              Initial value: 5,400
Mass density, ρ, kg/mm3      1.2 × 10^-6 ≤ ρ ≤ 1.8 × 10^-6      2.0 × 10^-6 ≤ ρ ≤ 3.0 × 10^-6      0.8 × 10^-6 ≤ ρ ≤ 1.2 × 10^-6
                             Initial value: 1.6 × 10^-6         Initial value: 2.5 × 10^-6         Initial value: 0.9 × 10^-6
properties provided by a “client” is chosen. These values are listed in Table 13.6. A normal mode analysis is performed to determine the natural frequencies. The first three natural frequencies are f1 = 102.07 Hz, f2 = 125.84 Hz, and f3 = 185.57 Hz. The client is not sure about the exact values of the material properties, but the ranges are given as shown in Table 13.7.
13.2.5.1 Sensitivity Analysis
Figure 13.27 and Figure 13.28 show the sensitivity of the fundamental frequency to the Young’s modulus and mass density of the PCB, respectively. A value of 1.63 × 10^-9 ton/mm3 is used for the mass density of the PCB for generating Figure 13.27, and a value of 13,500 MPa is used for the Young’s modulus of
FIGURE 13.27 Variation of fundamental frequency with Young’s modulus of PCB. Axes: fundamental frequency (Hz) versus Young’s modulus (MPa).
FIGURE 13.28 Variation of fundamental frequency with mass density of PCB. Axes: fundamental frequency (Hz) versus mass density (× 10^-9 ton/mm3).
Acceleration 'g'
8.8
8.6
8.4
8.2
8 1.2
1.3
1.4
1.5
1.6
1.7
1.8
Young's modulus (×10 ) MPa 3
FIGURE 13.29 Variation of acceleration response at PCB center with Young’s modulus.
PCB for generating Figure 13.28. The values of other parameters are E = 74, 586 MPa, ρ = 2.67 × 10–9 ton/mm3 for the heat sinks, and E = 5, 392 MPa, ρ = 1.0 × 10–9 ton/mm3 for the components. As can be seen from these two figures, the natural frequencies are sensitive to the Young’s modulus and the mass density. Figure 13.29 shows the sensitivity of acceleration response at the center of PCB to the variation of the Young’s modulus. Sensitivity of the acceleration response at the center of PCB to the variation of the structural damping coefficient is shown in Figure 13.30. 13.2.5.2 Identification Using Natural Frequencies Based on the sensitivity analysis, the inverse identification of this set of material properties is performed in two steps. In the first step, the natural frequencies that fall in the excitation range are matched (error function is
© 2003 by CRC Press LLC
1523_Frame_C13.fm Page 479 Thursday, August 28, 2003 5:59 PM
Acceleration 'g'
10 9 8
7 6 5 0.06
0.065
0.07
0.075
0.08
0.085
0.09
0.095
0. 1
Structural damping coefficient
FIGURE 13.30 Variation of acceleration at PCB center with damping coefficient of PCB. 2.5E+6
Fitness
2.0E+6
1.5E+6
1.0E+6
5.0E+5
0.0E+0 0
100
200
300
400
500
Generation
FIGURE 13.31 Searching procedure in natural frequency matching.
minimized). The design variables for this step are the Young’s modulus and mass density of the PCB, heat sinks, and components. The sensitivity analysis has showed that the Young’s modulus and density of the PCB material have relatively greater influence on the natural frequencies; therefore, frequency matching is performed. The range of values of the design variables used in this step is given in Table 13.7. In this example, the two components on the PCB have the same material properties. The searching process of minimizing the fitness function using the IP3-GA is shown in Figure 13.31. The desired variables are found at the 394th generation. 13.2.5.3 Identification Using Frequency Response After the first two frequencies are matched, the parameters identified in the previous step are kept constant. A sensitivity analysis is again performed to determine the remaining parameters that influence the frequency response (at specified output locations). The structural damping coefficient is found to be the single most significant design variable. Therefore, in matching frequency response, the structural damping coefficients of the PCB, the heat
© 2003 by CRC Press LLC
1523_Frame_C13.fm Page 480 Thursday, August 28, 2003 5:59 PM
1.0E+5
Fitness
7.5E+4
5.0E+4
2.5E+4
0.0E+0 0
100
200
300
400
500
Generation
FIGURE 13.32 Searching procedure in frequency response matching.
sinks, and the components are made the parameters to be identified. Based on the sensitivity analysis, matching of frequency response results is carried out. The objective is to minimize the sum of the squares of the deviations between test and simulated results at the output locations. Figure 13.32 shows the convergence of the searching process in match frequency responses using the IP3-GA. The true structural damping coefficients are found at the 426th generation.
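The two-step strategy can be illustrated on a system far simpler than the PCB model. The following is only a minimal sketch, not the finite element and IP3-GA procedure used in this example: it assumes a single-degree-of-freedom oscillator with a known mass, hypothetical "true" values for the stiffness and the structural damping coefficient, and a plain grid search standing in for the GA. Step 1 matches the natural frequency to identify the stiffness; step 2 keeps the stiffness fixed and matches the resonant response amplitude to identify the damping coefficient.

```python
import numpy as np

M = 2.0  # known mass, kg (hypothetical)

def natural_frequency(k, m=M):
    """Natural frequency in Hz of an undamped single-DOF oscillator."""
    return np.sqrt(k / m) / (2.0 * np.pi)

def response_amplitude(k, eta, omega, m=M, force=1.0):
    """Steady-state amplitude with structural damping: |F / (k(1 + i*eta) - m*omega^2)|."""
    return abs(force / (k * (1.0 + 1j * eta) - m * omega**2))

def grid_search(error, candidates):
    """Return the candidate with the smallest error (a stand-in for the GA)."""
    errors = [error(c) for c in candidates]
    return candidates[int(np.argmin(errors))]

# "Test" data simulated with hypothetical true parameters
k_true, eta_true = 8.0e5, 0.07
f_test = natural_frequency(k_true)
omega_test = 2.0 * np.pi * f_test
a_test = response_amplitude(k_true, eta_true, omega_test)

# Step 1: match the natural frequency to identify the stiffness
k_candidates = np.linspace(6.0e5, 1.0e6, 401)
k_id = grid_search(lambda k: (natural_frequency(k) - f_test) ** 2, k_candidates)

# Step 2: keep the stiffness fixed and match the response amplitude
eta_candidates = np.linspace(0.05, 0.10, 501)
eta_id = grid_search(
    lambda eta: (response_amplitude(k_id, eta, omega_test) - a_test) ** 2,
    eta_candidates,
)

print(f"identified k = {k_id:.4g} N/m, eta = {eta_id:.4f}")
```

Separating the steps works here for the same reason as in the text: the natural frequency does not depend on the damping coefficient, so the stiffness can be fixed before the damping is searched.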
13.2.6 Summary
Through sensitivity analysis, the design variables (material parameters) that influence the natural frequencies and frequency response can be determined. A two-step inverse procedure, matching frequency and matching frequency response, can then be set up to identify material parameters of a PCB with heat sinks and electronic components mounted on it, using the finite element model and the IP3-GA. It is demonstrated that the two-step inverse analysis is feasible in real-life applications to effectively avoid ill-posedness of the problem caused by the lack of sensitivity between the inputs and outputs.
13.3 Identification of Material Property of Thin Films

The material property of a thin film as shown in Figure 13.33 will be identified using the IP-GA (Xu and Liu, 2002a). The thickness of the structure in the z direction is H. The length and depth of the structure in the x and y directions are considered to be infinite because they are significantly larger than H. The elastic moduli of the first, second, and third layers are denoted by E1, E2, and E3, respectively. A time-harmonic input load q is applied on the upper surface of the structure to excite an elastic wave field. The load does not vary in the y direction and is expressed as

$$q = q_0 e^{-i\omega t} \qquad (13.7)$$

where q0 and ω are the amplitude and frequency of the load, respectively. In general, ω should be high enough so that the wavelengths of the elastic waves are shorter than the thickness of the thin films.

FIGURE 13.33 A three-layer thin film structure subjected to an external load q. Each layer has different materials. The SEM for composite laminate is used to model the thin-layered films. (Labels: load q, thickness H, axes x and z, first, second, and third layers.) (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.)

The SEM code (see Section 10.2.2) for composite laminate is used here to calculate the elastic wave propagation in multilayer thin films. The inverse procedure for determination of elastic constants of thin films can then be formulated in the same way as described in Section 8.3.1. Figure 13.34 shows a family of the calculated displacement responses on the surface of a three-layer thin film structure, where H = 6 µm, q0 = 1, and

$$\omega = \frac{3.14}{H/3}\sqrt{E/\rho}$$

in which E = 38 GPa and ρ = 2.66 g/cm3. It can be observed that the variation of material properties in each layer results in an obvious change of the surface displacement response. This observation demonstrates that the information on the material properties of each layer is indeed encoded in the surface displacement response. It is therefore possible to use these responses to determine the material properties of multilayered thin films.

Three cases are investigated, in all of which the mass density ρ = 2.66 g/cm3 is used. The material properties of the first, second, and third layers for each of the cases are respectively set as:

• Case 1: E1 = 213 GPa, ν1 = 0.29, E2 = 57 GPa, ν2 = 0.33, E3 = 57 GPa, ν3 = 0.26
• Case 2: E1 = 213 GPa, ν1 = 0.33, E2 = 85 GPa, ν2 = 0.26, E3 = 57 GPa, ν3 = 0.29
• Case 3: E1 = 142 GPa, ν1 = 0.26, E2 = 57 GPa, ν2 = 0.29, E3 = 85 GPa, ν3 = 0.33
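As a numerical check of the excitation frequency quoted for Figure 13.34, the short sketch below evaluates the expression for ω in SI units. Treating sqrt(E/ρ) as a representative wave speed when estimating the wavelength is an assumption made here for illustration only; the actual wave speeds in the layers depend on the full elastic constants.

```python
import math

H = 6e-6        # total film thickness, m (6 µm)
E = 38e9        # reference Young's modulus, Pa (38 GPa)
rho = 2660.0    # mass density, kg/m^3 (2.66 g/cm^3)

c_ref = math.sqrt(E / rho)                    # reference speed sqrt(E/rho), m/s
omega = 3.14 / (H / 3.0) * c_ref              # excitation frequency, rad/s
wavelength = 2.0 * math.pi * c_ref / omega    # wavelength at speed c_ref, m

print(f"sqrt(E/rho)    = {c_ref:.0f} m/s")
print(f"omega          = {omega:.3e} rad/s")
print(f"wavelength / H = {wavelength / H:.3f}")
```

Under this assumption the wavelength is about two thirds of the film thickness, which is consistent with the requirement that the wavelength be shorter than the thickness.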
FIGURE 13.34 Variation of surface displacement responses with the change of material properties (nominal parameters: E1 = 142 GPa, ν1 = 0.334, E2 = 38 GPa, ν2 = 0.291, E3 = 38 GPa, ν3 = 0.291). (Axes: amplitude of displacement response versus location x/H; curves: nominal, fivefold E1, fivefold E2, fivefold E3.) (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.)
It should be mentioned that these material properties are used only to provide the simulated measurements of surface displacement responses using SEM code, and to check the accuracy of the material properties determined by the inverse analysis using these simulated measurements.
13.3.1 Noise-Free Cases

In noise-free cases, the measured surface displacement responses are calculated from the SEM code. Figure 13.35 shows the simulated measurement of the surface displacement response for case 1. It is sampled at 100 points on the surface of the multilayer thin film from x = 0 to x = 10H. With 100 measurements available for six unknown parameters, the problem is heavily over-posed. For each of these cases, the objective error function is constructed in the same way as described in Section 8.3.1. The IP-GA (Chapter 5) is then applied to solve the optimization problem and thus to determine the material properties. The search ranges are specified as [20 GPa, 1400 GPa] for E1, E2, E3, and [0.1, 0.5] for ν1, ν2, ν3. Each set of E1, ν1, E2, ν2, E3, and ν3 constitutes an individual for evolutionary computation. A total of five individuals per generation is used in the present simulations. Together with this small population, a random number seed of –20,000, tournament selection, uniform crossover with pcross = 0.5, no mutation, elitism, a niching operation, and one child per pair of parents are used in the IP-GA. Details of these operators can be found in Chapter 5. In this application, the IP-GA improved by incorporating hill-climbing searching is employed (Xu and Liu, 2002a).
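The role of a hill-climbing refinement inside the search box can be sketched as follows. This is a generic coordinate-wise hill climb written for illustration, not the specific operator of Xu and Liu (2002a), and the objective is a made-up quadratic centered on the case 1 values rather than an error function built from SEM responses. Only the search ranges are taken from the text.

```python
import numpy as np

# Search ranges from the text, ordered (E1, nu1, E2, nu2, E3, nu3); E in GPa
LOWER = np.array([20.0, 0.1, 20.0, 0.1, 20.0, 0.1])
UPPER = np.array([1400.0, 0.5, 1400.0, 0.5, 1400.0, 0.5])

def hill_climb(objective, x0, rel_step=0.1, shrink=0.5, tol=1e-6):
    """Coordinate-wise hill climbing inside the box [LOWER, UPPER]."""
    x = np.clip(np.asarray(x0, dtype=float), LOWER, UPPER)
    best = objective(x)
    step = rel_step * (UPPER - LOWER)     # initial step for each parameter
    evaluations = 1
    while np.max(step / (UPPER - LOWER)) > tol:
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = x.copy()
                trial[i] = np.clip(x[i] + sign * step[i], LOWER[i], UPPER[i])
                value = objective(trial)
                evaluations += 1
                if value < best:          # keep any improving move
                    x, best, improved = trial, value, True
                    break
        if not improved:
            step = step * shrink          # refine the step when no move helps
    return x, best, evaluations

# Hypothetical toy objective: normalized squared distance to the case 1 values
TARGET = np.array([213.0, 0.29, 57.0, 0.33, 57.0, 0.26])

def toy_objective(x):
    return float(np.sum(((x - TARGET) / (UPPER - LOWER)) ** 2))

x_best, f_best, n_eval = hill_climb(toy_objective, [300.0, 0.3, 100.0, 0.3, 100.0, 0.3])
print(np.round(x_best, 3), f_best, n_eval)
```

In a hybrid scheme, a local search of this kind would be started from the best individual found by the GA, so that the GA supplies the global exploration and the hill climb the final refinement.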
FIGURE 13.35 Simulated measurements of surface displacement responses for case 1. (Axes: amplitude of displacement response versus location x/H; curves: noise-free, 2% noise, 5% noise.) (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.)

FIGURE 13.36 Searching process of partial material properties using the evolutionary algorithm for case 1. (Axes: MSF (×100) and parameter error (%) versus number of generations; curves: E1, E2, E3, Error.) (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.)

FIGURE 13.37 Convergence of E1, E2 and E3 toward their actual values in the third hill-climbing search at generation 44. (Axes: E1, E2, and E3.) (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.)
Figure 13.36 shows the searching process of the evolutionary algorithm for the material property parameters for case 1 (noise-free), in which the hill-climbing search is triggered three times to speed up the convergence. The entire process takes 402 function evaluations to converge, of which 142 function evaluations are conducted by hill-climbing searching. Figure 13.37 shows the process of the third hill-climbing search, triggered at generation 44. Table 13.8 through Table 13.10 show the material properties determined by this approach and the corresponding errors with respect to their actual values for all the simulated cases. It can be seen that the results in the noise-free environment are very satisfactory; the maximal error is 4.61, 3.79, and 4.23%, respectively, for case 1, case 2, and case 3.

13.3.2 Noisy Cases

White noise with noise levels of pe = 2 and 5%, produced using Equation 2.131 to Equation 2.133, is then added to the calculated surface displacement responses obtained from the SEM code to simulate measurements with different noise levels. Figure 13.35 also shows two noisy surface displacement responses for case 1. Following the same processes of establishing objective functions and solving optimization problems as in the noise-free case, the material properties for the three simulated cases with noise contamination are determined. The results are also shown in Table 13.8 through Table 13.10. It can be seen that 2% noise does not give rise to an obvious change in the accuracy of the results, while 5% noise results in a small increase in the errors of the determined material properties. The maximal error for the three simulated cases does not exceed 5%. The stable inverse solution is due to the use of SEM, the over-posed formulation, sufficient sensitivity, and the projection regularization effected by the discrete sampling of the displacement response (see Section 8.3.4.3). As expected, the searching process in a noisy environment requires the evolutionary algorithm to run more generations for convergence. For example, the runs for case 1 with 2 and 5% noise contamination take 523 and 741 function evaluations, respectively.

TABLE 13.8 Determined Material Properties and Errors for Case 1 (errors in percent are given in parentheses)
Noise level | E1 (GPa) | ν1 | E2 (GPa) | ν2 | E3 (GPa) | ν3
Noise free | 207.9 (–2.39) | 0.301 (3.79) | 58.7 (2.98) | 0.319 (–3.33) | 55.1 (–3.33) | 0.272 (4.61)
2% noise | 208.2 (–2.25) | 0.303 (4.48) | 59.2 (3.86) | 0.317 (–3.94) | 55.4 (–2.81) | 0.270 (3.85)
5% noise | 206.3 (–3.15) | 0.304 (4.82) | 59.4 (4.39) | 0.316 (–4.24) | 54.8 (–3.86) | 0.273 (5.00)
Source: Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.

TABLE 13.9 Determined Material Properties and Errors for Case 2 (errors in percent are given in parentheses)
Noise level | E1 (GPa) | ν1 | E2 (GPa) | ν2 | E3 (GPa) | ν3
Noise free | 209.1 (–1.83) | 0.326 (–1.21) | 86.8 (2.24) | 0.269 (3.46) | 55.3 (–2.91) | 0.301 (3.79)
2% noise | 208.6 (–2.07) | 0.317 (–3.94) | 87.7 (3.18) | 0.270 (3.85) | 55.2 (–3.21) | 0.303 (4.48)
5% noise | 206.4 (–3.10) | 0.314 (–4.85) | 88.9 (4.58) | 0.273 (5.00) | 54.3 (–4.74) | 0.304 (4.83)
Source: Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.

TABLE 13.10 Determined Material Properties and Errors for Case 3 (errors in percent are given in parentheses)
Noise level | E1 (GPa) | ν1 | E2 (GPa) | ν2 | E3 (GPa) | ν3
Noise free | 138.1 (–2.75) | 0.271 (4.23) | 58.5 (2.63) | 0.292 (0.69) | 82.1 (–3.41) | 0.319 (–3.33)
2% noise | 137.8 (–2.96) | 0.272 (4.62) | 58.9 (3.33) | 0.294 (1.38) | 81.9 (–3.65) | 0.317 (–3.94)
5% noise | 136.1 (–4.12) | 0.270 (3.85) | 59.8 (4.91) | 0.301 (3.79) | 81.5 (–4.12) | 0.314 (–4.85)
Source: Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 12, 723–729, 2002. With permission.
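The noise contamination and the percentage errors reported in Table 13.8 can be sketched as below. The noise model is an assumption: zero-mean Gaussian noise with a standard deviation equal to the noise level times the RMS of the signal is used here, whereas the definition actually used in this book is given by Equation 2.131 to Equation 2.133. The "clean" response is a made-up decaying curve, not SEM output, and the seed is arbitrary.

```python
import numpy as np

def add_white_noise(signal, noise_level, rng):
    """Add zero-mean Gaussian noise with standard deviation noise_level * RMS(signal).

    This scaling is an assumption made for illustration only.
    """
    signal = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(signal**2))
    return signal + noise_level * rms * rng.standard_normal(signal.shape)

def percent_error(identified, actual):
    """Relative error in percent, as listed in parentheses in the tables."""
    return 100.0 * (identified - actual) / actual

rng = np.random.default_rng(0)             # hypothetical seed
x_over_h = np.linspace(0.0, 10.0, 100)     # 100 sampling points, x = 0 to 10H
clean = 0.03 * np.exp(-0.3 * x_over_h)     # made-up smooth response
noisy_2 = add_white_noise(clean, 0.02, rng)
noisy_5 = add_white_noise(clean, 0.05, rng)

# Noise-free results for case 1 (Table 13.8): E1 and nu1
print(round(percent_error(207.9, 213.0), 2))   # -2.39
print(round(percent_error(0.301, 0.29), 2))    # 3.79
```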
13.3.3 Discussion

The preceding examples numerically validate the approach using the IP-GA. However, this approach probably becomes less effective as the number of layers in multilayered thin films increases. This is because the influence of the material properties of an internal layer on the surface displacement responses weakens as the number of interfaces and the distance from that layer to the surface of the structure increase. In other words, the material properties of deeper layers are very difficult to determine accurately from surface displacement responses only. To solve this problem, it is necessary to use excitations of lower frequency that can penetrate deeper into the material. Another problem worthy of mention is that the accuracy of the present approach is critically dependent on the computational model of elastic wave propagation. For multilayered thin films with complex structural configurations, an effective computational method is necessary for providing an accurate prediction of surface displacement responses.
13.4 Crack Detection Using Integral Strain Measured by Optic Fibers

13.4.1 Introduction

In recent years optic fiber sensors have been rapidly developed and used for engineering applications. Unlike point-based testing sensors, optic fibers can easily be used for measurements along a line, or even over a field, which is more suitable for damage detection because the locations of damage are unknown and usually random in nature. Generally, two kinds of techniques are used in optic fiber sensors. One measures the energy loss of light in the optical fibers embedded in the structure, which are strained or fractured when subjected to external loading (Crane and Gagorik, 1984; Hofer, 1987; Glossop et al., 1990). This technique is promising, but its sensitivity to damage is the main problem to be overcome before directly applying it to actual structures. The other technique is based on interferometric theory. Among the many researchers who use this technique, Elvin and Leung (1997) proposed a fiber optic-based approach for the detection of cracks in structures. An integral strain (total length change of the embedded optic fiber) is introduced in their method. The testing procedure of the optical fiber system is schematically shown in Figure 13.38. The advantage of this method is that the location and length of the crack along the optic fiber can be read from the curves. However, each load position produces only one point on the integral strain curve, and the size and location of the crack can be detected only after the whole curve is drawn. The disadvantage of the method is therefore that a large amount of data processing is needed to draw the integral strain curves, and the whole procedure is time consuming. To avoid drawing an integral strain curve, an inverse analytical technique combined with a genetic algorithm has been proposed (Yang et al., 2002b). In this way, the detection can be made point by point.
FIGURE 13.38 Crack detection system using optic fiber. (Diagram labels: laser producer, entrance coupler, sensing arm, reference arm, exit coupler, light detector, moving load, structure with arbitrary support condition, crack.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)
13.4.2 Numerical Calculation of Integral Strain

The integral strain can be obtained by measuring the phase shift. The theory of the integral strain detecting technique has been reviewed by Davis (1985). Assume a crack with dimensions of a × b in a plate of thickness H, as shown in Figure 13.39. In this calculation, a finite element model using shell elements is introduced. The structure can be divided into three parts: the first is the noncrack area, where single-layer shell elements are used; the second and third parts are in the damaged area, where two layers of shell elements are used. One layer represents the portion of the plate above the crack and the other the portion below the crack. These two layers of shell elements and the single-layer shell elements are connected using constraints established by artificial rigid bars placed at the crack tips.

FIGURE 13.39 FEM model for a plate with an a × b crack: (a) plate with a crack, (b) shell-element mesh, and (c) view of A-A section. (Labels: crack dimensions a and b, axes x and y, optic fiber, plate thickness H, depth h.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

For the preceding structure with given boundary conditions and loading, the displacement fields are computed by FEM. When the displacement components at nodes along the optical fiber axis are obtained, the integral strain of the fiber can be calculated. (The optic fiber is assumed to be perfectly attached to the structure, so the displacement in the optic fiber can be represented by that of the corresponding nodes in the structure at the same positions.) Assuming the displacement field is known, the length change (∆e) between two nodes denoted by i and j in element e can be calculated from the following equations (Davis, 1985):

$$\Delta_e = l_1 - l_0 \qquad (13.8)$$

where

$$l_0 = \sqrt{l_x^2 + l_y^2 + l_z^2} \qquad (13.9)$$

$$l_1 = \sqrt{(l_x + \Delta u)^2 + (l_y + \Delta v)^2 + (l_z + \Delta w)^2} \qquad (13.10)$$

Here l0 and l1 are the distances between the two nodes before and after deformation, respectively; lα = αj – αi (α = x, y, and z) is the component of the initial length between nodes i and j in the α direction; and ∆β = βj – βi (β = u, v, and w) is the component of the displacement difference in the β direction between nodes i and j. Assume the optical fiber is embedded in, or surface mounted across, n shell elements. The overall length change ∆L (integral strain) along the optical fiber can be expressed as

$$\Delta L = \int \varepsilon \, ds \approx \sum_{e=1}^{n} \Delta_e \qquad (13.11)$$

Thus, at each load position, one integral strain ∆L can be calculated numerically.
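Equations 13.8 to 13.11 translate directly into a few array operations once the nodal coordinates and displacements along the fiber are available. The sketch below assumes they are supplied as arrays (in practice they would come from the finite element solution); the function name and the uniform-strain check are illustrative choices, not part of the original procedure.

```python
import numpy as np

def integral_strain(coords, disps):
    """Total length change of a fiber passing through consecutive nodes.

    coords: (n+1, 3) initial nodal coordinates (x, y, z) along the fiber
    disps:  (n+1, 3) nodal displacements (u, v, w) at the same nodes
    """
    coords = np.asarray(coords, dtype=float)
    disps = np.asarray(disps, dtype=float)
    l_vec = np.diff(coords, axis=0)               # (lx, ly, lz) for each segment
    d_vec = np.diff(disps, axis=0)                # (du, dv, dw) for each segment
    l0 = np.linalg.norm(l_vec, axis=1)            # Equation 13.9
    l1 = np.linalg.norm(l_vec + d_vec, axis=1)    # Equation 13.10
    return float(np.sum(l1 - l0))                 # Equations 13.8 and 13.11

# Check: a straight 0.5 m fiber along x under a uniform axial strain of 1e-4
x = np.linspace(0.0, 0.5, 11)
coords = np.column_stack([x, np.zeros_like(x), np.zeros_like(x)])
disps = np.column_stack([1e-4 * x, np.zeros_like(x), np.zeros_like(x)])
delta_l = integral_strain(coords, disps)
print(delta_l)   # approximately 5e-5 m
```

For uniform strain the sum reduces to strain times fiber length, which gives a simple sanity check before the function is applied to finite element output.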
13.4.3 Inverse Procedure

An inverse procedure is used for the identification of the parameters of cracks in the structures. In this case, the location and size of the crack in the structure are the system parameters to be identified inversely. The objective function is defined as the squared difference between the integral strains obtained from the structure with the actual crack and calculated from the structure with trial system parameters (L2 norm defined by Equation 2.123). A static load is applied on the top surface of the structure along the optic fiber. The displacement is calculated using the FEM, and the integral strain is computed from Equation 13.8 and Equation 13.11. The system parameters (the size and location of the crack) around the specific load point can be obtained by minimizing the objective function. Based on the results, an assessment is made of whether the part of the structure under the load point is damaged.

13.4.3.1 Crack Expression

A crack in a structure is described here by two parameters: one is the location r (rx, ry, rz) and the other is the size δ (δx, δy, δz). A change in a location component changes the position of the crack in the corresponding direction; likewise, a change in a size component changes the size of the crack in the relevant direction. The possible shape of the crack in the plane is represented using groups of elements such as those shown in Figure 13.40. The depth of the crack is represented by δz, which equals the thickness of the elements of the upper part of the plate in the damaged region.

FIGURE 13.40 Possible shapes of cracks in plate. (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

13.4.3.2 Remesh Technique

Remeshing is required to modify the finite element meshes automatically during the restart calculation procedure when considerable changes in structural shape take place. It is a very useful method for solving problems such as crack propagation using the finite element method. In inverse analysis, the size and the location of a trial crack can be random, so a remeshing procedure is needed for each trial calculation. A subroutine for remeshing a damaged structure with shell elements has been developed and interfaced with LS-DYNA3D (Yang et al., 2002b). This subroutine alters the information on nodes, elements, and other data in the input file at the beginning of each restart calculation. With this remesh function, the parameters produced by the genetic algorithm, e.g., the location and size of the crack, can be changed arbitrarily.

13.4.3.3 Definition of Objective Functions

The objective function of the inverse problem is defined as the squared difference between the integral strains obtained from the structure with the actual crack and calculated from the structure with trial system parameters, which can be expressed as:

$$\varepsilon(r, \delta) = \{\Delta L(r, \delta) - \Delta L_0(r, \delta)\}^2 \qquad (13.12)$$

where r and δ represent the location and size of the crack, respectively. ∆L and ∆L0 are the integral strains obtained from the trial crack and the actual crack of the structure, respectively. The location and size of the crack are updated at each trial calculation; the objective function reaches its minimum when the desired crack is found. The µGA is used to find the parameters that minimize the objective function in the inverse analysis. The flowchart of the searching procedure using the µGA is given in Figure 13.41. The subscript i represents the ith generation, and n is the maximum number of generations for searching.
FIGURE 13.41 Inverse procedures of crack detection using µGAs. (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.) The flowchart comprises the following steps:
1. Input the integral strain (∆0) of a structure with the real crack (r0, δ0).
2. Produce a new population of np individuals in the ith generation by the µGA, taking the best individual into the next generation.
3. For the jth individual in the ith generation, with location r(rxj, ryj, rzj) and size δ(δxj, δyj, δzj), remesh the structure, then calculate the integral strain (∆) and the objective function of the jth trial structure with the assumed crack (r, δ).
4. If j is not equal to np, set j = j + 1 and repeat step 3.
5. If i is not equal to n, set i = i + 1 and return to step 2; otherwise, find the best individual (r, δ) and stop running.
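The loop structure of Figure 13.41 can be sketched in a few lines. Everything specific in the sketch is hypothetical: the integer crack encoding, the made-up function standing in for remeshing plus the finite element calculation of the integral strain, the restart rule, and the seed. It is meant only to show how elitism, a small population, and the objective of Equation 13.12 fit together, not to reproduce the µGA used by Yang et al.

```python
import random

# Hypothetical encoding on a 10 x 10 element grid: (ix, iy, nx, ny)
GENE_RANGES = [(0, 9), (0, 9), (1, 4), (1, 4)]

def toy_integral_strain(crack):
    """Made-up stand-in for remeshing, FEM analysis, and Equation 13.11."""
    ix, iy, nx, ny = crack
    return 1e-6 * (1.0 + 0.3 * nx * ny + 0.05 * abs(ix - 5) + 0.07 * abs(iy - 5))

ACTUAL_CRACK = (4, 4, 2, 2)
MEASURED = toy_integral_strain(ACTUAL_CRACK)     # plays the role of the input strain

def objective(crack):
    """Squared difference of integral strains, as in Equation 13.12."""
    return (toy_integral_strain(crack) - MEASURED) ** 2

def random_individual(rng):
    return tuple(rng.randint(lo, hi) for lo, hi in GENE_RANGES)

def tournament(pop, scores, rng):
    a, b = rng.sample(range(len(pop)), 2)
    return pop[a] if scores[a] <= scores[b] else pop[b]

def uniform_crossover(p1, p2, rng):
    return tuple(g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2))

def micro_ga(objective, n_pop=5, n_gen=200, seed=1):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(n_pop)]
    history = []
    for _ in range(n_gen):                               # loop over generations i
        scores = [objective(ind) for ind in pop]         # loop over individuals j
        best = pop[scores.index(min(scores))]
        history.append(min(scores))
        if len(set(pop)) == 1:
            # population has converged: restart with the elite plus random individuals
            pop = [best] + [random_individual(rng) for _ in range(n_pop - 1)]
        else:
            children = [
                uniform_crossover(tournament(pop, scores, rng),
                                  tournament(pop, scores, rng), rng)
                for _ in range(n_pop - 1)
            ]
            pop = [best] + children                      # elitism
    return best, history

best, history = micro_ga(objective)
print(best, history[-1])
```

Because the best individual is always carried over, the recorded best objective value never increases from one generation to the next. Note that in this toy model different cracks can produce the same integral strain, so the search may return a crack other than the assumed one.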
13.4.4 Numerical Results

A plate with dimensions of 0.5 × 0.5 × 0.005 m3 is discretized into 10 × 10 elements. Assume a rectangular crack exists somewhere in the plate. An optic fiber is mounted on the surface of the plate across this damaged area, and a static load is applied within this area on the surface of the plate along the optic fiber. The integral strain response of the optic fiber is used as the input of the objective function in an inverse analysis. The following cases are considered: cracks with different sizes and locations, and plates with different materials, applied loads, and boundary conditions.

13.4.4.1 Different Dimensions of Cracks

Cracks with dimensions of 0.1 × 0.1 × 0.0001 m3 (case A) and of 0.2 × 0.2 × 0.0001 m3 (case B) are located at (0.25, 0.25) in the coordinate system (Figure 13.39), as shown in Figure 13.42(a) and Figure 13.42(c). The thickness of the plate above the crack is 20% of that of the whole plate. The boundary conditions are that two opposite sides (in the optic fiber direction) are clamped, and the other two opposite sides are free. A 10 kN static load is applied at (0.25, 0.25) on the surface of the plate, and homogeneous isotropic materials are used. The corresponding material constants are listed in Table 13.11. The integral strains for the two cases are 3.530E–6 and 1.650E–4, respectively. The searching process using the µGA is shown in Figure 13.43 and Figure 13.44. The actual crack parameters for the two cases are found at the 121st and 194th generations, respectively. Figure 13.45 shows the detailed searching process using the µGA in approaching the true location and size of the crack corresponding to case A.
FIGURE 13.42 Sizes and locations of cracks, panels (a), (b), and (c). (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)
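The crack cases can be related to the 10 × 10 mesh with a little arithmetic. The sketch below is illustrative only: it assumes that the origin is at a corner of the plate, that the quoted crack location is the center of the crack, and that elements are indexed from zero. Integer millimeters are used to avoid rounding problems at element boundaries.

```python
PLATE_MM = 500                   # plate side length, mm (0.5 m)
N_ELEM = 10                      # elements per side
ELEM_MM = PLATE_MM // N_ELEM     # element side length: 50 mm

def cracked_elements(center_mm, size_mm):
    """Zero-based element indices (ix, iy) covered by a square crack."""
    def span(c):
        first = (c - size_mm // 2) // ELEM_MM
        last = (c + size_mm // 2 - 1) // ELEM_MM
        return range(first, last + 1)
    cx, cy = center_mm
    return [(ix, iy) for ix in span(cx) for iy in span(cy)]

case_a = cracked_elements((250, 250), 100)   # 0.1 m crack at (0.25, 0.25)
case_b = cracked_elements((250, 250), 200)   # 0.2 m crack at (0.25, 0.25)
case_c = cracked_elements((150, 150), 100)   # 0.1 m crack at (0.15, 0.15)
print(len(case_a), len(case_b), len(case_c))   # 4 16 4
print(case_a)                                  # [(4, 4), (4, 5), (5, 4), (5, 5)]

upper_layer_mm = 0.20 * 5.0    # layer above the crack: 20% of the 5 mm plate thickness
print(upper_layer_mm)          # 1.0
```

Under these assumptions, case A and case C each span a 2 × 2 block of elements and case B a 4 × 4 block, which is the kind of element group used to represent a crack in Section 13.4.3.1.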
TABLE 13.11 Material Constants of Plates Used in Crack Detection
Material | E1 (GPa) | E2 (GPa) | E3 (GPa) | G12 (GPa) | G23 (GPa) | G31 (GPa) | µ12 | µ13 | µ23 | ρ (kg/m3)
1 | 16.7 | 16.7 | 16.7 | 6.423 | 6.423 | 6.423 | 0.3 | 0.3 | 0.3 | 1760
2 | 126 | 8.6 | 8.6 | 1.07 | 3.07 | 1.07 | 0.018 | 0.018 | 0.4 | 1230
Source: Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.
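The constants of material 1 in Table 13.11 can be checked against the isotropic relation G = E / (2(1 + ν)). The short sketch below assumes that the µ columns of the table are Poisson's ratios; the function name is illustrative.

```python
def shear_modulus_isotropic(e, nu):
    """Shear modulus of an isotropic material, G = E / (2 (1 + nu))."""
    return e / (2.0 * (1.0 + nu))

g_material_1 = shear_modulus_isotropic(16.7, 0.3)   # E in GPa
print(round(g_material_1, 3))                       # 6.423
```

The result agrees with the tabulated shear moduli of material 1, which is consistent with its description as homogeneous and isotropic. Material 2 is a composite laminate, so its shear moduli are independent constants and this relation does not apply to it.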
FIGURE 13.43 Convergence of the µGA search process for case A: crack with dimensions of 0.1 × 0.1 × 0.0001 m3 located at (0.25, 0.25). The boundary conditions of the plate are that two opposite sides are clamped and the other two opposite sides are free; the plate is subjected to a static load of 10 kN at (0.25, 0.25). (Axes: minus fitness versus generation.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

FIGURE 13.44 Convergence of the µGA search process for case B: crack with dimensions of 0.2 × 0.2 × 0.0001 m3 located at (0.25, 0.25). The boundary conditions of the plate are that two opposite sides are clamped and the other two opposite sides are free; the plate is subjected to a static load of 10 kN at (0.25, 0.25). (Axes: minus fitness versus generation.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

FIGURE 13.45 Schematic of the searching procedure by the µGA corresponding to Figure 13.43. Numbers in parentheses are the corresponding generations: (1), (4), (5), (18), (21), (26), (28), (40), (41), (58), (76), and (121). (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

FIGURE 13.46 Convergence of the µGA search process for case C: “crack” with dimensions of 0.1 × 0.1 × 0.0001 m3 located at (0.15, 0.15). The boundary conditions of the plate are that two opposite sides are clamped and the other two opposite sides are free; the plate is subjected to a static load of 10 kN at (0.25, 0.25). (Axes: minus fitness versus generation.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)
At the 26th generation and the 121st generation, the trial cracks have the same plane profiles, but the locations in the depth direction are different. The true size and location of the crack are found at the 121st generation.

13.4.4.2 Different Locations of Cracks (Case C)

All the conditions are the same as those of case A, except that the location of the crack is changed to (0.15, 0.15) in the plane, as shown in Figure 13.42(b). The load is applied at the center of the crack on the surface of the plate. The corresponding integral strain is 2.336E–6. The real parameters are found at the 116th generation; the searching process is shown in Figure 13.46.

13.4.4.3 Different Materials (Case D)

Unlike the previously used homogeneous isotropic materials, composite laminates are considered here. The corresponding material constants are listed in Table 13.11 (material 2). Assuming the other conditions to be the same as in case A, the integral strain for this case is 5.913E–7. The searching process is shown in Figure 13.47. The desired crack is found at the 251st generation.

FIGURE 13.47 µGA search process for case D: crack with dimensions of 0.1 × 0.1 × 0.0001 m3 located at (0.25, 0.25). The plate is composite with the material constants listed in Table 13.11. The boundary conditions of the plate are that two opposite sides are clamped and the other two opposite sides are free; the plate is subjected to a static load of 10 kN at (0.25, 0.25). (Axes: minus fitness versus generation.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

13.4.4.4 Different Applied Loads (Case E)

All the conditions are the same as those of case A, except that the load value is 1 kN. The integral strain for this case is 4.457E–8. The searching process is shown in Figure 13.48. The true crack is found at the 67th generation.

FIGURE 13.48 µGA search process for case E: crack with dimensions of 0.1 × 0.1 × 0.0001 m3 located at (0.25, 0.25). The boundary conditions of the plate are that two opposite sides are clamped and the other two opposite sides are free; the plate is subjected to a static load of 1 kN at (0.25, 0.25). (Axes: minus fitness versus generation.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

13.4.4.5 Different Boundary Conditions (Case F)

Other conditions are the same as in case A, but the boundary condition in this case is that all four sides of the plate are clamped. The integral strain for this case is 2.941E–6, and the desired crack parameters are found at the 468th generation. The searching process is shown in Figure 13.49.

FIGURE 13.49 Convergence of the µGA search process for case F: crack with dimensions of 0.1 × 0.1 × 0.0001 m3 located at (0.25, 0.25). The boundary conditions of the plate are that its four sides are clamped; the plate is subjected to a static load of 10 kN at (0.25, 0.25). (Axes: minus fitness versus generation.) (From Yang, Z.L. et al., Smart Mater. Struct., 11, 72–78, 2002. With permission.)

The results of these examples show that cracks in the different cases can be detected using the method described above. In this method, only one load point is needed for each trial in assessing whether a crack is under the load point; in addition, the size and location of the crack can be detected at the same time if a crack exists. The examples have shown the feasibility and reliability of the method.

13.4.5 Summary

A nondestructive detection method is employed using computational inverse techniques and an integral strain that can be measured by an optic fiber. By using this technique, detection and decision can be made point by point; this means that the decision of whether a specific crack exists under a load point can be made based only on the integral strain from that load position. It is not necessary to draw an integral strain curve before finding the size and location of the crack along an optic fiber, as has always been the practice in the conventional methods using optic fibers. This technique may lead to new applications for optic fibers in point-by-point NDE.
13.5 Flaw Detection in Truss Structure
The Levenberg–Marquardt root-finding method described in Chapter 4 is employed in the following truss flaw detection example. The 10-bay truss structure shown in Figure 13.50 is considered to assess the damage status of its elements. It is a planar truss with 41 elements and 22 nodes, each node having two DOFs. The material properties and section area of all elements are given as: Young’s modulus E = 2.1 N/m², mass density ρ = 7.8 kg/m³, and section area A = 0.0025 m².

FIGURE 13.50 A truss structure for damage assessment.

FIGURE 13.51 Damage status detected in truss structure (case 1; damage factor vs. element number). The actual situation is that elements 9 and 12 are damaged with damage factors β = 28.6% and β = 42.8%.

Two damage cases are considered in this example. For case 1, elements 9 and 12 are assumed to be damaged with damage factors β = 28.6% and β = 42.8%, respectively. A harmonic load is applied at node 22 in the vertical direction at two frequencies, ω1 = 500 rad/s and ω2 = 1200 rad/s. The vertical displacements of all nodes are selected as the measurements used to determine the damage status inversely. Here, the measurements are simulated using forward analysis with the assumed damage factors given previously. The iteration starts from 1.0 for all element stiffness factors, i.e., 0 for all element damage factors. The results depicted in Figure 13.51 illustrate the successful detection of damage in elements 9 and 12.
For case 2, elements 14 and 18 are assumed to have damage with damage factor β = 28.6%. As in case 1, the same load is imposed to excite the truss and the same measurements are used for inverse analysis. The results illustrated in Figure 13.52 show that the detected damage status in elements 14 and 18 is in fairly good agreement with the assumed values. The damage status in case 2 is investigated further. The number of measurements is increased by considering one more frequency at ω3 = 1900 rad/s, and better results are obtained, as shown in Figure 13.53. Performing the same inverse procedure but using the horizontal displacements at the same nodes, even better results are obtained (Figure 13.54). This confirms that the number and selection of measurements are very important in inverse analyses.
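The identification loop of this section can be illustrated with a deliberately small stand-in. The sketch below is not the 41-element truss: it assumes a hypothetical chain of two springs in series under a static end load, with stiffness factors α = 1 − β as unknowns, and uses a hand-written Levenberg–Marquardt iteration with an analytical Jacobian. The names `K0`, `LOAD`, `forward`, and `identify` are illustrative assumptions. As in the text, the "measurements" are simulated by a forward analysis with assumed damage factors, and the iteration starts from zero damage.

```python
# Hypothetical stand-in model (not the truss of Figure 13.50): two springs in
# series, fixed at the left end, static load at the free end.
K0 = 1.0    # undamaged stiffness of each spring (assumed value)
LOAD = 1.0  # static load at the free end (assumed value)

def forward(alpha):
    """Forward solver: nodal displacements for stiffness factors alpha."""
    u1 = LOAD / (alpha[0] * K0)
    u2 = u1 + LOAD / (alpha[1] * K0)
    return [u1, u2]

def jacobian(alpha):
    """Analytical derivatives d u_i / d alpha_k of the forward solver."""
    d1 = -LOAD / (K0 * alpha[0] ** 2)
    d2 = -LOAD / (K0 * alpha[1] ** 2)
    return [[d1, 0.0], [d1, d2]]

def cost_of(r):
    return sum(v * v for v in r)

def identify(measured, alpha0=(1.0, 1.0), lam=1e-3, max_iter=100):
    """Levenberg-Marquardt iteration; returns the damage factors beta."""
    alpha = list(alpha0)
    r = [c - m for c, m in zip(forward(alpha), measured)]
    cost = cost_of(r)
    for _ in range(max_iter):
        if cost < 1e-20:
            break
        J = jacobian(alpha)
        # Normal equations (J^T J + lam * diag(J^T J)) step = -J^T r
        a11 = J[0][0] ** 2 + J[1][0] ** 2
        a12 = J[0][0] * J[0][1] + J[1][0] * J[1][1]
        a22 = J[0][1] ** 2 + J[1][1] ** 2
        g1 = J[0][0] * r[0] + J[1][0] * r[1]
        g2 = J[0][1] * r[0] + J[1][1] * r[1]
        m11, m22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = m11 * m22 - a12 * a12
        step = [(-g1 * m22 + a12 * g2) / det, (-m11 * g2 + a12 * g1) / det]
        trial = [alpha[0] + step[0], alpha[1] + step[1]]
        if min(trial) > 0.0:
            r_trial = [c - m for c, m in zip(forward(trial), measured)]
            if cost_of(r_trial) < cost:
                # Accept the step and reduce the damping
                alpha, r, cost = trial, r_trial, cost_of(r_trial)
                lam /= 10.0
                continue
        lam *= 10.0  # reject the step and increase the damping
    return [1.0 - a for a in alpha]

# Simulated measurements from assumed damage factors, as in the text
measured = forward([1.0 - 0.286, 1.0 - 0.428])
beta = identify(measured)
print([round(b, 4) for b in beta])
```

For a real truss the forward solver would be a finite element analysis at each excitation frequency, and the Jacobian would typically be obtained by finite differences or sensitivity analysis.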
FIGURE 13.52 Damage status detected in truss structure (case 2; damage factor vs. element number). The actual situation is that elements 14 and 18 are assumed to have damage with damage factor β = 28.6%.

FIGURE 13.53 Damage status detected in truss structure for case 2 (same as in Figure 13.52), but using one additional measurement.

FIGURE 13.54 Damage status detected in truss structure for case 2 (same as in Figure 13.52), but using the horizontal displacement.

13.6 Protein Structure Prediction
Proteins are biopolymers made from amino acids, usually a few hundred to a few thousand units long. Each unit has many different atoms, making proteins very large macromolecules (Freifelder et al., 1993). Many interactions occur within the protein and between the protein and its environment; these interactions drive the protein toward a certain stable structure. It is believed that this minimum-energy structure is the same as the structure found in nature, the “native state” (Fasman, 1990). Because the number of atoms in a protein is very large, the number of possible protein structures is also very large. The search space is therefore very big (Bryngelson et al., 1995), and an efficient search method is required to find the structure at the minimum energy state. Searching for the stable or native structure among the vast number of possible structures is a very challenging area of research.
The genetic algorithm is regarded as a robust tool for finding the global optimum of practical problems because it is applicable to all kinds of complex objective functions and places no requirement on the continuity of the objective function. This advantage is very important for protein structure prediction (Patton et al., 1995). However, because of the random selection, the time required to reach a converged solution is usually very long. It is commonly believed that the genetic algorithm is impractical for protein structure prediction with large numbers of amino acid residues unless measures are taken to improve the performance of the search. As introduced in Chapter 5, the further improved GA with the concept of the intergeneration projection operator, named IP-GA, drastically improves the efficiency of the search. Recently, Liu et al. (2002g) and Yang et al. (2002c) employed the IP3-GA to predict protein structure, and quite good results were obtained. Based on their studies, the procedure of protein structure prediction using the IP3-GA is introduced in this section.
13.6.1 Protein Structural Prediction
Actual protein folding processes are very complicated, so it is difficult to trace the full dynamics of the protein. Thus, the first attempt should seek to find the stable native structure of proteins and then to analyze their dynamic behavior using the method of molecular dynamics. In order to perform the prediction, the problem can be expressed in this way. Given a sequence of amino acids: Q = q1 q2…qn, define the space of all possible structures derived from Q as S = {Si}. For each structure Si ∈ S, the corresponding energy Ei is found. A native structure Snative ∈ S, with energy Enative , exists. It is reasonable to assume that the native structure is independent of the path of formation of the protein. Therefore, it is not important to trace the path from the initial structure (S0, E0) to the native structure (Snative , Enative), although it may be done using dynamic analysis. The approach used by Yang et al. (2002a) is adopted; the problem of protein structure prediction is transformed into an optimization problem: the structure space S of a sequence Q is searched for a particular structure Sf whose energy Ef is the global minimum energy. Ef equals Enative , and Sf is then regarded as Snative .
13.6.2 Parameters for Protein Structures
Generally, the structure of a protein can be represented by the following parameters (Fasman, 1990):
• Bond lengths: b
• Bond angles: θ
• Dihedral angle about Cα–N bond: φ
• Dihedral angle about Cα–C bond: ψ
• Dihedral angle about C–N bond: ω
An additional set of dihedral angles is needed to describe the side chains of the protein. These parameters describe a three-dimensional protein structure.
13.6.3 Conformation Energy
Based on the assumption that the native structure is independent of its pathways and that it should be the structure with the minimum energy, some researchers (e.g., Bryngelson et al., 1995) have suggested that the native structure of a protein is the global free-energy minimum structure. However, most researchers do not consider free energy in protein structure prediction. Instead, they calculate the molecular potential energy for energy minimization. For example, a “potential of mean force field” is given as (Warshel, 1991):
\[ E(S) = \frac{1}{2}\sum_{i} K_{b,i}\,(b_i - b_{0,i})^2 + \frac{1}{2}\sum_{i} K_{\theta,i}\,(\theta_i - \theta_{0,i})^2 + \frac{1}{2}\sum_{i} K_{\phi,i}\,\bigl(1 - \cos(n\phi_i)\bigr) + \sum_{i,j}\left( A_{ij}\frac{1}{r_{ij}^{12}} - B_{ij}\frac{1}{r_{ij}^{6}} + 322\,\frac{q_i q_j}{r_{ij}} \right) + \text{higher order terms} \qquad (13.13) \]
where bi are the bond lengths, θi are the bond angles, φi are the dihedral angles (including φ, ψ, and ω), and rij are the nonbonded distances between the residues or atoms.
13.6.4 Lattice Model
In the real world, protein folding is very complicated, and the number of degrees of freedom is very large. Because computational power is limited, the degrees of freedom of the protein structure need to be reduced by discretizing the space. In protein structure prediction, a “lattice model” that provides a ground structure for the protein is used. In the lattice model, the amino acid monomer units are treated as beads and are restricted to the grid points of a lattice. This simplification is not realistic; nevertheless, some basic features of the real system, such as the interactions between the monomer units, are maintained. Most importantly, the simplification greatly improves computational efficiency and makes protein structure prediction possible (Dandekar et al., 1992).
13.6.4.1 Cubic Lattice Model
Shakhnovich et al. (1991) presented a freely jointed bead model of a biopolymer chain that could fold on a simple cubic lattice by the Monte Carlo method. In their example,
• Bonds were fixed at unit length.
• A 27-monomer length sequence was used.
• Configuration of the biopolymer was expressed using monomer positions ri (i = 1, …, 27).
• Monomers could occupy one of six positions (±x, ±y, and ±z) relative to the previous monomer.
• Multiple occupancy of sites was allowed, but such configurations were penalized in the energy equation.
• A compact, self-avoiding structure of a cube was found.
13.6.4.2 Random Energy Model
In the lattice model, interactions are restricted to nearest neighbors in the configuration, but neighbors connected by chemical bonds are not counted. A random energy model (REM) for spin glass (Derrida, 1980) is usually used. The energy function is:
\[ E = \sum_{i,j=1;\, i<j}^{N} B_{ij}\,\Delta(r_i - r_j) + D_2 \sum_{i,j}^{N} \delta(r_i - r_j) + D_3 \sum_{i,j,k}^{N} \delta(r_i - r_j)\,\delta(r_j - r_k) \qquad (13.14) \]
where
Δ(ri − rj) = 1 if |ri − rj| = 1 (nearest neighbors), and 0 otherwise;
δ(ri − rj) = 1 if |ri − rj| = 0 (monomers i and j occupy the same site), and 0 otherwise.
D2 is the penalty constant applied to sites that contain two or more monomers; D3 is a similar parameter for sites containing three or more monomers. Bij are the interaction energy coefficients, randomly generated according to a Gaussian distribution:
\[ P(B_{ij}) = (2\pi B^2)^{-1/2} \exp\left( -\frac{(B_{ij} - B_0)^2}{2B^2} \right) \qquad (13.15) \]
B is the standard deviation that defines the spread of the random interaction energy. B0 is the average value of the interaction energy; in order to model the hydrophobic effect, B0 is negative and thus represents an attractive potential.

13.6.4.3 Lattice Structure Prediction by IP3-GA
Consider a polymer chain with fixed, unit-long bonds that may move in the six directions of a simple cubic lattice. REM is used to calculate the energies. The IP3-GA (see Chapter 5) is adopted to find structures that minimize the energy function.
• Representation of lattice structure: a string of bond directions is chosen to describe the discrete structures. Bonds are in one of six directions (the coordination number of a cubic lattice): LEFT, RIGHT, UP, DOWN, FRONT, or BACK. The binary representations of the directions are defined in Table 13.12. The unused binary triplets “000” and “111” are screened off and replaced with legal directions.
• Evaluation and fitness: in IP3-GA searching, the coordinate of the first monomer is fixed at the origin. Triplets are produced according to the IP3-GA operations, and the incremental coordinate of the next monomer is interpreted from the triplet direction in Table 13.12. The protein structure is represented by a sequence of lattice coordinates that is set up by interpreting the incremental coordinate sequences. The configuration energy of the lattice is evaluated using a modified REM, which includes the following terms (Yang et al., 2002a):
• Energy terms represent the interaction energy between the nearest neighboring monomers. To mark the nearest neighbors, all the monomer coordinates ahead in the sequence are compared with that of the monomer under consideration. If a nearest neighbor is found, the energy term is increased by the interaction energy Bij (where i and j are the nearest neighboring pair of monomers).
• Penalty terms consider the multiple occupancy of a particular site. When i and j occupy the same site, the penalty terms add a positive energy D to penalize such a configuration. In the procedure of counting the nearest neighbors for each monomer, the remaining monomers ahead in the sequence are checked; therefore, the penalty terms actually consider all possible multiple occupancies.
• The purpose of protein structure prediction is to find the minimal energy structure for a given amino acid sequence. Because the fitness function in the modified IP3-GA is designed to search for the maximal value of a problem, fitness can be defined as fitness = –energy; thus
\[ \text{fitness} = -\sum_{i,j=1;\, i<j}^{N} B_{ij}\,\Delta(r_i - r_j) - D \sum_{i,j}^{N} \delta(r_i - r_j) \qquad (13.16) \]

TABLE 13.12 Definitions of Bond Directions Using Binary Triplets
Direction    Binary Triplet    Integer    ∆
(R)IGHT      001               1          rx → rx + 1
(L)EFT       010               2          rx → rx – 1
(F)RONT      011               3          ry → ry + 1
(B)ACK       100               4          ry → ry – 1
(U)P         101               5          rz → rz + 1
(D)OWN       110               6          rz → rz – 1

13.6.5 Results and Discussion
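Before turning to the results, the encoding of Table 13.12 and the fitness of Equation 13.16 can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function names `decode`, `make_b`, and `fitness` are hypothetical, the illegal triplets "000" and "111" are simply rejected here (the text replaces them with legal directions without saying how), and the IP3-GA operators themselves are not shown.

```python
import itertools
import random

# Table 13.12: binary triplet -> unit step on the simple cubic lattice
STEPS = {
    "001": (1, 0, 0),    # RIGHT
    "010": (-1, 0, 0),   # LEFT
    "011": (0, 1, 0),    # FRONT
    "100": (0, -1, 0),   # BACK
    "101": (0, 0, 1),    # UP
    "110": (0, 0, -1),   # DOWN
}

def decode(chromosome):
    """Bit string of triplets -> monomer coordinates, first monomer at the origin."""
    if len(chromosome) % 3 != 0:
        raise ValueError("chromosome length must be a multiple of 3")
    coords = [(0, 0, 0)]
    for k in range(0, len(chromosome), 3):
        triplet = chromosome[k:k + 3]
        if triplet not in STEPS:
            # Assumption: reject "000"/"111" instead of repairing them
            raise ValueError("illegal triplet " + triplet)
        dx, dy, dz = STEPS[triplet]
        x, y, z = coords[-1]
        coords.append((x + dx, y + dy, z + dz))
    return coords

def make_b(n, b0, sigma, seed=0):
    """Symmetric interaction matrix with Gaussian entries (Equation 13.15)."""
    rng = random.Random(seed)
    b = [[0.0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        b[i][j] = b[j][i] = rng.gauss(b0, sigma)
    return b

def fitness(coords, b, penalty):
    """fitness = -energy, following Equation 13.16."""
    energy = 0.0
    for i, j in itertools.combinations(range(len(coords)), 2):
        # On an integer lattice, Manhattan distance 1 means nearest neighbors
        dist = sum(abs(p - q) for p, q in zip(coords[i], coords[j]))
        if dist == 0:
            energy += penalty            # monomers i and j share a site
        elif dist == 1 and j - i > 1:    # bonded neighbors are not counted
            energy += b[i][j]
    return -energy

# An 8-monomer chain folded onto a 2 x 2 x 2 cube: R, F, L, U, R, B, L
cube = decode("001011010101001100010")
print(fitness(cube, make_b(8, -2.0, 0.0), penalty=10.0))
```

The cube path has five nonbonded nearest-neighbor contacts, so with all Bij equal to B0 = –2 its energy is –10 and its fitness is 10.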
As an example, the Monte Carlo results of Shakhnovich et al. (1991) for a 27-unit long system are first reproduced using the IP3-GA. Because the penalty parameter D does not appear in the energy function for a self-avoiding structure, it can be chosen freely; D = 10 is used in this example. The randomly selected energy parameters Bij are produced from a Gaussian distribution with the standard deviation of the interactions B = 0 and the average interaction energy B0 = –2. The three cubic structures can be obtained by the IP3-GA for randomly selected 27-unit long sequences. Figure 13.55 shows the minimal energy structures obtained from the IP3-GA for two different sets of Bij generated with different random selections. Figure 13.56 gives the corresponding search processes.
Next, larger systems with 64- and 125-unit long sequences are investigated. The penalty parameters are selected as D = 20 and 40, respectively. The energy parameters Bij are randomly generated from a Gaussian distribution with the standard deviation of the interactions B = 0 and the average interaction energy B0 = –2. Figure 13.57(a) and Figure 13.57(b) show the optimized structures for the 64- and 125-unit long sequences. Figure 13.58(a) and Figure 13.58(b) give the corresponding search procedures. In the preceding examples, the initial structures are created randomly, and no restriction is placed on them.

13.6.6 Summary
The IP3-GA can greatly improve search efficiency and thus can be used in protein structure prediction. Stable structures of very large protein chains with up to 125 monomers have been found successfully. Only one penalty parameter is used in the algorithm, and the algorithm is more efficient because of the reduced number of penalty terms in the fitness function. Cubic or compact structures can be obtained by this method directly from a one-dimensional sequence of monomers.

FIGURE 13.55 Optimized structures found for Bij set 1 (a) and Bij set 2 (b).
13.7 Fitting of Interatomic Potentials
13.7.1 Introduction
A new procedure for fitting interatomic potentials is suggested by Xu and Liu (2003) using molecular dynamics simulations and the improved IP-GA (see Chapter 5). Molecular dynamics (MD) simulations are applied to calculate the material properties used to match experimental data during the fitting procedure. This includes the effect of atom relaxations in the fitting calculations. The improved IP-GA is used to optimize the fitting parameters until the error between the calculated and experimental material properties is within the tolerance. The new algorithm significantly improves the accuracy and transferability of the fitted potential.

FIGURE 13.56 Convergence process (energy vs. generation) in finding the optimized structures of Figure 13.55: (a) and (b).
13.7.2 Fitting Model
A common potential function used in MD is (Daw and Baskes, 1984; Foiles et al., 1986; Voter, 1994)
\[ E = \frac{1}{2}\sum_{i,j=1;\, j \neq i}^{N} \phi(r_{ij}) + \sum_{i=1}^{N} U(n_i) \qquad (13.17) \]
where rij is the distance between atom i and atom j, and φ(rij) is a pair function expressing the pairwise interaction between atom i and atom j. In the embedded atom method (EAM), the pair function is taken to be a Morse potential (Voter, 1994),
\[ \phi(r_{ij}) = D\{\exp[-2\alpha(r_{ij} - r_0)] - 2\exp[-\alpha(r_{ij} - r_0)]\} \qquad (13.18) \]
FIGURE 13.57 Optimized structures for polymer length 64 (a) and 125 (b).
FIGURE 13.58 GA searching procedures (–energy vs. generations) for polymer length 64 (a) and 125 (b).

where D, α, and r0 are three fitting parameters to be determined. They define the depth, the curvature at the minimum, and the position of the minimum of the function, respectively. U(ni) is an embedding function expressing the energy required to embed atom i into the background electron density ni. The embedding function U(ni) in EAM is designed as follows (Kitamura, 1997):
\[ U(n_i) = b_1 n_i^2 + b_2 n_i + b_3 n_i^{1/2} \qquad (13.19) \]
where b1, b2, and b3 are fitting parameters that determine the shape of the function. The preceding potential function thus involves six fitting parameters, D, α, r0, b1, b2, and b3, that are determined by performing optimization computations. The objective of potential fitting is to find the parameters that give the best match between the material properties calculated from the resultant potential and their experimental values. A minimization problem to search for the fitting parameters is defined as follows:
\[ \text{Minimize } f_{err} = \sum_{i=1}^{n} \left( 1 - \frac{p_i^m}{p_i^c} \right)^2 \qquad (13.20) \]
where p_i^m and p_i^c (i = 1, …, n) are the experimental and calculated values of the ith material property, respectively, and n is the number of material properties to be considered. The most typical properties used for potential fitting are the lattice constant a0, the cohesive energy Ecoh, the bulk modulus B, the elastic constants C11, C12, and C44, and the vacancy-formation energy E^f_vac. In this fitting procedure, the material properties are calculated from MD simulations.
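The fitting model can be written down compactly. The following sketch assumes hypothetical function names (`morse`, `embedding`, `fitting_error`) and only evaluates Equations 13.18 to 13.20; the MD simulation that would supply the calculated properties is not reproduced. For illustration, the error is evaluated with the experimental and fitted values listed in Table 13.13.

```python
import math

def morse(r, d, alpha, r0):
    """Pair function of Equation 13.18 (Morse potential)."""
    return d * (math.exp(-2.0 * alpha * (r - r0)) - 2.0 * math.exp(-alpha * (r - r0)))

def embedding(n, b1, b2, b3):
    """Embedding function of Equation 13.19 for electron density n >= 0."""
    return b1 * n ** 2 + b2 * n + b3 * math.sqrt(n)

def fitting_error(measured, calculated):
    """Objective of Equation 13.20: sum of squared relative mismatches."""
    return sum((1.0 - pm / pc) ** 2 for pm, pc in zip(measured, calculated))

# Experimental and fitted values from Table 13.13
# (a0, Ecoh, B, C11, C12, C44, vacancy-formation energy)
experiment = [3.52, 4.45, 1.81, 2.465, 1.473, 1.247, 1.60]
fitted = [3.522, 4.467, 1.802, 2.437, 1.489, 1.268, 1.565]
print(fitting_error(experiment, fitted))

# At r = r0 the Morse potential equals -D, its minimum value
print(morse(2.5, 1.0, 1.5, 2.5))
```

Because each term is a relative mismatch, properties with different units and magnitudes contribute on an equal footing to the objective.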
13.7.3 Numerical Result
The initial values of the six fitting parameters for the optimization are obtained by using a simplex algorithm (Voter, 1994). These initial values are then used to generate an initial potential for the MD simulations.

FIGURE 13.59 Computational cell of 864 atoms in FCC lattice used in MD simulation. (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 13, 254, 2003. With permission.)

In this study, MD simulations are performed on a computational cell of 216 unit cells (6 × 6 × 6) involving 864 atoms in the FCC lattice structure, as shown in Figure 13.59. During MD calculations, periodic boundary conditions are imposed in the x, y, and z directions, except for the calculation of surface energy, where a free surface is set in the corresponding direction. The system temperature is controlled at T = 50 K, and the time step for the integration is set to 3.15 × 10^–15. The minimization is implemented by running the improved IP-GA. As the minimization proceeds, the six fitting parameters gradually approach their optimal values, while the error ferr between the experimental material properties and those calculated from the MD simulation with the current fitting parameters converges quickly. The entire process includes 12 generations of the IP-GA with 86 MD simulations of 3000 time steps each, taking a total of 49 minutes on the workstation Einstein (SGI Origin 2000). The resulting pair function φ(r) and embedding function U(n) are shown in Figure 13.60 and Figure 13.61. The material properties calculated from MD simulations with the potential using this set of parameters, and their errors with respect to the experimental values, are shown in Table 13.13. Excellent agreement is found between the calculated and experimental properties.
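The size of the computational cell follows from the FCC geometry: each cubic unit cell contributes four atoms, so 6 × 6 × 6 cells give 864 atoms. A minimal sketch of building such a cell is given below; the function name `fcc_positions` is an assumption, and the lattice constant of 3.52 Å is the experimental value quoted in Table 13.13.

```python
def fcc_positions(cells_per_side, a0):
    """Atom positions of an FCC block of cells_per_side^3 cubic unit cells."""
    # Four-atom basis of the conventional FCC cell, in units of a0
    basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
    positions = []
    for i in range(cells_per_side):
        for j in range(cells_per_side):
            for k in range(cells_per_side):
                for bx, by, bz in basis:
                    positions.append(((i + bx) * a0, (j + by) * a0, (k + bz) * a0))
    return positions

atoms = fcc_positions(6, 3.52)  # lattice constant in angstroms
print(len(atoms))  # 864
```

With periodic boundary conditions, atoms on the faces of the block are not duplicated, which is why the basis uses only the four atoms belonging to each cell.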
13.7.4 Summary
Compared with traditional fitting methods, this new algorithm combining MD with IP-GA can include the effect of atom relaxations in property calculations, and lead to the global optimum for the fitting parameters. This greatly improves the accuracy and transferability of the fitted potential. The new algorithm also provides a possibility of using nonground state material properties to fit potentials.
FIGURE 13.60 Fitted pair functions, φ(r) (eV) vs. r (Å): present result compared with Foiles et al. (1986) and Voter (1994). (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 13, 254–260, 2003. With permission.)

FIGURE 13.61 Fitted embedding functions, U(n) (eV) vs. n (Å^–3): present result compared with Kitamura et al. (1997). (From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 13, 254–260, 2003. With permission.)
TABLE 13.13 Fitted Material Properties and Errors with Respect to Experimental Values
Property           Experiment    Fitted    Error (%)
a0 (Å)             3.52 (a)      3.522     0.06
Ecoh (eV)          4.45 (b)      4.467     0.38
B (erg/cm³)        1.81 (c)      1.802     –0.44
C11 (erg/cm³)      2.465 (d)     2.437     –1.14
C12 (erg/cm³)      1.473 (d)     1.489     1.09
C44 (erg/cm³)      1.247 (d)     1.268     1.68
E^f_vac (eV)       1.60 (e)      1.565     –2.19
(a) Barrett and Massalski (1966). (b) Smith (1976). (c) Voter and Chen (1987). (d) Simmons and Wang (1971). (e) Wycisk and Feller–Kniepmeier (1978).
Source: From Xu, Y.G. and Liu, G.R., J. Micromech. Microeng., 13, 254–260, 2003. With permission.

13.8 Parameter Identification in Valveless Micropumps
13.8.1 Introduction
Pan et al. (2001) recently developed a nonlinear analysis model for the fluid–membrane coupling vibration of valveless micropumps. This model relates the configuration of the micropump, the properties of the fluid, the amplitude and frequency of the (piezoelectric) excitation, and the flow-pressure characteristics of the diffusers to the mean output flux. This offers a new way to investigate the flow-pressure characteristics of the diffusers by indirect measurement. That is, only the mean flux of the micropump needs to be measured for a specified configuration and excitation force; the analysis model is then used to calculate the corresponding mean flux for trial flow-pressure characteristics expressed in terms of the pressure-loss coefficients. If a set of pressure-loss coefficients yields a calculated mean flux sufficiently close to the measured values, it is regarded as the flow-pressure characteristics that are sought. This method requires an optimization method to search for the proper pressure-loss coefficients among the various candidates. Next, the identification procedure for the flow-pressure characteristic parameters of the diffusers suggested by Xu et al. (2002), based on the use of the IP-GA, is introduced. The forward solver is given by Pan et al. (2001).
13.8.2 Valveless Micropump
Figure 13.62 shows the schematic configuration of a valveless micropump. It consists of an oscillating membrane, a pump chamber, and two diffusers that act as dynamic passive valves at the inlet and outlet. When the oscillating membrane, driven by piezoelectric excitation, moves outward, the chamber volume increases correspondingly, and fluid on the inlet side and the outlet side flows into the chamber through the two diffusers. Because the flow in diffuser 1 is in the positive direction, its flow resistance is less than that in diffuser 2, where the flow is in the negative direction. The pressure-loss coefficient ζin−1 in diffuser 1 is less than ζin−2 in diffuser 2. As a result, the volumetric flux Qin−1 through diffuser 1 is larger than the volumetric flux Qin−2 through diffuser 2 (see Figure 13.62(a)). Conversely, when the membrane moves inward, the chamber volume decreases correspondingly, and the fluid in the chamber is forced to flow out through the two diffusers. In this circumstance, the flow in diffuser 1 is in the negative direction, while in diffuser 2 it is in the positive direction. The pressure-loss coefficient ζout−1 in diffuser 1 is larger than ζout−2 in diffuser 2. The volumetric flux Qout−1 in diffuser 1 is thus less than Qout−2 in diffuser 2 (see Figure 13.62(b)). Therefore, the following relationships hold under the same pressure drop: ζin−1 = ζout−2, ζout−1 = ζin−2, Qin−1 = Qout−2, and Qout−1 = Qin−2. Thus, a net volumetric flux Qout−2 – Qin−2 is transported from the inlet side to the outlet side within a complete pump cycle.

FIGURE 13.62 Cross-sectional view and operation of the micropump. (a) Fluid flows into the chamber; (b) fluid flows out of the chamber.

13.8.3 Flow-Pressure Coefficient Identification
The mean flux Q through diffuser 2 within a pump cycle is defined as
\[ Q = \frac{1}{T}\int_{T} Q_{out-2}\, dt \qquad (13.21) \]
and
\[ \zeta_p = \zeta_{in-1} = \zeta_{out-2}, \qquad \zeta_n = \zeta_{out-1} = \zeta_{in-2} \qquad (13.22) \]
where T is the period of a pump cycle. Clearly, Q is a function of ζp and ζn, so it is written as Q(ζp, ζn).

FIGURE 13.63 Convergence of the GA search for coefficient identification (ferr vs. generation number; ζp = 1.56, ζn = 0.98).

TABLE 13.14 Geometrical Sizes and Material Constants of the Micropump
a (µm)    h (µm)    E (N/m²)       ρ (kg/m³)    A (µm²)    ρm (kg/m³)
1200      60        2.1 × 10^10    2700         14,400     1200

The objective function for the flow-pressure characteristic identification is expressed as
\[ \text{minimize } f_{err}(\zeta_p, \zeta_n) = \left\{ \sum_{i=1}^{n_s} \left[ Q^c(\zeta_p, \zeta_n) - Q_i^m \right]^2 \right\}^{1/2}, \qquad \zeta_{p\min} \le \zeta_p \le \zeta_{p\max}, \quad \zeta_{n\min} \le \zeta_n \le \zeta_{n\max} \qquad (13.23) \]
where ζpmax, ζnmax and ζpmin, ζnmin are the maxima and minima of ζp and ζn, respectively. Qc(ζp, ζn) is the mean flux calculated using the forward solver (Pan et al., 2001), while Q_i^m is the measured mean flux at the ith trial. ns is the number of sampling points.
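The objective of Equation 13.23 can be sketched as a function that wraps any forward solver. In the sketch below, `toy_solver` is a made-up placeholder and is not the fluid–membrane model of Pan et al. (2001); the function names, the sample index passed to the solver, and the default bounds (taken from the search ranges quoted in Section 13.8.4) are assumptions for illustration.

```python
import math

def f_err(zeta_p, zeta_n, measured_flux, forward_solver):
    """Equation 13.23: root of the summed squared flux mismatch over all samples."""
    total = 0.0
    for i, q_measured in enumerate(measured_flux):
        q_calculated = forward_solver(zeta_p, zeta_n, i)
        total += (q_calculated - q_measured) ** 2
    return math.sqrt(total)

def in_bounds(zeta_p, zeta_n, p_range=(1.0, 2.0), n_range=(0.2, 1.5)):
    """Side constraints of Equation 13.23 on the two coefficients."""
    return p_range[0] <= zeta_p <= p_range[1] and n_range[0] <= zeta_n <= n_range[1]

def toy_solver(zeta_p, zeta_n, i):
    # Placeholder only: an arbitrary smooth function of the coefficients,
    # standing in for the real forward solver at the ith sampling point.
    return (i + 1) * (zeta_p - zeta_n)

# Noise-free "measurements" simulated with assumed true coefficients
measured = [toy_solver(1.56, 0.98, i) for i in range(30)]
print(f_err(1.56, 0.98, measured, toy_solver))  # zero at the true coefficients
print(f_err(1.50, 1.00, measured, toy_solver))  # positive elsewhere
```

A GA would maximize a fitness derived from this error, for example its negative, while keeping candidates inside the bounds.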
13.8.4 Numerical Examples
For the valveless micropump shown in Figure 13.62, the flow-pressure characteristics are identified using the identification model and the improved IP-GA. The geometrical sizes and material constants of the micropump are given in Table 13.14. It is assumed that there are three sets of measured mean fluxes (ns = 30) for this micropump. They correspond to three different sets of flow-pressure characteristics, i.e., pressure-loss coefficients. The three sets of pressure-loss coefficients are assumed as:
• Case 1: ζp = 1.56, ζn = 0.98
• Case 2: ζp = 1.2, ζn = 0.6
• Case 3: ζp = 1.35, ζn = 0.45
The measured mean fluxes are obtained from the results calculated using these sets of simulated pressure-loss coefficients (Pan et al., 2001), with white noise at a level of 5% added. In the inverse analysis, Equation 13.23 is used as the fitness function of the improved IP-GA. The search ranges of the two pressure-loss coefficients, ζp and ζn, are set to [1.0, 2.0] and [0.2, 1.5], respectively. Each coefficient is discretized into 2^15 (32,768) possible values, so the number of possible solutions for this identification problem is approximately 10^9. Clearly, this search space is very large. Figure 13.63 gives the convergence of the GA search for Case 1. Table 13.15 shows the identification results for these three cases. The maximal error of the identified pressure-loss coefficients with respect to their true values is –4.9, 2.8 and 5.9%, respectively.
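The size of the search space follows from the discretization. The sketch below assumes a plain 15-bit binary encoding with evenly spaced levels that include both ends of each range; the exact encoding used by the improved IP-GA is not stated in the text, and the name `decode_bits` is hypothetical.

```python
BITS = 15
LEVELS = 2 ** BITS  # 32768 candidate values per coefficient

def decode_bits(bits, low, high):
    """Map a 15-bit string to a value in [low, high] on an even grid (assumed)."""
    index = int(bits, 2)
    return low + (high - low) * index / (LEVELS - 1)

# Two coefficients, each with 2^15 levels
print(LEVELS ** 2)  # 1073741824, i.e., about 10^9 candidate solutions

# Ends of the search range for zeta_p
print(decode_bits("0" * BITS, 1.0, 2.0), decode_bits("1" * BITS, 1.0, 2.0))
```

With this grid the resolution is about 3 × 10^–5 for ζp and 4 × 10^–5 for ζn, far finer than the 5% noise level in the simulated measurements.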
13.8.5 Summary
The flow-pressure characteristic parameters of valveless micropumps are identified using the improved IP-GA and the fluid–membrane coupling vibration model. This provides a new way of identifying the dynamic flow-pressure characteristic parameters of valveless micropumps. This is very important because direct measurement of such parameters is very difficult for micro-devices.
TABLE 13.15 Results for the Pressure-Loss Coefficient Identification
          nIIP-GA    ζp (Error (%))    ζn (Error (%))
Case 1    77         1.511 (3.14)      0.950 (3.06)
Case 2    98         1.262 (5.16)      0.63 (5.0)
Case 3    56         1.344 (0.44)      0.448 (0.4)
Note: nIIP-GA is the number of function evaluations to convergence for the improved IP-GA.

13.9 Remarks
Further application examples of advanced computational inverse techniques have been presented in this chapter. Topics ranged from the electronic system, integral optical fibers, truss structure, thin films, interatomic potentials, and micropumps to the protein structure. These examples have demonstrated the broad applications of the inverse techniques. Note that many examples presented in this chapter are examined with only noise-free data. Further study can be conducted using noise-contaminated inputs to confirm the stability of these inverse procedures for those applications.
14 Total Solution for Engineering Systems: A New Concept
During the past decades, the authors often encountered engineering problems with insufficient input and/or output information. A systematic approach is required to deal with this type of problem effectively. A concept of total solution recently suggested by Liu et al. (2001c) for engineering mechanics problems is introduced in this chapter. The total solution aims to establish a systematic approach to provide a comprehensive solution for complex engineering problems, especially for traditional forward problems with incomplete input information (load, material property, boundary condition) and inverse problems with insufficient observations (displacement, acceleration, stress, etc.). The approach for obtaining a total solution is to formulate such an engineering problem as a parameter identification problem based on the forward solver of the problem. All the unknown parameterized information in this forward model is determined through an iterative procedure of conducting alternately forward and inverse (or mixed) analyses.
14.1
Introduction
In order to describe the real engineering mechanics problem clearly, the general Equation 2.71 is modified slightly to the following expression (see Figure 14.1): Y = H(X, P, C)
(14.1)
where Y = {y1, y2,…, ym } is a vector of output parameters or a vector of modelrelated paramters representing the effects; X = {x1, x2,…,xn} is a vector of input parameters representing the external causes; P = {p1, p2,…,pr} is a vector of material property parameters; C = {c1, c2,…,ck} is a vector of parameterized boundary conditions. P and C may depend on X for nonlinear problems but are constant in a linear problem. H(X, P, C) is a system transformation matrix
[Figure 14.1: block diagrams of the system Y = H(X, P, C), with input X, output Y, and parameters P and C, for four cases: (a) forward problem; (b) a case of inverse problem; (c) another case of inverse problem; (d) a mixed problem.]
FIGURE 14.1 Schematic illustration of engineering problems that need a total solution. Y = {y1, y2,…, ym} is a vector of output parameters or a vector of model-related coefficients representing the effects; X = {x1, x2,…,xn} is a vector of input parameters representing the external causes; P = {p1, p2,…,pr} is a vector of material property parameters; C = {c1, c2,…,ck} is a vector of parameterized boundary conditions. P and C may depend on X for nonlinear problems but are constant in a linear problem. H(X, P, C) is a system transformation matrix representing the translation process from input to output.
representing a forward solver. It is derived from the physical laws underlying the problem under consideration and may depend on the input X for nonlinear problems. For a linear problem, the matrix H is independent of the input X, and Equation 14.1 has the form of Equation 2.71.

According to the type of information sought from Equation 14.1, an engineering problem has traditionally been treated as either a forward problem (see Figure 14.1a) or an inverse problem (Figure 14.1b, c; refer also to Chapter 2 for the classification of problems). However, a commonly encountered difficulty in solving forward and inverse problems is that the complete known information required in the traditional sense is often unavailable (see Figure 14.1d). This results from various practical limitations. For example, even for a simple structural analysis problem, the boundary conditions are sometimes difficult to describe in a way that represents the real situation exactly. In these cases, they must be idealized as “fixed,” “clamped,” or “free” in order to obtain a solution. This kind of idealization is also made for other input information, such as initial conditions, material properties, and system configurations. Such idealization can lead to significant errors for real engineering mechanics problems.

For inverse problems, the difficulty is more obvious. The nonuniqueness, ill-posedness, and instability of inverse problems make them hard to solve even when the corresponding forward problems are relatively simple (Tikhonov et al., 1990; Tanaka and Bui, 1992; Haario, 1996; Anikonov, 1997). There are even more complex and challenging cases in reality, in which the input vector X, the output vector Y, and the parameter vectors P and C are only partly known, and the information sought is the remaining parts of the vectors X, Y, P, and C.

These kinds of problems cannot be properly formulated as forward problems or inverse problems in the traditional sense; thus, they are termed “mixed” problems. At present, mixed problems are very often treated as forward problems by artificially filling in the unknown information in the vectors X, P, and C, or as inverse problems by artificially filling in the unknown parts of the vector Y and the vectors X, P, and C based on idealized assumptions. This is obviously improper. Liu et al. (2001c) have introduced a unified concept of total solution to deal with all the problems described in Figure 14.1. Based on their study, this chapter introduces this new concept as an extension of inverse analysis.
14.2 Approaching a Total Solution
The concept of total solution is that the problem should be viewed and treated as it is, without artificial assumptions about any of X, Y, P, and C. The
approach is based on the assumption that a forward solver for the problem is always available. Sets of system parameters xi (i = 1,…,n), yi (i = 1,…,m), pi (i = 1,…,r), and ci (i = 1,…,k) are then selected and investigated to provide as much known information as possible. After that, an iterative procedure of alternately conducting forward and inverse (or mixed) analyses is implemented. All the unknown parameters in the vectors X, Y, P, and C are determined when the iterative process converges. In this solution procedure, forward analysis is conducted using the corresponding forward solver, while inverse (or mixed) analysis is performed using the sensitivity matrix-based method or the progressive NN method. The parameters sought may be parts of X, Y, P, and C; thus the obtained solution is termed a total solution.

With the use of this new concept, it is no longer necessary to artificially assume the unknown information in vectors X, Y, P, and C when solving forward or inverse problems with incomplete information. The general treatment of inverse problems is also greatly improved because the forward solution procedure is incorporated; it can directly check the correctness of a trial solution and rapidly correct the iterative direction of the solution procedure. Numerical examples have been presented to demonstrate the feasibility and validity of this approach as well as the implementation of these techniques.

The approach towards a total solution of engineering problems is primarily motivated by the following findings:
• For most forward and inverse problems found in engineering practice, the forward mathematical models are obtainable without too much difficulty due to the rapid development of computer science and numerical methods.
• Inverse problems can be interpreted as one kind of special forward problem with incomplete input information. In addition, forward problems with incomplete input information and inverse problems with insufficient observations of their effects can be treated as mixed problems. This means that these problems can be solved in the same framework.
• In dealing with ill-posedness of the problems (forward and inverse), it has been shown that as long as the known information is sufficient and sensitive enough, it is always possible to deal with the types I and II ill-posedness effectively and to obtain stable and accurate solutions. When discretized numerical methods are used as the forward solver, type III ill-posedness is well suppressed by the projection regularization together with regularization by filtering. Therefore, stable and accurate inverse solutions can be obtained.
• For most engineering problems, the bounds for parameters are often very narrow due to the experience and knowledge accumulated in the past. This helps to reduce or even remove nonuniqueness and hence the ill-posedness.
• Alternately conducting forward and inverse analyses provides a way for effective and timely transfer of information between the two analyses. It can be done easily to ensure sufficient information for each iteration of the analysis process. This also helps greatly in reducing ill-posedness and increasing stability in searching for a total solution of the problem.
14.2.1 Procedure for a Total Solution
Based on the idea stated previously, the procedure for obtaining a total solution can be described as follows (Liu et al., 2001c):
• Construct a forward solver (i.e., the forward computational model) for the problem by determining the system outputs yi (i = 1,…,m) based on the basic physical laws of mechanics, and by selecting the proper parameters xi (i = 1,…,n), pi (i = 1,…,r), and ci (i = 1,…,k) following the Ockham’s razor criterion of favoring simplicity (Santamarina et al., 1998). In choosing the forward model, the simpler model is always preferred, as long as its results are more accurate than the experimental measurements.
• Collect as much known information for these parameters as possible with the help of analytical, experimental, and even experiential knowledge. When the information is not sufficient, consider conducting additional or alternative experiments. The bounds of the unknowns should be set as narrow as possible. The experimental data should be filtered to remove as much noise as possible.
• Develop and validate the inverse procedures and algorithms, which are key to obtaining a total solution.
• Identify all unknown parameters in vectors X, Y, P, and C by alternately conducting forward and inverse (or mixed) analyses until the entire solution process converges.
14.2.2 Forward Solver
The forward solver must first be established from the basic physical laws. Constructing an effective, reliable, and fast forward solver is the first step towards a total solution. With the increasing complexity of problems in present engineering practice, discrete numerical methods have become a major effective tool to this end. The finite element method (FEM), finite volume method (FVM), boundary element method (BEM), mesh-free (MFree) method, wave analysis method, and many other numerical methods play an important role. Table 14.1 lists some commercially available software packages using FEM, FVM, BEM, and MFree methods and wave analysis codes. They can be directly used to solve most forward problems in engineering mechanics. For solving inverse or mixed problems, efforts have also been made to make use of these commercial software packages. These include the development of interfaces to use them as subroutines in the user's main program, for instance, for LS-DYNA (Liu and Ma, 2003), I-deas (Liu et al., 2002f), and NASTRAN (Pradhan et al., 2000; Tao et al., 2000; Yang et al., 2002f).

TABLE 14.1 Commercially Available Software Packages Using FEM, FVM, BEM, and MFree Methods and Wave Analysis Codes

Software | Method Used | Application Problems
ABAQUS | FEM (implicit, explicit) | Structural analysis, acoustics, thermal analysis, etc.
ADINA, DIANA | FEM (implicit) | Structural analysis, computational fluid dynamics, fluid–structural interaction, etc.
ANSYS | FEM (implicit) | Structural analysis, acoustics, thermal analysis, etc.
Fluent | FVM (implicit) | Turbulent flows, heat transfer, reacting flows, chemical mixing, combustion, and multiphase flows
I-deas | FEM (implicit) | Structural analysis, acoustics, thermal analysis, fluid flow, etc.
LS-DYNA | FEM (explicit) | Structural dynamics, computational fluid dynamics, fluid–structural interaction, etc.
MARC | FEM (implicit) | Structural analysis, acoustics, thermal analysis, etc.
MFree2D | Mesh free (adaptive refinement) | Two-dimensional solid mechanics
MSC-DYTRAN | FEM + FVM (explicit) | Structural dynamics, computational fluid dynamics, fluid–structural interaction, etc.
NASTRAN | FEM (implicit) | Structural analysis, acoustics, thermal analysis, etc.
Sysnoise | FEM/BEM (frequency domain) | Acoustics
TransWave | Hybrid numerical method (transient elastic wave solver) | Transient waves in anisotropic composite laminate
WaveFace | FEM type (characteristic of wave solver) | Phase, group velocities of waves in anisotropic composite laminate

In using discrete numerical methods, the density of the mesh is an important concern. A denser mesh does not necessarily produce better results for inverse problems. In fact, a coarser mesh can often produce better and more stable inverse solutions because of the projection regularization it provides. The mesh should be as coarse as possible while still producing results that are more accurate than the experiment can provide (see Section 3.4.5).
14.2.3 System Parameters
Selection of system parameters is very important in constructing an effective forward solver. All system parameters are assembled in four vectors, i.e., X = {x1, x2,…,xn}, Y = {y1, y2,…,ym}, P = {p1, p2,…,pr}, and C = {c1, c2,…,ck}. xi (i = 1,…,n) and yj (j = 1,…,m) are the discrete samples of some physical parameters at different times and locations. pi (i = 1,…,r) are the structural or material parameters such as the elastic modulus, mass density, etc. ci (i = 1,…,k) are the parameterized boundary condition values. Generally, these parameters should be sufficient to capture the essential characteristics of the problem. The number of parameters sought should be minimal, as emphasized in the Ockham’s razor criterion of favoring simplicity. The parameters, especially those in the output vector used as the known information, should be sensitive to the parameters sought.

The following simple example reveals the importance of a proper selection of system parameters. Consider damage detection in a cantilever beam. If only the natural frequencies are selected as the effect vector Y, the problem becomes very difficult to solve because the same changes in the lower-order natural frequencies can result from damage at different locations (Pandey et al., 1991). It is therefore not possible to identify the damage location correctly by using only the observed natural frequencies. Selecting the natural frequencies alone is thus incomplete or insufficient. To overcome this problem, it has been proposed to add other modal parameters such as the frequency parameters or the mode shapes to the vector Y to guarantee the uniqueness of the identified result, as in a similar case presented in Section 13.2. However, further studies have found that although using the natural frequencies together with the mode shapes as the effect vector Y can avoid the nonuniqueness problem, it can still be difficult to solve the damage detection problem in many cases. This is because such a Y vector may not be sensitive to the damage hidden in the structure (Pandey et al., 1991; Doebling et al., 1996). To solve the problems effectively, some researchers have recommended frequency response functions, curvatures of mode shapes, etc. as part of the effect vector Y. This example demonstrates that system parameters can be selected in different ways and that a proper selection is usually critical to the success of obtaining a total solution of practical engineering problems.

14.2.4 Mathematical Representation
Once the forward solver and the system parameters are determined, the problem, no matter what kind of information is sought, can be formulated as a parameter identification problem:

$$
\begin{aligned}
&\text{Finding: } \{X_1, P_1, C_1, Y_1\} \\
&\text{For given: } \{X_2, P_2, C_2, Y_2\} \\
&\text{Subject to: } \{Y_1, Y_2\} = F_{\text{solver}}(\{X_1, X_2\}, \{P_1, P_2\}, \{C_1, C_2\})
\end{aligned}
\tag{14.2}
$$

where Fsolver(·) denotes the forward solver and X = {X1, X2}, Y = {Y1, Y2}, P = {P1, P2}, and C = {C1, C2}, where the subvectors with subscript 1 are the unknown parts and those with subscript 2 are the known parts of the corresponding parameter vectors.
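As a small illustration of the partition used in Equation 14.2, the following sketch marks the unknown entries of a parameter vector with NaN and splits the vector into its unknown part (subscript 1) and known part (subscript 2). The NaN convention and the helper name `split_known_unknown` are assumptions made for this illustration only; they are not part of the original formulation. The numbers are taken from the circular plate example of Section 14.4.1.

```python
import numpy as np

def split_known_unknown(v):
    """Split a vector whose unknown entries are marked with NaN.

    Returns (idx_unknown, idx_known, known_values); the NaN marker is an
    illustrative convention, not something prescribed by the text.
    """
    v = np.asarray(v, dtype=float)
    unknown = np.isnan(v)
    idx1 = np.flatnonzero(unknown)    # subscript 1: unknown part
    idx2 = np.flatnonzero(~unknown)   # subscript 2: known part
    return idx1, idx2, v[idx2]

# Plate example: spring constants K and Phi unknown, thickness ratio t and
# Poisson's ratio nu known; the first three frequencies are measured.
Q = np.array([np.nan, np.nan, 1.5, 0.3])                  # [K, Phi, t, nu]
Y = np.array([4.6, 17.1, 50.8, np.nan, np.nan, np.nan])   # six frequencies
q1, q2, q_known = split_known_unknown(Q)
y1, y2, y_known = split_known_unknown(Y)
```

Here `q1` and `y1` index the quantities to be found, while `q2`, `y2` and the associated values are the given information.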
14.3 Inverse Algorithms
Inverse algorithms play a key role in obtaining a total solution for engineering problems. The optimization methods introduced in Chapter 4 and Chapter 5 as well as the progressive neural network introduced in Chapter 6 can be used as inverse algorithms in the process of searching for the total solution. As examples, the use of the sensitivity matrix-based method and the neural network as inverse algorithms in obtaining a total solution will be presented.
14.3.1 Sensitivity Matrix-Based Method (SMM)
14.3.1.1 Sensitivity-Based Equations
Equation 14.1 relates the cause vector X to the effect vector Y using the transformation matrix H; however, a sensitivity matrix S is used to relate a finite change between a compound vector of parameters $\bar{Q} = \{\bar{X}, \bar{P}, \bar{C}\}$ and the effect vector $\bar{Y}$,

$$\bar{Y} = S\bar{Q} \tag{14.3}$$

where

$$
\begin{aligned}
\bar{Y} &= \left\{ \frac{\Delta y_i}{y_i} \right\}, \quad i = 1, \ldots, m \\
\bar{Q} &= \left\{ \frac{\Delta q_j}{q_j} \right\}, \quad j = 1, \ldots, \psi; \quad \psi = n + r + k \\
S &= \left[ \frac{\Delta y_i}{\Delta q_j} \, \frac{q_j}{y_i} \right], \quad i = 1, \ldots, m; \quad j = 1, \ldots, \psi
\end{aligned}
\tag{14.4}
$$

in which $\{q_j\} = \{x_1, \ldots, x_n, p_1, \ldots, p_r, c_1, \ldots, c_k\}$, j = 1,…,ψ; ψ = n + r + k.

The sensitivity matrix S is usually obtained by:
• Analytical methods such as that derived from the sensitivity equations (Anju and Kawahara, 1997)
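• Numerical computations by using the forward solver, where the differentiations are approximately replaced by finite differences (Yeh, 1986)
• Specially designed experimental methods

By dividing the parameter vectors into the known and unknown parts, Equation 14.3 can be re-expressed as follows:

$$
\begin{Bmatrix} \bar{Y}_1 \\ \bar{Y}_2 \end{Bmatrix}
=
\begin{bmatrix}
S_{11} & S_{12} & S_{13} & S_{14} & S_{15} & S_{16} \\
S_{21} & S_{22} & S_{23} & S_{24} & S_{25} & S_{26}
\end{bmatrix}
\begin{Bmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \bar{P}_1 \\ \bar{P}_2 \\ \bar{C}_1 \\ \bar{C}_2 \end{Bmatrix}
\tag{14.5}
$$

in which the overbarred subvectors are the corresponding parts of the normalized change vectors of Equation 14.4. Then, all the known parts in the vectors Y, X, P, and C are moved to the left-hand side of Equation 14.5 and denoted as a new vector $\hat{Y}$, while all the unknown parts are moved to the right-hand side and denoted as a new vector $\hat{Q}$. Equation 14.5 is thus transformed into:

$$\hat{Y} = \hat{S}\hat{Q} \tag{14.6}$$

where

$$
\begin{aligned}
\hat{Y} &= \begin{Bmatrix} -S_{12}\bar{X}_2 - S_{14}\bar{P}_2 - S_{16}\bar{C}_2 \\ \bar{Y}_2 - S_{22}\bar{X}_2 - S_{24}\bar{P}_2 - S_{26}\bar{C}_2 \end{Bmatrix} \\
\hat{S} &= \begin{bmatrix} S_{11} & S_{13} & S_{15} & -I \\ S_{21} & S_{23} & S_{25} & 0 \end{bmatrix} \\
\hat{Q} &= \begin{bmatrix} \bar{X}_1 & \bar{P}_1 & \bar{C}_1 & \bar{Y}_1 \end{bmatrix}^T
\end{aligned}
\tag{14.7}
$$

For several special cases, $\hat{Y}$, $\hat{S}$, and $\hat{Q}$ are expressed as:

• Y, P, and C are known, X is sought:

$$
\hat{Y} = \bar{Y} - S_2\bar{P} - S_3\bar{C}, \quad \hat{Q} = \bar{X}, \quad \hat{S} = S_1
\tag{14.8}
$$

where $\bar{Y} = \begin{bmatrix} S_1 & S_2 & S_3 \end{bmatrix} \begin{bmatrix} \bar{X} & \bar{P} & \bar{C} \end{bmatrix}^T$.

• Y and X are known, P and C are sought:

$$
\hat{Y} = \bar{Y} - S_1\bar{X}, \quad \hat{Q} = \begin{bmatrix} \bar{P} & \bar{C} \end{bmatrix}^T, \quad \hat{S} = \begin{bmatrix} S_2 & S_3 \end{bmatrix}
\tag{14.9}
$$

• Y, X2, P2, and C2 are known, X1, P1, and C1 are sought:

$$
\hat{Y} = \bar{Y} - S_2\bar{X}_2 - S_4\bar{P}_2 - S_6\bar{C}_2, \quad \hat{Q} = \begin{bmatrix} \bar{X}_1 & \bar{P}_1 & \bar{C}_1 \end{bmatrix}^T, \quad \hat{S} = \begin{bmatrix} S_1 & S_3 & S_5 \end{bmatrix}
\tag{14.10}
$$

where $\bar{Y} = \begin{bmatrix} S_1 & S_2 & S_3 & S_4 & S_5 & S_6 \end{bmatrix} \begin{bmatrix} \bar{X}_1 & \bar{X}_2 & \bar{P}_1 & \bar{P}_2 & \bar{C}_1 & \bar{C}_2 \end{bmatrix}^T$.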
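The rearrangement of Equation 14.5 into Equations 14.6 and 14.7 can be sketched numerically as follows. For brevity, this sketch lumps X, P, and C into one compound parameter vector and selects the known and unknown parts by index arrays; the function name `assemble_mixed_system`, its argument names, and the random test matrix are assumptions made for illustration and do not come from the original work.

```python
import numpy as np

def assemble_mixed_system(S, q_unknown_idx, q_known_idx, q_known,
                          y_unknown_idx, y_known_idx, y_known):
    """Build S_hat and Y_hat so that Y_hat = S_hat @ [Q1; Y1] (Eq. 14.7 pattern)."""
    S = np.asarray(S, dtype=float)
    S1u = S[np.ix_(y_unknown_idx, q_unknown_idx)]  # unknown outputs, unknown params
    S1k = S[np.ix_(y_unknown_idx, q_known_idx)]    # unknown outputs, known params
    S2u = S[np.ix_(y_known_idx, q_unknown_idx)]    # known outputs, unknown params
    S2k = S[np.ix_(y_known_idx, q_known_idx)]      # known outputs, known params
    n1 = len(y_unknown_idx)
    S_hat = np.block([[S1u, -np.eye(n1)],
                      [S2u, np.zeros((len(y_known_idx), n1))]])
    Y_hat = np.concatenate([-S1k @ q_known, y_known - S2k @ q_known])
    return S_hat, Y_hat

# Consistency check on an arbitrary linear system Y = S Q.
rng = np.random.default_rng(0)
S = rng.uniform(0.5, 2.0, size=(5, 4))
Q = rng.uniform(-0.1, 0.1, size=4)
Y = S @ Q
qu, qk = np.array([0, 2]), np.array([1, 3])       # unknown / known parameters
yu, yk = np.array([0, 1]), np.array([2, 3, 4])    # unknown / known outputs
S_hat, Y_hat = assemble_mixed_system(S, qu, qk, Q[qk], yu, yk, Y[yk])
sol, *_ = np.linalg.lstsq(S_hat, Y_hat, rcond=None)
```

Solving the assembled system recovers the unknown parameters followed by the unknown outputs, which is the ordering of the unknown vector in Equation 14.7.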
14.3.1.2 Algorithms
Next, two algorithms are introduced.

14.3.1.2.1 Matrix Inversion Algorithms
Matrix inversion is a traditional kind of inverse problem algorithm in which the solution of Equation 14.6 is obtained directly by

$$\hat{Q} = [\hat{S}]^{-1}\hat{Y} \tag{14.11}$$

or

$$\hat{Q} = [\hat{S}]^{-g}\hat{Y} \tag{14.12}$$

where $[\hat{S}]^{-1}$ and $[\hat{S}]^{-g}$ are the inverse and a generalized inverse of $[\hat{S}]$, respectively. $[\hat{S}]^{-1}$ is obtainable only if $[\hat{S}]$ is square and nonsingular. In most cases, it is necessary to use the generalized inverse $[\hat{S}]^{-g}$ to calculate $\hat{Q}$. Table 3.1 gives several different representations of $[\hat{S}]^{-g}$ for the different cases.
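A minimal numerical sketch of Equation 14.12 is given here, assuming the Moore-Penrose pseudoinverse as the generalized inverse (one common choice, computed in NumPy through the singular value decomposition). The small 3 × 2 matrix is made up for illustration; the specific representations listed in Table 3.1 are not reproduced.

```python
import numpy as np

# Overdetermined system: three equations, two unknowns (illustrative values).
S_hat = np.array([[2.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
Q_true = np.array([0.5, -1.0])
Y_hat = S_hat @ Q_true

# Generalized inverse taken as the Moore-Penrose pseudoinverse (SVD based).
Q_pinv = np.linalg.pinv(S_hat) @ Y_hat

# Equivalent least-squares solve, usually preferred to forming the inverse.
Q_lstsq, *_ = np.linalg.lstsq(S_hat, Y_hat, rcond=None)
```

For noisy data, small singular values would normally be truncated or damped (for example through the `rcond` argument) rather than inverted exactly.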
14.3.1.2.2 Iterative Algorithms
This kind of algorithm gradually corrects the solution $\hat{Q} = \{q_k\}$ in an iterative procedure. The procedure is repeated until the error E between the known output, $\hat{Y} = \{y_i\}$, and the predicted output, $\hat{Y}^P = \{y_i^P\}$, from the solution $\hat{Q}$ is within the specified tolerance or satisfies the stopping criterion based on the discrepancy principle. Typical algorithms include the widely used algebraic reconstruction technique (ART), multiplicative ART (MART), and simultaneous iterative reconstruction technique (SIRT). Details of these techniques have been presented by Santamarina and Fratta (1998).

ART updates a trial solution $\hat{Q}$ starting with the first row i = 1 in Equation 14.6; updating proceeds row by row until the last row is taken into consideration. Then, the iterations continue by considering the first row again. This process stops only when the error E reaches a predefined tolerance. The correction of $q_k$ for the (s + 1)th iteration is calculated by:

$$
(q_k)^{s+1} = (q_k)^s + \frac{(y_i - y_i^P)\, s_{ik}}{\sum_k (s_{ik})^2}
\tag{14.13}
$$
MART follows the same updating process as ART; however, the correction of $q_k$ is conducted by:

$$
(q_k)^{s+1} = \frac{y_i}{\sum_k s_{ik}\,(q_k)^s}\,(q_k)^s \quad \text{if } s_{ik} \neq 0
\tag{14.14}
$$
That means that when the ith row is considered, the kth unknown, $q_k$, is updated only if it is affected by the ith row. Unlike ART and MART, SIRT delays the correction of the solution until all the rows in Equation 14.6 have been considered. Each $q_k$ is then updated by a weighted average of the corrections computed for all the rows:

$$
(q_k)^{s+1} = (q_k)^s + \frac{\displaystyle\sum_i s_{ik}\, \frac{(y_i - y_i^P)^s\, s_{ik}}{\sum_k (s_{ik})^2}}{\displaystyle\sum_i s_{ik}}
\tag{14.15}
$$
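The three row-action schemes can be sketched in a few lines each. This is an illustrative implementation under stated assumptions, not the authors' code: the SIRT weighting uses the coefficients s_ik as averaging weights, MART assumes positive data and a positive starting vector, all three use a fixed number of sweeps instead of the error-based stopping rule, and the small 0/1 test matrix and function names are made up for the demonstration.

```python
import numpy as np

def art(S, y, q0, sweeps=200):
    """Additive ART: correct q row by row (Kaczmarz-type projection)."""
    q = np.array(q0, dtype=float)
    for _ in range(sweeps):
        for i in range(S.shape[0]):
            row = S[i]
            denom = row @ row
            if denom > 0.0:
                q += (y[i] - row @ q) * row / denom
    return q

def mart(S, y, q0, sweeps=200):
    """Multiplicative ART: rescale the unknowns touched by row i."""
    q = np.array(q0, dtype=float)
    for _ in range(sweeps):
        for i in range(S.shape[0]):
            pred = S[i] @ q
            affected = S[i] != 0          # only unknowns with s_ik != 0
            q[affected] *= y[i] / pred
    return q

def sirt(S, y, q0, sweeps=500):
    """SIRT: average the row corrections (weights s_ik), then update once."""
    q = np.array(q0, dtype=float)
    row_norm2 = np.sum(S**2, axis=1)      # sum_k s_ik^2 for each row i
    col_sum = np.sum(S, axis=0)           # sum_i s_ik for each unknown k
    for _ in range(sweeps):
        resid = y - S @ q
        corr = S * (resid / row_norm2)[:, None]   # ART-type correction per row
        q += np.sum(S * corr, axis=0) / col_sum
    return q

# Small consistent test system with nonnegative coefficients.
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
q_true = np.array([1.0, 2.0, 3.0])
y = S @ q_true
q0 = np.ones(3)
q_art, q_mart, q_sirt = art(S, y, q0), mart(S, y, q0), sirt(S, y, q0)
```

On this small noise-free system all three schemes approach the same solution; with noisy data the iteration would instead be stopped once the output error falls to the noise level, in line with the discrepancy principle.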
14.3.1.3 Solution Procedure
Based on the preceding discussion, the solution procedure towards a total solution of a problem using the SMM can be described as:
1. Construct a forward model for the problem under consideration, which determines the system output Y.
2. Assume the unknown parameters $\{X_1^0, P_1^0, C_1^0\}$ from engineering experience; put them together with the known parameters $\{X_2, P_2, C_2\}$ into the forward solver to calculate the corresponding output $\{Y_1^0, Y_2^0\}$.
3. Calculate the sensitivity matrix $S^j$ centering on the present parameters $\{X_1^j, X_2, P_1^j, P_2, C_1^j, C_2, Y_1^j, Y_2^j\}$.
4. Draw the matrix $\hat{S}^j$ in Equation 14.6 from the sensitivity matrix, and construct the vector $\hat{Y}^j$ based on the difference between the known $Y_2$ and the calculated $Y_2^j$.
5. Compute the parameter increment vector $\hat{Q}^j$ in Equation 14.6 using the matrix inversion algorithms or the iterative algorithms.
6. Obtain a new set of parameters $\{X_1^{j+1}, P_1^{j+1}, C_1^{j+1}, Y_1'^{\,j+1}\}$ by adding the increment $\hat{Q}^j$ to the solution $\{X_1^j, P_1^j, C_1^j, Y_1^j\}$ of the jth iteration.
7. Calculate the output $\{Y_1^{j+1}, Y_2^{j+1}\}$ again from the forward solver using the newly obtained $\{X_1^{j+1}, P_1^{j+1}, C_1^{j+1}\}$ and the known $\{X_2, P_2, C_2\}$.
8. Compare the following five errors: between (a) the calculated $Y_2^{j+1}$ and the known $Y_2$, (b) the calculated $Y_1^{j+1}$ and the computed $Y_1'^{\,j+1}$, (c) $X_1^{j+1}$ and $X_1^j$, (d) $P_1^{j+1}$ and $P_1^j$, and (e) $C_1^{j+1}$ and $C_1^j$.
9. If all these errors are within the predefined tolerance, the set of parameters $\{X_1^{j+1}, P_1^{j+1}, C_1^{j+1}, Y_1^{j+1}\}$ is considered the total solution of the problem and the solution procedure ends. Otherwise, set j := j + 1 and go back to step 3 to calculate the new sensitivity matrix $S^j$.

This procedure is illustrated in the flow chart of Figure 14.2.

14.3.1.4 Comments on SMM
The SMM is very suitable for solving complex problems of engineering mechanics, mainly because the SMM uses a sensitivity matrix S rather than the traditional transformation matrix H(X, P, C) to relate the system parameters. The matrix S is always obtainable without too much difficulty when the forward solver is available, even for arbitrarily complex mechanics problems; however, the matrix H(X, P, C) is sometimes very difficult to calculate because each column in the matrix is a shifted version of the system impulse response or the discrete expression of Green’s function. These are usually computationally expensive.
In addition, the sought parameter vector X can be expressed in an explicit form in Equation 14.1 only for linear problems such as identification of external loadings (Liu et al., 2002e). In that case, identification can be successfully conducted using the inverse problem algorithms based on Equation 14.1. For nonlinear problems such as detection of cracks in structures or identification of material properties, the parameters to be sought are involved in the matrix H(X, P, C). The conventional inverse problem algorithms based on Equation 14.1 then become very difficult to apply. However, the SMM accommodates nonlinear problems very well without any extra difficulty because, in the SMM, the sought parameters are always expressed in an explicit form. Therefore, identification for these
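[Figure 14.2: flow chart. Construct the forward solver and set j := 0. The assumed $X_1^0, P_1^0, C_1^0$ and the known $X_2, P_2, C_2$ are fed to the forward solver to calculate $Y_1^0, Y_2^0$. From the present set $X_1^j, X_2, P_1^j, P_2, C_1^j, C_2, Y_1^j, Y_2^j$, the sensitivity matrix $S^j$ is calculated, the matrix $\hat{S}^j$ and the vector $\hat{Y}^j$ are formed, and the SMM computes $\hat{Q}^j$, giving the updated $X_1^{j+1}, P_1^{j+1}, C_1^{j+1}, Y_1'^{\,j+1}$. The forward solver then calculates $Y_1^{j+1}, Y_2^{j+1}$ from the computed $X_1^{j+1}, P_1^{j+1}, C_1^{j+1}$ and the known $X_2, P_2, C_2$. The convergence checks are $\|Y_2^{j+1} - Y_2\| \le \varepsilon_{y2}$, $\|Y_1^{j+1} - Y_1'^{\,j+1}\| \le \varepsilon_{y1}$, $\|X_1^{j+1} - X_1^j\| \le \varepsilon_x$, $\|P_1^{j+1} - P_1^j\| \le \varepsilon_p$, and $\|C_1^{j+1} - C_1^j\| \le \varepsilon_c$. If all are satisfied (Yes), $X_1^{j+1}, P_1^{j+1}, C_1^{j+1}, Y_1^{j+1}$ is the total solution and the procedure ends; if not (No), j := j + 1 and the loop repeats.]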
FIGURE 14.2 Solution procedure towards a total solution using the sensitivity matrix-based method (SMM). (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
parameters can be easily performed using the matrix inversion or iterative algorithms.

As with any gradient-based approach, the SMM requires an initial estimate of the unknown parameters involved in the parameter vectors X, P, and C. This requires a rational combination of scientific knowledge and engineering experience. A proper estimate of these parameters would surely accelerate the convergence of the solution procedure towards a total solution. If the bounds of the parameters are too wide and the initial estimate is poor, the SMM may fail to find the true solution.
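The SMM solution procedure of Section 14.3.1.3 can be sketched as a short loop. Everything specific in this sketch is an assumption made for illustration: the forward solver is a hypothetical closed-form cantilever model (tip deflection and the first two bending frequencies from Euler-Bernoulli theory) rather than one of the solvers used in this chapter, the sensitivities are forward finite differences, the increments of the known parameters are zero so their terms drop out of Equation 14.7, the five error checks are condensed into three relative measures, and at least one output is assumed unknown.

```python
import numpy as np

def beam_forward(q, I=1e-8, rhoA=1.0):
    """Hypothetical forward solver: cantilever with tip load.
    q = [F, E, L]; returns [tip deflection, f1, f2]."""
    F, E, L = q
    coef = np.sqrt(E * I / (rhoA * L**4)) / (2.0 * np.pi)
    return np.array([F * L**3 / (3.0 * E * I),
                     1.8751**2 * coef,
                     4.6941**2 * coef])

def normalized_sensitivity(forward, q, rel_step=1e-6):
    """Finite-difference estimate of S_ij = (dy_i/dq_j) * (q_j / y_i)."""
    y0 = forward(q)
    S = np.empty((y0.size, q.size))
    for j in range(q.size):
        qp = q.copy()
        qp[j] *= 1.0 + rel_step
        S[:, j] = (forward(qp) - y0) / y0 / rel_step
    return S, y0

def smm_total_solution(forward, q_init, unknown_q, known_y, y_known,
                       tol=1e-8, max_iter=50):
    """Alternate forward analyses and sensitivity-based inverse updates."""
    q = np.array(q_init, dtype=float)
    unknown_q, known_y = np.asarray(unknown_q), np.asarray(known_y)
    unknown_y = np.setdiff1d(np.arange(forward(q).size), known_y)
    for _ in range(max_iter):
        S, y = normalized_sensitivity(forward, q)
        # Eq. 14.7 pattern with zero increments for the known parameters.
        S_hat = np.block([
            [S[np.ix_(unknown_y, unknown_q)], -np.eye(unknown_y.size)],
            [S[np.ix_(known_y, unknown_q)],
             np.zeros((known_y.size, unknown_y.size))]])
        Y_hat = np.concatenate([np.zeros(unknown_y.size),
                                (y_known - y[known_y]) / y[known_y]])
        Q_hat, *_ = np.linalg.lstsq(S_hat, Y_hat, rcond=None)
        dq, dy1 = Q_hat[:unknown_q.size], Q_hat[unknown_q.size:]
        q[unknown_q] *= 1.0 + dq                  # relative parameter update
        y1_pred = y[unknown_y] * (1.0 + dy1)      # outputs predicted by the SMM
        y_new = forward(q)                        # outputs from forward solver
        errs = [np.max(np.abs(y_new[known_y] - y_known) / np.abs(y_known)),
                np.max(np.abs(y_new[unknown_y] - y1_pred) / np.abs(y_new[unknown_y])),
                np.max(np.abs(dq))]
        if tol > max(errs):
            break
    return q, y_new

# Load F and modulus E unknown (starting 50% off), length L known;
# tip deflection and f1 measured, f2 to be found.
q_true = np.array([10.0, 2.0e11, 1.0])
y_true = beam_forward(q_true)
known_y = np.array([0, 1])
q_sol, y_sol = smm_total_solution(beam_forward, [5.0, 1.0e11, 1.0],
                                  [0, 1], known_y, y_true[known_y])
```

In this toy problem the unknown load, the unknown modulus, and the unmeasured second frequency are recovered together, which mirrors the idea of solving for parts of X, P, and Y at once.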
14.3.2 Neural Network
Based on the progressive neural network discussed in Chapter 6, the solution procedure toward a total solution using the NN method can be described as:
1. Construct a forward solver for the problem under consideration, which determines the system output Y.
2. Select an NN model and initially train this model. This includes: (a) determining the architecture of the NN model, with the output vector Y = {Y1, Y2} as the input of the NN model and the compound vector Q = {X1, X2, P1, P2, C1, C2} as the output of the NN model (this is deliberately set to be opposite to the forward solution procedure); (b) generating a set of initial training samples using the orthogonal array-based method; and (c) training the NN model using the improved learning algorithm until it converges.
3. Assume the unknown parameters $\{X_1^0, P_1^0, C_1^0\}$ and put them together with the known $\{X_2, P_2, C_2\}$ into the forward solver to calculate the output $Y^0 = \{Y_1^0, Y_2^0\}$.
4. Feed the calculated $Y_1^j$ and the known $Y_2$ into the trained NN model to compute the corresponding output $\{X_1^j, X_2^j, P_1^j, P_2^j, C_1^j, C_2^j\}$.
5. Examine the output of the NN model and retrain the model using the adjusted samples as stated earlier until a satisfactory result is obtained.
6. Put the computed $\{X_1^j, P_1^j, C_1^j\}$ and the known $\{X_2, P_2, C_2\}$ into the forward solver to calculate the output $Y^{j+1} = \{Y_1^{j+1}, Y_2^{j+1}\}$ again.
7. Compare the following five errors: between (a) the calculated $Y_2^{j+1}$ and the known $Y_2$, (b) $Y_1^{j+1}$ and $Y_1^j$, (c) $X_1^j$ and $X_1^{j-1}$, (d) $P_1^j$ and $P_1^{j-1}$, and (e) $C_1^j$ and $C_1^{j-1}$.
8. If all these errors are within the predefined tolerance, the set of parameters $\{X_1^j, P_1^j, C_1^j\}$ is considered to be the total solution of the problem and the solution procedure ends. Otherwise, set j := j + 1 and go back to step 4 to compute the new output $\{X_1^j, X_2^j, P_1^j, P_2^j, C_1^j, C_2^j\}$ by feeding the calculated $Y_1^j$ and the known $Y_2$ into the retrained NN model.
This procedure is illustrated in the flow chart of Figure 14.3. The NN method is especially suitable for complex engineering problems when the underlying transformations are unknown or when the event takes
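[Figure 14.3: flow chart. Construct the forward solver, initially train the NN model, and set j := 0. The assumed $X_1^0, P_1^0, C_1^0$ and the known $X_2, P_2, C_2$ are fed to the forward solver to calculate $Y_1^0, Y_2^0$. The calculated $Y_1^j$ and the known $Y_2$ are fed to the NN, which computes $X_1^j, X_2^j, P_1^j, P_2^j, C_1^j, C_2^j$. The NN output is examined; if it is unsatisfactory (No), the NN model is retrained. Otherwise (Yes), the computed $X_1^j, P_1^j, C_1^j$ and the known $X_2, P_2, C_2$ are passed to the forward solver to calculate $Y_1^{j+1}, Y_2^{j+1}$. The convergence checks are $\|Y_2^{j+1} - Y_2\| \le \varepsilon_{y2}$, $\|Y_1^{j+1} - Y_1^j\| \le \varepsilon_{y1}$, $\|X_1^j - X_1^{j-1}\| \le \varepsilon_x$, $\|P_1^j - P_1^{j-1}\| \le \varepsilon_p$, and $\|C_1^j - C_1^{j-1}\| \le \varepsilon_c$. If all are satisfied (Yes), $X_1^j, P_1^j, C_1^j, Y_1^j$ is the total solution and the procedure ends; if not (No), j := j + 1 and the loop repeats.]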
FIGURE 14.3 Solution procedure towards a total solution using the progressive neural network. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
place in a noisy environment. However, it cannot provide precise physical or mathematical insights into the underlying physical laws. In using an NN, care must be taken to avoid over-fitting (see Section 6.4.4).
14.4 Numerical Examples
Two numerical examples are given in this section to validate the approach and the corresponding algorithms. One refers to a forward problem of
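[Figure 14.4: cross section of the stepped circular plate of radius a, with thickness t1 for a/2 ≤ r ≤ a and t2 for 0 ≤ r ≤ a/2, supported at its edge by rotational springs Φ and translational springs K.]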
FIGURE 14.4 Vibration analysis of a circular plate with elastic supports. The circular plate with step thickness, where t1 and t2 are the thickness of plate in the regions a/2 ≤ r ≤ a and 0 ≤ r ≤ a/2, respectively. Φ and K are the rotational and translational spring constants of the elastic supports in the boundary, respectively. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
vibration analysis with incomplete descriptions of their causes, and the other to an inverse problem of identification for the material properties with insufficient observations of their effects. Both are treated as mixed problems.
14.4.1 Vibration Analysis of a Circular Plate
Figure 14.4 shows a circular plate with stepped thickness over a concentric region, where t1 and t2 are the thicknesses of the plate in the regions a/2 ≤ r ≤ a and 0 ≤ r ≤ a/2, respectively. E, ν, and ρ are the elastic modulus, Poisson’s ratio, and mass density of the plate, respectively, and Φ and K are the rotational and translational spring constants, respectively, which model the elastic supports at the boundary. For the sake of simplicity, the following characteristic parameters are used in this example:

$$
\begin{aligned}
\bar{\omega}_m &= \omega_m a^2 \sqrt{\rho t_1 / D_1} \\
\bar{K} &= K a^3 / D_1 \\
\bar{\Phi} &= \Phi a^3 / D_1 \\
D_1 &= E t_1^3 / [12(1 - \nu^2)] \\
\bar{t} &= t_2 / t_1
\end{aligned}
\tag{14.16}
$$
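The characteristic parameters of Equation 14.16 can be evaluated with a small helper such as the one sketched here. The function name, the argument order, and the numerical values are hypothetical; the formulas follow Equation 14.16 as printed, with the square root in the frequency parameter taken from dimensional consistency.

```python
import math

def characteristic_parameters(omega, K, Phi, E, nu, rho, a, t1, t2):
    """Dimensionless plate parameters following Eq. 14.16 (illustrative helper)."""
    D1 = E * t1**3 / (12.0 * (1.0 - nu**2))   # flexural rigidity of outer region
    return {
        "D1": D1,
        "omega_bar": omega * a**2 * math.sqrt(rho * t1 / D1),
        "K_bar": K * a**3 / D1,
        "Phi_bar": Phi * a**3 / D1,
        "t_bar": t2 / t1,
    }

# Made-up dimensional values chosen so that D1 = 1 for easy checking.
params = characteristic_parameters(omega=1.0, K=2.0, Phi=0.5,
                                   E=12.0 * (1.0 - 0.3**2), nu=0.3,
                                   rho=1.0, a=2.0, t1=1.0, t2=1.5)
```

With these inputs the helper returns the dimensionless values used as targets in the example that follows (spring parameters 16 and 4, thickness ratio 1.5).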
The objective of vibration analysis for this problem is to calculate the first six natural frequencies $\bar{\omega}_i$ (i = 1,…,6). It is clear that if the parameters $\bar{t}$, ν, $\bar{K}$, and $\bar{\Phi}$ are completely provided, this solution is easily obtained by using the differential quadrature method (DQM) reported by Wu and Liu (2000). To test the approach and the corresponding inverse problem algorithms, it is intentionally assumed that the spring constants $\bar{K}$ and $\bar{\Phi}$ are unknown. This case is very likely to happen in real situations; the problem would then be very difficult to solve using conventional techniques such as the DQM because the necessary boundary conditions are lacking. According to the approach towards a total solution, this problem is first treated as a mixed problem, where the first three frequencies, $\bar{\omega}_m$ (m = 1, 2, 3), are assumed to be experimentally obtainable. The parameter vectors for this problem are thus set up as Q = {Q1, Q2}, Q1 = {$\bar{K}$, $\bar{\Phi}$}, Q2 = {$\bar{t}$, ν}, Y = {Y1, Y2}, Y1 = {$\bar{\omega}_4$, $\bar{\omega}_5$, $\bar{\omega}_6$}, and Y2 = {$\bar{\omega}_1$, $\bar{\omega}_2$, $\bar{\omega}_3$}. DQM is used as the forward solver.

In this numerical experiment, the actual values of the parameters $\bar{t}$, ν, $\bar{K}$, and $\bar{\Phi}$ are 1.5, 0.3, 16, and 4, respectively. The first six frequencies, $\bar{\omega}_m$ (m = 1,…,6), corresponding to this set of physical parameters are calculated to be 4.6, 17.1, 50.8, 113.3, 196.4, and 302.7, respectively, from the forward solver DQM. This study demonstrates how the parameters $\bar{\omega}_4$, $\bar{\omega}_5$, $\bar{\omega}_6$, $\bar{K}$, and $\bar{\Phi}$ are solved from the known $\bar{t}$, ν, $\bar{\omega}_1$, $\bar{\omega}_2$, and $\bar{\omega}_3$ based on the concept of total solution. The SMM and the NN methods are used to this end.

14.4.1.1 SMM Solution
According to the solution procedure described in Section 14.3.1, it is first assumed that the initial values of $\bar{K}$ and $\bar{\Phi}$ are 8 and 2, respectively, 50% off from their actual values. Putting these assumed values of $\bar{K}$ and $\bar{\Phi}$ and the known $\bar{t}$ of 1.5 and ν of 0.3 into the forward solver DQM yields the corresponding first six frequencies, $\bar{\omega}_m$ (m = 1,…,6), as 3.43, 15.31, 48.53, 110.34, 193.78, and 299.21, respectively. Then, a sensitivity matrix $S^0$ centering on this set of parameters is obtained by varying each parameter among $\bar{t}$, ν, $\bar{K}$, and $\bar{\Phi}$ and calculating the corresponding changes of $\bar{\omega}_m$ (m = 1,…,6) for each varied combination of parameters. Subsequently, the matrix $\hat{S}$ and the vector $\hat{Y}$ in Equation 14.6 are formed. Based on this newly formed Equation 14.6 and using the singular value decomposition algorithm, the frequencies $\bar{\omega}_4$, $\bar{\omega}_5$, and $\bar{\omega}_6$ and the parameters $\bar{K}$ and $\bar{\Phi}$ are computed to be 111.1, 194.2, 300.1, and 11.3, 2.7, respectively (see the second row of Table 14.2). Replacing the original values of $\bar{K}$ and $\bar{\Phi}$ with the newly computed values of 11.3 and 2.7 and putting them together with the known $\bar{t}$ and ν into the forward solver DQM once again, a new set of frequencies $\bar{\omega}_m$ (m = 1,…,6) is calculated to be 4.0, 16.1, 49.4, 111.5, 194.7, and 300.5 (see the third row of Table 14.2). Then, the five kinds of errors defined in Section 14.3.1 are examined to decide whether the solution procedure ends (also see Figure 14.2). All of them exceed the tolerance at this
TABLE 14.2 Solution Procedure towards a Total Solution Using SMM
Input parameters: t, ν, K, Φ. Output parameters: ω1 to ω6.
Iteration No.  Method  t    ν    K      Φ     ω1   ω2    ω3    ω4     ω5     ω6
Initial        DQM     1.5  0.3  8^a    2^a   3.4  15.3  48.5  110.3  193.7  299.2
1              SMM     1.5  0.3  11.3   2.7   3.4  15.3  48.5  111.1  194.2  300.1
1              DQM     1.5  0.3  11.3   2.7   4.0  16.1  49.4  111.5  194.7  300.5
2              SMM     1.5  0.3  15.1   3.4   4.0  16.1  49.4  112.3  195.1  301.6
2              DQM     1.5  0.3  15.1   3.4   4.5  16.8  50.3  112.5  195.7  301.7
3              SMM     1.5  0.3  16.8   3.8   4.5  16.8  50.3  113.2  196.1  302.6
3              DQM     1.5  0.3  16.8   3.8   4.7  17.1  50.6  112.9  196.0  302.2
4              SMM     1.5  0.3  16.3   3.7   4.7  17.1  50.6  113.0  195.8  302.4
4              DQM     1.5  0.3  16.3   3.7   4.7  17.1  50.4  112.9  195.0  301.9
5              SMM     1.5  0.3  15.7   3.6   4.7  17.1  50.4  112.4  195.7  302.0
5              DQM     1.5  0.3  15.7   3.6   4.6  16.9  50.5  112.8  195.9  302.1
6              SMM     1.5  0.3  15.8   3.8   4.6  16.9  50.5  113.2  196.1  302.4
6              DQM     1.5  0.3  15.8   3.8   4.6  17.0  50.5  113.1  196.2  302.2
Target                 1.5  0.3  16     4     4.6  17.1  50.8  113.3  196.4  302.7
^a Initially assumed values.
Note: Data in bold are values obtained at the corresponding solution process. Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.
iteration, so the iterative solution process must continue. After six such iterations, the maximal error between the solved and the actual values for all the sought parameters decreases to –5%, and the solution procedure ends. Table 14.2 gives the corresponding solutions at each iteration. Figure 14.5 shows four sensitivity matrices in the iteration procedure, and Figure 14.6 shows the convergence process of the frequency error e_i(ω_i), defined as
\[ e_i(\omega_i) = \frac{\omega_i^{j+1} - \omega_i^{0}}{\omega_i^{j+1}}, \qquad i = 1, \ldots, 6 \tag{14.17} \]
where ω_i^{j+1} (i = 1, …, 6) are the frequencies calculated from the DQM at iteration j + 1, ω_i^0 (i = 1, 2, 3) are the known frequencies, and ω_i^0 (i = 4, 5, 6) are the frequencies computed from the SMM at iteration j + 1.
14.4.1.2 Progressive NN Solution
For the present problem, the NN model is constructed with two hidden layers; the numbers of neurons in the input layer, output layer, and first and second hidden layers are 6, 4, 21, and 7, respectively. The six input neurons correspond to the six output parameters, ω_m (m = 1, …, 6), while the four output neurons correspond to the four input parameters t, ν, K, and Φ. A total of 32 samples is generated for initially training this NN model, among which 16 samples are determined by the orthogonal array L16(4^5).
FIGURE 14.5 Sensitivity matrices produced in solution procedure: (a) S0 at initial solution; (b) S1 at 1st iteration; (c) S3 at 3rd iteration; (d) S5 at 5th iteration. Each panel plots the sensitivity of the input parameters t, ν, K, and Φ to the output frequencies ω1 to ω6 on a vertical scale from –0.2 to 1.0. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
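Sensitivity matrices of the kind shown in Figure 14.5 can be approximated by perturbing one input parameter at a time and recording the relative change of each output. The following is only a minimal sketch under stated assumptions: `toy_forward` is a hypothetical closed-form stand-in for the DQM solver, the 1% perturbation step is an arbitrary choice, and the least-squares step is a generic SVD-based solve, not the book's Equation 14.6.

```python
import numpy as np

def toy_forward(q):
    """Hypothetical stand-in for the DQM forward solver (not the book's model)."""
    t, nu, K, Phi = q
    i = np.arange(1, 7)
    return t * np.sqrt(i**4 * (1.0 + nu / i) + K + Phi * i**2)

def normalized_sensitivity(forward, q, rel_step=0.01):
    """S[i, j] = relative change of output i per relative change of input j."""
    q = np.asarray(q, dtype=float)
    y0 = forward(q)
    S = np.empty((y0.size, q.size))
    for j in range(q.size):
        q_pert = q.copy()
        q_pert[j] *= 1.0 + rel_step          # perturb one parameter at a time
        S[:, j] = (forward(q_pert) - y0) / y0 / rel_step
    return S

q0 = [1.5, 0.3, 8.0, 2.0]                    # t, nu, and the initial guesses of K and Phi
S0 = normalized_sensitivity(toy_forward, q0)

# Synthetic check: build a relative output change from a known relative
# parameter change, then recover it with an SVD-based least-squares solve.
dq_true = np.array([0.0, 0.0, 0.1, 0.1])
dy = S0 @ dq_true
dq, *_ = np.linalg.lstsq(S0, dy, rcond=None)
```

Because the toy output is proportional to t, the first column of `S0` is 1 for every frequency, which gives a quick sanity check on the finite-difference routine.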
This array is selected by setting each of the four parameters t, ν, K, and Φ with five discrete values. Another 16 samples are generated from the random-based method. These sample data are then normalized to ensure that they are not too close to 0 or 1, so as to avoid numerical difficulties in the training process. The improved learning algorithm and the progressive training strategy described in Chapter 6 are used to train this NN model, where the learning rate η and the jump factor γ are set to 1.8 and 0.02, respectively. Figure 14.7 shows the convergence process of the error norm within the first 5000 training iterations in the first episode of training. This training process ends after 8253 iterations, when the given convergence criterion is fulfilled.
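The sampling and normalization step can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: L16(4^5) is read in the usual Taguchi sense (16 runs, up to five columns, four levels per column) and built from GF(4) arithmetic, the level values in `levels` are hypothetical, and the normalization interval [0.1, 0.9] is an assumed choice that keeps data away from 0 and 1.

```python
import itertools
import numpy as np

# GF(4) multiplication table; addition in GF(4) is bitwise XOR on labels 0..3.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

def l16_array():
    """16-run, 5-column, 4-level orthogonal array: columns a, b, a+b, a+2b, a+3b."""
    rows = []
    for a, b in itertools.product(range(4), repeat=2):
        rows.append([a, b, a ^ b, a ^ GF4_MUL[2][b], a ^ GF4_MUL[3][b]])
    return np.array(rows)

def scale_to_range(x, lo=0.1, hi=0.9):
    """Column-wise linear map of samples into [lo, hi] (assumed interval)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)

oa = l16_array()

# Hypothetical level values for the four parameters (first four columns used).
levels = {
    "t": [1.0, 1.5, 2.0, 2.5],
    "nu": [0.2, 0.3, 0.4, 0.5],
    "K": [4.0, 8.0, 16.0, 32.0],
    "Phi": [1.0, 2.0, 4.0, 8.0],
}
samples = np.column_stack(
    [np.take(levels[name], oa[:, k]) for k, name in enumerate(levels)]
)
normalized = scale_to_range(samples)
```

In an orthogonal array every pair of columns contains each combination of levels equally often, so 16 runs cover the parameter space far more evenly than 16 arbitrary grid points would.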
FIGURE 14.6 Convergence process of the frequency errors e_i(ω_i) (i = 1, …, 6), plotted as error (%) against the number of iterations (initial, 1 to 6) for ω1 to ω6. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
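As a worked instance of Equation 14.17, the short sketch below evaluates the frequency errors for iteration 1 using the values listed in Table 14.2. The reference vector combines the three known frequencies with the three SMM estimates from that iteration; the function name `frequency_error` is only illustrative.

```python
import numpy as np

def frequency_error(omega_calc, omega_ref):
    """Relative error of Equation 14.17: (calculated - reference) / calculated."""
    omega_calc = np.asarray(omega_calc, dtype=float)
    omega_ref = np.asarray(omega_ref, dtype=float)
    return (omega_calc - omega_ref) / omega_calc

# Iteration 1 of Table 14.2: DQM frequencies after updating K and Phi.
omega_dqm = [4.0, 16.1, 49.4, 111.5, 194.7, 300.5]
# Known omega_1..3 (target row) and SMM estimates of omega_4..6 (row "1 SMM").
omega_ref = [4.6, 17.1, 50.8, 111.1, 194.2, 300.1]

errors_percent = 100.0 * frequency_error(omega_dqm, omega_ref)
```

The first entry evaluates to (4.0 − 4.6)/4.0, that is, –15%, while the three higher frequencies already agree with their SMM estimates to well within 1%.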
FIGURE 14.7 Convergence process of error norm in initial training of NN model, plotted against the number of iterations in training (0 to 5000). Neurons: 6 – 21 – 7 – 4; learning rate η = 1.8; jumping factor γ = 0.02. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
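To make the 6 – 21 – 7 – 4 layout concrete, the sketch below builds a network of that shape and runs one forward pass. It is a generic feed-forward model under assumptions: sigmoid activations and random initial weights are illustrative choices, and the improved learning algorithm and progressive training of Chapter 6 are not reproduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(sizes, seed=0):
    """Random weights and zero biases for consecutive layer pairs."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.5, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x):
    """Propagate one normalized input vector through all layers."""
    a = np.asarray(x, dtype=float)
    for W, b in params:
        a = sigmoid(W @ a + b)
    return a

sizes = [6, 21, 7, 4]          # input, first hidden, second hidden, output
params = init_mlp(sizes)
n_parameters = sum(W.size + b.size for W, b in params)

# Six normalized frequencies in, four normalized parameters (t, nu, K, Phi) out.
x = np.full(6, 0.5)
y = forward(params, x)
```

With these layer sizes the model has 333 trainable weights and biases, which indicates how small the inverse model is compared with the 32 initial training samples.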
TABLE 14.3 Solution Procedure towards a Total Solution Using Progressive NN (PNN) Method
Input parameters: t, ν, K, Φ. Output parameters: ω1 to ω6.
Iteration No.  Method  t     ν     K      Φ     ω1   ω2    ω3    ω4     ω5     ω6
Initial        DQM     1.5   0.3   8^a    2^a   3.4  15.3  48.5  110.3  193.7  299.2
1              PNN     1.48  0.39  14.36  2.58  4.6  17.1  50.8  110.3  193.7  299.2
1              DQM     1.5   0.3   14.36  2.58  4.5  16.6  49.5  111.4  194.6  300.3
2              PNN     1.49  0.37  15.24  2.91  4.6  17.1  50.8  111.4  194.6  300.3
2              DQM     1.5   0.3   15.24  2.91  4.5  16.7  49.8  111.8  195.0  300.9
3              PNN     1.51  0.35  16.67  3.31  4.6  17.1  50.8  111.8  195.0  300.9
3              DQM     1.5   0.3   16.67  3.31  4.6  17.0  50.2  112.4  195.6  301.6
4              PNN     1.50  0.34  16.53  3.72  4.6  17.1  50.8  112.4  195.6  301.6
4              DQM     1.5   0.3   16.53  3.72  4.7  17.1  50.6  112.9  196.1  302.3
5              PNN     1.50  0.32  16.18  3.87  4.6  17.1  50.8  112.9  196.1  302.3
5              DQM     1.5   0.3   16.18  3.87  4.7  17.2  50.7  113.1  196.3  302.5
6              PNN     1.50  0.32  16.17  3.95  4.6  17.1  50.8  113.1  196.3  302.5
6              DQM     1.5   0.3   16.17  3.95  4.6  17.1  50.7  113.2  196.3  302.6
Target                 1.5   0.3   16     4     4.6  17.1  50.8  113.3  196.4  302.7
^a Initially assumed values.
Note: Data in bold are values obtained at the corresponding solution process. Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.
After the NN model is well trained, the solution procedure described in Section 14.3.2 takes place. As with the SMM, the initial K and Φ are assumed to be 8 and 2, respectively. Using these assumed values of K and Φ and the known t of 1.5 and ν of 0.3, the first six frequencies, ω_m (m = 1, …, 6), are calculated to be 3.4, 15.3, 48.5, 110.3, 193.7, and 299.2, respectively, from the forward solver DQM. Then, a new input to the NN model is formed from the known first three frequencies and the calculated ω4, ω5, and ω6, namely 4.6, 17.1, 50.8, 110.3, 193.7, and 299.2. With this new input, the corresponding t, ν, K, and Φ are computed to be 1.48, 0.39, 14.36, and 2.58, respectively, from the retrained NN model (see the second row of Table 14.3). Using the computed K of 14.36, Φ of 2.58, and the known t and ν, a new set of frequencies, 4.5, 16.6, 49.5, 111.4, 194.6, and 300.3, is calculated again from the DQM (see the third row of Table 14.3). As with the SMM, the five kinds of errors defined in Section 14.3.2 (also see Figure 14.3) are examined. Because all of them are larger at this iteration, the solving process continues to the next iteration. After six such iterations, the maximal error of the sought parameters with respect to their actual values decreases to –1.25%. Table 14.3 gives the corresponding solutions at each iteration. Figure 14.8 shows the convergence process of errors e_i(ω_i) (i = 1, 2, 3) and e_i(p_i) (i = 1, 2) for all the known and solved parameter values.
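The final errors quoted for the two methods can be checked directly from the last iteration rows of Tables 14.2 and 14.3. The sketch below assumes the quoted error is the deviation of the solved value from the actual value, divided by the actual value; the helper name `percent_deviation` is illustrative.

```python
def percent_deviation(solved, actual):
    """Deviation of a solved parameter from its actual value, in percent."""
    return 100.0 * (solved - actual) / actual

actual = {"K": 16.0, "Phi": 4.0}
smm_final = {"K": 15.8, "Phi": 3.8}     # Table 14.2, iteration 6
pnn_final = {"K": 16.17, "Phi": 3.95}   # Table 14.3, iteration 6

smm_dev = {k: percent_deviation(v, actual[k]) for k, v in smm_final.items()}
pnn_dev = {k: percent_deviation(v, actual[k]) for k, v in pnn_final.items()}

# Largest deviation in magnitude for each method.
smm_worst = max(smm_dev.values(), key=abs)
pnn_worst = max(pnn_dev.values(), key=abs)
```

Under this definition the worst SMM deviation is –5% (for Φ) and the worst PNN deviation is –1.25% (also for Φ), matching the figures stated in the text.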
FIGURE 14.8 Convergence process of the errors e_i(p_i) (i = 1, 2) and e_i(ω_i) (i = 1, 2, 3), plotted as error (%) against the number of iterations (initial, 1 to 6); the legend lists t, ν, ω1, ω2, and ω3. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
\[ e_i(\omega_i) = \frac{\omega_i^{j+1} - \omega_i^{0}}{\omega_i^{j+1}}, \qquad i = 1, 2, 3 \tag{14.18a} \]
\[ e_i(p_i) = \frac{p_i^{j} - p_i^{0}}{p_i^{j}}, \qquad i = 1, 2 \tag{14.18b} \]
where ω_i^0 and ω_i^{j+1} (i = 1, 2, 3) are the known frequencies and the frequencies calculated from the DQM at iteration j + 1, respectively; p_1^0 and p_1^j are the known and the calculated values of K, respectively; and p_2^0 and p_2^j are the known and the calculated values of Φ, respectively.
14.4.2 Identification of Material Properties of a Beam
Figure 14.9 shows a beam with a concentrated force Pc = 5 × 10^3 N at its end. The beam is fixed at node 1, simply supported at node 2, and free at node 3. It has different material and geometrical properties in spans 1 to 2 and 2 to 3, which are given in Table 14.4. For simplicity, the analysis is conducted only in the x–y plane. With the finite element model (FEM) of the beam and the complete input information, the translational and rotational displacements of the beam at node 2 and node 3 (i.e., x2, θ2, x3, y3, and θ3) can be easily calculated. The FEM model of the beam is used as the forward solver.
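The building block of such a forward solver is the stiffness matrix of a plane frame element. The sketch below uses the standard Euler–Bernoulli frame element with the span 1–2 properties of Table 14.4; it is not taken from the authors' FEM code, and the element length `L` is a hypothetical value because the span lengths are not stated in the text.

```python
import numpy as np

def frame_element_stiffness(E, A, I, L):
    """Local 6x6 stiffness of a plane frame element, DOFs (u1, v1, th1, u2, v2, th2)."""
    a = E * A / L
    b = 12.0 * E * I / L**3
    c = 6.0 * E * I / L**2
    d = 4.0 * E * I / L
    e = 2.0 * E * I / L
    return np.array([
        [ a,  0.0, 0.0, -a,  0.0, 0.0],
        [0.0,  b,   c,  0.0, -b,   c ],
        [0.0,  c,   d,  0.0, -c,   e ],
        [-a,  0.0, 0.0,  a,  0.0, 0.0],
        [0.0, -b,  -c,  0.0,  b,  -c ],
        [0.0,  c,   e,  0.0, -c,   d ],
    ])

E1, A1, Iz1 = 2.0e5, 6.0e3, 200.0e6   # span 1-2 values from Table 14.4 (N/mm^2, mm^2, mm^4)
L = 1000.0                            # hypothetical element length in mm

k = frame_element_stiffness(E1, A1, Iz1, L)
k_half = frame_element_stiffness(0.5 * E1, A1, Iz1, L)
```

Every entry is proportional to E, so halving the modulus halves the stiffness matrix (`k_half` equals `0.5 * k`); this is why the identification of E1 and E2 shows up so directly in the computed displacements.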
FIGURE 14.9 A beam subjected to a concentrated load at its end. The beam is fixed at node 1, simply supported at node 2 and free at node 3. It has different material and geometrical properties in spans 1 to 2 and 2 to 3. The drawing is labelled with Pc = 5 × 10^3 N, a 45° angle, dimensions of 50 mm and 80 mm, nodes 1 to 3, and x–y axes. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
TABLE 14.4 Properties of the Beam
Span      E (N/mm^2)  ν    A (mm^2)  Iz (mm^4)
Span 1–2  2 × 10^5    0.3  6 × 10^3  200 × 10^6
Span 2–3  2 × 10^5    0.3  4 × 10^3  40 × 10^6
Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.
It is intentionally assumed that E1, E2 and x2, θ2, θ3 are unknown, but that the translational displacements at node 3 (i.e., x3 and y3) and the section areas of the beam (i.e., A1 and A2) are experimentally obtained. In this case, identification of the material properties E1 and E2 would be very difficult using conventional inverse problem algorithms based only on the known outputs x3 and y3. However, the approach towards a total solution is applicable to this problem because it can also make use of the other known information, A1 and A2. As in the previous example, the SMM and the NN methods are used.
14.4.2.1 SMM Solution
Similar to Section 14.4.1.1, it is first assumed that E1 and E2 are both 1 × 10^5, which is 50% off from their actual values. Using the FEM model, displacements x2, θ2, x3, y3, and θ3 are calculated to be 4.72 × 10^–2, –10.13 × 10^–2, 9.13 × 10^–2, –3.83 × 10^–2, and –6.08 × 10^–2, respectively (see the first row of Table 14.5). Then, an initial sensitivity matrix S0 is obtained from this set of physical parameters and, subsequently, E1, E2, x2, θ2, and θ3 are computed in the same way as in the previous example (see the second row of Table 14.5). With the computed E1 of 1.31 × 10^5, E2 of 1.29 × 10^5, the known A1 of 6.0 × 10^3, and A2 of 4.0 × 10^3, a new set of displacements, 3.60 × 10^–2, –7.73 × 10^–2, 6.99 × 10^–2, –2.94 × 10^–2, and –4.67 × 10^–2, is calculated from the FEM model (see the third row of Table 14.5). Parameter errors similar to those defined in Equation 14.17 are examined to assess whether the present solution is sufficiently accurate. The solution process takes a total of five iterations. Figure 14.10 shows two sensitivity matrices, S1 and S4, at the first and fourth iterations,
TABLE 14.5 Solution Procedure towards a Total Solution Using SMM
Input parameters: E1, E2 (×10^5), A1, A2 (×10^3). Output parameters: x3, y3, x2, θ2, θ3 (×10^–2).
Iteration No.  Method  E1     E2     A1   A2   x3     y3     x2    θ2      θ3
Initial        FEM     1.0^a  1.0^a  6.0  4.0  9.13   –3.83  4.72  –10.13  –6.08
1              SMM     1.31   1.29   6.0  4.0  9.13   –3.83  4.65  –10.08  –6.02
1              FEM     1.31   1.29   6.0  4.0  6.99   –2.94  3.60  –7.73   –4.67
2              SMM     1.73   1.62   6.0  4.0  6.99   –2.94  3.04  –6.21   –4.06
2              FEM     1.73   1.62   6.0  4.0  5.45   –2.33  2.73  –5.86   –3.71
3              SMM     2.21   2.02   6.0  4.0  5.45   –2.33  1.93  –4.11   –2.49
3              FEM     2.21   2.02   6.0  4.0  4.32   –1.86  2.13  –4.58   –2.97
4              SMM     1.97   1.98   6.0  4.0  4.32   –1.86  2.33  –5.01   –3.05
4              FEM     1.97   1.98   6.0  4.0  4.63   –1.94  2.39  –5.14   –3.07
5              SMM     2.02   1.99   6.0  4.0  4.63   –1.94  2.34  –5.05   –3.04
5              FEM     2.02   1.99   6.0  4.0  4.555  –1.92  2.33  –5.02   –3.05
Target                 2.0    2.0    6.0  4.0  4.57   –1.92  2.36  –5.07   –3.04
^a Initially assumed values.
Note: Data in bold are values obtained at the corresponding solution process. Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.
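A simple consistency check on Table 14.5 follows from linearity: if both moduli are scaled by the same factor while A1, A2, and the load are held fixed, every displacement of a linear elastic model scales by the inverse factor. The sketch below compares the initial FEM row (E1 = E2 = 1 × 10^5) with the target row (E1 = E2 = 2 × 10^5); the tolerance of 0.01 is an assumed allowance for the two-decimal rounding of the tabulated values.

```python
import numpy as np

# Displacements in units of 1e-2, ordered as x3, y3, x2, theta2, theta3.
initial_row = np.array([9.13, -3.83, 4.72, -10.13, -6.08])   # E1 = E2 = 1e5
target_row = np.array([4.57, -1.92, 2.36, -5.07, -3.04])     # E1 = E2 = 2e5

expected_ratio = 2.0e5 / 1.0e5          # displacements scale as 1/E
ratio = initial_row / target_row
consistent = np.allclose(ratio, expected_ratio, atol=0.01)
```

All five ratios lie between roughly 1.995 and 2.000, so the initial row is consistent with the assumed moduli being exactly half of their actual values.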
FIGURE 14.10 Sensitivity produced in solution procedure: (a) S1 at 1st iteration; (b) S4 at 4th iteration. Each panel plots the sensitivity (vertical scale from –2.0 to 0.0) of x3, y3, x2, θ2, and θ3 for the legend entries E1, E2, A1, and A2. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
FIGURE 14.11 Convergence process of the displacement parameter errors, plotted as error (%) against the number of iterations (initial, 1 to 5) for x3, y3, x2, θ2, and θ3. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
respectively. Figure 14.11 shows the convergence process of the displacement parameter errors, defined similarly to Equation 14.17. The maximal error between the solved and the actual values for the sought parameters is just 1%. Table 14.5 shows the corresponding solutions at each iteration.
14.4.2.2 Progressive NN Solution
The NN model is selected to have two hidden layers; the numbers of neurons in the input layer, output layer, and first and second hidden layers are 5, 4, 15, and 7, respectively. The five input neurons correspond to the five displacements x2, θ2, x3, y3, and θ3, while the four output neurons correspond to E1, A1, E2, and A2. Similar to Section 14.4.1.2, a total of 32 samples is initially used to train this NN model with the improved learning algorithm. Figure 14.12 shows the convergence process of the error norm within the first 5000 training iterations in the first episode of training. The given convergence criterion is fulfilled after 7819 iterations of training. The trained NN model is then used in the solution procedure in search of a total solution of the problem. E1 and E2 are first assumed to be 1 × 10^5. The corresponding displacements, x2, θ2, x3, y3, and θ3, are calculated from the FEM model using these assumed values of E1 and E2 and the known A1 and A2 (see the first row of Table 14.6). After five iterations performed as in the previous example, all the parameter errors converge within the required tolerance. The maximal error of the solved parameters with respect to their actual values is 3.5%. Figure 14.13 shows the convergence
FIGURE 14.12 Convergence process of error norm in the initial training of the progressive NN model, plotted against the number of iterations in training (0 to 5000). Neurons: 5 – 15 – 7 – 4; learning rate η = 1.8; jumping factor γ = 0.01. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
TABLE 14.6 Solution Procedure towards a Total Solution Using Progressive NN (PNN) Method
Input parameters: E1, E2 (×10^5), A1, A2 (×10^3). Output parameters: x3, y3, x2, θ2, θ3 (×10^–2).
Iteration No.  Method  E1     E2     A1    A2    x3    y3     x2    θ2      θ3
Initial        FEM     1.0^a  1.0^a  6.0   4.0   9.13  –3.83  4.72  –10.13  –6.08
1              PNN     1.88   2.56   7.61  3.77  4.57  –1.92  4.72  –10.13  –6.08
1              FEM     1.88   2.56   6.0   4.0   4.24  –1.62  2.51  –5.39   –2.52
2              PNN     1.95   2.21   7.31  3.82  4.57  –1.92  2.51  –5.39   –2.52
2              FEM     1.95   2.21   6.0   4.0   4.41  –1.78  2.42  –5.21   –2.79
3              PNN     1.94   2.15   5.83  4.02  4.57  –1.92  2.42  –5.21   –2.79
3              FEM     1.94   2.15   6.0   4.0   4.48  –1.82  2.43  –5.22   –2.87
4              PNN     1.98   2.06   5.88  4.03  4.57  –1.92  2.43  –5.22   –2.87
4              FEM     1.98   2.06   6.0   4.0   4.52  –1.88  2.38  –5.112  –2.97
5              PNN     2.07   1.98   6.09  4.01  4.57  –1.92  2.38  –5.112  –2.97
5              FEM     2.07   1.98   6.0   4.0   4.55  –1.92  2.32  –4.99   –3.05
Target                 2.0    2.0    6.0   4.0   4.57  –1.92  2.36  –5.07   –3.04
^a Initially assumed values.
Note: Data in bold are values obtained at the corresponding solution process. Source: Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.
FIGURE 14.13 Convergence process of the parameter errors, plotted as error (%) against the number of iterations (initial, 1 to 5) for A1, A2, x3, and y3. (From Liu, G.R. et al., Comput. Methods Appl. Mech. Eng., 191, 989–1012, 2001. With permission.)
process of the errors between the known and the solved parameter values. Table 14.6 gives the corresponding solutions at each iteration.
14.5 Remarks
• Remark 14.1: The approach towards a total solution provides a comprehensive solution strategy for all kinds of engineering problems, especially for forward problems with incomplete descriptions of their causes and inverse problems with insufficient observations of their effects.
• Remark 14.2: The basic idea of this total solution approach is to formulate the engineering mechanics problem as a parameter identification problem using a forward solver. An iterative process of alternately conducting forward and inverse (or mixed) analyses is implemented to identify all the unknown parameters. This approach allows solution information to be shared between the forward and inverse analyses at each iteration, which ensures a stable convergence of the solution procedure.
• Remark 14.3: The SMM and the NN methods are developed for conducting the inverse (or mixed) analyses in the solution procedure towards a total solution of a problem. Numerical examples have demonstrated their effectiveness.
• Remark 14.4: The concept of total solution and the associated approach are still in their infancy. However, they offer a new viewpoint and strategy for solving practical engineering problems. Further study is needed. The results are expected to have a significant impact on the treatment of practical engineering problems.
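The alternating procedure summarized in Remark 14.2 can be sketched on a toy problem. Everything below is an assumption made for illustration: `toy_forward` is a hypothetical closed-form model standing in for DQM or FEM, the inverse step is a plain Gauss–Newton update with a finite-difference Jacobian (not the SMM or PNN of this chapter), and the convergence tolerance is arbitrary.

```python
import numpy as np

T, NU = 1.5, 0.3                      # known input parameters

def toy_forward(K, Phi):
    """Hypothetical forward solver: six 'frequencies' from K and Phi."""
    i = np.arange(1, 7)
    return T * np.sqrt((i**4 + K + Phi * i**2) / (1.0 - NU**2))

def jacobian(q, h=1e-6):
    """Forward-difference Jacobian of all outputs with respect to (K, Phi)."""
    y0 = toy_forward(*q)
    J = np.empty((y0.size, q.size))
    for j in range(q.size):
        p = q.copy()
        p[j] += h
        J[:, j] = (toy_forward(*p) - y0) / h
    return J

def total_solution(measured, guess, tol=1e-10, max_iter=50):
    """Alternate forward and inverse analyses until known outputs are matched."""
    q = np.array(guess, dtype=float)
    for iteration in range(1, max_iter + 1):
        y = toy_forward(*q)                         # forward analysis
        residual = measured - y[:3]                 # mismatch on known outputs
        if np.max(np.abs(residual / measured)) < tol:
            break
        J = jacobian(q)[:3]                         # rows of the known outputs
        dq, *_ = np.linalg.lstsq(J, residual, rcond=None)   # inverse analysis
        q += dq
    return q, toy_forward(*q), iteration

true_outputs = toy_forward(16.0, 4.0)
measured = true_outputs[:3]                         # only the first three are "measured"
q_solved, all_outputs, n_iter = total_solution(measured, guess=[8.0, 2.0])
```

Starting from guesses that are 50% off, the loop recovers both unknown inputs and, through the final forward analysis, the three outputs that were never measured, which is the sense in which the solution is "total".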
References
ABAQUS User’s Manual, vol. I, II, III, version 6.1, Hibbitt, Karlsson & Sorensen, Inc., Pawtucket, RI, 2000. Abu-Lebdeh, G. and Benekohal, R.F., Convergence variability and population sizing in microgenetic algorithms, Comput.-Aided Civ. Infrastruct. Eng., 14(5), 321, 1999. Achenbach, J.D., Wave Propagation in Elastic Solids, North-Holland, Amsterdam, 1973. Ackley, D., A Connectionist Machine for Genetic Hillclimbing, Kluwer Academic Publishers, Boston, 1987. Adams, R.D. and Cawley, P., A review of defect types and nondestructive testing techniques for composites and bonded joints, NDT Int., 21, 208, 1988. Adams, R.D., Cawley, P., Pye, C.J. and Stone, B.J., A vibration technique for nondestructively assessing the integrity of structures, J. Mech. Eng. Sci., 20(2), 93, 1978. Al-Hunaidi, M.O., Nondestructive evaluation of pavements using spectral analysis of surface waves in the frequency wave-number domain, J. Nondestructive Evaluation, 15(2), 71, 1996. Amada, S., Ichikawa, Y., Munekata, T., Nagase, Y. and Shimizu, H., Fiber texture and mechanical graded structure of bamboo, Composites Part B, 28, 130, 1997. Anderson, J.D., Computational Fluid Dynamics, the Basics with Applications, McGrawHill, New York, 1995. Anderson, T.L., Fracture Mechanics: Fundamentals and Applications, 1st ed., CRC Press, Boca Raton, FL, 1991. Angelo, M., Hybrid genetic algorithms for timetabling, Int. J. Intelligent Syst., 11(8), 477, 1996. Anikonov, Y.E., Formulas in Inverse and Ill-Posed Problems, Utrecht Press, Tokyo, 1997. Anju, A. and Kawahara, M., Comparison of sensitivity equation and adjoint equation methods for parameter identification problems, Int. J. Numer. Mech. Eng., 40, 1015, 1997. Ashok, D.B. and Chandrupatla, Optimization Concepts and Applications in Engineering, Prentice Hall, Inc., Englewood Cliffs, NY, 1999. Atalla, M.J. and Inman, D.J., On model updating using neural networks, Mech. Syst. Signal Proc., 12, 135, 1998. 
Back, T., Hammel, U., and Schwefel, H.P., Evolutionary computation: comments on the history and current state, IEEE Trans. Evol. Computation 1(1), 3, 1997. Bai, B.C. and Farhat, N.H., Learning networks for extrapolation and radar target identification, Neural Networks, 5(3), 507, 1992. Balasubramaniam, K. and Rao, N.S., Inversion of composite material elastic constants from ultrasonic bulk wave phase velocity data using genetic algorithms, Composite Part B, 29B, 171, 1998. Barro, S. et al., Classifying multichannel ECG patterns with an adaptive neural network, IEEE Eng. Med. Biol. Mag., 17(1), 45, 1998.
Belegundu, A.D. and Chandrupatla, T.R., Optimization Concepts and Applications in Engineering, Prentice Hall, Inc., Englewood Cliffs, NJ, 1999. Besterfield, D.H., Besterfield-Michna, C., Besterfield, G.H. and Besterfield-Sacre, M., Total Quality Management, Prentice Hall, Inc., 1995. Bethke, A.D., Genetic algorithms as function optimizers, Diss. Abstr. Int., University of Michigan, Ann Arbor, 3503B. 1981. Bicanic, N. and Chen, H.P., Damage identification in framed structures using natural frequencies, Int. J. Numer. Methods Eng., 40, 4451, 1997. Bishop, C.M., Neural networks and their applications, Rev. Sci. Instrum., 65, 1803, 1994. Blanc, G., Raynaud, M. and Chau, T.H., Solution of a 2-D inverse heat conduction problem from thermal strain measurements, 2nd Int. Conference on Inverse Problems in Engineering: Theory and Practice, Le Croisic, France, 1996, 512. Boltezar, M., Strancar, B. and Kuhelj, A., Identification of transverse crack location in flexural vibrations of free–free beam, J. Sound Vib., 211(5), 729, 1998. Bosworth, J., Foo, N. and Zeigler, B.P., Comparison of genetic algorithms with conjugate gradient methods, NASA report, CR-2093, Washington, D.C., 1972. Boving, K.G., NDE Handbook, Butterworths, London, 1989. Bratton, R.L. and Datta, S.K., Anisotropic effects on lamb waves in composite plates, in Review of Progress in Quantitative Nondestructive Evaluation, Thompson, D.O. and Chimenti, D.E., Eds., Plenum Press, New York, 1989, 197. Brebbia, C.A., Telles, J.C.F., and Wrobel, L.C., Boundary Element Techniques: Theory and Application in Engineering, Springer-Verlag, New York, 1984. Brownjohn, J.M.W., Steele, G.H., Cawley, P. and Adams, R.D., Errors in mechanical impedance data obtained with impedance heads, J. Sound Vib., 73(3), 461, 1980. Bryngelson, J., Onuchic, J., Socci, N. and Wolynes, P., Funnels, pathways and the energy landscape of protein folding: a synthesis, Prot. Struct. Funct. Genet., 21(3), 167, 1995. Cai, C., Liu, G.R. 
and Lam, K.Y., An exact method for analyzing sound reflection and transmission by anisotropic laminates submerged in fluids, Appl. Acoust., 61, 95, 2000. Carroll, D.L., Genetic algorithms and optimizing chemical oxygen-iodine lasers, Dev. Theor. Appl. Mech. XVIII, 411–424, University of Alabama, Birmingham, 1996a. Carroll, D.L., Chemical laser modeling with genetic algorithms, AIAA J., 34(4), 338,1996b. Cavalieri, S. and Gaiardelli, P., Hybrid genetic algorithms for a multiple-objective scheduling problem, J. Intelligent Manuf., 9(4), 361, 1998. Cawley, P., The impedance method of non-destructive inspection, NDT Int., 17(2), 59, 1984. Cawley, P., The operation of NDT instruments based on the impedance method, Composites Struct., 3, 215, 1985. Cawley, P., The sensitivity of the mechanical impedance method of nondestructive testing, NDT Int., 20(4), 209, 1987. Cawley, P., Woolfrey, A.M. and Adams, R.D., Natural frequency measurements for production quality control of fiber composites, Composites, 16(1), 23, 1985. Chandrashekhara, K., Chukwujekwu, A. and Jiang, Y.P., Estimation of contact force on composite plates using impact-induced strain and neural networks, Composites Part B, 29B, 363, 1998. Chang, C. and Sachse, W., Analysis of elastic wave signals from an extended source in an extended source in a plate, J. Acoust. Soc. Am., 77(4), 1335, 1985.
Chang, C. and Sun, C.T., Determining transverse impact force on a composite laminate by signal deconvolution, Exp. Mech., 29, 414, 1989. Chang, C.C., Chang, T.Y.P. and Xu, Y.G., Adaptive neural networks for model updating of structures, Smart Mater. Struct., 9, 59, 2000. Chen, D. and Nisitani, H., Trans. Jpn. Soc. Mech. Eng. Ser. A, 57, 274, 1991. Chen, S.C. and Liu, G.R., Damage assessment of structures using dynamic response and an inverse procedure, Proceedings of the First Asian-Pacific Congress on Computational Mechanics, 23 November 2001, Elsevier, Sydney, 1065. Chen, T. and Chen, H., Universal approximation capability of EBF neural networks with arbitrary activation functions, Circuits Syst. Signal Proc., 15, 671, 1996. Cheng, R., Gen, M. and Tsujimura, Y., A tutorial survey of job-shop scheduling problems using genetic algorithms: part I. representation, Int. J. Comput. Ind. Eng. 30(4), 983, 1996. Cheng, R., Gen, M. and Tsujimura, Y., A tutorial survey of job-shop scheduling problems using genetic algorithms: part II. hybrid genetic search strategies. Int. J. Comput. Ind. Eng., 37(1), 51, 1999. Christensen, R.M., Mechanics of Composite Materials, John Wiley & Sons, New York, 1979. Chu, Y.C. and Rokhlin, S.I., Stability of determination of composite moduli from velocity data in planes of symmetry for weak and strong anisotropies, J. Acoust. Soc. Am., 95(1), 213, 1994a. Chu, Y.C. and Rokhlin, S.I., Analysis of composite elastic constant reconstruction from ultrasonic bulk wave velocity data, in Review of Progress in Quantitative Nondestructive Evaluation, Thompson, D.O. and Chimenti, D.E., 1994b, 1165. Chu, Y.C., Degtyar, A.D. and Rokhlin, S.I., On determination of orthotropic material moduli from ultrasonic velocity data in non-symmetry planes, J. Acoust. Soc. Am., 95(6), 3191, 1994. Coffin, L.F., ASTM STP 969, Amer. Soc. Testing Mater., 325, 1988. 
Coley, D.A., An Introduction to Genetic Algorithms for Scientists and Engineering, World Scientific, Singapore, 1999. Crane, R.M. and Gagorik, J., Fiber optics for a damage assessment system for fiber reinforced plastic composite structures, Quant. NDE, 28, 1419, 1984. D’Cruz, J., Crisp, J.D.C. and Ryall, T.G., On the identification of a harmonic force on a viscoelastic plate from response data, J. Appl. Mech., 59, 722, 1992. Dandekar, T. and Argos, P., Potential of genetic algorithms in protein folding and protein engineering simulations, Protein Eng., 5 (7), 637, 1992. Datta, S.K., Ju, T.H. and Shah, A.H., Scattering of an impact wave by a crack in a composite plate, ASME J. App. Mech., 59, 596, 1992. Davidor, Y., A genetic algorithm applied to robot trajectory generation, in Handbook of Genetic Algorithms, Davis, L., Ed., 923–932, Van Nostrand Reinhold, New York, 1991. Davis, C.M., Fiber optic sensor: an overview, Opt. Eng., 24(2), 347, 1985. Davis, L., Ed., Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991. Daw, M.S. and Baskes, M.I., Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals, Phys. Rev. B, 29(12), 6443, 1984. Deb, K., Optimization for Engineering Design: Algorithms and Examples, Prentice Hall of India, New Delhi, 1998. Denker, J., Schwartz, B., Wittner, B., Solla, S., Howard, R., Jackel, L. and Hopfiled, J., Large automatic learning, rule extraction, and generalization, Complex Syst., 1, 877, 1987.
Dennis, J.E. and Schnabel, R.B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Society for Industrial and Applied Mathematics, Philadelphia, 1996. Derrida, B., Random energy model: limit of a family of disordered models, Phys. Rev. Lett., 45, 2, 1980. Doebling, S.W. et al., The state of the art in structural identification of constructed facilities, Technical Report, Los Alamos National Laboratory, New Mexico, 1998. Doebling, S.W., Farrar, C.R., Prime, M.B. and Shevitz, D.W., Damage identification and health monitoring of structural and mechanical systems from changes in their vibration characteristics: a literature review, Technical Report LA-13070-MS, Los Alamos National Lab., New Mexico, 1996. Dong, S.B. and Nelson, R.B., On natural vibrations and waves in laminated orthotropic plates, ASME J. Appl. Mech., 39, 739, 1972. Doyle, J.F., An experimental method for determining the dynamic contact law, Exp. Mech., 24(1), 10–16, 1984a. Doyle, J.F., Further development in determining the dynamic contact law, Exp. Mech., 24(4), 265, 1984b. Doyle, J.F., Determining the contact force during the transverse impact of plates, Exp. Mech., 27(10), 68, 1987. Doyle, J.F., Static and Dynamic Analysis of Structures, Kluwer Academic Publishers, The Netherlands, 1991. Doyle, J.F., Determining the size and location of transverse cracks in beams, Exp. Mech., 35, 272, 1995. Doyle, J.F., Wave Propagation in Structures, Spectral Analysis Using Fast Discrete Fourier Transforms, 2nd ed., Springer-Verlag, New York, 1997. Doyle, J.F. and Kamle, S., An experimental study of the reflection and transmission of flexural waves at discontinuities, ASME J. Appl. Mech., 52, 669, 1985. Doyle, J.F. and Kamle, S., An experimental study of the reflection and transmission of flexural waves at an arbitrary T-joint, ASME J. Appl. Mech., 54, 136, 1987. Doyle, J.F., Farris, T.N. and Martin, M.T., Crack identification in frame structures, dynamic fracture mechanics, Comput. Mech. Publ., 237, 1995.
Dozier, G., Bowen, J. and Homaifar, A., Solving constraint satisfaction problems using hybrid evolutionary search, IEEE Trans. Evol. Computation, 2(1), 23, 1998. Elvin, N. and Leung, C., Feasibility of delamination detection with embedded optical fibers, Proceedings SPIE-International Society of Optical Engineers, Society of Photo-Optical Instrumentation Engineers, Bellingham, WA, V3041, 1997, 627. Engl, H.W., Hanke, M. and Neubauer, A., Regularization of Inverse Problems, Kluwer Academic Publishers, The Netherlands, 2000. Eshelman, L.J., The CHC adaptive search algorithm: how to have safe search when engaging in nontraditional genetic recombination, in Foundations of Genetic Algorithms, Spatz, B.M., Ed., Morgan Kaufmann, San Mateo, CA, 1989, 265. Ewins, D.J., Modal Testing: Theory and Practice, John Wiley & Sons, New York, 1985. Fasman, G., Ed., Prediction of Protein Structure and the Principles of Protein Conformation, Plenum Press, New York, 1990. Fletcher, R. and Reeves, C.M., Function minimization by conjugate gradients, Comput. J., 6, 163, 1963. Fletcher, R., Practical Methods of Optimization, 2nd ed., John Wiley & Sons, New York, 1987. Foiles, S.M., Baskes, M.I. and Daw, M.S., Embedded-atom-method functions for the fcc metals Cu, Ag, Au, Ni, Pd, Pt, and their alloys, Phys. Rev. B, 33(12), 7983, 1986.
Foresee, F.D. and Hagan, M.T., Gauss–Newton approximation to Bayesian regularization, Proc. 1997 Int. Joint Conf. Neural Networks, 1997, 1930. Forster, F.K., Bardell, L., Afromowitz, M.A., Sharma, N.R. and Blanchard, A., Design, fabrication and testing of fixed-valve micro-pumps, Proc. ASME Fluids Eng. Division, 234, 39, 1995. Frederiksen, P.S., Application of an improved model for the identification of material parameters, Mech. Composite Mater. Struct., 4, 297, 1997. Freifelder, D. and Malacinski, G., Essentials of Molecular Biology, 2nd ed., Jones and Bartlett Publishers, Boston, 1993. Gasik, M. The present state of research and future opportunities of functionally graded materials, FGM News, 31(7), 6, 1996. Gasik, M. and Kawasaki, A., Functionally graded materials for automotive applications, Proceedings of 32nd ISATA Congress, Vienna, 4, 581, 1999. Gen, M. and Chen, R.W., Genetic Algorithms and Engineering Design, John Wiley & Sons, New York, 1997. Gen, M., Ida, K. and Li, Y.Z., Bicriteria transportation problem by hybrid genetic algorithm, Comput. Ind. Eng. 35(1–2), 363, 1998. Gill, P.E. and Murray, W., Algorithms for the solution of the nonlinear least squares problem, SIAM J. Numerical Anal., 15(5), 977, 1978. Glossop, N.D.W., Dubois, S., Tsaw, W., Leblanc, M., Lymer, J., Measures, R.M., and Tennyson, R.C., Optical fiber damage detection for an aircraft composite leading edge, Composites, 21(1), 71–80, 1990. Goldberg, D.E., Computer-aided gas pipeline operation using genetic algorithms and rule learning, Diss. Abstr. Int., University of Michigan, Ann Arbor, 44(10), 3174B, 1983. Goldberg, D.E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989. Goldberg, D.E., Real-coded genetic algorithms, virtual alphabets, and block, Technical Report No. 90001, University of Illinois, Chicago, 1991. Goldberg, D.E. 
and Richardson, J., Genetic algorithms with sharing for multimodal function optimization: genetic algorithms and their applications, in Proc. 2nd Int. Conf. Genet. Algorithms, 1987, 41. Goldberg, D.E., Korb, B. and Deb, K., Messy Genetic Algorithms: Motivation, Analysis, and First Research, Complex Systems 3, Complex Systems Publications, Inc., Champaign, IL, 490, 1989. Golub, G.H. and Van Loan, C.F., Matrix Computations, 3rd ed., The Johns Hopkins University Press, Baltimore, 1996.
Gong, D., Yamazaki, G., and Gen, M., Evolutionary program for optimal design of material distribution system, in Proc. 3rd IEEE Conf. Evol. Comput., 1996, 139. Goodier, J.N., Jahsman, W.E. and Ripperger, E.A., An experimental surface wave method for recording force-time curves in elastic impacts, J. Appl. Mech., 26, 3, 1959. Gorges-Schleuter, M., ASPARAGOS: an asynchronous parallel genetic algorithm strategy, in Proc. 3rd Int. Conf. Genet. Algorithms Appl., 1989, 422. Graff, K.F., Wave Motion in Elastic Solids, Clarendon Press, Oxford, 1975. Grefenstette, J.J., Gopal, R., Rosmaita, B.J. and Van Gucht, D., Genetic algorithms for the traveling salesman problem, in Proc. Int. Conf. Genet. Algorithms Appl., 1985, 160.
Grefenstette, J.J., Lamarckian learning in multi-agent environment, in Proc. 4th Int. Conf. Genet. Algorithms, 303, 1991. Groetsch, C.W., Inverse Problems in the Mathematical Sciences, Vieweg & Sohn Verlagsgesellschaft mbH, Braunschweig/Wiesbaden, Germany, 1993. Haario, H., Theory and Application of Inverse Problems, Longman Scientific & Technical Press, John Wiley & Sons, Inc., New York, 1996. Hallquist, J.O., Ed., LS-DYNA Theoretical Manual, Livermore Software Technology Corporation, 1998. Han, X., Elastic waves in functionally graded materials and its application to material characterization, Ph.D. thesis, National University of Singapore, 2000. Han, X. and Liu, G.R., Reconstruction of elastic constants of laminated shells using a combined inverse technique, 14th U.S. National Congress of Theoretical and Applied Mechanics, Blacksburg, VA, June 23–28, 2002a. Han, X. and Liu, G.R., Determination of elastic constants of laminated cylindrical shells by means of elastic waves, 5th World Congress on Computational Mechanics, Vienna, Austria, 2002b. Han, X. and Liu, G.R., An inverse technique for identification of elastic constants of glass/epoxy laminated plate, 4th International Conference on Inverse Problems in Engineering: Theory and Practice, Angra dos Reis, Brazil, 2002c. Han, X. and Liu, G.R., Material characterization of FGM plates using elastic waves and genetic algorithm, AIAA J., 41(2), 288, 2003. Han, X., Liu, G.R. and Lam, K.Y., Transient waves in functionally graded material plate, Int. J. Numerical Methods Eng., 52, 851, 2001a. Han, X., Liu, G.R., Lam, K.Y., and Ohyoshi, T., A quadratic layer element for analyzing stress waves in functionally graded materials and its application for material characterization, J. Sound Vib., 236(2), 307, 2000. Han, X., Liu, G.R., Xi, Z.C. and Lam, K.Y., Characteristics of waves in a functionally graded cylinder, Int. J. Numerical Methods Eng., 53, 653, 2002a. Han, X., Liu, G.R., Xi, Z.C.
and Lam, K.Y., Transient waves in a functionally graded cylinder, Int. J. Solids Struct., 38(17), 3021, 2001b. Han, X. and Xu, D., A computational method for reconstruction of elastic constants of anisotropic laminated plate, 6th U.S. National Congress on Computational Mechanics, Dearborn, Michigan, August 1–4, 2001c. Han, X., Xu, D., Yap, F.F. and Liu, G.R., On determination of the material constants of laminated cylindrical shells based on an inverse optimal approach, Inverse Probl. Eng., 10(4), 309, 2002b. Han, X., Xu, D. and Liu, G.R., An application of a progressive neural network to material characterization of a functionally graded cylinder, Neurocomputing, 51, 341, 2003. Hansen, P.C., Analysis of discrete ill-posed problems by means of the L-curve, SIAM Rev., 34, 561, 1992. Haupt, L.R. and Haupt, S.E., Practical Genetic Algorithms, John Wiley & Sons, Inc., New York, 1998. Heinz, W.E., Hanke, M. and Neubauer, A., Regularization of Inverse Problems, Kluwer Academic Publishers, Dordrecht, 2000. Hertz, J., Krogh, A. and Palmer, R.G., Introduction to the Theory of Neural Computation, Addison-Wesley Publishing Company, Reading, MA, 1991. Hirai, T., Functionally gradient materials, in Processing of Ceramics, Part 2, Brook, R.J., Ed., VCH Verlagsgesellschaft mbH Publishers, Weinheim, Germany, 1996, 293.
Hirsch, C., Numerical Computation of Internal and External Flows, vol. 1, Wiley-Interscience Publications, New York, 1988. Hofer, B., Fiber optic damage detection in composite structures, Composites, 18(4), 309, 1987. Holland, J.H., Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, MA, 1975. Hsu, N.N., Simmons, J.A. and Hardy, S.C., An approach to acoustic emission signal analysis-theory and experiment, Mater. Eval., 35(10), 100, 1977. Huber, N. and Tsakmakis, C., Determination of constitutive properties from spherical indentation data using neural networks. part I: the case of pure kinematic hardening in plasticity laws, J. Mech. Phys. Solids, 47, 1569, 1999. The I-deas Electronic System Cooling User's Guide and TMG Thermal Analysis User's Guide, MAYA Heat Transfer Technologies, 1995. Ilschner, B. and Cherradi, N., Eds., Proceedings of the Third International Symposium on Structural and Functional Gradient Materials, Lausanne, Switzerland, Presses Polytechniques et Universitaires Romandes, 1995. Irwan, B.K., Finite element simulation of images of flaws in materials, bachelor thesis, National University of Singapore, 2001. Ishak, S.I., Nondestructive evaluation of structures using flexural waves, Ph.D. thesis, National University of Singapore, 2001. Ishak, S.I., Liu, G.R. and Lim, S.P., Study on characterization of horizontal cracks in isotropic beams, J. Sound Vib., 238(4), 661, 2000. Ishak, S.I., Liu, G.R., Lim, S.P. and Shang, H.M., A study of mechanical impedance method for characterization of delaminations in laminated materials, in Proceedings of the Impact Response of Materials and Structures, Shim, V.P.W., Tanimura, S. and Lim, C.T., Eds., Oxford, Singapore, 1999, 333. Ishak, S.I., Liu, G.R., Lim, S.P. and Shang, H.M., Characterization of delamination in beams using flexural wave scattering analysis, ASME J. Vib. Acoust., 123(4), 421, 2001a. Ishak, S.I., Liu, G.R., Lim, S.P.
and Shang, H.M., Experimental study on employing flexural wave measurement to characterize delamination in beams, Exp. Mech., 41(2), 157, 2001b. Ishak, S.I., Liu, G.R., Lim, S.P. and Shang, H.M., Locating and sizing of delamination in composite laminates using computational and experimental methods, Composites Part B, 32(4), 287–298, 2001c. Ishak, S.I., Liu, G.R., Lim, S.P. and Shang, H.M., Nondestructive evaluation of horizontal crack detection in beams using transverse impact, J. Sound Vib., 252(2), 343, 2002. Janikow, C.Z. and Michalewicz, Z., An experimental comparison of binary and floating point representations in genetic algorithms, in Proc. 4th Int. Conf. Genet. Algorithms, 1991, 31. Johnson, E.G. and Abushagur, M.A.G., Image deconvolution using a micro genetic algorithm, Opt. Commun., 140(1–3), 6, 1997. Karim, M.R. and Kundu, T., Transient response of three layered composites with two interface cracks due to a line load, Acta Mech., 76, 53, 1989. Karim, M.R. and Kundu, T., Scattering of acoustic beams by cracked composites, ASCE J. Eng. Mech., 116, 1812, 1990. Karim, M.R., Awal, M.A. and Kundu, T., Numerical analysis of guided wave scattering by multiple cracks in plates: SH-case, Eng. Fracture Mech., Int. J., 42, 371, 1992a.
Karim, M.R., Awal, M.A. and Kundu, T., Elastic wave scattering by cracks and inclusions in plates: in-plane case, Int. J. Solids Struct., 29, 2355, 1992b. Karim, M.R., Kundu, T. and Desai, C.S., Detection of delamination cracks in layered fiber reinforced composite plates, ASME J. Pressure Vessel Tech., 111, 165, 1989. Karunasena, W.M., Shah, A.H. and Datta, S.K., Plane-strain wave scattering by cracks in laminated composite plate, ASCE J. Eng. Mech., 117(8), 1738, 1991. Kausel, E. and Roësset, J.M., Semianalytic hyperelement for layered strata, J. Eng. Mech. Div., 118(4), 569, 1977. Kausel, E., An explicit solution for the Green's functions for dynamic loads in layered media, in Technical Report R81-13, Department of Civil Engineering, M.I.T., Cambridge, MA, 1981. Kausel, E., Wave propagation in anisotropic layered media, Int. J. Numer. Methods Eng., 23, 1567, 1986. Kennedy, S., Five ways to a smarter genetic algorithm, AI Expert, 34, 35, 1993. Kerner, E.H., The elastic and thermo-elastic properties of composite media, Proc. Phys. Soc., 63, 808, 1956. Klenke, S.E. and Paez, T.L., Damage identification with probabilistic neural networks, Proc. 12th Int. Modal Anal. Conf., 1994, 99. Kitamura, T., Yashiro, K. and Ohtani, R., Atomic simulation on deformation and fracture of nano-single crystal of nickel in tension, JSME Int. J., A, 40(4), 430, 1997. Koizumi, M., FGM activities in Japan, Composites Part B, 28 (1–2), 1, 1997. Kress, R. and Zinn, A., Inverse problems in engineering science, in ICM-90 Satellite Conference Proceedings, Yamaguti, M., Hayakawa, K., Iso, Y., Mori, M., Nishida, T., Tomoeda, K. and Yamamoto, M., Eds., Springer-Verlag, Tokyo, 1991, 43. Krishnakumar, K., Micro-genetic algorithms for stationary and nonstationary function optimization, in SPIE: Intelligent Control and Adaptive Systems, Philadelphia, PA 1989, 289. Krishnan, B. 
and Navin, S.R., Inversion of composite material elastic constants from ultrasonic bulk wave phase velocity data using genetic algorithms, Composites Part B, 29, 171, 1998. Kubo, S., Computational inverse schemes for various categories of inverse problems, in Inverse Problems, Kubo, S., Ed., 1993, 36. Kundu, T., Dynamic interaction between two interface cracks in a three layered plate, Int. J. Solids Struct., 24, 27, 1988. Kundu, T. and Hassan, T., A numerical study of the transient behavior of an interfacial crack in a bimaterial plate, Int. J. Fracture, 35, 55, 1987. Lagerkvist, L. and Lundberg, B., Mechanical impedance gauge based on measurement of strains on a vibrating rod, J. Sound Vib., 80(3), 389, 1982. Lam, K.Y., Liu, G.R., and Wang, Y.Y., Characterization of a vertical surface-breaking crack plate, Computational Acoust., 3(4), 297, 1995. Lam, K.Y., Liu, G.R. and Wang, Y.Y., Time-harmonic response of a vertical crack in plates, Theor. Appl. Frac. Mech., 27, 21, 1997. Lange, Y.V., Characteristics of the impedance method of inspection, Sov. J. Nondestructive Testing, 8, 47, 1972. Lange, Y.V. and Teumin, I.I., Dynamic flexibility of a dry point contact, Sov. J. Nondestructive Testing, 7, 157, 1971. Lapedes, A. and Farber, R., How neural nets work, in Neural Information Processing Systems, Anderson, D.Z., Ed., American Institute of Physics, New York, 1988, 442.
Law, S.S., Chan, H.T. and Zeng, Q.H., Moving force identification: a time domain method, J. Sound Vib., 201, 1, 1997. Lawrence, D., Genetic Algorithms and Simulated Annealing, Morgan Kaufmann Publishers, London, 1987. Leung, A. and Payandeh, S., Application of adaptive neural network to localization of objects using pressure array transducer, Robotica, 14(4), 407, 1996. Levin, R.I. and Lieven, N.A.J., Dynamic finite element model updating using neural networks, J. Sound Vib., 210(5), 593, 1998. Levine, D., Users Guide to the PGA Pack Parallel Genetic Algorithm Library, Argonne National Laboratory, Argonne, IL, 1996. Lippmann, R.P., An introduction to computing with neural nets, IEEE ASSP Mag., 4, 4, 1987. Liu, G.R., Experimental Study and Theoretical Analysis of Mechanical Behavior at High Temperature of T300/epoxy Composite Materials, master thesis, Bei Hang University, China, 1984. Liu, G.R., A step-by-step method of rule-of-mixture of fiber- and particle-reinforced composite materials, Composite Struct., 40, 313, 1998. Liu, G.R., A combined finite element/strip element method for analyzing elastic wave scattering by cracks and inclusions in laminates, Computational Mech., 28, 2002b, 76–82. Liu, G.R., Mesh Free Methods: Moving beyond the Finite Element Method, CRC Press, Boca Raton, FL, 2002a. Liu, G.R. and Achenbach, J.D., A strip element method for stress analysis of anisotropic linearly elastic solids, ASME J. Appl. Mech., 61, 270, 1994. Liu, G.R. and Achenbach, J.D., Strip element method to analyze wave scattering by cracks in anisotropic laminated plates, ASME J. Appl. Mech., 62, 607, 1995. Liu, G.R., Achenbach, J.D., Kim, J.O. and Li, Z.L., A combined finite element method/ boundary element method for v(z) curves of anisotropic-layer/substrate configurations, J. Acoust. Soc. Am., 92(5), 2734, 1992. Liu, G.R. 
and Chen, S.C., A novel formulation of inverse identification of stiffness distribution in structures, in The 1st International Conference on Structural Stability and Dynamics, Yang, Y.B., Leu, L.J. and Hsieh, S.H., Eds., Taipei, Taiwan, 2000, 531. Liu, G.R. and Chen, S.C., Flaw detection in sandwich plates based on time-harmonic response using genetic algorithm, Comput. Methods Appl. Mech. Eng., 190 (42), 5505, 2001. Liu, G.R. and Chen, S.C., A novel technique for inverse identification of distributed stiffness factor in structures, J. Sound Vib., 254(5), 823, 2002. Liu, G.R. and Han, X., An inverse procedure for determination of material property of FGM, International Symposium on Inverse Problems in Engineering Mechanics, Nagano City, Japan. 2001, 63. Liu, G.R., Han, X. and Lam, K.Y., Material characterization of FGM plates using elastic waves and an inverse procedure, J. Composite Mater., 35 (11), 954, 2001a. Liu, G.R., Han, X. and Lam, K.Y., A combined genetic algorithm and nonlinear least squares method for material characterization using elastic waves, Comput. Methods Appl. Mech. Eng., 191, 1909, 2002a. Liu, G.R., Han, X. and Lam, K.Y., Determination of elastic constants of anisotropic laminated plates using elastic waves and a progressive neural network, J. Sound Vib., 252(2), 239, 2002b.
Liu, G.R., Han, X. and Lam, K.Y., Stress waves in functionally gradient materials and its use for material characterization, Composites Part B, 30, 383, 1999. Liu, G.R., Han, X. and Ohyoshi, T., Computational inverse techniques for material characterization using dynamic response, Int. J. Soc. Mater. Eng. Resour., 10(1), 26, 2002c. Liu, G.R., Han, X., Xu, Y.G. and Lam, K.Y., Material characterization of functionally graded material using elastic waves and a progressive learning neural network, Composites Sci. Technol., 61, 1401, 2001b. Liu, G.R. and Lam, K.Y., Characterization of a horizontal crack in anisotropic laminated plates, Int. J. Solids Struct., 31(21), 2965, 1994. Liu, G.R. and Lam, K.Y., Two-dimensional time harmonic elastodynamic Green’s functions for anisotropic media, Int. J. Eng. Sci., 34(11), 1327, 1996a. Liu, G.R. and Lam, K.Y., Scattering of SH waves by flaws in sandwich plates and its use in flaw detection, Composite Struct., 34, 251, 1996b.
Liu, G.R., Lam, K.Y. and Ohyoshi, T., A technique for analyzing elastodynamic responses of anisotropic laminated plates to line loads, Composites Part B, 28B, 667, 1997. Liu, G.R., Lam, K.Y. and Shang, H.M., Scattering of waves by flaws in anisotropic laminated plates, Composites Part B, 27B, 431, 1996. Liu, G.R., Lam, K.Y. and Tani, J., An exact method for analyzing elastodynamic responses of anisotropic laminates to line loads, Mech. Composite Mater. Struct., 2, 227, 1995a. Liu, G.R., Lam, K.Y. and Tani, J., Characterization of flaws in sandwich plates: numerical experiment, JSME Int. J. (A), Jpn., 38(4), 554, 1995b. Liu, G.R., Lam, K.Y. and Shang, H.M., A new method for analyzing wave fields in laminated composite plates: two-dimensional cases, Composites Eng., 5(12), 1489, 1995c. Liu, G.R. and Liu, M.B., Smoothed Particle Hydrodynamics: a Meshfree Particle Method, World Scientific Publishing Corporation, Singapore, 2003. Liu, G.R. and Ma, H.J., Material characterization of composite laminate using dynamic response and real parameter coded micro genetic algorithm, Eng. Comput., 2003. Liu, G.R. and Ma, W.B., An inverse procedure for loading identification of composite laminates, Proc. Asia–Pacific Vib. Conf. ’99, Singapore, December 1999, 740. Liu, G.R. and Ma, W.B., Inversion of ply orientations of composite laminates using genetic algorithm, in Proceedings of the Advances in Computational Engineering & Sciences, Atluri, S.N. and Brust, F.W., Eds., Tech Science Press, Encino, CA, 2000, 1227. Liu, G.R., Ma, W.B. and Han, X., An inverse procedure for determination of material constants of composite laminates using elastic waves, Comput. Methods Appl. Mech. Eng., 191, 3543, 2002d. Liu, G.R., Ma, W.B. and Han, X., An inverse procedure for identification of loads on composite laminate plates, Composites Part B, 33, 425, 2002e. Liu, G.R., Ma, W.B.
and Han, X., Inversion of loading time history using displacement response of composite laminates: three-dimensional cases, Acta Mech., 157(1–4), 223, 2002f. Liu, G.R. and Quek, S.S., Finite Element Method: for Readers of All Backgrounds, Butterworth-Heinemann, Burlington, MA, 2003.
Liu, G.R. and Tani, J., Lamb wave propagation in anisotropic functionally gradient material plates, in Proceedings of the First International Symposium on Functionally Gradient Materials, Yamanouchi, M. et al., Eds., Functionally Gradient Materials Forum, Sendai, 1990, 54. Liu, G.R. and Tani, J., Surface waves in functionally gradient piezoelectric material plates, ASME J. Vib. Acoust., 116, 440, 1994. Liu, G.R., Tani, J., Ohyoshi, T. and Watanabe, K., Transient waves in anisotropic laminated plates, part 1: theory, part 2: application, J. Vib. Acoust., 113, 230, 1991a. Liu, G.R., Tani, J., Watanabe, K. and Ohyoshi, T., A semi-exact method for the propagation of harmonic waves in anisotropic laminated bars of rectangular cross section, Wave Motion, 12, 361, 1991b. Liu, G.R., Tani, J. and Ohyoshi, T., Lamb waves in a functionally gradient material plates and its transient response, part 1: theory, part 2: calculation results, Trans. Jpn. Soc. Mech. Eng., 57(A), 535, 603, 1991c. Liu, G.R. and Xi, Z.C., Elastic Waves in Anisotropic Laminates, CRC Press, Boca Raton, FL, 2001. Liu, G.R., Xu, Y.G. and Wu, Z.P., Total solution for structural mechanics problems, Comput. Methods Appl. Mech. Eng., 191, 989, 2001c. Liu, G.R., Yang, Z.L., and Han, X., Stable protein structure prediction and dynamic behavior analysis, 2nd Int. Conf. Struct. Stability Dynamics, Singapore, 2002g. Liu, G.R., Zhou, J.J. and Wang, J.G., Coefficients identification in electronic system cooling simulation through genetic algorithm, Comput. Struct., 80, 23, 2002h. Liu, N., Zhu, Q.M., Wei, C.Y., Dykes, N.D. and Irving, P.E., Impact damage detection in carbon fibre composites using neural networks and acoustic emission, Key Eng. Mater., 167, 43, 1999. Liu, S.W. and Datta, S.K., Scattering of ultrasonic waves by cracks in a plate, ASME J. Appl. Mech., 60, 352, 1993. Liu, S.W., Datta, S.K.
and Ju, T.H., Transient scattering of Rayleigh–Lamb waves by a surface-breaking crack: comparison of numerical simulation and experiment, J. Nondestructive Evaluation, 10, 111, 1991. Lobo, F.G. and Goldberg, D.E., Decision making in a hybrid genetic algorithm, Proc. IEEE Conf. Evol. Computation, 121, 1997. Ip, K.-H., Tse, P.-C. and Lai, T.-C., Material characterization for orthotropic shells using modal analysis and Rayleigh-Ritz models, Composites Part B, 29B, 397, 1998. Luo, H. and Hanagud, S., Dynamic learning rate neural networks training and composite structural damage detection, AIAA J., 35, 1522, 1997. Ma, W.B., Studies on inverse problems of composite laminates, Master thesis, National University of Singapore, 2000. Mace, B.R., Wave reflection and transmission in beams, J. Sound Vib., 97(2), 237, 1984. MacKay, D.J.C., Bayesian interpolation, Neural Computation, 4(3), 415, 1992. Magyar, G., Johnsson, M. and Nevalainen, O., An adaptive hybrid genetic algorithm for the three-matching problem, IEEE Trans. Evol. Computation, 4(2), 135, 2000. Maluf, N., An Introduction to Microelectromechanical Systems Engineering, Artech House, Inc., Boston, 2000. Man, K.F., Tang, K.S. and Kwong, S., Genetic Algorithms: Concepts and Designs, Springer, London, 1999. Mason, R.L., Gunst, R.F. and Hess, J.L., Statistical Design and Analysis of Experiments, John Wiley & Sons, New York, 1989.
Marquardt, D.W., An algorithm for least squares estimation of nonlinear parameters, SIAM J. Appl. Math., 11(2), 431, 1963. Mase, H. and Kitano, K., Prediction model for occurrence of impact wave force, Ocean Eng., 26, 949, 1999. Masri, S.F., Ghassiakos, A.G. and Caughey, T.K., Neural network approach to detection of changes in structural parameters, J. Eng. Mech., 122, 350, 1996. MATLAB: the Language of Technical Computing, version 6.1.0.450, release 12.1, The MathWorks, Inc., 2001. Matsumura, S., Dkada, M., Yoshikawa, I., Togawa, M. and Kuroda, Y., A technology to form FGMs by composite electroforming, in Functionally Gradient Materials, Birch, H.J., Ed., American Ceramics Society, Westville, OH, 1993, 331. Michael, D.H., Waechter, R.T. and Collins, R., Characterization of a vertical surface-breaking crack plates, Computational Acoust., 3(4), 297, 1995. Michaels, J.E., Michaels, T.E., and Sachse, W., Applications of deconvolution to acoustic emission signal analysis, Mater. Eval., 39(11), 1032, 1981. Michaels, J.E. and Pao, Y.H., The inverse source problem for an oblique force on an elastic plate, J. Acoust. Soc. Am., 77, 2005, 1985. Michaels, J.E. and Pao, Y.H., Determination of dynamic forces from wave motion measurements, J. Appl. Mech., 53, 61–67, 1986. Michalewicz, Z., Genetic Algorithms + Data Structures = Evolution Programs (3rd rev., extended ed.), Springer-Verlag, Berlin, 1992. Mignogna, R.B., Ultrasonic determination of elastic constants from oblique angles of incidence in non-symmetry planes, in Review of Progress in Quantitative Nondestructive Evaluation, Thompson, D.O. and Chimenti, D.E., Eds., Plenum Press, New York, 1990, 1565. Mignogna, R.B., Batra, N.K. and Simmonds, K.E., Determination of elastic constants of anisotropic materials from oblique angle ultrasonic measurements. I: analysis II: experimental, in Review of Progress in Quantitative Nondestructive Evaluation, Thompson, D.O. and Chimenti, D.E., Eds., Plenum Press, New York, 1991, 1669.
Miller, J., Potter, W., Gandham, R. and Lapena, C., An evaluation of local improvement operators for genetic algorithms, IEEE Trans. Syst., Man, Cybern., 23(5), 1340, 1993. Miller, R.E., Optimization Foundations and Applications, John Wiley & Sons, New York, 2000. Mohammed, O.A. and Uler, G.G., A hybrid technique for the optimal design of electromagnetic devices using direct search and genetic algorithms, IEEE Trans. Magn., 33, 1931, 1997. Möller, P.W., Load identification through structural modification, J. Appl. Mech., 66, 236, 1999. Moré, J.J., The Levenberg–Marquardt algorithm: implementation and theory, in Numerical Analysis, Lecture Notes in Math., 630, Watson, G.A., Ed., Springer-Verlag, Berlin, 105–116, 1977. Morris, A.J., Foundations of Structural Optimization: a Unified Approach, John Wiley & Sons, New York, 1982. Moscato, P. and Norman, M., A memetic approach for the traveling salesman problem: implementation of a computational ecology for combinatorial optimization on message-passing systems, in Proc. Int. Conf. Parallel Comput. Transp. Appl., 1992. MSC/Dytran user’s manual, version 4: The MacNeal-Schwendler Corporation, Santa Ana, CA, 1997.
Muhlenbein, H., How genetic algorithms really work: part I. mutation and hillclimbing, in Parallel Problem Solving from Nature: PPSN II, Manner, R. and Manderick, B., Eds., Elsevier Science Publisher, Holland, 1992, 15. Mujumdar, P.M. and Suryanarayan, S., Flexural vibrations of beams with delaminations, J. Sound Vib., 125(3), 441, 1988. Nadeau, J.C. and Ferrari, M., Microstructural optimization of a functionally graded transversely isotropic layer, Mech. Mater., 31, 637, 1999. Nakamura, M. and Kimura, K., Elastic constants of TiAl3 and ZrAl3 single crystals, J. Mater. Sci., 26, 2208, 1991. Narkis, Y., Identification of crack location in vibrating simply supported beams, J. Sound Vib., 172(4), 549, 1994. Nelson, R.B. and Dong, S.B., High frequency vibrations and waves in laminated orthotropic plates, J. Sound Vib., 30, 33, 1973. Ni, R.G. and Adams, R.D., A rational method for obtaining the dynamic mechanical properties of lamina for predicting the stiffness and damping of laminated plates and beams, Composites, 15(3), 193, 1984. Nogata, F. and Takahashi, H., Intelligent functionally graded material: bamboo, Composites Eng., 5(7), 743, 1995. Olsson, A., Stemme, G. and Stemme, E., Diffuser-element design investigation for valve-less pumps, Sensors Actuators A, 57, 137, 1996. Olsson, A., Stemme, G. and Stemme, E., Numerical and experimental studies of flat-walled diffuser elements for valve-less micropump, Sensors Actuators A, 84, 165, 2000. Ootao, Y., Kawamura, R., Tanigawa, Y. and Imamura, R., Optimization of material composition of nonhomogeneous hollow sphere for thermal stress relaxation making use of neural network, Comput. Methods Appl. Mech. Eng., 180, 185–201, 1999a. Ootao, Y., Tanigawa, Y. and Nakamura, Y., Optimization of material composition of FGM hollow circular cylinder under thermal loading: a neural network approach, Composites Part B, 30, 415, 1999b. OptiGA: http://www.optiwater.com/optiga/. Pan, L.S., Ng, T.Y., Liu, G.R.
and Jiang, T.Y., Fluid-membrane coupling analysis for a valve-less micropump, Sensors Actuators A, 93, 172, 2001. Pandey, A.K., Biswas, M. and Samman, M.M., Damage detection from changes in curvature mode shapes, J. Sound Vib., 145, 321, 1991. Patton, A., Punch, W. and Goodman, E., A standard GA approach to native protein conformation prediction, Proceedings of 6th International Conference on Genetic Algorithms, Eshelman, L., Ed., 574, Morgan Kaufmann, Burlington, MA, 1995. Pei, Y.T. and Hosson, J.T.M.DE., Functionally graded materials produced by laser cladding, Acta Mater., 48, 2617, 2000. Petyt, M., Introduction to Finite Element Vibration Analysis, Cambridge University Press, Cambridge, 1990. Powell, M.J.D., An efficient method for finding the minimum of a function of several variables without calculating derivatives, Computer J., 7, 155, 1964. Pradhan, S.C., Krishnamoorthy, C.S., Liu, G.R., Lam, K.Y., Wang, Z.Z., Ito, T. and Takamori, N., Engineering design optimization using genetic algorithm, HPC–Asia Conference, China, 2000. Press, W.H., Flannery, B.P., Teukolsky, S.A. and Vetterling, W.T., Numerical Recipes: the Art of Scientific Computing (Fortran Version), Cambridge University Press, Cambridge, 1989.
Priestley, M.B., Spectral Analysis and Time Series, vols. 1 and 2, Academic Press, New York, 1981. Radcliffe, N. and Surry, P., Formal memetic algorithms, in Evolutionary Computing, Fogarty, T., Ed., Springer-Verlag, Berlin, 1994, 1. Rao, S.S., Engineering Optimization: Theory and Practice, 3rd ed., John Wiley & Sons, Inc., New York, 1996. Reeves, C., Genetic algorithms and neighborhood search, in Evolutionary Computing, Fogarty, T., Ed., Springer-Verlag, Berlin, 1994, 115. Rhim, J. and Lee, S.W., Neural network approach for damage detection and identification of structures, Comput. Mech., 16, 437, 1995. Riedmiller, M. and Braun, H., A direct adaptive method for faster backpropagation learning: the RPROP algorithm, Proc. IEEE Int. Conf. Neural Networks, 1993, 586. Rikards, R. and Chate, A., Identification of elastic properties of composites by method of planning of experiments, Composite Struct., 42, 257, 1998. Rikards, R., Chate, A., Steinchen, W., Kessler, A. and Bledzki, A.K., Method of identification of elastic properties of laminates based on experiment design, Composites Part B, 30, 279, 1999. Rizos, P.F., Aspragathos, N. and Dimarogonas, A.D., Identification of crack location and magnitude in a cantilever beam from the vibration modes, J. Sound Vib., 138(3), 381, 1990. Rizzi, S.A. and Doyle, J.F., Spectral analysis of wave motion in plane solids with boundaries, J. Vib. Acoust., 114, 133, 1992. Rogers, J.L., Simulating structural analysis with neural network, J. Comput. Civ. Eng., 8, 252, 1994. Rokhlin, S.I. and Wang, W., Double through-transmission bulk wave method for ultrasonic phase velocity measurement and determination of elastic constants of composite materials, J. Acoust. Soc. Am., 91(6), 3303, 1992. Rose, J.L., Huang, Y. and Tverdokhlebov, A., Surface waves for anisotropic material characterization: a computer aided evaluation system, in Review of Progress in Quantitative Nondestructive Evaluation, Thompson, D.O.
and Chimenti, D.E., Eds., Plenum Press, New York, 1990, 1573. Sachse, W. and Kim, K.Y., Point-source/point-receiver materials testing, in Review of Progress in Quantitative Nondestructive Evaluation, Thompson, D.O. and Chimenti, D.E., Eds., Plenum Press, New York, 1986, 311. Salawu, O.S., Detection of structural damage through changes in frequency: a review, Eng. Struct., 19, 718, 1997. Santamarina, J.C. and Fratta, D., Introduction to Discrete Signals and Inverse Problems in Civil Engineering, ACES Press, Reston, VA, 1998. Santosa, F. and Pao, Y.H., Transient axially asymmetric response of an elastic plate, Wave Motion, 11, 271, 1989. Sareni, B. and Krahenbuhl, L., Fitness sharing and niching methods revisited, IEEE Trans. Evol. Computation, 2(3), 97, 1998. Sasaki, M., Wang, Y., Hirano, T. and Hirai, T., Design of SiC/C functionally gradient material and its preparation by chemical vapor deposition, J. Ceramic Soc. Jpn., 97(5), 539, 1989. Scales, L.E., Introduction to Nonlinear Optimization, Macmillan, U.K., 1985. Schnecke, V. and Vornberger, O., Hybrid genetic algorithms for constrained placement problems, IEEE Trans. Evol. Computation, 1(4), 266, 1997. Schwartz, M.M., Joining of Composite-Matrix Materials, ASM International, Materials Park, OH, 153, 1994.
Shakhnovich, E., Farztdinov, G., Gutin, A.M. and Karplus, M., Protein folding bottlenecks: a lattice Monte Carlo simulation, Phys. Rev. Lett., 67, 1665, 1991. Shimoda, N., Kitaguchi, S., Saito, T., Takigawa, H. and Koga, M., Production of functionally gradient materials by applying low pressure plasma spray, Proc. 1st Int. Sym. Functionally Gradient Mater., Sendai, Japan, 151–156, October, 1990. Shiota, I. and Miyamoto, Y., Functionally graded materials, Proceedings of 4th International Symposium on Functionally Graded Materials, Elsevier, Amsterdam, 1997. Silva, J.M.E.M. and Gomes, A.J.M., Experimental dynamic analysis of cracked free-free beams, Exp. Mech., 30(1), 20, 1990. Simmons, G. and Wang, H., Single Crystal Elastic Constants and Calculated Aggregate Properties: a Handbook, MIT Press, Cambridge, MA, 1971. Smith, C.J., Metal Reference Books, 5th ed., Butterworth & Co., London, 1976. Soares, C.M.M., Defreitas, M.M., Araujo, A.L., and Pedersen, P., Identification of material properties of composite plate specimens, Composite Struct., 25, 277, 1993. Sribar, R., Solutions of inverse problems in elastic wave propagation with artificial neural networks, dissertation, Cornell University, Ithaca, NY, 1994. Stemme, E. and Stemme, G., A valve-less diffuser/nozzle based fluid pump, Sensors Actuators A, 39, 159, 1993. Stoloff, N.S., An overview of powder processing of silicides and their composites, Mater. Sci. Eng., A261, 169, 1999. Strang, G. and Nguyen, T., Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, 1996. Sumpter, B.G. and Noid, D.W., On the design, analysis, and characterization of materials using computational neural networks, Annu. Rev. Mater. Sci., 26, 223, 1996. Syswerda, G., Uniform crossover in genetic algorithms, in Proceedings of the 3rd International Conference on Genetic Algorithms, Schaffer, J., Ed., Morgan Kaufmann, Los Altos, 1989, 2. Takahashi, K.
and Chou, T.W., Non-linear deformation and failure behavior of carbon/glass hybrid laminates, J. Composite Mater., 21, 396, 1987. Takahashi, M., Itoh, Y. and Kashiwaya, H., Fabrications and evaluations of W/Cu gradient material by sintering and infiltration technique, Proc. 1st Int. Sym. Functionally Gradient Mater., Sendai, Japan, 1990, 129. Takemura, M., Yoshitake, A., Haykawa, H., Hyakubu, T. and Tamura, M., Mechanical and thermal properties of FGM fabricated by thin sheet lamination method, Proc. 1st Int. Sym. Functionally Gradient Mater., Sendai, Japan, 1990, 97. Tanaka, M. and Bui, H.D., Inverse Problems in Engineering Mechanics, Springer-Verlag Press, U.K., 1992. Tao, J.S., Liu, G.R. and Lam, K.Y., Design optimization of marine-engine mounting systems, J. Sound Vib., 235, 477, 2000. Tikhonov, A.N., Goncharsky, A.V. and Yagola, A.G., Numerical Methods for the Solution of Ill-Posed Problems, Kluwer Academic Publishers, Dordrecht, 1990. Topping, B.H.V. and Bahreininejad, A., Neural Computing for Structural Mechanics, Saxe-Coburg Publications, Edinburgh, 1997. Tosaka, N., Onishi, K. and Yamamoto, M., Mathematical Approach and Solution Methods for Inverse Problems: Inverse Analysis of Partial Differential Equations, University of Tokyo Press, 1999. Touloukian, Y.S., Thermophysical Properties of High Temperature Solid Materials, Macmillan, New York, 1967.
Tracy, J.J. and Pardoen, G.C., Effect of delamination on the natural frequencies of composite laminates, J. Composite Mater., 23, 1201, 1989.
Truffart, B., Jarny, Y. and Delaunay, D., A general optimization algorithm to solve 2D boundary inverse heat conduction problems using finite elements, first conference in a series on inverse problems in engineering, Palm Coast, FL, 1993, 53.
Tsai, S.W. and Hahn, T.H., Introduction to Composite Materials, Technomic Publishing Co., Lancaster, PA, 1980.
Ullmann, A., The piezoelectric valve-less pump: performance enhancement analysis, Sensors Actuators A, 69, 97–105, 1998.
Unger, R. and Moult, J., Genetic algorithms for protein folding simulations, J. Mol. Biol., 231, 75, 1993.
Vinson, J.R. and Sierakowski, R.L., The Behavior of Structures Composed of Composite Materials, Martinus Nijhoff, Dordrecht, 1987.
Vogl, T.P., Mangis, J.K., Rigler, A.K., Zink, W.T. and Alkon, D.L., Accelerating the convergence of the back-propagation method, Biol. Cybern., 59, 257, 1988.
Voter, A.F. and Chen, S.P., Accurate interatomic potentials for Ni, Al, and Ni3Al, in Characterization of Defect in Material, vol. 82, Siegar, R.V., Weetman, J.R., and Sinclair, R., Eds., Material Research Society, Pittsburgh, 1987, 175.
Voter, A.F., The embedded-atom method, in Intermetallic Compounds: Principles and Applications, Vol. 1, Westbrook, J.H. and Fleischer, R.L., Eds., John Wiley & Sons, New York, 77, 1994.
Waas, G., Linear two-dimensional analysis of soil dynamics problems in semi-infinite layer media, Ph.D. Thesis, University of California, Berkeley, 1972.
Walsh, G.R., Methods of Optimization, John Wiley & Sons, New York, 1975.
Wang, J.T.S., Liu, Y.Y. and Gibby, J.A., Vibrations of split beams, J. Sound Vib., 84(4), 491–502, 1982.
Wang, Y.Y., Lam, K.Y., and Liu, G.R., Detection of flaws in sandwich plates, Composite Struct., 34, 409, 1996.
Wang, Y.Y., Lam, K.Y. and Liu, G.R., Wave scattering of the interior vertical crack in plates and the detection of the crack, Eng. Fracture Mech., 59, 1, 1998.
Warshel, A., Computer Modeling of Chemical Reactions in Enzymes and Solutions, John Wiley & Sons, New York, 1991.
Watanabe, R. and Kawasaki, A., Overall view of the P/M fabrication of functionally gradient materials, Proc. 1st Int. Sym. Functionally Gradient Mater., Sendai, Japan, 1990, 107.
Watari, F., Yokoyama, A., Saso, F., Uo, M. and Kawasaki, T., Fabrication and properties of functionally graded dental implant, Composites Part B, 28, 5, 1997.
Weaver, R.L. and Pao, Y.H., Axisymmetric elastic waves excited by a point source in a plate, ASME J. Appl. Mech., 49, 821, 1982.
Whitley, D., Gordon, S. and Mathias, K., Lamarckian evolution, the Baldwin effect and function optimization, in Parallel Problem Solving from Nature: PPSN III, Davidor, Y., Schwefel, H.-P. and Manner, R., Eds., Springer-Verlag, Berlin, 1994, 6.
Wilkowshi, G.M. and Maxey, W.A., ASTM STP 791, Amer. Soc. Testing Mater., II-266, 1983.
Wong, K.P. and Wong, Y.W., Hybrid genetic/simulated annealing approach to short-term multi-fuel-constrained generation scheduling, IEEE Trans. Power Syst., 12(2), 776, 1997.
Wright, A.H., Genetic algorithms for real parameter optimization, in Foundations of Genetic Algorithms, Rawlins, J.E., Ed., Morgan Kaufmann, San Mateo, 205, 1991.
Wu, E., Tsai, T.D. and Yen, C.S., Two methods for determining impact-force history on elastic plates, Exp. Mech., 35, 11, 1995.
Wu, T.Y. and Liu, G.R., The differential quadrature rule for initial-value differential equations, J. Sound Vib., 233, 195, 2000.
Wu, X., Ghaboussi, J. and Garrett, J.H., Use of neural networks in detection of structural damage, Comput. Struct., 42(4), 649, 1992.
Wu, Z.P., Liu, G.R. and Han, X., An inverse procedure for crack detection in anisotropic laminated plates using elastic waves, Eng. Comput., 18(2), 116, 2002.
Wycisk, W. and Teller-Kniepmeier, M., Quenching experiments in high purity Ni, J. Nucl. Material, 616, 79, 1978.
Xiao, F.C. and Yabe, H., Microwave image of perfectly conducting cylinders from real data by micro genetic algorithm coupled with deterministic method, IEICE Trans. Electron., E81-c(12), 1784, 1998.
Xu, Y.G., Research on the dynamic intelligent diagnosis and reliability evaluation for structures, Ph.D. dissertation, Huazhong University of Science and Technology, China, 1996.
Xu, Y.G. and Liu, G.R., Determination of material properties of multilayered thin films using elastic wave propagation approach, J. Micromech. Microeng., 12, 723, 2002a.
Xu, Y.G. and Liu, G.R., A novel evolutionary algorithm for complicated optimization problems, Proc. Singapore-MIT 2nd Annu. Sym., Singapore, 2002b.
Xu, Y.G. and Liu, G.R., A novel inverse algorithm for parameter identification problems in MEMS, 5th World Cong. Computational Mech., Vienna, 2002c, 199.
Xu, Y.G. and Liu, G.R., Detection of flaws in composite materials from scattered elastic-wave field using modified µGA and gradient-based optimizer, Comput. Methods Appl. Mech. Eng., 191, 3929, 2002d.
Xu, Y.G. and Liu, G.R., Fitting interatomic potential using molecular dynamics simulations and inter-generation projection genetic algorithm, J. Micromech. Microeng., 13, 254, 2003.
Xu, Y.G. and Liu, G.R., Parameter identification of dynamic flow-pressure characteristics in valve-less micropumps, 2002e, submitted.
Xu, Y.G., Liu, G.R. and Wu, Z.P., An accelerated genetic algorithms using Hooke-Jeeves method for local searching, in Proceedings of the 1st International Conference on Structural Stability and Dynamics, Yang, Y.B., Leu, L.G. and Hsieh, S.H., Eds., Taiwan: College of Engineering, National Taiwan University, 2001a, 781.
Xu, Y.G., Liu, G.R., Wu, Z.P. and Huang, X.M., Adaptive multilayer perceptron networks for detection of cracks in anisotropic laminated plates, Int. J. Solids Struct., 38, 5625, 2001b.
Xu, Y.G., Liu, G.R. and Wu, Z.P., A novel hybrid genetic algorithm using local optimizer based on heuristic pattern move, Appl. Artif. Intelligence, 15(7), 601, 2001c.
Xu, Y.G., Liu, G.R. and Wu, Z.P., Damage detection for composite plates using Lamb waves and projection genetic algorithm, AIAA J., 40(9), 1860, 2002.
Yamada, T. and Nakano, R., A genetic algorithm applicable to large-scale job-shop problems, in Parallel Problem Solving from Nature: PPSN II, Manner, R. and Manderick, B., Eds., Elsevier Science Publisher, North-Holland, 1992, 281.
Yamaoka, H., Yuki, M., Tahara, K., Irisawa, T., Watanabe, R. and Kawasaki, A., Fabrication of functionally gradient material by slurry stacking and sintering process, Ceramic Trans. Functionally Gradient Mater., 34, 165, 1993.
Yanagisawa, N., Sata, N. and Sanada, N., Fabrication of TiB2-Cu functionally gradient material by SHS process, Proc. 1st Int. Sym. Functionally Gradient Mater., Sendai, Japan, 1990, 179.
Yang, S.Y., Park, L.J., Park, C.H. and Ra, J.W., A hybrid algorithm using genetic algorithm and gradient-based algorithm for iterative microwave inverse scattering, in IEEE Int. Conf. Evol. Computation, 1995, 450.
Yang, Z.L., Liu, G.R. and Han, X., Protein structure prediction using lattice model and a modified micro genetic algorithm, ICMBE, Singapore, D3VB-1230, 2002a.
Yang, Z.L., Liu, G.R. and Lam, K.Y., An inverse procedure for crack detection using integral strain measured by optical fibers, Smart Mater. Struct., 11, 72, 2002b.
Yang, Z.L., Liu, G.R. and Lam, K.Y., A modified genetic algorithm with local and global search techniques, 3rd Int. Conf. Bioinformatics Genome Regul. Struct. (BGRS '2002), Novosibirsk, Russia, 2002c, 190.
Yang, Z.L., Liu, G.R., Xu, Y.G. and Lam, K.Y., A local optimized genetic algorithm and its application to inverse detection of delamination in laminates, Asia–Pacific Vibration Conference, Hangzhou, China, 2001, 1155.
Yeh, W.G., Review of parameter identification procedures in groundwater hydrology, Water Resour. Res., 22, 95, 1986.
Yen, C.S. and Wu, E., On the inverse problem of rectangular plates subjected to elastic impact, J. Appl. Mech., 62, 692, 1995.
Zgonc, K. and Achenbach, J.D., A neural network for crack sizing trained by finite element calculations, NDT&E Int., 29(3), 147, 1996.
Zhao, J., Ivan, J.N. and DeWolf, J.T., Structural damage detection using artificial neural networks, J. Infrastruct. Syst., 4, 93, 1998.
Zhou, J.J., Inverse procedures and their applications in parameter identification for electronic system cooling, Master Thesis, National University of Singapore, 2001.
Zhu, J. and Lu, Z., A time domain method for identifying dynamic loads on continuous systems, J. Sound Vib., 148, 137, 1991.
Zienkiewicz, O.C. and Taylor, R.L., The Finite Element Method, 5th ed., McGraw-Hill, New York, 2000.