Proceedings of the International Conference of Computational
METHODS in Sciences and Engineering 2003 (ICCMSE 2003)
Proceedings of the International Conference of Computational
METHODS in Sciences and Engineering 2003 (ICCMSE 2003) Kastoria, Greece
September 12 - 16
editor
T. E. Simos
World Scientific New Jersey London Singapore Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224. USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661. UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE.
British Library Cataloguing-in-Publication Data. A catalogue record for this book is available from the British Library.
COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING: Proceedings of the International Conference 2003 (ICCMSE 2003). Copyright © 2003 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-238-595-9
Printed in Singapore by World Scientific Printers (S) Pte Ltd
PREFACE FOR THE PROCEEDINGS OF THE INTERNATIONAL CONFERENCE OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING (ICCMSE 2003), SEPTEMBER 12-16, KASTORIA, GREECE
T.E. SIMOS, Department of Computer Science and Technology, Faculty of Sciences and Technology, University of Peloponnese, GR-22100 Tripolis, Greece. E-mail: [email protected]
It is well known that computational methods originally developed in a given basic science, for example physics, can be applied very effectively in neighbouring sciences, e.g. chemistry, biology, etc., as well as in engineering. The communication of computational scientists and the interchange of ideas on computational methods are therefore of great importance. This is the main scope of the International Conference of Computational Methods in Sciences and Engineering (ICCMSE), which I have founded. This year the International Conference of Computational Methods in Sciences and Engineering 2003 (ICCMSE 2003) is taking place in Kastoria, Greece. We received a large number of submitted papers, and the rejection rate (after international peer review) was approximately 50%. I want to thank World Scientific Publishing Company for the excellent cooperation in the publication of the Proceedings (Extended Abstracts) of ICCMSE 2003.
Many thanks to the Editors-in-Chief and Publishers of the journals that have accepted to publish selected full papers of ICCMSE 2003. The selection will be based on international peer review by at least two independent reviewers.
I would also like to thank:
The Scientific Committee:
Prof. H. Ågren, Royal Institute of Technology, Sweden; Prof. H. Arabnia, The University of Georgia, USA; Prof. J. Vigo-Aguiar, University of Salamanca, Salamanca, Spain; Prof. D. Belkić, Karolinska Institute, Stockholm, Sweden; Prof. K. Belkić, University of Southern California, USA; Prof. E. Brändas, University of Uppsala, Uppsala, Sweden; Prof. J.C. Butcher, The University of Auckland, New Zealand; Prof. A.Q.M. Khaliq, Western Illinois University, USA; Prof. G. Maroulis, University of Patras, Patras, Greece; Prof. R.M. Nieminen, Helsinki University of Technology, Finland; Prof. S. Wilson, Rutherford Appleton Laboratory, UK; Prof. J. Xu, Pennsylvania State University, USA, for their help and their important support. I must note here that it is a great honour for me that the above leaders in the computational sciences have accepted to participate in the Scientific Committee of ICCMSE 2003.
The invited speakers, for their acceptance to give keynote lectures on the computational sciences.
The Chair of the Organizing Committee, Dr. Zacharoula Kalogiratou, for her support and great efforts for the success of the organization of ICCMSE 2003.
The Organizing Committee: Mr. Th. Monovasilis, Mr. D. Sakas, Dr. I. Sinatkas, Dr. E. Siskos, Dr. S. Spartalis, Mr. Th. Themelis, Mr. K. Tselios, Dr. N. Tsounis, for their help and support.
Special thanks to the Secretary of ICCMSE 2003, Mrs. Eleni Ralli-Simou, for her excellent job.
The Technological Educational Institute of Western Macedonia, Kastoria Campus, and its Director Dr. N. Tsounis, for the hospitality.
The Region of Western Macedonia, Greece, for their financial support.
Dr. T.E. Simos, Active Member of the European Academy of Sciences and Arts, Invited Chair of ICCMSE 2003, Editor
TABLE OF CONTENTS
1.
P. Abad, 1-3 Components for Time Series Receiver Clock Offset in GPS Solutions
2.
4-8 E. Afjei, M. H. Arbab, Magnetostatic Field Analysis by Employing Absorbing Boundary Condition
3.
9-11 Z. Akdogan, M. Demirci and O. Sh. Mukhtarov, Sturm-Liouville Problems with Eigendependent Boundary and Transmissions Conditions
4.
T. M. Alkhamis, 12-14 Computational Methodfor Unconstrained Optimization Functions with Noise
5.
M. Al-Refai, 15 Convergence Analysis for an Iterative Method for Solving Nonlinear Parabolic Systems
6.
G. M. Amiraliyev, 16-21 Uniform Numerical Method for a Quasilinear System with Boundary Layer
7.
22-23 Z. A. Anastassi and T. E. Simos, A Family of Optimized Runge-Kutta Methods with Five Stages and Fourth Order for IVPs with Oscillating Solutions
8.
24-27 R. Anguelov, P. Kama and JM-S Lubuma, Nonstandard Theta-Method and Related Discrete Schemes for the Reaction-Diffusion Equation
9.
28-32 O. Angulo, J. C. Lopez-Marcos, Numerical Integration of a Size-Structured Cell Population Model in an Environment of Changing Substrate Concentration
10.
I. Arregui, J. J. Cendan and C. Vazquez, 33-37 A Duality Method for the Compressible Reynolds Equation. Application to Simulation of Read/Write Process in Magnetic Storage Devices
11.
F. Balibrea, J. L. C. Guirao and F. L. Pelayo, 38-46 An Environment for Computing Topological Entropy for Skew-Product Transformations
12.
P. G. Bagos, Th. D. Liakopoulos and S. J. Hamodrakas, 47-55 Maximum Likelihood and Conditional Maximum Likelihood Learning Algorithms for Hidden Markov Models with Labeled Data-Application to Transmembrane Protein Topology Prediction
13.
M. K. Banda, 56-59 Variants of Relaxed Schemes and two-Dimensional Gas Dynamaics
14.
S. C. Basak, B. D.Cute and D. Mills, 60 Predicting Toxicity of Chemicals Using Chemodescriptors and Biodescriptors: an Integrated Approach
15.
S. R. Basu, 61-65 Measuring Economic Well-Being and Governance: Some Methodological Tools
16.
F. A. Batzias, N. P. Nikolaou, A. S. Kakos and 1. Mihailides, 66-73 Modelling the Natural Gas Consumption in a Changing Environment
17.
F. A. Batzias, A. S. Kakos and N. P. Nikolaou, 74-82 Computer Aided Dimensional Analysis for Knowledge Management in Chemical Engineering Processes
18.
L. Bayon, J. M. Grau, M. M. Ruiz and P. Suarez, 83-86 New Developments on Equivalent Thermal in Hydrothermal Optimization. An Algorithm of Approximation
19.
Dž. Belkić, 87-89 Unique Virtues of the Padé Approximant for High-Resolution Signal Processing
20.
Y. S. Boutalis and O. I. Kosmidou, 90-92 A Feedback Linearization Technique by Using Neural Networks: Application to Bioprocess Control
21.
J. C. Butcher, 93-97 Some Numerical Methods for Stiff Problems
22.
E. Camouzis, R. DeVault, G. Papaschinopoulos, 98 Period Two Trichotomy on x_{n+1} = (α + γx_{n-1} + δx_{n-2}) / (x_n + x_{n-2})
23.
P. Carbonniere, D. Begue, A. Dargelos and C. Pouchan, 99-103 Least Squares Fits of Potentials by Using Energy and Gradient Data: Vibrational Anharmonic Spectra for H2CO from DFT Calculations
24.
G. Castellano, A. M. Fanelli and C. Mencar, 104-109 Deriving Prediction Intervals for Neurofuzzy Networks
25.
M. M. Cerimele, D. Mansutti and F. Pistella, 110-113 Marangoni Effects in a Horizontal Solidification Process in Microgravity
26.
C. S. Chew, K. S. Yeo and C. Shu, 114-117 Simulation of Incompressible Flows Across Moving Bodies Using Meshless Finite Difrerencing
27.
A. Chortaras, Y. Guo, M. M. Ghanem, F. 0. Bunnin, 118-121 Automatic Generation of Software Components for Real Options Modelling
28.
V. N. Christofilakis, Ch. Alexopoulos, 122-125 Modeling the State and Behavior of an Enzyme Using UML- an Object Oriented Approach
29.
K. Daoulas and V. G. Mavrantzas, 126-132 Atomistic Monte Carlo Simulation Studies of Polymer Melts Grafted on Solid Substrates
30.
N. J. Daras, 133-137 Markov's Property and Generalized Padé-Type Approximants
31.
M. Darbandi, K. Mazaheri-Body, S. Vakilipour, 138-143 A Pressure Weighted Upwinding Scheme for Calculating Flows on Unstructured Grids
32.
K. Dosios, K. Paparrizos, N. Samaras and A. Sifaleras, 144-147 An Eflcient Modijkation of the Primal-Dual Ttwo Paths Simplex Algorithm
33.
V. Drakopoulos, 148-151 Comparing Sequential Visualisation Methodsfor the Mandelbrot Set
34.
152-155 I. Z. Emiris and Th. G. Nikitopoulos, Structured Matrix Perturbationsfor Molecular Conformations
35.
G. Cerruela Garcia, I. Luque Rub, M. A. Gomez-Nieto, 156-159 A New Algorithm to Obtain All Maximum Common Subgraphs in Molecular Graphs Using Binary Arithmetic and Constraints Satisfaction Model
36.
M. Christodoulakis, C. S. Iliopoulos, Kunsoo Park, J. S. Sim, 160-165 Implementing Approximate Regularities Extended Abstract
37.
166-170 R. Clifford and M. Sergot, Distributed Suffix Trees and Their Application to Large-Scale Genomic Analysis
38.
A. Episkopakis, D. Nikolopoulos, K. Arvanitis, N. Dimitropoulos, 171-174 G. Panayiotakis, D. Cavouras and I. Kandarakis, Modeling the Detective Quantum Eflciency of Scintillators Used in Medical Imaging Radiation Detectors
39.
S. C. Farantos, 175-178 Bifurcation Phenomena in Molecular Vibrational Spectroscopy
40.
179-182 A. Gaitanis, M. R. Freedman and N. M. Spyrou, Verification of a Simple Technique for the Removal of Motion Artefacts in Electrical Impedance Epigastrography Signals
41.
P. J. Garcia-Nieto, 183-189 Numerical Simulation of Scavenging of an Urban Aerosol by Filtration Taking Into Account the Presence of Coagulation, Condensation, and GravitationalSettling
42.
E. E. Gdoutos, A. A. Giannakopoulos and D. A. Zacharopoulos, 190-191 Stress Analysis and Failure Mechanisms of Compose Materilas with Debonded Interfaces
43.
D. Glotsos, P. Spyridonos, P. Petalas and G. Nikiforidis, D. Cavouras, P. Ravazoula, P. Dadioti and I. Lekka, 192-195 Support Vector Machines for ClassiJcation of Histopathological Images of Brain Tumour Astrocytomas
44.
D. Greenspan, 196 N-Body Modelling
45.
A. P. Grinko, M. M. Karpuk, 197-204 About One Approach to the Minimization of the Errors of the Tutoring of the Neuron Networks
46.
F. Grondin and G. Mounajed, A. Ben Hamida and H. Dumontet, 205-208 Digital Concrete: A Multi-Scale Approach for the Concrete Behavior
47.
S. Guangyi, T. Kawabe, K. Toraichi, K. Katagishi, 209-212 A New Approach to Discrete Approximation of a Continuous-Time System Model Based on Spline Function
48.
J. L. Guisado, F. Jimenez-Morales, J. M. Guerra, 213-216 Application of Shannon 's Entropy to ClassiJLEmergent Behaviors in a Simulation of Laser Dynamics
49.
G. Hanna and J. Roumeliotis, 217-222 Collocation and Fredholm Equations of the First Kind
50.
A. Haskopoulos and G. Maroulis, 223-227 Intermolecular Interactions of (H2O)2
51.
U. Hohm, L. Zarkova, 228-235 Accurate Thermophysical Properties of Neat Globular Gases and Their Binary Mixtures Determined by Means of an Isotropic Temperature-Dependent Potential
52.
R Hoppe and W. Litvinov and T. Rahman, 236-241 Modelling and Computation of Axially Symmetric Flows of Electrorheological Fluids
53.
D. T. Hristopoulos, 242-247 Simulations of Spartan Random Fields
54.
L. S. Iliadis, S. H. Spartalis, 248-249 Fundamental Fuzzy Relation Concepts of a D.S.S. for the Estimation of Natural Disasters’ Risk (The Case of a Trapezoidal Membership Function)
55.
S. Itoh, M. Igami, 250-253 A Hybrid Molecular Dynamics Simulation Methodfor Solids
56.
I. G. Ivanov and L. G. Taseva, 254-257 The Contract Gas Market with a Linear Supply Function
57.
A. Kaczanowski, K. Malan and K. Kulakowski, 258-261 Hysteresis Loop of a Nanoscopic Magnetic Array
58.
Z. Kalogiratou, Th. Monovasilis, Th. Simos, 262-267 Numerical Solution of the Two- Dimensional Time Independent Schrodinger Equation with Exponential-Fitting Methods
59.
I. Kalatzis, N. Piliouras, E. Ventouras and I. Kandarakis, C. C. Papageorgiou and A. D. Rabavilas, D. Cavouras, 268-271 Probabilistic Neural Network Versus Cubic Least-Squares MinimumDistance in Classijjing EEG Signals
60.
I. Kalatzis and N. Piliouras, D. Pappas, E. Ventouras and D. Cavouras, 272-276 Probabilistic Neural Network Classifier Versus Multilayer Perceptron ClassiJier in Discriminating Brain Spect Images of Patients with Diabetesfiom Normal Controls
61.
T. E. Karakasidis, A. B. Liakopoulos, N. S. Cholevas, 277-280 Parallel Molecular Dynamics Simulation of Lennard-Jones Liquids on a Small Beowulf Cluster
62.
T. E. Karakasidis, 281-284 Vibrational Properties of NiO(110) Surface by Molecular Dynamics Simulation
63.
P. Karamanis and G. Maroulis, 285-288 Electric Properties of Substituted Diacetylenes
64.
N. Karatsis and G. Maroulis, 289-291 Molecular Structure and Electric Polarizability in Sodium Chloride Clusters
65.
M. M. Karpuk, 292-296 About the Possibility of Applying the Neuron Networksfor Determining the Parameters of Uniaxial Films on the Basis of the Ellipsometric measurements
66.
S. H. Kashani, 297 A Fuzzy Logic Paradigm for Industrial Economics Analysis
67.
H. Katsuragi and H. Honjo, 298-301 Monotonic Scaling of the KPZ Growth with Quenched Disorder
68.
H. Kaya, M. Kaplan, H. Saygin, 302-305 A Recursive Algorithm for Finding HDMR Terms for Sensitivity Analysis
69.
P. Kolorenc, J. Horacek, K. Houfek and M. C. Zek, G. Mil’Nikov H. Nakamura, 306-308 Calculation of Vibrational Excitation of Diatomic Molecules Below Dissociative Attachment Threshold
70.
K. Konstantinidis and I. Andreadis, 309-310 On the Use of Color Histograms for content Based Image Retrieval in Various Color Spaces
71.
A. M. Kosmas, 311-315 Theoretical Structural and Relative Stability Studies of Isomeric and Conformeric Forms of XOOY Peroxides (X = H, CH3, Cl, Br, I; Y = Cl, Br)
72.
W. J. Kowalski, J. Nowak and M. Konior, 316-322 Modeling of Chiral Separations in Chromatography by Means of Molecular Mechanics
73.
M. I. Krivoruchenko, E. Alessio, V. Frappietro and L. J. Streckert, 323-326 Probability Distributions of Volatility in Financial Time Series
74.
M. Kunik, S. Qamar and G. Warnecke, 327-332 Kinetic Solution of the Boltzmann-Peierls Equation
75.
P. V. Kyratsis, D. A. Panagiotopoulos, D. V. Kakogiannis, 333-336 Computer Aided Engineering for Theoretical Studies of Vehicle Active Suspension Systems
76.
M. Lambiris, Ch. Tsitouras and K. Evmorfopoulos, 337-339 Four-Step, Two-Stage, Sixth-Order, P-Stable Methods
77.
340-345 G. Lappas and V. Ambrosiadou, Binary and Multicategory ClassificationAccuracy of the LSA Machine
78.
E. C. Laskari, G. C. Meletiou, D. K. Tasoulis, M. N. Vrahatis, 346-349 Data Mining and Cryptology
79.
Ming-Gong Lee, 350-359 Application of Automatic Diflerentiation in Numerical Solution of a Flexible Mechanism
80.
360-364 T. Levitina and E. J. Brandas, Numerical Quadrature Performed on the Generalized Prolate Spheroidal Functions
81.
T. Levitina and E. Brandas, 365 Multitaper Techniques and Filter Diagonalisation -a Comparison
82.
M.H.X. Liang & B. Wetton, T.G. Myers, 366-376 Combined Air and Rivulet Flow and Application to Fuel Cells
83.
377-381 M. Liapi, K. Alketas Ougrinis, The Transmutation of the Architectural Synthesis. Morphing Procedures Through the Adaptation of Information Technologv
84.
382-386 Y. Li, Shao-Ming Yu and P. Chen, A Parallel Adaptive Finite Volume Methodfor Nanoscale Double Gates Mosfets Simulation
85.
Y. Li, 387-390 An Iterative Method for Single and Vertically Stacked Semiconductor Quantum Dots Simulation
86.
391-394 J. A. Lopez and F. J. Marco, M. J. Martinez, Proposal of a New Computational Method for the Analysis of the Systematic Dsfferences in Star Catalogues
87.
M. S. Magdon-Maksymowicz, A. Dydejczyk, P. Gronek and A. Z. Maksymowicz, 395-399 Simulation of the Switching Curve in Antiferromagnetic Ising Model
88.
G. Maroulis, 400-404 Electric Hyperpolarizability Calculations
89.
F. Martinez, A. Guillamon and J. J. Martinez, 405-408 Segmentation of Natural Speech Using Fractal Dimension
90.
J. Mateu and J.A. Lopez, 409-412 Cluster Modelsfor Spatial Point Processes with Applications
91.
T. Mavromoustakos, P. Zoumpoulakis, M. Zervou, I. Kyrikou, A. Kapou, N. Benetis, 413-417 The Use of ComputationalAnalysis to Design Novel Drugs
92.
E. Miletics, 418-426 Energy ConservativeAlgorithmfor Numerical Solution of ODES Initial Value Problems
93.
B. F. Minaev and H. Agren, 427-431 Enzymatic Spin Catalysis Involving O2
94.
N. Moir, 432-435 A New Class of Methodsfor Solving Ordinary Diferential Equations
95.
G. Molnárka, 436-445 Implicit Extension of Taylor Series Method for Initial Value Problems
96.
446-450 Th. Monovasilis, Z. Kalogiratou, T. E. Simos, Exponential-Fitting Symplectic Methods for the Numerical Integration of the Schrodinger Equation
97.
H. Nakatsuji, 451-453 Structure of the Exact Wave Function: Progress Report
98.
S. Nikolić and N. Trinajstić, 454-456 Complexity of Molecules
99.
D. Nikolopoulos, P. Liaparinos, S. Tsantis, D. Cavouras and I. Kandarakis, G. Panayiotakis, 457-460 Radiation Detection Efficiency Evaluation of YAP:Ce Scintillator by Monte-Carlo Methods
100.
L. A. A. Nikolopoulos, 461-465 B-Splines: A Powerful and Flexible Numerical Basisfor the Continuum Spectrum of the Schrodinger Equation. An Application to Hydrogenic Atomic Systems
101.
L. A. A. Nikolopoulos, 466-469 A Finite Element Approach for the Dirac Radial Equation
102.
470-473 I. Ntzoufras, A. Katsis, D. Karlis, A Bayesian Statistical Modeling for the Distribution of Insurance Counts
103.
N. Orfanoudakis, H. Hatziapostolou, E. Mastorakos, E. Sardi, K. Krallis, N. Vlachakis, S. Mavromatis, 474-478 Design, Evaluation Measurements and CFD Modeling of a Small Swirl Stabilised Laboratory Burner
104.
479-483 Y. Panagis, E. Theodoridis, K. Tsichlas, Data Structuring Application for String Problems in Biological Sequences
105.
484-489 N. G . Pavlidis, K. E. Parsopoulos and M. N. Vrahatis, Computing Nash Equilibria Through Particle Swarm Optimization
106.
D. G. Pavlou, N. V. Vlachakis, M. G. Pavlou, V. N. Vlachakis, M. Kouskouti, I. Statharas, 490-491 Foundamental Solution of the Cracked Dissimilar Elastic Space
107.
G. Papakaliatakis, D. Karalekas, 492-493 Study of Fracture in SiC/Al Composites
108.
G. Papakaliatakis, 494-495 Computational Study of the Crack Extension Initiation in a Solid Propellant Plate with a Circular Hole
109.
496-499 S. H. Park, J. H. Kim, Nodal Stress Recovery and Error Estimation Based on Variation of Mapping Function
110.
K. Perdikuri, C. Makris, A. Tsakalidis, 500-503 Discovering Regularities in Biosequences: Challenges and Applications
111.
I. Petrounias, A. Tseng, P. Chountas, 504-511 Constraint Based Web Mining
112.
I. Petrounias and A. Assaid, 512-519 Temporal Web Log Mining Using OLAP Techniques
113.
L. Pogliani, 520 Introducing Complete Graphs in Molecular ConnectivityStudies
114.
M. Sekkal-Rahal, D. C. Kleb and P. Bleckmann, 521 Structures and Energies of β-Neocarrabiose in Vacuum and in Aqueous Solution
115.
H. Ramos, J. Vigo-Aguiar, 522-525 Variable Step-Size Störmer Methods
116.
H. Ramos, J. Vigo-Aguiar, 526-529 A Note on the Selection of the Step Size in the Variable Step-Size Stormer Method
117.
A. Rizzo, 530-533 Birefringences: A Challenge for Both Theory and Experiment
118.
J. Roca J. R., J. Roca, J. Martinez and F. J. Martinez, F. J. Gil and J. A. Alvarez-Gomez 534-537 Feasibility of Closed-Loop Target Controlled Infusion of Intravenous Anaesthesia
119.
P. Roubides, 538-541 The Fundamental Solution Method for Elliptic Boundary Value Problems
120.
J. Roumeliotis, 542-545 Axisymmetric Rigid Bodies in Creeping Flow
121.
546-549 N. Russo, T. Marino, E. Sicilia and M. Toscano, Past, Present and Future Challenge of Density Functional Theory Based in Molecular Sciences
122.
550-554 T. Rusu and M. Pinteala, V. Bulacovschi, Artzjkial Intelligence Methods Used in the Investigation of Polymers Properties
123.
555-556 D. Sakas and T. E. Simos Symmetric Multistep Methods with Minimal Phase-Lag for the approximate Solution of Orbital Problems
124.
J. K. Sakellaris, 557-560 Finite Element Analysis for Weakly Coupled Magneto - ThermoMechanical Phenomena in Shell Structures
125.
S. Sanchez and R. Criado, C. Vega, 561-566 A Generator of Pseudo-Random Numbers Sequences with Maximum Period
126.
P. Sasavat, N. Gindy, G. F. Xie and A. T. Bozdana 567-570 Near Force-Balanced Cutting: Key to Increase Productivity in Machining
127.
L. P. Schulz, 571-584 Symmetry Formation Principles of the Chemical Computer Software
128.
A. Sharma, 585-586 The Generalised Mass-Energy Equation ΔE = Ac²ΔM; Its Mathematical Justification and Application in General Physics and Cosmology
129.
Shenghua Shi and Atsuo Kuki, 587-592 A Simple Approach to a Multi-Objective Design with Constraints in Compound Selectionfor Drug Discovery
130.
S. V. Shepel, S. Paolucci, 593-597 Finite Element Level Set Formulationsfor Modelling Multiphase Flows
131.
K. Sivagurunathan, P. Chountas, E. El-Darzi, 598-603 Representation & Modelling of Electronic Patient Records
132.
Y. V. Skorov, B. J. R. Davidsson, G. N. Markeleov, 604-610 Consistent Kinetic Model of Innermost Cometary Atmosphere and Boundary Layers of Cometary Nucleus
133.
P. Spyridonos, P. Petalas, D. Glotsos, G. Nikiforidis, D. Cavouras, P. Ravazoula, 611-614 Comparative Evaluation of Support Vector Machines and Probabilistic Neural Networkrs in SuperJicial Bladder Cancer Classification
134.
J. P. Suarez and P. Abad, A. Plaza, M. A. Padron, 615-618 ComputationalAspects of the Refinement of 3 0 Complex Meshes
135.
Chen-Yin Suen, 619-621 The Impact of Graphics Calculator on Mathematics Education in Asia
136.
A. J. Thakkar, 622-625 Density Functionals for Moments of the Electron Momentum Distribution
137.
P. Theocharakis, I. Kalatzis and N. Piliouras, N. Dimitropoulos, E. Ventouras and D. Cavouras, 626-630 Relationship Between Carotid Plaque Composition and Embolization Risk Assessed by Computer Processing of Ultrasound Images
138.
E. S. Tentis, D. P. Margaris, D. G. Papanikas, 631-633 Transient Simulation of Large Scale Gas Transmission Networks Using an Adaptive Method of Lines
139.
D. Tomtsis, V. Kodogiannis, E. Wadge, 634-638 Optical pH Measurement Using Chromatic Modulation
140.
S. Tsantis, I. Kalatzis, N. Piliouras, D. Cavouras, N. Dimitropoulos, G. Nikiforidis, 639-642 Computer-Aided Characterization of Thyroid Nodules by Image Analysis Methods
141.
S. Tsantis, D. Cavouras, N. Dimitropoulos, G. Nikiforidis, 643-646 Denoising Sonographic Images of Thyroid Nodules Via Singularity Detection Employing the Wavelet Transform Modulus Maxima
142.
K. Tselios and T. E. Simos, 647-649 Runge-Kutta Methods with Minimal Dispersion and Dissipation for Problems Arisingfrom ComputationalAcoustics
143.
S. Tsitmidelis, M. V. Koutras, V. Zissimopoulos, 650 ReliabiIity Bounds Improvement Via Cut Set or Path Set Rearrangements
144.
J. M. Ugalde, 651-655 The Electron-Pair Density and the Modeling of the Spherically Averaged Exchange-Correlation Hole
145.
E. Varnvakopoulos, G. A. Evangelakis, D. G. Papageorgiou, 656-659 Solidjication of Pb PRE-Covered Cu(1 I I) Surface
146.
660-662 J. A. Vera and A. Vigueras, Stability of an Equilibrium Solution for a Gyrostat About an Oscillating Point
147.
G. D. Verros, 663-666 Computer Aided Estimation of Molecular Weight and Long Chain Branching Distribution in Free Radical Polymerization
148.
J. Vigo-Aguiar, H. Ramos, 667-669 VS-VO Numerov Method for the Numerical Solution of the Schrödinger Equation
149.
670-672 M. De’ Michieli Vitturi, F. Beux, Nonlinear Pressure and Temperature Waves Propagation in Fluid Saturated Rock
150.
673-677 E. Wadge, V. Kodogiannis, D. Tomtsis, Neuro-Fuzzy Ellipsoid Basis Function Multiple Classifier for Diagnostic of Urinary Tract Infections
151.
G. Wei and N. Mousseau, P. Derreurnaux, 678-681 Protein Folding Simulations Using the Activation-Relaxation Technique
152.
S. Wilson, 682-686 On the Systematic Construction of Molecular Basis Sets
153.
687-691 Z. Xiong and N. C. Bacalis, Generalization of Laguerre Orbitals Toward an Accurate, Concise and Practical Analytic Atomic Wave Function
154.
S. Zimeras, F. Georgiakodis, 692-694 Bayesian Models for Medical Image Biology Using Monte Carlo Markov Chains Techniques
COMPONENTS FOR TIME SERIES RECEIVER CLOCK OFFSET IN GPS SOLUTIONS
P. ABAD, University of Las Palmas de Gran Canaria, Department of Cartography and Graphic Engineering, Las Palmas de Gran Canaria, 35017, SPAIN. E-mail:
[email protected]
In the equation systems generated from GPS satellite observations, when the positional parameters are held fixed or constrained, the receiver clock solutions are significantly affected, and this variable absorbs the uncontrolled effects. In this work, viewing the receiver clock solutions as a time series, we analyse their components in order to detect periodicity in the uncontrolled effects, and we build new mathematical models to control them.
1. Motivation and Introduction
A great variety of corrections contribute to the quality of the results when using GPS observations [1]. All these effects must be considered for positioning, even for pseudorange positioning at the metre precision level. The effects (first-kind effects) are: special and general relativity, instrumental delays, atmospheric delays, satellite and receiver clock offsets, and the multipath effect. When we talk about positions with a precision of a few centimetres, in the correction model for Precise Point Positioning (PPP) it is important to take into account some second-kind effects that may not have been considered in other processing modes. These effects are the satellite antenna offsets and phase wind-up correction (satellite attitude effects), Earth orientation parameters (EOP), solid Earth tides and ocean loading (site displacement effects), and, last, the precise ephemeris [2].
2. Correction models to solve the equation systems
The simultaneous observation of four or more satellites of known position from an unknown place allows us to build an equation system using the distance formula. There are two kinds of equations depending on the satellite signal: phase (Φ) equations and pseudo-range (P) equations. The GPS observation equation, following a mechanistic model (see for example [3]), can be written as follows:
$$\Phi_r^s(t) = \rho_r^s(t) - c\,dT^s(t) + c\,dt_r(t) + N\lambda - E_{Iono} + E_{Tropo} + E_{Rela} + E_{Inst} + E_{Mult} + E_{Ant} + E_{Wind} + E_{Eop} + E_{Etide} + E_{Ocean} + \varepsilon \tag{1}$$

$$P_r^s(t) = \rho_r^s(t) - c\,dT^s(t) + c\,dt_r(t) + E_{Iono} + E_{Tropo} + E_{Rela} + E_{Inst} + E_{Mult} + E_{Ant} + E_{Wind} + E_{Eop} + E_{Etide} + E_{Ocean} + \varepsilon \tag{2}$$
where $\rho_r^s$ is the geometric (pseudo-)range, $dt_r(t)$ is the receiver clock offset from GPS time, $dT^s(t)$ is the satellite clock offset from GPS time, $c$ is the vacuum speed of light, $\lambda$ is the wavelength, $N$ is the carrier phase ambiguity, the $E$ terms are the effects mentioned above, and $\varepsilon$ is the random error, distributed as $N(0,\sigma^2)$. Usually the first-kind effects are removed, while the second-kind effects are normally omitted. The simplified mathematical model can be written:
$$\Phi_r^s(t) = \rho_r^s(t) - c\,dT^s(t) + c\,dt_r(t) + N\lambda + \varepsilon, \qquad P_r^s(t) = \rho_r^s(t) - c\,dT^s(t) + c\,dt_r(t) + \varepsilon.$$

The least squares solution with a priori weighted constraints $P$ on the parameters is given by $\delta = -(P + A^T P A)^{-1} A^T P W$, so that the estimated parameters are

$$\hat{X} = X_0 + \delta.$$

When the positional parameters are held fixed or constrained, the clock solutions are significantly affected, and this variable absorbs the uncontrolled effects.
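A minimal numerical sketch of the constrained least-squares step above, assuming a small synthetic design matrix A, observation weights P, a priori parameter weights P0 and misclosure vector W (all names and sizes are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative sizes: m observations (satellites), n parameters
# (e.g. 3 position components + 1 receiver clock offset).
m, n = 8, 4
rng = np.random.default_rng(0)

A = rng.normal(size=(m, n))          # design (Jacobian) matrix
P = np.eye(m)                        # observation weight matrix
W = rng.normal(scale=0.01, size=m)   # misclosure vector (observed - computed)

# A priori weights on the parameters: large weights effectively "fix" the
# positional parameters, a small weight leaves the clock offset free.
P0 = np.diag([1e6, 1e6, 1e6, 1e-6])

# Constrained least-squares correction and updated parameters, following
# delta = -(P0 + A^T P A)^(-1) A^T P W,  X_hat = X0 + delta.
delta = -np.linalg.solve(P0 + A.T @ P @ A, A.T @ P @ W)
X0 = np.zeros(n)
X_hat = X0 + delta
print("parameter corrections:", delta)
```

With the positional weights made large, any unmodelled effect in W is absorbed almost entirely by the loosely constrained clock parameter, which is the behaviour the paper analyses.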
3. Conclusions
The different components are quantified and classified by order of magnitude. Depending on their magnitude, these components will be taken into account in the mathematical correction model.
References
1. Global Positioning System, Papers published in Navigation, vol. 1, 1980.
2. J. Kouba, P. Héroux, Precise Point Positioning Using IGS Orbit and Clock Products, Geodetic Survey Division, Ottawa, Ontario, September 2000.
3. G. Strang and K. Borre, Linear Algebra, Geodesy, and GPS, Wellesley-Cambridge Press, 1997.
MAGNETOSTATIC ANALYSIS BY EMPLOYING ABSORBING BOUNDARY CONDITION
E. AFJEI AND M. H. ARBAB, Dept. of Electrical & Computer Engineering, Shahid Beheshti University. E-mail: [email protected]
The need for an absorbing boundary condition and coordinate stretching arises when one wishes to simulate the extension to infinity on a finite computational domain. This paper develops the proper absorbing boundary operators for an elliptic partial differential equation arising from a magnetostatic problem. It then uses a finite difference technique with the derived absorbing boundary condition in order to investigate its effect on the solution of the problem. In this procedure a finite distance is considered instead of the original infinite distance; by applying proper boundary conditions at that distance one can emulate open boundary conditions. Absorbing boundary conditions have been employed extensively for hyperbolic problems, in which the solutions move at finite speed and are limited in duration by the return of outwardly propagating features of the solution. Here we consider an elliptic magnetostatic problem, develop absorbing boundary condition formulas of different orders, and finally apply them to the problem using the finite difference technique.
Problem Description & Development
Consider the two-dimensional problem shown in Fig. 2, where Ω is a simply connected bounded domain, a current-carrying conductor, with a highly permeable material in the exterior region Ω_T.
The governing equations are

$$\nabla^2 A = -\mu J \quad \text{on } \Omega \tag{1}$$

and

$$\nabla^2 A = 0 \quad \text{on } \Omega_T. \tag{2}$$

The boundary conditions (3) are the interface conditions between Ω and Ω_T, together with $A_z = 0$ at the outer boundary at a very far distance. Due to symmetry in the θ direction, this problem can be reduced to a one-dimensional problem in the r direction. The known solution for the exterior region of this problem is written as
$$A_z(r) = \frac{N}{2\pi}\,\ln\frac{C_1}{r} \tag{4}$$

where $C_1$ is a constant determined by the far-distance radius at which the boundary condition $A_z = 0$ is applied. The asymptotic expansion of (4) results in a series in inverse powers of $r$ (5); expanding (5) gives

$$A_z(r) = a_0 + \frac{a_1}{r} + \frac{a_2}{r^2} + \frac{a_3}{r^3} + \cdots \tag{6}$$

$$v(r) = \frac{a_1}{r} + \frac{a_2}{r^2} + \frac{a_3}{r^3} + \cdots \tag{7}$$

It can be seen that

$$\frac{dv}{dr} = -\frac{a_1}{r^2} - \frac{2a_2}{r^3} - \frac{3a_3}{r^4} - \cdots \tag{8}$$

$$\frac{dv}{dr} + \frac{v}{r} = -\frac{1}{r^3}\left(a_2 + \frac{2a_3}{r} + \cdots\right), \tag{9}$$

which is a series expansion that varies as $1/r^3$ and has eliminated the coefficient $a_1$. We define a first-order operator for the absorbing boundary condition to be

$$B_1 = \left(\frac{d}{dr} + \frac{1}{r}\right). \tag{10}$$

In terms of $v$, $B_1(v) = \left(\dfrac{d}{dr} + \dfrac{1}{r}\right)v$. It can be seen that $v = A_z(r)$, since $\lim_{r\to\infty} A_z(r) = a_0$ and for our problem it is required that $\lim_{r\to\infty} A_z(r) = 0$, so that $a_0 = 0$. The first-order absorbing boundary operator as a function of the potential $A_z$ then becomes

$$B_1(A_z) = \left(\frac{d}{dr} + \frac{1}{r}\right)A_z = 0,$$

which is valid to the order of $1/r^3$; higher-order absorbing boundary conditions can also be developed by the same procedure.
Numerical Analysis
Equations (1) and (2) are solved using two types of outer boundary condition, namely $A_z = 0$ and the absorbing condition, applied at different radii in the highly permeable material. In the analysis, the current is 5 A and the relative permeability is taken to be 6500. The radius of the current-carrying coil is 10 cm. Figs. 3, 4 and 5 show the magnetic vector potential and the magnetic field density vs. radius when the first-order absorbing boundary condition and also $A_z = 0$ are employed at different radii. The results for the vector potential show lower constant values when the absorbing boundary condition is applied, especially closer to the current-carrying conductor. At smaller radii the effect of the absorbing boundary is more pronounced. The magnetic field density is the same in all cases, since it involves the curl of $A_z$.
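A minimal one-dimensional finite-difference sketch of the comparison described above, assuming an axisymmetric model equation (1/r) d/dr(r dA/dr) = −μ₀J(r) with the current spread over the coil cross-section; the grid, radii and material handling are illustrative simplifications, not the authors' setup:

```python
import numpy as np

mu0 = 4e-7 * np.pi
r_coil, r_out, n = 0.10, 0.40, 400           # metres, number of grid points (made-up)
r = np.linspace(1e-3, r_out, n)
h = r[1] - r[0]
J = np.where(r <= r_coil, 5.0 / (np.pi * r_coil**2), 0.0)  # 5 A spread over the coil

M = np.zeros((n, n))
b = -mu0 * J.copy()
for i in range(1, n - 1):                    # interior: central differences for A'' + A'/r
    M[i, i - 1] = 1.0 / h**2 - 1.0 / (2 * h * r[i])
    M[i, i]     = -2.0 / h**2
    M[i, i + 1] = 1.0 / h**2 + 1.0 / (2 * h * r[i])
M[0, 0], M[0, 1], b[0] = -1.0, 1.0, 0.0      # inner boundary: dA/dr = 0

# Outer boundary: either A_z = 0 (plain truncation) or the first-order
# absorbing condition dA/dr + A/r = 0 coming from B1(A_z) = 0.
use_abc = True
if use_abc:
    M[-1, -2], M[-1, -1], b[-1] = -1.0 / h, 1.0 / h + 1.0 / r[-1], 0.0
else:
    M[-1, -1], b[-1] = 1.0, 0.0

Az = np.linalg.solve(M, b)
print("A_z at the coil edge:", Az[np.searchsorted(r, r_coil)])
```

Switching use_abc between True and False and shrinking r_out reproduces qualitatively the comparison in Figs. 3-5: the absorbing condition allows the outer boundary to be placed much closer to the coil without distorting the potential.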
Fig. 1. Non-uniform mesh.
Fig. 2. Bounded closed domain.
Fig. 3. A: magnetic vector potential vs. radius; B: magnetic field density vs. radius (BC at r = 2 cm).
Fig. 4. A: magnetic vector potential vs. radius; B: magnetic field density vs. radius (BC at r = 4 cm).
Fig. 5. A: magnetic vector potential vs. radius; B: magnetic field density vs. radius (BC at r = 8 cm).
STURM-LIOUVILLE PROBLEMS WITH EIGENDEPENDENT BOUNDARY AND TRANSMISSIONS CONDITIONS
Z. AKDOĞAN, M. DEMIRCI AND O. SH. MUKHTAROV, Gaziosmanpaşa University, Faculty of Science-Arts, Department of Mathematics, 60100 Tokat, Turkey. E-mail:
[email protected],
[email protected] and
[email protected]
It is well known that Sturmian theory is an important aid in solving many problems in mathematical physics. Therefore this theory is one of the most topical and extensively developing fields in the spectral analysis of boundary-value problems. Mostly, boundary-value problems consisting of ordinary differential equations with continuous coefficients and end-point boundary conditions have been investigated. The purpose of this work is to extend some fundamental spectral properties of regular Sturm-Liouville problems to a special kind of discontinuous boundary value problem, which consists of a Sturm-Liouville equation together with eigenvalue-dependent boundary and transmission conditions. Namely, we shall consider one discontinuous eigenvalue problem consisting of the Sturm-Liouville equation
$$\tau u := -u'' + q(x)u = \lambda u, \qquad x \in [a,c) \cup (c,b] \tag{1}$$

with boundary conditions

$$L_1(u) := \lambda\,(\alpha_1' u(a) - \alpha_2' u'(a)) - (\alpha_1 u(a) - \alpha_2 u'(a)) = 0 \tag{2}$$

$$L_2(u) := \lambda\,(\beta_1' u(b) - \beta_2' u'(b)) + (\beta_1 u(b) - \beta_2 u'(b)) = 0 \tag{3}$$

and transmission conditions

$$L_3(u) := u(c+0) - u(c-0) = 0 \tag{4}$$

$$L_4(u) := u'(c+0) - u'(c-0) + \lambda \delta_1 u(c) = 0 \tag{5}$$

where $\lambda$ is a complex eigenparameter, $q(x)$ is continuous in $[a,c)$ and $(c,b]$ and has finite limits $q(\pm c) := \lim_{x\to c^{\pm}} q(x)$; $\alpha_i, \alpha_i', \beta_i, \beta_i'$ $(i = 1,2)$ and $\delta_1$ are real numbers; we assume that $\delta_1 > 0$.
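A minimal shooting sketch for locating real eigenvalues of (1)-(5) numerically, under illustrative data (the choices of q, the interval and the coefficients are made-up, not taken from the paper): for each trial λ the equation is integrated from a to c with initial data satisfying (2), the jump conditions (4)-(5) are applied at c, the integration continues to b, and eigenvalues are roots of the residual of (3).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Illustrative data (not from the paper)
a, c, b = 0.0, 0.5, 1.0
q = lambda x: 0.0
al1, al2, al1p, al2p = 1.0, 0.0, 0.0, 1.0     # alpha_i, alpha_i'
be1, be2, be1p, be2p = 1.0, 0.0, 0.0, 1.0     # beta_i,  beta_i'
delta1 = 0.5

def residual(lam):
    rhs = lambda x, y: [y[1], (q(x) - lam) * y[0]]     # -u'' + q u = lam u
    y0 = [lam * al2p - al2, lam * al1p - al1]          # makes L1(u) = 0 automatically
    left = solve_ivp(rhs, (a, c), y0, rtol=1e-10, atol=1e-12)
    u, up = left.y[0, -1], left.y[1, -1]
    y1 = [u, up - lam * delta1 * u]                    # transmission conditions (4)-(5)
    right = solve_ivp(rhs, (c, b), y1, rtol=1e-10, atol=1e-12)
    ub, upb = right.y[0, -1], right.y[1, -1]
    return lam * (be1p * ub - be2p * upb) + (be1 * ub - be2 * upb)   # L2(u)

grid = np.linspace(0.1, 200.0, 400)                    # bracket sign changes, then refine
vals = [residual(l) for l in grid]
eigs = [brentq(residual, grid[i], grid[i + 1])
        for i in range(len(grid) - 1) if vals[i] * vals[i + 1] < 0]
print("first eigenvalues:", eigs[:5])
```

The discontinuity of u' at the inner point c, noted in the abstract, is produced here directly by the jump imposed through condition (5).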
In the Hilbert space $H = L_2[a,b] \oplus \mathbb{C}^3$ of vectors $F := (f(x), f_1, f_2, f_3)$ we define an equivalent inner product, with weights $\rho_1 = \alpha_1'\alpha_2 - \alpha_1\alpha_2' > 0$ and $\rho_2 = \beta_1'\beta_2 - \beta_1\beta_2' > 0$, and a linear operator $A$ in this space with domain of definition

$$D(A) := \Big\{ F = (f(x), f_1, f_2, f_3) :\ f \text{ absolutely continuous in } [a,b],\ f' \text{ absolutely continuous in } [a,c) \text{ and } (c,b],\ f'(\pm c) = \lim_{x\to c^{\pm}} f'(x),\ \tau f \in L_2[a,b],\ f_1 = N_a'(f),\ f_2 = N_b'(f),\ f_3 = M_c'(f) \Big\}$$

and action law

$$AF = \big(\tau f,\; N_a(f),\; N_b(f),\; M_c(f)\big),$$

where $N_a(f) = \alpha_1 f(a) - \alpha_2 f'(a)$, $N_a'(f) = \alpha_1' f(a) - \alpha_2' f'(a)$, $N_b(f) = \beta_1 f(b) - \beta_2 f'(b)$, $N_b'(f) = \beta_1' f(b) - \beta_2' f'(b)$, $M_c(f) = f'(c+0) - f'(c-0)$, $M_c'(f) = -\delta_1 f(c)$, so that the problem (1)-(5) can be considered as the eigenvalue problem for this operator. By using this Hilbert space formulation and our own special techniques we investigate some spectral properties of the considered eigenvalue problem (1)-(5). In particular we obtain asymptotic approximate formulas (6) for the eigenvalues and corresponding eigenfunctions, expressed in terms of the interval lengths $c-a$ and $b-c$. In contrast to previous works, the derivative of the eigenfunctions of our problem may have a discontinuity at the inner point $x = c$; such formulas were found, for example, for the case $\alpha_1' \neq 0$, $\alpha_2' \neq 0$.
This kind of eigenfunction may arise in spectral problems of the theory of heat and mass transfer, in diffraction problems and in a varied assortment of physical transfer problems. The main references are listed below.
References
1. C.T. Fulton, Two-point boundary value problems with eigenvalue parameter contained in the boundary conditions, Proc. Roy. Soc. Edin. 77A, 293-308 (1977).
2. O. Sh. Mukhtarov, M. Kandemir, Asymptotic behaviour of eigenvalues for the discontinuous boundary-value problem with functional-transmission conditions, Acta Mathematica Scientia 22 B(3), 335-345 (2002).
3. S. Yakubov and Y. Yakubov, Abel basis of root functions of regular boundary value problems, Math. Nachr. 197, 157-187 (1999).
COMPUTATIONAL METHOD FOR UNCONSTRAINED OPTIMIZATION FUNCTIONS WITH NOISE
TALAL M. ALKHAMIS, Department of Statistics and Operations Research, Kuwait University, P.O. Box 5969, Safat, Kuwait. E-mail: [email protected]
1. Guidelines
Unconstrained nonlinear optimization problems (UNOP) arise in science and engineering when the goal is to find a solution, expressed as a vector of variables, that minimizes or maximizes some function that acts as a measure of the merit of the solution. A standard UNOP can be defined as follows:
$$\min_{x} f(x)$$
where $x \in \mathbb{R}^n$ and $f: \mathbb{R}^n \to \mathbb{R}$. In the literature of nonlinear optimization, the Hooke and Jeeves (HJ) algorithm is widely used as a pattern search procedure to optimize nonlinear functions that are not necessarily continuous or differentiable. The HJ pattern search algorithm consists of one-variable-at-a-time exploratory moves along the individual coordinate directions in the neighborhood of a base-point solution to determine an appropriate direction of search (pattern). Following the exploratory search, a series of pattern moves are made to accelerate the search in the direction determined in the exploratory search. Exploratory searches and pattern moves are repeated until a termination criterion is met. Most work on HJ assumes implicitly that for each solution point x, the corresponding function value f(x) can be computed accurately. In many real-life applications the function f(x) either does not exist in closed form or is too complicated to be calculated analytically. An example where the function does not exist in closed form arises in the area of non-Markovian queuing networks: if any of the factors such as non-exponential service times, multiple classes of customers, limited buffer sizes or transient behavior is significant, then a system measure of performance such as waiting time in the system does not have a closed form. An example where the function is too complicated to be calculated analytically arises in engineering optimization, where the measure of performance is usually defined on the solution of complex systems of equations including implicit equations, ordinary differential equations and partial differential equations. In this work we consider the case where f(x) can only be evaluated via Monte Carlo simulation, i.e. f(x) is observed with a random error. It is assumed that each time we want to determine f(x), a value $\bar{f}(x)$ can be observed, which is obtained from f(x) by superposition of random noise:
$$\bar{f}(x) = f(x) + \eta_x,$$

where the $\eta_x$ are independent random variables with mean zero and variance $\sigma_x^2$. Thus our optimization problem becomes

$$\min_{x \in \mathbb{R}^n} f(x) = E[\bar{f}(x)]$$

and our objective is to seek the global optimal solution $x^* \in \mathbb{R}^n$, where

$$f(x^*) = E[\bar{f}(x^*)] \le E[\bar{f}(x)] = f(x) \qquad \forall\, x \in \mathbb{R}^n.$$
Steps of the Hooke and Jeeves Algorithm (Bazaraa et al. [1]); a formal statement for a minimization problem is given below.

Initialization step. Let d₁, d₂, ..., dₙ be the coordinate directions. Choose a scalar ε > 0 to be used for terminating the algorithm. Furthermore, choose an initial step size Δ ≥ ε and an acceleration factor α > 0. Choose a starting point x₁, let y₁ = x₁, let k = i = 1, and go to the main step.

Main Step
1.1 Evaluate the function f at point yᵢ and at point (yᵢ + Δdᵢ); if f(yᵢ + Δdᵢ) < f(yᵢ), set yᵢ₊₁ = yᵢ + Δdᵢ and go to step 2, else go to step 1.2.
1.2 Evaluate the function f at point (yᵢ − Δdᵢ); if f(yᵢ − Δdᵢ) < f(yᵢ), set yᵢ₊₁ = yᵢ − Δdᵢ and go to step 2, else set yᵢ₊₁ = yᵢ and go to step 2.
2. If i < n, set i = i + 1 and repeat step 1. Otherwise, go to step 3 if f(yₙ₊₁) < f(xₖ), and go to step 4 if f(yₙ₊₁) ≥ f(xₖ).
3. (Pattern move.) Let xₖ₊₁ = yₙ₊₁ and y₁ = xₖ₊₁ + α(xₖ₊₁ − xₖ), set k = k + 1, set i = 1, and go to step 1.
4. If Δ ≤ ε, stop; xₖ is the solution. Otherwise, replace Δ by Δ/2, let y₁ = xₖ, xₖ₊₁ = xₖ, set k = k + 1, set i = 1, and go to step 1.

As can be seen above, for the HJ algorithm to work it needs to evaluate the objective function value accurately. Here we consider the situation where the measure of performance of the objective function can only be evaluated via Monte Carlo simulation. Due to the nature of simulation, the outputs of the experiments are stochastic. This means that the value of the objective function obtained from the simulation experiments belongs to some probability distribution. Thus a modification of the HJ algorithm to handle stochastic noise is very important, to reduce the chances that the search technique is misguided by the noise. In the simulation literature, ranking and selection (RS) [2] procedures have often been recommended for solving optimization problems via simulation. RS procedures are statistical methods specifically developed to select the best system from a set of competing systems. Provided certain assumptions are met, these methods usually guarantee that the probability of correct selection will be at least some user-specified value. The statistical methods of RS are applicable when comparisons among a finite and typically small number of systems are required. In this paper we present a computational method that combines HJ and RS for solving unconstrained stochastic optimization problems. We present empirical results that illustrate the performance of the proposed computational method.
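A minimal sketch of the kind of method described above: Hooke-Jeeves coordinate search in which every comparison of noisy function values is made by averaging several independent replications (a crude stand-in for a formal ranking-and-selection procedure; the test function, noise level and replication count are illustrative, not from the paper).

```python
import numpy as np

rng = np.random.default_rng(1)

def f_true(x):                        # illustrative smooth test function
    return np.sum((x - 1.0) ** 2)

def f_noisy(x, reps=20):              # Monte Carlo estimate: mean of noisy replications
    return np.mean([f_true(x) + rng.normal(0.0, 0.5) for _ in range(reps)])

def hooke_jeeves_noisy(x1, delta=1.0, eps=1e-2, alpha=1.0, reps=20):
    n = len(x1)
    xk = np.array(x1, float)
    y = xk.copy()
    while delta > eps:
        z = y.copy()
        for i in range(n):            # exploratory moves along coordinate directions
            d = np.zeros(n); d[i] = delta
            fz = f_noisy(z, reps)
            for cand in (z + d, z - d):
                if f_noisy(cand, reps) < fz:
                    z = cand
                    break
        if f_noisy(z, reps) < f_noisy(xk, reps):
            y = z + alpha * (z - xk)  # pattern move from the improved point
            xk = z
        else:
            y = xk                    # failure: restart exploration at xk
            delta /= 2.0              # and halve the step size
    return xk

print(hooke_jeeves_noisy(np.zeros(3)))
```

Increasing the replication count plays the role of the statistical guarantee in a ranking-and-selection step: each accept/reject decision is made with a smaller probability of being misled by the noise.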
References
[1] Bazaraa, M. S., Sherali, H. D., Shetty, C. M., Nonlinear Programming: Theory and Algorithms (2nd Edition), John Wiley and Sons, New York, 1993.
[2] Law, A. M., Kelton, W. D., Simulation Modeling and Analysis (2nd Edition), McGraw-Hill, Inc., New York, 1991.
CONVERGENCE ANALYSIS FOR AN ITERATIVE METHOD FOR SOLVING NONLINEAR PARABOLIC SYSTEMS
MOHAMMED AL-REFAI, Department of Mathematics and Statistics, Jordan University of Science & Technology, P.O. Box 3030, Irbid 22110, JORDAN. E-mail: [email protected]
Nonlinear parabolic systems of partial differential equations are considered. In a recent work, we proposed a new iterative method based on the eigenfunction expansion to integrate these systems. In this paper, we prove the convergence of the method on bounded time intervals under a certain condition that is easier to satisfy. We then show that the solution obtained by the new method converges to the exact solution for a problem in combustion theory. Moreover, we determine the number of iterations needed to obtain a solution with a predetermined level of accuracy. It is expected that the convergence analysis can be used for similar time-dependent systems.
UNIFORM NUMERICAL METHOD FOR A QUASILINEAR SYSTEM WITH BOUNDARY LAYER
G. M. AMIRALIYEV, 100. Yil University, Faculty of Art and Science, Department of Mathematics, 65080 Van, TURKEY. E-mail: [email protected]. Keywords: Singular perturbation, Non-uniform mesh, Difference scheme, Uniform convergence
1. Introduction
This paper is concerned with the numerical solution of a singularly perturbed system of two coupled ordinary differential equations, of first and second order, with initial and boundary conditions, respectively. Thus we consider the initial boundary value problem (1.1)-(1.4), with

$$u_2(0) = A, \qquad u_2(1) = B,$$

where $0 < \varepsilon < 1$ is a small parameter, $u^*$, A, B are given constants, and $f_1(x,u_1,u_2)$, $f_2(x,u_1,u_2)$ are given smooth functions satisfying certain regularity conditions. These conditions will be specified whenever necessary. In addition we assume the conditions (1.5) and (1.6).
A finite difference scheme on a special non-uniform mesh, whose solution converges pointwise independently of the singular perturbation parameter, is constructed and analyzed. Numerical results supporting the theory are presented. Initial boundary value problems of the above type arise in the nonlinear vibration theory of solid mechanics 5,8, in fluid mechanics to study non-steady filtration in a porous medium 7, and in other physical models. The problem (1.1)-(1.4) exhibits a boundary layer of small thickness at the left end. It is well known that difference schemes on a uniform mesh are not suitable for nonlinear singularly perturbed problems, as a special fine mesh is required in the boundary layer region and a comparatively much coarser mesh elsewhere. Ideally, the mesh should be adapted to the features of the exact solution using an adaptive grid generation technique. This approach is now widely used for the numerical solution of differential equations with steep, continuous solutions (see, for more recent work, e.g. the monographs 4,6). In the present paper, we have derived a method based on finite elements with piecewise constant and piecewise linear basis functions and appropriate quadrature formulae with remainder terms in integral form. In the boundary layer, we introduce the non-uniform mesh, which is constructed by using the estimates of the derivatives of the exact solution and the analysis of the local truncation error. This technique is similar to those used in 1,2,3.
2. Analytical Results
Here we give useful asymptotic estimates of the exact solution of (1.1)-(1.4) that are needed in later sections.
Lemma 2.1. Suppose the conditions (1.5) and (1.6) are satisfied. Then
$$|u_1^{(k)}(x)| \le C\left\{1 + \varepsilon^{-k}\, e^{-a_1 x/\varepsilon}\right\}, \qquad 0 \le x \le 1,\ k = 0,1, \tag{2.1}$$

$$|u_2^{(k)}(x)| \le C, \qquad 0 \le x \le 1,\ k = 0,1,2, \tag{2.2}$$
provided that $|\partial f_k/\partial x|,\ |\partial f_k/\partial u_k|,\ |\partial f_k/\partial u_{3-k}| \le C$, $k = 1,2$, where C is a positive generic constant independent of ε (and also of the mesh parameter in our following discussion of the numerical solution).
3. Discretization and Non-uniform Mesh
Let $\omega_N$ be any non-uniform mesh on [0,1]:

$$\omega_N = \{0 < x_1 < \cdots < x_{N-1} < 1\},$$

and $\omega_N^+ = \omega_N \cup \{x_N = 1\}$, $\bar{\omega}_N = \omega_N \cup \{x_0 = 0, x_N = 1\}$. For each $i \ge 1$ we set the stepsize $h_i = x_i - x_{i-1}$. Before describing our numerical method, we introduce some notation for the mesh functions: for any function g,

$$g_{\bar{x},i} = \frac{g_i - g_{i-1}}{h_i}, \quad g_{x,i} = \frac{g_{i+1} - g_i}{h_{i+1}}, \quad g_{\hat{x},i} = \frac{g_{i+1} - g_i}{\hbar_i}, \quad \hbar_i = \frac{h_i + h_{i+1}}{2}, \quad g_{\bar{x}\hat{x},i} = \frac{g_{x,i} - g_{\bar{x},i}}{\hbar_i}.$$
We propose the following difference scheme for approximating (1.1)-(1.4):

$$\varepsilon\, y_{1,\bar{x},i} + f_1(x_i, y_{1,i}, y_{2,i}) = 0, \qquad x_i \in \omega_N^+, \tag{3.1}$$

together with the analogous discretization (3.2) of the second-order equation and the conditions (3.3)-(3.4) approximating (1.3)-(1.4).
Remark 3.1. Under the above assumptions on the data of problem (1.1)-(1.4), it is possible to prove an existence and uniqueness result for (3.1)-(3.4). In order for the difference scheme (3.1)-(3.4) to be ε-uniformly convergent, we will use a fitted form of $\omega_N$. This is a special non-uniform mesh which is condensed in the boundary layer. According to (2.1), $u_1(x)$ has a boundary layer at x = 0 of thickness of order $a_1^{-1}\varepsilon|\ln\varepsilon|$. The fitted special non-uniform mesh $\bar{\omega}_N$ on the interval [0,1] is formed by dividing the interval into two subintervals [0,σ] and [σ,1], where
$$\sigma = \min\left\{\tfrac{1}{2},\; a_1^{-1}\varepsilon\,|\ln\varepsilon|\right\}. \tag{3.5}$$

The interval [0,σ] is contained in the boundary layer; here we use a fine mesh. The interval [σ,1] contains the smooth part of the solution; there we use a coarse mesh. The corresponding mesh points are: if σ < 1/2,

$$x_i \in [0,\sigma]: \quad x_i = -a_1^{-1}\varepsilon \ln\left[1 - (1-\varepsilon)\frac{2i}{N}\right], \qquad i = 0,\ldots,N/2, \tag{3.6}$$

and if σ = 1/2,

$$x_i \in [0,\sigma]: \quad x_i = -a_1^{-1}\varepsilon \ln\left[1 - \left(1 - e^{-a_1/(2\varepsilon)}\right)\frac{2i}{N}\right], \qquad i = 0,\ldots,N/2; \tag{3.7}$$

$$x_i \in [\sigma,1]: \quad x_i = \sigma + (i - N/2)\,h, \qquad i = N/2+1,\ldots,N; \quad h = 2(1-\sigma)/N. \tag{3.8}$$

Here, without loss of generality, we assume that N is even. The special mesh $\bar{\omega}_N$ given by (3.6)-(3.8) we denote by $\tilde{\omega}_N$.

4. Uniform Convergence
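A small sketch of how a fitted mesh of the type (3.6)-(3.8) can be generated, assuming σ is taken as min{1/2, a₁⁻¹ε|ln ε|}; the exact form of (3.5) in the paper may differ, and the parameter values below are illustrative only.

```python
import numpy as np

def fitted_mesh(N, eps, a1=2.0):
    """Fitted (Bakhvalov-type) mesh on [0,1], condensed in the layer near x = 0.
    Assumes sigma = min(1/2, eps*|ln eps|/a1); N must be even."""
    sigma = min(0.5, eps * abs(np.log(eps)) / a1)
    i = np.arange(N // 2 + 1)
    if sigma < 0.5:
        # graded fine part: x_i = -(eps/a1) * ln(1 - (1-eps)*2i/N), so x_{N/2} = sigma
        x_fine = -(eps / a1) * np.log(1.0 - (1.0 - eps) * 2.0 * i / N)
    else:
        # eps not small: grade so that x_{N/2} = 1/2
        x_fine = -(eps / a1) * np.log(1.0 - (1.0 - np.exp(-a1 / (2 * eps))) * 2.0 * i / N)
    h = 2.0 * (1.0 - sigma) / N                 # uniform step outside the layer
    x_coarse = sigma + h * np.arange(1, N // 2 + 1)
    return np.concatenate([x_fine, x_coarse])

x = fitted_mesh(N=32, eps=1e-4)
print(x[:6], x[-3:])   # points cluster near 0, uniform spacing near 1
```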
Theorem 4.1. Let $\{u_1,u_2\}$ be the solution of (1.1)-(1.4) and $\{y_1,y_2\}$ be the solution of (3.1)-(3.4). Then the following ε-uniform error estimate holds:

$$\|y_k(\cdot) - u_k(\cdot)\|_{C(\tilde{\omega}_N)} \le C N^{-1}, \qquad k = 1,2.$$
5. Numerical Results
We consider the test problem (1.1)-(1.4) with the data

$$f_1(x,u_1,u_2) = 2u_1 - e^{-u_1} - \tanh(u_2) + x^2 - 1, \qquad x \in (0,1],$$

$$u^* = 1, \qquad A = 0, \qquad B = 0.$$
The numerical solution is obtained by using a quasilinearization technique, in which the nonlinear terms are linearized about the previous iterate $y^{(n-1)}$, $n = 1, 2, \ldots$, on the mesh $\tilde{\omega}_N$. We take $a_1 = 2$ in (3.5). The initial guess is chosen as $y^{(0)} = 1 - x$, $x \in \tilde{\omega}_N$.
A numerical method is ε-uniform of order $(p_1,p_2)$ on a mesh $\tilde{\omega}_N$ if the errors in the two solution components are bounded by $C N^{-p_1}$ and $C N^{-p_2}$, where the $p_k$ are constants independent of ε and N. Approximations to $p_k$, the ε-uniform rates of convergence, are determined using the double mesh method as follows:

$$p_k = \ln\!\big(r_k^{N} / r_k^{2N}\big) / \ln 2, \qquad k = 1,2,$$

where $r_k^{N}$ denotes the double-mesh error of the k-th component on the mesh with N intervals.
The resulting errors r y and r? and the corresponding numbers pl and p2 after five iterations are listed in following Table 1 and Table 2. Table 1
N = 32
N=16
&
7'1
lo-' loW4
0.0090035 0.0091113 0.0091123 loW8I 0.0091124 I
I
I
Pl
T2
TI
7'2
0.0048555 0.0049131 0.0049137 0.0049137
0.0001242 0.0000458 0.0000466 0.0000467 I
0.0000294 0.0000112 0.0000116 0.0000117 1
I I
I
1
P2
0.8908734 0.8910329 0.8910372 0.8910277 I
I
2.0787771 2.0318489 2.0064673 1.9997697
Table 2
N
&
lo-'
1
I
N
= 32 7'2
7'1
7'2
0.0048555 0.0049131 0.0049137 0.0049137
0.0022525 0.0025550 0.0025553 0.0025553
0.0000590 0.0000113 0.0000118 0.0000118
0.0000144 0.0000027 0.0000029 0.0000029
1
I
PI
= 64
7'1
I
I
P2
0.9433048 0.9433007 0.9433411 I 0.9433075
I
2.0346461 2.0652915 2.0146840 2.0164671
1
References
1. G. M. Amiraliyev, 1988, On the numerical solution of the system of Boussinesq with boundary layers, USSR Modelling in Mechanics, 3(5), 3-14. (Russian)
2. I. P. Boglaev, 1984, Approximate solution of a nonlinear boundary value problem with a small parameter for the highest-order derivative, USSR Comput. Math. and Math. Physics, 25, 30-39. (Russian)
3. I. P. Boglaev, V. V. Sirotkin, 1993, The solution of singularly perturbed problems via domain decomposition, Computers Math. Appl., 25(9), 31-42.
4. P. A. Farrell, A. F. Hegarty, J. J. H. Miller, E. O'Riordan, G. I. Shishkin, 2000, Robust Computational Techniques for Boundary Layers, CRC Press, Boca Raton.
5. I. F. Morozov, 1967, The nonlinear vibrations of thin plates taking account of inertia of rotation, DAN USSR 176(3). (Russian)
6. H. G. Roos, M. Stynes, L. Tobiska, 1996, Numerical Methods for Singularly Perturbed Differential Equations, Springer, Berlin.
7. M. D. Rosenberg, 1952, On non-steady filtration of partially gaseous fluid in a porous medium, Izv. AN USSR, Tech., 10. (Russian)
8. G. M. Vorovich, 1977, On some direct methods in the theory of nonlinear vibrations of sloping shells, Izv. AN USSR, Tech., 21. (Russian)
A FAMILY OF OPTIMIZED RUNGE-KUTTA METHODS WITH FIVE STAGES AND FOURTH ORDER FOR IVPS WITH OSCILLATING SOLUTIONS
Z.A. ANASTASSI AND T.E. SIMOS *†, Department of Computer Science and Technology, Faculty of Sciences and Technology, University of Peloponnese, GR-22100 Tripolis, GREECE
In this paper we present a family of explicit Runge-Kutta methods of 4th algebraic order with 5 stages for the efficient solution of problems with oscillating solutions. The main purpose is to achieve better results by minimizing the dispersion and dissipation of the method, while maintaining constant algebraic order. In order to produce the equations necessary for the 4th algebraic order, we use tree theory. Tree theory offers a convenient way of generating these equations, as well as the equations of every algebraic order; some of its fundamental definitions and theorems will be presented. High algebraic order is essential for every Runge-Kutta method since it increases the order of the principal error, thus increasing the rate at which the error reduces as the step-length reduces. However, further increasing the algebraic order costs in terms of the number of coefficients needed to satisfy the equations, while the benefit is not that significant. Instead we can keep these coefficients free for increasing the order of properties such as dispersion and dissipation. Although dispersion (or phase-lag) was introduced for cyclic orbits,
*Active Member of the European Academy of Sciences and Arts. †Corresponding author. Please use the following address for all correspondence: Dr. T.E. Simos, 26 Menelaou Street, Amfithea - Paleon Faliron, GR-17564 Athens, GREECE, Tel: 0030 210 94 20 091, E-mail:
[email protected]
Runge-Kutta methods with high phase-lag order are more efficient in many other problem types than methods with lower phase-lag order and higher algebraic order with the same number of stages. They are even more effective in problems with oscillating solutions. We will try to maximize the order of both dispersion and dissipation, as well as equalizing them to zero. In the first case we produce methods with constant coefficients, in contrast to the second case where the methods have variable coefficients. Methods with variable coefficients are usually more effective, thanks to the infinite order of either dispersion or dissipation, especially when the solution is a linear combination of trigonometric functions. However they depend on the problem's frequency and the step-length. Fortunately there are ways of determining the dominant frequency of problems with oscillating solutions. One of them is to give the frequency a value close to the spectral radius of the coefficient of the dependent variable in the general form of the problem, although the methods are often better even when we do not know a good approximation of the frequency. The methods are compared with each other, as well as with the most efficient among already known methods, on various problems from the literature. The purpose of the comparison is to show the importance of dispersion and dissipation in oscillating problems and also to show the efficiency of the produced methods.
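As an illustration of the quantities being optimized, the sketch below computes the dispersion (phase-lag) and dissipation of an explicit Runge-Kutta method from its Butcher tableau on the standard test equation y' = iωy; the classical four-stage RK4 tableau is used only as a familiar example, not as one of the methods of this paper.

```python
import numpy as np

def stability_function(A, b, z):
    """R(z) = 1 + z b^T (I - z A)^(-1) e for a Runge-Kutta tableau (A, b)."""
    s = len(b)
    e = np.ones(s)
    return 1.0 + z * (b @ np.linalg.solve(np.eye(s) - z * A, e))

def dispersion_dissipation(A, b, v):
    """Phase-lag phi(v) = v - arg R(iv) and dissipation d(v) = 1 - |R(iv)|."""
    R = stability_function(np.asarray(A, complex), np.asarray(b, complex), 1j * v)
    return v - np.angle(R), 1.0 - abs(R)

# classical RK4 tableau, used here purely as an illustration
A = [[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]]
b = [1/6, 1/3, 1/3, 1/6]
for v in (0.1, 0.5, 1.0):
    phase_lag, dissip = dispersion_dissipation(A, b, v)
    print(f"v = {v}: phase-lag = {phase_lag:.2e}, dissipation = {dissip:.2e}")
```

A method "optimized" in the sense of the abstract would make these two quantities vanish to a higher order in v (or exactly, for a given frequency) while keeping the 4th-order algebraic conditions satisfied.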
NONSTANDARD THETA-METHOD AND RELATED DISCRETE SCHEMES FOR THE REACTION-DIFFUSION EQUATION
R. ANGUELOV, P KAMA AND JM-S LUBUMA* Department of Mathematics and Applied Mathematics University of Pretoria Pretoria 0002 (South Africa) Let us consider the following initial-value problem for an autonomous system of n differential equations in n unknowns:
$$\frac{dy}{dt} = f(y), \qquad y(0) = y_0. \tag{1}$$

The θ-method

$$\frac{y_{k+1} - y_k}{\Delta t} = \theta f(y_{k+1}) + (1-\theta) f(y_k) \tag{2}$$

of order 1 or 2 (if θ = 1/2) is often used for the numerical solution of (1). However, unless θ = 1/2, this method is not elementary stable in the sense that its fixed-points do not display the linear stability properties of the fixed-points of the differential equation. Furthermore, in the particular case of linear constant coefficient stiff systems, the constraint 1/2 ≤ θ ≤ 1, which excludes the explicit forward Euler method, is essential for the method (2) to be A-stable. Our aim is to design a qualitatively stable scheme and to investigate its impact on the efficient numerical solution of the one-dimensional reaction-diffusion equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + r(u). \tag{3}$$
To achieve the above for (1), we use two strategies from Mickens' nonstandard finite difference approach ³. Firstly, assuming that Eq. (1) has a (nonzero) finite number of fixed-points ȳ, all being hyperbolic, we require
*Corresponding author: e-mail: [email protected]; fax: +27-12-4203893; phone: +27-12-4202222
the properties of the fixed-points to be directly reflected in the new version of (2). The properties will be captured by a fixed number

$$q \ge \max\{|\lambda|\},$$

where λ traces all the eigenvalues of the Jacobian matrices $J(f)(\bar{y})$ of f at all the fixed-points. Notice that the choice of the number q is not so critical if the system is non-stiff. In practice, one may take $q = \max\{\|J(f)(\bar{y})\|_\infty;\ \forall \bar{y}\ \text{fixed-points of (1)}\}$, where $\|\cdot\|_\infty$ is the matrix norm associated with the supremum norm on $\mathbb{R}^n$. Secondly, we renormalize the denominator of the discrete derivative in (2) through a real-valued function φ, defined for z > 0, such that
$$\phi(z) = z + O(z^2) \ \text{as} \ z \to 0 \quad \text{and} \quad 0 < \phi(z) < 1. \tag{4}$$
(A typical example is φ(z) = 1 − exp(−z).) Our non-standard θ-method for (1) is
$$\frac{y_{k+1} - y_k}{\phi(q\,\Delta t)/q} = \theta f(y_{k+1}) + (1-\theta) f(y_k). \tag{5}$$
We show that (5) has the same order as the standard method (2). In particular, when θ = 0 (forward Euler), we obtain explicit error estimates. We prove a result on the elementary stability of the new method, irrespective of the value of the parameter θ ∈ [0,1]. We also establish some absolute elementary stability properties pertinent to stiffness. As for the correlation of the scheme (5) with the numerical solution of the partial differential equation (3), we observe that the latter has the stationary equation
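A minimal numerical sketch of a nonstandard θ-method of the type (5), applied to an illustrative scalar problem; the test equation, the value of q and the step size are made-up choices, not taken from the paper.

```python
import numpy as np

def nonstandard_theta(f, dfdy, y0, q, dt, nsteps, theta=0.5):
    """Nonstandard theta-method: the usual denominator dt is replaced by
    phi(q*dt)/q with phi(z) = 1 - exp(-z).  The implicit stage is solved
    by a few Newton iterations (scalar problem for simplicity)."""
    phi = (1.0 - np.exp(-q * dt)) / q          # renormalized denominator
    y, ys = y0, [y0]
    for _ in range(nsteps):
        ynew = y
        for _ in range(20):                    # Newton solve for the implicit part
            g = ynew - y - phi * (theta * f(ynew) + (1 - theta) * f(y))
            ynew -= g / (1.0 - phi * theta * dfdy(ynew))
        y = ynew
        ys.append(y)
    return np.array(ys)

# illustrative problem y' = f(y) with hyperbolic fixed points 0 and 1
f = lambda y: 5.0 * y * (1.0 - y)              # logistic right-hand side
dfdy = lambda y: 5.0 * (1.0 - 2.0 * y)
q = 5.0                                        # ~ max |f'| at the fixed points
print(nonstandard_theta(f, dfdy, y0=0.1, q=q, dt=0.5, nsteps=10, theta=1.0)[-1])
```

Even with the fairly large step used here, the iterates approach the stable fixed point monotonically, which is the elementary-stability behaviour the renormalized denominator is designed to preserve.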
$$\frac{d^2u}{dx^2} + r(u) = 0, \tag{6}$$

which, in terms of its first integral, is equivalent to the conservation of the energy $V(x,u)$ along the trajectory u, that is, $V(x,u(x)) = \text{constant}$. With φ̄ a function satisfying the same conditions as φ in (4), let us associate
a nonstandard finite difference scheme for (6), which is equivalent to the conservation of the discrete energy, i.e., $V_{\Delta x}(u_m) = V_{\Delta x}(u_{m-1})$, $\forall m \ge 1$. A general procedure for deriving energy preserving difference schemes is discussed in ¹, from where it follows that one could take

$$Y(u_{m+1}, u_m, u_{m-1}) = \beta u_m - \beta\,\frac{u_{m+1} + u_m + u_{m-1}}{3}\,u_m \tag{7}$$
in the case of the Fisher equation, i.e., r(u) = βu(1−u) in (3). On the other hand, the space-independent equation of (3) is similar to (1). Therefore we may, with appropriate adaptation, consider its non-standard θ analogue method (5). Assembling this scheme with (7), we obtain the following nonstandard finite difference scheme for (3):
e-hkY1,
uk+l m
7
uk+l m-l>
k + (1- 0 ) Y ( U km + l , uk,um-l).
(8)
It is shown that the scheme (8) preserves the principle of conservation of energy and the linear stability property of the equation (3) in the limit cases of time and space independent variables, respectively. It is also shown that, under a suitable functional relation between the step sizes At and A x , the scheme (8)has the positivity and boundedness property 0 5 uk 5 1 whenever the solution of (3) satisfies the physical property 0 5 u(x,t ) 5 1. Our further scheme for (3) is in the context of variational analysis and spectral methods '. If the space-domain in (3) is the interval (0,27r), then this problem admits the variational formulation: u(x,O)= uo(x),u(.,t)E V for any t > 0 and
0=
127r
*@(x)dx dt
+
1
27r
du(x,t)d@(x) dx dx dx
____-
27r
r(u(x,t))@(x)dx,
(9) for any test function Y E V where V is the subspace of the Sobolev space H 1 ( 0 , 2 7 r ) that consists of periodic functions with period 27~.For the numerical treatment of (9), we couple a spectral discretization in the x-variable (based on the Fourier system (ezmz)mEz)with the non-standard &method in the t-variable. If u",.) E VM := span{eZmz;Iml 5 M } , for M E N, denotes an approximation of u(., t k ) , we obtain the following fully-discrete spectral-nonstandard-theta method: for all Y E V M ,
-
127r
0r[uZ1(x)]+ (1 - 0)r[u~(x)]@(x)dx,
(10)
where q~ is given by the eigenvalues of the Sturm-Liouville problem associated with (3). (We restrict ourselves to a linear reaction r ( u ) ) . We prove a result on rates of convergence, in terms of powers of 1/M and At,for the scheme (10) and we provide numerical experiments.
27
References 1. R. Anguelov and J.M-S Lubuma, Contributions to the mathematics of the nonstandard finite difference method and applications, Numerical Methods for Partial Differential Equations 17 (2001), 518-543. 2. C. Canuto, M.Y. Hussani, A. Quarteroni and T.A. Zang, Spectral methods in fluid dynamics, Springer-Verlag, Berlin, 1988. 3. R.E. Mickens, Nonstandard finite difference models of differential equations, World Scientific, Singapore, 1994.
NUMERICAL INTEGRATION OF A SIZESTRUCTURED CELL POPULATION MODEL IN AN ENVIRONMENT OF CHANGING SUBSTRATE CONCENTRATION
0. ANGULO Dpto. Matema'tica Aplicada a la Te'cnica, Escuela Universitaria Polite'cnica, Universidad de Valladolid. C/ Fco. Mendirabal, 1, 4'7014 Valladolid, Spain E-mail:
[email protected]
c.
J. LOPEZ-MARCOS Dpto. Matema'tica Aplicada y Computacidn, Facultad de Ciencias, Universidad de Valladolid, C/ Prado de la Magdalena, s/n, 4'7005 Valladolid, Spain E-mail:
[email protected]
We formulate a new numerical method based on integration along characteristics curves t o solve a size-structured cell population model in an environment with an evolutionary resource concentration. Numerical simulations are also reported in order t o show that the approximations converge to the solution of the continuous problem. Also, we show their fine behaviour to study the dynamics of the considered problem.
In the present work, we study a nonlinear size-structured population model that describes the dynamics of a cell population in an environment with an evolutionary resource concentration (Ramkrishna 5). The model assumes that cells are placed on a continuously stirred tank reactor (CSTR), that no cells death occurs and that they grow in one stage, i.e. there is no cell-cycle structure in the mathematical model. This problem consists on a 28
29
partial integro-differential equation (with nonlocal terms) ut
+ ( g ( x ,S ( t ) )u)x = -r(x, +2
S ( t ) )4 5 , t ) - D u(x ,4
LX-
F(a,S ( t ) )P ( x ,a,S ( t ) )4
g ,
4 da,
(1)
< x < xmax, t > 0, that represents the balance law of the population; an initial condition for the state distribution function,
xmin
~ ( 2 , o )= 4(x),
xmin
Ix I Zmaz;
(2)
the appropiate boundary conditions,
g ( x : , S ( t ) ) u ( x ,=tO) ,
x=xmin,xmax,
O I t ,
(3)
that, from a biological point of view, can be thought as defining the boundary of the physiological state space (see 4, and the next initial value problem, d -s(t) = D (Sf - S ( t ) )dt
/
Xmaz
q(a,S ( t ) )u(a,t )da, 0
5 t ; S(0) = So,
Xmin
(4) that describes the dynamics of the abiotic environment. Equation (1) includes information about growth, division and birth at the single-cell level. The first term of the right-hand side of the equation means the lost of cells with physiological state x due to division leading to the birth of the smaller cells. The second term is the dilution term describing the rate by which cells exit the reactor. And the last term represents the rate of birth of cells with physiological state x originating from the division of all bigger cells. On the other hand, Eq. (4)includes information about nutrient uptake at the single-cell level, where the first term of the right-hand side of the equation describes the difference between the inlet and outlet rate from the reactor and the integral term represents the rate of loss of substrate leading to cell growth. Functions u(z,t ) and S ( t ) represent, respectively, the density distribution of the cell culture with size (or another structural property of the single cell) x and the concentration of the available substrate, at time t. The processes involved in Eqs. (1) and (4)are mathematically described by a set of functions known as intrinsic physiological state functions which, in general, depend on the physiological state of the cell x and the state of the abiotic environment S. The growth process is represented by the function g(x,S) which denotes the rate of increase in size of each cell. The cell division and birth ones are described by the division rate r ( x ,S) and the partition
30
probability density function P ( x ,y, S ) , respectively. The later function represents the partition of the mother cell with size y into two daughter cells with size x and y - x. It must satisfy the next conditions
IX-
P(z,y,S)dx=l,
P(x,y,S)=O,
s>y.
(5)
Xmin
Finally, the nutrient consumption one is characterized by the function q(x,S ) which represents the single-cell rate of consumption of the substrate. In order to conclude the description of the terms involved in the problem (1)-(4), the growth of the cells on a CSTR introduces two constants: D ,the dilution rate, and Sf,the concentration of the substance of the abiotic environment at the stream feed of the reactor. Notice that the coupling between (1) and (4) occurs through the dependence of all the physiological state functions on the concentration of the substrates and that this coupling is the source of nonlinearity in the model. In general, we cannot obtain explicitly the solution of the problem (1)(4) then we propose a numerical scheme to integrate it. This method integrates the model along the characteristic curves, these schemes are quite suitable because they use the dynamical behaviour of the population in order to develop the discretization of the model. Next, we describe the theoretical background of the discretization. We rewrite the partial differential equation by means of r*(z,S ) = r(x,S ) + gx(x,S ) D , and (1) can be transform into
+
'Iltk,
t ) + g(z, S ( t ) )%(Z, t ) = 2
lxmaz qU, s(t))
P ( x ,0,s(t)) u(U,t )do
s(t>) 'Il(x,t),
- r*b,
(6)
< x < Xmax, t > 0. Next, we denote by z(t;to,zo)the characteristic curve of the equation (1) that takes the value xo at the time instant to. This characteristic curve is the solution of the following initial value problem xmin
I % ( t o ;t o ,
(7) 20) = xo.
Then, we define the function
31
which satisfies the following initial value problem
Therefore, the partial differential equation (6) reduces to a family of ordinary differential equations. Recall, that we have to integrate three types of problems: the equation that defines the characteristic curves (7), those that obtain the solution of the problem (9) and the one that describe the dynamics of the abiotic environment (4). Note that problems (7), (9) and (4) are coupled. The numerical solution of Eqs (4), (7) and (9) has been performed using the authors experience in the integration of similar size-structured problems, Ref. 1, 2, 3. This type of models usually presents some properties that difficults the development of numerical methods. One of the most significant features is the shape of the growth funtion that usually takes the value zero in somewhere of the space domain. This aspect makes that characteristic curves methods presents some problems in the construction of the spatial grid. In order to diminish this difficulty we have chosen the development of a characteristic curves scheme on a regular grid. Other important feature is the existence of nonlocal terms that we have to approach with suitable quadrature rules, therefore, we need the values of the numerical solution at all the grid points in the level we are working. Besides, in order to integrate equations (4), (7) and (9) we have employed RungeKutta methods. These schemes are composed by different stages that needs unknown values at time levels that are not, in general, a node of the grid defined on the time interval. To circumvent this problem we compute this values using others computational techniques. On the other hand, we have to carry out the numerical integration of the three equations at the same time because all of them are coupled. We have carried out a significant numerical experimentation with different test problems in which are included different features: the use of different growth functions; batch growth or growth on a CSTR; unequal or equal partition (the integral term of (l),2
lICmaz
qu,S ) ~ ( zu,, S ) u(u,t )do,is
replaced by 4 I'(2 z, S ) u(2 z, t ) ) . This experimentation shows that the numerical approximations converge to the solution of the continuous problem.
32
Acknowledgements The authors were supported in part by project DGESIC-DGES BFM200201250 and project Junta de Castilla y Le6n and Uni6n Europea F.S.E. VA002/01.
References 1. 0. Angulo and J.C. L6pez-Marcos. Numerical schemes for size structured population equations. Math. Biosci., 157:16%188, 1999. 2. 0. Angulo and J.C. L6pez-Marcos. Numerical integration of nonlinear s i z e structured population equations. Ecol. Model., 133:3-14, 2000. 3. 0. Angulo and J.C. L6pez-Marcos. Numerical integration of autonomous and nonautonomous nonlinear sizestructured population models. Math. Biosci., 177-178:3%71, 2002. 4. N.V. Mantzaris, P. Daoutidis and F. Srienc. Numerical solution of multivariable cell popultion balance models: I. Finite difference methods Comp. Chem. Eng., 22:1411-1440, 2001. 5. D. Ramkrishna. Statistical models of cell populations. Adu. Biochem. Engng., 1l:l-47, 1979.
A DUALITY METHOD FOR THE COMPRESSIBLE REYNOLDS EQUATION. APPLICATION TO SIMULATION OF READ/WRITE PROCESS IN MAGNETIC STORAGE DEVICES
c. VAZQUEZ Dep. de Matemdticas, Universidade da Coruiia, Campus Elviiia s/n, 15071-A Coruca, Spaan
I. ARREGUI, J.J CENDAN AND
In magnetic storage devices, heads are designed so that a thin air film (air bearing) is generated between the head and the magnetic storage device in the readlwrite process. In this way, the head-device contact only takes place at the initial and final moment. Thus, once a velocity value is reached, the air film is build up so that the hydrodynamic load balances the external load. Hydrodynamic and elastohydrodynamic lubrication theories govern this kind of processes. In the case of hard disk devices the air hydrodynamic displacement is governed by a nonlinear compressible Reynolds equation and the elastic effects are neglected. In tapes and floppy disks (flexible storage media) the elastohydrodynamic model consists of a coupled system based on the compressible Reynolds equation for air pressure and a rod model for the tape deflection. More precisely, in the case of flexible media, the coupled problem providing the air pressure, p , in the thin film and the head-media gap, h, is posed in terms of equations:
dii ii(0) = G(L)= -(O) dY 33
=
dii -(L) = 0 , dY
(5)
34
where V denotes the tape velocity, X the particle mean-free length, p , the ambient pressure and p the air viscosity. Moreover, T , p, E and I r e p resent the tension, the density, Young modulus and the inertia moment, respectively. The ends of the tape are placed at y = 0 and y = 1 while the edges of the head are localized at y = 11 and y = XZ. The coupled feature results from the definition of the gap, = ii - 8, b being the head geometry, and from the pressure acting as a normal force on the flexible tape (2). The notation KC holds for the characteristic function for the set C. The mathematical analysis for the previous model is presented in and finite difference schemes are used in for numerical simulation. The analysis starts from the change of unknowns and variables:
x
= lOOy, p = @ / p a , u = lo%,
h = lo%, 6 = 1068,
which leads to system:
du u(0)= u(L)= -(O) dx
=
du -(L)
dx
=0
,
where a = (dp,)/(pV), P = (ep,)/(6 x 106pV), r] = E I / ( T - pV2) and K = 102p,/(T - p V 2 ) . From this new scaling, the hard-disk model can easily be recovered. More precisely, in this case only the boundary value problem with equation (6) for the air pressure is considered for E = 1/(6pV), a = 6X@,, ,B = 1 and h a given function, as stated in Jai '. In this paper we propose new techniques to deal with the main two difficulties associated to the nonlinear Reynolds equation (6): the convection dominated feature ( € a= 10W2,€0M for typical device values) and the nonlinear diffusion term. Thus, for the first one we propose the numerical method of characteristics adapted to stationary problems, and for the nonlinear term we propose a duality method for maximal monotone operators.
35
First, we consider the variational formulation for (6): Find p E V1 = {'p E H1(L1,L ~ ) / (=P1 on x = L1, x = L2} such that
+
L2
(ah2*d x
E
L1
+ ph3p*)d x 9 dx dx v'p E H&,
= 0, L2).
(11)
Next, by using the method of characteristics for steady problems as it has already been used in lubrication problems en ', we consider the approximation:
where x k ( x )= x the algorithm:
-
k , k being a positive small parameter. So, we propose
For n = 0,1,2,. . . , given p n , find pn+' E V1 such that:
Next, in order to apply Bermlidez-Moreno duality method for the nonlinear diffusive term, we introduce the maximal monotone operator: s2 if s 0 if s
2 0, 5 0.
so that the variational formulation to obtain pn+' remains:
Then, following 2 , we introduce the parameter w > 0 to obtain pn+' as the limit of the sequence {p?"}, which is computed by the algorithm: 0
0
Initialize p:+' Compute p?:
to pn. as the solution of the linear problem:
36
k’ 0
(hpn)o X‘ydx -
dx
Update the auxiliary variable O by
where f,” denotes de Yosida approximation of the operator f - w I with parameter X > 0, I being the identity function. For convergence purposes 2 , the choice X = ( 2 ~has) already ~ ~ been considered. In this way, from subdifferential calculus, we deduce:
fG(2w)(S)=
2ws - 2w2 (-0.5 -2ws
+ -)/,
if s 5 0, if s 2 0.
(14)
For the spatial discretization of the linear problem (13), piecewise linear finite elements have been used, combined with adequate Gauss formulae for numerical quadrature. In order to validate the good performance of the method, a first test with an additional source term associated to the analytical solution p ( x ) = -x2+x+1 has been done. Next, in order to compute solutions with steepest gradients, more realistic tests from mechanical viewpoint, several tests proposed by Jai for his special finite difference method have been reproduced with our method, leading to satisfactory results. Moreover, the optimal value for w predicted by Par&-Macias-Castro in terms of convergence has been experimentally observed. Moreover, in order to numerically solve the elastohydrodynamic coupled problem associated to flexible storage devices, we propose a fixed point iteration between the hydrodynamic model (6) and the elastic one (7), updating the gap with (8). In this setting we propose a cubic Hermite finite element discretization for (7), combined with different alternatives to solve the discrete obstacle problem due to the presence of the head. Finally, the numerical simulation results for several problems with real data are presented.
References 1. G. Bayada, M. Chambat, C. VBzquez, Characteristics method for the formulation and computation af a free boundary cavitation problem, Journal of Computational and Applied Mathematics., 98, 191-212(1998). 2. A. Bermfidez, C. Moreno, Duality methods for solving variational inequalities, Comp. Math. with Appl., 7 , 43-58(1981).
37
3. B. Bhushan, 'Pribology and Mechanics of Magnetic Storage Devices, Springer, New York, 1996. 4. A. Friedman, B. Hu, Head-media interaction in magnetic recording, Arch. Rational Mech. Anal., 140,79-lOl(1997). 5. M. Jai, Homogenization and two-scale convergence of the compressible Reynolds lubrication equation modelling the flying characteristics of a rough magnetic head over a rough rigid-disk surface, Math. Mod. Num. Anal., 29, 199-233(1995). 6. C. Par&, J. Macias, M. Castro, Duality methods with authomatic choice of parameters, Numer. Math., 89, 161-189(2001).
A N ENVIRONMENT FOR COMPUTING TOPOLOGICAL ENTROPY FOR SKEW-PRODUCT TRANSFORMATIONS
FRANCISCO BALIBREA Department of Mathematics, University of Murcia, 30100 - Murcia. SPAIN e-mail: balibreaC2um.e.s JUAN L. G. GUIRAO AND FERNANDO L. PELAYO * Departments of Mathematics 63 Computer Science, University of Castilla-La Manchu, 16071 - Cuenca. SPAIN e-mail:
[email protected], FernandoL.
[email protected] The main objective of this paper is t o introduce a specific software package for computing topological entropy of two-dimensional skew-product transformations. The Kneading Theory establishes the necessary background for this study is some particular cases. This is not only a hard but also a computationally expensive problem. The CAS Mathernatica provides a suitable environment for implementing the algorithm associated t o the rigorous computation of the topological entropy of the considered systems. Keywords: Symbolic Mathematical Computing, Kneading Theory, Skewproducts Maps, Topological Entropy, Dynamical Systems, Mathematica. Topics: Computational Sciences. Nonlinear dynamic systems models are ubiquitous throughout the sciences and engineering. Efficient computational analysis of specific dynamical systems is often a critical component for the successful completion of a research or design project. We live in a dynamical world. For many reasons, we want t o understand these dynamics: t o predict the weather, to prevent heart attacks and limit spread of infections diseases, t o control agricultural pests, to farsee the consequences on man’s activities on the global climate and the impact of climate changes that might result from these activities, t o design both more reliable and more efficient machines, and so on. *corresponding author 38
39
Questions about how processes evolve and change in time are really important and of broad usefulness. Indeed in various domains of science many situations can be, at least approximately, modelled in a very simple way via difference equations of the form x,+1=
f(x,),
n = 0 , 1 , 2 ,...
where f is a particular function which define a dynamical system. The notion of dynamical system is the mathematical formalization of the more general concept of a deterministic process. The future state of many Physical, Chemical, Biological, Engineering, Economical and even Social Systems can be predicted, to a certain extent, by knowing the present state, 20, and the law governing its evolution, f. Because of that, this work could be a useful object for researchers in these areas who use dynamical systems as model tools in their studies. We are not concerned with the problem of how to find such f. Generally, an experimental function f is approximated by an explicitly defined function depending on parameters that are subsequently determined by means of statistical methods. These dynamical systems are defined by rules of transformations which are needed in order to determine how points in a state space evolve when time elapses. Time can either be discrete or continuous. The traces of points as they move in discrete or continuous time are called trajectories. Dynamical systems theory seeks a comprehensive description of the geometric structures arising from these trajectories. In most of cases such trajectories are difficult to describe, even is not possible, when it occurs for many of them, the system behaves in a very complicated way. The scenario: Let I = [0,1] be the compact unit interval of the real line. We consider skew-product transformations on the unit square, that is, surjective continuous maps from I 2 into itself of the form F : (x,y) -+ (f(x),g(x, y)) ( F E Ca(12)).In this setting, the maps f and g are respectively called the basis and the fiber map of F . For every x E I , the maps gx defined by gx(y) = g(x, y) form a system of one-dimensional mappings depending continuously on x. More details on this kind of maps can be found in These types of systems have a lot of applications in pure mathematics (e.g. in the study of geodesic flows on Riemannian surfaces of constant 16317,576.
40
negative curvature, strange attractors, certain polynomial endomorphisms of en,...)and in other sciences (e.g., all known examples of systems in engineering, physic and economic with strange nonchaotic attractors having the skew-product form, see for instance 23). One of the most important tools for measure the complexity of a dynamical system is the concept of the topological entropy introduced by Bowen in 1971, '. When the map F has zero topological entropy ( h ( F )= 0), this means that the behaviour of the system is not difficult from a dynamical point of view. In some sense, in the previous case we can say that the system is not chaotic. Whereas, positive entropy means complicated dynamic. In this setting, two interesting problems can be stated, on one hand how to characterize maps with zero entropy, i.e., maps with a simple dynamics 15, and on the other hand how to compute explicitly the positive entropy. This paper deals with the second problem for some types of maps on C A( I 2 ) . Firstly, we remark that the mathematical problem of computing the topological entropy for a map F E Ca(12)remains open and far from being solve. In 'l, is developed a kneading theory which solves the problem under some restrictions. Let F E C A ( I ~be ) a map holding the following properties:
4.5
-
b
1-
(1) the basis map f is uni-modal with critical point c, see figure 1 for m=1, (2) for some integer p , the critical point cis pperiodic for f (i.e., fp(c) = c and fj(c) # c for 0 5 j < p where f k denotes the composition of f with itself k times),
41
(3) for some integer m, the map (y) = g p where g p = g (zp-1,g ( ~ ~ - 2 ., ..,g ( X I ,g (50,y)) ...)) is m-modal with critical points (c1, c~,...,cm}, see Fig. 1, (4) for every i E 1,2, ..., m, there exists an integer qi such that the critical point ci is qi-periodic for g p .
+
Then, the topological entropy of F is h ( F ) = h(f) h(gp). The previous result is based on the construction of a symbolical dynamic associated to the critical points which generates a Markov partition of the unit square 12. Two transition matrix are obtained and by the sum of the logarithms of their spectral radius the exactly topological entropy is calculated. The main idea of the program is to compute the periodic order of the critical points of the basis map f , once this is done, function gp is defined and its critical points will be computed, in the same way the periodic order of the critical points of gp has to be computed. Although the location of critical points process can be not too hard (because of the existence of analytical -thanks to symbolic mathematical computing power of Mathematicclr and numerical methods), the computation of the periodic order of such points could become computationally expensive process, especially when the critical points have been obtained via numerical methods. Afterwards, a Markov partition for both functions, f and gp, is defined and their transition matrix will be computed. From the characteristic polynomials associated to these transitions matrix, is necessary to calculate their spectral radius and finally the polynomial entropy of the two-dimensional skew-product transformation is obtained by applying the main result presented in 21 . In order to accomplish our goal we have chosen the Computer Algebra System C.A.S. Mathematica, 35, because it is not only a mathematical assistant which allows us to make both symbolic and numerical analysis, but also, it provides us with a Functional Programming Language. These two characteristics give rise t o a desirable environment, to deal with any problems which need 0 0 0
to handle with data in a symbolic way to compute numerical analysis algorithms t o execute the above tasks by means of a functional language which is extremely next to the theoretical results that make possible this analysis
42
This C.A.S. has been widely used by the authors, e.g. looking for and identifying bifurcations 30, or studying stability of continuous systems 'Of Next, the result obtained from the program execution when the input is the below function F is shown:
F (x, y) = (f (x) ,g (zly)) = (1 - 1.76~',x - 0.823~') CRITICAL POINTS O F f: -0.7589 3-periodic - XI = -0.7589,~2 = -0.0135,~3= 0.9997
ASSOCIATED gp: - g p (y) = 0.9997 - 0.823(-0.0135
- 0.823(-0.7589
- 0.823~')')~
CRITICAL POINTS OF gp: -0.0018 5-periodic - yl = -0.0018,~2 = 0.8041,~3= -0.5795,~4 = 0.3396,~5=
0.6899 0
MARKOV PARTITION FOR THE PERIOD 15 ORBIT OF
0.8
-%
0.8
-
&
0.4 -
v, 0.2
-
0 - yl
-0.2
-
4.4
-
-0.6-
v3
,x, 4.8
I
4.6
, -0.4
, 4.2
x.
x2,
0
0.2
0.4
0.8
0.8
THE CHARACTERISTIC POLYNOMIALS: - PA, - PA,
( t )= 1 - t - t 2 ( t )= 1 - - t' -k t3 - t4
1
43
0
THE SPECTRAL RADIUS: M 1.6183 - A, = l / t , = - A, == l / t , M 1.5128
0
TOPOLOGICAL ENTROPY: - h(F)=h(Fp)= l ~ ~ ( A ) ~ l ~ ~ ( A , ~ A y ) ~ l ~ ~ ( X , ) +
0.8952
An interesting application: We consider Cournot maps on 12, i.e., continuous transformations of the form $(x,y ) = (q51(y),q5z(x)). There exists a very well-known economic production process called Cournot duopoly which is mathematically modelled by discrete dynamical systems based on Cournot maps. “There exist two companies producing a n identical good. I n each step of the process, the responses of the firms in terms of the production , are simultaneous and they depend o n the production of the rival firm in the last step”. In order t o have information about the economical situation previously described, these discrete models have been studied from different points of view with the aim of describing their dynamical behaviour, It has been proved that the topological entropy is the key for studying the complexity of this kind of systems. Specifically in lo, a topological characterization for it was given. This result is exactly the generalization for Cournot maps, on one hand of the one-dimensional Misiurewicz’s theorem (see 22, ( 1 ) @ ( 2 ) ) and on other hand of certain results proved by Sharkovsky in the sixties (see 26, ( 1 ) @ 7712125.
*
(4)). By Sn(.),Per(.), h(.),UR(.), Rec(.) and AP(.) we respectively denote a stratification set, the set of periods of periodic points, the topological entropy and the sets of uniformly recurrent, recurrent and almost periodic points. Misiurewicz & Sharkovsky’s Theorem: Let 4 E C ( I ) . The following properties are equivalent: (3)
(1) h ( 4 ) = 0 , (2) the period of any periodic point is power of two, (3) W 4 ) = W + ) , (4) AP(4) = {x E I : lim q52”(x)= x}. n+m
44
Assume now, that 4(x,y) = ($l(y), 42(x)) is a Cournot map and we are interested in analyzing the complexity of the system modelled by 4, i.e, we want to calculate h(4). Consider the second iterated of 4, It can be observed that q52 is a triangular map. Assume that 42 holds the restrictions of our environment (this is a natural assumption since in the most of Cournot models the reaction functions $1 and 4 2 are multimodal with periodic critical points for having convex utility sets), thus our environment can compute h(42). Now, for a discrete dynamical system generated by a map $I of a compact metric space into itself, it is true that for every integer n 2 1, h($In)= n . h($I)(see '). Then, it can be applied to the case of the Cournot map, n = 2, and h(q52)= 2 . h(4) is obtained. Therefore, we are able to compute the topological entropy of the triangular map 42 and as a corollary we obtain an explicit calculation of the topological entropy of the Cournot map 4 as a half of the topological entropy of 42.
References 1. 2.
3.
4.
5.
6.
7.
8.
V. I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, Springer Verlag, 1983. F. Balibrea and J.C. Valverde, Bifurcations Under Non-degenerated Conditions of Higher Degree and a New Simple Proof of the Hopf Bifurcation Theorem, J. Math. Anal. Appl. 237, 93-105(1999). F. Balibrea and J.C. Valverde, Structural Stability Under Conditions of Non-Hyperbolicity, Comput. Math. Appl. 41,757-768(2000). F. Balibrea and J.C. Valverde, Extreme Degenerations for the Simplest Generic Bifurcations and New Transversally Conditions, Discrete and Continuous T i m e Dynamical Systems Special Volume added to Volume 6, 22-30(2000). F. Balibrea and J.L. Garcia and J.I. Muiioz, Description of w-limit sets of a triangular map on 12, Far East Journal of Dynamical Systems 3(1), 87-201(2001). F. Balibrea and J.L. Garcia and J.I. Muiioz, A Triangular map on I 2 whose w-limit Sets are all Compact Intervals of ( 0 ) x I , Discrete and Continuous Dynamical Systems 8(4),983-994(2002). G.I. Bischi and L. Gardini, Cycles and Bifurcations in Duopoly Games, Chaos, Solitions and Fractals 14,139-150(2001). L.S. Block and W.A. Coppel, Dynamics in One Dimension, Lecture Notes in Math., Springer, 1993
45
9. 10. 11. 12. 13. 14. 15.
16. 17. 18. 19. 20.
21. 22. 23. 24. 25. 26.
27. 28. 29.
30.
R. Bowen, Entropy for Group Endomorphism and Homogeneous Spaces, Transactions American Mathematical Society 153,401-414( 1971). J.S. Canovas and A. Linero, Topological Dynamics Classification of Duopoly Games, Chaos, Solitions and Fkactals 12,1259-1266(2001). J. Carr, Applications of Center Manifold Theory, Springer Verlag, 1981. R.A. Dana and L. Montrucchio, Dynamical Complexity in Duopoly Games, Journal of Economical Theory 40, 40-56( 1986). J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer Verlag, 1983. J. Hale and H. KO&, Dynamics and Bifurcations, Springer Verlag, 1991. Z. KoEan, The Problem of Classification of Triangular maps with Zero Topological Entropy, Annales Mathematicae Silesianae 13, 181192( 1999). S.F. Kolyada, On Dynamics of Triangular Maps of the Square, Ergodic Theory and Dynamical Systems 12,749-768(1992). S.F. Kolyada and L. Snoha, On w-limit Sets of Triangular Maps, Real Analysis Exchange 18(1), 115-130(1992/93). Y .A. Kuznetsov, Elements of Applied Bifurcation Theory, Springer Verlag, 1995. J. Marsden and M. McCracken, Hopf Bifurcation and Its Applications, Springer Verlag, 1976. J.A. Martinez and F.L. Pelayo and J.J. Miralles and J.C. Valverde, Stability of Continuous System by Routh-Hurwitz and Mathematica, Proceedings of the International Conference on Computational and Mathematical Methods in Science and Engineering (CMMSE 2002), Alicante, Spain, 20-23 September, 2002. D. Mendes and J. Sousa Ramos, Kneading Theory for Triangular Maps, Submitted t o Discrete and Continuous Dynamical Systems, 2003. M. Misiurewicz and W. Szlenk, Entropy of Piecewise Monotone M a p pings, Studia Mathematica 67, 45-36( 1980). S.S. Negi and R. Ramaswamy, A Plethora of Strange non-chaotic Attractors, Pramana Journal of Physics 56( l), 47-56(2001). Z. Nitecki, Differentiable Dynamics, M.I.T. Press, 1971. T. Puu, Chaos in Duopoly Pricing, Chaos, Solitions and Fractals 1, 573-581( 1991). A.N. Sharkovsky and S.F. Kolyada and A.G. Sivak and V.V. Fedorenko, Dynamics of One-dimensional Maps, Dordrecht : Kluwer Academic Publishers, 1997). J. Sijbrand, Properties of center manifolds, 7kans. Amer. Math. SOC. 289,431-469(1985). J. Smital, On Functions and Functional Equations, Adam Hilger, 1988. J.C. Valverde, Teoria de Bifurcaciones Locales de S.D.D. Generalizada, Departamento de MatemBticas de la Universidad de Murcia, Master Thesis, 1998. J.C. Valverde and F.L. Pelayo and J.J. Miralles and J.A. Martinez,
46
31. 32. 33. 34. 35.
Symbolic Mathematical Computing of Bifurcations in Dynamical Systems, Proceedings of the International Conference on Computational and Mathematical Methods in Science and Engineering (CMMSE 2002), Alicante, Spain, 20-23 September, 2002. S. Wagon, Mathematica in action, Springer-Telos, 1998. D.C. Whitley, Discrete Dynamical Systenls in Dimensions One and Two, Bull. London Math. Soc. 15, 177-217(1983). S. Wigginns, Introduction to Applied Nonlinear Systems and Chaos, Springer-Verlag, 1990. S. Wolfram, Mathematica, a System for Doing Mathematics by Computer, Addison-Wesley, 1991. S. Wolfram, The Mathematica Book, Wolfram Media, 1996.
MAXIMUM LIKELIHOOD AND CONDITIONAL MAXIMUM LIKELIHOOD LEARNING ALGORITHMS FOR HIDDEN MARKOV MODELS WITH LABELED DATA-APPLICATION TO TRANSMEMBRANE PROTEIN TOPOLOGY PREDICTION P. G. BAGOS, TH. D. LIAKOPOULOS AND S. J. HAMODRAKAS Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens Panepistimiopolis,Athens 15701, Greece E-mail:
[email protected] Hidden Markov Models (HMMs) have been widely used in applications in computational biology, during the last few years. In this paper we are reviewing the main algorithms proposed in the literature for training and decoding a HMM with labeled sequences, in the context of the topology prediction of bacterial integral membrane proteins. We evaluate the Maximum Likelihood algorithms traditionally used for the training of a Hidden Markov Model, against the less commonly used Conditional Maximum Likelihoodbased algorithms and, after combining results previously obtained in the literature, we propose a new variant for Maximum Likelihood training. We compare the convergence rates of each algorithm showing the advantages and disadvantages of each method in the context of the problem at hand. Finally, we evaluate the predictive performance of each approach, using state of the art algorithms proposed for Hidden Markov Model decoding and mention the appropriateness of each one.
1. Introduction 1.1. Hidden Markov Models
Hidden Markov Models (HMMs) are probabilistic models suitable for a wide range of pattern recognition applications. Initially developed for speech recognition [I], during the last few years they became very popular in molecular biology for protein modeling and gene finding [ 2 ] . A Hidden Markov Model [2] is composed of a set of states. Two states k, I are connected by means of the transition probabilities akl, forming a lst order Markovian process. Assuming a protein sequence x of length L denoted as: x = x , , x , ).”) x , , where the xi’s are the 20 amino acids, we usually denote the “path” (i.e. the sequence of states) ending up to a particular position of the amino acid sequence (the sequence of symbols), by n. Each state k is associated with an emission 47
48
probability ek(xJ, which is the probability of a particular symbol xi to be emitted by that state. 1.2 Maximum Likelihood training
Traditionally, the parameters of a Hidden Markov Model are optimized according to the Maximum Likelihood criterion [ 11,
dM
= argmaxP(x e
18)
(1)
The total probability of a sequence given the model is computed by summation over all possible paths through the model: P ( x IS) = C P ( X , Z 18) 77
This quantity is calculated using a dynamic programming algorithm known as the forward algorithm [2], or alternatively by the similar backward algorithm. The logarithm of P(xl0) is the log-likelihood of the sequence. For numerical stability reasons the usual approach is to maximize the log-likelihood (or equivalently minimize the negative log-likelihood). A widely used algorithm for this task is the efficient Baum-Welch algorithm (also known as forwardbackward) [3], which is a special case of the Expectation-Maximization (EM) algorithm, proposed for Maximum Likelihood (ML) estimation for incomplete data [4]. The algorithm, updates iteratively the model parameters (emission and transition probabilities), by assigning to them their expectations, computed with the use of forward and backward algorithms. Convergence to at least a local maximum of the likelihood is guaranteed. Baldi and Chauvin [ 5 ] , were the first to propose a gradient descent method capable of the same task, which offers a number of advantages over the Baum-Welch algorithm, including smoothness and on-line training abilities. 2.
Labeled sequences and learning algorithms
2.1. Labeled Sequences
In molecular biology there is often a need to train models with a large number of parameters (states), in order to explicitly model different “classes” of data. Examples can be found in the topology prediction of transmembrane proteins [6] or in gene finding [7,8]. To accomplish that, one has to train separate submodels corresponding to the different “classes” (i.e. to train one model for the transmembrane regions, another for the cytoplasmic loops and a third one for the extracellular loops), and then combine them using the appropriate transitions. To overcome this complication, Krogh suggested the use of labeled sequences
49
[9]. Thus, each amino acid sequence x is accompanied by a sequence of labels y for each position i in the sequence: Y = Y , Y2 YL Krogh proposed a simple modified version of the forward and backward algorithms [9], incorporating the concept of labeled data. The likelihood to be maximized in such situations is the joint probability of the sequence and the labeling given the model. P( x, y I 0) = p (x,Y," I 0) = fJ (%Z1 0) 7
7...7
c x
c
Zen,
The simple idea behind this approach is that summation has to be done only over those paths I7, that are in agreement with the labels y. Consequently, one has to declare a new probability distribution, in addition to the transition and emission probabilities, the probability dk(c) of a state k having a label c. In almost all biological applications this probability is just a delta-function, since a particular state is not allowed to match more than one label. The expectations for the emission and transition probabilities are computed with the modified forward and backward algorithms [9]. Consequently, it is straightforward to derive the modified Baum-Welch algorithm. This is one of the methods that we consider in this study. 2.2. Discriminative training With the use of labeled sequences, Krogh also derived a learning algorithm that maximizes the probability of the labeling [9], instead of the probability of the sequences. This procedure is referred to as the Conditional Maximum Likelihood (CML) criterion and the parameters of the model are optimized so that:
The denominator in the right hand side of Eq. (4) is the likelihood computed allowing all paths in Eq. (2), whereas the numerator is the likelihood considering only paths consistent with the labels, Eq. (3). Once again, turning to logarithms is more convenient. The maximization procedure cannot be performed with the Baum-Welch algorithm and an incremental algorithm has been proposed instead [9]. In this work we employed a maximization procedure, using a gradientdescent method as proposed in [ lo].
50
2.3 Maximum Likelihood (ML) gradient descent method for labeled sequences Starting from the gradient-descentML learning algorithm proposed by Baldi and Chauvin [ 5 ] , and incorporating the idea of labeled sequences, we derived a similar ML gradient-descent learning algorithm suitable for labeled sequences. The derivatives of the negative logarithm of the likelihood in Eq. (3), (denoted by 1) with respect to the emission and transition probabilities can be expressed in terms of the forward and backward variables. For example, the derivative with respect to a transition probability a k l is:
-= --
(5)
8% a k l ’ whereas, the derivative with respect to an emission probability is:
ae - --E, ( b ) ( b ) - ‘k ( b ) The quantity Akl is the expected times a transition happens from state k to state 1 whereas &(b) is the expected times a symbol b is emitted from state k. These quantities [2], are calculated according to: aek
wherefk(i) and b k ( i ) are the forward and backward variables respectively, el(xi+l) and a k l are the emission and transition probabilities respectively and & ( Y i + / ) is the delta function indicating the agreement of the current state with the labeling. By calculating the derivatives of the log-likelihood with respect to a generic parameter 0, we proceed with gradient-descent and iteratively update these parameters according to:
where q is the learning rate. To avoid the risk of obtaining negative estimates, we used a proper parameter transformation and performed gradient-descent optimization on the new variables, as proposed in [lo]. For example, for the transition probabilities, we used the softmax transformation:
51
which yields the following update formula:
The gradients with respect to the new variables zkl can be expressed entirely in terms of the expected counts and the transition probabilities at the previous iteration.
Substituting now Eq. (1 1) into Eq. (lo), we get an expression entirely in terms of the model parameters and their expectations. The same holds for the emission probabilities. The procedure described above was also used for the CML training described in section 2.2, for minimizing the negative loglikelihood in Eq (4). Employing this gradient-based optimization method we were able to use some known tricks for escaping from local minima of the negative log-likelihood, such as the use of a momentum parameter. 3.
Transmembrane protein topology prediction
3.1. Data sets and topology of the model To evaluate the performance of the training techniques described above, we developed two different HMMs: one for the prediction of the transmembrane/Istrands and the topology of the bacterial /I-barrel outer membrane proteins [ 1l l , and another for the prediction of the transmembrane a-helices of the bacterial ahelical membrane proteins. Transmembrane protein topology prediction is a task of great interest in molecular biology since transmembrane proteins perform a wide variety of important biological roles, comprising approximately 25-30% of all fully sequenced genomes [121. On the other hand, only a small fraction of them is of known structure and thus there is the need to develop computational tools capable of recognizing membrane proteins when screening fully sequenced genomes. Both models are cyclic (Figure l), with states belonging in three classes, consisting of a total of 70 and 114 states for the outer membrane proteins and the a-helical membrane proteins, respectively. In the case of a-helical membrane proteins the three classes correspond to transmembrane (TM), cytoplasmic (IN) and periplasmic (OUT) segments, whereas for /I-barrel outer
52 membrane proteins to transmembrane (TM), periplasmic (IN) and extracellular (OUT) segments. Each model was constructed on the basis of expert knowledge and designed specifically for capturing the structural features of the corresponding class of proteins. For training we used two non-redundant sets of proteins with experimentally determined topologies. The outer membrane protein set consists of 14 proteins derived from the Protein Data Bank [13], whereas the a-helical membrane protein set, of 110 bacteria membrane proteins taken from the TMPDB 1141.
5gure 1. General topology of the models. Dashed rectangles represent the three different classes (IN, inner; TM, transmembrane; OUT, outer). The two models have major differences in the number of states and their connectivity within each class.
The training was performed in batch mode (off-line) with the use of the modified Baum-Welch algorithm, the gradient method described in section 2.3, and the gradient method for Conditional Maximum Likelihood proposed by Krogh [lo]. For the decoding process we used the well-known Viterbi algorithm, which finds the most probable-state path [1,2], and the N-best decoding algorithm [ 15,161, which aims at finding the most probable labeling".
All algorithms used in this work, were implemented by the authors using the JAVA programming language, on a dual Pentium 111 866 Mhz workstation under Linux. a
53
3.2. Results and Discussion From the results of this study, presented in Tables 1 and 2, the first obvious conclusion is somewhat expected, and is relevant to the performance of the prediction algorithms on the two distinct structural classes of transmembrane proteins. In particular, the overall performance of the prediction methods is significantly higher for ec-helical membrane proteins than for /?-barrel outer membrane proteins. This finding is due to the presence of a clearer pattern accounting for the transmembrane regions in the case of a-helical membrane proteins, which consists of a stretch of 15-30 consecutive highly hydrophobic residues. Table la. Measures of predictive performance for the ^-barrel outer membrane proteins. Gradient
Baum-Welch
CML
Prior Uniform Prior Uniform Prior Uniform Viterbi N-best Viterbi N-best Viterbi N-best Viterbi N-best Viterbi N-best Viterbi N-best 02
0.739 0.804 0.781 0.824 0.771 0.802 0.786 0.823 0.772 0.801 0.693
Ca
0.481 0.603 0.564 0.643 0.542 0.599 0.569 0.639 0.546 0.597 0.393 0.559
TP
121
163
146
FP
1
4
1
FN
93
51
68
176 5 38
139
163
154
175
1
4
5
5
139 1
75
51
60
39
75
0.78
161
95
144
2 53
2
3
119
70
Table 2". Measures of predictive performance for the a-helical membrane proteins. Gradient
Baum-Welch Prior
Prior
Uniform
CML
Uniform
Prior
Uniform
Viterbi N-best Viterbi N-best Viterbi N-best Viterbi N-best Viterbi N-best Viterbi N-best
Q2
0.875 0.888 0.876 0.888
Ca
0.703 0.735 0.703 0.735 0.715 0.735 0.725 0.733 0.604
TP
496
547
FP
57
32
FN
97
46
498 58 96
543 31 50
0.88
0.888 0.884 0.887 0.838 0.853 0.725 0.823 547
537
63
32
64
32
51
62
46
56
46
212
531
547
381
0.61
408 9 185
0.264 0.505 94 44 499
332 7 261
The second observation is the clear advantage of the N-best decoding algorithm, which significantly outperforms the predicting performance of the Viterbi algorithm. This is caused probably by the fact that in applications with a
Q2: Total fraction of correctly predicted residues in a two state mode (transmembrane vs. non transmembrane). Ca: Mathew's correlation coefficient. TP, FP, FN: True positives, False Positives and False Negatives transmembrane segments respectively.
54
labeled data, the total likelihood of different paths sharing the same labeling is significantly higher than the likelihood of the best single path of states. Clearly, N-best decoding should be preferred since the cost of its computational demands is negligible compared to its increased predictive performance. B
-1
-wanw-)
Figure 2. Evolution of the negative log-likelihoods: (A) a-helical membrane protein set initialized by uniform probabilities, (B) a-helical membrane protein set initialized by prior distribution, (C) p-barrel outer membrane protein set initialized by uniform probabilities, and (D) /?-barrel outer membrane protein set initialized by prior distribution.
When referring to the convergence rates of the Baum-Welch algorithm and the gradient-based method for Maximum Likelihood estimation (Figure 2), we observed that for such applications the Baum-Welch algorithm achieves faster convergence. However, we should mention that in cases where the training started with uniform transition and emission probabilities (all parameters were set equal), both ML algorithms reached a local maximum of the likelihood, with a better predictive performance. On the other hand, when the training started from some fixed prior probabilities for the parameters, the algorithms reached a maximum significantly higher than in the former case, but with decreased predictive Performance. Presumably, this is caused by overfitting since it is known that iterative methods such as EM and gradient-descent strongly depend on the initial values of the parameters. This is more obvious in the case of outer membrane proteins where the number of examples in the training set was limited.
55 Comparing the prediction accuracies for each training method, we also observed that discriminative training (CML) does not clearly outperform either the traditional ML training via the Baum-Welch algorithm, or the gradient method proposed in this work. This is expected, as a consequence of the fact that discriminative algorithms are sensitive on data mislabeling [lo]. In such applications the data are inherently incorrectly labeled, since it is known that even in proteins with structure known at atomic resolution, the exact boundaries of the membrane regions are not well determined [6,12]. It is also worth noting that CML training requires twice as much computational time as that for ML training, since it requires two complete passes of the forward and backward algorithms. It seems that CML training might be advantageous in cases like gene finding [7,8], where mislabeling of the data is not expected (exons-introns), and also in cases of hybrid methods where other methods are not suitable [ 101.
References 1. Rabiner L. R. Proc IEEE 77 (2), 257-286 (1989). 2. Durbin R., Eddy S., Krogh A., and Mitchison G. Cambridge University Press, (1998). 3. Baum L.E. Inequalities 3, 1-8 (1972). 4. Dempster, A. P., Laird, N. M. and Rubin, D. B. J Roy Stat Soc B 39, 1-38 (1977). 5. Baldi, P. and Chauvin Y. Neural Comput 6(2), 305-3 16 (1994). 6. Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E. L. J A401 Biol 305,567-580 (2001). 7. Henderson, J., Salzberg, S., and Fasman, K. J Comput Biol 4(2), 127-141 (1997). 8. Krogh, A., Saira, I. M., and Haussler, D., Nucleic Acids Res 22 4768-4778 (1994). 9. Krogh, A. Proc 12th Int ConfPatt Recog 140-144 (1994). 10. Krogh, A., and Riis S., Neural Comput 11,541-563 (1999). 11. Schulz, G. E., Biochim Biophys Actu 1565,308- 3 17 (2002). 12. Von Heijne, G., Quart Rev Biophys 32(4), 285-307 (1999). 13. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E., Nucleic Acids Res 28,235-242 (2000). 14. Ikeda, M., Arai, M., Okuno, T. and Shimizu, T., Nucleic Acids Res 31(1), 406-409 (2003). 15. Schwartz, R. and Chow Y. L. Proc IEEE Int ConfAcoust, Speech, Sig Proc, 1,Sl-84 (1990). 16. Krogh, A. Proc 5th Int Conf Intel Sys Mol Bioll79- 186 (1997).
VARIANTS OF RELAXED SCHEMES AND TWO-DIMENSIONAL GAS DYNAMICS *
MAPUNDI K. BANDA Fachbereich Mathematik, Schlossgartenstr. 7, 64289 Darmstadt, Germany
E-mail: bandaC2mathematik.t.u-darrnstadt.de
We present a relaxed scheme with more precise information about local speeds of propagation and a multidimensional construction of the cell averages. Hence the physical domain of dependence is simulated correctly and high resolution is maintained by a genuinely multidimensional piecewise nonoscillatory reconstruction. Relaxation schemes have advantages that include high resolution, simplicity and explicitly no (approximate) Riemann solvers and characteristic decomposition is necessary. Performance of the scheme is illustrated by tests on two-dimensional Euler equations of gas dynamics.
1. Introduction Given an open bounded domain R c R2 and a time interval [O,T],in this paper higher order relaxed schemes are used to approximate a 2-dimensional N x N hyperbolic system of conservation laws:
a
-U at
a a + -F(U) + -G(U) ax
dY
= 0, U E RN
The relaxation system proposed by Jin and Xin [S] is:
au av aw at ax a y
-+-+-=(I,
*This work is supported by deutscher forschungsgemeinschaft grant kl 1105/9. 56
57
-
where E > 0 is the relaxation rate. The matrices A and B are appropriate 0, the solution of diagonal matrices. In the zero relaxation limit, E Eq. (2) approaches the solution of the original system Eq. (1) provided a subcharacteristic condition [6] holds. 1.1. Space Discretization
For the space discretization of Eq. (2), we cover 52 with rectangular cells Ci,j := [xi-+,xi++] x [yj-l,yj++] of uniform sizes Ax and Ay. The cells, Ci,j, are centred at (xi = iAx,yj = j a y ) . We use the notation: wi*;,j(t) := w(xi*;,Yj,t), wi,j*;cq := w(xi,Yj*$,t),
to denote the point-values and the approximate cell-average of the function w at (xi+;, yj,t), (xi, y j * + , t ) , and (xi, yj,t), respectively. Point-values Ui+4 , j , Uij++ , Vi++ ,j , and Wi,j+ ;are defined by upwind schemes and used in defining divided differences for discretizing Eq. (2). The scheme realised is referred to as a relaxing scheme and in the limit E 0 as a relaxed scheme.
-
2. Extensions to Relaxed Schemes
We consider variants in the x-direction only. Those in the y-direction follow analogously. 2.1. More Accurate Speeds of Propagation
In the upwind schemes, piecewise polynomial reconstructions of approximate solutions on each cell introduce possible discontinuities at the interface points. These discontinuities propagate with right- and left-sided local speeds, which, in the genuinely nonlinear or linearly degenerate case, are estimated by
-
dF
au
-
.I), A
%+I ,I. := min{Al(-(y+L, 2 3
dF ~ ( ~ ( ~ ~ + J L o } ~
respectively. Here, A1 < . . . < AN are the N eigenvalues of the Jacobian 273 2 3 and u:++,~:= p:++l,j(xi++,j) h e r e p are and ti;+. 2 13. := p?.(zi+l,.) piecewise polynomials used in the reconstruction of component u of U at time step n.
58
2 . 2 . Multidimensional Schemes
To obtain a second-order multidimensional scheme cell averages are integrated using a trapezoidal rule [2]. We approximate U(z, y, t”) by a linear polynomial reconstruction on each cell Cij. The scheme is now genuinely multidimensional since we add cross-diagonal directions to the Cartesian directions utilised in local speeds. 3. Numerical Tests
We would like t o approximate a solution to the 2-D Euler Equations for gas dynamics: m
2[!]+;[ dt n
p:p]+-&[
n
vp( E ~+ ; P) p]=o,
(3)
U(E + P) where p, u, v, rn = pu, n = pv, p, and E are the density, velocity in xdirection, velocity in y-direction, momentum in z-direction, momentum in y-direction, pressure, and energy, respectively. The equation of state for a polytropic gas is given by p = (y - l ) . ( E - : l l ~ 1 1 ~ ) , where y = 1.4. Radially Symmetric Riemann-Problem This is a test which we used to check the conservation of radial symmetry [4]. The computational geometry is a box R = [-0.5,0.5] x [-0.5,0.5]. An equidistant grid in both the z- and y-direction with 100 x 100 cells was used. The following are the initial conditions:
with (pl,pl,ul,v ~ =) (2.0, ~ 15,0,0)T and (pT,pT,ur,v,)T = (1.0, l.O,O,O)T and x = (z,y) is the space variable. Boundaries were maintained at ( p , p , ~ , v ) ~ ( x= , t )( p r , ~ , , ~ r , v , . ) TFigure . 1 (left) is a contour plot of density at t = 0.13 in which 15 equidistributed contour lines were used. “Double Sod Tube Problem” In this test case a “Double Sod Tube” Riemann problem [l]is considered. The same geometry as above is used with an equidistant grid in both the z- and y-directions with 200 x 200 cells. The following is the initial profile: p(x,O) = 0.1 if z y < 0, 1 otherwise p(x,O)= 0.1 if zy < 0, 1 otherwise u(x,O) = 0;
59
0.5
0.4
O
-0.1
-0.1
4.2
-0.2
-0.3
-0.3 -0.5 -0.4
4.4 4.5
-0.6
I 4.4
-0.2
0
0.2
0.4
0.6
Figure 1. Density Profiles for radial symmetric problem (left) at t = 0.13 and “Double Sod Tube” (right) at t = 0.2.
where x = (5,y) are space variables. The approximate solution at t = 0.2 is presented in figure (1) (right) plotted with 15 equidistributed contours. For both the above tests CFL = 0.475. These profiles are resolved without any input on the elementary waves involved, beyond the characteristic speeds. These preliminary results above compare favourably with the results in [4, 11. For time integration we used a second-order Runge-Kutta scheme [5]. 4. Summary
We have presented modifications to the relaxed schemes for conservation laws and tested it in two-dimensional gas dynamics. More comprehensive tests and comparisons with exisiting schemes applied to gas dynamics are underway.
References 1. D. Aregba-Driollet and R. Natalini, SIAM J. Numer. Anal. 37, 1973-2004 (2000). 2. M.K. Banda, Relaxation Schemes for Multidimensional Conservation Laws in preparation. 3. M.K. Banda, A High Resolution Relaxation Scheme with more Accurate Local Speeds of Propagation - in preparation. 4. M. Brio, A.R. Zakharan, and G.M. Webb, J . Comp. Phys. 1 6 7 , 177-195 (2001). 5. S. Gottlieb, C.W. Shu, E. Tadmor, S I A M Rev. 43,89-112 (2001). 6. S. Jin and Z.P. Xin, Comm. Pure Appl. Math. 48, 235-277 (1995). 7. H.J. Schroll, J. Sc. Comp.17, 599-607 (2002).
PREDICTING TOXICITY OF CHEMICALS USING CHEMODESCRIPTORS AND BIODESCRIPTORS: AN INTEGRATED APPROACH SUBHASH C. BASAK, BRIAN D. GUTE, AND DENISE MILLS Natural Resources Research Institute, University of Minnesota Duluth, 5013 Miller Trunk Highway, Duluth, Minnesota 55811, USA
Modern lifestyle is highly dependent on the routine use of thousands of chemicals. Many of these chemicals have toxic effects which are brought about through a myriad of biochemical mechanisms. A combination of functional (biochemical) criteria and some aspects of the molecular structure have been used for the hazard assessment of pollutants. In recent years, quantitative descriptors of molecular structure, usually called chemodescriptors, have been used for the prediction of biochemical/toxicological properties of chemicals. In the post-genomic era, mechanisms of action of hazardous chemicals can be characterized by techniques like proteomics and by quantitative descriptors derived from proteomics maps, called biodescriptors. This presentation will discuss the relative utility of chemodescriptors and proteomics-based biodescriptors in the development of integrated QSARs (I-QSARs) for predicting the activity/toxicity of chemicals.
MEASURING ECONOMIC WELL-BEING AND GOVERNANCE: SOME METHODOLOGICAL TOOLS SUDIP RANJAN BASU Graduate Institute of International Studies/IUHEI, International Economics Department and Econometrics Department, University of Geneva, Switzerland E-mail: sirdip hasu@,hotrnail.com.hasul@,hei.imiee.ch
Over the last decade, there has been a continuous attempt to provide suitable methodologies for the measurement of 'Human Development', 'Human Poverty', 'Gender Development', 'Governance', 'Corruption', 'Happiness', etc., among the international organisations (e.g., UNDP, World Bank) and in the academic profession (e.g., ICRG, BERI, FH, etc.). In all these studies the single most important element is to quantify some of the subjective or abstract variables/elements into a single quantitative value, so that the countries or the regions can be placed in a rank ordering on the basis of the values of such indices. In this paper we attempt to provide a new method to compute economic well-being (a measure of overall economic development) and governance (a measure of institutional quality) in the economy/society. The principal economic rationale of the present paper is to explore the dynamics of economic well-being together with other socio-economic dimensions, i.e., the quality of institutions and decentralisation, for invigorating economic well-being. In exploring this interlinkage between economic well-being and the quality of institutional arrangements, we develop a new measure of economic well-being and of the quality of institutions, and then search for their potential relationship in the present study. Initially the paper attempts to quantify economic well-being for the society. Economic well-being is a multidimensional concept, and is abstract in nature. We therefore suggest a methodology to compute the Economic Well-Being Index (EWBI), combining several variables that incorporate the socio-economic dimensions of the quality of life. These indicators are related to human resources, to the availability of health services, to the utilisation of technology, to the availability of physical infrastructure, and to the flow of financial services in the economy (measured as: income per capita, literacy rate, enrolment ratio, infant mortality rate, life expectancy, hospital beds, electricity consumption, bank branches, telephone lines, road length, railway routes, intensity of
cropping, fertilizer consumption). This computed well-being index intends to capture the various levels of economic prosperity in the society. We attempt to compute the economic well-being index in the following way: as the included indicators are mutually interdependent, we use factor analysis to compute this EWBI. The Factor Analysis (FA) technique is a powerful tool, which is used to reduce the number of influencing indicators and to detect structure in the relationships among indicators, that is, to classify variables according to their effect on the variables of interest. The underlying assumptions of factor analysis are that there exist a number of unobserved 'factors' that account for the correlation among the observed indicators and that, because of this relation, the unobserved factors can be inferred from the observed indicators. Thus, the base model can be written as X_i = Σ_{j=1}^{q} λ_ij f_j + e_i.
Now rewriting the above equation in a linear FA model yields:
X_1 = λ_11 f_1 + λ_12 f_2 + ... + λ_1q f_q + e_1
X_2 = λ_21 f_1 + λ_22 f_2 + ... + λ_2q f_q + e_2
...
X_p = λ_p1 f_1 + λ_p2 f_2 + ... + λ_pq f_q + e_p

We then estimate factor scores in the FA model as follows: for a given f_j, the i-th extracted factor score, denoted by F_ij, is given by F_ij = β_1 X_i1 + β_2 X_i2 + ... + β_p X_ip, where β_1, β_2, ..., β_p are referred to as regression factor coefficients and X_i1, X_i2, ..., X_ip are the p observed indicators for the i-th observation. We define the Economic Well-Being Index (EWBI) as a weighted average of the factor scores, where the weights, the λ_j's, are the eigenvalues of the correlation matrix. Thus:

EWBI^s = ( Σ_j λ_j F_j^s ) / ( Σ_j λ_j ),   s = 1, 2, ..., S (states/regions)
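A minimal numerical sketch of this construction (not from the paper; the indicator matrix is synthetic, the number of retained factors is illustrative, and the use of scikit-learn is our own assumption): standardise the indicators, extract factor scores, and average them with eigenvalue weights.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 13))           # hypothetical: 16 states x 13 indicators
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

q = 3                                    # number of retained factors (illustrative)
fa = FactorAnalysis(n_components=q)
scores = fa.fit_transform(X_std)         # factor scores F_sj, one row per state

# eigenvalues of the correlation matrix of the indicators, used as weights
eigvals = np.sort(np.linalg.eigvalsh(np.corrcoef(X_std, rowvar=False)))[::-1][:q]

ewbi = scores @ eigvals / eigvals.sum()  # EWBI^s = sum_j lam_j F_j^s / sum_j lam_j
ranking = np.argsort(-ewbi)              # states ranked by the index
print(ewbi.round(3), ranking)
```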
Then we quantitatively measure the quality of good governance. The quality of good governance is, like the EWBI, a multidimensional concept, and it should be examined along different socio-economic dimensions. The good governance measure is captured through the ability of governments to provide basic law and order, to provide social services to build up human capital, to provide physical infrastructure, and to properly maintain economic and administrative management, that is, to keep up a high level of efficiency in their activities. In the present analysis, our governance measure is related to four
dimensions, namely: peace and stability; people's sensibility; social equality; and management of government (measured as: crime rates, riots, industrial disputes and strikes, government debt-income ratio, and the Gini inequality measure). We postulate that the Quality of Governance Index (QGOI) is a latent variable, which is supposed to be linearly dependent on a set of observable indicators plus a disturbance term capturing error. Let
QGOI = α + β_1 X_1 + ... + β_k X_k + e, where X_1, ..., X_k is the set of indicators that are used to capture this QGOI, so that the total variation in the QGOI is composed of two orthogonal parts: (a) variation due to the set of indicators, and (b) variation due to error. We propose to replace the set of indicators by an equal number of their principal components (PCs), so that 100% of the variation in the indicators is accounted for by their PCs. We now solve the determinantal equation
|R − λI| = 0 for λ, where R is a K × K matrix; this provides a K-th degree polynomial equation in λ and hence K roots. These roots are called the eigenvalues of R. Let us arrange the λ's in descending order of magnitude, as λ_1 > λ_2 > ... > λ_k. Now, corresponding to each value of λ, we solve the matrix equation (R − λI)α = 0 for the K × 1 eigenvector α, subject to the condition that α′α = 1.
Then we write the characteristic vectors as α_1, α_2, ..., α_k, which correspond to λ_1, λ_2, ..., λ_k respectively. Hence we obtain the principal components as:

P_1 = α_11 X_1 + ... + α_1k X_k
P_2 = α_21 X_1 + ... + α_2k X_k
...
P_k = α_k1 X_1 + ... + α_kk X_k
Thus we compute all these PCs using the elements of successive eigenvectors corresponding to the eigenvalues λ_1, λ_2, ..., λ_k respectively. We now estimate the QGOI as a weighted average of the PCs, thus:
QGOI^s = ( λ_1 P_1 + λ_2 P_2 + ... + λ_k P_k ) / ( λ_1 + λ_2 + ... + λ_k ),   s = 1, 2, ..., S (states/regions),

where the weights are the eigenvalues of the correlation matrix R and λ_1 = Var(P_1), ..., λ_k = Var(P_k).
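A compact sketch of this eigenvalue-weighted principal-component index (again purely illustrative; the governance indicator matrix and all names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(16, 5))            # hypothetical: 16 states x 5 governance indicators
G_std = (G - G.mean(axis=0)) / G.std(axis=0)

R = np.corrcoef(G_std, rowvar=False)    # K x K correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)    # solve |R - lambda I| = 0 and (R - lambda I) alpha = 0
order = np.argsort(eigvals)[::-1]       # arrange eigenvalues in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

P = G_std @ eigvecs                     # principal components P_1, ..., P_k (columns)
qgoi = P @ eigvals / eigvals.sum()      # QGOI^s = sum_j lam_j P_j^s / sum_j lam_j
print(qgoi.round(3))
```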
As noted in the literature, the issue of decentralisation is also crucial in percolating the fruits of development to the people at the grassroots level. We use a simple measure of financial decentralisation to capture this argument in our model. Here we focus only on the aspect of financial decentralisation as measured by the assignment of expenditure functions and revenue sources to sub-national levels of government. In the economy, devolution is supposed to provide more efficient and equitable service delivery, to enhance revenue mobilisation, to promote participation, and to improve political stability in the society. The Financial Decentralisation Index (FIDI) is defined as:

FIDI^s = (assignment and compensation to local bodies / government's total revenue expenditure) × 100,   s = 1, 2, ..., S (states/regions).

Then we run a panel data regression on the dataset to obtain more efficient estimates, by pooling the time-series and cross-section data. We use the model in the following framework: initially, we look at the pooled ordinary least squares model of estimation. The pooled model contains observations on N units of observation (cross-section units) over T time points. The purpose is to estimate a standard regression model of the form:
Y_it^pool = α + β X_it + e_it,   i = 1, 2, ..., N;   t = 1, 2, ..., T,

where by assumption the e_it are iid over i and t, i.e., E(e_it) = 0 and var(e_it) = σ_e². Then we use the Fixed Effects estimation model; the specification with individual state-specific effects is given by

Y_it^FE = α_i + β′ X_it + e_it,   i = 1, ..., N;   t = 1, ..., T,

where β is 1 × k and X_it is k × 1. The Random Effects model has been rejected on the basis of the Hausman test statistic. So we use the state/regional panel by estimating the following regression:

EWBI_st = α + β_QGOI QGOI_st + β_FIDI FIDI_st + δ_s + ε_st,

where EWBI_st is the dependent variable for state/region s in period t, QGOI and FIDI are independent variables, δ_s are fixed or random effects, and ε_st are error terms. The econometric analysis of the panel data estimates,
after controlling for other variables, indicates that the estimated parameters are statistically different from zero. Our basic model specification therefore shows a systematic, strong positive link between economic well-being and institutional arrangements. The dataset for this study is compiled for 16 major states of India over four points in time (1970s, 1980s, 1990s, and 1997, the latest).
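A minimal sketch of the fixed-effects (within) estimation used above, on synthetic data with the same panel dimensions (16 states, 4 periods); the data-generating coefficients and all names are ours, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
S, T = 16, 4                                     # 16 states, 4 time points, as in the study design
state = np.repeat(np.arange(S), T)

# hypothetical regressors and a known data-generating process
qgoi = rng.normal(size=S * T)
fidi = rng.normal(size=S * T)
delta = rng.normal(size=S)[state]                # state fixed effects delta_s
ewbi = 0.6 * qgoi + 0.3 * fidi + delta + 0.1 * rng.normal(size=S * T)

def demean_by_state(v):
    """Within transformation: subtract each state's time mean (removes delta_s)."""
    means = np.bincount(state, weights=v) / np.bincount(state)
    return v - means[state]

X = np.column_stack([demean_by_state(qgoi), demean_by_state(fidi)])
y = demean_by_state(ewbi)
beta_fe, *_ = np.linalg.lstsq(X, y, rcond=None)  # fixed-effects (within) estimates
print(beta_fe)                                    # should be close to (0.6, 0.3)
```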
MODELLING THE NATURAL GAS CONSUMPTION IN A CHANGING ENVIRONMENT F. A. BATZIAS, N. P. NIKOLAOU, A. S. KAKOS AND I. MICHAILIDES Department of Industrial Management and Technology, University of Piraeus, 80 Karaoli & Dimitriou St., 185 34 Piraeus, Greece, E-mail:
[email protected] A composite function was used successfully for modelling the Natural Gas (NG) consumption in 16 European energy markets. The background of the model is a logistic function whose upper limit is also a logistic function of time, with secondary parameters determined either endogenously together with the remaining primary parameters or exogenously in a sample space of the energy market. Fitting of this 'double logistic' dynamic model to NG consumption data for the period 1980-2000 gave better Standard Errors of Estimate (SEEs) for ten energy markets in comparison with the linear, the exponential/asymptotic and the static logistic model. Supplementary results, obtained by statistical analysis of answers collected by circulating a questionnaire in the wider area of Attica in Greece, led to the conclusion that income/welfare, residential place and information play an important role as regards the intention of inhabitants to adopt the NG alternative.
1. Introduction
Natural Gas (NG) has actually been introduced to most European national energy markets a long time after their maturation, which was based on other energy sources (basically coal, oil, hydro- and nuclear energy). In several energy markets, it is expected that the adoption (or dissemination or diffusion or spread) of NG will follow the familiar pattern of an S-growth curve, which has been observed in many cases of technological substitution; although the validity of such a model may not be finally confirmed, it is widely preferred, at first instance or as a rough approximation, by several researchers, because it implies a rational mechanism of interaction between individuals in society: the rate of adoption is proportional to all possible connections between adopters and non-adopters at any instant of time, or mathematically dy/dt = b y (K − y), which gives by integration the well-known simple logistic function y = K / [1 + m exp(−bt)], where y is the number of adopters cumulatively calculated at time t, K is the upper limit of the energy market share to be covered by NG, b is the proportionality factor in the differential form or the exponential parameter in the
integrated form of the model (i.e. a rate constant), and m is the relative 'vertical' range of the logistic curve measured in y_0 units: m = (K − y_0)/y_0, with y = y_0 at t = 0. If the rational mechanism of diffusion suggests that the role of adopters is not significant (e.g. when the diffusion of information depends mainly on a central node transmitting the corresponding message without significant interaction between the receivers), then the logistic model is reduced to dy/dt = b(K − y), which gives by integration the well-known simple exponential/asymptotic function y = K − (K − y_0) exp(−bt). Several versions of the logistic function have been developed so far in various scientific disciplines (like Physical Chemistry, Engineering, Biology and Economics) and certain attempts have been made to design generalized models that incorporate most of these versions as special cases [1, 2]. The aim of the present work is (a) to develop a dynamic model for NG consumption, (b) to test its validity with data corresponding to European energy markets, (c) to suggest alternative approaches for the estimation of parameter values and (d) to show a way of influencing these parameter values by means of methods used in normative economics, in order to accelerate NG adoption in countries, like Greece, that are actually late arrivals in the area of NG utilization.
2. Methodology
The K-value is very important because it determines both the upper limit and the rate of adoption, as is evident from the differential form. One of the criticisms most often heard in relation to the applicability of the logistic model concerns this parameter, as it seems unreasonable to assume that its value is constant over the long period of time necessary to provide adequate data for reliable parameter value estimation by means of time-series analysis. For the needs of the present study, we have introduced a K-parameter which is also a function of time and has a different semantic content: it refers to possible adopters at time t (a new parameter, variable over time) rather than to expected adopters at t → ∞ (the old parameter, constant over time). As the set of possible adopters, who determine the driving force (K − y), is formed and continuously transformed within the same mechanism of information dissemination, accompanied by rational thinking with a tendency for imitation, it is reasonable to assume a logistic model to simulate the dependence of K on time: K = K_0 / [1 + m_0 exp(−b_0 t)]. Consequently, the following double logistic model (LL) is derived:

y = K_0 / { [1 + m_0 exp(−b_0 t)] [1 + m exp(−b t)] }     (1)
This dynamic model does not have the constraint of symmetry, a disadvantage of the static logistic function according to most researchers (see
e.g. [3]), although it keeps the limits of this simple form, i.e. y → K_0, dy/dt → 0 for t → ∞ and y → 0, dy/dt → 0 for t → −∞. Moreover, (1) maintains simplicity, is recognizable and offers an appropriate background for embedding parameter values determined exogenously; the latter advantage is very important, as we can simulate the expected behaviour of the energy market by using a representative sample of the population under consideration to investigate the individuals' intention at time t to adopt NG (in the near future) as a function of economic and social parameters/characteristics. Turner et al. [4] had inserted a logistic model for K in a Bernoulli-type differential equation representing the rate of population growth. Their integrated model is complicated in comparison with (1), includes an additional exponential parameter, implying a decrease of the degrees of freedom, and its sum of squares of errors is further increased because of the necessity to improve the short-term predictability by forcing the fitted curve to go exactly through the latest data point. This model was used for forecasting the growth of the U.S. population, not quite successfully, as the upper limit was estimated to be 1751 million inhabitants. It is worth noting that in the same work the suggested integrated model is in error, while the cited parameter values do not confirm the quoted y-estimates.
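A minimal sketch (ours, not the authors' estimation procedure) of how the LL model (1) can be fitted to an annual consumption series by non-linear least squares; the data below are synthetic placeholders for a real market series:

```python
import numpy as np
from scipy.optimize import curve_fit

def double_logistic(t, K0, m0, b0, m, b):
    """Double logistic model (1): y = K0 / ((1 + m0 e^{-b0 t})(1 + m e^{-b t}))."""
    return K0 / ((1.0 + m0 * np.exp(-b0 * t)) * (1.0 + m * np.exp(-b * t)))

years = np.arange(1980, 2001)
t = years - 1980.0
# synthetic NG consumption series (placeholder for the real market data)
true = double_logistic(t, 500.0, 4.0, 0.15, 20.0, 0.4)
y = true + np.random.default_rng(3).normal(scale=5.0, size=t.size)

p0 = [y.max() * 2, 1.0, 0.1, 10.0, 0.3]                  # rough starting values
popt, _ = curve_fit(double_logistic, t, y, p0=p0, maxfev=20000)

resid = y - double_logistic(t, *popt)
see = np.sqrt((resid**2).sum() / (t.size - len(popt)))   # SEE with n - p degrees of freedom
print(popt.round(3), round(see, 3))
```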
3. Implementation and Results
We have applied the LL equation (1) to model the NG adoption in the energy markets of 16 European countries for the period 1980-2000 (see Table 1). Most results indicate a better fit in comparison with the results obtained by the linear, the exponential/asymptotic and the simple logistic models (LN, EX, LG, respectively). The criterion used for comparison is the Standard Error of Estimate, SEE = [Σ(y_i − ŷ_i)²/(n − p)]^{1/2}, where ŷ_i is the value of y_i (i = 1, 2, ..., n) estimated by means of non-linear regression, p is the number of parameters used for regression and n − p is the number of degrees of freedom; in the case of regression of a logistic model of any kind, a fixed kernel is introduced, implying a corresponding decrease of the degrees of freedom. Three different algorithmic procedures were used for non-linear regression, (a) to avoid convergence to local optima and (b) to cross-check the SEE values. The flexibility/adaptability of the dynamic model in comparison with the other models is shown in the diagrams of Figures 1-2. Due to its flexibility, there is no need to obtain data well beyond a fixed inflection point, which is 1/2, 1/3, 1/e of the upper limit for the simple logistic, the Floyd and the Gompertz models, respectively, for parameter value estimation by regression; neither further assumptions on expected values of the
dependent variable nor smoothing of data are necessary, as is the case when a limited number of data is available (see e.g. [5]).

Table 1. SEE-values (dry NG, 10^9 ft^3) for the best fitting of the four models under consideration (minimal values quoted in gray). Model codes: EX, LN, LL, LG. The sixteen national markets considered are Austria, Belgium, Denmark, Finland, France, Germany, Greece, Ireland, Italy, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom; the LL model attains the minimal SEE in ten of them.
Figure 1. Fitting of models to data in the case of Spain; the superiority of the LL model is evident, especially near the ends; an extrapolation for short term forecasting based on the simple logistic LG might lead to significant overestimation of NG consumption.
Figure 2. Fitting of models to data in the case of Finland; only the LL model is capable of 'catching' a local fluctuation, like the one observed near the start of the time period.
4. Supplementary Information by Statistical Analysis
The parameter values K_0, m_0, b_0 of the function K = f(t) can be estimated exogenously by means of a questionnaire addressed each year to a sample of the population selected/structured to be representative of the market under consideration. The income level and residence area of the families selected to constitute the representative sample are expected to influence their attitude, as shown in Table 2, where the cited statistical results refer to a questionnaire answered by 74 persons living in a very high income suburb of Athens (H) and by an equal number of persons living in a very low income suburb of Piraeus (L) within the same urban area of Attica in Greece. The first line (HALF) refers to the question "do you consider seriously the possibility of adopting NG for heating/cooking needs within the next three years on the grounds that such an adoption might cut your electricity bill to half?". The second line (ENVR) refers to the same question, informing also about the expected environmental benefits at local and national level in the case of adoption. The third line (RESR) refers to the same question, further adding information on the benefits of replacing lignite by NG from the point of view of saving natural resources for ourselves and the future generations, implying rational decision making, better living and independence. The fourth line (DUBL) refers to the same question (with all additions so far) with a change of the economic incentive to "... on the grounds that without such an adoption your electricity bill is expected to double within the same time period due to a dramatic increase of the kWh price, which is already low for an EU member country?". The symbols X², C, Cr, r, D stand for the chi-square statistic, the absolute coefficient of contingency, the relative C (C expressed as a fraction of its maximum value), the tetrachoric correlation of attributes, and the difference between X² and X²_c, where X²_c is the critical value of X² at confidence level 99.5% (found in statistical tables with percentile values for the chi-square distribution), respectively. Evidently the results are significant at this confidence level, as in all cases D > 0, i.e. X² > X²_c, so the null hypothesis H_0 that there is no difference as regards the intention (I) of each group to adopt the NG alternative is rejected. In other words, the residents of the low income suburb (L) exhibit a significantly stronger intention to adopt the NG alternative in comparison with the residents of the high income suburb (H), with the following percentages in favour of adoption: 29/74 = 39.2%, 56/74 = 75.7%, 62/74 = 83.8%, 65/74 = 87.8%, against 7/74 = 9.5%, 14/74 = 18.9%, 17/74 = 23.0%, 36/74 = 48.6%, for HALF, ENVR, RESR, DUBL, respectively. Nevertheless, the situation is not so clear in the case of middle income suburbs, and the determination of the dependence of (I) on income distribution,
welfare level and adequacy of information (cross-section analysis in each year) is necessary to obtain information useful to design/implement energy policy instruments (economic incentives and direct regulations) for diffusion hastening. This might prove an effective way to overcome social inertia and cope with the energy 'paradox', i.e. the very gradual/slow diffusion of apparently cost-effective and environmentally more friendly technologies ([6]).

Table 2. Statistical analysis of results based on the chi-square test.
Code  | Values without Correction        | Values with Yates Correction
      | X²     C     Cr    r     D       | X²     C     Cr    r     D
HALF  | 17.77  0.33  0.46  0.35  9.89    | 16.19  0.32  0.44  0.33  8.31
ENVR  | 47.82  0.49  0.70  0.57  39.94   | 45.57  0.49  0.69  0.55  37.69
RESR  | 54.98  0.52  0.74  0.61  47.10   | 52.56  0.51  0.72  0.60  44.68
DUBL  | 26.22  0.39  0.55  0.42  18.34   | 24.44  0.38  0.53  0.41  16.56
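The contingency-table computation behind Table 2 can be reproduced with standard tools; a short sketch (ours) for the HALF question, using the adoption counts quoted above (29 of 74 in suburb L, 7 of 74 in suburb H):

```python
import numpy as np
from scipy.stats import chi2_contingency, chi2

# rows: suburbs L and H; columns: intend to adopt, do not intend (HALF question)
table = np.array([[29, 74 - 29],
                  [ 7, 74 -  7]])

x2_raw, _, dof, _ = chi2_contingency(table, correction=False)   # X^2 without correction
x2_yates, _, _, _ = chi2_contingency(table, correction=True)    # with Yates correction

n = table.sum()
C = np.sqrt(x2_raw / (x2_raw + n))        # absolute coefficient of contingency
x2_crit = chi2.ppf(0.995, dof)            # critical value at the 99.5% level
D = x2_raw - x2_crit                      # D > 0 => reject H0 (no difference in intention)

print(round(x2_raw, 2), round(x2_yates, 2), round(C, 2), round(D, 2))
# output close to the HALF row of Table 2: 17.77, 16.19, 0.33, 9.89
```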
5. Conclusions and Recommendations
The NG consumption in several European energy markets can be modelled by the logistic function, and in most cases better results are obtained when the upper limit of the energy market share to be covered by NG is also considered as a function of time. When this function is also logistic, the composite dynamic model, named herein double logistic (LL), appears to (a) maintain simplicity, (b) include meaningful and recognizable parameters and (c) offer an appropriate background for embedding parameter values determined exogenously; the latter advantage is very important, as we can simulate the expected behaviour of the energy market by using a representative sample of the population under consideration to investigate the individuals' intention to adopt NG as a function of economic and social parameters/characteristics. The statistical analysis we performed on data collected through a questionnaire showed that residents of low-income suburbs (which are also environmentally degraded) exhibit a stronger intention to adopt the NG alternative in comparison with residents of high-income suburbs. By combining the results of NG consumption modelling (time-series analysis) with the statistical results from processing the answers to a questionnaire circulated within a representative sample (cross-section analysis), we can offer incentives that facilitate the adoption of the NG alternative.
Acknowledgments
The authors kindly acknowledge financial support provided by the Research Centre of the University of Piraeus.
References
1. A. Tsoularis, Res. Lett. Inf. Math. Sci., 2, 23 (2001).
2. C. Skiadas, Tech. Forecasting and Soc. Change, 27, 39 (1985).
3. C. Skiadas, Tech. Forecasting and Soc. Change, 30, 313 (1986).
4. M. E. Turner Jr., B. A. Blumenstein, and J. L. Sebaugh, Biometrics, 25, 577 (1969).
5. J. Siemek, S. Nagy, and S. Rychlicki, Applied Energy, in press (2003).
6. A. B. Jaffe and R. N. Stavins, Resource and Energy Economics, 16, 91 (1994).
COMPUTER AIDED DIMENSIONAL ANALYSIS FOR KNOWLEDGE MANAGEMENT IN CHEMICAL ENGINEERING PROCESSES F. A. BATZIAS, A. S. KAKOS AND N. P. NIKOLAOU Department of Industrial Management and Technology, University of Piraeus, 80 Karaoli & Dimitriou St., 185 34 Piraeus, Greece E-mail:
[email protected]
An algorithmic procedure has been designed/developed for Computer Aided Dimensional Analysis (DA) of chemical engineering processes. The main purpose of this software is to construct/select the best combination of dimensionless groups describing adequately a process under certain criteria. The creation/operation of an Ontological Knowledge Base (OKB) plays a central role in this procedure, as it provides, inter alia, the means for filtering/reducing the dimensionless groups obtained by solving the system of dimensional equations according to the Buckingham Π Theorem. The successful implementation of this software is thoroughly presented, step by step, in a case of mass transfer (liquid drops moving in immiscible liquids).
1. Introduction
According to the classical work of Johnstone and Thring [1], Dimensional Analysis (DA) is a technique for expressing the behaviour of a physical system in terms of the minimum number of independent variables and in a form that is unaffected by changes in the magnitude of the units of measurement. The physical Variables, Parameters, and Dimensional Constants (VPCs) are arranged in Dimensionless Groups consisting of ratios of the VPCs, which mostly characterize the system under consideration; these groups constitute the new variables in the dimensionless equation of state of the system. In the case that the system is a chemical engineering process, the equation of state is used either (i) for scaling down an industrial process (already in operation or in the stage of design) to optimize its (real or expected) operation or solve relevant problems, or (ii) for scaling up a process synthesized at laboratory scale within an R & D programme. The number of dimensionless groups, which constitute the equation of state, is (n − m), where n is the number of VPCs and m is the number of Primary Quantities used to define the VPCs dimensionally, according to the well-known Buckingham Π Theorem (see, e.g. [1]). The proof of this Theorem, by means of Linear Algebra, implies that the number h of distinct possible arrangements of dimensionless groups is equal to the number of combinations of m linearly independent columns of the dimensional matrix (see [2]). These unique
solutions form the S-matrix. Although the solutions are all algebraically equivalent, they give different estimators for a dependent variable, by regression to the same data. As a matter of fact, most researchers rely on an arbitrarily selected solution for parameter estimation, possibly using only the subjective criterion of recognizing some groups in this solution, as widely used in the technical literature. In the present work, we have designed/developed/implemented an algorithmic procedure for (a) finding out all independent feasible solutions and (b) selecting the best solution by means of quantitative criteria and some kind of filtering performed by an Ontological Knowledge Base (OKB), which is continually enriched either in the course of running the software on a variety of dimensional problems or exogenously via an intelligent agent [3]. It is worth noting that the methodology presented herein can easily be extended to cover fractal VPCs, as defined in [4], provided that corresponding data are available.
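To make the Π-theorem bookkeeping concrete, here is a small illustrative sketch (not part of the paper's software; the use of sympy is our own choice) that derives the n − m = 4 independent dimensionless groups for the seven VPCs of the drop problem treated in Section 3, by computing a basis of the null space of the dimensional matrix:

```python
from sympy import Matrix

# columns: U, d_rho, D, g, sigma, rho, mu ; rows: exponents of M, L, T
names = ["U", "d_rho", "D", "g", "sigma", "rho", "mu"]
dim = Matrix([[ 0,  1, 0,  0,  1,  1,  1],   # mass     M
              [ 1, -3, 1,  1,  0, -3, -1],   # length   L
              [-1,  0, 0, -2, -2,  0, -1]])  # time     T

# every null-space vector of the dimensional matrix is a set of exponents
# producing a dimensionless product of the seven VPCs (n - m = 7 - 3 = 4 groups)
for vec in dim.nullspace():
    group = " ".join(f"{n}^({e})" for n, e in zip(names, vec) if e != 0)
    print(group)
```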
2. Methodology
The algorithmic procedure, especially designed for (a) the DA of a process, (b) the choice of the optimal combination of independent dimensionless groups, and (c) the creation/maintenance of the OKB, includes the following stages. Figure 1 illustrates the interconnection of stages within an R & D programme, represented by the corresponding number or letter, in the case of an activity or decision node, respectively.
1. Collection of all VPCs (totalling n), relating to the full-scale process, which is the objective of the R & D programme.
2. Design and testing of the laboratory-scale simulator of the process.
3. Ranking of all VPCs in order of decreasing importance.
4. Measurement of all VPCs by means of the simulator.
5. Selection of the q VPCs ranked first (out of the total n) to start the iterative part of the algorithm. Initially, k = q.
6. Computer-aided Dimensional Analysis to form the h_j × (k − m) S_j-matrix, where h_j is the number of independent solutions at step j.
7. Filtering of the S_j-matrix to obtain a partial P_j-matrix by fulfilling the criterion of a minimum number of dimensionless groups 'recognized' by the OKB. A reasonable choice for this filtering criterion is (k − m).
8. Estimation of all parameter values appearing in the solutions, represented by the rows of the P_j-matrix, by non-linear regression.
9. Ranking of the solutions in order of increasing SEE. Let SEE_j be the SEE of the solution ranked first.
10. Scale-up of the process under development to pilot plant, taking into consideration the relation corresponding to the last SEE stored in stage 11.
11. Let a = SEE_j (initially a = M, where M is a very big number).
12. Creation/Enrichment/Usage of an OKB.
13. Addition of the VPC ranked next in stage 3.
Q. Is the SEE_j of the solution ranked first in stage 9 less than a?
R. Is the equality j = (n − q − 1) valid?

Figure 1. Flow chart of the algorithmic procedure designed/developed for dimensional analysis, scale-up of the processes and creation/enrichment of the OKB (solid line: executive line; dotted line: information line).
3. Implementation
We implemented the above algorithmic procedure in the case of mass transfer in the form of liquid drops moving in immiscible liquids, a process which plays an important role in liquid-liquid extractors, in separators used in distillation columns, and in packed towers when the packing is not wetted by the disperse phase. The first nine VPCs collected are the following, ranked in order of decreasing importance (see stages 1-3 in Figure 1): U = terminal velocity of drop ∈ [L T^-1], Δρ = difference in densities of drop liquid and water ∈ [M L^-3], D = equivalent spherical drop diameter ∈ [L], g = local acceleration due to gravity ∈ [L T^-2], σ = interfacial tension between drop liquid and water ∈ [M T^-2], ρ = density of water ∈ [M L^-3], μ = viscosity of water ∈ [M L^-1 T^-1], μ_0 = viscosity of drop liquid ∈ [M L^-1 T^-1], and d = diameter of container ∈ [L]. The first seven were selected to start the iterative part of the algorithm (stage 5), i.e. herein n = 9, m = 3, q = 7. Instead of performing measurements by means of the simulator (stage 4), numerical data from the technical literature [5] were used. The Π_rc dimensionless groups (r = 1, ..., h_0, c = 1, 2, ..., q − m) constitute the h_0 × (k − m), or 18 × 4, S_0-matrix (stage 6). For filtering the S_0-matrix, we set the criterion
of fulfilling the maximum number (i.e. k − m = 4) of dimensionless groups recognized by the OKB in order to obtain the partial P_0-matrix (stage 7), which consists of only two rows (lines No. 8 and 11 of Table 1). The solution No. 8 was ranked first by performing stages 8 and 9, and its SEE_0 value was stored according to stage 11. As j ≠ (n − q − 1), stage 13 is activated and the 8th dimensional parameter μ_0 enters stage 6. This iteration gives a 31 × 5 S_1-matrix, from which a 2 × 5 P_1-matrix is obtained by setting the same criterion of fulfilling the maximum number (i.e. k − m = 5) of dimensionless groups recognized by the OKB. The solution No. 12 {Π_12,1 = ρ^(1/3) μ^(-1/3) g^(-1/3) U, Π_12,2 = ρ^(2/3) μ^(-2/3) g^(1/3) D, Π_12,3 = ρ^(1/3) μ^(-4/3) g^(-1/3) σ, Π_12,4 = ρ^(-1) Δρ, Π_12,5 = μ^(-1) μ_0}, with SEE_1 = 12.2624, passes the filter, and as SEE_1 < a this solution dominates. This means that the contribution of μ_0 to the modelling of the process is significant even when a strict filter is applied. As j ≠ (n − q − 1), stage 13 will again be activated to examine the contribution of parameter d, but there is also a provision in our software design for terminating the program in case the user considers the last SEE-value adequately small (a termination clause that might also be set a priori). It is worth noting that all information gained during this procedure has been stored in the OKB (stage 12).

Table 1. The h_j × (k − m) = 18 × (7 − 3) S_j-matrix (for j = 0), with the SEE of non-linear regression for each solution; groups in gray are recognizable by the OKB in the stage of filtering.
SEE 12.89856 12.85352 12.79775 12.79961 12.78040 12.79682 12,79450 12.77123 12.81673 12.83649 15.85902 12.79193 12.87053 12.78061 12.81102 12.79239 21.62942 21.14292
4. Discussion
The predominant solution in the above case example is found when j = 1, as there is no solution passing the filter for j = 2; it also happens that SEE_1 < a. A problem might arise if SEE_1 is slightly higher than SEE_0, but the inclusion of μ_0 within the equation of state increases the explanatory ability of the independent variables. To provide against such a contradiction, we have inserted a man-machine interaction node in the computer program so that the user can judge whether the marginal utility added (by increasing j) counterbalances a possible slight increase of the accepted SEE; so, if he decides to keep the additional VPC, he knows exactly the price paid for the sake of more explicit scientific reasoning. A similar but simpler question arises as regards the selection of the q initial VPCs, which constitute the minimal set of VPCs capable of describing the process under consideration at j = 0. E.g., in the above case example, the software user might be an expert in process engineering who might consider the wall effect not negligible enough to omit the container diameter d; consequently, he would start the algorithmic procedure with 8 (including d) instead of 7 VPCs, even if the corresponding information extracted from the OKB is in favour of omitting this parameter. Inversely, the OKB is enriched/updated by means of such human intervention; the neuro-fuzzy rules within it learn from experience, resulting in an eventual change of suggested proposals (i.e. change of dominance hierarchy) in subsequent runnings/sessions.
5. Conclusions
We have designed an algorithmic procedure for (a) finding all feasible solutions of independent dimensionless groups that describe a process, and (b) selecting the best solution by means of quantitative criteria and some kind of filtering performed by an Ontological Knowledge Base (OKB). By applying this procedure we avoid the arbitrary choice of a possibly dependent combination of dimensionless groups, which is either the result of an empirical procedure (lacking theoretical foundation) or the subjective choice of a person who usually makes decisions on the basis of which dimensionless groups seem familiar to him and/or are cited in other studies, regardless of semantics. The aforementioned procedure was implemented in a user-friendly computer application developed using the MS .NET architecture; moreover, the OKB was implemented using widely accepted Internet standards (XML-W3C specifications) to allow interaction with other similar applications or agents, possibly across the Internet [3]. The software was successfully applied in the
case of mass transfer in the form of liquid drops moving in immiscible liquids, a process which plays an important role in liquid-liquid extractors, in separators used in distillation columns, and in packed towers when the packing is not wetted by the disperse phase. By checking (a) the procedure and the software functionality (by letting a common user, not an expert, perform all possible operations) and (b) the results qualitatively (by means of a technical literature survey) and quantitatively (by using linear algebra), we found no problematic response of the system. Nevertheless, we have installed a man-machine interface to permit human interaction when the user has a reason to believe that another (acceptable but sub-optimal) solution might be better fitted under certain circumstances; in such a case, the OKB is enriched by human reasoning and new predominant rules may emerge eventually.
6. Acknowledgments
The authors kindly acknowledge financial support provided by the Research Centre of the University of Piraeus.
References
1. R. E. Johnstone and M. W. Thring, Pilot Plants, Models, and Scale-up Methods in Chemical Engineering, McGraw-Hill, New York (1957).
2. W. D. Curtis, J. D. Logan, and W. A. Parker, Linear Algebra and its Applications, 47, 117 (1982).
3. F. A. Batzias and E. C. Marcoulaki, Computer-Aided Chem. Engineering, 10, 829 (2002).
4. M. Rybaczuk and W. Zielinski, Chaos, Solitons and Fractals, 12, 2517 (2001).
5. P. M. Krishna, D. Venkateswarlu, and G. S. R. Narasimhamurty, Journal of Chemical and Engineering Data, 4, 336, 340 (1959).
Appendix
Table A.1. Sample of the Variables, Parameters, and Dimensional Constants (VPCs) included in the OKB.

Symbol | Name                                              | Dimension
U      | Terminal velocity of drop                         | [L T^-1]
Δρ     | Difference in densities of drop liquid and water  | [M L^-3]
D      | Equivalent spherical drop diameter                | [L]
g      | Local acceleration due to gravity                 | [L T^-2]
σ      | Interfacial tension between drop liquid and water | [M T^-2]
ρ      | Density of water                                  | [M L^-3]
μ      | Viscosity of water (or liquid)                    | [M L^-1 T^-1]
μ_0    | Viscosity of drop liquid                          | [M L^-1 T^-1]
d      | Diameter of container                             | [L]
N      | Impeller/mixing/stirrer speed                     | [T^-1]
D_A    | Molecular diffusivity                             | [L^2 T^-1]
ν      | Kinematic viscosity of liquid                     | [L^2 T^-1]
D_I    | Impeller/stirrer diameter                         | [L]
ΔP     | Pressure difference                               | [M L^-1 T^-2]
k_L a  | Volumetric liquid-side mass transfer coefficient  | [T^-1]
U_s    | Slip velocity                                     | [L T^-1]
D_P    | Particle diameter                                 | [L]
P      | Power                                             | [M L^2 T^-3]
M_T    | Total mass of particle                            | [M]
ε      | Power input per unit mass of fluid                | [L^2 T^-3]
Table A.2. Sample of the recognizable Dimensionless Groups included in the OKB; groups in gray have been used for filtering in the implementation presented herein. The groups listed include the Reynolds number, the Weber group, the drag coefficient of drops, the Froude number, the B group, the gravity group, the G group, the P group, the Sd group, the terminal velocity group, the Schmidt number (Sc), the power number (Po), the solid concentration or quantity group, the Sherwood number (Sh), the Euler number (Eu), the particle Reynolds number, the turbulent Reynolds number (Re) and the impeller Reynolds number (Re), each defined through the corresponding combination of the VPCs of Table A.1 (e.g. the Reynolds number D ρ U μ^-1 and the Weber group D ρ U^2 σ^-1).
Figure A.1. The Graphical User Interface of the software application that implements the algorithmic procedure presented herein.
Figure A.2. The 8-variable dimensional problem, discussed in the implementation section, is being defined by the non-expert user through a user-friendly dialog form.
Table A.3. The h_j × (k − m) = 31 × (8 − 3) S_j-matrix (for j = 1), with the SEE of non-linear regression for each solution; groups in gray are recognizable by the OKB in the stage of filtering. Dimensionless Groups
SEE 12.806 12.626 12.598 12.464 12.557 12.271 12.449 12.459 12.271 12.278 12.271 12.262 12.459 12.460 15.214 12.872 12.346 12.304 12.280 12.288 12.882 12.354 12.313 12.379 13.394 12.313 12.263 23.173 20.364 23.387 15.177
NEW DEVELOPMENTS ON EQUIVALENT THERMAL IN HYDROTHERMAL OPTIMIZATION. AN ALGORITHM OF APPROXIMATION
L. BAYON, J. M. GRAU, M. M. RUIZ AND P. SUAREZ, University of Oviedo, Department of Mathematics, E.U.I.T.I., C/ Manuel Llaneza 75, Gijón, 33208, Asturias, Spain. E-mail:
[email protected]. es
1. Introduction
This work is embedded in the line of research entitled "Optimization of hydrothermal systems". In a previous paper [1] we considered the possibility of substituting a problem with m thermal plants and n hydro-plants (H_n-T_m) by an equivalent problem (H_n-T_1) with a single thermal power station: the equivalent thermal plant. In that paper we calculated the equivalent minimizer in the case where the cost functions are second-order polynomials. We proved that the equivalent minimizer is a second-order polynomial with piecewise constant coefficients; moreover, it belongs to the class C^1. In this paper we shall present two fundamental contributions: first, new theoretical results relative to the equivalent thermal plant and, second, an algorithm for the approximate calculus for a general model. We assume throughout the paper the following definitions. Let F_i : D_i ⊆ R → R (i = 1, ..., m) be the cost functions of the thermal power stations. We assume that
∀ξ ∈ D = D_1 + ... + D_m ⊆ R, there exists (ξ_1, ..., ξ_m) ∈ ∏_{i=1}^{m} D_i, the unique minimum of Σ_{i=1}^{m} F_i(x_i) with the condition Σ_{i=1}^{m} x_i = ξ.
Definition 1.1. Let us call the i-th distribution function the function Ψ_i : D_1 + ... + D_m → D_i defined by Ψ_i(ξ) = ξ_i, ∀i = 1, ..., m.
Definition 1.2. We will denote as the equivalent minimizer of {F_i}_{i=1}^{m} the function Φ : D_1 + ... + D_m → R defined by

Φ(ξ) := min Σ_{i=1}^{m} F_i(x_i), with the constraint Σ_{i=1}^{m} x_i = ξ.
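For strictly convex costs with increasing marginal costs (the setting of Theorem 2.1 below), the minimizer equalizes the marginal costs F_i'(x_i) among the plants that generate. A small illustrative sketch (ours, with made-up quadratic coefficients, not the paper's algorithm) evaluates Φ(ξ) and the distribution (Ψ_1(ξ), ..., Ψ_m(ξ)) by bisection on the common marginal cost:

```python
import numpy as np

# illustrative quadratic costs F_i(x) = a_i + b_i x + c_i x^2 (made-up coefficients)
a = np.array([1.0, 1.5, 1.2])
b = np.array([2.0, 3.0, 2.5])
c = np.array([0.10, 0.05, 0.08])

def distribution(xi, tol=1e-10):
    """Split the total power xi so that the marginal costs F_i'(x_i) = b_i + 2 c_i x_i
    are equalized (with x_i >= 0), via bisection on the common marginal cost."""
    lo, hi = b.min(), b.max() + 2.0 * c.max() * xi + 1.0
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        x = np.maximum(0.0, (lam - b) / (2.0 * c))    # F_i'^{-1}(lam), clipped at 0
        if x.sum() > xi:
            hi = lam
        else:
            lo = lam
    return np.maximum(0.0, (0.5 * (lo + hi) - b) / (2.0 * c))

def equivalent_minimizer(xi):
    """Phi(xi) = min sum_i F_i(x_i) subject to sum_i x_i = xi, x_i >= 0."""
    x = distribution(xi)
    return float(np.sum(a + b * x + c * x**2))

print(distribution(10.0).round(3), round(equivalent_minimizer(10.0), 3))
```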
2. New Theoretical Developments
In this paper we continue the theoretical study of the equivalent thermal plant. First we prove, under certain assumptions, the existence and uniqueness of the equivalent minimizer Φ.

Theorem 2.1. Let {F_i}_{i=1}^{m} ⊂ C^1[0, ∞) be a set of functions such that F_i' is strictly increasing (i = 1, ..., m), with F_i'(0) ≤ F_{i+1}'(0), and let the function F : [0, ∞)^m → R be F(x_1, ..., x_m) := Σ_{i=1}^{m} F_i(x_i). Let C_a := {(x_1, ..., x_m) ∈ R^m | x_i ≥ 0 ∧ Σ_{i=1}^{m} x_i = a}. Then there exists a unique set {Ψ_i}_{i=1}^{m} such that:
(1) (Ψ_1(a), ..., Ψ_m(a)) is the minimum of F over C_a, ∀a ≥ 0.
(2) It holds that (Ψ_1(a), ..., Ψ_m(a)) ∈ C̊_a (the interior of C_a) ⇔ a > (Σ_{i=1}^{m} F_i'^{-1} ∘ F_m')(0).
(3) (Ψ_1(a), ..., Ψ_m(a)) ∉ C̊_a ⇔ for a certain i ∈ {1, ..., m − 1}, Ψ_{i+1}(a) = ... = Ψ_m(a) = 0.

In the previous theorem we also obtain the distribution functions Ψ_k. Now we define the equivalent thermal plant piecewise, taking into account the restriction of power positivity.

Theorem 2.2. Let {F_i}_{i=1}^{m}, F, and C_a be defined as in Theorem 2.1. Then there exist {δ_k} (with δ_{m+1} = ∞) and {Ψ_k}_{k=1}^{m} ⊂ C[0, ∞) such that for every a > 0 the minimum of F over C_a is attained at (Ψ_1(a), ..., Ψ_m(a)).
Also, we shall prove that, for a general model, the equivalent thermal plant belongs to the class C1.
Theorem 2.3. Let {F_i}_{i=1}^{m} ⊂ C^1[0, ∞) be a set of functions defined as in Theorem 2.1. Then the function Φ belongs to the class C^1 and Φ'(0) = F_1'(0).

To conclude this section, we analyze the situation that arises when the thermal plants are subject to restrictions of the type

C_a := {(x_1, ..., x_m) ∈ R^m | P_i^min ≤ x_i ≤ P_i^max ∧ Σ_{i=1}^{m} x_i = a}.
3. An Algorithm of Approximation
We have developed a new algorithm for the approximate calculus of the thermal equivalent of m thermal power plants whose cost functional is general (non-quadratic). The outline is the following:

i) We linearly approximate the derivative of the cost function of each thermal plant, F_i'(x), i = 1, ..., m, in the power generation interval of each plant. This approximation may be done as finely as one wishes by simply increasing the number of splines in said interval. The integration of these functions leads us to the piecewise defined functions Φ̂_i(x), i = 1, ..., m, that approximate the cost function of each thermal plant considered:

Φ̂_i(x) = α_ik + β_ik x + γ_ik x²   if δ_ik ≤ x < δ_i,k+1,   k = 1, ..., l − 1,
Φ̂_i(x) = α_il + β_il x + γ_il x²   if δ_il ≤ x.
ii) We next demonstrate that each function Φ̂_i(x) can be considered as the minimizing equivalent of l fictitious thermal plants, whose cost functions, denoted by {F_i1(x), F_i2(x), ..., F_il(x)}, are second-order polynomials

F_ik(x) = ᾱ_ik + β̄_ik x + γ̄_ik x²,   k = 1, ..., l.

The aforementioned coefficients, deduced from those obtained in [1], are given in terms of the spline coefficients and break points (they involve combinations of the form 2 γ_ik δ_ik + β_ik), where (δ_ik, δ_i,k+1) is the domain of the k-th piece of Φ̂_i(x).

iii) Finally, we construct the equivalent minimizer of all the functions obtained
{F_ij}, i = 1, ..., m; j = 1, ..., l.
We finally show, using an example, that the developed algorithm offers very good approximate results in comparison with prior methods, such as, for instance, [2].

References
1. L. Bayón, J. M. Grau and P. Suárez, A new formulation of the equivalent thermal in optimization of hydrothermal systems, Math. Probl. Eng. (2002).
2. L. Bayón, J. M. Grau and P. Suárez, A New Algorithm for the Optimization of a Simple Hydrothermal Problem, Proceedings CMMSE 2002, Vol. I, pp. 61-70 (2002).
UNIQUE VIRTUES OF THE PADE APPROXIMANT FOR HIGH-RESOLUTION SIGNAL PROCESSING
DZEVAD BELKIC Department of Medical Radiation Physics, Karolinska Institute, P. O. Box 260, S-171 76 Stockholm, Sweden
This lecture reviews quantum-mechanical signal processing based upon the Padé approximant (PA). We presently link the PA and the Lanczos algorithm to design the Padé-Lanczos approximant (PLA). The PLA is operationalized with the recursive algorithm called the fast Padé transform (FPT) for both parametric and non-parametric estimations of spectra. The FPT for any given power series is defined by the unique quotient of two polynomials. This processor provides a meaningful result even when the original expansion diverges. It can significantly accelerate slowly converging sequences/series. As opposed to a single polynomial, e.g. the fast Fourier transform (FFT), the FPT can analytically continue general functions outside their definition domains. Moreover, we show that the FPT is an efficient solver of generalized eigen-problems, e.g. the quantum-mechanical evolution/relaxation matrix U comprised of auto-correlation functions. These generic functions can be either computed theoretically or measured experimentally. Such a concept, put forward as a computational tool, surpasses its initial purpose. Indeed, auto-correlation functions represent a veritable alternative formulation of quantum mechanics. This is not just because all the major observables, e.g. complete energy spectra, local density of states, quantal rate constants, etc., are expressible through the auto-correlation functions. It is also because these and other observables could be given completely in terms of some appropriate, relatively small informational parts that can be singled out and analyzed separately from the unwanted/redundant remainder of the full data set of auto-correlation functions. The needed dimensionality reduction of the original large problems treated by the FPT can be achieved by e.g. windowing using the band-limited decimation. Alternatively, as abundantly done in this work, the Lanczos tridiagonalization can be employed, yielding sparse Jacobi
matrices in terms of the Lanczos coupling parameters {α_n, β_n} that have their very important physical interpretations. The FPT is naturally ingrained in the Schrödinger picture of quantum mechanics and in the total time-independent Green function for the studied system. This yields a versatile framework for a unified treatment of spectroscopy and collisions within signal processing and quantum mechanics. In the quantum-mechanical method of nearest neighbors or tight bindings, we use experimentally measured time signals as the only input data to derive the exact analytical expressions for the FPT, the Lanczos polynomials {P_n(ω), Q_n(ω)}, the coupling parameters {α_n, β_n}, the Lanczos state vectors, the total wave function at any frequency ω and the 'Hamiltonian' operator. An extension of the FPT to multi-dimensional analysis is also given for usage in very diverse fields. A joint potential power of all parametric methods is their ability to extract the peak parameters from a given time signal as the result of data processing. By contrast, Fourier-based spectral analysis must resort to post-processing fits to perform quantification of resonances in a studied spectrum. However, a common weakness of all parametric estimators is the inevitable production of spurious or extraneous resonances. Spurious peaks are due to the noise corruption of the time signal and to the so-called overdetermination problem. If a signal of length N happens to have fewer than N/2 peaks, the problem becomes algebraically overdetermined, since there are more equations than unknowns. This unavoidably leads to singular values associated with false peaks that represent spurious resonances. In fact, the adequacy and utility of all existing parametric methods ultimately depend upon their ability to unequivocally identify and regularize spurious resonances. In quantum chemistry within Nuclear Magnetic Resonance (NMR), more than two decades ago, the method called the Linear Predictor (LP) showed an initial success for parametric processing of experimentally measured time signals. Long computer processing times and the unavailability of automated software for the LP were largely tolerated, but the critical lack of an objective procedure to adequately solve the problem with spurious poles was the main reason for the noticeable absence of this method among modern parametric estimators. We address this key stability issue within the Padé approximant by modeling a complex spectrum through a function determined uniquely by a rational polynomial, i.e. a polynomial quotient for the given power series of the resolvent or the Green function, with time signal points as the expansion coefficients. Extraneous roots are present in both the numerator
and denominator polynomials of the PA. Most harmful are the denominator spurious roots of the PA, as they lead to unphysical spikes in the Padé magnitude or power spectrum. At first glance, the numerator spurious roots of the PA might seem innocuous, but they must also be viewed as unwelcome, since they fill out the valleys with unphysical anti-resonances and this destroys the phase minimum, as well as the uniqueness feature of the PA. We examine this important problem within the inherently unstable PA by using the so-called constrained root reflection, which is an analytical procedure for regularizing spurious roots. First we unequivocally separate the genuine from the spurious resonances in the Padé power spectrum, which is itself the Padé-Chebyshev approximant (PCA). Second, the unstable PCA containing diverging and converging exponentials is properly stabilized. This is done by a special root reflection which reverses the sign of diverging exponentials so that they are relocated on the side of genuine resonances. Such a procedure is accomplished under the constraint that the parameter and the shape spectra of the PA are the same. The shape spectrum is the Padé polynomial quotient evaluated at selected frequencies. The parameter spectrum is the sum of partial fractions built from the found peak parameters (position, width, height and phase). The ensuing constrained root reflection has its strong physical significance in the full preservation of the total energy of the signal. The resulting method is called the Padé-Schur approximant (PSA) which, as a stable estimator, possesses only converging exponentials. The PSA describes the Padé power spectrum as the unique ratio of two Schur polynomials whose computed roots are all adequately regularized and, as such, located on the side of physical resonances. In this way, rather than attempting to eliminate or reduce the noise content of the measured data, as repeatedly done previously in the literature on signal processing but with the risk of losing weak genuine spectral features, the PSA processes noise as it comes, together with the physical signal. At the CMMSE 2003 Conference we shall present a number of illustrations establishing the unique virtues of the Padé approximant by assessing its utility and versatility for signal as well as image processing, with a special emphasis on accuracy, stability, efficiency, robustness and simplicity in user-friendly automatized applications across interdisciplinary research.
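As a purely illustrative complement (not from the lecture, and not the specific FPT algorithm), the sketch below forms a standard [L/M] Padé approximant directly from the leading coefficients of a power series whose expansion coefficients are time-signal points, and evaluates it as a spectrum; the toy signal, degrees and evaluation point are our own assumptions:

```python
import numpy as np

def pade(c, L, M):
    """[L/M] Pade approximant P(z)/Q(z) for the truncated power series
    sum_n c[n] z^n (here c[n] are time-signal points).  Returns the ascending
    polynomial coefficient arrays p (degree L) and q (degree M, q[0] = 1)."""
    c = np.asarray(c, dtype=complex)
    # linear system for the denominator: sum_j q[j] c[L+i-j] = -c[L+i], i = 1..M
    A = np.array([[c[L + i - j] for j in range(1, M + 1)] for i in range(1, M + 1)])
    rhs = -c[L + 1:L + M + 1]
    q = np.concatenate(([1.0], np.linalg.solve(A, rhs)))
    # numerator coefficients follow by convolution of c with q
    p = np.array([sum(q[j] * c[k - j] for j in range(min(k, M) + 1)) for k in range(L + 1)])
    return p, q

# toy "time signal": two damped complex exponentials sampled at n = 0, 1, ...
n = np.arange(64)
signal = 1.0 * np.exp((-0.05 + 2.0j) * n) + 0.5 * np.exp((-0.02 + 1.0j) * n)

p, q = pade(signal, L=1, M=2)                # two components -> exact [1/2] representation
z = np.exp(-2.0j)                            # point on the unit circle near the first pole
fpt = np.polyval(p[::-1], z) / np.polyval(q[::-1], z)
series = np.polyval(signal[::-1], z)         # plain truncated power series, for contrast
print(abs(fpt), abs(series))
```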
A FEEDBACK LINEARIZATION TECHNIQUE BY USING NEURAL NETWORKS: APPLICATION TO BIOPROCESS CONTROL Y. S. BOUTALIS* AND O. I. KOSMIDOU Automatic Control Systems Laboratory, Department of Electrical & Computer Engineering, Democritus University of Thrace, GR-67100 Xanthi, Greece
Most physical systems' operations are nonlinear in nature and hence they should be described by means of nonlinear mathematical models. Since nonlinear models are not convenient for control purposes, for both theoretical and computational reasons, they are often linearized by using appropriate exact or approximate linearization techniques [5], [4]. Among them, feedback linearization [5] is the most theoretically rigorous method; it consists in finding a feedback control law and a state variable transformation (diffeomorphism) such that the closed-loop system model becomes linear in the new coordinate variables. However, feedback linearization requires some strong constraints to be satisfied by the original nonlinear system, and thus its applicability is quite restricted. If, in addition, the original system is characterized by uncertain parameters, external disturbances and unmeasured state variables, as is the case of bioprocess control systems, the linearization problem becomes particularly complex and almost inextricable. In the present paper a method for linearizing a nonlinear system is proposed by using neural networks. More precisely, a nonlinear n-th order system is considered, in state space description; it is assumed that the system is characterized by uncertain time-varying parameters and that some of the state variables cannot be accurately measured. The feedback linearizing controller applied to the system yields a linear model in state space canonical form, of order equal to the relative degree r of the original nonlinear system. In this model, the r-th state equation is of the form
ẋ_r = f(x) + g(x) u + d,     (1)

where x is the state vector, u the control input and d an external disturbance,
* Corresponding author. Tel. +30 25410 79504, Fax +30 25410 26473.
which is expressed in terms of the controller components and, in some cases, in terms of the effect of the system's zero dynamics. Therefore, it implies an additional first-order nonlinear differential equation depending on the system's non-linearity and uncertainty. In the proposed method, this equation is approximated by an artificial neural network (ANN). Both cases of f(x) and/or g(x) being uncertain and not completely known are considered. In all cases NN approximators are used to emulate the performance of the unknown non-linear functions. A problem often arising in control problems using NNs with conventional training algorithms (e.g. Backpropagation) is that these algorithms do not guarantee that the performance of the NNs will not deteriorate the stability of the whole process. This is overcome by describing the control objective as a tracking stability problem, and the weights of the NNs are obtained on-line so that tracking stability is ensured. In this way, we ensure that the error tracking system is not going to become unstable due to the performance of the NNs. The above technique is applied to fermentation processes, such as the bioprocess of the production of Saccharomyces cerevisiae (baker's yeast), or the bioprocess related to waste water treatment. It is well known that fermentation processes are characterized by the growth of bacteria whose behavior is highly nonlinear and time-varying. Moreover, many of the process parameters are uncertain or naturally varying during the process operation. In addition, external disturbances are often present at the system input. Under these conditions it is very important to ensure the process "stability" as well as a desired level of productivity [2]. Appropriate state space non-linear models for these processes have been proposed in [7], [6]. These models are linearized by the proposed method. Simulation results are given to show the ability of multilayer ANNs to approximate the nonlinear behavior and to act as reliable emulators of the nonlinear terms of equation (1).
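To make the role of the emulators concrete, here is a small illustrative sketch (ours, not the authors' algorithm): given approximators f̂ and ĝ of the unknown terms in equation (1), the feedback-linearizing law u = (v − f̂(x))/ĝ(x) reduces the r-th state equation to ẋ_r ≈ v + d. The toy stand-ins below play the role of trained NN emulators.

```python
import numpy as np

def linearizing_control(x, v, f_hat, g_hat, g_min=1e-3):
    """Feedback-linearizing law u = (v - f_hat(x)) / g_hat(x).  The estimate
    g_hat(x) is kept away from zero, a common practical safeguard."""
    g = g_hat(x)
    if abs(g) < g_min:
        g = g_min if g >= 0 else -g_min
    return (v - f_hat(x)) / g

# toy stand-ins for the NN emulators of the unknown nonlinearities in (1)
f_hat = lambda x: -0.5 * x[0] ** 2 + 0.2 * x[1]
g_hat = lambda x: 1.0 + 0.1 * np.sin(x[0])

x = np.array([0.8, -0.3])
v = 1.5                                      # input of the desired linear dynamics
u = linearizing_control(x, v, f_hat, g_hat)
xr_dot = f_hat(x) + g_hat(x) * u             # r-th state derivative in closed loop
print(round(u, 4), round(float(xr_dot), 4))  # xr_dot equals v here (d = 0)
```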
REFERENCES
1. G. Bastin and D. Dochain, "On-line Estimation and Adaptive Control of Bioreactors", Elsevier, 1990.
2. G. Bastin and J.F. Van Impe, "Nonlinear and Adaptive Control in Biotechnology: A Tutorial", European Journal of Control, 1, 1995, pp. 37-53.
3. Y.S. Boutalis and O.I. Kosmidou, "Design of a Neural Network Observer to Control a Feed-Batch Bioprocess", European Control Conference ECC 99, Karlsruhe, Germany, 1999.
4. G.O. Guardabassi and S.M. Savaresi, "Approximate Linearization via Feedback - an Overview", Automatica, 37, pp. 1-15, 2001.
5. A. Isidori, "Nonlinear Control Systems", Springer-Verlag, 1995.
6. M. Pengov, "Applications des observateurs non linéaires à la commande des bioprocédés", Thèse de Docteur en Sciences, Université de Metz, France, 1998.
7. A. Rajab, J.-M. Engasser, P. Germain and A. Miclo, "A Physiological Model of Yeast Aerobic Growth", 3rd European Congress on Biotechnology, München, Germany, 1984.
8. D.E. Rumelhart, G.E. Hinton and R.J. Williams, "Learning Internal Representations by Error Propagation", in Parallel Distributed Processing, D.E. Rumelhart and J.L. McClelland, Eds., Cambridge, MA: MIT Press, 1986.
SOME NUMERICAL METHODS FOR STIFF PROBLEMS
J. C. BUTCHER
In the modelling of many physical systems, stiff differential equations need to be solved numerically. Because stiff problems require implicit methods, implementation costs are an important consideration in the assessment of contending algorithms. We will consider a number of alternatives to the popular backward difference methods. These standard algorithms are incapable of being both A-stable and of order greater than 2 and we therefore focus on multistage methods with special structures to guarantee low implementation costs. High order Runge-Kutta methods suffer the natural disadvantage of high-dimensionality in the algebraic systems required for stage evaluation. This is partly overcome by the use of singly-implicit methods and we will consider how far this approach can be taken. Further developments involve general linear methods and the key to finding good methods within this large family seems to be to restrict attention to methods with inherent Runge-Kutta stability.
1. Introduction
Stiff differential equation systems arise frequently in the modelling of physical and engineering problems. They are characterised by the existence of rapidly decaying components in the solution; these rapidly decaying components cannot be faithfully approximated by traditional explicit methods unless abnormal restrictions are placed on the stepsize. In contrast to this unsatisfactory situation, implicit methods sometimes possess a property known as A-stability which guarantees stable behaviour for many stiff problems. The simplest examples of an (unsatisfactory) explicit method and of a (satisfactory) implicit method are, respectively, the Euler method and the implicit Euler method. The most straightforward motivation for these is to write the change of solution over a single step as a simple approximation to the integral of the derivative of the solution. Consider a standard initial-value problem
$$X'(t) = F(t, X), \qquad X(t_0) = x_0.$$
If n - 1 steps, each of size h, have been completed to give an approximation
$$x_{n-1} \approx X(t_{n-1}) = X(t_0 + (n-1)h), \qquad (1)$$
the next approximation can be found from the formula
$$X(t_0 + nh) = X(t_0 + (n-1)h) + \int_{t_{n-1}}^{t_n} F(t, X(t))\,dt.$$
The two simplest ways of approximating the right-hand side are to replace the first term by its approximation given by (1), and the integral term by the width of the interval h multiplied by the integrand computed at either the start of the step or the end of the step. This gives the two numerical methods
$$x_n = x_{n-1} + hF(t_{n-1}, x_{n-1}) \qquad (2)$$
and
$$x_n = x_{n-1} + hF(t_n, x_n). \qquad (3)$$
These are known as the Euler method and the implicit Euler method respectively. To see why (2) is unsatisfactory when applied to a stiff problem, consider the linear differential equation system
$$X' = MX, \qquad X(0) = x_0,$$
where M is supposed to have moderately sized eigenvalues (that is, eigenvalues close to zero) and to have, in addition, further eigenvalues with very negative real parts. In practice we will want to solve non-linear problems, and it will not be possible to de-couple the problem into sub-problems in which the two eigenvalue scales can be treated separately, but for the purpose of analysis, we will take as a prototypical special case a matrix M of the form
$$M = \begin{bmatrix} 1 & 0 \\ 0 & q \end{bmatrix},$$
where q is negative with large magnitude. The exact solution is
$$X(t) = \begin{bmatrix} e^{t} & 0 \\ 0 & e^{qt} \end{bmatrix} x_0.$$
Because of the rapidly decaying nature of the solution to the second component, a numerical scheme should have its stepsize selected to model the behaviour of the first component in a faithful way, in accordance with whatever accuracy is required. In the case of the Euler method (2), the numerical approximation is
$$x_n = \begin{bmatrix} (1+h)^{n} & 0 \\ 0 & (1+hq)^{n} \end{bmatrix} x_0.$$
To approximate the first component, a stepsize h ≈ 0.1 may be a suitable choice. However, it is the second component that will cause trouble. Rather than decaying rapidly, as for the exact solution, this component of the approximation will have a magnitude which actually increases, unless |hq| ≤ 2. This is usually an unreasonable requirement. On the other hand, for the implicit form of the Euler method given by (3), the numerical approximation becomes
$$x_n = \begin{bmatrix} (1-h)^{-n} & 0 \\ 0 & (1-hq)^{-n} \end{bmatrix} x_0$$
and the second component decays in magnitude for any negative value of hq. Consideration of the behaviour of this simple type of linear problem motivates the following definition.

Definition 1. A numerical method is A-stable if the sequence of numerical approximations produced by the method, when applied to the differential equation X' = qX, is bounded whenever hq is a (possibly complex) number with negative real part.

The search for good methods for solving stiff problems is typically centred on the consideration of numerical schemes which satisfy this definition and have a reasonably high order of accuracy but, at the same time, do not impose severe computational demands. It is known that linear multistep methods cannot be A-stable and have order greater than 2. Hence we will look instead for suitable Runge-Kutta methods and at other multistage methods within the large general linear family. In Section 2 we will discuss implicit Runge-Kutta methods as possible contenders and in Section 3 we introduce a promising new class of general linear methods.
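A minimal numerical sketch of this contrast, written directly from formulas (2) and (3) applied to the scalar test problem X' = qX (the eigenvalue, stepsize and number of steps below are arbitrary choices of ours):

```python
q = -1000.0          # stiff eigenvalue with large negative real part
h = 0.1              # stepsize chosen to resolve only the slow behaviour
x_explicit = x_implicit = 1.0

for n in range(10):
    x_explicit = x_explicit + h * q * x_explicit   # (2): multiplies by (1 + hq) = -99
    x_implicit = x_implicit / (1.0 - h * q)        # (3): multiplies by 1/(1 - hq) = 1/101

print(x_explicit)   # grows without bound, since |1 + hq| > 1
print(x_implicit)   # decays towards zero, as the A-stability definition requires
```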
2. A-stable implicit Runge-Kutta methods
Runge-Kutta methods are usually represented by a coefficient tableau such as
In taking a step from t_{n-1} to t_n we need to compute approximate solutions at t_{n-1} + hc_i, where c_1 = 1/2 - √3/6 and c_2 = 1/2 + √3/6. This explains the numbers in the first column of the tableau. The approximations at these "off-step" points are given by
$$X(t_{n-1} + hc_i) \approx X(t_{n-1}) + h\,a_{i1} X'(t_{n-1} + hc_1) + h\,a_{i2} X'(t_{n-1} + hc_2),$$
where
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
and X', at the points where it is required, are found from the differential equation being solved. This explains more of the numbers in the tableau for this method. The final row in the tableau is related to the formula for the solution at the end of the step, which is given by the approximation
$$X(t_n) \approx X(t_{n-1}) + h\,b_1 X'(t_{n-1} + hc_1) + h\,b_2 X'(t_{n-1} + hc_2),$$
with b_1 = b_2 = 1/2. This particular implicit Runge-Kutta method has two stages because it uses approximations at this many points in order to complete the step. The cost of implementing the method goes up rapidly with s, the number of stages. The cost is also related to the structure of A. We present two further methods for comparison. These are respectively
and
It is clear that the method with tableau (5) has definite advantages over (4) because the two stages can be evaluated in sequence, rather than as components of a larger coupled system. On the other hand, (4) has order of accuracy 4, rather than just 2 in the case of (5). The coefficient matrix A for the fully coupled method given by (6) has the same eigenvalues as for the decoupled method (5). By carrying out the computation in the most efficient way, this method can be implemented with a similar cost to (5).
Moreover (6) has computational advantages related to the fact that each of the stages can be computed with a similar accuracy as the final output result. A-stable methods such as (6) exist for quite high orders but they suffer their own disadvantages for increasingly high orders. However, some of these difficulties can be partly overcome using a generalization known as "effective order".
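To illustrate why a lower triangular coefficient matrix with equal diagonal entries keeps implementation costs low, the sketch below advances the scalar test problem with a standard two-stage singly diagonally implicit (SDIRK) method of order 2. These particular coefficients are a common textbook choice of ours for illustration, and are not claimed to be the tableaux (4)-(6) discussed above.

```python
import numpy as np

gamma = 1.0 - np.sqrt(2.0) / 2.0   # diagonal entry of a standard 2-stage, order-2,
                                   # A-stable SDIRK method (Alexander-type)

def sdirk2_step(x, h, q):
    """One step for x' = q x.  Because A is lower triangular with equal diagonal
    entries, each stage needs only a solve with the same factor (1 - h*gamma*q)."""
    k1 = q * x / (1.0 - h * gamma * q)
    k2 = q * (x + h * (1.0 - gamma) * k1) / (1.0 - h * gamma * q)
    return x + h * ((1.0 - gamma) * k1 + gamma * k2)

x, h, q = 1.0, 0.1, -1000.0
for _ in range(10):
    x = sdirk2_step(x, h, q)
print(x)   # stays bounded and decays for this stiff test problem
```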
3. A-stable general linear methods with the IRKS property

General linear methods constitute an enormous family of computational schemes. Not only do they have a multiplicity of stages but they also have a multiplicity of values passed from one step to the next. Instead of representing the method in terms of two principal arrays, A and b^T, as in a Runge-Kutta method, we can now use four matrices which we denote by A, U, B and V. It is convenient to arrange these as a partitioned (s + r) × (s + r) matrix, where s is the number of stages and r is the number of values computed in a step and passed on to the next step. Thus we use a tableau of the form
$$\begin{bmatrix} A & U \\ B & V \end{bmatrix}$$
to represent one of these methods. In the interests of efficient computation, the s × s submatrix A should be chosen to be lower triangular; furthermore we would like the diagonal elements to be equal. In addition to these requirements, we would like the method to be A-stable and of high order. High order alone is often not enough for stiff problems; the stages also need to be highly accurate as approximations to the corresponding off-step values. It has been discovered recently that all of these conditions can be met by imposing a restriction known as "inherent Runge-Kutta stability". This guarantees, amongst other benefits, that the stability of the method can be looked at in just the same way as for Runge-Kutta methods. It is possible to derive A-stable methods of quite high orders within this family. What is not known is how to select from the wide range of methods that exist with these properties a particular choice in the most propitious way. It is also not known how to overcome some of the technical difficulties and complications that are thrown up in implementing these new methods. In spite of these complications, reasonable solutions are already known to many of these questions. But improvements are undoubtedly possible and a rich set of opportunities exists for further investigation.
PERIOD TWO TRICHOTOMY ON $x_{n+1} = \dfrac{\alpha + \gamma x_{n-1} + \delta x_{n-2}}{x_n + x_{n-2}}$
E. CAMOUZIS Department of Mathematics, University of the Aegean, Karlovasi, Samos, Greece
R. DEVAULT Northwestern State University, Natchitoches, LA 71497, USA G. PAPASCHINOPOULOS Democritus University of Thrace, Department of Electrical and Computer Engineering, GR-67100 Xanthi, Greece
Our aim in this paper is to investigate the boundedness, global stability and periodic character of solutions of the difference equation
$$x_{n+1} = \frac{\alpha + \gamma x_{n-1} + \delta x_{n-2}}{B x_n + D x_{n-2}}, \qquad n = 0, 1, \ldots,$$
where the parameters $\alpha, \gamma, \delta, B, D$ and the initial conditions are nonnegative real numbers and $B x_n + D x_{n-2} > 0$ for all n.
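A direct simulation of the recurrence makes the boundedness and period-two questions easy to explore numerically; the parameter values and initial conditions below are illustrative choices of ours, not cases analysed in the paper.

```python
# Hypothetical parameter values, chosen only for illustration; the paper's
# trichotomy concerns the relation among alpha, gamma, delta, B and D.
alpha, gamma, delta, B, D = 1.0, 0.5, 0.5, 1.0, 1.0
x = [1.0, 2.0, 3.0]          # x_{-2}, x_{-1}, x_0: nonnegative initial conditions

for n in range(200):
    # x_{n+1} = (alpha + gamma*x_{n-1} + delta*x_{n-2}) / (B*x_n + D*x_{n-2})
    x_next = (alpha + gamma * x[-2] + delta * x[-3]) / (B * x[-1] + D * x[-3])
    x.append(x_next)

print(x[-6:])                # inspect the tail for boundedness / period-two behaviour
```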
LEAST SQUARES FITS OF POTENTIALS BY USING ENERGY AND GRADIENT DATA : VIBRATIONAL ANHARMONIC SPECTRA FOR H2CO FROM DFT CALCULATIONS
P. CARBONNIERE, D. BEGUE, A. DARGELOS AND C. POUCHAN
Laboratoire de Chimie Structurale, UMR 5624 I.F.R., Rue Jules Ferry, 64000 PAU, FRANCE
E-mail:
[email protected]
We present a least-squares fitting procedure to obtain a quartic force field by using energy and gradient data arising from B3LYP/cc-pVTZ calculations on a "simplex-sum" of Box and Behnken grid of points. We illustrate and test for H2CO the quartic force field and the resulting vibrational anharmonic spectra obtained from 44 simplex-sum configurations, and we compare our results to those obtained by using the classical 168 energy calculations.
1. Introduction

The construction of an accurate complete quartic force field is an important stake in the theoretical studies of vibrational anharmonic spectra. Generally, the conventional approach consists of accumulating a great number of energy data resulting from ab initio calculations carried out on a grid of points representing the geometrical variations, and of deducing from them the analytical potential function by using a standard least squares method [1]. When the size of the molecule increases, the number of data points required in this latter approach increases dramatically, limiting its use to systems of four or five atoms. This limitation has recently stimulated new developments intended to obtain accurate quartic force fields for larger molecules. Among these methods, two approaches using energy, gradient and Hessian data have recently been reported in the literature [2,3], including an extended least squares procedure consisting of jointly fitting all the data obtained on the grid. In this work we present the procedure implemented in our code REGRESS EGH [4] for determining the analytical form of the potential with a reduced number of points suitably chosen in the grid following a simplex-sum design known for its efficiency and accuracy. The illustration and tests presented here concern the formaldehyde molecule.
2. Method

The extended least squares method taking into account the first and second derivatives, presented elsewhere [5,6], leads to an extended merit function defined by:
in which $V(s_1, s_2, \ldots, s_{N_v})$ corresponds to the model of the potential function, which can be written as a Taylor expansion in terms of displacement coordinates, whose expansion coefficients are to be determined. $V'_a(s_1, s_2, \ldots, s_{N_v})$ and $V''_{ab}(s_1, s_2, \ldots, s_{N_v})$ are respectively the first and second derivatives of the model with respect to the $s_a$ and $s_b$ displacement coordinates, and $N_v$ is the number of vibrations. $E_m$, $(\partial E/\partial s_a)_m$ and $(\partial^2 E/\partial s_a \partial s_b)_m$ are respectively the values of the energy, gradient and Hessian terms analytically obtained by ab initio calculations at each point m of the grid.
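The idea of mixing energy and gradient data in a single least-squares system can be illustrated on a one-dimensional toy problem; the sketch below is our own minimal example (a quartic in a single coordinate with synthetic data), not the REGRESS EGH procedure itself.

```python
import numpy as np

# Joint energy + gradient least-squares fit of a one-dimensional quartic model
# V(s) = c0 + c1 s + c2 s^2 + c3 s^3 + c4 s^4.  Grid and data are synthetic.
s = np.linspace(-0.4, 0.4, 9)
c_true = np.array([0.0, 0.0, 1.2, -0.3, 0.8])
E = np.polyval(c_true[::-1], s)                       # energies on the grid
G = np.polyval(np.polyder(c_true[::-1]), s)           # analytical gradients

rows_E = np.vander(s, 5, increasing=True)             # basis [1, s, s^2, s^3, s^4]
rows_G = np.column_stack([np.zeros_like(s), np.ones_like(s),
                          2*s, 3*s**2, 4*s**3])       # d/ds of the same basis
A = np.vstack([rows_E, rows_G])                       # both data types in one system
b = np.concatenate([E, G])
c_fit, *_ = np.linalg.lstsq(A, b, rcond=None)
print(c_fit)                                          # recovers c_true
```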
3. Example of the (E-G) method: application and tests for H2CO

In this case the expression of the merit function is truncated as R = R_E + R_G. We have shown [6] that the simplex-sum grid of Box and Behnken [7] truncated to the third sum seems well suited to the resolution of this problem by fulfilling the criteria of efficiency and accuracy. This grid leads to the number of computations reported in Table 1 with regard to the number of variables s_i for the (E-G) method.
Table 1. Number of calculations with the (E-G) method with a simplex-sum design truncated to the third sum. Comparison with a usual (E) linear regression procedure (results obtained without symmetry consideration).

N_var   N_term   (E) N_point   (E) Redundancy   (E-G) N_point   (E-G) Redundancy   Gain
  3         35          70          2                 15              1.7           4.7
  6        210         420          2                 64              2.1           6.6
  9        715        1430          2                176              2.5           8.1
 12       1820        3640          2                378              2.7           9.6
 15       3876        7752          2                697              2.9          11.1
 18       7315       14630          2               1160              3.0          12.6
 21      12650       25300          2               1794              3.1          14.1
 24      20475       40950          2               2626              3.2          15.6
The use of the (E-G) method results in significant gains that increase with the size of the system studied, since the number of calculations is divided by a factor of about 5 for a triatomic molecule and close to 16 for a molecule containing 10 atoms. Nevertheless, the computational cost of the analytical gradient is added to that of the energy at each calculated point of the potential grid. Consequently, the gain in terms of time appears less important. The computational costs of the H2CnO series using the B3LYP/6-311G* method, estimated in the two ways, show that in spite of the time limitations due to calculation of the gradient, the gain remains satisfying since it rises by a factor of 3 to 13 when n increases from 1 to 7, illustrating the degree of effectiveness of the (E-G) method. For H2CO a comparison of the quartic fit based on 168 energy data with that based on 44 (energy + gradient) data shows a higher accuracy for the former. The RMS errors are respectively 0.05 cm⁻¹ and 0.16 cm⁻¹ for the determination of the 84 non-null parameters expressed with respect to dimensionless normal coordinates. It is worth noticing however that the mean dispersions between the cubic and the quartic force constants obtained from the two ways are found to be 0.1 and 0.9 cm⁻¹ respectively. We have reported in Table 2 the wavenumbers calculated for H2CO from the two quartic force fields using a perturbative approach. The agreement is remarkably good for the fundamental bands (mean difference = 0.8 cm⁻¹) and for single combinations (mean difference = 1.5 cm⁻¹). This mean difference does not exceed 3 cm⁻¹ for all low-lying overtones. To illustrate this good agreement we show in Figure 1 the difference obtained for all bands calculated from the two ways up to
12 000 cm⁻¹. It should be noted that the mean difference is only 0.85 cm⁻¹ in the medium IR region with a maximum dispersion of 4 cm⁻¹. This mean difference increases reasonably in the near IR region, justifying the effectiveness of the (E-G) method to calculate an accurate quartic force field. Table 2. Wavenumbers for H2CO calculated by 2nd order Perturbation Theory from the B3LYP/cc-pVTZ quartic force field. Comparison between the (E-G) method and the (E) method.
Values in cm⁻¹, given as (E) / (E-G).

Fundamentals νi (mean difference: 0.8 cm⁻¹):
ν1: 1197 / 1197    ν2: 1256 / 1256    ν3: 1521 / 1521
ν4: 1798 / 1798    ν5: 2734 / 2738    ν6: 2772 / 2773

Overtones 2νi (mean difference: 2.5 cm⁻¹):
2ν1: 2411 / 2411   2ν2: 2514 / 2514   2ν3: 3044 / 3044
2ν4: 3575 / 3575   2ν5: 5412 / 5399   2ν6: 5469 / 5468

Overtones 3νi (mean difference: 2.8 cm⁻¹):
3ν1: 3641 / 3640   3ν2: 3774 / 3776   3ν3: 4571 / 4570
3ν4: 5333 / 5331   3ν5: 8003 / 7997   3ν6: 8093 / 8087

Combinations νi + νj (mean difference: 1.5 cm⁻¹):
ν1+ν2: 2445 / 2445   ν1+ν3: 2709 / 2709   ν2+ν3: 2775 / 2774
ν1+ν4: 2988 / 2987   ν2+ν4: 3047 / 3047   ν3+ν4: 3309 / 3309
ν1+ν5: 3926 / 3929   ν1+ν6: 3955 / 3956   ν2+ν5: 3985 / 3988
ν2+ν6: 4022 / 4021   ν3+ν5: 4247 / 4251   ν3+ν6: 4284 / 4283
ν4+ν5: 4537 / 4542   ν4+ν6: 4578 / 4577   ν5+ν6: 5370 / 5376
Figure 1. Difference between vibrational bands calculated from the (E-G) method and the (E) method, up to 12000 cm⁻¹.
References
1. G. C. Schatz, in Reaction and Molecular Dynamics (Proceedings of the European School on Computational Chemistry, Perugia, Italy, July 1999), edited by A. Lagana and A. Riganelli (Springer, New York, 2000).
2. C. S. Ewig, R. Berry, U. Dinur, J. R. Hill, M. J. Hwang, H. Li, C. Liang, J. Mapple, Z. Peng, T. P. Stockfisch, T. S. Thacher, L. Yan, X. Ni, A. T. Hagler, J. Comput. Chem. 22, 1782 (2000).
3. T. Xie, J. M. Bowman, J. Chem. Phys. 117, 10487 (2002).
4. REGRESS EGH, Ph. Carbonnière, D. Bégué, A. Dargelos, C. Pouchan, Laboratoire de Chimie Théorique et Physico-Chimie Moléculaire, UMR CNRS 5624, 2001.
5. J. R. Maple, M. J. Hwang, T. P. Stockfisch, U. Dinur, M. Waldman, C. S. Ewig, A. T. Hagler, J. Comput. Chem. 15, 162 (1994).
6. Ph. Carbonnière, D. Bégué, A. Dargelos, C. Pouchan, J. Chem. Phys. (to be published).
7. G. E. P. Box, D. W. Behnken, Ann. Math. Stat. 31, 838 (1960).
DERIVING PREDICTION INTERVALS FOR NEUROFUZZY NETWORKS
G. CASTELLANO, A.M. FANELLI AND C. MENCAR
CILAB - Computational Intelligence LABoratory, Department of Computer Science, University of Bari, v. E. Orabona, 4 - 70126 - Bari, ITALY
E-mail: {castellano, fanelli, mencar}@di.uniba.it
In this paper, we describe a method to calculate prediction intervals for neuro-fuzzy networks used as predictive systems. The method also allows defining prediction intervals for the fuzzy rules that constitute the rule base of the neuro-fuzzy network, resulting in a more readable knowledge base. Moreover, the method does not depend on a specific architecture and can be applied to a variety of neuro-fuzzy models. An illustrative example is reported to show the validity of the proposed approach.
1. Introduction

A neuro-fuzzy network is a fuzzy system that uses a learning algorithm derived from or inspired by neural network theory to determine its parameters (fuzzy sets and fuzzy rules) by processing data samples [1]. Several neuro-fuzzy networks exist in the literature, and most of them acquire an input/output mapping from data in the form of fuzzy rules that can be used for prediction or classification. However, in many real-world problems, data are corrupted by noise or follow complex relationships that are hardly discovered by simple models; in such cases, a model that provides a prediction interval as output rather than a single numerical value is more desirable. Many methods have been proposed for estimating confidence intervals and prediction intervals for neural networks [2], but many of such techniques are dependent on the specific neural architecture. As a consequence, such techniques are not applicable to different models, like neuro-fuzzy networks. In this paper, we propose an approach to derive prediction intervals for neuro-fuzzy networks so that the system provides an estimate of the uncertainty associated with predicted output values. The proposed method does not depend on the specific architecture, so it can be applied to a variety of models. In addition, it can be applied to each rule of the neuro-fuzzy network, resulting in a more readable knowledge base.
2. A neuro-fuzzy network

The neuro-fuzzy network used as pilot model for prediction interval calculation is the Adaptive Network-based Fuzzy Inference System (ANFIS) of order 0 [3], as depicted in Figure 1.
Figure 1. The 0th order ANFIS network (one output only)
The rule set that is expressed by such a model follows the schema:
$$R_i: \ \text{IF } x \text{ is } A_i \ \text{THEN } y_1 = a_{i1}, \ldots, y_m = a_{im} \qquad (1)$$
where x is an input variable defined over R^n, A_i are n-dimensional fuzzy sets, y_j are output variables and a_{ij} are constants in R. For simplicity, we assume that m = 1, that is, the system has only one output. For m > 1 the extension of the proposed method is straightforward. When an input vector is presented to the network, the system output is defined by the following inference rule:
$$\hat{y}(x) = \frac{\sum_{i=1}^{R} \mu_{A_i}(x)\, a_{i1}}{\sum_{i=1}^{R} \mu_{A_i}(x)} \qquad (2)$$
where the rule strength μ_{A_i} coincides with the membership value of input x to the fuzzy set A_i, for each of the R rules that define the rule set. The ANFIS network can be trained in several ways, for example through backpropagation or second order methods [3]. At the end of the training process, the parameters of the network - namely the parameters of the fuzzy membership functions and the consequent constants - are adjusted so as to minimize a given loss function, usually the Mean Squared Error.
3. Prediction interval derivation
Here we describe the proposed method to derive prediction intervals for the neuro-fuzzy system outputs. Suppose that there exists a supervisor that provides, for each input x, the exact value g of an underlying unknown function g = g(x). We can assume that the supervisor is memory-less, that is, the returned value is independent of the previously calculated values. A neuro-fuzzy system, like the previously described model, is trained so as to provide an approximated value ŷ(x) for the same input x. The following error function can be defined:
$$e(x) = \hat{y}(x) - g(x) \qquad (3)$$
Suppose that a finite training set T = {(x_p, g_p) ∈ R^{n+1}, p = 1, 2, ..., N} is available for training. The examples of the training set are assumed to be drawn independently from a single source, so as to be identically distributed. After training, a finite number of errors e_1, e_2, ..., e_N are available. Such errors can be considered as independent and identically distributed random variables; hence their mean value
$$\bar{e} = \frac{1}{N}\sum_{i=1}^{N} e_i \qquad (4)$$
is a random variable that approaches the Normal distribution for increasing N. For a normally distributed random variable, prediction intervals can be calculated. A prediction interval [L_α, U_α] of confidence level α represents an interval that will include the error e_new of a newly drawn example x_new with probability greater than 1 − α [4]. Formally:
$$P\left(e_{new} \in [L_\alpha, U_\alpha]\right) \geq 1 - \alpha \qquad (5)$$
The last relation is equivalent to the following:
$$P\left(g \in [\hat{y} - U_\alpha,\; \hat{y} - L_\alpha]\right) \geq 1 - \alpha \qquad (6)$$
Relation (6) defines a statistical method to estimate the true value of the underlying function approximated by the neuro-fuzzy network, with a desirable confidence level. Formally, a prediction interval is defined by the following relations:
where t_{α/2, N-1} is the value of the Student distribution with N − 1 degrees of freedom corresponding to the critical value α/2, and s is the sampled standard deviation:
$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(e_i - \bar{e}\right)^2} \qquad (9)$$
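The following sketch shows how intervals of this kind can be computed from the training residuals; the sqrt(1 + 1/N) factor in the half-width is the standard one-sample prediction-interval construction and is our assumption, since the displayed relations (7)-(8) are not reproduced here.

```python
import numpy as np
from scipy import stats

def prediction_interval(errors, alpha=0.05):
    """Interval [L_alpha, U_alpha] expected to contain the error of a newly
    drawn example with probability at least 1 - alpha."""
    e = np.asarray(errors, dtype=float)
    N = e.size
    e_bar = e.mean()
    s = e.std(ddof=1)                              # sampled standard deviation, eq. (9)
    t = stats.t.ppf(1.0 - alpha / 2.0, df=N - 1)   # Student critical value
    half = t * s * np.sqrt(1.0 + 1.0 / N)          # assumed prediction-interval half-width
    return e_bar - half, e_bar + half

# Hypothetical residuals of the trained network on its N = 25 training examples
L_a, U_a = prediction_interval(np.random.normal(0.0, 0.1, size=25), alpha=0.10)
```

Given a new network output ŷ, the true value is then estimated to lie in [ŷ − U_α, ŷ − L_α] as in relation (6).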
The width of the prediction interval is directly related to the model accuracy. As a consequence, the less accurate the model, or the smaller the training set cardinality, the wider the prediction interval. In the proposed approach, the prediction intervals are derived on the basis of the approximated outputs inferred by the neuro-fuzzy network. However, for representation purposes, prediction intervals can be calculated for each rule consequent, resulting in the following representation:
$$\text{IF } x \text{ is } A_i \ \text{THEN } y \in [a_i - U_\alpha,\; a_i - L_\alpha] \quad (1 - \alpha) \qquad (10)$$
The calculation of prediction intervals for the fuzzy rules leads to a more explanatory knowledge base, since it helps users to perceive an entire range of validity of each rule, instead of a single numerical value. It should be noted, however, that such intervals are used only for rule representation, while the derivation of the prediction interval for the network output must follow relation (6).

4. Illustrative example

In order to validate the proposed method, we trained a 0th order ANFIS based on 2-dimensional data. The available dataset was split into a training set of 25 examples and a test set of 26 examples. The training dataset and the input/output mapping defined by the trained ANFIS are depicted in Figure 2. Then,
prediction intervals are calculated for α = 10%, 1%, 0.5%, 0.1% and are illustrated in Figure 3.
Figure 2: The training dataset (circles) and the input/output mapping defined by the neuro-fuzzy system (line)
Figure 3: Prediction intervals of the input/output mapping for four confidence levels (10%, 1%, 0.5%, 0.1%)
For each value of the confidence level, the test set has been used to check whether the desired outputs fall within the prediction interval. The results are reported in
Table 1. As it can be seen, the number of desired outputs that fall outside the prediction interval is coherent with the confidence level. In the design phase, the choice of the appropriate value of the confidence level should be a tradeoff between precision of the mapping (narrow intervals) and good predictive estimation (large intervals).
Table 1: Prediction errors with different confidence levels

Confidence level    Examples outside the prediction interval
10%                 2 (7.69%)
1%                  1 (3.85%)
0.05%               0
0.01%               0
5. Conclusions
In this paper we have proposed a method to assign prediction intervals to neuro-fuzzy networks used as predictive systems. The method is quite general and can be applied to a variety of neuro-fuzzy models. Moreover, the method can be used to define fuzzy rules with interval-valued consequents, which result in a more readable knowledge base. Future research is in the direction of defining prediction intervals that are functionally related to the input, so as to make the interval width related to model accuracy.

References
1. D. Nauck, F. Klawonn, and R. Kruse, Foundations of Neuro-Fuzzy Systems, Wiley, Chichester (1997).
2. R. Dybowski, S. Roberts, Confidence intervals and prediction intervals for feed-forward neural networks, in Dybowski R., Gant V. (eds.) Clinical Applications of Artificial Neural Networks, Cambridge: Cambridge University Press, pp. 298-326 (2001).
3. J. S. R. Jang, ANFIS: Adaptive-network-based fuzzy inference systems, IEEE Trans. on Systems, Man and Cybernetics, Vol. 23, No. 3, pp. 665-685 (1993).
4. J. Neter, W. Wasserman, and M. H. Kutner, Applied linear statistical models: Regression, analysis of variance, and experimental designs, Homewood, IL: Irwin (1985).
MARANGONI EFFECTS IN A HORIZONTAL SOLIDIFICATION PROCESS IN MICROGRAVITY *
M. M. CERIMELE, D. MANSUTTI AND F. PISTELLA
Istituto per le Applicazioni del Calcolo "M. Picone" (C.N.R.), Viale del Policlinico, 137, 00161 Roma, Italy
E-mail: [email protected]
The present work was stimulated by the need to plan solidification experiments on a space platform and the general aim is to provide data from numerical simulation to be used to set up properly the experimental apparatus. The material typically used in this kind of experiment is succinonitrile (SCN), for it assembles the characteristics of semi-conductor materials and metals while being transparent and allowing easy observation of the flow structures, phase front and deformations. We shall present the numerical simulation of the horizontal Bridgman solidification process where, at the open top of the parallelepipedic crucible, the shear stress induced by the surface tension gradients at the air/melt interface (Marangoni effect) is taken into account. In order to better understand the mechanism of interaction between the buoyancy and the thermocapillary forces we shall compare the numerical results obtained for SCN with those from the simulation of the solidification of silicon, a material characterized by negligible surface tension. The mathematical model here adopted describes the flow of the liquid phase - considered as an incompressible Newtonian fluid -, the heat transport phenomena within the whole sample and the evolution of the phase front. The stream-function/vorticity formulation for the flow of the liquid phase and the front-fixing treatment of the moving phase front are used. The numerical approximation is based upon a second order ENO scheme combined with a second order time scheme. The validation of the mathematical
*This work has been developed within the project "Environmental Processes" at Istituto per le Applicazioni del Calcolo "M. Picone" (C.N.R.).
and numerical models versus physical direct observation was discussed in for a melting experiment and in for the present experimental setting in a full gravity environment. Here we shall compare streamlines and isothermal lines at thermodynamical equilibrium in microgravity and g-jittering (oscillatory microgravity) fields versus the case in full gravity field. As referencing experimental setup we adopt the one described in but for the top of the crucible that in our case is assumed to be open. In the horizontal Bridgman technique for artificial crystal growth the molten material is placed in a (idealized) shallow parallelepipedic crucible; the crystal grows as a device moves from one extremum to the other on the crucible, by taking the heat out and carrying along the solidification front. During this process a convective flow starts within the melt due to the horizontal temperature gradient. In the following we detail the equations governing the evolution of the material. The unsteady Navier-Stokes equations for incompressible fluids are adopted to represent the melt dynamics:
$$\rho_L\left(\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right) = -\nabla p + \mu\nabla^2\mathbf{v} - \rho_L\left[1 - \alpha\,(T_L - T_P)\right]\mathbf{g} \qquad (1)$$
Here v is the velocity field, p is the pressure, T_L is the temperature of the melt, g is the acceleration due to gravity, μ is the dynamic viscosity, α is the volumetric thermal expansion coefficient, c_L is the specific heat and k_L is the thermal conductivity of the melt. The spatial and temporal derivatives are meant with respect to the quadruple (x, y, z, t), where (x, y, z) varies within the domain Ω_L^t ⊂ R³ occupied by the melt and t ≥ t_0. The above equations are built on the basis of the Boussinesq approximation: ρ_L is the density at the melting point and within the buoyancy term the density ρ is supposed to vary according to the linear law ρ = ρ_L[1 − α(T_L − T_P)], where α and T_P are respectively the thermal expansion coefficient and the melting temperature. In the solid phase only heat diffusion occurs, which is governed by the equation
$$\rho_S c_S \frac{\partial T_S}{\partial t} = k_S \nabla^2 T_S \qquad (4)$$
where T_S, ρ_S, c_S and k_S are respectively the temperature, the density, the specific heat and the thermal conductivity of the solid. The above equation holds on Ω_S^t, that is the spatial domain occupied by the solid (Ω_S^t ⊂ R³), for t ≥ t_0. On the phase front, that is the surface separating the domain of the melt Ω_L^t and the domain of the solid Ω_S^t, the equation
holds, where ℓ is the latent heat characteristic of the material, the derivatives are meant to be with respect to the outer normal to the front in Ω_S^t and n is the corresponding normal unit vector. Equation (5) is the Stefan condition which expresses the balance of the energy exchanged between the two phases, whose difference has to be equal to the latent heat. The above equations are completed by the following boundary conditions: a condition on the velocity field of the melt is required for the momentum equation (1); we impose
v = 0    (no-slip)
at the walls of the crucible (impermeability) and also at the phase front. At the open top of the crucible (within the restriction of flat (horizontal) free-surface), we impose
$$\frac{\partial u}{\partial y} = -\,\frac{\partial \sigma}{\partial T}\,\frac{\partial T_L}{\partial x}, \qquad v = 0$$
where x and y are the coordinates respectively along the horizontal and the vertical directions, u and v are respectively the horizontal and vertical components of the velocity field of the melt and ∂σ/∂T is the temperature gradient of the surface tension accounting for the thermocapillary effects.
the energy equations for the melt (3) and the solid (4) require conditions on the temperature fields or their normal derivative on the whole boundary. As we suppose that the material is pure, the phase-change temperature is a sharp value that we call Tp. Then we impose
T_L = T_H    (T_H ≥ T_P, superheating)
at the vertical wall of the crucible in Ω_L^t;
T_L = T_S = T_P    (phase-change temperature)
at the phase boundary;
T_S = T_C    (T_C < T_P, subcooling)
at the vertical wall of the crucible in Ω_S^t;
∂T_L/∂n = 0,  ∂T_S/∂n = 0    (adiabatic wall)
at the horizontal crucible walls (open top case included). The initial conditions in our simulations will describe a portion of material in solid phase at temperature T_S(x, y, z, t_0) < T_P and the remaining material as melt, at temperature T_L(x, y, z, t_0) > T_P.
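As a loose, one-dimensional caricature of how the Stefan condition drives the front, the sketch below advances a planar front with a speed proportional to the jump in conductive flux across it; the flux-jump form and all numerical values are our own illustrative assumptions, not the succinonitrile data or the front-fixing ENO scheme used by the authors.

```python
# Illustrative material constants and gradients only (not SCN data).
k_L, k_S = 0.22, 0.22             # thermal conductivities of melt and solid
rho_S, latent = 1.0e3, 4.6e4      # solid density and latent heat
dTdn_L, dTdn_S = -150.0, -900.0   # normal temperature gradients at the front

def front_speed(grad_solid, grad_liquid):
    # Assumed flux-jump form of the Stefan balance: the difference of the
    # conductive fluxes across the front is spent as latent heat.
    return (k_S * grad_solid - k_L * grad_liquid) / (rho_S * latent)

x_front, dt = 0.0, 0.1
for _ in range(100):
    x_front += dt * front_speed(dTdn_S, dTdn_L)   # explicit front advance
print(x_front)
```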
Acknowledgments
Beside the institutional funding from C.N.R., the development of this work was made possible by the financial support of A.S.I. within the project "Simulazione numerica per lo studio della convezione naturale in processi di solidificazione", 2002-2003.
References
1. M. M. Cerimele, D. Mansutti, F. Pistella, Computers and Fluids 31, 437 (2002).
2. M. M. Cerimele, D. Mansutti, F. Pistella, Mathematics in Industry 1, 197 (2002).
3. G. H. Yeoh, G. de Vahl Davis, E. Leonardi, H. C. de Groh III and M. A. Yao, J. of Crystal Growth 173, 492 (1997).
SIMULATION OF INCOMPRESSIBLE FLOWS ACROSS MOVING BODIES USING MESHLESS FINITE DIFFERENCING
C. S. CHEW, K. S. YEO AND C. SHU
Department of Mechanical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Republic of Singapore
E-mail: engp031I@nus.edu.sg,
[email protected]
Computational fluid dynamics (CFD) is increasingly becoming the tool of choice for researchers and engineers facing complex fluid flow problems. One area of CFD that has attracted significant attention in recent years is the development of meshless or mesh-free methods [1]. These methods have so far been used in many applications like fracture mechanics, astrophysics and fluid flow. The key advantage of meshless methods over mesh-based methods like finite volume (FV) and finite element (FE) methods is their non-requirement of pre-specified connectivity between nodes in order to derive approximation and interpolation. This leads to the ability of meshless finite difference (FD) methods to treat irregular domains and boundaries. A direct result of this is another advantage of mesh-free methods: their adaptability for problems involving large movement or wholesale motion of boundaries or embedded bodies. In this area, they have the edge over composite or overset grid methods in being able to use only one coordinate frame. A convecting meshless scheme for boundary driven fluid flow is described. This scheme can be applied to self-propulsion problems, or, in general, flows involving motion of rigid or deformable bodies embedded within the flow. The generalized finite difference method (GFD) [2] was chosen for this scheme. A meshless cloud of nodes is embedded around the body surface along with the structured Cartesian background nodes for the entire flow field. By incorporating a subdomain for each node and its neighbours, the use of Taylor series expansion enables the derivatives to be found. Two forms of incompressible momentum equations are used: the standard Eulerian form in primitive variables for the structured nodes and the Arbitrary Lagrangian-Eulerian (ALE) form for the meshless cloud of nodes. GFD with moving least squares approximation is used for spatial discretization of the convecting meshless nodes. The projection method with Crank-Nicolson discretization is used to achieve second order accurate time discretization. Following prescribed body motion or fluid structure interaction, the meshless cloud acquires new locations in the next time step and the momentum equations in ALE form can then be solved.
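The core approximation step, recovering derivatives at a node from a scattered cloud of neighbours via a Taylor expansion, can be sketched as below; this is an unweighted, first-order illustration of the idea only, not the weighted moving least squares GFD scheme of the paper.

```python
import numpy as np

def gfd_derivatives(x0, neighbours, f_vals, f0):
    """Least-squares estimate of (df/dx, df/dy) at x0 from a scattered cloud,
    using f(x_i) - f(x0) ~ (x_i - x0) . grad f."""
    dX = neighbours - x0                 # displacement of each neighbour node
    rhs = f_vals - f0                    # differences of the field values
    grad, *_ = np.linalg.lstsq(dX, rhs, rcond=None)
    return grad

x0 = np.array([0.0, 0.0])
pts = np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.05], [0.05, -0.1], [0.08, 0.08]])
field = lambda p: 2.0 * p[:, 0] + 3.0 * p[:, 1]      # test field with gradient (2, 3)
print(gfd_derivatives(x0, pts, field(pts), 0.0))     # recovers (2, 3)
```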
Test cases were performed to verify code usability and method accuracy and convergence. The simulation of the decaying vortex has been a popular benchmark for the gauging of accuracy [3]. Basically the exact solutions of the time-dependent function values are known and both residues and absolute errors were calculated. Solving using different grid sizes proved the present method to be of second order spatial accuracy. The meshless method was then tried on the driven cavity flow at Reynolds number 1000. The results compared well with popular benchmark results [4]. In considering the effects of implementing the convecting meshless nodes across a normal flow field, the decaying vortex was again simulated, with a moving patch of nodes within the Cartesian field. The residues were well controlled and the errors were not significantly increased. Finally, a couple of external flow cases were simulated, basically a flow past a stationary bluff body and a flapping ellipse inside an enclosed body of fluid. Future work in the pipeline includes writing a parallel code for larger scale simulation and simulating oscillating cylinders and flapping of elliptic aerofoil(s).
TOPICS + KEYWORDS: Computational fluid mechanics, meshless, generalized finite difference, moving boundary, Arbitrary Lagrangian-Eulerian, incompressible, Navier-Stokes equations, bluff body, fractional step.
PREFERENCE: Oral
References
1. T. Belytschko, Y. Krongauz, D. Organ, M. Fleming, P. Krysl, Comput. Methods Appl. Mech. Engrg. 139, 3 (1996)
2. N. Perrone, R. Kao, Computers and Structures 5, 45 (1975)
3. D. Kim and H. Choi, J. Comput. Phys. 162, 411 (2000)
4. U. Ghia, K. N. Ghia and C. T. Shin, J. Comput. Phys. 48, 387 (1982)
External flow across ellipse, u velocity contour and v velocity contour.
Vorticity plot and streamlines plot of flapping ellipse within confined space.
AUTOMATIC GENERATION OF SOFTWARE COMPONENTS FOR REAL OPTIONS MODELLING
A. CHORTARAS, Y. GUO, M. M. GHANEM, F. O. BUNNIN
Department of Computing, Imperial College London
E-mail: {ac901, yg, mmg, fob1}@doc.ic.ac.uk
We describe the design and implementation of a software system that, from high level specification documents, generates source code for the numerical valuation of real options. The documents allow the description of both the flexibility present in investment projects and the dynamics of the underlying stochastic variables, independently of any valuation method-specific details. By applying symbolic transformations to the specifications the system generates efficient and reusable software components, which are combined with software components that implement numerical methods in order for the final real option valuation code to be generated.
1. Introduction
The complexity of the interactions among the real options that real-life projects include and the use of complex stochastic models to describe the underlying uncertainty sources often make numerical techniques the only tractable way for valuating real options. By exploiting the analogy between financial and real options [4], real options may be valuated by extending the financial option pricing methodology [5], [3]. However, developing code for the valuation of varying project structures in combination with different stochastic models is an expensive, error prone and time-consuming process. A system automating this process can help the efficient and rapid design of the management's investment plans by shortening development time, reducing implementation costs and offering increased flexibility in project analysis. The system that we present extends earlier work in [1] and [2]
in the real option valuation domain by providing support for incomplete markets and interacting real option collections modelling, while maintaining the flexibility of handling complex stochastic models.
2. System Design
The design of our system allows it to produce real option valuation code in a systematic and transparent way, by interacting at a high level with the end-user. It deploys the technologies of automatic code generation and software components. Its kernel is implemented in Java and Mathematica while the products of the automatic code generation are software components implemented as C++ classes. The use of the software components technology facilitates the mapping of financial and mathematical concepts to software entities at a high level of abstraction and offers reusability of the generated code. The specifications of the real option valuation problems are provided to the system in the form of XML documents. The architecture of the system is illustrated in Fig. 1 and consists of three parts, each working at a different level of abstraction and modelling a different aspect of a given valuation problem.
Figure 1. System architecture.
Real Option Specification The real option specification part provides a high level way of specifying individual real options and real option collections independently of any stochastic model or valuation method-specific
details. The XML specification documents define the properties of individual real options (underlying variables, payoff, time to expiration and exercise style) and of collections of real options (ordering in time and type of interaction). Single and compound options are supported. Compound options may be contingent either on single or on sets of mutually exclusive options, thus sequences of interacting options can easily be modelled. From the specification documents two types of software components are generated: individual real option and real option collection components. The generated components are based on abstract class hierarchies that define the functions that concrete classes, corresponding to specific real o p tions, have to implement. The individual real option components abstract the structural and semantic similarities and differences among real options in an effective way that permits the development of computationally efficient valuation components. The collection components model the general project structure and are made up by individual real option components. Stochastic Model Specification The modelling of the dynamics of the options underlying variables is done at the stochastic model specification part. The underlying variables are assumed to follow continuous diffusion processes. The generated components capture all the mathematical details of the model in a systematic way that facilitates the development of the valuation components. The components are generated from XML documents defining the details of a stochastic model (drift, diffusion, dividend structure, uncertainty sources and their correlation). Given that there is no best numerical method for all valuation problems the system is designed without opting for a particular valuation method. The stochastic model components comply with this requirement by providing to the numerical components all the mathematical quantities the latter need and can be directly derived from the mathematical model. Given the different nature of the several numerical methods the stochastic model components must accommodate different needs. For example, finite difference methods require the calculation of the coefficients of a partial differential equation, while Monte Carlo methods the calculation of the risk neutral dynamics. The application of symbolic transformations (using Mathematica) on the original specifications and the use of abstract class hierarchies that model the needs of the different numerical methods is how this objective is achieved. Valuation Components The valuation components, generated by the valuation code generation part, are the final product of the system and
those that perform the actual real option valuation. They are generated by combining a real option collection and a stochastic model component and hence they inherit the complete specifications of a given problem. They have the form of classes with a well-defined external interface so that project valuation applications can easily be developed. Their most significant part is the numerical component they are based on. The numerical components implement numerical methods suitable for the valuation of the options that can be specified in the specification parts of the system. The applicability of a numerical method depends on the characteristics of each particular problem. The numerical components form a numerical library, which is extensible so that new components may be added. The current implementation includes finite differences and lattice method components for options contingent on one or two stochastic variables. The software components generated by each part are semantically and operationally independent and reusable. This allows the generation of several valuation components by combining different real option collection components with different stochastic model components.
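As an indication of what such a lattice valuation component computes, the sketch below prices a simple option to defer an investment with a Cox-Ross-Rubinstein binomial lattice. It is a generic illustration written by us, not the code actually generated by the system, and all parameter values are hypothetical.

```python
import numpy as np

def defer_option_value(V0, I, r, sigma, T, steps=200):
    """Value of an option to defer an investment of cost I in a project worth V0,
    via a CRR binomial lattice with early (American-style) exercise."""
    dt = T / steps
    u = np.exp(sigma * np.sqrt(dt)); d = 1.0 / u
    p = (np.exp(r * dt) - d) / (u - d)                  # risk-neutral probability
    V = V0 * u ** np.arange(steps, -1, -1) * d ** np.arange(0, steps + 1)
    value = np.maximum(V - I, 0.0)                      # payoff at expiry
    for _ in range(steps):
        value = np.exp(-r * dt) * (p * value[:-1] + (1 - p) * value[1:])
        V = V[:-1] / u                                  # project values one step earlier
        value = np.maximum(value, V - I)                # early exercise check
    return value[0]

print(defer_option_value(V0=100.0, I=95.0, r=0.05, sigma=0.3, T=2.0))
```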
3. Conclusions
Our system provides a problem solving environment and applies the software synthesis technology to the real options domain. It facilitates the rapid development of transparent and theoretically sound investment project valuation programs and provides management with the opportunity to formulate projects as real option collections at a natural level and perform flexible analysis for varying model assumptions and project structures. This flexibility increases understanding of the uncertainty inherent in operations and results in faster and better-informed decision making.
References
1. F.O. Bunnin, Automatic Generation of Software Components for Financial Modeling, PhD Thesis, Imperial College, London, 2001.
2. F.O. Bunnin, Y. Guo, Y. Ren, and J. Darlington, Design of High Performance Financial Modeling Environment, Parallel Computing, vol. 26, 2000.
3. A. Gamba, An Extension of Least Square Monte Carlo Simulation for Multioption Problems, 6th Annual Real Options Conference, Cyprus, 2002.
4. L. Trigeorgis, Real Options: Managing Flexibility and Strategy in Resource Allocation, MIT Press, 1996.
5. L. Trigeorgis, The Nature of Option Interactions and the Valuation of Investments with Multiple Real Options, Journal of Financial and Quantitative Analysis, vol. 28, 1993.
MODELING THE STATE AND BEHAVIOR OF AN ENZYME USING UML - AN OBJECT ORIENTED APPROACH
V. N. CHRISTOFILAKIS*
Institute of Informatics and Telecommunications, National Centre of Scientific Research "DEMOKRITOS", Ag. Paraskevi, Athens 15310, GREECE
E-mail:
[email protected]
CH. ALEXOPOULOS
Laboratory of Peptides, Chemistry Department, University of Ioannina, Panepistimioupoli, Ioannina 45110, GREECE
E-mail: [email protected]
In this paper an object oriented approach is used to visualize a family of enzymes that are widespread in nature and can be found in many animal and plant species. To achieve this, the Unified Modeling Language (UML) is utilized to describe accurately and in detail the chemical information. In this survey we intend to provide multiple dimensional access to the inherent information concerning a biochemical process, and as an example acid phosphatase is chosen.
1. Extended Abstract
It is very common to represent real world systems using models. Especially when the system is too complicated, models help you to visualize a system as it is or as you want it to be. In addition, modeling is important because:
- It allows you to specify the structure and behavior of a system
- It gives you a template that guides you in constructing a system
- It documents the decisions you have made
- It gives the possibility of better and faster comprehension of significances
UML [1], which was adopted by the Object Management Group OMG [2] in November 1997 as a standard formal language, is used for visualizing,
a Electronics - Telecommunications and Applications Laboratory, Physics Department, University of Ioannina, Panepistimioupoli, Ioannina 45110, Greece.
* Corresponding Author
specifying, constructing and documenting the artifacts of systems the range of which is very broad. By applying an object-oriented approach through the diagrams of UML it is possible to obtain reusable, change-tolerant and stable models of any system. UML encompasses nine types of diagrams: use-case, class, object, state, sequence, collaboration, activity, deployment and component. These could be categorized according to their features into structural, behavioral and implementation diagrams (Figure 1)ᵃ. Depending on the system that is modeled, it is possible to use one or more of the diagrams that are illustrated in Figure 1 to represent different aspects of the system. In this approach it is shown how the functionality is designed inside a set of chemical data originating from an enzyme catalyzed system. The enzyme used is Acid Phosphatase (AP) [3]. AP is a general term associated with nonspecific phosphomonoesterases with optimum activity in the pH range of 4-6.
Figure 1: The classification of nine diagrams of UML.
ᵃ You can obtain figures 1, 2, 3 from the Web server at URL: http://www.telecomlab.ar/extendedabstractACCMSEI
Figure 2: Phosphate ester hydrolysis catalyzed by acid phosphatase (E.C.3.1.3.2) on p-nitrophenylphosphate (pNPP) as substrate.
AP belongs to the family of hydrolases (E.C.3.), acting on ester bonds (E.C.3.1.), and more precisely it constitutes a phosphoric monoester hydrolase (E.C.3.1.3). We are focused on acid phosphatase, characterized by E.C.3.1.3.2, which may also appear under other names, such as: acid phosphomonoesterase or phosphomonoesterase or glycerophosphatase. By using the enzyme databases available on the World Wide Web (ExPASy, KEGG, WIT, BRENDA) we noticed that there are 24 PDB entries in enzyme class E.C.3.1.3.2. It is possible to use artificial phosphate esters since the enzyme is rather non-specific and will catalyze phosphate ester hydrolysis on many different substrates. The phosphate ester of p-nitrophenol is a good substrate to use, since the product formed after ester hydrolysis, p-nitrophenol, can be easily detected and measured. Utilizing the Rational Unified Process framework [4] of Rational Suite Development Studio [5], a detailed representation of the enzyme catalyzed system is given through structural and behavioral diagrams of UML. In Figure 3 the class diagram, at a higher level of perspective, of the enzyme catalyzed system is illustrated. The classes of enzymes, phosphate esters, chemical reactions, products, their instances and the relationships among them are shown in this structural diagram. Furthermore, taking advantage of the interoperability of UML with the extensible Markup Language (XML) [6], an efficient, general, powerful and extensible mechanism for handling both the "capture" and the publication of chemical information is offered [7]. To accomplish this task the XML document is developed by modeling it as a class diagram in Rational Rose XML Document Type Definition (DTD) [8].
Figure 3: Class diagram of the enzyme catalyzed system
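A rough transcription of the classes in Figure 3 into executable object-oriented code might look as follows; the class names, attributes and default values are our own reading of the diagram and are not the authors' model.

```python
# Hypothetical Python rendering of the enzyme catalyzed system class diagram.
from dataclasses import dataclass, field

@dataclass
class Enzyme:
    name: str = "acid phosphatase"
    ec_number: str = "E.C.3.1.3.2"
    concentration: float = 0.0
    specificity: str = "nonspecific phosphomonoesterase"

@dataclass
class PhosphateEster:          # substrate, e.g. pNPP
    name: str = "pNPP"
    concentration: float = 0.0

@dataclass
class Product:                 # e.g. p-nitrophenol, followed by absorbance
    name: str = "p-nitrophenol"
    absorbance: float = 0.0

@dataclass
class ChemicalReaction:        # associates enzyme, substrate and products
    enzyme: Enzyme
    substrate: PhosphateEster
    temperature: float = 37.0  # assumed illustrative conditions
    ph: float = 5.0
    products: list = field(default_factory=list)
```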
To our knowledge, it is the first time that an enzyme catalyzed system has been modeled through UML. This representation offers the possibility to create a new data-bank on the Web, accessible to everybody, and an innovative, powerful tool in chemical education and computational chemistry.
References
1. OMG Unified Modeling Language Specification, Version 1.4.
2. Object Management Group OMG, http://www.omg.org
3. H. Bull, PG. Murray, D. Thomas, AM. Fraser and PN. Nelson, Mol. Pathol., 2002, 55 (2), 65-72.
4. P. Kruchten, The Rational Unified Process: An Introduction, Addison-Wesley, 2nd Edition.
5. Rational Software, http://www.rational.com
6. Extensible Markup Language (XML), http://www.w3.org/XML/
7. P. Murray-Rust, H. S. Rzepa, Markup Languages - How to Structure Chemistry-Related Documents, Chem. International, Vol. 24, No. 4 (2002).
8. Rational Solutions, Online Documentation, v2002.05.00.
ATOMISTIC MONTE CARLO SIMULATION STUDIES OF POLYMER MELTS GRAFTED ON SOLID SUBSTRATES K. DAOULAS AND V.G. MAVRANTZAS
[email protected]
We discuss the results of a recent atomistic Monte Carlo simulation study of polyethylene (PE) melts grafted by one of their ends on a solid substrate. The simulations have been executed with a rather detailed forcefield which describes interactions at the level of individual atoms. The shortest length scale in the simulation is the Carbon-Carbon bond length I (=1.54A).Results are presented for the main thermodynamic and conformational properties of these grafted layers in the vicinity of two types of surfaces: a non-interacting hard wall and a graphite basal plane.
1. Introduction
Polymers adsorbed on solid surfaces are used extensively in many industries to manufacture surfaces with controlled friction and wear characteristics, manipulate wettability by aqueous and organic liquids, stabilize suspensions against flocculation and improve the compatibility of immiscible interfaces. In the latest years, particular emphasis is given to polymers adsorbed on solid surfaces by one of their two ends. The layers formed in this case are highly ordered and this property has been exploited in the fabrication of nanoscale structures with immediate applications in areas such as molecular electronics, corrosion inhibition, molecular recognition and design of biosensors. In an effort to understand molecular packing and orientation in these systems, we present here the first results of a recent atomistic simulation study of PE melts grafted on a solid substrate. Such an analysis has immediate applications to the class of materials known as self-assembled monolayers (SAM’S), which are formed, for example, by the adsorption of alkanethiols on a Au( 11 1) surface and by alkyl chains on a Si( 1 1 1) surface.
I
2. Molecular model The system examined is a thin film of a PE melt exposed to a solid surface (a semi-infinite graphite basal plane or a wall) at the lower face and to a vacuum or air at the upper face. The simulations are conducted in an orthorhombic box of dimensions LY,L! and L, in the directions x, y and z, respectively, which is filled with segments from several polymer chains; z is the direction perpendicular to
126
the interface. Some or all of these chains are permanently grafted on the substrate. In the simulations, the united-atom representation has been adopted, with each methyl and methylene group along the chain backbone regarded as a single interacting site. The molecular model is described in detail in Refs. 2,3 and comprises: (a) intramolecular and intermolecular interactions described by a Lennard-Jones (LJ) potential, (b) a bending potential governing the distribution of C-C-C bond angles, (c) a torsional potential governing the distribution of torsional angles and (d) the interaction of all PE carbon atoms with the substrate. Bond lengths are considered to be frozen to their equilibrium value, l = 1.54 Å. Two types of substrates have been considered: (a) graphite and (b) a hard, non-interacting wall. The functional form of the interaction with the graphite is described in Refs. 4,5 and is shown schematically in Fig. 1.
-
-n 80
.
-0.85
u; -n go c
f- -0.95
0
0
Figure 1. Polymedgraphite potential plotted over a graphite lattice unit cell at a distance z = 3 0 from the first layer of graphite atoms (gray circles). Positions o f local energy minima, where adsorption is more likely to take place, are shown as void circles.
In the course of the simulations, all grafted chains are assumed to have their first carbon atom permanently anchored at a distance from the graphite surface equal to the value of the C-C bond length (i.e., equal to 1.54A). As regards the upper face of the simulation box, the strategy followed in the present work depends on the system studied: For the systems consisting exclusively of grafted chains, the upper face of the system is assumed to be exposed to vacuum. On the other hand, for the systems consisting of mixtures of grafted and free chains, the upper face of the simulation box is assumed to be exposed to air.
3. Simulation method
128 The very powerful end-bridging MC method (EBMC) was chosen to cany out the simulations in a semigrand canonical ensemble where the following parameters are held fixed: the pressure P, the temperature T, the total number of chains NL/?,the total number of mers n, and the spectrum of relative chemical potentials p* of all chain species in the melt except two, which are taken as reference species. Initial configurations at the desired grafting density with which the subsequent MC simulation is carried out were obtained after modifying the amorphous cell method of Theodorou-Suter, properly modified to take into account the presence of the solid substrate and the grafting on the solid. For systems composed of both grafted and free chains, with the upper face of the film exposed to air, the MC simulations are executed in the semigrand N m P T p* ensemble, where the dimension Lz of the simulation box in the z direction is allowed to vary, responding to the value of the pressure P applied. To prevent chains from exiting from the top face of the simulation box, through 7 the upper box surface, a soft, ramp-like potential was assumed there . In the case of a fully grafted melt, the soft, ramp-like potential was disregarded and chains were allowed to freely dangle; the simulations in this case are also executed in the NchnTPp* ensemble but with P=O atm. In all simulations, T=450 K. 4. Results
Fig. 2a,b presents typical snapshots of a system containing: (a) only grafted chains (chain length=C7x, 0=2.62nm-') and (b) a mixture of grafted and free chains C7x chains grafted on a hard surface at 0=1.31nm-*. The highly ordered structure of both systems is striking.
129
Figure 2a,b. Atomistic views of a) a Cx PE mclt dl chains of which are grafted on a hard surface at o=2.62nm2 and b) a PE mclt consisting of fee (gray) and graftcd chains @lack) at 0=1.31nm?
The strong orientational order of the grafted melts can be quantified by means of the components of the conformation tensor C. This is the tensor of the secondmoment of the chain end-to-end vector properly normalized so that at equilibrium it reduces to the unit tensor. Fig. 3 presents the time evolution of the CZzcomponent of C perpendicular to the substrate, as function of U, for the C7x melt. It is seen that with increasing U, the chains assume a conformation which is highly extended along the z direction indicative of the strong anisotropy that grafting induces to these macromolecular systems. Additional information about the conformational properties of the grafted layers can be obtained by studying the average chain conformational path. This is defined as the average height <<(i)> of backbone atom i above the grafting plane. This is shown in Fig. 4 as a function of the scaled bond coordinate s=i/N, where N is the chain length. Also shown in the Figure is the curve proposed by the analytical brush theories, 8,9?10,1I which suggest a universal dependence of <<(s)> on s.
130
a = 2.62 nm"
-a=
40
0.0
0.1
03
1.75nm'
. - .
03
a4
5
CPU time (10~s) Figure 3. Evolution of the C,, component of the chain conformation tensor C with CPU time for tethered C78 PE melts, as a function of grafting density, at o=1.75nni2 (filled (open circles) and 0=2.62nm+~ (thick solid line). circles), 0=2.18nm-~
A
1'2
n
3 lu
Y
n A W rA
f
n
' -w-
rA
s
t
u = 2.62nni2 -a-u=2.lW2 -oa = 1.75nm.' Theay (Refs. 8,9)
-.-
s=i/N Figure 4. f i e normalized mean height [(s) shown as a function of the normalized carbon atom coordinate s along chain contour, of a tethered C n PE mdt for three grafting densities 0. f i e prediction of the Milner et al.8*9theoryisalso shown as a dashed line.
Fig. 5 presents the variation with distance from the substrate of the density p of the C78 PE melt simulated, all chains of which are grafted: (a) on a hard substrate and (b) on graphite, at grafting density ~ 2 . 1 nm 8 -2 . In the latter case, the strong dependence of p on z is evident, particularly at the first layers where p
131 attains values considerably higher than in the bulk due to polymer adsorption on graphite.
*r
E
n N v
-hard surface
1.0
--O-graphite
0.8 0.6
a
0.4 0.2 0.0 0
15
45
30
z(
60
5
1
Figure 5 . Local density p plotted as a h c t i o n of distance fiom the graffing plane for a C s PE melt tethered at o=2.18nni2,above a graphite (open circles) or B hard substrate (solid line). The numbers 1 through 3, for the case of graphite, denote the positions ofthe first three local maxima in the density due to polymer adsorption.
By properly analyzing the mean orientational characteristics of the grafted layers at the level of individual segments, one can develop a methodology for calculating their deuterium ( H) NMR spectrum. Such calculations have been presented in a recent article3 and the resulting spectrum for the C78 PE melt (0=2.62 nm-l) are shown in Fig. 6. The spectrum exhibits its highest peak at frequencies characterized by a doublet splitting on the order of 0.4kHz. This value is in qualitative agreement with available experimental data reported in the literature for long PDMS chains grafted on silica characterized by similar chain overlaps as those of the shorter systems studied here.
'
132
--
I
~
C,, 0=2.62nm-~
n V
3
5
QVQ/2
(kHz)
Figure 6. 2H-NMR rpcetnun of t h e c78 PE systcm graficd at 0=2.62nm'~ealeulatcd from the atomistic simulation data
References
2.
A.V. Shevade, J. Zhou, M.T. Zin and S. Jiang, Langmuir, 17, 7566 (2001): L. Zhang, K. Wesley and S. Jiang, Langmuir, 17,6275 (2001). K.Ch. Daoulas, A.F. Terzis and V.G. Mavrantzas, J. Chem. Phys., 116,
3.
K. Ch. Daoulas, V.G. Mavrantzas and D.J. Photinos, J. Chem. Phys., 118,
4. 5. 6. 7. 8.
W.A. Steele, Surface Sci., 36, 3 17 (1973). K.F. Mansfield and D.N. Theodorou, Macromolecules, 24,4295 (1991). D.N. Theodorou and U.W. Suter, Macromolecules, 18, 1467 (1985). K.F. Mansfield and D.N. Theodorou, Macromolecules, 23,4430 (1990). S.T. Milner, T.A. Witten and M.E. Cates, Macromolecules, 21,2610
1.
11028 (2002). 1521 (2003).
(!988). 9.
S.T. Milner, T.A. Witten and M.E. Cates, Macromolecules, 22,853 (1 989).
10. T.M. Birshtein, Yu.V. Liatskaya and E.B. Zhulina, Polymer, 31, 2 185( 1990). 11. E.B. Zhulina, O.V. Borisov, V.A. Pryamitsyn and T.M. Birshtein, Macromolecules, 24, 140 (1991).
MARKOV'S PROPERTY AND GENERLIZED PADE -TYPE APPROXIMANTS NICHOLAS J. DARAS Jean Moreas 19, 15232 Chalandri, Athens, Greece
Pad6 approximants are rational functions whose expansion in ascending powers of the variable coincides with the Taylor power series expansion of analytic functions into a disk as far as possible, that is up to the sum of the degree of the numerator and denominator. The numerator and denominator of a Pad6 approximant are completely determined by this condition and no freedom is left. On the contrary, PadC-type approximants are rational functions with an arbitrary denominator, whose numerator is determined by the condition that the expansion of the PadB-type approximant matches the Taylor expansion of analytic functions up to the degree of the numerator. The great advantage of PadC-type approximants over Pad6 approximants lies in the free choice of the poles which may lead to a better approximation ([2-61). One would like to adapt the proofs of the one variable to the several variables case, however some major obstacles present themselves. It is therefore reasonable to suspect that the outlet lies with the consideration of another type of series representation for functions. Thus, in [8], we investigated PadC-type (and Padt) approximants to the Fourier series expansion of a real-valued function harmonic in the unit disk. Moreover, in [ 113, we introduced PadC-type (and Padd) approximation to the Fourier representation of a complex-valued harmonic function. This coordinate procedure is defined to be a composed approximation. Any classical Padt-type approximant to an analytic function coincides with a composed PadC-type approximant to this function. With this background, in [9] we used the solution of the Dirichlet problem in order to discuss the possibility of the numerical evaluation of a 2n-periodic
Lp function on [-n,n] (or on the unit circle) by means of composed PadCtype approximants to its Fourier series representation. Further, in [12] we considered integral representation formulas for all the above Padd-type approximants to Fourier series and in this connection we defined and studied the corresponding Padt-type operators. Finally, the definition and effectiveness of a PadC-type approximation to the Fourier series representation of a 2n -periodic finite Baire measure on [-n, n](or on the unit circle) are investigated in [ 101. Following the above direction, in [ 131, by using interpolating generalized polynomials for the Bergman Kernel function into an open bounded subset of we defined the so-called generalized Padt-type approximants to any f
a
en,
133
134 which are of class L2. HL2(R) of all analytic functions in The terminology is due to H . Van Rossum who in [23] was first introduced the in the space
notion of generalized Pad6 approximants. The characteristic property of such an approximant is that its Fourier series representation with respect to an orthonormal basis for HL2(a)matches the Fourier series expansion of f as far as possible. In this talk, we consider and study the extension of the generalized Padk-type approximation to continuous functions on a compact set E of several complex variables. The crucial hypothesis is the validity of a Markov property on E . Recall that a compact subset E of @" preserves the Markov inequality (or has the Markov property) (M , ) , if there exists an integer m 2 1 and, for each
a E N",a constant La < 00 such that
From
Jackson's
C"(E) +
theorem, it follows that the injective restriction C ( E ) is continuous, and, by Mityagin's theorem ([18]), there is a
H such that the injections are continuous. Since E preserves ( M , ) , the set { x"") :j = 0, 1,...} is linearly independent in H , (here
Hilbert space
v : N + N" is a bijection), and hence, by the Hilbert-Schmidt orthogonalization processus, one can find a system { y j :j = 0,1, ...) c H consisting of orthonormal polynomials with deg
u(z>=
gcj
(u>tyj (2)
uniformly
on
tyj = IIv( j)lI. Thus,
E(C~(U>
=< u , y j > H > ,
j=O
wherenever u E Cm(E). This Fourier representation for any u E Cm(E) permits us to introduce generalized approximation to u and motivates our second purpose to give definition and properties for the generalized PadB-type approximants to continuous fkctions on compact sets preserving a Markov property. Definition. ([ 141) Let E cc Cc" be a compact set preserving (M , ) . Assume that { y j :j E N} is a self-summable family in sequence
{ yj(z)vj:j E N} is summable in H
a finite set of painvise distinct points
H (i.e. for any z E E, the ([16])). For m 2 0, choose
135
so that
for any k S m . Any continuous function
(GPTA/ m)U(z) , defined by
rn
(GPTA/m)U(z)=~cj(u)a,'m'(z) j=O
(with
is called a generalized Pad&-typeapproximant to u E Cm( E ) with generating sysgtem
M,+, . If, moreover, for each v = 0,1,. ..,m,
the function
(GPTA / m)U(Z ) is said to be a Pad&-typeapproximant to u .
Theorem. ([141) If
fl~"'"'~,,
Pad&-type approximant to
(z) is the Fourier expansion of a generalized u (Z ) =
pplu)= c, ( u ) for any v = 0,1,. ..,m. Further, if the function
C
C,
( u ).v,(2) E C" ( E ), then
136
v=o
is continuous on
E and if
the corresponding generalized PadC-type approximation sequence to u
[(GPTA/ rn)U(z) :rn E N} converges uniformly on E ([ 141). References
1. References
M.S. Baouendi and C. Coulaouic: Approximation
polynomiale des functions cmet analytiques, Ann. Inst. Fourier, 21 (1971), 149-173. C. Brezinski: PadC-type approximants for double power series, Journal of the Indian Math. SOC.,42 (1978), 267-282. C. Brezinski: Rational approximation to formal power series, J. Approx. Theory, 25 (1979), 297-3 17 C. Brezinski: PadC-type approximation and general orthogonal polynomials, International Series in Numerical Mathematics, Birkhauser, Basel, 1980. C. Brezinski: Outlines of PadC-type approximation, in Computational Aspects of Complex Analysis, H. Werner and al. (eds.), D. Reidel Publishing Company, 1983, 1-50. C Brezinski: Pad6 approximants: old and new, Jahrbruch Uberblicke Mathematik, 1983,37-63. W. Cheney: Introduction to approximation theory, International Series in Pure and Applied Mathematics, Mc Craw-Hill, 1966.
137 N.J. Daras: Rational approximation to harmonic functions, Numerical Algorithms, 20 (1999), 285-301. N.J. Daras: Padd and Pad6 type approximation for 2X-periodic Lp functions, Acta Applicade Mathematicae, 62 (2000), 245-343. N.J. Daras: Interpolation methods for the evaluation of 2~ -periodic finite Baire measure, Approx. Theory & its Appl., 17: 2 (2001), 1-27. N.J. Daras: Composed Padt-type approximation, Journal of Computational and Applied Mathematics, 134 (2001), 95-1 12. N.J. Daras: Integral representations for Padd-type operators, Journal of Applied Mathematics, 2(2) (2002), 5 1-70. N.J. Daras: Generalized Padt-type approximation and integral representations, to appear in Advances in Computational Mathematics. N.J. Daras: Generalized Padt-type approximants to continuous functions on compact sets satisfying a Markov property, submitted. N.J. Daras: Markov's inequalities on compact subsets of @", submitted. P.R. Halmos: Introduction to Hilbert space and the theory of spectral multiplicity, 2"d ed., Chelsea Publishing Company, New York, 1957. P.J. Laurent: Approximation et optimization, Hermann, Paris, 1972. B.S. Mityagin: Approximate dimension and bases in nuclear spaces, Russian Math. Surveys, 16 (1961), 59-127. W. Pawlucki and W. Plesniak: Markov's inequality and @" functions of sets with polynomials cusps, Math. Ann., 275 (1986), 467-480. W. Plesniak: Compact sets of preserving Markov's inequality, Matem. Vesnit, 40 (1988), 295-300. W. Plesniak: Markov's inequality and the existence of an extension operator for @" functions, J. Approx. Theory, 61 (1990), 106-117. J. Siciak High non continuable functions on polynomially convex sets, Univ. Iagello Acta Math., 25 (1985), 95-107.
cN
H. Van Rossum: Generalized Pad6 approximants, in Approximation Theory 111, E.W. Cheney (editor), Academic Press, New York, 1980. A. Zeriahi: Inegalites de Markov et developpement en serie de polynomes orthogonaux de functions et A", Mittag Leffler Institut, Djursholm, 1991.
c"
A PRESSURE WEIGHTED UPWINDING SCHEME FOR CALCULATING FLOWS ON UNSTRUCTURED GRIDS
MASOUD DARBANDI * Department of Aerospece Engineering Sharif University of Technology Azadi Ave., Tehran, P.O. Box 11365-8639,Iran E-mail:
[email protected] KIUMARS MAZAHERI-BODY+ Deparment of Mechanical Engineering, Tarbiat Modares University, Gisha Bridge, Tehran, P.O. Box 14115-111,Iran E-mail: kiumars @modares.ac.ir SHIDVASH VAKILIPOUR~ The correct estimation of conservative statements at the control volume surfaces has a serious influence in the accuracy of numerical solution and even improving the convergence history. In this paper, a sound physical pressure-weighted scheme is utilized t o calculate the convective fluxes at cell faces. The method is extended in a manner which permits arbitrarily choice of element distributions and orientations in the solution domain, either entirely or partly, wherever the advantages of one distribution/orientation dominate those of others. In this regard, the necessary modifications which are needed to be undertaken in order t o include the advantages of utilizing the unstructured element distributions/orientations in a finite element volume context are presented. Eventually, the extended formulations are validated against a standard benchmark test case.
1. Introduction
There are numerous aspects which one can assume in order to categorize different finite volume methods. Considering upwind point of view, Winslowl used a triangular mesh with polygonal control volumes to treat the quasilinear Poisson equation. Baliga and Patankar2 borrow the idea of Ref.l and *Assistant Professor, Head of Aerodynamics and Propulsion Divisions t Assistant Professor, Head of Aerospace Engineering Division tGraduate Student 138
139
develop a finite volume element formulation to solve fluid flow problems using a three-node triangular elements. They exponentially interpolate the convected variables in the direction of the element-average velocity vector and linearly in the direction normal to it. Generally speaking, the interpolation functions have been greatly improved for the next two decades. For example, Rida, et aL3employ skewed mass-weighted upwind and floworiented exponential functions for the staggered control volumes in an unstructured triangular grids. Reference4 utilizes structuired hybrid element shapes in a finite volume element approach which employs the pressureweighted upwind scheme for calculating the convective fluxes. In this work, a physical influence scheme is suitably extended for employing on the unstructured triangular mesh distributions. The basic motivation behind this research is to practice the benefits of pressure-weighted upwind scheme on unstructured grids. Additionally, it is one step toward using the physical influence scheme in combined structured and unstructured grids. This combination evidently improves the accuracy of the solution for a fixed number of nodes distributed in the solution domain. Considering the above advantages, a control-volume based finite-element method is used to treat the governing equations. The method is pressure-based. The proper selection of the connections between the variables on control volume surfaces and the main nodal values allows the use of a colocated grid arrangement. The colocated grid arrangement needs special treatment for the coupling of velocity and pressure. This treatment removes the possibility of the checkerboard problem which may arise in collocated grids6.
2. G r i d Arrangement
In a finite volume element procedure, the solution domain is needed to be divided into a number of finite elements. Considering the squared cavity as a benchmark case, there are different choices to distribute the inside grid. Figure 1 illustrates two types of element distributions in the cavity. Based on the physics of inner flow field, Figure l(1eft) indicates a better element distribution then the one in Figure l(right) if the numeber of nodes is the same. After a suitable distribution of the elements within the solution domain, each element is broken into a number of sub-quadrilaterals. The finite volumes are then constructed from the proper assemblage of these sub-quadrilaterals. Figure 2(left) shows eight neighbour elements taken out of the domains in Figure 1. Each triangle can be divided into three
140
.-
Figure 1. Two types of unstructured mesh in the domain.
sub-quadilaterals by the help of three medians of each triangle. The medians are shown by dashed lines. As is seen, irrespective of the shape and distribution of the elements, each node is surrounded by four subquadrilaterals. The proper assemblage of neighboring sub-quadrilaterals around any non-boundary node creates an eight-edges control-volume. In these figures, nodes are treated as the location of all problem unknowns. Such grid arrangement is called the colocated grid.
I
Figure 2.
Finite element nomenclature and velocity upwinding.
141
3. Computational Modeling
Reference5 provides full details of the current computational approach and the basic employed formulations. We briefly discuss the parts which are more important in this study. The Navier-Stokes equations is given by qt
+ Fz(4+ GY (9)= R%(9)+ T y (4
(1)
where q = [0, u, vIT is the solution vector and F, G,R,and T are the convection and diffusion flux vectors. In control volume approaches, Eq.(l) is integrated over the control volumes using the divergence theorem. Then, the convection and diffusion fluxes are integrated over the control volume surfaces. The integrals are approximated by the midpoint approximation for each line segment. These midpoints are denoted as integration points zp and are shown in Figure 2 as crosses. Since F and G variables are nonlinear with respect to the u , v , and p dependent variables, they must be linearized properly. One simple linearization strategy yields
F
M
[u,Cu + p / p , UwIT , G M [w,Bu, Bv + p/pIT
(2)
The idea of connecting the integration point values to the corresponding nodal values in the convection terms has been largely investigated in control volume methods. The essence is that the connection is not only based on the complex mathematical functions but also on the physical interpretation of the governing equations. Reference5 derives an appropriate algebraic approximation to the differential equations at each integration point. It provides the physics and the relevant couplings. The starting equation for computing u at zpl is given by
where Kot = d m . This equation is then discretized properly. In this regard, the transient term is differenced backward in time. The bilinear interpolation is used to treat the pressure term and the Laplacian operator using finite element shape functions and appropriate diffusion length scales. The key point in this method is found in the convection term which is written in the streamwise direction. This form provides the correct direction of upwinding in the streamwise direction, i.e.,
where L,, and uUpare illustrated in Figure 2 for zpl.
142
The upstream location is found by intersecting the extension of the streamline direction at integration point with the edges of the same element. Then, uUpis interpolated between the two adjacent nodes which are nodes 2 and 3 for ipl in Figure 2, i.e., (uup)ipi
=~
[ U+Z(b/a - 1)U3I/b
(5)
If the above models are plugged into Eq.(3) and similar coefficients are combined together an algebraic equation for the integration point value of u is derived. The results yield
+
+
{u}= [C""]{U} [C"P](P} {C"}
(6)
where C"" and CUP are two 3 x 3 matrices which indicate the effect of U and P fields on u. The 3 x 1 array of C" includes all known parts of the assembled terms. The first and second superscripts of the C means to which equation and parameter of equation it respectively belongs. Similarly, an expression for w velocity can be derived by the same procedure but starting from y-momentum equation. After deriving the appropriate integration point expressions for u and w, they are substituted in F and G of Eq.(2). 4. Results
Figure 3. Two types of unstructured mesh in the domain.
5 . Closure
The material presented in this extended abstract is submitted to be considered for presentation at the International Conference of Computational
143
Methods in Science and Engineering, Sep. 12-16, 2003, in Kastoria, Greece. The theory and problem formulation have been presented completely. In the final version of the paper, the paper is condensed and the results section will be extended and completed. The authors anticipate that sufficient material has been presented to enable a quick review at the abstract level for presentation at the conference. References 1. A.M. Winslow, J. Comp. Phys., 2, 149 (1967). 2. B.R. Baliga, and S.V.Patankar, Num. Heat Trans., 6, 245 (1983). 3. S. Rida, F. McKenty, F.L. Meng, and M. Reggio, Znt. J.Numer. Meth. Fluids, 25, 697 (1997). 4. M. Darbandi, G.E. Schneider, and A.R. Naderi, AIAA Paper 2003-3638,2003. 5. M. Darbandi, G.E. Schneider, K. Javadi, and N. Solhpour, AIAA Pap. 20030436, 2003. 6. S.V. Patankar, Numerical Heat Transfer and Fluid Flow, Hemis. Corp., Washington, D.C., 1980.
AN EFFICIENT MODIFICATION OF THE PRIMAL TWO PATHS SIMPLEX ALGORITHM
- DUAL
K. DOSIOS, K. PAPARRTZOS, N. SAMARAS*AND A. SIFALERAS Dept. of Applied Informatics, University of Macedonia, 156 Egnataa Street, Zip code 540 06, Thessaloniki, Greece, E-mail: dosiosOuom.gr, paparrizOuom.gr, samarasOuom.gr,
[email protected]
Linear programming deals with the problem of minimizing or maximizing a linear function in the presence of linear constraints. The popularity of linear programming can be attributed to many factors such as the ability to model large problems, and the ability to solve large problems in a reasonable amount of time. Lots of real world problems can be formulated as linear programs. The explosion in computational power of hardware has made it possible to solve large-scale linear problems in personal computers. Since the development of the simplex algorithm by Dantzig l, many r e searchers have contributed to the growth of linear programming. In 1979, Khachiyan proposed the ellipsoid method to solve linear problems in polynomial time. Then, in 1984 Karmarkar developed another polynomial, O(n35 ) , algorithm based on projective transformation. Recently, a new algorithm for linear problems, which generates two paths to optimal solution, has been constructed by Paparrizos '. This new algorithm is reported in the bibliography as exterior point simplex algorithm (EPSA). EPSA has two major computational disadvantages. These are (1) It is difficult to construct good moving directions. The two paths generated by the algorithm depend on the initial feasible direction that is closely related to the initial feasible vertex. (2) There is no known way of moving into the interior of the feasible region. This movement will provide more flexibility in the searching of computationally good directions.
A well-established way of avoiding the previous computational disadvantages is the transformation of exterior path of EPSA into a dual feasible ~
*correspondingauthor
144
145
simplex path. The algorithm that results from this transformation is called primal-dual two paths simplex algorithm (PDTPSA). The aim of this paper is to present an improved modification of the Primal-Dual Two Paths Simplex Algorithm (PDTPSA), developed by Paparrizos et al. Let the linear problem be written in the standard form as min s.t.
cTx
Ax
=
b
x
2
0
where A is an m x n constraint matrix, c is an n x l column vector, b is an mx 1 column vector, T indicates the transpose and 0 represents the n x 1 null column vector. The dual problem associated with (PLP) is
max s.t.
b'w ATw+s
=
C
S
2
0
where w is an m x 1column vector and s is an n x 1 column vector of slack variables. PDTPSA requires an initial dual feasible basic solution and a feasible point for the (PLP) problem. This means that at every iteration the relation s = c -ATw 2 0 holds. y is a boundary point of the feasible region of (PLP). Then the direction d1 = x1 - yo , is calculated where yo is an initial, not basic, point of problem (PLP). If point x1 is feasible to problem (DLP) and point y1 is feasible to problem (PLP) then the direction d1 is ascent for the objective function cTx. The direction d1 creates a ray R' = x1 td' : t 2 0, which enters into the feasible region from the boundary point y2. The hyperplane from which ray R1 enters the feasible region corresponds to the basic variable Xk. Then a dual pivot operation on which basic variable Xk exits the basis, is performed. PDTPSA is better than EPSA, because it faces with success the two previous computational disadvantages. The fact that the points y t , t = 1, 2, . . . are boundary leads to a serious computational disadvantage. The disadvantage derives from the point y belongs to more than one hyperplanes. This means that ties exist in the choice of leaving variable which in turn can lead to stalling and/or cycling. In order to avoid stalling and cycling it is preferable the point y to be interior of the feasible region and no boundary as it is in PDTPSA. Specifically, the new algorithm that was developed and is presented
+
146
in this paper traverses across the interior of the feasible region in an attempt to avoid combinatorial complexities of vertex-following algorithms. This new algorithm is called Primal-Dual Interior Point Simplex Algorithm (PDIPSA). Also, moving into the interior of the feasible region is translated into an important reduction of number of iterations and CPU time, particularly in linear problems, which are degenerate. A means of comparison of different algorithms is the execution of extended experimental computational studies. The computational studies constitute a useful tool in the hands of operational researchers for the classification and hierarchy of different algorithms. A computational study on randomly generated sparse dual feasible linear problems is presented to establish the practical value of the PDIPSA. The results are very encouraging for the modified algorithm. In the computational study with dual feasible randomly generated linear problems an extended comparison performed among Dual Simplex (DSA), Primal-Dual Two Paths Simplex Algorithm (PDTPSA) and Primal-Dual Interior Point Simplex Algorithm (PDIPSA). The revised form of the Simplex Algorithm was used in these algorithm's implementation . In the experimental computational study three different densities for the randomly generated linear problems were used. The densities are: 1.25%, 2.5% and 5%. For each density three different categories of linear problems were solved; square n x n and rectangles (n/2)xn and nx(n/2). Each one of them includes 6 different classes of problems. Every class includes 10 randomly generated linear problems. The ranges of values of the linear problem coefficients used for the computational study are c E [O 2501, b E [lo 1001 and A E [-200 5001. The feasibility tolerance used is and the tolerance on a pivot row and column is 10-l'. Totally 540 linear problems were solved for all different densities and dimensions. In particular for linear problems of size 1000 x 1000 and for densities 5%, 2.5% and 1.25% PDTPSA is 4.09, 5.24 and 5.17 times faster than DSA concerning the number of iterations, while PDIPSA is 4.49, 5.72 and 5.99 times faster. In terms of CPU time PDTPSA is 7.61, 9.01 and 8.31 times faster than DSA whereas PDIPSA is 8.33, 10.05, 10.59 times. The conclusion which results is that for the linear problems as the dimension increases and the density decreases, so much more rapid becomes the PDITPSA over the DSA and PDTPSA.
147
References 1. B.G. Dantzig, Programming in a linear structure, Comptroller, US Air Force, Washington, D.C., 1948. 2. L.G. Khachiyan, A polynomial algorithm in linear programming, Soviet Mathematics Doklady 20 pp. 191-194(1979). 3. K.N. Karmarkar, A new polynomial time algorithm for linear pro-gramming, Combinatorica 4 pp. 373-395(1984). 4. K. Paparrizos, An exterior point simplex algorithm for (general) linear programming problems, Annals of Operations Research 47 pp. 497-508(1993).
COMPARING SEQUENTIAL VISUALISATION METHODS FOR THE MANDELBROT SET
V. DRAKOPOULOS Department of Informatics tY Telecommunications, Theoretical Informatics, University of Athens, Panepistimioupolis 157 84,Athens, Hellas E-mail:
[email protected] Sequential visualisation methods for the most widely used methods for the graphical representation of the Mandelbrot set are compared. Two groups of methods are presented. In the first, the Mandelbrot set (or its border) is rendered and, in the second, its complement is rendered. Examples of two-dimensional images obtained using these methods are also given.
1. Introduction
There are two basic problems in iteration theory. The first (and classical) one is to study the iterative behaviour of an individual function; the second one is to study how the behaviour changes if the function is perturbed, the simplest (but already sufficiently complicated) case being a family of functions that depends on one parameter. We shall consider here only the second aspect, particularly we shall discuss the dynamics of the polynomial p , : C -+ C with p c ( z ) = z2
+ c,
c E C,
which is an enormously rich fountain of fractal structures. Although the fractal sets generated from the above-mentioned transformation have been discussed extensively in the literature, as far as we know, no previously published work exists that comprises the best known sequential visualisation methods and whose scope is the comparison of their performances. In order to present these methods, we must first introduce some useful terminology. A periodic orbit or cycle is a set of k 2 2 (distinct) points {all.. . , a k } such that p , ( a l ) = a2 , . . . , p c (u k - 1 ) = ak,pc(ak)= a l ; so, in fact, for each j = 1 , 2 , . . . ,k, z = aj is a solution of pf(z) = z , where pf(z) = p,(~f-~(z)). Hence, a point a is periodic, if p t ( a ) = a for some k > 0; 148
149
it is repelling, indifferent or attracting depending on whether I(pt)'(a)l is greater than, equal to or less than one, respectively. If I(pt)'(a)l = 0, a is termed superattracting. If k = 1, z is called a jixed point of p,. Naturally, attracting means that points 20 near a will generate orbits
zo H 21 H
22
H
23..
.
zk+l = p c ( z k ) , k = 0,1,. . ., which approach a. By collecting all such points one obtains the basin of attraction of an attracting fixed point a
A,(a) = { z E C : lim p,"(z) = a}. k+co
It is obvious that 00 is an attracting fixed point of p,. The boundary of A,(co) is denoted by dA,(m) and is called the Julia set of p,. We also use the symbol J , = dA,(m). Other than A,(co) and J,, also to be considered is a third object
K, = C \ A,(co) = { z E C : p,"(z) stays bounded for all k) sometimes called the filled-in Julia set. Obviously, we have that
dK, = J , = dA,(co), i.e., J , separates competition between orbits being attracted to co and orbits remaining bounded as k 4 00. Each filled-in Julia set K , is either connected (one piece) or a Cantor-like set (dust of infinitely many points). The Mandelbrot set M is
M = {c
E C : K , is connected} = { c E
C : 0 E K,}.
The paper is organised as follows. Firstly, after describing briefly the most widely used sequential methods for constructing the Mandelbrot set, we present efficient sequential algorithms for visualisation purposes. As examples we give sequential algorithms in the form of ready-teuse code to attack the problem of determining the Mandelbrot set. Next, we compare all the implemented sequential methods with each other in order to find the best balance between speedup and accuracy. Finally, some conclusions are drawn along with a discussion of technical issues. 2. Comparative results
Two basic characteristics of the algorithms are compared: the speed with which they compute the Mandelbrot set and the efficiency with which they display it to the computer screen. Figure 1 demonstrates the Mandelbrot
150
Figure 1. T h e Mandelbrot set obtained by (a) the BSM, (b) the LSM, (c) the LSM but showing the border of the encirclements, (d) the CPM and (e) the DEM.
set obtained by some of the methods presented here. A first table presents the sequential runtime measured for each of the six methods (BSM, MBSM, LSM, CPM, DEM, DBM) for the computation of the Mandelbrot set. A first observation from these results concerns the increase of the runtimes, while increasing the resolution of the images. A second observation concerns the low runtime obtained for the BSM and MBSM methods; the latter is obviously an improvement of the former. The LSM, the CPM and the DBM are the fastest methods. The DEM is the slowest method with a slight difference from the BSM. Of course the result is worth such a delay! A second table presents the efficiency measured for each of the six methods (BSM, MBSM, LSM, CPM, DEM, DBM) for the computation of the Mandelbrot set. When we speak about efficiency we mean the quality of the resultant picture, i.e. how accurate the graphical representation of the fractal set is. The more efficient method is the DEM; for that, it is the slowest. Sufficiently satisfactory results are obtained also with the other methods, except for the fact that BSM, LSM, CPM do not demonstrate clearly the connectedness of the Mandelbrot set.
151
3. Conclusions The current implementation of the algorithms mentioned before is written in Microsoft Visual C++ 6.0. Time results are given in CPU seconds on a Pentium I1 PC with a 233 MHz CPU clock running Windows 2000 SP3. As can be easily extracted from the comparison analysis of the preceding section, the DBM is the best method (over all measures) for visualising the Mandelbrot set. It is well known that DEM is one of the more accurate methods to obtain the best quality pictures of this fractal set. The third best method is the MBSM and then following, in order, the BSM, CPM, and LSM. Hence, depending on the sought-after fractal set, a compromise between runtime and accuracy must be made. References 1. Barnsley, M. F., fiactals everywhere, 2nd ed., Academic Press Professional, 1993. 2. Drakopoulos V., Mimikou N. and Theoharis T., Comput. €4 Graph. 27,(2003). 3. Hepting, D., Prusinkiewicz, P. and Saupe, D.: Rendering methods f o r iterated function systems, in Peitgen, H.-O., Henriques, J. M. and Penedo, L. F. (eds), fiactals in the fundamental and applied sciences, North-Holland, pp. 183-224, (1991). 4. Hoggar, S. G.: Mathematics f o r computer graphics, Cambridge Univ. Press, 1992. 5. Peitgen, H.-0. and Richter, P. H., The beauty of fractals, Springer-Verlag, 1986. 6. Peitgen, H.-0. and Saupe, D. (eds), The science of fractal images, SpringerVerlag, 1988.
STRUCTURED MATRIX PERTURBATIONS FOR MOLECULAR CONFORMATIONS
IOANNIS Z. EMIR.IS* AND THEOD0R.E G. NIKITOPOULOS Department of Informatics €4 Telecommunications National University of Athens, Athens, Greece E-mail:
[email protected],
[email protected]
Three-dimensional molecular structure is fundamental in drug design and discovery, docking, and chemical function identification. The input t o our algorithm consists of a set of approximate interatomic distances or distances constrained in intervals of varying precision; some are specified by the covalent structure and others by NMR experiments and application of the triangle inequality. The output is a valid molecular conformation in a specified neighborhood of the input. We aspire that our approach helps in detecting outliers of the NMR experiments, and that it manages t o handle partial inputs. Numerical linear algebra methods are employed for reasons of speed and accuracy. The main tools include, besides iterative local optimization, distance geometry and matrix perturbations for minimizing singular values of real symmetric matrices. Our algorithm is able to bound the number of degrees of freedom on the conformation manifold. A public domain MATLAB implementation is described; it can determine a conformation of a molecule with 20 backbone atoms in 3.79sec on a 500MHz PENTIUM-111.
1. Introduction
Our method, given a known valid conformation, explores nearby conformations lying on the same manifold, hence also topologically close to each other. This answers the need of biased sampling in order to avoid previously sampled configurations. Our algorithm offers the freedom to choose any direction during the exploration. By systematically sampling the conformation space, we can compute several possible geometries when the input are interval constraints. For molecules or molecular substructures with few degrees of freedom (up to about lo), our methods are able to fully enumerate all realizable conformations. In addition, the algorithm is able to bound the dimension of the manifold of all allowable conformations. *Work partially supported by Project 70/4/6452 of the Research Committee of the National University of Athens, Greece. 152
153
In this work we propose numerical linear algebra methods for computing conformations of geometrically constraint molecules. We formulate molecular embedding as a structured singular value (or eigenvalue) minimization problem: Given distance approximations (or interval constrains, respectively), the aim is to find values near the given approximations (or in these intervals) so that the structure be embedded in euclidean space R3. Our algorithm is based on iterative local optimization and extends . Modeling the molecular problem in algebraic terms is achieved by distance geometry. There are n points corresponding to the backbone atoms with &j, i,j E (1,.. . , n} denoting the euclidean distances. Consider the symmetric distance matrix (or Cayley-Menger matrix) D(1, ,. . ,n) which contains the adjacency matrix as a (principal) submatrix (with squared distances) and an additional row and column of units, with the diagonal being zero. A classical theorem states that n 2 4 distinct points (not all coplanar) are embeddable in lR3 iff rank(D(1,. . . , n)) = 5.
’
2. Matrix perturbations
Let R be a matrix with the same dimensions as matrix M and q(’) denote the k-th singular value. Linear algebra theory states that f(E) := Ok(M ER) is differentiable with respect to real variable E as long as Ok ( M - JR) is distinct from all other singular values, for all J. For M real and symmetric, f’(J) = -uTRv, where u,v are the k-th singular vectors of M - [R. An analogous result applies to eigenvalues and eigenvectors. In order to define the allowed perturbations in a general way, let Rij be a perturbation matrix having the same dimensions as M and zeros in all entries except of units at entries (i,j) and (j,i), where 1 5 i < j 5 N. The number of independent Rij is p , and for Cayley-Menger matrices corresponding to n points, we have p 5 n(n - l)/2,n = N - 1. Let R be a subspace of symmetric square matrices of dimension p generated by the Rk,kj, 1 < k i < k j I N : R={C:=iJkRk,kj : [ J 1 , . . . , J p ] E R P } . The algorithm in ’, specialized to a square real symmetric N x N matrix M , consists of the following steps: (0.) Initialize the perturbation matrix R E R,possibly to the zero matrix. (1.) Compute the SVD decomposition M - R = U C V T , where the N-th singular value and vectors are denoted by O N ,U N , U N respectively. (2.) Let perturbation matrix A E R have minimum norm such that it minimizes ~ ~ u Z A V ~ - R ) v ~ l l .(3.) - uZ(M Let Q c IluZA(1- VNV:)(M - R)+All, y + min(1, I ~ u ~ A v N I I / ( ~ ~ and R + R yA. (4.) If rllAll/llM - RII is smaller than some given
+
154
tolerance, the algorithm stops; otherwise, go to step 1. The main theorem in 2 states that, if ||A|| is always bounded, then, the above algorithm makes
A € K, n < j < N,
(1)
and set, in the next step, R <— R + A. Hence, if some Cayley-Menger matrix is sufficiently close to a given approximate distance matrix D, then a Cayley-Menger matrix exists and is unique iff the solution of (1) exists and is unique. 3. Cycloalkanes Table 1. Computing one cyclic conformation. Cyclohexane heptane octane nonane decane endecane dodecane
dof 6
7 8 9 10 11 12
Init.eigval 1.56e-01 1.45e-01 l.lle-01 1.24e-01 1.64e-01 2.10e-01 1.32e-01
Fin.eigval 3.72e-08 1.31e-08 4.65e-07 3.31e-08 6.86e-07 1.43e-06 7.41e-08
Iter. 3 3 3 3 3 3 5
[aecj 0.02 0.03 0.05 0.08 0.12 0.18 0.27
[KFlop] 26
38 54 80
119 183 281
For the cyclohexane, our method can compute all conformations. We have applied a perturbation in the range of 10% on interatomic distances and angles in order to destroy the molecule's symmetry and produce a finite number of conformations. Our method gives results as good as fully rigorous algebraic methods 3'4 in that we obtain at most 4 solutions, namely 2 chair and 2 boat conformations, as encountered in nature. The number of solutions upper bounds the number of connected components of the manifold, provided the input is generic (in practice, random). For the cycloheptane, the Cayley-Menger matrix has 7 unknown entries. We completely explored each 1-dimensional manifold with a step of 0.05A, obtaining e.g., more than 50 valid conformations with u\ in the range [8.586,11.290]A. We see that the matrix entry ui is constrained by MI < 11.29A. While exploring the one-dimensional manifold and after some iterations, if u\ is increased beyond this bound, then (1) cannot be satisfied.
155
After some computation we conclude that the manifold dimension cannot be larger than one. Hence we confirmed that there are two 1-dimensional conformation manifolds 5. For the cyclooctane, the Cayley-Menger matrix has 12 unknowns. Our methods find that the dimension of the conformation manifold is > 2. By extracting certain perturbation matrices, just as for the cycloheptane, we conclude that the dimension is 2, confirming 4. Table 2. Computing one conformation on a 500MHz PENTIUMIII
dof 7 8 9 10 11 12
13 14 15 16 17 18 19 20
Init. eigval 2.98e-02 2.57e-02 2.10e-02 2.38e-02 3.16e-02 8.13e-02 8.09e-02 3.72e-02 3.53e-02 3.78e-02 3.83e-02 3.53e-02 3.80e-02 4.00e-02
Final eigval 6.64e-14 4.43e-12 6.29e-ll 2.95e-13 2.60e-12 1.20e-07 8.49e-08 6.04e-13 2.02e-14 1.72e-12 1.70e-13 3.93e-13 4.59e-14 7.09e-13
Iterations 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[sec| 0.01 0.05
0.05 0.11 0.16 0.22 0.30 0.49 0.77 1.15 1.54 2.14 2.91 3.79
[KFlop] 36 49 73 109 165 282 450 606 940 1404 2082 3039 4344 6136
References 1. T.G. Nikitopoulos and I.Z. Emiris, Structured eigenvalue optimization in distance geometry, In Hellenic-European Conf. Computer Math. & Appl, Athens, Greece, pp. 451-454, September 2001. 2. M.A. Wicks and R..A. Decarlo, Computing most nearly rank-reducing structured matrix perturbations, SIAM J. Matrix Anal, and Appl. 16(1), 123137(1995). 3. I.Z. Emiris and B. Computer algebra methods for studying and computing molecular conformations, Algorithmica - Special Issue on Algorithms for Computational Biology, 25, 372-402(1999). 4. N. Go and H.A. Scheraga, Ring closure and local conformational deformations of chain molecules, Macromolecules, 3(2), 178-187(1970). 5. G.M. Grippen, Exploring the conformation space of cycloalkanes by linearized embedding, J. Comput. Chem., 13, 351-361(1992).
A NEW ALGORITHM TO OBTAIN ALL MAXIMUM COMMON SUBGRAPHS IN MOLECULAR GRAPHS USING BINARY ARITMETHIC AND CONSTRAINTS SATISFACTION MODEL G. CERRUELA GARC~A,I. LUQUE RUIZ, M. A. G~MEZ-NIETO Department of Computing and Numerical Andisis. University of Cdrdoba Campus Universitariode Rabanales, Building C2, Plant 3, E-14071 Cdrdoba (Spain) E-mail:
[email protected]
In this work we propose a new algorithm to find out graphs isomorphism. This algorithm has been applied to calculate the similarity index in chemical compounds, representing the molecular structures like colored graph and these graphs as vectors on n-dimensional spaces. Hereby is possible to reduce the maximum common structure detection problem to a simple vector problem.
1.
Introduction
Chemical compounds are usually represented as coloured and non-directed graphs, named molecular graphs in Computational Chemistry [ 11 We can write a molecular graph like a characteristic vector defined in a multidimensional space, where each vector component represents a dimension determined by a characteristic presents in the molecular graph. Thus, it is possible to reduce the problem to obtain all maximum common subgraphs between two known molecular graphs, to a constraint satisfaction problem [2] defined as the k-dimensional space in which can be represented the higher number of common components of the characteristic vectors given. In this way a molecular structure can be represented as a molecular graph G= (V, E, F), where V represents the graph vertexes, E, the edges and F is a function that assigns the type of relation (edge) between each vertex vi and vj. This graph can be represented using an associate characteristics vector CG,
-
4
where each element C G ( ~ )represents , the existent subgraphs in G formed by two vertexes vi, v, and one edge qj characterized by the function F(vi, v,), the one depth subgraphs involves in G.
156
157 2. The Isomorphism model If two representative Characteristics in the G graph share one or more vertices, we can affirm that they are related. The set of relationship between characteristics of the molecular graph can be written as: c&=cGY (1) where: C&: is the array of relationship between characteristics, CG is the characteristics array of G graph, and Y is coefficients matrix xu that gives information if two characteristic cgi, cgj are related, taking value 1 when cgi and cgj characteristic have a common vertex, and zero otherwise. Thus, we can represent the topological structure of two chemical compounds A and B over the n and m dimensional spaces using the characteristic and relationship vectors CA, CRA and CB, CRB, The common topological structure to both molecules can be obtained finding a common dimensional space k, k I min(n,m), in which can be represented the common topological structures through the representation in this space of every one depth molecular subgraphs common to A and B graphs.
3. Matching-Characteristics Algorithm (MCA) The MCA algorithm carries out the graph isomorphism finding the common k dimensional plane in which the characteristics and relationships of A and B graphs are represented, trying to find the isomorphism between the relationship present in the characteristics arrays, that is, between CRAand CRBarrays. The isomorphism process between graphs needs to establish an overlapping constraint, which can be defined in the following way: Constraint 1: Two characteristic a, & CAand bj & CBrepresent the same space (dimension) if both are constituted by the same type of two colored nodes related by the same type of edge (the same one depth subgraph). Constraint 2: The common components of A and B characteristics arrays represented in the isomorphic k-dimensional space satisfy the previous constraint and preserve the relationships present in the correspondents CRA and CRBarrays. Step 1: All possible common k dimensional spaces to CA and CB are obtained through the cross product of the A and B characteristics arrays. Then the array: V = CAx CB CD is obtained, where @ is the coefficients matrix xi,, that take value 1 when the characteristics aci and bc, satisfy the imposed restriction, and 0 otherwise.
158
Step 2: This step find inside the possible set of common characteristics between A and B, represented by V, those characteristics sharing a maximum set of relationships, which are represented in the CRAand CRB arrays For this, is defined an H* array with a k x k size, being k the number of elements of the V array that satisfy the restriction 1 (take value 1 in the array). The elements in H* array, are obtained in the following way: Step 2.1: The V array is visit, choosing all the elements that have the value 1 (k elements). Step 2.2: For each pair of selected elements from V, is verified if CAand CB components are equal or these components are related in the correspondents CRA and CRB arrays (fulfillment the restriction 2). In this case is realized the cross product over the result of the dot product between the CA and CB components of the pair of elements selected from V, so that: (CA YA)" (CA 'UA)'J x (CB YB)" (CB (2) Step 2.3 The result of the previous operations is verified, taking the following considerations. a) It is obtained an element ( ~ o c b ) that does not satisfy the restriction 2, that is, not exist inside k elements that take the value 1 in the array V. In this case, these results are despised. b) It is obtained an element (COC~) E V that satisfies the restriction 2,
v
which indicates that elements Vi and Vj are connected through the element (cacb) in the common isomorphism hyperplane, being updated the array H* as: H*(Vi, V,) = H*( V,, Vi) = C&b, and H*(Vi, Vi) = H*(V,, V,) = 1 (3) c) The operation gives a null value, no element is obtained (CA (CA YA)vJ=(CB YB)vi (C, YB)"' = NULL, indicating that the elements Vi and V, are directly connected in the common isomorphism hyperplane. Being updated the H* array as follows: H*(Vi, Vi) = H*(Vj, Vj) = 1 (4) At the end of the process, the array H* contains all up to two depth isomorphic subgraphs between the graph A and B, reducing the problem to find the array H* subspaces that make maximum the representation of A and B characteristics. Step 3: The redundancies in the array H* are analyzed, being eliminated under the following criteria: a) Those redundant elements that do not appear as solution in any operation fulfilled according to the equation 2, are eliminated.
159 The characteristics of the redundant elements in the relationship arrays are verified; being eliminated those elements in which both CAand CBcomponents have different characteristic. Step 4: Finally, the array H* is visit and the array of matching M containing the isomorphic subgraphs is extracted. b)
4. Results For initial effectiveness evaluation of proposed MCA algorithm was selected a natural compounds database, calculating the Tanimoto [ 13 similarity index for each compounds regarding to the rest of elements stored in the database. This index relate the elements of the molecular graphs (nodes and edges) to the common structures in both graphs using the following expression: T=nc/(ngA+ngB-nc), where: nc is the common elements in both graphs, ngA is the elements of the molecular graph A, and ngB is the elements of the molecular graph B. As example we show the results obtained for a little group of compounds of the total evaluated. n
n
""
JN!N
0 ;
b
0Q
A B C D E
A 1.ooo
B 0.795 1.ooo
C 0.707
0.850 1.ooo
D
N N,
'' \"I
. <
0
N-~'^H
0
0
E
0.538
0.447
0.615 0.714 1.ooo
0.500 0.412
0.243 1.ooo
References
1. Rouvray, D.H.; Balaban, A.T. Chemical Applications of Graph Theory. Applications of Graph Theory. R.J. Wilson and L.W. Beineke, eds. Academic Press., pp. 177-221, 1979. 2. Larrosa, J.; Valiente, G. Graph Pattern Matching using Constraint Satisfaction. The European Joint Conferences on Theory and Practice of Software (ETAPS) 2000.
IMPLEMENTING APPROXIMATE REGULARITIES EXTENDED ABSTRACT
MANOLIS CHRISTODOULAKIS, COSTAS S. ILIOPOULOS Department of Computer Science King’s College London {manolis: csi} Qdcs.kcl. ac.uk
KUNSOO PAR.K Department of Computer Engineering Seoul National University kparkQthwry. snu. ac.kr
JEONG SEOP SIM Electronics and Telecommunications Research Institute Daejeon 305-350, Korea
[email protected] In this paper we study approximate regularities of strings, that is, approximate periods, approximate covers and approximate seeds. We explore their similarities and differences and we implement algorithms for solving the smallest distance approximate per-iod/cover/seed problem and the restricted smallest approximate period/cover/seed problem in polynomial time, under a variety of distance rules (the Hamming distance, the edit distance, and the weighted edit distance). We then analyse our experimental results t o find out the time complexity of the algorithms in practice.
1. Introduction
Finding regularities in strings is useful in a wide area of applications which involve string manipulations, such as molecular biology, data compression and computer-assisted music analysis. Typical regularities are repetitions, periods, covers and seeds. In applications such as molecular biology and computer-assisted music analysis, finding exact repetitions is not always sufficient. A more appropriate notion is that of approzimate repetitions [2, 3, 41, where errors are allowed. In this paper, we consider three different kinds of approximation: 160
161
the Hamming distance, the edit disctance and the weighted edit distance. Sim, Iliopoulos, Park and Smyth showed polynomial time algorithms for finding approximate periods [5] and, Sim, Park, Kim and Lee showed polynomial time algorithms for the approximate covers problem in [6].More recently, Christodoulakis, Iliopoulos, Park and Sim showed polynomial time algorithms for the approximate seeds problem [l]. In this paper we implement and compare the algorithms given in [ 5 , 6, 11. 2. Preliminaries 2.1. Distance functions
We call the distance 6(x,y) between two strings x and y, the minimum cost to transform one string x to the other string y. The special symbol A denotes the absence of a character (i.e. an insertion or a deletion occurs). The edit or Levenshtein distance between two strings is the minimum number of edit operations that transform one string into another. The edit operations are insertion, deletion and substitution, each of cost 1. The Hamming distance between two strings is the minimum number of substitutions that transform one string to the other. We also consider a generalized version of the edit distance model, the weighted edit distance, where each insertion, deletion, substitution has a different cost, stored in a penalty matrix. Comparing two characters a, b under the hamming distance model ( a ,b E C) or under the edit distance model ( a , b E C u (A}) takes constant time: the cost is 0 if a = b, and 1 if a # b. On the other hand, under the weighted edit distance model, to get the distance between a and b we need to search the penalty matrix, a procedure that introduces an extra cost in terms of time consumption. 2 -2. Approximate Regularities
2.2. Approximate Regularities

Let x and s be strings over Σ*, δ be a distance function, t be an integer, and s_1, s_2, ..., s_r (s_i ≠ ε) be strings such that δ(s, s_i) ≤ t, for 1 ≤ i ≤ r. s is a t-approximate period of x if and only if there exists a superstring y = xu (right extension) of x that can be constructed by concatenating copies of the strings s_1, s_2, ..., s_r. s is a t-approximate cover of x if and only if x can be constructed by overlapping or concatenating copies of the strings s_1, s_2, ..., s_r. s is a t-approximate seed of x if and only if there exists a superstring y = uxv (right and left extensions) of x that can be constructed by overlapping or concatenating copies of the strings s_1, s_2, ..., s_r.

3. Problem Definitions and Solutions
3.1. Smallest Distance Approximate Period/Cover/Seed Problem

Definition 3.1. Let x be a string of length n, s be a string of length m, and δ be a distance function. The Smallest Distance Approximate Period/Cover/Seed problem is to find the minimum integer t such that s is a t-approximate period/cover/seed of x. There are two steps involved in solving this problem:
(1) Compute the distance between s and every substring of x. Let w_{i,j} be the distance between s and x[i..j], for 1 ≤ i ≤ j ≤ n. The value of each w_{i,j} is computed by using dynamic programming.

(2) Compute the minimum t such that s is a t-approximate period/cover/seed of x. Initially, t_0 = 0. For i = 1 to n, we compute

    t_i = min_{0 ≤ h < i} { max { min_{h ≤ j < i} t_j , w_{h+1,i} } }

The value t_n is the minimum t such that s is a t-approximate period/cover/seed of x.

Step 1: In the case of approximate seeds, we add the following two rules to the first step of the algorithm:

(A) w_{1,j} = min_{1 ≤ k ≤ m} δ(s[k..m], x[1..j])

(B) w_{i,n} = min_{1 ≤ k ≤ m} δ(s[1..k], x[i..n])

Rule (A) represents the left extension and rule (B) the right extension. Obviously, for approximate periods we use only rule (B), and for approximate covers we use neither rule. Both (A) and (B) can be realized in constant time as follows: instead of computing a new D-table between each s[1..k] (resp. s[k..m]) and x[i..n] (resp. x[1..j]), we just make one D-table between s and x[i..n] (resp. between s^R and (x[1..j])^R) and take the minimum value of the last column of this table (this does not hold for the Hamming distance, because in that case we do not use D-tables).

Step 2: The inner loop (min_{h ≤ j < i} t_j) is evaluated for each i when applying the recurrence above.
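For illustration, the second step can be written compactly once the distance table w is available. The sketch below (an assumption, not the authors' code) implements the recurrence for the overlapping case used by covers and seeds; for approximate periods the inner minimum would be replaced by t[h] alone:

```python
def smallest_distance(n, w):
    """w[h][i] = distance between the candidate s and x[h..i] (1-indexed).
    Returns the minimum t such that s is a t-approximate cover/seed of x[1..n]."""
    INF = float("inf")
    t = [INF] * (n + 1)
    t[0] = 0
    for i in range(1, n + 1):
        best = INF
        inner = INF                      # min of t[j] for j between h and i-1
        for h in range(i - 1, -1, -1):   # as h decreases, the inner min grows one term at a time
            inner = min(inner, t[h])
            best = min(best, max(inner, w[h + 1][i]))
        t[i] = best
    return t[n]
```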
3.2. Restricted Smallest Approximate Period/Cover/Seed Problem
Definition 3.2. Given a string x of length n, the Restricted Smallest Approximate Period/Cover/Seed problem is to find a substring s of x such that: s is a t-approximate period/cover/seed of x and there is no substring of x that is a k-approximate period/cover/seed of x for any k < t. The approach we use to solve this problem is to consider every substring s of x, of length less than |x|/2, as a candidate period/cover/seed and run the algorithm described in the previous section for each s. Since the length of s is not fixed in this case, we use a relative distance function (rather than an absolute distance function); that is, an error ratio, in the case of the Hamming or edit distance, or a weighted edit distance. (Other ways of solving this problem also exist, but since they do not offer any improvement in the time or space complexity, we prefer this simpler method.)
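A brute-force sketch of this outer loop, reusing the hypothetical routines of the previous example (the normalisation by |s| is only one possible choice of relative distance and is an assumption here):

```python
def restricted_smallest(x, distance_table_for, smallest_distance):
    """Try every substring s of x with |s| < |x|/2 as a candidate period/cover/seed.
    distance_table_for(s, x) and smallest_distance(n, w) stand for the routines
    that build the w-table and evaluate the recurrence of Section 3.1."""
    n = len(x)
    best_t, best_s = float("inf"), None
    for i in range(n):
        for length in range(1, max(1, n // 2)):
            s = x[i:i + length]
            if len(s) < length:
                break
            w = distance_table_for(s, x)
            t = smallest_distance(n, w) / len(s)   # illustrative error ratio
            if t < best_t:
                best_t, best_s = t, s
    return best_s, best_t
```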
4. Experimental Results

4.1. Experimental Environment

The algorithms were implemented in C++ using the Standard Template Library (STL), and run on a Pentium-4M 1.7 GHz system with 256 MB of RAM, under the Mandrake Linux operating system (v8.0). The dataset we used to test the algorithms is the nucleotide sequence of Saccharomyces cerevisiae chromosome IV.

4.2. Performance

In all our tests, the main string x consists of the first n characters of the chromosome. In the case of the Smallest Distance Approximate Period/Cover/Seed problem we choose s, the candidate period/cover/seed, to be a random substring of the main string x, of length m = 5.
[Figure 1. Experimental results (m = 5). Panels: Smallest Distance Problem - Edit Distance; Smallest Distance Problem - Hamming Distance; Restricted Smallest Problem - Edit Distance.]
Figure 1 summarizes our results. We can see that approximate periods, covers and seeds need exactly the same amount of time to be computed under both the edit distance model and the weighted edit distance model (but not under the Hamming distance). Comparing the efficiency of our program in the case of the edit distance with that of the weighted edit distance, we see that although these two have the same asymptotic time complexities, the constant factor associated with the weighted edit distance is larger, because of the penalty-matrix look-up time.
References
1. Manolis Christodoulakis, Costas S. Iliopoulos, Kunsoo Park and Jeong Seop Sim, Approximate Seeds of Strings, in preparation.
2. Maxime Crochemore, Costas S. Iliopoulos and H. Yu, Algorithms for computing evolutionary chains in molecular and musical sequences, Proc. 9th Australasian Workshop on Combinatorial Algorithms, pp. 172-185 (1998).
3. Gad M. Landau and Jeanette P. Schmidt, An algorithm for approximate tandem repeats, Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching, Padova, Italy, LNCS 684, Springer-Verlag, Berlin, pp. 120-133 (1993). URL = citeseer.nj.nec.com/landau93algorithm.html.
4. Jeanette P. Schmidt, All highest scoring paths in weighted grid graphs and its application to finding all approximate repeats in strings, SIAM Journal on Computing 27(4), 972-992 (1998).
5. Jeong Seop Sim, Costas S. Iliopoulos, Kunsoo Park and William F. Smyth, Approximate Periods of Strings, Theoretical Computer Science 262, 557-568 (2001). URL = citeseer.nj.nec.com/467710.html.
6. Jeong Seop Sim, Kunsoo Park, S. Kim and J. Lee, Finding Approximate Covers of Strings, Journal of Korea Information Science Society 29(1), 16-21 (2002).
DISTRIBUTED SUFFIX TREES AND THEIR APPLICATION TO LARGE-SCALE GENOMIC ANALYSIS
RAPHAEL CLIFFORD and MAREK SERGOT Department of Computing, Imperial College, London. E-mail:
[email protected] and
[email protected] We have recently presented a variant of the suffix tree which allows much larger genome sequence databases t o be analysed efficiently. The new data structure, termed the distributed s u m tree (DST), is designed for distributed memory parallel computing environments (e.g. Beowulf clusters). It tackles the memory bottleneck by constructing subtrees of the full suffix tree independently. The standard operations on suffix trees of biological importance are easily translatable t o this new data structure. While none of these operations on the DST require inter-process communication, many have optimal expected parallel running times.
1. Introduction
The suffix tree is the key data structure of computational pattern matching, allowing a multitude of sophisticated operations to be performed efficiently (see e.g. [1, 5]). In the field of bioinformatics these operations include whole genome alignment [3], analysis of repetitive elements [8], and fast protein classification [4], amongst many others. However, the main obstacle to more widespread acceptance of these methods remains that of memory use. Suffix trees have high memory overheads, and the poor memory locality, both of their construction and of their querying algorithms, makes disk-based implementations highly problematic [7]. We have presented in [2] two new data structures for problems of intermediate size, that is, problems larger than can be handled by existing suffix tree/array methods but small enough that the input can be stored entirely in real memory (a range of at least an order of magnitude). To give some indication, the new methods allow us to store and analyse the whole human genome, perform cross-species pattern matching on all available bacterial genomes at once, or search a large EST database, using a small cluster of standard PCs. The data structures are termed the distributed suffix tree (DST) and the paged suffix tree (PST). Both are based on a new extension of Ukkonen's suffix tree construction algorithm [9] which allows subtrees of a suffix tree to be constructed efficiently in space proportional to the size of the resultant data structure and not the whole suffix tree. This enables a suffix tree to be distributed over a number of computing nodes and queried in parallel (the DST), or for a single node to compute independent subtrees successively (the PST). By effectively splitting the input string lexicographically (not into contiguous substrings), it can readily be shown that all the most popular biologically inspired operations on suffix trees exhibit optimal or near optimal parallel speedups. Furthermore, problems which would previously have been impossible to solve due to their size can now be tackled efficiently, either in parallel or serially, and with modest hardware requirements. Here we will focus on the distributed version, the DST. The DST construction algorithm has been implemented in C on an 8-processor distributed memory parallel computer, increasing by a factor of 7.65 the size of the largest database that could be indexed. Exact set matching and repeat finding procedures for random data have also been implemented and performed on the DST. The results showed substantial speedups (with average efficiencies in excess of 90% and 99%, respectively) and exhibited good scalability, confirming the theoretical analysis. For systematically biased genetic data, preliminary results show that simple load balancing schemes can successfully increase the parallel efficiency of biological operations to close to 90%. The method is simple to apply. Almost any current bioinformatic technique that relies on suffix trees can be modified to take advantage of DSTs, greatly extending the range of problem sizes that can be tackled. Also, complex or time consuming queries, such as the preliminary stages of matching all ESTs against the human genome, can be performed with optimal or near optimal efficiency in parallel. In the next section we first describe the new data structure. We then present the expected time efficiencies of a representative sample of operations on the DST.
2. Distributed Suffix Trees
A suffix tree of input string t is a compacted trie of the suffixes of t. We define a sparse suffix tree (SST) of input string t to be a compacted trie of a subset of the suffixes of t. Here, we are interested in the special case where all the suffixes in this subset start with the same prefix z, and assume from now on that all SSTs are of this type. Distributed suffix trees (DSTs) are simply collections of SSTs defined in this way. Usually, a single SST will be held at each computing node and the union of the path labels of the leaves of these SSTs will be the full set of suffixes of t. In other words, every suffix of t will be represented by exactly one SST at exactly one of the computing nodes. An example DST and the corresponding standard suffix tree are given in Figures 1 and 2. In this case the prefixes for the 6 different SSTs are aa, ac, ca, cc, a$ and $. Each SST has been connected to a central root node. The main difference with the sparse suffix links is that in the standard suffix tree the suffix links can point across the full width of the tree. In the DST the new links point only to nodes that are within the same SST. This allows the SSTs to be constructed independently without any inter-process communication.
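The lexicographic split can be pictured with a toy sketch (illustrative Python only; the actual DST implementation is in C and builds Ukkonen-style subtrees): each computing node receives exactly the suffixes that begin with its assigned prefix z, so each node can build its SST without communicating.

```python
from collections import defaultdict

def split_suffixes(t, z_len=2):
    """Assign each suffix of t (terminated by '$') to the bucket responsible for
    its first z_len characters.  Every suffix lands in exactly one bucket, so
    each bucket's sparse suffix tree can be built independently."""
    buckets = defaultdict(list)
    for i in range(len(t)):
        prefix = t[i:i + z_len]      # suffixes shorter than z_len near the end form their own buckets
        buckets[prefix].append(i)    # store the suffix by its starting position
    return buckets

text = "aacacccacacaccacaaa$"
for prefix, starts in sorted(split_suffixes(text).items()):
    print(prefix, starts)            # reproduces the six prefixes aa, ac, ca, cc, a$ and $
```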
Figure 1. The SSTs for aacacccacacaccacaaa$ with their respective root nodes labelled T_aa, T_ac, T_ca, T_cc, T_a$ and T_$. The sparse suffix links for the valid sets V_aa, V_ac, V_ca, V_cc, V_a$ and V_$ are marked with dashed arrows. Note that the final suffixes, a$ and $, are included but typically will not be used.
2.1. Operations on the DST

Gusfield [5] provides what can be regarded as a canonical list of major bioinformatic techniques on suffix trees. All are readily translated to a DST without incurring any communication overheads. We summarise the analysis of five representative problems from this list. They are Longest Common Substring, Exact Local Matching (LCS, ELM), Maximal Repeat Finding (MRF), All Pairs Suffix-Prefix (APSP) [6] and Exact Set Matching

Figure 2. The standard suffix tree of aacacccacacaccacaaa$ with standard suffix links. This is for comparison with the merged tree in Figure 1. See the text for further explanation.
(ESM). Full descriptions along with their serial solutions using suffix trees can be found in Gusfield [5] and elsewhere. Because we are interested in average case and not worst case analysis, we make the commonly used assumption that the input characters are independent and uniformly distributed (i.u.d.). In practice, this assumption may not hold, of course, requiring some form of load balancing for systematically biased data. We suppose that there are k computing nodes and assume for simplicity that k = σ^|z|, where σ is the alphabet size and z is the fixed prefix. Table 1 compares the expected running times for the solution of these five problems using the fastest serial method (based on a standard suffix tree) and a parallel method (based on distributed suffix trees).

[Table 1. Post-construction average time complexities for five different problems using standard and distributed suffix trees with k computing nodes; r is the number of strings for the All Pairs Suffix-Prefix problem and the number of patterns for Exact Set Matching. Rows: LCS and ELM, APSP, ESM; the only entries legible in the scan are O(n + r²) and O(r log n).]
Experimental results on random data showed substantial speedups consistent with the theoretical analysis. For systematically biased biological data, we were able to devise simple load balancing schemes that increased the parallel efficiency of the operations to close to 90%. For example, we were able to increase the parallel efficiency of maximal repeat finding on human chromosome X using 16 computing nodes from 61% to 89%. Simple load balancing schemes for the other problems listed above gave similar improvements in efficiency for real biological data.

Acknowledgements

This work was undertaken while the first author was a PhD student at Imperial College London supported by an EPSRC studentship. We should like to thank Costas Iliopoulos and Wojtek Rytter for a number of valuable suggestions.

References
1. A. Apostolico. The myriad virtues of subword trees. In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words, volume F12 of NATO ASI Series, pages 85-96. Springer-Verlag, 1985.
2. R. Clifford and M. Sergot. Distributed and Paged Suffix Trees for Large Genetic Databases. Proc. 14th Annual Symposium on Combinatorial Pattern Matching, 2003.
3. A. Delcher, S. Kasif, R. Fleischmann, J. Peterson, O. White, and S. Salzberg. Alignment of whole genomes. Nucleic Acids Research, 27(11):2369-2376, 1999.
4. B. Dorohonceanu and C. Nevill-Manning. Accelerating protein classification using suffix trees. In Proc. 8th International Conference on Intelligent Systems for Molecular Biology (ISMB), pages 126-133, 2000.
5. D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
6. D. Gusfield, G. M. Landau, and D. Schieber. An efficient algorithm for the all pairs suffix-prefix problem. Information Processing Letters, 41:181-185, 1992.
7. J. Karkkainen and E. Ukkonen. Sparse suffix trees. In COCOON '96, Hong Kong, LNCS 1090, pages 219-230. Springer-Verlag, 1996.
8. S. Kurtz and C. Schleiermacher. REPuter: Fast computation of maximal repeats in complete genomes. Bioinformatics, 15(5):426-427, 1999.
9. E. Ukkonen. On-line construction of suffix trees. Algorithmica, 14:249-260, 1995.
MODELING THE DETECTIVE QUANTUM EFFICIENCY OF SCINTILLATORS USED IN MEDICAL IMAGING RADIATION DETECTORS

A. EPISKOPAKIS, D. NIKOLOPOULOS, K. ARVANITIS
Department of Medical Instruments Technology, Technological Educational Institution of Athens, 122 10, Athens, GREECE

N. DIMITROPOULOS
'Euromedica' Medical Center, Athens, GREECE

G. PANAYIOTAKIS
Department of Medical Physics, University of Patras, Patras, GREECE

D. CAVOURAS and I. KANDARAKIS
Department of Medical Instruments Technology, Technological Educational Institution of Athens, 122 10, Athens, GREECE
E-mail:
[email protected]
The purpose of this work was to model and compare the zero-frequency detective quantum efficiency (DQE(0)) of Gd2O2S:Tb, Gd2O2S:Eu, Gd2O2S:Pr, Gd2O2S:Pr,Ce,F and YTaO4:Nb granular scintillators for use in X-ray medical imaging detectors. The work uses a mathematical model based on a photon diffusion differential equation. X-ray tube voltage was considered to vary from 10 to 200 kV, while phosphor screen thickness ranged between 10 and 160 mg/cm² coating thickness.
1. Introduction
The detective quantum efficiency (DQE) describes the efficiency of an imaging system in transferring the signal to noise ratio (SNR) from its input to its output. DQE has been expressed in terms of the statistical moments of the scintillation light pulse-height distribution. The aim of the present study was to model the detective quantum efficiency of some high performance scintillators (Gd2O2S:Tb, Gd2O2S:Pr, Gd2O2S:Pr,Ce,F, Gd2O2S:Eu, YTaO4:Nb) for use in radiation detectors of digital and computerized X-ray medical imaging systems. Gd2O2S host-based materials exhibit high X-ray absorption and X-ray to light conversion properties. This is due to the energy of their K-absorption edge (50.2 keV) and their low band-gap energy (4.5 eV) between the valence and conduction bands. In addition, there is strong interest in Ce and Pr doped materials exhibiting fast temporal response, suitable for real-time dynamic medical imaging [1, 2]. On the other hand, the Eu activator induces reddish light emission compatible with the CCD array optical sensors used in digital imaging detectors [3-5]. Finally, YTaO4:Nb is a new material recently incorporated in radiographic screens.
2. Materials and Methods
Scintillators were considered to be in the form of layers containing luminescent grains (granular scintillating screens). To model scintillator DQE, a theoretical framework based on previously published studies [3-5] was developed. Intrinsic scintillator optical coefficient data [3, 4] were determined by computer fitting techniques (Levenberg-Marquardt method) applied to previously obtained experimental data, considering Mie light scattering effects within the scintillator layers. Matlab programming was used throughout this study.
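For orientation, the zero-frequency DQE is often written (following Swank [6]) as the product of the quantum detection efficiency, QDE = 1 - exp(-mu(E)·W) for coating thickness W and energy-dependent attenuation coefficient mu(E), and a factor formed from the moments of the light pulse-height distribution. The sketch below illustrates that combination only; it ignores the depth-dependent optical transport treated by the full diffusion model of this study, and the attenuation value and moments used are placeholders, not the fitted data.

```python
import math

def qde(mu_rho, coating_thickness):
    """Quantum detection efficiency of a phosphor screen.
    mu_rho:            mass attenuation coefficient (cm^2/g) at the given X-ray energy
    coating_thickness: coating weight (g/cm^2)"""
    return 1.0 - math.exp(-mu_rho * coating_thickness)

def swank_factor(m1, m2):
    """Swank factor I = m1^2 / m2 from the first two moments of the
    (normalised) light pulse-height distribution."""
    return m1 ** 2 / m2

def dqe0(mu_rho, coating_thickness, m1, m2):
    """Zero-frequency detective quantum efficiency DQE(0) = QDE * I."""
    return qde(mu_rho, coating_thickness) * swank_factor(m1, m2)

# Placeholder numbers for illustration only: an 80 mg/cm^2 screen with
# hypothetical mu/rho and pulse-height moments.
print(dqe0(mu_rho=5.0, coating_thickness=0.080, m1=1.0e3, m2=1.3e6))
```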
3. Results and Discussion
Figure 1 shows the variation of DQE(0) with X-ray energy. The coating thickness of the screens was 80 mg/cm². The similarity between the Gd2O2S:Tb, Gd2O2S:Eu and Gd2O2S:Pr curves is due to their common X-ray attenuation coefficients. Differences in the final DQE values are due to the different ion activators. Gd2O2S:Eu showed the best performance. This was due to the lower value of the optical coefficient σ, expressing light attenuation within the scintillator mass [6]. Hence a larger number of optical quanta reach the scintillator's emitting surface. This increases the output signal and decreases noise. YTaO4:Nb had good performance at low energies, mainly due to its high quantum detection efficiency (QDE) in the low energy region. However, its high optical attenuation coefficient affects the overall performance of this material negatively. Figure 2 illustrates the variation of QDE with X-ray energy.
[Figure 1. DQE(0) versus X-ray energy (kV). Curves shown include Gd2O2S:Pr, Gd2O2S:Eu and YTaO4:Nb.]
[Figure 2. QDE versus X-ray energy (keV).]
Figure 3 shows the variation of DQE(0) with coating thickness. Again, Gd2O2S:Eu showed better results. The shape of the curves is explained by the corresponding variations in X-ray and optical quanta absorption within the scintillator mass.
[Figure 3. DQE(0) versus coating thickness (mg/cm²).]
References
1. C.W.E. van Eijk, J. Andriessen, P. Dorenbos, R. Visser, Nucl. Instr. Meth. Phys. Res. A 348, 564 (1994).
2. C.W.E. van Eijk, P. Dorenbos, R. Visser, IEEE Trans. Nucl. Sci. 41, 738 (1994).
3. I. Kandarakis, D. Cavouras, Nucl. Instr. Meth. Phys. Res. A 460, 412 (2001).
4. I. Kandarakis, D. Cavouras, Appl. Rad. Isot. 54, 821 (2001).
5. I. Kandarakis, D. Cavouras, G. S. Panayiotakis and C. D. Nomikos, Phys. Med. Biol. 42, 1351 (1997).
6. R.K. Swank, Appl. Optics 12, 1865 (1973).
BIFURCATION PHENOMENA IN MOLECULAR VIBRATIONAL SPECTROSCOPY
S. C. FARANTOS
Institute of Electronic Structure and Lasers, Foundation for Research and Technology-Hellas, and Department of Chemistry, University of Crete, Iraklion 711 10, Crete, Greece
E-mail:
[email protected]
Spectroscopic techniques such as Dispersed Fluorescence (DF) and Stimulated Emission Pumping (SEP) have revealed a new dynamical picture of small polyatomic molecules excited to very high vibrational states. Spectra of high complexity at high resolution may show regularities at low resolution, intense regular progressions of spectral lines may coexist with congested bands, and even irregular spectra may be replaced by regular ones as energy increases. Vibrational spectra of excited molecules are the fingerprints of the nonlinear mechanical behaviour of the molecule. The assignment of the spectral lines, particularly of the regular ones, and thus the extraction of dynamics, is a challenging task, since the picture which emerges defies the previous simple ideas that a molecule shows regular behaviour at low energies and chaotic behaviour above a threshold energy. Furthermore, when elementary chemical processes such as isomerization and dissociation are involved, the understanding of how a bond is broken and a new one is formed brings back fundamental questions of chemical dynamics. The established theoretical methods based on a normal mode description of the molecular vibrations, applied at energies close to the equilibrium point, are no longer valid for vibrationally excited molecules. The departure from the harmonic approximation of the potential energy surface imposes the need for the construction of accurate potential functions far from the equilibrium point and the application of nonlinear mechanics to understand the dynamics of the molecule. Polyatomic molecules stimulate new computational challenges in solving the Schrodinger equation accurately and obtaining hundreds of vibrational states. Nowadays, triatomic molecules can be treated with fully ab initio methods, both in their
electronic and nuclear part. Tetratomic molecules are more difficult to deal with, in spite of the progress which has recently been achieved. For example, six-dimensional calculations up to the energies of the isomerization of acetylene to vinylidene have been published [2]. Apart from the computational challenges, small polyatomic molecules expose conceptual and physical interpretation problems. A result of the nonlinear mechanical behaviour of a dynamical system is the simultaneous appearance of ordered motions and chaos, as well as the genesis of new types of motions via bifurcation phenomena. What are the quantum mechanical counterparts of these classical behaviours? What are the spectroscopic signatures of the nonlinear dynamics of the molecules? In cases where the vibrational spectra depict isomerization and dissociation processes, how can we identify them in the spectra? As a matter of fact, the progress of nonlinear mechanics forces us to re-examine the mechanisms of the breaking and/or the formation of a single chemical bond as it happens in the elementary chemical processes. To answer the above questions, new assignment schemes which allow the classification of quantum states in a meaningful way are required, and such novel methods have been developed and applied by our group. We use periodic orbits and bifurcation theory to explore the complex structure of the molecular phase space and from it to deduce the quantum dynamics. The theoretical background of our approach stems from several advances of semiclassical theory in the past years. Gutzwiller, using Feynman's path integral formulation, has derived a semiclassical expression for the trace of the resolvent (Green operator) of the quantum Hamiltonian operator as a sum over all isolated periodic orbits, taking into account their linear stability indices. This formula provides approximate values for the quantum eigenenergies of classically chaotic systems. Berry and Tabor derived in a different way the trace formula based on the Einstein-Brillouin-Keller quantization rule. Their formula gives the quantum density of states as a coherent summation over resonant tori, and therefore it is applicable to integrable systems. Finally, a uniform result bridging the Berry-Tabor and Gutzwiller trace formulas for the case of a resonant island chain was derived by Ozorio de Almeida [5]. Last but not least, the importance of periodic orbits for a qualitative understanding of the localization of quantum mechanical eigenfunctions came from the scarring theory of Heller [6]. It turns out that for small polyatomic molecules the probability density of eigenfunctions is accumulated along short-period stable or the least unstable periodic orbits.
Periodic orbits (PO) evolve with the energy of the system, or any other parameter in the Hamiltonian, bifurcate, and produce new periodic orbits which portray the resonances among the vibrational degrees of freedom. Generally, PO reveal the structure of phase space at different energies. For about fifteen years we have studied families of PO in molecular systems, using model and realistic potentials. We compare the classical and quantum mechanical behaviour of the molecule by constructing bifurcation diagrams of periodic orbits. Locating PO in multidimensional systems is a two-point boundary value problem. We apply advanced shooting methods to locate periodic orbits and to continue them in their parametric space [7]. We have systematically studied the PO networks for different types of molecules: triatomic and tetratomic molecules, van der Waals molecules [10], and at energies below and above dissociation [11]. The bifurcation theory of Hamiltonian dynamical systems has mainly been developed in the last half of the twentieth century. One important outcome of the theory is the identification of the elementary bifurcations, which are described by simple Hamiltonians and are valid for a broad class of Hamiltonian systems. The elementary bifurcations are the saddle-node, transcritical, pitchfork and Hopf bifurcations. Bifurcation phenomena, i.e. the change of the structure of the orbits by varying one or more parameters, are well known in vibrational spectroscopy. For example, the transition from normal to local mode oscillations is related to the elementary pitchfork bifurcation. A number of studies, classical and quantum, mainly in small molecules, revealed that isomerization and dissociation reactions are driven by another type of elementary bifurcation, the saddle-node (SN). Periodic orbits which emerge from SN bifurcations appear abruptly at some critical value of the energy, usually in pairs, and penetrate into regions of nuclear phase space which the normal mode motions can not reach. Saddle-node bifurcations are of generic type, i.e. they are robust and remain for small (perturbative) changes of the potential function [9].
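As an illustration of the two-point boundary-value formulation mentioned above, the sketch below uses single shooting to search for initial conditions that close on themselves after one period in a simple two-mode model Hamiltonian; the potential, parameter values and initial guess are hypothetical stand-ins, not one of the molecular surfaces studied by the group.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

# Hypothetical two-mode model: H = (px^2 + py^2)/2 + x^2/2 + 0.405*y^2 + 0.2*x*y^2
def grad_V(x, y):
    return np.array([x + 0.2 * y * y, 0.81 * y + 0.4 * x * y])

def flow(state0, T):
    """Integrate Hamilton's equations for time T from state0 = (x, y, px, py)."""
    def rhs(t, s):
        x, y, px, py = s
        dVx, dVy = grad_V(x, y)
        return [px, py, -dVx, -dVy]
    sol = solve_ivp(rhs, (0.0, T), state0, rtol=1e-10, atol=1e-12)
    return sol.y[:, -1]

def residual(unknowns):
    """Periodicity condition: the trajectory must return to its starting point
    after time T.  Fixing py(0) = 0 acts as a phase condition, so the system of
    four equations in four unknowns is square."""
    x0, y0, px0, T = unknowns
    s0 = np.array([x0, y0, px0, 0.0])
    return flow(s0, T) - s0

# Crude guess near the y normal mode (frequency ~0.9); fsolve refines it.
# Note: a poor guess may collapse onto the trivial equilibrium solution.
guess = [0.0, 0.5, 0.0, 2 * np.pi / 0.9]
x0, y0, px0, T = fsolve(residual, guess, xtol=1e-10)
print("periodic orbit initial condition:", x0, y0, px0, "period:", T)
```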
We initially determined the importance of SN bifurcations of periodic orbits in studies of the isomerization dynamics in double-well potential functions [12]. These PO connect the two minima and scar the isomerizing wavefunctions, i.e. eigenfunctions with significant probability density in both wells. Their birth is due to the unstable periodic orbit which emanates from the saddle point of the potential energy surface. However, even below the potential barrier a series of SN bifurcations of periodic orbits paves the way to the isomerization process.
The spectroscopic signature of SN bifurcations has been found in a number of triatomic molecules [13, 14]. HCP was the first molecule where experimental evidence for SN bifurcations was given. In the other extreme, infinite dimensional systems, such as periodic or random lattices, show spatially localized and periodic in time motions, called discrete breathers, and it has been shown that they can also be associated with saddle-node bifurcations [15, 16]. Spectroscopic evidence for the existence of discrete breathers can be found among biomolecules [17]. The emanation of SN bifurcations as a generic phenomenon in elementary chemical processes is the main theme of the talk.

References
1. H.-L. Dai and R. W. Field, Molecular Dynamics and Spectroscopy by Stimulated Emission Pumping, Advanced Series in Physical Chemistry 4 (World Scientific Publ. Co., Singapore, 1995).
2. S. Zou and J. M. Bowman, J. Chem. Phys. 117, 5507 (2002).
3. M. C. Gutzwiller, Chaos in Classical and Quantum Mechanics, Interdisciplinary Applied Mathematics 1 (Springer-Verlag, New York, 1990).
4. M. V. Berry and M. Tabor, J. Phys. A 10, 371 (1977).
5. A. M. Ozorio de Almeida, in T. Seligman (ed.), Quantum Chaos and Statistical Nuclear Physics, Lecture Notes in Physics 263, 197 (Springer-Verlag, New York, 1986).
6. E. J. Heller, Phys. Rev. Lett. 53, 1515 (1984).
7. S. C. Farantos, Comp. Phys. Comm. 108, 240 (1998).
8. S. C. Farantos, Int. Rev. Phys. Chem. 15, 345 (1996).
9. R. Prosmiti and S. C. Farantos, J. Chem. Phys., in press (2003).
10. R. Guantes, A. Nezis and S. C. Farantos, J. Chem. Phys. 111, 10835 (1999).
11. M. Founargiotakis, S. C. Farantos, H. Skokos and G. Contopoulos, Chem. Phys. Letters 277, 456 (1997).
12. S. C. Farantos, Laser Chemistry 13, 87 (1993).
13. M. Joyeux, S. C. Farantos and R. Schinke, J. Phys. Chem. A (feature article) 106, 5407 (2002).
14. H. Ishikawa, R. W. Field, S. C. Farantos, M. Joyeux, J. Koput, C. Beck and R. Schinke, Annu. Rev. Phys. Chem. 50, 443 (1999).
15. S. Flach and C. R. Willis, Phys. Rep. 295, 181 (1998).
16. G. Kopidakis and S. Aubry, Physica D 130, 155 (1999).
17. A. Xie, L. van der Meer, W. Hoff and R. H. Austin, Phys. Rev. Lett. 84, 5435 (2000).
VERIFICATION OF A SIMPLE TECHNIQUE FOR THE REMOVAL OF MOTION ARTEFACTS IN ELECTRICAL IMPEDANCE EPIGASTROGRAPHY SIGNALS

GAITANIS A*, FREEDMAN MR AND SPYROU NM
Department of Physics, University of Surrey, Guildford, Surrey, GU2 7XH, U.K.
*Correspondence should be addressed to: Department of Medical Instrumentation Technology, Technological Educational Institute of Athens, Ag. Spyridonos Street, Egaleo, GR-122 10, Athens, Greece. e-mail:
[email protected]
Electrical Impedance Epigastrography (EIE) is a non-invasive method that allows the assessment of gastric emptying rates without using ionizing radiation. The method works by applying an alternating current with a frequency of 32 kHz, whose amplitude can be varied by the operator from 1 to 4 mA, through electrodes over the epigastric region, and measuring the potential difference between them. The post-acquisition analysis relies on the Short-Time Fourier Transform (STFT) algorithm to extract the gastric motility component of the signal (centre frequency of 0.05 Hz with a bandwidth of 0.02 Hz). The exact influence of motion artefacts was investigated by asking volunteers to carry out a variety of movements. It was clear that the motion artefacts produced positive and negative spikes on the acquired signal. Due to the large frequency range of a delta function, the spikes were producing false positives in the frequency domain, suggesting the presence of gastric motility when there was none, or exaggerating the magnitude of real events. Several attempts were made at removing motion artefacts and an appropriate algorithm was created. The efficacy of the algorithm was tested by using epigastrographic signals stored in a database at the University of Surrey.
1. Introduction

There has been strong debate within the scientific community about the methods used for the assessment of gastric function. Various techniques have been developed and employed in the clinical environment. However, there are a number of limitations, including lack of accuracy, discomfort caused to the patient, or the use of ionising radiation which prevents repeat studies, particularly on children and pregnant women. As Pickworth [1] concluded, there is a need for an improved method which is simple and efficient.
This need led researchers to develop Electrical Impedance Epigastrography (EIE). This method has been used for almost two decades for the assessment of gastric emptying rates in humans. Three pairs of electrodes are placed over the epigastric region of the abdomen. One electrode of each pair is placed anteriorly and the other posteriorly. A current with a frequency of 32 kHz and amplitude of between 1 and 4 mA (usually 2.5 mA) is injected through one of the three pairs of electrodes, while the potential difference is measured across the other two pairs. Until now EIE has only been regarded as a research tool, although important developments have been made in recent years. One drawback is that the current density is focused at the electrode-skin interface, meaning that the device is very sensitive to movements at the skin surface, such as yawning, coughing or shifts in the patient's position. The purpose of our study is to investigate the effects of different types of artefacts on EIE signals in order to remove them using computational methods.
Materials & Methods
The Epigastrograph was developed by the Medical Physics Group at the University of Surrey and built by the Medical Electronics Unit of St. George's Hospital. The skin surface was cleaned with non-alcohol wipes before the application of the electrodes to ensure good electrical contact. For studying motion artefacts four subjects were used and a simple protocol was established. No food or drink (except water) was to be taken by the volunteers 6 hours prior to the test, and water could not be taken 2 hours prior to the test. Subjects were asked to eat a low-fat light snack for their evening meal prior to the test and told to refrain from strenuous exercise. They were also asked to avoid alcohol, caffeine (including coffee, tea, stimulant drinks etc.), cigarettes and other nicotine-based products (including patches and gum) and spicy foods from the evening prior to the study. The duration of the experiment was between 80 and 90 minutes, and during this time subjects were required to be motionless in a semi-supine position. During this time subjects were asked to carry out some predetermined movements (coughing, stretching, etc.) to evaluate the effects of motion artefacts. The post-acquisition signal processing is based on the Short-Time Fourier Transform (STFT), which divides the signal into epochs and extracts the frequency components using the Fast Fourier Transform (FFT) for each segment. These are stacked in time order to give time-frequency-power information regarding the gastric motility. It was clear from the experiments that motion artefacts caused positive and negative spikes in the EIE signal, primarily due to the rapid changes in
impedance at the skin's surface. The frequency components of a spike follow a sinc function over all frequencies, and it became apparent that it was necessary to remove them in the time domain, before the application of the STFT algorithm, to avoid the possibility of false positive results. It was decided that the signal could be processed in the time domain by dividing the signal into epochs and applying a statistical threshold. Several types of statistical processing may be used depending on the signal's statistical properties [2]. In this case the primary objective was a constant threshold for every signal for detecting the artefacts. The threshold was selected as ±2.5σ about the mean value of the epoch. The probability of a given point lying within this threshold is almost 99% [3]. Consequently, any point of the signal outside the range x̄ ± 2.5σ was considered artefactual. After the signal had been processed once, the procedure was repeated. The artefacts continued to reduce until the fourth iteration. The differences between the fourth, fifth and sixth iterations were insignificant. The effect of different epoch lengths was also tested, and the most effective length was set at 500 points.
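A minimal sketch of the iterative epoch-wise threshold described above (illustrative Python rather than the original implementation; replacing flagged samples by the epoch mean is an assumption about how artefactual points are handled):

```python
import numpy as np

def remove_spikes(signal, epoch_len=500, k=2.5, iterations=4):
    """Epoch-by-epoch statistical threshold filter for EIE signals.
    Samples outside mean +/- k*sigma of their epoch are treated as motion
    artefacts; the pass is repeated (four iterations were found sufficient)."""
    x = np.asarray(signal, dtype=float).copy()
    for _ in range(iterations):
        for start in range(0, len(x), epoch_len):
            epoch = x[start:start + epoch_len]       # view into x, modified in place
            mean, sigma = epoch.mean(), epoch.std()
            spikes = np.abs(epoch - mean) > k * sigma
            epoch[spikes] = mean                      # assumed handling of flagged points
    return x
```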
3. Results and Conclusions
The purpose of this study was to investigate the effect of motion artefacts on epigastrographic signals and to develop an algorithm to remove them in the time domain, preventing the possibility of false positive results in the frequency domain. Figure 1 shows one of the recorded signals with artefacts artificially introduced in the last half of the signal (Fig. 1A). It is clear that the first iteration is not effective in removing all of the artefacts (Fig. 1B), as the success of removal will depend on a number of factors, primarily the slope of the baseline and the frequency of the artefacts. The signal shown is considered a worst-case scenario, with a fluctuating baseline and a high frequency of artefacts (approximately one every four minutes). Both of these factors are likely to increase the mean value of the epoch and consequently reduce the stringency of the threshold. However, after four iterations, the artefacts have been almost totally removed so that their effect on the frequency domain will be minimal (Fig. 1C). By analysing the FFT of the signals in Fig. 1A (Fig. 1E) and Fig. 1C (Fig. 1F), the difference that the algorithm makes is obvious. This has been shown in Fig. 1D by subtracting the signal in Fig. 1F from the signal in Fig. 1E.
Figure 1. An example of successful filtering using time domain statistical threshold filtering epoch by epoch. A) the EIE signal, B) after one iteration, C) after four iterations, D) the differential FFT between pre-processed and processed EIE, E) the FFT of pre-processed EIE and F) the FFT of the signal after four iterations.
Therefore, the results indicate that the proposed technique can be established as a more useful and improved tool for the removal of motion artefacts in EIE signals than other techniques [4]. Removal of motion artefacts before signal processing significantly reduces the possibility of false positive results when investigating gastric motility patterns using the STFT.

References
1. M. J. W. Pickworth, 'Impedance Measurements for Gastric Emptying', MSc Dissertation, University of Surrey (1984).
2. R. E. Challis and R. I. Kitney, Medical & Biomedical Engineering & Computing, 509-523 (1990).
3. J. R. Taylor, 'An Introduction to Error Analysis', University Science Books, USA (1997).
4. E. J. Ching, 'An Investigation into methods of quantifying gastric contractions using epigastric impedance data', MSc Dissertation, University of Surrey (1992).
NUMERICAL SIMULATION OF SCAVENGING OF AN URBAN AEROSOL BY FILTRATION TAKING INTO ACCOUNT THE PRESENCE OF COAGULATION, CONDENSATION, AND GRAVITATIONAL SETTLING

P. J. GARCÍA-NIETO†
Departamento de Matemáticas, Facultad de Ciencias, Universidad de Oviedo, C/ Calvo Sotelo s/n, Oviedo, 33007, SPAIN
E-mail:
[email protected] This work studies the scavenging efficiencies of an average urban aerosol by means of filtration after a given mechanism of removal (coagulation, heterogeneous nucleation, and gravitational settling) as a function of time. Filtration is a simple, versatile, and economical mean for collecting samples of aerosol particles. The capture of aerosol particles by filtration is the most common method of aerosol sampling and is a widely used method for air cleaning. At low dust concentrations, fibrous filters are the most economical means for achieving high-efficiency collection of submicrometer particles. Aerosol filtration is used in diverse applications, such as respiratory protection, air cleaning of smelter effluent, processing of nuclear and hazardous materials, and clean rooms. The process of filtration is complicated, and although the general principles are well known there is a gap between theory and experiment. In this paper, we review filtration in order to provide an understanding of the properties of fibrous and porous membrane filters, the mechanisms of collection, and how collection efficiency and resistance to airflow change with filter properties and particle size.
1. Introduction
In this paper we have analyzed, first individually and then together, the coagulation, condensation and gravitational settling of a characteristic spectrum for particle populations [1]: an average urban aerosol. Once we have shown the behavior of these 3 physical mechanisms, we then write the general dynamic equation (GDE) and describe the mathematical model that we propose to solve it. The equation describing the behavior of aerosols is a nonlinear, integro-differential equation of considerable complexity. If all of the deposition, condensation, and coagulation mechanisms are included, it is clear that the only feasible method of solution is a numerical one.
† This work is supported by the project MB-03-513-1 of the University of Oviedo.
Indeed, in practical calculations this is the procedure, and many computer codes exist that solve the dynamic equation using finite differences or finite element methods. A major drawback of numerical methods is the vast amount of computer time necessary to obtain a useful survey of all the relevant parameters. This causes difficulty in gaining physical insight into the problem.

1.1. Coagulation
A population of aerosol particles evolves in time as a consequence of particles colliding with one another and adhering. This process is called coagulation. There are two kinds of coagulation depending on the collision mechanism: (a) one due to Brownian motion of aerosol particles, or "Brownian coagulation", and (b) the other due to turbulent flow, or "coagulation in turbulence". Coagulation modifies the particle size distribution (PSD) since it reduces the overall number of particles and increases their mean size (diameter), which facilitates the removal of particles by other mechanisms (e.g., gravitational settling). Any kind of coagulation keeps the overall mass of the initial PSD invariant in time.

1.2. Nucleation
The formation and growth of an aerosol by condensation require a surface initially on which the vapor can condense. This surface can be a small cluster of vapor molecules, an ion, or a solid salt particle, for example, called a condensation nucleus. Homogeneous nucleation is the nucleation of vapor on embryos comprised of vapor molecules in the absence of foreign substances, whereas heterogeneous nucleation (condensation) is the nucleation on a foreign substance or surface, such as an ion or a solid particle.
1.3. Deposition

Deposition is the removal of aerosol from the atmosphere to the earth's surface, so that the aerosol PSD suffers modifications in the particle number and in the overall mass. We study here the particle deposition due to dynamical processes, dispensing with deposition caused by chemical processes. Large particles (D_p > 20 μm) settle onto the earth's surface due to the gravity force that acts on them: gravitational settling. We shall only analyze gravitational settling in this work. The particles that are scavenged by this mechanism have large diameters, so for those particles the medium can be considered as a continuum.
1.4. Physical-mathematical model
First we have analyzed the coagulation, the heterogeneous nucleation (condensation), and the gravitational settling individually, and finally in combination. Now we are in a position to write the continuous general dynamic equation (continuous GDE) for the 3 analyzed mechanisms, and this expression is [4, 5, 6]:
    ∂n(v,t)/∂t = (1/2) ∫₀^v K(v−u, u) n(v−u, t) n(u, t) du − n(v, t) ∫₀^∞ K(v, u) n(u, t) du − ∂[I(v) n(v, t)]/∂v − R(v) n(v, t)    (1)

where n(v, t) is the particle size distribution as a function of particle volume v, K is the coagulation kernel, I(v) is the condensational growth rate and R(v) is the removal rate due to gravitational settling. The first two integrals represent the coagulation, the following term represents the condensation, and the last term is due to gravitational settling. The GDE is a nonlinear integro-differential equation. In order to solve it numerically, due to the large range of volumes that aerosol particles show, it is necessary to transform the space of volumes to a logarithmic scale called a J-space.
2. Deposition mechanisms in a filter
There are five basic mechanisms by which an aerosol particle can be deposited onto a fiber in a filter [3]: (1) interception; (2) inertial impaction; (3) diffusion; (4) gravitational settling; and (5) electrostatic attraction. These five deposition mechanisms form the basis set of mechanisms for all types of aerosol particle deposition, including deposition in a lung, in a sampling tube, or in an air cleaner. The method of analysis and prediction is different for each situation, but the deposition mechanisms are the same. The first four mechanisms are called mechanical collection mechanisms. Each of the five deposition mechanisms is described below, along with equations that predict the single-fiber efficiency due to that mechanism. The theoretical analysis is complex, and only simplified equations are presented. Still, these equations are accurate enough to show the trend of collection efficiency with filter parameters. Wherever possible, the equations are based on experimentally verified theory and, except where noted, are valid for standard conditions.
186
R = D_p / d_f, where D_p is the particle diameter and d_f is the fiber diameter, respectively. The single-fiber efficiency for interception, E_R, is given by [8, 9]

    E_R = (1 − α) R² / [Ku (1 + R)]    (2)

where Ku is the Kuwabara hydrodynamic factor, a dimensionless factor that compensates for the effect of distortion of the flow field around a fiber because of its proximity to other fibers. Ku depends only on the solidity α,

    Ku = −(ln α)/2 − 3/4 + α − α²/4    (3)

Ku ranges from 1.9 for α = 0.005 to 0.25 for α = 0.2. Interception is an important collection mechanism in the particle size range of minimum efficiency and is the only mechanism that does not depend on the flow velocity U₀. Inertial impaction of a particle on a fiber occurs when the particle, because of its inertia, is unable to adjust quickly enough to the abruptly changing streamlines near the fiber and crosses those streamlines to hit the fiber. The parameter that governs this mechanism is the Stokes number, defined as the ratio of the particle stopping distance to the fiber diameter:

    Stk = τ U₀ / d_f = ρ_p D_p² C_c U₀ / (18 η d_f)    (4)

where τ is the particle relaxation time, ρ_p the particle density, C_c the slip correction factor and η the gas viscosity.
The single-fiber efficiency for inertial impaction, E_I, increases with an increasing value of the Stokes number (Stk) and is given by [9]:

    E_I = (Stk) J / (2 Ku²)    (5)

where J = (29.6 − 28 α^0.62) R² − 27.5 R^2.8 for R < 0.4. There is no simple equation for J when R > 0.4. For approximate analysis, a value of J = 2.0 for R > 0.4 can be used. Inertial impaction is the most important mechanism for large particles, but such particles have significant collection by interception as well. The sum of E_R and E_I cannot exceed the theoretical maximum of 1 + R.
The Brownian motion of small particles is sufficient to greatly enhance the probability of their hitting a fiber while traveling past it on a non-intercepting streamline. The single-fiber efficiency due to diffusion, E_D, is a function of the dimensionless Peclet number, Pe = d_f U₀ / D, where D is the particle diffusion coefficient. The single-fiber efficiency due to diffusion is:

    E_D = 2 Pe^(−2/3)    (6)

Single-fiber efficiency increases as Pe and particle size decrease; E_D is the only deposition mechanism that increases as D_p decreases. The dimensionless number that controls deposition due to gravitational settling is [3]

    G = V_TS / U₀    (7)

where V_TS is the terminal settling velocity of the particle.
When U₀ and V_TS are in the same direction (downward airflow), the single-fiber efficiency for settling is E_G = G (1 + R). For gas flow in the direction opposite to V_TS, E_G = −G (1 + R), and E_G decreases the overall single-fiber efficiency. When the flow is horizontal, E_G is much smaller, on the order of G². Generally, E_G is small compared with the other single-fiber mechanisms, unless the particle size is large and U₀ is low. When U₀ is greater than about 10 cm/s, impaction is more important than settling. The remaining deposition mechanism, electrostatic attraction, can be extremely important, but is difficult to quantify because it requires knowing the charge on the particles and on the fibers. Electrostatic collection is often neglected, unless the particles or fibers have been charged in some quantifiable way.
3. Overall efficiency of a filter

The overall efficiency of a filter [3, 7] can be determined if the total single-fiber efficiency E_Σ is known. The mechanical single-fiber efficiencies are correctly combined as long as each acts independently and is less than 1.0:
    E_Σ = 1 − (1 − E_R)(1 − E_I)(1 − E_D)(1 − E_G) ≈ E_R + E_I + E_D + E_G    (9)
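Using the equations above, the sketch below evaluates the single-fiber efficiencies and their combination for one particle size, taking for concreteness the filter parameters of the next section. The final conversion of E_Σ into an overall filter efficiency uses the standard fibrous-filter depth expression from the filtration literature [3], which is an assumption here since the paper does not restate it; the relaxation time, diffusion coefficient and settling velocity are rough placeholder values.

```python
import math

def filter_efficiency(Dp, df=2e-6, alpha=0.05, U0=0.1, thickness=1e-3,
                      D_coeff=2.4e-11, V_ts=4e-6, tau=4e-7):
    """Single-fiber efficiencies (eqs. 2-7) and their combination (eq. 9).
    Dp, df, thickness in metres; U0, V_ts in m/s.  D_coeff, V_ts and tau are
    placeholder values roughly appropriate for a 0.3 um particle."""
    Ku = -math.log(alpha) / 2 - 0.75 + alpha - alpha**2 / 4      # Kuwabara factor, eq. (3)
    R = Dp / df
    E_R = (1 - alpha) * R**2 / (Ku * (1 + R))                    # interception, eq. (2)
    Stk = tau * U0 / df                                          # Stokes number, eq. (4)
    J = (29.6 - 28 * alpha**0.62) * R**2 - 27.5 * R**2.8 if R < 0.4 else 2.0
    E_I = Stk * J / (2 * Ku**2)                                  # impaction, eq. (5)
    Pe = df * U0 / D_coeff                                       # Peclet number
    E_D = 2 * Pe**(-2.0 / 3.0)                                   # diffusion, eq. (6)
    E_G = (V_ts / U0) * (1 + R)                                  # settling, downward flow
    E_total = 1 - (1 - E_R) * (1 - E_I) * (1 - E_D) * (1 - E_G)  # eq. (9)
    # Overall efficiency of a filter of this depth: standard expression from
    # the fibrous-filter literature [3] (an assumption, not given in the text).
    E_filter = 1 - math.exp(-4 * alpha * E_total * thickness / (math.pi * (1 - alpha) * df))
    return E_total, E_filter

print(filter_efficiency(Dp=0.3e-6))
```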
4. Results and Conclusions
In all calculations a standard filter was used, having a thickness of 1 mm, a solidity α = 0.05, and a fiber diameter d_f = 2 μm, operating at a face velocity U₀ = 0.1 m/s. For this filter and face velocity, interception and
impaction are negligible for small particles, but increase rapidly for particles larger than 0.3 μm. Diffusion is the only important mechanism for particles below 0.2 μm, but is of decreasing importance for particles above that size. For all particle sizes, gravitational settling is small compared with the other mechanisms. The collection of particles larger than 0.5 μm is governed by mechanisms that depend on the particle's aerodynamic diameter, but for particles less than 0.5 μm, collection is governed by mechanisms that depend on the physical diameter. The particle size that gives the minimum efficiency, about 0.2 μm, is an in-between size that is too large for diffusion to be effective and too small for impaction or interception to be effective. Because these competing mechanisms are most effective in different size ranges, all filters have a particle size that gives minimum efficiency, usually in the range 0.05-0.5 μm.

Acknowledgments

Dr. P. J. García-Nieto gratefully thanks the Department of Mathematics and the Department of Physics at the University of Oviedo for their computational and financial support. We thank the University of Oviedo for its support under the project MB-03-513-1.

References
1. P. J. García-Nieto, B. Arganza García, J. M. Fernández-Díaz and M. A. Rodríguez Braña, Parametric Study of Selective Removal of Atmospheric Aerosol by Below-cloud Scavenging, Atm. Environ. 28, 2335 (1994).
2. J. M. Fernández-Díaz, P. J. García-Nieto, M. A. Rodríguez Braña and B. Arganza García, A flux-based characteristics method to solve particle condensational growth, Atm. Environ. 28, 3027 (1998).
3. W. C. Hinds, Aerosol Technology, John Wiley and Sons, New York (1999).
4. P. C. Reist, Introduction to Aerosol Science, Macmillan Press, New York (1993).
5. J. H. Seinfeld, Atmospheric Chemistry and Physics of Air Pollution, John Wiley and Sons, New York (1986).
6. J. H. Seinfeld and S. N. Pandis, Atmospheric Chemistry and Physics: from Air Pollution to Climate Change, John Wiley and Sons, New York (1998).
7. R. C. Brown, Air Filtration: An Integrated Approach to the Theory and Applications of Fibrous Filters, Pergamon Press, Oxford (1993).
8. K. W. Lee, Maximum Penetration of Aerosol Particles in Granular Bed Filters, J. Aerosol Science 12, 79 (1991).
9. K. W. Lee and B. Y. H. Liu, On the Minimum Efficiency and Most Penetrating Particle Size for Fibrous Filters, J. Air Poll. Control Assoc. 30, 377 (1980).
10. M. M. R. Williams and S. K. Loyalka, Aerosol Science: Theory and Practice, Pergamon Press, Oxford (1991).
STRESS ANALYSIS AND FAILURE MECHANISMS OF COMPOSITE MATERIALS WITH DEBONDED INTERFACES

E. E. GDOUTOS, A. A. GIANNAKOPOULOU AND D. A. ZACHAROPOULOS
Democritus University of Thrace, School of Engineering, GR-671 00 Xanthi, Greece
Ceramics are reinforced with high-strength continuous fibers to increase their ultimate tensile strain and damage tolerance, as opposed to polymers and metals, whose reinforcement with fibers increases strength and stiffness. The basic characteristics of this class of composites are that the constituents are elastic, the stiffness of the matrix is comparable to that of the fiber, and the interface between the constituents may be imperfect. In such materials the failure stress or strain of the matrix is lower than that of the fibers, and prefailure damage under longitudinal tension usually initiates with multiple matrix cracking. This damage mechanism requires the fibers to have sufficient strength to remain intact when they are surrounded by matrix cracking, and the interfacial bonding between fibers and matrix to be weak. Under such circumstances matrix cracking reaches a saturation stage and is followed by partial debonding at the fiber-matrix interface. Final failure of the composite takes place by fiber fractures. The sequence and interaction of failure mechanisms in the composite depend on the properties of the matrix, fiber and interface, as well as on the processing residual stresses. Stress transfer from the matrix to the fibers in a composite takes place by shear at the fiber-matrix interface. Strong interfaces result in high strength and stiffness, but low fracture toughness composites. On the other hand, weak interfaces promote deflection of matrix cracks along the interface and lead to high fracture toughness but low strength and stiffness composites. The process of transfer of load between fibers and matrix in the neighborhood of a fiber break or a matrix crack depends on the strength of the interface. Although fibers and matrix can be characterized by conducting simple tests, interface properties are most difficult to determine. Interfacial shear strength is an important parameter that controls the fiber-matrix debonding process and, therefore, the sequence and relative magnitude of the various failure mechanisms in the composite. In the present work the problem of a ceramic matrix composite reinforced with continuous fibers is studied. A cylindrical element of matrix (Fig. 1) with a single fiber and two matrix cracks perpendicular to the fiber direction under longitudinal tensile load is assumed. The stress analysis of the cylindrical element is undertaken by a shear lag model and the finite element method.
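The shear-lag description of stress transfer can be illustrated with a short numerical sketch (a generic Cox-type single-fiber model with made-up material constants; it is not the specific shear lag model or finite element analysis used in this work):

```python
import numpy as np

def cox_shear_lag(Ef=380e9, Em=90e9, Gm=35e9, rf=7e-6, Rm=20e-6, L=200e-6,
                  applied_strain=1e-3, npts=7):
    """Cox shear-lag estimate of the axial fiber stress along a fiber of length L
    embedded in matrix under a remote axial strain.  All material constants are
    illustrative placeholders (roughly a stiff ceramic fiber in a ceramic matrix)."""
    beta = np.sqrt(2 * Gm / (Ef * rf**2 * np.log(Rm / rf)))   # load-transfer parameter
    z = np.linspace(-L / 2, L / 2, npts)                      # position along the fiber
    sigma_f = Ef * applied_strain * (1 - np.cosh(beta * z) / np.cosh(beta * L / 2))
    return z, sigma_f

for z, s in zip(*cox_shear_lag()):
    print(f"z = {z*1e6:7.1f} um   sigma_f = {s/1e6:8.1f} MPa")
```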
Following the stress analysis, a detailed study of the failure mechanisms takes place. The matrix cracks may propagate through the fiber or may deviate along the interface.
Figure 1. Cylindrical element of composite.

The order of singularity and the angular dependence of the stress field in the neighborhood of the crack periphery were determined by using the stress function approach proposed by Zak and Williams. The stress intensity factor was determined by combining the results of the local stress solution with a finite element analysis. The case of fiber debonding originating from the periphery of the annular cracks was also studied. For that problem both opening-mode and sliding-mode stress intensity factors and the strain energy release rate were determined. Numerical results for all these quantities were presented for a host of geometrical configurations and material properties of the composite cylinder. These results help to understand the various failure mechanisms, including matrix cracking, debonding along interfaces and kinking of interface cracks into fibers, in brittle matrix composites.
SUPPORT VECTOR MACHINES FOR CLASSIFICATION OF HISTOPATHOLOGICAL IMAGES OF BRAIN TUMOUR ASTROCYTOMAS

D. GLOTSOS, P. SPYRIDONOS, P. PETALAS AND G. NIKIFORIDIS
Computer Laboratory, School of Medicine, University of Patras, Rio, Patras 265 00, Greece
E-mail:
[email protected] D. CAVOURAS Department of Medical Instrumentation Technology, Technological Education Institution of Athens Ag. Spyridonos Street, Aigaleo, 122 10, Athens, Greece E-mail:
[email protected] P. RAVAZOULA Department of Pathology, University Hospital Rio, Patras, 265 00, Greece P. DADIOTI AND I. LEKKA Department of Pathology, General Anticancer Hospital METAXA Piraeus, 18537, Greece A computer-based image analysis system was developed for the automatic classification of brain tumours according to their degree of malignancy using Support Vector Machines (SVMs). Morphological and textural nuclear features were quantified to encode tumour malignancy. 46 cases were used to construct the S V M classifier. Best vector was obtained performing an exhaustive search procedure in feature space. SVM classifier gave 84.8% accuracy using the leave-one-out method. To validate the systems’ generalization to unseen data, 41 cases collected from a different hospital were utilized. For the validation unseen data set classification performance was 82.9%. The generalization ability of the proposed classification methodology was verified enforcing the belief that automatic characterization of brain tumours might be feasible in every day clinical routine.
1. Introduction Brain tumour astrocytomas (ASTs) are considered as one of the most lethal and difficult to treat forms of cancer’. Once an AST tumour has been verified to exist, the next step is the determination of the degree of tumour abnormality (grading). This step is very important because it determinatively influences the 192
193 choice and type of treatment to be recommended*. However, the doctor subjective interpretation in grade assessment has been shown to influence diagnostic accuracy? 36% to 62% agreement among different pathologists and 5 1% to 74% agreement for the same doctor concerning readings for the same samples. Thus, the necessity was generated to approach more objective grading techniques, in order to enhance patient management in terms of better diagnosis, prognosis and therapy assessment. Among early approaches for computer-based AST grading, linear discriminant analysis of semi-quantitative nuclear features was investigated4. Research headed towards neural networks (NN) technology due to its superiority to classical multivariate classification5. Initially back propagation NNs (BNN) were evaluated trained with principal component analysis6. Selfediting nearest neighbor nets proved superior in performance compared to BNN and Kohonen NN using a specialized grading protocol7.Among the most recent developments, the application of decision tree models was investigated'. In this work, we propose a novel application of SVM' for classification of ASTs as low or high-risk according the most common and widely accepted from the medical community WHO grading system and Hematoxyl-Eosin (HE) staining protocol". The problem of small datasets has been emphasized in previous approaches4s697. The possibility that SVM may generalize ensuring good performance even with limited training samples, made the selection of SVM most attractive. Additionally, there is no approach of these machines in the classification of histopathological images. In contradiction to other appro ache^^^, the proposed system's generalization to unseen clinical data was evaluated. 2. Methods and materials
The clinical material comprised 87 biopsies of ASTs (Figure 1) collected from the Departments of Pathology of the University Hospital of Patras (UHP), (46/87) Greece and the General Distinct Anticancer Hospital METAXA (GAHM), (41187) Greece. Tumour grade was defined as low (30187) or highrisk (57/87) according the WHO grading system from 4 independent pathologists, 2 fiom each hospital. Images fiom tissue samples were digitized (768x576~8bit) at a magnification of x400 using a light microscopy imaging system consisting of a Zeiss KF2 microscope and an Ikegami color video camera. A segmentation algorithm" was performed (Figure 1) to separate nuclei from surrounding tissue in order to quantify diagnostic features from cell nuclei. 18 morphological features related to the size and shape of the cell nucleus and 22 textural features (first-order, co-occurrence, run-length based) that encoded chromatin distribution and nuclear DNA content were extracted". After feature generation, each distinct case was represented by a 40-dimension feature vector.
194 Initially, the UHP 46 cases were used to construct the SVM classifier. Exhaustive search12 was performed in order to determine the best feature vector combination that lead to the smallest classification error. SVM classifier performance was evaluated using the leave-one-out method12. To validate system's generalization ability to unseen data, the 41 GAHM cases were utilized retaining the same best vector and SVM classifier parameters. The results for both clinical datasets were recorded in the form of truth tables (Table 1-2).
Figure 1: Examples of high (top) and low-risk (bottom) brain ASTs along with the resulted segmented images
3.
Results and Discussion
SVM with polynomial kernel of degree 6 optimized classification performance resulting in 92,6% accuracy in correctly classifying high-risk samples and 73,7% for low-risk samples (Table 1). The best feature vector consisted of one textural feature named inertia, and two morphological the standard deviation of area and the standard deviation of roundness. Table 1. Evaluation of system's performance
Diagnosis Low-risk High-risk Overall accuracy
System classification Low-risk High-risk Accuracy 14 5 73,7% 2 25 92,6% 84,8%
The system's ability to generalize in unseen data was 82,9%. More specifically, 86.7% (26/30) of the high-risk cases and 72.7% (8/11) of the lowrisk cases were correctly identified by the system (Table 2).
195
Table 2. Evaluation of system's generalization to unseen clinical data
Diagnosis Low-risk High-risk Overall accuracy
System classification High-risk Accuracy Low-risk 8 3 72,7% 4 26 86,7% 82,9%
In conclusion, an SVM-based classification methodology was implemented for the automatic discrimination of AST tumours' grade. The generalization ability of the proposed system was verified in unseen clinical data enforcing the belief that automatic characterization of brain tumours might be feasible in every day clinical routine. References 1. 2. 3. 4. 5. 6. 7.
8. 9. 10. 11. 12.
L.M. DeAngelis, New England Journal of Medicine, 344, 123 (2001). W. Shapiro, and J. Shapiro, 12, 233 (1998). M. Mittler, B. Walters, and E. Stopa, Journal of Neurosurgery, 85, 1091 (1996). M. Scarpelli, P. Bartels, R. Montironi, D. Thompson, Analytical and quantitative Cytology and Histology, 16, 351 (1994). H. Kolles, and A. v.Wangenheim, Analytical cellular pathology, 8, 101 (1995). M. McKeown, and A. Ramsay, Neuropathology & Experimental Neurology, 55, 1238 (1996). H. Kolles, A. v. Wangenheim, J. Rahmel, I. Niedermayer, and W. Feiden, Analytical and Quantitative Cytology and Histopathology, 18, 298(1996). P. Sallinen, Sallinen S., T. Helen, I. Rantala, H. Helin, H. Kalimo, H. Haapsalo, Neuropathology and Applied Neurobiology, 26, 319 (2000). V. Kechman, MT(200l). P. Spyridonos, P. Ravazoula, D. Cavouras and G. Nikiforidis, Analytical and Quantitative Cytology Histopathology, 24, 317 (2002). P. Spyridonos, P. Ravazoula, D. Cavouras, K. Berberidis and G. Nikiforidis, Medical Informatics, 26, 179 (2001). S. Theodoridis, and K. Koutroubas, Academic Press (1999).
N-BODY MODELLING DONALD GREENSPAN Mathematics Department University of Texas at Arlington Arlington, Texas 76019, USA E-mail:
[email protected]
N-body problems are studied using molecules and collections of molecules, called particles. Applications are made in the spirit of molecular mechanics to primary flow and turbulent flow for water vapor, soliton collision, motion of a top, microdrop collision, stress of a slotted copper plate, contact angle of adhesion, cellular self reorganization, the bounce of an elastic ball, saddle surfaces, and elastic snap through. Numerical methodology is also developed for special relativistic motion. In the Newtonian models, methodology is developed for conservation of energy, linear momentum and angular momentum. In the relativistic case, it is shown how to calculate in the lab and rocket frames so that the numerical calculations are covariant.
196
ABOUT ONE APPROACH TO THE MINIMIZATION OF THE ERRORS OF THE TUTORING OF THE NEURON NETWORKS ALEXANDER P. GRINKO, MICHAL M. KARPUK
Koszalin Technical Universiv, Koszalin, Poland E-mail:
[email protected] The feasibility of function of errors with fractional exponent for solving of a problem of optimization and tutoring of neural networks was theoretically explored. The analytical expressions for estimation of parameters of the models or weight factors were obtained. The algorithms were designed and the numerical experiment on actual economic datas was held, where the efficiency of an offered procedure is shown. Keywords: neural networks, optimization, method of least squares, function of errors, fractional integrals and derivatives.
1. Introduction
In algorithms of tutoring of neural networks the most spread method is the method of least squares. For k input datas the root-mean-square error of diversions of an output signal of a web from a true value is minimized. For example, for delta - the Widrow - Hoffs rule of tutoring of a neural networks 111 k
k
E = C E ( i )=
C(y;--ti)*
i=l
,
i=l
where E(i) - is a root-mean-square error for the i-th output signal, n
yi =
Cmixi - T ,
ti
- is accordingly the value of an output signal of a neural
j=l
networks during its tutoring process and true ("theoretical") value of the i-th output signal, xi - the input signal for j-th neuron of a neural networks, j =1, 2... n,
oj- weight value for the j-th neuron, T - threshold of activation of a
neuron. The choice of functions like (1) for minimization is stipulated by a relative simplicity of calculation of weight coefficients m j ,j = 1,2, ..A [2], and completeness of space l2 [3]. However at minimization of a root-mean-square error (1) we inevitably interfere with some problems. Firstly, at usage of tutoring series of signals k the 197
198 influence of unbiased errors grows. It follows fiom expression (l), when the minimization of squares of diversions is yielded, and the square of diversion for a unbiased error can essentially influence the function of minimization and, accordingly, the theoretical curve. In practice the methods of elimination of unbiased errors from tutoring algorithms are used, but they are not always effective. Secondly, the minimization (1) is reduced to deriving and solution of combined equations concerning weight coefficients Wj, j = 1,2, ..A for a local minimum of a function of diversions(1). For simple "theoretical" dependences this system solves rather easy. For more complicated theoretical dependences the finding of a local minimum of a function(1) is usually possible only numerically. Thus the examination of other types of functions of minimization of errors, the development of theoretical methods of research of local extremes and their usage in algorithms of optimization and tutoring of neural networks represents the practical concern.
Theory. In this work we consider the function of errors like
In this case there is no first derivative in points yi = ti. For examination of functions of that type the application of the functionals of fractional differentiation of order a of Reimann-Liouville [3,4] or their modifications is possible
O
and fractional derivatives of the Marcheau
199
O
O<x
For continuously - differentiable functions the functionals (3) - (4) are reduced to the functionals (5) - (6) [ 5 ] . For example, the fractional derivative (5) for q ( x ) = xu , v > -1 equals:
Let's designate H A([0, b]
)
as a set of functions satisfying the
condition of Holder: ~P(t1)-P(t2)~q -t2(19 t 1 O
t2
E[O,
b].
The following principle of an extreme for the functionals
(Dt+q)(x)
and (Dr-p)(x) is known. Let's assume that we are given a nondecreasing, non-negative function w (t) that's not to equal to zero identically and function f (t) continuous on a segment [0, x] and in an extremly small range 0 < t I x of a point t = x the product w ( t ) f (t)
E
H A([O, b ] ) , A > a . Then, if on a segment [0, x]
the function f ( t ) reaches a positive maximum (negative minimum) in a point t = x , t h e n (Dt+p)(x)>o ( ( D t + q ) ( x ) > O ) .
In the work the results of improving the principle of an extreme for the functionals
(Dt+v)(x) are obtained.
Theorem 1.Let f ( t ) E H A ( [ O ,b]
), A > a . Ifforall E E [O,x]
200
or
then the function
f (t ) is monotonically increasing (monotonically decreasing)
on [0, x] . Theorem 2. Let p(f) E H a ([0, b]
),
/z > a . The decomposition
takes place:
+.L(P(Xo))E2-e +. ..+ f , (P(x o ) ) En-@+... , o < e < i J; (p(x,)) = 0 and the fkction f ( t ) is monotonically increasing (monotonically decreasing) on [x, - E , x, ] and Then if in a point X,
monotonically decreasing(monotonical1yincreasing) on point
[xo,xo + E ] ,then
in a
tp (t ) reaches a local maximum (minimum). The proof of theorems 1 - 2 is obtained by estimations.
X,
Modeling and the Results. The expressions (5) - (9) allow to explore functions of errors like(2). As an example we shall consider the generalization of a classic method of least squares in which the criteria of a diversion of direct model
y li from observations ( x i , yi ) is
= a,
+ alxi
(10)
20 1
+
For function q ( x ) = a bx the equality (6) will take view:
a
la + bxl"" - la + btIu+' (a+bx) (a+bt) la + bxrl dt = ( 1 - a ) E= ( a + bx) (xo - t)=+l
"1
w - a >%-'?
a+bx
-
la + bxluc' +co (-u)i(-a)i (b.)' r ( l - a ) F ( a + b x ) i=l ( l - a ) i i ! (a+bx)i
c
where
0 < Re y < Re a - Gauss hypergeometric function,
(p),= I , (p), = p ( p + 1 ),..., ( p + k - l ) , k = l , a # 0, -1, -2, ... - Pohhammer's index.
2,...,
Applying the theorem 2 in equality (10) we can write the analog of normal combined equations:
For estimation of parameters ao, a1 and calculation of structural parameters of linear model (10) it is necessary to solve combined equations (12). The solution was camed out numerically on the basis of algorithm of
202
conjugate lapse rates of minimization of function of errors like (2). The program for model operation is written on C++ [6, 7]. The numerical experiment was conducted on the ground of dates on unemployment in Poland in 1997 - 2000 years. (GUS dates). The criteria for measurement for different a was function (2). The obtained results of estimations of parameters do, a/ are represented in the Table 1 and in Figure 1. The results of a numerical modeling allows to make deductions, that at criteria (2) the best objective function is the function with an exponent 0 < u < 1. Selection of parameter a can be held during the optimization process or in process of tutoring of a neural networks. For obtained dates the optimal value of a is u=0,7. Thus a0 =9,735, a; = 0,072.
3 2,5 2 1 u 1,5 10,5364 10,4781 10,3692 10,1618 9,7333 ao 0,0315 0,0346 0,0412 0,0534 0,0722 ai Err(u) 49,962 41,307 34,966 30,234 26,911
0,9 9,7303 0,0725 26,447
0,8 9,7272 0,0727 26,133
0,4 0,2 0,5 0,7 0,6 0,1 0,01 u 9,7351 9,5636 9,4562 9,3636 9,3030 9,4459 0,0227 ao 0,0720 0,0818 0,0831 0,0945 0,1082 0,0854 0,3805 ai Err(u) 25,992 26,242 26,474 26,978 30,626 31,494 35,196 Table 1. Estimates of parameters of linear model a0, aj and value of function of errors Err (u) (2) The obtained values of estimation of parameters were used for a dot estimation of the prognoses (Fig. 2). In a figure actual values of percent of unemployment on the following time intervals (continuous curve) and forecast values also are shown on the basis of linear model (continuous straight line). As follows from a figure, that the prognosed values, obtained on the base of minimization function (2) are much closer to actual values, the than obtained on the base of a method of least squares. Therefore, while solving the problems of optimization or problems of tutoring of neural networks, it is expedient to conduct the examination of function of errors and modify it with the purpose of diminution of value of an objective function of optimization or function of errors of tutoring.
203
tt
Cf)
45 3
•
~e t 40 to Ul
35 *»
30
*»»*»»*
oc
0
0,5
1
1,5
2
2,5
3
u
Fig. 1. Dependence of function of errors Err (u) from quantity of an exponent u in (2) for linear model. Conclusion Thus, in the given work the following results were obtained • On the base of application of the functionals of fractional differentiation the expressions for dot estimations of parameters of optimization of functionals or weight factors in function of tutoring of neural networks were obtained; • The feasibilities of results for examination of models are shown; • The numerical modeling for the test problem was carried out and it is shown, that the optimization at fractional exponents of function of errors gives the better result, than at optimization by a method of least squares. References 1. Widrow B., Hoff M. Adaptive switching circuits // In 1960 IRE WESCON Convention Record. DUNNO. 1960. P. 96 - 104. 2. Rutkowska D., Pilicki M., Rutkowski L. (Sieci neuronowe, algorytmy genetyczne i systemy rozmyte. Warszawa. PWN. 1997. 410 S.) 3. Kolmogorov A.N., Fomin S.V. Elements of theory of functions and analyze of functions. Moskow: Nauka. 1968. 496 C.
204
14,O 13,O
9 8,O '0 0
12
24
36
t (time for 03.1997)
Fig.2. Dependence of a degree of unemployment on time and approximation by linear models with function of errors (2). A solid line - u=2 (method of least squares), dash line - u=0,7.
4. Samko S. G., Kilbas A. A. and Marichev 0. I. Fractional integrals and derivatives. Theory and applications. 1993. Gordon and Breach, New York, etc.
5. Erdelyi A., Magnus W., Oberhettinger F. and Tricomi F.G. Higher transcen- dental functions. Vol. 1. 1953. McGraw-Hill, New York, etc. 6. T. Masters. Practical neural network recipies in C +t. (Academic Press, Inc., 1993). 7. Osowski S. Sieci neuronowe w ujKciu algorytmicznym. Wydanie drugie. (warszawa. WNT. 1996.346 S.).
DIGITAL CONCRETE: A MULTI-SCALE APPROACH FOR THE CONCRETE BEHAVIOR F. GRONDIN AND G. MOUNAJED Centre Scientijique et Technique du Bcitiment, Division MOCAD, 84 av.Jean Jaurks, 77420 Champs-sur-Marne,France E-mail: ji-ederic.grondin@cstb$r, [email protected]
A. BEN HAMIDA AND H. DUMONTET Laboratoire de Mode‘lisation,Mate‘riam et Structures, CNRS UMR 7143, Universite‘Paris 6, 8 rue du Capitaine Scott. 75015 Paris, France E-mail:
[email protected],
[email protected] This article presents a digital method of homogenization in thermo-elasticity based on the choice of a representative volume of the building material. Heterogeneities are generated in a random process. This numerical model developed by Mounajed is implemented in the general finite element code ‘SYMPHONIE’ of the CSTB. This stochastic method is applied to the calculation of the homogenized behavior of a High Strength Concrete. Furthermore, this method allows, through the process of location, to estimate the local field and to predict possible damages of the building material.
1. Introduction Multi-scale approaches appear for several years completely relevant to model the behavior and the degradation of heterogeneous materials. From the knowledge of the microstructure, these techniques allow to characterize the behavior of the heterogeneous material as equivalent homogeneous material. The interest of these approaches lies also in the fact that it is possible, for a given macroscopic request, to reach stress and strain distributions at the level of the microstructure. So we are able to predict and to follow damage within the building material [l]. We suggest a method of homogenization to analyze the thermo-elastic behavior of a high strength concrete (HSC). This method is based on the exploitation of the model Digital Concrete [2] developed in the code of calculation ‘SYMPHONIE’ [3]. It suggests the choice of a very rich Representative Elementary Volume (REV), in the most close to the reality of the microstructure of the HSC. It takes into account the aggregate size distribution and the pore size distribution. This REV is discretized by finite element method
205
206 by generating, in an unpredictable way, the various phases of various sizes that are pores and aggregates. 2.
Reminders of homogenization in thermo-elasticity
The equivalent homogeneous behavior of a heterogeneous material, with various phases of which present a linear thermo-elastic behavior, is defined by the following homogenized law of behavior, [4]:
z = (C>y = Chom(E- ahom A 6 I)
(1)
which connects the average, on the volume V of the REV, of microscopic stress a(y) to the macroscopic strain E and to an incremental macroscopic temperature A 0 . The local stress field is solution of the cellular linear thermo-elastic problem put on the REV [2]. In practice, by linearity, this problem is decomposed into two problems: E f 0 , A 6 = 0 , and E = 0 , A 6 # 0 . The various methods of homogenization distinguish themselves among them by the choice of the REV that they adopt and the way of imposing on this volume a macroscopic load. So, simplified approaches privilege REV of simple geometries [ S , 61. 3.
The digital concrete in Symphonie
Representative elementary volume is chosen so as to realize the random distribution of various heterogeneities of the building material. This REV is discretized by finite elements. The meshing is generated in a regular way according to a grid. Aggregates and pores, supposed spherical, are stacked in this grid in an unpredictable way by respecting their proportion according to their size distribution of the considered building material. A first REV was retained to characterize the behavior of the mortar of which phases are the cement paste, pores and aggregates. A second REV, on the scale of the HSC, of which phases are the mortar and aggregates. 4.
Calculation of the thermo-elastic behavior of the concrete
The studied concrete is the HSC MlOOC for which the characteristics of phases were measured by Pimienta and al. [7]. Digital results (noted with superscript hom), obtained by calculations in two dimensions under the
207 hypothesis of plane strains (PS), are recapitulated in table 1 for the concrete. They are confronted with experimental measures [7,8]. Table 1. Homogenized characteristics and experiments
D.C. Experiment
Ehom(PS) = 50610 MPa Eexp= 50833 MPa
= 0.196 ahom(PS) = 8.66.10" (0C-I)
vexp= 0.2
6 < aexp < 14 (.lo"
OC-')
The influence of the phase size distribution was tested through various studies of sensibility. Mechanical characteristics and the coefficient of coupling are sensitive to a modification of the porosity [8, 91. A decrease of the porosity makes that the building material tends more to dilate. Evolution of the effective Young modulus of the mortar according to the Young modulus of the cement paste is similar to that observed experimentally on the other concrete types, [ 101.
5.
Micromechanical analysis
The multi-scale method gives the local strain field and the local stress field within the microstructure under a given macroscopic load. This phase of location constitutes the major interest of the method of homogenization proposed here, with regard to approaches simplified by homogenization which do not supply exploitable local fields. So for a load of macroscopic shear, the distribution of local shear strains is presented to the figure 1 in the mortar. One notices while the most sought zones are localized in the cement paste (darker zones on the figure 1). The study of the location of local fields allows to explain the origin of the damage and to predict the appearance of micro-cracks.
Figure 1. Distribution of the local shear strain field in the mortar.
208
6. Conclusions In this work, we developed a tool of simulation, the Digital Concrete, integrated into the code of calculation of structures by finite element ‘Symphonie’ of the CSTB. It characterizes the homogenized behavior of the concrete under a linear thermo-elastic load by a multi-scale approach. The random phase size distribution on the scale of the mortar and of the concrete is taken into account. Obtained results of simulations confront in a completely encouraging way with experimental results. Furthermore, this method allows reaching strain and stress fields which reign in the microstructure. Access to such local fields will allow after approaching for example the problem of the behavior of concretes subjected to high or low temperatures. Works continue at present by the consideration of non-linearity of the behavior of phases according to the temperature. The use of the tools of location set up in the module Digital Concrete associated to the non-linear module of Thermo-Hygro-Mechanical developed in the code ‘Symphonie’ [ l l , 121, will simulate the various mechanisms of damage introduced in the microstructure [ 131.
References 1. A. Benhamida, H. Dumontet, F. LCnC, Ninth International Conference on Composite Materials, ICCE/9, San Diego, USA (2002). 2. G. Mounajed, Cahiers du CSTB, 3421 (2002). 3. G. Mounajed, CSTB, France (1991). 4. G. Frankfort, Homogeneization and thermoelasticity, SIAM, Journal of Mathematics Analysis, Vol. 14, pp. 696-708, 1983. 5. J. Aboudi, Elsevier Science Publishers B. V. (1991). 6. M. Bornert, T. Bretheau, P. Gilormini, Hermb Sciences Publication, 1 (2001). 7. P. Pimienta, A. Le Duff, Projet National BHP 2000: Be‘tons d Hautes Performances, CSTB, 2 (1 996). 8. G. Dreux, J. Festa, Editions Eyrolles (1995). 9. Y. Malier, Presses de I’Ecole Nationale des Ponts et Chaussdes (1992). 10. G. Li, Y. Zhao, S.-S Pang, Y. Li, Cement and Concrete Research, 29, 1455 (1999). 11. G. Mounajed, W. Obeid, International Workshop, BRI-Tsukuba, Japan (1998). 12. W. Obeid, G. Mounajed, A. Alliche, Computer Methods in applied mechanics and engineering, 190(39) (2001). 13. G. Mounajed, H. Ung Quoc, H. Boussa, First International Conference on Applications of Porous Media, Jerba, Tunisia (2002).
A NEW APPROACH TO DISCRETE APPROXIMATION OF A CONTINUOUS-TIME SYSTEM MODEL BASED ON SPLINE FUNCTION
SHA GUANGYI Doctoral Program in Engineering ,University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi,Ibaraki-ken Tsukuba, 305-8573,Japan E-mail: shaOis.tsukuba.ac.jp TOHRUKAWABE Institute of Information Sciences and Electronics ,University of Tsukuba 1-1-1 Tennodai,Tsukuba-shi,Ibamki-ken 305-8573, Japan E-mail: kawabeQistsukuba.ac.jp KAZUO TORAICHI Center f o r Tsukuba Advanced Research Alliance, University of Tsukuba 1-1-1 Tennodai, Tsukuba-shi, Ibaraki-ken 305-8573,Japan E-mai1:tomichiQis. tsukuba. ac.j p KAZUKI KATAGISHI Institute of Information Sciences and Electronics, University of Tsukuba 1-1-1 Tennodai, Tsukuba-shi, Ibaraki-ken 305-8573,Japan E-mai1:katagisiQis. tsukuba. ac.jp
1. Introduction
Although system in the real world is usually described as continuous-time system, we need discrete approximation of this system to operate by digital devices in many signal processing application areas. One of the early works on this problem was proposed by Kalman l . He also developed the Kalman filter based on this concept. Many discritization methods have been proposed since Kalmanfs work. However, most of them have two problems. First one is that they do not make clear the definition of the signal space t o which the continuous-time signal belongs. Second problem is that most of conventional discritaization method commonly used the staircase signal 209
210
as a approximation of continuous-time signal. Since it is only one degree of freedom per sampling interval, the precision of approximation could not be so high. Toraichi et a1 =.have proposed the new concept of the 'fluency signal' function theory which is extension of spline function theory as the key concept for solution of two problems above. It forms a series of signal spaces that includes and generalizes the staircase one, polygonal one and spline one. Ishii et a1 '. have proposed construction method of a discrete time system approximated by a piecewise polynomial function based on the concept of 'fluency signal'. However, to obtain the approximated discrete h eA"u(p)dp is required for the basis of time model, the computation of functional space. If we use the conventional numerical calculation method, the Simpson formula and so on, it may be that the approximation precision of discritization is lower. In this paper, therefore, more accurate discrete approximation method based on the primitive function for the calculation of this integration.
so
2. Main contribution
Let consider a continuous-time linear dynamical system with state-space discription is defined as follows:
where A E RnXn,BE Rnxl,C E Rlxn are known constant matrices with 1 5 r , q 5 n , and where s(t) E Rn is a state variable of the system, w ( t ) E Rr is a known deterministic input signal, and y ( t ) E RQis an output signal of the system. The initial time sets 0. In the CS, the input w ( t ) is able to be represented by ( m- 1)-st order piecewise polynomial signal U(T) U(T)
= [U1(T). . .U ' ( T ) ] T
ui(.)E
"s; i = 1,2,. . . ,m = 1 ~ 2 ,...
whose sampled values constitute the vector described in the following form:
'1Lk
:= u(tlc). Then U(T) can be
M
vu E " S , where 211, E Rr denotes a vector composed of the sampled values of the original input vector w ( t ) . It is well-known that ; $ k ( t ) can be replaced
211
by the B-spline basis R & ( t ) . We can therefore denote U ( T ) as
where where wl is the vector of the B-spline coefficients sequence. Then following relation is obtained:
Where
Bi=
lh
eAp
. ;&(p)dpB,
i = -m
+ 1 , . . . ,O
(4)
Finally, CS is transformed into :
"DS denotes the discrete-time system model based on the B-spline function. While 0, := [0 . . . 01 E RIXn,the matrices are defined as follows:
ii := [o
.. . 0 1
c := [O
. . . C]E drn-l+,).
T
B-m+l]
E
R(m-l+n)xl
st
We can see that calculation of the eApu(p)dp is need to obtain "DS . Therefore, we propose the computation method of this integral by using the primitive function based on the following theorem.
212
Theorem 1. We can compute the
eA”u(p)dp as follows
i = 0,.. . , m - 1. (5) Proof omitted. By using this theorem, we expect to get more accurate discrete-time model. 3. Conclusions
In this paper, we proposed the new approach to discrete approximation of continuous-time system models based on fluency signal function. Further investigation is required to this research, for example, an extension of the proposed method to apply the field of observation with Kalman filter, and so on.
Acknowledgments This research was partially supported by the grant of the Core Research for Evolutional Science and Technology (CREST) Program under the Japan Science and Technology Corporation (JST) , and the competitive research fund of support scheme for funding selected IT proposals from the Ministry of Public Management, Home affairs, Posts and Telecommunications. The authors would like to acknowledge here these organizations.
References 1. R.E. Kalman, A new approach to linear filtering and prediction problems, Trans. ASME, J. Basic Eng., vol. 82D, no. 1, pp. 34-45 (1960). 2. K. Toraichi, M. Kamada, R. Mori, Sampling theorem in the signal space spanned by spline function of degree 2, Trans. IEICE, vol. E67, no. 9, pp. 531-532 (1984). 3. H. Ishii, K. Katagishi, K. Toraichi, Discrete approximation model of a continuous-time system with piecewise polynomial input signals in the space of piecewise polynomials, The Trans. IEE of Japan, vol.ll8-c, no. 3, pp. 366-375 (1998).
APPLICATION OF SHANNON’S ENTROPY TO CLASSIFY EMERGENT BEHAVIORS IN A SIMULATION OF LASER DYNAMICS
J.L. GUISADO Centro Uniuersitario de Mkrida, Uniuersidad de Extremadura, 06800 Mgrida (Badajoz), Spain E-mail:
[email protected] F. JIMENEZ-MORALES Departamento de F&ca de la Materia Condensada, Uniuersidad de Seuilla, P.O.Box 1065, 41 080 Seuilla, Spain E-mail:
[email protected]
J.M. GUERRA Departamento de Optica, Facultad de CC. Fsicas, Uniuersidad Complutense de Madrid, 28040 Madrid, Spain E-mail:
[email protected] Laser dynamics simulations have been carried out using a cellular automata model. The Shannon’s entropy has been used t o study the different emergent behaviors exhibited by the system, mainly the laser spiking and the laser constant operation. It is also shown that the Shannon’s entropy of the distribution of the populations of photons and electrons reproduces the laser stability curve, in agreement with the theoretical predictions from the laser rate equations and with the experimental results.
1. Introduction Traditionally, laser dynamics has been analyzed by solving a set of coupled differential rate equations which describe the interrelationships and transition rates among the electronic states in the laser active medium and the laser photons. But recently an alternative approach based on a cellular automata (CA) model has been proposed’. In this paper we focus on the application of Shannon’s entropy to recognize and classify the different types of behavior shown by the CA simulation of laser dynamics. 213
214
2. Cellular automata model
Cellular automata are a class of spatially and temporally discrete mathematical systems, characterized by local interaction and synchronous dynamical e v ~ l u t i o n ~They ? ~ . have been used to build models of a wide variety of physical system^^>^. In the present model, a two-dimensional square lattice of N , = 200 x 200 cells with periodic boundary conditions is used. Two variables are associated to each node of this lattice: a i ( t ) ,which represents the state of the electron in node i at time t and c i ( t ) ,which represents the number of photons in node i at time t. The time evolution of the CA is governed by a set of transition rules which determine the state of any cell at time t 1 depending on the state of the cells included in its neighborhood at time t. These rules represent the different physical processes that work at the microscopic level in a laser system: pumping, stimulated emission, photon decay and electron decay. A small noise level of random photons in the laser mode is also introduced to represent the experimentally observed noise level, responsible of the initial laser start-up.
+
3. Simulation results and Shannon's entropy analysis
Three parameters determine the response of the system: the pumping probability (A), the life time of photons (7,) and the life time of excited electrons ( T ~ ) .Initially, ui(O) = 0, ci(O) = 0, 'di,except a small fraction 0.01% of noise photons present. We let the system evolve for 500 time steps. In each time step, the total number of laser photons n(t) = ci(t), and the total number of electrons in the upper laser state (population inversion) N ( t ) = Czl ai(t) are measured. For each particular triad of values of the parameters, we are interested in recognizing if n(t) and N ( t ) show either a constant or an oscillatory behavior. To this end, the Shannon's entropy S of the distribution of values taken by n(t) and N ( t ) is calculated by dividing the range of values taken in lo3 intervals, and computing the frequency (fi) at which the magnitude value lies inside every particular non-void interval a. S is defined as: S(A, T,, 7,) = - Czl fi log2 fi, where m is the number of non-void intervals. Since the Shannon's entropy measures the dispersion in the distribution of values taken by the analyzed magnitude, it is a good indicator of the presence of oscillations in the system. The dependence of the Shannon's entropy on two of the parameters of the system when the third parameter is fixed is shown in Fig. 1. In order to compare the results of the simulations with the predictions of the laser rate equations, in Figure
c,r"=.,
215
.r-=10
TP10
0 10
0 10
OM)
0 08
OM)
0 15 20
006
0 30 40
004
25 30 35 40
004
50 80
x
0 02
20 40 60 80 1W 120 140 1W 180 2W
T
10
0 02
80
20 40 60
80 1W 120 140 1W 180 2W r.
Figure 1. Contour plots of the Shannon’s entropy of the distribution of values taken by the number of laser photons S, (left) and the population inversion S, (right), showing the dependence of S with the pumping probability X and the upper laser level life time T, for a fixed value of the cavity life time of T~ = 10. T~ and T~ are measured in time steps.
2 the Shannon’s entropy for the number of laser photons is plotted taking as x-axis and as y-axis, where R is the laser pumping rate, Rt is the threshold pumping rate, At is the threshold pumping probability, and = -A The theoretical stability curve from the rate equations is the Rt At’
2
2
black line, given by:
2 = 4(k--1). (N
In Fig. 2, areas of high S appear above and to the right of the theoretical stability curve (dark zones), where oscillations are expected, and areas of low S appear in the bright zones, where a constant behavior is expected. This is verified by plotting the temporal evolution of n(t) and N ( t ) for values of the parameters in each zone: relaxation oscillations (laser spiking) are found in the first case, and a constant behavior in the second one, in good agreement with the theoretical predictions from a linearization of the laser rate equations6i7 and with the behavior experimentally found. 4. Discussion
The Shannon’s entropy has been used to classify the types of behavior shown by the system in the parameter space. Two characteristic behaviors have been found, in agreement with the predictions of the laser rate equations and with the experimental results: relaxation oscillations (zones of high S ) and constant behavior (zones of low S ) . Other features of the
216
0
2
4
6
8
10
12
14
16
18
20
R/R
Figure 2. Contour plot of the Shannon's entropy of the distribution of the number of laser photons for a fixed value of T~ = 10 time steps. Low values of S, (bright zones) indicate that the response of the system is non-oscillatory,while high values (dark zones) indicate an oscillatory response. The black line is the theoretical stability curve.
laser phenomenology (threshold pumping rate and spatio-temporal pattern formation) are also reproduced by this CA model. It can be useful as an alternative modeling tool to the standard treatment of laser dynamics (based on differential equations) when numerical difficulties arise, for example in lasers governed by stiff differential equations, with convergence problems. Due to its intrinsic parallel nature, it can be implemented in parallel computers and offer a great advantage in computing time. In addition, this kind of model could be used to study problems of current interest, such as cooperative phenomena or chaos in lasers.
References 1. J.L. Guisado, F. JimBnez-Morales, and J.M. Guerra. A cellular automaton model for the simulation of laser dynamics. Phys. Rev. E. To be published. 2. S. Wolfram. Universality and complexity in cellular automata. Physica D, 10:1, 1984. 3. S. Wolfram. Cellular automata and complexity. Addison-Wesley, 1994. 4. B. Chopard and M. Droz. Cellular automata modeling of physical systems. Cambridge University Press, 1998. 5. T. Toffoli and N. Margolus. Cellular automata machines: a new environment for modelling. The MIT Press, 1987. 6. A.E. Siegman. Lasers. Unversity Science Books, 1986. 7. 0. Svelto. Principles of lasers. Plenum Press, 1989.
COLLOCATION AND FREDHOLM EQUATIONS OF THE FIRST KIND
G. HANNA AND J . ROUMELIOTIS* School of Computer Science and Mathematics Victoria University of Technology PO Box 144.28 Melbourne City Mail Centre Victoria 8001, A U S T R AL I A E-mail:
[email protected]. edu. nu
[email protected]
1. Extended Abstract
A great many problems in the fields of engineering and science can be modelled with partial differential equations. If these problems involve some well defined region, 0, with known conditions on the boundary I?, then the equations can be transformed into a Fredholm integral equation of the first or second kind. In this chapter we will consider equations of the first kind
S(Y) =
s,
K(. - Y)f(.)
Wz),
(1)
where g represents some known data at the point y E I? and K is a distribution of source (or sink) terms over the boundary. We seek to find the density of this distribution, f. The integral equation (1)is inherently ill-posed. That is, it can be shown that a small perturbation on g can give rise to arbitrarily large changes in f . To substantiate this point, consider the singular integral F1
*Work partially supported via the overseas study program of Victoria University of Technology 217
218
For 0 < a < 1 and n large, then infinitely small changes for the integral correspond to infinitely large changes in the integrand. For this reason, numerical methods for solving such equations are often ill-fated. The simple illustration here shows this is often manifested in high frequency terms of an orthogonal expansion for the unknown. Popular methods for inverting (1) involve discretizing the region and employing some interpolatant in each interval. To this end, we consider the integral equation
S(Y) =
J
b
K(z,y)f(x)ds,
a
IY I4
(3)
and define a grid
a
< 51 < ... < xn-l < x,
= xo
= b.
(4)
Thus, we may write (3) as
where we have mapped the integral over each sub-division to the unit interval and Fj (t ) = f(Sjt+xj-l). The length of each interval is Sj = xj-xj-l, for j = 1 , 2 , . . . ,n. Using cubic Hermite interpolation polynomials, we can write
F j ( t ) = F ~ ( o ) H ~ ( ~ ) + F ~ ( o ) H ~ ( ~ ) + F ~ ( ~ ) for H ~ (j~=)1+, 2F,~.. (. n~ (6)
Substituting (6) and re-arranging gives n
g(Y)
FIBl,l(Y) + cFj(Bj,l(Y) + Bj-1,3(Y)) j=2
-k
F j (Bj,2(Y)
+ &Bsjj-l,4(!/) + Fn+lBn,3(Y),
(7)
where
1
1
Bj,k(Y) = sj
K(Sjt
+ +l,Y>Hk(t)
dt,
for j = 1 , . . . ,n and k = 1 , 2 , 3 , 4 . We refer the reader to for the details. To formulate a linear system we evaluate (7) at 2n collocation points aL
~i
< y2 < ... < ~ 2 n - 1< ~
I b.
2 n
(8)
219
Inverting this system will provide, on substitution into (6), a cubic approximation that is continuous and differentiable. It might first appear that the collocation points (8) can be selected without regard to the distribution of the grid (4). This being the case, one might choose to distribute the 2n collocation points uniformly over [a,b ] , therefore ensuring that the points are as far away from each other as possible. The expected result being that the collocation equations are the most linearly independent (i.e. the system is assured of being well-conditioned). This is certainly not the case since a polynomial approximation is provided for each interval j , thus evaluation should be performed in all intervals. The question then arises: How should the collocation points be chosen for each interval? Clearly, collocation points should be unique, thus the node points are not candidates. Scaling each interval to (O,l), we collocate twice, 0 < yl,y2 < 1. To obtain a stable system, the distribution of collocation points must be considered as a function of both polynomial interpolation order and kernel singularity. Much work has been done in this direction where a convergence theory for piecewise constant and linear interpolants was developed For an excellent review see l. Convergence of the numerical solution is guaranteed if one collocates evenly between the node points though not necessarily to the solution 1,3 . Recently, extended this theory to include Hermite cubics. They show that with a logarithmic kernel, collocating symmetrically in each interval gives a theoretical convergence rate of O(h5)and collocating at the special points 0.2451188417393386,.754881158261 induces a super-convergence of O(h7). These rates have been measured using the norm of some special Sobolev space. They do, however, identify that an optimal collocation regime does exist. Unfortunately, their results apply for a closed boundary with smooth data. In an effort to identify optimal collocation points, we will employ a Peano theorem to approximate the integral equation. The only assumptions made are that the kernel is non-negative and integrable and that the unknown satisfies some smoothness conditions. For example, if we assume that f is piecewise constant then we can employ the Peano kernel 8~1571216~11~14~597.
2,6,11t13,
d z and integrating by parts gives, Substituting (9) into J:p(z, y, <)f’(z)
220
upon using (3),
(10) Taking the modulus of both sides of (10) and applying a Holder type inequality (similar to the Cauchy-Schwartz-Buniakowskyinequality) provides the result
where the bound I(E,y) has been simplified by reversing the order of integration to
Equation (12) was obtained by taking an upper bound in (13) and using the well known result 1 max{u, b } = - ( u 6 la - bl). 2
+ +
By differentiating and appealing to the properties of convex functions, it is a simple matter to show that I is minimised at the midpoint of the interval [= (see lo for the details). The result is independent of the kernel, K , and is due to the rule sampling at both end-points. Thus, with this class of function, the optimal approximation to g is
9
9(Y) = f(@)x~,(Y>+ f(b)xb(y)
+ e(Y),
(14)
where
/T a+b
&(Y) = and
b
K(z, Y) d z ,
ICb(Y) =
K ( z ,Y) dz
(15)
221
Substituting two collocation points a 5 y1 < y2 5 b into (14) produces two linear equations which are easily inverted to produce the solution 1
f(a) = A(Y11Y2)
(Kb(Y2)(dY1)
+ e(y1)) - Kb(Yl)(g(Y2) + e(y2)))
(17)
where A(y17y2) = Ka(Yl)Kb(Y2)
-
(19)
Kb(Y1)Ka(Y2)
If we now assume that the kernel is symmetric K ( x ,y) = Klx - yI and that y1, y2 are evenly distributed, y2 = a + b - yl,then it is a simple exercise to show that
+
K a ( a b - y) = Kb(y),
&(a
+ b - y) = Ka(y)
and e(a
+ b - y) = e(y).
Hence (17)-(18) become
where
A(Y) = m y ) - K a y ) Equations (20)-(21) provide explicit error bounds for functions f of bounded Minimizing first derivative in terms of a collocation point y E [a, should produce an optimal collocation strategy for this class. In this paper, the above result will be expanded to include other norms and functions of higher smoothness. A numerical application involving Symm’s integral equation will also be considered.
9).
References 1. 2. 3. 4.
D. N. Arnold and W. L. Wendland. On the Asymptotic Convergence of Collocation Methods. Math. Comput., 41(164):349-381, 1983. D. N. Arnold and W. L. Wendland. The Convergence of Spline Collocation for Strongly Elliptic Equations on Curves. Numer. Math., 1985:317-341, 1985. L. Collatz. The Numerical Treatment of Differential Equations. SpringerVerlag, Berlin, 1966. W. McLean and S. Profidorf. Boundary element collocation methods using splines with multiple knots. Num. Math., 74:419-451, 1996.
222
5.
H. Niessner. Significance of Kernel Singularities for the Numerical Solution of Fredholm Integral Equations. In C. A. Brebbia, W. L. Wendland, and G. Kuhn, editors, Boundary Elements IX. Volume 1: Mathematical and Computational Aspects, pages 213-227, Berlin, 1987. Springer-Verlag.
6.
7.
8. 9.
10.
11.
12. 13. 14. 15.
H. Niessner and M. Ribaut. Condition of boundary integral equations arising from flow computations. J. Comp. Appl. Math., 12 & 13:491-503, 1985. S. Prossdorf and A. Rathsfeld. On Quadrature Methods and Spline Approximation of Singular Integral Equations. In C. A. Brebbia, W. L. Wendland, and G. Kuhn, editors, Boundary Elements I X . Volume 1: Mathematical and Computational Aspects, pages 193-211, Berlin, 1987. Springer-Verlag. S. Prossdorf and G. Schmidt. A finite element collocation method for singular integral equations. Math. Nachr., 10033-60, 1981. J. Roumeliotis. A Boundary Integral Method applied to Stokes Flow. PhD thesis, The University of New South Wales, 2000. URL http://www.staff.vu.edu.au/johnr. J. Roumeliotis. Product inequalities and weighted quadrature. In S. S. Dragomir and T. M. Rassias, editors, Ostrowski Type Inequalities and Applications in Numerical Integration, pages 373-416, Dordrecht, 2002. Kluwer Academic. J. Saranen and W. L. Wendland. On the Asymptotic Convergence of Collocation Methods With Spline Functions of Even Degree. Math. Comput., 45 (171):91-108, 1985. G. Schmidt. On Spline Collocation for Singular Integral Equations. Math. Nachr., 111:177-196, 1983. G. Schmidt. On Spline Collocation Methods for Boundary Integral Equations in the Plane. Math. Meth. in the Appl. Sci., 7:74-89, 1985. G. Schmidt. On E-Collocation for Pseudodifferential Equations on a Closed Curve. Math. Nachr., 126:183-196, 1986. W. L. Wendland. On the asymptotic convergence of boundary integral methods. In C. A. Brebbia, editor, Boundary Element Methods, pages 412-430, Berlin, 1981. Springer-Verlag.
INTERMOLECULARINTERACTIONS OF (H2O)z A. HASKOPOULOS AND G. MAROULIS' Department of Chemistiy, University of Patras GR-26500 Patras, Greece E-mail: marovlis@,.upafvas.pv,hask(ii2chernistrv.uuatras.zr We report optimal structures, interaction energies and interaction induced dipole moments and polarizabilities for the van der Waals complex (H20)2-He. Relying on Meller-Plesset perturbation theory with large, carefully optimized basis sets we have located the most stable configuration while the corresponding interaction energies were computed using coupled-cluster techniques. The potential energy surface (PES) of the complex has been determined using the MP2 method. The dependence of the calculated interaction properties on the basis set is also studied.
1. Introduction The van der Waals complexes of water with model collisional partners, such as the rare gas atoms, have attracted significant experimental and theoretical attention as prototypical systems for hydrophobic interactions [l, 21. Such interactions may play a dominant role in diverse areas like protein conformation, biological membrane formation and solvation thermodynamics [3,4]. The interaction of water with the helium atom is of particular importance [5-71. Not only is helium the simplest possible closed-shell monomer, it also is the second most abundant collisional partner of water in the atmospheres of outer planets and in interstellar clouds. In previous work on the water molecule [S, 91 and on the water dimer [ 101 we presented a complete description of the electric properties of the referenced systems. In the present study we have explored most of the potential energy surface (PES) for the trimer, (H20)2-He, and compared the global minima obtained with various basis sets. Furthermore we have calculated the interaction dipole moment and dipole polarizability for the most stable configuration of the trimer.
+Authorto whom correspondence should be addressed
223
224 2.
Theory
The ab initio calculations reported in the present study rely on self-consistentfield (SCF), second (MP2) and fourth (MP4) order Wller-Plesset perturbation theory and coupled-cluster theory with singles and doubles (CCSD) and coupled-cluster with singles, doubles and perturbatively linked triple excitations (CCSD(T)). Detailed presentations of the mathematical and physical foundations of these methods are given in standard textbooks [ 11, 121 The interaction properties are obtained through the Boys-Bernardi counterpoise-correction (CP) [ 131. For a defined configuration of system A,"B the interaction property Pint(A..B)is computed as:
4",(A..3)= P(A-- 3)- P ( A - - X) - P(X - - 3 )
(1) where P(A"'X) etc. denotes calculations of the property for the subsystem A in the presence of the ghost orbitals of subsystem B. We have used Eq. (1) to determine the geometrical parameters of stable molecular configurations and the appropriate interaction dipole moments and polarizabilities. 3.
Computational details
The quality of the employed basis sets is of primary importance in the derivation of interaction properties. We based our study on the basis sets used in previous work [lo]. Table 1. Basis sets used in the calculations ~
~~
Basis set H20
DO DI 02 03
Q1 Q2 P3
He
[6s4p2d/4s2p] [6s4p3d] [6s4p3d/4s2p] [6s4p3d] [6s4p3d4s3p1d] [6s4p3dI [6~4p3dlf/4s3pld] [6s4p3d] [9s6p6d3f/6s4p2dlfl [6s4p3d] [9~6p6d4f76sSp3d2fl [6s4p3d] r9~6p6d4f76~5p3d2fl r6s4p3dl fl
Our calculations were performed with the two water molecule subunits kept frozen at the theoretical molecular geometry reported by Frisch et al. [ 141. Atomic units are used through this work. Conversion factors to SI units are: energy, 1 E h = 4.3597482 x lo-'' J, length, 1 = 0.529177249 x lo-'' m, dipole
225
moment, 1 e% = 8.478358 x lo5' Cm, &pole polarizability, a, 1 e'~'Ei' = 1.648778 x lo4' C'm'J-I. All calculations were performed with GAUSSIAN 98. 4. Results and Discussion
We have calculated the most stable configurations of the complex using the basis sets DO,D1,D3 at the MP2 level of theory. According to all the basis sets used the trimer has a T-shaped configuration with the He approaching the water dimer through they axis.
Figure 1. Stable configuration of the complex (H20)2."He.
The potential energy surface for (H20)2"'He is explored with the MF'2 method using the DO basis set. We have collected in Table 2 the calculated interaction dipole moments and polarizabilities of the trimer. Table 2. Interaction dipole moment and polarizability (atomic units) for the (H20)2".Hecomplex calculated at the SCF and MP2 level of theory. Method SCF m 2 SCF MP2 SCF MP2 SCF MP2 SCF
Aa
P 0.0093 0.0098 0.0093 0.0098 0.0092 0.0098 0.0092 0.0098 0.0091 0.0092 0.0092
-0.0236 -0.0173 -0.0236 -0.0172 -0.0232 -0.0174 -0.023 1 -0.0174 -0.0174
0.3142 0.3594 0.3138 0.3589 0.3151 0.3617 0.3 157 0.3628 0.3175 0.3 176 0.3176
226
Figure 2. h4P2 potential energy surface of the (H20)2"'Heinteraction.
References 1. R.C.Cohen and R.J.Saykally, J. Chem. Phys. 98,6007 (1993). 2. G.T.Fraser, F.J.Lovas, R.D.Suenram and KMatsumura, J. Mol. Spectrosc. 144,97 (1990). 3. A.Ben-Naim, J. Chem. Phys. 90,7412 (1989). 4. K.Watanabe and H.C.Anderson, J. Phys. Chem. 90,795 (1986). 5. M.P.Hodges and R.J.Wheatley, J. Chem. Phys. 116, 1397 (2002). 6. E.Arunan, T.Emilsson and H.D.Gutowsky, J. Chem. Phys. 116, 4886 (2002). 7. KPatkowski, T.Korona, R.Moszynski, B.Jeziorski and K.Szalewicz J. Mol. Struct. 591,231 (2002). 8. G.Maroulis,J. Chem. Phys. 94, 1182 (1991). 9. G.Maroulis, Chem. Phys. Lett. 289,403 (1998). 10. G.Maroulis, J. Chem. Phys. 113, 1813 (2000). 11. A.Szabo and N.S.Ostlund, Modem Quantum Chemistry, MacMillan, New York, 1982. 12. T.Helgaker, P.Jsrgensen and J.Olsen, Molecular Electronic-Structure Theorv. Wilev. Chichester. 2000.
227
13. S.F.Boys and F.Bernardi, Mol. Phys. 19,55 (1970). 14. M.J.Frisch, J.A.Pople,and J.E.Del Bene, J.Phys. Chem. 89,3664 (1985).
ACCURATE THERMOPHYSICALPROPERTIES OF NEAT GLOBULAR GASES AND THEIR BINARY MIXTURES DETERMINED BY MEANS OF AN ISOTROPIC TEMPERATUREDEPENDENT POTENTIAL U. HOHM Institutfiir Physikalische und Theoretische Chemie der TU Braunschweig, Hans-Sommer-Str. IO,D-38106Braunschweig, FRG E-mail:
[email protected] L. ZARKOVA Znsfifufeof Electronics, Blvd Tzarigradsko Schoussee 72, Soja 1784, Bulgaria, E-mail:
[email protected]
We present results on self-consistent calculations of second pVT- virial coefficients BV), viscosity data q(lJ and diffusion coefficients pD for heavy globular gases (BF3, CF4, SiF4, Ccl4, sic&, SF6, MoF6, w F 6 , u F 6 , C(CH3)4,and Si(CH3)4)and their binary mixtures with globular gases, and Ar, Kr, and Xe, respectively. The calculations are performed mainly in the temperature range between 200 and 900K by means of isotropic n-6 potentials with explicitly temperature-dependent separation rm(T) and potential well-depth E (T). In the case of the pure gases the potential parameters at T = 0 K (E, r, n) and the enlargement of the first level radii Gare obtained solving an illposed problem of minimizing the squared deviations between experimental and calculated values normalized to their relative experimental error. The temperature dependence of the potential is a result of the influence of vibrational excitation on binary interactions. The interaction potential of the binary mixtures are obtained with simple combination rules from the potential parameters of the neat gases. In all cases we observe excellent reproduction of the experimental thermophysical properties of the neat gases and the binary mixtures.
1. Introduction
Tables with reliable thermophysical data of pure heavy globular gases and binary mixtures of these gases in a wide temperature range are requested in contemporary industry. For example, CF4, SiF4, CCL, SiCL, SF6,WF6, MoF6, C(CH& and Si(CH3)4 are used in different microelectronic and chemical technologies (chemical vapor deposition (CVD), thin-film epitaxy, etc.). SF6 228
229 and CF4 are also applied in power breakers, and the importance of UF6 in radiochemistry and nuclear technology is well-known. Such thennophysical data can be used as input for numerical simulations when a cheaper and safer technology or better design are looked for. Unfortunately, accurate and systematic measurements of the thermophysical properties of these rather aggressive and toxic gaseous halides and hydrocarbons are sometimes dangerous and expensive, particularly at high temperatures. Moreover it is hardly possible to obtain thermophysical data of the incredibly high number of various binary mixtures of these gases. The alternative is to calculate their thermophysical properties by means of reliable intermolecular interaction potentials. A precise knowledge of these binary intermolecular potentials is also requested for the interpretation of dielectric and refiactivity virial coefficients, collision-induced light scattering (CILS) spectra and calculation of the structure and spectroscopic properties of small clusters. All this stimulates theoretical investigations and a proper modeling of intermolecular interactions as a background for approximation of available experimental data and prediction of different transport and equilibrium properties. Here we present the simple but very successful approach of an effective (n-6)-Lennard-Jones type potential with explicitly temperature dependent potential parameters. The concept of this isotropic-temperature dependent potential (ITDP) is outlined in the next section. 2.
2. Theory
The idea of the ITDP is based on the fact that the intermolecular interaction between two molecules depends on their vibrational quantum number v [1,2]. The fraction of molecules with vibrational quantum number v can be calculated with the help of the vibrational partition function Z(T):
Here N is the total number of normal vibrations, v_i is the quantum number of the vibrationally excited level of the i-th normal vibration and g_i the corresponding degeneracy; θ_i = ħω_i/k_B, where h = 2πħ is Planck's constant and k_B is Boltzmann's constant, respectively. In Eq. (1) the harmonic oscillator approximation is used. Instead of a mixture of n_v different vibrational states, we now consider the globular gas at given temperature T to consist only of one vibrationally excited state with an enlargement of the molecular radii δ^(eff)(T) averaged over all n_v states [3,4]:
δ^(eff)(T) = δ(T=0) Σ_{l,m=0}^{n_v} [ C_l x_l(T) + C_m x_m(T) ]    (2)
The harmonic oscillator force constants C_{l,m} are known and equal to the enlargement of the excited level k (k = l, m) normalized to the enlargement δ(v=0) of the first level. x_k (k = l, m) is the relative population of the excited state k (Σ_{k=0}^{n_v} x_k(T) = 1). It is calculated as a function of temperature by means of the vibrational partition function Z(T). The interaction between two excited molecules of the same kind with an averaged effective size at given T is now described by means of a single isotropic (n-6) potential U(r, T):
U(r, T) = [ε^(eff)(T)/(n−6)] { 6 [r_m^(eff)(T)/r]^n − n [r_m^(eff)(T)/r]^6 }    (3)
where r is the distance between the centers of mass of the two molecules and n is the repulsive parameter. In Eq. (3) the effective equilibrium distance is

r_m^(eff)(T) = r_m(T=0) + δ^(eff)(T)    (4)

and the effective potential well depth is
ε^(eff)(T) = ε(T=0) [ r_m(T=0) / r_m^(eff)(T) ]^6    (5)
The relation (5) follows from the assumption that the long-range attractive forces are not influenced by excitation. In both equations (4) and (5), r_m(T=0) and ε(T=0) are parameters of the ground state-ground state interaction. It is obvious that the temperature dependence of r_m^(eff) is implied in the dependence C_k x_k(T), which is specific for a particular globular gas. It can be calculated accurately by means of Eq. (2) and then approximated with high precision. Subsequently this relation can be used to calculate ε^(eff)(T) by using Eq. (5). For unequal particles "1" and "2" we use the combination rules [5]
r_{m,12}^(eff)(T) = [ r_{m,1}^(eff)(T) + r_{m,2}^(eff)(T) ] / 2,    (7)

together with a corresponding combination rule for the effective well depth ε_12^(eff)(T).
It is a strength of this approach that the unlike interactions can be calculated once the potential parameters of the pure gases are obtained by a minimization procedure.
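Before turning to the fit, the following minimal Python sketch illustrates how Eqs. (3)-(5) combine into the ITDP. The parameter values and the enlargement law delta_eff are illustrative placeholders only, not fitted values from this work.

import numpy as np

def itdp_potential(r, T, eps0, rm0, n, delta_eff):
    """Isotropic temperature-dependent (n-6) potential U(r, T), Eqs. (3)-(5).

    eps0, rm0 : well depth and equilibrium distance at T = 0
    n         : repulsive parameter
    delta_eff : callable returning the effective enlargement delta^(eff)(T)
    """
    rm_T = rm0 + delta_eff(T)                       # Eq. (4)
    eps_T = eps0 * (rm0 / rm_T) ** 6                # Eq. (5): long-range attraction unchanged
    x = rm_T / r
    return eps_T / (n - 6.0) * (6.0 * x ** n - n * x ** 6)   # Eq. (3)

# Hypothetical, purely illustrative numbers:
delta = lambda T: 0.002 * T                         # made-up linear enlargement law
r = np.linspace(3.5, 8.0, 200)
U_300K = itdp_potential(r, 300.0, eps0=400.0, rm0=5.0, n=25.0, delta_eff=delta)

In an actual application delta_eff(T) would be evaluated from the vibrational populations of Eq. (2) for the gas in question.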
3. Minimization Procedure
In general, the determination of the ITDP parameters is based on the experimental pVT second virial coefficients B, viscosities η, and second acoustic virial coefficients β. The input sets of the experimental data of the pure gases were critically analyzed. Those of them which were not consistent with the majority of the other data or which contradict theory were excluded from the minimization procedure. To this end a so-called 'jack-knife control' was applied. This procedure allowed us to analyze the consistency of each experimental data set with the basic data set. It gave an objective reason to accept or to reject a specific work no matter what accuracy the authors claim. As a rule, when there were enough experimental data we did not use those data which are derived from the measurements by means of models with unknown input parameters (e.g. B(T) derived from special refractivity measurements). The ITDP parameters are determined by solving a typical "ill-posed" problem [6] of minimizing the sum of squared deviations

F = Σ_{i=1}^{M} [ ln(P_{i,exp}/P_{i,calc}) / σ_{i,exp} ]²

between the M measured (P_exp) and calculated (P_calc) values of B, η, and β normalized to their relative experimental error σ_exp. It is noteworthy that the only input data required for obtaining the ITDP for any particular gas are the normal vibrational frequencies and as many experimental data of different kind as possible. All these thermophysical properties are calculated by means of the well-known formulae for pure gases. The root-mean-square deviation RMS and the mean relative deviation ⟨R⟩ = M⁻¹ Σ_i ln(P_{i,exp}/P_{i,calc})/σ_{i,exp} are normalized to σ_{i,exp}.
Different types of sensitivity checks were performed in order to justify the accuracy of the so-determined potential parameters.
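A minimal sketch of the weighted objective described above is given below. The arrays P_exp, P_calc and sigma_rel are assumed to hold the pooled measured values, model values and relative experimental errors; the RMS convention sqrt(F/M) is an assumption of this illustration.

import numpy as np

def objective(P_exp, P_calc, sigma_rel):
    # logarithmic weighted deviations, as in the objective F quoted in the text
    dev = np.log(P_exp / P_calc) / sigma_rel
    F = np.sum(dev ** 2)              # quantity minimised w.r.t. eps0, rm0, n, delta
    rms = np.sqrt(F / len(dev))       # one common RMS convention (assumed here)
    mean_rel = np.mean(dev)           # mean weighted relative deviation <R>
    return F, rms, mean_rel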
4. Results
The potential parameters of the ITDP at T = 0 K for like interactions and the RMS deviation of the fit are given in Table 1.

Table 1. Potential parameters of the ITDP at T = 0 K and root-mean-square deviation of the fit for the neat gases. The numbers in parentheses denote the standard deviation.
Gas        ε(0)/k_B (K)   r_m(0)      n          δ             RMS
BF3        310.5(2.1)     4.196(6)    23.00(31)  0.0957(0.35)  0.686
CF4        328.4(1.1)     4.329(2)    52.71(63)  1.29(3)       0.701
SiF4       200.2(1.1)     5.270(10)   12.83(11)  2.09(17)      0.691
CCl4       696.4(3.8)     5.589(6)    18.6(2.2)  -0.55(14)     0.640
SiCl4      775.2(1.1)     5.715(10)   26.05(11)  1.98(17)      0.596
SF6        417.80(87)     5.041(3)    34.76(33)  1.31(3)       0.657
MoF6       827.3(1.4)     4.995(8)    28.19(24)  1.94(3)       1.568
WF6        712.8(1.1)     5.076(7)    17.22(13)  3.36(3)       1.966
UF6        1040.0(2.3)    4.992(5)    27.87(30)  2.26(3)       2.061
C(CH3)4    586.32(42)     5.779(3)    28.02(12)  1.41(3)       1.201
Si(CH3)4   674.74(91)     5.905(4)    20.79(11)  1.88(3)       0.769
An example of the temperature dependence of the ITDP is given in Figure 1 for neat UF6. Except for CCl4 we observe that r_m^(eff)(T) increases and ε^(eff)(T) decreases with temperature. The accuracy of the fit is shown in Figure 2, where SF6 serves as an example. It can be seen that the ITDP reproduces most of the experimental data (second pVT virial coefficient B and viscosity η) within the experimental accuracy σ_exp. This is also reflected in the RMS values given in Table 1. In Figure 3 we consider the temperature dependence of the viscosity η of binary CF4 - Ar mixtures of different compositions x(CF4). We see that the deviations between the calculated and measured mixture viscosities are in most cases less than one percent. It is noteworthy that different thermophysical properties of a given gas can be calculated by using only one ITDP, whereas generally the calculation of e.g. B and η simultaneously with simple potential models requires different interaction potentials.
Figure 1. Isotropic temperature-dependent potential of UF6 calculated at different temperatures T (curves for T = 0, 300 and 600 K).
Figure 2. Weighted relative deviations of measured B(T) and η(T) of SF6 from the best solution obtained with the ITDP (zero line).
Figure 3. Relative deviations of measured and calculated η_mix(T) of CF4 - Ar mixtures. The inset shows the mole fraction of CF4 (compositions between x(CF4) = 0.19 and 0.80).
5. Conclusions

The isotropic temperature-dependent potential (ITDP) allows for the precise calculation of several thermophysical properties of neat globular gases and their binary mixtures. It also works in the case of non-spherical molecules like BF3. We plan to extend our studies to more non-spherical molecules (e.g. BCl3, Ga(CH3)3) and to globular macromolecules.

Acknowledgments
Financial support of the Deutsche Forschungsgemeinschaft and the Fonds der Chemischen Industrie is gratefully acknowledged.

References
1. K. Refson, G. C. Lie and E. Clementi, J. Chem. Phys. 87, 3634 (1987).
2. B. Stefanov, J. Phys. B: At. Mol. Opt. Phys. 25, 4519 (1992).
3. L. Zarkova, Mol. Phys. 88, 489 (1996).
4. L. Zarkova and U. Hohm, J. Phys. Chem. Ref. Data 31, 183 (2002).
5. L. Zarkova, U. Hohm and M. Damyanova, J. Phys. Chem. Ref. Data (2003), accepted.
6. A. N. Tikhonov and V. Y. Arsenin, Solution of Ill-Posed Problems (John Wiley, London, 1977).
MODELLING AND COMPUTATION OF AXIALLY SYMMETRIC FLOWS OF ELECTRORHEOLOGICAL FLUIDS *
R. HOPPE, W. LITVINOV and T. RAHMAN Institute for Mathematics, University of Augsburg, Universitätsstr. 14, D-86159 Augsburg, Germany E-mail:
[email protected]
In this article, we discuss the extended Bingham fluid model introduced in the paper [2] for electrorheological fluids, and formulate the problem in the axially symmetric cylindrical coordinate system. As an application we choose the ER shock absorber, and present some numerical simulations of its behaviour.
1. Formulation of the problem

Electrorheological fluids are smart materials which are concentrated suspensions of polarizable particles in a nonconducting dielectric liquid. Under the influence of an externally applied moderately large electric field, the particles form chains along the field lines, and these chains then aggregate to form columns (cf. [1]). These chainlike and columnar structures cause dramatic changes in the rheological properties of the suspensions, like the viscosity (resistance to flow), which increases by several orders of magnitude. The fluids become anisotropic, as the viscosity in the direction perpendicular to the applied electric field increases much more significantly than in the direction of the electric field. The chainlike and columnar structures break under the action of large stresses (beyond the yield stress), the viscosity of the fluid then decreases, and the fluid becomes less anisotropic. The transition which takes place under the influence of an electric field happens very quickly, within the order of milliseconds, and it is completely reversible. It is this behavior that makes electrorheological fluids potentially

*This work has been supported by the German National Science Foundation (DFG) within the DFG funded Collaborative Research Field SFB 438.
attractive in various technological applications, in particular in the automobile industry. For the modelling of such fluids in continuum mechanics, the typical model has been the classical Bingham model for viscoplastic fluids. In the present paper we follow a recently developed model, cf. [2], which is a generalization of the classical Bingham model and takes into account the anisotropy of the electrorheological fluid. The model is quite flexible for practical applications, and it can easily handle a general flow. On the basis of experimental results, the following constitutive equation for the electrorheological fluid has been developed in the generalized model (cf. [2]):

σ_ij(p, u, E) = −p δ_ij + 2 φ(I(u), |E|, μ(u, E)) ε_ij(u),   i, j = 1, 2, 3.

Here, σ_ij(p, u, E) are the components of the stress tensor which depend on the pressure p, the velocity u = (u_1, u_2, u_3) and the electric field strength E = (E_1, E_2, E_3), δ_ij is the Kronecker delta, ε_ij(u) are the components of the rate of strain tensor, and I(u) is the second invariant of the rate of strain tensor.
φ is the viscosity function depending on I(u), on |E|, the modulus of the electric field strength, and on μ(u, E), the angle between the velocity field and the electric field. The viscosity function is identified through numerically approximating experimentally obtained flow curves, for instance by using piecewise polynomials; its form, given in [2], contains a small positive regularization parameter λ. From here on we use the cylindrical coordinate system (r, θ, z) and suppose that the flow of the fluid is axially symmetric. The velocity vector u is given by u = (u_r, u_θ, u_z), where u_r, u_θ and u_z are all functions of (r, z). For such a flow the components of the rate of strain tensor have the following
form:

ε_rr(u) = ∂u_r/∂r,   ε_θθ(u) = u_r/r,   ε_zz(u) = ∂u_z/∂z,
ε_rz(u) = (1/2)(∂u_r/∂z + ∂u_z/∂r),   ε_rθ(u) = (1/2)(∂u_θ/∂r − u_θ/r),   ε_θz(u) = (1/2) ∂u_θ/∂z.
Let Ω = {(r, z) | 0 < z < l, R_1(z) ≤ r ≤ R_2(z)} be the domain of fluid flow, where R_1 and R_2 are continuous or piecewise continuous nonnegative functions given on [0, l]. We consider stationary flows of electrorheological fluids, and we neglect the inertial forces. Then, in Ω, we obtain the equations of motion and the condition of incompressibility.
Here K_1, K_2 and K_3 are the components of the volume force K = (K_1, K_2, K_3), and div_c is the divergence operator in cylindrical coordinates for an axially symmetric flow. We consider mixed boundary conditions for our problem. Let S_1 and S_2 be open subsets of the boundary S of the domain Ω such that S_1 is non-empty, S̄_1 ∪ S̄_2 = S and S_1 ∩ S_2 = ∅. We prescribe

u|_{S_1} = û,    [−p δ_ij + 2 φ ε_ij(u)] ν_j |_{S_2} = F_i,   i, j = 1, 2, 3,    (8)
where F_i and ν_j are the components of the surface force F = (F_1, F_2, F_3) and of the unit outward normal ν = (ν_1, 0, ν_3) to S_2, respectively. The electric field strength E is generated by the application of some external electric voltage U. The distribution of E is obtained by separately solving a boundary value problem for the electrostatic potential, cf. [2].
2. Generalized solution

Let

J_1 = {w | w = (w_r, w_θ, w_z) ∈ C^1(Ω̄)^3, w_r(0, z) = w_θ(0, z) = 0, z ∈ [0, l]},
J = {w | w ∈ J_1, w|_{S_1} = 0},
J_2 = {w | w ∈ J, div_c w = 0},

and let X_1, X, V be the respective closures of J_1, J and J_2 with respect to the corresponding norm. We assume that there exists a function Û satisfying

Û ∈ X_1,   Û|_{S_1} = û,   div_c Û = 0.    (10)

Define an operator L : J → J* as follows:

(L(v), h) = ∫_Ω 2 φ ε_ij(Û + v) ε_ij(h) r dr dz,   v, h ∈ X.    (11)

Here we assume that the function φ, determined by Eq. (3), is continuous and bounded below and above by positive constants. Let Y be a space of scalar functions with an appropriate norm.
Consider the problem: find a pair (w, p) ∈ X × Y satisfying

(L(w), h) − (div_c* p, h) = (K_i, h_i) + ∫_{S_2} F_i h_i dS,   h ∈ X,    (13)
(div_c w, q) = 0,   q ∈ Y.    (14)
Here div_c* is the operator adjoint to div_c that maps X into Y, and dS = (2π)^{-1} dS_Q, where dS_Q is the surface measure on the boundary of Q, with

Q = {x | x = (r cos α, r sin α, z), (r, z) ∈ Ω, α ∈ (0, 2π]}.
If (w, p) is a solution of the problem Eq. (13)-(14), then the pair (u = Û + w, p) is a generalized solution of the problem Eq. (4)-(8). It follows from [2] that, under certain assumptions which are natural from the physical point of view, there exists a solution of the problem Eq. (13)-(14) for φ defined by Eq. (3); moreover, if μ(u, E) = μ(x) is known everywhere, then the solution is unique.
3. Numerical aspects
For the numerical experiment we have considered two examples, an electrorheological clutch and an electrorheological shock absorber. We produce the numerical solutions by solving Eq. (13)-(14) using the technique of augmented Lagrangian with operator splitting [3].
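As a purely illustrative sketch of the type of iteration behind an augmented-Lagrangian treatment of the saddle-point structure of Eqs. (13)-(14), the following Python fragment runs a generic Uzawa-type update for a discretised linear system A u + B^T p = f, B u = 0. The matrices A, B, the right-hand side f and the parameter rho are assumed inputs; the actual solver of the paper (augmented Lagrangian with operator splitting, ref. [3]) additionally handles the nonlinear viscosity function and is not reproduced here.

import numpy as np

def augmented_lagrangian_uzawa(A, B, f, rho=1.0, n_iter=200, tol=1e-10):
    """Generic Uzawa iteration with augmentation for A u + B^T p = f, B u = 0."""
    p = np.zeros(B.shape[0])
    A_aug = A + rho * B.T @ B                      # augmentation term rho * B^T B
    for _ in range(n_iter):
        u = np.linalg.solve(A_aug, f - B.T @ p)    # velocity-like update
        r = B @ u                                  # divergence residual
        p = p + rho * r                            # multiplier (pressure) update
        if np.linalg.norm(r) < tol:
            break
    return u, p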
Figure 1. Schematic diagram of an ERF shock absorber (left). Velocity (m/s) profiles at one cross section of the duct for various applied voltages during compression (right).
A schematic diagram of an electrorheological shock absorber is shown in Figure 1 (left). The shock absorber contains two chambers filled with an electrorheological fluid, a piston with two transfer ducts connecting the chambers, and a third gas-filled chamber separated from the others by a floating piston. The inner walls of the ducts serve as electrodes that are supplied by a voltage lead within the piston rod. As the piston rod moves, the fluid passes through the ducts from one chamber to the other. The generated electric field in the ducts is perpendicular to the flow of the electrorheological fluid. For the numerical investigation we set the radius of each chamber to 23 mm, and the distance between the walls of the duct to 4 mm. Vertical components of the velocity for different applied voltages (U), calculated at one cross section of the duct, are shown in Figure 1 (right). As seen from the figure, for non-zero voltages we observe a flat velocity profile in the middle, representing a solid region (with large viscosity) being formed, which grows with increasing voltage.
References
1. M. Parthasarathy, D. J. Klingenberg, Electrorheology: mechanisms and models, Materials Science and Engineering R 17, 57-103, 1996.
2. R. H. W. Hoppe, W. G. Litvinov, Problems on electrorheological fluid flows, submitted to Discrete and Continuous Dynamical Systems, Series B, 2001.
3. R. Glowinski, P. LeTallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, SIAM Studies in Applied Mathematics, Vol. 9, SIAM, Philadelphia, 1989.
SIMULATIONS OF SPARTAN RANDOM FIELDS
DIONISSIOS T. HRISTOPULOS Department of Mineral Resources Engineering Technical University of Crete University Campus, Kounoupidiana Chania 73100, Greece E-mail:
[email protected]
Spartan random fields have multivariate Gibbs probability distributions that are determined from a frugal set of parameters [1]. Thus, they provide a parsimonious model for representing the variability of spatially distributed processes. Potential applications include interpolation and simulation in geostatistical studies as well as methods for compressing large images. Here we develop methods for simulating Spartan fields with pre-determined parameters on regular lattices and at random locations in two spatial dimensions.

1. Introduction
A random field is a set of inter-dependent, spatially distributed random variables. It involves an ensemble of different realizations (states) that occur with a frequency determined from the multivariate probability density. Spartan random fields have recently been introduced [1] as models of spatial processes. Potential applications include the modeling of environmental (groundwater, atmospheric) pollutant distributions, mineral resources concentrations, and the morphological and transport properties of technological, microscopically nonhomogeneous materials (e.g., porous composites, paper products). Spartan random fields are determined from a "frugal" set of free parameters. Their multivariate probability density is constructed by incorporating physical constraints. A practical motivation for Spartan random fields is to avoid calculating two-point functions (e.g., correlation function, variogram) from samples, which is a computationally intensive inverse problem that scales as O(N²), N being the sample size. In addition, for general distributions (e.g., anisotropic with a priori unknown principal directions) the calculation of two-point functions involves various empirical assumptions. Spartan random fields provide a computationally efficient alternative.
2. Spartan Random Field Models
Spartan random fields are special cases of Gibbs random fields [2], which have a probability density function given by f_X[X] = Z⁻¹ exp{−H[X]}, where Z is a normalization factor (partition function) and H[X] is an "energy functional" of the random field states X. Spartan random fields involve energy functionals of the form H[X_λ(s)]. The subscript "λ" denotes a minimum length scale determined from the measurement resolution. For lattice random fields, λ is set by the lattice spacing. Thus, in contrast with random field models commonly used in geostatistics, Spartan random fields incorporate a built-in notion of scale. This is physically meaningful, since investigations of spatial variability involve a specific resolution. Spartan random fields have the following general properties: (i) The energy functional couples the values of the field within local neighborhoods, which are specified according to the application (e.g., necessary constraints for reconstruction of specific properties). (ii) The couplings are motivated by physical or geometric constraints. (iii) For a model with specified couplings the energy functional is completely determined from a small set of parameters, in contrast with calculating a continuous function (e.g., the covariance). (iv) The model parameters are estimated from the available sample. (v) Spartan random fields are equivalent to Markov random fields. Spartan models are defined in continuum and discrete spaces. The latter involve both regular lattices and irregular (off-lattice) spatial distributions. Continuum models are useful for theoretical investigations, while discrete models are more suitable for numerical simulations. Regular lattices are appropriate if a systematic scanning of the spatial degrees of freedom is possible, e.g., for remotely-sensed images and for industrial materials. Irregular spatial distributions are natural for investigations based on ground measurements, because financial cost and other considerations limit the number and location of sampling stations. Methodological applications of Spartan models include structure characterization of heterogeneous materials, compression of morphological information, estimation of process values at non-sampled locations, and simulations of uncertain environmental processes.
2.1. A Discrete Spartan Model in Two Dimensions

Consider a square lattice with L nodes per side (L = 2^m, m integer), unit spacing, and N nodes s_n, n = 1,..., N. The nearest neighbors of s_n are s_n ± ê_i, where ê_i is the unit vector in the direction i = x, y. The energy functional is
H_n(η_1, ξ; X_λ) is the "energy" associated with s_n and its nearest neighbors. The parameters of the model are the mean m_X, the scale coefficient η_0, the shape coefficient η_1, and the correlation length ξ (in units of lattice spacing). The parameters η_0, η_1, ξ are determined by minimizing a metric that measures the distance between sample constraints (obtained from spatial sample moments) and stochastic constraints (obtained from the multivariate probability density). The definition of the constraints and the optimization procedure used to infer the coefficients from a sample are presented in [1]. In the following we assume that the coefficients are known. The terms in H_n(η_1, ξ; X_λ) have a physical motivation: the first term is the deviation of the local value from the mean, the second is proportional to the square of the local gradient, and the third to the square of the local curvature. The multivariate probability distribution of the Spartan model defined by Eqs. (1)-(2) is Gaussian, because the energy is a quadratic functional of the field. Stochastic expectations with respect to the probability density are denoted by the symbol E[·]. The field is stationary because the mean m_X and the coefficients η_0, η_1 and ξ are uniform in space. It is also "quasi-isotropic" because there is no distinction between the lattice axes (full isotropy is not possible due to the discrete lattice structure). This condition can be relaxed by introducing direction-dependent coefficients. The coefficient η_0 determines the total variance, ξ determines the range of spatial dependence, and η_1 affects the shape of the covariance function.
2.2. The Covariance Spectral Density

The covariance G_{X;λ}(r) = E[X_λ(s) X_λ(s+r)] − E[X_λ(s)] E[X_λ(s+r)] is the most common measure of spatial dependence (it fully determines the spatial dependence for Gaussian probability densities). The spectral density, i.e., the Fourier transform of the covariance, is well approximated by its continuum counterpart, which is expressed in terms of η_0, η_1, ξ as follows:

G̃_{X;λ}(k) = η_0 ξ² |Q̃_λ(k)|² / [1 + η_1 (kξ)² + (kξ)⁴].    (3)

The vector k is the spatial frequency (wave-vector). The kernel |Q̃_λ(k)|² cuts off fluctuations at frequencies higher than 1/λ. For a square lattice we approximate the kernel with a sharp cutoff at k_max = 2π. The parameters η_0, η_1, ξ are not completely free, since Bochner's theorem, e.g., [3], requires the spectral density to be non-negative and integrable. Hence, either η_0 > 0, ξ > 0 and η_1 ≥ 0, or η_0 > 0, ξ > 0 and η_1 < 0, η_1² < 4 should be satisfied.
The covariance for non-zero separation distances can be obtained [1] by numerical inverse Fourier transform of the spectral density, G_{X;λ}(r) = (2π)⁻² ∫ dk G̃_{X;λ}(k) e^{ik·r}. The direct transform is given by means of G̃_{X;λ}(k) = ∫ dr G_{X;λ}(r) e^{−ik·r}. The correlation range of the covariance function is defined by means of b² = ∫ dr G_{X;λ}(r)/G_{X;λ}(0). This is also expressed as b² = G̃_{X;λ}(k=0)/G_{X;λ}(0), and using Eq. (3) we obtain b = ξ. The variance σ²_{X;λ} = G_{X;λ}(0) is obtained by integrating the spectral density, i.e., σ²_{X;λ} = (2π)⁻¹ ∫₀^{k_max} dk k G̃_{X;λ}(k). Using the transformation (kξ)² → κ and assuming ergodic conditions, k_max ξ ≫ 1, σ²_{X;λ} is shown to be independent of the correlation length and can be evaluated explicitly from integral tables [1].
3. Fast-Fourier-Transform Simulation Method

Simulations of lattice Spartan random fields can take advantage of the regularity of the lattice structure. For multi-Gaussian distributions Fourier filtering can be used [4,5] to generate fluctuations around the mean: a set of N Gaussian random numbers is generated and filtered in frequency space to enforce correlations. These operations scale as O(N). The random field in real space is obtained by evaluating the inverse Fast Fourier Transform of the filtered set, which scales as O(N log N). This ensures very efficient performance of the algorithm compared with methods based on covariance matrix decomposition, which are O(N²) and memory intensive. If X_λ(s) is a state of a real-valued, continuum random field, its Fourier transform is X̃_λ(k) = u(k) [G̃_{X;λ}(k)]^{1/2},
where u(k) is a complex, zero-mean
Gaussian process with ultra-local correlation E[u*(k′) u(k)] = (2π)² δ(k − k′). This condition ensures that X_λ(s) = IFFT[X̃_λ(k)] has the desired covariance. Since the field X_λ(s) is real, it is also required that u*(−k) = u(k). These relations should be suitably modified for matrices representing lattice functions.
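A minimal Python sketch of the Fourier-filtering step is given below. It filters white Gaussian noise by the square root of the continuum spectral density of Eq. (3); the normalisation conventions and the omission of the band-limiting kernel are simplifying assumptions of this illustration, not the code used in the paper.

import numpy as np

rng = np.random.default_rng(0)

def spartan_spectral_density(kx, ky, eta0, eta1, xi):
    # continuum approximation of Eq. (3) without the cutoff kernel
    k2 = kx ** 2 + ky ** 2
    return eta0 * xi ** 2 / (1.0 + eta1 * k2 * xi ** 2 + (k2 * xi ** 2) ** 2)

def simulate_spartan_fft(L, eta0, eta1, xi):
    """Generate a zero-mean Gaussian lattice state by Fourier filtering."""
    k = 2.0 * np.pi * np.fft.fftfreq(L)             # angular frequencies, unit spacing
    KX, KY = np.meshgrid(k, k, indexing="ij")
    G = spartan_spectral_density(KX, KY, eta0, eta1, xi)
    u = np.fft.fft2(rng.standard_normal((L, L)))    # Hermitian-symmetric by construction
    field = np.fft.ifft2(u * np.sqrt(G)).real       # filtered fluctuations in real space
    return field - field.mean()

state = simulate_spartan_fft(L=256, eta0=1.0, eta1=0.2, xi=5.0)

Because the noise is the transform of a real array, the requirement u*(−k) = u(k) is satisfied automatically.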
3.1. Numerical Experiments
Two examples of Spartan states are shown in Figure 1. The field shown on the left is ergodic, with a sample standard deviation equal to the theoretical value σ_{X;λ} = 0.34. The field on the right is non-ergodic, and the sample standard deviation is different from the theoretical value σ_{X;λ} = 0.63.

Figure 1: Simulation of a Spartan random field with η_0 = 1, η_1 = 0.2 and ξ = 5 (left). Simulation of a Spartan random field with η_0 = 3, η_1 = −0.2 and ξ = 50 (right).
We have verified that the theoretical covariance function (obtained by the numerical inversion of the spectral density) is in excellent agreement with the sample covariance functions. The latter were obtained by (1) calculating the L one-dimensional covariance functions along the horizontal lattice direction, and (2) averaging over the L one-dimensional covariance functions.

4. Mode Superposition Method
The FFT method cannot be used to simulate Spartan random fields at randomly distributed points. We can then use a mode superposition method [6,5]. The phases φ_n are distributed uniformly in [0, 2π], and the frequencies k_n follow the probability density f_k(k_n) = G̃_{X;λ}(k_n)/σ²_{X;λ}. The mode superposition converges to X_λ(s) as N_m tends to infinity. The numerical complexity is O(N_m N) (since the cosine function must be calculated for all mode-point combinations). Realizations simulated with a number of modes on the order of 10,000 provide an accurate reproduction of the statistics.
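The sketch below illustrates the mode-superposition idea at irregular sampling points: wave-vectors are drawn from a density proportional to the spectral density by rejection sampling and combined with uniform random phases. The cutoff k_max, the rejection envelope (valid for η_1 ≥ 0) and the unit-variance normalisation are assumptions of this illustration.

import numpy as np

rng = np.random.default_rng(1)

def spartan_density(k2, eta0, eta1, xi):
    return eta0 * xi ** 2 / (1.0 + eta1 * k2 * xi ** 2 + (k2 * xi ** 2) ** 2)

def sample_wavevectors(n_modes, eta0, eta1, xi, k_max):
    """Rejection sampling of k_n proportional to the spectral density."""
    accepted = []
    g_max = spartan_density(0.0, eta0, eta1, xi)        # envelope, maximum for eta1 >= 0
    while sum(a.shape[0] for a in accepted) < n_modes:
        cand = rng.uniform(-k_max, k_max, size=(10000, 2))
        g = spartan_density(np.sum(cand ** 2, axis=1), eta0, eta1, xi)
        accepted.append(cand[rng.uniform(0.0, g_max, size=g.shape) < g])
    return np.vstack(accepted)[:n_modes]

def simulate_spartan_modes(points, n_modes, eta0, eta1, xi, k_max=2.0 * np.pi):
    """Harmonic superposition of n_modes cosine modes at the given points."""
    ks = sample_wavevectors(n_modes, eta0, eta1, xi, k_max)
    phases = rng.uniform(0.0, 2.0 * np.pi, n_modes)
    return np.sqrt(2.0 / n_modes) * np.cos(points @ ks.T + phases).sum(axis=1)

pts = rng.uniform(0.0, 100.0, size=(500, 2))             # random sampling locations
vals = simulate_spartan_modes(pts, n_modes=2000, eta0=1.0, eta1=0.2, xi=5.0)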
5. Conclusions
We presented methods for simulating Spartan random fields on square lattices (based on the Fast Fourier Transform) and at random points in two-dimensional space (using a harmonic mode superposition method). Extensions to systems of higher dimensionality are trivial. We focused on a specific Spartan model with nearest-neighbor coupling, but the methods can easily be extended to more strongly non-local models with Gaussian statistics. We will consider space-time random field models, robustness of the parameters to sample noise, and extension to non-Gaussian distributions in future research.

References
1. D. T. Hristopulos, SIAM J. Sci. Comp., in press.
2. G. Winkler, Image Analysis, Random Fields and Dynamic Monte Carlo Methods: A Mathematical Introduction, Springer Verlag, NY, 1995.
3. G. Christakos and D. T. Hristopulos, Spatiotemporal Environmental Health Modelling, Kluwer, Boston (1998).
4. H. A. Makse, S. Havlin, M. Schwartz and H. E. Stanley, Phys. Rev. E 53, 5445 (1996).
5. D. T. Hristopulos, Stoch. Environ. Res. Risk Assessment 16, 43 (2002).
6. I. T. Drummond and R. R. Horgan, J. Phys. A 20, 4661 (1987).
FUNDAMENTAL FUZZY RELATION CONCEPTS OF A D.S.S. FOR THE ESTIMATION OF NATURAL DISASTERS' RISK (THE CASE OF A TRAPEZOIDAL MEMBERSHIP FUNCTION)

LAZAROS S. ILIADIS Department of Forestry and Environmental Management and Natural Resources, Democritus University of Thrace, Pantazidou 193, 68200 Orestiada, Greece. E-mail: [email protected]

STEFANOS H. SPARTALIS Department of Agricultural Development, Democritus University of Thrace, Pantazidou 193, 68200 Orestiada, Greece. E-mail: [email protected]
The effective protection from natural disasters requires the development of a rational and sensible policy. The cases of the recent (2003) floods in northern, central and southern Greece and the big problem of forest fire breakouts (during the summer season) have revealed the necessity for serious strategic planning that will reduce the consequences and especially the cost in human lives. Computer science and fuzzy mathematical relations can provide significant aid in this direction. Given a specific kind of disaster, the estimation of the degree of natural disasters' risk (D.N.D.R.) for a prefecture of Greece can be performed by the use of a Decision Support System (D.S.S.). The whole inference mechanism of the D.S.S. will be based on various aspects of fuzzy algebra. Fuzzy machine learning techniques will also be used. Fuzzy logic models constitute the modeling tools of soft computing. Fuzzy logic is a tool for embedding structured human knowledge into workable algorithms. The problem of the existing approaches (for the D.N.D.R. estimation) is that they use crisp sets. A crisp set is based on the concept that something either belongs to it or it does not. Based on this logic an area either belongs to the highest (or lowest) risk group or not. In this way specific boundaries are drawn between the areas in order to cluster them. For example, areas with 20 to 30 annual forest fire breakouts are considered to be high-risk. At the same time an area with 19 fires is considered not risky and another with 31 is considered a maximum-risk area. However, it does not seem rational to differentiate two areas with a difference of one fire breakout. In this case, fuzzy sets can be used to produce the rational and sensible clustering. For fuzzy sets there exists a degree of membership μ_s(X) that is mapped on [0,1], and every area belongs to all clusters at the same time (from lowest risk to highest) with a different degree of membership. Of course the characteristic cluster for each prefecture is the one with the highest value of μ_s(X). A trapezoidal membership function can be applied in this case that produces five different cases of degrees of membership, as sketched below.
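The following short Python sketch shows the five cases of a trapezoidal membership function. The break points a, b, c, d used in the example are hypothetical and only illustrate the idea, they are not values proposed by the authors.

def trapezoidal_membership(x, a, b, c, d):
    """Trapezoidal membership degree mu(x) in [0, 1]:
    0 below a, linear rise on [a, b], 1 on [b, c], linear fall on [c, d], 0 above d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)          # the remaining case, c < x < d

# Example: an area with 19 annual fire breakouts, hypothetical break points (15, 20, 30, 35):
mu_high = trapezoidal_membership(19, 15, 20, 30, 35)   # 0.8, instead of an all-or-nothing 0

In this way the area with 19 fire breakouts is not excluded from the high-risk cluster; it simply participates in it with a smaller degree of membership.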
However, in many cases the problem is the estimation of a joint D.N.D.R. For example it is very important to define the risk degree of an area based on the height of rain and on the estimated economic damage at the same time. The production of a unique risk index would be very important in such a case. The D.S.S. would be asked to cluster the areas and characterize them as "areas with the most floods AND areas with the largest financial damage". To achieve this kind of characterization, or others of the same nature, fuzzy mathematical operations like T-norms or S-norms can be applied on the μ(X)_ij, where i = 1,2,3,4,...,n (n is the number of the areas under examination) and j = 1,2,3,...,m (m is the number of criteria for which the risk is calculated). Various types of joint D.N.D.R.'s can be produced for all of the prefectures of Greece by the application of fuzzy relations and by using matrix multiplications. The system will be able to provide clustering for the areas of Greece based on a specific type of D.N.D.R. and also for joint D.N.D.R.'s. The Greek authorities can rely on the risk groups in order to distribute their forces rationally and to plan appropriate recovery policies.
A HYBRID MOLECULAR DYNAMICS SIMULATION METHOD FOR SOLIDS S. ITOH Toshiba R&D Center, Kawasaki, Kanagawa 212-8582, JAPAN E-mail: satoshi.
[email protected]
M. IGAMI Fuji Research Institute Corporation, Kanda Nishicho, Chiyoda-ku, Tokyo, 102-8443, JAPAN E-mail:
[email protected]
An efficient molecular dynamics simulation method for solids is presented, where two different schemes are hybridized. This hybrid method can be run more than 10 times faster than conventional simulation methods.

1. Introduction
A molecular dynamics (MD) simulation is a powerful tool for the investigation of the mechanical properties of solids. In an MD simulation, the constituent atoms move according to the forces acting on each atom, and the force is evaluated from an interatomic potential. If the interatomic potential is assumed to be a simple potential derived from a tight-binding Hamiltonian, the MD simulation can be carried out very easily, and, therefore, a very large system containing ~10^6 atoms can be analyzed by the MD simulation. However, the tight-binding Hamiltonian contains adjustable parameters, and there is an ambiguity in the determination of these parameters. Recently, MD simulations have been performed based on a fully quantum-mechanical description with no adjustable parameters, which is called ab initio molecular dynamics (AIMD) simulation. Since the AIMD simulation has no empirical parameter, the AIMD predicts electronic, mechanical and other properties of materials without any experimental data.
*This work is supported by ACT-JST. Present address: National Institute of Science and Technology Policy, Kasumigaseki, Chiyoda-ku, Tokyo, 100-0013, Japan.
But, unfortunately, there is a limitation on the applicability of the AIMD, because an AIMD simulation requires a very large computational time. The AIMD simulation gives a highly reliable result without empirical parameters, but requires a huge computational effort. On the other hand, the tight-binding molecular dynamics (TBMD) simulation using a simple interatomic potential deals with a large system containing a lot of atoms/molecules, but needs some empirical parameters. It is expected that an efficient simulation can be performed by using the AIMD together with the TBMD simulation, which is called a hybrid MD simulation. In the hybrid MD, the AIMD is used in a small region containing the chemically reactive part, while the TBMD is carried out in a large region where there is no chemically reactive part. It is necessary to calculate the forces acting on each atom based on quantum mechanics in the chemically reactive region, and, on the other hand, it is enough to evaluate the forces acting on each atom via parameterized interatomic potentials in the chemically stable region. Morokuma and his coworkers have presented an efficient hybrid scheme to evaluate the energy of a large molecule and its derivatives, i.e. the forces acting on atoms, which is called the ONIOM method [1,2,3]. The Morokuma group has shown good numerical results for many large molecules including biomaterials. The present authors extend the ONIOM method into a hybrid MD simulation scheme for solids. In our scheme, two different approximations for constructing the Hamiltonian, i.e. density functional theory (DFT) and parameterized tight-binding theory, and two differently sized atomic structures are incorporated under a periodic boundary condition. A numerical test of this method is presented in this paper.
2. Methodology
The present method is based on a supercell method. In a conventional supercell method, a large-sized supercell is needed to obtain a precise result. In order to avoid handling such a large-sized supercell, another small-sized supercell is introduced. This small-sized supercell corresponds to a small region in the large-sized supercell which contains the reactive region, like a defect, impurity and so on. Following the original ONIOM method, the large-sized and small-sized supercells are named the REAL and MODEL systems, respectively. We denote the two different approximation methods as HIGH and LOW, which mean the DFT and parameterized tight-binding methods, respectively. It is desirable to calculate the total energy of the large-sized supercell by using a high-level approximation method for the Hamiltonian like DFT. Thus, in a similar way to the ONIOM method [1], the total energy E(HIGH,REAL) is approximately given by
E(HIGH,REAL) ≅ E(LOW,REAL) - E(LOW,MODEL) + E(HIGH,MODEL).
(1)
Differentiating E with respect to the atomic coordinates R_a, we obtain the force acting on atom a,

∂E(HIGH,REAL)/∂R_a ≅ ∂E(LOW,REAL)/∂R_a − ∂E(LOW,MODEL)/∂R_a + ∂E(HIGH,MODEL)/∂R_a.    (2)

In Equations (1) and (2), the computational time of the third term on the right-hand side is dominant in the whole computational time. In the original ONIOM formulation, link atoms are introduced to terminate the surface atoms of the MODEL system. However, in the present formulation for solids, the link atoms are unnecessary, so that a Jacobian matrix does not appear in Equation (2).
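A minimal sketch of the energy/force combination of Eqs. (1) and (2) is given below. The function names and the mapping array model_atoms are illustrative assumptions; the electronic-structure calculations that produce the individual energies and forces are of course not reproduced here.

import numpy as np

def hybrid_energy_and_forces(E_low_real, F_low_real,
                             E_low_model, F_low_model,
                             E_high_model, F_high_model,
                             model_atoms):
    """Combine low-level REAL and low/high-level MODEL results per Eqs. (1)-(2).

    F_low_real            : (N_real, 3) forces from the tight-binding REAL supercell
    F_low_model/F_high_model : (N_model, 3) forces from the MODEL supercell
    model_atoms           : indices of the MODEL atoms inside the REAL supercell
    """
    E = E_low_real - E_low_model + E_high_model          # Eq. (1)
    F = F_low_real.copy()
    F[model_atoms] += F_high_model - F_low_model         # Eq. (2), no link-atom Jacobian
    return E, F

Because the expensive HIGH/MODEL calculation is independent of the cheap LOW/REAL one, the three terms can be evaluated on different processors, which is what makes the scheme attractive for distributed and GRID computing.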
3. Numerical Test

The numerical result for a defect energy in Si is presented in Table 1, where the electronic structures of the MODEL systems are calculated in the framework of DFT with the local approximation for the exchange-correlation energy, and the electronic structures of the REAL systems are calculated by using a parameterized tight-binding Hamiltonian. The former calculations are pursued using a plane-wave basis with a cut-off energy of 8 Ryd. The latter calculations are carried out non-self-consistently, and thus are performed quickly. If a MODEL system containing 64 atoms and a REAL system containing 1000 atoms are adopted as the hybrid system, the calculated defect energy is 3.9023 eV. The error from the energy obtained by using the large-sized supercell consisting of 216 atoms is 1%, and the CPU time required for the hybrid system is about 1/50 of that for the large-sized supercell. The present hybrid MD simulation method is highly efficient for semiconductors. Now we are checking the efficiency of the method for metallic systems, which will be reported elsewhere.
4. Concluding Remarks
A hybrid MD simulation method for solids based on the ONIOM method for large molecules has been presented, and has given an efficient and good result for evaluation of defect energy in Si with lattice relaxations. This hybrid MD simulation method is coarse-grained from the viewpoint of parallel processing, and, therefore, is suitable for distributed computing, even if the communication speed between processor elements is slow. Therefore the hybrid MD simulation method is applicable for a large area computer network and GRID computing.
Table 1. Point defect energy in silicon. Error means the difference from the ab-initio defect energy estimated by using the large supercell containing 216 atoms.

Number of atoms     Number of atoms     Defect        Error
in MODEL system     in REAL system      energy (eV)   (eV)
8                   64                  2.5684        1.3745
8                   216                 3.1525        0.7904
8                   512                 3.2454        0.6975
8                   1000                3.2712        0.6717
64                  216                 3.7836        0.1593
64                  512                 3.8765        0.0664
64                  1000                3.9023        0.0406
References
1. F. Maseras and K. Morokuma, J. Comput. Chem. 16, 1170 (1995).
2. M. Svensson, S. Humbel, R. D. J. Froese, T. Matsubara, S. Sieber, and K. Morokuma, J. Phys. Chem. 100, 19357 (1996).
3. S. Humbel, S. Sieber, and K. Morokuma, J. Chem. Phys. 105, 1959 (1996).
THE CONTRACT GAS MARKET WITH A LINEAR SUPPLY FUNCTION *
I. G. IVANOV AND L. G. TASEVA Faculty of Economics and Business Administration, Sofia University "St. Kl. Ohridski", 125 Tzarigradsko chaussee, bl. 3, Sofia 1113, Bulgaria E-mail:
[email protected]
This paper presents a stylized model of a learning process through which power generating companies could adjust their supply bidding strategies in order to achieve a profit-maximizing equilibrium in the form of a Supply Function Equilibrium (SFE). The model is based on real market assumptions. Market players can form their behavior relying on market observations with no need of information on other players' contracts and generation costs. We assume an asymmetric duopoly selling a homogeneous commodity. Market conditions are characterized by linear demand and supply functions, and in addition a constraint is imposed on one of the players' quantity.
1. Introduction

Game theory is one of the most promising approaches to understanding and studying competition in electricity markets. It now finds applications in generation markets. We use a Cournot-adjustment process based on the assumption that a player can observe some aggregate statistics pertaining to previous actions of all players. Although no player is supposed to have information about the strategies and performance of each individual competitor, it is possible to estimate how other players behave. At the minimum, players know the level of total demand served by the market in each trading period and the actual clearing prices in the trading period, which are applied in the spot market. The literature dealing with energy and gas markets is huge, and a wide range of models have been proposed for simulating the interaction of competing

*This work is partially supported by the Sofia University "St. Kl. Ohridski" research project 20/2003
generating companies who price strategically. We refer to the game-theoretic part of this literature: Baldick, Grant and Kahn [1] consider a supply function equilibrium (SFE) model of interaction in the electricity market, assuming that demand is linear. They consider several strategic players, all having capacity limits and affine marginal costs. Breton and Zaccour [2] formulate a model incorporating two asymmetries: one of the players maximizes its profit and the second one maximizes its revenue and is not allowed to sell more than a certain proportion of the quantity sold by its rivals.
2. Model
We consider a duopoly selling a uniform commodity. The players are labeled respectively 1 and f; their corresponding quantities on the market are q_1 and q_f. Let Q = q_1 + q_f denote the total quantity supplied by both market participants. Demand, as formulated in Green's paper "The Electricity Contract Market in England and Wales", is the total demand for electricity less the supply from a fringe of non-strategic generators, assumed to bid at marginal cost. We will use the model with a linear demand function of the form A − bp, where A is a constant intercept, though it can be treated as varying over time, or stochastically; b is the slope of the demand responsiveness to price, and b is assumed to be nonnegative. Player 1's production cost is given by C_1(q_1) = w q_1 + ½ δ q_1², where w and δ are positive parameters; δ is the marginal cost coefficient of player 1. The production cost of player f is assumed to be zero, that is, the production cost is negligible for this player when compared to the benefits of earning revenues (eventually in hard currencies): C_f(q_f) = 0. We assume that the market rules specify that the supply function of each firm is affine; its form is q_1 = α_1 + β_1 p and respectively q_f = α_f + β_f p, where β_1 and β_f are the slopes of the supply functions; with known slopes, the intercepts α_1 and α_f can be calculated. We make an additional security constraint that the slope of the supply function of player f is β_f = γ β_1, where γ is a variable and γ > 0. Companies may also enter bilateral contracts with buyers at prices that are not instantly related to spot prices. These are total contractual obligations q̄_j, j = 1, f, with weighted average price r_j, j = 1, f. Each firm may have m different contracts at prices r_j^i, respectively. In this case, a firm's total contractual commitment is equal to q̄_j = q_j^1 + ... + q_j^m and
+
qrn
q1
256
characterized by the weighted average price r_j = (r_j^1 q_j^1 + ... + r_j^m q_j^m) / q̄_j. The outcomes of the firms are obtained from the spot revenues, the contract revenues and the production costs,
where q̄_1 and q̄_f are the contracted quantities and r_1 and r_f are the prices on the bilateral contract quantities.
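To make the market mechanics concrete, the following small Python sketch clears the spot market for given affine supply bids against the linear demand A − bp. All numerical values are hypothetical and only illustrate the bookkeeping; the equilibrium slopes and intercepts themselves are derived from the first-order conditions discussed next.

def clearing_price(A, b, alpha_1, beta_1, alpha_f, beta_f):
    """Spot-market clearing: A - b*p = (alpha_1 + beta_1*p) + (alpha_f + beta_f*p)."""
    p = (A - alpha_1 - alpha_f) / (b + beta_1 + beta_f)
    q1 = alpha_1 + beta_1 * p
    qf = alpha_f + beta_f * p
    return p, q1, qf

# hypothetical numbers, only to show the mechanics
p, q1, qf = clearing_price(A=100.0, b=1.0, alpha_1=5.0, beta_1=2.0, alpha_f=3.0, beta_f=4.0)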
The results give calculations for the slope and intercept of player 1's supply function. We have to bear in mind that slope is nonnegative and thus we find a final solution for the slope of the supply function of player 1 in the form of:
+ J(l + b6 - y)2 + 4b6y ,
a ! = - K PI - 4 1 . 267 b 701 We use the same simple computational methods for player f i n order to find an equilibrium that satisfies the stated market conditions b cxf = Y f and pl = 7-1' The step that follows is to find a value of which meets the requirements of such a formulated market conditions and thus allowing both players to achieve maximum profit. A quadratic function for is received. As a result has two different values, which are: Pl =
-(1+ b6 - y)
71 =
2+6b-d-
,
72
=
2
+
+ 6b + db2P+ 2b6
2 First we take into consideration the value of y1 and make some computations to find the intercepts and slopes of the supply functions of both firms. We have to verify if y1 is nonnegative. The evaluation shows that y1 is within 0 and 1 and this is the case where the slope of player 1's supply function is negative: 2b < 0. P1 = b6 - d b 2 h 2 2b6
2
+
257
This case is not interesting, because it describes an inverse relationship between price and quantity supplied. Now we take into consideration the second case where y2 is calculated. It can be easily verified that y2 > 1. The constraint can be interpreted as constraint imposed on player f thus assuming that its supply function is more elastic to price changes. We use the same computations with 7 2 to find the slopes and intercepts of both firms:
+ + Jm) ‘’ = b6 + d m ’ pf + x ( b 6 + Jb2d2 + 2b6) - w(bd + 1+ Jb262 + 2b6) ffl = , fff=Yf. (b6 + 1 + db2a2+ 2b6) (bd + db2P+ 2b6) 2b
b(b6 2 = b6 d
Using these results both players can make some calculations for the slopes and intercepts of their supply functions for each day ahead and find and equilibrium giving maximum profit for each of them thus relying only on market information for the day before. 3. Conclusions
Three major market conditions may influence and determine firms strategy concerning the quantity produced and sold: Contractual obligations. It is not unusual for firms to change their contractual obligations on a daily basis. However, the structural dynamic of the process described above indicates that changes in contractual commitments will have no impact on the learning process. Firms’ marginal costs and demand requirements. Though these factors are a crucial part of the model, companies do not need information on contractual commitments of other participants and their rivals’ marginal costs. All the needed information is the demand responsiveness to price and the firm’s marginal cost coefficients. References 1. R.. Baldick, R.. Grant and E. Kahn, Linear supply function equilibrium: generalizations, applications and limitations, Technical Report (2000). 2. M. Breton and G. Zaccour, Equilibria in an asymmetric duopoly facing a security constraint, Energy Economics 23, 457-475, (2001).
HYSTERESIS LOOP OF A NANOSCOPIC MAGNETIC ARRAY
A. KACZANOWSKI, K. MALARZ AND K. KULAKOWSKI Faculty of Physics and Nuclear Techniques, University of Mining and Metallurgy, al. Mickiewicza 30, 30-059 Kraków, Poland E-mail:
[email protected]
The dynamics of nanoscopic arrays of monodomain magnetic elements is simulated by means of the Pardavi-Horvath algorithm. The experimental hysteresis loop is reproduced for arrays of Ni with period 100 nm and mean coercive field 710 Oe. We investigate the fractal character of the cluster of elements with positive magnetic moments. No fractal is found. We also apply the technique of damage spreading. The consequences of a local flip of a magnetic element remain limited to a finite area. We conclude that the system does not show complex behaviour.
Nanoscopic magnetic arrays have been proposed recently as devices for ultrahigh-density information storage [1]. The dynamics of the total magnetization, i.e. its time dependence in the presence of an oscillating magnetic field, is governed by two agents: the long-range magnetostatic interaction between each two elements of the array, and the switching field (coercive field), which varies from one element to another. The nonzero switching field enables information to be preserved in the array at positive temperatures. The interaction leads to flips of magnetic moments, and the information, otherwise frozen in the array, is destroyed at least partially. It was announced recently [2,3] that an array of bistable magnetic wires can display complex behaviour. The authors discussed the effect of self-organized criticality (commonly abbreviated as SOC [4]). Let us recall the characteristic features of SOC. According to an illuminating paper of Flyvbjerg [5], excitations wandering in a system can lead it to a self-organized state, i.e. a spontaneously formed state far from thermal equilibrium. By criticality we mean that in this state excitations can be of any order of magnitude. In fact there is no characteristic scale in a critical state: the array is scale-free in the same way as a cluster of positive spins in a ferromagnet at its Curie temperature. If SOC is present in our arrays, information cannot
be preserved in the system - it is erased by excitations which can spread over the whole array. The problem is important and tempting. The aim of this paper is to investigate the state of SOC in a system as realistic as possible. Our simulations concentrate on the nanoscopic array described in Ref. [1]. Monodomain magnetic cylinders of nickel, 57 nm in diameter and 115 nm in length, form a square array with period 100 nm. The magnetization is M_s = 370 emu/cm³. The mean switching field of one element is H_c = 710 Oe, with a standard deviation of 105 Oe. The switching field is constant in time for each element. The magnetostatic interaction is calculated by means of the RPA formula [6]. The only modification introduced here is the periodic boundary conditions, so as not to introduce boundaries which could alter the results. Basically, we search for excitations which spread over the lattice. The technique applied is known as damage spreading [7]. However, one of the arguments of Refs. [2,3] is based on the calculation of the box-counting fractal dimension of the cluster of elements with a given orientation of magnetic moments. That is why we also investigate this fractal dimension. The dynamics of the system is simulated by means of the Pardavi-Horvath (PH) algorithm [9]. This algorithm is checked by the calculation of the hysteresis loop and a comparison with the experimental one - the agreement is quantitative and good. The damage spreading technique is applied for an array of 100 x 100 elements. Two arrays are kept in the memory of the CPU, in the same randomly selected initial state. The PH procedure is applied to lead these arrays to the stable (stationary) state, where no flips occur. Then, a magnetic moment is flipped in one array, and we check if the flip is stable. Subsequently we apply a periodic magnetic field of amplitude H_a. Note that the frequency is not relevant because both arrays reach stable states each time before the field is varied. The damage is defined as the Hamming distance between the two arrays: the number of elements with different magnetic moments. The result is that the Hamming distance increases only during some transient time. Then, the system reaches a limit cycle, with a length equal to some multiple of the period of the applied field. A typical result is shown in Fig. 1. We note that if a system is critical, the size of damages continuously increases [10]. Here the size A of the maximal damage increases with the amplitude H_a for small H_a, and it vanishes in most cases for H_a > 1200 Oe. It is obvious that any damage must disappear if the system is saturated. Besides, A varies strongly from sample to sample. As a rule, the damage is localized as a shapeless and seemingly random pattern
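The damage-spreading bookkeeping can be sketched as follows in Python. The relaxation routine relax stands for the Pardavi-Horvath update of the array in the applied field and is not reproduced here; the ±1 encoding of the moments and the function names are assumptions of this illustration.

import numpy as np

def hamming_distance(spins_a, spins_b):
    """Damage between two replicas: number of elements whose moments differ."""
    return int(np.count_nonzero(spins_a != spins_b))

def damage_after_flip(spins, flip_index, relax, field_cycle):
    """Copy a relaxed configuration, flip one element, and follow the damage
    while both replicas are driven through the same periodic field cycle."""
    replica = spins.copy()
    replica[flip_index] = -replica[flip_index]          # local perturbation
    damage = [hamming_distance(spins, replica)]
    for h in field_cycle:                               # values of the applied field
        spins = relax(spins, h)                         # relax original replica
        replica = relax(replica, h)                     # relax perturbed replica
        damage.append(hamming_distance(spins, replica))
    return damage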
Figure 1. Time dependence of the damage. The maximal value of A in the stationary state is close to 50.
of the array elements, formed of several clusters, separated but close to each other. The fractal dimension D of the positively (or negatively) oriented cluster is calculated for an array of 1024 x 1024 elements. In the initial, randomly selected state D is equal to 2.0, which means that it has no fractal character. Besides, such a state preserves the characteristics of the generator of (pseudo)random numbers, and therefore it cannot be treated as realistic. We have calculated D as a function of time during the hysteresis experiment performed in the computer. It is obvious that D = 2.0 or zero in the saturated states. We found that this bistable character of D is preserved also for small amplitudes H_a of the applied field. Between these two values, the scaling is not proper. An example is shown in Fig. 2. Typically, a part of the curve for small cluster sizes shows an inclination close to zero, and another part close to 2.0. Similar plots have been presented also in Refs. [2,3]. As the applied field changes, the bent part of the curve is shifted to the left and finally disappears. Contrary to the conclusions in the above papers, it seems to us that our numerical accuracy does not allow us to claim that the investigated clusters can be characterized as fractals. The fractal dimension D obtained in Refs. [2,3] is 1.97, which is close to the non-fractal value 2.0. This argument
261
0
0.5
1
1.5
2
2.5
3
3.5
log@)
Figure 2. The fractal dimension D as read from the slope log(n) vs log(d), where n is the number of boxes of size d , containing elements with positive magnetic moments. Here D switches from 2.0 to zero.
can be weakened if we remember that in the Ising ferromagnet at the Curie point is 187/96 x 1.95 l l . However, we have checked that in the latter case, the slope of the curve analogous t o Fig.2 is much more uniform. References 1. M. C. Abraham, HSchmidt, T. A. Savas, H. I. Smith, C. A. Ross and R. 3. Ram, J. Appl. Phys. 89 5667, (2001). 2. J. Velbquez and M. Vgzquez, J. Magn. Magn. Mater. 249,89 (2002). 3. J. Velbquez and M. VBzquez, Physica B320,230 (2002). 4. Per Bak,How Nature Works. The Science of Self-organized Criticality, Copernicus, Springer-Verlag New York, Inc. 1999. 5. H. Flyvbjerg, Phys. Rev. Lett. 76,940 (1996). 6. M. Hwang, M. Farhoud, Y. H m , M. Walsh, T. A. S a w , H. I. Smith and C. A. Ross, IEEE Trans. Magn. 36,3173 (2000). 7. N. Jan and L. de Arcangelis, in Annual Reviews of Computational Physics I, ed. by D. Stauffer, World Scientific, Singapore 1994, p.1. 8. C. Beck and F. Schlogl, Thermodynamics of Chaotic Systems, Cambridge UP, Cambridge 1993. 9. G. Zhang, M. Pardavi-Horvath and G. Vertesy, J. Appl. Phys. 81,5591 (1997). 10. K. Malarz, K. Kulakowski, M. Antoniuk, M. Grodecki and D. Stauffer, Int. J . Mod. Phys. C 9, 449 (1998). 11. A. L. Stella and C. Vanderzande, Phys. Rev. Lett. 62,1067 (1989).
NUMERICAL SOLUTION OF THE TWO-DIMENSIONAL TIME INDEPENDENT SCHRODINGER EQUATION WITH EXPONENTIAL-FITTING METHODS*
Z. KALOGIRATOU Department of International Trade, Technological Educational Institute of Western Macedonia at Kastoria, P.O. Box 30, GR-521 00, Kastoria, Greece TH. MONOVASILIS~ Department of Computer Science and Technology, Faculty of Science and Technology, University of Peloponnese, GR-22100 W p o l i s , Greece
T.E. SIMOS? 5 Department of Computer Science and Technology, Faculty of Science and Technology, University of Peloponnese, GR-22100 Ipripolis, Greece E-mail: tsimosQmail.ariadne-t.gr
The solution of the two-dimensional time-independent Schrodinger equation is considered by partial discretization. We apply exponential-fitting methods for the solution of the discretized problem which is an ordinary differential equation problem. All methods are applied for the computation of the eigenvalues of t h e twodimensional harmonic oscillator and the two-dimensional Henon-Heils potential. The results are compared with the results produced by full discretization.
*This project is funded by research project 71239 of Prefecture of Western Macedonia and the E.U. is gratefully acknowledged t Also at Department of International Trade, Technological Educational Institute of Western Macedonia at Kastoria, P.O. Box 30, GR-521 00, Kastoria, Greece $Active Member of the European Academy of Sciences and Arts §Corresponding author. Please use the following address for all correspondence: Dr. T.E. Simos, 26 Menelaou Street, Amfithea - Paleon Faliron, GR-175 64 Athens, Greece. 301 94 20 091 Fax number:
++
262
263
1. Introduction
The time-independent Schrodinger equation is one of the basic equations in quantum mechanics. Plenty of methods have been developed for the solution of the one-dimensional time-independent Schrodinger equation. In the literature the two-dimensional problem is treated by means of discretization of both variables x and y. Then the problem is transformed into an algebraic eigenvalue problem of a block tridiagonal matrix. Here we use partial discretization only on the variable y, then we have an ordinary differential equation problem. We apply to this problem exponential-fitting methods developed in the literature ([2-61). All methods are applied in order to find the eigenvalues of the two-dimensional harmonic oscillator and the two-dimensional Henon-Heils potential. 2. Partial discretization of the two-dimensional equation
The two-dimensional time-independent Schrodinger equation can be written in the form
$(x, 700)= 0,
-00
$(fool y)
-00
= 0,
< x < 00, < y < 00
where E is the energy eigenvalue, V(x, y) is the potential and $(x, y) the wave function. The wave functions $(x, y) assymptotically approaches infinity away from the origin. We consider $(x, y) for y in the finite interval [-Ry,Ryl and
$(x, -Rv) = 0 the boundary conditions. FRY 7 RYI
-Ry=Y-N,
Y-N+1,
and
$(x, 4 )= 0
We also consider partition of the interval ' " 7
Y-1,
YO, Y11
- . . , YN-1, YN=Ry
where yj+1 - y j = h = Ry/N. We approximate the partial derivative with respect to y with the difference quotient
and substitute into the original equation
264
where
then equation ( 1 ) can be written as d2*
- = -S(x)*(x) dX2
where S ( x ) is a (2N - 1 ) x (2N - 1 ) matrix
B(xc,Y-N+1) l/h2 l/h2 B ( x ,Y-N+2)
l/h2
S(x)=
B(x,YN-z) l/h2 l/h2 B ( x ,Y N - 1 ) The matrix S ( x ) can be written in terms of three matrices the identity matrix I, the diagonal matrix V which contains the potential at the mesh points Y - N + 1 , . .. ,Y N - 1 and the tridiagonal matrix M with diagonal elements -2 and off diagonal elements 1. l/h2
S ( X )= 2EI
-~
1 V ( X -A4 ) h2
+
3. Application of exponential-fitting methods Now we consider x in the interval [-R,, R,] with boundary conditions
* ( - R x ) = 0,
!P(R,) = 0.
For convenience we consider R, = R, = R,we also take a partition of the above interval of length N
-R, = X - N ,
X-N+1,
. . . , 2-1,
XO, X I ,
.. ., X N - ~ , X N
= Rx
then the step size is as before xn+l - xn = h = R / N . First we apply a modification of Numerov’s method developed by Raptis and Allison [3] as implemented in Vanden Berghe et. al. [5] and [6]. The modified Numerov method is $n+l-
2+n
+ +n-1
= h2 ( h f n + l + h f n
+hfn-1)
265
with
+
ewh 1 eZwh -- 2 and bo = (1 - ewh)2 (1 - ewh)2 w2h2'
1 bl=-w2h2
w E %.
This method is exact for 1, z,
2,z3,e x p ( f w z )
For w = i k imaginary we have the following coefficients 1 1 2 - cos(kh) -bl = and bo = 2 - 2 cos (kh) (kh)' (kh)' 1 -cos(kh)'
k
3.
The right hand side function f is in our case f(z,Q) = -S(z)Q. We apply the method to equation ( 2 )
+ Qn-l = -h2 (blS(z,+l)Qn+l + boS(z,)Qn + b 1 V - l ) (3) each Q n for n = - N + 1 , . . . ,0 , . . . ,N - 1 is the k = (2N - 1) length vector Q(z) evaluated at z, and S ( z n ) for n = - N + 1,.. . ,O,. . . , N - 1 is a Qn+'
- 2Qn
(2N - 1) x (2N - 1) matrix. Now let the 2 = k2 = (2N - 1)2 length vector Q
=
(*-N+l
7
Q,--N+Z
,. .. ,QO,.
.. , Q N - Z , Q N - y
Also consider the matrices A and B block tridiagonal matrix of size I x 1, each block is a diagonal matrix of size k x k. Matrix A has diagonal blocks -2 I and off diagonal blocks I . Matrix B has diagonal blocks boI and off diagonal blocks b l l . The off diagonal blocks for both A and B are the identity matrix I . Also consider the block diagonal matrix C with diagonal blocks M , and the diagonal matrix V with diagonal blocks V ( z - ~ + 1to) v(zN-1).
We rewrite (3) in matrix form
AQ = -2h'BEQ
+ 2h2BVQ - CBQ
or
( P + Eh2Q)Q = 0 where
P =A
-
2h2BVQ+ C B Q ,
and
Q = 2B
We also apply a P-stable exponential-fitting method [2] G7l
+
+ fn-l) + bo.f, + h f n - 1 )
= Yn - bh2(f,+l - 2 f n
Y ~ +-I 2 % ~ yn-1 = h2(blf,+l
266
where
bo =
8 - sew'
+ 4wh + 4whewh- ( 2 . l ~+) ~(wh)2ewh 2(wh)2(-1+ e w h )
+
+ 8ewh- 4wh - 4whewh- (
~ h )(wh)2ewh ~ 4 ( ~ h ) ~ ( - ewh) 1+ ( 2 - 2ewh wh + whewh)2 b= 2 ( ~ h ) ~ ( - ewh)(8 1+ - 8eWh+ 4wh + 4whewh- ( ~ h+ )(wh)2ewh) ~
bl =
-8
+
This method is exact for
1 , x, x 2 , x3,exp(fwz) and also P-stable. For w = ilc the corresponding trigonometrical-fitting method is derived. Substitution into equation ( 2 ) gives the following generalized eigenvalue problem
(P
+EPQ
-
EWR)Q=O
where
P = A - 2h2BV + C B
Q = 2B
+ b boh2 ( D A - 2VCA - 2CAV) + 4b boh4VAV,
+ 4bboCA - bboh2(AV + V A ) ,
and
R
= -4bbo.4
and D is a block diagonal matrix with each block equal to A2. 4. Numerical Results
We applied both numerical methods developed above to the calculation of the eigenvalues of the two-dimensional harmonic oscillator and the twodimensional Henon-Heiles potential. Results are compared with those produced using the full discretization technique. 4.1. Two-dimensional harmonic oscillator
The potential of the two-dimensional harmonic oscillator is
1 V ( x , y )= - (x2 2 The exact eigenvalues are given by
+ y2)
267
4.2. Two-dimensional Henon-Heiles potential
The Henon-Heiles potential is 1 V(Z,Y) = -(x2 2
+ y2) + (0.0125)”2
References 1. Davis M. J., Heller E. J., Semiclassical Gaussian basis set method for molecular vibration wave function, Journal of Chemical Physics 71 5356-5364( 1982). 2. Kalogiratou Z., Simos T.E., A P-stable Exponentially-fitted method for the numerical integration of the Schrodinger Equation, Applied Mathematics and Computation, 12 99-112(2000). 3. Raptis A., Allison A.C., Exponential-fitting methods for the numerical solution of the Schrodinger Equation, Computer Physics Communications, 14 1-5(1978). 4. imos T.E., Exponential fitted methods for the numerical integration of the Schrodinger euqtion, Computer Physics Communications 71 32-38( 1992). 5. Vanden Berghe G., De Meyer H., Vanthournout J., A modified Numerov integration method for second order periodic initial value problems, International Journal of Computer Mathematics, 32 233-242( 1990). 6. Vanden Berghe G., De Meyer H., A modified Numerov method for higher Sturm-Liouville eigenvalues, International Journal of Computer MathematZCS, 37 63-77( 1990).
PROBABILISTIC NEURAL NETWORK VERSUS CUBIC LEASTSQUARES MINIMUM-DISTANCE IN CLASSIFYING EEG SIGNALS I. KALATZIS, N. PILIOURAS, E. VENTOURAS AND I. KANDARAKIS Department of Medical Instrumentation Technology, Technological Educational Institution of Athens, Ag. Spyridonos Street, Egaleo GR-122 10, Athens, Greece..
C. C. PAPAGEORGIOU AND A. D. RABAVILAS Psychophysiology Laboratory, Eginition Hospital, Department of Psychiatry, Medical School, University of Athens, Greece. D. CAVOURAS‘ Department of Medical Instrumentation Technology, Technological Educational Institution of Athens, Ag. Spyridonos Street, Egaleo GR-I 22 10, Athens, Greece.. E-mail:
[email protected] The purpose of the present study is the implementation of a classification system for differentiating healthy subjects from patients with depression. Twenty-five depressive patients and an equal number of gender and agedmatched normal controls were evaluated using a computerized version of the digit span Wechsler test. Morphological waveform features were extracted from the digitized Event-Related Potential (ERP) signals, recorded from 15 scalp electrodes. The feature extraction process focused on the P600 component of the ERPs. The designed system comprised two classifiers, the probabilistic neural network (PNN) and the cubic least-squares (CLS) minimum-distance,two routines for feature reduction and feature selection, and an overall system evaluation routine, consisting of the exhaustive search and the leave-one-out methods. Highest classification accuracies achieved were 96% for the PNN and 94% for the CLS, using the ‘latency/amplituderatio’ and ‘peak-to-peak slope’ two-feature combination. In conclusion, employing computer-based pattern recognition techniques with features not easily evaluated by the clinician, patients with depression could be distinguished from healthy subjects with high accuracy.
1.
Introduction
Event-Related Potentials (ERPs) are electro-encephalographic (EEG) potentials recorded when a subject is presented with external stimuli’. Research efforts focus on investigating the significance of local maxima and minima of the ERP ‘ Please address corresuondence: Prof. D. Cavouras, Ph.D. , Dept of Med Inst. TEI of Athens, Tel: (+30) 210-5385-375 (work) -Fax: (+30) 210-5910-975 (work), E-mail: cavourask2teiath.m
268
269 waveform, called components. The P600 component, elicited between 500 and 800 msec after stimuli presentation, indexes second-pass parsing mechanisms of information processing and has gained wide research interest in recent years2. ERP classification systems have been proposed, providing various degrees of accuracy, for supporting the diagnostic procedure, concerning both neurological and psychiatric disorders. As in other types of biosignals, a large part of the research has been devoted to the selection of the optimum features to be extracted from the signal3. In the present study we have developed a computer-based signal analysis software system for the automatic discrimination of ERP signals of patients with depression versus healthy controls. The aim was to analyze the ERP waveform in the time interval from 500 to SOOmsec, in order to gain insight into probable differences between depressives and non-depressives, related to the P600 component. The ERP signals were analyzed by means of features related to morphological characteristics of the signal. A comparative analysis was performed concerning two different classification methods, the probabilistic neural network (PNN)4and the cubic least squares (CLS)’ classifiers. 2.
Material and Methods
Twenty-five patients with depression and an equal number of gender and agedmatched healthy controls were examined. All participants had no history of any neurological or hearing problems. The subjects were evaluated by a computerized version of the digit span Wechsler test, as detailed in a previous work6. EEG activity was recorded from 15 scalp electrodes based on the International 10-20 system of Electroencephalography7,referred to both earlobes (abductions at Fpl, Fp2, F3, F4, C3, C4, (C3-T5)/2, (C4-T6)/2, P3, P 4 , 0 1 , 0 2 , Pz, Cz, andFz). Using a dedicated software program, developed in C++, the following features were automatically calculated: The P600 component’s latency (trar), the amplitude),s( of the signal at (trol),the latency/amplitude ratio (trot/,),s and the peak-to-peak slope, which was calculated by the following relation:
where,,s and sminare the maximum and minimum signal values present in the investigated time interval, occurring at tmmand tminrespectively. The PNN and the CLS classifiers were tested, based on the features extracted from the ERP waveform. The accuracy of classification of the two subjects’ groups was evaluated by employing, at each abduction separately, all
270
features in combinations of 2, 3 and 4. The performance of the classification system was evaluated using the leave-one-out method.
3. Results and discussion The identification of differences in the characteristics of ERP signals between a specific patient group and healthy controls may provide valuable support in the diagnostic process, especially in cases where widely accepted clinical criteria do not lead to a definite conclusion. Figure 1 shows a scatter diagram of two commonly employed by Psychiatrists features in assessing ERPs, demonstrating, due to overlapping, the lack of apparent differences between the two groups of subjects. However, when latencylamplitude ratio and peak-to-peak slope were used, the two groups seemed to be well separable (Fig. 2).
0 COntmlS A Depressives
A
=
o
0 Contmls A Depressives
- Decision boundaq
A
A*
1.
-2-
.I5 -
A 4.5
-1
-0.5
0
0.5
I
”
&
A0 1.5
2
Latency
Figure 1. ‘Latency’vs. ‘Amplitude’ scatter diagram at Fpl abduction.
6
4
.
2
0
2
4
Latencylarnvllude falio
Figure 2: ‘Latency/Amplitude’vs. ‘Peak-to-peakslope’ diagram at Fpl abduction,with CLS decision boundaries imposed.
Tables 1 and 2 show classification accuracies achieved by the PNN and the CLS classifiers respectively, employing the leave-one-out method. The PNN classifier, probably due to its higher complexity, performed better. However, taking into consideration the simplicity and computational speed of the CLS, its performance may be comparable to that of the PNN. In conclusion, employing features generated from the P600 component waveform, which are not easily evaluated by visual inspection, depressive
271 patients could be clustered further away from the normal controls. Moreover, applying effective classifiers we managed to discriminate between the two groups of subjects with high accuracy. Table 1. PNN classification results using the 'latency/amplitude ratio' and 'peak-to-peak slope' features at Fpl abduction.
Subjects
PNN classification Depressive Controls s 1 24 24 1
Accuracy
96% Controls 96% Depressives 96% Overall accuracy Table 2. CLS classification results using the 'latency/amplitude ratio' and 'peak-to-peak slope' features at Fpl abduction. Subjects Controls Depressives Overall accuracy
CLS classification Depressive Controls s 1 24 23 2
Accuracy
96% 92% 94%
References 1. M. Fabiani, R. Johnson Jr. (Ed.), Event-Related Brain Potentials and Cognition, Handbook of Neuropsychology, Amsterdam, The Netherlands: Elsevier, 1995. 2. Papageorgiou, C., Liappas, I., Asvestas, P., Vasios, C., Matsopoulos, O.K., Nikolaou, C., Nikita, K.S., Uzunoglu, N., Rabavilas, A., 200l.Neuroreport 13, 1773-1778. 3. C. Vasios, Ch.Papageorgiou, G.K.Matsopoulos, K.S.Nikita and N.Uzunoglu, German Journal of Psychiatry 5, 2002, pp. 78-84. 4. D. F. Specht, "Probabilistic Neural Networks," Neural Networks 3, 1990, pp. 109-118. 5. S. Theodoridis and K. Koutroumbas, System evaluation, in Pattern Recognition, Academic Press, 342 (1998). 6. C. Papageorgiou, E. Ventouras, N. Uzunoglu, A. Rabavilas and C. Stefanis, Neuropsychobiology 46, 2002 pp. 70-75. 7. H. Jasper, "The Ten-Twenty Electrode System of the International Federation", Electroenceph Clin. Neurophysiol 10, 1958, pp. 371-375.
PROBABILISTIC NEURAL NETWORK CLASSIFIER VERSUS MULTILAYER PERCEPTRON CLASSIFIER IN DISCRIMINATING BRAIN SPECT IMAGES OF PATIENTS WITH DIABETES FROM NORMAL CONTROLS I. KALATZIS AND N. PILIOURAS Department of Medical Instrumentation Technology, Technological Educational Institution of Athens, Ag. Spyridonos Street, Egaleo GR-122 10, Athens, Greece.. D. PAPPAS Department of Nuclear Medicine, 251 General Airforce Hospital, Athens, Greece.
E. VENTOURAS AND D. CAVOURAS* Department of Medical Instrumentation Technology, Technological Educational Institution of Athens, Ag. Spyridonos Street, Egaleo GR-122 10, Athens, Greece.. E-mail:
[email protected] The aim of this study was to compare the performance of the probabilistic neural network (PNN) classifier with the multilayer perceptron (MLP) classifier, in an attempt to discriminate between patients with diabetes mellitus type I1 (DMII) and normal subjects using medical images from brain single photon emission computed tomography (SPECT). Features from the gray-level histogram and the spatial-dependence matrix were generated from imagesamples collected from brain SPECT images of diabetic patients and healthy volunteers, and they were used as input to the PNN and the MLP classifiers. Highest accuracies were 99.5% for the MLP and 99% for the PNN and they were achieved in the left inferior parietal lobule, employing the mean value and correlation features. Our findings show that the MLP classifier outperformed slightly the PNN classifier in almost all cerebral regions, but the lower computational time of the PNN makes him a very useful classification tool. The high precision of both classifiers indicate significant differences in radiopharmaceutical (99mTc-ECD) uptake of diabetic patients compared to the normal controls, which may be due to cerebral blood flow disruption in patients with DMII.
*
Please address corresuondence: Prof. D. Cavouras, Ph.D., Dept of Med Inst. TEI of Athens, Tel:
(+30) 210-5385-375 (work) -Fax: (+30) 210-5910-975 (work), E-mail: cavowas@,teiath.m.
272
213 1. Introduction
The probabilistic neural network (PNN)’ is a software classifier with simple structure and low computational power requirements, in contrast to the backpropagation mukiplayer perceptron (MLP)’ classifier. These two algorithms were employed to analyze brain single photon emission computed tomography (SPECT) images of patients with diabetes mellitus (DM) type 11. Diabetes mellitus (DM) often results in brain micro-blood flow disorders that may cause cerebral infarction3. Previous studies by brain SPECp,5 have reported differences in the radio-pharmaceutical uptake between DM patients and normal subjects, while other studies6 have reported the opposite. In the present work we investigate whether PNN and MLP classifiers can distinguish between brain SPECT images of patients with DM type 11 and healthy subjects, by analyzing the count distribution of SPECT images by means of features derived from the gray level histogram and the gray level co-occurrence matrix’. 2.
Material and methods
A large number of samples (4028) collected from regions of interest (ROIs) from brain SPECT images of diabetic patients and 4081 samples collected from ROIs fiom brain SPECT images of healthy volunteers were analyzed using pattern recognition techniques. Features evaluating radio-pharmaceutical (99mTc-ECD) uptake distribution were extracted by means of the first-orderstatistics (derived from the image-sample gray-level distribution) and secondorder statistics (computed from the co-occurrence matrix).’ Classification between the two groups was performed for each ROI separately, employing the PNN and MLP software classifiers. The accuracy of classification was evaluated exhaustively by combining features in all possible ways, using the PNN and the leave-one-out method. Best feature combinations achieved by the PNN were used in the design of the MLP classifier. 3. Results and discussion
Highest classification accuracy with the minimum number of features was achieved for the feature combination “mean value - correlation” (see Fig. I), in the left inferior parietal lobule cerebral area. Highest classification accuracies of the PNN and MLP classifiers are presented in tables 1 and 2 respectively.
274 When comparing these results (Tables 1 and 2), it is evident that the two classifiers have similar classification capabilities. However, the significantly lower computational training time of the PNN classifier in comparison to the MLP, in combination with the accuracy achieved, make the PNN classifier a very powerful classification tool. Table 1. Truth table demonstrating PNN classification of image-samples corresponding to the left inferior parietal lobule of diabetics and non-diabetics using the best feature combination (mean value - correlation). PNN classijkation Subjects
Diabetics
68 1
Diabetics Non-diabetics Overall accuracy
diabetics 1 122
Accuracy
98.6% 99.2% 99.0%
Table 2. Truth table demonstrating MLP classification of image-samples corresponding to the left inferior parietal lobule of diabetics and non-diabetics using the best feature combination (mean value - correlation).
1 1 I
Subjects
Diabetics Non-diabetics Overall accura
I dT: 1
MLP classification
D i a r
A;;
98.6% 99.5%
1
275
2.5 2 1.5
1
B 0.5 0
V
0 -0.5
-I -1.5 -2.5
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
Mean
Figure 1. ‘Mean value - Correlation’ plots (normalized values) and decision boundary drawn by the MLP classifier, for the ROI corresponding to the left inferior parietal lobule. The above results were obtained using for the PNN smoothing parameter a=0.4 and for the MLP employing a two hidden-layer four layer-node structure. In the rest of the ROIs, higher overall accuracies in correctly discriminating between “diabetics” and “non-diabetics” image-samples varied between 84% and 99.5% using the MLP and 86% and 99% using the PNN. Our findings indicate the existence of differences in cerebral blood flow between patients with DMII and normal subjects. These differences were found by evaluating count-distribution features from brain SPECT images, such as correlation, that are not easily discernable by visual inspection. Additionally, the employment of image analysis techniques based on classifiers with high discriminatory ability, such as the neural networks, may assist in distinguishing “diabetics” from “non-diabetics” with high accuracy. The usefulness of the present system is its ability to semi-automatically segment ROIs on brain SPECT images and then provide physicians with a second opinion. Its final testing ground will be the clinical environment, where new cases of DMII will be presented.
References 1. D. F. Specht, Neural Networks, 3, 109 (1990).
276 2. 3. 4. 5.
6. 7.
A. Khotanzad and J.H. Lu, IEEE Trans. Acoust. Speech & Signal Proc., 38, 1028 (1990). W.B. Kannel and D.L. McGee, J. Am. Med. Assoc., 241,2035 (1979). J. F. Jimenez-Bonilla, J. M. Carril, R. Quirce, R. Gomez-Barquin, J. A. Amado and C. Gutierrez-Mendiguchia, Nucl Med Commun, 17,790 (1996). R. Quirce, J. M. Carril, J. F. Jimenez-Bonilla, J. A. Amado, C. GutierrezMendiguchia, I. Banzo, I. Blanco, I. Uriarte and A. Montero, Eur. J. Nucl. Med., 24, 1507 (1997). 0. Sabri, D. Hellwig, M. Schreckenberger, R. Schneider, H. J. Kaiser, G. Wagenknecht, M. Mull and U. Buell, Nucl. Med. Commun., 21, 19 (2000). R. M. Haralick, K. Shanmugam and I’H. Dinstein, Textural Features for Image Classification. IEEE Trans. Sys. Man. Cyb., 6,610 (1973).
PARALLEL MOLECULAR DYNAMICS SIMULATION OF LENNARDJONES LIQUIDS ON A SMALL BEOWULF CLUSTER T. E. KARAKASIDIS, A. B. LIAKOPOULOS Department of Civil Engineering, School of Engineering University of Thessaly, Pedion Areos, 38834 Volos, Greece N. S. CHOLEVAS Department of Mechanical &Industrial Engineering, School of Engineering University of Thessaly, Pedion Areos, 38834 Volos, Greece
In the present paper we present results concerning the performance of a parallel Molecular Dynamics simulation of a Lennard-Jones liquid on a PC cluster consisting of four Pentium 111 processors running under Linux and using the MPI protocol. The methodology for the parallelization was the atom decomposition method in which each processor is assigned to deal with a given group of atoms. We examined the program performance for system sizes 10' to lo5 atoms and number of processors varying from 1 to 4. The influence of the communication methods between processors was also examined. It was found that even such a small cluster can be a very useful and cost-effective solution for the realization of MD simulations of small Lennard-Jones liquid systems for real times up to lps within a reasonable computation time.
1.
Introduction
Molecular Dynamics (MD) is a well established simulation method based on an atomic description of matter. MD is well-suited for the study of transport and structural properties as it can probe microscopic mechanisms not easily accessible by experiment and thus it is widely used in several areas of physics and materials science. Classical MD provides us with a set of atomic trajectories and allows the calculation of system's properties through the formalism of statistical mechanics [ 11. However, even with present day computing power, MD is quite demanding in computer time if one attempts to simulate either very large systems or/and very long real times [2-31. The introduction of parallel computers initiated a great deal of research towards the development of parallel codes for computationally intensive problems. However the high cost of parallel computers had limited the availability of such machines to major and well-funded research centers. The emergence of Linux and the Beowulf project allowed the introduction of lowcost high-performance parallel computation to small research groups while many major labs created and operate vast such clusters. In the present work we 277
278 deal with the building of a small Beowulf cluster and the necessary transformation of a serial MD code used for the simulation of liquids. 2.
Molecular Dynamics and Computational Details
In Molecular Dynamics a system consists of N atoms interacting through Newton’s second law and an appropriate interaction potential. In the case of pair interactions, the equations of motion of the particles are given by :
The most time consuming part of an MD program is the calculation of forces. Forces are obtained by calculating the interactions of a particle i with all particlesj which lie within the cut-off range r, of the potential. This calculation can be accelerated if one does not check at each step if all atomsj lie within r, but constructs for each particle i the so called neighbor list [4]. Distribution of the computational load over the available processors can be accomplished by atom decomposition (AD), space decomposition or force decomposition [2, 51. Given that AD seems to be quite efficient for small computer clusters [2] and in an attempt to maintain the readability of our code we have chosen to use the AD method. In this approach each of the P processors is assigned N/P particles to deal with which assures a good loadbalancing. The computation proceeds as following. At the first step the positions of all particles are communicated between processors so each processor can calculate the forces due to the neighbors. Integration of equation (1) results in the new particle positions which are then communicated to all processors and the procedure is repeated. The communication of the positions was performed in two ways. In the first one we used the blocking SEND and RECEIVE commands of MPI [6,7]. In this case the processor sends the desired information to the other processors and the program does not advance until the content is received by the destination processor. In the second realization we used the BROADCAST command [6,7], which is a non-blocking command. In this case the information is sent by one processor to all other processors. The Beowulf cluster consisted of four PCs each with Pentium I11 processor at 733h4Hz, 128Mb RAM and lOOMbs network card. They were connected through a lOOMbs switch. On the software side, the Suse 6.4 distribution of Linux, the MPICH v. 1.2.1 of Argonne National Laboratory implementation of MPI and the GNU Fortran compiler version 2.95.2 were used.
279
3.
Physical System
In this test case we examined a physical problem already studied by other researchers (see [2, 8-91). We simulated a liquid for which interactions between atoms are described by a Lennard-Jones potential :
where E and r~ are constants, and r denotes the interatomic distance. The simulations were performed in the microcanonical (NVE) ensemble. Forces were truncated at a distance rc=2.5a. The equations of motion were integrated s. and the update frequency for using the Verlet algorithm with a timestep the neighbor list was10 steps. The system was simulated at a reduced density p*=0.8442 and temperature T*=0.72. Simulations were performed using a cubic box with periodic boundary conditions and the size of the systems studied ranged from N=256 to N=108,000 atoms. Positions and velocities were saved on disk every 10 steps for post-processing reasons. 4.
Results and Discussion
We have performed lo4 timestep runs for all size systems except for the last two large systems where only lo2 steps calculations were carried out and the speed-up of the program was calculated. The corresponding results are presented in table 1. The results show that the usage of the BROADCAST command results in slightly faster calculations than that of the SEND and RECEIVE commands both in the case of two processors (P=2) and four processors (P=4). This is something one expects as the BROADCAST is a nonblocking command. It is of interest to note that the value of speed-up is close to the number of processors employed in the case of P=2 for lower system sizes than in the case of P==4 and that the timings themselves are quite close for the case of the two smallest system studied. In both cases (P=2 and P=4) speed-up increases, as the system size gets larger. We attribute this behavior to the fact that part of the execution time is spent in communicating the necessary data between processors. When the size of the system is small, the calculations of new particle positions are performed quickly on each processor and the communication of the information among processors is being done quite frequently. As the size of the system increases, more time is spent by each processor on force calculations than on communication of the new data. However, for a very large size system the computation time per timestep increases in a significant way and renders impractical long simulation runs. On the other hand, if one deals with small
systems (e.g.N=256) he can manage to do calculations for of lps real time (lo8steps) within a week using the cluster described in section 2.3. This is particularly interesting for small research groups dealing with model liquids (especially when studying phenomena with relatively high characteristic times), as the required computer system can be built based on rather inexpensive top-ofthe-self components. Table 1 CPU Seconds/Timestep for various size systems and number of processors (P) for the two communication methods used a) SEND and RECEIVE (SR) b) BROADCAST (BC). In all case the speed-up (Sp) with reference to the serial code is given too. System Size atoms
P=l t
256
0.0120
0.0087
1.37
0.0078
1.54
0.0082
1.47
0.0059
2.03
500
0.0281
0.0162
1.74
0.0158
1.I7
0.01 17
2.41
0.01 16
2.43
0.0228
2.74
0.0208
3.00
864
0.0624
0.035 1
1.78
0.033 1
1.89
1,372
0.1138
0.0651
1.74
0.0613
1.86
0.0400
2.85
0.0372
3.06
2,048
0.2242
0.1322
1.70
0.1269
1.77
0.0803
2.79
0.0739
3.03
2,9 16
0.3978
0.2160
I .82
0.1993
1.99
0.1230
3.23
0.1192
3.34
4,000
0.6441
0.3464
1.86
0.3325
1.94
0.1987
3.24
0.1888
3.41
32,000
40.194
20.509
1.95
20.467
1.96
10.611
3.78
10.456
3.84
223.945
1.99
222.734
2.00
112.567
3.95
111.795
3.98
108,00( 445.137
Acknowledgments
Partial support from the University of Thessaly Research Committee under grant RC2498 is acknowledged.
References [I] [2] [3] [4] [5] [6] [7] [8] [9]
M.P. Allen, T.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, (1987). S. Plimpton, J. Comp. Phys. 117, 1 (1995). D. M. Ceperley, Rev. Modern Phys. 71, S438 (1999). L.Verlet, Phys. Rev. 159,98 (1976). W.Smith, Comp. Phys. Comm. 62,229 (1991). MPI Forum. International Journal of Supercomputing Application 8, 165 (1 994). http://www-unix.mcs.anl.gov/mpi/ D.Brown, H.R. Clarke, M. Okuda and T. Yamazaki, Comp. Phys. Comm. 74 67 (1993). K.Esselink, B.Smit and P.A.Hilbers, Comp. Phys. Comm.55 269 (1989).
VIBRATIONAL PROPERTIES OF NiO(110) SURFACE BY MOLECULAR DYNAMICS SIMULATION T. E. KARAKASIDIS Department of Civil Engineering School of Engineering University of Thessaly Pedion Areos, 38834 Volos, Greece
[email protected]
Using molecular dynamics and a rigid ion potential we studied the vibrational properties of the NiO(110) surface. The simulations were carried out at T=300K and we calculated the phonon density of states (DOS) for the anion and cation sublattice as a function of the distance form the surface along the three directions parallel and normal to the surface. We discuss how the bulk DOS is altered as a function of the distance from the surface.
1. Introduction The structure and the dynamical behavior of surfaces play an important role in phenomena like sintering, grain growth, oxidation, surface roughening etc. Atomistic simulations are well suited for such studies since they provide insight into the microscopic description of the phenomena but also a measurement of macroscopic properties. Such methods include lattice statics, lattice dynamics and molecular dynamics (MD). In all cases matter is treated as atoms interacting through an appropriate potential function. Lattice statics and lattice dynamics do not take into account temperature effects explicitly. Molecular dynamics can account explicitly for temperature effects and can provide information on time dependent properties so that it is better suited for the study of vibrational properties. The choice of the material (NiO) and the surface (1 10) is due to the fact that NiO has a relatively simple structure (fcc) and there is earlier work showing the stability of the given surface [l-21. In addition, NiO can serve, although with caution, as a model system for the description of the general behavior of rocksalt-structured ionic materials.
28 1
282 2.
Molecular Dynamics and Computational Details
The simulations were carried out using slab geometry where the simulation cell is partially filled and includes a region of vacuum. The simulation box containjng the slab is a parallelepiped with edges parallel to the x:[liO], y: [OO 1] and z:[ 1101 directions. The slab consists of 2000 ions arranged on 20 planes along the [110] direction, each plane containing 50 cations and 50 anions. In the [ 1101 direction there is an empty space 4 times the dimension of the slab. The use of periodic boundary conditions results in a system with two free (1 10) surfaces perpendicular to the [ 1101 direction on opposite sides of a slab of infinite extent in [liO] and [ O O i ] directions. Simulations were carried out in the constant temperature canonical ensemble using the Nose scheme [3]. The equations of motion were integrated using the Verlet algorithm and a time step of lO-’’s. For the atomic interactions we adopted a rigid ion potential developed for NiO [4] that has already been used in simulations of grain boundaries [5-71 and surfaces [8-91 of NiO. The Coulombic contributionshave been evaluated with the use of the Ewald method [lo]. The phonon DOS were obtained .from the Fourier transform of the velocity autocorrelation function.
where < > indicate an average on time performed over particles i located on the corresponding region (in our case bulk or grain boundary region), and z is the time increment and N is the number of ions. The results were obtained from simulations of a duration of 10 ps where velocities where saved every 5 timesteps.
3. Results and Discussion In figures l a and l b we present the bulk phonon density of states calculated at T=300K for the cationic and anionic sublattice respectively. Results are presented for (1 1 9 planes as function of the distance from the surface along the x: [l i01,y:[oo 13 and z:[ 1 101 directions. We remind here that the x and y directions are parallel to the surface plane while z direction is normal to it.
283
-02
0
5
10
15
20
frequency (THz)
25
30
5
10
15
20
25
30
frequency (THz)
Figure 1. Phonon density of states (DOS) for the bulk and for (110) planes as a function of the distance from the surface for the directions x (-), y(- - -), zC -) (a) cationic sublattice (b) anionic sublattice.
We can see from figurela that the cationic sub-lattice is responsible for the low energy phonon modes that are essentially situated around 5 THz and 9 THz for the bulk. As we can see in the first plane (ions on the surface) we observe an enhancement of low frequencies in the x direction, along with an important peak around 3 THz in the y direction both indicating a much looser coupling than the bulk ions, and in the z direction (normal to the surface) the high frequency peak becomes lower indicating a looser coupling of the ions too. In the second plane we observe in the x direction an important peak at 12 THz indicating much stronger coupling of the cations than those of the bulk, y direction resembles quite to that of the bulk while in the direction z we have an important peak at about 3 THz indicating a much looser coupling than bulk ions. In plane 3 the DOS resembles to that of the bulk for the x and y directions with slight differences along the z direction. For more distant planes 4, 5, 6 we observe practically bulk behavior. The anion sub-lattice is responsible for the high-energy phonon modes that are essentially situated around 9.2 THz and 20 THz for the bulk (figurelb). As we can see in the first plane (ions on the surface) we observe a shift towards higher frequencies in the x direction indicating stronger coupling while along the y direction we have an important peak around 7 THz indicating a much looser coupling than the bulk ions. In the z direction (normal to the surface) the high frequency peak becomes lower with an increase of frequencies around 14 THz. In the second plane we observe in the x direction a shift towards frequencies around 15 THz and 22 THz indicating stronger coupling of the
284 anions than those of the bulk, in the y direction we observe a shift towards high frequencies around 22THz indicating stronger coupling and in the direction z we have an enhancement of frequencies lower than 9 THz indicating a looser coupling. In plane 3 the DOS resembles to that of the bulk for the x and y directions with slight differences along the z direction. For more distant planes 4, 5 , 6 we observe practically bulk behavior, as in the case of the cation sublattice.
References P. M. Oliver, G.W. Watson, S.C. Parker, Phys.Rev. B 52 5323 (1995) M. Yan, S.P. Chen, T.E. Mitchell, D. H. Gay, S.Vyas, R.W. Grimes, Phil.Mag. A 72 121 (1995) S. Nose, J. Chem Phys. 81 5 11 (1984) C. Massobrio and M. Meyer, J. Phys. :Cond. Maffer3 279 (1 99 1) M. Meyer and C. Waldburger Mat. Sci. Forum 126-128 229 (1993) T.E. Karakasidis and M. Meyer, Phys Rev.B 55 13853 (1997) T. Karakasidis and M. Meyer, Modelling andSimulation in Materials Science and Engineering 8 117 (2000) [8] T.E. Karakasidis. and G. Evangelakis, S u 6 Science 436 193 (1999) [9] T.E. Karakasidis, D.G.Papageorgiou and G.A. Evangelakis, Appl. S u 6 Science 162-163 233
[l] [2] [3] [4] [5] [6] [7]
(2000) [lo] P.P. Ewald, Ann. der. Physik64 253 (1921)
ELECTRIC PROPERTIES OF SUBSTITUTED DIACETYLENES P. KARAMANIS AND G. MAROULIS' Department of Chemistry Universify of Patras, GR-26.500Patras, Greece E-mail: pkzlr(ii..chemisti-v. U U U ~ Y L Igi; S . i??aroulis(i~~upatrt~s. PI. A systematic study of the electric properties of substituted diacetylenes: H-C=C-C-C-X, -X = Li, Na, K, Al, Ga, In, F, CI, Br, I, CN, NC, CP, PC, C6H5,C4N, CSH4Nand N2B is presented. The electric properties that have been studied are the dipole moment (pa) the dipole polarizability (aa&,the first (Papy) and the second (yapyB)dipole hyperpolarizability. The calculations have been performed with ab initio methods (Msller-Plesset Perturbation theory, Coupled Cluster techniques) of high predictive capability and flexible basis sets especially designed for (hyper)polarizability calculations.
1. Introduction
The need of new NLO-materials which can be easily produced and modified for practical use in advanced technology applications (optical switching, four-wave mixing, optical storage etc) has directed the efforts of many authors towards the study of the electric (hyper)polarizability of various classes of organic compounds and polymers [ 1,4]. Amongst the variety of the organic molecules that have been investigated organic molecules with conjugated triple bonds have shown very promising properties such as large nonresonant third order susceptibilities and ultra fast optical response. Diacetylene is an ideal example of a two triple bond conjugated system. Its derivatives have attracted the attention of many workers in the field of nonlinear optics [4-81. Recently, a large nonlinear susceptibility has been reported for a pyrrole derivative of a conjugated diacetylene monomer solution in acetone by Paley et af [4].As the macroscopic linear and nonlinear susceptibilities are related to the microscopic (hyper)polarizabilities, a systematic ab initio study can lead to valuable information on the (hyper)polarizability substitution effect on the model X-C=C-C=C-H system. In this work five groups of mono-substituted diacetylene derivatives are being investigated: (a) F, C1, Br, I, (b) Li, Na, K, (c) Al, Ga, In, (d) NC, CP, PC (e) C6H5, C4N, C5H4N and N2B. Reference self-consistent field (SCF) values were calculated with very large, flexible basis sets. Electron correlation effects were accounted for via Moller-Plesset Perturbation theory and Coupled Cluster techniques.
To whom all correspondence should be addressed.
285
286 2.Theory and Computational details
The energy of an uncharged molecule interacting with a weak homogeneous static electric field can be written as: W F , ) = En- paFa- (1 2bapFaFP- (1 1 @)Pap,FaFpFr - (1 I 24)yaPY8Fa FpF, F,
+ ...
(1)
where Fa is the field, ?I is the energy of the free molecule and a@, Basu,y + ~ are the dipole (hyper)polarizabilities. The subscripts denote Cartesian components and the repeated subscript implies summation over x, y and z. Molecular geometries were optimized at the MP2(full) level using D95** (5d 7 f ) basis sets for the (F, C1, Al, NC, CP, PC, C6H5, C4N, N2B)-C=C-C=C-H, 6-31 1G** (5d 7f) basis set for (Li, Na, K)-C=C-C=C-H and 3-21G** (5d 7f) for (Ga, In)C=C-C=C-H. The geometry optimization of (Br, I)-C=C-CS-H was based on D95** (5d 7 0 basis set for carbon and hydrogen, a (13slOp4d)/[4~3pld]was used for Br and (16sl3p7d)/[5~4p2d][lo] for I. The produced geometries are stationary points characterized by computations of the fundamental vibrational frequencies. All (hyper)polarizability calculations were performed using especially designed basis sets of various types for each molecule and all property components were obtained in the finite field approach. The GAUSSIAN 94 and GAUSSIAN 98 programs were used for the calculations. All electric property values are in atomic units. 3. Results
Table 1 contains near-Hartree-Fock values and CCSD(T) corrections of the dipole moment and the mean values of the second (hyper)polarizability for the mono-halogenated derivatives. MP2/3-2 lG* geometries, Mulliken charges of (Ga, In)-C=C-C=C-H (obtained through the MP2 density) and their longitudinal MP2 first hyperpolarizability tensor are shown in Figure 1. Figure 2 shows the HOMO and LUMO (using the RHF/D95 density) and MP2 values of the dipole moment and mean polarizability of C6H5,C4N, C5H4Ndiacetylene derivatives. Finally, two sample contour plots of the interaction energy between an electron and CP-C=C-CK-H, PC-C=C-C=C-H are shown in Fig. 3. These were obtained using reference SCF values of their molecular electric properties.
287 Table 1. Reference SCF values of the dipole moment and the mean values of the second (hyper)polarizability and CCSD(T) corrections in parenthesis. M X a Y
F
0.3375 (-0.1195)
CI
0.1643 (-0.0848)
Br
I
0.1310 '-0.1088) 0.0304 (-0.0819)
-U.J6 ttt.14
49.3 (-0.15) 66.81 (-0.05) 74.62
9632 (2408) 15307 (50Z8)
20072 (7665) 28359
(0.93)
88.99 (1.29)
-&.I6 4 . 1 6
(10910)
4
U.25
Figure 1. MP2/3-21G* geometries, Mulliken charges and longitudinal first hyperpolarizability tensor of (Ga, In)-C-C-C=C-H.
Figure 2. HOMO and LUMO representation and MP2 values of the dipole moment and mean polarizability for (C6H5, C4N, C5H4N)-C=C-C=C-H.
288
Figure 3. Interaction energy between an electron and the linear molecules
PC-C=C-C=C-H andCP-C=C-C=C-H.
References 1 . Introduction to Nonlinear Optical Effects in Molecules and Polymers; P.N.Prasad, D.J.Williams, Eds.; John Wiley and Sons, Inc.: New York, 1991. 2. S.R.Marder, J.E.Sohn (Eds), Materials for Nonlinear Optics: Chemical Perspectives, ACS Symposium Series 455 (ACS, Washington DC, 1991) 3. D.C.Hanna, M.A.Yuratich, and D.Cotter, Nonlinear Optics of Free Atoms and Molecules (Springer, Berlin, 1979) 4. R.A.Hann, D.Bloor, Organic Materialsfor nonlinear optics II, Royal society of Chemistry, Cambridge, 1993. 5. M. S. Paley, D. 0. Frazier, J H. Abdeldeyem, J S. Armstrong, S. P. McManus J. Am. Chem. SOC.117,4775 (1995) 6. M. S. Paley, D. 0. Frazier, H. Abdeldayem, Chem. Muter. 6,2213 (1994) 7 . A. V. V. Nampoothiri, P. N. Puntambekar, B. P. Singh, R. Sachdeva, A. Sarkar, Dipti Saha, A. N. Suresh, and S. S. Talwar, J. Chem. Phys. 109,2 (1 998). 8. J. Waite, M. G. Papadopoulos, J. Chem. SOC. Faraday Trans. 2, 81,433 (1985). 9. R. I. Kaiser Chem. Rev. 102,1309 (2002) 10. J. Andzelm, M. Klobukowski, E. Radzio-Andzelm, J. Comput. Chem. 5, 146 (1984).
MOLECULAR STRUCTURE AND ELECTRIC POLARIZABILITY IN SODIUM CHLORIDE CLUSTERS N.KARATSISAND G. MAROULIS+ Department of Chemistry, University of Patras GR-26.500 Patras, Greece E-mail:
[email protected]. riknr@chernistn'.ii~atrns.nr
We report an investigation of the electric polarizability of sodium chloride clusters. Relying on conventional ab initio and density functional theory methods we find that the mean (per atom) dipole polarizability reduces strongly with cluster size.
The properties of alkali halides have attracted particular theoretical [ 1,2] and experimental [3,4] attention . With the notable exception of Bederson's work on the dimers [4] very little experimental evidence exists on the dipole polarizability of higher alkali halide clusters. In this report we focus our efforts on the electric polarizability of sodium chloride clusters (NaCI),, n = 1,2,3 and 4. We rely on conventional ab initio and density functional theory methods for the determination of molecular geometries and dipole polarizabilities. We have designed small and medium-sized basis sets that can be used for electric property calculations for sodium chloride clusters with up to the decamer. The HOMO and LUMO of (NaC1)3 and (NaC1)4 are shown in Fig. 1. In Fig. 2 we have plotted the evolution of the mean per atom dipole polarizability. It is fairly obvious that two widely used theoretical methods, MP2 and B3LYP predict the same trend in the evolution of this property: a strong reduction of the mean per atom polarizability with cluster size.
'Author to whom correspondence should be addressed
289
290
HOMO
HOMO
LUMO
LUMO
Figure 1. HOMO and LUMO for the sodium chloride trimer and tetramer, (NaCl), and (NaCI)4.
29 1
17.5 -
Cluster size and dipole polarizability
-a-
HF
--o- B3LYP
a
11.0
I
I
I
I
2
4
6
8
Number of atoms in (NaCl)"
Figure 2. Evolution of the mean per atom polarizability with cluster size in (NaCI)".
References 1.
P.Weis, C.Ochsenfeld, R.Ahlrichs and M.M.Kappes, J.Chem.Phys. 97, 2553 (1992). 2. R. P. Dickey, D. Maurice, R. J. Cave, R. Mawhorter, J.Chem.Phys. 98, 2182 (1993). 3. R.Kremens, B.Bederson, B.Jaduszliwer, J.Stockdale and A.Tino, J.Chem.Phys. 81, 1676 (1984). 4. T.Guella, T.M.Miller, J.A.D.Stockdale and B.Bederson, J.Chem.Phys. 94, 6857 (1991).
ABOUT THE POSSIBILITY OF APPLYING THE NEURON NETWORKS FOR DETERMINING THE PARAMETERS OF UNIAXIAL FILMS ON THE BASIS OF THE ELLIPSOMETRIC MEASUREMENTS MICHAL M. KARPUK Koszalin Technical University, Koszalin, Poland E-mail:
[email protected]
The problem of definition of parameters of thin anisotropic films used in a microelectronics on the basis of ellipsometrical measuring is explored. The method of definition of parameters of films with use of neuron networks is offered. The networks is trained in space of acceptable values of parameters of layered system. The algorithm of tutoring of a network grounded on a rule Widrow-Hoff. At tutoring the error of experimental data's was taken into account. The neuron networks is applied for definition of parameters of uniaxial films of Langmuir-Blodgett dimethyl-3,4:9,1O-perylenebis(dicarboximide). The network has shown high performance, the results coincide with obtained other methods. The network can be applied for examination of layered systems. Keywords: neural networks, optimization, ellipsometry, anisotropic layered system.
1. Introduction Modem microelectronics, information technologies, touchsensitive techniques widely use thin-film materials as element basis. In the physics of thin films, where this materials are actively investigated, one of basic problems is defining the parameters of layered structures. One of methods is ellipsometrical. It allows to define optical parameters of layered systems with great accuracy. The essence of this method is analysis of reflected polarized light from investigated structure [ 11. For the elementary isotropic layered systems the analytical algorithms and iteration procedures of finding the complex indexes of refraction and thicknesses of layered mediums are designed. However in practice more anisotropic materials are applied, to which this procedures are inapplicable. For finding the parameters of anisotropic layered structures the optimization procedures consisting in minimization of the root-mean excursion of calculated ellipsometric angles to the measured experimentally are used more often. According to this the problem of creation of self-studying system, that, on the base of experimental data and prior information about the structure of anisotropic system could define its parameters, is actual.
292
293
Theory. In ellipsometry at reflection of a polarized light from layered system ellipsometric angles A and Y, that depend on parameters of explored anisotropic system are defined. In the given abstract the system of uniaxial stratum isotropic substrate is considered. The thickness of a stratum equals d, the optic axis c makes angle 4 with a normal line to the boundary. Ellipsometric angles A and Y measured experimentally are the functions of the angle of incidence a, complex indexes of refraction of a film Nu=No' - iNo", Ne=N,' - iNe", its thickness d, the orientations angle 4 of the optic axis, and also complex index of refraction of a substrate N,= N,' - iNs":
The optimization procedures consisting in minimization of a functional are usually applied to defining the unknown parameters of such layered systems
where,
A: ,'Pirn - are measured experimentally, Ai ( N o ,N , ,d ,@, N , ),
Yi( N o ,N , , d , @ ,N , )
- calculated with the help of model ellipsometric
angles, n - quantity of measurings. However minimization (2) for anisotropic systems gives great errors in defining the anisotropy of a film, as the responsivity of ellipsometric angles A and Y varies for explored parameters [2]. For example, at orientation of optic axis c of a film, direct to a normal line, the small angular variations of angle a give great errors in definition ordinary No and unordinary N, indexes of refraction of a film. On the other hand, the structure of functional (2) is, that the existence of great number of local minimums, that are not giving the valid solution and definition of parameters of a film is possible. In this connection the idea of usage of neuron networks for solution of the given physical problem appeared. Neural networks have ample opportunities of pliable tuning up of parameters, adaptations to varying input dates [3, 41. On the base of a physical model a two-layer neuron network with three input and eight output neurons was implemented. The choice of such network was
294 stipulated by the structure of physical data. Giving on inlet of neuron network data for n measuring, we get parameters of explored system on an output. Neuron network realizes nonlinear transformation of input signal in such a manner that
i=l
here wi - weighting coefficients of the network, P , (m=1,2, ... k) - parameters of layered system. The solution of problem consist of two stages. At the f i s t stage - grade level - it was necessary learn the network to select the weighting coefficients w i depending on input data. The algorithm of tutoring consisted of the following stages: 1. Definition of permissible boundaries of changing of the parameters of explored physical system (assignment of a k-dimension cube, inside which the solution was; k - quantity of defined parameters); 2. Assignment of a rough grid inside the k-dimension cube, for which the direct problem of ellipsometry was solved - the comers A and Y were defined. This grid, considering the computing opportunities made modem PC, consisted not less than of 1000 elements; 3. On the base of minimization on a least square method with application of gradient descent, vector w i- the weighting coefficient of the network was defined for each element of a grid. After tutoring the network for every element of a grid inside a k-dimension cube the vector of weighting coefficients was calculated. On the second stage - the stage of solution of the problem - it was necessary to minimize a functional (2). This problem was solved by neuron network as following: 1. The initial experimental data was inlet; 2. On the basis of injected data the search of a point inside a k-dimension cube, for which the total of squares of diversions of experimental data and calculated data was minimum, was carried out. 3. For the greater approach to the solution of a problem by interpolation the point of a k-dimension cube and vector of interpolated values of weighting factors was defined. This point and vector of weighting factors were selected as initial approach for the defined parameters of a film and substrate. 4. The minimization of a functional was carried out (2) with the help of a method of coordinate descent, and the step 3 was iterated for each subsequent approach of defined parameters.
295
The algorithm of the solution of a problem is implemented in languages C++ and Fortran-90, most adequate the content of problems [5]. The results For testing of the neuron network films of organic molecules were selected. The direct problem of an ellipsometry was solved and ellipsometric angles A and *F were defined for them. Then the received data "was spoiled" by the errors of measuring device (5-10") and were used for testing the network. The results of model calculations are shown in table 1. Here we can see, that only for small thickness the error of definition of parameters is great, and at d/k > 0,05 the error is promptly diminished up to 0,1%. The model operation was spent also for a case of an oblique orientation of optic axis. Thus the error of definition of a angle <j> angle of orientation of optical axis made 5-7 %. The traditional methods of minimization give a considerably greater error (at 20%). Table 1. A numerical modeling for transparent uniaxial films with an axis of an anisotropy located perpendicularly of a plane of boundary. A substrate Ns= 1.46, angle a = 60°, wave length of an incident wave A.=6328 A. True values NO "e 1.540 1.550
1.540 1.540 1.540 1.540 1.540 1.540 1.540 1.540
1.550 1.550 1.550 1.550 1.550 1.550 1.550 1.550
d(K) 1.27 6.33 12.66 31.64 63.28 126.56 316.40 632.80 1265.6
Ellipsometrical Calculated values angle (grad) A «*(A) V NO "e -0.065 6.917 1.564 1.568 11.635 -0.325 -0.650 -1.625 -3.244 -6.440 -24.87 2.229 -25.44
6.917 6.916 6.911 6.892 6.817 4.906 2.738 4.671
1.544 .536 .538 .539 .539 .539 1.540 1.540
1.563 1.548 1.547 1.547 1.548 1.549 1.550 1.550
12.457 12.965 31.916 63.650 126.95 316.42 632.98 1265.6
Error in % NO 1.56
0.26 0.23 0.10 0.08 0.06 0.04 0.00 0.00
"e <*(A) 1.18 819.3 3 0.84 96.86 0.11 2.44 0.22 0.87 0.18 0.58 0.11 0.31 0.05 0.01 0.00 0.03 0.00 0.00
Offered neuron network was used for definition of parameters of the films of Langmuir-Blodgett. Film 1 isotropic one formed by thermal vacuum deposition dimethyl-3,4:9,10-perylene-bis(dicarboximide). Film 2 and 3 are LB films formed by deposition from water surface (8Y layers) and from subphase containing dissolved CoBr2 (1.10-4 M, 6Y layers) accordingly. One of features of technology of cultivation of similar organic compounds is the orientation of molecules and, accordingly, the axis of anisotropy, in a direction, close to a normal line to a boundary. In this case it's almost impossible to define the angle
296
()> and of an anisotropy by traditional methods. The results of calculations are shown below in the table 2. Table 2. Results of calculation of parameters of films with use of the offered algorithm for neural networks. d,A Film No
, (grad) Ne 77(5). 1. 1.86(4)-i0.01(3) 1.86(6)-i0.01(6) IUD 43(4). 24,(5) 2. 1.45(7)-iO.OO(6) 1.46(3)-iO.OO(6) 46(0). 1.47(3)-iO.OO(3) 1.48(0)-iO.OO(2) 3. 1U7) You can see, that the of physical parameters of a film are quite real, and the received results correspond to the given system.
References 1. R.M.A.Azzam and N.M.Bashara. Ellipsometry and Polarized Light (North-Holland, Amsterdam, 1977). 2. M.Schubert, B.Rheinlander, E.Franke, H.Neumann, J.Hahn, M.Roder and F.Richter. Appl. Phys. Letter 70 (14), 1819, (1997). 3. K.Warwick, G.W.Irwin, KJ.Hunt. Neural networks for control and systems (United Kingdom, London, 1992). 4. D.Rutkowska, M.Pilinski and L.Rutkowski. Sieci neuronowe, algorytmy genetyczne i systemy rozmyte (Polska, Warszawa, 1997). 5. T. Masters. Practical neural network recipies in C++. (Academic Press, Inc., 1993).
A FUZZY LOGIC PARADIGM FOR INDUSTRIAL ECONOMICS ANALYSIS SAEID H . KASHANI UNIVERSITE DE RENNES I Faculty of Economic Sciences Doctoral Program, Center of National Research in Science (CNRS) 7, Place Hoche, CS86514,CREREG, OfJice N" 293 Tel. : 0033 2 23 23 33 24 35065 Rennes-France E-mail: Saeid. Kashatzi;ijiirl.ihi-~eniies].tiKashaiiiiciiivreniies1Givuhoo.fr
Investment decision in assets with a high degree of "know-how" specificity under uncertainty in the sense of "adverse selection" is an important matter for policy-maker and enterprise managers. According to the "Transaction Cost Theory", these two variables affect deeply the motivation for vertical cooperation. On the other hand, with a given level of these variables, we can determine the organizational size of enterprise. A non appropriate size could provoke a considerable transaction costs due to the unnecessary transactions. In order to measure the impact of "adverse selection" and "know-how" specificity on the contract decision through a subcontractory production system in this paper, we developed a new panoramic vision using "fuzzy logic" methodology. The model applied the real data obtained of 17 enterprises in French automotive industry. Finally, the fuzzy index estimated is compared with the real data about the levels of contracting by the enterprises. The results show a powerful association between the real and fuzzy estimated data, and explain the behavior of contract decision in a pertinent way. Fuzzy set theory, Fuzzy Logic, Transaction Cost Economics, Automotive Industry, Vertical cooperation, Contract, Subcontractory, Uncertainty, Asset specificity, Adverse select
MONOTONIC SCALING OF THE KPZ GROWTH WITH QUENCHED DISORDER
H. KATSURAGI AND H. HONJO
Department of Applied Science for Electronics and Materials, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, 6-1 Kasugakoen, Kasuga, Fukuoka 816-8580, Japan
E-mail: [email protected]

Multiaffine analysis is applied to the KPZQ (Kardar-Parisi-Zhang growth with Quenched disorder) model. In previous work, we found that the BDP (Ballistic Deposition growth with Power-law noise) model exhibits multiaffinity, and that power-law noise tends to break the KPZ universality. In this paper, we report on the monotonic self-affine scaling of the KPZQ model. It is confirmed that the KPZ universality holds for the higher-order exponents. This implies that the BDP model and the KPZQ model can be classified by multiaffine analysis.
1. Introduction
Growing rough surfaces are among the most typical patterns observed throughout the sciences and engineering¹. The dynamics of a growing rough surface can be written in the simple scaling form w ~ x^α Ψ(t/x^{α/β}), where w, x, t, α, β, and Ψ correspond to the surface width, horizontal position, time, roughness exponent, growth exponent, and scaling function, respectively. The scaling function Ψ has two different regimes depending on the argument u = t/x^{α/β}, i.e., Ψ(u) ~ u^β when u ≪ 1, and Ψ(u) ~ const. when u ≫ 1. The KPZ equation has been proposed to understand this scaling². It is written as

∂_t h = ν ∇²h + (λ/2)(∇h)² + η(x, t),

where h, ν, λ, and η are the surface height, effective surface tension, lateral growth strength, and noise term, respectively. The scaling exponents can be calculated as α = 1/2 and β = 1/3 for KPZ growth². In contrast with this exact result, most experiments show α > 1/2¹. Some improvements of the KPZ model have therefore been proposed. Zhang proposed the BDP (Ballistic Deposition growth with Power-law noise) model³, which can yield large α (> 1/2). Another candidate is the KPZQ (KPZ with Quenched disorder) model. In the ordinary KPZ equation the arguments of the noise term are the horizontal position and time, η(x, t). However, in some real cases the noise may be quenched at a local position, so one can easily imagine η(x, t) being replaced by η(x, h). The KPZQ model should then be written as

∂_t h = ν ∇²h + (λ/2)(∇h)² + v + η(x, h),   (1)

where v is a constant driving-force term needed for steady growth. The KPZQ model also produces large α (> 1/2). What is the difference between BDP and KPZQ growth, and which model is more reasonable for real growing-surface phenomena? In order to answer these questions, we measure the multiaffine and multigrowth exponents of the two models. For the BDP model, multiaffinity and the multigrowth property have already been revealed⁴. In the present paper, we report on the monotonic self-affine result of the KPZQ model.

2. Simulation and analysis

Let us consider the qth order height-height correlation function

C_q(x, t) = ⟨|δh(x' + x, t' + t) − δh(x', t')|^q⟩,   (2)

where δh(x, t) = h(x, t) − ⟨h(t)⟩_x. Then we can define the qth order roughness exponent α_q and the qth order growth exponent β_q as

C_q(x, 0) ~ x^{q α_q},   C_q(0, t) ~ t^{q β_q}.   (3)

The exponents α_q and β_q are defined in the limits x → 0 and t → 0, respectively. If α_q (β_q) varies with q, the result is multiaffine (multigrowth). We carry out (1+1)-dimensional KPZQ model simulations. The discretized version of Eq. (1) is written as⁵

h(x, t + Δt) = h(x, t) + Δt ( {h(x+1, t) − 2h(x, t) + h(x−1, t)} + (λ/2){h(x+1, t) − h(x−1, t)}² + v + η(x, [h(x, t)]) ),   (4)

where [h(x, t)] denotes the integer part of h(x, t). We use quenched disorder η(x, h) with a Gaussian noise distribution. The parameters Δt, λ, and v are set to 0.01, 1.0, and 0.05, respectively. The system size L, the total number of time steps T, and the number of ensemble averages N are taken as L = 1024, T = 10000, and N = 100 for the C_q(x, 0) calculation, and L = 512, T = 1000, and N = 1 for the C_q(0, t) calculation. The computed results for C_q are shown in Fig. 1. As can be seen, all curves have the same slope in the small-x and small-t regimes, i.e., α_q and β_q are independent of q. The obtained values α_q = 0.63 and β_q = 0.44 almost satisfy the KPZ universality scaling law α + (α/β) = 2 (0.63 + 0.63/0.44 ≈ 2.06). The quenched disorder does not break the KPZ universality. In Fig. 1(b), the slopes of the smaller-q curves vary in the large-t regime. The reason for this qualitative change remains unresolved; larger-scale computations are necessary for a more accurate analysis.
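To make the update rule concrete, the following minimal Python sketch iterates Eq. (4) on a periodic lattice. The parameter values follow the text, while the finite depth of the quenched-noise table, the periodic boundaries, and the flat initial interface are assumptions made only for this illustration; it is a sketch of the scheme, not the authors' actual code.

```python
import numpy as np

# Minimal sketch of the discretized KPZQ update, Eq. (4).
L, T, dt, lam, v = 1024, 10000, 0.01, 1.0, 0.05
rng = np.random.default_rng(0)
max_height = 2000                       # assumed depth of the quenched-noise table
eta = rng.standard_normal((L, max_height))   # Gaussian quenched disorder eta(x, m)

h = np.zeros(L)                         # flat initial interface (assumption)
for _ in range(T):
    hp = np.roll(h, -1)                 # h(x+1, t), periodic boundaries assumed
    hm = np.roll(h, 1)                  # h(x-1, t)
    lap = hp - 2.0 * h + hm             # discrete Laplacian term
    grad2 = (hp - hm) ** 2              # lateral growth term {h(x+1) - h(x-1)}^2
    idx = np.clip(h.astype(int), 0, max_height - 1)
    noise = eta[np.arange(L), idx]      # quenched noise eta(x, [h(x, t)])
    h = h + dt * (lap + 0.5 * lam * grad2 + v + noise)

# qth-order moments of the height fluctuations, cf. the correlation functions C_q
dh = h - h.mean()
print([float(np.mean(np.abs(dh) ** q)) for q in (1, 2, 3)])
```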
[Figure 1. The height-height correlation functions of the KPZQ model: (a) (1/q) ln C_q(x, 0) vs. ln x, and (b) (1/q) ln C_q(0, t) vs. ln t. All curves have the same slope, indicating monotonic scaling of the growing surface. The curves correspond to q = 1, 2, ..., 10, from bottom to top.]
3. Discussion

We have investigated the KPZQ growth model with the multiaffine analysis method. While the BDP model shows multiaffinity, the KPZQ model does not. That is, the multiaffine analysis method can distinguish the two models. Myllys et al. have examined the paper-combustion front experiment and found multiaffinity⁶. On the other hand, the Hida mountain profiles do not show multiaffinity⁷. Both surfaces have large α (> 1/2), so the large α might not result from a single origin. Of course, there are many other models for growing rough surfaces. A universal classification is required to understand surface-growth phenomena physically, and multiaffine analysis is just one method for such a classification. Moreover, we have also examined the effective noise distribution of the KPZQ model and found that it has an exponential tail. The effective noise is defined as η_m = δh(x', t' + Δt) − δh(x', t'). This is quite different from the BDP model, in which the effective noise obeys a power-law form⁴.
References
1. A.-L. Barabási and H. E. Stanley, Fractal Concepts in Surface Growth (Cambridge University Press, New York, 1995).
2. M. Kardar, G. Parisi, and Y.-C. Zhang, Phys. Rev. Lett. 56, 889 (1986).
3. Y.-C. Zhang, J. Phys. (Paris) 51, 2129 (1990).
4. H. Katsuragi and H. Honjo, Phys. Rev. E 67, 011601 (2003).
5. Z. Csahók, K. Honda, and T. Vicsek, J. Phys. A 26, L171 (1993).
6. M. Myllys, J. Maunuksela, M. J. Alava, T. Ala-Nissila, and J. Timonen, Phys. Rev. Lett. 84, 1946 (2000).
7. H. Katsuragi and H. Honjo, Phys. Rev. E 59, 254 (1999).
A RECURSIVE ALGORITHM FOR FINDING HDMR TERMS FOR SENSITIVITY ANALYSIS
H. KAYA, M. KAPLAN
Istanbul Technical University, Institute of Informatics, Computational Science and Engineering Program, 34469 Maslak, Istanbul, Turkey
E-mail: hkaya@be.itu.edu.tr, [email protected]

H. SAYGIN
Istanbul Technical University, Institute of Energy, 34469 Maslak, Istanbul, Turkey
E-mail: sayginh@itu.edu.tr
High Dimensional Model Representation (HDMR) is a newly developed technique which decomposes a multivariate function into a constant, univariate functions, bivariate functions and so on. These functions are forced to be mutually orthogonal by means of an orthogonality condition. The technique, which is generally used for high dimensional input-output systems, can be applied in various disciplines, including sensitivity analysis, differential equations, inversion of data and so on. In this article we present a computer program that computes the individual components of the HDMR resolution of a given multivariate function. The program also calculates the global sensitivity indices. Lastly, the results of numerical experiments for different sets of functions are presented.
1. HDMR Fundamentals

There have been recent efforts to develop efficient methods for tackling multivariate functions in such a way that the components of the approximation formula for a multivariate function are ordered, starting from a constant and gradually approaching full multivariance as we proceed through the univariate, bivariate and higher terms. This type of method was proposed by Sobol¹ and its revised and generalized form was proposed and applied to various problems²⁻⁵. The method is called High Dimensional Model Representation (HDMR) and its basic philosophy can be given through the following general equation for a given multivariate
function f(x₁, ..., x_N) defined over the N-dimensional unit hypercube K^N:

f(x) = f₀ + Σ_{i=1}^{N} f_i(x_i) + Σ_{1≤i<j≤N} f_{ij}(x_i, x_j) + ... + f_{12...N}(x₁, ..., x_N),   (1)

where all variables x₁, ..., x_N are assumed to vary between 0 and 1 without loss of generality. The additive terms on the right-hand side of this equality are orthogonal components of the original function. The orthogonality is achieved by imposing the following vanishing condition:

∫₀¹ dx_{i_l} w(x_{i_l}) f_{i_1...i_k}(x_{i_1}, ..., x_{i_k}) = 0   if   i_l ∈ {i_1, ..., i_k},   (2)

where w(x_{i_l}) is a weight function. The basic underlying philosophy of HDMR is to generate an approximation scheme using the first few terms of the expansion (1), usually up to the second-order terms:

f(x) ≈ f₀ + Σ_{i=1}^{N} f_i(x_i) + Σ_{1≤i<j≤N} f_{ij}(x_i, x_j).   (3)

The HDMR expansion can also be regarded as a hierarchically correlated function expansion in terms of the input variables. For example, f_i(x_i) represents the individual contribution of the independent variable x_i, whereas f_{ij}(x_i, x_j) represents the correlated action of the inputs x_i and x_j together, and so on. The constant term f₀ can be thought of as the mean response of the system. There are two main areas where the HDMR approach is valuable: (1) generating an approximation for a model by using the first few HDMR terms²,⁵; (2) identifying the key variables with the most influence on the outputs. The latter is possible by calculating the HDMR components and then the global sensitivity indices. For this purpose the following variances are introduced:

D = ∫_{K^N} f²(x) dx − f₀²,   (4)

D_{i_1...i_k} = ∫ f²_{i_1...i_k}(x_{i_1}, ..., x_{i_k}) dx_{i_1} ... dx_{i_k}.   (5)

Taking the square of Eq. (1), integrating with respect to all variables and using the vanishing condition (2), one obtains

D = Σ_{k=1}^{N} Σ_{i_1<...<i_k} D_{i_1...i_k}.   (6)

If Eq. (6) is divided by D, the sensitivity estimates are obtained:

1 = Σ_{k=1}^{N} Σ_{i_1<...<i_k} S_{i_1...i_k},   (7)

where

S_{i_1...i_k} = D_{i_1...i_k} / D.   (8)

Of course, to find the sensitivity indices, the squares of all functions considered so far are assumed to be integrable over K^N. It can easily be seen that the sensitivity indices must be nonnegative and less than or equal to one. Although the HDMR resolution and the global sensitivity indices can be determined manually for some sets of functions, a computer program is needed for general functions and a large number of variables. For this purpose we developed a computer program, the details of which are explained in the following section.

2. The Algorithm
This section is devoted to the details of the program and the underlying algorithm. The program is written in a language capable of symbolic integration. It simply takes the multivariate function and the maximum number of variables and then returns the components of the HDMR resolution of that function.
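As an illustration of such a symbolic procedure, the short Python sketch below uses sympy to compute the constant and first-order HDMR components and the corresponding sensitivity indices of a function on the unit hypercube. It is only a minimal reconstruction of the idea with a uniform weight w = 1, not the authors' actual program; the test function is taken from Eq. (9) below.

```python
import sympy as sp

x = sp.symbols('x1:5')                       # four variables on [0, 1]
f = x[0] * x[1] * x[2] + x[3]                # test function H from Eq. (9)

def integrate_out(expr, variables):
    """Integrate expr over [0, 1] in each of the given variables (weight w = 1)."""
    for v in variables:
        expr = sp.integrate(expr, (v, 0, 1))
    return sp.simplify(expr)

f0 = integrate_out(f, x)                                     # constant HDMR term
f1 = {v: integrate_out(f, [u for u in x if u != v]) - f0     # univariate terms f_i(x_i)
      for v in x}

D = integrate_out(f**2, x) - f0**2                           # total variance, Eq. (4)
S1 = {str(v): integrate_out(f1[v]**2, [v]) / D for v in x}   # first-order indices, Eq. (8)

print('f0 =', f0)
print(S1)
```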
3. Numerical Experiment

In the experiment we choose four different functions of four variables and compute the sensitivity estimates:

F = x1 + x2 + x3 + x4,
G = x1·x2 + x3 + x4,
H = x1·x2·x3 + x4,
I = x1·x2·x3·x4.   (9)
We examine the impact of different sets of variables upon the output of each function. There are 2⁴ − 1 = 15 possible combinations: individual effects of variables, pairwise correlated effects, and so on. These effects can be observed by using our algorithm based on the HDMR method. Table 1 shows that for the function F all variables have equal contributions. In the function G, x3 and x4 have larger contributions than x1 and x2, because the product x1·x2 produces small numbers compared to the sum x3 + x4 on the interval [0,1]. The table also reflects the correlated actions of variables. For the function H, the column 14 is empty since there is no interaction between x1 and x4.

Table 1. Global sensitivity indices (in %) for the functions of Eq. (9); the column headings denote the variable subsets and a dash marks a vanishing contribution.

          1      2      3      4     12     13     14     23     24     34    123    124    134    234   1234
F      25.0   25.0   25.0   25.0      -      -      -      -      -      -      -      -      -      -      -
G      9.67   9.67   38.0   38.0   3.22      -      -      -      -      -      -      -      -      -      -
H      4.97   4.97   4.97   79.6   1.66   1.66      -   1.66      -      -   0.55      -      -      -      -
I      15.4   15.4   15.4   15.4   5.14   5.14   5.14   5.14   5.14   5.14   1.71   1.71   1.71   1.71   0.57
4. Conclusion

We introduced a computer program, written in a symbolic algebra language capable of symbolic integration, for calculating the HDMR component functions and global sensitivity indices of a given function by means of recursive procedures. Although some parts of the program need improvement, for instance the transformation from sets to integers, it can be used as an efficient tool for HDMR research.
References
1. I.M. Sobol, Sensitivity Estimates for Nonlinear Mathematical Models, Mathematical Modeling and Computational Experiments 1, 407-414 (1993).
2. J.A. Shorter, P.C. Ip and H. Rabitz, An Efficient Chemical Kinetics Solver Using High Dimensional Model Representation, J. Phys. Chem. A 103, 7192-7198 (1999).
3. O. Alis and H. Rabitz, Efficient Implementation of High Dimensional Model Representations, J. Math. Chem. 29, 127-142 (2001).
4. I. Banerjee and M.G. Ierapetritou, Design optimization under parameter uncertainty for general black-box models, Ind. Eng. Chem. Res. 41, 6687-6697 (2002).
5. G. Li, S.W. Wang, H. Rabitz, S. Wang and P. Jaffe, Global uncertainty assessments by high dimensional model representation (HDMR), Chemical Engineering Science 57, 4445-4460 (2002).
6. MuPAD, The Open Computer Algebra System. http://www.mupad.de.
7. M. Demiralp and A.A. Kanmaz, High Dimensional Model Representation: A Symbolic Computer Program for Finding Orthogonal Components, 12th National Mechanics Congress, Konya, Turkey, 2001 (in Turkish).
8. I.M. Sobol, Theorem and Examples on High Dimensional Model Representation, Reliability Engineering and System Safety 79, 187-193 (2003).
CALCULATION OF VIBRATIONAL EXCITATION OF DIATOMIC MOLECULES BELOW DISSOCIATIVE ATTACHMENT THRESHOLD
P. KOLORENC, J. HORACEK*, K. HOUFEK AND M. CIZEK
Institute of Theoretical Physics, Charles University Prague, V Holešovičkách 2, 180 00 Praha 8, Czech Republic
* E-mail: horacek@mbox.troja.mff.cuni.cz

G. MIL'NIKOV AND H. NAKAMURA
Institute for Molecular Science, Myodaiji, 444 Okazaki, Japan
The problem of vibrational excitation (VE) of diatomic molecules by low-energy electrons below the threshold of dissociative attachment (DA) is a difficult and interesting one. One of its interesting features is the unusual series of oscillations in the VE cross section converging to the DA threshold, sometimes called boomerang oscillations², see Figure 1. These oscillations are very sensitive to minor changes of the underlying forces and represent a very stringent test of the theory used for the calculation of the VE cross section. The calculation of the VE cross section in this case is difficult because for energies below the DA threshold we have to solve scattering integral equations describing the motion of the nuclei in the resonance state at a negative energy. Scattering equations are usually solved for positive energies, since at these energies the particles may escape the scattering region, completing the process of scattering. In such a case the leading term and the integral kernel are bounded functions provided the interaction potential is bounded. If the energy is negative, the particles cannot escape. The leading term in the scattering integral equation usually represents the free-particle wave function, which at negative energies diverges at large internuclear distances. We are therefore faced with the problem of solving an integral equation with a diverging leading term and a diverging kernel. Since the underlying forces always decay at large distances, we can
’,
restrict the integration to a finite range (0, R), where, however, R might be large for long-range forces of polarization type. To our knowledge such a problem has never been discussed in the literature. To tackle it we make use of the R-matrix representation of the scattering Green's function recently proposed by the authors³, and perform a detailed numerical study of the convergence of the VE cross section as a function of all parameters involved. As an example we show in Figure 2 the accuracy of the calculated
[Figure 1. Resonance elastic (v = 0 → 0) scattering and the 0 → 1 VE cross section for HBr as functions of the electron energy (eV); the DA threshold is indicated.]
VE cross section at two energies: one above the DA threshold, i.e. a positive energy, and one at a negative energy below the DA threshold. The calculation is based on the so-called nonlocal resonance model⁵, which represents the most successful and general approach to resonance electron-molecule collisions. As the major tool for the numerical solution we used the Schwinger-Lanczos method⁶. The method is based on the Schwinger variational principle for the scattering integral equation. Employing a convenient (Lanczos) basis set, the solution is found in terms of a continued fraction. A similar method can be used for the solution of a broad class of integral equations.

Acknowledgement

This work was supported by grant KONTAKT No. ME562 of the MSMT of the Czech Republic and by the Czech-Japan collaboration program supported by JSPS.
[Figure 2. Rate of convergence of the method for the calculation of the 0 → 1 VE cross section at two energies, below and above the DA threshold; the accuracy of the cross section is plotted against the size of the basis set N_b (from 25 to 70).]
References
1. J. Horáček, in The Physics of Electronic and Atomic Collisions, XXI ICPEAC, Sendai, edited by Y. Itikawa et al. (A.I.P., New York, 1999), p. 329.
2. M. Čížek, J. Horáček, A.-Ch. Sergenton, D.B. Popović, M. Allan, W. Domcke, T. Leininger and F.X. Gadea, Phys. Rev. A 63, 062710 (2001).
3. G.V. Mil'nikov, H. Nakamura and J. Horáček, Comp. Phys. Comm. 135, 278 (2001).
4. P. Kolorenč, M. Čížek, J. Horáček, G. Mil'nikov and H. Nakamura, Physica Scripta (2002).
5. W. Domcke, Phys. Rep. 208, 97 (1991).
6. H.-D. Meyer, J. Horáček and L.S. Cederbaum, Phys. Rev. A 43, 3587 (1991).
ON THE USE OF COLOR HISTOGRAMS FOR CONTENT BASED IMAGE RETRIEVAL IN VARIOUS COLOR SPACES

K. KONSTANTINIDIS AND I. ANDREADIS
Laboratory of Electronics, Section of Electronics and Information Systems Technology, Department of Electrical and Computer Engineering, Democritus University of Thrace, GR-67100 Xanthi, Greece
Tel.: +30 541 079566, Fax: +30 541 079564
E-mail: [illegible]
As digital image libraries are already overpopulated, the development of improved methods for indexing and retrieving images from such libraries (databases) has become an imperative issue. Content-Based Image Retrieval (CBIR) is a technique for retrieving images on the basis of automatically derived features such as color, texture and shape. The features used for retrieval can be either primitive or semantic, but the extraction process must be predominantly automatic [1]. A large number of applications, including military, industrial and civilian ones, generate gigabytes of images daily. As a result, there is a huge amount of information which cannot be accessed or made use of unless it is organized [3]. By organized it is meant that appropriate indexing is available in order to allow efficient browsing, searching and retrieving, as in keyword searches of text databases. The easiest way to search is with the use of query by example, which means that the user presents an image to the system and the latter searches for similar ones by extracting features from the example and comparing them to the ones stored in the database. A popular technique, which is widely used in CBIR in order to represent the above features, is the use of color histograms. A simple way to describe the operation of a color histogram is as follows: given a discrete color space, a color histogram simply counts how much of each color occurs in the image. In this paper we present an evaluation of various methods for content-based image retrieval used to compare global and local color histograms, within a choice of four color spaces. Global histograms are the histograms of the whole image. Local histograms are obtained by splitting an image into smaller windows and then forming a histogram for each of these sub-images. Subsequently, these histograms are compared to the equivalent ones of the query image and the system decides, according to some predefined criteria, which of the images in the database are similar to the query image.
The histograms of the query image and of the images in the database are all initially passed through a smoothing algorithm and are then used for comparison in order to detect images similar to the one in question. The comparison methods presented are histogram intersection [2], the Euclidean distance and the Wilcoxon rank-sum test. Further, the following color spaces are used: Red-Green-Blue (RGB), Hue-Saturation-Value (HSV), Luminance - relative redness/greenness - relative yellowness/blueness (L*a*b) and Luminosity-Chroma-Hue (LCH). The RGB scheme was chosen because it is generally used in display devices, and HSV because it reflects human color perception more accurately. L*a*b, on the other hand, is a perceptually uniform color space and is particularly sensitive to differences in color. In addition, LCH, a color space derived from L*a*b, was selected due to its usefulness for user interfaces [4]. All the comparisons of the featured methods were initially performed using a fixed database of 200 images (with a dimension of 400 x 400 pixels), through simulations performed in the environment of the Mathworks' Matlab package.
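As an illustration of the histogram-comparison step described above, the sketch below computes a smoothed global RGB histogram for two images and scores them with histogram intersection and the Euclidean distance. The bin count, the smoothing window and the file names are assumptions made for the example, not the parameters used by the authors, and the sketch uses Python rather than the Matlab environment mentioned in the text.

```python
import numpy as np
from PIL import Image

def color_histogram(path, bins=16):
    """Global RGB histogram, smoothed and normalized to sum to 1 (bins per channel assumed)."""
    rgb = np.asarray(Image.open(path).convert('RGB')).reshape(-1, 3)
    hist, _ = np.histogramdd(rgb, bins=(bins, bins, bins), range=[(0, 256)] * 3)
    hist = hist.ravel()
    hist = np.convolve(hist, np.ones(3) / 3.0, mode='same')   # simple smoothing
    return hist / hist.sum()

def intersection(h1, h2):
    """Histogram intersection of Swain and Ballard [2]; larger values mean more similar images."""
    return float(np.minimum(h1, h2).sum())

def euclidean(h1, h2):
    """Euclidean distance between histograms; smaller values mean more similar images."""
    return float(np.linalg.norm(h1 - h2))

hq = color_histogram('query.png')            # hypothetical file names
hd = color_histogram('database_image.png')
print(intersection(hq, hd), euclidean(hq, hd))
```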
References
1. A. Del Bimbo, Visual Information Retrieval, Morgan Kaufmann Publishers, San Francisco, California, 1999.
2. M.J. Swain, D.H. Ballard, "Color Indexing", International Journal of Computer Vision, 7(1): 11-32, 1991.
3. I. Gagliardi, R. Schettini, "A method for the automatic indexing of color images for effective image retrieval", The New Review of Hypermedia and Multimedia, Vol. 3, pp. 201-224, 1997.
4. N. Papamarkos, Digital Image Processing and Analysis, Xanthi, DUTH, 2001.
THEORETICAL STRUCTURAL AND RELATIVE STABILITY STUDIES OF ISOMERIC AND CONFORMERIC FORMS OF XOOY PEROXIDES (X = H, CH3, Cl, Br, I; Y = Cl, Br)

AGNIE M. KOSMAS
Department of Chemistry, University of Ioannina, 451 10 Ioannina, Greece
Ab initio quantum mechanical studies are carried out for the conformeric and isomeric forms of several chlorine and bromine peroxides, XOOCl and XOOBr (X = H, CH3, Cl, Br, I), of interest in stratospheric halogen chemistry. The calculations indicate interesting trends in the nature of the halogen-oxygen bonding. In particular, both the halogen-oxygen bond distances and the relative stability ordering show a notable dependence on the ionic character of the bond and the electronegativity of the X fragment.
1. Introduction
Chlorine and bromine peroxides and their isomers are interesting intermediates in the atmospheric self-reactions of ClO and BrO radicals [1, 2] and in their reactions with the hydroxy, HO [3], and methoxy, CH3O, radicals [4]. Thus, extensive theoretical investigations [5-19] have been devoted to this subject. The purpose of the present study is to examine and contrast Cl-O and Br-O bonding in XOOY compounds and their isomers and to investigate trends in the bond distances and relative stabilities related to the electronegativity of the X fragment. In several of these structures, chlorine and bromine are capable of forming hypervalent compounds with a varying degree of ionic character; thus, interesting trends result with respect to the strength of the corresponding bonds and the energy ordering of the relevant species.
2. Computational Details
The equilibrium-state geometries of all but the I-containing species have been fully optimized at the UMP2(full)/6-31G* level of theory, and they are
found in good agreement with higher-order reported results [5-18] for several of these compounds. The I-containing peroxides have been treated within the MP2 methodology, using the orbital-adjusted effective core potential plus DZ basis of Hay and Wadt, i.e., the LANL2DZ basis set for all atoms involved, augmented by additional polarization functions [19]. All calculations were performed using the GAUSSIAN 98 series of programs [21]. For each peroxide XOOCl and XOOBr (X = H, CH3, Cl, Br, I), two additional isomeric forms, XOClO, XClO2 and XOBrO, XBrO2, and two conformeric configurations, cis-XOOCl, trans-XOOCl and cis-XOOBr, trans-XOOBr, have been studied in total. In all cases the lowest-energy structure is found to be the skewed XOOY geometry with a dihedral XOOY angle ranging from 82 to 89 degrees.

3. Structural Trends
Table 1 summarizes the Cl-O and Br-O equilibrium bond distances for all structures under consideration.

Table 1. Cl-O and Br-O bond distances (in Å) in various isomers and conformers of XOOY, Y = Cl, Br.

X\Cl    XOOCl    cis-XOOCl    trans-XOOCl    XOClO'(a)        XClO2
H       1.739    1.717        1.701          1.754, 1.513     1.488
CH3     1.761    1.716        1.704          1.763, 1.520     1.489
I       1.716    1.665        1.676          1.763, 1.489     1.445
Cl      1.756    1.645        1.669          1.796, 1.548     1.477
Br      1.723    1.642        1.668          1.735, 1.479     1.430

X\Br    XOOBr    cis-XOOBr    trans-XOOBr    XOBrO'(a)        XBrO2
H       1.883    1.871        1.855          1.868, 1.676     1.645
CH3     1.899    1.875        1.859          1.896, 1.666     1.647
I       1.870    1.818        1.827          1.872, 1.634     1.597
Cl      1.884    1.816        1.839          1.892, 1.659     1.613
Br      1.878    1.802        1.819          1.855, 1.692     1.653

(a) The two values under the XOYO' isomers correspond to the O-Y and Y-O' bond distances, respectively, Y = Cl, Br.
Table 1 shows that both the Cl-O and Br-O distances in all investigated species follow similar trends and cover a wide range of values, differing by more than 0.3 Å. More particularly, two patterns are evident: the O-Y distances in the skewed minimum-energy configurations are always larger than the values in either the cis- or the trans-conformer. The constraint of the planar geometries is largely reflected in the O-O bond, which is always found to be shorter in the skewed configuration and longer in the planar geometries, thus allowing a tightening of the X-O and Y-O bonds in the cis and trans conformers, as shown in Table 1. For example, the O-O distance is calculated to be 1.395, 1.526, 1.491 Å and 1.406, 1.575, 1.516 Å for the skewed, cis and trans forms of BrOOBr and BrOOCl, respectively. The second result is that the O-Y distances assume the largest values in each family for intermediate oxygen atoms, i.e., in the skewed XOOY and XOYO' forms. The Y-O distances for terminal oxygens in the XOYO' species are shorter, and they become the shortest in XYO2. Thus, a general conclusion is confirmed regarding the relative strength of O-Y bonds: the halogen-oxygen bonding is weaker for intermediate oxygen atoms, whereas it is strongest when it involves terminal oxygen atoms.

4. Relative Stability Trends
Table 2 summarizes the calculated energy differences of the various species under consideration with respect to the XOOY equilibrium forms. It is readily seen that the cis barrier is always higher than that of the trans configuration, tending to increase with the size of X in most cases. Regarding the relative stability of the various isomers, a striking result is found. The XYO2 forms show a tendency to stabilize with increasing electronegativity of the X entity. For X = H, CH3 and I, XOYO is the next stable structure after the peroxide form. The XYO2 forms lie higher with respect to XOOY and XOYO, and particularly the HYO2 isomers are found to be very unstable. As we move from H to I, the energy differences of the XYO2 isomers gradually decrease, and for X = Cl, Br the situation is completely reversed: the XOYO structures become more unstable and the stability of XYO2 increases significantly; for instance, ClClO2 is even found comparable to ClOOCl in some studies [5]. Thus, we have an increasing stabilization of the XYO2 structures when the electronegativity of the X fragment increases.
Table 2. Relative stabilities (in kcal mol⁻¹) of the various isomers and conformers of XOOY, Y = Cl, Br.

X\Cl    XOOCl    cis-XOOCl    trans-XOOCl    XOClO    XClO2
H       0.0      6.0          4.0            8.3      49.7
CH3     0.0      9.2          3.1            8.4      30.8
I       0.0      8.1          3.1            12.4     22.6
Cl      0.0      9.1          5.1            13.3     3.4
Br      0.0      10.4         5.0            10.5     6.8

X\Br    XOOBr    cis-XOOBr    trans-XOOBr    XOBrO    XBrO2
H       0.0      6.9          4.6            2.9      52.4
CH3     0.0      10.1         3.2            6.3      33.4
I       0.0      8.4          2.9            3.8      9.3
Cl      0.0      9.3          4.1            13.1     3.4
Br      0.0      10.0         4.2            9.5      6.8

5. Discussion and Conclusion
The observed trends in the Cl-O and Br-O bond distances and in the relative stability among the various conformeric and isomeric structures of the XOOY peroxides reveal significant aspects of the nature of the chlorine- and bromine-oxygen bonding. This bonding is ionic in character and results from a combination of p-d hybridization in the halogen (3p/3d for Cl and 4p/4d for Br) and p→d promotion of a lone-pair electron [22]. This allows the halogen atom to form multiple bonds and, as would be expected from electron counting, hypervalent bonding of this type for chlorine and bromine exhibits a large ionic component. In the XOYO' isomers, the Y halogen atom becomes hypervalent and forms a multiple bond with the terminal oxygen atom, thus increasing the ionic character of the Y-O' bond compared to O-Y. The effect becomes very pronounced in the XYO2 type, Y = Cl, Br, where the Y halogen atom formally contains five bonds and is very positively charged with respect to the oxygen atoms. Thus, the corresponding Y-O bond distance is expected to be the shortest, and the stability is strongly dependent on the electronegative character of X: the more electronegative X is, the greater the stability of XYO2. In summary, the chlorine- and bromine-oxygen bonds are ionic in nature and the degree of ionic character increases significantly in the hypervalent compounds, especially for the XYO2-type molecules. The effect of the ionic character on the stability of each species greatly depends on the electronegative character of the X fragment, making the H analogues, HYO2, the least stable species in the series. The conclusion is further confirmed by comparing the H-Y bond distances in HYO2 and in the free HY compounds: 1.351 and 1.274 Å for H-Cl, and 1.508 and 1.414 Å for H-Br, respectively. This confirms the severe weakening of the H-Y bond compared to the free HY molecules as a result of the positively charged Y in HYO2.

References
1. S.L. Nickolaisen, R.R. Friedl, S.P. Sander, J. Phys. Chem. A 105, 11226 (2001) and references therein.
2. M.H. Harwood, D.M. Rowley, R.A. Cox, R.L. Jones, J. Phys. Chem. A 102, 1790 (1998) and references therein.
3. C.S. Kegley-Owen, M.K. Gilles, J.B. Burkholder, A.R. Ravishankara, J. Phys. Chem. A 103, 5040 (1999) and references therein.
4. D. Shah, C.E. Canosa-Mas, N.J. Hendy, M.J. Scott, A. Vipond, R.P. Wayne, PCCP 3, 4932 (2001).
5. T.J. Lee, C.M. Rohlfing, J.E. Rice, J. Chem. Phys. 97, 6593 (1992).
6. J.S. Francisco, S.P. Sander, T.J. Lee, A.P. Rendell, J. Phys. Chem. 98, 5644 (1994).
7. J.S. Francisco, J. Chem. Phys. 103, 8921 (1995).
8. P.C. Gomez, L.F. Pacios, J. Phys. Chem. 100, 8731 (1996).
9. W-K. Li, C-Y. Ng, J. Phys. Chem. A 101, 113 (1997).
10. S. Guha, J.S. Francisco, J. Phys. Chem. A 101, 5347 (1997).
11. P.C. Gomez, L.F. Pacios, J. Phys. Chem. A 103, 739 (1999).
12. R. Sumathi and S.D. Peyerimhoff, J. Phys. Chem. A 103, 7515 (1999).
13. R. Sumathi and S.D. Peyerimhoff, PCCP 1, 3973 (1999).
14. D.K. Papayannis, A.M. Kosmas, V.S. Melissas, Chem. Phys. 243, 249 (1999).
15. D.K. Papayannis, A.M. Kosmas, V.S. Melissas, J. Phys. Chem. A 105, 2209 (2001).
16. R.S. Zhu, Z.F. Xu, M.C. Lin, J. Chem. Phys. 116, 7452 (2002).
17. E. Drougas, A.M. Kosmas, Chem. Phys. Lett. 369, 269 (2003).
18. D.K. Papayannis, V.S. Melissas, A.M. Kosmas, submitted.
19. V.S. Melissas, D.K. Papayannis, A.M. Kosmas, J. Mol. Struc. (Theochem), in press.
20. M.J. Frisch et al., GAUSSIAN 98, Gaussian, Inc., Pittsburgh, PA (1998).
21. T.J. Lee, C.E. Dateo, J.E. Rice, Mol. Phys. 96, 633 (1999).
MODELING OF CHIRAL SEPARATIONS IN CHROMATOGRAPHY BY MEANS OF MOLECULAR MECHANICS

W. J. KOWALSKI, J. NOWAK AND M. KONIOR
Institute of Chemistry and Environmental Protection, Pedagogical University, 13/15 Armii Krajowej Av., 42-201 Częstochowa, Poland
E-mail: [email protected]
Modeling of enantiomeric separations in chromatography allows a prognosis of whether the desired enantiomeric separation could be achieved in a studied chromatographic system, and anticipates the elution order of the particular enantiomeric analytes. The use of molecular mechanics for chiral chromatography makes it possible to modify the available chiral stationary phases, to design new ones, and to explore various types of interactions in chromatographic systems. Furthermore, an insight into the enantioseparation mechanisms can be obtained.
1. Introduction
Chromatography is an important and commonly applied method for the separation of enantiomers. Chromatographic separations can occur indirectly, i.e., via separation of diastereomers in achiral chromatographic systems, or directly, by separation of "pure" enantiomers in chiral systems. The chromatographic techniques that have been applied for separations of enantiomers are referred to as chiral chromatography. The following chromatographic techniques are frequently used for analytical and preparative goals: high-pressure liquid, planar, capillary gas and supercritical fluid chromatography. A chromatographic system is composed of a stationary phase, a mobile phase and the molecules of the solutes. Direct separations of enantiomers are carried out in the following chromatographic systems: gas/liquid and gas/solid (i.e., gas chromatography) [1,2], liquid/liquid and liquid/solid (i.e., liquid chromatography) [3], and in a fluid exceeding the critical point/liquid or solid (i.e., supercritical fluid chromatography) [4]. Studies intended for the elaboration of methods, procedures and conditions for the separation of racemic solutes into enantiomers have been published for individual racemic mixtures, selected groups and entire classes of chemical compounds. Many authors have discussed the premises, origins and mechanisms observed during the enantioseparation process of chiral analytes. A majority of them accepted that the phenomenon of enantioselectivity in chromatographic systems could be caused by the existence of a chiral environment generated by a chiral component, i.e., a chiral selector, in the particular chromatographic system [5]. The chiral selectors can be formed from molecules or stereoisomeric groups of atoms. The chiral selectors can be chemically bonded to solid support surfaces in order to improve the thermal, chemical and mechanical stability of the chiral stationary phases [6]. Polyatomic chiral systems have also been described in which it is not possible to point to chiral atoms. Chiral selectors can be present in the stationary phase (liquid or solid), in the mobile phase (liquid or fluid), or in both phases simultaneously. The chiral selectors interact with the enantiomeric solutes, which are defined as selectands. The chromatographic systems containing chiral selectors chemically bonded to support materials in the stationary phase have currently gained an important significance and are usually referred to as chiral stationary phases (CSP). The widespread chromatographic systems containing chiral selectors for the gas chromatographic technique and the chiral systems employed in liquid chromatography have been compiled by Schurig [7] and Wainer [8], respectively. Chromatographic separations of enantiomeric solutes can be described thermodynamically because they obey the Gibbs-Helmholtz equation [9,10]:
−Δ_RS ΔG° = RT ln(K_R/K_S) = RT ln α,   (1)

Δ_RS ΔG° = ΔG°_R − ΔG°_S,   (2)

where R is the gas constant, T the absolute temperature, and K_R, K_S the equilibrium constants of sorption for the individual enantiomers (R and S). Δ_RS ΔG° is the enantioselective difference of the free enthalpies of interaction of the enantiomers R and S in the chromatographic system. This term is also referred to as the chiral selectivity or enantioselectivity. The ratio of the equilibrium constants of sorption can easily be determined from a recorded chromatogram, taking into account the following relations:

k_R = (V_R − V_M)/V_M,   (3)

α_RS = K_R/K_S = k_R/k_S,   (4)

where k_R is the retention factor of the solute R, α_RS the coefficient of separation of the two solutes R and S (also referred to in the literature as the coefficient of chiral separation, coefficient of chiral selectivity or coefficient of enantioselectivity), V_R the retention volume, and V_M the retention volume of the non-retained solute. The enantioselectivity of the chromatographic system can be determined if we consider the formation of diastereomeric complexes (or associates) between the chiral selector and the solute selectand. Very often the formation of binary
complexes selector-selectand can be accepted, although the stoichiometry and stereochemistry of the complexes may in many cases be the goal of investigation per se. In the studies of enantioselective interactions, the binary complexes formed between a chiral selector and one molecule of the selectand (i.e., the solute) are compared [11]. The chiral selector can exist as a part of the stationary phase or as a liquid mobile-phase additive. The separation of enantiomers proceeds in a reversible equilibrium process of formation of diastereomeric complexes between the chiral selector and the solute enantiomers. The enantioselective part of the chiral selector can interact reversibly with the solute enantiomers during the elution of the chromatogram and can form diastereomeric complexes of different stability, which cause the enantiomeric separation. In all systems where enantioselective interactions take place, two competing equilibria can be distinguished [12]:

H + G_R ⇌ H·G_R,   (5)

H + G_S ⇌ H·G_S,   (6)

where H refers to the host molecule serving as the selector, G is the guest molecule, the selectand, and R and S are stereochemical descriptors. The diastereomeric complexes are formed in both equilibria due to the following interactions: electron donor-acceptor (e.g., n-π and π-π), hydrogen bonds, van der Waals interactions and others. The enantioselective interactions are in many cases very weak in comparison with the achiral complexing interactions. The left sides of Eq. (5) and Eq. (6) are equal: the same host molecule (H) and guest (G) molecules are present in both equations. They possess the same shapes, identical free energies and an identical degree of solvation in the non-bonded state in the achiral environment. To be determined are only the free enthalpies of the diastereomeric selector-selectand complexes, Δ_RS ΔG°. The values of Δ_RS ΔG° can be used to predict the complex-forming preference for a particular enantiomer of the guest (G) using the double-difference method proposed by de Tar [13]. It is assumed that if the competing binding mechanisms are similar enough, the influences of polar effects, solvation effects and entropy differences cancel, thus making the differences in energies computed from molecular mechanics calculations comparable to the differential free enthalpies Δ_RS ΔG°. This premise justifies a correlation between the Δ_RS ΔG° values and the potential (steric) energies ΔE_pot determined by means of computational procedures of molecular or quantum mechanics. Lipkowitz [12] proposed the use of computational procedures of molecular and quantum mechanics for the determination of ΔE_pot and accepted that this value is correlated with the enantioselective difference of the free enthalpy of complexation Δ_RS ΔG°. Recently, other approaches for defining the enantioselectivity have been derived from molecular topology and from the use of quantitative relationships between the chemical structure of molecules and the enantioselectivity obtained by means of self-learning neural nets [14,15].
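As a numerical illustration of how Eq. (1) links an energy difference to a separation factor, the Python sketch below converts a pair of computed complex energies into an estimated α and a predicted elution order. The energy values are invented for the example, and using Δ_RS E_pot as a stand-in for Δ_RS ΔG° is exactly the approximation discussed above, not an exact relation.

```python
import math

R = 8.314      # gas constant, J mol^-1 K^-1
T = 298.15     # temperature, K

# Hypothetical steric energies (kJ/mol) of the diastereomeric complexes H.G_R and H.G_S
E_R, E_S = -152.3, -150.9          # invented values, for illustration only

ddE = (E_R - E_S) * 1000.0         # Delta_RS E_pot in J/mol, proxy for Delta_RS dG0
alpha = math.exp(-ddE / (R * T))   # Eq. (1): -Delta_RS dG0 = RT ln(alpha)

first_eluted = 'R' if E_R > E_S else 'S'   # the less strongly bound enantiomer elutes first
print(f"estimated alpha = {alpha:.2f}; the {first_eluted}-enantiomer is predicted to elute first")
```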
2. Intention of the study
In the present contribution we intended to find methods and procedures enabling the determination of the stereoselectivity (or, more exactly, the enantioselectivity) of selected chromatographic systems by means of molecular mechanics, using computational procedures. We compared the computed enantioselectivity parameters, Δ_RS E_pot, with the experimentally available chromatographic retention data that describe the enantioseparation, Δ_RS ΔG°, for the following mechanisms involved in chiral chromatography:
- reversible complex formation between d- and f-metal complexes with chiral ligands and enantiomeric ketone and alcohol solutes,
- reversible ligand exchange in Cu(II) complexes containing L-hydroxyproline derivatives and α-amino acid solutes,
- reversible hydrogen-bond formation between chiral crown ethers and enantiomeric solutes containing amine or hydroxyl groups in the α position.
The presented types of interactions are observed between the chiral stationary phases and enantiomeric solutes in the following chromatographic techniques:
- complexation gas chromatography (formation of metal-nucleophilic ligand complexes) [17],
- ligand-exchange liquid chromatography,
- inclusion chromatography with hydrogen-bond formation.
Apart from the mentioned types of interactions, a variety of other interactions can be distinguished in the studied chromatographic systems; their description and evaluation will contribute to a better knowledge of the enantioselectivity and enantioseparation mechanisms in chromatography.

3. Method of Investigations
Modeling of enantiodiscrimination in chromatography by means of molecular mechanics makes use of the energy of interactions, E_pot, which is also referred to as the potential or steric energy. This energy can be determined for the most likely conformers of the studied molecular systems. The procedures for predicting the separation selectivity and the order of elution of the enantiomeric solutes involve the evaluation of the energy differences that characterize the interactions between the selector and the selectands of the two separated enantiomers. The procedures include generating the most probable low-energy diastereomeric associates (or complexes) between the chiral selector and the particular enantiomer of the solute, and even approximating the conformers corresponding to the global energy minimum of the studied systems. A three-dimensional plot of the potential energy vs. two introduced geometric changes in the studied complex (e.g., changes in interatomic distances, bond angles, dihedral angles, or improper torsion angles) is defined as the potential energy surface (PES) of the studied diastereomeric complexes. The information about the PES is very important, because the surface determines the shape of the molecular system, its dynamic character and its reactivity. The most important piece of information obtained from the 3D diagrams of the potential energy (E_pot) vs. the steric structure of the conformers is the energetic minimum of the diastereomeric complexes. This value can be approximated with the aid of conformational analysis and a number of algorithms, e.g., the sequential search of conformations, in order to obtain the low-energy conformers. The first step in the procedure includes an optimization of the geometry of the diastereomeric complex formed between the chiral selector and one enantiomeric selectand. The accessible computing procedures allow one to determine a local energy minimum and to search for nearby minima on the potential energy surface. The minimization should be started from various points in order to find a number of local energy minima. In the presented work a number of procedures and approximations for the search of the 'global' energy minima were used.
Molecular mechanics

A molecule is defined as a set of nuclei bonded together by forces described by the equations of classical physics. The set of equations used to describe the studied chemical system is called the force field. The forces acting between the atoms of the system are described by component functions of the potential energy, the most important of which are: the deformation energy of bond lengths (E_spring), the deformation energy of bond angles (E_stretch), the deformation energy of dihedral angles (E_dihedr), the van der Waals energy (E_vdW), the hydrogen-bonding energy (E_Hbond), and the electrostatic energy (E_electrostatic):

E_pot = Σ (E_spring + E_stretch + E_dihedr + E_vdW + E_Hbond + E_electrostatic + ...).   (7)

The sum of all these interactions defines the potential energy of the studied system (E_pot), also called the strain or steric energy. The enhanced version of the standard force field proposed by Allinger, MM+, and its developed version MM3 [16] were used for the calculations. The geometry optimizations were carried out by systematically moving all atoms belonging to the studied system until the resulting forces acting on all the atoms are minimized. In this way an optimized geometry is determined that corresponds to one of the local energy minima of the studied molecular system.
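To show how the terms of Eq. (7) combine into a single steric energy, the sketch below evaluates a toy force field for a water-like molecule. The functional forms and parameter values are purely illustrative, not those of the MM+ or MM3 force fields used in this work.

```python
import numpy as np

# Schematic evaluation of Eq. (7): harmonic bond (E_spring) and angle (E_stretch)
# terms plus a Lennard-Jones term (E_vdW) between the terminal atoms.
coords = np.array([[0.00, 0.00, 0.0],    # O
                   [0.96, 0.00, 0.0],    # H1
                   [-0.24, 0.93, 0.0]])  # H2  (angstroms, illustrative geometry)

def bond_energy(r, r0=0.96, k=450.0):                     # E_spring: bond-length deformation
    return 0.5 * k * (r - r0) ** 2

def angle_energy(theta, t0=np.radians(104.5), k=55.0):    # E_stretch: bond-angle deformation
    return 0.5 * k * (theta - t0) ** 2

def lj_energy(r, eps=0.05, sigma=2.5):                    # E_vdW: Lennard-Jones 12-6
    return 4 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)

d01 = np.linalg.norm(coords[1] - coords[0])
d02 = np.linalg.norm(coords[2] - coords[0])
v1, v2 = coords[1] - coords[0], coords[2] - coords[0]
theta = np.arccos(v1 @ v2 / (d01 * d02))

E_pot = (bond_energy(d01) + bond_energy(d02) + angle_energy(theta)
         + lj_energy(np.linalg.norm(coords[2] - coords[1])))
print(f"E_pot = {E_pot:.3f} (arbitrary units)")
```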
Molecular dynamics

Molecular dynamics generates a trajectory, i.e., a set of molecular structures in time; each structure (a conformer) has a defined potential energy, kinetic energy and temperature. As a result, a list of conformations with various values of the potential energy is obtained, from which the low-energy conformations can be extracted.
Sequential search of low-energy conformations

Algorithms used for searching the conformational space (or torsional hyperspace) involve iterative sampling of conformations from the conformational space. The procedures include the following steps: generation of optimized initial structures, optimization of the geometry of these structures, and comparison of the optimized structures with the structures already stored in memory. Subsequently, a conformer structure that meets the stated conditions is added to the list of conformers. Advanced algorithms for searching the conformational space include two sub-procedures in their first step: selection of an initial structure from the list of conformers stored in memory, and introduction of a known perturbation to the initial structure in order to generate a new conformer structure.

4. Selected examples of studied chromatographic systems
Calculations were carried out on a personal computer equipped with a Pentium IV processor and the software packages HyperChem and CAChe, obtained from Hypercube (Canada) and FQS (Poland), respectively. The following enantioseparations were studied:
- separation of alkyl-substituted cyclohexanones by means of the complexation gas chromatographic technique [17],
- separation of α-amino acids by means of ligand-exchange chromatography (on a planar chromatographic bed) [18],
- separation of α-amino and α-hydroxy compounds by means of inclusion liquid chromatography on stationary phases containing chiral crown ethers [19].
The following computational procedures were applied: optimization of geometry, molecular dynamics, sequential searches for energy minima, and determination of three-dimensional potential energy surfaces as a function of two variable geometrical parameters of the studied molecular system. The potential energies of interaction were determined for the diastereomeric selector-selectand complexes of europium tris-β-camphorates with chiral ketones, L-hydroxyproline Cu(II) derivatives with α-amino acids, and chiral crown ethers with enantiomeric α-amino compounds. The obtained results indicate the possibility of applying molecular modeling to the prediction of enantiomeric chromatographic separations. A reasonable trend of the computational results vs. the experimental chromatographic retention data indicates that molecular modeling can be employed for describing and suggesting the separation mechanisms of chiral solutes in the studied chromatographic techniques. Furthermore, with the aid of molecular modeling it could be possible to improve the studied chromatographic separations, as well as to design new chiral stationary phases with superior enantioselectivity for the studied chiral analytes, and to gain an insight into the enantioseparation mechanisms.
References
1. V. Schurig, J. Chromatogr. A 441, 135 (1988).
2. W. A. König, The Practice of Enantiomer Separation by Capillary Gas Chromatography, Hüthig Verlag, Heidelberg (1987).
3. D. W. Armstrong, J. Liquid Chromatogr. 7 (S-2), 353 (1984).
4. K. L. Williams, L. C. Sonder, J. Chromatogr. A 785, 149 (1997).
5. S. G. Allenmark, Chromatographic Enantioseparations, Ellis Horwood Editions, Chichester, UK (1988).
6. N. N. Maier, P. Franco, W. Lindner, J. Chromatogr. A 906, 3 (2001).
7. V. Schurig, J. Chromatogr. A 906, 275 (2001).
8. L. W. Wainer, Trends Anal. Chem. 6, 125 (1987).
9. J. C. Giddings, Dynamics of Chromatography, Principles and Theory, Marcel Dekker, New York (1965).
10. M. C. Ringo, C. E. Evers, Anal. Chem. (News & Features), 316 (1998).
11. W. J. Kowalski, J. Nowak, M. Konior, Analytical Sciences (Tokyo) 17 Suppl., 757 (2001).
12. K. B. Lipkowitz, J. Chromatogr. A 906, 417 (2001).
13. D. F. de Tar, Biochemistry 20, 1710 (1981).
14. J. Aires-de-Sousa, J. Gasteiger, J. Molecular Graphics and Modelling 20 (5), 373 (2002).
15. A. Golbraikh, D. Bonchev, A. Tropsha, J. Chemical Information and Computer Sciences 41 (1), 147 (2001).
16. N. L. Allinger, Z. Q. Zu, K. H. Hen, J. Amer. Chem. Soc. 114, 6120 (1977).
17. W. J. Kowalski, J. Nowak, M. J. Maslankiewicz, Modeling of enantioseparations of ketones by chiral complexation gas chromatography, Annals Polish Chem. Soc., 236 (2001).
18. W. J. Kowalski, J. Nowak, Molecular modelling of planar chromatographic enantioseparation of amino acids on sorbents containing copper complexes of L-hydroxyproline derivatives, Proc. International Symposium on Planar Separations, Lillafüred, Hungary, 23-25 June 2001.
19. W. J. Kowalski, J. Nowak, M. Konior, A. Kozielec, Atomistic modeling of separation of enantiomeric amino acids on LC stationary phases containing chiral crown ethers, 14th International Symposium on Chirality, Hamburg, Germany, 8-12 September 2002.
PROBABILITY DISTRIBUTIONS OF VOLATILITY IN FINANCIAL TIME SERIES
M. I. KRIVORUCHENKO
Institute for Theoretical and Experimental Physics, B. Cheremushkinskaya 25, 117259 Moscow, Russia
E-mail: [email protected]

E. ALESSIO, V. FRAPPIETRO AND L. J. STRECKERT
Metronome-Ricerca sui Mercati Finanziari, C.so Vittorio Emanuele 84, 10121 Torino, Italy
E-mail: frappietro@metronome.it

The problem of constructing probability density functions (PDFs) of volatility is discussed. The first model we use is the simple Gaussian random walk; its volatility PDF is given in the form of a one-dimensional contour integral. The second model is based on joint multidimensional Student PDFs of returns. Such distributions are useful for the description of well-established deviations from the Gaussian random walk observed for financial time series, such as an approximate scaling of the PDFs of returns, heavy tails of the return distributions, return-volatility correlations and long-ranged volatility-volatility correlations. We fix the free parameters of the Gaussian and modified multidimensional Student PDFs of returns over a short-term period by fitting three to eight years of trade-by-trade quotes from the Eurex Bund, Bobl, DAX and EuroSTOXX futures contracts, and over the long term by fitting 100+ years of the Dow Jones 30 Industrial Average (DJIA) and 50+ years of the Standard & Poor's 500 (S&P 500) daily quotes. Two estimators are considered to quantify volatility, and short-term dynamic and long-term static PDFs are then constructed. The volatility distributions are compared with the historical ones of the Eurex Bund, Bobl, DAX and EuroSTOXX futures contracts and also with the historical distributions of the DJIA and S&P 500 indices.
The term volatility represents a generic measure of the magnitude of market fluctuations. It quantifies risk and enters as an input to virtually all option-pricing models. Although market participants talk of volatility, it is the variance, or volatility squared, that has the more fundamental theoretical significance. The variance σ²[R_i] of a time series {R_i} can be defined as follows:

σ²[R_i] = (1/n) Σ_{i=1}^{n} (R_i − R̄)²,   (1)

where R̄ = (1/n) Σ_{i=1}^{n} R_i. The value of n is referred to as the time window of the variance. The quotes R_i have the form R_i = Σ_{j≤i} Δ_j, where the Δ_i are the returns, such that Δ_i = R_i − R_{i−1}. The square root of the variance, σ[R_i], is called the volatility. PDFs of volatility can be constructed provided the PDFs of the returns Δ_i are known. We consider two such PDFs and derive, respectively, two PDFs of volatility. The first one corresponds to the Gaussian random walk; such a model is reasonable as a zero-order approximation. The second PDF takes into account the well-established deviations of the behavior of financial time series from the Gaussian random walk. We thus start from the simplest random walk model. The joint PDF of returns is given by

G_n(Δ) = (2π)^{−n/2} exp(−Δ²/2).   (2)
The vector Δ = (Δ_1, ..., Δ_n) describes a sequence of uncorrelated returns. According to Ref. 1, correlations are absent at time scales greater than 20 min. The absence of statistically significant correlations,

Corr(Δ_i, Δ_j) = 0,   (3)

has been widely documented (see e.g. Ref. 2) and is often cited as support for the efficient market hypothesis³. For (2), we have E(σ²[R_i]) = 1. Lévy stable truncated distributions are known to provide, for financial time series, (i) an approximate scaling invariance of the PDFs of returns with a slow convergence to the Gaussian behavior and (ii) the existence of "heavy tails" in the PDFs. We propose the n-dimensional Student PDFs
for modeling the joint PDFs of returns (Eq. (4)). It is not difficult to verify that a marginal PDF of the random vector Δ is again a multidimensional Student PDF: if we integrate out all of the Δ_i except one, we get (4) with n = 1. The tails behave empirically like dΔ/Δ⁴, and so α ≈ 3. The sum Θ = Σ_{i=1}^{n} Δ_i is described by

∫ δ(Θ − Σ_{i=1}^{n} Δ_i) S_n^E(Δ) dΔ dΘ = S_1^E(Θ/√n) dΘ/√n.   (5)

This is the exact scaling law. If all components of the vectors ξ = (ξ_1, ..., ξ_n) and η = (η_1, ..., η_α) are normally distributed with zero mean and unit variance, then the random vector Δ = ξ √α/|η| has the distribution (4). The value Θ can therefore be represented as

Θ = (ξ_1 + ξ_2 + ... + ξ_n) √α / |η|,   (6)

in which case Eq. (5) easily follows. The dispersion of Θ increases linearly with n, in agreement with the empirical observations. The multidimensional Student PDFs therefore have heavy tails and exact scaling invariance from the start. These distributions can be modified further to describe two other well-established stylized facts, which are (iii) long-ranged volatility-volatility correlations, also known as volatility clustering, and (iv) return-volatility correlations, also known as the leverage effect⁷,⁸. There are no correlations of returns in the PDFs (4). The volatility-volatility correlation is, however, a fundamental property. Using PDF (4), we obtain Eq. (3) and also the correlation of the absolute values of the returns, Eq. (7).
The absolute values of returns correlate positively. The correlation coefficient does not depend on the time lag τ = j − i > 0. The empirical facts show, however, a slow decay of the correlation coefficient. The PDFs (4) are nevertheless an excellent zero-order approximation for modeling joint PDFs of returns. Notice that Corr(|Δ_i|, |Δ_j|) = 0.23. An extension of the PDF (4) in which the coefficient Corr(|Δ_i|^r, |Δ_j|^r) decays with time is rather straightforward. From the representation (6) it is clear that the long-ranged correlations occur because the denominator is common to all the increments ξ_i. In order to provide a decay of the correlations, it is sufficient to use different η's for different groups of the ξ_i's. The analogy with the Ising model is useful here. Groups of increments ξ_i with the same denominator can be treated as domains of spins aligned in the same direction. We assign the usual probability to every such configuration: w[σ_1, ..., σ_n] = N exp(−β Σ_{i=1}^{n−1} σ_i σ_{i+1}), where σ_i = ±1. The normalization constant is given by 1/N = 2(2 cosh β)^{n−1}. The correlation of the absolute values of the returns equals (7) provided that Δ_i and Δ_j belong to the same domain, and zero otherwise. The probability that Δ_i and Δ_j lie within the same domain is found to be W(τ) = cosh(βτ)/(2 cosh β)^τ. The coefficient Corr(|Δ_i|^r, |Δ_j|^r) for the modified multidimensional Student PDF therefore has the form of Eq. (7) multiplied by W(τ). Notice that W(τ ≪ 1/β) ≈ 1 and W(τ ≫ 1/β) ≈ (1/2) exp(−γτ), where γ = ln(1 + e^{−2β}) > 0. Empirically, γ ≈ 1/year in the appropriate units. The free parameters of the modified n-dimensional Student PDFs can be fixed over a short-term period by fitting three to eight years of trade-by-trade quotes from the Eurex Bund, Bobl, DAX, and EuroSTOXX futures contracts, and over the long term by fitting 100+ years of the DJIA and 50+ years of the S&P 500 daily quotes. The joint PDFs of returns are then used for probabilistic forecasting of volatility. The existence of historical data makes it possible to check the proposed models quantitatively. The volatility distributions are compared with the historical ones of the Eurex Bund, Bobl, DAX, and EuroSTOXX futures contracts and with the historical distributions for the DJIA and S&P 500 indices. Cycles in the financial time series are identified using the Fisher statistical criterion and considered, along with trends, as causal components of the time evolution.
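To illustrate the construction numerically, the Python sketch below draws returns from the Student representation Δ_i = ξ_i √α/|η| with α = 3, forms quotes, and computes the windowed variance of Eq. (1). The window length and sample size are arbitrary choices, and drawing an independent η for every return reproduces only the marginal Student distribution, not the joint dependence (and hence not the volatility clustering) of Eq. (4); it is a toy illustration, not the authors' fitting procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, n_returns, window = 3, 100_000, 25   # tail index, sample size, time window (assumed)

# Student-distributed returns via Delta = xi * sqrt(alpha) / |eta|,
# with xi standard normal and eta an alpha-dimensional standard normal vector.
xi = rng.standard_normal(n_returns)
eta = rng.standard_normal((n_returns, alpha))
returns = xi * np.sqrt(alpha) / np.linalg.norm(eta, axis=1)

quotes = np.cumsum(returns)                 # R_i as the cumulative sum of returns

# Windowed variance of Eq. (1): sigma^2 = (1/n) sum (R_i - Rbar)^2 over each window
R = quotes[: (len(quotes) // window) * window].reshape(-1, window)
variance = ((R - R.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)
volatility = np.sqrt(variance)              # square root of the variance

print(volatility[:5])
```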
References
1. P. Gopikrishnan et al., Phys. Rev. E 60, 5305 (1999).
2. A. Pagan, J. Empirical Finance 3, 15 (1996).
3. E. F. Fama, Journal of Finance 25, 383 (1970).
4. R. N. Mantegna and H. E. Stanley, Nature 376, 46 (1995).
5. V. S. Korolyuk, N. I. Portenko, A. V. Skorokhod, A. F. Turbin, Handbook on Probability Theory and Mathematical Statistics, Nauka, Moscow, 1985.
6. Z. Ding, C. W. J. Granger and R. F. Engle, J. Empirical Finance 1, 83 (1993).
7. F. Black, Proceedings of the 1976 American Statistical Association, Business and Economic Statistics Section, p. 177.
8. J. C. Cox and S. A. Ross, J. Fin. Econ. 3, 145 (1976).
9. R. A. Fisher, Proc. Roy. Soc. (London) A125, 54 (1929).
KINETIC SOLUTION OF THE BOLTZMANN-PEIERLS EQUATION
MATTHIAS KUNIK, SHAMSUL QAMAR AND GERALD WARNECKE
Institute for Analysis and Numerics, Otto-von-Guericke University, PSF 4120, D-39106 Magdeburg, Germany
This paper is concerned with the solutions of initial value problems of the Boltzmann-Peierls equation (BPE). This integro-differential equation describes the evolution of heat in crystalline solids at very low temperatures. The BPE describes the evolution of the phase density of a phonon gas. The corresponding entropy density is given by the entropy density of a Bose gas. We derive a reduced three-dimensional kinetic equation which has a much simpler structure than the original BPE. Using special coordinates in the one-dimensional case, we can perform a further reduction of the kinetic equation. Making a one-dimensionality assumption on the initial phase density, one can show that this property is preserved for all later times. We derive kinetic schemes for the kinetic equation as well as for the derived moment systems. Several numerical test cases are shown in order to validate the theory.
1. Introduction
In 1929, Peierls proposed his celebrated theoretical model based on the Boltzmann equation. According to him, the lattice vibrations responsible for the heat transport can be described as an interacting gas of phonons. The Boltzmann-Peierls approach is one of the milestones of the theory of thermal transport in solids, especially at very low temperatures. It is important to mention that the Fourier theory of heat flow fails to describe heat conduction processes at low temperatures, see for example Dreyer and Struchtrup and references therein. Dreyer, Herrmann and Kunik have used a kinetic scheme in order to solve the microscopically two-dimensional Boltzmann-Peierls equation (BPE). In this paper we present the kinetic solutions of the microscopically three-dimensional BPE. In order to solve the one-dimensional discrete form of the BPE numerically, we utilize the idea that we automatically obtain
kinetic flux vector splitting under a CFL condition. Flux vector splitting is a technique for achieving an upwinding bias in the numerical flux function, which is a natural consequence of regarding a fluid as an ensemble of particles. Since particles can move forward or backward, this automatically splits the fluxes of energy and heat flux into forward and backward fluxes within each cell. The initial data to the scheme is a discrete matrix of the phase density in phase-space. The present scheme is more efficient and faster than the scheme used in Ref. 3, because the kinetic scheme used there is discrete in time but continuous in space, and therefore an interpolation polynomial was needed in order to calculate the free-flight phase density. The Boltzmann-Peierls equation is a kinetic equation for the phase density of phonons. This equation describes the evolution of the phase density f(t, x, k), where f(t, x, k) d³x d³k is interpreted as the number of phonons at time t in an infinitesimally small phase cell element d³x d³k centered at (x, k). Here ħk denotes the momentum, k the phonon wave vector, and ħ is Planck's constant; see the references for further details. The microscopically three-dimensional Boltzmann-Peierls equation can be written as

∂f/∂t + c (k_i/|k|) ∂f/∂x_i = C,   (1)

where c is the Debye constant, time is denoted by t, and the quantity C is the collision operator which will be defined below.
The moments of the phase density f reflect the kinetic processes on the scale of continuum physics. The most important moments are

e(t, x) = ħc ∫ |k| f(t, x, k) d³k,   (2)

together with analogous integrals (3) defining the heat flux and the momentum flux.
The fields e, Q = (Q₁, Q₂, Q₃) and the matrix N = (N_ij) are the energy density, heat flux and momentum flux, respectively. Phonons are classified as Bose particles, and the corresponding entropy density-entropy flux pair (h, φ) is given by the Bose-gas expressions of Refs. 4, 5; the constant y appearing there is specified in Ref. 4.
In contrast to ordinary gas atoms, the phonons may interact by two different collision processes, called R- and N-processes. R-processes include interactions of phonons with lattice impurities which destroy the periodicity of the crystal, while N-processes can be interpreted as phonon-phonon interactions which are due to deviations from harmonicity of the crystal forces. N-processes conserve both energy and momentum, while R-processes only conserve energy. The Callaway approximation of the collision operator is a suitable simplification of the actual interaction processes. The Callaway collision operator is written as the sum of two relaxation operators modelling the R- and N-processes separately. We write

C(f) = (1/τ_R)(P_R f − f) + (1/τ_N)(P_N f − f).
The positive constants τ_R and τ_N are the relaxation times, while P_R and P_N are two nonlinear projectors. Here P_R f and P_N f represent the phase densities in the limiting case when the relaxation time tends to zero. Explicitly, we define P_R f and P_N f as the solutions of two optimization problems, namely the maximization of the entropy density under the constraint that P_R f reproduces e(f), and that P_N f reproduces both e(f) and Q(f),
where e(f), Q(f) are given by (2),(3).
The maximization problems can be solved by means of Lagrange multipliers, and we get explicit expressions for P_R f and P_N f in terms of

C_R(t, x, k) = ħc |k| Λ_e^R(t, x),   (11)

C_N(t, x, k) = ħc |k| Λ_e^N(t, x) + ħ k_i Λ_{Q_i}^N(t, x),   (12)

where Λ_e^R, Λ_e^N and Λ_{Q_i}^N are the multipliers. From (8) and (9) the Lagrange multipliers can be calculated explicitly; see Refs. 2, 3.
Four-field system: When the thermodynamic state is described by the four fields e and Q_i only, we can derive the following four-field system from the Boltzmann-Peierls equation (1) and the maximum entropy principle (see Ref. 3):

∂e/∂t + ∂Q_i/∂x_i = 0,   (15)

together with balance equations (16) for the heat flux Q_i, in which χ is the so-called Eddington factor. Note that in equations (16) the τ_N terms do not appear on the right-hand side; therefore the applicability of these equations is restricted to the relaxation limit τ_N → 0.
Now we give a short overview of this paper. We have derived a kinetic scheme for the solution of the reduced Boltzmann-Peierls equation as well as for the hyperbolic four-field system. The first reduction of the BPE reduces its moment integrals to surface integrals over the unit sphere. This reduction can be obtained without further assumptions on the initial data and was studied in Ref. 3. In that paper only the microscopically two-dimensional BPE was studied, while we are dealing with the microscopically three-dimensional BPE. Moreover, we have obtained a second reduction of the already reduced BPE, which looks much simpler than the first one but requires the additional assumption of a one-dimensional flow. Using special coordinates which are adapted to this one-dimensional flow, we can reduce the surface integrals for the moments to simple one-fold integrals ranging over the compact interval from -1 to 1. We may summarize the following three main contributions to this theory of the BPE. The first one is the kinetic solution of the microscopically three-dimensional BPE, while previously only the microscopically two-dimensional BPE was solved, see Ref. 3. The second contribution is the use of special coordinates in the macroscopically one-dimensional case in order to perform a further reduction of the kinetic equation. We show by three lemmas that this reduction is valid for all times. The third contribution is the use of a kinetic flux-vector splitting scheme as a numerical approximation. We use the average values of the initial phase density over phase-space cells as initial data for the scheme. This scheme is much faster than the kinetic scheme used in Ref. 3, which is discrete in time but continuous in space.
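The following minimal sketch (not the authors' scheme; geometry, parameter values and boundary conditions are assumed) illustrates the two ingredients just described for a toy one-dimensional problem: an upwind flux-vector-splitting free-flight step under a CFL condition, followed by a Callaway-type relaxation toward an energy-conserving isotropic projection:

```python
import numpy as np

nx, nmu = 200, 16
c, tau_R = 1.0, 0.05                               # Debye speed and R-relaxation time (assumed)
x = np.linspace(0.0, 1.0, nx, endpoint=False)
dx = x[1] - x[0]
mu, w = np.polynomial.legendre.leggauss(nmu)       # directions and quadrature weights on [-1, 1]

# initial phase density: a heat pulse, isotropic in the direction variable mu
f = np.exp(-200.0 * (x - 0.5) ** 2)[None, :] * np.ones((nmu, 1))

dt = 0.5 * dx / c                                  # CFL condition
for _ in range(200):
    # flux-vector splitting: right-movers use backward, left-movers forward differences (periodic)
    flux = c * mu[:, None] * f
    df = np.where(mu[:, None] > 0,
                  flux - np.roll(flux, 1, axis=1),
                  np.roll(flux, -1, axis=1) - flux)
    f = f - dt / dx * df
    # Callaway-type R-relaxation toward the isotropic projection (conserves the energy moment)
    P_R = 0.5 * np.einsum('m,mx->x', w, f)[None, :]
    f = f + dt / tau_R * (P_R - f)

energy = 0.5 * np.einsum('m,mx->x', w, f)          # energy-like moment e(x)
print(energy.sum() * dx)                           # total energy is conserved up to round-off
```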
References
1. J. Callaway, "Quantum theory of the solid state", Academic Press, San Diego, 1991.
2. W. Dreyer and M. Kunik, "Initial and boundary value problems of hyperbolic heat conduction", Cont. Mech. Thermodyn. 11 (1999), pp. 227-245.
3. W. Dreyer, M. Herrmann and M. Kunik, "Kinetic solutions of the Boltzmann-Peierls equation and its moment systems", WIAS-Preprint No. 709, Berlin (2001).
4. W. Dreyer and H. Struchtrup, "Heat pulse experiments revisited", Cont. Mech. Thermodyn. 5 (1993), pp. 1-50.
5. R. E. Peierls, "Quantum Theory of Solids", Oxford University Press, London, 1955.
COMPUTER AIDED ENGINEERING FOR THEORETICAL STUDIES OF VEHICLE ACTIVE SUSPENSION SYSTEMS P.V. KYRATSIS+, D.A. PANAGIOTOPOULOS' West Macedonia Technological Institution of Education Kila, GR 50100, Kozani, Greece E-mail: pl;vn~ttsis~u~tcc,9r, c/panm(ici.liozani.teikoz.rr/. +
D.V. KAKOGIANNIS Hellenic Petroleum S.A. Thessaloniki Industrial Complex P.O. Box 10044, GR-541 10 Thessaloniki, Greece E-mail: d.17.k i i k o ~ i i i n n i s ~ ~ ~ , h e l l e i i i e - ~ ~ t ~ t ~ ( ~ l ~ i i n i , ~ r Active suspension systems offer significant advantages over passive systems. The high power necessary to operate conventional active systems is leading to further investigation for low energy consuming systems. The present work is a theoretical study of a low energy active suspension system. A half vehicle model is developed in order to simulate the active suspension system based on the change of its mechanical lever ratio. The degrees of freedom used are enough to simulate accurately real road conditions, whilst at the same time the models are not very complicated. Due to the horizontal motion of the spring involved, the energy consumption is much lower than in conventional systems, in which actuators are primary suspension elements. Non-linear models are included. These were written in AUTOSIM, which was used to create programs in FORTRAN. The elimination of roll in manoeuvring, by appropriate movement of the actuators, created problems of discomfort and a conflict of these two factors is investigated. When roll is controlled, the control law in order to avoid jacking of the vehicle is defined and different lay-outs are examined. The work is contributing to the good design of a low energy automotive active suspension, mechanically compatible with contemporary systems.
1. Extended Abstract
1.1. Introduction
The design of automotive suspension system involves a number of compromises. There is a conflict between suspensions that must appear soft in order to achieve a good level of comfort and suspensions that must appear stiff in order to control the vehicle attitude changes and maintain good tire to ground
contact. Usually the car suspensions are passive and contain elements like springs and dampers, the properties of which cannot be changed according to driving conditions. A good suspension design for cars requires an effective compromise between: passenger ride comfort, tyre/ground contact force variations, suspension working space, vehicle attitude control, capital cost, power consumption, reliability, maintainability and durability, component weight and noise transmission, advertising potential, etc. In order to achieve better handling and ride performance of the suspension system, a force actuator may be included in the wheel suspension. In addition to the actuator, sensor and power supply, a control computer with the appropriate software is essential.

1.2. Controlling the Vehicle Behaviour

If roll is eliminated when a vehicle is cornering, the driver is not visually impaired by a tilted horizon as he would be with a rolling vehicle. At the same time the absence of body roll ensures the driver a steady seating position. In addition to the ride, the handling can be improved because of the better attitude of the tyre contact with the road. This road contact generates the side forces. These are strongly influenced by the load transfer ratio between the two sides of the vehicle. Similar problems are caused for the driver when the vehicle accelerates or decelerates. In this situation the load transfer between the front and rear axles makes a lot of difference to the way that the tyres generate the longitudinal forces, and view problems are imposed on the driver. For these reasons a lot of research has been carried out in order to reduce the conflicting criteria of the suspension design, and the main concepts are presented. The present work is a step forward in a new direction of ideas for the active control of the vehicle behaviour. The main advantage of this system is its simplicity and the low fuel consumption that is necessary for its operation. A feasibility study of this idea was conducted during the research of this work.

1.3. Concept of the Low Energy Active Suspension System
Examining the case of a swing axle suspension, it appears that there is a difference between the wheel rate and the spring rate. Figure 1 represents such a suspension, for which a simple expression can be derived between the rates. This involves the mechanical lever ratio squared:

K_wheel = K_spring · (l1/l2)²
Figure 1. Simplified representation of the mechanical lever ratio of a swing axle suspension, where K_wheel is the suspension wheel rate, K_spring the suspension spring rate and l1/l2 the suspension mechanical leverage ratio.

It can be seen that by changing the position of the spring, the wheel rate changes. An increase of the length l1, for the same properties of the spring and amount of mass, increases K_wheel and, due to that, the natural frequency of the body. This means a stiffer suspension and better handling but reduced comfort because of greater body acceleration. The displacement of the preloaded spring in theory does not need much power because the direction of the force is perpendicular to the displacement.
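As a small worked illustration (the spring rate and corner mass below are assumed values, not taken from the paper), the wheel rate and the resulting body natural frequency follow directly from the lever-ratio formula:

```python
import math

def wheel_rate(k_spring, l1, l2):
    """Wheel rate of a swing-axle suspension: K_wheel = K_spring * (l1/l2)**2."""
    return k_spring * (l1 / l2) ** 2

def body_natural_frequency(k_wheel, corner_mass):
    """Undamped natural frequency (Hz) of the sprung mass over one wheel."""
    return math.sqrt(k_wheel / corner_mass) / (2.0 * math.pi)

# Illustrative (assumed) numbers: a 20 kN/m spring and a 300 kg corner mass.
k_spring, m = 20_000.0, 300.0
for l1_over_l2 in (0.6, 0.8, 1.0):
    kw = wheel_rate(k_spring, l1_over_l2, 1.0)
    print(l1_over_l2, kw, round(body_natural_frequency(kw, m), 2))
```

Moving the spring outward (larger l1/l2) raises the wheel rate and hence the body natural frequency, which is exactly the stiffness trade-off described above.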
1.4. Vehicle Dynamics Model Techniques & Tools

To achieve numerical simulation of vehicle dynamics, non-linear differential equations should be integrated over a small time step. There are three ways of doing these kinds of studies: a) derive the equations of motion 'by hand' and then code them in a special purpose computer program, b) use a generalised numerical analysis program to set up and solve the differential equations, and c) use a symbolic multibody analysis program that can derive the equations of motion automatically and then generate special purpose code for the specific system of concern. There is a difference in the run efficiency of the programs created 'by hand' and the general purpose programs available. The former seem to run faster, but the time for development, debugging, validation of the code and preparation of the documentation is much greater. This leads to the third solution for simulating vehicle dynamics phenomena. In this direction, the use of AUTOSIM is crucial. AUTOSIM is an extension of the computer language COMMON LISP and can operate on most computer operating systems. It contains a lot of functions and macros in extensive libraries and a set of new data types. As AUTOSIM is an extension of LISP, it is extendible itself too. So it is possible to create more functions, import them to the environment and use them when necessary.
1.5. Discussion
Many models were developed throughout the project, from the simpler one (MATLAB) up to the more sophisticated (AUTOSIM), in order to define and conclude upon the efficiency of such a low-energy active suspension system. The MATLAB model proved to be a good start in investigating the concept of changing the mechanical lever ratio in order to control the roll of a vehicle suspension. It was based on a detailed analysis of the body motion and was simplified in order to obtain the solution more easily. Later it was proved that its accuracy was satisfactory. In order to proceed with more accurate models, the use of AUTOSIM was crucial. AUTOSIM, with its operational structure, helped the development of the project by producing FORTRAN code to simulate the suspension behaviour.

1.6. Conclusions
The concept of controlling the roll of the vehicle in a steady state turn proved to be feasible. The jacking effect created was eliminated after the last control scheme was developed. Using the initial suspension configuration (without the circular track) the limitation of the specific layout was defined. The conflict between the increase of the leverage ratio and the loss of the suspension spring preload was established analytically. This led to the development of the circular track layout, which increases the leverage ratio of the suspension without losing the spring preload. For both suspension layouts different ways of control were tested. It is thought that the low energy feature of the suspension remained even when changing the layout of the suspension by adding the circular track. The energy required would be higher than in the case without circular track due to the effects of geometry in the horizontal forces developed. FORTRAN was used to simulate the low-energy active suspension system. This had the advantage of keeping the non-linearity of the system. This nonlinearity was established analytically using sinusoidal steering input and plotting the jacking PSD and roll PSD waterfall diagrams of the passive vehicle model. These diagrams showed the interference between roll and bounce vibrations. The jacking response fiequency was double that of the steering input excitation fiequency. At the same time, in the area of the roll resonance, roll was affecting jacking. Further to modelling a swing axle suspension, the position of the roll centre on the road surface permitted the simulation of a general suspension layout.
FOUR-STEP, TWO-STAGE, SIXTH-ORDER, P-STABLE METHODS
M. LAMBIRIS, CH. TSITOURAS* AND K. EVMORFOPOULOS
Dept. of Applied Sciences, TEI of Chalkis, GR-34400 Psahna, GREECE
An implicit four-step, sixth-order, P-stable method for initial value problems of the form y″ = f(x, y) is suggested. It is recommended for systems with stiff oscillatory solutions. Only two stages are required per step, instead of the three stages of the methods found so far in the literature.
1. The problem and the methods
Consider the special second-order initial value problem

y″ = f(x, y),   y(x₀) = y₀,   y′(x₀) = y₀′,   (1)

which is of continuous interest in many fields of science and engineering. Numerical methods for solving (1) produce approximations yᵢ ≈ y(x₀ + ih), i = 1, 2, ..., over a set of points. There are various types of popular methods that integrate (1) numerically, such as Runge-Kutta-Nystrom (RKN) [3,8] or Stormer-Cowell (SC) [4,9] methods. The Numerov (NU) formula [4, pg. 464] is the classical two-step representative of implicit SC methods. Their coefficients can be derived using common interpolatory techniques. Many authors use modifications of NU methods [1,2,12] in order to achieve special characteristics for their suggestions. The classical way is to use off-step nodes. This procedure can be applied to four-step methods as well [5,10]. The new four-step formula we propose for approximating y₄ = y(x₀ + 4h), while y₀, y₁, y₂ and y₃ are given, is

y₀ − (9688/2425) y₁ + (14526/2425) y₂ − (9688/2425) y₃ + y₄ = h² (…),   (2)

with off-step node

ȳ = … .   (3)

* Corresponding author. E-mail: [email protected], URL: http://users.ntua.gr/tsitoura/
It is obvious that two function evaluations are needed every step, namely ȳ″ and y₄″.

2. Zero Stability and Accuracy
The method (2)-(3) is zero stable, since the polynomial

ρ(x) = 1 − (9688/2425)x + (14526/2425)x² − (9688/2425)x³ + x⁴

has four roots on the unit circle and only two of them are equal to 1. There are no stability requirements for ȳ. The method is of sixth order of accuracy, since y₄ = y(x₀ + 4h) + O(h⁸). To verify this, observe first that ȳ = y(x₀ + 2h) + O(h⁶). Taking this into account, we may expand (2) in a Taylor series to get the desired result. We may consider the new method as a symmetric one, because ȳ″ can be grouped with y₂″ in (2).
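As a quick numerical check (a sketch added here, not part of the original paper), the root condition of this polynomial can be verified directly:

```python
import numpy as np

# rho(x) = x^4 - (9688/2425)x^3 + (14526/2425)x^2 - (9688/2425)x + 1, highest degree first
coeffs = [1.0, -9688/2425, 14526/2425, -9688/2425, 1.0]
roots = np.roots(coeffs)
print(roots)
print(np.abs(roots))   # all moduli equal 1: x = 1 is a double root, the remaining pair is complex conjugate
```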
3. P-stability
Lambert and Watson [6] used the test equation

y″ = −λ² y,   (4)

for introducing the concept of the so-called interval of periodicity. Especially when applying a symmetric four-step numerical method, like the one introduced here by formulas (2)-(3), to the problem (4), we obtain a difference equation with characteristic equation of the form

p₁(v²) + p₂(v²)·x + p₃(v²)·x² + p₂(v²)·x³ + p₁(v²)·x⁴ = 0,   (5)

where v = λh and p₁(v²), p₂(v²), p₃(v²) are polynomials in v². The method considered here is said to have an interval of periodicity (0, v₀²) if for all 0 < v < v₀ the roots r₁(v), r₂(v), r₃(v), r₄(v) of (5) satisfy |r₁| = |r₂| = 1, |r₃| ≤ 1, |r₄| ≤ 1. A method is said to be P-stable if its interval of periodicity is (0, ∞). It is obligatory for such methods to be implicit. Many authors have constructed P-stable methods. Papageorgiou et al. gave a fifth-order RKN method [7]. Cash [1] and Chawla and Rao [2] derived sixth-order hybrid Numerov methods, while Simos and Tsitouras [12] were the first to produce eighth-order methods of this type. Jain et al. [5] constructed an implicit, P-stable, four-step method with three stages per step. Its local truncation error is LTE = 0.00018·h⁸ + O(h⁹),
while the truncation error of the new method is LTE = 0.00464·h⁸ + O(h⁹). Considering the efficiency of those methods as in [11], Efficiency = #stages · PTE^(1/8), where PTE is the principal term of the local truncation error, we may conclude that both methods are of comparable efficiency. The method derived in [5] was a two-parameter modification of a sixth-order implicit method by Lambert and Watson [6]. Our new method has five free parameters even though it requires only two stages. This is due to full exploitation of all possible coefficients appearing in a two-stage method. So it is believed that a better method could be derived by a comprehensive search.
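For instance, evaluating this efficiency measure with the two truncation-error constants quoted above (a small check added for illustration only) gives nearly identical values:

```python
# Efficiency = #stages * PTE**(1/8), using the principal truncation error constants quoted above
jain = 3 * 0.00018 ** 0.125    # three-stage method of Jain et al.
new  = 2 * 0.00464 ** 0.125    # the two-stage method proposed here
print(round(jain, 3), round(new, 3))   # both come out close to 1.02
```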
References
1. J. R. Cash, Numer. Math. 37, 355 (1981).
2. M. M. Chawla and P. S. Rao, IMA J. Numer. Anal. 5, 215 (1985).
3. E. Hairer and G. Wanner, Numer. Math. 25, 383 (1976).
4. E. Hairer, S. Norsett and G. Wanner, Solving Ordinary Differential Equations I (2nd ed.), Springer-Verlag, Berlin, 1993.
5. R. K. Jain, N. S. Kambo and R. Goel, IMA J. Numer. Anal. 4, 117 (1984).
6. J. D. Lambert and I. A. Watson, J. IMA 18, 189 (1976).
7. G. Papageorgiou, I. Th. Famelis and Ch. Tsitouras, Numer. Algorithms 17, 345 (1998).
8. S. N. Papakostas and Ch. Tsitouras, SIAM J. Sci. Comput. 21, 747 (1999).
9. G. D. Quinlan and S. Tremaine, Astron. J. 100, 1694 (1990).
10. A. D. Raptis and T. E. Simos, BIT 31, 160 (1991).
11. L. F. Shampine, Math. Comput. 46, 135 (1986).
12. T. E. Simos and Ch. Tsitouras, J. Comput. Phys. 130, 123 (1997).
BINARY AND MULTICATEGORY CLASSIFICATION ACCURACY OF THE LSA MACHINE
GEORGIOS LAPPAS
a) University of Hertfordshire, Computer Science Dept., Hatfield, Herts AL10 9AB, UK
b) Technological Educational Institute (TEI) of Western Macedonia, Dept. of Public Relations and Communication, Kastoria Campus, P.O. Box 30, Kastoria, Greece. Email: [email protected]
VIVIAN AMBROSIADOU
Medical Informatics Laboratory, Aristotle University of Thessaloniki, Thessaloniki, Greece
The LSA machine is an effective method for predicting a class from linearly separable data. The LSA machine is based on the combination of Logarithmic Simulated Annealing with the Perceptron Algorithm. In this paper we present and compare the classification accuracy of the LSA machine on two medical databases: a) the Wisconsin Breast Cancer Database, which is a binary database with two associated classes, and b) the Diabetic Patient Management Database, which is a multicategory database with four associated classes. Many researchers use the Wisconsin Breast Cancer Database (WBCD) as a benchmark database for testing their systems. The WBCD database consists of 699 samples with 9 input attributes. The LSA machine is trained on 50% and 75% of the entire dataset and in both cases we obtain a classification accuracy of 98.8% on the remaining samples. This classification accuracy on the test set of samples is, to the best of our knowledge, the highest reported in the literature. The Diabetic Patient Management database consists of 746 samples with 18 input values and an associated class label denoting one of four treatments for the patient. For comparison reasons the LSA machine is trained on 646 samples of the database, obtaining a stable classification accuracy over 74% for all four classes, with a highest classification accuracy of 87%.
1. Introduction
In many problems in medicine the classification task is the prediction of a certain class based on known patient characteristics. This class usually denotes a certain type of disease as in [17], or a medical treatment as in [9]. An accurate classification rate allows the physician and the patient to take better treatment decisions. The LSA machine, introduced by Albrecht and Wong [8], is an implementation of a learning algorithm that derives from the combination of the Logarithmic Simulated Annealing algorithm [1], [15], with the classical perceptron algorithm [18], [22]. Simulated Annealing is nowadays a popular active research area [23]. The simulated annealing method is a
feasible method that can successfully handle NP-hard problems, and is the method chosen in LSA for the optimization strategy. The main idea of the LSA Machine is to use a logarithmic cooling schedule to control the unrestricted increase of the classification error on training samples caused by the Perceptron algorithm [3]. The search is guided by logarithmic simulated annealing (LSA), while the neighbourhood is defined by the Perceptron algorithm. Various modifications of the LSA Machine have been applied to classify image data (CT image classification) [3], [5], [6], and to gene-expression data analysis [4], [7]. In this work we present and compare the classification accuracy of the LSA machine in a popular binary classification domain [17], tackled by many researchers [2], [10], [11], [12], [14], [16], [19], [20], [21], [24], [25], [26], [27], [28], and in a multicategory classification medical database [9].

2. Methods
In this work, the core of the LSA Machine is based on depth-two threshold circuits. The input gates calculate hypotheses of the type f(x) = Σᵢ₌₁ⁿ wᵢ xᵢ ≥ θ, where n is the number of input attributes of the domain, wᵢ and θ are the input weights and the threshold value of the perceptron, respectively, calculated by the perceptron algorithm [22], and xᵢ is the input value of attribute i. The first layer of the depth-two circuits of the neural network is trained over randomly selected sample sets of the training set. Independent hypotheses are calculated. After training the perceptrons of the first layer, the testing set of unseen examples is applied to the entire network. The outputs at the first layer are collected and the class decision is finally denoted at the output of the second layer. Combining more than one depth-two network we can produce a depth-three network. The training of the first-layer perceptrons is the most time consuming part of the LSA machine. Simulated Annealing, to be explicitly defined [1], requires: a configuration space that defines the search space, an objective function that defines the function to be optimized either by maximizing or by minimizing it, a transition mechanism that generates our new hypothesis to be examined and defines the acceptance criteria for the new hypotheses, and a cooling schedule which controls the annealing procedure. The objective function is the number of misclassified examples calculated by each perceptron from the sample set, which defines the configuration space of the method. To compute our next hypotheses, the first layer of the circuit is computed by a combination of the Perceptron algorithm and Logarithmic Simulated Annealing, with a heuristic of choosing the elements that are far away from being correctly classified. These elements are assigned higher probability for being our next hypotheses. A new hypothesis is accepted if one
of the following happens: a) it produces a lower classification error of the objective function, or b) it produces a higher classification error of the objective function and, at the current annealing temperature, a uniformly randomly selected sample p ∈ [0, 1] is greater than

exp( −(o(w_k) − o(w_{k−1})) / t(k) ),

where o(w_k), o(w_{k−1}) are the objective function values of hypotheses k and k−1, and t(k) is the annealing temperature of the logarithmic cooling scheme. The logarithmic cooling scheme of the LSA machine is based on Hajek's theorem [13]. Applying this feature to the LSA Machine, we manage to use inhomogeneous Markov chains of finite length to restrict the classification error. The learning method in the LSA machine requires that each perceptron is trained on a randomly selected training set. The LSA machine introduced a new method to compute the threshold circuits by performing an Epicurean-style learning procedure, where several independent hypotheses are calculated from randomly chosen subsets of the total training samples. Each threshold function is calculated from a random selection of positive
|S_pos| = α·|T_pos| and negative |S_neg| = β·|T_neg| samples out of the entire training set T, where α, β ∈ [0, 1]. The issues under investigation concern the number of examples used for training and testing and the choice of α and β. The values of α and β denote the number of random examples that each perceptron will be trained to learn, desirably with zero or minimum error. These parameters are also investigated in this work, as the quality of the results depends on the choice of α and β.
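The following is a minimal illustrative sketch (not the authors' implementation) of the interplay just described: perceptron-style correction steps are proposed as neighbours and accepted or rejected under a logarithmic cooling schedule with the usual Metropolis-type rule. The data, the cooling constant and the stopping rule are assumptions, and the Epicurean sampling of α- and β-fractions of the positive and negative examples is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

def misclassified(w, X, y):
    """Objective function o(w): number of misclassified samples for the threshold unit sign(X @ w)."""
    return int(np.sum(np.sign(X @ w) != y))

def lsa_perceptron(X, y, steps=2000, gamma=2.0):
    """Perceptron moves accepted under a logarithmic cooling schedule t(k) = gamma / log(k + 1)."""
    w = rng.standard_normal(X.shape[1])
    best_w, best_err = w.copy(), misclassified(w, X, y)
    for k in range(1, steps + 1):
        t = gamma / np.log(k + 1)                    # logarithmic cooling temperature
        wrong = np.flatnonzero(np.sign(X @ w) != y)
        if wrong.size == 0:
            return w
        i = rng.choice(wrong)                        # neighbourhood step = perceptron correction
        w_new = w + y[i] * X[i]
        delta = misclassified(w_new, X, y) - misclassified(w, X, y)
        if delta <= 0 or rng.uniform() < np.exp(-delta / t):   # Metropolis-type acceptance
            w = w_new
        if misclassified(w, X, y) < best_err:
            best_w, best_err = w.copy(), misclassified(w, X, y)
    return best_w

# toy linearly separable data with a bias column appended
X = np.c_[rng.standard_normal((200, 9)), np.ones(200)]
y = np.sign(X @ rng.standard_normal(10))
w = lsa_perceptron(X, y)
print(misclassified(w, X, y))
```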
Experimental Results
We applied this method to two different medical domains. The Diabetic Patient Management Domain [9], which is a multiclassification domain, and the Winskonsin Breast Cancer Database (WBCD) [17], which is a binary domain. The data used for determining insulin regime specification, were provided at [9]. Insulin regimens and dose adjustment are prescribed by diabetologists depending on a number of factors such as diabetes type, patient age, activity during the day and control targets. There are 4 most widely used insulin regimens, called here regimel, regime2, regime3 and regime4. The data were compiled by interviewing diabetes experts in the United Kingdom and Greece. Subsequently, a questionnaire was prepared and was sent to three diabetic departments of UK and fourteen diabetological centers in Greece. The questionnaire was asking for the parameters that were necessary in order to
343 decide for each specific insulin regime. In this work Ambrosiadou et a1 used in their neural network approach 746 cases divided into 646 samples used for training and 100 samples used for testing the performance. The reported classification results for the 100 testing samples had a total classification accuracy of 58%, which is divided to 33%, 83%, 58% and 14% for regime 1, regime 2, regime 3 and regime 4 respectively. In our approach for multicategory classification purposes, again 646 samples were used for training and 100 samples were used for testing. Testing the system we obtained stable results over 74% of correct classification for all classes. The best-obtained result was 87%. The Winsconsin Breast Cancer Database (WBCD) can be found at the UCI [17]. The Repository http:ilwww.ics.uci.edu/--mlcarn/MLRepositoiy.h~ml WBCD database is the result of the efforts made at the university of Wisconsin Hospital for accurately diagnosing breast masses based solely on a Fine Needle Aspiration (FNA) test. There are 9 input features in the WBCD database. WBCD is a binary classification problem. The output is either a benign case (positive example) or a malignant case (negative examples) The data set consists of 699 samples. 16 samples have missing values, and they are discarded in this work in a pre-processing step. The remaining 683 data are divided to 444 benign (Positive examples) and 239 malignant cases (Negative examples) The training samples used for training the network and testing the performance are divided to a) 75% for training and 25% for testing of the entire sample set b) 50% for training and 50% for testing of the entire sample set and c) 100 samples left out, i.e.583 cases for training and 100 for testing. WBCD is considered a benchmark database for artificial intelligence systems. Researchers [2], [lo], [ill, [121, [14l, P61, [191, [201, P11, W I , [25l, [261, W I , P81, that have tackled the database have provided the literature with results ranging from 90% [ 111, to 98.24% [24], on the testing data. Our classification accuracy in this work is 98.8%. This shows that the performance is comparable to the best results available in the literature LSA machine as shown in this work achieves high quality of results. Finetuning of the parameters of the LSA machine that affect the quality of the performance, need to be deeper investigated. An open research issue in this method is to derive general guidelines for prior setting of the parameters in any application domain. References 1. E.H.L.Aarts, and J.H.M. Lenstra, Local Search in Combinatorial Optimization, Wiley&Sons, (1998).
344 2. J. Abonyi, and J.A. Roubos, Structure identification of fuzzy classifiers, Sh online World Conference on Soft Computing in Industrial Applications ('WSCS), Sept 4-18, (2000). 3. A. Albrecht, E. Hein, K. Steinhofel, M. Taupitz, and C.K. Wong. BoundedDepth Threshold Circuits for Computer-Assisted CT Image Classification. Artificial Intelligence in Medicine, 24(2): 177-190, (2002). 4. A. Albrecht, G. Lappas, S.A.Vinterbo, C.K. Wong, and M. Ohno-Machado, Two Applications of the LSA Machine, Proceedings of the International Conference On Neural Information Processing (ICONIP '02), (2002). 5. A. Albrecht, M. J. Loomes, K. Steinhofel, and M. Taupitz, Adaptive Simulated Annealing for CT Image Classification, Pattern Recognition and ArtiJicial Intelligence, 16(5), (2002). 6. A. Albrecht, K. Steinhofel, M. Taupitz, and C.K.Wong, Logarithmic Simulated Annealing for Computer-Assisted X-ray Diagnosis. Art$cial Intelligence in Medicine, 22(3):249-260, (200 1). 7. A. Albrecht, S.A.Vinterbo, C.K. Wong, and L.Ohno-Machado, A Simulated Annealing and Resampling Method for Training Perceptrons to Classify Gene-Expression Data. Proceeding of The International Conference on Artificial Neural Networks (ICANN '02), Lecture Notes in Computer Science Series, Springer-Verlag, (2002) 8. A. Albrecht, and C.K. Wong, Combining the Perceptron Algorithm with Logarithmic Simulated Annealing. Neural Processing Letters, 14(1):75-83, (2001). 9. B.V. Ambrosiadou, S. Vadera, V. Shankararaman, D. Goulis, and G. Gogou, Decision Support Methods in Diabetic Patient Management by Insulin Administration. Neural Networks vs Induction Methods for Knowledge Classification, Proc. 2"d ICSC Symposium on Neural Computation, Berlin, (2000). 10. A. Cannon, L.J. Cowen, and C.E. Priebe, Approximate Distance Classification, Computing Science and Statistics 30, (1 998). 11. D. Chiang, W. Chen, Y. Wang, and L. Hwang, Rules Generation from the Decision Tree, Journal of Information Science and Engineering, 17:325339, (2001). 12. N. Friedman, D. Geiger, and N. Goldszmidt, Bayesian Network Classifiers, Machine Learning, Vol29, 13 1- 163. Kluwer, Boston, (1 997) 13. B. Hajek Cooling Schedules for Optimal Annealing, Mathem. of Operations Research, 13:311-329, (1988). 14. N. Japkowicz, Supervised Learning with Unsupervised Output Seperation, In Proceedings of the IASTED International Conference on Artlficial Intelligence and Soft Computing (ASC2002), pp. 338-343, (2002). 15. SKirkpatrick, C.D. Gelat,Jr., and M.P. Vecchi, Optimization by Simulated Annealing. Science, 220:671-680, (1983). 16. C.G. Looney, Interactive clustering and merging with a new fuzzy expected value, Pattern Recognition 35:2413-2423, Pergamon, (2001).
345 17. C.J. Mertz and P.M. Murphy, UCI Repository of Machine Learning Databases. htt~://~t.wtv.ics.uci.edu!..-mlearn/M1.htnil, (1996). 18. M.L.Minsky, and S.A. Papert, Perceptrons. MIT Press, Cambridge, Mass., (1969). 19. M. Madden, Evaluation of the Performance of the Markov Blanket Bayesian Classifier Algorithm, Technical Report No. NUIG-IT-011002, Department of Information Technology, National University of Ireland, Galway, (2002). 20. D. Nauck, and R. Kruse, Obtaining interpretable fuzzy classification rules from medical data, Artificial Intelligence in Medicine, vol. 16, pp 149-169, (1999). 21. C.A. Pena-Reyes, and M. Sipper, Fuzzy CoCo: A Cooperative Coevolutionary Approach to Fuzzy Modeling, IEEE Transactions on Fuzzy Systems, Vol 9, Number 5 , p.p. 727-737, (2001). 22. F. Rosenblatt. Principles of Neurodynamics. Spartan Books, New York, (1962). 23. P. Salamon, P. Sibani, and R. Frost, Facts, Conjectures, and Improvements for Simulated Annealing, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, (2002). 24. R. Setiono, Generating concise and accurate classification rules for breast cancer diagnosis. Artijkial Intelligence in Medicine, 18(3), p.p 205-217, (2000). 25. R. Setiono, and H. Liu, Neural-Network Feature Selector, IEEE Transactions on Neural Networks, 8(3): 654-659, (1 997). 26. I. Taha, and J. Ghosh, Characterization of the Wisconsin Breast cancer Database Using a Hybrid Symbolic-Connectionist System,Tech. Rep. UTCVIS-TR-97-007, Center for Vision and Image Sciences, University of Texas, Austin, (1997). 27. W.H. Wolberg, and O.L. Mangasarian. Multisurface Method of Pattern Separation for Medical Diagnosis Applied to Breast Cytology. Proceedings o f t h e National Academy of Sciences, U.S.A., Vol. 87, pages 9193-9196, (1990). 28. J. Zhang, Selecting Typical instances in Instance-Based Learning. Proceedings of the Ninth International Machine Learning Workshop, Aberdeen, Scotland. Morgan-Kaufinann, San Mateo, Ca, 470-479, (1992).
DATA MINING AND CRYPTOLOGY
E.C. LASKARI(lt3’, G.C. MELETIOU‘2’3’ D.K. TASOULIS‘1*3’, M.N. VRAHATIS(1’3’ (1) Department of Mathematics, University of Patras, GR-26110 Patras, Greece, E-mail: { elena, dtas, vrahatis} @math.upatras.gr (2)
(3)
A.T.E.I. of Epirus, P.O. Box 110, GR-47100 Arta, Greece, E-mail: gmeletQteiep.gr
University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece
This paper addresses the issue of mining encrypted data, in order to protect confidential information while permitting knowledge discovery. Common cryptographic algorithms are considered and their robustness against data mining algorithms is evaluated. Having identified robust cryptosystems, data are encrypted and wellknown data mining techniques are applied on the encrypted data to produce classification rules which are then compared with those obtained from the initial nonencrypted databases.
1. Introduction Nowadays business and scientific organizations collect data, orders of magnitude greater than ever before. Considerable attention has been paid in the development of methods that contribute to knowledge discovery in large databases, using data mining techniques, for example see Fayyad et al. in 7. Data mining can be used either to classify data into predefined classes (classification), or to partition a set of patterns into disjoint and homogeneous groups (clustering), or to identify frequent patterns in the data, in the form of dependencies among concepts-attributes (associations). Business and scientific databases typically contain confidential information. Clifton and Marks in provide examples in which applying data mining algorithms on a firm’s database reveals critical information to busipresents a technique to prevent the disclosure of ness rivals. Clifton in confidential information by releasing only samples of the original data. This 346
347
technique is applicable independently of the specific data mining algorithm to be used. In later work, Clifton has proposed ways through which distributed data mining techniques can be applied on the union of databases of business competitors so as to extract association rules, without violating the confidentiality of the data of each firm. This problem is also addressed by Lindell and Pincas for the case when classification rules are to be extracted. Another approach to extract association rules without violating privacy is to artificially decrease the significance of these rules (see 1 , 6 ) . Here, we also consider a scenario in which a company or a scientific organization negotiates a deal with a consultant. We address the privacy problem by encrypting the data. Thus the miner will be unable to extract meaningful information neither from the raw data, nor from the extracted rules. Having applied the data mining algorithms, the consultant provides the organization with the extracted rules. Finally, the organization decrypts those rules so as to restore their true meaning. Two important issues arise in this approach. The first is the need to investigate the robustness of commonly used cryptosystems to potential attacks by data miners. The second key issue is to choose the appropriate cryptosystem that will allow the deduction of correct results, in the sense that the decrypted rules have to be as close as possible to the rules that would be extracted from the original data.
2. A new approach to cryptanalysis
In the context of cryptography, the encryption algorithm is called cryptosystern or cipher, its input is called plaintext and its output is the ciphertext. The elementary requirement for a cryptosystem to be considered secure is that it is computationally infeasible for an eavesdropper who obtains the ciphertext to deduce any portion of the plaintext. Cryptanalysis is the study of mathematical techniques to violate cryptographic systems. Frequency analysis is the first step undertaken by cryptanalysts. The idea of using the underlying frequency distribution of a language to decipher an encrypted message dates back to the early 15th century, and it is attributed to an Arab mathematician named Qalqashandi. Since, data mining algorithms are able to identify regularities in data in the form of dependencies, we primarily consider the application of data mining techniques for the purposes of cryptanalysis as a generalization of traditional frequency analysis. To this end, we study the robustness of commonly used cryptosystems to alternative data mining algorithms. This
348
knowledge is critical for the identification of the proper cryptosystem that will be incorporated in the Alice-to-Alice cryptography, described immediately below. 3. Alice to Alice cryptography
This section addresses the key problem of privacy of data mining on encrypted data. This approach is known as the Alice-to-Alice cryptography, and has been recently proposed in The essence of this approach lies in the fact that the proprietor of the database, who in the context of cryptography is called Alice, encrypts the database. The encrypted data is then transferred to the data miner who extracts the set of classification rules through available data mining techniques without being able to obtain insight into the meaning of either the data, or the rules. Finally, the classification rules are returned to Alice who by decrypting them obtains the true meaning of the extracted rules. Clearly, for this approach to be reliable, the resulting rules should correspond to the rules that would be obtained had data mining been applied on the real data. The main acting agents of the protocol are, “Alice” that represents a business or scientific organization and “Bob” that represents a data mining consultant who handles the data mining process. Alice owns a database with fields and field values that correspond to attributes and attribute values referred by the data mining rules. Attribute values, irrespective of what they represent, have to be encrypted. Since each attribute value has a label like “good customer” or “driver” or “tomato”,this label can be transformed to an integer. For instance, a label can be transformed to a string of bits with the help of ASCII code, in turn, each string corresponds to a number (integer). As a result, in both of the above cases each attribute value can be represented as a small integer. The methodology is as follows:
’.
encrypted data mining algorithm (1) Alice collects the data organized into relational tables (2) Alice encrypts the relational tables (3) Alice sends the encrypted tables to the miner. Data mining performed. The miner returns the obtained rules to Alice (4) Alice decrypts the rules.
During the first step, Alice selects and preprocesses the appropriate
349
data and organizes it into relational tables. A relational table is supposed to be two dimensional, however it can be represented as one dimensional considering it in a row m a j o r order or in a column m a j o r order. At the second step, encryption takes place. At the third step, Alice sends the encrypted tables to Bob. Bob applies the proper data mining algorithm to the encrypted tables and a number of data mining rules are extracted. Of course, attribute and attribute values appearing in the rules are encrypted. Then Bob returns these rules to Alice. Finally, Alice decrypts the rules. The scope of the present study is to extent this line of research by identifying the proper cryptographic methods and data mining techniques so as to meet the objectives of privacy, i.e. the inability of the data miner to extract any meaningful information from the encrypted data he receives; and reliability, in the sense that the encrypted rules once decrypted will correspond as closely as possible to the ones that would be obtained had the data been processed in its original form. To this end, the process will be applied on extensively studied databases.
References 1. M. Atallah, E. Bertino, A. Elmagarmid, M. Ibrahim, V. Verykios, Disclosure Limitation of Sensitive Rules, Proceedings of the 1999 Workshop on Knowledge and Data Engineering Ezchange, Chicago, 45-52, (1999). 2. B. Boutsinas, G.C. Meletiou, M.N. Vrahatis, Mining Encrypted Data, Proceedings of the International Conference on Financial Engineering, E-commerce €4 Supply Chain, and Strategies of Development, (FEES 2002), June 10-12, 2002, Athens, Greece, in press. 3. C. Clifton, D. Marks, Security and Privacy Implication of Data Mining, Proceedings of the 1996 ACM Workshop on Data Mining and Knowledge Discovery, (1996). 4. C. Clifton, Protecting against Data Mining through Samples, Proceedings of the 13th IFIP Conference on Database Security, Seattle, Washington, (1999). 5. C. Clifton, Privacy Preserving Distributed Data Mining, (2001). 6. E. Dasseni, V. Verykios, A. Elmagarmid, E. Bertino, Hiding Association Rules by Using Confidence and Support, LNCS 2137, 369-383, (2001). 7. U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, Advances in Knowledge Discovery and Data Mining, A A A I , Press/MIT Press (1996). 8. Y . Lindell, B. Pinkas, Privacy Preserving Data Mining, Advances in CryptolOgy - C R Y P T 0 '00, LNCS 1880, 36-53, (2000).
APPLICATION OF AUTOMATIC DIFFERENTIATION IN NUMERICAL SOLUTION OF A FLEXIBLE MECHANISM MING-GONG LEE+ Department ofApplied Mathematics 707 Section 2 Wu-Fu Road Hsin Chu, Taiwan, 30012 R.O.C. E-mail:[email protected] The dynamic behavior of a rotating flexible four-bar structure has been analyzed by Lagrange multiplier formulation, which includes a coupled of rigid and flexible generalized coordinates and they form a structure which can be defined as differential-algebraic equations (DAEs). A stable numerical algorithm for solving differential-algebraic equations is implemented. Automatic differentiation (AD) system (ADIFOR) are used to generate numerical values of Jacobian matrix of the constraint equation described in the formulation. Another AD tool AUTODERIVE is used hrther to get the second order derivative terms. Comparison between using hand-deriving and automatic differentiation is studied, and it shows that a great accuracy of implementing AD tools in the differential-algebraic equation solver for flexible structure.
1. Introduction
Dynamic modeling of flexible mechanical systems has been arisen in the past three decades. Flexible structures play an increasingly vital role for robotics, transportation, aerospace, biomechanics and other engineering applications in advanced machine design. The accurate modeling in studying elastic deformations and rigid motion of continuous, elastic bodies are becoming increasingly important [ 1,4-5,11,20]. The dynamic behavior of the flexible rotating four-bar mechanism can be interpreted in terms numerical solutions of a constrained mechanical system or a differential-algebraic equation (DAE), which is composed of differential equations and a set of kinematic constraints. Several computational methods have been studied such as the Baumgarte constraint stablization technique [3], Regulization methods [ 131, projection methods [9-10,14,16-171, etc. Of these techniques just cited above, the differentiation of the original constraint equation is always necessary. The derivatives are taken during derivation of the equations forming the DAE. The first derivative may often be This work is supported by the National Science Council, Taiwan, R.O.C., under NSC 91-2212-E-216-008
350
35 1 estimated by finite differences, or symbolic derivation. For a complex mechanism, the symbolic expression could be derived either by hand or by symbolic software; e.g., Maple V, Mathematica. But the latter sometimes suffers from what is called expression swell. It becomes very complicated if repeated differentiation is necessary. Automatic differentiation is the third option. It is a method for evaluating first derivative and higher derivatives of multivariable functions and it is like the evaluating of formulas from symbolic approach in that it gives answers which are correct up to roundoff error. However it does not have the problems of the symbolic approach with expression swell. The purpose of this study is applying automatic differentiation tool into a differential-algebraic equation numerical algorithm to a rotating flexible fourbar structure. 1.l.Differential-algebraic Eequations by Lagrange Multiplier
Formulation The flexible four-bar mechanism is formed by five sets of generalized coordinates to describe the configuration as shown in Figure 1. One is the reference inertial coordinates, and there are three reference frames in center of mass of each bar which is parallel to the reference inertial coordinates. Since bar 3 is flexible, a reference frame fixed to bar 3 will be moving with bar3 to express flexibility of this bar. The origin of this local reference frame is the same as bar 3, and the x- displacement will be parallel to the length of bar3. A transverse deformation is assumed in bar 3 and is modeled by a mode shape function. The governing equations of motion and kinematic constraints fall into the category of constrained mechanical system. The system equations of motion subjected to a set of constraint conditions can be described by using Lagrange multiplier formation,
d dT dt dq
-(-)T
dT 89
- (-)
T
dU + (-y + cf,, T .A = F,, 89
where T is Kinetic energy, U is Strain Energy;
a,is the Jocobian matrix of
the kinematic constraint with respect to the generalized coordinate q;
is
the constraint force; F,,, is the general external force including gravitational force and the driving torque [ 5 ] . 1.2. Differential-algebraic equation by Euler-Lagrange Formulation
Equation 1 can be transformed into the Euler-Lagrange equation, which is an index 3 DAEs and it can be written as,
352
tk
M(4, + @(q,t)= 0 where
(4,
- f ( t ,4,4)= 0
(2)
(3)
M ( g ,2) is an n x n coefficient mass matrix, which is usually symmetric
positive definite; f (q,q, t) is a generalized force vector, including Coriolis and inertia force of gyroscopic, force exerted by springs, dampers between bodies, gravitation force, control force, and inertial force due to flexibility; A E 93 is a Lagrange multiplier; (D(q,t): R" x R R" is a mapping that describes constraints of the mechanism;
+
@,(q,t)is the Jocobian matrix of @(q, t) and is represented by matrix
AT for
simplicity. Sometime, Eq. 3 is called the position constraint. A time derivative of Eq. 3 deduces the velocity constraint, m q q+ @, =O (4) Another time derivative of Eq. 4 deduces the acceleration constraint, aJqq+(@qq)qq-2@qlq-cDtt=0
(5)
Even though, Eqs. 4-5 are not the original constraints, but the direct numerical solutions of Eqs. 2 and 5 if converge, usually does not satisfy these constraints, sometimes they even diverge [ 191. That causes the numerical solution of DAEs difficult.
2. Automatic Differentiation Tool Automatic Differentiation is a technique for augmenting computer programs with statements for computation of derivatives based on the chain rule of differential calculus [6-7, 151. The ADIFOR 2.0 system provides automatic differentiation for programs written in Fortran 7 7 . Given a Fortran subroutine (or collection of subroutines) for function @(q), e.g., Eq. 3, ADIFOR will produce Fortran 77 subroutines for the computation of the gradient vector of this function, @,(q). There are few reasons why ADIFOR 2.0 is chosen; 1 . Ease of Use: Users just have to supply the Fortran source code and indicate the variables that correspond to the independent and dependent variables; 2. Efficiency: Derivatives codes generated by ADIFOR usually outperform divided-difference approximation [ 151.
2.1 Numerical Algorithm
353 In this section, a numerical algorithm incorporated with the ADIFOR 2.0 and AUTODERIVE will be implemented to show accuracy of the proposed new technique to obtain terms shown in Eqs. 3-5 and correct numerical solutions of the system of Eqs. 2-3. Detail derivation of the algorithm can be refereed to [16].
Algorithm : Step 1. Implement an acceleration projection method [2]; q = C(CrMC)-'CrQ- (I - C(C'MC)-ICrM)A(A'A)-'(k ' q -+ u )
(6)
where C is an orthogonal complement matrix of A such that ATC= 0 , to get q , . Let q,=w, Use the Runge-Kutta-Fehlberg method [8] to get 4 and 4 , . 9
Define q, = V , and let q, = qn. Step 2.
qnis treated
as a predicted numerical solution, and at this time V , is
treated as a true solution at this time step. Use a k-step BDF method to discretize q - v = 0 , a nonlinear system in q is derived. True position vector 4, will be obtained by solving the following nonlinear system by Newton-iteration,
A generalized mass matrix is defined as B = M
+ @k + @h)2D
[ 121
Step 3. Use velocity vector at step 1 v, = V, as a predicted solution, and use a kstep BDF method to discretize 9 - w = 0 . The true velocity vector v will be obtained by one step linear system solver as the following,
1
L,, (",w , , , r =
B ( " - q,,- hpw,,)+a +
Step 4. If time is over, then stop; otherwise, go to step 1.
= 0
(9)
354 3.
Numerical Results
A demonstrated example for the efficiency and accuracy of the above numerical algorithm is a flexible four-bar mechanism. The model is described as in the following Figure 1, for detail derivation can be referred to [4].
Figure 1. A flexible Four-Bar mechanism The material of this four-bar mechanism is aluminum and only bar 3 is flexible and is assumed to have transversal deformation. Table 1 gives some geometric parameters of this mechanism. Table 1. Geometric Parameters for bar 2, 3, and 4
bar 2
bar 3
bar 4
length(cm)
10.8
27.94
27.05
Position of center of mass(cm) Cross section area
5.4
13.97
13.525
2.154
0.406
1.218
0.063078
0.0308
0.089336
2
(cm ) Mass (kg)
Length of Bar 1 Input angular velocity (rads)
25.4 35.6047 (clockwise)
355
Bar 2 is rotated by a fixed angular velocity. A driving torque is obtained by assuming all three bodies are rigid and is considered as a driving torque during simulation. Variation of position constraints of bar 3 are shown in Figure 2. Variation of velocity constraints of bar 3 are shown in Figure 3, and displacement of the transverse deformation parameter is shown in Figure 4. The variation of position constraints has accuracy up to and the variation of velocity constraint has accuracy up to 1 . These result shows that the numerical solutions satisfy the kinematic constraint, which means the algorithm is accurate and stable.
Figure 2. Variation of position constraint of bar 3
356
Figure 3. Variation of velocity constraint of bar 3
Figure 4. Displacement of transverse deformation parameter of bar 3
357 The difference of numerical solutions for 10 variables of this mechanism between implementing by hand-deriving for Jacobian matrix and from AD tools is shown in Fig. 5 It shows that these two sets of numerical solutions agree to high accuracy.
Figure 5. Difference between the hand-coded and the automatic differentiation implementations
4. Conclusions
The use of AD tools in the simulation of a simple flexible mechanism has shown the possibility of further application of this technique to more complex systems. The numerical solutions obtained with ADIFOR-generated derivatives and with hand-derived derivatives agree to high accuracy. In addition, the algorithm used in this paper has been implemented in the simulation of differential-algebraic equations, showing that such an algorithm can be successfully applied to the numerical solution of flexible mechanisms.
Acknowledgments
The author would like to thank Professor Ching-I Chen for supplying such a wonderful model to study.
References
1. T.E. Blejwas, "The Simulation of Elastic Mechanisms Using Kinematic Constraints and Lagrange Multipliers", Mechanism and Machine Theory, 16(4), 441-445 (1981).
2. M. Borri ,C. Bottasso, and P. Mantegazza, “Acceleration Projection Method in Multibody Dynamics,” Eur. J. Mech., A/Solid, 11(3), 403-41 8( 1992). 3. J. Baumgarte, “Stabilization of Constraints and Integrals of Motion in Dynamical Systems,” Computer Methods Appl. Mech. Engng., 1, 116(1972). 4. C.1 Chen, and M-G Lee, “Dynamic Modeling of Flexible Structures Including The Reaction Force”, Proceeding of The 25th National Conference on Dynamics, 2001. 5. C.I. Chen,, V.H. Mucino, and C.C. Spyrakos, “Flexible Rotating Beam: Comparative Modeling of Isotropic and Composite Material Including Geometric Nonlinearity“, J. of Sound and Vibration, 178 (S), 591-605 (1994). 6. C. Bishof, Alan Carle, P. Havland, P. Khademi, and A. Mauer, “ ADIFOR 2.0 User’s Guide (Revision D),” Technical Report ANL/MCS-TM- 192, Mathematics and Computer Science Division, Argonne National Laboratory, USA, 1994. 7. C. Bishof, A. Carle, P. Khademi, and A. Mauer, “The ADIFOR 2.0 System for the Automatic Differentiation of Fortran 77 Programs,” Argonne Preprint ANL-MCS-P841-1194. 8. Conte and de Boor, ”Elementary Numerical Analysis”, 3rd Edition, McGraw-Hill, 1980. 9. E. Eich., C. Fuhrer, B. Leimkuhler, and S. Reich , “Stabilization and Projection Methods for Multibody Dynamics, Tech. Rep. 279, Helsinki University of Technology, Helsinki, Finland, 1990. 10. E. Eich, and M. Hanke “Regulization Methods for Constrained Mechanical Systems”, Tech. Rep. 91-8, Humboldt University zu Berlin, Germany, 1991. 11. B. Fallahi, “An Enhanced Computational Scheme for the Analysis of Elastic Mechanisms”, Computers & Structures, 62(2), 369-372( 1997). 12. Q-Q Fu, Numerical Solution for Differential-Algebraic equations, Ph.D. Thesis, University of Iowa, 1992. 13. C. Fuhrer and B. Leimkuhler, “Numerical Solutions of DifferentialAlgebraic Equations for Constrained Mechanical Motion”, Numer. Math.,
359 59, 55-69(1991). 14. C.W. Gear, B. Leimkuhler, and G.K Gupta, ”Automatic integration of Euler-Lagrange Equations with Constraints”, J. Comp. Appl. Math., 12 & 13,77-90(1985). 15. A. Griewank, “Evaluating Derivatives-Principles and Techniques of Algorithmic Differentiation”, SIAM, Philadelphia, 2000. 16. Lee, M.G., and Wang, C.L., “Stable Numerical Solutions of Differential Algebraic Equations in Mechanical Dynamic Systems”, Proceeding of The Eighteen National Conference on Mechanical Engineering, pp. 663-670, 2001. 17. M-G Lee,and C-I Chen, , “An Algorithm for Numerical Solutionsn of A Flexible Structure by Means of Differential-Algebraic Equations”, Proceeding of The 25th National Conference on Dynamics, 2001. 18. C.C Pantelides., “The consistent initialization of differential-algebraic equations”, SIAM J. Sci. Stat. Comput., 9,213-231(1988). 19. L. Petzold., “DifferentiaUAlgebraic equations are not ODES”, SIAM J. Sci., Stat. Comput. 3, 367-384(1982). 20. A.A. Shabaha, “Dynamic of Multibody Systems”, John Wiley 8z Sons, Inc. 1989.
NUMERICAL QUADRATURE PERFORMED ON THE GENERALIZED PROLATE SPHEROIDAL FUNCTIONS*
T. LEVITINA AND E. J. BRÄNDAS
Department of Quantum Chemistry, Uppsala University, Box 518, S-751 20 Uppsala, Sweden
*Supported by the Royal Swedish Academy of Sciences, project number 12523, the Swedish Foundation for Strategic Research, and the Russian Foundation for Basic Research under Grant no. 02-01-00050.
In a recent paper on the computation of the prolate spheroidal wave functions, a numerical technique for their accurate integration was also proposed. Conventional quadrature formulas cannot be used here: this would cause a significant loss of accuracy. The point is that the prolate spheroidal functions may vanish exponentially fast or oscillate rapidly, accumulating zeroes near the singular points. The necessity to compute integrals containing prolate spheroidal wave functions arises in many practical applications. They appear with the separation of variables in spheroidal coordinates as eigenfunctions of a singular Sturm-Liouville problem and constitute a natural basis in axially symmetric physical problems. In this case integrals representing the Fourier coefficients or matrix elements are desired, which contain these functions or even their products. Another important application is signal processing, where the angular prolate spheroidal wave functions S_ml(eta), defined on (-1, 1), are employed as the eigenfunctions of the weighted-kernel finite Fourier transform [2,3]:
with mu_ml standing for the associated eigenvalue of indexes m, l. Here, in addition to the above integrals, one needs to compute convolutions, Fourier transforms, etc. In particular, in signal processing applications a radial prolate spheroidal wave function (radial functions satisfy the same differential equation as the angular ones, but over a different range of the variable, namely (1, infinity)) is required to be an analytical continuation of the angular part. One may hope to fit this condition simply by comparing the angular and radial parts at the singular point 1. However, in most cases this is forbidden: both functions vanish there. Instead it is enough to compute, at a selected point greater than 1, the desired radial function as a finite Fourier transform of the angular part
and then scale the original "non-normalized" radial function appropriately everywhere. There are other similar integral relations for special functions. Their nature and connection with the separation of variables is discussed in detail in [4]. Among them, the relation satisfied by the so-called generalized prolate spheroidal wave functions (GPSWF) is the important one for us. These functions serve to build eigenfunctions of the 2D finite Fourier transform [5,2]:
    mu phi(r) = int_{|xi| <= 1} e^{i c (r . xi)} phi(xi) d xi,

and are themselves the eigenfunctions of the singular Sturm-Liouville problem defined by equation (3), together with the boundedness boundary conditions

    |T(eta)| < infinity, eta -> +0;    |T(eta)| < infinity, eta -> 1 - 0.   (4)
Defined first on the finite interval (0, 1), the GPSWF of indexes m, l might be analytically continued through the complex plane to be also defined on (1, infinity). As in the case of conventional spheroidal functions, the integral relation (5) plays here the key role, where J_m(.) is a conventional Bessel function. The integral on the right side of relation (5) is not the only one required in applications. As with the Fourier coefficients above, matrix elements expressed in terms of these functions are often needed. The convolution of image
data with the eigenfunctions of the 2D finite Fourier transform (which also contains integrals of GPSWF) plays an important role in image processing. Fortunately, the approach proposed in [1] (see also earlier papers) turned out to be general enough both to be applied to various integrals containing a particular kind of special functions (say, spheroidal) and to be extended to other special functions. Strictly speaking, we have already applied this approach in our recent paper [10], where the GPSWF are computed having been normalized in advance. The extension allowing the computation of various integrals of GPSWF is straightforward. In [10] the GPSWF are calculated via auxiliary functions, a phase theta(eta) and an amplitude tau(eta), which satisfy a first-order relation of the form

    d T_ml(eta)/d eta  proportional to  tau(eta) sin theta(eta) / (1 - eta^2).   (6)

A special "scaling" function is introduced to make theta(eta) and tau(eta) change more slowly [10]. The auxiliary functions in turn are computed as the solutions to a Cauchy problem for a system of first-order ODEs, which is much simpler than the original boundary value problem. Some preliminary work has to be done to transfer the boundedness boundary conditions from the singular points to regular ones and to produce the initial values for the auxiliary functions. In order to compute a particular integral, an extra auxiliary function is introduced in a special manner and the associated (first-order, linear) differential equation is incorporated into the system. We recommend integrating this system numerically using an adaptive Runge-Kutta technique with automatic choice of the step size. As a result, the desired integral value is easily obtained from the auxiliary functions with the accuracy guaranteed by the Runge-Kutta procedure. Note that, although the function under integration may be extremely steep, one does not have to store huge arrays of its values in order to keep the accuracy of the integral. Actually there is even no need to compute the generalized prolate spheroidal wave function itself explicitly. Further, if several different integrals containing the functions under study are required, one introduces an extra auxiliary function per desired integral and incorporates all the corresponding differential equations into the
above system. Again the joint integration of the auxiliary functions will provide the values of all the desired integrals simultaneously. The present approach may also be of use to continue the GPSWF analytically to the semi-infinite interval: namely, the phase theta(eta) and amplitude tau(eta) of the desired continuation are introduced in a slightly different manner than in (6); in [9] a technique for the evaluation of radial wave functions for spheroids and tri-axial ellipsoids is developed, which may easily be modified to find the bounded solution to Eq. (3). Again the auxiliary functions are calculated as the solutions to a Cauchy problem for a system of first-order ODEs. However, to recover a solution to (3), one needs first to fix its value at a selected point greater than 1, which may be done with the aid of relation (5). Finally the required solution is composed of the phase and amplitude with the appropriate scaling factor taken into account. A short remark should be made on the Bessel functions required to compute the integral in (5). The proposed technique may easily be modified to compute Bessel functions through the associated auxiliary functions (phase and amplitude). The calculation of these functions may also be incorporated into the system of the other auxiliary ODEs and, in particular, into the equation which yields the value of the integral in (5).
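The device of attaching one extra auxiliary unknown per desired integral and integrating it jointly with the defining ODE system can be sketched in a few lines. The example below is a generic Python illustration (using SciPy's adaptive Runge-Kutta integrator), not the authors' code: it accumulates I(x) = integral from a to x of f(t) y(t) dt alongside the solution y of a model first-order ODE, so the integral inherits the accuracy of the adaptive integration; the model right-hand sides are placeholders.

    import numpy as np
    from scipy.integrate import solve_ivp

    def rhs(t, state):
        y, integral = state
        dydt = -50.0 * (y - np.cos(t))   # model ODE defining y (placeholder)
        dIdt = np.sin(t) * y             # extra auxiliary equation: dI/dt = f(t) y(t)
        return [dydt, dIdt]

    sol = solve_ivp(rhs, (0.0, 2.0), [1.0, 0.0], method="RK45",
                    rtol=1e-10, atol=1e-12)   # adaptive step-size control
    y_end, integral_value = sol.y[:, -1]
    print(integral_value)   # integral value obtained without storing y on a fine grid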
References
1. T.V. Levitina, E.J. Brändas, Computational techniques for prolate spheroidal wave functions in signal processing, J. Comp. Meth. Sci. & Engrg. 1 (2001), no. 1, 287-314.
2. I.V. Komarov, L.I. Ponomarev, and S.Yu. Slavyanov, Spheroidal and Coulomb Spheroidal Functions [in Russian], Nauka, Moscow, 1976.
3. D. Slepian and H.O. Pollak, Prolate spheroidal wave functions, Fourier analysis and uncertainty, I. Bell System Technical Journal, 40 (1961), 43-64.
4. V.B. Kuznetsov and E.K. Sklyanin, Separation of variables and integral relations for special functions, Ramanujan J. Math. 3 (1999), 5-35.
5. D. Slepian, Prolate spheroidal wave functions, Fourier analysis and uncertainty, IV: Extensions to many dimensions; generalized prolate spheroidal functions. Bell System Technical Journal, 43 (1964), 3009-3058.
6. E.S. Birger, On evaluating functionals of the eigenfunctions of boundary-value problems for systems of linear ordinary differential equations, USSR Comput. Maths. Math. Phys., 13 (1973), no. 1, 297-305.
7. N.B. Konyukhova, S.Ye. Masalovich and I.B. Staroverova, Computation of rapidly oscillating eigenfunctions of a continuous spectrum and their improper integrals, Comp. Maths. Math. Phys. 35 (1995), 287-302.
8. A.A. Abramov, A.L. Dyshko, N.B. Konyukhova, T.V. Pak, and B.S. Pariiskii, Evaluation of prolate spheroidal function by solving the corresponding differential equations, U.S.S.R. Comput. Math. and Math. Phys. 24 (1984), no. 1, 1-11.
9. A.A. Abramov, A.L. Dyshko, N.B. Konyukhova, and T.V. Levitina, Computation of radial wave functions for spheroids and triaxial ellipsoids by the modified phase function method, Comput. Math. and Math. Phys. 31 (1991), no. 2, 25-42.
10. B. Larsson, T.V. Levitina, E.J. Brändas, Generalized prolate spheroidal functions and relevant computations, submitted to J. Comp. Meth. Sci. & Engrg.
MULTITAPER TECHNIQUES AND FILTER DIAGONALISATION - A COMPARISON
T. LEVITINA AND E. BRÄNDAS
Department of Quantum Chemistry, Uppsala University, Box 518, S-75120 Uppsala, Sweden
Abstract
In the present contribution we compare the new Multitaper Filtering technique with the very popular Filter Diagonalisation Method. The substitution of a time-independent problem, like the standard Schrodinger equation, by a time-dependent one from the Filter Diagonalisation Method allows the employment of and comparison with standard signal processing filtration machinery. The use of zero-order prolate spheroidal tapers as filtering functions is here extended and exactly formulated using techniques originating from general investigations of prolate spheroidal wave functions. We investigate the modifications presented with respect to accuracy and general effectiveness. The approach may be useful in various branches of physics and engineering sciences, including signal processing applications, as well as possibly in general time-dependent processes.
COMBINED AIR AND RIVULET FLOW AND APPLICATION TO FUEL CELLS
M.H.X. LIANG & B. WETTON
Department of Mathematics, University of British Columbia, Vancouver, Canada V6T 1Z2
E-mail: [email protected], [email protected]
T.G. MYERS
Dept of Mathematics and Applied Mathematics, University of Cape Town, Rondebosch 7701, South Africa
A steady-state, two-dimensional rivulet flow is introduced to model the two-phase flow of air and water in a circular pipe. Gravity is ignored to simplify the analysis. Two kinds of rivulet geometry are introduced: one is the annulus rivulet, and the other is the circular arc rivulet. The relationship between the pressure gradient and the water flux is studied in both cases while holding the flux of air constant. Specifically, two kinds of problems are defined for the circular arc rivulet: the inverse problem, which consists in calculating the size of the rivulet and the pressure gradient needed to drive the flow for specified air and water fluxes, and the direct problem, which consists in calculating the size of the rivulet and the velocities of the fluids for a given pressure gradient. Computations are done using FEMLAB.
1. Introduction
Two-phase flow can exist in a variety of regimes, such as bubbly flow, slug flow, churn flow, and annular flow. This is described in detail by Fowler [4]. Fowler also introduced a flow regime map, which classifies the flow patterns according to density, velocity, and gas volume fractions. However, this classification doesn't include the case of two-phase flow in a typical small-dimension fuel cell. Such fuel cell channels are on the order of 1 mm in diameter, and the velocity of the flow is typically on the order of 10^-4 m/s for water and 1 m/s for air. So we propose a new model to study the flow inside the fuel cell. We model the fuel cell channel as a circular pipe with a given diameter and study the flow in 2-D. We also assume that the flow is in rivulet form. For a rivulet with small Bond number, pressure and surface tension are the
dominant forces that affect the motion of a rivulet; gravity can be ignored to simplify the computation. First, we do the direct problem, i.e. compute the fluxes of the coupled water/air flow for a given pressure gradient. Then we do the inverse problem, where the fluxes are given and we compute the pressure gradient needed to drive the flow. Then we apply the problem to a specific situation, modeling water movement inside a fuel cell. Many different aspects of the science behind a proton exchange membrane (PEM) fuel cell have been researched [1], such as gas flow in the electrodes, gas flow with condensation and other thermal effects, and liquid (water) motion driven by air flow. One aspect that affects the performance of PEM fuel cells is the accumulation of liquid water in the oxygen channel. The presence of a certain amount of water is necessary for ionic transport in the membrane. However, if there is too much water inside the channel, it can block the flow of oxygen, thus reducing the efficiency of the fuel cell. In some fuel cell designs, liquid water is pushed out with the flow of reactant gases. Thus, an understanding of the two-phase (water and air) flow is essential to improve the performance of a fuel cell.
2. Mathematical formulation
Figure 1. Geometry of a flow channel.
The geometry of an idealized flow channel is shown in Figure 1. The dynamics of a viscous incompressible fluid is governed by the Navier-Stokes equation:

    rho ( du/dt + (u . grad) u ) = f - grad p + mu laplacian u,   (1)
together with the continuity equation div u = 0, where rho is the density of the fluid, u the velocity of the fluid, mu the viscosity of the fluid, f the external force, and p the pressure. The rivulet flow is modeled under the following assumptions: (1) the fluid is incompressible, (2) the flow has reached a steady state, (3) the flow is fully developed, that is, it is unidirectional and the derivatives in the direction of flow are negligible.
Assumptions 2 and 3 indicate that the velocity field has the form u = (0, 0, w(x, y)) and that the pressure gradient is independent of z. Thus equation (1) becomes

    mu (laplacian w) - lambda = 0.   (2)

Physically, the pressure drop across the x-y plane is proportional to the curvature of the rivulet interface and is governed by the equation

    p_w - p_a = -c kappa,   (3)

where c is the surface tension of water, kappa is the curvature, and h is the height of the rivulet; p_a is the air pressure and p_w is the water pressure. Since we ignore gravity, which is reasonable for small rivulets, the pressure drop across the x-y plane is zero. We therefore get constant curvature in the x-y plane, and constant curvature corresponds to the geometry of an arc of a circle. Circular arcs exist in two states: one is the full circle, which corresponds to annulus flow, and the other is an arc of a circle, which corresponds to circular arc flow. These two kinds of flows are investigated in detail below.
3. Annulus Rivulet
First, assume the rivulet clings to the wall of the channel symmetrically, with air flowing down the core as shown in Figure 2. Both the water and air domains are governed by similar equations, with the same wall boundary condition as for the air-only case: mu_a (laplacian w) = lambda in the air region and mu_w (laplacian w) = lambda in the water region, together with
Boundary condition: w = 0;
Interface conditions: [w] = 0 and [mu w_n] = 0;
from which we compute the velocity profile inside the channel. The interface conditions correspond to continuity of velocity and stress at the water-air interface gamma; [.] denotes the jump in a quantity across the interface gamma, and w_n is the normal derivative of the velocity.
Figure 2. Cross section geometry of a ring rivulet channel.
For a given channel radius R and air-core radius r_hat, the velocities of air and water at radius r are

    w_a(r) = (lambda / (4 mu_w)) (r_hat^2 - R^2) + (lambda / (4 mu_a)) (r^2 - r_hat^2),   0 < r < r_hat,   (4)
    w_w(r) = (lambda / (4 mu_w)) (r^2 - R^2),   r_hat < r < R.   (5)

Integrating the air velocity over the core and expanding in the water-layer thickness h gives

    Q_a = (pi lambda / 2) [ R^4 / (4 mu_a) + C_2 h + O(h^2) + ... ].

Let C_1 = R^4 / (4 mu_a) and let C_2 collect the O(h) terms, which involve R^3 / (2 mu_w) and R^3 / (4 mu_a). Write lambda = lambda_0 + lambda', where lambda_0 is the base solution and lambda' is a small perturbation from the base solution. Then

    Q_a = (pi / 2)(lambda_0 + lambda')(C_1 + C_2 h + ...) = (pi / 2)(lambda_0 C_1 + lambda' C_1 + lambda_0 C_2 h + ...).

For fixed air flux, the last two terms must cancel, so lambda' is proportional to h. Since Q_w is proportional to h^2, the pressure gradient increases as the square root of the increase in water flux.
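A quick numerical check of this square-root behaviour can be made directly from the annular profiles (4)-(5): for a sequence of water-layer thicknesses, adjust lambda so that the air flux stays fixed and record the resulting water flux. The Python sketch below is only an illustration of that check under the reconstructed profiles above, with made-up parameter values; it is not the FEMLAB computation used later in the paper.

    import numpy as np
    from scipy.integrate import quad

    mu_a, mu_w, R = 1.8e-5, 1.0e-3, 1.0e-3   # assumed viscosities (kg/m/s) and pipe radius (m)

    def fluxes(lam, h):
        """Air and water fluxes for an annular water layer of thickness h."""
        r_hat = R - h
        w_a = lambda r: lam/(4*mu_w)*(r_hat**2 - R**2) + lam/(4*mu_a)*(r**2 - r_hat**2)
        w_w = lambda r: lam/(4*mu_w)*(r**2 - R**2)
        Qa = 2*np.pi*quad(lambda r: w_a(r)*r, 0.0, r_hat)[0]
        Qw = 2*np.pi*quad(lambda r: w_w(r)*r, r_hat, R)[0]
        return Qa, Qw

    Qa_target = fluxes(-100.0, 1e-5)[0]           # fix the air flux at a reference value
    for h in [1e-5, 2e-5, 4e-5, 8e-5]:
        lam = -100.0 * Qa_target / fluxes(-100.0, h)[0]   # linearity in lambda: rescale to match Qa
        Qw = fluxes(lam, h)[1]
        print(h, lam, Qw)    # the change in lambda should grow roughly like sqrt(Qw)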
4. Circular Arc Rivulet
For the case where water accumulates at the bottom of the channel, we assume it forms a rivulet with a circular-arc interface there, as shown in Figure 3, where theta is the contact angle and psi is the angle that determines the size of the circular arc.
Figure 3. Cross section of a circular arc rivulet channel.
The governing equations are the same as in the annulus rivulet case, except that the geometry is different. Since there are no analytic expressions describing the flow field as a function of theta and psi, we are not able to solve the coupled problem analytically. Instead, we used a software package called FEMLAB to calculate the velocity profile. FEMLAB allows the user to specify the geometry and the PDEs to be solved, together with boundary and interface conditions. It then generates triangular meshes and uses the finite element method to solve the PDE. Figure 4 below shows the meshes generated by FEMLAB after refinement.
4.1. The direct problem
First we solve the direct problem, i.e. with psi and lambda given, find the size of the rivulet, compute the velocities of water and air, and then integrate to get the fluxes. The following data are used in the calculations:
- mu_a = 1.8 x 10^-5 kg/m/s
- mu_w = 1.0 x 10^-3 kg/m/s
- Radius of channel: 1 mm
- Pressure gradient: 100 Pa/m
- Contact angle (theta): 84 degrees (between graphite and water) [3]
Figure 4. Triangular mesh of the channel
Figure 5. Contour plot of fluxes inside a circular channel
The velocity contours are shown in Figure 5. Notice that the velocity contours have different slopes at the interface due to continuity of shear stress and the difference in viscosity of air and water.
4.2. The inverse problem
The inverse problem: given the fluxes of air (Q_a) and water (Q_w) and the size of the channel R, determine the size of the rivulet (psi) and the pressure gradient (lambda) needed to drive the flow. Consider the general equation mu (laplacian w) = lambda. Let x = L x_hat and y = L y_hat, where x_hat and y_hat are nondimensional quantities. Thus

    laplacian w = d^2 w/dx^2 + d^2 w/dy^2 = (1/L^2)(d^2 w/d x_hat^2 + d^2 w/d y_hat^2).
Since Q = integral of w dA, for fixed Q we have that lambda varies like 1/R^4 (a factor of 1/R^2 comes from w and another factor of 1/R^2 comes from A, the area of the cross section). For computational simplicity, let R = 1; then lambda = 1 too. We will use lambda = 1 and R = 1 in the following calculations of the reference water and air fluxes. Note that the equation mu (laplacian w) = lambda is linear in lambda, so we have a scalar inverse problem to solve. The computation is carried out in the following steps (a sketch of this scaling iteration is given after the list): (1) Specify the radius R of the channel and the contact angle theta between the water and the channel. (2) Calculate the size of the rivulet that gives Q_ratio = Q_a/Q_w, using lambda = 1 and R = 1, by modifying the angle psi in successive iterations, and terminate when the specified error tolerance is reached. (3) Use the ratio of the target air flux (or, equivalently, the target water flux) to the corresponding reference flux to calculate the pressure gradient and fluxes for the specified channel size and contact angle.
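The scalar nature of the inverse problem makes the iteration in steps (1)-(3) easy to express. The Python sketch below shows one way to organize it, assuming a hypothetical routine reference_fluxes(psi, theta) that returns (Q_a, Q_w) for lambda = 1 and R = 1 (in the paper this role is played by the FEMLAB solve); the bisection on psi, the monotone-ratio assumption behind it, and the final rescaling are the parts illustrated here.

    def solve_inverse(Qa_target, Qw_target, R, theta,
                      psi_lo=1.0, psi_hi=90.0, tol=1e-6):
        """Find rivulet angle psi and pressure gradient lambda for given target fluxes."""
        target_ratio = Qa_target / Qw_target
        # Step 2: bisection on psi until the flux ratio (independent of lambda) matches.
        while psi_hi - psi_lo > tol:
            psi = 0.5 * (psi_lo + psi_hi)
            Qa_ref, Qw_ref = reference_fluxes(psi, theta)   # hypothetical lambda = 1, R = 1 solve
            if Qa_ref / Qw_ref > target_ratio:
                psi_lo = psi    # rivulet too small: too little water, so enlarge it
            else:
                psi_hi = psi
        # Step 3: the problem is linear in lambda and lambda scales like 1/R^4,
        # so rescale the reference solution to the target flux and channel size.
        Qa_ref, Qw_ref = reference_fluxes(psi, theta)
        lam = (Qa_target / Qa_ref) / R**4
        return psi, lam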
Next we apply the method above to a specific case. Suppose we have a 1 mm diameter circular channel of length 1 m. Then the effective area of the channel is the diameter times the length, which is 10^-3 m^2. Again suppose the current density is 1 A/cm^2; then the total current that goes through the channel is 10 C/s. The electro-chemistry inside the fuel cell is governed by:

    H2 -> 2 H+ + 2 e-,
    O2 + 4 e- + 4 H+ -> 2 H2O.
Thus 1 mole of water corresponds to 2 moles of electrons. The flux of water is Q_w = 18 x 10 / (2F), where F is Faraday's constant (9.632 x 10^4 C/mol) and 18 is the molar mass of water. The flux of air is Q_a = (32 + 0.78/0.21 x 28) x 2 x 10 / (4F), where 32 is the molar mass of oxygen and 28 is the molar mass of nitrogen. Oxygen occupies 21% of the volume of air and nitrogen 78%; the other gases that make up the air composition occur in small percentages and are neglected. The factor 2 is the usual stoichiometric flow factor, i.e. the ratio of the amount of input air to the minimum amount of air needed for the reaction. So the target fluxes of air and water are Q_w = 9.3 x 10^-4 kg/s and Q_a = 7.1 x 10^-3 kg/s.
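The two target fluxes follow directly from the electrochemical relations just quoted; the few lines of Python below simply redo that arithmetic with the paper's value of Faraday's constant and reproduce the quoted 9.3 x 10^-4, 7.1 x 10^-3 and the ratio used next.

    F = 9.632e4     # Faraday's constant as quoted in the text (C/mol)
    I = 10.0        # total current through the channel (C/s)

    Qw = 18.0 * I / (2.0 * F)                                  # water produced: 1 mol H2O per 2 mol e-
    Qa = (32.0 + 0.78 / 0.21 * 28.0) * 2.0 * I / (4.0 * F)     # air supplied at stoichiometry factor 2

    print(Qw, Qa, Qa / Qw)   # approximately 9.3e-4, 7.1e-3, 7.6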
Using these data, we find the ratio of the air and water fluxes: Q_ratio = Q_a/Q_w = 7.6. FEMLAB is used to carry out steps 1-3 above and to compute the results in Table 1. The data show that lambda (the pressure drop) is more sensitive
Table 1. Computed pressure gradient lambda and rivulet size psi.
theta     R = 1 mm                 R = 0.5 mm                R = 0.25 mm              psi
60 deg    lambda = 2.83248e+01     lambda = 4.529674e+02     lambda = 7.2475e+03      17.5 deg
84 deg    lambda = 2.82130e+01     lambda = 4.506544e+02     lambda = 7.2105e+03      12 deg
100 deg   lambda = 2.82906e+01     lambda = 4.526498e+02     lambda = 7.2424e+03      10 deg
to a change in the size (radius) of the channel than to a change in contact angle. Figures 6 and 7 show rivulet sizes for different specifications of the contact angle and channel radius. The rivulet is concave in Figure 6 and convex in Figure 7. The qualitative shape of a rivulet depends on a number of factors, for example the contact angle between the water and the channel, the radius of the channel, and the fluxes. Notice that the velocity contours of water are not shown in Figure 7; this is because the velocity of water is much smaller than the velocity of air. However, one velocity contour of water is shown in Figure 6. In an actual fuel cell, the density and viscosity of air and water vary in space. It is therefore more useful, from a practical point of view, to have a way to find the pressure gradient for an arbitrary Q_ratio. So we construct the following universal curves: Figure 8 gives the rivulet size psi for any Q_ratio, and Figure 9 gives the air flux for lambda = 1. With this information, we can estimate the pressure gradient for an arbitrary given air flux Q_a and channel radius r.
Figure 6. Shape of the rivulet for r = 1 mm, theta = 100 deg.
Figure 7. Shape of the rivulet for r = 0.25 mm, theta = 60 deg.
5. Pressure gradient vs. flux of water
We know from the asymptotic analysis in Section 3 that lambda = C_1 sqrt(Q_w) to leading order for small Q_w, with Q_a fixed, for the annulus rivulet. However, it is impossible to get asymptotic results for the circular arc rivulet due to the algebraic complexity of the problem. So we compared the pressure gradient vs. the flux of water for the two cases numerically, and the results are shown in Figure 10. We conclude that the pressure gradient of the annulus model is more sensitive to a change in water flux than that of the circular arc model, but
Figure 8. Rivulet size psi vs. Q_ratio for r = 1 mm, theta = 84 deg.
Figure 9. Air flux vs. psi for r = 1 mm, theta = 84 deg, and lambda = 1.
the qualitative behaviour is the same. It can be shown using a log-log plot that lambda = C_2 sqrt(Q_w) for small Q_w for the circular arc flow.
We have developed two types of models for the two phase rivulet flow in a circular pipe. We found that the pressure gradient needed to drive the flow is proportional to the square root of the water flux for the annulus rivulet model. For the circular arc rivulet model, pressure drop is more sensitive to change in size of the channel than to change in contact angle,
376
Figure 10. Pressure gradient vs. Qw for annulus and circular arc rivulet
and the pressure gradient is proportional to R-4. Our numerical scheme is accurate and is substantiated by analytical solutions and non-dimensional analysis. The research results are not only useful for fuel cell design, but also applicable to many other industrial processes which involve two phase flow.
Acknowledgment M.Liang and B.Wetton would like to thank NSERC Canada and MITACS for their support of the research.
References 1. M.H.X.Liang Riwulet flow and stability, Master's thesis, Department of mathematics, Institute of applied mathematics, 2-3, 2002. 2. MATLABIFEMLAB, The Mathworks Inc., http://www.mathworks. com. 3. Handbook of physics and chemistry 4. A.C.Fowler Mathematical models in the applied sciences, Cambridge texts in applied mathematics, 1997. 5. G.K. Batchelor, A n introduction t o fluid dynamics, Cambridge University press.
THE TRANSMUTATION OF THE ARCHITECTURAL SYNTHESIS. MORPHING PROCEDURES THROUGH THE ADAPTATION OF INFORMATIONAL TECHNOLOGY. MARIANTHI LIAF'I Dipl. Architect Engineer, A. U.Th. 62 K. Karamanli str., Neapoli, 567 2 7, Thessaloniki, Greece E-mail: [email protected] KONSTANTINOS - ALKETAS OUGRINIS Dipl. Architect Engineer, A. U.Th. Ph.D. Candidate, A. U.Th. 2 Dimokritou str., Panorama, 552 36, Thessaloniki, Greece E-mail: [email protected]
The notion of informational mobility that came along with the revolutionary development of digital media reconfigured radically the architecturalpractice. There are two main axes on which this transmutation has been observed. The one is constituted by the modifications in the procedure of the architectural production that leaded to more complete but standardized mass projects, in relation with advanced methods of construction. The other refers to the theoretical approach of architecture for a digitally altered world. The virtual augmentation of biologic perception revealed an amplified (broader) experiential universe, urging the manipulators of space to evolve. The integration of digital systems facilitates a spherical approach, including a plethora of parameters. The transcended outcome is a mutation to an architecturalhybrid composed of mass and information.
The evolution of data processing media and the improvement of communication networks was a very important aspect of the human inquiring effort. The positive impact of the aforementioned issues contributed in the overall acceleration of important and evolutionary steps in the field of scientific disciplines. The theory and implementation of the art of building has always tried to keep up with these steps and mark each era. It has always adopted the innovative technologies of the culture that it represented, both utilitarian (functionalism, facilitation of completion) and semiological (symbols revealing progress, aesthetics). Architecture managed to equilibrate between the technological material world and the creative immaterial one, with the creation of time-enduring monuments along with its transmutation from creator of protective shelters into creator of utilitarian art-works. By drawing elements from both worlds,
377
378
architecture hovers among them, moving randomly closer to one or the other. The effort to incorporate and demonstrate a plethora of elements, made architecture an adscience replete with a need for perspicacity, adaptability, enlightenment and imagination. The theory between the %nes" of architecture, attempting to preserve constantly up-to-date theses, seeks for the translation of contemporary theoretical analyses and results of scientific disciplines, by using a language that could be implemented in morphing procedures. The results of this attempt either reveal future applications or create powerful points (symbols) that resemble snapshots of a particular moment. In any case however, the aspiration is definitely the integration of progress in the architectural discourse. During the 20" century, humanity convulsed its beliefs under the influence of radical scientific discoveries and subversive philosophical theses. Architecture did not remain unaffected by those shifts. Its elements were transformed and fragmented many times. The emergence and decline of architectural movements occurred under continuous, non-identical, accelerating loops. The alteration of the human perception concerning the surrounding spacetime, the transition of the built environment from a "hard" static spatiality to a "soft", 24-hoursfday animated space and the increasing need for transmitting information through all available means constitute crucial changes that affect space and consequently architecture. These "phenomena" are amplified with the essential management of energy for data transfer. The contemporary informational technologies offer efficient and qualitative data management. They intrude into every aspect of human activity, enhancing capabilities and magnifying qualities. This intrusion proved for architecture to be the appropriate vessel for adapting itself to the constant flux created by the latest informational revolution. Digital media amplify the precision, the output and the graphic representation of the architectural product. The ability of the emerging mediums to create photographic images @hotorealism), representing the actual form of a structure replaces the conventional design method. In their early stages of development, digital design media, though they were not flexible enough to be creatively utilized for artistic purposes, they offered convenience in the field of conventional architectural production, a fact that rendered them essential enough to be established. Design software operated into a two dimensional space, as a medium to replace the hand practice. But the task to level and even to exceed the hand's capability so that to express human creativeness and imagination is not an easy one. The geometrical progression of the computational potency, as described by Moore's Law, continuously increases, through programming, the capabilities of digital media. Hardware and software overcame their initial rigidity and embedded new parameters, such as the incorporation of the 3rddimension in the
379 digital design process, initiating constitutional developments in all aspects of the human activity. The reconfiguring of the architectural practice came gradually in acknowledgment of the facilitation provided by even the “primitive” design programs. It was the “copy-paste’’ ability that radically altered the production of conventional projects. Parts of or even whole projects could be “cloned” multiple times. The broadening of the digital design abilities paved the road for the creation of more complicated and aesthetically upgraded elements that could also be replicated. The most advanced products of this procedure acquired a “value”. The ease with which a conventional project could be completed became a fact. The design procedures gradually acquired an “industrialized” philosophy as the output became standardized and an ever growing number of project elements turned into barter objects. At the same time, the evolving global digital network of data transfer boosted this exchange. The gathering of architectural-design concepts from all over the world on the Internet leads to the emergence of a world wide data bank of design objects. It is the incubation of global design aesthetics, deriving from this data bank that creates a universal “type” of design production. Still, the overall effect of the World Wide Web is the general dissemination of knowledge. The elemental truths of a mechanistic universe collapsed. The relativistic theories of a quantum chaotic universe rose. The need for visual simulation of micro/macro phenomena grows stronger for the researchers. Digital manipulation programs provide the prosthetic tools to create forms out of time-related immaterial and indeterminate phenomena. Through digital reality, images reveal to common people the iconic, beyond perceptiveness, world of imagination. A gateway to fantasy has opened; a digital realm where the scientific mind can meet with the artistic one and animate an imaginary world. The transmutation of architecture, due to conceptual and philosophical alterations, required the use of design programs that augment the human bandwidth of perception i.e. by manipulating simultaneously phenomena with different chronicities. The scientific simulation programs provided the matrix for the creation of form finding programs for the architectural synthesis. 3d animation programs allow the “inhabitation” of the fourth dimension, time, just like architecture has been for so long the medium for inhabiting the 3dimensional Cartesian space. Their advantage is that space can be creatively manipulated and animated inside a time based environment. The immersion in fluid spaces, the representation of genesis and evolution and the vivid virtual alter-selves of structures overwhelmed the architectural thinking. Projects escaped from the Euclidean practice and incorporated matters of fluidity and mobility under the influence of the notions of modem reality.
380 First attempts exhibited, sometimes aggressively, the need to set free from the conventional practice and expand. The results demonstrated both tendency for impressiveness and depreciation of the functional elements of the architectural project. The digital enthusiasm needed to be "tamed" so that the digital tools would operate as a medium, along with their users and not by overpassing them. The simultaneous amelioration of theoretical and implemented approach towards the affiliation of digital media to the architectural practice brought equilibrium between human imagination and digital prosthetics. The capabilities of the digital media that have been proved to be of highest importance for the architectural practice are: a. The process of 3d4d diagrams, b. The folding and deformation of forms c. The simulation of force effects in a time-based environment d. The representation of phenomena that exceeds the human comprehensiveness e. The application of complex mathematical and physical theories, such as fractals, topology, non-linear dynamics, chaos and quantum theory f. 3d spatial coordination. The exploitation of those capabilities, made feasible for the architect to render digitally both the physical and the virtual attributes of the environed space. The functional diagrams (emerging archetypes) acquire "body and life", exhibiting clearly a life-span full of flows. Through the fermentation of the archetypes emerge revolutionary forms that constitute snapshots of the animated diagram at either random or specific time-stops. The liberty provided by the "digital laws" allows the creation of the desired environments that will include all the parameters affecting an architectural project. In this way the initial goals of the architect are clarified, leading to an honest and efficient solution, while incorporating the outmost of the involving parameters. The specific needs for constructing the produced floating structures require innovative methods to handle the complex 3d spatial coordination, where the use of the digital media is indispensable. The emerging spaces are characterized by the multi-dimensionality of the elements from which they derived, infusing "soul" to the plethora of data that comprise them. In this manner data is transcended to information. Information, even though it is conceived as an immaterial entity, is integrated into the material shells, rendering them alive. The communicative abilities infused into the structure transforms it to a "gateway", creating links to other environments, physical or virtual, and establishing a new symbiotic relationship with its inhabitors. The creation of vivid, multi-perspective spaces, with a multi-leveled hypostasis, leads to an informational accumulation. The informational density ruptures the physical space, triggering "events". The turbulence in the flow, created by the events, reinforces the semantic essence of the design products and their surrounding space, since they acquire new characteristic features that bare additional information. The system is temporarily unstable obstructing the
381 flows. Thus the system needs to establish a dynamic stability, osmosis, between the interacting forces, redistributing the flows and re-establishing a fluid equilibrium. The fusion of all newly acquired, digital, characteristics generates the creation of a hybrid architectural product. Its hybrid hypostasis is made of bits and bricks, hovering at will between existence and inexistence, between reality and virtuality. These shifts are controlled either by a network of sensors, part of the structure’s AI, or by human interaction or both. The cyber-enhanced abilities of the structure to shape-shift in response to environmental changes, serving its inhabitors, result to an emergence of an artificial consciousness. It becomes a medium for augmented perception and a floating point of reference, a vital element of human life. The integration of digital technology as biological prosthetics into the human body, will amplify the bond between the user and its hybrid shells. Human and surrounding space will become a unit, an entity (a nomad-snail carrying its own shell), evolving into a higher phase, mutating from the fragile, restricted “single cell” organism to a mobile, protected, and interconnected “multi-cell” organism with augmented capabilities. “The whole is greater than the sum of the parts”. References
1. Di Cristina Giuseppa (ed) (200 1) Architecture and Science. London: WileyAcademy. 2. Engeli Maia (ed) (200 1) Bits and Spaces. BaseVBostodBerlin: Birkhauser. 3. Galofaro Luca (1 999) Digital Eisenman. An Ofice of the Electronic Era. Basel/BostonBerlin: Birkhauser. 4. Jormakka Kari (2002) Flying Dutchmen. Motion in Architecture. BaselBostonBerlin: Birkhauser. 5. de Kerckhove Derrick (2001) The Architecture of Intelligence. BaseVBostodBerlin: Birkhauser. 6 . Proctor G. (ed) (2002) Thresholds. Design, Research, Education and Practice, in the Space between the Physical and the Virtual. Pomona: A.C.A.D.I.A. 7. Ranaulo Gianni (2001) Light Architecture. New Edge City. BaselBostonBerlin: Birkhauser. 8. Spiller Neil (1998) Digital Dreams. Architecture and the New Alchemic Technologies. London: Ellipsis. 9. Zellner Peter (1999) Hybrid Space. New Forms in Digital Architecture. London: Thames & Hudson.
A PARALLEL ADAPTIVE FINITE VOLUME METHOD FOR NANOSCALE DOUBLE GATES MOSFETS SIMULATION* YIMING L I I . ~ , SHAO-MING yu2AND PU CHEN~ 'National Nan0 Device Laboratories 'National Chiao Tung University 1001 Ta-Hsueh Road, Hsinchu city, Hsinchu 300, TAIWAN E-mail: [email protected]. iw We propose a quantum correction transport model and apply a parallel adaptive refinement methodology to nanoscale semiconductor device simulation on a Linux cluster with MPI libraries. In the nano device simulation the quantum mechanical effect plays an important role. To model this effect a quantum correction Poisson equation is derived and replaced the classical one in the transport models. Our numerical method is mainly based on the adaptive finite
volume method with a posteriori error estimation, constructive monotone iterative method, and domain decomposition algorithm. A 20 nm double-gate MOSFET is simulated with the developed simulator. 1. Introduction Development of nanoscale metal oxide semiconductor field effect transistors (MOSFETs) has been of great interest in recent years [ 1-111. Computational methods for macroscopic semiconductor device models, such as drift-diffusion (DD) and hydrodynamic (HD) models play a crucial role in the development of semiconductor device simulators. As MOSFETs are further scaled into the nanoscale regime, it is important to consider quantum mechanical effects when performing nano device modeling and simulation [3-111. The most accurate way of incorporation the quantum effect in the inversion layer is to solve the coupled Schrodinger-Poisson (SP) equations subject to an appropriated boundary condition at the interface [7-91, but it encounters numerical difficulties and is a time-consuming task in nano device simulation. There have been different approaches to the modeling of these quantum effects; one of them is adding quantum corrections to the classical DD or HD model [3-111. In some quantum correction models [ 1 11, the classical carrier concentration was directly multiply by an additional correction term. However, to obtain a self-consistent correction
* This work is partially supported by the NSC91-2112-M-317-001 and PSOC 91-EC-17A-07-S1-0011 Taiwan.
382
383 in the Si02/Siinterface and apply to the 2D and 3D DD and HD simulations, it is necessary to consider a correction model self-consistently . In this paper we have developed a quantum correction Poisson equation which is feasible for nonoscale MOSFET DD and HD simulation. We then solve the quantum correction transport model with a parallel adaptive computing technique for a nanoscale double-gate (DG) MOSFETs. Simulation results and computational benchmark show the model accuracy and method efficiency. In Sec. 2, we state the quantum correction model. Sec. 3 is the computational method. Sec. 4 is the results and discussion. Sec. 5 draws the conclusions. 2.
A Quantum Correction Transport Model
In classical transport models, there are more than 3 partial differential equations (PDE) have to be solved for a deep-submicron device simulation [1,11]. For example the DD model includes: the Poisson equation, the electron current continuity equation, and hole current continuity equations [ 1,111. Based on a phenomenological investigation from the SP solution, the classical electron density in the Poisson equation is modeled to reflect the quantum confinement effect. With this quantum correction Poisson equation, the classical DD or HD models can directly apply to the nanoscale DG MOSFET simulation. The proposed quantum correction Poisson equation is V . ( E V ~=)-q (p - n + N i - NA) (1) and 2NC IF. (2)
n=aox e-Y,"+o,;'-Y,4'
All the symbols and physical quantities used here are followed [1,11].
3.
Parallel Adaptive Computational Method
We solve the 2D quantum correction DD model with a parallel adaptive computing technique for a nanoscale double-gate MOSFETs. This simulation methodology has been developed and applied to deep-submicron MOSFETs simulation in our recent works [ll]. Based on the adaptive unstructured mesh generation, a posteriori error estimation, finite volume (FV) discretization, and the Gummel's decoupling algorithm, a quantum corrected 2D DD model is decoupled and discretized, and hence a system of nonlinear algebraic equations is obtained. We solve the nonlinear system by means of the MI method instead
384 of conventional Newton's iteration (NI) method on our cluster system. The MI method is a constructive technique for the numerical solutions of PDEs. Compared with the N1 method, apply the MI method to nano device simulation has some merits: (1) global convergence, (2) easier implementation, and (3) ready for parallelization [ 1 11. In parallelization, the domain decomposition approach is applied to perform the simulation on our 32 nodes cluster system. 4.
Simulation Results and Discussion
We simulate a 20 nm double-gate MOSFETs on our cluster system. The figure 1 shows the mesh refinements. The mesh refinement mechanism is based the estimation of solution error element by element. Figures 2 and 3 show the computed potential and electron concentration, we have found the refinement precisely locates the solution variation near the device surface efficiently. In our numerical experience, a 27 times speed-up has also been obtained on our 32nodes PC-based Linux cluster system. ,
.. . ---:r --
:
,..,".
.. . ; I - : ' . *
Figure 1. The adaptive refinement levels: (a) initial mesh, (b) the 3rdrefined level, (c) the 5threfined level, and (d) the 7'h refined level.
5. Conclusions A quantum correction transport model has been developed for nanoscale double-gate MOSFETs. The quantum correction DD model has been solved with parallel adaptive method on a Linux cluster with MPI libraries. Numerical results for a 20 nm double-gate MOSFETs have been reported to show the computational efficiency for the proposed model and simulation methodology.
385 Acknowledgments This work was supported in part by the 2002 Research Fellowship Award of the Pan Wen-Yuan Foundation in Taiwan. Potential (Voltage)
om 0602
om
05%
03 1
0 467 4 DwomS
32
Figure 2. The simulated potential distribution for the 20 nm DG MOSFET under inversion condition (VGI=VG*=I V and Vos = 1 V).
omsxm
B.mS-w7 l.EQIw333 2m*m 32Dos.m 4.DW6006
x (cm) Figure 3. The contour plot of the simulated electron concentration for the 20 nm DG MOSFET. The peak location of electron concentration has about 1 nm shift from the device top and down surfaces.
386
References 1. S. M. Sze, Physics of Semiconductor Devices, 2"d Ed., Wiley-Interscience,
New York, 1981 2. B. Yu, et al., in: Tech. Dig. IEEE Int. Elec. Dev. Meeting (2002) p. 10-02 3. M. Ogawa, et al., in: Proc. Int. ConJ: on Simulation of Semiconductor Processes and Devices (2002) p. 261 4. D. K. Ferry, J. Comp. Elec. 1,59 (2002) 5. D. K. Ferry, VLSI Design 13, 155 (2001) 6. M. G. Ancona, et ai., IEEE Trans. Elec. Dev. 47,23 10 (2000) 7. C. Fiegna and A. Abramo, in: Proc. IEEE Int. Con$ on Simulation of Semiconductor Processes and Devices (1 997) p. 93 8. S.-H. Lo, et al., IBMJ. Res. Dev. 43, 327 (1999) 9. Y. Li, et al., Comp. Phys. Commun. 147,214 (2002) 10. T.-w. Tang and Y . Li, IEEE Trans. Nanotechnology 1,243 (2002) 11. Y. Li, et al., Engineering with Computers 18, 124 (2002)
AN ITERATIVE METHOD FOR SINGLE AND VERTICALLY STACKED SEMICONDUCTOR QUANTUM DOTS SIMULATION* YIMING LIt Department of Nan0 Device Technology, Nat 'I Nan0 Device Laboratories Microelectronics and Information Systems Research Center, Nat 'I Chiao Tung Univ. I001 Ta-HsuehRoad, Hsinchu city, Hsinchu 300, TAIWAN E-mail: [email protected] We present a computational efficient nonlinear iterative method for calculating the electron energy spectra in single and vertically stacked InAs/GaAs semiconductor quantum dots. The problem is formulated with the effective one electronic band Hamiltonian, the energy and position dependent electron effective mass approximation, and the Ben Daniel-Duke boundary conditions. The proposed iterative method converges for all quantum dot simulations. Numerical results show that the electron energy spectra are significantly dependent on the number of coupled layers. For the excited states, the layer dependence effect has been found to be weak than that for the ground state.
1.
Introduction
The experimental fabrication and theoretical study of nanoscale semiconductor quantum dots (QDs) have been of great interest in recent years [l-lo]. With the advanced nanofabrication technology, it is possible to consider another degree of freedom along the growth direction for vertically coupled QDs. One of evident features in this system is the effects of dot-to-dot interactions on the electronic structure, the electronic entanglement, and charge transfer [5,6]. In the modeling and simulation of semiconductor QDs, various works considering a two dimensional (2D) lateral geometry and confinement potential models have studied the coupled 2-layers quantum dots [5-81. To thoroughly clarify the electronic structure and tunneling ability for potential applications to memory devices and optoelectronics, it is necessary for us to investigate the vertically stacked QDs system with a full 3D approach. In this paper a unified 3D model is proposed and solved numerically for a system of vertically stacked QDs. The model is formulated with the effective one electronic band Hamiltonian, the energy and position dependent electron effective mass approximation, the hard-wall Confinement potential, and the Ben Daniel-Duke boundary conditions. Our solution method is based on a nonlinear * This work is partially supported by the NSC91-2112-M-317-001, Taiwan.
Work partially supported by grant PSOC 9 1-EC-17-A-07-S 1-00 11 Taiwan.
387
388 iterative method to simulate an electron confined by InAs QDs embedded in GaAs semiconductor matrix. Numerical results show that the QDs’ transition energy is dominated by the number of stacked layers. The distance d among QDs plays a crucial role in the tunable states of the dots. For d = 0.5 nm, there is about 30% variation in ground state energy. Our investigation is constructive in studying the magneto-optical phenomena and quantum optical structures. Sec. 2 states the 3D QDs model. Sec. 3 discusses the computational method. Results and discussion are presented in Sec. 4. Conclusions are drawn in Sec. 5.
Figure 1. The let? figure is a single InAs QD embedded in the GaAs semiconductor matrix. The right one shows a vertically coupled 3-layers QDs system.
2.
Mathematical Model and Computational Method
Considering electrons confined in a single or vertically stacked QDs system, as shown in the figure 1, we apply the one-band effective Hamiltonian H [9,10]:
where m(E, r) is the energy and position dependent electron effective mass -=-1
m(E,r)
P2 [ 2 A2 E + E ( r ) - E ( r ) g
c
1 1 E + E (r)+A(r)-Ec(r)
(2)
g
and V(r)= Ec(r)is the QDs’ confinement potential. The Ec(r),E&), A ( r ) , and P are the position dependent electron band edge, band gap, spin-orbit splitting in the valance band, and momentum matrix element, respectively [9,10]. Because the QDs have ellipsoid shape with radius Ro and of height zo, we solve the problem in ylindrical coordinate (R, 4 z). The QDs system is cylindrical symmetry, so the wave function is written as: @ ( r ) = @ ( R ,z) exp(iZ@),where Z = 0,+1,+2, ...is the electron orbital quantum number and the model is --
h2
a2 7 ( +- a +- a2
2 m i ( E ) aR
RaR
a~~
i2 --)@;(R,z) R~
+ V , ( R , Z ) @ , ( R , Z =) E @ ; ( R , z ) I
(3)
where Vi=](R,z) = 0 inside the dots and Vi=2(R,z) = V, outside the dots. The interface conditions (the so-called Ben Daniel-Duke boundary conditions) are
389
where z = f_s(R, z) (s represents the ellipsoidal shape of the QDs) is a contour generator of the QD structure in the (R, z) plane. Figure 1 shows the single and 3-layers QDs. To obtain the "self-consistent" energy solution of the model, we generalize here the nonlinear iterative algorithm [9,10]. It consists of: (a) set an initial energy E0; (b) compute the effective mass m; (c) solve the Schrodinger equation; (d) go back to (b). The iteration terminates when a specified stopping criterion on the energy is reached. To solve the Schrodinger equation in (c) we apply a finite volume method with a nonuniform mesh, a mixed eigenvalue solution method, and an inverse iteration technique. Starting from an arbitrary initial guess for the energy, the mixed methodology proposed here to solve the semiconductor QDs eigenvalue problem demonstrates excellent convergence properties and reduces the computing time significantly for the complicated QDs system. With the developed multi-layer QDs simulator, the tunneling effects and electronic structure are studied for QD systems with different numbers of layers. A minimal sketch of the outer energy iteration (a)-(d) is given below.
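The outer loop (a)-(d) can be illustrated with a toy one-dimensional analogue: a particle in a finite well whose mass depends on the energy through a simple two-band-like formula. The Python sketch below illustrates the fixed-point structure only, with made-up parameter values and a plain finite-difference eigensolver; it is not the paper's 3D finite volume solver.

    import numpy as np

    hbar2 = 0.0762   # hbar^2/m0 in eV nm^2 (approximate value, for illustration)

    def mass(E):
        # toy energy-dependent effective mass, loosely mimicking Eq. (2)
        return 0.023 * (1.0 + E / 0.4)

    def ground_state(E_mass, L=20.0, V0=0.5, well=6.0, n=400):
        """Lowest eigenvalue of -hbar^2/(2 m) psi'' + V psi = E psi on [0, L]."""
        x = np.linspace(0.0, L, n)
        h = x[1] - x[0]
        V = np.where(np.abs(x - L/2) < well/2, 0.0, V0)   # finite well standing in for the dot
        kin = hbar2 / (2.0 * mass(E_mass) * h**2)
        H = np.diag(V + 2.0*kin) - kin*np.eye(n, k=1) - kin*np.eye(n, k=-1)
        return np.linalg.eigvalsh(H)[0]

    E = 0.0                        # (a) initial energy
    for it in range(30):
        E_new = ground_state(E)    # (b)+(c): mass at the current E, then solve the Schrodinger equation
        if abs(E_new - E) < 1e-10:
            break                  # stopping criterion on the energy
        E = E_new                  # (d) go back with the updated energy
    print(it, E)

Because the effective mass varies monotonically with the energy, the fixed-point update settles quickly, which is the same mechanism the paper invokes for the global convergence of the 3D iteration.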
Figure 2. The left panel reports the ground state (l = 0) and first excited state (|l| = 1) energy versus the number of vertically stacked QDs, for R0 = 10 nm, z0 = 2 nm and a dot-to-dot distance of 0.5 nm. The right panel shows the convergence behavior (maximum energy error versus the number of iterations) for a 5-layers system^a.
Figure 3. The left panel is the wave function spreading for l = 0. The right panel is the wave function of the vertically coupled 3-layers QDs system.
^a It takes about 350 seconds to compute all bound states on a Linux-based PC system with a Pentium IV 2.5 GHz CPU and 256 MBytes of RAM.
3. Results and Discussion
As shown in the right panel of Figure 2, it takes about 8 iteration loops to reach the stopping criterion on the maximum energy error. The global convergence mechanism is due to the fact that the effective mass is a monotone function of the energy. We simulate a 6-layers vertically stacked QDs system in our study. Each ellipsoidal-shaped QD is separated from the next by a distance d between the layers. For small dots with z0 = 2 nm and R0 = 10 nm, separated by a fixed d, we have found that the transition energy is essentially dominated by the number of stacked QDs. When the number of stacks increases, the electron transition energy decreases monotonically. For d = 0.5 nm the variation of the ground state energy can be up to 30%. The transition of the first excited state energy is less dependent on the number of stacks (about 5% energy variation). It is also found that d plays a crucial role in the tunable states of the dots. In Figure 3, the wave function clearly demonstrates the effects of the coupling and spreading.
4. Conclusions
A computationally efficient technique for the simulation of single and vertically stacked QDs systems has been proposed. With the developed QDs simulator, we have found that the electron energy spectra depend significantly on the number of coupled layers. For d = 0.5 nm, there is about a 30% variation in the ground state energy. For the excited states, the layer dependence has been found to be weaker than that for the ground state. The modeling, numerical method, and study presented here not only provide a novel way to simulate QDs but are also useful for clarifying the principal dependencies of the stacked QDs energy states on the material band parameters and the number of stacked QDs.
Acknowledgments
This work was supported in part by the 2002 Research Fellowship Award of the Pan Wen-Yuan Foundation in Taiwan.
References
1. H. Akinaga and H. Ohno, IEEE Trans. Nanotech. 1, 19 (2002).
2. M. Bayer, P. Hawrylak, K. Hinzer, S. Fafard, et al., Science 291, 451 (2001).
3. D. Bimberg, et al., Thin Solid Films 367, 235 (2000).
4. A.D. Yoffe, Advances in Physics 50, 1 (2001).
5. X. Hu and S. Das Sarma, Phys. Rev. A61, 062301 (2000).
6. P. Yu, et al., Phys. Rev. B60, 16 680 (1999).
7. Z.J. Zhang, B.W. Li, and C.G. Bao, Physica B 324, 245 (2002).
8. F. Troiani, U. Hohenester, E. Molinari, Phys. Rev. B65, 161301(R) (2002).
9. Y. Li, et al., Comput. Phys. Commun. 141, 66 (2001).
10. Y. Li, et al., Comput. Phys. Commun. 147, 209 (2002).
PROPOSAL OF A NEW COMPUTATIONAL METHOD FOR THE ANALYSIS OF THE SYSTEMATIC DIFFERENCES IN STAR CATALOGUES*
J.A. LOPEZ AND F.J. MARCO, Departamento de Matemáticas, Univ. Jaume I de Castellón, Spain, E-mail: [email protected], [email protected]; M.J. MARTÍNEZ, Depto. Matemática Aplicada, Univ. Politécnica de Valencia, Spain, E-mail: [email protected]
1. Extended Abstract
One of the main objects of fundamental astronomy is the determination of the celestial reference frames. The reference frames must approximate the inertial system with great accuracy. Positions of the celestial bodies cannot be obtained directly from the reference system, so it is necessary to build star catalogues, which materialize the reference system. The positions of the celestial bodies can then be obtained by comparison of the body position with the star coordinates taken from the catalogues. The reference frame in use nowadays was adopted in Kyoto by the International Astronomical Union and its main characteristics are: a.- From the first of January 1998 the Celestial Reference System adopted by the I.A.U. is the International Celestial Reference System (ICRS), defined by the International Earth Rotation Service. b.- The corresponding reference frame is the International Celestial Reference Frame (ICRF), built by the corresponding working group of the I.A.U. c.- The Hipparcos Catalogue is the materialization of this reference frame for visible wavelengths. *corresponding author: j.a.lópez. email: [email protected]. fax: +34.964.728429
d.- The IERS and the IAU working group will take the appropriate measures for the maintenance of the ICRF and its relations with the other reference frames. The relations of this reference system with the other systems have been studied by several authors; Schwan (2001, 2002) carried out a complete study of the systematic differences between the most important catalogues with respect to Hipparcos, obtaining numerical and analytical corrections. The analytical corrections for the complete sphere are developed in spherical harmonic expansions in the right ascension α and declination δ:
For other kinds of domains D on the sphere, the series expansions can be made with respect to functions K_{n,m}(α, δ), where K_{n,m}(α, δ), n, m = 0, 1, ..., is a suitable basis of the L²(D) space. To compare the differences in the positions of the shared stars determined from two catalogues, we use a finite series expansion in α, δ as a model of the signal:
\Delta\alpha_i = \sum_{n=0}^{N}\sum_{m=0}^{M} \xi_{n,m}\,K_{n,m}(\alpha_i,\delta_i) + r_{\alpha,i}, \qquad \Delta\delta_i = \sum_{n=0}^{N}\sum_{m=0}^{M} \eta_{n,m}\,K_{n,m}(\alpha_i,\delta_i) + r_{\delta,i}
where r_{α,i}, r_{δ,i} are stochastic errors in Δα_i, Δδ_i. The coefficients of the analytical expansion can be obtained from the minimum condition of the residual function:
R(\{\xi_{n,m},\eta_{n,m}\}_{n=0,\ldots,N;\,m=0,\ldots,M}) = \sum_{i=1}^{NE}\left[ r_{\delta,i}^{2} + r_{\alpha,i}^{2}\cos^{2}\delta_i \right]
where NE is the size of the sample. This method is appropriate if the star distribution is homogeneous in the spherical domain D because, in this case, the system of functions K_{n,m}(α, δ) is a good approximation of an orthogonal system in D over the sample.
In other cases (e.g. differences between observed and calculated positions of the minor planets) we cannot assume homogeneity of the sample. In this situation we must use the minimum condition:
To evaluate this integral we can discretize the spatial domain by means of a rectangular lattice. The control volume D_i is determined by the centers of the four rectangles adjacent to the node i (except for the nodes on the boundary), and the values of Δα, Δδ are replaced by the mean values:
where V_i is the area of the control volume D_i. Unfortunately we do not know the functions Δα, Δδ for all (α, δ) ∈ D, so we approximate Δα, Δδ by means of a kernel smoothing estimation; this approximation can be written as:
\widehat{\Delta\alpha}_i = \sum_{j \in I_i} C_i\!\left(\frac{\|(\alpha_j,\delta_j)-(\alpha_i,\delta_i)\|}{h}\right)\,\Delta\alpha_j
where:
C_i is a non-negative compact-support function with a maximum at the node i, I_i is the set of sample elements for which C_i(‖(α_j, δ_j) − (α_i, δ_i)‖/h) is not null, and h is a parameter called the bandwidth. The integrals of Eq. (3) can be evaluated as:
where NN is the number of nodes in the lattice. The integrals that appear in the expansion of the right side of the last equation can be calculated by means of a Gaussian quadrature of appropriate order. For this purpose it is necessary to know the values of the functions Δα, Δδ at the nodes of the quadrature formulae. To know these values
a reconstruction operator must be defined. The reconstruction process consists in the estimation of the values of the functions Δα, Δδ from the mean values \bar{Δα}_i, \bar{Δδ}_i.
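As a rough illustration of the kernel smoothing step described above, the sketch below evaluates a weighted mean of the catalogue differences around one lattice node. The Epanechnikov weight and the normalization by the sum of weights are assumptions of this example; the text only requires a non-negative, compact-support kernel C_i with bandwidth h.

```python
import numpy as np

def kernel_smooth(node, alpha, delta, dalpha, h):
    """Kernel-smoothed estimate of the catalogue difference at one node.

    node   : (alpha_i, delta_i) lattice node
    alpha, delta, dalpha : arrays with the sample positions and differences
    h      : bandwidth
    """
    dist = np.hypot(alpha - node[0], delta - node[1])   # ||(a_j,d_j)-(a_i,d_i)||
    u = dist / h
    w = np.where(u < 1.0, 0.75 * (1.0 - u**2), 0.0)     # compact-support kernel
    if w.sum() == 0.0:
        return 0.0                                      # empty neighbourhood
    return np.sum(w * dalpha) / w.sum()                 # weighted mean
```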
Let P(α, δ) be the reconstruction operator. If k is the size of the larger side of the lattice cell, we can obtain a reconstruction of order o(k^{r+1}) by taking a node matrix of (r+1) x (r+1) around the point (α, δ). For the reconstruction process it is more convenient to use an essentially non-oscillatory (E.N.O.) method in order not to introduce false oscillations in the functions. The two-dimensional reconstruction operator can be built as the composition of two one-dimensional operators (Casper, Atkins), one in the α direction and the other in the δ direction. From the calculation of the integrals we obtain a normal system of equations:
\frac{\partial R}{\partial \xi_{r,s}} = 0, \qquad \frac{\partial R}{\partial \eta_{r,s}} = 0, \qquad r = 0,\ldots,N;\; s = 0,\ldots,M
This system can be solved by numerical methods and its solution yields the values of the parameters ξ_{n,m}, η_{n,m}. This is a general method, well suited to analysing errors from a set of observational positions of minor planets. In this case we use the right ascension and declination differences between the calculated positions, taken from integrations of the Lagrange planetary equations, and the observed positions reduced with a star catalogue. In this case the distribution of the observations in the domain D is not usually homogeneous. This may cause zonal errors in domains other than the ones which contain the observations. This is the reason why we propose the previous method.
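A minimal illustration of the coefficient-fitting step in the homogeneous case is given below. The basis functions passed in stand for the K_{n,m} of the text, and the cos δ weighting reproduces the residual function R; this is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np

def fit_expansion(K_funcs, alpha, delta, dalpha, ddelta):
    """Least-squares estimate of the expansion coefficients xi, eta.

    K_funcs : list of basis functions K(alpha, delta) playing the role of
              the K_{n,m} restricted to the domain D.
    Minimizes sum of r_delta**2 + r_alpha**2 * cos(delta)**2 over the sample.
    """
    A = np.column_stack([K(alpha, delta) for K in K_funcs])
    w = np.cos(delta)                       # weight for the alpha residuals
    xi,  *_ = np.linalg.lstsq(A * w[:, None], dalpha * w, rcond=None)
    eta, *_ = np.linalg.lstsq(A, ddelta, rcond=None)
    return xi, eta
```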
SIMULATION OF THE SWITCHING CURVE IN ANTIFERROMAGNETIC ISING MODEL *
M. S. MAGDON-MAKSYMOWICZ, Department of Mathematical Statistics, Agriculture University, Mickiewicza 21, 31-120 Kraków, Poland; A. DYDEJCZYK, P. GRONEK AND A. Z. MAKSYMOWICZ, Department of Physics and Nuclear Techniques, AGH, Mickiewicza 30, 30-059 Kraków, Poland, email: [email protected]
phone: (+48 12) 617 23 18, fax: (+48 12) 634 00 10
At sufficiently low temperatures T we expect all spins to be in the antiferromagnetic ground state for a finite net size. We then apply a magnetic field +H on the spin-"down" sublattice and -H on the spin-"up" sublattice to provoke spin reversals on all sites from the initial configuration, which is now metastable. At sufficiently large H, spin reversal takes place. The switching curve H(T) obtained from computer simulations reflects different switching mechanisms in the very low, low and high temperature ranges, corresponding to spin flips and domain wall movement. These two mechanisms may be observed in real samples. In this paper we discuss the onset of the two switching paths and analyze the main trends of the switching curves for different model parameters, such as the size N of the spin net or the coordination number Z.
1. Introduction
The switching curve is of a phase transition character: when a magnetic field is applied opposite to the current metastable spin arrangement, the system reacts with a rapid response by magnetization reversal. For ferromagnets at T = 0, there are two twin configurations, spin "up" and spin "down", and the system goes to one of them. When a magnetic field is applied in the direction opposite to the direction of the spins, this configuration becomes a metastable state and we expect the system to have a tendency to go over to the stable state after all spins are reversed. The field H at which *Extended abstract without tables and figures, etc.
such reversal takes place is the switching field. For antiferromagnetic coupling the situation is similar, yet we need to apply a nonuniform field of +H and -H values on neighboring sites. Analytical results may be obtained when, for example, an oversimplified molecular field approximation is applied. In this case we consider the molecular field at thermal equilibrium of the system and attempt to find the magnetic field at which the system becomes unstable. This assumption overestimates the critical field since it corresponds to a uniform, simultaneous jump of all the spins into the new state. In real cases, the local effective fields are not uniform and vary from site to site depending on the local configuration of neighbouring spins. This is the essence of the Monte Carlo computer simulation technique. The computer simulation literature on the Ising model is very rich. The switching curve studied here is directly related to magnetization reversal and hysteresis. In this paper we concentrate on calculations of the switching curve for an antiferromagnetic Ising model. The simulations were carried out for a two-dimensional N x N net with spins coupled by a negative exchange energy J < 0 between nearest neighbours. A minimal sketch of the elementary Monte Carlo update used in such simulations is given below.
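The sketch assumes the usual energy convention E = -J Σ s_i s_j - Σ h_i s_i with a staggered applied field, and random single-site Metropolis updates; the paper's actual update schedule and measurement procedure may differ.

```python
import numpy as np

def metropolis_sweep(s, J, H, T, rng):
    """One Metropolis sweep of the antiferromagnetic Ising net (J < 0).

    s : LxL array of +-1 spins; the field enters as +H on one sublattice
        and -H on the other (staggered field opposing the initial
        antiferromagnetic state).  Cyclic boundary conditions.
    """
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        nn = s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]
        h_loc = H if (i + j) % 2 == 0 else -H          # staggered field
        dE = 2.0 * s[i, j] * (J * nn + h_loc)          # energy cost of a flip
        if dE <= 0.0 or rng.random() < np.exp(-dE / T):
            s[i, j] = -s[i, j]
    return s
```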
2. Results and discussion
For calculations of the normalized switching curve h(t) of the spin flip we used an antiferromagnetic exchange integral J < 0, for which the critical temperature is Tc(MFA, theory) = 1 and Tc(Ising, theory) = 0.567296326. Monte Carlo simulations were carried out on an N x N = 1000 x 1000 lattice with the standard cyclic boundary condition, for a given reduced temperature t = T/Tc. The initial configuration was always antiferromagnetic with all spins "down" on the first sublattice and "up" on the other one. After about 1000 iteration steps we are close to thermal equilibrium. Then we apply a magnetic field H > 0 on the first sublattice and -H on the other. We observe the switching of the system from this metastable state into the stable solution, when the spins are reversed, for fields above a certain value h = -H/4J which depends on the temperature t. Results of the simulation for the applied switching field (+H, -H) on the two sublattices are presented (using switching fields h = -H/4J for the N x N = 1000 x 1000 net size). The Molecular Field Approximation (MFA) curve is far from the presented Ising model switching curve. We observed
a flat part which corresponds to a specific magnetization process due to propagation of a nucleated bubble domain wall across the sample. This flat region occurs for t1 < t < t2. The remaining two monotonic parts, however, are of a uniform magnetization process character, when all sites are vulnerable to spin flips that take place simultaneously. Thus we propose two different mechanisms for the system to reach the stable state when the normalized critical line h(t) is met: (a) for 0 < t < t1, when the standard uniform rotation mechanism is dominant, and (b) for t1 < t < t2, when a bubble domain wall movement of about one cell length per iteration takes place. For t2 < t < 1 we claim both mechanisms are active.
Let us discuss in more detail the main features of the obtained results. At sufficiently low temperatures the system is in the ground state. When the (+H, -H) field is applied, the probability of picking a site with the opposite spin is increased, leading to the creation of a nucleation centre. Once such a bubble domain is created on a central site, this leads to a significant reduction of the exchange field experienced by the neighbouring sites. The reduction of the exchange energy for the Z nearest neighbours is then from ZJ in the ground state to (Z - 2)J, i.e. by a factor of (1 - 2/Z). Therefore, if the field necessary to flip a spin locally is H = -ZJ (or h = 1) in the ground state, a smaller value h = 1 - 2/Z for nearest neighbours of the already reversed spin makes these spins also flip. This means propagation of the area of reversed spins, and the bubble domain grows as a result. However, this smaller field, h = 1/2 for the two-dimensional Ising model with Z = 4, applies only when the sample is already softened by a reversed nucleation centre. We may estimate roughly the minimum temperature at which we expect such centres to appear. The Ising model predicts a reduced magnetization m = 1 - 2x^4 in the low temperature limit, where x = exp(-0.5 J/T). On the N x N net with just one spin flipped this corresponds to m = 1 - 2/(N x N). From this we get the estimate of T1, T1 (for one spin flipped) = J/log(N). This result is not yet quite correct, since we still have to account for the (+H, -H) applied field which adds to the exchange energy, reducing the barrier for a spin flip. The local spin energy is now ZJ + H. For the applied field h = 1/2 it reduces the exchange integral to an effective exchange integral Jeff = J/2. Replacing J by the effective value, and using the critical temperature Tc
value Tc = 0.5673 J, we finally get Tc/T1 = 1.1346 log(N). For N = 1000 we get T1/Tc = 0.128. It should be noted that in the thermodynamic limit N → ∞, T1 → 0, as expected. It may also be important to realize that in experiments N is rather close to the thermodynamic limit. Below the temperature t1 there are no nucleation centres. All the spins are then reversed nearly simultaneously (uniform rotation) at fields from the maximum value h = 1 at t = 0 down to h = 1/2 at t1. Just above t1, when we get nucleation centres, the plateau h = 1/2 means the bubble grows at the critical field h = 1/2, a value sufficient for the domain wall to move. This situation lasts until another critical temperature t2 is reached, above which the uniform rotation takes place again. The switching field goes down to zero at t = 1, as expected. The simulated switching curve clearly shows the plateau region in the temperature range from t1 = 0.09 to t2 = 0.16. Our analytical value t1 = 0.13, obtained from considerations of the model switching mechanism, is perhaps not too far from the simulation result. The expected h = 1/2 plateau matches the computer simulation result exactly. In the computer experiment we also compare results for different numbers of iterations, n = 1000 and n = 400. We recover the same switching curve for n = 1000 and n > 1000. For n = 400 we obtain larger fields h for t > t2, which makes us believe that the wall creeping mechanism is not completed for that small number of iterations n < N, less than the net size. From the experiment by H. Kronmüller et al. on coercivity versus temperature, one can clearly see the plateau as obtained in our simulations. This plateau, predicted at h = 1 - 2/Z, may be read from the experimental data at positions h corresponding to a coordination number Z between 8 and 12 for both Fe77Nd15B8 and Fe77Pr15B8 samples. (To do this we extrapolated the experimental data down to t = 0 to evaluate the maximum of the switching field.) The t1 = 0 limit of large sample size N → ∞ is seen as the flat region from the lowest temperatures. A rough estimate of t2 yields numbers from 0.10 to 0.16, against t2 = 0.16 from simulations.
3. Summary and conclusions
The switching curve shows three regions with a different character of the magnetization reversal process. At low temperatures the spin flips occur nearly simultaneously at all sites after a threshold field is reached. This field decreases with temperature down to h = 1 - 2/Z at the normalized temperature t = t1. A rough estimate of 1/t1 is 1/t1 = 0.5673 log(N_atoms). For the net size N x N with N = 1000 we get t1 = 0.128. However, for real samples N_atoms may be much larger, for which t1 = 0.035, in the range of 10 or 20 K. The plateau at h = 1 - 2/Z extends from t1 to t2 and corresponds to domain wall movement. The model includes only the local exchange energy, possibly diminished by the applied magnetic field, as the only barrier to overcome. A possible extension of the model should include inclusions, surface roughness or other obstacles to domain wall movement as another source of coercivity, pushing up the critical fields h. Above t2 we believe that the two above-named magnetisation reversal mechanisms are both present, yet this needs more attention. A suitable computer experiment is planned. The calculations were carried out on the PC cluster of the Faculty of Physics and Nuclear Techniques, University of Mining and Metallurgy, and at the Academic Computer Center CYFRONET-KRAKÓW.
ELECTRIC HYPERPOLARIZABILITY CALCULATIONS G. MAROULIS Department of Chemistry, University of Patras, GR-26500 Patras, Greece E-mail: [email protected]
We present an analysis of the ab initio quantum-chemical calculation of electric hyperpolarizabilities in some systems of primary importance. Various computational aspects of these calculations are closely examined. Particular consideration is given to basis set effects and the systematic evaluation of the performance of theoretical methods. 1. Introduction and Theory The energy and dipole moment of an uncharged molecule interacting with a weak, homogeneous static electric field can be expanded as [1]
where F_α, ... is the electric field at the origin and E^0, μ_α the energy and permanent dipole moment of the free system. The number of independent components needed to specify the tensors α_αβ, β_αβγ and γ_αβγδ is regulated by symmetry [1]. α_αβ is the dipole polarizability. The coefficients of the nonlinear terms in Eq. (2) are called hyperpolarizabilities. In recent years the experimental and theoretical determination of the electric hyperpolarizability of atoms and molecules has attracted particular attention [2]. Electric hyperpolarizability is of importance to intermolecular interaction studies [3], simulations of fluids [4] and the rational interpretation of nonlinear optical phenomena [5]. In this work we turn our attention to the calculation of the electric polarizability in some important and characteristically difficult cases. The systems considered are C4H6, (H2O)2, C_{10-n}H_{16-n}X_n (n = 0 or 4, X = B, N) and N2···He. The calculation of the molecular properties relies on the finite-field approach. Conventional ab initio methods and Density Functional Theory were employed. Details of the computational philosophy underlying these calculations have been presented elsewhere [6]. Pattern recognition and cluster analysis were used to evaluate and classify the performance of theoretical methods [7].
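In the finite-field approach the (hyper)polarizabilities follow from numerical differentiation of the field-dependent energy, whose standard expansion is E(F) = E^0 - μ_α F_α - (1/2) α_αβ F_α F_β - (1/6) β_αβγ F_α F_β F_γ - (1/24) γ_αβγδ F_α F_β F_γ F_δ - ... [1]. A minimal sketch of the lowest-order central-difference extraction for one field direction is given below; the energy() callable stands for the underlying ab initio calculation, the field step is only indicative, and contamination from higher-order terms is neglected.

```python
def alpha_gamma_finite_field(energy, F=0.005):
    """Lowest-order finite-field estimates of alpha_zz and gamma_zzzz.

    energy(Fz) -> total energy of the molecule in a static field Fz along z
    (placeholder for the actual quantum-chemical calculation); F is the
    field step in atomic units.
    """
    E0 = energy(0.0)
    Ep, Em = energy(+F), energy(-F)
    E2p, E2m = energy(+2 * F), energy(-2 * F)
    alpha = -(Ep - 2 * E0 + Em) / F**2                       # -d2E/dF2
    gamma = -(E2p - 4 * Ep + 6 * E0 - 4 * Em + E2m) / F**4   # -d4E/dF4
    return alpha, gamma
```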
2. The hyperpolarizability of trans-butadiene
The electric hyperpolarizability of C4H6 has been extensively studied in recent years [8,9]. We have shown that basis set effects are quite important for this molecule. The γ_αβγδ tensor has nine independent components. The Hartree-Fock limit for the mean hyperpolarizability is estimated at γ̄ = (14.6 ± 0.4) x 10^3 e^4 a_0^4 E_h^-3. The estimation of electron correlation effects merits particular attention. Conventional, rapid methods such as MP2, second-order Møller-Plesset perturbation theory, seem to overestimate this correction. Our final estimate is (3.0 ± 0.6) x 10^3 e^4 a_0^4 E_h^-3.
3. Hyperpolarizability of adamantane, 1,3,5,7-tetraboro-adamantane and 1,3,5,7-tetraaza-adamantane
The calculation of the electric properties of adamantane (C10H16) is a rather difficult task [10]. This prototypical system is of central importance to current investigations based on novel ideas for the design of molecular building blocks (MBB) for molecular nanotechnology applications. As experimental and theoretical investigations of the molecular properties of heterofullerenes are attracting much attention [11], we have examined two similar derivatives of adamantane: 1,3,5,7-tetraboro-adamantane (Fig. 1) and 1,3,5,7-tetraaza-adamantane (Fig. 2). We obtained molecular geometries, electric moments and (hyper)polarizabilities for these tetrahedral molecules. Their HOMO and LUMO orbitals are shown in Figs. 3 and 4.
4. How reliable are DFT methods in interaction hyperpolarizability calculations? (H2O)2 as a test case.
A systematic study of the interaction electric (hyper)polarizability of the water dimer [12] shows a surprisingly small effect for this important molecular system. Conventional DFT approaches seem to overestimate the electric (hyper)polarizability of the dimer (H2O)2 but yield rather reasonable values for the interaction properties.
Figure 1. Molecular structure of 1,3,5,7 tetraboro-adamantane.
Figure 2. Molecular structure of 1,3,5,7 tetraaza-adamantane.
Figure 3. HOMO and LUMO of 1,3,5,7 tetraboro-adamantane.
Figure 4. HOMO and LUMO of 1,3,5,7-tetraaza-adamantane.
5. Interaction hyperpolarizability in the helium-nitrogen system
The potential energy surface of the dinitrogen-helium system, N2···He, has been extensively studied by various research groups. We report for the first time a study of the interaction (hyper)polarizability. The calculated values are compared to those obtained for CO2···He [13].
References
1. A.D. Buckingham, Adv. Chem. Phys. 12, 107 (1967).
2. D.P. Shelton and J.E. Rice, Chem. Rev. 94, 3 (1994).
3. M.H. Champagne, X. Li and K.L.C. Hunt, J. Chem. Phys. 112, 1893 (2000).
4. G. Ruocco and M. Sampoli, Mol. Phys. 82, 875 (1994).
5. S. Kielich, Molekularna Optyka Nieliniowa (Nonlinear Molecular Optics), Naukowe, Warsaw, 1977.
6. G. Maroulis, J. Chem. Phys. 108, 5432 (1998).
7. G. Maroulis, Int. J. Quant. Chem. 55, 173 (1995).
8. P. Norman, Y. Luo, D. Jonsson and H. Ågren, J. Chem. Phys. 106, 1827 (1997).
9. G. Maroulis, J. Chem. Phys. 111, 583 (1999).
10. G. Maroulis, J. Chem. Phys. 115, 7957 (2001).
11. R.H. Xie, G.W. Bryant, L. Jensen, J. Zhao and V.H. Smith, J. Chem. Phys. 118, 8621 (2003).
12. G. Maroulis, J. Chem. Phys. 113, 1813 (2000).
13. G. Maroulis and A. Haskopoulos, Chem. Phys. Lett. 349, 335 (2001).
SEGMENTATION OF NATURAL SPEECH USING FRACTAL DIMENSION F. MARTINEZ, A. GUILLAMON AND J.J. MARTINEZ Departamento de Matemática Aplicada y Estadística, Universidad Politécnica de Cartagena, SPAIN E-mail: fmartinez@upct.es, antonio.guillamon@upct.es
Speech signals can be considered as being generated by mechanical systems with inherently nonlinear dynamics. The purpose of this paper is to present an automatic segmentation method based on nonlinear dynamics with low computational cost. The fractal dimension is a measure of signal complexity that can characterize different voiced and unvoiced sounds. The segmentation process is carried out in two stages: estimation of the fractal dimension using the method suggested by Katz [1] and detection of the stationarity of the fractal dimension by means of the value of the variance parameter computed over the smoothed fractal dimension signal. Using this combination of techniques, a quick and automatic segmentation is obtained. Our experiments have been carried out on recorded signals from a Spanish speech database (AHUMADA).
1. Introduction
Segmentation is a key preliminary component in continuous speech recognition. A segmentation algorithm splits the signal into homogeneous segments with lengths adapted to the local characteristics of the analyzed signal. There is a vast literature on digital processing for speech segmentation with applications to speech analysis, [2] and [3]. Usually, the proposed methods can be classified into two categories: model-based and model-free. The model-based methods can be quite efficient only if the parametric model fits the speech signal correctly (vowels). On the other hand, the model-free methods provide a way to circumvent modeling problems, but the model-free distortion measures depend crucially on spectral densities, and these measures lack robustness when mixed-type spectra are expected. Moreover, all these methods have the common limitation of a great computational cost. Recent suggestions that speech production may be a nonlinear process have sparked great interest in the area of nonlinear analysis of speech, giving rise to many studies [4], [5]. These studies are based on the natural hypothesis that nonlinear processes occur in speech production.
In this paper, we present an alternative method of speech segmentation based on nonlinear dynamic techniques. The segmentation process is carried out in two stages: estimation of the fractal dimension and detection of the stationarity of the fractal dimension by means of the value of the variance parameter computed over the smooth fractal dimension signal. The remainder of this paper is organized as follows. In Section 2, the techniques used in this work for estimating the fractal dimension are commented, in Section 3, the statistical method developed for the segmentation procedure is presented, and finally, the experimental results and conclusions are shown in Sections 4 and 5.
2. The fractal dimension
The fractal dimension (D_f) can be considered as a relative measure of the number of basic building blocks that form a pattern. For this reason, we think that D_f is a measure of signal complexity that can characterize different voiced and unvoiced sounds. It provides an alternative technique for assessing signal complexity in the time domain, as opposed to the embedding method based on the reconstruction of the attractor in the multidimensional phase space [6]. This innovation permits a direct connection between complexity variations and speech signal changes over time, providing a fast computational tool to detect nonstationarities in these signals. There are many algorithms in the literature for estimating the fractal dimension of a waveform, [1], [7] and [8]. In our research, the fractal dimension was computed according to the algorithm proposed by Katz. In contrast to other methods, Katz's D_f calculation is slightly slower, but it is derived directly from the waveform, eliminating the preprocessing step. The D_f of an analyzed time sequence x(j), j = 1, 2, ..., N, can be defined as:
where L = \sum_{i=1}^{N-1} |x(i+1) - x(i)| is the sum of the distances between successive points and d is the diameter, estimated as the distance between the first point of the sequence and the point of the sequence that provides the farthest distance, d = \max_{2 \le i \le N} |x(1) - x(i)|.
D_f computed in this fashion depends on the measurement units used. Katz's approach solves this problem by creating a general unit or yardstick: the average distance between successive points. Finally, normalizing the distances in equation (1) by this average results in
D_f = \frac{\log_{10}(N-1)}{\log_{10}(d/L) + \log_{10}(N-1)}
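A direct implementation of this normalized Katz estimator is short; the sketch below is a straightforward transcription of the formula and assumes the waveform segment is passed as a one-dimensional array.

```python
import numpy as np

def katz_fd(x):
    """Katz fractal dimension of a waveform segment x (1-D array)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - 1                              # number of steps
    L = np.sum(np.abs(np.diff(x)))              # total length of the curve
    d = np.max(np.abs(x - x[0]))                # diameter from the first point
    if L == 0.0 or d == 0.0:
        return 1.0                              # flat segment
    return np.log10(n) / (np.log10(d / L) + np.log10(n))
```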
3. The segmentation process
The final stage of our method is based on a simple statistical property of the variance parameter. Using a sliding window over the smoothed fractal dimension signal D'_f we compute the variance parameter:
where n and m denote the size of the window used. Thus, values of the variance parameter close to zero denote stationarity of the fractal dimension, whereas large values denote changes in the uniformity of the signal (nonstationarity). In this sense, frames with a variance parameter value near zero characterize vowel and nasal sounds, whereas frames with a large variance value characterize most complex sounds, such as fricative sounds or transitions between a vowel and a nasal consonant.
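A minimal sketch of this frame-wise variance test is given below; the frame length and the 10% threshold follow the values quoted in the Results section and are otherwise arbitrary choices of the example.

```python
import numpy as np

def variance_segmentation(df, frame=512, threshold_ratio=0.10):
    """Mark stationary frames of a smoothed fractal-dimension series df.

    A frame is labelled stationary when its variance is below a fraction
    (threshold_ratio) of the maximum frame variance over the whole signal.
    """
    n_frames = len(df) // frame
    var = np.array([np.var(df[k * frame:(k + 1) * frame])
                    for k in range(n_frames)])
    return var < threshold_ratio * var.max()
```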
4. Results
The D_f parameter is obtained from the experimental signals using a sliding window of 512 points (32 ms), shifted along the speech signal with 1 point of overlap. The smoothing process is carried out using a sliding window of 128 points, and the variance parameter is obtained using frames of 512 points. A frame is characterized as stationary if its variance is less than 10% of the maximum value obtained for the variance parameter over the full speech signal. Our experiments have been carried out on a variety of recorded speech signals from a Spanish speech database (AHUMADA). Figures 1, 2 and 3 show the results obtained for the segmentation of the Spanish word "mañana" by means of the procedure developed in this paper.
Figure 1: Speech signal (Spanish word "mañana").
Figure 2: Smoothed fractal dimension series and variance parameter series.
Figure 3. Segmentation process.
5. Conclusions
In this paper, a new method is presented for speech segmentation. In a first stage, this method uses the fractal dimension (D_f) in order to obtain a simple time series that contains the amount of regularity of the speech signal. Finally, a simple statistical method is used to split the D_f series obtained. Taking into account that this procedure has a low computational cost, it offers an efficient segmentation technique for speech signals and provides an automatic method that can be used in on-line systems.
References
1. M. Katz, Comp. Biol. Med., vol. 18, no. 3, pp. 145-156, 1988.
2. R. Andre-Obrecht, IEEE Trans. Acoust. Speech, Signal Processing, vol. 36, pp. 29-40, 1988.
3. M. Basseville, Signal Processing, vol. 18, pp. 349-369, 1989.
4. M. Banbrook and S. McLaughlin, IEE Colloquium on Exploiting Chaos in Signal Processing, 8/1-8/10, 1994.
5. A. Kumar and S.K. Mullick, Electronics Letters, vol. 26, no. 21, pp. 1790-1791, 1990.
6. P. Grassberger and I. Procaccia, Phys. Rev. Lett., vol. 50, pp. 345-349, 1983.
7. T. Higuchi, Phys. D, vol. 31, pp. 277-287, 1988.
8. A. Petrosian, IEEE Symposium on Computer-Based Medical Systems, pp. 212-217, 1995.
CLUSTER MODELS FOR SPATIAL POINT PROCESSES WITH APPLICATIONS *
J. MATEU† AND J.A. LOPEZ, Department of Mathematics, Campus Riu Sec, University Jaume I, E-12071 Castellón, Spain
Spatial point process models provide a large variety of complex patterns to model particular clustered situations. Due to model complexity, spatial statistics often relies on simulation methods. Probably the most common such method is Markov chain Monte Carlo (MCMC) which draws approximate samples of the target distribution as the equilibrium distribution of a Markov chain. Perfect simulation methods are MCMC algorithms which ensure that the exact target distribution is sampled. In this paper we focus on point field models that have been used as particular models of galaxy clustering in both cosmology and spatial statistics. We present simulation and estimation techniques for these models and analyze by an extensive simulation study their flexibility for cluster modeling, under a large variety of practical situations.
Extended Abstract
The last half of the twentieth century saw cosmology develop into a very active and diverse field of science. The main tools to compare theoretical results with observations in astronomy are statistical, so the new theories and observations also initiated an active use of spatial statistics in cosmology. Many of the statistical methods used in the analysis of the large-scale distribution of matter in the universe have been developed by cosmologists and are not too rigorous. In many cases, similar methods, sometimes under different names, had been used for years in mainstream spatial statistics. In the late 1950s, when the Berkeley statisticians J. Neyman and E. Scott carried out an intensive program for the analysis of galaxy catalogs, the connection between spatial statisticians and cosmologists was a fruitful one. However, in the following 30 years cosmologists were not, in general, aware of developments in statistics, and vice versa. Fortunately, recent *This work is supported by grant BFM2001-3286. †corresponding author. Email: [email protected]. fax: +34.964.728429
years have brought the resumption of a dialog between astronomers and mathematicians, led by the Penn State conferences, Cosmology is a good field for applications of spatial statistics. Its welldefined and growing data sets represent an important challenge for the statistical analysis, and therefore for the mathematical community. A redshift surveys provide a set of positions { X ~ } K of ~N galaxies in a portion of the universe with volume V . This can be regarded as a realization of a random point process. In this sense, the discrete distribution of galaxies, each one considered as a point of the process, is a good field of application of the statistical techniques developed in spatial statistics. However, in cosmology, a different approach is more often employed. The spatial distribution of matter in the universe can be considered, both now and in the past, as a continuous function of spatial locations, that is, a continuous random field. Both random fields and point processes are indeed examples of stochastic processes. We present in this paper statistical models of the large-scale distribution of galaxies. While statistical models cannot claim to offer a full physical model of structure formation, they do have the advantage of being more manipulative and having greater flexibility, two properties that are useful in studying the behaviour of statistical measures under diverse circumstances and assessing the significance of these measures. We introduce point field models that have been used as particular models of galaxy clustering in both cosmology and spatial statistics. Since the introduction of Markov point processes in spatial statistics, attention has focused on the special case of pairwise interaction models. These provide a large variety of complex patterns starting from simple potential functions which are easily interpretable as attractive and/or repulsive forces acting among points. However, these models do not seem to be able to produce clustered patterns in sufficient variety, which prevents them from being useful in cosmology applications. And this is the reason why other families of Markov point process models, able to produce clustered patterns, are introduced in this paper. The area-interaction process, the continuum random-cluster model and the penetrable spheres mixture model are considered. These are of interest in spatial statistics in situations where the independence property of the Poisson process needs to be replaced either by attraction or by repulsion between points. They are also highly relevant in statistical physics, where the first and third models provide the most well-known example of a phase transition in a continuous setting.
A kind of point process very popular in the field of spatial statistics was originally introduced by Neyman and Scott in cosmology to model the distribution of galaxies. These are Poisson cluster processes because they are based on an initial homogeneous Poisson process with intensity λ whose events are called parent points. Around each parent point, a cluster of daughter points is scattered. The number of points per cluster is randomly generated according to a given discrete probability distribution function. The location of the daughter points with respect to their parent center is independently generated for each parent following a given density function. This law is the same for all parents. The final process is formed only by the offspring. As a subclass of the Neyman-Scott field, Matérn proposed the widely used Matérn process, which reads as follows. Each event of the parent Poisson process is surrounded by a sphere of radius R in which m points are distributed randomly, following a binomial process. The offspring of each parent point varies from center to center following a Poisson distribution with mean μ. The distribution of mass in the universe can be regarded as a continuous density field. The luminosity, however, is concentrated into galaxies, forming a discrete point process. The three-dimensional distribution of galaxies can be regarded as a point field, where {x_i}, i = 1, ..., N, represents the coordinates of the positions. One way to establish a connection between the discrete galaxy distribution and the underlying continuous density field is by means of an inhomogeneous Poisson point process with density n(x), described formally as a sum of Dirac delta functions δ_{x_i}(x). Spatial distributions are often so complex that inference needs to be based on simulation methods such as Markov chain Monte Carlo (MCMC). In MCMC we construct an ergodic Markov chain whose equilibrium distribution π is the distribution of interest. If we simulate this Markov chain for a sufficiently long time, then we can use its samples to estimate properties of the target distribution. Before the Markov chain has reached convergence its distribution will be influenced by its initial state at time 0. To ensure that the initial state will not bias the estimation results we usually discard the first M samples. In MCMC jargon M is called the length of the burn-in period. Unfortunately, in practice it can be difficult to determine a value of M such that the initialisation bias becomes negligible. Convergence rate computations may help but are difficult and may lead to unduly pessimistic and impractical values for M. Convergence diagnostics, on the other hand, cannot guarantee that
the chain has reached equilibrium and so may lead us to choose a value for M that is too small. A third alternative is perfect simulation methods. These are MCMC algorithms which verify dynamically whether the chain has converged yet and thus choose a sufficiently large value for M. The sample produced by the algorithm has the exact equilibrium distribution, hence the name exact or perfect simulation. Clearly, if applicable, perfect simulation provides a way out of the dilemma of determining the length of the burn-in period. In the paper, we present the basics of point process modeling. Then, we discuss some MCMC methods applied in practical modeling. We shall discuss in depth and analyze how flexible these models are in practice, and shall propose measures of flexibility and possible interchanges among cluster models.
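As an illustration of the Matérn cluster process described above, the following sketch simulates parent points from a homogeneous Poisson process in a rectangular window and scatters a Poisson number of daughters uniformly in a disc of radius R around each parent. The planar window and the neglect of edge effects are simplifications of this example.

```python
import numpy as np

def matern_cluster(lam, mu, R, window=(0.0, 1.0, 0.0, 1.0), rng=None):
    """Simulate a planar Matern cluster process in a rectangular window.

    lam : intensity of the Poisson process of parent points
    mu  : mean number of daughters per parent (Poisson distributed)
    R   : radius of the disc around each parent in which daughters fall
    """
    rng = rng or np.random.default_rng()
    x0, x1, y0, y1 = window
    n_parents = rng.poisson(lam * (x1 - x0) * (y1 - y0))
    parents = np.column_stack([rng.uniform(x0, x1, n_parents),
                               rng.uniform(y0, y1, n_parents)])
    points = []
    for p in parents:
        m = rng.poisson(mu)
        r = R * np.sqrt(rng.uniform(size=m))      # uniform in the disc
        theta = rng.uniform(0.0, 2.0 * np.pi, m)
        points.append(p + np.column_stack([r * np.cos(theta),
                                           r * np.sin(theta)]))
    return np.vstack(points) if points else np.empty((0, 2))
```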
References 1. Matérn, B., Spatial variation. Medd. från Statens Skogsforskningsinst., 49, pp. 1-144, 1960. 2. Neyman, J. and Scott, E.L., Statistical approach to problems of cosmology, Journal of The Royal Statistical Society, 20, pp. 1-43, 1958.
THE USE OF COMPUTATIONAL ANALYSIS TO DESIGN NOVEL DRUGS T. MAVROMOUSTAKOS, P. ZOUMPOULAKIS, M. ZERVOU, I. KYRIKOU, A. KAPOU, N. BENETIS National Hellenic Research Foundation, Institute of Organic and Pharmaceutical Chemistry, Vas. Constantinou, 116 35 Athens, Greece E-mail: [email protected]
Hypertension is a growing undesired symptom which damages health and threatens mostly the developed societies. It is estimated that 20% of the Greek population suffers from hypertension. Research efforts for the control of hypertension are focused on blocking Ang II release and, more recently, on competing with Ang II binding at AT1 receptors. This latter approach generated the synthesis of losartan and promoted it in the pharmaceutical market (COZAAR). Other derivative drugs which fall into the SARTAN class followed. To comprehend the stereoelectronic requirements which may lead to a better understanding of the molecular basis of hypertension, the stereochemical features of angiotensin II, its peptide antagonists sarmesin and sarilesin, synthetic peptide analogs, and AT1 non-peptide antagonists, both commercially available and synthetic, were explored. AT1 antagonists are designed to mimic the C-terminal part of Ang II [1]. In this respect, it is proposed that the butyl chain of losartan may mimic the isopropyl chain of Ile, the tetrazole ring mimics the C-terminal carboxylate group and the imidazole ring the corresponding imidazole ring of His6. The drug design is based on the optimization of superimposition studies of losartan with the C-terminal part of sarmesin [2].
1. Introduction
Hypertension is a growing undesired symptom which damages health and threatens mostly the developed societies. It is estimated that 20% of the Greek population suffers from hypertension. Research efforts for the control of hypertension are focused on blocking Ang II release and, more recently, on competing with Ang II binding at AT1 receptors. This latter approach generated the synthesis of losartan and promoted it in the pharmaceutical market (COZAAR). Other derivative drugs which fall into the SARTAN class followed. To comprehend the stereoelectronic requirements which may lead to a better understanding of the molecular basis of hypertension, the stereochemical features of angiotensin II, its peptide antagonists sarmesin and sarilesin, synthetic peptide analogs, and AT1 non-peptide antagonists, both commercially available and synthetic, were explored. AT1 antagonists are designed to mimic the C-terminal part of Ang II [1]. In this respect, it is proposed that the butyl
chain of losartan may mimic the isopropyl chain of Ile, the tetrazole ring mimics the C-terminal carboxylate group and the imidazole ring the corresponding imidazole ring of His6.
2. Theoretical Calculations
Theoretical calculations constitute a valuable and indispensable tool in the orthologic design and synthesis of novel antihypertensive pharmaceutical candidates that possess a better pharmacological profile in comparison to the existing drugs. Computational analysis is used extensively in our laboratory in order to understand the molecular basis of the disease. In particular, the strategy used in our laboratory is shown below (Fig. 1). The steps in which computational analysis is involved in the orthologic design are briefly summarized below.
2.1. Conformational Analysis. The first step, as shown in Scheme 1, is the conformational analysis. For structure elucidation and for exploring the conformation of a biomolecule it is necessary to resolve the chemical shifts and J-couplings of the NMR spectrum.
2.2. Simulation of NMR spectra. Therefore, complicated spectra in which the parameters of chemical shifts and Jcouplings cannot be resolved need to be calculated through the use of theoretical calculation. In particular, the J-couplings are essential for constructing the preliminary conformational model.
2.3. Energy minimization and conformational search methods. The known experimental techniques give detail information of only one possible bioactive conformer of a drug candidate. Computational analysis provides the barriers of a cluster of low energy conformers and allows comprehending on the stereoelectronic features responsible for their activity.
2.4. Superimposition studies. Superimposition modes permit to view the pharmacophoric segments similarities and differences between the compared molecules. For the interactions of drug with the active site (membrane or receptor) the computational analysis is an essential tool.
(Figure 1 flowchart: Conformational Analysis → Overlay of Sartan with C-terminal Segment of Sarmesin → Design of New Candidates with Optimized Mimicry Based on Drug:Membrane and Drug:Receptor interactions → Test and optimize the bioactivity through the synthesis of new analogs.)
Figure 1. Strategy used in our drug design.
2.5. Drug:Active Site Interactions.
Theoretical calculations are a requisite for building a membrane and receptor environment. Algorithms which involve drug:receptor interactions (docking) and account for the pharmacokinetic parameters that govern drug activity are being developed.
2.6. Quantitative Structure Activity Relationships (QSAR). The design of new leads through QSAR needs computational analysis along with molecular graphics. These two necessary tools in drug design extend our knowledge of the physical-chemical features of pharmacophoric segments and aid in the design of new leads. The application of computational analysis is shown in Figure 2, in which losartan is approaching the AT1 receptor. First, conformational analysis was carried out in order to explore the bioactive conformers of losartan, which is then docked in the membrane with the AT1 receptor embedded. This is achieved using a combination of the computational analysis steps a-c along with 1D and 2D Nuclear Magnetic Resonance experiments. Docking (step d) is used for exploring the active site of losartan. QSAR studies will be applied in the near future.
References
1. De Gasparo, M.; Catt, K.J.; Inagami, T.; Wright, J.W. and Unger, T. International Union of Pharmacology. XXIII. The Angiotensin II Receptors. Pharmacol. Rev. 2000, 52, 415-472.
2. T. Mavromoustakos, V. Apostolopoulos, J. Matsoukas. Antihypertensive drugs that act on the Renin-Angiotensin System with emphasis on AT1 antagonists. Mini Rev. in Curr. Med. Chem. (2001) 1, 207-217.
Figure 2. The approach of losartan onto the AT1 receptor.
ENERGY CONSERVATIVE ALGORITHM FOR NUMERICAL SOLUTION O F ODES INITIAL VALUE PROBLEMS *
E. MILETICS, Department of Mathematics, Széchenyi István University, 9026 Győr, Egyetem tér 1, Hungary, E-mail: [email protected]
The numerical treatment of ODE initial value problems is an intensively researched field. Recently, qualitative algorithms, such as monotonicity and positivity preserving algorithms, have been in the focus of investigation. For dynamical systems the energy conservative algorithms are very important. In the case of Hamiltonian systems the symplectic algorithms are very effective. This kind of algorithm is not adaptive, but they are doubtless powerful. High-efficiency computers and computer algebra software systems allow us to create efficient adaptive energy conservative numerical algorithms for solving ODE initial value problems. In this article an adaptive numerical-analytical algorithm is suggested which can be applied very effectively to Hamiltonian systems, but the idea of the construction is adaptable to other initial value problems too, where some quantity is preserved in time. The idea and the efficiency of the proposed algorithm are presented through simple examples, such as the Lotka-Volterra and linear oscillator problems.
1. The qualitative behaviour of numerical methods
The continuous mathematical models of real physical phenomena in many cases are ordinary differential equations. For the dynamical systems these continuous models know the conservation laws, so its solutions reflect exactly the physical laws, such as conservation of mass, energy, impulse etc. In the most of cases for the continuous mathematical model the analytical solution is not known, and we have to construct some numerical algorithm to get an approximate solution. The first conditions for the approximate solution is to be *This work was partly supported by t43258 grant of the Hungarian National Research Fund
close to the exact solution. The algorithm must be stable and economic, see 6 , ’. The investigation of the numerical methods from the point of view of qualitative behaviour of the approximate solution began with the construction of these methods, but in the last ten years this field of research became more intensive. The main examined qualitative properties are: conservation of physical quantities. For example when modelling transport processes, Lagrangian methods generally conserve physical quantities such as mass, moment or vorticity. Numerical difficulties arise, however, at diffusion (dispersion) steps. An optimal compromise is the use of quadtrees generated by the moving Lagrangian points in each time step, see for details. Other important preserving quantities are monotonicity, positivity ‘. Recently the conservation of the energy is in the focus of research for Hamiltonian systems 3. The symplectic algorithms are very ingenious for the solution of this problem, and the collocation type, such as implicit Runge-Kutta algorithms are well-examined from this point of view. The appearance of the computer algebraic system such as MapleV lo1 give new alternative technology and methodology of investigation for such problems. The classical Taylor series algorithms recently can be practically used by these systems. For more complicated problems the formal Taylor series algorithm can be combined with numerical derivatives 5 , g . The investigation of this algorithm led us to the observations, that using a family of explicit Taylor series algorithm several adaptive energy conservative algorithms can be constructed. In the next sections we present these results, and propose an energy conservative algorithm quite different from the symplectic ones. 2. Examples of the Hamilton systems and behaviour of the Taylor series algorithms
The attention in this work will be addressed to Hamiltonian problems. These problems are of the form:
where the Hamiltonian H(p1, . . . , p , , q l , ..., qn) represents the total energy, qi are the position coordinates and pi are the moments for i = 1,..., n. The Hamiltonian:
is an invariant of a first integral.
2.1. The Lotka-Volterra model
The Lotka-Volterra model is not a typical mechanical Hamiltonian system. The differential equations are:
\frac{d}{dt}u(t) = u(t)\,(v(t) - 2), \qquad \frac{d}{dt}v(t) = v(t)\,(1 - u(t)),
(3)
and the initial conditions:
v(0)= vo,
u(0) = u o ,
(4)
where u(t) is the number of predators, v(t) is the number of prey. The Hamilton function of system, which must be constant:
H(u(t), v(t)) = \ln(u(t)) - u(t) + 2\ln(v(t)) - v(t),
(5)
and the preserved quantity:
H_0 = \ln(u_0) - u_0 + 2\ln(v_0) - v_0,
so the Hamiltonian’s delay defined by:
H_d = H(u(t), v(t)) - H_0,
(7)
must be zero for every t for the exact solutions. 2.2. The linear oscillator problem The following formulae describe the simple linear oscillator (where m is the mass and k is the spring constant):
and the initial conditions: P ( 0 ) = Po, The appropriate Hamiltonian is:
and the preserved quantity:
d o ) = Qo.
(9)
Figure 1. The Taylor series approximations of the Hamiltonian (10) of the linear oscilator problem for large interval
3. The behaviour of the approximate Hamiltonian obtained by Taylor series algorithms
The above examples are ODES initial value problems. Numerical solution of these problems can be obtained by any numerical algorithm, but most of the common used algorithm, such as R-K or multistep, or other ones is not suitable to follow the preservation the value of Hamiltonian. Here we present, that by the Taylor series algorithm it is easy to study the qualitative behaviour of the Hamiltonian of approximate solution. This observation led us to the idea, how the adaptive numerical algorithm can be constructed for preserving the initial value of the Hamiltonian. The Taylor series approximation of the solutions of the initial value problems (3), (4)and (8), (9) are created formally by MapleV system up to order 10. These truncated Taylor series approximations are substituted into the Hamiltonian (5) and (10). These expressions give the approximate Hamiltonians, which are the continuous functions of the independent variable t . These functions can be analysed by analytic tools, and by this analyse we can study its qualitative behaviour. On the Figure 3. first we show the global behaviour of the Taylor series a p proximation of the delays of Hamiltonian (10) from the preserved quantity Ho (11) of the linear oscillator problem up to order 10 (the order of the method is indicated by the number (i), i = 2,3, ..., 10 near to the graph). It
Figure 2. The Taylor series approximations of the delay of Hamiltonian (10) from HO for the linear oscillator problem
is clear that the approximation of the Hamiltonian is good near the initial
point and became worst at some distance from this point. This effect depends also on the order of approximation applied. It is seen that the error of every approximation tends to infinity as the distance from the initial point tends to infinity. But very important observation is that the Taylor series approximation of the delay of Hamiltonian from the Ho has some number of roots near to the initial point and this qualitative behaviour is different for the different order of Taylor approximation. It means, that there are some points near to the initial points, where the approximate solution preserve the value of the Hamiltonian. If we use a family of the different order methods, there are several number of such points. To know, how many points there are near to the initial point we have to generate more detailed figure. On the following pictures we visualize the approximate Hamiltonian. On the Figure 3. we can see the Taylor approximations of the Hamiltonian (5), while on the Figure 3 the Hamiltonian (10) around the initial values to study the fine structure of these approximations On the Figure 3. it seems, that near to the initial point there are several root of the Hamiltonian’s delay approximation. The other experiments have shown, that the grid points of the zero values depends on the constants in equations (here for example the mass and and they can be quite close to the initial values).
Figure 3. The Taylor series approximations of the delay of Hamiltonian (5) from HO for the L o t h Volterra problem
From the Figure 5. it is obvious, that the behaviour of the approximation of the Hamiltonian’s delay for the L o t h Volterra problem is similar to the linear oscillator case. We have to remark, that numerous other numerical experiments have confirmed these observations. This was the starting point to develop a numerical algorithm which in every step preserves the initial value of the Hamiltonian (zero delay at the grid points). 4. The adaptive algorithm for assuring desired local error and preserving the initial value of Hamiltonian
The above observations led us to the idea that, using a variable order family of Taylor series algorithms, the classical adaptive error control can be combined with the control of the zero value of the Hamiltonian's delay of the approximate solution. The proposed algorithm, in outline, is as follows: (1) initialise the variables x0, Order := 2, the tolerance for the local error control, Maxorder, the initial values and the endpoint; (2) calculate the truncated Taylor polynomial of the problem with accuracy given by Order; (3) using some step size control algorithm, determine an admissible step size h; (4) evaluate the Hamiltonian's delay and determine its zero points in [x0, x0 + h];
(a) if any zero point was found, choose the largest one, whose value gives the new step size h_new, and set x0 := x0 + h_new; if x0 >= endpoint go to 7, else go to 5; (b) if there are no zero points, Order := Order + 1; (c) if Order <= Maxorder go to 2, else go to 6;
(5) the approximate solution is adopted for the actual interval with zero Hamiltonian's delay; evaluate the new initial values and go to 2; (6) choose, from the approximate solutions of different orders, the one which gives the minimal Hamiltonian's delay in the interval [x0, x0 + h]; it is adopted as the approximate solution in the actual interval; evaluate the new initial values, set x0 := x0 + h, and if x0 >= endpoint go to 7, else go to 2; (7) end. One step of this loop is sketched below.
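The callable taylor_poly stands for the truncated Taylor polynomial produced by the computer algebra system, and the step size h is assumed to have been proposed already by the classical error control of step 3; the grid scan and bisection used to locate the zeros of the Hamiltonian's delay are implementation choices of this illustration, not the authors' code.

```python
import numpy as np

def hamiltonian_preserving_step(taylor_poly, hamiltonian, t0, y0, h, H0,
                                max_order=10, n_scan=200):
    """One step of the adaptive loop above (steps 2-6, simplified).

    taylor_poly(t0, y0, order) -> callable y(t), the truncated Taylor
    approximation started at (t0, y0) (placeholder for the formal series).
    hamiltonian(y) -> value of H; H0 is the conserved initial value.
    Returns the accepted step size and the new state.
    """
    for order in range(2, max_order + 1):
        y = taylor_poly(t0, y0, order)
        ts = np.linspace(t0, t0 + h, n_scan + 1)[1:]      # skip t0 itself
        delay = np.array([hamiltonian(y(t)) - H0 for t in ts])
        sign_change = np.nonzero(delay[:-1] * delay[1:] <= 0.0)[0]
        if sign_change.size:                  # a zero of the delay was found
            k = sign_change[-1]               # take the largest root, step 4(a)
            a, b = ts[k], ts[k + 1]
            for _ in range(60):               # bisection refinement
                m = 0.5 * (a + b)
                if (hamiltonian(y(a)) - H0) * (hamiltonian(y(m)) - H0) <= 0.0:
                    b = m
                else:
                    a = m
            t_new = 0.5 * (a + b)
            return t_new - t0, y(t_new)
    return h, y(t0 + h)                       # step 6 fallback: accept full step
```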
5. The result of numerical experiments
The above algorithm description is drawn in outline. The program for its realization was written in the MapleV system because we have to use several built-in functions, such as dsolve(), fsolve(), maximize(), minimize(), max(), min(), the plots library, and many other formula manipulation capabilities. This is why we can state that the above algorithm is an analytic-numeric algorithm. The results of the realized algorithm confirm the usability of the idea described above. Some of the results of the numerical experiments are demonstrated in Figure 4. From Figure 4 it is clear that the algorithm works well, and the exact and the calculated solutions are very close to each other. But this is only visualization. Much more interesting results can be obtained when we look at the values of the Hamiltonian's delay at the grid points. In our numerical experiments we have seen that on almost every subinterval the proposed Hamiltonian correction step works, and there are only some subintervals where we have to accept the approximate solution without correction.
Figure 4. The results of the numerical experiments for linear oscillator problem (8) and the exact solutions
6. Conclusions
In this work we presented an adaptive numerical algorithm based on the formal Taylor series algorithm. In this algorithm we combined a classical adaptive step size control with a Hamiltonian's delay correction step. This correction is based on some properties of the continuous approximation of the Hamiltonian and its qualitative behaviour, namely that the approximations obtained by using a family of Taylor series methods of different orders have some zeros in the interval accepted by the classical error control algorithm. Because the Hamiltonian approximation is a real function of one variable, these zeros can be localized by a numerical algorithm. The next grid point is then chosen as such a zero point, for which the approximate solution satisfies the accuracy condition and at the same time the Hamiltonian-preserving property. The algorithm presented is a relatively primitive version of the possible algorithm, but its results show that the proposed idea is employable. We are aware of the primitiveness and roughness of the presented algorithm, but it can be examined more precisely and more accurately in the future. Our numerical experiments have also shown some interesting effects which have to be investigated; these were only delicate numerical effects which modified the run of the algorithm but did not contradict the main idea. Working the algorithm over, these effects can also be used to improve it.
426
References 1. E. Hairer, S. P. Norsett, G. Wanner, Solving Ordinary Differential Equations,
I. Nonstiff Problems, Springer Verlag, 1987. 2. E. Hairer, S. P. Norsett, G. Wanner, Solving Ordinary Differential Equations, II. Stiff and Differential-Algebraic Problems, Springer Verlag, 1991. 3. E. Hairer, C. Lubich, G. Wanner, Geometric Numerical Integration, Structure-Preserving Algorithms for Ordinary Differential Equations , Springer Verlag, 2002. 4. Z. Horvbth, Consistency and stability for some nonnegativity-conserving methods, Applied Numerical Mathematics, 13,371-381, 1993. 5. G. Molnbrka, B. RBczkevi, Implicit single step method by spline-like functions for solutions of ordinary differential equations, Comput. Math. Applic., 16(9), 701-704, 1988. 6. M. Crouzeix, A.L. Mignot, Analyse numerique des equations differentielles, MASSON, Paris, 1984. 7. A. Gibbons, A program for the automatic integration of differential equations using the method of Taylor series, Computer J. 3, 108-111, 1960. 8. C. Gaspar, J. Jozsa, Two-dimensional Lagrangian Flow Simdation Usi n g Fast, Quadtree-based Adaptive Multigrid Solver, Proceedings of the 9th GAMM Conference, Lausanne, Switzerland, 25-27 September, 1991. Vieweg Verlag, 1992. 9. P. E. Miletics and G. Molnbrka, Taylor Series Methods with Numerical Derivatives for Initial Value Problems, in Computational and Mathematical Methods on Science and Engineering, Vol. I. eds.: J. Vigo Aguiar and B. A. Wade. Proc. of CMMSE-2002, Alicante, Spain, pp.258-270. 10. G. Molnarka, et al. The MapleV and its application (in hungarian), Springer Hungarica Kiad6 Kft. Budapest, 1996. 11. A. Heck, Introduction to Maple, Springer-Verlag New York Inc,. 1996. 12. H. Yosida, Construction of higher order sympbectic integrators, Phys. Lett. A 150, 262-268, 1990.
ENZYMATIC SPIN CATALYSIS INVOLVING
0 2
BORIS F. MINAEV AND HANS AGREN Laboratory of Theoretical Chemistry, The Royal Institute of Technology, SCFAB Roslagstullsbacken 15, Stockholm 10691, Sweden
Intermolecular interactions between dioxygen and other molecules induce a number of interesting optical and biochemical phenomena A concerted insertion of the ground state dioxygen into an organic (diamagnetic) molecule is a spin-forbidden reaction Biological systems activate triplet dioxygen for controlled chemical syntheses via electron-transfer reactions and exchange interactions Manifestations of similar interactions can be found in the oxygen-induced fluorescence quenching of organic dyes and other photophysical processes 8 . The advantages of aerobic life and oxidative metabolism were important factors during evolution: These are mainly connected with the high exothermicity of oxidation of organic molecules. At the same time life strongly depends on the kinetic barriers to the molecular oxygen reactions. 0 2 is used as an oxidant in respiration of mammals and oxidative metabolic processes, reducing food to water and carbon dioxide. Although these transformations are equivalent to combustion, the oxygenase enzymes control the specific reaction pathways that store and smoothly release energy by subtle spin selective processes, whereas combustion is a radical chain process. It requires a high temperature initiation step and developes like explosion without strong regulation of energy release 7,9. The kinetic constraints, which allows the controlled use of dioxygen by aerobic life in the presence of strong thermodynamic drive, are determined by spin selective interactions between dioxygen and oxygenase enzymes ‘. The first step in understanding these constraints should be connected with analysis of photophysical processes induced by 0 2 interaction with solvents and gases ’. The oxygen molecule possesses a triplet ground state, X 3 C ; , and two low-lying singlet excited states, alA, and blC:, which lie only 0.98 and 1.63 eV, respectively, above the ground term; thus the singlet-triplet (S-T) transitions in the visible (760 nm) and near infrared (1270 nm) regions are ll2i3
435,6.
’.
4y596.
427
428
the most prominent features in atmospheric spectroscopy in spite of their forbidden character. 0 2 collisions with diamagnetic molecules induce numerous S-T transitions in organic chromophores by exchange perturbations and vise versa organic solvents enhance the singlet oxygen emission by a very specific mechanism lo. The oxygen molecule itself has quite specific spin-orbit coupling (SOC) and S-T transition probabilities. The key point is presented by SOC induced mixing of the close lying S-T states @b = @(blC:)
-k
c b , X @ ( x 3 ~ ~ o )
(1)
and
where
Here @(X3Cg,0)corresponds to R = 0, where R is the total electronic angular momentum quantum numbcr R = A C. In a simple approximation the SOC matrix element in the numerator of Eq. (3) is equal to to -i
+
"1'2>'3
(@a
I F I @ x , o ) = - c ; , X ( ~ ( a ' ~ g I) F I Q(~'c:)),
(4)
Here T is an arbitrary multipole in the expansion of the radiation field. In the free 0 2 molecule f is a quadrupole (Q). The b - a transition has an electric-quadrupole moment, Q b - a = 1.005 eai 16i11, which determines a corresponding a - X , 0 transition quadrupole moment: Qa-X,O = - C ; , X Q b - a = -i 0.0135 ea;. This electric-quadrupole contribution to the intensity of the a - X , 0 transition is practically negligible; Aa-x,o = 5 x s-' In free 0 2 molecule the a - X emission band (1270 nm) consists practically from magnetic radiation (transition a - X , 1 to the R=1 spin sublevel).
''Y'~.
429
Analysis of the response of the open 17ri shell of the 0 2 molecule to intermolecular perturbations in collision complexes with other molecules indicates that the collision-induced electric-dipole moment of the b- a transition is the characteristic feature of such complexes. Thus the radiative decay of singlet molecular oxygen, alAg + X 3 C ; , is very sensitive to collisions in gas phase ,',l to weak intermolecular interactions in solid noble gas matrices l7 and in liquid solvents since the the a - X emission band borrows its intensity from the collision-induced b - a transition. Now an arbitrary multipole T in Eq. (4) has to be changed by an electric dipole moment operator b. The collision-induced electric dipole transition moment Da-b = (Q(alA,) I b I Q(blC$)) is calculated to be as much as 0.01 - 0.1 eao s,21 being dependent on the polarizability of the organic solvent. The Einstein coefficient for the singlet oxygen emission a1Ag+X3C; at X = 1270 nm and b1C,f+X3C; emission at X = 750 nm have been calculated in ourlaboratory by the quadratic response (QR) multiconfiguration self-consisted field (MCSCF) method for a number of collision complexes 0 2 M, where M = He, Ne, Ar, Hz, Nz, CzH4, CsH6, NO, 0 2 . Calculations of the dipole transition moment for the Noxon band, blC,f - alas,by the linear response (LR) MCSCF method were also performed for a number of collision complexes. Spin-orbit coupling between the blC,f and X 3 C ; ( M s = 0) states does not change much upon collisions, thus the a-X transition borrows intensity mostly from the collision-induced Noxon band b-a. The a-X intensity borrowing from the Schumann-Runge transition is negligible. The calculations show that the b - a and a - X transition probabilities are enhanced approximately by lo5 and lo3 times by 0 2 M collisions. An order of magnitude difference occur for both transitions for noble gases with large differences in polarizability. Geometry optimization of the 0 2 M complexes in a number of excited states of the oxygen moiety has permitted us to calculate vibrational frequencies and the infrared absorption intensity induced by collision. These quantities are also discussed here since they correlate with the electronic transitions enhancement. The infrared absorption is forbidden for homonuclear diatomic molecules. At the same time a weak collision-induced infrared absorption of 0 2 near 6.4 pm (1563 cm-l) is well known from atmospheric spectra '. We have studied the dependence of this collision-induced IR intensity on the polarizability of the collider and on geometry of collision. The collision-induced infrared intensity is studied for a number of states of the 0 2 molecule and is proved to be strongly dependent on the specific state nature. 18y19320
+
+
+
430
The collision-induced vibrational and electronic transitions intensity in infrared, visible and ultra-violet regions are thus important characteristics of intermolecular interactions between dioxygen and solvents They depend on specific intermolecular overlap of wave functions, on ionization potential and on solvent polarizability 21,10. A very instuctive example provides the singlet oxygen quenching by amines 23. Molecules with low oxidation potentials quench 02(a1A,) with a high rate constant, which correlates with the ionization potential. This charge-transfer (CT) induced process has been explained by a strong increase of SOC mixing between XsEg and alAg states when C T admixtures of different orbital nature nr, and nry occur in the triplet and singlet states 23. Similar requirements are fulfilled in some oxygenase enzymes 4 . At the stage of the superoxide ion production (OF) by electron transfer from an organic donor molecule (M) a relatively strong probability of electron spin flip (the triplet -+ singlet nonradiative transition, or intersystem crossing) occurs in the radical pair Mf...O, (charge transfer state) 15,23. The mechanism of the strong SOC enhancement at this stage is similar in nature to the mechanism for the singlet alA, oxygen quenching by amines in the gas phase It originates from SOC in the 0, moiety: the electron spin in the donor radical ion M+ moiety is "passive" in this radical pair and the triplet + singlet spin flip occurs just in the superoxide ion. Dioxygen interaction with unsaturated hydrocarbons leads to intermediate biradicals, but SOC a t the biradical stage is negligible The only effective way to overcome spin prohibition and to make dioxygen reactive in respect to diamagnetic species without radical chain initiation processes is to involve the charge transfer step in the first elementary act. Even partial charge transfer character induced by collision of 0 2 with NH3 (and with other amines) leads to nonzero SOC mixing between X 3 C ; and alAg states, which depends on the ionization potential of the amine 23. This mechanism is applied here t o dioxygen activation by glucose oxidase. In copper amine oxidases (and in tyrosine hydroxylase and in lypoxygenase) the additional mechanism of intersystem crossing (ISC) in the radical pair M+ ...0, is more important: This is the exchange interaction between nonpaired spin in the metal ion with the spin of the 0, anion. There is no chemical bonding between metal ion and oxygen 25, but spin exchange still exists. Spin exchange between colliding radicals constitutes a well known dynamic process in solvents; it is detected by the line shape of E P R signals and is induced by exchange interactions 26. Thus the study of physical processes in 0 2 collisions provides an important help in understanding spin 22i8.
23324.
7915.
431
catalysis in oxygen activation by enzymes.
References 1. E.H. Fink, K.D. Setzer, J. Wildt, D.A. Ramsay, and M. Vervloet. Int. J. Quant. Chem., 39:287, 1991. 2. J. Wildt, E.H. Fink, P. Biggs, R.P. Wayne, and A.F. Vilesov. Chem. Phys., 159:127, 1992. 3. F. Thibault, V. Menoux, and R.L. Doucen. Appl. Optics, 36:563, 1997. 4. B.F. Minaev. Ukrainian J . Biochem., 74:11, 2002. 5. R. Prabhakar, P. Siegbahn, B. Minaev, and H. Agren. J. Phys. Chem. B, 106:3742, 2002. 6. P. Siegbahn, R. Prabhakar, and B. Minaev. Biochim. Biophys. Acta, 1647:173, 2003. 7. B.F. Minaev. RUS.J . Struct. Chem., 23:170, 1982. 8. B.F. Minaev, K.V. Mikkelsen, and H. Agren. Chem. Phys., 220:79, 1997. 9. B.F. Minaev. Khim. Fizika, 3:983, 1984. 10. B.F. Minaev and H. Agren. J . Chem. SOC.Faraday Trans., 93:2231, 1997. 11. R. Klotz, C.M. Marian, S.D. Peyerimhoff, B.A. Hess, and R.J. Buenker. Chem. Phys., 89:223, 1984. 12. R. Klotz and S.D. Peyerimhoff. Mol. Phys., 57:573, 1986. 13. B.F. Minaev, 0. Vahtras, and H. Agren. Chem. Phys., 208:299, 1996. 14. P.H. Krupenie. J. Phys. Chem. Ref. Data, 1:423, 1972. 15. B. F. Minaev. Theoretical analysis and prognostication of spin-orbit coupling effects in molecular spectroscopy and chemical kinetics. Dr. Sc. Thesis, N.N. Semenov Institute of Chemical Physics, Moscow. (in Rus.), 1983. 16. E.B. Sveshnikova and B.F. Minaev. Opt. Spectrosc., 54:320, 1983. 17. G. Tyczkowski, U. Schurath, M. Bodenbinder, and H. Willner. Chem. Phys., 215:379, 1997. 18. Jr. A.A. Krasnovsky. Chem. Phys. Lett., 81:443, 1981. 19. P.R. Ogilby and C.S. Foote. J . Am. Chem. SOC.,105:3423, 1983. 20. P.K. Frederiksen, M. Jorgensen, and P.R. Ogilby. J. Am. Chem. SOC., 123:1215, 2001. 21. B.F. Minaev and G.K. Mambeterzina. Oxygen complexes with naphthalene and decapenthaene studied by mindo ci method. In G.A. Ketsle (Ed). Photoproceses in Atomic and Molecular Systems, page 35. Karaganda State University, 1984. in Russian. 22. B.F. Minaev, V.V. Kukueva, and H. Agren. J . Chem. SOC.Faraday Trans., 90:1479, 1994. 23. B.F. Minaev. Theor. Exp. Chem. (USSR). Panslated by Plenum Publ. Corp., 20:199, 1984. 24. E.A. Ogryzlo and C.W. Tang. J. Am. Chem. SOC.,92:5034, 1970. 25. J.P. Klinman. J . Biol. Inorg. Chem., 6:1, 2001. 26. D. Kivelson. J . Chem. Phys., 33:1094, 1960.
A NEW CLASS OF METHODS FOR SOLVING ORDINARY DIFFERENTIAL EQUATIONS
N. MOIR Department of Mathematics The University of Auckland Private Bag 92019 Auckland New Zealand Almost Runge-Kutta methods are a subclass of the family of methods known as general linear methods, used for solving ordinary differential equations. They retain many of the properties of traditional Runge-Kutta methods, with some added advantages. The higher stage order enables cheap error estimators to be obtained. For some orders it also means a reduction in the number of internal stages required to obtain that order. We will introduce these methods and present some recent results.
1. Introduction Traditionally mathematicians have used one of two different classes of methods t o numerically solve ordinary differential equations. These are known as Runge-Kutta methods and linear multi-step methods, respectively. Runge-Kutta methods calculate a number of internal stages and then use this information to take a step forward in time, discarding all previous information. Due to the fact these methods are one-step methods, it is very easy to change the step-size being used to follow the behaviour of the solution. However, it is expensive to get an accurate error estimator for them. A cheap error estimator is desirable to tell us what the size of the next step should be. By a “cheap” error estimator we mean a formula for approximating the local truncation error in a step without requiring additional function evaluations. This is not possible for high order explicit Runge-Kutta methods, where it is necessary to add extra internal stages to the method to obtain an error estimator. Linear multi-step methods take a very different approach. They use information from previous steps t o take a step forward in time, but only compute a single derivative value. This makes it difficult to change the step432
433
size being used. They are cheaper to use than Runge-Kutta methods, but they they have a smaller stability region. They also have the disadvantage of being difficult to start. In the first few steps of the computation we do not have sufficient information to apply the method. One way of getting around this problem is to take the first few steps using a Runge-Kutta method of the same order. Each of these classes of methods can be sub-divided into explicit and implicit methods. Explicit methods allow the next step to be calculated using information calculated at previous time steps. Implicit methods require information about the derivative at the current time to calculate the solution. This means an iteration scheme, such as Newton iteration, needs to be used at each time step. Explicit methods are obviously much cheaper to use, but are not able to handle stiff problems. There is no generally accepted definition of stiffness but the phenomenon can be recognised when is occurs. That is, if stability rather than accuracy dictates the step-size for a problem, then the problem has to be regarded as stiff. We are interested only in non-stiff problems in this paper. We would like to enjoy the benefits of Runge-Kutta methods (e.g. stability and ease of changing step-size) but also some of the advantages of linear multi-step methods. The idea is to use a method like a Runge-Kutta method, but with more information passed between steps. This was first proposed in the form of “Pseudo Runge-Kutta methods” [l].Another approach, that of “two-step Runge-Kutta methods”, was proposed in [2]. The approach used in this paper, of ARK methods or “Almost RungeKutta methods”, follows the papers [3], [4] and [5]. This class of methods is in fact part of the family of methods known as general linear methods, which were first formulated in [6]. Although general linear methods were designed to be a unifying framework for Runge-Kutta methods and linear multi-step methods, it has always been hoped that new methods would arise from this general formulation.
2. Almost Runge-Kutta methods
The idea of these methods is to retain the multi-stage nature of RungeKutta methods, but allow more than one value to be passed from step to step. This gives the methods a multi-value character. Of the three input and output values in “ARK methods”, one approximates the solution value and the other two approximate scaled first and second derivatives respectively. To make it easy to start the methods, the second derivative
434
is required t o be accurate only within O(h3). The method has inbuilt “annihilation conditions” to ensure this low order does not adversely effect the solution value. The general form of ARK methods is
A B
I‘
V
where s is the number of internal stages. For ease of computation A is lower triangular. This means the methods are not suitable for stiff problems. The coefficients of the method are chosen in a careful way to ensure the simple stability properties of Runge-Kutta methods are retained. The stability matrix of an ARK method, given by
M ( z ) = V + z B ( I - zA)-lU, possesses the “RK-stability” property of only one non-zero eigenvalue. The remaining eigenvalue is equal to the truncated exponential series. An example of a fourth order, four stage method is 0
’1
=
V
0
0
0
1
1
L16 O O O l T l , 1
2
0
g i
--
0
0
1-3
011
2
; $
-+
0
o g Q o 1 $
0
0
0
0
;
0-g
_-
0
1
0
0
2 0-1
0
The four internal stages approximate y(z,-l+ cih), where c = [I,$,I, 1IT. Since the early papers [3] and [4] a great deal has been discovered about these methods. In particular, a special family of fourth order methods has
435
been developed with zero fifth order coefficients. This means the methods behave as if they were fifth order for fixed step-size. If they are implemented in a special way this result can be extended to variable step-size. Numerical experiments have been performed on many evolutionary problems, such as ordinary differential equations and delay differential equations. Preliminary results are promising, showing that the methods perform as well, if not better than, traditional Runge-Kutta methods for fixed stepsize implementation. They then have the added feature of a cheap error estimator, ensuring they perform well for variable step-size implementations. The high stage order also means an interpolator can be found for these methods. A consequence of this is that they perform well on delay differential equations as well.
References 1. G. D. Byrne and R. J. Lambert, Pseudo Runge-Kutta methods involving two points, J. Assoc. Comput. Mach. 13 (1966), 114-123. 2. Z. Jackiewicz, R. Renaut and A. Feldstein, Two-step Runge-Kutta methods, SIAM J. Numer. Anal. 28 (1991), 1165-1182. 3. J. C. Butcher, A n introduction to “Almost Runge-Kutta” methods, Applied Numerical Mathematics, 24 (1997), 331-342. 4. J. C. Butcher, ARK methods up to order five,Numerical Algorithms, 17 (1998), 193-221. 5. J. C. Butcher and N. Moir, Experiments with a new fijlh order method,
Numerical Algorithms, to appear. 6. J. C. Butcher, On the convergence of numerical solutions of ordinary differential equations, Math. Comp. 20 (1966), 1-10.
IMPLICIT EXTENSION OF TAYLOR SERIES METHOD FOR INITIAL VALUE PROBLEMS
G. MOLNAR.KA Department of Mathematics, Sze‘chenyi Istvrin University, Gyoq Hungary E-mail: [email protected] The Taylor series method is one of the earliest analytic-numeric algorithms for approximate solution of initial value problems for ordinary differential equations. The main idea of the rehabilitation of these algorithms is based on the approximate calculation of higher derivatives using well-known technique for the partial differential equations. The approximate solution is given as a piecewise polyne mial function defined on the subintervals of the whole interval. This property offers different facility for adaptive error control. This paper describes several explicit Taylor series with implicit extension algorithms and examines its consistency and stability properties. The implicit extension based on a collocation term added t o the explicit truncated Taylor series. This idea is different from the general collocation method construction, which led t o the implicit R-K algorithms l 3 It demonstrates some numerical test results for stiff systems herewith we attempt t o prove the efficiency of these new-old algorithms.
1. Introduction
The Taylor series algorithm is one of the earliest algorithms for the approximate solution for initial value problems for ordinary differential equations. Newton used it in his calculation and Euler describe it in his work. Since then one can find many mentions of it such as J. Liouville, G. Peano, E. Picard. Many authors have further developed this algorithm, see for example A. Gibbons 6 , and R. E. Moore The basic idea of these developments was the recursive calculation of the coefficients of the Taylor series. Modern numerical algorithms for the solution of ordinary differential equations are also based on the method of the Taylor series. The overview of the modern algorithms on can find in the monograph of E. Hairer, S. P. Norsett and G. Wanner ’, A possible implicit extension of the Taylor series algorithm is given in 4 , and ’. Actually, in the qualification of the algorithm become important their quality properties such as conservativity, pozitivity preserving 3 , monotonity preserving, detecting
’.
’.
436
437
and following the bifurcation path. The algorithms based on Taylor series are efficient tools for investiagtion of algorithm with quakitative properties. By appearing the parallel computers as the complexity of algorithm new cost functions must be defined because in this case the main goal is to minimize the execution time and not the number of function evaluations. From this point of view several variant of the method of the Taylor series with numerical derivatives could be an effective algorithm 12. On the other hand, the most of the validated numerical methods for the solution of ODE initial-value problems based on Taylor series, see u . In this paper we propose an implicit version of Taylor series algorithm. 1.1. Formulation of the problem The problems to be solved are as follows: y'(x) = f ( x , y ( x ) ) ,
y(x0) = yQ<
(1)
where: z £ [XOXQ+T], y(x) = [yi(x),y2(x),...,yn(x)}T yi(x)
:R^Rn,
e C?+1([x0,x0 + T})} i = 1, ...,n
for a given p and:
f(x,y(x))=[f 1 (x, y(x)), /2(x, y(z)),..., fn(x, y(x}}]T. Let us introduce the following notations: Y(x] = [x,yi(x),...,yn(x)]T,
F(Y(x)) =
[l,fl(Y(x)),...,
(2) By this notation the equation (1) is as follows: Y'(x)=F(Y(x)),
Y(x0) = [x0,yi(x0),y2(x0),...,yn(x0)}T.
(3)
By this step we deduced the well known fact that every nonautonomous system can be transformed into autonomous one, but in this paper, from the point of view of the realization of the proposed algorithm, the notation has some practical significance. Using this notation the Taylor series of the solution of (1) is:
Y(x0 + a) = Y(x0) + Y (x0)s + ~Y"(x0)s2 + ... + O(s^+1),
(4)
438
where 0 denotes well known ordo function, and value of p might be chosen depending on the smoothness of the right side in (3) and the desired order of the method to be obtained. 2. Taylor series methods
The classical Taylor series algorithms has a disadvantage, that the for higher order methods the calculation of the higher order derivatives of the right hand side of 2 is too complicated, and in some cases, when the analytical form of the right hand side is not known. These problems can be avoid using numerical approximations of the derivatives. The main idea of the construction of the method of Taylor algorithm with numerical derivatives is the numerical approximation of the derivatives Y ( i ) ( z ~ i= ) , 2 , 3 , 4 , ... . For higher order derivatives we have to approximate several partial derivatives lo. In l2 we have given some construction for approximations of the derivatives of the rleft hand side F ( Y ( z ) )of (3) defined in (2). We get matrices (linear form for first derivatives) bilinear, threelinear etc. forms (for second, third etc derivatives). One further advantageus property of such approximation is that its calculation can be made fully parallel. By summarizing the results from l2 we can construct some truncation of Taylor's series of the Y solution of (1) as an approximate solution at a given subinterval [ z k , z k + l ] . So the following expression gives a fourth order explicit algorithm:
where the appropriate expreessions giving the derivatives 2 , 3 , 4 are given in l 2
x(zk),
i =
2.1. Implicit Taylor series methods up t o fifth order
The explicit Taylor series methods can be used for the construction of implicit algorithm. The idea of the construction is that the explicit Taylor series truncations could be augmented by one or more extra terms see '. Here we formulate the implicit extensions with one extra term as follows. Let us denote by ? ( a k ) ( s ) the Taylor series truncation of the solution of (1) obtained by the above mentioned explicit methods. Then: k-1
.
439
where the vector a k is to be determined. The condition for determination of a k is that in some collocation point, denoted by scoll,the Y ( z k ) ( a k ) ( s ) expression must to satisfy the original equation (1). By substituting (6) into (1) we get: y ' ( x k ) ( a k ) (scoll)
(7)
F(Y(xk)(ak)(scoll)).
On the left hand side of (7) the derivation can be performed explicitly. So we get:
k = 2,3,4,5.
Let us introduce the following notation:
and
The simple iteration algorithm for determining the unknown vector X is as follows: X(l+')
= F(Tk-l(Scoll)
+ -x(')) - L)Tk-l(scoll), k ~COLL
1
0 , 1,2 . . .
(9)
where X(O) is given. The following theorem states that for small enough scoll the iteration (9) for every X(O) value is convergent. Theorem 2.1. Let u s assume, that the right hand side of (1) satisfy the following Lipschitz condition: IIF(Y1) - F ( Y 2 ) I l
I LIIYl - Y 2 l l .
(10)
T h e n if V L < 1 the equation (7) has a n unique solution and the iteration ('9) converges t o this solution f r o m every initial value X(O).
Remark 2.1. We see that the iteration is convergent for small enough scoll value and by comparison with the classical implicit algorithm this scoll value can be k times greater see (9). The other advantage of this iteration is that for the initial values for iteration X(O) = 0 in most cases is a good choice because the solution must be near zero.
440
2.2. The consistency order of the implicit methods q + l ] is Y(xi + s) The exact solution of the problem (1)in the interval [xi, while the approximate solution obtained by the proposed implicit methods (7) is ?(Xi s).
+
Theorem 2.2. If the right hand side of ( 1 ) has continuous derivatives up t o (k 1) order then consistency order of the proposed implicit algorithms is k 1, that is:
+ +
while for the s E (xi,scoll) the following estimation holds:
2.3. The stability analysis of the algorithms
The linear stability analysis of the implicit Taylor series algorithm is similar as the analysis of the other implicit algorithms. The transfer or stability function for the above proposed algorithms can be calculated easily and we give them in the following table, Table l., where we use the As = z notation. The-method order
The transfer function
+
24+182+6z 24-2 120+96z+362 +8z + z 120-2
From the Figure 1. it is clear that the implicit methods of order 3, 4 and 5 are conditionally stable algorithms and its stability conditions are very near to each other. (On the Figure 1. the curves are almost similar.) The stability function of the second order implicit Taylor series algorithm is equal with the stability function of the well known implicit trapezoidal formula, therefore it is unconditionally linear stable algorithm.
441
Imz
Figure 1. The stability regions of the methods; (3)-third order, (4)-fourth order, ( 5 ) fifth order methods
3. Computer implementation and test results
The algorithms proposed above can be regarded as analytic-numeric algorithms because they define the approximate solution as a piecewise polynomial. Therefore we have decided to realize these algorithms by using computer algebraic system, namely the MapleV. The realization of the Taylor’s series algorithm for an equation by MapleV is not a complicated task, therefore we have worked out an implementation for the system of equations.
3.1. The test p r o b l e m To test the efficiency of the proposed algorithm we used the well known two body problem l. Y;(t) = Y3(t)Y3(t)r
&(t)= -Yl(t)Y3(t), Yl(0) = 0 ,
y2(0) = 1,
This model has a periodic solution.
Yj(t)
=
y3 =
1.
-0.5lYl(t)yz(t),
442
3.2. Automatic e r r o r control a n d numerical complexity
Usually the arithmetic complexity of the algorithms for numerical solution of the initial value problems of ordinary differential equations measured by the number of function evaluation by step. One of the advantage of the above proposed algorithms is that the truncated Taylor’s series is obtained in explicit form in every subinterval. This is the main point from which one can profit using MapleV. This property facilitate to investigate several properties of the approximate solution such as the local error of approximate solution for each component because the main term of the local error is known for example in the form of ”- 1050.796332s3”,where s is the local time variable in a subinterval [ z k , x ] . so without any further calculation one can estimate the admissible step size to ensure the prescribed local error for each component of the solution (’). Using this technique while we apply these algorithms we get an automatic step size control algorithm too. One simple step size control algorithm could be the following: i) We chose an appropriate value of h. ii) One give the admissible local error E > 0 and choose the order p of the algorithm. iii) From cakulation the values ei = 1 1 I , ( P ) ( z k ) I , i = 1,2, ...,n - 1, are known for every k. iv) In the interval
[ Z k , z]
we calculate the value m e k
=
V)
We chose such Step Size S k for which S k
vi) If h
>sk
than h :=
<
p/&
I
max( yi l
and x k + 1
(p)
(Zk)
= 51,
I).
+s k .
3,go to ii).
So we can conclude, that for the adaptive step size control is not necessary further function evaluation contrary to the Richardson extrapolation or embedded R-K algorithms. For the algorithm (5) the number of function evaluations are 13n, where n is the number of equations. These evaluations can be made fully parallel. We have to remark, that the error control by Richardson extrapolation require 12 sequential function evaluation for a fourth order R-K method. In this case these ex-ra evaluations are not needed. The following results are obtained by the algorithm with implicit extension (9). order variant of above explained algorithms. We remark, that applying the error control value E = 0.02 the number of the steps on this the interval [20] was nearly 110, and in every step the number of iteration
443
for solving the implicit problem was two or three. In the Figure 2. one can see the numerical solution of our test problem.
circle - soluiions with RK method solid line - soluiions with p o p s e d algorithm
Figure 2. The numerical solution obtained by implicit Taylor series algorithm, solid line is the exact solution, circles are the calculated solution values
For the reader who uses the conventional numerical algorithms for the solution of ordinary differential equation the result obtained by the algrithm (9) are curious. In the Table 2. we have shown the truncated Taylor’s series for some set of subintervals generated by the automatic error control. We have to remark, that the first four coefficients of the truncated Taylor series for this test problem are exact, because, the right hand side of the equation is second order polynomial and its derivatives the matrices D , H and T approximate exactly. 4. Conclusions
The implicit Taylor series method with numerical derivatives proposed in this article is such algorithm which can be competitive with the classical algorithm for numerical solution of initial value problems for ordinary differential equations for parallel computers. The complexity of these algorithms for parallel computers can be better than the classical ones because the function evaluat.ions can be performed
444
Endpoints of interval [xk,xk+l]
[8.43217, 8.870841
Calculated truncated Taylor series of solution c i ( z k s), i = 1 , 2 , 3 jj1 = 0.79353 0.50168s - 0.34443s2+ +0.03477s3 + 0 . 0 3 8 9 4 ~-~0 . 0 3 1 7 3 ~ ~ & = 0.60879 - 0.65391s - 0.10895s2+ +0.12141s3 - 0 . 0 5 6 2 5 ~ ~0 . 0 0 0 3 5 ~ ~ 53 = 0.82406 - 0.24638s 0.05443s2+ +0.10611s3 - 3 . 0 3 6 7 3 ~-~0 . 0 0 1 6 0 ~ ~ 61 = 0.95118 + 0.226919 - 0.27942s2+ +0.04758s3 - 0 . 0 1 6 8 1 ~-~0.001128~~ & = 0.30914 - 0.69819s - 0.01195s2+ +0.03166s3 - 0 . 0 3 8 1 7 ~ ~0 . 0 1 6 9 7 ~ ~ 6 3 = 0.73402 - 0.14996s 0.15146s2+ +0.04355s3 - 0 . 0 3 0 2 4 ~ ~0 . 0 0 5 8 4 ~ ~
+
+
+ +
[8.87084, 9.35417 ]
+ + +
fully-parallel and they use only matrix-vector operations. These methods can be regarded as numerical-analytical algorithms. The obtained results contain more information about solution and these is a promising feature for the construction of further algorithms having some "quality properties" such as energy preserving, positivity preserving, etc. The analytic-numeric structure of the approximate solution offers some new, easy error control algorithms which could be useful for the stiff systems as well. The numerical experiments performed by the MapleV system show that the proposed algorithms work well. There are a lot off way to develop the basic algorithm, for example the automatic step size control could be improved, the information obtained by the approximate solution could be used more intensively. The realization and test of the proposed algorithm for parallel computers is a perspective topic to develop. Each of these investigation could be the subject of forthcoming paper. References 1. E. Hairer, S. P. Norsett, G. Wanner, Solving Ordinary Differential Equations, Z. Nonstiff Problems, Springer Verlag, 1987. 2. E. Hairer, S. P. Norsett, G. Wanner, Solving Ordinary Differential Equations, ZZ. Stiff and Differential-Algebraic Problems, Springer Verlag, 1991. 3. 2. HorvGth, Consistency and stability for some nonnegativity-conserving methods, Applied Numerical Mathematics, 13,371-381, 1993.
445
4. G. Molnkka, B. Riczkevi, Implicit single step method by spline-like functions for solutions of ordinary differential equations, Comput. Math. Applic., 16(9), 701-704, 1988. 5. J.B. Murray, Lectures on nonlinear-differentiabequation models in biology, Clarendon Press, Oxford, 1977. 6. A. Gibbons, A program f o r the automatic integration of differential equations using the method of Taylor series, Computer J. 3,108-111, 1960. 7. R..E. Moore, Methods and applications of interval analysis, SIAM studies in Appl. Math, 1979. 8. A. Jorba and M. Zou, A software package f o r the numerical integration of ODE by means of high-order Taylor methods, http://www.maia.ub.es/-angel/,2001. 9. E. Miletics and G. Molniirka Implicit extension of Taylor Series Method with Numerical Derivatives f o r Initial Value Problems, Under publication. 10. R.. Neidinger, A n eficient method f o r the numerical evalution of partial derivatives of arbitrary order, ACM Trans. Mathm Software, 18, 159-173, 1992. 11. Y.F. Chang and G.F. Corliss, ATOMFT: Solving ODES and DAEs using Taylor series, Computers and Mathematics with Applications, 28, 209-233, 1994. 12. E. Miletics and G. MolnBrka, Taylor Series Method with Numerical Derivatives for numerical solution of ODE initial values problems, Hungarian Electronic Journal of Sciences, http://heja.sze.hu, Section Applied and Numerical Mathematics, 1-16, 2003. 13. E. Harier, C. Lubich, G. Wanner, Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations , Springer Verlag, 2002.
EXPONENTIAL-FITTING SYMPLECTIC METHODS FOR THE NUMERICAL INTEGRATION OF THE SCHRODINGER EQUATION*
TH. MONOVASILIS~ Department of Computer Science and Technology, Faculty of Science and Technology, University of Peloponnese, GR-22100 Tripolis, Greece
Z. KALOGIRATOU Department of International Trade, Technological Educational Institute of Western Macedonia at Kastoria, P.O. Box 30, GR-521 00, Kastoria, Greece T.E. SIMOS! 5 Department of Computer Science and Technology, Faculty of Science and Technology, University of Peloponnese, GR-22100 Ripolis, Greece E-mail: tsimos @mail.ariadne- t.gr
The solution of the one-dimensional time-independent Schrodinger equation is considered by exponential-fitting symplectic integrators. The Schrodinger equation is first transformed into a Hamiltonian canonical equation. Numerical results are obtained for the one-dimensional harmonic oscillator and the hydrogen atom.
*This project is funded by research project 71239 of Prefecture of Western Macedonia and the E.U. is gratefully acknowledged t Also at Department of International Trade, Technological Educational Institute of Western Macedonia at Kastoria, P.O. Box 30, GR-521 00, Kastoria, Greece tActive Member of the European Academy of Sciences and Arts §Corresponding author. Please use the following address for all correspondence: Dr. T.E. Simos, 26 Menelaou Street, Amfithea - Paleon Faliron, GR-175 64 Athens, Greece. Fax number: ++ 301 94 20 091 446
447
1. Introduction The time-independent Schrodinger equation is one of the basic equations of quantum mechanics. Its solutions are required in the studies of atomic and molecular structure and spectra, molecular dynamics and quantum chemistry. In the literature many numerical methods have been developed to solve the time-independent Schrodinger equation. Symplectic integrators are suitable methods for the numerical solution of the Schrodinger equation, among their properties is the energy preservation, which is an important property in quantum mechanics. Also, exponential-fitting methods have been very widely used for the numerical integration of the Schrodinger equation. In this work we develope multistep symplectic integrators with the exponential-fitting property. Our new methods are tested on the computation of the eigenvalues of the one-dimensional harmonic oscillator and the hydrogen atom. 2. The time-independent Schrodinger equation
The one-dimensional time-independent Schrodinger equation may be written in the form 1 d2$ V ( x ) $ = E$ 2 dx2 i-
where E is the energy eigenvalue, V ( x )the potential, and $(x) the wave function. Equation (1) can be rewritten in the form
, where B ( z )= 2(E - V ( x ) ) or
qY = -B(x)$ =4
3. Numerical methods 3.1. Symplectic numerical schemes The key point of applying symplectic methods into numerical calculations is to maintain sympectic structure in the discrete scheme. Given an interval [a,b] and a partition with N points
xo=a,x,=xo+nh,
n = 1 , 2 ,..., N .
448
An one-step discrete scheme
is symplectic if M T J M = J where
J = ( -10 10 ) The two stage Yoshida [3] type symplectic method is of the following form
which is equivalent
:) ($:I:)
(-d2:
=
or
== ( %
(4n+1)
Tn
$n+1
We eliminate Tn-l$’n+l-
(t -c2hB7) (::) ”’) ( 2 ) hn
4 in order t o derive the two step method.
(Tn-16,
+ Tnan-l)$n + T n ( L - l a n - 1 -
Yoshida used the coefficients 1 C I = ‘2 =
-2 ’
dl
= 1,
Tn-lPn-l)$n-l=
d2 = 0
In this case the method is written as 2$n
We set
dl =
1 and $n+l-
d2 = 0, 2+n
+
+
$n-l=
-h2Bn+n
the method is written as $n-1
= -h2(C1
+
~2)Bn$n
0
449
Now we want the method t o integrate exactly the function $(x) = efw this gives
where c = c1
+ c2. The Taylor expansion of c is c=1+-
w2h2 12
w4h4 ++0(h6) 360
for w = 0 we have Yoshida second order symplectic method. We also develope a 4-step method of the form
where f(x) = - B ( x ) $ ( x ) .
4. Numerical results We consider the one-dimesional eigenvalue problem with boundary conditions
We use the shooting scheme in the implementation of the above methods. The shooting method converts the boundary value problem into an initial value problem where the boundary value a t the end point b is tranformed into an initial value y’(a), the results are independent of y’(a) if y’(a) # 0. The eigenvalue E is a parameter in the computation, the value of E that makes y(b) = 0 is the eigenvalue computed. 4.1. The Harmonic Oscillator
The potential of the one dimensional harmonic oscillator is 1
V ( x )= - k x 2 2
we consider k = 1. The exact eigenvalues are given by 1 2
En = n + - , n = 0 , 1 , 2 ,. . .
450
4.2. The Hydrogen Atom
The potential of the hydrogen atom is
V ( x )= -
1 1(1+ -+X
1) ]
22
05r<+Oo,
The exact eigenvalues are given by 1 E n-
2n2 I
z=0]1]2 ]...
n = 1 ] 2] . . .
In order to compute the eigenvalues for the case 1 = 0 by the shooting method we used the interval (O,b] with boundary conditions y(0) = 0 and y ( b ) = 0. References 1. Arnold V., Mathematical methods of Classical Mechanics, Springer-Verlag, 1978. 2. Simos T.E., Numerical methods for l D , 2D and 3D differential equations
arising in chemical problems, Chemical Modelling: Application and Theory, The Royal Society of Chemistry, 2(2002),170-270. 3. Yoshida H . , Construction of higher order symplectic integrators, Physics Letters A 150 262-268(1990)
STRUCTURE OF THE EXACT WAVE FUNCTION: PROGRESS REPORT HIROSHI NAKATSUJI Departent of Synthetic Chemistry & Biological Chemistry, Graduate School of Engineering, Kyoto University, Kyoto, Japan
Solving the Schrodinger equation and the corresponding relativistic equation is a central theme of theoretical chemistry because of its scientific and practical importance. In our series of studies I-' we have investigated the structure of the exact wave function and the methods to calculate the exact wave function. Here, we summarize the progress report. The atomic and molecular system is defined by the Hamiltonian
H
=E--Ai1 i
- z x Z A / r A +i x l / q j .
2
i
A
i>j
In a second quantized form, it is written as
H =Zvia:ap+ pr
E w;4a:afaqap
P4"
which is equivalent to the above one when the one-electron basis ( p > i s complete. The Hamiltonian has a simple structure composed of only one-and two-particle operators. The exact wave function of this system is the solution of the Schrodinger equation (SE)
HW=W and the inverse Schrodinger equation (ISE)
H-'w = E-'v/, where the inverse Hamiltonian is defined by H-IH
= HH-' = 1.
When we sift our Hamiltonian to be positive, we have the variational principle and the
H -square group of equations for both H and H - ' , and
further the cross H-square equations connecting the worlds of these equations are equivalent to the SE.
45 1
H and H - ' . All
452
A necessary condition for the wave function I,!/ to have the structure of the exact wave function is expressed as follows. When the unknown variables included in I,!/ is optimized by the variational principle, the resultant I,!/ satisfies H-square equation: then this I,!/ has the structure of the exact wave function. This is because the variational principle gives the best possible wave function and the H-square equation is valid only for the exact wave function. From this criterion, we have proposed the iterative Cl (ICI) wave function and the extended coupled-cluster (ECC) wave function to have the structure of the exact wave function.
Suppose that we divide our H or H-' into
where
N , parts,
i is null for regular Hamiltonian and i = -1 or i for the inverse
Hamiltonian, and define a variable operator
s by
cf
with as variables. When we use the second quantized Hamiltonian and divide it into all singles and doubles parts, we have
which is the general singles and doubles (GSD) case. The ICI wave function is defined by the recurrence,
I,!/, = (1+' S n ) I , ! / n - , where the variables
7
c:,,are optimized by the regular or inverse variational
vn
principle, and at convergence is proved to be exact. Since each step of ICI is variational, its energy converges monotonically from above (or below) to the exact value. The ECC wave function is defined by
ly =exp('S)ly,. The simplest one (SECC) is for
N , =1 and it is proved to be exact in both
regular and inverse cases. From different argument, optimal
c' would be minus
infinite. The general ECC is not proved to be always exact if c i and c7q are
453 different, but it can be exact because of the non-linear nature of the wave function. (For details, see ref. 3.) Calculations of the excited states with the ICI and ECC methods were described in some details in refs. 2,4 . Applications to harmonic oscillator and to atoms and molecules were given. For a t o m and molecules, the ICI was shown numerically to converge monotonically to the exact value. When we use the inverse Hamiltonian in constructing the ICI and SECC wave functions, the convergence of the ICI was remarkably rapid because of the non-existence of the nuclear singularity in the inverse Hamiltonian. The same was true for the limited truncations of the exponential operator of the SECC. In the GSD case, the convergence of the ICI was quite rapid even if we use the regular Hamiltonian in constructing the wave function. This is an important special case because we can avoid the integrals V ; and w ; ~in the second expression of the S-operator given above.
References 1.H. Nakatsuji, J. Chem. Phys, 113,2949(2000). 2.H. Nakatsujiv and E. R. Davidson, J. Chem. Phys. 115,2000(2001). 3.H. Nakatsuji, J. Chem. Phys. 115,2465(2001) 4.H. Nakatsuji, J. Chem. Phys. 116, 1811(2002) 5.H. Nakatsuji, Phys. Rev. A 65,052122(2002) 6. H. Nakatsuji and M. Ehara, J. Chem. Phys. 117,9(2002) 7. H. Nakatsuji, submitted
COMPLEXITY OF MOLECULES SONJA NIKOLIC AND NENAD TRINAJSTIC The Rugjer BoSkoviC Institue, PP 180, I0002 Zagreb, Croatia
We studied complexity of several classes of molecular graphs': linear and branch trees, (po1y)cycles and general graphs. Trees and (po1y)cycles are simple graphs that are graphs without multiple edges and loops. General graphs are graphs with multiple edges and loops. A loop is an edge with both of its vertices identical. We used trees2 to represent alkanes, cycles to represent cycloalkanes, various polycyclic graphs to represent cage-structures and general graphs to represent heterosystems. We employed in our study the following complexity indices: Bonchev's indices3, Zagreb complexity indices4, spanning trees2 and total walk counts'. We considered the following structural features: a graph (molecule) size in terms of number of vertices and edges, branching, cyclicity, the number of loops and multiple edges and symmetry. Bonchev proposed the topological complexity index TC to account for a complexity of the molecular graphs3:
TC =
x s
xdi(s) verices
The first sum in the formula is over all connected subgraphs of a graph G, the second sum is over all vertices in the subgraph s and di is the vertex degree. The vertex degrees in subgraphs are taken as they are in G. If the vertex degrees are taken as they are in the respective isolated subgraphs s, than we have another Bonchev's complexity index denoted by TC1. Zagreb complexity indices TMI, TM1*, TM2, TM; are based on the original Zagreb indices M1 and M2 and the above Bonchev's idea of using subgraphs. M I and M2 indices are defined as:
M, = x d z vertex
M=xdidj edges
TM1 and TM2 are defined as:
454
(3)
455
TM, =
x s
xdz(s)
TM, =
xdidj(s) s
(4)
verices
(5 1
edges
Vertex degrees entering TM, and TM2 for a given subgraph s are taken as in a graph G. If vertex degrees are used as appearing in a subgraph s, than above formulas give TM1*and TM; indices. A spanning tree of a graph G, t(G), is a connected acyclic subgraph containing all vertices in G. In the case of trees (alkanes), the spanning tree is identical to the tree itself t(tree)=l
(6)
In the case of monocycles (cycloalkanes) the number of spanning trees is equal to the cycle size: t(cycle)=V
(7)
where V is the number of the vertices in the cycle. There are a number of methods available to compute the number of spanning trees’. The total walk count twc is given as a sum of the molecular walk count mwc5-’: v-1
twc = & l w c , e=1
A walk in a (molecular) graph is an alternating sequence of vertices and edges. The length of the walk is the number of edges in it. The molecular walk count of length t mwce is given as a sum of all atomic walks of length e
where awce(i) is the atomic walk count of length e of atom i. Formulas (8) and (9) apply equally to simple and general graphs.’ Our analysis shows: (i) all considered indices increased with the size and branching of alkanes; (ii) all considered indices also increase with the size of the cycloalkanes (note that M,=M2 in the case of monocycles); (3) general graphs were studied only by twc. We found that increase in the number of loops and
456 multiple edges considerable increases the value of twc; (4) the increase in symmetry lowers the values of considered indices (twc index is not considered because it is symmetry independent). Additionally it should be emphasize that TC, TC1, TMI, TM,*, TM2, and TM2* indices depend on knowledge of all connected subgraphs. The number of subgraphs often enormously increases with the graph size. MI and Mz are highly degenerate indices, so they cannot discriminate between many isomers. Spanning trees are useless in the case of acyclic structures. So our final conclusion is that all of the considered indices are useful to study some aspects of molecular complexity but we cannot say for any of them to be the best complexity index to use.
References
5.
6. 7.
8. 9.
N. TrinajstiC, Chemical Graph Theory, 2"d revised edn., CRC Press, Boca Raton, FL, 1992. F. Harary, Graph Theory, second printing, Addison-Wesley, Reading, MA, 1972. D. Bonchev, Novel indices for the topological complexity of molecules, SARaQSAR Environ. Res. 7 (1997) 23-43. S. NikoliC, N. TrinajstiC, I.M. ToliC, G. Rucker, C. Rucker, On molecular complexity indices, in: Complexity in chemistry, Introduction and fundamentals; D. Bonchev, D.H. Rouvray, eds.; Mathematical chemistry series, Taylor&Francis, London, 2003. G. Rucker, C. Rucker, Counts of al! walks as atomic and molecular descriptors, J. Chem. In$ Comput. Sci. 33 (1993) 683-695. G. Rucker, C. Rucker, Walk counts, labyrinthicity and complexity of acyclic and cyclic graphs and molecules, J. Chem. In$ Comput. Sci. 40 (2000) 99-106. G. Rucker, C. Rucker, On topological indices, boiling points and cycloalkanes, J. Chem. In$ Comput. Sci. 39 (1999) 788-802. S. NikoliC, N. TrinajstiC, A. JuriC, Z. MihaliC, G. Krilov, Complexity of some interesting (chemical) graphs, Croat. Chem. Actu 69 (1996) 883897. I. Lukovits, A. MiliEeviC, S. NikoliC, N. TrinajstiC, On walk counts and complexity of general graphs, Internet Electron. J. Mol. Des. 2002, 1(8), 388-400; http://www.biochempress.com
RADIATION DETECTION EFFICIENCY EVALUATION OF YAP: CE SCINTILLATOR BY MONTE-CARL0 METHODS D. NIKOLOPOULOS, P. LIAPARINOS,S. TSANTIS, D. CAVOURAS AND I. KANDARAKIS~ Department of Medical Instruments Technology Technological Educational Institution of Athens AgSpiridonos, 12210, Athens GREECE E-mail:[email protected]
G . PANAYIOTAKIS Department of Medical Physics Medical School University of Patras 26500, Patras, GREECE
Monte Car10 techniques were applied to evaluate the performance of YAP scintillator for use in medical imaging applications. The energy range considered was from 50 to SO0 keV and the thickness range from 5 to 30 mm. The absorption efficiency of YAP decreases rapidly in the energy range from 50 up to 200 keV. For higher energies up to SO0 keV, slow variation with energy is exhibited. In the energy range 200-800 keV the scintillator absorbs energy mainly through Compton recoil electrons while at the 50-200 keV energy range the photoelectric process dominates even following a scatter event.
1.
Introduction
Various forms of radiation detectors have been rapidly developed during the last few decades for application in positron emission tomography (PET), single photon tomography (SPECT), x-ray computed tomography (CT), digital radiography (DR) etc.' In most cases a scintillator layer coupled to an optical photon detector (photocathode, photodiode) is used. Recently particular attention has been paid to YA103:Ce (YAP:Ce) scintillator which has been reported to be of interest.2s YAP is of rather high density (5.37 g/cm3) and short decay time (25 ns). Light yields of the order of 40-50% relative to NaI:TI have been reported. However, it is of relatively low effective atomic number, thus necessitating the use of rather thick scintillator layers.
*
~
Please address corresoondence: Prof. I. Kandarakis, Ph.D., Dept of Med Inst. TEI of Athens,
Tel: (+30) 210-5385-375 [email protected]
(work)
-
Fax:
457
(+30)
210-5910-975
(work),
E-mail:
458 Radiation transport phenomena have been extensively studied by application of Monte Carlo method. The latter has proven to be by far the most successful one for the simulation of the stochastic process involved in radiation detection problem4. In the present study codes based on the Monte Carlo method have been developed and applied to examine the detection efficiency of YAP scintillator as a function of the incident photon energy and scintillator layer thickness. 2.
Material and Methods
A general Monte Carlo Fortran 77 code has been generated for the investigation of the transport of photons with energies up to 1 MeV similar to previous work.4 To be specialized for certain scintillator media the code was fed with the interaction cross-sectional data and the non-relativistic form and scatter factors of the scintillators investigated. In this study results for YAP scintillator are reported. In addition data for the well-known Gd202Sscintillator were derived for method validation. Following a previous study the dimensions of the simulated Gd202S block were selected to be 9x 9x 0.01226 mm3.The corresponding dimensions of the YAP were of 3 x 3 mm2 entrance area, while the thickness of the block varied between 5 and 30 mm. lo6 photons were generated and traced within the energy range from 10 to 800 keV, used in medical imaging applications from x-ray mammography to positron tomography.
3.
Results and Discussion
Figure 1 shows the fraction of incident photon energy absorbed (Total Absorbed-TA), the fraction of energy absorbed following photon scattering (Scatter & Absorbed-SA), the fraction of energy absorbed due to a photoelectric interaction following a scatter event (Scatter & Photoelectric Absorption-SPA) and the fraction of energy transmitted (Transmitted-T) through the Gd202S scintillator block. These results are in close agreement with results reported by others [5,6] Similar results are presented in figure 2 for the YAP scintillator block. As it may be observed the absorption efficiency of YAP decreases rapidly in the energy range from 50 up to 200 keV. For higher energies up to 800 keV, slow variation with energy is exhibited. In this energy range the behavior of the scintillator is dominated by incoherent scattering since the TA and SA fractions are almost equal. This is reinforced by the fact that at the above energy range, the SPA fraction is approximately zero. This may not be a problem since, although Compton effect degrades image quality in projection x- and y-ray imaging, the latter is mainly performed at significantly lower energies e.g. 15-100 keV for xray imaging and 140 keV for nuclear camera imaging using 99Tc.It is interesting
459
to note that in the energy range 200-800 keV the scintillator absorbs energy mainly through Cornpion recoil electrons while at the 50-200 keV energy range the photoelectric process dominates even following a scatter event.
Figure 1. Variation of radiation absorption and transmission in the Gd2O2S scintillator block with incident photon energy.
Figure 2. Variation of radiation absorption and transmission in the YAP scintillator block with incident photon energy.

Figure 3 shows the TA, SA, SPA and T fractions for the YAP scintillator with increasing thickness at 140 keV incident photon energy. As observed, at 0.30 cm, which is a thickness proposed for animal PET scanners, the TA fraction is of the order of 36%.
In conclusion, the absorption efficiency of the YAP scintillator may be considered adequate for positron imaging. In addition, it could be convenient for 140 keV (99mTc) gamma camera imaging, since scatter effects dominate above 300 keV.
Figure 3. Variation of radiation absorption and transmission in YAP scintillator block with scintillator thickness at 140 keV photon energy.
References
1. C. W. E. van Eijk, Phys. Med. Biol. 47, R85 (2002).
2. M. Moszynski, M. Kapusta, et al., Nucl. Instr. and Meth. in Phys. Res. A 404, 158 (1998).
3. A. Del Guerra, C. Damiani, et al., IEEE Trans. Nucl. Sci. NS-47, 1537 (2000).
4. H.-P. Chan and K. Doi, Phys. Med. Biol. 28, 109 (1983).
5. J. Boone, A. Seibert, et al., Med. Phys. 26(6), 905 (1999).
6. I. Kandarakis, D. Cavouras, et al., Phys. Med. Biol. 42, 1351 (1997).
B-SPLINES: A POWERFUL AND FLEXIBLE NUMERICAL BASIS FOR THE CONTINUUM SPECTRUM OF THE SCHRÖDINGER EQUATION. AN APPLICATION TO HYDROGENIC ATOMIC SYSTEMS.
L. A. A. NIKOLOPOULOS
Institute of Electronic Structure and Laser, F.O.R.T.H., P.O. Box 1527, Heraklion 71110, Crete, Greece
E-mail: [email protected]

We present a method for the accurate calculation of the complete spectrum of the Schrödinger equation in terms of a B-splines polynomial basis. The method is capable of representing numerically the bound and continuum spectrum of complex atomic systems. The theoretical method is discussed, and an application to the hydrogenic Hamiltonian is given.
1. Introduction
In the case of an atom with one electron outside a closed shell, the solution of the stationary Schrödinger equation (SE) can proceed as follows. Exploiting the spherical symmetry of the potential, the wavefunction of the electron is written as ψ_εlmm_s(r) = (1/r) P_εl(r) Y_lm(θ, φ) σ_m_s, with Y_lm(θ, φ) the usual spherical harmonics and σ_m_s the spin function. Then the radial SE may be written as:

[ -(1/2) d²/dr² + l(l+1)/(2r²) + V_l(r) ] P_εl(r) = ε P_εl(r).   (1)

The above equation is supplemented with the appropriate boundary conditions (BC) for the bound and the continuum states. For conventional potentials, r V_l(r) → Z_eff as r → ∞, the condition at the origin is P_ε/k,l(r → 0) → C_ε/k r^(l+1) → 0, while at infinity the asymptotic conditions read:

P_εl(r → ∞) → N_εl e^(-κr),   ε ≤ 0,   (2)
P_kl(r → ∞) → A_kl sin[kr - lπ/2 + φ_C(r) + δ_kl],   ε ≥ 0,   (3)

with k the momentum of the continuum state, related to the energy by ε = k²/2, φ_C(r) the Coulomb phase shift (long-range phase
shift) and δ_kl the scattering phase shift (short-range), which basically reflects the deviation of the true potential V_l(r) 'seen' by the outgoing electron from the Coulomb potential V_C = -Z_eff/r. The amplitudes N_εl, A_kl and C_ε/k are determined through the appropriate normalization of the bound and continuum solutions.

1.1. Rayleigh-Ritz-Galerkin approach

Expanding the radial wavefunctions in a finite basis set (Gaussian, Slater, B-splines, ...), defined in an interval [0, R], 0 < R < ∞ (for Gaussian and Slater bases, R extends to infinity), as:
P_εl(r) = Σ_{i=1}^{n_s} C_i^(εl) u_i(r),   (4)
where n_s is the number of basis functions u_i(r). Substituting this expansion into the radial SE (1), and taking the variational condition with respect to the coefficients C_i, δ⟨u_j(r)|(h_l(r) - ε)|Σ_i C_i^(εl) u_i(r)⟩ = 0, leads to the following matrix equation for the coefficients C_εl = (C_1^(εl), C_2^(εl), ..., C_{n_s}^(εl)):
A_l(ε) · C_εl = [h_l - ε U] · C_εl = 0.   (5)
The matrix h_l is the representation of the radial Hamiltonian on the finite basis, and U the overlap matrix, defined by

(h_l)_ij = ⟨u_i| h_l |u_j⟩ = ∫_0^R dr u_i(r) h_l(r) u_j(r),   (6)
U_ij = ⟨u_i|u_j⟩ = ∫_0^R dr u_i(r) u_j(r).   (7)
The boundary conditions for the bound states define a two-point boundary value problem, whose solution gives discrete eigenfunctions and eigenenergies, whereas the boundary conditions for the continuum states define an initial value problem for each arbitrarily pre-selected energy ε = k²/2. Moreover, it is well known that the 'inner' Hamiltonian h_l, for the case that R has a finite value, is non-Hermitian due to the kinetic term -d²/dr². It can be split as

h_l = h̄_l + h^S,   (8)

where h̄_l is the symmetric part of the Hamiltonian and h^S is the surface term, vanishing in the limit R → ∞, with algebraic form given by the Bloch operator: h^S = -(1/2) δ(r - R) d/dr.
Figure 1. Hydrogen radial bound states obtained with the free boundary conditions approach, plotted versus the radius (a.u.).
2. Free boundary conditions approach

An approach to the problem is to ask for scattering solutions at a certain energy ε with no assumed boundary conditions. The present approach for the continuous spectrum of Eq. (5) is to transform it into a system of inhomogeneous linear equations [BFM92]. We choose as the basis set the B-spline polynomials B_i, i = 1, 2, ..., n_s, of order k_s, defined on an interval [0, R] over a sequence of knot points t[i] ≤ t[i+1], i = 0, 1, ..., n_s + k_s, where t[0] = t[1] = ... = t[k_s] = 0 and t[n_s+1] = t[n_s+2] = ... = t[n_s+k_s] = R. In this representation the box Hamiltonian and the overlap matrix are given by relations as in Eqs. (6,7) with the obvious substitution u_i → B_i (U → B). A special discussion is necessary here for the surface term h^S. The matrix representation of this operator is given by:

h^S_ij = -(1/2) B_i(R) B'_j(R).   (9)

By definition of the B-spline basis, the only B-splines that are non-zero at the boundaries are the first one, B_1(0) = 1, and the last one, B_{n_s}(R) = 1. Since the solutions should satisfy P_ε,kl(0) = 0, we exclude the first B-spline B_1(r) from the basis set. Furthermore, the surface term is reduced to involve only the terms with i = n_s. Finally, from the properties of B-splines the relations B'_{n_s}(R) = -B'_{n_s-1}(R) = (k_s - 1)/t_s, t_s = t[n_s+1] -
t[n_s], are obtained. Then the only non-vanishing elements of the B-spline representation of the Bloch operator are given by h^S_{n_s,n_s} = -h^S_{n_s,n_s-1} = -(k_s - 1)/(2 t_s) ≡ -h^s. This is a clearly unsymmetrical matrix, since h^S_{n_s-1,n_s} = 0 ≠ h^S_{n_s,n_s-1}. It is worth noting that this 'non-Hermiticity' of the Hamiltonian is independent of the box radius R. It also favours B-splines of lower order k_s. The B-spline box Hamiltonian is now given by a non-Hermitian matrix of the form:

h_l = [ h_11        h_12        ...   h_1,n_s-1                          h_1,n_s
        ...                           ...                                ...
        h_n_s-1,1   h_n_s-1,2   ...   h_n_s-1,n_s-1                      h_n_s-1,n_s
        h_n_s,1     h_n_s,2     ...   h_n_s,n_s-1 + h^S_{n_s,n_s-1}      h_n_s,n_s + h^S_{n_s,n_s} ]   (10)

Using Eqs. (5) and (10), we rewrite the homogeneous system of equations as the inhomogeneous one:

(h̄_l - εB) · C̄_εl = C_0,   (11)
where h̄_l is the Hermitian part of h_l, C_0 = (0, 0, ..., 0, 1), and C̄_εl is a new set of coefficients, related to those of Eq. (5) through C_εl(new) = C_εl(old)/(h^s C_{n_s-1}(old)).
Figure 2. Hydrogen continuum radial state for ε_k = 0.06517 a.u., l = 1. For 0 < r < R = 200 a.u. the un-normalized (Eq. (4)) and the energy-normalized radial state are plotted as produced by the calculation. For the region R < r < 400 a.u. the WKB asymptotic expansion has been used.
A little discussion is worth noting here. Our effort is to eliminate the problem of the non-Hermiticity of the box Hamiltonian. To this end, we have rewritten the matrix equations keeping only the Hermitian part of the Hamiltonian and making the system of linear equations non-homogeneous. The non-Hermitian part of the Hamiltonian, which depends on the product h^s C_{n_s-1}, has been moved to the right-hand side (RHS) of those equations, which in principle is unknown. Dividing both sides of the equations by the arbitrary number h^s C_{n_s-1}, we fix the RHS as given by C_0. In this way we have eliminated the problem of non-Hermiticity of the box Hamiltonian, at the cost that we are only able to determine a solution vector of arbitrary normalization. The complete determination of the solution vector in the box comes later, by applying the normalization rules that the bound and continuum states should satisfy according to quantum mechanical scattering theory [Bur63]. In figure 1 we show the hydrogen (Z_eff = 1) 1s, 2s, 2p, 3p states obtained with the free boundary conditions approach. As another representative example, in figure 2 we plot the un-normalized, normalized and WKB radial state for E_k = 1.7733.
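As a rough numerical illustration of the whole scheme (not the author's code), the sketch below assembles the B-spline overlap and symmetric Hamiltonian matrices for the hydrogen l = 1 channel by Gauss-Legendre quadrature, diagonalizes the pair for the bound spectrum, and then solves the inhomogeneous system (11) at an arbitrary continuum energy. The box radius, knot sequence and quadrature order are arbitrary choices made here for demonstration, and the final energy normalization step is omitted.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.linalg import eigh, solve

k_ord, R, n_break = 4, 200.0, 201                 # cubic B-splines on [0, R]
breaks = np.linspace(0.0, R, n_break)
t = np.r_[[0.0] * (k_ord - 1), breaks, [R] * (k_ord - 1)]
n_tot = len(t) - k_ord                            # total number of B-splines

def spline(i):
    c = np.zeros(n_tot); c[i] = 1.0
    return BSpline(t, c, k_ord - 1, extrapolate=False)

# exclude B_1 so that P(0) = 0; keep B_{n_s}, i.e. leave r = R free
splines  = [spline(i) for i in range(1, n_tot)]
dsplines = [s.derivative() for s in splines]
n = len(splines)

x, w = np.polynomial.legendre.leggauss(8)         # quadrature nodes per knot interval
Hbar = np.zeros((n, n)); B = np.zeros((n, n))
l = 1
for a, b in zip(breaks[:-1], breaks[1:]):
    r  = 0.5 * (b - a) * x + 0.5 * (a + b)
    wr = 0.5 * (b - a) * w
    Bi  = np.nan_to_num(np.array([s(r) for s in splines]))
    dBi = np.nan_to_num(np.array([s(r) for s in dsplines]))
    V = -1.0 / r + l * (l + 1) / (2.0 * r**2)     # hydrogen potential plus centrifugal term
    B    += (Bi * wr) @ Bi.T
    Hbar += 0.5 * (dBi * wr) @ dBi.T + (Bi * (wr * V)) @ Bi.T   # Hermitian part only

# (a) bound spectrum: also drop the last spline (P(R) = 0) and diagonalize
print(eigh(Hbar[:-1, :-1], B[:-1, :-1], eigvals_only=True)[:2])  # roughly -0.125 (2p), -0.0556 (3p)

# (b) continuum state at an arbitrary energy eps: solve (Hbar - eps*B) C = C0, Eq. (11)
eps = 0.06517
C0 = np.zeros(n); C0[-1] = 1.0
C = solve(Hbar - eps * B, C0)        # solution vector of arbitrary normalization
```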
References
BFM92. T. Brage, C. Froese Fischer, and G. Miecznik. Non-variational, spline-Galerkin calculations of resonance positions, widths and photodetachment cross sections for H- and He. J. Phys. B, 25:5289, 1992.
Bur63. A. Burgess. The determination of phases and amplitudes of wave functions. Proc. Phys. Soc., 81:442, 1963.
Nik03. L. A. A. Nikolopoulos. A package for the ab-initio calculation of one- and two-photon cross sections of two-electron atoms, using a CI B-splines method. Comp. Phys. Comm., 150:140-165, 2003.
A FINITE ELEMENT APPROACH FOR THE DIRAC RADIAL EQUATION
L. A. A. NIKOLOPOULOS
Institute of Electronic Structure and Laser, F.O.R.T.H., P.O. Box 1527, Heraklion 71110, Crete, Greece
E-mail: [email protected]

The Dirac radial functions are expanded in a polynomial B-spline basis, transforming the Dirac equation into a generalized eigensystem matrix problem. Due to the local nature of the B-spline functions, the matrix representations of all the involved operators are highly sparse. Diagonalization of the matrix equations provides the bound and continuum eigenstates. Comparison with the analytical solutions of hydrogenic atomic systems is presented, and application to non-hydrogenic atomic systems is discussed. The generalization of the above programs to the case of exotic atomic systems and highly charged ionic systems is straightforward.
1. Introduction
The field-free Dirac Hamiltonian is of the type

H_D = c α·p + β mc² + U(r),   (1)

where α and β are built from the 2 x 2 Pauli and identity matrices, respectively. The central potential is given by U(r) = -Ze²/r + V(r), with -Ze²/r being the nuclear potential and V(r) the 'screening' potential, which is either a model potential or a Dirac-Fock potential determined self-consistently. Straightforward partial-wave analysis gives for the time-independent radial Dirac equation in a central potential:

h_D (G_k(r), F_k(r))^T = ε (G_k(r), F_k(r))^T,   (2)

with ε = E - mc² the 'transformed' energy and h_D the radial Dirac operator, which couples the two radial components G_k and F_k through first-order derivative terms and the terms U(r), ±k/r and -2mc². The quantum number k is the relativistic analog of the l quantum number in the classification of the states of the Schrödinger equation (SE).
Knowledge of k is equivalent to knowledge of the quantum numbers j, l of the j², l² operators. The use of the finite basis method in the Dirac equation (DE) has certain problems that do not appear in the case of the SE. The main reason is that the spectrum of the DE is not bounded from below (the negative eigenenergies decrease indefinitely to -∞), thus making the variational method inappropriate for relativistic calculations.

2. Dirac equation and B-splines method

A detailed presentation and application of the method has been given by Johnson and Sapirstein in a pioneering paper [JBS88] and in a recent review [SJ96]. However, attention has been given mainly to the bound states of atomic systems. Here we present a method which is capable of representing the continuum relativistic states to high accuracy. The basic idea is the same as in the non-relativistic case, namely the confinement of the atom in a sphere (box) of radius R. This has the effect of making the number of bound states finite (for R → ∞ this number is infinite) and of discretizing the continuum spectrum, while the number of continuum states remains infinite. The equations to be solved are derived using the action principle, which has the advantage of introducing the boundary conditions into the radial equations in a systematic manner. Expanding the radial functions in a B-spline set of order k_s and total number n_s, defined in a region [0, R] [dB78], as:

G_k(r) = Σ_{i=1}^{n_s} p_i^(k) B_i(r),   F_k(r) = Σ_{i=1}^{n_s} q_i^(k) B_i(r),   (3)

we obtain the 2n_s x 2n_s generalized eigenvalue equation from ∂S/∂p_i = 0, ∂S/∂q_i = 0:

A · u^(k) = ε_k B · u^(k),   (4)

u^(k) = (p_1^(k), p_2^(k), ..., p_{n_s}^(k), q_1^(k), q_2^(k), ..., q_{n_s}^(k)),   (5)

where the A and B matrices are built from integrals of the B-splines with the potential and derivative operators, of the type M_ij(η) defined below.
Figure 1. Small radial components of hydrogen for the 1s, 2s states (N = 100, k = 9, R = 100 a.u., linear grid), plotted versus r (a.u.).
The elements of the matrix M are given by the integral M_ij(η) = ∫_0^R dr B_i(r) η(r) B_j(r). The elements of the 'boundary' matrix A^S are derived from the variational equations, ∂S/∂p_i = 0, ∂S/∂q_i = 0.
The solution of the above system gives n_s states with ε > 0 (positive states) and n_s states with ε < 0 (negative states). As a representative example, in figure 1 we plot the small components of the 1s, 2s radial functions of hydrogen. In table 1 we show eigenenergies of the ns states of hydrogen, obtained with the Dirac Hamiltonian H_D analytically and numerically.

2.1. Continuum states and normalization

In atomic quantum theory, bound states are normalized to unity while the continuum eigenstates are normalized in energy. Leaving the angular part of the normalization out of the discussion, we have for the radial part of the wavefunction P(r) → F(r), G(r), ⟨P_a|P_b⟩ = δ_ab. The physical meaning of the calculated positive-energy wavefunctions P_ε(r) within the basis-set framework is the following: the positive-energy solutions P_ε_i(r) (normalized to
Table 1. Eigenenergies (a.u.) of the ns_1/2 states of hydrogen, analytical versus B-spline results.

n    Analytical             B-splines
1    -0.50000665656957      -0.5000066565997
2    -0.12500208019145      -0.12500208018833
3    -0.055556295171932     -0.05555629517705
4    -0.031250338036512     -0.031250338031454
5    -0.020000181052151     -0.020000181060167
6    -0.01388899674244      -0.013888996751446
7    -0.010204150879325     -0.010204150944494
unity), when divided by the weight w_i which allows integration over the continuum, represent the actual continuum Coulomb function of energy ε_i inside the box. The adoption of boundary conditions at a finite radius determines the spacing of consecutive energy eigenvalues, or equivalently the density of states. The density of states ρ(E), instead of the δ-function form, now takes finite values, ρ(E_i) = 2/(E_{i+1} - E_{i-1}). Choosing as normalization factor the inverse square root of the density of states, namely A_i = 1/√ρ(ε_i), for the normalization of the discrete positive-energy states we obtain ⟨P_i|P_j⟩ = ρ(ε_i) δ_ij → δ(ε_i - ε_j) as R → ∞. The limit of this finite normalization when R → ∞ goes to the δ-function, as it should.
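In practice this energy normalization is a one-line post-processing step on the eigenvalues and eigenvectors returned by the box diagonalization. A hedged sketch (array names and the edge handling are placeholders, not taken from the paper):

```python
import numpy as np

def energy_normalize(eps, C):
    """Rescale unit-normalized box states to energy-normalized continuum states.

    eps : sorted positive eigenenergies (float array) from the box diagonalization.
    C   : coefficient vectors, one column per state, unit-normalized w.r.t. the overlap matrix.
    """
    rho = np.empty_like(eps)
    rho[1:-1] = 2.0 / (eps[2:] - eps[:-2])     # density of states rho(E_i) = 2/(E_{i+1} - E_{i-1})
    rho[0], rho[-1] = rho[1], rho[-2]          # crude treatment of the two edge states
    # dividing by the weight w_i = 1/sqrt(rho_i) is the same as multiplying by sqrt(rho_i)
    return C * np.sqrt(rho)[None, :]
```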
References
dB78. C. de Boor. A Practical Guide to Splines. Springer-Verlag, New York, 1978.
JBS88. W.R. Johnson, S.A. Blundell, and J. Sapirstein. Finite basis sets for the Dirac equation constructed from B splines. Phys. Rev. A, 37:307, 1988.
SJ96. J. Sapirstein and W.R. Johnson. The use of basis splines in theoretical atomic physics. J. Phys. B, 29:5213-5225, 1996.
A BAYESIAN STATISTICAL MODELING FOR THE DISTRIBUTION OF INSURANCE CLAIM COUNTS

IOANNIS NTZOUFRAS
Department of Business Administration, University of the Aegean, 8 Michalon Street, 82100, Chios, Greece
E-mail: [email protected]
ATHANASSIOS KATSIS Department of Statistics and Actuarial Science, University of the Aegean Vourlioti Building, Karlovasi, 83200, Samos, Greece E-mail: [email protected]
DIMITRIS KARLIS
Department of Statistics, Athens University of Economics and Business, 76 Patission Street, 10434, Athens, Greece
E-mail: [email protected]

The aim of this article is to develop model comparison techniques among three widely used discrete statistical distributions employed for estimating outstanding claim counts in actuarial science and practice. The statistical treatment is from the Bayesian point of view. We utilize the advanced computational technique of the Reversible Jump Markov chain Monte Carlo algorithm to estimate the posterior odds among the different distributions for claim counts. The results are compared for various data sets.
1. Introduction
The problem of choosing the appropriate distribution to model the outstanding claims is of particular interest in actuarial science (Makov 2001). Research on claim distributions includes the publications of Ter Berg (1980) on the Poisson and Gamma model, Scollnik (1998) on the Generalized Lagrangian Poisson distribution and Denuit (1997) on the Poisson-Goncharov distribution. On the other hand, Reversible Jump Markov Chain Monte Carlo (RJMCMC, Green, 1995) sampling strategies are used to generate samples from each posterior distribution of interest, estimate posterior model probabilities and account for distributional uncertainty. In this paper, we focus on the comparison of three popular distributions used for claim counts, estimating the posterior model odds and comparing the three distributions of interest.
2. Distributions for claim counts
We describe three widely used distributions for modeling the marginal claim counts, namely the simple Poisson distribution (Ter Berg, 1980), the negative binomial (Verrall, 2000) and the Lagrangian Poisson distribution (Scollnik, 1998). The simple Poisson model can be regarded as a special case of either the negative binomial or the Lagrangian Poisson distribution. Let us assume data y_i, i = 1, ..., n. Then the simple Poisson model is given by y_i | λ_i ~ Poisson(λ_i). In our case, since we are interested in the marginal distribution of the claims, λ_i = λ. The negative binomial distribution is obtained by adding a hierarchical step to the simple Poisson model,

y_i | ε_i, λ_i ~ Poisson(ε_i λ_i),   ε_i | θ ~ Gamma(θ, θ),   (1)

where θ > 0 and Gamma(a, b) is the Gamma distribution with mean a/b and variance a/b² respectively. The above model can be rewritten as a negative binomial distribution with E(y_i) = λ_i and V(y_i) = λ_i + λ_i²/θ. The Poisson model is a limiting case of (1) for θ → ∞. Since λ_i = λ, we may adopt the reparametrization θ = λ/φ. This results in a Dispersion Index (DI) equal to DI = 1 + φ. For φ → 0, the above distribution degenerates to the simpler Poisson distribution. The Generalized Lagrangian Poisson model is defined in terms of the parameters θ_i and ω (see Scollnik, 1998, for its probability function). For ω = 0, the above distribution also degenerates to the simple Poisson model with mean θ_i. In order to enhance comparisons across the three models, we reparametrize the above distribution using λ_i = θ_i/(1 - ω). The reparametrized distribution has mean E(y_i) = λ_i, variance V(y_i) = λ_i(1 + ...) and Dispersion Index DI = (1 + ...).
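The hierarchical construction in Eq. (1) is easy to verify by simulation: mixing the Poisson rate with a Gamma(θ, θ) multiplier produces counts whose sample dispersion index approaches 1 + φ when θ = λ/φ. A small check, written here only as an illustration (the parameter values are arbitrary, not from the paper's data sets):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, phi = 4.0, 0.5                  # marginal mean and over-dispersion parameter
theta = lam / phi                    # reparametrization theta = lambda / phi

n = 200_000
eps = rng.gamma(shape=theta, scale=1.0 / theta, size=n)   # Gamma(theta, theta), mean 1
y = rng.poisson(eps * lam)                                # negative binomial counts

di = y.var() / y.mean()
print(f"sample DI = {di:.3f}, expected 1 + phi = {1 + phi:.3f}")
```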
3. Comparing the claim distributions using the Bayesian approach

In the outstanding claims problem, the Bayesian approach has been substantially advocated. It is based on constructing a model m, its likelihood f(y | θ_m, m) and the corresponding prior distribution f(θ_m | m), where θ_m and y denote the parameter vector and the data vector respectively. Although inference is primarily based on the posterior distribution f(θ_m | y, m), we also incorporate model uncertainty by estimating the posterior model probability f(m | y). Consider two competing models m_0 and m_1. Using the Bayes theorem, the posterior odds PO_01 of model m_0 versus model m_1 is given by

PO_01 = f(m_0 | y) / f(m_1 | y) = [f(y | m_0) / f(y | m_1)] x [f(m_0) / f(m_1)] = B_01 x [f(m_0) / f(m_1)],

where B_01 is called the Bayes factor of model m_0 against model m_1, f(m_0)/f(m_1) is called the 'prior model odds', and the marginal likelihood is given by f(y | m) = ∫ f(y | θ_m, m) f(θ_m | m) dθ_m. The Bayes factor B_01 evaluates the evidence against the null hypothesis, which is familiar from classical hypothesis testing. For a set of models M = {m_1, m_2, ..., m_|M|}, the posterior probability of model m ∈ M is defined as

f(m | y) = f(y | m) f(m) / Σ_{m_k ∈ M} f(y | m_k) f(m_k),

where M and |M| denote the set and the number of models under consideration respectively. Only in specific examples are the integrals involved in the computation of the posterior model probabilities analytically tractable. Therefore, asymptotic approximations or alternative computational methods must frequently be employed. Some of the most popular techniques for the calculation of these quantities are Markov Chain Monte Carlo (MCMC) methods and their recent extensions (the Reversible Jump algorithm) to varying-dimension models. Moreover, the Reversible Jump MCMC (RJMCMC) methodology helps us to account for model uncertainty.
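As an illustration of how these quantities combine once marginal likelihoods are available, the sketch below estimates f(y | m) for the Poisson and negative binomial models by a naive Monte Carlo average over the prior and turns the results into posterior model probabilities. This is only a toy sketch, not the paper's RJMCMC scheme; the priors, the data and the parametrization choices are invented for the example.

```python
import numpy as np
from scipy.stats import poisson, nbinom

rng = np.random.default_rng(2)
y = rng.poisson(3.0, size=50)                     # toy claim-count data

def log_marglik_poisson(y, n_draws=20_000):
    """Naive Monte Carlo: average the likelihood over a Gamma(1, 1) prior on lambda."""
    lam = rng.gamma(1.0, 1.0, size=n_draws)
    loglik = poisson.logpmf(y[:, None], lam[None, :]).sum(axis=0)
    return np.logaddexp.reduce(loglik) - np.log(n_draws)

def log_marglik_negbin(y, n_draws=20_000):
    """Same idea with Gamma(1, 1) priors on (lambda, phi); nbinom parametrized via (r, p)."""
    lam = rng.gamma(1.0, 1.0, size=n_draws)
    phi = rng.gamma(1.0, 1.0, size=n_draws)
    r, p = lam / phi, 1.0 / (1.0 + phi)           # mean lam, variance lam * (1 + phi)
    loglik = nbinom.logpmf(y[:, None], r[None, :], p[None, :]).sum(axis=0)
    return np.logaddexp.reduce(loglik) - np.log(n_draws)

logm = np.array([log_marglik_poisson(y), log_marglik_negbin(y)])
post = np.exp(logm - np.logaddexp.reduce(logm))   # equal prior model probabilities assumed
print("posterior model probabilities (Poisson, NegBin):", post)
```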
4. Reversible Jump MCMC for Bayesian model comparison
The Reversible Jump methodology is based on creating an irreducible and aperiodic Markov chain that can alternate (that is, jump) among various models with parameter spaces of different dimension, while retaining the detailed balance which ensures the correct limiting distribution. To our knowledge, no publication in the actuarial literature considers these advanced and modern techniques. We introduce a latent model indicator m that takes values m ∈ {m_1, m_2, m_3} corresponding to the three distributions respectively. Similarly, let θ_m denote the parameters of these distributions. For the comparisons of the distributions we are interested in, the algorithm can be formulated in the following way. Initially, we generate the model parameters θ_m from the conditional distribution f(θ_m | m). Then, we propose with probability j(m', m) = (|M| - 1)^(-1) to jump from m to m'. If m' ≠ m_1 (the Poisson model), θ or ω are generated from the pseudo-prior q(u | m'). The proposed move is accepted with probability α(m, m') = min{1, S(m, m')}.
References
1. M. Denuit, Astin Bulletin 27, 229-242 (1997).
2. P. Green, Biometrika 82, 711-732 (1995).
3. U. Makov, North American Actuarial Jour. 5(4), 53-73 (2001).
4. D.P.M. Scollnik, Astin Bulletin 28, 135-152 (1998).
5. P. Ter Berg, Astin Bulletin 11, 35-40 (1980).
6. R. Verrall, Insurance: Mathematics and Economics 26, 91-99 (2000).
DESIGN, EVALUATION MEASUREMENTS AND CFD MODELING OF A SMALL SWIRL STABILISED LABORATORY BURNER
N. ORFANOUDAKIS TEI Chalkis, Mechanical Engineering Department, Laboratory for Steam Boilers, Turbines & Thermal Plants. 34400 Psachna-Evia, Greece A. HATZIAPOSTOLOU TEI Athens, Energy Technology Dept., 12210 Athens, Greece
E. MASTORAKOS University of Cambridge, UK
E. SARDI Imperial College, London. K. KRALLIS Heron Consultants Engineers, Greece N. VLACHAKIS TEI Chalkis, Mechanical Engineering Department, Laboratory for Fluid Mechanics, 34400 Psachna-Evia, Greece
S. MAVROMATIS TEI Chalkis, Mechanical Engineering Department, Laboratory for Machine Elements, 34400 Psachna-Evia, Greece
An important way of understanding flame processes is to scale up the results of measurements on small burners to larger ones. Ideally, the result of burner and flame scaling would be the complete similarity of all the combustion processes (turbulent transport and mixing, heat generation, heat transfer) in the scaled-down domain. In reality, however, that is not possible, as all the physical and chemical processes will not scale down in the same way. Extensive work on volatile evolution and ignition and their scale-up to larger burners has been reported in the past, where non-dimensional numbers have been used to scale up results measured in small-scale burners, as for example the En number (for the ignition process) and the F1 number (for the devolatilisation process). There are mainly two practical criteria for the scaling of burners, namely constant velocity (CV) and constant residence or mixing time (CRT) scaling. Both scaling criteria rely on the scaling of the large macro-scale turbulent
mixing process. In the constant velocity scaling case, air velocities and fuel particle velocities are maintained constant with scale reduction. Swirling flows are used as a means of controlling flames in combustion chambers and have also found application in various types of burners in order to achieve the desired ignition and burnout characteristics for a given fuel. This paper presents a purpose-built laboratory burner that was designed as a scale model of an industrial coal burner operating in a cement rotary kiln. The design criteria are described in order to justify the technical solutions employed in the burner and its support equipment. Detailed velocity and temperature measurements, obtained by means of a Laser Doppler Velocimetry system and a thermocouple respectively, under various operating conditions, are also presented. The measurements provide further understanding of how swirl interacts with the combustion process occurring in this type of industrial burner and can lead to conclusions about the behaviour of coal particles or droplets within the flame as well as the emission production. The latter can be explained because large and small particles, following different trajectories dependent on near-burner aerodynamics, remain in a variety of temperature regions. In this way, coal particle heat-up, devolatilisation and combustion occur under different temperature fields and oxygen availability. Thus, the production of NOx, which depends on those factors, can be predicted. The data presented can also be used to form a database for the validation of a computer code simulating the combustion process. This work is part of a larger program undertaken by the current group for the study of processes occurring in kilns of the cement industry. The main objective of this program is the development of the necessary know-how for the clean combustion of organic wastes in this type of kiln. The laboratory burner used in this study, shown in Figure 1, allows for mixtures of gaseous, liquid and pulverised solid fuel and flows of different degrees of swirl. The similarity to the industrial-scale burner was accomplished by employing the constant velocity (CV) scaling criterion. Due to the small size of the laboratory flame, in comparison to the respective flame in the industrial burner, the residence time of the coal particles in the former is reduced and the use of gaseous fuel (methane) is necessary to ensure flame stability. The amount of swirl in the flow can be adjusted by varying the ratio of axial and tangential air, while maintaining the same total air flow rate. The air for coal is directed to an air-tight metallic box which encloses the coal feeder. It uses a vibrating tube with variable amplitude to disperse the coal from a hopper into the air flow. Then the coal is pneumatically transported through a copper tube to the burner. Velocity measurements in the near-burner region were obtained with a dual-beam LDV system employed in the off-axis (30°) forward-scatter mode, as described in earlier work of our group. Isothermal and reacting single-phase
flows (air and/or natural gas) as well as reacting multiphase flows (with coal present) were seeded.
Figure 1: Laboratory burner (a) with enlarged view of the fuel gun (b), horizontal cross-sections of the burner (c) showing i) tangential and ii) axial inlets, and (d) coordinate system and velocity components.

Four cases under reacting conditions with gaseous and pulverised coal fuels have been considered, and the mean and rms velocity components at 4 different stations downstream of the burner exit are reported. The total air flow rate was the same for all cases, but the tangential to axial air flow rate ratio varied, resulting in weaker or stronger swirling flow as shown in Table 1. The radial component is presented only at the two stations closer to the burner exit, since its values are close to zero further downstream. In all figures, velocity and radial distance were normalised by the bulk exit velocity V_e and the burner throat diameter (D_e), respectively. All mean profiles show a high degree of symmetry around the central burner axis, which indicates a symmetric flow field for both cases.
Table 1. Test case operating conditions.

Case description              Case 1 (Medium swirl)   Case 2 (Strong swirl)
Tangential air [m3/h]         96                      175
Axial air [m3/h]              84                      0
Gas flow rate [m3/h]          7                       7
Bulk exit velocity [m/s]*     25                      24
Swirl number                  0.65                    0.9
Figure A: Effect of swirl on gaseous flames ((a) case 1, (b) case 2) and coal flames ((c) case 1, (d) case 2).

At the internal zone, with high combustion activity, the influence of the combustion can be interpreted as acting opposite to the swirl movement, resulting in a weaker inner vortex core, whereas the outer vortex core swirls with higher velocities. Combustion induces a temperature increase and thus volume expansion of the gaseous elements in all directions. This expansion implies disorderly motions, resulting in angular momentum losses for the swirling flow. The mean swirl profiles further downstream do not display this feature, probably due to the larger distance from the combustion initiation zone, where local pressure differences are intense. To enhance understanding of the effect of the swirl number on the fluid and particle motion, a numerical parametric investigation has been performed, and preliminary results are compared to experimental data and discussed in this section. The three-dimensional steady-state Favre-averaged Navier-Stokes equations describing the mean single-phase isothermal swirling flow coming out of the laboratory burner were solved by use of the commercial software Star-CD. Unclosed turbulence terms were modelled by use of the standard high-Reynolds k-ε model. The deficiencies of this model in swirling flows are well documented; however, it has proven one of the most appropriate models in terms of computational economy, stability and reliability of the results, particularly in combustion applications, and has been adopted for the present work.

* Bulk exit velocity, V_e, is the mean velocity at the burner exit determined by the volumetric air flow rate and the annular area at the exit.
Figure B: Predictions of axial and swirl velocity components and turbulence levels along the centreline (a) and at Z/D_e = 0.62 (b).

Figure B shows a generally good agreement between measurements and computations as a function of the swirl number, taking into account the restrictions imposed by the turbulence model and uncertainties regarding the initial flow conditions. Investigations of the flow field generated by a versatile small swirl-stabilised multi-fuelled laboratory burner showed that:
- For all the different flow settings, swirl numbers 0.65 to 0.9, the obtained profiles show a highly symmetric flow field with respect to the centreline of the burner.
- With a sufficiently high amount of swirling air, part of the fluid reverses its axial flow direction and forms an internal recirculation zone (IRZ). This IRZ can be illustrated as a toroidal vortex formed around the centre axis of the burner.
- Flames with stronger swirl appear to be shorter (along the z-axis) and broader (along the r-axis).
- The internal recirculation zone appears to be 30% wider for the higher swirl case (swirl number 0.9 as compared to the 0.65 one), and the width of the zone in which the coal particles recirculate is 20% larger as compared to the lower swirl number.
- There is, as expected, increased centrifuging of coal particles for the higher swirl number.
Further work can be performed with the use of this burner to examine other parameters such as burner quarl geometry, behaviour of different sizes of coal particles and staged combustion. The laboratory burner can also be used as a test bed for laser techniques such as spectroscopic (CARS, LIV) or particle sizing techniques.
DATA STRUCTURING APPLICATIONS FOR STRING PROBLEMS IN BIOLOGICAL SEQUENCES

Y. PANAGIS, E. THEODORIDIS, K. TSICHLAS
Research Academic Computer Technology Institute, 61 Riga Feraiou Str., 26221 Patras, Greece
& Computer Engineering and Informatics Department, University of Patras, 26500 Patras, Greece
E-mail: {panagis, theodori, tsihlas}@ceid.upatras.gr
In this work we present applications of data structuring techniques for string problems in biological sequences. We firstly consider the problem of approximate string matching with gaps and secondly the problem of identifying occurrences of maximal pairs in multiple strings. The proposed implementations can be used in many problems that arise in the field of Computational Molecular Biology.
1. Introduction
One important goal in computational molecular biology is identifying repeated patterns (always of initially unknown content) in nucleic or protein sequences. In the Exact pattern matching problem one wishes to find all the occurrences of a pattern x in a biological sequence, while in the Approximate pattern matching problem one wishes to find the occurrences of possibly diverse forms of a pattern x, i.e. a set of substrings that match the given pattern with at most k differences (k is a constant that expresses the approximation extent of the mismatch). Tracing of a pattern in an approximate manner is often used in DNA sequencing by hybridization, in the reconstruction of DNA sequences from known DNA fragments and in the determination of evolutionary trees among distinct species. Finally, the detection of regularities in such sequences is very useful, because it points out features that are common to a set of them.
2. Background

2.1. The (δ,γ)-approximate pattern matching problem
The approximate pattern-matching problem is based on two metrics of string approximation, the δ-approximation and the γ-approximation, where δ and γ are integers. Two symbols a, b of an alphabet Σ are said to be δ-approximate, denoted a ≈_δ b, if and only if |a - b| ≤ δ. In that manner, two strings x, y are δ-approximate, denoted x ≈_δ y, if and only if |x| = |y| and x_i ≈_δ y_i for all i. Also, two strings x, y are γ-approximate, denoted x ≈_γ y, if and only if |x| = |y| and Σ_{i=1}^{|x|} |x_i - y_i| ≤ γ. Finally, two strings are (δ,γ)-approximate when both conditions are satisfied.
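Checked directly from these definitions, (δ,γ)-approximation of two equal-length strings over an integer alphabet is just a pair of elementwise tests. A small illustrative check in Python (not the C++/LEDA code used by the authors):

```python
def delta_gamma_approx(x, y, delta, gamma):
    """Return True if the integer strings x and y are (delta, gamma)-approximate."""
    if len(x) != len(y):
        return False
    diffs = [abs(a - b) for a, b in zip(x, y)]
    return max(diffs) <= delta and sum(diffs) <= gamma   # delta- and gamma-conditions

# toy example over an alphabet of small integers
print(delta_gamma_approx([1, 3, 5], [2, 3, 4], delta=1, gamma=2))   # True
print(delta_gamma_approx([1, 3, 5], [3, 3, 4], delta=1, gamma=2))   # False: first symbol differs by 2
```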
2.2. Suffix Trees and their use
In order to solve the maximal pair problem we used a well-known data structure, the Generalized Suffix Tree. A Suffix Tree is a compact trie corresponding to the suffixes of a given string. Consequently, a generalized suffix tree is a tree data structure with the same properties as the suffix tree, but built over a set of strings instead of a single string. A suffix tree for a string of length n can be constructed, using McCreight's algorithm [4], in O(n) time and uses O(nk) space, where k is the size of the alphabet. (In the case of DNA sequences k = 4, so O(n) space is used.) A generalized suffix tree for a set of strings with total length n uses the same space and needs the same time to be constructed [1].
3. Methodology

In the field of the approximate string matching problem, we have focused on a set of methods, depending on the type of approximation and the variations of the gaps. The goal is to find δ-occurrences with α-bounded gaps, (δ,γ)-occurrences with α-bounded gaps, δ-occurrences with unbounded gaps, (δ,γ)-occurrences with unbounded gaps, δ-occurrences minimizing the total difference of gaps, and δ-occurrences with bounded difference of gaps, of a pattern x in a string T. Ultimately we want to find the δ-occurrences of a pattern in a set of strings with bounded gaps. The time and space used by the above methods are described thoroughly in [2]. Also in this work we have considered the problem of finding occurrences of maximal pairs P in a set of strings S = {S_1, S_2, ..., S_k}. A pair P is a substring x that occurs twice in a string S_i, and the gap is defined as the number of characters between the two occurrences (see Figure 2). The notion of maximality refers to the fact that its component x is the longest possible.
Figure 2. A gap between the two occurrences of pair x
Assuming that |S_1| + |S_2| + ... + |S_k| = n, there are two algorithms for reporting the maximal pairs, depending on whether the gap of every pair is bounded or not. In the first case, when the size of the gaps is bounded by a constant value b, the time complexity is O(n log^2 n + a k log n), where a is the size of the output, bounded by O(bn). In the second case, where the gaps of maximal pairs are unbounded, the time complexity of reporting them is O(n + a), where a is the size of the output, bounded by O(n^2). For the representation of a set of k input strings, a generalized suffix tree (GST) is used. The GST, as previously described, is constructed in linear time and consumes linear space. The main idea behind the reporting of the maximal pairs (and simple pairs) is that, for any internal node u, the substring that corresponds to the path from the root to u appears in all the suffixes that traverse u and end up at the leaves of its subtree (see Figure 3).
4. Experimental Results
We implemented the above algorithms using the C++ programming language and the LEDA library, which provides efficient implementations of basic data structures and algorithms. We used the gcc compiler (version 2.95) and ran the experiments on an UltraSPARC@400 MHz system with the Solaris OS and on a Pentium IV@2500 MHz system with the Linux OS.
Figure 4. Experimental results for the time complexity: running time versus total length of strings, for maximal pairs with unbounded gaps and with bounded gaps (gap = 250, 500, 750).
References
1. D. Gusfield, Algorithms on Strings, Trees and Sequences, Cambridge University Press (1997).
2. M. Crochemore, C.S. Iliopoulos, C. Makris, W. Rytter, A. Tsakalidis, K. Tsichlas, Approximate String Matching with Gaps, Nordic Journal of Computing, Volume 9, 2002, pp. 54-65.
3. C.S. Iliopoulos, C. Makris, S. Sioutas, A. Tsakalidis, K. Tsichlas, Identifying Occurrences of Maximal Pairs in Multiple Strings, in Proc. of Combinatorial Pattern Matching 2002 (CPM 2002), LNCS 2373, pp. 133-143, Springer Verlag.
4. E.M. McCreight, A Space-Economical Suffix Tree Construction Algorithm, JACM 23(2): 262-272 (1976).
COMPUTING NASH EQUILIBRIA THROUGH PARTICLE SWARM OPTIMIZATION
N.G. PAVLIDIS, K.E. PARSOPOULOS AND M.N. VRAHATIS
Department of Mathematics and Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece
E-mail: {npav, elena, kostasp, vrahatis}@math.upatras.gr

This paper considers the application of a novel optimization method, namely Particle Swarm Optimization, to compute Nash equilibria. The problem of computing equilibria is formulated as one of detecting the global minimizers of a real-valued, nonnegative function. To detect more than one global minimizer of the function at a single run of the algorithm and to address effectively the problem of local minima, the recently proposed Deflection technique is employed. The performance of the proposed algorithm is compared to that of algorithms implemented in the popular game theory software suite, GAMBIT. Conclusions are derived.
1. Introduction
A central solution concept in game theory is that of Nash equilibrium. Several approaches have been proposed for the computation of Nash equilibria in finite strategic games, but computing such solutions remains a challenging task. Furthermore, as pointed out in Ref. 3, computing a single Nash equilibrium is inadequate for many applications. The problem of detecting a Nash equilibrium can be formulated as a global minimization problem. This approach enables us to consider an efficient and effective optimization method, named Particle Swarm Optimization (PSO), to address this problem. Incorporating the recently proposed Deflection technique for alleviating local minima and finding more than one global minimizer of a function, several Nash equilibria can be located in a single run of the algorithm. The performance of the proposed algorithm is compared to that of algorithms implemented in the popular game theory software suite GAMBIT.(a)

(a) The GAMBIT suite is freely available from: http://www.hss.caltech.edu/gambit/.
The paper is organized as follows: Section 2 is devoted to the formulation of the problem. In Sections 3 and 4 the PSO and Deflection techniques are briefly described. Experimental results are reported in Section 5 and conclusions are drawn in Section 6.

2. Problem Formulation

2.1. Strategic Games and Nash Equilibria
Definition 2.1. A strategic game consists of a finite set N = {1, ..., n} of players; for each player i ∈ N a strategy set S_i = {s_i1, ..., s_im_i} is given, consisting of m_i pure strategies. For each i ∈ N, a payoff function u_i : S → R is also given, where S = ×_{i∈N} S_i is the Cartesian product of all the S_i's.

Let P_i be the set of real-valued functions on S_i. The notation p_ij = p_i(s_ij) is used for the elements p_i ∈ P_i. Let also P = ×_{i∈N} P_i and m = Σ_{i∈N} m_i. Then P is isomorphic to R^m. We denote elements in P by p = (p_1, p_2, ..., p_n), where p_i = (p_i1, p_i2, ..., p_im_i) ∈ P_i. If p ∈ P and p̄_i ∈ P_i, we use the notation (p̄_i, p_-i) for the element q ∈ P that satisfies q_i = p̄_i and q_j = p_j for j ≠ i. Now let Δ_i be the set of probability measures on S_i. We define Δ = ×_{i∈N} Δ_i, so Δ ⊂ R^m. Thus, the elements p_i ∈ Δ_i are real-valued functions on S_i, p_i : S_i → R, and it holds that Σ_{s_ij∈S_i} p_i(s_ij) = 1 and p_i(s_ij) ≥ 0 for all s_ij ∈ S_i.

We use the abusive notation s_ij to denote the strategy p_i ∈ Δ_i with p_ij = 1. Hence, the notation (s_ij, p_-i) represents the strategy where player i adopts the pure strategy s_ij and all the other players adopt their components of p. The payoff function u is extended to have domain R^m by the rule

u_i(p) = Σ_{s∈S} u_i(s) Π_{j∈N} p_j(s_j).   (1)
Definition 2.2. A strategy p* = (p*_1, p*_2, ..., p*_n) ∈ P is a Nash equilibrium if p* ∈ Δ and, for all i ∈ N and all p_i ∈ Δ_i, u_i(p_i, p*_-i) ≤ u_i(p*).
To formulate the problem of finding a Nash equilibrium to that of detecting the global minimum of a real valued function, three functions, z, z and
486
g :P
-+
Rm, are required. For any p E P , i E N and
sij
E Si, define: (3)
Zij(P) = %(%j,P-z),
The real valued function v : P -+ R, is defined as: iEN l ( j ( m i
Function v is nonnegative and continuously differentiable. Furthermore, p* is a Nash equilibrium if and only if, it is a global minimizer of v, i.e. v(p*) = 0, and p* E A. 3. Particle Swarm Optimization
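For a two-player (bimatrix) game these definitions reduce to matrix-vector products, so v(p) can be evaluated in a few lines. A hedged illustration in Python: the payoff matrices below are arbitrary toy values, not the payoff tables of the paper's test problems.

```python
import numpy as np

# Two-player game: A[i, j] is player 1's payoff, B[i, j] is player 2's payoff
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])

def v(p1, p2):
    """Nash 'energy' v(p) of Eq. (5): zero exactly at Nash equilibria (p1, p2 mixed strategies)."""
    x1, x2 = p1 @ A @ p2, p1 @ B @ p2                       # expected payoffs x_i(p) = u_i(p)
    z1, z2 = A @ p2, B.T @ p1                               # payoffs of each pure strategy, Eq. (3)
    g1, g2 = np.maximum(z1 - x1, 0.0), np.maximum(z2 - x2, 0.0)   # Eq. (4)
    return np.sum(g1**2) + np.sum(g2**2)

print(v(np.array([1.0, 0.0]), np.array([1.0, 0.0])))        # pure equilibrium -> 0.0
print(v(np.array([0.5, 0.5]), np.array([0.5, 0.5])))        # not an equilibrium -> positive
```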
3. Particle Swarm Optimization

PSO belongs to the broad class of stochastic optimization algorithms. PSO is a population-based algorithm that exploits a population of individuals to probe promising regions of the search space. In this context, the population is called a swarm and the individuals are called particles. Each particle is assigned to a neighborhood and moves with an adaptable velocity within the search space, retaining in its memory the best position it ever encountered. Moreover, the best position ever attained by all individuals of a neighborhood is communicated to the particles that comprise the neighborhood. Assume a D-dimensional search space, S ⊂ R^D, and a swarm consisting of N particles. The i-th particle is in effect a D-dimensional vector X_i = (x_i1, x_i2, ..., x_iD)^T. The velocity of this particle is also a D-dimensional vector, V_i = (v_i1, v_i2, ..., v_iD)^T. The best previous position encountered by the i-th particle is a point in S, denoted as P_i = (p_i1, p_i2, ..., p_iD)^T. Assume g to be the index of the particle that attained the best previous position among all the individuals of the swarm, and t to be the iteration counter. Then, according to the latest version of PSO, which incorporates a parameter called the constriction factor, the swarm is manipulated using the following equations:

V_i(t+1) = χ [ V_i(t) + c_1 r_1 (P_i(t) - X_i(t)) + c_2 r_2 (P_g(t) - X_i(t)) ],   (7)
X_i(t+1) = X_i(t) + V_i(t+1),   (8)
where i = 1, 2, ..., N; χ is the constriction factor; c_1 and c_2 are positive parameters called the cognitive and social parameter respectively; r_1, r_2 are random numbers uniformly distributed in the interval [0,1]; and t stands for the counter of iterations. The formulae used for the computation of the constriction factor's value are reported in Ref. 1.
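In code, Eqs. (7)-(8) are a handful of vectorized array updates. A minimal global-best, constriction-factor PSO sketch in Python follows; the default parameter values, bounds handling and stopping rule are generic choices for illustration, not the authors' implementation.

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=400, bounds=(0.0, 1.0), seed=0):
    """Minimal global-best PSO with constriction factor (chi = 0.729, c1 = c2 = 2.05)."""
    rng = np.random.default_rng(seed)
    chi, c1, c2 = 0.729, 2.05, 2.05
    lo, hi = bounds
    X = rng.uniform(lo, hi, (n_particles, dim))          # positions
    V = np.zeros_like(X)                                  # velocities
    P = X.copy()                                          # personal best positions
    Pval = np.array([f(x) for x in X])                    # personal best values
    g = Pval.argmin()                                     # index of the global best particle
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        V = chi * (V + c1 * r1 * (P - X) + c2 * r2 * (P[g] - X))   # Eq. (7)
        X = np.clip(X + V, lo, hi)                                  # Eq. (8), kept inside the box
        vals = np.array([f(x) for x in X])
        better = vals < Pval
        P[better], Pval[better] = X[better], vals[better]
        g = Pval.argmin()
    return P[g], Pval[g]

# usage: minimize a simple quadratic
best_x, best_val = pso(lambda x: np.sum((x - 0.3)**2), dim=2)
print(best_x, best_val)
```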
4. Detecting Several Minimizers Through Deflection

To detect several global minimizers in a single run, the Deflection technique is applied. Let f : S → R, S ⊂ R^n, be the original objective function under consideration. Let also x_i*, i = 1, ..., m, be m minimizers of f. Then, the Deflection technique is defined as:

F(x) = T_1(x; x_1*, λ_1)^(-1) ··· T_m(x; x_m*, λ_m)^(-1) f(x),   (9)
where λ_i, i = 1, ..., m, are relaxation parameters. The functions
T_i(x; x_i*, λ_i) = tanh(λ_i ||x - x_i*||),   i = 1, ..., m,   (10)
satisfy the property that any sequence of points {x_k} converging to any one of the minimizers x_i* does not produce a minimum of F at x = x_i*, while all other minima of f remain unaffected, as shown in Ref. 2. Note that if the global minimum of the function is zero, a function f̂ = f + c, where c > 0 is a constant, should replace f in Eq. (9).
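The Deflection transformation of Eqs. (9)-(10) is equally compact; combined with the pso() sketch given above, repeated runs can avoid minimizers that have already been found. Again this is only an illustrative sketch: the relaxation parameter λ, the shift c and the example function are arbitrary choices, not the paper's settings.

```python
import numpy as np

def deflected(f, found, lam=5.0, c=0.1):
    """Return F(x) of Eq. (9) with f_hat = f + c and T_i(x) = tanh(lam * ||x - x_i*||)."""
    def F(x):
        value = f(x) + c                                   # shift so the global minimum is positive
        for xstar in found:
            d = max(np.linalg.norm(x - xstar), 1e-12)      # guard against division by zero
            value /= np.tanh(lam * d)                      # deflect the already-found minimizer
        return value
    return F

# usage with the pso() sketch above: deflect the first minimizer, then search again
f = lambda x: np.sum((x - 0.3)**2) * np.sum((x - 0.7)**2)  # two global minima, both with value 0
x1, _ = pso(f, dim=2, seed=1)
x2, _ = pso(deflected(f, [x1]), dim=2, seed=2)
print(x1, x2)   # the second run should land away from x1, near the other minimum
```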
5. Experimental Results

The proposed algorithm has been applied to noncooperative strategic games characterized by more than one Nash equilibrium. A comparison of the algorithm's performance against the Lyapunov function algorithm implemented in GAMBIT produced promising results. The particular algorithm was selected since it is the suggested algorithm for detecting all the equilibria of an n-person game.4 Indicative test problems are defined below:
Test problem 1 [BACH OR STRAVINSKY]: Two-player game, with two pure strategies available to each player and 3 equilibria. The payoff matrix for this game is illustrated in Table 1.
Test problem 2 [HAWK-DOVE]: Two-player game, with two pure strategies available to each player and 3 equilibria. The payoff matrix for this game is also illustrated in Table 1.
Test problem 3 [STAG HUNT GAME, Ref. 4]: Two-player game, with three pure strategies available to each player and 6 equilibria. The payoff matrix is illustrated in Table 2.
Test problem 4: Random strategic game with two players and three pure strategies available to each player, with payoffs randomly distributed in the interval [0,1] and 3 equilibria. The payoff matrix is illustrated in Table 3.
Test problem 5: Normal-form three-player game with two pure strategies available to each player and 9 equilibria. The payoffs of this game are given in Table 4.

Table 1. Payoff matrices for Test Problems 1 (left, strategies Bach/Stravinsky) and 2 (right, Hawk-Dove).

Table 2. Payoff matrix for Test Problem 3.
          s_21         s_22        s_23
s_11      0.5, 0.5     0.5, 0      0.5, -0.5
s_12      -0.5, 0.5    0.5, 1      1.5, 1.5

Table 3. Payoff matrix for Test Problem 4.
0.0470, 0.5297    0.6789, 0.6711
0.9347, 0.3834    0.3835, 0.0668
0.8310, 0.6868    0.0346, 0.5890

Table 4. Payoff matrix for Test Problem 5.
s_11:   9, 8, 12    0, 0, 0
s_11:   0, 0, 0     3, 4, 6
s_12:   0, 0, 0     9, 8, 2
s_12:   3, 4, 4     0, 0, 0
The configuration of the PSO parameters has been fixed for all test problems, with the exception of swarm size, which was problem dependent.
Thus, for the constriction factor and the cognitive and social parameters, c_1 and c_2 respectively, the default values χ = 0.729, c_1 = c_2 = 2.05 have been used (see Ref. 1). For the Deflection technique, the setup γ_1 = γ_2 = ..., ρ = 10^(-...), and λ = 1, respectively, was selected. The desired accuracy for detecting a Nash equilibrium was set to the same threshold in all cases. The obtained results are exhibited in Table 5. In particular, the success rate (SR) of the PSO and the Lyapunov function method in locating all Nash equilibria, averaged over 10 runs, as well as the mean number of function evaluations required for the computation of one equilibrium (FE), are reported.

Table 5. Experimental results.

Test Problem   Nash Equilibria   PSO SR   PSO FE   Lyapunov SR   Lyapunov FE
1              3                 100%     530      50%           2148
2              3                 100%     884      30%           1188
3              6                 80%      4050     0%            14835
4              3                 100%     7500     10%           177536
5              9                 82%      6700     0%            75583
6. Conclusions

In this contribution the problem of detecting the Nash equilibria of finite strategic games was addressed through the Particle Swarm Optimization method, equipped with the Deflection technique. Deflection enables the algorithm to overcome local minima and to detect more than one global minimizer in a single run. Numerical experiments performed on a number of finite strategic games suggest that the particular approach addresses the problem effectively.
References
1. M. Clerc, J. Kennedy, IEEE Trans. Evol. Comput., 6(1):58-73, 2001.
2. G.D. Magoulas, M.N. Vrahatis, G.S. Androulakis, Nonlinear Analysis, Theory, Methods & Applications, 30(7):4545-4550, 1997.
3. R.D. McKelvey, A. McLennan, in H.M. Amman, D.A. Kendrick, and J. Rust (Eds.), Handbook of Computational Economics, pp. 87-142, North-Holland, 1996.
4. R.D. McKelvey, A. McLennan, T. Turocy, Manual of the Gambit Command Language, Ver. 0.96.3, California Institute of Technology, 2000.
5. J.F. Nash, Annals of Mathematics, 54:289-295, 1951.
6. M.J. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
7. K.E. Parsopoulos, M.N. Vrahatis, Natural Computing, 1(2-3):235-306, 2002.
FUNDAMENTAL SOLUTION OF THE CRACKED DISSIMILAR ELASTIC SPACE

D.G. PAVLOU, N.V. VLACHAKIS, M.G. PAVLOU, V.N. VLACHAKIS, M. KOUSKOUTI, L. STATHARAS
T.E.I. Chalkidas, Mechanical Engineering Dept., Psahna, Halkida, Evoia, Greece
Fundamental solutions are used in the Boundary-Integral Equation analysis to determine the stress and/or displacement fields of finite elastic bodies subjected to external loading. Especially for crack problems, the method of Fundamental Solutions gives more accurate results because they completely satisfy part of the boundary conditions of the problem. Fundamental solutions have been derived for axisymmetric body forces acting along a circle in a radial, torsional and axial direction in a homogeneous infinite elastic space, in a homogeneous elastic half-space, and in a homogeneous infinite elastic space containing cracks. The investigation in the present work is focused on the development of the Fundamental Solution for the infinite dissimilar elastic space containing interface cracks (annular or circular), loaded by singular tangential coaxial circular sources. This Fundamental Solution will be used for the formulation of the Boundary-Integral Equation, which will be solved numerically to analyze finite dissimilar cracked elastic solids under torsion. Advantages of the proposed method are: a) for the stress or strain analysis of a bimaterial cracked body, no discretization of the crack surface is necessary, and b) the accuracy of the results is guaranteed by the fact that the singularity at the crack tip is included in the Fundamental Solution. To obtain the above Fundamental Solution for the infinite dissimilar elastic space containing interface cracks, the problem of the un-cracked dissimilar infinite elastic solid is treated first. Subsequently, the superposition of two partial solutions is used. The first solution corresponds to a unit tangential load on a circular ring of the un-cracked dissimilar elastic body and the second one pertains to the same body with an interface crack. To this end, the stress distribution derived from the first problem at the prospective crack surface is applied to the surface of the actual crack in the second problem, leading to the complete solution of the required crack problem. Considering the derived fundamental solutions, the graphical representation of the strain distribution results is displayed for the infinite dissimilar elastic space containing interface cracks (annular or circular), loaded by singular tangential coaxial circular sources. For the numerical calculations the commercial code MATHEMATICA has been used. The considered infinite bimaterial elastic solid is composed of aluminum and steel. The singular tangential co-axial circular loading source is located on the interface. The displacement and stress distributions are displayed on several planes within the two materials of the cracked infinite bimaterial space. In both domains, stress concentrations are shown due to the existence of the singular loading source and the crack tip, respectively. Moreover, the stress values are greater within the steel domain because of the greater shear modulus.
STUDY OF FRACTURE IN SiC/Al COMPOSITES

G. PAPAKALIATAKIS
Democritus University of Thrace, GR-67100 Xanthi, Greece
E-mail: [email protected]
D. KARALEKAS University of Piraeus GR-185 34 Piraeus, Greece E-mail: [email protected]
Interest in metal matrix composites has increased dramatically in the past decade with advances in processing and an increasing demand for materials with improved mechanical and physical properties. The major advantages of these materials consist of their high strength- and stiffness-to-weight ratios, their resistance to severe environments and their retention of strength at high temperatures. During this period, significant efforts to improve our understanding of the fundamental aspects of the deformation and fracture behavior of this broad class of materials have been undertaken, with considerable success. As many metal matrix composites have advanced from the development stage, additional consideration has been given to fracture behavior because of its importance in eventual engineering applications. In the present work the problem of the fracture behavior of a SiC/6061-Al filamentary composite was investigated. The composite was modeled as a two-material cylinder consisting of an inner cylinder simulating the fiber and a surrounding shell simulating the matrix. For the two-material composite cylinder the outer radius was taken equal to 1.51 x the inner radius, which corresponds to a fiber volume ratio of 0.44. The fiber is considered linear elastic up to fracture, while the aluminum matrix exhibits elastoplastic behavior. It was assumed that the two components are perfectly bonded at the interface, and the cylinder was subjected to a uniform displacement along its upper and lower faces. The fracture behavior of the modeled composite was studied for the cases where: a) no crack exists, b) a small central crack exists in the fiber, perpendicular to the fiber long axis, and c) the fiber has fully fractured. A nonlinear finite deformation analysis was performed based on the finite element code ABAQUS. For the finite element analysis, due to symmetry, one quadrant of the cylindrical element of the composite was modeled. Axisymmetric quadrilateral four-node elements were used and a very detailed analysis of the stress field in the vicinity of the crack tip was undertaken. Results for the stress, strain, displacement and strain energy density quantities were calculated. The deformed profiles of the crack faces near the crack tips were determined. The results of the stress analysis were coupled with the strain energy density theory to predict the initiation of crack growth. Crack growth, in general, consists of three stages: crack initiation, sub-critical or slow growth, and unstable crack propagation. These stages of crack growth are addressed in a unified manner by the strain energy density criterion, which was introduced by Sih and then used by various researchers for the solution of a host of problems of engineering importance. According to the strain energy density theory, crack growth takes place when the strain energy density at an element ahead of the crack tip reaches a critical value. This value is calculated from the stress-strain curve of the material in tension, which for the constituents of the composite under investigation was experimentally determined. Finally, results concerning the variation of the strain energy density versus the distance from the crack tip, for the determination of the critical value of applied displacement (u) at crack initiation, are presented.
COMPUTATIONAL STUDY OF THE CRACK EXTENSION INITIATION IN A SOLID PROPELLANT PLATE WITH A CIRCULAR HOLE

G. PAPAKALIATAKIS
School of Engineering, Democritus University of Thrace, GR-67100 Xanthi, Greece
Solid propellants are particulate composite materials, containing hard particles embedded in a rubbery matrix. On the microscopic scale, a highly filled propellant can be considered as nonhomogeneous. When the material is strained, damage in the form of microvoids in the binder or debonding at the matrix/particle interface takes place. As the applied strain in the material is progressively increased the growth of damage takes place as successive nucleation and coalescence of the microvoids or as material tears. These processes of damage initiation and evolution are time-dependent and they are mainly responsible for the time-sensitivity of the nonlinear stress-strain behaviour of solid propellants. Their mechanical response is strongly influenced by the loading rate, temperature and material microstructure. A considerable amount of works has been performed by Liu and coworkers [ 1,2,3] to study crack growth behaviour in solid propellants. They investigated the characteristics of damage zone near the crack tip and crack growth behaviour in cracked specimens of a solid propellant. From experimental results they established that the damage characteristics have strong effects on crack growth behaviour. Crack growth consists of crack tip blunting, resharpening and zig-zag crack growth. In the present work the problem of the crack growth initiation of a solid propellant plate containing a crack and a circular hole was studied. The plate is subjected to a uniform displacement along its upper and lower faces. Solid propellants are modeled as hyperelastic materials. The behaviour of hyperelastic materials is described in terms of a strain energy potential U(E).The more frequently used forms of the strain energy potentials for modeling approximately incompressible isotropic materials are the polynomial form and the Ogden form. In the present work the material was modeled by the Ogden form of the strain energy potential. A nonlinear finite deformation analysis was performed based on the finite element code ABAQUS. This computer program was used to solve the boundary value problem of two groups cracked specimens. All the specimens are rectangular sheets of width W=lOO mm, 494
height h=75 mm, containing, at mid height, a crack of length 2a=15 mm and a circular hole with radius r=10 mm. For the first group of specimens the center of the hole is on the crack axis and the distances of the right crack tip B from the nearest point of the hole are d=4.829, 8.122, 10.622, 13.013 and 17.798 mm. For the second group of specimens the distance of the right crack tip B from the hole is d=4.829 mm and the normal distances of the hole center from the crack axis are v=0, 5, 10 and 15 mm. The thickness of the specimen was small enough to assume that conditions of plane stress prevail. The true stress-strain curve of the propellant in tension was used. A very detailed analysis of the stress field in the vicinity of the crack tip was undertaken. Results for the stress, strain, displacement and strain energy density quantities were calculated. The deformed profiles of the crack faces near the crack tips were determined. The results of the stress analysis were coupled with the strain energy density theory to predict the initiation of crack growth. According to the strain energy density theory, crack growth takes place when the strain energy density at an element ahead of the crack tip reaches a critical value. This value is directly determined from the area underneath the stress-strain diagram of the material up to the point of fracture. The deformed crack profiles for various values of the applied displacement were plotted. Finally, results for the critical applied displacement for crack initiation as a function of the distance of the crack tip from the hole were presented. Also, the variation of the critical applied displacement as a function of the normal distance of the hole center from the crack axis was plotted.
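For orientation, the Ogden form of the strain energy potential referred to above is commonly written in terms of the deviatoric principal stretches; the expression below states that generic form in one widespread convention (e.g. the form used in ABAQUS-style material definitions). It is given only as background, not as the specific parameter set fitted in this study.

U = \sum_{i=1}^{N} \frac{2\mu_i}{\alpha_i^{2}}
      \left( \bar{\lambda}_1^{\alpha_i} + \bar{\lambda}_2^{\alpha_i} + \bar{\lambda}_3^{\alpha_i} - 3 \right)
  + \sum_{i=1}^{N} \frac{1}{D_i} \left( J - 1 \right)^{2i}

Here the \bar{\lambda}_j are the deviatoric principal stretches, J is the volume ratio, and \mu_i, \alpha_i, D_i are material constants; the D_i terms vanish in the fully incompressible limit.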
References
[1] C.T. Liu, Journal of Spacecraft and Rockets, 32(3), 535-537 (1995).
[2] C.T. Liu, J. Spacecraft and Rockets, 29(5), pp. 713-717 (1992).
[3] C.T. Liu and B. Tang, Proceedings of the 1995 SEM Spring Conference and Exhibit on Experimental Mechanics, Grand Rapids, June 12-14, 1995, pp. 831-836 (1995).
NODAL STRESS RECOVERY AND ERROR ESTIMATION BASED ON VARIATION OF MAPPING FUNCTION
S. H. PARK School of Mechanical and Aerospace Engineering, Seoul National University, Shinrim-dong, Kwanak-ku, Seoul 151-742, Korea E-mail: psh@odyssey.snu.ac.kr
J. H. KIM School of Mechanical and Aerospace Engineering, Seoul National University, Shinrim-dong, Kwanak-ku, Seoul 151-742, Korea E-mail: [email protected] A technique for nodal stress recovery and a posteriori error estimation is developed for linear elasticity problems. The nodal stress recovery technique is based on the error distribution obtained from the variation of the mapping function, and the a posteriori error is calculated from the recovered stress. A pronounced improvement in the recovered stress and the estimated error is observed with this method. In addition, results show that the estimated error can be considered as an upper bound.
1. Introduction
Over the years, the finite element method has become one of the most widely used numerical methods in engineering. The convergence of the numerical solution depends very much on the size and the shape of the elements. Increasing the degrees of freedom enhances the quality of the solution but requires more computational time. Especially in the case of a complex domain or boundary, the mesh should be well structured for accuracy and computational time. In these situations, an adaptive mesh is required for efficient computation. In particular, the adaptive finite element method based on a posteriori error has become an important tool in scientific and engineering computing. As a result, much interest has focused on the design of a posteriori error estimators. The majority of error estimators are essentially of two types. One class of
a posteriori error estimators is based on the residual. This type of estimator was first introduced by Babuska and Rheinboldt [1]. The second is the method using the recovered solution, called the recovery type of error estimator, first introduced by Zienkiewicz and Zhu [2]. This type of error estimator uses some recovery technique to achieve a more accurate stress or strain from the finite element approximation, and the recovered solution is used in place of the exact one to calculate the error. This type of error estimator is easy to implement and more efficient. In the present research, we shall concentrate on the recovery-based method of error estimation. A new technique for nodal stress recovery and error estimation utilizes the variation of the energy functional with respect to the mapping function between the global and the master elements. The explicit formulations and the numerical results are presented for 2-dimensional problems.

2. Recovery procedure

Consider the following linear static problem in two dimensions as a model.
The partial differential equation (1) is supplemented by the general natural boundary conditions. In the variational boundary-value problem, the solution of (1) minimizes the total error energy of the body
where
The variation of the mapping function in (2) can be written as
where s_k is the variation of the mapping function δx_k, and n_k is the x_k component of the outward vector normal to the element boundary. As above, the mapping function variation induces two types of change of the functional:
(1) Internal variation of displacement. This virtual displacement is proportional to the gradient of the function, as in the first term of (4). (2) Element size variation. This size change of each element has an effect on the quantity of the functional only if a discontinuity of the functional exists on the element interface.
The main assumption for the stress recovery is that if the mapping function variation takes place so that each element is purely expanded, compressed or rotated, the internal virtual displacement does not change the error energy. For the first assumption (pure expansion or compression), the variation s_k is linear,

s_k = a x_k + b.   (5)
Using the first assumption and the local equilibrium relation, one can obtain the self-equilibrating traction resultant [3]. In 1-dimensional problems, the traction resultant directly designates the nodal stress, but in 2-dimensional problems the traction resultant is the integral of stresses along element interfaces. This integral couples the nodal stresses over the entire domain of the problem. In order to compute the nodal stresses locally, the second assumption is incorporated with the calculated traction resultant. For the second assumption, the variation is also linear,

s_m = a (1 - δ_mk) x_k + b.   (6)
By using this type of variation, one can obtain a local approximate form of the minimum complementary energy principle.
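To make the general recovery-and-compare idea concrete, the sketch below uses a far simpler recovery than the mapping-variation technique of this paper: plain nodal averaging of element stresses on a 1-D mesh, followed by an element-wise energy-like norm of the difference between recovered and raw stresses. All names and data are hypothetical; this only illustrates the recovery-type estimation pattern, not the authors' method.

import numpy as np

# Hypothetical 1-D mesh: 5 linear elements, one constant stress per element.
elem_stress = np.array([1.00, 1.15, 1.35, 1.60, 1.90])
n_nodes = elem_stress.size + 1

# Recovery step: nodal stresses as the average of the adjacent element values.
nodal_stress = np.zeros(n_nodes)
nodal_stress[0] = elem_stress[0]
nodal_stress[-1] = elem_stress[-1]
nodal_stress[1:-1] = 0.5 * (elem_stress[:-1] + elem_stress[1:])

# Estimated error per element: recovered stress (evaluated at mid-element)
# minus the raw constant stress, in a simple L2-like measure.
h = 1.0 / elem_stress.size
recovered_mid = 0.5 * (nodal_stress[:-1] + nodal_stress[1:])
err_est = np.sqrt(h) * np.abs(recovered_mid - elem_stress)

print("recovered nodal stresses:", np.round(nodal_stress, 3))
print("element error estimates :", np.round(err_est, 4))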
3. Example
A 2-dimensional plane stress problem, shown in Figure 1, is analyzed. The problem is an infinite plate subject to uniform tension and has analytic solutions for the true stress fields.
Figure 2 presents the histogram of occurrences of values of the local effectivity index. It can be shown that for the present method the band of the distribution of the effectivity index is narrower and concentrated around 1. In other words, the quality of the estimation of the stress or the error energy is better and more uniform over the domain of the problem.

Figure 1. Definition and meshes
Figure 2. Local effectivity index: SPR (top), present (bottom)
References
1. I. Babuska and W. C. Rheinboldt, Int. J. Numer. Methods Engrg. 12, 1597 (1978).
2. O. C. Zienkiewicz and J. Z. Zhu, Int. J. Numer. Methods Engrg. 24, 337 (1987).
3. P. Ladeveze and D. Leguillon, SIAM J. Numer. Anal. 20, 485 (1983).
DISCOVERING REGULARITIES IN BIOSEQUENCES: CHALLENGES AND APPLICATIONS K. PERDIKURI, C. MAKRIS, A. TSAKALIDIS Research Academic Computer Technology Institute 61 Riga Feraiou Str., 26221 PATRAS, GREECE & Computer Engineering and Informatics Department
University of Patras, 26500 PATRAS, GREECE E-mail: perdikur@ceid.upatras.gr
Computational methods on molecular sequence data (strings) are at the heart of computational molecular biology. A DNA molecule can be thought of as a string over an alphabet of four characters {a,c,g,t} (nucleotides), while a protein can be thought of as a string over an alphabet of twenty characters (amino acids). A gene, which is physically embedded in a DNA molecule, typically encodes the amino acid sequence for a particular protein. Existing and emerging algorithms for string computation provide a significant intersection between computer science and molecular biology.
1. Introduction
A string is a sequence of zero or more symbols drawn from an alphabet Σ. The set of all strings over the alphabet Σ is denoted by Σ*. In molecular biology one can use the four-nucleotide alphabet Σ_DNA = {a,c,g,t} when dealing with DNA sequences. In the case of protein sequences, the alphabet would be that of the twenty amino acids. Both in the case of nucleotides and amino acids, one extra symbol is also often used, the "don't care" symbol (denoted as '*' or '$'), representing any nucleotide or amino acid. The most important goal in computational molecular biology is locating patterns in nucleic or protein sequences, and identifying motifs that are common to a set of such sequences. This implies inferring patterns, unknown at first, from one or more input sequences. In the problem of pattern matching in molecular sequence data (strings) one is interested in finding all occurrences of a given pattern x ("structured" or "non-structured") in a given biosequence BS. A "non-structured" pattern is a string s of length |s| = n, where s[i] ∈ Σ_DNA = {a,c,g,t,*}, for each
1 ≤ i ≤ n, while a "structured" pattern can be defined as an ordered collection of k "boxes" B_i (the B_i are strings over Σ_DNA) and k − 1 intervals of distances, or gaps (one between each pair of successive boxes); see Figure 1. Each gap g_i could have a minimum value min_i and a maximum value max_i, or a fixed length.
Figure 1. A "structured" pattern (boxes B_1, ..., B_k separated by gaps with bounds min_i ... max_i).
When we consider the approximate version of the above problem we do not require a perfect matching but a matching that is good enough to satisfy certain criteria. One of the most common variants of the approximate string-matching problem is that of finding substrings that match the pattern with at most k differences. In this case, k defines the approximation extent of the matching or, in other words, the possibility of errors [1] (substitutions, deletions or insertions that might have taken place). The problem of approximate string matching has been extensively studied in recent years because it has a variety of applications in searching for similarities among biosequences [2].
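As a toy illustration of these notions, the sketch below finds all occurrences of a "non-structured" pattern over Σ_DNA ∪ {*} in a biosequence, allowing the '*' don't-care symbol and at most k substitution errors. This is a simplified, Hamming-distance-only variant of the k-differences problem; the function name and the data are made up for illustration.

def match_with_dont_cares(bs, pattern, k=0):
    """Positions i where pattern matches bs[i:i+len(pattern)] with at most
    k substitutions; '*' in the pattern matches any nucleotide."""
    m, hits = len(pattern), []
    for i in range(len(bs) - m + 1):
        errs = sum(1 for a, b in zip(bs[i:i + m], pattern)
                   if b != '*' and a != b)
        if errs <= k:
            hits.append(i)
    return hits

bs = "acgtacgattcgatacgta"
print(match_with_dont_cares(bs, "acg*a", k=0))   # exact matching with a don't care
print(match_with_dont_cares(bs, "acgta", k=1))   # at most one mismatch allowed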
2. Open Problems
2.1. Discovering Motifs in DNA Sequences
Given a pattern x, we define the location list L_x, which contains the positions of all the occurrences of x in a given biosequence BS. Using a special parameter q (often called the quorum), we say that pattern x is a motif if |L_x| ≥ q.
The problem of finding all motifs of BS is computationally difficult, due to the possible exponential size of the output. In fact, an exponential number of motifs can be found. We therefore have to reduce this size. Since motifs appearing only once are not interesting, we assume q ≥ 2 (in other words, we are interested in finding all motifs, that is, all patterns that appear at least q times in BS). In some cases, one can be interested in seeking motifs that appear at least or exactly q times in a set of N given input biosequences. A recent attempt at defining a notion of maximality and redundancy for motifs is based on the idea that just some motifs could be enough to build all the others [3]. These motifs are called tiling motifs. The same idea has been extended to flexible motifs, which are motifs where the number of "don't care"
symbols can vary from one occurrence to another. Discovering flexible motifs appears quite interesting for biological applications.

2.2. Discovering String Regularities with "don't cares"
Regularities in strings arise in the form of repeated subpatterns (periodicities of an approximate nature). Some typical regularities that have been studied in the literature [4] include: i) the period p of a string x; ii) the cover w of a string x (x can be constructed by concatenations and superpositions of w); iii) the seed w of a string x, which is a cover of every superstring of x; and iv) the repetitions inside a string x. Although string regularities are common in applications of molecular biology, only approximate regularities have been studied, while little work has been done on regularity problems that arise from having "don't care" symbols. In [5] an algorithm is presented for computing all the periods of a string of length n over an alphabet Σ_DNA = {a,c,g,t,*} in linear expected time in the average case and quadratic worst-case time. Finding the repetitive structures in DNA strings x remains an open problem.
3. Applications
3.1. Sequence Clustering
Motif search (as presented in the above paragraphs) has nowadays become a specific research field whose applications in molecular biology are of high impact. In fact, due to the huge amount of data entering genomic databases, there is an urgent need for tools that can help molecular biologists to interpret these data. In this direction, the search for repeated patterns is one of the first things one can do on a sequence in order to detect some properties or some particularly significant functional site. In fact, a typical way to start analyzing new sequences is to group them into families that are assumed to be biologically related because they present similar function or structure, or because they are evolutionarily related. Many of these classifications into families have been done by finding shared properties in terms of common motifs in the sequences. Therefore motif extraction could play a crucial role both in detecting these common patterns in known families, and in characterizing a newly available sequence. This is all again under the assumption that syntactic similarity reflects biological correlations. 3.2. Functional Modeling
Moreover pattern matching and motif extraction algorithms are used in the creation of functional biological networks, enabling the systematic analysis of
design patterns and their evolution. The idea is that functional patterns of biological sequences can be integrated into systems that represent a general network of cellular processes, including metabolic pathways and transcription activation mechanisms [6]. Scientists have already identified cases where the same circuit patterns and homologous genes produce similar system behaviors, but with unrelated physiological outcomes, and cases where the same circuit patterns use different sets of genes to attain similar system behaviors. More systematic surveys are needed to determine how many evolutionarily conserved circuits exist, in what functions, and how they relate to the evolution of genes.
4. Conclusions
Computational methods on molecular sequence data (strings) deal with the analysis of entire genome sequences or protein sequences in order to discover motifs or subpatterns that represent inherent properties, and further model them as systems.

References
1. D. Gusfield, Algorithms on Strings, Trees and Sequences, Cambridge University Press, (1997).
2. G. Navarro, A Guided Tour to Approximate String Matching, ACM Computing Surveys, Vol. 33 (1), pp. 31-88, (2001).
3. N. Pisanti, M. Crochemore, R. Grossi and M.-F. Sagot, A basis of tiling motifs for generating repeated patterns and its complexity for higher quorum, submitted.
4. J.S. Sim, C.S. Iliopoulos, K. Park, and W.F. Smyth, Approximate periods of strings, Theoretical Computer Science.
5. C. Iliopoulos, M. Mohamed, L. Mouchard, K. Perdikuri, W. F. Smyth and A. Tsakalidis, String regularities with don't cares, Proceedings of the Prague Stringology Conference (PSC'02), (2002).
6. S. Tsoka and C.A. Ouzounis, Recent Developments and Future Directions in Computational Genomics, FEBS Letters, 480, pp. 42-48, (2000).
CONSTRAINT BASED WEB MINING I. PETROUNIAS, A. TSENG Department of Computation, UMIST, PO Box 88, Manchester M60 1QD, UK E-mail: ilias@co.umist.ac.uk, y.tseng@postgrad.umist.ac.uk P. CHOUNTAS Department of Computer Science, University of Westminster, Watford Rd, Northwick Park, London, HA1 3TP, UK E-mail: [email protected]
With the rapidly growing number of WWW users, the hidden information becomes ever more valuable. As a consequence of this phenomenon, mining Web data and analysing on-line users' behaviour and their on-line traversal patterns have emerged as a popular new research area. Based primarily on Web servers' log files, the main objective of traversal pattern mining is to discover the frequent patterns in users' browsing paths and behaviours. This paper presents a complete framework for web mining that allows users to predefine physical constraints when analysing complex traversal patterns, in order to improve the efficiency of algorithms and offer flexibility in producing the results.
1. Introduction
One of the main challenges for large corporations adopting World Wide Web sites is to discover and rediscover useful information from very rich but also diversified sources in the Web environment. Web log analysis is mainly used in this instance to determine key factors, such as interest in content, and usage of Web sites. These become important inputs to design tasks and determine how a Web site is being used. Usage analysis includes straightforward statistics, such as page access frequency, as well as more sophisticated forms of analysis, such as finding the common traversal paths through a Web site. However, most of the work in web mining has focused on web log analysis. Within web log analysis the main interests have been user and session identification and sequences of pages being accessed by users. This paper presents a complete framework for web mining. Existing proposals in the literature are concerned only with the forward navigation within a web site. In order to filter out the redundant patterns from the log source, [1]
introduced the concept of "Maximal Forward Reference or Path" (MFP) as a notion of a maximal forward moving motion in visiting Web documents. They assumed that all the backward traversal actions (i.e. Backward References) only occur to users in the process of searching for Web pages that really interest them. Hence they assumed that only the forward browsing motion (Forward Reference) contains meaningful information and reflects users' true browsing patterns. The work in this paper argues that the notion of a "Minimum Backward Path" (MBP) needs to be included, since it also provides information about users' navigational patterns and their ability (or not) to navigate easily within a web site. This will demonstrate whether there exists a frequent short backward motion, which may show that the structure of a web site is not clear. In addition to this, another important characteristic that is addressed is the notion of time. Within this, one can identify the longest time periods within which frequencies of pages occur and also the periodicity with which these web pages are accessed. The framework also addresses several 'constraint-based' preprocessing mining tasks to be performed prior to applying data mining algorithms to data collected from server logs. These constraints are taken from standard Web log files and categorised into three main groups based on their nature and relation to users' on-site browsing behaviours: "Traversal Constraints", which concentrate on factors relating to users' navigating movements. A new method of 'Minimum Backward Path' (MBP) is defined to further reduce less meaningful traversal patterns, and it successfully cooperates with the existing method of Maximum Forward Path (MFP) proposed in [1]. "Temporal Constraints" include elements of 'Time', 'Session' and 'Periodicity'. These constraint elements concern factors such as duration of staying on a particular web page, session intervals and periodicity of visits to web pages. "Personal Constraints" consist of other available information regarding each individual visiting a web site, for example the IP address, demographic data and relevant topics, and are recognised as the subset of element 'User'. Data mining algorithms that incorporate the above set of "Objective Constraints" are an attempt to resolve the shortcomings of existing approaches by introducing more relevant information (MBP, longest interval, periodicity of visits). By applying conditional restrictions with specific patterns, the approach enables data analysts to focus on individual cases with more control while at the same time providing more knowledge about users' patterns. The outcome of any web mining algorithm is then influenced by those conditions. The value of
conditional restrictions can be anything within users' traversal patterns, e.g. the length of the traversal movement, the direction of the browsing path, designating nodes inside the browsing pattern, etc. This framework is developed to support and assist existing data mining algorithms in order to first refine browsing patterns with relevant constraints and then help with the discovery tasks in both intra- and inter-sessional information retrieval. With such a framework implemented, information retrieval and pattern identification is significantly faster and more accurate than just using standard discovery methods.
2. Components of the Framework
In order to filter out the redundant patterns from the log source, Chen et al [1] introduced the concept of "Maximal Forward Reference" as a notion of a maximal forward moving motion in visiting Web documents. They assumed that all the backward traversal actions (i.e. Backward References) only occur to users in the process of searching for Web pages that really interest them. Hence, they assumed that only the forward browsing motion (Forward Reference) reflects users' true browsing patterns and contains the meaningful information. For instance, suppose a user has the following traversal pattern inside a particular Web site:

{ABCDCBEGHGWAOUOV}   (1)
Using traditional analysis methods, nodes B and C show greater importance than nodes D and E, where it may in fact be that nodes D and E are actually the pages containing the information that the user needs. Nodes B and C might be pages embedded with all the inter-links in that site and, as a result, create the illusion of being the most valuable pages. When the "Maximal Forward Reference" method is taken into consideration, the original traversal pattern will be translated into a new set of patterns as:
{ABCDCBEGHGWAOUOV} → {(ABCD), (ABEGH), (ABEGW), (AOU), (AOV)}   (2)
M.F.R. successfully redefines the traversal data in a more meaningful manner by ignoring the continuous repetition of backward browsing actions. In the "Maximal Forward Reference", [1], [2] consider users' onward browsing flow as the only means for measuring users' browsing behaviours and completely ignore the backward browsing paths. However, on-line browsing movement is not a simple single-directional action, but rather a "dual-
directional" action. Although the conversing direction of the traversal paths only exist because of users' convenience, if it is paired with the result of the onward path analysis it offers better insights into users' actual travelling intentions. For instance, the Minimum Backward Path (BMP) demonstrates groups of nodes in the shortest-length combination. This presents a good indication of how well the infrastructure of a site is constructed and arranged. The longer the combination of nodes MBP holds the less organised a site appears to be. This can be interpreted as users having difficulties in finding their desired nodes hence they are forced to browse each link one after another in order to narrow down the possibilities. If MBP contains many the same combinations then this can inform the Webmaster that this particular reference of linkages is well constructed. This paper proposes a new approach for data processing by adapting a constraint-based technique. These constraints are based on users' on-site browsing behaviours, for instance the maximum fonvard-browsed nodes (MFP) and minimum visited nodes in the reverse direction (MBP). Furthermore the duration users have taken in visiting a site and their demographical records such as IP address and the time interval (Periodicity) they access the site are also being used as factors in deciding the mining section of raw data. These records hold valuable information that can determine specific requirements and further focus on particular data sectors in order to obtain a refined data analysis. 'Objective constraints' apply additional restrictions onto the existing traversal data. The new inclusion allows data analysts to apply personalised conditional restrictions, e.g. to include some sort of path sequence or designated starting and finishing nodes. As a result, it allows execution of algorithms to improve the speed of generating candidate sets in order to produce a more efficient analysis that suits individual needs. As indicated earlier the essential part of this proposal is to introduce the constraints that are able to refine the data source in order to lessen the processing time with a better return of outcome. These objective constraints are introduced as below and classified into three categories: Traversal, Temporal and Personal. 2.1. Traversal Constraints
MFP and MBP are the main elements in this category. Each denotes the significant pattern of forward or backward directions of a user. MFP (Maximum Forward Path). Definition of MFP: Given a set of inter-linked nodes arranged in a hierarchical fashion, the action starts from the highest node (the root node) and follows its
way down. When the first reverse movement occurs, the forward movement is terminated. This results in a collection of nodes which is marked as a maximum forward path. For instance, taking the following as a template, the whole traversal pattern from the root node (i.e. node A) to node R is shown as:
{ABDGDBEHJMQSQMJNRTVXVTR}   (3)

According to the definition given above, the maximum forward paths for this instance will be extracted as below:

{ABDGDBEHJMQSQMJNRTVXVTR} → {(ABDG), (ABEHJMQS), (ABEHJNRTVX)}   (4)
This shows that the MFP terminates at the nodes G, S and X, where reverse movement starts taking place. Hence travelling from node A to R produces the three maximum forward paths listed above. Since the MFP omits all the reverse directional travelling, it will contain purely the nodes captured during the forward visits. MBP (Minimum Backward Path). Definition of MBP: In a set of hierarchically inter-linked nodes and during a
particular session in time, the MBP starts at a node when a reverse behaviour occurs and returns back to the node where a new forward movement was invoked.
The minimum backward path is not necessarily the reverse order of a maximum forward path. Again using (3) as an example, the MBP for travelling from node A to R is listed as follows:

{ABDGDBEHJMQSQMJNRTVXVTR}
It is worth pointing out the difference between the two sets after comparing (4) and (5). The MBP contains the nodes covered by bidirectional
movements (that is, both forward and backward travelling), whereas the MFP contains only single-directional nodes.
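A minimal sketch of how maximal forward paths could be extracted from a raw traversal sequence is given below. It follows the informal definition above (a forward path is emitted whenever a backward move, i.e. a return to a node already on the current path, occurs); the function name and the way backward moves are detected are assumptions made for illustration, not the exact procedure of [1].

def maximal_forward_paths(traversal):
    """Split a traversal (sequence of node labels) into maximal forward paths.
    A step back to a node already on the current path ends the forward path."""
    paths, current, moving_forward = [], [], False
    for node in traversal:
        if node in current:
            # Backward move: emit the path only if the last move was forward.
            if moving_forward:
                paths.append("".join(current))
            current = current[:current.index(node) + 1]
            moving_forward = False
        else:
            current.append(node)
            moving_forward = True
    if moving_forward:
        paths.append("".join(current))
    return paths

print(maximal_forward_paths("ABCDCBEGHGWAOUOV"))          # example (1)/(2)
print(maximal_forward_paths("ABDGDBEHJMQSQMJNRTVXVTR"))   # example (3)/(4)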
2.2. Temporal Constraints

Time. Definition of Time: Indefinite continued progress of existence, events, etc., in the past, present, and future, regarded as a whole (taken from the Oxford Dictionary). A time domain is a pair (T, ≤) where T is a non-empty set of chronons and ≤ is a total order on T [8]. As can be seen from the above definition, time is constructed from a set of chronons arranged in a total order. Session. Definition of Session: Given a time-stamped starting point SS for a particular visitor V, the session time ST remains until the visitor's on-site presence disappears at an ending point SE. A session is the time presence of a completed visit of a user with a specific IP address. This is normally achieved by setting a transient cookie. Transient cookies are only stored in temporary memory and are erased when the browser is closed (unlike persistent cookies, which are stored on the user's hard disk and only removed when past the expiration date or deleted by the user manually). Periodicity. Definition of Periodicity: A time consisting of a series of periodic intervals based on a time cycle unit. Each periodic interval appears within an interval of the cycle unit and all these periodic intervals have the same position in their corresponding cycles. Nevertheless, it is important to distinguish between the concepts of "Session" and "Periodicity". A session, as described above, is created based on individual temporal attributes, such as the beginning and finishing of access time to a site. On the other hand, periodicity represents general time intervals during a bounded period over the time domain. Hence it is possible for multiple sessions to occur over the same periodic time. As demonstrated in Figure 2, a user can visit two or more sites simultaneously at the same or different time intervals.

2.3. Personal Constraints
User. Definition of User: A user is determined by a specific IP address, which is assigned to each individual access when connected to the Internet.
This IP address is usually unique apart from the case of using a shared proxy server, in which case all users will be counted as using the same IP. Every web request made will then be processed and recorded according to their IP.
Figure 1: M.F.P. Illustration
Figure 2: Session and Periodicity
3. Scheme of the Mining Task
The scheme for web mining tasks is then constructed and can be described in a specialised syntax similar to SQL:
SELECT  Mining-Rule (<rule-condition>)
WHERE   Cons-Type (<type-condition>)
WHERE   Cons-Type (<type-condition>)
IN      (<mining-algorithm>)
3.1. Mining-Rule()

It denotes the type of data mining technique to be used in the task for seeking a particular pattern. The task can be implemented with popular mining techniques such as Association Rules, Classification, Clustering and Summarisation. Each technique has its own merit towards solving different problems, and this scheme provides the flexibility of adopting different techniques with customised conditions.

3.2. Cons-Type()
It is composed of elements discussed in the previous section. Constraints are inserted into an ordered list, where the position of a constraint's type determines the priority of the execution. Several constraints can be combined to form a complex constraint type for setting a more detailed filter. For instance, in the case of attempting to find Association Rules, a constraint might be to find users' online MFP patterns that have a period of over 4 hours between every Friday and Saturday night after 19:00 hours and with the IP address ranging from 192.168.0.1 to 192.168.0.255, which can be expressed in the scheme as follows:
Mining-Rule ("association rule")
WHERE MFP ("threshold = '5"' )
AND USER("ip='192 1680 I'T0'192 1680255"') AND TIME ("duration - hr = '4' " )
(7)
AND PEMODICIn (" str -day =' firday' AND end - day =' sunday' AND str -time ='I 900' AND end - time ='I 900"')
IN
(" apriorr - gen( )" )
After execution of the mining task, the data will be pruned according to the set constraint values and then passed on to the selected data analytical algorithm.
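A rough sketch of how such objective constraints might be applied as a pre-mining filter is shown below; the session record layout, the constraint checks and the final Apriori-style call are all hypothetical, intended only to illustrate the "constrain first, then mine" flow of the scheme.

from datetime import datetime

# Hypothetical pre-processed session records (one MFP per record).
sessions = [
    {"ip": "192.168.0.17", "mfp": "ABEGH", "duration_hr": 5.0,
     "start": datetime(2003, 6, 6, 20, 30)},   # a Friday evening
    {"ip": "10.0.0.5",     "mfp": "AOU",   "duration_hr": 1.0,
     "start": datetime(2003, 6, 9, 9, 0)},
]

def satisfies(rec):
    # Traversal constraint: MFP length threshold.
    if len(rec["mfp"]) < 5:
        return False
    # Personal constraint: IP range.
    if not rec["ip"].startswith("192.168.0."):
        return False
    # Temporal constraints: minimum duration, Friday/Saturday after 19:00.
    if rec["duration_hr"] < 4:
        return False
    return rec["start"].weekday() in (4, 5) and rec["start"].hour >= 19

filtered = [r for r in sessions if satisfies(r)]
print(filtered)           # only the first record survives the constraints
# apriori_gen(filtered)   # hypothetical mining step applied to the reduced data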
References
1. Chen, M.S., Park, J.S., and Yu, P.S., "Data Mining for path traversal patterns in a Web environment", in Proc. 16th Int. Conf. on Distributed Computing Systems, pp. 385-392, 1996.
2. Park, J.S., Chen, M.S., and Yu, P.S., "An Efficient Hash-Based Algorithm for Mining Association Rules", pp. 175-186, Proceedings of SIGMOD 1995.
TEMPORAL WEB LOG MINING USING OLAP TECHNIQUES I. PETROUNIAS AND A. ASSAID Department of Computation, UMIST, PO Box 88, Manchester M60 1QD, UK E-mail: [email protected], [email protected]
This paper is concerned with mining temporal features from web logs. We present two methods. The first one concerns the temporal mining of sequential patterns, in which we use sequence data, serving as support for the discovered patterns, in order to find periodicity in web log data. The second one is an efficient method for finding periodicity in web log sequence data which handles missing sequences by dealing with the overlap problem.
1. Introduction
With the growing popularity of the World Wide Web (WWW), large volumes of data such as addresses of users or URL requests are gathered automatically by Web servers and collected in access log files. The analysis of server access data aims at restructuring a Web site for increased efficiency. Discovering relationships and global patterns that exist in access log files, but are hidden amongst the vast amounts of data, is usually called Web Usage Mining. The foundation of the approach presented in this paper addresses the problem of exhibiting behavioural patterns from one or more servers collecting data about their users. Our paper concentrates on mining temporal features in sequential patterns of web log data, by analysing information from Web servers. Handling time constraints for mining sequential patterns could provide relationships such as: "70% of users visited (.../a.html, .../b.html, .../c.html) between 1st September 2001 and 31st October 2001". Once the pattern has been discovered, web developers can use it to dynamically customise the hypertext organisation. More precisely, the user's current behaviour in a period of time can be compared to one or more sequential patterns. Applying data mining techniques on access logs unveils interesting access patterns that can be used to restructure sites in a more efficient grouping, pinpoint effective advertising locations, and target specific users for specific selling ads. While it is encouraging and exciting to see the various potential applications of web log file analysis, it is important to know that the success of such applications depends on what and how much valid and reliable knowledge one can discover from the large raw log data. Current web servers store limited information about the users' accesses. Using Web log files, studies have been conducted on analysing system performance, improving system design, understanding the
nature of Web traffic, and understanding user reaction and motivation. Discovery and analysis of various data relationships is essential in fully utilising the valuable data gathered in daily transactions. A comprehensive analysis tool must be able to automatically discover such data relationships, including the correlation among Web pages, sequential patterns over time intervals, and classification of users according to their access patterns. This technique must be able to discover relationships in very high traffic servers with very large access logs. Periodicity search detects cyclicity in time series data. Most previous work on periodicity detection has focused on aggregate sequences which occur within a time segment. However, a sequence may overlap from time segment a to segment b. This means that parts of the sequence appear in different segments. In other words, overlapping sequences cannot be taken into account by the methods in the current literature. Figure 1 shows normal sequences and overlapping sequences. In this paper we present two methods: (1) temporal mining of sequential patterns, which handles temporal features using sequence data that serve as support for the discovered patterns, to find periodicity in web log data; (2) a more efficient method for finding periodicity in web log sequence data which handles missing sequences by dealing with the overlap problem. This is achieved by integrating three methods: temporal sequence data, which are gathered by our temporal sequence patterns algorithm (tSPADE) presented in this paper; data cube structure techniques, which we adopted from [1], [3]; and identification of periodicity with overlapping segments. Our periodicity mining method does not concentrate on time segmentation (i.e. day, hour, etc.) but focuses on identifying overlapping sequences. The difference between this approach and others in the literature is that all other approaches that consider temporal patterns from web logs treat the temporal and periodic characteristics as part of the pattern identification process. The approach in this paper considers them separately and as such offers more flexibility, and in reality the proposed algorithms can be applied in other domains as well.
- yy--r &
- i 7y + * 1 4 2 , i + & = g *
Y
la j -
Yo
i
-
m
u
.Ses,nem~
se.g,iientl
1
2
'
i
!i-a,
Segmem 3
I !
Segmentl 4 1
-ls%mElm
o=&n=dsxl-=
Figure 1. Normal and overlapping sequences
~'e
5 14
The rest of the paper is organised as follows: section 2 presents a temporal sequential pattern discovery algorithm, Section 3 concentrates on the design and construction of Data Cubes and section 4 presents the algorithm for identifying periodicity even with overlapping segments.
2. Temporal Sequential Discovery Algorithm

The SPADE (Sequential Pattern Discovery using Equivalence Classes) algorithm was presented recently in [4], [6]. It uses a vertical id-list database format where each sequence is associated with a list of objects in which the sequence occurs, along with the timestamps. It decomposes the search space lattice into sub-lattices and processes them independently in main memory. Three database scans are needed, compared to the multiple scans of data in other approaches. Two different search strategies, breadth-first search and depth-first search, are used for enumerating frequent sequences. Later, cSPADE was presented in [5]; it is mainly an extension of the SPADE algorithm with constraints. We present tSPADE (temporal Sequential PAttern Discovery using Equivalence classes), which is used for mining click-stream sequential patterns. Figure 2 shows the pseudo code of tSPADE. The tSPADE algorithm is based on SPADE and cSPADE, which are described in [4], [5], [6], by incorporating each event with a temporal feature. The event could be an item, an itemset or a sequence.

SPADE(min_sup):
1. P = {parent classes (P_i, t_s, t_e)};
2. for each parent class (P_i, t_s, t_e) in P do Enumerate-Frequent(P_i, t_s, t_e);

Enumerate-Frequent(S, t_s, t_e):
1. for all sequences (A_i, t_s, t_e) in S do
2.   for all sequences (A_j, t_s, t_e) in S with j > i do {
3.     L(R, t_Rs, t_Re) = Temporal-Join-constraint(L(A_i, t_s, t_e), L(A_j, t_s, t_e))
4.     if (sigma(R, t_Rs, t_Re) >= min_sup) then T(t_s, t_e) = T(t_s, t_e) U {R, t_Rs, t_Re} and print(R, t_Rs, t_Re); }
5.   Enumerate-Frequent(T, t_s, t_e);
Figure 2. Pseudo code of tSPADE

The aim of tSPADE is to enhance the SPADE algorithm for the purpose of performing temporal sequence analysis of click-streams, by incorporating events with temporal features (event, t_s, t_e). Events in SPADE do not have a timestamp. In tSPADE, min-gap and max-gap are computed among events and UserIDs rather than only between id-lists, as in SPADE. More details about the SPADE and cSPADE algorithms can be found in [4], [5], [6]. Later, [2] proposed another algorithm for mining sequences from click-stream data, webSPADE. The difference between tSPADE and webSPADE is that events in webSPADE have only a start time, whereas in tSPADE they carry both temporal features.
Figure 3. SPADE sample data
Figure 4a. SPADE sample data
Figure 4b. webSPADE sample data
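To illustrate the vertical id-list representation and the kind of temporal join that tSPADE performs, the sketch below joins the id-lists of two events, keeping only pairs (per user) in which the second event starts after the first one ends and within a maximum gap. The data layout, the gap parameters and the function name are assumptions made for illustration, not the actual tSPADE implementation.

# Vertical id-lists: event -> list of (user_id, t_start, t_end).
idlist = {
    "A": [(1, 7, 9), (2, 10, 15), (3, 12, 14)],
    "B": [(1, 9, 15), (2, 15, 17), (3, 14, 15)],
}

def temporal_join(list_a, list_b, min_gap=0, max_gap=10):
    """Id-list of the sequence 'A -> B': for the same user, B must start
    after A ends, with a gap in [min_gap, max_gap]; keep A's start, B's end."""
    joined = []
    for ua, sa, ea in list_a:
        for ub, sb, eb in list_b:
            if ua == ub and min_gap <= sb - ea <= max_gap:
                joined.append((ua, sa, eb))
    return joined

ab = temporal_join(idlist["A"], idlist["B"])
print(ab)                                   # occurrences of A -> B with time spans
print("support =", len({u for u, _, _ in ab}))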
3. Constructing Data Cubes: Sequence Cube and Working Cube
In this section we discuss how to construct data cubes with given dimensions from web log sequence temporal data, and how to use such cubes for mining periodicity with overlapping data between periods. Suppose we have sequential patterns denoted as SP 1, SP 2, and SP 3, which are gathered by the temporal sequential algorithm. Figure 5a shows an example of a sequential pattern. However, each pattern has a table of temporal sequence data for support, such as TS 1, TS 2. Each element in a tuple of these tables has a timestamp (t_s and t_e), except for the first column, which consists of the Users ID. Figure 5b shows tables of Temporal Sequence pattern data. The Sequence Cube "provides an efficient structure to access and index the minimally generalized data" [3]. For mining periodicity we construct two cubes: a Sequence cube (sCUBE) and a working cube (wCUBE). These cubes are constructed according to [3], but the wCUBE is used differently in order to identify periodicity with overlapping segments.
Firstly, we propose an sCUBE which is based on any selected TS table (e.g., TS 1). The sCUBE (Figure 6) is constructed with time-related attributes of the element sequence time stamps and with the remaining attributes. The sCUBE dimensions are Time Period, Sequence and UserID. The structure of an sCUBE is that the first column is for the Users ID, the next columns are for the sequence elements' time stamp data, the penultimate column is for the aggregation of all times in a row, where SQ handles the time stamps for the tuple sequence. The last column is for a Tag describing the aggregation status of SQ, i.e. whether it is aggregated as normal or with overlapping segments between two cycles.
Figure 5. Sequential patterns gathered by sequential algorithms
Secondly, by slicing the sCUBE (as shown in Figure 7) we construct a working cube wCUBE (shown in Figure 8). We assume that wCUBE has three dimensions, such as Period, Week Days and Hours. These are used as examples in this paper; the dimensions are user defined. Each cell in the wCUBE is defined as numeric, to aggregate sequences SQ(t_s, t_e) from a slice of the sCUBE (Time Period, Users ID), denoted as SL; Figure 6 shows a sample of an sCUBE slice. If SQ(t_s, t_e) is aggregated without overlap, the tag cell is tagged "Normal"; otherwise it is tagged "Prefix" or "Suffix" if the SQ(t_s, t_e) overlaps two cells. The SQ(t_s, t_e) is tagged "Not" whenever the sequence is not aggregated. Figure 8 shows a working cube and the overlapping techniques.
slice(sCUBE(time PERIOD=1, UserID=ALL))
Figure 7. Slice (sCUBE (time PERIOD, UserID))
4. Algorithm for Mining Periodicity

The Periodicity algorithm starts by constructing the sCUBE, which is based on a temporal sequence table TS. After that, the algorithm builds the wCUBE by slicing the sCUBE with time period and UserID, denoted by SL. The function in step 3 aggregates sequences from SL into the working cube cells wCELL of the wCUBE, as follows. AGGREGATE function: the Aggregate function reads a sequence SQ(t_s, t_e) from SL and then checks the length of SQ(t_s, t_e); if t_s and t_e of the SQ fit in a wCELL without overlapping, then the function increases the current wCELL by one and tags the SQ(t_s, t_e) tuple in SL with "Normal". However, if SQ(t_s, t_e) has an overlap between two
wCELLs, then the function treats this problem and calls the OVERLAPPING function.
PERIODICITY Algorithm {
  1. Construct sCUBE(dimensions(time periods, SQ(t_s, t_e), UserID)) <- TS_i.
     SL <- slice(sCUBE(time period, UserID)).
  2. Numeric wCUBE(wCELL). Construct wCUBE(cycle, interval, granule) <- SL.
  3. AGGREGATE(SL, wCUBE)
  4. Mining Periodic Time (slicing(wCUBE))
}

Figure 8. Working cube wCUBE

AGGREGATE(SL, wCUBE) {
  While not EOF(SL) {
    READ SQ_i(t_s, t_e) from SL
    len = LENGTH(SQ_i(t_s, t_e))
    LOCATE in wCUBE the wCELL(cal, int) with wCELL(cal, int, t_s) <= SQ_i(t_s) <= wCELL(cal, int, t_e)
    IF FOUND()
      IF SQ_i(t_e) <= wCELL(cal, int, t_e) THEN {
        wCELL(cal, int).data := wCELL(cal, int).data + 1
        Tag SQ_i(t_s, t_e) at SL by 'Normal'
      }
      ELSE
        OVERLAPPING(SL, wCUBE, SQ_i(t_s, t_e), addedGRN)
  }
}

OVERLAPPING(SL, wCUBE, SQ_i(t_s, t_e), addedGRN) {
  tSQ(t_s, t_e) <- SQ_i(t_s, t_e)
  len1 := LENGTH(tSQ_i(t_s), wCELL(cal, int, t_e))
  len2 := LENGTH(NEXT(wCELL(cal, int, t_e)), tSQ_i(t_e))
  g := 1                                  // where g is the time granule
  IF len1 > len2 THEN {
    While addedGRN > g and LENGTH(tSQ_i(t_s, t_e)) > minSEQlen {
      tSQ_i(t_e) <- tSQ_i(t_e - g)
      if tSQ_i(t_e) <= wCELL(cal, int, t_e) THEN {
        Increment(wCELL(cal, int).data)
        Tag SQ_i(t_s, t_e) at SL by 'Suffix'
        EXIT }
      g <- g + 1 }
  }
  ELSE
    While addedGRN > g and LENGTH(tSQ_i(t_s, t_e)) > maxSEQlen {
      tSQ_i(t_s) <- tSQ_i(t_s + g)
      if tSQ_i(t_s) >= wCELL(cal, int, t_e) THEN {
        Increment(NEXT(wCELL(cal, int)).data)
        Tag SQ_i(t_s, t_e) at SL by 'Prefix'
        EXIT }
      g <- g + 1 }
}
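The following sketch mimics, in simplified form, the aggregation step described above: time-stamped sequences are counted into fixed-size cells of a working cube (here reduced to a one-dimensional array of cells), and a sequence straddling a cell boundary is assigned to the cell holding the larger part of it and tagged 'Prefix' or 'Suffix'. The cell size, the tagging rule and the data are illustrative assumptions, not the exact AGGREGATE/OVERLAPPING procedure.

CELL = 60  # cell width, e.g. minutes per hourly cell

def aggregate(sequences, n_cells):
    """Count (t_start, t_end) sequences into cells, handling overlaps."""
    counts = [0] * n_cells
    tags = []
    for ts, te in sequences:
        c_start, c_end = ts // CELL, (te - 1) // CELL
        if c_start == c_end:                    # fits within a single cell
            counts[c_start] += 1
            tags.append("Normal")
        else:                                   # overlaps a cell boundary
            left = (c_start + 1) * CELL - ts    # portion in the earlier cell
            right = te - (c_start + 1) * CELL   # portion in the later cell
            if left >= right:
                counts[c_start] += 1
                tags.append("Suffix")           # counted in the earlier cell
            else:
                counts[c_end] += 1
                tags.append("Prefix")           # counted in the later cell
    return counts, tags

seqs = [(5, 40), (50, 70), (65, 130), (130, 150)]
print(aggregate(seqs, n_cells=3))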
References
1. S. Chaudhuri and U. Dayal, An overview of Data Warehousing and OLAP technology, ACM SIGMOD Record 26, 65-74 (1997).
2. A. Demiriz and M. J. Zaki, webSPADE: A Parallel Sequence Mining Algorithm to Analyze Web Log Data, SIGKDD '02, Edmonton, Alberta, Canada, 2002.
3. J. Han, W. Gong, and Y. Yin, Mining segment-wise periodic patterns in time-related databases, Proceedings of Knowledge Discovery and Data Mining (KDD'98), 1998.
4. M.J. Zaki, Efficient Enumeration of Frequent Sequences, 7th International Conference on Information and Knowledge Management, Washington DC, pp. 68-75, 1998.
5. M.J. Zaki, Sequence Mining in Categorical Domains: Incorporating Constraints, in 9th International Conference on Information and Knowledge Management, Washington, DC, pp. 422-429, 2000.
6. M.J. Zaki, SPADE: An Efficient Algorithm for Mining Frequent Sequences, Machine Learning Journal, special issue on Unsupervised Learning (Doug Fisher, ed.), 42(1-2), 31-60 (2001).
INTRODUCING COMPLETE GRAPHS IN MOLECULAR CONNECTIVITY STUDIES LIONELLO POGLIANI Dipartimento di Chimica, Università della Calabria, 87030 Rende (Cs), Italy E-mail: lionpii2:rmical.it
Molecular connectivity uses as key parameters for the definition of the valence molecular connectivity indices, χ^v, the valence delta values, δ^v, which are, practically, expressions of the electronic structure of the skeletal atoms in a molecule. These values, for second-row atoms, can be derived from the hydrogen-suppressed chemical pseudograph. With higher-row atoms arises the problem of the non-valence core electrons, which play a determinant role in the ionization, electron affinity, electronegativity, and in the size of atoms. To date, molecular connectivity has solved the 'core problem' with the definition of a valence delta, for the χ^v indices, centered on the atomic number, Z, as well as on the valence electron count, Z^v. For the electrotopological state indices, instead, the proposed valence delta is centered on the principal quantum number and on a valence delta which can easily be derived from a chemical pseudograph. It should be underlined that the electrotopological indices are the basic tools used to define the molecular pseudoconnectivity indices. Nevertheless, the 'core problem' has recently been solved with graph concepts, and especially with odd complete graphs, which together with the concept of pseudograph give rise to a full graph representation of a hydrogen-suppressed molecule made up of any type of atoms. With complete graphs, two algorithms have been shown to be useful for the valence delta values. These two algorithms for the valence delta mimic the old algorithms, and recent QSPR/QSAR studies have shown that both odd-complete-graph algorithms are a useful solution for the 'core problem'. These two algorithms have shown their validity with model studies of different properties of halocompounds.
STRUCTURES AND ENERGIES OF β-NEOCARRABIOSE IN VACUUM AND IN AQUEOUS SOLUTION M. SEKKAL-RAHAL Laboratoire de chimie théorique, Département de Chimie, Faculté des Sciences, Université Djillali Liabes de Sidi Bel Abbes, 22000, Algeria D.C. KLEB AND P. BLECKMANN Fachbereich Chemie, Otto Hahn Str. 6, Universität Dortmund, D-44221 Dortmund, Germany
In order to explain some physical behaviour of carrageenans, we focus our interest on the structure of one of the two basic disaccharides of these polygalactans. The structure of β-neocarrabiose has been studied using quantum mechanical methods such as RHF/6-31g*, RHF/6-31g**, RHF/6-311+g*, B3LYP/6-31g*, B3LYP/6-31g** and B3LYP/6-311+g*, both in vacuum and in water solution using the SCRF Onsager model. The structures and the energies obtained from each calculation have been compared, and the effect of the solvent as well as some other features concerning the preferential conformations are investigated. The vibrational frequencies have also been computed for each method in order to discuss the stability of this molecule with the aid of the obtained thermodynamic values, notably in terms of Gibbs free energies. Acknowledgments: One of the authors thanks the Alexander von Humboldt-Stiftung (Bonn, Germany) for a research fellowship.
VARIABLE STEP-SIZE STORMER METHODS
H. RAMOS Escuela Polite'cnica Superior, Campus Viriato, 49022, Zamora. Spain E-mail:[email protected] J. VIGO-AGUIAR Departamento de Matemdtica Aplicada, Uniuersidad de Salamanca. 37006, Salamanca. Spain E-mail: [email protected]
Extended abstract

Although it is possible to integrate a special second-order differential equation of the form

y''(x) = f(x, y(x)),   y(x_0) = y_0,   y'(x_0) = y'_0,   (1)
by reducing it to a first order system and applying one of the methods available for those systems, it seems more natural to provide numerical methods to integrate (1) directly without using first derivatives. The advantage of these approaches has to do with the fact that they are able to exploit special information about the ODEs, and this results in an increase in efficiency (that is, high accuracy at low cost). For instance, it is well known that Runge-Kutta-Nystrom methods for (1) involve a real improvement compared to standard Runge-Kutta methods, for a given number of stages [4, p. 285]. On the contrary, a linear k-step method for first-order ODEs becomes a 2k-step method for (1) [4, p. 461], increasing the computational work. Problems of type (1) occur frequently in classical mechanical modelling and they have been treated as a particular case by different authors.
Stormer developed his method in connection with the numerical calculations concerning the aurora borealis, and since then it has been widely used not only by astronomers but in different contexts. The original Stormer method is of linear multistep type with a fixed step size. But to be efficient, as some authors have remarked [4, p. 397], an integrator based on a particular formula must be suitable for a variable step-size formulation. However, changing the step size in a multistep method is not an easy task. We have obtained a generalization of the Stormer method for variable step-sizes, complementing the ideas about variable coefficient multistep methods that appeared in [10] for the special differential equation y^(m) = f(x, y). For that, it is necessary to use an easily computable vector, Q_k, whose components h_{3,j} (j = 0, ..., k-1) are complete symmetric polynomials of degree 3 in the values H_{n+1}, H_n, H_{n-1}, and a matrix, S_k, whose coefficients are expressed in terms of certain elementary symmetric polynomials in the values H_n, H_{n-1}, ..., H_{n-(k-2)}, being

H_{n+1} = x_{n+1} - x_n,
H_n = x_n - x_n = 0,
H_{n-1} = x_{n-1} - x_n,
...
H_{n-(k-2)} = x_{n-(k-2)} - x_n,

where the grid points are unevenly spaced. The resulting formula may be expressed in the form (2), where F_k is the k-vector of Newton divided differences,

F_k = ( f[x_n], f[x_n, x_{n-1}], ..., f[x_n, ..., x_{n-(k-1)}] ).
Of course, this method suffers the disadvantage of needing some starting values (which must be obtained using a Taylor series expansion, a Runge-Kutta method, or any one-step method with an initial very small step size according to the required accuracy).
We also need a strategy for selecting the final step every time, in order to keep the local error controlled. If we suppose that the numerical integration has proceeded successfully up to the point x_n and we attempt to advance from x_n to x_{n+1} = x_n + h_{n+1} using the variable-step formula in (2), we propose as an estimate for the new step h_{n+1} the unique solution of an equation in which TOL is a user-given tolerance for the local error, ρ is a safety factor (less than one), g_k is a constant coefficient from the corresponding Stormer formula with fixed step-size (expressed in terms of backward differences for the function f [4, p. 463]), and, as usual, f[x_n, x_{n-1}, ..., x_{n-k}] is the Newton divided difference of order k.
It is possible to extend the above formulation to the corresponding implicit case, obtaining similar formulae. In this case, we have to solve at each step an implicit system for y_{n+1}. Obviously, the two methods may be implemented as a predictor-corrector pair, obtaining the advantages of this approach, notably the facility for monitoring the local truncation error cheaply and efficiently. Thus, for example, if we take the same order for the two methods, say k = 3, and evaluate the corresponding formulas, we obtain for the predictor
where we have set, for the step size ratios, c_1 = h_{n+1}/h_n and c_2 = h_n/h_{n-1}.
And, for the corrector, the formula results in

y_{n+1} - (1 + c_1) y_n + c_1 y_{n-1} = [ h_{n+1}(h_{n+1} + h_n) / (12 c_1 (1 + c_1)) ] [ (-1 + c_1 + c_1^2) f_{n+1} + ... ],   (4)

with c_1 as before. Thus, the two methods in (3) and (4) may be applied in a P(EC)^μ mode. Finally, some numerical results are provided in order to show the good behavior of the methods.
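For context, a minimal implementation of the classical fixed-step Stormer recursion y_{n+1} = 2y_n - y_{n-1} + h^2 f(x_n, y_n), which the variable-step formulation above generalizes, might look as follows. The test problem and the Taylor-based starting procedure are arbitrary choices for illustration, not part of the method described in this abstract.

import math

def stormer_fixed(f, x0, y0, dy0, h, n_steps):
    """Classical fixed-step Stormer method for y'' = f(x, y)."""
    xs, ys = [x0], [y0]
    # One Taylor step to obtain the second starting value.
    ys.append(y0 + h * dy0 + 0.5 * h * h * f(x0, y0))
    xs.append(x0 + h)
    for n in range(1, n_steps):
        y_next = 2.0 * ys[n] - ys[n - 1] + h * h * f(xs[n], ys[n])
        ys.append(y_next)
        xs.append(xs[n] + h)
    return xs, ys

# Test problem: y'' = -y, y(0) = 0, y'(0) = 1, exact solution y = sin(x).
xs, ys = stormer_fixed(lambda x, y: -y, 0.0, 0.0, 1.0, h=0.01, n_steps=200)
print(ys[-1], math.sin(xs[-1]))   # numerical vs exact value at x = 2.0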
References
1. M. Calvo and J. Vigo-Aguiar, A note on the step size selection in Adams multistep methods, Numer. Algorithms 27, 359 (2001).
2. J. P. Coleman and A. S. Booth, Analysis of a Family of Chebyshev Methods for y'' = f(x, y), J. Comp. Appl. Math. 44, 95 (1992).
3. G. Denk, A new numerical method for the integration of highly oscillatory second-order ordinary differential equations, Appl. Numer. Math. 13, 57 (1993).
4. E. Hairer, S. P. Norsett and G. Wanner, Solving Ordinary Differential Equations I, Springer, Berlin, 1987.
5. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, John Wiley, New York, 1962.
6. E. Isaacson and H. B. Keller, Discrete Variable Methods in Ordinary Differential Equations, John Wiley, New York, 1966.
7. M. S. H. Khiyal and R. M. Thomas, Variable-order, variable-step methods for second-order initial-value problems, J. Comp. Appl. Math. 79, 263 (1997).
8. J. D. Lambert, Numerical Methods for Ordinary Differential Systems, John Wiley, England, 1991.
9. L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary Differential Equations. The Initial Value Problem, Freeman, San Francisco, CA, 1975.
10. J. Vigo-Aguiar, An approach to variable coefficients multistep methods for special differential equations, Int. J. Appl. Math. 8, 911 (1999).
A NOTE ON THE SELECTION OF THE STEP SIZE IN THE VARIABLE STEP-SIZE STORMER METHOD
H. RAMOS Escuela Politécnica Superior, Campus Viriato, 49022, Zamora, Spain E-mail: higra@usal.es
J. VIGO-AGUIAR Departamento de Matema’tica Aplicada, Universidad de Salamanca. 37006, Salamanca. Spain E-mail: [email protected]
Extended abstract

The Stormer method is a multistep code with fixed step suitable for the numerical integration of second-order differential equations of the special form

y''(x) = f(x, y(x)),   y(x_0) = y_0,   y'(x_0) = y'_0,   (1)
where the right hand side does not include the derivative of y [4, p. 462]. We have obtained the counterpart formulation for the variable k-step case, which may be expressed in the form

y_{n+1} - (1 + c_1) y_n + c_1 y_{n-1} = h_{n+1}(h_{n+1} + h_n) Σ_{j=0}^{k-1} γ_j f[x_n, ..., x_{n-j}],   (2)
where h_n and h_{n+1} are two of the step-sizes, namely,

h_n = x_n - x_{n-1},   h_{n+1} = x_{n+1} - x_n,

with x_{n-(k-1)}, ..., x_{n+1} the grid points, unevenly spaced, and the terms f[x_n, ..., x_{n-j}] the Newton divided differences as usually defined. Finally, the γ_j are some coefficients that depend on the step sizes at the grid points, and h is the maximum of the steps. Of course, a criterion is necessary from which to decide how to change the steplength. If we suppose that the numerical integration has proceeded successfully up to the point x_n and we attempt to advance from x_n to x_{n+1} = x_n + h_{n+1} using the variable-step formula in (2), we propose as an estimate for the new step h_{n+1} the unique solution of the equation
in which TOL is a user-given tolerance for the local error, ρ is a safety factor (less than one), g_k is a constant coefficient from the corresponding Stormer formula with fixed step-size (expressed in terms of backward differences for the function f [4, p. 463]), and, as usual, f[x_n, x_{n-1}, ..., x_{n-k}] is the Newton divided difference of order k. This strategy for selecting the step-size works fine, but when the step-size has to be reduced (particularly if it is approximately halved) we have developed a procedure that permits, at the same time, reducing the step and increasing the order. In the simplest case, for k = 1, the formula in (2) results in

y_{n+1} = (1 + c_1) y_n - c_1 y_{n-1} + (1/2) h_{n+1}(h_{n+1} + h_n) f(x_n),
where we have set c1 = for the step size ratio. As the coefficient in the error term is y1 = 1 (c1 - l)hn, choosing c1 = 1 it results in y1 = 0, and so, the method, which have order 2, is the well-known Stormer method of fixed step, ~ n + l =2Yn - Yn-1+
hi+, f n.
We will show another particular case, say for k = 3. Now, the coefficients gamma_j in formula (2) depend on c_1 = h_{n+1}/h_n and on c_2 = h_n/h_{n-1}, and the coefficient gamma_3 in the error term depends on the polynomials

p_1 = -1 + c_1 + c_1^2,   p_2 = -3 + 3 c_1 + 7 c_1^2 + 3 c_1^3.

Equating gamma_3 = 0, we obtain that c_2 may be expressed as a rational function of c_1,

c_2 = - p_1 / p_2.   (3)
Figure 1. c_2 versus c_1 for gamma_3 = 0.
The plot of c_2 versus c_1 in the range for which c_2 > 0 is shown in Figure 1. As c_2 is known, it is clear from the picture that, whatever the value of c_2, we can select a unique c_1 in the interval (s, r), with s = 0.4463115, the real root of the denominator in (3), and r = 0.6180339, the real root of the numerator in (3), in such a way that gamma_3 vanishes. Thus, with this selection the method will have order 4. We can establish this result as a theorem.
Theorem 0.1. For k = 3, given c_2 = h_n/h_{n-1}, we can select a unique c_1, and therefore h_{n+1} = c_1 h_n, in such a way that the variable 3-step Störmer method given in (2) has order 4.

For different values of k it is possible to obtain similar results.
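As an illustration of the step-size selection just described, the sketch below solves gamma_3 = 0 for c_1 given a known ratio c_2, assuming the rational relation c_2 = -p_1/p_2 with p_1 and p_2 as reconstructed above; the bracketing interval (s, r) and the bisection tolerance are the only other ingredients, and the example value of c_2 is arbitrary.

```python
def select_c1(c2, tol=1e-12):
    """Pick the step-size ratio c1 in (s, r) that makes gamma_3 vanish,
    assuming c2 = -p1(c1)/p2(c1) as in equation (3)."""
    p1 = lambda c: -1.0 + c + c**2                      # numerator polynomial
    p2 = lambda c: -3.0 + 3.0*c + 7.0*c**2 + 3.0*c**3   # denominator polynomial
    g = lambda c: -p1(c)/p2(c) - c2                     # a root of g gives gamma_3 = 0
    lo, hi = 0.4463116, 0.6180338                       # just inside (s, r)
    while hi - lo > tol:                                # plain bisection
        mid = 0.5*(lo + hi)
        if g(lo)*g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

# Example: the two previous steps had ratio h_n/h_{n-1} = 0.8
c1 = select_c1(0.8)
print(c1)          # new ratio, giving h_{n+1} = c1*h_n and an order-4 step
```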
References
1. M. Calvo and J. Vigo-Aguiar, A note on the step size selection in Adams multistep methods, Numer. Algorithms 27, 359 (2001).
2. J. P. Coleman and A. S. Booth, Analysis of a Family of Chebyshev Methods for y'' = f(x, y), J. Comp. Appl. Math. 44, 95 (1992).
3. G. Denk, A new numerical method for the integration of highly oscillatory second-order ordinary differential equations, Appl. Numer. Math. 13, 57 (1993).
4. E. Hairer, S. P. Norsett and G. Wanner, Solving Ordinary Differential Equations I, Springer, Berlin, 1987.
5. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, John Wiley, New York, 1962.
6. E. Isaacson and H. B. Keller, Analysis of Numerical Methods, John Wiley, New York, 1966.
7. M. S. H. Khiyal and R. M. Thomas, Variable-order, variable-step methods for second-order initial-value problems, J. Comp. Appl. Math. 79, 263 (1997).
8. J. D. Lambert, Numerical Methods for Ordinary Differential Systems, John Wiley, England, 1991.
9. L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary Differential Equations: The Initial Value Problem, Freeman, San Francisco, CA, 1975.
10. J. Vigo-Aguiar, An approach to variable coefficients multistep methods for special differential equations, Int. J. Appl. Math. 8, 911 (1999).
BIREFRINGENCES: A CHALLENGE FOR BOTH THEORY AND EXPERIMENT
A. RIZZO*
IPCF del CNR, Area della Ricerca, Via Moruzzi 1, loc. S. Cataldo, I-56124 Pisa, Italy
E-mail: [email protected]
The phenomenon of birefringence, the anisotropy of the refractive index induced in light when it impinges on matter subject to static, generally spatially inhomogeneous, electric and/or magnetic induction fields, will be discussed. It will be shown how the subject presents a challenge for theory, computation and experiment.
Optical anisotropies, or birefringences, may occur when radiation impinges on matter in the presence of external electromagnetic fields [1,2,3]. Several types of birefringence can be observed depending on the state of polarization of the light, on the geometrical setup, on the symmetry of the sample subject to radiation and on the type of external field. Typical and well known examples are the linear birefringence observed when polarized light interacts with a sample in the presence of an external electric field with a component perpendicular to the direction of propagation (Kerr effect), or the circular birefringence occurring when linearly polarized light impinges on a sample traversed by a static magnetic induction field with a component parallel to its direction of propagation (Faraday effect) [5,6]. The "linear" birefringence leads to the appearance of an ellipticity, due to the anisotropy of the components of the refractive index parallel and perpendicular, respectively, with respect to the applied external field. A "circular" birefringence results in a rotation of the plane of polarization due to an anisotropy arising between the right and left circular components of linearly polarized radiation. "Axial" birefringence is also observed, e.g., in non-polarized light traversing an assembly of chiral molecules in the same
*Web: http://www.icqem.pi.cnr.it/rizzo/ar.html
conditions as for the regular Faraday effect (magnetochiral birefringence, see below). A review of the recent ab initio work of the author on the magnetic induction field (Cotton-Mouton effect) birefringence, in some cases in collaboration with the experimentalists working in the field, will be given. In the case of the effect in Neon gas, only an extremely sophisticated, state-of-the-art approach such as Coupled Cluster including Singles, Doubles and full Triples (CCSDT), together with the use of extended correlation consistent basis sets, allows one to reproduce the results of very recent, accurate, renewed measurements of the effect [8,9]. Also, the excellent performance of a Density Functional Theory (DFT) approach in reproducing the experimental results, both for small molecules in the gas phase [10] and, in conjunction with the Polarizable Continuum Model (PCM), for the furan homologues in the liquid phase and in solution [11], will be analyzed. The electric-field-gradient-induced linear birefringence (Buckingham effect) [12] is another typical example of a process which requires extra accuracy, both from the point of view of the experiment and from that of the computational chemist. Also, the theory of the effect is still the subject of open discussion [13,14,15,16]. This birefringence is at the basis of one of the most successful techniques for the experimental determination of molecular quadrupole moments. The state of the art of the field will be briefly discussed, with special emphasis on some very recent studies which might help shed some light on the source of disagreement in the literature on the molecular theory of the effect, and also lead to a revision of some of the experimentally derived values of the molecular quadrupole of molecules, e.g. of Cl2 [17,18]. Jones (and Magneto-Electric) birefringence arises when polarized light goes through atoms or molecules in a direction perpendicular to that of externally applied static electric and magnetic induction fields, arranged parallel (Jones) or perpendicular (Magneto-Electric) to each other [19,20]. The resulting linear birefringence (bilinear in the two fields) is currently being extensively investigated experimentally by Geert L. A. Rikken and his co-workers in Grenoble and Toulouse [21,22]. We present the results of our ab initio analysis carried out on inert gases (He, Ne, Ar, Kr), on centrosymmetric molecules (H2, N2, C2H2) and on a polar molecule (CO) [23]. The study is carried out by exploiting a Coupled Cluster Singles and Doubles (CCSD) approach, and it includes a detailed analysis of the dependence of the observable on the choice of origin of the coordinate system in approximate calculations.
As mentioned above, when unpolarized light passes through a sample of chiral molecules in the presence of an external magnetic induction field parallel to the direction of propagation of the radiation, an axial birefringence, the anisotropy of the refractive index for alignment of the propagation vector and the field parallel and anti-parallel, respectively, is observed [24]. This anisotropy has been measured very recently, and two groups disagree quite noticeably on the magnitude of the effect for systems such as proline, limonene and carvone [25,26,27]. We studied the process employing a Time-Dependent Hartree-Fock (RPA) approach, by computing the appropriate frequency-dependent quadratic response functions, which involve the electric dipole, the electric quadrupole and the magnetic dipole interaction operators, rather similarly to what happens for Jones birefringence [28]. In spite of the approximations made, and of the strong dependence of the response of these floppy molecules on the combination of radiation and external magnetic induction field, which makes even the study of natural optical activity rather challenging, we still predict anisotropies much smaller than observed, up to three to four orders of magnitude weaker than seen in some laboratories. This casts, in our view, a shadow on the interpretation which has been made of the results of the experiments.
References
1. C. J. F. Böttcher and P. Bordewijk, Theory of Electric Polarization, volume II, Elsevier, Amsterdam, 1978.
2. L. D. Barron, Molecular Light Scattering and Optical Activity, Cambridge University Press, Cambridge, 1982.
3. G. H. Wagnière, Linear and Nonlinear Optical Properties of Molecules, Verlag Helvetica Chimica Acta, Basel, 1993.
4. A. D. Buckingham and J. A. Pople, Proc. Phys. Soc. A 68, 905 (1955).
5. D. J. Caldwell and H. Eyring, The Theory of Optical Activity, Wiley-Interscience, New York, 1971.
6. A. D. Buckingham and P. J. Stephens, 17, 399 (1966).
7. A. D. Buckingham and J. A. Pople, Proc. Phys. Soc. B 69, 1133 (1956).
8. J. Gauss and A. Rizzo, J. Chem. Phys., to be submitted.
9. R. Cameron, G. Cantatore, A. C. Melissinos, J. Rogers, Y. Semertzidis, H. Halama, A. Prodell, F. A. Nezrick, C. Rizzo, and E. Zavattini, J. Opt. Soc. Am. B 8, 520 (1991).
10. C. Cappelli, B. Mennucci, J. Tomasi, R. Cammi and A. Rizzo, Chem. Phys. Lett. 346, 251 (2001).
11. C. Cappelli, B. Mennucci, J. Tomasi, R. Cammi, A. Rizzo, G. Rikken and C. Rizzo, J. Chem. Phys., in press.
12. A. D. Buckingham, J. Chem. Phys. 30, 1580 (1959).
13. A. D. Buckingham and H. C. Longuet-Higgins, Mol. Phys. 14, 63 (1968).
14. D. A. Imrie and R. E. Raab, Mol. Phys. 74, 833 (1991).
15. S. Coriani, A. Halkier, A. Rizzo and K. Ruud, Chem. Phys. Lett. 326, 269 (2000).
16. S. Coriani, A. Halkier and A. Rizzo, in Recent Research Developments in Chemical Physics, Vol. 2, edited by G. Pandalai, Transworld Scientific, Kerala, India, 2001.
17. S. Coriani, A. Halkier, D. Jonsson, J. Gauss, A. Rizzo and O. Christiansen, J. Chem. Phys. 118, 7329 (2003).
18. C. Cappelli, U. Ekstrom, A. Rizzo and S. Coriani, J. Comp. Meth. Sci. Eng. (JCMSE), submitted.
19. E. B. Graham and R. E. Raab, Proc. R. Soc. Lond. A 390, 73 (1983).
20. E. B. Graham and R. E. Raab, Mol. Phys. 52, 1241 (1984).
21. G. L. J. A. Rikken, E. Raupach and T. Roth, Physica A 294-295, 1 (2001).
22. T. Roth and G. L. J. A. Rikken, Phys. Rev. Lett. 85, 4478 (2000).
23. S. Coriani and A. Rizzo, J. Chem. Phys., to be submitted.
24. L. D. Barron and J. Vrbancich, Mol. Phys. 51, 715 (1984).
25. P. Kleindienst and G. H. Wagnière, Chem. Phys. Lett. 288, 89 (1998).
26. N. K. Kalugin, P. Kleindienst and G. H. Wagnière, Chem. Phys. 248, 105 (1999).
27. M. Vallet, R. Ghosh, A. Le Floch, T. Ruchon, F. Bretenaker and J.-Y. Thépot, Phys. Rev. Lett. 87, 183003 (2001).
28. M. Pecul, S. Coriani, A. Rizzo, P. Jorgensen and M. Jaszuński, J. Chem. Phys. 117, 6417 (2002).
FEASIBILITY OF CLOSED-LOOP TARGET CONTROLLED INFUSION OF INTRAVENOUS ANAESTHESIA*
J. ROCA JR., J. ROCA, J. MARTINEZ AND F.J. MARTINEZ
"Industrial & Medical Electronics" Research Group, Polytechnic University of Cartagena, Cartagena, 30202 Murcia, Spain
E-mail: [email protected]

F.J. GIL AND J.A. ALVAREZ-GOMEZ
Anesthesiology & Reanimation Service - Intensive Care Service, Sta Maria del Rossel Hospital - Servicio Murciano de Salud, Cartagena, 30202 Murcia, Spain

One of the major problems that clinical anesthesiologists face in their everyday practice is related to intraoperative patient awareness. The sedation level that prevents these incidents should be guaranteed by proper drug administration and dosing procedures. This paper presents the results of a feasibility study of a closed-loop controller for target controlled infusion of intravenous anaesthesia based upon different in-silico experiments.
1. Introduction
1.1. Anaesthesia administration and computer assisted dosing
The problem of dosing intravenous agents is related to the metabolic principles of operation of the hypnotic used for sedation, which can be modeled through standard procedures such as compartmental modeling, the one most frequently used in clinical applications. In this way, the total amount of drug in the blood may be estimated by means of pharmacokinetic (PK) models that relate the plasmatic drug concentration with the drug uptake, elimination and internal redistribution among the different body tissues and organs. These models take the form of a simple set of differential equations reflecting the drug concentration variation in each one of the n compartments, and can be written in terms of the inter-compartmental rate microconstants k_ij and the external drug input I_i associated with that compartment (of volume V_i) as:
* This work is being supported by the Spanish Ministry of Health research grant FIS01/1342 from the Spanish Sanitary Research Fund of the Carlos III Institute of Health.
dC_i/dt = (1/V_i) [ I_i + sum_{j not equal to i} ( k_ji C_j V_j - k_ij C_i V_i ) ],   i = 1, ..., n.
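A minimal sketch of how such a compartmental model can be integrated numerically is shown below. The rate microconstants, volumes and infusion profile are illustrative placeholders only (they are not the Marsh, Dyck & Shafer or Tackley parameter sets used later in the paper), and forward Euler is used purely for brevity.

```python
import numpy as np

def simulate_pk(I, dt=1.0, t_end=1800.0):
    """Toy three-compartment PK model driven by an infusion rate I(t) [mg/s].
    Rate microconstants (1/s) and volumes (L) are illustrative placeholders."""
    k10, k12, k21, k13, k31 = 1e-3, 2e-3, 1e-3, 1.5e-3, 5e-4
    V1 = 10.0
    A = np.zeros(3)                          # drug amounts per compartment (mg)
    times = np.arange(0.0, t_end, dt)
    Cp = np.zeros(len(times))                # central (plasma) concentration (mg/L)
    for s, t in enumerate(times):
        dA1 = I(t) - (k10 + k12 + k13)*A[0] + k21*A[1] + k31*A[2]
        dA2 = k12*A[0] - k21*A[1]
        dA3 = k13*A[0] - k31*A[2]
        A += dt*np.array([dA1, dA2, dA3])    # forward Euler step
        Cp[s] = A[0]/V1
    return times, Cp

# Example: a 2-minute induction infusion followed by a lower maintenance rate
times, Cp = simulate_pk(lambda t: 2.0 if t < 120 else 0.2)
print(Cp[-1])
```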
Dosing is usually accomplished in a two-stage approach. First, the dose required for induction is calculated as the rate needed to reach the desired plasmatic concentration at a given time. After that point, the dose is calculated to compensate for the amount of drug eliminated through renal clearance. As these calculations were not affordable for most specialists at the operating theatre, exact dosing procedures were not available until the development of computer controlled infusion pumps (CCIPs) and the proper control algorithms. The first device approved for clinical practice was the Diprifusor™, developed by Zeneca in 1996 for the administration of propofol [1]. This system used a model predictive controller (MPC) for adjusting the pump infusion rate based upon a pharmacokinetic data set which is used to estimate the required infusion rate. This method for intravenous drug administration, also known as Target Controlled Infusion (TCI), has been successfully introduced as an everyday clinical procedure.

1.2. Closed-Loop control of Anaesthesia
The first problem that the clinical specialists have to face is that most of the drugs used for sedation do not present a linear relationship between plasmatic concentration and the observable therapeutic effect, so that an additional nonlinear pharmacodynamic (PD) model for the effect site has to be used in order to properly calculate the dosing required to reach a desired sedation state. On the other hand, as the PK parameters are derived from a specific sample population, some individuals may not be correctly modeled, so that significant differences are observed between actual and predicted plasmatic drug concentrations. In order to overcome these limitations, several authors have developed closed-loop control strategies for depth of anaesthesia (DOA) that automatically adjust the dose as a function of the observed changes in several indicators of sedation such as the blood pressure and the heart rate. The main drawback of this approach lies in inter-subject response variability, so that current trends are focusing on the analysis of cortical responses (EEG, AEP) as DOA
estimators [2,3].
2. A model-based approach to closed-loop control
To improve the TCI perfusion, the closed-loop TCI controller of Figure 1 is proposed (in that diagram, n is a noise input used for the simulation of interferences in order to study the immunity to electric perturbations).

Figure 1. Proposed closed-loop controller for target controlled infusion of propofol.
DOA feedback from the patient, obtained after processing the medium-latency auditory evoked potentials with the method proposed by Mantzaridis et al. [4], is analyzed by the fuzzy linguistic controller (FLC) in order to increase or decrease the target concentration of the TCI block, compensating for the drift due to the mismatch between the theoretical model and the real PK-PD model of the patient. This simple but effective Mamdani fuzzy logic linguistic controller is developed around a rule set derived from those commonly implemented by industrial controllers, while the membership functions (MFs) were characterized after studying the decision-making procedures of the specialists in charge of the sedation at surgery. The system was implemented under LabVIEW 6i™ in order to integrate the AEP processing for the DOA measure and the controller under a single software system. The fuzzy MFs were defined within the Fuzzy Logic Toolkit as follows: Error and Trend were implemented as inputs with 3 non-overlapping MFs from 10 to -10 (POS, ZERO and NEG), and Rate as an output with 5 fully-overlapping MFs from 0.125 to -0.125 (INCREASE, INC, HOLD, DEC and DECREASE). Defuzzification was performed by means of the Center of Gravity method and the inference was based on the max-min rule.
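The sketch below illustrates the kind of Mamdani controller just described. The triangular output MF shapes, the crisp thresholds used to classify the Error and Trend inputs, and the rule table are all assumptions made for the example (the paper only gives the MF names and universes), so this is an illustration of the inference scheme, not the controller actually implemented in LabVIEW.

```python
import numpy as np

RATE = np.linspace(-0.125, 0.125, 501)      # output universe for the target adjustment

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return np.maximum(np.minimum((x - a)/(b - a + 1e-12), (c - x)/(c - b + 1e-12)), 0.0)

OUT_MFS = {                                  # five overlapping output MFs (shapes assumed)
    "DECREASE": tri(RATE, -0.1875, -0.125, -0.0625),
    "DEC":      tri(RATE, -0.125, -0.0625, 0.0),
    "HOLD":     tri(RATE, -0.0625, 0.0, 0.0625),
    "INC":      tri(RATE, 0.0, 0.0625, 0.125),
    "INCREASE": tri(RATE, 0.0625, 0.125, 0.1875),
}

def category(x):
    """Three non-overlapping input MFs on [-10, 10], reduced here to crisp classes."""
    return "NEG" if x < -1.0 else ("POS" if x > 1.0 else "ZERO")

RULES = {                                    # hypothetical rule table
    ("NEG", "NEG"): "INCREASE", ("NEG", "ZERO"): "INC",  ("NEG", "POS"): "HOLD",
    ("ZERO", "NEG"): "INC",     ("ZERO", "ZERO"): "HOLD", ("ZERO", "POS"): "DEC",
    ("POS", "NEG"): "HOLD",     ("POS", "ZERO"): "DEC",  ("POS", "POS"): "DECREASE",
}

def flc(error, trend):
    """Max-min inference with centre-of-gravity defuzzification.
    With crisp input classes only one rule fires, so aggregation reduces to its MF."""
    fired = OUT_MFS[RULES[(category(error), category(trend))]]
    return float(np.sum(RATE*fired)/np.sum(fired))

print(flc(error=-4.0, trend=0.5))            # positive output: raise the target concentration
```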
3. Results
The system was simulated with three different PK models for the patient. The simulation time step was 1 s, updating the TCI pump and the FLC controller after induction (2 min) every 15 and 60 seconds, respectively. In order to test the controller performance, the median performance error (MPE) and the median absolute performance error (MAPE) were calculated over the resulting DOA. Table 1 shows the results of the simulations.

Table 1. Simulation results for effect targets 10 to 80%.

                    OPEN LOOP              CLOSED LOOP
PATIENT MODEL       MPE(%)    MAPE(%)      MPE(%)    MAPE(%)
Marsh                0.33      0.33         0.33      0.33
Dyck & Shafer       48.26     48.26         4.99      4.99
Tackley             -8.2       9.05        -6.72      6.78

4. Conclusion and future works
Closed-loop control of anaesthesia based upon adaptive target controlled infusion will be feasible as soon as DOA measures improve. Our future work will be focused on the development of a multimodal index for DOA measurement based upon the extraction of mutual information from different indexes obtained from EEG and AEP.

Acknowledgments

This research project was awarded the Spanish Society of Intensive, Critical & Coronary Units Medicine (SEMICYUC) research prize of 2000, sponsored by Siemens Medical.

References
1. J.B. Glen, Brit. Jour. Anaesthesia, V53S1, 13 (1998).
2. H. Litvan, E.W. Jensen, M. Revuelta, S.W. Henneberg, P. Paniagua, J.M. Campos et al., Acta Anaesthesiol Scand, V46, 245 (2002).
3. M. Van Gils, H. Viertio-Oja, A. Yli-Hankala, I. Korhonen, IFMBE Proc Series, V3(1), 390 (2002).
4. H. Mantzaridis and G.N.C. Kenny, Anaesthesia, V52, 1030 (1997).
THE FUNDAMENTAL SOLUTION METHOD FOR ELLIPTIC BOUNDARY VALUE PROBLEMS PASCAL ROUBIDES Georgia Institute of Technology Atlanta, GA 30332-0160, USA E-mail: [email protected]
This article presents an efficient field discretization method based on fundamental solutions for elliptic boundary value problems, represented in this paper by the steady-state heat conduction equation. The results show that the method is highly accurate and does not require the fine grid resolution that other techniques demand.
1. Introduction

Typical solution methods for potential-type problems involve direct discretization of the field equation within a particular domain using Finite Difference (FDM) or Finite Element (FEM) methods. In contrast, other techniques such as Boundary Element Methods (BEM) involve discretizing only the boundary of the domain [1-4]. The sample mathematical model used in this paper for steady-state heat conduction in 2D is the well-known Laplace equation:

nabla^2 T = 0.   (1)

2. Method
The method used in this article, called the Fundamental Solution Method (FSM), uses a system of constraints that relate a computational node in a grid to its neighboring nodes. The result is an equation that looks similar to a finite difference discretization but does not have any explicit truncation error. The constraints are computed directly from single-layer potential theory. The problem is to determine the potential T at a point r, given data at M neighboring points r_m. The actual source or sources producing the field are unknown in both location and strength. The heat flow potential T may be approximated in the region Omega containing r and the data points r_m as a superposition of the fields generated by N hypothetical sources of strength sigma_n randomly distributed throughout the domain (see Figure 1).
Figure 1. Local grid nodes and random hypothetical source distribution.
Discretization of the integral equation resulting from (1) for points r and r_m gives

T(r) = sum_{n=1}^{N} sigma_n G(r; r_n)   and   T(r_m) = sum_{n=1}^{N} sigma_n G(r_m; r_n),   with r in Omega, m = 1, ..., M,

where G is the fundamental solution for the Laplace equation in 2D, given as G(r; r_n) = -(1/2pi) log |r - r_n|. In matrix notation we have T_m = G_mn sigma_n, where T and sigma are M x 1 and N x 1 matrices respectively, with elements T(r_m), m = 1, ..., M, and sigma_n, n = 1, ..., N, and where G is an M x N matrix with elements G(r_m; r_n). The matrix G does not necessarily have to be a square matrix; in fact, in certain types of problems, it may be beneficial to under-constrain the system (i.e., N > M), in which case the computational gain in efficiency is even larger. The non-square matrix can then be inverted choosing a least-squares norm implemented by Singular Value Decomposition (SVD). Once the pseudo-inverse of the matrix G has been computed, the least-squares solution for the hypothetical source strengths sigma_n can be computed as sigma_n = G+_nm T_m, where G+ denotes the pseudo-inverse matrix. Finally, the solution of the potential at each grid point can be expressed as T(r) = g_n(r) G+_nm T_m, with g_n(r) = G(r; r_n). Boundary conditions can be included as additional constraints to the system and can be used along with the neighboring nodes T_m to compute the hypothetical source strengths sigma_n. Because of the nature of the method, the number of constraints used in the discretization is not limited. This makes it possible to impart sub-grid boundary condition variations, such as a curved surface or a corner.
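A minimal numerical sketch of this procedure is given below. The "data" at the M neighboring nodes are manufactured from an exact harmonic function so the reconstruction can be checked, and the hypothetical sources are placed on a circle away from the data cluster; both choices are illustrative (the paper distributes the sources randomly throughout the domain), as are the point counts.

```python
import numpy as np

rng = np.random.default_rng(0)

def G(p, q):
    """2D Laplace fundamental solution, -(1/(2*pi)) * log|p - q|."""
    return -np.log(np.linalg.norm(p - q, axis=-1)) / (2.0*np.pi)

def T_exact(p):
    return p[..., 0]**2 - p[..., 1]**2           # x^2 - y^2 satisfies Laplace's equation

M, N = 12, 20                                     # data points / hypothetical sources
data_pts = rng.uniform(-1.0, 1.0, size=(M, 2))    # neighboring nodes r_m
theta = rng.uniform(0.0, 2*np.pi, size=N)
sources = 4.0*np.column_stack([np.cos(theta), np.sin(theta)])   # sources r_n on a circle
Tm = T_exact(data_pts)

# Build the M x N matrix G_mn and obtain the source strengths via the SVD pseudo-inverse
Gmn = G(data_pts[:, None, :], sources[None, :, :])
sigma = np.linalg.pinv(Gmn) @ Tm

# Evaluate the reconstructed potential at a test point and compare with the exact field
r = np.array([0.3, -0.2])
print(G(r, sources) @ sigma, T_exact(r))
```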
3. Field Solution
To demonstrate the method's ability to adequately obtain the field solution for the heat flow potential, we solve a sample boundary value problem on a rectangular domain. The potential field due to a singular source distribution centered within the computational mesh is evaluated. The interpolating grid used in the field solution is a single square lattice of four nodes whose field values are calculated by a successive over-relaxation (SOR) algorithm. Figures 2-5 demonstrate the field solution at various grid resolutions, with the coarsest shown in Figure 2, where only the minimum 5x5-point stencil is used. It is immediately obvious that the least-squares procedure is picking up the source centered within the grid even at this minimal configuration. In the next three figures, the field solution is interpolated on 10x10, 20x20, and 50x50-point stencils. It can easily be seen that the 20x20 field resolution is more than sufficient to accurately visualize the field solution.
Figures 2-5. Field solution for 5x5, 10x10, 20x20, and 50x50-point stencils.
4. Conclusion
Preliminary results obtained here and in [5, 6] show that the method can produce accurate results in elliptic-type differential models, and may be applied in various areas of practical importance. Furthermore, the FSM allows one to obtain accurate results with minimal computational resources (all numerical results for this paper have been obtained on a desktop computer in double precision and validated on a Sun workstation). Current work on the mathematical details that affect the parameters involved in the method is ongoing.

References
1. C. Brebbia, J. Telles, and L. Wrobel, Boundary Element Techniques, Springer (1984).
2. N. Tovmasyan, Boundary value problems for partial differential equations and applications in electrodynamics, World Scientific (1994).
3. Y. Melnikov, S. Hughes, and S. McDaniel, Boundary element approach based on Green's functions, BETECH, University of Hawaii (1996).
4. Y. Melnikov and P. Roubides, Influence function method as an alternative approach in computational mechanics, ASCE-ASME-SES Conference, Northwestern University (1997).
5. P. Roubides, An influence function formulation for the solution of the inverse biharmonic equation, 22nd SEARCDE, University of Tennessee-Knoxville (2002).
6. P. Roubides, Influence functions for the numerical solution of the inverse bioelectric field model, 3rd Int'l Conference on Scientific Computing, City University of Hong Kong (2003).
AXISYMMETRIC RIGID BODIES IN CREEPING FLOW
J. ROUMELIOTIS*
School of Computer Science and Mathematics, Victoria University of Technology, PO Box 14428, Melbourne City Mail Centre, Victoria 8001, Australia
E-mail: [email protected]
URL: http://www.staff.vu.edu.au/johnr
1. Extended Abstract
Many biological and industrial processes involve fluid flows in which the viscosity is large and/or particle lengths are small. These flow regimes, called Stokes or creeping flows, are described by the governing equations

mu nabla^2 u = nabla p,   nabla . u = 0,   (1)
where u is the velocity field, p the pressure field and mu the viscosity of an external fluid domain Omega. The most striking feature of these equations is their linearity, and it is this that has been exploited over the years to develop both analytic and numerical solutions for a variety of problems. The linearity of the governing equations (1) suggests a super-positioning approach which results in an integral over the collected surface S of all bodies in the unbounded fluid. That is, we can recast (1) so that the velocity field may be represented by a distribution of singularities over the
*Work partially supported via the overseas study program of Victoria University of Technology.
surface of each body. In non-dimensional terms, we have the boundary integral representation (2), in which

c(x) = 1 for x in Omega,   c(x) = 1/2 for x on S,   (3)
and the usual summation convention has been adopted. The singularities are the Stokeslet and the Stokes-stresslet.
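These singularities, presumably equations (4) and (5) of the paper, take the standard free-space form for Stokes flow; the expressions below are written in a common non-dimensional convention, so the normalization factor may differ from the one adopted by the author.

$$ G_{ik}(\mathbf{x}) = \frac{\delta_{ik}}{|\mathbf{x}|} + \frac{x_i\,x_k}{|\mathbf{x}|^{3}}, \qquad \Sigma_{ijk}(\mathbf{x}) = -6\,\frac{x_i\,x_j\,x_k}{|\mathbf{x}|^{5}}. $$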
We refer the reader to the references for the details. The Stokeslet represents the velocity of the fluid at x in the i-th direction due to a unit point force at the origin in the direction k. The stresslet is the analogous singularity for the stress tensor. For rigid bodies translating with velocity V, the integral equation (2) simplifies to

V = integral_S G(x - y) . f(y) dS(y)   for x on S,   (6)
an equation of the first kind in the surface traction f. Youngren and Acrivos [10] solved the first kind equation for a single spheroid and reported reasonable accuracy for spheroids of moderate curvature. Of greater significance was the fact that they reported no ill-conditioning, even though Fredholm equations of the first kind are susceptible to these numerical instabilities. The instability of the first kind equation, which we will show in a later section, led to the development of a second kind equation for rigid particles [8]. The Stokes-stresslet is itself a solution to the governing equation and, in the case of a rigid surface, it is a simple matter to show that the double layer potential satisfies

integral_S Sigma_ijk(x - y) u_i(y) n_j(y) dS(y) = -(1 - c(x)) u_k(x),   (7)

an equation of the second kind. The problem with this equation is that it cannot describe flows made up of rigid body motions. Indeed, equation (7)
should be substituted into (2) to produce the single-layer, first kind equation (6), and is not in itself sufficient to allow solution by any numerical method. The remedy is to complete the system by stipulating conditions that are equivalent to the six rigid body motions. Subsequent works demonstrate the applicability of the equations by numerically solving several rigid particle problems, and also use this formulation to investigate the problem of particle motion near a plane wall [7]. Another second kind equation for rigid particles, based on the traction, has been presented by Ingber and Mondy [2]. They demonstrate its use, but there is a requirement to evaluate three normal derivatives of Cauchy principal value integrals. Details regarding the second kind equation formulations can be found in [3, 4]. It would seem that the price paid to stave off ill-conditioning in first kind equations is the much higher workload in solving those of the second kind. Recently, Ingber and Mammoli [1] compared all three formulations on a set of test problems and showed that even though the condition number of the first kind system was large, it did not seem to influence the accuracy. In fact, the first kind equation performed slightly better than the other two, and other work suggests that first kind equations are not necessarily worse than second kind [6]. In this paper the rigid body, axisymmetric, first kind integral equations are solved. We present, in detail, a method based on arbitrary order Hermite interpolation over a general grid to construct a linear system which, when solved, furnishes the nodal behaviour of the particle traction. Complete elliptic integrals are approximated with a new polynomial-logarithmic expansion developed using the computer algebra package Maple. As a result, a combination of Gauss-Legendre and Gauss-Log quadrature rules is employed to evaluate the integrals. We show that this method works extremely well for simple spheroidal geometries, but becomes unstable for more complex particle shapes. In an effort to increase stability, as well as accuracy, two curvature-based methods will be presented. The first method is based on choosing an a priori grid so that the curvature is represented uniformly across the entire domain. We will show that this scheme produces a five-fold increase in accuracy for the calculated traction of a highly prolate spheroid and prevents the instability reported earlier with the complex shaped particle. The second method involves modification of the integrand in the boundary integral equation. The idea is that the traction f may be decomposed as

f = F(kappa) f_w,   (8)
where F is some function of the curvature kappa chosen so that f_w is independent of the curvature. In this way, almost any trivial grid can be used. We chose F(kappa) = kappa^{1/3}, since this would return a constant solution f_w = c for the spheroid. With the complex surface, the returned solution f_w lacked the very large derivatives of the traction f, giving an a posteriori confirmation of the validity of the decomposition (8). In addition, it was clear that the traction f = kappa^{1/3} f_w was much more accurate when compared with the un-weighted solution f. Finally, we note that the methods presented in this paper are quite general and apply to any first kind boundary integral equation whose boundary has a known expression.
References
1. M. S. Ingber and A. A. Mammoli, A comparison of integral formulations for the analysis of low Reynolds number flows, Eng. Anal. with Boundary Elements, 23:307-315, 1999.
2. M. S. Ingber and L. A. Mondy, Direct second kind boundary integral formulation for Stokes flow problems, Comp. Mech., 11:11-27, 1993.
3. S. J. Karrila and S. Kim, Integral equations of the second kind for Stokes flow: direct solution for physical variables and removal of inherent accuracy limitations, Chem. Engng. Commun., 82:123-161, 1989.
4. S. Kim and S. J. Karrila, Microhydrodynamics: Principles and Selected Applications, Butterworth-Heinemann, Boston, 1991.
5. N. Liron and E. Barta, Motion of a rigid particle in Stokes flow: a new second-kind boundary-integral equation formulation, J. Fluid Mech., 238:579-598, 1992.
6. H. Niessner and M. Ribaut, Condition of boundary integral equations arising from flow computations, J. Comp. Appl. Math., 12 & 13:491-503, 1985.
7. H. Power and B. F. de Power, Second-kind integral equation formulation for the slow motion of a particle of arbitrary shape near a plane wall in a viscous fluid, SIAM J. Appl. Math., 53:50-70, 1993.
8. H. Power and G. Miranda, Second kind integral equation formulation of Stokes' flows past a particle of arbitrary shape, SIAM J. Appl. Math., 47:689-698, 1987.
9. J. Roumeliotis, A Boundary Integral Method applied to Stokes Flow, PhD thesis, The University of New South Wales, 2000. URL http://www.staff.vu.edu.au/johnr.
10. G. K. Youngren and A. Acrivos, Stokes flow past a particle of arbitrary shape: a numerical method of solution, J. Fluid Mech., 69:377-403, 1975.
PAST, PRESENT AND FUTURE CHALLENGE OF DENSITY FUNCTIONAL THEORY BASED METHODS IN MOLECULAR SCIENCES NINO RUSSO, TIZIANA MARINO, EMILIA SICILIA AND MARIROSA TOSCANO Dipartimento di Chimica and Centro di Calcolo ad Alte Prestazioni per Elaborazioni Parallele e Distribuite-Centro d 'EccellenzaMIUR, Universita della Calabria, 1-87030 Arcavacata di Rende (CS), Italy. Email: E-mail: nru.:so(ii'rmical.it
In this talk we will briefly review the development of density functional methods from the 1960s to nowadays and outline some future perspectives. Both conceptual and computational viewpoints will be treated. Particular emphasis will be given to the structure and performance of the new version of the deMon code. Some applications in different fields of modern molecular sciences such as catalysis, surface and material science, organic and bioinorganic chemistry, metal-ligand interactions and molecular spectroscopy will be presented and discussed. We deal with:
-metal cation interactions with nucleic acid bases and amino acids. In particular the toxicity of the Al cation will be proved at the molecular level;
-metal and metal oxide clusters to simulate chemisorption processes at the surfaces;
Adsorption of NO on MgO cluster modelling a defect site
-H-atom versus electron transfer in the working mechanism of natural antioxidants;
Chemical structures of the considered antioxidants
-Potential energy profiles for some reactions catalysed by enzymes;
Potential energy profile for NO3- reduction by the nitrate reductase enzyme (all values are in kcal/mol).
-Reactivity indices and their use in reproducing periodic properties and reaction mechanisms;
Graphical representation of the DFT-based reactivity indices from scandium to cadmium.
-Potential energy surfaces for catalytic reactions;
References 1 . Russo, N.; Toscano, M.; Uccella, N. J . Agric. Food Chem. 48,3232(2000). 2. Russo N. and Salahub, D.R., Eds., Metal-ligand Interactions in Chemistry, Physics and Biology, Kluwer, Dordrecht, 2000. 3. Marino T., Russo N. and Toscano M., J. Inorg. Biochem., 79, 179(2000). 4. Marino T., Russo N. and Toscano M., Inorg. Chem., 40,6439(2001). 5. Russo N., Toscano M. and Grand A., J. Phys. Chem. B, 105,4735(2001). 6. Russo N., Toscano M. and Grand A., J. Am. Chem. SOC.,123, 10272(2001). 7. Russo, N.; Sicilia, E., J. Am. Chem. SOC.,123, 2588(2001). 8. Russo, N.; Sicilia, E. J. Am. Chem. SOC.,124, 1471(2002). 9. De Luca G., Russo, N.; Sicilia, E. and Mineva T., J. Am. Chem. SOC.,124, 1494(2002). 10. Michelini M. C., Russo N. and Sicilia E., J. Phys. Chem. A, 106, 8937(2002). 1 1 . Marino T., Russo N. and Toscano M., J. Mass Spectrom., 37,786(2002). 12. Marino T., Russo N. and Toscano M., J. Phys. Chem. B, 107,2588(2003). 13. Russo N., Toscano M. and Grand A., J. Mass Spectrom., 38,265(2003).
ARTIFICIAL INTELLIGENCE METHODS USED IN THE INVESTIGATION OF POLYMERS PROPERTIES
T. RUSU AND M. PINTEALA
"P. Poni" Institute of Macromolecular Chemistry, Iasi, Aleea "Gr. Ghica Voda" no. 41A, 6600, Romania
E-mail: [email protected]
V. BULACOVSCHI
"Gh. Asachi" Technical University, Faculty of Chemistry, Iasi, 6600, Romania
E-mail: [email protected]

This paper deals with the use of Artificial Intelligence (AI) methods in the process of designing new molecules possessing desired physical, chemical and biological properties. This is an important and difficult problem in the chemical, material and pharmaceutical industries. The traditional methods involve a laborious and expensive trial-and-error procedure, so the development of computer-assisted approaches towards the automation of molecular design is of considerable importance.
1. Introduction
In general, computer-aided molecular design requires the solution of two problems:
- the first problem requires the computation of physical, chemical and biological properties from the molecular structure;
- the second problem is related to the identification of the appropriate molecular structure that gives the desired physicochemical properties.
For the first problem we propose the use of a specially trained neural network (NN) in cooperation with a Monte Carlo (MC) simulation technique. Our interest focuses on neural network algorithms because they can be applied to any particular knowledge domain and may be used generally to solve certain classes of problems from a wide variety of fields. Once the learning process of the NN has been carried out for a set of copolymers whose water delivery properties were investigated in relation to the molecular ratio of the hydrophilic and hydrophobic sequences, we consider a set of data supplied by the MC method to run on the trained NN, and the resulting set of data is considered as the population for the next stage. For the identification problem, we propose an evolutionary molecular design approach using genetic algorithms (GA), which are general-purpose, stochastic, evolutionary search and optimization strategies based on the Darwinian model of natural selection and evolution.
1.1 Neural Network
The essential abilities and the flexibility of NNs are brought about by the interconnection of individual arithmetic units. Many kinds of networking strategies have been investigated, but for our work we have considered a "back-propagation" algorithm. This algorithm does not represent any particular kind of network architecture (a multilayered net is generally used) but rather a special learning process. Learning through back-propagation stems from the fact that adjustments to the neural net's weights can be calculated on the basis of well-defined equations. Nevertheless, this procedure for correcting errors has very little in common with those processes responsible for the adjustment of synaptic weights in biological systems. The back-propagation algorithm may be used for one- or multilayered networks and is a supervised learning process [1].
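A minimal back-propagation example of the kind just described is sketched below. The single hidden layer, the toy input/target relationship and the learning rate are illustrative placeholders chosen so the example runs; they are not the architecture or data used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0/(1.0 + np.exp(-z))

# Toy data: one input (e.g., a composition ratio) mapped to one target property in [0, 1]
X = rng.uniform(0.0, 1.0, size=(50, 1))
Y = 0.5 + 0.4*np.sin(3.0*X)                    # placeholder relationship

W1, b1 = rng.normal(size=(1, 8)), np.zeros(8)  # one hidden layer of 8 units
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.5

for epoch in range(5000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)
    out = sigmoid(H @ W2 + b2)
    # Backward pass: propagate the output error to each layer's weights
    d_out = (out - Y) * out*(1.0 - out)
    d_H = (d_out @ W2.T) * H*(1.0 - H)
    W2 -= lr * H.T @ d_out / len(X);  b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_H / len(X);    b1 -= lr * d_H.mean(axis=0)

print(float(np.mean((out - Y)**2)))            # final mean squared training error
```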
1.2 Genetic Algorithm

Genetic search methods have their basis in Darwinian models of natural selection and evolution. For the GA there are essentially two main components. First, there must be a code or structure to represent the possible solutions to the problem. This basic code is called a string. Strings are composed of a sequence of characters of finite length n, composed of binary bits or symbols. Each candidate solution is represented by such a string. The set of solutions, P_j, is referred to as the population of the j-th generation. Second, a transition rule must be defined to mimic the biological evolution process. The transition rule consists of a reproductive plan and genetic operators. The reproductive plan is an algorithm that selects strings of the current population which will participate in the process of generating the next solution population. The genetic operators appropriately modify the structure of the selected strings, in order to produce the members of the next generation. A simple genetic algorithm consists of one reproductive plan, called fitness-proportionate reproduction, and two genetic operators, crossover and mutation [2].

1.3 Experimental results

Starting synthesis data are related to a set of copolymers with hydrophilic and hydrophobic sequences for which the best-fit molecular ratio was designed according to the evaporation speed from the macromolecular network [3]. The synthesis of the copolymers was realized according to Scheme 1.
Scheme 1 (PDMS + PMAA -> PDMS-co-PMAA copolymer).
12
14
Figure 1. Evaporation of water from the macromolecular network as a function of time and molecular ratio.

The NN algorithm presented here is a multi-parameter NN algorithm called OMBNN3, developed by Krasnopolsky [4]. It contains three new elements:
1. It is a multi-parameter retrieval algorithm. The co-variability of related atmospheric and evaporation parameters which can be extracted from the same set of evaporation conditions is taken into account: evaporation speed, columnar water vapor and columnar liquid water. In equation (1), g = {W, V, L, Ts} is the vector of simultaneously retrieved parameters, where W is the evaporation speed, V is the columnar water vapor and L is the columnar liquid water. If w is the evaporation speed retrieved by a single-parameter algorithm and W is the evaporation speed retrieved by the corresponding multi-parameter algorithm (1), then this "artificial" systematic error (bias) can be estimated as

(W - w) = sum_i a_i b_i + sum_i beta_i u_i^2 + sum_{i,j} gamma_ij c_ij + ...   (2)
I
iJ
2. Second, a method of NN training which enhances learning at high evaporation speeds was used. In preparation for training the NN, the weights, , in the error function were generated using the following formula, a=- C (3)
mi
554 where p(W) is the observed evaporation speed probability distribution and C is a normalization constant. 3. Third, an extensive matchup database containing raw matchup data sets for the selected samples and augmented with additional matchup data for high evaporation speed events (up to 26 m/s) was used in developing this algorithm ( > 15,000 matchups overall). 1.4 Discussions
The OMBNN3 and PB algorithms, which both employ the simultaneous multiparameter retrieval approach, reduce the bias, and the dependence of the bias, on both water vapor and cloud liquid water concentrations. In some cases this may lead to degraded retrievals; therefore, any additional information about related local atmospheric conditions which can be derived from the same evaporation conditions may improve these retrieval for the efect of molecular ratio PDMSPMAA. Local atmospheric parameters ( V , L), which the OMBNN3 algorithm now produces simultaneously with evaporation speed, help to describe the instantaneous state of the atmosphere more completely, and, therefore may help to improve the retrieval molecular ratio from PDMWPMAA samples and to hrther improve the accuracy of retrievals under imposed conditions. The GA have selected the best fit solutions and to the rejected candidates of the initial population the specific mutation was used to run again the system. The calculated error of the system is in a reasonable limit . The OMBNN3 algorithm demonstrates the best performance. The random errors for the OMBNN3 algorithm are significantly smaller and demonstrate weaker dependencies on the related atmospheric parameters than do the errors for the other algorithms. References 1.J. Zupan, J. Gasteiger, NeuraI networks for Chemists: A Textbook, VCH,
Weinheim, 1993. 2.Goldberg, D. E., Genetic Algorithms in Search, Optimization, and Machine Learning.Addison-Wesley (1989) 3.T. Rusu, S. Ioan, S. C. Buraga; Amphyphylic copolymers as water delivery systems;Eur. Polym. J., 37, 2005,2001 4.Krasnopolsky, V., Gemmill, W.H., Breaker, L.C.; A New transferfinction for ssm/I based on an expanded neural network architecture; Technical Note, OMB contribution No. 137, NCEP/NOAA, 1996
SYMMETRIC MULTISTEP METHODS WITH MINIMAL PHASE-LAG FOR THE APPROXIMATE SOLUTION OF ORBITAL PROBLEMS
D. SAKAS AND T.E. SIMOS*$ Department of Computer Science and Technology, Faculty of Science and Technology, University of Peloponnese, GR-22100 Tripolis, Greece
An explicit hybrid symmetric six-step method of algebraic order six is presented in this paper. The method has phase-lag of order ten. Numerical comparative results from the application of the new method to well known periodic orbital problems, clearly demonstrate the superior efficiency of the method presented in this paper compared with methods of the same characteristics.
1. Introduction
In the last decade much research has been done on the numerical solution of second-order differential equations of the form

y''(t) = f(t, y(t)),   (1)
i.e. differential equations for which the function f is independent from the first derivative of y. Some of the most frequently used methods for the numerical solution of problem (1) are the symmetric multistep methods. Symmetric multistep methods were first presented by Lambert and Watson In they show that the interval of periodicity of symmetric multistep methods is non vanishing, which reassures the existence of periodic solutions
’.
*Active Member of the European Academy of Sciences and Arts +Corresponding author. Please use the following address for all correspondence: Dr. T.E. Simos, 26 Menelaou Street, Amfithea - Paleon Faliron, GR-175 64 Athens, Greece. Fax number: 301 94 20 091 $Email: [email protected]
++
555
556
'
in it a. Lambert and Watson developed symmetric multistep methods, that were orbital stable (when the number of steps exceeds two). Orbital instability is a property which was presented for the family of Stormer-Cowell multistep methods, used for the solution of (1). Another important advantage of the linear symmetric multistep methods, developed by Lambert and Watson and by Quinlan and Tremaine (see and 3 ) , is the simplicity of these methods compared with the hybrid (Runge-Kutta type) ones. In this paper, we develop a three-stage symmetric multistep method which has phase-lag of order ten. For the new proposed method we also discuss in some detail its stability characteristics. We present our numerical results by applying the new method t o some established orbital type problems. For comparison purposes, we also use certain other well known methods that can be found in the literature.
'
2. The New Method
We study the following six step method:
The above method is of sixth algebraic order. We calculate the free parameters so that the method has minimal phase-lag.
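Since the explicit coefficients of the six-step method are not reproduced here, the sketch below only illustrates the general framework: a linear symmetric multistep method sum_j alpha_j y_{n+j} = h^2 sum_j beta_j f_{n+j} applied to the test equation y'' = -q^2 y, with the classical two-step Störmer coefficients used as a stand-in so that the code runs. The coefficient vectors are placeholders, not the coefficients of the method proposed in this paper, and the phase error is measured against the exact solution cos(qt).

```python
import numpy as np

def integrate(alpha, beta, q, h, n_steps):
    """Apply an explicit symmetric multistep method (sum_j alpha_j*y_{n+j} =
    h^2*sum_j beta_j*f_{n+j}, with beta_k = 0) to the test problem y'' = -q^2*y."""
    k = len(alpha) - 1
    t = np.arange(n_steps + 1)*h
    y = np.cos(q*t[:k]).tolist()            # exact starting values y_0, ..., y_{k-1}
    f = lambda yy: -q*q*yy
    for n in range(n_steps + 1 - k):
        rhs = h*h*sum(beta[j]*f(y[n + j]) for j in range(k))
        lhs = sum(alpha[j]*y[n + j] for j in range(k))
        y.append((rhs - lhs)/alpha[k])      # solve explicitly for y_{n+k}
    return t, np.array(y)

# Stand-in coefficients: the classical two-step Stormer method (not the paper's method)
alpha = [1.0, -2.0, 1.0]
beta  = [0.0,  1.0, 0.0]
t, y = integrate(alpha, beta, q=1.0, h=0.01, n_steps=2000)
print(np.max(np.abs(y - np.cos(t))))        # global error, dominated by the phase lag
```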
References 1. Lambert, J. D., and Watson, I. A. 1976, J. Inst. Math. Applic., 18, 189 Quinlan, G. D., and Tremaine, S. 1990,TheAstronomical Journal, 100, 1694 Quinlan, G.D. 2000, Resonances and instabilities in symmetric multistep methods, submitted.
2. 3.
aThe interval of periodicity is determined by the application of the symmetric multistep method to the test equation y " ( t ) = -q2 y(t). If q2h2 E (O,T:), where h is the step length of the integration, then this interval is called interval of periodicity
FINITE ELEMENT ANALYSIS FOR WEAKLY COUPLED MAGNETO
- THERMO - MECHANICAL PHENOMENA IN SHELL STRUCTURES J. K. SAKELLARIS Energy Production, Transmission and Distribution Sector Department of Engineering and Management of Energy Resources Aristotle University of Thessaloniki E-mail: [email protected]
In a previous paper it was shown that, the use of the surface impedances concept, widely used already, can contribute to the construction of a universal model for eddy currents modeling in thin layers. This is the most fundamental part for a systematic approach on constructing an appropriate type of finite element analysis for weakly coupled magneto thermo - mechanical phenomena in shell structures. At this point it would be very interesting, to remind, that this technique has been initially used for thin layers modeling. Such a structure can be represented by two surfaces with self - impedances Zaa and zbb respectively, accompanied by a transfer impedance taking into account the interdependence of the phenomena on the two surfaces. Hence, the previous paper was devoted to the presentation of this universal model, which is accompanied by a new type of elements for its implementation. A second and a third part followed. The second part dealt with one directional coupling of magnetic and thermal phenomena, whilst the third one with one directional coupling of magnetic and mechanical phenomena very briefly and this the point from which, this paper starts.
1.
Introduction
Many eddy current phenomena in thin layers can be represented by a universal model. This model is based on the fact, that the magnetic induction vector B is conservative. This property can be expressed in mathematical terms by one of the four Maxwell equations, known as the continuity equation, which applies for non - divergent vectors:
div B
=
0
The expression (1) describes the non - divergent property of magnetic induction, that is the fact, that in nature there are no, actually, magnetic charges from which start and to which end magnetic field lines. Magnetic fields derive from
557
558 currents and, consequently, their lines are closed. For magnetically linear materials the following expression is valid:
Magnetic force density = J X B 2.
(2)
Shells and Shell Elements in Mechanics Shells are thin structures which span over curved surfaces (see Figure 1).
Figure 1. Typical shell structure The forces in shells are: Membrane forces + Bending Moments, whereas in plates there are bending forces only (see Figure 2).
Figure 2. Forces and moments acting on a typical shell
559 3. Shell Theory Shell theories are the most complicated ones to formulate and analyze in mechanics and they demand strong analytical skills. In general, they are classified as: 1. Thin shell theories and 2. Thick shell theories.
4. 3.1Shell Elements: The principle on which is based the use of shell elements is exposed in the next figure:
plme stnm e1-t
plate hedw
Figure 3. Decomposition of a flat shell element in a plane stress element and a plate bending element (consequently in 2 dimensions it is valid that: bar -!simple beam element => general beam element). The Degrees Of Freedom (DOF) at each node are (see Figure 4):
560
4' Figure4. Degrees of freedom for a flat shell element node The most widely used shell elements are quadrilateral with 4 or 8 nodes. A curved shell element of 8 nodes is the most general form of a quadrilateral shell element of 8 nodes (see Figure 5):
V
Figure 5. Typical curved shell element
CONCLUSIONS Having determined which type of special element to use in the eddy current phase, depending on the skin depth, the following succession of phenomena has to be studied for correctly modeling weakly coupled magneto - thermo mechanical phenomena in shell structures: MODELING OF THE SHELL STRUCTURE
THERMAL MODELING OF THE SHELL STRUCTURE MECHANICAL MODELING OF THE SHELL STRUCTURE
A GENERATOR OF PSEUDO-RANDOM NUMBERS SEQUENCES WITH MAXIMUM PERIOD
SERGIO SANCHEZ AND REGINO CRIADO Dpto. de Matemdticas y Fisica Aplicadas y CC. de la Naturaleza, E.S.C.E.T., U.R. J . C., C/Tulipdn s/n, 28933-Madrid ,Spain E-mail: [email protected], [email protected]
CARLOS VEGA Dpto. de Matemdtica Aplicada a las Tec. de la Informacidn U.P.M., Ciudad Universitaria s/n, 28040-Madrid, Spain E-mail: [email protected] This work presents several guidelines in order t o show that it is possible to choose in an adequate manner the parameters a , b and m of the linear congruential generator zn+l = (a.zn+b) mod m in order to maximize the period of the generated series of pseudo-random numbers, in such a manner that the period obtained is close t o the theoretical upper bound of m!.
1. Introduction
There are many procedures t o generate sequences of pseudo-random numbers, see [I]for a review, which apart from appropriate randomness properties should have a long enough period for the incumbent application. One of the most popular methods for generating pseudo-random numbers is the linear congruential generator. The general formula is, z,+1
= ( a '2,
+ b)
mod m
The values of a , b and m are preselected constants. a is the multiplier, b is the increment and m is the modulus. This generator is strongly dependent upon the choice of these constants but it has properties that make it specially attractive for an extensive use, both from the statistical point of view and due t o the possibility of obtaining sufficiently long periods if the parameters a , b, m are conveniently chosen. A key property resides in the fact that the number of operations needed per bit generated is small. The choice a = 16807, b = 0 and m = 2147483647 is a very good set of parameters for 561
562
this generator. These parameters where published by Park and Miller in [2]. On the other hand, the main disadvantage of this generator is that the generated sequences are predictable (see [3] and [4]), suggesting that this generator is not adequate for cryptographic purposes. However, in those applications where random behavior is simulated, such as the search of big prime numbers, where it is necessary to check if a number is composed or ”probably” prime, this generator can be certainly used. In this paper, we shall present several guidelines to choose the parameters a, b and m in such a form that allows us to overcome such faults, because its configuration is capable of breaking the natural order in the generation of random numbers and allows us to approach a large period length, close to the theoretical upper bound of m! and, hence, higher than that of combined generators ~51. 2. Basic notation and preliminaries
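A minimal sketch of the linear congruential generator with the Park and Miller parameters quoted above (a = 16807, b = 0, m = 2147483647) is shown below; the seed value is an arbitrary choice for the example.

```python
def lcg(seed, a=16807, b=0, m=2147483647):
    """Linear congruential generator z_{n+1} = (a*z_n + b) mod m."""
    z = seed
    while True:
        z = (a*z + b) % m
        yield z / m          # values normalized to [0, 1)

gen = lcg(seed=12345)        # arbitrary seed for illustration
print([next(gen) for _ in range(5)])
```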
Let us recall that if F is a set with a finite number of elements, a generator of F is an algorithm that obtains a sequence of elements of F . A sequence {z,},?~ is periodic if there exists Ic such that x,+k = z, for all n E N. If the subsequence zo, z1,. . . , zx-1 is the period, X is called the length of the period. The sequence {x,},>O - is almost periodic if there exists m E N such that the sequence {z,},~, is periodic. It is said that { F ,f , z o } is of maximum period if the period length is equal to IF/. In the sequel, we shall concentrate in single-step generators, that is, those that can be written as z,+l = f ( x n ) ,where f is a mapping f : F + F and F is a finite set. We shall interpret the generator of F as an algorithm that produces the sequence {z,},>~ - of elements of F and we shall denote it by { F ,f , 2 0 ) .
Lemma 2.1. Let { F ,f , x o } be a generator, where F is a finite set. The sequence {z,},>~- defined through z,,+1 = f (z,) is almost periodic. In our case, F = Z, = {0,1, ...,m - 1) is a commutative ring with unit. We are interested in knowing under what conditions the affine mapping z a . z b is of cycle length m. The mapping f is bijective if and only if a is an invertible element of .Z , The mapping f k provides z HakzO ( l+ a+a2+ ...+ak-l).b. Using the notation S k ( a ) = (1+a+u2+...+ak-’) we may rewrite f k as z akz0 b . S k ( a ) . It is important to remark that for f to be of length cycle rn, fixed points should not exist. For the sake of simplicity, let b = 1. Then the equation
-
+
-
+
+
563
x = a .x
+ a has a unique solution if a # l ( m o d m). Hence, in order to
f will not have a fixed point, it is necessary that a = l ( m o d m). But, in this case, S k ( x ) = O(mod k) and f" = e , where e is the identity element. Following this reasoning, it is not difficult t o prove: Lemma 2.2. If m is a prime number, a E l ( m o d m) and b is not congruent with O(mod m), the mapping f a , b : Zm -+ Z , defined by f a , b ( x )= a . x b, where a . x b is the equivalence class modulo m corresponding t o the number a.x+b, is a cyclic permutation of order m in Z , (and hence a maximum length generator).
+
+
The single-step generator that we have seen , cannot generate sequences with period longer than IFI. Our objective is to design a generator whose period length is close t o the theoretical boundary m!,the order of a symmetric group with m elements. It is well known that if the order of the cyclic group (f) generated by f satisfies I( f)l = s, given x E Z , , we have that either x is a fked point of certain element of (f) (in this case, we denote x by x ~ )or, the set H i = { x ,f ( x ) ,..., f ' - l ( x ) } contains exactly s elements. Moreover, it is also possible t o find a sequence ~ 1 ~ x..., 2 xl , E iz, such that (2" - { X F } ) = H&
I
u ... u H&
I
where IHil = ... = [ H i l = s, and 1.s = m - 1 = q5(m),where $(m)is the Euler phi function.
3. Guidelines to select the parameters
Theorem 3.1. If a = l ( m o d m ) , then for any sequence produced according to (1) there exists a fixed point X F such that XF
= (a '
+
x ~b) mod m
Remark. If m is a prime number, then m - 1 is composed, so the period formed by the previous m - l numbers is divided in l subsets of s elements each, so that l.s=m-l=4(m). The number of possible values of s is T(+(m)) - 1, where ~ ( p y ' p ; ~ . . . p E= ~ )(a1 l)(az 1 )...(an 1 ) for a1 a 2 ( m - 1 ) = p , p , ...p,"". The number of subsets 1 is
+
1=-,
m-1 S
+
+
564
Now, for a given value of $s$, the valid values of the parameter $a$ are the solutions of the following equation:
$$a^s \equiv 1 \pmod m. \qquad (1)$$
If we use the programming language C with the option of minimum representation of 32 bits, the parameters $a$, $b$ and $m$ must satisfy the condition
$$a(m-1) + b \leq 2^{32} - 1. \qquad (2)$$
In order to select $a$, we follow the recommendations of [4] and impose that $a$ and $m$ satisfy the condition
$$0.01\,m < a < 0.99\,m. \qquad (3)$$
Also, the parameters $a$ and $m$ should be such that the partial quotients of $a/m$ are small, so that the Dedekind sum $\sigma(a,b,m)$ and the serial correlation coefficient $C \approx \sigma(a,b,m)/m$ are small. This property also favors the fulfillment of the serial criterion for pairs of numbers. Again, in order to satisfy the requirements in [4], the parameter $b$ ($k = b/m$) should be chosen to satisfy the condition
$$\lfloor k\cdot m\rfloor \leq b \leq \lceil k\cdot m\rceil, \qquad (4)$$
where
$$k = \frac{1}{2} - \frac{\sqrt{3}}{6} \approx 0.2113248654051871177454. \qquad (5)$$
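A small Python sketch (ours) of the feasibility checks implied by conditions (2)-(5); the concrete values of $m$, $a$ and $b$ below are illustrative only, while the paper's own analysis pins $m$ near 655339:

```python
import math

def check_parameters(a, b, m):
    """Check conditions (2)-(5) for a 32-bit affine congruential generator."""
    k = 0.5 - math.sqrt(3) / 6                                  # condition (5)
    fits_32_bits = a * (m - 1) + b <= 2**32 - 1                 # condition (2)
    a_in_range = 0.01 * m < a < 0.99 * m                        # condition (3)
    b_in_range = math.floor(k * m) <= b <= math.ceil(k * m)     # condition (4)
    return fits_32_bits and a_in_range and b_in_range

m = 100003                                  # illustrative modulus
b = math.floor((0.5 - math.sqrt(3) / 6) * m)
a = 20000                                   # illustrative multiplier inside (0.01m, 0.99m)
print(check_parameters(a, b, m))            # True for these illustrative values
```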
Substituting in (2) the lower bounds of (3) and (4), we obtain the following inequality for $m$:
$$m^2 - m + \lfloor k\cdot m\rfloor\cdot 10^2 - (2^{32}-1)\cdot 10^2 \leq 0.$$
Solving this inequality over the integers, we can obtain $m = 655339$ as a valid value for this parameter. In a similar way, we obtain an equation for the upper bounds of the parameters $a$ and $b$. In consequence, the choice of $m$ is restricted. For a generator of maximum period, the only way to increase the period length is to increase the value of the parameter $m$, which is bounded, or to use a combination of generators. In our case, the following procedures can be followed in order to achieve a long period. As we have $l$ subsets, the simplest way to form the series is the following: introduce a seed $x_0$ for the $i$-th subset, generate all the numbers $x$ belonging to this $i$-th subset, take a seed $x_0$ from the following subset $i+1$, and so forth until all the $x_0$ from all the $l$ subsets have been used. In this manner we can obtain all the $m$ values of $x$ in the interval from 0 to $m-1$, including the value of $x_F$. Therefore, it is possible to produce a sequence whose generation depends on:

1. The order in which we go through the subsets. The number of possible variations is $l!$, hence the period length will be $l!\cdot m$.
2. The choice of seed $x_0$ in each subset (sketched below). The number of combinations for the choice of the initial value is $s^l$, hence the period length will be $l!\cdot m\cdot s^l$.
3. The choice of subset can be random, as well as the choice of the number of elements taken from the chosen subset. This process is carried out so that the uniform law of distribution of $x$ in the interval between 0 and $m-1$ holds. In consequence, the period length will be $l!\cdot m\cdot (s!)^l$.

In this way, for the chosen parameters $a, b, m$, we can generate not only one sequence (as in the maximum-period generator case) but many different sequences. Therefore, it is possible to come closer to the theoretical upper bound of $m!$ elements. Following the generation procedures above, and if we take initially $m = 655339$, its closest prime number is 655337, so taking $m = 655337$ we have that $m - 1 = 655336 = 2^3\cdot 11^2\cdot 677$. If, for instance, $l = 2\cdot 11 = 22$, we determine $s = \frac{m-1}{l} = 29788$. The period length for each generation procedure from 1 to 3 will be:

1. $22!\cdot 6.55336\cdot 10^5 \approx 10^{21}\cdot 6.55336\cdot 10^5 \approx 7.4\cdot 10^{26}$;
2. $22!\cdot 6.55336\cdot 10^5\cdot (29788)^{22} \approx 10^{21}\cdot 6.55336\cdot 10^5\cdot 10^{98} \approx 10^{125}$;
3. $22!\cdot 6.55336\cdot 10^5\cdot (29788!)^{22} \approx 10^{21}\cdot 6.55336\cdot 10^5\cdot 10^{2{,}647{,}414} \approx 10^{2{,}647{,}441}$.
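A compact Python sketch (ours) of generation procedure 2 with a small toy prime, not the production parameters: the orbit structure of $f$ splits $\mathbb{Z}_m\setminus\{x_F\}$ into $l$ subsets of size $s$, the subsets are visited in a shuffled order, and a random seed is used inside each subset.

```python
import random

def subsets_of_orbits(a, b, m, s):
    """Split Z_m without its fixed point x_F into orbits of f(x) = (a*x + b) % m."""
    f = lambda x: (a * x + b) % m
    x_fixed = next(x for x in range(m) if f(x) == x)
    remaining = set(range(m)) - {x_fixed}
    orbits = []
    while remaining:
        x, orbit = min(remaining), []
        for _ in range(s):
            orbit.append(x)
            x = f(x)
        orbits.append(orbit)
        remaining -= set(orbit)
    return x_fixed, orbits

def long_period_sequence(a, b, m, s, rng=random):
    """Procedure 2: visit the l subsets in a random order, random seed in each."""
    f = lambda x: (a * x + b) % m
    x_fixed, orbits = subsets_of_orbits(a, b, m, s)
    out = [x_fixed]
    for orbit in rng.sample(orbits, k=len(orbits)):
        x = rng.choice(orbit)
        for _ in range(s):
            out.append(x)
            x = f(x)
    return out

# Toy example: m = 31 (prime), m - 1 = 30 = l*s with l = 5, s = 6;
# a = 6 satisfies a^s = 1 (mod m) with a != 1 (mod m).
seq = long_period_sequence(a=6, b=1, m=31, s=6)
print(sorted(seq) == list(range(31)))   # every residue 0..30 appears exactly once
```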
To see the degree of approximation to the theoretical upper bound, we can apply Stirling's formula to approximate the value of $655337!$ and compare its order of magnitude with the period lengths obtained above.
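For instance, a two-line Python check (ours) of this comparison uses the log-gamma function, which for arguments this large is numerically equivalent to Stirling's formula:

```python
import math

m = 655337
log10_m_factorial = math.lgamma(m + 1) / math.log(10)   # log10(m!) via ln Gamma(m+1)
print(f"m! ~ 10^{log10_m_factorial:.0f}")                # roughly 10^(3.5 million), well above 10^2,647,441
```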
The generator with the parameters mentioned above has been subjected to a series of statistical tests. These do not suffice to judge its statistical properties; therefore, in order to use the generator extensively in practical cases, it would be more appropriate to try it with batteries of 15 or more significant tests ([7]) to allow for a better justified choice of the parameters $a, b$.
References
1. P. L'Ecuyer, Random Number Generation, in J. Banks (ed.), Handbook of Simulation, John Wiley & Sons, 1998.
2. S.K. Park and K.W. Miller, Random Number Generators: Good Ones are Hard to Find, Comm. of the ACM 31(10), 1192-1201 (1988).
3. J. Boyar, Inferring sequences produced by pseudo-random number generators, J.A.C.M. 36, 129-141 (1989).
4. A.M. Frieze et al., Reconstructing truncated integer variables satisfying linear congruences, SIAM Journal on Computing 17(2), 262-280 (1988).
5. B. Schneier, Applied Cryptography, John Wiley & Sons, 1996.
6. D.E. Knuth, The Art of Computer Programming, Vol. 2, Addison-Wesley, 1981.
7. G. Marsaglia, The Marsaglia Random Number CDROM, including the DIEHARD Battery of Tests of Randomness, Department of Statistics, Florida State University, Tallahassee, Florida, 1995.
NEAR FORCE-BALANCED CUTTING: KEY TO INCREASE PRODUCTIVITY IN MACHINING P. SASAVAT, N. GINDY, J. F. XIE, AND A. T. BOZDANA School of Mechanical, Materials, Manufacturing Engineering and Operations Management, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
E-mail: e~,[email protected]

A near force-balanced system has been developed to reduce the deterioration of part quality occurring during the machining process by proposing multiple active tools cutting simultaneously in different directions and/or the utilization of active/passive tools (supports) operating simultaneously. FEM is used to prove the feasibility of this idea by considering an aluminum hollow cylinder that is fixed at one end to represent a turning process of a workpiece clamped in a chuck. Weakening the material property by reducing the value of Young's modulus is selected as the material removal technique for this paper. A parametric study is also performed to analyze the effect of changing some relevant parameters, whereas experimental validation of this concept on a twin turning (CNC) machine is underway.
Introduction
In today's competitive manufacturing industry, increasing productivity without causing any further problems in quality is a major concern. Many approaches have been advocated towards maximizing productivity whilst maintaining or even improving part quality. Increasing the material removal rate is one of the major approaches to increase productivity, which is normally accomplished by directly maximizing or varying process parameters like depth of cut, feed rate, cutting speed, and so on. For instance, El-Karamany [1] proposed a computerized method for determination of optimum cutting variables in turning long workpieces by varying the static stiffness of the machine-workpiece along the workpiece axis. Kops et al. [2] presented an equilibrium relationship between depth of cut and workpiece deflection by providing a ratio between the planned and the actual depth of cut to compensate for the deflection. Besides the analytical methods mentioned above, numerical techniques like FEM have increasingly become effective tools in machining simulation since the 1970s, as can be seen from many references considering Eulerian, Lagrangian, or Arbitrary Lagrangian-Eulerian (ALE) techniques. Surprisingly, most of these works consider the simulation of specific cutting processes, chip formation, failure and tool wear, thermal aspects, or residual stress in machining. Only a few are available on FEM simulation/optimization of deflection/deformation of the workpiece in machining. For example, Phan et al. [3] proposed a finite-element model with closed-form solutions to workpiece cross-section deflections in
turning, taking into consideration all cutting force components and three workpiece mounting types. An improved model was presented later with an analysis of the coupling between the cutting force and the actual depth of cut, in which part holder stiffness and shear deformation effects were taken into account [4]. Though these approaches have been used to keep deflection/deformation within the prescribed limits, they do have a practical limit and sometimes produce more parts that fail to meet the required specification, particularly in specific cases like production of thin-walled or asymmetric components. Thus, to overcome this drawback, a near force-balanced technique (expanding the concept of the twin turning technique) has been brought into attention together with FEM to prove the feasibility of this approach.

Methodology (An Overview)

The proposed near force-balanced model aims primarily to increase the material removal rate and to reduce the deterioration of the parts by means of the FEM technique. A commercial FE pre- and post-processing software developed by MSC is used in conjunction with the ABAQUS FE solver to perform the main tasks within this study. First, an aluminum hollow cylinder is generated directly from the preprocessor and meshed with twenty-noded hex elements with full integration, which gives reasonable accuracy with smooth facets while maintaining an affordable computational expense. A boundary condition is set to fix one end of the cylinder in all translations to represent a chuck mounting. In the aforementioned studies of Phan et al. [3-4], a process-independent force model proposed by Stephenson was used to evaluate the components of the cutting force. Stephenson [5] proposed a model for simulating cutting forces, based on empirical equations for forces or pressures normal and parallel to the rake face of the tool, and claimed to be process independent. Moreover, some methods like abductive networks [6], genetic algorithms, neural networks, genetically modified neural optimization algorithms [7], or even commercial software like ThirdWave AdvantEdge were also found to be used for determining the cutting force in machining instead of performing experiments. In this paper, forces are approximated and input, upon the element faces, by their equivalent (non-uniform) distributed load, which can be obtained either from any prediction method available in the literature or by experiment. Together with a subroutine written in Fortran, the approximated force can be moved spatially and varied as a function of position, time, element
number, and other parameters as prescribed in the subroutine. Figure 1 represents the test model of a hollow aluminum cylinder.
Figure 1: Near Force-Balanced Test Model
Despite taking into account the machining forces, material removal contributes strongly to FEM of machining processes. Many approaches have been advocated towards attempting to virtually represent this process, mostly by volumetric representation like volume buffers, octree/quadtree representations, or voxel representations [8]. However, reducing the value of Young's modulus to a certain figure allows stresses to be transferred to the adjacent elements. This represents the weak material property of the elements, which are later treated as inactive elements that are no longer considered in any further calculations. A very clear outstanding advantage of this technique is that there is no pre-defined list of elements to be removed, as it is accomplished automatically by the supporting subroutines. Therefore, it could be used even in cases where a very high degree of accuracy is required, as long as the user can afford the expensive computational time. Although this technique has been found to facilitate the simulation of material removal, since it gives no physical element removal, care should be exercised. Thereafter the near force-balanced technique is developed, either by proposing active tools cutting simultaneously in different directions or by utilizing active and passive tools (such as a guide-roller) operating simultaneously in the same direction. Contact modeling must also be implemented to simulate the usage of passive tools, and in this case no material removal occurs. A parametric study is performed with the aim to clearly analyze the effects of changing parameters such as thickness, diameter, and length of the workpiece. Results are compared between the alternative equilibrium approaches themselves and also with the conventional method where only one cutting tool is applied. The comparison of the results with laboratory experiments on a twin turning machine is, however, in progress.

Conclusion
This paper describes strategies to develop a virtual near force-balancing technique by the utilization of FEM. The methodology developed in this study offers a range of potential advantages to machining applications, particularly in the machining of thin-walled or asymmetric components. Some of the advantages gained from this idea are as follows: 1) No or limited requirements of support fixturing, particularly in machining of thin-walled or asymmetric components. 2) Cutting forces are balanced, thus maintaining the geometric quality of the workpiece in terms of global deformation. 3) A possible twofold increase in material removal rate could be achieved in the case of utilizing multiple active tools; this is not the case where passive tooling is used, which does however maintain the aforesaid advantages.

References
1. El-Karamany, Y., Turning long workpieces by changing the machining parameters. Int. J. of Mach. Tool Des. Res., 1984. 24(1): p. 1-10.
2. Kops, L., Gould, M., and Mizrach, M., A search for equilibrium between workpiece deflection and depth of cut: Key to predictive compensation for deflection in turning. Manuf. Science Eng., 1994. 68-2: p. 819-825.
3. Phan, A.-V., Cloutier, G., and Mayer, J. R. R., A finite element model with closed-form solutions to workpiece deflections in turning. Int. J. Prod. Res., 1999. 37(17): p. 4039-4051.
4. Phan, A.-V., Baron, L., Mayer, J. R. R., and Cloutier, G., Finite element and experimental studies of diametral errors in cantilever bar turning. Applied Mathematical Modelling, 2003. 27(3): p. 221-232.
5. Stephenson, D.A. and Bandyopadhyay, P., Process-independent force characterization for metal cutting simulation. Concurrent Product and Process Engineering, 1995. 1: p. 15-36.
6. Lin, W.S., Lee, B.Y., and Wu, C.L., Modeling the surface roughness and cutting force for turning. J. Mat. Proc. Tech., 2001. 108(3): p. 286-293.
7. Govender, E., An intelligent deflection prediction system for machining of flexible components, Manufacturing Engineering, 2001, University of Nottingham.
8. Jang, D., Kim, K., and Jung, J., Voxel-Based Virtual Multi-Axis Machining. Int. J. Adv. Manuf. Technol., 2000. 16: p. 709-713.
SYMMETRY FORMATION PRINCIPLES OF THE CHEMICAL COMPUTER SOFTWARE L. P. SCHULZ Adam Mickiewicz University, Department of Chemistry, Grunwaldzka 6, 60-256 Poznan, Poland E-mail: [email protected]
A way leading from basic structural features in chemistry to chemical numbers and arithmetics has been shown. One of the goals of this work was a query about the conditions under which self-assemblage symmetries are formed and collapse. It turns out that optimization through some criteria of quality results in the determination of specific symmetry datum-levels, the leaving of which launches the formation of chemically essential symmetry designs. The process can be measured by means of fuzzy set based characteristics called overlapping and splitting, which additionally detect collapses (catastrophes) of symmetries and chemical numbers to the very elementary forms. An enrichment of the chemical computer software has been achieved owing to novel relationships between the structures underlying chemical processing and the corresponding arithmetics of chemical numbers. Besides, certain mathematical groups have been specified for molecules of non-complex compounds.
1. Introduction

1.1. Chemical Arithmetics and Computing Through Self-Assemblage Symmetry Transformations

A way of computation specific to chemistry emerged along with the concept of the chemical computer ([1]-[3]). Computers of this kind perform operations on chemical numbers, and this occurs each time a chemical reaction takes place. Such a computation can be expressed by a general formula, where $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_k$ are respectively the reagents and products in a reaction, and $\mathrm{Tp}X_i$ ($\mathrm{Tp}Y_j$) denote chemical numbers, which are types of equivalence classes of underlying structures, e.g., types [2] of the classes of isomorphic self-assemblage chemical spaces. The structures underlying chemical processes turn out to be governed by specific symmetries beyond the geometric standards such as point or crystallographic groups. This work unveils the crucial role of the self-assemblage symmetries by which any chemical processing is supported. Additional light has been shed on the relationships between chemical numbers through measurements of the formation of symmetries and the accompanying chemical compound syntheses. In particular, certain datum-levels have been established for assessments of deflexions under optimized standards.

1.2. General Preliminaries
Relating the processes of chemical self-assemblage to some specific symmetries affords the possibility of measuring phenomena that underlie real chemical processes. Such measurements can be carried out using the mathematical tools briefly characterized below. Let Boolean rings of the form $(B,+,\cdot)$ be taken into account. They are commutative rings with idempotent multiplication (i.e., $a\cdot a = a$). For any finitely additive measure $\mu$ on $B$, there exists a useful pseudo-metric $\rho$ such that, for $(a,b) \in (A\times A)\setminus(\mu^{-1}(\{0\})\times\mu^{-1}(\{0\}))$ and fixed $C'$ and $C''$ with $0 < C' \leq C''$, $\rho(a,b)$ is given by formulas (1) and (2), and
$$\rho(a,b) = 0 \qquad (3)$$
if $a$ and $b$ are of measure 0, i.e., $(a,b) \in \mu^{-1}(\{0\})\times\mu^{-1}(\{0\})$. The pseudo-metric determined in (1) and (2) is a metric if and only if the set of elements of measure zero consists of only one element. Commonly, Boolean rings are taken as collections of sets that are closed under the symmetric difference ($\triangle$) and intersection ($\cap$) operations. The parameters $C'$ and $C''$ of formula (1) are, as usual, equalized, i.e., $C' = C'' = 1$. It is easily seen that the pseudo-metric (2), (3) can be considered as a membership function on pairs of measurable sets. The top of this membership ($=1$) means that sets (of non-zero measure) are maximally differentiated, and the minimum ($=0$) denotes that the compared objects are maximally alike. Such a membership function is regarded here as a fuzzy set. Consider an arbitrary family of sets $F \subset \mathcal{F}$, where $\mathcal{F}$ is a Boolean ring. Characterization of general relationships between the members of family $F$ needs the usage of constituents of the sets of family $F$. Such constituents are given, for any mapping $a_\cdot: S \to \mathrm{Pow}\,E$ (where $\mathrm{Pow}\,E$ denotes the class of all subsets of $E$), in the following way:
$$\Phi(a_\cdot, E) = E/r \cup \mathrm{Emp}(a_\cdot, E) \qquad (4)$$
where $E/r$ is the set of equivalence classes under the relation $r$ of equal neighbourhoods, i.e.,
$$r = r(a_\cdot, E) \overset{\mathrm{df}}{=} \{ (x,y) \in E\times E \mid \mathrm{Nb}(x,\mathrm{im}\,a_\cdot) = \mathrm{Nb}(y,\mathrm{im}\,a_\cdot) \}. \qquad (5)$$
The symbol $\mathrm{im}\,a_\cdot$ ($\mathrm{dom}\,a_\cdot$) denotes the image (domain) of the mapping $a_\cdot$. Thus, two elements $x$ and $y$ fulfil relation (5) iff the class of members of $\mathrm{im}\,a_\cdot$ containing $x$ ($\mathrm{Nb}(x,\mathrm{im}\,a_\cdot)$) equals the class of members of $\mathrm{im}\,a_\cdot$ containing the element $y$. The class $\mathrm{Emp}(a_\cdot,E)$ of (4) can only be the empty set $\emptyset$ or the one-element set $\{\emptyset\}$. The last case occurs if and only if there exists a dichotomic partition $(S',S'')$ of the domain of the mapping $a_\cdot$ (i.e., $S'\cup S'' = \mathrm{dom}\,a_\cdot$ and $S'\cap S'' = \emptyset$) such that condition (6), relating the unions of neighbourhoods taken over $S'$ and over $S''$, holds. If such a partition does not exist, then
$$\mathrm{Emp}(a_\cdot, E) = \emptyset \qquad (7)$$
and the constituents $\Phi(a_\cdot,E)$ are called independent provided that $\emptyset \neq E \neq \{\emptyset\}$. Members of class (4) cover the set $E$ and they are mutually disjoint (a small computational illustration is given below).
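A minimal Python illustration (ours) of the constituents of a finite family of sets, computed directly from definition (5): two points are equivalent iff they belong to exactly the same members of the family.

```python
from collections import defaultdict

def constituents(family):
    """Constituents of a finite family of sets: classes of points of the union
    that belong to exactly the same members of the family (definition (5))."""
    universe = set().union(*family)
    classes = defaultdict(set)
    for x in universe:
        neighbourhood = frozenset(i for i, s in enumerate(family) if x in s)
        classes[neighbourhood].add(x)
    return list(classes.values())

family = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6}]
print(constituents(family))   # five classes: {1, 2}, {3}, {4}, {5}, {6} (order may vary)
```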
The constituents $\Phi F$ of an arbitrary family of sets $F$ are sometimes defined on the union $\bigcup F$ of the members of $F$, i.e.,
$$\Phi(\mathrm{I}_F, \textstyle\bigcup F) \overset{\mathrm{df}}{=} \Phi F. \qquad (8)$$
Characterizations of collections of sets can efficiently be performed by a transformation of them into relations. For example, if the classes $F$ and $P$ are finite and included in the Boolean ring $\mathcal{F}$, then the required relation is of the form (9), where $c^2$ denotes $c\times c$ for short and
$$W(F,P) \overset{\mathrm{df}}{=} \{ (b,b') \in P\times P \mid b \neq b' \text{ and } \mu(a\cap b) \neq 0 \}. \qquad (10)$$
Two families of sets $F$ and $P$ have to be compared. The way to reach this goal is through a transformation of them into relation (9). Each element of class $F$ is transformed into a relation of the form $a\times a$, and then certain sub-relations induced by elements of $P$ are excluded from each $a\times a$. Generally, we have a class $W(F,P) \subset P\times P$ of pairs $(b,b')$ to which a relation $(a\cap b)\times(a\cap b')$ is ascribed. The relation serving for the subsequent exclusions is, however, the union of Cartesian products of the domains, i.e., the union of products of the form
$$\mathrm{dom}\big((a\cap b)\times(a\cap b')\big) \times \mathrm{dom}\big((a\cap b)\times(a\cap b')\big). \qquad (11)$$
For the particular case when $P$ consists of the constituents of $F$, i.e., $P = \Phi F$, the following fuzzy set characteristics determine the nature of family $F$:

- overlapping
$$o \overset{\mathrm{df}}{=} \rho\big(g(F,\Phi F),\, g(\Phi F,\Phi F)\big); \qquad (12)$$
- splitting, $s$, defined analogously in (13).

The maximal overlapping value is 1, and then each member $a$ of $F$ such that $\mu a \neq 0$ includes at least two constituents of non-zero measure. If the minimization of overlapping reaches 0, then every set of $F$ includes at most one constituent of non-zero measure. Evaluation of splitting (13) leads to the inverse situation, i.e., the sets of $F$ are maximally separated with respect to their unique constituents of non-zero measure at $s = 1$, and at $s = 0$ they are maximally overlapped. However, the intermediate and certain extremum evaluations are independent of each other. Therefore, criteria of quality have been taken in the form of functions $f(o,s)$ whose arguments are splitting and overlapping simultaneously. It is supposed that an enrichment of the information on the systems investigated can be obtained in this way.

2. Symmetry Formation and Collapses
2.1. Compensating Systems
A self-assemblage chemical space (SAC space) is a system $(A,\lambda,\nu,\cdot)$ such that $\lambda$ evaluates elements of $A$ as atomic numbers, $\nu$ (the "bonding force" operation) maps some class of subsets of $A$ into $\mathbb{N}$ (where $\mathbb{N}$ is the set of non-negative integral numbers), and $\cdot$ (the "bonding" operation) is a partial mapping from the Cartesian product of subsets of $A$ ($\mathrm{Pow}A\times\mathrm{Pow}A$) into $\mathbb{N}$. The exact determination of SAC spaces by a system of axioms is contained in [3]. This system has previously been used for the construction of an arithmetic supporting the concept of chemical computers [1]. Alternative systems are considered herein such that they are equivalent to the categories of SAC spaces. A compensating system $(A,\lambda,\nu,\kappa)$ (cf., e.g., [4]) consists of an underlying set $A$ and functions $\lambda$, $\nu$, and $\kappa$. The set $A$ and the operations $\lambda$ and $\nu$ are characterized by the same axioms as for SAC spaces. The assignment of a self-assemblage chemical space to a given compensating system, and conversely, can be proved by showing the existence of surjectors from the categories of compensating systems onto the categories of SAC spaces. In the first step, this is realized by the following assignment
$$\mathrm{Obj}\,C_{\mathrm{comp}} \ni (A,\lambda,\nu,\kappa) \;\mapsto\; (A,\lambda,\nu,\overset{\kappa}{\cdot}) \in \mathrm{Obj}\,C_{\mathrm{SAC}}, \qquad (14)$$
where the needed transformation [4] of the compensating function $\kappa$ into the bonding operation $\overset{\kappa}{\cdot}$ is given by the formula
$$u \overset{\kappa}{\cdot} w = \big|\kappa \cap \mathbb{I}^{\nu}_{u,w}\big| \qquad (15)$$
for any sets $u, w$ of $\mathrm{dom}\,\nu$. An example has been depicted below as an illustration of transformation (14).

Figure 1. A transformation of the compensating system of formaldehyde into its SAC space
Unions of the sets $\mathbb{I}_u^{\nu} \overset{\mathrm{df}}{=} \{0,\ldots,\nu u - 1\}\times\{u\}$, for $\nu u \neq 0$, are commonly used for building the domain and image of compensating functions. Thus, the compensating function $\kappa$ is a partial injective mapping on $\bigcup_{u\in\mathrm{dom}\,\nu}\mathbb{I}_u^{\nu}$, and $\mathbb{I}^{\nu}_{u,w}$ denotes Cartesian products of sets like $\mathbb{I}_u^{\nu}$, i.e.,
$$\mathbb{I}^{\nu}_{u,w} \overset{\mathrm{df}}{=} \mathbb{I}_u^{\nu}\times\mathbb{I}_w^{\nu}, \qquad (17)$$
where $u$ and $w$ are elements of the domain of the function $\nu$. Finally, all the axioms determining compensating systems of the form $(A,\lambda,\nu,\kappa)$ have been listed below for full clarity.

U1) The axiom of finiteness of the underlying set $A$ claims that $A$ is of cardinality $n\in\mathbb{N}$.

CB1) The axiom of excluded inclusion
$$\forall (u,w)\in\mathrm{dom}\,\nu\times\mathrm{dom}\,\nu:\ \big((u\subseteq w \,\vee\, w\subseteq u) \Rightarrow \kappa\cap\mathbb{I}^{\nu}_{u,w} = \emptyset\big). \qquad (18)$$
This axiom excludes pairs of sets of $\mathrm{dom}\,\nu$ that are included in one another and related by the compensating function $\kappa$.

CB2') The axiom of the orientations of $\kappa$ relationships
$$\forall (u,w)\in\mathrm{dom}\,\nu\times\mathrm{dom}\,\nu:\ \mathrm{dom}\big(\kappa\cap\mathbb{I}^{\nu}_{u,w}\big)\neq\emptyset \;\Rightarrow\; \mathrm{im}\big(\kappa\cap\mathbb{I}^{\nu}_{u,w}\big)\subseteq\mathrm{dom}\,u, \qquad (19)$$
where the symbol "im" denotes the image of the relation indicated.

CB2'') The axiom of the symmetry of $\kappa$ relationships
$$\kappa\cap(\mathrm{dom}\,\kappa\times\mathrm{dom}\,\kappa) \subseteq \kappa^{-1}. \qquad (20)$$
The function $\kappa$ reduced to the Cartesian product of its domain turns out to be symmetric owing to the property expressed by axiom CB2'', i.e., if a pair $(x,y)$ fulfils $\kappa$ and $y$ is of its domain, then the pair $(y,x)$ does also fulfil it.

BF1) The axiom of the empty set
$$\emptyset \notin \mathrm{dom}\,\nu. \qquad (21)$$
BF2) The axiom of one-element classes (i.e., elements of the set $A/\mathrm{I}_A$)
$$A/\mathrm{I}_A \subseteq \mathrm{dom}\,\nu. \qquad (22)$$
BF3) The axiom of separation (23).
AN1) The axiom of the fixed operation $\lambda$
$$\lambda \subset \Lambda. \qquad (24)$$
The function $\Lambda$ has been established on a Grothendieck universe. For example, Grothendieck universes [5] can be used as a source of well-defined sets, which gives clear insight into a purposeful selection of derivations and of the needed results. In other words, the information inputs in the axiomatic systems pursued become clearer. Owing to the fixed function $\Lambda$, each element of the underlying set $A$ of a SAC space or compensating system has a fixed atomic species number. Similarly, physically existing atoms are classified on the grounds of the number of protons contained in their nuclei. More details on the Grothendieck universes are considered later on.

CB02) The axiom of the resultant compensation
$$\forall u\in\mathrm{dom}\,\nu\setminus A/\mathrm{I}_A:\ \big(\forall w\in\mathrm{dom}\,\nu,\ w\subseteq u,\ w\neq u:\ \mathbb{I}_w^{\nu} \subseteq \mathrm{dom}\,\kappa\cup\mathrm{im}\,\kappa\big) \Rightarrow \nu u = 0. \qquad (25)$$
CZ1) The axiom of zero-valued $\lambda$ classes (i.e., classes of elements of $A$ such that $\lambda x = 0$)
$$\forall u\in\mathrm{dom}\,\nu:\ \lambda(u)\subseteq\{0\} \Rightarrow (\mathrm{dom}\,\kappa\cup\mathrm{im}\,\kappa)\cap\mathbb{I}_u^{\nu} = \emptyset. \qquad (26)$$
2.2. Criteria of Quality for Measurements of Symmetry Formation and Collapses

Let the symmetrized SAC space of formaldehyde be taken into account, i.e., none of the arrows of Figure 1 should have a distinguished orientation, which means that the compensating function $\kappa$ equals its inverse. There is a naturally emerging stabilizer (isotropy group [6]) $\mathcal{H}_P$ in the symmetric group on the disjoint union $\bigcup_{u\in\mathrm{dom}\,\nu}\mathbb{I}_u^{\nu}$, yielded by the members $\mathbb{I}_u^{\nu}$ of the canonical partition $P$. It turns out that the equality or inequality of this stabilizer with its conjugate $\mathcal{H}_P^{\kappa}$ provides the necessary and sufficient condition for the formation of self-assemblage symmetries, and consequently conditions for building chemical structures. This can schematically be pictured as follows:
$$\mathcal{H}_P = \mathcal{H}_P^{\kappa}\ \leftrightarrow\ \text{formation of self-assemblage symmetries}, \qquad \mathcal{H}_P \neq \mathcal{H}_P^{\kappa}\ \leftrightarrow\ \text{collapses of chemical structures and numbers}. \qquad (27)$$
Moreover, considering the orbits of the stabilizers involved, i.e.,
$$F_{P,\kappa} \overset{\mathrm{df}}{=} \mathrm{Orb}\,\mathcal{H}_P \cup \mathrm{Orb}\,\mathcal{H}_P^{\kappa}, \qquad (28)$$
one can easily determine the overlapping and splitting ratio $o/s$ by formulas (12) and (13) for the family of orbits just defined in (28). The ratio $o/s$, under some soft assumptions that are not essential here, equals 0 if and only if the stabilizer $\mathcal{H}_P$ and its conjugate $\mathcal{H}_P^{\kappa}$ are equal. The conjugate $\mathcal{H}_P^{\kappa}$ is, of course, again a stabilizer of its orbits. Thence, it is supposed that non-zero $o/s$ evaluations can serve as a measure of the diversification of stabilizers, as shown in (27). It has additionally been proved that a molecular system including more than one atomic species breaks up into atoms or at most two-atomic molecules along with the nullification of the overlapping and splitting ratio. Therefore, this gave reason for the construction of some measures suitable for assessments of chemical processes [7]. The algorithms for efficient computations are not as direct as the ones outlined above, but the principle is the same. Destroying relationships between certain stabilizers as in (27), the formation of self-assemblage symmetries becomes possible in atomic and two-atomic systems. A crucial role of the diversification of stabilizers in the formation of chemically significant symmetries can also be elicited in other cases of criteria of quality. For example, the summation $o+s$ of overlapping and splitting applied to the domain of the bonding force operation $\nu$ in SAC spaces of the form $(A,\lambda,\nu,\cdot)$ ($|A|>1$) leads to an optimization resulting in the minimum value equal to 1. Again, the minimum can be reached if and only if certain stabilizers are equal, i.e.,
$$o + s = 1 \iff \mathcal{H}_{\mathrm{dom}\,\nu\setminus(A/\mathrm{I}_A)} = \mathcal{H}_{\{A\}}, \qquad (29)$$
where $\mathcal{H}$ is the symmetric group on $A$ acting naturally on its subsets (i.e., $f\cdot a = f(a)$), and $\mathrm{dom}\,\nu\setminus(A/\mathrm{I}_A)$ denotes the set of many-element members of the domain of the bonding force operation $\nu$. What is the effect of the diversification of stabilizers (29) in this case? For complex compounds, relationship (29) is possible if and only if the complex is neutral, i.e.,
$$\mathrm{dom}\,\nu\setminus(A/\mathrm{I}_A) = \{A\}. \qquad (30)$$
Any perturbation of the equality just indicated causes, e.g., the instability of complexes against ionizing solvents. Platinum-coordinated ammine complexes provide the classical example, in which the case of neutrality expressed in (30) implies the complete fall of electro-conductivity in water solutions [8].

3. Self-Assemblage Group Attributions to Molecules
3.1. Determination of Algebraic Groups of Non-Complex Molecules

The measurements of self-assemblage symmetry formation mentioned before raise the question of the explicit exposition of the nature of the groups arising together with molecular self-organization processes. They will be defined here for non-complex symmetrized molecules. The feature of these groups is that they are unambiguously determined for any fixed equivalence class of isomorphic molecules (SAC spaces) in question. There is a fairly general mathematical basis for such determinations. First, chemical groups are defined in Grothendieck universes [5], which are classes (denoted $U$) with the following properties:
1. $\emptyset\in U$, and $\forall a,b:\ a\subseteq b\in U \Rightarrow a\in U$;
2. $\forall a,b\in U:\ \mathrm{Pow}\,a\in U,\ a\cup b\in U$, and $\bigcup a\in U$;
3. for each function $f$, if $\mathrm{dom}\,f\in U$, then $\mathrm{im}\,f\in U$. $\qquad$ (31)
The mapping $\Lambda: U\to\mathbb{N}$ (cf. (24)) is assumed to underlie the operation of atomic numbers mentioned before. It is treated here as a general classifier, as shown below. The structure of any molecule of a non-complex compound can be expressed as a group with two generators. More exactly, this is a system $(B,\cdot,x,y)$ such that
1. $B$ is a finite set;
2. $x$ ($\neq e$) is an involution and $x\neq y$;
3. for each $a\in B$ and $i\in\mathbb{N}$, $x\cdot a \neq y^i\cdot a$;
4. for each orbit $a\in\mathrm{Orb}\,\{y\}^{-}$, if $\alpha,\beta\in a$, then $\Lambda\alpha = \Lambda\beta$;
5. $B$ is generated by the set $\{x,y\}$, i.e., $\{x,y\}^{-} = B$;
6. $(B,\cdot)$ is a transitive group, i.e., it has only one orbit (under the action by the group operation $\cdot$). $\qquad$ (32)

Thus, such a group consists of finite products of the mutually interchanged involution $x$ and elements of the group $\{y\}^{-}$ generated by the element $y$, i.e.,
$$B = \Big\{\, y_n \cdot \prod_{r=1}^{n-1}(x\cdot y_r) \;\Big|\; y_{\cdot}: \{1,\ldots,n\}\to\{y\}^{-},\ n\in\mathbb{N}\,\Big\}. \qquad (33)$$
It is an easy task to recover the corresponding SAC space and to find which molecules are encoded in the group $(B,\cdot)$. For a group $(B,\cdot,x,y)$ fulfilling conditions (32), this transformation is given as follows (a small computational sketch is given after the list):
1. The underlying set consists of the orbits
$$A \overset{\mathrm{df}}{=} \mathrm{Orb}\,\{y\}^{-}. \qquad (34)$$
2. For each $a\in\mathrm{Orb}\,\{y\}^{-}$,
$$\lambda a \overset{\mathrm{df}}{=} \Lambda(a). \qquad (35)$$
3. For each $a\in\mathrm{Orb}\,\{y\}^{-}$,
$$\nu a \overset{\mathrm{df}}{=} |a|. \qquad (36)$$
4. For each $a, b\in\mathrm{Orb}\,\{y\}^{-}$,
$$a\cdot b \overset{\mathrm{df}}{=} \big|\{(\alpha,\beta)\in a\times b \mid x\cdot\alpha = \beta\}\big|. \qquad (37)$$
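A minimal Python sketch (ours) of the bookkeeping in (34)-(37): the generators x and y are assumed to be given as permutations of the finite set B (represented as dicts), and the classifier Lambda as a dict of atomic numbers; this only illustrates the transformation, not the full theory.

```python
def orbits_of_y(B, y):
    """Orbits of the cyclic group generated by the permutation y, cf. (34)."""
    remaining, orbits = set(B), []
    while remaining:
        start = next(iter(remaining))
        orbit, el = [], start
        while el not in orbit:
            orbit.append(el)
            el = y[el]
        orbits.append(frozenset(orbit))
        remaining -= set(orbit)
    return orbits

def sac_space(B, x, y, atomic_number):
    """Transformation (34)-(37) of a two-generator group into a SAC space."""
    A = orbits_of_y(B, y)                                   # (34) underlying set
    lam = {a: atomic_number[next(iter(a))] for a in A}      # (35) classifier value on each orbit
    nu = {a: len(a) for a in A}                             # (36) bonding force
    bond = {(a, b): sum(1 for alpha in a for beta in b if x[alpha] == beta)
            for a in A for b in A}                          # (37) bonding operation
    return A, lam, nu, bond
```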
The SAC space $(\mathrm{Orb}\,\{y\}^{-},\lambda,\nu,\cdot)$ constitutes the required object. For each non-complex, symmetrized molecular structure (SAC space), there exists a uniquely determined class of groups with properties (32), which can be transformed by means of (34)-(37) into a SAC space isomorphic to the source structure. It is worth noting that essential group structures emerge only after leaving the optimized level with minimum values of the criteria of quality supported by the $o/s$ ratio (or by a minimum of the $o+s$ summation as well). Due to the above considerations, chemical arithmetics ([1],[2]) have been revealed as grounded in self-assemblage symmetries and their transformations.

3.2. Generalized Species of Structures Support of Chemical Numbers and Arithmetics - A Theorem On a Priori Partitions

A category [5] is treated here as a generalized group, i.e., it is a multiplicative system $(C,\cdot)$ where the result of the multiplication $y\cdot x\in C$ if it exists. Commonly, not all pairs of $C\times C$ can be multiplied. The role of neutral elements is played by the
so-called objects, which are elements of $C$ such that, if $a\cdot x$ ($x\cdot a$) exists, then $a\cdot x = x$ ($x\cdot a = x$). A multiplicative system $(C,\cdot)$ is a category if and only if
1. for each $x\in C$, there exists an object $b\in C$ such that $b\cdot x = x$, and there exists an object $a\in C$ such that $x\cdot a = x$;
2. for each $x,y,z\in C$: ($z\cdot(y\cdot x)$ exists or $(z\cdot y)\cdot x$ exists) iff ($y\cdot x$ and $z\cdot y$ do exist, and $(z\cdot y)\cdot x = z\cdot(y\cdot x)$). $\qquad$ (38)

All bijections of the universe (31) constitute a category, where multiplication of arrows is defined for compositions of the form $g\circ f$ for any pair $(g,f)$ of $\mathrm{bij}\,U\times\mathrm{bij}\,U$ such that $\mathrm{im}\,f = \mathrm{dom}\,g$. In the category theory sense, the species of structure is the functor [5], [9]
$$T: \mathrm{bij}\,U \to \mathrm{bij}\,U. \qquad (39)$$
For the purposes of chemistry, functor (39) has been generalized here by considering universes with a fixed equivalence relation $r\subseteq U\times U$. Define the class
$$\mathrm{bij}(U,r) \overset{\mathrm{df}}{=} \{ f\in\mathrm{bij}\,U \mid f\subseteq r \}. \qquad (40)$$
It is easily seen that composition of the bijections of (40), in the way mentioned before, constitutes a subcategory of $\mathrm{bij}\,U$. Let it be noted that for the total equivalence $U\times U$ one recovers the previous category, since
$$\mathrm{bij}(U, U\times U) = \mathrm{bij}\,U. \qquad (41)$$
Thus, the generalized species of structure is any covariant functor
$$T: \mathrm{bij}(U,r) \to \mathrm{bij}(U,r). \qquad (42)$$
The equivalence relation $r$ suitable for chemical purposes will be introduced by means of strongly inaccessible cardinals [10], [11]. The class of ordinal numbers ($\mathrm{Ord}$) consists of transitive sets $\sigma$, i.e., sets possessing the property
$$\textstyle\bigcup\sigma \subseteq \sigma \qquad (43)$$
such that each element of $\sigma$ is also a transitive set [10]. The class of cardinal numbers ($\mathrm{Card}$) consists of those ordinal numbers such that none of them has an equipotent ordinal number strongly included. Strongly inaccessible cardinal numbers are elements of the following class
$$\mathrm{Inac} \overset{\mathrm{df}}{=} \{\,\sigma\in\mathrm{Card} \mid \sigma\neq\aleph\ \&\ \forall\varepsilon\in\sigma:\ \mathrm{Card}\,2^{\varepsilon}\in\sigma\ \&\ \forall\rho\in\sigma:\ (f:\rho\to\sigma \Rightarrow \textstyle\bigcup\mathrm{im}\,f\in\sigma)\,\}, \qquad (44)$$
where $\aleph$ is the smallest infinite cardinal and $\mathrm{Card}\,2^{\varepsilon}$ denotes the cardinal number of $2^{\varepsilon}$. There is a one-to-one correspondence between the strongly inaccessible cardinals and the class of Grothendieck universes $\mathbf{U}$.
Theorem. The mapping
$$\varphi: \mathbf{U} \to \mathrm{Inac}\cup\{\aleph\} \qquad (45)$$
is a bijection given by the assignment
$$\mathbf{U} \ni U \mapsto U\cap\mathrm{Ord} \in \mathrm{Inac}\cup\{\aleph\}. \qquad (46)$$
The theorem claims that each intersection of a Grothendieck universe with the class of ordinals is a strongly inaccessible cardinal (except for $\aleph$). The mapping determined in (45) even preserves the well ordering of the universes and of the strongly inaccessible cardinals, since the following equivalence holds:
$$U\subset U' \iff \varphi U < \varphi U'. \qquad (47)$$
The existence of strongly inaccessible cardinals and, respectively, the existence of the corresponding Grothendieck universes is subject to the axiom of strongly inaccessible cardinals [12] in the form
$$\forall\sigma\in\mathrm{Ord}\ \exists\tau\in\mathrm{Inac}:\ \sigma<\tau. \qquad (48)$$
Assuming (47) and the Theorem thesis, a suitable statement holds that leads to finite partitions of the universes in question.

Corollary. For $n\geq 2$, there exists a sequence of Grothendieck universes $U_1,\ldots,U_n$ such that $U_i\subset U_{i+1}$ (for $i<n$). $\qquad$ (49)

Thence, there is no difficulty in determining the aforementioned function
$$\Lambda: U_n \to \mathbb{N} \qquad (50)$$
such that
$$\Lambda^{-1}(\{i\}) = U_{i+1}\setminus U_i. \qquad (51)$$
Thus, the universe $U_n$ has been partitioned by the partial universes $U_{i+1}\setminus U_i$, which are consecutively numbered by the indices $i$. The mapping $\Lambda$ of (50) yields the required equivalence relation
$$r \overset{\mathrm{df}}{=} \{ (\alpha,\beta)\in U_n\times U_n \mid \Lambda\alpha = \Lambda\beta \}. \qquad (52)$$
The equivalence classes of relation $r$ (52) are just the partial universes determined in (51). Partial universes share many properties of Grothendieck universes. For example,
$$\forall\alpha\in U_{i+1}\setminus U_i:\ \textstyle\bigcup\alpha,\ \mathrm{Pow}\,\alpha,\ \{\alpha\} \in U_{i+1}\setminus U_i. \qquad (53)$$
There is an interesting feature consisting in remaining within the same partial universe under a graduation of one-element sets, i.e., for each $\alpha\in U_{i+1}\setminus U_i$ and any finite number of brackets it holds that
$$\{\alpha\},\ \{\{\alpha\}\},\ \ldots,\ \{\ldots\{\alpha\}\ldots\} \in U_{i+1}\setminus U_i. \qquad (54)$$
The number of partial universes can, of course, comply with the number of chemical elements, and the values $\Lambda\alpha$ can be interpreted as atomic numbers in this way. It is implied by (54) that evaluations by the atomic number operation $\lambda$ in SAC spaces yield the same values for any graduation of one-element sets, e.g., $\lambda\alpha = \lambda\{\alpha\} = \lambda\{\{\alpha\}\} = \ldots$, etc.

4. Conclusions

The problem of chemical structure remains a serious challenge in the attempts at efficient insight into the foundations of chemistry. Theoretical physics approaches are mainly directed at the explanation of energetic aspects of molecular systems, which however seems not to cover the self-organization forms specific to chemistry. Mathematical chemistry trends [13] develop methods for the investigation of structural features of matter. The above-presented outline is intended to elicit the rules of formation and measurement of self-assemblage symmetries in molecular systems. For this aim, alternative structures have been proposed in the form of compensating systems, and certain practically efficient fuzzy-set-based criteria of quality have been found. An insight into the processes of formation of chemical compounds unveils a rich symmetry background of the constituted structures and chemical numbers. Entries of the basic chemistry symmetries are rooted in the infinitude of certain universe classes coming from the foundations of set theory and mathematics.

References
1. L.P. Schulz, J. Mol. Struct. (Theochem) 187, 15 (1989).
2. L.P. Schulz, Chemical Conversions in the Computer of New Kind, in "Ars Mutandi" (N. Psarros, K. Gavroglou, eds.), Leipzig Univ. Publ.: Leipzig, 1999.
3. L.P. Schulz, Comp. Meth. Sc. Tech. 5, 75 (1999).
4. L.P. Schulz, Int. J. Quantum Chem. 80, 432 (2000).
5. I. Bucur and A. Deleanu, Introduction to the Theory of Categories and Functors, Wiley: London, 1969.
6. S. Lang, Algebra, Springer: Berlin, 2002.
7. L.P. Schulz, J. Chem. Inf. Comp. Sci. 40, 1018 (2000).
8. D.P. Graddon, An Introduction to Coordination Chemistry, Pergamon Press: London, 1961.
9. L.P. Schulz, Appl. Math. Comp. 41, 1 (1991).
10. F.R. Drake, Set Theory, Elsevier: Amsterdam, 1974.
11. K. Kuratowski and A. Mostowski, Set Theory, North Holland: Amsterdam, 1976.
12. J.R. Shoenfield, Mathematical Logic, Addison-Wesley: New York, 1967.
13. N. Trinajstic and I. Gutman, Croat. Chim. Acta 75, 329 (2002).
THE GENERALISED MASS-ENERGY EQUATION ΔE = A c² ΔM; ITS MATHEMATICAL JUSTIFICATION AND APPLICATION IN GENERAL PHYSICS AND COSMOLOGY AJAY SHARMA Community Science Centre, Directorate of Education, Post Box 107, Shimla, 171001, INDIA
Einstein derived (in his September 1905 paper) an equation between the light energy (L) emitted and the decrease in mass (m) of a body, i.e. Δm = L/c² (which is the speculative origin of E = c²Δm), completely disregarding other possibilities in the derivation. It theorizes that when light energy (L) is emanated from a luminous body, the mass of the body decreases. In a blatant way the same mathematical derivation (under logical conditions) contradicts the law of conservation of matter and energy, but these cases have gone unnoticed so far. For example, it is equally feasible (as feasible as Dirac's prediction of the positron) from the same mathematical derivation that the mass of the source must INCREASE, or remain the SAME, when it emits light energy. This would imply that the mass of a body inherently increases when energy is emitted, or that energy is emitted from the body without any change in mass. Einstein then speculated the general mass-energy equivalence ΔE = c²ΔM from it without mathematical proof. All these aspects are discussed logically here, and thus Einstein's unfinished task is completed. Further, an alternative equation, ΔE = A c²ΔM, has been purposely derived, in entirely different and flawless ways, taking into account the existing theoretical concepts and experimental results. The new equation implies that the energy emitted on annihilation of mass (or vice versa) can be equal to, less than, or more than that predicted by Einstein's equation. It can explain the large energy emitted in Gamma Ray Bursts (intense and short) with a high value of A, and the energy emitted by quasars in an extremely small region with a similarly large value of A. Recent work at SLAC confirmed the discovery of a new particle whose mass is far less than current estimates; the same can be explained with the help of the equation ΔE = A c²ΔM with a value of A greater than one. ΔE = A c²ΔM is the first equation which mathematically explains that the mass of the universe, about 10⁵⁵ kg, was created from a dwindling amount of energy (10⁻⁴⁴⁴ J or less) with a correspondingly small value of A, whereas E = Δmc² predicts that the mass of the universe, 10⁵⁵ kg, originated from an energy of 9×10⁷¹ J. Einstein's ΔE = c²ΔM has not been confirmed in chemical reactions, but is regarded as true. If one gram of wood, paper or petrol is burnt under controlled conditions and just about 10⁻⁹ kg is converted into energy, then the energy emitted (9×10⁷ J, equal to 2.15×10⁴ kcal) can raise a body of mass 1 kg to a distance of 9×10⁷ m (9×10⁴ km), or heat about 2.15×10⁴ kg of water through 1 °C. If the energy released in a chemical reaction (at any stage) is found to be less than that given by Einstein's equation ΔE = c²ΔM, then a value of A less than one in ΔE = A c²ΔM will be confirmed.
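For reference, the arithmetic behind the one-gram example quoted above follows directly from the standard relation (the 10⁻⁹ kg figure is inferred here from the energies quoted in the abstract):

```latex
\Delta E = c^{2}\,\Delta M = \left(3\times10^{8}\,\mathrm{m\,s^{-1}}\right)^{2}\times 10^{-9}\,\mathrm{kg}
         = 9\times10^{7}\,\mathrm{J} \approx 2.15\times10^{4}\,\mathrm{kcal}.
```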
References
1. A. Einstein, Annalen der Physik 17, 891-921 (1905).
2. A. Einstein, Annalen der Physik 18, 639-641 (1905).
3. N. Arav, K.T. Korista, T.A. Barlow and M.C. Begelman, Nature 376, 576-578 (1995).
4. The BaBar Collaboration, Observation of a Narrow Meson Decaying to D_s+ pi0 at a Mass of 2.32 GeV/c^2, preprint, http://arxiv.org/abs/hep-ex/0304021 (2003).
A SIMPLE APPROACH TO A MULTI-OBJECTIVE DESIGN WITH CONSTRAINTS IN COMPOUND SELECTION FOR DRUG DISCOVERY SHENGHUA SHI AND ATSUO KUKI Pfizer Global Research & Development, La Jolla Laboratories, La Jolla, CA 92137, USA
1. Introduction
Despite the dramatic advance of high-throughput screening (HTS), the number of compounds that can be synthesized by combinatorial chemistry still far exceeds the capacity of HTS. Therefore, the development of methods for compound selection is in great need. To increase the chance of finding active compounds and to reduce the attrition rate for accelerated drug discovery, one should select compounds such that they are not only diversified, but also have desired physiological property profiles. Thus, this is a multi-objective design problem. Moreover, the selected compounds, the designed libraries, may have the following three different formats: a) sparse array - cherry-picking; b) fully combinatorial super-array; c) a series of arrays of given array size - tiling, as schematically shown in Figure 1.
Figure 1. Three library design formats: a) cherry-picking; b) fully combinatorial super-array; c) tiling.

Since there is a certain format requirement on the design (for formats b and c shown above), the library design becomes a multi-objective design problem with certain constraints on the solution. The problem is usually solved¹ by applying multi-objective optimization theory. Here, a simple "elite-first" approach to the multi-objective library design is introduced.
2. Method

2.1 Cherry-picking
Let us first consider an unconstrained (cherry-picking) compound selection. That is, we select a subset $\Im$ of $N$ compounds from a collection $\Re$ of $M$ compounds such that the properties $Q(i) = [q_1(i),\ldots,q_K(i)]$ of each selected compound $i$ satisfy the specified requirements
$$\Omega_1 \leq Q(i) \leq \Omega_2, \qquad i = 1,\ldots,N. \qquad (1)$$
Or, more generally, the property $q_k(i)$, $i = 1,\ldots,N$, should have a desired profile $f(q_k)$ for $k = 1,\ldots,K$ if there are $K$ properties concerned. Moreover, the selected compounds should satisfy a certain diversity requirement. For example, one may require that the distance $d_{i,j}$ between any pair $(i,j)$ of selected compounds be larger than or equal to a specified value $\delta$,
$$d_{i,j} \geq \delta, \qquad i,j \in \Im. \qquad (2)$$
Usually, in optimization theory, an objective functional to be optimized is defined. Here, instead, a score function $S(i)$, which is a measure of the closeness of the compound's properties to the desired ones, is defined for each compound. For instance, the scoring function may be defined as a weighted deviation of the properties from the desired profiles, as in (3), where $w_k$ is a weighting factor and $f(q_k)$ is the desired property profile for property $q_k$. The task is to select a subset $\Im$ of $N$ compounds from a collection $\Re$ of $M$ compounds such that the average score $\sum_{i=1}^{N} S(i)/N$ is minimized and the diversity requirement on the pair distances, $d_{i,j} \geq \delta$ for $i,j\in\Im$, is satisfied. To solve
this problem, an "elite-first" approach is used. One first calculates the score $S(i)$ for each compound $i$ in the set $\Re$. Then the compounds are sorted according to their score into an ascending list. We always select the first compound at the top of the list and then, from the second compound on, step down through the list one by one. Suppose that at step $n$, where $n$ compounds have already been selected, compound $m$ is to be checked. We calculate the distances $d_{m,i}$, $i = 1,\ldots,n\in\Im$, from compound $m$ to all the compounds that have already been selected, and find the minimum distance from compound $m$ to the set $\Im$ of compounds already selected. If this minimum distance is larger than or equal to the specified diversity value $\delta$,
$$\min_i\,[d_{m,i}] \geq \delta, \qquad i = 1,\ldots,n\in\Im, \qquad (4)$$
then compound $m$ is selected. Otherwise, the next compound $(m+1)$ in the list is checked. We repeat this selection algorithm until the bottom of the list is reached (a minimal sketch of this loop is given below). Since at each step the best compound (in terms of the score $S(m)$) which satisfies the diversity requirement Eq. (4) is selected, the average score $\sum_{i=1}^{N} S(i)/N$ is guaranteed to be minimized and the diversity requirement Eq. (2) is always satisfied.
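A compact Python sketch (ours) of the "elite-first" selection; the quadratic score below is only one plausible choice for $S(i)$, since Eq. (3) admits any measure of closeness to the desired profile, and the Euclidean distance is likewise only an example:

```python
import math

def score(q, desired, weights):
    """One plausible score S(i): weighted squared deviation from the desired profile."""
    return sum(w * (qk - fk) ** 2 for qk, fk, w in zip(q, desired, weights))

def elite_first(compounds, desired, weights, delta, distance):
    """Select compounds in ascending order of score, keeping pairwise distance >= delta."""
    ranked = sorted(compounds, key=lambda q: score(q, desired, weights))
    selected = []
    for q in ranked:
        if all(distance(q, p) >= delta for p in selected):   # diversity test, Eq. (4)
            selected.append(q)
    return selected

# Tiny illustration with two properties per compound and Euclidean distance.
euclid = lambda a, b: math.dist(a, b)
compounds = [(0.1, 0.2), (0.15, 0.22), (0.9, 0.8), (0.5, 0.5), (0.52, 0.48)]
print(elite_first(compounds, desired=(0.3, 0.3), weights=(1.0, 1.0),
                  delta=0.2, distance=euclid))
```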
2.2 Fully combinatorial super-array library design
For the fully combinatorial super-array library design, in addition to the diversity requirement Eq. (2) and the property profile requirement (minimizing the average score $\sum_{i=1}^{N} S(i)/N$), the selected compounds must form a fully combinatorial super-array (format b shown in Fig. 1). Therefore, this is a multi-objective optimization problem with a design constraint. For simplicity, let us consider a 2-component combinatorial reaction, in which each product $i(\alpha,\beta)$ is formed from two reactants, $\alpha$ and $\beta$, of the reaction components A and B, respectively. Suppose that there are a set $\Re_A$ of $M_A$ reactants and a set $\Re_B$ of $M_B$ reactants for the reaction components A and B, respectively. Thus, the total number of products is $M = M_A\times M_B$. The task is to select a subset $\Im_A$ of $N_A$ reactants A and a subset $\Im_B$ of $N_B$ reactants B such that the diversity requirement
$$D\big[i_1(\alpha_1,\beta_1),\, i_2(\alpha_2,\beta_2)\big] \geq \delta, \qquad \alpha\in\Im_A\ \text{and}\ \beta\in\Im_B, \qquad (5)$$
is satisfied and the average score (6) is minimized. Suppose that the distance measure is defined such that the square of the distance $D^2[i_1(\alpha_1,\beta_1), i_2(\alpha_2,\beta_2)]$ between two products $i_1(\alpha_1,\beta_1)$ and $i_2(\alpha_2,\beta_2)$ is equal to the sum of the squares of the distances $d^2_{\alpha_1,\alpha_2}$ and $d^2_{\beta_1,\beta_2}$ of the corresponding reactant pairs $(\alpha_1,\alpha_2)$ and $(\beta_1,\beta_2)$:
$$D^2\big[i_1(\alpha_1,\beta_1),\, i_2(\alpha_2,\beta_2)\big] = d^2_{\alpha_1,\alpha_2} + d^2_{\beta_1,\beta_2}. \qquad (7)$$
In view of the relationship Eq. (7) and the fully combinatorial super-array constraint, the diversity requirement Eq. (5) leads to a diversity requirement on each of the reaction components:
$$d_{\alpha_1,\alpha_2} \geq \delta,\ \alpha\in\Im_A \quad\text{and}\quad d_{\beta_1,\beta_2} \geq \delta,\ \beta\in\Im_B. \qquad (8)$$
To solve this problem we first calculate the score $S[i(\alpha,\beta)]$ for all products, and then assign to each reactant $\alpha$ / $\beta$ a score $S(\alpha)$ / $S(\beta)$ defined as
$$S(\alpha) = \sum_{\beta=1}^{M_B} S[i(\alpha,\beta)]\,/\,M_B \qquad\text{and}\qquad S(\beta) = \sum_{\alpha=1}^{M_A} S[i(\alpha,\beta)]\,/\,M_A. \qquad (9)$$
We sort the reactants A according to their score $S(\alpha)$ into an ascending list and apply the same "elite-first" algorithm described in subsection 2.1 (Cherry-picking) to select a subset $\Im_A$ of $N_A$ reactants for reaction component A. Then we re-calculate the score $S(\beta)$ for the reactants B with the selected reactants A,
$$S(\beta) = \sum_{\alpha=1}^{N_A} S[i(\alpha,\beta)]\,/\,N_A. \qquad (10)$$
The reactants B are sorted in accordance with the re-calculated score $S(\beta)$ into an ascending list. The application of the "elite-first" algorithm to this list gives rise to the selection of a subset $\Im_B$ of $N_B$ reactants. The selected reactant sets $\Im_A$ and $\Im_B$ for reaction components A and B constitute the desired optimum design of a fully combinatorial super-array; a short sketch of this two-stage selection is given below.
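A minimal Python sketch (ours) of the two-stage reactant selection of Eqs. (9)-(10); `product_score` and `distance` stand for whatever score and distance functions are adopted, and `elite_first_reactants` is the reactant analogue of the elite-first loop of Section 2.1:

```python
def elite_first_reactants(reactants, scores, delta, distance):
    """Elite-first pass over reactants: ascending score, pairwise distance >= delta."""
    ranked = sorted(reactants, key=lambda r: scores[r])
    chosen = []
    for r in ranked:
        if all(distance(r, c) >= delta for c in chosen):
            chosen.append(r)
    return chosen

def super_array_design(reactants_A, reactants_B, product_score, distance, delta):
    """Two-stage selection: pick A from Eq. (9), then re-score by Eq. (10) and pick B."""
    # Stage 1: average each alpha over all beta, Eq. (9), and select A.
    score_A = {a: sum(product_score(a, b) for b in reactants_B) / len(reactants_B)
               for a in reactants_A}
    selected_A = elite_first_reactants(reactants_A, score_A, delta, distance)
    # Stage 2: re-score each beta over the selected A only, Eq. (10), and select B.
    score_B = {b: sum(product_score(a, b) for a in selected_A) / len(selected_A)
               for b in reactants_B}
    selected_B = elite_first_reactants(reactants_B, score_B, delta, distance)
    return selected_A, selected_B
```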
2.3 Tiling
In this case, the design objectives are the same as those for the two cases described above. That is, the diversity requirement Eq. (2) and the property profile requirement (minimizing the average score $\sum_{i=1}^{N} S(i)/N$) have to be satisfied. However, in comparison with design format b of the fully combinatorial super-array, tiling design format c imposes a more relaxed constraint on the design. One requires that the final library design consist of a set of unit combinatorial arrays of a given format, $n_A\times n_B\times\ldots$. Therefore, design formats a and b are the extreme cases of the tiling format. For example, cherry-picking is tiling with a $1\times 1$ unit array for a 2-component reaction. Since this is a tiling design, we do the design iteratively. At each iteration, a score for each reactant of component A that has not yet been selected is first calculated as in (11), where the sum runs over the remaining set of $\tilde M_B$ reactants B still to be selected. At the first iteration, a fully combinatorial super-array design is generated by using the same "elite-first" algorithm described in the last subsection, except that the number of selected reactants for each reaction component, $N_A, N_B, \ldots$, has to be an integer multiple of the specified size of the corresponding component, $n_A, n_B, \ldots$, of the unit combinatorial array for tiling. Starting from the second iteration, the selected reactants have to satisfy two conditions:
$$\min\big(d_{\tilde\alpha_1,\tilde\alpha_2}\big) \geq \delta,\ \tilde\alpha_1\in\tilde N_A,\ \tilde\alpha_2\in\Xi_A, \quad\text{and}\quad d_{\tilde\alpha_1,\alpha_2} \geq c\,\delta,\ \tilde\alpha_1\in\tilde N_A,\ \alpha_2\in\Im_A, \quad\text{for component A}; \qquad (12.1)$$
and, for $\tilde\beta_1$ in component B,
$$\min\big(d_{\tilde\beta_1,\tilde\beta_2}\big) \geq \delta,\ \tilde\beta_1\in\tilde N_B,\ \tilde\beta_2\in\Xi_B, \quad\text{and}\quad \min\big(d_{\tilde\beta_1,\beta_2}\big) \geq c\,\delta,\ \tilde\beta_1\in\Xi_B,\ \beta_2\in\Im_B, \quad\text{for component B}. \qquad (12.2)$$
Here $\Xi_A$ and $\Xi_B$ are the sets of reactants that have already been selected in this particular iteration; $\Im_A$ and $\Im_B$ are the sets of reactants that were selected in previous iterations; $c$ is a coefficient $< 1$. As for the first iteration, the number of reactants selected at each iteration, $\tilde N_A, \tilde N_B, \ldots$, has to be an integer multiple of the respectively specified sizes, $n_A, n_B, \ldots$, of the unit combinatorial array for tiling.
3. Results and Discussion
As an illustrative example, a 2-component reaction is considered with the three design formats, with the same property profile requirements and diversity criteria. The three designed libraries are compared. The high efficiency of the present approach is demonstrated with a tiling design of a 3-component combinatorial reaction.

Reference
1. D.K. Agrafiotis, Multiobjective optimization of combinatorial libraries, IBM J. Res. & Dev. 45, 545-566 (2001).
FINITE ELEMENT LEVEL SET FORMULATIONS FOR MODELLING MULTIPHASE FLOWS
S. V. SHEPEL Laboratory for Thermohydraulics, Paul Scherrer Institut, Villigen PSI, CH-5232, Switzerland E-mail: [email protected] S. PAOLUCCI Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, IN 46617, USA E-mail: [email protected]

In this work, we present two Finite Element formulations of the Level Set method: the Streamline-Upwind/Petrov-Galerkin (SUPG) and the Runge-Kutta Discontinuous Galerkin (RKDG) schemes. Both schemes are constructed in such a way as to minimize the numerical diffusion inherently present in the discretized Level Set equations. In developing the schemes, special attention is given to the issues of mass conservation and robustness. The RKDG Level Set formulation is original and represents the first attempt to apply the discontinuous Galerkin FE method to interface tracking. The performances of the two formulations are demonstrated on selected two-dimensional problems: the broken-dam benchmark problem and a mold-filling simulation. The problems are solved using unstructured triangulated meshes. We also provide a comparison of our results with those obtained using the Volume-of-Fluid (VOF) method.

1. Problem Formulation

1.1. Level Set problem
Modelling of gas-liquid interfaces is one of the major challenges in the numerical simulation of multiphase flows. Moving interfaces between dissimilar phases are encountered in diverse applications such as melting, solidification, evaporation, condensation, flame fronts in combustion, and materials processing. In numerical simulation, the shape and position of a moving interface have to be obtained as part of the solution. For this reason, a numerical method used to model moving interfaces has to be able to handle complex topological changes in the interface shape, as well as have good robustness and mass conservation properties. The Level Set method is one of the most promising interface-capturing schemes and satisfies the specified requirements. We are interested in implementing the interface tracking on unstructured grids, and therefore we develop FE formulations of the Level Set method, since the Finite Element method handles unstructured grids easily. In our work we develop and test two FE formulations of the Level Set method: the SUPG and RKDG schemes. Both formulations allow easy extension to three-dimensional simulations. In the Level Set method, the numerical interface between two fluids is represented by a continuous function $\phi$ which is typically defined as the signed, minimum distance from the interface: positive on one side, negative on the other, and zero at the interface itself. The zero level contour of $\phi$ is called the zero level set and is associated with the physical interface. The computational interface is assigned some finite thickness, within which the distribution of $\phi$ is used for interpolation of physical properties across the interface (such as density and viscosity), and also for computing the interface curvature. The evolution of the level set function in an interfacial flow is given by the advection equation
$$\frac{\partial\phi}{\partial t} + \mathbf{u}\cdot\nabla\phi = 0, \qquad (1)$$
where $\mathbf{u}$ is the velocity of the flow and $t$ is time. In numerical implementation, the level set function ceases to be the signed distance from the interface after a single advection step associated with the numerical solution of $\phi$. To restore the correct distribution of $\phi$ near the interface, the so-called redistance problem has to be solved, in which the level set function is re-initialised. In solving the redistance problem, the values of $\phi$ must be corrected in such a way that the level set distribution is recovered without disturbing the zero level contour. Much of the research on improving the existing Level Set formulations is concerned with accurate solution of the redistance problem. We choose the formulation of the redistance problem developed by Sussman et al. due to its generality. This formulation is based on solving for the steady state of the following equations:
$$\frac{\partial\phi}{\partial\tau} = \mathrm{sign}(\phi_0)\,\big(1 - |\nabla\phi|\big), \qquad \phi(\mathbf{x},0) = \phi_0(\mathbf{x}). \qquad (2)$$
Here $\phi_0$ is an initial guess of the level set function, and $\tau$ is a time-like variable (used in approaching the steady-state solution). Equations (1) and (2) are solved consecutively at every time step.
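A small one-dimensional Python sketch (ours) of one redistance sweep of Eq. (2) on a uniform grid, using a Godunov upwind gradient and a smoothed sign function; this only illustrates the iteration itself, not the FE schemes developed in the paper:

```python
import numpy as np

def redistance_step(phi, phi0, dx, dtau):
    """One explicit pseudo-time step of d(phi)/d(tau) = sign(phi0)*(1 - |grad phi|) in 1-D."""
    sgn = phi0 / np.sqrt(phi0**2 + dx**2)          # smoothed sign(phi0)
    dminus = (phi - np.roll(phi, 1)) / dx          # backward difference
    dplus = (np.roll(phi, -1) - phi) / dx          # forward difference
    grad = np.where(                               # Godunov upwind choice of |grad phi|
        sgn > 0,
        np.sqrt(np.maximum(np.maximum(dminus, 0)**2, np.minimum(dplus, 0)**2)),
        np.sqrt(np.maximum(np.minimum(dminus, 0)**2, np.maximum(dplus, 0)**2)),
    )
    return phi + dtau * sgn * (1.0 - grad)

# Re-initialise a deliberately distorted profile towards a signed distance function.
x = np.linspace(-1.0, 1.0, 201)
phi0 = (x - 0.2) * (1.0 + 0.5 * np.sin(4 * x))     # zero level set near x = 0.2
phi = phi0.copy()
for _ in range(200):
    phi = redistance_step(phi, phi0, dx=x[1] - x[0], dtau=0.5 * (x[1] - x[0]))
print(np.abs(np.gradient(phi, x)).mean())          # should approach 1 away from the boundaries
```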
1.2. SUPG Level Set Formulation

The SUPG formulation, which is a second-order accurate upwind modification of the standard Galerkin FE method, is widely used for the solution of hyperbolic equations. The SUPG scheme is well described in the literature (see Johnson, for instance). Eq. (1) is hyperbolic and its solution with the SUPG scheme is straightforward. Eq. (2) belongs to the so-called family of Hamilton-Jacobi (HJ) equations, although it can be rewritten in the form of a nonlinear hyperbolic equation. Unfortunately, the nonlinearity of the sign function causes severe problems associated with the numerical solution of Eq. (2). Use of the standard SUPG method would result in poor convergence and possibly significant mass change in the system, where by mass we understand the mass of the fluid enclosed by the zero level set. To minimize the mass losses and improve convergence of the SUPG scheme, we adapt Sussman and Fatemi's mass correction technique to the FE method. Sussman and Fatemi's technique is based on the idea of imposing a local constraint near the zero level set. Originally this technique was developed in the finite difference framework and is based on solving for the steady state of the corrected redistance equation (3), where $\lambda$ is the correction coefficient and $f(\phi)$ is some weighting function. In our work we develop a computationally efficient weighting function $f(\phi)$ and test it on different advection/redistance problems involving various interface topologies.

1.3. RKDG Level Set Formulation
The RKDG finite element formulation for solving hyperbolic conservation laws was developed by Cockburn and Shu. It applies only to hyperbolic equations that can be written in divergence form. The redistance equation (2) can be transformed into divergence form by using a special procedure developed by Hu and Shu. This procedure employs taking partial derivatives of (2) in each spatial variable and introducing a new set of variables, $p_i = \phi_{,i}$, which finally results in the following conservation law
$$\frac{\partial p_i}{\partial\tau} + \frac{\partial H(\mathbf{p})}{\partial x_i} = 0, \qquad (4)$$
with the constraint
$$p_{i,j} = p_{j,i}. \qquad (5)$$
Here $H(\mathbf{p}) = \mathrm{sign}(\phi_0)\,(|\nabla\phi| - 1)$. The solution of (4) and (5) is sought in a finite element space of polynomial functions which are discontinuous across element edges. The system of equations is over-determined, and therefore its solution is sought in the least-squares sense. After the partial derivatives of $\phi$ are found, the function $\phi$ is obtained by numerical integration of the $\mathbf{p}$ field over the computational domain. Different integration procedures can result in local distortions of the zero level set, which can eventually result in significant error in the solution of the coupled Level Set problem. In our work we develop a special integration procedure which minimizes possible distortions of the zero level set and significantly improves the accuracy of the solution. This procedure involves division of the computational domain into element bands, depending on the location of the zero level set contour, and consecutive integration from one element band to the other.

2. Results
2. Results
We apply the developed Level Set formulations to two problems involving flow of fluid with a free surface: the broken-dam benchmark problem and a mold-filling problem. We assume that the interaction between air and fluid is negligibly small, which is actually the case in many industrial mold-filling applications. Therefore, we use the so-called single-fluid approach in which the air is excluded from consideration. Assuming the fluid to be incompressible, we model the flow of fluid by solving the Navier-Stokes equations with free-slip boundary conditions on the container walls. The use of the Level Set method within the framework of the single-fluid approach requires a special procedure for construction of an extended velocity field ahead of the free surface. The region in which the extended velocity field is built is
called the buffer zone. In our work we study one of the methods to construct the extended velocities. This method is based on solving the Stokes equations in the buffer zone with prescribed velocities on the free surface. In the broken-dam problem the fluid is initially still and contained within a square cavity of 5.72 x 5.72 cm. At time t = 0 the dam (one of the walls) supporting the fluid is instantaneously removed and the fluid collapses under the influence of gravity. The fluid that we model in this problem is water with density ρ = 1000 kg/m3 and viscosity μ = 0.001 kg/(m·s). We obtain both the SUPG and RKDG solutions. We also obtain a solution to this problem by using the Volume-of-Fluid (VOF) interface-tracking method. We find that there is good agreement between the Level Set and VOF solutions, with the Level Set formulations producing a smoother free surface. Comparison of the numerical solutions with the available experimental data shows good agreement between the numerically and experimentally found velocities. In our work we also study the effects of the thickness of the buffer zone and the viscosity of fluid inside the buffer zone upon the solution. We find that the Level Set solutions are quite insensitive to the size and fluid properties of the buffer zone. We also find that in the broken-dam problem the Level Set formulations provide good mass conservation although the simulations are quite long. We find that the use of the mass correction term is essential for robustness of the SUPG scheme and at the same time retains the second order of convergence of the scheme. The second problem we consider is the filling of a two-dimensional mold with liquid melt. We consider a simple mold of size 4 x 6 cm with a circular orifice in the center of radius 1 cm. The filling is done from below with molten aluminum at T = 760 °C with a uniform velocity profile of 8 cm/s. The mold is initially empty and is at a constant temperature of T = 500 °C. In the course of filling the mold, the melt solidifies and forms a solidification front. In this case we also observe good mass conservation of the Level Set solution despite complex topological changes of the free surface. The RKDG formulation is found to be less robust and less accurate as compared to the SUPG formulation and requires further development.
References
1. M. Sussman, P. Smereka, and S. Osher, J. Comput. Phys. 114, 146 (1994).
2. C. Johnson, Numerical Solution of Partial Differential Equations by the Finite Element Method (Cambridge University Press, New York, 1987).
3. M. Sussman and E. Fatemi, J. Sci. Comput. 20, 1165 (1999).
4. B. Cockburn and C.-W. Shu, J. Comput. Phys. 141, 199 (1998).
5. C. Hu and C.-W. Shu, J. Sci. Comput. 21, 666 (1999).
REPRESENTATION & MODELLING OF ELECTRONIC PATIENT RECORDS KAMI SIVAGURUNATHAN, PANAGIOTIS CHOUNTAS, ELIA EL-DAMI University of Westminster School of Computer Science, Northwick Park, Watford Road, HA1 3TP, UK E-mail: [email protected]
This paper is an attempt to provide an XML based framework for modelling and representing multi-source electronic patient records (EPR) as part of an integrated single source environment. The main focus of this framework is to capture the dynamic features of EPR data sources for further analysis and knowledge discovery with the aid of OLAP and data mining.
1. Introduction
The National Health Service (NHS) can be portrayed as a large repository of records keeping track of patient history. These records originate primarily from the dialogue between a doctor and a patient and the subsequent recording of this dialogue by the doctor. In many cases the dialogue may be based on the patient's perceptions about a particular illness rather than on actual facts. Therefore the recording could be biased and open to vagueness, resulting in impurities within the patient record. Furthermore, these records are mostly kept as PPRs (Paper Patient Records), so there is bound to be deceptive information. In this paper we argue that the transfer of PPRs into an electronic form will result in a large data repository where OLAP and mining techniques can be used for extracting and verifying useful tendencies with respect to patient behaviour, diagnosis, treatment, discharge and length of stay (LOS). Such information may serve as a way to improve standards of health care as well as to redirect public funding inside the NHS towards efficient and reliable medical practices. Thus the main question to be addressed is: what constitutes an electronic patient record (EPR) system? According to [1], [2], patient history is necessary to diagnose illnesses, identify drug interactions, monitor the course of disease, assess progress in rehabilitation and plan for discharge. We further suggest that the EPR will have to reflect an element of perception-based values, due to the distinct ways in which patients or medical staff sense the same situation or experience.
Furthermore, we argue that EPR records should be cleaned of intentional as well as extensional inconsistencies and be available for use by either global or local users. To put it differently, the global repository of EPRs has to be structured in a way that can be used for analysis in the context of a locality (i.e. a local hospital) as well as at the national level, providing cumulative information to non-local users. The rest of the paper is organized as follows: Section 2 proposes a general architecture for a global EPR schema. Section 3 outlines the features of an XML based framework for modelling dynamic patient features and data. In Section 4 we summarize our proposal and point towards future work.
2. A General Architecture for a Global EPR System
We propose a generic EPR architecture for modelling patient records. We focus on the properties of the central EPR node as well as on the properties of the regional nodes. We argue that the central repository must provide integrated cumulative information that can be accessed universally, while EPR nodes are associated with hospitals, regions of GPs and health care facilities. EPR nodes can search other nodes with the aid of the central node-repository. Therefore the central repository, apart from holding summarized cumulative information, must also contain the appropriate metadata structures for allowing the uninterruptible exchange of information between EPR nodes, health care facilities (HCF), or access points (i.e. mobile devices).
Figure 1. Proposed EPR architecture.
We first present the fundamental characteristics of a central repository (node) for accommodating summarized-integrated [3] EPR data. Such a node must provide the following characteristics. Resolution of intentional and extensional inconsistencies: intentional inconsistency resolution requires a common format for all information
providers (EPR nodes) and of possibly overlapping information. At this point it is possible that two information providers could give semantically different and occasionally conflicting answers to the same query (extensional inconsistency). Simplicity: there is no restriction on the underlying data model; the only requirement is that results from different providers are returned in a tabular form. Flexible answers: because mappings are not single valued, global queries may be bound. There is no assumption of mutual consistency between a set of information providers (regional EPR nodes), so an authoritative, certain answer may not be possible. Global queries may have several candidate answers that may be related and classified according to the likelihood or tendency of events to occur. The characteristics described so far treat information providers equally, without taking into account properties that make particular information attractive with reference to EPR nodes. The features that are considered are the following. Time property: in terms of the valid time dimension, information may consist of definite, unique time-stamped fact instances with known duration, or indefinite, unique time-stamped fact instances with constrained duration. Quality property: an answer to a query may have an associated level of quality; this characteristic may indicate the level of completeness of an answer towards a query. Uncertainty property: perception-based information may generate two types of uncertainty. One is introduced by queries that refer to concepts at a lower level than those that exist at the instance level of the data source (EPR records). The other arises from the use of an element in the query that is a member of more than one high-level concept. The rest of the paper attempts to provide an XML based framework for modelling and representing information at the central repository (node) level as well as at the regional level (EPR nodes).
3. An XML Based Framework for Modelling EPR Cases
Patient record content, found either at the central node or on regional EPR nodes, expresses richly structured clinical data which can be represented by various types of generic structures, including proposition, observation, subjective, instruction and query. These clinical structures are differentiated mainly by the contextual data they record. For example, observation items contain a timestamp and the identity of the recorder, while
subjective items record the identity of the information provider and the degree of certainty that the clinician has that the information is correct. XML-Schema [4], [5] provides a rich set of data types, data constraints and data concepts for representing integrated-summarised patient records stored in different formats in various EPR nodes. Therefore the central repository node must be based on XML schemas. XML-Schema provides the following features. Legacy data can be integrated into the architecture using a translation wrapper which translates legacy data to XML structures; furthermore, wrappers are available for translating relational and object data to XML schemas and vice versa. Rich data typing: XML-Schema provides an extensive typing mechanism with a broad range of primitive types from SQL and Java, such as numeric, date/time, binary, boolean, URIs, etc. Furthermore, complex types can be built from the composition of other types (either primitive or complex); in particular, XML-Schema uses a single inheritance model that allows the restriction or extension of type definitions. Support for namespaces: XML-Schema is namespace-aware, enabling elements with the same name to be used in different contexts. Additionally, schema types and elements can be included (or imported) from a separate XML schema using the same (or a different) namespace. Constraints: XML-Schema provides an assortment of constraint types, including format-based 'pattern' constraints, key and uniqueness constraints, key references (foreign keys), enumerated types (value constraints), cardinality (or frequency) constraints and 'nullability'. XML schemas are suited for representing data using different formats and semantics, but there is apparently no support for representing subjective or temporal information. We suggest that XML schemas can be extended to support subjective or temporal information. We first suggest a way of including subjective information as part of an XML schema. Subjective information, Eq. (1), is an issue found in EPR data; it is not easy to represent in a relational model, and XML suggests itself as a natural representation choice. However, in the case of XML we may have probability or fuzzy weights associated with elements, and we have to interpret what exactly this means given that elements can nest under other elements, and more than one of these elements may have an associated weight. Since all weighted information is encoded in XML, it is accessible through normal XML query mechanisms.
However, the manipulation of these weights is nontrivial, and not something that a typical user would wish to do directly; consider the case of conjunctive and disjunctive events. Rather, we would like to permit the user to issue queries in the normal manner, as if the queries were against a deterministic data source (with an equivalent schema). We should then produce query responses that take weighted information into account. For temporal modelling, each XML construct, such as element, attribute or entity, should be considered as evolving in time. To achieve temporal assignments of such XML constructs, the temporal embrace of existence and the temporal embrace of belief are introduced. Temporal embraces provide representation of point, interval, relative and periodic events. The temporal embrace of existence defines the time period during which the XML construct is believed to be true according to user beliefs. The temporal embrace of belief defines the time period during which the system believes in the existence of the XML construct and its values. Temporal embrace value assignment expressed by XML attributes inevitably leads to a change of XML syntax. Our model is based on [ ] temporal inequalities and Allen's temporal logic [ ] and provides five basic T-XML objects:
- Integer: represents integers in the mathematical sense with no predefined range.
- Datetime: represented as a string; corresponds to "Characters in XML".
- Interval, Timeperiod: represented as "Characters in XML".
- Variable: a sequence of characters matching a regular expression.
We further suggest that our basic model of T-XML objects can be further enhanced for modelling either temporal uncertainty or moving objects over the hierarchy of time.
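As an illustration of how a weighted and temporally qualified item could be encoded, the short Python sketch below builds one such element with the standard ElementTree library. The element and attribute names (patient_record, observation, finding, valid_from, belief_from, weight, and all values) are hypothetical and are not taken from any schema defined in this paper.

```python
import xml.etree.ElementTree as ET

# Build a toy record with one observation item carrying a timestamp,
# two temporal embraces and a possibility/fuzzy weight.
record = ET.Element("patient_record", id="P-0001")
obs = ET.SubElement(record, "observation",
                    recorder="Dr. A. Example",            # hypothetical value
                    timestamp="2003-06-01T10:30:00")
item = ET.SubElement(obs, "finding",
                     valid_from="2003-05-20", valid_to="2003-06-01",   # temporal embrace of existence
                     belief_from="2003-06-01", belief_to="now",        # temporal embrace of belief
                     weight="0.8")                                     # possibility / fuzzy weight
item.text = "intermittent chest pain reported by patient"

print(ET.tostring(record, encoding="unicode"))
```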
4. Conclusions
In this paper a conceptual framework has been proposed for the representation and modelling needs of a flexible EPR system. We have outlined the structural elements of an XML-EPR system, mainly focusing on the dynamic properties of EPR data, such as vagueness and temporality, in an effort to provide global querying facilities among various heterogeneous integrated EPR data sources.
References
1. P. MILLARD and S. MCCLEAN, Modelling Hospital Resource Use - A Different Approach to the Planning and Control of Healthcare Systems, Royal Society of Medicine Press, 1994.
2. P. MCKENNA, The paper mountain: document image processing and the electronic medical record, The British Journal of Healthcare Computing and Information Management, February, 14(1), 24-26 (1997).
3. P. CHOUNTAS and I. PETROUNIAS, Virtual Integration of Temporal and Conflicting Information, Proc. International Database Engineering & Applications Symposium, Grenoble, France, pp. 243-248, IEEE Computer Society Press, 2001.
4. R. CONRAD, D. SCHEFFNER and J.C. FREYTAG, XML Conceptual Modeling Using UML, Proc. International Conceptual Modeling Conference, Salt Lake City, USA, pp. 558-571, Springer, 2000.
5. L. BIRD, A. GOODCHILD and T. HALPIN, Object Role Modeling and XML Schema, Proc. International Conceptual Modeling Conference, Salt Lake City, USA, pp. 309-322, Springer, 2000.
CONSISTENT KINETIC MODEL OF INNERMOST COMETARY ATMOSPHERE AND BOUNDARY LAYERS OF COMETARY NUCLEUS
YURI V. SKOROV* Max-Planck-Institut für Aeronomie, Max-Planck-Str. 2, D-37191 K.-Lindau, Germany
BJORN J. R. DAVIDSSON Department of Astronomy and Space Physics, Box 515, SE-751 20 Uppsala, Sweden
GENNADY N. MARKELOV Atos Origin Engineering Services B.V., Haagse Schouweg 6G, 2332 KG Leiden, The Netherlands
* Also at the Keldysh Institute of Applied Mathematics, RAS, Miusskaya sq. 4, Moscow 125047, Russia
1. Relevance and importance
The key to the explanation of most phenomena associated with the 'activity' of comets is a proper understanding of the processes taking place within a thin layer around the nucleus/cometary atmosphere interface. As a comet approaches the inner Solar System, solar radiation starts to heat up the near-surface boundary layers of the nucleus, which is pictured as a complex body characterized by a high volume porosity and composed of a mixture of mineral, organic, and volatile components. A complete physical description of the energy and mass transfer inside and outside the nucleus is a complex problem. The first difficulty arises because the energy transfer to the interior of the nucleus is not only controlled by heat transfer through the solid porous matrix, but takes place also via sublimation and recondensation at the pores. Moreover, due to the high porosity and the presence of transparent ice, solar light could
penetrate to a substantial depth, and there is volume energy absorption (volume attenuation) in the uppermost porous layer of a cometary nucleus 3. In addition, due to the high porosity of the boundary layers, effective sublimation also has a bulk character. In order to model the corresponding molecular flux above the nuclear surface, it is necessary to calculate the transmission distribution function of molecules just entering the coma from the non-isothermal porous layer. Finally, note that the nucleus and the innermost coma of an active comet constitute a strongly interacting physical system. Both heat and mass are exchanged between the two regions, and their physical properties develop in close symbiosis. The thermophysical modelling of a comet is therefore not restricted to the nucleus itself - the whole system must be considered simultaneously. The present work is the logical next step towards the development of a fully consistent description of mass and heat transfer in cometary surface layers. We continue to study several issues concerning the outgassing process and associated gas-solid state interactions in the nuclear surface layer and innermost coma, putting an emphasis on kinetic simulation of the cometary atmosphere.
2. General structure of the model
Hereafter we present the development of the consistent numerical model that was introduced at ICCMSE 2002. The model can be naturally subdivided into four coupled blocks:
2.1. Simulation of radiation transfer and calculation of necessary optical properties of the medium
We calculate synthetic reflectance spectra in the 0.2 - 2.0 μm wavelength region for a large number of porous ice-dust mixtures with different composition, regolith grain sizes and grain morphologies, such as core-mantle grains, dense clusters of such grains, and large irregular particles with internal scatterers. The whole process is divided into four mutually independent substeps: (1) calculating the wavelength-dependent optical properties of the optically effective particles in the medium, as if they were isolated; (2) calculating the optical bulk properties of a porous medium, based on the data for the individual particles; (3) solving the equation of radiative transfer for this medium; (4) integrating in appropriate ways over wavelength and/or geometrical bodies in order to obtain the sought-for quantities. As a result we accurately calculate the solar flux attenuation profile inside
the cometary surface layer. The numerical methods used are: Mie theory, discrete-dipole approximation, Monte Carlo ray tracing and radiative transfer in optically non-uniform thick media. The detailed description of the method and the results obtained were presented previously.
2.2. Simulation of heat transfer inside irradiated porous medium
The main specific features of the heat transfer into cometary nuclei are (1) reflection and absorption of solar light by a surface layer; (2) consumption and release of latent heat due to sublimation and condensation; (3) transport of energy by solid state conduction (where the approximate bulk conductivity of a solid mixture of ice and dust should be corrected for porosity) and gas flows; (4) losses of energy by gas escaping into space, and to a lesser degree by thermal radiation. These processes are intimately connected and form a non-linear problem. We consider a plane-parallel geometry in which we solve the coupled system of the non-stationary heat transfer equation and a macroscopic gas diffusion equation (or the alternative module for the kinetic diffusion). The heat transfer equation includes a set of additional terms describing the processes listed above. The discussion of applicability and the complete description of the model can be found, for example, in 2-5.
2.3. Simulation of mass transfer inside volatile porous medium
Sublimation/condensation of volatile components and gas transport through the boundary layers of a cometary nucleus are considered both for kinetic and continuum regimes. For the simulation of the Knudsen kinetic flow we use different approaches: the so-called Clausing probabilistic formalism and the parallel realization of the Monte Carlo test particle method. The first technique is applicable for a model where the porous medium is described as a bundle of sublimating capillary tubes. We used this method in recent papers, where its detailed description can be found. The test particle technique can be applied to an arbitrary stochastic porous medium. This is a new sub-block of the model that can briefly be summarized by the following algorithm: (1) A test particle is injected at a certain depth below the surface, according to a sublimation probability function. The latter function is based on the Hertz-Knudsen formula, corrected by a temperature-dependent sublimation coefficient α_s. (2) The test
particle is assigned a speed according to the local distribution function. (3) The test particle travels a certain distance, determined by a distribution of free paths, after which it interacts with the solid medium. The interaction can be one of three kinds: condensation, according to a temperature-dependent condensation coefficient α_c, or scattering, which may be either specular or diffuse. (4) In case recondensation occurs, the test particle is "lost" and we go back to step 1. If reflection occurs, we go back to step 2: if the scattering is specular the speed is not modified, and directions are chosen isotropically; if the scattering is diffuse, we change both the speed and the direction of motion. (5) For each test particle escaping through the upper boundary, we sample its speed and exit angle into small speed and angular bins, thus obtaining the transmission distribution function for escaping molecules.
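The structure of steps (1)-(5) can be sketched in a few lines of Python. The sketch below is a strongly simplified, plane-parallel toy version: the layer thickness, mean free path, coefficients and thermal speed are illustrative placeholders rather than values or distribution functions used in the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

depth      = 1.0e-2      # thickness of the sublimating layer [m] (assumed)
mean_free  = 1.0e-3      # mean free path inside the porous matrix [m] (assumed)
alpha_cond = 0.3         # condensation coefficient (assumed)
p_specular = 0.5         # probability that a reflection is specular (assumed)
v_thermal  = 400.0       # characteristic thermal speed [m/s] (assumed)

def sample_speed():
    # Maxwellian-like speed from three normal velocity components
    return np.linalg.norm(rng.normal(0.0, v_thermal, 3))

escaped_speeds, escaped_angles = [], []
for _ in range(20000):
    z = -rng.uniform(0.0, depth)               # (1) inject below the surface
    mu = rng.uniform(-1.0, 1.0)                # cosine of angle to the surface normal
    v = sample_speed()                         # (2) assign a speed
    while True:
        z += mu * rng.exponential(mean_free)   # (3) fly one free path
        if z >= 0.0:                           # (5) escaped through the upper boundary
            escaped_speeds.append(v)
            escaped_angles.append(np.arccos(mu))
            break
        if z <= -depth:                        # reflect at the bottom of the layer
            z, mu = -depth, abs(mu)
            continue
        if rng.uniform() < alpha_cond:         # (4) recondensation: particle is lost
            break
        if rng.uniform() < p_specular:         # specular: keep speed, new direction
            mu = rng.uniform(-1.0, 1.0)
        else:                                  # diffuse: new speed and new direction
            v, mu = sample_speed(), rng.uniform(-1.0, 1.0)

print("escape fraction:", len(escaped_speeds) / 20000)
```

Binning escaped_speeds and escaped_angles would give the (toy) transmission distribution function referred to in step (5).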
2.4. Simulation of mass and energy transfer in the innermost coma
The gas just above the nucleus is in a state of thermodynamic non-equilibrium. Due to intermolecular collisions, the distribution function inside the boundary layer gradually transforms to a drifting Maxwellian. Ordinary hydrodynamical relations such as the Euler or Navier-Stokes equations are not applicable to the simulation of this region. A kinetic description is necessary, based on numerical solutions of the Boltzmann equation, e.g., the direct simulation Monte Carlo (DSMC) method. This method was first proposed by Bird and is now widely used in different branches of rarefied gas dynamics. Recently we applied it to the solution of 1D cometary physics problems 4-7. We now intend to investigate the global structure of the inner coma in order to better understand the different individual effects that come into play (such as "cometary wind" and possible recondensation at inactive regions of the nucleus). The corresponding simulation is done using a 2D non-stationary kinetic model based on the majorant frequency scheme. The combined usage of cell and free cell schemes allows one to achieve a high spatial resolution in the entire flow field, including the region of large gradients. An unsorted grid with a variable cell size depending on the local mean free path is used. The variable hard sphere (VHS) model is applied for the intermolecular potential. The energy exchange between translational and rotational modes follows the Larsen-Borgnakke model. The upstream boundary condition is specified in the frame of the corresponding model of Knudsen flow inside stochastic porous media. The downstream
boundary is chosen such that the flow is supersonic, and a vacuum condition is specified for it, i.e. no particles enter the computational domain. The parallelized version of the model is based on the well known domain decomposition technique. Note that the interaction among the above general model sub-blocks occurs since the intermolecular collisions redirect substantial amounts of gas back toward the nucleus. This must be taken into account in the boundary condition for the heat transfer equation. The ideal thermophysical model of cometary nuclei is therefore a combined model in which the heat transfer and gas diffusion of the nucleus are simulated in parallel with a model of the Knudsen layer, where the latter model is deeply involved in determining the boundary conditions at the nucleus/coma interface. For example, the two models could work alternately in an iterative scheme and converge toward a solution where the temperature profile and gas production rate of the nucleus are fully adapted to, and consistent with, the kinetic structure of the Knudsen layer gas.
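Schematically, such an alternating iteration can be organized as in the following Python sketch. The two solver functions are toy stand-ins whose "physics" is invented purely for illustration; only the structure of the loop reflects the coupling scheme described above.

```python
def solve_nucleus_interior(T_surface, backflux):
    """Toy stand-in for the interior heat/gas model: the surface temperature relaxes
    towards an equilibrium value raised slightly by the returned backflux (all
    numbers are illustrative, not model values)."""
    T_eq = 180.0 + 20.0 * backflux                      # [K]
    T_new = 0.5 * (T_surface + T_eq)
    production = 1.0e18 * (T_new / 180.0) ** 4          # molecules m^-2 s^-1, toy law
    return T_new, production

def solve_knudsen_layer(T_surface, production):
    """Toy stand-in for the kinetic Knudsen-layer model: a fixed fraction of the
    emitted molecules is scattered back towards the surface."""
    return 0.2, 0.2 * production                        # backflux fraction, returned flux

T, back = 150.0, 0.0
for _ in range(50):
    T_new, Q = solve_nucleus_interior(T, back)          # nucleus step
    back, _ = solve_knudsen_layer(T_new, Q)             # kinetic layer step
    if abs(T_new - T) < 1.0e-3:                         # converged boundary state
        break
    T = T_new
print("converged surface temperature [K]:", round(T_new, 2), "production:", f"{Q:.2e}")
```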
3. Basic results and conclusions
Within the frame of the radiation transfer model we developed a general approach for the simulation of the optical characteristics of particulate media (especially cometary nuclei) and of the energy deposition in the near-surface porous layer, and calculated solar flux attenuation profiles inside the porous cometary surface as a function of depth, as well as the wavelength-integrated plane albedo. We presented a practical, accurate, and time-efficient tool which makes it possible to consider the nucleus and the innermost coma of a comet as a coupled, physically consistent system. The tool consists of interpolation tables for the surface gas density and pressure, the recondensing coma backflux, and the cooling energy flux due to scattered coma molecules. The tables cover a wide range of surface temperatures and subsurface temperature profiles. By using the tables, it is therefore possible to include the effects of a dynamic and flexible gas coma during nucleus modelling, almost without increasing the calculation effort. The tool works equally well for thermophysical models with or without layer absorption of solar energy. Based on the SMILE (Statistical Modeling In Low-density Environment) software system we investigated the 2D structures of the near-nucleus gas coma around a set of representative irradiated dust-icy porous spherical bodies. Three different gas release regimes were considered: sublimation from a uniform spherical body; sublimation from a polar cap; sublimation
from a circular strip. The thermophysical models with and without layer absorption of solar energy were tested. In order to calculate the return molecular fluxes of density, impulse and energy, the local plane-parallel simulation of the boundary Knudsen layer was executed for a wide set of model parameters. From the numerical results obtained here, we arrive at the following summary: (i) Under certain circumstances, it is possible for ice-dust mixtures to be darker than the dust itself. Even media consisting of exactly the same materials, in the same relative amounts, and with the same bulk porosity, can have substantially different visual albedos, depending on how the material is geometrically organized. (ii) Solar light can penetrate to substantial depths, and neglecting this phenomenon in thermal models may introduce substantial errors. The solid-state greenhouse effect is observed in all models; it is stronger if the conductivity is low and the rotation of the cometary nucleus is fast. The surface temperature becomes substantially lower, and the subsurface temperature maximum is typically located a few pore radii below the surface. (iii) The main conclusion of our simulation is that in all models where the surface pressure deviates from the corresponding saturation pressure, there is a strong temperature gradient within a thin surface boundary layer. This means that for typical cometary conditions one should consider sublimation from a near-surface layer of finite thickness rather than sublimation from the surface only. In the frame of this approach, for a pure porous ice the kinetic model gives a larger net gas production rate, which exceeds the values obtained from macroscopic models with surface sublimation by a factor of about two. (iv) In all considered cases the major part of the inner coma is in a thermodynamical non-equilibrium state. This means that only molecular kinetic approaches can be used in numerical simulation of the region (e.g. the DSMC method). For "spotted" models, where there are significant small-scale variations in the gas production rate, local effective recondensation as well as spatial inhomogeneities of the innermost coma (fine structures) are observed. Molecular collisions play a dominant role in the innermost coma and, as a result, the sublimating gas outflow is a global phenomenon. Its large-scale field as well as its particular structure depend on the total surrounding and background sublimation.
References
1. Skorov, Y.V., et al.: "A Model of Heat and Mass Transfer in a Porous Cometary Nucleus Based on a Kinetic Treatment of Mass Flow", Icarus, v. 153, 180 (2001).
2. Skorov, Y.V., et al.: "Thermophysical Modelling of Comet P/Borrelly: Effects of Volume Energy Absorption and Volume Sublimation", Earth, Moon, and Planets, v. 90, 293-303 (2002).
3. Davidsson, B.J.R. and Skorov, Y.V.: "On the Light-Absorbing Surface Layer of Cometary Nuclei. I. Radiative Transfer", Icarus, v. 156, 223 (2002).
4. Davidsson, B.J.R. and Skorov, Y.V.: "A Practical Tool for Simulating the Presence of Gas Comae in Thermophysical Modelling of Cometary Nuclei", Icarus (in press).
5. Davidsson, B.J.R. and Skorov, Y.V.: "On the Light-Absorbing Surface Layer of Cometary Nuclei. II. Thermal Modelling", Icarus, v. 159, 239 (2002).
6. Bird, G.A. 1994. Molecular Gas Dynamics and the Direct Simulation of Gas Flows. Oxford University Press.
7. Skorov, Y.V., Rickman, H.: "Gas flow and dust acceleration in a cometary Knudsen layer", Planet. Space Sci., v. 47, 935 (1999).
8. Ivanov, M.S. and Rogasinsky, S.V.: "Theoretical analysis of traditional and modern schemes of the DSMC method", Proc. XVII Inter. Symp. on Rarefied Gas Dynamics, Aachen, 1991.
COMPARATIVE EVALUATION OF SUPPORT VECTOR MACHINES AND PROBABILISTIC NEURAL NETWORKS IN SUPERFICIAL BLADDER CANCER CLASSIFICATION P. SPYRIDONOS, P. PETALAS, D. GLOTSOS, G. NIKIFORIDIS Computer Laboratory, School of Medicine, University of Patras, Rio, Patras, 265 00, Greece E-mail: [email protected]
D. CAVOURAS Department of Medical Instrumentation Technology, Technological Education Institution of Athens, Ag. Spyridonos Street, Aigaleo, 122 10, Athens, Greece. E-mail: [email protected] P. RAVAZOULA Department of Pathology, University Hospital, Rio, Patras, 265 00, Greece
In this study a comparative evaluation of Support Vector Machines (SVMs) and Probabilistic Neural Networks (PNNs) was performed, exploring their ability to classify superficial bladder carcinomas as low or high risk. Both classification models resulted in relatively high overall accuracies of 85.3% and 83.7%, respectively. Descriptors of nuclear size and chromatin cluster patterns participated in both of the best feature vectors that optimized the classification performance of the two classifiers. Concluding, the good performance and consistency of the SVM and PNN models reinforce the belief that certain nuclear features carry significant diagnostic information and render these techniques viable alternatives in the diagnostic process of assigning grade to urinary bladder carcinomas.
1. Introduction
Bladder cancer is the fifth most common cancer in the western male population1. Microscopic visual analysis of histopathological material provides an index of disease severity, and tumor grading according to the degree of malignancy determines the choice and form of treatment2. However, the recognition of a variety of histopathological findings in tissue biopsies by humans requires a high level of skill and knowledge. In addition, low inter- and intra-observer reproducibility has been shown to influence the quality of
diagnosis3. Previous studies aimed at bladder cancer classification have described methods based on quantitative structural and textural tissue features4,5. More recent approaches have proposed the application of Bayesian Belief Networks in grade diagnosis of bladder cancer, utilizing histological features estimated subjectively by pathologists6. So far, little effort has been made to investigate the potential of morphological and textural nuclear features in computer-based grading systems7. The first thing that must be taken into consideration when designing a recognition system is to achieve the best possible classification performance for a particular task. Experimental results from different classification designs would be the basis for choosing the optimal classification model for the particular application. In this study a comparative evaluation of SVMs8 and PNNs9 was performed, exploring their ability to classify superficial bladder carcinomas as low or high risk according to the WHO grading system10.
2. Materials and Methods
129 cases with bladder cancer were collected from the University Hospital of Patras in Greece. Of the 129 patients, 92 were diagnosed as low risk and 37 as high risk by two independent pathologists. Images from tissue specimens were captured using a light microscopy imaging system. From each case a representative sample of nuclei (about 70) was isolated using an automatic segmentation technique11 (Figure 1). Two kinds of quantitative parameters were estimated: morphological features related to nuclear size and shape distribution, and textural features related to nuclear chromatin organization12.
Figure 1. Tissue sample of bladder carcinoma and the resulting segmented image.
In the training process of the SVM and the PNN no iterative procedures are used and no feedback paths are required. The latter enabled us to perform an exhaustive search for the optimal selection of the best feature vector combination. For each feature combination the classifier performance was tested by means of the leave-one-out method7.
3. Results
Both classification models resulted in relatively high overall accuracies of 85.3% and 83.7%, respectively, in discriminating low from high-risk tumors. Considering the SVM classifier, the best feature vector combination comprised two morphological features describing nuclear shape distribution (range of area and standard deviation of area) and two textural features derived from co-occurrence matrices encoding chromatin cluster patterns. The textural features were the cluster shade for inter-pixel distances d=1 and d=3. 91.3% (84/92) of low-risk cases were correctly classified. High-risk classification success was 70.3% (26/37) (Table 1). For the PNN classifier the best classification result was obtained utilizing a 4-dimensional feature vector consisting of three morphological features (nuclear concavity, range of roundness and standard deviation of area) and one textural feature from co-occurrence matrices, the cluster shade for inter-pixel distance d=1. Low-risk cases were classified with 88% accuracy, whereas high-risk cases with 73% (Table 2).
Table 1. Classification results using the SVM classifier.
Patient outcome     Classified low-risk    Classified high-risk    Accuracy
Low-risk            84                     8                       91.3%
High-risk           11                     26                      70.3%
Overall accuracy                                                   85.3%
Table 2. Classification results using the PNN classifier.
Patient outcome     Classified low-risk    Classified high-risk    Accuracy
Low-risk            81                     11                      88.0%
High-risk           10                     27                      73.0%
Overall accuracy                                                   83.7%
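The evaluation protocol described in the Methods section (exhaustive search over feature combinations, each scored by leave-one-out classification) can be sketched as follows. A simple nearest-mean rule stands in for the SVM and PNN classifiers, and the data are random; the sketch only illustrates the search structure, not the actual classifiers or features used in this study (with scikit-learn, for example, one could substitute an SVM for the toy rule).

```python
import numpy as np
from itertools import combinations

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-class-mean rule (toy classifier)."""
    correct = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        Xtr, ytr = X[mask], y[mask]
        means = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
        pred = min(means, key=lambda c: np.linalg.norm(X[i] - means[c]))
        correct += (pred == y[i])
    return correct / len(y)

def best_feature_subset(X, y, max_size=4):
    """Exhaustive search over all feature subsets up to max_size features."""
    best_acc, best_subset = -1.0, None
    for k in range(1, max_size + 1):
        for subset in combinations(range(X.shape[1]), k):
            acc = loo_accuracy(X[:, subset], y)
            if acc > best_acc:
                best_acc, best_subset = acc, subset
    return best_acc, best_subset

# toy data: 129 cases (92 low-risk, 37 high-risk), 10 random candidate features
rng = np.random.default_rng(1)
X = rng.normal(size=(129, 10))
y = np.array([0] * 92 + [1] * 37)
print(best_feature_subset(X, y, max_size=2))
```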
4. Discussion
Utilizing the diagnostic potential of nuclear features, two different classification designs based on SVMs and PNNs were tested according to their ability to differentiate superficial urinary bladder tumors. The most attractive
characteristic of SVMs8 is that they provide bounds on the generalization error of the model in the framework of structural risk minimization. On the other hand, the decision boundary implemented by PNNs9 asymptotically approaches the Bayes optimal decision surface. Additionally, the settings of the selected classifiers are simple and computationally efficient. The rather smaller classification accuracy in the high-risk group compared with that of the low-risk group might be due to the different sizes of the data sets (37 high-risk cases versus 92 low-risk). Worth mentioning is that both classification models indicated relevant features during the optimization of their performance. Descriptors of nuclear size and chromatin cluster patterns, such as the standard deviation of area and the cluster shade, participated in both of the best feature vectors that optimized the classification performance of the two classifiers. The latter reinforces the belief that certain nuclear features carry significant diagnostic information. Concluding, the good performance and consistency of the SVM and PNN models render these techniques viable alternatives in the diagnostic process of assigning grade to urinary bladder tumors.
5. References
1. S. Parker, T. Tony, S. Bolgen, P. Wingo, CA Cancer J Clin, 47, 5 (1997).
2. D. Bostwick, D. Rammani, L. Cheng, Urol Clin North Am, 26, 493 (1999).
3. E. Ooms, W. Aderson, C. Alons, M. Boon, and R. Veldhuizen, Human Pathology, 14, 140 (1983).
4. H-K. Choi, J. Vasko, E. Bengtsson, T. Jakrans, U. Malmstrom, K. Wester and C. Bush, Analytical Cellular Pathology, 6, 327 (1994).
5. T. Jakrans, J. Vasko, E. Bengtsson, H-K. Choi, U. Malmstrom, K. Wester and C. Bush, Analytical Cellular Pathology, 18, 135 (1995).
6. R. Montironi, European Urology, 41, 449 (2002).
7. P. Spyridonos, V. Ravazoula, D. Cavouras, K. Berberidis, G. Nikiforidis, Medical Informatics and the Internet in Medicine, 26, 179 (2001).
8. V. Kechman, MIT (2001).
9. D. Specht, Neural Networks, 3, 109 (1990).
10. F. Mostofi, L. Davis, and I. Sesterhenn, Springer (1999).
11. P. Spyridonos, D. Cavouras, V. Ravazoula, and G. Nikiforidis, Medical Informatics, 27, 111 (2002).
12. J. Kitler, M. Hatef, P. Duin, and J. Matas, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 226 (1998).
COMPUTATIONAL ASPECTS OF THE REFINEMENT OF 3D COMPLEX MESHES
J. P. SUAREZ AND P. ABAD University of Las Palmas de Gran Canaria, Department of Cartography and Graphic Engineering, Las Palmas de Gran Canaria, 35017, SPAIN E-mail: jsuarez@dcegi.ulpgc.es
A. PLAZA Department of Mathematics, Las Palmas de Gran Canaria, 35017, SPAIN E-mail: aplaza@dma.ulpgc.es
M. A. PADRON Department of Civil Engineering, Las Palmas de Gran Canaria, 35017, SPAIN E-mail: mpadron@dic.ulpgc.es
The refinement of tetrahedral meshes is a significant task in many numerical and discretization methods. The computational aspects of implementing refinement of meshes with complex geometry need to be carefully considered in order to obtain real-time and optimal results. In this paper we enumerate the relevant computational aspects of tetrahedral refinement algorithms. For local and adaptive refinement we give numerical results on the computational propagation cost of a general class of longest edge based refinement and show the implications of the complex geometry for the global process.
1. Motivation and Introduction
Triangulating a set of points plays a central role in Computational Geometry, and is a basic tool in many other fields such as Computer Graphics, Geometric Modeling and Finite Element Methods [1, 2, 3, 5]. A related problem that is also of considerable interest is the refinement of a mesh. The refinement problem can be described as any technique involving the insertion of additional vertices in order to produce meshes with desired features. The manipulation of meshes in real-time applications requires particular strategies to adaptively refine or simplify the domain. There are, essentially, two approaches to constructing an approximation from a hierarchy: bottom-up and top-down. This has led to the improvement of algorithms to proceed with those
operations. The (uniform) refinement problem differs from the local one in that the insertion and location of vertices in the refinement is the same over each triangle in the mesh. These are essentially bottom-up. The run-time of local refinement algorithms is considerably influenced by the refinement propagation: the extent of secondary refinements induced in neighboring elements by the initiating element. In particular, a bisection algorithm in a parallel environment has been presented in [2], where the authors referred to as an ominous problem the fact that the propagation could reach the worst case, that is: refining a single tetrahedron also implies the refinement of all the tetrahedra in the mesh. Figure 1 shows the worst case of a propagation in 2D.
Figure 1. Longest edge refinement propagation. The dependencies in the propagation when refining t are indicated by arrows.
2. Two-steps refinement problem
The local refinement of triangular meshes involves two main tasks. The first is the partition of the target triangles and the second is the propagation or extension needed to avoid 'cracks' in the resulting mesh. The uniform refinement, however, does not need to consider the propagation, as the refinement is uniform over the mesh. Definition (Single bisection and LE bisection). Single bisection consists of dividing the tetrahedron into two sub-tetrahedra by the midpoint of one of the edges. When the longest edge is chosen to bisect the tetrahedron, we say that a longest edge (LE) bisection has been done. Definition (Standard Partition). The original tetrahedron is divided into eight sub-tetrahedra by cutting off the four corners by planes through the midpoints of the edges and subdividing the remaining octahedron by selecting one of three possible interior diagonals, Figure 2. This interior diagonal has to be chosen carefully in order to satisfy the non-degeneracy of the meshes obtained when the partition is recursively applied.
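A single longest-edge bisection is straightforward to express in code; the following Python sketch is our own illustrative implementation (not the authors' software) and splits a tetrahedron at the midpoint of its longest edge.

```python
import numpy as np
from itertools import combinations

def le_bisect(tet):
    """tet: (4, 3) array of vertex coordinates.  Returns the two child tetrahedra
    produced by longest-edge (LE) bisection."""
    tet = np.asarray(tet, dtype=float)
    # find the longest of the six edges
    edges = list(combinations(range(4), 2))
    i, j = max(edges, key=lambda e: np.linalg.norm(tet[e[0]] - tet[e[1]]))
    mid = 0.5 * (tet[i] + tet[j])
    others = [k for k in range(4) if k not in (i, j)]
    child1 = np.vstack([tet[i], mid, tet[others[0]], tet[others[1]]])
    child2 = np.vstack([tet[j], mid, tet[others[0]], tet[others[1]]])
    return child1, child2

reference = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
for child in le_bisect(reference):
    print(child)
```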
Figure 2. Standard Partition of a tetrahedron and the faces unfolded in the plane.
The second task in the local refinement problem is to ensure the conformity of the mesh. It is necessary to determine additional irregular patterns, which make it possible to extend the refinement to neighboring triangles.
3. Computational aspects of the local refinement
We summarize the computational aspects of the refinement strategies in Section 2 as follows: (i) partitioning of the target tetrahedra and (ii) propagation of the refinement. The first step can usually be performed in constant time as long as the algorithm implements pre-computed partitions. So the main computational effort of the local algorithms is the propagation of the refinement. The propagation is a necessary task for avoiding non-conforming nodes, that is, for meeting the requirement that the intersection of non-disjoint tetrahedra is either a common vertex or a common side, and also for producing gradual and smooth meshes. The main strategy to implement a propagation is based on the longest edge, and consists of repeating the refinement in neighboring elements as long as a non-conforming node is present. Our series of experiments shows numerically that the propagation in 3D approaches a fixed number between 10 and 20. It depends on the tetrahedra geometry of the initial mesh and on the partition type applied for refinement. The lowest limit of the propagation holds for Delaunay meshes and for the standard partition. This result emphasizes that Delaunay meshes and the standard partition constitute the best choice for the two-steps refinement problem in many 3D problems. A theoretical proof of that result is in progress and will be published in a future work.
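The propagation itself can be sketched as the traversal of the longest-edge neighbour chain. The 2D Python sketch below is illustrative only (data structures and the termination test are simplified assumptions): it follows neighbours across the current longest edge until a boundary edge or a shared longest edge is reached, and the length of this path is the propagation cost discussed above.

```python
import numpy as np

def longest_edge(tri, pts):
    """Return the pair of vertex indices forming the longest edge of a triangle."""
    edges = [(tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])]
    return max(edges, key=lambda e: np.linalg.norm(pts[e[0]] - pts[e[1]]))

def propagation_path(start, triangles, pts):
    """triangles: list of vertex index triples.  Returns the indices of the
    triangles visited by the longest-edge propagation started at 'start'."""
    path, current = [start], start
    while True:
        e = frozenset(longest_edge(triangles[current], pts))
        # neighbour sharing the current longest edge, if any
        nbr = next((k for k, t in enumerate(triangles)
                    if k != current and e <= set(t)), None)
        if nbr is None or nbr in path:
            return path            # boundary edge or shared longest edge: stop
        path.append(nbr)
        current = nbr

pts = np.array([[0, 0], [1, 0], [2, 0.2], [0.5, 1], [1.6, 1.1]])
triangles = [(0, 1, 3), (1, 2, 4), (1, 4, 3)]
print(propagation_path(0, triangles, pts))   # e.g. [0, 2, 1]
```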
4. A test mesh as a case study
As a real case study we provide a mesh like the one in Figure 3 (left). The initial mesh is a Delaunay mesh. The complexity of such a mesh is notorious: an interior cavity is carved inside a spherical domain. Inside the cavity, a salient ellipsoid is located. The refinement around the ellipsoid provides accuracy in the model. For a clear visualization, the sphere has been cut through the middle, and hence the interior cavity and the ellipsoid can be noted (the blue area corresponds to interior tetrahedra).
Three levels of refinement are performed, following the standard partition for every target tetrahedron and a propagation following the longest edge. Table 1 summarizes the results on the propagation of the mesh.
Figure 3. Test mesh with an interior cavity.
Table 1. Propagation results for the test mesh.
Refinement level    # Tetrahedra    Propagation mean
1                   1927            97.7000
2                   15416           24.1951
3                   123328          13.2069
5. Conclusions
We have highlighted in this work the main source of computational effort of common algorithms for local refinement. These algorithms are key tools for the discretization process needed in many numerical methods, such as the Finite Element Method, Volume methods, etc. We have numerically shown an important result regarding the propagation of these algorithms, which holds for Delaunay meshes and for the standard partition. This is a useful basis for users and engineers who often use meshing algorithms for a variety of application problems.
References
1. Bern M. and Plassmann P., Mesh generation, in Handbook of Computational Geometry, J.-R. Sack, J. Urrutia Eds. (2000), 291-332.
2. Jones M.T. and Plassmann P.E., Parallel algorithms for adaptive mesh refinement, SIAM J. Sci. Comp. (1997), 18:686-708.
3. Rivara M.C., Hitschfeld N. and Simpson B., Terminal-edges Delaunay (small-angle based) algorithm for the quality triangulation problem, Computer-Aided Design (2001), 33:263-277.
4. Suarez, J.P. and Plaza, A., The propagation problem in longest-edge based refinement algorithms, KJMA Institute Report-1/02, Universidad de Las Palmas de Gran Canaria, 2002.
5. Suarez, J.P., Carey, G.F. and Plaza, A., Graph based data structures for skeleton based refinement algorithms, Comm. Num. Meth. Eng., 17, pp. 903-910, 2001.
THE IMPACT OF GRAPHICS CALCULATOR ON MATHEMATICS EDUCATION IN ASIA
CHE-YIN SUEN Chairman of Mathematics Department Hong Kah Secondary School, Ministry of Education, Singapore E-mail: [email protected]
Background
The author is the chairman of the mathematics department in a Singapore high school. He finds that the standard of mathematics in Asia has been very high. However, the students in Asia do not benefit as much as those in other regions from the rapid development of IT teaching of mathematics in recent years. Most of the students in Asia come from low-income families and they cannot afford advanced personal computers. They also have difficulty in paying the high cost of internet surfing. In school, the situation is not much better. The number of personal computers is far below the number required. This slows down the development and implementation of IT teaching in Asia.
Abstract
It seems that the situation will not be improved much in the near future. The educators in Asia hence introduced the Graphics Calculator. The Graphics Calculator may not be as powerful as computer software in some aspects. However, it has the following advantages: 1. It can be used in teaching and exploring effectively, 2. It is affordable to most of the Asian students, 3. It has small size and light weight, which enable it to be carried around easily, 4. It can be used in examination and other assessment environments, while a personal computer cannot be used for these purposes.
Advantage 4 is very important. All educators agree that the main objective of mathematics assessments is to provide a good indicator of how students understand certain mathematical concepts, and this indicator should not be obscured by complicated arithmetic manipulation. The Graphics Calculator can help the students skip the tedious manipulation. This is the main reason that many developed countries, such as Australia, the United States of America and Canada, introduced the Graphics Calculator in examinations in the nineties.
Like other new technology, the introduction of the Graphics Calculator faces some resistance from some teachers. The teachers worry that the students will no longer be keen to learn basic skills, such as the solving of simultaneous equations, as these can be solved easily by Graphics Calculators. The teachers also do not like to make great changes in the traditional teaching syllabus following the impact of the Graphics Calculator. In the seventies, educators introduced the Scientific Calculator in mathematics education. The introduction also faced similar resistance from mathematics teachers. After some adjustment of the teaching syllabus, the Scientific Calculator was introduced very smoothly and successfully. After the change, the worries of the educators did not materialize. With the help of the new technology, people find that calculation skill becomes less and less important in daily life. Nowadays, nearly every student in all developed and developing countries uses a Scientific Calculator in mathematics lessons every day. Similar to the Scientific Calculator, the Graphics Calculator also brings some impacts. Educators have to make adjustments not only in their teaching syllabus, but also in the teaching of mathematical concepts. We can solve a mathematics problem in many ways. Many of these problems can be solved by graphical methods. However, the drawing of graphs is time-consuming and tedious. The degree of accuracy in a graphical method is not high. Many teachers, students and textbooks do not highlight the importance of this method. One of the strengths of the Graphics Calculator is that it enables students to draw graphs easily and accurately. With the help of the Graphics Calculator, the graphical method becomes efficient and effective. We have to change our mind set. The most efficient way in the past may no longer be the best method for the present. Hence the teaching syllabuses have to be modified accordingly. For example, the factorization of quadratic expressions used to be a very important part of algebra. It helps the students solve a quadratic equation effectively:
x² − 6x + 8 = (x − 2)(x − 4)
Students take one or two weeks to learn and get familiarized with this skill. Some of them do not know the meaning of the skill. They are told that this is the necessary step to find the solution of the related quadratic equation:
x² − 6x + 8 = 0
With the use of the Graphics Calculator, students can find the roots of the equation at their finger tips. From the graph, they can see the whole picture easily and understand the relation among the equation, the expression and the graph. Without the Graphics Calculator, drawing the graph is time-consuming and students are reluctant to use the graph to learn the relation.
Solving inequalities of one variable by algebraic methods will also be redundant to learn after the introduction of the Graphics Calculator. The solution can be read directly from the graph. Example: solving |2x − 1| + |3x − 2| > |4x − 1| would take more than 10 minutes in the past. We can read the solution from the graph of y = |2x − 1| + |3x − 2| − |4x − 1| in less than a minute: the inequality holds where the graph lies above the line y = 0.
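A quick numerical check of this example (illustrative only, using a plain grid evaluation in Python rather than a Graphics Calculator) confirms what is read off the graph.

```python
import numpy as np

# sample y = |2x - 1| + |3x - 2| - |4x - 1| and report where it is positive
x = np.linspace(-3.0, 3.0, 60001)
y = np.abs(2*x - 1) + np.abs(3*x - 2) - np.abs(4*x - 1)
pos = x[y > 0]
print("y > 0 roughly for x <", round(pos[pos < 1].max(), 3),
      "and for x >", round(pos[pos > 1].min(), 3))
# prints values close to 4/9 and 2, i.e. the solution x < 4/9 or x > 2
```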
From these examples, we can see the impact clearly. Before the use of the Graphics Calculator, students use factorization to find the roots and then sketch the graph. The Graphics Calculator enables the student to draw the graph and find the roots directly from the graph. Hence the learning of factorization becomes less important. A few Asian countries are using the Graphics Calculator in examinations. Some other countries will follow in one or two years' time. In his presentation, the author will introduce the impact with some examples and describe how the educators in Asia modify the teaching syllabus to adapt to the impact. During the modification, some unforeseen difficulties were encountered, and the author will also report how the educators overcame these difficulties. The author will also demonstrate some of his own Graphics Calculator activities with which students can learn and explore mathematical concepts.
DENSITY FUNCTIONALS FOR MOMENTS OF THE ELECTRON MOMENTUM DISTRIBUTION
AJIT J. THAKKAR Department of Chemistry, University of New Brunswick, Fredericton, New Brunswick E3B 6E2, Canada
The first-order density matrix is related to the electronic wave function by1

Γ(x|x′) = N ∫ Ψ*(x′, x₂, …, x_N) Ψ(x, x₂, …, x_N) dx₂ … dx_N    (1)

where N is the number of electrons, and the space-spin coordinate x_k = (r_k, σ_k) is composed of the position vector of the kth electron, r_k, and its spin coordinate σ_k. When electronic spin is not a focus of attention, it is sufficient to use a spin-traced version defined by

Γ(r|r′) = ∫ Γ(x|x′) δ(σ − σ′) dσ dσ′    (2)

Then the electron density is given by

ρ(r) = lim_{r′→r} Γ(r|r′) = Γ(r|r)    (3)

and the one-electron momentum density by2,3

Π(p) = (2π)⁻³ ∫ Γ(r|r′) exp[+i p·(r′ − r)] dr dr′    (4)
The electron density tells us where the electrons are because ρ(r) dr is N times the probability of finding an electron within an infinitesimal volume element dr centered at r. The electron momentum density tells us how fast the electrons are moving because Π(p) dp is N times the probability of finding an electron with a momentum p within an infinitesimal volume element dp. The Hohenberg-Kohn theorem4,5 of density functional theory states that the ground state electron density ρ(r) determines all the properties of the ground state. In particular, the electron momentum density Π(p) is determined by the electron density. Although this is true in principle, there is no known direct route from ρ to Π. Both ρ and Π can be obtained from the density matrix Γ as shown in Eqs. (3)-(4) but the wave function is needed to obtain Γ. Even if the exact exchange-correlation functional were known and used in a density functional computation using the Kohn-Sham procedure,6 the Slater determinant of Kohn-Sham orbitals leads to a density matrix for a system of non-interacting electrons and not the true many-body system. Thus, the common practice of using Kohn-Sham orbitals to construct Γ is not correct and can lead to inaccurate results. A separate density functional is required for each property. In this work we focus on density functionals for moments of the electron momentum density defined by

⟨p^k⟩ = ∫ p^k Π(p) dp,    −2 ≤ k ≤ 4.    (5)
Moments with k outside the range given above are infinite for the exact momentum density although not necessarily so for approximate momentum densities. The k = 0 case is simply the normalization condition

⟨p⁰⟩ = ∫ Π(p) dp = N    (6)
The ⟨p²⟩ moment is twice the electronic kinetic energy. The ⟨p⁻¹⟩ moment is twice the peak height of the Compton profile, a measurable quantity.8 Most of the other moments also have physical interpretations.9 Quasi-classical phase-space arguments by Burkhardt,10 Konya,11 and Coulson and March12,13 have led to the following approximate density functionals

⟨p^k⟩ ≈ I_k,    −1 ≤ k ≤ 2    (7)

in which

I_k = [3 (3π²)^{k/3} / (k + 3)] ∫ [ρ(r)]^{(k+3)/3} dr    (8)

The limits on k are more restrictive in Eq. (7) than in Eq. (5) because the moments with k = −2, 3 and 4 incorrectly diverge in the quasi-classical regime. Of course, normalization gives

I₀ = ∫ ρ(r) dr = N    (9)
and comparison of Eq. (6) with Eq. (9) shows that the k = 0 case of Eq. (7) is exact: ⟨p⁰⟩ = I₀. These relationships are particularly interesting because they give a direct route from the electron density to moments of the electron momentum density. Two of the I_k integrals are familiar from Thomas-Fermi-Dirac theory5,14 in which the kinetic energy is approximated by I₂/2, and the exchange energy by I₁/π. The Thomas-Fermi approximation for the kinetic energy has been tested for 77 molecules.15 Several studies16-22 found that the density functionals of Eq. (7) were moderately accurate at the Hartree-Fock level for many atoms and a few diatomic molecules. Pathak et al.23 suggested the same relationship (7) for the remaining allowable values of k = −2, 3 and 4. They examined the approximation (7) for the extended range −2 ≤ k ≤ 4 for 35 diatomic molecules at the Hartree-Fock level. They conjectured the bounds

⟨p^k⟩ ≤ I_k,    k = −2, −1,    (10)

and

⟨p^k⟩ ≥ I_k,    k = 1, 2, 3, 4,    (11)
on the basis of their molecular data and previously published atomic data. Thakkar and Pedersen9 continued this study by computing data for 122 linear molecules at the Hartree-Fock level. The bounds of Eqs. (10)-(11) were satisfied in all cases. They also proposed several models relating ⟨p^k⟩ and I_k more closely. Their most promising functional was given by Eq. (12).
Equation (12) is appealing because it correctly reduces to the quasi-classical result of Eq. (7) in the large N limit. Thakkar and Hart24 extended this study to 317 molecules, most of which were polyatomic and of C1 symmetry, using electron densities and electron momentum densities computed by Hartree-Fock, Møller-Plesset perturbation theory, and coupled cluster methods. A different way to improve the semi-classical functionals of Eq. (7) is to use gradient corrections.25 The gradient-corrected functionals can be written as follows

    \langle p^k \rangle \approx I_k + J_k        (13)

where J_k is a gradient-correction term constructed from the density gradient ∇ρ.
Thakkar15 tested the kinetic energy (k = 2) case for 77 molecules. Thakkar and Hart24 examined the accuracy of the gradient-corrected functionals of Eq. (13) for 317 molecules. A survey of the recent studies of Thakkar and Hart24 and their implications will be presented in my talk.
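As a concrete illustration of how Eqs. (7)-(9) map an electron density directly onto momentum moments, the short sketch below evaluates I_k numerically for the ground-state hydrogen-atom density ρ(r) = e^{-2r}/π (atomic units). The radial grid, the test density and the use of NumPy are illustrative choices, not part of the paper; the sketch simply checks that I₀ recovers the electron number N and prints I_k for a few allowed k.

```python
import numpy as np

def I_k(k, rho, r):
    """Quasi-classical momentum-moment functional I_k (Eq. (8)),
    evaluated for a spherically symmetric density rho(r) on a radial grid."""
    prefac = 3.0 / (k + 3.0) * (3.0 * np.pi**2) ** (k / 3.0)
    integrand = rho ** ((k + 3.0) / 3.0) * 4.0 * np.pi * r**2
    return prefac * np.trapz(integrand, r)

# Illustrative test density: hydrogen 1s, rho(r) = exp(-2r)/pi  (N = 1)
r = np.linspace(1e-6, 30.0, 50_000)
rho = np.exp(-2.0 * r) / np.pi

print("I_0 (should equal N = 1):", I_k(0, rho, r))
for k in (-1, 1, 2):
    print(f"I_{k} =", I_k(k, rho, r))
```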
References
1. P. O. Löwdin, Phys. Rev. 97, 1474 (1955).
2. A. J. Thakkar, A. M. Simas and V. H. Smith, Jr., Chem. Phys. 63, 175 (1981).
3. A. J. Thakkar, Adv. Chem. Phys., in press (2003).
4. P. Hohenberg and W. Kohn, Phys. Rev. 136B, 864 (1964).
5. R. G. Parr and W. Yang, Density Functional Theory of Atoms and Molecules (Oxford University Press, New York, 1989).
6. W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
7. A. J. Thakkar, J. Chem. Phys. 86, 5060 (1987).
8. B. G. Williams, Ed., Compton Scattering: The Investigation of Electron Momentum Distributions (McGraw-Hill, New York, 1977).
9. A. J. Thakkar and W. A. Pedersen, Int. J. Quantum Chem. Symp. 24, 327 (1990).
10. G. Burkhardt, Ann. Phys. (Leipzig) 26, 567 (1936).
11. A. Kónya, Hung. Acta Phys. 1, 12 (1949).
12. C. A. Coulson and N. H. March, Proc. Phys. Soc. (London) A63, 367 (1950).
13. N. H. March, Theor. Chem. Spec. Per. Rep. 4, 92 (1981).
14. N. H. March, Adv. Phys. 6, 1 (1957).
15. A. J. Thakkar, Phys. Rev. A 46, 6920 (1992).
16. R. K. Pathak and S. R. Gadre, J. Chem. Phys. 74, 5925 (1981).
17. R. K. Pathak and S. R. Gadre, J. Chem. Phys. 77, 1073 (1982).
18. N. L. Allan and N. H. March, Int. J. Quant. Chem. Symp. 17, 227 (1983).
19. R. K. Pathak, S. P. Gejji and S. R. Gadre, Phys. Rev. A 29, 3402 (1984).
20. N. L. Allan, D. L. Cooper, C. G. West, P. J. Grout and N. H. March, J. Chem. Phys. 83, 239 (1985).
21. N. L. Allan and D. L. Cooper, J. Chem. Phys. 84, 5594 (1986).
22. N. H. March, Int. J. Quant. Chem. Symp. 20, 367 (1986).
23. R. K. Pathak, B. S. Sharma and A. J. Thakkar, J. Chem. Phys. 85, 958 (1986).
24. A. J. Thakkar and J. R. Hart, to be published (2003).
25. G. I. Plindov and I. K. Dmitrieva, J. Physique 45, L419 (1984).
RELATIONSHIP BETWEEN CAROTID PLAQUE COMPOSITION AND EMBOLIZATION RISK ASSESSED BY COMPUTER PROCESSING OF ULTRASOUND IMAGES P. THEOCHARAKIS, I. KALATZIS AND N. PILIOURAS Department of Medical Instrumentation Technology, Technological Education Institution of Athens, Ag. Spyridonos Street, Egaleo, 122 10, Athens, Greece N. DIMITROPOULOS Medical Imaging Dept., EUROMEDICA Medical Center, 2 Mesogeion Avenue, Athens, Greece
E. VENTOURAS AND D. CAVOURAS† Department of Medical Instrumentation Technology, Technological Education Institution of Athens, Ag. Spyridonos Street, Egaleo, 122 10, Athens, Greece e-mail: [email protected]
The purpose of this work was to process carotid plaque ultrasound images, employing pattern recognition methods, for assessing the embolization risk factor associated with carotid plaque composition. Carotid plaques of 56 ultrasound images displaying carotid artery stenosis were categorized by means of the gray scale median (GSM) as high risk of causing brain infarcts (GSM≤50 gray level) and low-risk (GSM>50 gray level), and in accordance with the physician's assessment and final clinical outcome. In each plaque image, the ratio of echo-dense to echo-lucent area was automatically calculated and it was combined with other textural features, calculated from the image histogram, the co-occurrence matrix, and the run-length matrix. These features were employed as input to two classifiers, the quadratic Bayesian (QB, accuracy 91.1%) and the support vector machine (SVM, accuracy 100%), which were trained to characterize plaques as either high risk or low risk of causing brain infarcts.
1. Introduction
It has been reported that the majority of stroke patients with severe carotid artery stenosis are previously asymptomatic.1 Atherosclerotic carotid plaque composition is an important factor in the development of symptoms, and ultrasound (US) is the non-invasive method most often used in routinely assessing plaque structure, stenosis, and composition.2 However, estimation of the risk factor by US involves the subjective evaluation of B-mode images by the physician and, thus, it depends upon the experience of the operator.

Please address correspondence to: Prof. D. Cavouras, Ph.D., Dept. of Med. Inst., TEI of Athens, Tel: (+30) 210-5385-375 (work) - Fax: (+30) 210-5910-975 (work), E-mail: cavouras@teiath.gr.
Objective methods for estimating the risk associated with carotid plaque mainly concern evaluation of the gray-scale median value (GSM) from the US images.3 The purpose of this work was to analyze the plaque's composition by computer processing, using pattern recognition methods, in order to assess the risk of embolization. Plaque composition was represented by textural features, calculated from the carotid plaque's US image. These feature combinations were used as input to two classifiers, the classical quadratic Bayesian (QB)4 and the recently developed support vector machines (SVM).5
2. Materials and Methods
The study comprised 56 ultrasound images of 56 patients displaying carotid artery stenosis. The boundary of each carotid plaque was delineated by the sonographer by means of a custom-made software system. Carotid plaques were categorized on the basis of the gray scale median (GSM):3 13 as high-risk (GSM≤50 gray level) of causing brain infarcts, and 43 as low-risk (GSM>50 gray level), in accordance with the physician's assessment and final clinical outcome. A structural-index was calculated from each plaque image, the ratio of echo-dense to echo-lucent plaque content. Each plaque image was processed by characterizing each pixel as echo-dense or echo-lucent depending on its gray-level proximity to preset values of 113 and 40 respectively. These values were the gray-level averages of a large sample of dense and lucent US carotid plaque structures, interactively selected by the physician. Additionally, 38 textural features were calculated from the image histogram, the co-occurrence6 and the run-length7 matrices. Determination of the best combination between the structural-index and textural features was based on the performance of the classifier. System evaluation was performed by means of the leave-one-out method and the accuracy of classification was evaluated exhaustively by combining features in all possible ways to design the classifier. The textural features with the higher classification accuracy were employed as input to the classifiers, which were trained to characterize plaques as high risk or low risk of causing brain infarcts.
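The pixel labelling step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the threshold values 113 and 40 are taken from the text, while the nearest-value assignment rule, the array names and the use of NumPy are assumptions made for the example.

```python
import numpy as np

ECHO_DENSE_LEVEL = 113   # preset gray-level average for echo-dense tissue (from the text)
ECHO_LUCENT_LEVEL = 40   # preset gray-level average for echo-lucent tissue (from the text)

def structural_index(plaque_pixels):
    """Ratio of echo-dense to echo-lucent area inside the delineated plaque.

    plaque_pixels: 1-D array of gray levels belonging to the segmented plaque.
    Each pixel is assigned to the class whose preset gray level is closer.
    """
    pixels = np.asarray(plaque_pixels, dtype=float)
    dense = np.abs(pixels - ECHO_DENSE_LEVEL) < np.abs(pixels - ECHO_LUCENT_LEVEL)
    n_dense = int(dense.sum())
    n_lucent = int(pixels.size - n_dense)
    return n_dense / max(n_lucent, 1)   # guard against an all-dense plaque

# Illustrative use with random gray levels standing in for a segmented plaque
rng = np.random.default_rng(0)
fake_plaque = rng.integers(0, 256, size=5000)
print("structural index:", structural_index(fake_plaque))
```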
3. Results
Tables 1 and 2 show the highest classification accuracy achieved by the QB and the SVM classifiers respectively, employing the "structural-index" - "mean gray-level" feature combination.
Table 1. Truth table for QB classification of high-risk and low-risk carotid plaques.

                          QB classification
  Group              High risk   Low risk   Accuracy
  High risk               13          0      100.0%
  Low risk                 5         38       88.4%
  Overall accuracy                            91.1%

Table 2. Truth table for SVM classification of high-risk and low-risk carotid plaques.

                          SVM classification
  Group              High risk   Low risk   Accuracy
  High risk               13          0      100.0%
  Low risk                 0         43      100.0%
  Overall accuracy                           100.0%
SVM precision was achieved with 5 support vectors, for the parameter C equal to 100 and using the polynomial kernel of degree d=2.
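A leave-one-out evaluation of an SVM with these settings can be sketched as below. The kernel (polynomial, degree 2), C = 100 and the leave-one-out protocol come from the text; the use of scikit-learn, the feature standardisation and the placeholder data are assumptions of this illustration rather than the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import LeaveOneOut, cross_val_score

# X: one row per plaque with the two selected features
# (structural index, mean gray level); y: 1 = high risk, 0 = low risk.
rng = np.random.default_rng(1)
X = rng.normal(size=(56, 2))            # placeholder features for illustration
y = (rng.random(56) < 13 / 56).astype(int)

clf = make_pipeline(StandardScaler(),
                    SVC(kernel="poly", degree=2, C=100.0))
acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.3f}")
```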
4. Discussion
Atherosclerosis of the carotid artery is a major factor contributing to brain infarcts and there is clearly a need to assess the risk involved in brain infarct-producing plaques so that surgery may be avoided. Since composition and structure of the carotid plaque have been indicated as important factors leading to embolic events, we have developed a quantitative method for assessing the carotid plaque embolization risk. Following a scanning protocol,2 we employed GSM=50 as a threshold for separating the carotid plaques into the low-risk and high-risk groups. Our method was based on the carotid plaque structural-index and the mean gray-level, both calculated from the US image, being features related to the structural composition of the carotid plaque. In a previous work,8 textural features from the co-occurrence matrix were employed on carotid plaque US images in analyzing plaque composition by multiple discriminant analysis of variance, with classification accuracy ranging between 68% for haemorrhage and 100% for calcium. In comparison, our work showed high classification accuracies in distinguishing high-risk (high lipid and/or haemorrhagic content) from low-risk plaques (mostly calcium and/or fibrous tissue). This was achieved by using image processing and powerful pattern recognition methods. The structural-index combined with the mean gray-level revealed the highest classification accuracies by both classifiers. In a previous study,8 the mean gray-level was also found, among a few more textural features, to have statistically high discriminatory ability in characterizing carotid plaque texture in ultrasound images. Comparing the performance of the two classifiers, it is evident from Figures 2 and 3 and from Tables 1 and 2 that the SVM outperformed the QB classifier. Although the data in both scatter diagrams are well clustered into two classes, only the SVM managed to draw its decision boundary accurately. This result was achieved employing the leave-one-out method, thus providing an assessment of the behavior of the SVM on new data. Another good indicator of robustness was the relatively small number of support vectors used to achieve this high-precision result. On the other hand, the QB classifier worked satisfactorily and it was fast enough to allow for the application of the exhaustive search procedure, in order to locate high-precision feature combinations with the structural-index. This was an important contribution of the QB, since the SVM, employing the polynomial kernel, is a slow procedure.
Figure 2: 'Echo-dense/Echo-lucent ratio' vs 'mean gray-level' scatter diagram, with high-risk (echolucent) and low-risk (echogenic) carotid plaques, and QB classifier decision boundary.
Figure 3: 'Echo-dense/Echo-lucent ratio' vs 'mean gray-level' scatter diagram, with high-risk (echolucent) and low-risk (echogenic) carotid plaques, and SVM classifier decision boundary.
In conclusion, the echo-dense / echo-lucent ratio is a structural-index related to the plaque's composition and, in conjunction with the mean gray-level and the SVM classifier, may be of value to patient management in assessing carotid plaque risk of causing brain infarcts.
5. References
1. Consensus Group, Int. Angiol. 14(1), 5 (1995).
2. F. Rakebrandt, D. C. Crawford, D. Havard, D. Coleman and J. P. Woodcock, Ultrasound in Med. & Biol. 26(9), 1393 (2000).
3. G. M. Biasi, A. Sampaolo, P. Mingazzini, P. De Amicis, N. M. El-Barghouty and A. Nicolaides, Eur. J. Vasc. Endovasc. Surg. 17, 476 (1999).
4. R. C. Gonzalez and P. Wintz, Addison-Wesley (1977).
5. V. Kecman, MIT Press, Cambridge, MA, 121 (2001).
6. R. M. Haralick, K. Shanmugam and I. Dinstein, IEEE Trans. Syst., Man, Cybern., 610 (1973).
7. M. M. Galloway, Comput. Graphics and Image Processing 4, 172 (1975).
8. A. Arnold, P. Taylor, R. Poston, K. Modaresi and S. Padayachee, Ultrasound in Med. & Biol. 27(8), 1041 (2001).
TRANSIENT SIMULATION OF LARGE SCALE GAS TRANSMISSION NETWORKS USING AN ADAPTIVE METHOD OF LINES
E.S. TENTIS, D.P. MARGARIS, D.G. PAPANIKAS Fluid Mechanics Laboratory (FML) University of Patras, GR-26500 Patras, Greece E-mail: [email protected]
1. Abstract
Modern gas pipeline distribution networks are of such scale and complexity that they operate under transient conditions most of the time. The accurate and rapid prediction of the time-dependent flow magnitudes is essential in order to achieve optimum cumulative deliverability and safe and reliable operation. For analysis purposes the different flow situations can be divided into slow and rapid transients. Slow transients are mainly those fluctuations in pressure and flow magnitudes caused by changes in demand (for example the variations in gas consumption in a daily cycle). The simulation of this type is critical for the prediction of packing and unpacking of gas in the system. Rapid transients are those caused by a line break (pipe rupture), or by rapid start-up or shut-down of a system component, for example a compressor or a downstream valve. The detection of such situations can be important for the safe operation of the system. The most popular method for the solution of the equations that simulate unsteady flow is the Method of Characteristics. Furthermore, finite difference methods have been applied, such as explicit finite differences1,2 and fully implicit schemes, i.e. the Crank-Nicolson method, whereas Wylie et al.4 used the centred difference method. An alternative numerical approach, the Method of Lines, is applied in this paper. The scope of this paper is the development of a code for the prediction of transient flow phenomena in gas pipe networks. The proposed methodology, based on the Method of Lines (MOL),5,6 solves efficiently the non-linear hyperbolic, one-dimensional system of conservation of mass, momentum and energy equations which fully define the transient compressible pipe flow. The Method of Lines is a semi-discretisation method. The principal idea of the method is to discretise the spatial terms first, reducing the partial differential equations to a system of ordinary differential equations (ODEs) for the nodal values. An attractive feature of the method is the existence of various techniques for solving systems of ordinary differential equations. An advantage of the proposed numerical solution is the relaxation of the Courant criterion for the time integration and the convenience of establishing stability and convergence for a wide variety of problems. For natural gas transients the time steps can range from 1 to 60 min for rapid transients and from 1 h to 24 h for slow transients. Thus, the Method of Characteristics proves slow, because of the restrictions from the CFL stability criterion. On the contrary, the MOL method can handle non-linear equations where the time constants can range over many orders of magnitude. The method is extended from the single pipeline to a complex pipe network with a variety of non-pipe components. Various boundary conditions (automatic pressure regulators, valves, compressors) and junction conditions are modelled by MOL in order to define the network behaviour. The time integration of the resulting ODE system is achieved by a fifth-order Runge-Kutta method with step-size control capabilities. The method is advantageous because of its error estimation and control. A special algorithm for the optimum calculation of the total computational node number, and its distribution to each pipe element, was developed. It takes account of the spatial extent of each network component and the desirable minimum difference between the maximum and minimum space intervals of the elements. The algorithm reduces the computational effort by diminishing the number of equations to be solved simultaneously at every time step. Although it is possible to use a fixed spatial mesh in such an approach, it has been recognised8,9 that adapting the spatial mesh offers important advantages in regard to the efficiency and accuracy of the solution process, particularly for problems with moving or highly localized features. An appropriate novel adaptive grid algorithm was developed. It is a modified variation of the equidistribution principle. The grid tracks and moves according to the main velocity or pressure disturbances. The new adaptive mesh has the same or an incremental node number according to the flow conditions.
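The semi-discretisation idea can be illustrated with a toy problem. The sketch below applies the Method of Lines to a single linear advection equation u_t + a u_x = 0 on one pipe-like segment, discretising the spatial derivative with upwind differences and handing the resulting ODE system to an embedded Runge-Kutta integrator with automatic step-size control. The specific equation, grid, wave speed and the use of SciPy's RK45 are illustrative assumptions; they stand in for the full mass/momentum/energy system and the fifth-order integrator described in the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

a = 300.0                      # illustrative wave speed [m/s]
L, nx = 10_000.0, 201          # pipe length [m] and number of nodes
x = np.linspace(0.0, L, nx)
dx = x[1] - x[0]

def rhs(t, u):
    """Spatial semi-discretisation (first-order upwind) of u_t + a u_x = 0."""
    dudt = np.empty_like(u)
    dudt[1:] = -a * (u[1:] - u[:-1]) / dx
    dudt[0] = 0.0              # fixed inlet value as a simple boundary condition
    return dudt

u0 = np.exp(-((x - 2000.0) / 400.0) ** 2)   # localized initial disturbance

# Embedded Runge-Kutta with error-based step-size control (stands in for the
# fifth-order controller used in the paper).
sol = solve_ivp(rhs, (0.0, 20.0), u0, method="RK45", rtol=1e-6, atol=1e-9)
print("accepted time steps:", sol.t.size, " final time:", sol.t[-1])
```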
Different numerical experiments are used to demonstrate the effectiveness of the proposed grid adaptation. The proposed algorithm proves to be computationally efficient and accurate enough for the design and control of gas network systems.
References
1. Niessner, H., Comparison of different numerical methods for calculating one-dimensional unsteady flow, Lecture No. 16 from Lecture Series 1980-1, Unsteady One-Dimensional Flows in Complex Networks and Pressurised Vessels, The Von Karman Institute of Fluid Dynamics, January 14-18 (1980).
2. Abarbanel, S., Gottlieb, D. and Turkel, E., Difference schemes with fourth order accuracy for hyperbolic equations, SIAM J. Appl. Math. 29, 329-351.
3. Fletcher, C.A.J., Computational Techniques for Fluid Dynamics, Springer-Verlag (1991).
4. Wylie, E.B., Stoner, M.A. and Streeter, V.L., Network system transient calculations by implicit method, Soc. Pet. Eng. J. Trans. AIME, Vol. 11 (December 1971) 356-362.
5. Rice, R.J., Numerical Methods, Software, and Analysis, McGraw-Hill (1983).
6. Schiesser, W.E., The Numerical Method of Lines, Academic Press (1991).
7. Ying, S.P. and Shah, V.J., Transient pressure in boiler steam lines, Fluid Transients and Acoustics in the Power Industry, presented at the Winter Annual Meeting of the American Society of Mechanical Engineers, San Francisco, Calif. (Dec 10-15, 1978).
8. Vande Wouwer, A., Saucez, Ph. and Schiesser, W.E., Adaptive Method of Lines, Chapman & Hall (2001).
9. Berzins, M., Capon, P.J. and Jimack, P.K., On spatial adaptivity and interpolation when using the method of lines, Applied Numer. Math. 26 (1998) 117-133.
OPTICAL pH MEASUREMENT USING CHROMATIC MODULATION D. TOMTSIS, V. KODOGIANNIS¹, E. WADGE¹ TEI of West Macedonia, Koila, Kozani, GR-50100, GREECE E-mail: dimedias.soft-hard.gr ¹Dept. of Computer Science, University of Westminster, London, HA1 3TP, UK E-mail: kodogiv@wmin.ac.uk
A distimulus chromatic detection system allied with fibre optic light transmission has been used in the development of a low cost and accurate pH measurement system. The performance of the chromatic pH measurement system is compared with a number of optical measurement techniques, which are based on intensity modulation. The chromatic modulation technique has been shown to have advantages over intensity modulation, such as greater immunity to fibre bending, and maintaining calibration when extending the length of the optical fibres used to address the modulator.
1. Introduction Accurate measurement of pH is required in diverse and challenging fields provided by industry, medicine, chemical process control and the environment. On-line monitoring of pH is needed for process control in extreme conditions such as those posed by a nuclear reactor environment [1] or a waste water treatment plant [2]. Clinicians require measurement of blood pH [3] and other physiological fluids during surgery. Optical fibre sensors meet these challenges because they have many advantages compared with conventional pH sensors. For example, immunity to electromagnetic interference, electric and intrinsic safety, possibility of miniaturization, and bio-compatibility make an optical fibre pH sensor ideal for in vivo blood monitoring in electromagnetically noisy intensive care environments. In this paper, a distimulus chromatic detection system allied with fibre optic light transmission has been used in the development of a pH measurement system. The performance of the chromatic pH measurement system is compared with a number of optical measurement techniques, which are based on intensity modulation. The chromatic modulation technique has been shown to have advantages over intensity modulation, such as greater immunity to fibre bending, and maintaining calibration when extending the length of the optical fibres used to address the modulator.
2. Chromatic Modulation Theory

The essence of chromatic modulation is the utilisation of polychromatic light for sensing spectral changes by monitoring the total profile of an optical signal within a spectral power distribution. Chromatic changes can be monitored by a number (n) of detectors with overlapping spectral responses. The output of each detector may then be expressed as [4]

    V_n = \int P(\lambda)\, R_n(\lambda)\, d\lambda        (1)

where P(λ) is the spectral power distribution of the optical signal, R_n(λ) is the wavelength responsivity of the nth detector and λ is the wavelength. Each detector output may also be intensity normalised according to

    u_n = V_n \Big/ \sum_{m=1}^{n} V_m        (2)

so that

    u_1 + u_2 + \ldots + u_{(n-1)} + u_n = 1        (3)

In such a way, chromaticity maps may be formed in terms of the coordinates u_1, u_2, ..., u_(n-1). The case of n=2 leads to a distimulus chromaticity map on which either chromaticity coordinate (u_1 or u_2) may be used to completely describe changes in optical signals. Any two detectors with different but overlapping spectral responsivities may be used to form a distimulus measurement system. The chromatic model representing this mathematical formalism is called LXY and provides the relative magnitudes of the distimulus values (i.e. X=u_1, Y=u_2). The two-dimensional nature of the chromaticity diagram, shown in Fig. 1, implies that in general the status of the measurand is over-specified in being defined by two coordinates, u_1, u_2. System simplification and implementation economy is therefore possible by reducing the amount of information acquired, which may be achieved through the use of two rather than three detectors (this differs from two-wavelength monitoring in being broad-band and often using overlapping detectors). The measurand status is then defined by the single variable

    \tan\theta = u_1 / u_2        (4)
636
Figure 1. Chromaticity diagram for distimulus detection.

Since the ratio u_1/u_2 provides normalisation with respect to signal intensity, the distimulus method preserves the intensity-independent nature of chromatic monitoring. Also, since the signals u_1 and u_2 are derived from overlapping detector responsivities, there is a degree of inbuilt immunity to extraneous spectral noise in the output u_1/u_2.
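To make the distimulus processing of Eqs. (1)-(4) concrete, the sketch below computes the normalised coordinates u_1, u_2 and the angle θ for a sampled spectrum seen by two overlapping Gaussian detector responsivities. The Gaussian responsivity shapes, the wavelength grid and the test spectra are illustrative assumptions, not the detectors used in the paper.

```python
import numpy as np

wl = np.linspace(400.0, 700.0, 1000)            # wavelength grid [nm]

def gaussian(centre, width):
    return np.exp(-0.5 * ((wl - centre) / width) ** 2)

# Two overlapping detector responsivities R_1, R_2 (illustrative shapes)
R1 = gaussian(500.0, 60.0)
R2 = gaussian(580.0, 60.0)

def distimulus(spectrum):
    """Return (u1, u2, theta) for a spectral power distribution P(lambda)."""
    V1 = np.trapz(spectrum * R1, wl)            # Eq. (1) for each detector
    V2 = np.trapz(spectrum * R2, wl)
    u1, u2 = V1 / (V1 + V2), V2 / (V1 + V2)     # Eq. (2); u1 + u2 = 1 (Eq. (3))
    theta = np.degrees(np.arctan2(u1, u2))      # Eq. (4), in degrees
    return u1, u2, theta

# A shift of the dominant wavelength (e.g. acid -> alkali indicator form)
# changes theta, while the result is independent of overall intensity.
for centre in (520.0, 560.0):
    print(centre, distimulus(3.0 * gaussian(centre, 30.0)))
```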
3. Chemical pH and Chromaticity measurements

For a distimulus chromatic measurement system, the chromaticity is a measure of the dominant wavelength of the detected optical spectrum. A pH indicator exhibits two characteristic absorption spectra, that of its acid and that of its alkali form. The acid and alkali spectra have different dominant wavelengths. In chemical equilibrium, a portion of the indicator will be in the acid form and the rest in the alkali form. The dominant wavelength of the resultant spectrum for an equilibrium mixture of acid and alkali indicator forms will depend on the chemical equilibrium position, which is in turn a direct function of pH. Indicator dyes typically change from the acid form to the alkali form over a range of about 2 pH units. To achieve a greater pH range, two or more indicators, for which the change from acid to alkali form takes place over different but overlapping pH values, may be mixed. If, for two indicators, the dominant wavelength shifts in the same direction when changing from the acid to the alkali form, then a distimulus chromatic measurement system may be used to determine the pH value.
4. Experimental Method

Three types of experimental investigations have been carried out. In the first one, the change in chromaticity with pH for two indicators was measured, while in the second, the change of chromaticity with temperature of a reagent at constant pH was investigated. During the third experiment, the possibility of extending the range of pH measurement using a mixture of indicators was studied. Two indicator dye solutions were investigated; these were Bromo-Thymol-Blue (BTB) and Bromo-Cresol-Purple (BCP), and the results are shown in Table 1.

Table 1. Chromaticity against pH results for two indicators.
Table 2 shows a comparison of the results obtained from the chromatic system with those obtained from the three intensity or wavelength based optoelectronic detection systems.
Table 2. Comparison results.

  System                         Resolution   Range     Temperature Coefficient
  Chromatic System               0.01 pH      4.6 pH    0.0068 pH/°C
  LED system [5]                 0.03 pH      <2 pH     -
  Spectrum Analyzer System [6]   >0.03 pH     -         >0.003 pH/°C
  Monochromatic System [7]       0.003 pH     <2 pH     0.008 pH/°C
The results demonstrate that the performance of the suggested chromatic pH measurement system compares favourably with other optoelectronic pH measurement systems. The monochromatic system demonstrates slightly better performance than the chromatic system, although the chromatic system would have a significant cost advantage. The resolution of the chromatic system is superior to both the LED / photodiode and spectrum analyser systems and has a greater range. The chromatic measurement system has also been shown capable of measurements with multiple indicators.
5. Conclusion

The proposed chromatic measurement system produces results similar to those of conventional optical measurement systems, which use alternative, generally far more expensive and complex, electronic methods of processing optical signals, but with improved accuracy and range compared with the LED / photodiode and spectrum analyser based systems. Furthermore, the chromatic system has improved immunity to intensity changes not caused directly by the modulator (such as fibre bending).
References
1. S. Motellier, M.H. Michels, B. Dureault and P. Toulhoat, Fibre optic pH sensor for in situ applications, Sensors and Actuators B 11, 467-473 (1993).
2. A. Holobar, B.H. Weigl et al., Experimental results on an optical pH measurement system for bioreactors, Sensors and Actuators B 11, 425-430 (1993).
3. J.I. Peterson, S.R. Goldstein, R.V. Fitzgerald and D.K. Buckhold, Fibre optic pH probe for physiological use, Anal. Chem. 52, 864 (1980).
4. D. Tomtsis, V. Kodogiannis, D. Zissopoulos, Analysis and measurement of the modal power distribution for guiding multimode fibres, in Advances in Systems Science: Measurement, Circuits and Control, 5th WSES/IEEE Conf. CSCC-MCP-MCME 2001, Crete, Greece, July 2001, pp. 3451-3456 (2001).
5. T.E. Edmonds, I.D. Ross, Low cost fibre optic chemical sensors, Analytical Proceedings 22, 206-207 (1985).
6. B.A. Woods et al., Measurement of rainwater pH by optosensing flow injection analysis, Analyst 113, 301-306 (1988).
7. G.F. Kirkbright et al., Studies with immobilized chemical reagents using a flow cell for the development of chemically sensitive fibre optic devices, Analyst 109, 15-17 (1984).
COMPUTER-AIDED CHARACTERIZATION OF THYROID NODULES BY IMAGE ANALYSIS METHODS S. TSANTIS Department of Medical Physics, School of Medicine, University of Patras, Rio Patras 26500, Greece
I. KALATZIS, N. PILIOURAS, D. CAVOURAS Department of Medical Instrumentation Technology, Technological Education Institution of Athens Ag. Spyridonos Street, Aigaleo, 122 10, Athens e-mail: [email protected].
N. DIMITROPOULOS Medical Imaging Department, EUROMEDICA Medical Center 2 Mesogeion Avenue, Athens, Greece
G. NIKIFORIDIS Department of Medical Physics, School of Medicine University of Patras, Rio Patras 26500, Greece
The assessment of sonographic findings of thyroid nodules in medical praxis is dependent on the examiner's experience and subjective evaluation. Thus, it is obvious that a quantitative method could be of value to diagnosis. This study evaluates the risk factor of malignancy of thyroid nodules by means of an automatic image analysis system designed and implemented in C++ for processing B-mode sonographic images. Precision accuracies of 95% and 93.3% were achieved by means of the Neural Network Multi-layer Perceptron and the Support Vector Machines classifiers respectively. The proposed image analysis system combined with either classifier may be indicative of thyroid nodule's malignancy-risk and may be of value to patient management.
1. Introduction
The thyroid gland plays an important role in the control of human metabolism, secreting vital hormones such as thyroxine (T4) and triiodothyronine (T3). Thyroid nodules are swells that appear in the thyroid gland and may be indicative of thyroid cancer.1 Sonographic findings of the thyroid nodule are employed as criteria in assessing the risk factor of malignancy, and are crucial in patient management. Such criteria include hypo-echogenicity, absence of halo, microcalcifications, irregular margins, and intra-nodular vascular patterns or
spots.2 However, estimation of the risk factor by ultrasound (US) involves the subjective evaluation of US images by the physician and, thus, it depends upon the experience of the operator.3 It is evident that a quantitative assessment of the thyroid nodule's risk factor may be of value in avoiding unnecessary invasive operations. Objective methods for estimating the risk associated with thyroid gland disease mainly concern evaluation of parameters from the gray-level histogram of the thyroid gland US image.3 There is an apparent need for a more robust and objective method for assessing thyroid disease, as an aid to the subjective criteria employed by the physician. In this work we have developed a computer-based image analysis system for the automatic characterization of thyroid nodules into two main classes (high-risk and low-risk, corresponding to grade III and grade II respectively), directly related to the 4 grading categories mainly adopted in clinical practice.4

2. Material and Methods

The study comprised 120 ultrasonic images displaying various thyroid nodules. All US examinations were performed on an HDI-3000 ATL digital ultrasound system with a wide band (5-12 MHz) probe using various scanning methods such as longitudinal, transversal and sagittal cross sections of the thyroid gland. From the nodule's segmented image, 40 textural features were automatically calculated by the software: 4 features from the nodule's histogram, 26 from the co-occurrence matrix5 and 10 from the run-length matrix6. Categorization of the thyroid nodules was performed on the basis of the final clinical outcome; 78 were assigned to Grade II and 42 to Grade III. Determination of the best feature combination was based on the performance of the classifiers, which were trained to characterize the nodules as grade II or grade III. For comparison reasons, two classifiers were employed, the recently developed support vector machines (SVM)7 and the multi-layer perceptron (MLP)8 neural network classifier. In the present study, adequately high classification accuracy was achieved by the mean gray-level value and the gray-level non-uniformity run-length features. These two textural features were also employed as input to the MLP classifier.
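For illustration, the gray-level non-uniformity feature named above can be computed from a run-length matrix as sketched below. The run-length construction shown here (horizontal runs only, a small number of gray-level bins) and the array names are simplifying assumptions for the example; they follow the standard Galloway-style definition rather than the authors' exact implementation.

```python
import numpy as np

def run_length_matrix(img, levels=16, max_run=64):
    """Horizontal run-length matrix P[g, r]: counts of runs of length r+1
    at quantized gray level g (a simplified, illustrative construction)."""
    q = (img.astype(float) / 256.0 * levels).astype(int).clip(0, levels - 1)
    P = np.zeros((levels, max_run), dtype=int)
    for row in q:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1
            else:
                P[run_val, min(run_len, max_run) - 1] += 1
                run_val, run_len = v, 1
        P[run_val, min(run_len, max_run) - 1] += 1
    return P

def gray_level_non_uniformity(P):
    """GLN = sum_g (sum_r P[g, r])^2 / sum_{g, r} P[g, r]."""
    return float((P.sum(axis=1) ** 2).sum() / P.sum())

rng = np.random.default_rng(0)
nodule = rng.integers(0, 256, size=(64, 64))     # placeholder for a segmented nodule
print("gray-level non-uniformity:", gray_level_non_uniformity(run_length_matrix(nodule)))
```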
3. Results

The SVM achieved 93.3% precision in correctly distinguishing low-risk (Grade II) from high-risk (Grade III) thyroid nodules. Table 1 shows the highest classification accuracy achieved by the SVM classifier, employing the above-mentioned feature combination. Figure 1 shows a scatter diagram of the mean gray-level against contrast and the decision boundary drawn by the SVM classifier.
Table 1. Truth table demonstrating SVM classification of Grade II and Grade III thyroid nodules.

                          SVM classification
  Subject Group       Grade II   Grade III   Accuracy
  Grade II                72          6        92.3%
  Grade III                2         40        95.2%
  Overall accuracy                             93.3%

The performance of the MLP classifier is shown in Table 2 and Figure 2. The highest classification accuracy was 95% for the same feature combination as in the case of the SVM. Figure 2 shows the scatter diagram with the MLP decision boundary separating the two classes, achieving higher classification accuracy in comparison with the SVM.

Table 2. Truth table demonstrating MLP classification of Grade II and Grade III thyroid nodules.

                          MLP classification
  Subject Group       Grade II   Grade III   Accuracy
  Grade II                73          5        93.6%
  Grade III                1         41        97.6%
Figure 1: 'mean gray-level' versus 'gray-level non-uniformity' scatter diagram, displaying Grade II and Grade III thyroid nodules, and SVM classifier decision boundary.
Figure 2: 'mean gray-level' versus 'gray-level non-uniformity' scatter diagram, displaying Grade II and Grade III thyroid nodules, and MLP classifier decision boundary.
4. Discussion
We have employed image processing and analysis computer methods to extract a large number of textural features from the US images of thyroid nodules, and we have used these features as input to a software classification system to characterize thyroid nodules as high risk (Grade III) or low risk (Grade II). The system was trained by means of 120 US images of 120 thyroid nodules, which had been previously assigned to the high- or low-risk (Grade III or Grade II) classes on the basis of the physician's assessment and final clinical outcome. The best textural features employed were the mean gray-level and the gray-level non-uniformity from the run-length matrix. It is obvious that these features are related to the echogenicity of the thyroid nodule with respect to the surrounding tissue, from the physician's point of view, and to the existence of various structures (micro-calcifications, cystic nodes) inside each nodule. Comparing the results of the classification accuracies achieved by the SVM and MLP algorithms, their performance was similar, with the MLP having marginally better precision. In conclusion, the proposed image analysis system combined with either classifier may be indicative of a thyroid nodule's malignancy risk and may be of value to patient management.

References
1. Van Herle AJ, Pick P, Ljung BME, Ashcraft MW, Solomon DH, Keeler EB. Ann. Intern. Med. 96, 221 (1982).
2. Watters DA, Ahuja AT, Evans RM, Chick W, King WW, Metreweli C, Li AK. Am. J. Surgery 164, 654 (1992).
3. T. Hirning, I. Zuna, D. Schlaps, D. Lorenz, H. Meybier, C. Tschahargane, and G. van Kaick. Eur. J. Radiol. 9(4), 244 (1989).
4. Nobuhiro Fukunari. Biomed. & Pharmacother. 56, 55 (2002).
5. R.M. Haralick, K. Shanmugam, and I. Dinstein, IEEE Trans. Syst., Man, Cybern., 610 (1973).
6. M.M. Galloway, Comput. Graph. Imag. Proc. 4, 172 (1975).
7. V. Kecman, MIT Press, Cambridge, 121 (2001).
8. Khotanzad A and Lu JH. IEEE Trans. Acoust., Speech and Signal Proc. 38, 1028 (1990).
DENOISING SONOGRAPHIC IMAGES OF THYROID NODULES VIA SINGULARITY DETECTION EMPLOYING THE WAVELET TRANSFORM MODULUS MAXIMA S. TSANTIS Department of Medical Physics, School of Medicine, University of Patras, Rio Patras 26500, Greece
D. CAVOURAS’ *Department of Medical Instrumentation Technology, Technological Education Institution of Athens Ag. Spyridonos Street, Aigaleo, 122 10, Athens e-mail: [email protected].
N. DIMITROPOULOS Medical Imaging Department, EUROMEDICA Medical Center 2 Mesogeion Avenue, Athens, Greece
G. NIKIFORIDIS Departments of Medical Physics, School of Medicine University of Patras, Rio Patras 26500, Greece
Ultrasound imaging involves signals, which are obtained by coherent summation of echo signals from scatterers in the tissue. This accumulation results in a speckle pattern, which constitutes unwanted noise and causes image degradation. Suppression of speckle-noise is desirable in order to enhance the quality of ultrasonic images and therefore to increase the diagnostic potential of ultrasound examination. In this paper we introduce a denoising technique for medical ultrasound images based on singularity detection through the evolution of the wavelet transform modulus maxima across scales. The algorithm differentiates the image components from the noise by selecting the wavelet transform modulus-maxima that correspond to the image singularities via the Lipschitz exponents. The performance of the proposed algorithm was tested in 145 B-scan thyroid images (pathological and normal) and demonstrated an effective suppression of speckle while preserving resolvable details (edges and boundaries).
Please address correspondence to: Prof. D. Cavouras, Ph.D., Dept. of Med. Inst., TEI of Athens, Tel: (+30) 210-5385-375 (work) - Fax: (+30) 210-5910-975 (work), E-mail: cavouras@teiath.gr.
1. Introduction

The use of diagnostic ultrasonography (US) for the clinical evaluation of thyroid nodules has proved to be a useful clinical diagnostic method. Echogenicity, size and shape of thyroid nodules constitute vital parameters for an effective diagnostic evaluation [1]. Differentiating thyroid nodules from surrounding tissue is critical for improving diagnosis. Although US images in general convey key diagnostic information, the presence of speckle can be considered as noise that causes degradation in quality and resolution [2]. Speckle-noise removal by means of digital image processing could improve the diagnostic potential of medical ultrasound. Several techniques for suppressing speckle-noise have been developed [3][4][5][6][7]. Many of the speckle-noise suppression techniques proposed introduce blurring and/or show varying performance in eliminating speckle-noise. In this study, a speckle-noise suppression technique is presented, based on the wavelet transform modulus maxima (WTMM) [8]. Accordingly, the image information is separated from noise by discriminating image singularities from noise discontinuities by means of the WTMM. This is performed by thresholding the Lipschitz exponents of the modulus maxima that correspond to noise [9]. The proposed technique yields significant speckle-noise suppression in the image, while valuable edges and boundaries are preserved.
2. Material and Methods

One hundred and forty-five US examinations of thyroid nodules were performed on an HDI-3000 ATL digital ultrasound system with a wide band (5-12 MHz) probe. Each US image was digitized by connecting the video output of the ultrasound scanner to a Screen Machine II frame grabber using 512x512x8 image resolution. Image processing comprised finding the local modulus maxima of the US image's wavelet transform (WT). The proposed technique is based on the multiscale edge representation (MER) method [8][10]. The MER method smoothes the image at various scales and detects sharp variation points on each smoothed image via the modulus maxima. Through the Lipschitz exponents [9], the modulus maxima characterize image information as either image content (positive Lipschitz exponents) or noise (negative Lipschitz exponents). Accordingly, the proposed method processes the singularities of the image WTMM via thresholding of the Lipschitz exponents (removing negative values). The image WTMM is then reconstructed to reveal the denoised image.
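The scale behaviour that the method exploits can be sketched in one dimension: wavelet-transform modulus maxima that stem from a true edge keep their amplitude as the smoothing scale increases (non-negative Lipschitz exponent), while maxima produced by noise decay (negative exponent). The sketch below estimates the exponent at a point by fitting the slope of log2|W| across dyadic scales, using a Gaussian-derivative wavelet; the test signal, the scales, the crude windowed maximum used in place of proper maxima chaining, and the idea of simply zeroing negative-exponent maxima are simplifying assumptions of this illustration, not the authors' 2-D implementation, and the reconstruction step is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)
x = np.zeros(1024)
x[512:] = 1.0                         # a step edge (non-negative regularity)
x += 0.2 * rng.standard_normal(1024)  # additive noise (negative regularity)

scales = [2 ** j for j in range(1, 6)]                       # dyadic smoothing scales
W = [gaussian_filter1d(x, s, order=1) * s for s in scales]   # multiscale edge detector

def maxima_amplitude(w, pos, window=8):
    """Modulus of the strongest local maximum near `pos` at one scale."""
    lo, hi = max(pos - window, 0), min(pos + window, w.size)
    return np.abs(w[lo:hi]).max()

def lipschitz_slope(pos):
    """Slope of log2|W| vs. log2(scale): ~0 for a step edge, clearly negative for noise."""
    amps = np.log2([maxima_amplitude(w, pos) for w in W])
    return np.polyfit(np.log2(scales), amps, 1)[0]

print("slope at the edge      :", round(lipschitz_slope(512), 2))
print("slope at a noise point :", round(lipschitz_slope(200), 2))
```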
3. Results and Discussion

The proposed algorithm was applied to the 145 US thyroid images and the results were evaluated by an experienced radiologist. Examples of the performance of the proposed algorithm can be seen in Figures 1 & 3, displaying
speckle-noise suppression with thyroid nodule boundary preservation. The detailed effect of the proposed technique on a single scan-line can be observed in Figures 2 & 4.
Figure 1. (a) Original image, (b) Denoised Image.
Figure 2. Pixel values along the middle column of the original and denoised image.
Figure 3. (a) Original image, (b) Denoised image.
Figure 4. Pixel values along middle column of the original and denoised image.
According to the physician's assessment, the proposed method is able to discriminate the thyroid nodules from the surrounding tissue by enhancing the contrast between them. In some cases it enhances useful information not easily visible on the original image, and in other cases it preserves edges when the nodule is surrounded by equally echogenic tissue. If the code is optimized for sub-second processing time, the proposed algorithm could become a useful tool for experienced physicians. Moreover, it may be of value as a pre-processing step to improve the performance of automatic nodule detection, segmentation, and recognition.
References
1. Van Herle AJ, Pick P, Ljung BME, Ashcraft MW, Solomon DH, Keeler EB. Ann. Intern. Med. 96, 221 (1982).
2. P.A. Magnin. Hewlett Packard J. 34, 39 (1983).
3. I. Duskunovic, A. Pizurica, G. Stippel, W. Philips, and I. Lemahieu, IEEE Engineering in Med. & Biol. Society, Conf. Proc. 4, 2662 (2000).
4. M. Karaman, M.A. Kutay and G. Bozdagi. IEEE Trans. on Med. Imaging 14, 283 (1995).
5. M. Malfait and D. Roose. IEEE Trans. on Image Proc. 6, 549 (1997).
6. V. Frost, J. Stiles, K. Shanmugan and J. Holtzman. IEEE Trans. PAMI 4, 157 (1982).
7. J. Lee, IEEE Trans. PAMI 2, 165 (1980).
8. S.G. Mallat and S. Zhong, NYU Technical Report 592 (1991).
9. Mallat, S. and Hwang, W.L. IEEE Trans. Inform. Theory 2, 617 (1992).
10. S. Zhong, PhD Thesis, New York University (1990).
RUNGE-KUTTA METHODS WITH MINIMAL DISPERSION AND DISSIPATION FOR PROBLEMS ARISING FROM COMPUTATIONAL ACOUSTICS
K. TSELIOS AND T.E. SIMOS*† Department of Computer Science and Technology, Faculty of Science and Technology, University of Peloponnese, University Campus, GR-221 00 Tripolis, Greece E-mail: [email protected], [email protected], [email protected]
In this paper new symplectic Runge-Kutta methods with minimal dispersion and dissipation errors are developed. The proposed schemes are more efficient than the classical Runge-Kutta schemes for computational acoustics problems.
1. Basic Theory
For the initial value problem

    u_t = f(t, u)        (1)

the general s-stage Runge-Kutta method is defined by

    u_{n+1} = u_n + h \sum_{i=1}^{s} b_i k_i, \qquad k_i = f\Big(t_n + c_i h,\; u_n + h \sum_{j=1}^{s} a_{i,j} k_j\Big)        (2)

where

    c_i = \sum_{j=1}^{s} a_{i,j}, \qquad i = 1, \ldots, s.        (3)

The coefficients b_i, c_i, a_{i,j} are determined by the method and can be presented using matrices in the Butcher tableau:

    c | A
    -----
      | b^T
*Corresponding author. †Active Member of the European Academy of Sciences and Arts. Postal address: Amfithea-Paleon Faliron, 26 Menelaou Street, GR-175 64 Athens, Greece.
Theorem 1.1 (Sanz-Serna2). If the coefficients b_i, a_{i,j} of the RK method satisfy the relations

    b_i a_{i,j} + b_j a_{j,i} - b_i b_j = 0, \qquad 1 \le i, j \le s        (4)

then the method is symplectic.
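As an illustration of condition (4), the sketch below checks whether a given Butcher tableau satisfies the symplecticity relations. The Butcher coefficients shown (the 2-stage Gauss method, which is known to be symplectic, and the classical RK4, which is not) are standard textbook tableaux used here purely as test input; the check itself is a direct transcription of Eq. (4).

```python
import numpy as np

def is_symplectic(A, b, tol=1e-12):
    """Check b_i a_ij + b_j a_ji - b_i b_j = 0 for all i, j (Eq. (4))."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    M = b[:, None] * A + b[None, :] * A.T - np.outer(b, b)
    return np.abs(M).max() < tol

# 2-stage Gauss-Legendre method (order 4, symplectic)
s3 = np.sqrt(3.0)
A_gauss = [[0.25, 0.25 - s3 / 6.0], [0.25 + s3 / 6.0, 0.25]]
b_gauss = [0.5, 0.5]

# Classical explicit RK4 (order 4, not symplectic)
A_rk4 = [[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]]
b_rk4 = [1/6, 1/3, 1/3, 1/6]

print("Gauss-2 symplectic:", is_symplectic(A_gauss, b_gauss))   # True
print("RK4 symplectic:    ", is_symplectic(A_rk4, b_rk4))       # False
```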
2. Dispersion and Dissipation in Runge-Kutta methods

Our study is based on the well-known test equation

    u_t = \lambda u, \qquad \lambda = x + yi        (5)

for which the exact solution is

    u(t + h) = e^{h(x + yi)}\, u(t)        (6)

Using the Albrecht3 notation, the RK solution has the form

    u_{n+1} = (1 + z\beta_1 + z^2\beta_2 + \ldots + z^s\beta_s)\, u_n = (P_s + i F_s)\, u_n        (7)

where z = hλ, β_j = b^T A^{j-1} e, e = (1, ..., 1) ∈ R^s and (Mead and Renaut5)

    P_s = 1 + h x \beta_1 + h^2 (x^2 - y^2)\beta_2 + h^3 (x^3 - 3 x y^2)\beta_3 + h^4 (x^4 - 6 x^2 y^2 + y^4)\beta_4 + h^5 (x^5 - 10 x^3 y^2 + 5 x y^4)\beta_5 + \ldots

and

    F_s = h y \beta_1 + h^2 (2 x y)\beta_2 + h^3 (3 x^2 y - y^3)\beta_3 + h^4 (4 x^3 y - 4 x y^3)\beta_4 + h^5 (5 x^4 y - 10 x^2 y^3 + y^5)\beta_5 + \ldots
+
elh - IPS iF,I = O(hPS')
The RK method de(8)
and dispersive of order g if
hy - tan-l(Fs/Ps)= O(hq+l)
(9)
In the present paper, based on the above theory, we construct RK methods whose coefficients are obtained from the order of accuracy, the symplecticity relations (4), and the relations which maximize the orders of dissipation (8) and dispersion (9).
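A quick way to see definitions (8)-(9) at work is to evaluate the amplification factor of a concrete scheme on the test equation and measure how fast the amplitude and phase errors shrink with h. The sketch below does this for the classical four-stage RK4, whose stability polynomial is 1 + z + z^2/2 + z^3/6 + z^4/24 (i.e. β_j = 1/j!); the choice of RK4, the pure-imaginary test value λ = i and the finite-difference estimation of the orders are illustrative assumptions, not the new schemes constructed in the paper.

```python
import numpy as np

def amplification_rk4(z):
    """Stability polynomial of the classical RK4: P_s + i*F_s for z = h*lambda."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24

lam = 1j                       # test equation u_t = i*u  (x = 0, y = 1)
hs = np.array([0.2, 0.1, 0.05, 0.025])
R = amplification_rk4(hs * lam)

dissip = np.abs(np.exp(hs * lam)) - np.abs(R)        # Eq. (8): amplitude error
disper = hs * lam.imag - np.angle(R)                 # Eq. (9): phase error

# Observed orders from successive halvings of h (error ~ h^(order+1))
print("dissipation order ~", np.log2(dissip[:-1] / dissip[1:]) - 1)
print("dispersion  order ~", np.log2(disper[:-1] / disper[1:]) - 1)
```

For the classical RK4 this prints orders close to 5 (dissipation) and 4 (dispersion), which is the kind of baseline the optimized schemes of the paper aim to improve on.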
References
1. J.C. Butcher, The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta and General Linear Methods, Wiley, New York (1987).
2. J.M. Sanz-Serna, Symplectic integrators for Hamiltonian problems, Report 1991/6 (1991).
3. P. Albrecht, The Runge-Kutta theory in a nutshell, SIAM J. Numer. Anal. 33(5), 1712 (1996).
4. P.J. Van Der Houwen and B.P. Sommeijer, Explicit Runge-Kutta (-Nystrom) methods with reduced phase errors for computing oscillating solutions, SIAM J. Numer. Anal. 24, 595 (1987).
5. J.L. Mead and R.A. Renaut, Optimal Runge-Kutta methods for first order pseudospectral operators, J. Comput. Phys. 152, 404-419 (1999).
RELIABILITY BOUNDS IMPROVEMENT VIA CUT SET OR PATH SET REARRANGEMENTS S. TSITMIDELIS Department of Applied Sciences Technological Educational Institute of Chalkida 34400 Evoia, Greece E-mail: [email protected] M. V. KOUTRAS Department of Statistics and Insurance Science
University of Piraeus, Greece E-mail: [email protected] V. ZISSIMOPOULOS Department of Informatics and Telecommunications, University of Athens, Greece E-mail: vassilis@di.uoa.gr
The present article deals with the development of a systematic procedure for improving the general reliability bounds published recently by Fu and Koutras (1995). Koutras, Tsitmidelis and Zissimopoulos (2003) proved that, for a given permutation of the cut sets or path sets of a coherent structure, the identification of the optimal bounds can be achieved by transforming the set-theoretic and probabilistic conditions associated with them to an equivalent Set Covering Problem. As a consequence, genetic algorithms established for the Set Covering Problem were brought into play to derive very tight approximation intervals for a general system's reliability at very competitive computer times. The object of the present work is to investigate procedures for identifying rearrangements of the family of cut sets or path sets which yield high-quality reliability bounds. The main tool exploited to achieve that is a Traveling Salesman Problem based model, combined with certain efficient techniques that facilitate the transition between the bounds associated with different rearrangements. Numerical experimentation is carried out to assess the power of the suggested approach and, finally, a brief discussion of future research in this framework is included.
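Since the abstract above builds on casting the bound-identification conditions as a Set Covering Problem, the toy sketch below shows the generic greedy heuristic for that problem on a made-up instance. It is meant only to illustrate what a Set Covering Problem instance looks like; the universe, the subsets and the greedy rule are illustrative assumptions and do not reproduce the specific transformation, the genetic algorithms or the Travelling Salesman model used by the authors.

```python
def greedy_set_cover(universe, subsets):
    """Pick subsets until the universe is covered, always taking the subset
    that covers the most still-uncovered elements (classic greedy heuristic)."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & subsets[s]))
        if not uncovered & subsets[best]:
            raise ValueError("instance is infeasible: some elements cannot be covered")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

# Made-up instance: elements 1..6 and four candidate subsets
universe = range(1, 7)
subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 5, 6}, "D": {4, 5, 6}}
print(greedy_set_cover(universe, subsets))   # e.g. ['A', 'D']
```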
THE ELECTRON-PAIR DENSITY AND THE MODELING OF THE SPHERICALLY AVERAGED EXCHANGE-CORRELATION HOLE
JESUS M. UGALDE Kimika Fakultatea, Euskal Herriko Unibertsitatea, P. K. 1072, 20080 Donostia, Euskadi (Spain)
1. Introduction
The modeling of the spherically averaged exchange-correlation hole deals with one particular approach to quantum chemistry's holy grail, namely, electron correlation. However, electron correlation is only vaguely defined as the difference between the one-determinant HF representation of the interacting electronic system and its exact representation. It is well known, though, that there is more than one source of electron correlation. Thus, for instance, in stretched H2 the effect of electron correlation is to eliminate the ionic configurations from the HF wave function. But, when the hydrogen atoms approach at shorter distances, ionic configurations play a role and electron correlation has more subtle effects which are more difficult to account for. These two types of effects illustrate the two types of electron correlation that we shall deal with. The former, called non-dynamical correlation, relates to the degeneracy of bonding and antibonding configurations. Therefore, they will interact strongly and hence cannot be treated in isolation from each other. The non-dynamical electron correlation is consequently system-specific. The latter, due to short-range interactions between the electrons, stems from the failure of the HF representation to describe the detailed correlated motion of the electrons as induced by their instantaneous mutual repulsion. This type of electron correlation is customarily referred to as dynamical correlation and, since it is non-specific, is, in a sense, universal. A complementary alternative way to look at the electron correlation problem invokes probabilities and densities of electron pairs. Thus, in addition to energy-based studies, electron-pair distributions yield insight
to afford quantitative assessment of the short- and long-range effects of electron-electron interaction in a given system. In particular, the electron intracule I(u) and extracule E(R) densities, as well as their corresponding spherical averages, h(u) and d(R), are genuine electron-pair densities useful to characterize the motion of a pair of electrons in atoms and molecules.1 For a system of N electrons, the intracule density and its spherically averaged density are defined by

    I(\vec{u}) = \int d\vec{r}_1\, d\vec{r}_2\, \Gamma_2(\vec{r}_1, \vec{r}_2)\, \delta\big((\vec{r}_1 - \vec{r}_2) - \vec{u}\big)        (1)

and

    h(u) = \frac{1}{4\pi} \int d\Omega_u\, I(\vec{u})        (2)

whereas the extracule densities are

    E(\vec{R}) = \int d\vec{r}_1\, d\vec{r}_2\, \Gamma_2(\vec{r}_1, \vec{r}_2)\, \delta\Big(\frac{\vec{r}_1 + \vec{r}_2}{2} - \vec{R}\Big)        (3)

and

    d(R) = \frac{1}{4\pi} \int d\Omega_R\, E(\vec{R})        (4)
where Γ₂(r₁, r₂) = (N(N−1)/2) ∫ |Ψ(r₁, r₂, ..., r_N)|² dr₃ ... dr_N is the spin-less electron pair density and δ denotes the Dirac delta function. They represent the probability density functions for the relative electron-pair vector (r_i − r_j) and the center-of-mass electron-pair vector (r_i + r_j)/2 to be at u and R, respectively. Their genuine two-electron character combined with their low dimensionality make these functions ideally suited to unveil the nature of electron-electron interaction in an elegant and intelligible manner. The connection between the two viewpoints alluded to above can be established as follows. The time-independent Schrödinger equation for our N-electron system

    \hat{H}\,\Psi(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N) = E\,\Psi(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N)        (5)

can also be expressed in terms of the one- and two-electron density functions as

    E = \int \Big[-\tfrac{1}{2}\nabla^2_{\vec{r}}\,\gamma(\vec{r}, \vec{r}')\Big]_{\vec{r}'=\vec{r}}\, d\vec{r} + \int v(\vec{r})\,\rho(\vec{r})\, d\vec{r} + \int\!\!\int \frac{\Gamma_2(\vec{r}_1, \vec{r}_2)}{|\vec{r}_1 - \vec{r}_2|}\, d\vec{r}_1\, d\vec{r}_2        (6)

where v(r) is the external potential.
The one-particle density matrix is given by

    \gamma(\vec{r}, \vec{r}') = N \int \Psi^{*}(\vec{r}, \vec{r}_2, \ldots, \vec{r}_N)\,\Psi(\vec{r}', \vec{r}_2, \ldots, \vec{r}_N)\, d\vec{r}_2 \cdots d\vec{r}_N        (7)

and the electron density function is

    \rho(\vec{r}) = \gamma(\vec{r}, \vec{r}) = \sum_{i=1}^{N} \langle \Psi |\, \delta(\vec{r} - \vec{r}_i)\, | \Psi \rangle        (8)
The electron pair density accounts for the probability Γ₂(r₁, r₂) dr₁ dr₂ of one electron being in the volume dr₁ around r₁ when another electron is known to be in the volume dr₂ around r₂. If the electrons were independent,2 clearly Γ₂(r₁, r₂) = ρ(r₁)ρ(r₂). Therefore, it is intuitive that for correlated electrons an exchange-correlation contribution, which takes into account all kinds of correlations between the electrons, must be added to the uncorrelated case. Thus,

    \Gamma_2(\vec{r}_1, \vec{r}_2) = \frac{1}{2}\,\rho(\vec{r}_1)\left[\rho(\vec{r}_2) + \rho_{xc}(\vec{r}_1, \vec{r}_2)\right]        (9)
Substituting Eq. (9) into the last term of the right-hand side of Eq. (6) we obtain that the electron-electron repulsion energy can be expressed as

    V_{ee} = \frac{1}{2}\int\!\!\int \frac{\rho(\vec{r}_1)\,\rho(\vec{r}_2)}{|\vec{r}_1 - \vec{r}_2|}\, d\vec{r}_1\, d\vec{r}_2 + \frac{1}{2}\int\!\!\int \frac{\rho(\vec{r}_1)\,\rho_{xc}(\vec{r}_1, \vec{r}_2)}{|\vec{r}_1 - \vec{r}_2|}\, d\vec{r}_1\, d\vec{r}_2        (10)

And finally, also substituting Eq. (9) into Eq. (1) we obtain that

    I(\vec{u}) = \frac{1}{2}\int d\vec{r}\,\rho(\vec{r})\,\rho(\vec{r} + \vec{u}) + \frac{1}{2}\int d\vec{r}\,\rho(\vec{r})\,\rho_{xc}(\vec{r}, \vec{r} + \vec{u})        (11)
Recall at this point that Hohenberg and Kohn demonstrated in their foundational Density Functional Theory (DFT) paper3 that all properties of an interacting electron system are completely determined by its ground-state electron density, ρ(r). This includes the energy of the ground state, the energies of the excited states, response properties, etc. Therefore, ρ_xc itself must also be a functional of the ground-state electron density in accordance with Eq. (6), although its exact form has proved difficult to find. Nevertheless, equation (11) pinpoints a strong direct relation between the spherically averaged exchange-correlation hole and the intracule density. We shall explore this relationship.4,5,6,7,8 It will be shown how to bring together experimental and theoretical developments aimed at learning more about the structure and properties of the intracule density and, at the same time, we shall find a solid bridge between experimental determination and theoretical modeling of the exchange-correlation hole. Thus, on the one hand, experimental work can provide values for the integrated inelastic X-ray scattering intensities (Eq. (12)), from which estimates of the system-averaged exchange-correlation hole can be obtained, and for the integrated total X-ray scattering intensities (Eq. (13)), which additionally provide an estimate of the system-averaged electron density. On the other hand, theoreticians can obtain accurate system-averaged electron densities and design reliable exchange-correlation hole density functions, ρ_xc(r, r'). These two independent developments must fulfill the requirements imposed by Eqs. (12)-(13). In particular, it would be highly desirable that approximate density functionals should reproduce the experimentally obtained integrated intensities of Eqs. (12)-(13), in view of the importance of ρ_xc(r, r') in modeling the correct electron pair distribution.9
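Equation (11) separates the intracule density into a Coulomb-like autocorrelation of the electron density and an exchange-correlation term. The sketch below evaluates the first (uncorrelated) term on a grid by FFT autocorrelation and spherically averages it to an approximate h(u); the Gaussian model density, the grid sizes and the omission of the ρ_xc term are assumptions of this illustration only.

```python
import numpy as np

# Illustrative 3-D grid and a Gaussian "electron density" with N electrons.
N_el, L, n = 2.0, 12.0, 64
ax = np.linspace(-L / 2, L / 2, n, endpoint=False)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
dV = (L / n) ** 3
rho = np.exp(-(X**2 + Y**2 + Z**2))
rho *= N_el / (rho.sum() * dV)                      # normalise to N electrons

# Coulomb (uncorrelated) part of Eq. (11): (1/2) * autocorrelation of rho,
# evaluated with FFTs (periodic images are negligible for this localised rho).
I_u = 0.5 * np.fft.fftshift(np.fft.ifftn(np.abs(np.fft.fftn(rho)) ** 2).real) * dV

# Spherically averaged intracule h(u): average I(u) over shells of |u|.
u = np.sqrt(X**2 + Y**2 + Z**2)
bins = np.linspace(0.0, L / 2, 25)
which = np.digitize(u.ravel(), bins)
h_u = np.array([I_u.ravel()[which == i].mean() for i in range(1, bins.size)])
print("h(u) near u = 0:", h_u[0])

# Normalisation check: without the exchange-correlation term the integral of
# I(u) over all u gives N^2/2 (it would give N(N-1)/2 for the full Eq. (11)).
print("integral of the uncorrelated part:", I_u.sum() * dV, " expected:", N_el**2 / 2)
```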
References
1. Valderrama, E.; Ugalde, J. M.; Boyd, R. J. In Many-Electron Densities and Reduced Density Matrices; Cioslowski, J., Ed.; Kluwer Academic/Plenum Publishers: New York, 2000; pages 231-248.
2. McWeeny, R. Rev. Mod. Phys. 1960, 32, 335.
3. Hohenberg, P.; Kohn, W. Phys. Rev. 1964, 140, A1133.
4. Ugalde, J. M.; Sarasola, C. Phys. Rev. A 1994, 49, 3081.
5. Mercero, J. M.; Fowler, J. E.; Sarasola, C.; Ugalde, J. M. Phys. Rev. A 1999, 59, 4255.
6. Fradera, X.; Duran, M.; Valderrama, E.; Ugalde, J. M. Phys. Rev. A 2000, 62, 34502.
7. Valderrama, E.; Fradera, X.; Ugalde, J. M. Phys. Rev. A 2001, 64, 044501.
8. Valderrama, E.; Mercero, J. M.; Ugalde, J. M. J. Phys. B 2001, 34, 275.
9. Burke, K.; Perdew, J. P.; Ernzerhof, M. J. Chem. Phys. 1998, 109, 3760.
SOLIDIFICATION OF Pb PRE-COVERED Cu(111) SURFACE E. VAMVAKOPOULOS, G. A. EVANGELAKIS Department of Physics, University of Ioannina, P.O. Box 1186, 451 10 Ioannina, Greece E-mail: gevagel@cc.uoi.gr D. G. PAPAGEORGIOU
Department of Materials Science and Engineering, Laboratory of Computational Materials Science, University of Ioannina, P.O. Box 1186, 451 10 Ioannina, Greece E-mail: [email protected]
We present Molecular Dynamics simulation results, based on a semi-empirical potential model in analogy to the Tight Binding scheme in the second moment approximation, concerning the behaviour of a Pb overlayer deposited on the Cu(111) surface as a function of concentration. We found that the adlayer's character changes from fluid to solid as the Pb concentration passes a characteristic value θc = 37.5%. Specifically, for concentrations less than θc the deposited Pb atoms exhibit the behaviour of a dilute fluid, while above θc a 2D liquid-like character appears, recovering typical solid behaviour above the saturation concentration that is dictated by the lattice mismatch at θs = 56.3%. These conclusions are deduced from the calculated structural and diffusive properties of the overlayer, namely the Pb lattice parameter and its relative relaxed positions with respect to the bulk lattice spacing, as well as the atomic diffusion coefficient. It is found that for concentrations up to θc the adlayer exhibits important expansion, with Pb atoms flowing over the substrate and diffusing very fast, while at θs the Pb adlayer is compressed by as much as 2.65% with respect to the lattice spacing of bulk Pb, in agreement with experimental findings.
1. Introduction Modem microelectronics and magnetic multi-layers require well ordered and defect free thin films. Thus the favorable growth mode for deposition of a metal on a metallic substrate is the “Layer by Layer” epitaxy against the threedimensional modes [l]. The pre-deposition of a low surface energy third element can act as surface-active-species (surfactant), so that the deposited metallic atoms can segregate continuously on the surface. It turns out that these elements can maintain the adsorbates and substrate activity during the deposition process resulting in an almost complete wetting. This method is very
This work was supported by HPRN-CT-2000-00038 project.
promising for controlling the growth mode, especially in hetero-epitaxial cases [2]. In addition, the functionality of the surfactant can be interpreted as follows: its presence on the substrate surface modifies the kinetics and/or elastic properties of the epitaxial system [3-6]. Focusing on the homo-epitaxial case of the Cu(111) surface, it has been found that the pre-deposition of Pb plays an efficient role in growing 2D islands [7]. However, Pb adsorption close to complete wetting gives a superstructure that is not uniform but consists of close-packed islands with some domains still consisting of uncovered Cu [8-9]. In order to gain insight into the detailed processes taking place, it is necessary to have information at the microscopic scale. It is the aim of the present work to perform Molecular Dynamics simulations at various Pb concentrations in order to put in evidence the role of Pb concentration in the preservation of the 2D growth mode.
2. Model and computational details
The simulations have been carried out in the canonical ensemble using the Nosé thermostat to control the temperature. The system consisted of 30720 particles arranged in an fcc lattice with periodic boundary conditions. By fixing the simulation box at a value twice as large as the size of the system along the [111] direction, we produced an infinite slab having its surfaces parallel to the (111) planes. The slab was made up of 24 atomic layers, each containing 1280 atoms. We used a potential model in analogy to the Tight Binding scheme in the second moment approximation [10-11] to describe the atomic interactions. This potential model has been successfully used for the study of Pb adatom diffusion on the Cu(110) face [12]. The simulations took place at temperature T=600K using the lattice constant that corresponded to zero pressure for the Cu bulk system. For the integration of the equations of motion, we used the Verlet algorithm with a time step of 5x10^-15 s, ensuring Hamiltonian conservation to within 10^-5. We studied 8 different Pb concentrations (0%, 9.4%, 18.8%, 28.1%, 37.5%, 46.9%, 51.6% and 56.3%). At each concentration the system was equilibrated for 200 ps. Time averages were then taken over trajectories of an additional 50 ps.
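Several of the quantities analysed in the next section, such as the two-dimensional radial distribution function of the adlayer, are standard analyses of such trajectories. The sketch below shows a minimal 2-D g(r) estimator for a set of in-plane adatom positions with periodic boundaries; the box size, the placeholder positions and the binning are illustrative assumptions, not data from the simulations.

```python
import numpy as np

def g_r_2d(xy, box, dr=0.05, r_max=10.0):
    """Two-dimensional radial distribution function for in-plane positions `xy`
    (shape (N, 2)) in a periodic square box of side `box`."""
    n = len(xy)
    rho = n / box**2                                   # areal number density
    d = xy[:, None, :] - xy[None, :, :]
    d -= box * np.rint(d / box)                        # minimum-image convention
    r = np.sqrt((d**2).sum(-1))[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(r, bins=np.arange(dr, r_max, dr))
    shell_area = np.pi * (edges[1:]**2 - edges[:-1]**2)
    g = 2.0 * counts / (n * rho * shell_area)          # factor 2: unordered pairs
    return 0.5 * (edges[1:] + edges[:-1]), g

rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 50.0, size=(500, 2))      # placeholder adlayer positions
r, g = g_r_2d(positions, box=50.0)
print("g(r) at the first few radii:", np.round(g[:5], 2))
```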
3. Results and Discussion
In Fig. 1 we present the two-dimensional radial distribution function g(r) for the Pb adlayer (solid lines) and the first Cu layer (dashed lines) for four different Pb concentrations. It is clear that the Cu layer exhibits the 6 characteristic peaks of the triangular surface lattice that correspond successively to the 1st, 2nd and up
to 6th neighbours, yielding coordination numbers 6, 6, 6, 12, 6 and 6 respectively [13]. The Pb overlayer at concentration θs = 56.3% remains stable upon heating near the melting point of the bulk, at T = 600 K. The corresponding radial distribution function preserves the same sequence of coordination numbers as the substrate up to the 4th neighbour, but at greater distances, in accordance with the lattice mismatch. For concentrations less than θs the radial distribution functions indicate that the Pb overlayer exhibits a liquid-like character (the peak of the first neighbour being shorter and wider than the corresponding peak at θs, while the corresponding second and third neighbour peaks disappear as g(r) oscillates near unity). This is also reflected in the corresponding nearest-neighbour distance, 3.60 Å up to θc = 37.5%, reduced to 3.41 Å at 56.3%, yielding a Pb adlayer compressed by as much as 2.65%, in agreement with SPA-LEED results on the melting behaviour of the incommensurate superstructure [14]. In addition, we calculated the relative interlayer relaxed positions (RIRP) for the Pb adlayer, for all concentrations studied. It comes out that the Pb atoms are expanded by 10% at concentration 9.4%, this expansion being greater at higher concentrations and finally reaching the value of 18.4% at 56.3%. These results compare well with available experimental data (16.75%) for the RIRP of the Pb overlayer [6] at a coverage of 1 ML. The liquid character of the overlayer below θc is also manifested by its diffusivity. In Fig. 2 we present the Pb atomic diffusion coefficient, D, as a function of concentration. It is clear that D obeys a power law in the concentration for coverages up to θc, while at higher concentrations it is significantly reduced. At all concentrations no intermixing with the substrate was observed.
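The diffusion coefficient just discussed is conventionally obtained from the Einstein relation, D = lim_{t→∞} ⟨|r(t) − r(0)|²⟩ / (4t) for in-plane (2D) motion. The following is a minimal sketch of that estimate; the trajectory array and sampling interval are placeholders.

```python
import numpy as np

def diffusion_coefficient_2d(xy, dt):
    """Estimate D from the slope of the 2-D mean-squared displacement.

    xy : array of shape (nframes, natoms, 2) with unwrapped in-plane positions
    dt : time between stored frames
    """
    disp = xy - xy[0]                              # displacement from the initial frame
    msd = (disp**2).sum(axis=2).mean(axis=1)       # average over atoms
    t = np.arange(len(msd)) * dt
    # fit MSD = 4*D*t + const over the second half of the trajectory
    half = len(t) // 2
    slope, _ = np.polyfit(t[half:], msd[half:], 1)
    return slope / 4.0
```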
4. Conclusions
In this communication we presented results concerning the structural and diffusive features of the Pb/Cu(111) interface as a function of the concentration of the Pb overlayer. We found that the adlayer's character changes from fluid to solid as the Pb concentration passes a characteristic value θc = 37.5%. Specifically, for concentrations less than θc the deposited Pb atoms exhibit a dilute-fluid behaviour, while above θc a 2D liquid-like character appears, recovering typical solid behaviour above the saturation concentration dictated by the lattice mismatch, θs = 56.3%. This behaviour is accompanied by a reduction of the first-neighbour distance of the Pb overlayer which, at θs = 56.3%, results in a compressed Pb overlayer adopting a lattice parameter of 3.41 Å, close to experimental findings.
Figure 1. Two-dimensional radial distribution functions for representative concentrations of the Pb overlayer; solid lines stand for the Pb overlayer and dashed lines stand for the first Cu substrate layer.
Figure 2. Log-log diagram of the atomic diffusion coefficient of Pb overlayer atoms as a function of concentration.
References
1. E. Bauer and J. H. van der Merwe, Phys. Rev. B 33, 3657 (1986).
2. J. J. de Miguel and R. Miranda, J. Phys.: Condens. Matter 14, R1063 (2002).
3. Z. Zhang, M. Lagally, Phys. Rev. Lett. 72, 693 (1994).
4. I. Markov, Phys. Rev. B 59, 1689 (1999).
5. D. Kandel and E. Kaxiras, Phys. Rev. Lett. 75, 2742 (1995).
6. S. Müller, J. E. Prieto, C. Rath, L. Hammer, R. Miranda and K. Heinz, J. Phys.: Condens. Matter 13, 1793 (2001).
7. J. Ferron, L. Gomez, J. M. Gallego, J. Camarero, J. E. Prieto, V. Cros, A. Vazquez de Parga, J. J. de Miguel and R. Miranda, Surf. Sci. 459, 135 (2000).
8. C. Nagl, O. Haller, E. Platzgummer, M. Schmid and P. Varga, Surf. Sci. 321, 237 (1994).
9. P. Varga, C. Nagl and M. Schmid, Surf. Sci. 369, 159 (1996).
10. N. I. Papanicolaou, G. C. Kallinteris, G. A. Evangelakis, D. A. Papaconstantopoulos and M. J. Mehl, J. Phys.: Condens. Matter 10, 10979 (1998).
11. E. Vamvakopoulos, G. A. Evangelakis, J. Phys.: Condens. Matter 13, 10757 (2001).
12. G. Prevot, C. Cohen, D. Schmaus, V. Pontikis, Surf. Sci. 459, 57 (2000).
13. J. H. Conway and N. J. A. Sloane, "Sphere Packings, Lattices and Groups", Springer-Verlag, p. 111 (1999).
14. G. Mayer, M. Michailov and M. Henzler, Surf. Sci. 202, 125 (1998).
STABILITY OF AN EQUILIBRIUM SOLUTION FOR A GYROSTAT ABOUT AN OSCILLATING POINT
J. A. VERA AND A. VIGUERAS
Departamento de Matemática Aplicada y Estadística, Universidad Politécnica de Cartagena, C/ Paseo Alfonso XIII, 52, 30203 Cartagena (Murcia), Spain. E-mail: [email protected]; [email protected]
Abstract: In the present paper we study the stability of an equilibrium solution of the problem of the motion of a symmetric gyrostat about a point belonging to its rigid part, when this point describes an oscillatory motion with amplitude z₀ and frequency ω. Two systems of reference are considered: a system OXYZ, whose axes are parallel to those of a fixed or inertial frame, and a mobile one, Oxyz, fixed in the body, whose axes are directed along the principal axes of inertia of the gyrostat. Thus, supposing that the gyrostat is symmetric, the gyrostatic moment is constant and adopts the form l = (0, 0, l), and the point O moves with an oscillatory motion with amplitude z₀ and frequency ω along the axis OZ, with z = z₀ sin ωt. Under forces derived from a potential function V(k₃), the equations of motion of this problem, in the body frame, are given by the following formulas,
where l is the total angular momentum vector of the gyrostat in the body frame, μ₁ is the moment of the forces derived from the potential function V(k₃) and μ₂ is the moment corresponding to the transportation forces of the mobile system. The above equations can be written, explicitly, in the form
A ṗ = (A − C) q r − l q + k₂ V′(k₃) + k₂ m z̈
A q̇ = (C − A) p r + l p − k₁ V′(k₃) − k₁ m z̈
ṙ = 0
k̇₁ = r k₂ − q k₃
k̇₂ = p k₃ − r k₁
k̇₃ = q k₁ − p k₂
with A = B and C the principal moments of inertia of the gyrostat at O, k = (k₁, k₂, k₃) the Poisson vector and ω = (p, q, r) the angular velocity of the gyrostat. These equations have the equilibrium solution
p = 0,  q = 0,  r = r₀,  k₁ = 0,  k₂ = 0,  k₃ = 1.
The linearized equations of the perturbed motion can be written as a Mathieu equation
d²u/dx² + (4λ²/ω²)(1 + μ sin 2x) u = 0
with
μ = m z₀ ω² / (A λ²). Then we obtain the following result: a necessary condition for stability of the previous equilibrium solution is that
n² λ² ≤ ω² ≤ λ² A / (m z₀).
Studying the curves of parametric resonance of the previous equation we can obtain better approximations of the regions of stability when the parameter μ is sufficiently small.
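The stability boundaries of a Mathieu-type equation such as the one above can also be located numerically by Floquet analysis: integrate two independent solutions over one period of the modulation and test whether the trace of the monodromy matrix satisfies |tr M| ≤ 2. The following sketch does this for the equation written above; the parameter values in the example call are purely illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

def monodromy_trace(lam, omega, mu):
    """Trace of the monodromy matrix of u'' + (4*lam^2/omega^2)*(1 + mu*sin(2x))*u = 0."""
    q = 4.0 * lam**2 / omega**2

    def rhs(x, y):
        u, v = y
        return [v, -q * (1.0 + mu * np.sin(2.0 * x)) * u]

    period = np.pi                              # period of the sin(2x) modulation
    columns = []
    for y0 in ([1.0, 0.0], [0.0, 1.0]):         # two independent initial conditions
        sol = solve_ivp(rhs, (0.0, period), y0, rtol=1e-10, atol=1e-12)
        columns.append(sol.y[:, -1])
    M = np.array(columns).T
    return M[0, 0] + M[1, 1]

# Bounded (stable) solutions of the linearized problem require |tr M| <= 2
print(abs(monodromy_trace(lam=1.0, omega=3.0, mu=0.1)) <= 2.0)
```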
References
1. D. R. Merkin, Introduction to the Theory of Stability, Texts in Applied Mathematics (24), Springer-Verlag, 1996.
2. V. V. Rumiantsev, On the stability of motion of gyrostats, J. Appl. Math. Mech. 25, 9-19 (1961).
3. V. V. Rumiantsev, On the stability of motion of certain types of gyrostats, J. Appl. Math. Mech. 25, 1158-1169 (1961).
4. E. Leimanis, The General Problem of the Motion of Coupled Rigid Bodies about a Fixed Point, Springer-Verlag, Berlin, Heidelberg, New York, 1965.
5. M. Farkas, Periodic Motions, Applied Mathematical Sciences (104), Springer-Verlag, 1991.
6. R. S. Zounes and R. H. Rand, Transition curves for the quasi-periodic Mathieu equation, SIAM J. Appl. Math. 58, 1094-1115 (1998).
7. J. A. Vera and A. Vigueras, Estabilidad de ciertos equilibrios de un giróstato simétrico bajo un potencial con simetría axial V(k₃), Métodos de dinámica orbital y rotacional (IV Jornadas de Trabajo en Mecánica Celeste), pp. 175-181, Servicio de Publicaciones de la Universidad de Murcia (Spain), 2002.
8. J. A. Vera and A. Vigueras, Stability of some equilibrium solutions of the problem of motion of a gyrostat in an incompressible ideal fluid, J. of Computational Methods in Sciences and Engineering, in press.
9. I. G. Malkin, Theory of Stability of Motion, Translation Series, United States Atomic Energy Commission, 1952.
COMPUTER AIDED ESTIMATION OF MOLECULAR WEIGHT AND LONG CHAIN BRANCHING DISTRIBUTION IN FREE RADICAL POLYMERIZATION
G. D. VERROS*
Department of Informatics and Computer Technology, Technological Educational Institute (T.E.I.) of Lamia, 35100 Lamia, Greece. E-mail: [email protected]
A robust method for calculating the molecular weight and long chain branching distribution in free radical polymerization is proposed in this work. The method is based on the direct integration of the large nonlinear integro-differential equation system describing the conservation of "dead" polymer and "live" radicals in the reactor. A fairly general kinetic mechanism was employed to describe the complex kinetics of homo- and co-polymerization in the presence of branching reactions such as transfer to polymer and terminal double bond polymerization. To simplify the calculations and reduce the order of the prohibitively large nonlinear system, the long chain hypothesis, in addition to the quasi-steady-state and continuous variable approximations, was applied to the "live" radical mass balances. The method was employed to calculate the molecular weight and long chain branching distribution of poly(p-methyl styrene) and poly(vinyl acetate) produced by bulk homopolymerization in a batch reactor. The assumptions are justified by comparing the calculated distributions with experimental data and with solutions obtained without invoking any assumption. The number and weight average molecular weights of the calculated molecular weight distributions are in excellent agreement with experimental data. The methodology is extended to branched copolymers to obtain the total weight molecular weight distribution by assuming that the respective copolymer composition distribution is uniform. In all cases, the obtained number and weight average molecular weights of the calculated distributions are in excellent agreement with those obtained independently by the method of moments. Results are presented showing the effect of branching reactions on the molecular weight distribution of copolymers. It is believed that the present method can be applied to other free radical polymerization systems to calculate the molecular weight and long chain branching distribution, leading thus to a more rational design of polymerization reactors.
Correspondence at P.O. Box 454, Plagiari, 57500 Epanomi, Greece.
1. Introduction
The production of polymers with desired end-use properties is of significant economic importance to the polymer industry. One of the most important molecular properties that control the end-use characteristics of polymers is the molecular weight distribution (MWD), as it directly affects the physical, mechanical and rheological properties of the final product. The aim of the present work is to calculate, for the first time to the best of our knowledge, the MWD of branched copolymers by directly solving the original mass conservation equations. This is not a new idea; in the 1960s Liu and Amundson [1] and DeTar and DeTar [2] applied this approach to simplified reaction schemes, but only the availability of robust computer facilities allows its full realization. In the following sections this methodology is described in detail.
2. Kinetic Mechanism
A fairly general kinetic mechanism for free radical chain addition copolymerization was used, comprising the following steps: chemical initiation (decomposition of the initiator I into two primary radicals PR• with rate constant k_d, followed by addition of PR• to a monomer M_j, j = 1, 2), propagation, chain transfer to monomer, chain transfer to modifier, chain transfer to polymer, termination by combination and termination by disproportionation.
Here, I is the initiator, PR• stands for an initiator fragment, M represents monomer and S is the modifier. We denote by R^i_{n,m,b} "live" radicals having n units of monomer 1 (M1), m units of monomer 2 (M2) and b long chain branches (LCB) per molecule. The superscript i refers to the ultimate monomer unit in the radical chain. Based on the above kinetic mechanism, a set of algebraic or differential equations, depending on the reactor type and mode of operation (steady state or dynamic), is derived in order to describe the mass conservation of the various reactants in a polymerization reactor. For a batch polymerization reactor one can derive the following design equation: d(VG)/dt = V r_G. For a semi-batch polymerization reactor, as well as a continuous stirred tank reactor (CSTR) operating under transient conditions, the general mass balance equation takes the form: d(VG)/dt = F_{G,in} − F_{G,out} + V r_G. Finally, for a tubular reactor operating under steady state conditions, and assuming the absence of axial and radial mixing, the design equation is: d(u_z G)/dz = r_G. Here r_G denotes the reaction rate of the various species (initiator, monomer(s), modifier, "live" and "dead" polymer molecules), V stands for the reactor volume, t is time, z represents the axial coordinate and u_z is the axial velocity. The model equations can be integrated directly in order to obtain the weight molecular weight distribution, the bivariate molecular weight-long chain branching distribution (MW-LCBD) in homopolymers, or the total weight molecular weight distribution (TWMWD) in copolymers. These distributions can be calculated directly without invoking kinetic assumptions ("theoretical solution"). However, in order to increase computational efficiency and decrease the required execution time, four common assumptions were implemented: a) the long chain hypothesis, b) the quasi-steady-state approximation (QSSA) for "live" homopolymer or copolymer radicals, c) the continuous variable approximation (CVA) and d) the uniform copolymer composition assumption [3, 4].
3. Results and Discussion
The first step in calculating the MWD by direct integration is to choose the correct integration scheme. The equations of the model include the low molecular weight species mass balances, such as the initiator(s), monomer(s), modifier(s) and primary radical rate equations, coupled with the energy and momentum conservation equations and the macromolecular ("live" and "dead"
polymer) equations. The former type of equation is usually stiff and the latter is not, which renders a simultaneous integration of the equation system ineffective. Therefore we chose the following two-level algorithm. The first level includes the integration of the monomer(s), initiator(s), modifier(s) and primary radical stiff equations using multi-step predictor-corrector methods, such as Gear's method. The second level includes the integration of the "live" and "dead" polymer mass balances using a single-step method; in the present work the backward Euler integration scheme is chosen. To calculate the MWD in the branched copolymer case we have to resort to numerical experimentation. An excellent agreement between the "theoretical solution" and the calculated MWD was obtained in all cases. The total real execution time on a personal computer operating at 2.7 GHz was about 2 min for the solution with assumptions, as opposed to about three hours for the "theoretical solution". To validate the assumption of uniform copolymer composition implemented in the model development, we compare the mean molecular weights calculated independently from the method of moments with the present model calculations. Please note that in the calculation of average molecular weights by the method of moments, the assumption of uniform copolymer composition was not invoked. In all cases the maximum deviation is smaller than ±2%, fully justifying the uniform copolymer composition assumption.
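The two-level idea can be illustrated on a drastically simplified homopolymerization: the stiff low-molecular-weight balances are advanced with a BDF (Gear-type) integrator, while the dead-chain distribution is updated with a single step per interval. The rate constants, the quasi-steady-state radical expression and the most-probable chain-length distribution used below are illustrative simplifications, not the full copolymer balances of the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative rate constants and initial conditions (placeholders)
kd, kp, kt, ktr, f_eff = 1e-5, 1e3, 1e7, 1.0, 0.6
I0, M0 = 1e-2, 5.0
nmax = 2000                                   # longest chain length tracked

def small_species(t, y):
    """Stiff level: initiator and monomer balances under QSSA for the radicals."""
    I, M = y
    R = np.sqrt(2 * f_eff * kd * I / kt)      # total radical concentration (QSSA)
    return [-kd * I, -(kp + ktr) * M * R]

def step(I, M, Dn, dt):
    # Level 1: integrate the stiff low-molecular-weight balances with BDF (Gear-type)
    sol = solve_ivp(small_species, (0.0, dt), [I, M], method="BDF")
    I, M = sol.y[:, -1]
    # Level 2: single-step update of the dead-chain distribution
    R = np.sqrt(2 * f_eff * kd * I / kt)
    p = kp * M / (kp * M + ktr * M + kt * R)          # propagation probability
    n = np.arange(1, nmax + 1)
    radical_cld = R * (1 - p) * p ** (n - 1)          # chain-length distribution of radicals
    source = (kt * R + ktr * M) * radical_cld         # rate of dead-chain formation
    return I, M, Dn + dt * source

Dn = np.zeros(nmax)
I, M = I0, M0
for _ in range(100):                                  # 100 intervals of 10 s
    I, M, Dn = step(I, M, Dn, 10.0)
```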
4. Conclusions
In the present work we developed a simple and effective method for calculating the molecular weight and long chain branching distribution of homopolymers and copolymers produced by free radical polymerization in the presence of branching reactions. The method has general value and can be applied to other free radical polymerization systems to calculate the molecular weight and long chain branching distribution, leading thus to a more rational design of polymerization reactors.
Acknowledgments
The author is thankful to Dr. N. A. Malamataris, Department of Mechanical Engineering, T.E.I. of W. Macedonia, for useful discussions.
References
1. S.L. Liu and N.R. Amundson, Rubber Chem. Technology 34, 995 (1961).
2. D.F. DeTar and C.E. DeTar, J. Phys. Chem. 70, 3842 (1966).
3. W.H. Stockmayer, J. Chem. Phys. 13, 199 (1945).
4. W.H. Ray, J. Macromol. Sci. Rev. Macromol. Chem. C8, 1 (1972).
VS-VO NUMEROV METHOD FOR THE NUMERICAL SOLUTION OF THE SCHRODINGER EQUATION
J. VIGO-AGUIAR
Departamento de Matemática Aplicada, Universidad de Salamanca, 37006 Salamanca, Spain. E-mail: [email protected]
H. RAMOS
Escuela Politécnica Superior, Campus Viriato, 49022 Zamora, Spain. E-mail: [email protected]
The goal of providing efficient general purpose numerical methods has been a central activity within the full scope of solving differential equations numerically. But lately there has been a certain change of trend, in the sense that some types of problems are so particular that it is better to develop their own special theory and techniques. One of these problem types is the so-called special second-order differential equation, which has the form

y″ = f(x, y),    (1)
where the right-hand side does not include the derivative of y. These problems arise in a wide variety of physical situations, and a good sign of their importance is the fact that some of them have their own proper name: Airy's equation, Duffing's equation, Hill's equation, ...; even the Bessel equation may be reduced to the form in (1). Different authors have dealt with the problem in (1) [4, 1, 2, 7], providing different approaches to solve it, but the pioneering work was due to Störmer (1907), in connection with numerical calculations concerning the aurora borealis. The k-step Störmer method may be derived similarly to the Adams method, by integrating the differential equation in (1) twice and then replacing f by the interpolating polynomial passing through the points (x_{n−(k−1)}, y_{n−(k−1)}), ..., (x_n, y_n),
where the x_i are equally spaced. To obtain more accurate formulas, the interpolating polynomial passing through the additional point (x_{n+1}, y_{n+1}) can be used. In this case we obtain the Numerov method (also known as the implicit Störmer method or Cowell method). The methods we have just mentioned are multistep-type methods, appropriate for solving the problem in (1) with more or less accuracy, and they have in common a fixed step size. But to be efficient, as some authors have remarked [3, p. 397], an integrator based on a particular formula must be suitable for a variable step-size formulation. We have obtained a generalization of the implicit Störmer method in its variable step-size version, complementing the ideas about variable-coefficient multistep methods that appeared in [10], for the special differential equation in (1). For that, it is necessary to use an easily computable vector, Q_{k+1},
whose components h_{3,j} are complete symmetric polynomials of degree 3 in the values H_{n+1}, H_n, H_{n−1}, and a matrix, S_{k+1}, whose coefficients are expressed in terms of certain elementary symmetric polynomials in the values H_{n+1}, H_n, ..., H_{n−(k−2)}, with

H_{n+1} = x_{n+1} − x_{n+1} = 0,
H_n = x_n − x_{n+1},
H_{n−1} = x_{n−1} − x_{n+1},
...
H_{n−(k−2)} = x_{n−(k−2)} − x_{n+1},
where the grid points x_i, i = n−(k−1), ..., n+1, are now unevenly spaced. The resulting formula may be expressed in the form (2), where F_{k+1} is the (k+1)-vector of Newton divided differences,

F_{k+1} = (f[x_{n+1}], f[x_{n+1}, x_n], ..., f[x_{n+1}, ..., x_{n−(k−1)}]).
Of course, we need a strategy for deciding in what way to change the step length (and the order, if this is the case). If we suppose that the numerical integration has proceeded successfully up to the point x_n and we attempt to advance from x_n to x_{n+1} = x_n + h_{n+1} with the variable step formula in (2), we propose as an estimate for the next step the unique solution of an accuracy equation involving a user-given tolerance TOL for the local error, a safety factor ρ (less than one), a constant coefficient from the corresponding implicit Störmer formula with fixed step size (expressed in terms of backward differences of the function f [3, p. 464]) and the Newton divided difference f[x_{n+1}, x_n, ..., x_{n−k}] of order k+1 as usually defined. Finally, some numerical examples are provided in order to show the good behaviour of the formula.
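For reference, the following is a short sketch of the classical fixed-step Numerov formula that the variable-step scheme generalizes, written for the linear case y″ = g(x) y arising from the Schrödinger equation; the test problem y″ = −y is only an illustration.

```python
import numpy as np

def numerov(g, y0, y1, x0, h, nsteps):
    """Fixed-step Numerov integration of y'' = g(x) * y."""
    x = x0 + h * np.arange(nsteps + 1)
    gx = np.array([g(xi) for xi in x])
    w = 1.0 - (h**2 / 12.0) * gx
    y = np.empty(nsteps + 1)
    y[0], y[1] = y0, y1
    for n in range(1, nsteps):
        # (1 - h^2 g_{n+1}/12) y_{n+1} = 2 (1 + 5 h^2 g_n/12) y_n - (1 - h^2 g_{n-1}/12) y_{n-1}
        y[n + 1] = (2.0 * (1.0 + 5.0 * h**2 * gx[n] / 12.0) * y[n]
                    - w[n - 1] * y[n - 1]) / w[n + 1]
    return x, y

# Test on y'' = -y with y(0) = 0; the exact solution is sin(x)
h = 0.01
x, y = numerov(lambda x: -1.0, 0.0, np.sin(h), 0.0, h, 1000)
print(abs(y[-1] - np.sin(x[-1])))      # error of order h**4
```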
References
1. J. P. Coleman and A. S. Booth, Analysis of a family of Chebyshev methods for y″ = f(x, y), J. Comp. Appl. Math. 44, 95 (1992).
2. G. Denk, A new numerical method for the integration of highly oscillatory second-order ordinary differential equations, Appl. Numer. Math. 13, 57 (1993).
3. E. Hairer, S. P. Norsett and G. Wanner, Solving Ordinary Differential Equations I, Springer, Berlin, 1987.
4. E. Hairer, S. P. Norsett and G. Wanner, Unconditionally stable methods for second order differential equations, Numer. Math. 32, 373 (1979).
5. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, John Wiley, New York, 1962.
6. E. Isaacson and H. B. Keller, Discrete Variable Methods in Ordinary Differential Equations, John Wiley, New York, 1966.
7. M. S. H. Khiyal and R. M. Thomas, Variable-order, variable-step methods for second-order initial-value problems, J. Comp. Appl. Math. 79, 263 (1997).
8. J. D. Lambert, Numerical Methods for Ordinary Differential Systems, John Wiley, England, 1991.
9. L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary Differential Equations. The Initial Value Problem, Freeman, San Francisco, CA, 1975.
10. J. Vigo-Aguiar, An approach to variable coefficients multistep methods for special differential equations, Int. J. Appl. Math. 8, 911 (1999).
NONLINEAR PRESSURE AND TEMPERATURE WAVES PROPAGATION IN FLUID SATURATED ROCK
M. DE’ MICHIELI VITTURI Dipartimento d i Matematica “L.Tonelli”, Pisa, Italy E-mail: demichieOmail.dm. unipi. it F. BEUX Scuola Normale Superiore d i Pisa, Italy
In this study we consider a non-linear one-dimensional model for temperature and pressure waves in fluid-saturated porous rock due to abrupt variations at depth. The geological system studied, illustrated in Fig. 1, represents a horizontal medium overlying an aquifer in a hyperthermal domain. The aquifer consists of a homogeneous isotropic deep horizon identified as a source of temperature or pressure change (for example, magma intrusion, fault friction phenomena, etc.). This is covered by a homogeneous isotropic upper horizon, with a lower temperature, represented by a fluid-saturated porous permeable medium. A third superficial horizon may also be present. Following Bonafede [1] and Natale et al. [2], the thermal space and time evolution is described by

∂T(t,z)/∂t = k ∂²T(t,z)/∂z² − β (∂P(t,z)/∂z)(∂T(t,z)/∂z) + χ (∂P(t,z)/∂z)²,    (1)

where k, β and χ are respectively the average thermal diffusivity due to diffusion, the average thermal diffusivity due to convection and the average dissipative diffusivity due to fluid-matrix friction. This equation is not sufficient to fully determine P and T; indeed, we also consider the equation introduced by McTigue [3]:

∂P(t,z)/∂t − H ∂²P(t,z)/∂z² − α ∂T(t,z)/∂t = 0,    (2)
where H and α are respectively the fluid (hydraulic) diffusivity and a source-term coefficient. Our purpose is to extract, from the system (1)-(2), the gradient of the pressure, and then to obtain the corresponding fluid velocity by means of the
Figure 1. Geological section of the system: a source of temperature or pressure change at depth (z = 0), overlain by a fluid-saturated porous-permeable medium up to the upper boundary (z = b). P₀ and T₀ are the values of pressure and temperature at the beginning of the phenomenon and P₁ and T₁ are the pressure and temperature jumps.
phenomenological Darcy law. This work is a first attempt to numerically solve the system of equations (1)-(2) in its general form. In some particular cases, for example for small values of the hydraulic diffusivity, we can take H ≈ 0, and the system of equations can then be linearized and solved analytically using the Hopf-Cole transformation [4]. Other particular cases are studied in [3]. In our work both a finite difference method and a Galerkin finite element method, using cubic Hermite trial functions, are applied for the discretization in the spatial variable, while a Crank-Nicolson implicit method is adopted for advancing in time. For the resolution of the discrete system obtained, a comparison between the Newton-Raphson method and the nonlinear generalized minimum residual (GMRES) method is carried out. We show the accuracy of the numerical solutions with respect to both the analytical ones found for the particular case H = 0 and the geological observations. In particular, we analyze the geochemical, seismological and geodetic observations made at Campi Flegrei following the 1982-1984 uplift, and more recent observations made in Pisa by the CNR.
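As a minimal illustration of the time discretization mentioned above, the sketch below applies the Crank-Nicolson scheme to the purely diffusive part of equation (1), ∂T/∂t = k ∂²T/∂z², with fixed boundary values; the grid, diffusivity and boundary and initial data are placeholders, and the convective and dissipative couplings of the full system are omitted.

```python
import numpy as np

def crank_nicolson_diffusion(T, k, dz, dt, nsteps):
    """Advance T_t = k * T_zz with Crank-Nicolson; T[0] and T[-1] are held fixed."""
    n = len(T)
    r = k * dt / (2.0 * dz**2)
    # Tridiagonal systems A * T_new = B * T_old on the interior nodes
    A = (np.diag((1 + 2 * r) * np.ones(n - 2))
         + np.diag(-r * np.ones(n - 3), 1) + np.diag(-r * np.ones(n - 3), -1))
    B = (np.diag((1 - 2 * r) * np.ones(n - 2))
         + np.diag(r * np.ones(n - 3), 1) + np.diag(r * np.ones(n - 3), -1))
    for _ in range(nsteps):
        rhs = B @ T[1:-1]
        rhs[0] += 2 * r * T[0]          # boundary contributions (old + new levels)
        rhs[-1] += 2 * r * T[-1]
        T[1:-1] = np.linalg.solve(A, rhs)
    return T

# Step change at the bottom boundary diffusing into an initially uniform layer
z = np.linspace(0.0, 1.0, 101)
T = np.zeros_like(z)
T[0] = 1.0
T = crank_nicolson_diffusion(T, k=1e-2, dz=z[1] - z[0], dt=0.1, nsteps=200)
```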
References
1. M. Bonafede, Hot fluid migration: An efficient source of ground deformation of the 1982-1985 crisis at Campi Flegrei-Italy, J. Volcanol. Geotherm. Res. 48, 187-198 (1991).
2. G. Natale, A. Salusti and A. Troisi, Rock deformation and fracturing process due to nonlinear shock waves propagating in hyperthermal fluid pressurized domains, Journal of Geophysical Research 103, 15,325-15,328 (1998).
3. D.F. McTigue, Thermoelastic response of fluid-saturated rock, Journal of Geophysical Research 91, 9533-9542 (1986).
4. G.B. Whitham, Linear and Nonlinear Waves, Wiley-Interscience, New York, 1974.
5. J.R. Rice and M.P. Cleary, Some basic stress diffusion solutions for fluid-saturated elastic porous media with compressible constituents, Rev. Geophys. 14, 227-241 (1991).
6. Y. Saad and M.H. Schultz, GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems, SIAM J. Sci. Statist. Comput. 7, 856-869 (1986).
7. L.B. Wigton, N.J. Yu and D.P. Young, GMRES Acceleration of Computational Fluid Dynamics Codes, AIAA Paper 85-1494, 1985.
NEURO-FUZZY ELLIPSOID BASIS FUNCTION MULTIPLE CLASSIFIER FOR DIAGNOSIS OF URINARY TRACT INFECTIONS
E. WADGE, V. KODOGIANNIS, D. TOMTSIS*
Mechatronics Group, Computer Science Dept., University of Westminster, London HA1 3TP, UK
*TEI of West Macedonia, Kozani, Greece
E-mail: [email protected], [email protected]
Recently, the use of smell in clinical diagnosis has been rediscovered due to major advances in odour sensing technology and artificial intelligence. Urinary tract infections are a serious health problem producing significant morbidity in a vast number of people each year. A newly developed "artificial nose" based on chemoresistive sensors has been employed to identify in vivo urine samples from 45 patients with suspected uncomplicated UTI who were scheduled for micro-biological analysis in a UK Health Laboratory environment. An intelligent model consisting of an odour generation mechanism, a rapid volatile delivery and recovery system, and a classifier system based on artificial intelligence techniques has been developed. The implementation of an advanced hybrid neuro-fuzzy scheme and the concept of fusion of multiple classifiers dedicated to specific feature parameters have also been adopted in this study. The experimental results confirm the validity of the presented methods.
1. Introduction
There is an increasing demand worldwide for the application of intelligent, fast and inexpensive measurement systems in clinical diagnosis, able to monitor metabolic changes of human physiology and transform them into global patterns of diseases such as diabetes, tuberculosis (TB), urinary tract infections (UTI) and gastrointestinal Helicobacter pylori (HP) related infections. UTI is a significant cause of morbidity, with 3 million UTI cases each year in the USA alone. Thirty-one percent of nosocomial infections in medical intensive care units are attributable to UTI, and it is estimated that 20% of females aged between 20 and 65 years suffer at least one episode per year. There are also links to other complicated or chronic urological disorders such as pyelonephritis, urethritis and prostatitis. Approximately 80% of uncomplicated UTI are caused by E. coli and 20% by enteric pathogens such as Enterococci, Klebsiellae, Proteus sp., coagulase (-) Staphylococci and fungal opportunistic pathogens such as Candida albicans. In the field of clinical microbiology, current techniques require 24-48 hours to identify a pathogenic species in urine midstream specimens following a series of biochemical (i.e. antibiotic sensitivity) tests. The introduction of intelligent electronic noses (EN) based on gas-sensing
devices shows potential to rediscover the diagnostic power of odours in clinical practice. Today there are commercial EN systems that use several sensor technologies, including conducting polymers, metal oxides and piezoelectric crystals. However, due to the low repeatability of the data patterns extracted from these sensors and the fuzzy nature of odour patterns, the use of advanced soft computing intelligence techniques could achieve successful classification. The use of neural networks (NNs) and fuzzy logic (FL) systems to model complex numerical systems is a much explored but still very active topic. Neuro-Fuzzy (NF) systems attempt to combine the low-level numerical modelling capabilities of NNs with some of the representational transparency of FL, but like FL they often suffer from the so-called "curse of dimensionality", where the number of partitions necessary to accurately model the data is large enough that the number of rules becomes prohibitively large. Often, moreover, the dimensionality of the data is large, making the construction of a single model difficult. Combining multiple classifiers to obtain higher accuracy is gaining increased attention. If a series of individual features can be identified within the complete dataset, then a sub-network can be trained for each feature and the classification results combined through a relevant fusion method. The objectives of this study are to:
• utilise an extended normalized radial basis function classifier trained using the Expectation Maximization (EM) algorithm and with a Split and Merge (SM) technique;
• develop a multiple classifier for non-linear pattern recognition problems involving large and noisy data;
• explore the benefits of the fuzzy integral soft fusion method over the average sub-network output;
• analyse 45 specimens of human urine by the application of an intelligent diagnostic model based on novel generation, detection and rapid recognition of urinary volatile patterns, and perform a diagnosis.
2. Experimental
Forty-five 5 ml urine samples were collected from randomly selected patients and inoculated into specially made centrifuge bottles to a final volume of 20 ml per VGK (volatile generating kit), then incubated aerobically for 5 hrs at 37°C. After 5 hrs of incubation, to coincide with the logarithmic phase of growth, the 45 VGK were placed in a 37°C water bath and directly connected with a specifically designed air-filtered sparging (bubbling) system and an activated carbon filter to provide clean air-flow above the urine headspace [1]. An electronic nose carrying 14 conducting polymer sensors was used in this study.
3. Neuro-fuzzy Classifier
One branch of NF systems attempts to reduce the complexity of the overall model by partitioning the input space into a series of regions of common characteristics. Each of these regions can then be modelled using a much simpler technique than is required to model the entire system. One promising partition scheme is related to Ellipsoid Basis Functions (EBF) as they define an elliptical instead of spherical partition of the input space providing thus a less constrained structure. By combining the EBF partitioning method with a series of local linear models we have a versatile local expert system as illustrated in the following figure.
For a system with N sets of C inputs contained in a matrix X, L outputs and M partitions/neurones, we can define the output y_{l,i} for an input vector x_i as

y_{l,i} = Σ_{j=1}^{M} Φ_{i,j} Θ_{l,j},    (3)

Θ_{l,j} = a_{j,0}^{l} + Σ_{p=1}^{C} a_{j,p}^{l} x_p.    (4)
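The forward pass defined by the two expressions above can be sketched as follows, assuming Gaussian ellipsoid activations (full covariance matrices) normalized over the M neurones, each weighting a local linear model; all parameter arrays in the sketch are placeholders, and the EM / split-and-merge training discussed next is not shown.

```python
import numpy as np

def ebf_forward(x, centres, covariances, a0, A):
    """Output of a normalized ellipsoid-basis-function network with local linear models.

    x            : (C,) input vector
    centres      : (M, C) ellipsoid centres
    covariances  : (M, C, C) ellipsoid shape matrices
    a0           : (M, L) constant terms of the local linear models
    A            : (M, L, C) linear coefficients of the local linear models
    """
    M = len(centres)
    act = np.empty(M)
    for j in range(M):
        d = x - centres[j]
        act[j] = np.exp(-0.5 * d @ np.linalg.solve(covariances[j], d))
    phi = act / act.sum()                        # normalized partition weights
    theta = a0 + np.einsum("mlc,c->ml", A, x)    # local linear model outputs, shape (M, L)
    return phi @ theta                           # blended network output, shape (L,)
```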
There are two sets of parameters to be learnt: the parameters that govern the position and size of each neurone (the ellipsoid centres and shapes entering Φ) and the a parameters, which are the coefficients of the linear model. We chose to optimise each of these sets of parameters separately, first partitioning a region of the input space and then training a local linear model for that area. One method that has enjoyed considerable and continued interest is Expectation Maximization (EM) [2]. The goal here is to maximise a likelihood function that quantifies the degree to which the parameters fit the data; this guarantees that the network will train to a local optimum. However, the EM algorithm is very susceptible to local maxima. In an effort to resolve this, a series of Split and Merge (SM) criteria and partial EM steps are used to better shift partitions away from overpopulated areas of the input space before recommencing the parameter updates.
4. Classifier Fusion
The classifications made by the individual feature classifiers must be combined in order to produce a system output. The aim is to incorporate information from each feature space so that decisions are based on the whole input space. The simplest method is to take the average output from each classifier as the system output. This does not take into account the objective evidence supplied by each of the feature classifiers and the confidence that we have in that classifier's results. The fuzzy integral is a method which claims to resolve both of these issues by combining the evidence of a classification with the system's expectation of the importance of that evidence.
5. Model structure and results
Thirty cases of UTI were identified from the 45 randomly selected samples by standard microscopy and culture: 13 patients were infected with E. coli (e), 9 with Proteus sp. (p) and 8 with coagulase (-) Staphylococcus sp. (st). The remaining cases were considered normal (n). 31 random examples covering all 4 classes were taken as training data, while the remaining 14 examples were used for testing. The electronic nose took readings of four features across all 14 sensors, providing 56 inputs on which to design the model [3]. We decomposed the data into the four feature spaces and trained one network on each feature. Two fusion methods were tested. The first was to take the average output of all of the networks as the system output; although this achieved 100% on the testing data, not all of the confidence levels of the system were over the 50% cut-off level. Using the fuzzy integral soft fusion method, all confidence levels were over 50% with 100% accuracy.
References
1. A. Pavlou, V.S. Kodogiannis and A.P.F. Turner, Intelligent classification of bacterial clinical isolates in vitro, using electronic noses, Int. Conf. on Neural Networks and Expert Systems in Medicine and Healthcare, Greece, pp. 231-237, 2001.
2. R. Langari, L. Wang and J. Yen, Radial basis function networks, regression weights and the Expectation-Maximization algorithm, IEEE Trans. on Systems, Man and Cybernetics 27(5), 613-623 (1997).
3. V.S. Kodogiannis, A. Pavlou, P. Chountas and A.P.F. Turner, Evolutionary computing techniques for diagnosis of urinary tract infections in vivo, using gas sensors, 6th Int. Conf. on Neural Networks and Soft Computing, ICNNSC 2002, Poland, 474-479, 2002.
PROTEIN FOLDING SIMULATIONS USING THE ACTIVATION-RELAXATION TECHNIQUE*
G. WEI AND N. MOUSSEAU
Département de physique et GCM, Université de Montréal, C.P. 6128, succ. centre-ville, Montréal (Québec), CANADA. E-mail: [email protected]
P. DERREUMAUX
IGS, CNRS UMR 1889, 91 Chemin Joseph Aiguier, 13402 Marseille Cedex 20, FRANCE. E-mail: [email protected]
Understanding the folding mechanisms by which proteins find their native and functional forms is a crucial issue in the post-genomic era. Molecular dynamics and Monte Carlo simulations are often used to simulate the folding process. Here we show that the activation-relaxation technique allows the discovery of new folding mechanisms for simple protein models.
1. Introduction
A standard view of protein folding is that the native (experimental) structure corresponds to the global free energy minimum, that the kinetics of folding is well described by a two-state model (denatured-native) and that the time scales for folding range from microseconds to minutes [1]. Many methods have been used to simulate the kinetics of folding. Among them, three methods have provided insights into the folding mechanisms of small protein models: standard molecular dynamics (MD) simulation, ensemble dynamics and replica exchange techniques. Since experiment cannot capture the details of the transition, one important question is whether these techniques
*This work is supported in part by the Fonds Québécois pour la Formation des Chercheurs et l'Aide à la Recherche and the Natural Sciences and Engineering Research Council of CANADA (G.W., N.M.)
can miss major folding pathways. To address this issue, we study the folding of a 16-residue β-hairpin (two strands connected by a loop) using the activation-relaxation technique. This peptide folds in isolation with a time constant of ∼6 µs according to fluorescence experiments.
2. The Activation-Relaxation Technique
ART can be used to optimize any cost function in a high-dimensional space through a series of activated steps. Here we apply its most recent version, which uses the Lanczos algorithm to extract a limited spectrum of eigenvectors and eigenvalues without requiring the evaluation and diagonalization of the full Hessian matrix [2]. A basic event in ART consists of four steps: (i) starting from a minimum, the system is first pushed in a random direction outside the harmonic well until a negative eigenvalue appears in the Hessian matrix; (ii) the system is then pushed along the corresponding eigenvector until the total force approaches zero, indicating a saddle point; (iii) the configuration is then relaxed into a new local energy minimum using standard minimization techniques (the relaxation step); (iv) the new configuration is accepted or rejected using the Metropolis criterion at 300 K.
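The four steps of an ART event can be sketched on a toy two-dimensional surface. The potential, step sizes, numerical Hessian and dimensionless inverse temperature below are placeholders chosen only to make the event loop concrete; the real implementation works on the full protein force field and uses the Lanczos algorithm instead of a full Hessian.

```python
import numpy as np
from scipy.optimize import minimize

def energy(x):
    """Toy two-well surface standing in for the protein force field (illustration only)."""
    return (x[0]**2 - 1.0)**2 + 2.0 * x[1]**2

def grad(x, h=1e-5):
    return np.array([(energy(x + h * e) - energy(x - h * e)) / (2 * h)
                     for e in np.eye(len(x))])

def hessian(x, h=1e-4):
    return np.array([(grad(x + h * e) - grad(x - h * e)) / (2 * h)
                     for e in np.eye(len(x))]).T

def art_event(x_min, beta=10.0, rng=np.random.default_rng(0)):
    # (i) leave the harmonic basin along a random direction until a negative eigenvalue appears
    d = rng.normal(size=x_min.size)
    d /= np.linalg.norm(d)
    x = x_min.copy()
    for _ in range(200):
        if np.linalg.eigvalsh(hessian(x))[0] < 0.0:
            break
        x = x + 0.05 * d
    # (ii) activation: follow the negative-curvature eigenvector until the force vanishes (saddle)
    for _ in range(500):
        g = grad(x)
        if np.linalg.norm(g) < 1e-5:
            break
        v = np.linalg.eigh(hessian(x))[1][:, 0]        # softest mode
        g_par = (g @ v) * v
        x = x - 0.02 * (g - 2.0 * g_par)               # force inverted along v
    # (iii) relaxation: slide off the saddle into a new local minimum
    x_new = minimize(energy, x + 0.05 * (x - x_min), jac=grad).x
    # (iv) Metropolis acceptance at (dimensionless) inverse temperature beta
    if rng.random() < np.exp(-beta * (energy(x_new) - energy(x_min))):
        return x_new
    return x_min

print(art_event(np.array([-1.0, 0.0])))
```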
3. Energy model
We use a flexible-geometry model where each amino acid is represented by six particles, i.e. N, H, Cα, C, O and one bead for the side chain. All coordinates are free to vary. The potential form was optimized on the structures of six training peptides with 10-38 residues [3]. OPEP includes three types of interactions: pairwise 6-12 contact energies between the side chains, taking into account the hydrophobic and hydrophilic character of each amino acid; potentials to maintain stereochemistry (excluded volume, bond lengths, bond angles); and two-body and four-body terms for hydrogen bonds (h-bonds). The two-body component between residues i and j, E_HB1, and the cooperative energy between two h-bonds ij and kl, E_HB2, are defined by

E_HB1 = ε_hb Σ_{ij} μ(r_ij) ν(α_ij),    (1)

ν(α_ij) = cos²α_ij if α_ij > 90°, 0 otherwise,    (3)

E_HB2 = ε_2hb exp(−(r_ij − σ)²/2) exp(−(r_kl − σ)²/2).    (4)

Here r_ij denotes the H···O distance, α_ij the NHO angle and σ = 1.8 Å. ε_hb = 1.0 kcal/mol if j = i+4 (helix), otherwise 2.5 kcal/mol; ε_2hb = −0.5 kcal/mol for helices, and −2.0 and −1.0 kcal/mol for antiparallel and parallel h-bonds, respectively.
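A small sketch of the hydrogen-bond terms (1), (3) and (4) as written above follows; the radial factor μ(r), whose explicit form is not reproduced here, is passed in as a user-supplied function, and the numbers in the example call are purely illustrative.

```python
import numpy as np

def nu(alpha_deg):
    """Angular factor of Eq. (3): cos^2(alpha) for alpha > 90 deg, zero otherwise."""
    return np.cos(np.radians(alpha_deg))**2 if alpha_deg > 90.0 else 0.0

def e_hb1(pairs, mu, eps_hb):
    """Two-body term of Eq. (1): sum over (r_ij, alpha_ij) h-bond candidates."""
    return eps_hb * sum(mu(r) * nu(a) for r, a in pairs)

def e_hb2(r_ij, r_kl, eps_2hb, sigma=1.8):
    """Cooperative term of Eq. (4) coupling the h-bonds ij and kl (distances in Angstrom)."""
    return eps_2hb * np.exp(-(r_ij - sigma)**2 / 2.0) * np.exp(-(r_kl - sigma)**2 / 2.0)

# Illustrative call: two sheet-like h-bonds with a placeholder radial factor
mu = lambda r: np.exp(-(r - 1.8)**2)          # placeholder for the OPEP mu(r)
print(e_hb1([(1.9, 160.0), (2.0, 150.0)], mu, eps_hb=2.5))
print(e_hb2(1.9, 2.0, eps_2hb=-2.0))
```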
Figure 1. A detailed analysis of a trajectory resulting in the native β-hairpin at 300 K, starting from a fully extended state. (a) Snapshots of the trajectory (events 180, 800, 1820 and 1960). (b) RMS deviation (rmsd) from the native state, radius of gyration of the peptide (Rg-G) and of the hydrophobic core (Rg-core), and number of native main-chain hydrogen bonds (NHB) as a function of event number. (c) Total energy as a function of event number.
4. Results
The hairpin spanning residues 41 to 56 of protein G was the subject of 60 runs at 300 K, starting from fully extended (52 runs) or random conformations (8 runs). From the analysis of the 20 runs that converge towards the ground state within 2000 events at 300 K (our criteria for nativeness are very severe), three folding mechanisms emerge. All mechanisms point
to a hydrophobic (HP) collapse in the early steps of folding, followed by the cooperative formation of hydrogen bonds and of the exact HP core. Two of them were previously described by ensemble dynamics [4], replica exchange [5] and MD simulations: the N- and C-termini first approach each other to form a loop and finally the β-turn forms, or the β-turn is formed first. A third, reptation mechanism, never described for a single protein, is discovered. Figure 1 shows this folding mechanism, which is closely followed by 6 independent runs. As seen, the peptide finds in less than 300 MC events conformations deviating by less than 2.5 Å from the native structure, but all contain non-native h-bonds. Then, at event 1960, the reptation motion of the loop connecting the two strands enhances longitudinal motion of the two strands and allows the formation of all native h-bonds.
References
1. A. Sali, E. Shakhnovich and M. Karplus, Nature 369, 248 (1994).
2. G. T. Barkema and N. Mousseau, Phys. Rev. Lett. 77, 4358 (1996); G. Wei, N. Mousseau and P. Derreumaux, J. Chem. Phys. 117, 11379 (2002).
3. P. Derreumaux, Phys. Rev. Lett. 85, 206 (2000).
4. B. Zagrovic, E. J. Sorin and V. Pande, J. Mol. Biol. 313, 151 (2001).
5. R. Zhou, B. J. Berne and R. Germain, Proc. Natl. Acad. Sci. USA 98, 14931 (2001).
ON THE SYSTEMATIC CONSTRUCTION OF MOLECULAR BASIS SETS
S. WILSON
The algebraic approximation [1, 2, 3], that is, the use of finite basis set expansions, lies at the heart of practical computational quantum chemistry. The choice of basis set is crucial since it ultimately determines the accuracy and therefore the utility of a particular application. Yet much of the "art" of contemporary quantum chemistry lies in the selection of an appropriate basis set for the problem in hand. Here we discuss systematic techniques for the construction of molecular basis sets; that is, methods which aim to explore the way in which a particular property or properties converge with increasing size of basis set. The use of basis sets comprising Gaussian-type basis functions, originally suggested independently by McWeeny [4] and by Boys [5], is now the method of choice in molecular electronic structure calculations. These functions have the analytic form

G_{nlm}(r; ζ, R) = r^{n−1} exp(−ζ r²) Y_{lm}(θ, φ),
where Y_{lm}(θ, φ) is a spherical harmonic, ζ is the exponent, R is the point upon which the function is centred, and n, l and m are integers which determine the nature of the Gaussian-type function; r, θ and φ are the spherical polar coordinates with respect to the point R. These functions have the well-known advantage over alternatives, most notably exponential-type functions, that the integrals over the components of the molecular electronic Hamiltonian can be evaluated quite easily [6, 7]. The most difficult and most numerous molecular integrals, involving the electron-electron interaction, can be written in terms of the error function. These integrals can be evaluated not only easily but also accurately. Additionally, Gaussian functions can be centred at arbitrary points in space since they do not introduce a cusp at the point upon which they are centred [8]. When the
widely used point nucleus model is abandoned in favour of a more realistic distributed nucleus model, the Gaussian-type function centred on that nucleus is found to be well suited to the description of orbitals in the vicinity of the nucleus [9]. Basis sets of Gaussian-type functions are found to display less computational linear dependence than, for example, exponential-type functions or elliptical functions. Traditionally, molecular basis sets have been constructed as the union of atomic basis sets supplemented by polarization functions, which aim to describe the distortion of the atomic orbitals in the molecular environment. The atomic basis sets are usually taken from tabulations determined by optimizing the exponents ζ and the choice of n, l and m for some particular model, such as Hartree-Fock or a correlated theory such as MBPT2 [10]. In most of the quantum chemical packages employed in contemporary computations, like GAUSSIAN [11], the parameters defining a wide range of atomic basis sets and supplementary polarization functions are stored and can be retrieved by means of a single, simple directive in the input data. There is also a web site [12] from which a variety of basis sets can be retrieved in formats suitable for a number of the more popular program packages. An important, but often unmentioned, aspect of these tabulated basis sets is that they attempt to provide a uniform level of accuracy for a range of atoms and properties. The absolute errors in the total energies supported by many tabulated basis sets are much larger than the energy differences which are of interest in the study of chemical processes. The tabulated basis sets attempt to balance the errors in the description of the initial and final states of the system under investigation. However, there is no formal proof of the cancellation of errors in the calculations for the initial and final states and, to a large extent, such cancellations must be regarded as fortuitous. As indicated above, much of the "art" of contemporary quantum chemistry lies in the selection of an appropriate basis set for the property of interest. For a given molecular system, different basis sets might be used to calculate different molecular properties. For example, supplementary diffuse functions are required to afford an improved description of the orbitals in the long range when calculating molecular polarizabilities and hyperpolarizabilities, whereas supplementary contracted functions support a better description of the orbitals in the regions near the nuclei, which is required when calculating nuclear magnetic resonance constants, for example. For more than forty years, the importance of studying the convergence of molecular electronic structure calculations with the size and quality of
the basis sets employed has been recognized. Writing in 1960, Roothaan remarked [13] on the finite basis set self-consistent field method: "It is ... not necessary to restrict the process to ... relatively crude applications; rather we wish to use it as a practical method to obtain the fully optimized Hartree-Fock orbitals by making the basis set ... sufficiently large and flexible. That this is possible without having to take an impracticably large basis set has been shown in a number of recent calculations on atoms [14] and the H2 molecule [15]." The phenomenal increase in the power of computers since this was written in 1960 has meant that the use of large and therefore accurate basis sets is today a reality. The last two decades have seen the development of a number of what might be termed hierarchical basis sets for the calculation of molecular energies and properties. We describe basis sets as hierarchical if, in some sense, they can be systematically extended so as to approach exact solutions in some limit. These basis sets have not only afforded "a practical method to obtain the fully optimized Hartree-Fock orbitals", but have also in many cases formed a firm foundation for correlation studies. In this talk I shall concentrate on the use of distributed universal even-tempered basis sets, but we should also mention the basis sets developed by Huzinaga [16], those of Poirier, Kari and Csizmadia [17], the ANO (atomic natural orbital) basis sets of Almlöf and Taylor [18-20], the correlation consistent basis sets of Dunning and his colleagues [21-29], and the modified ANO basis sets of Widmark and his co-workers [30, 31]. All of these basis sets are, in some sense, hierarchical. In their recent review of correlation consistent basis sets, Dunning, Peterson and Woon [29] list three criteria that a basis set should satisfy:
• The basis sets must provide an adequate description of electron correlation.
• The basis sets must systematically cover all of coordinate space.
• The basis sets must be as compact as possible.
(Dunning et al point out that although "it is certainly possible to develop a more extensive set of rules for basis sets ..., the above capture the major requirements.")
0
can support a description of electron correlation effects, both in calculations of electron correlation energies and in studies of the effects of electron correlation o n molecular properties. have the unique space-spanning property resulting the even-
685
tempered generating formula originally pointed out by Ruedenberg and his coworkers. are almost certainly not the most compact basis set f o r a particular molecular system or property. However, the requirement that the basis set be as compact as possible arises not because of dificulty in handling large basis sets but because of the d i i c u l t y in handling the associated two-electron integrals list which formally scale as the fourth power of the number of basis functions. For larger systems the scaling is signijkantly reduced. I n addition to these properties, distributed universal even-tempered basis
sets
are available for all atoms of the periodic table, including heavy and superheavy elements. rn are available for electronically excited states rn can be employed in four component relativistic molecular electronic structure studies.
rn
References 1. S. Wilson and D.M. Silver, Phys. Rev. A14, 1949 (1976). 2. S. Wilson, Adv. Chem. Phys. 67, 439 (1987). 3. S. Wilson, in Handbook of Molecular Physics and Quantum Chemistry, Volume 2, Molecular Electronic Structure, S . Wilson, P.F. Bernath and R. McWeeny, Wiley, Chichester (2003). 4. R. McWeeny, Nature 166, 21 (1950) 5. S.F. Boys, Proc. Roy. SOC.(London) A200,542 (1950) 6. V.R. Saunders, in Methods in Computational Molecular Physics, G.H.F. Diercksen and S. Wilson (editors), Reidel, Dordrecht (1983). 7. V.R. Saunders, in Handbook of Molecular Physics and Quantum Chemistry, Volume 2, Molecular Electronic Structure, S . Wilson, P.F. Bernath and R. McWeeny, Wiley, Chichester (2003). 8. D. Moncrieff and S. Wilson, J. Phys. B: At. Mol. Opt. Phys. 26, 1605 (1993) 9. H.M. Quiney, J. Kobus and S. Wilson, J. Phys. B: At. Mol. Opt. Phys. 34, 2045 (2001) 10. S. Wilson, Electron correlation in molecules, Clarendon Press, Oxford (1984) 11. M.J. Frisch, G.W. Trucks, H.B. Schlegel, P.M.W. Gill, B.G. Johnson, M.A. Robb, J.R. Cheeseman, T. Keith, G.A. Petersson, J.A. Montgomery, K. Raghavachari, M.A. Al-Laham, V.G. Zakrzewski, J.V. Ortiz, J.B. Foresman, J. Cioslowski, B.B. Stefanov, A. Nanayakkara, M. Challacombe, C.Y. Peng, P.Y. Ayala, W. Chen, M.W. Wong, J.L. Andres, E.S. Replogle, R. Gomperts,
686
R.L. Martin, D.J. Fox, J.S. Binkley, D.J. Defrees, J. Baker, J.P. Stewart, M. Head-Gordon, C. Gonzalm and J.A. Pople, GAUSSIAN94 Rev. (2.3, Gaussian Inc., Pittsburgh PA, (1995); GAUSSIAN98 Rev. A.9, Gaussian Inc., Pittsburgh PA, (2000). 12. http://www.emsl.pnl.gov:2080/forms/basisform.html 13. C.C.J. Roothaan, Rev. Mod. Phys. 32, 179 (1960) 14. C.C.J. Roothaan, L.M. Sachs and A.W. Weiss, Rev. Mod. Phys. 32, 186 (1960) 15. W. Kolos and C.C.J. Roothaan, Rev. Mod. Phys. 32, 219 (1960) 16. S. Huzinaga, Comput. Phys. Rept. 2, 281 (1985) 17. R. Poirier, R. Kari and I.G. Csizmadia, Handbook of Gaussian Basis Sets: A Compendium for Ab Initio Molecular Orbital Calculations, Elsevier, Amsterdam (1985) 18. J. Almlof and P.R. Taylor, J. Chem. Phys. 86, 4070 (1987) 19. J. Almlof and P.R. Taylor, Adv. Quant. Chem. 22, 301 (1991) 20. J. Almlof, in Modern Electronic Structure Theory. Part 1, D.R. Yarkony, editor, p. 110, World Scientific, Singapore (1995) 21. T.H. Dunning, Jr., J . Chem. Phys. 90,1007 (1989) 22. R.A. Kendall, T.H. Dunning, Jr. and R. J. Harrison, J. Chem. Phys. 96,6796 (1992) 23. D.E. Woon and T.H. Dunning, Jr., J. Chem. Phys. 98,1358 (1993) 24. D.E. Woon and T.H. Dunning, Jr., J . Chem. Phys. 100, 2975 (1994) 25. D.E. Woon and T.H. Dunning, Jr., J. Chem. Phys. 103, 4572 (1995) 26. A.K. Wilson, T. van Mourik and T.H. Dunning, Jr, J. Mol. Struct. (THEOCHEM), 388, 339 (1996) 27. T. van Mourik and T.H. Dunning, Jr, J . Chem. Phys., 107,2451 (1997) 28. T. van Mourik, A.K. Wilson, K.A. Peterson, D.E. Woon, and T.H. Dunning, Jr, Adv. Quantum Chem. 31, 105 (1998) 29. T.H. Dunning, Jr., K.A. Peterson and D.E. Woon, in Encyclopedia of Computational Chemistry, ed. P.v.R. Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollamn, H.F. Schaefer I11 and P.R. Schreiner, 1,88, Wiley, Chichester (1998) 30. P.O. Widmark, P.A. Malmqvist and B.O. ROOS,Theor. Chem. Acta. 77,291 (1990) 31. P.O. Widmark, B.J. Perrson and B.O. Roos, Theor. Chem. Acta. 79,419 (1991)
GENERALIZATION OF LAGUERRE ORBITALS TOWARD AN ACCURATE, CONCISE AND PRACTICAL ANALYTIC ATOMIC WAVE FUNCTION
Z. XIONG* and N. C. BACALIS
Theoretical and Physical Chemistry Institute, National Hellenic Research Foundation, Vasileos Constantinou 48, 116 35 Athens, Greece. E-mail: [email protected]
Simple analytic, non-orthogonal but selectively orthogonalizable, generalized Laguerre type atomic orbitals are proposed, which, when optimized, provide a concise and clear physical interpretation with near equivalent accuracy to numerical multi-configuration self-consistent field, in atomic multi-configuration calculations. A general Eckart theorem for excited states is analyzed, leading, via a thorough search of these orbitals, to an estimated energy correction, for various excited atomic states prone to variational collapse.
1. Wave function and examples
The proposed generalized Laguerre-type orbitals (GLTOs) are of the form φ_{n,l,m}(r, θ, φ) = A_{n,l,m} L_{n,l}({g}; r, z_{n,l}, b_{n,l}, q_{n,l}) Y_{l,m}(θ, φ), where A_{n,l,m} is a normalization constant and Y_{l,m}(θ, φ) are spherical harmonics. The radial part is (in a.u.)
'Also at Engineering Science Department, University of Patras, GR-26500 Patras, Greece; Partially supported by an IKY grant of the Greek State Scholarships Foundation and by a scholarship of the National Hellenic Research Foundation.
where g_{n−l−1}(n, l, {z_{i,l}, b_{i,l}, q_{i,l}}) = 1 (the rest of the g_k factors are extensively discussed below) and C_k are the usual associated Laguerre polynomial coefficients. The parameters z_{n,l}, b_{n,l}, q_{n,l} are determined from the (nonlinear variational) minimization of the desired root of the secular equation. The z_{n,l} parameters are effective nuclear charges and determine the radial extent of the orbitals. Since the z_{n,l} differ from orbital to orbital, these orbitals are, in general, non-orthogonal. The addition of the last term of Eq. (1) just modifies the radial part of the s-orbitals, and is especially useful for 1s, since this orbital usually absorbs a large correlation correction. The total wave function is a normalized CI expansion, formed out of Slater determinants composed of these orbitals, whose node positions and radial extent are optimized variationally through non-linear multidimensional minimization of the total energy. A selective intrinsic orthogonalization formalism to any lower n, l orbital of either the ground or a desired excited state is used, preserving the orbital characteristics; the rest of the orbitals remain non-orthogonal. First a main wave function in the dominant part of the active space (called 'main') is found (and used), well representing the state under consideration [e.g., for He, in the active space of 2s, 2p, 3s, 3p, the four ¹P° roots have the following 'main' wave functions: (2s2p), (2s3p ± 3s2p) and (3s3p)]. Then angular and radial correlation is added, simulating the cusp conditions. The adaptability of our orbitals to almost NMCSCF accuracy is due to the g_k factors, which move the orbital nodes appropriately during the minimization process, by intrinsic orthogonalization among desired orbitals of the same [3] or of a different [4] state (an advantage of this method), by directly solving ⟨n_i, l, m | n_j, l, m⟩ = δ_{i,j}, (i, j = 1, ..., N_orb). Thus these orbitals, after orthogonalization, are not linear combinations of each other, as in other orthogonalization schemes, but maintain a clear physical interpretation for all l = s, p, d, ..., enabling one to choose reasonable (and to reject unreasonable) outcomes even by inspection. Since the CI expansion may still contain non-orthogonal orbitals, we use the general non-orthogonal formalism of p. 66 of McWeeny [5]. Radial and angular correlation, simulating cusp conditions, is incorporated either via orthogonalization to desired lower-n orbitals, or via free non-orthogonality. Table 1 shows some examples compared with other calculations. We observe that our values are quite close to NMCSCF, so that our analytic orbitals and wave functions are quite trustworthy with CI expansions nearly as small as those of NMCSCF; i.e., by variationally moving the nodes and the
Table 1. Total energies of He 1s2 1S, Li 1s22s 2S, Li 1s22p 2P, C 1s22s22p2 3P and He 1s2s 1S, compared with other calculations. Atomic units are used.

Li 1s22s 2S        -7.4780 (e)
Li 1s22p 2P        -7.4080 (f)     -7.4084 (g)
C 1s22s22p2 3P    -37.78719 (i)   -37.78695 (j)   -37.79
He 1s2s 1S         -2.14261 (n)    -2.14389 (o)    -2.14456 (p)    -2.14347 (q)    -2.14380 (r)

Remarks on Table 1. (A) He 1s2 1S: (a) This work [6] using correlation orbitals up to 4f; (b) NMCSCF up to 4f [7]; (c) Exact [8]. (B) Li 1s22s 2S: (d) NMCSCF up to 4f [9]; (e) Exact [10]. (C) Li 1s22p 2P: (f) This work [6] using correlation orbitals up to 4f; (g) Large CI (45 CI terms up to 5g) [11]; (h) Experimental value [12]. (D) C 1s22s22p2 3P: (i) This work [6] up to 4f, including 1s', 2s' and 2p' (13 orbitals), keeping the 64 most significant configurations with 346 Slater determinants; (j) Large-scale NMCHF up to 4f in the active space [13]; (k) Large-scale MRCI using 145 Gaussian functions (17s11p6d5f4g2h) with 1 500 000 configurations [14]; (l) With 90 orbitals and 6 100 000 Slater determinants; (m) Using the 1/Z expansion [15]. (E) He 1s2s 1S: This work [6] by implementing the Hylleraas-Undheim-MacDonald (HUM) theorem [3]: (n) 1s2 + 2s (+1s'), (o) 1s2 + 2s, (p) 1s2 + 2s (+1s') + 2p; corresponding NMCSCF values: (q) 1s2 + 2s, (r) 1s2 + 2s + 2p.
2. Energy correction for excited states and examples
For excited states the general Eckart upper bound theorem (GET) is needed: the exact energy eigenvalue $E_n$ is a lower bound not of the calculated energy $E_e^{(n)}$ per se, but of the calculated augmented energy, i.e. of $\left(E_e^{(n)} + \delta_e^{(n)}\right)$, where

$\delta_e^{(n)} = \sum_{i=1}^{n-1} \left|\langle \psi_i \,|\, \Phi_e^{(n)} \rangle\right|^2 (E_n - E_i)$.

This is the GET theorem. Here $\psi_1, \psi_2, \ldots, \psi_n, \ldots$ are the exact eigenstates of the Hamiltonian $H$ (a complete orthonormal set) with energies $E_1 < E_2 < \ldots < E_n < \ldots$, and $\Phi_e^{(n)}$ is the calculated normalized $(n-1)$th excited state of the desired symmetry, with energy expectation value $E_e^{(n)}$. The theorem is proven by expanding $\Phi_e^{(n)}$ in terms of the exact eigenstates, multiplying the normalization condition by $E_n$, subtracting from the expansion, solving for $E_n$, and noting that the unknown

$\epsilon_e^{(n)} = \sum_{k=n+1}^{\infty} \left|\langle \psi_k \,|\, \Phi_e^{(n)} \rangle\right|^2 (E_k - E_n) \geq 0$.
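A hedged numerical sketch of how such an augmentation can be evaluated (an illustration only; the paper estimates the correction by orthogonalizing GLTO wave functions to accurately calculated approximations of the lower states): given the calculated excited state and approximate lower states as coefficient vectors in a common orthonormal basis, the correction is a sum of overlap-weighted energy gaps.

import numpy as np

def get_correction(phi_excited, E_excited, lower_states, lower_energies):
    # delta^(n) ~ sum_i |<psi_i|Phi_e^(n)>|^2 (E - E_i), with approximate
    # lower states psi_i in place of the (unknown) exact eigenstates.
    delta = 0.0
    for psi_i, E_i in zip(lower_states, lower_energies):
        delta += abs(np.vdot(psi_i, phi_excited))**2 * (E_excited - E_i)
    return delta

# Toy He-like example: the 2% ground-state contamination is invented, while
# -2.14596 (calculated 1s2s 1S) and -2.90372 (exact ground state) are literature-scale values.
phi2 = np.array([0.02, 0.9998])
psi1 = np.array([1.0, 0.0])
print(get_correction(phi2, -2.14596, [psi1], [-2.90372]))   # ~3e-4 a.u.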
For the ground state [(n = 1), i.e. e = g], GET: $E_n \le (E_e^{(n)} + \delta_e^{(n)})$ reduces to the usual Eckart upper bound theorem, since $\delta_g^{(1)} = 0$. The augmentation $\delta_e$ needs the exact (literally unknown) lower lying states. We estimate $\delta_e$ by orthogonalizing to (accurately calculated) approximations of these states (utilizing the ability for exact orthogonality between GLTOs). Therefore, our $\Delta_e$ values (our estimate of $\delta_e$) can only serve as rough (depending on the quality of our lower lying approximations) error corrections to the calculated energies (we have achieved O(10^-4) a.u.). For higher states we proceed consecutively starting from the first, until, for some n, $\Delta_e$ becomes comparable to the energy separation $E_n - E_{n-1}$. Depending on the accuracy of each approximate lower state and on the quality of the orthogonality of $\Phi_e$ to each of them (i < n), due to error accumulation, at about that n this process becomes unreliable.

Table 2. Our full CI up to 4f energies for the 1s2s 1S isoelectronic sequence from He to Ne, for He 1s3s 1S and for Li 1s(2s2p 3P) 2P [in the combination 1sα(2sα2pβ + 2sβ2pα) − 2(1sβ2sα2pα), which is orthogonal to 1s(2s2p 1P) 2P: 1sα(2sα2pβ − 2sβ2pα) (α, β mean spin-up, spin-down)], compared with other calculations (in a.u.). We used 10 orbitals. In the ab initio proximity estimation to E_2, we approximate δ^(2) by Δ^(2). 6.5-4 means 6.5 x 10^-4. The free one-configuration 1s2s values (i.e. 1s_e ± 2s_e), in the last column, are far beyond our proximity estimation to the exact E_n and are collapsed.

                     E^(2)        Exact        MCSCF (a)       z          Δ^(2)      1s_e ± 2s_e (e)
He                  -2.14596     -2.14597     -2.14595 (b)    1.6297     6.5-4      -2.156
Li+                 -5.04093     -5.04087     -5.04028        2.4353     1.7-3      -5.058
Be2+                -9.18469     -9.18487     -9.18413        3.6849     3.7-3      -9.206
B3+                -14.57834    -14.57853    -14.57769        4.4797     1.2-3     -14.603
C4+                -21.22258    -21.22202    -21.22111        5.5464     1.7-3     -21.248
N5+                -29.11382    -29.11542    -29.11445        6.4736     1.9-3     -29.143
O6+                -38.25841    -38.25876    -38.25775        7.4671     1.9-3     -38.288
F7+                -48.65206    -48.65206    -48.65102        8.4689     2.7-3     -48.682
Ne8+               -60.29534    -60.29534    -60.29428        9.4527                -60.327
He 1s3s 1S          -2.06129     -2.06127 [17] -2.06127       1.1090 (c) 1.5-4 (d)  -2.0696
Li 1s(2s2p 3P) 2P   -5.31998     -5.312 [18]  -5.3111 (f)     2.9670     7.54-3     -5.3416

Remarks on Table 2. (a) With seven configurations [16]. (b) Ref. [7], p. 67 (up to 6h). (c) The z of 2s_e; that of 1s_g is shown in the first line (He). (d) Δ^(3) = Δ^(3,1) + Δ^(3,2). (e) Free variation (without g_k factors). (f) Weiss in Ref. [19].
Table 2 shows some examples compared with other calculations, first of the 1s2s 1S isoelectronic sequence from He to Ne and then of higher singly
excited states and with more electrons. Because of the augmentation $\delta_e^{(n)}$, a wave function with minimal $(E_e^{(n)} + \Delta_e^{(n)})$ should be quite trustworthy.
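As a check of the inequality on the numbers already quoted (He row of Table 2, no new results): $E^{(2)} + \Delta^{(2)} = -2.14596 + 6.5\times10^{-4} = -2.14531$ a.u., and indeed the exact $E_2 = -2.14597 \le -2.14531$, as GET requires; the free one-configuration value $-2.156$ a.u., in contrast, lies below the exact energy, which is precisely the variational collapse the augmentation is meant to flag.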
References
1. We modified Powell's method, p. 412 of W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, Numerical Recipes in FORTRAN, 2nd ed. (Cambridge University Press, 1992), for a restricted region in parameter space.
2. C. A. Nicolaides, Int. J. Quantum Chem. 60, 119 (1996), and references therein.
3. E. Hylleraas and B. Undheim, Z. Phys. 65, 759 (1930); J. K. L. MacDonald, Phys. Rev. 43, 830 (1933).
4. C. E. Eckart, Phys. Rev. 36, 878 (1930). See also A. K. Theophilou, J. Phys. C 12, 5419 (1979).
5. R. McWeeny, Methods of Molecular Quantum Mechanics, 2nd ed. (Academic Press, 1989).
6. Z. Xiong, Ph.D. thesis, University of Patras, Greece (2002), unpublished.
7. C. F. Fischer, T. Brage and P. Jonsson, Computational Atomic Structure, An MCHF Approach (Institute of Physics Publishing, 1997).
8. Y. Accad, C. L. Pekeris and B. Schiff, Phys. Rev. A 4, 516 (1971).
9. C. F. Fischer, Phys. Rev. A 41, 3481 (1990).
10. S. Larsson, Phys. Rev. 169, 49 (1968).
11. A. W. Weiss, Astrophys. J. 138, 1262 (1963).
12. C. E. Moore, Atomic Energy Levels, Vols. 1-3 (Washington DC: US Govt. Printing Office, 1949).
13. D. Sundholm and J. Olsen, Chem. Phys. Lett. 182, 497 (1991).
14. A. Kalemos, A. Mavridis and A. Metropoulos, J. Chem. Phys. 111, 9536 (1999), and private communication.
15. J. N. Silverman, Chem. Phys. Lett. 160, 514 (1989).
16. C. Froese Fischer, Can. J. Phys. 51, 1238 (1973).
17. M.-K. Chen, J. Phys. B 27, 865 (1994).
18. D. Rassi, V. Pejcev and K. J. Ross, J. Phys. B: Atom. Molec. Phys. 10, 3535 (1977), and references therein.
19. D. L. Ederer, T. Lucatorto and R. P. Madden, Phys. Rev. Lett. 25, 1537 (1970).
BAYESIAN MODELS FOR MEDICAL IMAGE BIOLOGY USING MONTE CARLO MARKOV CHAINS TECHNIQUES
S. ZIMERAS, University of the Aegean, Department of Statistics and Actuarial Studies, Karlovassi, 832 00 Samos, Greece. E-mail: zimste@aegean.gr
F. GEORGIAKODIS, University of Piraeus, Department of Statistics and Insurance Science, 80 Karaoli & Dimitriou St., 185 34 Piraeus, Greece. E-mail: fotis@unipi.gr
Abstract

The use of Bayesian methods in medical biology and modeling is an approach which seeks to provide a unified framework for many different image processes. In this work, Bayesian models are presented to illustrate biological phenomena using the Gibbs sampler technique. Finally, methods for the estimation of model parameters are proposed, based on likelihood ratio tests.
In recent years stochastic models and statistical methods have been successfully applied in medical biology using image analysis techniques. Of particular interest are Bayesian methods based on local characteristics. Key components of any statistical analysis using such methods are the choice of an appropriate model as the prior and the estimation of the prior model parameters. The Bayesian approach to reconstruction involves the modeling of prior information describing local characteristics of the spatial process. These local characteristics are defined in terms of the conditional distribution of the random variables at each site, given the values at the other sites (Besag, 1974; 1986). This specification is called a Markov random field (Mrf). Each of these models consists of a set of sites forming a finite lattice, with a random variable associated with each site. The individual sites could represent points or regions, and the lattice could be regular or irregular. The associated random variables could be discrete or continuous, univariate or multivariate. In practice, estimates using Bayesian methods cannot be computed analytically. For this reason Monte Carlo algorithms can be used to generate samples from the posterior distribution, with parameter estimates calculated from this sample. The fundamental idea is to use an algorithm which generates a discrete-time Markov chain converging to the desired distribution.
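As a sketch only (not taken from this work), the local-characteristic specification of the simplest auto-model can be written down directly. Here a binary auto-logistic field (the two-level case of the auto-binomial model of Besag, 1974) on a regular lattice with a first-order neighbourhood; the parameter names alpha (external field) and beta (interaction) are hypothetical.

import numpy as np

def neighbour_sum(x, i, j):
    # Sum of the four first-order neighbours of site (i, j); free boundaries.
    n, m = x.shape
    s = 0
    if i > 0:     s += x[i - 1, j]
    if i < n - 1: s += x[i + 1, j]
    if j > 0:     s += x[i, j - 1]
    if j < m - 1: s += x[i, j + 1]
    return s

def conditional_prob(x, i, j, alpha, beta):
    # P(x_ij = 1 | rest): the conditional log-odds are linear in the neighbour
    # sum, which is exactly the Mrf specification through local characteristics.
    eta = alpha + beta * neighbour_sum(x, i, j)
    return 1.0 / (1.0 + np.exp(-eta))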
The most commonly used algorithms include the Gibbs sampler (Geman and Geman, 1984) and the Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970). Applications of these methods cover a wide range of areas, including: detection of lesions in medical imaging (Aykroyd and Green, 1991); geographical epidemiology (Besag et al., 1991); astronomy (Ripley and Sutherland, 1990); medical biology (Diggle, 1983); and medical imaging in SPECT (Green, 1990; Weir, 1993; Zimeras, 1997; 1999). Models based on Markov random fields are widely used to model spatial processes, especially in biology, where reconstructions of the cells can be presented as a spatial pattern model. A particular subclass of Mrf is the auto-models, introduced in Besag (1974) and further studied in Cross and Jain (1983), Besag (1986), Aykroyd et al. (1996) and Zimeras (1997). However, the prior component of these models usually involves unknown prior parameters, which control the influence of the prior distribution. Key components of any statistical analysis using such models are the choice of an appropriate model as a prior distribution and the estimation of the prior model parameters. In many applications, appropriate values of these parameters will be found by trial and error; in other cases a fully Bayesian approach will be adopted and the prior parameters estimated in the same way as the other model parameters. In all these cases it is, at least implicitly, expected that the procedures depend smoothly on the prior parameters and that there is a unique relationship between the parameters and the different types of behavior of the process. In this work, the spatial behavior of the auto-models is investigated using simulated and real data from auto-binomial and auto-Poisson models, which can be used to model biological structures or the treatment of diseases (Diggle, 1983). A simple deterministic model, based on a univariate iterative scheme which appears to emulate the behavior of the auto-models and allows us to make predictions regarding the behavior of the spatial models, is analyzed. For well-defined regions of the parameter space this iterative scheme is unstable, leading to catastrophic behavior (Chandler, 1978; Zimeras, 1997). This instability coincides with structural changes in the corresponding spatial model, and the critical boundaries for the iterative scheme coincide with those for the spatial model. For the simulated data, the Gibbs sampler was used to produce realizations illustrating a wide range of possible models. Finally, methods for the estimation of the model parameters are examined and a new methodology for the construction of hypothesis tests and confidence intervals is proposed. Conditional likelihood is used to estimate the model parameters based on the coding technique (Besag, 1974). A model selection procedure is proposed to classify the neighborhood structure of the image. The procedures are investigated using simulated data from the auto-binomial and auto-Poisson models. Finally, the auto-Poisson model is fitted to a real data example from the area of medical biology.
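A minimal self-contained sketch (under assumed simplifications: binary auto-logistic field, first-order neighbourhood, external-field parameter fixed at zero, a single checkerboard coding set) of the two computational steps mentioned above, namely generating a realization with the Gibbs sampler and then estimating the interaction parameter by maximizing the conditional (coding) likelihood; none of this is the paper's code.

import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

def neigh_sum(x):
    # First-order neighbour sums for every site (zero padding at the boundary).
    s = np.zeros(x.shape)
    s[1:, :] += x[:-1, :]; s[:-1, :] += x[1:, :]
    s[:, 1:] += x[:, :-1]; s[:, :-1] += x[:, 1:]
    return s

def gibbs_sample(shape=(48, 48), alpha=0.0, beta=0.3, sweeps=100):
    # Single-site Gibbs updates: each site is redrawn from its local conditional.
    x = rng.integers(0, 2, size=shape)
    for _ in range(sweeps):
        for i in range(shape[0]):
            for j in range(shape[1]):
                s = 0
                if i > 0: s += x[i - 1, j]
                if i < shape[0] - 1: s += x[i + 1, j]
                if j > 0: s += x[i, j - 1]
                if j < shape[1] - 1: s += x[i, j + 1]
                p = 1.0 / (1.0 + np.exp(-(alpha + beta * s)))
                x[i, j] = rng.random() < p
    return x

def coding_neg_loglik(beta, x, alpha=0.0):
    # Negative conditional log-likelihood over one coding (checkerboard) set,
    # whose sites are conditionally independent given the rest (Besag, 1974).
    eta = alpha + beta * neigh_sum(x)
    mask = (np.add.outer(np.arange(x.shape[0]), np.arange(x.shape[1])) % 2) == 0
    return -(x[mask] * eta[mask] - np.log1p(np.exp(eta[mask]))).sum()

x = gibbs_sample()
fit = minimize_scalar(coding_neg_loglik, bounds=(-2.0, 2.0), args=(x,), method='bounded')
print('estimated interaction parameter:', fit.x)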
References
1. Aykroyd R. G., Haigh J. G. B. and Zimeras S. (1996): Unexpected spatial patterns in exponential family auto-models, Graphical Models and Image Processing, Vol. 58, No. 5, 452-463.
2. Besag J. (1974): Spatial interaction and the statistical analysis of lattice systems, J. Royal Statistical Society, Series B, 36, 192-236.
3. Besag J. (1986): On the statistical analysis of dirty pictures, J. Royal Statistical Society, Series B, 48, 259-302.
4. Chandler D. (1978): Introduction to Modern Statistical Mechanics, Oxford University Press, New York.
5. Cross G. R. and Jain A. K. (1983): Markov random field texture models, IEEE Trans. Pattern Anal. Mach. Intell., 5(1), 25-39.
6. Diggle P. J. (1983): Statistical Analysis of Spatial Point Patterns, Academic Press, London.
7. Geman S. and Geman D. (1984): Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell., 6, 721-741.
8. Zimeras S. (1997): Statistical models in medical image analysis, Ph.D. Thesis, Leeds University, Department of Statistics.
9. Aykroyd R. G. and Zimeras S.: Inhomogeneous prior models for image reconstruction, Journal of the American Statistical Association (JASA), Vol. 94, No. 447, 934-946.
10. Zimeras S. and Aykroyd R. G.: Neighbourhood structure estimation of images using hierarchical testing, IEE Electronic Letters, 35, 2188-2189.