Lecture Notes in Bioinformatics
6330
Edited by S. Istrail, P. Pevzner, and M. Waterman Editorial Board: A. Apostolico S. Brunak M. Gelfand T. Lengauer S. Miyano G. Myers M.-F. Sagot D. Sankoff R. Shamir T. Speed M. Vingron W. Wong
Subseries of Lecture Notes in Computer Science
Kang Li Li Jia Xin Sun Minrui Fei George W. Irwin (Eds.)
Life System Modeling and Intelligent Computing International Conference on Life System Modeling and Simulation, LSMS 2010 and International Conference on Intelligent Computing for Sustainable Energy and Environment, ICSEE 2010 Wuxi, China, September 17-20, 2010 Proceedings, Part III
Series Editors Sorin Istrail, Brown University, Providence, RI, USA Pavel Pevzner, University of California, San Diego, CA, USA Michael Waterman, University of Southern California, Los Angeles, CA, USA Volume Editors Kang Li George W. Irwin The Queen’s University of Belfast, Intelligent Systems and Control School of Electronics, Electrical Engineering and Computer Science Ashby Building, Stranmillis Road, Belfast BT9 5AH, UK E-mail:
[email protected];
[email protected] Li Jia Xin Sun Minrui Fei Shanghai University, School of Mechatronical Engineering and Automation P.O.Box 183, Shanghai 200072, China E-mail:
[email protected];
[email protected];
[email protected]
Library of Congress Control Number: 2010933354
CR Subject Classification (1998): J.3, F.1, F.2, I.5, I.4, H.4
LNCS Sublibrary: SL 8 – Bioinformatics
ISSN 0302-9743
ISBN-10 3-642-15614-2 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-15614-4 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper 06/3180
Preface
The 2010 International Conference on Life System Modeling and Simulation (LSMS 2010) and the 2010 International Conference on Intelligent Computing for Sustainable Energy and Environment (ICSEE 2010) were formed to bring together researchers and practitioners in the fields of life system modeling/simulation and intelligent computing applied to worldwide sustainable energy and environmental applications. A life system is a broad concept, covering both micro and macro components ranging from cells, tissues and organs across to organisms and ecological niches. To comprehend and predict the complex behavior of even a simple life system can be extremely difficult using conventional approaches. To meet this challenge, a variety of new theories and methodologies have emerged in recent years on life system modeling and simulation. Along with improved understanding of the behavior of biological systems, novel intelligent computing paradigms and techniques have emerged to handle complicated real-world problems and applications. In particular, intelligent computing approaches have been valuable in the design and development of systems and facilities for achieving sustainable energy and a sustainable environment, the two most challenging issues currently facing humanity. The two LSMS 2010 and ICSEE 2010 conferences served as an important platform for synergizing these two research streams. The LSMS 2010 and ICSEE 2010 conferences, held in Wuxi, China, during September 17-20, 2010, built upon the success of two previous LSMS conferences held in Shanghai in 2004 and 2007 and were based on the Research Councils UK (RCUK)-funded Sustainable Energy and Built Environment Science Bridge project. The conferences were jointly organized by Shanghai University, Queen's University Belfast, Jiangnan University and the System Modeling and Simulation Technical Committee of Chinese Association for System Simulation (CASS), together with the Embedded Instrument and System Technical Committee of China Instrument and Control Society. The conference program covered keynote addresses, special sessions, themed workshops and poster presentations, in addition to a series of social functions to enable networking and foster future research collaboration. LSMS 2010 and ICSEE 2010 received over 880 paper submissions from 22 countries. These papers went through a rigorous peer-review procedure, including both pre-review and formal refereeing. Based on the review reports, the Program Committee finally selected 260 papers for presentation at the conference, from amongst which 194 were subsequently selected and recommended for publication by Springer in two volumes of Lecture Notes in Computer Science (LNCS) and one volume of Lecture Notes in Bioinformatics (LNBI). This particular volume of Lecture Notes in Bioinformatics (LNBI) includes 83 papers covering 10 relevant topics.
The organizers of LSMS 2010 and ICSEE 2010 would like to acknowledge the enormous contributions from the following: the Advisory and Steering Committees for their guidance and advice, the Program Committee and the numerous referees worldwide for their significant efforts in both reviewing and soliciting the papers, and the Publication Committee for their editorial work. We would also like to thank Alfred Hofmann, of Springer, for his continual support and guidance to ensure the high-quality publication of the conference proceedings. Particular thanks are of course due to all the authors, as without their excellent submissions and presentations, the two conferences would not have occurred. Finally, we would like to express our gratitude to the following organizations: Chinese Association for System Simulation (CASS), IEEE SMCS Systems Biology Technical Committee, National Natural Science Foundation of China, Research Councils UK, IEEE CC Ireland chapter, IEEE SMC Ireland chapter, Shanghai Association for System Simulation, Shanghai Instrument and Control Society and Shanghai Association of Automation. The support of the Intelligent Systems and Control research cluster at Queen’s University Belfast, Tsinghua University, Peking University, Zhejiang University, Shanghai Jiaotong University, Fudan University, Delft University of Technology, University of Electronic Science Technology of China, Donghua University is also acknowledged.
July 2010
Bohu Li Mitsuo Umezu George W. Irwin Minrui Fei Kang Li Luonan Chen Li Jia Xin Sun
LSMS-ICSEE 2010 Organization
Advisory Committee Kazuyuki Aihara, Japan Zongji Chen, China Guo-sen He, China Frank L. Lewis, USA Marios M. Polycarpou, Cyprus Olaf Wolkenhauer, Germany Minlian Zhang, China
Shun-ichi Amari, Japan Peter Fleming, UK Huosheng Hu,UK Stephen K.L. Lo, UK Zhaohan Sheng, China
Erwei Bai, USA Sam Shuzhi Ge, Singapore Tong Heng Lee, Singapore Okyay Kaynak, Turkey Peter Wieringa, The Netherlands
Cheng Wu, China Guoping Zhao, China
Yugeng Xi, China
Kwang-Hyun Cho, Korea
Xiaoguang Gao, China
Shaoyuan Li, China Sean McLoone, Ireland Xiaoyi Jiang, Germany Kok Kiong Tan, Singapore Tianyuan Xiao, China Donghua Zhou, China
Liang Liang, China Robert Harrison, UK Da Ruan Belgium Stephen Thompson, UK Jianxin Xu, Singapore Quanmin Zhu, UK
Steering Committee Sheng Chen, UK Tom Heskes, The Netherlands Zengrong Liu, China MuDer Jeng, Taiwan, China Kay Chen Tan, Singapore Haifeng Wang, UK Guangzhou Zhao, China
Honorary Chairs Bohu Li, China Mitsuo Umezu, Japan
General Chairs George W. Irwin, UK Minrui Fei, China
International Program Committee IPC Chairs Kang Li, UK Luonan Chen, Japan
IPC Regional Chairs Haibo He, USA Wen Yu, Mexico Shiji Song, China Xingsheng Gu, China Ming Chen, China
Amir Hussain, UK John Morrow, UK Taicheng Yang, UK Yongsheng Ding, China Feng Ding, China
Guangbin Huang, Singapore Qiguo Rong, China Jun Zhang, USA Zhijian Song, China Weidong Chen, China
Maysam F. Abbod, UK Vitoantonio Bevilacqua, Italy Yuehui Chen, China
Peter Andras, UK Uday K. Chakraborty, USA Xinglin Chen, China
Costin Badica, Romania
Minsen Chiu, Singapore Kevin Curran, UK Jianbo Fan, China
Michal Choras, Poland Mingcong Deng, Japan Haiping Fang, China Wai-Keung Fung, Canada Xiao-Zhi Gao, Finland Aili Han, China Pheng-Ann Heng, China Xia Hong, UK Jiankun Hu, Australia
Tianlu Chen, China Weidong Cheng, China Tommy Chow, Hong Kong, China Frank Emmert-Streib, UK Jiali Feng, China Houlei Gao, China Lingzhong Guo, UK Minghu Ha, China Laurent Heutte, France Wei-Chiang Hong, China Xiangpei Hu, China
Peter Hung, Ireland
Amir Hussain, UK
Xiaoyi Jiang, Germany Tetsuya J. Kobayashi, Japan Xiaoou Li, Mexico Paolo Lino, Italy Hua Liu, China Sean McLoone, Ireland Kezhi Mao, Singapore Wasif Naeem, UK Feng Qiao, China Jiafu Tang, China Hongwei Wang, China Ruisheng Wang, USA Yong Wang, Japan Lisheng Wei, China Rongguo Yan, China Zhang Yuwen, USA Guofu Zhai, China Qing Zhao, Canada Liangpei Zhang, China Shangming Zhou, UK
Pingping Jiang, China
IPC Members
Huijun Gao, China Xudong Guo, China Haibo He, USA Fan Hong, Singapore Yuexian Hou, China Guangbin Huang, Singapore MuDer Jeng, Taiwan, China Yasuki Kansha, Japan Gang Li, UK Yingjie Li, China Hongbo Liu, China Zhi Liu, China Fenglou Mao, USA John Morrow, UK Donglian Qi, China Chenxi Shao, China Haiying Wang, UK Kundong Wang, China Wenxing Wang, China Zhengxin Weng, China WeiQi Yan, UK Wen Yu, Mexico Peng Zan, China Degan Zhang, China Huiru Zheng, UK Huiyu Zhou, UK
Aim`e Lay-Ekuakillel, Italy Xuelong Li, UK Tim Littler, UK Wanquan Liu, Australia Marion McAfee, UK Guido Maione, Italy Mark Price, UK Alexander Rotshtein, Ukraine David Wang, Singapore Hui Wang, UK Shujuan Wang, China Zhuping Wang, China Ting Wu, China Lianzhi Yu, China Hong Yue, UK An Zhang, China Lindu Zhao, China Qingchang Zhong, UK
Secretary-General Xin Sun, China Ping Zhang, China Huizhong Yang, China
Publication Chairs Xin Li, China Wasif Naeem, UK
Special Session Chairs Xia Hong, UK Li Jia, China
Organizing Committee OC Chairs Shiwei Ma, China Yunjie Wu, China Fei Liu, China OC Members Min Zheng, China Yijuan Di, China Banghua Yang, China
Weihua Deng, China Xianxia Zhang, China
Yang Song, China Tim Littler, UK
Reviewers Renbo Xia, Vittorio Cristini, Aim'e Lay-Ekuakille, AlRashidi M.R., Aolei Yang, B. Yang, Bailing Zhang, Bao Nguyen, Ben Niu, Branko Samarzija, C. Elliott, Chamil Abeykoon, Changjun Xie, Chaohui Wang, Chuisheng Zeng, Chunhe Song, Da Lu, Dan Lv, Daniel Lai, David Greiner, David Wang, Deng Li, Dengyun Chen, Devedzic Goran, Dong Chen, Dongqing Feng, Du K.-L., Erno Lindfors, Fan Hong, Fang Peng, Fenglou Mao, Frank Emmert-Streib, Fuqiang Lu, Gang Li, Gopalacharyulu Peddinti, Gopura R. C., Guidi Yang, Guidong Liu, Haibo He, Haiping Fang, Hesheng Wang, Hideyuki Koshigoe, Hongbo Liu, Hongbo Ren, Hongde Liu, Hongtao Wang, Hongwei Wang, Hongxin Cao, Hua Han, Huan Shen, Hueder Paulo de Oliveira, Hui Wang, Huiyu Zhou, H.Y. Wang, Issarachai Ngamroo, Jason Kennedy, Jiafu Tang, Jianghua Zheng, Jianhon Dou, Jianwu Dang, Jichun Liu, Jie Xing, Jike Ge, Jing Deng, Jingchuan Wang, Jingtao Lei, Jiuying Deng, Jizhong Liu, Jones K.O., Jun Cao, Junfeng Chen, K. Revett, Kaliviotis Efstathios, C.H. Ko, Kundong Wang, Lei Kang,
Leilei Zhang, Liang Chen, Lianzhi Yu, Lijie Zhao, Lin Gao, Lisheng Wei, Liu Liu, Lizhong Xu, Louguang Liu, Lun Cheng, Marion McAfee, Martin Fredriksson, Meng Jun, Mingcong Deng, Mingzhi Huang, Minsen Chiu, Mohammad Tahir, Mousumi Basu, Mutao Huang, Nian Liu, O. Ciftcioglu, Omidvar Hedayat, Peng Li, Peng Zan, Peng Zhu, Pengfei Liu, Qi Bu, Qiguo Rong, Qingzheng Xu, Qun Niu, R. Chau, R. Kala, Ramazan Coban, Rongguo Yan, Ruisheng Wang, Ruixi Yuan, Ruiyou Zhang, Ruochen Liu, Shaohui Yang, Shian Zhao, Shihu Shu, Yang Song, Tianlu Chen, Ting Wu, Tong Liang, V. Zanotto, Vincent Lee, Wang Suyu, Wanquan Liu, Wasif Naeem, Wei Gu, Wei Jiao, Wei Xu, Wei Zhou, Wei-Chiang Hong, Weidong Chen, WeiQi Yan, Wenjian Luo, Wenjuan Yang, Wenlu Yang, X.H. Zeng, Xia Ling, Xiangpei Hu, Xiao-Lei Xia, Xiaoyang Tong, Xiao-Zhi Gao, Xin Miao, Xingsheng Gu, Xisong Chen, Xudong Guo, Xueqin Liu, Yanfei Zhong, Yang Sun, Yasuki Kansha, Yi Yuan, Yin Tang, Yiping Dai, Yi-Wei Chen, Yongzhong Li, Yudong Zhang, Yuhong Wang, Yuni Jia, Zaitang Huang, Zhang Li, Zhenmin Liu, Zhi Liu, Zhigang Liu, Zhiqiang Ge, Zhongkai Li, Zilong Zhao, Ziwu Ren.
Table of Contents
The First Section: Biomedical Signal Processing, Speech, Imaging and Visualization MMSVC: An Efficient Unsupervised Learning Approach for Large-Scale Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hong Gu, Guangzhou Zhao, and Jianliang Zhang
1
CUDA Based High Performance Adaptive 3D Voxel Growing for Lung CT Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Weiming Zhai, Fan Yang, Yixu Song, Yannan Zhao, and Hong Wang
10
Wavelet Packet-Based Feature Extraction for Brain-Computer Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Banghua Yang, Li Liu, Peng Zan, and Wenyu Lu
19
Keynote Address: The 3D Imaging Service at Massachusetts General Hospital: 11 Years Experience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gordon J. Harris
27
A Novel Localization System Based on Infrared Vision for Outdoor Mobile Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jingchuan Wang and Weidong Chen
33
Analytical Solution for the Forward Problem of Magnetic Induction Tomography with Multi-layer Sphere Model . . . . . . . . . . . . . . . . . . . . . . . . . Zheng Xu, Qian Li, and Wei He
42
Total Variation Regularization in Electrocardiographic Mapping . . . . . . . Guofa Shou, Ling Xia, and Mingfeng Jiang
51
The Time-Frequency Analysis of Abnormal ECG Signals . . . . . . . . . . . . . . Lantian Song and Fengqin Yu
60
Dynamic Spectrum and BP Neural Network for Non-invasive Hemoglobin Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Huiquan Wang, Gang Li, Zhe Zhao, and Ling Lin
67
Study on Real-Time Control of Exoskeleton Knee Using Electromyographic Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jiaxin Jiang, Zhen Zhang, Zhen Wang, and Jinwu Qian
75
Characterization of Cerebral Infarction in Multiple Channel EEG Recordings Based on Quantifications of Time-Frequency Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Li Zhang, Chuanhong He, and Wei He
84
Research on a Novel Medical Image Non-rigid Registration Method Based on Improved SIFT Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anna Wang, Dan Lv, Zhe Wang, and Shiyao Li
91
Automatic and Reliable Extraction of Dendrite Backbone from Optical Microscopy Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Liang Xiao, Xiaosong Yuan, Zack Galbreath, and Badrinath Roysam
100
Magnetic Induction Tomography: Simulation Study on the Forward Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei He, Xiaodong Song, Zheng Xu, and Haijun Luo
113
Diagnosis of Liver Diseases from P31 MRS Data Based on Feature Selection Using Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jinyong Cheng, Yihui Liu, Jun Sang, Qiang Liu, and Shaoqing Wang
122
Research of Acupuncturing Based on Hilbert-Huang Transform . . . . . . . . Xiaoxia Li, Xiumei Guo, Guizhi Xu, and Xiukui Shang
131
A New Microphone Array Speech Enhancement Method Based on AR Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Liyan Zhang, Fuliang Yin, and Lijun Zhang
139
A Forecast of RBF Neural Networks on Electrical Signals in Senecio Cruentus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jinli Ding and Lanzhou Wang
148
Classification of Malignant Lymphomas by Classifier Ensemble with Multiple Texture Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bailing Zhang and Wenjin Lu
155
Denoising of Event-Related Potential Signal Based on Wavelet Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhen Wu, Junsong Wang, Deli Shen, and Xuejun Bai
165
The Second Section: Biological and Biomedical Data Integration, Mining and Visualization Predict Molecular Interaction Network of Norway Rats Using Data Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Qian Li and Qiguo Rong The Study of Rats’ Active Avoidance Behavior by the Cluster Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Otar Tavdishvili, Nino Archvadze, Sulkhan Tsagareli, Anna Stamateli, and Marika Gvajaia MS Based Nonlinear Methods for Gastric Cancer Early Detection . . . . . . Jun Meng, Xiangyin Liu, Fuming Qiu, and Jian Huang
173
180
189
The SEM Statistical Mixture Model of Segmentation Algorithm of Brain Vessel Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xingce Wang, Feng Xu, Mingquan Zhou, Zhongke Wu, and Xinyu Liu
196
Classification and Diagnosis of Syndromes in Chinese Medicine in the Context of Coronary Heart Disease Model Based on Data Mining Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yong Wang, Huihui Zhao, Jianxin Chen, Chun Li, Wenjing Chuo, Shuzhen Guo, Junda Yu, and Wei Wang
205
An Image Reconstruction Method for Magnetic Induction Tomography: Improved Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei He, Haijun Luo, Zheng Xu, and Qian Li
212
The Segmentation of the Body of Tongue Based on the Improved Level Set in TCM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wenshu Li, Jianfu Yao, Linlin Yuan, and Qinian Zhou
220
Transcutaneous Coupling Implantable Stimulator . . . . . . . . . . . . . . . . . . . . Hui Xiong, Gang Li, Ling Lin, Wangming Zhang, and Ruxiang Xu
230
Simulation Analysis on Stimulation Modes of Three-Dimension Electrical Impedance Tomography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei He, Kang Ju, Zheng Xu, Bing Li, and Chuanhong He
238
Researches on Spatio-temporal Expressions of Intestinal Pressure Activity Acquired by the Capsule Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rongguo Yan, Xudong Guo, and Guozheng Yan
246
Analysis of Chlorophyll Concentration during the Phytoplankton Spring Bloom in the Yellow Sea Based on the MODIS Data . . . . . . . . . . . Xiaoshen Zheng and Hao Wei
254
A Novel Association Rule Mining Based on Immune Computational Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xuesong Xu and Sichun Wang
262
The Third Section: Computational Intelligence in Bioinformatics and Biometrics Face Recognition via Two Dimensional Locality Preserving Projection in Frequency Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chong Lu, Xiaodong Liu, and Wanquan Liu
271
Prediction of Protein-Protein Interactions Using Subcellular and Functional Localizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yanliang Cai, Jiangsheng Yu, and Hanpin Wang
282
Nucleosomes Are Well Positioned at Both Ends of Exons . . . . . . . . . . . . . . Hongde Liu and Xiao Sun
291
An Evaluation of DNA Barcoding Using Genetic Programming-Based Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Masood Zamani and David K.Y. Chiu
298
Auto-Creation and Navigation of the Multi-area Topological Map for 3D Large-Scale Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wenshan Wang, Qixin Cao, Chengcheng Deng, and Zhong Liu
307
Relation of Infarct Location and Size to Extent of Infarct Expansion After Acute Myocardial Infarction: A Quantitative Study Based on a Canine Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jianhong Dou, Ling Xia, Yunliang Zang, Yu Zhang, and Guofa Shou
316
Artificial Intelligence Based Optimization of Fermentation Medium for β-Glucosidase Production from Newly Isolated Strain Tolypocladium Cylindrosporum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yibo Zhang, Lirong Teng, Yutong Quan, Hongru Tian, Yuan Dong, Qingfan Meng, Jiahui Lu, Feng Lin, and Xueqing Zheng
325
The Human Computer Interaction Technology Based on Virtual Scene . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Huimeng Tan, Wenhua Zhu, and Tianpeng Wang
333
ICA-Based Automatic Classification of Magnetic Resonance Images From ADNI Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wenlu Yang, Xinyun Chen, Hong Xie, and Xudong Huang
340
Label Propagation Algorithm Based on Non-negative Sparse Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nanhai Yang, Yuanyuan Sang, Ran He, and Xiukun Wang
348
Multiple Sequence Alignment by Improved Hidden Markov Model Training and Quantum-Behaved Particle Swarm Optimization . . . . . . . . . Chengyuan Li, Haixia Long, Yanrui Ding, Jun Sun, and Wenbo Xu
358
Breast Cancer Diagnosis Using WNN Based on GA . . . . . . . . . . . . . . . . . . Xiaomei Yi, Peng Wu, Jian Li, and Lijuan Liu
367
Lattice-Based Artificial Endocrine System . . . . . . . . . . . . . . . . . . . . . . . . . . . Qingzheng Xu, Lei Wang, and Na Wang
375
Direct Sparse Nearest Feature Classifier for Face Recognition . . . . . . . . . . Ran He, Nanhai Yang, Xiu-Kun Wang, and Guo-Zhen Tan
386
The Fourth Section: Computational Methods and Intelligence in Modeling Molecular, Cellular and Multi-cellular behavior and Dynamics A Mathematical Model of Myelodysplastic Syndromes: The Effect of Stem Cell Niches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiuwei Zhu, Ling Xia, and Luyao Lu
395
Ion Channel Modeling and Simulation Using Hybrid Functional Petri Net . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yin Tang and Fei Wang
404
Computer Simulation on the Compaction of Chromatin Fiber Induced by Salt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chun-Cheng Zuo, Yong-Wu Zhao, Yong-Xia Zuo, Feng Ji, and Hao Zheng Electrical Remolding and Mechanical Changes in Heart Failure: A Model Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yunliang Zang and Ling Xia Modeling Conformation of Protein Loops by Bayesian Network . . . . . . . . Peng Yang, Qiang L¨ u, Lingyun Yang, and Jinzhen Wu
413
421 430
The Fifth Section: Intelligent Modeling, Monitoring, and Control of Complex Nonlinear Systems Towards Constraint Optimal Control of Greenhouse Climate . . . . . . . . . . Feng Chen and Yongning Tang
439
A Kernel Spatial Complexity-Based Nonlinear Unmixing Method of Hyperspectral Imagery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiaoming Wu, Xiaorun Li, and Liaoying Zhao
451
Study on Machine Vision Fuzzy Recognition Based on Matching Degree of Multi-characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jingtao Lei, Tianmiao Wang, and Zhenbang Gong
459
Application and Numerical Simulation on Water Mist Cooling for Urban Environment Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Junfeng Wang, Xincheng Tu, Zhentao Wang, and Jiwei Huang
469
Optimal Guaranteed Cost Control for Linear Uncertain System with Pole and H∞ Index Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xianglan Han and Gang Zhang
481
Statistical Modelling of Glutamate Fermentation Process Based on GAMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chunbo Liu, Xuan Ju, and Feng Pan
490
The Application of Support Vector Regression in the Dual-Axis Tilt Sensor Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Su and Jingqi Fu
500
Implementing Eco-Friendly Reservoir Operation by Using Genetic Algorithm with Dynamic Mutation Operator . . . . . . . . . . . . . . . . . . . . . . . . Duan Chen, Guobing Huang, Qiuwen Chen, and Feng Jin
509
The Sixth Section: Intelligent Medical Apparatus and Clinical Applications Research on the Biocompatibility of the Human Rectum and a Novel Artificial Anal Sphincter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Peng Zan, Jinyi Zhang, Yong Shao, and Banghua Yang A Medical Tracking System for Contrast Media . . . . . . . . . . . . . . . . . . . . . . Chuan Dai, Zhelong Wang, and Hongyu Zhao Rapid Planning Method for Robot Assited Minimally Invasive Surgery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yanhua Cheng, Chun Gong, Can Tang, Jianwei Zhang, and Sheng Cheng
517 525
532
The Seventh Section: Modeling and Simulation of Societies and Collective Behavior Autonomic Behaviors of Swarm Robots Driven by Emotion and Curiosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Takashi Kuremoto, Masanao Obayashi, Kunikazu Kobayashi, and Liang-Bing Feng
541
Modelling and Simulating Dynamic Evolvement of Collective Learning Behaviors by Voronoi Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xiang-min Gao and Ming-yong Pang
548
Study of the Airway Resistance of a Micro Robot System for Direct Tracheal Inspection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lianzhi Yu, Guozheng Yan, Yuesheng Lu, and Xiaofei Zhu
555
Numerical Simulation of the Nutrient and Phytoplankton Dynamics in the Bohai Sea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hao Liu, Wenshan Xu, and Baoshu Yin
564
Personalized Reconstruction of 3D Face Based on Different Race . . . . . . . Diming Ai, Xiaojuan Ban, Li Song, and Wenxiu Chen Lake Eutrophication Evaluation and Diagnosis Based on Bayesian Method and SD Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kai Huang, Xulu Chen, and Huaicheng Guo
570
579
The Eighth Section: Brain Stimulation, Neural Dynamics and Neural Interfacing Respiration Simulation of Human Upper Airway for Analysis of Obstructive Sleep Apnea Syndrome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Renhan Huang and Qiguo Rong
588
Optimization for Nonlinear Time Series and Forecast for Sleep . . . . . . . . . Chenxi Shao, Xiaoxu He, Songtao Tong, Huiling Dou, Ming Yang, and Zicai Wang
597
Classifying EEG Using Incremental Support Vector Machine in BCIs . . . Xiaoming Zheng, Banghua Yang, Xiang Li, Peng Zan, and Zheng Dong
604
Acute Isolation of Neurons Suitable for Patch-Clamping Study from Frontal Cortex of Mice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yuan-yuan Li, Li-jun Cheng, Gang Li, Ling Lin, and Dan-dan Li
611
Palmprint Identification Using PCA Algorithm and Hierarchical Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ling Lin
618
Image Fusion Using Self-constraint Pulse-coupled Neural Network . . . . . . Zhuqing Jiao, Weili Xiong, and Baoguo Xu Segmentation for SAR Image Based on a New Spectral Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Li-Li Liu, Xian-Bin Wen, and Xing-Xing Gao
626
635
The Ninth Section: Intelligent Construction and Energy Saving Techniques for Sustainable and Green Built Environment Satellite-Retrieved Surface Chlorophyll Concentration Variation Based on Statistical Methods in the Bohai Sea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Li Qian, Wen-ling Liu, and Xiao-shen Zheng
644
A Study on the Cooling Effects of Greenery on the Surrounding Areas by Computer Simulation for Green Built Environment . . . . . . . . . . . . . . . . Jiafang Song and Xinyu Li
653
Spatial-temporal Variation of Chlorophyll-a Concentration in the Bohai Sea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wen-ling Liu, Li Qian, and Xiao-shen Zheng
662
Effect of the Twirling Frequency on Firing Patterns Evoked by Acupuncture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yu-Liang Liu, Jiang Wang, Wen-Jie Si, Bin Deng, and Xi-Le Wei
671
The Tenth Section: Intelligent Water Treatment and Waste Management Technologies Comparison of Two Models for Calculating Water Environment Capacity of Songhua River . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shihu Shu and Huan Ma
683
Growth Characteristics and Fermentation Kinetics of Flocculants-Producing Bacterium F2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jie Xing, Jixian Yang, Fang Ma, Wei Wang, and Kexin Liu
691
Research on Enrichment for Anammox Bacteria Inoculated via Enhanced Endogenous Denitrification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yi Yuan, Yong Huang, Huiping Deng, Yong Li, and Yang Pan
700
Evaluation of Geological Disaster with Extenics Based on Entropy Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Xinmin Wang, Zhansheng Tao, and Xiwen Qin
708
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
717
MMSVC: An Efficient Unsupervised Learning Approach for Large-Scale Datasets Hong Gu, Guangzhou Zhao, and Jianliang Zhang College of Electric Engineering, Zhejiang University, Hangzhou, China, 310027 {ghong,zhaogz,jlzhang}@zju.edu.cn
Abstract. This paper presents a multi-scale, hierarchical framework to extend the scalability of support vector clustering (SVC). Based on multi-sphere support vector clustering, the clustering algorithm in this framework, called multi-scale multi-sphere support vector clustering (MMSVC), works in a coarse-to-fine, top-down manner. Given one parent cluster, the next learning scale is generated by a secant-like numerical algorithm. A local quantity called spherical support vector density (sSVD) is proposed as a cluster validity measure that describes the compactness of a cluster; it is used as a termination criterion in our framework. When dealing with large-scale datasets, our method benefits from online learning, easy parameter tuning and high learning efficiency. 1.5 million tiny images were used to evaluate the method. Experimental results demonstrate that the method greatly improves the scalability and learning efficiency of support vector clustering.
Keywords: one-class support vector machine, large-scale clustering, support vector clustering.
1 Introduction
Clustering is an important form of unsupervised learning, which is widely used in many fields including large-scale bioinformatics, data mining, pattern recognition and image retrieval. In recent years, many kernel-based clustering approaches have been proposed in the literature along with the growing body of research on kernel methods [1-4]. Support vector clustering (SVC), first proposed by Ben-Hur et al. [5], is one of them. The basic idea of SVC is to estimate the density of data points in a high-dimensional feature space using Support Vector Data Description (SVDD) [6]. When the minimal sphere found by SVDD is transformed back to the data space, several classes enclosing different clusters of points are generated automatically, resulting in contours corresponding to the spherical surface in the feature space. Besides the single-sphere form, an extended clustering scheme that we refer to as multi-sphere support vector clustering (MSVC) is preferred in current research on SVDD-based learning, as it overcomes some limitations of the original formulation [7-9]. Many works, not only in clustering but also in supervised learning focused on multi-classification problems, benefit from the multi-sphere form [10]. In practical applications of clustering with very large datasets, such as protein sequence analysis and web-based image retrieval, learning efficiency, however, is still
the main challenge when applying these kernel-based clustering methods, and MSVC in particular. Most researchers try to address this problem using one or a combination of the following techniques: subset selection, training algorithm optimization, and hierarchical or parallel algorithms. Several methods [11, 12] have been applied to SVDD and Kernel Grower to scale up these algorithms. However, in cases where the set of support vectors itself is very large, the learning efficiency is still unacceptable. In this paper, we propose a multi-scale hierarchical framework for large-scale support vector clustering problems. We use MSVC with adaptive cell growing [9] as our basis to perform the clustering in one cascade at an adaptive similarity scale. With MSVC extended to the multi-scale structure, the clustering is performed in a coarse-to-fine manner that partitions the dataset iteratively. The scale, which is effectively defined by the width of the Gaussian kernel, is generated by a secant-like numerical algorithm based on the properties of the parent cluster. Only two parameters need to be initialized in our framework; both can be approximated quite easily and are robust with respect to the clustering results. We also propose the Spherical Support Vector Density (sSVD) to measure the tightness of a cluster and serve as a termination criterion.
2 Multi-sphere Support Vector Clustering
The standard SVC was derived from SVDD, proposed by [6] as a density estimator. The basic notion of SVDD is to map the data points to a high-dimensional feature space by a nonlinear transformation and then to find the minimal sphere enclosing the majority of points. Formally, let X = {x_i}_{i=1}^{l} where x_i ∈ ℝ^N, subject to the constraints:
‖Φ(x_i) − a‖² ≤ R² + ξ_i
(1)
where Φ(·) is a nonlinear transformation that maps the data to a new feature space F, a is the center, and the ξ_i ≥ 0 are slack variables allowing for soft boundaries. The inequalities (1) can be expressed as an optimization problem with a regularization constant C in the penalty term C Σ_{i=1}^{l} ξ_i. Introduce the Lagrangian:

L(R, a, ξ_1, …, ξ_l) = R² − Σ_{i=1}^{l} ξ_i β_i + C Σ_{i=1}^{l} ξ_i − Σ_{i=1}^{l} (R² + ξ_i − ‖φ(x_i) − a‖²) α_i
(2)
where α_i ≥ 0 and β_i ≥ 0 are the Lagrange multipliers associated with Eq. (1) and with the slack variables. Setting the derivatives of L with respect to R, a and ξ_i to zero gives Σ_i α_i = 1, a = Σ_i α_i φ(x_i) and α_i = C − β_i, which leads to the standard dual quadratic program in the α_i; Eq. (2) can thus be solved by standard QP solvers. The original form of SVC employs a single sphere to enclose all data points in the high-dimensional feature space using SVDD. The sphere, when mapped back to the data space, can be separated into several components, each enclosing a separate cluster of points. However, the main drawback here is that this description does not show the relationship between a point and its corresponding cluster, so cluster labeling remains difficult even though the algorithm has been improved [13, 14]. The multi-sphere form takes advantage of a one-to-one correspondence between spheres and clusters; that is, the prototype of a cluster is directly represented by the sphere itself. Recently, there have been
several multi-sphere inspired clustering algorithms [7-10] in the literature. This paper concentrates on the adaptive cell growing approach proposed by [12] for its online learning property and its efficiency compared with the others. For simplicity, we denote the J-th cluster by P_J, meaning both the sphere in the high-dimensional feature space and the data points it encloses. A sketch of the MSVC algorithm is as follows (a minimal code sketch of its building blocks is given at the end of this section):
1. Initialize the first cluster: set P_1 = {x_1}, α_1 = 1 and R_1 = 0.
2. Perform the following steps for each new data point x_i, i = 1, …, l.
3. Find the winning cluster P_J such that d(x_i, P_J) = min_{k=1,…,nc} d(x_i, P_k), where nc is the number of clusters.
4. If g_J(x_i) < ε then x_i belongs to cluster P_J: append x_i to P_J and relearn the SVDD for P_J, so that the implicit representation of the sphere and R_J are both updated. Otherwise, label this cluster as invalid and go back to Step 3 to find the winning cluster among the valid ones.
5. If no cluster is valid in Step 3, create a new cluster and initialize it as in Step 1.
There are two different ways to find the winning cluster, corresponding to different distance measurements d(x_i, P_k) from a point to the cluster center. The measure can be chosen as the distance between the spherical center and the point in the feature space, d(x, P_J) = D_J(x) = ‖φ(x) − a_J‖. The validity test function g_J(x) is defined as g_J(x) = 0 if D_J(x) < R_J, and d_1 + d_2 otherwise, where d_1 = 1/(1 + exp(−λ_1·[d(x, P_J)]²)) and d_2 = 1/(1 + exp(−λ_2·[max(0, D_J(x) − R_J)]²)). The selection of λ_1 and λ_2 depends on the specific dataset and kernel parameter. For a different dataset, the method is tried several times in order to get a good result. In our framework, this multiple-trial procedure is done automatically and implicitly within the multi-scale clustering algorithm. Although the values of λ_1 and λ_2 affect the generation of the next scale parameter, this is much more robust than direct tuning.
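To make the role of SVDD inside each cell concrete, the following Python sketch shows the SVDD dual problem, the implicit feature-space distance D_J(x) and the radius R_J for a single cluster. This is an illustrative sketch only, not the authors' implementation: the incremental relearning used by adaptive cell growing is omitted, SciPy's SLSQP optimizer merely stands in for the "standard QP solvers" mentioned above, and all function names are our own.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X, Y, q):
    """K(x, y) = exp(-q * ||x - y||^2), the kernel used throughout the paper."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-q * d2)

def fit_svdd(X, q, C=1.0):
    """Solve the SVDD dual for one cluster:
       max_a  sum_i a_i K(x_i, x_i) - sum_ij a_i a_j K(x_i, x_j)
       s.t.   0 <= a_i <= C,  sum_i a_i = 1."""
    l = len(X)
    K = gaussian_kernel(X, X, q)

    def neg_dual(a):
        return -(a @ np.diag(K) - a @ K @ a)

    res = minimize(neg_dual, np.full(l, 1.0 / l), method="SLSQP",
                   bounds=[(0.0, C)] * l,
                   constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1.0}])
    alpha = res.x
    center_norm2 = alpha @ K @ alpha              # squared norm of the center a

    def dist2(x):
        """Implicit squared distance D_J^2(x) from x to the sphere center."""
        k = gaussian_kernel(x[None, :], X, q)[0]
        return 1.0 - 2.0 * alpha @ k + center_norm2   # K(x, x) = 1 for the Gaussian kernel

    # R_J^2 equals D_J^2(sv) for any unbounded support vector (0 < alpha_i < C)
    sv = np.where((alpha > 1e-6) & (alpha < C - 1e-6))[0]
    R2 = float(np.mean([dist2(X[i]) for i in sv])) if len(sv) else 0.0
    return alpha, dist2, R2, sv
```

With dist2 and R2 available for every cluster, the winning-cluster test in Step 3 and the validity test g_J(x) in Step 4 reduce to simple comparisons of D_J²(x) against R_J²; fit_svdd(X_J, q) is what Step 4 would re-run each time a point is appended to cluster P_J.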
3 Multi-scale Multi-sphere Support Vector Clustering
The MSVC algorithm yields good results when the dataset is not too large or complex, but it has several shortcomings when dealing with big datasets. Firstly, the organization of the clusters generated by MSVC is flat; this is hard to analyze and is ineffective for forward searching once the dataset grows and many clusters exist. Secondly, it is hard to select good parameters without several rounds of parameter tuning, and the learning results are strongly influenced by the parameters. To address these shortcomings, a multi-scale hierarchical structure is introduced for support vector clustering. We denote the scale parameter by q when the Gaussian kernel K(x, y) = exp(−‖x − y‖²·q) is used. Given an initialization scale, we use MSVC to generate a hierarchical structure of clusters at a variety of scales, which are obtained by a secant-like numerical algorithm.
H. Gu, G. Zhao, and J. Zhang
3.1 Clustering Algorithm
The section provides details about the learning algorithm in the multi-scale framework. Given learning samples X = {xi }li =1 , the clustering algorithm is as follows: 1. Initialize q 0 , using the MSVC algorithm mentioned in Section 2 to get nc clusters PJ , J = 1,..., nc at the first scale. 2. Adjust the radius by adding compensation ζ , RJ' = RJ + ζ where RJ' is used only for the forward classification or searching process to substitute the real radius. 3. In the case of nc = 1 , we can either keep this level as one cascade or remove it. For easy presentation we do not save it and just continue the process at the next scale. 4. For each cluster PJ , apply MSVC to X J at the next scale with qJn +1 = κ (qJn , q n −1 ) , where X J is the data set belongs to cluster, X J = {x : d ( x, PJ ) = min {d ( x, Pk )}} . k =1,.., nc
5. Terminate the clustering of PJ once the local condition Cond ( PJ ) is satisfied. 3.2 The Chosen of q0
The initial scale q 0 may be chosen as the inverse of the max distance of the learning samples. Practically, using Eq. (3) where Z k means the range of k th dimension and the numerator 4 behaves well in our tests. q 0 = 4 / Z12 + Z 22 … + Z N2
(3)
The selection of q 0 is robust to the clustering result in our algorithm because the scale is calculated based on the information of its parent cluster. For example, if we select a small q 0 , the radius of the first sphere found by SVDD will be small which will cause a large step and get a large q1 in next scale. The q n will quickly reach the magnitude which is big enough to allow the MSVC to partition the data set into several clusters. 3.3 Iterative Secant-Like Algorithm
The iterative generation of the Gaussian kernel width for the original SVC has been explored recently by [15]. Here the modified version of [15] is used where the dataset size is assumed to be infinite ( l → ∞ ):
qJn +1
q_J^{n+1} = κ(q^n, q^{n−1}) =
    q_J^n / R_J                                                          if nc ≠ 1
    [(1 − R_{n−1}²)·q_J^n − (1 − R_J²)·q^{n−1}] / (R_J² − R_{n−1}²)      if nc = 1
(4)
MMSVC: An Efficient Unsupervised Learning Approach for Large-Scale Datasets
5
regarded as a new task since the data used at the next scale is only a subset of the present one. The next scale parameter is calculated only upon RJ . We have tried the form qJn / RJ2 when nc ≠ 1 but found that the searching step is too large. In the case of nc=1 , the algorithm is the same with [15]. 3.4 Stopping Rule Based on Spherical Support Vector Density
The choice of stopping rules for top-down hierarchical clustering is very closely related to cluster validity [16]. They can be mainly grouped by two categories, global stopping rules and local (also called internal) stopping rules [17, 18]. For support vector clustering, [19] provide a state-of-art global validity measure to find the optimal cluster configurations. However, this measure can hardly be employed because it was designed for the simple partitioned clustering. They try to minimize the ratio of the compactness measure to the separation measure by regularizing scale parameter q. The problem occurs when dealing with large dataset. For example, when their method has been used to obtain the image feature clusters (vocabulary for bag-of-visual-words), the computational cost is too high even to finish it once for there're millions of features the whole. Therefore, we present the new local cluster validity measurement called Spherical Support Vector Density (sSVD) especially for our multi-scale framework. The sSVD is then used as the stopping term in the algorithm. Consider the sphere representation of PJ in high dimensional feature space, we have
{
}
the SVs that satisfied 0 < α i < C by SV = sv1 ,..., sv SVJ and the implicit form of distance measurement. Denoting | SV | to be the size of SVs, for any sample sv in SVs we have: l
l
l
RJ2 = DJ2 ( sv) = K ( sv, sv) − 2∑ α i K ( xi , sv) + ∑∑ α iα j K ( xi , x j ) i =1
(5)
i =1 i =1
For Gaussian kernel, Eq. (5) can be written as: const J = K ( sv, sv) − D 2 ( sv) +
l
l
∑∑α α i =1 i =1
i
j
K ( xi , x j )
(6)
Note that the right side of Eq. (6) is a constant. Defining: l
l
const J = K ( sv, sv) − D 2 ( sv) + ∑∑ α iα j K ( xi , x j ) = 1 − RJ2 + VJ
(7)
i =1 i =1
l
l
where V_J = Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j K(x_i, x_j) is also a constant. Thus sSVD_J is defined as:
sSVD_J ≜ |SV_J| / (4π ρ_J² × l_J)
(8)
where ρ_J² = − ln(const_J / 2) / q, l_J is the number of samples in P_J, and |SV_J| is the number of support vectors of P_J. Finally, Cond(P_J) is defined as follows:
Cond(P_J) = true if sSVD_J ≥ υ, and false otherwise
(9)
One advantage of our measurement is that the sSVD is effective and can be obtained directly within a learned cluster PJ , without any further statistical calculation. For some large dataset, it saves much computation time. The stopping rule (9) uses the threshold υ to control the compactness of the leaf clusters. Despite the selection of υ is related to the specific dataset, this threshold can either be post-determined among the clustering process or pre-selected by testing a small random subset.
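Putting Sections 3.1–3.4 together, the following Python sketch outlines one possible top-down driver: initializing q^0 from the data ranges (Eq. 3), updating the scale with the secant-like rule κ (Eq. 4), and stopping a branch once its sSVD exceeds υ (Eq. 9). It is a simplified illustration under our own naming, not the authors' code; in particular, run_msvc stands for a full MSVC pass (Section 2) and is assumed to return each cluster's members, radius, support-vector count and const_J value.

```python
import numpy as np

def initial_scale(X):
    """Eq. (3): q0 = 4 / (Z_1^2 + ... + Z_N^2), Z_k = range of dimension k."""
    Z = X.max(axis=0) - X.min(axis=0)
    return 4.0 / float((Z ** 2).sum())

def next_scale(q_n, q_prev, R_J, R_prev, nc):
    """Eq. (4): secant-like update of the Gaussian kernel width."""
    if nc != 1:
        return q_n / R_J
    return ((1.0 - R_prev ** 2) * q_n - (1.0 - R_J ** 2) * q_prev) / (R_J ** 2 - R_prev ** 2)

def ssvd(n_sv, const_J, q, n_samples):
    """Eq. (8): spherical support vector density of one cluster."""
    rho2 = -np.log(const_J / 2.0) / q
    return n_sv / (4.0 * np.pi * rho2 * n_samples)

def mmsvc(X, q, q_prev, R_prev, upsilon, run_msvc, depth=0, max_depth=8):
    """Recursive coarse-to-fine clustering; returns a list of leaf clusters.
    Top-level call: mmsvc(X, initial_scale(X), initial_scale(X), 1.0, upsilon, run_msvc)."""
    clusters = run_msvc(X, q)          # hypothetical MSVC pass at scale q
    leaves = []
    for c in clusters:                 # c: dict with 'points', 'R', 'n_sv', 'const'
        dense_enough = ssvd(c["n_sv"], c["const"], q, len(c["points"])) >= upsilon
        if dense_enough or depth >= max_depth:
            leaves.append(c)           # Eq. (9): compact enough, stop this branch
            continue
        q_child = next_scale(q, q_prev, c["R"], R_prev, len(clusters))
        leaves += mmsvc(c["points"], q_child, q, c["R"], upsilon,
                        run_msvc, depth + 1, max_depth)
    return leaves
```

Only q^0 and υ need to be chosen by the user in this driver, which matches the two-parameter claim made in the introduction.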
4 Experiments
Tiny Images [20], which contains 80 million tiny images labeled by humans using LabelMe, is used to evaluate the real-world performance. We took 1.5 million samples from the database, available freely at http://people.csail.mit.edu/torralba/tinyimages, to evaluate the efficiency of our approach. These images are all string-labeled and stored in 26 folders named from 'a' to 'z'. In this section we compare the learning efficiency of the following methods: the single-sphere version of multi-scale SVC [21], the multi-sphere SVC with cell growing described in Section 2, and the proposed approach. The single-sphere version of multi-scale SVC is not very different from the original algorithm; it makes the clusters more discriminative by regularizing the scale parameter q manually, as in [1]. The only difference is that its solver is based on the entire regularization path for SVDD [22], so the clustering can be performed continuously as q increases. When q changes, the algorithm is equivalent to initializing the current SVs from the previously learned SVs instead of assigning them randomly and re-solving. In the following we use SVC to denote this single-sphere multi-scale SVC, to distinguish it from the multi-sphere SVC. The pictures in the dataset have 32*32 color pixels. We utilize a pyramid representation of the image and discard the color information. The image is first converted to a gray-scale image and then convolved with a Gaussian filter of width 1.6. The gray image is then described by two scales of SIFT: the first scale is a 2*2 SIFT and the second a 4*4 SIFT, with 8 direction bins each. The whole feature vector is therefore (2*2 + 4*4)*8 = 160 dimensional and represents the global spatial features of the image. We fix the parameters as q^0 = 0.025, υ = 0.002. The first sub-experiment applies our algorithm to the 110,249 images in folder 'b' of the dataset. A total of 313 leaf clusters were generated over 5 layers. Fig. 1 shows 3 querying results based on the learned structure and Table 1 lists the detailed cluster information. Given one sample image, we find the leaf cluster closest to the image. We illustrate 14 random images for each cluster, without internal sorting, to reveal the matching results. Focusing on learning efficiency, the second sub-experiment does not save the learned structure. We compare the time costs of the three algorithms (SVC, MSVC and our approach) on the whole image dataset. Fig. 2 shows the comparison of the learning times. As can be seen, the single-sphere form of SVC cannot deal with too many samples even using a small q. Concerning our approach, on the fifth
layer the scale reaches 0.05 (see Fig. 1) though the initial scale parameter is 0.025. Our approach greatly reduces the clustering time and enables the application of support vector clustering on large-scale datasets.

Table 1. Clusters of three image retrieval assignments

Query  Cascade  q_J     |SV|_J  R      l_J
1      3        0.0311  22      0.776  224
2      3        0.0293  19      0.782  186
3      5        0.0515  22      0.817  207
Fig. 1. Examples of cluster retrieval; the structure was automatically learned from 110,249 samples in the letter 'b' folder of the tiny images dataset
Fig. 2. Learning time comparison for three cluster methods. SVC for original support vector clustering, MSVC for multi-sphere SVC and Our Approach for multi-scale multi-sphere SVC. We test MSVC with q = 0.02 and q = 0.04 .
5 Conclusion and Future Work
A new clustering algorithm based on a multi-scale tree structure has been presented. The sSVD is proposed as a new quantity to measure the compactness of a cluster and is used as a termination criterion in our framework. The experiments confirm that our approach greatly improves the efficiency of support vector clustering. Another advantage of our approach is the robustness of the clustering result with respect to the initial parameters, so that the parameter tuning process for large datasets can be avoided. Further research will focus on extending the MMSVC algorithm to metric learning.
Acknowledgments. This work is supported by National Natural Science Foundation of China (60872070) and Zhejiang Province key Scientific and Technological Project (Grant No. 2007C11094, No. 2008C21141).
References 1. Ben-Hur, A., Horn, D., Siegelmann, H.T., Vapnik, V., Critianini, N., Shawe-Taylor, J., Williamson, B.: Support Vector Clustering. Journal of Machine Learning Research 2, 125–137 (2002) 2. Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means: spectral clustering and normalized cuts. In: ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 551–556 (2004) 3. Fischer, B., Roth, V., Buhmann, J.M.: Clustering with the Connectivity Kernel. In: Advances in Neural Information Processing Systems, vol. 16, pp. 1–16 (2004) 4. Girolami, M.: Mercer kernel-based clustering in feature space. In: Proceedings of 2004 IEEE International Joint Conference on Neural Networks, 2004, vol. 13, pp. 780–784 (2002) 5. Ben-Hur, A., Horn, D., Siegelmann, H.T., Vapnik, V.: A support vector clustering method. In: Pattern Recognition, vol. 722, pp. 724–727 (2000) 6. Tax, D.M.J., Duin, R.P.W.: Support Vector Data Description. Machine Learning 54, 45–66 (2004) 7. Camastra, F., Verri, A.: A novel kernel method for clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 801–805 (2005) 8. Defeng, W., Daniel, S.Y., Eric, C.C.T.: Structured One-Class Classification. IEEE Transactions on Systems, Man, and Cybernetics 36, 1283–1295 (2006) 9. Jung-Hsien, C., Pei-Yi, H.: A new kernel-based fuzzy clustering approach: support vector clustering with cell growing. Fuzzy Systems 11, 518–527 (2003) 10. Daewon, L., Jaewook, L.: Domain described support vector classifier for multi-classification problems. Pattern Recognition 40, 41–51 (2007) 11. Chang, L., Deng, X.M., Zheng, S.W., Wang, Y.Q.: Scaling up Kernel Grower Clustering Method for Large Data Sets via Core-sets. Acta Automatica Sinica 34, 376–382 (2008) 12. Jen-Chieh, C., Jeen-Shing, W.: Support Vector Clustering with a Novel Cluster Validity Method. In: IEEE International Conference on Systems, Man and Cybernetics, SMC 2006, vol. 5, pp. 3715–3720 (2006) 13. Jaewook, L., Daewon, L.: An improved cluster labeling method for support vector clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 461–464 (2005)
14. Lee, S.H., Daniels, K.M.: Cone Cluster Labeling for Support Vector Clustering. In: Proceedings of the 6th SIAM International Conference on Data Mining (2006) 15. Lee, S.H., Daniels, K.M.: Gaussian Kernel Width Generator for Support Vector Clustering. In: International Conference on Bioinformatics and its Applications, pp. 151–162 (2004) 16. Grira, N., Crucianu, M., Boujemaa, N.: Unsupervised and Semi-supervised Clustering: a Brief Survey. A Review of Machine Learning Techniques for Processing Multimedia Contents. Report of the MUSCLE European Network of Excellence (FP6) (2004) 17. Cao, F., Delon, J., Desolneux, A., Mus, P., Sur, F.: An a contrario approach to hierarchical clustering validity assessment (2004) 18. Halkidi, M., Batistakis, Y., Vazirgiannis, M.: Cluster validity methods: part I. ACM SIGMOD Record. 31, 40–45 (2002) 19. Wang, J.-S., Chiang, J.-C.: A cluster validity measure with a hybrid parameter search method for the support vector clustering algorithm. Pattern Recognition 41, 506–520 (2008) 20. Torralba, A., Fergus, R., Freeman, W.T.: Tiny Images. Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology (2007) 21. Hansen, M.S., Holm, D.A., Sjöstrand, K., Ley, C.D., Rowland, I.J., Larsen, R.: Multiscale hierarchical support vector clustering. In: Medical Imaging 2008: Image Processing 6914, 69144B, pp. 136–144 (2008) 22. Sjöstrand, K., Larsen, R.: The Entire Regularization Path for the Support Vector Domain Description. In: Larsen, R., Nielsen, M., Sporring, J. (eds.) MICCAI 2006. LNCS, vol. 4190, pp. 241–248. Springer, Heidelberg (2006)
CUDA Based High Performance Adaptive 3D Voxel Growing for Lung CT Segmentation Weiming Zhai, Fan Yang, Yixu Song, Yannan Zhao, and Hong Wang State Key Laboratory of Intelligent Technology and Systems Computer Science and Artificial Intelligence Research Division Tsinghua National Laboratory for Information Science and Technology Department of Computer Science and Technology Tsinghua University, Beijing 100084, China
[email protected]
Abstract. A novel CUDA based high performance parallel voxel growing algorithm for segmenting 3D CT pulmonary volumes with GPU acceleration is introduced in this paper. The optimal parameters for segmentation are adjusted dynamically and iteratively based on statistical information about previously segmented regions. To avoid the leaking problem that affects conventional voxel-growing based methods, the algorithm adopts a process that mutually utilizes segmentation results between the two lateral lung leaves, which in turn benefits the discriminative segmentation of the left and right lung leaves. Experiments show that the algorithm obtains accurate results at a speed about 10-20 times faster than traditional methods on the CPU, which implies that it is potentially suitable for future clinical diagnosis applications.
1 Introduction
CT images are widely used in medical lung analysis. They provide high-resolution images as well as great contrast between the lungs and their surrounding tissues. CT lung images are used for lung tissue density analysis, trachea analysis, lung tissue modeling and visualization, as well as medical diagnosis. All these techniques are based on the results of lung image segmentation, and the demand for automatic, fast and accurate lung image segmentation methods in clinical applications is increasing [5,1,11]. However, automatic lung segmentation remains a challenging task in medical image processing, especially when accuracy and speed are both seriously considered [8]. A myriad of lung CT image segmentation methods have been proposed [7,2,4] and implemented in recent years. In spite of the huge effort invested in this problem, there is no single approach that can generally solve the segmentation problem for the large variety of images existing today, and the major difficulty of these algorithms lies in separating the left and right lung leaves from each other.

(Supported by National Natural Science Foundation of China 60873172.)

The boundary of the two lung leaves often blurs and leads to the
leaking problem in segmentation, which prevents the two leaves from being separated from each other. Several algorithms have been proposed to solve the leaking problem. A manual method was given first in [7], a 2D boundary tracking technique was then adopted in [3], and dynamic programming has been proposed to find the exact boundary of each lung [1,5]. All these algorithms adopt 2D techniques to separate the lung leaves slice by slice, which often leads to algorithmic complexity and declining robustness. At present, many segmentation methods are not completely automatic, and a faster method with less manual operation is still attractive. 3D voxel growing is a segmentation technique that can achieve considerable accuracy and is used in many medical image processing applications [9]. However, image sizes in medical applications are often very large, leading to tremendous computation time that is not appropriate for clinical practice. Fortunately, the mechanism of voxel growing has the potential for parallelization, making it possible to speed up the algorithm. Recently, the Graphics Processing Unit (GPU) has become an economical and fast parallel co-processor. A contemporary GPU can have up to hundreds of stream processors, so it has the potential for solving massive data-parallel problems. The Compute Unified Device Architecture (CUDA) is a general-purpose parallel computing architecture that enables the GPU to solve high-performance computing problems. A novel segmentation algorithm based on dynamic adaptive voxel growing is proposed in this paper; it adopts a process that mutually utilizes segmentation results between the two lung leaves, which in turn benefits the separation of the left and right lung leaves. The algorithm is implemented in CUDA, using parallel 3D voxel growing, so accurate results are achieved in much less time compared with implementations on the CPU. Several improvements to traditional voxel growing algorithms are also introduced in this paper, which better fit the method to the 3D volume segmentation problem and to clinical requirements.
2 Parallel Voxel Growing

2.1 Problem Definition
Voxel growing algorithms have proven to be an effective approach for 3D medical image segmentation [6]. In medical image processing, a 3D image is often called a volume, which is composed of a uniformly distributed 3D array of voxels (volume pixels). The definition of a 3D volume is the generalization of a 2D bitmap to 3D space, and the definition of a voxel is accordingly the generalization of a 2D pixel to 3D. We define a voxel $v_i$ in a volume $\Pi$ as

$v_i = \langle x_i, y_i, z_i \rangle$  (1)

and the whole 3D volume is the voxel set

$\Pi = \{v_i \mid 0 \le x_i < r,\; 0 \le y_i < s,\; 0 \le z_i < t\}$  (2)

where $r$, $s$, $t$ are the length, width and height of $\Pi$, and $x_i$, $y_i$, $z_i$ are the indices of the voxel $v_i$ in the 3D array.
Two mapping functions are also defined here. The first defines the CT value $g$ of the voxel $v_i$:

$\Gamma(v_i) = g$  (3)

Function (3) maps a voxel $v_i$ to its corresponding CT value; its domain is $\Pi$ and its range is typically $[-1024, 1024]$ in CT value. The second function defines the label $l$ of the voxel $v_i$:

$\Lambda(v_i) = l$  (4)

Function (4) maps a voxel $v_i$ to its label, which distinguishes whether the voxel belongs to a specific target region. The basic approach of a voxel growing algorithm is to start from a seed region that is considered to be inside the object to be segmented. The voxels neighboring this region are evaluated to determine whether they should also be considered part of the object. If so, they are added to the region, and the process continues as long as new voxels are added, until the final target region $\Lambda^{-1}(l)$ is obtained. Voxel growing algorithms vary depending on the criterion used to decide whether a voxel should be included in the region, the type of connectivity used to determine neighbors, and the strategy used to visit neighboring voxels.

2.2 Adaptive Growing Criterion
The 3D voxel growing algorithm based on regional statistical features tends to work well in medical image segmentation. Multiple iterations are adopted in this algorithm to grow the seed region dynamically. First, an initial seed region, a set of voxels in the volume, is selected; it may consist of several isolated seeds in the target region. Once the seed region is set, the growing procedure starts. All voxels directly neighboring the seed region are examined, and whether they should be added to the seed region is determined by comparing their CT values to a threshold range $\Omega$. The range $\Omega$ used by the algorithm is based on simple statistics of the current region. First, the algorithm computes the mean and standard deviation of the intensity values of all voxels currently included in the region. An adaptive factor is used to multiply the standard deviation and define a range around the mean. Neighboring voxels whose intensity values fall inside the range are accepted and included in the region. When no more neighboring voxels satisfying the criterion are found, the algorithm is considered to have finished its first iteration. At that point, the mean and standard deviation of the intensity levels are recomputed using all the voxels currently included in the region. This mean and standard deviation define a new intensity range that is used to visit the current region's neighbors and evaluate whether their intensities fall inside the range. This iterative process is repeated until no more voxels are added or the maximum number of iterations is reached. The following equation gives the inclusion criterion used by the algorithm in the $n$th iteration:

$\Omega_n = [m_{n-1} - \theta\sigma_{n-1},\; m_{n-1} + \theta\sigma_{n-1}]$  (5)

The new threshold range $\Omega_n$ is calculated from the previous mean value $m_{n-1}$ and standard deviation $\sigma_{n-1}$ of the voxel CT values in the region, and $\theta$ is a factor
defined by the user. Compared with conventional voxel growing and threshold segmentation methods, this criterion adaptively adjusts the optimal threshold range in each iteration based on the current local statistical features, so that the robustness and precision of the whole system are greatly improved.

2.3 Parallel Implementation on CUDA
The key to parallelizing the voxel growing is to make all the seeds grow at the same time. In each step of the growing procedure, a seed can only affect its 26 neighbors in 3D space; that is, for each seed we only need a small neighborhood around a voxel rather than the whole volume, which makes the parallel mechanism relatively simple. To accomplish the parallelized growing, we designed a parallel growing kernel in CUDA that processes all the voxels in the set of seeds simultaneously. The strategy for this kernel is: if the set of seeds is not empty, we assign one CUDA thread to each voxel in the set and check its 26 directly connected neighboring voxels. If a neighboring voxel's CT value satisfies the threshold range $\Omega_n$, it is added to the seed region, while voxels that do not satisfy the criterion are removed from the candidate voxels. The parallel growing kernel is executed iteratively until the algorithm terminates. One advantage of this implementation is that all CUDA threads perform exactly the same processing at the same time, which effectively avoids divergence. The growing kernel for a particular voxel $v_i$ in the seed region performs the following processing (a CUDA sketch follows the list):

VoxelGrowingKernel($v_i$)
1. Use the thread ID and block ID to get the index of the voxel $v_i$.
2. Use the index to fetch the CT value of the voxel $v_i$.
3. For each neighbor, if its CT value satisfies the criterion go to 4), else go to 5).
4. Add the voxel to the seed region set.
5. Remove the voxel from the candidate voxels.
6. Update the flags array in global memory.
7. Synchronize threads.
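The following is a minimal CUDA sketch of such a growing kernel, given only as an illustration: the data layout (a linear label array, a flat seed index list, a single "grew" flag) and the use of global-memory CT values instead of the texture fetch used in the paper are assumptions of this sketch, not the authors' implementation.

```cuda
// Minimal sketch of the parallel growing kernel: one thread per current seed voxel.
__global__ void VoxelGrowingKernel(const short* volume,   // CT values Gamma(v), size r*s*t
                                   unsigned char* label,   // 0 = outside, 1 = in region
                                   const int* seeds,       // linear indices of current seed voxels
                                   int numSeeds,
                                   int r, int s, int t,    // volume size
                                   float lo, float hi,     // threshold range Omega_n
                                   int* grew)              // set to 1 if any voxel was added
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;          // step 1: thread/block id -> seed index
    if (i >= numSeeds) return;
    int idx = seeds[i];                                      // step 2: index of voxel v_i
    int z = idx / (r * s), y = (idx / r) % s, x = idx % r;

    for (int dz = -1; dz <= 1; ++dz)                         // step 3: scan the 26 neighbours
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
                if (dx == 0 && dy == 0 && dz == 0) continue;
                int nx = x + dx, ny = y + dy, nz = z + dz;
                if (nx < 0 || nx >= r || ny < 0 || ny >= s || nz < 0 || nz >= t) continue;
                int nidx = (nz * s + ny) * r + nx;
                float g = volume[nidx];
                if (label[nidx] == 0 && g >= lo && g <= hi) { // criterion Omega_n satisfied
                    label[nidx] = 1;                          // step 4: add voxel to the region
                    *grew = 1;                                // step 6: update flag in global memory
                }                                             // step 5: otherwise leave it out
            }
}
```

The host would rebuild the seed list from the newly labelled voxels and relaunch this kernel until the flag stays 0 or the iteration limit is reached.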
The overall workflow of the voxel growing is implemented as the following procedure:
1. Prepare the volume data in GPU texture memory.
2. Initialize the seed region $S_0$ and calculate the mean $m_0$ and standard deviation $\sigma_0$ of the seed region using CUDA accumulation.
3. In the $n$th iteration, calculate the new threshold range $\Omega_n$ using (5).
4. For each voxel $v_i$ in the seed region ($v_i \in S_{n-1}$), perform VoxelGrowingKernel($v_i$) in parallel using multiple CUDA threads, deriving the new seed region $S_n$.
5. Calculate the mean $m_n$ and standard deviation $\sigma_n$ of the new region $S_n$ (a GPU reduction sketch is given after this list):

$m_n = \frac{1}{|S_n|} \sum_{v_i \in S_n} \Gamma(v_i)$  (6)
$\sigma_n = \sqrt{\frac{1}{|S_n|} \sum_{v_i \in S_n} \left(\Gamma(v_i) - m_n\right)^2}$  (7)
6. Repeat steps 3-5 until the region no longer expands or the iteration number reaches a predefined value.
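As a hedged illustration of the "CUDA accumulation" in steps 2 and 5, the mean and standard deviation of the region (equations (6) and (7)) can be computed with parallel reductions, for example using the Thrust library; the container and function names below are assumptions of this sketch, not the authors' code.

```cuda
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/transform_reduce.h>
#include <thrust/functional.h>
#include <cmath>

// Squared deviation from a fixed mean, evaluated on the device.
struct SquaredDiff {
    float mean;
    __host__ __device__ float operator()(float g) const {
        float d = g - mean;
        return d * d;
    }
};

// Mean m_n (equation (6)) and standard deviation sigma_n (equation (7)) of the
// CT values of all voxels currently in the region S_n, via two GPU reductions.
void regionStatistics(const thrust::device_vector<float>& regionCT,
                      float& mean, float& stddev)
{
    float n = static_cast<float>(regionCT.size());
    float sum = thrust::reduce(regionCT.begin(), regionCT.end(), 0.0f);
    mean = sum / n;
    float ss = thrust::transform_reduce(regionCT.begin(), regionCT.end(),
                                        SquaredDiff{mean}, 0.0f, thrust::plus<float>());
    stddev = std::sqrt(ss / n);
}
```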
3 Lung Leaves Segmentation Procedure
In voxel growing segmentation methods, the two lung leaves cannot easily be separated, because the other lung area may be added to the current one when seed growing is applied. The leaking problem is even more serious in 3D segmentation, since there are 26 other voxels around each voxel, any of which can be a leak channel. A special process is applied in our method to solve the leaking problem. The basic idea is that before a tissue is segmented, a pre-segmentation step is first adopted to remove the surrounding areas, so that leaks between neighboring tissues can be reduced effectively. The flow chart of this algorithm is illustrated in Fig. 1.

3.1 Data Initialization
As we are going to deal with images in 3D space, we first allocate a 3D array in GPU texture memory, with the data copied from the original volume. The coordinates of the 3D array in texture memory are normalized, which enables us to fetch data according to physical coordinates rather than the indices of the original data array. Thus, the resolution of the image is independent of the image size, which makes multiscale image processing very convenient in GPU texture memory: we can sacrifice accuracy for faster processing speed if the 3D image is too large. Besides the image data, we also need a label array matching the resolution we use; this array is used to mark the points in the target, since texture memory is not writable from the device.

3.2 The Trachea Segmentation
The trachea connects the two lung leaves via the bronchi, so it is important to segment the trachea tissue and remove it from the original image. A slice-by-slice voxel growing and accumulation method is applied to extract the trachea structure. The seed is set at the topmost slice of the trachea; then the voxel growing method is applied from top to bottom, and voxels within the appropriate threshold range are labelled with "T" and added to the current growing area $\Lambda^{-1}(T)$, slice by slice. Experiments show that the accumulated voxel count $|\Lambda^{-1}(T)|$ grows steadily, then grows sharply when the connection between the two lung leaves via the bronchi is reached. This jump can therefore be used as the termination condition of the extraction, as in the sketch below.
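A minimal sketch of this termination test follows; the per-slice bookkeeping and the jump factor of 2 are illustrative assumptions, not the authors' exact values.

```cpp
#include <vector>
#include <cstddef>

// Stop the slice-by-slice trachea growing when the accumulated voxel count
// |Lambda^{-1}(T)| suddenly jumps, i.e. when the growth has leaked into the
// lung leaves through the bronchi.  countPerSlice[i] holds the accumulated
// count after processing slice i; the jump factor is an illustrative choice.
bool tracheaGrowthLeaked(const std::vector<std::size_t>& countPerSlice,
                         double jumpFactor = 2.0)
{
    std::size_t n = countPerSlice.size();
    if (n < 2) return false;
    // steady growth adds a comparable amount per slice; a leak roughly
    // multiplies the accumulated count
    return countPerSlice[n - 1] > jumpFactor * countPerSlice[n - 2];
}
```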
Fig. 1. Flow Chart of Segmentation Method
3.3 Right Lung Pre-segmentation
There is a distinct boundary between the two lung leaves in most CT image slices, but this boundary tends to be blurred in some local areas, and it is hard to determine the optimal growing factor to separate the two lung leaves. When the growing factor is too large, leaks appear; whereas a small growing factor leads to insufficient growing, which cannot be taken as the final segmentation result. The half-lung pre-segmentation method is applied to address these shortcomings. First, with a growing factor large enough not to cause leaks, the right lung pre-segmentation is carried out. A method that dynamically adjusts statistical information about previously segmented regions is adopted in our algorithm to seek an appropriate growing factor $\theta$. For a given image, different growing factors lead to different voxel volumes in the segmentation result, as illustrated in Fig. 2(b) and Fig. 2(a); the increase ratio varies over different ranges, and a sharp break appears at a specific critical
value, at which the volume roughly doubles. Therefore, the growing factor can be set slightly below the critical value. A similar algorithm for choosing the optimal threshold in liver vessel segmentation is adopted by Selle et al. [10].
Fig. 2. Relation between Adaptive Growing Factor and Volume Size (volume in ml versus θ): (a) expiration evolve curve; (b) inspiration evolve curve
To find the optimal growing factor $\theta$, the following steps are carried out (a C++ sketch of this bisection is given below):
1. Initialize a large growing factor $\theta_R$, which grows over into the other lung, and a small one $\theta_L$, which cannot grow sufficiently; their volumes $V_R$ and $V_L$ are calculated by running the growing with each factor.
2. Set the median factor $\theta_M = (\theta_R + \theta_L)/2$ and calculate the corresponding volume $V_M$.
3. If $V_R/V_M > V_M/V_L$, set $\theta_L = \theta_M$ (and $V_L = V_M$); otherwise set $\theta_R = \theta_M$ (and $V_R = V_M$).
4. While $V_R/V_L > \eta$, repeat steps 2-3.
5. The final $V_L$ (with its factor $\theta_L$) is taken as the result.
Here $\eta$ is a critical ratio, which can be set slightly larger than 1.
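The following C++ sketch makes the bisection explicit. segmentVolume is a hypothetical callback that runs the adaptive voxel growing with a given factor and returns the resulting volume; it is not part of the paper.

```cpp
#include <functional>

// Bisection for the growing factor theta.  segmentVolume(theta) is a
// hypothetical helper that runs the adaptive growing with factor theta and
// returns the resulting volume; eta is set slightly above 1.
double findGrowingFactor(double thetaL, double thetaR, double eta,
                         const std::function<double(double)>& segmentVolume)
{
    double VL = segmentVolume(thetaL);   // insufficient growth
    double VR = segmentVolume(thetaR);   // grows over into the other lung
    while (VR / VL > eta) {
        double thetaM = 0.5 * (thetaL + thetaR);
        double VM = segmentVolume(thetaM);
        if (VR / VM > VM / VL) {         // the sharp break lies above thetaM
            thetaL = thetaM;  VL = VM;
        } else {                         // the sharp break lies at or below thetaM
            thetaR = thetaM;  VR = VM;
        }
    }
    return thetaL;                       // a little less than the critical value
}
```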
3.4 Segmentation of Two Lung Leaves
A more precise segmentation of the two lung leaves with the voxel growing algorithm can be carried out after the trachea $\Lambda^{-1}(T)$ and the right lung pre-segmentation $\Lambda^{-1}(R)$ have been obtained. These are removed from the original volume $\Pi$ before the left lung segmentation, so that leaks from the left lung into the right part are avoided and a sufficiently large growing factor can be used in the subsequent steps, which guarantees a sufficient segmentation result $\Lambda^{-1}(L)$, as equation (8) shows:

$\Lambda^{-1}(L) = \{v_i \mid v_i \in \Pi - \Lambda^{-1}(T) - \Lambda^{-1}(R),\; \Lambda(v_i) = L\}$  (8)
Similarly, the trachea $\Lambda^{-1}(T)$ and the left lung area $\Lambda^{-1}(L)$ are removed from the original volume $\Pi$ to obtain the right lung segmentation result $\Lambda^{-1}(R)$:

$\Lambda^{-1}(R) = \{v_i \mid v_i \in \Pi - \Lambda^{-1}(T) - \Lambda^{-1}(L),\; \Lambda(v_i) = R\}$  (9)
4 Experiment Results
The method has been evaluated on CT volumes of over 20 patients with lung illnesses. For all computations, a computer with an Intel Core Duo 2.4 GHz CPU, 2 GB RAM and an Nvidia GeForce GTX275 GPU was used to perform the CUDA-based voxel growing. For each patient, the inspiration CT and the expiration CT were processed separately, so that 40 CT volumes in total were processed. Only two volumes failed with this method, because the diseased lung was extremely small compared to a normal one, so the method obtained the correct result with about 95% probability. The results show that when the growing factor $\theta$ exceeds a critical value, the volume of the right lung increases sharply to about twice the previous one and leaks appear, which confirms our theoretical prediction. The relationship between the growing factor $\theta$ and the volume for other groups of samples is illustrated in Fig. 3, in which the break phenomenon appears when the critical value is exceeded.
Fig. 3. Experiment for patient data (volume in ml versus θ): (a) expiration evolve curve; (b) inspiration evolve curve
We also found that the CUDA-based implementation is at least 10-20 times faster than the corresponding conventional method described in [6]. The advantage of the CUDA implementation is more pronounced when the CT image is larger. While their huge time complexity keeps traditional CPU methods from clinical use, our method can effectively improve the working efficiency of doctors.
5 Summary and Conclusions
A CUDA-based fast dynamic adaptive 3D voxel-growing segmentation algorithm for lung CT volumes is proposed in this article. Compared with conventional 2D algorithms, this approach effectively solves the two-lung segmentation problem with a 3D segmentation method, and a great improvement in automation, robustness and segmentation precision is achieved. The algorithm has been tested on many
experimental samples, and the results are shown to be effective and provide a reliable foundation for subsequent lung tissue volume estimation and clinical application. We also found that the CUDA-based voxel growing has better computational performance than the traditional implementation, which uses the CPU to perform the bulk of the computing. Future work includes the search algorithm for the optimal growing factor $\theta$, i.e., how to obtain the optimal factor that satisfies the segmentation requirements in the fewest steps; the prediction of the optimal $\theta$ based on dynamically adjusted statistical information is also worth further research.
References
1. Brown, M.S., McNitt-Gray, M.F., Mankovich, N.J., Goldin, J.G., Hiller, J., Wilson, L.S., Aberle, D.R.: Method for segmenting chest CT image data using an anatomical model: preliminary results. IEEE Transactions on Medical Imaging 16(6), 828–839 (1997)
2. Denison, D.M., Morgan, M.D.L., Millar, A.B.: Estimation of regional gas and tissue volumes of the lung in supine man using computed tomography. Thorax 41, 620–628 (1986)
3. Hedlund, L.W., Anderson, R.F., Goulding, P.L., Beck, J.W., Effmann, E.L., Putman, C.E.: Two methods for isolating the lung area of a CT scan for density information. Radiology 144, 353–357 (1982)
4. Hoffman, E.A., Ritman, E.L.: Effect of body orientation on regional lung expansion in dog and sloth. J. Appl. Physiol. 59(2), 481–491 (1985)
5. Hu, S., Hoffman, E.A., Reinhardt, J.M.: Automatic lung segmentation for accurate quantitation of volumetric X-ray CT images. IEEE Transactions on Medical Imaging 20(6), 490–498 (2001)
6. Ibanez, L., Schroeder, W., Ng, L., Cates, J.: The ITK Software Guide. Kitware, Inc. (August 2003)
7. Kalender, W.A., Fichte, H., Bautz, W., Skalej, M.: Semiautomatic evaluation procedures for quantitative CT of the lung. J. Comput. Assist. Tomogr. 15(2), 248–255 (1991)
8. Mumford, D., Shah, J.: Optimal approximations of piecewise smooth functions and associated variational problems. Communications in Pure and Applied Mathematics 42, 577–685 (1989)
9. Geun, P.J., Chulhee, L.: Skull stripping based on region growing for magnetic resonance brain images. Neuroimage 47(1), 394–407 (2009)
10. Selle, D., Preim, B., Schenk, A., Peitgen, H.O.: Analysis of vasculature for liver surgical planning. IEEE Transactions on Medical Imaging 21(11), 1344–1357 (2002)
11. Zhang, L.: Atlas-Driven Lung Lobe Segmentation in Volumetric X-Ray CT Images. Ph.D. thesis, The University of Iowa (2002)
Wavelet Packet-Based Feature Extraction for Brain-Computer Interfaces

Banghua Yang 1,2, Li Liu 1, Peng Zan 1, and Wenyu Lu 1

1 Shanghai Key Laboratory of Power Station Automation Technology, Department of Automation, College of Mechatronics Engineering and Automation, Shanghai University, Shanghai, 200072, China
2 State Key Laboratory of Robotics and System (HIT), Harbin, 150001, China
[email protected]
Abstract. A novel feature extraction method for spontaneous electroencephalogram (EEG) signals in brain-computer interfaces (BCIs) is explored. The method takes the wavelet packet transform (WPT) as its analysis tool and utilizes two kinds of information. First, the EEG signals are transformed into wavelet packet coefficients by the WPT. Then the average coefficient values and average power values of certain subbands are computed, which form the initial features. Finally, some of the average coefficient values and some of the average power values, those with larger Fisher indexes, are combined to form the feature vector. Compared with previous feature extraction methods, the proposed approach leads to higher classification accuracy.

Keywords: brain-computer interface (BCI), electroencephalogram (EEG), feature extraction, wavelet packet transform (WPT).
1 Introduction

A brain-computer interface (BCI) establishes a new communication channel between the human brain and a computer or other output devices. For some people with very severe disabilities (e.g., amyotrophic lateral sclerosis or brainstem stroke), a BCI may be the only feasible channel for communicating with others and for environment control. The most common BCI systems are based on the analysis of spontaneous EEG signals produced by two or more mental tasks [1]. The analysis mainly comprises feature extraction and classification, of which feature extraction is the more important. A novel feature extraction method based on the wavelet packet transform (WPT) is explored in this paper. Feature extraction methods of spontaneous EEG used for BCIs can be divided into four categories: 1) Time or frequency method: it uses averages in the time domain or power spectra in the frequency domain as features [2]. 2) Conventional time-frequency method: features are obtained by combining averages in the time domain with power spectra in the frequency domain [3]. 3) Model parameters method: it uses model coefficients as features, such as those of the autoregressive (AR) model [4]. 4) The WPT power (WPTP) method: power values within certain frequency ranges are
used as features [5]. The former three methods assume that the EEG is stationary. However, this assumption is not satisfied in practice. Due to the non-stationary nature of EEG signals [6], the WPT can represent and analyze signals better than the former three methods, as it describes the information in various time windows and frequency bands. The WPT is a time-frequency analysis tool that can provide important time-frequency features which cannot be provided by other transforms [7]. The existing WPTP method has been demonstrated to outperform other methods for the feature extraction of EEG signals [8]. Nevertheless, it only contains frequency-domain information and lacks time-domain information, which limits its classification performance. EEG signals during mental tasks contain not only time-domain information but also frequency-domain information, and a combination of the two kinds of information should increase the classification performance of the features. This paper explores a novel method on the basis of the WPTP. The method uses features from both the coefficients and the powers obtained by the WPT, and so it is called WPTCP.
2 Wavelet Packet Transform

In the multi-resolution analysis of the wavelet transform, the Hilbert space $L^2(R)$ can be decomposed into wavelet subspaces $W_j$ $(j \in Z)$, i.e., $L^2(R) = \oplus_{j \in Z} W_j$. In the WPT, each subspace $W_j$ is decomposed further in a dyadic fashion. Fig. 1 shows the space decomposition of the WPT, where $U_j^n$ is the subspace at the $j$th level and the $n$th node. $u_{j,k}^n(t) = 2^{-j/2} u^n(2^{-j}t - k)$ $(k \in Z)$ is the orthonormal wavelet basis of the subspace $U_j^n$. The functions $u_{j,k}^n(t)$ satisfy the following two-scale equations:

$u_{j,k}^{2n}(t) = \sum_m h_0(m)\, u_{j-1,2k-m}^{n}(t)$  (1)

$u_{j,k}^{2n+1}(t) = \sum_m h_1(m)\, u_{j-1,2k-m}^{n}(t)$  (2)
where $h_0(k)$ and $h_1(k) = (-1)^{1-k} h_0(1-k)$ are a pair of orthogonal mirror filters. Let $f(x)$ denote a given function. The WPT coefficients of $f(x)$ at the $j$th level, the $k$th point, and the $(2n)$th and $(2n+1)$th nodes are computed via the following equations, respectively:

$d_j^{2n}(k) = \sum_m h_0(m)\, d_{j-1}^{n}(2k-m)$  (3)

$d_j^{2n+1}(k) = \sum_m h_1(m)\, d_{j-1}^{n}(2k-m)$  (4)
For any $f(x) \in L^2(R)$, the sample sequence $f(k\Delta t)$ (or simply $f(k)$) can be regarded as $d_0^0(k)$ of the space $U_0^0$. We can see from (3) and (4) that the WPT coefficients at the $j$th level can be obtained from those at the $(j-1)$th level, i.e., the WPT coefficients at the first level can be obtained from $d_0^0(k)$, and the WPT coefficients at the second level can be
obtained from those at the first level, and so on. In this way we can obtain the WPT coefficients at each level and each node. The corresponding frequency range of the subspace $U_j^n$ is $\left[\frac{n f_s}{2^{j+1}}, \frac{(n+1) f_s}{2^{j+1}}\right]$, where $f_s$ is the sampling frequency.
Fig. 1. The space decomposition of the WPT
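As an illustration of equations (3) and (4), the sketch below performs one decomposition step of the WPT in plain C++: a parent node's coefficients are split into a low-pass child (filter $h_0$) and a high-pass child (filter $h_1$) with downsampling by two. The periodic boundary handling and explicit filter arrays are assumptions of the sketch, not the paper's exact implementation.

```cpp
#include <vector>
#include <cstddef>

// One wavelet packet split following equations (3) and (4):
//   d_j^{2n}(k)   = sum_m h0(m) d_{j-1}^n(2k - m)
//   d_j^{2n+1}(k) = sum_m h1(m) d_{j-1}^n(2k - m)
// Indices 2k - m are wrapped periodically; h0 and h1 are assumed to have the
// same length.  Applying this recursively down to level j yields the
// coefficients of every node at that level.
void wptSplit(const std::vector<double>& parent,   // d_{j-1}^n
              const std::vector<double>& h0,       // low-pass filter
              const std::vector<double>& h1,       // high-pass filter
              std::vector<double>& low,            // d_j^{2n}
              std::vector<double>& high)           // d_j^{2n+1}
{
    std::size_t n = parent.size();
    std::size_t half = n / 2;
    low.assign(half, 0.0);
    high.assign(half, 0.0);
    for (std::size_t k = 0; k < half; ++k)
        for (std::size_t m = 0; m < h0.size(); ++m) {
            std::size_t idx = (2 * k + n - (m % n)) % n;  // periodic index 2k - m
            low[k]  += h0[m] * parent[idx];
            high[k] += h1[m] * parent[idx];
        }
}
```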
3 The WPTCP Method

3.1 The Form of Initial Features
First, we use $f_l(x)$ to denote the EEG signal from channel $l$ ($l = 1, 2, \ldots, L$, where $L$ is the total number of channels). The corresponding sample sequence of $f_l(x)$ can be decomposed according to (3) and (4). Let $d_{l,j}^{2n}(k)$ and $d_{l,j}^{2n+1}(k)$ be the decomposition coefficients of channel $l$ at the $j$th level. Let the raw signal have $2N$ sample points, and let $AVE_{l,j}^{2n}$ and $AVE_{l,j}^{2n+1}$ be the average coefficients of channel $l$ at the $j$th level for the $(2n)$th and $(2n+1)$th nodes, respectively. Accordingly, $P_{l,j}^{2n}$ and $P_{l,j}^{2n+1}$ are the average power values. They are computed according to the following formulae:

$AVE_{l,j}^{2n} = \frac{2^j}{2N} \sum_k d_{l,j}^{2n}(k)$  (5)

$AVE_{l,j}^{2n+1} = \frac{2^j}{2N} \sum_k d_{l,j}^{2n+1}(k)$  (6)

$P_{l,j}^{2n} = \frac{2^j}{2N} \sum_k \left(d_{l,j}^{2n}(k)\right)^2$  (7)

$P_{l,j}^{2n+1} = \frac{2^j}{2N} \sum_k \left(d_{l,j}^{2n+1}(k)\right)^2$  (8)
The decomposition at level $j$ gives rise to $2^j$ wavelet packet components (nodes), among which the nodes whose frequencies are lower than 50 Hz are considered relevant components. $AVE_{l,j}$ and $P_{l,j}$ of the relevant nodes are computed according to (5)-(8). Sorting these $AVE_{l,j}$ and $P_{l,j}$ values by channel, we obtain the following vectors:
$M = \{AVE_{j,1}^0, AVE_{j,1}^1, AVE_{j,1}^2, \ldots;\; AVE_{j,2}^0, AVE_{j,2}^1, AVE_{j,2}^2, \ldots;\; \ldots;\; AVE_{j,L}^0, AVE_{j,L}^1, AVE_{j,L}^2, \ldots\}$  (9)

$N = \{P_{j,1}^0, P_{j,1}^1, P_{j,1}^2, \ldots;\; P_{j,2}^0, P_{j,2}^1, P_{j,2}^2, \ldots;\; \ldots;\; P_{j,L}^0, P_{j,L}^1, P_{j,L}^2, \ldots\}$  (10)
For simplicity of description, according to the original order, $M$ and $N$ can also be written as:

$M = \{m_1, m_2, m_3, \ldots\}$  (11)

$N = \{n_1, n_2, n_3, \ldots\}$  (12)
Here, $M$ and $N$ are considered the initial feature vectors. It should be noted that the deeper the decomposition level $j$ is, the higher the frequency resolution we obtain. However, a deeper decomposition level also results in more complex computation, so $j$ should be selected reasonably according to the actual requirements.
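A short sketch of equations (5)-(8) for a single node: since a node at level $j$ holds $2N/2^j$ coefficients, the factor $2^j/2N$ is simply one over the node length. The function below is only an illustration, not the authors' code.

```cpp
#include <vector>

// Average coefficient value (equations (5)/(6)) and average power value
// (equations (7)/(8)) of one wavelet packet node, given its coefficients d.
void nodeFeatures(const std::vector<double>& d, double& avgCoeff, double& avgPower)
{
    if (d.empty()) { avgCoeff = avgPower = 0.0; return; }
    double sum = 0.0, sumSq = 0.0;
    for (double c : d) {
        sum += c;
        sumSq += c * c;
    }
    avgCoeff = sum / d.size();    // (2^j / 2N) * sum of coefficients
    avgPower = sumSq / d.size();  // (2^j / 2N) * sum of squared coefficients
}
```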
3.2 The Form of the Feature Vector

In order to reduce the dimensionality of the feature vector, a criterion called the Fisher distance is adopted to evaluate the separability (classification performance). The Fisher distance can be represented as

$J = \mathrm{tr}(S_w^{-1} S_b)$  (13)
where $S_b$ is the distance among classes (between-class scatter), $S_w$ is the distance within classes (within-class scatter), and $\mathrm{tr}$ is the trace of the matrix $S_w^{-1} S_b$. Features with larger $J$ values are considered more pertinent for classification than those with smaller $J$ values. We compute the $J$ value of each feature component in $M$ and $N$ respectively. A new vector $M' = \{m_1', m_2', m_3', \ldots, m_d'\}$ is then obtained by selecting the subset of $d$ features with the highest $J$ values from $M$. In a similar way, $N' = \{n_1', n_2', n_3', \ldots, n_l'\}$ is obtained by selecting a subset from $N$. We combine $M'$ with $N'$ to form the final feature vector $F = \{M', N'\}$.

3.3 The Steps of Obtaining the Feature Vector

Step 1. All training samples are decomposed to level $j$ channel by channel according to (3) and (4); $d_{l,j}^{2n}(k)$ and $d_{l,j}^{2n+1}(k)$ of channel $l$ are obtained.
Step 2. $AVE_{l,j}^{2n}$, $AVE_{l,j}^{2n+1}$, $P_{l,j}^{2n}$ and $P_{l,j}^{2n+1}$ of the nodes (sub-bands) whose frequencies are lower than 50 Hz are computed according to (5)-(8), giving the initial feature vectors $M$ and $N$.
Step 3. The $J$ value of each feature component in $M$ and $N$ is computed according to (13); then $M'$, $N'$ and the final feature vector $F$ are obtained (a sketch of the per-component score is given below).
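For a single scalar feature component, the scatter matrices in (13) reduce to scalars, so the criterion becomes a ratio of between-class to within-class scatter. The sketch below computes such a per-component score over the motor imagery classes; since the paper does not specify the exact weighting, the form used here is an assumption for illustration only.

```cpp
#include <vector>
#include <cstddef>

// Per-component Fisher score J = S_b / S_w (scalar reduction of equation (13)).
// classSamples[c] holds the values of one feature component for all training
// trials of class c; weighting S_b by class size is an assumption of the sketch.
double fisherScore(const std::vector<std::vector<double>>& classSamples)
{
    std::vector<double> means;
    double grandSum = 0.0;
    std::size_t total = 0;
    for (const auto& cls : classSamples) {
        double s = 0.0;
        for (double v : cls) s += v;
        means.push_back(s / cls.size());
        grandSum += s;
        total += cls.size();
    }
    double grandMean = grandSum / total;

    double sb = 0.0, sw = 0.0;                       // between- and within-class scatter
    for (std::size_t c = 0; c < classSamples.size(); ++c) {
        double d = means[c] - grandMean;
        sb += classSamples[c].size() * d * d;
        for (double v : classSamples[c]) sw += (v - means[c]) * (v - means[c]);
    }
    return sb / sw;                                  // larger J -> better separability
}
```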
4 Experiment Data and the Feature Vector

4.1 Experiment Data
Six healthy subjects (sub1-sub6) with no prior experience of BCIs participated in the experiment. They were seated in a shielded room with dim lighting. A 32-channel elastic
electrode cap was used for recording. The data were recorded at a sampling rate of 256 Hz with an ESI-128 system. Each subject repeated the experiment for two sessions (session A and session B), each comprising 150 trials. The subjects were asked to imagine performing one of three motor imagery tasks (playing basketball with the left hand, playing basketball with the right hand, and braking with the right foot) in a self-paced mode during each trial. Each trial lasted 5.75-6.25 s (mean 6 s) and consisted of three phases: 1) a 0.75-1.25 s (random) resting phase during which the computer screen was black; 2) a 1 s preparation phase during which a "+" fixation was displayed; 3) a 4 s motor imagery task phase during which the subjects performed the corresponding motor imagery task according to the direction of the arrow (a left arrow indicating imagining the left hand, a right arrow the right hand, a down arrow the right foot). The arrow was displayed during the first 1 s of the 4 s task phase, and the computer screen was black during the other 3 s. The data from the last 4 s of each trial were used for the off-line analysis.

4.2 The Feature Extraction
According to the method described in Section 3, we select three different wavelet functions (db4 of Daubechies, db6 of Daubechies, sym2 of Symlets) to decompose the raw EEG signals up to the sixth level, giving sixty-four nodes, of which the first 25 nodes, whose frequencies are lower than 50 Hz, are adopted for each single channel.
Fig. 2. Average J values of each feature component in M (Fisher distance J versus feature component number; the first and second peak values are marked)
Considering the practicality of BCI systems, we use only a few electrodes (C3, C4, P3, P4, CZ and PZ over the sensorimotor cortex; the electrode positions can be seen in [9]). Consequently, the sizes of both the subset $M$ and the subset $N$ are 150 (25 nodes × 6 channels). Fig. 2 and Fig. 3 show the average $J$ values over the six subjects of each feature component with the db4 wavelet function in $M$ and $N$ respectively. It can be seen from Fig. 2 that there are two peak values, so we adopt $d = 2$, i.e., $M' = \{m_1', m_2'\}$. We adopt the fifteen features with $J$ values exceeding 0.02 in $N$, i.e., $l = 15$, $N' = \{n_1', n_2', n_3', \ldots, n_{15}'\}$. It should be noted that we tried many $l$ values ($l = 5, 6, 7, \ldots, 20$) with the training samples; the best result was obtained with $l = 15$ according to the classification accuracy on the training samples. $F = \{M', N'\}$ is the final feature vector for classification.
Fig. 3. Average J values of each feature component in N (Fisher distance J versus feature component number, before sorting and after descending sort)
5 Results and Analysis

We establish the feature extraction and classification models using the training samples. The model diagram is shown in Fig. 4. For a testing sample, the raw EEG data are fed into the established model, and the output of the model is the class label. By virtue of its easy training and the solid statistical foundation of the Probabilistic Neural Network (PNN) in Bayesian estimation theory [10], we use the PNN as our classifier. In the PNN, we set the spread of the radial basis function to 1.0. In order to assess the performance of the WPTCP method, we compare it with the frequency method, the traditional time-frequency method and the existing WPTP method. These methods can be described as follows:
(1) WPTCP: this method is described in Subsection 4.2 and its feature vector is $F = \{M', N'\}$.
(2) WPTP: this method is similar to the WPTCP, but the feature vector is formed by power values only, i.e., $F = \{N'\}$.
(3) Frequency method: we estimate the spectral power of the signal using the Welch method, which averages the power spectra of sliding-windowed fast Fourier transforms across the duration of a trial. The implementation of the Welch method uses a Hamming window of width 1 s. The mean power of each channel in the band 0-50 Hz is computed, which results in a 6-dimensional feature vector for the six channels described in Subsection 4.2.
(4) Traditional time-frequency method: the mean value across the duration of a trial in the time domain and the spectral power in the frequency domain are computed simultaneously. The feature vector is formed by combining the mean value and the spectral power of the six channels and is therefore 12-dimensional. The calculation of the spectral power is the same as in the frequency method (3).
We evaluate the performance of the different feature extraction methods by the classification accuracy on the testing samples. The classification accuracy is defined as the ratio between the number of trials classified correctly and the total number of trials. We adopt the data from session A as training samples and the data from session B as testing samples. Meanwhile, different wavelet functions (db4, db6, sym2) are
Fig. 4. The model diagram of the feature extraction and the classification (raw data → WPT decomposition at the sixth level → average coefficient values M and average power values N → computing J to obtain M′ and N′ → final feature vector F = {M′, N′} → classifier → class label)

Table 1. The classification accuracy (%) of different feature extraction methods
Feature extraction method      sub1 (db4 / db6 / sym2)   sub2 (db4 / db6 / sym2)   sub3 (db4 / db6 / sym2)
(1) WPTCP                      70.5 / 59.3 / 68.6        71.2 / 71.5 / 69.4        68.7 / 66.5 / 67.2
(2) WPTP                       63.2 / 65.3 / 62.9        68.9 / 67.1 / 64.5        65.6 / 63.5 / 60.3
(3) Frequency                  58.7                      60.5                      56.4
(4) Time-frequency             63.4                      64.2                      60.3

Feature extraction method      sub4 (db4 / db6 / sym2)   sub5 (db4 / db6 / sym2)   sub6 (db4 / db6 / sym2)
(1) WPTCP                      66.5 / 67.1 / 63.4        71.9 / 69.4 / 66.8        72.1 / 70.5 / 69.3
(2) WPTP                       63.2 / 65.3 / 62.9        65.1 / 66.3 / 62.4        67.5 / 68.3 / 66.7
(3) Frequency                  57.1                      60.5                      60.0
(4) Time-frequency             59.6                      62.3                      63.8
also adopted to test the classification performance of the WPTCP method. The classification accuracies with the different wavelet functions and different feature vectors are shown in Table 1. From Table 1 we can see that the WPTCP method obtains the highest classification accuracy of all the feature extraction methods for each subject. The WPT is a good analysis tool, but the WPTP only contains frequency-domain information and lacks time-domain information. The conventional time-frequency method uses both time-domain and frequency-domain information; however, it is not well suited to the non-stationary EEG signal. So both the WPTP method and the conventional time-frequency method obtain an intermediate classification result, and the frequency method obtains the worst result.
6 Conclusion

The WPT is a good analysis tool for EEG signals. The combination of average WPT coefficient values and power values provides rich feature information, which allows the WPTCP to significantly improve the classification performance over other methods. This preliminary study shows that the WPTCP is a promising method for extracting features from spontaneous EEG signals for BCIs. The effectiveness and reliability of the WPTCP need to be verified with more data and more subjects. In addition to the average coefficients and average powers, more effective features based on the WPT should be investigated further.

Acknowledgments. This project is supported by the National Natural Science Foundation of China (60975079), the State Key Laboratory of Robotics and System (HIT), the Shanghai University "11th Five-Year Plan" 211 Construction Project, the Systems Biology Research Foundation of Shanghai University, and the Shanghai Key Laboratory of Power Station Automation Technology (08DZ2272400).
References
1. Guo, J., Hong, B., Guo, F., Gao, X.R., Gao, S.K.: An Auditory BCI Using Voluntary Mental Response. In: 4th International IEEE EMBS Conference on Neural Engineering, Antalya (2009)
2. Guo, X.J., Wu, X.P., Zhang, D.J.: Motor Imagery EEG Detection by Empirical Mode Decomposition. In: International Joint Conference on Neural Networks, pp. 2619–2622 (2008)
3. Yu, X.Q., Xiao, M.S., Tang, Y.: Research of Brain-Computer Interface based on the Time-Frequency-Spatial Filter. Bioinformatics and Biomedical Engineering (2009)
4. Zhao, M.Y., Zhou, M.T., Zhu, Q.X.: Feature Extraction and Parameters Selection of Classification Model on Brain-Computer Interface. Bioinformatics and Bioengineering, 1249–1253 (2007)
5. Sherwood, J., Derakhshani, R.: On Classifiability of Wavelet Features for EEG-Based Brain-Computer Interfaces. In: Proceedings of the International Joint Conference on Neural Networks (2009)
6. Satti, A., Coyle, D., Prasad, G.: Continuous EEG Classification for a Self-paced BCI. In: Proceedings of the 4th International IEEE EMBS Conference on Neural Engineering, Antalya, pp. 315–318 (2009)
7. Wang, S.Y., Zhu, G.X., Tang, Y.Y.: Feature extraction using best wavelet packet transform. Acta Electronica Sinica 31, 1035–1038 (2003)
8. Murugappan, M., Nagarajan, R., Yaacob, S.: Appraising Human Emotions using Time Frequency Analysis based EEG Alpha Band Features. Innovative Technologies in Intelligent Systems and Industrial Applications (2009)
9. Bashashati, A., Rabab, K.W., Gary, E.B.: Comparison of Using Mono-Polar and Bipolar Electroencephalogram (EEG) Electrodes for Detection of Right and Left Hand Movements in a Self-Paced Brain Computer Interface (BCI). Electrical and Computer Engineering, 725–728 (2007)
10. Hazrati, M.K., Erfanian, A.: An On-line BCI for Control of Hand Grasp Sequence and Holding Using Adaptive Probabilistic Neural Network. In: 30th Annual International IEEE EMBS Conference, Vancouver, Canada (2008)
Keynote Address: The 3D Imaging Service at Massachusetts General Hospital: 11 Years Experience

Gordon J. Harris

Director, 3D Imaging Service, and Radiology Computer Aided Diagnostics Laboratory (RAD CADx LAB), Massachusetts General Hospital; Associate Professor of Radiology, Harvard Medical School, Boston, MA, USA
1 Rationale

In 1999, we set out to create a radiology three-dimensional (3D) imaging service at Massachusetts General Hospital (MGH). Our goal was two-fold: first, to integrate 3D image post-processing capabilities, computer-aided diagnosis (CAD), and quantitative analysis into the routine clinical workflow; and second, to create an infrastructure generally more conducive to the transfer of new image-processing technologies from the research realm into clinical use. Initially, we found that although our institution possessed several 3D imaging workstations, they were used only occasionally for research purposes and, when a clinical request for 3D post-processing was made, the staff lacked the expertise and experience to fulfill those requests.
2 3D Imaging Techniques

Three-dimensional image processing begins with a stack of 2-dimensional images, assembles them into a 3-D volume, and then manipulates them in a variety of ways. There are numerous techniques for image manipulation that can be performed with 3D imaging software, including maximum intensity projection (MIP), volume rendering (VR), endoluminal views, segmentation, and functional imaging; however, the challenge is selecting the technique that provides the most clinical value. To that end, the staff of our 3D Imaging Service has undergone extensive training. In addition, together with radiologists and referring physicians, our staff has crafted 3D protocols with standard views for each imaging modality (magnetic resonance [MR], computed tomography [CT], ultrasound [US]) and for each 3D clinical application, which have been selected in order to provide reliable consistency and optimal clinical value, both important features for any clinical service. See Table 1 for a list of 3D protocols performed by our 3D Imaging Service. Without standardized 3D protocols, if multiple radiologists or technologists perform 3D imaging at varying levels of expertise, the images created can vary widely, and the output may be difficult for radiologists or referring physicians to interpret and may be of no clinical value. Selection of an appropriate 3D image analysis technique often depends on the perspective of the physician: for diagnosis, radiologists may prefer a technique such as MIP in which all of the information is present and none has been removed by the
Table 1. Partial List of 3D protocols offered by the MGH 3D Imaging Service

Cardiac Radiology
• Cardiac CTA
• Chest CT for pulmonary vein evaluation prior to ablation
• Chest CT for Pulmonary Arteries
• Chest MR for pulmonary vein evaluation prior to ablation
• Cardiac calcium scoring

Vascular Radiology
• CTA / MRA for Abdominal Aortic Aneurysm (AAA)
• CTA / MRA for pre- and post-op Thoracoabdominal Aneurysm (TAA)
• CTA / MRA for aortic dissection
• Abdominal MRA for mesenteric ischemia
• Abdominal / pelvis MRA / MRV for portal & deep vein thrombosis
• MRI Upper Extremity
• Runoff CTA
• Runoff MRA
• Chest MRI Vascular Run-off
• Renal MRA for Stenosis

Bone & Joint Radiology
• Skeletal fractures (Spine, Face, Temporal and Joints)

Neuroradiology
• Head CTA / MRA
• Neck CTA / MRA
• Head CT / MR Venography
• Head CT / MR Perfusion
• Mandible CT for inferior alveolar nerve
• Pediatric CT for craniosynostosis

Abdominal Radiology
• Liver Resection / Liver Donor CT
• Liver Volumes
• CT Urography / Hematuria
• CT Renal Donor
• MRCP
• Pancreas CTA
• MRI Liver and Spleen volumes for Gauchers Disease

Chest Radiology
• Chest CT for any tracheal lesion (Virtual Broncoscopy)
• Chest CT for Video Assisted Thoracotomy (VAT) planning
computer, whereas a surgeon may prefer a more anatomically realistic view for surgical planning, such as a VR image in which some of the information has been segmented out (Figure 1). For applications such as vascular imaging, it is not uncommon to pair more than one technique: for example, VR to assess the geometry of any vascular lesions together with curved multiplanar reformatting (MPR) to assess for stenoses or occlusions (Figure 2).
Fig. 1. Neurovascular image processing using (A) maximum-intensity projection and (B) volume rendering
Fig. 2. Neurovascular image processing using (A) curved multiplanar reformation and (B) volume rendering
3 Benefits of 3D Imaging

For diagnostic vascular imaging, 3D image analysis has allowed us to almost entirely replace the more expensive and invasive catheter angiogram with CT angiography (CTA) or MR angiography (MRA). Moreover, with CTA and MRA, it is possible to view not only the vessels, but also the surrounding parenchyma and other nearby structures. One example of the benefit of 3D vascular imaging is in evaluation of living renal donors, where the transplant surgeon requires a complete picture of the number and geometry of the renal arteries, veins, and ureters of the donor. For this application, we have been able to replace two more expensive and invasive exams, catheter angiography (involving anesthesia and higher risk of complications) plus intravenous pyelography (IVP), with a non-invasive, less expensive, and better tolerated single outpatient CT exam involving CTA plus delayed-phase CT urography. The healthy donor is spared from expensive, invasive procedures and, instead, receives a simple, outpatient, contrast-enhanced multiphasic CT scan capable of gathering all of the necessary information with minimal risk. Computer-aided segmentations are used to disarticulate structures within an image, which can greatly assist in pre-surgical planning, as in the case of the repair of complex fractures (Figure 3). Segmentation can also facilitate the assessment of vessels; for example, in cardiac imaging, segmenting out some of the adjacent structures can provide a clear view of the vessels from their origin to the apex. Segmentation can also be useful for accurate determination of brain tumor volumes, particularly in the case of irregularly shaped tumors where a linear measure provides insufficient information. Another use of quantitative segmentation is in the accurate determination of
Fig. 3. Computer-aided segmentation of a complex fracture
the volume of a donor liver to determine if it is large enough to supply both the donor and the recipient. Without pre-operative, quantitative, volumetric assessment, this determination was based on a fairly rough estimation; however, it is now possible to more precisely determine the volume of the liver and perform a virtual resection on the computer, potentially increasing the success rate of liver transplantation. Moreover, this technique has been automated within our lab and can now be performed in less than 10 minutes. In the 3D Imaging Service at MGH, we also perform functional imaging using CT and MR perfusion, and functional MRI (fMRI). CT perfusion can be used to assess patients for stroke by measuring various hemodynamic parameters: for example, an increased mean transit time and decreased cerebral blood flow indicate the presence of an infarcted area. Functional MRI plays a role in neurosurgical planning, helping the surgeons to determine the proximity of the surgical target to critical sensory, motor, and language areas. The use of fMRI in this way can reduce the amount of time the surgeon spends doing intraoperative cortical mapping, which can decrease operating room time, cost, and risk of complications.
4 3D Imaging Service at Massachusetts General Hospital

At MGH, our full-time 3D Imaging Service performs 3D imaging upon request with rapid turnaround time. We are fully integrated with the hospital's picture archiving and communications systems, billing, and information systems. Our volume has continued to grow each year: When we started in February 1999, we performed an average of two exams per day, and now we perform approximately 120 exams per day, or 2,500 per month. Our clinical staff is currently comprised of approximately 16 individuals, including 3-D technologists, image analysts, operations and technical managers, and billing coordinators, and we utilize a wide variety of different types of workstations from many different vendors for different applications. We select the vendor that we feel has the best software for each 3D protocol, and hence, we find that some 3D protocols are best managed by one vendor, while another vendor may be best for other 3D protocols. We primarily perform CTA and MRA, nonvascular CT and MR exams, and 3D US, with approximately half being neuro-based and the remainder being vascular, as well as other applications. We currently perform 3D postprocessing for approximately 10% of the CT examinations and 20% of the MRI and US examinations at MGH.
5 Off-Shore Night and Weekend Coverage

In 2003, the growth and popularity of our 3D Imaging Service at MGH began to pose a problem. Radiologists and referring physicians had become dependent on the 3D images created in our lab, but wanted these images available 24 hours a day, seven days a week, whereas our staff could only cover the day and evening shifts. Furthermore, there was a shortage of technologists, and it was difficult to find a qualified person that we could hire and train to perform 3D imaging 7 nights per week. We developed a solution in collaboration with an India-based information technology
company together with an India-based hospital. We had a radiologist from India working as a 3D imaging fellow in our lab for a year, fully trained in our 3D protocols and operations, who moved back to India and remains there as our lead 3D radiologist; he has trained two other radiologists who share the duty of covering our 3D post-processing 7 nights per week (the day shift in Bangalore, India). This relationship has lasted for seven years, and we consider the staff in Bangalore an extended part of our MGH 3D Imaging Service team. The radiologists in India process approximately 500 exams per month from MGH and send the resulting 3D images back to the MGH PACS system to be read by the MGH-based staff radiologists. No clinical interpretations are performed by the India team, who solely provide the 3D rendering services for MGH, processing the night/weekend 3D views on CTA and MRA exams.
6 Tele3D Services for Outside Hospitals and Imaging Centers

In 2009, we began providing 3D imaging services for outside hospitals and imaging centers through our Tele3D service. While MGH is a large academic medical center with the resources to develop and support a robust and broad-based 3D Imaging Service, many hospitals and imaging centers lack the infrastructure, resources, and/or expertise to develop and manage such a service, or do not have sufficient clinical 3D volume to make it worth the investment of time, energy, staff, and resources to develop an in-house 3D imaging service. To fill this need, we began offering the services of our lab to other hospitals and imaging centers across the United States. We currently have client hospitals at three centers in Ohio and California, including five hospitals and two imaging centers, for whom we are processing 6-7 exams per day on a fee-for-service basis. This allows these client hospitals to provide the 3D imaging quality, consistency, and expertise of the MGH 3D Imaging Service to their radiologists and clinicians at a fraction of the cost of equipping and staffing a full-time onsite 3D operation.
7 Conclusions

In summary, 3D image analysis provides more comprehensive and realistic patient evaluation. Quantitative analysis with CAD can provide more accurate, reliable assessment, staging, and treatment planning, ultimately improving patient care, increasing clinical confidence, and reducing the time, cost, and invasiveness of procedures. We recognize that the level of commitment of resources needed to develop an in-house 3D imaging service may not be practical for all imaging centers; therefore, through improvements in networking and communications, we have expanded our CAD and 3D services to help support the needs of outside hospitals and imaging centers through our Tele3D service. During the past year, we have been providing 3D image processing services to 5 client hospitals and two imaging centers across the United States, and we hope to grow this service to support many hospitals and imaging centers in years to come.
A Novel Localization System Based on Infrared Vision for Outdoor Mobile Robot

Jingchuan Wang 1,* and Weidong Chen 1,2

1 Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China
2 State Key Laboratory of Robotics and System (HIT), Harbin 150001, China
{jchwang,wdchen}@sjtu.edu.cn
Abstract. An outdoor localization system for mobile robots based on infrared vision is presented. To deal with changing light conditions, an omni-directional near infrared (NIR) vision system is developed. The extended Kalman filter (EKF) is used for localization to improve the accuracy and robustness of the system. Finally, experiments demonstrate the system performance in an electrical substation.

Keywords: Mobile Robot, Outdoor Localization, Infrared Vision.
1 Introduction

Recently, mobile robots have begun to take over work from humans in outdoor applications such as patrolling and surveillance in unstructured areas. In these situations, the localization and navigation techniques have to deal with changing conditions and interference in the environment. Especially for applications in hazardous environments, for example a patrol robot at an electrical substation, an autonomous localization system suitable for varying lighting conditions is necessary. Laser range finders, GPS and WLAN are widely used for mobile robot localization. J. Guivant et al. [1] presented a design of a high-accuracy outdoor navigation system based on standard dead reckoning sensors and laser range and bearing information. Y. Morales et al. [2] proposed loosely-coupled multi-sensor fusion and addressed sensor fault detection issues, using encoders and StarFire-DGPS. M. Agrawal et al. [3] described a real-time, low-cost system to localize a mobile robot in outdoor environments that relies on motion estimates from stereo vision; this incremental motion was fused with a low-cost GPS sensor using a Kalman filter to prevent long-term drift. Graefenstein et al. [4] proposed a robot localization method using the received signal strength indicator (RSSI) of low-power IEEE 802.15.4 wireless communication in an outdoor environment. However, GPS signals are often missing due to occlusion by buildings or trees. In particular, many methods, such as GPS, WLAN and magnetic tracking sensors, are strongly disturbed by the electric and magnetic fields in an electrical substation.
This work is partly supported by the National High Technology Research and Development Program of China under grant 2006AA040203, the Natural Science Foundation of China under grant 60775062 and 60934006, and the State Key Laboratory of Robotics and System (HIT).
Vision-based robot localization in outdoor environments has also become popular recently. C. Weiss et al. [5] presented a hybrid localization approach that switches between local and global image features. P. Blaer et al. [6] developed a topological localization approach using color histogram matching of omni-directional images. Vision-based robot localization is difficult, especially in outdoor environments, due to the changing illumination conditions. Infrared light is not visible and is not disturbed by visible light, yet it can be captured by a camera. L. Sooyong et al. [7] developed a localization system using artificial landmarks: infrared-reflecting landmarks together with an infrared camera helped reduce the disturbance to users and enabled simple installation. J. Takiguchi et al. [8] presented a self-positioning system for a mobile robot. The proposed positioning system consisted of an ODV (Omni-Directional Vision system) featuring two mirrors; a primary and two supplementary markers were used, and the relative self-position could be estimated from three directional angles toward the landmarks. In this paper, a new system (the NIR system) is presented; it can work in outdoor environments under widely varying lighting conditions. This system relies on infrared illumination and the recognition of reflecting passive landmarks. In order to increase the sensing range, we propose an omni-directional vision system (omni-vision) for illumination and recognition.
2 System Overview

Mobile robot localization in outdoor electrical substation environments is a challenging task, as shown in Fig. 1. There is strong electromagnetic interference. In such outdoor environments, common odometry fails because of the long distances and repeated navigation, and GPS is not completely reliable. Thus, a system based on infrared vision is presented. The NIR system consists of two parts: an omni-directional NIR vision system and a set of reflecting passive landmarks (RPL). As shown in Fig. 2, the NIR vision system is fixed on top of the robot, and the RPLs are distributed along both sides of the robot's path. The infrared rays projected from the NIR vision system cover the robot's surrounding area, and the rays reflected by the RPLs are captured by the camera of the NIR vision system. Through the recognition of the different RPLs, whose information is stored in an RPL file, dead reckoning of the robot, and a localization algorithm with an extended Kalman filter (EKF), the robot's position is determined.
Fig. 1. Environment
Fig. 2. NIR vision system with RPL map
2.1 NIR Vision System

As shown in Fig. 3, the NIR vision system consists of two parts: a near infrared illuminator and an omni-directional camera. The illuminator consists of several LEDs with a total power of 44 W. A radiator attached to the illuminator dissipates its heat. In order to reduce the disturbance from visible light, a top mask and an emission filter are used. The back mask covers the 90-degree rear area of the omni-directional camera where the wire to the illuminator runs. Thus, only light from 810 nm to 880 nm passes in and is captured by the camera. Different designs are used for the two mirrors of the NIR vision system. The reflector mirror is designed to give uniform illumination on a specific plane; its diameter is 110 mm, and it can irradiate the area within a radius of 2 to 4 m around the robot. The omni-vision mirror has a hyperbolic shape with a 'single viewpoint', offering an unambiguous image [9]; its diameter is 60 mm.
Fig. 3. NIR vision system
2.2 Reflecting Passive Landmark (RPL)

The reflecting passive landmarks (RPL) are designed in 4 patterns, shown in Fig. 4, tagged with RPL IDs from No. 0 to No. 3 and shaped as trapezoidal poles. The white parts are covered with reflective material and reflect infrared light; the incline angle of the trapezoid is suitable for reflecting the light from the illuminator back to the omni-directional camera. In contrast, the black parts absorb light. Different ratios between the white and black parts represent different RPL patterns, and combinations of different RPLs represent different path states: straight path, path corner and stop station.
Fig. 4. RPL patterns (No. 0 to No. 3)
2.3 Map

According to the RPL rules in the above section, the RPLs are distributed in the real environment. Fig. 5 shows an experimental environment and its RPL map. There are two stop stations where the robot pauses, a straight path and a corner. The width (D) of the road is 3 m, and the length of the path between the two stop stations is 20 m. There are 22 RPLs in total, and the distance (L) between two neighboring RPLs is 2 m. The ID and the position in the global coordinate frame of each RPL are stored in the RPL file.
Fig. 5. Experimental scene and RPL map: (a) experimental scene; (b) RPL map
2.4 Image Processing

Fig. 6 shows the processing flow chart for one image frame. In order to reduce the interference of visible light, the shutter time of the camera is set as short as possible (0.02 ms). The panoramic image is then expanded into a rectangular one. The expanded image is processed by intensity amplification [10] to increase the luminance differences, and then the binarization method in [11] is used. After binarization, morphological filtering [12], segmentation and matching are applied, and the highlight blobs are detected and separated.
Fig. 6. Flowchart of image processing
According to the RPL templates described in the section above, the position $X_{v \cdot i}(k)$ of the No. $i$ RPL in pixel coordinates at time $k$ is calculated; this vector is given by:
X v ⋅i (k ) = [xv
yv ]i
Commonly, maximal 4 RPL will be recognized in one image.
(1)
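A minimal sketch of this per-frame pipeline in Python with OpenCV is given below; the polar unwarping, the contrast-stretching stand-in for intensity amplification, and the blob-size threshold are illustrative assumptions rather than the authors' exact implementation.

```python
import cv2
import numpy as np

def detect_rpl_pixels(panoramic_gray):
    """Sketch of the per-frame pipeline: expand, amplify, binarize,
    morphologically filter, and return blob centroids X_{v,i}(k)."""
    # 1) Expand the omni-directional image to a rectangle (assumed polar unwarp).
    h, w = panoramic_gray.shape
    center = (w / 2.0, h / 2.0)
    expanded = cv2.warpPolar(panoramic_gray, (w, h), center, min(center),
                             cv2.WARP_POLAR_LINEAR)
    # 2) Intensity amplification (placeholder: simple contrast stretching).
    amplified = cv2.normalize(expanded, None, 0, 255, cv2.NORM_MINMAX)
    # 3) Otsu binarization, as in reference [11].
    _, binary = cv2.threshold(amplified.astype(np.uint8), 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # 4) Morphological filtering to suppress small noise blobs.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    # 5) Segment highlight blobs and return their centroids.
    n, _, stats, centroids = cv2.connectedComponentsWithStats(cleaned)
    blobs = [tuple(centroids[i]) for i in range(1, n)
             if stats[i, cv2.CC_STAT_AREA] > 20]   # area threshold assumed
    return blobs[:4]                               # at most 4 RPL per image
```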
3 EKF Localization Algorithm
Fig.7 depicts the flow chart of the localization algorithm used in this system. An EKF [14] is used for sensor-fusion estimation. From the previous estimate X(k−1) with covariance matrix P(k−1), the motion input u(k) from the dead-reckoning module and the RPL measurement vector Z(k) from the RPL localization module, the EKF framework computes the estimate X(k) and covariance matrix P(k) at time k.
Fig. 7. EKF localization algorithm
3.1 Dead Reckoning
A robust dead-reckoning method is necessary to keep the localization estimate accurate over long travel distances. Dead reckoning provides a rough localization of the robot, with an error that accumulates due to unavoidable slip. The motion input is given as:
u(k) = (Δx_e, Δy_e, Δθ_e)^T    (2)

where Δx_e, Δy_e and Δθ_e are the variations of the odometry result X_o(k) between times k−1 and k, and the covariance matrix of u is denoted by Q.
3.2 RPL Localization
The result X_{v·i}(k) of the No. i RPL from image processing, expressed in pixel coordinates, is compared with the RPL map; only results that match the RPL map are kept, and the others are discarded. The RPL vector Z(k) is then calculated from X_{v·i}(k) and the RPL map, and is given by:
Z(k) = (θ_1, θ_2, ..., θ_n)^T,  0 < n ≤ 4    (3)

The covariance matrix [15] of the RPL localization is denoted by R.
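One predict/update cycle of the EKF can be sketched as follows. The simple additive motion model and the bearing-only measurement model (and all names) are assumptions made for illustration, since the paper does not spell out its Jacobians.

```python
import numpy as np

def ekf_step(X, P, u, Q, z, rpl_xy, R):
    """One EKF cycle. X = [x, y, theta]; u = (dx_e, dy_e, dtheta_e);
    z = bearings to the matched RPL; rpl_xy = their positions from the RPL map."""
    # --- Prediction with dead-reckoning input u(k) ---
    X_pred = X + np.asarray(u)
    F = np.eye(3)                                 # Jacobian of this simple motion model
    P_pred = F @ P @ F.T + Q
    # --- Predicted bearing to each matched RPL and its Jacobian ---
    h, H = [], []
    for (lx, ly) in rpl_xy:
        dx, dy = lx - X_pred[0], ly - X_pred[1]
        q = dx * dx + dy * dy
        h.append(np.arctan2(dy, dx) - X_pred[2])
        H.append([dy / q, -dx / q, -1.0])
    h, H = np.array(h), np.array(H)
    # --- Update ---
    innov = (np.asarray(z) - h + np.pi) % (2 * np.pi) - np.pi   # wrap angle residuals
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    X_new = X_pred + K @ innov
    P_new = (np.eye(3) - K @ H) @ P_pred
    return X_new, P_new
```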
4 Experiments
The NIR system was tested in an electrical substation; the outdoor environment is the one described in Section 2. The NIR vision system is mounted on the Frontier-II robot, as shown in Fig.8.
Fig. 8. Frontier-II robot platform
The designed map is shown in Fig.5. One stop station is chosen as the start position, and the robot travels around the path 20 times at a speed of 1 m/s. Three repeated runs are performed in the same environment at different times of day to test the effect of various light conditions on the EKF estimates. During the experiments, the following data are recorded in every cycle:

[k, X(k), X_o(k), X_{v·i}(k)]    (4)
4.1 Results in Various Light Conditions
The EKF estimates and the odometry results in the global coordinate frame were recorded under various light conditions; the illuminations in the morning, afternoon and night experiments were 20000, 15000 and 500 lux, respectively. As shown in Fig.9, the error of the odometry results accumulates as the number of rounds increases, while the EKF estimates stay close to the expected path with acceptable errors.
Fig. 9. Experimental results in various light conditions
The error of the EKF estimates under the various light conditions is shown in Table 1.

Table 1. MSE performance

                 Morning (AM 10:00)   Afternoon (PM 15:00)   Night (PM 20:00)
MSE_x [cm^2]          18.49                 15.21                 7.84
MSE_θ [rd^2]          0.052                 0.061                 0.022
Comparing the EKF estimates with each other, the error of the night experiment is the smallest, while the errors of the two daylight experiments are almost the same and much larger than at night.
4.2 Error Analysis
According to the above experimental results, the error of the EKF estimates increases as the illumination increases. Since the recognition of the RPL relies on the illuminator, the error of the EKF estimates arises from the disturbance of ambient light. We divide this disturbance into 3 situations:
• recognition error due to incomplete lighting;
• false or failed recognition due to disturbance from the ground;
• false or failed recognition due to disturbance from the background.
Fig.10 shows a recognition error caused by incomplete lighting. Thanks to complete lighting, the whole profile of the RPL in the red pane in Fig.10 (a) is recognized correctly, whereas only half of the RPL in the green pane is recognized, as shown in the expanded image. This is caused by the azimuth angle of the sunshine. It leads to an error in the position X_{v·i}(k) of this RPL in pixel coordinates, then to an error in the RPL vector Z(k), and finally to an error in the EKF estimate. Because solar radiation contains infrared light, this phenomenon cannot be avoided and always appears during daylight. Fig.11 depicts a failed recognition caused by bright ground that reflects strong sunlight: as shown in Fig.11 (a), the bottom of the RPL in the green pane cannot be extracted, so this RPL fails to be recognized, only 2 RPL can be recognized from this image, and the error of the EKF estimate increases. As shown in Fig.12 (a), the highlighted area on the right is grass; it is brighter than the stone area on the left and brings an obvious disturbance to the NIR vision system, so the RPL in the green pane fails to be recognized. The main reason is that grass reflects the infrared component of sunlight intensely. In summary, the error of the EKF estimates mostly arises from the disturbance of the infrared component of sunshine; the azimuth angle of the sun, the reflecting ground and the reflecting background all disturb the NIR vision system.
Fig. 10. Light disturbance
Fig. 11. Ground disturbance
Fig. 12. Background disturbance
5 Conclusion
In real electrical substation environments, GPS and WLAN sensors do not work reliably because of the strong electromagnetic interference, and dead reckoning alone cannot cover a long distance.
For a patrol mobile robot working in outdoor electrical substation environments, an infrared vision-based localization system is proposed. The hardware and software framework were described, and the landmark design, the data structure of the map and the infrared image processing method were discussed. The experimental results demonstrate the validity of the proposed system.
References 1. Guivant, J., Nebot, E.M., Baiker, S.: Localization and map building using laser range sensors in outdoor applications. Journal of Robotic Systems 17, 565–583 (2000) 2. Morales, Y., Takeuchi, E., Tsubouchi, T.: Vehicle Localization in Outdoor Woodland Environments with Sensor Fault Detection. In: Proc. IEEE Int. Conf. Robotics and Automation, Pasadena California (May 2008) 3. Agrawal, M., Konolige, M.: Real-time Localization in Outdoor Environments Using Stereo Vision and Inexpensive GPS. In: Proc. 18th Inter. Conf. Pattern Recognition, vol. 3, pp. 1063–1068 (2006) 4. Graefenstein, J., Bouzouraa, M.E., et al.: Robust Method for Outdoor Localization of a Mobile Robot Using Received Signal Strength in Low Power Wireless Networks. In: Proc. IEEE Int. Conf. Robotics and Automation, Pasadena California (May 2008) 5. Weiss, C., Tamimi, H., Masselli, A., et al.: A hybrid approach for vision-based outdoor robot localization using global and local image features. In: Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, San Diego, CA, USA, October 29 - November 2, pp. 1047– 1052 (2007) 6. Blaer, P., Allen, P.: Topological Mobile Robot Localization Using Fast Vision Techniques. In: Proc. IEEE Int. Conf. on Robotics and Automation, Washington DC, USA (May 2002) 7. Sooyong, L., Jae-Bok, S.: Mobile Robot Localization using Infrared Light Reflecting Landmarks. In: Proc. Int. Conf. Control, Automation and Systems, Seoul, Korea, October 17-20 (2007) 8. Takiguchi, J., Takeya, A., Nishiguchi, K., et al.: A study of autonomous mobile system in outdoor environment. In: Proc. IEEE Int. Conf. Robotics & Automation, Seoul, Korea, May 21-26 (2001) 9. Zivkovic, Z., Booij, O.: How did we built our hyperbolic mirror omnidirectional camerapractical issues and basic geometry. Technical Report IAS-UVA-05-04, Informatics Institute, University of Amsterdam (2005) 10. Centeno, J.A.S., Haertel, V.: An Adaptive Image Enhancement Algorithm. Pattern Recognition 30(7), 1183–1189 (1997) 11. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Systems, Man, and Cybernetics 9(1), 62–66 (1979) 12. Serra, J.: Image Analysis and Mathematical Morphology. Academic Press, New York (1982) 13. Williams, S.B., Newman, P., Rosenblatt, J., Dissanayake, G., Durrant-Whyte, H.: Autonomous underwater navigation and control. Robotica 19(5), 481–496 (2001) 14. Thrun, S.: Probabilistic algorithms in robotics. AI Magazine 21(4), 93–109 (2000) 15. Scheding, S., Nebot, E.M., Durrant-Whyte, M.: The detection of faults in navigation systems: A frequency domain approach. In: Proc. IEEE Int. Conf. Robotic and Automation, Belgium, pp. 2117–2222 (1998)
Analytical Solution for the Forward Problem of Magnetic Induction Tomography with Multi-layer Sphere Model Zheng Xu*, Qian Li, and Wei He State Key Laboratory of Power Transmission Equipment & System Security and New Technology, College of the Electrical Engineering, Chongqing University, Chongqing 400044, People’s Republic of China
[email protected],
[email protected]
Abstract. A 4-layer sphere model of the human head was built for the forward problem of Magnetic Induction Tomography (MIT). The layers represent the brain, the CSF, the skull and the scalp, respectively. The Helmholtz equation in spherical coordinates was taken as the governing equation with the magnetic vector potential as the variable, and the Variable Separation Method (VSM) was used to solve the equation together with the boundary and interface conditions. The eddy current distribution in the model was obtained; the contour lines of the eddy current field were plotted, and the influence of the excitation frequency on the induced voltage was analyzed. The simulation results demonstrate that this analytical method is valid for solving the forward problem of magnetic induction tomography, and it may be used as a fast calculation method to generate the sensitivity matrix of the MIT inverse problem. Keywords: magnetic induction tomography, forward problem, variable separation method.
1 Introduction
Magnetic Induction Tomography (MIT) [1] is a new noninvasive medical impedance imaging method. MIT applies a time-varying magnetic field from an excitation coil to induce eddy currents in the detected sample; the secondary magnetic field produced by the eddy currents is closely related to the conductivity of the sample material and can be detected by sensor coils. The induced voltages of the sensor coils can then be used to reconstruct the conductivity distribution in the sample. Compared with Electrical Impedance Tomography (EIT) [2] and Electrical Capacitance Tomography (ECT), MIT has the following obvious advantages. (1) In EIT, the conduction field is built up by two surface electrodes; the path of the conduction current is divergent and the current density is small. In MIT, the eddy current field is built up by the excitation coils; the eddy current is a curl field with a local focusing character, which helps to improve the local imaging resolution. (2) MIT uses coils as sensors, which are not in contact with the body surface, so the error caused by the impedance between the body
* Corresponding author.
surface and the electrodes, which is a nightmare in EIT, is avoided. (3) The magnetic field can penetrate the highly resistive skull, so MIT is thought to be particularly suitable for brain functional imaging. The forward problem of MIT is the following: given the excitation source and the conductivity distribution in the sample, find the eddy current distribution and the induced voltage in the sensor. Morris [3] developed a finite-difference model of MIT for biological tissues. Merwa [4] developed a software package to generate finite element (FE) models of complex structures. Liu [5] presented an edge finite element method to solve the 3D eddy current problem. However, these works ignored the effect of the displacement current. In this paper the displacement current is taken into consideration, and the Variable Separation Method (VSM) is employed to solve the forward problem equation. Compared with numerical methods, VSM is an analytical algorithm that is faster and more precise when applied to regular geometric models, which facilitates the calculation of the inverse problem for the sink model (since most sink models are regular spheres, the analytical algorithm can be used).
2 Method
2.1 Physical Model
The 4-layer concentric spherical model is shown in Fig.1; from outer to inner, the four layers denote the scalp, skull, CSF (cerebrospinal fluid) and cerebrum, respectively. The parameters of the model are shown in Table 1 [6-7]. The permeability of brain tissue is close to that of vacuum, so the relative permeability μ_r of each layer is set to 1. A circular coil of radius ρ' carrying the excitation current is placed at a distance z' above the surface of the model.

Table 1. Parameters of the 4-layer spherical model
                         cerebrum      CSF           skull         scalp
Relative radius          r4 = 0.84     r3 = 0.8667   r2 = 0.9467   r1 = 1
Conductivity (S/m)       0.32          1             0.04          0.32
Relative permittivity    4000          100           300           20000
The calculation space is divided into six parts. Layers 1–4 correspond to the scalp, skull, CSF and cerebrum, respectively. In order to convert the excitation current into an interface condition, the space outside the head model is divided into two parts, and the excitation current source is located on the interface between these outermost two layers. The model is shown in Fig.1; r5 is the radius of the spherical surface on which the excitation current source is located, and it equals √(ρ'² + z'²).
Fig. 1. Sketch map of 4-layer head model
2.2 The Boundary Value Problem of the Magnetic Vector Potential
In a spherical coordinate system, the magnetic vector potential of each layer satisfies the Helmholtz equation [8]:

∇²A_{iφ} + ( k_i² − 1/(r² sin²θ) ) A_{iφ} = 0    (1)

where i = 1, 2, ..., 6 is the layer number and k_i is the propagation coefficient, which can be expressed as

k_i² = −jωμ_i (γ_i + jωε_i)

It contains four basic parameters: the conductivity γ, the magnetic permeability μ, the dielectric constant ε and the angular frequency ω of the excitation source. In the air layers (the 5th and 6th layers), the displacement current can be ignored according to the usual criterion [9]: if ωR << υ (where R is the distance from the observation point to the source and υ is the propagation velocity of electromagnetic waves in vacuum), the displacement current can be neglected. With an excitation frequency of 1 MHz and R = 0.01 m, ωR ≈ 10⁴ << υ, so the displacement current can be ignored in the 5th and 6th layers. In the spherical model (layers 1 to 4), the ratio of the conduction current density to the displacement current density is γ/(ωε); if this ratio is much larger than 1, the displacement current can be ignored. Taking the cerebral tissue as an example, it has a conductivity of 0.32 S/m and a relative permittivity of 4000. When the excitation frequency is small, e.g. 1 kHz, the ratio equals 3.62 × 10⁷, much larger than 1, and the displacement current can be ignored. However, since increasing the excitation frequency increases the magnitude of the detection signal, the excitation frequency is usually very high, e.g. 1 MHz or more, and the ratio is then 1.45. In this situation the displacement current cannot be ignored.
On the interface between two different media, the following two conditions are satisfied:
(1) the magnetic vector potentials on the two sides of the interface are equal:

lim_{r→r_i−0} A_{iφ} = lim_{r→r_i+0} A_{i+1,φ},  i = 1, 2, 3, 4, 5    (2)

(2) the normal components of the current density are continuous:

1st–4th layers:  lim_{r→r_j−0} ∂/∂r ( r A_{jφ} / μ_{rj} ) = lim_{r→r_j+0} ∂/∂r ( r A_{j+1,φ} / μ_{r,j+1} ),  j = 1, 2, 3, 4
5th–6th layers:  lim_{r→r_5−0} ∂/∂r ( r A_{5φ} ) − lim_{r→r_5+0} ∂/∂r ( r A_{6φ} ) = μ_0 I δ(θ − θ')    (3)

where θ' = arcsin(ρ'/r'), μ_{r1}, μ_{r2}, μ_{r3}, μ_{r4} are the relative permeabilities of the different layers, μ_0 is the permeability of vacuum, and r_1, r_2, r_3, r_4, r_5 are the relative radii of the layers shown in Fig.1. On the infinite boundary, the boundary condition is

lim_{r→∞} A_{6φ} = 0    (4)
2.3 Solution
Using the separation of variables method [10-11], the solutions of expression (1) in a spherical coordinate system are as follows:

A_1 = Σ_{n=1}^{∞} C_{1n} j_n(k_1 r) P_n^1(cos θ)    (5)

A_i = Σ_{n=1}^{∞} [ C_{in} j_n(k_i r) + D_{in} y_n(k_i r) ] P_n^1(cos θ),  i = 2, 3, 4    (6)

A_5 = Σ_{n=1}^{∞} ( C_{5n} r^n + D_{5n} / r^{n+1} ) P_n^1(cos θ)    (7)

A_6 = Σ_{n=1}^{∞} ( D_{6n} / r^{n+1} ) P_n^1(cos θ)    (8)

where C_{in} and D_{in} are the unknown coefficients, P_n^m(cos θ) is the associated Legendre function and j_n(k_i r), y_n(k_i r) are spherical Bessel functions. Substituting (5) and (6) into (2), we have:

Σ_{n=1}^{∞} [ C_{1n} j_n(k_1 r_1) − C_{2n} j_n(k_2 r_1) − D_{2n} y_n(k_2 r_1) ] P_n^1(cos θ) = 0    (9)
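For numerical work, the spherical Bessel and associated Legendre functions in these series are available through SciPy. The following sketch evaluates the layer-1 series for given coefficients; the truncation order and the use of the half-integer Bessel relation for complex arguments are assumptions of this illustration, not part of the paper.

```python
import numpy as np
from scipy.special import jv, yv, lpmv

def sph_jn(n, z):
    """Spherical Bessel j_n via the half-integer-order Bessel function (works for complex z)."""
    return np.sqrt(np.pi / (2 * z)) * jv(n + 0.5, z)

def sph_yn(n, z):
    """Spherical Bessel y_n via the half-integer-order Bessel function (works for complex z)."""
    return np.sqrt(np.pi / (2 * z)) * yv(n + 0.5, z)

def A1(C1, k1, r, theta, n_max=30):
    """Evaluate A_1(r, theta) = sum_n C_{1n} j_n(k_1 r) P_n^1(cos theta), Eq. (5),
    truncated at n_max; C1[n] holds the coefficient C_{1n}."""
    total = 0.0 + 0.0j
    for n in range(1, n_max + 1):
        total += C1[n] * sph_jn(n, k1 * r) * lpmv(1, n, np.cos(theta))
    return total
```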
Multiplying (9) by P_l^s(cos θ) cos φ sin θ on both sides, integrating over θ from 0 to π and using the orthogonality properties of the associated Legendre and trigonometric functions [12], we obtain:
C_{1n} j_{11} − C_{2n} j_{21} − D_{2n} y_{21} = 0    (10)

Similarly, we can obtain the following expressions:

(a_{11}/μ_{r1}) C_{1n} − (a_{21}/μ_{r2}) C_{2n} − (b_{21}/μ_{r2}) D_{2n} = 0    (11)

C_{in} j_{i,i} + D_{in} y_{i,i} − C_{i+1,n} j_{i+1,i} − D_{i+1,n} y_{i+1,i} = 0,  i = 2, 3    (12)

(a_{i,i}/μ_{ri}) C_{in} + (b_{ii}/μ_{ri}) D_{in} − (a_{i+1,i}/μ_{r,i+1}) C_{i+1,n} − (b_{i+1,i}/μ_{r,i+1}) D_{i+1,n} = 0,  i = 2, 3    (13)

C_{4n} j_{44} + D_{4n} y_{44} − C_{5n} r_4^n − D_{5n}/r_4^{n+1} = 0    (14)

(a_{44}/μ_{r4}) C_{4n} + (b_{44}/μ_{r4}) D_{4n} − (n+1) r_4^n C_{5n} + (n/r_4^{n+1}) D_{5n} = 0    (15)

r_5^{2n+1} C_{5n} + D_{5n} − D_{6n} = 0    (16)

n r_5^{2n+1} C_{5n} + (n+1) D_{5n} + (n+1) C_{6n} = μ_0 I ρ' r_5^n · (2n+1)/(2n(n+1)) · P_n^1(z'/r_5)    (17)

where j_{ij} = j_n(k_i r_j), a_{ij} = k_i r_j j_{n−1}(k_i r_j) − n j_n(k_i r_j) for i = 1, 2, 3, 4 and j = 1, 2, 3, 4, and y_{ij} = y_n(k_i r_j), b_{ij} = k_i r_j y_{n−1}(k_i r_j) − n y_n(k_i r_j) for i = 2, 3, 4 and j = 1, 2, 3, 4.
These are 10 equations in 10 unknown coefficients; the coefficients can be solved for, and the results are listed in the Appendix.
2.4 The Calculation of the Eddy Current Field and Induced Voltage
When the excitation coil and the sphere model are coaxial, the Coulomb electric field intensity in the model is 0, so the electric field intensity is given by:

E = −∇φ − jωA    (18)

According to the differential form of Ohm's law, the eddy current density is:

J = γE    (19)
The induced voltage can be calculated by the circulation integral of magnetic vector potential along the loop lines.
v = ∮_l A · dl = Σ_{i=1}^{n} A_i · dl_i    (20)
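As a rough numerical sketch of this forward computation: per harmonic n, the 10×10 coefficient system of Eqs. (10)–(17) is solved, and the loop integral of Eq. (20) is approximated over a discretized detection coil. The assembly of the matrix is left abstract here, and all names are illustrative assumptions.

```python
import numpy as np

def solve_coefficients(M, b):
    """Solve the 10x10 linear system M x = b for the coefficients
    (C1n, C2n, D2n, C3n, D3n, C4n, D4n, C5n, D5n, D6n) of one harmonic n.
    M and b are assumed to be assembled elsewhere from Eqs. (10)-(17)."""
    return np.linalg.solve(M, b)

def induced_voltage(A_phi_on_loop, loop_points):
    """Approximate v = loop integral of A . dl (Eq. (20)) by summing A_phi * |dl|
    over a discretized circular detection coil (A is purely azimuthal here)."""
    v = 0.0 + 0.0j
    n_pts = len(loop_points)
    for i in range(n_pts):
        p0 = np.asarray(loop_points[i])
        p1 = np.asarray(loop_points[(i + 1) % n_pts])
        v += A_phi_on_loop[i] * np.linalg.norm(p1 - p0)
    return v
```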
3 Results and Discussions
In the simulation, the amplitude of the excitation current is set to 100 mA and its frequency to 1 MHz; the distance from the excitation coil to the surface of the model is 0.01 m. According to Maxwell's equations, the real part of the induced voltage represents the electromotive force induced by the secondary magnetic field produced by the eddy currents, denoted vr in the following text; the imaginary part represents the electromotive force induced by the main excitation magnetic field, denoted vi in the following text.
3.1 Contour Lines of the Electric Field Intensity
According to the expressions for the electric field intensity, the contour lines are reconstructed. The electric field intensity distribution in the vertical section is drawn in Fig.2.
Fig. 2. Distribution of the electric field intensity in the vertical section
The electric field intensity is 0 along the axis because of the symmetry of the spherical model. The greatest real part of the electric field intensity appears at the point (r = 0.08412, θ = π/10), and its value is 3.44 × 10⁻⁶ V/m.
3.2 Influence of the Frequency on the Induced Voltage
The reconstruction data source of MIT is the induced voltage output by the detection coils, so knowing the characteristics of the induced voltage is very helpful for designing the amplifier circuit; in this part, the induced voltage is simulated. As shown in Fig.3, 16 coils are arranged around the measured area with equal angular spacing.
Fig. 3. Measurement points around the imaging region (θ = π/8)
Fig. 4. Induced voltage curves of different frequency
In order to study the influence of the excitation frequency on the induced voltage, the excitation frequency is increased from 1 MHz to 10 MHz in steps of 1 MHz, and the change of the induced voltage with frequency at point 3 (θ = π/4) is shown in Fig.4. The relationship between the real part of the induced voltage and the excitation frequency is clearly quadratic, while the corresponding relationship for the imaginary part is linear. At 1 MHz the real part of the induced voltage is 2.77 × 10⁻⁵ mV; as the frequency increases to 10 MHz, it increases to 1.55 × 10⁻² mV, which is 599 times larger than the former.
4 Conclusion
This paper presents an analytical VSM algorithm for solving the forward problem of magnetic induction tomography. A 4-layer concentric spherical model of the human head is built and the expression of the magnetic vector potential is obtained. The distribution of the eddy current field is presented as a contour-line map. The simulation results show that there is a quadratic relationship between the real part of the induced voltage and the excitation frequency, and a linear relationship between the imaginary part and the frequency. The method presented in this paper can be used to generate the sensitivity matrix of the MIT inverse problem for regular geometric shapes.
Acknowledgement. This work was supported by The National Natural Science Foundation of China (50877082), and Natural Science Foundation Project of CQ CSTC (CSTC2009BB5204). Scientific Research Foundation of State Key Lab. of Power Transmission Equipment and System Security(2007DA10512709305). The project of scientists and engineers serving enterprise of the Ministry of Science and Technology of China (2009GJF10025).
References 1. Griffiths, H.: Magnetic induction tomography. J. Meas. Sci. Technol. 12, 1126–1131 (2001) 2. Boone, K., Barber, D., Brwon, B., et al.: Imaging with electricity: Review of the European Concerted Action on Impedance Tomography. Journal of Medical Engineering & Technology 21(6), 201–232 (1997) 3. Morris, A., Griffiths, H., Gough, W.: A numerical model for magnetic induction tomography measurements in biological tissues. Physiol. Meas. 22, 113–119 (2001) 4. Merwa, R., Holaus, K., Brandtatter, B., et al.: Numerical solution of the general 3D eddy current problem for magnetic induction tomography (spectroscopy). Physiol. Meas. 24, 545–554 (2003) 5. Liu, G.Q., Wang, T., Meng, M., Wang, H.: Using Edge Element Method to Solve the Forward Problem in Magnetic Induction Tomography. Chinese Journal of Biomedical Engineering 25(2), 163–165 (2006) 6. Hoekema, R.: http://www.mbfys.kun.nl/~geertjan/nfsi/hoekema.pdf 7. http://niremf.ifac.cnr.it/tissprop/ 8. Lei, Y.Z.: Time-harmonic electromagnetic field analytical method, pp. 182–187. Science Press, Beijing (2000) 9. Yu, J.H.: Electromagnetic Theory, p. 164. Chongqing University Press, Chongqing (2003) 10. Lei, Y.Z.: Time-harmonic electromagnetic field analytical method, p. 104. Science Press, Beijing (2000) 11. Liang, K.M.: Methods of Mathematical Physics, pp. 362–364. Higher Education Press, Beijing (1995) 12. Liang, K.M.: Methods of Mathematical Physics, p. 309. Higher Education Press, Beijing (1995)
Appendix

C_{1n} = (μ_{r4} μ_0 I ρ' / 2) · [ (2n+1)(r_5 r_4)^n T_1 T_2 T_3 ] / [ n(n+1) ( (K_n j_{44} μ_{r4} + L_n a_{44}) Z + (K_n y_{44} μ_{r4} + L_n b_{44}) Y ) ] · P_n^1(z'/r_5)    (21)

C_{2n} = [ (b_{21} μ_{r1} j_{11} − a_{11} μ_{r2} y_{21}) / (μ_{r1} (b_{21} j_{21} − a_{21} y_{21})) ] · C_{1n}    (22)

D_{2n} = [ (a_{11} μ_{r2} j_{21} − a_{21} μ_{r1} j_{11}) / (μ_{r1} (b_{21} j_{21} − a_{21} y_{21})) ] · C_{1n}    (23)

C_{3n} = [ (b_{32} μ_{r2} j_{22} − a_{22} μ_{r3} y_{32}) C_{2n} + (b_{32} μ_{r2} y_{22} − b_{22} μ_{r3} y_{32}) D_{2n} ] / [ μ_{r2} (b_{32} j_{32} − a_{32} y_{32}) ] = [ (P_2 P_1 + R_2 Q_1) / (T_1 T_2) ] · C_{1n}    (24)

D_{3n} = [ (a_{22} μ_{r3} j_{32} − a_{32} μ_{r2} j_{22}) C_{2n} + (b_{22} μ_{r3} j_{32} − a_{32} μ_{r2} y_{22}) D_{2n} ] / [ μ_{r2} (b_{32} j_{32} − a_{32} y_{32}) ] = [ (Q_2 P_1 + S_2 Q_1) / (T_1 T_2) ] · C_{1n}    (25)

C_{4n} = [ (b_{43} μ_{r3} j_{33} − a_{33} μ_{r4} y_{43}) C_{3n} + (b_{43} μ_{r3} y_{33} − b_{33} μ_{r4} y_{43}) D_{3n} ] / [ μ_{r3} (b_{43} j_{43} − a_{43} y_{43}) ] = [ (P_3 (P_2 P_1 + R_2 Q_1) + R_3 (Q_2 P_1 + S_2 Q_1)) / (T_1 T_2 T_3) ] · C_{1n}    (26)

D_{4n} = [ (a_{33} μ_{r4} j_{43} − a_{43} μ_{r3} j_{33}) C_{3n} + (b_{33} μ_{r4} j_{43} − a_{43} μ_{r3} y_{33}) D_{3n} ] / [ μ_{r3} (b_{43} j_{43} − a_{43} y_{43}) ] = [ (Q_3 (P_2 P_1 + R_2 Q_1) + S_3 (Q_2 P_1 + S_2 Q_1)) / (T_1 T_2 T_3) ] · C_{1n}    (27)

C_{5n} = [ (n j_{44} μ_{r4} + a_{44}) C_{4n} + (n y_{44} μ_{r4} + b_{44}) D_{4n} ] / [ (2n+1) r_4^n μ_{r4} ] = [ (n j_{44} μ_{r4} + a_{44}) Z + (n y_{44} μ_{r4} + b_{44}) Y ] / [ (2n+1) μ_{r4} r_4^n T_1 T_2 T_3 ] · C_{1n}    (28)

D_{5n} = r_4^{n+1} · [ ((n+1) j_{44} μ_{r4} − a_{44}) C_{4n} + ((n+1) y_{44} μ_{r4} − b_{44}) D_{4n} ] / [ (2n+1) μ_{r4} ] = r_4^{n+1} · [ ((n+1) j_{44} μ_{r4} − a_{44}) Z + ((n+1) y_{44} μ_{r4} − b_{44}) Y ] / [ (2n+1) μ_{r4} T_1 T_2 T_3 ] · C_{1n}    (29)

D_{6n} = r_5^{2n+1} C_{5n} + D_{5n}    (30)

where
K_n = n(2n+1) r_5^{2n+1} + 2(n+1) r_4^{2n+1},  L_n = (2n+1) r_5^{2n+1} − 2(n+1) r_4^{2n+1}
T_i = μ_i ( b_{i+1,i} j_{i+1,i} − a_{i+1,i} y_{i+1,i} )
Z = P_3 (P_2 P_1 + R_2 Q_1) + R_3 (Q_2 P_1 + S_2 Q_1),  Y = Q_3 (P_2 P_1 + R_2 Q_1) + S_3 (Q_2 P_1 + S_2 Q_1)
P_i = d_i − c_i,  Q_i = f_i − e_i,  S_i = g_i − h_i,  R_i = m_i − n_i
c_i = a_{ii} μ_{r,i+1} y_{i+1,i},  d_i = b_{i+1,i} μ_{ri} j_{ii},  e_i = a_{i+1,i} μ_{ri} j_{ii},  f_i = a_{ii} μ_{r,i+1} j_{i+1,i}
g_i = b_{ii} μ_{i+1} j_{i+1,i},  h_i = a_{i+1,i} μ_i y_{ii},  m_i = b_{i+1,i} μ_i y_{ii},  n_i = b_{ii} μ_{i+1} y_{i+1,i}
Total Variation Regularization in Electrocardiographic Mapping Guofa Shou1, Ling Xia1, and Mingfeng Jiang2 1
Department of Biomedical Engineering, Zhejiang University, Hangzhou, P.R. China, 310027
[email protected] 2 The College of Electronics and Informatics, Zhejiang Sci-Tech University, Hangzhou, 310018, P.R. China
Abstract. Electrocardiographic mapping (ECGM) is to estimate the cardiac activities from the measured body surface potentials (BSPs), in which the epicardial potentials (EPs) is often reconstructed. One of the challenges in ECGM problem is its ill-posedness, and regularization techniques are needed to obtain the clinically reasonable solutions. The total variation (TV) method has been validated in keeping the sharp edges and has found some preliminary applications in ECG inverse problem. In this study, we applied and compared two algorithms: lagged diffusivity (LD) fixed point iteration and primal dual-interior point method (PD-IPM), to implement TV regularization method in ECGM problem. With a realistic heart-lung-torso model, the TV methods are tested and compared to the L2-norm regularization methods in zero- and first-order. The simulation results demonstrate that the TV method can generate better EPs compared to the zero-order Tikhonov method. Compared to the first-order Tikhonov method, the TV's results are much sharper. For the two algorithms in TV method, the LD algorithm seems more robust than the PD-IPM in ECGM problem, though the PD-IPM converges faster.
1 Introduction
Electrocardiographic mapping (ECGM) is a widely used method to reconstruct cardiac electrophysiological information from measured or simulated body surface potentials (BSPs). The epicardial potentials (EPs) have been recognized as the main reconstruction target, since they directly reflect the underlying cardiac activity and provide an effective means to localize regional cardiac events [1]. ECGM in terms of the EPs is governed by a Laplace equation with Cauchy boundary conditions [1]
∇ · (σ∇Φ) = 0  in Ω
σ∇Φ · n = 0  on Γ_T
Φ = Φ_T  on Γ_T
Find Φ_E on Γ_E    (1)
where Φ is the quasi-static potential, Φ_E and Φ_T are the potentials on the epicardial surface Γ_E and the body surface Γ_T, which encloses the volume conductor Ω, and σ is the
tissue-dependent conductivity tensor. This boundary value problem can be solved by the boundary element method (BEM), and the final relationship of ΦE and ΦT is a linear system equation as
AΦ E = ΦT
(2)
where A is the transfer coefficient matrix, which depends on the geometry and conductivity of the inhomogeneous volume conductor. Unfortunately the matrix A is found ill-conditioned with the singular values decreasing to zero without significant gap. Therefore, the ECGM problem belongs to the typical 'discrete ill-posed' problem, in which the small measurement errors in the BSPs, or geometry errors in the volume conductor model, leads to large perturbations in the EPs. For the ill-posed problem, the conventional least-squares (LS) error solutions are physically meaningless, and the regularization process is needed for stabilizing the solutions as
min ‖AΦ_E − Φ_T‖₂² + λ C(Φ_E)    (3)
where λ is the regularization parameter and C (Φ E ) is the regularization functional. During the past 40 years, many research efforts have been devoted to explore the suitable regularization techniques to tackle the ill-posedness of ECG inverse solutions. The most common regularization methods are the Tikhonov regularization[2-3], which imposes the constraints on the magnitude or derivatives of the EPs. Besides, some other approaches like level-set method[4], truncated total least square (TTLS)[5] and genetic algorithm [6], are also applied before. In these application, the constraint term is often calculated in L2-norm. It is known that the use of the L2-norm based penalty has a smoothing effect on the solution. Therefore, non-quadratic regularization technique like the total variation (TV) method has been recently applied in the ECG inverse problem and some better performance has been found[7]. However, the implementation of the TV method is complicated due to the non-differentiability of the penalty function with L1-norm. In [7], the "lagged diffusivity" (LD) fixed point iteration method has been used to solve the TV method in ECG inverse problem. While the lagged diffusivity method converges slowly and becomes unstable for the small values of β [8]. Therefore, in this study, we presented the application of the primal dual-interior point method (PD-IPM) into the ECGM TV regularization. The PD-IPM is tested with a realistic heart-torso model and compared with the LD method and the L2-norm regularization method.
2 Methods
2.1 L2-Norm Regularization
L2-norm regularization has been widely used in the ECGM problem; the best-known example is the Tikhonov method [3], in which the functional C(Φ_E) is expressed as
C(Φ_E) = ‖LΦ_E‖₂²    (4)
where L is the regularization matrix, which can be chosen as the identity matrix or as a positive diagonal matrix approximating the first- or second-order differential operators [9]. The normal derivative of the EPs (∂Φ_E/∂n) has also been used as the penalty function, with superior results reported [10]. The L2-norm guarantees that the regularization functional C(Φ_E) is differentiable and easy to implement. The solution of the L2-norm regularization can be written as
Φ_E = (AᵀA + λLᵀL)⁻¹ AᵀΦ_T    (5)
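Eq. (5) can be evaluated directly; a small numpy sketch, with the construction of L and the choice of λ left to the caller:

```python
import numpy as np

def tikhonov(A, phi_T, lam, L=None):
    """Tikhonov solution of A Phi_E = Phi_T (Eq. (5)). L defaults to the identity
    (zero-order); pass a difference operator for higher-order variants."""
    if L is None:
        L = np.eye(A.shape[1])
    lhs = A.T @ A + lam * (L.T @ L)
    rhs = A.T @ phi_T
    return np.linalg.solve(lhs, rhs)
```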
In this study, the zero- and first-order Tikhonov regularizations (ZOT and FOT) are used, as in [7].
2.2 Total Variation Regularization
TV regularization was first introduced by Rudin et al. [11] in the image restoration context in order to reconstruct discontinuous profiles. Compared to the L2-norm penalty (Eq. (4)), the TV functional is still a differential operator, but in an L1 regularization scheme, and it measures the total amplitude of the oscillations of a function. The TV of the EPs is defined as
C(Φ_E) = TV(Φ_E) = ∫_{Γ_E} |∇Φ_E| dΩ    (6)

Since ∂Φ_E/∂n yields superior results, it is also used in the TV regularization here, as in [7]:

TV(Φ_E) = ‖∂Φ_E/∂n‖₁    (7)

In the BEM calculation of Eq. (1), ∂Φ_E/∂n can be directly related to Φ_E and expressed as [7]

∂Φ_E/∂n = DΦ_E    (8)
In the TV regularization, the absolute-value penalty is non-differentiable at points where ∇Φ_E = 0. The numerical implementation of the TV method therefore becomes a nonlinear optimization problem that needs to be properly addressed. A number of different approaches have been developed and applied to the TV regularization problem in image de-noising [11], electrical impedance tomography (EIT) [12] and bioelectric source imaging [7, 13]. Comparing the existing algorithms for TV regularization, the PD-IPM is found to be the most efficient and to have no stability problems [8], while in the ECG inverse problem the LD method has been applied before [7], although the LD method requires smoothing of the TV functional to avoid numerical instability. Therefore, in this study we introduce and compare two algorithms, the PD-IPM and the LD method, in the ECGM problem. In the implementation of both the PD-IPM and the LD method, the non-differentiability of the TV functional must first be removed; this is achieved by introducing a small value β, and TV_β(Φ_E) is defined as

TV_β(Φ_E) = ‖ ( (∂Φ_E/∂n)² + β )^{1/2} ‖₁    (9)

This functional is differentiable for β > 0, and TV_β(Φ_E) → TV(Φ_E) for β → 0. The LD algorithm solves a standard linear problem,

Φ_E^{k+1} = ( AᵀA + λ DᵀW_β(Φ_E^k) D )⁻¹ AᵀΦ_T    (10)

where the diagonal weight matrix W_β(Φ_E^k) is obtained as

W_β(Φ_E^k) = (1/2) diag[ 1 / ( (DΦ_E^k)_i² + β )^{1/2} ]    (11)
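A direct numpy sketch of this fixed-point iteration (Eqs. (10)-(11)); the zero initialization, the iteration cap and the relative-decrease stopping rule are assumptions of this illustration based on the description given later in the paper.

```python
import numpy as np

def tv_lagged_diffusivity(A, phi_T, D, lam, beta=1e-5, n_iter=10, tol=1e-2):
    """Lagged-diffusivity (LD) iteration for the TV-regularized epicardial potentials."""
    phi_E = np.zeros(A.shape[1])
    prev_obj = np.inf
    for _ in range(n_iter):
        w = 0.5 / np.sqrt((D @ phi_E) ** 2 + beta)       # diagonal of W_beta, Eq. (11)
        lhs = A.T @ A + lam * D.T @ (w[:, None] * D)     # A^T A + lam * D^T W_beta D
        phi_E = np.linalg.solve(lhs, A.T @ phi_T)        # Eq. (10)
        obj = np.sum((A @ phi_E - phi_T) ** 2) + lam * np.sum(np.abs(D @ phi_E))
        if abs(prev_obj - obj) < tol * abs(prev_obj):    # relative decrease below 1%
            break
        prev_obj = obj
    return phi_E
```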
To prevent the LD method from converging slowly or becoming unstable for small values of β, the algorithm should be started with a larger value of β that is progressively reduced as the algorithm proceeds. The PD-IPM method for the TV regularization functional was proposed by Chan [14] and is based on the primal-dual theory developed by Andersen [15]. More details about the PD-IPM can be found in [8, 14-15]; only a short description of the PD-IPM in ECGM is presented here. The original TV-regularized inverse problem formulation is labeled the primal problem (P), and since for each i
|D_iΦ_E| = max_{χ_i: |χ_i| ≤ 1} ( χ_i D_iΦ_E )    (12)

where χ is a vector of scalar auxiliary variables. Substituting (12) into (P), a second equivalent formulation as a maximization problem can be obtained as

max_{χ_i: |χ_i| ≤ 1} min_{Φ_E} ‖AΦ_E − Φ_T‖₂² + λ Σ_i χ_i D_iΦ_E    (13)
This problem is called the dual problem (D), and the auxiliary variables χ are labeled dual variables. After a series of operations, the updates of the primal and dual variables, δΦ_E and δχ, are obtained from the solution of the primal-dual problem with the Gauss-Newton method:

[ AᵀA        λDᵀ            ] [ δΦ_E^k ]      [ Aᵀ(AΦ_E^k − Φ_T) + λDᵀχ^k   ]
[ K^k D   −W_β(Φ_E^k)⁻¹     ] [ δχ^k   ]  = − [ DΦ_E^k − W_β(Φ_E^k)⁻¹ χ^k   ]    (14)
with

K^k = diag( 1 − χ_i^k D_iΦ_E^k / ( (D_iΦ_E^k)² + β )^{1/2} )    (15)

From Eq. (14), δΦ_E and δχ can be obtained as
δΦ_E^k = −[ AᵀA + λ DᵀW_β(Φ_E^k) K^k D ]⁻¹ · [ Aᵀ(AΦ_E^k − Φ_T) + λ DᵀW_β(Φ_E^k) DΦ_E^k ]
δχ^k = −χ^k + W_β(Φ_E^k) DΦ_E^k + W_β(Φ_E^k) K^k D δΦ_E^k    (16)
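A schematic implementation of the updates in Eq. (16) is sketched below; the full primal line search (described next) is replaced by a unit step, the dual step-length rule is a simplified version, and all parameter defaults are assumptions of this illustration.

```python
import numpy as np

def tv_pdipm(A, phi_T, D, lam, beta=1e-5, n_iter=3):
    """Schematic PD-IPM iteration for the TV functional, following Eq. (16)."""
    phi = np.zeros(A.shape[1])
    chi = np.zeros(D.shape[0])                         # dual variables, |chi_i| <= 1
    for _ in range(n_iter):
        dphi = D @ phi
        w = 1.0 / np.sqrt(dphi ** 2 + beta)            # diagonal of W_beta
        K = 1.0 - chi * dphi / np.sqrt(dphi ** 2 + beta)
        lhs = A.T @ A + lam * D.T @ ((w * K)[:, None] * D)
        rhs = A.T @ (A @ phi - phi_T) + lam * D.T @ (w * dphi)
        d_phi = -np.linalg.solve(lhs, rhs)             # primal update direction
        d_chi = -chi + w * dphi + w * K * (D @ d_phi)  # dual update direction
        phi = phi + d_phi                              # unit primal step (no line search)
        # dual step-length rule: largest s in (0, 1] keeping |chi_i| <= 1
        s = 1.0
        for ci, di in zip(chi, d_chi):
            if di > 1e-12:
                s = min(s, (1.0 - ci) / di)
            elif di < -1e-12:
                s = min(s, (-1.0 - ci) / di)
        chi = chi + s * d_chi
    return phi
```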
A line search procedure is applied to the primal variables, while a step-length rule is used for the dual variables [8].
2.3 Simulation Protocol
The TV and L2-norm regularization methods were applied to a model-based ECGM problem built on a geometrically realistic thorax model. The realistic thorax model and the dataset of EPs and BSPs had been developed previously from CT scans [16-17]. Fig.1(A) displays the mesh of the inhomogeneous volume conductor model with the ventricle, lung and torso surfaces; the ventricle with its activation times is detailed in Fig.1(B), and the closed epicardial surface is shown in Fig.1(C). There were 360 nodes and 716 elements on the epicardial surface, 297 nodes and 586 elements on the lung, and 412 nodes and 820 elements on the torso surface. The potentials on the lung and torso surfaces were first calculated from a dipole-source-based formulation; the EPs on the closed surface were then calculated from the dipole-source-based equation combined with the calculated potentials [18]. These potentials were used to investigate the regularization methods. The accuracy of the ECGM reconstructions was evaluated quantitatively using the relative error (RE) and the correlation coefficient (CC) between the reconstructed and reference EPs.
Fig. 1. (A) The mesh of the inhomogeneous volume conductor model with the ventricle, lung and torso surfaces, (B) The ventricle with the activation time information, (C) The mesh of the closed epicardial surface
RE = ‖Φ_r − Φ_t‖₂ / ‖Φ_r‖₂
CC = Σ_{i=1}^{n} [ (Φ_r)_i − Φ̄_r ][ (Φ_t)_i − Φ̄_t ] / ( ‖Φ_r − Φ̄_r‖₂ ‖Φ_t − Φ̄_t‖₂ )    (17)
where the subscript "r" refers to reference result and "t" correspondingly to the test result and the superscript "-" refers to the mean value, n is the number of epicardial nodes.
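These two measures translate directly into code; a small numpy sketch:

```python
import numpy as np

def re_cc(phi_ref, phi_test):
    """Relative error and correlation coefficient between reference and test EPs (Eq. (17))."""
    re = np.linalg.norm(phi_ref - phi_test) / np.linalg.norm(phi_ref)
    a = phi_ref - phi_ref.mean()
    b = phi_test - phi_test.mean()
    cc = np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return re, cc
```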
3 Results
In the practical implementation of the L2- and L1-norm regularization methods, the regularization parameter λ was chosen as the optimal one by comparing 20-40 regularized solutions. β was set to 10⁻⁵ and kept fixed in the LD algorithm, as in [7], while it was varied in the PD-IPM algorithm. In our simulation of the L1-norm iterative algorithms, a stopping criterion based on the relative decrease of the objective functional of Eq. (3) was adopted: when the relative decrease of the functional was less than 1%, the iteration stopped. The four methods were used to reconstruct the EPs at the Q, R, S and T wave peak time instants as examples. In the reconstruction, different levels of noise were added to the calculated BSPs to simulate realistic measurements. The RE and CC results are summarized in Table 1, in which the best solution for each case is underlined.

Table 1. RE and CC of the reconstruction results at the Q, R, S and T wave peak time instants in the presence of no, 50dB, 40dB and 30dB noise. The best solutions are underlined. ZOT: zero-order Tikhonov method, FOT: first-order Tikhonov method, TV_LD: total variation method with the LD algorithm, TV_PDIPM: total variation method with the PD-IPM algorithm.
Q
R
S
T
Noise (dB) No 50 40 30 No 50 40 30 No 50 40 30 No 50 40 30
ZOT RE 0.213 0.632 0.691 0.857 0.218 0.659 0.718 0.754 0.212 0.481 0.513 0.547 0.218 0.713 0.725 0.763
FOT CC 0.977 0.774 0.722 0.712 0.954 0.776 0.741 0.702 0.977 0.877 0.858 0.838 0.976 0.741 0.736 0.691
RE 0.014 0.317 0.398 0.829 0.019 0.154 0.230 0.301 0.013 0.155 0.175 0.192 0.019 0.186 0.232 0.282
CC 0.999 0.956 0.918 0.960 0.999 0.988 0.977 0.961 0.999 0.989 0.986 0.983 0.999 0.983 0.976 0.965
TV_LD RE CC 0.027 0.999 0.344 0.940 0.925 0.382 0.936 0.666 0.034 0.999 0.161 0.987 0.264 0.968 0.434 0.910 0.028 0.999 0.168 0.986 0.224 0.975 0.249 0.969 0.033 0.999 0.181 0.984 0.289 0.961 0.360 0.941
TV_PDIPM RE CC 0.042 0.999 0.386 0.922 0.410 0.913 0.551 0.854 0.023 0.999 0.216 0.976 0.312 0.951 0.478 0.887 0.020 0.999 0.185 0.983 0.272 0.962 0.292 0.957 0.025 0.999 0.182 0.984 0.349 0.942 0.420 0.916
It can be seen from Table 1 that most of the best solutions are obtained by the FOT method, while the TV method improves the solutions considerably compared with ZOT. This performance validates the usefulness of the penalty term based on the normal current density. Comparing the TV method with the LD and PD-IPM algorithms, the LD algorithm appears more robust to noise and gives the better solutions. In the computation, the TV method needed fewer than 10 iteration steps, and about 3 for the PD-IPM algorithm.
Fig. 2. Potential distributions at the Q wave peak. (A) BSPs computed from the ventricular model. (B) EPs computed from the ventricular model as the reference values. (C) Inversely computed EPs using the zero-order Tikhonov method. (D) Inversely computed EPs using the first-order Tikhonov method. (E) Inversely computed using the TV method with LD algorithm. (F) Inversely computed using the TV method with PD-IPM algorithm.
Fig.2 depicts the simulated BSPs, the simulated and inversely calculated EPs at the Q wave peak time instant, in which the 40dB noise is added on the simulated BSPs. We can see from Fig.2 that the FOT, TV_LD and TV_PDIPM all captured well the qualitative features of the EPs, while the result of FOT is much smoother than that of TV_LD and TV_PDIPM.
4 Discussion and Conclusion
This study investigated the TV regularization method for the ECGM problem. In the implementation of the TV method, two commonly used algorithms, LD and PD-IPM, were adopted. Using a geometrically realistic volume conductor model with the ventricle, the
TV method was compared with the Tikhonov regularization methods of zero and first order. The simulation results show that the TV regularization methods produce better results than the ZOT method. The FOT method is also effective and even obtains the best RE and CC in most cases, but owing to the nature of the L2-norm, the FOT results are smoother than the TV results (see Fig.2). The LD algorithm appears better suited than the PD-IPM to the ECGM problem, although fewer iteration steps are needed by the latter. Overall, the TV method has been shown to be useful in the ECGM problem, and considering its ability to reconstruct blocky images, the TV method seems well suited to recovering activation times and transmembrane potentials, which will be our future research work. Acknowledgments. This work is supported in part by the 973 National Key Basic Research & Development Program (2010CB732502), China Postdoctoral Science Foundation (20090461376) and Fundamental Research Funds for the Central Universities (KYJD09001).
References 1. Rudy, Y., Messinger-Rapport, B.J.: The Inverse Problem in Electrocardiography: Solutions in Terms of Epicardial Potentials. CRC Crit. Rev. Biomed. Eng. 16(3), 215–268 (1988) 2. Ramanathan, C., Ghanem, R.N., Jia, P., Ryu, K., Rudy, Y.: Noninvasive Electrocardiographic Imaging for Cardiac Electrophysiology and Arrhythmia. Nat. Med. 10(4), 422–428 (2004) 3. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Wiley, New York (1977) 4. Ruud, T.S., Nielsen, B.F., Lysaker, M., Sundnes, J.: A Computationally Efficient Method for Determining the Size and Location of Myocardial Ischemia. IEEE Trans. Biomed. Eng. 56(2), 263–272 (2009) 5. Shou, G., Xia, L., Jiang, M., Wei, Q., Liu, F., Crozier, S.: Truncated Total Least Squares (Ttls): A New Regularization Method for the Solution of Ecg Inverse Problems. IEEE Trans. Biomed. Eng. 55(4), 1327–1335 (2008) 6. Jiang, M., Xia, L., Shou, G., Tang, M.: Combination of the Lsqr Method and a Genetic Algorithm for Solving the Electrocardiography Inverse Problem. Phys. Med. Biol. 52(5), 1277–1294 (2007) 7. Ghosh, S., Rudy, Y.: Application of L1-Norm Regularization to Epicardial Potential Solution of the Inverse Electrocardiography Problem. Ann. Biomed. Eng. 37(5), 902–912 (2009) 8. Borsic, A.: Regularisation Methods for Imaging from Electrical Measurements. PhD thesis, School of Engineering, Oxford Brookes University (2002) 9. Hansen, P.C.: Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. SIAM, Philadelphia (1998) 10. Throne, R.D., Olson, L.G.: A Comparison of Spatial Regularization with Zero and First Order Tikhonov Regularization for the Inverse Problem of Electrocardiography. Comput. in Cardiol. 27, 493–496 (2000) 11. Rudin, L., Osher, S.J., Fatemi, E.: Nonlinear Total Variation Based Noise Removal Algorithms. Physica D 60, 259–268 (1992)
12. Borsic, A., Graham, B.M., Adler, A., Lionheart, W.R.: In Vivo Impedance Imaging with Total Variation Regularization. IEEE Trans. Med. Imaging 29(1), 44–54 (2010) 13. Ding, L., He, B.: Sparse Source Imaging in Electroencephalography with Accurate Field Modeling. Hum. Brain Mapp. 29(9), 1053–1067 (2008) 14. Chan, T., Mulet, P.: Iterative Methods for Total Variation Restoration. UCLA CAM Tech. Rep. 96-38 (1996) 15. Andersen, K.D., Christiansen, E., Conn, A.R., Overton, M.L.: An Efficient Primal-Dual Interior-Point Method for Minimizing a Sum of Euclidean Norms. SIAM J. Sci. Comput. 22(1), 243–262 (2000) 16. Xia, L., Huo, M., Wei, Q., Liu, F., Crozier, S.: Electrodynamic Heart Model Construction and Ecg Simulation. Methods Inf. Med. 45(5), 564–573 (2006) 17. Lu, W., Xia, L.: Computer Simulation of Epicardial Potentials Using a Heart-Torso Model with Realistic Geometry. IEEE Trans. Biomed. Eng. 43(2), 211–217 (1996) 18. Shou, G., Xia, L., Jiang, M., Wei, Q., Liu, F., Crozier, S.: Solving the Ecg Forward Problem by Means of Standard H- and H-Hierachical Adaptive Linear Boundary Element Method: Comparison with Two Refinement Schemes. IEEE Trans. Biomed. Eng. (2009)
The Time-Frequency Analysis of Abnormal ECG Signals Lantian Song and Fengqin Yu Jiangnan University, School of Communication and Control Engineering, 214122 Wuxi, China
[email protected]
Abstract. The ECG (electrocardiogram) signal is an important basis for diagnosing heart diseases, but it is a weak, low-frequency, non-stationary signal that is strongly affected by noise, so neither pure time-domain nor pure frequency-domain methods are suitable for analyzing it. In this article we adopt time-frequency analysis approaches, which represent the signal in the time and frequency domains simultaneously. We use two time-frequency approaches, the Pseudo Wigner–Ville Distribution (PWVD) and the Wigner High-Order Spectra (WHOS), and successfully extract characteristic features from two kinds of abnormal ECG signals, which shows that our methods are effective. Keywords: Time-frequency analysis, PWVD, WHOS, Abnormal ECG signal.
1 Introduction
The ECG signal is a kind of biomedical signal; it is an electrophysiological signal with a periodic character [1]. The waveform of the ECG signal reflects the activity of the heart, and when distortions appear we can identify the disease from them. The ECG signal is a weak, non-stationary signal that is easily influenced by noise; pure time-domain analysis is not sensitive to waveform distortions, which leads to a high false-detection rate, and frequency-domain analysis is based on the hypothesis that the signal is stationary, so time-frequency approaches have to be used. In this article we choose two typical kinds of abnormal ECG signals: the congestive heart failure signal and the ventricular late potential (VLP) signal. We use the basic time-frequency approach PWVD to analyze the congestive heart failure signal; after the transformation an obvious distinction between this abnormal ECG signal and the normal ECG signal can be seen. The ventricular late potential (VLP) is a microvolt-level, high-frequency and irregular component that often appears in the terminal portion of the QRS complex, and its signal-to-noise ratio (SNR) is very low. At present, the VLP is measured by time- or frequency-domain methods. However, the accuracy of the time-domain method is affected by the fixing of the end of the QRS complex, which is severely disturbed by noise, and since the persistence of the VLP
is so short, the frequency resolution of the frequency-domain method is not high; the traditional time- or frequency-domain methods thus suffer from various disadvantages. Recently, the wavelet transform has been successfully applied to the VLP [2]; this approach usually combines an Artificial Neural Network (ANN) to extract properties of the wavelet subbands, with the wavelet transform serving only as preprocessing. If we simply want a high-resolution representation of the VLP, we can consider classical time-frequency approaches such as the WVD or PWVD. The main problem is that both the WVD and the PWVD are sensitive to noise; considering the advantages of the WHOS in low-SNR situations, which we discuss below, we apply the WHOS to this problem. The simulations confirm that both methods are effective.
2 Congestive Heart Failure Signal for the Pseudo-Wigner-Ville Distribution
2.1 PWVD
The Wigner-Ville Distribution (WVD) of a continuous ECG signal s(t) is defined as:
W(t, f) = ∫_{−∞}^{+∞} s(t + τ/2) s*(t − τ/2) e^{−j2πfτ} dτ    (1)
The WVD turns a one-dimensional time-domain signal into a two-dimensional function of both time and frequency. In practical calculations, the Discrete WVD (DWVD) is usually used. The WVD is a bilinear distribution, so it produces cross-terms. To reduce the cross-terms and obtain better time-frequency resolution, a smoothing window is applied to the Wigner kernel, which defines the PWVD:
W_p(t, f) = ∫_{−∞}^{+∞} h(τ) s(t + τ/2) s*(t − τ/2) e^{−j2πfτ} dτ    (2)
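A compact sketch of the discrete PWVD of Eq. (2) is given below; the Hamming smoothing window and the FFT length are illustrative assumptions (in practice the ECG signal would usually first be converted to its analytic form, which is omitted here).

```python
import numpy as np

def pwvd(x, win_len=127, n_fft=256):
    """Discrete pseudo Wigner-Ville distribution of the (complex) signal x."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    L = win_len // 2
    h = np.hamming(win_len)
    W = np.zeros((n_fft, N))
    for n in range(N):
        # windowed instantaneous autocorrelation kernel at time n
        kern = np.zeros(n_fft, dtype=complex)
        mmax = min(L, n, N - 1 - n)
        for m in range(-mmax, mmax + 1):
            kern[m % n_fft] += h[m + L] * x[n + m] * np.conj(x[n - m])
        W[:, n] = np.real(np.fft.fft(kern))   # real-valued for a symmetric kernel
    return W   # rows index frequency bins, columns index time samples
```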
With the PWVD, the cross-terms can be reduced or eliminated through this smoothing in the frequency domain.
2.2 Simulation Experiment
We take two groups of data: one group is a normal ECG signal and the other is a congestive heart failure signal. The data come from the MIT-BIH malignant ventricular arrhythmia database [3] (sampling frequency 360 Hz, sampling precision 11 bits). After computing the PWVD of both groups, and in order to obtain clearer observations, we apply the Hough transform to the PWVD results to obtain polar-coordinate figures that expand the differences between the two kinds of signals. Figure 1 and Figure 2 show the results.
Fig. 1. The normal ECG signal's time morphology and the Hough transform of its PWVD: (a) normal ECG signal's time morphology, (b) Hough transform of the normal ECG signal's PWVD
Fig. 2. The congestive heart failure signal's time morphology and the Hough transform of its PWVD: (a) the congestive heart failure signal's time morphology, (b) Hough transform of the congestive heart failure signal's PWVD
Fig. 2. (continued)
From Fig.1 and Fig.2 we can see that the time morphologies of the two waves are similar, but in the transform domain the differences between the two signals are obvious: the normal ECG signal shows regular pulses, while the congestive heart failure signal contains more spurious components and no regular pattern.
3 The VLP Signal for the High-Order Spectra of the Wigner-Ville Distribution
3.1 High-Order Spectra of the Wigner-Ville Distribution
The Wigner High-Order Spectrum is an extension of the Wigner-Ville Distribution; it keeps the advantages of the Wigner-Ville Distribution while adding those of High-Order Spectra. High-Order Spectra have been widely used for non-Gaussian and non-stationary signals, which makes them well suited to ECG signals, and combined with the Wigner-Ville Distribution the time-frequency characteristics can be obtained at the same time [4]. Studies have shown that under low-SNR conditions the Wigner Bispectrum (the WHOS in the third-order moment domain) performs better than the Wigner-Ville Distribution. The High-Order Spectra of the Wigner-Ville Distribution have been applied to EEG signals [5], and some kinds of ECG signals were also involved. The High-Order Spectrum of the Wigner-Ville Distribution of a signal s(t) is defined as [6]:
W(t, f_1, f_2, ..., f_k) = ∫_{−∞}^{+∞} ⋯ ∫_{−∞}^{+∞} s*( t − (1/(k+1)) Σ_{m=1}^{k} τ_m ) × ∏_{i=1}^{k} s( t + τ_i − (1/(k+1)) Σ_{j=1, j≠i}^{k} τ_j ) exp(−j2π f_i τ_i) dτ_i    (3)
W(t, f_1, f_2, ..., f_k) represents the k-th order Fourier transform of a k-dimensional local moment function. If we define the local function R_k^t(τ_1, ..., τ_k) = s(t − α) ∏_{i=1}^{k} s(t + τ_i − α), where α is the time delay, then:

W_s^k(t, f_1, f_2, ..., f_k) = ∫ R_k^t(τ_1, τ_2, ..., τ_k) ∏_{i=1}^{k} exp(−j2π f_i τ_i) dτ_i    (4)
The Wigner Bispectrum is defined as:

W_s^2(t, f_1, f_2) = ∫∫ s*( t − (1/3)(τ_1 + τ_2) ) s( t + (1/3)(2τ_1 − τ_2) ) s( t + (1/3)(2τ_2 − τ_1) ) exp( −j2π(f_1τ_1 + f_2τ_2) ) dτ_1 dτ_2    (5)
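A brute-force sketch of the diagonal slice (f_1 = f_2) of Eq. (5) in discrete time is shown below; the lags are restricted to multiples of 3 so that the 1/3 shifts land on integer samples, and the lag range and FFT length are simplifying assumptions of this illustration.

```python
import numpy as np

def wigner_bispectrum_diag(s, max_lag=30, n_fft=1024):
    """Diagonal slice (f1 = f2) of the discrete Wigner bispectrum of a real signal s."""
    s = np.asarray(s, dtype=float)
    N = len(s)
    lags = np.arange(-max_lag, max_lag + 1) * 3     # multiples of 3: 1/3 shifts are integers
    freqs = np.arange(n_fft) / n_fft
    W = np.zeros((N, n_fft), dtype=complex)
    for t in range(N):
        g = {}                                      # local 3rd-order moments, grouped by tau1+tau2
        for t1 in lags:
            for t2 in lags:
                i0 = t - (t1 + t2) // 3
                i1 = t + (2 * t1 - t2) // 3
                i2 = t + (2 * t2 - t1) // 3
                if 0 <= i0 < N and 0 <= i1 < N and 0 <= i2 < N:
                    u = t1 + t2                     # the diagonal slice depends only on tau1+tau2
                    g[u] = g.get(u, 0.0) + s[i0] * s[i1] * s[i2]
        for u, val in g.items():
            W[t, :] += val * np.exp(-2j * np.pi * freqs * u)
    return W
```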
In the calculation we only evaluate the diagonal slice f_1 = f_2; to avoid aliasing, the number of FFT points should be at least twice the number of samples of the signal s(t) [7].
3.2 Simulation Experiment
The normal and VLP signals come from the affiliated hospital of XinYang college of vocational and technical education (sampling frequency 1000 Hz, sampling precision 12 bits). There are 30 data records in all, including 12 VLP-positive and 18 VLP-negative records. According to previous studies, the VLP-positive signals are more rugged than the VLP-negative ones and the irregularity appears repeatedly, so in the simulation we choose VLP-positive data (two beats). First, we examine the WVD of the two signals; Fig.3 and Fig.4 show the WVD results.
Fig. 3. The normal ECG signal’s WVD
Fig. 4. The VLP’s WVD
From Fig.3 and Fig.4, comparing the WVD of the normal ECG signal and of the VLP signal, we can only see that the VLP signal has more cross-terms; we cannot observe the locations of the late potentials. For comparison we also examined the PWVD result: its time-frequency resolution is good, but the locations of the late potentials are still not obvious and we cannot judge whether they are late potentials or noise. In the simulation we therefore adopt the Wigner Bispectrum to analyze the late potentials. The High-Order Spectra of the Wigner-Ville Distribution also suffer from cross-terms, like the Wigner-Ville Distribution, so we use the Choi-Williams transform for filtering; the programs can be called directly from Matlab [8]. Fig.5 shows the result of the WHOS, and Fig.6 the result of the WHOS after the Choi-Williams transform (CWHOS).
Fig. 5. The WHOS of VLP
Fig. 6. After Choi-Williams Transform’s WHOS of VLP
From Fig.5 we can see late potentials near time points 100 and 400, but the figure still contains cross-terms. After the Choi-Williams transform, Fig.6 shows that the remaining cross-terms are essentially eliminated, and the time-frequency resolution and concentration are good. The PWVD is an effective approach for analyzing or distinguishing different signals, such as the congestive heart failure signal discussed above, but for the VLP the Wigner Bispectrum is better. The major drawback of this method is that it is time consuming,
but the running time of the Wigner Bispectrum is acceptable; in our simulation it was only a little longer than the PWVD. The WHOS is quite applicable to non-Gaussian signals, especially in low-SNR situations.
4 Summary
The first kind of abnormal ECG signal we studied, the congestive heart failure signal, directly threatens the life of the patient; because its time-domain characteristics are not obvious, wrong test results are often given, and the appropriate time-frequency method enlarges the differences between the abnormal and normal signals. The second kind of signal we studied is the VLP ECG signal; the VLP is also an omen of some kinds of heart disease. In this article we used the WHOS to locate the positions of the VLP, and through the CWHOS we obtained better resolution. For further research we could extract WHOS characteristics such as mean values or standard deviations from VLP signals and combine different classification methods to improve the VLP detection rate; in theory, these effective parameters could also significantly reduce the complexity of the classifier.
References 1. Zhao, J.Q., Huang, H.Y., Zhang, Z.B.: A Kind of Method on Time-Frequency Analysis of Electrocardiosignal. J. Journal of Biomedical Engineering 22, 1049–1051 (2005) 2. Mousa, A., Yilmaz, A.: Neural Network Detection of Ventricular Late Potentials in ECG signals Using Wavelet Transform Extracted Parameters. In: 23rd Annual EMBS International Conference, pp. 1668–1671. IEEE Press, Istanbul (2001) 3. Physiobank Archieve Index, MIT-BIH Arrhythmia Database (1997), http://www.physionet.org/physibank/database 4. Tang, Y., Tang, J.T.: ECG signal base on Wigner-Ville High-order Spectra. J. Journal of University of Electronic Science and Technology of China 36, 143–145 (2007) 5. Chua, K.C., Chandran, V., Acharya, U.R., Lim, C.M.: Automatic Identification of Epilepsy by HOS and Power Spectrum parameters using EEG Signals: A comparative Study. In: 30th Annual International EMBS Conference, pp. 3824–3827. IEEE Press, Vancouver (2008) 6. Fonollosa, J.R.: Wigner higher order moment spectra:definition, properties, computation and application to transiant signal analysis. J. IEEE Trans. on Signal Processing 41, 245– 267 (1993) 7. Alliche, A., Mokrani, K.: Detection of cardiac late potentials using higher order timefrequency. J. Electronics, Circuits and Systems 2, 840–843 (2000) 8. Swami, A.: High-order spectral analysis toolbox. The Mathworks, Inc. (1995) (second Printing)
Dynamic Spectrum and BP Neural Network for Non-invasive Hemoglobin Measurement Huiquan Wang, Gang Li*, Zhe Zhao, and Ling Lin State Key Laboratory of Precision Measurement Technology and Instruments, Tianjin University, 300072, Tianjin, China
[email protected]
Abstract. To minimize and hopefully eliminate the discrepancies among individuals and the complicated measuring conditions in non-invasive hemoglobin measurement by near-infrared spectroscopy, the Dynamic Spectrum (DS) method was applied. Theoretical derivation shows that DS is more accurate than the traditional method for non-invasive hemoglobin measurement. In vivo measurements were carried out on 60 healthy volunteers, and a Back Propagation Neural Network (BP-NN) was used to establish the calibration model of hemoglobin concentration against the DS data, which were preprocessed by some special algorithms. The correlation coefficient between the predicted and the true values was 0.907, which shows that the DS method can be applied as a new approach to non-invasive hemoglobin analysis by near-infrared spectroscopy. Keywords: Dynamic Spectrum, Non-invasive Measurement, Hemoglobin, Artificial Neural Network.
1 Introduction The total hemoglobin concentration is a significant parameter for measuring the total oxygen-carrying capacity of human blood. Clinically, measuring blood hemoglobin is very important for diagnosing anemia, an under-diagnosed, significant public health concern afflicting over 2 billion people worldwide [1]. Total hemoglobin is also used as a parameter to screen blood donors and is routinely monitored during the treatment of patients with hemorrhaging or during vascular and orthopedic surgery where a large amount of blood loss can occur. The current practice of using intermittent, invasive measurements of hemoglobin to help guide transfusion decisions may contribute to unnecessary blood transfusions. Blood transfusion should not simply be based on any particular level of hemoglobin but rather on a thorough evaluation of the patient, including whether hemoglobin levels are stable or changing. To meet these clinical needs, many technological solutions have been proposed to monitor total blood hemoglobin continuously and noninvasively, such as transmission spectroscopy technologies [2,3] and reflectance spectroscopy and imaging technologies [4,5]. *
Corresponding author.
The devices described by Jeon K. J. [6] and J. Kraitl [7] both achieved noninvasive hemoglobin measurement using a five-wavelength light-emitting-diode array as the light source. However, the theory they used is the traditional theory of the pulse oximeter. Pulse oximetry has been around for several decades as a noninvasive technique that provides measurements of heart rate and blood oxygen saturation [8]. But as a technique still under development, the theory on which pulse oximetry is based still has some problems, such as measurement accuracy, and needs to be researched and perfected further [9].
2 Dynamic Spectrum Theory Pulse oximetry is based on near-infrared spectroscopy (NIRS), a noninvasive optical technique for monitoring tissue oxygenation. The individual discrepancy and the measuring conditions that influence the measured spectrum can significantly affect the accuracy of pulse oximetry and of the hemoglobin concentration. The individual discrepancy, which is the most difficult of these problems, refers to the difference in personal features of the measured tissues other than the pulsatile component of the arterial blood, including hairs, horny layer, subcutaneous tissue, muscle, skeleton, etc. [10,11]. It is necessary to eliminate the interference of the individual discrepancy and the measuring conditions that influence the spectrum. As shown in Fig. 1, a finger or another part of the human body may be regarded, under optical interrogation, as three layers: the artery (blood), the vein and the other tissue without blood. The artery may then be divided into two layers, the arterial 'static' and 'pulsed' blood. If the incident light intensity is represented by I0, and the maximum and minimum intensities of the transmitted light are represented by Imax and Imin respectively (Fig. 1(a)), the difference of absorbance between systole and diastole (ΔOD) can be expressed as Eq. (1).

\Delta OD = OD_1 - OD_2 = \ln\left(\frac{I_0}{I_{\min}}\right) - \ln\left(\frac{I_0}{I_{\max}}\right) = \ln\left(\frac{I_{\max}}{I_{\min}}\right)    (1)
ΔOD at each wavelength can be acquired by measuring Imin and Imax of the photoplethysmographic wave; hence the Dynamic Spectrum (the spectrum made from the peak-to-peak values of the photoplethysmographic wave) can be obtained accordingly. DS can be considered as the absorbance spectrum of the arterial 'pulsed' blood (see Fig. 1(b)), which relates to the variation of absorbance caused by systole and diastole, so the information in DS relates only to the components of the arterial blood. By the Lambert-Beer law,

OD^{\lambda} = -\sum_{i=1}^{n} \varepsilon_i^{\lambda} c_i d^{\lambda} = \ln\left(\frac{I_{\max}^{\lambda}}{I_{\min}^{\lambda}}\right) = \ln I_{\max}^{\lambda} - \ln I_{\min}^{\lambda}    (2)
Fig. 1. Basic Principle of DS based on Photoplethysmographic wave
The concentration of hemoglobin in the arterial 'pulsed' blood can be obtained from the DS. Because of the linearity of the Fourier transform, the amplitude of the fundamental wave in the frequency domain is proportional to the amplitude of the original signal in the time domain. Therefore the peak-to-peak value of the logarithmic pulsatile part of the pulse wave in the time domain (ln Imax - ln Imin) can be replaced by the fundamental wave's amplitude X(1) to obtain a high Signal-to-Noise Ratio (SNR), as in Eq. (3) [12].
OD^{\lambda} = kX^{\lambda}(1) = k\,\frac{1}{T}\int_{T} x^{\lambda}(t)\, e^{-j2\pi f_0 t}\, dt    (3)
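As a minimal illustration of Eq. (3) (not the authors' code; the function name, the sampling arguments and the way the fundamental bin is located are assumptions made here), the fundamental-wave amplitude of the logarithmic photoplethysmographic trace could be extracted with an FFT as follows:

```python
import numpy as np

def ds_fundamental_amplitude(intensity, fs, heart_rate_hz):
    """Estimate the dynamic-spectrum value at one wavelength (Eq. 3).

    Instead of reading ln(Imax) - ln(Imin) directly from the pulse wave, the
    amplitude of the fundamental (heart-rate) component of ln(intensity) is
    taken, which is proportional to the peak-to-peak value but has higher SNR.
    """
    log_i = np.log(np.asarray(intensity, dtype=float))
    spectrum = np.fft.rfft(log_i - log_i.mean())
    freqs = np.fft.rfftfreq(len(log_i), d=1.0 / fs)
    k = np.argmin(np.abs(freqs - heart_rate_hz))     # bin of the fundamental wave
    return 2.0 * np.abs(spectrum[k]) / len(log_i)    # single-sided amplitude
```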
3 Experiments and Method 3.1 Experimental Set-Up The spectrum data were obtained in vivo from the volunteers' finger tips. As shown in Fig. 2, the experimental device was composed of the GE6500 spectrometer (Ocean Optics Inc., USA), with a signal-to-noise ratio of 1000:1, a peak quantum efficiency of 90% and a wavelength range from 200 to 1100 nm; a tungsten halogen lamp with a wavelength range from 463 nm to 1356 nm as the light source; and a personal computer to pick up the DS data from the raw data in the range from 600.22 nm to 1000.60 nm according to the absorption spectra of hemoglobin (see Fig. 3).
Fig. 2. Data acquisition system of dynamic spectrum
Fig. 3. Absorption spectra in the NIR regime of oxy-, deoxy-, met-, and carboxyhemoglobin[13]
3.2 Test Objects and Test Process The test objects were physical examinees aged from 19 to 60 at a hospital in Shandong Province, China; all volunteers were male. The volunteers were required to relax for a while before testing to keep their emotion and breathing at ease, and were then asked to put their left forefingers into the finger hose. From the hose an optical fiber was connected to the spectrometer. The contact pressure between the finger and the hose was kept basically stable during the detection, and the data were transmitted to the PC for further processing of DS acquisition. For each volunteer the data acquisition lasted 60 s with an integration time of 50 ms. Blood was then drawn from the volunteers for biochemical detection to obtain the hemoglobin concentration at that moment. 3.3 Back Propagation Neural Network The BP-NN is inspired by the function of the human brain and is used to represent a nonlinear mapping between input and output vectors; with its strong function-approximation capability it has been used in many fields, including near-infrared analysis technology [14, 15]. A standard BP neural model consists of three or more layers: an input layer, one or more hidden layers and an output layer. In this paper, a four-layer BP-NN was used to establish a calibration model between the DS data and the hemoglobin values. The DS data were taken as the input of the network. The spectrometer's wavelength resolution is approximately 0.81 nm, so the input layer had 528 nodes. The hemoglobin level was taken as the output of the network, so the output layer had one node. The number of neurons in the hidden layers was selected by trial and error. 45 randomly selected samples were gathered as the calibration set and the other 15 as the prediction set.
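A minimal sketch of such a calibration model is given below; this is not the authors' implementation: the file names, the hidden-layer sizes and the use of scikit-learn's MLPRegressor are illustrative assumptions (the paper's own network was a four-layer BP-NN whose hidden sizes were chosen by trial and error).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Hypothetical data: one 528-point dynamic spectrum per volunteer and the
# biochemically measured hemoglobin concentration as the target.
X = np.load("ds_spectra.npy")        # shape (60, 528), assumed file
y = np.load("hemoglobin.npy")        # shape (60,), assumed file

X = StandardScaler().fit_transform(X)

rng = np.random.default_rng(0)
idx = rng.permutation(len(y))
train, test = idx[:45], idx[45:]     # 45 calibration / 15 prediction samples

model = MLPRegressor(hidden_layer_sizes=(30, 10), activation="tanh",
                     solver="lbfgs", max_iter=5000, random_state=0)
model.fit(X[train], y[train])

pred = model.predict(X[test])
r = np.corrcoef(pred, y[test])[0, 1]
rmsep = np.sqrt(np.mean((pred - y[test]) ** 2))
print(f"R = {r:.3f}, RMSEP = {rmsep:.2f}")
```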
4 Results The BP-NN has a powerful nonlinear fitting ability. The prediction result is shown in Fig. 4, with a correlation coefficient R = 0.907, demonstrating a strong correlation.
The hemoglobin data range is from 142 to 169 mmol/L. The Root-Mean-Square Error of Prediction (RMSEP) is 3.34 mmol/L. The maximum relative error is 4.7%, falling within the acceptable range for clinical application.
Fig. 4. Prediction results for the 15 unknown samples from the BP model (predicted vs. measured hemoglobin value in mmol/L; n = 15, R² = 0.8233, RMSEP = 3.34, Bias = 2.87, y = 1.0866x − 12.497)
The prediction result is satisfactory and reveals that the non-invasive measurement based on the DS theory is able to forecast the concentration of hemoglobin in the blood.
5 Error Discussion Other groups researching noninvasive hemoglobin measurement, such as the groups of Jeon K. J. and J. Kraitl, use the ratio between the AC and DC components as a parameter, as in the pulse oximeter, because the coefficient relating the intensities of two lights with different wavelengths can be obtained from measurement of the AC and DC components (see Eq. (4)).

\frac{\Delta OD_{\lambda 1}}{\Delta OD_{\lambda 2}} = \frac{\log\dfrac{DC_{\lambda 1} - AC_{\lambda 1}}{DC_{\lambda 1}}}{\log\dfrac{DC_{\lambda 2} - AC_{\lambda 2}}{DC_{\lambda 2}}} = \frac{\log\left(1 - \dfrac{AC_{\lambda 1}}{DC_{\lambda 1}}\right)}{\log\left(1 - \dfrac{AC_{\lambda 2}}{DC_{\lambda 2}}\right)}    (4)
In Eq. (4), AC_{\lambda 1}/DC_{\lambda 1} \ll 1 and AC_{\lambda 2}/DC_{\lambda 2} \ll 1, so it can be transformed into Eq. (5) by applying a Maclaurin expansion.

\frac{\Delta OD_{\lambda 1}}{\Delta OD_{\lambda 2}} \approx \frac{-\dfrac{AC_{\lambda 1}}{DC_{\lambda 1}} - o\!\left(\dfrac{AC_{\lambda 1}}{DC_{\lambda 1}}\right)}{-\dfrac{AC_{\lambda 2}}{DC_{\lambda 2}} - o\!\left(\dfrac{AC_{\lambda 2}}{DC_{\lambda 2}}\right)} \approx \frac{AC_{\lambda 1}/DC_{\lambda 1}}{AC_{\lambda 2}/DC_{\lambda 2}} = \frac{R_{\lambda 1}}{R_{\lambda 2}} \equiv R_{\lambda 1, \lambda 2}    (5)
where R is defined as the ratio of AC to DC, and o(AC_{\lambda 1}/DC_{\lambda 1}) and o(AC_{\lambda 2}/DC_{\lambda 2}) are the infinitesimals of the variables AC_{\lambda 1}/DC_{\lambda 1} and AC_{\lambda 2}/DC_{\lambda 2}.
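The following small numerical check (an illustration added here, not taken from the paper) shows how the Maclaurin approximation behind Eq. (5) behaves for AC/DC ratios of the size quoted below:

```python
import numpy as np

# For small x = AC/DC, -log(1 - x) is close to x; the relative error is roughly
# x/2, i.e. about 0.5-1 % for the 1-2 % ratios seen clinically.
for ratio in (0.01, 0.02, 0.05):
    exact = -np.log(1.0 - ratio)
    rel_err = (exact - ratio) / exact
    print(f"AC/DC = {ratio:.2f}: -log(1-x) = {exact:.5f}, rel. error of x = {rel_err:.2%}")
```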
The detected photoelectric pulse signal has to be amplified and pre-processed (e.g. digital filtering and averaging), which may change the original information in the raw data. From the derivation of Eq. (5), the hemoglobin measurement error sources that must be carefully taken into account are as follows. In clinical experience the ratio of the AC to the DC component is typically 1 to 2%; hence the numerical value calculated by Eq. (5) is of the order of 10^{-2}. Since Eq. (5) is obtained approximately under the conditions o(AC_{\lambda 1}/DC_{\lambda 1}) \ll 1 and o(AC_{\lambda 2}/DC_{\lambda 2}) \ll 1, the smaller the numerical value of AC/DC, the more accurate the measurement. In practice the AC component cannot be measured exactly, and the ratio of AC to DC can introduce errors; therefore high-precision results are difficult to obtain under different perfusion conditions. DC is influenced by the measuring conditions (incident light intensity, probe, etc.) and by individual discrepancy, which directly produces major errors. In order to compare the accuracy of the above measurement with that of traditional pulse oximetry, Eq. (1) is transformed and expanded by a Maclaurin expansion as follows.
\Delta A = \log\left(\frac{I_{\max} - I_{\min} + I_{\min}}{I_{\min}}\right) = \log\left(\frac{\Delta I}{I_{\min}} + 1\right) = \frac{\Delta I}{I_{\min}} - \frac{1}{2}\left(\frac{\Delta I}{I_{\min}}\right)^2 + \frac{1}{3}\left(\frac{\Delta I}{I_{\min}}\right)^3 + \cdots    (6)
where ΔI = I_{\max} - I_{\min}. Comparing Eq. (6) with Eq. (4), it can be seen that the two equations are approximately the same, except that I_{\min} (the minimum light intensity) is used in Eq. (6) instead of the DC (direct-current or average light intensity) of Eq. (4). The modified Beer-Lambert law is able to describe biological tissues much more accurately. The photoelectric pulse wave is not a simple waveform like a triangular or sine wave whose minimum-to-average amplitude ratio is constant, so traditional pulse oximetry produces relatively large errors because of the different photoelectric pulse waveforms and perfusion states of different subjects. Pulse oximetry based on DS, however, is different and does not produce the errors discussed above. In general, as a novel measuring method, hemoglobin measurement based on DS is much more accurate than the traditional method.
6 Conclusion In this paper, a high-precision hemoglobin measuring method based on DS with a BP-NN was evaluated through experiments and theoretical derivation. The DS data of
60 volunteers were obtained after unique data processing of the raw data acquired in vivo. BP-NN models were established and trained to map the relationship between the DS data and the hemoglobin concentration. The results suggest that hemoglobin can be derived noninvasively by this approach. Furthermore, the accuracy of the novel method is more satisfactory than that of the traditional pulse-oximeter approach, which is easily influenced by various factors such as the measuring principle, the measuring conditions and individual discrepancy. Further work is continuing to fully determine the performance of the instrument and to assess the real potential of the DS method. Acknowledgements. This work was supported by the National Natural Science Foundation of China (No. 60174032, 60674111).
References 1. John, W.M., Gregory, D.J., Selim, S., Gregory, C.: Noninvasive Optical, Electrical, and Acoustic Methods of Total Hemoglobin Determination. Clin. Chem. 54, 264–272 (2008) 2. Mark, R.M., Suzanne, N., Kimball-Jones, P.L., Richard, L.A., Robert, D.M., Martin, W.A.: Noninvasive measurement of continuous hemoglobin concentration via pulse COOximetry. CHEST 132, 493–494 (2007) 3. Kanashima, H., Yamane, T., Takubo, T., Kamitani, T., Hino, M.: Evaluation of noninvasive hemoglobin monitoring for hematological disorders. J. Clin. Lab. Anal. 19(1), 1–5 (2005) 4. Iftimia, N.V., Hammer, D.X., Bigelow, C.E., Rosen, D.I., Ustun, T., Ferrante, A.A., Vu, D., Ferguson, R.D.: Toward noninvasive measurement of blood hematocrit using spectral domain low coherence interferometry and retinal tracking. Opt. Express. 14(8), 3377–3388 (2006) 5. Jay, G.D., Racht, J., McMurdy, J.W., Mathews, Z., Hughes, A., Suner, S., Crawford, G.P.: Point-of-care noninvasive hemoglobin determination using fiber optic reflectance spectroscopy. Proceedings of IEEE: Engineering in Medicine and Biology Society 1, 2932– 2935 (2007) 6. Jeon, K.J., Kim, S.J., Park, K.K., Kim, J.W., Yoon, G.: Noninvasive total hemoglobin measurement. J. Biomed. Opt. 7, 45–50 (2002) 7. Kraitl, J., Ewald, H., Gehring, H.: An optical device to measure blood components by a photoplethysmographic method. J. Opt. A-Pure. Appl. Op. 7, S318–S324 (2005) 8. Salyer, J.W.: Neonatal and pediatric pulse oximetry. Respire Care 48, 386–396 (2003) 9. Strieoei, H.W., Kretz, F.J.: The functional principle, reliability and limitations of pulse oximetry. Anaesthesia 38(12), 649–657 (1989) 10. Gang, L., Yuliang, L., Ling, L., Xiaoxia, L., Yan, W.: Dynamic Spectroscopy for Noninvasive Measurement of Blood Compositions. In: 3rd International Symposium on Instrumentation Science and Technology, Xi’an, China, pp. 875–880 (2004) 11. Liu, Y.H., Wang, Z.Q., Li, G.: An in vivo acquisition device for near infrared blood spectra. In: 4TH IEEE/EMBS international summer school and symposium on medical device and biosensors, pp. 73–76. IEEE Press, Cambridge (2007) 12. Gang, L., Qiu-Xia, L., Ling, L., Xiao-Xia, L., Yan, W., Yu-Liang, L.: Discussion about the prediction accuracy for dynamic spectrum by partial FFT. Spectrosc. Spect. Anal. 26, 2177–2180 (2006)
13. Webster, J.G.: Design of Pulse Oximieters. Institute of Physics Publishing, Bristol (1997) 14. Li, C.W., Hu, R.Q.: PID control based on BP neural network for the regulation of blood glucose level in diabetes. In: 7th IEEE International Conference on Bioinformatics and Bioengineering, Boston, pp. 1168–1172. IEEE Press, Cambridge (2007) 15. Khanmohammadi, M., Garmarudi, A.B., Ghasemi, K., Garrigues, S., de la Guardia, M.: Artificial neural network for quantitative determination of total protein in yogurt by infrared spectrometry. Microchem. J. 9, 47–52 (2009)
Study on Real-Time Control of Exoskeleton Knee Using Electromyographic Signal Jiaxin Jiang, Zhen Zhang, Zhen Wang, and Jinwu Qian School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 20072, P.R. China
Abstract. This paper is concerned with a method for real-time control of an exoskeleton knee using the electromyographic signal (EMGs). EMGs are collected from normal subjects while they perform knee flexion-extension in the sagittal plane. The raw EMGs are processed and then input to a four-layer feed-forward neural network model trained with the back-propagation algorithm. The output signal of the neural network is post-processed by the wavelet transform. Finally the control commands are passed to the motor controller, which drives the exoskeleton knee to move in the same way. In this study, the correlation coefficient is used to evaluate the neural network prediction. The experimental results show that the proposed method can accurately control the movement of the knee joint.
Keywords: EMGs, Knee Joint Angle, Neural Network, Exoskeleton.
1 Introduction EMGs are the electrical signals that accompany the generation of muscular force; they are produced by muscle fiber contraction and are the superposition of many motor units in time and space. Surface EMGs are biological signals picked up and recorded from the skin surface through electrodes. Since EMGs were first discovered in the 18th century, in-depth research on the neuromuscular system and the development of EMG detection and signal processing technology have led to increasingly extensive applications. In recent years, extensive research efforts have been made in clinical medicine [1], sports medicine [2], biomedical engineering [3] and other fields. Exoskeleton systems offer a wide range of possible applications, for example assisting patients during rehabilitation or giving force support. Since numerous applications for exoskeletons can be proposed, many groups have shown interest in this topic. It is important for an exoskeleton used as a walking aid to be able to provide help in accordance with the intended motion of the walker; otherwise it cannot assist walking and may even be harmful to the walker. Surface EMGs, as biological signals, reflect internal changes at the skin surface, and therefore more and more attention has turned to using surface EMGs to control exoskeletons [4-7]. However, using EMGs for real-time feedback control of a robot exoskeleton has not been studied deeply. The main reason is that it is difficult to map the EMGs onto the force produced by a muscle. Research on the effect of using EMGs for
real-time feedback control of a robot exoskeleton can help make the patients' behavior during rehabilitation exercise active rather than purely passive and mechanical; in this way the EMGs and the rehabilitation exercise of the patients can be made consistent. In our group, a large amount of work has been done on the human ankle [8-9]. The identification of the motion state of the human knee joint based on EMGs is of great significance; this identification method can be used for bio-mimetic robots, rehabilitation robots and artificial prostheses. In this paper, a method for controlling the exoskeleton knee using EMGs is introduced. A four-layer feed-forward neural network model is constructed, and the EMGs from the five muscles which control the knee movement are collected and input into the neural network. Firstly, the neural network is trained with a large amount of collected data. Secondly, a part of the data is selected randomly as the testing sample and input into the trained neural network model. Thirdly, the predicted knee angle is output from the model. Fourthly, the correlation coefficient is used to evaluate the neural network. Finally, the output signal is delivered as the control signal to the motor controller, which drives the exoskeleton knee to move in the same way.
2 Knee Joint 2.1 Exoskeleton Knee Mechanism The exoskeleton knee is part of a powered gait orthosis being developed to move the legs of a patient in a physiological way on a moving treadmill. The exoskeleton knee mechanism is shown in Fig. 1. It is composed of two connecting rods, bearings, a lead screw and a motor. A linear series elastic actuator with a precision ball screw is connected between a torque and the knee.
Fig. 1. Mechanism of exoskeleton knee
2.2 Surface EMGs of Knee Joint There are many muscles in the human lower limb, and different muscles have different effects and roles in the gait movements. The biceps femoris (BF), semitendinosus (SEM), vastus medialis (VMO), rectus femoris (RF) and vastus lateralis (VLO) are the primary muscles of the knee joint, and they produce strong surface EMGs which can be detected and analyzed easily.
Fig. 2. Primary muscles of knee joint control
3 Control Method Based on Back-Propagation Neural Network 3.1 Principles and Method Fig. 3 shows the system for motion state identification of the human knee joint based on EMGs. When the subject performs regular knee flexion movements, the surface EMGs of the lower limb can be collected.
Fig. 3. System diagram (lower-limb EMG acquisition, preprocessing and feature extraction, normalized EMGs and normalized knee angle, NN training and prediction, knee joint identification)
The raw EMGs are processed by rectifying, filtering, smoothing and feature extraction. The input signal of the neural network model is the extracted EMGs and the output signal is the knee angle. The identification model between EMGs and knee angle is established, and error back-propagation is used as the training algorithm.

3.2 Surface EMGs Detection and Feature Extraction
Surface EMGs can be collected by surface electrodes or needle electrodes. Surface electrodes have the merit of being non-invasive and do not cause secondary damage to the patient. In this study, disposable silver/silver-chloride electrodes are used as the detection electrodes. Untreated surface EMGs are very complex and cannot be used directly for control; therefore the raw surface EMGs should be processed. Fig. 4 shows the raw and pretreated surface EMGs of the vastus lateralis, and Fig. 5 shows the raw and pretreated surface EMGs of the rectus femoris.
Fig. 4. The raw surface EMGs and pretreated surface EMGs in Vastus lateralis
Fig. 5. The raw surface EMGs and pretreated surface EMGs in Rectus femoris
Then the RMS of the signals is calculated, which at the same time removes noise; its waveform is similar to the linear envelope of the EMGs. The RMS, which is linearly related to muscle power, is preferred here over the linear envelope of the EMGs and is often used to control the threshold of myoelectric prostheses. The RMS can be computed as

\sigma(k_0) = \left[\frac{1}{N+1}\sum_{k=k_0-N/2}^{k_0+N/2} y^2(k)\right]^{1/2}    (1)
where y(k) is the surface EMG signal, and σ(k0) the RMS of the surface EMG signal.
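A minimal sketch of this moving-window RMS (Eq. (1)) is given below; the window length and the use of convolution are illustrative choices, not the authors' implementation:

```python
import numpy as np

def moving_rms(emg, win=151):
    """Moving-window RMS of a surface EMG signal (Eq. 1).
    win is the window length N+1 in samples (e.g. roughly 100 ms at 1500 Hz)."""
    emg = np.asarray(emg, dtype=float)
    kernel = np.ones(win) / win
    return np.sqrt(np.convolve(emg ** 2, kernel, mode="same"))
```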
3.3 Back-Propagation (BP) Neural Network
Surface EMG signals have non-linear characteristics, so it is difficult to establish a mapping model between surface EMG signals and knee joint angles with a mathematical equation. Artificial neural networks, however, have merits such as good adaptive learning, fault tolerance and non-linear mapping capability [10]. Therefore an artificial neural network is used here to establish the mapping between surface EMG signals and knee joint angles. In this paper, a four-layer feed-forward neural network is established to predict the knee angle from EMGs, and the back-propagation algorithm is used for training. In the network, the EMGs are the input signal and the knee joint angle is the output signal, so there are five neurons in the input layer and one neuron in the output layer. As can be seen in Fig. 6, the mapping between the surface EMGs and the joint angle is many-to-one. The choice of the number of hidden neurons depends mainly on operator experience and repeated experiments rather than on a specific formula; in this study the first hidden layer has 23 neurons and the second hidden layer has 13 neurons after repeated training. The neural network structure is shown in Fig. 6, and a minimal code sketch of this structure follows Fig. 6. The three transfer functions used are the tansig, tansig and purelin functions respectively.
Fig. 6. Neural network model (input layer: EMGs from BF, RF, SEM, VMO and VLO; first hidden layer; second hidden layer; output layer: angle of knee joint)
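The following is a minimal sketch of the 5-23-13-1 structure described above. PyTorch is used here purely for illustration; the original work used MATLAB-style tansig/purelin transfer functions, which correspond to tanh and linear units.

```python
import torch
import torch.nn as nn

# 5 EMG features -> 23 -> 13 -> 1 knee angle; tanh ("tansig") hidden units
# and a linear ("purelin") output, trained by error back-propagation.
model = nn.Sequential(
    nn.Linear(5, 23), nn.Tanh(),
    nn.Linear(23, 13), nn.Tanh(),
    nn.Linear(13, 1),
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # learning rate 0.1 as in Sect. 4

def train_step(emg_batch, angle_batch):
    """One back-propagation step on a batch of normalized EMG features."""
    optimizer.zero_grad()
    loss = loss_fn(model(emg_batch), angle_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```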
3.4 Neural Network Output Data Post-processing
The output signal predicted by the neural network usually contains noise, which seriously affects the forecast result, so the wavelet transform is chosen to reduce it. In this study, the noisy signal is decomposed by a six-level wavelet decomposition and the original signal is reconstructed from the low-frequency (approximation) coefficients, which reduces the high-frequency noise contained in the raw signal. The output signal of the neural network is decomposed with the six-level wavelet decomposition
using the wavedec function in this study; the wrcoef function is used for signal reconstruction, and the wavelet function is sym8.
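An equivalent post-processing step could be sketched with the PyWavelets package as follows (this mirrors the MATLAB wavedec/wrcoef calls named above; it is an illustration, not the authors' code):

```python
import numpy as np
import pywt

def smooth_prediction(angle_pred, wavelet="sym8", level=6):
    """Six-level wavelet decomposition of the predicted knee angle and
    reconstruction from the approximation (low-frequency) coefficients only."""
    coeffs = pywt.wavedec(angle_pred, wavelet, level=level)
    coeffs = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]  # drop detail coefficients
    smoothed = pywt.waverec(coeffs, wavelet)
    return smoothed[: len(angle_pred)]   # waverec may return one extra sample
```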
4 Experiment Eight subjects (4 males, 4 females) participated in the experiment, aged from 22 to 28 years (mean 25). None of the subjects had any previous knee problems, had undergone athletic training aimed at strengthening the knee, or had done strenuous exercise in the 24 hours before the experiment. As temperature affects the detection of EMGs, the laboratory temperature was kept constant at 23 °C during the experiment. Before testing, the skin of the subjects was cleaned to reduce the impedance. In this study, the surface EMGs of BF, SEM, VMO, RF and VLO were collected. Two surface electrodes were affixed to each muscle along the muscle; the electrode diameter is 10 mm and the spacing between the two electrodes is 20 mm. In the experiments, the right leg of the subject stood on the ground and remained still while the left leg performed regular knee flexion-extension movements. The EMGs of the five muscles and the knee angle of the subjects were collected; the acquisition time was 60 s, and each subject repeated the same experiment four times with a break of 10 minutes between two collections. The sampling frequency was 1500 Hz for the EMGs and 100 Hz for the knee angle. Because the units of the acquired data differ, it was necessary to normalize the data. After the surface EMG data and joint angle data were normalized, part of the data was used for training and another part for prediction. The training goal of the neural network was 0.005 and the learning rate was 0.1. The number of training epochs was usually less than 100 when the neural network converged to the target value. The predicted data were processed by the six-level wavelet decomposition and the signals were reconstructed.
5 Results In this study, the correlation coefficient is used to evaluate the neural network prediction; it is computed between the predicted and the actually measured values of the knee angle. For each subject, four sets of data are used to train the neural network and then to predict the knee joint movement. Table 1 shows the correlation coefficients between the predicted and measured values for the subjects. As can be seen from the table, the correlation coefficient lies in the range 0.9352-0.9911 and the average of the correlation coefficients is 0.9628, which is very close to 1. The prediction results are thus close to the real measured values, and the experimental results show that the neural network model can accurately predict the knee motion state. Because the artificial neural network model has good nonlinear mapping ability, it is widely used in motion recognition and pattern recognition, for example identifying the motion state of the human elbow joint from upper-limb EMGs [11], estimating the knee joint angle from surface EMGs instead of a goniometer [12], and forecasting the index finger angle from the sEMG of the posterior side of the forearm [13]. These results show that neural networks can accurately distinguish the motion state, which means that predicting the knee angle with a neural network is feasible.
The experimental results show that this study succeeds in using the thigh muscles to predict the knee motion. Fig. 7 shows the prediction results and the real measured values, after post-processing, for subject LL's data. Table 1. The correlation coefficient between predicted and measured values
Subject    XLL     CTT     JJX     PLL     YJN     DSY     WZ      ZHY
1          0.9563  0.9352  0.9545  0.9911  0.9652  0.9640  0.9556  0.9818
2          0.9781  0.9430  0.9621  0.9876  0.9786  0.9789  0.9632  0.9756
3          0.9331  0.9389  0.9485  0.9568  0.9685  0.9632  0.9761  0.9823
4          0.9562  0.9512  0.9515  0.9445  0.9683  0.9673  0.9498  0.9815
Average    0.9559  0.9421  0.9542  0.9700  0.9702  0.9684  0.9612  0.9803
Fig. 7. Comparison curves of the measured joint angle and predicted joint angle (knee joint angle in degrees vs. time in seconds; curves: Measure, Predict)
6 Discussions Because of their non-invasive character, surface electrodes are widely used in kinesiological studies. Compared with fine-wire and needle electrodes, however, they pick up interference from the skin, and the deeper muscles covered by superficial muscles or bones
cannot be detected by surface electrodes; thus the recognition of the motion state is affected. The mapping relationship between EMGs and joint angle differs between the data sets of different subjects, because many factors affect the EMGs, including psychological status, physical condition and electrode placement. Therefore the EMG strength differs between people and so does the neural network model; even for the same subject there is no single fixed neural network model. This makes real-time control of the exoskeleton difficult. Acknowledgments. This work was jointly sponsored by the National Natural Science Foundation (Grant No. 50975165); the Subject Leaders Project in Shanghai (Grant No. 10XD1401900); the Shanghai Natural Science Fund (Grant No. 01ZR1411500); the Shanghai Education Research and Innovation Project (Grant No. 10YZ17); and the Graduate Innovation Fund of Shanghai University.
References 1. Pullman, S.L., Goodin, D.S., Marquinez, A.I.: Clinical utility of surface EMG. J. Neurology 55, 71–177 (2000) 2. Hakkinen, K., Kallinen, M., Izquierdo, M.: Changes in agonist-antagonist EMG, muscle CSA, and force during strength training in middle-aged and older people. J. Appl. Physiol. 84, 84–1341 (1998) 3. Magnusson, S.P., Simonsen, E.B., Dyhre-Poulsen, P.: Viscoelastic stress relaxation during static stretch in human skeletal muscle in the absence of EMG activity. J. Med. Sci. Sports. 6, 323–328 (1996) 4. Artemiadis, P.K., Kyriakopoulos, K.J.: EMG-based Teleoperation of a Robot Arm in Planar Catching Movements using ARMAX Model and Trajectory Monitoring Techniques. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 489–495 (2007) 5. Burridge, J., Freeman, C., Hughes, A.M.: Relationship between changes in tracking performance and timing and amplitude of biceps and triceps EMG following training in a planar arm robot in a sample of people with post-stroke hemiplegia. In: The XVIII Congress of the International Society of Electrophysiology and Kinesiology, Aalborg, Denmark, June 16-19 6. Fleischer, C., Wege, A., Kondak, K.: Application of EMG signals for controlling exoskeleton robots. J. Biomedizinische Technik 51, 314–319 (2006) 7. Lloyd, D.G., Besier, T.F.: An EMG-driven musculoskeletal model to estimate muscle forces and knee joint moments in vivo. J. Biomechanics 36, 765–776 (2003) 8. Zhang, Z., Yao, S.L., Zhang, Y.N.: On the surface electromyography sensor network of human ankle movement. In: IEEE International Conference on Robotics and Biomimetics, Shanghai, pp. 1688–1692 (2007) 9. Yao, S.L., Zhang, Y.N., Zhang, Z.: Controlling Ankle Movement in Neuro-rehabilitation Using Selective Surface Electromyography Signals. J. Journal of Shanghai University 15 (2009) 10. Hiraiwa, A., Shimohara, K., Tokunaga, Y.: EMG pattern analysis and classification by neural network. In: Proc. IEEE Int. Conf. Systems, pp. 1113–1115 (1989)
11. Li, X.F., Yang, J.J., Shi, Y.: Motion state identification of human elbow joint based on EMGs. J. Chinese Journal of Biomedical Engineering 24 (2005) 12. Kawainot, H., Siiwoong Lee, O., Kanbe, S.: Power Assist Method for HAL-3 using EMGbased Feedback Controller. In: Proceedings of the IEEE International Conference on Systems, Washington, pp. 1648—1623 (2003) 13. Shrirao, N.A., Reddy, N.P., Kosuri, D.R.: Neural network committees for finger joint angle estimation from surface EMG signals. J. BioMedical Engineering OnLine
Characterization of Cerebral Infarction in Multiple Channel EEG Recordings Based on Quantifications of Time-Frequency Representation Li Zhang*, Chuanhong He, and Wei He State Key Laboratory of Power Transmission Equipment & System Security and New Technology, College of Electrical Engineering, Chongqing University, Chongqing, China
[email protected]
Abstract. In this paper, a method for characterizing cerebral infarction (CI) from the spontaneous electroencephalogram (EEG) is described. We obtained the time-frequency representations (TFRs) of EEG signals recorded from both normal subjects and CI patients. The corresponding characteristics were depicted by the relative frequency band energy (RFBE) and the Shannon entropy (SE) of the TFR. Compared with the normal subjects, the CI patients showed the following EEG changes: (1) delta and theta rhythms were attenuated while beta and gamma rhythms were enhanced, and the changes of delta and beta were more significant; (2) alpha was also blocked with eyes open, although the blocking action was less evident; (3) the SE increase was pronounced. Consequently, the quantitative EEG methods are promising tools to provide helpful and sensitive information for the detection and diagnosis of CI. Keywords: electroencephalogram (EEG), cerebral infarction, time-frequency representation, relative frequency band energy, entropy.
1 Introduction Cerebral infarction (CI) is a common brain disorder in China with a continuing rising trend. The electroencephalogram (EEG) is a potential tool for the identification of cerebral injury and the management of patients with neurological trauma in intensive care units [1]. Changes in EEG activity occur in temporal relation to triggering events and correspond to transitions from disordered to ordered states, or vice versa [2]. Alternative methods of continuous brain monitoring are limited by lack of sensitivity (e.g. CT), high false positive rates (e.g. transcranial Doppler) or the absence of standardized criteria for abnormality (e.g. CT-perfusion, magnetic resonance diffusion-weighted imaging); another major disadvantage of all of the above techniques is that they are usually applied only once or twice per day [3]. EEG, however, can be recorded continuously, and the features corresponding to cerebral injury can be extracted at any time. Therefore, quantitative EEG analysis methods for continuously monitoring CI-related changes in EEG would be of great value to neurologists in guiding therapy. *
Corresponding author.
Fourier-transform-based methods have been used as quantitative EEG tools in some studies [3, 4]. These methods require stationarity of the EEG signal; however, EEG is known to be time-varying and nonstationary, especially under some pathological conditions [4]. The nonstationary nature of EEG makes it necessary to use methods which join the time and frequency domains to reveal how the frequency content of a signal changes with time. Such quantitative EEG methods include wavelet-transform-based methods, for example those proposed as quantitative measures for analyzing and monitoring EEG during cerebral ischemia [2, 5], and different quantifications of the time-frequency representation (TFR), such as the methods introduced to characterize the nonstationarity level of EEG signals before and after hypoxic-ischemic injury [6]. In this study, quantitative methods based on the Cohen class of TFRs are proposed to assess CI-associated EEG changes. Oscillatory states are the most remarkable features of EEG activity. EEG is commonly decomposed into five rhythms: delta, theta, alpha, beta and gamma. The EEG rhythm activity results from the coordinated activation of groups of neurons, and these rhythms are affected by different physiological states. None of these rhythms ever appears alone; however, one frequency range is likely to be more pronounced than others under different conditions. In this study, the relative energies of the five frequency bands were first calculated from the time-frequency plane. In addition, entropy is a useful measure to quantify the degree of order in a given signal, and the Shannon entropy (SE) of the TFR was also computed.
2 Materials and Methods 2.1 EEG Data Acquisition We analyzed two sets of EEG recordings corresponding to the normal and CI subjects. The EEG data were recorded from eight positions F3, F4, C3, C4, P3, P4, O1 and O2 according to the 10-20 system [7]. These channels were referenced to electrically linked mastoids A1 and A2. Each signal was filtered using a 0.3-100 Hz band-pass filter and sampled at 1000 Hz with a 12-bit analog-to-digital converter. The first set of EEG data, corresponding to the normal subjects, was taken from twelve healthy subjects, none of whom reported any neurological or psychiatric disorder in the past. The second group consisted of CI EEG signals obtained from twelve CI patients in a conscious state. The EEG signals were recorded for one minute while the subjects relaxed in two awake states, with eyes open and with eyes closed. 2.2 Relative Frequency Band Energy (RFBE) The TFR is a bivariate function of time and frequency. One of the well-known Cohen-class TFRs is the Wigner-Ville distribution (WVD), which gives a high-resolution distribution of signal energy over time and frequency. However, when the analyzed signal is a multicomponent signal, there are cross-terms between components, which are often considered undesirable effects of WVD analysis. The undesirable effects of the cross-terms
can be reduced by using the smoothed pseudo Wigner-Ville distribution (SPWVD) proposed by Martin and Flandrin [9]. The formalism of the SPWVD is

SPWVD(t, f) = \int_{-\infty}^{+\infty} h(\tau) \int_{-\infty}^{+\infty} x\!\left(s + \frac{\tau}{2}\right) x^{*}\!\left(s - \frac{\tau}{2}\right) g(s - t)\, ds\; e^{-j 2 \pi f \tau}\, d\tau    (1)
where h(τ) is the frequency smoothing window and g(s) is the time smoothing window. The suppression of cross-terms is better with a longer time window, but is accompanied by undesirable smearing of instantaneous characteristics [10]. The energy of a signal in any extended time-frequency region can be obtained by integrating the SPWVD over that region. Therefore, integration of the SPWVD over a period of time and a frequency band (delta, theta, alpha, beta or gamma) gives the frequency band energy during that period:

E_j = \int_{T}^{T+\Delta t} \int_{f_{j\min}}^{f_{j\max}} SPWVD(t, f)\, df\, dt, \qquad j = \delta, \theta, \alpha, \beta, \gamma    (2)
Then, the RFBE represents the probability distribution of energy at different frequency ranges, which may provide information about their corresponding degree of importance in EEG under different physiological states.
p_j = \frac{E_j}{\sum_{j} E_j}, \qquad j = \delta, \theta, \alpha, \beta, \gamma    (3)
2.3 Shannon Entropy of SPWVD
The main tool for measuring the information content or the uncertainty of a given probability distribution is the entropy function [11]. The well-known Shannon entropy (SE) is defined as the expectation of the logarithm of the corresponding distribution probabilities [12]:

SE = -\sum_{i=1}^{M} p_i \ln(p_i)    (4)
According to SE definition, the entropy of SPWVD is defined as:
SE_{SPWVD} = -\sum_{j} p_j \ln(p_j), \qquad j = \delta, \theta, \alpha, \beta, \gamma    (5)
According to (3), clearly \sum_j p_j = 1. The term -p_j \ln(p_j) converges to 0 when p_j \to 0 or p_j \to 1, and reaches its maximum value at one point within the interval (0, 1). Theoretically, when the distribution probabilities of all RFBEs are uniformly distributed, the global maximum of SE_{SPWVD} is obtained. The SE of the SPWVD can serve as a useful tool to quantify the divergence between the probability densities of different frequency bands: a centralized distribution of frequency band energies yields small entropy values, while a diffuse distribution yields large entropy values.
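A minimal sketch of Eqs. (2)-(5) is given below, assuming the SPWVD has already been computed as a (time × frequency) array; the band boundaries follow the subbands listed in Sect. 3, and the function name is an illustrative assumption:

```python
import numpy as np

BANDS = {"delta": (0.3, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 40)}   # Hz

def rfbe_and_entropy(tfr, freqs):
    """Relative frequency band energies (Eq. 3) and Shannon entropy (Eq. 5)
    of a time-frequency representation.

    tfr   : 2-D array of energy over (time, frequency)
    freqs : 1-D array giving the frequency axis of tfr in Hz
    """
    energies = {name: tfr[:, (freqs >= lo) & (freqs < hi)].sum()   # Eq. (2)
                for name, (lo, hi) in BANDS.items()}
    total = sum(energies.values())
    p = {name: e / total for name, e in energies.items()}          # Eq. (3)
    se = -sum(pj * np.log(pj) for pj in p.values() if pj > 0)      # Eq. (5)
    return p, se
```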
3 Results and Discussion In this paper, independent component analysis (ICA) [8] was used to isolate and remove electrooculography and power-line artifacts from the EEG data. Since we were interested only in the components of the EEG signals below 40 Hz, the EEG signals were band-limited to the desired 0.3-40 Hz range by convolving with a low-pass finite impulse response (FIR) filter. The band-limited EEG was separated into five physiological EEG subbands: delta (0.3-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz) and gamma (30-40 Hz). We estimated the TFRs of the EEG signals from both the normal subjects and the CI patients under open-eyes and closed-eyes states with the SPWVD. After this process, the above-mentioned quantitative EEG analysis methods were applied.
Fig. 1. Group averages of relative energies of delta, theta, alpha, beta and gamma frequency bands for the normal and CI subjects under open eyes state at F3, F4, C3, C4, P3, P4, O1 and O2
Fig.1 gives the group averages of relative energies of delta, theta, alpha, beta and gamma frequency bands for the normal and CI subjects under open eyes state at F3, F4, C3, C4, P3, P4, O1 and O2 electrodes. For the normal subjects, delta activity was dominant at all eight electrodes whereas for the CI subjects, beta frequency band was the most pronounced one. The relative energies of delta and theta bands for the normal subjects were larger than those for the CI subjects, whereas it was just the opposite for the relative energies of the other three bands.
Fig. 2. Group averages of relative energies of delta, theta, alpha, beta and gamma frequency bands for the normal and CI subjects under closed eyes state at F3, F4, C3, C4, P3, P4, O1 and O2
The group averages of the relative energies of the five frequency bands at the eight electrodes for the normal and CI subjects under the closed-eyes state are presented in Fig. 2. The relative energy of the alpha band at the last two electrodes, O1 and O2, was the largest for both kinds of subjects, demonstrating that alpha was strongest over the occipital cortex for the normal subjects and even for the CI subjects. It can be noted that the dominant rhythms at the other six electrodes for the two kinds of subjects were the same as under the open-eyes state, except for the CI subjects at the F3 electrode. Under the closed-eyes state, delta and theta were also attenuated while beta and gamma were enhanced for the CI subjects in comparison to the normal subjects. An increase in the energy of the alpha rhythm under the closed-eyes state and a decrease under the open-eyes state were obvious at every electrode, which demonstrates that alpha is best seen with eyes closed under conditions of physical relaxation or relative mental inactivity and is blocked or attenuated with eyes open. Under the open-eyes state, the relative energy of the alpha rhythm at each electrode was larger for the CI subjects than for the normal subjects; under the closed-eyes state, however, the CI subjects had slightly larger alpha energy at the central and parietal sites while the normal subjects had slightly larger alpha energy at the frontal and occipital sites. Accordingly, it can be seen from the relative energies of the alpha band illustrated in Fig. 1 and Fig. 2 that the alpha rhythm of the CI subjects was also blocked, but the blocking action was less evident.
Fig. 3. Group average of SE of SPWVD for the normal and CI subjects under open eyes (EO) and closed eyes (EC) states at F3, F4, C3, C4, P3, P4, O1 and O2
The differences of EEG rhythm activity between the normal and CI subjects are statistically compared using a paired t test. Under closed eyes state, there is significant difference in RFBE between the two types of subjects for each frequency band except for alpha band (P < 0.001 for delta band, P < 0.001 for theta band, P = 0.69 for alpha band, P < 0.001 for beta band, and P < 0.001 for gamma band). Under open eyes state, the RFBE is significantly different for all frequency bands (P < 0.001 for all bands). These results suggest that EEG RFBE differences under open eyes state were more pronounced. The most significant differences were found in delta band under open eyes state and in beta band under closed eyes state. The average SE values of SPWVD of the spontaneous EEG corresponding to the two types of subjects are displayed in Fig.3. It indicates that for the CI subjects, SE increase at each electrode was evident in both conditions. Besides, the SE values were larger under open eyes state at all electrodes for the two types of subjects. Based on the results of RFBE analysis, it may seem that the relative energies of beta and gamma contributed substantially to entropy increase for the CI subjects, and exerted a similar effect under open eyes state. For the SE method, the SE values corresponding to the normal and CI subjects are significantly different under both states (P < 0.001). Contrary to the result using the RFBE method, the larger changes were found under closed eyes state. Therefore, the preferred method varies with the states.
4 Conclusion In this study, we characterized EEG from both normal and CI subjects based on timefrequency methods. It is indicated by experimental results that the proposed quantitative measures based on TFR can express the differences of EEG features between the normal and CI subjects, and the characteristic changes in EEG under CI state can be discerned in each of the two simple tasks. The CI injury primarily affected delta and beta bands in two conditions. The SE of SPWVD also demonstrated the occurrence of CI. Therefore, these quantitative measures are physiologically meaningful since they differentiated the CI injury and normal brain state in spontaneous conditions. Quantitative EEG parameters may supplement the clinical exam in CI patients.
For future work, the sensitivities of these EEG features to detect CI would be studied and the recovery process of CI patients would be traced and evaluated with quantitative EEG signal analysis. Acknowledgements. The authors would like to thank Dr. X.B. Miao and J.F. Zhu for experiments. This work is supported by the Fundamental Research Funds for the Central Universities (Project NO. CDJZR10150003).
References 1. Al-Nashash, H.A., Thakor, N.V.: Monitoring of global cerebral ischemia using wavelet entropy rate of change. IEEE Trans. Biomed. Eng. 52, 2119–2122 (2005) 2. Rosso, O.A., Blanco, S., Yordanova, J., Kolev, V., Figliola, A., Schürmann, M., Basar, E.: Wavelet entropy: a new tool for analysis of short duration brain electrical signals. J. Neurosci. Meth. 105, 65–75 (2001) 3. Claassen, J., Hirsch, L.J., Kreiter, K.T., Du, E.Y., Connolly, E.S., Emerson, R.G., Mayer, S.A.: Quantitative continuous EEG for detecting delayed cerebral ischemia in patients with poor-grade subarachnoid hemorrhage. Clin. Neurophysiol. 115, 2699–2710 (2004) 4. Geocadin, R.G., Ghodadra, R., Kimura, T., Lei, H., Sherman, D.L., Hanley, D.F., Thakor, N.V.: A novel quantitative EEG injury measure of global cerebral ischemia. Clin. Neurophysiol. 111, 1779–1787 (2000) 5. Al-Nashash, H., Paul, J., Ziai, W., Hanley, D., Thakor, N.: Wavelet entropy for subband segmentation of EEG during injury and recovery. Ann. Biomed. Eng. 31, 653–658 (2003) 6. Tong, S.B., Li, Z.J., Zhu, Y.S., Thakor, N.V.: Describing the nonstationarity level of neurological signals based on quantifications of time–frequency representation. IEEE Trans. Biomed. Eng. 54, 1780–1785 (2007) 7. Jasper, H.H.: The ten-twenty electrode system of the international federation. Electroenc. Clin. Neurophysiol. 10, 371–375 (1958) 8. Hyvarinen, A., Oja, E.: Independent component analysis: Algorithms and applications. Neural Networks 13, 411–430 (2000) 9. Martin, W., Flandrin, P.: Wigner–Ville spectral analysis of non-stationary processes. IEEE Trans. Acoust. Speech Sig. Proc. 33, 1461–1470 (1985) 10. Chan, H.L., Lin, M.A., Chao, P.K., Lin, C.H.: Correlates of the shift in heart rate variability with postures and walking by time–frequency analysis. Comput. Meth. Prog. Biomed. 86, 124–130 (2007) 11. Aviyente, S., Williams, W.J.: Entropy based detection on the time-frequency plane. In: 2003 IEEE Int. Conf. on Acoustics, Speech, and Sig. Pro., Hong Kong, pp. 441–444 (2003) 12. Shannon, C.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948)
Research on a Novel Medical Image Non-rigid Registration Method Based on Improved SIFT Algorithm Anna Wang, Dan Lv, Zhe Wang, and Shiyao Li College of Information Science and Engineering, Northeastern University, Shenyang, China
[email protected],
[email protected]
Abstract. For the non-rigid registration of medical images, this paper presents a novel algorithm based on an improved Scale Invariant Feature Transform (SIFT) feature matching algorithm. First, the Harris corner detection algorithm is used in the scale-invariant feature extraction process, so that the number of correct matching points is increased; then, for the feature points detected in the scale space, an improved SIFT feature extraction algorithm with a global context vector is presented to solve the problem that SIFT descriptors produce many mismatches when an image has many similar regions. On this basis, an affine transformation is chosen to implement the non-rigid registration, and a weighted mutual information (WMI) measure and the Particle Swarm Optimization (PSO) algorithm are chosen to optimize the registration process. The experimental results show that the method achieves better registration results than the method based on mutual information. Keywords: Non-rigid registration, improved SIFT, Harris corner detection, global context vector, WMI, PSO.
1 Introduction In the medical field, image registration is mainly used in the information fusion of CT, MRI, PET and other medical images, in the comparison of actual medical images with atlases, in surgical navigation, in cardiac motion estimation and in many other applications. Generally, image registration is classified as rigid or non-rigid. For some image distortions a rigid transformation cannot achieve a satisfactory registration result, so non-rigid registration has become a major research hotspot. The Scale Invariant Feature Transform (SIFT) algorithm [1, 2] is a general algorithm in the field of feature matching; it has a strong ability to extract stable features and to match them between two different images. However, SIFT has shortcomings in the number of matching points and in the probability of correct matching, so in this paper a novel medical image non-rigid registration method based on an improved SIFT algorithm is presented.
2 Image Feature Extraction and Matching Feature extraction is a crucial step in image registration. This paper uses the improved SIFT feature extraction algorithm to extract the feature points: Step 1, use the
Harris corner detection algorithm to detect feature points in the two images; Step 2, establish the SIFT feature vectors of the feature points; Step 3, generate the improved SIFT feature descriptors; Step 4, compare the feature vectors of the two images and determine the matching points. In the fourth step, a feature point is first selected in the reference image; then the two feature points in the float image whose Euclidean distances to the point in the reference image are smallest are identified; the ratio of the minimum to the second-minimum distance is calculated, and if this value is less than a certain threshold the match is accepted. When this ratio threshold is reduced, the matching points become fewer but more stable. 2.1 Harris Corner Detection Algorithm The Harris corner detection operator is a feature extraction operator proposed by C. Harris. The operator is inspired by the autocorrelation function in signal processing, giving the matrix M associated with the autocorrelation function. The formula of the Harris corner detection operator involves only the first derivatives of the image:
M = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right) \otimes \begin{bmatrix} I_x^{2} & I_x I_y \\ I_x I_y & I_y^{2} \end{bmatrix} \;\Rightarrow\; R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R    (1)
where I_x and I_y are the derivatives of the image at the point in the horizontal and vertical directions, σ is the standard deviation of the Gaussian smoothing function, R is a rotation matrix, and λ1 and λ2 are the eigenvalues of the matrix M; when both eigenvalues are large, the point is defined as a corner. The Harris corner detection operator is defined as

R = \det(M) - k\,\mathrm{Tr}^{2}(M)    (2)
where det(M) is the determinant of the matrix, Tr(M) is its trace and k is an empirical parameter, generally in the range 0.02-0.04. In practical applications, the R value of the point at the center of the image window is computed; if it is greater than a given threshold, the point is identified as a corner. 2.2 SIFT Feature Extraction The SIFT algorithm was proposed by D. G. Lowe in 1999 and refined in 2004. The main steps of the algorithm are as follows: 1) Detection of extreme points in the scale space. Scale space theory first appeared in the field of computer vision; its purpose is to simulate the multi-scale features of image data. The scale space of a two-dimensional image is defined as

L(x, y, \sigma) = G(x, y, \sigma) \otimes I(x, y), \qquad G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}\, e^{-(x^{2}+y^{2})/2\sigma^{2}}    (3)
where ⊗ denotes the convolution operation, G(x, y, σ) is the variable-scale Gaussian function and σ is the scale space factor. In order to detect stable feature points in the scale space, it is necessary to use the difference-of-Gaussian (DoG) scale space, where k is a constant:
D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) \otimes I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)    (4)
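A minimal sketch of building such a DoG stack (Eq. 4) is given below; the base scale, the scale ratio k and the number of scales are illustrative values, not those of the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_stack(image, sigma0=1.6, k=2 ** 0.5, n_scales=5):
    """Difference-of-Gaussian stack D(x, y, sigma): successive Gaussian
    blurs of the image are subtracted pairwise (Eq. 4)."""
    blurred = [gaussian_filter(image.astype(float), sigma0 * k ** i)
               for i in range(n_scales + 1)]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(n_scales)])
```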
In order to detect the local extrema of D(x, y, σ), each sample point is compared with its eight neighbors in the current image and the nine neighbors in the scales above and below; it is selected only if it is larger or smaller than all of these neighbors. 2) Accurate feature point location. In order to pinpoint the local extrema, enhance matching stability and improve noise immunity, some unstable feature points must be removed [2]. A Taylor expansion of the scale-space function D(x, y, σ) is used to remove low-contrast points, by discarding points whose offset is greater than a certain threshold. Since an extremum of the DoG operator has a large principal curvature in the direction across an edge and a small one in the perpendicular direction, edge points can also be removed in this way. 3) Orientation assignment. By assigning a consistent orientation to each feature point based on local image properties, the feature point descriptor can be represented relative to this orientation and thereby achieve invariance to image rotation. For each image sample, the gradient magnitude m(x, y) and orientation θ(x, y) are computed using pixel differences, as shown in formula (5):

m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^{2} + (L(x, y+1) - L(x, y-1))^{2}}, \qquad \theta(x, y) = \arctan\!\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right)    (5)
4) Generation of feature descriptors

In Fig. 1, a feature descriptor is created by first computing the gradient magnitude and orientation at each image sample point in a region around the feature point, as shown on the left. These samples are then accumulated into orientation histograms summarizing the contents over 4×4 subregions, as shown on the right, with the length of each arrow corresponding to the sum of the gradient magnitudes near that direction within the region. The figure shows a 2×2 descriptor array computed from an 8×8 set of samples, while the experiments in this paper use 4×4 descriptors computed from a 16×16 sample array. Therefore, a 4×4×8 = 128 element feature descriptor is used for each feature point.

Fig. 1. Sketch map of feature descriptors: (a) image gradients; (b) feature descriptor
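As a concrete illustration of formula (5), the following NumPy sketch computes the gradient magnitude and orientation of a smoothed image L from pixel differences; the function name and the treatment of border pixels are illustrative choices, not taken from the paper.

```python
import numpy as np

def gradient_mag_ori(L):
    """Gradient magnitude and orientation of Eq. (5) for the interior pixels of L."""
    dx = L[1:-1, 2:] - L[1:-1, :-2]      # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]      # L(x, y+1) - L(x, y-1)
    m = np.sqrt(dx ** 2 + dy ** 2)       # gradient magnitude m(x, y)
    theta = np.arctan2(dy, dx)           # gradient orientation theta(x, y)
    return m, theta
```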
2.3 Improved SIFT Feature Extraction

A shortcoming of the classical SIFT algorithm is its strict requirement on feature point matching, so it is sometimes difficult to obtain more than three pairs of matched feature points. If the local neighborhood information of different feature points differs greatly, the SIFT feature matching algorithm generates distinctive descriptors and obtains satisfactory results. If there are many similar regions in the image, however, the feature vectors generated by the SIFT algorithm are nearly identical, which causes mismatching. In this paper, an improved SIFT algorithm with Harris corner detection and a global context vector is presented. First, the Harris corner detection algorithm is used to extract feature points; then the SIFT algorithm with a global context vector is used to describe the feature points and obtain the feature descriptors. For each detected feature point, a two-component vector consisting of a SIFT descriptor representing local properties and a global context vector is built to describe the feature point and disambiguate locally similar features [4]. The vector is defined as

F = \begin{bmatrix} \omega L \\ (1-\omega)\, G \end{bmatrix} .   (6)
where L is the 128-dimension SIFT vector, G is a 60-dimension global context vector, and ω is a relative weighting factor. Like the SIFT local descriptor, the global context vector is also generated from a histogram. The diameter is equal to the image diagonal and, as in [5], the shape context is a 5×12 histogram, shown in Fig. 2. For each feature, the global context accumulates curvature values in each log-polar bin. Given an image point (x, y), the maximum curvature is computed at each pixel as the absolute value of the dominant eigenvalue of the Hessian matrix:

C(x, y) = \lvert \alpha(x, y) \rvert .   (7)

where α(x, y) is the eigenvalue of the Hessian matrix with the largest absolute value.
Fig. 2. Region division in generating global vector: the diagram of log-polar histogram bins
If (x', y') is the position of a feature point with orientation θ, then

a = \left\lfloor \frac{6}{\pi}\left(\arctan\frac{y - y'}{x - x'} - \theta\right) \right\rfloor , \qquad
d = \max\!\left(1, \left\lfloor \log_{2}\frac{\lVert (x, y) - (x', y') \rVert}{r} + 6 \right\rfloor\right) .   (8)
where a and d are the angular and radial-distance bin indices for a point (x, y), ‖·‖ denotes the L2 norm, and r is the shape context radius. Let N_{a,d} be the neighborhood of points with bin indices a and d; the global context vector is then given by

G = \sum_{(x, y) \in N_{a,d}} C'(x, y) .   (9)
where C' is the reduced and smoothed curvature image from formula (7). The weighting function plays an important role in balancing the variable-size SIFT descriptor and the fixed-size global context vector. Each pixel's curvature value is weighted by an inverted Gaussian and then added to the corresponding bin:

\omega(x, y) = 1 - e^{-\left((x - x_f)^{2} + (y - y_f)^{2}\right)/(2\sigma^{2})} .   (10)
where (x_f, y_f) is the feature point position and σ is the same scale used to weight the SIFT feature's neighborhood. When the SIFT scale is small, the global context extends well, but for large local features the global context reduces its relative contribution.

2.4 Improved SIFT Feature Matching

Given two or more images, a set of feature points is detected, and the Euclidean distance between the multi-dimensional feature vectors can be used as the similarity measure. For each feature point in the reference image, the nearest-neighbor and second-nearest-neighbor distances in the float image are found; the ratio between the two distances is then computed, and if the ratio is smaller than a certain threshold the pair of points is kept as the best match. Since the feature descriptor consists of two parts, the respective Euclidean distances are calculated as follows:

d_L = \lVert L_i - L_j \rVert = \sqrt{\sum_{k} (L_{i,k} - L_{j,k})^{2}}, \qquad
d_G = \lVert G_i - G_j \rVert = \sqrt{\sum_{k} (G_{i,k} - G_{j,k})^{2}} .   (11)
The final distance measure is given by

d = \omega\, d_L + (1 - \omega)\, d_G .   (12)
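To make the pipeline concrete, the following Python sketch outlines the hybrid detection and matching of Sections 2.1-2.4 under simplifying assumptions: OpenCV's Harris response and SIFT descriptor stand in for the paper's implementations, the input is assumed to be an 8-bit grayscale image, the global context vectors are supplied externally, and the ratio threshold of 0.8 is a hypothetical value.

```python
import cv2
import numpy as np

def harris_sift_descriptors(gray, k=0.04, rel_thresh=0.01, size=16):
    """Detect Harris corners (Sec. 2.1) and attach SIFT descriptors (Sec. 2.2) to them."""
    response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=k)
    ys, xs = np.where(response > rel_thresh * response.max())
    keypoints = [cv2.KeyPoint(float(x), float(y), size) for x, y in zip(xs, ys)]
    keypoints, descriptors = cv2.SIFT_create().compute(gray, keypoints)
    return keypoints, descriptors

def combined_distance(L_i, L_j, G_i, G_j, w=0.5):
    """Weighted two-part distance of Eqs. (11)-(12): d = w*d_L + (1-w)*d_G."""
    return w * np.linalg.norm(L_i - L_j) + (1.0 - w) * np.linalg.norm(G_i - G_j)

def ratio_test_matches(desc_ref, desc_flt, ratio=0.8):
    """Keep a match only if nearest / second-nearest distance < ratio (Sec. 2.4)."""
    matches = []
    for i, d in enumerate(desc_ref):
        dists = np.linalg.norm(desc_flt - d, axis=1)
        nearest, second = np.partition(dists, 1)[:2]
        if second > 0 and nearest / second < ratio:
            matches.append((i, int(np.argmin(dists))))
    return matches
```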
3 Image Non-rigid Registration

Generally, image registration is carried out following the steps described in the framework shown in Fig. 3.
Fig. 3. Flow chart of image registration: feature points are extracted from the float image and the reference image, matched, and the transformation is then optimized to produce the registered image
The steps of the novel image registration method based on the improved SIFT algorithm proposed in this paper are as follows (a sketch of Steps 3-5 is given after this list):
Step 1: Feature extraction: extract feature points using the Harris detector; generate the SIFT descriptor L; generate the improved SIFT descriptor F with the global context vector.
Step 2: Feature matching: search for matching points using the Euclidean distance of the multi-dimensional feature vectors.
Step 3: Transformation model: choose an affine transformation with bilinear interpolation to establish the transformation model, and solve for the transformation parameters.
Step 4: Similarity measure: use WMI [6] as the similarity measure to determine whether the two images are registered.
Step 5: Optimization algorithm: use PSO [7, 8] to search for the optimal affine transformation parameters.
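The sketch below illustrates Steps 3-5 under stated assumptions: normalized cross-correlation is used as a stand-in for the WMI similarity measure of [6], OpenCV's warpAffine provides the affine transformation with bilinear interpolation, and all PSO constants (swarm size, inertia and acceleration coefficients) are hypothetical values rather than the ones used in the paper.

```python
import cv2
import numpy as np

def similarity(ref, warped):
    """Normalized cross-correlation between two images (stand-in for WMI)."""
    a, b = ref.astype(np.float64).ravel(), warped.astype(np.float64).ravel()
    a -= a.mean(); b -= b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def warp_affine(flt, p):
    """Apply the affine model p = [a11, a12, tx, a21, a22, ty] with bilinear interpolation."""
    M = np.array([[p[0], p[1], p[2]], [p[3], p[4], p[5]]], dtype=np.float64)
    return cv2.warpAffine(flt, M, (flt.shape[1], flt.shape[0]), flags=cv2.INTER_LINEAR)

def pso_register(ref, flt, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5):
    """Basic PSO over the 6 affine parameters, maximizing the similarity measure."""
    rng = np.random.default_rng(0)
    x = np.tile([1, 0, 0, 0, 1, 0], (n_particles, 1)) + rng.normal(0, 0.05, (n_particles, 6))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([similarity(ref, warp_affine(flt, p)) for p in x])
    g = pbest[np.argmax(pbest_f)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((n_particles, 6)), rng.random((n_particles, 6))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        f = np.array([similarity(ref, warp_affine(flt, p)) for p in x])
        better = f > pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[np.argmax(pbest_f)].copy()
    return g   # best affine parameters found
```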
4 Simulation Results and Analysis

4.1 Results of Feature Extraction and Matching

In Fig. 4(a) and (b), the number of feature points detected is considerable, and the feature vectors generated at corresponding positions are mostly similar. Fig. 4(c) and (d) show the matching experiment results before and after the improvement: (c) shows the result of the classical SIFT algorithm and (d) is the result of the improved SIFT algorithm. As shown in Table 1, mismatching is reduced by 7.5% by the proposed algorithm, significantly improving the matching results.
Fig. 4. Results of feature extraction and matching: (a) Reference image; (b) Float image; (c) matching result using the classical SIFT algorithm; (d) result using the improved SIFT algorithm.

Table 1. The results of feature matching

Evaluations                        Classical SIFT    Improved SIFT
The number of matching points      93                67
The probability of mismatching     19.4%             11.9%
4.2 Results of Image Registration

The registration can be assessed by the COEF (correlation coefficient), SNR (signal-to-noise ratio) and MSE (mean squared error). The registration results of the two different methods can be seen clearly in the following figures and tables: Fig. 5 and Fig. 6, Table 2 and Table 3.
Fig. 5. The first image registration results: (a) is Reference image; (b) is Float image; (c) is the registration result using the method proposed in the paper; (d) is the result based on mutual information method.
Table 2. The evaluations of the first image registration results

Evaluations             COEF      MSE        SNR
Method in the paper     0.7529    1238.68    48.7352
Mutual information      0.5684    1701.58    39.4758
Fig. 6. The second image registration results: (a) Reference image; (b) Float image; (c) registration result using the method proposed in the paper; (d) result based on the mutual information method.

Table 3. The evaluations of the second image registration results

Evaluations             COEF      MSE        SNR
Method in the paper     0.8956    1649.56    59.7129
Mutual information      0.6482    1842.71    54.2522
Comparing the registration evaluations of the proposed method with those of the mutual information method: a larger COEF indicates a better registration result, a smaller MSE is better, and a larger SNR is better. The tables clearly show that the proposed method has advantages: in both figures, (c) achieves better registration results than (d). Thus, the method obtains better medical image non-rigid registration results than the method based on mutual information, and the image registration works relatively well.
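For reference, a small NumPy sketch of the three evaluation measures follows; the SNR definition used here (signal power over MSE, in decibels) is an assumption, since the paper does not state its exact formula.

```python
import numpy as np

def coef(ref, reg):
    """Correlation coefficient (COEF) between the reference and registered images."""
    return float(np.corrcoef(ref.ravel().astype(np.float64), reg.ravel().astype(np.float64))[0, 1])

def mse(ref, reg):
    """Mean squared error (MSE)."""
    d = ref.astype(np.float64) - reg.astype(np.float64)
    return float(np.mean(d ** 2))

def snr(ref, reg):
    """Signal-to-noise ratio in dB, assumed here as signal power over MSE."""
    power = np.mean(ref.astype(np.float64) ** 2)
    return float(10.0 * np.log10(power / (mse(ref, reg) + 1e-12)))
```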
5 Conclusion

This paper presents an improved SIFT feature extraction algorithm, which has high matching capability and can extract stable feature points. The Harris detector increases the number of correct matching points in the extraction process. If multiple similar regions appear in the image, the SIFT descriptors are close to each other and mismatching occurs. Therefore, the SIFT algorithm with a global context vector is presented, reducing the probability of mismatching caused by similar local information. The results show that the improved algorithm can achieve higher matching accuracy and accomplishes the crucial step of image registration, the feature extraction. An affine transformation with bilinear interpolation realizes the spatial transformation, and the PSO algorithm determines the
optimal transformation parameters for image non-rigid registration. Finally, the comparisons clearly show that the method presented in the paper has advantages in image registration, and preferable registration results are obtained.
References
1. Lowe, D.G.: Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91-110 (2004)
2. Brown, M., Lowe, D.G.: Invariant Features from Interest Point Groups. In: British Machine Vision Conference, UK, pp. 656-665 (2002)
3. Harris, C., Stephens, M.: A Combined Corner and Edge Detector. In: Proceedings of the 4th Alvey Vision Conference, Manchester, pp. 147-151 (1988)
4. Mortensen, E.N., Deng, H.L., Shapiro, L.: A SIFT Descriptor with Global Context. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005)
5. Belongie, S., Malik, J., Puzicha, J.: Shape Matching and Object Recognition Using Shape Contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 509-521 (2002)
6. Li, J.L., Cong, R.J., Jin, L.P., Wei, P.: A Medical Image Registration Method Based on Weighted Mutual Information. In: 2nd International Conference on Bioinformatics and Biomedical Engineering, ICBBE 2008, pp. 2549-2552 (2008)
7. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: IEEE Int. Conf. on Neural Networks (1995)
8. Eberhart, R., Kennedy, J.: A New Optimizer Using Particle Swarm Theory. In: Proc. of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan (1995)
Automatic and Reliable Extraction of Dendrite Backbone from Optical Microscopy Images Liang Xiao1,2, Xiaosong Yuan2, Zack Galbreath2, and Badrinath Roysam2 1 School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, 210094, P.R. China 2 Department of Electrical, Computer, & Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180-3590,USA
[email protected]
Abstract. The morphology and structure of 3D dendritic backbones are essential for understanding neuronal circuitry and behavior in neurodegenerative diseases. As a major challenge, the extraction of dendritic backbones using image processing and analysis technology has attracted many computational scientists. This paper proposes a reliable and robust approach for automatically extracting dendritic backbones from 3D optical microscopy images. Our systematic scheme is a gradient vector field based skeletonization approach. We first use self-snake based nonlinear diffusion and adaptive segmentation to smooth noise and segment the neuron object. We then propose a hierarchical skeleton points detection algorithm (HSPD) using measurement criteria of low divergence and high iso-surface principal curvature. We further create a minimum spanning tree to represent and establish effective connections among skeleton points and to prune small and spurious branches. To improve robustness and reliability, the dendrite backbones are refined by B-Spline kernel based data fitting. Experimental results on different datasets demonstrate that our approach has high reliability and good robustness and requires little user interaction. Keywords: Neuron, Dendrite backbone, Curve-skeleton, Reconstruction.
1 Introduction

The morphology and structure of 3D dendrite backbones are intimately related to neuronal circuitry and behavior in neurodegenerative diseases. The morphologies of dendrites and spines are key parameters in descriptions of neuronal cell types [1]. Dendritic and axonal structure also plays a key role in shaping cellular excitability and synaptic integration. However, dendritic backbone arbors or networks are large and extremely complex. As a result, manual digital extraction and identification are labor intensive. Simple dendritic trees within individual tissue sections require hours to days of hard work; reconstructing the full axonal arborization of a single projection cell may require several months of labor. This time-consuming process constitutes a
critical bottleneck in comparative neuroanatomy and in the analysis of neural circuitry by light microscopy. Two main types of algorithms are designed for computationally extracting neuroanatomy from images: tracing based methods [2] and skeletonization based methods [3-6]. Both types of methods have unique advantages and disadvantages. Tracing based methods are based on recursive traversal of the image data, taking advantage of the 3-D tube-like local geometry of neurites. Skeletonization based methods are based on the notion that the geometric medial axis of the image data captures the neuronal topology; these methods offer the greater potential since they do not assume that the neuron has a tubular structure. In this paper, we propose a dendrite backbone extraction approach based on grayscale skeletonization algorithms. A minimum spanning tree graph is created to represent and establish effective connections among skeleton points and to prune small and spurious branches, and the dendrite backbones are extracted by a refinement scheme using B-Spline kernel based data fitting. Experimental results on different datasets demonstrate that our approach has high reliability and good robustness and requires little user interaction.
2 Overview of Our Proposed Approach

In this paper, we propose a reliable and robust approach for computing the curve-skeleton of 3D dendrites and spines from fluorescence confocal images. The essential idea of our approach is to design a systematic and modular scheme which can compute component-wise differentiated curve-skeletons of dendrites with high reliability and good robustness while requiring little user interaction. Specifically, our approach is designed for the following goals or basic requirements [4]:
(1) One-voxel thickness (1D).
(2) The curve-skeleton preserves the topology of the original object.
(3) Invariance under isometric transformations.
(4) Almost centered within the object.
(5) Connectivity of the dendrite backbone.
(6) Robustness to noise or small perturbations.
(7) Little user interaction required.
Fig. 1 illustrates the proposed modular processing architecture of our approach. Differing from many related works, our approach does not compute the medial axis; instead, we compute skeletons from the intensity gradient vector field. One disadvantage of the medial axis is that it has two-dimensional components (medial surfaces) and cannot guarantee a one-voxel-thick skeleton. Another disadvantage is its intrinsic sensitivity to small changes in the object's surface. Our grayscale based skeletonization approach overcomes these problems.
Fig. 1. Modular processing architecture of the proposed computational approach for dendrite backbone extraction
3 Image Pre-processing and Neuron Object Segmentation

3.1 Image Data-Set

The analyzed dataset included images from four different laboratories, used to ascertain how well our approach extracts dendrite backbones, branch points and spines with a wide distribution of morphologies. The image of neocortical layer 6 axons was captured by 2-photon laser scanning microscopy in vivo at the MRC Clinical Sciences Center, Imperial College London. Images from the Potter laboratory were captured with 830 nm two-photon excitation laser scanning microscopes (2PLSM). Because of the image acquisition process of 2PLSM and OWIL, the analyzed images often contain photon shot noise and unwanted structures. Therefore these images first require the preprocessing described next.

3.2 3D Image Smoothing, Structure Enhancement and Object Segmentation

In many approaches for skeleton computation and tube-like structure analysis, images are first preprocessed to enhance the structures. Structure smoothing and enhancement suppresses variations due to imaging noise and texture variations, and has the potential to facilitate the tasks of centerline extraction and segmentation. Many nonlinear diffusion methods have been used for 3D image denoising while preserving edges, for example the nonlinear diffusion filtering proposed by Perona and Malik in 1990 [7]. In this paper, we adopt a nonlinear diffusion proposed by Sapiro, the so-called self-snakes [8-9]:
\frac{\partial u}{\partial t} = \lVert \nabla u \rVert \, \mathrm{div}\!\left[ g(\lVert \nabla u \rVert)\, \frac{\nabla u}{\lVert \nabla u \rVert} \right] = F_{\mathrm{diff}} + F_{\mathrm{shock}}   (1)
where F_diff = g(‖∇u‖) ‖∇u‖ div(∇u/‖∇u‖) and F_shock = ∇g · ∇u. The first term, F_diff, is called the diffusion term and can be viewed as curvature-based motion, since k(u) := div(∇u/‖∇u‖) represents the mean curvature of the image. The second term, F_shock, is a shock-filtering term with edge enhancement. For the numerical implementation, we used the ITK open-source C++ code and integrated it into our system; the ITK algorithm has three parameters: the number of iterations (10-30), the conductance parameter K (typically 2-3) and the time step (typically 0.0425). Compared with other nonlinear diffusion models, the self-snake diffusion provides denoising and edge enhancement simultaneously, since it combines mean curvature motion (MCM) with edge-enhancing denoising. Fig. 2 shows an example of the pre-processing: Fig. 2(a) is the z-projection of a 3D image captured by 2-photon laser scanning microscopy in vivo [10], and Fig. 2(b) shows the result of the pre-processing.

The next step after nonlinear diffusion is to distinguish the neuron object from the background. We adopt a hybrid approach that starts with an initial binarization that is subsequently refined using the graph-cuts algorithm [11]. For the initial binarization, we compute the normalized image histogram, denoted h(i), where i denotes the intensity of a pixel in the range {0, 1, ..., u_max}. The Poisson-distribution-based minimum error thresholding algorithm is used to obtain the optimal threshold for the initial binarization. For the graph-cut minimization, we use the energy function

E(f(x, y)) = E_{\mathrm{data}}(f, u) + \lambda \cdot E_{\mathrm{smooth}}(f)
where λ is the weight of the smoothness term. The data term provides a per-pixel measure of how appropriate a label l ∈ L is for a pixel in the observed data, and is given by

E_{\mathrm{data}}(f, u) = -\ln p\big(u(x, y) \mid l\big), \qquad l \in \{0, 1\} .

The smoothness term provides a measure of the difference between two neighboring pixels (x, y) and (x', y'):

E_{\mathrm{smooth}}(f) = \sum_{(x,y)} \sum_{(x',y')} V\big(f(x,y), f(x',y')\big)
= \sum_{(x,y)} \sum_{(x',y')} \eta\big(f(x,y), f(x',y')\big) \times \exp\!\left( -\frac{\big(u(x,y) - u(x',y')\big)^{2}}{2\sigma_{L}^{2}} \right)   (2)

where

\eta\big(f(x,y), f(x',y')\big) = \begin{cases} 1, & \text{if } f(x,y) \neq f(x',y') \\ 0, & \text{if } f(x,y) = f(x',y') \end{cases}
Fig. 2. An example of pre-processing. (a) The z-projection image of original noisy 3D image(Neocortical Layer 6 Axons). (b)the smoothing and enhancement results using self-snake model. (c) the segmentation result using the graph-cuts algorithm with Poisson-distribution based minimum error thresholding.
The V-term penalizes different labels for neighboring pixels when |u(x, y) − u(x', y')| < σ_L. In our work, the scale factor σ_L is set empirically to values in the range 20-30. We use an implementation of the fast max-flow/min-cut algorithm described by Boykov and Kolmogorov [12]. The neuron object image u_f(x, y) is then defined as follows (for an example, see Fig. 2(c)):

u_f(x, y) = \begin{cases} 0, & \text{if } f(x, y) = 0 \\ u(x, y), & \text{otherwise} \end{cases}   (3)
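A minimal sketch of the initial binarization and the masking of Eq. (3) is given below, assuming u is a NumPy array holding the smoothed image; Otsu's threshold from scikit-image is used here only as a stand-in for the Poisson-distribution-based minimum error threshold, which is not reproduced.

```python
import numpy as np
from skimage.filters import threshold_otsu

def neuron_object_image(u):
    """Initial binary labels f(x, y) and the masked image u_f(x, y) of Eq. (3)."""
    t = threshold_otsu(u)              # stand-in for the Poisson minimum-error threshold
    f = (u > t).astype(np.uint8)       # initial binarization
    u_f = np.where(f == 0, 0, u)       # keep intensities inside the neuron object only
    return f, u_f
```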
4 Hierarchical Skeleton Points Detection

In this section, we discuss the algorithms used in the proposed hierarchical skeleton points detection (HSPD) method.

4.1 Low Divergence Critical Points Detection
To compute the skeletons of the dendrites and spines of a neuron, the first step in HSPD is the computation of the gradient vector field. We can use either the 3D Deriche filter or the Sobel filter to compute the partial derivatives of the image data. The unit gradient vector field ∇φ = ∇u / ‖∇u‖ is then used to generate the low divergence critical points by the grassfire flow. In practice, the discrete implementation of the outward flux of the grassfire is approximated by

F(\mathbf{p}) = \frac{1}{n} \sum_{i=1}^{n} \langle \mathbf{n}_{i}, \nabla\phi(\mathbf{p}_{i}) \rangle   (4)

where p_i is a 26-neighbor of p (n = 26) and n_i is the outward normal at p_i of the unit sphere centered at p. Of interest are points with low outward flux values, which indicate a "sink". A threshold on the average outward flux yields a close approximation to the centerline set. We call these points low divergence critical points. Formally, if S_F^ς = {p | F(p) < ς} is the set of critical points detected by this algorithm with the single parameter ς, then S_F^{ς_1} ⊆ S_F^{ς_2} if ς_1 < ς_2. In our experiments, the parameter ς is chosen in the range [0, 0.15].
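The following NumPy sketch approximates the outward-flux measure of Eq. (4) on a 3D volume; as a simplification it uses only the 6 axis neighbours instead of the full 26-neighbourhood, and the function name and the threshold in the usage comment are illustrative.

```python
import numpy as np

def average_outward_flux(u):
    """Average outward flux of the unit gradient field (Eq. (4)), 6-neighbour version."""
    g = np.stack(np.gradient(u.astype(np.float64)))        # gradient field, shape (3, Z, Y, X)
    phi = g / (np.linalg.norm(g, axis=0) + 1e-12)          # unit gradient vector field
    flux = np.zeros(u.shape, dtype=np.float64)
    for axis in range(3):
        for sign in (-1, 1):
            # value of the matching component at the neighbour in +/- direction
            shifted = np.roll(phi[axis], -sign, axis=axis)
            flux += sign * shifted                          # outward normal is +/- the axis vector
    return flux / 6.0

# low-divergence critical points: S_F = {p : F(p) < zeta}, e.g. zeta = 0.1
# candidates = average_outward_flux(volume) < 0.1
```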
4.2 High Iso-Surface Principal Curvature Points Detection
The above process is largely effective in extracting the skeleton of the dendrite, but the skeletons of many ridge points and valley points are missed. Here a ridge point is any point whose curvature in the maximum curvature direction is locally maximal, while a valley point is a point where it is locally minimal in the maximum curvature direction. In this paper, we adopt a novel way to compute the principal curvatures of the iso-gray surface and the corresponding measurement criteria. We briefly present the mathematical background necessary to estimate the curvatures of an iso-surface in a 3D gray-value image [15]. Denoting by u(x, y, z) the intensity of the 3D image, we consider a surface S defined as an iso-surface of u(x, y, z). Let t be any unit vector in the tangent plane T_m at a point p, and g the normal to the surface at p. Then the curvature k at a point m = (x, y, z) in any unit tangent direction t has been shown by Monga et al. [15], using the Hessian matrix H of the 3D image function, to be

k_{t}(p) = -\frac{\mathbf{t}^{T} H \mathbf{t}}{\lVert \mathbf{g} \rVert}   (5)

where g = [u_x, u_y, u_z]^T and H is the Hessian of the gray level function u(x, y, z). Let t_1 and t_2 denote the two mutually orthogonal tangent directions whose curvatures are maximum and minimum; the associated curvatures k_{t_1} and k_{t_2} are called the principal curvatures. The drawback of the method mentioned above is the computation of the directions t_1 and t_2. Fortunately, it can be proved with constrained optimization theory that the principal curvatures can be computed from the eigen-system of a submatrix of the rotated Hessian matrix, as described below. Specifically, the 3-D Hessian matrix H is rotated to the local gradient direction n using the rotation matrix

P = (\mathbf{n}, \mathbf{h}, \mathbf{f}) =
\begin{bmatrix}
u_{x}/\delta & -u_{y}/\gamma & u_{x}u_{z}/(\gamma\delta) \\
u_{y}/\delta & u_{x}/\gamma & u_{y}u_{z}/(\gamma\delta) \\
u_{z}/\delta & 0 & -\gamma/\delta
\end{bmatrix}

with \gamma = \sqrt{u_{x}^{2}+u_{y}^{2}} and \delta = \sqrt{u_{x}^{2}+u_{y}^{2}+u_{z}^{2}}. Then the principal curvatures
k_{t_1} and k_{t_2} can be computed using the formula

k_{t_i} = \frac{\mathbf{h}^{T}H\mathbf{h} + \mathbf{f}^{T}H\mathbf{f} \pm \sqrt{\left(\mathbf{h}^{T}H\mathbf{h} - \mathbf{f}^{T}H\mathbf{f}\right)^{2} + 4\left(\mathbf{h}^{T}H\mathbf{f}\right)^{2}}}{2}   (6)
L. Xiao et al.
where t i = h + f
k ti − h T H h fT Hh
with i=1,2. Assuming that k t > k t , thus the associ1
2
ated direction t1 is the maximum curvature direction. We will call a ridge point any point whose curvature in the maximum curvature direction is locally maximal, and a valley point appoint where the minimum in the maximum direction. Then the voxels would be selected if its local maximum principle curvature k t (relative to the other 1
{
}
voxels in a 3 × 3 × 3 neighborhood) is high. Formally, let S κ = p κ t (p) > μ be the selecting seeds set, μ is the threshold of high curvature. In our experiments, the threshold is chosen as the average value of the detected points with local maximum principle curvature. μ
1
4.3 Path-Line Linking Points Once the hierarchical seed points are detected, the next step is to compute the Path line linking points. We use the iterative path-line [16] forth following algorithm to trace the possible paths. This algorithm starts from seed points and moves through the gradient vector field towards other seed points. The force-following algorithm evaluates the force value at each point along a path and moves in the force direction with small steps. Let V =(x , y ,z ) denotes the initial seed point, V = (x , y , z ) de(0)
(0)
(0) (0)
(t)
(t+1)
(t+1)
(t+1)
(t )
(t)
(t )
(t+1)
notes the t-th iterative points, V =(x , y , z )denotes the current updated location, then the algorithm compute a new path location by V ( t +1) = V ( t ) + ∫ ∇ u ( V ( t ) )dt
(7)
The iteration is continued (typically up to 2000 steps) until the amount of movement falls below a small pre-set threshold (typically 0.1 voxel). Because the updated locations on the path line may not fall on regular integer voxel positions, cubic interpolation and an Euler scheme are used to compute the integral term. After the path-line linking step, almost all skeleton points, including the backbone and spine points, are detected. However, these skeleton points still lack a structural organization. Fig. 3 shows the skeleton points detected by the proposed HSPD algorithm.
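An Euler-scheme sketch of the path-line following of Eq. (7) is given below; linear interpolation via scipy.ndimage.map_coordinates stands in for the cubic interpolation of the paper, the step size is an illustrative choice, and the stopping values follow the text (at most 2000 iterations, 0.1-voxel movement threshold).

```python
import numpy as np
from scipy.ndimage import map_coordinates

def follow_path(grad, seed, step=0.5, max_iter=2000, tol=0.1):
    """grad: (3, Z, Y, X) gradient field of u; seed: (z, y, x) start point; returns the traced path."""
    p = np.asarray(seed, dtype=np.float64)
    path = [p.copy()]
    for _ in range(max_iter):
        # interpolate the gradient at the current (sub-voxel) position
        g = np.array([map_coordinates(grad[i], p.reshape(3, 1), order=1)[0] for i in range(3)])
        move = step * g
        p = p + move
        path.append(p.copy())
        if np.linalg.norm(move) < tol:   # stop when the movement falls below the threshold
            break
    return np.array(path)
```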
Fig. 3. Skeleton points detected by the proposed HSPD algorithm (S_κ^μ, μ = 0.2): (a) the whole set of skeleton points; (b) the skeleton points in an enlarged sub-region
5 Dendrite Backbones Tree Representation and Refinement

The skeleton set produced by the HSPD algorithm can be represented by S_κ^μ ∪ S_F^ς. This set is the initial core skeleton, and it lacks an organizational structure of component-wise differentiation. In this section, we first describe the dendrite tree representation, and then present the dendrite backbone extraction algorithm. Fig. 4 shows the main steps of the dendrite backbone tree representation and refinement algorithm. As a demonstration example, Fig. 5 illustrates the intermediary results from the KMST to the final B-Spline fitted backbone.

5.1 Dendrite Minimum Spanning Tree Model

To distinguish skeletons that belong to the backbone from those that belong to spines, we first construct a weighted dendrite tree structure G ≡ {(V, E), W} to compactly encode the geometry and topology of the dendrite. The set of vertices is V = {v_i | v_i = (p, d)}, where p represents a 3D point and d is the nonnegative degree of the vertex. The set of edges is E = {e_ij | e_ij ≡ (v_i, v_j)}, and the cost weight map is

W = \left\{ w_{ij} \;\middle|\; w_{ij} :=
\begin{cases}
2\, d(v_{i}, v_{j}) / \big(r(v_{i}) + r(v_{j})\big), & \text{if } e_{ij} = (v_{i}, v_{j}) \in E \text{ and } d(v_{i}, v_{j}) < \mathrm{EdgeRange} \\
+\infty, & \text{otherwise}
\end{cases} \right\}   (8)
Fig. 4. Flowchart outlining the main steps of the dendrite backbones tree representation and refinement algorithm
Fig. 5. (a)-(d) illustrate the intermediary results from the skeleton points to the final B-Spline fitted backbone: (a) the dendrite graph created by KMST; (b) graph erosion and dilation results; (c) final extracted backbone using B-Spline fitting refinement; (d) backbone overlaid on the input image
where d(·) is the Euclidean distance between two vertices and r(v_i) is the distance transform of the binary image at that pixel. The parameter EdgeRange denotes the maximum distance between two pixels that can have an edge connection in the graph; if the distance between two pixels is larger than this parameter, the cost is infinite, which means there will be no connection between the two points. In our experiments, we set this parameter to 5-10.

For the skeletons in S_κ^μ ∪ S_F^ς, we use Kruskal's Minimum Spanning Tree algorithm (KMST) [17] to establish the effective connections among all pairs of vertices, as sketched after the following definitions. Based on the labeling of the nodes of G_KMST, we can identify three relationships: one is for isolated points whose degree is 0, and the others are the two types of chains of edges:
(1) Backbone chain, a chain that runs between two branch nodes. Formally, a backbone chain can be described as {e_1(v_1, v_2), e_2(v_2, v_3), ..., e_n(v_n, v_{n+1}) | deg(v_1) ≥ 3, deg(v_n) ≥ 3, deg(v_i) ≥ 2, 1 < i < n}, where e_i(v_i, v_{i+1}) represents the edge linking nodes v_i and v_{i+1}.
(2) Branch chain, a chain that starts at an end node and ends at a branch node, described by {e_1(v_1, v_2), e_2(v_2, v_3), ..., e_n(v_n, v_{n+1}) | deg(v_1) = 1, deg(v_n) ≥ 3, deg(v_i) ≥ 2, 1 < i < n}.
Automatic and Reliable Extraction of Dendrite Backbone
G_{\mathrm{KMST}} = G_{\mathrm{backbone}} \cup G_{\mathrm{other}}   (9)

where G_backbone is the set of the backbone component and G_other contains the unwanted nodes and branches. Obviously, in order to extract the backbones, the direct idea is to remove the spines and other unwanted small branches. Hence, in our algorithm, a graph erosion operation is used to remove unwanted trivial small branches and leaves of the tree while keeping the major tree structure. The erosion operator E is defined by

E(G_{\mathrm{MST}}) = G_{\mathrm{MST}} - \bigcup_{i} \big\{ (e_{i}, v_{i}) \;\big|\; \deg(v_{i}) \le 1,\; e_{i} = e_{i}(v_{i}, v_{i+1}),\; (e_{i}, v_{i}) \in G_{\mathrm{MST}} \big\}   (10)
After a sequence of erosion operations on the tree, the original backbone structure emerges. However, it is a little shorter than the true backbone, so a dilation operation is applied to bring it back to its original length. After the graph erosion and dilation operations, we obtain the initial backbone chains without the small branches, which include spines and unwanted protrusions. Once the initial backbone chains are extracted, a multi-level 3-D C³ B-Spline curve fitting algorithm is used to refine the spatial positions of the points belonging to a single branch chain. Given n points along the branch-chain skeleton, at a given resolution level the B-Splines are simply scaled versions of the B-Splines at the previous level, such that B'_{i,d}(x) = B_{i,d}(2x), where B'_{i,d}(x) is the B-Spline at the higher level. Consider a B-Spline curve

P(t) = \sum_{i=0}^{m} P_{i}\, B_{i}(t)

with control points P_i. We suppose that the order and the knots of the B-Spline curve are fixed, so they are not subject to optimization. Given data points X_k, k = 1, 2, ..., n, we want to find the control points P_i, i = 1, 2, ..., m, that minimize the objective function

P^{*} = \arg\min \left\{ E(P) = \frac{1}{2} \sum_{k=1}^{n} \big( P(t_{k}) - X_{k} \big)^{2} + \lambda\, \lVert P''(t) \rVert \right\}   (11)
6 Results and Discussions Our systematic approach was developed completely in C++, where the nonlineardiffusion and B-Spline fitting algorithms were implemented with ITK and Kruskal's Minimum Spanning Tree algorithm were implemented with the Boost Graph library [17].The complete algorithms were executed in 32bit windows, and the processed Image on Dell Optiplex 960, T5500 Intel Core 2 Duo CPU @3.00GHz w/ 3.2 GB of RAM. In the first experiment, as shown in Fig.2, the tested image is captured by 2-photon lasers scanning microscopy in vivo. In this image, the neurons are located in the Nervous System Region: Neocortical layer 6 axons. Fig. 5. (a)~(d) illustrates the intermediary results from the skeleton points to final B-Spline fitting backbone. Comparing Fig.5(b) and Fig.5(a), we can see that the dendrite graph contains many small
branches whose roots are located on the backbone. For backbone extraction, these small branches need to be pruned. After graph erosion and dilation with 50 iterative steps, we obtained the initial backbone network (Fig. 5(b)). Then we applied the B-Spline fitting algorithm to each segment of the initial backbone network and obtained the final refined backbone. To show the performance of the B-Spline fitting, we display our approach's tracing results for three neurons with different structures in Fig. 6. These results show that our approach can process different types of neurons with different structures, such as parallel, fork and cross shapes. In the second experiment, we tested our algorithm on an image from the Trachtenberg Laboratory. Fig. 7 illustrates the intermediary results from the skeleton points to the final B-Spline fitted backbone: Fig. 7(a) shows a 2D projection of a segmented neuron image; Fig. 7(b) shows the skeleton points detected by the HSPD algorithm; Fig. 7(c) is the dendrite graph created by KMST; Fig. 7(d) is the initial extracted backbone after graph erosions and dilations with 50 iterative steps. Our final extracted backbone is shown in Fig. 7(e), in which a third-order B-Spline is used. In the third experiment, we show that our algorithm performs well in backbone network extraction. Fig. 8(a) shows the extracted backbone network from an image of the Trachtenberg Laboratory, and Fig. 8(b) shows the extracted backbone network from an image of MBF Bioscience Inc., while Fig. 9 shows the extracted backbone network from Dr. Potter's data. These experiments demonstrate that our approach can reduce 3D neuronal structures to a 1D graph representation while preserving the topology of the structure accurately. Due to the variation between datasets in image quality, image resolution, dendrite dimensions and shapes, etc., the algorithm shows varying performance on different groups of images.
Fig. 6. B-Spline fitting results. Several sub-region results are shown, in which the blue curves and red curves show the initial extracted backbones and the B-Spline fitting results, respectively.
Fig. 7. (a)~(e) illustrates the intermediary results from the skeleton points to final B-Spline fitting backbone.(a). The input image; (b) Skeleton points results from HSPD algorithm. (c) KMST results. (d) Graph erosion and dilation results. (e). Final extracted backbone using BSpline fitting refinement.
Fig. 8. (a):The extracted backbone network from an image of Trachtenberg Laboratory. (b): The extracted backbone network from an image of MBF Bioscience Inc.
Fig. 9. The extracted backbone network from Dr. Potter’s data
7 Conclusions

We have presented a novel vision-based pipeline for the automatic detection and extraction of centerlines from 3D microscopy images. The proposed pipeline is an integrated solution that merges ridge point detection, graph representation, minimum spanning tree optimization and B-Spline regularization into a unified framework to deal with this challenging problem. Experimental results on different datasets demonstrate that our approach has high reliability and good robustness and requires little user interaction.

Acknowledgement. This work was supported in part by the Natural Science Foundation of China under Grant No. 60802039, by the Doctoral Foundation of Ministry of Education of China under Grant No. 20070288050, by NUST Research Funding under Grant No. 2010ZDJH07, and was also sponsored by the "Qing Lan Project" of Jiangsu Province.
References 1. London, M., Hausser, M.: Dendritic computation. Annual Review Neuroscience 28(1), 503–532 (2005) 2. Al-Kofahi, K., Lasek, S., Szarowski, D., Pace, C., Nagy, G., Turner, J.N., Roysam, B.: Rapid Automated Threedimensional Tracing of Neurons from Confocal Image Stacks. IEEE Transactions on Information Technology in Biomedicine 6(2), 171–187
3. Xu, X., Wong, S.T.C.: Wong, Optical Microscopic Image Processing of Dendritic Spines Morphology. IEEE Signal Processing Magzine 23(4), 132–135 (2006) 4. Firdaus, J., Kishore, M., Xu, X., Raghu, M., Kun, H., Wong, S.T.C.: Robust 3D reconstruction and identification of dendritic spines from optical microscopy imaging. Medical Image Analysis 13, 167–179 (2009) 5. Yuan, X., Trachtenberg, J.T., Potter, S.M., Roysam, B.: MDL Constrained 3-D Grayscale Skeletonization Algorithm for Automated Extraction of Dendrites and Spines from Fluorescence Confocal Images. Neuroinformatics 7(4), 213–232 (2009) 6. Cheng, J., Zhou, X., Miller, E., Witt, R.M., Zhu, J., Sabatini, B.L., Wong, S.T.C.: A novel computational approach for automatic dendrite spines detection in two-photon laser scan microscopy. J. Neurosci Methods 165(1), 122–134 (2007) 7. Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Machine Intell. 12(7), 629–639 (1990) 8. Osher, S., Sethian, J.: Fronts propagating with curvature dependent speed: algorithms based on the Hamilton-Jacobi For Mulation. Journal of Computational Physics 79, 12–49 (1988) 9. El-Fallah, A.I., Ford, G.E.: The evolution of mean curvature in image filtering. In: Proc. IEEE Internat. Conf. on Image Processing (1), pp. 298–302 (1994) 10. De Paola, V., Holtmaat, A., Knott, G., Song, S., Wilbrecht, L., Caroni, P., Svoboda, K.: Cell type-specific structural plasticity of axonal branches and boutons in the adult neocortex. Neuron. 49(6), 780–783 (2006) 11. Al-Kofahi, Y., Lassoued, W., Lee, W., Roysam, B.: Improved automatic detection and segmentation of cell nuclei in histopathology images. IEEE Transaction on Biomedical Engineering 57(4), 841–852 (2010) 12. Boykov, Y., Kolmogorov, V.: Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision 70(2), 109–131 (2006) 13. Bouix, S., Siddiqi, K.: Divergence-Based Medial Surfaces. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1842, pp. 603–618. Springer, Heidelberg (2000) 14. Cornea, N.D., Silver, D., Yuan, X., Balasubramanian, R.: Computing hierarchical curveskeletons of 3D objects. The Visual Computer 21(11), 945–955 (2005) 15. Davis, M.H., Khotanzad, A., Flamig, D.P., Harms, S.E.: Curvature measurement of 3D objects: evaluation and comparison of three methods. In: International Conference on Image Processing (ICIP 1995), vol. 2, pp. 2627–2631 (1995) 16. Forssell, L.K., Cohen, S.D.: Using line integral convolution for flow visualization: Curvilinear grids, variable-speed animation, and unsteady flows. IEEE Transactions on Visualizationand Computer Graphics 1(2), 133–141 (1995) 17. Siek, J.G., Lee, L.-Q., Lumsdaine, A.: The Boost Graph Library: User Guide and Reference Manual, p. 12. Pearson Education Inc., London (2001)
Magnetic Induction Tomography: Simulation Study on the Forward Problem Wei He, Xiaodong Song, Zheng Xu, and Haijun Luo State Key Laboratory of Power Transmission Equipment & System Security and New Technology, College of the Electrical Engineering, Chongqing University, Chongqing 400044, People’s Republic of China
[email protected]
Abstract. Magnetic induction tomography (MIT) is a kind of electromagnetic detection and imaging technology, which is considered to be useful for the diagnosis of intracranial hemorrhage. The forward problem is an eddy current problem; solving it is useful for improving the resolution of the measurement system and provides basic data for the inverse problem of image reconstruction. The simulation study on the forward problem in this paper includes four parts: illustration of the concept of a new MIT system, establishment of a mathematical model for the forward problem, creation of the human brain model, and image visualization of the intracranial hemorrhage. In the results, the mathematical model was established with the edge finite element method, and MIT image visualization was realized on a real 3D human brain model. This study provides a foundation for the future clinical application of MIT. Keywords: magnetic induction tomography, intracranial hemorrhage, forward problem, edge finite element method, real human brain model, image visualization.
1 Introduction

Magnetic induction tomography (MIT) is a kind of electromagnetic detection and imaging technology. Because the magnetic field can easily penetrate the electrically resistive skull, MIT has the ability to detect the conductivity distribution in brain tissue [1]. Since the conductivity of blood is larger than that of other brain tissues [2], MIT can be used to diagnose intracranial hemorrhage. The basic principle of MIT is to use a sinusoidal time-varying magnetic field B₀ to induce eddy currents in the target object; the eddy currents create another magnetic field added to the source field B₀, and a set of coils detects the change of the magnetic field, which is used to reconstruct the conductivity distribution σ of the target object [3, 4]. The study of MIT has three subjects: the forward problem, the measuring hardware system and the image reconstruction of the conductivity distribution (inverse problem). The forward problem is the calculation of the eddy currents and the magnetic field under the known conditions of the source field B₀ and the conductivity distribution σ. Study of the forward problem is useful for improving the resolution of the
measurement system [5], and provides basic data for the inverse problem of image reconstruction [6]. The existing research in the literature indicates that the calculation method and the simulation model are the two main points in the study of the MIT forward problem. M. Zolgharni [1, 2] used the edge FEM with the A formulation to calculate the forward problem on a complex human brain finite element model. Robert Merwa [3] and Karl Hollaus [4] used a combination of the node and edge finite element methods with the (A_r, V) − A_r formulation to calculate the forward problem on a simple real human brain FE model. In this paper, we also use the edge finite element method to calculate the forward problem, but the mathematical model and the human brain model differ from the existing research. We use the electric field intensity E formulation to establish the boundary value problem on a special MIT system, and the real human brain finite element model is created from MRI brain data using the "HyperMesh" and "BrainSuite" software. Furthermore, we have simulated the visualization of image reconstruction for diagnosing intracranial hemorrhage.

2 Calculation Method

2.1 Concept of the MIT System

The concept of the MIT system simulated here is based on a practical 15-channel measuring system [7]. This system is used for human brain imaging and includes 15 exciter and 15 sensor coils. Here, we retain the 15 sensor coils, and the exciter coils are changed to a pair of Helmholtz coils, which can create a homogeneous sinusoidal time-varying magnetic field through the imaging region. Meanwhile, in order to realize the tomography technique, we use a row of coils, called the locating coils, to detect the z-axis height of the imaging target. The concept of the MIT system is shown in Fig. 1.
2 Calculation Method 2.1 Concept of the MIT System The concept of MIT system simulated here was based on a practical, 15-channel, measuring system [7]. This system is used for human brain image, which includes 15 exciter and 15 sensor coils. Here, we retain the 15 sensor coils, and the exciter coils are changed to a pair of Helmholtz coils, which can create homogeneous sinusoidal time-varying magnetic field through the imaging region. Meanwhile, in order to realize the tomography technique, we use a row of coils called as the locating coils to detect the z-axis height of the imaging target. The concept of the MIT system is shown in Fig.1.
i
i
Γ1i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
Γ2i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
B0
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
i
yi
i
i
Γ1i
i
i
i
i
i
i
Γ2
x
Fig. 1. The concept of MIT system
2.2 The Forward Problem

The MIT forward problem is a time-harmonic eddy current problem. The calculated region Ω can be divided into two parts, the conducting region and the non-conducting region, representing the imaging region and the air region, respectively, in Fig. 1. The exciting source is the homogeneous sinusoidal time-varying magnetic field B₀. Here we use the electric field intensity E as the vector variable to establish the boundary value problem for the forward problem:
\nabla \times (\nabla \times \mathbf{E}) + j\omega\mu_{0}\sigma \mathbf{E} = 0 \quad \text{in } \Omega
(\nabla \times \mathbf{E}) \times \mathbf{n} = 0 \quad \text{on } \Gamma_{1}
(\nabla \times \mathbf{E}) \times \mathbf{n} = -j\omega \mathbf{B}_{0} \quad \text{on } \Gamma_{2}   (1)
where σ is the conductivity of the conducting region, μ₀ the vacuum permeability of the entire region Ω, and ω the angular frequency of B₀. In problem (1), the governing equation is obtained by coupling Ampere's law and Faraday's law and neglecting the displacement current density [3]. The boundary conditions on Γ₁ and Γ₂ (shown in Fig. 1, where n stands for the normal unit vector on Γ) are Neumann boundary conditions. On Γ₁, the condition comes from the fact that the tangential component of the magnetic field intensity, H × n, is zero. On Γ₂, it is based on the assumption that only the source field B₀ exists. Problem (1) neglects the interface conditions between the conducting region and the non-conducting region because we use the edge finite element method to solve this problem.

2.3 Edge Finite Element Method
The edge finite element method is very suitable for solving the eddy current problem in three dimensions [8]. It automatically enforces the physically necessary continuity of the tangential components of vector quantities such as the magnetic field intensity H and the electric field intensity E on the interface boundary [9]. Thus we can ignore the interface boundary conditions in the boundary value problem (1). To implement the edge FEM, we first use the Ritz technique to transform problem (1) into an equivalent functional variation problem:

J = \frac{1}{2}\int_{\Omega} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E})\, d\Omega
  + \frac{1}{2}\int_{\Omega} j\omega\mu_{0}\sigma\, \mathbf{E} \cdot \mathbf{E}\, d\Omega
  - \int_{\Gamma_{2}} (j\omega \mathbf{B}_{0} \times \mathbf{E}) \cdot d\mathbf{S}, \qquad \delta(J) = 0   (2)

where J is the functional and δ(J) = 0 is the variation equation. We then use a kind of edge finite element, the first-order tetrahedron with 4 nodes and 6 edges, to discretize the calculated region Ω. The functional J is thus also discretized
as the sum of the element functional J eln , and the functional variation δ ( J ) is discretized as the sum of δ ( J eln ) .
In the edge finite element method, the degree of freedom (DOF) is the integral of the vector variable along each edge, so the DOF φ in this problem is

\phi = \int_{l} \mathbf{E} \cdot d\mathbf{l}   (3)
The shape functions of the edge tetrahedral element are

\mathbf{W}_{ij} = \lambda_{i}\nabla\lambda_{j} - \lambda_{j}\nabla\lambda_{i} \qquad (i, j = 1, 2, 3, 4)   (4)
where λ_i are the barycentric coordinates in each element and i and j are the element node numbers [9]. So the electric field intensity E can be approximated in each element as

\mathbf{E} \approx \sum_{e=1}^{6} \mathbf{W}_{e}\,\phi_{e}   (5)
where e is the edge number. Therefore, the functional variation in each element, δ(J_eln), can be transformed into a linear function I([φ₁ … φ₆]), which is called the element characteristic matrix. So the entire variation equation δ(J) = 0 can be discretized as the linear matrix equation

[K][\phi] = [R]   (6)
which is the FE equation to be solved for the MIT forward problem.
2.4 Calculation Technique

To calculate the MIT forward problem, we first use "HyperMesh" to create and mesh the simulation model, and then implement the calculation program in "Matlab". Because "HyperMesh" can only create node finite element data (the element-node matrix) for the model, it is necessary to transform the node FE data into edge FE data (the element-edge matrix). Then, according to the edge finite element method, we create the element characteristic matrix I([φ₁ … φ₆]) for each element and establish the FE equation (6). In equation (6), the coefficient matrix [K] is a complex matrix, so we use the Conjugate Gradient (CG) algorithm [10] to solve the FE equation. Finally, we obtain the DOFs of all edges and use them to calculate the electric field intensity E. The flow of the calculation program is presented in Fig. 2, where the steps "Create the element-edge matrix" and "Create the element characteristic matrix" are independent of each other, so we can use the parallel computing technique in "Matlab 2008a" [11] to increase the program's operating speed.
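The following Python sketch shows the structure of this step under stated assumptions: the per-element 6×6 characteristic matrices and the element-edge incidence are taken as given, and a sparse direct solve stands in for the conjugate gradient iteration on the complex system described above.

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def assemble_and_solve(elements_edges, element_matrices, rhs, n_edges):
    """Assemble [K][phi] = [R] from 6x6 element matrices and solve for the edge DOFs.

    elements_edges: (n_elem, 6) global edge indices per element
    element_matrices: (n_elem, 6, 6) complex element characteristic matrices
    rhs: (n_edges,) complex right-hand side [R]
    """
    rows, cols, vals = [], [], []
    for edges, Ke in zip(elements_edges, element_matrices):
        for a in range(6):
            for b in range(6):
                rows.append(edges[a]); cols.append(edges[b]); vals.append(Ke[a, b])
    K = sp.csr_matrix((vals, (rows, cols)), shape=(n_edges, n_edges), dtype=complex)
    phi = spla.spsolve(K, rhs)     # edge DOFs; the paper uses CG on the complex system
    return phi
```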
Fig. 2. The program flow of calculating the MIT forward problem: mesh the MIT model in "HyperMesh"; create the element-node matrix in "Matlab"; create the element-edge matrix; create the element characteristic matrices; create the coefficient matrix; solve the FE equation with CG; use the DOFs to calculate the electric field intensity E
3 Real Human Brain Model

We use human head MRI data to create the real human brain finite element model. The MRI data (Fig. 3) are in the Analyze 7.5 format, a 3D biomedical image visualization and analysis product developed by the Biomedical Imaging Resource of the Mayo Clinic [12]. First, the MRI data were processed with "BrainSuite", a suite of image analysis tools provided by the LONI Software [13]. We used its Brain Surface Extractor (BSE) function to remove the non-brain tissues from the MRI data, and then used the Skull and Scalp function to create the human head volume surfaces, which included four surface data sets: brain, inner skull, outer skull and scalp. We chose the inner skull surface (Fig. 4(a)) as the source data of the real human brain model. The surface data were then processed with "BrainStorm", a Matlab application created by the University of Southern California [14]. In BrainStorm, we transformed the surface data into Matlab format data containing the surface mesh data: the vertices and triangle elements. Finally, we used "Matlab" and "HyperMesh" to create and mesh the brain volume. In "Matlab", we rewrote the Matlab format data in the "OptiStruct" format, which can be imported into "HyperMesh". In "HyperMesh", we used its surface function to create the brain volume data. Finally, we imported the volume data into "HyperMesh" again and meshed it with tetrahedral elements.
The real human brain FE model (Fig.4.b) was created. Furthermore, we added a small red sphere model into the brain model (Fig.4.c) to simulate the intracranial hemorrhage.
Fig. 3. Human head MRI data
Fig. 4. Real human brain model: (a) inner skull surface; (b) real human brain FE model; (c) intracranial hemorrhage model
4 Results

4.1 Test of the Calculation Method

A typical problem, previously calculated by Oszkar Biro [15], was solved again, using the same boundary value problem (1) and the same edge finite element method of Section 2.3, to test the calculation method. The problem is to calculate the eddy current field in and around a nonmagnetic (μ = μ₀) conducting cube immersed in a homogeneous magnetic field B₀ (1 T, z-direction) with harmonic time variation (f = 50 Hz). The model was created in "HyperMesh" and meshed with 4971 tetrahedral elements. In the model (Fig. 5(a)), the small cube (0.02 m length, 656 elements) is the conducting object with σ = 5.7 × 10⁷ S/m. The far boundary is fixed as a large cube (0.1 m length) around the small cube. The results, the real and imaginary parts of the y component of the eddy current density along the x axis in the conducting cube, were calculated and plotted in Fig. 5(b); they coincide well with those of [15], which demonstrates the validity of the calculation method.
Fig. 5. Test problem: (a) the model; (b) results
4.2 MIT Imaging Visualization
Here, we simulated the MIT system (Section 2.1) in detecting and imaging the intracranial hemorrhage. A red sphere with a diameter of 3 cm was added to the brain model (Fig. 6(a)) to simulate the intracranial hemorrhage. The brain model was meshed with 32319 tetrahedral elements, and the red sphere with 637 elements. The conductivity of the brain model was set to 0.2 S/m, while that of the intracranial hemorrhage model was 2 S/m. The permeability of the entire region was assumed to be the vacuum permeability μ₀, and the frequency of the exciting magnetic field was 120 kHz. The voltage V induced in a sensor coil is calculated as the line integral of the electric field intensity around the coil:

V = \oint \mathbf{E} \cdot d\mathbf{l}   (7)

where dl is an element of the coil. The calculation was first carried out with the conductivity of all models set to 0.2 S/m; the voltages induced in the coils were the "normal" voltages [V₁]. The calculation was then repeated with the 2 S/m hemorrhage model, giving the second data set, the "target" voltages [V₂]. Because only the imaginary part of the induced voltage is related to the conductivity distribution, the imaginary part is used to calculate the sensitivity:
[S] = \frac{\mathrm{imag}([V_{2}] - [V_{1}])}{\mathrm{imag}([V_{1}])}   (8)
which is used to identify the relative level of the conductivity. In the image reconstruction, the sensitivity [S] was standardized to levels from 0 to 5, which were used to reconstruct the relative conductivity distribution of the imaging region with the biharmonic spline interpolation algorithm [16]. We then changed the direction of the magnetic field to the y-axis and calculated the induced voltages in the locating coils. Adjusting the
z-axis height of the locating coils, we identified a particular height according to the maximum voltages. Using this height, we realized the 3D tomographical visualization, shown in Fig.6.b.
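A small sketch of Eqs. (7)-(8) follows, assuming the solved field can be evaluated at arbitrary points through a user-supplied function E_at; the coil is represented by an ordered list of points, and both function names are illustrative.

```python
import numpy as np

def coil_voltage(coil_points, E_at):
    """Eq. (7): induced voltage as the line integral of E around an ordered coil polygon."""
    V = 0.0 + 0.0j
    for p, q in zip(coil_points, np.roll(coil_points, -1, axis=0)):
        mid, dl = 0.5 * (p + q), q - p           # midpoint rule on each coil segment
        V += np.dot(E_at(mid), dl)
    return V

def sensitivity(V_target, V_normal):
    """Eq. (8): relative change of the imaginary part of the induced voltages."""
    V_target, V_normal = np.asarray(V_target), np.asarray(V_normal)
    return (V_target.imag - V_normal.imag) / V_normal.imag
```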
Fig. 6. (a) The intracranial hemorrhage model; (b) MIT 3D visualization
5 Discussion and Conclusions

In this paper, we used the edge finite element method to calculate the MIT forward problem. What differs from the existing research is the use of the electric field intensity E formulation. Compared with the A or (A_r, V) − A_r formulation [1, 2, 3, 4], the E formulation directly yields the physical quantity, while the A or (A_r, V) − A_r formulations have to transform the potential functions into the physical quantity. Thus, the E formulation has better calculation accuracy. Generally speaking, this paper covers two parts of the study of the MIT forward problem: using the edge finite element method with the E formulation to establish the forward problem of a new MIT system excited by a homogeneous sinusoidal time-varying magnetic field; and simulating the intracranial hemorrhage with a real human brain FE model and realizing the 3D tomographic visualization. Based on this paper, we will do more research on the image reconstruction algorithm to improve the quality of the image visualization.

Acknowledgements. This work was supported by the Fundamental Research Funds for the Central Universities (Project No. CDJZR10150021).
References 1. Zolgharni, M., Ledger, P.D., Griffiths, H.: Forward modeling of magnetic induction tomography: a sensitivity study for detecting haemorrhagic cerebral stroke. Med. Biol. Eng. Comput. 47, 1301–1313 (2009) 2. Zolgharni, M., Ledger, P.D., Armitage, D.W.: Imaging Cerebral Haemorrhage with Magnetic Induction Tomography: Numerical Modelling. Physiol. Meas. 30, S187–S200 (2009)
3. Hollaus, K., Magele, C., Merwa, R., Scharfetter, H.: Numerical Simulation of the Eddy Current Problem in Magnetic Induction Tomography for Biomedical Applications by Edge Elements. IEEE Trans. Magn. 40(2), 623–626 (2004) 4. Merwa, R., Hollaus, K., Brandstatter, B.: Numerical solution of the general 3D eddy current problem for magnetic induction tomography. Physiol. Meas. 24, 545–554 (2003) 5. Morris, A., Griffiths, H., Gough, W.: A Numerical Model for Magnetic Induction Tomographic Measurements in Biological Tissues. Physiol. Meas. 22, 113–119 (2001) 6. Scharfetter, H., Hollaus, K., Rosell-Ferrer, J.: Single-step 3-D image reconstruction in magnetic induction tomography: theoretical limits of spatial resolution and contrast to noise ratio. Annals of Biomedical Engineering 34(11), 1786–1798 (2006) 7. Xu, Z., Luo, H., He, W.: A multi-channel magnetic induction tomography measurement system for human brain model imaging. Physiol. Meas. 30, S175–S186 (2009) 8. Biro, O.: Edge Element Formulations of Eddy Current Problems. Comput. Methods Appl. Mech. Engrg. 169, 391–405 (1999) 9. Mingzhong, R., Bangding, T., Jian, H.: Edge Elements with Applications to Calculation of Electromagnetic Fields. Proceedings of the CSEE 14(5), 63–69 (1994) 10. Biro, O., Preis, K., Vrisk, G.: Computation of 3-D magnetostatic fields using a reduced scalar potential. IEEE Trans. Magn. 29(2), 1332–1359 (1993) 11. Parallel computing technique, http://www.mathworks.com 12. MRI data, http://www.mayo.edu 13. BrainSuite, http://www.loni.ucla.edu/Software 14. BtainStorm, http://neuroimage.usc.edu/brainstorm 15. Biro, O., Preis, K.: On the Use of the Magnetic Vector Potential in the Finite Element Analysis of Three-Dimensional Eddy Currents. IEEE Trans. Magn. 25(4), 3145–3159 (1999) 16. Sandwell, D.T.: Biharmonic spline interpolation of GEOS-3 and SEASAT altimeter data. Geophysical Research Letters 14(2), 139–142 (1987)
Diagnosis of Liver Diseases from P31 MRS Data Based on Feature Selection Using Genetic Algorithm Jinyong Cheng1, Yihui Liu1, Jun Sang1, Qiang Liu2, and Shaoqing Wang2 1
School of Computer Science and Information Technology, Shandong Institute of Light Industry, Jinan, Shandong, China, 250353
[email protected] 2 Department of Magnetic Resonance Imaging, Shandong Medical Imaging Research Institute Jinan, Shandong, China, 250021
[email protected]
Abstract. The P31 MRS technique is important both in the diagnosis and in the treatment of many hepatic diseases because it provides non-invasive information about the chemical content of energy metabolism at the cellular level. The data samples from P31 MRS are classified into three types (hepatocellular carcinoma, hepatic cirrhosis and normal hepatic tissue) using computational intelligence methods. A genetic algorithm is used as the main feature selection method, with a Gaussian model adopted in the mutation operation. Two classification algorithms are used: Fisher linear discriminant analysis and quadratic discriminant analysis. Experiments show that the combination of the genetic algorithm and the Fisher linear classifier offers more reliable information for diagnostic prediction of liver cancer in vivo. With 10-fold cross-validation, this algorithm improves the average recognition rate over the three types to 94.28%. Keywords: Genetic algorithm, Fisher linear classifier, hepatic cirrhosis, hepatocellular carcinoma, P31 MRS.
1 Introduction

Imaging is an essential step in the diagnosis of patients because it can provide clinical signs of a tumour. A biopsy specimen, analysed by histopathology, is the gold standard for confirming the presence of a tumour. Magnetic resonance spectroscopy (MRS) has become an important element in the differentiation between malignant and benign lesions, and newer imaging may help to establish the type, grade and stage of the tumour, which is crucial in the further management of the disease, such as treatment planning and the prediction of progression and response to treatment. The first application of P31 MRS carried out in an animal tumour model in vivo was achieved in 1981 [1], while the first applications of MRS in patients with tumours were accomplished a couple of years later, one of which was the detection of an abnormal P31 MRS spectrum of a rhabdosarcoma as compared with that of muscle tissue [2].
Linear Discriminant Analysis (LDA) has been widely used for dimensionality reduction and feature extraction in pattern recognition [3], and has been successfully employed in face recognition and classification. Linear Discriminant Analysis and Fisher Linear Discriminant Analysis (FLDA) are often used interchangeably under the assumption that the samples are normally distributed and the covariance of every class is identical. Quadratic Discriminant Analysis (QDA) is a method closely related to LDA; it is used in machine learning and statistical classification to separate two or more classes by a quadratic surface. In this paper, two discriminant analysis methods (FLDA and QDA) are chosen to classify the samples, based on P31 MRS data provided by the Shandong Medical Imaging Research Institute, into three types: hepatocellular carcinoma, hepatic cirrhosis and normal hepatic tissue. Feature selection is an important step before classification, and two feature selection methods are used in this paper. A genetic algorithm is selected as the main feature selection method, and a Gaussian model is used in the mutation operation to improve the classification results. The other feature selection method is based on the image areas of the seven peaks in the P31 MRS image. Two classification algorithms, Fisher linear discriminant analysis and quadratic discriminant analysis, are used to assign every sample to one of the three types. We begin in Section 2 by describing the source of the samples and the feature selection methods. Section 3 introduces the classification methods, Section 4 reports experiments using the two feature selection and discriminant analysis methods, and Section 5 concludes the paper.
2 Data Acquisition and Feature Selection

2.1 Data Acquisition

The P31 MRS data cannot be obtained directly in digital form because of the confidentiality of the equipment, so the data are recovered from the image shown on the screen. Fig. 1 shows the image of one sample; it is the original image of the whole screenshot, from which the useful part can be cut out by a computer program, as shown in Fig. 2. From Fig. 2(a) we can obtain the true value of every curve point. There are seven main peaks in every image, and the image area of the specific section below each peak has important medical value. To avoid the influence of accidental factors, curve fitting is applied to the original curve, as shown in Fig. 2(b). The approach for acquiring the true data is as follows. The number of rows of the horizontal axis in the image and the number of columns of the vertical axis can be obtained. Because the calibrations of different images are not the same, the column indices of the first and last calibration marks on the horizontal axis are needed; similarly, the row indices of the first and last calibration marks on the vertical axis are needed. These four numbers can be obtained through simple image processing. However, the corresponding real values of the first and last calibration marks must be entered manually, such as 10 and 15 on the
horizontal axis and -0.1 and 0.2 on the vertical axis. With this approach, a correspondence between the coordinate value and the true value of any pixel in the image can be established. A total of 130 samples based on P31 MRS provided by the Shandong Medical Imaging Research Institute were selected, including 45 hepatocellular carcinoma, 28 hepatic cirrhosis and 57 normal hepatic tissue samples. From the P31 MRS image we can obtain seven main peaks, including phosphomonoester (PME), inorganic phosphorus (Pi), phosphodiester (PDE), phosphocreatine (PCr) and adenosine triphosphate (β-ATP, α-ATP, γ-ATP), distributed by chemical shift from left to right (as shown in Fig. 1). In total, 21 parameters were obtained to characterize the P31 MRS data, including the chemical shift, integral and ratio value for the seven peaks mentioned above. For the ratio value, PDE was selected as the control, whose ratio is 1, so this parameter is ignored and the remaining 20 parameters are used in this study.
Fig. 1. A P31 MRS image which contains patient information, liver image and curve value
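The calibration just described is a simple linear mapping between pixel coordinates and physical values. A minimal illustrative sketch is given below (Python, not the authors' code); the pixel positions of the calibration marks are hypothetical placeholders, and only the example axis limits (10 and 15 horizontally, -0.1 and 0.2 vertically) come from the text.

# Hypothetical sketch of the pixel-to-value calibration described above.
def make_axis_mapping(first_px, last_px, first_val, last_val):
    """Return a function mapping a pixel coordinate to a physical value
    by linear interpolation between two calibration marks."""
    scale = (last_val - first_val) / (last_px - first_px)
    return lambda px: first_val + (px - first_px) * scale

# Assumed calibration pixel positions; illustrative values only.
x_of = make_axis_mapping(first_px=42, last_px=618, first_val=10.0, last_val=15.0)
y_of = make_axis_mapping(first_px=480, last_px=60, first_val=-0.1, last_val=0.2)

# Any curve pixel (column, row) can now be converted to (chemical shift, amplitude).
print(x_of(330), y_of(270))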
Fig. 2. (a) A useful part cut from the original image; (b) curve fitting
2.2 Feature Selection

There are two simple feature selection methods. The first uses the values of all points on the original curve; in accordance with the size of the image, the number of points on a curve is set to a unified value of 400. However, the classification results based on this feature selection method are poor. The second simple feature selection method selects the 20 parameters from the 7 image areas below the peaks; its drawback is that it requires curve fitting. In this paper we present another feature selection method based on a genetic algorithm. As a newly developed optimization algorithm, the genetic algorithm (GA) is derived from evolutionism, based on Darwinian genetic selection and natural elimination, and on genetics [4]. The task here is to develop and run a genetic algorithm to search for an m-sized class-discriminative feature subset from the n features. The samples after feature selection are points in the m-dimensional subspace, and the subsequent classification and analysis are performed in this m-dimensional subspace.

a. Encoding and Decoding. A data set consisting of all the data of a P31 MRS image is regarded as an individual in this paper. It is a feature subset, and it is encoded as a vector containing 400 real numbers. Each real number represents a gene of the individual, and each gene is an index into the original feature set. Decoding is the reverse of encoding: it reconstructs the feature subset of the best individual. The choice of encoding method is an important factor influencing the performance and efficiency of the algorithm.

b. Fitness Function. The design of the fitness function directly affects the optimal feature subset and the recognition rate. In this paper, we use the empirical classification error rate of a linear discriminant analysis classifier together with the posterior probability (e_c + e_p) to judge the quality of a feature subset. We define the fitness function as follows:

$$f(x) = e_c + e_p \qquad (1)$$
where $e_c$ is the empirical classification error rate of the linear discriminant analysis classifier and $e_p$ is the posterior-probability term

$$e_p = 1 - \frac{1}{n}\sum_{i=1}^{n} \max\left[ P(c_1 \mid x_i), \ldots, P(c_m \mid x_i) \right] \qquad (2)$$
where $n$ is the number of training samples and $P(c_j \mid x_i)$ is the posterior probability that sample $x_i$ belongs to class $c_j$.

c. Selection and Reproduction. We use elitism coupled with a rank-based stochastic universal sampling selection strategy [5]. Fitness-proportional selection cannot guarantee the selection of any particular individual, including the fittest: the best-so-far individual may not survive into the next generation, so outstanding genes can be thrown away and the individual corresponding to a high-quality solution, even the global optimum, may not exist in the final population. We therefore use an elitism-coupled strategy: the two individuals with the highest fitness are copied directly into the next generation, and stochastic universal sampling is then used to select, from the remaining individuals, those that undergo crossover and mutation. This helps ensure that high-quality solutions are retained in the final population.

d. Crossover. Uniform crossover is used to recombine the genes of the parent chromosomes. Unlike one-point and two-point crossover, uniform crossover is a one-child genetic operation that produces one offspring from each pair of parents, and each gene of the offspring is randomly selected from the corresponding genes of its parents.

e. Mutation. The goal of mutation is to increase exploration; one parent is altered to form one offspring through the mutation operation. This paper uses two different mutation methods, uniform mutation and Gaussian mutation. The uniform mutation model is the simplest and most commonly used in GAs. For integer-valued vector encoding, this method replaces each component with a uniformly low probability called the mutation rate; generally the mutation rate is a constant [6], and here it is set to 0.02. Gaussian mutation is another way to improve the ability of a genetic algorithm to search a local area. It replaces the original gene value with a random number drawn from a normal distribution with mean μ and variance σ². It can steer the population toward the normal distribution and can help avoid premature convergence.

f. Population Size. A large population size can improve the search capability of a GA; however, the larger the population, the longer the genetic algorithm takes to compute each generation. We use the quotient between the size of the original feature set and the feature subset to design the population size. Here, the population size is 400 and the size of the feature subset is 20.

g. Termination Criteria. The simplest and most widely used criterion is that the GA stops, after the last genetic operations, when the current generation counter reaches the maximum number of generations.
Additionally, we use an adaptive criterion to halt the evolution when the weighted average change in a sliding window is less than a predefined threshold constant.
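As an illustration of the fitness function of Eqs. (1)-(2), a minimal Python sketch is given below. It is not the authors' implementation; the scikit-learn LDA classifier, the random stand-in data and the subset size of 20 out of 400 features are assumptions taken from the surrounding text.

# Illustrative sketch of f(x) = e_c + e_p from Eqs. (1)-(2); not the authors' code.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fitness(individual, X, y):
    """individual: array of selected feature indices (one gene per column)."""
    Xs = X[:, individual]
    clf = LinearDiscriminantAnalysis().fit(Xs, y)
    e_c = np.mean(clf.predict(Xs) != y)                       # empirical error rate
    e_p = 1.0 - np.mean(clf.predict_proba(Xs).max(axis=1))    # posterior term, Eq. (2)
    return e_c + e_p                                          # Eq. (1): lower is better

# Toy usage with random data standing in for the 130 x 400 sample matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(130, 400))
y = rng.integers(0, 3, size=130)
individual = rng.choice(400, size=20, replace=False)
print(fitness(individual, X, y))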
3 Classification

This paper uses k-fold cross-validation to evaluate the performance of the proposed method. For classification, Fisher Linear Discriminant Analysis and Quadratic Discriminant Analysis are applied to the data samples based on P31 MRS.

3.1 Fisher Linear Discriminant Analysis

Fisher Linear Discriminant Analysis, chosen in this study, is a statistical method used to decide which type a sample belongs to; it has played an important role in medical diagnosis, meteorology, archaeology, economics, market forecasting, environmental studies and so on. Fisher Linear Discriminant Analysis and Linear Discriminant Analysis can be used interchangeably when the assumptions of LDA are satisfied, namely that the conditional probability density functions are normally distributed and that the covariance of each class is identical. The purpose of Fisher Linear Discriminant Analysis is to find a projection direction vector that projects data from the high-dimensional feature space onto a one-dimensional space such that samples of the same type are clustered as tightly as possible while samples of different types are separated. The method uses the Fisher criterion to choose, as the projection direction vector, the eigenvector corresponding to the eigenvalue that maximizes the between-class scatter matrix [7-11] while minimizing the within-class scatter matrix [12]; this vector is also the weight vector in the discriminant function, so a hyperplane separating the different types of samples can be constructed.

We assume that the conditional probability density functions are normally distributed. Consider a training set $L = \{x_1^1, x_2^1, \ldots, x_{N_1}^1, x_1^2, x_2^2, \ldots, x_{N_2}^2\} \subset \mathbb{R}^l$ consisting of $N$ ($N = N_1 + N_2$) samples divided into two classes, denoted $l_1$ and $l_2$, and let $\bar{x}_1$ and $\bar{x}_2$ be the mean vectors of $l_1$ and $l_2$. The within-class and between-class scatter matrices $S_W$ and $S_B$ are computed as follows:

$$\bar{x}_1 = \frac{1}{N_1}\sum_{j=1}^{N_1} x_j^1 \qquad (3)$$

$$\bar{x}_2 = \frac{1}{N_2}\sum_{j=1}^{N_2} x_j^2 \qquad (4)$$

$$S_B = (\bar{x}_1 - \bar{x}_2)(\bar{x}_1 - \bar{x}_2)^T \qquad (5)$$

$$S_W = \sum_{j=1}^{N_1}(x_j^1 - \bar{x}_1)(x_j^1 - \bar{x}_1)^T + \sum_{j=1}^{N_2}(x_j^2 - \bar{x}_2)(x_j^2 - \bar{x}_2)^T = S_{W_1} + S_{W_2} \qquad (6)$$

Clearly, we want the between-class scatter to be as large as possible and the within-class scatter to be as small as possible. If $y_i$ denotes the projection of $x_i$ along the vector $w$, the Fisher criterion is defined as

$$J_F(w) = \frac{\left|\bar{y}_1 - \bar{y}_2\right|^2}{S_{wy_1}^2 + S_{wy_2}^2} \qquad (7)$$

From the above equation we can obtain the optimal projection vector $w^*$, and a separating hyperplane can then be constructed as

$$g(x) = W \cdot X \qquad (8)$$
where $W = w^*$ is the weight vector.

3.2 Quadratic Discriminant Analysis

Quadratic Discriminant Analysis (QDA) is a powerful statistical multivariate pattern-recognition method [13]. It may be thought of as a more general version of the classical Linear Discriminant Analysis (LDA) method pioneered by R. A. Fisher [14]. Viewing the two classes as swarms of points in a multidimensional feature space, Quadratic Discriminant Analysis can provide a more effective boundary between two swarms that have different covariance structures than LDA, which can only provide a linear boundary [15]. In QDA the measurements are assumed to be normally distributed, but unlike in LDA, the covariance of each class is not assumed to be identical. Consider a training set $L = \{x_1^1, x_2^1, \ldots, x_{N_1}^1, x_1^2, x_2^2, \ldots, x_{N_2}^2\} \subset \mathbb{R}^l$ of $N$ ($N = N_1 + N_2$) normally distributed samples divided into two classes, denoted $l_1$ and $l_2$, that cannot be separated linearly. Let $\bar{x}_1$ and $\bar{x}_2$ be the mean vectors and $\Sigma_1$ and $\Sigma_2$ the covariance matrices of the two classes. The best solution is then the likelihood ratio test, computed as

$$\text{Likelihood ratio} = \frac{\sqrt{\left|2\pi\Sigma_2\right|}^{\,-1} \exp\!\left(-\tfrac{1}{2}(x-\bar{x}_2)^T \Sigma_2^{-1}(x-\bar{x}_2)\right)}{\sqrt{\left|2\pi\Sigma_1\right|}^{\,-1} \exp\!\left(-\tfrac{1}{2}(x-\bar{x}_1)^T \Sigma_1^{-1}(x-\bar{x}_1)\right)} \qquad (9)$$

If the likelihood ratio is below some threshold $t$, the sample is judged to belong to the second class $l_2$; otherwise it belongs to the first class $l_1$. After some rearrangement, it can be shown that the separating surface between the two classes is quadratic, of the form

$$X^T A X + b^T X + c \qquad (10)$$
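A minimal NumPy sketch of Eqs. (3)-(10) on toy two-class data is given below for illustration; it is not the authors' code, and the synthetic data and dimensions are arbitrary.

# Fisher projection and QDA likelihood ratio on toy data; illustrative only.
import numpy as np

rng = np.random.default_rng(1)
X1 = rng.normal(0.0, 1.0, size=(40, 3))     # class l1 samples
X2 = rng.normal(1.5, 0.7, size=(50, 3))     # class l2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)                  # Eqs. (3)-(4)
S_B = np.outer(m1 - m2, m1 - m2)                           # Eq. (5)
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)    # Eq. (6)
w = np.linalg.solve(S_W, m1 - m2)   # Fisher direction maximizing J_F(w), Eq. (7)
g = lambda x: w @ x                 # linear discriminant g(x) = W . X, Eq. (8)

# QDA likelihood ratio of Eq. (9) with class-specific covariances.
C1 = np.cov(X1, rowvar=False)
C2 = np.cov(X2, rowvar=False)
def likelihood_ratio(x):
    def gauss(x, m, C):
        d = x - m
        return np.exp(-0.5 * d @ np.linalg.solve(C, d)) / np.sqrt(np.linalg.det(2 * np.pi * C))
    return gauss(x, m2, C2) / gauss(x, m1, C1)

x = np.array([0.5, 0.5, 0.5])
print(g(x), likelihood_ratio(x))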
4 Experiments and Results

We use two feature selection methods and two discriminant analysis methods. First, the 20 parameters obtained from the 7 image areas below the peaks are used as the main features. In the classification process, we use the k-fold (k = 10) method to evaluate the
performance of the proposed method: m (m = N/10) samples are selected randomly from the N (N = 130) observations for testing and the remaining N - m samples are used for training, and all the samples are tested in turn to obtain the correct classification rate. We choose Fisher Linear Discriminant Analysis rather than plain Linear Discriminant Analysis under the assumption that the conditional probability density functions are normally distributed and the covariance of each class is identical. With the first feature selection method, which uses the 21 medical parameters as the feature vector, we ran two experiments: one classifies the 130 samples using the Fisher linear classifier, and the other classifies the samples using the quadratic classifier. After selecting the training set, every experiment is performed 10 times to obtain the mean of the average recognition rate. The results are summarized in Table 1.

Table 1. Recognition rates (rate ± standard deviation) using the 21 medical parameters

Classifier   Cross-validation   cancer          cirrhosis        normal
LDA          10-fold            0.777±0.194     0.7283±0.240     0.8633±0.134
QDA          10-fold            0.753±0.201     0.7233±0.279     0.8020±0.172
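The 10-fold evaluation protocol above can be sketched as follows. This is illustrative only: the 130-sample P31 MRS data set is not publicly available, so random data stand in for it; only the class sizes (45/28/57), the feature count and the fold count follow the text.

# Sketch of the 10-fold LDA/QDA evaluation protocol; synthetic stand-in data.
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(130, 20))                 # 20 selected parameters per sample
y = np.repeat([0, 1, 2], [45, 28, 57])         # carcinoma / cirrhosis / normal

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("QDA", QuadraticDiscriminantAnalysis())]:
    scores = cross_val_score(clf, X, y, cv=cv)
    print(name, scores.mean(), scores.std())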
With the second feature selection method, which uses the genetic algorithm to select the feature vector, we ran four experiments: the first uses uniform mutation and the Fisher linear classifier to classify the 130 samples; the second uses Gaussian mutation and the Fisher linear classifier; the third uses uniform mutation and the quadratic classifier; and the last uses Gaussian mutation and the quadratic classifier. The results are summarized in Table 2.

Table 2. Recognition rates (rate ± standard deviation) using GA feature selection with different mutation methods

Classifier   Mutation    Cross-validation   cancer          cirrhosis       normal
LDA          Gaussian    10-fold            0.893±0.169     0.936±0.145     0.998±0.016
LDA          Uniform     10-fold            0.862±0.195     0.785±0.232     0.953±0.091
QDA          Gaussian    10-fold            0.672±0.216     0.748±0.321     0.995±0.028
QDA          Uniform     10-fold            0.700±0.224     0.448±0.300     0.952±0.102
Comparing the two rows of Table 1, we can see that the mean recognition rate of the Fisher linear classifier is higher than that of the quadratic classifier, which indicates that, for the P31 MRS data, the linear classifier offers more reliable information for diagnostic prediction. However, when the 21 medical parameters are used as the feature vector, the average recognition rate is not satisfactory. Comparing Table 2 with Table 1, we can see that the mean recognition rate achieved with the GA feature selection method is higher, and when the GA uses Gaussian mutation the recognition rate is higher than with uniform mutation. With 10-fold cross-validation, this algorithm improves the average recognition rate over the three types to 94.28%.
5 Conclusion

In this paper, two feature selection methods are used. A genetic algorithm is selected as the main feature selection method, with a Gaussian model used in the mutation operation to improve the classification results; the other feature selection method is based on the image areas of the seven peaks in the P31 MRS image. We use the Fisher linear classifier and the quadratic classifier to analyse the P31 MRS data samples in order to classify them into three diagnostic types: hepatocellular carcinoma, hepatic cirrhosis and normal hepatic tissue. In the classification process, k-fold cross-validation is used to select the training and testing sets. The experimental results show that the GA feature selection method combined with the Fisher linear classifier gives more accurate results for diagnostic prediction. In future work, we need to improve the recognition rate further in order to provide more reliable information for the clinical diagnosis of liver cancer. One possible approach is to use data fitting to restrain disturbances from human factors such as breathing. Feature extraction and classification methods also need further research.
References 1. Griffiths, J.R., Stevens, A.N., Iles, R.A., et al.: 31P-NMR investigation of solid tumours in the living rat. Biosci. Rep. 4, 319–325 (1981) 2. Griffiths, J.R., et al.: 31P-NMR studies of a human tumour in situ. Lancet 8339, 1435–1436 (1983) 3. Fukunaga, K.: Introduction to Statistical Pattern Recognition, 2nd edn. Academic Press, London (1991) 4. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor (1992) 5. Whitley, D.: The GENITOR algorithm and selection pressure: why rank-based allocation of reproductive trials is best. In: ICGA3, San Mateo, pp. 116–121 (1989) 6. Mills, G.C.: The molecular evolutionary clock: a critique. Perspectives on Science and Christian Faith 46, 159–168 (1994) 7. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997) 8. Martinez, A.M., Kak, A.C.: PCA Versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 23(2), 228–233 (2001) 9. Fisher, R.A.: The Use of Multiple Measurements in Taxonomic Problems. Ann. Eugenics 7, 179–188 (1936) 10. Ye, J.P., Janardan, R., Park, C.H., Park, H.: An Optimization Criterion for Generalized Discriminant Analysis on Undersampled Problems. IEEE Trans. Pattern Anal. Mach. Intell. 26(8), 982–994 (2004) 11. Zhao, W., Chellappa, R., Phillips, J., Rosenfeld, A.: Face Recognition: A Literature Survey. ACM Computing Surveys 35(4), 399–458 (2003) 12. Bishop, C.: Neural Networks for Pattern Recognition. Oxford University Press, Oxford (1995) 13. McLachlan, G.J.: Discriminant Analysis and Statistical Pattern Recognition. John Wiley, New York (1992) 14. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugenics 7, 179–188 (1936) 15. Krzanowski, W.J.: Principles of Multivariate Analysis, vol. 347. Clarendon Press, Oxford (1993)
Research of Acupuncturing Based on Hilbert-Huang Transform Xiaoxia Li1, Xiumei Guo2, Guizhi Xu1, and Xiukui Shang3 1
Province-Ministry Joint Key Laboratory of Electromagnetic Field and Electrical Apparatus Reliability, Hebei University of Technology, Tianjin, China 2 Hebei Normal University of Science and Technology, Qinhuangdao, China 3 Department of Acupuncture, Tianjin University of Traditional Chinese Medicine, Tianjin, China
Abstract. Acupuncture is one of the oldest complementary and alternative medicine methods in the world, but its mechanism is still a mystery, which attracts many researchers to this field. The aim of our research is to explore the regulative effects of acupuncturing the Neiguan and Shenmen acupoints. In this paper, an analysis based on the Hilbert-Huang Transform (HHT) is presented. With this method, the energy is found to be redistributed after acupuncture, especially after the fourth acupuncture stimulus. It is concluded that acupuncture can change the distribution of the energy after several stimuli. Keywords: Acupuncture, HHT, EMD, EEG, Neiguan, Shenmen.
1 Introduction

Acupuncture is an old traditional Chinese medicine technique and is now considered one of the important complementary medicines; its therapeutic value has been widely accepted by the public, but the mechanisms of the acupuncture effect are still controversial. In Traditional Chinese Medicine, the meridians are called "Jingluo" and are considered the passages of "Qi". The theory of the meridians was formed through the long-term practice and observation of the Chinese people, and acupuncture is one of the most important parts of this theory. Acupuncture produces sensations of soreness, numbness, distension or heaviness, and this sensation usually travels to the distal region along a definite pathway. Acupuncturing special points along the pathway can regulate the balance of the human body. Many publications have hypothesized about the mechanisms of acupuncture [1-4], and at the same time many experiments have been carried out to explore these mechanisms. The Hilbert-Huang transform (HHT) was proposed by Huang et al. in 1996. It is a new method for analyzing nonstationary and nonlinear time series data, combining empirical mode decomposition (EMD) and Hilbert spectral analysis (HSA). The EMD method is first used to decompose a signal into so-called intrinsic mode functions, and the HSA method is then used to obtain instantaneous frequency data. The HHT has now been used in many fields, including biomedical engineering, chemistry and chemical engineering, image processing, and meteorological and atmospheric applications [5-8].
In this article, the HHT is used to detect changes in the energy distribution of the brain after acupuncture stimulation at the Shenmen and Neiguan points.
2 Techniques of the Hilbert-Huang Transform

2.1 The Empirical Mode Decomposition (EMD)

The EMD step decomposes the signal data into a collection of intrinsic mode functions (IMFs), to which the Hilbert spectral analysis can then be applied. Each IMF is defined as a function satisfying the following requirements: 1. the number of extrema and the number of zero-crossings must either be equal or differ by one over the data set; 2. the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero at any point. Each IMF represents a simple oscillatory mode as a counterpart to a simple harmonic function; unlike a simple harmonic component with constant amplitude and frequency, an IMF can have variable amplitude and frequency along the time axis. The so-called sifting process is used to extract the IMFs. First, connect all the local maxima by a cubic spline line to form the upper envelope, then repeat the same procedure for the local minima to produce the lower envelope; the upper and lower envelopes should cover all the data between them. Calculate the mean $m_1(t)$ of the two envelopes; the first component $h_1(t)$ is the difference between the data and $m_1(t)$:

$$h_1(t) = x(t) - m_1(t) \qquad (1)$$
Generally, $h_1(t)$ does not yet satisfy the definition of an IMF, and the sifting usually has to be repeated several times. In the subsequent sifting process, $h_1(t)$ is treated as the proto-IMF: the "difference" obtained in the previous sifting is taken as the "signal" in the present sifting. After the sifting has been repeated $k$ times and the result $h_{1k}(t)$ satisfies the IMF properties, it is taken as the first IMF, that is

$$c_1(t) = h_{1k}(t) \qquad (2)$$
Overall, $c_1(t)$ should contain the finest-scale or shortest-period component of the signal. Usually the standard deviation (SD) criterion, proposed by Huang et al., is used as the stoppage criterion that determines the number of sifting steps needed to produce an IMF. It is defined as

$$SD_k = \frac{\sum_{t=0}^{T} \left| h_{k-1}(t) - h_k(t) \right|^2}{\sum_{t=0}^{T} h_{k-1}^2(t)} \qquad (3)$$

When SD is smaller than a pre-given value, the sifting process stops.
Then $c_1$ is separated from the rest of the data by

$$x(t) - c_1(t) = r_1(t) \qquad (4)$$

The residue $r_1(t)$ still contains longer-period variations in the data, so it is treated as the new data and subjected to the same sifting process as described above, yielding the second IMF $c_2(t)$. This procedure is repeated n times, until the final residue $r_n(t)$ becomes a monotonic function from which no more IMFs can be extracted. The signal can then be expressed as

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t) \qquad (5)$$

Thus, a decomposition of the data into n empirical modes is achieved.
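A simplified sketch of the sifting procedure of Eqs. (1)-(5) is given below (Python). It is a bare-bones illustration, not the implementation used in this study: boundary handling of the spline envelopes is crude, and the SD tolerance and number of IMFs are arbitrary choices.

# Minimal EMD sifting sketch; illustrative only.
import numpy as np
from scipy.signal import argrelextrema
from scipy.interpolate import CubicSpline

def sift_once(x, t):
    """One sifting pass: subtract the mean of the upper/lower spline envelopes."""
    imax = argrelextrema(x, np.greater)[0]
    imin = argrelextrema(x, np.less)[0]
    if len(imax) < 2 or len(imin) < 2:
        return None
    upper = CubicSpline(t[imax], x[imax])(t)
    lower = CubicSpline(t[imin], x[imin])(t)
    return x - (upper + lower) / 2.0          # h(t) = x(t) - m(t), Eq. (1)

def emd(x, t, n_imfs=4, sd_tol=0.25, max_sift=50):
    imfs, r = [], x.copy()
    for _ in range(n_imfs):
        h = r.copy()
        for _ in range(max_sift):
            h_new = sift_once(h, t)
            if h_new is None:
                return imfs, r
            sd = np.sum((h - h_new) ** 2) / np.sum(h ** 2)   # SD criterion, Eq. (3)
            h = h_new
            if sd < sd_tol:
                break
        imfs.append(h)                        # c_i(t), Eq. (2)
        r = r - h                             # residue, Eq. (4)
    return imfs, r                            # x(t) = sum c_i + r_n, Eq. (5)

t = np.linspace(0, 1, 1000)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs, r = emd(x, t)
print(len(imfs))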
2.2 Hilbert Spectral Analysis

The second step of the HHT is to find an amplitude-frequency-time distribution, termed the Hilbert spectrum, based on the obtained IMFs. Having obtained the intrinsic mode function components, the instantaneous frequency can be computed using the Hilbert transform. After performing the Hilbert transform on each IMF component, the original data $c_i(t)$ can be expressed as the real part of a complex analytic function $z_i(t)$, with the Hilbert transform $d_i(t) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{c_i(t')}{t - t'}\, dt'$ used as the imaginary part:

$$z_i(t) = c_i(t) + j\, d_i(t) = a_i(t)\, e^{j\theta_i(t)} \qquad (6)$$

Here $a_i(t)$ is the instantaneous amplitude and $\theta_i(t)$ is the instantaneous phase of the IMF $c_i(t)$. The instantaneous frequency $\omega_i(t)$ of $c_i(t)$ is

$$\omega_i(t) = \frac{d\theta_i(t)}{dt} \qquad (7)$$

The signal $x(t)$ can then be expressed as

$$x(t) = \mathrm{Re} \sum_{i=1}^{n} a_i(t)\, e^{\,j\int \omega_i(t)\, dt} \qquad (8)$$
Generally the Hilbert spectrum is used to represent the instantaneous amplitude of all IMFs on the time–frequency plane.
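For illustration, the instantaneous amplitude and frequency of a single IMF (Eqs. (6)-(7)) can be computed from the analytic signal as sketched below; the test signal is synthetic and only the 500 Hz sampling rate follows the experiment.

# Instantaneous amplitude/frequency of one IMF via the analytic signal; illustrative.
import numpy as np
from scipy.signal import hilbert

fs = 500.0                                   # sampling rate used in the experiment (Hz)
t = np.arange(0, 2.0, 1.0 / fs)
c = np.sin(2 * np.pi * 10 * t) * (1.0 + 0.3 * np.sin(2 * np.pi * 0.5 * t))  # stand-in IMF

z = hilbert(c)                               # analytic signal z(t) = c(t) + j d(t), Eq. (6)
a = np.abs(z)                                # instantaneous amplitude a(t)
theta = np.unwrap(np.angle(z))               # instantaneous phase theta(t)
omega = np.gradient(theta, t)                # instantaneous angular frequency, Eq. (7)
freq_hz = omega / (2 * np.pi)

print(a[:3], freq_hz[:3])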
3 Materials and Methods

3.1 Subjects

Three volunteers, male, 21 years old, recruited via advertisement at the Hebei University of Technology, participated in the test. They were not under medication.
Electrical acupuncture was applied unilaterally at the Neiguan and Shenmen points on the right hand. Shenmen (HT7) is one of the significant points of the heart channel of hand-shaoyin. The position of Shenmen is shown in Fig. 1 [9]: it is located at the ulnar end of the transverse crease of the wrist, in the depression on the radial side of the tendon of m. flexor carpi ulnaris. Stimulating Shenmen can help with emotional issues, especially those with related sleep or thinking manifestations such as insomnia and muddled thinking. Neiguan (PC6) is one of the significant points of the pericardium channel of hand-jueyin. The position of Neiguan is shown in Fig. 1 [9]: it is located 2 cun above the transverse crease of the wrist, between the tendons of m. palmaris longus and m. flexor carpi radialis. The two points are both main points for heart diseases such as tachycardia and arrhythmia, and also have a medicinal effect on insomnia, epilepsy and psychosis; they form a reaction and treatment zone for diseases of the heart system. Acupuncturing the Neiguan point can inhibit the process of myocardial ischemia and reperfusion injury, improve the oxygen condition of the myocardium, and effectively treat nausea and vomiting associated with motion sickness. Stimulation of Neiguan induces favorable regulation of both the peripheral nervous system and the central nervous system.
Fig. 1. The position of Shenmen (left) and Neiguan (right)
Electrical acupuncture was used as the stimulator. There are about five wave patterns available in the electro-acupuncture device: dense wave, rarefaction wave, rarefaction-dense wave, intermittent wave and sawtooth wave. The rarefaction-dense wave was used in this test. This pattern provides an alternation of rarefaction wave and dense wave, with each wave lasting for one and a half seconds; the frequency of the dense wave is 50 Hz and the frequency of the rarefaction wave is 2 Hz. In general, this pattern can promote metabolism and the flow of Qi and blood, improve the nutrition of tissues and relieve inflammatory edema, and it is commonly used for pain, sprains, contusions and so on. All 64 electrodes were applied and the data were recorded.

3.2 Procedure

The subjects sat in a quiet room. Neuroscan 2000, a product of the Neuroscan Company, USA, which provides high-density EEG recordings, was used to record the EEG
signal before and after acupuncture. Electrical acupuncture was applied to the Shenmen (HT7) and Neiguan (PC6) points during the test. The experiment included 8 parts: quiet with no stimulus; acupuncture stimulation (intensity 2 mA); acupuncture stimulation (intensity 2.2 mA); needle retaining; acupuncture stimulation (intensity 2.3 mA); acupuncture stimulation (intensity 2.9 mA); needle retaining; and needle extraction.

3.3 Measurement

The EEG device was used to record the response of the brain to acupuncture. While recording, the electrode cap was placed on the subject's head and an electroencephalogram was recorded. The average recording time was about 5 min in every test part, and the sampling frequency was 500 Hz. ERPs were measured from this electroencephalography signal. In the EEG signal of a single trial, the brain response to a single stimulus is not visible; to detect it, many trials must be conducted and averaged together, after which the ERP is obtained.
4 Results

After the recording, the HHT was used to analyse the EEG data before and after acupuncture. The results show that the distribution of energy is not yet regular just after the first and second stimuli (intensities of 2 mA and 2.2 mA), but after the fourth stimulus the distribution of energy shows a consistent pattern: it tends to move toward higher frequencies. Fig. 2 shows IMF0-IMF8 extracted by EMD in the quiet status before acupuncture.
Fig. 2. The IMFs extracted by EMD in quiet status
The IMFs extracted by EMD in the status of the fourth stimulus, IMF0-IMF8, are given in Fig. 3, where IMF8 represents the high-frequency signal.
Fig. 3. The IMFs extracted by EMD in status of fourth stimuli
The Hilbert spectrum of the EEG signal in quiet status is shown in Fig. 4.
Fig. 4. The Hilbert spectrum of the EEG signal in quiet status before the acupuncture
The Hilbert spectrum of the EEG signal in the fourth stimulus is shown in Fig. 5.
Fig. 5. The Hilbert spectrum of the EEG signal in the fourth stimulus of the acupuncture
The signals in Fig. 2 and Fig. 3 and the energy spectra in Fig. 4 and Fig. 5 all show that the energy of the signal concentrates mainly in IMF0-IMF1 in the quiet status before acupuncture, whereas after three stimuli and a period of needle retaining the energy is concentrated in IMF0-IMF3; the energy moves from low frequencies toward higher frequencies. This shows that after about three stimuli of the acupoints, the distribution of the energy has changed.
5 Discussion and Conclusion

A study of acupuncturing the Neiguan and Shenmen points based on the HHT is presented in this paper. With this method, the energy is found to be redistributed after acupuncture, especially after the fourth acupuncture stimulus. It is concluded that acupuncture can change the distribution of the energy after several stimuli.

Acknowledgments. This work was supported by the National Natural Science Foundation of China under Grant No. 50877022, the Scientific Research Fund of the Hebei Provincial Education Department, No. 2006149, and the Natural Science Foundation of Hebei Province, E2006000037. We thank Ms. Yufang Wei and Mr. Yang Song for their help with the data collection.
References 1. Han, J.S.: Acupuncture: neuropeptide release produced by electrical stimulation of different frequencies. Trends Neurosci. 26, 17–22 (2003) 2. Kaptchuk, T.J.: Acupuncture: theory, efficacy, and practice. Ann. Intern. Med. 136, 374–383 (2002)
3. Biella, G., Sotgiu, M.L., Pellegata, G., Paulesu, E., Castiglioni, I., Fazio, F.: Acupuncture produces central activations in pain regions. NeuroImage 14, 60–66 (2001) 4. Uchida, Y., Nishigori, A., Takeda, D., Ohshiro, M., Ueda, Y., Ohshima, M., Kashiba, H.: Electroacupuncture induces the expression of Fos in rat dorsal horn via capsaicin-insensitive afferents. Brain Res. 978, 136–140 (2003) 5. Huang, N.E., et al.: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Roy. Soc. Lond. 454, 903–993 (1998) 6. Huang, N.E., Shen, Z., Long, R.S.: A New View of Nonlinear Water Waves – The Hilbert Spectrum. Ann. Rev. Fluid Mech. 31, 417–457 (1999) 7. Wu, Z., Huang, N.E.: A study of the characteristics of white noise using the empirical mode decomposition method. Proceedings Royal Society of London, A 460, 1597–1611 (2004) 8. Hariharan, H., Gribok, A., Abidi, M.A., Koschan, A.: Image Fusion and Enhancement via Empirical Mode Decomposition. Journal of Pattern Recognition Research 1(1) (2006) 9. Medicine Online, http://www.med126.com
A New Microphone Array Speech Enhancement Method Based on AR Model Liyan Zhang1, Fuliang Yin2, and Lijun Zhang3 1
School of Electronic Engineering, Dalian JiaoTong University, Dalian, China 2 School of Electronic and Infomation Engineering, Dalian University of Technology, Dalian, China 3 Grace International Group, Division R&D, Montreal, Canada
[email protected],
[email protected],
[email protected]
Abstract. This paper applies single-microphone speech enhancement to microphone array speech enhancement and proposes a new speech enhancement method based on the autoregressive (AR) model. First, for the input matrix, the method adopts a generalized cross-correlation method based on onset signals to estimate the time delay. Then, according to the time delay information, the method calculates the linear prediction coefficients of the signals received by the microphone array by means of the Levinson-Durbin algorithm and carries out AR model speech enhancement. Finally, the method combines the enhanced speech signals into one output channel. Simulation results show that the proposed method can eliminate additive noise and improve speech quality effectively. Keywords: speech enhancement, microphone array, autoregressive model, Levinson-Durbin algorithm, time delay estimation.
1 Introduction

The signals received by a microphone are usually speech signals with additive noise, because they are affected by reverberation, background noise and other factors in speech processing systems such as video conferencing [1], vehicular communication [2], robot navigation [3] and hearing aids [4]. This not only degrades the intelligibility of the speech but also impairs the functions of the speech processing system, so it is necessary to suppress the noise efficiently to improve the quality of the speech signals. Speech enhancement means extracting speech from noisy signals, and it plays an important part in improving speech quality. Since the 1970s, extensive research has been carried out on speech enhancement, and many classical single-microphone speech enhancement methods have been put forward, such as spectral subtraction [5] and adaptive filtering [6]. These methods can suppress noise efficiently and are widely applied in speech signal processing systems because they are easy to implement. However, single-microphone speech enhancement algorithms require the sound source position and the microphone position to be relatively fixed: if the sound source position changes, the microphone must be moved accordingly, and if the distance between the microphone and the sound source is large, or the sound source is outside the selected direction of
the microphone, a lot of noise will be introduced and the quality of the received signals will be degraded. Using a microphone array to process speech has been proposed in order to overcome the limitations of a single microphone. Compared with a single microphone, a microphone array can collect more information, so it is better than a single-microphone system at eliminating noise and improving speech quality. A microphone array also has spatial selectivity: it can supply high-quality speech signals from the desired sound source position by means of electronic steering, and it can suppress other speakers' speech and environmental noise. Moreover, a microphone array does not restrict the positions of the microphones and the sound source, nor the movement of the speakers, and it is not necessary to move the microphones in order to change the receiving direction. These characteristics are better suited to capturing multiple or moving sound sources. Owing to these advantages, microphone arrays have become a very important tool in speech signal processing systems, such as video conferencing, for capturing the speaker's speech and improving speech quality. In recent years, microphone array speech enhancement has become a research highlight in speech enhancement. With the wide application of microphone array technology, many microphone array speech enhancement algorithms have appeared; they can be divided into five categories: (1) fixed beamforming [7]; (2) adaptive beamforming [8]; (3) beamforming with post-filtering; (4) signal subspace methods [9]; (5) combined single-microphone and microphone array methods [10]. However, these algorithms do not use the information of the speech production model, so the enhancement results are limited. Speech enhancement algorithms based on speech production models, such as the autoregressive (AR) model [11, 12] and Hidden Markov Models (HMM), are more complicated but remain a research highlight. This paper therefore puts forward a new speech enhancement algorithm that combines single-microphone and multi-microphone processing by making use of the speech production model and the spatial character of the microphone array. The algorithm first estimates the time delay of the input microphone array signals, then carries out AR model speech enhancement separately on the signal subframes received by the microphone array, and finally synthesizes one output speech signal according to the time delay. Simulation results show the validity of the algorithm.
2 The Signal Model

Consider a uniform linear array composed of N microphones with distance d between adjacent microphones, and let the position of the sound source be (x, y); the situation is illustrated in Fig. 1. The signal received at the i-th microphone is expressed as

$$x_i(n) = s(n) * h_i(n) + v_i(n) \qquad (1)$$

Here, $s(n)$ is the sound source signal, $h_i(n)$ is the room impulse response, $v_i(n)$ is the noise, and $*$ denotes convolution. It is assumed that the noise and the reverberation only affect the zero positions of the speech model.
Fig. 1. Signal model
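A toy simulation of the signal model of Eq. (1) is sketched below (Python, not the authors' code). The impulse responses, noise level and source signal are arbitrary stand-ins, while the 5-microphone geometry, 6 cm spacing and 60-degree incidence follow the simulation setup described later in Section 5.

# Toy realization of x_i(n) = s(n) * h_i(n) + v_i(n), Eq. (1); illustrative only.
import numpy as np

rng = np.random.default_rng(0)
fs = 8000
s = rng.standard_normal(fs)                  # stand-in source signal s(n)
N, d, c = 5, 0.06, 343.0                     # 5 mics, 6 cm spacing, speed of sound (m/s)
theta = np.deg2rad(60)                       # assumed angle of incidence

x = []
for i in range(N):
    delay = int(round(i * d * np.cos(theta) / c * fs))      # per-mic propagation delay
    h = np.zeros(64); h[delay] = 1.0; h[delay + 40] = 0.3   # toy impulse response h_i(n)
    v = 0.1 * rng.standard_normal(len(s))                   # additive noise v_i(n)
    x.append(np.convolve(s, h)[:len(s)] + v)                # Eq. (1)
x = np.array(x)
print(x.shape)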
3 Single-Microphone AR Model Speech Enhancement Algorithm

The idea of the AR model speech enhancement algorithm is to approximate the current speech sample using a certain number of "old" speech samples, and to adopt the minimum mean square error (MMSE) method to estimate the model parameters. Finally, the prediction residual $e(n)$ is used to excite the LPC filter in order to obtain the enhanced speech signal $\hat{s}(n)$:

$$\hat{s}(n) = \sum_{p=1}^{P} a_p\, s(n-p) + e(n) \qquad (2)$$

Here, the enhanced speech signal $\hat{s}(n)$ can be regarded as the predicted value of the original speech signal $s(n)$, because $\hat{s}(n)$ is obtained from the linear combination of the samples $s(n-1), s(n-2), \ldots, s(n-P)$; $a_p$ is the linear prediction coefficient, and $P$ is the AR model order.
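The prediction of Eq. (2) can be illustrated with the following sketch, which fits the AR coefficients of one frame by least squares (a simple stand-in for the Levinson-Durbin fit of Section 4.3) and outputs the linear prediction as the enhanced frame. It is illustrative only; the frame length, order and test signal are arbitrary.

# One-frame AR(P) linear prediction used as the enhanced signal; illustrative only.
import numpy as np

def lpc_predict(frame, P=2):
    """Fit an AR(P) model by least squares and return the one-step prediction, Eq. (2)."""
    X = np.column_stack([frame[P - p - 1: len(frame) - p - 1] for p in range(P)])
    # each row of X is [s(n-1), ..., s(n-P)]; the target is s(n)
    target = frame[P:]
    a, *_ = np.linalg.lstsq(X, target, rcond=None)
    pred = X @ a                               # sum_p a_p s(n-p)
    return np.concatenate([frame[:P], pred])   # keep the first P samples unchanged

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 200 * np.arange(240) / 8000)
noisy = clean + 0.2 * rng.standard_normal(240)
enhanced = lpc_predict(noisy, P=2)
print(noisy.shape, enhanced.shape)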
4 Microphone Array AR Model Speech Enhancement

This paper makes use of the spatial character of the microphone array. It first carries out generalized cross-correlation time delay estimation on the input matrix based on onset signals and obtains the signal subframes received by the microphone array; it then applies the Levinson-Durbin algorithm separately to obtain the linear prediction coefficients, carries out AR model speech enhancement, and combines the enhanced speech into a single output signal according to the time delay value. The functional block diagram of the method is shown in Fig. 2; the method mainly consists of time delay estimation, framing, calculation of the linear prediction coefficients (LPC) and AR model speech enhancement. These parts are discussed separately below.
Fig. 2. Microphone array AR model speech enhancement method diagram
4.1 Time Delay Estimation Based on Onset Signals

In a real acoustic environment, especially in a video conferencing system, room reverberation degrades the performance of time delay estimation. This paper therefore adopts time delay estimation based on reverberation-free speech onset signals. The method consists of the extraction of reverberation-free speech onset signals, estimation of the noise power spectrum, and phase-weighted time delay estimation. The speech onset signals are located just after the silent stage, and their amplitude is higher than that of the background noise. The effective length of the onset signals can be determined from the arrival time difference between the direct signal and the reflected signals. The extraction of reverberation-free onset signals can adopt the EA (Echo-Avoidance) model [15], which can be described as

$$i_m(t) = \begin{cases} 1, & t = 0 \\ 0, & 0 < t < \tau_{fe} \\ \alpha_{fe}\, e^{-(t-\tau_{fe})/\tau}, & t \geq \tau_{fe} \end{cases} \qquad (3)$$
where $i_m(t)$ is the room impulse response, and $\alpha_{fe}$ and $\tau_{fe}$ are the amplitude and time delay of the first-arriving reverberation component. $\tau$ is related to the reverberation decay rate: the larger $\tau$ is, the slower the reverberation decays and the longer the room reverberation time. For each frame, a short-time windowed Fourier analysis is applied, denoted $X^i(j,k)$, where $k$ is the frequency-bin index, $j$ is the frame index and $i$ is the microphone index. Because the maximal reverberation amplitude $X^i_{echo}(j,k)$ decays exponentially, the amplitude of the maximal reverberation component at the current frequency bin $k$ can be described as

$$X^i_{echo}(j,k) = \max\left\{ \lambda^{\,n} \left| X^i(j-n,k) \right| \right\}, \quad 0 < \lambda < 1,\; n = 1, 2, \ldots \qquad (4)$$
where $\lambda = e^{-\tau_{fe}/\tau}$ is the reverberation attenuation ratio and $X^i(j-n,k)$ is the FFT value at bin $k$ of the $(j-n)$-th frame. The reverberation-free frequency components $X^i_{echo\text{-}free}(j,k)$ are those satisfying

$$\frac{\left| X^i_{echo\text{-}free}(j,k) \right|}{\left| X^i_{echo}(j,k) \right|} > th \qquad (5)$$

where $th$ is a given threshold. The $X^i_{echo\text{-}free}(j,k)$ frequency components are used to estimate the cross-power spectrum at that frequency bin, which is accumulated onto the previous
result; the noise cross-power spectrum at that frequency bin is then subtracted. The computation in equation (5) can be realized easily following the method in [14]. Finally, the generalized cross-correlation method is used to estimate the time delay.

4.2 Framing

On the basis of a correct estimate of the inter-microphone time delay, the method described below is used to carry out speech enhancement. Suppose the estimated time delay corresponds to L samples and the array is composed of 5 microphones; then each frame processed per microphone contains 6L samples, as shown in Fig. 3.
Fig. 3. The frame data structure
This paper carries out AR model speech enhancement simultaneously and separately on the 2L samples received by the first, third and fifth microphones, and then combines the 2L enhanced speech samples of each route into one group of 6L samples as one frame of output.
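For illustration, a plain GCC-PHAT delay estimator is sketched below as a simplified stand-in for the onset-based, echo-free-weighted cross-correlation of Section 4.1; the PHAT weighting, search range and test signals are assumptions, not the authors' exact scheme.

# Simplified GCC-PHAT time delay estimate between two microphone channels.
import numpy as np

def gcc_delay(x1, x2, fs, max_delay=0.001):
    """Return the delay (in samples) of x2 relative to x1 via GCC-PHAT."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    G = X1 * np.conj(X2)
    G /= np.maximum(np.abs(G), 1e-12)          # PHAT weighting
    cc = np.fft.irfft(G, n)
    cc = np.concatenate([cc[-(n // 2):], cc[:n // 2]])
    lags = np.arange(-(n // 2), n // 2)
    m = int(max_delay * fs)
    keep = (lags >= -m) & (lags <= m)
    return lags[keep][np.argmax(cc[keep])]

fs = 8000
rng = np.random.default_rng(0)
s = rng.standard_normal(4000)
L = 3                                          # true inter-microphone delay in samples
x1 = s + 0.05 * rng.standard_normal(len(s))
x2 = np.concatenate([np.zeros(L), s[:-L]]) + 0.05 * rng.standard_normal(len(s))
print(gcc_delay(x2, x1, fs))                   # expected to be close to L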
4.3 Levinson-Durbin Algorithm for LPC

The precision of the LPC directly affects the AR model speech enhancement result. There are many methods for computing the LPC, such as the Levinson-Durbin algorithm, the autocorrelation method, the modified covariance method, Burg's method and maximum likelihood estimation. This paper adopts the Levinson-Durbin algorithm to calculate the linear prediction coefficients. The P-th order Yule-Walker equations in matrix form are

$$\begin{bmatrix} r(0) & r(1) & \cdots & r(P) \\ r(1) & r(0) & \cdots & r(P-1) \\ \vdots & \vdots & \ddots & \vdots \\ r(P) & r(P-1) & \cdots & r(0) \end{bmatrix} \begin{bmatrix} 1 \\ a(1) \\ \vdots \\ a(P) \end{bmatrix} = \begin{bmatrix} \rho \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (6)$$

where $r(m)$ is the correlation function, $\{a(1), a(2), \ldots, a(P)\}$ are the coefficients of the AR(P) model, and $\rho$ is the prediction error power. The P-th order AR model parameters can be obtained from this equation through the order recursion of the Levinson-Durbin algorithm. The steps of the algorithm are as follows:
1. Initialization:

$$a_1(1) = -\frac{r(1)}{r(0)} \qquad (7)$$

$$\rho_1 = \left(1 - \left|a_1(1)\right|^2\right) r(0) \qquad (8)$$

2. For $k = 2, 3, \ldots, P$, compute:

$$a_k(k) = -\frac{r(k) + \sum_{j=1}^{k-1} a_{k-1}(j)\, r(k-j)}{\rho_{k-1}} \qquad (9)$$

$$a_k(i) = a_{k-1}(i) + a_k(k)\, a_{k-1}(k-i), \quad i = 1, 2, \ldots, k-1 \qquad (10)$$

$$\rho_k = \left(1 - \left|a_k(k)\right|^2\right) \rho_{k-1} \qquad (11)$$
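The recursion of Eqs. (7)-(11) can be implemented directly, for example as in the following sketch (illustrative; the toy autocorrelation sequence is arbitrary).

# Levinson-Durbin recursion for the Yule-Walker equations (6)-(11); illustrative only.
import numpy as np

def levinson_durbin(r, P):
    """r: autocorrelation values r(0..P); returns (a, rho) with a[1..P] the AR
    coefficients and rho the final prediction-error power."""
    a = np.zeros(P + 1)
    a[0] = 1.0
    a[1] = -r[1] / r[0]                       # Eq. (7)
    rho = (1.0 - a[1] ** 2) * r[0]            # Eq. (8)
    for k in range(2, P + 1):
        acc = r[k] + np.dot(a[1:k], r[k - 1:0:-1])
        a_k = -acc / rho                      # Eq. (9)
        a[1:k] = a[1:k] + a_k * a[k - 1:0:-1] # Eq. (10)
        a[k] = a_k
        rho = (1.0 - a_k ** 2) * rho          # Eq. (11)
    return a, rho

# Toy usage: autocorrelation of a short sequence.
x = np.array([1.0, 0.6, -0.2, -0.5, -0.1, 0.3, 0.2, -0.1])
r = np.array([np.dot(x[:len(x) - m], x[m:]) for m in range(3)])
a, rho = levinson_durbin(r, P=2)
print(a, rho)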
5 Simulation Results

In order to verify the validity of the method, we carried out computer simulation experiments. The simulation evaluates the performance of the proposed algorithm through the time-domain waveforms before and after speech enhancement, as well as the noise suppression. In the simulation, the microphone array is composed of 5 microphones with a spacing of 6 cm, the angle of incidence is 60 degrees, the received speech data are produced by IMAGE model simulation, and the room size is 5 m × 6 m × 3 m. The sampling frequency is 8 kHz and each frame contains 30 points. When the number of LPC data points is 10 and the AR model order is 2, the waveform of the signal is shown in Fig. 4; the proposed method achieves a good speech enhancement result. With the AR model order fixed at 2 and different numbers of LPC data points, the enhanced speech waveforms are shown in Fig. 5: as the number of LPC data points increases, the enhancement result improves, because with more LPC data points the estimated speech is closer to the original speech, at the cost of increased computation and memory. With the number of LPC data points fixed at 10 and different AR model orders, the time-domain waveforms before and after speech enhancement are shown in Fig. 6: as the AR model order increases, the enhancement becomes worse. This is because the AR model parameters are estimated segment by segment under the assumption that the parameters are the same within a segment, whereas speech signals are non-stationary and the model parameters are time-varying; when the AR model order is small, fewer data points are estimated each time, giving a better enhancement effect at the price of a large amount of computation.
Fig. 4. Waveform of the signal (LPC data points is 10, AR model order is 2)
Fig. 5. Waveform of the speech enhancement with different LPC data points (AR model order is 2)
Fig. 6. Waveform of the speech enhancement with different AR model orders (LPC data points is 10)
Table 1 compares the output SNR of the proposed method (LPC data points 10, AR model order 2) with that of the fixed beamforming method for different input signal-to-noise ratios (SNR). As the input SNR increases, the noise suppression ability of both methods declines, but the noise suppression of the proposed method is much better than that of the fixed beamforming speech enhancement method. Table 2 gives the output SNR of the proposed method (AR model order 2, input SNR -1.8112 dB) with different numbers of LPC data points, and Table 3 gives the output SNR of the proposed method (LPC data points 10, input SNR -1.8387 dB) with different AR model orders. The conclusions from Table 2 and Table 3 agree with Fig. 5 and Fig. 6.

Table 1. Output SNR with different input SNR

Input SNR (dB)    Noise suppression ratio (dB)
                  Method of this paper    Fixed beam speech enhancement
-1.8112           18.2138                 3.7551
 3.1569           10.6071                 2.3896
10.6071            6.3773                 2.0517
Table 2. Output SNR with different LPC data points (AR model order is 2, data length is 2 s)

LPC data points    Noise suppression ratio (dB)
5                  11.3706
10                 18.2138
20                 24.8184
40                 30.9992

Table 3. Output SNR with different AR model orders (LPC data points is 10, data length is 2 s)

AR model order     Noise suppression ratio (dB)
1                  29.5092
2                  20.8274
5                  11.5803
8                   9.0009
6 Conclusions

This paper applies single-microphone speech enhancement to microphone array speech enhancement and puts forward a new microphone array speech enhancement method based on the speech production model. The method makes use of the spatial character of the microphone array, and when the AR model order is small it can be regarded as time-varying AR model speech enhancement. In addition, the method is suitable for parallel computing. However, the large amount of computation is still a disadvantage of this method, so reducing the computation is the main direction of future work. The simulation results show the validity of the proposed method.

Acknowledgment. This work was supported in part by the National Science Foundation of China (No. 60772119, 60972063) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 200801410015).
References 1. Nishiura, T., Gruhn, R., Nakamura, S.: Collaborative steering of microphone array and video camera toward multi-lingual tele-conference through speech-to-speech translation. In: IEEE Workshop on Automatic Speech Recognition and Understanding, Trento, Italy, pp. 119–122 (2001) 2. Zhang, X., Hansen, H.J.: CSA-BF: Novel constrained switched adaptive beamforming for speech enhancement and recognition in real car environments. In: IEEE International Conference on Acoustics, Speech and Signal Processing, New York, USA, vol. 2, pp. 125–128 (2003) 3. Zhang, L., Yin, F., Hou, D.: SVD-based beamforming speech enhancement algorithm. In: Proc. of the International Conference on Electronic Measurement & Instruments (ICEMI 2005), Beijing, China, pp. 493–497 (August 2005)
4. Widrow, B., Lou, F.L.: Microphone array for hearing aids: an overview. Speech Communication 39(1-2), 139–146 (2003) 5. Boll, S.F.: Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. on Acoustics, Speech and Signal Processing 27(2), 113–120 (1979) 6. Gabrea, M., Grivel, E., Najim, M.A.: Single microphone Kalman filter-based noise canceller. IEEE Signal Processing Letter 6(3), 55–57 (1999) 7. Haiyun, Z., Zhihong, W., Limin, D.: Speech Enhancement Based on a Unvoiced-Voiced Model. Control & Automation 25(9), 293–295 (2006) 8. Goh, A.B., Mclernon, C.D., Orozco-Lugo, G.A., et al.: A new subband structure for acoustic beamformer with leaky adaptive filters in the blocking matrices. In: IEEE International Conference on Acoustics, Speech and Signal Processing, Philadelphia, USA, vol. 4, pp. 801–804 (2005) 9. Zhaoli, Y., Limin, D.: The Modified Post-Filter Beamforming for Speech Enhancement. Journal of Electronics & Information Technology 28(12), 2269–2272 (2006) 10. Liyan, Z., Fuliang, Y.: An Improved Speech Enhancement Method Based on SVD. Journal of Electronics & Information Technology 30(2), 67–70 (2008) 11. Naigao, J., Fuliang, Y., Dongxia, W., Zhe, C.: Subband particle filtering for speech enhancement. Journal on Communications 27(4), 23–28 (2006) 12. Bingxi, W.: Speech Coding, pp. 64–69. Xidian Press, Xi’an (July 2002) 13. Xilin, L.: Microphone array location and enhancement. Dalian, School of Electronic and Infomation Engineering, Dalian University of Technology (2003) 14. Huang, J., Supaongprapa, T., Wang, F., Ohnishi, N., Sugie, N.: A model-based sound localization system and its application to robot navigation. Robotics and Autonomous Systems 27(4), 199–209 (1999) 15. Fuliang, Y., Aijun, S.: Digital Signal Processing C Programmes, pp. 287–292. Liaoning Science and Technology Press, Liaoning (1996)
A Forecast of RBF Neural Networks on Electrical Signals in Senecio Cruentus Jinli Ding and Lanzhou Wang 1
College of Metrological Technology and Engineering, China Jiliang University, Hangzhou, Zhejiang, China 310018 2 College of Life Sciences, China Jiliang University, Hangzhou, Zhejiang, China 310018
[email protected]
Abstract. Weak electrical signals in Senecio cruentus were measured with a self-made, double-shielded touching test system with platinum sensors. The measured electrical signals were denoised by wavelet soft thresholding and, using the Gaussian radial basis function (RBF), treated as a time series with a delayed input window of length 50. An intelligent RBF forecasting model was set up to forecast such weak signals in plants. The testing results show that it is feasible to forecast the plant electrical signal over a short period. The forecast data are meaningful and can be used as reference inputs for an intelligent automatic control system based on the adaptive characteristics of plant electrical signals, so as to save energy in production in both greenhouses and plastic tunnels. Keywords: model of weak electrical signals, RBF neural network, wavelet soft threshold denoising, intelligent control, Senecio cruentus.
1 Introduction
Weak electrical signals in plants are a reaction of the plant to environmental stimulation or interference [1]. Through signal transduction they excite physiological changes in plant organs and tissues, such as movement, metabolism and substance transportation [2], and they modulate the relationship between plants and their environment [3]. Plant electrical signals therefore carry a great deal of useful information [4], which is of important theoretical and practical value for physiological research and for intelligent control based on signal characteristics in agricultural production and daily life, such as monitoring of the greenhouse environment [5], earthquake forecasting [6], and the development of novel pesticides for plant protection [7]. With the development of techniques for signal detection, analysis and processing [8], it has become feasible to study the characteristics of plant electrical signals [9]. To date, most studies of plant electrical signals have focused on the relationship between signals and physiological effects [10] and on signal characteristics for plant classification [11]. Biological and engineering techniques for reducing energy costs in agricultural production have been developed
extensively, and the remaining room for improvement along this route is limited. To save further energy, the self-adaptive characteristics of plants (crops) should be studied, that is, what kind of growth conditions a plant needs, and when, while it is growing. An intelligent control system based on the self-adaptive characteristics of plant electrical signals should therefore be set up for protected (greenhouse) agriculture. Two questions must be solved first. 1) De-noising and confirmation of the base unit (order of magnitude) of the original electrical signals in plants: our results over many years of research showed that the plant signal is on the order of microvolts (µV) [12, 13], whereas almost all previous publications reported millivolts (mV), so the applicable magnitude of plant signals needs to be confirmed. 2) Construction of a model of the functional signal in plants: like animals, a plant has a perceptive ability to anticipate disturbances and stimulation from its environment. Only by understanding environmental variation and the plant's self-adaptation to it can we obtain useful calculation data for constructing a control model to realize automatic control of plant (crop) growth. In this work, based on wavelet analysis [14], an RBF neural network [15] is used to forecast the time-domain characteristics of plant electrical signals, in order to find reference parameters for constructing an intelligent growth control system that uses the weak electrical signals of plants as feedback information.
2 Materials and Method
2.1 Testing Instruments and Materials
A BL-420E biological testing system (Taimeng Co., Ltd., China), a self-made shielded room (2 m × 2 m × 2 m) and shielded box (60 cm × 60 cm × 60 cm), platinum electrodes (15 mm long, 0.1–0.2 mm tip diameter) and Senecio cruentus plants were used.
2.2 Testing Methods
2.3 Data Processing
2.3.1 Processing of Original Electrical Signals in Plants
Electrical signals of S. cruentus were analyzed in the time and frequency domains and de-noised by the wavelet soft threshold in Matlab (V7.0.1). The principle of wavelet soft-threshold de-noising is as follows. A finite noisy signal can be written as

x(i) = s(i) + σn(i),  i = 1, 2, …, N   (1)

where n(i) is the noise, σ is the noise level and s(i) is the plant electrical signal. Let ω denote the operator of the discrete wavelet transform (DWT), and let X = ωx and S = ωs be the DWT of x and s, respectively; let Ŝ be the estimate of S obtained from X. Donoho's de-noising algorithm [14] then proceeds as follows:
(1) Compute the DWT X = ωx.
(2) Threshold the coefficients in the wavelet domain. The threshold is estimated as t = σ̂√(2log(l)), where l is the data length and σ̂ is the robust estimate of the noise level, defined as

σ̂ = median(|d_{j−1,k}| : k = 0, 1, …, 2^{j−1} − 1) / 0.6745.

The soft-thresholding rule is

Ŝ = T_h(X, t) = sgn(X)(|X| − t) if |X| ≥ t, and 0 if |X| < t   (2)

(3) Compute the inverse DWT ŝ = ω⁻¹Ŝ; ŝ is the recovery of the original signal.
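As an illustration of the de-noising steps described above, the following Python sketch applies wavelet soft thresholding with the universal threshold using the PyWavelets package. The wavelet name ('db4'), the decomposition level and the example signal are illustrative assumptions, not the exact settings used in this study.

```python
import numpy as np
import pywt

def wavelet_soft_denoise(x, wavelet='db4', level=5):
    """De-noise a 1-D signal by soft thresholding of its DWT coefficients."""
    coeffs = pywt.wavedec(x, wavelet, level=level)           # X = DWT(x)
    # Robust noise estimate from the finest-level detail coefficients
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    # Universal threshold t = sigma * sqrt(2 * log(l)), l = data length
    t = sigma * np.sqrt(2.0 * np.log(len(x)))
    # Soft-threshold the detail coefficients, keep the approximation untouched
    denoised = [coeffs[0]] + [pywt.threshold(c, t, mode='soft') for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[:len(x)]

if __name__ == '__main__':
    t_axis = np.linspace(0, 1, 1024)
    clean = 20.0 * np.sin(2 * np.pi * 0.5 * t_axis)           # surrogate "plant signal"
    noisy = clean + 5.0 * np.random.randn(t_axis.size)
    recovered = wavelet_soft_denoise(noisy)
```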
2.3.2 Forecast Principle of RBF Neural Networks
Data de-noised by the wavelet soft threshold can be used for artificial neural network (ANN) analysis. Time-series forecasting is essentially the inverse problem of a dynamical system [16]. The system model F can be reconstructed from the state of the dynamical system:

x(t' + T) = F(x(t), x(t + τ), x(t + 2τ), …, x(t + (m−1)τ))   (3)

where t' = t + (m−1)τ and T is the forecasting step (T > 0). Since plant signals are chaotic [12], the forecasting model F is a nonlinear system model. A radial basis function (RBF) neural network, which has a strong nonlinear mapping ability, was used to approximate and forecast the plant signals. The structure of the RBF network is shown in Fig. 1.
The output of a network with a single hidden layer can be expressed as

y = ω₀ + Σ_{i=1}^{r} ωᵢ φ(‖x − cᵢ‖)   (4)

where x = (x₁, x₂, x₃, …, x_n)ᵀ is the input vector of the network in Rⁿ, ‖·‖ is the Euclidean norm, φ: R⁺ → R is the radial basis function, ωᵢ is the weight coefficient from the hidden layer to the output layer, cᵢ is the center of the i-th radial basis function, and r is the number of radial basis functions, which also equals the number of neurons in the hidden layer.
Fig. 1. Structure of RBF neural networks
It has been proved [16] that, under mild restrictions on the activation function φ of the hidden-layer nodes, an RBF network can approximate any function to arbitrary precision by adjusting the number of hidden nodes, the center values and the connection weights from the hidden layer to the output layer. This provides the theoretical basis for the nonlinear mapping capability of the RBF network. Taking 50 points of the signal as input units, the network forecasts the 51st point of the signal as the output unit. 1500 points of the signal were taken as the training samples of the network, and the subsequent points were used to check the extrapolation (forecasting) performance. Thus, the number of training samples was 1500 − 50 = 1450. When the forecast of one point was finished, its forecast value was added to the network and training was repeated. After several rounds of tuning, the value of the spread constant was selected as 0.1. Before training and testing, the data were normalized as follows:
xₙ = 2(x − min x)/(max x − min x) − 1   (5)
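The following Python sketch illustrates the forecasting scheme described above: a sliding window of 50 de-noised points is used to predict the 51st point with a Gaussian RBF network whose output weights are fitted by least squares. The spread value follows the text; the number of centers, the center selection and the input data file are illustrative assumptions.

```python
import numpy as np

def make_windows(signal, window=50):
    """Build (input, target) pairs: 50 consecutive points -> the 51st point."""
    X = np.array([signal[i:i + window] for i in range(len(signal) - window)])
    y = signal[window:]
    return X, y

class GaussianRBFNet:
    def __init__(self, n_centers=100, spread=0.1):
        self.n_centers, self.spread = n_centers, spread

    def _phi(self, X):
        # Gaussian basis: phi(||x - c||) = exp(-||x - c||^2 / (2 * spread^2))
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.spread ** 2))

    def fit(self, X, y):
        idx = np.random.choice(len(X), self.n_centers, replace=False)
        self.centers = X[idx]                                # centers c_i taken from the data
        Phi = np.hstack([np.ones((len(X), 1)), self._phi(X)])
        self.w, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # solve for w0, w1, ..., wr
        return self

    def predict(self, X):
        Phi = np.hstack([np.ones((len(X), 1)), self._phi(X)])
        return Phi @ self.w

# normalise to [-1, 1] as in Eq. (5), train on the first 1450 samples
signal = np.loadtxt('denoised_signal.txt')                   # hypothetical data file
signal = 2 * (signal - signal.min()) / (signal.max() - signal.min()) - 1
X, y = make_windows(signal, window=50)
model = GaussianRBFNet(spread=0.1).fit(X[:1450], y[:1450])
forecast = model.predict(X[1450:])
```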
3 Results and Discussion
3.1 Characteristics of the Signals in S. cruentus
Fig. 2 (left) shows the time domain of the weak electrical signals in S. cruentus, which contained considerable noise and was a complex weak electrical signal including action, variation and shock potentials. An action potential appeared about 30 s after the plant was tested. Under a test temperature of 11.9°C and a relative humidity of
73%, the maximum amplitude was 18.98 µV, the minimum −95.42 µV, the peak-to-peak value 114.40 µV and the average −4.60 µV. The power spectrum of the signals in S. cruentus was below 0.2 Hz (Fig. 2, right). The signal in S. cruentus is thus a weak, low-frequency and non-stationary signal. The functional characteristics of plant signals deserve further study from the viewpoints of bioinformation and electronics.
Fig. 2. Time domain de-noised electrical signals (left) and power spectrum of the signal (right) in S. cruentus
3.2 Forecast of Electrical Signals in S. cruentus by the RBF Neural Network
Fig. 3 (left) shows a section of the de-noised signal in S. cruentus. After training with the 1450 samples, the fit of the obtained RBF neural network on the training data was very good (Fig. 3, right) and the fitting errors were quite small (Fig. 4, left), which indicates that the RBF neural network is suitable for modeling the original electrical signal of plants. This is very important for the subsequent forecast with the RBF neural network. Fig. 4 (right) shows that the errors of the first two forecast points were below 20%, so either of these points can be selected as a calculation parameter for future intelligent automatic control.
Fig. 3. A section of the de-noised signal (left) and the fitting effect of the RBF neural network on the electrical signal (right) in S. cruentus
Fig. 4. Effect of the RBF network on the signals (left) and fitting errors of the forecast electrical signals (right) in S. cruentus
It is thus feasible to forecast plant electrical signals over time using the RBF neural network. How to use these parameters in an intelligent control system will be studied carefully in future work. As the number of forecast points increases, the forecasting errors grow and become unstable, which can be moderated by adjusting the relation between the input and output neurons. To construct an intelligent automatic control system for production in protected agriculture that exploits the self-adaptive characteristics of plants, an original functional electrical signal must first be obtained. Only de-noised electrical signals modeled and analyzed by the RBF neural network can provide calculation parameters for such an intelligent control system. Since the plant electrical signal has chaotic characteristics, the RBF neural network used here for approximation and forecasting avoids the nonlinear complexity of fully connected multilayer networks and has a strong nonlinear mapping ability [16]. Forecasting the plant electrical signal over time after wavelet soft-threshold de-noising with an RBF neural network is feasible and can provide a parameter for the intelligent control system. The present forecast of the plant electrical signal time series uses only the electrical signal itself; in fact, the plant electrical signal is closely related to plant growth and to environmental factors (such as temperature, humidity and illumination). In future work we plan to add growth and environmental factors as new input neurons, which should further improve the forecast of plant electrical signals.
4 Conclusions
It is feasible to process the weak electrical signal of a plant by wavelet soft-threshold de-noising. The electrical signal of S. cruentus under normal growth is a weak, low-frequency and non-stationary signal in the range of 1–999 µV. Trained with 1450 samples of the de-noised electrical signal of S. cruentus, the RBF neural network fits the training data very well and can be used to forecast the plant electrical signal in the time domain. The forecast results can be used as data for a future intelligent automatic control system; they provide not only an important calculation parameter but also new content and methods for microelectronics and bioinformatics. Understanding some characteristics of weak electrical signals in plants is not enough by itself, so a theory of weak signal transfer in plants and its use for intelligent control, based on information fusion technology, should also be constructed.
Acknowledgments. This work was supported by the Zhejiang Provincial Natural Science Foundation of China under Grant No. Y3090142.
References
1. Wang, L.Z., Li, H.X., Lin, M., et al.: Analysis of plant electrical signal in the time domain and frequency domain. Journal of China Jiliang University 16(4), 294–298 (2005)
2. Lou, C.H.: The substance transportation and information transfer during the growth of higher plants (2). Bulletin of Biology 12, 1–3 (1991)
3. Ren, H.Y., Wang, X.C., Lou, C.H.: The universal existence of electrical signals and its physiological effects in higher plants. Acta Phytophysiologica Sinica 19(1), 97–101 (1993)
4. Wang, L.Z., Chai, Z.L.: A study on the analyses of the strategic mechanism in ecological adaptability of plant populations by mathematical models and biochemistry, pp. 43–45. Science Press, Beijing (2004)
5. Wang, Z.Y., Chen, D.S., Huang, L.: Plant physiological status monitoring system and its application in greenhouse. Transactions of the CSAE 16(2), 101–104 (2000)
6. Guo, Q.S., Su, C.H., Chen, C.R.: Discussion of prediction earthquake mechanism of bioelectric potential of silk tree. Earthquake Research in Shanxi (supplement), 25–27 (1999)
7. Ding, J.L., Ding, G.Y., Li, H.X., et al.: Studies on the electrical signal of a seedling in Cucumis sativus L. Journal of Zhejiang Science and Technology College 18(3), 180–184 (2006)
8. Wang, L.Z., Li, H.X., Lin, M., et al.: Application statistical analysis method in the study of the plant electrical signal. Journal of Jishou University 27(3), 67–70 (2006)
9. Li, H.X., Wang, L.Z., Li, Q.: Study on electrical signal in Clivia miniata. China Jiliang University 16(1), 62–65 (2005)
10. Guo, J.Y., Yang, X.L.: Electrical signals in higher plants. Chinese Agri. Science Bulletin 21(10), 188–191 (2005)
11. Wang, L.Z., Cao, W.X., Ling, L.J.: The determination of weak electrical signal in leaves of Lycoris radiata. Journal of Northwest Normal University 36(2), 62–66 (2000)
12. Wang, L.Z., Li, Q., Li, D.S., et al.: Analysis of electrical signal in Osmanthus fragrans. In: Proc. of SPIE, The 6th International Symposium on Instrumentation and Control Technology, vol. 6357, pp. 63570N-1–7 (2006)
13. Li, Q., Wang, L.Z., Li, D.S., et al.: Analysis of electrical signal of three species in Compositae. Journal of China Jiliang University 17(4), 333–336 (2006)
14. Donoho, D.L.: De-noising by soft-thresholding. IEEE Trans. on Information Theory 41(3), 613–627 (1995)
15. Nishida, S.: Automatic detection method of P300 waveform in single sweep records by using a neural network. Med. Eng. Phys. 16, 425 (1994)
16. Park, J., Sandberg, I.W.: Approximation and radial-basis-function networks. Neural Comp. 5(2), 305–316 (1993)
Classification of Malignant Lymphomas by Classifier Ensemble with Multiple Texture Features
Bailing Zhang and Wenjin Lu
Department of Computer Science and Software Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu Province, 510213, China
[email protected]
Abstract. Lymphoma is a cancer affecting lymph nodes. A reliable and precise classification of malignant lymphoma is essential for successful treatment. Current methods for classifying the malignancies rely on a variety of morphological, clinical and molecular variables. In spite of recent progress, there are still uncertainties in diagnosis. Automatic classification of images taken from slides with hematoxylin and eosin stained biopsy samples can allow more consistent and less labor-consuming diagnosis of this disease. In this paper, three well-known texture feature extraction methods, including local binary patterns (LBP), Gabor filtering and the Gray Level Co-occurrence Matrix (GLCM), have been applied to efficiently represent the three types of malignancies, namely, Chronic Lymphocytic Leukemia (CLL), Follicular Lymphoma (FL), and Mantle Cell Lymphoma (MCL). Three classifiers, k-Nearest Neighbor, multi-layer perceptron and Support Vector Machine, were evaluated, and a simple majority-voting classifier ensemble demonstrated a clear improvement in classification performance. Keywords: Malignant lymphomas, classifier ensemble, multiple texture features.
1 Introduction
Malignant lymphoma is a cancer affecting lymph nodes. The common types include CLL (chronic lymphocytic leukemia), FL (follicular lymphoma), and MCL (mantle cell lymphoma) [Xu 2002]. The ability to distinguish classes of lymphoma from biopsies sectioned and stained with Hematoxylin/Eosin (H&E) can allow more consistent and less labor-consuming diagnosis of the disease. Since only the most knowledgeable and experienced expert pathologists specializing in these types of lymphomas are able to consistently and accurately classify these three lymphoma types using H&E-stained biopsies, automating this classification may be considered an important practical application of medical imaging [Foran 2000]. If a computer program has to discriminate between different classes of images, an appropriate classification algorithm has to be applied to the training data. During the subsequent classification of an unknown image, the formerly trained classifier is applied to the new, unknown image and tries to classify it correctly. Thus the classification process mainly consists of two parts: the extraction of relevant features from
images and the classification based on these features [Orlov 2008]. In the past, various techniques have been applied to the computer analysis of medical microscopic images, and there has been much progress in pattern classification, particularly for bioinformatics applications [Kai 2004, Peng 2008]. For example, machine learning methods such as k-Nearest Neighbor, artificial neural networks and the Support Vector Machine (SVM) have been utilized for the predictive task of protein localization from fluorescence microscopy images, in conjunction with various feature extraction methods [Boland 2001]. Most of the proposed approaches employ feature sets which consist of different combinations of morphological, edge, texture, geometric, moment and wavelet features. It has become a consensus in the machine learning community that an integrative approach combining several methods often offers higher and more robust classification accuracy than a single method. In this study, we investigate how to accurately classify malignant lymphoma images by applying three different texture features: local binary patterns (LBP) [Ojala 2002], Gabor filtering [Manjunath 1996] and the Gray Level Co-occurrence Matrix (GLCM) [Haralick 1979]. The LBP operator has proved to be a powerful means of texture description, which is relatively invariant with respect to changes in illumination and image rotation, and computationally simple. The Gabor filter is also widely adopted for texture description and has been shown to be efficient in many applications. The GLCM method is a way of extracting second-order statistical texture features that considers the spatial relationship of pixels. These kinds of texture features alone might, however, have limited power in describing textures. We propose to apply a simple integrative method to fuse LBP, Gabor filter and GLCM features for improved texture recognition. In many pattern recognition tasks, classifier ensembles have proven to be an efficient approach for improving the performance of recognition systems. Unlike related work that combines different base classifiers trained with the same features, we use complementary texture information, with base classifiers built on different texture feature sets. Using the public IICBU-2008 dataset [Shamir 2008], experimental results indicate that a simple majority-voting based classifier ensemble performs well with the combination of three different classifiers, namely k-Nearest Neighbor, multi-layer perceptron and Support Vector Machine.
2 Texture Descriptors In order to classify microscopic images, some kind of features have to be extracted to express the characteristics and the information contained in the image. The feature sets proposed in the literature comprise, for instance, morphological data of binary image structures, Zernike moments and edge information [Orlov 2007]. Use of a single technique for the extraction of diverse features in an image usually shows limited capabilities for texture description. Texture features extracted using different techniques can be merged in an attempt to enhance their texture description capability. In the following we will study the representation capability of several texture features for the geometric properties and appearance of the Lymphomas images.
2.1 Local Binary Pattern
The Local Binary Pattern (LBP) operator was introduced as a texture descriptor for summarizing local gray-level structure [Ojala 2002, Wolf 2008]. LBP labels the pixels of an image by taking a local neighborhood around each pixel, thresholding the pixels of the neighborhood at the value of the central pixel and using the resulting binary-valued image patch as a local image descriptor. In other words, the operator assigns a binary code of 0 or 1 to each neighbor in the neighborhood. The binary code of each pixel in the case of a 3×3 neighborhood is an 8-bit binary code, and by a single scan through the image the LBP codes of the entire image can be calculated. Fig. 1 shows an example of the LBP operator on a 3×3 neighborhood.
Fig. 1. Illustration of the basic LBP operator
Formally, the LBP operator takes the form
LBP(x_c, y_c) = Σ_{n=0}^{7} s(i_n − i_c) 2ⁿ   (1)

where n runs over the 8 neighbors of the central pixel c, i_c and i_n are the gray-level values at c and n, and s(u) is 1 if u ≥ 0 and 0 otherwise.
A useful extension to the original LBP operator is the so-called uniform patterns [Ojala 2002]. An LBP is "uniform" if it contains at most two bitwise transitions from 0 to 1 or vice versa when the binary string is considered circular. For example, 11100001 (with 2 transitions) is a uniform pattern, whereas 11110101 (with 4 transitions) is a non-uniform pattern. The uniform LBP describes those structures which contain at most two bitwise (0 to 1 or 1 to 0) transitions. Uniformity is an important concept in the LBP methodology, representing important structural features such as edges, spots and corners. Ojala et al. observed that although only 58 of the 256 8-bit patterns are uniform, nearly 90 percent of all observed image neighborhoods are uniform. We use the notation LBP^u_{P,R} for the uniform LBP operator: it means using the LBP operator in a neighborhood of P sampling points on a circle of radius R, where the superscript u stands for using uniform patterns and labeling all remaining patterns with a single label. The number of labels for a neighborhood of 8 pixels is 256 for the standard LBP and 59 for LBP^u_{8,1}. A common practice to apply the LBP coding over the image is by using the histogram of the labels, where a 256-bin histogram represents the texture description of the image and each bin can be regarded as a micro-pattern. Local primitives which are
coded by these bins include different types of curved edges, spots, flat areas, etc. The distribution of these patterns represents the whole structure of the texture. The number of patterns in an LBP histogram can be reduced by only using uniform patterns, without losing much information. There are in total 58 different uniform patterns in the 8-bit LBP representation, and the remaining patterns can be assigned one non-uniform label, thus representing the texture structure with a 59-bin histogram instead of 256 bins. The LBP scheme has been extensively applied in face recognition, face detection and facial expression recognition with excellent success, outperforming state-of-the-art methods [Wolf 2008]. The methodology can be directly extended to microscopy image representation as outlined in the following. First, a microscopy image is divided into M small non-overlapping rectangular blocks R0, R1, …, RM. On each block, the histogram of local binary patterns is calculated; the procedure is illustrated in Fig. 2. The LBP histograms extracted from each block are then concatenated into a single, spatially enhanced feature histogram defined as:
H_{i,j} = Σ_{(x,y)∈R_j} I(f_l(x, y) = i),  i = 0, …, L − 1,  j = 1, …, M − 1   (2)
Here L is the number of different labels produced by the LBP operator and I(A) is 1 if A is true and 0 otherwise. The extracted feature histogram describes the local texture and global shape of microscopy images.
Fig. 2. Feature extraction diagram for image recognition with local binary patterns
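A minimal Python sketch of this block-wise uniform LBP histogram is given below, using scikit-image; the block size follows the experimental setting reported later, the 59-bin coding corresponds to scikit-image's 'nri_uniform' method, and the image-loading step is left to the caller as an assumption.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_block_histogram(gray_image, block=(60, 64), P=8, R=1):
    """Concatenate 59-bin uniform LBP histograms over non-overlapping blocks."""
    # 'nri_uniform' gives the 59-label LBP^u_{8,1} coding (58 uniform patterns
    # plus one label collecting all non-uniform patterns)
    codes = local_binary_pattern(gray_image, P, R, method='nri_uniform')
    n_bins = 59
    h, w = gray_image.shape
    feats = []
    for r in range(0, h - block[0] + 1, block[0]):
        for c in range(0, w - block[1] + 1, block[1]):
            patch = codes[r:r + block[0], c:c + block[1]]
            hist, _ = np.histogram(patch, bins=n_bins, range=(0, n_bins))
            feats.append(hist / max(hist.sum(), 1))       # normalised per-block histogram
    return np.concatenate(feats)
```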
LBP has proved to be a good texture descriptor that is easy to compute and has high inter-class variance and low intra-class variance. Recently, a number of variants of LBP have been proposed [Wolf 2008].
2.2 Gabor Based Texture Features
Gabor filters [Manjunath 1996] have been used extensively to extract texture features for different image processing tasks. The filters are orientation- and scale-tunable edge and line detectors. Statistics of these local features in a region relate to the underlying texture information. The convolution kernel of a Gabor filter is the product of a Gaussian and a cosine function, characterized by a preferred orientation and a preferred spatial frequency.
The standard deviation σ determines the effective size of the Gaussian envelope, the spatial aspect ratio determines the eccentricity of the convolution kernel g, λ determines the frequency (wavelength) of the cosine, θ determines the direction of the cosine function and, finally, ϕ is the phase offset. Typically, an image is filtered with a set of Gabor filters of different preferred orientations and spatial frequencies that appropriately cover the spatial frequency domain, and the features obtained form a feature vector that is further used for classification. Given an image I(x, y), its Gabor wavelet transform is defined as

W_{mn}(x, y) = ∫∫ I(x₁, y₁) g*_{mn}(x − x₁, y − y₁) dx₁ dy₁
where * indicates the complex conjugate and g_{mn} denotes the dilated and rotated versions of the Gabor kernel. We assume the local texture regions are spatially homogeneous. The mean μ_{mn} and standard deviation σ_{mn} of the magnitude of the transform coefficients are used to represent the regions for classification:

μ_{mn} = ∫∫ |W_{mn}(x, y)| dx dy,   σ_{mn} = √( ∫∫ (|W_{mn}(x, y)| − μ_{mn})² dx dy )
A feature vector f is created using μ_{mn} and σ_{mn} as the feature components [Orlov 2007]. Five scales and six orientations are used in the common implementation, and f is then given by:

f = (μ₀₀, σ₀₀, μ₀₁, σ₀₁, …, μ₄₅, σ₄₅)
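A sketch of this feature extraction, written with scikit-image's gabor filter, is shown below; the particular frequencies chosen for the five scales are illustrative assumptions rather than the values used in this study.

```python
import numpy as np
from skimage.filters import gabor

def gabor_features(gray_image, n_scales=5, n_orientations=6):
    """Mean and standard deviation of the Gabor magnitude for 5 scales x 6 orientations."""
    feats = []
    frequencies = np.geomspace(0.05, 0.4, n_scales)      # assumed scale sampling
    for f in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = gabor(gray_image, frequency=f, theta=theta)
            mag = np.hypot(real, imag)
            feats.extend([mag.mean(), mag.std()])          # mu_mn and sigma_mn
    return np.array(feats)                                 # 60-dimensional vector f
```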
2.3 Gray Level Co-occurrence Matrices
The gray level co-occurrence matrix (GLCM) proposed by Haralick et al. [Haralick 1979] is another common texture analysis method, which estimates image properties related to second-order statistics. A GLCM is defined over an image as the distribution of co-occurring gray values at a given offset. Mathematically, a co-occurrence matrix C is defined over an n × m image I, parameterized by an offset (Δx, Δy), as

C_{Δx,Δy}(i, j) = Σ_{p=1}^{n} Σ_{q=1}^{m} [ I(p, q) = i and I(p + Δx, q + Δy) = j ]

where the bracket equals 1 when its condition holds and 0 otherwise.
In order to estimate the similarity between different GLCM matrices, Haralick proposed 14 statistical features extracted from them [Haralick 1979]. To reduce the
computational complexity, only some of these features will be selected. The 4 most relevant features that are widely used in literature include: (1)Energy, which is a measure of textural uniformity of an image and reaches its highest value when gray level distribution has either a constant or a periodic form; (2) Entropy, which measures the disorder of an image and achieves its largest value when all elements in C matrix are equal; (3) Contrast, which is a difference moment of the C and measures the amount of local variations in an image; (4) Inverse difference moment (IDM) that measures image homogeneity.
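The sketch below computes one GLCM and the four statistics listed above with scikit-image (entropy is computed directly from the normalised matrix, since it is not one of the built-in properties); the offset, angle and quantisation settings are illustrative assumptions, and the input is assumed to be a 2-D uint8 image.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops   # spelled 'greycomatrix' in older scikit-image

def glcm_features(gray_image, distance=1, angle=0.0, levels=256):
    """Energy, entropy, contrast and inverse difference moment from one GLCM."""
    glcm = graycomatrix(gray_image, distances=[distance], angles=[angle],
                        levels=levels, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]
    energy = graycoprops(glcm, 'ASM')[0, 0]               # textural uniformity
    contrast = graycoprops(glcm, 'contrast')[0, 0]        # amount of local variation
    idm = graycoprops(glcm, 'homogeneity')[0, 0]          # inverse difference moment
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))       # disorder of the image
    return np.array([energy, entropy, contrast, idm])
```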
3 Classifier Ensemble with Simple Majority Voting
After feature extraction, a statistical model needs to be learned from data that accurately associates image features with predefined classes. Supervised learning algorithms such as neural networks, the k-nearest neighbor (kNN) algorithm and the Support Vector Machine (SVM) have been applied to solve this problem. In pattern recognition, it is believed that ensembles of classifiers have the potential to improve classification results [Polikar 2006]. Combining multiple classifiers is currently an active research area.
3.1 Classification
Many machine learning algorithms have been applied to classification problems. We chose three commonly used classifiers: kNN, the multi-layer perceptron (MLP) and the support vector machine (SVM). Each classifier is trained independently with the feature extraction methods discussed above. The kNN strategy for classification involves searching through the training cases, finding those whose inputs are closest to the inputs of the test case, and using some kind of weighted average of the targets of these neighbors as a prediction of the label. Most kNN classifiers use simple Euclidean distances to measure the dissimilarities between examples represented as vector inputs. Euclidean distance metrics, however, do not capitalize on any statistical regularities in the data that might be estimated from a large training set of labeled examples. An MLP network consists of a set of source nodes forming the input layer, one or more hidden layers of computation nodes, and an output layer of nodes. The MLP constructs input-output mappings that are a nested composition of nonlinearities. The basic idea of the SVM is to construct a hyperplane as the decision surface in such a way that the margin of separation between positive and negative examples is maximized [Shawe-Taylor 2000]. The SVM achieves this by determining a classifier in a high-dimensional feature space. So-called support vectors are found that represent the training data and are then formed into a model by the SVM algorithm. The histogram intersection [Smith 1999], k_HI(h_a, h_b) = Σ_{i=1}^{n} min(h_a(i), h_b(i)), is often used as a measure of similarity between histograms h_a and h_b, and because it is positive definite, it can be used as a kernel for discriminative classification using SVMs. Recently, intersection kernel SVMs (henceforth referred to as IKSVMs) have been shown to be successful for detection and recognition [Barla 2003, Maji 2008].
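A sketch of how the histogram intersection kernel can be plugged into an SVM as a precomputed kernel is shown below (using scikit-learn here rather than a dedicated IKSVM implementation); the histogram inputs and the regularisation constant are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def histogram_intersection(A, B):
    """Gram matrix K[i, j] = sum_k min(A[i, k], B[j, k])."""
    return np.array([[np.minimum(a, b).sum() for b in B] for a in A])

def train_ik_svm(X_train, y_train, X_test, C=1.0):
    # X_train, X_test: rows are (normalised) histograms; y_train: class labels
    clf = SVC(kernel='precomputed', C=C)
    clf.fit(histogram_intersection(X_train, X_train), y_train)
    return clf.predict(histogram_intersection(X_test, X_train))
```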
3.2 Classifier Combination by Majority Voting
The second stage of an ensemble method is to combine the base models built in the previous step into a final ensemble model. There are different types of voting systems; the most frequently used are simple voting and weighted voting [Duda 2001]. Simple voting, also called majority voting or select all majority (SAM), treats each component classifier as an equally weighted vote; the class that receives the largest number of votes is chosen as the final classification. In weighted voting schemes, each vote receives a weight, which is usually proportional to the estimated generalization performance of the corresponding component classifier. Weighted voting schemes usually give better performance than simple voting. In our preliminary study, however, we only experimented with simple voting.
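The simple (unweighted) majority vote can be realised in a few lines of Python; the three base classifiers below mirror those used in this study, but their hyper-parameters and the assumption that each classifier sees its own feature set are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def majority_vote_predict(X_feats_train, X_feats_test, y_train):
    """Each base classifier is trained on its own feature set; votes are equally weighted."""
    bases = [KNeighborsClassifier(n_neighbors=1),
             MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000),
             SVC()]
    votes = []
    for clf, Xtr, Xte in zip(bases, X_feats_train, X_feats_test):
        votes.append(clf.fit(Xtr, y_train).predict(Xte))   # e.g. LBP, Gabor, GLCM features
    votes = np.array(votes)                                # shape: (n_classifiers, n_samples)
    # majority vote per test sample (classes assumed encoded as 0, 1, 2, ...)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```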
4 Experiment
In this section we present experiments that show the effectiveness of the proposed method for malignant lymphoma classification. We performed our experiments on the dataset provided in [Shamir 2008], which is a collection of samples prepared by different histologists at a number of hospitals. The malignant lymphoma data belong to three classes, i.e., CLL (chronic lymphocytic leukemia), FL (follicular lymphoma) and MCL (mantle cell lymphoma). There are in total 368 samples available, with 111 images for CLL, 137 images for FL and 120 images for MCL. The images are 1040×1388 pixels in size. The experiment settings for all the classifiers are summarized as follows. For the MLP, we experimented with a three-layer network; specifically, the number of inputs is the same as the number of features, one hidden layer has 20 units, and a single linear unit represents the class label. All of the support vector machine classifiers were optimized by quadratic programming. For the kNN classifier, we chose k = 1. The histogram intersection measure was used as the similarity between LBP histograms [Smith 1999] when kNN and SVM were applied; the MLP classifier was applied with the conventional Euclidean distance. As most of the texture information is contained in the uniform patterns [Ojala 2002], the 59-label LBP^u_{8,1} operator introduced in Section 2 is used in our experiments, i.e., the LBP^u_{8,1} operator is applied to non-overlapping image subregions to form a concatenated histogram. We directly use the plain LBP for all images without weighting the different subregions as proposed in [Ojala 2002]. As noted in [Ojala 2002], the subregions do not need to be of the same size and do not necessarily have to cover the whole image. It was also pointed out in the same paper that the LBP representation is quite robust with respect to the selection of parameters when looking for the optimal window size: changes in the parameters may cause big differences in the length of the feature vector, but the overall performance is not necessarily affected significantly. Therefore, in all the experiments we fixed a subregion (window) size of 60×64 for a given dataset. For the IICBU-2008 malignant lymphoma dataset, we randomly split it into training and testing sets, each time with 10% of each subject's images reserved for testing and the rest used for training. The classification accuracy results reported are the average accuracies of 100 runs, such that each run used a random split of the data into training
and testing sets. The holdout experiment results are presented in Fig. 3(a) for the kNN, MLP and SVM classifiers with the corresponding GLCM, Gabor and LBP features. The results show that, of the four different classification schemes implemented, majority-voting based classifier fusion is the best with regard to the overall accuracy (~80%). We repeated the holdout approach with 100 iterations and recorded the performance distributions. The box plots in Fig. 3(b) show the distributions of classification accuracies. From the box plots, we again find that the overall performance of majority-voting based classifier fusion is better than the others while keeping a low variance in the distribution.
Fig. 3. Classification performance comparison. Left: holdout classification accuracies. Right: box plots of the classification performance distributions.
The confusion matrices that summarize the details of comparisons from the above experiment are shown in Fig. 4.
Fig. 4. Confusion matrices from the holdout experiment after average and round-off. (a), (b) and (c) are from the kNN, MLP and SVM classifiers, respectively, with LBP, Gabor and GLCM features from left to right in each row. (d) is the result from majority voting.
5 Conclusion The ability to identify malignant lymphoma microscopic images rapidly provides significant benefits to biological scientists, drug developers or clinicians. In this study, we
studied three commonly used texture feature extraction methods, namely local binary patterns (LBP), Gabor filtering and the Gray Level Co-occurrence Matrix (GLCM), for malignant lymphoma image classification, and investigated a simple majority-voting classifier integration approach to combine base classifiers built on the three different feature sets. A classification accuracy of 80% is obtained for the public IICBU-2008 malignant lymphoma dataset by exploiting the complementary strengths of feature construction and classifier fusion. The proposed integrative approach is extensible to other biomedical and biological problems. Acknowledgments. The project is funded by the China Jiangsu Provincial Natural Science Foundation project Intelligent Bioimages Analysis, Retrieval and Management (BK2009146).
References 1. Xu, Y., McKenna, R.W., Asplund, S.L., Kroft, S.H.: Comparison of immunophenotypes of small B-cell neoplasms in primary lymph node and concurrent blood or marrow samples. American Journal of Clinical Pathology 118(5), 758–764 (2002) 2. Foran, D.J., Comaniciu, D., Meer, P., Goodell, L.A.: Computer-assisted discrimination among malignant lymphomas and leukemia using immunophenotyping, intelligent image repositories, and telemicroscopy. IEEE Transactions on Information Technology in Biomedicine 4(4), 265–273 (2000) 3. Orlov, N., Eckely, D.M., Shamir, L., Goldberg, I.G.: Machine Vision for Classifying Biological and Biomedical Images. In: Visualization, Imaging, and Image Processing (VIIP 2008), pp. 192–196. Palma de Mallorca, Spain (2008) 4. Boland, M., Murphy, R.: A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells. Bioinformatics 17(12), 1213–1223 (2001) 5. Peng, H.: Bioimage informatics: a new area of engineering biology. Bioinformatics 24(17), 1827–1836 (2008) 6. Kai, H., Robert, F.M.: From quantitative microscopy to automated image understanding. J. Biomed. Opt. 9, 893–912 (2004) 7. Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 971–987 (2002) 8. Manjunath, B., Ma, W.: Texture Features for Browsing and Retrieval of Image Data. IEEE Trans. on Pattern Analysis and Machine Intelligence 18(8), 837–842 (1996) 9. Haralick, R.: Statistical and Structural Approaches to Texture. Proceedings of the IEEE 67(5), 786–804 (1979) 10. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern classification, 2nd edn. John Wiley and Sons, New York (2001) 11. Shamir, L., Orlov, N., Eckley, D.M., Macura, T., Goldberg, I.: IICBU-2008 - A Proposed Benchmark Suite for Biological Imaging. Medical & Biological Engineering & Computing 46, 943–947 (2008) 12. Orlov, N., Johnston, J., Macura, T., Shamir, L., Goldberg, I.: Computer Vision for Microscopy Applications. In: Obinata, G., Dutta, A. (eds.) Vision Systems: Segmentation and Pattern Recognition, p. 546. I-Tech, Vienna (2007)
13. Wolf, L., Hassner, T., Taigman, Y.: Descriptor Based Methods in the Wild. In: Faces in Real-Life Images Workshop at the European Conference on Computer Vision (ECCV 2008) (2008) 14. Polikar, R.: Ensemble Based Systems in Decision Making. IEEE Circuits and Systems Magazine 6(3), 21–45 (2006) 15. Shawe-Taylor, J., Nello Cristianini, N.: Support Vector Machines and other kernel-based learning methods. Cambridge University Press, Cambridge (2000) 16. Barla, A., Odone, F., Verri, A.: Histogram Intersection Kernel for Image Classification. In: Proc. 2003 International Conference on Image Processing (ICIP 2003), vol. 3, pp. III-513– 516 (2003) 17. Maji, S., Berg, A., Malik, J.: Classification using Intersection Kernel Support Vector Machines is Efficient. In: Proc. IEEE Conference Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, pp. 1–8 (2008) 18. Smith, J., Chang, S.: Integrated spatial and feature image query. Multimedia Syst. 7(2), 129–140 (1999)
Denoising of Event-Related Potential Signal Based on Wavelet Method
Zhen Wu¹, Junsong Wang², Deli Shen³, and Xuejun Bai³
¹ Tianjin University of Technology and Education, He-xi District, 300222 Tianjin, China
² Tsinghua University, Hai-dian District, 100084 Beijing, China
³ Tianjin Normal University, He-xi District, 300074 Tianjin, China
[email protected]
Abstract. Event-Related brain Potentials (ERP) play an important role in psychology research. In most cases the measured ERP signals are not clean, so in order to extract useful information from measured ERP data a denoising method is often required. In this paper, we present a denoising technique for the ERP signal based on the wavelet transform (WT), which can decompose a signal into several scales that represent different frequency bands, allowing the representation of the temporal features of a signal at different resolutions. The denoising results reveal that the wavelet-based methods outperformed the digital filter in most cases. Keywords: Denoising, Event-Related Potential, Wavelet.
1 Introduction
Event-related brain potential (ERP) signals are highly subjective, and information about the various states may appear at random on the time scale [1-2]. Therefore, ERP signal parameters, extracted and analyzed using computers, are highly useful in psychology research. Real-world data rarely come clean, so in order to extract useful information from measured ERP data, a theoretically sound and robust denoising method is often required. Wavelet transforms can decompose a signal into several scales that represent different frequency bands, and at each scale the position of the signal's instantaneous structures can be determined approximately. Such a property can be used for denoising of the ERP. In this paper, we present a denoising technique for the ERP signal based on the wavelet transform (WT). The paper is organized as follows: in Section 2 we present the wavelet-based denoising technique for the ERP signal; the results of the validation on several data sets and their comparison to other algorithms are given in Section 3; finally, the conclusions are presented in Section 4.
2 Denoising of ERP Signal Based on Wavelet Method
2.1 Problem Statement
The wavelet transform provides a description of the signal in the time-scale domain, allowing the representation of the temporal features of a signal at different resolutions;
therefore, it is a suitable tool to analyze the ERP signal, which is characterized by a cyclic occurrence of patterns with different frequency content. Moreover, the noise and artifacts affecting the ERP signal also appear at different frequency bands, thus having different contribution at the various scales[3-7]. We apply wavelet-based denoising techniques to nonstationary and noisy multivariate time series of electroencephalograms (EEG) in order to estimate event-related brain potentials (ERP). The Oddball experiment studies brain potentials during a stimulus presentation. The measurement of the ERP was done with 64 electrodes/channels. A part of the electrodes were localized as shown in Fig.1.
Fig. 1. Localization of the electrodes on the head
The goal is to estimate the actual ERP signal {x_n} from the measured data {y_n}, where the assumed model is

y_n = x_n + e_n,  n ∈ Γ   (1)
where {e_n} is stationary Gaussian noise.
2.2 Wavelet Transform and Multi-resolution Analysis
The wavelet transform is a decomposition of the signal as a combination of a set of basis functions, obtained by means of dilation (a) and translation (b) of a single prototype wavelet ψ(t). Thus, the WT of a signal x(t) is defined as [2-7]

W_a x(b) = (1/√a) ∫_{−∞}^{+∞} x(t) ψ((t − b)/a) dt,  a > 0   (2)
The greater the scale factor is, the wider is the basis function and consequently, the corresponding coefficient gives information about lower frequency components of the signal, and vice versa. In this way, the temporal resolution is higher at high frequencies than at low frequencies, achieving the property that the analysis window comprises the same number of periods for any central frequency.
Fig. 2. Multi-resolution analysis via wavelet transform: (a) decomposition of the ERP signal; (b) reconstruction of the ERP signal
If the prototype wavelet ψ(t) is the derivative of a smoothing function θ(t), it can be shown that the wavelet transform of a signal x(t) at scale a is

W_a x(b) = −a (d/db) ∫_{−∞}^{+∞} x(t) θ_a(t − b) dt   (3)

where θ_a(t) = (1/a) θ(t/a) is the scaled version of the smoothing function. The wavelet transform at scale a is proportional to the derivative of the filtered version of the signal with a smoothing impulse response at scale a. Therefore, the zero-crossings of the WT correspond to the local maxima or minima of the smoothed signal at different scales, and the maximum absolute values of the wavelet transform are associated with maximum slopes in the filtered signal.
Regarding our application, we are interested in extracting ERP signals, which are composed of slopes and local maxima (or minima) at different scales, occurring at different time instants within the recording epoch; hence the convenience of using such a prototype wavelet. The scale factor a and/or the translation parameter b can be discretized. The usual choice is to follow a dyadic grid on the time-scale plane: a = 2^k and b = 2^k l. The transform is then called the dyadic wavelet transform, with basis functions

ψ_{k,l}(t) = 2^{−k/2} ψ(2^{−k} t − l),  k, l ∈ Z⁺   (4)

For discrete-time signals, the dyadic discrete wavelet transform (DWT) is equivalent, according to Mallat's algorithm, to an octave filter bank and can be implemented as a cascade of identical cells. From the transformed coefficients W_{2^k} x[2^k l] and the low-pass residual, the original signal can be rebuilt using a reconstruction filter bank. However, for this application we are only interested in the analysis filter bank. Wavelet-based multi-resolution analysis of the ERP signal is illustrated in Fig. 2.
2.3 Wavelet-Based Soft Threshold Denoising of ERP Signal via Level-Dependent Thresholding
For the problem stated in Eq. (1), with data subjected to stationary Gaussian (white or colored) noise, a wavelet-based soft-threshold de-noising technique is employed to extract the ERP signal. The soft level-dependent thresholding estimator is described by the following algorithm for equally spaced discrete data [4-5]:
(1) Compute the wavelet coefficients w_{jk}(y) of the data y₁, …, y_N, where N = 2^J and J ∈ Z, via the discrete wavelet transform

w_{jk} = (1/N) Σ_{n=0}^{N−1} ψ_{jk}(t_n) y_n   (5)

(2) Estimate the noise variance σ_j² at each level j = m, …, J − 1 via

σ̂_j = med(|w_{jk}|) / 0.6745   (6)

where med(·) denotes the median.
(3) The threshold T_j at each level j is then

T_j = σ̂_j √(2 ln 2^j)   (7)
Fig. 3. Denoising results of the ERP signal at the FC4 electrode (from top to bottom: measured data, digital filtering, and wavelet de-noising)
Fig. 4. Denoising results of the ERP signal at the FPZ electrode (from top to bottom: measured data, digital filtering, and wavelet de-noising)
(4) The level-dependent soft-thresholding estimator is, for k = 1, 2, …, 2^j:

d̂_j^k = sgn(w_{jk})(|w_{jk}| − T_j) if |w_{jk}| > T_j, and 0 otherwise   (8)

where the d_j^k are the wavelet coefficients of the signal and the d̂_j^k are their soft-thresholded estimates. The estimated signal x̂_n can then be obtained by taking the inverse discrete wavelet transform:

x̂_n = (1/N) [ Σ_{k=1}^{2^m} v_{mk} φ_{mk}(t_n) + Σ_{j=m}^{J−1} Σ_{k=1}^{2^j} d̂_j^k ψ_{jk}(t_n) ]   (9)
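A compact Python sketch of the level-dependent procedure of Eqs. (5)-(9), using the PyWavelets package, is given below; the wavelet family, decomposition depth and input array are illustrative assumptions rather than the settings used for the 64-channel recordings.

```python
import numpy as np
import pywt

def level_dependent_soft_denoise(y, wavelet='db4', level=5):
    """Soft-threshold each detail level with its own threshold T_j."""
    coeffs = pywt.wavedec(y, wavelet, level=level)
    out = [coeffs[0]]                                    # keep the coarse approximation v_mk
    for d in coeffs[1:]:                                 # detail coefficients w_jk, level by level
        sigma_j = np.median(np.abs(d)) / 0.6745          # Eq. (6): robust noise estimate
        T_j = sigma_j * np.sqrt(2.0 * np.log(len(d)))    # level-dependent threshold, cf. Eq. (7)
        out.append(pywt.threshold(d, T_j, mode='soft'))  # Eq. (8): soft thresholding
    return pywt.waverec(out, wavelet)[:len(y)]           # Eq. (9): inverse DWT

# y = one EEG channel (e.g. FC4) as a 1-D numpy array
# x_hat = level_dependent_soft_denoise(y)
```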
The denoising results for the ERP signals of two electrodes, obtained with the wavelet method and with a digital filter, are illustrated in Fig. 3 and Fig. 4, respectively; the results reveal that the wavelet-based methods outperformed the digital filter in most cases.
3 Results and Discussion
In this section, the proposed wavelet-based denoising technique is used to extract event-related brain potentials (ERP). The denoising results for the ERP signals of two electrodes, i.e., three typical kinds of ERP signal, are shown in Fig. 3 and Fig. 4. The analysis results reveal that the wavelet-based methods outperformed the digital filter in most cases.
4 Conclusions
The noise and artifacts affecting the ERP signal appear at different frequency bands and thus have different contributions at the various scales. The wavelet transform can decompose a signal into several scales that represent different frequency bands, allowing the representation of the temporal features of a signal at different resolutions. We applied wavelet-based denoising techniques to extract event-related brain potentials (ERP). The denoising results for the ERP signals of two electrodes revealed that the wavelet-based methods outperformed the digital filter in most cases.
Acknowledgement. This work was supported by both the Key Program of the National Philosophy and Social Sciences Fund during the 11th Five-Year Plan Period (No. ABA060004) and the Major Program of the Key Research Institute in Universities of the Chinese Ministry of Education (No. 08JJDXLX266).
References 1. Rajendra, A.U., Oliver, F., Kannathala, N.: Non-linear analysis of EEG signals at various sleep stages. Computer Methods and Programs in Biomedicine 80, 37–45 (2005) 2. Martínez, J.P., Almeida, R., Olmos, S.: A Wavelet-Based ECG Delineator: Evaluation on Standard Databases. IEEE Transaction on Biomedical Engineering 51, 570–581 (2004)
3. Solbø, S., Eltoft, T.: A Stationary Wavelet-Domain Wiener Filter for Correlated Speckle. IEEE Transaction on Geoscience and Remote Sensing 46, 1219–1230 (2008) 4. Albert, C.T., Jeffrey, R.M., Steven, D.G.: Wavelet denoising techniques with applications to experimental geophysical data. Signal Processing 89, 144–160 (2009) 5. Paiva, H.M., Roberto, K.H.G.: A Wavelet Band-Limiting Filter Approach for Fault Detection in Dynamic Systems. IEEE Transaction on System, Man, and Cybernetics—Part A: system and Humans 38, 680–687 (2008) 6. Chang, S.G., Yu, B., Vetterli, M.: Adaptive Wavelet Thresholding for Image Denoising and Compression. IEEE Transaction on Image Processing 9, 1532–1546 (2000) 7. Pan, Q., Zhang, L., Dai, G.Z., Zhang, H.C.: Two Denoising Methods by Wavelet Transform. IEEE Transaction on Signal Processing 47, 3401–3406 (1999)
Predict Molecular Interaction Network of Norway Rats Using Data Integration
Qian Li and Qiguo Rong*
College of Engineering, Peking University, Beijing 100871, P.R. China
[email protected],
[email protected]
Abstract. The emergence of systems biology enables us to simulate and analyze an organism's microscopic features at the level of the genome, proteome and interactome. Following the basic principles of systems biology, this article uses a data integration method to predict the molecular interaction network of the Norway rat. A microarray related to cardiac hypertrophy was selected, and the downstream analyses were built on 730 differentially expressed genes. Four heterogeneous data types, including microarray expression, gene sequence, protein subcellular localization and orthology data, were selected to make the overall model more comprehensive. After processing by specific algorithms, the four data types were transformed into five types of evidence: the Pearson correlation coefficient, SVM model recognition, similarity between gene sequences, distance between proteins, and orthologous alignment. A widely used machine learning algorithm, the support vector machine (SVM), is introduced to handle both the preparation of individual evidence and the integration of multiple evidence. We find that the prediction accuracy of data integration is clearly higher than that of any single evidence. Data integration allows heterogeneous data types to reinforce each other's advantages while weakening each other's disadvantages, so as to deliver a more objective and comprehensive understanding of molecular interactions. Keywords: systems biology, data integration, molecular interaction network, SVM.
1 Introduction
A large number of studies in biology have targeted individual genes or proteins. However, as research has progressed, studying single genes in isolation has been found to be insufficient, because a living organism is a complex organic whole: genes do not exist in isolation, and a complex regulatory network exists among them. Currently, the main task of systems biology is to predict molecular interaction networks and biochemical metabolic pathways, and to understand how these interactions modulate the life-support system. If the composition of the molecular interaction network can be accurately predicted, the root cause of a disease may be pinpointed and then intervened at the molecular level, so as to achieve the goal of disease control.
* Corresponding author.
This research focuses on the molecular interaction network corresponding to the symptoms of ventricular hypertrophy in the rat; the conclusions provide a useful reference for heart disease research.
2 Methods
In the NCBI GEO [1] database there are 26 gene chip platforms for the rat; the GPL85 platform was selected in this article. Of the six data sets related to ventricular hypertrophy or hypertension in the rat on the GPL85 platform, data set GDS598 was selected as the target chip. The SOFT-format data set of GDS598 was downloaded from the NCBI database. The data set contains 8799 features; each feature represents a group of probes on the chip, and each group of probes corresponds to a gene or EST. However, not all genes are considered within the scope of this article; only some of the genes and their corresponding proteins were screened as target molecules for constructing the molecular interaction network. Using the screening tool for differentially expressed genes provided on the NCBI web site, 914 features were judged to be differentially expressed. Among these 914 differentially expressed features, 184 have no gene symbol; their sequence information and gene ontology information are missing from the existing databases and they do not facilitate the calculation of the evidence, so they are not considered in this paper. Finally, 730 target molecules were obtained. The follow-up model is built only on these 730 target molecules and their corresponding 266,085 candidate molecular interactions (not including self-interactions).
Pearson correlation coefficient: The Pearson correlation coefficient (PCC) is commonly used in statistics to measure molecular co-regulation [2]. If gene A or its product regulates the transcription and translation of gene B, then the expression levels of gene A and gene B will show some correlation (positive or negative), and |PCC| will be closer to 1 than the genome-wide average. For the 266,085 molecular interactions to be predicted, the PCC of the i-th molecule pair is
pcc_i = Σ_{j=1}^{N} (x_{ij} − x̄_i)(y_{ij} − ȳ_i) / ( √(Σ_{j=1}^{N} (x_{ij} − x̄_i)²) · √(Σ_{j=1}^{N} (y_{ij} − ȳ_i)²) )
where x_{ij} and y_{ij} denote the expression of the i-th molecule pair in the j-th sample of the gene chip, x̄_i and ȳ_i are the average expression of the i-th molecule pair over all samples, and N is the total number of samples. Since this article is only concerned with whether a correlated gene expression pattern exists, and not with whether the correlation is positive or negative, the absolute value |pcc_i| is used.
Support vector machines: SVM theory, proposed by Vapnik, has gained wide attention in recent years because of its solid theoretical foundation and many good properties. When using an SVM for the identification of molecular interactions, the key task is the preparation of the training set. In this paper we select the GDS598 microarray data as the original data set. If the gene expression vector of molecule A is x_A and the gene expression vector of molecule B is x_B, then the corresponding SVM sample vector of the tested molecule pair (A, B) is (x_A, x_B). As the GDS598 chip set contains 46 experimental samples, the dimensions of x_A and x_B are both 46, and the dimension of an SVM sample vector is 92.
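As an illustration of these two kinds of evidence, the following Python sketch computes |PCC| for every candidate pair and stacks the two 46-dimensional expression vectors into a 92-dimensional SVM sample; the expression matrix variable (one row per target molecule, one column per GDS598 sample) and the data file name are illustrative assumptions.

```python
import numpy as np

def abs_pcc_matrix(expr):
    """|Pearson correlation| between every pair of rows of the expression matrix."""
    return np.abs(np.corrcoef(expr))           # expr shape: (n_molecules, 46)

def svm_sample_vector(expr, i, j):
    """92-dimensional SVM sample (x_A, x_B) for the candidate pair (i, j)."""
    return np.concatenate([expr[i], expr[j]])

# expr = np.loadtxt('gds598_expression.txt')   # hypothetical 730 x 46 matrix
# pcc = abs_pcc_matrix(expr)
# sample = svm_sample_vector(expr, 0, 1)
```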
vector is 92. The positive samples of the training set are composed of known PPIs and PDIs. Using APID2NET[3], 2,140 PPIs are found (self-interactions excluded), of which 858 are removed because they do not involve genes of the GPL85 platform; since the training set must not overlap with the prediction set, PPIs among the target molecules are also removed, and finally 287 PPIs are selected as positive samples. The 1,092 known rat PDIs are downloaded from the TRED[4] database and, following the same selection criteria as for the PPIs, 44 PDIs are selected as positive samples. The total number of positive samples is therefore 331. The negative samples are then selected: first, the 730 target molecules and all molecules involved in a known PPI or PDI are removed from the 8,799 features of the GPL85 platform; from the remaining molecules, 10% are randomly paired. No evidence so far suggests any interaction within these pairs; to further ensure non-correlation, the pairs are sorted by |PCC| and the 3,310 pairs with the smallest |PCC| are selected as negative samples. The prediction set consists of the gene expression vectors of the 730 target molecules, giving a total of 266,085 samples to be predicted. In this paper, the SVM discriminant analysis is carried out with the LIBSVM[5] software written by Lin CJ et al.

Similarity of molecular sequences: sequence similarity is an important evidence for predicting molecular interactions. A gene coding sequence has a primary structure and a secondary structure: the primary structure is the linear arrangement of the four DNA nucleotides (A, C, G and T), while the so-called secondary structure refers to the codon usage frequency of the corresponding mRNA. Research has shown that a certain correlation exists between codon usage and gene expression level[6]; according to chromosome proximity (gene neighborhood), functionally related genes tend to cluster under evolutionary selection pressure, so nearby genes tend to have similar codon usage[7]; moreover, previous studies have shown that genes with related functions exhibit common evolutionary patterns in their codon usage[8]. The probability that two molecules interact can therefore be predicted by comparing the codon usage frequencies of their gene coding sequences. In this article the BioMart platform embedded in the Ensembl[9] database is used to collect the gene coding sequences; for the coding sequences not found in BioMart, the codon usage frequencies are queried from the Codon Usage Database[10]. The learning set used in the SVM discrimination is reused here: the 331 positive and 3,310 negative samples, each represented by a 64-dimensional vector describing the distance between the codon usage frequencies of the two coding sequences. The prediction set contains 266,085 samples, of which 80,949 cannot be used to calculate the codon usage distance because their coding information is missing. Finally, the ratios of all 266,085 predicted samples are calculated; the ratio range is [9.73E-08, 265053]. For the predicted samples whose codon usage frequency information cannot be found, the ratio is conservatively set to 0.

Protein subcellular localization: the cell is the basic unit of the structure and function of an organism.
An animal cell generally contains the cell membrane (plasma membrane), cytoplasm and nucleus; in addition, the cytoplasm contains a variety of organelles: mitochondria, endoplasmic reticulum, Golgi apparatus, ribosomes, centrioles, vacuoles, lysosomes and so on. The proteins closely related to biological structure and
characteristics are found in these cellular structures. Each protein is distributed within a specific area, its subcellular localization zone. Under normal circumstances proteins interact and influence each other through protein-protein binding, so proteins located in different subcellular localization zones have difficulty coming into contact with each other, and the possibility of interaction between them is low: the farther apart two proteins are localized, the lower the possibility of protein-protein interaction. But how should the distance between two proteins be measured? The Minimum Number of Transport Processes (hereinafter MNTP) is used here. MNTP can be viewed as the number of membrane crossings needed for the two proteins to reach the same subcellular localization zone: the smaller the MNTP, the less energy is needed for the proteins to cross membranes, and thus the greater the possibility of interaction. According to the structural composition of the animal cell, 16 major subcellular localization zones are selected. When MNTP = k, the corresponding P value is

\[
P_{MNTP=k} = P(MNTP = k \mid PP \notin I) \approx 1 - P(MNTP \ge k \mid PP \notin I)
= 1 - \sum_{i \ge k} \frac{N_{MNTP=i}(N_{MNTP=i}-1)/2 - t_p N_{MNTP=i}/2}{N_a(N_a-1)/2 - t_p N_a/2}
\approx 1 - \sum_{i \ge k} \frac{N_{MNTP=i}}{N_a}
\]
This P value is the probability that two proteins with no interaction (PP ∉ I) reach the same subcellular localization zone by crossing membranes k times. In the formula, I denotes the whole protein interaction set, N_MNTP=k is the number of proteins that come together after crossing membranes k times, N_a is the total number of proteins in the rat, and t_p is an estimate of the average number of protein-protein interactions per protein. In particular, when two proteins share the same subcellular localization zone (MNTP = 0), the formula above gives P_MNTP=0 = 0; taking into account that the existing subcellular localization information is imprecise, and following the general practice for setting P values, P_MNTP=0 = 0.05 is used in this paper. When the MNTP cannot be determined, the P value is conservatively set to 1. The annotation file "GPL85.annot" of the GPL85 platform is downloaded from NCBI's GEO database; it contains all the known protein subcellular localization information for the Affymetrix RG-U34A chip. The MNTP values of the 730 target molecules and the corresponding P values are computed as shown in Table 1. In an actual cell the MNTP between any two proteins is less than 5, so the maximum MNTP in Table 1 is 4.

Table 1. Minimum Number of Transport Processes

  MNTP      0      1      2      3      4      not sure
  N_MNTP    299    130    160    90     34     -
  P_MNTP    0.05   0.41   0.59   0.81   0.93   1
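The final approximation above, together with the counts in Table 1, approximately reproduces the listed P values, as the short check below shows. Setting N_a to the sum of the tabulated counts is an assumption of this illustration; the paper's N_a may differ slightly.

```python
# Counts N_{MNTP=i} from Table 1, for i = 0..4.
counts = [299, 130, 160, 90, 34]
n_a = sum(counts)  # assumption: N_a approximated by the total of the localized proteins

for k in range(len(counts)):
    p = 1 - sum(counts[k:]) / n_a
    if k == 0:
        p = 0.05  # convention used in the paper for MNTP = 0
    print(f"MNTP={k}: P={p:.2f}")   # close to 0.05, 0.41, 0.59, 0.81, 0.93
```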
Orthologous comparison: orthologous sequences in different species are considered to have similar or even identical functions and similar regulatory pathways, and often to play the same role.
Furthermore, the majority of core biological functions are carried by a significant number of orthologous genes[11]. Based on gene or protein sequence similarities, evolutionary trees have been drawn; rat and mouse are very close in the evolutionary tree, that is, there is high orthology between their genomes, proteomes and even interactomes. For molecules A_Rat and B_Rat in the rat cell, let A_Mouse and B_Mouse be their orthologous molecules in the mouse cell; if A_Mouse interacts with B_Mouse, then an interaction between A_Rat and B_Rat is very likely, and the probability P value is

\[
P_{orth} = P(MM_{Mouse} \in I_{Mouse} \mid MM_{Rat} \notin I_{Rat})
= \frac{N_{Mouse}}{N_a(N_a-1)/2 - t_p N_a/2} \approx \frac{N_{Mouse}}{N_a^2}
\]
P_orth is the probability that the orthologous molecule pair MM_Mouse interacts in the mouse cell, under the premise that there is no interaction (MM_Rat ∉ I_Rat) between the corresponding molecules in the rat. I_Rat and I_Mouse denote the entire interactomes of rat and mouse, N_a is the total number of genes in the rat, N_Mouse is the number of pairs MM_Mouse found in I_Mouse, and t_p has the same meaning as before. The number of mouse PDIs downloaded from the TRED database is 896, and the number of mouse PPIs retrieved with APID2NET is 5,542; together these PDIs and PPIs form the mouse molecular interaction network. The 266,085 rat molecule pairs to be predicted are matched against the known interacting molecules in the mouse. For 176 pairs, orthologous interacting molecules are found in the mouse, and their P_orth is set to 0.05; for the remaining 265,909 pairs, P_orth is set to 0.95.

Evidence integration: to facilitate data integration, the results of the five kinds of evidence need to be transformed into the same form. Since the P value is the most common and most interpretable statistical indicator, and the results of evidences 4 and 5 are already expressed as P values, the results of evidences 1, 2 and 3 are also transformed into P values. The Gaussian kernel density transformation is used for this purpose; the results of the transform are shown in Figure 1.
Fig. 1. The results of evidences 1, 2 and 3 obtained by the Gaussian kernel density transform. (a) The horizontal axis is the absolute value of the Pearson correlation coefficient |pcc|; the vertical axis is the transformed Gaussian kernel density gkd; the height of each gray column represents the frequency of |pcc| values in the corresponding interval of the horizontal axis (the higher the column, the greater the frequency); the red line is the transform function gkd = G(|pcc|) obtained by Gaussian kernel density estimation. (b) The horizontal axis is the score calculated by the SVM. (c) The horizontal axis is the relative probability ratio of interaction between molecules under the gene-coding-sequence similarity evidence; the actual range of the ratio is [0, 265053].
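The text does not fully specify how the density gkd = G(·) is turned into a P value, so the sketch below is only one plausible reading of the transform, using SciPy's Gaussian kernel density estimator and an upper-tail probability; the scores are simulated and purely illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical |pcc| scores of all candidate pairs (evidence 1).
scores = np.clip(np.abs(np.random.randn(10000)) * 0.3, 0, 1)

kde = gaussian_kde(scores)   # gkd = G(|pcc|): smoothed density of the evidence scores

def p_value(s, grid=np.linspace(0, 1, 501)):
    """Upper-tail probability of observing a score >= s under the fitted density."""
    dens = kde(grid)
    mass = dens / dens.sum()
    return mass[grid >= s].sum()

print(p_value(0.9))   # a strong correlation maps to a small P value
```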
In this paper, the transformation of the first three kinds of evidence into P values is carried out with the data integration software Pointillist developed by ISB. The key to integrating a variety of evidences is to estimate the weight of each evidence. Because life is a complex nonlinear system, we integrate the evidences by re-introducing the machine learning method SVM.
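The paper does not spell out how the SVM is re-introduced for the integration step; one simple reading, sketched below, is to stack the five per-pair P values into a feature vector and train a classifier on the known positive and negative pairs. scikit-learn and the simulated training data are used purely for illustration and are not part of the original method.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: each row holds the five evidence P values of one molecule pair.
rng = np.random.default_rng(0)
X_pos = rng.uniform(0.0, 0.3, size=(331, 5))    # 331 known interacting pairs
X_neg = rng.uniform(0.2, 1.0, size=(3310, 5))   # 3310 non-interacting pairs
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(331), np.zeros(3310)])

clf = SVC(probability=True).fit(X, y)
# Integrated score for a new candidate pair described by its five evidence P values:
print(clf.predict_proba([[0.04, 0.10, 0.30, 0.05, 0.95]])[0, 1])
```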
3 Results
The P values of the five kinds of evidence are integrated using the SVM, and the final P_Integrated is obtained with the Gaussian kernel density transform. With different threshold values the model predicts different numbers of molecular interactions: the larger the threshold, the more interactions are predicted and the higher the sensitivity of the model, but at the same time the number of false positives increases accordingly, lowering the specificity. In this study, 0.01 is conservatively selected as the threshold: when P_Integrated < 0.01 the candidate pair is judged to interact, and when P_Integrated >= 0.01 it is judged not to interact. At this significance level, 2,659 pairs of molecules are predicted to interact by the data integration method, forming the molecular interaction network N2659; the topology of N2659 is drawn with Cytoscape, as shown in Figure 2.
Fig. 2. The molecular interaction network N2659 predicted in rat cells at the threshold value of 0.01. Each node represents a molecule (the gene symbol is marked to the right of the node); each green edge represents a molecular interaction. The network contains 2,659 predicted interactions.
Fig. 3. Sub-network formed by the 30 true positive samples. Each hexagonal node represents a molecule and each edge represents a molecular interaction; the figure on an edge is the number of individual evidences that identify the interaction at the significance level P < 0.01.
As noted above, previous experiments and studies have confirmed interactions for 406 of the 266,085 candidate pairs. In the predicted network N2659 the number of true positive samples is 30 (the corresponding sub-network is shown in Figure 3); that is, on average one in every 88 predicted samples is a true molecular interaction, whereas if samples were drawn at random from the 266,085 candidates, only one in every 655 would be a true interaction, a difference of nearly 10 times. Therefore, at the significance level P_Integrated < 0.01, the sensitivity is
0.0738 and the specificity is 0.9904. There remains a gap between this prediction accuracy and the current state of the art.
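These two figures follow almost directly from the counts given above, as the short check below shows; the numbers are taken from the text, and the small discrepancy in specificity presumably comes from a slightly different choice of denominator.

```python
total_pairs = 266085   # candidate pairs
known_pos   = 406      # pairs with experimentally confirmed interactions
predicted   = 2659     # pairs predicted at P_Integrated < 0.01
true_pos    = 30       # confirmed interactions among the predictions

sensitivity = true_pos / known_pos   # ~0.074, matching the reported 0.0738
false_pos   = predicted - true_pos
specificity = (total_pairs - known_pos - false_pos) / (total_pairs - known_pos)  # ~0.990
print(round(sensitivity, 4), round(specificity, 4))
```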
4 Conclusion
Data integration is an increasingly popular research method in the field of systems biology. Based on a data integration method, this paper predicts the molecular interaction network related to ventricular hypertrophy by integrating five different types of evidence. Compared with models based on a single evidence, the data integration method has the advantage that data can be selected broadly according to the characteristics of the object: these data can be high-throughput experimental data with considerable noise, information stored in known databases, or even previous explorations of the same problem. Heterogeneous data can compensate for each other's deficiencies and avoid monotony and one-sidedness. This paper is therefore not only an exploration of the data integration method but also a mining of new molecular interactions within the rat cell.
References
1. Barrett, T., Troup, D.B., Wilhite, S.E.: NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 37(Database issue), 885–890 (2009)
2. Lu, L.J., Xia, Y., Paccanaro, A., et al.: Accessing the limits of genomic data integration for predicting protein networks. Genome Res. 15, 945–953 (2005)
3. Hernandez-Toro, J., Prieto, C., Delas, R.J.: APID2NET: unified interactome graphic analyzer. BMC Bioinformatics 23(18), 2495–2497 (2007)
4. Jiang, C., Xuan, Z., Zhao, F., et al.: TRED: a transcriptional regulatory element database, new entries and other development. Nucleic Acids Res. 35(Database issue), 137–140 (2007)
5. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001), software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
6. Jansen, R., Bussemaker, H.J., Gerstein, M.: Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models. Nucleic Acids Res. 31, 2242–2251 (2003)
7. Daubin, V., Perriere, G.: G+C3 structuring along the genome: a common feature in prokaryotes. Molecular Biology and Evolution 20, 471–483 (2003)
8. Fraser, H.B., Hirsh, A.E., Wall, D.P., et al.: Coevolution of gene expression among interacting proteins. PNAS 101, 9033–9038 (2004)
9. Hubbard, T.J., Aken, B.L., Ayling, S., et al.: Ensembl 2009. Nucleic Acids Res. 37(Database issue), 690–697 (2009)
10. Nakamura, Y., Gojobori, T., Ikemura, T.: Codon usage tabulated from international DNA sequence databases: status for the year 2000. Nucleic Acids Res. 28(1), 292 (2000)
11. Tatusov, R.L., Galperin, M.Y., Natale, D.A., et al.: The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28(1), 33–36 (2000)
The Study of Rats’ Active Avoidance Behavior by the Cluster Analysis Otar Tavdishvili2, Nino Archvadze1, Sulkhan Tsagareli1, Anna Stamateli1, and Marika Gvajaia1 1
Iv. Javakhishvili Tbilisi State University, Faculty of Exact and Natural Sciences, University str.13, 0143, Tbilisi, Georgia
[email protected] 2 Institute of Cybernetics, S.Euli str. 5, 0186, Tbilisi, Georgia
Abstract. Unsupervised cluster analysis is proposed for the study of active avoidance formation in three groups of albino rats: (a) intact, (b) neocortex-lesioned and (c) dorsal-hippocampus-lesioned. The term 'behavior vector' has been introduced to quantitatively assess the behavior of rats while learning. The proposed approach enables the assessment of active avoidance behavior in rats simultaneously by all the tested parameters and the classification of the animals into groups by their behavioral resemblance through the learning process. Keywords: unsupervised clustering, defensive behavior, learning, hippocampus.
1 Introduction
A wide range of mathematical methods has been proposed for the assessment of the cognitive mechanisms involved in adaptive learning, repeated decision tasks, reinforcement and strategic changes (3, 4, 5, 7). Clustering methods, with different approaches and different focuses, have been used in studies on learning, memory and behavior (1, 2, 8, 9). Cluster analysis may be applied when assessing the learning abilities of white rats under a multi-parametric description of behavior. Such an approach enables researchers to describe and quantitatively assess behavioral processes during learning and to reveal the resemblance between the animals' behavioral conformities. The animals' learning abilities, assessed by the acquisition of active avoidance, were found to vary within the test groups. Some animals were not able to meet the learning criteria; consequently, there should exist different groups of animals with different behavior, i.e. groups into which animals with resembling behavior should fall. An unsupervised clustering (automatic classification) algorithm, based on the Parzen statistical estimation of the probability density function, was used to partition the rats in accordance with their behavioral similarities (10, 11). The term "behavior vector" is introduced for the multi-parameter description of behavior in the learning process. The components of the vector are the behavioral parameters measured for each animal, which took different numerical values during the experiment.
2 Subjects
Three different groups of albino rats of both sexes (with an average body weight of 150 g), comprising 31 subjects in total, were examined. The animals were numbered before the experiment and divided into three groups designated as Group INTACT (intact; No. 1-13; n=13), Group NCC (No. 41-49; n=9) and Group DHPC (No. 32-40; n=9).
3 Procedure
The scheme of the research was designed as described by Tsagareli and Djgarkava (12). The experiment lasted 20 days with 10 trials per day. In each trial the avoidance was signaled by a single light stimulus presented for 10 sec. The subjects could avoid the painful foot-shock by jumping onto the shelves; if they did not, after 10 sec on the background of the conditioned stimulus the foot-shock current (25 mv) was delivered for 5 sec through the grid. The rats could escape the shock by jumping up onto the nearest shelf and staying there for 3 sec, after which they were forced to return to the floor. The inter-trial period was scheduled by a special program (Monte Carlo method). During the inter-trial intervals the rats could spontaneously jump up onto the shelves for only 3 sec; after these three seconds the experimenter lowered the shelf and forced the animal to return to the floor. Three behavioral parameters were used to evaluate active avoidance conformities in albino rats: (a) light-induced avoidance response; (b) painful foot-shock-induced escape behavior; and (c) spontaneous activity (jumping onto the shelf) during the inter-trial intervals. Each experimental parameter was assessed quantitatively: the avoidance and escape behavior were measured as frequencies, and the inter-trial activity was measured as the number of spontaneous jumps onto the shelves. Active avoidance learning was characterized by a general analysis of the values encompassing all three parameters.
4 Surgery
All surgical procedures were performed under aseptic conditions. The rats were anesthetized with sodium pentobarbital (Nembutal, 55 mg/kg, i.p.) and placed in a stereotaxic instrument. An incision was made in the skin covering the skull and the skull was leveled. Each animal randomly received electrolytic lesions of either the dorsal hippocampus or the neocortex overlying the dorsal hippocampus, made by passing a rectified current of 1.2 mA for 15 sec through a stainless-steel electrode (0.2 mm in diameter) uninsulated at the tip (approx. 0.5 mm). The lesion coordinates were identified on the basis of the rat brain stereotaxic atlas (6). Before testing, each animal was given a 7-day recovery period.
5 Histology
After completion of behavioral testing, the rats with lesions were sacrificed with an overdose of pentobarbital (100 mg/kg, i.p.) and perfused transcardially with 0.9% NS followed by 10% formal saline. The brains were removed and stored in 10% formal saline. The brains of all operated rats were cut into 30 µm-thick horizontal sections. Verification included estimation of the extent of the hippocampal and neocortical lesions.
6 Experimental Data Analysis
A wide range of statistical methods can be applied to the analysis of any behavioral parameter, but besides the traditional statistical methods, this paper proposes cluster analysis for the assessment of neuroethological data.

6.1 Statistical Analysis
To analyze the higher-order interactive effects of multiple categorical independent variables (factors), and to test for significant effects of the lesion, the behavioral data were analyzed using factorial analysis of variance (ANOVA), with lesion and daily behavioral session as grouping factors. When significant main effects were found, additional analysis was performed using post hoc comparisons (LSD test). Differences were considered statistically significant at P < 0.05.

6.2 Cluster Analysis
For the multi-parameter description of behavior in the learning process, each animal was described by a 'behavior vector'. The components of the vector are the behavioral parameters measured for each animal, which took different numerical values during the experiment. The unsupervised clustering algorithm based on the Parzen statistical estimation of the probability density function addresses the case in which both the probability density of the data set and the number of classes are unknown in advance; this makes it possible to classify the rats' behavior by their ability to acquire active avoidance. The algorithm consists of two main phases: (a) cluster analysis in the feature space of the data set, from which the number of clusters, their centers and the radii of homogeneity are defined; (b) partitioning of the initial data set into disjoint homogeneous subsets (classes) on the basis of the clusters obtained in phase (a). For the quantitative estimation of the similarity between the behavior vectors (the elements of the initial set) representing the animals' behavior, the Euclidean metric is introduced. Their pairwise comparison yields a sequence of non-negative numbers {ξ_q}, q = 1,...,n. This sequence is regarded as a sample from a parent population with some theoretical density function φ(x). For clustering the sequence {ξ_q}, q = 1,...,n, the theoretical density function is estimated by the Parzen statistical estimator φ̂_n(x):
\[
\hat{\varphi}_n(x) = \frac{1}{n\,h}\sum_{q=1}^{n} K\!\left(\frac{x-\xi_q}{h}\right), \qquad (1)
\]
where h = h(n) > 0, h(n) → 0 and n·h(n) → ∞ as n → ∞, and K(x) is a Borel function integrable with respect to the Lebesgue measure. The function φ̂_n(x) allows the points of local extrema of the statistical estimate to be determined: in particular the points of local maxima (modes) M_i, i = 1,...,p, where M_1 < M_2 < ... < M_p, and the points of local minima m_i, i = 1,...,p+1, where m_1 < m_2 < ... < m_{p+1}. They define, respectively, the centres of the clusters and the radii of sameness R_i, i = 1,...,p; moreover m_i ≤ M_i < m_{i+1}.

The criterion that defines the division of the sequence {ξ_q}, q = 1,...,n into isolated clusters (groups) is the following:

\[
\left|\xi_q - M_i\right| \le R_i, \qquad i = 1,\dots,p,\; q = 1,\dots,n. \qquad (2)
\]

As a result of clustering the sequence {ξ_q}, q = 1,...,n, the clusters K_i, i = 1,...,p are extracted. In the clusters obtained in the feature space of the initial data set, the characteristic parameters (features) of behavior are distributed into groups according to their level of similarity: the most similar characteristic parameters form the first cluster K_1, the parameters that are less similar than those in the first cluster form the second cluster K_2, and so on; the characteristic parameters most distant with respect to similarity form the last cluster K_p.

At the second stage of the classification, the clusters obtained at the first stage are used to divide the initial set of behavior vectors into classes. The procedure starts from cluster K_1, which has the minimum mode M_1: the union of all pairs of behavior vectors corresponding to its elements forms class G_1, and the value of the class label, i.e. 1, indicates the cluster's current number. We then pass to cluster K_2 and again find all corresponding pairs of behavior vectors; the elements that have already formed class G_1 do not take part in the extraction of this class, and as a result we obtain class G_2, and so on. The procedure continues until all elements of the behavior-vector set have been included in one of the classes. The elements of the initial data set that are not included in any class are called isolated, or non-classified, elements. Thus, as a result of the classification we obtain classes in which the elements of the behavior-vector set are arranged into groups according to the magnitude of the Euclidean distances between them.
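A compact sketch of the first (mode-finding) phase described above is given below, assuming the behavior vectors are rows of a NumPy array. The kernel, bandwidth and evaluation grid are illustrative choices, not those of the original study.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
behavior = rng.random((31, 3))          # 31 rats x 3 behavioral parameters (illustrative data)

# Pairwise Euclidean distances xi_q between behavior vectors.
xi = np.array([np.linalg.norm(behavior[a] - behavior[b])
               for a, b in combinations(range(len(behavior)), 2)])

# Parzen estimate of the density of the distances, an instance of Eq. (1) with a Gaussian K.
h = 0.05
grid = np.linspace(xi.min(), xi.max(), 400)
phi = np.array([np.exp(-0.5 * ((g - xi) / h) ** 2).sum() / (len(xi) * h) for g in grid])

# Local maxima (cluster centres M_i) and local minima (bounding the radii of sameness R_i).
maxima = [i for i in range(1, len(grid) - 1) if phi[i] > phi[i - 1] and phi[i] > phi[i + 1]]
minima = [i for i in range(1, len(grid) - 1) if phi[i] < phi[i - 1] and phi[i] < phi[i + 1]]
print("modes:", grid[maxima], "valleys:", grid[minima])
```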
7 Results
The significance of the differences between the studied populations was assessed by factorial analysis of variance, with lesion and daily behavioral session as grouping factors. When significant main effects were found, additional analysis was performed using post hoc comparisons (LSD test), revealing daily differences in escape, avoidance and spontaneous behavior between the studied groups. The data, reported as the mean values ± S.E.M. of the three behavioral parameters for all three populations, are presented in Fig. 1; the curves demonstrate the daily dynamics of the behavioral parameters during the whole experiment (F(2, 114) = 11.16, p < .05). At the early stages of the experiment the DHPC rats, in contrast to the intact or NCC rats, are not able to perform escape behavior. The escape response rate increases rapidly, reaching the criterion level on day 5; this is not observed in the DHPC rats. The DHPC rats perform progressively increasing avoidance responses from the 5th day, but these remain lower than those of the intact and NCC animals (Fig. 1). The intact and NCC rats reach the learning criterion level (accepted to be within the rate range of 0.9 to 1) by day 13, whereas the DHPC animals reach it on day 15. When assessing inter-trial spontaneous behavior, the pattern of dynamics was similar for all the studied groups: low at the beginning of the experiment, growing during the next few days and then decreasing. The NCC rats, however, were found to be more active than the intact and DHPC rats, the DHPC rats being the least active.
(Fig. 1A: frequency of responses (Mean ± SEM) plotted against days 1-19 for the INTACT, DHPC and NCC groups.)
Fig. 1. Dynamics of self-defensive behavior during active avoidance memorization: A: escape responses to the painful foot-shock; B: avoidance responses to the conditional (light) stimulus; C: inter-trial spontaneous activity. Escape frequency scores differed significantly between the Intact-DHPC and NCC-DHPC groups at the initial stage of the test (A), whereas avoidance response scores differed significantly throughout the whole experiment, with lower rates in DHPC rats (B). The Mean ± S.E.M. estimation of inter-trial spontaneous activity indicates statistically significant differences among all studied groups; inter-trial behavior differs statistically in all groups at the significance level p < .05.
(Fig. 1B: frequency of avoidance responses (Mean ± SEM) plotted against days 1-19; Fig. 1C: inter-trial spontaneous activity (Mean ± SEM) plotted against days 1-19; both for the INTACT, DHPC and NCC groups.)
Fig. 1. (continued)
It is obvious that the rats promptly escape from painful foot-shock stimulus, but acquisition of avoidance responses is comparatively slow. Throughout the experiment, elaboration of optimal self-defensive behavioral algorithm takes place in the experimental animals: the rats learn that staying on shelves helps them to avoid footshock stress and, consequently, the frequencies of avoidance responses increase. The variation in dynamics of inter-trial spontaneous behavior shows that the rate of spontaneous jumping onto the shelves still remains at a rather low level until the animals infer that staying on shelves is a self-defense behavior. The avoidance responses correlate with spontaneous activity, causing an increase of the latter (beginning from the 5th day and lasting up to the 13th day). On the 14th day, after the learning criteria level has been reached in all three studied populations, the inter-trial spontaneous activity begins to decrease and such dynamics are maintained up until the end of the experiment.
The analysis of the obtained data revealed that the learning-based adaptation of the animals to the aversive conditions differed between the populations. However, differences between group means do not fully reflect the behavioral pattern of each individual: the populations include animals with either high or low learning abilities, and it cannot be excluded that representatives of different populations have similar learning skills. It therefore seems important to apply an approach aimed at grouping the animals by their behavioral resemblance. All of the behavioral activities (escape, avoidance and inter-trial spontaneous activity) serve the main strategy of minimizing the painful stress of the experimental treatment. Consequently, in order to assess the roles of the studied structures (dorsal hippocampus and neocortex) in the learning process, the simultaneous treatment of all three behavioral parameters and the identification of a common behavioral parameter are recommended, and special attention should be paid to individual behavioral conformities when grouping animals by their behavioral similarities. It has now become possible to conduct comparative neuroethological analysis of different populations using mathematical methods and to reveal either behavioral differences or resemblances between the studied groups. For classifying the animals into groups by the degree of behavioral similarity throughout the learning process, a multi-parameter cluster analysis was used. The cluster analysis of the experimental data involved all three groups: intact, NCC and DHPC rats. The method enabled us to assess active avoidance conformities in the studied groups by an analysis over all parameters simultaneously. Consequently, all studied rats from the different populations were grouped according to their behavioral similarities, defining classes that included animals with similar learning abilities. Using the cluster analysis, the class distribution of the 31 rats from the different test groups was obtained and the relative frequencies of the homogeneous classes were assessed. The relative frequency of appearance of class 1 (0.98) differed significantly from the others; the relative frequencies of classes 2 and 3 were significantly lower (0.28 and 0.41, respectively), although they exceeded those of the remaining classes (0.01-0.16), which were not included in the final analysis because of their extremely low rates. The first class contained the rats with the most similar behavioral patterns during active avoidance acquisition, and every subsequent class exhibited less similarity to it. We identified the groups with a prevalence of class-1 animals and assessed them in percentages: 24% of intact, 35% of NCC and 26% of DHPC rats were not included in class 1. Therefore, in order to quantitatively assess individual learning abilities, and taking into account that the large majority of the animals belonged to class 1, the rats' frequency of appearance in class 1 (frequency range 0-1) was conditionally divided into four frequency intervals (0.90-1; 0.80-0.89; 0.70-0.79; 0.60-0.69). For each rat, the appearance frequency was calculated by dividing the number of its appearances in class 1 by the number of days. This enabled us to identify mixed groups, each of which contained animals with different active avoidance learning abilities over the 20-day experiment.
The frequency interval 0.9-1 included only the rats that were most successful in their active avoidance behavior, the superior learners (3 intact rats). The second group, good learners (0.8-0.89), included the animals that performed the behavioral test well (1 intact, 3 NCC and 2 DHPC rats); however, they were
less successful than the animals of Group 1. Medium learners (8 intact, 5 NCC and 7 DHPC rats) fell within the interval 0.7-0.79. The fourth interval, inferior learners (0.6-0.69), contained only 1 intact and 1 NCC rat; no DHPC rats met the criteria stipulated for that group (Table 1). Of all the studied populations, 9.67% of the intact animals were the most successful at active avoidance behavior; no animals in the DHPC or NCC groups achieved such a level. 3.22% of intact, 9.67% of NCC and 6.45% of DHPC rats were found to be good learners. Lower learning ability was revealed in 25.8% of intact, 12.9% of NCC and 6.45% of DHPC rats.

Table 1. Distribution of the animals (Class 1) with different learning abilities over the divided frequency intervals

  Groups   0.9-1      0.8-0.89     0.7-0.79                      0.6-0.69
  Intact   2, 7, 11   8            1, 3, 4, 5, 6, 10, 12, 13     9
  NCC      49         41, 42, 43   45, 46, 47, 48                44
  DHPC     0          32, 38       33, 34, 35, 36, 37, 39, 40    0
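A minimal sketch of the bookkeeping behind Table 1, assuming a boolean matrix recording whether each rat fell into class 1 on each of the 20 experimental days (the matrix below is random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
in_class1 = rng.random((31, 20)) < 0.8               # rat x day membership in class 1 (illustrative)

freq = in_class1.sum(axis=1) / in_class1.shape[1]    # appearance frequency per rat (0-1)

bins = {"0.9-1": [], "0.8-0.89": [], "0.7-0.79": [], "0.6-0.69": []}
for rat_id, f in enumerate(freq, start=1):
    if f >= 0.9:
        bins["0.9-1"].append(rat_id)
    elif f >= 0.8:
        bins["0.8-0.89"].append(rat_id)
    elif f >= 0.7:
        bins["0.7-0.79"].append(rat_id)
    elif f >= 0.6:
        bins["0.6-0.69"].append(rat_id)

print(bins)   # rats grouped into the four frequency intervals of Table 1
```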
We have demonstrated that the proposed approach is capable of revealing similarities in neuroethological studies through multi-factorial behavioral assessment. The approach enables the assessment of active avoidance behavior in rats by an overall analysis of three or more parameters, and it allows all studied rats from different populations to be grouped by their behavioral similarities. Besides, the proposed method is convenient for assessing the learning capacities of animals, and it lays the groundwork for obtaining additional information and defining the correlation between learning skills and other neuroethological and neurobiological parameters.
References
1. Balslev, D., Finn, Å.N., Frutiger, S.A., Sidtis, J.J., Christiansen, T.B., Svarer, C., Strother, S.C., Rottenberg, D.A., Hansen, L.K., Paulson, O.B., Law, I.: Cluster Analysis of Activity-time Series in Motor Learning. Human Brain Mapping 15(3), 135–198 (2002)
2. Cohen, H., Zohar, J., Matar, M.A., Kaplan, Z., Geva, A.B.: Unsupervised Fuzzy Clustering Analysis Supports Behavioral Cutoff Criteria in an Animal Model of Posttraumatic Stress Disorder. Biological Psychiatry 58(8), 640–650 (2005); doi: 10.1007/s00422-007-0209-6
3. Hausken, K., Moxnes, J.F.: Behaviorist Stochastic Modeling of Instrumental Learning. Behavioural Processes 56, 121–129 (2001)
4. Ito, S., Yuasa, H., Luo, Z., Ito, M., Yanagihara, D.: A Mathematical Model of Adaptive Behavior in Quadruped Locomotion. Biol. Cybern. 78, 337–347 (1998)
5. Paulus, M.P., Geyer, M.A.: Quantitative Assessment of the Microstructure of Rat Behavior: If(d), The Extension of the Scaling Hypothesis. Psychopharmacology 113(2), 177–186 (2005)
6. Paxinos, G., Watson, C.: The Rat Brain in Stereotaxic Coordinates, 3rd edn. Academic Press, San Diego (1997)
7. Rapp, P.E.: Quantitative Characterization of Animal Behavior Following Blast Exposure. Cogn. Neurodyn. 1(4), 287–293 (2007)
8. Speakman, J.R., Bullock, D.J.: A Problem Defining Temporal Pattern in Animal Behaviour: Clustering in the Emergence Behaviour of. Animal Behaviour 43(3), 491–500 (1992)
9. Stevens, M.C., Fein, D.A., Dunn, M., Allen, D.D., Waterhouse, L.H., Feinstein, C.M.D., Rapin, I.M.D.: Subgroups of Children With Autism by Cluster Analysis: A Longitudinal Examination. Journal of the American Academy of Child & Adolescent Psychiatry 39(3), 346–352 (2002)
10. Tavdishvili, O.: Automatic Classification Algorithm for Observable Data set. Proceedings of the Institute of Cybernetics, Georgian AS 3(1-2), 136–141 (2004)
11. Tavdishvili, O., Sulaberidze, T.: Segmentation Method of 3D Segments Extraction on the Scene Image. In: Blackledge, J.M., Turner, M.J. (eds.) Image Processing III: Mathematical Methods, Algorithms and Applications, pp. 82–88. Horwood Publishing, Chichester (2001)
12. Tsagareli, S.N., Djgarkava, N.N.: Assessment of Food-obtaining and Avoidance Behavior in Albino rats (in Georgian). In: Biology today. Collected works, pp. 166–177. Tbilisi State University, Tbilisi (2002)
MS Based Nonlinear Methods for Gastric Cancer Early Detection
Jun Meng1,*, Xiangyin Liu1, Fuming Qiu2, and Jian Huang2
1 School of Electrical Engineering, Zhejiang University, Hangzhou, China, 310027
2 Cancer Institute, The Second Hospital Affiliated to Medical College of Zhejiang University, Hangzhou, China 310009
[email protected]
Abstract. The mortality rate of gastric cancer (GC) ranks 2nd among all types of cancer. The earlier it is diagnosed, the better the curative effect becomes. As a powerful analysis technique, SELDI-TOF offers a new approach to gastric mass spectrometry (GMS) based early detection of GC. This article develops a set of nonlinear approaches for GMS to differentiate normal persons from GC sufferers: an adapted box-dimension calculation method and a clustering-based data mining method. Compared with other popular SELDI-TOF processing techniques, such as SVM, neural networks and RPS, these methods retain their individual particularities and good performance in nonlinear problem analysis, especially after adaptation of their working mechanisms, so credible outcomes can be expected. Keywords: Mass Spectrum, Gastric Cancer, Nonlinear methods, Data mining, Fractal dimension.
1 Introduction
1.1 Why GMS for GC?
Gastric cancer (GC) is one of the common malignant tumours threatening people's health (see Figure 1). The incidence of GC ranks in the top 3 among all types of cancer, and its mortality rate is the 2nd worst [1][10]. Its curative effect depends heavily on early detection, whose essence is a feature-orientation and pattern identification problem. From the pattern identification perspective, mass spectrometry (MS) is generally used to diagnose diseases, for instance through cancer biomarkers. Consequently, SELDI-TOF (surface-enhanced laser desorption/ionization time-of-flight) [3][11][12][13] offers a new approach, and many results have shown that this technique performs fairly well in analyzing the characteristics of GMS.
Fig. 1. Ulcer-typed Gastric Cancer
Fig. 2. The Nonlinearity of GMS
1.2 Why Nonlinear Methods?
From the nonstationarity, non-equilibrium, disorder and inconsistency of the nonlinearity shown by the time-MS plot (see Fig. 2), we can easily see that an MS series is a thoroughly nonlinear problem, whether it belongs to a gastric cancer sufferer (red curve) or to a normal person (green curve). Moreover, nonlinear methods capture well the intrinsic dynamics of the structures of real-world systems, and many of them give striking explanations of diverse phenomena, for example box dimension calculation, neural networks, SVM, the reconstructed phase space method (RPS), and quite a few other data mining techniques. In fact, the non-integer result of the box dimension calculation (see Sect. 3.1) confirms the chaotic nature of the series, and this gives us good reason to adopt the nonlinear methods listed above for this specific cancer diagnosis problem [2].
2 Adopted Approaches
According to the intrinsic properties of the time-MS series and our current knowledge of the features and performance of popular nonlinear methods, we adopt the following approaches for the normal-versus-cancer GMS differentiation: (1) Adapted box dimension calculation (BDC): based on the results given by this method, the two types of characteristics can be separated provided the original time-MS series are carefully preprocessed; the algorithm is simple and fast, but has some hidden problems. (2) Clustering method from data mining: based on the reliable experience of field experts and close observation of the given series, we abstract a number of featured descriptive items and obtain their corresponding indices; by clustering them with the Two-Step method, an ideal differentiation is obtained.
3 Adapted Box Dimension Analysis Method
3.1 Primitive Box Dimension Calculation
Fig. 3 illustrates the inability of the primitive box dimension calculation to separate the original time-MS series. A possible account for the poor differentiation is that the given series are far too long to process: on the one hand, applying the box dimension approach is very time-consuming; on the other hand, the series are, generally
speaking, quite similar, and descriptive features such as the maximum and minimum peak-to-peak values and their positions are smoothed away during the calculation, so a great deal of the effective information is lost.
Fig. 3. Primitive Box Dimension Calculation
Fig. 4. Adapted Box Dimension Calculation
3.2 Adapted Method: Using Updated Compressed Data Set
Considering the problems listed above, and noting that almost the entire last half of the series curves look alike, we apply two tactics to the original data set: (1) take the first 50,000 points, sample them with an interval of 1.2, and form a temporary compressed set; (2) emphasize the distinctions of the series (what the blue curve represents) using a quasi-genetic copying method: the larger the distinction, the higher the proportion that data point occupies in the newly compressed set. After applying these techniques we calculate the box dimensions again, and this time satisfying results are obtained, as shown in Fig. 4, in which the green line represents the normal person's GMS characteristics and the red line the GC sufferer's. A noticeable merit of this distinction-based differentiation is that it implicitly eliminates the redundant energy level caused by laser interference, which traditional methods remove manually.

Hidden problems. However nice and clear the adapted box dimension calculation may seem, hidden problems remain: (1) although the algorithm is not very complicated, the data set is still huge to process; moreover, classifying a group of series with n samples in this pairwise, interactive style requires n*(n-1) calculations. (2) Fig. 4 shows the difference between the two types through the distinct separation of the lines; however, the choice of a standard, acceptable separation for a given type needs further research and discussion. For instance, Fig. 5 is a blurred example compared with Fig. 4: do the curves belong to the same group, or do they represent two types of people? This ambiguity drives us to find a new, more reliable method.
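For reference, the box-counting estimate that both the primitive and the adapted BDC rely on can be sketched as follows; the compressed series, the normalization and the box sizes are illustrative simplifications, not the exact settings used in this work.

```python
import numpy as np

rng = np.random.default_rng(3)
series = np.cumsum(rng.standard_normal(50000))                       # stand-in for a compressed time-MS series
series = (series - series.min()) / (series.max() - series.min())     # normalize to [0, 1]

def box_count_dimension(y, sizes=2 ** np.arange(2, 10)):
    """Estimate the box-counting dimension of the curve (i/len(y), y_i)."""
    x = np.linspace(0, 1, len(y))
    counts = []
    for n in sizes:                       # n x n grid of boxes of side 1/n
        boxes = set(zip((x * n).astype(int), (y * n).astype(int)))
        counts.append(len(boxes))
    # Slope of log(count) vs log(n) gives the dimension, since N(1/n) ~ n^D.
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return slope

print(box_count_dimension(series))        # a non-integer value indicates fractal behaviour
```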
Fig. 5. A Blurring Example of updated BDC
Fig. 6. Pre-classification’s Unreason
4 Series Data Mining Method [4],[5],[9]
4.1 Expert Judgement and Hypothesis
We choose several pre-classified sample series and test the validity of their given assortment with the above method. Hidden problems notwithstanding, the near superposition of the two lines shown in Fig. 6 still reveals that the classification is apparently unreasonable. Moreover, when scrutinizing the original curves of the sample pool, and taking the field expert's diagnostic experience into account, the classifications of the subplots marked with the doctor's icons should be suspected (Fig. 7).
Fig. 7. Pre-classification And Suspicion
4.2 Characteristic Descriptive Data Index Abstraction and Clustering
When we observe the characteristics of the sample pool's curves and assume that most of the original classifications are valid, a general qualitative conclusion can be drawn: most of the cancer sufferers' curves have weak and vague second peaks, while the normal subjects' minor pulses are strong and clear (see Fig. 8).
Fig. 8. Two-Peak Observation
If we quantify the statement with proper statistical terms, then the following 9 descriptive indices could be drawn, namely, (1) mean of Pulse One; (2) standard variance of Pulse One; (3) covariance of Pulse One; (4) maximum of Pulse One; (5) near tail value of Pulse One; (6) mean of Pulse Two; (7) standard variance of Pulse Two;
(8) covariance of Pulse Two; (9) near-head value of Pulse Two [6]. These indices capture the central tendency and the overall shape rather than the particularity of any individual data entry, so the side effects of noise are largely avoided.
(Fig. 9 residue: TwoStep clustering output for cluster number 1; the variables tail1, cov1, std2, cov2, mean2, pre2, std1, mean1 and peak1 are ranked by -log10(probability) on a 0.0-2.5 scale, larger values being more significant; Bonferroni adjustment applied.)
Fig. 9. Clustering and the index significance
As illustrated by Fig. 9, after gathering the results of the above 9 indices and clustering them with the Two-Step method, an ideal classification result is obtained, and 7 of the 9 descriptive indices (all except the main pulse's mean and peak) prove to be quite decisive.
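A sketch of this feature-then-cluster step is given below: the nine indices are computed per spectrum and the resulting feature matrix is clustered. The SPSS Two-Step procedure itself is not available as a library call here, so a two-component Gaussian-mixture clustering is used as a stand-in; the split point between the two pulses, the simulated spectra, and the reading of "standard variance" as the standard deviation and "covariance" as the variance are all assumptions of this illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def nine_indices(spectrum, split):
    """Nine descriptive indices of a spectrum split into Pulse One / Pulse Two at `split`."""
    p1, p2 = spectrum[:split], spectrum[split:]
    return [p1.mean(), p1.std(), p1.var(), p1.max(), p1[-1],   # mean, std, var, max, near-tail of Pulse One
            p2.mean(), p2.std(), p2.var(), p2[0]]              # mean, std, var, near-head of Pulse Two

rng = np.random.default_rng(4)
spectra = rng.random((90, 2000))                               # 90 test spectra (illustrative data)
features = np.array([nine_indices(s, split=800) for s in spectra])

labels = GaussianMixture(n_components=2, random_state=0).fit_predict(features)
print(np.bincount(labels))                                     # sizes of the two resulting groups
```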
5 Validation Checking and Alternative Method Comparison
It is worth mentioning that the complicated nonlinearity of GMS, caused by the noise introduced by the medical apparatus, the variety of illness states, and the diversity of individuals (age, sex, location, dietary habits, etc.), unavoidably introduces errors compared with the pathology diagnosis. Nevertheless, compared with the popular methods listed below, the approaches illustrated above retain an advantage in the GC early detection problem, considering their intrinsic attributes and higher precision (see Fig. 10).
Fig. 10. Method Validation Comparison
Based on the validation results of 90 test GMS samples differentiated by the adapted BDC, simple clustering, feature clustering, point-level comparison, RPS, ANN and SVM, we can see that, generally speaking, the nonlinear methods are effective in depicting the substantial dynamics of a nonlinear system, but not every approach
is suitable for GC early detection: simple clustering fails to meet the diagnosis requirements because of the evident particularities of individuals; point-level comparison is prone to being affected by apparatus noise; RPS is too general to capture the specific character of GMS; and ANN and SVM are likely to introduce over-fitting and coordinate offset, and therefore need extra normalization as a preliminary step [7][14].
6 Conclusion
GMS-based GC early detection is a complicated systemic problem that needs synthesized and holistic consideration rather than rigid and easy conclusions drawn from simple data and algorithms [8]. By applying subtle techniques to classical fractal dimension calculation and to data mining, the adapted BDC and feature clustering approaches distinguish themselves among other nonlinear methods in at least the following three respects: (1) they perform better, as clearly testified by the quantitative comparison of diagnosis precision; (2) they need fewer processing steps compared with the redundant preliminary work required by the ANN- and SVM-based approaches; (3) they possess better tolerance: BDC better captures the holistic attributes of GMS, and the feature clustering method is less likely to be invalidated since it describes GMS with a few global indices. Considering that nonlinear methods have already been widely applied in natural science, and in view of the progress of current research in the GMS area, the adapted nonlinear methods, specifically the data mining techniques introduced in this article, prove to be potent. With further refinement of our current work, improved performance in gastric cancer early detection can be expected.
References
1. http://www.xyxy.net/jbzt/neike/xhke/wai/fzjc/200503151547231394.htm
2. Kantz, H., Schreiber, T.: Nonlinear Time Series Analysis. Tsinghua University Press, Beijing (June 2000)
3. Wulfkuhle, J.D., Liotta, L.A., Petricoin, E.F.: Proteomic Applications For The Early Detection of Cancer, vol. 3, p. 267. Nature Publishing Group, Macmillan (2003)
4. Giudici, P.: Applied Data Mining. Publishing House of Electronics Industry, Beijing (June 2004)
5. Yang, C., Meng, J.: Optimal fuzzy modeling based on minimum cluster volume. In: Li, X., Wang, S., Dong, Z.Y. (eds.) ADMA 2005. LNCS (LNAI), vol. 3584, pp. 232–239. Springer, Heidelberg (2005)
6. Emmert-Buck, M.R., et al.: Laser capture microdissection. Science 274, 998–1001 (1996)
7. Nørgaard, M., Ravn, O., Poulsen, N.K., Hansen, L.K.: Neural Networks for Modelling and Control of Dynamic Systems. Springer, Heidelberg (3rd printing, with corrections) (2003)
8. Banks, R.E., et al.: The potential use of laser capture microdissection to selectively obtain distinct populations of cells for proteomic analysis: preliminary findings. Electrophoresis 20, 689–700 (1999)
9. Yang, C., Meng, J., Zhu, S., Dai, M.: Chapter X: Model Free Data Mining. In: Data Mining and Knowledge Discovery Technologies. The Advances in Data Warehousing and Mining Book Series, vol. 2, pp. 224–253 (2008) ISBN 9781599049601
10. Rohan, T.E., Soskolne, C.L., Carroll, K.K., Kreiger, N.: The Canadian Study of Diet, Lifestyle, and Health: Design and characteristics of a new cohort study of cancer risk. Cancer Detection and Prevention 31(1), 12–17 (2007)
11. Ludwig, J.A., Weinstein, J.N.: Biomarkers in Cancer Staging, Prognosis and Treatment Selection. Nature Reviews Cancer 5, 845–856 (2005) (review)
12. Rifai, N., Gillette, M.A., Carr, S.A.: Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nature Biotechnology 24(8), 971–983 (2006)
13. Wu, Z.-Z., Wang, J.-G., Zhang, X.-L.: Diagnostic model of saliva protein finger print analysis of patients with gastric cancer. World Journal of Gastroenterology 15(7), 865–870 (2009)
14. Tang, K.-L., Li, T.-H., Xiong, W.-W., Chen, K.: Ovarian cancer classification based on dimensionality reduction for SELDI-TOF data. In: BMC Bioinformatics, pp. 1–8 (2010), http://www.biomedcentral.com/1471-2105/11/109
The SEM Statistical Mixture Model of Segmentation Algorithm of Brain Vessel Image Xingce Wang*, Feng Xu1, Mingquan Zhou1, Zhongke Wu1, and Xinyu Liu2 1
College of Information Science and Technology, Beijing Normal University, Beijing, China 2 Institute of computing technology, Chinese Academy of Science, Beijing, China
[email protected]
Abstract. Brain MRI images are processed with statistical analysis techniques, and the accuracy of the segmentation is improved by a random-assignment iteration. First, the MIP algorithm is applied to decrease the number of mixture components. Then a Gaussian mixture model is used to fit the stochastic distribution of the brain vessels and the surrounding brain tissue. Finally, the SEM algorithm is adopted to estimate the parameters of the Gaussian mixture model. The feasibility and validity of the model are verified by experiment. With this model, small branches of the brain vessels can be segmented, the convergence speed is improved and local minima are avoided. Keywords: Segmentation of brain image, MIP algorithm, EM algorithm, Mixture model, Parameter estimation.
* This work is supported by the State Key Program of National Natural Science of China (No. 60736008), the State Key Program of National Natural Science of Beijing (No. 4081002), the National Natural Science Foundation of China (No. 60803082) and the China Postdoctoral Science Foundation (20060400407). The corresponding author is Wang Xingce.

1 Introduction
Born from biological science and information technology, computational biology and computational medicine have been two of the most active fields in recent decades, and the segmentation of various organs attracts a great deal of attention. With the rapid development of medical imaging technology, the precision of medical images has become much higher, which makes it possible to study mini-organs. The segmentation of vascular vessels, however, is still a challenging task because of their complicated structure, small proportion of the image, and similarity to the surrounding brain tissues. Since the 1980s, numerous algorithms have been proposed to solve this challenging problem. As mentioned in [1], there are generally three types of methods. The first is region-oriented methods, including thresholding, region growing and the watershed transform. The second is model-based approaches, such as deformable models, parametric models, template matching methods and generalized-cylinder approaches. The third is statistics-based methods, which intend to segment the medical images into
vascular and non-vascular areas with either classification methods or clustering methods. The statistics-based methods comprise supervised and unsupervised approaches; in both, the image pixels are regarded as samples and are classified according to their features, and the supervised approach includes parametric and nonparametric statistics. Since various brain tissues appear in the medical images, including the cerebrum, cerebellum, medulla oblongata, thalamus and so on, the GMM regards those tissues as different subsets of voxels (independent components) and the intensity of each voxel as a random variable. In this paper, a supervised parametric statistical approach, the Gaussian Mixture Model (GMM), is therefore adopted to fit the stochastic distribution, and a voxel is classified as vascular tissue or other tissue according to its probability density. The segmentation of brain vessels is thus essentially a parameter estimation problem for the GMM. To simplify it, we use the maximum intensity projection (MIP) method to remove all tissues except the brain vessels and the surrounding brain tissue, so that the number of components in the GMM is reduced to two. Many algorithms have been proposed to solve the parameter estimation problem, among which the EM algorithm is one of the most popular [2]. In this paper an improved version called SEM is adopted, with the advantages of fast convergence and better validity [3],[4]. The remainder of this paper is organized as follows: Section 2 presents the algorithm model and our technical framework; Section 3 discusses each algorithm used in the framework; experimental results and conclusions are presented in Sections 4 and 5, respectively.
2 Algorithm Model and Technical Framework
The main flow of the algorithm is as follows. First, the brain images are preprocessed with the MIP algorithm to obtain purified data containing only the brain vessels and the surrounding brain tissue. Then the stochastic distribution is fitted with a double Gaussian mixture model and the parameters of each component are estimated with the SEM algorithm. Finally, the vascular area is obtained and converted into a 3D model. The framework involves four parts: the Data Input Module, the Preprocessing Module, the Vessel Segmentation Module and the Model Drawing Module, as shown in Fig. 1.

Data Input Module: a nonlinear filtering approach for noise reduction and an edge-feature-based registration method are adopted to eliminate noise and to remove the offset among different images.

Preprocessing Module: the MIP algorithm is used to remove the background and the non-vascular tissues, producing a new sequence of images that includes only the vessels and the surrounding brain tissue.

Vessel Segmentation Module: the intensities of the voxels are regarded as random variables, the stochastic distribution is fitted with a double Gaussian mixture model, and the SEM algorithm is adopted to estimate the Gaussian parameters; voxels are then classified as vascular vessels or other brain tissue according to their probability density.
Model Drawing Module: a tree-like structure and a ball B-spline curve are used to represent the topological information and the geometry information of the cerebral vessels respectively.
3 Implementation of the Algorithm

3.1 Pre-processing with MIP

There are several components in brain images, and the cerebral vessels take up only a small proportion of them [5],[6]. So the preprocessing method with MIP is proposed to remove most non-vascular components and to fix the number of classes in the Gaussian Mixture Model, which can improve the precision of segmentation [7]. MIP is a volume rendering algorithm which projects the voxels along the viewing direction (called the projection path), picks up the brightest voxel along each projection path, and displays it on the 2-D projection image. Usually, the intensity of vessels is higher than that of other tissues, so MIP can be applied to volume rendering of vessels. In this paper, we choose the Z axis as the direction of the projection paths and obtain the projection image. The intensity of each pixel is
I_{x,y} = \max\{G_1^{x,y}, G_2^{x,y}, \ldots, G_N^{x,y}\}        (1)

Here, I_{x,y} is the pixel intensity in the projection image, G_i^{x,y} is the pixel intensity in the ith image, x, y are the coordinates, and N is the number of images. Then pick a threshold T and set

I_{x,y} = \begin{cases} 1 & I_{x,y} > T \\ 0 & \text{else} \end{cases}        (2)
After that, the projection image is divided into several unconnected parts. We retain the largest part and remove the others. Finally, we multiply the original images with the projection image:
G_i^{x,y} = \begin{cases} G_i^{x,y} & I_{x,y} = 1 \\ 0 & I_{x,y} = 0 \end{cases}        (3)
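As an illustration only, the preprocessing of Eqs. (1)-(3) could be sketched in Python/NumPy as follows; the connected-component step uses scipy.ndimage, and all names and default choices are ours rather than the authors' C++ implementation.

import numpy as np
from scipy import ndimage

def mip_preprocess(volume, T):
    """volume: (N, H, W) stack of slices; T: intensity threshold."""
    # Eq. (1): maximum intensity projection along the slice (Z) axis
    mip = volume.max(axis=0)
    # Eq. (2): binarize the projection image with threshold T
    mask = (mip > T).astype(np.uint8)
    # Keep only the largest connected part of the binary projection
    labels, n = ndimage.label(mask)
    if n > 0:
        sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
        mask = (labels == (np.argmax(sizes) + 1)).astype(np.uint8)
    # Eq. (3): multiply every original slice by the projection mask
    return volume * mask[None, :, :]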
So a sequence of images is generated, including cerebral vessels and surrounding brain tissues only. All the voxels are processed as Gaussian random variables in the next stage. 3.2 Gaussian Model The probability distribution of double Gaussian mixture model is defined by
f(x \mid \Theta) = \omega_1 f_1(x \mid \theta_1) + \omega_2 f_2(x \mid \theta_2)        (4)

where f_1 and f_2 are the probability density functions of cerebral vessels and other brain tissues respectively, \theta_1 is the parameter of f_1, \theta_2 is the parameter of f_2, f(x \mid \Theta) is the probability density function of the double Gaussian mixture model, and \Theta is the parameter set of the double Gaussian mixture model.
The maximum-likelihood estimation algorithm is the most popular method for parameter estimation. A likelihood function L(\Theta) is defined, and the estimate satisfies

L(\tilde{\Theta}) = \max_{\Theta} L(\Theta)        (5)

where \Theta denotes the parameters of some probability distribution and \tilde{\Theta} is the maximum-likelihood estimate of \Theta. Because the calculation process of the maximum-likelihood algorithm is complicated, the EM algorithm was proposed as an improvement. The EM algorithm is a parameter estimation method for incomplete data that introduces some "potential data". Although the EM algorithm has a simple calculation process, it depends strongly on the initialization, is liable to converge to a local minimum, and has a slow convergence speed.

3.3 SEM Algorithm

The SEM algorithm is an improvement of the EM algorithm obtained by adding a stochastic component. Its advantages are its essential independence of the initialization and its faster convergence. The procedure is as follows:

1) Initialization: initialize the parameters
\omega_1^{(0)}, \omega_2^{(0)}, \mu_1^{(0)}, \mu_2^{(0)}, \sigma_1^{(0)}, \sigma_2^{(0)} randomly, which are the weights, Gaussian means and Gaussian variances respectively.

2) E-step: for each y_i, calculate the distribution over the set of classes based on the parameters obtained in the last iteration:

p_k^{(t+1)}(y_i) = \frac{\omega_k^{(t)} f_k(y_i)}{\sum_{j=1}^{2} \omega_j^{(t)} f_j(y_i)}, \quad i = 1, \ldots, N, \quad k = 1, 2        (6)

where y_i is the intensity of the ith brain voxel, f is the probability density function, \omega_k^{(t)} is the kth weight in the tth iteration, and N is the number of brain voxels.
3) S-step: define a partition \{P_1^{(t+1)}, P_2^{(t+1)}\} of all the voxels randomly, according to the distribution obtained in the E-step.

4) M-step: calculate the weights, means and variances:

\omega_k^{(t+1)} = N_k^{(t+1)} / N, \quad k = 1, 2        (7)

\mu_k^{(t+1)} = \frac{1}{N_k^{(t+1)}} \sum_{j=1}^{N_k^{(t+1)}} y_{j,k}, \quad k = 1, 2        (8)

\sigma_k^{2(t+1)} = \frac{1}{N_k^{(t+1)}} \sum_{j=1}^{N_k^{(t+1)}} \left( y_{j,k} - \mu_k^{(t)} \right)^2        (9)

Here, N_k^{(t+1)} = \mathrm{card}(P_k^{(t+1)}).

5) If \|\Theta^{(t+1)} - \Theta^{(t)}\| < \varepsilon, stop the iteration; otherwise return to 2).

The parameters obtained after iteration determine the Gaussian mixture distribution, and the voxels can be divided into vascular area and non-vascular area according to it.
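A minimal Python sketch of the two-class SEM iteration in Eqs. (6)-(9) is given below; the stochastic S-step and the stopping test on ||Θ^(t+1) − Θ^(t)|| follow the steps above, while the Gaussian densities come from scipy.stats and all function and variable names are ours.

import numpy as np
from scipy.stats import norm

def sem_two_class(y, eps=1e-3, max_iter=200, rng=np.random.default_rng(0)):
    """y: 1-D array of voxel intensities; returns weights, means, stds."""
    y = np.asarray(y, dtype=float)
    w = np.array([0.5, 0.5])
    mu = rng.choice(y, 2).astype(float)          # random initialization
    sd = np.array([y.std(), y.std()])
    for _ in range(max_iter):
        theta_old = np.concatenate([w, mu, sd])
        # E-step, Eq. (6): class posterior of every voxel
        dens = np.stack([w[k] * norm.pdf(y, mu[k], sd[k]) for k in range(2)])
        post = dens / dens.sum(axis=0)
        # S-step: draw a random partition according to the posteriors
        labels = (rng.random(y.size) > post[0]).astype(int)
        # M-step, Eqs. (7)-(9): update weights, means and variances
        for k in range(2):
            yk = y[labels == k]
            if yk.size == 0:
                continue
            w[k], mu[k], sd[k] = yk.size / y.size, yk.mean(), yk.std()
        if np.linalg.norm(np.concatenate([w, mu, sd]) - theta_old) < eps:
            break
    return w, mu, sd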
4 Results and Analysis In this section, we present the results of applying our segmentation algorithm to a group of brain MRI images, which contains 136 DICOM images. The minimum interval between images is 0.7mm, and the maximum interval is 2.1mm. The algorithm is implemented in C++, and runs on the computer with Intel(R) Core(TM)2 CPU, 2.00GB memory and NVIDIA Quadro FX 550 display card. 4.1 Results The volume rendering result of original images is shown in Fig.1.
Fig. 1. Volume rendering result of original brain MRI images
After the preprocessing of Section 3.1, the MIP image and the largest connected segmentation result are obtained, as shown in Fig. 2 and Fig. 3. Then, with the parameters estimated by the SEM algorithm described in Section 3.3 (with the iterative error ε = 0.001), the final segmentation result is shown in Fig. 4.
Fig. 2. MIP image
Fig. 3. Largest connected part
Compared with a statistical mixture model algorithm without preprocessing, the advantage of the algorithm in this paper is that the number of mixture components is already determined, because most of the non-vascular tissues are removed, leaving only the vascular voxels and the surrounding brain tissue voxels. Without preprocessing, a stage for dynamic estimation of the number of mixture components is usually added, which increases the algorithm complexity and the error.
Fig. 4. Segmentation result of cerebral vessels using the algorithm in this paper
The segmentation result using EM algorithm is shown in Fig. 5 and the comparison results using SEM algorithm and EM algorithm are shown in Fig. 6.
Fig. 5. Segmentation result using EM algorithm
(a) result using SEM algorithm (detail) (b) result using EM algorithm (detail) Fig. 6. Comparison results
It can be seen that, in the circle of Willis, besides the main vessels such as anterior cerebral artery, middle cerebral artery, posterior cerebral artery and communicating artery, some fine vessels can be obtained using SEM algorithm. Besides, there’s more
noise belonging to the non-vascular part in the result of the EM algorithm. The reason is that the EM algorithm is liable to converge to local minima, which increases the classification error and affects the segmentation result.

4.2 Algorithm Performance and Analysis

The difference in performance between parameter estimation with the EM algorithm and with the SEM algorithm is shown in Table 1. The Gaussian parameters calculated in each iteration are shown in Fig. 7 and Fig. 8.

Table 1. The performance parameters

                        EM        SEM
Initialization Method   k-means   random
Iteration Number        32        13
Convergence Time        162.5 s   20.0 s
Fig. 7. Comparison of the Gaussian means between the SEM algorithm and the EM algorithm
Fig. 8. Comparison of the Gaussian standard deviations between the SEM algorithm and the EM algorithm
It can be seen that the algorithm in this paper performs better than the EM algorithm. When the SEM algorithm is used, the parameter curves are flatter and the convergence is faster. Besides, the SEM algorithm does not depend on the initialization, while the EM algorithm depends strongly on it, so an optimized initialization method is needed, which costs more time.
References 1. Kirbas, C., Quek, F.: A Review of Vessel Extraction Techniques and Algorithms. ACM Computing Surveys 36(2), 81–121 (2004) 2. Dempster, A.P.: Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B 39(1), 1–38 (1977) 3. Carson, C., Belonqie, S., Greenspan, H., Malik, J.: Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(08), 1026–1038 (2002) 4. Masson, P., Pieczynski, W.: SEM algorithm and unsupervised statistical segmentation of satellite images. IEEE Transactions on Geoscience and Remote Sensing 31(3), 618–633 (1993) 5. Shiming, X., Rui, C., Yu, D., Hua, L.: Motion Segmentation via On-line Gaussian Mixture Model and Texture. Journal of Computer Aided Design & Computer Graphics 17(07), 1504–1509 (2005) 6. Rihua, X., Runsheng, W.: A Range Image Segmentation Algorithm Based on Gaussian Mixture Model. Journal of Software 14(07), 1250–1257 (2003) 7. Greenspan, H., Ruf, A., Goldberger, J.: Constrained Gaussian mixture model framework for automatic segmentation of MR brain images. IEEE Transactions on Medical Imaging 25(09), 1233–1245 (2006) 8. Law, M.H.C., Figueiredo, M.A.T., Jain, A.K.: Simultaneous Feature Selection and Clustering Using Mixture Models. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(09), 1154–1166 (2004) 9. Blekas, K., Galatsanos, N.P., Likas, A., Lagaris, I.E.: Mixture model analysis of DNA microarray images. IEEE Transactions on Medical Imaging 24(07), 901–909 (2005) 10. Diplaros, A., Vlassis, N., Gevers, T.: A spatially constrained generative model and an EM algorithm for image segmentation. IEEE Transactions on Neural Networks 18(03), 798– 808 (2007)
Classification and Diagnosis of Syndromes in Chinese Medicine in the Context of Coronary Heart Disease Model Based on Data Mining Methods Yong Wang, Huihui Zhao, Jianxin Chen, Chun Li, Wenjing Chuo, Shuzhen Guo, Junda Yu, and Wei Wang* Beijing University of Chinese Medicine Bei San Huan Dong Lu, Chao Yan District Beijing, 100029, P.R. China
[email protected],
[email protected]
Abstract. Objective: To study the classification and diagnosis of syndromes in Chinese medicine (TCM) based on the coronary heart disease model (CHD, myocardial ischemia) by applying clustering analysis from mathematical statistics. Methods: Using a model combining disease with syndrome, pathologic signs of the animal models were dynamically observed and recorded; a total of 172 frequencies of the signs were collected, and the variable indicators were analyzed by cluster analysis. Results: The results show that the CHD model can be divided into four syndromes by cluster analysis. The four categories cover 71.05% of the models and give a diagnostic accuracy rate of 92.11%, which can be used as key points to diagnose the various syndromes in CHD. Conclusion: Cluster analysis can help to classify TCM syndromes reasonably and objectively. What is more, it can also discover the pattern of syndrome evolution, thus providing a theoretical basis for the standardization of TCM research.
1 Introduction

Chinese medicine has a unique theoretical system which emphasizes the overall concept of the human organism and its relationship with the external environment. Differing from Western medicine (WM), syndromes are used in clinical practice by TCM. A syndrome is concluded from a group of interrelated symptoms and signs, and is also known as the general physiological response to various pathogenic factors. Syndrome differentiation is to make judgments based on further analysis of the four diagnostic information. But it mostly depends on the therapist's subjective opinion. If more objective methods were added into the diagnosis system and greater standardization were explored, more accurate differentiation of syndromes and an operable evaluation of animal models in Chinese medicine could be expected. How to extract the inherent pattern of a syndrome from the numerous signs has become a hot spot among Chinese
* Corresponding author.
medicine practitioners [1]. In this experiment, an Ameroid constrictor ring is used to prepare an animal model of CHD; the four diagnostic information are dynamically observed on the animal models from week 0 to week 13, and cluster analysis is used to find out the characteristic compatibility figure in the process of disease, so as to provide a new method for extracting TCM syndromes from symptoms. It can also provide a new idea for researching the evolution of syndromes.
2 Material and Methods 2.1 Establishment of a Blood Stasis Animal Model with CHD 18 healthy Chinese experimental miniature swine (25±4 kg) were divided into test group and sham group randomly. And 12 animals were included in test group while others in sham group. All animals were maintained and treated in accordance with the Principles of Laboratory Animal Care, formulated by the National Society for Medical Research, the guide for the Care and Use of Laboratory Animals, published by the National Institutes of Health, and local laws about laboratory animal care. The local ethics committee of Beijing University of Chinese Medicine approved all animal experiments. Animal model of chronic myocardial ischemia were produced according to the methods as we established before [2]. General anesthesia was induced in the fasting animals by intramuscular ketamine (25 mg/kg). They were intubated and anesthesia was maintained with continuous intravenous ketamine. A 4-6 mm segment of the left anterior descending coronary artery was freed and an Ameroid constrictor was placed around. Then, the thoracotomy was closed by layers and the electrocardiogram was performed again. The surgery was performed under continuous monitoring of ECG. After recovered from anesthesia, animals were extubated. During the first 3 days after surgery, penicillin (4,800,000 units per day) was injected intramuscular to antiinfection. All the treatment described above was also performed in sham group animals but placement of Ameroid constrictor. Based on dynamic observations of Clinical performances and detection of electrocardiogram, echocardiography and coronary angiography, the diseases of model animals were diagnosed and their syndromes were differentiated. 2.2 Observation Table Design After a comprehensive observation of animals from 0 to 13 week, referring to TCM theory, the four diagnosis information collection methods, combined with the feasibility and operability, Observation table have been designed to observe the model animals, including mental state, activity, tongue state, hair, eyes, skin, body temperature etc. and it has been used in our previous study for syndrome, the result are reasonable and feasible, furthermore, the disease was evaluated through electrocardiogram, coronary angiography and other auxiliary methods. 2.3 Cluster Analysis of Possible Feature Patterns of Syndrome with CHD Generally, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. According to the characteristics of our data, we
chose cluster analysis to explore the possible feature model of myocardial ischemia. Before introducing the algorithm, we give a rigorous definition of mutual information. Suppose system X = (X_1, X_2, \ldots, X_a, \ldots, X_p)^T consists of p variables, p \in N (N the set of natural numbers), where X_a = (X_a^i), a = 1, 2, \ldots, p; i = 1, 2, \ldots, q. Here our objective is to obtain from set X some subsets which have close properties. Let C_a (a = 1, \ldots, p) be the set of classifications of X_a and i be the i-th element of C_a, so that C_a = \{1, 2, \ldots, k\}, k \le q, and let n_i be the number of samples of X_a belonging to the i-th class; then the entropy of X_a is defined as

H(X_a) = -\sum_{i=1}^{k} (n_i / q) \log (n_i / q)        (1)
X a , Xb
H (X
a
is similarly defined as
∪ X b) = −∑ i
where
nij is quantity for
X a belong to
∑
n i j / q lo g n i j / q
j
i -th class of
Ca
(2)
simultaneously X b belong to
j -th class of Cb . For the convenience of application, expressions (1) and (2) can respectively be represented as H (X
H (X
a
a
) = lo g q −
∪ X
b
1 q
) = lo g q −
k
∑
i=1
1 q
n i lo g n i
∑ ∑ i
n ij l o g n ij
(3) (4)
j
Having the above-mentioned definition of entropy, in what follows the correlative measure, by which the statistical dependence between X_a and X_b is denoted, is defined by their mutual information.

Definition 1. Correlative measure between two variables. For arbitrary X_a \in X, X_b \in X, suppose X_a \cap X_b = \phi; then the entropy

H(X_a, X_b) = H(X_a) + H(X_b) - H(X_a \cup X_b)        (5)

is called the correlative measure \mu(X_a, X_b) between X_a and X_b.

Definition 2. Correlative measure among multi-variables. Suppose X_a \cap X_b = \phi for arbitrary a, b (a \ne b), p \in N; then

\mu(X_1, X_2, \ldots, X_p) \triangleq \sum_{a=1}^{p} H(X_a) - H\!\left( \sum_{a=1}^{p} X_a \right)        (6)

is called the correlative measure among X_1, X_2, \ldots, X_p.

We can also extend the definitions of correlative measure among variables to subsets of a complex system. In fact, a variable itself is also one particular subset.

Definition 3. Correlative measure among multi-subsystems. Suppose system X is partitioned into m subsystems s_1, s_2, \ldots, s_m, X = \sum_{i=1}^{m} s_i, with s_i \cap s_j = \phi for arbitrary i, j (i \ne j); then

\mu(s_1, s_2, \ldots, s_m) \triangleq \sum_{i=1}^{m} H(s_i) - H\!\left( \sum_{i=1}^{m} s_i \right)        (7)

is called the correlative measure among s_1, s_2, \ldots, s_m.

Let us consider a nonempty finite set X and the set-family E(X) consisting of its subsets; P is a set-function defined on E(X) with the properties: (i) P(A) \ge 0, \forall A \in E(X); (ii) P(\emptyset) = 0.
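For illustration, the entropy and correlative measure of Eqs. (1), (2) and (5) can be computed directly from discrete class labels; the short Python sketch below assumes each variable is already coded as a label vector of length q (names are ours, not the authors').

import numpy as np
from collections import Counter

def entropy(x):
    """Eq. (1): H(X) for a discrete label vector x of length q."""
    q = len(x)
    counts = np.array(list(Counter(x).values()))
    p = counts / q
    return -np.sum(p * np.log(p))

def joint_entropy(x, y):
    """Eq. (2): H(X ∪ Y) computed from the joint class labels."""
    return entropy(list(zip(x, y)))

def correlative_measure(x, y):
    """Eq. (5): mu(X, Y) = H(X) + H(Y) - H(X ∪ Y), i.e. mutual information."""
    return entropy(x) + entropy(y) - joint_entropy(x, y)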
2.4 The Complex Entropy Cluster Algorithm The algorithm is detailedly presented in [5], we also present it here. Once association for each pair (every two variables) is acquired, we propose a selforganized algorithm to automatically discovery the patterns. The algorithm can not only cluster, but also realize some variables appear in some different patterns. In this section, we use three subsections to introduce the algorithm. The first introduce the concept of “Relative” set. Based on this, the pattern discovery algorithm is proposed in second subsection. The last subsection is devoted to presenting an n-class association concept to back up the idea of the algorithm. For a specific variable X, a set, which is collected by mean of gathering N variables whose associations with X are larger than others with regard to X, is attached to it and is denoted as R(X). Each variable in the set can be regarded as a “Relative” of X while other variables that not belong to the set are considered as irrelative to X, so we name R(X) “Relative” set of X. The “Relative” sets of all 20 variables can be denoted by a 20*N matrix. Based on the matrix, the pattern discovery algorithm is proposed. A pair (variable X and Y) is defined to be significantly associated if and only if X belongs to the “Relative” set of Y ( X ∈ R (Y ) ) and vice versa ( Y ∈ R ( X ) ). It is convenient to extend this definition to a set with multiple variables. If and only if each pair of these variables is significantly associated, then we can call that the set is significant associated. A pattern is defined as a significantly associated set with maximal number of variables. All these kinds of sets constitute the hidden patterns in the data. Therefore, a pattern should follow three main criteria: (1) the number of variables within a set is no less than 2. (2) Each pair of the variables belong to a set is significantly associated. (3) Any variable outside a set can not make the set significantly associated. This means the number of variables within the set reaches maximum. We defined that two variables X and Y are correlated if and only if they are interrelative, i.e., X is a ‘relative’ of Y and vice versa. It is convenient to extend this
definition to the case with multi-variables, if each pair between these variables is correlated, then we called that they are correlated. A set that is comprised of maximal variables in which each pair is correlated is defined as a pattern and all sets constitute the hidden patterns in the data acquired above.
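The pattern discovery just described can be sketched as follows, assuming a pairwise association (correlative measure) matrix has already been computed; the "Relative" sets and the maximal significantly associated sets are found by brute-force enumeration, which is workable for the roughly 20 variables mentioned above (all names are ours).

from itertools import combinations
import numpy as np

def relative_sets(assoc, N):
    """R[i]: the N variables most associated with variable i (excluding i)."""
    R = []
    for i in range(assoc.shape[0]):
        order = [j for j in np.argsort(-assoc[i]) if j != i]
        R.append(set(order[:N]))
    return R

def significantly_associated(i, j, R):
    """A pair is significant iff each belongs to the other's Relative set."""
    return j in R[i] and i in R[j]

def find_patterns(assoc, N):
    """Maximal sets (size >= 2) in which every pair is significantly associated."""
    p = assoc.shape[0]
    R = relative_sets(assoc, N)
    patterns = []
    for size in range(p, 1, -1):
        for subset in combinations(range(p), size):
            ok = all(significantly_associated(i, j, R)
                     for i, j in combinations(subset, 2))
            # Keep only maximal sets: skip subsets of patterns already found
            if ok and not any(set(subset) <= c for c in patterns):
                patterns.append(set(subset))
    return patterns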
3 Results

3.1 Establishment of the Animal Model of Myocardial Ischemia

Just as we reported before, the Chinese experimental mini-pigs with an Ameroid constrictor around the left anterior descending artery developed a stable myocardial ischemia model 4 weeks after the operation.
Fig. 1. Coronary angiography 4 weeks after operation in control and ischemic groups
3.2 Data Mining of the Possible Syndrome Pattern of Myocardial Ischemia

Cluster analysis was used to analyze the 18 sign variables. The results show that four categories disperse the diagnostic information better, distribute the syndromes clearly, and are more in line with clinical practice. Based on experts' opinion, these four categories were named Qi deficiency, blood stasis, Yang deficiency and Yin deficiency respectively. The results are shown in Table 1 and Table 2.

Table 1. Syndrome classification based on cluster analysis

Symptoms                                                                       Model   Sham   P Value
little glossy coat, loose messy fur, dull eyes, the webbed skin between toes   22      1      2.53e-05
lack of energy, slowness, reduced number of tail wags                          36      1      4.25e-09
pale nasal mucosa, cyanotic dull red tongue, dark and gloomy eyes              8       1      0.048
pink tongue with white coating, loose messy fur, chapped skin between toes     15      3      0.0364
Table 2. Possible syndromes of CHD based on cluster analysis

Symptoms                                                                       Syndrome
glossy coat, loose messy fur, dull eyes, the webbed skin between toes          Blood stasis
lack of energy, action slowness, reduced number of tail wags                   Qi deficiency
pale nasal mucosa, cyanotic dull tongue, dark and gloomy eyes                  Yang deficiency
pink tongue with white coating, loose messy fur, chapped skin between toes     Yin deficiency
The 4 groups of syndromes together cover 71.05% of the model group animals; when the syndrome figures were tested on the sham-operated group, the mismatch rate was as high as 92.11%. This indicates that the classification method not only covers most of the model animals but also separates the model group from the sham surgery group significantly.

3.3 The Frequency Table of Syndrome Appearance Time

Symptoms such as little glossy coat, loose messy fur, dull eyes and chapped skin between the toes occur in the sixth week and continue to the 12th week; lack of energy, action slowness and a reduced number of tail wags appear from the 3rd week until the 13th week, but pale nasal mucosa, cyanotic dull tongue and dull eyes emerge only after 10 weeks. Pink tongue with white coating, loose messy fur and chapped skin between the toes also occur in weeks 6-12. This indicates that different syndromes may appear at different times, and at the same time two or three syndromes can also appear hand in hand (shown in Fig. 2).
Fig. 2. The frequency table of syndromes appearing time
4 Discussion and Conclusion

4.1 Cluster Analysis Was a Better Way to Establish the Pattern of Syndromes

At present, a lot of research on CHD has been carried out with integrated TCM-WM, but there is no standard classification and diagnostic criterion in TCM based on reasonable, scientific analysis. Cluster analysis and other modern methods of mathematical statistics
will be able to greatly improve the normalization of syndromes differentiation and its diagnostic accuracy if be applied rationally in TCM. In this study, 172 frequencies of signs are divided into four categories respectively by cluster analysis, suggesting that the key pathogenesis of CHD model is Qi deficiency, blood stasis and so on. In addition, this study also tried to explore new methods for the promotion of scientific and standardized TCM syndrome diagnosis. According to the results of this study, it can provide a reference for the clinical diagnosis of CHD in TCM. 4.2 The Feature Pattern Established by Cluster Analysis Was Robust and Sensitive We used the Chinese mini-pig to copy CHD(myocardial ischemia) animal model, and dynamic collect four diagnostic information systemically after 0-13 weeks. The syndrome of diagnostic criteria made by the Association of Integrative Medicine in 1986 is to evaluate syndrome for different time points in animal models. The 4 groups’ syndrome covers 71.05% of the model group animals, and the mismatch rate is as high as 92.11% in sham-operated group, it indicates that the 4 classification is the major pathology during the CHD, and they all have a sensitive difference from the sham group, suggesting a diagnosis meaning in the syndrome differentiation in CHD. So, we can conclude that the cluster analysis is suitable to describe the feature pattern of the CHD, what more, it has a sensitive and accurate diagnosis. In this study, by use of cluster analysis to four diagnostic information based on the chronic myocardial ischemia model, a preliminary extraction of four syndromes are got, and through the combination of time, frequency distribution, initial diagnosis is made that syndrome of myocardial ischemia largely related to Qi deficiency, blood stasis, Yang and other syndromes, and the distribution is time-depended, so it provided a new method not only for the evolution of the syndrome but also for extracting the characteristics of syndromes combination mode, and further clarify the TCM Syndrome biological basis and a new areas were expanded. Acknowledgments. This work was supported by The National Basic Research Program of China (2003CB517105) and The Creation for Significant New Drugs (2009ZX09502-018).
References 1. Junlian, L., Jian-Nan, S.: The essence of traditional Chinese medicine blood stasis Research. Liaoning Journal of Traditional Chinese Medicine 33(9), 1061–1063 (2006) 2. Xu, W.Y., Wang, W., Guo, S.Z., Liu, T., Liu, L., Yu, Y.X.: Duplication of an animal model of myocardial ischemia with blood stasis syndrome in mini-swines. Journal of Chinese integrative medicine 6(4), 409–413 (2008) 3. Integrated Traditional and Western Medicine in China in the study of professional committees huoxuehuayu. Diagnostic criteria of blood-stasis syndrome. Journal of Chinese Integrative Medicine 7, Cover 2 (1987) 4. Ziyin, S.: Reference standerd of traditional chinese medicine deficiency-syndrome. Journal of Chinese Integrative Medicine 6, 598 (1986) 5. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Annals of Statistics 32(2), 407–499 (2004)
An Image Reconstruction Method for Magnetic Induction Tomography: Improved Genetic Algorithm Wei He, Haijun Luo*, Zheng Xu, and Qian Li State Key Laboratory of Power Transmission Equipment & System Security and New Technology, The Electrical Engineering College, Chongqing University, Chongqing 400044, People’s Republic of China
[email protected]
Abstract. Magnetic induction tomography (MIT) is a new medical functional imaging technique. This paper proposes a multi-layer model of biological tissue, and a new reconstruction method for magnetic induction tomography based on an improved genetic algorithm (GA) is introduced. The GA is improved in the following aspects: the generation of the initial population, and the selection, crossover and mutation operations. The simulation results indicate that the method can reflect local and large-area bleeding in the shallow layers of biological tissue, and lay a foundation for future clinical brain disease monitoring.

Keywords: magnetic induction tomography, inverse problem, improved genetic algorithm (GA).
1 Introduction

The human body can be regarded as a spatial distribution of tissues with different electrical properties. When a part of the tissue has a functional pathological change (such as inflammation, edema, etc.), its conductivity distribution will change. That is to say, if we can measure the conductivity distribution in biological tissues, we can obtain some diagnostic information. This is a currently popular research subject in medical functional imaging, and magnetic induction tomography (MIT) is one of its important branches. The basic principle of MIT is as follows: a sinusoidal time-varying primary magnetic field is generated by an excitation coil and penetrates the conducting object. The eddy currents induced by the primary magnetic field generate the secondary magnetic field, which has a direct relationship with the conductivity distribution of the object. Therefore, through the measurement of the secondary magnetic field, the distribution of conductivity can be deduced for imaging [1]. At present, there are mainly two algorithms to solve the inverse problem: the back-projection method [2] and the sensitivity algorithm [3,4]. The back-projection algorithm applies only when the excitation and detection coils are located on opposite sides of the object; meanwhile, the measurement system demands a full circle of coils, which is not convenient for some clinical applications, such as custody of cerebral edema and hematoma. The sensitivity algorithm demands a large computation quantity, which
* Corresponding author.
cannot satisfy the requirement of the real-time monitoring in MIT. The improved genetic algorithm(GA) is a kind of faster, more practical inverse problems algorithm. The GA is a search technique used in computing to find exact or approximate solutions to optimization and search problems. In this paper, we have researched the application of improved genetic algorithm for solving the inverse problem of MIT. Firstly, we create a kind of multi-layer model to simulate the biological tissues. Then, we calculate the forward problem of MIT. Using the result of forward problem, we establish a proper improved genetic algorithm for MIT image reconstruction.
2 Multi-layer Model for Biological Tissues

In electromagnetic geologic surveys, the geology is assumed to be an infinite structure with multiple layers, so the reconstruction of the conductivity distribution is carried out on each layer. In the study of MIT for imaging biological tissues, we apply a similar idea. In fact, the human head is a complex multi-layer structure. For simplification, the human head can be divided into scalp, skull, cerebrospinal fluid, and brain tissue. So we create a multi-layer model for the human head; meanwhile, according to the number of sensors, each layer is divided into several regions, as shown in Fig. 1. A conductivity is set for each region to simulate the different tissues, or an intracerebral hemorrhage.
Fig. 1. MIT induced voltage measurement sketch map (the model is divided into regions with conductivities σ11, ..., σnL)
3 Calculation of the Forward Problem

In the forward problem, we first calculate the difference induced voltage of the two detecting coils in a single-layer model. The output voltage is presented as follows [5]:
V = -j\omega \cdot 2\pi r \cdot [A_4(r, z_1) - A_3(r, z_2)]
  = -j\omega \pi \rho \mu_0 I \rho' \int_0^{\infty} J_1(\lambda\rho) J_1(\lambda\rho') \left[ (e^{\lambda z'} + \alpha e^{-\lambda z'}) e^{-\lambda z_1} - e^{-\lambda z'} (e^{\lambda z_2} + \alpha e^{-\lambda z_2}) \right] d\lambda
  = -j\omega \pi \rho \mu_0 I \rho' \int_0^{\infty} J_1(\lambda\rho) J_1(\lambda\rho') \left[ e^{\lambda(z' - z_1)} - e^{\lambda(z_2 - z')} + \alpha \left( e^{-\lambda(z' + z_1)} - e^{-\lambda(z' + z_2)} \right) \right] d\lambda        (1)

where V is the difference induced voltage of the two detecting coils in a single-layer model, \omega the angular frequency, \rho the radius of the detecting coils, \rho' the radius of the exciting coil, J_1() the first-order Bessel function of the first kind, Z' the distance between the exciting coil and the phantom, and Z_1, Z_2 the distances between the detecting coils and the phantom.

Fig. 2. Multi-layer biological tissue model under one sensor (exciting coil L1 and detecting coils L2, L3 at heights Z', Z1, Z2 above an n-layer phantom, each layer of thickness C)
In the multi-layer model of the forward problem, we assume the model has n layers of equal thickness, as shown in Fig. 2. The output voltage of the sensor is the sum of the induced voltages of each layer, which is presented as follows:

V' = V_1 + V_2 + \cdots + V_n
   = -j\omega \pi \rho \mu_0 I \rho' \int_0^{\infty} J_1(\lambda\rho) J_1(\lambda\rho') \left[ n \cdot \left( e^{\lambda(Z' - Z_1)} - e^{\lambda(Z_2 - Z')} \right) + \sum_{i=1}^{n} \alpha_i \left( e^{-\lambda(Z' + Z_1 + 2(i-1)C)} - e^{-\lambda(Z' + Z_2 + 2(i-1)C)} \right) \right] d\lambda        (2)

\alpha_i = \frac{j\omega \mu_0 \sigma_i (1 - e^{2 u_i C})}{-(\lambda - u_i)^2 + (\lambda + u_i)^2 e^{2 u_i C}}        (3)

where V' is the difference induced voltage of the two detecting coils in the multi-layer model and C is the thickness of each layer.
4 Improved Genetic Algorithm Taking 1# sensor as an example, changing the frequency of excitation signal m times, we will obtain m data about induced voltage signals Vi (i = 1,2,..., m ) . The inverse
problem of MIT is to find a suitable conductivity distribution σ such that the difference between the measured induced voltage V_i and the calculated induced voltage U_i(σ) from (1),(2) is minimum. Using their square sum for representation, that is:

E(\sigma) = \sum_{i=1}^{n} \left[ V_i - U_i(\sigma) \right]^2        (4)
Actually it’s a nonlinear least square problem, and the problem solution usually adopts the Gauss-Newton iterative algorithm. If we use the Gauss-Newton iterative algorithm, because of the conductivity contained in dual Bessel integral, it will need great amount of calculation for Jacobi matrix and time consuming. If the initial value is selected improperly, it will be non-convergence. In order to satisfy the requirement about the velocity and quality of image, we adopt a more efficient searching algorithm: improved genetic algorithm, shown in Fig.3a.
Fig. 3. (a) Improved GA program, (b) Generating initial population program
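As a sketch of how a chromosome would be scored inside the GA loop of Fig. 3a, the objective of Eq. (4) can be written as a fitness function that treats the forward model U_i(σ) of Eqs. (1)-(2) as a black box; the callable name and the assumption of real-valued measurements (for example voltage amplitudes) are ours.

import numpy as np

def fitness(sigma, measured_V, forward_model, freqs):
    """Eq. (4): E(sigma) = sum_i [V_i - U_i(sigma)]^2; smaller is fitter.

    sigma         : candidate conductivity of each region (one chromosome)
    measured_V    : induced voltages measured at the excitation frequencies
    forward_model : callable (sigma, freq) -> predicted induced voltage U_i
    """
    predicted = np.array([forward_model(sigma, f) for f in freqs])
    return float(np.sum((np.asarray(measured_V) - predicted) ** 2))

# Selection can then rank a population by ascending E(sigma), e.g.
# ranked = sorted(population, key=lambda s: fitness(s, V, fwd, freqs))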
4.1 Generation of the Initial Population The quality of initial population of genetic algorithm has significant influence. The improper selection of the initial population, will easily cause excessive iteration, even non-convergence. The initial population should be evenly distributed in the solution area to improve the diversity of population. The generalized hamming distance between the individuals should be more than a target value[6]. The generation program of the initial population is presented in Fig.3b. 4.2 The Improvement of Selection Operation The selection operation decides the probability of initial population to the next generation, which is based on the individual fitness. In the inverse problem of MIT, the target function is 2-norm between the measured induced voltage Vi and the calculated induced voltage U i (σ ) . The smaller the 2-norm is , the higher the individual fitness is. The elitism preservation method and the ranking algorithm based on pressure difference are used in selection [7]. 4.3 The Improvement of Crossover Operation The crossover operation is the most important operation of genetic algorithms, and it is the key to the convergence of genetic algorithms. The two crossover individuals are selected according to the relative partner method[8]. And gene position of the singlepoint crossover is located referring the limitation of the active crossover region. The crossover operation probability[9] is adjusted according to the fitness value, as follow:
P_c = (f_{max} - f') / (f_{max} - f_{avg}), \quad f' \ge f_{avg}        (5)

P_c = P_{max}, \quad f' < f_{avg}        (6)

where P_c is the crossover operation probability, f_{max} the maximum fitness value of the individuals, f_{avg} the average fitness value, f' the larger fitness value of the two crossover individuals, and P_{max} the fixed maximum crossover probability.
4.4 The Improvement of the Mutation Operation

The mutation operation is mainly used to realize the local search ability. In the late stage of the GA, the area of the optimal solution has already been identified. At this time we use a relatively large mutation probability to strengthen the algorithm's local search ability, which can speed up the iteration. The mutation probability is determined as follows [9]:
P_m = P_{m\text{-}min} + \frac{P_{m\text{-}max} - P_{m\text{-}min}}{gen_{max}} \cdot gen_i, \quad f \ge f_{avg}        (7)

P_m = P_{m\text{-}min}, \quad f < f_{avg}        (8)

where P_{m-min} is the minimum mutation probability, P_{m-max} the maximum mutation probability, gen_{max} the maximum number of generations, gen_i the current generation, f_{avg} the average fitness value of the population, and f the fitness value of the individual. As can be seen from formulas (7) and (8), when the individual fitness value is greater than the average fitness value, the mutation probability of the individual increases with the number of iterations.
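A small Python sketch of the fitness-adaptive probabilities of Eqs. (5)-(8); the default constants are illustrative assumptions, not values from the paper.

def crossover_probability(f_prime, f_max, f_avg, p_max=0.9):
    """Eqs. (5)-(6): f_prime is the larger fitness of the two parents."""
    if f_prime < f_avg:
        return p_max
    return (f_max - f_prime) / (f_max - f_avg + 1e-12)

def mutation_probability(f, f_avg, gen, gen_max, pm_min=0.01, pm_max=0.1):
    """Eqs. (7)-(8): larger mutation for above-average individuals,
    growing with the current generation number."""
    if f < f_avg:
        return pm_min
    return pm_min + (pm_max - pm_min) * gen / gen_max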
5 Simulation Results

For the simulation, we used 7 sensors and a 6-layer model with a length of 14 cm and a thickness of 0.5 cm per layer. The radius of the exciting and detecting coils was 1 cm, and each coil had 10 turns. The distance between the exciting coil and the model surface was 0.5 cm, and the distance between the exciting coil and the two detecting coils was 0.5 cm. The exciting frequency was increased from 100 kHz to 300 kHz with a step of 20 kHz. The amplitude of the exciting current was 1 A.

5.1 Sensitivity Simulation for Different Sizes

In the first simulation, the abnormal areas were in the second layer (the black region in Fig. 4) with lengths of 14 cm, 10 cm, 5 cm and 2 cm respectively in four experiments. The conductivity in the abnormal area was set to 2 S/m, while that of the other areas was 0.2 S/m.
Fig. 4. Image reconstruction of different abnormal conductivity area settings
From the imaging results shown in Fig. 4, the larger the abnormal area is, the better the image result is. When the length of the abnormal area is similar to the diameter of the coil, we cannot obtain the right image result.

5.2 Sensitivity Simulation for Different Depths

In this simulation, the abnormal areas (Fig. 5) were set in layers 1, 2, 4 and 6 respectively, with the same length of 5 cm and the same conductivity of 2 S/m. According to the image results, the abnormal areas in layers 1, 2 and 4 produced good images. When the area was in layer 6, the algorithm failed.
Fig. 5. Image reconstruction of different abnormal conductivity area depth settings
6 Conclusions

This paper proposes a multi-layer model to analyze biological tissues and solves the inverse problem of MIT using an improved genetic algorithm. This study investigates the relationship between the inverse problem algorithm and its sensitivity to the size and depth of the abnormal area. The simulation results show that the improved GA is limited by the size of the sensor. When the size of the target object is close to that of the sensor, or its depth is more than 2.5 cm, the algorithm fails. But when the intracranial biological tissues show a large hematoma or edema within 2-3 cm of the surface, this algorithm can be used to reconstruct the conductivity distribution image. The inverse problem algorithm proposed in this paper only includes a series of calculations of the forward problem and the genetic operations, without any specific mathematical problem. Therefore the algorithm, without calculating the complex Jacobian matrix, is faster than the sensitivity algorithm.
In future study, the image reconstruction speed will be improved by combining the immune algorithm. In the hardware design, a smaller sensor will be used to improve the system resolution. In any case, this study lays a foundation for real-time monitoring of hematoma and edema, and gives a guiding direction for designing the hardware.

Acknowledgments. This work was supported by The National Natural Science Foundation of China (50877082), the Natural Science Foundation Project of CQ CSTC (CSTC2009BB5204), the project of scientists and engineers serving enterprises of the Ministry of Science and Technology of China (2009GJF10025), and the Scientific Research Foundation of the State Key Lab. of Power Transmission Equipment and System Security (2007DA10512709305).
References 1. Griffiths, H.: Magnetic induction tomography. Meas. Sci. Technol. 12, 1126–1131 (2001) 2. Korzhenevskii, A.V., Cherepenin, V.A.: Magnetic induction tomography. Comm. Tech. Electronics 42(4), 469–474 (1997) 3. Morris, A., Griffiths, H., Gough, W.: A numerical model for magnetic induction tomography measurements in biological tissues. Physiol. Meas. 22, 113–119 (2001) 4. Gencer, N.G., Tek, M.N.: Electrical Conductivity Imaging via Contactless Measurements. IEEE Trans. on Medical Imaging 18(7), 617–627 (1999) 5. Lei, Y.Z.: Analytic Solution of Harmonic Electromagnetic, pp. 182–187. Science Press, Beijing (2000) 6. Wu, B., Wu, J.: Research of Fast Genetic Algorithm. Journal of electronic science and technology university 28(1), 49–53 (1999) 7. Yu, M., Sun, S.D.: The theory and application of Genetic Algorithm, pp. 46–50. Defense Industry Press, Beijing (1999) 8. Lu, H.Q., Chen, L., Song, Y.S.: A the improved algorithm of crossover operator of genetic algorithm. PLA University of science and technology Journal 8(3), 250–253 (2007) 9. Zhang, M.H., Wang, S.J.: Shape optimization using an adaptive crossover operator genetic algorithms. Chinese journal of mechanical engineering 38(1), 51–54 (2002) 10. Huang, Y.R.: The intelligent optimization algorithm and application, pp. 20–22. Defense Industry Press, Beijing (2008)
The Segmentation of the Body of Tongue Based on the Improved Level Set in TCM Wenshu Li, Jianfu Yao, Linlin Yuan, and Qinian Zhou College of Information and Electronics, Zhejiang Sci-Tech University, Hangzhou, 310018, China
[email protected],
[email protected]
Abstract. The segmentation of the body of the tongue plays an important role in automatic tongue diagnosis in Traditional Chinese Medicine. If there are similar grayscales near the margins of the body of the tongue, it is difficult to extract the tongue body satisfactorily with popular methods directly. In order to overcome this difficulty, a method that combines prior knowledge with an improved level set method is presented. First, the contour of the tongue is initialized in the HSV color space and a method which enhances the contrast between the tongue and other parts of the tongue image is introduced. Then, a new region-based signed pressure force function is proposed, which can efficiently stop the contour at weak edges. Finally, we use a Gaussian filtering process to further regularize the level set function instead of re-initializing the signed distance function. Experiments on numerous real tongue images show the desirable performance of our method.

Keywords: Segmentation, Automatic tongue diagnosis, Gaussian filtering, Signed pressure force, Level Set.
1 Introduction

Tongue diagnosis is an important part of Traditional Chinese Medicine assessment [1]. However, the clinical competence of traditional tongue diagnosis depends on the subjective experience and knowledge of the doctor, which has inevitably impeded the development of TCM. Recently, researchers have been developing various methods and systems [2] based on the texture, color and other properties of the tongue image to build an objective and quantitative diagnostic standard for tongue diagnosis. These systems include the Automatic Recognition System of TCM Tongue Diagnosis [3], the Computerized Tongue Diagnosis System in TCM [4] and the Tongue Imaging Analysis Instrument [5]. In addition, the Hong Kong Institute of Technology and Harbin University of Technology have made progress in TCM automatic analysis of tongue images and have built a database with 5000 instances of tongue images [6]. However, these systems are either less efficient or less practical, above all in the segmentation of the body of the tongue [7], [8], [9], [10], [11], [12], [13], [14]. At present, there are many ways to segment objects; above all, the active contour model [15] has already been adopted in tongue segmentation as well. Active contour
model can obtain an accurate boundary of an object by deforming an initial curve that is defined in advance. In order to overcome the difficulties with the initialization and boundary concavities, a novel method [16] for tongue contour extraction based on improved level set curve evolution was proposed. Unfortunately, there is still one key difficulty with the method proposed in [16]: the grayscales near the margins of the tongue body must not be similar, or else the segmentation will likely be wrong or even fail. There is no satisfactory solution to this problem, although pressure forces, GVF, control points [9], domain adaptivity [10], directional attractions, and the use of solenoidal fields [11] have been proposed. However, most of the methods proposed to address this problem solve only one problem while creating new difficulties, and do not utilize prior knowledge about color information, which often plays an important role in improving the detection performance in the presence of disturbing image features. We found that the boundary between the body of the tongue and the adjacent skin in the grayscale image is often weak. Due to the pathological details on the surface of the tongue and the fragmental weakness of the tongue's edge, the segmentation of tongue images based on deformable models often converges to spurious boundaries. In this paper, we present an improved level set method with the following main improvements: (1) determining the initial position with both the H and V components of HSV space; (2) according to the distribution differences of the components of RGB space between the body of the tongue and the skin color, we propose a grayscale enhancement method which can enhance the contrast; (3) in order to improve convergence at weak edges, we construct a signed pressure force function based on regional statistics to replace the edge stopping function of the Level Set model. The experimental results show the good performance of this method.
2 Initialization of the Active Contour The usual approach is to find a point near the center of the target image, and then to give an initial two-dimensional curve and make it astringe to the edge of the target according to dynamics mechanism [17]. This approach is appropriate when the general form of the target is fixed. But the tongue of people is not always flat when extended out of the mouth. Some forms will be changed and therefore it is not suitable for this algorithm. According to the characteristics of the tongue, this paper proposes a method that determines the initialization by the H component and V component in HSV color model. In the procedure of collecting tongue image, the image obtained by camera is usually represented with three primary colors(R, G, and B). However, HSV color model is a more suitable model for visual recognition as the independence of the psychological perception between people and coordinate. So the change in each color component could be perceived with HSV color model. The theory and formula of space change could be seen in [15]. As is shown in Fig.1, original tongue is transformed from RGB color space to HSV space (Fig.1.a). Component H (Fig.1.b) is transferred into the binary image (Fig.1.c) which is smoothed by median filtering. Fig.1.c and Fig.1.d present the results of
binary image of H component and V component after median filtering, respectively. We can see the contour of root part of tongue can be attained according to Fig.1.d. In Fig.1.c, there is noise that would lead to the failure of initialization of tongue root. We find that the noise which would affect initialization is standing in the region above the midline of image and below the tongue root line attained according to Fig.1.d. In order to remove this kind of noise, we use the following strategies: (1) To get accurate edge of tongue root by V component(Fig.1.d); (2) With the edge of tongue root and the midline of tongue image, constructing the noise area (Fig.1.e) where the noise is. The region is surrounded by the edge line of tongue root which is curve and green, the midline of tongue image which is horizontal and two vertical lines; (3) To search and remove noise (Fig.1.f) from most left-top point to left and right in the area. We can get Fig.1.g by the fusion of Fig.1.d and Fig.1.f and get the initial contour as is shown in Fig.1.h.
Fig. 1. Initialization of the contour of tongue
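A rough Python/OpenCV sketch of this initialization is given below: the H and V channels are binarized, the tongue-root row is taken from the V mask, noise between the root line and the image midline is suppressed, and the two masks are fused. The Otsu thresholds and morphology settings are our assumptions, not the authors' exact procedure.

import cv2
import numpy as np

def initialize_contour(bgr):
    """Rough initial tongue mask from the H and V channels of HSV."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    h, v = hsv[:, :, 0], hsv[:, :, 2]
    # Binarize the H and V components (Otsu) and smooth with a median filter
    _, h_bin = cv2.threshold(h, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, v_bin = cv2.threshold(v, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    h_bin = cv2.medianBlur(h_bin, 5)
    v_bin = cv2.medianBlur(v_bin, 5)
    # Tongue-root line: topmost foreground row of the V-component mask
    rows = np.where(v_bin.any(axis=1))[0]
    root_row = int(rows.min()) if rows.size else 0
    mid_row = bgr.shape[0] // 2
    # Suppress noise lying below the root line and above the image midline
    if root_row < mid_row:
        band = h_bin[root_row:mid_row, :]
        h_bin[root_row:mid_row, :] = cv2.morphologyEx(
            band, cv2.MORPH_OPEN, np.ones((7, 7), np.uint8))
    # Fuse the two masks; the boundary of this region is the initial contour
    return cv2.bitwise_and(h_bin, v_bin)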
3 Enhance the Weak Edge between the Body of Tongue and Skin

The boundary between the tongue body and the face is often weak in the gray image (Fig. 2.a) transferred from the original image. This may bring great difficulty to the following segmentation and may even lead to failure. We use prior knowledge to enhance the weak boundary and provide a good basis for the following work. We observe that the R values in RGB color space for the tongue and the face are almost similar. However, the reason that the body of the tongue is redder than the face is that the G values corresponding to the body of the tongue and the face differ considerably, as the following shows:
(1) The G value corresponding to the face is larger than that corresponding to the tongue; (2) for the face, the G value is larger than the B value, but for the body of the tongue, the G value is equal to or larger than the B value; (3) whether for the tongue or for the face, the R value is larger than G and B. We choose the G component as the processing object to enhance the contrast in G space; given the features of the G component listed above, this can at the same time enhance the weak boundary between the tongue and the face. Treating R and B as constants and G as a variable, we propose the function in Eq. (1) to enhance the contrast according to the relative features of the three components R, G, B:

G' = \frac{R - G}{|G - B| + 1}        (1)

where R \ge G, R \ge B; the denominator is increased by 1 to avoid it being 0. Because the G value corresponding to the body of the tongue is smaller than that corresponding to the skin, the G' value (see Fig. 2.b) corresponding to the body of the tongue is larger than that corresponding to the skin. As is shown in Fig. 2.c, the weak boundary in the gray image in which G is replaced with G' has been enhanced obviously.
Fig. 2. Contrast between tongue and skin: (a) gray image, (b) G' value, (c) enhanced result
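Eq. (1) can be applied channel-wise in a few lines; the following Python sketch (our own) converts to float to avoid integer wrap-around and rescales G' to [0, 255] so it can replace G before the grayscale conversion.

import numpy as np

def enhance_green_contrast(rgb):
    """Eq. (1): G' = (R - G) / (|G - B| + 1), computed per pixel."""
    r, g, b = [rgb[:, :, i].astype(np.float32) for i in range(3)]
    g_prime = (r - g) / (np.abs(g - b) + 1.0)
    # Rescale to [0, 255] so G' can replace G in the grayscale conversion
    g_prime -= g_prime.min()
    if g_prime.max() > 0:
        g_prime *= 255.0 / g_prime.max()
    return g_prime.astype(np.uint8)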
4 Level Set Method Geodesic Active Contour (GAC) [18], [19], [20] is a kind of active contour model based on edge and it is widely used in image segmentation. The GAC model is formulated by minimizing the following energy functional:
L_R = \int_0^{L(c)} g(|\nabla I(C(s))|) \, ds        (2)

where L(c) is the perimeter of the closed curve C, \nabla I represents the gradient of the tongue image I, and g is the Edge Stopping Function (ESF) in Eq. (3):

g = \frac{1}{1 + |\nabla G_\sigma * I|^2}        (3)
where \nabla G_\sigma * I denotes convolving the tongue image I with a Gaussian kernel whose standard deviation is \sigma. Using the calculus of variations, we get the Euler-Lagrange equation of Eq. (2) as follows:

\frac{\partial C(t)}{\partial t} = g k \vec{N} - (\nabla g \cdot \vec{N})        (4)

where k is the curvature of the contour and \vec{N} is the inward normal to the curve. Usually a constant velocity term \alpha is added to increase the propagation speed. Then Eq. (4) can be rewritten as:

\frac{\partial C(t)}{\partial t} = g (k + \alpha) \vec{N} - (\nabla g \cdot \vec{N})        (5)

The corresponding level set formulation is as follows:

\frac{\partial \varphi}{\partial t} = g |\nabla\varphi| \left( \mathrm{div}\!\left( \frac{\nabla\varphi}{|\nabla\varphi|} \right) + \alpha \right) + \nabla g \cdot \nabla\varphi        (6)
where div is divergence operator and α is the balloon force, which controls the contour shrinking or expanding.
5 Improved Level Set Method 5.1 The Design of Signed Pressure Function
The Signed Pressure Function (SPF) defined in [21] has values in the range [−1,1] . It modulates the signs of the pressure forces inside and outside the region of interest so that the contour shrinks when outside the object, or expands when inside the object. We construct the SPF function as follows:
spf(I(x)) = \frac{I(x) - \frac{c_1 + c_2}{2}}{\max\left( \left| I(x) - \frac{c_1 + c_2}{2} \right| \right)}        (7)

where c_1 and c_2 are two constants which are the average intensities inside and outside the contour, which are defined in Eq. (8) and (9), respectively:

c_1 = \frac{\sum I(x) \cdot \mathrm{sign}(\varphi)}{\sum \mathrm{sign}(\varphi)}        (8)

c_2 = \frac{\sum I(x) \cdot (1 - \mathrm{sign}(\varphi))}{\sum (1 - \mathrm{sign}(\varphi))}        (9)

where \varphi < 0 inside the contour, and \mathrm{sign}(x) is the sign function defined in Eq. (10):

\mathrm{sign}(x) = \begin{cases} 1, & x < 0 \\ 0, & x \ge 0 \end{cases}        (10)

The significance of Eq. (7) can be explained as follows. We assume that the intensities inside and outside the contour are homogeneous. It is intuitive that \mathrm{Min}(I(x)) \le c_1, c_2 \le \mathrm{Max}(I(x)), and the equal signs cannot be obtained simultaneously wherever the contour is. Hence, there is

\mathrm{Min}(I(x)) < \frac{c_1 + c_2}{2} < \mathrm{Max}(I(x))        (11)
5.2 The Improved GAC Model
Substituting the SPF function in Eq. (7) for the ESF in Eq. (6), the level set formulation of the proposed model is as follows:

\frac{\partial \varphi}{\partial t} = spf(I(x)) \cdot \left( \mathrm{div}\!\left( \frac{\nabla\varphi}{|\nabla\varphi|} \right) + \alpha \right) |\nabla\varphi| + \nabla spf(I(x)) \cdot \nabla\varphi        (12)

In our method, the level set function can be initialized to constants, which have different signs inside and outside the contour. This is very simple to implement in practice. In the traditional level set methods, the level set function is initialized to be a signed distance function (SDF) to its interface in order to prevent it from being too steep or flat near its interface, and re-initialization is required in the evolution. Unfortunately, many existing re-initialization methods have an undesirable side effect of moving the zero level set away from its interface. Furthermore, it is difficult to decide when and how to apply the re-initialization. In addition, re-initialization is a very expensive operation. To solve these problems, we propose a novel level set method, which utilizes a Gaussian filter to regularize the selective binary level set function after each iteration.

In the traditional level set methods, the curvature-based term \mathrm{div}(\nabla\varphi / |\nabla\varphi|) |\nabla\varphi| is usually used to regularize the level set function \varphi. Since \varphi is an SDF that satisfies |\nabla\varphi| = 1, the regularized term can be rewritten as \Delta\varphi, which is the Laplacian of the level set function \varphi. As pointed out in [22] and based on the theory of scale-space [23], the evolution of a function with its Laplacian is equivalent to a Gaussian kernel filtering the initial condition of the function. Thus we can use a Gaussian filtering process to further regularize the level set function. We truncate the Gaussian kernel as a K × K mask for efficiency. The standard deviation \sigma of the Gaussian filter can control the regularization strength. Since we utilize a Gaussian filter to smooth the level set function to keep the interface regular, the regular term \mathrm{div}(\nabla\varphi / |\nabla\varphi|) |\nabla\varphi| is unnecessary. In addition, the term \nabla spf(I(x)) \cdot \nabla\varphi in Eq. (12) can also be removed, because our model utilizes the statistical information of regions, which has a larger capture range and a capacity of anti-edge-leakage. Finally, the level set formulation of the proposed model can be written as follows:

\frac{\partial \varphi}{\partial t} = spf(I(x)) \cdot \alpha |\nabla\varphi|        (13)
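A compact Python sketch of the evolution in Eq. (13) with the SPF of Eq. (7), the constant (binary) initialization and the Gaussian regularization described above; the time step and the binary re-selection step follow common practice for this family of models and are not the authors' exact code.

import numpy as np
from scipy.ndimage import gaussian_filter

def spf(I, phi):
    """Eqs. (7)-(9): region-based signed pressure force."""
    inside, outside = phi < 0, phi >= 0
    c1 = I[inside].mean() if inside.any() else 0.0
    c2 = I[outside].mean() if outside.any() else 0.0
    d = I - (c1 + c2) / 2.0
    return d / (np.abs(d).max() + 1e-12)

def evolve(I, init_mask, alpha=0.15, sigma=0.5, dt=1.0, iters=120):
    """Eq. (13): dphi/dt = spf(I) * alpha * |grad(phi)|, Gaussian-regularized."""
    phi = np.where(init_mask, -1.0, 1.0)           # binary initialization
    for _ in range(iters):
        gy, gx = np.gradient(phi)
        phi += dt * spf(I, phi) * alpha * np.sqrt(gx ** 2 + gy ** 2)
        phi = np.where(phi > 0, 1.0, -1.0)          # re-select binary level set
        phi = gaussian_filter(phi, sigma)           # regularize (replaces re-init)
    return phi < 0                                  # segmented region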
6 Results and Discussions The tongue images we used are acquired from Shanghai university of Traditional Chinese Medicine, Zhejiang university of Traditional Chinese Medicine and our own lab. The test tongue images are all 400 × 400 pixels. Our algorithm is implemented in Matlab 2009 with a 2.70 GHz AMD Athlon Processor. In each experiment, we choose K = 5 , σ = 0.5 and α = 0.15 . Fig.3 shows the segmentation results of images with weak or blurred edge. The first row shows gray images of the initial images and the weak boundary of the body of tongue is in red rectangle. The second row shows the enhanced results by our method introduced in section 3. The third row shows the results by the method in [16]. The fourth row shows the results by original GAC model. The segmentation results by our model are in the fifth row. We can see that in the red rectangles in the first row, the boundary of the tongue body is weak, and it is hard to be recognized. However, the images in the second row are the results which are processed by our enhancement method in section 3. In these images, the weak boundary in the original gray image has been enhanced obviously. Comparing the method in [16] with our model, we can see the curve by the method in [16] can not converge at right position on weak boundary, but our model can get better results which can be seen in the fifth row. For example, in the second column, the method in [16] can not converge well at lower right boundary. For original GAC model, the evolution of the level set function converges in average 290 iterations and takes average 2.5 min; while for our model, the evolution converges in average 120 iterations and only takes average 0.17 min. The original GAC model spends most time in reinitializing the SDF in evolution and computing the level set function. However, we reduce the SDF and simplify the level set model. So we save time and make the evolution more efficiently. As we choose signed pressure function to instead the edge stopping function and utilize a Gaussian filter to regularize the selective binary level set function after each iteration, it is obvious our model can stop the curve at even weak edge better than the original GAC model. As is shown in Fig.4, our model is desirable in segmentation to various tongue images. We apply the proposed model to the 400 clinical tongue images and the correct segmentation rate reaches 98.5% according to the judgment by experts in Traditional Chinese Medicine. Experiments show the good performance of the method proposed in this paper for the tongue images with weak edge.
Fig. 3. The first row shows the gray images of the initial images, with the weak boundary of the tongue body marked by a red rectangle. The second row shows the results enhanced by our method introduced in Section 3. The third row shows the results of the method in [16]. The fourth row shows the results of the original GAC model. The fifth row shows the segmentation results of our model.
Fig. 4. Convergence results of various tongue images
7 Conclusions

Tongue diagnosis is a necessary component of clinical diagnosis in Traditional Chinese Medicine. Moreover, it is one of the few diagnostic techniques that accord with one of the most promising directions of the 21st century: diagnosis without pain or injury. The proposed algorithms have shown promising results in segmenting tongue images and can segment the tongue body desirably even when the boundary is weak. Our method lays the groundwork for the objectification of tongue diagnosis. In future work we will study the recognition of teeth marks, the tracking of the tongue for dynamic information, and the relation between syndromes and diseases.

Acknowledgment. This study was supported by the National Natural Science Foundation of China under Grant No. 60702069 and the Natural Science Foundation of Zhejiang Province of China under Grant No. Y1080851.
References 1. Li, F., Song, T.B.: Practicality Handbook of Tongue Diagnosis of TCM. Science Press, Aachen (2002) 2. Jiang, Y.W., Chen, J.Z., Zhang, H.H.: Computerized System of Diagnosis of Tongue in Traditional Chinese Medicine. J. Chinese Journal of Integrated Traditional and Western Medicine 20, 21–23 (2000) 3. Yu, X.L., Tan, Y.L., Zhu, Z.M.: Study on Method of Automatic Diagnosis of Tongue Feature in Traditional Chinese Medicine. J. Chinese Journal of Biomedical Engineering 12, 10–13 (1994) 4. Shen, L.S., Wang, Y.M., Wei, B.G.: Image Analysis for Tongue Characterization. J. Acta Electronica Sinica 29, 1762–1765 (2001) 5. Chiu, C.C.: A novel approach based on computerized image analysis for traditional Chinese medical diagnosis of the tongue. J. Comput. M. PR 61, 77–89 (2000) 6. Takeichi, M., Sato, T.: Computerized color analysis of ‘Xue Yu’ (blood stasis) in the sublingual vein using a new technology. Amer. J. Chinese Med. 25, 213–219 (1997) 7. Pang, B., Zhang, D., Li, N.: Computerized Tongue Diagnosis Based on Bayesian Networks. IEEE Transactions on Biomedical Engineering, 1803–1810 (2004) 8. Witkin, K.: Snakes: Active contour models. J. Int. J. Comput. Vis. 1, 321–331 (1988) 9. Xu, N., Ahuia, N., Bansal, R.: Object segmentation using graph cuts based active contours. J. Computer Vision and Image Understanding 107, 210–224 (2007)
10. Zhu, G.P., Zhang, S.Q., Zeng, Q.S.H., Wang, C.H.: Boundary-based image segmentation using binary level set method. J. Optical Engineering 46, 500–510 (2007) 11. Vese, L.A.: A multiphase level set framework for image segmentation using the Mumford–Shah model. J. International Journal of Computer Vision 50, 271–293 (2002) 12. Lie, J., Lysaker, M.: A binary level set model and some application to Munford–Shah image segmentation. IEEE Transaction on Image Processing, 1171–1181 (2006) 13. Li, C.M., Xu, C.Y., Gui, C.F., Fox, M.D.: Level set evolution without re-initialization: a new variational formulation. In: IEEE Conference on Computer Vision and Pattern Recognition, San Diego, pp. 430–436 (2005) 14. Paragios, N., Deriche, R.: Geodesic active regions and level set methods for supervised texture segmentation. J. International Journal of Computer 46, 223–247 (2002) 15. Li, W.S., Zhou, C.L.: The Segmentation of the body of Tongue Based on the Improved Snake Algorithm in Traditional Chinese Medicine. In: 2004 World Congress on Intelligent Control and Automation, vol. 6, pp. 5501–5505 (2004) 16. Li, W.S., Hu, S.N., Li, H.T., Wang, S.: A novel segmentation of tongue image. J. Int. J. Functional Informatics and Personalised Medicine 2, 315–342 (2009) 17. Tsai, A., Yezzi, A., Willsky, A.S.: Curve evolution implementation of the Mumford-Shah functional for image segmentation, denoising, interpolation, and magnification. IEEE Transaction on Image Processing, 1169–1186 (2001) 18. Paragios, N., Deriche, R.: Geodesic active contours and level sets for detection and tracking of moving objects. IEEE Transaction on Pattern Analysis and Machine Intelligence, 1–15 (2000) 19. Caselles, V., Kimmel, R., Sapiro, G.: Geodesic active contours. J. International Journal of Computer Vision 22, 61–79 (1997) 20. Caselles, V., Kimmel, R., Sapiro, G.: Geodesic active contours. In: Processing of IEEE International Conference on Computer Vision 1995, Boston, MA, pp. 694–699 (1995) 21. Xu, C.Y., Yezzi, J.A., Prince, J.L.: On the relationship between parametric and geometric active contours. In: Processing of 34th Asilomar Conference on Signals Systems and Computers, pp. 483–489 (2000) 22. Shi, Y., Karl, W.C.: Real-time tracking using level sets. In: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 34–41 (2005) 23. Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. IEEE Transaction on Pattern Analysis and Machine Intelligence 12, 629–640 (1990)
Transcutaneous Coupling Implantable Stimulator

Hui Xiong1,2, Gang Li2, Ling Lin2, Wangming Zhang3, and Ruxiang Xu4

1 School of Electrical Engineering and Automation, Tianjin Polytechnic University, Tianjin 300160, China
2 State Key Laboratory of Precision Measurement Technology and Instruments, Tianjin University, Tianjin 300072, China
3 Neurosurgery Research Institute of Guangdong, Department of Neurosurgery, Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China
4 Department of Neurosurgery, The Military General Hospital of Beijing PLA, Beijing 100700, China
[email protected],
[email protected],
[email protected],
[email protected]
Abstract. Because the energy transmission efficiency of transcutaneous coupling power supply devices is low, the factors affecting this efficiency are analyzed and an externally powered implantable stimulator is designed. The circuit consists of a high-frequency inverter module, a transcutaneous transformer and an isolated pulse generator, and it makes the polarity, amplitude, frequency and pulse width of the stimulating pulses adjustable and controllable. The distance between the two transcutaneous coupling coils is the thickness of human skin, usually 5~15 mm. Practical tests at a distance of 10 mm show that the maximum transmission efficiency is 7.66% at a high-frequency carrier of 300 kHz. The amplitude is adjustable between 0 and 10 V and the frequency between 100 and 300 Hz, so the designed circuit meets the demands of implantable stimulators.

Keywords: transcutaneous coupling, high-frequency inverter module, transcutaneous transformer, efficiency, frequency.
1 Introduction

At present, the implantable nerve stimulators in clinical use mainly come from foreign companies such as Medtronic, Cyberonics, Boston Scientific, Neuropace, Medtrode Advanced Neuromodulation System, St. Jude and so on. In China, the Li Lu-Ming study group at Tsinghua University developed a nerve stimulator design, which has completed functional testing of prototypes, has been used successfully in animal experiments, and is now in clinical trials. These implantable stimulators share a common feature: the devices implanted inside the human body are powered by primary batteries. This brings many disadvantages, such as the large volume of the built-in circuit and the risk of battery leakage; more troublesome is the need for repeated surgery to replace the battery. Transcutaneous coupling implantable stimulators, which supply or charge the internal devices wirelessly,
are no longer constrained by battery volume and greatly reduce costs. At the same time, they avoid the repeated surgery needed to replace the battery before its power is exhausted in the patient's body, thereby reducing the patient's surgical pain and economic burden. The transmission frequency of the power transmission system of a transcutaneous coupling implantable stimulator is high [1], on the order of megahertz, whereas the stimulation pulses commonly used clinically are of low frequency, typically a few Hz to a few hundred Hz. The transcutaneous transformer of a transcutaneous energy transmission system is a pair of loosely coupled coils, whose transmission efficiency for low-frequency stimulation pulses is very low, and making the stimulation pulse frequency, amplitude and polarity adjustable is also difficult. Therefore, the transmission efficiency is a difficult point in the design of a transcutaneous coupling implantable stimulator. Its power supply obeys the law of electromagnetic induction: the in vitro coil establishes an alternating magnetic field, so that the in vivo coil produces an induced electromotive force, and the coupling of the two coils achieves the transfer of energy. The in vitro transmit coil is placed close to the skin, so the distance between the two coils is the thickness of human skin, usually 5~15 mm [2]; the coupling coefficient is therefore low, and improving the transmission efficiency of the transcutaneous coupling implantable stimulator is a difficulty. In one reported circuit design, when the measured distance between the two coils was 1 mm, the maximum efficiency was 3.5% and then dropped quickly [3], which is too low for a transcutaneous coupling power supply device. Improving the efficiency of transcutaneous energy transfer depends mainly on the carrier frequency [4]. Increasing the operating frequency not only improves the coupling strength of the transcutaneous transformer, but also reduces the size of the system components, especially the transcutaneous transformer, to meet the needs of transcutaneous coupling power supply. However, as the operating frequency increases, the iron loss, copper loss and temperature rise increase, and the system transmission efficiency decreases. Thus, depending on the specific system design, the transmission frequency of transcutaneous energy transmission systems usually ranges from several tens of kHz to more than 10 MHz [4-5]. The implantable stimulator designed here makes the signal amplitude, frequency and polarity adjustable. With a large amount of experimental data, a carrier frequency with higher coupling efficiency is obtained, which provides a basis for the wireless transmission system in the further design of percutaneous implantable stimulators.
2 The Design of the Transcutaneous Coupling Implantable Stimulator

2.1 The Modules of the Transcutaneous Coupling Implantable Stimulator

The implantable stimulation system consists of two circuit parts, in vivo and in vitro. The in vitro part includes the control module and the driver module, and the in vivo part is mainly the pulse module. The in vivo and in vitro circuits are linked by a transcutaneous transformer placed on the two sides of the skin, as shown in the structure chart of Fig. 1. The system consists of a transmitting circuit, a receiving circuit and a load circuit. The transmitter
consists of the power supply, the high-frequency inverter module and the primary coil; the high-frequency inverter module produces a high-frequency sine wave that acts on the primary coil. To improve transmission efficiency, the transmitter uses a series resonant circuit. The transmitter produces a high-frequency alternating magnetic field, the secondary coil of the receiving circuit generates an alternating induced electromotive force, and the waveform tuning module then produces the waveform needed by the in vivo stimulation module. The transcutaneous transformer is a pair of coupled coils placed on the inside and outside of the skin, so its leakage inductance is particularly large. The pulse circuit consists of a rectifier circuit, an energy-storage filter capacitor and a bistable flip-flop, which solves the problem of controlling the polarity of the stimulation pulse when no communication channel is available. The common stimulation parameters of clinical implantable stimulators are a square wave with an amplitude of 1~10 V and a frequency of 100~300 Hz.
Fig. 1. Schematic of the Transcutaneous Transformer System
2.2 Circuit Parameters

To achieve low power consumption and high efficiency of the transcutaneous coupling implantable stimulator while meeting the required output, the key is to select the parameters of the high-frequency inverter module, the transcutaneous transformer, the polarity-controlled isolated pulse generator and other parts. The high-frequency inverter module includes a voltage regulator circuit and an H-bridge switching circuit; the parameters to be selected include the input voltage range, the output voltage range and
the switching frequency. The system is supplied by a lithium battery whose voltage is 2.7~4.2 V; the implantable stimulator output voltage range is generally 1~5 V, while special circumstances need a stimulation output voltage of 10 V or more; the switching frequency directly determines the power consumption and transmission efficiency of the system, and its range is a compromise between the switching losses and the coil characteristics. Table 1 shows the required performance parameters of the high-frequency inverter module.

Table 1. Parameters of the high-frequency inverter circuit

Parameter              Range
Input voltage          2.7~4.2 V
Output voltage         0~15 V
Switching frequency    100~500 kHz
The voltage regulator circuit is the main power supply of the high-frequency inverter part. Its supply capacity is as high as 3 W, its static power consumption should be as small as possible, its output current is 0~100 mA (200 mA in special circumstances), its output voltage is 0~15 V, and its conversion efficiency should be at least 80%. The H-bridge circuit is the frequency conversion part. Because the inverter switch is the core device that directly determines the conversion efficiency, the following two aspects should be considered first when choosing it. Firstly, the rated current of the switching device should be higher than the peak inductor current, and its rated voltage should not be lower than the turn-off voltage. Secondly, the on-resistance is related to the on-state voltage and current of the switching device, and a switching device should have an on-resistance, Ron, as small as possible (typically a few mΩ). Besides, the switching device is voltage controlled and its gate input is equivalent to a capacitive element, so driving the gate is a charging and discharging operation; if the input capacitance is too large, the rise and fall of the gate voltage slow down, which slows the turn-on and turn-off of the switch and increases the switching power loss. Therefore, from the perspective of power consumption and size, a device whose gate capacitances Ciss and Crss are as small as possible should be chosen. The design of the transcutaneous transformer includes the core material, its specifications, the number of coil turns, the wire diameter, etc., and is a difficult point of the design. The induced electromotive force is proportional to the rate of change of flux; considering this, a higher frequency should be used to obtain a larger induced electromotive force. However, as the frequency increases, the specific absorption rate of the tissue around the transcutaneous transformer and the current density increase significantly, which leads to a temperature rise. Therefore, the transmission frequency cannot increase without limit, and it generally stays within 10 MHz [6]. In order to reduce the leakage inductance, a pair of coupling coils with magnetic (ferrite) pot cores, specification G18*11, is used. The core is generally made of MnZn ferrite material to reduce the core loss. Through analysis and calculation, the number of turns of the primary coil is 100, and the number of turns of the secondary coil is determined to be 150 by experiment. Taking into account the core window and the need to reduce the coil resistance, both the primary and secondary coils use enameled wire of 0.21 mm diameter.
The polarity-controllable isolated pulse part is mainly composed of the rectifier circuit and the bistable flip-flops. The circuit input is the secondary coil output, and the load is the part of the body being stimulated directly. Table 2 shows the required performance parameters of this isolated pulse generating circuit.

Table 2. Parameters of the isolated pulse generating circuit with adjustable polarity

Parameter                Range
Input voltage            -15~15 V
Input pulse frequency    100~500 kHz
Output voltage           -5~5 V (maximum -10~10 V)
Output pulse frequency   100~300 Hz
Load impedance           1 kΩ
The diode and the energy-storage filter capacitor are the main devices of the rectifier circuit. The considerations in choosing the diode include the maximum rectified current, the maximum reverse voltage and the operating frequency; this design uses a rectifier diode whose maximum reverse voltage is 45 V, maximum rectified current is 50 mA and maximum operating frequency is of MHz magnitude. A 0.01 μF energy-storage filter capacitor can be used.

2.3 System Power and Loss Analysis

The system loss mainly includes three parts: the high-frequency inverter loss, the transformer loss, and the rectifier loss together with the fixed loss. The high-frequency inverter loss includes the load-dependent conduction loss, the frequency-dependent switching loss and the fixed loss. The load-dependent conduction loss is caused by the on-resistance of the switching transistor and the gate resistance; the frequency-dependent switching loss mainly comes from two sources: the gate drive loss, caused by the charging and discharging of the gate capacitance, and the voltage-current overlap loss, caused by the overlap of voltage and current when the transistor switches. Table 3 lists the power loss formulas. Here D is the duty cycle, IL,rms is the equivalent current in the coil, RDS_ON is the switch on-resistance, tdead is the dead time, fsw is the switching frequency, Isw is the current through the switching device when it changes from ON to OFF or from OFF to ON, Vsw is the voltage across the switching device when it changes from ON to OFF or from OFF to ON, Toverlap is the voltage-current overlap duration of the switching device when its state changes, Cin is the input capacitance of the switching device, and VGATE is the gate control voltage of the switching device.

Table 3. Power loss analysis of the high-frequency inverter circuit

Mechanism                                    Power loss
Switch on-resistance                         D · I²L,rms · RDS_ON
Switch turn-on voltage and current overlap   Isw · Vsw · Toverlap · fsw
Switch gate drive                            Cin · VGATE · fsw
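As an illustration, the sketch below evaluates the three loss terms of Table 3 for a hypothetical set of component values; the numbers (on-resistance, overlap time, gate capacitance, etc.) are placeholders chosen only to show how the formulas combine, not measured values from this design.

def inverter_losses(D, I_L_rms, R_DS_on, I_sw, V_sw, T_overlap, C_in, V_gate, f_sw):
    # Table 3: conduction, overlap and gate-drive loss of the H-bridge switch
    p_conduction = D * I_L_rms**2 * R_DS_on
    p_overlap = I_sw * V_sw * T_overlap * f_sw
    p_gate = C_in * V_gate * f_sw
    return p_conduction + p_overlap + p_gate

# Hypothetical example values (not from the paper): 50% duty cycle, 100 mA coil current,
# 50 mOhm on-resistance, 20 ns overlap, 500 pF input capacitance, 300 kHz switching
p_total = inverter_losses(0.5, 0.1, 0.05, 0.1, 5.0, 20e-9, 500e-12, 5.0, 300e3)
print("estimated inverter loss (W):", p_total)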
The transcutaneous transformer is a loosely coupled transformer whose coil losses mainly include the iron loss, the copper loss and the leakage inductance loss. Raising the operating frequency can improve the coupling strength of the transcutaneous transformer, but as the frequency increases the iron loss, copper loss and temperature rise increase, and the system transmission efficiency decreases. Capacitor compensation can reduce the leakage loss. The design adopts series-parallel capacitor compensation, in which the primary coil is in series with a compensation capacitor and the secondary coil is in parallel with a phase compensation capacitor. At the carrier frequency f, the compensation capacitors are chosen from the resonance condition C1 = 1/((2πf)²L1) and C2 = 1/((2πf)²L2), where C1 is the primary-side series capacitor, L1 is the primary coil inductance, C2 is the secondary-side parallel capacitor, and L2 is the secondary coil inductance.
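A small sketch of this resonance-based compensation calculation is given below; the inductance values are hypothetical placeholders, since the paper does not report L1 and L2 numerically.

import math

def compensation_capacitor(L, f_carrier):
    # Resonance condition C = 1 / ((2*pi*f)^2 * L) at the carrier frequency
    return 1.0 / ((2 * math.pi * f_carrier) ** 2 * L)

f_carrier = 300e3          # 300 kHz carrier, as used in the experiments
L1, L2 = 100e-6, 220e-6    # hypothetical primary/secondary inductances (H)
print("C1 =", compensation_capacitor(L1, f_carrier), "F")
print("C2 =", compensation_capacitor(L2, f_carrier), "F")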
2.3.1 Rectifier Loss and Fixed Loss

The rectifier loss mainly refers to the forward loss, the reverse loss and the switching loss of the rectifier diode. The forward power loss equals the product of the forward voltage drop and the forward current; the reverse power loss equals the product of the reverse voltage and the reverse leakage current; and because of the junction capacitance, a reverse recovery current exists during rectification, so the switching loss is the product of the reverse voltage across the diode and the reverse recovery current through it. The fixed loss includes the chip loss and the losses caused by the leakage currents of transistors, diodes and the circuit.
3 Experimental Data and Analysis

With the distance between the two coils fixed at D = 10 mm and the input signal amplitude at 5 V, the high-frequency carrier frequency was varied; the resulting output power versus frequency curve is shown in Fig. 2. From Fig. 2, the maximum output power is 28.719 mW when the carrier frequency is 300 kHz. The next experiment was carried out to find the relationship between efficiency and output voltage. Fig. 3 shows the measured efficiency curve with a 5 V input supply, a load frequency of 100~300 Hz at the 300 kHz high-frequency carrier, and a load resistance of 1 kΩ. With the coil distance D = 10 mm and a 3 V output voltage, the measured maximum transmission efficiency is 7.66%; the output pulse amplitude is adjustable over 1~10 V, the duty cycle is 50%, and the polarity is adjustable.
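For reference, the transmission efficiency reported here is simply the ratio of the power delivered to the load to the power drawn from the supply; the sketch below computes it from measured quantities, where the example supply current is a placeholder rather than a value taken from the paper.

def transmission_efficiency(v_out_rms, r_load, v_supply, i_supply):
    # eta = P_load / P_input, with P_load = V_out_rms^2 / R_load
    p_load = v_out_rms ** 2 / r_load
    p_input = v_supply * i_supply
    return p_load / p_input

# Example: 3 V output into the 1 kOhm load, 5 V supply; the 23.5 mA supply current is hypothetical
eta = transmission_efficiency(3.0, 1e3, 5.0, 0.0235)
print("transmission efficiency: {:.2%}".format(eta))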
Fig. 2. Output Power Curve versus Frequency (D = 10mm)
Fig. 3. Efficiency Curve at Different Voltages (D =10mm)
4 Conclusion

In this paper, a design is presented to improve the transmission efficiency of a transcutaneous coupling implantable stimulator while meeting the required signal ranges. The experiments show that the maximum transmission efficiency is 7.66% at a coil distance of 10 mm. The output voltage range is 0~10 V, the frequency range is 100~300 Hz and the polarity is adjustable, so the polarity, amplitude, frequency and pulse width of the stimulating pulses are adjustable and controllable. The design meets the efficiency and frequency requirements of a transcutaneous coupling implantable stimulator.
References 1. Kopparthi, S., Ajmera, P.K.: Power delivery for remotely located Microsystems. In: 2004 IEEE Region 5 Conference: Annual Technical and Leadership Workshop, pp. 31–39 (2004) 2. Pernia, A.M., Orille, I.C., Martinez, J.A., et al.: Transcutaneous microvalve activation system using a coreless transformer. Sensors and Actuators A-Physical 136(1), 313–320 (2007) 3. Niu, C., Hao, H., Li, L., et al.: The transcutaneous Charger for Implanted Nerver Simulation Device. In: Proceeding of the 28th IEEE EMBS Annual International Conference, pp. 4941–4944 (2006) 4. Ben Hmida, G., Dhieb, M., Ghariani, H., et al.: Transcutaneous power and high data rate transmission for biomedical implants. In: International Conference on Design & Test of Integrated Systems in Nanoscale Technology, pp. 374–378 (2006) 5. Wu, Y., Yan, L.G., Xu, S.G.: Modeling and performance analysis of the New Contactless Power Supply System. In: Proceedings of the Eighth International Conference on Electrical Machines and Systems, pp. 1983–1987 (2005) 6. Shiba, K., Nukaya, M., Tsuji, T., et al.: Analysis of Current Density and Specific Absorption Rate in Biological Tissue Surrounding Transcutaneous Transformer for an Artificial Heart. IEEE Transactions on Biomedical Engineering 55(1), 205–213 (2008) 7. Blad, B., Bertenstam, L., Rehncrona, S., et al.: Measurement of contact impedance of electrodes used for deep brain stimulation. ITBM-RBM 26, 344–346 (2005) 8. Sato, F., Nomoto, T., Kano, G., et al.: A new contactless power-signal transmission device for implanted functional electrical stimulation (FES). IEEE Transactions on Magnetics 40(4), 2964–2966 (2004) 9. Coston, A.F., John, K.J.: Transdermal drugdelivery: a comparative analysis of skin impedancemodels and parameters. In: Proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Cancun, Mexico, September 17-21, pp. 2982–2985. IEEE, Los Alamitos (2003) 10. Arai, S., Miura, H., Sato, F., et al.: Examination of Circuit Parameters for Stable High Efficiency TETS for Artificial Hearts. IEEE Transactions on Magnetics, 4170–4171 (2005) 11. Chen, Q., Wong, S.C., Tse, C.K., et al.: Analysis, Design, and Control of a Transcutaneous Power Regulator for Artificial Hearts. IEEE Transaction on Biomedical Circuits And Systems, 23–31 (2009) 12. Lee, S.-Y., Lee, S.-C.: An Implantable Wireless Bidirectional Communication Microstimulator for Neuromuscular Stimulation. IEEE Transactions on Circuits and Systems 5(12), 2526–2538 (2005) 13. Yao, N., Lee, H.N., Chang, C.C., et al.: A power-efficient communication system between brain-implantable devices and external computers. In: 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Lyon, France, August 23-26, pp. 6588–6591. IEEE, Los Alamitos (2007) 14. Si, P., Hu, A.P., Hsu, J.W., Chiang, M., Wang, Y., Malpas, S., Budgett, D.: Wireless Power Supply for Implantable Biomedical Device Based on Primary Input Voltage Regulation. In: IEEE Conference, ICIEA 2007, pp. 235–239. IEEE, USA (2007)
Simulation Analysis on Stimulation Modes of Three-Dimension Electrical Impedance Tomography

Wei He, Kang Ju, Zheng Xu, Bing Li, and Chuanhong He

State Key Laboratory of Power Transmission Equipment & System Security and New Technology, College of Electrical Engineering, Chongqing University, Chongqing 400030, P.R. China
[email protected]
Abstract. To overcome the limited information and model error of two-dimensional electrical impedance tomography (EIT), a three-dimensional EIT model is set up. The finite element method is used to calculate the forward problem. To compare different stimulation modes, distinguish ability and a surface projection method are used. The results show that the back electrode mode has obvious advantages in detection depth and in clinical use. The conclusions of this paper can provide a reference for the study of three-dimensional EIT.

Keywords: electrical impedance tomography, three-dimensional EIT, distinguish ability.
1 Introduction

Electrical impedance tomography (EIT) is a distinctive imaging technology: by injecting voltage or current into an object and measuring the electrical response, the electrical impedance distribution of the measured object can be reconstructed. Valuable information can be obtained from the corresponding distribution of the electrical characteristics of tissues and organs by use of EIT. It has good prospects in biomedical testing, for example of breast disease, lung disease and gastrointestinal disease. Compared with ultrasound, magnetic resonance imaging and x-ray, EIT has its own characteristics: it is non-invasive, low-cost, simple and functional [1]. Existing EIT systems include the Rensselaer ACT4 system [2] (Rensselaer Polytechnic Institute), the TS2000 system [3] (T-SCANTM, Israel), the MEIK system [4] (МEIK®, Russia) and the MK3.5 system [5] (University of Sheffield). Limited information is a bottleneck of EIT. Electrodes are used both to excite and to measure, so the amount of information depends on the number of electrodes. Taking a 16-electrode system as an example [6] and considering the principle of reciprocity, the effective number of independent voltage measurements is only (16-3)×16/2 = 104. Another problem is model error: a two-dimensional imaging model is commonly used, which is a simple approximation of the three-dimensional imaging model and inevitably contains model error. Three-dimensional EIT is believed to be the trend of EIT, and it is necessary to study the electrode system, the motivation mode and the model of three-dimensional EIT. In
this paper, a simple three-dimensional model and a 6×6 surface electrode array are set up. Two simulation models (back electrode and hand electrode) and two motivation modes (current excitation and voltage excitation) are compared using distinguish ability [7].
2 Principle and Calculation Model of Three-Dimensional EIT

2.1 Mathematical Description of Three-Dimensional EIT

Because of the low permeability of biological tissue, the effect of the magnetic field is ignored in this paper. In the field domain Ω, the potential distribution ϕ and the conductivity distribution σ satisfy the Laplace equation:

∇ ⋅ (σ∇ϕ(r)) = 0,   r ∈ Ω                                    (1)
The boundary ∂Ω is divided into two parts: ΓE ∪ Γ, where ΓE is the union of the electrode boundaries Γl, ΓE = ∪_{l=1}^{L} Γl, and Γ is the gap between electrodes. The calculation accuracy of the complete electrode model is higher than the measuring accuracy; in this paper it is used to obtain the Dirichlet and Neumann boundary conditions:

Vl = ϕ + zl σ ∂ϕ(r)/∂n,   r ∈ Γl, l = 1, 2, ..., L            (2)

∫_{Γl} σ ∂ϕ(r)/∂n dS = Il,   r ∈ Γl, l = 1, 2, ..., L          (3)

where Vl is the voltage of electrode l, zl is the contact impedance, n is the unit normal vector of the boundary ∂Ω, and Il is the current of electrode l. In the gap between electrodes (Γ):

∂ϕ(r)/∂n = 0,   r ∈ Γ                                         (4)
Considering the conservation law of charge,

∑_{l=1}^{L} Il = 0                                            (5)
A potential reference point is needed to ensure a unique solution,

∑_{l=1}^{L} ϕl = 0                                            (6)
2.2 Simulation Model and Motivation Mode of Three-Dimensional EIT
The simulation model of three-dimensional EIT is shown in Fig. 1. The human breast is simplified as a cube of 15×10×10 cm³, and only the resistive component is considered (conductivity 0.4 S/m). A 6×6 electrode array is shown in Fig. 1c; each electrode is 7×7 mm² with a 2 mm gap between electrodes. In the back electrode mode there is a back electrode of 10×10 mm² on the back surface. The hand electrode mode differs from the back electrode mode in that an 'arm' of 40 mm diameter and 50 mm height and a hand electrode are added. The material of all electrodes is copper (conductivity 5.998×10⁷ S/m). An object of 15×15×10 mm³ (conductivity 4 S/m) is placed below the center of the electrode array.
Fig. 1. Simulation model of three-dimensional EIT: (a) back electrode model; (b) hand electrode model; (c) 6×6 electrode array
According to EIT systems at home and abroad, the following four modes are summarized: Back electrode and current mode (Iback): current is injected through electrodes 1 to 36 in turn and flows out from the back electrode. Hand electrode and current mode (Ihand): current is injected through electrodes 1 to 36 in turn and flows out from the hand electrode.
Back electrode and voltage mode (Vback): a voltage is driven between one of the 36 electrodes and the back electrode. Hand electrode and voltage mode (Vhand): a voltage is driven between one of the 36 electrodes and the hand electrode. In this paper, we select an excitation current of 3 mA and an excitation voltage of 2 V.
3 Motivation Mode of Three-Dimensional EIT

The forward problem is described as follows: the conductivity distribution σ and the excitation are known, and the potential distribution ϕ is solved based on (1) to (6). The three-dimensional finite element method is used to solve the forward problem for each motivation mode, with the object placed at different depths.

3.1 Evaluation Method
The concept of distinguish ability was proposed by Professor Isaacson to assess detection performance. It can also assess reconstruction performance; in other words, if the distinguish ability is less than or equal to the noise level, image reconstruction will fail. Two kinds of distinguish ability are presented: norm distinguish ability and power distinguish ability. Let the conductivity of the background be σ0 and the conductivity of the object be σ1. V(σ0, j) is the measured voltage in the uniform field when the current density is j, and V(σ1, j) is the measured voltage in the non-uniform field. εi is the measuring accuracy. When the following inequality holds, two materials with different conductivity can be distinguished:

‖V(σ0, j) − V(σ1, j)‖ > εi                                    (7)
The norm distinguish ability is

δ(σ1, σ0, j) = ‖V(σ0, j) − V(σ1, j)‖ / ‖j‖
             = √( ∑_{l=1}^{L} [Vl(σ0, j) − Vl(σ1, j)]² ) / √( ∑_{l=1}^{L} Il² )      (8)
Here Il is the current of electrode l. Current is injected in turn, and the average voltage of each electrode (Vback-avg, Vhand-avg) is used to calculate the norm distinguish ability. If the norm distinguish ability of mode I(1) is higher than that of mode I(2), it means that when

∑_{l=1}^{L} (Il(1))² = ∑_{l=1}^{L} (Il(2))²,

the inequality

∑_{l=1}^{L} (δVl(1))² > ∑_{l=1}^{L} (δVl(2))²
is established. That is to say, mode I(1) detects the object more easily.
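A direct numerical reading of Eq. (8) is sketched below; it assumes the forward solver has already produced the electrode voltage vectors for the uniform and perturbed conductivity distributions, and the variable names are illustrative only.

import numpy as np

def norm_distinguishability(v_homogeneous, v_object, currents):
    # Eq. (8): ||V(sigma0, j) - V(sigma1, j)|| / ||j||, both norms taken over the L electrodes
    dv = np.asarray(v_homogeneous) - np.asarray(v_object)
    return np.linalg.norm(dv) / np.linalg.norm(currents)

# Illustrative 36-electrode example with random stand-in voltages (V) and 3 mA drive currents
rng = np.random.default_rng(0)
v0 = rng.normal(0.1, 0.01, 36)
v1 = v0 + rng.normal(0.0, 0.001, 36)
i_drive = np.full(36, 3e-3)
print("norm distinguish ability:", norm_distinguishability(v0, v1, i_drive))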
The norm distinguish ability is applied to assess the current excitation modes, whereas the power distinguish ability can assess both the current and the voltage excitation modes. When the disturbance caused by the object is greater than the measurement accuracy εP, that is,

|P(σ0) − P(σ1)| > εP                                          (9)
it is believed that the object can be detected. So the power distinguish ability is defined as

δP(σ1, σ0) = |P(σ0) − P(σ1)| / P(σ0)                          (10)
The power P is the real part of ∑_{l=1}^{L} Il·Vl. If a motivation mode has a high power distinguish ability, the object can be detected easily in that mode. It must be pointed out that the norm distinguish ability and the power distinguish ability are independent of the excitation values, but are related to the motivation mode.
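The corresponding power-based measure of Eq. (10) can be evaluated in the same setting; the sketch below reuses hypothetical electrode current and voltage arrays and simply contrasts the total dissipated power with and without the object.

import numpy as np

def power_distinguishability(i_el, v_homogeneous, v_object):
    # P = Re(sum of I_l * V_l); Eq. (10): |P(sigma0) - P(sigma1)| / P(sigma0)
    p0 = np.real(np.sum(i_el * v_homogeneous))
    p1 = np.real(np.sum(i_el * v_object))
    return abs(p0 - p1) / p0

# Illustrative values only: 36 electrodes driven at 3 mA with stand-in voltages
i_el = np.full(36, 3e-3)
v0 = np.full(36, 0.10)
v1 = v0 * 0.98          # pretend the conductive object lowers the measured voltages by 2%
print("power distinguish ability:", power_distinguishability(i_el, v0, v1))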
3.2 Results

3.2.1 Norm Distinguish Ability

The norm distinguish ability is used to assess the forward problem for the two current excitation modes, as shown in Fig. 2. Obviously, the back electrode and current mode is superior to the hand electrode and current mode. As the depth of the object changes, the two modes vary similarly: the norm distinguish ability first decreases and then increases, but the rise of the back electrode and current mode is more obvious.
Fig. 2. Norm distinguish ability of two current modes
The difference between the two modes can be comprehended from the perspective of the current distribution. In the back electrode and current mode, current is injected from the electrode on the surface of the model and flows out through the back electrode. The general
direction of the current is from top to bottom around the vertical axis, like a spindle. The object is on the vertical axis, so the corresponding current density is higher. In the hand electrode and current mode, however, the general direction of the current is from the top towards the hand side, and the current density along the vertical axis is lower than in the former mode. Fig. 3 shows the two current distributions.
Fig. 3. Current distribution: (a) back electrode mode; (b) hand electrode mode
3.2.2 Power Distinguish Ability

The power distinguish ability is applied to the four excitation modes. The assessment result is shown in Fig. 4: the back electrode and current mode is similar to the back electrode and voltage mode, and the hand electrode and current mode is similar to the hand electrode and voltage mode. The power distinguish ability shows that the back electrode is superior to the hand electrode, which is consistent with the result of the norm distinguish ability.
Fig. 4. Power distinguish ability of four modes
4 Projection Images

In electrical impedance scanning (EIS), the measured voltages are often used to reflect the internal condition of the three-dimensional object by direct projection. This can also be used in the study of excitation modes. According to the conclusion of Section 3.2.2, the back
electrode mode has high distinguish ability. In this paper, a simple method is used to study the direct projection of the back electrode mode: the boundary voltages are calculated in the uniform field and the non-uniform field respectively, and the direct mean of each electrode is obtained from (11):

Vi = ( ∑_{j=1}^{L} Vi,j ) / L,   i, j = 1, 2, 3, ..., L        (11)
When current is injected through electrode j, Vi,j is the response voltage of electrode i, and ΔVi is the voltage difference of each electrode. Finally, we perform interpolation and contour imaging on the surface of the electrode area; the result is shown in Fig. 5.

ΔVi = Vi,homo − Vi,tar,   i = 1, 2, 3, ..., L                  (12)
Vi,homo and Vi,tar are the average voltages in the uniform field and the non-uniform field. In Fig. 5, the x and y axes express the length and width of the electrode array, and the z axis expresses the voltage difference. According to the imaging results, when the object is located at a depth of 1 cm the voltage difference of the object region reaches its maximum and the object and its size can be clearly marked; but as the depth increases, the object region becomes less obvious. The reason is that the result of (11) is the average voltage of each electrode and cannot reflect the real field. If current is injected through electrode 1, the current lines under electrode 1 are the most intensive and the current lines under electrode 36 the least intensive. Thus, according to the distribution of the current lines, a weight coefficient W is introduced:

W(i, j) = 1,                 i = j
W(i, j) = 0.95 × 1 / Si,j,   i ≠ j                             (13)
Si,j is the distance between electrodes i and j. Then (11) is improved to

Vi = ( ∑_{j=1}^{L} Vi,j × W(i, j) ) / L,   i, j = 1, 2, 3, ..., L      (14)

Fig. 5. Voltage difference imaging
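The weighted projection of Eqs. (11)-(14) reduces to a few array operations; the sketch below assumes a full L×L matrix of response voltages for the homogeneous and object cases and a list of electrode coordinates, all of which are illustrative stand-ins for the simulated data.

import numpy as np

def weighted_projection(v_homo, v_tar, coords):
    # coords: (L, 2) electrode positions; v_homo/v_tar: (L, L) response voltages V[i, j]
    L = v_homo.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = np.where(np.eye(L, dtype=bool), 1.0, 0.95 / np.where(dist == 0, 1.0, dist))  # Eq. (13)
    vbar_homo = (v_homo * w).sum(axis=1) / L      # Eq. (14) applied to the uniform field
    vbar_tar = (v_tar * w).sum(axis=1) / L        # Eq. (14) applied to the non-uniform field
    return vbar_homo - vbar_tar                    # Eq. (12): projection value per electrode

# Illustrative 6x6 array on a 9 mm pitch with random stand-in voltages
xs, ys = np.meshgrid(np.arange(6) * 9.0, np.arange(6) * 9.0)
coords = np.column_stack([xs.ravel(), ys.ravel()])
rng = np.random.default_rng(1)
v0 = rng.normal(0.1, 0.005, (36, 36))
dv = weighted_projection(v0, v0 * 0.99, coords)
print(dv.reshape(6, 6))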
The weighted imaging result is shown in Fig. 5. Compared with direct voltage difference imaging, there is little difference when the object is in the shallow region (1~3 cm); but when the object is in the deep region (5 cm or deeper), the weighted imaging is obviously superior to the direct imaging. In particular, the imaging result at 7 cm is better than that at 5 cm, which is consistent with the earlier conclusion on distinguish ability.
5 Conclusion

In this paper, a preliminary study of three-dimensional EIT is presented. To meet the needs of clinical application, the back electrode and the hand electrode are studied, and four motivation modes are assessed using distinguish ability. The projection images show that the back electrode is superior to the hand electrode and that weighted imaging is superior to direct imaging. These conclusions can be used effectively in three-dimensional EIT.

Acknowledgments. This work was supported by the Fundamental Research Funds for the Central Universities (Project No. CDJZR10150021).
References 1. Bayford, H.: Bioimpedance tomography (electrical impedance tomography). J. Annu. Rev. Biomed. Eng. 8, 63–91 (2006) 2. Kao, T.J., Boverman, G., Kim, B.S., et al.: Regional admittivity spectra with tomosynthesis images for breast cancer detection: preliminary patient study. J. IEEE Transactions on Medical Imaging 27(12), 1762–1768 (2008) 3. Michel, A., Orah, L., Dov, M., et al.: The T-SCANTM technology: electrical impedance as a diagnostic tool for breast cancer detection. J. Physiol. Meas. 22(1), 1–8 (2001) 4. Vladimir, A.C., Alexander, Y.K., Vladimir, N.K., et al.: Three-Dimensional EIT Imagine of Breast Tissues: System Design and Clinical Testing. J. IEEE Transactions on Medical Imaging 21(6), 662–667 (2002) 5. Nebuya, S., Noshiro, M., Yonemoto, A., et al.: Study of the optimum level of electrode placement for the evaluation of absolute lung resistivity with the Mk3.5 EIT system. J. Physiol. Meas. 27(1), 129–137 (2006) 6. Kao, T.J., Newell, J.C., Saulnier, G.J., et al.: Distinguishability of inhomogeneities using planar electrode arrays and different patterns of applied excitation. J. Physiol. Meas. 24(1), 403–411 (2003) 7. Rahal, M., Khor, J.M., Demosthenos, A., et al.: A comparison study of electrodes for neonate electrical impedance tomography. J. Physiol. Meas. 30, 73–84 (2009)
Researches on Spatio-temporal Expressions of Intestinal Pressure Activity Acquired by the Capsule Robot∗

Rongguo Yan1,**, Xudong Guo1, and Guozheng Yan2

1 School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
[email protected]
2 Department of Information Measurement Technology and Instruments, Shanghai Jiaotong University, Shanghai 200030, China
Abstract. Aim: To study the spatio-temporal expressions of intestinal pressure activity. Methods: The intestinal pressure activity was acquired using a capsule robot invented for the functional diagnosis of human gastrointestinal diseases, which collected intestinal physiological parameters as it travelled within the gastrointestinal tract. Results: A series of different contraction types were systematically analyzed to form the corresponding spatio-temporal expressions of the intestinal pressure activity, including (1) standing contractions, (2) propagating segmental contractions, and (3) pendular contractions. Conclusions: Spatio-temporal expressions provide a method for visualizing temporally evolving and spatially varying intestinal pressure activity.

Keywords: intestinal pressure activity, spatio-temporal expression, capsule robot.
∗ Supported by the National Natural Science Foundation of China (No. 30900320) and the Innovation Program of Shanghai Municipal Education Commission (No. 10YZ93).
** Corresponding author.

1 Introduction

With the development of new micro-electro-mechanical system (MEMS) technology in recent years, many diagnosis systems for the gastrointestinal tract have emerged rapidly, such as the SmartPill capsule devices made by the SmartPill Corporation, USA [1], [2], the M2A swallowable imaging capsule designed by Israel-based Given Imaging Ltd [3], [4], the MiRo developed in Korea [5] and the NORIKA invented at RF System Lab, Japan [6]. The M2A, the MiRo and the NORIKA were actually video endoscopes that could transmit digital images of
the intestine, whereas the SmartPill was a tracking device that acquired gastrointestinal physiological signals including pressure recordings, pH levels and transit time [1], [2]. Unlike the M2A, the MiRo and the NORIKA, the practical measuring capsule robot we developed for functional diagnosis of the intestine [7], [8], [9], [10] made it possible, like the SmartPill in some sense, to measure real-time physiological parameters of the intestinal tract under normal conditions. This method makes it possible to learn about unexplored aspects of the gastrointestinal tract. In the past few years, we have also carried out several studies on the processing of the signals acquired by the telemetric measuring capsule robot [8], [9]. In this paper, we deal with the spatio-temporal expressions of such intestinal pressure activities acquired by the capsule robot. Generally, spatio-temporal expressions provide a method for visualizing temporally evolving and spatially varying pressure time sequences. In the past, several researchers used spatio-temporal maps, called D-maps (D refers to the diameter). Such methods have been used to analyze the transmembrane potential distribution in the longitudinal muscle layer [11], [12], and to analyze the motility of the small intestine [13], [14], [15], colon [16] and stomach [17]. Unlike the spatio-temporal maps, the spatio-temporal expressions we propose give three-dimensional plots of the intestinal pressure activity.
2 Method and Subject

2.1 Method

The intestinal pressure activity was acquired using a micro system invented for gastrointestinal diagnosis in the past few years (see Fig. 1). The main components of the diagnosis system were: 1) an in-vivo capsule robot, which acquired physiological parameters within the GI tract under normal physiological conditions using a pressure-sensitive sensor embedded in the capsule, and sent the data to the next component, the in-vitro pocket data recorder, via a radio frequency (RF) wireless link; 2) an in-vitro pocket data recorder, which could be mounted around the waist of the subject and received the physiological parameters from the in-vivo capsule robot; and 3) an in-vitro data processing center, a computer that downloaded the acquired data from the pocket data recorder via an RS232 interface and finally processed them under the instruction and guidance of clinical doctors. The capsule looked like a slightly larger pharmaceutical pill, with a length of 21.1 mm, a diameter of 10.0 mm and a weight of 2.9 g. The sampling rate of the system was about 0.83 Hz (that is, one sample every 1.2 seconds). This sampling was more than adequate to measure motility signals, which in humans lie in the range from 1 cycle per minute (cpm) to 12 cpm (0.016 Hz to 0.2 Hz).
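The adequacy of this sampling rate follows directly from the Nyquist criterion; the short check below restates the numbers from the text (0.83 Hz sampling against motility signals of 1-12 cpm) and prints the resulting margin.

# Nyquist check for the capsule's pressure sampling, using the figures quoted in the text
fs = 1.0 / 1.2                 # ~0.83 Hz, one sample every 1.2 s
f_max_cpm = 12                 # fastest human motility rhythm considered, in cycles per minute
f_max_hz = f_max_cpm / 60.0    # = 0.2 Hz
print("Nyquist frequency:", fs / 2, "Hz")
print("sampling margin over 12 cpm signal:", (fs / 2) / f_max_hz)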
Fig. 1. The diagram of the signal acquisition system
After the subject had taken the capsule orally with a cup of water, it began its "tour" of the GI tract, driven by the natural peristaltic motion (i.e. by human gastrointestinal motility) of the gastrointestinal tract, until it passed normally out of the body through the anus. During its "tour", the intestinal pressure data were first recorded in the capsule. Because of the limited on-chip storage memory, the data were then transferred to the pocket data recorder via a radio-frequency (RF) transmitting module embedded in the capsule. Finally, the data could easily be downloaded to our computer via an RS232 interface for further study.

2.2 Subject
One 42-year-old healthy female volunteer was recruited through a hospital advertisement before the experiment. The subject was in good health, had no previous history of gastrointestinal symptoms or surgery, was not taking any medications, and had normal physical examinations. During the experiment, the subject did not complain of any discomfort; she had some difficulty swallowing at the beginning for psychological reasons, but swallowed the capsule successfully with another cup of water. In the experiment, the subject swallowed the capsule at 9:33 on the first day and discharged it from the body at 19:30 on the second day, so the total transit time of the capsule from the mouth to the anus was about 34 hours. One hour of representative intestinal pressure activity, from 15:50 to 16:50 on the second day, is shown in Fig. 2. The position of the capsule within the GI tract at this time was confirmed to be in the colon by an x-ray photograph taken at 14:38 on the second day. The recordings mainly consist of seemingly irregular segmental waves with "erratic" fluctuations, referred to as phasic non-propagating contractions, occurring either sporadically or in bursts to extract the remaining nutrients, in particular water and electrolytes, from the chyme in the colon. The pressure activity plotted in Fig. 2 is relative to standard atmosphere (101.325 kPa).
Fig. 2. The representative intestinal pressure activity (pressure in kPa plotted against time from 15:50 to 16:50)
3 Results

The time sequence plotted in Fig. 2 gives the intestinal pressure activity against time. Based on this time sequence, it is easy to obtain the averaged pressure, the baseline, the variation trend and other parameters of the intestinal pressure activity over this period using statistical methods. However, it provides no information on the position of the sensor in the gastrointestinal tract. The position of the robot in the GI tract was inferred empirically from X-ray photographs taken occasionally and from the transit time after the capsule robot was swallowed. During this period, the capsule may have performed to-and-fro movements in a specific intestinal segment, halted at a position while standing contractions occurred, or moved with propagating segmental contractions. In this paper, we study the spatio-temporal expressions of the intestinal pressure activity acquired by the capsule robot described above. The spatio-temporal expressions provide a method for visualizing the temporally evolving and spatially varying intestinal pressure activity acquired by the capsule robot. A series of different contraction types are systematically analyzed to form the corresponding spatio-temporal expressions of the intestinal pressure activity, including (1) standing contractions, (2) propagating segmental contractions, and (3) pendular contractions, as described below.

3.1 Standing Contractions

Standing contractions, also referred to as non-propagating segmental contractions, are contractions in which the segment contracts and relaxes at fixed locations. Fig. 3 shows standing contraction simulations of an intestinal segment. In the figure, the intestinal segment is depicted with three standing contractions from left to right. The intestinal diameter of the segment is 100% along the segment and increases/decreases at the three locations of the contractions. A standing contraction means that the lumen contracts in the cross-sectional direction and has no movement along the segment (in the oral or aboral direction). In this simulation, the first contraction occurs on both sides of the intestinal tube, and the other two occur on one side each. The distance between adjacent contractions is about 10 mm [18].
Fig. 3. Standing contraction simulations of the intestinal segment
During these standing contractions, the capsule robot should remain at one position for a time. Fig. 4 shows the spatio-temporal expressions of standing contractions of the intestinal pressure activity at three positions along the aboral direction. This method is helpful for finding a remarkable motor pattern of the intestine, known as the migrating motor complex (MMC), which consists of several phases of motor activity played out over hours of time and meters of space [20].
Fig. 4. Spatio-temporal expressions of standing contractions of the intestinal pressure activity
Satish S.C. Rao has reported one-dimensional sequences of two propagating pressure waves of three cycles/min travelling along the transverse colon, the splenic flexure and the descending colon to the sigmoid, plotted in six subplots [19]. He has also given an example of a retrograde pressure wave moving orally along the sigmoid, the descending colon, the splenic flexure and the transverse colon [19]. Using the spatio-temporal expressions, we could also clearly see such propagating pressure waves and their moving directions in a three-dimensional plot.

3.2 Propagating Segmental Contractions

Propagating segmental contractions are intestinal activity patterns characterized by segmental contractions (a prolonged series of bursts of spike activity superimposed on the basic electric rhythm) recorded successively at consecutive sites along
the intestine. During these contractions, the capsule robot may track such pressure variations within the intestinal tract, and the pressure activity should change both in time and in position (along the aboral direction). Fig. 5 shows the spatio-temporal expressions of such propagating segmental contractions of the intestinal pressure activity, assuming that the capsule robot moved 20 mm along the aboral direction in 600 s, as shown in Fig. 5.
Fig. 5. Spatio-temporal expressions of propagating segmental contractions of the intestinal pressure activity
3.3 Pendular Contractions

One of the most common types of contraction of the gastrointestinal tract is the so-called pendular contraction, which results from the rhythmic contractions of the longitudinal muscles. During these contractions, the capsule robot should perform to-and-fro movements, which creates a zigzag pattern of intestinal pressure activity in the spatio-temporal expressions. Fig. 6 shows the spatio-temporal expressions of such pendular contractions of the intestinal pressure activity, assuming that the capsule robot moved to and fro 4 times in 600 s, 10 mm each time, as shown in Fig. 6. A sketch of how such synthetic spatio-temporal patterns can be generated is given after Fig. 6.
Fig. 6. Spatio-temporal expressions of pendular contractions of the intestinal pressure activity
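To make the three patterns concrete, the following sketch builds synthetic position-versus-time trajectories for the standing, propagating and pendular cases and stamps a pressure wave along each, yielding the kind of three-dimensional (time, position, pressure) surface plotted in Figs. 4-6; all waveform parameters are illustrative, not fitted to the recorded data.

import numpy as np

def spatiotemporal_surface(trajectory_mm, f_contraction_hz, t_end_s=600.0, fs=0.83):
    # Sample a capsule trajectory and superimpose a rhythmic pressure wave on it
    t = np.arange(0.0, t_end_s, 1.0 / fs)
    x = trajectory_mm(t)                                  # aboral position of the capsule (mm)
    p = 20.0 * np.sin(2 * np.pi * f_contraction_hz * t)   # illustrative pressure swing (kPa)
    return t, x, p                                        # columns of a (time, position, pressure) plot

standing = lambda t: np.zeros_like(t)                      # capsule halted at one site
propagating = lambda t: 20.0 * t / t.max()                 # 20 mm aboral drift over 600 s
pendular = lambda t: 10.0 * np.abs(((4 * t / t.max()) % 2) - 1)   # 4 to-and-fro sweeps of 10 mm

for name, traj in [("standing", standing), ("propagating", propagating), ("pendular", pendular)]:
    t, x, p = spatiotemporal_surface(traj, f_contraction_hz=3 / 60.0)  # 3 cycles per minute
    print(name, "position range:", x.min(), "-", x.max(), "mm")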
4 Conclusions

In this paper, we studied the spatio-temporal expressions of intestinal pressure activity acquired by the capsule robot, which collected intestinal physiological parameters as it moved through the gastrointestinal tract by natural peristaltic motion until it passed normally out of the body. Such spatio-temporal expressions provide a method for visualizing temporally evolving and spatially varying intestinal pressure activity. The spatio-temporal expressions of three typical intestinal pressure activities, namely standing contractions, propagating segmental contractions and pendular contractions, are analyzed and clearly expressed in three-dimensional plots.

Acknowledgements. This work is supported in part by the National Natural Science Foundation of China (No. 30900320).
References 1. Smart pill nationwide diagnostic centers, http://www.natdxc.com/SmartPill.html 2. SmartPill corporation, the measure of GI health, http://www.smartpilldiagnostics.com 3. Appleyard, M.: A randomized trial comparing wireless capsule endoscopy with push enteroscopy for the detection of small-bowel lesions. Journal of Gastroenterology 119(6), 1431–1438 (2000) 4. Given Imaging, http://www.givenimaging.com 5. MiRo, North Korea, http://www.i3system.com 6. NORIKA, Japan, http://www.rfsystemlab.com 7. Wang, W.X., Yan, G.Z., Sun, F., et al.: A non-invasive method for gastrointestinal parameter monitoring. World Journal of Gastroenterology 11(4), 521–524 (2005) 8. Yan, R.G., Yan, G.Z., Zhang, W.Q., et al.: Phase coupling analysis of gastric pressure activity via wavelet packet based diagonal slice spectra. Computer Methods and Programs in Biomedicine 83, 198–204 (2006) 9. Yan, R.G., Yan, G.Z., Zhang, W.Q., et al.: Long-range scaling behaviours of human colonic pressure activities. Communications in Nonlinear Science and Numerical Simulation 13, 1888–1895 (2008) 10. Jiang, P.P., Yan, G.Z., Ding, G.Q., et al.: Researches on a telemetry system for gastrointestinal motility monitoring. In: Proceedings of, International Symposium on 2003 Micromechatronics and Human Science, Nagoya (2003) 11. Aliev, R.R., Richards, W., Wikswo, J.P.: Non-linear model of electrical activity in the intestine. In: Proceedings of the First Joint BMES/EMBS Conference Serving Humanity, Advancing Technology, GA, USA (1999) 12. Aliev, R.R., Richards, W., Wikswo, J.P.: A simple nonlinear model of electrical activity in the intestine. Journal of Theoretical Biology 204, 21–28 (2000) 13. Hennig, G.W., Costa, M., Chen, B.M., Brookes, S.J.H.: Quantitative analysis of peristalsis in the guinea-pig small intestine using spatio-temporal maps. Journal of Physiology 517, 575–590 (1999)
14. Bouchoucha, M., Benard, T.: Dupres: Temporal and spatial rhythmicity of jejunal wall motion in rats. Neurogastroenterology and motility 11, 339–346 (1999) 15. Bercik, P., Bouley, L., Dutoit, P., Blum, A.L., Kucera, P.: Quantitative analysis of intestinal motor patterns: spatiotemporal organization of non-neural pacemaker sites in the rat ileum. Gastroenterology 119, 386–394 (2000) 16. Antona, D., Hennig, G.W., Costa, M., Humphreys, C.M., Brookes, S.J.H.: Analysis of motor patterns in the isolated guinea-pig large intestine by spatio-temporal maps. Neurogastroenterology and motility 13, 483–492 (2001) 17. Berthoud, H.R., Hennig, G., Campbell, M., Volaufova, J., Costa, M.: Video-based spatiotemporal maps for analysis of gastric motility in vitro: effects of vagal stimulation in guinea-pigs. Neurogastroenterology and Motility 14, 677–688 (2002) 18. Jep Lammers, W.J., Cheng, L.K.: Simulation and analysis of spatio-temporal maps of gastrointestinal motility. BioMedical Engineering OnLine 7(2), 1–11 (2008) 19. Rao, S.S.C., Sadeghi, P., Beaty, J., et al.: Ambulatory 24-h colonic manometry in healthy humans. American Journal of Physiology - Gastrointestinal and Liver Physiology 280(4), 629–639 (2001) 20. Ehrlein, H.J., Schemann, M., Siegle, M.L.: Motor patterns of small intestine determined by closely space extraluminal transducers and videofluoroscopy. American Journal of Physiology - Gastrointestinal and Liver Physiology 253(3), G259–G267 (1987)
Analysis of Chlorophyll Concentration during the Phytoplankton Spring Bloom in the Yellow Sea Based on the MODIS Data

Xiaoshen Zheng1,2 and Hao Wei1,2

1 Physical Oceanography Laboratory, Ocean University of China, Qingdao 266100, China
2 Tianjin Key Laboratory of Marine Resource and Chemistry, Tianjin University of Science and Technology, Tianjin 300457, China
[email protected]
Abstract. The Yellow Sea is a semi-enclosed shelf sea located between the mainland of China and the Korean Peninsula, and it is one of the most important fishing areas in the world. A phytoplankton bloom is defined as a relatively rapid increase in the biomass of phytoplankton, and twice-yearly blooms in spring and autumn are common in the Yellow Sea. Chlorophyll concentration is an important parameter for estimating phytoplankton biomass and its seasonal variation or blooms. Taking the Yellow Sea as an experimental site, a phytoplankton bloom cruise was carried out with R/V Beidou, funded by a 973 project. Chlorophyll was measured by a Seapoint fluorescence sensor installed on an RBR 620 CTD. Using MODIS remote sensing images together with the in-situ water quality monitoring data, a retrieval model of chlorophyll concentration is built from a combination of the 250 m resolution band 1 and band 2 reflectivities, compared with the measured chlorophyll concentrations by correlation analysis and multivariate regression. The distribution of chlorophyll concentration in the Yellow Sea is then mapped with this retrieval model, and the intensity, location, evolution and spatial extent of the phytoplankton spring bloom are reflected clearly. The results of this study show that MODIS data are useful for quantitatively retrieving chlorophyll concentration and for studying the onset and dynamics of the phytoplankton spring bloom in the Yellow Sea.

Keywords: chlorophyll concentration, phytoplankton bloom, Yellow Sea.
1 Introduction
A phytoplankton bloom is a relatively rapid increase in the biomass of phytoplankton in an aquatic system. Two blooms per year, in spring and autumn, are typical of the mid-latitude phytoplankton annual cycle [1], with the autumn bloom usually weaker and smaller in extent than the spring bloom. Phytoplankton blooms have a great effect on the primary production level and the biological cycles of the whole region [2-3]. Chlorophyll concentration is an important parameter for estimating phytoplankton biomass and its seasonal variation
or blooms. Conventional in-situ monitoring of chlorophyll concentration can hardly provide comprehensive, dynamic and continuous information. Ocean colour obtained from satellite remote sensing is one of the best ways to obtain the continuous distribution of chlorophyll over a given area once calibrated with in-situ data [4]. Satellite-derived chlorophyll concentration compensates for the sparseness of station measurements and enables large-scale, long-term continuous observation, which benefits chlorophyll monitoring, analysis and research. The accuracy of inversion algorithms that retrieve chlorophyll concentration directly from the blue-green band ratio is low [5], so band combinations are normalized to improve the inversion accuracy. Some researchers have also retrieved chlorophyll concentration in Taihu Lake with an NDPI index constructed in the manner of the Normalized Difference Vegetation Index (NDVI) [6-7], which indicated a strong correlation between combinations of the 250 m resolution band 1 and band 2 reflectances and measured chlorophyll concentrations. In this paper, a reflectance index built from bands 1 and 2 (NDPI) is used to establish the remote sensing model for chlorophyll concentration inversion in the Yellow Sea, covering 33-38°N and 119-126°E. The results of this study show that MODIS data are useful for quantitatively retrieving chlorophyll concentration and for studying the onset and dynamics of the spring phytoplankton bloom in the Yellow Sea.
2 Observation Data of Phytoplankton Bloom in the Yellow Sea
The Yellow Sea is one of the most important fishing areas and one of the world's roughly 50 so-called "large marine ecosystems" [8]. Vertical mixing in winter provides nutrient-rich conditions for phytoplankton growth, and in spring the phytoplankton biomass increases dramatically as the water temperature rises and the water column stabilizes [9]. Phytoplankton blooms arise from a very complex mechanism, usually associated with hydrodynamic conditions, weather, the distribution and variation of temperature and salinity, nutrient supply, and the cycle characteristics of the phytoplankton themselves [10]. In general the spring bloom begins in March, strengthens in April and decreases in May as nutrients are consumed. From March 30 to April 24, 2007, a phytoplankton bloom cruise was carried out in the Yellow Sea with R/V Beidou, funded by 973 project 2006CB600402. Chlorophyll was measured by a Seapoint fluorescence sensor mounted on an RBR 620 CTD. The sampling stations include large-area survey stations and continuous observation stations in the bloom focus area; the former serve the background-field study of the phytoplankton bloom, while the latter are mainly for following the bloom process. The sampling stations are shown in Figure 1 [11]. Integrated multi-disciplinary observations were carried out on site while the satellite remote sensing images were collected; combining the images with the chlorophyll measured on the cruise, the bloom position was located and continuous observation of the regional bloom process was then carried out.
(Figure 1 is a station map of the South Yellow Sea, 33-38°N and 119-126°E, showing survey stations A1-A6, B5-B6, C1-C6, D1, D3, E1-E5, F1, F4, G1-G3 and continuous stations BM1-BM4 with BM4-a/b/c.)
Fig. 1. Sampling stations during spring bloom cruise in 2007
3 Remote Sensing Data and Processing
The remote sensing images and data are downloaded from http://ladsweb.nascom.nasa.gov/data/search.html; the MODIS AQUA Level 1B images are used. During processing, the images undergo geometric correction and radiometric calibration. The geometric correction uses a geographic Lat/Lon projection combined with the latitude and longitude information of the 1B data, and the corrected location accuracy is within 0.5 pixels. The purpose of radiometric calibration is to convert pixel values to reflectance. The formula is as follows:
R = scales × (DN − offsets)    (1)
where R is the reflectance value of an image pixel, scales is the reflectance scaling coefficient, DN is the value stored in the 1B data, and offsets is the reflectance scaling intercept.
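For illustration, the calibration in formula (1) amounts to a single array operation. The sketch below is our own and not part of the original processing chain; the function name and the example scale/offset values are hypothetical, since the real coefficients are read from the per-band attributes of the MODIS 1B product.

import numpy as np

def calibrate_reflectance(dn, scales, offsets):
    # Formula (1): convert stored digital numbers (DN) to reflectance.
    dn = np.asarray(dn, dtype=np.float64)
    return scales * (dn - offsets)

# Hypothetical values for illustration only.
band1_dn = np.array([[512, 600], [480, 530]])
band1_reflectance = calibrate_reflectance(band1_dn, scales=2.44e-5, offsets=316.97)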
4 Model and Algorithm
4.1 Characteristics of MODIS Data
The Moderate Resolution Imaging Spectroradiometer (MODIS) is one of the main sensors on the AQUA satellite, designed mainly for large-scale Earth monitoring and for detecting spatio-temporal changes of atmospheric, oceanic and terrestrial variables [12-13]. The MODIS sensor has 36 spectral bands with spatial resolutions of 250 m, 500 m and 1000 m; the characteristics of the two 250 m resolution bands are shown in Table 1.
Table 1. The band features of the 250 m spatial resolution bands

Number of band   Wavelength range /nm   Spatial resolution /m   Spectral radiance /W·m-2·um-1·sr-1   SNR
1                620~670                250                     21.8                                 128
2                841~876                250                     24.7                                 201
4.2 The Establishment of the Model
At present, inversion algorithms for ocean chlorophyll based on MODIS images are established directly on the 1000 m spatial resolution ocean bands. The reflectance spectra of water bodies differ markedly for different chlorophyll concentrations, especially at higher concentrations, and the difference appears mainly in the near-infrared: as the chlorophyll concentration increases, the near-infrared reflectance of the water increases significantly [14-15]. It is found that the chlorophyll concentration signal is very sensitive in band 2 (near-infrared, 841-876 nm) but not very sensitive in band 1 (visible, 620-670 nm), so a reflectivity index NDPI can be built for the remote sensing inversion of chlorophyll concentration. The NDPI is defined as follows:
NDPI(1, 2) = (R1 − R2) / (R1 + R2)    (2)
where R1 and R2 are the reflectance values of MODIS band 1 and band 2, respectively, and NDPI(1, 2) is the remote sensing index of bands 1 and 2. The inversion model of chlorophyll concentration is established through quadratic polynomial regression between the NDPI remote sensing index and the in-situ chlorophyll concentrations. The resulting equation is given in formula (3), and the fit between the measured chlorophyll concentrations at the sampling points and NDPI is shown in Figure 2.

Chl = 95.614 × (NDPI)^2 − 8.4016 × (NDPI) + 1.1329    (3)
Its correlation coefficient is R^2 = 0.8417, where Chl is the inverted chlorophyll concentration (mg/m3) and NDPI is the remote sensing index of bands 1 and 2.
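As a hedged illustration of how formulas (2) and (3) combine, the following sketch maps calibrated band 1 and band 2 reflectance arrays to a chlorophyll map; the function names and toy values are ours, not from the paper.

import numpy as np

def ndpi(r1, r2):
    # Formula (2): NDPI(1,2) = (R1 - R2) / (R1 + R2)
    return (r1 - r2) / (r1 + r2)

def chlorophyll_from_ndpi(x):
    # Formula (3): quadratic inversion model, chlorophyll in mg/m^3
    return 95.614 * x**2 - 8.4016 * x + 1.1329

# Toy reflectance values standing in for calibrated band 1 and band 2 images.
r1 = np.array([[0.08, 0.10], [0.12, 0.09]])
r2 = np.array([[0.05, 0.06], [0.07, 0.05]])
chl_map = chlorophyll_from_ndpi(ndpi(r1, r2))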
(Figure 2 plots the measured chlorophyll concentration, chl/(mg/m3), against NDPI for NDPI values from 0 to 0.4.)
Fig. 2. The measured chlorophyll concentration value and NDPI fitting result
4.3 Model Evaluation
Chlorophyll concentration in the Yellow Sea is inverted using the NDPI remote sensing index model of formula (3). The comparison between the measured and the inverted chlorophyll concentrations is shown in Figure 3.
(Figure 3 plots the measured and the inverted chlorophyll concentration, Chl/(mg/m3), for 12 sample stations.)
Fig. 3. The comparative analysis of measured and inversion of chlorophyll concentration
From Figure 3 we can see that the trends and patterns of the chlorophyll concentrations inverted by the model agree with the measured values, so the NDPI remote sensing index model is suitable for the inversion of chlorophyll concentration.
5 Inversion of Chlorophyll Concentration in the Yellow Sea
The daily averaged chlorophyll concentration inverted for the Yellow Sea (33-38°N, 119-126°E) from January to May 2007 is shown in Figure 4.
(Figure 4 plots the average inverted chlorophyll concentration, chl/(mg/m3), between 0 and 4 mg/m3, for dates from 14 January to 20 May 2007.)
Fig. 4. The average inversion result of chlorophyll concentration
From the above figure we can see that the chlorophyll concentration reached a peak on April 3, which coincides with the timing of our cruise. It is therefore likely that the phytoplankton bloom broke out around April 3, and the data of the sampling stations during the 2007 spring bloom cruise confirm it. The intensity, location, evolution and spatial extent of this phytoplankton bloom are thus clearly reflected in the MODIS remote sensing images. The inversion results are shown in Figure 5.
(Panels: April 2, April 3, April 4, April 6, April 9, April 13.)
Fig. 5. The inversion result of chlorophyll concentration in April, 2007
(Continued panels: April 16 and April 19; colour scale 0 to 30 mg/m3.)
Fig. 5. (continued)
From the above inversion result of chlorophyll concentration, it is obvious that the phytoplankton bloom occurred in a large area on April 3, and faded on April 9.
6 Conclusion
The experimental results show that using the NDPI remote sensing index model to invert chlorophyll concentration in the Yellow Sea is feasible. At the same time, the model is relatively stable and reproduces the actual trends and patterns of the chlorophyll concentration distribution. According to the measured data and the satellite remote sensing images, the spring phytoplankton bloom appears first in the nearshore region and then extends to the central part of the Yellow Sea; after the bloom in the central region declines, the chlorophyll concentration shows another small peak in the nearshore region. The distribution of the phytoplankton bloom reveals a patchiness that spans a wide range of spatial and temporal scales. In future work, more analysis is needed on the dynamics of the spring bloom in combination with wind and mixing.
Acknowledgments. This paper is supported by the National Key Basic Research Development Program (973 project 2010CB428904), National Natural Science Foundation Project (40830854), National Major Basic Research Program (2006CB400602), and Tianjin Natural Science Foundation Projects (08JCYBJC10500 and 09JCZDJC25400).
References 1. Lalli, C.M., Parsons, T.R.: Biological oceanography: An introduction, p. 103. Butterworth Heinemann, Oxford (1997) 2. Riley, G.A.: The Relationship of Vertical Turbulence and Spring Diatom Flowerings. Journal of Marine Research 5(1), 67–87 (1942)
3. Riley, G.A.: Factors Controlling Phytoplankton Populations on Georges Bank. Journal of Marine Research 6(I), 54–73 (1946) 4. Xiuren, N., Junxian, S., Yuming, C., Chenggang, L.: Biological Productivity Front in the Changjiang Estuary and the Hangzhou Bay and its Ecological Effect. Acta Oceanologica Sinica 26(6), 96–106 (2004) 5. Dall’Olmo, G., Gitelson, A.A., Rundquist, D.C., Leavitt, B., Barrow, T., Holz, J.C.: Assessing the potential of SeaWiFS and MODIS for estimating chlorophyll concentration in turbid productive waters using red and near infrared bands. Remote Sensing of Environment 96, 176–187 (2005) 6. Lingya, Z., Shixin, W., Yi, Z., Fuli, Y.: Determination of Chlorophyll-a Concentration in Taihu Lake Using MODIS Image Data. Remote Sensing Information (2), 25–28 (2006) 7. Chungui, Z., Yindong, Z., Xing, Z., Weihua, P., Jing, L.: Ocean Chlorophyll-a Derived from Satellite Data with Its Application to Red Tide Monitoring. Journal of Applied Meteorological Science 18(6), 821–831 (2007) 8. Shizuo, F., Fengqi, L., Shaojing, L.: Introduction to Marine Science. Higher Education Press, Beijing (2003) 9. Zhao, L., Wei, H.: The Influence of Physical Factors on the Variation of Phytoplankton and Nutrients in the Bohai Sea. Journal of Oceanography 61, 335–342 (2005) 10. Hao, W., Lei, W., Lin, Y., Chung, C.: Nutrient Transport Across the Thermocline in the Central Yellow Sea. Advances in Marine Science 20(3), 15–20 (2002) 11. Hanjun: Numerical study on the physical effect on phytoplankton bloom in the yellow sea, Ocean University of China (2008) 12. Barnes, W.L., Salomonson, V.V.: MODIS: A Global Imaging Spectroadiometer for the Earth Observing System. Crit. Rev. Opt. Sci. Technol. CR47, 285–307 (1993) 13. Barnes, W.L., Pagano, T.S., Salomonson, V.V.: Prelaunch Characteristics of the Moderate Imaging Spectroradiometer (MODIS) on EOS-AMI. IEEE Trans. Geosci. Remote Sens. 36, 1088–1100 (1998) 14. Carder, K.L., Chen, F.R., Lee, Z.P., Hawes, S., Kamykowski, D.: Semi-analytic MODIS Algorithms for Chlorophyll a and Absorption with Bio-optical Domains Based on NitrateDepletion Temperatures. J. Geophys. Res. 104, 5403–5421 (1999) 15. O’Reilly, J.E., Maritorena, S., Mitchell, B.G., Siegel, D.A., Carder, K.L., Garver, S.A., Kahru, N., McClain, C.: Ocean Color chlorophyll Algorithms for SeaWiFS. J. Geophys. Res. 103, 24937–24953 (1998)
A Novel Association Rule Mining Based on Immune Computational Intelligence Xuesong Xu and Sichun Wang Institute of Management Engineering, Information College of Hunan University of Commerce, Changsha, China
[email protected]
Abstract. Inspired by immune computational intelligence, a novel association rule mining algorithm based on immune clonal selection and clustering is proposed. To address the efficiency problem of association rule mining, raw data records are regarded as antigens and candidate patterns as antibodies, and a cluster competition operation is used to accelerate the affinity maturation of the antibodies and to improve the support of candidate patterns. Simulation and a real application show that this algorithm increases the convergence speed and the accuracy of the mined association rules, and that it has remarkable global and local search reliability. Keywords: association rule mining, cluster and competition, clonal selection, data mining.
1 Introduction
Association rule mining is one of the important research tasks in the field of data mining. As is well known, Agrawal and Srikant put forward the famous Apriori algorithm [1,2]; Han and Kumar then proposed a parallel algorithm for mining association rules [3], and Han also studied the search space of association rule mining. Association rule mining, however, has to find all itemsets whose support meets the requirement, and some of the methods above converge slowly or cannot obtain all association rules correctly. Artificial immune systems, based on biological immune mechanisms, provide a stochastic intelligent search technique. The immune memory mechanism has excellent search capability and can readily be applied to association rule mining by establishing the corresponding relation with the immune system. The literature [4][5][6][7] applies the clonal selection mechanism to association rule mining, combining immune algorithms with data mining. In this paper, an Immune Cluster Association Rule Mining algorithm (ICARM) is proposed based on the immune clonal selection mechanism. Clonal selection and cluster competition expand the antibodies of high affinity, so that each antibody in a cluster can be optimized independently during the evolutionary process, and the new antibodies can search a wider range of the search space to achieve multi-modal local
and global optimization, thereby improving the accuracy and speed of association rule mining. The remainder of the paper is organized as follows. Section 2 describes the basic definitions of itemsets, datasets and association rules, and briefly introduces the principle of immune clonal optimization. Sections 3, 4 and 5 present the immune cluster association rule mining algorithm and its realization. A typical test and a real application are used in Section 6 to validate the proposed algorithm and to verify the correctness of the analysis. Finally, we present some conclusions.
2 Problem Description
In general, let I = {i1, i2, ..., in} be a set of items, and let the dataset D be a set of transactions, where each transaction T is an itemset with T ⊆ I. For an itemset A, a transaction T contains A if A ⊆ T. An association rule is an implication of the form A ⇒ B, where A ⊂ I, B ⊂ I and A ∩ B = ∅. We give the following definitions:
Definition 1: The support of the rule A ⇒ B in the dataset D is s if s is the percentage of transactions in D that contain A ∪ B, that is, support(A ⇒ B) = P(A ∪ B).
Definition 2: The confidence of the rule A ⇒ B in D is c if c is the percentage of transactions in D containing A that also contain B, that is, confidence(A ⇒ B) = P(B | A).
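To make Definitions 1 and 2 concrete, the short sketch below (our own illustration, with a toy transaction set) computes support and confidence by direct counting.

def support(itemset, transactions):
    # Fraction of transactions containing every item of `itemset`.
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    # confidence(A => B) = support(A union B) / support(A)
    return support(set(antecedent) | set(consequent), transactions) / support(antecedent, transactions)

D = [{"i1", "i2"}, {"i1", "i3"}, {"i1", "i2", "i3"}, {"i2", "i3"}]
print(support({"i1", "i2"}, D))          # 0.5
print(confidence({"i1"}, {"i2"}, D))     # about 0.667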
To dig out valuable association rules, we set two thresholds: the minimum support (minsup) and the minimum confidence (minconf). Strong rules are those that meet both thresholds. An itemset meets the minimum support if its frequency is equal to or greater than the minimum support. Association rule mining is therefore usually divided into two steps: first find all frequent itemsets, and then generate strong association rules from the frequent itemsets. The first step is the key to the algorithm's performance, because the number of potential frequent itemsets grows exponentially, which would require searching the whole space. Clonal selection is an important theory of the biological immune system. Unlike traditional evolutionary algorithms, which emphasize individual competition under natural selection while keeping the population unchanged, the cloning operation on the one hand realizes individual competition through antibody-antigen affinity, and on the other hand preserves the diversity of the antibody population by affinity regulation and the suppression of excessive competition. A given antibody can then apply a variety of variation and recombination strategies at the same time through individual proliferation. Because the evolution is controlled by the immune system in a distributed, non-centralized and parallel way, each antibody in the antibody population can be optimized independently during the evolutionary process.
3 Data Expression and Initialization
Both the original records and the candidate patterns can be treated as chromosomes of genes, so we take the data records as antigens and the candidate patterns as the antibodies that recognize them. Comparing antigen and antibody gives their degree of similarity and their relationship. Antibodies that match an antigen to a high degree gain a greater chance of survival and cloning. The learning process improves the affinity of the individuals; it is also the process by which frequent patterns are generated and preserved through the immune memory mechanism. Finally, strong association rules are memorized by the memory cells. Each record in the data is regarded as an antigen. The search process generates candidate patterns as antibodies, which carry three important pieces of information: the B-cell number, the B-cell stimulation threshold and the support. The structure of the data expression is shown in Figure 1.
Fig. 1. Data expression of antigen-antibody
A decimal encoding of the attribute set is used here. Let W = {W1, W2, ..., Wn} be a finite set of attributes and V = {V1, V2, ..., Vn} the corresponding sets of attribute values, where Vi is the range of attribute Wi; then the encoding range of Wi is 0 to |Vi|. Records and patterns are thus expressed as sequences of attribute codes. Define the initial antibody population Ab of size N, with each antibody encoded with length L; if S denotes the shape space of all antibodies, then Ab ∈ S^(N×L).
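As a toy illustration of this decimal encoding (our own example, not from the paper): each position holds the index of an attribute value, and 0 marks an attribute left unconstrained by a candidate pattern.

def encode(record, attributes):
    # One integer per attribute; 0 means "not constrained" in a candidate pattern.
    return [record.get(a, 0) for a in attributes]

attributes = ["age_group", "title", "grade"]
antigen = encode({"age_group": 2, "title": 3, "grade": 1}, attributes)   # a data record
antibody = [0, 3, 1]   # candidate pattern: any age group, title code 3, grade code 1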
4 Immune Association Rule Mining Algorithm 4.1 Immune Clonal Cluster Process
The initial antibody population is obtained randomly: a record is selected at random from the dataset and some of its gene positions are set to 0 at random, which yields an antibody for clustering; if the antibody belongs to the same individual species, it is added to the population. This is repeated until the population reaches the required size.
In general, define the initial antibody population A(k) of size N, with each antibody encoded with length L and shape space S, so that A(k) ∈ S^(N×L), and run the initial clustering competition procedure. After clustering, we obtain the antibody groups: each cluster represents a sub-population with Di antibodies. A competitive selection mechanism is then introduced in each sub-population, and the currently best antibody, the one with the maximum fitness that represents the cluster centre, is put into the elite set, forming an antibody group of size T. The T outstanding antibodies are then expanded into T small antibody groups of sizes Ni (i = 1, 2, ..., T). Through these operations, similar antibodies are put into the same cluster and are selected and reproduced within the local elite set to achieve affinity maturation.
4.2 Clonal Proliferation
After clustering, the antibodies undergo clonal proliferation, with the clone size of each antibody given by the clonal expansion function of formula (1):

Ni = round( N × aff(Abi) / Σj=1..T aff(Abj) )    (1)
where Ni is the clone size of the i-th antibody, round(·) is the rounding function, and i = 1, 2, ..., T. Nc = Σi=1..T Ni is the total number of antibodies after cloning, and Nc ≈ N.
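A small sketch of how formula (1) allocates clone sizes; the affinity values are invented for illustration.

import numpy as np

def clone_sizes(affinities, n_total):
    # Formula (1): clone counts proportional to affinity, rounded to integers.
    affinities = np.asarray(affinities, dtype=float)
    return np.rint(n_total * affinities / affinities.sum()).astype(int)

print(clone_sizes([0.9, 0.5, 0.1], n_total=100))   # -> [60 33  7]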
Antibodies with higher affinity therefore get more chances to clone and produce more offspring. After cloning, the original antibody group is expanded into T new antibody groups of sizes Ni (i = 1, 2, ..., T).
4.3 Mutation of Antibody
The antibody mutation operator randomly selects one or more points of an antibody and inserts genes at certain positions to build a new antibody. Figure 2 illustrates the antibody mutation operator.
Fig. 2. Single point mutation illustration
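A minimal stand-in for the single-point mutation of Fig. 2 (our simplification: one position is re-drawn at random; the paper's operator may insert genes rather than overwrite them).

import random

def point_mutation(antibody, value_counts):
    # value_counts[k] = number of values of attribute k; 0 keeps the position unconstrained.
    child = list(antibody)
    pos = random.randrange(len(child))
    child[pos] = random.randint(0, value_counts[pos])
    return child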
4.4 The Fitness Function
During the execution of the algorithm, the excellent individuals whose support exceeds the support threshold are preserved as memory cells. The memory cells thus represent the patterns that satisfy the minimum support, from which the association rules meeting the minimum confidence requirement can easily be extracted. Since support and confidence are the statistics that reflect the credibility and performance of association rules, this algorithm chooses support as the filtering condition and confidence as the antigen-antibody affinity function, namely aff(Abi) = C, where C is the confidence.
5 Algorithm Realization
The algorithm generates attribute values randomly. Each attribute is set to 0 (left unconstrained) with probability pi and otherwise assigned a random value, an integer ranging from 1 to the number of its attribute values. When pi = 0 for an attribute, that attribute must appear in the mined association rules; otherwise it may be absent from the rules. So, to mine rules involving a specific attribute, its pi should be set to 0, while the remaining attributes take pi = 0.2~0.5. The main procedure of the ICARM algorithm is described as follows.
Step 1: Initialization. Create the initial population randomly to obtain N antibodies, Ab ∈ S^(N×L).
Step 2: Clustering. Cluster the antibody population Ab as described in Section 4 to obtain M antibody clusters. Perform competition within each cluster and put the antibodies with the maximum fitness, or those representing the cluster centres, into the elite set, composing a subpopulation Abe of size T (T = 2M), Abe ∈ S^(T×L).
Step 3: Clonal proliferation. Reproduce the population Abe to obtain the population Abc of size Nc, Abc ∈ S^(Nc×L).
Step 4: Mutation. Mutate the population Abc with probability pmc, and eliminate the individuals below the support threshold to obtain the population Abm of size Nc, Abm ∈ S^(Nc×L).
Step 5: Selection. Apply the selection operator to the population Abm: among individuals whose mutual distance is less than the suppression threshold σs, eliminate all but the one with the maximum fitness, to obtain the population Abd ∈ S^(Nd×L), Nd ≤ Nc.
Step 6: Suppression and supplement. Create Nr newcomers randomly and choose the Ns (Ns much smaller than Nr) individuals with better fitness to constitute the next population together with Abd. If an antibody meets the minimum support and confidence, its original attribute values are recovered and it is retained in the population.
Step 7: Convergence check. Repeat Steps 3-6 until most solutions are no longer improved.
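The following Python sketch mirrors the structure of Steps 1-7 above. It is schematic rather than the authors' implementation: the clustering, mutation, support and confidence routines are passed in as simplified stand-ins, records are assumed to be lists of attribute codes, and details such as the suppression threshold are omitted.

import random

def icarm(records, support_of, confidence_of, cluster, mutate,
          n_pop=100, iters=200, p_mut=0.2, min_supp=0.2, min_conf=0.5):
    population = [random.choice(records).copy() for _ in range(n_pop)]       # Step 1: random antibodies
    memory = []                                                              # memory cells = retained rules
    for _ in range(iters):
        elites = [max(c, key=confidence_of) for c in cluster(population)]    # Step 2: cluster competition
        per_elite = max(n_pop // max(len(elites), 1), 1)
        clones = [ab.copy() for ab in elites for _ in range(per_elite)]      # Step 3: clonal proliferation
        clones = [mutate(ab) if random.random() < p_mut else ab
                  for ab in clones]                                          # Step 4: mutation
        survivors = [ab for ab in clones if support_of(ab) >= min_supp]      # Steps 4-5: support filter
        newcomers = [random.choice(records).copy()
                     for _ in range(max(n_pop - len(survivors), 0))]         # Step 6: supplement
        population = survivors + newcomers
        memory += [ab for ab in survivors if confidence_of(ab) >= min_conf]  # keep strong rules
    return memory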
6 Experimental Simulation and Application
To verify the algorithm for mining association rules, we use the Iris dataset from the UCI machine learning repository. Iris contains 150 objects evenly distributed over three classes, with four continuous condition attributes. We use C5.0 to generate a rule set and compare it with ICARM. The results are shown in Tables 1 and 2.
Table 1. Iris rule sets based on C5.0
Rule 1 (covers 35 cases): Petal-Length ≤ 1.9 → class Iris-Setosa
Rule 2 (covers 32 cases): Petal-Length ≥ 1.9 ∧ Petal-Length ≤ 5 ∧ Petal-Width ≤ 1.6 → class Iris-Versicolor
Rule 3 (covers 29 cases): Petal-Width > 1.6 → class Iris-Virginica
Rule 4 (covers 28 cases): Petal-Length > 5 → class Iris-Virginica
Default class: Iris-Setosa
Table 2. Iris rule sets based on ICARM
Rule 1 (covers 35 cases): Petal-Length ≤ 1.9 → class Iris-Setosa
Rule 2 (covers 32 cases): Petal-Length ≥ 1.9 ∧ Petal-Length ≤ 4.89 ∧ Petal-Width ≤ 1.58 → class Iris-Versicolor
Rule 3 (covers 29 cases): Petal-Width > 1.6 → class Iris-Virginica
Rule 4 (covers 28 cases): Petal-Length > 4.81 → class Iris-Virginica
Default class: Iris-Setosa
From Tables 1 and 2, ICARM and C5.0 generate the same rules, with differences in Rules 2 and 4: in Rule 2 ICARM obtains 4.89 as the right boundary of Petal-Length and 1.58 as the boundary of Petal-Width, and in Rule 4 the left boundary of Petal-Length changes to 4.81. ICARM can therefore obtain more precise rules and improve their comprehensibility.
Table 3. Comparison of the rule precision
              C5.0 rule set                  ICARM rule set
              Training set   Testing set     Training set   Testing set
Setosa        25/25          25/25           25/25          25/25
Versicolor    24/25          23/25           24/25          24/25
Virginica     25/25          24/25           25/25          25/25
Total         74/75          72/75           74/75          73/75
Value         98.67%         96%             98.67%         98.67%
The results in Table 3 show that this algorithm is more accurate than C5.0 and that its rules are easy to query and understand. Another application, the evaluation of teaching quality, is used to illustrate ICARM. Table 4 gives part of the teaching evaluation information; there are 1800 records in total.
Table 4. Evaluation table of teaching quality
item   age   gender   title             grade
13     36    male     Associate prof    medium
50     43    male     Associate prof    good
76     31    female   Lecturer          good
99     37    male     Associate prof    excellent
135    41    female   Professor         excellent
245    51    male     Associate prof    good
257    29    male     Assistant         medium
……     ……    ……       ……                ……
Age is a numeric attribute in Table 4 and is converted to Boolean form by dividing it into four groups: C1: 21-30, C2: 31-40, C3: 41-50, C4: 51-60. Title and grade are categorical attributes and also need to be converted to Boolean form. According to the actual ranks, B1: Assistant, B2: Lecturer, B3: Associate Professor, B4: Professor; grade is divided into four groups, A1: excellent, A2: good, A3: medium, A4: bad.
Table 5. Data values after discretization
A1  A2  A3  A4  B1  B2  B3  B4  C1  C2  C3  C4
1   0   0   0   0   0   1   0   0   0   1   0
0   1   0   0   0   0   1   0   0   0   1   0
0   1   0   0   0   1   0   0   0   1   0   0
1   0   0   0   0   0   1   0   0   0   0   0
1   0   0   0   0   0   0   1   0   0   1   0
0   1   0   0   0   0   1   0   0   0   0   1
0   0   0   0   1   0   0   0   1   0   0   0
To mine association rules of the form A1 ∩ A2 ∩ … ⇒ Category with grade = excellent, we choose an initial population N = 100, iteration number r = 200, mutation probability Pm = 0.2, minsup = 20 and minconf = 5, with pi = 0 for the target attribute. The results after mining are as follows:
1. age(31~35) ⇒ category(excellent) [sup = 27.34%; conf = 9]
2. age(36~39) ⇒ category(excellent) [sup = 46.4%; conf = 13]
3. age(36~39) ∩ certified(senior) ⇒ category(excellent) [sup = 52.1%; conf = 24]
The Apriori algorithm and the Evolution Association Rule Mining (EAM) algorithm introduced in [8] were used for comparison with ICARM; their efficiency is shown in Table 6.
Table 6. Efficiency of different algorithms
Algorithm   Number of rules   Completeness   Time (s)
Apriori     18                100%           360
EAM         14.7              81.7%          45
ICARM       17.2              95.6%          49
From Table 6 we can see that the traditional Apriori algorithm obtains the largest number of rules, while ICARM and EAM each lose some rules. However, the computational complexity of Apriori increases quickly as the dimension grows. ICARM has an obvious advantage over EAM in rule extraction, and its computational cost is much lower than that of Apriori.
7 Conclusion
To address the efficiency problem of association rule mining, an immune cluster association rule mining algorithm based on the immune clonal selection mechanism has been proposed. Clonal selection and cluster competition expand the antibodies of high affinity, so that each antibody in the population can be optimized independently during the evolutionary process, and the new antibodies can search a wider range of the search space to achieve multi-modal local and global optimization, improving the accuracy and speed of association rule mining. The experiments show that this approach has a fast convergence rate and good global and local search ability, so that more association rules satisfying the given conditions can be obtained.
Acknowledgment. This work is supported by a grant from the National Natural Science Foundation of China (No. 60634020), by the Provincial Natural Science Foundation of Hunan (No. 05JJ40103) and by the Provincial Social Science Foundation of Hunan (No. 09ZDB080).
References 1. Agrawal, R., Imiclinski, T., Swami, A.: Database mining: A Performance Perspective. IEEE Trans. Knowledge and Data Enginnering 5, 914–925 (1993) 2. Agrawal, R., Srikant, R.: Fast Algorithm for Mining Association Rules. In: Proceeding 1994 International Conference Very Large Data Bases(VLDB 1994), Santiago, Chile, pp. 487–499 (1994) 3. Euihong, H., George, K., Kumar, V.: Scalable Parallel Data Mining for Association Rules. In: Proceeding of the ACM SIGMOD 1997, pp. 277–288. ACM Press, New York (1997) 4. Jiao, L., Du, H.: Development and Prospect of the Artificial Immune System. Acta Electronica Sinica 31(10), 1540–1548 (2003) 5. Liang, M., Liang, J., Guo, C.: Association rule mining algorithm based on artificial immune system. Computer Applications 24(8), 50–53 (2004)
6. Kim, J., Bentley, P.J.: Immune Memory in the Dynamic Clonal Selection Algorithm. In: Proceedings of the First International Conference on Artificial Immune Systems, pp. 57– 65. Universitv of Kent, Kent (2002) 7. Liu, F., Sun, Y.-j.: A Novel Association-Rule Mining Algorithm Based on the Polyclonal Selection Algorithm. Journal of Fudan University 43(5), 742–744 (2004) 8. Han, J., Kamber, M.: Data Mining: Concepts and Techniques (2001) 9. Gupta, G.K., Strehl, A., Ghosh, J.: Distance Based Clustering of Association rules. In: Proceedings of ANNIE, vol. (9), pp. 759–764. ASME Press (1999) 10. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proceedings of ACM-SIGMODE Int. Conf. Management of Data, pp. 1–12 (2000)
Face Recognition via Two Dimensional Locality Preserving Projection in Frequency Domain
Chong Lu1,3, Xiaodong Liu1, and Wanquan Liu2
1 School of Electronic and Information Engineering, DLUT, Dalian, 116024, China
2 Curtin University of Technology, Perth WA, 6102, Australia
3 YiLi Normal College, Yining, 835000, China
Abstract. In this paper we investigate the face recognition problem via using the two dimensional locality preserving projection in frequency domain. For this purpose, we first introduce the two-dimensional locality preserving projections (2DLPP) and the two dimensional discrete cosine transform (2DDCT). Then the 2DLPP in frequency domain is proposed for face recognition. In fact, the 2DDCT is used as a pre-processing step and it converts the image signal from time domain into frequency domain aiming to reduce the effects of illumination and pose on face recognition. Then 2DLPP is applied on the upper left corner blocks of the global 2DDCT transform matrices of the original images, which represent the central energy of original images. For demonstration, the Olivetti Research Laboratory (ORL), YALE and FERET face datasets are used to compare the proposed approach with the conventional 2DLPP and 2DDCT approaches with the nearest neighborhood (NN) metric being used for classifiers. The experimental results show that the proposed 2DLPP in frequency domain is superior over the 2DLPP in time domain and 2DDCT in frequency domain. Keywords: Two Dimensional Locality Preserving Projections, Two Dimensional Discrete Cosine Transform, Face Recognition.
1
Introduction
For face recognition, different subspace methods have been proposed, among which Principal Component Analysis (PCA) [1] and Linear Discriminant Analysis (LDA) [2] are the two most classical and fundamental. Recently, much research effort has shown that face images possibly reside on a nonlinear sub-manifold [3]. Locality preserving projection (LPP) [4] aims to find an embedding that preserves local information, and thus obtains a face subspace that better represents the essential face manifold structure. However, an intrinsic limitation of classical PCA, LDA and LPP is that all of them involve an eigen-decomposition, which is time-consuming for high-dimensional data. In essence, an image must be represented as a vector, which may cause the loss of some structural information residing in the original 2D images. To overcome this problem and improve the computational efficiency, Yang
et al. [5] proposed the 2DPCA method, which directly extracted feature from image matrix. Subsequently, Jing et al. [6] developed the 2DLDA method and then Chen et al. [7] proposed 2DLPP method with features being extracted from image matrix. Also the 2DLPP was used for face recognition in Zhu et al. [8] and Chen et al.[12]. Its principal idea is to compute the covariance matrix based on the 2D original training image matrices and achieve its optimal projection matrix iteratively. Lu et al. [9] established a necessary condition for an optimal projection matrix. Discrete cosine transform (DCT) has been used as a feature extraction method in various studies on face recognition, which can yield a significant reduction of computational time with better recognition rates [10][11][14][15]. This motivates us to use 2DLPP idea in frequency domain with an aim to produce better face recognition performance with low computational costs. In this paper, we introduce 2DLPP in frequency domain for face recognition. First, the 2D discrete cosine transform (2DDCT) has been used for feature extraction, then 2DLPP is applied only on the upper left corner blocks of the global 2DDCT transform matrices of the original images. The proposed approach is tested against conventional 2DLPP without 2DDCT and also tested against conventional 2DDCT in frequency domain, where the nearest neighborhood (NN) metric was used for classifiers. The rest of this paper is organized as follows. In Section 2, we first give a review of 2DLPP approach and 2DDCT, and then we propose 2DLPP in frequency domain. In Section 3, we report some experimental results. Finally, we conclude this paper in section 4.
2
The Proposed Approach
The proposed 2DLPP algorithm consists of two steps. The first one is 2DDCT which converts the image into frequency domain, and the second step is 2DLPP analysis, which implements face recognition in frequency domain. The details are presented in the following subsections. 2.1
2DDCT
The two-dimensional DCT (2DDCT) is a popular technique in image and video compression, which was first applied to image compression by Ahmed et al. [13]. In 1992, the first international standard for image compression, known as the Joint Photographic Experts Group (JPEG) standard, was established with the 2DDCT as encoder and decoder. The frequency technique has broad applications in face recognition, pattern recognition and object detection [16][17][18]. The definition of the 2DDCT for a W × H input image f(x, y) is given by [13]:

C(u, v) = a(u) a(v) Σ_{x=0}^{W−1} Σ_{y=0}^{H−1} f(x, y) cos[(2x + 1)uπ / 2W] cos[(2y + 1)vπ / 2H]

where u = 0, 1, 2, ..., W − 1 and v = 0, 1, 2, ..., H − 1. In many applications, its inverse transform is also used, which is given by

f(x, y) = Σ_{u=0}^{W−1} Σ_{v=0}^{H−1} a(u) a(v) C(u, v) cos[(2x + 1)uπ / 2W] cos[(2y + 1)vπ / 2H]

where a(u) = sqrt(1/W) for u = 0 and sqrt(2/W) otherwise, and a(v) = sqrt(1/H) for v = 0 and sqrt(2/H) otherwise.
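As an illustration only (this snippet is our own sketch, not code from the paper), the transform and the low-frequency corner block used later as the feature block can be computed with a separable DCT-II along both axes; the orthonormal scaling plays the role of a(u)a(v).

import numpy as np
from scipy.fftpack import dct

def dct2(image):
    # Orthonormal 2D DCT: 1D DCT-II along rows, then along columns.
    return dct(dct(image, type=2, norm='ortho', axis=0), type=2, norm='ortho', axis=1)

def top_left_block(coeffs, w):
    # Keep the w x w low-frequency corner used as the feature block.
    return coeffs[:w, :w]

img = np.random.rand(112, 92)            # stand-in for a face image
features = top_left_block(dct2(img), 50)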
Different approaches in the frequency domain can achieve better recognition results for face recognition [19][20][21]. This may be explained by the fact that, in the frequency domain, the effects of illumination and pose are reduced in terms of energy distribution. When the 2DDCT is applied to a face image, one obtains its DCT coefficients. In the experiments of this paper, different block sizes w × w were chosen for the ORL, YALE and FERET face databases, respectively, as reported in the next section. 2.2
2DLPP
LPP is an effective dimension reduction approach for face recognition. Currently, LPP is applied to the image features directly, either in the one-dimensional case [22] or in the two-dimensional case [23]. We now present the two-dimensional LPP briefly in the time domain. Let Ai ∈ R^(m×n), i = 1, 2, ..., N, be the N face image matrices. We aim to compute two matrices L ∈ R^(n×r) and R ∈ R^(m×c) with orthonormal columns such that the projected low-dimensional matrices preserve the locality property. The projected feature Y is the r × c matrix given by Y = L^T A R. Different projection matrices for the following objective function can be obtained by different optimization approaches. The objective function of 2DLPP is defined as

min J(L, R) = Σ_{i<j}^{N} ||Yi − Yj||_F^2 Wij

where ||·||_F^2 is the squared Frobenius norm. The weighting Wij is usually chosen as

Wij = exp(−||Ai − Aj||^2 / t) if Ai and Aj are in the same class, and Wij = 0 otherwise,
where t > 0 is a constant. Minimizing the objective function is therefore an attempt to ensure that, if Ai and Aj are in the same class, a similarity weighting Wij is placed on their projected distance, and otherwise the weighting is zero. The similarity weighting can be simplified in our experiments as Wij = 1 if Ai and Aj are in the same class. The optimization problem is nonlinear, and in general its global optimum cannot be attained. To obtain the two feature matrices L ∈ R^(n×r) and R ∈ R^(m×c), an efficient alternating algorithm was proposed in [8]; we restate it below.
Algorithm. To obtain the optimal solution L and R of the above minimization problem, the following two problems are solved alternately.
(1) For a given R, find L. This is solved through the following generalized eigenvalue problem [8]:
( Σ_{i,j}^{N} (Ai − Aj) R R^T (Ai − Aj)^T Wij ) L = λ_L ( Σ_{i,j}^{N} Ai R R^T Aj^T Wij ) L
L consists of the r eigenvectors associated with the minimum r eigenvalues in (1).
(2) For a given L, find R. The minimization problem is then equivalent to the following eigenvalue problem:
( Σ_{i,j}^{N} (Ai − Aj)^T L L^T (Ai − Aj) Wij ) R = λ_R ( Σ_{i,j}^{N} Ai^T L L^T Aj Wij ) R
R consists of the c eigenvectors associated with the minimum c eigenvalues in (2).
The 2DLPP algorithm is now stated in detail below.
Input: matrices {Ai}, r and c. Output: matrices L and R.
1. Let the initial L0 = (Ic, 0) for L and set i = 1.
2. While not convergent:
3. Form the matrices
MR1(L) = Σ_{i,j}^{N} (Ai − Aj)^T L_{i−1} L_{i−1}^T (Ai − Aj) Wij  and  MR2(L) = Σ_{i,j}^{N} Ai^T L_{i−1} L_{i−1}^T Ai Wij,
and compute the c eigenvectors of MR1(L) = λ_R MR2(L) corresponding to the minimum c eigenvalues.
4. Let Ri = [r1, r2, ..., rc].
5. Form the matrices
ML1(R) = Σ_{i,j}^{N} (Ai − Aj) R R^T (Ai − Aj)^T Wij  and  ML2(R) = Σ_{i,j}^{N} Ai R R^T Ai^T Wij,
and compute the r eigenvectors of ML1(R) = λ_L ML2(R) corresponding to the minimum r eigenvalues.
6. Let Li = [l1, l2, ..., lr].
7. i = i + 1.
8. End while.
9. L = L_{i−1}, R = R_{i−1}.
If the feature matrices of the training images are Y1, Y2, ..., YN (N is the total number of training images) and each image is assigned to a class Ci, then for a given test image A one can project it into the feature space using L and R, namely Y = L^T A R. If d(Y, Yl) = min_j d(Y, Yj) and Yl ∈ Ci, then the resulting decision is Y ∈ Ci.
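The sketch below illustrates one half-step of the alternating procedure, the update of R for a fixed L, assuming the transposes reconstructed above; it is our own illustration rather than the authors' code, it takes L with one row per image row, and the generalized eigensolver requires the second matrix to be well conditioned.

import numpy as np
from scipy.linalg import eigh

def update_R(images, W, L, c):
    # Build M_R1(L) and M_R2(L) of step 3 and keep the eigenvectors
    # corresponding to the c smallest generalized eigenvalues.
    n = images[0].shape[1]
    M1 = np.zeros((n, n))
    M2 = np.zeros((n, n))
    for i, Ai in enumerate(images):
        for j, Aj in enumerate(images):
            if W[i, j] == 0.0:
                continue
            D = Ai - Aj
            M1 += W[i, j] * (D.T @ L @ L.T @ D)
            M2 += W[i, j] * (Ai.T @ L @ L.T @ Ai)
    vals, vecs = eigh(M1, M2)      # eigenvalues returned in ascending order
    return vecs[:, :c]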
Fig. 1. Block diagram 2DLPP in DCT domain
2.3
The Proposed 2DDCT+2DLPP Approach
The proposed methodology uses the 2DDCT as a feature extraction and dimensionality reduction step, and then applies 2DLPP to the w × w upper-left corner blocks of the global 2DDCT transform matrices of the original images. Since the 2DDCT features in the frequency domain are more robust to variations in illumination and rotation than grey-level data in the time domain, we expect the face recognition performance to improve when 2DLPP is used in the frequency domain. We therefore propose the 2DLPP method based on 2DDCT features, which we name the 2DDCT+2DLPP method. In effect, the face images are filtered by a 2DDCT filter in order to enhance the recognition rate, and the two-dimensional projections of 2DLPP reduce the computational complexity by operating on small blocks in the frequency feature space. With this approach, we use only the sub-block containing the most important coefficients of the 2DDCT matrices, i.e., the block holding the most significant information. Compared to the DCT-based face recognition approach in [24], the advantage of 2DDCT is that the 2D structure is preserved and the dimensionality reduction is carried out directly on matrices. The block diagram describing the whole procedure for implementing 2DLPP in the 2DDCT domain is illustrated in Figure 1.
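Putting the pieces together, a minimal sketch of the 2DDCT+2DLPP pipeline for a single probe image might look as follows; the names and the Frobenius-distance nearest-neighbour classifier are our illustration of the procedure, not the authors' code.

import numpy as np
from scipy.fftpack import dct

def dct2(image):
    return dct(dct(image, type=2, norm='ortho', axis=0), type=2, norm='ortho', axis=1)

def dct_lpp_features(image, L, R, w=50):
    # 2DDCT -> keep the w x w low-frequency corner -> project with Y = L^T B R
    B = dct2(image)[:w, :w]
    return L.T @ B @ R

def nearest_neighbour_label(y, gallery):
    # gallery: list of (training feature matrix, label) pairs; Frobenius distance
    dists = [np.linalg.norm(y - g) for g, _ in gallery]
    return gallery[int(np.argmin(dists))][1]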
3
Experimental Results
In this section, we carry out several experiments to evaluate the performance of the proposed 2DDCT+2DLPP for face recognition against some state-of-the-art algorithms. We choose three benchmark datasets for evaluation in our experiments. The first is the ORL face database, acquired under various facial expressions and lighting conditions. The second dataset is the YALE face database, which
includes 165 images of 15 individuals (each person has 11 different images) under various facial expressions and lighting conditions with each images being cropped and resized to 231 × 195 pixels in experiment. The last face database is the FERET face database, which is a set of face images collected by NIST from 1993 to 1997. It contains training, gallery, and probe sets, and we select 49 individuals that have equal or more than 10 images and the image size is scaled down to 112 × 92 according to positions of eyes and noses from the original size of 640 × 480. We evaluate the performance of 2DLPP in DCT domain and we compare the results with those obtained by the conventional 2DLPP in time domain and 2DDCT methods. All the experiments are carried out on a PENTIUM 4 PC with 2.0 GHz CPU and 1 Gb memory. Matlab7 (Matlab, 2004) is used to carry out these experiments. These three datasets vary much since they are collected in different labs and under different conditions. These three datasets have their own characteristics in evaluating the performances of different face recognition algorithms [25][26]. 3.1
Results on the ORL Dataset
The ORL face dataset contains 400 images of 40 individuals, with 10 different images of size 112 × 92 pixels per person. For some subjects, the images were captured at different times. The facial expressions (open/closed eyes, smiling/not smiling) and facial details (glasses or no glasses) also vary considerably. In order to find the best dimensions for the projection matrices R and L on the ORL face dataset, we randomly selected the training samples five times; each time the selected five image samples per class were used for training and the remaining five images for testing, so the total number of training samples was 50. We take a block size of 50 × 50 in the DCT domain for 2DDCT+2DLPP. The recognition results are the mean of the five runs and are shown in Table 1. In Table 1, the dimension denotes the number r = c of the LPP projection matrices. We notice that all three methods perform best when r = c = 18 and that 2DDCT+2DLPP achieves the best result. Next we use r = c = 18 to evaluate the efficiency of the different algorithms. We compare the computational time used to recognize all testing images, taking n = 5 training samples and feature dimension r = c = 18; the average times over the five runs are shown in Table 2. It can be seen from Table 2 that the computation cost of 2DDCT+2DLPP is the lowest among the similar algorithms. Next we compare the performance of all five approaches on this dataset. We have randomly accessed the dataset five times: if we define the original
Table 1. Recognition rates with feature dimension from 14 to 26
Dimension     14      16      18      20      22      24      26
2DLPP         0.792   0.897   0.911   0.903   0.900   0.890   0.887
2DDCT         0.887   0.897   0.897   0.897   0.897   0.897   0.897
2DDCT+2DLPP   0.887   0.907   0.925   0.900   0.907   0.897   0.910
Table 2. Recognition time cost comparison on ORL (s)
2DLPP    2DDCT     2DDCT+2DLPP    2DDCT+2DPCA    2DDCT+2DLDA
16.84    161.65    16.39          26.96          26.37
Recognition accuracy
0.9 0.85 0.8 0.75 2D−LPP 2D−DCT 2D DCT+LPP 2D DCT+PCA 2D DCT+LDA
0.7 0.65 0.6
3
3.5
5 4.5 4 Number of training sample
5.5
6
Fig. 2. Comparison of five approaches under the ORL dataset
serial of ORL face database is [1,2,3,4,5,6,7,8,9,10] The five random accesses are denoted as [1,9,6,3,5,8,10,2,7,4], [3,10,5,7,1,6,9,2,8,4], [5,6,9,1,4,2,10,8,3,7], [4,1,5,7,10,6,9,2,8,3] and [6,9,3,5,8,10,2,1,7,4]. We select the number of training sample n = 3; 4; 5; 6, respectively, and all feature dimensions are r = c = 18 in the dataset ORL, since the algorithms reach high recognition accuracy in those dimensions. Then the rates of recognition accuracy in ORL database are show in Figure 2. It can be seen that the performance of 2DDCT + 2DLPP is the best among these algorithms. 3.2
Results on the YALE Dataset
The YALE dataset contains different facial expression or lighting configurations. Knowing that DCT is very sensitive to illumination change (especially, the first coefficients), we have considered this dataset in order to evaluate the performance of proposed technique when facial expression and lighting conditions are changed. In our experiments, in order to choose right block sizes of 2DDCT and feature numbers that could reach better recognition rates, we have used the average recognition rate for five times random implementations, each time we use five image samples per class for training and the remaining six images for testing. So, the total number of training samples was 75. In order to find out the right block size and feature dimensions for face recognition, different results are listed in Table 3. From Table 3, we noticed that the 2DDCT block size of 50 × 50 and feature dimension of r = c = 18 can achieve the best recognition rate. So, we take
278
C. Lu, X. Liu, and W. Liu
Table 3. The results of different 2DDCT block sizes and 2DLPP feature size on the YALE dataset 2DDCT size 2DDCT 2DDCT + 2DLP P 2DLP P 8×8 10 × 10 12 × 12 16 × 16 20 × 20 24 × 24 28 × 28 34 × 34 38 × 38 42 × 42 46 × 46 50 × 50 54 × 54
0.853 0.874 0.874 0.896 0.906 0.906 0.906 0.906 0.906 0.906 0.906 0.906 0.906
0.853(8 × 8) 0.894(10 × 10) 0.896(10 × 10) 0.916(12 × 12) 0.916(12 × 12) 0.918(14 × 14) 0.916(14 × 14) 0.918(16 × 16) 0.920(16 × 16) 0.922(16 × 16) 0.924(18 × 18) 0.929(18 × 18) 0.920(18 × 18)
0.561(8 × 8) 0.642(10 × 10) 0.756(10 × 10) 0.78(12 × 12) 0.81(14 × 14) 0.822(14 × 14) 0.822(16 × 16) 0.843(16 × 16) 0.863(18 × 18) 0.86(20 × 20) 0.883(20 × 20) 0.883(18 × 18) 0.872(18 × 18)
block size of 50 × 50 in 2DDCT domain, and take feature dimension r = c = 18 in the next experiment on YALE face dataset. The results for three typical algorithms are listed in Table 4. Next we compare the computational times used Table 4. Comparison accuracy with feature dimension from 14 to 26 14
16
18
20
22
24
26
r=c
0.872 0.895 0.895 0.900 0.900 0.894 0.887 2DLPP 0.905 0.905 0.905 0.905 0.905 0.905 0.905 2DDCT 0.901 0.907 0.922 0.920 0.930 0.917 0.911 2DDCT+2DLPP
to recognize all testing images via using different techniques, where we take the number of training samples n=5, and feature dimension r = c = 18. The results of the average on five time implementations are shown in Table 5. It can be seen from these results that 2DDCT+2DLPP is the best from both the recognition performance and computational time. For further illustration, we give the average performance results for five algorithms in Figure 3 with five random implementations as below. We have randomly accessed the dataset five times, if we define the original serial of YALE face database is [1,2,3,4,5,6,7,8,9,10,11], and Table 5. The recognition time cost comparison on YALE (s) 2DLP P 2DDCT 2DDCT + 2DLP P 2DDCT + 2DP CA 2DDCT + 2DLDA 3.54
47.26
3.39
5.69
5.47
Face Recognition via Two Dimensional Locality Preserving Projection
279
1 0.95
Recognition accuracy
0.9 0.85 0.8 0.75 0.7 0.65
2D−LPP 2D−DCT 2D DCT+LPP 2D DCT+PCA 2D DCT+LDA
0.6 0.55 0.5
3
4
5
8 7 6 Number of training sample
9
10
Fig. 3. Comparison of five approaches under the YALE dataset
the five random access serial are [11,5,6,3,10,8,4,7,2,9,1],[10,8,11,6,7,3,9,5,4,1,2], [2,3,11,8,6,4, 10,1,7,5,9], [9,3,5,6,4,7,2,8,1,10,11] and [6,3,11,7,8,5,1,2,4,9,10]. We select the number of training samples as n = 3; 4; 5; 6, respectively, and all feature dimension r = m = 18. One can see that 2DDCT+2DLPP still achieves the best. 3.3
Results on the FERET Dataset
The FERET face database contains thousands of face images. We select 49 individuals that have equal or more than 10 images in gray feret cd1 and gray feret cd2 and the images were scaled down to 112 × 92 according to positions of eyes and noses from the original size of 640 × 480. In our experiments, we have randomly accessed five times the chosen face images. We select feature dimension r = c = 10 and use the first 3rd,4th,5th,6th,7th,8th,9th images of each individual as training samples and the others as for testing in the dataset. Therefore the total numbers of testing samples were 343, 294, 245, 196, 147, 98, 49 respectively. We take block size of 30 × 30 in dct domain by 2DDCT and 2DDCT+2DLPP. The results of recognition are the mean of the five time implementations and they shown in Table 6. From Table 6, we can observe that the proposed 2DCT+2DLPP improves the recognition rate in each case though the percentage of improvement varies. Table 6. Comparison accuracy with feature dimension c=r=10 3rd
4th
T raining samples
0.590 0.654 0.667 0.671 0.689 0.710 0.731 2DLPP 0.519 0.558 0.592 0.615 0.635 0.659 0.682 2DDCT 0.581 0.654 0.669 0.678 0.697 0.720 0.757 2DDCT+2DLPP
280
3.4
Conclusions
In this paper, 2DLPP is introduced in the DCT domain. The main advantage of the DCT transform is that it reduces redundant information and can be used for feature extraction as well as dimensionality reduction. The computational complexity is therefore significantly reduced and very few coefficients are required for an efficient image representation in the reduced space. The experimental results show that the recognition rate of 2DLPP in the DCT domain is always better than that of its counterpart in the time domain and of the DCT technique alone. The testing time is also significantly lower than that of most related 2D algorithms. Acknowledgement. This research was supported by the University fund from the Xinjiang Government of China under Grant No. XJEDU2007I36 and the Natural Science Foundation of Xinjiang under Grant No. 2009211A10.
References 1. Turk, M., Pentland, A.: Eigenfaces for recognition. Cognitive Neuroscience 3(1), 71–86 (1991) 2. Belhumeur, P.N., Hespanha, J.P., Kriengman, D.J.K.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intelligence 19(7), 711–720 (1997) 3. Shashua, A., Levin, A., Avidan, S.: Manifold pursuit: A new approach to appearance based recognition. In: International Conference on Pattern Recognition, pp. 590–594 (2002) 4. He, X., Yan, S., Hu, Y., Niyogi, P., Zhang, H.-J.: Face Recognition Using Laplacianfaces. IEEE Trans. Pattern Analysis and Machine Intelligence 27(3), 328–340 (2005) 5. Yang, J., Zhang, D., Frangi, A.F., Yang, J.-y.: Two Dimensional PCA: A New Approach to Appearance Based Face Representation and Recognition. IEEE Trans. Pattern Analysis and Machine Intelligence 24(1), 131–137 (2004) 6. Jing, X., Wong, H., Zhang, D.: Face Recognition based on 2D Fisherface Approach. Pattern Recognition 39(4), 707–710 (2006) 7. Chen, S., Zhao, H., Kong, M., Luo, B.: 2D-LPP: A two-dimensional extension of locality preserving projections. Neurocomputing 70(1), 912–921 (2007) 8. Zhu, L., Zhu, S.-a.: Face recognition based on two dimensional locality preserving projections. Journal of image and graphics 12(11), 2043–2047 (2007) 9. Lu, C., Liu, W., An, S.: A simplified GLRAM algorithm for face recognition. In: ICIC 2008, pp. 450–460 (2008) 10. Chen, W., Er, M.J., Wu, S.: PCA and LDA in DCT domain. Pattern RecognitionLetters 26(15), 2474–2482 (2005) 11. Bengherabi, M., Mezai, L., Harizi, F.: 2DPCA-based techniques in DCT domain for face recognition. Int. J. Intelligent Systems Technologies and Applications 7(3), 243–265 (2009) 12. Wang, X.: Bilateral Two-Dimensional Locality Preserving Projections with Its Application to Face Recognition. In: Yu, W., He, H., Zhang, N. (eds.) ISNN 2009. LNCS, vol. 5553, pp. 423–428. Springer, Heidelberg (2009) 13. Ahmed, N., Natarajan, T., Rao, K.: Discrete cosine transform. IEEE Transactions on Computers 23(1), 90–93 (1974)
14. Tjahyadi, R., Liu, W., Venkatesh, S.: Application of the DCT energy histogram for face recognition. In: ICITA 2004, pp. 305–310 (2004) 15. Sanderson, C., Paliwal, K.K.: Fast features for face authentication under illumination direction changes. Pattern Recognition Letters 24, 2409–2419 (2003) 16. Zhao, W., Chellappa, R., Phillips, M.D.P.J., Rosenfeld, M.D.A.: Face recognition: A literature survey. ACM Computing Surveys (CSUR) 35(4), 399–458 (2003) 17. Liu, C., Wechsler, H.: Independent Component Analysis of Gabor Features for Face Recognition. IEEE Trans. Neural Networks 14(4), 919–928 (2003) 18. Lee, Y.-C., Chen, C.-H.: Feature Extraction for face Recognition Based on Gabor Filters and Two-Dimensional Locality Preserving Projections. In: Fifth International Conference on Intelligent Information Hiding and Multimedia Signal, pp. 106–109 (2009) 19. Eickeler, S., Mulller, S., Rigoll, G.: Recognition of JPEG compressed face images based on statistical methods. Image and Vision Computing 18(4), 279–287 (2000) 20. Yin, H., Fu, P., Qiao, J.: Face Recognition Based on DCT and 2DLDA. In: Huang, D.-S., Heutte, L., Loog, M. (eds.) ICIC 2007. LNCS, vol. 4681, pp. 581–584. Springer, Heidelberg (2007) 21. Hafed, Z.M., Levine, M.D.: Face Recognition Using the Discrete Cosine Transform. International Journal of Computer Vision 43, 167–188 (2001) 22. Yu, W., Teng, X., Liu, C.: Face recognition using discriminant locality preserving projections. Image and Vision Computing 24(3), 239–248 (2006) 23. Hu, D., Feng, G., Zhou, Z.: Two-dimensional locality preserving projections (2DLPP) with its application to palmprint recognition. Pattern Recognition 40(1), 339–342 (2007) 24. Sanderson, C., Paliwal, K.K.: Features for robust face-based identity verification. Signal Processing 83(5), 931–940 (2003) 25. Tolba, A.S., Abu-Rezq, A.N.: Combined Classifiers for Invariant Face Recognition. Pattern Analysis and Applications 3(4), 289–302 (2000) 26. Nam, M.Y., Bashar, R., Rhee, P.K.: Adaptive feature representation for robust face recognition using context-aware approach. Neurocomputing 70(4), 648–656 (2007)
Prediction of Protein-Protein Interactions Using Subcellular and Functional Localizations Yanliang Cai, Jiangsheng Yu, and Hanpin Wang Key Laboratory of High Confidence Software Technologies, Ministry of Education School of Electronic Engineering and Computer Science, Peking University, China Tel.: +86 10 82756376 {caiyl,yujs,whpxhy}@pku.edu.cn
Abstract. Protein-protein interaction (PPI) plays an important role in living organisms, and a major goal of proteomics is to determine the PPI networks of whole organisms. Both experimental and computational approaches to predicting PPIs are therefore urgently needed in the field of proteomics. In this paper, four distinct protein encoding methods are proposed, based on the biological significance extracted from the categories of protein subcellular and functional localizations. Several classifiers are then tested for predicting PPIs. To show the robustness of the classification and ensure the reliability of the results, each classifier is examined by many independent random experiments of 10-fold cross validation. The random forest model achieves promising performance in predicting PPIs.
1 Background
Proteomics research has become one of the landmarks of the post-genome era. More and more attention is being paid to exploring protein structure, function, subcellular localization, protein-protein interactions, and so on. In recent years, many high-throughput experimental methods, such as the yeast two-hybrid system [1] and mass spectrometry [2], have been used to identify PPIs and have generated a large number of PPI datasets. However, high false positive and false negative rates are often associated with high-throughput experiments [3]. Sprinzak et al. argued that the reliability of high-throughput yeast two-hybrid assays is about 50% [4], while some small-scale experiments have improved the reliability of PPI datasets [5]. Most approaches are time-consuming because of the enormous number of potential interactions. For instance, the size of the yeast interactome is estimated to be 10,000-16,600 [6]. In order to make PPI verification feasible, more efficient computational methods are still needed. The prediction of PPIs is usually modeled as the supervised learning of binary classifiers in pattern recognition and machine learning [7]. Several computational methods based on the primary structure of proteins have been studied in the last decade. For example, Nanni et al. encoded proteins with the statistical information of 2-mer amino acids and some physical and chemical properties of amino acids [8].
Corresponding author.
They tested a modified k-nearest neighbor (k-NN) model on the PPI prediction problem. Shen et al. classified the amino acids into seven classes and then tested a support vector machine (SVM) model on the statistical information of 3-mer amino acids [9]. Some other features, such as gene context, have also been used to encode proteins, including gene fusion [10], gene neighborhood [11], and phylogenetic profiles [12]. Nevertheless, only a small fraction of proteins have gene context information. In order to find more general methods, Jansen et al. encoded each protein with information from four different databases and some genomic features, such as the Rosetta compendium, cell cycle, GO biological process and MIPS (Munich Information Center for Protein Sequences) function [13]. Based on the same datasets, Lin et al. evaluated the possible combinations of these features and found that the combination of GO biological process and MIPS function yields a better result, while adding more features leads to worse performance instead [14]. Wu et al. used protein GO cellular component and biological process annotations to predict PPIs [15]. Again, protein co-localization and shared biological process were shown to be crucial for the prediction of PPIs. Chen and Liu used protein domains to predict PPIs, encoding each protein as a 4293-dimensional vector in which each dimension is a domain of the Pfam database; they achieved a sensitivity of 79.30% and a specificity of 62.8% [16]. In our work, the information of subcellular and functional localizations is utilized to encode the proteins. We propose four distinct encoding ways, based on which several classifiers are compared in performance. The random forest model is adopted since it performs robustly. The paper is organized into four sections. Section 2 introduces the materials and methods. The experimental results are presented in Section 3. Finally, conclusions are drawn in Section 4.
2 Materials and Methods
We focus on PPIs of the yeast Saccharomyces cerevisiae. Gold-standard positive datasets generally come from MIPS [17], DIP (Database of Interacting Proteins) [5], SGD (Saccharomyces Genome Database) [18], or their combinations. Ben-Hur and Noble built a highly reliable positive PPI dataset from DIP and MIPS [19], and we use it as our positive dataset as well. Jansen et al. synthesized negatives from lists of proteins in separate subcellular compartments in MIPS [13], based on the fact that proteins from different subcellular compartments are unlikely to interact. Ben-Hur and Noble found that Jansen's method can yield high-quality sets of non-interacting proteins but leads to biased estimates of prediction accuracy [19]. They therefore suggested selecting non-interacting pairs uniformly at random from the set of all protein pairs with no interaction information. Here we use the same negative dataset as suggested.
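The negative set described above amounts to uniform sampling over unannotated protein pairs. A minimal sketch of this idea follows; the inputs `proteins` and `positive_pairs` are hypothetical placeholders, and the code illustrates the sampling strategy rather than Ben-Hur and Noble's exact procedure.

```python
import random

def sample_negatives(proteins, positive_pairs, n_pairs, seed=0):
    """Draw protein pairs uniformly at random, excluding known interactions."""
    rng = random.Random(seed)
    known = {frozenset(p) for p in positive_pairs}
    negatives = set()
    while len(negatives) < n_pairs:
        a, b = rng.sample(proteins, 2)            # an unordered candidate pair
        if frozenset((a, b)) not in known:
            negatives.add(frozenset((a, b)))
    return [tuple(sorted(p)) for p in negatives]

# toy usage with made-up protein identifiers
pairs = sample_negatives(["YLR026C", "YAL005C", "YBR118W", "YDR155C"],
                         [("YLR026C", "YAL005C")], n_pairs=3)
```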
2.1 Subcellular Localization and Functional Localization
As mentioned earlier, proteins with similar localization are more likely to interact with each other. So far, much information on subcellular localization can be
utilized directly. In our experiments, we extract the yeast protein subcellular localizations from the CYGD database [17]. There are 56 categories in total in the CYGD subcellular classification catalogue (11.14.2005). The catalogue is structured as a tree, that is, every non-leaf node has one "parent" and some more specific "children". For an example of subcellular localization, see Table 1. In the literature, proteins that physically interact with each other, compose one protein complex, or take part in the same biological process always execute similar or complementary biological functions. Accordingly, we also extract the information of yeast protein functional localization from the CYGD database. There are 1362 categories in total in the CYGD functional classification catalogue (version 2.1, 09.01.2007). The functional localization has a similar tree structure to the subcellular localization (see Table 1).

Table 1. Partial schema of subcellular localization and functional localization

(a) Subcellular localization
  Category     Descriptions
  701          extracellular
  705          bud
  705.01       bud tip
  705.03       neck
  705.05
  710          cell wall
  715          cell periphery
  720          plasma membrane
  722          integral membrane / endomembranes
  725          cytoplasm
  730          cytoskeleton
  730.01       actin cytoskeleton
  730.01.03    actin patches
  730.03       tubulin cytoskeleton
  730.05       spindle pole body
  730.07       intermediate filaments

(b) Functional localization
  Category           Descriptions
  01                 METABOLISM
  01.01              amino acid metabolism
  01.01.03           assimilation of ammonia, metabolism of the glutamate group
  01.01.03.01        metabolism of glutamine
  01.01.03.01.01     biosynthesis of glutamine
  01.01.03.01.02     degradation of glutamine
  01.01.03.02        metabolism of glutamate
  01.01.03.02.01     biosynthesis of glutamate
  01.01.03.02.02     degradation of glutamate
  01.01.03.03        metabolism of proline
  01.01.03.03.01     biosynthesis of proline
  01.01.03.03.02     degradation of proline
Take the PPI network of protein YLR026C for example: the protein mainly localizes to the Golgi and ER, and its main function is vesicular transport between the Golgi and ER. There are 75 proteins in this network interacting with YLR026C, and more than 56 of them localize to the Golgi.
2.2 Four Encodings for the Data Representation
Each protein pair belongs to either the interaction class or the non-interaction class, thus the prediction of PPIs is usually formalized as a binary classification problem. In our approach, a protein pair is characterized by its subcellular and functional localizations. Each category in the subcellular and functional classification catalogue has its own biological significance. For instance, a protein located
at the Golgi membrane carries more positional information than one located at the Golgi, and a protein taking part in the biosynthesis of glutamine provides more information than a protein executing metabolism of glutamine. We therefore propose the following four encoding ways to represent the protein pairs.

Traditional Encoding. All the categories in the subcellular and functional localizations are treated equally. In this way, each protein pair is represented by a vector of 1418 features, in which the first 56 features are defined by the subcellular localization and the remaining ones by the functional localization. Let the data matrix $D = (P_1, P_2, \cdots, P_n)$ represent the $n$ training samples, and let the vector $P_i = (p_1^{(i)}, p_2^{(i)}, \cdots, p_{1418}^{(i)}, y_i)^T$ represent the $i$-th sample point with 1418 feature attributes $p_1^{(i)}, p_2^{(i)}, \cdots, p_{1418}^{(i)}$ belonging to the class $y_i$ = interaction or non-interaction. Each feature $p_j$ has an integer value of 0, 1, or 2. The feature value is 0 if and only if neither protein of the pair has this subcellular or functional localization, and a value of 2 indicates that both proteins have this attribute; otherwise, the feature value is 1. The biological relationship between features is not characterized in this encoding.

Path Encoding. In the GO annotations [20], there are two relationships, "is-a" and "part-of", between child and parent nodes, which means that the child terms are instances or components of the parent terms. Such relationships also exist in the subcellular and functional localizations. More precisely, in the subcellular localization, a protein located at the nuclear envelope also locates in the nucleus; in the functional localization, a protein with the biosynthesis of glutamine function also has the amino acid metabolism function. To involve this characteristic in the patterns, we propose a novel method called path encoding. For instance, consider the functional localization "01.01.06.05.01.01 biosynthesis of homocysteine": the vector $P_i$ is defined in the traditional way, and all the feature values associated with its parent terms are increased by 1. That is, the feature values of "01 METABOLISM", "01.01 amino acid metabolism", "01.01.06 metabolism of the aspartate family", "01.01.06.05 metabolism of methionine", and "01.01.06.05.01 biosynthesis of methionine" are each increased by 1. One of the advantages of path encoding is that it embodies the biological significance of different localizations.

Merging Encoding. Because of the dynamic nature of living organisms, some proteins may translocate from one place to another, such as from the nuclear envelope to the nuclear matrix. Some proteins may have multiple functions, such as "meiotic recombination" and "mitotic recombination". To reduce the dimensionality of the feature space, another encoding method, merging encoding, is proposed to merge some attributes by their biological significance. In the subcellular localization, the localizations at specific organelles are treated as independent attributes and proteins are described by the root localization. For example, proteins located at "ER lumen" and "ER membrane" are all regarded as "ER". As a result, the number of attributes is reduced to 20. In the functional localization, take "01.01.06.05.01.01 biosynthesis of homocysteine" for example, the
separator "." splits this localization into six hierarchical parts, and all functional localizations are abbreviated to their first four parts. So, "01.01.06.05.01 biosynthesis of methionine", "01.01.06.05.01.01 biosynthesis of homocysteine", "01.01.06.05.01.02 degradation of homocysteine", and "01.01.06.05.02 degradation of methionine" are all considered as the class "01.01.06.05 metabolism of methionine". In this way, the dimension of the functional localization is reduced to 1136, and the feature values are assigned as in the traditional encoding. In [4], Sprinzak et al. used a similar method to define the feature vector.

Path-Merging Encoding. The fourth encoding method tested in this paper is the combination of path encoding and merging encoding, called path-merging encoding, which defines the attributes by the method of merging encoding and assigns the feature values by the method of path encoding. With these four ways of data representation, we study the mechanism of PPIs with the random forest classifier and some other frequently-used classifiers.
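The traditional and path encodings can be illustrated with a small sketch. The toy category list and the `parents_of` helper below are hypothetical stand-ins for the CYGD catalogues; a real feature vector would have 1418 entries.

```python
# A toy sketch of the traditional and path encodings described above.
CATEGORIES = ["705", "705.01", "725", "01", "01.01", "01.01.03.01.01"]

def parents_of(code):
    """Ancestor codes of a category, e.g. '705.01' -> ['705']."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]

def traditional_encoding(loc_a, loc_b, categories=CATEGORIES):
    # loc_a, loc_b: sets of category codes annotated to the two proteins.
    # Feature value 0: neither protein, 1: exactly one, 2: both (as above).
    return [(c in loc_a) + (c in loc_b) for c in categories]

def path_encoding(loc_a, loc_b, categories=CATEGORIES):
    vec = traditional_encoding(loc_a, loc_b, categories)
    index = {c: i for i, c in enumerate(categories)}
    for loc in list(loc_a) + list(loc_b):
        for ancestor in parents_of(loc):          # propagate evidence upwards
            if ancestor in index:
                vec[index[ancestor]] += 1
    return vec

# pair annotated with "bud tip" and "biosynthesis of glutamine"
print(path_encoding({"705.01"}, {"01.01.03.01.01"}))
```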
2.3 Random Forest Classifier
The random forest classifier [21] constructs many decision trees simultaneously, each grown from a different subset of the training data. For each decision tree, the training data are randomly selected with replacement from the original training set, and each decision node uses a random subset of the features. The classification of a new sample point is based on the voting of all trees in the forest and is finally determined by the majority. The random forest classifier is adopted in the present work to predict PPIs in the light of its robustness and high performance. All the experiments are implemented in the environment of Weka, a convenient tool for data analysis and machine learning [22].
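Purely as an illustration of the same setup (200 trees, 10-fold cross validation): the paper uses Weka, but an equivalent sketch in scikit-learn might look like the following, with random placeholder data standing in for the encoded protein pairs and their labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(500, 1418))     # encoded pairs (placeholder)
y = rng.integers(0, 2, size=500)             # 1 = interaction, 0 = non-interaction

clf = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)
scores = cross_val_score(clf, X, y, cv=10)   # one 10-fold cross-validation run
print("mean accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```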
3 Results and Discussion
We obtained 4837 interacting protein pairs and 9674 non-interacting protein pairs from Ben-Hur and Noble's work [19], in which the interacting proteins were obtained from the intersection of the DIP and MIPS databases. The subcellular and functional localizations of each protein are drawn from the CYGD database [17]. After removing the proteins without subcellular and functional localizations, 4615 interacting protein pairs and 8589 non-interacting protein pairs remain. Let TP denote the number of positive data (interacting proteins) predicted positive, FN the number of positive data predicted negative, TN the number of negative data predicted negative, and FP the number of negative data predicted positive. The criteria specificity $= \mathrm{TN}/(\mathrm{TN}+\mathrm{FP})$, sensitivity $= \mathrm{TP}/(\mathrm{TP}+\mathrm{FN})$, and accuracy $= (\mathrm{TN}+\mathrm{TP})/(\mathrm{TP}+\mathrm{FN}+\mathrm{TN}+\mathrm{FP})$ are used to evaluate the performance of PPI prediction in the following content. Another measure of the performance is the ROC area, which equals 1 for a perfect test and 0.5 for a worthless test. In the random forest model, the number of trees is tuned by random experiments of 10-fold cross validation.
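For illustration, the criteria above can be computed directly from the four counts; the values below are hypothetical and not results from this paper.

```python
# Hypothetical confusion-matrix counts, used only to show the formulas above.
TP, FN, TN, FP = 90, 10, 180, 20

sensitivity = TP / (TP + FN)                  # true positive rate
specificity = TN / (TN + FP)                  # true negative rate
accuracy = (TP + TN) / (TP + FN + TN + FP)    # overall correctness
print(sensitivity, specificity, accuracy)     # 0.9 0.9 0.9
```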
Fig. 1. Percent-incorrect rate of different tree numbers in random forest model
As illustrated in Fig. 1, the percent-incorrect rate decreases as the number of trees increases and reaches a minimum when the tree number is 200. We ran Weka on a computer with an 8-core Intel(R) Xeon 3.00 GHz CPU and 4 GB of memory, and it takes 4232.56 seconds to build the model of 200 trees. We compare the classification methods of decision tree (C4.5), k-nearest neighbor (k-NN, with k = 3), naive Bayes, and support vector machine (SVM, with Gaussian radial basis function kernel) with the random forest model. All the experiments are implemented on the Weka platform. Tested by random experiments of 10-fold cross validation, the random forest approach achieves the best performance (see Table 2).

Table 2. The comparison between distinct classifiers by 10-fold cross validation. The random forest method exceeds the other four frequently-used classifiers in performance.

  Classifier    Random forest   Decision tree (C4.5)   k-NN (k = 3)   Naive Bayes   SVM
  Sensitivity   77.5%           76.0%                  77.5%          64.8%         26.7%
  Specificity   95.1%           90.7%                  88.0%          80.3%         98.4%
  Accuracy      89.0%           85.6%                  84.3%          74.9%         73.3%
  ROC area      0.934           0.849                  0.881          0.798         0.625
The comparison between the four distinct encoding methods shows that the traditional and path encodings yield the best performance (see Table 3).

Table 3. The results of 10-fold cross validation of random forest in the four encoding ways

  Encoding method   Traditional   Path    Merging   Path-merging
  Sensitivity       77.3%         77.5%   77.0%     77.5%
  Specificity       95.4%         95.1%   94.9%     94.8%
  Accuracy          89.1%         89.0%   88.6%     88.8%
  ROC area          0.934         0.934   0.93      0.93
Table 4. The importance comparison between subcellular and functional localizations

  Features      Subcellular   Functional   Both
  Sensitivity   55.9%         77.5%        77.5%
  Specificity   84.7%         94.2%        95.1%
  Accuracy      74.7%         88.4%        89.0%
  ROC area      0.802         0.924        0.934
As summarized in [7], the more detailed the encoding, the more desirable it is for the prediction of PPIs. To compare the importance of subcellular localization and functional localization for PPI prediction, we also test feature vectors defined only by the subcellular localization or only by the functional localization. It appears that the functional localization is more important for predicting PPIs than the subcellular localization (see Table 4). Finally, we examine the robustness of the random forest algorithm by means of several random experiments of 10-fold cross validation in the path and traditional encoding ways, respectively. As illustrated in Fig. 2, the mean accuracies are 89.0% and 88.9%, and the standard deviations are 0.00749 and 0.00763, which shows that the random forest classifier is robust in predicting PPIs. What is more, for all classifiers tested in this paper, the path encoding appears slightly better than the other three encoding ways. This is because the path encoding incorporates additional subcellular and functional localization information about the ancestor nodes of each localization.
Fig. 2. The accuracy histograms of the random forest classifier, tested by independent trials of 10-fold cross validation, in the path and traditional encoding ways
4 Conclusions
In the present work, we propose four encoding ways to draw the biological information from the protein subcellular and functional localizations for the purpose
of data representation of interacting protein pairs. Several frequently-used classifiers are tested to predict PPIs; the random forest algorithm behaves robustly and achieves the best performance over many independent random experiments of 10-fold cross validation, yielding a specificity of 95.1% and a sensitivity of 77.5% on average. Further experiments on larger PPI datasets of yeast and human will be considered in future work.
Acknowledgements This research was funded by the 985 project of Peking University, titled with “statistical machine learning and its applications to bioinformatics” (No. 048SG/ 46810707-001), and also the Beijing Natural Science Foundation (No. 4032013).
References 1. Uetz, P., Giot, L., Cagney, G., Mansfield, T.A., Judson, R.S., Knight, J.R., Lockshon, D., Narayan, V., Srinivasan, M., Pochart, P., Qureshi-Emili, A., Li, Y., Godwin, B., Conover, D., Kalbfleisch, T., Vijayadamodar, G., Yang, M.J., Johnston, M., Fields, S., Rothberg, J.M.: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623–627 (2000) 2. Gavin, A.C., Bösche, M., Krause, R., Grandi, P., Marzioch, M., Bauer, A., Schultz, J., Rick, J.M., Michon, A.M., Cruciat, C.M., Remor, M., Höfert, C., Schelder, M., Brajenovic, M., Ruffner, H., Merino, A., Klein, K., Hudak, M., Dickson, D., Rudi, T., Gnau, V., Bauch, A., Bastuck, S., Huhse, B., Leutwein, C., Heurtier, M.A., Copley, R.R., Edelmann, A., Querfurth, E., Rybin, V., Drewes, G., Raida, M., Bouwmeester, T., Bork, P., Seraphin, B., Kuster, B., Neubauer, G., Superti-Furga, G.: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141–147 (2002) 3. Mrowka, R., Patzak, A., Herzel, H.: Is there a bias in proteome research? Genome Research 11, 1971–1973 (2001) 4. Sprinzak, E., Sattath, S., Margalit, H.: How reliable are experimental protein-protein interaction data? Journal of Molecular Biology 327(5), 919–923 (2003) 5. Xenarios, I., Salwiński, L., Duan, X.J., Higney, P., Kim, S.M., Eisenberg, D.: DIP, the database of interacting proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Research 30(1), 303–305 (2002) 6. Sprinzak, E., Margalit, H.: Correlated sequence-signatures as markers of protein-protein interaction. Journal of Molecular Biology 311(4), 681–692 (2001) 7. Qi, Y.J., Joseph, Z.B., Seetharaman, J.K.: Evaluation of different biological data and computational classification methods for use in protein interaction prediction. PROTEINS: Structure, Function, and Bioinformatics 63, 490–500 (2006) 8. Nanni, L., Lumini, A.: An ensemble of k-local hyperplanes for predicting protein-protein interactions. Bioinformatics 22(10), 1207–1210 (2006) 9. Shen, J.W., Zhang, J., Luo, X.M., Zhu, W.L., Yu, K.Q., Chen, K.X., Li, Y.X., Jiang, H.L.: Predicting protein-protein interactions based only on sequences information. PNAS 104(11), 4337–4341 (2007) 10. Marcotte, E., Pellegrini, M., Ng, H.L., Rice, D.W., Yeates, T.O., Eisenberg, D.: Detecting protein function and protein-protein interactions from genome sequences. Science 285, 751–753 (1999)
11. Dandekar, T., Snel, B., Huynen, M., Bork, P.: Conservation of gene order: a fingerprint of proteins that physically interact. Trends in Biochemical Sciences 23(9), 324–328 (1998) 12. Pellegrini, M., Marcotte, E.M., Thompson, M., Eisenberg, J., Yeates, T.O.: Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl. Acad. Sci. USA 96, 4285–4288 (1999) 13. Jansen, R., Yu, H.Y., Greenbaum, D., Kluger, Y., Krogan, N.J., Chung, S., Emili, A., Snyder, M., Greenblatt, J.F., Gerstein, M.: A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 302(5644), 449–453 (2003) 14. Lin, N., Wu, B.L., Jansen, R., Gerstein, M., Zhao, H.Y.: Information assessment on predicting protein-protein interactions. BMC Bioinformatics 5, 154 (2004) 15. Wu, X.M., Zhu, L., Guo, J., Zhang, D.Y., Lin, K.: Predicting protein-protein interactions based only on sequences information. Nucleic Acids Research 34(7), 2137–2150 (2006) 16. Chen, X.W., Liu, M.: Prediction of protein-protein interactions using random decision forest framework. Bioinformatics 21(24), 4394–4400 (2005) 17. Mewes, H.W., Amid, C., Arnold, R., Frishman, D., Güldener, U., Mannhaupt, G., Münsterkötter, M., Pagel, P., Strack, N., Stümpflen, V., Warfsmann, J., Ruepp, A.: MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Research 32, 41–44 (2004) 18. Michael Cherry, J., Ball, C., Weng, S., Juvik, G., Schmidt, R., Adler, C., Dunn, B., Dwight, S., Riles, L., Mortimer, R.K., Botstein, D.: Genetic and physical maps of Saccharomyces cerevisiae. Proteins 387(suppl.), 67–73 (1997) 19. Ben-Hur, A., Noble, W.S.: Choosing negative examples for the prediction of protein-protein interactions. BMC Bioinformatics 7(suppl. 1), S2 (2006) 20. The Gene Ontology Consortium: Gene ontology: tool for the unification of biology. Nature Genetics 25(1), 25–29 (2000) 21. Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001) 22. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: an update. SIGKDD Explorations 11(1) (2009)
Nucleosomes Are Well Positioned at Both Ends of Exons Hongde Liu and Xiao Sun State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, China
[email protected]
Abstract. Chromatin structure has an important role in gene regulation, and transcription elongation is closely coupled with splicing in vivo in eukaryotes. In this paper, nucleosomes near splice sites are predicted for 13 species with the curvature profile. The results indicate that nucleosomes are well positioned at both ends of exons. The nucleosome at the 5' end is more conserved than that at the 3' end, which probably has a link with alternative splicing. The distance between the nucleosome centre and the splice site varies among species, suggesting evolutionary selection. Our analysis reveals that the nucleosomes positioned at both ends of exons have a role not only in protecting splice sites, but also in splicing itself, by placing a barrier at exon ends. Moreover, it is revealed that the DNA sequence plays an important role in determining nucleosomes at the boundaries of exons.
1 Introduction
Nucleosome positioning refers to the position of the DNA helix with respect to the histone core. Positioning has an important role in gene transcription, since packing DNA into nucleosomes affects the accessibility of proteins [1]. RNA splicing is a vital process for eukaryotes, and studies show that transcription elongation is closely coupled with splicing in vivo [2-3]. Thus, it is important to study the mechanism of splicing on the basis of nucleosome organization. Recently, the nucleosome positioning pattern has been extensively investigated in promoter regions. A typical nucleosome-free region (NFR) is revealed in human, fly, worm and yeast [1, 4]. Tilgner et al. found that positioning is stronger in exons with weak splice sites [5]. In a genome-wide analysis, distinct peaks of nucleosomes and methylation are observed at both ends of a protein coding unit [6], suggesting that polymerases tend to pause near both coding ends. Another study showed that there is a higher nucleosome-positioning signal in internal human exons and that this positioning is independent of gene expression [7]. Previous studies also suggested that nucleosomes have a role in protecting splice sites [8]. Since the report of the nucleosome positioning code (an ~10 bp repeating pattern of dinucleotides AA-TT-TA/GC) in yeast [9], some models for predicting nucleosomes have been developed using DNA sequence properties [9-12]. The successful predictions suggest that DNA sequences partly encode nucleosomes themselves, although some deviations are observed between the predicted and the experimentally determined positions [9]. We wonder whether the characteristics of nucleosomes in the vicinity of splice sites can be revealed with DNA sequence-based predictions.
In this paper, using the curvature profile [13], a DNA sequence-based nucleosome prediction model developed by us, nucleosomes around splice sites are predicted for 13 species. The positioning characteristics are thoroughly investigated, and the results demonstrate the important roles of nucleosome positioning in splicing.
2 Method and Datasets
2.1 Nucleosome Prediction Model
In our previous work [8], it was found that the weakly bound dinucleotides (AA, TT, AT and TA) of nucleosome core DNA sequences are spaced with smaller intervals (≈10.3 bp) at the two ends of the nucleosome (each end is 50 bp), with larger spacing (≈11.1 bp) in the middle section (47 bp). This suggests that the two ends have a large curvature and the middle region has a small curvature, which is called the curvature pattern. Using the curvature pattern, we constructed a nucleosome prediction model, the curvature profile. In the model, nucleosome prediction includes three steps (see Fig. 1):
(Flowchart: DNA sequence → curvature curve via Eq. 1 → convolution with the curvature pattern signal → curvature profile)
Fig. 1. Illustration of prediction procedure using the curvature pattern
(1) Calculating the curvature curve for a given DNA sequence with Eq. 1 [13].
$C = \nu^{0}\,(n_2 - n_1)^{-1} \sum_{j=n_1}^{n_2} (\rho_j - i\tau_j)\,\exp\!\left(\frac{2\pi i j}{\nu^{0}}\right)$   (1)
where $\nu^{0}$ is the average double-helix periodicity (10.4 bp) and $(n_2 - n_1)$ is the number of integration steps. Values of the roll $\rho$ and tilt $\tau$ angles of the sixteen dinucleotide steps are listed in Table 1.
(2) Performing a convolution of the curvature curve and the curvature pattern signal (the numerical representation of the curvature pattern). The convolution signal is called the curvature profile. If a segment of the curve resembles the pattern signal, the convolution gives a peak at the corresponding position, indicating a nucleosome.
(3) Finding the positions of the peaks of the curvature profile and predicting nucleosomes. (A code sketch of these steps is given after Table 1.)

Table 1. Values of roll $\rho$ and tilt $\tau$ angles of the sixteen dinucleotide steps
  Dinucleotide   ρ        τ        Dinucleotide   ρ        τ
  A->A           -0.09    0        C->G           -0.07    0
  A->T            0.16    0        G->C            0.10    0
  T->A           -0.12    0        G->G            0.01   -0.02
  C->A           -0.04   -0.03     T->G           -0.04    0.03
  G->T            0.12   -0.01     A->C            0.12    0.01
  C->T            0.04    0.03     A->G            0.04   -0.03
  G->A           -0.02   -0.03     T->C           -0.02    0.03
  C->G           -0.07    0        C->C            0.01    0.02
2.2 Distribution of Nucleosomes around Splice Sites
Thirteen species, including human, chimp, dog, cat, chicken, lizard, lancelet, mouse, opossum, zebrafish, worm, fly and yeast, are involved in our analysis. DNA sequences are retrieved from UCSC (http://genome.ucsc.edu/). Each sequence is 1200 bp long, containing 600 bp upstream (5') of the splice site and 600 bp downstream (3') of the splice site. More than 7000 sequences are used for each species. For each of the 13 species, the characteristics of nucleosomes around splice sites are computed as follows. First, the curvature profile of each sequence is computed; then each curvature profile is aligned at the splice site. Finally, the aligned curvature profiles are summed, smoothed with a 13-point window, and averaged.
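A minimal sketch of this aggregation step follows, assuming each entry of `profiles` is a per-sequence curvature profile already centred on its splice site (hypothetical input).

```python
import numpy as np

def average_positioning_signal(profiles, window=13):
    aligned = np.vstack(profiles)                      # rows aligned at the splice site
    total = aligned.sum(axis=0)                        # summed signal
    kernel = np.ones(window) / window
    smooth = np.convolve(total, kernel, mode="same")   # 13-point smoothing
    return smooth / len(profiles)                      # averaged signal
```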
3 Results and Discussion
The prediction performance of the curvature profile is described in our previous work [8]. In predicting nucleosomes for S. cerevisiae chromosome I, the curvature profile achieves a positive accuracy of 59.51% and a sensitivity of 69.53% [8]. Moreover, the curvature profile has been shown to be a multi-species model, and we provide an online tool of the model (http://www.gri.seu.edu.cn/icons). A detailed test of the model will not be presented in this paper. In order to investigate the characteristics of nucleosome positioning at splice sites, positioning signals (curvature profiles) are computed for the 13 species respectively. The aligned signals are shown in Fig. 2 and Fig. 3 (Fig. 2 is for human, worm and yeast). A strong positioning signal is observed at both ends of exons. The
result is consistent with the findings in the literature [5-7]. A previous study revealed that nucleosomes in the vicinity of splice sites have a role in protecting these special sites. We speculate that the configuration of nucleosomes at both ends of exons plays a role not only in protecting splice sites, but also in the splicing process: the well-positioned nucleosomes provide a barrier in transcription elongation, ensuring exact splicing. Nucleosome-free regions (NFRs) are observed near the well-positioned nucleosomes (Fig. 2 and Fig. 3); these NFRs potentially allow access of the splicing complex. Except for yeast, cat and zebrafish, there is no clear indication of a well-positioned nucleosome at both ends of introns. More importantly, the curvature profile is a prediction model based on DNA sequence alone. The features of the nucleosomes predicted by the curvature profile at exon boundaries are consistent with those determined experimentally [5-7], strongly suggesting that the DNA sequence has an important role in determining nucleosome positions.
Fig. 2. Nucleosome positioning near the donor sites (a, c and e) and the acceptor sites (b, d and f): a and b for human, c and d for worm, e and f for yeast; the y axis indicates the averaged curvature profile signal
Distances between the nucleosome centre (dyad site) and the splice sites are listed in Table 2. Splice sites are located at the ends of nucleosomes in the human, chimp, dog, cat, chicken, lizard, lancelet, mouse and opossum genomes (Fig. 3), which is basically in agreement with previous reports [7]. The nucleosome at the 5' end of the exon (near the acceptor site) is more conserved than that at the 3' end (near the donor site): the variances of the nucleosome positions at the 5' end and the 3' end are 49 and 85, respectively.

Table 2. Distance between splice site and nucleosome dyad centre
  Species      Distance between donor site and     Distance between acceptor site and
               nucleosome dyad centre (bp)         nucleosome dyad centre (bp)
  human        -66                                  63
  chimp        -57                                  73
  dog          -64                                  63
  cat           10                                  70
  chicken      -67                                  73
  lizard       -59                                  65
  lancelet     -73                                  63
  mouse        -63                                  73
  opossum       73                                 -57
  zebrafish      6                                 147
  worm         -90                                  54
  fly          -75                                 -23
  yeast       -300                                  26
  variances     85                                  49
To our knowledge, the first and last exons should be constitutively included in the mRNA, since skipping them would cause translation failure [2-3, 5-7]. Internal exon/intron junctions are related to alternative splicing. According to the "first come, first served" rule, if a nucleosome is well positioned at the 3' end of an internal exon, its immediate intron will probably be cut out in splicing. Therefore, the stability of nucleosomes at the 3' ends of internal exons is closely related to alternative splicing. A clustering analysis of the positioning signals of the 13 species is shown in Fig. 3. For the nucleosome at the 3' end of the exon, its position moves gradually closer to the donor site in the order worm, fly, opossum, chimp, mouse, chicken, human, dog, lancelet and lizard. For the nucleosome at the 5' end of the exon, its position moves gradually closer to the acceptor site in the order yeast, fly, lizard, dog, worm, human, chicken, chimp, opossum, mouse and lancelet. In both rankings, mammals occupy the middle positions, where the distance of the nucleosome centre to the splice site is moderate. Lower organisms lie at the two ends of the rankings: their nucleosome positions are either far from or very close to the splice sites. The results suggest that an evolution-related mechanism is involved in the nucleosome configuration near splice sites.
Fig. 3. A clustering analysis of the nucleosome positioning signals around the donor (left) and acceptor (right) sites for the 13 species; red indicates large values and green small values; black curves under each heat map show the average signals, and ellipses indicate nucleosomes
It should be pointed out that nucleosome positioning is dynamic to a certain degree, and our prediction only shows the characteristics of nucleosomes determined by the DNA sequence [13]. Nucleosomes at both ends of exons will probably be depleted under specific conditions. It is speculated that nucleosome positioning is one way of realizing alternative splicing in cells.
4 Conclusions
The characteristics of nucleosomes around splice sites are investigated with the curvature profile. The results indicate that nucleosomes are positioned at the boundaries of exons, placing a barrier for transcription elongation; our analysis suggests that nucleosome positioning has a role in splicing. The differences in nucleosome positions observed among species may be attributed to evolution. The DNA sequence plays an important role in positioning nucleosomes.
Acknowledgements. This work is supported by the National Natural Science Foundation of China (Grant Nos. 60671018 and 30800209).
References 1. Jiang, C., Pugh, B.F.: Nucleosome positioning and gene regulation: advances through genomics. Nat. Rev. Genet. 10(3), 161–172 (2009) 2. Emanue, R., Benjamin, J.B.: Gene Expression: The Close Coupling Dispatch of Transcription and Splicing. Curr. Biol. 12(9), R319–R321 (2002) 3. Fong, Y.W., Zhou, Q.: Stimulatory effect of splicing factors on transcriptional elongation. Nature 414, 929–933 (2001)
4. Schones, D.E., Cui, K., Cuddapah, S., Roh, T.Y., Barski, A., Wang, Z.B., Wei, G., Zhao, K.J.: Dynamic Regulation of Nucleosome Positioning in the Human Genome. Cell 132, 887–898 (2008) 5. Tilgner, H., Nikolaou, C., Althammer, S., Sammeth, M., Beato, M., Valcárcel, J., Guigó, R.: Nucleosome positioning as a determinant of exon recognition. Nat. Struct. Mol. Biol. 16(9), 996–1001 (2009) 6. Jung, K.C., Bae, J.B., Lyu, J., Kim, T.Y., Kim, Y.J.: Nucleosome deposition and DNA methylation at coding region boundaries. Genome Biol. 10, R89 (2009) 7. Andersson, R., Enroth, S., Rada-Iglesias, A., Wadelius, C., Komorowski, J.: Nucleosomes are well positioned in exons and carry characteristic histone modifications. Genome Res. 19, 1732–1741 (2009) 8. Kogan, S., Trifonov, E.N.: Gene splice sites correlate with nucleosome positions. Gene 352, 57–62 (2005) 9. Segal, E., Mittendorf, Y.F., Chen, L., Thåström, A., Field, Y., Moore, I.K., Wang, J.Z., Widom, J.: A genomic code for nucleosome positioning. Nature 442(17), 772–778 (2006) 10. Yuan, G.C., Liu, J.S.: Genomic sequence is highly predictive of local nucleosome depletion. PLoS Comput. Biol. 4, e13 (2008) 11. Miele, V., Vaillant, C., d'Aubenton-Carafa, Y., Thermes, C., Grange, T.: DNA physical properties determine nucleosome occupancy from yeast to fly. Nucleic Acids Res. 36, 3746–3756 (2008) 12. Tolstorukov, M.Y., Choudhary, V., Olson, W.K., Zhurkin, V.B., Park, P.J.: nuScore: a web interface for nucleosome positioning predictions. Bioinformatics 24, 1456–1458 (2008) 13. Liu, H.D., Wu, J.S., Xie, J.M., Yang, X.N., Lu, Z.H., Sun, X.: Characteristics of nucleosome core DNA and their applications in predicting nucleosome positions. Biophys. J. 94(12), 4597–4604 (2008)
An Evaluation of DNA Barcoding Using Genetic Programming-Based Process Masood Zamani and David K.Y. Chiu School of Computer Science, University of Guelph, Guelph, Ontario, Canada {Masood.Zamani,David.Chiu}@socs.uoguelph.ca
Abstract. DNA barcoding is a promising technique for the identification of biological species based on a relatively short sequence of the COI gene. One research direction for improving DNA barcoding is to study classification techniques that can handle common properties of DNA and amino acid sequences, such as the variable lengths of gene sequences, and that allow the comparison of different reference genes. In this study, we evaluate a classification model for DNA barcoding induced by genetic programming. The proposed method can be adapted to both DNA and amino acid sequences. Its performance is evaluated on DNA sequences, amino acid sequences, and a representation based on amino acid properties. The proposed method also identifies common significant sites on the reference genes which are useful for differentiating between species. Keywords: DNA barcoding, genetic programming, classification, simulation.
1 Introduction
DNA barcoding, or the Barcode of Life project [1], aims to develop a reliable and convenient method for quickly classifying known biological species without using their morphological traits. The method relies on one or more reference genes, such as the mitochondrial gene cytochrome c oxidase I (COI). The COI barcode is a 648-bp region from the 5' half of the gene. One advantage is that the DNA barcoding technique aims to improve the discovery and identification of new species. A basis of the DNA barcoding genes is their presumed fast mutation rate: the selected genes, such as COI, show higher variation than other genes. The COI gene has been used for classifying animals, algae and many other eukaryotes. However, in plants the success rate of DNA barcoding is lower than in the aforementioned groups, since a combination of the rbcL and matK genes is required, as shown by the study conducted in [2]. DNA barcoding becomes even more complicated in fungi. On the other hand, a number of studies indicate that it is hard to generalize DNA barcoding as providing sufficient information above the species level, and even at the species level [3]. DNA barcoding data consist of relatively long sequences of variable length, in some cases with missing values, and for a number of species more than one reference gene might be required. In general, the classification of DNA barcoding sequences at a taxonomic level such as family or species can be categorized into four main issues. The first is a highly reliable similarity
measure. The second is the construction of phylogenetic trees [4], [5]. The third concerns metric-based methods such as k-nearest neighbor [6]. The fourth issue relates to statistical classification methods such as Classification and Regression Trees (CART) [7], random forest (RF) [8], and kernel approaches such as the k-mer string kernel [9] and weighted decomposition kernels [10]. CART generates a binary tree from reference sequences by a splitting rule: the sequences at the root level are in the same class, and as the tree grows the sequences are gradually divided into separate classes. In RF, similarly to CART, several binary trees are generated, and a test query is assigned the label selected by the majority of the trees. Kernel methods transform the data into a higher-dimensional space and find a hyperplane that separates the data into classes while minimizing the classification error. Robust Discrete Discriminant Analysis (RDDA) is another method, which compares the supervised information from the training data with an unsupervised modeling of the data and is able to include new labels (classes) in the classification model [11]. Artificial neural networks have also been studied for inducing a classification scheme for DNA barcoding [12]. In general, most of the reviewed methods use raw similarity scores from which the amount of information is inferred. In addition, the Bayesian approach studied in [13] and the kernel methods used by support vector machines are applied to two species at a time to improve the performance, so multiple classifiers are required for multiple species groups. Although artificial neural network-type analysis often increases the accuracy, there is a trade-off in comprehensibility. Also, sequences with variable lengths have to be aligned. Knowing the challenges for DNA barcoding, we aim to evaluate the application of genetic programming (GP) [14] for constructing a DNA barcoding classification scheme, formulated as an optimization problem. Evolutionary computation has been successfully applied to a wide range of optimization and engineering problems. The variable-length descriptors created in GP are a possible classification candidate for DNA barcoding; GP also potentially explores the segments of the genes that play a more significant role in differentiating between species.
2 Genetic Programming
Genetic programming (GP) [14] is a technique for automatically inducing computer programs that solve problems. It has been effectively applied to various optimization problems and applied sciences. The strength of the method is its exploration property: it not only attempts to find the best solution in a local search, but also searches for alternative and possibly global solutions in the entire search space. In GP, the primary solutions of a given problem are represented as tree structures (individuals). Each node in a tree is either a terminal or a function. Terminals are leaves with no outgoing links, while functions have a number of outgoing links connected to leaves or to other functions (interior nodes). The choice of the terminal and function sets depends on the problem to be solved. GP, though not biologically Darwinian, uses genetic operators inspired by the processes of natural selection and recombination. These operators are applied to the individuals of a population, and the offspring with the higher fitness are passed to the
next generation. The starting point in GP is to create an initial population consisting of the primary (candidate) solutions. The primary solutions are the initial states in the solution space, which can be conceived of as a graph. The effect of the evolutionary-inspired operators is basically to shift the individuals of a population to different states. The operators are mutation, crossover, reproduction and selection. The mutation operator alters a node of a tree to a different function or terminal defined within the range of the terminal or function set; it maintains diversity in the population. The crossover operator is applied to two trees (individuals): two nodes of the trees are randomly selected, and then the two branches whose roots are the selected nodes are swapped, resulting in two new trees different from the parent trees. To perform genetic operators on selected individuals, there are two common selection techniques, known as fitness-proportional and tournament selection [15]. In tournament selection, a group of individuals of size k (the tournament size) is randomly selected from the population, and its members compete with each other; the fittest individual (the winner) is selected as an input for the genetic operators. In fitness-proportional selection, a probability is assigned to each individual; these probabilities determine the chance that an individual passes its offspring to the next generation.
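As an illustration of two-tournament selection, a minimal sketch follows; `fitness` is a hypothetical callable returning, for example, the number of correctly classified training sequences of an individual.

```python
import random

def tournament_select(population, fitness, k=2):
    contestants = random.sample(population, k)    # k individuals chosen at random
    return max(contestants, key=fitness)          # the fittest one wins

# parents for crossover would be drawn by repeated tournaments:
# parent_a = tournament_select(population, fitness)
# parent_b = tournament_select(population, fitness)
```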
3 Methodology
In this study, we aim to evaluate the application of genetic programming to the classification of biological species based on the COI gene. The flexibility of representing a problem in GP as a population of programs of various lengths is a unique feature for solving complex classification problems. This advantage usually cannot be found in other standard classification techniques such as decision trees, statistical classifiers (SVM) and neural networks. For instance, by increasing the training time in GP, the accuracy of the trained model is increased (with a trade-off of over-fitting the data). Also, a GP run is probabilistic, so the outcome is not identical in each run. Therefore, GP classifiers are usually based on a voting scheme, which induces a classification model with higher accuracy.
3.1 Representation of the Classifier in GP
The proposed classification model is based on a hybrid of two approaches to representing classifiers in GP proposed in [16]. The first representation, called the class enumeration technique, constructs a tree of IF-rules whose evaluation returns a class label. The second approach, called evidence accumulation, builds a tree consisting of blocks that add predefined values within [-1, 1] to a vector called the certainty vector, whose components represent the class labels. After the evaluation of the tree, the element of the certainty vector with the highest value indicates the class label predicted by the tree. In the proposed method, both the IF-rule and the evidence accumulation criteria are included in the model for a robust evaluation: a value within [-1, 1] is added to an element of the certainty vector according to the class label returned by an IF-rule. The terminal and function sets are defined as follows:
1. Arithmetic function set: A = { -, +, /, * }
2. Logical function set: L = { >, <, =, >=, <=, BETWEEN(a1, a2, a3), IF(condition) - THEN(ci, v1) - ELSE(cj, v2) }
3. Terminal set: T = { SEQUENCE(position, length, pattern), CLASS }
An example of a solution (tree) represented in the GP is shown in Figure 1.
Fig. 1. A primary solution consisting of two IF functions in the proposed GP. The letters S, B and C denote SEQUENCE, BETWEEN and CLASS, respectively.
The arithmetic functions take two inputs from sequence terminals and produce one output, all as real values. The logical functions take two or three real-valued inputs from a sequence terminal or an arithmetic function and produce one output, either true or false. Among the logical functions, the BETWEEN function has three inputs a1, a2, a3 and returns true if a2 ≤ a1 ≤ a3, and false otherwise. The IF function adds the value v1 to the i-th element of the certainty vector, corresponding to class ci, if the evaluated condition is true; otherwise the value v2 is added to the j-th element of the vector, corresponding to class cj, where i, j = 1, ..., n (n being the number of classes). The classes ci and cj are defined directly by a class terminal or indirectly by the return value of an IF function. The values of a class terminal range from 1 to n. A sequence terminal can be linked as an input either to logical or to arithmetic functions. A sequence terminal has three parameters: position, length and pattern. The position parameter determines the start point at which the evaluation of a query sequence is performed. The length parameter defines the number of nucleotides or amino acids used for evaluation from the start point. The pattern parameter is a sequence of randomly generated nucleotides or amino acids.
3.2 The Mutation Operator
Several scenarios may occur in mutation, depending on whether the selected node is a function or a terminal. If the selected node is a class terminal, its label is decreased or increased by 1 within [1, n]. In the case of a sequence terminal, one of its three parameters is modified. The position parameter can lie within [1, m] (m is the maximum sequence length, defined as a parameter). Similarly, the length parameter can be modified within [L1, L2] (L1 and L2 are also defined as parameters); L1 and L2 have been set to 2 and 10 respectively
in the GP runs. If the selected parameter is the pattern, an element of the pattern is flipped to a different nucleotide or amino acid, depending on the type of the training sequences. Finally, if the selected node is an IF function, v1 or v2 is flipped to a number within [-1, 1]. If the selected node is a function, it is flipped to a different function of the same type, since changing a logical function to an arithmetic one, or vice versa, would result in a syntactically invalid tree.
3.3 The Crossover Operator
The crossover operator applied in the proposed GP is similar to the common GP crossover. However, in the proposed GP model a number of restrictions are applied when swapping the branches of trees. A branch whose root is an IF function can only be swapped with a branch whose root is either an IF function or a class terminal, and vice versa. The second restriction is that a branch whose root is a logical function can only be swapped with a branch whose root is a logical function. Lastly, a branch whose root is an arithmetic function can be swapped only with another branch whose root is either an arithmetic function or a sequence terminal. These restrictions prevent the formation of syntactically invalid trees.
3.4 The Fitness Function
To evaluate the fitness of individuals, the training sequences are presented to an individual (tree) one at a time. The return values of the sequence terminals are calculated and passed either to arithmetic functions or to logical functions; the tree is evaluated bottom-up. For an IF function, the value v1 or v2 is added to the corresponding element of the certainty vector. After evaluating all nodes up to the tree's root, if the i-th element of the certainty vector has the highest value, the tree classifies the presented training sequence as class i (i = 1, ..., n, with n the number of classes). By repeating the tree evaluation for each training sample, the total fitness of the solution (tree) is m ≤ n if m training samples are classified correctly out of n training sequences in total. To calculate the value of a sequence terminal, a subsequence t of length l starting at position k of the presented sequence is selected, where the position k and length l are defined by the position and length parameters of the sequence terminal. The Euclidean distance between the substring t and the pattern defined by the sequence terminal is then calculated as the output of the sequence terminal. The supervised learning process is performed by representing the primary solutions in the form of a tree structure as described in Section 3.1. The primary solutions consist of one or two IF functions, and the rest of the tree's nodes are selected randomly from the terminal and function sets. The fittest solutions from the current generation are selected and placed in a candidate pool. The genetic operators are applied to the individuals of the candidate pool, and the offspring are passed to the next generation. The learning process is terminated if the desired solution is obtained, the maximum number of generations is reached, or over-fitting occurs in two consecutive generations. Over-fitting is evaluated on a set of sequences that are used neither for training nor for testing.
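A compact sketch of how such a tree could be evaluated is given below. The residue-to-number mapping and the toy rule are assumptions made for illustration; in the proposed method the trees themselves are built and modified by the GP run.

```python
import math

CODE = {"A": 1, "C": 2, "G": 3, "T": 4}        # assumed residue -> number map

def sequence_terminal(seq, position, length, pattern):
    """Euclidean distance between the query substring and the stored pattern."""
    sub = seq[position:position + length]
    return math.sqrt(sum((CODE.get(a, 0) - CODE.get(b, 0)) ** 2
                         for a, b in zip(sub, pattern)))

def evaluate_tree(seq, rules, n_classes):
    """Each rule acts like an IF node: (condition, class_i, v1, class_j, v2)."""
    certainty = [0.0] * n_classes
    for condition, ci, v1, cj, v2 in rules:
        if condition(seq):
            certainty[ci] += v1                  # evidence for class ci
        else:
            certainty[cj] += v2                  # evidence for class cj
    return max(range(n_classes), key=lambda c: certainty[c])

def fitness(rules, training, n_classes):
    """Number of training sequences classified correctly (Section 3.4)."""
    return sum(evaluate_tree(seq, rules, n_classes) == label
               for seq, label in training)

# toy rule: if the 5-mer at position 10 is close to "ACGTA", add 0.8 to class 0,
# otherwise add 0.5 to class 1
rules = [(lambda s: sequence_terminal(s, 10, 5, "ACGTA") < 2.0, 0, 0.8, 1, 0.5)]
```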
4 Experimental Setup
The DNA sequences of COI genes and their corresponding amino acid sequences were collected from the Birds of North America species [17], as shown in Table 1. The sequences have different lengths, and a number of them contain unidentified nucleotides and amino acids.

Table 1. 65 DNA and 65 amino acid sequences from the Birds of North America species

  Species                 Specimens      Species                 Specimens
  Accipiter gentilis      5              Anas americana          7
  Actitis macularius      4              Anas carolinensis       8
  Aix sponsa              5              Anas clypeata           6
  Amphispiza belli        5              Anas platyrhynchos      6
  Amphispiza bilineata    6              Anas discors            6
  Anas acuta              7
In the GP runs, the crossover rate and the population size are set to 85% and 256 respectively, and the two-tournament technique is applied for selection. The values of the GP parameters and of the terminal-node parameters discussed in Section 3 were selected through a number of preliminary experiments, based on their effects on the training performance. In this study, we conducted three experiments, as shown in Table 2.

Table 2. The sequence types and the number of sequences used in the three experiments

  Experiment   Generations   Training   Test   Sequence type
  I            2×10^4        54         11     DNA, AA, AAP
  II           2×10^4        22         43     DNA, AA, AAP
  III          7×10^4        54         11     DNA
The proposed GP classifier is trained separately on DNA sequences, on amino acid sequences (denoted AA), and on numeric sequences generated from ten physio-chemical properties of the amino acids (denoted AAP). The ten physio-chemical properties can be obtained from [18]. To generate the AAP dataset, a 20×10 matrix is created in which each row represents an amino acid and each column one of the ten properties. The entry in row i and column j of the matrix is set to 1 if amino acid i has property j, and 0 otherwise. For example, amino acid H is positive and aromatic; therefore, in the row representing H, the columns related to positive and aromatic are set to 1 and the other columns to 0. By filling the matrix cells for all amino acids, we can interpret each row as a binary number xi representing amino acid i. Lastly, for a given amino acid sequence t of length l, a vector of length l is generated by substituting the number xi for amino acid i, and similarly for the other amino acids in the sequence. In experiments I and III, one sequence is selected from each species for the test set, and the rest of the sequences are used as the
training set (Table 2). In contrast, experiment II was conducted with a reduced number of training sequences.
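A small sketch of the AAP representation follows. The property names, their order and the two example rows are assumptions made for illustration; the actual ten properties are taken from [18].

```python
PROPERTIES = ["hydrophobic", "polar", "small", "proline", "tiny", "aliphatic",
              "aromatic", "positive", "negative", "charged"]

PROPERTY_MATRIX = {
    "H": {"polar", "aromatic", "positive", "charged"},   # example from the text
    "A": {"hydrophobic", "small", "tiny"},               # illustrative only
}

def aap_value(residue):
    row = PROPERTY_MATRIX.get(residue, set())
    bits = "".join("1" if p in row else "0" for p in PROPERTIES)
    return int(bits, 2)                  # read the row as a binary number x_i

def aap_encode(protein_seq):
    return [aap_value(r) for r in protein_seq]   # numeric vector of length l

print(aap_encode("HA"))                  # e.g. [269, 672] with these toy rows
```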
5 Results and Discussion
In this section, we present the experimental results. The average classification accuracy obtained over 10 GP runs is shown in Figure 2. The training results indicate that the classifier learned from DNA sequences outperforms the classifiers trained on AA and AAP sequences. Also, the classifier using AAP sequences is trained more accurately than the classifier using AA sequences. Moreover, the detailed results of experiment II confirm that the model learns more accurately with fewer training sequences, reaching about 95.45% in the best of 10 GP runs, but it performs much less accurately, about 54%, on the query sequences. By increasing the number of training sequences in experiment I, the classification accuracy reached 81.8% in the best of 10 GP runs. This result is still substantial given the ratio of training samples to query samples. We also investigated the performance of the proposed method in experiment III by increasing the training time (number of generations). In experiment III, only DNA sequences are used, since we previously obtained the best result with DNA sequences. The classification accuracy in the best run increased from 81.8% to 91%.
[Fig. 2 plot: average training % and average test % (fitness %) for sequence types DNA, AA and AAP in Experiments I, II and III]
Fig. 2. The comparisons of average classification accuracy in 10 GP runs
The comparisons shown in Figure 2 indicate that, on average as well, the best result is obtained with DNA sequences. The classification accuracy is also improved by increasing the number of training samples and by changing the training dataset from AA sequences to AAP sequences. In Experiment III, we observed that repeating Experiment I with more training time increases both the average training and test accuracy, as expected. Lastly, the proposed method can identify the common significant sites on the sequences once training is completed. The common sites on the DNA sequences obtained from Experiment III are shown in Table 3. The common sites
are extracted from the sequence terminals. The positions and lengths of all sequence terminals are extracted, and the union of these locations is reported as the common significant sites (a small illustrative sketch of this step follows Table 3). The identified locations on the sequences are the minimum information required for classification of the eleven species shown in Table 1.

Table 3. The significant sites on DNA sequences identified in Experiment III

Common significant sites on DNA sequences
13-27      312-316
36-49      324-329
89-95      333-340
145-155    378-382
241-265    388-416
276-279    470-520
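As a minimal illustration of this step (with hypothetical terminal positions, not the actual evolved terminals), the following Python sketch merges the (position, length) pairs of the sequence terminals into the union of covered ranges.

def common_significant_sites(terminals):
    """Union of the positions covered by sequence terminals, as merged (start, end) ranges."""
    intervals = sorted((pos, pos + length - 1) for pos, length in terminals)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1] + 1:    # overlaps or touches the previous range
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Hypothetical (position, length) pairs extracted from the terminals of an evolved classifier:
print(common_significant_sites([(13, 10), (20, 8), (36, 14), (470, 51)]))
# -> [(13, 27), (36, 49), (470, 520)]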
6 Conclusion

In this study, the application of Genetic Programming (GP) to the classification of biological species using DNA barcoding data is evaluated. The ability of GP to induce classification models of variable size is an attractive property for the studied classification problem. An important factor in the classification of COI genes is determining the segments of the genes that can be used to distinguish species. Using GP, a classification model of variable size can be induced, and the model is able to include the subsequences (patterns) required for classification. The results indicate that DNA sequences are the best candidate for the studied classification problem. The proposed method also provides an interpretation of the classification task by identifying potentially useful locations on the studied sequences; in fact, the identified locations are the minimum amount of information required for classification using GP. Nonetheless, as with other evolutionary approaches, the method has a number of predefined parameters, and setting appropriate values for these parameters is crucial to obtaining reasonable results. The proposed GP-based method is computationally expensive, but it can produce good results if the parameters are properly set and sufficient training time is allowed.
References 1. Hebert, P.D.N., Ratnasingham, S., Waard, J.R.: Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proceedings of the Royal Society of London. Series B: Biological Sciences 270, S96–S99 (2003) 2. Chase, M.W., Fay, M.F.: Barcoding of plants and fungi. Science 325, 682–683 (2009) 3. Moritz, C., Cicero, C.: DNA Barcoding: Promise and Pitfalls. PLoS Biol. 2, 1529–1531 (2004) 4. Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular biology and evolution 4, 406–425 (1987) 5. Guindon, S., Gascuel, O.: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic biology 52, 696–704 (2003)
6. Austerlitz, F., David, O., Schaeffer, B., Bleakley, K., Olteanu, M., Leblois, R., Veuille, M., Laredo, C.: DNA barcode analysis: a comparison of phylogenetic and statistical classification methods. BMC bioinformatics 10, S10 (2009) 7. De’ath, G., Fabricius, K.E.: Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81, 3178–3192 (2008) 8. Breiman, L.: Random forests. Machine learning 45, 5–32 (2001) 9. Leslie, C., Eskin, E., Noble, W.S.: The spectrum kernel: A string kernel for SVM protein classification. In: Proceedings of the Pacific Symposium on Biocomputing, vol. 7, pp. 566–575 (2002) 10. Menchetti, S., Costa, F., Frasconi, P.: Weighted decomposition kernels. In: Proceedings of the 22nd International Conference on Machine learning, pp. 585–592 (2005) 11. Bouveyron, C., Girard, S., Olteanu, M.: Supervised classification of categorical data with uncertain labels for DNA barcoding. In: Proceeding of the 16th European Symposium on Artificial Neural Networks (ESANN 2009), pp. 29–34 (2009) 12. Zhang, A.B., Sikes, D.S., Muster, C., Li, S.Q.: Inferring species membership using DNA sequences with back-propagation neural networks. Systematic Biology 57, 202–216 (2008) 13. Nielsen, R., Matz, M.: Statistical approaches for DNA barcoding. Systematic Biology 55, 162–169 (2006) 14. Koza, J., Poli, R.: Genetic programming: on the programming of computers by means of natural selection. The MIT press, Cambridge (1992) 15. Goldberg, D.E., Deb, K.: A comparative analysis of selection schemes used in genetic algorithms. Foundations of genetic algorithms 1, 69–93 (1991) 16. Loveard, T., Ciesielski, V.: Representing classification problems in genetic programming. In: Proceedings of the 2001 Congress on Evolutionary Computation, vol. 2, pp. 1070–1077 (2001) 17. Barcode of Life Data Systems, http://www.boldsystems.org 18. Taylor, W.R.: The classification of amino acid conservation. Journal of theoretical Biology 119, 205–218 (1986)
Auto-Creation and Navigation of the Multi-area Topological Map for 3D Large-Scale Environment Wenshan Wang1, Qixin Cao2, Chengcheng Deng1, and Zhong Liu1 1
Research Institute of Robotics, Shanghai Jiao Tong University, No. 800, Rd. Dongchuan, 200240 Shanghai, China
[email protected],
[email protected],
[email protected] 2 The State key Laboratory of Mechanical System and Vibration, Shanghai Jiao Tong University
[email protected]
Abstract. The widely used topological map is essential to localization and navigation for mobile robots, especially in large-scale environments. In this paper, a new structure of topological map is presented and applied to robot navigation. Compared with the occupancy grid map and the conventional topological map, this kind of map can better represent environments with certain features, such as multiple floors and multiple types of area, and reduces time and space complexity to a degree. As a result, the creation and maintenance of the topological map become more convenient, and navigation based on this kind of map becomes more efficient and robust. Finally, experiments demonstrate that this approach is effective. Keywords: Topological Map, Multi-area, Navigation, Laser Sensor, Mobile Robot.
1 Introduction

Navigation is one of the most challenging capabilities for mobile robots. Over the past two decades, great progress has been made and the problems of robot localization and navigation have received considerable attention in the mobile robotics community. Thus far, maps can be divided into continuous maps and discrete maps. Continuous maps are mainly geometrical maps [1][2], which represent geometric features with primitives such as points and lines. They are easy to use for identification and localization, but have poor anti-interference performance. Discrete maps include the grid map [3], the topological map [4][5], the hybrid map [8] and so on. The grid map is concise and flexible, but consumes a large amount of memory. The topological map represents specific locations with nodes and passages with links. It is highly abstract but gives little detail about the environment [6][7]. In this paper, the topological map is adopted because it is more suitable for describing large-scale environments owing to its flexibility, robustness and low memory consumption. In the navigation process, a variety of properties are set on the topological nodes, including
orientation, size, landmark, stopFlag, etc. These attributes can remedy the inaccuracy of the topological map and improve localization and navigation ability. In practice, mobile robots usually work indoors in multi-floor, multi-room environments. Some researchers have presented a fusion algorithm to deal with the multi-floor problem [9]. However, in a large-scale environment the robot should not have to search the entire map when planning a path, since doing so raises the search difficulty and increases the search time. This paper presents a multi-area topological map structure. It can effectively organize the information in large-scale, multi-floor situations. Furthermore, a path planning approach based on the multi-area topological map is presented.
2 Robot and Map Editor

2.1 Mobile Robot for Home Service

At present, service robots are expanding towards agriculture, services, military and other non-industrial areas at an amazing speed. Particularly in our daily lives, service robots have a promising future and their commercialization is becoming more and more practical. In this paper, the study is based on the dual-arm robot SmartPal, cooperatively produced by Shanghai Jiao Tong University and Yaskawa Electric Corporation, and focuses on the map building and navigation problems in large-scale environments. SmartPal collects information through its laser sensor and then builds a grid map automatically. After that, a topological map is created in the 3D Map Editor.
Fig. 1. The left figure shows the 3D simulation model of the mobile service robot SmartPal; the laser sensor is represented by the rays around the robot. The right figure shows the 3D environment in the 3D Map Editor.
2.2 3D Map Editor

We use the 3D Map Editor to model the environment and to verify our map-creating algorithm. The 3D Map Editor provides functions for building geometrical maps, grid maps and topological maps. The whole program is based on Java3D.
3 The Topological Modeling and Multi-area Structure

3.1 Topological Node Model

Topological nodes represent the points on the map that are accessible to the robot, especially turning points and other critical points. At a topological node, the robot may need to complete some specific action, such as opening a door, turning around, picking up an object or taking an elevator. Therefore topological nodes should contain certain information, such as size, orientation, landmark and some other special properties. We record this information in XML format, a cross-platform language with a hierarchical structure. The node structure is shown in Figure 2.
Fig. 2. This shows the structure of a topological node
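The paper does not give the exact XML schema; the snippet below is only a rough Python sketch of how a node with the attributes mentioned above (position, orientation, size, landmark, stopFlag) could be written with the standard library. All element and attribute names are hypothetical.

import xml.etree.ElementTree as ET

def make_node_element(node_id, x, y, z, orientation, size, landmark, stop_flag):
    """Build one hypothetical <node> element of a topological map area."""
    node = ET.Element("node", id=str(node_id))
    ET.SubElement(node, "position", x=str(x), y=str(y), z=str(z))
    ET.SubElement(node, "orientation").text = str(orientation)   # heading, e.g. in degrees
    ET.SubElement(node, "size").text = str(size)                 # free space around the node
    ET.SubElement(node, "landmark").text = landmark              # e.g. "door", "elevator"
    ET.SubElement(node, "stopFlag").text = "1" if stop_flag else "0"
    return node

area = ET.Element("area", id="2")
area.append(make_node_element(5, 1.2, 3.4, 0.0, 90, 0.8, "door", True))
print(ET.tostring(area, encoding="unicode"))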
3.2 Topological Link Model

Topological links represent continuous passages between nodes in an area. Usually, along a topological link the robot only needs to perform simple motions such as translation or rotation. Therefore, the model of the topological link is relatively simple. It includes the start and end nodes, time cost, velocity, angular velocity, width and so on. Figure 3 shows the structure of a topological link.
Fig. 3. This shows the structure of a topological link
3.3 Multi-area Structure

This paper presents a topological map with a multi-area structure for large-scale environments. Different floors and different rooms can be divided into different areas. Each area can be regarded as an independent topological map. At the same time, areas are connected by special topological links according to the actual situation; the connections between two areas are typically the elevator between different floors or the passage between different rooms. Figure 4 shows the basic multi-area structure of the topological map.
Fig. 4. This shows the architecture of the multi-area topological map. It depicts a three-floor building, which is divided into four areas.
There are three major advantages of this structure:

I. Simple structure and clear hierarchy. In a large-scale complex environment, such as a modern high-rise building or an exhibition hall, it is hardly possible to build the entire environment into a single complete map and use it for navigation. With a regional approach, this problem is easy to solve.

II. Modular organization and good maintainability. When a local environment changes, only the corresponding area needs to be modified, which in a large-scale situation greatly reduces the updating workload.

III. Easy positioning and navigation. When using the topological map for path planning, if the start node and the goal node are in the same area, it is only necessary to search that particular area, which greatly simplifies the problem. If the start and goal nodes are not in the same area, the task can be accomplished by travelling through the topological links between areas.
4 Auto-creating and Editing Topological Map

In the 3D Map Editor, the data collected by the laser module are converted into a grid map. Then the skeleton of the map is extracted using a graphics algorithm. Finally, the non-critical topological nodes are replaced by links. This automatic process of topological map building is shown below; a minimal illustrative sketch of the thinning step is given after Fig. 5.
[Fig. 5 flowchart: LRF data file → grid map → topological map; steps: read map data from the XML file; split the grid and mark the cells occupied by obstacles; paint the grid onto the map; add topological nodes using the thinning algorithm; clean the non-critical nodes and create the topological map.]

Fig. 5. This shows the process of creating the topological map
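The following is a minimal sketch of the thinning step only, assuming a 2D boolean occupancy grid (True = free space) and using scikit-image's skeletonize; skeleton cells whose number of neighbours differs from two (end points and junctions) are kept as candidate topological nodes. It illustrates the idea rather than the 3D Map Editor implementation.

import numpy as np
from skimage.morphology import skeletonize

def candidate_nodes(free_grid):
    """Thin the free-space grid and keep skeleton cells that are end points or junctions."""
    skeleton = skeletonize(free_grid)
    nodes = []
    for r, c in zip(*np.nonzero(skeleton)):
        window = skeleton[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
        if window.sum() - 1 != 2:        # corridor cells have exactly two skeleton neighbours
            nodes.append((int(r), int(c)))
    return skeleton, nodes

# Toy occupancy grid: a T-shaped free corridor.
grid = np.zeros((9, 9), dtype=bool)
grid[4, 1:8] = True
grid[1:5, 4] = True
_, nodes = candidate_nodes(grid)
print(nodes)    # end points and the junction of the "T"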
It is very convenient to display and create the map in the 3D Map Editor. The entire process can be visualized and is easy to modify and improve. In addition, the 3D Map Editor supports VRML models, which allows us to use SolidWorks and other 3D software to establish a precise environment model. In fact, we made a copy of our laboratory in the virtual world and achieved a realistic effect. This combination of the real world and the simulation is very helpful to our study.
Fig. 6. This shows the experimental result of creating a topological map in the 3D Map Editor. There are several steps, as shown in the figure: 1) collecting laser data in the 3D environment; 2) creating the grid map; 3) building the map with topological nodes using the thinning algorithm; 4) creating the topological map.
5 Multi-area Navigation

The topological map is used for navigation when the robot receives commands from people, who specify the start node and the end node on the topological map. The path planner then works out the path with minimum cost, if one exists, and the robot follows this path to the destination. There are three main requirements for the planning algorithm:

I. The algorithm should run in real time; that is, the planner should take as little time as possible during navigation.

II. The algorithm should avoid infinite loops. There may be many cycles in the topological map, and the robot should avoid visiting the same node again.

III. The algorithm should adapt to the multi-area structure of the map.

For real-time performance, the algorithm should be as simple as possible. We consider a heuristic search algorithm to solve this problem. Let f(n) be defined as the actual cost of the optimal path constrained to go through n. f(n) consists of two parts:
f(n) = g(n) + h(n) .    (1)
where g(n) is the actual cost of an optimal path from the start point s to n, and h(n) is the actual cost of an optimal path from n to the goal point. The evaluation
function f̂(n) is defined as the sum of ĝ(n) and ĥ(n), where ĝ(n) is an estimate of g(n) and ĥ(n) is an estimate of h(n). In this case, ĝ(n) can be defined as the minimum cost of the path found from s to n. The problem is then how to define ĥ(n) to form the heuristic function and to adapt it to the multi-area situation. There are several basic principles to follow:

I. The estimated cost must not exceed the actual cost [12].

II. The cost should be smaller if the current node is in the same area as the end node.

III. The robot may only need to search a certain part of the map, depending on the navigation task.

Based on the above three principles, the heuristic function is designed as follows:
f̂(n) = g(n) + ĥ(n) ,    (2)

ĥ(n) = d(n) + δ·l(n) ,    (3)
where d(n) is the distance between n and the goal node, l(n) is an additional area-cost term, and δ ∈ (0, 1) is the impact factor.
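A minimal Python sketch of this heuristic search follows, with ĥ(n) = d(n) + δ·l(n); here d(n) is taken as the straight-line distance and l(n) as a simple 0/1 area penalty (0 if n is already in the goal's area). These choices, the toy graph and the value of δ are illustrative assumptions, not the exact costs used in the paper.

import heapq, math

def plan(nodes, edges, start, goal, delta=0.5):
    """A*-style search over a multi-area topological map.

    nodes: id -> (x, y, area);  edges: id -> list of (neighbour_id, cost).
    """
    def h(n):
        (x, y, a), (gx, gy, ga) = nodes[n], nodes[goal]
        d = math.hypot(gx - x, gy - y)          # d(n): straight-line distance to the goal
        l = 0.0 if a == ga else 1.0             # l(n): assumed area penalty
        return d + delta * l

    g = {start: 0.0}
    parent = {start: None}
    frontier = [(h(start), start)]
    while frontier:
        _, n = heapq.heappop(frontier)
        if n == goal:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        for m, cost in edges.get(n, []):
            if m not in g or g[n] + cost < g[m]:
                g[m] = g[n] + cost
                parent[m] = n
                heapq.heappush(frontier, (g[m] + h(m), m))
    return None

# Tiny two-area example; node ids, coordinates and costs are hypothetical.
nodes = {"A1": (0, 0, 0), "A2": (1, 0, 0), "B1": (2, 0, 1), "B2": (3, 0, 1)}
edges = {"A1": [("A2", 1)], "A2": [("B1", 1)], "B1": [("B2", 1)]}
print(plan(nodes, edges, "A1", "B2"))   # ['A1', 'A2', 'B1', 'B2']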
Fig. 7. This shows the experimental result of the navigation. The task for the robot is to travel from node 5 in area 2 to node 10 in area 0. The path passes through 13 nodes in total, including three cross-area nodes: node 2 in area 2, node 7 in area 1 and node 25 in area 0.
This algorithm is tested on the map shown in Figure 6. The start node is set to node 5 in area 2, and the end node is set to node 10 in area 0. Figure 7 shows the result of the path planning, where the dashed line indicates the optimal path calculated by the path planning unit. The simulation result shows that the path planner can handle large-scale navigation efficiently and adapts well to the multi-area situation. Moreover, compared with a non-area map, which builds the whole environment into one map, this approach shows better computational efficiency.
6 Conclusion

This paper addressed a multi-area structure of the topological map to deal with map-building and navigation in large-scale 3D environments. A new model of topological node and link was developed to achieve better navigation results. The map-building process was successfully accomplished in the 3D Map Editor, and the path planning algorithm and the simulation results are shown in the 3D Viewer. The encouraging simulation results show that the improved topological map can be used for map-building and navigation in large-scale environments. Acknowledgments. This work was supported by The State Key Laboratory of Mechanical System and Vibration under grant MSV-MS-2010-01.
References 1. Borges, G.A., Aldon, M.J.: Optimal mobile robot pose estimation using geometrical maps. IEEE Transactions on Robotics and Automation, 87–94 (February 2002) 2. Habib, M.K., Yuta, S.: Development and implementation of navigation system for an autonomous mobile robot working in a building environment with its real time application. In: 15th Annual Conference of IEEE Industrial Electronics Society, IECON 1989, November 6-10, vol. 3, pp. 613–622 (1989) 3. Makarenko, A.A., Willians, S.B., Durrant-Whyte, H.F.: Decentralized certainty grid maps. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3258–3263. IEEE, Piscataway (2003) 4. Guillaume, D., Samuel, P., Laurent, F., Pascal, L.: Topological Map: An Efficient Tool to Compute Incrementally Topological Features on 3D Images. In: Reulke, R., Eckardt, U., Flach, B., Knauer, U., Polthier, K. (eds.) IWCIA 2006. LNCS, vol. 4040, pp. 1–15. Springer, Heidelberg (2006) 5. Van Zwynsvoorde, D., Simeon, T., Alami, R.: Incremental Topological Modeling using Local Vorono’i-like Graphs. In: Proceedings of the 2000 IEEE/RSJ International Conference on intelligent Robots and Systems (2000) 6. Su, L., Yu, Y., Chen, W., Cao, Z., Tan, M.: Extraction of Topological Map Based on the Characteristic Corners from Grid Environment Exploration. In: IEEE International Conferences on Cybernetics & Intelligent Systems and Robotics, Automation & Mechatronics, Bangkok, Thailand, pp. 772–777 (July 2006) 7. Cheong, H., Park, S., Park, S.-K.: Topological Map Building and Exploration Based on Concave Nodes. In: International Conference on Control, Automation and Systems 2008, COEX, Seoul, Korea, October 14-17, pp. 1115–1120 (2008)
8. Zhuang, Y.: Hybrid Mobile Robot Indoor Localization Using Large-Scale Metric-Topological Map. In: The Sixth World Congress on Intelligent Control and Automation, WCICA 2006, pp. 9057–9062 (2006) 9. Gu, J., Cao, Q., Zhang, Z.: Geometrical Maps Integrated with Topology and Navigation Approach in Multi-floor Indoor Environment. In: 7th International Workshop on Intelligent Robots of China 10. Choi, C.-H., Song, J.-B., Chung, W., Kim, M.: Topological map building based on thinning and its application to localization. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2002) 11. Tae-Bum, K., Jae-Bok, S.: Thinning-based topological exploration using position possibility of topological nodes. Advanced Robotics 22(2-3), 339–359 (2008) 12. Hart, P.E., Nilsson, N.J., Raphael, B.: A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics SSC-4(2), 100–107 (1968)
Relation of Infarct Location and Size to Extent of Infarct Expansion After Acute Myocardial Infarction: A Quantitative Study Based on a Canine Model Jianhong Dou1,2, Ling Xia2, Yunliang Zang2, Yu Zhang2, and Guofa Shou2 1
Department of Anesthesiology, Guangzhou General Hospital of Guangzhou Military Command, Guangzhou 510010, China 2 Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027, China
[email protected]
Abstract. This paper analyzes the influence of the size and location of myocardial infarction (MI) on the extent of infarct expansion (IE), based on a coupled electromechanical bi-ventricular model of the canine heart. The ventricular motion and the distributions of principal strain and stress during systole were used to contrast the effect of IE at different locations and with different sizes. The results showed that IE occurred more in the anterior wall (AW) near the apex than in the posterior wall (PW), and that a large transmural MI may contribute substantially to the development of IE, in agreement with clinical results. Keywords: Infarct expansion (IE), myocardial infarction (MI), computational modeling.
1 Introduction

Infarct expansion (IE), characterized by dilation and thinning of the infarcted zone, is an early event in the course of myocardial infarction (MI) with serious short- and long-term consequences [1]. This process appears to be associated with deterioration of cardiac function and increased mortality [2], and may be important in the development of cardiac rupture and late aneurysm formation [3]. Therefore, IE is regarded as a potent predictor of survival in patients after MI. Nevertheless, the explicit mechanism underlying the abnormal ventricular function caused by IE has not been elucidated. Quantitative mechanical assessment of IE may be useful for the diagnosis and therapy of MI with IE, to avoid an ominous prognosis and the development of chronic heart failure (CHF). Tagged MRI and echocardiography are valuable noninvasive tools for analyzing cardiac mechanics. However, they are limited to a few simultaneous LV locations [4]. Furthermore, it is not feasible to measure all the necessary mechanical indexes from imaging experiments or other means in the clinic. Modeling of the heart may be an alternative to enhance our understanding of the
mechanical properties under pathological conditions. Based on a strongly coupled electromechanical bi-ventricular model of the canine heart, the relative contributions of the anterior wall (AW) and posterior wall (PW) to IE, and the relation of initial infarct size to the extent of IE after acute myocardial infarction (AMI), were studied.
2 Method

A three-dimensional (3D) electromechanical bi-ventricular model of the normal canine heart, including realistic geometry and fiber orientation based on DT-MRI datasets from Duke University, was constructed in our previous investigations [5]. The electrical activation sequences were simulated based on solutions of reaction-diffusion equations, solved with a parallel computation strategy [6]. Ischemia/MI changes the myocardial action potentials, with a decreased magnitude of the resting potential and a prolonged action potential duration, as observed in the infarcted regions of experimental animals [7]. By progressively modifying the abnormal action potentials assigned to the injured region, as shown in Fig. 1, the evolution of the abnormal electrical activation sequences of the ventricles after AMI was simulated with a monodomain solution, as shown in Fig. 2.
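The canine model solves the full 3D monodomain reaction-diffusion system in parallel [6]; the fragment below is only a generic 1D reaction-diffusion sketch with FitzHugh-Nagumo kinetics, integrated by explicit Euler, to illustrate the type of equation involved. All parameter values are arbitrary and unrelated to the canine model.

import numpy as np

# 1D reaction-diffusion (monodomain-style) toy model with FitzHugh-Nagumo kinetics.
nx, dx, dt, D = 200, 0.1, 0.01, 0.1       # grid size, spacing, time step, diffusion
a, b, eps = 0.7, 0.8, 0.08                # FHN parameters (arbitrary)
v = -1.2 * np.ones(nx)                    # "membrane potential" variable, near rest
w = -0.6 * np.ones(nx)                    # recovery variable
v[:10] = 1.0                              # stimulate the left end

for step in range(2000):
    lap = np.zeros(nx)
    lap[1:-1] = (v[2:] - 2 * v[1:-1] + v[:-2]) / dx ** 2   # interior Laplacian, crude ends
    v_new = v + dt * (D * lap + v - v ** 3 / 3 - w)
    w += dt * eps * (v + a - b * w)
    v = v_new

print("max v at end of simulation:", round(float(v.max()), 3))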
Fig. 1. Simulated action potentials. The degree of injury becomes more severe from action potential curve atp2 to atp6; the blue curve is the normal action potential.
Fig. 2. Simulated abnormal electrical activation sequences of the canine ventricles after AMI. The top and bottom panels show the 3D electrical activation sequences of the ventricles for AW MI and PW MI, respectively, with MI areas of 5%, 10% and 15%.
In this study, simulations were divided into two subsets according to the site of the infarct zone: those with transmural AW MI and those with transmural PW MI near the apex, each with infarct areas of 5%, 10% and 15%. After the excitation sequences were determined, the active forces per element at each time could be calculated. MI may change the local contraction forces and the stiffness of the myocardium. In the fiber coordinate system (one axis is chosen to coincide with the local muscle fiber direction, and another is determined by the epicardial surface normal vector), the abnormal active tensile stress {F_f}_e in the fiber direction is given by Equation (1), which describes the alteration of the active force:
\{F_f\}_e = -\,w \sum_{l=1}^{L_e} \int_{\xi_{l-1}}^{\xi_l} \int_{-1}^{1} \int_{-1}^{1} [B]^{T}\, T\, \{0,0,\sigma'_e,0,0,0\}^{T}_{l}\, |J|\, d\xi\, d\eta\, d\zeta    (1)
where w is the MI factor (in the boundary zone of the MI region, w varies between 0.0 and 1.0, the healthy value) [7]; [B] is the geometric matrix of an element; σ'_e is the active myofiber stress as a function of time after the onset of contraction and of sarcomere-length history; |J| is the determinant of the Jacobian matrix; ξ, η, ζ are the local coordinates, each ranging from -1 to 1; l and L_e are the layer index and the total number of layers in an element, respectively; and T is the transformation matrix between the fiber coordinate system and the global coordinate system.
Since infarcted tissue is stiffer than normal cardiac tissue, the elastic modulus of the diseased zone is set to be several times larger than that of the normal zone. These constitutive relations for cardiac mechanics are then coupled to the finite-element (FE) model. For convenience of operation, the LV and RV walls are divided into 14 layers along the long axis in this model. There are in total 2269 hexahedral elements and 8736 degrees of freedom. After discretizing the ventricles into finite elements in the radial direction, we group ten layers per element and assume that all the fibers within a layer have the same orientation. The active force vectors are multi-directional inside an element. A muscle-like restriction is added to the base elements owing to the constraint of the pleura. A Windkessel model of arterial impedance was coupled to the ventricular pressure to provide the hemodynamic boundary condition, and a pre-load was also added. The mechanical equations are solved using the finite-element method (FEM) with eight-node isoparametric elements. The regional motion and deformation of the ventricles are then calculated during the systolic phase.
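The paper does not specify the Windkessel model's form or parameters; below is a minimal two-element Windkessel sketch (C·dP/dt = Q(t) − P/R) with made-up parameter values and a hypothetical half-sine ejection flow, integrated by forward Euler, just to illustrate how an arterial afterload model can supply the pressure boundary condition.

import math

R, C = 1.0, 1.5            # peripheral resistance and arterial compliance (arbitrary units)
dt, T = 0.001, 0.8         # time step and cardiac period
P = 80.0                   # initial arterial pressure

def flow(t):
    """Hypothetical ejection flow: half-sine during the first 0.3 of each beat, zero otherwise."""
    tau = t % T
    return 300.0 * math.sin(math.pi * tau / 0.3) if tau < 0.3 else 0.0

for k in range(int(5 * T / dt)):          # integrate five beats
    t = k * dt
    P += dt * (flow(t) - P / R) / C       # two-element Windkessel ODE
print("arterial pressure after five beats:", round(P, 1))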
3 Results

3.1 Motion of Ventricles

A mesh display of the bi-ventricle for AW MI and PW MI during middle-rapid ejection is shown in Fig. 3. The deformations near the infarcted zone were more unpredictable than those near the healthy region. The diseased zone bulged outward from the LV wall, and this was especially obvious for AW MI, close to the results observed experimentally in dogs using 3D echocardiography [8].
Fig. 3. A mesh display of the bi-ventricle for AW MI and PW MI during middle-rapid ejection
3.2 The Minimum Principal Strain Analysis

The Lagrangian strain tensor at each point of the myocardium, which is a function of time, was used for the evaluation of ventricular deformation. Here we adopted the minimum principal strain E3 as an index to represent the physiological phenomenon of muscular contraction. E3 is the minimal eigenvalue of the Green-Lagrange strain tensor E, given by

E = (1/2)(F^T F - I) ,    (2)
where I denotes the identity matrix, the superscript T represents the matrix transpose, and F is the deformation gradient, which captures both the rotation and the deformation around a point. Fig. 4 shows the minimum principal strain distribution of the LV middle wall for PW MI (top panels) and AW MI (bottom panels) during ejection. Notice that with increasing MI area, the abnormal region bulged further out of the wall for both AW MI and PW MI, but the strains for AW MI were more distinctive and on average larger than those for PW MI.
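A small numpy sketch of Eq. (2) follows: given a deformation gradient F at a material point, it computes the Green-Lagrange strain tensor and returns its smallest eigenvalue E3. The example F (shortening along one axis with a slight shear) is made up.

import numpy as np

def minimum_principal_strain(F):
    """Green-Lagrange strain E = 0.5*(F^T F - I); E3 is its smallest eigenvalue."""
    E = 0.5 * (F.T @ F - np.eye(3))
    return np.linalg.eigvalsh(E).min()       # eigvalsh: eigenvalues of the symmetric E

# Hypothetical deformation gradient: shortening along one axis, slight shear.
F = np.array([[0.85, 0.05, 0.0],
              [0.0,  1.10, 0.0],
              [0.0,  0.0,  1.02]])
print("E3 =", minimum_principal_strain(F))   # a negative value indicates contraction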
Fig. 4. The minimum principal strain of LV middle wall for PW MI (top panels) and AW MI (bottom panels) at four different times. (T1: the end of isovolumetric contraction; T2: middle-rapid ejection; T3: the end of rapid ejection; T4: the end of reduced ejection)
Fig. 4. (continued)
Fig. 5. The top panel shows the distribution of mechanical stress of the PW around the equator site during ejection, and the bottom panel shows the distribution of mechanical stress of the AW around the equator site.
Fig. 5. (continued)
3.3 The First Principal Stress Distribution

Strain analysis alone may be inadequate for assessing mechanical dyssynchrony. Myocardial wall stress is one of the primary determinants of oxygen consumption and can be used as a marker of disease. However, regional wall stress is very difficult to measure in the intact heart. Predictions of the spatial distributions of mechanical stress for AW and PW MI around the equator site during ejection are illustrated in Fig. 5. It is found that the predicted peak stress in the infarcted ventricle is larger for AW MI than for PW MI. In the border zone surrounding the MI, systolic stresses are elevated. The results suggest that infarct expansion occurs more often in anterior transmural myocardial infarctions and to a lesser extent in infarctions at other sites (i.e., the PW), which is in agreement with clinical data [1].
4 Discussion

Based on an electromechanical FE model of the canine heart after AMI, we studied quantitatively the relation of initial infarct size to the extent of IE and the relative contributions of the AW and PW to IE. The myocardial deformations (strains and stresses) show that infarct size is a determinant of infarct expansion and that IE is associated with large infarcts. It can also be seen from the simulations that infarct location is another factor
which may, in part, affect the extent to which infarct expansion occurs. The anterior left ventricular wall may be more susceptible to the development of expansion after transmural infarction than the posterior left ventricular wall. However, it should be pointed out that infarct transmurality is assumed in all of our simulations. These results show good agreement with clinical conclusions [1, 9]. The observation that expansion occurs more often in AW infarcts might be explained by the fact that regions of the anterior left ventricular myocardium near the apex have greater curvature than those of the PW. Furthermore, the muscle tissue in the AW infarct zone sustains larger local wall stresses, which increases the regional forces driving the myocardial tissue to expand.
5 Conclusion

This simulation suggests that such a coupled heart model can be used to assess the mechanical function of the bi-ventricle under diseases such as MI and to quantify its mechanical status. With a more precise model, it could provide information for diagnosis and for planning surgical and medical interventions in patients with MI. Acknowledgements. This project is supported by the 973 National Basic Research & Development Program (2007CB512100), the National Natural Science Foundation of China (30570484), the 863 High-tech Research & Development Program (2006AA02Z307) and the Program for New Century Excellent Talents in University (NCET-04-0550). We would like to thank Prof. Edward W. Hsu, Department of Bioengineering, Duke University, for providing the dataset and technical discussions.
References 1. Weisman, H.F., Healy, B.: Myocardial infarct expansion, infarct extension, and reinfarction: pathophysiologic concepts. Prog. Cardiovasc. Dis. 30, 73–110 (1987) 2. Eaton, L.W., Weiss, J.L., Bulkley, B.H., Garrison, J.B., Weisfeldt, M.L.: Regional cardiac dilatation after acute myocardial infarction. N. Engl. J. Med. 300, 57–62 (1979) 3. Schuster, E.H., Bulkley, B.H.: Expansion of transmural myocardial infarction: A pathophysiologic factor in cardiac rupture. Circulation 60, 1538–1632 (1979) 4. Walker, J.C., Ratcliffe, M.B., Zhang, P., Wallace, A.W., Fata, B., Hs, E.W.: MRI-based finite-element analysis of left ventricular aneurysm. Am. J. Physiol. Heart Circ. Physiol. 289, H692–H700 (2005) 5. Dou, J.H., Xia, L., Zhang, Y., Huo, M.M., Li, X.W.: A Three-dimensional Electromechanical Model of Canine Heart. In: DCDIS Proceedings of the International Conference on Life System Modeling and Simulation (LSMS 2007), pp. 593–597. Watem Press, Shanghai (2007) 6. Zhang, Y., Xia, L., Gong, Y.L., Chen, L.G., Hou, G.H., Tang, M.: Parallel solution in Simulation of Cardiac Excitation Anisotropic Propagation. In: Sachse, F.B., Seemann, G. (eds.) FIHM 2007. LNCS, vol. 4466, pp. 170–179. Springer, Heidelberg (2007)
7. Liu, F., Xia, L., Zhang, X.: Analysis of the influence of the electrical asynchrony on regional mechanics of the infarcted left ventricle using electromechanical heart models. JSMEA 46, 1–9 (2003) 8. Mannaerts, H.F.J., van der Heide, J.A., Kamp, O., Stoel, M.G., Twisk, J., Visser, C.A.: Early identification of left ventricular remodelling after myocardial infarction, assessed by transthoracic 3D echocardiography. Eur. Heart J. 25, 680–687 (2004) 9. Eaton, L.W., Bulkley, B.H.: Expansion of acute myocardial infarction: its relationship to infarct morphology in a canine model. Cir. Res. 49, 80–88 (1981)
Artificial Intelligence Based Optimization of Fermentation Medium for β-Glucosidase Production from Newly Isolated Strain Tolypocladium Cylindrosporum Yibo Zhang1, Lirong Teng1, Yutong Quan1, Hongru Tian1, Yuan Dong1, Qingfan Meng1, Jiahui Lu1, Feng Lin1, and Xueqing Zheng2,* 1
College of Life Science, Jilin University, Changchun 130012, China; The First Hospital of Jilin University, Changchun 130021, China
[email protected],
[email protected] 2
Abstract. A Tolypocladium cylindrosporum strain was isolated that efficiently produces extracellular thermoacidophilic β-glucosidase (BGL). The objective of the present paper is to integrate two different artificial intelligence techniques, namely the artificial neural network (ANN) and the genetic algorithm (GA), to optimize the medium composition for the production of BGL in submerged fermentation (SmF). Specifically, the ANN was used to model the non-linear process and the GA to optimize it. The experimental data reported in a previous study of statistical optimization were used to build the ANN model. The concentrations of the four medium components served as inputs to the ANN model and the β-glucosidase activity as its output. The average error (%) and correlation coefficient of the ANN model were 1.36 and 0.998, respectively. The input space of the ANN model was subsequently optimized using the GA. The ANN-GA model predicted a maximum β-glucosidase activity of 2.679 U/ml at the optimum medium composition, a 22% increase over the statistical optimization, which was in good agreement with the actual experiment under the optimum conditions. Keywords: Artificial intelligence (AI), Artificial neural network (ANN), Genetic algorithm (GA), fermentation medium, β-glucosidase (BGL), Tolypocladium cylindrosporum.
1 Introduction

In view of the fast depletion of oil reserves and of food shortages, there is enormous worldwide interest in the development of new and cost-efficient processes for converting plant-derived biomass to bioenergy [1]. Cellulose, replenished by photosynthetic reduction of carbon dioxide using sunlight energy, is the major

* Corresponding author.
component of plant cell walls and the most abundant renewable biological resource in the biosphere [2]. It is well known that cellulase basically comprises endoglucanase, exoglucanase and β-glucosidase. β-Glucosidases play a key role during the enzymatic hydrolysis of cellulose: the activities of endoglucanase and exoglucanase are severely inhibited by cellobiose and some cello-oligosaccharides, which can be converted to glucose by BGL [3]. Hence biotechnologists are constantly in search of more productive strains and of new, inexpensive fermentation processes giving high BGL activity [4]. Thus, in the fermentation process, medium optimization is recognized as a simple but effective method for achieving high productivity of the desired products [5]. Since the medium components have complex effects on the fermentation process, statistical techniques such as response surface methodology (RSM) are increasingly being used [6]. The development of accurate models of a biological system on a chemical and physical basis is still a critical challenge, mainly due to the complex and highly non-linear nature of the fermentation process [7]. It has been shown that ANN and GA are good at mimicking and modeling different aspects of biological information, and their use in food science, environmental biotechnology and biochemical engineering has a well-established tradition [8]. Genetic algorithms [9], which are artificial intelligence-based stochastic non-linear optimization formalisms, follow the theory of population evolution in natural systems. They have been used with great success for optimizing complex expressions, which in this case is the ANN. BGL production from T. cylindrosporum syzx4 was optimized for enhanced BGL activity by the following steps: (i) an ANN model was developed using the influential process variables as model inputs and the BGL activity as the model output; (ii) the input space of the ANN model was optimized using the GA formalism with a view to maximizing the BGL activity; and (iii) the actual experiment was carried out to evaluate the feasibility of the model.
2 Materials and Methods

2.1 Microorganism, Culture Conditions and Assay of Enzyme Activity

The strain T. cylindrosporum syzx4 (CCTCC M 209312) was isolated from rotten corn stover in northeastern China. Upon screening for BGL activity, the isolate was found to produce high BGL activity. The culture inoculum was prepared and used as described earlier [10]. After culturing, aliquots of the cells were centrifuged at 6,000×g for 10 min, and the supernatants were used to analyze the enzymatic activity of BGL with pNPG as the substrate [11]. One unit (U) of enzyme activity was defined as the amount of enzyme required to produce 1 μmol of p-nitrophenol from the pNPG substrate per minute.

2.2 Artificial Neural Network (ANN) and Genetic Algorithm (GA)

The typical neural network architecture has an input layer, one or more hidden layers and an output layer [12]. The MLP, which is the most widely used to mimic the
non-linear model, consists of only one hidden layer. The most widely used formalism for training the network is the error-back-propagation (EBP) algorithm. Once a capable ANN model with good prediction accuracy has been developed, a genetic algorithm can be used to optimize its input space. The GA-based search for an optimal solution begins with a randomly initialized population of probable (candidate) solutions. The solutions, coded in the form of binary digits (chromosomes), are then evaluated to measure their fitness in fulfilling the optimization objective. Next, a main loop of operations consisting of (i) selection, (ii) crossover and (iii) mutation of elements of the offspring strings is executed. The best string that evolves after repeating the above-described loop until convergence forms the solution to the optimization problem.
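A minimal binary-coded GA sketch following the loop just described (tournament selection, one-point crossover, bit-flip mutation) is given below. The fitness function here is a toy single-peak placeholder standing in for the trained ANN, and the parameter values simply echo plausible settings; none of this is the exact implementation used in the study.

import random

def decode(bits, lo, hi):
    """Map a bit string to a real value in [lo, hi]."""
    return lo + int("".join(map(str, bits)), 2) / (2 ** len(bits) - 1) * (hi - lo)

def ga(fitness, n_bits=30, pop_size=40, p_cross=0.2, p_mut=0.09, n_gen=50):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(n_gen):
        # two-individual tournament selection
        parents = [max(random.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            if random.random() < p_cross:                    # one-point crossover
                cut = random.randrange(1, n_bits)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children += [a[:], b[:]]
        for child in children:                               # bit-flip mutation
            for i in range(n_bits):
                if random.random() < p_mut:
                    child[i] ^= 1
        pop = children
        best = max(pop + [best], key=fitness)
    return best

# Toy fitness: a smooth single-peak function standing in for the trained ANN.
toy_fitness = lambda bits: -(decode(bits, -2, 2) - 0.5) ** 2
print("best decoded value:", decode(ga(toy_fitness), -2, 2))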
3 Results and Discussion

3.1 The Medium Components for Process Modeling

The BGL activity of T. cylindrosporum syzx4 had been optimized previously using a statistical method: we optimized the medium for BGL activity using response surface methodology (RSM). TCS, SM, KH2PO4 and (NH4)2SO4 significantly improved the BGL activity. The combined effects of these components at five levels (Table 1) on the BGL activity were studied using a central composite design (CCD) (Table 2), and a second-order polynomial was established to identify the relationship between the BGL activity and the four medium components. The maximum BGL activity predicted by the RSM model was 2.21 U/mL.

3.2 ANN-Based Modeling of BGL Fermentation

An artificial neural network is a superior and more accurate modeling technique compared with RSM, as it represents the non-linearities in a much better way. The ANN model has four input nodes representing the four influential fermentation variables (TCS, SM, KH2PO4 and (NH4)2SO4) and one output representing the BGL activity (U/mL) at the end of the fermentation. Specifically, a total of 36 experiments were conducted, and the corresponding data are tabulated in Table 2. For ANN-based modeling, the process data comprising 36 patterns (example set), each representing a

Table 1. Medium components and their levels of the variables used in the CCD
Independent variable            Level: -2    -1     0      1      2
X1  TCS (g/L)                          10    20     30     40     50
X2  SM (g/L)                           2.5   5      7.5    10     12.5
X3  KH2PO4 (g/L)                       1     1.5    2      2.5    3
X4  (NH4)2SO4 (g/L)                    2     3.5    5      6.5    8
Table 2. Experimental design and results of the central composite design (CCD)
Run    X1    X2    X3    X4    BGL activity (U/mL)
1      -1    -1    -1    -1    2.092
2      -1    -1    -1     1    1.503
3      -1    -1     1    -1    2.012
4      -1    -1     1     1    1.654
5      -1     1    -1    -1    1.928
6      -1     1    -1     1    1.42
7      -1     1     1    -1    2.084
8      -1     1     1     1    1.514
9       1    -1    -1    -1    1.596
10      1    -1    -1     1    1.356
11      1    -1     1    -1    1.732
12      1    -1     1     1    1.438
13      1     1    -1    -1    1.573
14      1     1    -1     1    1.295
15      1     1     1    -1    1.467
16      1     1     1     1    1.407
17     -2     0     0     0    1.192
18      2     0     0     0    1.355
19      0    -2     0     0    1.803
20      0     2     0     0    1.585
21      0     0    -2     0    1.665
22      0     0     2     0    1.954
23      0     0     0    -2    2.083
24      0     0     0     2    1.355
25      0     0     0     0    2.075
26      0     0     0     0    2.036
27      0     0     0     0    2.087
28      0     0     0     0    2.069
29      0     0     0     0    2.072
30      0     0     0     0    2.073
31      0     0     0     0    2.072
32      0     0     0     0    2.079
33      0     0     0     0    2.073
34      0     0     0     0    2.072
35      0     0     0     0    2.078
36      0     0     0     0    2.078
pair of model inputs (fermentation conditions) and a single output (BGL activity), was partitioned into a training set (24 patterns) and a test set (12 patterns). While the training set was used for adjusting the weights (W) of the ANN model during training, the test set was used to gauge the network's generalization performance after each training iteration. The weights resulting in the lowest test-set RMSE were chosen as the optimum weights. In the ANN training procedure, the tansig and purelin functions were used to compute the outputs of the hidden nodes and of the output node, respectively. While developing an optimal ANN model, the effect of varying the number of hidden nodes and the EBP-algorithm-specific parameters (learning rate η and momentum coefficient α) on the training- and test-set RMSE was studied rigorously. The values of η and α that resulted in the minimum training-set RMSE were 0.8 and 0.05, respectively, with corresponding RMSE values of 0.138 for the training set and 0.013 for the test set. The R2 values between the model-predicted and desired BGL activity for the training set and the test set were 0.997 and 0.994, respectively, whereas the R2 of the RSM model is 0.929. The small and comparable magnitudes of the RMSE and average prediction error (%), and the high value of R2, suggest that the ANN-based
model possesses good approximation and generalization characteristics. A comparison of the ANN-predicted and RSM-predicted values with the desired values of the BGL activity is depicted in Fig. 1. As can be seen in Fig. 1, the ANN model fitted the experimental data with better accuracy than the RSM model. Similar results were obtained for the production of protease, where it was shown that ANN has better prediction accuracy than RSM.
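A bare-bones numpy sketch of such a network follows: four inputs, one tanh ("tansig") hidden layer and a linear ("purelin") output, trained by plain gradient-descent back-propagation on mean-squared error. The tiny synthetic dataset, the hidden-layer size and the learning rate are placeholders, not the experimental data or the tuned settings reported above (and no momentum term is included).

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(24, 4))                 # placeholder "coded" medium levels
y = 2.0 - 0.1 * (X ** 2).sum(axis=1, keepdims=True)  # placeholder activity surface

n_hidden, lr = 6, 0.01
W1 = rng.normal(0, 0.5, (4, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)

for epoch in range(5000):
    h = np.tanh(X @ W1 + b1)          # tansig hidden layer
    out = h @ W2 + b2                 # purelin output layer
    err = out - y
    # back-propagation of the mean-squared error
    grad_out = 2 * err / len(X)
    grad_W2 = h.T @ grad_out; grad_b2 = grad_out.sum(axis=0)
    grad_h = grad_out @ W2.T * (1 - h ** 2)
    grad_W1 = X.T @ grad_h;  grad_b1 = grad_h.sum(axis=0)
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

rmse = np.sqrt(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))
print("training RMSE:", round(float(rmse), 4))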
Fig. 1. BGL activity predicted by ANN and RSM versus actual BGL activity
3.3 GA-Based Optimization of the ANN Model

The GA was implemented to optimize the medium composition for maximum BGL activity. Since the ANN model represents the non-linear relationship between the four medium components and the BGL activity, it can be used to define the corresponding single-objective optimization problem: maximize y = f(xi, W), subject to xmin ≤ xi ≤ xmax, where f represents the objective function (the ANN model), y refers to the BGL activity, xi denotes the medium component concentrations, i indexes the input nodes, and xmin and xmax represent the lower and upper bounds on xi. Using this artificial neural network as the fitness function, the genetic algorithm was implemented to optimize the medium composition for maximum BGL activity. The GA parameter values used in the optimization were: lchr (chromosome length) = 30, pmut (mutation probability) = 0.09, Npop (population size) = 40, pcr (crossover probability) = 0.2 and Nmaxg (number of generations over which the GA evolved) = 50. During GA implementation, the search for optimal solutions was restricted to the ranges (coded [-2, 2] in Table 1) of the four process operating variables TCS, SM, KH2PO4 and (NH4)2SO4. The GA was run five times, and the results at the end of 50 generations for each run, together with the actual experiments based on the GA results, are reported in Table 3. It can be seen in Table 3 that the best sets of medium components are expected to result in a BGL activity of 2.662 U/ml on average. The maximum enzyme activity exhibited by these runs was 2.662 U/ml at the end of 50 generations, with the
experimental BGL activity being 2.679 U/ml, which is in close agreement with the GA-optimized BGL activity. The GA was run for 50 generations, but the best solution was obtained at about the 6th generation (Fig. 2).

Table 3. ANN-GA predicted concentrations of medium components and the predicted and actual BGL activity
GA-optimized   TCS      SM       KH2PO4   (NH4)2SO4   GA-optimized BGL       Experimental BGL
solution       (g/L)    (g/L)    (g/L)    (g/L)       activity (y) (U/mL)    activity (y) (U/mL)
1              26.316   7.056    2.013    2.304       2.661                  2.651
2              26.234   7.025    1.986    2.254       2.658                  2.662
3              26.233   7.021    1.975    2.401       2.674                  2.678
4              26.265   7.104    1.983    2.375       2.674                  2.679
5              26.323   7.126    1.997    2.305       2.632                  2.640
Average        26.274   7.066    1.991    2.328       2.660                  2.662
The average BGL activity of 2.660 U/ml predicted by the GA-ANN model is consistent with the result of the actual experiment. When this T. cylindrosporum syzx4 strain was isolated and characterized, it was found to produce 2.21 U/ml with RSM. The use of the ANN-GA hybrid formalism resulted in a BGL activity of 2.660 U/ml. From the experimental data given in Table 3, it is noted that the maximum BGL activity obtained in the different trial experiments was 2.679 U/ml, an increase of approximately 22%. It is thus seen that the use of artificial intelligence-based modeling and optimization methods can improve the BGL activity significantly.
Fig. 2. Evolution of the best and average fitness (BGL activity) over the 16 generations in the GA
4 Conclusions

In the past few years, artificial intelligence-based optimization techniques have gained great popularity. Without prior knowledge of the interactions of the process variables, ANN-GA differs from conventional optimization in its ability to learn about the system. The data obtained from RSM were processed using ANN-GA to further enhance the BGL activity, and the outcome was validated by an actual experiment. The results showed that the training of an artificial neural network with the experimental data from the T. cylindrosporum syzx4 fermentation was quite successful. The ANN-GA optimized BGL activity of 2.679 U/ml was about 22% higher than that predicted by the RSM model. The ANN-GA approach can accommodate different data inputs, can be used as a viable alternative to the standard RSM approach, and can also be employed for the modeling and optimization of other bioprocess systems. Acknowledgments. This work was supported by the Important Agriculture Program of the Jilin Province Technology Department (Project No. 20096013), Jilin University basic science research fund (No. 200903259), Graduate Innovation Fund of Jilin University (Project 20101043) and Jilin Fuel Alcohol Company Ltd., China.
References 1. Bayer, E.A., Lamed, R., Himmel, M.E.: The potential of cellulases and cellulosomes for cellulosic waste management. Curr. Opin. Biotechnol. 18, 237–245 (2007) 2. Gruno, M., Vaeljamaee, P., Pettersson, G., Johansson, G.: Inhibition of the Trichoderma reesei cellulases by cellobiose is strongly dependent on the nature of the substrate. Biotechnol. Bioeng. 86, 503–511 (2004) 3. Kovarova-Kovar, K., Gehlen, S., Kunze, A., Keller, T., Von Daniken, R., Kolb, M., Van Loon, A.P.G.M.: Application of model-predictive control based on artificial neural networks to optimize the fed-batch process for riboflavin production. J. Biotechnol. 79, 39–52 (2000) 4. Lotfy, W.A., Ghanem, K.M., El Helow, E.R.: Citric acid production by a novel Aspergillus niger isolate. II. Optimization of process parameters through statistical experimental designs. Bioresour. Technol. 98, 3470–3477 (2007) 5. Gangadharan, D., Sivaramakrishnan, S., Nampoothiri, K.M., Sukumaran, R.K., Pandey, A.: Response surface methodology for the optimization of alpha amylase production by Bacillus amyloliquefaciens. Bioresour. Technol. (2007) 6. Almeida, J.S.: Predictive non-linear modeling of complex data by artificial neural networks. Curr. Opin. Biotechnol. 13, 72–76 (2002) 7. Davis, L. (ed.): Handbook of genetic algorithms. Van Nostrand Reinhold, NY (1991) 8. Venkatasubramanian, V., Sundaram, A.: Genetic algorithms: introduction and applications. In: Schleyer, P.V.R., et al. (eds.) Encyclopedia of Computational Chemistry, pp. 1115– 1127. Wiley, Chichester (1998) 9. Mandels, M., Weber, J.: The production of cellulases. Reese ET Cellulases and their application 95, 391–413 (1996)
10. Juhász, T., Szengyel, Z., Réczey, K., Siika-Aho, M., Viikari, L.: Characterization of cellulases and hemicellulases produced by Trichoderma reesei on various carbon sources. Process Biochem. 40, 3519–3525 (2005) 11. Claeyssens, M., Aerts, G.: Characterization of cellulolytic activities in commercial Trichoderma reesei preparations: an approach using small, chromogenic substrates. Bioresour. Technol. 39, 143–146 (1992) 12. Rumelhart, D., Hinton, G., Williams, R.: Learning representations by back-propagating errors. Nature 323, 533–536 (1986)
The Human Computer Interaction Technology Based on Virtual Scene Huimeng Tan, Wenhua Zhu, and Tianpeng Wang CIMS & Robotics Center, Shanghai University, No.149 Yanchang Road, 200072 Shanghai, China
[email protected]
Abstract. Assembly simulation technology based on virtual reality can greatly benefit the optimization of product design as well as the training and coaching of workers. At present, the effect of common assembly simulation is limited, because it pays attention only to the assembly process, ignores the environment, and, like a movie, is not interactive. To change this, we introduce a virtual scene and human-computer interaction technology on the basis of assembly simulation technology, allowing the user, through keyboard and mouse control, to view the details of the assembly simulation in the virtual scene from different angles. Also, through mouse control, parts can be dragged to carry out the assembly, which is more conducive to the training and guidance of employees. Finally, an example of assembly simulation for an airplane horizontal tail is given, and the result proves effective. Keywords: Virtual Reality, Human Computer Interaction, Assembly Simulation, Virtools.
1 Introduction

Virtual reality is a comprehensive information technology that has been very active in recent years. It began to draw public attention in the 1980s and has developed rapidly. It has broad applications in many fields that need computer storage, management and analysis of complex data, such as the military, industry, entertainment, medicine and geographic information systems. Its application is becoming wider and wider, and the fastest growth lies in the manufacturing industry. Assembly simulation, also called assembly process simulation, is an application of virtual reality technology. Assembly simulation is based on the digital definition of products and uses a computer simulation environment for the simulation and analysis of the product assembly process. It is a comprehensive application technology. Assembly simulation can carry out simulation tests according to the actual assembly process of the product and the available resources. Therefore, maintainability and assemblability can be analyzed directly, and related analysis reports can be generated. This is helpful for product design, resource design, assembly process design and assembly line layout design [1].
The content of assembly simulation has two levels. The first level is visualization and interference detection for assembly simulation, namely using the simulation to show the movement and locations of parts during assembly, to detect interference and to give an alarm. The second level is to construct a virtual assembly environment using virtual reality technology. In this environment, operators can perceive the assembly process and its effect through visual, auditory and tactile feedback [2]. The first level has been widely researched; it simulates the assembly process as an animation to help improve the product design and process design, and it also makes it easier for new workers to understand the process. However, this animation-based approach cannot show the details of the process clearly, and its benefit to new workers is limited. To improve this, we construct a virtual scene and use human-computer interaction technology.
2 The Human Computer Interaction Technology

Immersion, interaction and imagination are the basic characteristics of virtual reality technologies, known as the three "I" characteristics [3]. Interaction in virtual reality technology means the exchange of information between human and computer, which is called human-computer interaction. Human-computer interaction realizes human-computer dialogue through computer input and output devices: the computer provides relevant information and requests via output or display equipment, and the human inputs relevant information and answers the requests. Human-computer interaction technology has grown with the development of computer technology. In the history of computer technology, the human-computer interface has undergone several major changes: in the 1940s it consisted of lights and mechanical switches, then moved to terminals and keyboards in the 1960s, and to the graphical user interface (GUI) in the 1980s. Human-computer interaction technology is now becoming more and more natural and harmonious [4]. Gang Zhao, Chao Wang, Wenjun Hou and Yue Jin introduced a virtual assembly system for complex products, proposed a product information management technology based on an algorithm for SOP (structure of parts) data construction, and presented a DOF (degree of freedom) law for matching the rotation centers of models [5]. Rui Wang studied the movement of the virtual hand and the graphics transformation matrix theory and presented an implementation of the virtual hand [6]. Tao Peng, Shiqi Li, Junfeng Wang and Chi Xu proposed a novel augmented interaction technique using a mixture of virtual reality and actual reality, augmenting information and a constraining proxy [7]. Yi Zheng, Ruxin Ning, Chengtong Tang and Jingchang Shangguan proposed a gesture recognition method that synthetically uses data collected by data gloves and 3D trackers [8]. Changming He, Andrew Lewis and Jun Jo proposed a fine control mode, an agent-object interaction method and a two-handed scaling method in their novel human-computer interaction paradigm [9]. Tony Adriaansen, Alex Krumm-Heller and Chris Gunn produced a system that combines 3D graphics visualization, touch (haptics) and audio interaction with a computer
generated model, and also added face-to-face communication to enhance the sense of presence between users [10]. Patrick G. Kenny, Thomas D. Parsons, and Albert A. Rizzo showed that interactive computer-generated characters can be applied in the medical field as virtual patients for clinical training [11]. Assembly simulation based on human computer interaction technology can provide a good reference for the design of a product and of its assembly process, especially when the product is complex and consists of hundreds of thousands of parts and components. It can also provide good guidance for product manufacturing. Take the training of new employees as an example: a conventional simulation can only be watched. With human computer interaction, however, workers can not only watch the simulation but also perform the assembly on the computer graphics interface through input devices such as the keyboard and mouse. People can thus interact with the virtual environment created by the computer and finish the product assembly in the virtual environment by operating the parts one by one according to the simulation they have watched. In this way, the training can be more effective.
3 Implementation of Human Computer Interaction Technology Based on Virtools

Virtools is a well-known software package for virtual reality, developed by the French 3D development solution provider Virtools Corporation. It can be used to create 3D real-time applications and to develop many kinds of multimedia, such as 3D games, virtual roaming systems and virtual training systems, by combining models, pictures, text and audio. Virtools is a powerful and easy-to-use solution with high development efficiency. We developed a virtual interaction platform based on Virtools. Fig. 1 shows the construction of the technology.
Fig. 1. The construction of human computer interaction technology based on virtools
3.1 Establishment of the Simulation Scene

Adding the plant, workshops, grass and trees to the scene makes the simulation more vivid and immersive. Virtools itself cannot create models, so the simulation scene should be built with other software such as 3ds Max, Maya and so on. An appropriate plug-in is needed, such as Virtools.Max.Exporter for 3ds Max; users can then easily convert the 3ds files created by 3ds Max into nmo files and import them into Virtools. To optimize the virtual scene, appropriate materials and textures should be added.

3.2 Import of Product Parts

At present, 3D CAD software is widely used in most companies for product design. The models created by such software can be used directly for simulation, but they should be converted into the 3D XML format first. After that, the 3D XML files can be imported into the virtual scene in Virtools.

3.3 Implementation of the Simulation

Virtools provides many kinds of operations for 2D and 3D models, such as size control, translation, scaling, rotation, color change, and texture and lighting changes. These operations are implemented by scripts, and the scripts are built from Building Blocks (BBs), which are an important feature of Virtools. Virtools provides over 500 BBs. Every BB is a built-in behavior control function with four interfaces: Behavior Input, Behavior Output, Parameter Input and Parameter Output. Therefore, we can use these BBs from the library to control the models conveniently. By calling various kinds of BBs, we can use the keyboard and mouse for human computer interaction. The method is as follows: by pressing UP/DOWN or LEFT/RIGHT on the keyboard, users can move or rotate the camera in the simulation scene, and zooming in/out can be achieved in a similar way, so the details of the simulation can be examined. Fig. 2 shows an example of BBs for human computer interaction via the keyboard.
Fig. 2. BBs for human computer interaction on keyboard
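The BB graphs themselves are assembled in Virtools' visual editor, so no program code appears in the paper. Purely as an illustration of the camera-control logic described above — and not as Virtools script; key names and step sizes are assumptions — a generic Python sketch of the keyboard-to-camera mapping could look like this:

```python
# Illustrative sketch only: maps key presses to camera updates the way the
# UP/DOWN/LEFT/RIGHT Building Blocks do. All names and step sizes are assumptions.
from dataclasses import dataclass

@dataclass
class Camera:
    distance: float = 10.0   # zoom: distance from the assembly
    yaw: float = 0.0         # rotation around the vertical axis (degrees)
    pitch: float = 20.0      # elevation angle (degrees)

def handle_key(camera: Camera, key: str) -> Camera:
    angle, zoom = 5.0, 1.1
    if key == "UP":
        camera.pitch = min(camera.pitch + angle, 89.0)
    elif key == "DOWN":
        camera.pitch = max(camera.pitch - angle, -89.0)
    elif key == "LEFT":
        camera.yaw -= angle
    elif key == "RIGHT":
        camera.yaw += angle
    elif key == "PAGE_UP":     # zoom in
        camera.distance /= zoom
    elif key == "PAGE_DOWN":   # zoom out
        camera.distance *= zoom
    return camera

cam = Camera()
for key in ["LEFT", "LEFT", "UP", "PAGE_UP"]:
    cam = handle_key(cam, key)
print(cam)   # final camera pose after the simulated key presses
```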
For new employees, it is not enough to watch the simulation process on a screen; to obtain a stronger impression and a better learning effect, practice is the optimal way. Through human computer interaction technology, users can perform the assembly themselves in the virtual scene. Fig. 3 shows an example of BBs for the operation of part assembly by mouse. Users click a part displayed in the scene, then seize and drag it to the right place to complete the assembly step. By going through the entire assembly in this way, the practice effect is excellent.
Fig. 3. BBs for operation of part assembly by mouse
Moreover, during practical assembly process design, the assembly parameters are modified repeatedly until the optimal values are found. Through this continuous validation, the best assembly process can be worked out, so we want to control the simulation by parameters. We also want to separate the two phases (control and simulation), which means two PCs are used in the test: one for parameter input and control, and another for displaying the simulation. This can also be achieved with Virtools. Fig. 4 shows the BBs for the operation described above.
Fig. 4. BBs for the control operation
3.4 Output of the Simulation

The files produced by Virtools are small, so they can be saved in cmo format for browsing on a local PC or in vmo format for sharing on the Internet, which enables remote browsing and facilitates learning and guidance for workshop workers.
4 Application of Human Computer Interaction Technology Based on Virtools to an Airplane Horizontal Tail Simulation

We use 3ds Max to create the virtual scene (factory, plant, grass), convert the file into nmo format and import it into Virtools. The models of the horizontal tail are generated by CATIA; they can be converted directly into 3D XML format and imported into the scene. Materials, textures, lights and a camera are added to make the scene more vivid. After that, we release the simulation result on the company intranet: as long as a PC is connected to the intranet, the whole simulation can be viewed through a browser, and interaction with keyboard and mouse is also available, which greatly facilitates viewing and training. Fig. 5 shows the assembly simulation for a horizontal tail.
Fig. 5. Assembly simulation for airplane horizontal tail
5 Conclusion

By applying human computer interaction to a simulation based on a virtual scene, people can use the keyboard and mouse to interact with the virtual environment. Compared with common assembly simulation, the effect is much improved: it not only overcomes the weakness of common simulation, in which the process can only be viewed from the visual angle chosen by the simulation's creator, but also exhibits the simulation more clearly. It also provides a new approach to worker training and extends the scope of digital manufacturing for corporations.
Acknowledgments. This project was supported by the Cooperative Fund of International Science and Technology, Shanghai, China (Project No. 09170712000), partially supported by the Municipal Program of Science and Technology Talents, Shanghai, China (Project No. 09QT1401300), and by Shanghai Leading Academic Discipline Construction (Project No. Y0102). The authors gratefully acknowledge the encouragement of Prof. Ming-lun Fang et al., and the support of colleagues in the CIMS Center of Shanghai University.
References 1. Guo, H.: Application of Assembly Simulation Technology in Aircraft Concurred Design Stage. J. Aeronautical Manufacturing Technology 24, 65–71 (2009) 2. Jiang, Z., Chen, Y.: Realization of Assembly Simulation in Pro/Engineer. J. Automation & Instrumentation 3, 44–47 (2008) 3. Zhu, W.: Virtual Reality Technology and Applications. Intellectual Property Publishing House, Beijing (2007) 4. Gong, W., Ding, M., Jiang, Y., Yu, H.: Study on Human-Machine Interaction Based on VR Technology. J. Mechanical & Electrical Engineering Technology 35(5), 41–43 (2009) 5. Zhao, G., Wang, C., Hou, W., Jin, Y.: Study on Human-Machine Interaction Based on VR Technology. J. Journal of Beijing University of Aeronautics and Astronautics 35(2), 137– 141 (2009) 6. Wang, R.: Development and Application of Human Machine Interaction Simulation System Based on Virtual Reality Technology. Hefei University of Technology, Hefei (2009) 7. Peng, T., Li, S., Wang, J., Xu, C.: Virtual Assembly Based on Augmented HumanComputer Interaction Technology. J. Journal of Computer-Aided Design & Computer Graphics 21(3), 354–361 (2009) 8. Zheng, Y., Ning, R., Tang, C., Shangguan, J.: Human-Computer Interaction in Virtual Assembly Environment. J. Transactions of Beijing Institute of Technology 26(1), 19–22 (2006) 9. He, C., Lewis, A., Jo, J.: A Novel Human Computer Interaction (HCI) Paradigm for Volume Visualization in Projection-based Immersive Virtual Environments. In: Butz, A., Fisher, B., Krüger, A., Olivier, P., Owada, S. (eds.) SG 2007. LNCS, vol. 4569, pp. 49–60. Springer, Heidelberg (2007) 10. Adriaansen, T., Krumm-Heller, A., Gunn, C.: Enhancing Human Computer Interaction in Networked Hapto-Acoustic Virtual Reality Environments on the CeNTIE Network. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2004. LNCS, vol. 3038, pp. 905–912. Springer, Heidelberg (2004) 11. Kenny, P.G., Parsons, T.D., Rizzo, A.A.: Human Computer Interaction in Virtual Standardized Patient Systems. In: Jacko, J.A. (ed.) Human-Computer Interaction, Part IV, HCII 2009. LNCS, vol. 5613, pp. 514–523. Springer, Heidelberg (2009)
ICA-Based Automatic Classification of Magnetic Resonance Images from ADNI Data

Wenlu Yang1,2, Xinyun Chen1, Hong Xie1, and Xudong Huang2

1 Department of Electronic Engineering, Shanghai Maritime University, Shanghai 200135, China
[email protected]
2 Department of Radiology, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
[email protected]
Abstract. This paper proposes a novel method for the automatic classification of magnetic resonance images based on independent component analysis (ICA). The ICA-based method is composed of three steps. First, all magnetic resonance imaging (MRI) scans are aligned and normalized by statistical parametric mapping. Then FastICA is applied to the preprocessed images to extract specific neuroimaging components as potential classification features. Finally, the separated independent coefficients are fed into a classifying machine that discriminates among Alzheimer's disease patients, mild cognitive impairment patients, and control subjects. In this study, the MRI data are selected from the Alzheimer's Disease Neuroimaging Initiative databases. The experimental results show that our method can successfully differentiate subjects with Alzheimer's disease and mild cognitive impairment from normal controls. Keywords: Alzheimer's disease, mild cognitive impairment, magnetic resonance imaging, independent component analysis, support vector machine.
1 Introduction
Alzheimer's disease (AD), a neurodegenerative disorder, is by far the most common cause of dementia associated with aging. To clinically diagnose AD patients at an early stage, structural magnetic resonance imaging (sMRI) [1], one of the biomedical imaging techniques, has been used. Structural MRI studies can detect changes in structure that are able to distinguish AD and mild cognitive impairment (MCI) subjects from normal controls (NC). However, it has been challenging to use fully automated MRI analytic methods to identify potential AD neuroimaging biomarkers [2], and further to diagnose AD or MCI patients. Structural MRI promises to aid in the diagnosis and treatment monitoring of MCI and AD through the facile detection of surrogate biomarkers for disease progression. Studies analyzing sMRI brain scans can generally be categorized into two classes: region-of-interest (ROI) analysis [3] and
whole brain analysis [4][5]. ROI analysis focuses on specific brain regions, especially the hippocampus and the entorhinal cortex [6][7], which show histopathological changes at early stages of AD [8]. ROI analysis of brain structure is considered a gold standard, but it has some drawbacks, such as operator dependency, labor and time intensiveness, and the required a priori choice of regions for investigation [9]. To overcome these shortcomings, some automated methods of measuring whole brain atrophy have been developed, such as voxel-based morphometry (VBM) [4], tensor-based morphometry [10], and source-based morphometry [11]. However, sMRI researchers are often divided as to which analysis technique should be chosen for image analysis: volume statistics [3][6], cortex shape analysis [7], or blind signal separation (BSS)/machine learning (ML) techniques [2][5][12][13]. As one of the BSS/ML techniques, independent component analysis (ICA) has proved to be a powerful method for analyzing neuroimaging data [12][14]. It is a multivariate, data-driven technique that enables an exploratory analysis of MRI datasets to provide useful information about the relationships among voxels in local substructures of the brain. For the diagnosis or classification of AD and MCI patients, the support vector machine (SVM), one of the machine learning techniques, has received increasing attention [5][13]. In the current study, we have applied an ICA-based method coupled with the SVM technique to selected MRI data from the ADNI databases. Our results indicate that the proposed method is able to classify AD and MCI patients from NC subjects with reasonable accuracy.
2 Method

2.1 The Framework of the Proposed Method
The framework of our ICA-based method is shown in Fig. 1. First of all, we normalize all MRI scans to a template [15] and then reconstruct the brain images in which all non-brain voxels are masked out. Next, the normalized brain images are decomposed into MRI basis functions and the corresponding independent coefficients using the FastICA algorithm [16]. Finally, the separated coefficients are fed into an SVM-based classifier for the diagnosis of individuals as AD, MCI or NC.
Fig. 1. The framework of the method for analysis of structural MRI data
2.2 Overview of the Data Set
In this study, we use MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) [1]. The ADNI has recruited over 800 adults aged between 55 and 90 years from approximately 50 sites across the United States and Canada. These include approximately 200 cognitively normal individuals who are followed for 3 years, 400 subjects with MCI who are followed for 3 years, and 200 patients with early AD who are followed for 2 years.
2.3 MRI Data Preprocessing
We preprocessed the MRI images in the ADNI database and normalized the MRI scans of each subject into a standard space defined by the template image T1.nii supplied with the SPM8 toolbox [4]. The detailed configuration included source image smoothing with 8 mm, affine regularization with the ICBM space template, a nonlinear frequency cutoff of 25, 16 nonlinear iterations, and trilinear interpolation. Finally, we extracted the sub-volumes within the bounding box of (−79 ∼ 80, −112 ∼ 79, −74 ∼ 85 in mm) relative to the anterior commissure in the space described in the atlas of Talairach and Tournoux [17]. Therefore, all MRI images were normalized into 160×192×160 voxel-wise images. To further analyze the MRI images, we segmented the whole brain images in the ADNI database into three parts: gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF).
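As a quick check of the stated output size, the bounding box above spans 160, 192 and 160 one-millimetre steps along the three axes (assuming 1 mm isotropic voxels, which is not stated explicitly in the text):

```python
bounding_box = [(-79, 80), (-112, 79), (-74, 85)]   # mm, from the text above
voxel_size = 1                                      # assumption: 1 mm isotropic voxels
dims = [(hi - lo) // voxel_size + 1 for lo, hi in bounding_box]
print(dims)   # [160, 192, 160]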
2.4 Spatial ICA Model for sMRI Images
A key technique in the framework is ICA, whose basic goal is to solve the blind signal separation problem by expressing a set of observations as linear combinations of statistically independent component sources. According to whether the sources are assumed independent over space or time, ICA can be further described as spatial ICA (sICA) [14] or temporal ICA (tICA) [18]. Spatial ICA seeks a set of mutually independent component (IC) source images and a corresponding set of unconstrained time courses. By contrast, tICA seeks a set of IC source time courses and a corresponding set of unconstrained images. For sMRI data, sICA embodies the assumption that each image in X is composed of a linear combination of spatially and statistically independent images. The spatial ICA model for MRI images is shown in Fig. 2, where the data X denotes the voxels of all MRI images, with the voxels of each MRI image arranged into one row of X. S and A are learned from X in an unsupervised manner. Each row in A denotes a base, also called a basis function, or a feature.

Fig. 2. A spatial ICA model for decomposing MRI images
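As a hedged illustration of this decomposition (not the authors' code), scikit-learn's FastICA can be applied to the stacked, flattened scans; running it on the transposed matrix makes the estimated sources spatial maps and the mixing coefficients per-subject features. The array sizes and component count below are placeholders.

```python
import numpy as np
from sklearn.decomposition import FastICA

# placeholder data: 60 subjects, 10000 masked voxels per flattened scan
X = np.random.rand(60, 10000)

ica = FastICA(n_components=30, max_iter=1000, random_state=0)
S = ica.fit_transform(X.T)     # (n_voxels, n_components): spatial basis images
A = ica.mixing_                # (n_subjects, n_components): independent coefficients

# each row of A is one subject's representation in the ICA subspace,
# used later as the feature vector for classification
print(S.shape, A.shape)
```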
2.5 Classification Using Support Vector Machine (SVM)
Support vector machine (SVM) [19], a very popular classifier and one of the machine learning methods based on statistical learning theory, has been recently
used to help distinguish AD subjects from elderly control subjects using anatomical MR imaging [5]. SVM conceptually implements the idea that input vectors are nonlinearly mapped to a very high-dimensional feature space. In this feature space, a linear separating plane is constructed that separates the training data by maximizing the margin between the vectors of the two classes. We used LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm) as the classifier to diagnose AD or MCI subjects versus normal controls.
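A minimal sketch of this classification step — using scikit-learn's SVC, which wraps LIBSVM, rather than the LIBSVM command-line tools, and with placeholder features and labels — could look like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# placeholder data: rows of A are ICA coefficients, y: 0 = NC, 1 = AD
A = np.random.rand(120, 30)
y = np.random.randint(0, 2, size=120)

# 90% / 10% split as in one of the settings reported in Table 1
A_tr, A_te, y_tr, y_te = train_test_split(A, y, test_size=0.10,
                                          stratify=y, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(A_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(A_te)).ravel()
print("accuracy    :", (tp + tn) / (tp + tn + fp + fn))
print("sensitivity :", tp / (tp + fn))
print("specificity :", tn / (tn + fp))
```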
3 Results
We have performed several analyses using the MRI images from the ADNI databases to verify the performance of the proposed method, including feature extraction and the discrimination of AD and MCI subjects from normal controls.
3.1 Feature Extraction
Using the FastICA algorithm to decompose the brain images, we obtain the MRI image basis functions shown in Fig. 3. From these bases we can easily notice that each base codes only a local part of the brain MRI images, and different bases locally code different parts of the brain. If a corresponding coefficient is significant, we can conclude that the base is more important than the others for that individual MRI scan.

Fig. 3. Examples of learned MRI image bases from MRI data with AD and NC
3.2 Classification of Whole Brain Images
After image normalization using SPM8 and feature extraction using ICA, the corresponding independent component coefficients are considered as the representations of the MRI images in the subspace spanned by the independent components, and they are classified into two groups, AD vs. NC and MCI vs. NC, using an SVM classifier. As the SVM classifier is based on statistical learning theory, we divided all MRI image data (202 AD, 410 MCI and 236 NC) into two sets: a training set and a testing set. Taking the influence of the number of training samples on
classification accuracy into account, two ratios of training sets are compared: 75% training set vs. 25% testing set, and 90% training set vs. 10% testing set. All training sets were randomly selected from the ADNI MRI database with AD, MCI, and NC subjects. The best classification accuracy is shown in Table 1.
Table 1. Classification accuracy of ADNI whole brain images (%)

Training set   Class         AD vs. NC   MCI vs. NC
75             Class. rate   78.4        71.2
               Sensitivity   70.6        69.5
               Specificity   86.3        72.9
90             Class. rate   85.7        79.2
               Sensitivity   90.5        83.3
               Specificity   80.9        75.0

3.3 Classification of the GM Images
Only the gray matter of the brain was analyzed to check its degree of significance in discriminating AD and MCI from NC. First, the whole brain MRI images were segmented by the segment module in SPM8 into three parts: gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). In this study we mainly analyzed the GM images. Then, the segmented GM images were decomposed by the aforementioned FastICA algorithm. The resulting independent coefficients were viewed as the representations in the ICA subspace and fed into the SVM classifier for discriminating AD and MCI from NC. The experimental setup was the same as for the whole MRI images. The classification results are shown in Table 2.
Table 2. Classification accuracy of ADNI GM images (%)

Training set   Class         AD vs. NC   MCI vs. NC
75             Class. rate   82.4        69.5
               Sensitivity   78.4        66.1
               Specificity   86.3        72.9
90             Class. rate   88.9        81.3
               Sensitivity   80.9        91.7
               Specificity   95.8        70.8

3.4 Comparison with Related Works
Our results on the ADNI database are also comparable with those presented by Gutman et al. [20] and Chupin et al. [6]. Gutman et al. used an SVM classifier based on spherical harmonics to classify 49 AD patients and 63 controls with 75.5% sensitivity, 87.3% specificity, and 82.1% overall correctness. Chupin et al. [6] proposed a fully automatic method using probabilistic and anatomical priors for hippocampus segmentation, and further used the obtained hippocampal volumes to automatically discriminate among AD patients, MCI patients, and elderly controls. They obtained an 82% classification rate, 75% sensitivity, and 89% specificity for classification between 29 AD and 30 NC subjects. In their experiments on 67 AD, 143 MCI, and 123 NC subjects aged between 70 and 80 years, the classification rates were 76-80% for AD vs. NC and 61-63% for MCI vs. NC.
4 Conclusion
Herein we have demonstrated that the fully automatic ICA-based method for the classification of MRI images is very useful in discriminating among AD, MCI, and NC subjects, and this study is comparable with other related works. ICA is a data-driven, multivariate, unsupervised method with the advantage of using no a priori information. It has become an increasingly popular biomedical data-mining technique as well as a processing method for functional and structural MRI data. To our knowledge, there are few reports on the application of ICA to structural MRI data from AD patients. However, ICA might also be a useful tool for early AD diagnosis based on sMRI data analysis, because it has shown its usefulness in processing sMRI data from schizophrenia patients [11]. Therefore, in this study we have applied ICA to the analysis of AD-related sMRI data. Experimental results on MRI data from the ADNI databases indicate that the proposed ICA-based method is a useful tool for classifying AD, MCI, and NC data. In conclusion, our study has shown that the proposed ICA-based method is useful for classifying AD and MCI patients versus normal controls. However, the achieved classification accuracy is still not optimal, due to several factors. First,
the ADNI is a multicenter database (approximately 50 centers using different voxel sizes and acquisition parameters), and we did not take scanner or center effects into account. Next, potential brain vascular lesions in the subjects may be a confounding factor. Third, due to the large number of images, we have not manually validated the normalization quality of the MRI images. Finally, the factors of age and gender have not been taken into account, and we have not yet examined these data in the context of amyloid burden in these subjects using CSF markers. All of these aspects may influence the final classification accuracy. Further, we would like to compare our results with those of other published semi-automated methods. Our main goal in the current study is to verify the performance of the proposed ICA-based method. Much work lies ahead, however. We have presented the basis functions, or features, but questions remain unanswered, such as: What is the exact meaning of these features? Which features are more important? How many features are related to AD? What are the effects of age, gender, and amyloid pathology on the features? Therefore, our future work will focus on answering these questions. Moreover, the ADNI provides follow-up MRI data, and applying the proposed method to the longitudinal analysis of MRI images is also our next step.
Acknowledgements. The authors would like to express their gratitude for the support from the Shanghai Maritime University Foundation (Grant No. 20080471, 20090175), the Science and Technology Commission of Shanghai Municipality (Grant No. 09511502502), the Ministry of Transport of the People's Republic of China (2010318810019), the National Natural Science Foundation of China (Grant No. 60905056), the NIA/NIH (5R21AG0-28850), the Alzheimer's Association (IIRG-07-60397), and the research funds from the Radiology Department of Brigham and Women's Hospital (BWH). ADNI data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (National Institutes of Health Grant U01 AG024904).
References 1. Jack, C.R., Bernstein, M.A., et al.: The Alzheimer’s Disease Neuroimaging Initiative (ADNI): MRI methods. J. Magn. Reson. Imaging 27(4), 685–691 (2008) 2. Walhovd, K.B., Fjell, A.M., et al.: Combining MR Imaging, Positron-Emission Tomography, and CSF Biomarkers in the Diagnosis and Prognosis of Alzheimer Disease. AJNR Am. J. Neuroradiol. (2010) 3. Jack, C.R., Petersen, R.C., Obrien, P.C., Tangalos, E.G.: MR-based hippocampal volumetry in the diagnosis of Alzheimers-disease. Neurology 42(1), 183–188 (1992) 4. Ashburner, J., Friston, K.J.: Voxel-based morphometry - The methods. Neuroimage 11(6), 805–821 (2000) 5. Magnin, B., Mesrob, L., et al.: Support vector machine-based classification of Alzheimer’s disease from whole-brain anatomical MRI. Neuroradiology 51(2), 73– 83 (2009)
6. Chupin, M., Gerardin, E., et al.: Fully automatic hippocampus segmentation and classification in Alzheimer’s disease and mild cognitive impairment applied on data from ADNI. Hippocampus 19(6), 579–587 (2009) 7. Zhou, L., Lieby, P., et al.: Hippocampal shape analysis for Alzheimer’s disease using an efficient hypothesis test and regularized discriminative deformation. Hippocampus 19(6), 533–540 (2009) 8. Braak, H., Braak, E.: Staging of Alzheimer-related cortical destruction. Int. Psychogeriatr 9(suppl. 1), 257–261 (1997); discussion 69-72 9. Giesel, F.L., Thomann, P.A., et al.: Comparison of manual direct and automated indirect measurement of hippocampus using magnetic resonance imaging. European Journal of Radiology (66), 268–273 (2008) 10. Hua, X., Leow, A.D., et al.: Tensor-based morphometry as a neuroimaging biomarker for Alzheimer’s disease: an MRI study of 676 AD, MCI, and normal subjects. Neuroimage 43(3), 458–469 (2008) 11. Xu, L., Pearlson, G., Calhoun, V.D.: Joint source based morphometry identifies linked gray and white matter group differences. Neuroimage 44(3), 777–789 (2009) 12. Xu, L., Groth, K.M., et al.: Source-based morphometry: the use of independent component analysis to identify gray matter differences with application to schizophrenia. Hum. Brain Mapp. 30(3), 711–724 (2009) 13. Kloppel, S., Stonnington, C.M., et al.: Automatic classification of MR scans in Alzheimer’s disease. Brain 131(Pt. 3), 681–690 (2008) 14. McKeown, M.J., Sejnowski, T.J.: Independent component analysis of fMRI data: Examining the assumptions. Human Brain Mapping 6(5-6), 368–372 (1998) 15. Marcus, D.S., et al.: Open access series of imaging studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. Journal of Cognitive Neuroscience 19(9), 1498–1507 (2007) 16. Hyvarinen, A.: Fast and robust fixed-point algorithms for independent component analysis. IEEE Transactions on Neural Networks 10(3), 626–634 (1999) 17. Talairach, J., Tournoux, P.: Co-planar Stereotaxic Atlas of the Human Brain. Thieme Medical, New York (1988) 18. Calhoun, V.D., Adali, T., et al.: Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related waveforms. Human Brain Mapping 13(1), 43–53 (2001) 19. Cortes, C., Vapnik, V.: SUPPORT-VECTOR NETWORKS. Machine Learning 20(3), 273–297 (1995) 20. Gutman, B., Wang, Y., et al.: Disease classification with hippocampal shape invariants. Hippocampus 19(6), 572–578 (2009)
Label Propagation Algorithm Based on Non-negative Sparse Representation

Nanhai Yang, Yuanyuan Sang, Ran He, and Xiukun Wang

Department of Computer Science and Technology, Dalian University of Technology, 116024 Dalian, China
Abstract. Graph-based semi-supervised learning plays an important role in the semi-supervised learning area. This paper presents a novel label propagation algorithm based on nonnegative sparse representation (NSR) for bioinformatics and biometrics. First, we construct a sparse probability graph (SPG) whose nonnegative weight coefficients are derived by a nonnegative sparse representation algorithm. The weights of the SPG naturally reveal the clustering relationship of labeled and unlabeled samples and, at the same time, automatically select an appropriate adjacency structure, in contrast to traditional semi-supervised learning algorithms. The labels of the unlabeled samples are then propagated until the algorithm converges. Extensive experimental results on biometrics, UCI machine learning and TDT2 text datasets demonstrate that the label propagation algorithm based on NSR outperforms the standard label propagation algorithm. Keywords: biometrics, nonnegative sparse representation, semi-supervised learning, sparse probability graph, label propagation.
1 Introduction

In bioinformatics and biometrics, one often faces a lack of sufficient labeled data, whereas large numbers of unlabeled data can easily be obtained. Consequently, semi-supervised learning methods have been proposed, which aim to learn from both labeled and unlabeled samples to boost algorithmic performance. In recent years, a prominent achievement in the semi-supervised learning area has been the development of the graph-based semi-supervised learning strategy [8], which models the whole data set as a graph G = (V, E), where V is the vertex set and E is the edge set. The basic assumption behind graph-based semi-supervised learning is the cluster assumption [13]: 1) nearby points are likely to have the same label, and 2) points on the same structure (such as a cluster or a sub-manifold) are likely to have the same label. Note that the first assumption is local, whereas the second one is global. The cluster assumption implies that we should consider both the local and the global information contained in the data set during graph construction. Based on graph-based semi-supervised learning, Wang and Zhang proposed a method called Linear Neighborhood Propagation (LNP) [8], which is much more stable with respect to the variation of the neighborhood size, and proposed a preprocessing
method that can automatically remove the bridge points from the data set. However, LNP still requires manual setting of the parameter k and is sensitive to the graph parameters. Second, the algorithm only uses the local information of the samples and cannot sufficiently reveal their clustering relationship, which affects the performance and efficiency of learning. Furthermore, LNP adopts a heuristic method to remove bridge points and weaken noise, which lacks a theoretical basis. In this paper, inspired by sparse signal representation via l1 optimization [6][7] and nonnegative sparse representation (NSR) [9], we propose a label propagation method based on NSR and robust NSR [9] for semi-supervised learning. The method first assumes that each sample can be linearly reconstructed from a sparse representation of the remaining samples, and thus constructs a sparse probability graph (SPG) whose nonnegative weight coefficients are derived by the nonnegative sparse representation algorithm (or the correntropy-based nonnegative sparse representation algorithm [9]). The weights of the SPG naturally reveal the clustering relationship of labeled and unlabeled samples, and meanwhile avoid the adjacency selection and parameter setting of traditional semi-supervised learning algorithms. Then the labels of the unlabeled samples are propagated on the SPG until convergence. Extensive experimental results demonstrate the effectiveness and efficiency of the proposed approach.
2 Traditional Graph-Based Semi-supervised Learning Algorithm

For a graph-based semi-supervised learning problem, we assume the training samples are X = [x_1, x_2, ..., x_n], x_i ∈ R^d, where n is the total number of training samples. Traditional graph construction methods usually build the graph in two steps: graph adjacency construction and graph weight calculation. For adjacency construction, there exist two popular methods [2]: the ε-ball neighborhood and k-nearest neighbors. Both need a parameter to be set first and then measure the nearest neighbors by the usual Euclidean distance. For weight calculation, there exist three frequently used approaches:

1) Gaussian Kernel:

   W_ij = exp(−||x_i − x_j||^2 / δ)      (1)

   Zhou's KNN algorithm [13] adopts the Gaussian function for weight calculation.

2) Inverse Euclidean Distance [5]: W_ij = ||x_i − x_j||^(−1), where ||·|| is the Euclidean distance.

3) Local Linear Reconstruction Coefficient [10]: Roweis et al. proposed to utilize the local linear reconstruction coefficients as graph weights and minimize the l2 reconstruction error defined as

   ξ(W) = Σ_i ||x_i − Σ_j W_ij x_j||^2   s.t. Σ_j W_ij = 1      (2)
An informative and discriminative graph plays an important role in graph-based machine learning methods. For a machine learning task, three characteristics of an informative graph are often desirable [7]: high discriminating power, sparsity and an adaptive neighborhood. Traditional semi-supervised learning methods cannot meet these characteristics. They all (in general) use a fixed global parameter to determine the neighborhoods of all the samples, and thus do not handle situations where an adaptive neighborhood is required. Moreover, the adjacency structure of the graph is already fixed during the first step, and the subsequent graph weight calculation step is constrained by these neighborhood relations [2]. Adjacency construction and weight calculation are associated with each other, yet traditional algorithms often separate them. It is therefore necessary to develop a parameter-free semi-supervised learning method that can construct an informative and discriminative graph.
3 Label Propagation Algorithm Based on NSR

3.1 SPG Construction

In recent years, the nonnegative sparse representation problem [3][14] has attracted more and more attention in both theoretical research and practical applications. Bruckstein et al. [14] and Donoho et al. [3] indicated that the nonnegative sparse representation problem can be described by l1 optimization as follows:

min_{w_i} ||w_i||_1   s.t. X_î w_i = x_i, w_i ≥ 0      (3)

where X is a matrix whose columns are the n training samples, and X_î is the matrix obtained from X by removing its i-th column x_i. We add the first constraint of Eq. (3) to the objective function in the form of a Lagrange factor and obtain the following nonnegative sparse problem:

Σ_i min_{w_i} ||x_i − X_î w_i||_2^2 + λ ||w_i||_1   s.t. w_ij ≥ 0      (4)

We construct an SPG by Eq. (4). The SPG assumes that each data point can be linearly reconstructed by a nonnegative sparse representation of the training data. For each sample x_i, the nonnegative sparse vector w_i deduced in Eq. (4) describes how the remaining samples contribute to the sparse representation of x_i, so it can essentially reveal the clustering relationship among samples. We further set λ = 0 to develop a more practical and universal graph construction method and obtain the following NSR model, which can be computed by nonnegative least squares [1]:

Σ_i min_{w_i} ||x_i − X_î w_i||_2^2   s.t. w_ij ≥ 0      (5)

In practical applications, variation of illumination as well as occlusions and corruptions may affect the classification performance. We therefore further propose an information-theoretic, correntropy-based NSR model. In information-theoretic learning, correntropy is a generalized similarity measure between two arbitrary random variables A and B, and the sample-based correntropy [4] is defined as follows:

V_{σ,m}(A, B) = (1/n) Σ_{i=1}^{n} g(a_j − b_i, σ)      (6)

In the definition Eq. (6) of correntropy, we set a_j = x_i and b_i = X_î w_i to obtain the correntropy-based NSR model [9] for each sample:

max_{w_i} Σ_{k=1}^{d} g(x_ik − (X_î w_i)_k) − λ Σ_{j=1}^{n−1} w_ij   s.t. w_ij ≥ 0      (7)

where g(x) = exp(−||x||^2 / (2σ^2)) is a Gaussian kernel function, x_ik is the k-th entry of the vector x_i, and (X_î w_i)_k is the k-th entry of the vector X_î w_i. The Gaussian function is actually a robust function. By the properties of correntropy, if a sample x_i is a bridge point or an outlier, it cannot be well represented by the linear model X_î w_i and is far from X_î w_i. Thus x_i obtains a low value under the Gaussian function and contributes less to the objective function in semi-supervised learning, so we can obtain a robust weight matrix from Eq. (7). The weight w_ij directly characterizes how similar the data point x_i is to x_j and hence the relationship of the data. We still expect the values of the weights to reflect the probability that x_j and x_i belong to the same class, so we normalize the weights by Eq. (8) such that Σ_{j=1}^{n} W(i, j) = 1:

W(i, j) = w_ij / Σ_{j'=1}^{n−1} w_ij'        if j < i
W(i, j) = w_i(j−1) / Σ_{j'=1}^{n−1} w_ij'    if j > i
W(i, j) = 0                                  if j = i      (8)
The construction of the SPG differs from that of LNP and of traditional graphs. First, traditional graphs have certain parameters which require manual setting. These parameters are usually determined empirically and significantly affect the adjacency relationship, the sparsity and the classification performance. The LNP algorithm selects neighbors by the k-nearest neighbors method and, in general (k ≠ n, where n is the total number of training samples), only reflects the local information of the sample space rather than the global information, so it does not meet the cluster assumption, which requires learning the structure of the samples from both a local and a global view. When k = n, LNP
reflects the global information of the data set and yields a sparse matrix, but the computational cost of LNP is then larger than that of SPG. Second, SPG selects neighbors and calculates weights simultaneously by the NSR and robust NSR algorithms, which naturally determines the sparsity of the graph and avoids separating adjacency construction from weight calculation. Finally, compared with the heuristic method of LNP for removing bridge points and outliers, the robust NSR algorithm is based on correntropy and has a better theoretical basis.

3.2 Label Propagation Based on NSR

Let X represent a set of n samples, where the first l points are labeled and the remaining n − l points X_U are unlabeled. After the graph has been constructed, we can make use of it to predict the labels of the unlabeled data. In this section, we adopt a label propagation method to iteratively propagate the labels of the labeled data to the remaining unlabeled data X_U on the constructed graph. Let f denote the set of classifying functions defined on X. Suppose there are c classes, and the label set is L = {1, 2, ..., c}. In each propagation step, each sample obtains its label partly from the label information of its neighbors and meanwhile retains some label information from its initial value. Therefore, the label of x_i at time m + 1 becomes

f_i^{m+1} = α Σ_{j: x_j ∈ N(x_i)} w_ij f_j^m + (1 − α) t_i      (9)

where 0 < α < 1 is the fraction of label information that x_i receives from its neighbors. Let t = (t_1, t_2, ..., t_n)^T, where t_ij = 1 if x_i is labeled as j and t_ij = 0 otherwise, and for unlabeled points x_u, t_uj = 0 (1 ≤ j ≤ c). f^m = (f_1^m, f_2^m, ..., f_n^m)^T is the predicted label vector at iteration m, and f^0 = t. Then we can rewrite Eq. (9) as

f^{m+1} = α W f^m + (1 − α) f^0      (10)
We use Eq. (10) to update the labels of each data point until convergence, which means that the predicted labels f^m no longer change over several successive iterations. The final f^m is the limit of the above sequence, f* = (1 − α)(I − αW)^{−1} f^0, which corresponds to a unique classification on X that labels x_i as y_i = arg max_{j≤c} f*_ij, where I is the n × n identity matrix.

Theorem 1. The label propagation algorithm based on NSR converges.
Proof. By Theorem 1 of [8], since w_ij ≥ 0 and Σ_{j=1}^{n} W(i, j) = 1, the label propagation algorithm based on NSR converges. Moreover, the sequence f^m in Eq. (10) converges to f* = (1 − α)(I − αW)^{−1} f^0.
We summarize the main procedure of the label propagation algorithm based on NSR as follows (an illustrative code sketch follows the algorithm).

Algorithm 1. Label propagation algorithm based on NSR.
Step 1. Input X = [x_1, x_2, ..., x_l, x_{l+1}, ..., x_n] ∈ R^d, where {x_i}_{i=1}^{l} are labeled, {x_u}_{u=l+1}^{n} are unlabeled, and the constant α.
Step 2. For i = 1 to n do
  a) Let X_î = {x_1, x_2, ..., x_{i−1}, x_{i+1}, ..., x_n};
  b) Construct the neighborhood and weight graph by solving the least squares problem based on NSR in Eq. (5) (if there are outliers in the dataset, solve the robust NSR model in Eq. (7)).
End For
Step 3. Normalize the weights by Eq. (8) and obtain the weight matrix W.
Step 4. Iterate f^{m+1} = αWf^m + (1 − α)f^0 until it converges to the limit f*.
Step 5. Output the label of each data point x_i as y_i = arg max_{j≤c} f*_ij.
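Purely as an illustration of Algorithm 1 — using toy data, SciPy's nonnegative least squares for Step 2, and only the non-robust model of Eq. (5); this is not the authors' implementation — the whole procedure can be sketched as follows:

```python
import numpy as np
from scipy.optimize import nnls

def spg_weights(X):
    """Steps 2-3: sparse probability graph from Eq. (5) with the normalization of Eq. (8).
    X: (d, n) matrix whose columns are the samples."""
    d, n = X.shape
    W = np.zeros((n, n))
    for i in range(n):
        X_rest = np.delete(X, i, axis=1)           # X with the i-th column removed
        w, _ = nnls(X_rest, X[:, i])               # min ||x_i - X_rest w||, w >= 0
        s = w.sum()
        if s > 0:
            w = w / s                              # rows of W sum to one
        W[i] = np.insert(w, i, 0.0)                # W(i, i) = 0, as in Eq. (8)
    return W

def propagate(W, labels, alpha=0.4, n_iter=500):
    """Steps 4-5: label propagation; alpha = 0.4 as in the experiments below.
    labels: length-n array with a class index for labeled points and -1 otherwise."""
    n, c = len(labels), labels.max() + 1
    f0 = np.zeros((n, c))
    f0[labels >= 0, labels[labels >= 0]] = 1.0     # initial label matrix t
    f = f0.copy()
    for _ in range(n_iter):
        f = alpha * W.dot(f) + (1.0 - alpha) * f0  # Eq. (10)
    return f.argmax(axis=1)                        # y_i = arg max_j f*_ij

# toy example: two well-separated clusters, one labeled point per class
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(0.0, 0.1, (5, 10)), rng.normal(3.0, 0.1, (5, 10))])
labels = np.full(20, -1)
labels[0], labels[10] = 0, 1
print(propagate(spg_weights(X), labels))
```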
4 Experimental Verification

4.1 Database Setting

To fairly evaluate the proposed label propagation algorithm based on nonnegative sparse representation and robust nonnegative sparse representation, we compare it with LNP and the traditional KNN algorithm on semi-supervised learning tasks. We select five data sets from different machine learning applications, with different formats of features: ORL¹, YALE², UCI_Aust³, UCI_Heart⁴, and TDT2_100⁵. The ORL database contains ten facial images for each of 40 distinct subjects. The Yale face database contains 165 grayscale facial images of 15 individuals, with 11 images per subject. UCI_Aust and UCI_Heart are two practical applications from the UCI repository with real-valued data: the UCI_Aust dataset consists of features of Australian Sign Language signs, and the UCI_Heart dataset is used to diagnose and analyze heart disease. The TDT2 dataset is used for text classification; we select 100 documents from each of the top 9 categories for training. Note that the feature vectors in this dataset are sparse.

4.2 Algorithm Setting

We compare the two proposed algorithms with the KNN and LNP methods. They adopt different models and strategies for graph construction. After the graph is constructed, the label propagation processes are similar, with the constant α = 0.4.
1 http://www.face-rec.org/databases/
2 http://cvc.yale.edu/projects/yalefaces.html
3 http://archive.ics.uci.edu/ml/datasets/Australian+Sign+Language+signs
4 http://archive.ics.uci.edu/ml/datasets/Heart+Disease
5 http://www.nist.gov/speech/tests/tdt/tdt98/index.htm
1) KNN algorithm. The graph is constructed by k-nearest neighbors and the Gaussian kernel function in Eq. (1); we set k = 20 and δ = 0.01.
2) LNP algorithm. The graph is constructed by k-nearest neighbors, the weights are calculated by Eq. (2) with the additional constraint w_ij ≥ 0, and k = 20.
3) NSR1 (nonnegative sparse representation algorithm). The graph is constructed by the least squares model in Eq. (5) in a parameter-free way and normalized by Eq. (8). We solve the model by [11][12].
4) NSR2 (robust nonnegative sparse representation algorithm). The graph is constructed by the correntropy-based robust nonnegative sparse representation model in Eq. (7) and normalized by Eq. (8).
4.3 Experimental Results and Analysis

To fairly compare NSR1 and NSR2 against the traditional LNP and KNN algorithms, all experiments were repeated 50 times. In each run, a number of samples were selected randomly for training. Table 1 shows the classification error rates of the semi-supervised algorithms.

Table 1. Semi-supervised classification error rates of different methods
Dataset     Training set   NSR2        NSR1        LNP         KNN
Orl_ind     50%            16.0 ± 2.9  17.4 ± 3.2  18.3 ± 3.3  23.2 ± 4.2
            60%            11.7 ± 3.1  13.2 ± 2.9  14.1 ± 3.0  20.0 ± 4.4
            80%             6.2 ± 3.1   6.9 ± 3.1   8.2 ± 4.0  15.8 ± 4.2
Yale_ind    50%            32.2 ± 4.6  34.5 ± 5.1  36.4 ± 5.2  38.9 ± 5.4
            60%            26.3 ± 5.5  30.2 ± 4.5  32.0 ± 4.4  37.6 ± 5.3
            80%            21.3 ± 5.1  22.3 ± 5.4  24.8 ± 8.0  33.6 ± 6.7
UCI_Aust    50%            25.6 ± 2.6  26.4 ± 2.8  28.8 ± 1.9  30.2 ± 1.8
            60%            24.2 ± 2.8  25.1 ± 2.6  27.6 ± 1.4  29.3 ± 2.4
            80%            22.4 ± 3.5  23.4 ± 3.0  25.6 ± 3.5  28.7 ± 3.8
UCI_Heart   50%            25.6 ± 3.8  27.0 ± 2.7  31.3 ± 3.5  34.0 ± 3.4
            60%            24.4 ± 3.7  26.0 ± 3.7  30.1 ± 3.7  32.8 ± 4.2
            80%            22.1 ± 6.3  23.8 ± 4.6  28.3 ± 5.4  31.3 ± 6.1
TDT2_100    50%            24.7 ± 2.0  14.2 ± 1.6  15.0 ± 1.7  18.4 ± 1.7
            60%            22.5 ± 1.8  12.9 ± 2.0  14.0 ± 1.7  16.8 ± 2.0
            80%            18.9 ± 3.0   9.8 ± 2.1  11.9 ± 2.3  16.1 ± 2.9
We observe from the numerical results that: 1) On the face recognition and UCI datasets, NSR1 and NSR2 achieve lower error rates than LNP and KNN. This is because we utilize both the local and the global information contained in the data set and obtain more effective graphs for label propagation. When there are illumination variations or occlusions in the dataset, such as partly occluded or blurred images, or mislabeled samples, the robust algorithm NSR2 uses the Gaussian function to assign small weights to the outliers and meanwhile
distinguishes and removes outliers, so NSR2 outperforms NSR1 in classification error rate. 2) On the TDT2_100 dataset, NSR1 still achieves a lower error rate than LNP and KNN. However, NSR2 has a higher error rate than the other three algorithms. This is because there are almost no outliers in the text dataset; on the other hand, the features in TDT2 are sparse, and when the data consist of many sparse vectors, NSR2 emphasizes the sparse vectors and assigns them high weights through the Gaussian function, so its performance is lower.

4.4 Sparsity Analysis

NSR1, NSR2 and LNP can all achieve a sparse code in learning. However, because their classification accuracies differ, we still need to investigate how sparse a graph should be for semi-supervised classification. We measure the sparsity of the different algorithms by the l0 norm. The l0 norm of a graph is defined as
(1/n) Σ_{i=1}^{n} ||W(i, :)||_0      (11)
where W(i, :) indicates the i-th row of the weight matrix W. We can observe from Eq. (11) that the l0 norm can be seen as the average number of neighbors of each sample in the SPG. Fig. 1 shows the sparsity of the different methods measured by the l0 norm on the five datasets. We observe from the figure that: 1) since we set the number of neighbors k = 20, the l0 norm of the KNN method is 20 on all five datasets; 2) the l0 norm of NSR1 and NSR2 is larger than that of LNP, so LNP appears to learn a sparser code than the algorithms proposed in this paper.
Fig. 1. The sparsity of different methods
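For completeness, the l0 measure of Eq. (11) is straightforward to compute from any weight matrix; a short NumPy version (with a tiny made-up W for illustration) is:

```python
import numpy as np

def l0_sparsity(W):
    # Eq. (11): average number of nonzero entries (i.e., neighbors) per row of W
    return np.count_nonzero(W, axis=1).mean()

W = np.array([[0.0, 0.7, 0.3], [0.5, 0.0, 0.5], [1.0, 0.0, 0.0]])
print(l0_sparsity(W))   # ~1.67 neighbors per sample on average
```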
Since NSR1 and NSR2 consider both the local and the global information contained in the dataset to obtain an adaptive neighborhood and the clustering relationship, the SPG can reveal the intrinsic relationships in the data. From the l0 norm of the KNN algorithm, we observe that the manually set k is larger than the optimal value on the first four datasets and smaller on the remaining one, which further shows how difficult it is to set k in different applications. LNP only makes use of the local
information by manually setting k, and constructs a sparser weight graph which cannot fully reflect the clustering relationship of the samples. Hence LNP achieves lower classification performance than NSR1 and NSR2 in some respects.
5 Conclusion

This paper presents a novel label propagation algorithm for semi-supervised learning, which assumes that each data point can be linearly reconstructed by a nonnegative sparse representation of the training data, and then constructs an SPG from the local and global information of the samples. The nonnegative weight coefficients of the graph are derived by the NSR algorithm and the correntropy-based NSR algorithm, which naturally reveal the clustering relationship of labeled and unlabeled samples in a parameter-free way. The correntropy-based NSR algorithm can weaken the effect of outliers and is more robust. Finally, the labels of the unlabeled samples are propagated until convergence. Experimental results show the advantage of the label propagation algorithm based on NSR.

Acknowledgments. This work was supported by DUT R & D start-up funds.
References 1. He, R., Hu, B.G., Zheng, W.S., Guo, Y.Q.: Two-stage Sparse Representation for Robust Recognition on Large-scale Database. In: Twenty-Fourth AAAI Conference on Artificial Intelligence (2010) 2. Yan, S.C., Wang, H.: Semi-supervised Learning by Sparse Representation. In: SIAM International Conference on Data Mining SDM, pp. 792–801 (2009) 3. Donoho, D.L., Tanner, J.: Sparse Nonnegative Solution of Underdetermined Linear Equations by Linear Programming. Proc. of the National Academy of Sciences of the United States of America (2005) 4. Liu, W.F., Pokharel, P.P., Principe, J.C.: Correntropy: Properties and Applications in NonGaussian Signal Processing. IEEE Transactions on Signal Processing 55(11), 5286–5298 (2007) 5. Cortes, C., Mohri, M.: On transductive regression. In: Neural Information Processing Systems, NIPS (2007) 6. Donoho, D.: Compressed sensing. IEEE Trans. on Information Theory 52(4), 1289–1306 (2006) 7. Wright, J., Ma, Y., Mairal, J., Sapiro, G., Huang, T., Yan, S.: Sparse representation for computer vision and pattern recognition. In: Proc. of IEEE (2009) 8. Wang, F., Zhang, C.S.: Label propagation through linear neighborhoods. IEEE Trans. on knowledged and data engineering 20(1), 55–67 (2008) 9. He, R., Zheng, W.S., Hu, B.G.: Maximum correntropy criterion for robust face recognition. Submitted to IEEE Trans. on Pattern Analysis and Machine Intelligence (2009) 10. Roweis, S., Saul, L.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
11. Bjorck, A.: A direct method for sparse least-squares problems with lower and upper bounds. Numerische Mathematik 54(1), 19–32 (1988) 12. Portugal, L.F., Judice, J.J., Vicente, L.N.: A comparison of block pivoting and interiorpoint algorithms for linear least squares problems with nonnegative variables. Mathematics of Computation 63(208), 625–643 (1994) 13. Zhou, D., Bousquet, O., Lal, T., Weston, J., Schoelkopf, B.: Learning with local and global consistency. In: Neural Information Processing Systems, NIPS (2004) 14. Bruckstein, A.M., Elad, M., Zibulevsky, M.: On the Uniqueness of Nonnegative Sparse Solutions to Underdetermined Systems of Equations. IEEE Trans. on Information Theory 54(11), 4813–4820 (2008)
Multiple Sequence Alignment by Improved Hidden Markov Model Training and Quantum-Behaved Particle Swarm Optimization

Chengyuan Li1, Haixia Long2, Yanrui Ding1, Jun Sun1, and Wenbo Xu1

1 School of IoT Engineering, Jiangnan University, Lihu Road 1800, 214122, Wuxi, Jiangsu, China
2 School of Education, Jiangnan University, Lihu Road 1800, 214122, Wuxi, Jiangsu, China
Abstract. Multiple sequence alignment (MSA), known to be NP-complete, is one of the basic problems in computational biology. At present, the profile hidden Markov model (HMM) is widely used for multiple sequence alignment. In this paper, Quantum-behaved Particle Swarm Optimization (QPSO) is used to train the profile HMM, and an integrated algorithm based on the profile HMM and QPSO is proposed for MSA. Protein sequences are used to evaluate the approach. Compared with other algorithms, the results show that the proposed algorithm not only finds a good profile HMM but also produces the optimal alignment of the multiple sequences. Keywords: multiple sequence alignment, profile hidden Markov model, quantum-behaved particle swarm optimization.
1 Introduction

Multiple sequence alignment (MSA) of nucleotides and amino acids is a fundamental and challenging problem in computational biology. The resulting aligned sequences are used to construct phylogenetic trees, to find protein families, to predict the secondary and tertiary structure of new sequences, and to demonstrate the homology between new sequences and existing families [1, 2]. Three types of approaches are mostly taken in practical situations [3, 4]. One way to deal with this problem is to use a progressive alignment method [5]. The second consists of widely used iterative, stochastic approaches for MSA, including simulated annealing (SA) [6], genetic algorithms [7], a genetic algorithm with ant colony optimization (GA-ACO) [8] and evolutionary programming [9]. The third is the strategy based on probabilistic models such as hidden Markov models (HMM), which were introduced into computational biology by Churchill [10] and are widely used to perform multiple sequence alignments [11-13]. Our approach belongs to the third strategy. One problem of the HMM approach is that there is no global optimization algorithm that can guarantee finding the optimally trained HMM within a reasonable runtime. At present, a popular learning algorithm of
the HMM, called Baum-Welch or forward-backward [14], is a local optimization algorithm, and the resulting alignment is often far from the global optimum. To overcome the weakness of the local optimization algorithm, simulated annealing [6], genetic algorithms [7] and particle swarm optimization (PSO) [15] have been used to train HMMs. However, they did not fully resolve the problem, because these stochastic search approaches are prone to premature convergence. Here we present a novel global optimization method to train the HMM, based on the QPSO algorithm [16-18], which is globally convergent in contrast to its predecessor, the PSO algorithm [15].
2 Methods

2.1 A Hidden Markov Model Topology for Multiple Sequence Alignment

The HMM structure used in this study is the standard topology for the MSA problem originally suggested by Krogh et al. [14]. Fig. 1 shows a simple example of this topology as a directed graph. The model includes three types of states (match, insert and delete), and this simple structure is repeated from left to right. To facilitate the description, two additional states (a begin state and an end state) are added. States are connected to each other by transition probabilities a_ij, which satisfy a_ij ≥ 0 for 1 ≤ i, j ≤ n and Σ_j a_ij = 1 for 1 ≤ i ≤ n. A match or insert state S_j emits an observable symbol v_k from an output alphabet Σ with probability b_j(k). The emission probabilities satisfy b_j(k) ≥ 0 for 1 ≤ j ≤ n, 1 ≤ k ≤ M, and Σ_{k=1}^{M} b_j(k) = 1 for 1 ≤ j ≤ n, where M is the number of observable symbols. The delete, begin and end states do not emit any symbol.
Fig. 1. An example of a simple HMM of length 3 for MSA
2.2 Quantum-Behaved Particle Swarm Optimization Algorithm PSO algorithm is an evolutionary optimization technique originally introduced by Kennedy and Eberhart [15]. The main disadvantage of PSO is that global convergence cannot be guaranteed [19]. Early concepts of a global convergence guaranteed QPSO, was developed and reported at conferences [16-18].
360
C. Li et al.
In QPSO, a global point called the Mainstream Thought or Mean Best Position of the population is introduced. The global point, denoted as C, is defined as the mean of the personal best positions among all particles:

C(t) = (C_1(t), C_2(t), \ldots, C_D(t)) = \frac{1}{M}\sum_{i=1}^{M} P_i(t) = \left( \frac{1}{M}\sum_{i=1}^{M} P_{i1}(t), \frac{1}{M}\sum_{i=1}^{M} P_{i2}(t), \ldots, \frac{1}{M}\sum_{i=1}^{M} P_{iD}(t) \right)    (1)

where M is the population size and P_i is the personal best position of particle i. Then the value of L and the new position are calculated by

L = 2\alpha \cdot | C_{ij}(t) - X_{ij}(t) |    (2)

X_{ij}(t+1) = p_{ij} \pm \alpha \cdot | C_{ij}(t) - X_{ij}(t) | \cdot \ln(1/u)    (3)
The parameter \alpha is called the Contraction-Expansion (CE) coefficient. It can be tuned to control the convergence speed of the algorithm. The PSO algorithm with update (3) is called QPSO. Since the search scope of each particle in QPSO is the space R^D, the sampling space of QPSO in each iteration is also R^D, which definitely covers the feasible solution space. By the criterion for globally convergent algorithms [20], we can conclude that QPSO is globally convergent. The QPSO algorithm is described as follows (a sketch of one iteration is given after the list).
1. Initialize an array of particles with random positions and velocities inside the problem space;
2. Determine the mean best position among the particles by (1);
3. Evaluate the desired objective function (for example, minimization) for each particle and compare it with the particle's previous best value;
4. That is, if f(X_i) < f(P_i), then P_i = X_i;
5. Determine the current global minimum among the particles' best positions, that is, g = \arg\min_{1 \le i \le M} f(P_i) (M is the population size);
6. Compare the current global position to the previous global best; if the current one is better, it replaces the previous global best;
7. For each dimension of the particle, obtain a stochastic point between P_{id} and P_{gd};
8. p_{id} = \varphi \cdot P_{id} + (1 - \varphi) \cdot P_{gd}, \varphi = rand;
9. Attain the new position by the stochastic equation (3);
10. Repeat steps 2-9 until a stop criterion is satisfied or a pre-specified number of iterations is completed.
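As an illustration of steps 1-10 above, the following Python sketch runs a minimal QPSO loop on a generic objective function; the sphere objective, the initialization range and all variable names are illustrative assumptions, not part of the paper.

```python
import numpy as np

def qpso(objective, dim, n_particles=25, iters=1000, alpha_hi=1.0, alpha_lo=0.5):
    """Minimal QPSO loop following Eqs. (1)-(3); initialization range is illustrative."""
    X = np.random.uniform(-1.0, 1.0, (n_particles, dim))      # particle positions
    P = X.copy()                                               # personal best positions
    Pf = np.array([objective(x) for x in X])                   # personal best values
    for t in range(iters):
        alpha = alpha_hi - (alpha_hi - alpha_lo) * t / iters   # CE coefficient, linearly decreased
        C = P.mean(axis=0)                                     # mean best position, Eq. (1)
        for i in range(n_particles):
            fi = objective(X[i])
            if fi < Pf[i]:                                     # step 4: update personal best
                Pf[i], P[i] = fi, X[i].copy()
        g = int(np.argmin(Pf))                                 # steps 5-6: global best index
        phi = np.random.rand(n_particles, dim)
        p = phi * P + (1.0 - phi) * P[g]                       # step 8: local attractor
        u = np.random.rand(n_particles, dim) + 1e-12
        sign = np.where(np.random.rand(n_particles, dim) < 0.5, 1.0, -1.0)
        X = p + sign * alpha * np.abs(C - X) * np.log(1.0 / u) # step 9: position update, Eq. (3)
    g = int(np.argmin(Pf))
    return P[g], Pf[g]

# toy usage: minimize the sphere function
best_x, best_f = qpso(lambda x: float(np.sum(x ** 2)), dim=5, iters=200)
```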
2.3 Viterbi Algorithm
We use the Viterbi algorithm to discover the hidden state sequence that is most likely to have produced a given observation sequence. First we define:
\delta_t(i) = \max_{q_1 q_2 \cdots q_{t-1}} P(q_1 q_2 \cdots q_t, q_t = i, o_1 o_2 \cdots o_t \mid \lambda)    (4)
as the probability of the most probable state path for the partial observation sequence. The Viterbi algorithm is defined as follows:
1. Initialization:
\delta_1(i) = \pi_i b_i(o_1), \quad 1 \le i \le N; \qquad \varphi_1(i) = 0, \quad 1 \le i \le N    (5)
2. Recursion:
\delta_t(i) = \max_{1 \le j \le N} [\delta_{t-1}(j) a_{ji}] \, b_i(o_t), \quad 2 \le t \le T, \; 1 \le i \le N    (6)
\varphi_t(i) = \arg\max_{1 \le j \le N} [\delta_{t-1}(j) a_{ji}], \quad 2 \le t \le T, \; 1 \le i \le N    (7)
3. Termination:
P^* = \max_{1 \le i \le N} [\delta_T(i)]    (8)
q_T^* = \arg\max_{1 \le i \le N} [\delta_T(i)]    (9)
4. Optimal state sequence backtracking:
q_t^* = \varphi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1    (10)
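A compact sketch of Eqs. (5)-(10) is given below; working in log space to avoid numerical underflow is our own choice and is not prescribed by the paper.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most probable state path for an observation sequence.
    pi: (N,) initial probabilities, A: (N, N) transitions a_ij, B: (N, M) emissions b_j(k)."""
    pi, A, B = np.asarray(pi, float), np.asarray(A, float), np.asarray(B, float)
    N, T = len(pi), len(obs)
    logA, logB = np.log(A + 1e-300), np.log(B + 1e-300)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = np.log(pi + 1e-300) + logB[:, obs[0]]          # initialization, Eq. (5)
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA                 # delta_{t-1}(j) + log a_ji
        psi[t] = np.argmax(scores, axis=0)                    # Eq. (7)
        delta[t] = scores[psi[t], np.arange(N)] + logB[:, obs[t]]   # Eq. (6)
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))                      # termination, Eqs. (8)-(9)
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1][path[t + 1]]                     # backtracking, Eq. (10)
    return path, float(np.max(delta[-1]))
```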
2.4 Encoding Schemes
When training a profile HMM with the QPSO algorithm, each profile HMM is represented as one particle, and the profile HMM is optimized by updating the position of that particle. We keep the length of the HMM constant during training and optimize only the parameters of the HMM, i.e., the transition and emission probabilities. The length of the HMM is set to the average length of the unaligned sequences; the begin and end states are not counted. If the length of the HMM is m, the number of states is 3m + 1, the number of transition probabilities is 3(3m + 1), and the number of emission probabilities is (2m + 1)A, where A is the number of nucleotide bases or amino acids. So the dimension of each particle is 9m + 3 + (2m + 1)A. According to the properties of the transition and emission probabilities, they must be normalized before training the HMM.
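To make the encoding concrete, the sketch below decodes one particle of dimension 9m + 3 + (2m + 1)A into normalized transition and emission probabilities; the vector layout (three outgoing transitions per state, emissions for the 2m + 1 emitting states) and the simple absolute-value normalization are our own assumptions, since the paper only requires that the probabilities be normalized.

```python
import numpy as np

def decode_particle(x, m, A):
    """Split a particle vector into transition/emission probabilities of a length-m profile HMM."""
    n_states = 3 * m + 1                    # match, insert and delete states
    n_trans = 3 * n_states                  # assumed: 3 outgoing transitions per state
    n_emit = (2 * m + 1) * A                # emissions for the 2m+1 match/insert states
    assert len(x) == n_trans + n_emit       # total dimension 9m + 3 + (2m + 1)A
    trans = np.abs(np.asarray(x[:n_trans], float)).reshape(n_states, 3)
    emit = np.abs(np.asarray(x[n_trans:], float)).reshape(2 * m + 1, A)
    trans /= trans.sum(axis=1, keepdims=True) + 1e-12   # each state's transitions sum to 1
    emit /= emit.sum(axis=1, keepdims=True) + 1e-12     # each emitting state's b_j(k) sum to 1
    return trans, emit
```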
2.5 Fitness Evaluation
In this paper, we use two objective functions to evaluate MSA_HMM+QPSO.
• Training evaluation
During the training of the HMM its quality needs to be measured; for this a log-odds score is used, which is based on a log-likelihood score [21], shown in (11):

Log-odds(O, \lambda) = \frac{1}{N} \sum_{i=1}^{N} \log_2 \frac{P(O_i \mid \lambda)}{P(O_i \mid \lambda_N)}    (11)
where O = \{O_1, O_2, \ldots, O_N\} is the set of unaligned sequences, \lambda is the trained HMM, and \lambda_N is the null-hypothesis model. The QPSO tries to maximize the probability that the HMM generates the given unaligned sequences of nucleotides and amino acids.
• Alignment evaluation
Once the alignment has been obtained from the Viterbi algorithm, we can use the Sum-of-Pairs (SOP) scoring function (12) to evaluate the alignment:

SOP = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} D(l_i, l_j)    (12)
where l_i is aligned sequence i and D is a distance metric. To prevent the accumulation of many gaps in an alignment, we deduct an affine gap cost from the sum-of-pairs score. The penalty is computed by the following equation for each gap in the alignment:

Gapcost = GOP + n \times GEP    (13)
Where GOP is a fixed penalty for opening a gap, GEP is the penalty for extending the gap, and n is the number of gap-symbols in the gap. The gap cost is calculated for each gap in each of the aligned sequences. The sum of these costs is then deducted from the sum-of-pairs score. For the SOP scores, the QPSO tries to maximize the quality of the alignment produced by the HMM encoded in the particles.
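The following sketch computes the SOP score of Eq. (12) minus the affine gap penalties of Eq. (13); the toy pairwise distance D (simple match/mismatch counting) is an illustrative stand-in for the substitution tables used in the experiments.

```python
def affine_gap_cost(seq, gop=11.0, gep=2.0):
    """Sum of GOP + n*GEP over every run of gap symbols '-' in one aligned sequence, Eq. (13)."""
    cost, run = 0.0, 0
    for c in seq + 'X':            # sentinel flushes a trailing gap run
        if c == '-':
            run += 1
        elif run:
            cost += gop + run * gep
            run = 0
    return cost

def sop_score(aligned, D, gop=11.0, gep=2.0):
    """Sum-of-pairs score, Eq. (12), with the affine gap costs deducted."""
    n = len(aligned)
    score = sum(D(aligned[i], aligned[j]) for i in range(n - 1) for j in range(i + 1, n))
    return score - sum(affine_gap_cost(s, gop, gep) for s in aligned)

def D(a, b):
    """Toy column-wise distance: +1 per identical residue pair, -1 per mismatch, 0 against gaps."""
    return sum(0 if '-' in (x, y) else (1 if x == y else -1) for x, y in zip(a, b))
```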
3 Results
3.1 Experimental Datasets
We tested the performance of our method on the alignment of (1) simulated nucleotide data sets and (2) amino-acid data sets from a benchmark alignment database. We used nine instances from the BAliBASE database [22], shown in Table 1, which contains several manually refined multiple sequence alignments specifically designed for the evaluation and comparison of multiple sequence alignment methods.

Table 1. Nine benchmark datasets from BAliBASE. N: number of sequences, LSEQ: length of sequences, Identity: identity of sequences
Name     N    LSEQ(min,max)
1idy     5    (49,58)
451c     5    (70,87)
1krn     5    (66,82)
kinase   5    (263,276)
1pii     4    (247,259)
5ptp     5    (222,245)
1ajsA    4    (258,387)
glg      5    (438,486)
1tag     5    (806,928)
3.2 Experimental Set-Up
In our experiments, we conducted HMM training experiments with BW, PSO and QPSO using the log-odds score as the objective function to evaluate the quality of the HMM, and conducted multiple sequence alignment experiments with CW [23], BW, PSO and QPSO using the SOP score as the objective function to evaluate the quality of the alignment. In the PSO and QPSO algorithms we employed 25 particles; 20 trial runs were carried out for each case, with each run set to 1000 iterations. The other parameters of the PSO and QPSO algorithms were set as follows:
Velocity weight (\omega) linearly decreases from 1.0 to 0.5; c_1 = c_2 = 2.0
Maximum velocity (v_max) = 1.0; Contraction-Expansion coefficient (\alpha) linearly decreases from 1.0 to 0.5
Furthermore, for nucleic acids we used the 'swgap' substitution score table (from ClustalW 1.81) with gap-opening and gap-extension penalties of 15 and 7, respectively, and for amino-acid data the BLOSUM62 table with gap-opening and gap-extension penalties of 11 and 2, respectively.
3.3 Experimental Results
Tables 2-3 record the average best fitness and standard error results for the log-odds scores and SOP scores. Table 2 displays the best fitness and standard error on the BAliBASE test sets where the log-odds score was used as the objective function. Again the QPSO was able to achieve the best average scores. On average, the PSO had better results than BW. Table 3 gives the average best SPS scores for the alignment of the BAliBASE test sets. The QPSO results were better than BW and PSO, but not as good as those of Clustal W.

Table 2. HMM log-odds scores ± standard error of the BAliBASE test sets
Name     BW         PSO                QPSO
1idy     42.0576    59.7932±0.4202     71.4864±0.7630
451c     68.3522    89.1605±0.5126     106.3024±0.6950
1krn     69.0222    81.9846±0.9212     103.6417±0.8287
kinase   214.9693   211.2745±0.8959    356.8937±0.6798
1pii     213.0459   277.0576±0.9306    328.1439±0.8513
5ptp     266.5928   311.5647±0.9254    428.8537±0.6151
1ajsA    326.4896   381.6639±0.8374    483.7352±0.5643
glg      380.7306   395.1211±0.9201    486.5318±0.5950
1tag     729.3726   763.7521±1.0002    875.6509±0.7841
Table 3. SPS scores of the BAliBASE test sets
Name     CW      BW       PSO      QPSO
1idy     0.705   0.5132   0.5658   0.7763
451c     0.719   0.3989   0.4519   0.8027
1krn     1.000   0.8182   0.7863   1.000
kinase   0.736   0.2268   0.3061   0.5753
1pii     0.864   0.1647   0.2738   0.9372
5ptp     0.966   0.6053   0.6831   0.9572
1ajsA    0.571   0.2864   0.3245   0.6914
glg      0.941   0.5691   0.6684   0.8569
1tag     0.963   0.6453   0.6931   0.7953
Figs. 2-3 illustrate the convergence of the average best log-odds scores and SOP scores for two protein families. Clearly, the QPSO is able to achieve better results than CW, BW and PSO. Note that the scores of the CW, BW and PSO runs converged rather quickly, whereas the QPSO continuously improved its results throughout the run. Moreover, during the first iterations the QPSO made major improvements, which were already superior to the final results of CW, BW and PSO after 1000 iterations.
Fig. 2. Mean Log-odds scores for 1idy protein family
Fig. 3. Mean Log-odds scores for 1ajsA protein family
4 Conclusion
In this paper, we introduced an approach for training HMMs for MSA with QPSO. The results of our experiments show that QPSO is a remarkably effective training method for HMMs compared with BW and PSO. It is concluded that the HMM trained by QPSO with SOP as the objective function also produced better alignments than the other methods, because QPSO is a globally convergent algorithm, has only a position vector, and has fewer parameters to adjust. The computation time for training the HMM with QPSO is comparable to that of the PSO algorithm, but much larger than that of the BW method. From the experiments, we can conclude that the longer the average sequence length and the larger the number of sequences to be aligned, the greater the computation time for training the HMM with QPSO. The time spent by QPSO is about 6 hours on average, whereas BW takes only a few minutes. The vast majority of the computation time is spent on the evaluation of the HMM by the forward function. In future work, we may enhance the performance of the HMM and reduce the computation time by improving the QPSO algorithm.
References
1. Frishman, D., Argos, P.: Knowledge-based protein secondary structure assignment. Proteins 23, 566–579 (1995)
2. Mount, D.W.: Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press (2001)
3. Notredame, C., Higgins, D.G.: SAGA: sequence alignment by genetic algorithm. Nucleic Acids Res. 24, 1515–1524 (1996)
4. Nicholas Jr., H.B., et al.: Strategies for multiple sequence alignment. Biotechniques 32, 572–574 (2002)
5. Feng, D.-F., Doolittle, R.: Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution 25, 351–360 (1987)
6. Myers: Multiple sequence alignment using simulated annealing. Computer Applications in the Biosciences 4, 7 (1988)
7. Licheng, J., Lei, W.: A novel genetic algorithm based on immunity. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 30, 552–561 (2000)
8. Lee, Z.-J., Su, S.-F., Chuang, C.-C., Liu, K.-H.: Genetic algorithm with ant colony optimization (GA-ACO) for multiple sequence alignment. Applied Soft Computing 8 (2008)
9. Thomsen, R.: A Clustal alignment improver using evolutionary algorithms, pp. 121–126 (2002)
10. Churchill, G.A.: Stochastic models for heterogeneous DNA sequences. Bull. Math. Biol. 51, 79–94 (1989)
11. Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286 (1989)
12. Loytynoja, A., Milinkovitch, M.C.: A hidden Markov model for progressive multiple alignment. Bioinformatics 19, 1505–1513 (2003)
13. Mamitsuka, H.: Finding the biologically optimal alignment of multiple sequences. Artif. Intell. Med. 35, 9–18 (2005)
14. Krogh, A., et al.: Hidden Markov models in computational biology. Applications to protein modeling. J. Mol. Biol. 235, 1501–1531 (1994)
15. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
16. Jun, S., et al.: Particle swarm optimization with particles having quantum behavior. In: Congress on Evolutionary Computation, CEC 2004, vol. 1, pp. 325–331 (2004)
17. Jun, S., et al.: A global search strategy of quantum-behaved particle swarm optimization. In: 2004 IEEE Conference on Cybernetics and Intelligent Systems, vol. 1, pp. 111–116 (2004)
18. Jun, S., et al.: Adaptive parameter control for quantum-behaved particle swarm optimization on individual level. In: 2005 IEEE International Conference on Systems, Man and Cybernetics, vol. 4, pp. 3049–3054 (2005)
19. Clerc, M., Kennedy, J.: The particle swarm - explosion, stability, and convergence in a multidimensional complex space. IEEE Transactions on Evolutionary Computation 6, 58–73 (2002)
20. Solis, F.J., Wets, R.J.-B.: Minimization by Random Search Techniques. Math. of Oper. Res. 6, 19–30 (1981)
21. Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257–285 (1989)
22. Thompson, J.D., et al.: BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15, 87–88 (1999)
23. Thompson, J.D., et al.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22, 4673–4680 (1994)
Breast Cancer Diagnosis Using WNN Based on GA Xiaomei Yi, Peng Wu, Jian Li, and Lijuan Liu Department of Information Engineering of ZheJiang Agricultural & Forestry University, Hangzhou, China [email protected], [email protected], [email protected], [email protected]
Abstract. Breast cancer diagnosis is an important field of medical research. In order to improve the accuracy of diagnosis, this article proposes a model of breast cancer diagnosis based on a wavelet neural network (WNN) optimized by a genetic algorithm (GA). In this model, a wavelet is used as the excitation function of the neural network, and the genetic algorithm is used to optimize the weights of the neural network. On this basis, the WNN-GA learning step is implemented and the WNN-GA model of breast cancer diagnosis is built. The results of the experiments show that this algorithm can be used for breast cancer diagnosis effectively and reliably. Keywords: Neural Network, wavelet transform, Genetic Algorithm, Breast Cancer Diagnosis.
1 Introduction
Breast cancer is one of the most common female malignancies, and its incidence is rising rapidly. Breast cancer typically produces no symptoms when the tumor is small and most treatable. It is therefore very important for women to follow recommended screening guidelines for detecting breast cancer at an early stage, before symptoms develop [1]. Most traditional diagnosis methods are statistics-based or rule-based, and they have some limitations in accuracy and practicality; for example, a model fitted to one particular case base may achieve a high rate of diagnostic accuracy on that base but a larger error when diagnosing new cases. For these reasons, it is necessary to establish an efficient and practical breast cancer diagnosis model with fault-tolerant capability, and many scholars have recently introduced artificial intelligence into diagnosis systems: O.L. Mangasarian and W. Nick Street designed breast cancer diagnosis and prognosis via linear programming [2]. Mehmet Fatih Akay combined support vector machines with feature selection for breast cancer diagnosis [3]. Hussein A. Abbass proposed an evolutionary artificial neural network approach for breast cancer diagnosis [4]. L. Liu and M. Deng also proposed an evolutionary artificial neural network approach for breast cancer diagnosis [5]. The wavelet neural network is a method that combines the time-frequency localization properties of the wavelet transform with the self-learning ability of neural networks; it avoids local minima effectively and has a faster convergence speed than an ordinary neural network. However, since the neural network learning algorithm is still adopted in the wavelet neural network, some of the inevitable defects of neural networks remain in it.
The genetic algorithm is a probabilistic search and optimization algorithm; it can realize a global search in complex nonlinear and non-differentiable spaces. It is used here to optimize the topology and parameters of the wavelet neural network so as to overcome this defect. This paper builds a diagnosis model using a wavelet neural network and a genetic algorithm. First, breast cancer diagnosis is abstracted as a pattern recognition problem, and on this basis the GA is used to find the best initial structure of the wavelet neural network. Second, the final wavelet neural network structure is established through training on the initial structure. The third step is to construct a training set, determine the input and output of training, and then substitute the training samples into the WNN-GA training algorithm to obtain the trained model. Finally, a diagnosis result is available after the input data of a specific case are put into the trained model. The experimental results verify the accuracy of the proposed method.
2 Wavelet Neural Network
2.1 Wavelet Transfer Theory [6]
Wavelet analysis theory originates from the traditional Fourier transform, with an adjustable window location and size. It is constructed from a function \psi(x) defined in a limited range; \psi(x) is called the mother wavelet. \psi_{ab}(x) is constructed from \psi(x) as defined below:

\psi_{ab}(x) = |a|^{-1/2} \, \psi\!\left(\frac{x - b}{a}\right)    (1)

where a is the dilation parameter and b is the translation parameter, a, b \in R, a \ne 0. The continuous wavelet transform of a signal f(t) \in L^2(R) is defined as:

W_f(a, b) = \langle f(t), \psi_{ab}(t) \rangle = |a|^{-1/2} \int_{-\infty}^{+\infty} f(t) \, \psi\!\left(\frac{t - b}{a}\right) dt    (2)

The signal f(t) can be recovered from the inverse wavelet transform of W_f(a, b), defined as:

f(t) = \frac{1}{C_\psi} \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \frac{1}{a^2} \, W_f(a, b) \, \psi\!\left(\frac{t - b}{a}\right) da \, db    (3)

where C_\psi is defined by

C_\psi = \int_{-\infty}^{+\infty} \frac{|\psi(\omega)|}{\omega} \, d\omega < \infty    (4)
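As a small numerical illustration of Eqs. (1)-(2), the sketch below evaluates \psi_{ab}(x) and approximates W_f(a, b) on a sampled signal; the Morlet-type mother wavelet (the one used later in Section 2.2) and the rectangle-rule integration are our own choices for this example.

```python
import numpy as np

def mother(x):
    # Morlet-type mother wavelet, cf. Section 2.2
    return np.cos(1.75 * x) * np.exp(-x ** 2 / 2.0)

def psi_ab(x, a, b):
    # Eq. (1): scaled and translated wavelet
    return np.abs(a) ** -0.5 * mother((x - b) / a)

def cwt(f, t, a, b):
    # Eq. (2): W_f(a, b) approximated on uniformly sampled (t_i, f(t_i))
    return float(np.sum(f * psi_ab(t, a, b)) * (t[1] - t[0]))

t = np.linspace(-5.0, 5.0, 1001)
f = np.sin(2 * np.pi * t)
print(cwt(f, t, a=0.5, b=0.0))
```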
2.2 Wavelet Neural Network
The wavelet neural network combines the wavelet transform and the neural network: the nonlinear activation function (for example, the sigmoid) of the neural network is replaced by a wavelet. Theoretical studies have shown that the wavelet neural network has better fault-tolerance and information extraction capabilities than an ordinary neural network.
Since Kolmogorov proved that a three-layer feed-forward neural network can approximate any continuous function, this paper constructs a wavelet neural network model consisting of three layers: an input layer, a hidden layer and an output layer.
Fig. 1. The structure of a wavelet neural network
Because the Morlet wavelet \psi(x) = \cos(1.75x) e^{-x^2/2} has good robustness, small error and stable calculation, the paper uses it as the excitation function for the hidden layer, and still uses the sigmoid function for the output layer. The learning algorithm of the wavelet neural network is based on the idea of error back-propagation. The process is described in detail below [7]: X_n(s) is the learning pattern, V_n^T(m) is the expected output, S is the number of input layer nodes, M is the number of output layer nodes, T is the number of hidden layer nodes, the value of j ranges in [-J, J], the value of k ranges in [-K, K], the values of j and k are calculated from t as j = t/(2K+1) - J and k = t \bmod (2K+1) - K, and u_{st} and w_{tm} are the connection weights. The value of the network output is calculated as:

V_n(m) = \sum_{t=1}^{(2J+1)(2K+1)} w_{tm} \, 2^{-j/2} \, \psi\!\left( 2^{-j} \sum_{s=1}^{S} u_{st} X_n(s) - k \right), \quad 1 \le m \le M    (5)
The error function is defined as:

E = 0.5 \sum_{n=1}^{N} \sum_{m=1}^{M} \left( V_n^T(m) - V_n(m) \right)^2    (6)
The gradient with respect to w_{tm} is calculated as:

\delta w_{tm} = \frac{\partial E}{\partial w_{tm}} = -\sum_{n=1}^{N} \left[ V_n^T(m) - V_n(m) \right] 2^{-j/2} \, \psi\!\left( 2^{-j} \sum_{s=1}^{S} u_{st} X_n(s) - k \right)    (7)
\delta u_{st} = \frac{\partial E}{\partial u_{st}} = \sum_{m=1}^{M} \left( -\sum_{n=1}^{N} \left[ V_n^T(m) - V_n(m) \right] w_{tm} \, 2^{-j/2} X_n(s) \right) \frac{\partial \psi}{\partial s'}    (8)

where s' = \sum_{s=1}^{S} u_{st} X_n(s); if we let t_n' = 2^{-j} s' - k, then

\frac{\partial \psi}{\partial s'} = -\cos(1.75\, t_n') \exp\!\left(-\frac{(t_n')^2}{2}\right) 2^{-j} t_n' - 1.75 \sin(1.75\, t_n') \exp\!\left(-\frac{(t_n')^2}{2}\right) 2^{-j}    (9)
The procedure of error back-propagation is defined as:

\Delta w_{tm}^{new} = -\eta \frac{\partial E}{\partial w_{tm}^{old}} + \alpha \Delta w_{tm}^{old}    (10)

\Delta u_{st}^{new} = -\eta \frac{\partial E}{\partial u_{st}^{old}} + \alpha \Delta u_{st}^{old}    (11)

where \eta is the learning rate and \alpha is the momentum factor; the network parameters are adjusted as:

w_{tm}^{new} = w_{tm}^{old} + \Delta w_{tm}^{new}    (12)

u_{st}^{new} = u_{st}^{old} + \Delta u_{st}^{new}    (13)
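The sketch below gathers Eqs. (5)-(13) into a minimal forward pass and momentum update for one training pattern; the bookkeeping of one dilation j and one translation k per hidden node follows our reading of the equations above, and the initialization and learning-rate values are illustrative assumptions.

```python
import numpy as np

def morlet(x):
    return np.cos(1.75 * x) * np.exp(-x ** 2 / 2.0)

def morlet_deriv(x):
    # derivative of the Morlet wavelet with respect to its argument, cf. Eq. (9)
    return (-x * np.cos(1.75 * x) - 1.75 * np.sin(1.75 * x)) * np.exp(-x ** 2 / 2.0)

class WNN:
    def __init__(self, S, T, M, j, k, eta=0.05, mom=0.8):
        self.u = np.random.randn(S, T) * 0.1      # input-to-hidden weights u_st
        self.w = np.random.randn(T, M) * 0.1      # hidden-to-output weights w_tm
        self.j = np.asarray(j, float)             # per-hidden-node dilation exponent
        self.k = np.asarray(k, float)             # per-hidden-node translation
        self.eta, self.mom = eta, mom             # learning rate and momentum factor
        self.dw, self.du = 0.0, 0.0               # previous updates for Eqs. (10)-(11)

    def forward(self, x):
        s = x @ self.u                                    # s'_t = sum_s u_st x_s
        arg = 2.0 ** (-self.j) * s - self.k               # wavelet argument, cf. Eq. (5)
        h = 2.0 ** (-self.j / 2.0) * morlet(arg)
        return h @ self.w, arg, h

    def train_step(self, x, target):
        y, arg, h = self.forward(x)
        err = y - target                                  # output error (target is V^T)
        gw = np.outer(h, err)                             # dE/dw_tm, cf. Eq. (7)
        dpsi = 2.0 ** (-self.j / 2.0) * morlet_deriv(arg) * 2.0 ** (-self.j)
        gu = np.outer(x, dpsi * (self.w @ err))           # dE/du_st, cf. Eqs. (8)-(9)
        self.dw = -self.eta * gw + self.mom * self.dw     # Eq. (10)
        self.du = -self.eta * gu + self.mom * self.du     # Eq. (11)
        self.w += self.dw                                 # Eq. (12)
        self.u += self.du                                 # Eq. (13)
        return float(0.5 * np.sum(err ** 2))              # Eq. (6) for this pattern
```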
3 Genetic Algorithm Optimize Wavelet Neural Network
Allowing for the comprehensiveness, rapidity, adaptability and robustness of the genetic algorithm optimization process, we use a genetic algorithm to optimize the parameters of the wavelet neural network [8,9]. The procedure is described below (a schematic sketch of this loop is given at the end of this section):
(1) Set the genetic parameters and the adaptive adjustable parameters;
(2) Generate a set of wavelet neural network parameters (u_{st}, w_{tm}, a, b) randomly and encode them with real coding to form the initial population;
(3) Decode the chromosomes of the population and calculate the individual fitness values. (Typically, the fitness of an individual is determined by the error, and the fitness value will be low if the error is large.) If the value of the fitness function meets the performance requirements or the maximum number of generations is reached, go to step (6);
(4) Perform the selection operation using the roulette-wheel selection algorithm on the basis of the fitness value (the reciprocal of the error function);
(5) Produce the next generation by adopting the adaptive crossover probability p_c and the adaptive mutation probability p_m [10], and then go to step (3);
The adaptive crossover probability p_c is given as follows:
(14)
where p_c is the crossover rate of the individual; f' is the fitness of the individual on which crossover is performed; f_avg is the average fitness of the population; f_max is the fitness of the best individual. The adaptive mutation probability p_m is given as follows:
(15)
where p_m is the mutation rate of the individual; f' is the fitness of the individual on which mutation is performed; f_avg is the average fitness of the population; f_max is the fitness of the best individual.
(6) Find the optimum wavelet neural network parameters (u_{st}, w_{tm}, a, b) and learn from the training data using the optimized wavelet neural network.
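A schematic GA loop corresponding to steps (1)-(6) is sketched below. Since the concrete adaptive formulas (14)-(15) are not legible in this copy, the adaptive_rate helper is a generic stand-in that lowers the rate for fitter-than-average individuals; the crossover and mutation operators are likewise illustrative rather than the authors' exact choices.

```python
import numpy as np

def adaptive_rate(f, f_avg, f_max, hi, lo):
    """Generic stand-in for Eqs. (14)-(15): fitter-than-average individuals get a smaller rate."""
    if f < f_avg or f_max <= f_avg:
        return hi
    return hi - (hi - lo) * (f - f_avg) / (f_max - f_avg)

def ga_optimize(fitness, dim, pop_size=60, gens=100, pc=(0.9, 0.5), pm=(0.1, 0.01)):
    """fitness(ind) must return a positive value, e.g. 1 / (training error + eps)."""
    pop = np.random.uniform(-1.0, 1.0, (pop_size, dim))       # real-coded WNN parameters
    for _ in range(gens):
        fit = np.array([fitness(ind) for ind in pop])
        f_avg, f_max = fit.mean(), fit.max()
        idx = np.random.choice(pop_size, size=pop_size, p=fit / fit.sum())  # roulette wheel
        new_pop = pop[idx].copy()
        for i in range(0, pop_size - 1, 2):                   # adaptive arithmetic crossover
            f_pair = max(fit[idx[i]], fit[idx[i + 1]])
            if np.random.rand() < adaptive_rate(f_pair, f_avg, f_max, *pc):
                lam = np.random.rand()
                a, b = new_pop[i].copy(), new_pop[i + 1].copy()
                new_pop[i], new_pop[i + 1] = lam * a + (1 - lam) * b, lam * b + (1 - lam) * a
        for i in range(pop_size):                             # adaptive Gaussian mutation
            if np.random.rand() < adaptive_rate(fit[idx[i]], f_avg, f_max, *pm):
                new_pop[i] += np.random.normal(0.0, 0.1, dim)
        pop = new_pop
    fit = np.array([fitness(ind) for ind in pop])
    return pop[int(np.argmax(fit))]
```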
4 Diagnosis of Breast Cancer with WNN-GA 4.1 The Selection of Dataset
The experimental dataset adopted in this paper was collected by Dr. William H. Wolberg at the University of Wisconsin Madison Hospitals from needle aspirates of human breast cancer tissue. There are 683 samples in this dataset and each includes 9 features (Clump_Thickness, Unif_Cell_Size, Unif_Cell_Shape, Marginal_Adhesion, Single_Cell_Size, Bare_Nuclei, Bland_Chromatine, Normal_Nucleoli, Mitoses) and a type (benign, malign) [11]. The model of breast cancer diagnosis takes the described cell-property parameters as the input vector, expressed as x_i = (x_{i1}, x_{i2}, ..., x_{i9}), and the class of the cell as the output: v \in \{0, 1\}; malignant cells are positive samples labeled 1, and benign cells are negative samples labeled 0. So the number of input units is 9, the number of output units is 1, and the number of hidden units is 8. A threshold of 0.5 is used: if the output value falls in [0, 0.5], it is considered to be 0 and the case is classified as benign, otherwise as malign.
4.2 Experimental Result
Based on the models, 467 samples are selected from the data set for training and 216 samples for testing. There are 134 benign samples and 82 malign samples in the test data.
The population size is set to 60, the adaptive crossover and mutation operators introduced above are used, and the maximum number of generations is set to 100. This article uses MATLAB as the simulation tool; the squared-error curve and the training curve are shown in Fig. 2 and Fig. 3.
Fig. 2. Error sum squares for WNN-GA
Fig. 3. Training performance for WNN-GA
Fig. 2 shows that after about 90 generations the average fitness tended to stabilize, which means that a group with higher fitness had been found. Fig. 3 shows that the training goal was set to 0.001 in the experiments and the error goal was reached after 292 epochs. After training on the input data, the test dataset was fed into the previously trained BP model and into the wavelet neural network based on the genetic algorithm; after the pre-processing described above, five diagnosis tests were carried out on the BP network and on the wavelet neural network based on the genetic algorithm, respectively. The average values of the five test results are shown in Table 1: the recognition rate of breast cancer is 97.56% for BP and 98.78% for the wavelet neural network based on the genetic algorithm, and the accuracy is 98.15% for BP and 98.61% for the wavelet neural network based on the genetic algorithm. The experiments show that the
Table 1. The test results of system performance

Algorithm   Test dataset   benign   malign   Recognition rate of breast cancer   Accuracy
BP          216            132      80       80/82 = 97.56%                      212/216 = 98.15%
WNN-GA      216            132      81       81/82 = 98.78%                      213/216 = 98.61%
recognition rate and the accuracy of breast cancer diagnosis using the wavelet neural network based on the genetic algorithm are higher than those using BP.
5 Conclusion
The breast cancer diagnosis model based on a wavelet neural network optimized by a genetic algorithm proposed in this paper avoids the disadvantage of determining the WNN structure and its parameters by trial and experience; the optimal solution is obtained while avoiding local minima, and the diagnostic capability and precision for breast cancer are improved. Experimental results on the breast cancer diagnosis dataset demonstrate that the wavelet neural network based on the genetic algorithm achieves a higher diagnostic precision. However, further work is needed to reduce the training time. Acknowledgments. This project was supported by the Scientific Research Fund of Zhejiang Provincial Education Department (No. Y200805069, No. Y200908922).
References
1. American Cancer Society: Breast Cancer Facts & Figures 2009-2010. Technical Report, American Cancer Society, Atlanta, Georgia (2010)
2. Mangasarian, O.L., Nick Street, W.: Breast Cancer Diagnosis and Prognosis via Linear Programming, http://www.aaai.org/Papers/Symposia/Spring/1994/SS-94-01/SS94-01-019.pdf
3. Akay, M.F.: Support Vector Machines Combined with Feature Selection for Breast Cancer Diagnosis. Expert Systems with Applications 36(2), Part 2, 3240–3247 (2009)
4. Abbass, H.A.: An Evolutionary Artificial Neural Networks Approach for Breast Cancer Diagnosis. Artificial Intelligence in Medicine 25(3), 265–281 (2002)
5. Liu, L., Deng, M.: An Evolutionary Artificial Neural Network Approach for Breast Cancer Diagnosis. In: 2010 Third International Conference on Knowledge Discovery and Data Mining (WKDD), pp. 593–596 (2010)
6. Hou, Z., Noori, M., St, R.: Wavelet-Based Approach for Structural Damage Detection. Journal of Engineering Mechanics 126(7), 677–683 (2000)
7. Jun-hua, Q.J.: Study of Target Tracking Based on Wavelet Neural Network. Control & Automation 25(8), 129–131 (2009)
8. Ling, S.H., Leung, F.H.F.: Genetic Algorithm-Based Variable Translation Wavelet Neural Network and Its Application. In: Proceedings of the International Joint Conference on Neural Networks, Montreal, Canada, pp. 1365–1370 (2005)
9. Delyon, B., Judilsky, A.J.: Accuracy Analysis for Wavelet Approximations. IEEE Trans. on Neural Networks 6(2), 332–384 (1995)
10. Beyer, H.G., Deb, K.J.: On self-adaptive features in real-parameter evolutionary algorithms. IEEE Transactions on Evolutionary Computation 5(3), 250–270 (2001)
11. William, H.W.: UCI repository of machine learning, http://www.ailab.si/orange/doc/datasets/breast-cancer-wisconsin-cont.tab
Lattice-Based Artificial Endocrine System Qingzheng Xu1,2, Lei Wang1, and Na Wang2 1
School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China 2 Xi’an Communication Institute, Xi’an, China [email protected]
Abstract. To address the problems of homogeneous endocrine cells and the lack of a time concept in hormone transportation and metabolism in the digital hormone model, a lattice-based artificial endocrine system (LAES) model inspired by modern endocrinology theory is proposed. Based upon a latticed environment, supported by cell intellectualization, jointed by cumulative hormone, and directed by target cells, the LAES model adapts itself to continuous changes of the external environment and maintains the relative stability of the internal environment. Endocrine cells are classified into regular endocrine cells and optimum endocrine cells, reflecting the diversity and complexity of the endocrine system. The model mimics the dynamic process of hormone transportation, and the hormone concentration is determined not only by the current distribution of endocrine cells, but also by their past distribution. The experiments show that it can overcome complex interference such as multiple target cells and multiple obstacles. Keywords: artificial endocrine system, endocrine cell, target cell, hormone.
1 Introduction
Until recently, it was thought that the major regulating systems of the human body, namely the cerebral nervous system, the immune system and the endocrine system, functioned independently of each other. It is now known, bolstered by modern scientific research, that they are in fact all integrated into one single system of information communication [1,2]. With bidirectional information transmission between cytokines, neurotransmitters and hormones, these systems interact and cooperate with each other to organize a cubic and intelligent regulatory network. We believe that this regulatory structure functions in the regulation of metabolism, growth, development, reproduction, thinking and motion in most mammals, including humans, and is responsible for making adaptive responses and maintaining the long-term dynamic equilibrium of the organism when internal and external environments change rapidly and the physiological balance is disturbed. Enormous achievements in the theory, models and applications of Artificial Neural Networks [3-5] and Artificial Immune Systems [6-8] have shown the significant theoretical meaning and practical value of intelligent system research based on biological information processing mechanisms. At the same time, they have inspired interest and enthusiasm in research on other biological information
processing systems, including the endocrine system. Comparatively speaking, progress has been slow: research on the Artificial Endocrine System (AES) is still at the stage of discipline creation and preliminary exploration, and a great many challenges remain regarding theoretical models and engineering applications. The Autonomous Decentralized Systems (ADS) proposed by Mori in 1984 is perhaps the earliest attempt to use a hormone-inspired methodology to build systems that are robust, flexible, and capable of on-line repair [9-11]. In ADS, the content code communication protocol is developed so that autonomous systems communicate not by "addresses" but by the content of messages. Shen puts forward the Digital Hormone Model (DHM) as a distributed control method for robot swarming behaviors. The model uses the advantages of Turing's reaction-diffusion model, stochastic reasoning and action, dynamic network reconfiguration, distributed control, self-organization, and adaptive and learning techniques [12-14]. The method has proved itself through its extensive application areas, its simplicity and its robustness, which have promoted its reputation in many fields [15-17]. It is quite likely that it meets the requirements of a general theoretical model of AES. With the continued development and expansion of the theoretical model and application fields of DHM, its core idea and key techniques have been understood more deeply, and some inherent disadvantages have gradually been discovered, as discussed below. First of all, more than 200 distinct species of hormones in our body have been detected and recognized so far. These hormones, with complex sources and distinct functions, are widely distributed in the blood, tissue fluid, intercellular fluid, intracellular fluid, ganglion vesicle gaps and other parts. In the DHM model, all endocrine cells have the same physiological function and the nature of all hormones is identical, which cannot reflect the diversity of hormones and the complexity of interactions among hormones. Secondly, the effective concentration of a hormone is determined jointly by its synthesis and release speed and its degradation and conversion speed, which are exquisitely regulated by the cerebral nervous system so as to keep the physiological concentration of the hormone at a smooth level. For some unknown reason, the authors of the DHM model assume that hormones degrade or convert to other forms very quickly and completely. As a result, the hormone concentration is merely related to the current distribution of endocrine cells and has nothing to do with their past distribution. Finally, the DHM model lacks a coordination and cooperation mechanism among endocrine cells, which makes it difficult to overcome interference from complicated external environments, such as multiple target cells and barriers.
2 LAES Model
In order to resolve these problems, inspired by the information processing mechanisms of the endocrine system, the Lattice-based AES (LAES) is presented in this paper. Mathematically speaking, LAES can be abstracted as a quintuple LAES = (L_d, EC, TC, H, A), where LAES stands for the lattice-based artificial endocrine system, which is constituted by five components: the environmental space L_d, endocrine cells EC, target cells TC, hormones H and the algorithm A. The former four are discussed in the next subsection, and the algorithm A is described separately in Section 2.2.
2.1 Design of LAES
In our model, all LAES elements can only survive, communicate, move and die out in a bounded square called the environment space L_d, where the positive integer d stands for the dimension of the environment space. In the human body, as we all know, the endocrine system has very complex functions and an extensive action domain, and all endocrine cells are distributed freely in continuous space, each occupying a given position. For simplicity, the two-dimensional environment space is first discretized, or latticed, and each lattice can only contain one endocrine cell. The lattice form is driven by the practical requirements of applications, and the standard square is widely used, as shown in the background of Fig. 1. We let L_xy denote the lattice unit at row x and column y. It is occupied by nothing when L_xy = 0, and by obstacles such as walls, bars, doors, or rivers when L_xy = -1.
Fig. 1. Lattice-based Artificial Endocrine System
The endocrine cell EC, also known as a unit cell or elementary cell, is the most fundamental component of the LAES model. Out of consideration for low cost and complexity, each endocrine cell should be as simple as possible. In this paper, each endocrine cell is equipped with two sensors A1 and A2 and one releaser B1. Sensor A1 is responsible for perceiving the hormone concentrations of neighboring lattices, and sensor A2 for perceiving the distance from the current location to a target cell. Modern endocrinology tells us that the hormone release process is staged and that the activity period is very short compared with the hormone's lifetime. We therefore make the assumption that the hormone release process is discrete, and releaser B1 is responsible for releasing a certain quantity of hormones at a suitable time. In addition, it is well known that there are a variety of hormones with distinct functions in our body. Hence, according to the distances between endocrine cells and target cells, endocrine cells are divided into two sorts in this paper, namely regular endocrine cells C_regular (as shown at the upper left in Fig. 1) and optimum endocrine cells C_optimum (as shown at the upper right in Fig. 1). The optimum endocrine cell is the cell closest to a target cell in the current state, and all other cells are regular endocrine cells.
The target cell TC is the organ or cell that accepts stimulation from endocrine cells in our body. Its receptor has the capability of binding directly with a specific hormone. Generally, a target cell is used as the target and task to be seized and accomplished in our paper. A target cell is abstracted by a simple releaser B2, which constantly releases hormones of an appropriate concentration to attract surrounding endocrine cells, as shown at the bottom in Fig. 1. It is important to note that endocrine cells do not appear from nothing or fade away; they only communicate with each other and move following the corresponding movement rule. When an endocrine cell arrives at the position of a target cell and achieves the established target, that target cell dies out and automatically stops releasing any hormone. Hormones H are efficient bio-active substances secreted by endocrine cells and endocrine glands. They play an important role in affecting physiological functions and adjusting the metabolism of tissue cells. In this paper, hormones are classified into repulsion hormones H_r and attraction hormones H_a (as shown in the middle of Fig. 1). Generally, a target cell releases only attraction hormone, which helps the endocrine cells to search for and seize it. On the contrary, a regular endocrine cell releases only repulsion hormone, which holds the others back from searching a similar area. An optimum endocrine cell releases both less repulsion and more attraction hormone, which leads the others to search a similar area and increases the probability of seizing the target cell. It should be noted that all hormones released by endocrine cells and target cells can survive only in their action sphere. However, selecting an advisable neighborhood is relatively complicated and hard work, usually depending on the features of the problem, the computing resources and the preference of the decision-maker. The Von Neumann type, Moore type and extended Moore type are common types of neighborhood, as in cellular automata models. These neighborhoods can be described as follows:
N_{Neumann} = \{ (N_x, N_y) \mid |N_x - x| + |N_y - y| \le 1, (N_x, N_y) \in Z^2 \}    (1)

N_{Moore} = \{ (N_x, N_y) \mid |N_x - x| \le 1, |N_y - y| \le 1, (N_x, N_y) \in Z^2 \}    (2)

N_{Moore-r} = \{ (N_x, N_y) \mid |N_x - x| \le r, |N_y - y| \le r, (N_x, N_y) \in Z^2 \}    (3)
N_x and N_y are the row and column coordinates of a neighborhood lattice, respectively, x and y are the row and column coordinates of the central cell, and r is the neighborhood radius. In the biological endocrine system, after being synthesized and released by endocrine cells, hormones move into the corresponding neighborhood through the blood circulatory system. In the blood circulatory system, a hormone can exist for a long time in two forms: inactive bound hormone and active dissociative hormone. So we can assume that the active hormone concentration at different locations follows a normal distribution during transportation. The ith type of hormone released from an endocrine cell at position L_ab ((a, b) \in N_{Moore-r}) at moment t is transferred to position L_xy through the blood circulatory system, and the concentration of dissociative hormone at this position can be described as follows:
H_i(x, y, t) = \frac{a_i}{2\pi\sigma_i^2} \, e^{-\frac{(x-a)^2 + (y-b)^2}{2\sigma_i^2}}    (4)

where \sigma_i is the standard deviation representing the transportation loss, and a_i is a constant
representing the active hormone ratio. As time goes on, hormones may run out as a result of conversion or degradation, and the hormone concentration naturally decreases. Thus, endocrine cells are required to synthesize and release new hormones so as to maintain the natural equilibrium of the hormone concentration in the organism. After metabolism, the cumulative hormone concentration at L_xy is equal to the sum of the remaining active hormone concentration and the newly synthesized and released hormone concentration. It can be described as follows:

H(x, y, t) = \begin{cases} \sum_{j=1}^{m} \sum_{i=1}^{n} H_i(x, y, t) + H'(x, y, t), & t > 1 \\ \sum_{j=1}^{m} \sum_{i=1}^{n} H_i(x, y, t), & t = 1 \end{cases}    (5)

H'(x, y, t) = (1 - \alpha) H(x, y, t - 1)    (6)
where \alpha is the metabolism extinction coefficient, m is the number of endocrine cells in the action sphere and n is the number of hormone types. Giving equal consideration to the remaining active hormone and the new hormone, the metabolism extinction coefficient is set to 0.5 in the following simulation experiments.
2.2 Design of Algorithm
Algorithm A is a loop process in which the fundamental elements of LAES communicate, move and finish the designated task together by making use of the cumulative hormones. The flow chart of the algorithm is shown in Fig. 2. An endocrine cell in LAES selects its next move direction based on the movement rule R, which is a dynamic function conditioned on three local factors: cell condition, cumulative hormone concentration and local topology. Obviously, the movement rule R depends only on local information and is homogeneous for all endocrine cells in the two-dimensional plane. Even so, it can greatly influence the sophisticated behaviors of the system, as shown in the following experiments, and gives significant help in predicting and analyzing the global system performance. Suppose h_0 is the cumulative hormone concentration of the current lattice at step k, and h_1, h_2, h_3, ..., h_8 are the cumulative hormone concentrations of its eight surrounding lattices. We will test two movement rules, R1 and R2. Based on the above definitions, movement rule R1 can be described as follows.
Step 1. Compute the selection probability p_i (i = 0, 1, ..., 8) according to the cumulative hormone concentration h_i. Various alternatives are possible; Eqn. (7) is used in our work.
p_i = \begin{cases} 10 \times h_i, & h_i > 0 \\ 1, & h_i = 0 \\ -1/h_i, & h_i < 0 \end{cases}    (7)
Step 2. Determine the next position of the endocrine cell according to the roulette selection rule.
Step 3. The endocrine cell moves in a virtual way. If several endocrine cells occupy the same lattice, they move towards the nearest vacant lattice.
Correspondingly, movement rule R2 (also named Metropolis) can be described as follows.
Step 1. Select a new position from the nine candidate locations randomly.
Step 2. Compute the cumulative hormone concentration H*(x, y, t) at the new location.
Step 3. If the cumulative hormone concentration increases, the new coordinate is accepted. If it declines, the new coordinate is accepted with probability e^{\beta (H^*(x, y, t) - H(x, y, t))}.
Step 4. The endocrine cell moves in a virtual way. If several endocrine cells occupy the same lattice, they move towards the nearest vacant lattice.
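To make the update loop concrete, the sketch below accumulates the hormone field of Eqs. (4)-(6) on a small lattice and moves one cell by rule R1 using Eq. (7) and roulette selection; encoding repulsion as negative concentration and the particular release amounts and sigma values are our own illustrative choices.

```python
import numpy as np

def deposit(H_new, x, y, amount, sigma, r=3):
    """Add one release following the Gaussian spread of Eq. (4) inside a Moore-r neighborhood."""
    for a in range(max(0, x - r), min(H_new.shape[0], x + r + 1)):
        for b in range(max(0, y - r), min(H_new.shape[1], y + r + 1)):
            d2 = (a - x) ** 2 + (b - y) ** 2
            H_new[a, b] += amount / (2 * np.pi * sigma ** 2) * np.exp(-d2 / (2 * sigma ** 2))

def step_hormone(H_prev, cells, targets, alpha=0.5, sigma=1.5):
    """Eqs. (5)-(6): new releases plus (1 - alpha) of the previous field."""
    H_new = (1.0 - alpha) * H_prev
    for (x, y) in cells:
        deposit(H_new, x, y, amount=-1.0, sigma=sigma)   # regular cells: repulsion (negative, assumed)
    for (x, y) in targets:
        deposit(H_new, x, y, amount=+5.0, sigma=sigma)   # target cells: attraction
    return H_new

def move_R1(H, x, y):
    """Movement rule R1: probabilities from Eq. (7) over the 3x3 neighborhood, roulette selection."""
    cand, p = [], []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            a, b = x + dx, y + dy
            if 0 <= a < H.shape[0] and 0 <= b < H.shape[1]:
                h = H[a, b]
                p.append(10.0 * h if h > 0 else (1.0 if h == 0 else -1.0 / h))
                cand.append((a, b))
    p = np.array(p)
    return cand[np.random.choice(len(cand), p=p / p.sum())]
```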
3 Experimental Results
In order to verify the accuracy and efficiency of the LAES model, we conducted the following three experiments. To eliminate the random error from the initial distribution of endocrine cells and the iterations of the model itself, each experiment was independently carried out 50 times. All algorithms were programmed in the JAVA language and the experimental data were analyzed and processed with SPSS 14.0. The running environment was a Pentium IV 2.4 GHz with 512 MB of RAM. The environmental space of the AES is a latticed network of 100×100 and the size of the endocrine cell population is 100. With a relatively simple scene setting, the aim of the first experiment is to verify the correctness of LAES and to compare the search and seizure capacity of different models (as shown in Fig. 3). As shown in the figure, the endocrine cells are initially concentrated at the upper left corner, distributed near (5, 5), and the target cell is placed at position (55, 55). In this way, the relative distribution of endocrine cells and target cells is kept the same in the unbounded and bounded cases. From Table 1, we can see that LAES exhibits identical performance indexes, without statistical difference, in iterations and number of moves in the unbounded and bounded cases. However, the running time in the bounded case is a little less than that in the unbounded case. In these experiments, the initial coordinates of the endocrine cells are at the top left corner and the target cell is at the center of the two-dimensional space. Thus, in the bounded case, the downward and rightward move frequency of the endocrine
cells is much higher than the upward and leftward frequency, as shown in Table 1. In contrast, in the unbounded case, the frequencies in the different directions are almost identical. Considering all the above results, we believe that the LAES model displays the same performance in the unbounded and bounded cases. Hence, for simplicity, all the following experiments are performed in the bounded case.
Fig. 2. Flow Chart of Algorithm
Fig. 4 vividly describes the running results of several kinds of algorithms. Furthermore, it can be seen clearly from Figs. 3 and 4 that LAES based on the Metropolis rule (the third row) is inferior to LAES based on movement rule R1 (the first row). Generally, it spends more time and steps to seize the target cell and explores less space in the same number of steps. For this reason, all the following experiments are performed with movement rule R1. In addition, we observed that LAES does not show an obvious superiority to DHM (the fourth row) in this set of experiments.
Fig. 3. Examples in searching and seizing a target

Table 1. Performance comparison of the LAES model in the unbounded and bounded cases
Model              Iterations         Move numbers        Spent time (ms)     Downwards/Upwards frequency   Rightwards/Leftwards frequency
                   Mean    t-test     Mean     t-test     Mean     t-test     Mean        t-test            Mean        t-test
LAES (bounded)     392     0.630      52113    0.986      6348     0.010      1.11727     0.000             0.99985     0.000
LAES (unbounded)   384                52151               7088                1.12064                       0.99910
It should be specially noted that, in both the LAES and DHM models, not all endocrine cells are devoted to the same target cell; sufficient endocrine cells keep searching for other potential targets in the open environment space. This automatic dynamic balancing between the global searching and local seizing tasks is partly due to the non-deterministic, probabilistic endocrine cell behavior rule in the LAES and DHM models. Normally, due to the existence of optimum endocrine cells in LAES, its endocrine cells are more concentrated on a certain target cell.
Fig. 4. Performance comparison of different models (a) steps; (b) move numbers; (c) time
The aim of the second set of experiments is mainly to verify the capacity of the LAES model in dealing with multiple target cells (as shown in Fig. 5). The scenario is similar to the previous experiment. Endocrine cells are initially distributed near (5, 5) and three target cells are located at (10, 90), (90, 10) and (90, 90), respectively.
Fig. 5. Examples in searching and seizing multiple target cells
From Fig. 6, it can be seen that the costs of LAES and DHM to search for and seize the first target cell are almost identical, which corresponds with the results of the first group of experiments. Because the distances from the initial endocrine cell positions to the two targets at (10, 90) and (90, 10) are equal, the endocrine cells in the two models are similarly close to the second target cell when seizing the first one. So the algorithms take a similar cost to seize the second target cell, which is the minimum over the whole process. However, there is a relatively significant difference between the two models in searching for and seizing the third target cell. LAES takes less time and fewer resources compared with seizing the first target cell. On the contrary, the cost spent by DHM at this stage is usually higher than that for the first target cell. The difference is mainly due to the optimum endocrine cells in LAES. The experiment includes three target cells, so LAES usually generates three optimum endocrine cells which respectively search for and acquire their own target cell and interact with each other. Theoretically speaking, there is a chance for endocrine cells to seize the farthest target first. In 50 independent runs, we observed 11 such actual examples in LAES and none in DHM.
Fig. 6. Performance comparison in searching and seizing multiple target cells (a) steps; (b) move numbers; (c) time
In the process of searching for and achieving their goals, endocrine cells will inevitably face barriers or pitfalls, such as houses, walls, ravines and rivers. The aim of
the third group of experiments is mainly to verify the barrier-circumventing capacity of LAES (as shown in Fig. 7). Likewise, the endocrine cells are initially distributed near (5, 5), the target cell is located at (90, 90) and a barrier is sprinkled crosswise at (30, 30). It is impossible for an endocrine cell to directly traverse the barrier, whereas it is possible for hormones, by means of transportation in the blood.
Fig. 7. Examples in bypassing barriers
As can be seen from Table 2, when a barrier exists in the environment space, the endocrine cells in both models can successfully traverse the barrier after iterating, continue to search for and acquire target cells and finish the designated task, but the algorithm performance is reduced to different extents. Since DHM spends more steps than LAES, more endocrine cells in DHM have traversed the barrier than in LAES when the algorithm ends.

Table 2. Performance comparison in bypassing barriers
Barrier   Method   Steps        Running time (ms)   Number escaping from encirclement
YES       DHM      2043±419     30708±6388          57.0±5.9
YES       LAES     1519±412     23459±7029          48.4±9.0
NO        DHM      1680±276     25378±4166          -
NO        LAES     1002±389     15563±6043          -
4 Conclusion
Considering some disadvantages of DHM and inspired by the biological endocrine system, the authors put forward the lattice-based artificial endocrine system. Based upon a latticed environment, supported by cell intellectualization, jointed by cumulative hormone, and directed by target cells, the LAES model adapts itself to continuous changes of the external environment and maintains the relative stability of the internal environment. This model simulates the heterogeneity of endocrine cells and hormones, reflects the diversity of the endocrine system and also mimics the dynamic process of hormone transportation through the blood. Furthermore, the hormone concentration is related not only to the current distribution of endocrine cells, but also to their past distribution. According to the experimental results, LAES can overcome interference from complicated external environments, such as multiple target cells and barriers.
Our further research will be aimed at investigating the behavior and performance of LAES in a dynamic environment, for example when a barrier or target cell moves quickly, when the number of endocrine cells or target cells rises or falls during algorithm execution, or when some target cells call for the coordination of multiple endocrine cells. Acknowledgment. We would like to thank Dr. Weimin Shen from the University of Southern California for his valuable help. This work was supported by the National Natural Science Foundation of China (Grant Nos. 60603026, 60802056) and the Natural Science Foundation of Shaanxi Province (No. 2010JM8028).
References 1. Felig, P., Frohman, L.A.: Endocrinology and Metabolism, 4th edn. The McGraw-Hill Companies, Ohio (2001) 2. Liao, E.Y., Mou, Z.H.: Endocrinology, 2nd edn. People’s Medical Publishing House, Beijing (2007) (in Chinese) 3. White, H., Gallant, A.R., Hornik, K., Stinchcombe, M., Wooldridge, J.: Artificial Neural Networks: Approximation and Learning Theory. Blackwell Pub., New Jersey (1992) 4. El Sharkawi, M.A., Mori, H., Niebur, D., Pao, Y.H.: Overview of Artificial Neural Networks. IIEEE, New York (2000) 5. Graupe, D.: Principles of Artificial Neural Networks. World Scientific Publishing Company, New Jersey (2007) 6. Dasgupta, D.: Artificial Immune Systems and Their Applications. Springer, Heidelberg (1998) 7. De Castro, L.N., Timmis, J.: Artificial Immune Systems: A New Computational Intelligence Approach. Springer, Heidelberg (2002) 8. Dasgupta, D., Nino, F.: Immunological Computation: Theory and Applications. Auerbach Publications, Florida (2008) 9. Ihara, H., Mori, K.: Autonomous decentralized computer control systems. IEEE Computer 17, 57–66 (1984) 10. Miyamoto, S., Mori, K., Ihara, H.: Autonomous decentralized control and its application to the rapid transit system. International Journal of Computer in Industry 5, 115–124 (1984) 11. Mori, K.: Autonomous decentralized system technologies and their application to train transport operation system. In: Winter, V.L., Bhattacharya, S. (eds.) High Integrity Software, pp. 89–111. Springer, Heidelberg (2001) 12. Shen, W.M., Chuong, C.M., Will, P.: Digital hormone model for self-organization. In: The 8th International Conference on Artificial Life, pp. 116–120. ACM, New York (2002) 13. Shen, W.M., Chuong, C.M., Will, P.: Simulating self-organization for multi-robot systems. In: 2002 IEEE/RSJ International Conference on Intelligent Robots and System, pp. 2776– 2781. IEEE, New York (2002) 14. Shen, W.M.: Self-organization through digital hormones. IEEE Intelligent Systems 18, 81–83 (2003) 15. Shen, W.M., Will, P., Galstyan, A., Chuong, C.M.: Hormone-inspired self-organization and distributed control of robotic swarms. Autonomous Robots 17, 93–105 (2004) 16. Jiang, T.X., Widelitz, R.B., Shen, W.M., Will, P., Wu, D.Y., Lin, C.M., Jung, H.S., Chuong, C.M.: Integument pattern formation involves genetic and epigenetic controls: Feather arrays simulated by digital hormone models. International Journal of Developmental Biology 48, 117–135 (2004) 17. Bayindir, L., Sahin, E.: A review of studies in swarm robotics. Turkish Journal Electrical Engineering and Computer Sciences 15, 115–147 (2007)
Direct Sparse Nearest Feature Classifier for Face Recognition
Ran He, Nanhai Yang, Xiu-Kun Wang, and Guo-Zhen Tan
School of Computer Science, Dalian University of Technology, 116024 Dalian, China
{rhe,nanhai,jsjwxk,gztan}@dlut.edu
Abstract. Sparse signal representation offers a novel insight into the face recognition problem. Based on the sparse assumption that a new object can be sparsely represented by other objects, we propose a simple yet efficient direct sparse nearest feature classifier to deal with the problem of automatic real-time face recognition. Firstly, we present a new method which calculates an approximate sparse code to alleviate the extrapolation and interpolation inaccuracy of nearest feature classifiers. Secondly, a sparse score normalization method is developed to normalize the calculated scores and to achieve a high receiver operator characteristic (ROC) curve. Experiments on the FRGC and PIE face databases show that our method obtains results comparable to sparse representation-based classification in terms of both recognition rate and ROC curve. Keywords: Nearest feature classifier, Sparse representation, Receiver operator characteristic, Face recognition.
1 Introduction
In many pattern recognition applications, multiple feature points are available for a class. Such information can be used to further improve classification performance, and this has received more and more attention in the recent decade. There are two common categories: nearest feature classifiers (NFC) and linear representation. The simplest nearest feature classifier is the nearest feature line (NFL) [1] algorithm, which assumes that the variation of a sample manifold draws a trajectory linking sample points in the feature space. The set of all trajectories constitutes a subspace that approximates the manifold. NFL calculates the minimum distance between a query feature point and the feature lines connecting any two feature points of a class. In a straightforward way, NFL can be extended to the nearest feature plane (NFP) [2], nearest feature space (NFS) [3], or nearest manifold (NM) [4] by considering the distance between the query point and its projection onto the feature plane, subspace or manifold. In [7], feature lines are used to augment the number of prototypes for locally linear embedding. In [8], the NFL distance is extended by a nearest intra-class subspace derived from a regularized principal component analysis. In [9], the kernel method is introduced into NFL, NFP and NFS to construct a nonlinear feature space. In [10], a rectified NFL segment method is presented to overcome extrapolation inaccuracy. In [11], a subspace is constructed by the intra-class NFL distance to achieve a desirable
discriminating ability. In [12], the concept of feature lines is introduced into dissimilarity representation and dissimilarity-based classification. In [13], feature lines and planes are constructed only from the prototypes which are the neighbors of a query point, in order to reduce computation cost. Early works on linear representation are the nearest linear combination (NLC) and nearest constrained linear combination (NCLC) [2]. NLC and NCLC use the pseudoinverse matrix to learn a linear representation for one class and select the minimum reconstruction error as the minimum distance. K-Local Hyperplane (HKNN) [5] attacks the same problem as NCLC from a different view, like NFP. The nearest subspace (NS) [6] calculates the minimum distance to the subspace spanned by an orthogonal basis of a class. In [14], a linear regression method is used for face recognition, where all samples of one subject are used to learn a linear representation. Recently, sparse representation-based classification (SRC) [16] has offered a novel insight into the object recognition problem. The experimental results illustrate that SRC can significantly improve receiver operator characteristic (ROC) curves compared with other methods. However, SRC entirely depends on finding sparse codes, which remains a very difficult computational problem, and solving the sparse problem requires a very high computation cost. In this paper, we present a simple yet efficient direct sparse nearest feature classifier based on the assumption that a new image object can be sparsely represented by other ones. The contribution of our work is two-fold: 1) Assuming that the representations of feature points and the scores of all classes are both sparse, we present a direct sparse nearest feature classifier (DSNF). It learns an approximate sparse code in a simple greedy way based on the nearest feature classifier. 2) A sparse score normalization is also presented to improve the ROC curve of nearest feature classifier methods and linear representation methods. The new method is validated on three commonly used face features: Eigenfaces, Fisherfaces and LBP features. Experimental results show our method's efficiency and effectiveness on both recognition rate and ROC curve.
Fig. 1. Extrapolation and interpolation inaccuracy of nearest feature classifier
2 Direct Sparse Nearest Feature Classifier A major problem of the nearest feature classifier is the extrapolation and interpolation inaccuracy problem [10]. Fig. 1 shows an example of this problem. In Fig. 1, there are two query points (circle points) and three feature points (square points) belonging to one class. It is easy to see that q1 and q2 can be uniquely calculated from x2 and x3. Thus the distances of the two query points to the class are both zero. However, it may be more suitable to use the dashed lines to represent the distances of the two query points to the
class. The inaccurate case of q1 is often called interpolation inaccuracy, and the inaccurate case of q2 is often called extrapolation inaccuracy. In Fig. 1, we also observe that point x1 is closer to q1 than x3, and that x1 is closer to q2 than x2. It seems that a point far away from the query point receives a larger coefficient in the linear representation when the extrapolation and interpolation inaccuracy occur. For the case of q1, the coefficient corresponding to x1 is 0 and the coefficient corresponding to x3 is nonzero. If we remove a faraway point which has a larger coefficient than a nearer point to the query point, we can learn a correct distance. Hence we present a new nearest feature classifier based on the strategy of iteratively removing some faraway points. Since we assume that the points with larger coefficients are close to the query point and the coefficients of faraway points tend to be small, we call this method the direct sparse nearest feature classifier (DSNF).
Algorithm 1. Direct sparse nearest feature classifier (DSNF)
1: Input: matrices of training samples X1, ..., XK for K classes, a test sample z, and a sparse factor s1
2: Calculate the residual dk(z) for each Xk
2.1: Normalize the columns of Xk to have unit l2-norm, and sort the columns x of Xk according to their distance to z.
2.2: Let Xk = [xk2 − xk1, ..., xkm − xk1] and z' = z − xk1, and solve the following nearest feature classifier problem:
min_a || z' − Xk a ||_2^2    (10)
2.3: For i = km−1 down to s1: if ai > (1/s1) Σ_{j=1..s1} aj, then remove the last column from Xk and recompute a according to (10).
2.4: dk(z) = || Xk a − z' ||_2
3: Output: identity(z) = arg min_k dk(z) and dk(z)
Algorithm 1 outlines the procedure of DSNF. In step 2.1, we normalize each sample to have unit l2-norm and rearrange Xk according to the distances of its columns to z. Inspired by the tangent distance and the NCLC method, we construct the sample matrix Xk in step 2.2 by assuming that the nearest sample to z will always receive a large coefficient. In step 2.3, a faraway sample which has a large coefficient is iteratively removed from Xk to alleviate the extrapolation and interpolation inaccuracy in the nearest feature classifier. We consider the proposed iterative and greedy strategy to coincide with the results of sparse representation methods [22][15]. Instead of computing a linear representation for a single class, SRC assumes that there is a sparse representation over all sample points. Fig. 2(a) shows a sparse code of a query feature point on the FRGC database (SRC is solved by [18]). We sort all feature points according to their
Euclidean distances to the query feature point. It is easy to find that the large sparse coefficients often lie in several nearest feature points. This phenomenon also occurs in the nonnegative sparse representation method [15]. Hence, we make use of the greedy strategy to learn an approximate sparse linear representation for NFCs.
Fig. 2. Sparse coefficient and the corresponding residuals of a query feature point calculated by the sparse code
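To make the greedy removal in Algorithm 1 concrete, the following NumPy sketch gives one possible reading of steps 2.1–2.4. The function names, the use of an unconstrained least-squares solver, the absolute values in the coefficient test and the reading of the threshold (1/s1)·Σ_{j≤s1} aj are our own assumptions, not the authors' implementation.

```python
import numpy as np

def dsnf_residual(X_k, z, s1=5):
    """Sketch of Algorithm 1 for one class: approximate residual d_k(z) of query z."""
    # Step 2.1: unit l2-norm columns, sorted by distance to z
    X = X_k / np.linalg.norm(X_k, axis=0, keepdims=True)
    X = X[:, np.argsort(np.linalg.norm(X - z[:, None], axis=0))]
    # Step 2.2: differences to the nearest sample and shifted query
    x1 = X[:, 0]
    B = X[:, 1:] - x1[:, None]          # columns x_k2 - x_k1, ..., x_km - x_k1
    z_prime = z - x1
    a, *_ = np.linalg.lstsq(B, z_prime, rcond=None)
    # Step 2.3: greedily drop the farthest column while its coefficient is large
    while B.shape[1] > s1:
        thresh = np.abs(a[:s1]).mean()   # assumed reading of (1/s1) * sum_{j<=s1} a_j
        if np.abs(a[-1]) <= thresh:
            break
        B = B[:, :-1]
        a, *_ = np.linalg.lstsq(B, z_prime, rcond=None)
    # Step 2.4: residual of the class
    return np.linalg.norm(B @ a - z_prime)

def dsnf_classify(class_matrices, z, s1=5):
    """class_matrices: dict label -> (d, m) sample matrix. Returns (label, scores)."""
    scores = {k: dsnf_residual(Xk, z, s1) for k, Xk in class_matrices.items()}
    return min(scores, key=scores.get), scores
```

Each call only solves a few small least-squares problems over the samples of a single class, which is consistent with the real-time claim made for DSNF below.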
3 Sparse Score Normalization Although NFC and linear representation methods have significantly improved the recognition rate for face recognition, their improvements of the ROC are quite limited [16]. SRC has shown its advantage over traditional methods on ROC curves. Fig. 2(b) shows the scores (or residuals) corresponding to the sparse code in Fig. 2(a). We can observe that the scores are also sparse: only several scores are significantly lower than the others, which have large values and small variations. We call the scores generated by a sparse code sparse scores. For the sparse score in Fig. 2, there are two categories. One category occupies the large majority of the entries (their scores are close to 1); the other category occupies only a small part of the entries. We can see that in Fig. 2(b) there are only several entries whose scores are smaller than 1. Inspired by the sparse score in SRC, we divide the scores of DSNF into two parts and normalize them separately.
Algorithm 2. Sparse score normalization for DSNF
1: Input: the dk(z) (k = 1, ..., K) computed by DSNF and a factor s2
2: Sort the dk(z) in ascending order. Let set I1 = [1, ..., s2] index the s2 smallest scores and set I2 = [s2+1, ..., K] the remaining ones
3: Compute the mean m1 = mean_{k∈I1}(dk(z)) and the standard deviation v = std_{k∈I1}(dk(z)) on set I1, and then normalize all scores by dk(z) = (dk(z) − m1) / v
4: Compute the mean m2 = mean_{k∈I2}(dk(z)) and normalize all scores by dk(z) = (dk(z) − m2) / m2
5: Output: the normalized scores dk(z)
Algorithm 2 summarizes our sparse score normalization algorithm for the direct sparse nearest feature classifier. Firstly, we sort the dk(z) in ascending order and divide them into two sets: subset I1 = [1, ..., s2] and subset I2 = [s2+1, ..., K]. Secondly, in order to construct a sparse score, we utilize the mean and variance of set I1 to normalize all dk(z). Lastly, we assume that the scores in subset I2 have small variations and a mean value similar to that of the sparse score of SRC in Fig. 2(b), and we therefore utilize the mean value of subset I2 to further normalize all dk(z). Experimental results demonstrate that the proposed score normalization method can significantly improve the ROC curves compared with the state-of-the-art SRC method. The computational costs of both Algorithm 1 and Algorithm 2 depend only on the number of sample points of a single subject instead of all sample points. Hence the computational cost of DSNF is the same as that of linear representation methods, and DSNF can be used in real-time recognition applications [14].
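A short NumPy sketch of Algorithm 2 is given below. Whether the standard deviation is the biased or unbiased estimator is not specified in the paper, so NumPy's default is used here; the function name is ours.

```python
import numpy as np

def sparse_score_normalization(d, s2=4):
    """Sketch of Algorithm 2: d holds the K class residuals d_k(z) from DSNF."""
    d = np.asarray(d, dtype=float).copy()
    order = np.argsort(d)
    I1, I2 = order[:s2], order[s2:]      # s2 smallest scores vs. the rest
    # Step 3: normalize with the mean and standard deviation of set I1
    d = (d - d[I1].mean()) / d[I1].std()
    # Step 4: normalize with the mean of set I2 (taken after the first normalization)
    m2 = d[I2].mean()
    return (d - m2) / m2
```

The normalized scores can then be thresholded directly, which is one way to produce ROC curves such as those reported in Section 4.4.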
4 Experimental Verification To evaluate our method, we perform the experiments on two publicly available datasets for face recognition, and compare performance across various feature spaces and with several popular classifiers. The robustness and effectiveness are demonstrated by ROC curves and recognition rates. We set s1=5 and s2=4 in the experiment. The SRC is solved by [18].
Fig. 3. Cropped facial images of one subject for our experiment in PIE and FRGC. The first ten facial images are in the gallery set and others in the probe set.
4.1 Datasets CMU PIE Database: The facial images are collected from a subset of the PIE [23] face database, which contains more than 40,000 facial images of 68 subjects. These still images are acquired across different poses, illuminations and facial expressions. A subset is selected in our experiment which contains five near-frontal poses (C27, C05, C29, C09 and C07) and illuminations indexed by 03, 08, 11 and 17, so there are 20 images for each subject. The first row of Fig. 3 shows the twenty images of one subject. We take the first 10 facial images of each person as the gallery set and the remaining 10 images as the probe set. The grayscale facial images are cropped according to the positions of the eyes and normalized to dimension 64*64. Because the number of subjects is 68, the maximal dimension for Fisherfaces is only 67.
FRGC Database: This experiment is performed on a subset of facial images in FRGC version 2 [24]. There are 8014 images of 466 subjects in the query set for FRGC experiment 4. These uncontrolled still images contain variations of illumination, expression, time, and blurring. However, only two facial images are available for some persons. Thus, a subset is selected in our experiments. We take the first 20 facial images of each subject if the number of facial images is not less than 20 (the second row of Fig. 3 shows the twenty images of one subject). Then we get 3720 facial images of 186 subjects. We divide the 3720 images into two subsets: the first 10 facial images of each person form the gallery set and the remaining 10 images form the probe set. The grayscale facial images are cropped according to the positions of the eyes and normalized to dimension 64*64. Because the number of subjects is 186, the maximal dimension for Fisherfaces is only 185. 4.2 Facial Features Two expressive features and one non-Euclidean feature are used in our experiments. For expressive features, Eigenfaces [20] and Fisherfaces [21] have played an important role in the development of face recognition. They can reduce the high image dimension to a lower one. We test the proposed method on these two common feature spaces. We should note that Fisherfaces differ from other features because the maximal number of valid Fisherfaces is one less than the number of classes. Eigenfaces reduce the image space to a PCA subspace in which 98% of the PCA energy is retained. For non-Euclidean features, the local binary patterns (LBP) algorithm is a more recent approach which has proven superior in face recognition tasks [22]. We perform LBP on a cropped facial image and then subdivide it into 7 × 7 grids, on which histograms with 59 bins are calculated. An LBP feature vector is obtained by concatenating the feature vectors of all grids. Here we use 58 uniform patterns for LBP, and each uniform pattern accounts for one bin; the remaining 198 binary patterns are all put in another bin, resulting in a 59-bin histogram. So the number of features in an LBP feature vector is 59 × (7 × 7) = 2891. The settings are consistent with those in [22]. If we directly take LBP as the facial feature, calculating the sparse codes of SRC easily suffers from the 'ill-matrix condition' [18] and a high computation cost. Thus we reduce the 2891-dimensional LBP features to a low dimension by using PCA. 4.3 Recognition Rates Figure 4 shows the recognition rate performance for various feature spaces on the PIE database, in conjunction with five different classifiers: NN, NS, NFL, SRC and DSNF. The maximum recognition rates for NN, NS, NFL, SRC and DSNF are 91.6%, 95%, 95%, 96.6% and 95.6%, respectively. Although SRC achieves the best recognition rate of 96.6% on the LBP+PCA feature space, DSNF gains a comparable recognition rate with a small computation cost and can perform the face recognition task in real time. Furthermore, DSNF performs better than SRC on Eigenfaces and Fisherfaces; it achieves a higher recognition rate than the other methods on the two expressive features. DSNF achieves recognition rates between 75% and 95.6% for the different feature spaces, and SRC achieves recognition rates between 72% and 96.6%. The performance of all methods varies with the choice of feature space. They
all depend on a good choice of "optimal" features. It seems that LBP features are more powerful than the other features on the subset of the PIE database. Figure 5 further compares DSNF to the other four algorithms on the FRGC database. We can observe that Fisherfaces are more powerful than the other features and all methods achieve their best rate on Fisherfaces. As in Fig. 4(b), DSNF performs better than the other methods on Fisherfaces, whereas SRC does not. On the Eigenfaces and LBP features, SRC performs better than the other methods.
(a) Eigenface
(b) Fisherface
(c) LBP
Fig. 4. Recognition rates on various feature spaces using classifiers in PIE database
(a) Eigenface
(b) Fisherface
(c) LBP
Fig. 5. Recognition rates on various feature spaces using classifiers in FRGC database
4.4 ROC Curves The receiver operator characteristic (ROC) curve is an important standard for evaluating different face recognition methods. The ROC curve can be represented equivalently by plotting the false acceptance rate (FAR) vs. the verification rate (VR). It is often used to measure the accuracy of outlier rejection. FAR is the percentage of test samples that are accepted but wrongly classified; VR is the percentage of valid test samples that are accepted. A good algorithm should achieve high verification rates even at very low false acceptance rates. Figure 6 and Figure 7 show the ROC curves of the different methods on the PIE and FRGC databases, respectively. We can observe that SRC and DSNF consistently outperform the other methods; they gain a significant improvement in the ROC. In
the PIE database, DSNF seems to perform slightly better than SRC in the ROC curves on all feature spaces. This improvement of DSNF benefits from the sparse score normalization. On the FRGC database, although the recognition rates of DSNF are lower than those of SRC on the Eigenface and LBP feature spaces, DSNF can achieve ROC curves similar to those of SRC. Both SRC and DSNF obtain significant improvements on the ROC curves.
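For readers who want to reproduce such curves, the sketch below shows one common way to sweep a threshold over the normalized scores. The acceptance rule (accept a sample when the score of its claimed class falls below the threshold) and the variable names are our own assumptions, since the paper does not spell out its verification protocol.

```python
import numpy as np

def far_vr_curve(scores, genuine, thresholds):
    """scores: normalized d_k(z) of the claimed class for each test sample
    (lower means more similar); genuine: True where the claimed identity is correct."""
    scores = np.asarray(scores, dtype=float)
    genuine = np.asarray(genuine, dtype=bool)
    far, vr = [], []
    for t in thresholds:
        accepted = scores < t
        far.append(accepted[~genuine].mean())   # impostor claims that get accepted
        vr.append(accepted[genuine].mean())     # valid claims that get accepted
    return np.array(far), np.array(vr)
```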
(a) Eigenface
(b) Fisherface
(c) LBP
Fig. 6. ROC curves on various feature spaces using classifiers in PIE database
(a) Eigenface
(b) Fisherface
(c) LBP
Fig. 7. ROC curves on various feature spaces using classifiers in FRGC database
5 Conclusions This paper presents a direct sparse nearest feature classifier for the real-time face recognition problem. The method first calculates a direct sparse code for each class and then learns a sparse score. It can improve both the recognition rate and the ROC curve. The proposed sparse score normalization method can also be extended to other NFC, tangent distance and linear representation methods to further improve their ROC curves. Compared with the SRC method, it achieves a comparable ROC curve with a small computation cost. Both DSNF and SRC are based on the sparse assumption. Further work includes applying our method to object recognition and verifying its performance when the sparse assumption is not satisfied. Acknowledgments. This work was supported by DUT R & D Start-up costs.
References 1. Li, S.Z., Lu, J.: Face recognition using nearest feature line method. IEEE Trans. Neural Networks 10(2), 439–443 (1999) 2. Li, S.Z.: Face Recognition Based on Nearest Linear Combinations. In: CVPR (1998) 3. Chien, J.T., Wu, C.C.: Discriminant waveletfaces and nearest feature classifiers for face recognition. IEEE Trans. PAMI 24(12), 1644–1649 (2002) 4. Zhang, J., Li, S.Z., Wang, J.: Nearest manifold approach for face recognition. In: Automatic Face and Gesture Recognition, pp. 223–228 (2004) 5. Vincent, P., Bengio, Y.: K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms. In: Advances in Neural Information Processing Systems, vol. 14, pp. 985–992 (2001) 6. Ho, J., Yang, M., Lim, J., Lee, K., Kriegman, D.: Clustering appearances of objects under varying illumination conditions. In: CVPR, pp. 11–18 (2003) 7. Zhan, D.-C., Zhou, Z.-H.: Neighbor line-based locally linear embedding. In: Ng, W.-K., Kitsuregawa, M., Li, J., Chang, K. (eds.) PAKDD 2006. LNCS (LNAI), vol. 3918, pp. 606–615. Springer, Heidelberg (2006) 8. Liu, W., Wang, Y.H., Li, S.Z., Tan, T.N.: Nearest intra-class space classifier for face recognition. In: Proceedings of the 17th ICPR, pp. 495–498 (2004) 9. He, Y.H.: Face Recognition Using Kernel Nearest Feature Classifiers. In: Int. Conf. on Computational Intelligence and Security, vol. 1, pp. 678–683 (2006) 10. Du, H., Chen, Y.Q.: Rectified nearest feature line segment for pattern classification. Pattern Recognition 40(5), 1486–1497 (2007) 11. Pang, Y., Yuan, Y., Li, X.: Generalized nearest feature line for subspace learning. IEE Electronics Letters 43(20), 1079–1080 (2007) 12. Orozco-Alzate, M., Duin, R.P.W., Castellanos-Dominguez, C.G.: Generalizing Dissimilarity Representations Using Feature Lines. In: Rueda, L., Mery, D., Kittler, J. (eds.) CIARP 2007. LNCS, vol. 4756, pp. 370–379. Springer, Heidelberg (2007) 13. Zheng, W.M., Zhao, L., Zou, C.R.: Locally nearest neighbor classifiers for pattern classification. Pattern Recognition 37, 1307–1309 (2004) 14. Naseem, I., Togneri, R., Bennamoun, M.: Linear regression for face recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence (2009) (accepted) 15. He, R., Hu, B.G., Zheng, W.S., Guo, Y.Q.: Two-stage Sparse Representation for Robust Recognition on Large-scale Database. In: Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2010) (2010) 16. Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. IEEE Trans. PAMI (March 2008) 17. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004) 18. Candes, E., Romberg, J.: l1-magic: Recovery of sparse signals via convex programming (2005), http://www.acm.caltech.edu/l1magic/ 19. Martinez, A., Benavente, R.: The AR face database. CVC Tech. Report 24 (1998) 20. Turk, M., Pentland, A.: Eigenfaces for recognition. In: CVPR (1991) 21. Belhumeur, P., Hespanda, J., Kriegman, D.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. on Pattern Analysis and Machine Intelligence 19(7), 711–720 (1997) 22. Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: Application to face recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence 28(12), 2037–2041 (2006) 23. Sim, T., Baker, S., Bsat, M.: The CMU Pose, Illumination, and Expression Database. IEEE Transactions on PAMI 25(12), 1615–1618 (2003) 24. Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W.: Overview of the face recognition grand challenge. In: CVPR (2005)
A Mathematical Model of Myelodysplastic Syndromes: The Effect of Stem Cell Niches Xiuwei Zhu, Ling Xia*, and Luyao Lu Key Lab of Biomedical Engineering of Ministry of Education, Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027, China [email protected]
Abstract. While myelodysplastic syndromes (MDS) are commonly observed nowadays, the underlying mechanisms remain unclear, and mathematical models for MDS are lacking. In this work, by incorporating the concept of stem cell niches, we propose a minimal mathematical model that can be used as a platform for studying the formation and treatment of MDS. Our model includes two main compartments, bone marrow and peripheral blood, and in both compartments normal and abnormal cells exist. Simulation results show that 1) under normal conditions, our model robustly reproduces hemopoiesis even under different perturbations; 2) by reducing stem cell niches, the formation of MDS can be observed in our model; 3) treatments should be used to improve the environment in the bone marrow, rather than only to kill the abnormal cells. Keywords: myelodysplastic syndrome, stem cell niche, bone marrow, peripheral blood.
1 Introduction The myelodysplastic syndromes (MDSs) are a family of clonal disorders of hematopoietic stem cells characterized by abnormal differentiation and maturation of myeloid cells, bone marrow failure, and a genetic instability with an enhanced risk of transformation to acute myeloid leukemia (AML) [1]. Their malignancy is age-related, and a substantial proportion of these diseases are related to exposure to environmental or occupational toxins. The estimated incidence in the United States is more than 10,000 cases per year and is likely to increase [2]. However, our understanding of the pathogenesis of MDSs is far from clear, since MDSs are clinically and pathologically heterogeneous. Mathematically, no previous model has been developed for the formation and development of MDS. Even for the normal hematopoietic system, only a few mathematical models have been established, in a relatively simple way. For example, Obeyesekere et al. [3] proposed a mathematical model for the kinetics of hemopoietic cells, including CD34+ cells, white blood cells and platelets. Their model reproduced
a steady state of hemopoiesis even when perturbations were applied. Predictions were also made with their model for the recovery of the hematopoietic system after high-dose chemotherapy. However, their model cannot be used to explain the underlying mechanism of the formation of MDS. Currently, it is widely accepted that a stem cell's fate is at least partially dependent on the stem cell niche, which is a particular growth environment consisting of different cell types and extracellular matrix components [4-6]. Besides the maintenance of stem cells [7-9], the niche has also been suggested to play an important role in the determination of the stem cell's fate [10-13]. The nutrients and molecules in the niche not only physically determine the size of the stem cell population, but also affect the rate at which the stem cells proliferate [14]. Deregulation of the niche leading to an imbalance of proliferation and differentiation may result in both tumorigenesis [13] and the progression of cancer. Indeed, it has been suggested that targeting the niche could result in a reduction of the tumor burden [15-17]. The goal of this paper is to implement a minimal model to study the mechanism of the formation and development of MDS. With this model, we can not only study the stability of normal hemopoiesis, but also simulate the initiation of MDS. It is obvious that our model is not a complete replica of the hemopoietic system; rather, its value lies in its general qualitative behavior. However, we hope that our model can provide a new platform for studying MDS and even for curing this disease in the future.
2 Material and Method While MDS is commonly observed nowadays, the underlying mechanism is unclear. Based on a mathematical model of normal hemopoiesis [3], we propose a minimal model that can serve as a tool for studying MDS. Detailed descriptions are given below. 2.1 Model Description In this model, two compartments are considered: bone marrow (BM) and peripheral blood (PB). Figure 1 shows the elements of the two compartments and their interactions. The BM compartment consists of bone marrow stem cells, in either normal (denoted by S) or abnormal (SA) form. In the PB compartment, for simplification, we use whole blood cells (WBC) to represent all functional mature cells, such as white blood cells, red blood cells and platelets. In the same way, normal blood cells (denoted by W) and abnormal blood cells (WA) are assumed. Under normal conditions, stem cells undergo self-renewal to maintain their population (at a rate denoted by aS), and the progenitor lineages within the bone marrow (TL) differentiate into functional mature cells that move into the PB compartment. Cells in PB decay at a certain rate (dW) due to many causes, for example apoptosis. A feedback (f1) is used to keep the WBC pool in a steady state. Under pathological conditions, abnormal stem cells are formed, and consequently abnormal cells appear in PB.
Fig. 1. The schematic figure for interactions between specific components within the hemopoietic system
The dynamics of abnormal cells is similar to that of normal cells, only with some parameters different (see the next section). The dynamics of all of the components described above are modeled by a system of ordinary differential equations (ODEs), equations 1-4, given below. 2.2 Mathematical Equations The dynamics of the system shown in Figure 1 can be described mathematically by the following ODEs:
dS/dt = ωS*S + ω*SA − α*S − TL    (1)
dW/dt = aamp*TL − dW*W    (2)
dSA/dt = ωSA*SA + α*S − ω*SA − TLA    (3)
dWA/dt = aampA*TLA − dWA*WA    (4)
Each of the terms in the above equations is further defined by equations 5-10. Here, we introduce the concept of the stem cell niche. Since niches are necessary for stem cells, we suppose that one niche serves one stem cell, and Hn is the total number of niches. Because free niches decrease as the stem cells increase, the self-renewal rate of stem cells decreases as well [18]. Therefore, we use a reverse sigmoid function to model the relationship between the count of stem cells and their self-renewal rate (equations 5, 6). Note that Vt in equation 6 is the total volume of bone marrow. The definitions of all these parameters can be found in Table 1.
ωS = aS / (1 + exp(10*S/Hn − 5))    (5)
ωSA = aS / (1 + exp(10*SA/(Vt − Hn) − 5))    (6)
TL = (aT + f1*(1 − W/Wn)) * S    (7)
TLA = (aT + f2) * SA    (8)
α = a1*(S − Hn) / (S + SA)    (9)
ω = a2*SA / (S + SA)    (10)
2.3 Parameter Settings All of the parameter values used in this model (equations 5-10) are given in Table 1. Most of these values are derived from experimental data or previously published models [3]; however, due to the absence of experimental biological data, the other parameters are selected in this model to observe realistic steady-state values.
Table 1. Model parameter values
Parameter | Value | Definition
aS (+) | 0.1/day | Self-renewal rate
aT (+) | 3.956/day | Differentiation rate
f1 | 0.1/day | Feedback strength of normal blood cells
f2 | 0.1/day | Feedback strength of abnormal blood cells
a1 | 1.0 | Transform rate from S to SA
a2 | 0.01 | Transform rate from SA to S
aamp | 700 | Amplification value
aampA | 1400 | Amplification value
dW (+) | 0.7908/day | Decay of whole blood cells
dWA | 0.7615/day | Decay of abnormal blood cells
Vt | 1.1086×10^6 cells/ml | Total volume of bone marrow
Hn | 1.1086×10^6 cells/ml | Niches under normal condition
Sn (*) | 1.1086×10^6 cells/ml | Steady-state stem cells
Wn (*) | 277.3×10^6 cells/ml | Steady-state whole blood cells
SAn | 0 | Steady-state abnormal stem cells
WAn | 0 | Steady-state abnormal whole blood cells
(+) derived from Korbling et al. [19]; (*) derived from Obeyesekere et al. [3].
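The system (1)-(10) with the Table 1 values is small enough to reproduce with any stiff ODE solver; the paper uses MATLAB's ode15s (see Section 3). The sketch below uses SciPy's BDF solver instead. The parameter dictionary names are our shorthand, and the guard clauses for Hn = Vt and S + SA = 0 are our own assumptions, since the equations are not defined at those points.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameter values taken from Table 1
p = dict(aS=0.1, aT=3.956, f1=0.1, f2=0.1, a1=1.0, a2=0.01,
         aamp=700.0, aampA=1400.0, dW=0.7908, dWA=0.7615,
         Vt=1.1086e6, Hn=1.1086e6, Wn=277.3e6)

def mds_rhs(t, y, p):
    """Right-hand side of equations (1)-(4), using the terms defined in (5)-(10)."""
    S, W, SA, WA = y
    wS = p['aS'] / (1.0 + np.exp(10.0 * S / p['Hn'] - 5.0))                      # eq. (5)
    wSA = 0.0 if p['Vt'] == p['Hn'] else \
        p['aS'] / (1.0 + np.exp(10.0 * SA / (p['Vt'] - p['Hn']) - 5.0))           # eq. (6)
    TL = (p['aT'] + p['f1'] * (1.0 - W / p['Wn'])) * S                            # eq. (7)
    TLA = (p['aT'] + p['f2']) * SA                                                # eq. (8)
    tot = S + SA
    alpha = p['a1'] * (S - p['Hn']) / tot if tot > 0 else 0.0                     # eq. (9)
    omega = p['a2'] * SA / tot if tot > 0 else 0.0                                # eq. (10)
    dS = wS * S + omega * SA - alpha * S - TL            # eq. (1)
    dW = p['aamp'] * TL - p['dW'] * W                    # eq. (2)
    dSA = wSA * SA + alpha * S - omega * SA - TLA        # eq. (3)
    dWA = p['aampA'] * TLA - p['dWA'] * WA               # eq. (4)
    return [dS, dW, dSA, dWA]

# Start from the normal steady state; perturbations (e.g. W = 0 at t = 10 days, or a
# reduced Hn) can be applied by changing y0 or p before integrating.
y0 = [p['Hn'], p['Wn'], 0.0, 0.0]
sol = solve_ivp(mds_rhs, (0.0, 100.0), y0, args=(p,), method='BDF', max_step=0.5)
```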
3 Results and Discussion The model explained in the previous section is numerically solved to obtain the time-dependent solutions of S, W, SA and WA. The simulations were implemented using the ODE solver ode15s provided by MATLAB R2009a. 3.1 Stability of the Basic Model Our first simulation examines the recovery of the hemopoietic system when perturbed by three different conditions at time t = 10 days. In case (0), all peripheral cells are annihilated, that is, W = 0. In case (1), the stem cell population S is reduced to half of its normal value. In case (2), S is reduced to 0.1% of its normal value.
Fig. 2. Simulations for three different perturbations. Line (0): W = 0. Line (1): S = 0.5*Sn. Line (2): S = 0.001*Sn.
From Figure 2 we can see that a perturbation in the peripheral blood is quickly recovered to the normal state, while reductions of stem cells require a longer time to recover. This observation is important because it shows the role that stem cells play in maintaining hemopoiesis under normal conditions. Furthermore, these simulation results validate the stability of our model. 3.2 Effect of Stem Cell Niches on the Hemopoietic System Let us now consider pathological conditions, that is, when the stem cell niches change from the normal state. As described above, stem cell niches occupy the whole bone
marrow in the normal state, i.e., Hn = Vt, under which condition no stem cells change into the abnormal form. However, what if the stem cell niches are partly destroyed by an external stimulus, for example chemotherapy or radiotherapy? To answer this question, we performed two simulations by adding perturbations of the stem cell niches at time t = 10 days. While case (0) is the normal condition, in case (1) the stem cell niches are reduced to half of normal, i.e., Hn = 0.5*Vt, and in case (2) the niches are reduced to a quarter of normal, i.e., Hn = 0.25*Vt.
Fig. 3. Simulations of perturbation in stem cell niches. Line (0): Hn = Vt. Line (1): Hn = 0.5*Vt. Line (2): Hn = 0.25*Vt. (a) shows the change of stem cells in BM, (b) the changes of blood cells in PB. Note that the curves at the top of each panel are normal cells, while the curves at the bottom are abnormal cells.
From Figure 3 we can see that once the niches are decreased, a proportion of the normal stem cells move out of the niches and consequently become abnormal stem cells. Another steady state is achieved when a balance between S and SA is reached. In the PB compartment (Figure 3b), normal WBC are reduced and abnormal WBC are generated and increase. Though the sum of W and WA is increased, the functional mature cells in PB are reduced, since WA represents blood cells that have lost their normal physiological functions. This observation can be seen as the occurrence of anemia, which is a sign of MDS. Moreover, from the simulation results we can conclude that the more the stem cell niches are reduced, the more likely MDS is to be observed.
3.3 Responses to Different Treatments In this section, let us consider theoretical treatments for the system when MDS is already observed, i.e., abnormal stem cells and abnormal blood cells exist due to the reduction of stem cell niches. For simplification, we just take the case Hn = 0.5*Vt as an example. The same curves as those in Figure 3 are observed if there is no treatment (line 0 in Figure 4).
Fig. 4. Simulations of theoretical treatments for MDS. Line (0): no treatment; Line (1): WA= 0; Line (2): SA= 0; Line (3): Hn = Vt. (a) shows the change of stem cells in BM, (b) changes of blood cells in PB.
Then, three different theoretical treatments were applied at time t = 25 days, as follows: (1) all abnormal blood cells are killed, i.e., WA = 0; (2) all abnormal stem cells are killed, i.e., SA = 0; (3) the stem cell niches are recovered to the normal state, i.e., Hn = Vt. In test (1), WA was set to zero; however, the abnormal blood cells increased quickly back to the value before treatment. This is easily understood because the abnormal stem cells can differentiate into WA immediately. Even a small increase of SA can be seen in this test (line 1 in Figure 4a). In test (2), SA was set to zero, but again, we can see that it gradually increased to the value before treatment. In the PB compartment, although a decrease of WA was observed after treatment, it returned to the pre-treatment state within 10 days. Then, in test (3), the stem cell niches were expanded to their normal
value, i.e., Hn = Vt. This time, both SA and WA decreased to zero, and no relapse was observed. These results suggest the importance of stem cell niches in the treatment of MDS.
4 Conclusion and Outlook In this work, we proposed a minimal model that includes normal and abnormal stem cells in the bone marrow compartment and normal and abnormal blood cells in the peripheral blood compartment, with the purpose of studying the underlying mechanism of the formation and development of MDS. Though it is a simplification of the hemopoietic system, many of the basic mechanisms within hemopoiesis can be seen in our model, for example self-renewal and differentiation. More importantly, we incorporated the concept of stem cell niches in our work by modeling the niches as the container for normal stem cells. Thus, we can simulate the formation of MDS by applying perturbations to the stem cell niches. Simulation results show a good stability of our model under normal conditions; more importantly, our model suggests that the reduction of stem cell niches, due to either exposure to a harsh environment or excessive chemo- and radiotherapy, might be a potential cause of the formation of MDS. The outcomes of the theoretical treatments also imply the important role that stem cell niches play in MDS. Further work is needed to build a more precise mathematical model for MDS. For example, more feedbacks should be considered in this system, and the transformation between normal and abnormal stem cells should be modeled in a more detailed way. Moreover, the interactions between environment and genes are obviously important in determining the behavior of all kinds of cells, and signaling pathways are the linkages between intrinsic mechanisms and extrinsic factors; therefore, modeling of signaling pathways should also be incorporated. Hopefully, as experimental techniques and theoretical investigations develop, these points can be addressed in the future.
References 1. Valent, P., Horny, H.P., Bennett, J.M., Fonatsch, C., Germing, U., Greenberg, P., Haferlach, T., Haase, D., Kolb, H.J., Krieger, O., Loken, M., van de Loosdrecht, A., Ogata, K., Orfao, A., Pfeilstocker, M., Ruter, B., Sperr, W.R., Stauder, R., Wells, D.A.: Definitions and standards in the diagnosis and treatment of the myelodysplastic syndromes. In: Consensus statements and report from a working conference, vol. 31(6), Leuk Res, pp. 727–736 (2007) 2. Gondek, L.P., Tiu, R., O’Keefe, C.L., Sekeres, M.A., Theil, K.S., Maciejewski, J.P.: Chromosomal lesions and uniparental disomy detected by SNP arrays in MDS, MDS/MPD, and MDS-derived AML. Blood 111(3), 1534–1542 (2008) 3. Obeyesekere, M.N., Berry, R.W., Spicer, P.P., Korbling, M.: A mathematical model of haemopoiesis as exemplified by CD34 cell mobilization into the peripheral blood. Cell Prolif. 37(4), 279–294 (2004) 4. Adams, G.B., Scadden, D.T.: The hematopoietic stem cell in its place. Nat. Immunol. 7(4), 333–337 (2006)
5. Scadden, D.T.: The stem-cell niche as an entity of action. Nature 441(7097), 1075–1079 (2006) 6. Walker, M.R., Patel, K.K., Stappenbeck, T.S.: The stem cell niche. J. Pathol. 217(2), 169– 180 (2009) 7. Xie, T., Spradling, A.C.: A niche maintaining germ line stem cells in the Drosophila ovary. Science 290(5490), 328–330 (2000) 8. Zhang, J., Niu, C., Ye, L., Huang, H., He, X., Tong, W.G., Ross, J., Haug, J., Johnson, T., Feng, J.Q., Harris, S., Wiedemann, L.M., Mishina, Y., Li, L.: Identification of the haematopoietic stem cell niche and control of the niche size. Nature 425(6960), 836–841 (2003) 9. Visnjic, D., Kalajzic, Z., Rowe, D.W., Katavic, V., Lorenzo, J., Aguila, H.L.: Hematopoiesis is severely altered in mice with an induced osteoblast deficiency. Blood 103(9), 3258–3264 (2004) 10. Potten, C.S., Booth, C., Pritchard, D.M.: The intestinal epithelial stem cell: the mucosal governor. Int. J. Exp. Pathol. 78(4), 219–243 (1997) 11. Lechler, T., Fuchs, E.: Asymmetric cell divisions promote stratification and differentiation of mammalian skin. Nature 437(7056), 275–280 (2005) 12. Bjerknes, M., Cheng, H.: Clonal analysis of mouse intestinal epithelial progenitors. Gastroenterology 116(1), 7–14 (1999) 13. Li, L., Neaves, W.B.: Normal stem cells and cancer stem cells: the niche matters. Cancer Res. 66(9), 4553–4557 (2006) 14. Narbonne, P., Roy, R.: Regulation of germline stem cell proliferation downstream of nutrient sensing. Cell Div. 1, 29 (2006) 15. Joyce, J.A.: Therapeutic targeting of the tumor microenvironment. Cancer Cell 7(6), 513– 520 (2005) 16. Calabrese, C., Poppleton, H., Kocak, M., Hogg, T.L., Fuller, C., Hamner, B., Oh, E.Y., Gaber, M.W., Finklestein, D., Allen, M., Frank, A., Bayazitov, I.T., Zakharenko, S.S., Gajjar, A., Davidoff, A., Gilbertson, R.J.: A perivascular niche for brain tumor stem cells. Cancer Cell 11(1), 69–82 (2007) 17. Anderson, K.C.: Targeted therapy of multiple myeloma based upon tumor-microenvironmental interactions. Exp. Hematol. 35(4 suppl. 1), 155–162 (2007) 18. Morrison, S.J., Kimble, J.: Asymmetric and symmetric stem-cell divisions in development and cancer. Nature 441(7097), 1068–1074 (2006) 19. Korbling, M., Anderlini, P., Durett, A., Maadani, F., Bojko, P., Seong, D., Giralt, S., Khouri, I., Andersson, B., Mehra, R., van Besien, K., Mirza, N., Przepiorka, D., Champlin, R.: Delayed effects of rhG-CSF mobilization treatment and apheresis on circulating CD34+ and CD34+ Thy-1dim CD38- progenitor cells, and lymphoid subsets in normal stem cell donors for allogeneic transplantation. Bone Marrow Transplant. 18(6), 1073– 1079 (1996)
Ion Channel Modeling and Simulation Using Hybrid Functional Petri Net Yin Tang* and Fei Wang Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China [email protected]
Abstract. The neural system and ion channels remain among the most intractable issues in biology because of their complexity. A representation that combines the intuition of biologists with the computational ability needed for the ion channel system is of great importance. In this paper, we exploit the Hybrid Functional Petri net (HFPN) for representing ion channel dynamics. As an extension of the Petri net, HFPN allows both discrete and continuous factors and realizes ordinary differential equations (ODE), which makes it easy to handle biological factors in the ion channel system such as the open (closed) state of ion channels and the influx (efflux) of various ions. We show that neural elements can be naturally translated into HFPN. Simulation results of the action potential show that our model is very effective. Our work explores a novel approach for neuroscience research and a new application for Petri-net based methods. Keywords: Hybrid Functional Petri net, Dynamic model, Intuitive, Ion channel, Action potential.
1 Introduction The neural system is receiving more and more attention due to its high relevance to mental diseases. The behavior of ion channels, which are responsible for generating the action potential and altering the membrane potential, has been discovered to play key roles in neuronal and physiological functions. Various kinds of models have been studied for ion channel kinetics, such as differential equation models [1, 2, 3], discrete-state Markov models [4], Markov chain models [5] and fractal models [6], yielding interesting and enlightening results. Among these, differential equations play a central role in modeling ion channel behaviors because of their solid mathematical foundation. Luo and Rudy [1] proposed a model for the cardiac action potential. They used linear differential equations to describe the change of the potassium and sodium currents through ion channels, representing different states of the channels. Liebovitch et al. [6] employed Markov models for the discrete states (open/closed) of the ion channels. Their approach has fewer adjustable parameters and is more consistent with the dynamics of protein conformations. Mahajan et al. [2] combined differential equations
with Markov models and incorporated a phenomenological model. Their model for the cardiac action potential included the Cai cycling component and was more powerful and precise. White et al. [3] presented non-linear ordinary differential equation models for the voltage-gated channels. While differential equations are prevalent, there are still reasons for the increased interest in alternative methods. Discrete approaches are necessary because biological systems require logical analysis in addition to quantitative representation. For ion channels, we mainly focus on the logical state of the channel (activated, inactivated), and discrete elements fit these discrete states well. Moreover, an effective model should include the latest findings from related fields. However, making frequent changes to differential equation models can be tedious because they are not intuitive when representing biological systems. In order to describe ion channels directly, we turn our attention to Petri nets. Petri nets are graphical notations for modeling concurrent systems [7]. They have a history of 40 years, and Petri net based simulations benefit greatly from their intuitive, concurrent and mathematically well-founded representations. In addition, mathematical properties of Petri nets such as T-invariants have been discovered to be applicable to biological systems [8, 9, 10]. Extensions such as stochastic, hybrid and functional Petri nets have been developed [11, 12, 13], aiming to include random factors [11], continuous elements [12] and dynamic changes of the network structure [13]. The Hybrid functional Petri net (HFPN) was proposed in [14] as an extension of the hybrid Petri net that incorporates the functional Petri net [13]. It has a discrete part and a continuous part. The former allows logical analysis such as gene regulation and the state changing of ion channels. The latter realizes ODEs [15], which is crucial for molecular dynamics, e.g. Na+ influx and efflux. Stochastic factors, which are essential for biological systems, can be included in HFPNs. Well-constructed models based on HFPN have been built to give insight into plenty of biological processes [14, 16, 17, 18]. In this paper, we exploit HFPN for representing ion channels, where biochemistry and electrochemistry are both included. We build a model for the sodium and potassium channels, which are mainly responsible for the action potential, and simulate the process of depolarization and repolarization. We test the impact of outer stimuli of different periods and intensities on the action potential. Our paper is organized as follows: Section 2 gives a brief introduction to HFPN. Section 3 demonstrates our approach to building the model for ion channels using HFPN. Section 4 includes the simulation results of our model. Section 5 discusses the advantages and limitations of our approach and its extension to other neural issues.
2 HFPN We assume the reader is familiar with the Petri net and the Hybrid Petri Net (HPN) [12]. In this section, we give a brief introduction to HFPN and its representation of biological systems.
As an extension of traditional Petri net, HFPN also consists of four kinds of components: Place, Transition, Arc and Token. Places can hold tokens as their content. They connect with transitions through directed arcs. Arcs specify the rules of causality between places and transitions. A transition can fire, representing change of state in the system, as long as its firing condition is satisfied. The firing condition is specified in terms of the content of the input places of the transition. Firing of the transition results in tokens moving between places. Figure 1 shows a HFPN model for a biochemical reaction.
Fig. 1. HFPN representation of a biochemical reaction. In (b), circles A, B, C, D, E are (continuous) places. Square t1 is a (continuous) transition. A, B, C are input places for t1. m1, m2, m3, m4, m5 are the numbers of tokens held by (the contents of) the respective places. c1, c2, c4, c5 are normal arcs; c3 is a test arc. Places A and B represent reactants A and B, and places D and E are for products D and E. Place C represents enzyme C. t1 represents the biochemical reaction, and its firing speed k*m1*m2*m3 represents the speed of the reaction, where k is the reaction rate constant. Firing of t1 consumes tokens held by A and B and adds tokens to D and E. The number of tokens held by C remains unchanged.
HFPN has both continuous and discrete elements. A continuous place contains a real number as its content instead of the integer held by a discrete place. A continuous transition fires in a continuous way rather than the discrete way in which a discrete transition fires: it keeps on firing as long as its firing condition is satisfied and moves tokens between places at a certain speed. The speeds and delays of transitions can be functions that take the contents of places in the model as their variables. Test and inhibitory arcs are added for convenience. In biological Petri net modeling, places stand for object species in the biological system; tokens held by places stand for object instances; transitions stand for events modifying objects, e.g. biochemical reactions. Continuous components in HFPN are commonly used in the metabolic and signaling parts, where concentrations and reaction rates are the main concern, e.g., ion concentrations and the influx and efflux of ions. Discrete components are useful in control parts, such as gene regulation, where the states of molecules are more important. Test and inhibitory arcs connect reactions with their enzymes.
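A toy time-stepping sketch of the continuous part is given below for the reaction of Fig. 1 (A + B -> D + E, catalysed by C). The firing speed k*m1*m2*m3 and the role of the test arc follow the figure caption; the Euler update, the step size dt and the rate constant k are illustrative choices of ours, not values from the paper.

```python
def hfpn_step(marks, k=0.1, dt=0.01):
    """One Euler step of the continuous transition t1; marks holds places A..E."""
    speed = k * marks['A'] * marks['B'] * marks['C']    # firing speed of t1
    flow = speed * dt
    # firing condition: cannot consume more tokens than the input places hold
    flow = min(flow, marks['A'], marks['B'])
    marks['A'] -= flow          # consumed via normal arc c1
    marks['B'] -= flow          # consumed via normal arc c2
    marks['D'] += flow          # produced via arc c4
    marks['E'] += flow          # produced via arc c5
    # place C is connected by the test arc c3: read but never consumed
    return marks

marks = {'A': 10.0, 'B': 8.0, 'C': 1.0, 'D': 0.0, 'E': 0.0}
for _ in range(1000):
    marks = hfpn_step(marks)
```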
3 Model for Ion Channel Neurons maintain a voltage difference across the cell's plasma membrane known as the membrane potential. The difference is caused by relatively high concentrations of intracellular potassium ions and low concentrations of intracellular sodium ions. The underlying mechanism is the Na+/K+ pump, which moves these two ions in opposite directions, against their concentration gradients, through the membrane.
Fig. 2. Sodium and Potassium ion channels and Na+/K+ pump. Na+ and K+ move in opposite direction
Fig. 3. Different states of an ion channel. A voltage-gated ion channel has three states: Activated, Inactivated and Deactivated.
Ion channels allow ions to flow down their electrochemical gradients. They are crucial for an array of biological processes. Ion channels can be divided into two categories: voltage-gated ion channels open or close depending on the membrane potential, which in turn depends on various intracellular ion concentrations; ligand-gated ion channels open or close depending on the binding of ligands to the channel. In this paper, we mainly focus on voltage-gated sodium and potassium channels for their crucial roles in the nerve impulse and the action potential. Figure 3(a) shows the three states of a voltage-gated ion channel: deactivated, activated and inactivated. When the
membrane potential reaches its threshold, a deactivated ion channel becomes activated and allows certain ions to go through it, changing the membrane potential. When the membrane potential reaches the reversal potential, the activated ion channel turns into the inactivated state. At that time, the channel cannot be activated, even if the membrane potential is favorable. After a while, the ion channel becomes deactivated again. Figure 3(b) shows the HFPN representation of an ion channel. We use discrete elements to describe the behavior of an ion channel because the logical state of the channel is our main focus. These electrochemical events are much faster than biochemical ones; thus, changes between the states can be regarded as pulses and represented by discrete transitions. The delay of discrete transitions is well suited to representing the inactivated period of an ion channel.
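The three-state cycle just described can also be written as a small discrete state machine, which is how the discrete HFPN part behaves. The sketch below is ours: the threshold, reversal potential and refractory delay are illustrative numbers, not the parameters used in the paper's model.

```python
class VoltageGatedChannel:
    """Sketch of the deactivated -> activated -> inactivated cycle of Fig. 3."""
    def __init__(self, threshold=-55.0, reversal=30.0, refractory=20):
        self.state = 'deactivated'
        self.threshold = threshold      # mV at which the channel activates
        self.reversal = reversal        # mV at which it inactivates
        self.refractory = refractory    # delay (time steps) of the discrete transition
        self._timer = 0

    def step(self, membrane_potential):
        if self.state == 'deactivated' and membrane_potential >= self.threshold:
            self.state = 'activated'            # pulse-like discrete transition
        elif self.state == 'activated' and membrane_potential >= self.reversal:
            self.state = 'inactivated'
            self._timer = self.refractory       # cannot be activated during this delay
        elif self.state == 'inactivated':
            self._timer -= 1
            if self._timer <= 0:
                self.state = 'deactivated'
        return self.state == 'activated'        # True while ions may flow
```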
Fig. 4. The whole model of ion channels. The model covers all key parts which are responsible for the action potential, the sodium and potassium ion channels, membrane potential, the Na+/K+ pump, the influx and efflux of the sodium and potassium ions and outer stimulus.
Figure 4 is the whole model of the system. Discrete transition t40 represents an outer stimulus added to the system; stimuli of various periods and intensities can be added by changing the parameters of t40. Table 1 lists all the places in our model and their biological meanings. The model is built with Cell Illustrator 3.0 [19].
4 Results In this section, we present simulation results of our model. We add outer stimuli of various periods and intensities to see their impact on the action potential. 4.1 The Action Potential Figure 5(a) shows an idealized action potential. When the membrane potential reaches a certain level (the threshold of the Na+ channel), the Na+ channels open, allowing Na+ to enter the cell and causing depolarization. The membrane potential keeps increasing and activates the K+ channels. The K+ channels allow the efflux of K+, thus exerting a negative effect on the membrane potential. However, at this time the Na+ channels dominate and the membrane potential keeps rising. The Na+ channels become inactivated at the peak of the action potential while the efflux of K+ continues, resulting in a drop of the membrane potential and hyperpolarizing the cell. After that, the K+ channels become inactivated and the voltage returns to the resting value [20].
Fig. 5. Biological and simulation results of the action potential
Table 1. Places and their biological meanings
Name | Type | Meaning
MP | Continuous | Membrane potential
ST | Continuous | Outer stimulus
K+ | Continuous | Intracellular K+ concentration
K_out | Continuous | Extracellular K+ concentration
Na+ | Continuous | Intracellular Na+ concentration
Na_out | Continuous | Extracellular Na+ concentration
Na_K pump | Continuous | Na+/K+ pump
K | Discrete | K+ channel in deactivated state
K_ACT | Discrete | K+ channel in activated state
K_RP | Discrete | K+ channel in inactivated state
Na | Discrete | Na+ channel in deactivated state
Na_ACT | Discrete | Na+ channel in activated state
Na_RP | Discrete | Na+ channel in inactivated state
Figure 5(b) shows our simulation result of the action potential. It matches well with the idealized one. The depolarization, hyperpolarization, repolarization and refractory period are very clear in our result. The resting voltage and the peak of action potential are close to those shown in figure 5(a).
Fig. 6. Action potential caused by outer stimulus of different periods
4.2 Periodical Stimulation We add an outer stimulus of 20 mV to the system every 30, 40, or 60 time units to see whether an action potential is elicited by each stimulus. Figure 6 shows that our system reacts well to the outer stimulus: the system forms an action potential each time it is stimulated. 4.3 All or None Principle In this section we test the all-or-none principle on our model. The amplitude of an action potential is independent of the intensity of the outer stimulus; the action potential occurs fully or not at all. Figure 7 shows the simulation results of the system when it receives an outer stimulus of 20, 40 or 80 mV. We can see that the amplitude of the action potential remains unchanged, but it takes less time to reach the peak.
5 Discussion We believe the method proposed in this paper can be a new approach for neuroscience research. The graphical representations of ion channels, outer stimuli and
biochemical substances are very intuitive, thus providing a powerful way to test hypotheses set up by neurologists. Moreover, we extend Petri net based approaches to a new field where both biochemistry and electrochemistry are involved.
Fig. 7. Action potential caused by outer stimulus of different intensities
Fig. 8. HFPN representation for neuron spike. (a) shows two neurons with a synapse. Place N1 and N2 in (b) represent the two neurons.
In addition to ion channel modeling, HFPN is useful for other neural issues. For instance, at a higher level, neural spikes are pulse-like events that follow the "all or none" law, and discrete elements in HFPN match these features very well. Figure 8 shows the HFPN representation for the spread of a neural impulse between two neurons. In the future, HFPN can be applied to other biological issues. We will try to introduce stochastic factors into HFPN to improve its modeling ability for certain biological systems which require a stochastic approach. In addition, the current model
for ion channels can be refined by taking more biological facts into account. A more complete model will give us deeper insights by producing more significant results.
References 1. Luo, C.H., Rudy, Y.: A model of the ventricular cardiac action potential, depolarization, repolarization, and their interaction. Circ. Res. 68, 1501–1526 (1991) 2. Mahajan, A., et al.: A rabbit ventricular action potential model replicating cardiac dynamics at rapid heart rates. Biophys. J. 94, 392–410 (2008) 3. White, J.A., Klink, R., Alonso, A., Kay, A.R.: Noise from voltage-gated ion channels may influence neuronal dynamics in the entorhinal cortex. J. Neurophysiol. 80, 262–269 (1998) 4. Kienker, P.: Equivalence of Aggregated Markov Models of Ion-Channel Gating. Proceedings of the Royal Society of London. B. Biological Sciences 236(1284), 269–309 (1989) 5. Milescu, L., Akk, G., Sachs, F.: APR. Maximum likelihood estimation of ion channel kinetics from macroscopic currents. Biophysical Journal 88(4), 2494–2515 (2005) 6. Liebovitch, L.S., Fischbarg, J., Koniarek, J.P., Todorova, I., Wang, M.: Fractal model of ion-channel kinetics. Biochim. Biophys. Acta 896(2), 173–180 (1987) 7. Petri, C.A.: Kommunikation mit Automaten, PhD diss., University of Bonn, West Germany (1962) 8. Reddy, V.N., Mavrovouniotis, M.L., Liebman, M.N.: Petri net representations in metabolic pathways. In: Proc. First ISMB, pp. 328–336 (1993) 9. Koch, I., Junker, B.H., Heiner, M.: Application of petri net theory for modeling and validation of the sucrose breakdown pathway in the potato tuber. Bioinformatics 21, 1219– 1226 (2005) 10. Grunwald, S., Speer, A., Ackermann, J., Koch, I.: Petri net modeling of gene regulation of the Duchenne muscular dystrophy. J. BioSystems 92(2), 189–205 (2008) 11. Bause, F., Kritzinger, P.S.: Stochastic Petri Nets, An introduction to the Theory. Verlag Vieweg (1996) 12. David, R., Alla, H.: Discrete, Continuous and Hybrid Petri Nets. Springer, Heidelberg (2004) 13. Valk, R.: Self-modifying nets, a natural extension of Petri nets. In: Ausiello, G., Böhm, C. (eds.) ICALP 1978. LNCS, vol. 62, pp. 464–476. Springer, Heidelberg (1978) 14. Matsuno, H., Tanaka, Y., Aoshima, H., Doi, A., Matsui, M., Miyano, S.: Biopathways Representation and Simulation on Hybrid Functional Petri Net. In Silico Biology 3(3), 389–404 (2003) 15. Hardy, S., Robillard, P.N.: Phenomenological and molecular-level Petri net modeling and simulation of long-term potentiation. Biosystems 82, 26–38 (2005) 16. Doi, A., Fujuta, S., Matsuno, H., Nagasaki, M., Miyano, S.: Constructing biological pathway models with hybrid functional Petri nets. In Silico Biology 4, 271–291 (2004) 17. Doi, A., Nagasaki, M., Matsuno, H., Miyano, S.: Simulation-based validation of the p53 transcriptional activity with hybrid functional petri net. In Silico. Biol. 6, 1–3 (2006) 18. Troncale, S., Tahi, F., Campard, D., Vannier, J.P., Guespin, J.: Modeling and simulation with hybrid functional Petri nets of the role of interleukin-6 in human early haematopoiesis. In: Pac. Symp. Biocomput, vol. 11, pp. 427–438 (2006) 19. http://www.cellillustrator.com/ 20. Kuffler, S.W., Nicholls, J.G., Martin, A.R.: From Neuron to Brain. Sinauer Associates, Sunderland (1984)
Computer Simulation on the Compaction of Chromatin Fiber Induced by Salt
Chun-Cheng Zuo, Yong-Wu Zhao, Yong-Xia Zuo, Feng Ji, and Hao Zheng
Jilin University, College of Mechanical Science and Engineering, 130025, China
[email protected]
Abstract. We present a computer simulation of the compaction of the 30-nanometer chromatin fiber induced by salt. The nucleosome is represented as a rigid oblate ellipsoid, without consideration of the DNA-histone wrapping conformation. We find that the equilibrium conformations of multi-nucleosome chains at physiological ionic concentrations are more or less random “zig-zag” structures. Moreover, the diameter, the linear mass density and the persistence length of the fiber show a strong dependence on the ionic strength. The computational results show that decreasing the salt concentration from 0.15 M to 0.01 M leads to an increase in the diameter and the linear mass density and a decrease in the persistence length. Keywords: chromatin fiber, simulation, salt, compaction.
1 Introduction
In eukaryotes, the DNA inside the nucleus is packed into chromatin through several hierarchical levels of organization. The packing arrangement is of great significance to gene expression and DNA-protein interactions. The first building block of chromatin is the nucleosome—formed by a histone octamer (two copies each of H2A, H2B, H3 and H4)—around which 146 bp of DNA are wrapped in 1.75 turns of a left-handed helix. The multi-nucleosome chain then folds into a fiber-like structure with a diameter of approximately 30 nm, forming the second level of packaging. Many experimental techniques have been used to investigate the internal structure of 30 nm fibers, but the precise arrangement of DNA and protein remains a mystery. Two different models have been proposed to interpret the structure of the fiber [1]: the solenoid model and the zig-zag model [2-7]. The solenoid model and its variants were first introduced by Finch and Klug [8] and gained wide acceptance in the early days [9][10]. In this solenoid model, the next nucleosome succeeds the last nucleosome in the helical direction. The axis of each nucleosome is perpendicular to the axis of the solenoid. Both the DNA entry side and exit side point toward the axis of the fiber. The linker DNA between adjacent nucleosomes has to be bent or curled in some fashion in order to connect two
neighboring nucleosomes. However, cryo-electron microscopy and atomic force microscopy have recently shown that the chromatin fiber is a zig-zag chain with linker DNA running nearly straight between nucleosomes [11-14]. A clear contradiction thus arises: is the linker DNA bent or straight at physiological salt concentrations? Does the length of linker DNA decrease with increasing salt or not? The aim of this paper is to investigate the precise conformation of the 30-nanometer fiber at various salt concentrations. The numerical technique, including the system model and the parameters used in the simulations, is presented in Section 2. Results and discussion are presented in Section 3, and conclusions are given in Section 4.
2 Numerical Technique
2.1 System Model
The fiber system consists of nucleosomes and linker DNA segments. Without consideration of the internal DNA-histone structure, the nucleosome is modeled as a rigid ellipsoidal disk 11 nm in diameter and 5.5 nm in height. Two adjacent nucleosomes are connected by linker DNA, which is represented by cylindrical segments. The simplified geometry of the nucleosome is shown in Fig. 1.
Fig. 1. The schematic geometry of nucleosomes and linker DNA with (left side) and without (right side) consideration of DNA-histone interactions. The simplified model (right side) is used in our simulations.
The components of the fiber system are coupled to each other through stretching, bending, torsion, and electrostatic potentials. The stretching potential of linker DNA can be expressed as
\[ E_{str} = \frac{k_{str}}{2 l_0} (l - l_0)^2 \tag{1} \]
where \(k_{str}\) is the stretching rigidity, \(l_0\) is the equilibrium length of the linker DNA, and \(l\) is its actual length. The bending potential of linker DNA can be expressed as
\[ E_b = \frac{k_B T\, l_p}{2 l_0}\, \theta^2 \tag{2} \]
where \(k_B\) is the Boltzmann constant, \(T\) is the absolute temperature, \(l_p\) is the persistence length, and \(\theta\) is the opening angle between the entry and exit linker DNA segments. The torsion potential of linker DNA can be expressed as
\[ E_T = \frac{k_B T\, l_t}{2 l_0}\, \phi^2 \tag{3} \]
where \(l_t\) is the torsional persistence length and \(\phi\) is the torsion angle of the linker DNA. According to the Debye-Hückel approximation, the electrostatic interaction of DNA can be expressed as
\[ E_{ij}^{e} = \frac{\nu^2}{\varepsilon_r \varepsilon_0} \int d\lambda_i \int d\lambda_j\, \frac{\exp(-K r_{ij})}{r_{ij}} \tag{4} \]
where \(\nu\) is the linear charge density, \(\varepsilon_r\) is the relative dielectric constant of the solution, \(\varepsilon_0\) is the vacuum permittivity, and \(\lambda_i\) and \(\lambda_j\) are the distances of the current positions from the ends of the respective segments. \(K\) is the inverse of the Debye length, calculated from \(K^2 = 2 z^2 e^2 n_0 / (\varepsilon_r \varepsilon_0 k_B T)\), where \(z\) is the valence of the ions, \(e\) is the elementary charge, and \(n_0\) is the bulk ion concentration. The Gay-Berne potential is used to model the internucleosomal interactions:
\[ V(\hat{u}_1, \hat{u}_2, \hat{r}) = 4\,\varepsilon(\hat{u}_1, \hat{u}_2, \hat{r}) \left[ \left( \frac{\sigma_0}{r - \sigma(\hat{u}_1, \hat{u}_2, \hat{r}) + \sigma_0} \right)^{12} - \left( \frac{\sigma_0}{r - \sigma(\hat{u}_1, \hat{u}_2, \hat{r}) + \sigma_0} \right)^{6} \right] \tag{5} \]
where the vectors \(\hat{u}_1\) and \(\hat{u}_2\) point along the symmetry axes of the particles, \(\hat{r}\) is the unit vector of the center-to-center direction, and \(\sigma_0\) scales the width of the potential.
\[ \sigma(\hat{u}_1, \hat{u}_2, \hat{r}) = \sigma_0 \left( 1 - \frac{\chi}{2} \left\{ \frac{(\hat{r}\cdot\hat{u}_1 + \hat{r}\cdot\hat{u}_2)^2}{1 + \chi\, \hat{u}_1\cdot\hat{u}_2} + \frac{(\hat{r}\cdot\hat{u}_1 - \hat{r}\cdot\hat{u}_2)^2}{1 - \chi\, \hat{u}_1\cdot\hat{u}_2} \right\} \right)^{-1/2} \tag{6} \]
\[ \chi = \frac{\sigma_{\parallel}^2 - \sigma_{\perp}^2}{\sigma_{\parallel}^2 + \sigma_{\perp}^2} \tag{7} \]
\[ \varepsilon(\hat{u}_1, \hat{u}_2, \hat{r}) = \varepsilon^{\nu}(\hat{u}_1, \hat{u}_2)\, \varepsilon'^{\mu}(\hat{u}_1, \hat{u}_2, \hat{r}) \tag{8} \]
\[ \varepsilon(\hat{u}_1, \hat{u}_2) = \varepsilon_0 \left[ 1 - \chi^2 (\hat{u}_1\cdot\hat{u}_2)^2 \right]^{-1/2} \tag{9} \]
\[ \varepsilon'(\hat{u}_1, \hat{u}_2, \hat{r}) = 1 - \frac{\chi'}{2} \left[ \frac{(\hat{r}\cdot\hat{u}_1 + \hat{r}\cdot\hat{u}_2)^2}{1 + \chi'\, \hat{u}_1\cdot\hat{u}_2} + \frac{(\hat{r}\cdot\hat{u}_1 - \hat{r}\cdot\hat{u}_2)^2}{1 - \chi'\, \hat{u}_1\cdot\hat{u}_2} \right] \tag{10} \]
\[ \chi' = \frac{\varepsilon_s^{1/\mu} - \varepsilon_e^{1/\mu}}{\varepsilon_s^{1/\mu} + \varepsilon_e^{1/\mu}} \tag{11} \]
where \(\chi\) and \(\chi'\) define the anisotropy of the potential width and potential depth, respectively. \(\sigma_{\parallel}\) and \(\sigma_{\perp}\) represent the relative potential width for particles oriented parallel and orthogonal, respectively. \(\varepsilon_0\) represents the potential depth. \(\varepsilon_s\) and \(\varepsilon_e\) define the relative potential depth for particles in lateral and longitudinal orientation, respectively. \(\nu\) and \(\mu\) are dimensionless parameters.
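To make the linker-DNA terms of Eqs. (1)–(3) concrete, a minimal Python sketch is given below. This is only an illustration, not the authors' code; the parameter values passed to these functions (temperature, rigidities, lengths) are placeholders.

```python
KB = 1.380649e-23  # Boltzmann constant (J/K)

def stretch_energy(l, l0, k_str):
    """Eq. (1): harmonic stretching energy of a linker segment with rest length l0."""
    return 0.5 * k_str / l0 * (l - l0) ** 2

def bend_energy(theta, l0, lp, T=293.0):
    """Eq. (2): bending energy for opening angle theta (rad) and persistence length lp."""
    return 0.5 * KB * T * lp / l0 * theta ** 2

def torsion_energy(phi, l0, lt, T=293.0):
    """Eq. (3): torsional energy for twist angle phi (rad) and torsional persistence length lt."""
    return 0.5 * KB * T * lt / l0 * phi ** 2
```

For example, `stretch_energy(3.6, 3.52, k_str)` would give the stretching penalty of a 3.52 nm linker elongated to 3.6 nm, with `k_str` chosen by the user.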
2.2 Computer Simulation Procedure
The computer simulation procedure is the same as in references [15][16]. The classical Metropolis Monte Carlo method is used to randomly generate a statistically relevant set of representative configurations of the system at temperature \(T\). Starting from an arbitrary configuration, a new configuration is created randomly from the previous one.
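The Metropolis acceptance rule used in this procedure can be sketched as follows. `total_energy` and `propose_move` are hypothetical callables standing in for the full fiber energy of Eqs. (1)–(11) and for the move set, neither of which is spelled out in detail here.

```python
import math
import random

def metropolis_step(config, total_energy, propose_move, T=293.0, kB=1.380649e-23):
    """One Metropolis Monte Carlo step: propose a random change of the chain
    configuration and accept it with probability min(1, exp(-dE / (kB*T)))."""
    candidate = propose_move(config)                   # random local change
    dE = total_energy(candidate) - total_energy(config)
    if dE <= 0 or random.random() < math.exp(-dE / (kB * T)):
        return candidate                               # accept the new configuration
    return config                                      # reject and keep the old one
```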
3 Results and Discussion
3.1 Chromatin Fiber Configuration at Physiological Ionic Strength
We performed Monte Carlo simulations to obtain the chromatin fiber configurations at physiological ionic strength, as shown in Fig. 2. The starting configuration is arranged along a straight line, with adjacent nucleosomes placed in a toothed (zig-zag) arrangement. Neighboring nucleosomes are connected by a linker DNA of length 3.52 nm. The distance between the i-th and (i+2)-th nucleosomes is 6 nm. The initial bending angle and torsion angle are 180° and 0°, respectively. In our simulations, the equilibrium configurations are obtained after 250,000 Monte Carlo steps, where the nucleosomes are cross-linked. In this zig-zag model, the mean length of the linker DNA is 3.36 nm. The mean torsion angle and mean bending angle of the equilibrium configurations are 103.7° and 10.1°, respectively. 1500 equilibrium configurations are chosen to investigate the systematic properties. The diameter, linear mass density and persistence length of the fiber are 30.17 ± 0.05 nm, 6.34 ± 0.01 nucleosomes/11 nm and 100.77 nm, respectively.
Fig. 2. Snapshot of configurations of chromatin fiber at physiological ionic strength
3.2 Effect of Salt Concentrations
It is well known that the salt concentration of the solution has a great effect on the configuration of the chromatin fiber. In order to analyze the influence of the ionic strength on the systematic properties of the fiber, Monte Carlo simulations have been performed while decreasing the salt concentration from 0.15 M to 0.01 M. The diameter, linear mass density and persistence length of the chromatin fiber as functions of the salt concentration are shown in Fig. 3, Fig. 4 and Fig. 5, respectively. It can be seen from these three figures that the variations of the properties at low salt concentrations are much more pronounced than those at high salt concentrations. The linear mass density of the fiber decreases from 5.68 to 1.49 nucleosomes/11 nm, while the persistence length increases from 93 nm to 548 nm when lowering the ionic strength.
Fig. 3. The diameter of chromatin fiber as a function of the salt concentration
Fig. 4. The linear mass density of chromatin fiber as a function of the salt concentration
Fig. 5. The persistence length of chromatin fiber as a function of the salt concentration
4 Conclusions
Based on the model of the 30 nm chromatin fiber, computer simulations were performed to systematically study the compacted configurations and properties of the chromatin fiber. A clear zig-zag arrangement, in which adjacent nucleosomes are cross-linked, is obtained in our simulations. Moreover, the systematic properties show a strong dependence on the salt concentration. The results also suggest that the linear mass density and the persistence length of the chromatin fiber vary in opposite directions as functions of the ionic strength. A more detailed model, which includes the dependence of the H1 protein and the histones on the salt concentration, is necessary for further studies of the chromatin fiber.
Acknowledgments. This work is supported by the National Natural Science Foundation of China (Grant No. 30770501) and the Chinese National Programs for High Technology Research and Development (Grant No. 2006AA04Z305).
References 1. Holde, K.v., Zlatanova, J.: What determines the folding of the chromatin fiber? J. Proc. Natl. Acad. Sci. USA 93, 10548–10555 (1996) 2. Horowitz, R.A., et al.: Chromatin conformation and salt-induced compaction: threedimensional structural information from cryoelectron microscopy. J. Cell Biol. 131, 1365– 1376 (1994) 3. Schiessel, H., Gelbart, W.M., Bruinsma, R.: DNA folding: Structural and mechanical properties of the two-angle model for chromatin. J. Biophys. 80, 1940–1956 (2001) 4. Woodcock, C.L., Dimitrov, S.: Higher-order structure of chromatin and chromosomes. J. Curr. Opin. Genet. Dev. 11(2), 130–135 (2001) 5. Woodcock, C.L., et al.: A chromation folding model that incorporates linker variability generates fibers resembling the native structures. J. Proc. Natl. Acad. Sci. USA 90, 9021– 9025 (1993)
6. Mergell, B., Everaers, R., Schiessel, H.: Nucleosome interactions in chromatin: Fiber stiffening and hairpin formation. J. Phys. Rev. E 70, 11915 (2004) 7. Sun, J., Zhang, Q., Schlick, T.: Electrostatic mechanism of nucleosomal array folding revealed by computer simulation. J. Proc. Natl. Acad. Sci. USA 102, 8180–8185 (2005) 8. Finch, J.T., Klug, A.: Solenoidal model for superstructure in chromatin. J. Proc. Natl. Acad. Sci. USA 73, 1897–1901 (1976) 9. Yao, J., Lowary, P.T., Widom, J.: Direct detection of linker DNA bending in definedlength oligomers of chromatin. J. Proc. Natl. Acad. Sci. USA 87, 7603–7607 (1990) 10. Yao, J., Lowary, P.T., Widom, J.: Linker DNA bending induced by the core histones of chromatin. J. Biochemistry 30(34), 8408–8414 (1991) 11. Marion, C., et al.: Conformation of chromatin oligomers. A new argument for a change with the hexanucleosome. J. Eur. J. Biochem. 120, 169–176 (1981) 12. Bednar, J., et al.: Nucleosomes, linker DNA, and linker histone form a unique structural motif that directs the higher-order folding and compaction of chromatin. J. Proc. Natl. Acad. Sci. USA 95, 14173–14178 (1998) 13. Zlatanova, J., Leuba, S.H., Holde, K.v.: Chromatin fiber structure: morphology, molecular determinants, structural transitions. J. Biophys. 74, 2554–2566 (1998) 14. Leuba, S.H., et al.: Three-dimensional structure of extended chromatin fibers as revealed by tapping-mode scanning force microscopy. J. Proc. Natl. Acad. Sci. USA 91, 11621– 11625 (1994) 15. Wedemann, G., Langowski, J.: Computer simulation of the 30-nanometer chromatin fiber. J. Biophys. 82(6), 2847–2859 (2002) 16. Aumann, F., et al.: Monte Carlo simulation of chromatin stretching. J. Phys. Rev. E. 73, 041927 (2006)
Electrical Remolding and Mechanical Changes in Heart Failure: A Model Study
Yunliang Zang and Ling Xia
Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027, China
[email protected]
Abstract. We have developed a canine cardiac cellular electromechanics model to simulate the electrophysiological remodeling of heart failure (HF) and to predict cardiomyocyte contractility after HF. INa,L is integrated into this model to study its role in the prolongation of the action potential (AP) in control and HF conditions, which was not well established in the past. It may contribute substantially to AP prolongation in control and even more in HF. Ionic remodeling after HF is modeled by downregulation of Ito1, IK1, IKs and SR pump function and upregulation of Na+-Ca2+ exchange (NCX) and INa,L. The HF model successfully reproduces the prolonged AP, reduced ICaL, enhanced INaCa and blunted Ca2+ transient. With the computed Ca2+ as the input to the myofilament model, the myofilament forces are determined. Compared with control, reduced amplitude, increased latency to the onset of contraction, increased time to peak (TTP) and attenuated cell shortening are found in the HF model. The model could also be embedded into a tissue electromechanics model to simulate the altered activation sequence and mechanical function.
1 Introduction
Heart failure (HF) is a primary cardiac disease characterized by impaired contractility and reduced cardiac output. There have been many studies of the electrophysiological changes in HF [1-6]. As described in those papers, the action potential (AP) duration (APD) is consistently recorded as prolonged in experimental observations. The prolongation of the AP facilitates the development of early afterdepolarizations (EADs), which frequently occur with abnormal repolarization during phase 2 or phase 3 [1, 2]. The role of K+ remodeling in AP prolongation and EAD formation has been studied extensively. The Ca2+-independent Ito1 is found to be decreased in all studies and all tissues except the sinus node [3, 7, 8]. In ventricular myocytes, some studies show a decrease in IK1 [1-3, 7, 8], but two other studies [4, 5] do not. IKr does not change in most studies except in the study of Tsuji et al. [5]. As for IKs, it is decreased in ventricular, atrial, and sinoatrial node cells [1-5, 7, 8]. Na+-Ca2+ exchange (NCX) is reported to be enhanced [3, 6] and the SR pump function down-regulated [3]. The Na channel generates a large inward current (INa) supporting rapid depolarization and responsible for action potential propagation. Most of the current inactivates rapidly, but a
small fraction remains as the late INa (INa,L) [9]. However, a possible role of the cardiac Na channel in AP prolongation in HF is less well established. Initial reports showed no changes in peak INa [3], and Maltsev et al. later showed a decrease [10]. An increase of INa,L in HF was first found in hamsters [11]. In 1998, Undrovinas et al. showed that tetrodotoxin block of INa,L shortened APs in HF myocytes and eliminated EADs [12]. Following electrical remodeling, myocardial force is decreased and many other stress characteristics are changed. Nonetheless, few experimental studies have explored the impaired contractility after HF because of the great challenge of monitoring myocardial function. The mechanism linking electrical remodeling and disturbed mechanical performance is still not clear. In this study, we develop a cardiac cellular electromechanics model based on the excitation-contraction coupling (ECC) model of Greenstein [13] and the myofilament force model of Rice [14]. Specifically, we modify the Greenstein model by incorporating the INa,L current, which may be very important in the HF model. We utilize the model to simulate the electrophysiological changes after HF. After demonstrating the ability of this model to simulate quantitative electrophysiological features, we predict the generated tension after HF and discuss the link between altered expressions of ionic currents and Ca handling proteins and the impaired contractile force.
2 Methods
2.1 Electrophysiological Model
The electrophysiological model of this study is mainly modified from the Greenstein ECC model. The original model consists of 76 ODEs, of which 40 represent the intracellular Ca2+ states, while the others represent gating kinetics, ion transfer relationships and the corresponding concentrations. As stated above, INa,L may contribute to AP prolongation and the generation of arrhythmia in HF. We incorporate the slowly inactivating late sodium current INa,L and add 2 state variables representing the activation gate and the slow inactivation gate of INa,L, respectively. Its formulation is taken from the original Luo-Rudy dynamic (LRd) fast sodium current INa [15], and the voltage dependence and kinetics of inactivation are based on data from Valdivia et al. [9]. Based on previous studies [1, 3, 9], for IK1, IKV43, INa,L and IKs only the number of expressed channels is changed rather than the kinetics and gating behavior, so we scale their maximum conductances by the factors chfsc_IK1, chfsc_IKV43, chfsc_INa,L and chfsc_IKs, respectively. The scaling factors chfsc_IK1 and chfsc_IKV43 are derived from the canine tachycardia-induced heart failure model of Kääb et al. [3]. The factor chfsc_IKs is taken from midmyocardial myocytes under HF in Li's experiment [1], and chfsc_INa,L is quantified from Valdivia et al. [9]. Xiong et al. reported differential expression of NCX in normal and failing hearts [6]; we model the upregulation of NCX by a scale factor chfsc_INaCa on the basis of that study. The above scaling factors are found by varying the factors and simulating the corresponding peak current-voltage (I-V) relationship under HF. The optimal scaling
factors are determined by minimizing the squared error between the simulated and experimental peak I-V relationships using the Simplex algorithm:
\[ MSE = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{X}_i)^2 \tag{1} \]
where MSE is the mean squared error, n is the number of data points in the experimental and simulated data sets, \(X_i\) is the simulated value and \(\hat{X}_i\) the experimental value at the i-th point. Downregulation of the SR function is modeled by scaling the forward maximum pump rate by a scale factor, chfsc_JUP. The optimal scaling factors are listed in the Appendix.
2.2 Tension Development Model
The development of models for cardiac cellular mechanics has lagged behind models of cardiac cellular electrophysiology because of debates about the mechanism of actin-myosin interactions and their role in thin filament activation. The recent myofilament model of Rice et al. has addressed these deficiencies to some degree and reproduced some experimentally observed phenomena of cardiac mechanics [14]. In the Rice model, Ca2+ binding to troponin is artificially separated into regulatory Ca binding and apparent Ca binding to overcome the deleterious effects caused by global feedback of the generated force on Ca binding. The model reproduces realistic Ca sensitivity, with F-Ca relations similar to true Hill functions. The regulatory Ca binding and apparent Ca binding can be expressed mathematically as follows:
\[ Trop_{Regulatory}(x) = (1 - SOVF_{thin}(x)) \cdot TropCa_L + SOVF_{thin}(x) \cdot TropCa_H \tag{2} \]
\[ Trop_{Apparent}(x) = (1 - SOVF_{thin}(x)) \cdot Trop_L + SOVF_{thin}(x) \cdot \bigl[ Fract_{SBXB} \cdot Trop_H + (1 - Fract_{SBXB}) \cdot Trop_L \bigr] \tag{3} \]
where \(Trop_{Regulatory}(x)\) is the fraction of thin filament regulatory units (RUs) that have regulatory Ca binding; \(x\) is the sarcomere length; \(SOVF_{thin}(x)\) is the single-overlap function of the thin filament; \(TropCa_L\) and \(TropCa_H\) are the fractions of Ca bound to the low- and high-affinity regulatory sites of troponin. The normalized active force is defined as
\[ F_{active}(x) = SOVF_{thick}(x) \cdot \frac{x_{XBPreR} \cdot XB_{PreR} + x_{XBPostR} \cdot XB_{PostR}}{x_0 \cdot XB_{PostR}^{Max}} \tag{4} \]
where \(x\) is the sarcomere length; \(SOVF_{thick}(x)\) is a scaling factor for the contribution of sarcomere geometry to the number of recruitable crossbridges; \(x_{XBPreR}\) and \(XB_{PreR}\) are
the mean distortion and occupancy of state \(XB_{PreR}\), respectively; \(x_{XBPostR}\) and \(XB_{PostR}\) are the mean distortion and occupancy of state \(XB_{PostR}\), respectively; and \(XB_{PostR}^{Max}\) is the occupancy of state \(XB_{PostR}\) under optimal conditions. In this study we modify the original rat myofilament stress model into a canine model. The adjustments to the myofilament model parameters are listed in the Appendix.
2.3 Computational Methods
The coupled cardiac cellular electromechanics model consists of 87 ODEs, of which 78 belong to the ECC model and 11 to the myofilament mechanics model, with 2 in common. The simulation in this study is implemented and executed in MATLAB using the ode23t integrator with a maximum step size of 0.1 ms. APD is computed at 50% and 90% repolarization, i.e., APD50 and APD90, respectively. The adjusted parameters for the HF model are obtained by the Simplex algorithm.
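As a hedged illustration of the parameter fitting described above, the sketch below minimizes the MSE of Eq. (1) between a simulated and an experimental peak I-V curve with the Nelder-Mead (Simplex) method; `simulate_peak_iv` and `iv_exp` are hypothetical stand-ins for the cell-model wrapper and the experimental data, neither of which is part of this paper.

```python
import numpy as np
from scipy.optimize import minimize

def mse(simulated, experimental):
    """Eq. (1): mean squared error between simulated and experimental peak I-V points."""
    return float(np.mean((np.asarray(simulated) - np.asarray(experimental)) ** 2))

def fit_scaling_factors(simulate_peak_iv, iv_exp, x0):
    """Find the chfsc_* factors that minimize the MSE with the Nelder-Mead (Simplex) method."""
    result = minimize(lambda factors: mse(simulate_peak_iv(factors), iv_exp),
                      x0, method="Nelder-Mead")
    return result.x

# Illustrative call (both arguments are hypothetical):
# factors = fit_scaling_factors(simulate_peak_iv, iv_exp, x0=[1.0, 1.0, 1.0, 1.0])
```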
3 Results
3.1 The Role of INa,L on APD
We make some adjustments to the Greenstein ECC model so that it can better serve as a tool to study the mechanisms underlying electrical remodeling and mechanical changes in HF. Figure 1A demonstrates the ability of this model to reconstruct AP morphology, duration and the rate dependence of the AP. The resulting action potentials show a significant spike-and-dome configuration, in agreement with experimental observations [16]. Kiyosue et al. found an increase (nearly 10%) in APD in guinea pig ventricular myocytes due to the effect of INa,L [17]. Compared with the original model, the APD is prolonged evidently at 0.5 Hz and 1 Hz; however, the trend toward prolongation is not obvious at 2 Hz. The results here are similar to the experimental observations of Kiyosue [17]. Figure 1B illustrates the model-computed rate-dependent properties of APD90 and APD50 in control. The APD90 of the model is compared with the values measured by Liu and Antzelevitch [16], and the predicted values lie within the standard deviation of the experimentally reported means. For lack of direct and reliable data, we simply plot the APD50 computed by the model in the figure. Figure 1C shows the computed INa,L at -20 mV, elicited by a series of depolarizing voltage steps from -140 mV to voltages ranging from -60 mV to 0 mV. The pulse duration is 700 ms (the voltage protocol is shown in the inset). In Figure 1D, the predicted steady-state I-V relationship agrees well with the experimental measurements of Valdivia et al. [9] over the range from -30 mV to 0 mV. However, the model does not successfully reproduce the steady-state I-V relationship below -30 mV. INa,L mainly plays an important role during the plateau phase of the AP, so this defect has little effect on the simulation.
Fig. 1. A: AP morphology, duration and rate dependence of the AP computed from the original model (dotted line) and the modified model (solid line). B: Model-computed APD90 versus the experimental measurements of [16], and computed APD50. C: Voltage-clamp protocol and INa,L currents at -20 mV. D: Model predictions of the steady-state I-V relationship (the inset is from the experimental results of Valdivia et al. [9]).
The model successfully reproduces action potentials, Ca2+ transients and ECC characteristics (results not shown in this paper) in control, so we use it to model HF and to predict the mechanical changes after HF.
3.2 Electrical and Mechanical Changes After HF
By adjusting the model parameters, we develop the canine HF cellular electromechanics model. In Figure 2A, a normal AP is shown by the blue solid line, whose duration is about 300 ms; the dashed line shows an AP with the density of IKV43 reduced by 84%. Obviously, downregulation of IKV43 shortens the APD and diminishes the notch of phase 1. The dot-dashed line shows an AP with both IKV43 (the same as above) and IK1 (35%) downregulated. The reduced amplitude of IK1 by itself would have prolonged the APD; however, it can only partly compensate the shortening caused by IKV43, and the net effect is still a small reduction in APD. Downregulation of the SR pump function (85%) plays an important role in the prolongation of the APD in the HF model, as shown by the dotted line. As the SR Ca2+ ATPase is less expressed, less Ca2+ is pumped into the SR and the cytosolic Ca2+ transient increases, which supports a higher and longer AP plateau. Upregulation of NCX (76%) and downregulation of IKs (49%) have little effect on the APD compared with the other factors. INa,L (6.7 times) is significantly increased in HF and in turn contributes greatly to AP prolongation. The red dotted line corresponds to the HF model with the reduction of IKV43, IK1, IKs and SR pump function and the enhancement of NCX and INa,L. The AP of the final HF model is very similar to the experimental result of Kääb et al. [3]. Figure 2B shows the control (solid line) and HF (dotted line) Ca2+
transients. The Ca2+ transients of the HF model have a reduced amplitude, slower rise, slower decline and longer duration. The altered properties are qualitatively like those in Kääb et al. [3]. Ca2+ is the crucial link between electrophysiology and mechanics, and its change can have a great influence on mechanical function. ICaL in control and HF is shown in Figure 2C. The reduction of ICaL is mainly caused by the reduction of Ito1, which was analyzed before by Greenstein [13]. Figure 2D shows INaCa in control and HF. We find that NCX operates in reverse mode during most of the AP plateau phase in both control and HF. Its effect on APD prolongation is smaller than that of ICaL because of its relatively lower amplitude.
Fig. 2. Electrophysiological remodeling after HF. A: Simulated APs for control (blue solid line); IKV43 downregulation (dashed line); both IKV43 and IK1 downregulation (dot-dashed line); IKV43, IK1 and SR pump function downregulation (dotted line) and the HF (downregulation of Ito1, IK1, IKs, SR pump function and upregulation of NCX, INa,L) (red dotted line). B: Simulated Ca2+ transients for control (solid line) and HF (dotted line). C: Simulated ICaL in control (solid line) and HF (dotted line). D: Simulated INaCa in control (solid line) and HF (dotted line).
From the coupled cellular electromechanics model, the myofilament mechanics can be determined from the Ca2+ transients in control or HF. Being the input of the myofilament model, the amplitude and duration of the cytosolic Ca2+ transients are important determinants of the velocity, amplitude and duration of cardiomyocyte contraction. We adjust some parameters to model canine myofilament force, accounting for the divergence of species. To compare the mechanical characteristics of control and HF cardiomyocytes, we perform a simple test simulating isometric contraction, in which the sarcomere length remains constant during the twitch, as shown in Figure 3A. Intuitively, we find an increased latency to the onset of contraction and a reduced amplitude in the HF cardiomyocyte compared with control. The latency to the onset of contraction is about 32 ms in control, which matches well with the 29±3.4 ms measured by Cordeiro et al. [18]. The latency is 150 ms in HF. It may be concluded that the delay between activation and contraction (EM delay) is larger in HF than in control. The
variation of the EM delay could be a principal factor in inducing arrhythmia. Cordeiro et al. measured values of nearly 172±22 ms and 312±19.4 ms for the time to peak (TTP) and the relaxation time (peak to 90% relaxation) in control, respectively [18]. Our computed corresponding values are 173 ms and 380 ms, respectively, which are very close to the experimental observations. The TTP increases to a value of nearly 350 ms in HF, and the duration of the force also increases relative to control. Figure 3B shows cell shortening twitches. The sarcomere is initially fixed at the rest length of 1.9 μm, then shortens to a smaller length and finally returns to the rest length, as shown in the figure. By comparing control (solid line) and HF (dotted line), besides the results obtained from Figure 3A, we also find an attenuated magnitude of shortening in HF.
Fig. 3. Cardiomyocyte contractility in control and HF. A: Isometric twitch force in control (solid line) and HF (dotted line). The inset shows the force transients renormalized in each case. B: Unloaded cell shortening twitches in control (solid line) and HF (dotted line).
4 Discussion
In this paper, we developed a cardiac cellular electromechanics model based on a modified Greenstein ECC model and the Rice et al. myofilament model. We demonstrated the effectiveness of the model, simulated the electrophysiological properties and predicted cardiomyocyte contractility after HF. The role of INa,L in control and HF is studied with this model. It causes a moderate increase in the APD in control. There is a significant increase of INa,L in HF, and it in turn contributes much to the prolongation of the AP. The formulation of INa,L is derived from the LRd fast sodium current. There is a small defect in that it cannot completely match the experimental measurements of Valdivia et al. [9], although this hardly affects the simulations here. The electrophysiological remodeling of channels in HF is modeled and their role in AP prolongation is studied. IKV43 is reduced in HF and this causes the loss of the notch, a shortened AP, blunted Ca2+ transients and a decreased efficiency of ECC [13]. The IK1 amplitude is substantially reduced in HF cells and, as an outward current, its reduction contributes to the APD prolongation to some degree. Reduced expression of the SR Ca2+ ATPase leads to a significant increase in APD and a heightened plateau. The effects of NCX upregulation and IKs downregulation on APD are also incorporated.
Although we simulate canine myofilament contractility with this model, it still needs some improvements. Because of the slow Na+ and K+ kinetics, it takes several minutes to approach the steady state. The original Rice myofilament model was initially developed for rat and rabbit. Although by adjusting parameters we can model the canine myofilament, there is still a problem to be resolved: the amplitude of the Ca2+ transient output by the ECC model is much lower than typically observed levels, so the number of Ca2+ release units has to be enlarged from 50000 to 75000, as done by Niederer et al. [19]. Electrophysiological and myofilament models based on reliable and consistent species data may therefore be needed in the future, if abundant experimental data become available. Our model could serve as an effective tool to optimize the treatment of HF. For example, with this model we find that block of INa,L can shorten the AP in HF and may have an effect in eliminating EADs. The model could also be embedded into a tissue electromechanics model to simulate the altered activation sequence and mechanical function in the future.
Acknowledgement. This work is supported in part by the 973 National Key Basic Research & Development Program (2007CB512100) of China.
References 1. Li, G.R., Lau, C.-P., Ducharme, A., Tardif, J.-C., Nattel, S.: Transmural action potential and ionic current remodeling in ventricles of failing canine hearts. Am. J. Physiol. Heart Circ. Physiol. 283(3), 1031–1041 (2002) 2. Janse, M.J.: Electrophysiological changes in heart failure and their relationship to arrhythmogenesis. Cardiovasc. Res. 61(2), 208–217 (2004) 3. Kaab, S., Nuss, H.B., Chiamvimonvat, N., O’Rourke, B., Pak, P.H., Kass, D.A., Marban, E., Tomaselli, G.F.: Ionic Mechanism of Action Potential Prolongation in Ventricular Myocytes From Dogs With Pacing-Induced Heart Failure. Circ. Res. 78(2), 262–273 (1996) 4. Rozanski, G.J., Xu, Z., Whitney, R.T., Murakami, H., Zucker, I.H.: Electrophysiology of rabbit ventricular myocytes following sustained rapid ventricular pacing. J. Mol. Cell. Cardiol. 29(2), 721–732 (1997) 5. Tsuji, Y., Opthof, T., Kamiya, K., Yasui, K., Liu, W., Lu, Z., Kodama, I.: Pacing-induced heart failure causes a reduction of delayed rectifier potassium currents along with decreases in calcium and transient outward currents in rabbit ventricle. Cardiovasc. Res. 48(2), 300–309 (2000) 6. Xiong, W., Tian, Y., DiSilvestre, D., Tomaselli, G.F.: Transmural Heterogeneity of Na + Ca2 + Exchange: Evidence for Differential Expression in Normal and Failing Hearts. Circ. Res. 97(3), 207–209 (2005) 7. Beuckelmann, D., Nabauer, M., Erdmann, E.: Alterations of K + currents in isolated human ventricular myocytes from patients with terminal heart failure. Circ. Res. 73(2), 379–385 (1993) 8. Zicha, S., Xiao, L., Stafford, S., Cha, T.J., Han, W., Varro, A., Nattel, S.: Transmural expression of transient outward potassium current subunits in normal and failing canine and human hearts. J. Physiol. 561(Pt. 3), 735–748 (2004) 9. Valdivia, C.R., Chu, W.W., Pu, J., Foell, J.D., Haworth, R.A., Wolff, M.R., Kamp, T.J., Makielski, J.C.: Increased late sodium current in myocytes from a canine heart failure model and from failing human heart. J. Mol. Cell. Cardiol. 38(3), 475–483 (2005)
10. Maltsev, V.A., Sabbab, H.N., Undrovinas, A.I.: Down-regulation of sodium current in chronic heart failure: effect of long-term therapy with carvedilol. Cell. Mol. Life Sci. 59(9), 1561–1568 (2002) 11. Jacques, D., Bkaily, G., Jasmin, G., Menard, D., Proschek, L.: Early fetal like slow Na + current in heart cells of cardiomyopathic hamster. Mol. Cell. Biochem. 176(1-2), 249–256 (1997) 12. Undrovinas, A.I., Maltsev, V.A., Kyle, J.W., Silverman, N., Sabbah, H.N.: Gating of the late Na+ channel in normal and failing human myocardium. J. Mol. Cell. Cardiol. 34(11), 1477–1489 (2002) 13. Greenstein, J.L., Hinch, R., Winslow, R.L.: Mechanisms of excitation-contraction coupling in an integrative model of the cardiac ventricular myocyte. Biophys. J. 90(1), 77–91 (2006) 14. Rice, J.J., Wang, F., Bers, D.M., de Tombe, P.P.: Approximate model of cooperative activation and crossbridge cycling in cardiac muscle using ordinary differential equations. Biophys. J. 95(5), 2368–2390 (2008) 15. Luo, C.H., Rudy, Y.: A dynamic model of the cardiac ventricular action potential. I. Simulations of ionic currents and concentration changes. Circ. Res. 74(6), 1071–1096 (1994) 16. Liu, D.W., Antzelevitch, C.: Characteristics of the delayed rectifier current (IKr and IKs) in canine ventricular epicardial, midmyocardial, and endocardial myocytes. A weaker IKs contributes to the longer action potential of the M cell. Circ. Res. 76(3), 351–365 (1995) 17. Kiyosue, T., Arita, M.: Late sodium current and its contribution to action potential configuration in guinea pig ventricular myocytes. Circ. Res. 64(2), 389–397 (1989) 18. Cordeiro, J.M., Greene, L., Heilmann, C., Antzelevitch, D., Antzelevitch, C.: Transmural heterogeneity of calcium activity and mechanical function in the canine left ventricle. Am. J. Physiol. Heart Circ. Physiol. 286(4), H1471–H1479 (2004) 19. Niederer, S.A., Smith, N.P.: A mathematical model of the slow force response to stretch in rat ventricular myocytes. Biophys. J. 92(11), 4030–4044 (2007)
Appendix

Heart failure model parameters:
  Current   Parameter   Modification (HF)
  IKV43     GKV43       chfsc_IKV43 = 0.16
  IK1       GK1         chfsc_IK1 = 0.65
  Jup       Vmaxf       chfsc_JUP = 0.15
  INaCa     KNaCa       chfsc_INaCa = 1.76
  INa,L     GNa,L       chfsc_INa,L = 7.7
  IKs       GKs         chfsc_IKs = 0.51

Modification of myofilament model parameters:
  Parameter                                  Original   Modified
  NCaRU (number of Ca release units)         50000      75000
  xbmodsp (species-dependent coefficient)    1          0.2
  Mass                                       5e-5       2e-5
Modeling Conformation of Protein Loops by Bayesian Network
Peng Yang, Qiang Lü, Lingyun Yang, and Jinzhen Wu
School of Computer Science and Technology, Soochow University
Jiangsu Provincial Key Lab for Information Processing Technologies, Suzhou, 215006, China
[email protected]
Abstract. Modeling protein loops is important for understanding the characteristics and functions of proteins, but remains an unsolved problem in computational biology. By employing a general Bayesian network, this paper constructs a fully probabilistic continuous model of protein loops, referred to as LoopBN. Direct influences between amino acids and backbone torsion angles can be learned under the framework of LoopBN. The continuous torsion angle pairs of the loops are captured by the bivariate von Mises distribution. Empirical tests are conducted to evaluate the performance of LoopBN based on 8 free modeling targets of CASP8. The experimental results show that LoopBN not only performs better than the state-of-the-art modeling method on the quality of the loop sample set, but also helps de novo prediction of protein structure by providing a better sample set for loop refinement. Keywords: Modeling protein loop, Bayesian network, bivariate von Mises distribution.
1 Introduction
Modeling protein loops is a subproblem of protein structure prediction, which is one of the greatest challenges in computational biology and bioinformatics. The loop is a main type of protein secondary structure [1]. Compared with the conserved helices and sheets, it exhibits a high degree of flexibility, which makes accurately predicting the conformation of a loop a hard problem in itself. Furthermore, protein loops play a critical role in the characteristics and functions of proteins by providing active and binding sites for the docking of other molecules [2]. Modeling loop conformations is hence important but remains an unsolved problem in the field [3,4]. Among recent approaches to the problem of protein structure prediction, Rosetta [5,6] and I-TASSER [7,8,9] are two well-known methods with
Supported by the National Science Foundation of China under grant number 60970055. Corresponding author.
excellent performances in the historical Critical Assessment of Techniques for Protein Structure Prediction (CASP). Rosetta developed a fragment-assembly-based method which combinatorially assembles fragments collected from existing PDB structures [10]. I-TASSER proposed a good refinement method together with threaded templates. Much effort has also gone into modeling the loop conformation [2,11]. TorusDBN is one of the most successful approaches [12]. In a previous study, we applied TorusDBN to model only protein loops [13]. Inspired by the success of that improvement, in the present study we develop a general Bayesian network (BN), instead of a DBN, to capture the properties of protein loops. We call our model LoopBN, which extends the first feature of TorusDBN by generalizing the DBN model to a BN model while using the same bivariate von Mises distribution to model the backbone angles, and thus obtains better performance on modeling protein loops. In TorusDBN, the influence of the amino acid sequence on the conformation is mediated by a hidden node sequence, whereas LoopBN allows the amino acid sequence to directly determine the details of the tertiary structure of protein loops. Figure 1 (a) and (c) depict the difference in representation ability between the DBN and the BN. Notice that TorusDBN is aimed at modeling much longer sequences of varied length, for which a DBN is the natural choice. LoopBN is for modeling only loop sequences; after applying some restrictions, we can choose a BN as the modeling tool. Since we are only interested in modeling loop structures with LoopBN, the complication of unknown sequence length can be removed by fixing the loop length. Although the length of protein loops varies from 1 to 30 residues, most protein loops have fewer than 12 residues [14], so we set the loop length to 20 residues in this paper. If the real loop is longer than 20 residues, we drop the extra residues at both ends; if it is shorter than 20 residues, we expand the window equally at the two ends (a small sketch of this windowing rule is given below). LoopBN thus uses 20-residue segments to contain most protein loops. It is natural to expect that such a LoopBN can accurately capture the properties of loop conformations and produce a more useful sample set for predicting loop conformations. We conduct two tests to evaluate the performance of LoopBN. The experiments show positive results in favor of LoopBN.
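A small sketch of the fixed-length windowing rule just described is shown below; the exact placement when a window cannot be centred symmetrically is an assumption of this illustration, not a detail given in the paper.

```python
def loop_window(sequence, loop_start, loop_end, width=20):
    """Return a window of `width` residues around the loop [loop_start, loop_end).
    Long loops are trimmed at both ends; short loops are expanded equally
    into the flanking residues."""
    loop_len = loop_end - loop_start
    if loop_len >= width:                       # drop the extra residues at both ends
        start = loop_start + (loop_len - width) // 2
    else:                                       # expand equally at the two ends
        start = max(0, loop_start - (width - loop_len) // 2)
    start = min(start, len(sequence) - width)   # keep the window inside the chain
    return sequence[start:start + width]

print(loop_window("A" * 15 + "LLLLL" + "A" * 15, 15, 20))  # a 20-residue segment
```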
Fig. 1. Different probabilistic models for modeling protein loops: (a) TorusDBN. (b) LoopBN: Initial BN G0 . (c) LoopBN: a snapshot of the final learned BN. The complete graph is described in Table 1.
2 Materials and Methods
2.1 Nodes of LoopBN
The backbone of a protein loop can be represented by a sequence of dihedral angles (φ, ψ, ω), which are well known from the Ramachandran plot [15]. The ω dihedral angle can be assumed to be fixed at 180° (trans) or 0° (cis). As is usual, we ignore the effect of ω in LoopBN, so the loop structure can be represented by a sequence of dihedral angle pairs (φ, ψ). In LoopBN, a residue within a loop is described by an amino acid (AA) node, a secondary structure (SS) node and a dihedral angle (TA) node. To capture the angular preferences of protein backbones, we turn to the field of directional statistics for a dihedral angle distribution with Gaussian-like properties that allows efficient sampling and parameter estimation. From the family of bivariate von Mises distributions, we choose the cosine variant, which is suitable for this purpose [16]. The density function is given by
\[ f(\phi, \psi) = c(\kappa_1, \kappa_2, \kappa_3) \exp\bigl(\kappa_1 \cos(\phi - \mu) + \kappa_2 \cos(\psi - \nu) - \kappa_3 \cos(\phi - \mu - \psi + \nu)\bigr) \tag{1} \]
The distribution has five parameters: μ and ν are the respective means for φ and ψ, κ1 and κ2 their concentrations, and κ3 is related to their correlation.
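To make Eq. (1) concrete, the sketch below evaluates the cosine-variant density on a discretised torus and draws (φ, ψ) samples from it. The normalising constant c(κ1, κ2, κ3) is obtained numerically here, and all parameter values in the example call are illustrative assumptions.

```python
import numpy as np

def vonmises_cosine_density(phi, psi, mu, nu, k1, k2, k3):
    """Unnormalised cosine-variant bivariate von Mises density of Eq. (1)."""
    return np.exp(k1 * np.cos(phi - mu) + k2 * np.cos(psi - nu)
                  - k3 * np.cos(phi - mu - psi + nu))

def sample_angles(n, mu, nu, k1, k2, k3, bins=360):
    """Draw (phi, psi) pairs by discretising the torus and sampling the grid
    cells in proportion to the numerically normalised density."""
    grid = np.linspace(-np.pi, np.pi, bins, endpoint=False)
    phi, psi = np.meshgrid(grid, grid, indexing="ij")
    p = vonmises_cosine_density(phi, psi, mu, nu, k1, k2, k3).ravel()
    p /= p.sum()
    idx = np.random.choice(p.size, size=n, p=p)
    return phi.ravel()[idx], psi.ravel()[idx]

phis, psis = sample_angles(5, mu=-1.0, nu=2.5, k1=5.0, k2=5.0, k3=1.0)
```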
2.2 Training Set of LoopBN
We choose the SABmark 1.65 twilight [17] protein dataset, which was used as TorusDBN's training data, to train LoopBN. This dataset provides a set of structures with low sequence similarity for each different SCOP fold [18]. It covers 209 main SCOP folds and contains 1723 proteins. First, we extract the amino acid information, secondary structure information and dihedral angle pairs from the SABmark 1.65 twilight dataset, where the secondary structure is assigned using DSSP [19]. We adopt only three secondary structure labels: H (helix), E (sheet) and L (loop). Second, we extract the loop dataset satisfying the following requirements: (1) the amino acid sequence's length is 20, and (2) the subsequence with secondary structure label L is in the middle of the 20-residue amino acid sequence. This yields the training dataset for LoopBN, which contains 21838 loop segments.
2.3 Structure Learning and Sampling of LoopBN
We developed a simple but efficient hill climbing algorithm to learn the BN starting from an initial BN, as shown in Algorithm 1. The initial BN G0 is shown in Figure 1 (b). The score used to evaluate the LoopBN is the Bayesian Information Criterion (BIC) [20], shown in equation (2), which is a likelihood-based score that penalizes an excess of parameters and thereby avoids overfitting:
\[ \mathrm{BICScore(LoopBN)} = 2 \ln(L) - p \ln(n) \tag{2} \]
Algorithm 1. Hill climbing algorithm for learning LoopBN
Input: Initial BN G0; P0 is the preference probability for choosing AA or SS nodes when climbing.
Output: Optimized BN G
while not convergent do
  Stochastically choose TAi from the 20 dihedral angle nodes
  if the number of TAi's parents reaches the maximum restriction then
    Stochastically remove a parent node from TAi's parent nodes
  else
    if the generated random probability is less than P0 then
      Choose an amino acid node AAj as the candidate node Node
    else
      Choose a secondary structure node SSj as the candidate node Node
    end if
    if Node is already one of TAi's parent nodes then
      Remove the parental relationship between TAi and Node
    else
      Add the parental relationship Node → TAi
    end if
  end if
  if the score of the updated BN is not better than the previous one then
    Cancel all changes made above
  end if
end while
Output the current BN G as the final LoopBN
where L is the maximum likelihood of the LoopBN, p is the number of parameters of the LoopBN, and n is the number of records in the training dataset.
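A compact sketch of the search loop of Algorithm 1, scored with the BIC of Eq. (2), is given below. `log_likelihood` and `num_parameters` are hypothetical helpers that would fit and count the conditional models of each TA node; `score` would typically be a closure such as `lambda net: bic_score(net, data, log_likelihood, num_parameters)`.

```python
import math
import random

def bic_score(network, data, log_likelihood, num_parameters):
    """Eq. (2): BIC = 2 ln(L) - p ln(n)."""
    return 2.0 * log_likelihood(network, data) - num_parameters(network) * math.log(len(data))

def hill_climb(network, candidates, score, p0=0.7, max_parents=3, patience=20000):
    """Randomly toggle one parent edge of a random TA node; keep the change only
    if the score improves, and stop after `patience` consecutive failures."""
    best = score(network)             # network: dict mapping TA node -> set of parents
    fails = 0
    while fails < patience:
        ta = random.choice(list(network))
        if len(network[ta]) >= max_parents:
            op, node = "remove", random.choice(sorted(network[ta]))
        else:
            pool = candidates["AA"] if random.random() < p0 else candidates["SS"]
            node = random.choice(pool)
            op = "remove" if node in network[ta] else "add"
        network[ta].add(node) if op == "add" else network[ta].discard(node)
        new = score(network)
        if new > best:
            best, fails = new, 0      # accept the change
        else:                         # undo the change and count a failure
            network[ta].discard(node) if op == "add" else network[ta].add(node)
            fails += 1
    return network
```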
3 Results and Discussion
All the results presented in this section were obtained on an IBM P550 with a 4-way dual-core PowerPC processor at 1.55 GHz and 8 GB of RAM, running 64-bit SuSE Linux.
3.1 The Final Trained LoopBN
To train LoopBN, we set the convergence condition as follows: if no improvement in the BIC score of the LoopBN occurs after twenty thousand consecutive tries, LoopBN is considered convergent. We set P0 = 0.7 to bias the selection toward AA nodes as potential parental nodes of the TA nodes. The maximum number of parents is set to 3. After learning from the training dataset with the hill climbing algorithm, the final BN structure of LoopBN is as in Figure 1 (c); the details are given in Table 1.
Table 1. The final structure of LoopBN: TA nodes with their parent nodes

  TA node   Parents               TA node   Parents
  TA1       AA1, SS1, SS2         TA11      AA11, SS11, SS12
  TA2       AA2, SS1, SS3         TA12      SS11, SS12, SS14
  TA3       AA3, SS1, SS4         TA13      AA13, SS11, SS15
  TA4       AA4, SS2, SS8         TA14      SS9, SS11, SS15
  TA5       AA5, SS4, SS6         TA15      AA15, SS12, SS17
  TA6       SS6, SS7, SS8         TA16      AA16, SS15, SS16
  TA7       AA7, SS2, SS9         TA17      AA17, SS16, SS20
  TA8       AA8, SS2, SS7         TA18      AA18, SS18, SS19
  TA9       AA9, SS5, SS8         TA19      SS18, SS19, SS20
  TA10      AA10, SS3             TA20      AA20, SS19, SS20
Compared with the initial BN G0, only the parents of the TA nodes have changed, so Table 1 only lists the final parents of each TA node. Combining Figure 1 (b) and Table 1 gives the final structure of LoopBN. The BIC score of this final trained LoopBN is approximately -132.
3.2 Design of Evaluation Tests
The test cases are selected from the free modeling (FM) targets of the latest CASP8: T0397-D1, T0405-D1, T0405-D2, T0416-D2, T0443-D1, T0476, T0496-D1, T0460 and T0513-D2. These cases are considered the freshest target instances for de novo prediction. The main reason why we choose these test instances is that their sequences are far enough from the sources on which LoopBN and TorusDBN were trained. The native backbones used in the evaluation are obtained from the PDB online site. As LoopBN focuses on modeling loops, all loop segments are extracted from the test cases, in accordance with the previous procedure for preparing the training data. Finally, we get a total of 49 loop segments as the test set for the following comparison.
3.3 Loop Distribution Test
We use the above-mentioned 49 loops to sample 200 records for each loop, each consisting of 20 (φ, ψ) pairs, with TorusDBN and LoopBN, respectively. As a result we get two sample sets for the 49 loop conformations. We refer to Q1 as the sample set produced by LoopBN and Q2 as the sample set produced by TorusDBN. We use two different metrics to evaluate which sample set is closer to the native conformation, in the same way as TorusDBN did [12]. The first metric is the Kullback-Leibler (KL) divergence [21,22], a standard measure of the distance between two probability distributions. For discrete probability distributions P and Q, KL is defined as
\[ KL(P, Q) = \sum_i P(i) \ln \frac{P(i)}{Q(i)} \tag{3} \]
Fig. 2. KL distance between sampled and native distributions over the 49 test loops (LoopBN vs. TorusDBN)
Fig. 3. Angular deviations of φ and ψ over the 49 test loops (LoopBN vs. TorusDBN)
where i runs over all the statistical grid cells of P and Q. The KL divergence is a non-negative number; the lower the KL value, the smaller the gap between P and Q, and KL becomes zero if and only if the two distributions are equal. After φ and ψ are discretized into 1000 bins from -180° to 180°, all the points of Q1 and Q2 can be placed in the appropriate cells of a 1000 × 1000 grid. P represents the distribution of the native conformation. By applying equation (3), we compare the KL divergence between the Qs and P. Results of the comparison are shown in Figure 2. In Figure 2, we see that LoopBN performs better than TorusDBN on 23 loops, almost equally well (difference less than 0.1) on 13 loops, and worse on only 13 loops. The second metric, in analogy to the root mean square deviation measure, is the angular deviation D between two vectors of angles x1 and x2:
\[ D(x_1, x_2) = \sqrt{ \frac{1}{n} \sum_i \bigl( \min(|x_{2i} - x_{1i}|,\ 2\pi - |x_{2i} - x_{1i}|) \bigr)^2 } \tag{4} \]
where the angles are measured in radians. For the sample sets Q1 and Q2, we calculate the D values of the angular deviation and show the results in Figure 3. From Figure 3 we see that there are 43 loops for which the angular deviation of LoopBN is better than that of TorusDBN in terms of the dihedral angle φ, 27 loops for which it is better in terms of ψ, and 25 loops for which LoopBN performs better in terms of both φ and ψ. Overall, LoopBN samples more accurate loop conformations than TorusDBN under both quality metrics.
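The two metrics can be computed as in the following sketch; the epsilon smoothing of empty grid cells (to keep the KL divergence finite) is an assumption of this illustration rather than a detail given in the paper.

```python
import numpy as np

def kl_divergence(native, sampled, bins=1000, eps=1e-12):
    """Eq. (3): KL(P, Q) on a bins x bins discretisation of the (phi, psi) plane.
    `native` and `sampled` are (phi, psi) arrays in radians."""
    edges = np.linspace(-np.pi, np.pi, bins + 1)
    P, _, _ = np.histogram2d(native[0], native[1], bins=[edges, edges])
    Q, _, _ = np.histogram2d(sampled[0], sampled[1], bins=[edges, edges])
    P = P / P.sum() + eps
    Q = Q / Q.sum() + eps
    return float(np.sum(P * np.log(P / Q)))

def angular_deviation(x1, x2):
    """Eq. (4): RMS deviation of two angle vectors, wrapping differences on the circle."""
    d = np.abs(np.asarray(x2) - np.asarray(x1))
    d = np.minimum(d, 2 * np.pi - d)
    return float(np.sqrt(np.mean(d ** 2)))
```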
3.4 De Novo Prediction of Loop Structure
In this test, we evaluate how LoopBN helps improve the de novo prediction of protein structure. We use pacBackbone [23] as the platform for de novo prediction of protein structure. pacBackbone uses a parallel ant colony optimization algorithm in which pheromone is shared among parallel colonies [24]. pacBackbone uses 1-mer fragments from Rosetta's fragment library to refine the loops after the
low-resolution backbone is produced. In order to test the performance of the loop structures sampled by LoopBN, we replace the 1-mer fragments with the sample set produced by LoopBN for pacBackbone's loop refinement. The modified pacBackbone program generates 800 decoys, which are evaluated by the GDT_TS criterion [25]. Figure 4 shows the typical improvements for the predictions of the T0397-D1, T0416-D2, T0476 and T0460 targets.
Fig. 4. The detailed improvements of FM backbone predictions by refining loops with LoopBN sample sets (panels: T0397-D1, T0416-D2, T0476 and T0460; GDT-TS of 800 decoys, LoopBN vs. Rosetta 1-mer fragments)
Figure 4 gives the details of the improvements: the LoopBN samples help pacBackbone increase its prediction accuracy from 2nd place to 1st place on the CASP8 FM targets [23].
3.5 Discussion
LoopBN not only provides a joint probability distribution among the AA, SS and TA nodes from which loop TA values can be sampled, but also reveals the relationships between AAs and TAs from a biological and biochemical perspective. Some interesting issues can be seen in the learning result of LoopBN, for example, in Table 1:
1. Nodes AA11, SS11 and SS12 are learned to be the parents of node TA11. We find that the secondary structures, not only the amino acid type, are also able to control a residue's conformation. This is consistent with our usual knowledge about the loop backbone.
2. For TA10, it is understandable that AA10 is a parent, but a long-range interaction has also been found: SS3, an SS node far from TA10, is a parent of TA10. Further investigations are needed to validate such a long-range dependency.
3. For TA12, TA14 and TA19, it is strange that the corresponding AA12, AA14 and AA19 are not among the parents. Is this just caused by “ill” training data, or by a wrong model adopted by LoopBN?
4 Conclusion
Modeling protein loops is a hard but important task. Working on LoopBN gives us some hints for future work. A probabilistic model is suitable for modeling protein loops, compared with discrete statistical analysis approaches. Considering the high degree of freedom of loops, a continuous probability distribution is necessary for capturing the properties of the torsion angles of loops. An open framework, like LoopBN, is able to allow most types of relationships to be established between loop attributes. Relaxing the structural constraints of LoopBN will let LoopBN learn more sophisticated models for protein loops.
References 1. Heuser, P., Wohlfahrt, G., Schomburg, D.: Efficient methods for filtering and ranking fragments for the prediction of structurally variable regions in proteins. Proteins 54, 583–595 (2004) 2. Jiang, H., Blouin, C.: Ab initio construction of all-atom loop conformations. Molecular Modeling 12, 221–228 (2006) 3. Cortes, J., Simeon, T., Remaud-Simeon, M., Tran, V.: Geometric algorithms for the conformational analysis of long protein loops. J. Comput. Chem. 25, 956–967 (2004) 4. Canutescu, A.A., Dunbrack Jr., R.L.: A robotics algorithm for protein loop closure. Protein Sci. 12, 963–972 (2003) 5. Rohl, C.A., Struss, C.E.M., Misura, K.M.S., Barker, D.: Protein structure prediction using rosetta. Methods In Enzymology 383, 66–93 (2004) 6. Bradley, P., Misura, K.M.S., Barker, D.: Toward high-resolution de novo strcture prediction for small protein. Science 309(5742), 1868–1871 (2005) 7. Zhang, Y., Skolnick, J.: Automated structure prediction of weakly homologous proteins on a genomic scale. PNAS 101, 7594–7599 (2004) 8. Zhang, Y.: I-tasser server for protein 3d structure prediction. BMC Biol. 9, 40 (2008) 9. Wu, S., Skolnick, J., Zhang, Y.: Ab initio modeling of small prediction by iterative tasser simulation. BMC Biology 5, 17 (2007) 10. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: The protein data bank. Nucleic Acids Research 28(1), 235–242 (2000) 11. Rohl, C.A., Strauss, C.E.M., Chivian, D., Baker, D.: Modeling structurally variable regions in homologous proteins with rosetta. Proteins 55, 656–677 (2004)
12. Boomsma, W., Mardia, K.V., Taylor, C.C., Ferkinghoff-Borg, J., Krogh, A., Hamelryck, T.: A generative, probabilistic model of local protein structure. PNAS 105, 8932–8937 (2008) 13. Yang, P., L¨ u, Q., Yang, L., Wu, J., Wen, W.: A generative probabilistic model for loop modeling. Computers and Applied Chemistry(in Chinese) (2010) (in press) 14. Fiser, A., Do, R.K.G., Sali, A.: Modeling of loops in protein structures. Protein Sci. 9, 1753–1773 (2000) 15. Ramachandran, G.N., Ramakrishnan, C., Sasisekharan, V.: Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7, 95–99 (1963) 16. Mardia, K.V., Taylor, C.C., Subramaniam, G.K.: Protein bioinformatics and mixtures of bivariate von mises distributions for angular data. Biometrics 63, 505–512 (2007) 17. Walle Van, I., Lasters, I., Wyns, L.: Sabmarkda benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 21, 1267–1268 (2005) 18. Murzin, A.G., Brenner, S.E., Hubbard, T., Hubbard, T., Chothia, C.: Scop: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540 (1995) 19. Kabsch, W., Sander, C.: Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637 (1983) 20. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978) 21. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951) 22. Bishop, C.M.: Pattern recognition and machine learning. Springer, New York (2006) 23. Wu, J., L¨ u, Q., Huang, X., Yang, L.: De novo prediction of protein backbone by parallel ant colonies (October 2009) (in submission), http://www.zhhz.net/~ qiang/pacBackbone 24. L¨ u, Q., Xia, X., Qian, P.: A parallel aco approach based on one pheromone matrix. In: Dorigo, M., Gambardella, L.M., Birattari, M., Martinoli, A., Poli, R., St¨ utzle, T. (eds.) ANTS 2006. LNCS, vol. 4150, pp. 332–339. Springer, Heidelberg (2006) 25. Read, R.J., Chavali, G.: Assessment of casp7 predictions in the high accuracy template-based modeling category. Proteins: Structure, Function, and Bioinformatics (2007)
Towards Constraint Optimal Control of Greenhouse Climate
Feng Chen 1 and Yongning Tang 2
1 Department of Automation, University of Science and Technology of China, Hefei, 230027, China
2 School of Information Technology, Illinois State University, Chicago, USA
[email protected], [email protected]
Abstract. Greenhouse climate is a multivariable, coupled, nonlinear and uncertain system. It consists of several major environmental factors, such as temperature, humidity, light intensity, and CO2 concentration. In this work, we propose a constraint optimal control approach for greenhouse climate. Instead of modeling the greenhouse climate, Q-learning is introduced to search for the optimal control strategy through trial-and-error interaction with the dynamic environment. The coupled relations among greenhouse environmental factors are handled by coordinating the different control actions. The reinforcement signal is designed with consideration of the control action costs. To decrease the systematic trial-and-error risk and reduce the computational complexity of the Q-learning algorithm, Case-Based Reasoning (CBR) is seamlessly incorporated into the Q-learning process of the optimal control. The experimental results show that this approach is practical, highly effective and efficient.

Keywords: Q-learning, Case-based reasoning, Environmental factor, Reinforcement signal, Action coordination, Greenhouse climate.
1 Introduction

The optimal control of greenhouse climate is one of the most critical techniques in digital agriculture, which aims at providing a suitable man-made climate for vegetable growth [1],[2],[3]. However, it is a challenge because: (1) Greenhouse climate, a nonlinear and complex system, is composed of multiple closely interrelated factors such as temperature, humidity, light intensity and CO2 concentration. For example, a change in temperature has a strong impact on humidity; a change in light intensity also affects temperature, humidity and CO2 concentration. (2) Greenhouse climate is an uncertain system because (a) the transition probability of the environment state is unknown, (b) the effect of a control action is uncertain, and (c) the greenhouse environment is partially open and influenced by the outside climate. In the last two decades, researchers have paid considerable attention to greenhouse climate control. The conventional control approaches for greenhouse climate can be classified into three categories: proportional-integral-derivative (PID) control, fuzzy control, and neural network control [4], [5]. PID is the most common control method for greenhouse climate. So far, various studies of greenhouse climate control based on PID
have been performed by many researchers. For example, Setiawan et al. investigated a pseudo-derivative-feedback control method for greenhouse temperature control, which is a modification of integral control with a derivative-feedback algorithm [6]; Cunha et al. used a recursive identification technique to estimate, in real time, the parameters of a second-order model of the inside temperature of a greenhouse, and then employed a PID controller for greenhouse temperature control [7]. However, it is difficult to tune the parameters of a PID controller on line. Moreover, most of these studies focus on the control of a single greenhouse environmental factor; little work has been done on controlling multiple greenhouse environmental factors simultaneously. Pasgianos et al. presented a feedback-feedforward approach for the climate control of greenhouses, including temperature and humidity [8]. However, the variable decoupling depends on the hypotheses that external disturbances are isolated and measurable and that the decoupling matrix is nonsingular, and it is extremely difficult for an actual greenhouse environment to satisfy these conditions. Fuzzy control is also widely applied to greenhouse climate control; the typical work includes the following. For greenhouse temperature and humidity control, Lafont et al. proposed a fuzzy control method based on expert knowledge [9]; the correlations among greenhouse environmental factors are decoupled by the fuzzy controller design. Nachidi et al. described a control method for temperature and humidity in greenhouses in which a Takagi-Sugeno (T-S) fuzzy model is constructed from a simplified nonlinear dynamic model of the greenhouse climate [10]. Fuzzy control can accomplish control actions without a precise mathematical model, but it has several shortcomings, such as low control accuracy and the difficulty of determining and adjusting the fuzzy rules. Neural-network-based control methods for greenhouse climate are a class of relatively new techniques [11], [12]. In the paper by Fourati and Chtourou, an Elman neural network is used to emulate the direct dynamics of a greenhouse, and the inverse neural network is employed to drive the system outputs to the desired values. Based on radial basis function neural networks [15], Ferreira et al. model the inside air temperature of a hydroponic greenhouse as a function of the outside air temperature, solar radiation, and the inside relative humidity [16]. A neural network can represent nonlinear correlations among different variables; however, it relies on a large number of training patterns, which can be problematic in practice. Reinforcement learning is a type of unsupervised machine learning. Q-learning, as a typical reinforcement learning method, is widely applied in intelligent control and robotics [15],[16],[17],[18]. Interacting with the environment, Q-learning searches for the optimal policy by trial and error without modeling the environment, which makes it a suitable and effective means for the optimal control of greenhouse climate. The objective of this paper is to find an optimal control approach for the greenhouse climate. Considering both vegetable growth requirements and control costs, we propose a novel control approach for greenhouse climate based on Q-learning.
Instead of modeling the correlation between vegetable growth and environmental factors, the control system searches for the optimal control policy (a set of state-action pair sequences) by conducting trial-and-error interaction with the environment, such that an optimal or sub-optimal environment can be set up to facilitate vegetable growth. Moreover, we incorporate CBR (Case-Based Reasoning), an AI methodology for processing empirical knowledge [20], into
the Q-learning process to overcome several drawbacks such as high computational complexity and trial-and-error risk [16], [19]. This paper is organized as follows. In Section 2, the novel control principle for greenhouse climate is introduced and analyzed in detail. In Section 3, a CBR-integrated Q-learning algorithm is presented. Experimental results are shown in Section 4. Finally, Section 5 concludes this work.
2 Technical Approach

2.1 Q-Learning

Reinforcement learning is a type of unsupervised learning and is consistent with the behaviorism theory presented by Brooks [21]. In the reinforcement learning model, an agent acts on the environment and receives a reinforcement signal, either a punishment or a reward, induced by the environment state transition. The learning task of the agent is to search for an optimal policy, i.e., state-action sequences defining an action selection rule for each given state. The basic model of reinforcement learning is shown in Fig. 1.
Fig. 1. The basic model of reinforcement learning
In Q-learning, the environment is regarded as a Markov process with finite states. Q-learning directly optimizes a Q-function that can be computed iteratively without an environment model. The Q-function, defined in formula (1), is the discounted cumulative reward given that an agent starts in state s_t, applies action a_t once, and follows a policy thereafter.
Q(s_t, a_t) = r_t + γ max_{a∈A} Q(s_{t+1}, a)    (1)

where γ (0 < γ < 1) is the discount rate and r_t is the reward received when the environment makes a transition from state s_t to s_{t+1}. Due to the above properties, Q-learning is well suited to achieving optimal control of greenhouse climate.
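As an illustration (this sketch is not part of the original paper), the learning-rate form of this update, Q(s, a) ← (1 − α)Q(s, a) + α[r + γ max_b Q(s′, b)], which is the form actually used by the control algorithm in Section 3, can be written as follows; the state encoding, action names and reward value are placeholders.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma*max_b Q(s',b))."""
    best_next = max(Q[(s_next, b)] for b in actions) if actions else 0.0
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * best_next)
    return Q[(s, a)]

# Q-table over (state, action) pairs, initialized to 0 as in Section 3.
Q = defaultdict(float)
actions = ["fan_on", "fan_off", "spray_on", "no_action"]            # placeholder sub-actions
s, a = (26.5, 0.82, 17700, 620), "fan_on"                           # placeholder discretized state
s_next, r = (25.9, 0.84, 17700, 630), 0.4                           # placeholder transition and reward
q_update(Q, s, a, r, s_next, actions)
```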
2.2 The Control Principle

In general, the vegetable growth process can be divided into three stages: the seeding, growth, and mature periods. Each period has different requirements for the greenhouse climate. Taking as the research background the control of greenhouse climate during a growth period of a greenhouse vegetable, we study an optimal control approach for such an uncertain environment. Observing that greenhouse climate changes within a
specific range, the variations of temperature, humidity, light intensity and CO2 concentration are almost cyclical. Therefore, global optimal control of a greenhouse environment can be decomposed into multi-step local optimal control to achieve daily optimal control of the environment. An environment state S can be defined as a tuple T × H × L × C, where the variables T, H, L and C represent temperature, humidity, light intensity and CO2 concentration, respectively. By sampling the greenhouse environmental factors at a fixed time interval, the continuous states of the greenhouse environment can be decomposed into a series of discrete states {S_0, S_1, …, S_n}. The related definitions are given as follows:

Definition 1: An optimal control process for greenhouse climate is defined as a tuple
⟨S, A_t × A_h × A_l × A_c, R⟩, where S is a finite set of environment states; A_t, A_h, A_l and A_c are the finite discrete sets of executable actions for the corresponding environment states, and the intersection of A_i and A_j may be non-empty (where i, j ∈ {t, h, l, c}); R: S × A → R is a numerical reward function. The control system aims at finding a policy that obtains the optimal discounted cumulative reward for any discrete state sequence. Since there may be contradictory relations among the action sets A_t, A_h, A_l and
A_c, we define the corresponding action relations as follows:

Definition 2: Suppose A is a discrete joint action set. A joint action a ∈ A consists of actions drawn from the sub-action sets A_t, A_h, A_l and A_c, such as a_t^p, a_h^q, a_l^i and a_c^j (where p, q, i and j are the indices of the sub-actions a_t^p, a_h^q, a_l^i and a_c^j in their respective sub-action sets), which are called sub-actions of the joint action a. If a_i^p = ¬a_j^q holds, where ¬a_j^q denotes the opposite action of a_j^q, then a_i^p and a_j^q are regarded as contradicted actions. Contradicted actions are not permitted to constitute any joint action. When sub-actions from different sub-action sets are combined to form joint actions, if sub-action a_i^p is contrary to sub-action a_j^q, then the corresponding joint action is formed with a_i^p × null or a_j^q × null, where null means no action.
Towards Constraint Optimal Control of Greenhouse Climate
443
environment is a multivariable, nonlinear, and uncertain system, domain experts’ prior knowledge plays an important role in greenhouse environment control. CBR is capable of representing domain experts’ inference processes, thus an effective means for the control of greenhouse environment. Incorporating CBR into Q-learning could accelerate learning process, reduce the search space, and minimize trial-and-error risk. In this work, a case library is developed for the control of greenhouse climate, in which each case has three attributes: environment state, action and reward. An environment state not only denotes case attribute, but also serves as the case index. Let si1 , si 2 , si 3 and
si 4 denote the current values of temperature, humidity, light intensity and CO2 concentration, respectively. The case library has structure as shown in Table 1: Table 1. Case Library structure Case attribute
si1 × si 2 × si 3 × si 4
Action
Reward value
r
ai
The case library consists of three kinds of cases: prior case, restrict case and selflearning case. Prior cases: the cases designed initially according to prior domain knowledge and experiences. Restrict cases: the cases including contradicted actions or forbidden action. Self-learning case: the cases generated during systematic trial-and-error interaction with the environment. Let a forbid be a forbidden action and ae an action given by the domain experts, prior case and restrict case are expressed in Table 2: Table 2. Representation of Prior Case and Restrict Case Case attribute
Action
Reward value
si1 × si 2 × si 3 × si 4
a forbid
-1
si1 × si 2 × si 3 × si 4
ae
0.5
The control system can make full use of prior cases to accelerate learning process effectively. Restrict cases are used to avoid invalid and risky actions during the systematic trial-and-error interactions. Self-learning cases, which are generated autonomously while searching for optimal policy, can be used to enhance online learning performance of the control system. Initially, only the prior cases and restrict cases exist in case library. In a given state S , the system retrieves the current state to determine whether it matches with case attribute in the library or not. If a match found, the control system executes the associated action by this case. However, for the purpose of exploration, the control system may occasionally select an alternate action with a low probability even a matched
444
F. Chen and Y. Tang
case found. Otherwise, the control system chooses a non-forbidden action to conduct with the probability determined by Boltzmann distribution. After the action is executed, the environment state will be transited, and the reinforcement signal will be received. Then the less reward case will be replaced with the better rewarded one in case library, so that the case library will be improved and learning process can be accelerated. Along with the constantly iterative learning, the optimal policy will be discovered and then a suitable environment can be offered for vegetable growth.
3 Algorithm Description
α ∈ (0,1)
γ ∈ (0,1) be discounted rate. A policy f : s → a is a function from states to actions. V f ( x ) denotes the value of policy f , which is desired rewards given that the process begins in state x and follows policy f thereLet
be learning rate,
after. The control algorithm for greenhouse climate is described as follows: Q← initialized the values of actions(initialized as 0), Let se be goal state 1: Repeat s←current state
s e − s ≤ δ then goto 1 //where δ is user-defined error threshold.
If
If s matches with case index s’ in the case library then {Execute valid action a’ that determined by state s’, Let y be the next state and r be the reward received, but occasionally an alternate} Else select an action to perform in term of Boltzmann distribution:
p(a, s ) =
eQ (s ,a )/ T ∑ e Q ( s ,b ) / T b∈ A
Endif
[
]
Q (s , a ) ← (1 − α )Q (s, a ) + α r + γV ( y )
// V ( y ) = Q ( y, f ( y ))
Q( y, a ) = max b∈ A Q( y, b) Revise policy value V f ( s ) = max b∈ A Q ( s, b) //for each y ∈ S : f ( y ) ← a that
If no matching case found then generate a new case, Else If r>r’ then Replace the corresponding case s’ with s Endif // r’ is reward corresponding to case s’ Endif s←y
Towards Constraint Optimal Control of Greenhouse Climate
445
The computational complexity of this algorithm is analyzed as follows:
max(size(T , H , L, C )) be n and the maximal size of sub-action sets be m . Assuming k is total number of the cases in case library, the computational complexity Let
(
)
of standard Q-learning algorithm for greenhouse climate control is O n × m . Accordingly, the computational complexity of this Q-learning algorithm combined with CBR is decreased by k × O (m 4 ) . When k increases during searching for optimal policy, the search space of Q-learning is reduced significantly. 4
4
4 Case Study and Experimental Results 4.1 Reinforcement Signal Design
Taking as an example the greenhouse climate control of cherry tomato growth process, the Algorithm described in section 3 will be applied to search for an optimal policy. It is critical to design the reinforcement signal, which is used to evaluate and improve the performances of control system. In the cherry tomato growth example, we believe the following principles should be taken into consideration to design reinforcement signal. Cherry tomato growth requirement for greenhouse environmental factors. Cherry tomato has adaptability for its surrounding environment. The suitable temperature range of cherry tomato growth is represented as (tl , to , t h ) , where t l , t o and t h are the minimum, optimum and maximum temperature, respectively. The suitable humidity, light intensity, and CO2 concentration ranges can be represented similarly. Thus optimal values of greenhouse environmental factors for cherry tomato growth are denoted by tuple: (to , ho , lo , co ) , where t o , ho , l o and co represent the desired values of temperature, humidity, light intensity and CO2 concentration, respectively. The coupled relations among the control actions. Due to complex correlations among temperature, humidity, light intensity and CO2 concentration, it leads to the coupled relations among the corresponding control actions. The control system has to coordinate sub-actions to achieve the desired control effects. The control action cost. Control actions inevitably introduce certain costs. In fact the energy provided is not sufficient to drive the internal climate to the desired one. This is can be explained by the limit of the actuators power. For practical applications, it is important to take the control action costs into consideration. Suppose variable cost is total costs of the control system, then it is determined by Equation 2. n
4
cost = ∑∑ cost (a ij )
(2)
i =1 j=1
( )
Where cos t aij denotes the cost of executing control action aij . The greenhouse control system drives the environmental factors to the desired values, but it must
446
F. Chen and Y. Tang
subject to the constraint condition cost ≤ θ , θ is the user-defined upper limit of total control costs. By considering the above criteria, the reinforcement signal is designed as follows:
g (s i ) = (k i 1 f (t i − t o ) + k i 2 f ( h i − h o ) + k i 3 f (l i − l o ) + k i 4 f (c i − c o )) Where si is ith state. experts.
f ( x ) = 2e − x
values
of
k i1 , k i 2 , k i 3
and
ki 4
(3)
are weights designated by the domain
− 1 . ti , hi , li and ci are real-time values of temperature, humidity, light intensity and CO2 concentration at time i , respectively. g ( x ) represents approximate degree between the values of environmental factors and the desired values at time i . The Larger the value of g ( x ) is, the closer the
Δg ( x ) =
2
/a
greenhouse
environment factors are to their desired values. g ( x + 1) − g ( x ) reflects the varying trends of the greenhouse environ-
mental factors. If Δg (x ) > 0 , it shows that the values of the greenhouse environment factor are approaching to the desires values, and thus a positive reward ought to be obtained. Otherwise, a punishment (i.e., a negative reward) will be received. So the final expression of reinforcement signal is defined as the following.
r (x ) =
1 1+ e
− Δg ( x ) / b
(4)
4.2 The Coordination of Sub-actions
Based on the proposed algorithm, an intelligent apparatus can be developed to autonomously control temperature, humidity, light intensity and CO2 concentration to optimize greenhouse environment (shown in Fig. 2) for cherry tomato growth.
Fig. 2. The structure of greenhouse environment control system
This control system can execute the two types of actions: switch action and discrete action. The actions executed by CO2 generator, fan device, spray device and sodium lamp device belong to the switch action type. The actions with certain control
Towards Constraint Optimal Control of Greenhouse Climate
447
precision, e.g. percentage of opening ventilation window or sunshade net, are referred to as the discrete action. As an actuator may be driven by multiple sub-states, the following examples are given to illustrate how to coordinate and combine the actions. Switch action. Let the setpoints of temperature and humidity for cherry tomato be to and ho , respectively. In current state Si , temperature value t > to , humidity value
h < ho . Thus, sub-state si1 demands fan on for cooling. On the contrary, sub-state si 2 requires the fan off. Since the fan actuator is only able to execute one action at a given time, the two sub-actions are combined according to Definition 2. Discrete action. The combination and coordination of discrete actions are much more complex than that of switch action. Assume the setpoints of temperature and light intensity are to and lo , respectively. In current state Si , temperature t > to , light intensity centage
l < lo . For sub-state si1 , the sunshade net should be closed with per-
x , but for sub-state si 3 , it should be opened with another percent-
age x' (where x, x'∈ [0,1] ). Therefore, there are actions offered for sub-state
int (100 × ( x − x') + 1) candidate
si1 and si 3 to select (where int (⋅) is integral function,
every integer corresponds to an action). 4.3 Experimental Results
In our field experiments, we first applied a standard Q-learning algorithm to control the greenhouse climate; the experimental results are shown in Fig. 3. Then we adopted the proposed algorithm to control the greenhouse climate; the result is shown in Fig. 4. Fig. 3 and Fig. 4 illustrate that the algorithm combining Q-learning with CBR can accelerate learning and convergence: a computation requiring 250 iterations with the standard Q-learning algorithm requires only 110 iterations after the incorporation of CBR into Q-learning.
Fig. 3. The experimental results by standard Q-learning algorithm
Fig. 4. The experimental results by combination Q-learning with CBR
For the purpose of comparison, we implemented three control systems that adopt fuzzy control, PID control and our approach, respectively, to control the same greenhouse climate located in the city of Hefei, China, in the summer of 2008. Owing to the complicated coupled correlations among the greenhouse environmental factors, each factor is controlled by its own PID controller. The Ziegler-Nichols method was used to calculate and tune the parameters of these PID controllers: K_P, K_i, K_d [22]. The approach in [10] is adopted to implement a fuzzy controller, which is a control system with multiple inputs and multiple outputs. The fuzzy rules are developed according to the Takagi-Sugeno model, and there are 225 fuzzy rules in the fuzzy rule library. Under the same control cost condition, the actual experimental results are shown in Fig. 5 and Fig. 6. Temperature is the most important parameter for greenhouse climate control. The growth and yield of cherry tomato are strongly correlated with the daily average temperature but weakly related to the daily temperature difference. The effect of this control system on greenhouse temperature is shown in Fig. 5.
Fig. 5. The actual control results of temperature by our approach (denotes as Q+CBR), fuzzy control and PID control
The daily average temperature is 26.5 °C, which approximates the desired daily average temperature of 23.5 °C. Correspondingly, the daily average temperatures obtained by fuzzy control and PID control are 27 °C and 28.2 °C, respectively. The desired value of greenhouse humidity for cherry tomato is 85%. When our approach is employed to adjust the greenhouse humidity, the daily average humidity is 82%, which is better than fuzzy control with a humidity of 80.2% and PID control with a humidity of 78.5%. The setpoints of light intensity and CO2 concentration for cherry tomato are 17700 lx and 700 mg/l, respectively. The control of light intensity and CO2 concentration is performed only in the daytime. When our approach is adopted to control the light intensity and CO2 concentration in the greenhouse, the average light intensity and CO2 concentration are 18200 lx and 620 mg/l, respectively, compared to 19700 lx and 560 mg/l for PID control and 18900 lx and 596 mg/l for fuzzy control.
Fig. 6. The actual control results of humidity by our approach (denoted as CBR+Q), fuzzy control and PID control
5 Conclusion

Greenhouse climate is a multivariable, nonlinear and uncertain system. It is a challenge to model the correlation between vegetable growth and greenhouse climate. Due to the lack of a greenhouse environment model, it is extremely difficult for the typical methods, including PID control, fuzzy control, and neural network control, to achieve optimal control of greenhouse climate. Based on Q-learning, this paper proposes a constraint optimal control algorithm for greenhouse climate that does not model the environment. CBR is seamlessly incorporated into Q-learning; as a result, the Q-learning process is accelerated and the trial-and-error risk is decreased effectively. In addition, the coupled control actions can be coordinated through systematic interactions with the controlled greenhouse environment. The experimental results demonstrate that this approach is able to drive the greenhouse climate to approximate the desired one and is clearly superior to the PID and fuzzy control. This work not only offers an effective means for the optimal control of greenhouse climate, but also provides a model-free approach for tackling the challenge of controlling similar uncertain systems, which has a broader impact on digital agriculture.

Acknowledgements. This work is supported financially by the National Science Fund of China under Grant No. 60775014 and the Key Natural Science Fund of the Anhui Provincial Education Department under Grant No. 2006KJ028A.
References 1. Xu, F., Chen, J., Zhang, L., Zhan, H.: Self-tuning Fuzzy Logic Control of Greenhouse Temperature using Real-coded Genetic Algorithm. In: 9th International Conference on Control, Automation, Robotics and Vision, ICARCV 2006, December 5-8, pp. 1–6 (2006) 2. Yingxia, L., Shangfeng, D.: Advances of intelligent control algorithm of greenhouse environment in china. Transactions of the CSAE 20(2), 267–272 (2004)
3. Trigui, M.: A strategy for greenhouse climate control, part I: model development. J. Agric. Engng. Res. 78(4), 407–413 (2000) 4. Sigrimis, N., King, R.E.: Advances in greenhouse environment control. Computers and Electronics in Agriculture 26(3), 217–219 (2000) 5. Yanzheng, L., Guanghui, T., Shirong, L.: The Problem of the Control System for Greenhouse Climate. Chinese Agricultural Science Bulletin 23(10), 154–157 (2007) 6. Setiawan, A., Albright, L.D., Phelan, R.M.: Application of pseudo-derivative-feedback algorithm in greenhouse air temperature control. Comp. Electronics Agric. 26(3), 283–302 (2000) 7. Boaventura Cunha, J., Couto, C., Ruano, A.E.: Real-time parameter estimation of dynamic temperature models for greenhouse environmental control. Control Engineering Practice 5(10), 1473–1481 (1997) 8. Pasgianos, G.D., Arvanitis, K.G., Polycarpou, P., Sigrimis, N.: A nonlinear feedback technique for greenhouse environmental control. Computers and Electronics in Agriculture 40(1-3), 153–177 (2003) 9. Lafont, F., Balmat, J.F.: Optimized fuzzy control of a greenhouse. Fuzzy Sets and Systems 128, 47–59 (2002) 10. Nachidi, M., Benzaouia, A., Tadeo, F.: Temperature and humidity control in greenhouses using the Takagi-Sugeno fuzzy model. In: 2006 IEEE International Conference on Control Applications, pp. 2150–2154 (October 2006) 11. Ferela, P.M., Ruano, A.E.: Choice of RBF model structure for predicting greenhouse inside air temperature. In: 15th Triennial World Congress, Barcelona, Spain (2002) 12. Sandra, P.: Nonlinear model predictive via feedback linearization of a greenhouse. In: 15th Triennial World Congress, Barcelona, Spain (2002) 13. Fourati, F., Chtourou, M.: A greenhouse control with feed-forward and recurrent neural networks. Simulation Modeling Practice and Theory 15, 1016–1028 (2007) 14. Ferreira, P.M., Fariab, E.A., Ruano, A.E.: Neural network models in greenhouse air temperature prediction. Neurocomputing 43, 51–75 (2002) 15. Barto, A.G.: Reinforcement learning in the real world. In: 2004 IEEE International Joint Conference on Neural Networks, vol. 3, pp. 25–29. 16. Watkins, C.J.C.H., Dayan, P.: Technical notes: Q-learning. Machine Learning 82, 39–46 (1992) 17. Jangmin, O., Lee, J., Zhang, B.-T., et al.: Adaptive stock trading with dynamic asset allocation using reinforcement learning. Information Sciences 176, 2121–2147 (2006) 18. Macek, K., Petrovic, I., Peric, N.: A reinforcement learning approach to obstacle avoidance of mobile robots. In: 7th International Workshop on Advanced Motion Control, pp. 462– 466 (July 2002) 19. Whitehead, S.D., Lin, L.-J.: Reinforcement learning of non-Markov decision processes. Artificial Intelligent 73, 271–306 (1995) 20. Juell, P., Paulson, P.: Case-based systems. IEEE Intelligent Systems (see also IEEE Expert) 18(4), 60–67 (2003) 21. Brooks, R.A.: Intelligence Without Representation. Artificial Intelligence Journal 47, 139– 159 (1991) 22. Kuo, B.C.: Automatic Control System. Prentice-Hall, New York (1995)
A Kernel Spatial Complexity-Based Nonlinear Unmixing Method of Hyperspectral Imagery
Xiaoming Wu 1, Xiaorun Li 1, and Liaoying Zhao 2
1 College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China
2 Institute of Computer Application Technology, Hangzhou Dianzi University, Hangzhou 310018, China
{wujietao2014,lxr,zhaoly}@zju.edu.cn
Abstract. In hyperspectral analysis, spatial correlation information is potentially valuable for hyperspectral unmixing. In this paper, we propose a new model, denoted "kernel spatial complexity-based nonnegative matrix factorization" (KSCNMF), to unmix nonlinearly mixed data. The method is derived in the feature space and is kernelized in terms of kernel functions in order to avoid explicit computation in the high-dimensional feature space. In the algorithm, the input data are implicitly mapped into a high-dimensional feature space by a nonlinear mapping associated with a kernel function. As a result, the high-order relationships and more useful features of the spectral data can be exploited. Experimental results based on a set of simulated data and a real hyperspectral image demonstrate that the proposed method has excellent performance in the decomposition of nonlinearly mixed pixels.

Keywords: spatial complexity, kernel function, spectral unmixing, nonnegative matrix factorization.
1 Introduction

Multispectral remote sensing images cannot provide enough information for spectral unmixing. Owing to their high spectral resolution and hundreds of spectral channels ranging from 0.4 to 2.5 micrometers, hyperspectral remote sensing images are widely used for spectral analysis. However, modern spectrometers cannot provide the same high spatial resolution, so mixed pixels are widespread in hyperspectral imagery. How to efficiently unmix the mixed pixels therefore becomes a critical step in hyperspectral applications. The fundamental problem in many unmixing analyses is finding a suitable representation of the mixed data. The mixture models include linear and nonlinear ones. The linear model is widely applied because of its simplicity of modeling and meaningful representation of the data. However, for micro-scale spectral analysis and for low-probability detection analysis, it is necessary to use nonlinear mixture models. Recently, scholars have developed some nonlinear spectral mixture analysis algorithms based on kernel theory, including support vector nonlinear approximating regression [1], kernel orthogonal subspace projection [2], etc. However, these methods are supervised algorithms, so it remains a challenge to develop an unsupervised nonlinear algorithm with low computational complexity.
Nonnegative matrix factorization (NMF) [3] has been introduced to decompose hyperspectral mixed data into two nonnegative parts, the endmembers and their corresponding abundances. Some scholars introduced sparsity into the NMF algorithm and developed methods such as sparse-based nonnegative matrix factorization (sNMF) [4] and nonnegative matrix factorization with sparseness constraints (NMFsc) [5]. Recently, Sen Jia proposed a complexity-based nonnegative matrix factorization with nonnegative constraints [6]; let CBnNMF denote this algorithm. In this paper, we extend the CBnNMF algorithm to nonlinearly mixed data and develop an unsupervised kernel [7] spatial complexity-based nonnegative matrix factorization algorithm (KSCNMF). Compared to the existing supervised algorithms, the proposed approach avoids explicit computation in the feature space and can obtain better endmember abundances without any prior knowledge. The remainder of this paper is organized as follows. Section 2 introduces the CBnNMF algorithm. Section 3 briefly describes the kernel function and the KSCNMF algorithm. In Section 4, the experimental results, based on a set of synthetic mixtures and a real image scene, are evaluated. Lastly, Section 5 concludes the paper.
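To make the decomposition concrete (my illustration, not the paper's CBnNMF or KSCNMF code), the standard multiplicative NMF updates of [3], on which the methods below build, can be sketched as follows; the toy data and the number of endmembers are placeholders.

```python
import numpy as np

def nmf(V, p, iters=200, eps=1e-9):
    """Basic Lee-Seung multiplicative updates for V ~= W H with nonnegative factors."""
    L, N = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((L, p)) + eps          # endmember signatures (L bands x p endmembers)
    H = rng.random((p, N)) + eps          # abundances (p endmembers x N pixels)
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.random.default_rng(1).random((50, 100))    # toy nonnegative "hyperspectral" data
W, H = nmf(V, p=3)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```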
2 CBnNMF Algorithm 2.1 Linear Spectral Mixture Model Commonly, in hyperspectral imagery each pixel can be approximately represented as a linear combination of its imagery endmembers. Let v be a mixed pixel consisting of l spectral bands and p spectral endmembers present in the mixed pixel. Thus
v= wh+ n .
(1)
p
satisfying w ≥ 0 , ∑ hi =1 , 0 ≤ hi ≤ 1 , where w = [ w1 w 2 ⋅ ⋅ ⋅ w p ] is the ( l × p ) endmember i =1
matrix and h is an ( p × 1 ) abundance column vector whose elements are the proportions of each endmember contributing to the mixed pixel. Ignoring the influence of additive zero-mean Gaussian noise, the model (1) can be rewritten as v ≈ wh . 2.2 CBnNMF
Nonnegative matrix factorization is used to find two nonnegative matrixes W ∈ R L× N
L× p
and H ∈R p× N with an initial matrix V ∈ R and a positive integer p < min( L, N ) , so that they can satisfy the equation V ≈ WH . W and H are found by minimizing the 1 2 Euclidean distance f (W , H ) ≡ || V − WH ||F , satisfying Wij ≥ 0 , H ij ≥ 0 . By 2 introducing spectral and spatial complexity of hyperspectral data, the paper [6] reformulated the objective function as
A Kernel Spatial Complexity-Based Nonlinear Unmixing Method
f (W , H ) =
453
2 1 α V −WH 2 + W −W + β χ ( H ) . 2 2
(2)
W Lp = λW ( L − 1) p + (1 − λ )W ( L − 1) p .
(3)
where χ ( H ) represents the local correlation value of abundance matrix H . It can be written as
χ ( H ijp ) = ∑ w i′j′φ ( H ijp − H i′j′p , τ ) .
(4)
φ (ε , τ ) = τ ln[cosh(ε / τ )] .
(5)
i ′j ′∈ N ijp
In the equation (4), N ijp denotes the neighborhood system of H ijp ,with wi′j′ being the weighting factor and τ being the scaling parameter. In the equations, a typical value for τ is 0.1 and wi′j′ is 1. The neighborhood system N ijp equates {(i − 1) j , (i + 1) j , i ( j − 1), i ( j + 1)} . The update rule described in [6] for matrixes W and H is W ← W ⋅ ∗(VH + α W ) ⋅ /(WHH + α W ) . T
(6)
T
4 4β T T H ← H ⋅ ∗(W V + β ( H − ∇ χ ( H ) )) ⋅ /(W WH + H) .
τ
(∇ χ ( H ) )
= (∇ χ ( H ) ) pb
= ∑ tanh( ijp
(7)
τ
H ijp − H i′j′p
τ
i′j′∈ Nijp
) ( H pb = H ijp ) .
(8)
The marks ⋅ ∗ , ⋅ / denote dot product and dot divide respectively. (∇ χ ( H ) ) is the pb
partial derivative of matrix H in the coordinates ( p, b) . In order to make H satisfy the full additivity constraints, the following proposed iterative operation is applied H ij =
H ij p
.
(9)
∑ H ij i=1
3 KSCNMF Algorithm 3.1 Kernel Method
According to the nonlinear mapping
φ : x ∈ R → φ ( x) ∈ C . n
N
(10)
454
X. Wu, X. Li, and L. Zhao
The paper [7] gives the definition of kernels. Assuming all x, z ∈ X , X ⊂ R , if function k satisfy the equation n
k ( x , z ) =< φ ( x ) ⋅ φ ( z ) > .
(11)
then we refer to k as the kernel function. Some commonly used kernels include radial basis functions k ( x , z ) = exp( −
x −z
2
) , polynomial kernels k ( x , z ) = ( x ⋅ y ) , and d
2σ sigmoid kernels k ( x , z ) = tanh(α ( x, y ) + θ ) . In this paper, radial basis kernel is employed. 2
3.2 KSCNMF
Spectral nonlinear mixture model can be written as V = g (W, H ) , with g denoting the nonlinear function. In the feature space, we can get the linear mixture model Vφ = Wφ H + n by using a nonlinear mapping φ . As a result, the spectral reflectance matrixes V and W are transformed into albedo matrixes Vφ and Wφ . The purpose of the NMF algorithm in the feature space is to find two suitable nonnegative matrixes Wφ and H , while satisfying Vφ ≈ Wφ H . In order to simplify computation and make the input space data suitable for mapping, we only introduce the spatial complexity. The simplified objective function is f (W , H ) =
1
V −WH
2
2
+ β χ (H ) .
(12)
and the corresponding update rule is W
m +1
φ
H
m +1
← H ⋅ ∗((W m
← W ⋅ ∗(V ( H ) ) ⋅ /(W H ( H ) ) . m
φ
m
T
m
φ
m
m
T
(13)
φ
4 m 4β m m +1 T m +1 T m ) V + β ( H − ∇ χ ( H m ) )) ⋅ /((W ) (W ) H + H ).
m +1 T
φ
τ
φ
φ
φ
τ
(14)
T
Let both sides of equation (13) be multiplied by V , and do the same operation for φ
m
m +1
the molecular and denominator. Let K VW , K VW and K VV denote the kernel matrixes T
m
T
Vφ W φ , Vφ Wφ
m +1
T
T
, Vφ Vφ respectively, then we get m +1
K VW = K VW . ∗ ( K VV ( H ) ). /( KVW H ( H ) ) . m
m
T
m
m
m
T
(15)
A Kernel Spatial Complexity-Based Nonlinear Unmixing Method
Multiplying the pseudo inverse matrix (V ) to the equation K VW = V W T
+
m +1
T
φ
φ
455
m +1
φ
gives
rise to W
m +1
φ
m +1 = (V T ) + KVW .
(16)
φ
In the same way, the update rule for abundance matrix H can be obtained as follows H
m +1
← H ⋅ ∗(( K m
4 m 4β m m +1 T + m +1 m ) + β ( H − ∇ χ ( H m ) )) ⋅ /(( K ) K K H + H ).
m +1 T
τ
VW
VW
VV
τ
VW
(17)
Then equations (15) and (17) can be used to unmix the nonlinear mixed spectral data. In the algorithm, there are two methods to find the endmembers. The first one, we set some threshold values t j to decide which column of the matrix H to choose, and then the mean values of those chosen columns are chosen as the endmembers. The equation is Wj =
1 cj
∑ Vi 1 ≤ j ≤ p .
(18)
H ij ≥ t j
The second solution to find the endmembers is to ultilize the linear mixture model +
V ≈ WH .We can obtain an approximate endmember matrix W ≈ VH .
4 Experiment 4.1 Synthetic Images
In our experiment, the image size is 36 × 36 and the endmember number is 3. The spectral data are generated by choosing different kinds of spectral reflectances from the USGS spectral library [8]. Abundance matrix is generated according to a Dirichlet distribution which satisfies the positivity and sum-to-one constraints. We use the nonlinear mixture method described in the paper [1] to create the mixed pixels. The VCA [9] algorithm is applied to extract the endmembers and abundances as the initial W and H of CBnNMF and KSCNMF. Endmember correlation coefficient (ECC) [6] is used to measure the similarity between the true endmember signature and its estimate one. Root mean square error (RMSE) [6] is used to measure the similarity between the true abundances and the estimated abundances. These two algorithms were evaluated by using different angle of incidence i and signal to noise ratio(SNR). Table 1. Unmixing results of simulated data (SNR=30)
i CBnNMF KSCNMF
30
45
60
ECC
RMSE
ECC
RMSE
ECC
RMSE
0.9653 0.9742
0.1653 0.1296
0.9778 0.9826
0.1934 0.1746
0.9007 0.9181
0.2154 0.1654
456
X. Wu, X. Li, and L. Zhao Table 2. Unmixing results of simulated data (SNR=15)
i CBnNMF KSCNMF
30
45
60
ECC
RMSE
ECC
RMSE
ECC
RMSE
0.8609 0.7514
0.2634 0.2062
0.7821 0.7001
0.3036 0.2435
0.8509 0.8054
0.2085 0.1844
The comparison between CBnNMF and KSCNMF on the accuracy of extracted abundances is shown in Table1 and Table2. From the Tables, it is clear that the abundances obtained by KSCNMF ( σ = 0.8 ) algorithm outperformed the CBnNMF algorithm. The signal to noise ratio influenced the endmember similarity, especially in the case of a small signal to noise ratio (SNR=15). 4.2 Real Image Scene
In this section, we apply these two algorithms to real hyperspectral data captured by the Airborne Visible/Infrared Imaging Spectrometer(AVIRIS) over Curprite, Nevada. The low SNR bands as well as the water-vapor absorption bands(including bands 1-2, 104-113, 148-167, and 221-224) has been removed from the original 224 bands. Our experiment is based on a subimage (50*40 pixels), shown in Figure 1. The estimated endmember numbers (ignoring the spectrums generated by the same endmember) is 3 by using the virtual dimensionality method [10]. We use the CVX MATLAB software [11] to extract the true abundance fractions of the endmembers, shown in Figure 2 (pure black denotes that the abundance of a certain endmember in this pixel is 0, while pure white denotes that the abundance is 1). According to the ground truth in [12], and the data provided by [13], the three endmembers andradite, kaolinite and montmorillonite are shown in the figures below from left to right respectively. From figure 3 and figure 4, we can conclude that the difference of abundance fraction is very apparent ( σ = 8 ).
Fig. 1. Hyperspectral imagery of Cuprite area (band 28)
Fig. 2. True Abundance fractions of the three endmembers
A Kernel Spatial Complexity-Based Nonlinear Unmixing Method
457
Fig. 3. Abundance estimation obtained by CBnNMF algorithm
Fig. 4. Abundance estimation obtained by KSCNMF algorithm
5 Conclusion In this paper, the proposed method achieves better abundance fractions through the combination of the kernel function and the spatial complexity of hyperspectral imagery. It also overcomes the nonlinear influence while avoiding explicit computation by using an appropriate nonlinear mapping. But, there are several issues that deserve further studies. First, the computation of kernel matrix is a heavy burden. Second, the initial values for the endmembers and abundances will influence the convergence speed. Lastly, it is critical to find an appropriate kernel parameter σ to enhance the unmixed accuracy. Acknowledgments. This work is supported by National Natural Science Foundation of China (60872070) and Zhejiang Province Key Scientific and Technological Project (Grant No. 2007C11094, No. 2008C21141).
References 1. Bo, W., Liangpei, Z., Pingxiang, L.: Unmixing Hyperspectral Imagery Based on Support Vector Nonlinear Approximating Regression. J. Remote Sensing 10(3), 312–318 (2006) 2. Kwon, H., Nasrabadi, N.M.: Kernel orthogonal subspace projection for hyperspectral signal classification. J. Geoscience and Remote Sensing 43(12), 2952–2962 (2005) 3. Lee, D.D., Seung, H.S.: Algorithms for Non-negative Matrix Factorization. Advances in Neural Information Processing Systems 13(3), 556–562 (2001) 4. Liu, W., Zheng, N., Lu, X.: Nonnegative matrix factorization for visual coding. In: Proceedings of the IEEE interational Conference on Acoustics.Speech, and Signal Processing (ICASSP 2003), vol. 3, pp. 293–296 (2003) 5. Hoyer, P.O.: Nonnegative matrix factorization with sparseness constraints. J. Journal of Machine Learning Research. 5, 1457–1469 (2004)
458
X. Wu, X. Li, and L. Zhao
6. Jia, S.: Unsupervised Hyperspectral Unmixing Theory and Techniques. Ph.D. Dissertation, University of Zhejiang, Hangzhou (2007) 7. Sjohn, S.T., Nello, C.: Kernel methods for pattern analysis. China Machine Press, Beijing (2005) 8. Clark, R.N., Swayze, G.A., Gallagher, A., King, T.V., Calvin, W.M.: The U.S. Geological Survey, Digital Spectral Library: Version 1: 0.2 to 3.0 μm. U. S. Geol. Surv., Washington, DC, Open File Rep., pp. 93–592 (1993) 9. Nascimento, J.M.P., Dias, J.M.B.: Vertex Component Analysis: A Fast Algorithm to Unmix Hyperspectral Data. IEEE Transactions on Geoscience and Remote Sensing 43(4), 898–910 (2005) 10. Chang, C.I., Du, Q.: Estimation of Number of Spectrally Distinct Signal Sources in Hyperspectral Imagery. IEEE Transactions on Geoscience and Remote Sensing 42(3), 608–619 (2004) 11. Grant, M., Boy, S.: Matlab software for disciplined convex programming, http://www.stanf-ord.edu/~boyd/-cvx 12. Swayze, G.: The hydrothermal and structural history of the Cuprite Mining District, southWestern Nevada: An integrated geological and geophysical approach. Ph.D. disserta-tion, Univ. Colorado, Boulder (1997) 13. United States Geological Survey, http://speclab.cr.usgs.gov/cuprite.html
Study on Machine Vision Fuzzy Recognition Based on Matching Degree of Multi-characteristics
Jingtao Lei 1, Tianmiao Wang 2, and Zhenbang Gong 1
1 School of Mechanical Engineering & Automation, Shanghai University, Shanghai, China
2 School of Mechanical Engineering & Automation, Beihang University, Beijing, China
Abstract. This paper presents a new method for fruit category recognition based on machine vision and the total matching degree of a fruit's multi-characteristics. A ladder membership function is used to express each characteristic. The matching degree of each characteristic is calculated from its membership function, the total matching degree is then calculated, and the fruit category can be determined from the total matching degree. A 5-input, 1-output zero-order Takagi-Sugeno fuzzy neural network is constructed to achieve the nonlinear mapping between fruit characteristics and fruit type, and the parameters of the membership function for each characteristic are designed as the learning parameters of the network. By training the fuzzy neural network with a large amount of sample data, the corresponding parameters of the membership functions of the recognized fruit can be determined. Taking apple recognition as an example, the experimental results show that the method is simple, effective, highly precise and easy to implement.

Keywords: Membership functions, Fuzzy recognition, Fuzzy neural network, Matching degree, Multi-characteristic.
1 Introduction

With people's living standards rising and society aging, there is an inevitable trend for service robots to enter communities and families. Bill Gates has predicted that the robot is about to repeat the rise of the personal computer industry: 'Each family will have a robot in the future'. Service robots in the family will need to recognize family members by their voice or face features and operate according to different voice commands from family members, for example, bringing fruits such as an apple or an orange, or taking bottled drinks such as milk or beer out of the refrigerator. Therefore, service robots should have recognition functions such as voice recognition, word recognition and object recognition through images or other information. Although the recognized objects, such as voices, words and physical objects, are different, they share some common attributes, from which we can abstract universal attributes and then develop modular recognition functional components to be used in different systems. Taking fruit identification as an example, this paper presents an identification method for such common functional components, i.e., a fuzzy neural network recognition method based on the total matching degree of multi-characteristics.
Robot recognition based on machine vision is one of the current hot topics. Identification is a problem that has been studied for many years, and network models have of course been tested for this problem. Color transformation and threshold segmentation algorithms were generally adopted for the automatic identification and harvesting of mature fruit [1, 2, 3]. Makoto Genba and Ken Nakamura [4] researched modular functional components based on RT middleware for robotic systems, and discussed the characteristic expression and identification of objects with similar characteristics. Akihiro Ikezoe [5] studied an image recognition hardware module used by a modular robot based on middleware. A classification and identification algorithm based on edge characteristics was presented for identifying underwater targets [6]. An automatic system for fingerprint and voice identification was studied [7]. ANFIS was used for the detection of electrocardiographic changes in patients [8]. Neural networks, template matching and cluster analysis algorithms have mainly been adopted for such recognition systems. This paper presents a fuzzy recognition method based on the total matching degree of multi-characteristics. Spherical fruits with similar characteristics were taken as an example for recognition, and multiple characteristics of the fruits were extracted. The fuzzy neural network was trained with a large amount of measured data to construct the membership function of each characteristic. The matching degree of each characteristic of the recognized fruit was calculated, and then the total matching degree was calculated. The fruit category can then be determined from the total matching degree. The method has high recognition accuracy.
2 Basic Principle of Fuzzy Recognition The principle of fuzzy recognition for the fruit category by machine vision is shown in Fig.1.The robot obtains image information of recognized fruit through computer vision system in the camera.
~ S
η~ ~ R ~ G ~ B
#
Fig. 1. System principle of fuzzy recognition
After image processing, the shape size, color, and other characteristics are extracted, at the same time, the characteristics of recognized fruits are imported into the embedded system unit for calculating the total matching degree, fruit category recognition can be achieved. 2.1 Characteristics In this paper, two categories of characteristics of the spherical fruit were considered: shape and color characteristics.
Study on Machine Vision Fuzzy Recognition Based on Matching Degree
~
461
~
The shape characteristics include size parameter ( S ) and shape parameter ( η ).
~
Size parameter ( S )defined as the average length of two sides of the rectangle which
~
encloses the fruit image boundary, shape parameter(η ) defined as the aspect ratio of the same rectangle. The fruit colors defined as the vision images are divided into three primary colors, extracting the intensity of red, green and blue. So the color characteristics include Red
~
~
~
intensity ( R ), Green intensity ( G ) and Blue intensity ( B ). 2.2 The Membership Function for Each Characteristic In this paper, according to the distribution of a large amount of sample data for each characteristic, and considering the requirement for real-time calculation of fuzzy reasoning, so the ladder membership function was adopted to describe each characteristic, as shown in Fig.2.
Fig. 2. The ladder membership function
The expression of the ladder membership function is: ⎧0, ⎪x −a ⎪ , ⎪b − a ⎪ f ( x, a, b, c, d ) = ⎨1, ⎪d − x ⎪ , ⎪d −c ⎪⎩0,
x
(1)
c≤x≤d d<x
Where x indicates the universe range of constant variable, a, b, c, d indicate the shape of the ladder membership function. At the same time, a, b, c, d are also adjustable parameters, which are determined by fuzzy neural network’s learning and training. The accuracy of recognition result is depends on whether parameters of the membership function of each characteristic are determined reasonably. The ladder membership functions of five fuzzy characteristics
~ B
were determined as shown in Fig.3. ~ S
η~
~ R
~ G
~ B
Fig. 3. Membership functions of five characteristics
~ ~ ~ S ,η~ , R , G and
462
J. Lei, T. Wang, and Z. Gong
2.3 Total Matching Degree After fuzzy neural network’s training, the parameters of the membership function of five characteristics in Eq.(1) can be determined. Then the matching degree i.e., membership expressed by mi for each characteristic can be calculated respectively. According to
mi of each characteristic, the total matching degree m for fuzzy reason-
ing can be calculated by the method of weighted average. The calculating equation is: 5
m=
∑w m i
wi
(2)
∑w
i
i =1
Where
i
i =1 5
is the weight of each characteristic. The value of m is the total matching
degree, ranging between 0 and 1. The closer to 1, the higher the degree of matching, the higher the accuracy of reasoning ‘the unknown fruit is a certain fruit’. The value of m shows that the matching degree between the characteristic of ‘the unknown fruit’ with the characteristic of ‘some kind of fruit’, i.e., the credibility of the inference that ‘the unknown fruit is some kink of fruit’. Large numbers of results show that if m ≥ 0.8000, the reasoning result is coincident with the actual situation.
3 The Structure of Fuzzy Neural Network As mentioned above, for a given fruit to be identified, there are only two cases for the results of the classification. One is ‘is this kind fruit’ expressed by 1; the other ‘is not this kind fruit’, expressed by 0 .So in the paper, zero-order T-S model is adopted for fuzzy neural network. And corresponds to zero-order T-S model, a 5-input 1-output zero-order TakagiSugeno fuzzy neural network was constructed, as shown in Fig.4. Of which, five inputs of the network are: Size parameter (S), aspect ratio (η ), Red intensity (R), Green intensity (G), Blue intensity (B). The outputs of the network is the classification results ,1 express ‘yes’, 0 express ‘not’. Only two inference rules were required to identify certain fruit category. According to the expression method of zero-order T-S model, the rules can be expressed as follows: Rule 1: if
~ S
x1
is
x1
is not
z =1 Rule 2: if not
~ B , then z = 0
and
~ S
x2 or
~
is η and
x2
x3
~
is
is not η or
~ R x3
and
is not
x4
~ R
is
~ G
or
x5
is
~ B , then
is not
~ G
or
and
x4
x5 is
Study on Machine Vision Fuzzy Recognition Based on Matching Degree x1
463
O1 O2
x2
O3 O4
x3 x4 x5
Fig. 4. Zero-order T-S fuzzy neural network structure
The first layer: this layer has five nodes, each node is square node expressed by a membership function.
O1i = μ A~i ( xi ) , i=1,2,…,5
(3)
Where xi is the input node i , Ai is the language variable related with the node’s
O1i is the ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ membership function of fuzzy sets A ( A = A1 , A2 , A3 , A4 , A5 = S , η , R , G , B ). function value, such as fruit’s geometry size and color parameters, i.e,
As mentioned above, the node function of this layer adopts ladder membership function and its corresponding expression is Eq.(1), the parameters a, b, c, d are the adjustable parameters, which are determined by fuzzy neural network after learning and training. The second layer: each node of this layer represents one fuzzy rule, whose function is used to match the antecedent of fuzzy rule, and calculate the application degree of each rule. For a given fruit to be identified, only two reasoning rules are needed. So this layer has two nodes, the output of this layer is: 5
O2 j = w j = ∏ μ A~ ( xi ) , j=1,2
(4)
i
i =1
The third layer: the nodes of this layer are the same as the second layer, it is used for normalization.
O3 j = w j =
wj 2
∑w i =1
, j=1,2
(5)
i
The fourth layer: this layer is the output layer, whose single-node is a fixed node. According to the requirement of fruit classification, its membership function is selected as constant, the parameter r is adjustable, determined by the fuzzy neural network’s training, too. The output of this layer is: 2
O4 = ∑ wi r i =1
(6)
464
J. Lei, T. Wang, and Z. Gong
The designed fuzzy neural network above is virtually a multi-layer forward-feed network, and can adopt mature learning algorithm. The BP neural network learning algorithm was adopted to carry out training of adjustable parameters in the network above.
4 The Membership Function for Each Characteristic
Taking the recognition of apples from other fruits of similar shape and color as an example, the principle above is used to determine the membership functions of the apple's five characteristics. With the experimental system shown in Fig. 1, 1030 sets of measured characteristic data were obtained, of which 510 are apples and 520 are other fruits; representative data are shown in Table 1, where 1 means 'apple' and 0 means 'other type'. Samples No. 1 to No. 1000 in Table 1 are used to train the fuzzy neural network, and samples No. 1001 to No. 1030 are used for actual identification. The network model was constructed as shown in Fig. 5, and the BP algorithm was applied for 200 training epochs, giving a training error of 0.048688, as shown in Fig. 6. Fig. 7(a) shows the initial membership function of each characteristic before training and Fig. 7(b) the membership function after training. The trained parameters a, b, c, d of the membership function of each apple characteristic are listed in Table 2.

Table 1. Measured data of the characteristics for two kinds of fruits

No.    S       η       R       G       B       Fruit Category
1      8.9003  1.0693  9.2137  0.9720  1.7826  1
2      6.2488  0.9355  6.0165  2.4347  2.7951  0
3      8.5008  1.2249  8.4066  1.5423  0.7430  1
…      …       …       …       …       …       …
1029   9.5902  1.3397  6.3854  3.1825  3.7927  0
1030   9.5640  1.3072  7.3388  3.8684  2.4556  0
Table 2. Training results of the parameters of the membership function of each apple characteristic

Characteristic    a        b        c        d
S                 5.533    7.075    8.857    11.02
η                 0.8725   1.008    1.285    1.481
R                 5.495    8.211    –        –
G                 1.808    4.307    –        –
B                 1.798    4.321    –        –
Similarly, for other types of fruit, such as orange or peach, the membership functions of the characteristics take the same form as for the apple, and the corresponding parameters can be obtained by training with the measured data of those fruits. In the system shown in Fig. 1, if more than two fruit categories need to be identified at the same time, the above process has to be repeated for each category, and the membership functions of the characteristics of each kind of fruit must be established.
Fig. 5. Fuzzy neural network model
Fig. 6. Training error curve

(a) Before training    (b) After training
Fig. 7. Membership function of characteristics before and after training
5 Recognition Results Analysis
Taking the apple as an example, measured data No. 1001–1030 in Table 1 were used to test the trained neural network, as shown in Fig. 8. The test results show that the average test error is 0.025126, so it can be concluded that the established fuzzy neural network model has high modeling accuracy.
Fig. 8. Testing result of neural network modeling accuracy
Measured data No. 1001–1030 in Table 1 were then used to calculate the matching degree of each characteristic from its membership function, and Eq. (2) was used to calculate the total matching degree with respect to the apple's characteristics, assuming that the five characteristics have the same weight. Table 3 contrasts the results calculated with Eq. (2) with the outputs of the fuzzy neural network.

Table 3. Calculated results of the total matching degree from the measured data

No.    S       η       R       G       B        Fruit Category   Total Matching Degree   Fuzzy Neural Network
1001   7.0317  1.1885  9.6249  0.4353  0.81088  1                0.9944                  0.9875
1002   8.1398  1.1473  8.2588  1.1819  1.7969   1                1                       1.0117
1003   8.1886  1.0906  9.0883  0.4724  0.32107  1                1                       1.0167
1004   7.9066  1.2224  9.2018  1.1683  0.54112  1                1                       1.0167
1005   7.11    1.2027  8.6162  0.3822  1.5317   1                1                       1.0167
1006   7.9482  1.0772  8.6504  0.1689  1.3237   1                1                       1.0167
1007   8.033   1.0638  8.97    0.8226  0.41756  1                1                       1.0167
1008   7.4903  1.0119  9.7707  0.0695  1.2446   1                1                       0.9943
1009   8.0149  1.0876  8.3668  0.9028  1.1542   1                1                       1.0167
1010   7.7177  1.2216  8.1721  0.8937  0.4546   1                0.9971                  0.9951
1011   6.6594  0.9923  7.0475  3.711   3.3555   0                0.5615                  -0.018
1012   6.1297  0.9092  7.3248  3.4218  3.658    0                0.3897                  -0.048
1013   6.1353  0.9028  7.7785  3.102   3.3369   0                0.4655                  -0.034
1014   6.9616  0.978   6.5671  3.5436  2.8488   0                0.5978                  0.0023
1015   6.1503  0.9103  7.3889  2.7716  3.9933   0                0.4242                  -0.048
1016   9.2912  0.9883  6.0809  2.1085  3.2855   0                0.6319                  0.0050
1017   9.7455  0.9763  7.5654  3.7615  2.4104   0                0.6186                  0.0101
1018   9.7732  0.9921  6.939   3.0143  2.9806   0                0.6079                  0.0252
1019   9.1551  0.9619  7.5163  3.0213  3.8767   0                0.5914                  -0.009
1020   9.4046  0.9997  6.7528  3.2086  3.1895   0                0.6074                  0.0140
1021   6.3786  1.3848  7.8428  2.6988  3.9306   0                0.5404                  -0.047
1022   6.701   1.362   7.8663  2.2876  3.3827   0                0.6835                  0.0281
1023   6.8608  1.3403  7.5028  3.207   2.8223   0                0.6705                  0.0467
1024   6.6569  1.3375  6.865   2.2188  3.9775   0                0.5874                  -0.028
1025   6.586   1.366   6.7029  3.699   2.357    0                0.5472                  -0.034
1026   9.3573  1.3535  6.911   3.0178  3.0682   0                0.5906                  -0.006
1027   9.686   1.3011  7.4164  3.3038  3.4968   0                0.5940                  0.0004
1028   9.323   1.3682  7.1952  2.9257  3.5357   0                0.5700                  -0.026
1029   9.5902  1.3397  6.3854  3.1825  3.7927   0                0.4738                  -0.045
1030   9.564   1.3072  7.3388  3.8684  2.4556   0                0.6307                  0.0017
From the calculated results it can be concluded that the total matching degree falls into two cases. In the first case the result is equal to 1 or very close to 1 (greater than 0.99), which shows that the unknown fruit is an apple. In the second case the result is less than 0.7; because the matching degree is low, the unknown fruit is not an apple. The identification results agree completely with the actual categories.
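The decision logic just described can be summarised in a few lines. The sketch below assumes that the total matching degree of Eq. (2) is the equally weighted average of the five per-characteristic matching degrees and that 0.99 and 0.7 are the accept/reject thresholds; both are readings of the text above rather than definitions taken from the equation itself.

```python
def total_matching_degree(memberships, weights=None):
    # Equal weights across the five characteristics, as assumed in Section 5.
    if weights is None:
        weights = [1.0 / len(memberships)] * len(memberships)
    return sum(w * m for w, m in zip(weights, memberships))

def classify(total_degree):
    if total_degree > 0.99:
        return "Apple"
    if total_degree < 0.7:
        return "not Apple"
    return "undecided"          # no sample in Table 3 falls in this band

print(classify(total_matching_degree([1.0, 1.0, 0.97, 1.0, 1.0])))   # Apple
print(classify(total_matching_degree([0.9, 0.8, 0.5, 0.3, 0.3])))    # not Apple
```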
6 Conclusion
This paper presents a new method for fuzzy recognition of fruit categories based on the calculation of the total matching degree over the fruit's multiple characteristics. The experiments show that the method has high recognition accuracy, and the proposed recognition algorithm is simple and easy to implement in an embedded system. A ladder membership function is used to express each characteristic, and its parameters are determined by learning and training a fuzzy neural network with a large amount of sample data. The proposed method has good generality; it could be developed into a common modular recognition component for robot systems and is expected to be applicable to speaker recognition based on voice features, word recognition for service robots, and so on. It should be noted that when the category or batch of fruit to be identified changes, i.e. the fruit characteristics differ from the training data, the parameters of the membership functions of the new fruit's characteristics must be re-determined by training the fuzzy neural network.
7 Future Work
To develop the method proposed in this paper, the following issues should be studied further. (1) The weighting factors of each characteristic in Eq. (2) should be determined in a well-founded way. (2) A method for determining the judging threshold should be studied; the most reasonable threshold, consistent with the actual reasoning, can be determined through further experiments or simulations. (3) The method could be applied to other recognition systems, for example word recognition and speaker recognition, which needs to be studied in depth.
Acknowledgment. This research work is supported by the High Technology Research and Development Program of China (2007AA0417).
References 1. Luo, R.C., Tsai, A.C., Liao, C.T.: Face Detection and Tracking for Human Robot Interaction through Service Robot. In: The 33rd Annual conference of the IEEE Industrial Electronics Society, Taiwan (2007)
2. Yang, L., Dickinson, J., Wu, Q.M.J., et al.: A Fruit Recognition Method for Automatic Harvesting. In: Mechatronics and Machine Vision in Practice, M2VIP, pp.152–157 (2007) 3. Zhao, J., Tow, J., Katupitiya, J.: On-tree fruit recognition using texture properties and color data. In: IEEE/RSJ International Intelligent Robots and Systems Conference, pp.263–268 (2005) 4. http://www.is.aist.go.jp/rt 5. Ikezoe, A., Nakamoto, H., Nagase, M.: Development of RT-middleware for Image Recognition Module. In: SICE-ICASE International Joint Conference, pp. 2036–2041 (2006) 6. Liqiang, C., Zhong, L., Xiaodong, T.: Classification and recognition of underwater target based on edge feature. Automation Technology and Application 26(8), 77–79 (2007) 7. Chen, Z., Nanfeng, X.: Design and implementation of an automatic recognition robot for fingerprints and voices. Computing Technology and Automation 25(2), 113–116 (2006) 8. Guler, I., Ubeyli, E.D.: Application of adaptive neuro-fuzzy inference system for detection of electrocardiographic changes in patients with partial epilepsy using feature extraction. Expert Systems with Applications 27(3), 323–330 (2004)
Application and Numerical Simulation on Water Mist Cooling for Urban Environment Regulation Junfeng Wang1,*, Xincheng Tu2, Zhentao Wang1, and Jiwei Huang1 1
School of Energy and Power Engineering, Jiangsu University, Zhenjiang, Jiangsu, China 2 School of Mechanical and Aerospace Engineering, Gyeongsang [email protected], [email protected]
Abstract. Fine water mist is a sustainable and environment-friendly cooling technology. This paper concerns the use of water mist flow to improve the quality of the urban environment in summer. Based on a survey and analysis of the energy-saving potential of household air conditioning, and on calculations of the energy consumption and carbon reduction of a spray cooling system, a theoretical basis is provided for popularizing this technology for regulation of the urban environment. To characterize the cooling performance of the water mist flow, CFD simulation was used as a design tool to investigate the temperature and relative humidity distributions for different relative Reynolds numbers and environmental conditions, with the discrete phase model used to simulate the water mist cooling process in terms of microclimate improvement. As the relative Reynolds number increases, the cooling area expands because of the increased spray penetration length, and the rise in relative humidity is reduced. The temperature distribution for the semi-outdoor case indicates that the presence of a roof, and the height at which it is fixed, influence the cooling effect. Provided the cooled space is well ventilated, a change in the environmental condition does not noticeably affect the relative humidity distribution. The results show that numerical simulation can be used in a predictive manner to optimize the performance of the spray cooling process in the outdoor environment.
Keywords: two-phase flow, spray cooling, regulation of microclimate, numerical simulation.

1 Introduction
Human beings suffer from serious environmental problems such as heat stress, lack of comfort and poor air quality. The urban heat island (UHI) has become the main characteristic of the urban environment in summer, and every year it increases the cooling load of buildings several times over. Traditional methods do not meet the demands of energy-saving techniques for a sustainable and green built environment. The energy consumption of air conditioning results in the emission of carbon dioxide and contributes to the greenhouse
Corresponding author.
K. Li et al. (Eds.): LSMS/ICSEE 2010, LNBI 6330, pp. 469–480, 2010. © Springer-Verlag Berlin Heidelberg 2010
effect. Increased use of air conditioning creates a serious peak electricity load problem for utilities and increases the cost of electricity. In many cities the structure of building energy consumption is characterized by a high share of air conditioning; in Chongqing, for example, the power consumed by air conditioning in summer exceeded 50% of the total yearly consumption in 2006. In China the peak electricity demand for air conditioning exceeded 50,000,000 kW in 2005, and on this basis it is expected to reach 80,000,000 kW in 2010 and 160,000,000 kW in 2020. Air conditioning is only suitable for improving comfort in indoor environments, and its cooling effect is limited. Several newly developed technologies have proven potential for decreasing the need for cooling and improving indoor environmental conditions, but little equipment has been developed to regulate the quality of the outdoor environment. It is therefore necessary to develop a green cooling method that improves the energy performance of the built environment and increases the use of the outdoor environment for energy-efficient activities such as walking and sports. In recent years there has been broad public interest in improving the microclimate of outdoor and semi-outdoor environments, and the microclimate has begun to be viewed as an important element in the evaluation of the urban environment.
Spray cooling is a well-known and frequently used method of heat removal in many processes. It takes advantage of phase change in a hot environment, uses less cooling fluid, and covers a larger area than a sprinkler. Spray cooling provides additional heat removal, which is thought to be due to the additional surface area and the evaporation of the droplets [1]. The removal of large heat fluxes is becoming a barrier in the technology roadmaps for microprocessors, power electronic modules and many other applications incorporating microelectronic or microphotonic devices. The spray parameters (droplet size, velocity and distribution) are crucial for spray cooling performance, as indicated by many authors [2, 3]. In the past few years many researchers have applied the evaporative cooling performance of fine water mist to the metallurgical industry, fire suppression, cooling in tunnels and the food industry, but few works in the literature are dedicated to improving the outdoor environment in summer. Kachhwaha [4, 5] and Sureshkumar [6, 7, 8] presented experimental studies and numerical simulations of the heat and mass transfer between a water spray and ambient air. Yamada proposed a pilot study on designing and predicting the performance of a water mist system [9], in which the difference in temperature reduction for different droplet diameters was simulated. However, the area of temperature reduction achievable by high-pressure atomization is limited: the droplets atomized at high pressure cannot be scattered quickly by the natural wind, so the cooling performance is limited and the relative humidity in the local environment rises markedly because the droplets congregate locally. Jun-feng Wang [10] developed a two-phase-flow spray cooling system based on low-pressure atomization and described the influence of the natural wind on the temperature and relative humidity distributions.
Based on this review of spray cooling technology, there are few reference data and little literature concerning the design of spray cooling systems for microclimate improvement in summer. The most important decisions affecting the cooling performance of a fine water mist system in the microclimate are taken in the initial stage of design. In general, design evaluation and optimization
can be supported by the results of simulation, by simplified guidelines, or by expert advice based on experience [11]. The latter two approaches rely on engineering experience and experimental investigation; since the spray cooling process is unsteady, experiments take a long time to gather sufficient data, and there is little material on the design of spray cooling systems for microclimate regulation. Numerical simulation is therefore well suited to the final stage of design and optimization once the layout of the outdoor environment has been planned. In this paper, the energy consumption of a representative community is calculated and analysed to demonstrate the energy-saving potential of household air conditioning, and fine-water-mist cooling is proposed to improve the quality of the urban environment in summer; the energy consumption and carbon reduction of the spray cooling system are calculated and analysed to provide a theoretical basis for popularizing this technology for environmental regulation in urban public open space. Computational fluid dynamics (CFD) calculations are carried out to study the cooling performance of the fine water mist system. The Reynolds number at the outlet and the environmental conditions are critical for the cooling performance of the fine water mist flow; to characterize the influence of these parameters on droplet evaporation and transport, two cases are investigated: different air jet velocities and different environmental conditions.
2 Energy-Efficient Cooling Method
2.1 Calculation and Analysis of Air Conditioning Performance
Take a representative urban community of 3000 households as an example, with 120 air conditioners per 100 households. The power of each air conditioner is about 1.0 kW and it operates 10 hours a day for two months in summer, so the total power consumption of this community in summer is about 2,160,000 kWh. In 2007 the standard coal consumption for generating 1 kWh was 334 g, so this power consumption corresponds to 721.44 tons of standard coal. At an electricity price of 0.5 RMB per kWh the electricity cost is 1,080,000 RMB, and at a specific emission of 638 g of carbon dioxide per kWh the carbon dioxide emission is 1378 tons. A method that increases the use of the outdoor environment for energy-efficient activities (such as walking and sports) can therefore reduce the operating time of air conditioners. Reducing the operating time by two hours per day reduces the total summer power consumption of this community by about 432,000 kWh, the electricity cost by 216,000 RMB, the carbon dioxide emission by 275.6 tons and the standard coal consumption by 144 tons. This calculation demonstrates the energy-saving potential of household air conditioning. Urban residents should be encouraged to take physical exercise outdoors so as to reduce the operating time of air conditioning, and energy saving and emission reduction can then be achieved by using a low-carbon technology to improve the quality of the microclimate.
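The community-level figures above follow from simple unit arithmetic; the sketch below reproduces them, with the ownership, power and usage assumptions taken directly from the text.

```python
households = 3000
ac_per_100_households = 120
ac_power_kw = 1.0
hours_per_day, days = 10, 60            # two summer months

num_ac = households * ac_per_100_households / 100
total_kwh = num_ac * ac_power_kw * hours_per_day * days
coal_t = total_kwh * 334 / 1e6          # 334 g standard coal per kWh
cost_rmb = total_kwh * 0.5              # 0.5 RMB per kWh
co2_t = total_kwh * 638 / 1e6           # 638 g CO2 per kWh
print(total_kwh, coal_t, cost_rmb, co2_t)   # 2,160,000 kWh, ~721 t, 1,080,000 RMB, ~1378 t

# Reducing use by two hours per day:
saved_kwh = num_ac * ac_power_kw * 2 * days
print(saved_kwh, saved_kwh * 0.5, saved_kwh * 638 / 1e6, saved_kwh * 334 / 1e6)
```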
2.2 Calculation and Analysis of Energy Consumption and Carbon Reduction
A schematic of the spray cooling system is shown in Fig. 1. The system mainly consists of a rotational configuration, a fan and atomization nozzles, appropriately connected; the rotational configuration adjusts the direction of the air jet flow.

Fig. 1. Schematic diagram of the spray cooling system (control system, fan with atomization nozzles, rotational device, water and power supplies, producing a two-phase air-mist jet)
The energy consumption and the carbon reduction are the two most important factors in system design. Fine-water-mist cooling for regulating large spaces and improving the microclimate can be realised by either high-pressure or low-pressure atomization, with the droplets scattered by the air flow. For the fine water mist flow, a round air jet is used to enhance the disturbance of the two-phase flow so that the droplets are transported over a long distance [12, 13]; as the droplets evaporate, they improve the quality of the microclimate. In this section the energy consumption and carbon reduction are calculated. Based on the pilot study of the spray cooling system [10, 13], Table 1 compares the energy consumption of three cooling methods: high-pressure atomization cooling, low-pressure atomization cooling and sprinkler cooling. The comparison indicates that the water consumption of spray cooling is 0.3~0.48 L/h·m², less than 2.5% of that of sprinkler cooling. The droplets formed by high-pressure atomization are smaller than those formed by low-pressure atomization, so their lifetime is relatively shorter, and the energy consumption of high-pressure atomization is also higher than that of low-pressure atomization.

Table 1. Energy consumption comparison of different cooling methods

Item                       High Pressure Atomization   Low Pressure Atomization   Sprinkler
Water Consumption          0.3~0.48 L/h·m²             0.24 L/h·m²                20 L/h·m²
Electricity Consumption    12~20 Wh/L                  0.7~1.0 Wh/L               –
Atomization Consumption    2 W/L                       0.42~0.92 W/L              –
Table 2. Carbon reduction of the spray cooling system

Item                                    Power Plant   Air Conditioning              Spray Cooling System
Energy generated (consumed) per hour    1 kWh         0.012 kWh/m²·℃                0.2 kWh
Discharge of carbon dioxide             638 g         0.012×638 = 7.656 g/m²·℃      0.2×638÷3÷157 = 0.27 g/m²·℃

Spray cooling is an effective way to reduce carbon dioxide emissions while improving the quality of the urban environment. For example, the energy consumption of air conditioning is about 12 Wh for each 1 m² of area and each 1 ℃ of temperature decrease. The power of the fan is 200 W and the cooled area is 157 m²; if the temperature decrease is 3 ℃, the carbon emission attributable to the spray cooling system can be calculated as in Table 2, which compares the carbon dioxide emission of a power plant, an air conditioner and the spray cooling system. The spray cooling system accounts for about 0.27 g/m²·℃, approximately 3.5% of the carbon dioxide emission of air conditioning.
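The 'per square metre and per degree' comparison in Table 2 can be reproduced as follows; the fan power, cooled area and 3 ℃ temperature drop are the values quoted in the text.

```python
grid_co2_g_per_kwh = 638
ac_kwh_per_m2_degC = 0.012
fan_kw, cooled_area_m2, temp_drop_degC = 0.2, 157, 3

ac_co2 = ac_kwh_per_m2_degC * grid_co2_g_per_kwh                          # ~7.66 g/(m2*degC)
mist_co2 = fan_kw * grid_co2_g_per_kwh / temp_drop_degC / cooled_area_m2  # ~0.27 g/(m2*degC)
print(ac_co2, mist_co2, mist_co2 / ac_co2)                                # ratio ~3.5 %
```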
3 Numerical Simulation Procedure
3.1 Numerical Simulation
The simulation is carried out with FLUENT, a commercial computational fluid dynamics (CFD) code, using a three-dimensional configuration. The governing equations are solved with the finite-volume method on the grid system. The geometrical model and the computational grid are created with the Gambit pre-processing software; an unstructured, non-uniform mesh with a hexahedral scheme is used, and the numerical results are obtained with 570,000 cells. A view of the geometry and mesh of the calculation volume is given in Fig. 2. The numerical simulation is conducted in a large space, 15 × 10 m² in plan and 4 m in height; Fig. 2 shows the geometry of the simulation area. A fan consisting of a cylinder 0.35 m in diameter and 0.3 m in height is placed along the X-axis direction at a height of 2 m above the ground, and the nozzles are fixed at the outlet of the fan. This space is similar in size, geometry and construction to the outdoor environmental conditions in summer. The SIMPLE algorithm is used to couple pressure and velocity in the continuity equation, and the second-order upwind differencing scheme is used for the convection terms of the flow and all transported variables to reduce numerical diffusion. The development of temperature and relative humidity is described by the conservation laws of mass, momentum and energy.
Fig. 2. Geometry model and boundary conditions
3.2 Turbulence Model
The numerical modeling of turbulent flows involves modifying the governing equations of laminar flow by a time-averaging procedure known as Reynolds averaging [14]. In the two-phase cooling process the influence of buoyancy on the flow field cannot be neglected; since the buoyancy generation term is zero in the standard k-ε model for a round turbulent jet, the turbulent transport is modeled with the two-equation realizable k-ε model.
3.3 Spray Flow Model
The discrete phase model (DPM) option in FLUENT solves the equation of motion for a discrete phase dispersed in the continuous phase. The motion of each particle of the dispersed phase is governed by an equation that balances the mass-acceleration of the particle with the forces acting on it; appropriate forces, such as drag and gravity, are incorporated into the equation of motion [15].
3.4 Boundary and Initial Conditions
The precision of the numerical solution depends strongly on the accuracy of the boundary conditions and on the way these conditions are integrated into the numerical model. Fig. 2 shows the geometry of the calculation area and the boundary conditions used. The ground and the shell of the fan are defined as wall boundaries; the standard wall function is adopted in the near-wall zone and the no-slip condition is enforced at the walls. A thermal condition of combined external radiation and convection heat transfer is applied at the wall boundary. The shell of the fan is made of steel and the ground material is defined as concrete; the temperature and external emissivity of the ground are 50 ℃ and 0.71, respectively, and the thickness of the ground is 0.15 m. The relative humidity of the air is 60% and the air temperature is 35 ℃. The main direction of the air jet flow is along the X axis. There is no evident velocity gradient in the X = 0 m YZ plane, the X = 15 m YZ plane, the Y = 5 m XZ plane or the Y = −5 m XZ plane, so the outflow velocity on these planes is set to 0.01 m/s. A velocity-inlet boundary is prescribed at the outlet of the fan.
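A minimal illustration of the discrete-phase treatment described in Section 3.3 is given below. It integrates the equation of motion of a single droplet under Stokes drag and gravity only; the droplet size, the constant air velocity, and the omission of evaporation and turbulent dispersion are simplifying assumptions, not the FLUENT settings used in the paper.

```python
import numpy as np

rho_p, rho_a = 1000.0, 1.145        # water droplet / air density, kg/m3
mu_a = 1.9e-5                       # air dynamic viscosity, Pa*s
d = 60e-6                           # droplet diameter, m (mean size used in Section 4.1)
tau = rho_p * d**2 / (18 * mu_a)    # Stokes relaxation time, s
g = np.array([0.0, 0.0, -9.81])

u_air = np.array([6.0, 0.0, 0.0])   # local air velocity (illustrative)
v = np.array([10.0, 0.0, 0.0])      # droplet injected faster than the surrounding air
x = np.array([0.0, 0.0, 2.0])       # nozzle height 2 m

dt = 1e-3
for _ in range(2000):               # up to 2 s of flight
    a = (u_air - v) / tau + g       # drag + gravity, per unit droplet mass
    v = v + a * dt
    x = x + v * dt
    if x[2] <= 0.0:                 # droplet reaches the ground
        break

re_rel = rho_a * np.linalg.norm(u_air - v) * d / mu_a   # relative Reynolds number
print(x, v, re_rel)
```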
4 Analysis Results and Discussion
4.1 Cooling Performance for Different Relative Reynolds Numbers
The relaxation time is the time a perturbed system takes to return to equilibrium, and each relaxation process can be characterized by such a time; in other words, how quickly the temperature and relative humidity distributions stabilize depends on the relative Reynolds number. In this section the cooling performance is simulated for different air velocities. Figs. 3-6 present the numerical results for the evolution of temperature and relative humidity along different directions for air velocities of 6 m/s and 10 m/s. The mass flow rate of the spray is 0.012 kg/s and the mean droplet diameter is 60 µm.

Fig. 3. Air temperature contour in the Y = 0 m XZ plane (℃): (a) 6 m/s; (b) 10 m/s

Fig. 4. Relative humidity contour in the Y = 0 m XZ plane (%): (a) 6 m/s; (b) 10 m/s

Fig. 5. Air temperature contour in the Z = 1.7 m XY plane (℃): (a) 6 m/s; (b) 10 m/s

Fig. 6. Relative humidity contour in the Z = 1.7 m XY plane (%): (a) 6 m/s; (b) 10 m/s
Figs. 3 and 5 show contour plots of the temperature for the two air velocities; the Z = 1.7 m XY plane corresponds to the perception zone of the human body. At a velocity of 6 m/s the temperature falls below 32 ℃ at X = 4~10 m in the Y = 0 m XZ plane, whereas at 10 m/s it is around 32 ℃ over X = 3~13 m. Compared with the former case, the higher velocity produces a larger cooling extent in the perception zone of the human body, where the temperature of the calculation area is around 33.4 ℃.
Figs. 4 and 6 compare the relative humidity contours of the two velocities during the cooling process. At 6 m/s the spray penetration length is not sufficient for long-distance transport; the droplets evaporate at X = 2~6 m, so the relative humidity rises above 70% in this zone. At 10 m/s the relative humidity increases to 64%~67% at X = 2~6 m. Compared with the former case, the higher velocity gives a lower humidity distribution in the perception zone of the human body, the relative humidity of the calculation area being about 64%. This comparison shows that as the relative Reynolds number increases the cooling area expands, owing to the longer spray penetration length, while the rise in relative humidity is reduced by the higher air-jet velocity.
4.2 Cooling Performance for Different Environmental Conditions
A number of previous investigations have demonstrated that the risk of heat stress is much greater in the shaded upper tier of seating than in open, unshaded seating, because of the reduced ventilation and the increased radiant heat from semi-transparent roof material [16]. Jennifer Spagnolo [17] defined a semi-outdoor environment as one that is still exposed to the outdoor environment in most respects but includes some man-made device to moderate the microclimate, for example a roof used as a radiation shield or a wall used as a vertical wind break. The environmental condition is therefore an important factor influencing human comfort, and the cooling system has to be designed for use under different environmental conditions. For the numerical simulation in this section, part of the boundary conditions of the model is changed as follows. For the semi-outdoor case the ground temperature is 39 ℃ and the external radiation heat transfer at the ground is ignored because of the roof; the thermal condition of combined external radiation and convection heat transfer is applied at the roof boundary, the radiant power of sunshine is 600 W/m², and the roof material is PVC. For the outdoor case the ground temperature is 55 ℃. The velocity at the fan outlet is 10 m/s. Figs. 7-10 show the numerical results for the temperature and relative humidity in the Y = 0 m XZ plane and the Z = 1.7 m XY plane for the two environmental conditions, semi-outdoor and outdoor.
Figs. 7 and 9 present the temperature distribution in the semi-outdoor and outdoor environments. The figures clearly indicate that the transport and evaporation of the droplets are significantly affected by the roof boundary. In the semi-outdoor environment the temperature decreases by 3~5 ℃, is mostly below 32.2 ℃, and is very evenly distributed; in the outdoor environment the air temperature decreases by about 1~2 ℃ in the vertical and horizontal cross-sections along the axial direction. The temperature distribution indicates that the presence of a roof, and the height at which it is fixed, influence the cooling effect.

Fig. 7. Air temperature contour in the Y = 0 m XZ plane (℃): (a) semi-outdoor environment; (b) outdoor environment

Fig. 8. Relative humidity contour in the Y = 0 m XZ plane (%): (a) semi-outdoor environment; (b) outdoor environment

Fig. 9. Air temperature contour in the Z = 1.7 m XY plane (℃): (a) semi-outdoor environment; (b) outdoor environment

Fig. 10. Relative humidity contour in the Z = 1.7 m XY plane (%): (a) semi-outdoor environment; (b) outdoor environment

Figs. 8 and 10 present the relative humidity distribution in the semi-outdoor and outdoor environments. In the semi-outdoor case the confinement by the roof does not cause an accumulation of water vapour: the water mist still scatters sufficiently because of the disturbance of the air jet flow, and the relative humidity increases by only about 1~6%, although its overall tendency is upward. When several spray cooling systems are working simultaneously, the
relative humidity will increase sharply. Provided the cooled space is well ventilated, a change in the environmental condition does not noticeably affect the relative humidity distribution, and for the semi-outdoor condition the evolution of the relative humidity can be kept under control at the design stage.
5 Conclusion
This paper proposed a spray cooling technology to improve the quality of the urban environment. The analysis and calculation of the energy consumption of a representative community demonstrated the energy-saving potential of household air conditioning. The authors developed a fine-water-mist cooling technology for regulating large spaces of the local environment and calculated the associated energy consumption and carbon reduction, providing a theoretical basis for popularizing the technology. A numerical CFD approach was used as an assistant design tool to characterize the temperature and relative humidity contours of the spray cooling process, and the cooling performance for different air jet velocities and environmental conditions was discussed. As the relative Reynolds number increases, the cooling area expands because of the longer spray penetration length, while the rise in relative humidity is reduced by the higher air-jet velocity. The temperature distribution for the semi-outdoor case indicates that the presence of a roof, and the height at which it is fixed, influence the cooling effect. Provided the cooled space is well ventilated, a change in the environmental condition does not noticeably affect the relative humidity distribution, and for the semi-outdoor condition the evolution of the relative humidity can be controlled at the design stage.
References 1. Pavlova, A., Kiyoshi, O., Michael, A.: Active performance enhancement of spray cooling. Anna International Journal of Heat and Fluid Flow 29, 985–1000 (2008) 2. Lefebvre, A.H.: Atomization and Sprays. Hemisphere Publishing Corporations, pp. 1–417 (1989) 3. Pimentel, R.G., de Champlain, A., Kretschmer, D., et al.: Generalized Formulation for Droplet Size Distribution in a Spray. AIAA paper 2006-4918 (2006) 4. Kachhwaha, S.S., Dhar, P.L., Kale, S.R.: Experimental studies and numerical simulation of evaporative cooling of air with a water spray-I. Horizontal parallel flow. International Heal Mass Transfer 41, 41–447 (1998) 5. Kachhwaha, S.S., Dhar, P.L., Kale, S.R.: Experimental studies and numerical simulation of evaporative cooling of air with water spray- II. Horizontal counter flow. International Heal Mass Transfer 41, 465–474 (1998) 6. Sureshkumar, R., Kale, S.R., Dhar, P.L.: Heat and mass transfer processes between a water spray and ambient air – I. Experimental data. Applied Thermal Engineering 28, 349–360 (2008) 7. Sureshkumar, R., Kale, S.R., Dhar, P.L.: Heat and mass transfer processes between a water spray and ambient air – II. Simulations, Applied Thermal Engineering 28, 361–371 (2008) 8. Sureshkumar, R., Dhar, P.L., Kale, S.R.: Effects of spray modeling on heat and mass transfer in air–water spray systems in parallel flow. International Communications in Heat and Mass Transfer 34, 878–886 (2007) 9. Yamada, H., Yoon, G., Okumiya, M., et al.: Study of Cooling System with Water Mist Sprayers: Fundamental Examination of Particle Size Distribution and Cooling Effects. Build Simul. 20, 214–222 (2008) 10. Wang, J., Tu, X.: Experimental Study and Numerical Simulation on Evaporative Cooling of Fine Water Mist in Outdoor Environment. In: International Conference on Energy and Environment Technology, pp. 156–159 (2009) 11. de Wilde, P., Augenbroe, G., van der Voorden, M.: Design analysis integration: supporting the selection of energy saving building components. Building and Environment 37, 807– 816 (2002) 12. Wang, J., Tu, X., Huang, J., et al.: Numerical simulation of cooling effect by fine water mist. Journal of Jiangsu University: Natural Science Edition 30, 591–595 (2009) (in Chinese) 13. Wang, J., Tu, X.: Numerical analysis of cooling process in two-phase flow by low pressure atomization in semi-outdoor environment. Drainage and Irrigation Machinery 27(4), 255– 260 (2009) (in Chinese) 14. Rajaratnam, N.: Turbulent Jets. Elsevier, Amsterdam (1976) 15. Moureh, J., Letang, G., Palvadeau, B., et al.: Numerical and experimental investigations on the use of mist flow process in refrigerated display cabinets. International journal of refrigeration 32, 203–219 (2009) 16. Fiala D.L.K.J.: Application of a computer model predicting human thermal responses to the design of sport stadia. Presented at Chartered Institution of Building Services Engineers National Conference (CIBSE 99), Harrogate, UK (1999) 17. Spagnolo, J., de Dear, R.: A field study of thermal comfort in outdoor and semi-outdoor environments in subtropical Sydney Australia. Building and Environment 38, 721–738 (2003)
Optimal Guaranteed Cost Control for Linear Uncertain System with Pole and H∞ Index Constraint Xianglan Han1 and Gang Zhang2 1
School of Information Science and Engineering, Ningbo Institute of Technology, Zhejiang University, Ningbo 315100, China 2 The Faculty of Maritime, Ningbo University, Ningbo 315211, China [email protected]
Abstract. This paper addresses the problem of optimal guaranteed cost reliable control for linear uncertain systems with regional pole and H∞ disturbance attenuation performance index constraints. Based on a practical and general model of continuous actuator gain failures, a reliable controller is designed that guarantees that the closed-loop system satisfies the pre-specified regional pole index and H∞ norm-bound constraint while simultaneously achieving optimal quadratic cost performance. The consistency of these performance indices is also established for fault-tolerant control. Necessary and sufficient conditions for the design of the optimizing guaranteed cost controller for linear uncertain systems are given in terms of LMIs, and a simulation example shows the effectiveness of the proposed method.
Keywords: optimal control, fault-tolerant control, actuator failure, guaranteed cost control.
1 Introduction
Actuator failures encountered in feedback control systems can cause serious performance deterioration and may lead to instability, possibly resulting in catastrophic accidents [1]. To improve system reliability in the presence of actuator failures, failure-tolerant control designs are desirable. Over the past forty years, reliable control system design methods have been divided into active fault-tolerant control and passive fault-tolerant control, according to whether a fault detection and diagnosis unit is present [2-5]. However, many contributions focus on maintaining the stability of the closed-loop system under faulty conditions, and other performance indices, such as transient behaviour, precision and the H∞ norm, are seldom considered. In fact, the performance that should be maintained after a fault occurs is usually multi-objective, beyond stability alone. Moreover, it is impossible to optimize all indices simultaneously when designing the controller, so it is desirable in practice for engineering systems that the performance indices be kept within an admissible region under normal and faulty conditions [6-8]. This note is concerned with passive fault-tolerant control of linear uncertain systems under actuator failure. Attention focuses on the design of a state-feedback
K. Li et al. (Eds.): LSMS/ICSEE 2010, LNBI 6330, pp. 481–489, 2010. © Springer-Verlag Berlin Heidelberg 2010
controller that guarantees, for all admissible uncertainties, disturbances and actuator failures, that the closed-loop system satisfies the pre-specified indices, i.e. the regional pole index, the H∞ norm index and the cost function index, simultaneously. Since these performance indices may conflict, their consistency is also discussed. Moreover, a more general actuator failure model is adopted that covers the typical working states of an actuator: normal operation, partial degradation and outage.
Notation: Rⁿ and Rⁿˣᵐ denote, respectively, the n-dimensional Euclidean space and the set of n × m real matrices. The superscript T denotes matrix transposition. I and 0 are, respectively, the identity matrix and the zero matrix of appropriate dimensions; diag[·] denotes a block-diagonal matrix. The notation P > 0, for P ∈ Rⁿˣⁿ, means that the matrix P is real symmetric positive definite. Λ(A) denotes the set of eigenvalues of the matrix A. Φ(q, r) denotes an open circular disk centered at (q, 0) with radius r.
2 Problem Formulation and Lemmas
Consider the following linear uncertain system:

ẋ(t) = (A + ΔA)x(t) + (B + ΔB)u(t) + Dω(t),   z(t) = Cx(t)        (1)

where x ∈ Rⁿ is the state vector, u ∈ Rᵖ is the control input vector, z ∈ Rᵏ is the output vector, and ω(t) ∈ R^q is a zero-mean white noise process with ||ω(t)||₂ ≤ β, uncorrelated with the initial state x(0); (A, B, C, D) are known real constant matrices of appropriate dimensions, and ΔA, ΔB are real matrix functions representing time-varying parameter uncertainties. The admissible uncertainties are assumed to be of the form [ΔA ΔB] = HF(t)[E1 E2], where F(t) is an unknown real time-varying matrix with Lebesgue measurable elements satisfying Fᵀ(t)F(t) ≤ I, and H, E1, E2 are known real constant matrices which characterize how the uncertain parameters in F(t) enter the nominal matrices A and B. Supposing that all states are available for feedback, and considering actuator faults that often occur in practical engineering systems, the closed-loop system under the faulty condition can be described as

ẋ(t) = (A_C + ΔA_C)x(t) + Dω(t),   z(t) = Cx(t)        (2)

where A_C = A + BMK, ΔA_C = HF(t)(E1 + E2MK), K is the feedback gain matrix to be designed, and M = diag[m1, m2, …, m_p] is the actuator fault matrix satisfying 0 ≤ m_il ≤ m_i ≤ m_iu ≤ 1, with m_il, m_iu known real constants. The diagonal matrix M represents the time-varying fault condition of the actuators: m_i = 1 means that actuator i operates normally; m_i = 0 covers the outage case, i.e. actuator i has failed completely; and 0 ≤ m_il < m_i < m_iu < 1 represents partial degradation of the actuator. This fault model is more flexible than some existing techniques in which only normal operation or complete failure is considered.
The cost function associated with the faulty system is defined as

J = ∫₀^∞ [xᵀ(t)Ux(t) + uᵀ(t)Ru(t)] dt = ∫₀^∞ xᵀ(t)[U + (MK)ᵀR(MK)]x(t) dt        (3)

where U = Uᵀ > 0 and R = Rᵀ > 0 are given positive-definite real matrices. The objective is to design a state feedback law u(t) = Kx(t) for the impaired linear uncertain system (2) such that the following performance indices are satisfied simultaneously:
(a) the poles of the faulty closed-loop system (2) lie in the circular disk Φ(q, r), where q and r are known real constants with q > r > 0;
(b) the H∞ norm of the transfer function matrix from ω(t) to z(t) satisfies ||H(s)||∞ < γ, where γ > 0 is a known real constant;
(c) the cost function (3) is optimized and its value does not exceed a given bound J*, i.e. J ≤ J*.
Before presenting the results, define the following matrices, which will be used in the proofs:

M0 = diag[m01, m02, …, m0p],   J = diag[j1, j2, …, j_p],   L = diag[l1, l2, …, l_p]

where m0i = (m_il + m_iu)/2, ji = (m_iu − m_il)/(m_iu + m_il) and li = (m_i − m0i)/m0i. Then M = M0(I + L) and |L| ≤ J ≤ I.
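The fault-model matrices M0, J and L defined above are straightforward to compute from the bounds m_il, m_iu; the sketch below uses the bounds of the numerical example in Section 4 (m_l = [0.75, 0.8], m_u = [1, 1]) purely for illustration.

```python
import numpy as np

m_l = np.array([0.75, 0.80])          # lower actuator-effectiveness bounds
m_u = np.array([1.00, 1.00])          # upper bounds (normal operation)

m0 = (m_l + m_u) / 2                  # M0 = diag(m0)
j  = (m_u - m_l) / (m_u + m_l)        # J  = diag(j)

# For any admissible fault m in [m_l, m_u]:
m  = np.array([0.9, 1.0])
l  = (m - m0) / m0                    # L = diag(l), so that M = M0 (I + L)
assert np.allclose(np.diag(m0) @ (np.eye(2) + np.diag(l)), np.diag(m))
print(m0, j, l, np.all(np.abs(l) <= j))   # |l_i| <= j_i <= 1
```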
We now introduce two lemmas, which follow easily from pole assignment theory and robust control theory.
Lemma 1: For the faulty system (2) and a given circular disk index Φ(q, r), the system (2) is asymptotically stable with Λ(A_C + ΔA_C) ⊂ Φ(q, r) and its H∞ norm satisfies ||H(s)||∞ < γ if and only if there exist a scalar γ > 0 and a positive-definite matrix Q = Qᵀ > 0 such that the following matrix inequalities hold:

(A_C + ΔA_C + qI)Q(A_C + ΔA_C + qI)ᵀ − r²Q < 0        (4)

(A_C + ΔA_C)Q + Q(A_C + ΔA_C)ᵀ + γ⁻²QDᵀDQ + CCᵀ < 0        (5)

The proof of Lemma 1 is straightforward; a similar proof can be found in [9-10]. Lemma 1 covers two indices, the pole index and the H∞ performance index, and does not change the necessary-and-sufficient character of pole assignment theory. The following Lemma 2 covers all three performance indices, the circular pole region Φ(q, r), the H∞ norm and the cost function J, simultaneously, but it gives only a sufficient condition for pole assignability.
Lemma 2: For the faulty system (2) and a given circular disk index Φ(q, r), if there exist a scalar γ > 0 and a positive-definite matrix Q = Qᵀ > 0 such that the following matrix inequalities hold:

(A_C + ΔA_C + qI)Q(A_C + ΔA_C + qI)ᵀ − r²Q < 0        (6)

(A_C + ΔA_C)Q + Q(A_C + ΔA_C)ᵀ + γ⁻²QDᵀDQ + CCᵀ + U + (MK)ᵀR(MK) < 0        (7)

then system (2) is asymptotically stable with Λ(A_C + ΔA_C) ⊂ Φ(q, r), the H∞ norm satisfies ||H(s)||∞ < γ, and J ≤ x0ᵀQx0 + γ²β.
Proof: If matrix inequalities (6) and (7) hold then, by Lemma 1, the closed-loop system (2) satisfies the pole index Λ(A_C + ΔA_C) ⊂ Φ(q, r) and the H∞ norm bound ||H(s)||∞ < γ. For the cost function index, define the Lyapunov function V(x) = xᵀ(t)Qx(t). Direct computation of V̇(x), together with inequality (7), gives

xᵀ(t)[U + (MK)ᵀR(MK)]x(t) ≤ −V̇(x) + γ²ωᵀ(t)ω(t).

Then

J = ∫₀^∞ xᵀ(t)[U + (MK)ᵀR(MK)]x(t) dt ≤ ∫₀^∞ [−V̇(x) + γ²ωᵀ(t)ω(t)] dt = x0ᵀQx0 + γ²β.

This ends the proof of Lemma 2.
3 Main Results
First, we give the LMI version of Lemma 1.
Theorem 1: Consider the system (2). For a given Φ(q, r), if and only if there exist scalars γ > 0 and εi > 0 (i = 1, 2, 3, 4) and matrices Q = Qᵀ > 0 and S such that the following LMIs hold:

[ Π    (A + qI)Q + BM0S    ε2BM0JM0E2ᵀ             0         ]
[ *    −rQ                 E1Q + E2M0S             SᵀJ^(1/2) ]  < 0        (8)
[ *    *                   −ε1I + ε2E2M0JM0E2ᵀ     0         ]
[ *    *                   *                       −ε2I      ]

[ Σ + ε4BM0JM0Bᵀ    QDᵀ     QE1ᵀ + SᵀM0E2ᵀ + ε4BM0JM0E2ᵀ    SᵀJ^(1/2) ]
[ *                 −γ²I    0                               0         ]  < 0        (9)
[ *                 *       −ε1I + ε4E2M0JM0E2ᵀ             0         ]
[ *                 *       *                               −ε4I      ]

where Π = −rQ + ε1HHᵀ + ε2BM0JM0Bᵀ and Σ = AQ + BM0S + (AQ + BM0S)ᵀ + ε3HHᵀ + CCᵀ, then for all admissible parameter uncertainties, disturbances and actuator faults, the state-feedback control law K = SQ⁻¹ makes the closed-loop system (2) satisfy the desired performance index (a), and the H∞ norm satisfies ||H(s)||∞ < γ.
Proof: By Lemma 1, the closed-loop system satisfies performance index (a) and ||H(s)||∞ < γ if and only if matrix inequalities (4) and (5) have a feasible solution. By the Schur complement, matrix inequality (4) is equivalent to

[ −rQ             (A_C + qI)Q ]
[ Q(A_C + qI)ᵀ    −rQ         ]  +  N1 F(t) [0  (E1 + E2MK)Q]  +  (N1 F(t) [0  (E1 + E2MK)Q])ᵀ  < 0,   N1 = [Hᵀ 0]ᵀ

For the uncertainty Fᵀ(t)F(t) ≤ I there exists a scalar ε1 > 0 such that

[ −rQ + ε1HHᵀ     (A_C + qI)Q     0               ]
[ Q(A_C + qI)ᵀ    −rQ             Q(E1 + E2MK)ᵀ   ]  < 0
[ 0               (E1 + E2MK)Q    −ε1I            ]

Considering M = M0(I + L), define

Y = [ −rQ + ε1HHᵀ          (A + BM0K + qI)Q    0                 ]
    [ Q(A + BM0K + qI)ᵀ    −rQ                 Q(E1 + E2M0K)ᵀ    ]
    [ 0                    (E1 + E2M0K)Q       −ε1I              ]

Then

Y + N2 L [0  KQ  0] + (N2 L [0  KQ  0])ᵀ < 0,   N2 = [(BM0)ᵀ  0  (E2M0)ᵀ]ᵀ

By the Schur complement there exists a scalar ε2 > 0 such that

[ Π    (A + qI)Q + BM0S    ε2BM0J(E2M0)ᵀ            ]
[ *    −rQ                 E1Q + E2M0S              ]  +  ε2⁻¹ [0  (KQ)ᵀ  0]ᵀ J [0  KQ  0]  < 0
[ *    *                   −ε1I + ε2E2M0J(E2M0)ᵀ    ]

Defining the new matrix S = KQ and applying the Schur complement again gives LMI (8). Similarly, for matrix inequality (5) we have

A_CQ + QA_Cᵀ + γ⁻²QCᵀCQ + DDᵀ + HF(t)(E1 + E2MK)Q + [HF(t)(E1 + E2MK)Q]ᵀ
  ≤ A_CQ + QA_Cᵀ + γ⁻²QCᵀCQ + DDᵀ + ε3HHᵀ + ε3⁻¹[(E1 + E2MK)Q]ᵀ[(E1 + E2MK)Q]
  = (A + BM0K)Q + Q(A + BM0K)ᵀ + BM0LKQ + (BM0LKQ)ᵀ + γ⁻²QCᵀCQ + DDᵀ + ε3HHᵀ + ε3⁻¹[(E1 + E2MK)Q]ᵀ[(E1 + E2MK)Q] < 0

By the Schur complement the above inequality can be rewritten as

[ Σ                 QCᵀ     Q(E1 + E2M0K)ᵀ ]
[ CQ                −γ²I    0              ]  +  N2 L [KQ  0  0] + (N2 L [KQ  0  0])ᵀ  < 0
[ (E1 + E2M0K)Q     0       −ε3I           ]

and LMI (9) then follows by the Schur complement.
A fault-tolerant controller satisfying the multiple performance-index constraints exists only if the index values are chosen within a feasible range, so before seeking such a controller one must first check whether the indices are feasible. In the following we always assume that the faulty closed-loop system (2) is pole-assignable with respect to Φ(q, r); LMIs (8)-(9) then hold for some matrix variables Q and S. The following result follows directly from Theorem 1.
Corollary 1: Assume the closed-loop system is pole-assignable with respect to Φ(q, r). Then LMIs (8)-(9) have a feasible solution and the following optimization problem is meaningful:

min γ²   over (Q, S, εi, γ²)   subject to LMIs (8)-(9)        (10)
Denote the minimum of the optimization problem (10) by γ_L. For the given pole index Φ(q, r) and an H∞ norm index γ > γ_L we then have:
Theorem 2: Given the regional pole index Φ(q, r), if the system is robust fault-tolerant state-feedback assignable, then any H∞ norm index γ > γ_L is consistent with Φ(q, r).
Similarly to Theorem 1, the following theorem gives a reliable guaranteed cost controller design method under the pole index and H∞ norm index constraints; it is the LMI form of Lemma 2.
Theorem 3: Consider the system (2). For the given Φ(q, r) and H∞ norm index γ > 0, if there exist scalars εi > 0 (i = 1, 2, 3, 4, 5) and matrices Q = Qᵀ > 0 and S such that the following LMIs hold:

[ Π    (A + qI)Q + BM0S    ε2BM0JM0E2ᵀ             0         ]
[ *    −rQ                 E1Q + E2M0S             SᵀJ^(1/2) ]  < 0        (11)
[ *    *                   −ε1I + ε2E2M0JM0E2ᵀ     0         ]
[ *    *                   *                       −ε2I      ]

[ Σ + ε4BM0JM0Bᵀ    QDᵀ     Ψ      SᵀJ^(1/2)   SᵀM0         SᵀM0J ]
[ *                 −γ²I    0      0           0            0     ]
[ *                 *       Ζ      0           0            0     ]  < 0        (12)
[ *                 *       *      −ε4I        0            0     ]
[ *                 *       *      *           ε5I − R⁻¹    0     ]
[ *                 *       *      *           *            −ε5I  ]

where Π = −rQ + ε1HHᵀ + ε2BM0JM0Bᵀ, Ψ = QE1ᵀ + SᵀM0E2ᵀ + ε4BM0JM0E2ᵀ, Σ = AQ + BM0S + (AQ + BM0S)ᵀ + ε3HHᵀ + CCᵀ + U, and Ζ = −ε1I + ε4E2M0JM0E2ᵀ, then for all admissible parameter uncertainties, disturbances and actuator faults, the state-feedback control law K = SQ⁻¹ makes the closed-loop system (2) asymptotically stable with Λ(A_C + ΔA_C) ⊂ Φ(q, r), the H∞ norm satisfies ||H(s)||∞ < γ, and J ≤ x0ᵀQx0 + γ²β.
Proof: LMIs (11) and (12) are the LMI form of the matrix inequalities of Lemma 2, and the proof follows the same procedure as that of Theorem 1. The only additional point is that if there exists a scalar ε5 > 0 satisfying R⁻¹ − ε5I > 0, then

MᵀRM ≤ M0ᵀ(R⁻¹ − ε5I)⁻¹M0 + ε5⁻¹M0ᵀJᵀJM0.

Theorem 3 provides a method for designing a reliable guaranteed cost controller under multiple performance-index constraints, whereas the following Theorem 4 provides a method for selecting a reliable controller that minimizes the upper bound of the guaranteed cost.
Theorem 4: For the given closed-loop system (2), pole index Φ(q, r), H∞ norm index γ > 0 and cost function (3), if the following optimization problem has a feasible solution (Q*, S*, εi*):

min  η + γ²β   over (Q, S, εi)        (13)
subject to  (Ⅰ) LMIs (11)-(12);
            (Ⅱ)  [ −η     x0ᵀQ ]
                 [ Qx0    −Q   ]  < 0

then for all admissible parameter uncertainties, disturbances and actuator faults, the state-feedback control law K = SQ⁻¹ makes the closed-loop system (2) asymptotically stable with Λ(A_C + ΔA_C) ⊂ Φ(q, r), the H∞ norm satisfies ||H(s)||∞ < γ, and the cost function reaches its minimum upper bound.
Proof: By Theorem 3, condition (Ⅰ) makes the closed-loop system satisfy Λ(A_C + ΔA_C) ⊂ Φ(q, r) and ||H(s)||∞ < γ. By the Schur complement, condition (Ⅱ) is equivalent to x0ᵀQx0 ≤ η, so J ≤ η + γ²β.
Denote the minimum of the optimization problem in Theorem 4 by η* + γ²β. According to Theorem 4, for the given indices Φ(q, r) and γ > 0, any cost function index J* ≥ η* + γ²β is consistent with the pole index and the H∞ disturbance attenuation index. Looking back over Theorems 1 to 4: if the closed-loop system (2) is pole-assignable, LMIs (8)-(9) must have feasible solutions and γ_L > 0 exists; then for any γ > γ_L there exists a state-feedback gain K that makes the closed-loop system (2) meet performance index constraints (a) and (b). However, for the given pole index Φ(q, r) and a particular γ > γ_L, LMIs (11)-(12) may have no feasible solution, because Lemma 2 loses the necessary-and-sufficient property of Lemma 1. Therefore, before seeking a reliable gain K that makes the closed-loop system (2) satisfy performance indices (a), (b) and (c) simultaneously, the feasibility of LMIs (11)-(12) must be checked first, which can easily be done with the MATLAB LMI toolbox. This leads to the following satisfactory reliable guaranteed cost controller design procedure.
Step 1: for the given pole index Φ(q, r), check the feasibility of LMIs (8)-(9). If they have a feasible solution, solve the optimization problem (10) to obtain γ_L > 0.
Step 2: given γ > γ_L, check the feasibility of LMIs (11)-(12). If they have no feasible solution, increase γ and check again until a feasible solution is found.
Step 3: solve the optimization problem (13) and obtain the state-feedback control law K = SQ⁻¹ that makes the closed-loop system (2) meet performance constraints (a) and (b) while the cost function J reaches its minimum upper bound.
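The three-step procedure can be prototyped with any semidefinite-programming front end. The sketch below is only a simplified illustration: it uses CVXPY (with the SCS solver) rather than the MATLAB LMI toolbox mentioned above, a hypothetical nominal plant without the uncertainty, fault and cost terms, and the basic disk-region and bounded-real LMIs in place of the full conditions (8)-(12).

```python
import cvxpy as cp
import numpy as np

# Hypothetical nominal plant (not the example of Section 4).
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.0]])
C = np.array([[1.0, 0.0]])
n, m, p, k = 2, 1, 1, 1
q, r = 1.5, 1.0                      # disk of radius r centred at (-q, 0)

Q = cp.Variable((n, n), symmetric=True)
S = cp.Variable((m, n))
gam = cp.Variable(nonneg=True)
AQ = A @ Q + B @ S                   # closed-loop term A_c Q with K = S Q^{-1}

# Disk-region LMI and bounded-real (H-infinity) LMI for the nominal closed loop.
disk = cp.bmat([[-r * Q, AQ + q * Q],
                [(AQ + q * Q).T, -r * Q]])
brl = cp.bmat([[AQ + AQ.T, D, Q @ C.T],
               [D.T, -gam * np.eye(p), np.zeros((p, k))],
               [C @ Q, np.zeros((k, p)), -gam * np.eye(k)]])
eps = 1e-6
cons = [Q >> eps * np.eye(n),
        disk << -eps * np.eye(2 * n),
        brl << -eps * np.eye(n + p + k)]

# Minimise the H-infinity level over the feasible set (Steps 1-3 collapsed).
cp.Problem(cp.Minimize(gam), cons).solve(solver=cp.SCS)
K = S.value @ np.linalg.inv(Q.value)
print("achievable gamma ~", gam.value, "K =", K)
```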
4 Numerical Example
To illustrate the effectiveness of the proposed design approach, a numerical example is discussed in this section. Consider the following linear uncertain system:

A = [ 0.976    −0.14    0.05426  ]     B = [ 0.2642    −0.74263 ]     D = [ 0.1 ]
    [ 0.0167    0.54    −0.01989 ]         [ −0.0634   −0.22412 ]         [ 0   ]
    [ 0.08811   0.74    −0.08681 ]         [ 0.38187   −0.7333  ]         [ 0   ]

x(0) = [1 1 1]ᵀ,   C = [ 0.874    0.57     −0.39051 ]
                        [ −0.496   0.9034   0.6319  ]

β = 1,   H = [1 0 0]ᵀ,   E1 = [0 0.41 0],   E2 = [0.4 0.1]

The actuator failure bounds are M_L = diag[0.75, 0.8] and M_U = I₂ₓ₂, and the cost function matrices are U = R = I₃ₓ₃. With Φ(q, r) = Φ(1.5, 1), solving the optimization problem (10) gives γ_L = 2.5734. Taking γ = 3 and solving the optimization problem (13) gives J* = 21.4093, with the corresponding reliable guaranteed cost controller

K = [ −0.00894   −0.1134    −0.0402 ]
    [ 0.0117      0.00566    0.0603 ]
5 Conclusion
In this paper, a method is presented for designing reliable guaranteed cost control for linear uncertain systems under pole index and H∞ norm index constraints in the presence of actuator failures. It is proved that the feasibility of a group of LMIs is necessary and sufficient for the existence of such a controller, and it is also shown that the consistency of the performance indices reduces to a feasibility problem with LMI constraints.
References 1. Jovan, D.B., Raman, K.M.: A decentralized scheme for accommodation of multiple simultaneous actuator failures. In: Proceedings of the American Control Conference, pp. 5098– 5103. IEEE Press, New York (2002)
Statistical Modelling of Glutamate Fermentation Process Based on GAMs Chunbo Liu, Xuan Ju, and Feng Pan Institute of Automation, Jiangnan University, Wuxi, 214122, China
Abstract. The application of Generalized Additive Models (GAMs) to the modelling of the glutamate fermentation process is proposed in this paper. A fermentation process involves many variables, and insignificant variables may worsen the performance of a pre-built model, so experiments for choosing significant variables were carried out first. A new model was constructed after choosing time (Time), dissolved oxygen (DO) and oxygen uptake rate (OUR) as significant variables. The simplified relationships between Time, DO, OUR and GACD, which reflect the effect of each variable in the fermentation process, were investigated using the constructed model. The integrated relationships between glutamate and the other significant variables, which provide a theoretical basis for control and optimization of fermentation processes, were also explored. Normally a fermentation model is specific and generalizes poorly, because of the complexity of the fermentation process and its high degree of time variation and batch-to-batch change. However, the fitting results of the new model indicated advantages in terms of non-parametric identification, prediction accuracy and robustness, so the model presented here showed satisfactory generalization. The advocated modelling method potentially supplies an alternative way to optimize and control fermentation processes. Keywords: Glutamate, Fermentation, Statistical model, GAMs.
1 Introduction

A glutamic-acid-producing bacterium was discovered in 1957, first named Micrococcus glutamicus and later renamed Corynebacterium glutamicum [1]. The fermentation process is very complicated and characterized by a high degree of time variation and batch-to-batch change. Furthermore, glutamate production is a typical non-growth-associated fermentation process. It is essential to build an accurate and effective mathematical model before implementing control and optimization of fermentation processes, and many scholars have worked on this [2-7]. The unstructured model, such as the Monod growth model and the Luedeking-Piret product formation model [8-10], is one method that can describe the time changes of the state variables. It is mainly used for off-line analysis of the fermentation process, such as prediction, optimization and control. However, the general prediction and control performance based on this model is limited because of some of its disadvantages. The black-box model, such as the artificial neural network (ANN) model [11, 12], is another method, which is completely based on
the input–output series data, without considering the real mechanisms. Because of its good ability to deal with nonlinear and complex characteristics, the ANN has been popular in fermentation process modelling. However, the general performance of an ANN is poor if the data are not sufficient to train the model. The fuzzy logic inference model [13, 14], a qualitative model based on human experience and knowledge, is yet another method. Its prediction performance largely depends on the empirical fuzzy logic rules and membership functions, and the development and adaptive adjustment of these rules and membership functions is a time- and labour-consuming process. This paper is likewise aimed at the modelling of the glutamate fermentation process. Generalized additive models (GAMs) [15-18] are GLMs in which some of the terms in the model are smooth, nonlinear functions of the explanatory variables. GAMs provide a flexible framework for exploring the relationships between the response and the explanatory variables. Furthermore, off-the-shelf statistical software packages that include algorithms for fitting GAMs are available, such as the R software [19]. The method has been used extensively in the analysis of air pollution, health, the environment and ecology [20, 21]. More recently, GAMs have been applied to hydrological and climatic time series [22, 23]. There is little evidence in the scientific literature of the application of GAMs to the modelling of fermentation.
2 Materials and Methods

2.1 Microorganism and Fermentation Conditions

Corynebacterium glutamicum S9114, kept by the Key Laboratory of Industrial Biotechnology in China, was used in this study. Concentrated glucose was added as required to keep the substrate concentration above a suitable level (15 g/l) throughout the fermentation period. The seed microorganism was grown in a shaker at 32 ℃ with a stirring speed of 200 rpm for 8 to 10 hours in a liquid medium containing (in g/l): K2HPO4 1.5, glucose 25, MnSO4 0.005, FeSO4 0.005, MgSO4 0.6, corn slurry 25, and urea 2.5 (separately sterilized). The medium for jar fermentation contained (in g/l): glucose 140, K2HPO4 1.0, FeSO4 0.002, MgSO4 0.6, MnSO4 0.002, thiamine 5.0×10−5, corn slurry 15, and urea 3.0 (separately sterilized). Corynebacterium glutamicum S9114 was cultured for glutamate production at 32 ℃ in a 5 L fermentor (BIOTECH-5BG, Baoxing Co., China) containing about 3.4 L of the above-mentioned medium. The initial pH was adjusted to 7.0–7.2, and pH was then controlled at 7.1±0.1 by automatic addition of 25% (w/w) ammonia water, which also supplied the nitrogen source required for glutamate synthesis. The dissolved oxygen concentration (DO) was controlled at various levels by automatically or manually adjusting the agitation speed according to the particular requirements. The CO2 and O2 concentrations (partial pressures) in the inlet and exhaust gas were measured on-line by a gas analyser (LKM2000A, Lokas Co. Ltd., Korea). The collected on-line data were smoothly filtered, and OUR and CER were then calculated on-line based on the method reported in the literature.
2.2 Methods

Generalized linear models (GLMs), which extend ordinary linear regression with a minimum of extra complication, are a unifying family of parametric models covering a wide range of regression analyses with non-normal responses [24]. GLMs are fully characterized by three components:

(1) a random component, the probability distribution of the response variable Y_i for units i = 1, ..., I; in general it depends on a mean parameter μ_i and on a global dispersion parameter Φ;

(2) a deterministic component, which specifies a linear function of the covariates X_i upon which the response Y_i is assumed to depend; the linear predictor is denoted λ_i = α'X_i;

(3) the link, a fixed function f(·) describing the functional relationship between the deterministic component and the expected value of the random component; it relates the linear predictor to the mean μ_i of the response variable: μ_i = f^{-1}(λ_i) = f^{-1}(α'X_i).

Specific choices of the random component and the link function lead to very popular regression models, such as the logistic or Poisson models. The assumption that a GLM is linear in the covariates can be very restrictive in some cases. This restriction can be avoided by using generalized additive models (GAMs), an extension of GLMs in which it is only assumed that the effects can be represented by arbitrary unknown smooth functions, instead of assuming a parametric form for the effects of the continuous covariates. For a multidimensional nonparametric regression problem it is convenient to restrict attention to an additive model, and this class of models also avoids the curse of dimensionality. Because the additive components of a GAM describe the influence of each covariate separately, GAMs are easy to interpret and can be expressed as in Eq. (1):

μ(X) = f^{-1}(α'X) = f^{-1}(α + f_1(X_1) + … + f_q(X_q))    (1)

where the f_j are unknown (zero-mean) partial functions.
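As an illustration of Eq. (1), the short sketch below fits an additive model with a smooth term for each significant covariate. It is only a sketch: the pyGAM package, the variable names and the synthetic data are assumptions made here for illustration; the original study fitted its GAMs in R [19].

```python
import numpy as np
from pygam import LinearGAM, s  # assumed third-party GAM package

# Hypothetical fermentation records: columns are Time (h), DO (%), OUR (mol/(m3*h))
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(0, 34, 200),     # Time
    rng.uniform(5, 60, 200),     # DO
    rng.uniform(40, 200, 200),   # OUR
])
# Hypothetical GACD response with additive nonlinear effects plus noise
y = (2.2 * X[:, 0]
     + 0.3 * X[:, 1] * np.sin(X[:, 1] / 6.0)
     + 0.05 * X[:, 2]
     + rng.normal(0, 2, 200))

# GAM of the form of Eq. (1): one smooth term per significant covariate
gam = LinearGAM(s(0) + s(1) + s(2)).fit(X, y)
gam.summary()            # smoothing terms, effective degrees of freedom, etc.
y_hat = gam.predict(X)   # fitted GACD values
```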
In this paper, the model of the glutamate fermentation process was not only built using GAMs, but also tested using data from a different fermentor. The RMSE (root mean square error) and the COR (correlation coefficient) were selected as the testing performance measures. The RMSE of an estimator θ̂ with respect to the estimated parameter θ is defined in Eq. (2) as the square root of the mean squared error:

RMSE(θ̂) = sqrt( E[(θ̂ − θ)²] )    (2)
RMSE was a good measure of how accurately the model predicted the response and the most important criterion for fit. The performance of RMSE was very suitable if the main purpose of the model was prediction. The correlation coefficient (COR), also named the cross-correlation coefficient, was a quantity that gave the quality of a least squares fitting to the original data. It could be written in the following form (3):

r² = ss²_xy / (ss_xx · ss_yy)    (3)

where ss_xx, ss_yy and ss_xy of a set of n data points (x_i, y_i) are given by Eqs. (4)-(6):

SS_xx = Σ(x_i − x̄)² = Σx² − n·x̄²    (4)

SS_yy = Σ(y_i − ȳ)² = Σy² − n·ȳ²    (5)

SS_xy = Σ(x_i − x̄)(y_i − ȳ) = Σxy − n·x̄·ȳ    (6)
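For concreteness, Eqs. (2)-(6) translate directly into a few lines of NumPy; the array names and the example values below are made up for illustration only.

```python
import numpy as np

def rmse(theta_hat, theta):
    """Root mean square error, Eq. (2)."""
    theta_hat, theta = np.asarray(theta_hat, float), np.asarray(theta, float)
    return np.sqrt(np.mean((theta_hat - theta) ** 2))

def cor(x, y):
    """Correlation coefficient via Eqs. (3)-(6): r = ss_xy / sqrt(ss_xx * ss_yy)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    ss_xx = np.sum((x - x.mean()) ** 2)                 # Eq. (4)
    ss_yy = np.sum((y - y.mean()) ** 2)                 # Eq. (5)
    ss_xy = np.sum((x - x.mean()) * (y - y.mean()))     # Eq. (6)
    return ss_xy / np.sqrt(ss_xx * ss_yy)

# Example with made-up observed and fitted GACD values
observed = [10.2, 20.5, 35.1, 50.0, 64.8, 78.9]
fitted = [11.0, 19.8, 33.9, 52.1, 63.5, 80.2]
print(rmse(fitted, observed), cor(fitted, observed))
```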
3 Results and Discussions

Many variables were measured in the fermentation process, such as fermentation time (Time), temperature (Temp), pH, oxygen uptake rate (OUR), carbon dioxide evolution rate (CER), dissolved oxygen (DO) and stirring speed. For modelling it was not good to use all the variables, because the insignificant variables might worsen the model performance. The analysis showed that Time, DO and OUR were significant, so the following model was built using only these significant variables; in particular, Time was the most significant variable. Figure 1 shows some simplified variable relationships based on the model constructed using GAMs. Figs. 1a-c show the relationships between the glutamic-acid output (GACD) and the variables Time (hour), DO (%) and OUR (mol/(m³·h)); Figs. 1d-f show the relationships between the glutamic-acid producing speed (SPEED) and the same variables. These simplified relationships preliminarily indicate the impact of each significant variable on the production of glutamate, and the relationships of the modelling variables were easy to read off using GAMs. The GACD and its increasing speed (SPEED) depended on time in a nonlinear way. The SPEED increased quickly from the 5th hour and reached its highest value, more than 10 g/l, at about the 11th hour. After that time, SPEED decreased gradually towards zero and showed no trend to increase again. Correspondingly, the GACD's increasing rate changed from high to low at around the 20th hour. The final glutamate concentration reached about 80 g/l at the 34th hour. Moreover, the results also suggested that the production GACD would reach its maximum as SPEED decreased to zero.
Fig. 1. The simplified relationships between significant variables and GACD (SPEED)
The glutamate production strongly depended on the value of DO. When DO was at the special value of 10%, the values of GACD and SPEED were satisfactory. In fact, further work was done to find the relationship between GACD and DO: statistical analysis using more new data showed that glutamate production could be higher when DO was near 10% or 55%, and it is worth noting that glutamate production was not high when DO was near 30%. This point should therefore be borne in mind during fermentation to ensure a high production of glutamate. It was a little difficult to decide from the simplified relationship pictures which value of OUR was most beneficial to production, and in fact this relationship is studied further later in the paper; the simplified relationship pictures do suggest, however, that OUR should not be at about 100 mol/(m³·h), where production could be lowest. The simplified relationships of the modelling variables described above were enlightening and made good preparation for further control and optimization. This ability to describe relationships is an advantage of GAMs that is beyond the capacity of methods such as the unstructured dynamic model, the black-box model and the fuzzy logic inference model. The regression results using the nonlinear model constructed from the previous 10 groups of data are shown in Figure 2, where black dots are the observed GACD or SPEED and red dots are the fitted GACD or SPEED using the significant variables (Time, DO and OUR) as independent variables. The 10 groups were selected randomly from the 15 groups.
Fig. 2. The testing results using the 5 testing groups randomly selected from the 15 groups
The results were as follows: for GACD, the RMSE was 3.376932 and the COR was 0.9893097; for SPEED, the RMSE was 2.234239 and the COR was 0.7296624. Before using GAMs, the linear regression method was also applied; after choosing significant variables its best result was a COR of 0.7141428. The experimental results thus further showed that the nonlinear method was more effective for the fermentation process. The simplified relationships above could preliminarily indicate the impact of each significant variable. Glutamate production is complex and the variables affect each other, so to obtain more complete information the integrated relationships between Time, DO, OUR and GACD were investigated using the constructed model. These results make further preparation for implementing control and optimization of the glutamate fermentation process. Because the value of GACD was measured only every two hours, the sample points were not enough to reflect the relationships; fortunately, DO and OUR were measured every second and the model of the glutamate fermentation process had already been constructed, so it was not difficult to obtain the other GACD values at the measuring times. In this paper, interpolation was used to resolve the problem of too few sample points. The integrated relationships of Time, DO, OUR and GACD can be seen in Fig. 3 and Fig. 4. The results were divided into two parts, the first from the 7th hour to the 25th hour and the second from the 25th hour to the 34th hour. From the figures, some conclusions could be drawn, such as that DO should be in particular ranges at different times to ensure high glutamate production; for example, when the time was between the 7th hour and the 9th hour, it was proper to set DO in the range of 10%–25% and OUR in the range of 180–200 mol/(m³·h).
Fig. 3. The integrated relationships of DO, OUR and GACD from the time 7th hour to 25th hour
From Fig. 3 and Fig. 4, it was beneficial to glutamate production to control OUR in the range of 190–200 mol/(m³·h) before the 19th hour and 50–60 mol/(m³·h) after the 19th hour, and to keep DO in the range of 15%–25% during the whole glutamate fermentation process. The experimental results also showed that the second part was more important for achieving high glutamate production.
Fig. 4. The integrated relationships of DO, OUR and GACD from the time 25th hour to 34th hour
The selected areas in Fig. 4, with DO in the range of 15%–25% and OUR in the range of 40–70 mol/(m³·h), were optimal control zones. From the 30th hour other optimal control zones appeared, but they required DO to be more than 50%, so the previous optimal control zones were better.
The robustness of the newly constructed model was also tested and the results were satisfactory, although further studies are still needed.
4 Conclusion

Because of the complexity of fermentation, it is not easy to model. Although many variables were measured in the fermentation process, it was not a good idea to model with all of them, because the insignificant variables might worsen the capacity of the constructed model. The significant variables Time, DO and OUR were therefore chosen first, using a hypothesis-testing statistical method, and then used for modelling. GAMs have been used extensively in areas such as air pollution, health and rainfall analysis; here they were tried in the area of fermentation. For GACD the predicted COR result was 0.9738, and the experimental results were satisfactory. The work presented may provide a new way for the practical application of fermentation processes. The preliminarily constructed model needs more testing and improvement because of the complexity of the glutamate fermentation process. Further work based on more new data will be done to investigate the application of GAMs to the modelling of the glutamate fermentation process and to increase the robustness of the model. OUR is related to metabolic engineering, which combines systematic analysis of metabolic and other pathways with molecular biological techniques at the micro level, and much work is still needed to control the variable OUR effectively in the future.

Acknowledgments. The paper was supported by the National 863 Project Foundation of China (No. 2006AA020301).
References 1. Kinoshita, S.: Glutamic acid bacteria. In: Demain, A.L., Solomon, N.A. (eds.) Biology of Industrial Micro-organisms, pp. 115–142. Benijamin Cummings, London (1985) 2. Zhang, C.Y., Shi, Z.P., Gao, P., Duan, Z.Y., Mao, Z.G.: On-line prediction of products concentrations in glutamate fermentation using metabolic network model and linear programming. Biochemical Engineering Journal 25, 99–108 (2005) 3. Gebert, J., Radde, N.: A new approach for modeling procaryotic biochemical networks with differential equations. Computing Anticipatory Systems 839, 526–533 (2006) 4. Gonzalez, R., Murarka, A., Dharmadi, Y., Yazdani, S.S.: A new model for the anaerobic fermentation of glycerol in enteric bacteria: Trunk and auxiliary pathways in Escherichia coli. Metabolic Engineering 10, 234–245 (2008) 5. Jimenez-Hornero, J.E., Santos-Duenas, I.M., Garcia-Garcia, I.: Structural identifiability of a model for the acetic acid fermentation process. Mathematical Biosciences 216, 154–162 (2008) 6. Vazquez, J.A., Murado, M.A.: Unstructured mathematical model for biomass, lactic acid and bacteriocin production by lactic acid bacteria in batch fermentation. Journal of Chemical Technology and Biotechnology 83, 91–96 (2008) 7. Gebert, J., Radde, N., Faigle, U., Strosser, J., Burkovski, A.: Modeling and simulation of nitrogen regulation in Corynebacterium glutamicum. Discrete Applied Mathematics 157, 2232–2243 (2009)
8. Shimizu, K., Furuya, K., Taniguchi, M.: Optimal Operation Derived by Greens Theorem for the Cell-Recycle Filter Fermentation Focusing on the Efficient Use of the Medium. Biotechnology Progress 10, 258–262 (1994) 9. Bause, M., Merz, W.: Higher order regularity and approximation of solutions to the Monod biodegradation model. Applied Numerical Mathematics 55, 154–172 (2005) 10. Dette, H., Melas, V.B., Pepelyshev, A., Strigul, N.: Robust and efficient design of experiments for the Monod model. Journal of Theoretical Biology 234, 537–550 (2005) 11. Pollard, J.F., Broussard, M.R., Garrison, D.B., San, K.Y.: Process Identification Using Neural Networks. Computers & Chemical Engineering 16, 253–270 (1992) 12. Ungar, L.H., Powell, B.A., Kamens, S.N.: Adaptive Networks for Fault-Diagnosis and Process-Control. Computers & Chemical Engineering 14, 561–572 (1990) 13. Kishimoto, M., Yoshida, T.: Application of Fuzzy Theory on Fermentation Processes. Hakkokogaku Kaishi-Journal of the Society of Fermentation Technology 69, 107–116 (1991) 14. Georgieva, O., Wagenknecht, M., Hampel, R.: Takagi-Sugeno fuzzy model development of batch biotechnological processes. International Journal of Approximate Reasoning 26, 233–250 (2001) 15. Hastie, T., Tibshirani, R.: Generalized Additive Models. Chapman and Hall, Boca Raton (1990) 16. Gu, C.: Cross-validating non-Gaussian data. Journal of Computational and Graphical Statistics 1, 169–179 (2002) 17. Wood, S.N.: Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC, Boca Raton (2006) 18. Wood, S.N.: Fast stable direct fitting and smoothness selection for generalized additive models. Journal of the Royal Statistical Society Series B-Statistical Methodology 70, 495– 518 (2008) 19. R Development Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2008) 20. He, S., Mazumdar, S., Arena, V.C.: A comparative study of the use of GAM and GLM in air pollution research. Environmetrics 17, 81–93 (2006) 21. Health Effects Institute, Revised analysis of time-series studies of air pollution and health: Special report, Boston, Mass (2003) 22. Cox, M.E., Moss, A., Smyth, G.K.: Water quality condition and trend in North Queensland waterways. Marine Pollution Bulletin 51, 89–98 (2005) 23. Morton, R., Henderson, B.L.: Estimation of nonlinear trends in water quality: An improved approach using generalized additive models, Water Resources Research 44 (2008) 24. McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapman & Hall, London (1989)
The Application of Support Vector Regression in the Dual-Axis Tilt Sensor Modeling Wei Su and Jingqi Fu Shanghai Key Laboratory of Power Station Automation Technology, School of Mechanical Engineering and Automation, Shanghai University, Shanghai, 200072, China [email protected]
Abstract. This paper investigates dual-axis tilt sensor modeling using support vector regression (SVR). To implement a dual-axis tilt measurement system, the designing structure of the system is first presented. Then, to overcome the nonlinearity between the input and output signals, support vector regression (SVR) is used to model the input and output of the tilt sensor. Finally, a real dual-axis tilt measurement experimental platform is constructed, which provides a large amount of experimental data for SVR modeling. Experiments with different modeling methods for the dual-axis tilt sensor are compared, and the experimental results show that the proposed modeling scheme can effectively improve the modeling precision. Keywords: Tilt sensor, non-linear, support vector regression (SVR), conformity.
1 Introduction

The tilt sensor, with its ability to improve biometrics and medical rehabilitation technology, has been widely used in the life sciences, for example in electronic medical pens, feedback-controlled functional electrical stimulation systems and vestibular prostheses [1]-[5]. The relationship between the input and output of a tilt sensor is an arcsine, which is a typical nonlinear relationship. This nonlinearity seriously restricts the tilt measurement precision and the range of the tilt sensor, so nonlinear compensation must be taken into account. Recently, several methods have been proposed to deal with this nonlinear problem. For example, T. G. Constandinou [6]-[7] presented an analog circuit composed of diode and MOS devices, in which the differential signal of the acceleration sensor output is extracted to handle the arcsine function. Crescini, D. [8] used a linear function to model the tilt sensor and achieved a measurement error of ±0.1° in the range of ±20°. Dong [9] used least squares to model the tilt sensor and also achieved a measurement error of ±0.1° in the range of ±20°. The least squares method improves the tilt measurement range, but its model is more complex than linear interpolation and its precision can hardly meet our requirements. Moreover, these methods mainly deal with low-range tilt input and output and cannot effectively handle high-range tilt, where the nonlinearity is more serious. SVR is different from the above traditional algorithms. It not only has an obvious advantage in small-sample learning, but also offers precision,
better global convergence and better generalization performance in nonlinear modeling [10]-[12]. Therefore, SVR is used to model the tilt sensor in this paper. It can be clearly seen from the experimental results that SVR has obvious advantages over the least squares method in the measurement range of ±60°. The rest of this paper is organized as follows. The measuring principle of the tilt sensor system is presented in Section 2. The designing structure of the dual-axis tilt measurement system is then given in Section 3. Section 4 introduces the SVR modeling method. The experiment is described in Section 5. Finally, Section 6 concludes the paper.
2 Measuring Principle of Tilt Sensor System

The acceleration sensor measures gravity with reference to the ground level, and the measurement principle of the tilt sensor is shown in Fig. 1.
Fig. 1. Measurement schematic of tilt sensor
In this schematic, a horizontal plane is defined by the x0 and y0 axes; a second coordinate system with axes X, Y and Z is fixed to the mobile tilt sensor; δ denotes the angle between the sensitive x-axis and the line x0, and γ denotes the angle between the sensitive y-axis and the line y0. Once the accelerations along the sensitive axes are measured, the corresponding δ and γ can be obtained. According to the spatial geometric relations shown in Fig. 1, it follows that

g_x = g·cos α    (1)

g_y = g·cos β    (2)

where g is the acceleration of gravity; g_x and g_y are the accelerations measured along the x-axis and y-axis, respectively; α is the angle between the x-axis and the gravity axis; and β is the angle between the y-axis and the gravity axis. Based on α + δ = 90° (and likewise β + γ = 90°), (1) and (2) can be expressed as

δ = arcsin(g_x / g)    (3)

γ = arcsin(g_y / g)    (4)
From (3) and (4), the relations between the accelerations measured along the sensitive axes and the pitch and roll angles are obtained. It can also be seen that the input–output relation of a tilt sensor that uses gravity to measure the angle is seriously nonlinear.
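As a simple numerical illustration of Eqs. (3)-(4), the sketch below converts measured axis accelerations into the two tilt angles; the function and variable names are chosen here for illustration and are not part of the actual system firmware.

```python
import math

def tilt_angles(gx, gy, g=9.81):
    """Compute pitch (delta) and roll (gamma) in degrees from Eqs. (3)-(4)."""
    # Clamp the ratios to [-1, 1] to guard against sensor noise near full scale
    rx = max(-1.0, min(1.0, gx / g))
    ry = max(-1.0, min(1.0, gy / g))
    delta = math.degrees(math.asin(rx))
    gamma = math.degrees(math.asin(ry))
    return delta, gamma

# Example: half of gravity sensed on the x-axis gives a 30 degree pitch
print(tilt_angles(gx=4.905, gy=0.0))  # -> approximately (30.0, 0.0)
```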
3 Designing Structure of Tilt Sensor System

To overcome the nonlinearity between the input and output signals caused by the measurement principle of the tilt sensor, a tilt measurement system is designed as shown in Fig. 2. The system processes the input and output signals with SVR to implement nonlinear compensation effectively.
Fig. 2. The designing structure of dual-axis tilt measurement system
In Fig. 2, the key components are the dual-axis acceleration sensor, the microprocessor and the digital potentiometer. The biaxial acceleration sensor senses the accelerations of the X axis and Y axis, and the ratio of the acceleration to the gravitational acceleration is transformed into a PWM output. The period of the PWM is controlled by the digital potentiometer, which is adjusted by the microprocessor. Meanwhile, the microprocessor processes the signal output by the acceleration sensor intelligently: on the one hand, it adjusts the digital potentiometer according to the PWM; on the other hand, it uses an intelligent algorithm such as SVR to model the input and output signals. The results are displayed on the computer via serial port communication. The power unit supplies the proper power for the acceleration sensor, microprocessor, digital potentiometer, serial device and JTAG.
4 Model Method of SVR

The support vector machine (SVM) is a learning method based on the principle of structural risk minimization. To obtain good generalization capability, an SVM finds the best trade-off between model complexity and learning ability from the limited sample information. SVR is one type of SVM, usually used to solve regression problems.
In order to deal with the nonlinearity, the input data x in the input space are mapped to a high-dimensional feature space via a nonlinear mapping function Φ(x). Formally, an SVR model can be described by the following equation:

f(x, w) = w·Φ(x) + b    (5)

where w is the weight vector and b is the threshold. When Vapnik's ε-insensitive loss function [13-15] is introduced, the SVR problem can be expressed as the following optimization problem:

min_w  (1/2)||w||² + C·Σ_{i=1}^{l} (ξ_i + ξ_i*),  i = 1, 2, ..., n

subject to  y_i − w·Φ(x_i) − b ≤ ε + ξ_i*,
            w·Φ(x_i) + b − y_i ≤ ε + ξ_i,
            ξ_i, ξ_i* ≥ 0    (6)

where C is the penalty factor and ε is the loss function parameter. By introducing a dual set of Lagrange multipliers α_i and α_i*, the minimization problem in (6) can be transformed into a dual problem. After obtaining the parameters α_i and α_i*, the final approximation function is

f(x) = Σ_{i=1}^{nSV} (α_i − α_i*)·K(x_i, x) + b    (7)

where x_i represents a support vector, α_i and α_i* are the parameters associated with the support vector x_i, nSV is the number of support vectors, and K(x_i, x) is the kernel function

K(x_i, x) = exp(−λ·||x − x_i||²)    (8)

where λ is the kernel parameter. From (6) to (8) it can be seen that the generalization capability of the SVM depends significantly on the parameters C, ε and λ, so selecting these three parameters reasonably and effectively will greatly promote the practical application of SVR. In current practice, the SVR parameters are mainly determined by empirical choice or by a grid search algorithm. Vladimir C. [16] gives expressions for ε and C, providing an effective solution for their choice. Cristianini N. and Kandola J. [17] used kernel alignment to quickly identify λ. Sathiya S. [18] proposed a function of the kernel parameter λ and C to transform the two-dimensional optimization problem into two one-dimensional optimization problems. In this paper, with the gridregression.py tool, the leave-one-out cross-validation method is used to determine the parameters C, ε and λ automatically [19].
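The model in Eqs. (5)-(8) is the standard ε-SVR with an RBF kernel, so it can be reproduced with common libraries. The sketch below uses scikit-learn rather than the LIBSVM command-line tools used in the paper, and the hyper-parameter values are illustrative guesses, not the values obtained by the authors' grid search.

```python
import numpy as np
from sklearn.svm import SVR

# Calibration data for the x-axis from Table 1: sensor output (counts) vs. angle (deg)
outputs = np.array([1734, 1829, 1920, 2010, 2098, 2182, 2264,
                    2343, 2415, 2483, 2545, 2601, 2650], dtype=float)
angles = np.arange(0, 65, 5, dtype=float)

X = outputs.reshape(-1, 1)
# epsilon-SVR with RBF kernel K(xi, x) = exp(-gamma * ||x - xi||^2), cf. Eq. (8)
model = SVR(kernel="rbf", C=100.0, epsilon=0.05, gamma=1e-5)  # illustrative values
model.fit(X, angles)
print(model.predict([[2100.0]]))  # predicted angle for an unseen sensor reading
```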
5 Experiment

According to the designing structure of the dual-axis tilt measurement system, a real experimental platform is constructed, as shown in Fig. 3. The experimental platform has the
functions of data sampling with the tilt sensor, data calibration with the tilt calibration device, and data display with the computer.

Fig. 3. Experimental platform (tilt sensor, tilt calibration device and computer)
The input and output data samples can be collected from the constructed system. The calibration device is made by JTM Corporation, and its accuracy is 0.02°. The tilt sensor is fixed on the tilt calibration device to calibrate the input and output signals, and the output of the tilt sensor changes as the calibration device rotates. Because the sensitivity of the system decreases as the measured tilt increases, the angle is calibrated from -60° to 60°, recording the data every 5°. The sampling data are listed in Table 1.

Table 1. The sampling data from -60° to 60°

Angle input   Output of x-axis   Output of y-axis     Angle input   Output of x-axis   Output of y-axis
  0               1734               1678                  0             1734               1678
  5               1829               1770                 -5             1639               1586
 10               1920               1861                -10             1548               1495
 15               2010               1951                -15             1458               1405
 20               2098               2039                -20             1370               1317
 25               2182               2123                -25             1286               1233
 30               2264               2205                -30             1204               1151
 35               2343               2284                -35             1125               1072
 40               2415               2357                -40             1053                999
 45               2483               2424                -45              985                932
 50               2545               2480                -50              923                876
 55               2601               2541                -55              867                815
 60               2650               2590                -60              818                766
After the input and output data samples are collected, the method of support vector regression (SVR) is used to model the input and output of the tilt sensor; the procedure of SVR modeling is demonstrated in Fig. 4.
Fig. 4. The procedure of SVR modeling
1) Training set and validation set. By sampling a data point every 5°, the input and output data of the x-axis and y-axis from -60° to 60° are obtained, with 30 data samples in each set.
2) Normalization of samples. The svm-scale command is used to normalize the samples. After the command is executed, the data are normalized to [-1, 1], and the normalized training set and validation set are saved in data.txt and test.txt, respectively.
3) Kernel function. Many kernel functions could be used; to obtain a better training result, the RBF kernel function is chosen.
4) Parameter optimization. Under the Python environment, gridregression.py is employed to optimize the parameters C, ε and λ; this tool is based on the leave-one-out cross-validation method.
5) SVR model. With the best parameters C, ε and λ obtained above, the training set is used to build the SVR model by executing the svm-train command.
6) Validation of the training model. The trained model can be tested with the svm-predict command. The model parameters are saved in data.txt.model, and out.txt contains the test results.
Because the characteristics of the measurement system from -60° to 0° and from 0° to 60° are symmetric, only the angle range from 0° to 60° is modeled. In addition, the SVR results are compared with a least squares cubic polynomial fit; the modeling errors of SVR and least squares are shown in Fig. 5 and Fig. 6, and a code sketch of this procedure is given below.
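A minimal Python sketch of the six-step procedure (scaling to [-1, 1], grid search over C, ε and the kernel parameter with leave-one-out cross-validation, training and prediction), using scikit-learn as a stand-in for the svm-scale/svm-train/svm-predict and gridregression.py tools; the parameter grids and the candidate values are assumptions for illustration.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, LeaveOneOut

# Steps 1-2: training samples (sensor counts -> angle) to be scaled to [-1, 1]
X = np.array([[1734], [1829], [1920], [2010], [2098], [2182], [2264],
              [2343], [2415], [2483], [2545], [2601], [2650]], dtype=float)
y = np.arange(0, 65, 5, dtype=float)

pipe = Pipeline([
    ("scale", MinMaxScaler(feature_range=(-1, 1))),  # Step 2: normalization
    ("svr", SVR(kernel="rbf")),                      # Step 3: RBF kernel
])

# Step 4: grid search with leave-one-out cross-validation (illustrative grids)
grid = GridSearchCV(
    pipe,
    param_grid={"svr__C": [1, 10, 100, 1000],
                "svr__epsilon": [0.01, 0.1, 0.5],
                "svr__gamma": [0.1, 1.0, 10.0]},
    cv=LeaveOneOut(),
    scoring="neg_mean_squared_error",
)
grid.fit(X, y)                    # Step 5: train the SVR model with the best parameters
print(grid.best_params_)
print(grid.predict([[1900.0]]))   # Step 6: validate / predict on a new reading
```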
Fig. 5. The error of x-axis
Fig. 6. The error of y-axis
According to Fig. 5 and Fig. 6, the maximum fitting errors are 0.62° and 0.55° with the traditional least squares method, whereas the maximum fitting errors are -0.08° and 0.08° with SVR. It is clearly shown that SVR modeling gives better fitting results than least squares. Conformity is a criterion for evaluating the agreement between the fitting curve and the desired curve. The conformity E_l is expressed as

E_l = ±(Δy_m / y_m) × 100%    (9)

where Δy_m is the maximum deviation between the practical characteristic curve of the instrument and the desired fitting straight line, and y_m is the full-scale output of the instrument. The conformities of the two methods are shown in Table 2.

Table 2. Comparison of the conformity of SVR and least squares

              least squares            SVR
              x-axis      y-axis       x-axis      y-axis
Conformity    ±1.03%      ±1%          ±0.13%      ±0.13%
From Table 2, the conformity of SVR is far smaller than that of least squares, which again confirms that the SVR method yields a better model. Consequently, the tilt measurement system achieves higher precision.
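Written out as code, the conformity criterion of Eq. (9) is simply the largest fitting deviation expressed as a percentage of the full-scale output; the arrays below are made-up values used only to illustrate the calculation.

```python
import numpy as np

def conformity(fitted, desired, full_scale):
    """Conformity E_l of Eq. (9): max deviation over the full-scale output, in percent."""
    dev_max = np.max(np.abs(np.asarray(fitted) - np.asarray(desired)))
    return 100.0 * dev_max / full_scale

# Illustrative fitted vs. desired angles over a 0-60 degree range
fitted = [0.05, 10.06, 20.00, 29.94, 40.08, 49.95, 59.97]
desired = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
print(conformity(fitted, desired, full_scale=60.0))  # about 0.13 (percent)
```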
6 Conclusions

A dual-axis tilt measurement system has been designed in this paper. Considering the seriously nonlinear relation between the input and output signals, SVR is used to model the input and output of the tilt sensor. Experiments on this system are performed with the two methods. Using SVR, the system error is less than ±0.08° and the conformity is ±0.13%. The experimental results therefore clearly show that
SVR has more obvious advantages than the least squares method in the measurement range from -60° to 60°. Moreover, the tilt measurement system can meet the application requirements of the biological sciences and shows potential for real-life applications.

Acknowledgment. The authors gratefully acknowledge support for this work from the National High Technology Research and Development Program of China under Grant 2007AA04Z174.
References 1. Hofer, J., Gruber, C., Sick, B.: Biometric analysis of handwriting dynamics using a script generator model. In: 2006 IEEE Mountain Workshop on Adaptive and Learning Systems, SMCals 2006, pp. 36–41 (2006) 2. Foss, O.A., Klaksvik, J., Benum, P., Anda, S.: Pelvic rotations: A pelvic phantom study. Acta Radiologica 48(6), 650–657 (2007) 3. Yu-Luen, C.: Application of Tilt Sensors in Human–Computer Mouse Interface for People With Disabilities. IEEE Transactions on Neural Systems and Rehabilitation Engineering 9(3), 289–294 (2001) 4. Ding, D., Leister, E., Cooper, R., Cooper, R.A., Kelleher, A., Fitzgerald, S.G., Boninger, M.L.: Usage of tilt-in-space, recline, and elevation seating functions in natural environment of wheelchair users. Journal of Rehabilitation Research and Development 45(7), 973–984 (2008) 5. Kubík, J., Vcelák, J., O’Donnell, T., McCloskey, P.: Triaxial fluxgate sensor with electroplated core. Sensors and Actuators 152(2), 139–145 (2009) 6. Constandinou, T.G., Georgiou, J.: Micropower arcsine circuit for tilt processing. Electronics Letters 44(23) (2008) 7. Constandinou, T.G., Georgiou, J.: A Micropower Tilt Processing Circuit. IEEE Transactions on Biomedical Circuits and Systems 3(6), 363–369 (2009) 8. Crescini, D., Marioli, D., Romani, M., Sardini, E., Taroni, A.: An Inclinometer based on Free Convective Motion of a Heated Air Mass. In: Sicon104 ~ Sensors for Industry Conference, New Orleans, Lo~tiriana. USA, pp. 21–29 (January 2004) 9. Dong, W., Kwang, Y.L., Young, K.G., Kim, D.N., I-Ming, C., Song, H.Y., Been-Lirn, D.: A low-cost motion tracker and its error analysis. In: 2008 IEEE International Conference on Robotics and Automation, ICRA 2008, May 19 (2008) 10. Cao, X., Chen, J., Matsushita, B., Imura, H., Wang, L.: An automatic method for burn scar mapping using support vector machines. International Journal of Remote Sensing 30(3), 577–594 (2009) 11. Takahashi, N., Nishi, T.: Global convergence of decomposition learning methods for support vector machines. IEEE Transactions on Neural Networks 17(6), 1362–1369 (2006) 12. Trafalis, T., Alwazzi, S.A.: Support vector regression with noisy data: A second order cone programming approach. International Journal of General Systems 36(2), 237–250 (2007) 13. Vapnik, V.N.: The nature of statistical learning theory. Springer, New York (1999) 14. Cherkassky, V., Ma, Y.: Practical Selection of SVM parameters and noise estimation for SVM regression. Neural Networks 17, 113–126 (2004) 15. Cervantes, J., Li, X., Yu, W., Li, K.: Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71(4-6), 611–619 (2008)
16. Vladimir, C., Yunqian, M.: Practical Selection of SVM Parameters and Noise Estimation for SVM Regression. Neural Networks 17(1), 113–126 (2004) 17. Cristianini, N., Shawe-Taylor, J., Kandola, J., et al.: On Kernel Target Alignment. In: Proc. of Neural Information Processing Systems. MIT Press, Cambridge (2002) 18. Sathiya, S., Keerthi, C.J.: Asymptotic Behavior of Support Vector Machines with Gaussian Kernel. Neural Computation 15(7), 1667–1689 (2003) 19. Zhang, J.Y., Liu, S.L., Wang, Y.: Gene association study with SVM, MLP and crossvalidation for the diagnosis of diseases. Progress in Natural Science 18(6), 741–750 (2008)
Implementing Eco-Friendly Reservoir Operation by Using Genetic Algorithm with Dynamic Mutation Operator Duan Chen1,2, Guobing Huang1, Qiuwen Chen2, and Feng Jin1 1
Changjiang River Scientific Research Institute, Jiuwanfang, Hankou District, Wuhan 430010, China 2 Research Center for Eco-environmental Science, Chinese Academy of Sciences, Shuangqinglu 18, Haidian District, Beijing 100085, China
Abstract. The Simple Genetic Algorithm (SGA) uses a constant mutation rate, which may lead to premature convergence and local optima, especially for problems with many nonlinear constraints such as eco-friendly reservoir operation. This study adapts the SGA with a double dynamic mutation operator, develops an optimization model of eco-friendly reservoir operation, and applies it to cascade reservoirs in southwest China. It is shown that the adaptive GA with the dynamic mutation operator can fulfil the goal of eco-friendly reservoir operation and that its search accuracy and global searching ability are improved in comparison with the SGA. Keywords: Genetic algorithms, double dynamic mutation operator, Eco-friendly reservoir operation.
1 Introduction

Although great progress has been made in the last 40 years, efficient operation of reservoir systems still remains a very active research area. The combination of multiple water uses, nonlinearity in the model and in the objectives, strong uncertainties in the inputs and a high-dimensional state make the problem challenging and intriguing [1]. Traditionally, reservoir operation is performed based on heuristic procedures, embracing rule curves and subjective judgments by the operator for reservoir releases according to the current reservoir level, hydrological conditions, water demands and the time of the year [2]. In order to increase reservoir efficiency, a variety of methods including linear programming, dynamic programming, gradient-based search algorithms, heuristic programming and other nonlinear optimization techniques were developed during the past decades [3]. Among them, the Genetic Algorithm, first conceived by John Holland, has been highlighted by many researchers for its capability of solving problems with discontinuous, combinatorial and nonlinear objective functions or non-differentiable and non-convex design spaces [4,5]. It has been applied to reservoir management [6,7], reservoir operating rules [8], real-time reservoir operation [9], and
multi-reservoir systems optimization [10], etc. However, despite these efforts, relatively high computational complexity, genetic drift [11], inaccuracy in the local search intensification process and a slow rate of convergence remain unsolved issues in GA. Eco-friendly reservoir operation is an adaptive way to manage a reservoir by duly considering the ecological needs and releasing adaptively managed flows downstream. It typically adds another, ecosystem-related objective that normally conflicts with the original one concerning only social and economic interests, or imposes the ecological flow demand as an additional constraint that is highly nonlinear and varies dynamically with time [1,2]. Operating reservoirs in an eco-friendly manner has proved to be an effective solution for reducing the adverse impacts of traditional reservoir operation and maintaining the sustainability of the river ecosystem under these threats [12]. Many prototype investigation results indicate that damage to the river ecosystem can be distinctly restored when reservoir operation is improved to meet the ecological flow demand (EFD) downstream [13,14]. However, the added conflicting objective or the new highly nonlinear constraint, together with the deficiencies of GA itself, may bring great difficulties to the design, computation and optimization of the reservoir operation model. It is therefore important to assess GA performance in eco-friendly reservoir operation and to study desirable adaptations. This research took two cascade reservoirs as the study case and developed a model to optimize reservoir operation in an eco-friendly manner using GA. To improve its performance, the GA was adapted. Each parameter that affects the GA process was studied and tested in order to evaluate GA performance in the model, and the performance of the adapted genetic algorithm was compared with that of the traditional one.
2 Problem Statement

Two cascade reservoirs A and B were selected as the study case in this research. The two reservoirs are located in the upstream of the Yalongjiang River in southwest China. Reservoir A has a high concrete dam with the power house at the dam toe, while reservoir B has a low-water-head diversion-type hydropower station with a long diversion tunnel transferring water to its power house. The natural channel from dam B to power house B is about 120 km long. When power house B is operated, the flow in this channel is dramatically reduced and the channel may even be dewatered; a schematic of the hydraulic system is shown in Fig. 1. Most of the spawning grounds of the aboriginal fish Schizothorax chongi are distributed in this 120 km river channel, which implies that the operation of reservoir B will severely threaten the living conditions of the fish and may even lead to extinction if no remediation measures are taken. In order to protect the aboriginal fish, a certain flow must be discharged into the river channel to maintain the downstream ecosystem at a basic level. However, the stakeholders are concerned about the hydropower loss caused by the ecological flow demand, which brings serious difficulties to its practical implementation. It is therefore necessary to operate these two reservoirs with concern not only for human benefits but also for ecosystem interests.
Fig. 1. Layout of the cascaded reservoirs (reservoir A, diversion channel, reservoir B, power house and the dewatered river channel)
3 Model Proposed

3.1 Objective of Model

According to the planning and design proposal, these two reservoirs were mainly constructed for hydropower generation. Because of its importance and ease of quantification, hydropower was used as the objective in this study, given by equation (1):

E = max Σ_{t=1}^{12} (c^A·Q_t^A·H_t^A + c^B·Q_t^B·H_t^B)·T_t,  ∀t = 1, 2, ..., 12    (1)
where E is the cascade annual hydropower output, c is the output coefficient, t is the index of the time step, Q is the discharge through the turbines (m³/s), H_t is the water head (m), and T is the operation time (s). A monthly-based optimization model was proposed first, so T covers one year and t is a single month of the year; the superscripts A and B denote reservoir A and reservoir B, respectively. In order to maintain the downstream ecosystem, another objective related to the ecosystem requirement should be added, turning this into a multi-objective (MO) problem. Since the two objectives conflict, a Pareto-efficient solution is required, which is no longer a mere technical exercise but involves the preferences of the parties concerned [1]. At the same time, the quantified value of the ecosystem varies greatly between assessment methods and remains controversial. Therefore, a strategy was adopted in this study to separate the technical issues from the preference aspects and to avoid the inaccuracy of ecosystem valuation: the ecosystem concern is expressed by imposing an "ecological flow" constraint on the human-interest objective in the model, thus reducing the multi-objective (MO) problem to a set of parametric single-objective (SO) optimal control problems.
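For illustration, the annual energy objective of Eq. (1) can be evaluated for a candidate operating policy as in the sketch below; the output coefficients and the monthly series are placeholder values, not data from the case study.

```python
import numpy as np

def annual_energy(cA, cB, QA, HA, QB, HB, T):
    """Cascade annual hydropower output of Eq. (1), summed over 12 monthly steps."""
    QA, HA, QB, HB, T = map(np.asarray, (QA, HA, QB, HB, T))
    return np.sum((cA * QA * HA + cB * QB * HB) * T)

# Placeholder monthly series (12 values each); units follow the output coefficients
T = np.full(12, 30 * 24 * 3600.0)                  # operation time per month (s)
QA, HA = np.full(12, 400.0), np.full(12, 150.0)    # reservoir A: discharge, head
QB, HB = np.full(12, 350.0), np.full(12, 60.0)     # reservoir B: discharge, head
print(annual_energy(cA=8.5, cB=8.5, QA=QA, HA=HA, QB=QB, HB=HB, T=T))
```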
3.2 Constraints

3.2.1 Water Level and Reservoir Capacity Constraints
When the reservoir is operated, the upstream water level is constrained by the inflow discharge and the reservoir capacity through the water balance equation:

V_{t+1}^{A,B} = V_t^{A,B} + (Qnature^{A,B} − Qup^{A,B} − Qloss^{A,B} − Q_t^{A,B} − S_t^{A,B})·T^{A,B}    (2)
where, V is reservoir storage (m3), and S is discharge through other structures besides the turbines (m3/s). Qnature is inflow from upstream (m3/s), Qup is water withdraw from the reservoir (m3/s), and Qloss is water loss by evaporation and leakage. 3.2.2 Reservoir Storage Constraints According to dam design, the storage in the reservoirs should be less than or equal to the capacity of reservoir Vm and greater than or equal to the dead storage Vd in all the time. Mathematically this constraint is given as:
V_d^{A,B} ≤ V_t^{A,B} ≤ V_m^{A,B}    (3)

3.2.3 Turbine Release and Output Constraints
The releases through turbines for power generation should be less than or equal to turbine capacities and must not be negative. In addition, the real output should be greater than or equal to the firm output N_f, but less than or equal to the installed capacity N_i, at all times. These constraints can be written as:

0 ≤ Q_t^{A,B} ≤ Qmax^{A,B}    (4)

N_f^{A,B} ≤ N_t^{A,B} ≤ N_i^{A,B}    (5)
3.2.4 Constraints of Ecological Flow Demand
Ecological flow is considered a major component in minimizing the impact of reservoir operation on the downstream river ecosystem. Reservoirs should release flows that meet their specific purposes as well as the downstream ecosystem and livelihood objectives identified through scientific and participatory processes. These flows are referred to as "ecological flows", and they are not simply a quantity of water released downstream of a reservoir. Several approaches are available for assessing the ecological needs of the river systems downstream of a reservoir. In this research, a time series of ecological flow obtained by a comprehensive habitat method was used, which is more appropriate than a single value from the Tennant or other hydrological approaches. To guarantee the ecological flow downstream, the constraint is given as:
Q_t^A + S_t^A ≥ EFD(t);   Q_t^A + S_t^A − Q_t^B = EFD(t)    (6)
where EFD(t) is discharge series of ecological flow demand that came from Schizothorax chongi fish habitat model [15], shown in Fig.2.
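As a sketch of how Eqs. (2)-(6) act during a simulation step, the function below updates one reservoir's storage by the water balance and reports whether the bound constraints are respected; the variable names and the numbers are assumptions for illustration only.

```python
def step_reservoir(V_t, Q_nature, Q_up, Q_loss, Q_turbine, S_spill, T,
                   V_dead, V_max, Q_turbine_max):
    """One time step of Eq. (2) with feasibility checks from Eqs. (3)-(4)."""
    V_next = V_t + (Q_nature - Q_up - Q_loss - Q_turbine - S_spill) * T  # Eq. (2)
    feasible = (V_dead <= V_next <= V_max) and (0.0 <= Q_turbine <= Q_turbine_max)
    return V_next, feasible

# One monthly step with made-up numbers (storages in m3, discharges in m3/s)
V_next, ok = step_reservoir(V_t=5.0e9, Q_nature=800.0, Q_up=20.0, Q_loss=5.0,
                            Q_turbine=600.0, S_spill=50.0, T=30 * 24 * 3600,
                            V_dead=2.0e9, V_max=7.0e9, Q_turbine_max=1200.0)
print(V_next, ok)  # the channel release must also satisfy EFD(t) as in Eq. (6)
```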
Fig. 2. EFD series of Schizothorax chongi (Qdown in m³/s, plotted by month)
3.3 State Variable
The upstream water level is commonly used to represent the reservoir dispatching process, so it is selected as the state variable in the model. Since the model is monthly based, the state variables are (H_1^A, H_2^A, ..., H_12^A; H_1^B, H_2^B, ..., H_12^B), 24 variables in total.

3.4 Model Framework
The optimization model takes the total power generation as the objective function, and the water level, reservoir capacity, discharge and EFD as constraints. The two reservoirs in the system are connected by discharge and water level. The primary inputs of the model are the reservoir inflow from upstream and the initial values of the constraints. A genetic algorithm (GA) is used to seek the maximum of the objective function under all the constraints, thus obtaining the optimal reservoir operation. The model framework is illustrated in Fig. 3. The model was programmed in the C and Matlab languages.
Fig. 3. The framework of the optimization model (natural runoff → reservoirs A and B → objective function and constraints → GA function → optimal reservoir policy, linked by flow/water level)
4 GA Adaptation

Genetic Algorithms, developed by Holland, are analogous to Darwinian natural selection, combining an artificial survival of the fittest with the natural genetic operators in an attempt to find the best solution in a given solution space. The GA has received a great deal of attention as an optimization technique for complex problems. The GA search starts with an initial randomly generated population and progressively improves the solutions through iterations of the GA operators, including the selection scheme, reproduction, crossover operators and mutation operators. Although the GA has proved to be a powerful optimization method, it has some disadvantages such as premature convergence and convergence to local optima, especially for problems with many nonlinear constraints such as multiple-reservoir regulation. Traditionally, the GA uses a constant mutation rate [5], normally in the range 0.0001 to 0.1. Some studies have shown that a high mutation rate gives more chance of reaching the global optimum instead of a local one; however, it may destroy excellent genes if a high mutation rate is applied too early. Based on these previous studies, this research adopted a double dynamic strategy for the mutation rate p, given by the function in (7):
p = p_0 / g^β,          if (f_m − f_a)/f_m ∈ [0.1, 1.0]
p = p_0·(f_a / f_m),    if (f_m − f_a)/f_m ∈ [0.0, 0.1]        (7)

with p_0 = 0.1 and β ∈ [1, 1.5], where g is the generation index, and f_m and f_a are the maximum and average values of the fitness function in generation g, respectively. With this definition, the mutation rate p decreases dynamically with the generation counter from the initial higher rate, which enlarges the search scope and enhances the ability to find the global optimum at the beginning, while also keeping excellent genes from being destroyed as the evolution proceeds. If the best and the average fitness of a generation are close, which would otherwise terminate the evolution, the mutation rate p increases dynamically as their difference decreases, forcing individuals to jump out of the local optimum and keeping the optimization process running.
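Translated into code, the double dynamic mutation operator of Eq. (7) might look like the sketch below; p0 = 0.1 and the range of β come from the equation, while the function and argument names are chosen here for illustration.

```python
def dynamic_mutation_rate(generation, f_max, f_avg, p0=0.1, beta=1.2):
    """Double dynamic mutation rate of Eq. (7).

    generation: current generation index g (>= 1)
    f_max, f_avg: maximum and average fitness of the current generation
    beta: decay exponent, taken in [1, 1.5]
    """
    spread = (f_max - f_avg) / f_max  # relative gap between best and average fitness
    if spread >= 0.1:
        # Early, still-diverse phase: the rate decays with the generation counter
        return p0 / (generation ** beta)
    # Population nearly converged: raise the rate as the gap shrinks
    return p0 * (f_avg / f_max)

# Example: a converging population at generation 40
print(dynamic_mutation_rate(generation=40, f_max=100.0, f_avg=99.0))  # -> 0.099
```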
5 Result and Discussion

The performance of the adaptive genetic algorithm (AGA) and the simple genetic algorithm (SGA) is compared on the basis of the upstream water level operation (Fig. 4) and the objective values (Fig. 5). The results show that the AGA strategy achieves a higher upstream water level scheme for reservoir A than the SGA and, accordingly, a higher objective value, i.e. more power generation, under the same constraints. The best power generation results of the two methods are listed in Table 1; the comparison indicates that power generation increases by nearly 4% with the AGA instead of the SGA in the same hydrological year. Fig. 6 shows the flow released into the dewatered river channel. In the dry season it equals the EFD, while in the flood season the flow Q_etotal in the river
channel may increase greatly because the reservoir spills water. With this flow regime, the total weighted usable area of the fish habitat (Table 2) increases dramatically compared with conventional reservoir operation.
Fig. 4. Upstream water level operation in reservoir A under AGA and SGA

Table 1. Comparison on power generation of two optimization methods
Optimization method    Power generation (kW·h)
SGA                    4.09E10
AGA                    4.28E10
"3" "3#
C 1 $ $ !2
"3 "3 "3$ #3< #3-
) )
#
)
$$$ $$$ /$$$ $$$$ $$$$ #$$$$ /$$$$ 0 0(
Fig. 5. Total power generation under AGA and SGA
Fig. 6. Flow in dewatered river channel
Table 2. The weighted usable area of fish habitat under different reservoir operation policy

Operations      Total weighted usable area of Schizothorax chongi fish habitat (m²)
Conventional    10383631
Eco-friendly    36357703
6 Conclusion

To maintain the downstream ecosystem, the ecological flow was taken into account as a highly nonlinear constraint in the optimization of eco-friendly reservoir operation, and the GA was adapted with a dynamic mutation operator to improve its performance. The case study showed that the optimal eco-friendly reservoir operation can be obtained through the adaptive GA: a maximal power output was achieved while the ecological flow demand was met. The adaptive GA was enhanced in global searching ability and provides an adequate, effective and robust way of searching for rational reservoir operating hydrographs in an eco-friendly manner.

Acknowledgements. The authors would like to acknowledge the financial support for this research by the National Nature Science Foundation of China (50639070, 50879086).
References
1. Castelletti, A., Pianosi, F., Soncini-Sessa, R.: Water reservoir control under economic, social and environmental constraints. Automatica 44, 1595–1607 (2008)
2. Hakimi-Asiabar, M., Ghodsypour, S.H., Kerachian, R.: Deriving operating policies for multi-objective reservoir systems: Application of Self-Learning Genetic Algorithm. Appl. Soft. Comput. (2009) (in press)
3. Ngo, L.L.: A case study of the Hoa Binh reservoir, Vietnam. Ph.D. Thesis, Technical University of Denmark (2006)
4. Schaffer, J.D.: Some experiments in machine learning using vector evaluated genetic algorithms. Ph.D. Thesis, Vanderbilt University, Nashville, TN (1984)
5. Goldberg, D.E.: Genetic algorithms in search, optimization and machine learning. Addison-Wesley Publishing Co., Inc., Reading (1989)
6. Chang, F.J., Chen, L.: Real-coded genetic algorithm for rule based flood control reservoir management. Water. Resour. Manag. 12(3), 185–198 (1998)
7. Cai, X., Mckinney, D., Lasdon, L.S.: Solving nonlinear water management model using a combined genetic algorithm and linear programming approach. Adv. Water. Res. 2(6), 667–676 (2001)
8. Oliveira, R., Loucks, D.: Operating rules for multi reservoir systems. Water. Resour. Res. 33(4), 839–852 (1997)
9. Akter, T., Simonovic, S.P.: Modelling uncertainties in short term reservoir operation using fuzzy sets and a genetic algorithm. Hydrolog. Sci. J. 49(6), 1081–1097 (2004)
10. Sharif, M., Wardlaw, R.: Multi reservoir systems optimization using genetic algorithms: case study. J. Comput. Civil. Eng. 14(4), 255–263 (2000)
11. Amor, H.B., Rettinger, A.: Intelligent Exploration for Genetic Algorithms. In: GECCO 2005, Washington, DC, USA, June 25-29 (2005)
12. IUCN. Flow: The essentials of environmental flows (2003), http://www.iucn.org/dbtw-wpd/edocs/2003-021.pdf
13. Petts, G.E.: Water allocation to protect river ecosystem. Regul. River 12, 353–365 (1996)
14. Kemp, J.L., Harper, D.M., Crosa, G.A.: Use of functional habitats to link ecology with morphology and hydrology in river rehabilitation. Aquat. Conserv. 9, 159–178 (1999)
15. Li, R., Chen, Q., Chen, D.: Application of genetic algorithm to improve the fuzzy logic river habitat model. In: Proceedings of the 6th ISEH Conference, Athens, Greece (2010) (in press)
Research on the Biocompatibility of the Human Rectum and a Novel Artificial Anal Sphincter
Peng Zan, Jinyi Zhang, Yong Shao, and Banghua Yang
Department of Automation, College of Mechatronics Engineering and Automation, Shanghai University, Shanghai Key Laboratory of Power Station Automation Technology, Shanghai, China
{zanpeng,Jinyi Zhang,shaoyong,yangbanghua}@shu.edu.cn
Abstract. This paper discusses biocompatibility issues related to the human rectum and a novel artificial anal sphincter. The artificial anal sphincter system is a novel hydraulic-electric muscle for treating fecal incontinence. The high integration of all functional components and the absence of wires linking to an external device make the surgical implantation easier and lower risk. However, the human rectum is not a rigid pipe, and motion within it is further complicated by the fact that the bowel is susceptible to damage. With the goal of designing a reliable and safe instrument, the motion model between the artificial anal sphincter and the rectum is developed and the biomechanical material properties of the human rectum are analyzed. The results show that the deformation caused by the artificial anal sphincter can be controlled through the reservoir pressure, keeping it below the upper limit for human tissue ischemia. Keywords: rectum, artificial anal sphincter, biocompatibility, ischemia.
1 Introduction
Fecal incontinence is a common disease in anorectal surgery, in which the patient's ability to control defecation is impaired [1-2]. Patients often suffer serious psychological disturbance, such as reduced communication, depression, social withdrawal and fear of being found out, which makes them despondent and decreases their social adaptability [3-5]. Existing artificial anal sphincters cannot sense the quantity of feces; patients cannot control the defecation time autonomously, need to pump the liquid by hand, and the devices are too expensive for many patients. With the development of technology, people expect more from modern medicine in terms of quality of life [6-7]. This paper describes part of our ongoing effort to realize an artificial anal sphincter (AAS), a novel hydraulic-electric muscle for treating fecal incontinence. Our design aims at a high integration of all functional components with no wires linking to an external device, which makes the surgical implantation easy and low-risk [8-10]. First, the biomechanical model for the action on the intestine in the directions of radial compression and axial extension is built; secondly, the biomechanical material properties of the human rectum are analyzed; and thirdly, the predicted stress-strain relationship of
the rectum is derived. The goal of this work is to develop biocompatibility analytical models that can be used to keep the deformation imposed by the artificial anal sphincter, via the reservoir pressure, below the upper limit for human tissue ischemia. These analytical models provide us with a means to design an AAS that operates safely and reliably.
2 System Overview
As shown in Fig. 1, this system mainly comprises three modules: an AAS, an energy module and a communication module.
Fig. 1. Diagram of artificial anal sphincter system
It is composed of two components: one is implanted, and the other is placed outside the body. The AAS consists of a reservoir, an occluding cuff, and a micropump with a motor gear. The structure of the artificial anal sphincter is shown in Fig. 2. A sensor in the AAS detects the pressure of the anal canal; it can measure the pressure in the cuff and the pressure of the rectum. The cuff and the reservoir are connected by a bidirectional micropump, as shown in Fig. 2. By shifting the fluid between the reservoir and the front cuff, the sphincter can be compressed or relaxed, and thus the state of continence can be controlled by the MCU. For defecation, the fluid is pumped into the reservoir; for occlusion of the bowel, it is pumped into the front cuff. The AAS system prototype is shown in Fig. 3.
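The continence-control behaviour described above can be summarized in a small sketch; the threshold value, function names and pump interface below are purely illustrative assumptions for this example, not the actual MCU firmware.

```python
def control_step(rectal_pressure_kpa, state, pump):
    """One illustrative control cycle of the AAS.

    rectal_pressure_kpa: reading from the pressure sensor in the cuff area.
    state: 'occluded' (continence) or 'released' (defecation allowed).
    pump: object with pump_to_cuff()/pump_to_reservoir() methods (assumed API).
    """
    TRIGGER_KPA = 2.0   # ASSUMED trigger threshold, chosen only for illustration

    if state == 'occluded' and rectal_pressure_kpa >= TRIGGER_KPA:
        # Pressure building up: shift fluid to the reservoir to release the bowel.
        pump.pump_to_reservoir()
        return 'released'
    if state == 'released' and rectal_pressure_kpa < TRIGGER_KPA:
        # Bowel relaxed again: shift fluid back to the front cuff to occlude.
        pump.pump_to_cuff()
        return 'occluded'
    return state
```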
Fig. 2. Execution unit diagram
Fig. 3. The AAS system prototype
3 Biomechanical Model
3.1 Radial Compression Model
As shown in Fig. 2, when the AAS described above occludes the rectum, it deforms and stresses the rectal tissue, so an analytical model must be developed to predict the tissue behavior. The human rectum under study can be idealized as a nonlinear, axisymmetric, homogeneous, viscoelastic pressure vessel undergoing large deformations due to external axisymmetric loading distributions. Fig. 4 shows the free-body analytic model of the rectal tissue, where R is the radius of the rectum [11].
Fig. 4. Free body analytic model
The equilibrium equation for such vessels is given by:
$\dfrac{\sigma_a}{R_1} + \dfrac{\sigma_c}{R_2} = \dfrac{p}{k}$    (1)
In equation (1), $\sigma_a$ is the normal stress along a meridian, $\sigma_c$ is the circumferential stress, $R_1$ is the radius of curvature in the meridian direction, $R_2$ is the radius of curvature normal to the meridian, p is the internal pressure of the rectum, and k is the thickness of the rectum. For the rectal tissue seen in Fig. 2, the meridional curvature is zero; hence, equation (1) reduces to:
$\sigma_c = \dfrac{pR}{k}$    (2)
The stress-strain relationship of the rectal tissue can be expressed as follows:

$\varepsilon_c = H(\sigma_c)$    (3)
where H(·) denotes the biomechanical constitutive relationship in the radial direction. The relationship between p and the rectal deformation ΔR can therefore be written as:

$p = \dfrac{k}{R} \cdot H^{-1}\!\left(\dfrac{\Delta R}{R}\right)$    (4)
3.2 Axial Extension Model
When the AAS occludes the rectum, the rectum tissue will extend in the axial direction, as shown in Fig. 5.
Fig. 5. Deformation of rectum in the axial direction
According to the free body balance equation, we can deduce the following equation:
$V = V'$    (5)
where V and V′ are the volumes before and after the rectum is occluded. The other two equations can be derived as follows:

$V = 2\pi \int_0^{R} p\, r\, dr$    (6)
$V' = 2\pi R \sigma_a k \sin\dfrac{\pi}{2}$    (7)
Using equations (5)~(7), we can easily get σ a on the section of the rectum:
$\sigma_a = \dfrac{pR}{2k}$    (8)
The stress-strain relationship of the rectal tissue can be expressed as follows:

$\varepsilon_a = J(\sigma_a)$    (9)
where J(·) denotes the biomechanical constitutive relationship in the axial direction. The relationship between p and the axial deformation $L_0$ of the rectum can therefore be written as:

$p = \dfrac{2k}{R} \cdot J^{-1}\!\left(\dfrac{L_0}{L}\right)$    (10)
Using (4) and (10), we can find that the deformation of the rectum in the radial and axial direction can be controlled by the variation of the AAS cuff pressure.
4 Material Properties of the Human Rectum
The energy function of the viscoelastic rectum is expressed as follows:

$\rho_0 W = \dfrac{c'}{2} \exp\!\left(a E_l^2\right)$    (11)
This function is a one-dimensional special case of the two-dimensional viscoelastic strain energy function [12]. c′ and a are material constants, and $\rho_0$ is the density of the tissue before deformation. $E_l$ is the Green strain, expressed as:
$E_l = \dfrac{1}{2}\left(\lambda_l^2 - 1\right)$    (12)

and $\lambda_l$ is:

$\lambda_l = \dfrac{\Delta R}{R} \quad \text{or} \quad \lambda_l = \dfrac{L_0}{L}$    (13)
$\sigma_l$ and $S_l$ are the stresses defined in the Cauchy-Euler sense and in the Kirchhoff sense, respectively:

$\sigma_l = \sigma_c \quad \text{or} \quad \sigma_l = \sigma_a$    (14)

$S_l = \dfrac{\rho_0}{\rho} \dfrac{1}{\lambda_l^2} \sigma_l$    (15)
$\rho$ is the density of the tissue after deformation. Because

$S_l = \dfrac{d(\rho_0 W)}{d E_l}$    (16)
substituting equation (14) into equation (15) gives $S_l$. $S_l$ is then substituted into the left side of equation (16), and equation (11) into the right side. Equation (16) then simplifies to the following equations:
$\dfrac{c'a}{2}\left(\lambda_l^2 - 1\right)\exp\!\left[\dfrac{a}{4}\left(\lambda_l^2 - 1\right)^2\right]\lambda_l^2 = \dfrac{\rho_0}{\rho}\,\dfrac{pR}{k}$    (17)

$\dfrac{c'a}{2}\left(\lambda_l^2 - 1\right)\exp\!\left[\dfrac{a}{4}\left(\lambda_l^2 - 1\right)^2\right]\lambda_l^2 = \dfrac{\rho_0}{\rho}\,\dfrac{pR}{2k}$    (18)
5 Results and Discussion
Using (12)-(18) with c′ = 0.05 [13], the constitutive relationship of the human rectum can be derived, as shown in Fig. 6 (a), (b). In Fig. 6 (a), (b), the stress increases as the strain rises, and the relationship between them is strongly nonlinear. The stresses are 120 g/cm² and 180 g/cm², respectively, when the strain reaches 0.30. Based on studies of rectal anatomical structure, we assume that L = 100 mm, R = 12.5 mm and k = 3 mm, and that the rectum is occluded by a homogeneous pressure. Using (4) and (10), the curve of rectal pressure versus rectal deformation can be calculated, as shown in Fig. 7 (a), (b).
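The curves of Fig. 6 and Fig. 7 can be reproduced in outline from (11)-(16); the sketch below assumes a value for the material constant a (which is not quoted in the text) and ρ0/ρ ≈ 1, so it is only an illustration of the computation, not the authors' exact parameterization.

```python
import numpy as np

# Geometry from the text: L = 100 mm, R = 12.5 mm, k = 3 mm; c' = 0.05 [13].
R, L, k = 12.5, 100.0, 3.0
c_prime = 0.05
a = 1.0            # ASSUMED: the material constant a is not given in the text
rho_ratio = 1.0    # ASSUMED: rho_0 / rho ~ 1 (nearly incompressible tissue)

lam = np.linspace(0.01, 0.6, 200)            # stretch ratio lambda_l, Eq. (13)
E_l = 0.5 * (lam ** 2 - 1.0)                 # Green strain, Eq. (12)

# Kirchhoff stress from the strain energy function, S_l = d(rho_0 W)/dE_l, Eq. (16)
S_l = c_prime * a * E_l * np.exp(a * E_l ** 2)

# Cauchy-type stress sigma_l from Eq. (15): S_l = (rho_0/rho) * sigma_l / lambda_l^2
sigma_l = S_l * lam ** 2 / rho_ratio

# Rectal pressure for the radial (Eq. (2)) and axial (Eq. (8)) cases
p_radial = sigma_l * k / R          # sigma_c = pR/k   ->  p = sigma_c * k / R
p_axial = sigma_l * 2.0 * k / R     # sigma_a = pR/(2k) -> p = 2k * sigma_a / R

# Corresponding deformations as written in Eq. (13)
delta_R = lam * R                   # radial deformation
L0 = lam * L                        # axial deformation
```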
(a) radial direction
(b) axial direction
Fig. 6. Predicted stress-strain curves of rectum
(a) radial direction
(b) axial direction
Fig. 7. Rectal endured pressure-rectal deformation curve
In Fig. 7 (a), (b), the rectal deformation increases as the occluding pressure rises, and the relationship between them is strongly nonlinear. The deformations are 7 mm and 12 mm in the radial and axial directions, respectively, when the rectal pressure reaches 2 kPa. In general, according to the curves in Fig. 7 (a), (b), the rectal deformation can be controlled through the pressure of the occluding reservoir.
6 Conclusion
A novel artificial anal sphincter system for fecal incontinence has been developed, and the AAS prototype has been manufactured; its basic functions have been tested successfully. The paper presents an important biomechanical model, comprising a radial compression model and an axial extension model, which must be considered in the design of a novel artificial anal sphincter. The material properties of the human rectum are also considered. Using the model, the design of our prototype can be improved so that it operates within safe limits. According to the model and the analysis of tissue ischemia [10], a cut-off pressure of 2 kPa is appropriate for our design. Future work will include animal experiments and research on rebuilding the sensory function of the rectum.
Acknowledgments. This work was supported by the "Shanghai University '11th Five-Year Plan' 211 Construction Project", the Innovation Fund of Shanghai University, the Scientific Special Research Fund for Training Excellent Young Teachers in Higher Education Institutions of Shanghai (No. shu10052), and the National Natural Science Foundation of China (No. 60975079).
References 1. Kang, S.B., Kim, N., Lee, K.H.: Anal sphincter asymmetry in anal incontinence after restorative proctectomy for rectal cancer. World Journal of Surgery 32(9), 2083–2088 (2008) 2. Frudinger, A., Schwaiger, W., Pfeifer, J.: Adult stem cells to the treatment with anal incontinence after perineral laceration 3 or 4 pilot study. Geburtshilfe Und Frauenheilkunde 68 (suppl. 1), 19 (2008)
3. Vaccaro, C., Clemons, J.L.: Anal sphincter defects and anal incontinence symptoms after repair of obstetric anal sphincter lacerations in primiparous women. In: 28th Annual Meeting of the American-Urogynecologic-Society, Hollywood, pp. 1503–1508 (2007) 4. Faucheron, J.L.: Anal incontinence. Presse Medicale 37(10), 1447–1462 (2008) 5. Long, M.A.: Fecal incontinence management systems. Rehabilitation Nursing 33(2), 49–51 (2008) 6. Vaizey, C.J., Kamm, M.A., Gold, D.M.: Clinical, physiological, and radiological study of a new purpose-designed artificial bowel sphincter. Lancet 352(9122), 105–109 (1998) 7. Fassi-Fehri, H., Dinia, E.M., Genevoix, S.: AMS 800 artificial urinary sphincter implantation: Can the penoscrotal approach constitute an alternative to the perineal approach? Progres En Urologie 18(3), 177–182 (2008) 8. Peng, Z., Guozheng, Y., Hua, L.: Analysis of Electromagnetic Compatibility in Biological Tissue for a Novel Artificial Anal Sphincter. IET Science, Measurement & Technology 3(1), 22–26 (2009) 9. Peng, Z., Guozheng, Y., Hua, L.: Adaptive Transcutaneous Power Delivery for Artificial Anal Sphincter System. Journal of Medical Engineering & Technology 33(2), 136–141 (2009) 10. Peng, Z., Guozheng, Y., Hua, L.: Modeling of Human Colonic Blood Flow for a Novel Artificial Anal Sphincter System. Journal of Zhejiang University-Science B 9(9), 734–738 (2008) 11. Jorgensen, C.S., Dall, F.H., Jensen, S.L.: A new combined high-frequency ultrasoundimpedance planimetry measuring system for the quantification of organ wall biomechanics in vivo. Journal of Biomechanics 28(7), 863–867 (1995) 12. Fung, Y.C., Fronek, K., Patitucci, P.: Pseudoelasticity of arteries and the choice of its mathematical expression. American Journal of Physiology 237(5), 620–631 (1979) 13. Hoeg, H.D., Slatkin, A.B., Burdick, J.W.: Biomechanical modeling of the small intestine as required for design and operation of robotic endoscope. In: IEEE International Conference on Robotics & Automation, San Francisco, pp. 750–753 (2000)
A Medical Tracking System for Contrast Media
Chuan Dai 1, Zhelong Wang 2,1, and Hongyu Zhao 1
1 Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, 116024, China
[email protected]
2 Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China
Abstract. Contrast media are chemical substances used to improve the image quality of computed tomography. However, because of the high injection speed, complications such as capillary hemorrhage can occur. In view of this problem, a video object tracking system is implemented to monitor the injection site. A color feature is extracted from the image sequences and used by the mean shift tracking algorithm. The experimental results show that the tracking system is real-time, robust and efficient. Keywords: Tracking, Visual tracking, Color histogram, Mean shift algorithm.
1 Introduction
Computed tomography (CT) is a medical imaging method employing tomography created by computer processing. CT produces a volume of data that can be manipulated to demonstrate various bodily structures based on their ability to block the X-ray beam. Because contrast CT scans rely on intravenously administered contrast agents to provide superior image quality, there is a low but non-negligible level of risk associated with the contrast media injection, such as capillary hemorrhage and needle eruption. Moreover, in most cases these injuries cannot be noticed by the patients themselves. At present in China, a common countermeasure is to let family members of the patient wear X-ray protective aprons before going into the CT room. However, X-ray protective aprons cannot completely shield the human body from X-ray damage. Therefore, a medical tracking system for contrast media injection is proposed in this paper. Visual perception is the most important human sense, and images are the basis of vision. Extracting certain features and identifying a specific object in an image is an important aspect of image processing. After decades of development, visual tracking is widely used in medical diagnosis, military affairs, entertainment and scientific research [1-5]. Current visual tracking methods include the Kalman filter, the particle filter, optical flow and so on. The standard Kalman filter obtains good results in linear Gaussian systems, but the tracking system in our project is non-linear and non-Gaussian; the particle filter can deal with non-linear systems, but particle degeneracy is difficult to overcome; and the computation of optical flow is too intensive for a system with real-time requirements. Thus, the mean shift tracking algorithm is chosen in this paper.
The mean shift algorithm is a robust method for finding local extrema in the density distribution of a data set. This is an easy process for continuous distributions; in that context, it is essentially just hill climbing applied to a density histogram of the data. For discrete data sets, however, it is a somewhat less trivial problem [6]. When mean shift is used for visual tracking, a confidence map is created in the new image based on the color histogram of the object in the previous image, and the peak of the confidence map near the object's previous position is found. In this way, the mean shift algorithm greatly reduces the search area and the computational complexity of the tracking system, which ensures good real-time performance. The tracking system is composed of a rotational station, a camera and a computer. The color feature is extracted, and the mean shift algorithm is employed. The remainder of this paper is organized as follows. Section 2 explains how to extract the color feature. Section 3 describes the tracking algorithm. Experimental results and discussion are given in Section 4, and conclusions and future work are presented in the final section.
2 Feature Extraction
Different object tracking systems should choose different features according to the actual situation. The tracking and monitoring target in this study is the injection site on the arm. However, different doctors paste the medical adhesive tape in different ways, so it is difficult to extract features from the injected area directly. Furthermore, the morphological characteristics of the injection site change if blood oozes out, which makes the initially extracted features useless. Therefore, this paper tracks the region adjacent to the needle tip of the contrast injector using a colored marker. Because the environment of a CT room is relatively stable, the object always appears in the same color, so a marker with a special color (red in this paper) was used. The color feature of the object is represented by a color histogram, which is widely used because it describes the color distribution as a probability distribution and is insensitive to target rotation and scale changes. Histograms are used in many computer vision applications. Histogram-based methods are very efficient compared with other image segmentation methods because they typically require only one pass through the pixels. They are used to detect scene transitions in videos by marking when the edge and color statistics change markedly from frame to frame, and to identify interest points in images by assigning each interest point a "tag" consisting of histograms of nearby features [6]. A color histogram describes the proportion of each color in the whole image regardless of the color location in HSV space [7]; it is therefore particularly suitable for describing images that are difficult to segment morphologically. The data shown in a histogram are obtained statistically and reflect the statistical distribution and basic tone of the image colors. There is a one-to-many relationship between histograms and images: one image has only one corresponding color histogram, but one color histogram may represent several images. Because color is very sensitive to light intensity, the images taken by the camera are converted from the RGB model to the HSV color space in this paper, in which the
H component stands for hue. The color distribution in the object region is represented by matrix after color space transformation and statistic of pixels. Let x be a state vector of the tracked object[8], R (x) be the object area, u = {h, s, v} be a pixel of the object area,
bi (u ) ∈ {1,..., N } be a mapping function to determine which statistical
color section the pixel belongs to, with N as the max value of the histogram longitudinal axis. The color probability histogram can be calculated as follows:
$q(n; x) = k \sum_{u \in R(x)} \delta\left[b_i(u) - n\right]$    (1)

where $\delta(t)$ is an impulse function:

$\delta(t) = \begin{cases} 1, & t = 0 \\ 0, & t \neq 0 \end{cases}$    (2)
k is a normalization coefficient that keeps the histogram a probability distribution, and n ∈ {1, ..., N} is the bin index on the lateral axis of the histogram. To make the color probability histogram more specific to the object, a kernel function, i.e., a weight adjustment based on the relative positions of the pixels, is introduced in this paper [8]:
$q(n; x) = k \sum_{u \in R(x)} w\left[d(u)\right] \delta\left[b_i(u) - n\right]$    (3)

where d(u) is the distance from pixel u to the center $u_0$ of the template, and w(t) is a decreasing function defined as follows:

$w(t) = \alpha e^{-\beta t^2}$    (4)
As is shown in the experiment [8], it has a good performance when α =100, β =20. In this way, pixels in the edge of the object area can be ignored in the process of the histogram calculation, and more attention will be paid to the pixels in the center. A color histogram of a rectangle area in an image is shown in Fig. 1.
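A direct numerical reading of (1)-(4) is sketched below; the 16-bin quantization of the H channel, the normalization of d(u) by the region half-diagonal, and the helper names are assumptions made for this example.

```python
import numpy as np

def weighted_hue_histogram(hsv_roi, n_bins=16, alpha=100.0, beta=20.0):
    """Color probability histogram of Eqs. (1)-(4) for an HSV object region.

    hsv_roi: H-by-W-by-3 array of the object area (HSV), H channel in [0, 180).
    Each pixel is mapped to a hue bin b(u); its vote is weighted by
    w(d) = alpha * exp(-beta * d^2), d being the distance of the pixel to the
    region center (normalized here by the region half-diagonal).
    """
    h, w = hsv_roi.shape[:2]
    hue = hsv_roi[..., 0].astype(float)
    bins = np.minimum((hue / 180.0 * n_bins).astype(int), n_bins - 1)   # b(u)

    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    d = np.hypot(yy - cy, xx - cx) / (np.hypot(cy, cx) + 1e-9)          # d(u)
    weight = alpha * np.exp(-beta * d ** 2)                             # Eq. (4)

    hist = np.bincount(bins.ravel(), weights=weight.ravel(), minlength=n_bins)
    return hist / hist.sum()            # k normalizes to a probability distribution
```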
Fig. 1. Color histogram
3 Tracking Algorithm
In this paper, the mean shift algorithm is used. The probability density of the pixels is calculated first, and then the search window is moved toward the highest probability density until it finally converges [9]. Given n points $\{x_i\}_{i=1...n}$ in the d-dimensional Euclidean space $R^d$ [10], the kernel density estimate can be computed by [11]:
$\hat{f}(x) = \dfrac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\dfrac{x - x_i}{h}\right)$    (5)
where h is the window width, and K(x) is the Epanechnikov kernel function, one of the optimal kernel functions in the minimum mean square error (MMSE) sense:

$K_E(x) = \begin{cases} \dfrac{1}{2} c_d^{-1} (d + 2)\left(1 - x^T x\right), & x^T x < 1 \\ 0, & \text{otherwise} \end{cases}$    (6)

where
cd is the volume of a unit sphere in d-dimensional Euclidean space.
Mean shift vector is defined as the difference between local mean and window center, and points to the peak or valley of the density estimation function[11].
$M_h(x) = \dfrac{1}{n_x} \sum_{x_i \in S_h(x)} \left[x_i - x\right] = \dfrac{h^2}{d + 2} \dfrac{\nabla \hat{f}(x)}{\hat{f}(x)}$    (7)

where $S_h(x)$ is a hypersphere with radius h, volume $h^d c_d$ and center x, containing $n_x$ points, and $\nabla \hat{f}(x)$ is the gradient of the kernel density estimate.
According to the above interpretation, the mean shift algorithm can be viewed as consisting of the following steps:
• Step 1: Calculate the mean shift vector $M_h(x)$;
• Step 2: Move the search window according to the vector;
• Step 3: Repeat Step 2 until $M_h(x)$ converges to a threshold (near zero).
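These three steps can be written out directly; the sketch below applies (7) to a 2-D point set with a uniform window, purely to illustrate the iteration (the data layout and names are invented for the example).

```python
import numpy as np

def mean_shift_mode(points, x0, h, eps=1e-3, max_iter=100):
    """Hill-climb x toward a density mode using the mean shift vector of Eq. (7).

    points: (n, d) array of samples; x0: initial window center; h: window radius.
    With the Epanechnikov kernel the shift reduces to the difference between the
    sample mean inside the hypersphere S_h(x) and the current center x.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        inside = np.linalg.norm(points - x, axis=1) < h      # S_h(x)
        if not inside.any():
            break
        m = points[inside].mean(axis=0) - x                  # M_h(x), Eq. (7)
        x = x + m                                            # move the window
        if np.linalg.norm(m) < eps:                          # converged
            break
    return x
```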
In practical applications, the mean shift algorithm needs the initial position of the tracked object. So, in the tracking process, the initial search window, including its location, type and size, is chosen first. The mean shift algorithm then runs as follows [6]:
• Step 1: Choose a search window: its initial location; its type (uniform, polynomial, exponential, or Gaussian); its shape (symmetric or skewed, possibly rotated, rounded or rectangular);
• Step 2: Compute the window's (possibly weighted) center of mass;
• Step 3: Center the window at the center of mass;
• Step 4: Return to Step 2 until the window stops moving (it always will).
The mean shift algorithm keeps moving the search window toward the object area iteratively within a small range, which makes it superior to a general global search algorithm. When the mean shift algorithm is extended to a continuous image sequence, i.e., the mean shift vector is computed to move the track window toward the target at each frame, it is also referred to as the CamShift algorithm [12-13]. The following five steps describe the final tracking algorithm:
• Step 1: Initialize the location and size of the track window;
• Step 2: Calculate the color probability distribution within the search window (as shown in Section 2);
• Step 3: Compute the color probability distribution of the search window in the next frame;
• Step 4: Use the mean shift algorithm to obtain the new location and size of the track window, and control the rotation of the pan-tilt;
• Step 5: Use the new track window in the next frame, and repeat from Step 3.
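A compact version of this five-step loop can be written with OpenCV's CamShift, the algorithm the paper cites through [6] and [12-13]; the camera index, bin count and ROI-selection call below are assumptions for this example, and the pan-tilt command of Step 4 is only indicated by a placeholder comment.

```python
import cv2

cap = cv2.VideoCapture(0)                       # live camera feed (index assumed)
ok, frame = cap.read()
x, y, w, h = cv2.selectROI("init", frame)       # Step 1: initial track window
track_window = (x, y, w, h)

# Step 2: color probability distribution (hue histogram) of the initial window
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv_roi], [0], None, [16], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Step 3: back-project the histogram onto the new frame
    back_proj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    # Step 4: CamShift gives the new location/size of the track window
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, term)
    # ...here the rotational station would be commanded from track_window...
    cv2.ellipse(frame, rot_rect, (0, 255, 0), 2)   # draw the tracking ellipse
    cv2.imshow("tracking", frame)
    if cv2.waitKey(40) == 27:                      # Step 5: next frame; ESC quits
        break
```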
4 Experimental Results and Discussion
In this section, the experimental results are shown. The hardware system includes an SCC-C4233 digital color camera made by SAMSUNG and a YD3040 rotational station made by YAAN. The software is implemented in C++ and, running on a standard 2.66 GHz desktop PC with a Microvision MV-400 video capture card, performs in real time at 25 frames/second on the live camera feed. At the beginning of the tracking procedure, the original tracking position is determined in the first frame of the image sequence.
Fig. 2. These images show the selected eight image frames from the image sequences of hundreds of frames. The green ellipse in the figure is the tracking result.
As the laboratory environment is similar to the simple surroundings of a CT room, the experiment was carried out in the laboratory. First, a rectangle is chosen as the initial position in the image sequence. Once the initial position is determined, the tracking system begins to work. The marker we used is a medical adhesive plaster adjacent to the injection site; to avoid disturbance by other objects, red was chosen in this experiment. The green ellipses in the images show the estimated result, as shown in Fig. 2. The images demonstrate the robustness of the tracking system, as there is no violent shaking of the track window when the object size changes. The YD3040 is a uniform rotational station and its rotation speed is higher than that of most CT beds, so the tracking system can easily meet the real-time requirement. In practical applications, the color of a patient's clothes may be very similar to the marker, which would cause tracking failure; in that case, a different color should be chosen for the marker. Acknowledgments. This work was supported in part by the China Postdoctoral Science Foundation (no. 20080441102) and Special Public Sector Research Funds for Earthquake Study (no. 200808075).
References 1. Comanniciu, D., Ramesh, V., Meer, P.: Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 25(5), 564–577 (2003) 2. Hager, G.D., Belhumeur, P.N.: Efficient region tracking with parametric models of geometry and illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(10), 1025–1039 (1998) 3. Tissainayagam, P., Suter, D.: Object tracking in image sequences using point feature. Pattern Recognition 38(1), 105–113 (2005) 4. Coifmana, B., Beymerb, D., McLauchlanb, P., Malikb, J.: A real-time computer vision system for vehicle tracking and traffic surveillance. Transportation Research, 271–288 (1998) 5. Clady, X., Collunge, F., Jurie, F., Martinet, P.: Object Tracking with a Pan-Tilt-Zoom Camera:application to car driving assistance. In: Proceedings of the IEEE International Conference on Robotics & Automation, pp. 1653–1658 (February 2001) 6. Bradski, G., Kaehler, A.: Learning OpenCV: Computer Vision with the OpenCV Library, 1st edn. O’Reilly Media, Sebastopol (2008) 7. David, M., Tourahimi, E.: Orientation histogram-based matching for region tracking. In: Eight International Workshop on Image Analysis for Multimedia Interactive Services, pp. 6–8 (2007) 8. Xiehua, Z., Shen, Z., Min, T.: Research on the ‘Selective Color Histogram’ for Moving Object Tracking. Journal of Chinese Computer Systems 30, 1864–1868 (1998) 9. Yuan, X.: Tracking Moving People Based on the MeanShift Algorithm. Computer Engineering & Science 30(4), 46–49 (2008) 10. Han, H., Zhi, J.W., Jiao, L.C., Chen, Z.P.: Data Association for Multiple Targets Based on MeanShift in Clutter. Journal of System Simulation 21(11), 3351–3355 (2009) 11. Bradski, G.R.: Computer vision face tracking for use in a perceptual user interface. Intel Technology Journal Q2, 1–15 (1998)
12. Yuan, F.N.: A fast accumulative motion orientation model based on integral image for video smoke detection. Pattern Recognition Letters 29, 925–932 (2008) 13. Marimon, D., Ebrahimi, T.: Orientation histogram-based matching for region tracking. In: Eight International Workshop on Image Analysis for Multimedia Interactive Services, pp. 6–8 (June2007) 14. Spengler, M., Schiele, B.: Towards robust multi-cue integration for visual tracking. Machine Vision and Applications 14(1), 50–58 (2003) 15. Wang, H., Wang, J.T., Ren, M.W., Yang, J.Y.: A New Robust Object Tracking Algorithm by Fusing Multi-features. Journal of Image and Graphics 14(3), 489–498 (2009) 16. Sanjeev Arulampalam, M., Maskell, S., Gordon, N., Clapp, T.: A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Process. 50(2), 174–188 (2002) 17. Mokhtarian, F., Suomela, R.: Robust Image Corner Detection Through Curvature Scale Space. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(12), 1376–1381 (1998)
Rapid Planning Method for Robot Assisted Minimally Invasive Surgery
Yanhua Cheng, Chun Gong, Can Tang, Jianwei Zhang, and Sheng Cheng
Kunshan Industrial Technology Research Institute, Jiangsu, P.R. China
{chengyanhua,gongchun,tangcan,zhangjianwei}@ksitri.com
[email protected]
Abstract. Traditional space mapping and surgical planning methods are time-consuming, and their positioning accuracy is not high. This paper presents a practical and fast planning method. For the visual orientation of spatial locations, a MicronTracker camera and self-calibration templates are used for positioning; for tracking and locating the four markers on the patient and on the robot's template, their coordinates are extracted automatically; and for DICOM medical image processing, the contour of the tumor is extracted automatically by means of a seed filling algorithm, a contour tracking algorithm and B-spline fitting. Coordinate transformation from the image space to the camera space and then to the robot space can be completed rapidly and precisely with this method. Experimental results show that the planning time for the entire operation can be reduced from the traditional 25-30 minutes to 5 minutes, and that the space mapping accuracy can be improved from the traditional 5 mm to 4 mm. Keywords: Space Mapping, Computer Assisted Surgery, Robot system, Surgical Planning, Contour Extraction.
1 Introduction
With the advancement of technology and living standards, medical surgical robots have developed rapidly in recent years [1-4]. In robot-assisted minimally invasive surgery, preoperative surgical planning is an important part of the whole operation [5-8]. It includes extracting the tumor contours from the CT/MRI, choosing an appropriate path and controlling the posture of the surgical robotic arm. During surgical planning, the coordinates of the markers must be transformed from the image space to the robot space [9-10], which requires camera calibration, calibration of the markers on the patient and calibration of the markers in the image space. There are currently two methods for camera calibration. The first is to pick points in the image by hand, using a CCD visual positioning system and a self-made "calibration block". The second is to pick points in the image by hand, using the CCD visual positioning system, and to calibrate by moving the robot manipulator end-tip to N (N > 6) points [11]. But the two methods still
have shortcomings. First, once the camera has been calibrated it cannot be moved, or it must be calibrated again whenever it is moved; since the camera needs to be re-calibrated before every operation, the procedure is cumbersome and time-consuming. Second, such a positioning system can only locate static objects and cannot track moving objects. Third, the accuracy of the positioning system is limited by multiple factors, such as the placement of the camera, the number of reference points, the positioned points and the relative positions of the reference points. There are also two traditional methods for calibrating the marker points. The first, similar to the second camera calibration method, uses electrode patches for calibration. The second uses calibration needles, based on an optical positioning system and electrodes [11]. The second method is faster than the first, but it cannot fundamentally improve the calibration accuracy, and both methods share the same shortcomings. First, the tip at the end of the manipulator needs to contact the calibration points one by one, and the robot pose sometimes has to be adjusted repeatedly, so the calibration procedure is time-consuming. Second, when the calibration needle approaches a marker, it is difficult to ensure that its axis passes through the geometric center of the electrode, because the electrode surfaces are not spherical. Moreover, it is difficult to determine the geometric center in the image space; together with hand jitter and the instability of joystick control, this results in low calibration accuracy. To address these shortcomings, this paper presents an optical positioning system for static calibration. The camera calibration is completed using a proprietary library, so the calibration process is simple and highly precise, and the accurate coordinates of the markers in the camera space and the image space are extracted automatically. Planning time is further saved by extracting the tumor contour automatically instead of manually.
2 Basic Components of the System
The system is composed of the MicronTracker camera (see Fig. 1), the marker (see Fig. 2), the human model (see Fig. 3), the robotic arm and the computer planning software. Space mapping involves the image space, the camera space and the surgical (robot) space. In the surgical planning process, the first step is to extract the tumor contour in the clear axial planes, then perform the 3D reconstruction and choose a suitable path for the surgery with the robot. This requires coordinate transformation from the image space to the robot space. The spaces are defined as follows. Image space: the origin is the bottom-left corner of the first CT/MRI slice, the y-axis and x-axis are the height and width of the CT/MRI respectively, and the direction of increasing CT/MRI slice number is the z-axis. Camera space: the origin is the midpoint between the two cameras, the x-axis lies along the line connecting the two cameras, and the z-axis is perpendicular to the xy plane, which contains the lenses. Robot space: an XYZ coordinate system is set up using the markers close to the patient; the markers are usually attached where they can easily be captured by the camera.
Fig. 1. MicronTracker
Fig. 2. Marker
Fig. 3. Human Model
3 Surgical Planning Based on Automatic Extraction
3.1 Image Space Planning
3.1.1 Automatic Extraction of the Tumor Contour
Extracting the tumor contour is an important part of the planning. Previously, the doctor had to draw it manually in the clearer axial planes of the CT/MRI: the doctor analyzed the location of the tumor relying on experience, marked the points of the tumor edge with the mouse, and then used B-spline fitting [12][13] to form a closed polygon. The operation is relatively cumbersome and time-consuming because it has to be repeated in every axial plane containing the tumor. This article presents an approach to extract the tumor contour automatically based on a seed fill function [14][15]. The method selects a point in the tumor area with the mouse, fills the points whose gray values lie within a threshold of the seed with another color, and then draws the regional contour after finding the tumor boundary with a contour tracking algorithm (see Fig. 4). The method described here has the following advantages. First, the operation is easy: the former method required careful selection of points around the tumor area to form a closed region, whereas this method only requires clicking one point inside the tumor in the axial plane with the mouse, which is simple and fast. Second, this method saves time: experiments show that manual extraction over 30 DICOM slices takes 20 s per slice plus a 5 s interval, i.e., 750 s or 12.5 minutes, whereas automatic extraction over the same 30 slices takes 2 s per slice plus a 5 s interval, i.e., 210 s or 3.5 minutes, which greatly improves the efficiency of surgical planning for doctors.
3.1.2 Establishment of the Affine Coordinate System in the Image Space
This paper uses the following method to establish the affine coordinate system of the image space. Four boundary points of the cylindrical piece are found on two adjacent CT/MRI slices. Clearly, the center of the cylinder top can be obtained from these four points; as shown in Fig. 5, the center falls on the midpoint of the connections of AB and CD. By the same method, the coordinates of all marker points in the image coordinate system can be obtained. One of the identified points can be selected as the origin for establishing the affine coordinate system and the affine coordinate matrix AI in the image space.
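A minimal sketch of the seed-fill plus contour-tracking idea of Section 3.1.1 is given below using OpenCV; the gray-value tolerance and the function names are assumptions of this example, and the B-spline smoothing step is omitted.

```python
import cv2
import numpy as np

def extract_tumor_contour(slice_gray, seed, tolerance=10):
    """Grow a region from one clicked point and return its outer contour.

    slice_gray: 8-bit grayscale CT/MRI slice; seed: (x, y) clicked inside the
    tumor; tolerance: ASSUMED gray-level threshold around the seed value.
    """
    h, w = slice_gray.shape
    mask = np.zeros((h + 2, w + 2), np.uint8)           # floodFill needs +2 border
    flags = 4 | cv2.FLOODFILL_MASK_ONLY | (255 << 8)    # fill the mask, not the image
    cv2.floodFill(slice_gray.copy(), mask, seed, 255,
                  (tolerance,) * 3, (tolerance,) * 3, flags)
    region = mask[1:-1, 1:-1]                           # filled tumor region
    contours, _ = cv2.findContours(region, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)   # OpenCV 4.x signature
    return max(contours, key=cv2.contourArea) if contours else None
```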
Fig. 4. Tumor contour
Fig. 5. Top center of a circle cylinder
Fig. 6. Calibration with needle
The previous method used electrodes for calibration, and it is difficult to identify the geometric center of an electrode in the image: because there is a gap between slices when the electrode is sectioned, the center may lie between two CT/MRI slices, introducing errors into the electrode center coordinates in the image and into the image-space affine coordinate system, so the calibration accuracy is not high. Compared with the previous methods, the present method has the following advantage: the diameter of the cylinder on the marker is larger than the slice spacing and smaller than twice the slice spacing, so three or four points on two adjacent CT/MRI slices can always be found to determine the exact coordinates of the circle center. This method has a smaller error than previous ones, and the calibration accuracy is improved by establishing the affine coordinate system in the image space.
3.2 Establishment of the Affine Coordinate System in the Camera Space
In this paper, the affine coordinate system in the camera space is created by extracting the marker coordinates automatically. The position coordinates of a marker can be obtained by calculating the end-point position of the extracted long vector of the marker, because the short vector and the long vector share a common point: as shown in Fig. 2, AB is the short vector, BC is the long vector, and point B is the common point. Therefore, a distance-detection method can be used to determine the coordinates of the marker in the camera coordinate system. The coordinates of the other three markers in the camera coordinate system can be captured by the same method. Selecting one of them as the origin, the affine coordinates and the affine coordinate matrix Ac can be established. The specific extraction steps are shown below:
Fig. 7. The extraction flow chart of the marker Coordinates
In the past, the method used to establish the affine coordinates in the camera space was usually based on an optical positioning system and electrode films, and used calibration needles: a handheld device equipped with a template touches the markers with its tip (see Fig. 6), and the marker coordinates in the camera coordinate system are calculated through the geometric relationship
between the needle and the template. The shortcomings of this method are: first, the needle with the template must be brought close to the calibration points by hand, so the calibration is slow and time-consuming; second, because of hand shaking, the calibration precision is low; third, because the electrode surface is not spherical, it is difficult to ensure that the needle axis passes through the geometric center of the electrode when the needle approaches the marker. Compared with the previous method, the method proposed in this article has the following advantages. Firstly, automatically extracting the four center coordinates of the markers attached to the patient with the MicronTracker saves time; secondly, because the marker coordinates are extracted by the camera, there is no jitter caused by the hand and the calibration accuracy is improved; thirdly, the MicronTracker camera obtains precise coordinates of the marker centers, so the precision is higher and the error caused by the electrodes' own characteristics is diminished. In summary, this method saves calibration time and improves the accuracy of the calibration and of the surgical planning.
3.3 Space Mapping for Robot-Assisted Surgery
There is no relative movement between the markers affixed to the human model, so their affine coordinate matrices are identical in the camera coordinate system and in the image coordinate system. The space mapping is realized by obtaining the affine coordinate system and the origin of coordinates in the base coordinate system. The implementation process is as follows. Equations (1) and (2) give the affine coordinate values of an arbitrary point in the camera coordinate system and in the image coordinate system, respectively:

$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = A_c^{-1} \begin{bmatrix} x_c - x_{oc} \\ y_c - y_{oc} \\ z_c - z_{oc} \end{bmatrix}$    (1)

$\begin{bmatrix} x'' \\ y'' \\ z'' \end{bmatrix} = A_I^{-1} \begin{bmatrix} x_I - x_{OI} \\ y_I - y_{OI} \\ z_I - z_{OI} \end{bmatrix}$    (2)
where $(X_C, Y_C, Z_C)$ and $(X_I, Y_I, Z_I)$ are the points in the camera coordinate system and the image coordinate system respectively, $(X_{OC}, Y_{OC}, Z_{OC})$ and $(X_{OI}, Y_{OI}, Z_{OI})$ are the origins of the camera coordinate system and the image coordinate system respectively, $(X', Y', Z')$ are the affine coordinate values of $(X_C, Y_C, Z_C)$ with respect to the camera coordinate system, and $(X'', Y'', Z'')$ are the affine coordinate values of $(X_I, Y_I, Z_I)$ with respect to the image coordinate system. In theory $(X', Y', Z')$ and $(X'', Y'', Z'')$ are equal, because they describe the same physical point in different coordinate systems. Merging equations (1) and (2) gives equation (3):

$\begin{bmatrix} x_I \\ y_I \\ z_I \end{bmatrix} = A_I A_c^{-1} \begin{bmatrix} x_c - x_{oc} \\ y_c - y_{oc} \\ z_c - z_{oc} \end{bmatrix} + \begin{bmatrix} x_{oI} \\ y_{oI} \\ z_{oI} \end{bmatrix} = A_I A_c^{-1} \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} - A_I A_c^{-1} \begin{bmatrix} x_{oc} \\ y_{oc} \\ z_{oc} \end{bmatrix} + \begin{bmatrix} x_{oI} \\ y_{oI} \\ z_{oI} \end{bmatrix}$    (3)
This establishes the correspondence between $(X_C, Y_C, Z_C)$ and $(X_I, Y_I, Z_I)$. The mapping process is as follows: first, obtain the coordinates of the points in the camera coordinate system within the patient space; then convert these points into affine coordinate values based on the camera coordinate system; and finally convert these values into the image coordinate system. This completes the mapping between the image space and the patient space. The mapping from the robot space to the image space can be implemented in the same way. The camera captures the affine matrix $A_{RC}$ of the marker attached to the robot in the camera coordinate system. An affine coordinate system with affine matrix $A_R$ is then established on the following basis: the long and short axes of the marker, XZ and XY, are known; XZ and XY are taken as the x-axis and y-axis respectively, and the z-axis is the cross product of the x-axis and the y-axis. From equations (1) and (2) we obtain equation (4):

$\begin{bmatrix} x_{RC} \\ y_{RC} \\ z_{RC} \end{bmatrix} = A_{RC} A_R^{-1} \begin{bmatrix} x_R - x_{OR} \\ y_R - y_{OR} \\ z_R - z_{OR} \end{bmatrix} + \begin{bmatrix} x_{ORC} \\ y_{ORC} \\ z_{ORC} \end{bmatrix}$    (4)
Here $(X_R, Y_R, Z_R)$ is a point in the robot coordinate system, $(X_{OR}, Y_{OR}, Z_{OR})$ is the origin of the robot coordinate system, $(X_{ORC}, Y_{ORC}, Z_{ORC})$ is the coordinate of the origin of the robot coordinate system mapped into the camera coordinate system, and $(X_{RC}, Y_{RC}, Z_{RC})$ is the coordinate of the robot-space point mapped into the camera coordinate system. Substituting this into equation (3) gives equation (5):

$\begin{bmatrix} x_{RI} \\ y_{RI} \\ z_{RI} \end{bmatrix} = A_I A_c^{-1} \left( A_{RC} A_R^{-1} \begin{bmatrix} x_R - x_{OR} \\ y_R - y_{OR} \\ z_R - z_{OR} \end{bmatrix} + \begin{bmatrix} x_{ORC} - x_{oc} \\ y_{ORC} - y_{oc} \\ z_{ORC} - z_{oc} \end{bmatrix} \right) + \begin{bmatrix} x_{oI} \\ y_{oI} \\ z_{oI} \end{bmatrix}$    (5)
Here $(X_{RI}, Y_{RI}, Z_{RI})$ is the coordinate in the image coordinate system mapped from the point in the robot coordinate system. Compared with the mapping above, mapping from the robot space involves the additional step from the robot coordinate system to the camera coordinate system.
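Equations (1)-(5) amount to changing between three affine frames. A compact numpy sketch is given below, where the 3x3 matrices A_I, A_c, A_R, A_RC hold the frame basis vectors as columns and the origins are 3-vectors; the helper function names are assumptions of this example, not part of the original software.

```python
import numpy as np

def to_affine(A, origin, point):
    """Affine coordinates of `point` in the frame (A, origin), cf. Eqs. (1)-(2)."""
    return np.linalg.solve(A, np.asarray(point, float) - origin)

def from_affine(A, origin, coords):
    """Inverse mapping: Cartesian point recovered from affine coordinates."""
    return A @ coords + origin

def robot_to_image(p_robot, A_R, o_R, A_RC, o_RC, A_c, o_c, A_I, o_I):
    """Map a robot-space point into the image space, following Eqs. (4)-(5)."""
    # Eq. (4): express the point in the camera coordinate system through the
    # marker frame attached to the robot (A_RC is that frame seen by the camera).
    p_cam = from_affine(A_RC, o_RC, to_affine(A_R, o_R, p_robot))
    # Eq. (3)/(5): camera space -> image space via the shared patient markers.
    return from_affine(A_I, o_I, to_affine(A_c, o_c, p_cam))
```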
4 Experiment
The experiments in this paper implement the entire space mapping process based on the MicronTracker camera and are used to verify the position error of the method by which the marker coordinates in the image space and the camera space are extracted. In the experiments, five electrodes at different locations on the human model, labeled A, B, C, D and E, were selected; a needle with a marker that can be identified by the camera was then used to touch the electrodes, and the tracking errors were recorded as the needle position changed.
Specific diagrams are as follows:
(a) needle point electrodes
(b) test electrodes used in A-E
Fig. 8. Electrodes used in experiments and puncture needle
In this study, the error results of the five electrodes in real-time tracking are as follows:

[Fig. 9, panels (a)-(e): affine matrix tracking error (mm) versus point number for electrodes A-E.]
Fig. 9. (a) to (e) are the experimental data of A-E electrodes respectively
The above analysis of the experimental results shows that a mapping accuracy within 1 mm can be achieved by the space mapping method in this paper. In this experiment the precision was relatively low due to hand shaking, and the space mapping errors were mainly concentrated between 1 mm and 3 mm, which meets the needs of a number of robot-assisted minimally invasive surgery applications. The new marker extraction method in the image space and camera space presented in this paper therefore has advantages in space mapping accuracy.
5 Conclusion and Future Work
The surgical planning in this article uses automatic extraction of the tumor contours, automatic extraction of the patient marker coordinates with a non-contact camera, and extraction of the accurate center coordinates of the markers in the image space. The automatic extraction of the tumor contour is completed simply by clicking a point within the region with the mouse, which saves time compared with manual extraction. Using the non-contact camera to extract the coordinates of the markers on the human body automatically shortens the calibration time and reduces the calibration error compared with the manual calibration method. Extracting the center coordinates of the markers in the image space makes full use of the geometric characteristics of the markers, yields a precise geometric center, and avoids the errors caused by the electrodes. Compared with traditional planning methods, this approach reduces the planning time and improves the positioning accuracy. Experiments show that, compared with the manual method, the automatic extraction of the tumor contour saves 9 minutes when processing 30 CT slices. The results of tracking the five electrodes at different locations on the human body show that the errors are mostly below 3 mm despite hand shake. The entire planning time of the operating system is reduced from the original 25-30 minutes to 5 minutes or less, so this method provides a useful reference in terms of planning time and planning efficiency. Future work will address two main issues: first, further improving the accuracy of the automatic tumor contour extraction; second, optimizing the three-dimensional reconstruction algorithm, including the three-dimensional reconstruction of the patient's surgical area, the tumor and the bone, to further enhance the accuracy and efficiency of the operation planning.
References 1. Guizhen, M., Bingchen, M., et al.: Tracking and Locating of Surgical Instrument Using Binocular Vision. Microcomputer Applications 26(2), 181–183 (2005) 2. Adhami, L., Coste-Maniere, E.: Optimal planning for minimally invasive surgical robots. IEEE Transactions on Robotics and Automation: Special Issue on Medical Robotics, Rainer Konietschke et al. (October 2003) 3. Cannon, J.W., Stoll, J.A., Selha, S.D., Dupont, P.E., Howe, R.D., Torchiana, D.F.: Port Placement Planning in Robot-Assisted Coronary Artery Bypass. IEEE Transactions on Robotics and Automation: Special Issue on Medical Robotics (October 2003) 4. Engel, D., Korb, W., Raczkowsky, J., Hassfeld, S., Woern, H.: Location Decision for a Robot Milling Complex Trajectories in Craniofacial Surgery. In: Proceedings of the 17th International Congress and Exhibition, CARS 2003, London, UK (2003) 5. Weiming, Z., Yannan, Z., et al.: An image analysis system for brain surgery assistant robot. Chinese High Technology Letters 15(4), 33–36 (2005) 6. Jolesz, F.A., Nabavi, A., Kikinis, R.: Integration of Interventional MRI With ComputerAssisted Surgery. Journal of Magnetic Resonance Imaging 13, 69–77 (2001) 7. PMcL, B., Moriarty, T., Alexander, E., et al.: Development and implementation of intraoperative magnetic resonance imaging and its neurosurgical applications. Neurosurgery 41, 831–843 (1997)
8. Cleary, K., Member, IEEE, Clifford, M., Stoianovici, D., Freedman, M., Mun, S.K., Watson, V.: Technology Improvements for Image-Guided and Minimally Invasive Spine Procedures. IEEE Transactions on Information Technology in Biomedicine 6(4) (2002) 9. Feng, P., Wei, W., Yilu, Y., Xinhe, X.: Coordinate Mapping of Brain Surgery Robot System Based on Vision Localization. Journal of Northeastern University (Natural Science) 26(5), 413–416 (2005) 10. Yangyu, L., Senqiang, Z., Xiangdong, Y., Ken, C.: Mapping Method in Robot-aided Ultrasound-guided Microwave Coagulation Therapy System. China Mechanical Engineering 18(5) (2007) 11. Can, T.: Reseach on Key Techniques for a Robot system in CT-Guided Minimally Invasive Surgery, Doctoral thesis of Beijing University of Aeronautics and Astronautics (2009) 12. Dawei, J., Ziran, W.: Modelling of Complex Surface by B-Spline. Aeronautical Computer Technique, 2 (1999) 13. Hongmei, Z., Yanming, W., et al.: Non-Uniform Rational B-Splines Curve Fitting Based on the Least Control Points. Journal of Xi’an Jiaotong University (1) (2008) 14. Rongxi, T., Qunsheng, P., Jiaye, W.: Computer Graphics Tutorial. Science Press, Beijing (1990) 15. Fei, Z., Jinsen, W., Hang, L.: Visual C ++ digital image processing development and programming practice. Electronic Industry Press, Beijing (2008)
Autonomic Behaviors of Swarm Robots Driven by Emotion and Curiosity Takashi Kuremoto, Masanao Obayashi, Kunikazu Kobayashi, and Liang-Bing Feng Graduate School of Science and Engineering, Yamaguchi University 755-8611 Tokiwadai 2-16-1, Ube, Yamaguchi, Japan {wu,m.obayas,koba,n007we}@yamaguchi-u.ac.jp
Abstract. This paper proposes an improved internal model with emotional and curiosity factors for autonomous robots. Robots acquire adaptive behaviors in an unknown environment by observing the behaviors of others. Cooperative relations among the robots and the transition of curiosity toward local environments drive the robots to achieve the goal of environment exploration. Simulations showed the effectiveness of the proposed model, with interesting robot motions. Keywords: autonomous robot, swarm robots, emotion, curiosity.
1 Introduction
The autonomic behaviors of swarm robots play an important role in the adaptability of robots to complex and dynamic environments. An autonomous robot may acquire valuable information by observing the behaviors of other robots, and may also provide its own information to others [1-4]. Meanwhile, the factors of emotion and curiosity are considered to be related to the level of intelligence of living beings, i.e., more complicated mental states arising from richer emotional experiences [5] characterize higher intelligence. Recently, the group of Ide and Nozawa proposed an emotion model that drives autonomous robots to avoid obstacles and explore a goal in unknown environments [3] and [4]. The model is based on Russell's "circumplex model of affect", which places 8 kinds of major emotions on a 2-dimensional map [5] and [6]. Using psychological analysis of evidence, Russell categorized affective states such as pleasure, excitement, arousal, distress, misery, depression, sleepiness and contentment in an ordered fashion in a pleasure-arousal space. The emotion model of robots given by [3] used the factors of "pleasure" and "arousal" to define a series of behavior rules for the autonomous robots. When obstacles or other robots appear in the vision of a robot, for example, the behavior rules reduce the value of "pleasure" or "arousal" and increase the value of "displeasure" or "sleepiness". In [4], the emotion model was applied to the emergence of cooperative behaviors of robots, suggesting that, using the emotion model, autonomous robots may possess cooperation ability in uninhabited environments such as space or the deep sea.
However, several practical problems exist in the model of Ide and Nozawa: 1) a limitation of the depth of view is necessary, as it determines the input information and affects the output of the model; 2) a restriction on the speed of actions is necessary; 3) the inductive function, i.e., the pleasure renewal equation, tends toward "displeasure" too easily, which causes all robots to drop into the "sleepiness" state. In this paper, we intend to overcome the above problems and adopt a new mental factor, "curiosity", to raise the motivation for autonomous behaviors. The effectiveness of the improved internal model is confirmed by comparative simulations on goal-exploration problems.
2 An Improved Internal Model of Autonomous Robots
The main difference from traditional psychological analyses of affect is that only the pleasure and arousal dimensions are stressed in the circumplex model, whereas conventionally a set of dimensions such as displeasure, distress, depression, excitement and so on were considered independently. Based on 28 stimulus words presented to 36 young people, [5] described the emotion categories in a circular ordering. The group of Ide and Nozawa used the concept of the circumplex model to design an emotion model that evokes interactions or cooperative behaviors of multiple autonomous robots. Furthermore, Oudeyer and Kaplan used a curiosity model which considers the influence of the time factor to raise the motivation of adaptive activities of robots [7]. In this section, the emotion model of robots is introduced first, and then an improved internal model including the emotional concept and a novel calculation method of curiosity is proposed.
2.1 A Conventional Emotion Model for Robots
In a goal-exploration problem, robots move to search for the goal while avoiding obstacles and other robots in an unknown environment. In the conventional emotion model [3], information about the local environment around a robot is obtained by observation, and this information determines the degree of the emotional vectors "pleasure" and "arousal", which cause the motion of the robot. A set of behavior rules for each robot is defined as follows: 1) local information is obtained within the range of vision; 2) the degree of arousal is proportional to the depth of vision; 3) a robot approaches robots in the state of "pleasure" appearing within its vision, and moves away from them in the opposite case; 4) a robot approaches other robots when it is in the state of "pleasure", and moves away from them in the opposite case; 5) the degree of pleasure is reduced when obstacles or other robots are observed, and increased in the opposite case; 6) the degree of arousal is increased when other robots are observed, and reduced in the opposite case. These rules can be expressed by the following equations:

$R_i(t+1) = R_i(t) + V_i(t)$  (1)
$Pv_{ji} = \dfrac{Pv_j \cdot r_{ji}}{|r_{ji}|}$  (2)

$Pv_{ij} = \dfrac{Pv_i \cdot r_{ij}}{|r_{ij}|}$  (3)

$V_i(t+1) = V_i(t) - l_1 \sum_j Pv_{ji} + l_2 \sum_i Pv_{ij}$  (4)

$Pv(t+1) = Pv(t) + e_p \cdot R_{Pv} \cdot Pv(t)$  (5)

$e_p = \begin{cases} -1 & (\text{where } d_o > D,\ d_r > D) \\ 1 & (\text{where } d_o \le D,\ d_r \le D) \end{cases}$  (6)

$Av(t+1) = Av(t) + e_a \cdot R_{Av} \cdot Av(t)$  (7)

$e_a = \begin{cases} -1 & (\text{where } d_r > D) \\ 1 & (\text{where } d_r \le D) \end{cases}$  (8)

$D = \alpha \cdot Av + K$  (9)
where $t$ is the step (time); $R_i(t)$ is the position vector of robot $i$ at time $t$; $V_i(t)$ is the velocity vector of robot $i$ at time $t$; $Pv_{ji}$ is the influence from robot $j$ on robot $i$; $Pv_{ij}$ is the influence from robot $i$ on robot $j$; $r_{ij}$, $r_{ji}$ are the distance vectors between robots $i$ and $j$; $l_1$, $l_2$ are emotional influence parameters; $R_{Pv}$ is the rate of change of pleasure ($0 \le R_{Pv} \le 1$); $R_{Av}$ is the rate of change of arousal ($0 \le R_{Av} \le 1$); $Pv$ is the degree of pleasure; $Av$ is the degree of arousal; $d_o$ is the distance from robot $i$ to the nearest obstacle; $d_r$ is the distance from robot $i$ to the nearest robot; $D$ is the depth of vision; $\alpha$, $e_a$, $e_p$ are positive coefficients; and $K$ is the bias of the vision.
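Read together, Eqs. (1)-(9) define one synchronous update of every robot's position, velocity, pleasure and arousal. The following Python sketch illustrates that update loop; it is only an illustration, and the array layout, the helper name `step`, and the numerical values of $R_{Pv}$, $R_{Av}$, $\alpha$ and $K$ are our assumptions, not values taken from [3].

```python
# Illustrative sketch of the conventional emotion-driven update (Eqs. (1)-(9)).
# Array shapes: R, V are (n, 2); Pv, Av, d_o, d_r are (n,). Parameter values
# below are assumptions for the sketch, not the settings used in [3].
import numpy as np

l1, l2 = 6.0, 1.0          # emotional influence parameters
R_Pv, R_Av = 0.1, 0.1      # rates of change of pleasure / arousal (assumed)
alpha, K = 0.02, 40.0      # vision-depth coefficient and bias (assumed)

def step(R, V, Pv, Av, d_o, d_r):
    """One synchronous update of all robots."""
    n = len(R)
    D = alpha * Av + K                                    # Eq. (9)
    for i in range(n):
        acc = np.zeros(2)
        for j in range(n):
            if j == i:
                continue
            r_ji = R[i] - R[j]                            # vector from j to i
            r_ij = R[j] - R[i]                            # vector from i to j
            Pv_ji = Pv[j] * r_ji / np.linalg.norm(r_ji)   # Eq. (2)
            Pv_ij = Pv[i] * r_ij / np.linalg.norm(r_ij)   # Eq. (3)
            acc += -l1 * Pv_ji + l2 * Pv_ij               # Eq. (4), summed over j
        V[i] = V[i] + acc
        e_p = 1.0 if (d_o[i] <= D[i] and d_r[i] <= D[i]) else -1.0   # Eq. (6)
        e_a = 1.0 if d_r[i] <= D[i] else -1.0                        # Eq. (8)
        Pv[i] = Pv[i] + e_p * R_Pv * Pv[i]                # Eq. (5)
        Av[i] = Av[i] + e_a * R_Av * Av[i]                # Eq. (7)
        R[i] = R[i] + V[i]                                # Eq. (1)
    return R, V, Pv, Av
```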
According to these rules, the following patterns of robot behavior appear: 1) robots in the state of pleasure attract each other and approach each other closely; 2) a robot in the state of pleasure moves in a direction and causes others to follow it; 3) robots in the state of displeasure move away from each other.
2.2 An Improved Internal Model for Autonomous Robots
Using the conventional model described in Section 2.1, we performed simulation experiments and observed various results: robots successfully attracted each other, avoided obstacles and reached the goal(s) of exploration, or failed to attract each other, or failed to reach all of the multiple goal areas in a complicated environment.
The reasons for the failed cases may be summarized as follows: 1) the bias of the vision was set inadequately; too large a value of K caused the internal state of a robot to drop into "displeasure" easily; 2) the time during which robots influence each other was too short because of too high a velocity; 3) there was a trend for the degree of pleasure to decrease more easily than it increased; 4) a low degree of pleasure of all robots caused a low degree of arousal, so all robots dropped into the state of sleepiness and the exploration behavior disappeared. To overcome these problems and to raise the motivation for exploration, we propose to add new rules to the emotion model and to adopt a new mental factor, "curiosity", in the calculation of the velocity vector, as follows: 1) limit the bounds of the depth of vision: $K_{min} < K < K_{max}$; 2) limit the maximum value of the velocity: $|V_i(t)| < V_{max}$; 3) make the change of the emotion factor "pleasure" dynamical, i.e., use Eq. (10) and Eq. (11) instead of Eq. (5).
$D = \alpha \cdot Av + K$  (10)

$x(t) = \mu \sin(\pi (Pv(t) + e_p M))$  (11)
where $\mu$, $M$, $N$, $\beta$ are positive parameters. 4) "Curiosity" covers two situations concerning the change of the internal state of a robot: i) robot $i$ keeps searching for the goals $k$ ($k = 1, 2, \ldots, K$) before it arrives at them, and after it arrives at a goal $k$ its "curiosity" toward that goal is gradually reduced; ii) while robot $i$ explores the environment, its "curiosity" is gradually reduced when it collides with obstacles. Eq. (12) defines the "curiosity" and Eq. (13) gives the improved internal model of autonomous robots:

$Cv_{ik}(t+1) = \begin{cases} I_k - \lambda_1 Cv_{ik}(t) & (\text{if goal } k = 1, 2, \ldots, K) \\ I_k - \lambda_2 Cv_{ik}(t) & (\text{if an obstacle exists}) \\ 0 & (\text{otherwise}) \end{cases}$  (12)

$V_i(t+1) = V_i(t) - l_1 \sum_j Pv_{ji} + l_2 \sum_i Pv_{ij} + l_3 \sum_i Cv_{ik}$  (13)
where $Cv_{ik}(t)$ is the "curiosity" factor in the improved internal model, $I_k$ are positive parameters for the different goals $k$, and the coefficients $\lambda_1, \lambda_2, l_3 > 0$.
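As a companion to the sketch in Section 2.1, the curiosity factor of Eq. (12) and the extended velocity update of Eq. (13) could be written as follows. The way the curiosity term is turned into a velocity contribution (a vector toward each goal scaled by $Cv_{ik}$) is our reading of the model; the constants follow Table 1, and the helper names are ours.

```python
# Sketch of the curiosity update (Eq. (12)) and the extended velocity update
# (Eq. (13)); constants follow Table 1, helper names and the goal-direction
# interpretation of the curiosity term are our assumptions.
import numpy as np

l1, l2, l3 = 6.0, 1.0, 0.5   # influence coefficients (Table 1)
lam1, lam2 = 0.2, 0.2        # curiosity decay coefficients (Table 1)
I_k = 20.0                   # curiosity value for a goal area (Table 1)
V_max = 15.0                 # velocity limit (Table 1)

def update_curiosity(Cv_ik, goal_case, obstacle_case):
    """Eq. (12): curiosity of robot i toward goal k; case selection as given."""
    if goal_case:
        return I_k - lam1 * Cv_ik
    if obstacle_case:
        return I_k - lam2 * Cv_ik
    return 0.0

def velocity_update(V_i, sum_Pv_ji, sum_Pv_ij, goal_dirs, Cv_ik):
    """Eq. (13): emotion terms plus the curiosity term; goal_dirs are unit
    vectors from robot i toward each goal k (our interpretation)."""
    curiosity_term = sum(Cv_ik[k] * goal_dirs[k] for k in range(len(goal_dirs)))
    V_new = V_i - l1 * sum_Pv_ji + l2 * sum_Pv_ij + l3 * curiosity_term
    speed = np.linalg.norm(V_new)
    if speed > V_max:                      # new rule 2): limit |V_i| < V_max
        V_new = V_new * (V_max / speed)
    return V_new
```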
3 Simulation Experiment
To examine the internal model proposed in Section 2.2, computer simulation experiments were performed using two kinds of environments for multi-robot exploration. In a simple environment with a single goal, the same as in [3], the behavior of two autonomous robots showed few differences between the conventional method and the novel system; however, in a complicated (maze-like) environment with multiple goals, the robots arrived at all goals only in the case of the proposed method (results of the latter case are reported in Section 3.1).
[Figure 1 panels: (a) started at the same time (conventional model); (b) started at the same time (improved model); (c) started at different times (conventional model); (d) started at different times (improved model). Legend: start place, obstacle, goal.]
Fig. 1. Tracks of 2 robots started at the same time ((a) and (b)), and different times ((c) and (d)) exploring 3 goals in a complicated unknown environment (size of the environment: 500x500 steps).
3.1 Finding Multiple Goals in Complicated Environment The size of a 2-dimensional exploring space is 500x500 (steps), multiple obstacles exist in the different positions of the square, 2 robots start from 2 different positions to search 3 goal areas located at the different positions: the environment of simulation is shown in Fig. 2. Two cases of timing of start of robots were executed: i) 2 robots started at the same time; ii) one started 200 steps later. All parameters were set as shown in Table 1, and the limitation of steps of a trial was set to 2,000 steps.
[Figure 2 panels show the pleasure degree Pv versus step: (a) started at the same time (conventional model); (b) started at the same time (improved model); (c) started at different times (conventional model); (d) started at different times (improved model).]
Fig. 2. Comparison of the change of pleasure degrees of 2 robots exploring 3 goals in the complicated unknown environment
Table 1. Parameters used in the simulation of this section
Parameter | Symbol | Value
The value of pleasure at start and at goals | Pv(0), Pv | 0.0, 1,200.0
Coefficients of emotional influence factors | l1, l2 | 6.0, 1.0
Coefficient of curiosity influence factor | l3 | 0.5
Bounds of the vision | Kmin, Kmax | 30.0, 50.0
Limitation to velocity | Vmax | 15.0
Parameters in dynamical pleasure calculation | N, M, μ, β | 100, 100, 100, 0.05
Coefficients in curiosity calculation | λ1, λ2 | 0.2, 0.2
Initial value of curiosity | Cv | 0.0
The value of curiosity for goal area | Ik | 20.0
i) Simulation results for 2 robots started at the same time are shown in Fig. 1(a) and Fig. 1(b). Robots with the conventional model stopped exploration for reasons such as obstacles and multiple goals (Fig. 1(a)); meanwhile, those with the improved model showed active exploration and reached all 3 goals (Fig. 1(b)).
ii) Simulation results for 2 robots started at different times are shown in Fig. 1(c) and Fig. 1(d). Robots with the conventional model again stopped exploration without reaching any goal (Fig. 1(c)). Robots with the improved model again showed active exploration; however, one failed to reach Goal 3 (Fig. 1(d)). The changes of the degree of pleasure, depicted in Fig. 2(a)-(d) corresponding to Fig. 1(a)-(d), show the difference in the internal state changes of the 2 robots between the conventional model and the improved model. More dynamic activity was observed in the case of our model.
4 Conclusion
An emotion-and-curiosity-driven behavior model is proposed for the exploration activity of swarm robots. The basic idea of the internal model is that mental states including pleasure, arousal and curiosity motivate robots to control their velocities in time. Simulations showed the effectiveness of the proposed model. This research suggests that mental models may play important roles in the art of swarm robot design. Acknowledgements. We would like to thank Y. Matsusaki and M. Sugino for their early work. A part of this study was supported by JSPS-KAKENHI (No.20500207 and No.20500277).
References 1. Cao, Y.U., Fukunaga, A.S., Kahng, A.B.: Cooperative Mobile Robotics. Antecedents and Directions. Autonomous Robots 4, 7–27 (1997) 2. Asada, M., Uchibe, E., Hosoda, K.: Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development. Artificial Intelligence 110, 275–292 (1999) 3. Sato, S., Nozawa, A., Ide, H.: Characteristics of Behavior of Robots with Emotion Model. IEEJ Trans. EIS, 124(7), 1390–1395 (2004) (in Japanese) 4. Kusano, T., Nozawa, A., Ide, H.: Emergent of Burden Sharing of Robots with Emotion Model. IEEJ Trans. EIS, 125(7), 1037–1042 (2005) (in Japanese) 5. Russell, J.A.: A circumplex model of affect. Journal of Personality and Social Psychology 39(6), 1161–1178 (1980) 6. Larsen, R.J., Diener, E.: Promises and problems with the circumplex model of emotion. In: Clark, M.S(ed.): Review of Personality and Social Psychology: Emotion, vol. 13, pp. 25–59 (1992) 7. Oudeyer, P.Y., Kaplan, F.: Intelligent Adaptive Curiosity: a Source of Self-Development. In: Proc. 4th Intern. Workshop on Epigenetic Robotics, pp. 12–132 (2004)
Modelling and Simulating Dynamic Evolvement of Collective Learning Behaviors by Voronoi Diagram Xiang-min Gao and Ming-yong Pang Department of Educational Technology, Nanjing Normal University No.122, Ninghai Ave., Nanjing 210097, Jiangsu, P.R. China
Abstract. Simulating the collective behaviors of human groups with interactions is of essential importance in education, economics, psychology and other social science fields. In this paper, we present a Voronoi diagram based method for modelling and simulating group learning behaviors. The method follows a set of learning rules to update individuals' behaviors during evolution, and uses the Voronoi diagram to compute and observe the change of each individual's behavior as well as to visualize the long-term behavior of the group at a higher group level. A large number of experiments show that the modelled group behaviors with certain learning rules can reach limit states under restrictive conditions. In addition, we also discuss how the evolvement of group behaviors is affected by the qualified rate in the initial condition in the statistical sense, and analyze and explain the special phenomena appearing in the dynamic evolvement. Keywords: collective learning behaviors, dynamic evolvement, simulation, Voronoi diagram.
1 Introduction
Interaction makes humans very susceptible to being influenced by the people around them in all aspects of social life. Each individual often adjusts its behavior according to the behaviors of its neighboring individuals. In this paper, we call this phenomenon herding behavior, which is a very ordinary phenomenon in social life. It is a manifestation of collective non-rational behavior resulting from individually rational behavior. In the case of group behavior, it is striking that higher-level organizations emerge from the interactions of lower-level units. This is because the higher-level organizations typically emerge spontaneously and simultaneously with changes of individuals' behavior, so these group-level behaviors are not easily detected or foreseen by any single individual. As we know, interacting bees create social colony architectures that no single bee intends. Populations of neurons create structured thought, permanent memories and adaptive responses that no neuron can comprehend by itself. Similarly, interacting people create group-level behaviors that are beyond the ken and anticipation of
any single person. Many social phenomena, such as smoking, school attendance, non-marital fertility, drug use, crowds and rumors, arise because of individuals' beliefs and goals, but the eventual form that these phenomena take is rarely dictated by any individual. Therefore, recognizing the relationship between individual behavior and group evolvement has essential significance in education, society, psychology and economics [1]. In past decades, remarkable computational models of collective behaviors have appeared, e.g. in sociology [2], psychology [3][4], anthropology [5], and economics [6]. In order to investigate the evolvement of grid user behaviors [7], the author established an evolutionary game model of grid users and carried out many simulation experiments to show that the evolutionary game approach can make grid users learn and adjust their strategies constantly through repeated games so as to achieve an evolutionarily stable equilibrium. In the literature [8], the author presents a general frame for modeling and simulating collective learning behaviors of human groups with interactions. A cellular automata based method was further proposed to analyze group learning behavior [9][10], discussing how the evolvement of the group is affected by the qualified rate in the initial condition, the interaction level and the distribution of individuals in the statistical sense. In the process of individuals' interactions, the similarity between individuals will lead to their greater similarity; this mechanism has been applied to analyze the diffusion of youth smoking [11]. Some scholars have also studied issues such as juvenile delinquency, drug abuse and the spreading of rumors, and have built many threshold-based computational models to simulate the process of such behavior diffusion [12]; it was pointed out that each individual has a tolerance threshold, and once this value is exceeded, the individual will be addicted to the behavior. In many studies, the threshold is defined as the adoption rate of the behavior among local neighboring individuals rather than the entire group [13][14]. That is to say, an individual often adjusts its behavior according to its neighboring individuals. In addition, some literature focuses on the system's evolution and its eventual equilibrium state [15].
2 Voronoi Diagram Model
The Voronoi diagram (VD), first proposed by the mathematician G. Voronoi, describes a subdivision of the space that contains a given point set. This decomposition expresses the spatial proximity relationship between points as well as every point's influence scope.
2.1 Voronoi Diagram
Let $p_i$ and $p_j$ be two points in the plane; the perpendicular bisector of the line segment $p_i p_j$ divides the plane into two parts. Let $H(p_i, p_j)$ denote the half-plane containing $p_i$ and $H(p_j, p_i)$ denote the half-plane containing $p_j$. Obviously, the points in $H(p_i, p_j)$ are closer to $p_i$ than to $p_j$; that is, $H(p_i, p_j)$ is the set of points closer to $p_i$. Given a set of points $\{p_i\}_{i=1}^N$, the Voronoi cell $V(p_i)$ corresponding to the point $p_i$ is defined by
$V(p_i) = \bigcap_{j \neq i} H(p_i, p_j).$
$V(p_i)$ is the intersection of $N-1$ half-planes, each consisting of the points closer to $p_i$ than to one other point. It is a planar convex polygon with no more than $N$ edges, which is referred to as the Voronoi cell corresponding to $p_i$. The set $\{V(p_i)\}_{i=1}^N$ is called the Voronoi diagram of the point set $S$. The points $p_i$ are called sites, and the segments of the Voronoi diagram are called edges.
Fig. 1. Voronoi diagram of randomly distributed sites on a plane
According to the above definition, given $p_i \in S$, $V(p_i)$ contains one and only one point of $S$, as shown in Fig. 1. Through the Voronoi cell corresponding to $p_i$, we can find all of its direct neighboring sites; as shown in Fig. 1, the direct neighboring sites of $p_i$ are $\{p_1, p_2, p_3, p_4, p_5, p_6\}$. In view of these special geometric characteristics of the Voronoi diagram, we give a Voronoi diagram based method to simulate the dynamic evolution of group learning behavior. A small sketch of how such neighborhoods can be computed is given below.
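The neighborhood relation used throughout the rest of the paper can be obtained from any standard computational-geometry package. The short sketch below uses SciPy's Voronoi routine purely as an illustration; the paper does not prescribe a particular implementation, and the function name is ours.

```python
# Illustrative computation of direct Voronoi neighbours with SciPy.
import numpy as np
from scipy.spatial import Voronoi

def voronoi_neighbors(points):
    """Return, for each site index, the set of indices of its direct
    Voronoi neighbours (sites whose cells share an edge)."""
    vor = Voronoi(points)
    neighbors = {i: set() for i in range(len(points))}
    for p, q in vor.ridge_points:          # each ridge separates two sites
        neighbors[int(p)].add(int(q))
        neighbors[int(q)].add(int(p))
    return neighbors

# Example: 20 randomly distributed sites on the unit square.
pts = np.random.rand(20, 2)
print(voronoi_neighbors(pts)[0])
```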
2.2 Discrete Dynamic Voronoi Diagram Model
In this section, we introduce the basic idea of modelling group learning behaviors. Considering a group consisting of N interacting individuals, we use N points on a plane to simulate the N individuals, and all Voronoi neighbors of each point to simulate the neighboring individuals who will influence its decision-making. Besides, in the case of randomly distributed sites, the number of each site's Voronoi neighbors is not the same, and it is often associated with the site's position on the plane. Correspondingly, in real life, the number of individuals who can influence one's behavior also differs, and is usually related to the individual's social status, interpersonal relationships and so on. Thus, it is rational to simulate the individuals with Voronoi sites and to simulate the population that influences an individual with the corresponding site's Voronoi neighbors. Obviously, as shown in Fig. 1, the Voronoi diagram model is a discrete model. The Voronoi diagram of N points is composed of N Voronoi cells corresponding to the N points, so we can assign a value to each Voronoi cell to describe the willingness of the individual to perform a behavior. In addition, during the simulation of the dynamic evolvement of group behaviors, we also discretized time, that is, the individuals' behaviors are updated at a certain time step. As time passes and the individuals' states are continuously updated, a discrete dynamic simulation system is formed by the Voronoi diagram model.
3 Modeling Group Learning Behavior
We here assume that a group has N individuals; the mathematical Voronoi diagram model of the group can then be characterized by the following steps. (1) Generate N points on a plane and construct the corresponding Voronoi diagram. (2) Consider each individual as a Voronoi cell $V(p_i)$ of the VD, where $i$ is the individual's index in the group. (3) Set a value for each Voronoi cell, either 0 or 1, indicating the individual's attitude toward a behavior: reject or not. (4) Define a neighborhood for each Voronoi cell; from the Voronoi diagram's properties above, we can use the VD to find its direct neighbors (see Fig. 1). (5) Define the updating rule for the VD system. In order to control the evolvement of the system, we define the updating rule of the local individuals' status as follows:

$value^{t+1} = \sum_{j=0}^{N_i} \left(B^t[Neigh_i^j] - 0.5\right) / (N_i \cdot 0.5)$

$B^{t+1}[i] = \begin{cases} 0 & \text{if } value^{t+1} \le 0 \\ value^{t+1} & \text{if } 0 < value^{t+1} < 1 \\ 1 & \text{if } value^{t+1} \ge 1 \end{cases}$  (1)
where $B^{t+1}[i]$ is the behavior-will value of $V(i)$ at time step $t+1$, $Neigh_i$ are the indexes of the Voronoi neighbors of the individual $V(i)$, $N_i$ is the number of Voronoi neighbors of the individual $V(i)$, and $Neigh_i^j$ is the index of the $j$-th neighbor of the individual $V(i)$. It must be noted that the updating rule above allows the behavior value of each individual to be updated to a value in [0,1] at each time step; that is, the behavior value of each individual can be updated gently and strengthened gradually. When the group arrives at the steady state, only a very small number of individuals' behavior values fail to reach either 0 or 1, and these need to be modified as follows:

$B[i] = \begin{cases} 0 & \text{if } 0 < B[i] \le 0.5 \\ 1 & \text{if } 0.5 < B[i] < 1 \end{cases}$  (2)

where $B[i]$ is the behavior value of an individual $V(i)$ whose value does not reach either 0 or 1 at the steady state. According to this updating rule, the statuses of the group and of each individual keep changing before the group arrives at a steady equilibrium. It needs particular notice that, for every individual, the updating rule above assumes that the contribution of every one of its neighbors is the same, namely equal influence. In the real world, however, people often exert mutual influences of different sizes according to how near or far their relationships are. Therefore, the model can also use the distance between a site and its neighboring sites to define the influence size, namely weighted influence. A minimal sketch of this updating rule is given below.
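The sketch below shows the equal-influence updating rule in code form; it assumes the `voronoi_neighbors` helper from Section 2.1 (or any equivalent neighbour list) and is meant only as an illustration of Eqs. (1) and (2).

```python
# Sketch of the status-updating rule, equal-influence case.
def update_step(B, neighbors):
    """One synchronous update of all behaviour values B[i] (Eq. (1))."""
    B_new = []
    for i in range(len(B)):
        Ni = len(neighbors[i])
        value = sum(B[j] - 0.5 for j in neighbors[i]) / (Ni * 0.5)
        B_new.append(min(1.0, max(0.0, value)))   # clamp to [0, 1] as in Eq. (1)
    return B_new

def finalize(B):
    """Eq. (2): round the few unresolved values at the steady state."""
    return [0.0 if b <= 0.5 else 1.0 for b in B]
```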
4 Experiment and Analysis
4.1 System Equilibrium and Analysis of Its Main Characteristics
In this section, the assigned value of each individual can be understood as the individual's attitude toward the behavior: reject or not. At the beginning, each individual has its own attitude toward the behavior, which is achieved by randomly generating a series of 0s and 1s assigned to the individuals by computer. If the behavior value of an individual is 0, the corresponding Voronoi cell is painted black, indicating that the individual rejects this behavior; otherwise, the corresponding Voronoi cell is painted white. As shown in Fig. 2(a), the distribution of black and white areas is chaotic. In the dynamic process of interaction, each individual adjusts its behavior according to the behaviors of its neighboring individuals, and the whole group also evolves its structure and organization simultaneously. Fig. 2 shows snapshots of some states of the group evolution. Finally, the group reaches a stable state: even as time goes by, the state of the entire group no longer changes, as in Fig. 2(e). From Fig. 2(e), we can also see that the colors of a very few cells are neither white nor black; that is, when the system arrives at the steady state, not all individuals' behavior values reach 0 or 1 under the updating rules proposed above. So we must modify this steady state according to Formula (2). In fact, the number of individuals that need to be modified is very small, as can be seen by comparing Fig. 2(e) and Fig. 2(f).
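Continuing the sketch from Section 3, the evolution toward the steady state described above can be driven by a simple loop such as the one below; the convergence tolerance and the iteration cap are our assumptions.

```python
# Iterate the updating rule until the group state stops changing, then apply
# the modification rule of Eq. (2).
def evolve(B, neighbors, max_steps=2000, tol=1e-9):
    for step in range(max_steps):
        B_next = update_step(B, neighbors)
        if max(abs(a - b) for a, b in zip(B_next, B)) < tol:   # steady state
            return finalize(B_next), step
        B = B_next
    return finalize(B), max_steps
```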
[Figure 2 panels: (a) initial state, (b) 1st step, (c) 2nd step, (d) 6th step, (e) steady state, (f) modified steady state.]
Fig. 2. Several snapshots of group evolution
4.2 Initial Condition and Group Evolvement
In this section, the assigned value of each individual can be understood as the individual's performance of the behavior. For example, in a study of students' truancy in a school, the assigned value 0 indicates that the student is unqualified (playing truant) and 1 indicates qualified behavior (not playing truant). Here, "initial condition" refers to 13 different initial qualified rates from 20% to 80% with a given interval step of 5%. Because the Voronoi diagram model in this paper is sensitively dependent on the initial conditions and the evolving system is nonlinear, we ran 10,000 experiments for each qualified rate to study the corresponding statistical results of the group evolvement. (1) Relationship between the initial qualified rate and the steady qualified rate, as shown in Fig. 3(a).
Here, we discuss two cases: equal influence and weighted influence. In both the equal-influence and the weighted-influence case, it is not difficult to see that the rate of qualified individuals in the steady state is roughly proportional to the rate of qualified individuals in the initial state; more precisely, the figure illustrates a certain nonlinear relation between the two rates in the form of a skewed "S"-shaped curve. This means that when a group has approximately the same numbers of qualified and unqualified individuals, the evolution results can be forecast in the statistical sense. (2) Relationship between the initial qualified rate and the number of iterations when the system arrives at the steady state, as shown in Fig. 3(b). It is shown that the closer the initial qualified rate is to 50%, the larger the number of iterations. Specifically, when the initial rate is 45% or 55%, the number of iterations reaches its maximum. On the whole, there is a certain nonlinear relation between the initial rate and the number of iterations in the form of an approximately inverted "U"-shaped curve.
(a) Initial and steady qualified rate
(b) Initial qualified rate and iterated times
Fig. 3. Relationship between initial condition and group evolvement
5 Conclusion and Future Work
This paper presents a Voronoi diagram model for simulating the evolution of group learning behaviors. It is not aimed at a specific practical issue; with the values assigned different meanings, it can deepen the understanding of real systems from different points of view. We used a computer to simulate the dynamic evolution of group learning behaviors and carried out a large number of related experiments. From the experimental data, we find: (1) The changes of individual behavior make the whole structure of the group behavior evolve at the same time, eventually reaching a more stable equilibrium state. (2) In the two cases of equal influence and weighted influence, the rate of qualified individuals in the steady state is roughly proportional to the rate of qualified individuals in the initial state, with a certain nonlinear relation between the two rates in the form of a skewed "S"-shaped curve. (3) The added weight has little effect; therefore, in further studies, weighted influence can be replaced approximately by equal influence. (4) Systems with different initial conditions have different evolution speeds, with a certain nonlinear relation between the initial qualified rate and the number of iterations in the form of an inverted "U"-shaped curve. As future work, we will set more state variables for individuals, determine more reasonable and
intelligent updating rules, and introduce game methods and learning strategies into the system so that the simulation model can be a more realistic approximation of the real system. Acknowledgments. This work is supported by the 11th Five-year Plan of the National Social Science Foundation for Educational Science of China (Program for Young Scientists) (Grant no. CHA060073), the Key Project of the 11th Five-year Plan for Educational Science of Jiangsu Province of China (Grant no. Bb/2008/01/009), and the Outstanding High-end Talent Foundation of Nanjing Normal University (Grant no. 2007013XGQ0150).
References 1. Ashforth, B.E., Sluss, D.M.: Socialization tactics, proactive behavior, and newcomer learning: integrating socialization models. Journal of Vocational Behavior 70(3), 447–462 (2007) 2. Macy, M.W., Willer, R.: From factors to actors: computational sociology and agentbased modeling. Annual Review of Sociology 28, 143–166 (2002) 3. Harris, J.R.: Where is the child’s environment? a group socialization theory of development. Psychology Review 102(3), 458–489 (1995) 4. Kenrick, D.T.: Dynamical evolutionary psychology: Individual decision rules and emergent social norms. Psychology Review 110(1), 3–28 (2003) 5. Kohler, T., Gumerman, G.: Dynamics in human and primate societies. Oxford University Press, Oxford (2002) 6. Brock, W.A., Durlauf, S.N.: Identification of binary choice models with social interactions. Journal of Econometrics 140(1), 52–75 (2007) 7. Li, Z.-J., Cheng, C.-T., Huang, F.-X.: Resource Allocation Based on Evolutionary Game in Simulation Grid. Journal of System Simulation 20(11), 2914–2919 (2008) 8. Pang, M.Y.: A Frame for Modelling Collective Learning Behaviors Based on Cellular Automata. In: Proceedings of 2008 IEEE International Symposium on IT in Medicine and Education, pp. 238–243 (2008) 9. Zhao, R.-B., Pang, M.-Y.: Analyzing group learning behavior based on cellular automata. In: Proceedings of 2008 IEEE International Symposium on IT in Medicine and Education, pp. 327–331 (2008) 10. Gao, X.-M., Pang, M.-Y.: Simulating Dynamic Evolvement of Collective Learning Behaviors Based on Voronoi Diagram. In: Proceedings of the 5th International Conference on E-learning and Game (2010) (to appear) 11. Kimberly, K.: Peers and adolescent smoking. Society for the Study of Addiction to Alcohol and Other Drugs (2003) 12. Granovetter, M.: Threshold models of collective behavior. The American Journal of Sociology 83(6), 1420–1443 (1978) 13. Valente, T.W.: Social network thresholds in the diffusion of innovations. Social Networks 18(1), 69–89 (1996) 14. Solomon, S.: Social percolation models. Physica A (Amsterdam) 277, 239–247 (2000) 15. Durlauf, S.N.: How can statistical mechanics contribute to social science. Proceedings of the National Academy of Sciences of the United States of America 96(19), 10582–10584 (1999)
Study of the Airway Resistance of a Micro Robot System for Direct Tracheal Inspection Lianzhi Yu1, Guozheng Yan2, Yuesheng Lu1, and Xiaofei Zhu1 1 College of Optoelectric Information and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, P.R. China [email protected] 2 School of Electronic, Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai, 200240, P.R. China [email protected]
Abstract. This paper describes the structure of a new flexible and active endoscopic micro robot system for direct tracheal inspection; the mobile mechanism of the robot is based on inchworm movement actuated by a pneumatic rubber actuator. There are five independently controlled air chambers; by adjusting the pressures in the air chambers, the robot can move in a straight mode or in a bending mode. According to the physical structure of the human respiratory system and the prototype structure of the micro robot system, the resistance characteristics of the trachea containing the micro system were discussed. The airway resistance models were set up and analyzed in detail. The simulation results prove that the resistance of the robotic system in the airway is small enough for normal breathing, and the robot is expected to be usable for direct inspection of the human trachea. Keywords: Respiratory system, Micro robot system, Flow, Airway resistance.
1 Introduction
In recent years, the development of micro robots for medical inspection and surgery has received growing attention as society ages, and minimally invasive robots which can enter human cavities with little or no injury have been studied widely [1-3]. Since Killian reported the first interventional detecting experiment with a bronchoscope 100 years ago, the clinical value of the bronchoscope has been confirmed, and the fiber bronchoscope made in Japan came into use after 1967. With the development of science, the video bronchoscope was designed with a miniaturized CCD in place of the fiber system, and it can provide high-quality graphic images. Although the interventional bronchoscope has been widely used for the diagnosis and therapy of diseases of the respiratory system [4-7], its rigid structure usually hurts patients during the procedure, and the procedure must be completed in a short time, so the bronchoscope is not capable of monitoring respiratory parameters continuously; in addition, the result also relies on the doctor's experience. This paper focuses on the research of a miniature bionic micro robotic system which was designed to be capable of moving actively in the human respiratory system and monitoring respiration parameters
in the inner microenvironment of a human lung directly and continuously. Based on the physical structure of the human respiratory system, the structure of the micro robot system, and the requirement of maintaining normal respiratory function, the resistance characteristics of the trachea containing the micro monitoring system are discussed in detail. The airway resistance is one of the important monitoring parameters, and a high airway resistance will affect the respiration function, so it is important to keep the airway resistance low enough for normal breathing.
2 The Micro Robot System
2.1 The Structure of the Micro Robot System
The structure and the sizes of the robotic system have been discussed in Refs. [8,9]. The pneumatic robotic system is composed of three parts: the front holder, the driving part and the rear holder. In the moving state, the holders are used to hold the position of the system body, and the driving part is used to change the positions of the holders. The two holders have a cylindrical structure and are covered with air chambers on the outside. Each holder has two interconnected air chambers; when charged, the two chambers are kept at the same pressure and hold tightly against the inner wall. The driving part is a pneumatic rubber actuator made of fiber-reinforced rubber with three degrees of freedom. The actuator rubber tube is divided into three identical sector chambers. When the three chambers of the actuator are charged with the same air pressure, the actuator stretches in the axial direction, and when only one of them is charged, the actuator bends in the direction opposite to the other chambers. The structure of the robot system is shown in Fig. 1, while the basic characteristics of the robot prototype are listed in Table 1.
Fig. 1. Structure of the robotic system
Fig. 2. Rubber actuator
Fig. 3. Rubber parameters
The robotic system is driven by a pneumatic rubber actuator. The structure of the actuator is shown in Fig. 2. The rubber tube is made of copper thread-reinforced rubber, and the inner tube is divided into three identical sector chambers. The thread has a spiral shape, and the angle between the spiral and the rubber tube axis is about 80°. The geometrical structure parameters of the robot actuator are shown in Fig. 3. In Fig. 3, D and L are the diameter and length of the rubber actuator respectively; l and n are the length and number of turns of the copper thread respectively; while θ is the angle between the spiral thread and the rubber tube axis.
Table 1. Structure characteristics
Symbol | Name | Characteristics
D0 | Diameter of actuator | 6 mm (o.d.), 4 mm (i.d.)
L0 | Length of actuator | 20 mm
L1 | Length of front holder | 10 mm
L2 | Length of rear holder | 10 mm
D | Holding cylinder | 12 mm (o.d.), 11 mm (i.d.)
L3 | Length of robot body | 30 mm
M | Mass of robot body | about 2 g
2.2 Driving Characteristics of the Micro Robot Actuator
The driving force in the axial direction of the 3-DOF pneumatic robot rubber actuator can be derived in the same way as Chou did for a one-DOF pneumatic rubber actuator according to the conservation and transformation of energy [10]; the ideal model of the 3-DOF actuator driving force in the axial direction was obtained in [9]. The relation between the driving force, the pneumatic pressure and the displacement is shown in Fig. 4. The simulation results show that the maximum driving force of the actuator is over 3 N, and the maximum effective displacement is about 6 mm. When only one chamber of the actuator is charged, the bending characteristics of the 3-DOF actuator can be described by three parameters, θ, R and λ, as shown in Fig. 5 [11]. In view of the equivalence between the bending sectors, the angle that the projection of the actuator bending axis makes with the x axis in the x-y reference frame can be obtained; the theoretical deflection angle-pressure curve is a straight line [9,11], and the maximum effective bending angle is about 1.5 rad.
Fig. 4. Axial driving forces
Fig. 5. Deflection angle
3 Control System of the Robot
The moving mechanism and the time control orders of the robot have been described in Ref. [5]. An experimental electro-pneumatic pressure control system was designed.
Fig. 6. Experimental control system
It mainly consists of a computer, a USB card, a compressor, ten relays, ten electromagnetic valves, some pressure regulator valves and pipes. The robot system has five pneumatic pressure pipes to be controlled. Each relay controls an electromagnetic valve, and every pressure supply pipe is controlled by a 2/2 valve and a 3/2 electromagnetic valve. The control system can drive the robot according to its locomotion orders by means of LabVIEW programs. The control system is shown in Fig. 6.
4 Airway Resistance Characteristics of the Micro Robot System
4.1 Linear Mathematical Model of the Airway Resistance Characteristics
The linear mathematical model is the simplest model of airway resistance [12]. The elasticity and the asymmetric radial size of the trachea are neglected; the trachea is supposed to be rigid, and the flow is supposed to be steady. The Poiseuille resistance can then be expressed as:

$R = \dfrac{8 L \mu}{\pi r^4} = \dfrac{P_{aw}}{\dot{V}}$  (1)
where $L$ and $r$ are the length and the radius of the trachea, respectively; $\mu$ is the dynamic viscosity coefficient of the air; and $P_{aw}$ and $\dot{V}$ are the differential pressure and the respiration flow rate, respectively. The Poiseuille resistance of the micro system in the trachea can likewise be expressed as:
$R' = \dfrac{8 \mu L}{\pi r^4} = \dfrac{8 \mu}{\pi} \left( \dfrac{L_1}{r_1^4} + \dfrac{L_2}{r_2^4} + \dfrac{L_3}{r_3^4} \right)$  (2)
where $L_1, r_1$, $L_2, r_2$ and $L_3, r_3$ are the lengths and radii associated with the front holder, the driving rubber actuator and the rear holder, respectively, and $\mu$ is the dynamic viscosity coefficient of the air.
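A rough numerical illustration of Eqs. (1) and (2) is given below. The segment lengths follow Table 1, while the tracheal dimensions, the air viscosity and the equivalent passage radii around the robot segments are illustrative assumptions, not measured values from the paper.

```python
# Illustrative evaluation of the linear (Poiseuille) resistance model.
import math

mu = 1.8e-5          # dynamic viscosity of air, Pa*s (assumed)
r_trachea = 0.009    # tracheal radius, m (assumed)
L_trachea = 0.12     # tracheal length, m (assumed)

def poiseuille_resistance(L, r):
    """Eq. (1)/(10): R = 8*mu*L / (pi * r^4), in Pa*s/m^3."""
    return 8.0 * mu * L / (math.pi * r ** 4)

# Eq. (2): extra resistance of the three robot segments (front holder,
# driving actuator, rear holder); the passage radii r_i are assumptions.
segments = [(0.010, 0.004), (0.020, 0.0045), (0.010, 0.004)]   # (L_i, r_i) in m
R_trachea = poiseuille_resistance(L_trachea, r_trachea)
R_robot = sum(poiseuille_resistance(L_i, r_i) for L_i, r_i in segments)
print(R_trachea, R_robot)
```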
Fig. 7. Resistance differential pressure of micro robot
As the turbulent flow resistance is not constant, it increases as the flow rate increases, and it is about 40% greater than the Poiseuille resistance. The resistance simulation results can thus be obtained. The airway resistance characteristics obtained from the linear mathematical model are shown in Fig. 7.
4.2 Energy Mathematical Model of the Airway Resistance Characteristics
The resistance of the micro robot system is related to the flow regime and the flow rate [13]. The flow in the trachea is complicated, and it is impossible to obtain the resistance from the Navier-Stokes equations for such a complicated flow, so the energy conservation law is used to calculate the resistance of the micro robot system. The trachea is supposed to be rigid and the flow is supposed to be steady [14]. The work done per unit time by the flow between section 1 and section 2 can be expressed as:

$W_A = \int_A p u \, dA$  (3)

The total kinetic energy per unit time can be expressed as:

$E = \int_A \frac{1}{2} \rho q^2 u \, dA$  (4)

where $\rho$ is the flow density and $q$ is the flow unit speed. The mechanical energy dissipation due to kinetic viscosity can be expressed as:

$\Phi = \int_V \frac{1}{2} \sigma_{ij} \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) dV$  (5)

External heat and heat exchange are neglected, and the density $\rho$ changes little for respiration flow, so the decrease of the pressure work between any two sections is equal to the sum of the increase of the kinetic energy and the mechanical energy dissipation due to kinetic viscosity. That is:

$W_1 - W_2 = E_1 - E_2 + \Phi$  (6)

where $W_1, W_2$ are the pressure work at the two sections, and $E_1, E_2$ are the kinetic energies at the two sections. From equations (3) to (5), equation (6) can be expressed as:

$\int_{A_1} p u \, dA - \int_{A_2} p u \, dA = \int_{A_2} \frac{1}{2} \rho q^2 u \, dA - \int_{A_1} \frac{1}{2} \rho q^2 u \, dA + \int_V \frac{1}{2} \sigma_{ij} \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) dV$  (7)

For an ideal gas, the value of the mechanical energy dissipation due to kinetic viscosity is zero, and for Poiseuille flow in a tube of length $L$ the mechanical energy dissipation can be expressed as:

$\Phi_p = 8 \pi \mu \bar{u}^2 L = 8 \pi \mu \left( \frac{\dot{V}}{\pi r^2} \right)^2 L = \frac{8 \mu L}{\pi r^4} \dot{V}^2$  (8)
The resistance pressure is expressed as:

$P' = \dfrac{\Phi'_p}{\dot{V}} = \dfrac{8 \mu L}{\pi r^4} \dot{V}$  (9)

The Poiseuille resistance is expressed as:

$R' = \dfrac{8 \mu L}{\pi r^4}$  (10)

Equation (10) is the same as the linear model of the Poiseuille resistance. To account for different entrance flows, Equation (8) becomes:

$\Phi = Z \cdot \Phi_p$  (11)

The coefficient for laminar flow resistance of the micro robot is

$Z'_e = \dfrac{\alpha}{16} \dfrac{d}{L} \left( \dfrac{\rho u d}{\mu} \right)^{1/2} = \dfrac{\alpha}{16} \left( \dfrac{\rho}{\mu} \right)^{1/2} \left( \dfrac{u_1 d_1^2}{L_1} + \dfrac{u_2 d_2^2}{L_2} + \dfrac{u_3 d_3^2}{L_3} \right)^{1/2}$  (12)

The value of the coefficient α is 1 for a linear boundary contribution, and 3/4 for a parabolic boundary contribution. The simulation resistances of the micro robot system obtained from the energy mathematical model are shown in Fig. 8(a).
Fig. 8. Resistance differential pressure of robot system
For turbulent flow, the coefficient of mechanical energy dissipation is

$Z_T = 0.005 (Re)^{3/4} = 0.005 \left( \dfrac{u d \rho}{\mu} \right)^{3/4}$  (13)

where $Re$ is the Reynolds number of the air.
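The flow-regime correction of Eqs. (8), (11) and (13) can be illustrated as follows; the air properties are assumed values and the function names are ours.

```python
# Sketch of the dissipation correction: Phi = Z * Phi_p, with the turbulent
# coefficient Z_T of Eq. (13). Air properties below are assumed values.
mu = 1.8e-5     # dynamic viscosity of air, Pa*s (assumed)
rho = 1.2       # air density, kg/m^3 (assumed)

def reynolds(u, d):
    """Re = u*d*rho/mu for mean velocity u (m/s) and passage diameter d (m)."""
    return u * d * rho / mu

def Z_turbulent(u, d):
    """Eq. (13): Z_T = 0.005 * Re^(3/4)."""
    return 0.005 * reynolds(u, d) ** 0.75

def corrected_dissipation(phi_p, u, d):
    """Eq. (11): Phi = Z * Phi_p, here using the turbulent coefficient."""
    return Z_turbulent(u, d) * phi_p
```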
The simulation resistances of the micro robot system obtained from the energy mathematical model are shown in Fig. 8(b). The calculation results of the energy model are a little larger than those of the linear model, and the resistance of the micro robot is less than 2.5 Pa.
4.3 Measurement Experiments of Airway Resistance
Experiments were carried out in a lung-trachea model during mechanical ventilation. During mechanical ventilation, a high airway resistance will affect the respiration function, so the airway resistance is one of the important monitoring parameters. The robot system in the trachea introduces extra resistance. The resistance differential pressure at the tracheal end was measured continuously by the pressure sensor equipped in the robot system; the results measured by the robot system are consistent with the external results shown by the ventilator, and the airway resistance stayed within the normal range.
4.4 Analysis of Airway Resistance
Theoretical and experimental research indicates that the bionic micro robot system can move smoothly in the tube and is capable of monitoring respiratory parameters dynamically and continuously. Compared to the total airway resistance, the predicted extra resistance of the robot is very small, the airway resistance remains within the normal range [15], and the extra resistance of the robot becomes smaller as the air passage cross-section becomes larger. The results offer a new idea and a theoretical reference for the development of the robot system, which can promote the development of practical miniature robots for medical purposes.
5 Conclusion
The characteristics of the micro robot system have been analyzed through theoretical models and experiments. The robotic system has sufficient driving force and deflection angles. The resistance characteristics of the micro robot system in the airway were discussed by setting up the linear and energy models; the simulation results prove that the extra resistance of the micro robot system is small enough compared to the total airway resistance, so the micro robot system is expected to be usable for direct inspection in the human trachea. Acknowledgement. This work was supported by the Scientific and Innovation Program of Shanghai Education Commission (No.: 10YZ103).
References 1. Lewis, F.L., Liu, K., Yesildirek: Neural Net Robot Controller with Guaranteed Tracking Performance. IEEE Trans. Neural netw. 6, 703–715 (1995) 2. Leu, Y.G., Wang, W.Y., Lee, T.T.: Observer-Based Direct Adaptive Fuzzy-Neural Control for Nonaffine Nonlinear System. IEEE Trans. Neural Netw. 16(4), 853–861 (2005)
3. Pati, Y.C., Krisshnaprasad, P.S.: Analysis and Synthesis of Feed-Forward Neural networks using discrete affine wavelet transformations. IEEE Trans. Neural Netw. 4, 73–85 (1993) 4. Ikeuchi, K., Yoshinaka, K., Hashimoto, S., Tomita, N.: Locomotion of Medical micro robot with spiral ribs using mucus. In: 7th IEEE International Symposium on Micro Machine and Human Science, pp. 217–222. IEEE Press, Nagoya (1996) 5. Anthierens, C., Libersa, C., Touaibia, M., Betemps, M., Arsicault, M., Chaillet, N.: Micro Robots Dedicated to Small Diameter Canalization Exploration. In: 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 480–485. IEEE Press, Kagawa University, Takamatsu (2000) 6. Yan, G., Zuo, J.: A Self-propelling Endoscope System by Squirmy Robot. In: 2003 International Symposium on Micromechatronics and Human Science, pp. 159–163. IEEE Press, Nagoya (2003) 7. Thomann, M., Betemps, R.T.: The Design of a New Type of Micro Robot for the Intestinal Inspection. In: 2002 IEEE International Workshop on Robot and Human Interactive communication, pp. 1385–1390. IEEE Press, Berlin (2002) 8. Yu, L., Yan, G., Zhang, W., Wang, X.: Research on an Active and Continuous Monitoring System for Human Respiratory System. High Technology Letters 12, 68–71 (2006) 9. Yu, L., Yan, G., Wang, X.: A Soft Micro-Robot System for Direct Monitoring in Human Trachea. Robot. 28, 269–274 (2006) 10. Chou, C.P., Hannaford, B.: Static and Dynamic Characteristics of Mckibben Pneumatic Artificial Muscles. In: 1994 IEEE Robotic and Automation Conference, pp. 281–286. IEEE Press, San Diego (1994) 11. Suzumori, K., Likura, S., Tanaka, H.: Applying a Flexible Microactuator to Robotic Mechanisms. In: 1991 IEEE International Conference on Robotic and Automation, pp. 22– 27. IEEE Press, Sacramento (1991) 12. Nunn, J.F.: Applied Respiratory Physiology, Butterworths, p. 128 (1977) 13. Bates, J.H.T., Rossi, A., Milic-Emili, J.: Analysis of the Behavior of the Respiratory System with Constant Inspiratory Flow. J. Appl. Phys. 58, 1840–1848 (1985) 14. Verbraak, A.F.M., Rijnbeek, P.R., Beneken, J.E., et al.: A New Approach to Mechanical Simulation of Lung Behavior: Pressure-Controlled and Time-Related Piston Movement. Med. Biol. Eng. Comput. 39(1), 82–89 (2000) 15. Yu, L., Yan, G., Huang, B., Yang, B.: The Analysis of Airway Resistance with Online Monitoring System. Chinese Journal of Biomedical Engineering 26(2), 317–320 (2007)
Numerical Simulation of the Nutrient and Phytoplankton Dynamics in the Bohai Sea Hao Liu, Wenshan Xu, and Baoshu Yin 1
College of Marine Science, Shanghai Ocean University, 999 Hu-Cheng-Huan-Lu, 201306 Shanghai, China 2 Institute of Oceanology, CAS, 7 Nan-Hai-Lu, 266071 Qingdao, China [email protected]
Abstract. A coupled biogeochemical-physical model was developed to reproduce the annual cycle of the nutrient and phytoplankton dynamics in the Bohai Sea (BS). The simulations were first validated, and then the nutrient and phytoplankton dynamics were investigated further. It was found that the evolution of the thermal stratification may be responsible for the spring algae bloom occurring later in the deep basin than in the shallow bays. The simulation also shows that the phytoplankton dynamics is characterized by nitrogen limitation in BS as a whole, though phosphorus limitation appears in the Yellow River Estuary. Keywords: N/P ratio, algae bloom, a coupled biogeochemical-physical model, Bohai Sea.
1 Introduction
The Bohai Sea (BS) is a semi-enclosed shallow sea in China located on the northwest of the Pacific. It consists of four parts, namely the Laizhou Bay in the south, the Bohai Bay in the west, the Liaodong Bay in the north, and the central basin. The biogeochemical environment in BS is strongly influenced by tides, the East Asia Monsoon and riverine inputs. Among the more than 40 rivers flowing into BS, the Yellow, Haihe, Daliaohe and Luanhe Rivers are the four major ones. Since the 1950s, the hydrochemistry of BS has changed significantly [1]. What role do the riverine nutrients play in shaping the hydrochemical environment, and how does the local phytoplankton dynamics respond to the riverine nutrient changes? These questions motivate our interest. Therefore, by means of a coupled biogeochemical-physical model, a series of numerical experiments were conducted to reproduce the annual cycle of the nutrient and phytoplankton dynamics in BS.
2 Model Description
A coupled biogeochemical-physical model was developed in this study. The biogeochemical model belongs to the Nutrient-Phytoplankton-Zooplankton-Detritus (NPZD) type, and its scheme is given in Fig. 1. The physical model used here is the Princeton
Ocean Model (POM). Through some modifications, POM can simulate the key hydrodynamics in BS reasonably well [2], thus providing a genuine physical environment for the biogeochemical processes. In addition, a real-time irradiation model [3] and a river discharge model [4] are also used in this study for simulating the solar radiation and the riverine nutrient transport in sea waters, respectively.
[Figure 1 diagram: NPZD compartments (nitrogen, phosphate, phytoplankton, zooplankton, detritus, sediments) linked by uptake, respiration, grazing, mortality, excretion, regeneration, sinking, resuspension, river loads, solar radiation, and advection/diffusion.]
Fig. 1. Scheme of the biogeochemical model
The dissolved inorganic nitrogen and phosphate concentrations in four rivers are from [5]. Four rivers are seasonal and the freshwater inputs mainly concentrate in the flood season (from July to October), so it can be suggested that the riverine nutrient may have relatively little influence on the algae growth in spring. The coupled biogeochemical-physical model was forced by actual forcings, taking into account tides, wind and river discharges. The model was run for 2 years, and the results in the second year were presented for analysis.
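For readers unfamiliar with NPZD-type formulations, the zero-dimensional sketch below indicates how the compartments of Fig. 1 exchange material; the functional forms and rate constants are generic illustrative assumptions and are not the parameterization used in the coupled model of this study.

```python
# Generic zero-dimensional NPZD sketch (nutrient N, phytoplankton P,
# zooplankton Z, detritus D); all rates are illustrative assumptions.
def npzd_step(N, P, Z, D, dt=0.1, light=1.0):
    uptake  = 0.8 * light * N / (N + 0.5) * P   # nutrient uptake by phytoplankton
    grazing = 0.4 * P / (P + 0.5) * Z           # zooplankton grazing
    p_mort  = 0.05 * P                          # phytoplankton mortality
    z_mort  = 0.05 * Z                          # zooplankton mortality/excretion
    remin   = 0.1 * D                           # detritus remineralization
    N += dt * (-uptake + remin + 0.3 * z_mort)
    P += dt * (uptake - grazing - p_mort)
    Z += dt * (0.7 * grazing - z_mort)
    D += dt * (0.3 * grazing + p_mort + 0.7 * z_mort - remin)
    return N, P, Z, D
```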
3 Model Results and Analysis
3.1 Validation of Simulations
In this study, the inorganic nitrogen was treated as a single material simply to make estimating the N/P ratio easier. The comparisons between simulated and observed nutrients are given in Fig. 2, in which the observations are basin-wide data derived from the MABHE datasets [6]. It is easy to see that both the nitrogen and phosphorus concentrations are characterized by the highest level in winter and the lowest level in summer. Figure 3 shows the annual cycle of the phytoplankton biomass, which is characterized by a double-peak structure corresponding to the spring and autumn blooms, respectively [7, 8]. The onset of an algae bloom always comes at the cost of consuming large amounts of
nutrients, which is why the nutrient concentrations have declined to a very low level when the second bloom occurs. In winter, the nutrient stock can rapidly recover due to the inertia of the algae in cold waters.
Fig. 2. Comparisons between simulated (solid line) and observed (error bars) nitrogen (a) and phosphate (b)
Fig.3 also shows that the spring bloom first occurs in shallow bays, and then spreads to the relatively deep central basin. Photosynthesis generally happens in the euphotic zone, so the downward transport of algae induced by the tide and wind stirring causes the chlorophyll a stock to accumulate in a slow manner in the surface layer of the deep water. The vertical material exchange is not effectively blocked until the thermal stratification comes into being due to the increase of the heat flux on the sea surface, thus the phytoplankton biomass in the surface layer begins to accumulate in a rapid manner. Therefore, it can be suggested that the evolution of the thermal stratifications may be an important reason for the spring bloom to occur later in the central basin than in shallow bays.
Fig. 3. Simulated (solid line) and observed (error bars) chlorophyll a in four sub-regions of BS (Bohai Bay, Laizhou Bay, Central Basin, Liaodong Bay)
3.2 Horizontal Distribution of Inorganic Nitrogen and Phosphate
Fig. 4 shows the nitrogen and phosphorus distributions in winter and summer. In winter, high levels of nitrogen are mainly distributed in the three bays, while the central basin shows a lower nitrogen concentration. According to investigations [5], the annual
nitrogen discharged into the Laizhou, Bohai and Liaodong Bays through the Yellow River (YR), Haihe River (HR) and Daliaohe River (DR) is approximately 4.96×10^9, 7.78×10^8 and 3.50×10^8 mol, respectively. The newly input nutrients are mostly consumed by the phytoplankton, and then, through a series of biogeochemical-physical processes, most of them are ultimately released back into the local water column in winter. In summer, the nitrogen concentration declines to its lowest level due to the algae bloom. Since the riverine nutrients are mainly distributed in the river plumes, nitrogen is almost exhausted in the central basin, whereas the three bays show relatively high levels of nitrogen.
Fig. 4. Horizontal distribution of nitrogen (mmol N m-3) in (a) Jan, (b) July, and phosphorus (mmol P m-3) in (c) Jan, (d) July in the surface layer of BS
The phosphorus distribution is different: it is characterized by relatively high levels in the central basin rather than in the Laizhou Bay, into which the largest amount of riverine nutrients is discharged. The climatological N/P ratio in the YR is about 350.7, far higher than the Redfield ratio [9]; meanwhile, the runoff of the YR accounts for 60% of the total river inputs to BS, which means that much more phosphorus from the ambient seawater is needed by phytoplankton to sustain balanced growth. Therefore, the lowest level of phosphate appears in the Laizhou Bay in both winter and summer, which is contrary to the situation for nitrogen. Although the N/P ratio in the HR can reach 82.2, its small runoff does not induce the over-consumption of phosphorus in the Bohai Bay. Unlike the above two bays, the Liaodong Bay shows the highest level of phosphorus due to the lower N/P ratio in the DR, meaning that the newly input phosphorus may be in surplus relative to nitrogen. The N/P ratio is not constant over an annual cycle. In winter, the highest N/P ratio appears in coastal waters, especially in the YR estuary, where the phytoplankton dynamics
shows some degree of phosphorus limitation. In summer, the N/P ratio is at its lowest due to the strong regeneration mechanism of phosphorus [10]; therefore, nitrogen limitation may be dominant throughout BS except at the mouth of the YR, where the riverine nitrogen is still able to maintain a relatively high N/P ratio. It can be suggested that the riverine inputs play an important role in shaping the nutrient distribution patterns in BS, and that any drastic variation in the riverine nutrient structure or in the runoff will inevitably exert a profound influence on the local nutrient and phytoplankton dynamics.
4 Discussion
Observations have shown that diatoms are always the dominant algae species in BS [8, 11], so silicate may play an important role in the phytoplankton dynamics. Although the silicate stock has decreased considerably since the late 1950s, its concentration is still high enough to keep it from becoming the limiting factor [1, 5], which is why silicate was not taken into account in the present modeling. However, if the river discontinuity, a major reason for the silicate reduction in BS, is not ameliorated, and there is no other makeup mechanism for silicate, the continuous decrease in the silicate stock will make it a potential limiting factor in the future. Based on observations, the change of the hydrochemical environment in BS over the past decades is characterized by a moderate increase in nitrogen and a drastic decrease in phosphorus. At the same time, the phosphorus inputs through river discharges did not change very much, whereas the riverine nitrogen increased significantly due to the rapid development of the regional economy around BS. Therefore, it can be suggested that it is the mutual limitation between the two nutrients that leads to the continuous decrease in the phosphorus stock in BS: if one nutrient increases more quickly, the other one will ultimately decrease. Accordingly, the nutrient limitation character may shift in the long term as a response to the N/P ratio change in sea waters. In fact, phosphorus limitation was detected in the middle of the Laizhou Bay in spring 1999 [12], while such phenomena only occurred within the YR estuary in our modeling.
5 Conclusion
A relatively simple biogeochemical-physical model was developed to reproduce the seasonal variation of the nutrients and chlorophyll a in the BS. Compared with more complicated model studies [8, 13], our simulations seemed more consistent with observations, especially in revealing the nutrient and phytoplankton dynamics. This is not surprising, as a complex model generally needs to handle more biological parameters, whose uncertainty may deteriorate the model quality by and large [14]. Based on the simulations, the BS ecosystem is found to be mainly nitrogen-limited as a whole, although phosphorus limitation appears in the estuary of the Yellow River due to the much larger riverine inputs of nitrogen. The simulations also show that the nitrogen increase generally comes at the cost of phosphorus reduction, implying that a shift from nitrogen limitation to phosphorus limitation may occur if the nitrogen enrichment continues.
Acknowledgement. The authors thank two anonymous reviewers. This study was supported by the key subject fund of Shanghai Education Committee (J50702).
References 1. Cui, Y., Chen, B., Ren, S., et al.: Study on status of bio-physic-chemical environment in the Bohai Sea. Journal of Fishery Sciences of China 3, 1–12 (1996) (in Chinese) 2. Liu, H.: Annual cycle of stratification and tidal fronts in the Bohai Sea: A model study. Journal of Oceanography 63(1), 67–75 (2007) 3. Liu, H., Yin, B.: A real-time irradiation model. Oceanologia Et Limnologia Sinica 37, 493–497 (2006) (in Chinese) 4. Kourafalou, V.H.: River plume development in semi-enclosed Mediterranean regions: North Adriatic Sea and Northwestern Aegean Sea. Journal of Marine Systems 30, 181–205 (2001) 5. Zhang, J., Yu, Z.G., Rabbc, T., et al.: Dynamics of inorganic nutrient in the Bohai seawaters. Journal of Marine System 44, 189–212 (2004) 6. Chen, G.Z., Niu, G.Y., Wen, S.C., et al.: Marine Atlas of Bohai Sea, Huanghai Sea, East China Sea. Ocean Press, Beijing (1992) (in Chinese) 7. Fei, Z., Mao, X., Zhu, M., et al.: The study on the primary productivity in the Bohai Seachlorophyll a, primary productivity and potential fisheries resources. Marine Fisheries Research 12, 55–69 (1991) (in Chinese) 8. Wei, H., Sun, J., Moll, A., et al.: Plankton dynamics in the Bohai Sea- observations and modeling. Journal of Marine System 44, 233–251 (2004) 9. Redfield, A.C., Ketchum, B., Richards, F.A.: The influence of organisms on the composition of seawater. In: Hill, M.N. (ed.) The Sea, New York, vol. 2, pp. 26–77. Wiley Interscience, Hoboken (1963) 10. Ryther, J.H., Dunstan, W.M.: Nitrogen, phosphorous, and eutrophication in the coastal marine environment. Science 171, 1008–1013 (1971) 11. Kang, Y.: Distribution and seasonal variation of phytoplankton in the Bohai Sea. Marine Fisheries Research 12, 31–54 (1991) (in Chinese) 12. Zou, L., Zhang, J., Pan, W., et al.: In situ nutrient enrichment experiment in the Bohai Sea and Yellow Sea. Journal of Plankton Research 23, 1111–1119 (2001) 13. Zhao, L., Wei, H.: The influence of physical factors on the variation of phytoplankton and nutrients in the Bohai Sea. Journal of Oceanography 61, 335–342 (2005) 14. Radach, G., Moll, A.: Review of three-dimensional ecological modelling related to the North Sea shelf system. Part II: Model validation and data needs. Oceanography and Marine Biology 44, 1–60 (2006)
Personalized Reconstruction of 3D Face Based on Different Race
Diming Ai2,**, Xiaojuan Ban1,*, Li Song2, and Wenxiu Chen1
1 School of Information Engineering, University of Science and Technology Beijing, Beijing 100083
2 Beijing Institute of Special Vehicles, Beijing 100072
[email protected] [email protected]
Abstract. A 3D face reconstruction method for different races is proposed in this paper. It chooses a different standard face model according to race, adjusts and combines the extracted features with the corresponding model to acquire a personalized model that reflects the race, and then creates a realistic 3D face by adding texture information with the texture mapping technique. The final results and quantitative analysis show that the features adapt to the standard face model more effectively and that realistic 3D faces are reconstructed successfully.
Keywords: Personalization, 3D model, Radial basis function, Texture mapping.
1 Introduction
The face is the most expressive part of the human body and shows great diversity and individuality. In recent years, with the development of computer graphics, 3D face modeling has become a hotspot in computer graphics research and has received increasing attention. Over the past 30 years, research on using computers to synthesize realistic human faces has achieved notable results. Parke [1] first used a parametric face model to create face images; Platt [2] and Waters proposed building virtual faces with muscle models; Horace et al. [3] synthesized faces from two orthogonal photographs; Blanz et al. [4] proposed a face modeling method based on statistical inference. Although the above methods are able to reconstruct 3D faces, the results of personalized reconstruction were not very satisfactory. This paper takes the National Natural Science Foundation project "Gelcasting medical and porous titanium alloy implant materials" as its scientific background; the project mainly studies the basic theory and the critical processes of gelcasting medical porous titanium alloy implant materials. In order to make the later mold design and production highly accurate, a personalized and realistic 3D face must be reconstructed in the 3D face modeling stage. Therefore, according to these
Xiaojuan Ban, Professor, her research field is Artificial Intelligence and Computer Animation. ** Diming Ai, Senior Engineer, his research field is Artificial Intelligence.
requirements, a personalized 3D face reconstruction method for different races is proposed in this paper. It is mainly based on a single frontal face image as input: some feature points of the image are selected, different standard face models are adopted for people of different regions (for example, Asians, Africans, Europeans), the feature points are extracted automatically, and the extracted feature points are then used to modify the model. Finally, combining the side photograph, the texture is mapped onto the personalized face model to synthesize a realistic 3D face with regional characteristics. The method is quick and simple, reduces complexity, is convenient to use, and greatly improves the realism of the face. Standard model selection and model adjustment are introduced in detail in this paper.
2 Standard Face Model Selection of Different Racial Types
Although every face shares the same general parts (eyes, nose, mouth, ears, cheeks, etc.), the details can differ greatly, and different racial types have distinct characteristics. Generally speaking, African faces feature a round head, sloping forehead, and a flat, broad nose; Europeans have narrower but clean-cut faces, high noses and deep-set eyes; Asians have wider faces, broad cheek-bones and smaller, flatter noses. These features largely depend on the Z-axis coordinates of the model, because the Z values determine the height of detailed parts such as the forehead, nose and eyes. Previous studies did not pay attention to these facial details, mainly in the following two respects: (1) they did not distinguish race, gender or age, and used a single general face model; (2) model adjustment based on a single image could only pan and zoom the X and Y coordinates and could not change the Z coordinates, so the Z coordinates of the face model remained at their default values. This inevitably results in a large shortfall in the realism of the 3D face. Therefore this paper focuses on how to construct face models of different racial types and reconstruct realistic 3D faces.
Fig. 1. African, European and Asian grid models
In this paper, the FaceGen professional modeling software is used to create standard 3D face models of Africans, Europeans and Asians, which are then edited and adjusted with 3DS MAX and MAYA. Different racial types use different average model depth values so as to improve the realism of the personalized face, and the models can then be better applied to computer imagery in various fields. The generalized face models are shown in Fig. 1.
3 Geometric Adjustment of the General Model
The ultimate goal is to obtain a realistic face with regional features, so the general face model must be modified to obtain a specific face model with the desired facial characteristics. This requires two steps. The first step is the global transformation of the general face model, which modifies the outline of the facial model so that the face shape and facial features of the general model roughly coincide with those of the specific face. The second step uses the extracted features (12 feature points are selected in this paper) to perform a local transformation of the globally transformed model, mainly to further modify the positions of the eyes, nose, mouth and chin and depict the detailed features of the specific face.
3.1 Global Adjustment
Any point P(x, y, z) of the face grid model undergoes a rigid motion to the target point P′(x′, y′, z′), described with infinitesimal Euler angles as:
\[
\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} =
\begin{bmatrix} 1 & -\Delta\theta_z & \Delta\theta_y \\ \Delta\theta_z & 1 & -\Delta\theta_x \\ -\Delta\theta_y & \Delta\theta_x & 1 \end{bmatrix}
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & s_z \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} +
\begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}
\qquad (1)
\]
Here θx, θy and θz are the rotation angles around the X, Y and Z axes; sx, sy and sz are the scaling factors in the x, y and z directions respectively; and (tx, ty, tz)T is the translation vector. The global transformation is thus equivalent to estimating nine rigid motion parameters. Since this paper reconstructs the face model from facial photographs taken in a correct (frontal) posture, the rotation angles are taken as 0, and the simplified formula of the global adjustment is as follows:
\[
\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} =
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & s_z \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} +
\begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}
\qquad (2)
\]
Thus the global transformation is equivalent to estimating the scaling factors (sx, sy, sz) of the model and the translation vector (tx, ty, tz)T. The six parameters can be obtained from distances between feature points of the 3D face model and of the 2D face. In the face photograph, we define Pt, Pr, Pc and Pm as the center of the left eye, the center of the right eye, the midpoint between the eyes, and the center of the mouth respectively; Pt′, Pr′, Pc′ and Pm′ are the 2D projections of the corresponding features of the 3D face model. The scale factors sx and sy are then defined as sx = |Pt − Pr| / |Pt′ − Pr′| and sy = |Pc − Pm| / |Pc′ − Pm′|, and the translation factors as tx = ty = |Pm − Pm′|. In a 2D image the depth of the face is not visible, so, to keep the procedure automatic, the average of sx and sy is used as the Z-direction ratio sz.
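As an illustration of this global adjustment, the following minimal sketch computes the scale and translation parameters from the eye and mouth centers and applies Eq. (2). The point names and sample coordinates are hypothetical, and the simple translation rule follows the definition given above; it is a sketch of the idea, not the authors' implementation.

```python
# Minimal sketch of the global adjustment (Eq. 2); all inputs are hypothetical.
import numpy as np

def global_adjustment_params(pl, pr, pc, pm, pl_m, pr_m, pc_m, pm_m):
    """Estimate scaling (sx, sy, sz) and translation (tx, ty) from eye/mouth centers."""
    pl, pr, pc, pm = map(np.asarray, (pl, pr, pc, pm))          # image feature points
    pl_m, pr_m, pc_m, pm_m = map(np.asarray, (pl_m, pr_m, pc_m, pm_m))  # projected model points
    sx = np.linalg.norm(pl - pr) / np.linalg.norm(pl_m - pr_m)  # eye-to-eye distance ratio
    sy = np.linalg.norm(pc - pm) / np.linalg.norm(pc_m - pm_m)  # eyes-to-mouth distance ratio
    sz = 0.5 * (sx + sy)                                        # depth scale: average of sx and sy
    tx = ty = np.linalg.norm(pm - pm_m)                         # translation from mouth centers
    return sx, sy, sz, tx, ty

def apply_global_adjustment(vertices, sx, sy, sz, tx, ty, tz=0.0):
    """Apply Eq. (2): scale each model vertex and translate it."""
    return np.asarray(vertices) * np.array([sx, sy, sz]) + np.array([tx, ty, tz])

if __name__ == "__main__":
    # made-up coordinates, purely for demonstration
    params = global_adjustment_params((-27, 36), (27, 36), (0, 36), (0, -30),
                                      (-20, 30), (20, 30), (0, 30), (0, -25))
    print(apply_global_adjustment(np.random.rand(10, 3), *params).shape)
```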
After the global adjustment, the 2D projections of the eye center and mouth center of the face model correspond to the detected feature points in the face image.
3.2 Local Adjustment
After the global adjustment of the face model, the 12 feature point parameters obtained from the frontal and lateral face images are used for further local adjustment. If a smooth interpolation function f(p) between the general face model and the specific face model can be established such that f(p) is satisfied at every feature point, the displacement of every point of the general face model can be obtained. According to previous studies, radial basis functions (RBFs) [5] give very good results for face surface adjustment, because RBFs have a good ability to interpolate scattered spatial data. For RBFs in face reconstruction, Gaussian functions and compactly supported functions are commonly used as the basis. The specific face is constructed by putting all the face feature points and the general face model into a unified set of constraint equations and solving for the specific face model. The RBF formulation turns the face deformation problem [6] into a multivariate scattered-data interpolation problem: all grid points of the three-dimensional model and the locations of the n feature points are known, and the feature points move from the original locations Pi (1 ≤ i ≤ n) to new locations Pi′. The displacement of an arbitrary point is calculated as follows:
\[
f(p) = \sum_{i=1}^{n} c_i \,\phi\!\left(\lVert p - p_i \rVert\right) + Mp + t
\qquad (3)
\]
Here n is the total number of face model feature points; ci is the coefficient of the corresponding basis function; Pi (1 ≤ i ≤ n) is the i-th feature point of the model; ||p − pi|| is the Euclidean distance between p and pi; φ(||p − pi||) is the basis function; and Mp + t is the affine component expressing the global transformation. In 3D face reconstruction the feature points are three-dimensional, so the affine components M and t are a 3×3 matrix and a 3×1 vector respectively. The new location p′ of a non-feature point p is p + f(p). The choice of basis function is an important step in reconstructing features with RBFs, since different basis functions have different characteristics; in specific face model reconstruction, the Gaussian function and quadratic (fitting) polynomials are commonly used, and different functions give different reconstruction effects. Through many experiments we generally select φ(r) = e^(-r/R) as the basis function, where R controls the range of deformation according to the value of r. The feature point displacements are ΔPi (1 ≤ i ≤ n), defined as ΔPi = Pi′ − Pi = f(Pi); substituting the n feature points into equation (3) we get:
\[
\Delta p_i = \sum_{j=1}^{n} c_j \,\phi\!\left(\lVert p_i - p_j \rVert\right) + M p_i + t,
\qquad (1 \le i \le n)
\qquad (4)
\]
The affine transformation constraint conditions are set as:
\[
\sum_{j=1}^{n} c_j = 0, \qquad \sum_{j=1}^{n} c_j \, p_j = 0
\qquad (5)
\]
Their function is to eliminate the effect of the affine component in the radial basis functions. Combining formulas (4) and (5), we obtain the linear system:
\[
\begin{bmatrix}
\phi_{11} & \phi_{12} & \cdots & \phi_{1n} & p_{1x} & p_{1y} & p_{1z} & 1 \\
\phi_{21} & \phi_{22} & \cdots & \phi_{2n} & p_{2x} & p_{2y} & p_{2z} & 1 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
\phi_{n1} & \phi_{n2} & \cdots & \phi_{nn} & p_{nx} & p_{ny} & p_{nz} & 1 \\
p_{1x} & p_{2x} & \cdots & p_{nx} & 0 & 0 & 0 & 0 \\
p_{1y} & p_{2y} & \cdots & p_{ny} & 0 & 0 & 0 & 0 \\
p_{1z} & p_{2z} & \cdots & p_{nz} & 0 & 0 & 0 & 0 \\
1 & 1 & \cdots & 1 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
c_1 \\ c_2 \\ \vdots \\ c_n \\ M \\ t
\end{bmatrix}
=
\begin{bmatrix}
\Delta p_1 \\ \Delta p_2 \\ \vdots \\ \Delta p_n \\ 0 \\ 0 \\ 0 \\ 0
\end{bmatrix}
\qquad (6)
\]
where φij = φ(||pi − pj||) (1 ≤ i, j ≤ n) and (pix, piy, piz) (1 ≤ i ≤ n) are the coordinates of the feature point pi. Solving the linear system (6) gives the basis function coefficients in equation (4) and the affine components M and t.
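A minimal numerical sketch of this local adjustment is given below, assembling and solving the system of Eq. (6) with the φ(r) = e^(-r/R) kernel and then warping arbitrary mesh vertices with Eq. (3). The feature points, radius R and mesh vertices are hypothetical placeholders; this is only an illustration of the technique, not the authors' code.

```python
# Sketch of the RBF-based local adjustment (Eqs. 3-6), numpy only.
import numpy as np

def fit_rbf(src_pts, dst_pts, R=30.0):
    """Solve Eq. (6) for the RBF coefficients c, affine matrix M and offset t."""
    src = np.asarray(src_pts, float)             # n x 3 original feature points
    disp = np.asarray(dst_pts, float) - src      # n x 3 displacements (delta p_i)
    n = len(src)
    phi = np.exp(-np.linalg.norm(src[:, None] - src[None, :], axis=2) / R)
    A = np.zeros((n + 4, n + 4))
    A[:n, :n] = phi                              # phi_ij block
    A[:n, n:n + 3] = src                         # affine part M
    A[:n, n + 3] = 1.0                           # affine part t
    A[n:n + 3, :n] = src.T                       # constraint: sum c_j p_j = 0
    A[n + 3, :n] = 1.0                           # constraint: sum c_j = 0
    b = np.zeros((n + 4, 3))
    b[:n] = disp
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n:n + 3], sol[n + 3], src, R

def deform(points, c, M, t, src, R):
    """Evaluate Eq. (3) and move every mesh vertex to p + f(p)."""
    pts = np.asarray(points, float)
    phi = np.exp(-np.linalg.norm(pts[:, None] - src[None, :], axis=2) / R)
    return pts + phi @ c + pts @ M + t

if __name__ == "__main__":
    src = np.random.rand(12, 3) * 100            # 12 made-up feature points
    dst = src + np.random.randn(12, 3)           # slightly displaced targets
    verts = np.random.rand(500, 3) * 100         # made-up mesh vertices
    print(deform(verts, *fit_rbf(src, dst)).shape)   # -> (500, 3)
```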
4 Reconstruction Process of the 3D Face
The whole process of personalized face reconstruction mainly comprises three modules: face detection, feature point location and extraction, and realistic 3D face synthesis. The
Fig. 2. The flowchart of face reconstruction
Fig. 3. Face feature points extraction
operating environment is an ordinary PC with a 2.4 GHz processor, VC++ 6.0 and OpenGL; the face images are obtained with an ordinary digital camera. The whole process is shown in Fig. 2. Face detection mainly uses skin color clustering: whatever the skin tone, the skin color of an ethnic group is distributed in a certain region of the color space. Face feature points are located automatically for the eyes, nose and mouth. Because the model adjustment is computationally demanding, only 12 feature points are extracted from these three parts: 6 points at the eye corners, 3 points around the mouth corners and mouth center, and 3 points at the nose wings and nose tip. The extraction results are shown in Fig. 3. As described in Section 3, the extracted feature points are used to adjust the general face model of the given racial type to obtain the specific 3D face model, and two pictures, from the front and the side, are then used to make the texture image. Following the basic idea of the method of WON-SOOK LEE [7], the characteristic lines for image deformation are predefined, the face images are warped, and the frontal and lateral images are spliced along the eye corners and mouth corners. Because the two photographs are taken under different illumination conditions, the skin color varies considerably, and simply splicing at the boundary leads to obvious seams; therefore the Laplacian pyramid image fusion method [8] is used to eliminate the seam line and smooth the transition, producing a seamless face texture image, as shown in Fig. 4 and Fig. 5.
Fig. 4. Asian images from the front and the side respectively and the seamless face texture image
Fig. 5. European images from the front and the side respectively and the seamless face texture image
Fig. 6. Realistic 3D face from different perspectives
Finally, the texture coordinates are calculated and texture mapping [9] is applied; the realistic face reconstruction results are shown in Fig. 6.
5 Model Fitting Analysis
For a quantitative comparison, taking Asian subjects as an example, the X and Y coordinates of 12 feature points are extracted from a frontal face photograph of an Asian person. For the Z coordinate the average value of the model is used; for simplicity, the Z coordinates are not shown in the figure. Table 1 lists part of the normalized feature point coordinates.

Table 1. Feature points coordinates normalization

feature point          x, y          x', y', z'
Left pupil             (-27, 36)     (-0.48214, 0.88888, 0.166676)
Right wing of nose     (17, 0)       (0.30357, 0, 0.00231)
Left corner of lip     (-19, -18)    (-0.33928, -0.44444, 0.06839)
Using the 12 normalized 3D coordinates shown in the table (6 for the eyes, 3 for the nose, 3 for the mouth), we compare the fit of the corresponding points of the Asian face model and of the general model. The results are shown in Fig. 7: green dots are the x, y coordinates of the extracted feature points, red dots are the corresponding points of the Asian face model, and blue dots are the corresponding points of the general face model. The graph clearly shows that the red dots fit the green dots better than the blue dots do; in other words, with race-specific face models, the reconstructed 3D face of a given racial type is more realistic.
Fig. 7. Feature points fitting of Asian face model contrast with others
6 Conclusion and Prospect
Using different general face models, a method for synthesizing 3D faces with regional features is proposed in this paper. First, given a frontal face picture, the feature point information of the specific face is extracted automatically; the general 3D face model is then modified according to this information to obtain the specific 3D face model; finally, texture mapping is used to synthesize a realistic virtual 3D face. The experimental results show that the method is feasible and that the reconstruction is fast and realistic, so the method has practical value. In order to limit the complexity of the model adjustment, only 12 feature points of the simplified model were selected and the face outline information was not extracted; further research in this direction will make the reconstructed 3D face even more realistic.
Acknowledgments. This work is supported by the National High-tech R&D Program of China (863 Program) (No. 2006AA06Z137, No. 2009AA04Z163), the National Natural Science Foundation of P.R. China (No. 50634010 and No. 60973063), and the Beijing Natural Science Foundation of P.R. China (No. 4092028).
References 1. Parke, F.I.: Aparametric model for human faces. Technical Report UTEC-CSc-75. University of Utah, Salt Lake City, Utah, USA (1974) 2. Platt, S.M., Badler, N.L.: Animating facial expressions. Computer Craphics 15(3), 245–252 (1981) 3. Blanz, V., Vetter, T.: A morphable model for the synthesis of 3D faces. In: SIGGRAPH Proceedings, Orlando, FL, USA, pp. 71–78 (1999) 4. Horce, H.S., Yin, I.P., Li, J.: Constructing 3D Individualized Head Model from Two Orthogonal Views. The Visual Computer 12(5) (1996) 5. Shen, R.R.: Research on 3D Personalized Face Reconstruction Based on RBFs. University of Jiangsu (2008) 6. Zhan, Y.Z., Shen, R.R., Zhang, J.M.: 3-D Personalized Face, Reconstruction Based on Multi-layer and Multi-region with RBFs. In: Pan, Z., Cheok, D.A.D., Haller, M., Lau, R., Saito, H., Liang, R. (eds.) ICAT 2006. LNCS, vol. 4282, pp. 775–784. Springer, Heidelberg (2006) 7. Lee, W.B., Thalmann, N.M.: Head Modeling from Pictures and Morphing in 3D with Image Metamorphosis based on triangulation. In: Magnenat-Thalmann, N., Thalmann, D. (eds.) CAPTECH 1998. LNCS (LNAI), vol. 1537, pp. 254–267. Springer, Heidelberg (1998) 8. Xu, X.G., Bao, H.J., Ma, L.Z.: Study on Texture Synthesis. Journal of Computer Research and Development 39(11), 1405–1411 (2002) 9. Zhang, M.T., Ma, L.N.: An Image-Based Individual Facial Modeling Generation System. Computer Engineering and Applications 26, 92–94 (2004)
Lake Eutrophication Evaluation and Diagnosis Based on Bayesian Method and SD Model
Kai Huang1,*, Xulu Chen1, and Huaicheng Guo2
1 College of Environmental Science and Engineering, Beijing Forestry University, Beijing, China, 100083
2 College of Environmental Science and Engineering, Peking University, Beijing, China, 100871
[email protected]
Abstract. In order to comprehensively evaluate the eutrophication degree of Lake Dianchi, a Bayesian method was applied in this paper. The evaluation results showed that the eutrophication status of Lake Caohai was more serious than that of Lake Waihai, while the eutrophication degree has been improving in recent years. In addition, an SD model was established to diagnose the socio-economic factors that cause Lake Dianchi eutrophication, and the relationship between socio-economic development and eutrophication was analyzed, which provides a theoretical basis for planning urban population distribution and the distribution of industrial sectors. Finally, the N/P ratios of Lake Dianchi, which influence the growth of cyanobacteria, were analyzed. The results showed that Lake Caohai and Lake Waihai lie on opposite sides of the inflection point of 15, so eutrophication in Lake Caohai and Lake Waihai should be treated differently.
Keywords: Lake Dianchi, Eutrophication, Bayesian, SD, N/P Ratios, Control Strategies.
1 Introduction
Along with rapid economic development and population growth, eutrophication has appeared since the 1970s in a large number of water bodies, represented by Lake Taihu, Lake Dianchi and Lake Chaohu, with negative impacts on people's lives and on economic and social development. For a seriously polluted lake, the degree of eutrophication should be determined and its causes diagnosed in the process of environmental management; this is helpful for effective planning and management. Research on lake eutrophication in China has a short history. Methods presented to date include the Carlson Trophic State Index, the revised Carlson Trophic State Index, the Nutritional Index Method, the Integrated Nutrition State Index Method, the Scoring Method, the Fuzzy Evaluation Method, the Clustering Method, the State Matrix, the Material Element Method, and Artificial Neural
Corresponding author.
Network Models, among others [1-3]. Among them, the random evaluation formula based on the Bayesian method can handle uncertainty in the evaluation by calculating the maximum probability to judge the eutrophication level. Owing to its use of probability theory, the Bayesian method can improve the accuracy and credibility of eutrophication evaluation. This paper applies the Bayesian method to evaluate the eutrophication level of Lake Dianchi, and analyzes the changes in nitrogen and phosphorus concentrations in the lake. After the eutrophication evaluation, SD (System Dynamics) models are established to diagnose the relations between the socio-economy and eutrophication. Finally, the N/P ratios of Lake Dianchi, which influence the growth of cyanobacteria, are analyzed. The paper is intended to provide a theoretical basis for realizing the long-term goal of controlling the eutrophication of Lake Dianchi based on the "12th Five-Year" plan.
2 Randomized Evaluation of Lake Eutrophication Based on the Bayesian Method
Bayes' theorem is a theorem in probability theory which can be represented by formula (1). Introducing the Bayesian formula into lake eutrophication assessment introduces the idea of probability into the diagnosis. The calculation steps of the Bayesian method are as follows. Firstly, the rating level of each single water quality index is calculated, and the assessment of this single level is determined using maximum likelihood classification. Secondly, the weighted average method is used to determine the assessment level of the single index. Finally, the evaluation level of the water quality indices of eutrophication is estimated with the weighted average method [1].
\[
P(B_i \mid A) = \frac{P(B_i)\,P(A \mid B_i)}{\sum_{i=1}^{n} P(B_i)\,P(A \mid B_i)}
\qquad (1)
\]
In formula (1), P(A | Bi) and P(Bi | A) are conditional probabilities, and P(Bi) is the prior probability of the event. The measured eutrophication concentration matrix is X = (xjk)m×n, where n is the number of samples, m is the number of nutritional evaluation indices, and x is the measured water quality indicator. The eutrophication standard concentration matrix is J = (yij)m×c (the evaluation criteria are shown in Table 1), where c is the number of eutrophication status classes and y is the criterion concentration, with j = 1,2,…,m; k = 1,2,…,n; i = 1,2,…,c. Let Bi be the event that the value xjk of a water quality index belongs to water quality level i, i = 1,2,…,c; j = 1,2,…,m. The uncertainty of the eutrophication assessment can then be represented by the conditional probability P(Bi | xjk).
Table 1. Eutrophication Evaluation Criteria

Eutrophication Status   Chla (mg/m3)   TP (mg/m3)   TN (mg/m3)   COD (mg/m3)   SD (m)
Low                     1              2.5          30           300           10
Low-middle              2              5            50           400           5
Middle                  4              25           300          2000          1.5
Middle-high             10             50           500          4000          1
High                    65             200          2000         10000         0.4
Very high               160            600          6000         25000         0.3
The degree of eutrophication is evaluated by the Bayesian method as follows:
(1) Calculate the probability of a single water quality index xjk belonging to level Bi:
\[
P_{ji} = \frac{1/L_{ji}}{\sum_{i=1}^{c} 1/L_{ji}}
\qquad (j = 1,2,\ldots,m;\; i = 1,2,\ldots,c)
\qquad (2)
\]
According to the concept of geometric probability, P(xjk | Bi) is assumed to be inversely proportional to the distance Lji, shown in Figure 1 and calculated by the following formula:
\[
L_{ji} = \lvert x_{jk} - y_{ji} \rvert
\qquad (3)
\]
Fig. 1. The distance of water quality index from evaluation level
(2) Estimate the comprehensive multi-index probability Pi:
\[
P_i = \sum_{j=1}^{m} w_j \, P(y_{ji} \mid x_{jk})
\qquad (4)
\]
Since P(yji | xjk) is Pji, this becomes:
\[
P_i = \sum_{j=1}^{m} w_j \, P_{ji}
\qquad (5)
\]
Here wj is the weight of each water quality index; the influence of each index on the water type can be determined from the literature or from the actual use of the water [2]. This paper uses the weights of Aizaki's modified TSI, w(Chla, SD, TP, TN, COD, SS) = (0.440, 0.242, 0.149, 0.083, 0.052, 0.034) [4].
(3) Decide the level Ph by the maximum probability principle:
\[
P_h = \max_i P_i
\qquad (6)
\]
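To make the procedure concrete, the short sketch below implements steps (1)-(3) with the criteria of Table 1 and the Aizaki-based weights quoted above (SS is omitted here because Table 1 lists no SS criterion). The sample concentrations in the usage example are made up; this is an illustrative sketch, not the study's code.

```python
# Minimal sketch of the Bayesian eutrophication evaluation (Eqs. 2-6).
import numpy as np

LEVELS = ["Low", "Low-middle", "Middle", "Middle-high", "High", "Very high"]
CRITERIA = {                       # y_ji values, following Table 1
    "Chla": [1, 2, 4, 10, 65, 160],
    "TP":   [2.5, 5, 25, 50, 200, 600],
    "TN":   [30, 50, 300, 500, 2000, 6000],
    "COD":  [300, 400, 2000, 4000, 10000, 25000],
    "SD":   [10, 5, 1.5, 1, 0.4, 0.3],
}
WEIGHTS = {"Chla": 0.440, "SD": 0.242, "TP": 0.149, "TN": 0.083, "COD": 0.052}

def evaluate(sample):
    """Return (level, probabilities P_i) for one sample {index: concentration}."""
    P = np.zeros(len(LEVELS))
    for index, x in sample.items():
        L = np.abs(x - np.array(CRITERIA[index], float))   # Eq. (3): distances L_ji
        L[L == 0] = 1e-9                                    # guard against division by zero
        Pji = (1.0 / L) / np.sum(1.0 / L)                   # Eq. (2)
        P += WEIGHTS[index] * Pji                           # Eqs. (4)-(5)
    return LEVELS[int(np.argmax(P))], P                     # Eq. (6)

if __name__ == "__main__":
    level, probs = evaluate({"Chla": 70, "TP": 180, "TN": 2100, "COD": 9000, "SD": 0.5})
    print(level, np.round(probs, 3))
```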
3 Eutrophication Evaluation of Lake Dianchi
3.1 Eutrophication Evaluation
Lake Dianchi is the sixth largest freshwater lake in China, with a surface area of about 306.3 km2. The location of the Lake Dianchi watershed is shown in Figure 2. Lake Dianchi is divided into Lake Caohai and Lake Waihai by artificial gates. Lake Caohai is near the city and receives most of the domestic sewage.
Fig. 2. Location of Lake Dianchi Watershed
The TP, TN, SD, COD and Chla indicators of Lake Caohai and Lake Waihai were analyzed by the Bayesian method. The results are shown in Table 2.

Table 2. Eutrophication level of Lake Caohai and Lake Waihai from 1999 to 2008

Year          1999       2000       2001       2002       2003   2004       2005       2006   2007   2008
Lake Caohai   Very high  Very high  Very high  Very high  High   Very high  Very high  High   High   High
Lake Waihai   High       High       High       High       High   High       High       High   High   High
The table shows that the eutrophication level of Lake Caohai was very high except in 2003, 2006, 2007 and 2008, while that of Lake Waihai was high throughout 1999-2008. Although the eutrophication status of Lake Caohai was more serious than that
of Lake Waihai, it has improved markedly since 2006, when the "Eleventh Five-Year Plan" was implemented. No significant changes have taken place in Lake Waihai, whose eutrophication level is still high.
3.2 Nitrogen and Phosphorus Analysis
The changes in the eutrophication level of Lake Caohai and Lake Waihai are reflected by the comprehensive evaluation, but the Bayesian method cannot identify the underlying causes. Therefore, the annual average concentrations of TN and TP from 1988 to 2008 were analyzed further, as shown in Figure 3 and Figure 4.
Fig. 3. TN mean value curves of Lake Dianchi from 1988 to 2008 (unit: mg/L)
Fig. 4. TP mean value curves of Lake Dianchi from 1988 to 2008 (unit: mg/L)
The results show that the TN and TP concentrations of Lake Caohai have continued to rise since 2005, while the results from the Bayesian method still appear as the very high level; the real changes in nitrogen and phosphorus are therefore not reflected pertinently. The TP concentration of Lake Waihai has come down markedly, showing that eutrophication control has had some effect, which is also not reflected in the eutrophication evaluation conclusion.
4 Eutrophication Diagnosis of Lake Dianchi
4.1 The SD Model of Socio-economic Drivers and Eutrophication
It is well known that population, economic development and environmental pollution are closely connected, influencing and interacting with one another [5]. To diagnose the socio-economic factors that cause lake eutrophication, SD (system dynamics) models were established. In general, cyanobacteria in water can use more nitrogen than the available phosphorus, so phosphorus is usually the limiting factor. In a study of the relationship between physical-chemical factors and the dynamic changes of chlorophyll, the OECD has already shown that there is no significant correlation between nitrogen and phytoplankton biomass [6]. Therefore, in establishing the SD models, this paper focuses on phosphorus emissions and phosphorus pollution, and TP is selected as the indicator to control; similar SD models can be established for TN control. Considering the relationship between socio-economic development and lake eutrophication, the modeling scheme diagnoses the system from the population-driven and the three major industry-driven perspectives.
4.1.1 The SD Model of Population-Driven and Eutrophication
Water quality is usually closely related to the population [7]. The SD model in Figure 5 was established to demonstrate the influence of population growth on TN and TP emissions; it also yields the contributions of the agricultural and non-agricultural populations to eutrophication.
Non-agricultural Population
Agricultural Population Rural LifeTN Emission Coefficient
Rural Life TN Emissions
Urban Domestic Sewage Emissions
Natural Growth Rate Total Population
Urban Population Annual Emission Coefficient
Population Variation Initial Population Value
<Time>
Rural Life TP Emission Coefficient Rural Life TP Pollution
Urban Domestic TP Emissions
Urban Domestic Sewage TN Concentration Urban Domestic TN Emissions
Urban Domestic Sewage TP Concentration
Fig. 5. The SD Model of Population and N&P Emissions
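In the spirit of the population-driven model of Fig. 5, the following minimal stock-and-flow sketch integrates the population stock and reports yearly TP emissions. All coefficients (natural growth rate, urbanization rate, per-capita TP emission factors) are hypothetical placeholders, not the study's calibrated values, and the real SD model contains many more feedback links than this sketch.

```python
# Hypothetical stock-and-flow sketch of population-driven TP emissions.
def simulate_population_tp(years=10, population=6.0e6, natural_growth_rate=0.006,
                           urbanization_rate=0.7, urban_tp_per_capita=0.50,
                           rural_tp_per_capita=0.15):
    """Integrate the population stock and report yearly TP emissions (tonnes)."""
    results = []
    for year in range(years):
        urban = population * urbanization_rate           # non-agricultural population
        rural = population - urban                       # agricultural population
        tp_emission = (urban * urban_tp_per_capita       # urban domestic TP emissions (kg)
                       + rural * rural_tp_per_capita) / 1000.0   # -> tonnes
        results.append((year, int(population), round(tp_emission, 1)))
        population += population * natural_growth_rate   # population variation (flow)
    return results

if __name__ == "__main__":
    for year, pop, tp in simulate_population_tp():
        print(f"year {year}: population {pop}, TP emission {tp} t")
```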
4.1.2 The SD Model of Agriculture-Driven and Eutrophication
There are massive agricultural non-point sources in the Lake Dianchi watershed. Many of them are not only widely distributed but also complex, and they play a very important role in the eutrophication of Lake Dianchi. Agriculture is divided into the livestock and poultry industry and the plant industry; the phosphorus pollution from livestock husbandry and from phosphate fertilizer applied to farmland is taken into consideration. The corresponding SD models are shown in Figure 6 and Figure 7.
Fig. 6. The SD Model of Poultry Industry and TP Emissions
Fig. 7. The SD Model of Plant Industry and TP Emissions
The population-driven and agriculture-driven SD models are shown here as examples of the SD models of socio-economic drivers; SD models of other industries can be established by the same method. After the models are established, the recorded data in the statistical yearbook can be input into them, and the relationship between the socio-economic drivers and eutrophication can then be studied.
4.2 Eutrophication Diagnosis Based on N/P Ratios
Cyanobacteria bloom is the most obvious feature of eutrophication, and nitrogen and phosphorus are the dominant factors affecting algal growth. The N/P ratios of Lake Caohai and Lake Waihai from 1999 to 2008 were calculated (Table 3). The optimum N/P ratio for cyanobacteria growth is 15. The N/P ratio of Lake Caohai was below 15 from 1999 to 2008, as was that of Lake Waihai before 2006. In 2002 the N/P ratio of Lake Waihai was nearly 15, which was conducive to an outbreak of cyanobacteria. From 2005, the N/P ratio of Lake Waihai continued to increase and exceeded 20 in 2007, going beyond the optimum for algal growth. Considering eutrophication alone, the water quality of Lake Waihai had improved since 2007,
Table 3. N/P Ratios of Lake Dianchi from 1999 to 2008

Year          1999   2000   2001   2002   2003   2004   2005   2006   2007   2008
Lake Caohai   13.42  10.93  10.92  10.78  10.77  10.13  12.16  9.98   10.41  11.48
Lake Waihai   6.44   7.15   10.47  14.38  13.29  12.71  9.74   12.41  22.27  23.35
when its N/P ratio deviated from 15, because algal growth is inhibited by an excessively high N/P ratio.
5 Conclusion
The following conclusions can be drawn from the above analysis:
(1) The water quality of Lake Waihai is stable and better than that of Lake Caohai, whereas the water quality of Lake Caohai has deteriorated seriously and may deteriorate further. Based on the eutrophication evaluation by the Bayesian method, the eutrophication status of Lake Caohai is more serious than that of Lake Waihai.
(2) With the rapid socio-economic development of the Lake Dianchi watershed, sewage discharge and agricultural irrigation are growing, which has caused the eutrophication of Lake Dianchi. Therefore, to improve the eutrophication situation of Lake Dianchi, policy makers should be dedicated to controlling point source pollution, especially sewage treatment in the city.
(3) Based on the N/P ratio analysis, Lake Caohai and Lake Waihai lie on opposite sides of the inflection point of 15, so eutrophication in the two lakes should be treated differently, with the aim of keeping the N/P ratio away from 15, the value most favorable to cyanobacteria bloom. For Lake Caohai the TN input should be controlled, while for Lake Waihai the TP concentration should be kept down. Few outbreaks of bloom have occurred in lakes with low phosphorus levels, even when the nitrogen concentration is very high [8]. Therefore, eutrophication control strategies for Lake Dianchi should focus on reducing both internal and external pollution loads through ecological restoration methods [9]; reducing the phosphorus concentration will work efficiently in reducing cyanobacteria bloom in Lake Dianchi.
Acknowledgement. The paper is supported by the "National Major Science and Technology Program – Water Body Pollution Control and Remediation (Grant No. 2008ZX07102001)".
References 1. Xie, P., Li, D., Chen, G., Ye, A.: A Lake Eutriphication Stochastic Assessment Method By Using Bayesian Formula and Its Verification. Resources and Environment in the Yangtze Basin 14(2), 224–228 (2005)
2. Liao, J., Wang, J., Ding, J.: Water Quality Assessment of Main Rivers in Sichuan Based on Improved. Bayes Model. Journal of Sichuan Normal University (Natural Science) 32(4), 518–521 (2009) 3. Cai, Q., Liu, J., King, L.: A Comprehensive Model for Assessing Lake eutrophication. Chinese Journal of Applied Ecology 13(12), 1675–1679 (2002) 4. Aizaki, M., Iwakuma, T., Takamura, N.: Application of modified Carlson’s trophic state index to Japanese lakes and its relationship to other parameters related to trophic state. Research Report on National Institute of Environmental Studies 23, 13–31 (1981) 5. He, Y., Zhang, W., Li, G.: The Establishment of SD Pattern for Environmental Economy System. Journal of Jiangsu University of Science and Technology 3(4), 63–66 (2001) 6. Iwasa, Y., Uchida, T., Yokomizo, H.: Nonlinear behavior of the socio-economic dynamics for lake eutrophication control. Ecological Economics 63(1), 219–229 (2007) 7. Niu, T., Jiang, T., Chen, J.: Study on the Relationship Between Socioeconomic Development and Eutrophication in Coastal Water in Shenzhen. Marine Environmental Science 25(1), 41–44 (2002) 8. Xie, L., Xie, P., Li, S., Tang, H., Liu, H.: The Low TN:TP Ratio, a Cause or a Result of Microcystis Blooms. Water Research 37(9), 2073–2080 (2003) 9. Hein, L.: Cost-efficient Eutrophication Control in a Shallow Lake Ecosystem Subject to Two Steady States. Ecological Economics 59, 429–439 (2006)
Respiration Simulation of Human Upper Airway for Analysis of Obstructive Sleep Apnea Syndrome
Renhan Huang and Qiguo Rong*
College of Engineering, Peking University, Beijing 100871, P.R. China
[email protected], [email protected]
Abstract. Obstructive sleep apnea syndrome (OSAS) is a disease in which the pharyngeal portion of the airway collapses repeatedly during sleep, resulting in cessation of breathing. The potential pathogenic factors of OSAS are discussed from two main aspects: anatomic abnormalities of the upper airway, and weakness or absence of the neural control mechanism. In this study, a three-dimensional finite element model with high geometrical similarity to the real anatomical structure is built. Using the upper airway pressures measured during normal expiration and during apnea episodes, the flow field in the upper airway and the displacement of the surrounding soft tissue are calculated with a fluid-structure coupled algorithm, and the results for normal respiration and apnea episodes are compared. According to the results, the region where the maximum negative pressure and the largest displacement occur is the most likely site of airway collapse and breathing apnea.
Keywords: OSAS, upper airway, fluid-structure interaction, FEM.
1 Introduction
Obstructive Sleep Apnea Syndrome (OSAS) is a common sleep-related breathing disorder characterized by repetitive pharyngeal collapse and the cessation and reopening of airflow in the oral and nasal cavity (Figure 1). It is reported to affect approximately 4% of the United States population [1]. The severity of OSAS is measured by the apnea-hypopnea index (AHI), where apnea is defined as cessation of airflow for at least 10 seconds; for mild OSAS patients the AHI is 5-15, while for severe patients the AHI can exceed 30. The most typical symptoms are snoring and excessive daytime somnolence, which decrease quality of life and increase the risk of cardiovascular and cerebrovascular disease [2-4]. Although the pathology of OSAS is complicated, it can fundamentally be attributed to two main aspects: anatomic abnormalities of the upper airway and weakness or absence of the neural control mechanism. Narrowing and obstruction of the upper airway, caused by various anatomic abnormalities, greatly affect the flow field in the upper airway and can lead to collapse of some of its parts. In order to obtain sufficient airflow during inspiration, a more negative pressure is
Corresponding author.
needed in the region where the upper airway is narrow. As soon as the negative pressure drops below the pressure of the surrounding tissue, collapse occurs. Furthermore, to meet the needs of speech, swallowing, respiration and other physiological functions, a complex control system with more than twenty different muscles acts on the upper airway. These muscle groups interact in a complex fashion, constricting or dilating according to the breathing state, to maintain ventilation. If this neural control mechanism becomes weak or even absent, the upper airway may collapse under a small negative luminal pressure.
Fig. 1. Obstructive sleep apnea
Based on this physiological and pathological analysis of OSAS, the whole process, from the airflow entering the upper airway through the nasal cavity at the beginning of a breath to the collapse of the pharyngeal portion, is, from a mechanical point of view, a problem involving material and geometrical nonlinearity, fluid-structure interaction and self-adaptation. The motion of the upper airway during breathing apnea can be studied with mechanical models [5-6]. Each potential cause can be treated as a control factor of the mechanical model; by changing these control factors, the influence of the corresponding potential cause on OSAS can be studied. The biomechanical study of OSAS aims at providing theoretical principles and technical support for prevention and treatment, and a number of mechanical models have been developed with some useful results [7-16]. Based on CT images of ten volunteers, 3D FE models of the upper airway were reconstructed using surface rendering, and the airflow in the whole cavity was simulated numerically and analyzed by the FE method (Yinxi Liu et al.) [17]. A pharyngeal airway model characterized by maximum narrowing at the retropalatal pharynx was reconstructed from cross-sectional magnetic resonance images of a patient with obstructive sleep apnea, and two flow-modeling strategies, steady Reynolds-Averaged Navier-Stokes (RANS) methodology and Large Eddy Simulation, were employed to analyze the fluid field in the
upper airway(Mihai Mihaescu et al)[18]. A computational fluid dynamics model was constructed using raw data from three-dimensional computed tomogram images of an OSAS patient, and then the low Reynolds number κ − ε model was adopted to reproduce the important transition from laminar to turbulent flow in the pharyngeal airway (Soo-Jin Jeong et al) [19]. Computational fluid dynamic analysis was used to model the effect of airway geometry on internal pressure in the upper airway of three children with obstructive sleep syndrome and three controls. Model geometry was reconstructed from magnetic resonance images obtained during quiet tidal breathing, meshed with an unstructured grid, and solved at normative peak resting flow, the unsteady Reynolds-averaged Navier-Stokes equations were solved with steady flow boundary conditions in inspiration and expiration, using a two-equation low-Reynolds number turbulence model(Chun Xu et al)[20]. Up to now, all the models that involved OSAS study are simplified more or less in geometric configuration, especially the bone tissue and soft tissue around the upper airway are excluded in the model. In fact the bone tissue such as skull, neck and hyoid may restrict the deformation of the airway because of their high young’s modulus, and the soft tissue around the upper airway may act on it by active contraction. As a result, the skull, neck and hyoid and other anatomical characteristic such as maxillary antrum, sphenoid sinus and frontal sinus must be taken into account in order to obtain a result close to physiological condition. It is necessary of using fluid-structure interaction algorithm because the pressure originated from airflow acts on the wall of upper airway, resulting in the structural deformation which can change the pressure distribution in reverse. In this paper, a finite element model including airway, skull, neck, hyoid and soft tissue around the upper airway is presented, besides a preliminary fluid structure interaction simulation result from an respiration during a second is explained.
2 Method
Computer modeling was conducted using CT data obtained from a 26-year-old male subject. Three-dimensional CT scanning was performed on a GE Medical Systems LightSpeed VCT scanner with 1.25 mm slice thickness, while the subject was awake in the supine position. The scanned images were transferred to Materialise's Interactive Medical Image Control System (MIMICS) 10.0, an interactive tool for the visualization and segmentation of CT and MRI images and the 3D rendering of objects. The regions of interest were isolated and reconstructed into 3D models one by one by gray-level threshold segmentation. Different tissues have different densities, so each tissue occupies a specific gray-level range in the CT images: very dense parts correspond to high thresholds, soft tissue to low threshold values, and an interested part can be separated using an upper and a lower threshold. The thresholds used to obtain the airway, skull, neck, hyoid and soft tissue are given in Table 1. Each part of the model was then imported into Geomagic Studio 10 (reverse engineering software from Raindrop Geomagic, USA) for manual editing and was exported with NURBS (Non-Uniform Rational B-Splines) surfaces. All parts were then transferred to ANSYS 11.0 to be assembled into a whole model, which can be meshed
Table 1. Gray thresholds for Different Tissues

                   Cavum   Bone   Soft tissue
Lower threshold    -1024   226    -700
Higher threshold   -420    3071   225
using an unstructured grid. The 4-node tetrahedral element, which has good fitting ability, was selected because of the complexity of the model. A 4-node shell element type was used to mesh the outside surface of the airway; this surface not only serves as the parameter-transfer interface between the fluid domain and the structure domain required by the algorithm, but can also carry an initial stress for further study in the future. Finally, the meshed model was transferred to ADINA (Automatic Dynamic Incremental Nonlinear Analysis) 8.6.0 to perform the fluid-structure coupled simulation. Figure 2 and Figure 3 show the finite element model used in the calculation, with different tissues rendered in different colors; the model information is listed in Table 2. The ADINA system has both structural and fluid analysis capabilities, and the availability of both within the same code provides the basis for sophisticated fluid-structure interaction tools. For fluid-structure interaction problems, the fluid model must be formulated in an arbitrary Lagrangian-Eulerian (ALE) coordinate system, since the fluid-structure interface is deformable. The fundamental conditions applied to the fluid-structure interfaces are the kinematic condition, or displacement compatibility,
\[
d_f = d_s
\]
and the dynamic condition, or traction equilibrium,
\[
n \cdot \tau_f = n \cdot \tau_s
\]
where df and ds are the fluid and solid displacements and τf and τs are the fluid and solid stresses, respectively. The fluid and solid parts are coupled as follows: the fluid nodal positions on the fluid-structure interfaces are determined by the kinematic condition, and the displacements of the other fluid nodes are determined automatically by the program to preserve the initial mesh quality. The governing equations of fluid flow in their ALE formulation are then solved. In steady-state analyses, the mesh velocities are always set to zero even though the fluid nodal displacements are updated; accordingly, the fluid velocities on the fluid-structure interfaces are zero. According to the dynamic condition, on the other hand, the fluid traction is integrated into a fluid force along the fluid-structure interfaces and exerted on the structure nodes:
\[
F(t) = \int h_d \, \tau_f \cdot \mathrm{d}S
\]
where hd is the virtual quantity of the solid displacement [21].
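The sketch below illustrates, in schematic form, how such a partitioned coupling enforces the kinematic and dynamic interface conditions iteratively within one time step. The fluid and solid "solvers" are dummy placeholder functions (the real coupling is performed internally by ADINA); only the exchange-and-relax logic of the coupling loop is shown, under these stated assumptions.

```python
# Schematic partitioned fluid-structure coupling loop (placeholder solvers).
import numpy as np

def solve_fluid(interface_displacement):
    """Placeholder: move the fluid mesh, solve the ALE flow, return interface traction."""
    return -0.5 * interface_displacement + 1.0           # dummy traction field

def solve_solid(interface_traction):
    """Placeholder: apply the fluid force on interface nodes, return their displacement."""
    return 0.01 * interface_traction                      # dummy displacement field

def coupled_step(n_interface_nodes=100, tol=1e-8, max_iter=50, relax=0.5):
    d = np.zeros(n_interface_nodes)                       # interface displacement, d_f = d_s
    for it in range(max_iter):
        traction = solve_fluid(d)                         # dynamic condition: traction equilibrium
        d_new = solve_solid(traction)                     # kinematic condition: displacement compatibility
        residual = np.linalg.norm(d_new - d)
        d = (1 - relax) * d + relax * d_new               # under-relaxation for stability
        if residual < tol:
            break
    return d, it

if __name__ == "__main__":
    disp, iters = coupled_step()
    print(f"converged in {iters} iterations, max |d| = {disp.max():.3e}")
```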
Fig. 2. Finite element model

Table 2. FEM Model Information

               Airway      Skull       Neck        Hyoid       Soft tissue   Interface
Element Type   3-D Fluid   3-D Solid   3-D Solid   3-D Solid   3-D Solid     Shell
Node           15927       35271       36548       1924        465843        9461
Element        66498       31153       31223       1735        369933        18926
Table 3. Material Property for FEM Model [22]

                            Young's Modulus (Pa)   Poisson Ratio   Density (g/mm3)
Bone (Skull, Neck, Hyoid)   1.37×10^10             0.3             1.85×10^-3
Soft Tissue                 1.0×10^4               0.45            1.06×10^-3
Shell Part One              1.37×10^10             0.3             1.85×10^-3
Shell Part Two              1.0×10^4               0.45            1.06×10^-3
Shell Part Three            2.02×10^6              0.3             1.25×10^-3
In reality, the mechanical behavior of biological tissue is nonlinear; as a preliminary study, however, linear constitutive relations are used in order to limit computation time. Since water makes up most of the soft tissue, it can be taken as quasi-incompressible. The human upper airway is a complex lumen that can be partitioned into individual segments with distinct anatomical properties and physiological functions; these segments act as singularities whose contributions cannot be ignored in understanding the overall upper airway behavior. The surface of the airway is therefore divided into three parts, as shown in Figure 4. The part embedded in the nasal cavity (the green part in Fig. 4) hardly deforms during respiration because it is very close to the hard tissue, so a high Young's modulus is assigned to it. Anatomically, a series of cartilage rings surrounds the airway wall from the hyoid down to the bronchus, so this part (the purple part in Fig. 4) is given cartilage-like mechanical properties. The rest of the surface (the yellow part in Fig. 4) is treated with the same
Fig. 3. Norma Sagittalis of the FEA Model
Fig. 4. Segments of Upper Airway Wall
material properties as soft tissue. Three material models are used in total; the detailed values are given in Table 3. During sleep in the supine position, the back of the head rests on the bed, so the skull and the neck hardly move when breathing; their tiny displacements make almost no difference to the deformation of the upper airway. Therefore all degrees of freedom of the bone tissue are fixed, except for the hyoid, which is embedded in muscle and connects to neither the skull nor the neck. For the fluid model of the upper airway, a viscous incompressible laminar flow model is chosen. In fact, given the complexity of the upper airway geometry and the high instability of the airflow when collapse occurs, a turbulence model might capture reality better; however, the first step of this project focuses on the fluid-structure interaction effect, so turbulence is neglected and left for future work. The parameters used for the airflow are a density of 1.297×10^-6 g/mm3 and a viscosity of 1.81×10^-5 Pa·s. At the nostril, the airflow inlet, zero pressure is applied, while at the hypopharynx a time-varying pressure function, measured by titration during normal breathing, is applied as the outlet boundary condition. The gravity of the soft tissue anterior to the upper airway is not taken into account, because the airway configuration already reflects the effect of gravity in the supine position. A load segment lasting 1.2 seconds during expiration is selected for calculation. The time step is kept very small at the beginning in order to obtain a reasonable initial condition for the iteration in the transient analyses.
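As a small numerical aside on the linear elastic constants of Table 3, the following sketch converts each (E, ν) pair into the Lamé parameters used by a linear solid model; the grouping of the materials is an illustrative assumption, and the large λ/μ ratio of the soft tissue (ν = 0.45) reflects its quasi-incompressible treatment mentioned above.

```python
# Convert (Young's modulus, Poisson ratio) from Table 3 into Lame parameters (Pa).
def lame_parameters(E, nu):
    lam = E * nu / ((1 + nu) * (1 - 2 * nu))   # first Lame parameter
    mu = E / (2 * (1 + nu))                    # shear modulus
    return lam, mu

MATERIALS = {                                  # (Young's modulus [Pa], Poisson ratio)
    "bone (skull, neck, hyoid)": (1.37e10, 0.3),
    "soft tissue": (1.0e4, 0.45),
    "shell part three (cartilage-like)": (2.02e6, 0.3),
}

if __name__ == "__main__":
    for name, (E, nu) in MATERIALS.items():
        lam, mu = lame_parameters(E, nu)
        print(f"{name}: lambda = {lam:.3e} Pa, mu = {mu:.3e} Pa (lambda/mu = {lam/mu:.1f})")
```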
3 Results and Discussion
Figure 5 shows the displacement contour of the soft tissue. According to the distribution, the maximum displacement appears at the anterior of the soft tissue around the neck. This is reasonable, because this region is far from the fixed bone tissue and near the outlet, where the normal pressure traction is directly applied. The fact that the displacement
magnitude at 0.05 s is quite large compared with that at 1.2 s is also rational, as the pressure in the upper airway varies from its maximum to nearly zero during expiration. Figure 6 shows the pressure contour of the upper airway. The pressure near the nostril is close to atmospheric, and the maximum pressure at each time step follows the variation of the prescribed pressure function. It is worth noting the nasal cavity, where the pressure changes intensively: the complicated configuration of the nasal cavity increases the airway resistance and consequently produces a large pressure gradient.
Fig. 5. Displacement distribution in soft tissue at the 0.05s and 1.20s in expiration
Fig. 6. Pressure distribution in upper airway at the 0.05s, 0.4s, 0.8s and 1.20s in expiration
The current model could be improved in four respects. First, only 1.2 seconds of the expiration phase is simulated; a complete respiratory cycle usually lasts about 4 seconds and apnea occurs only in the inspiration phase, so the simulation must cover several respiratory cycles to be meaningful for obstructive sleep apnea syndrome. Second, although the present model reflects more anatomical detail than earlier, more simplified models, the muscles and fat should be added in subsequent work in
order to obtain more meaningful results. Third, the material model used here is linear elastic, which cannot truly reflect the mechanical behavior of soft tissue; the behavior of the muscle and fat around the upper airway is nonlinear, so a nonlinear constitutive relation must be used to achieve a precise simulation. In addition, the deformation of the collapsing part of the airway should be described with large deformation theory, since the deformation of the collapsed region, compared with the diameter of the upper airway, is beyond the small-deformation hypothesis; this has not yet been used in the present calculation. Last but not least, there is hardly any work that considers the neural control mechanism in respiration. Every physiological activity is accurately controlled by the nervous system, and respiration is no exception; how to incorporate this self-regulating feedback control mechanism into the model calculation is worth studying.
4 Conclusion In this study, a finite element model including the airway, skull, neck, hyoid and soft tissue around the upper airway is developed, and a preliminary fluid-structure interaction simulation of roughly one second of expiration is presented. Although this is only the first step of the whole project, the result verifies that the biomechanical method is workable and useful. Further research is in progress.
References 1. Young, T., Palta, M., Dempsey, J., Skatrud, J., Weber, S., Badr, S.: The occurrence of sleep-disordered breathing among middle-aged adults. The New England Journal of Medicine 328(17), 1230–1235 (1993) 2. Malhotra, A., White, D.P.: Obstructive sleep apnea. The lancet 360 (2002) 3. Stradling, J.R., Davies, R.J.O.: Obstructive sleep apnoea/hypopnoea syndrome: definitions, epidemiology, and natural history. Thorax 59, 73–78 (2004) 4. Ayappa, I., Rapoport, D.M.: The upper airway in sleep: physiology of the pharynx. Sleep Medicine Reviews 7, 9–33 (2003) 5. Huang, L., Quinn, S.J., Ellis, P.D.M., et al.: Biomechanics of snoring. Endeavour 19(3), 96–100 (1995) 6. Farrè, R., Rigau, J., Montserrat, J.M., et al.: Static and Dynamic Upper Airway Obstruction in Sleep Apnea. Am. J. Respir Crit. Care Med. 168, 659–663 (2003) 7. Payan, Y., Chabanas, M., et al.: Biomechanical models to simulate consequences of maxillofacial surgery. C.R. Biologies 325, 407–417 (2002) 8. Luo, X.Y., Pedley, T.J.: Multiple solutions and flow limitation in collapsible channel flows. J. Fluid Mech. 420, 301–324 (2000) 9. Sakurai, A., Obba, K., Maekawa, K.: Flow in collapsible tube with continuously varied compliance along the tube axis. In: 19th Meeting of the Japanese Society of Biorheology, vol. 33(4,5) 10. Heil, M., Pedley, T.J.: Large post-buckling deformations of cylindrical shells conveying viscous flow. Journal of Fluids and Structures 10, 565–599 (1996) 11. Heil, M., Jensen, O.E.: Flows in collapsible tubes and past other highly compliant boundaries
12. Auregan, Y., Depollier, C.: Snoring: Linear stability analysis and in-vitro experiments. Journal of Sound and Vibration 188(1), 39–54 (1995) 13. Aittokallio, T., Gyllenberg, M., Polo, P.: A model of a snorer’s upper airway. Mathematical Biosciences 170, 79–90 (2001) 14. Payan, Y., Pelorson, X., et al.: Physical Modeling of Airflow-Walls Interactions to Understand the Sleep Apnea Syndrome. In: Ayache, N., Delingette, H. (eds.) IS4TM 2003. LNCS, vol. 2673, pp. 261–269. Springer, Heidelberg (2003) 15. Huang, L., Williams, J.E.F.: Neuromechanical interaction in human snoring and upper airway obstruction. Journal of Applied Physiology 86, 1759–1763 (1999) 16. Fodil, R., Ribreau, C., Louis, B.: Interaction between steady flow and individualized compliant segment: application to upper airways. Med. Bio. 1. Eng. Comput. 35, 638–648 (1997) 17. Liu, Y.X., Yu, C., Sun, X.Z., et al.: 3D FE Model Reconstruction and Numberical Simulation of airflow for the Upper Airway. Modelling and Simulation 2(3), 190–195 (2006) 18. Mihaescu, M., Murugappan, S., et al.: Large Eddy Simulation and Reynolds-Averaged Navier-Stokes modeling of flow in a realistic pharyngeal airway model: An investigation of obstructive sleep apnea. Journal of Biomechanics 41, 2279–2288 (2008) 19. Jeong, S.J., Kim, W.S., Sung, S.J.: Numerical investigation on the flow characteristics and aerodynamic force of the upper airway of patient with obstructive sleep apnea using computational fluid dynamics. Medical Engineering & Physics 29, 637–651 (2007) 20. Xu, C., Sin, S.H., McDonough, J.M.: Computational fluid dynamics modeling of the upper airway of children with obstructive sleep apnea syndrome in steady flow. Journal of Biomechanics 39, 2043–2054 (2006) 21. Zhang, H., Bathe, K.J.: Direct and Iterative Computing of fluid flows fully Coupled with Structures. In: Bathe, K.J. (ed.) Computational Fluid and Solid Mechanics. Elsevier Science, Amsterdam (2001) 22. Costantinoa, M.L., Bagnolia, P., Dinia, G., et al.: A numerical and experimental study of compliance and collapsibility of preterm lambtracheae. Journal of Biomechanics 37, 1837– 1847 (2004)
Optimization for Nonlinear Time Series and Forecast for Sleep∗ Chenxi Shao1,2,3,4, Xiaoxu He1, Songtao Tong1, Huiling Dou1, Ming Yang2, and Zicai Wang2 1 Department of Computer Science and Technology, University of Science and Technology of China, 230027, Hefei, China 2 Control & Simulation Center, Harbin Institute of Technology, 150001, Harbin, China 3 MOE-Microsoft Key Laboratory of Multimedia Computing and Communication, University of Science and Technology of China, 230027, Hefei, China 4 Anhui Province Key Laboratory of Software in Computing and Communication, 230027, Hefei, China [email protected], {xiaoxuhe,tongsongtao,douzi}@mail.ustc.edu.cn, {myang,wzc}@hit.edu.cn
Abstract. Phase-space reconstruction and the computation of geometrical eigenvalues are important processes in nonlinear dynamical analysis. It is difficult to analyze nonlinear systems such as EEG in real time because the algorithms for phase-space reconstruction and geometrical eigenvalue computation are complex in both time and space. The algorithms were first optimized to reduce their complexity and then parallelized; the integrated algorithm now runs in about 1/30 of the time needed before optimization and parallelization. After analyzing sleep EEG it was found that the value of the correlation dimension can reflect sleep stages, and sleep stages were also forecast in a simple way. Keywords: nonlinear system, correlation dimension, parallel computation, sleep EEG, forecast.
1 Introduction There are two processes in the dynamic analysis of a nonlinear system. The first is reconstructing the system's phase-space diagram from collected time series data; the most important and commonly used reconstruction method is time-delay embedding [1]. The second is computing the geometrical eigenvalues; the most important of these is the correlation dimension, which is a measure of the nonlinear system's complexity. The accurate computation of the correlation dimension is affected by parameters such as the data ∗
Supported by Key Project of Natural Science Foundation of China (Grant No. 60874065 and 60434010) and the Science Research Fund of MOE-Microsoft Key Laboratory of Multimedia Computing and Communication (Grant No. 06120803). ** To whom correspondence should be addressed.
number, the time-delay and the embedded dimension, so it is crucial to select these parameters correctly. EEG is the overall reflection, on the surface of the pallium and scalp, of the physiological electrical activity of cerebral nerves. Existing research indicates that EEG is a non-stationary time series signal. Linear analysis methods cannot effectively deal with the irregular phenomena caused by the nonlinear factors of EEG, so nonlinear analysis is important in aided medical diagnosis. It has been found that the EEG of epilepsy patients shows low-dimensional chaotic activity [2]. EEG can also describe the stages of sleep, because sleep stages can be distinguished by their correlation dimension. The high-performance algorithms described here provide a method for real-time EEG diagnosis and simulation-based forecasting.
2 Optimization and Integration for Algorithms It is difficult to analyze a nonlinear system in real time because the computational process is time-consuming: the algorithms used in the dynamic analysis of nonlinear systems are complex in both time and space. We have therefore serially optimized and parallelized the algorithms to improve their efficiency. 2.1 Optimization for Algorithm of Computing Time-Delay The reconstruction of the phase-space diagram requires the calculation of the time-delay and the embedded dimension, and the time-delay should be calculated first. Existing research has put forward several methods to calculate the time-delay, such as the autocorrelation function method, the mutual information method and the reconstruction expansion method. But these methods have serious shortcomings; for example, the autocorrelation function method is not suited to nonlinear systems, the mutual information method is very complex and needs a great deal of data, and the reconstruction expansion method cannot obtain a proper time-delay in some conditions [3]. An effective method for calculating the time-delay, the C-C method, was put forward by Kim [4]. Its main step is calculating the correlation integral for different time-delays L (t time-delays in all), different embedded dimensions M (m dimensions in all) and different distances r (σ distances in all), so its time complexity is O(t·m·σ·N²). Because the algorithm has to compare each pairwise distance with several radii (multiples j/2, j = 1, …, σ, of the spread of the time series data) and, in the original implementation, the same distance is recalculated for every radius, we changed the order of the calculation and built an array to store the comparison results so that every distance is calculated only once. As a result, the time complexity is reduced to roughly O(t·m·N²) at the cost of only two auxiliary double arrays of length 4. In experiments, the modified algorithm's running time is about 1/3 of the original algorithm's for data sets of different sizes. It is still necessary to parallelize the serially optimized C-C method to improve its efficiency further. The Message Passing Interface (MPI) was chosen for the parallelization, so deciding how to divide the algorithm is the most important step. A simple approach is to divide the work by time-delay in order to reduce message transmission. Because different time-delays cost different computing times, the time-delays cannot simply be distributed evenly; we found that the computing time decreases monotonically as the time-delay increases. In a
parallel cluster with n nodes, the task with time-delay i is assigned as follows: let s be the remainder of i divided by 2n; if s is less than n, the task is assigned to node s+1; if s is greater than or equal to n, the task is assigned to node 2n−s (a sketch of this assignment rule is given after Fig. 1). After repeatedly processing Lorenz-system data of sizes 3000 and 5000 with the parallelized C-C method, the result shown in Fig. 1 was obtained. The experimental environment is a parallel cluster (34 HP rx2600 servers) running Linux. With only a few nodes, the computing efficiency clearly increases with the number of nodes, but beyond 5 nodes it increases only slowly. Through serial optimization and parallelization of the C-C method, its computation speed is improved approximately 11 times. 2.2 Algorithm of Computing Embedded Dimension The embedded dimension of the nonlinear system should be calculated after the time-delay has been obtained with the C-C method. Liangyue Cao put forward a practical method, the False Nearest Neighbor (FNN) method [5], to calculate the embedded dimension.
Fig. 1. Running time of parallel C-C algorithm vs. node number
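The assignment rule quoted above can be written down directly. The sketch below only illustrates how time-delays would be dealt out to nodes; the MPI communication and the correlation-integral computation of the C-C method itself are omitted, and all names are illustrative.

```python
def assign_node(i, n):
    """Assign the task with time-delay i to one of n nodes, numbered 1..n.

    Since the computing time decreases monotonically as the time-delay grows,
    delays are dealt out in a zig-zag order so that every node receives a mix
    of expensive (small-delay) and cheap (large-delay) tasks."""
    s = i % (2 * n)
    return s + 1 if s < n else 2 * n - s

# Example: distribute time-delays 1..20 over a 5-node cluster.
schedule = {}
for delay in range(1, 21):
    schedule.setdefault(assign_node(delay, 5), []).append(delay)
print(schedule)
```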
The FNN method borrows the idea of the false-neighbor criterion and has the following advantages: it can analyze small amounts of data; it can distinguish chaotic time series from stochastic time series; and it is suitable for high-dimensional time series and has high computing efficiency. After the algorithm was serially optimized and parallelized [3], its computing efficiency improved markedly. To check that the serially optimized and parallelized algorithms are correct, we used them to analyze time series of the Lorenz system and the Torus system [5] generated by the fourth-order Runge-Kutta integration method; the computed values are the same as the theoretical ones, which confirms the correctness of the algorithms. 2.3 Optimization for Algorithm of Computing Correlation Dimension The G-P method calculates the correlation dimension of chaotic signals directly from its mathematical definition [6]. The G-P method, whose physical meaning is very clear, is easy to
implement. The time-delay and embedded dimension obtained above can be used to calculate the correlation dimension. Most of the G-P method's running time is spent on calculating the correlation integral for the different distances r, of which there are Rg, so the algorithm's time complexity is O(Rg·m·N²). Because the algorithm computes the distances of the vector pairs repeatedly for every distance r, we built an array DistanceInfo to store the distance distribution of the reconstructed vectors: every element of the array holds the number of vector pairs whose distance falls in a specific range. After this optimization, DistanceInfo is filled while the distances of all vector pairs are computed once, and the correlation integral for each r is then obtained simply by reading values from the array, so the time complexity of the serially optimized algorithm is O(Rg + m·N²). A sketch of this optimization is given after Table 1. Applying the serially optimized algorithm to time series from different nonlinear systems gives the results shown in Table 1; the optimized algorithm's running time is about 1/50 (varying with Rg) of the original algorithm's for data sets of different sizes.

Table 1. Running time of the G-P algorithm before and after optimization

Data amount                                      1000      3000      5000
Average running time before optimization (s)    6.972    65.225   180.125
Average running time after optimization (s)     0.119     1.190     3.534
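As an illustration of the DistanceInfo optimization described above, the following sketch bins all pairwise distances once and reads the correlation integrals from the cumulative histogram. It is a NumPy illustration only: the software described in Section 2.4 is implemented in Visual C++ and MPI, and the function and variable names below are not taken from it.

```python
import numpy as np

def correlation_integrals(x, m, tau, radii):
    """Correlation integrals C(r) of a scalar time series, using the DistanceInfo
    idea: all pairwise distances are computed once and binned, and C(r) for every
    radius is then read from the cumulative histogram."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    # Time-delay embedding with dimension m and delay tau.
    vectors = np.array([x[i:i + (m - 1) * tau + 1:tau] for i in range(n)])
    # Pairwise Chebyshev distances of all vector pairs (computed a single time).
    d = np.abs(vectors[:, None, :] - vectors[None, :, :]).max(axis=2)
    dists = d[np.triu_indices(n, k=1)]
    # DistanceInfo: number of pairs falling into each distance range.
    radii = np.sort(np.asarray(radii, dtype=float))
    counts, _ = np.histogram(dists, bins=np.concatenate(([0.0], radii)))
    cum = np.cumsum(counts) / len(dists)
    # The correlation dimension is the slope of log C(r) vs. log r
    # over the scaling region.
    return dict(zip(radii, cum))
```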
We then parallelized this algorithm with MPI as well. Most of the serially optimized algorithm's running time is spent on calculating the distances of vector pairs, so it is best to partition the data evenly: every node calculates the distances of the same number of vector pairs, and the main node then collects the results of the other nodes and computes the correlation dimension. For data sets of different sizes, the parallelized algorithm's computation speed is improved approximately 4 times. 2.4 Integration for Algorithms Program For a nonlinear system's time series data, we use the C-C method to calculate the time-delay; then the Liangyue Cao method to calculate the embedded dimension, with the time-delay as a parameter; and finally the G-P method to calculate the correlation dimension, with the time-delay and embedded dimension as parameters. We integrated the three algorithms into practical software in both a serial and a parallel version. The serial version, programmed in the Visual C++ environment, runs on the Windows operating system and has a graphical user interface; the parallel version runs on a Linux parallel cluster, and users can submit jobs through the LSF job management system.
3 Analysis for Sleep EEG 3.1 Source of Data Six sleep EEG recordings, each lasting several hours (sc4002e0, sc4012e0, sc4102e0, sc4112e0, st7022j0 and st7121j0), sampled at 100 Hz and annotated with sleep stages, were analyzed in this study. 3.2 Analysis Result According to the R&K standard, the whole sleep process is made up of REM (Rapid Eye Movement) sleep and NREM (Non-Rapid Eye Movement) sleep, and NREM sleep can be divided into four stages, I-IV, according to sleep depth. The latest research shows that the REM stage is full of dreams, stages I and II are shallow sleep, and stages III and IV are deep sleep [7]. The length of deep sleep is an important criterion of sleep quality, so effectively distinguishing and forecasting sleep stages is of positive value for analyzing and improving sleep quality. We divided each EEG recording, which lasts several hours, into epochs of 30 seconds; because the sampling frequency of the EEG data is 100 Hz, every epoch has a length of 3000 samples. We computed the correlation dimension of the six EEG recordings and then computed the average correlation dimension of every sleep stage. The result is shown in Fig. 2; the stage-IV correlation dimension of recording sc4012e0 is not shown because stage IV is not marked in that recording. In the Wake and REM stages the correlation dimension is greater than in the other sleep stages, and over stages I, II, III and IV the values descend. This result accords with the conclusion in the literature [8].
Fig. 2. Correlation dimension of different sleep stages
3.3 Forecast for Sleep By carefully observing the correlation dimension of sleep EEG, we found that different sleep stages alternate constantly during a healthy person's sleep. There are rules in this alternation: when a REM period ends, an NREM-REM cycle ends; every cycle lasts about 90-100 minutes; and there are 4-5 cycles in a whole night's sleep. Using these rules, sleep stages can be forecast. If a subject's sleep-EEG correlation dimension is close to that of wake EEG, the subject is in the REM stage and will remain there for a period of time; if the correlation dimension then decreases markedly, the subject is entering deep sleep and will stay in deep or shallow sleep stages for about 90-100 minutes, although the stage alternates among I, II, III and IV irregularly; if the correlation dimension then increases back to near the wake-EEG value, the subject is in the REM stage again and will remain there for an uncertain period of time. Following these rules, a subject's sleep stage can be forecast by computing the correlation dimension of the sleep EEG in real time. We used the rules to analyze two recordings, st7052j0 and st7132j0, and the sleep stages were identified correctly.
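The forecasting rules above are stated qualitatively; a rule-based labeller in their spirit might look like the sketch below. The numeric thresholds are placeholders introduced for illustration and are not values given in the paper.

```python
def forecast_stage(d2, d2_wake, drop=0.8):
    """Coarse sleep-stage label from the correlation dimension D2 of a 30 s epoch.

    d2_wake is the subject's wake-EEG correlation dimension; 'drop' is an
    assumed margin, since the paper only says "close to" and "decreases markedly"."""
    if d2 >= d2_wake - 0.5 * drop:
        return "Wake/REM"              # D2 close to the wake level
    if d2 >= d2_wake - drop:
        return "shallow sleep (I/II)"  # moderate decrease
    return "deep sleep (III/IV)"       # marked decrease

# Example: label a sequence of epoch-wise D2 values against a wake baseline.
print([forecast_stage(d, d2_wake=6.2) for d in (6.1, 5.9, 5.5, 4.8, 4.6, 6.0)])
```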
4 Conclusion After serial optimization and parallelization of the dynamic-analysis algorithms for nonlinear systems, their efficiency improves greatly: algorithms whose running time was several minutes before optimization now take just several seconds, which makes real-time analysis of nonlinear systems possible. By computing sleep EEG data we found that the EEG correlation dimension can distinguish different sleep stages. Because the correlation dimension essentially represents the system's complexity, the brain's dynamic behaviour in the Wake and REM stages is more complex and more unstable than in stages I, II, III and IV. By observing the behaviour of the sleep-EEG correlation dimension, we can simulate and forecast sleep stages.
References 1. Packard, N.H., Crutchfield, J.P., Farmer, J.D., et al.: Geometry from a time series. Physical Review Letters(S0031-9007) 45, 712 (1980) 2. Babloyantz, A., Destexhe, A.: Low-dimensional chaos in an instance of epilepsy. Proceedings of the National Academy of Sciences 83, 3513 (1986) 3. Shao, C.X., Shen, L.F., Wang, X.F., et al.: Nonlinear analysis of the alcoholic’s EEG. Progress in Natural Science(S1002-0071) 12(12), 915–919 (2002) 4. Kim, H.S., Eykholt, R., Salas, J.D.: Nonlinear dynamics, delay times, and embedding windows. Physica D(S0167-2789) 127(1-2), 48–60 (1999) 5. Cao, L.Y.: Practical method for determining the minimum embedding dimension of a scalar time series. Physica D(S0167-2789) 110(1-2), 43–50 (1997) 6. Grassberger, P., Procaccia, I.: Characterization of strange attractors. Physical Review Letters(S0031-9007) 50(5), 346–349 (1983)
7. Takeuchi, T., Ogilvie, R.D., Murphy, T.I., et al.: EEG activities during elicited sleep onset REM and NREM periods reflect different mechanisms of dream generation. Clinical Neurophysiology(S 1388-2457) 114(2), 210–220 (2003) 8. Roschke, J., Aldenhoff, J.: The dimensionality of human’s electroencephalogram during sleep. Biological Cybernetics (S0340-1200) 64, 307–313 (1991)
Classifying EEG Using Incremental Support Vector Machine in BCIs Xiaoming Zheng1, Banghua Yang1,2, Xiang Li1, Peng Zan1, and Zheng Dong1 1
Shanghai Key Laboratory of Power Station Automation Technology, Department of Automation, College of Mechatronics Engineering and Automation, Shanghai University, Shanghai, 200072, China 2 State Key Laboratory of Robotics and System (HIT), Harbin, 150001, China [email protected]
Abstract. The discrimination of motor imagery electroencephalography (EEG) is an essential issue in brain-computer interfaces (BCIs), and classifying EEG signals is an important step in this process. From the physiological standpoint, the EEG signal varies with elapsed time and with the mood and tiredness of the subject, so an excellent classifier should be adaptive to tackle these dynamic variations. In this paper, an incremental support vector machine (ISVM) is adopted to classify the EEG. The ISVM can consecutively delete some historical samples and add new samples obtained recently, so its classifier model is updated periodically to adapt to the variations of the EEG. At the same time, the ISVM can train the classifier with a small training set, which makes it better than the standard SVM in training speed and memory consumption. For data set 1 (left-hand and foot imagery) of BCI Competition IV 2008, empirical mode decomposition (EMD) is employed to decompose the EEG signal into a series of intrinsic mode functions (IMFs); AR model parameters and the instantaneous energy (IE) are then obtained from some important IMFs and form the initial features. The extracted features are fed into the ISVM classifier. Preliminary results show that the ISVM obtains better classification performance than the standard SVM and provides a good way to achieve adaptability in an online BCI system; even so, its effectiveness should be verified further with more data and subjects. Keywords: incremental support vector machine (ISVM); electroencephalogram (EEG); brain-computer interface (BCI); empirical mode decomposition (EMD).
1 Introduction Brain-Computer Interfaces (BCIs) have drawn a growing amount of attention in recent years; they provide a new means of communication and control for people, especially for those with severe motor disabilities or little physical strength. A BCI makes it possible to establish a direct link between the user's brain and an executing entity with the help of a computer [1~3]. Due to many advantages, such as being non-invasive, relatively
low cost, easy to acquire and of high temporal resolution, the electroencephalogram (EEG) signal has become popular in BCI research [4]. Additionally, there is clear evidence in medical science that observable changes in the EEG result from performing mental activities [5~7]. This paper is mainly concerned with BCIs based on EEG: the user executes certain mental activities, such as imagined limb movements or mental arithmetic, to produce distinguishable EEG, and the BCI system realizes specified functions by recognizing these EEG signals. Whenever an activity is detected, the BCI executes the corresponding action [8]. A BCI system based on EEG signals is commonly divided into four main units: EEG acquisition, preprocessing, pattern recognition and output equipment. The whole construction is shown in Fig. 1.
Fig. 1. The Construction of a Brain Computer Interface System
The EEG acquisition unit is commonly composed of an electrode array and signal conditioning circuits, and conducting liquid is usually injected into the space between the electrodes and the scalp to improve electrical conductivity. The preprocessing unit is responsible for removing potential noise and detecting the EEG signal. The core of the four units is pattern recognition, which can be subdivided into three stages: feature extraction, feature selection and feature classification. Feature extraction derives initial features from the EEG signals, feature selection chooses the most powerful ones from the initial features to form a feature vector, and the feature vector is then fed to the feature classification stage, which maps the features into commands. The output unit generates the command action associated with the classified results and gives feedback to the user, who can modulate his/her mental activity to adjust control of the BCI. The classifier determines the final output and is the key to continuously distinguishing the incoming EEG and thereby recognizing the user's different control intentions. Nowadays, the traditional Support Vector Machine (SVM) is widely used as a
classifier, but it has the common problem that the classifier parameters cannot be adjusted automatically during use. The EEG signal, however, varies with elapsed time and with the mood and tiredness of the subject, so an excellent classifier should be adaptive to tackle these dynamic variations. The traditional SVM cannot adapt during use, and hence the performance of the whole system is eventually limited. To solve this issue, this paper mainly studies an improved algorithm, the Incremental Support Vector Machine (ISVM), to improve the adaptability and accuracy of the BCI system with different popular features.
2 Feature Extraction Because of the small amplitude and sensitivity of EEG signals, preprocessing is needed before feature extraction. Since the spectrum of normal EEG signals is mainly confined to the 0-50 Hz range, the signals are usually filtered in that range [9]. The usable EEG signals fall into several categories: the Slow Cortical Potential (SCP) at low frequencies of 1~4 Hz, and the Mu and Beta rhythms with event-related desynchronization/synchronization (ERD/ERS) in the 8~30 Hz range, all of which are being researched around the globe. In order to retain as many significant features as possible, the 0~30 Hz signal is used in this paper. For the construction of the feature vector, several commonly used methods are applied together to make the accuracy as high as possible. The mean value of the time-domain data is used as the first feature, and the median absolute deviation estimate is calculated as the second. Then empirical mode decomposition (EMD) is employed to decompose the EEG signal into a series of intrinsic mode functions (IMFs), and autoregressive (AR) model parameters and the instantaneous energy (IE) are obtained from some important IMFs to form the remaining initial features. Finally, all the features extracted above are combined into a vector that is sent to the feature classification module. Data from BCI Competition IV 2008 are used in this paper to show the performance of the proposed ISVM method.
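A possible realisation of this feature pipeline is sketched below. It assumes the third-party PyEMD package for the decomposition; the AR order, the number of IMFs kept and the exact definition of the instantaneous energy are not specified in the paper, so the choices here are illustrative.

```python
import numpy as np
from scipy.signal import hilbert
from PyEMD import EMD  # third-party package assumed for the decomposition

def ar_coeffs(x, order=6):
    """AR model parameters estimated by least squares (order chosen arbitrarily)."""
    X = np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])
    a, *_ = np.linalg.lstsq(X, x[order:], rcond=None)
    return a

def emd_features(epoch, n_imfs=4, ar_order=6):
    """Mean, median absolute deviation, and AR parameters plus instantaneous
    energy (IE) of the first IMFs, combined into one feature vector."""
    epoch = np.asarray(epoch, dtype=float)
    feats = [epoch.mean(), np.median(np.abs(epoch - np.median(epoch)))]
    imfs = EMD().emd(epoch)[:n_imfs]
    for imf in imfs:
        feats.extend(ar_coeffs(imf, ar_order))                   # AR model parameters
        feats.append(float(np.mean(np.abs(hilbert(imf)) ** 2)))  # mean instantaneous energy
    return np.array(feats)
```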
3 Feature Classification with ISVM Classification is a key part of a BCI system: it directly determines the output and is affected by the features extracted beforehand. A model is constructed in advance for each mental activity, aiming to determine whether or not a feature vector results from performing the corresponding mental activity. The initial mental-activity models are built from a set of features taken from the training period selected from the original database. In what follows, the classifier parameters are renewed regularly while all previous data except the support vectors are discarded. The essence is to retain the Kuhn-Tucker (KT) conditions on all previously seen data while adding a new data point to the solution. 3.1 Kuhn-Tucker Conditions In SVM classification, the optimal separating function reduces to a linear combination of kernels on the training data, $f(x) = \sum_j \alpha_j y_j K(x_j, x) + b$, with training
vectors $x_i$ and corresponding labels $y_i = \pm 1$. In the dual formulation of the training problem, the coefficients $\alpha_i$ are obtained by minimizing a convex quadratic objective function under constraints [10]:
$$\min_{0 \le \alpha_i \le C} W = \frac{1}{2}\sum_{i,j}\alpha_i Q_{ij}\alpha_j - \sum_i \alpha_i + b\sum_i y_i \alpha_i \qquad (1)$$
with the Lagrange multiplier (and offset) $b$ and the symmetric positive definite kernel matrix $Q_{ij} = y_i y_j K(x_i, x_j)$. The first-order conditions on $W$ reduce to the Kuhn-Tucker (KT) conditions:
$$g_i = \frac{\partial W}{\partial \alpha_i} = \sum_j Q_{ij}\alpha_j + y_i b - 1 = y_i f(x_i) - 1 \;\begin{cases} \ge 0, & \alpha_i = 0 \\ = 0, & 0 < \alpha_i < C \\ \le 0, & \alpha_i = C \end{cases} \qquad (2)$$

$$\frac{\partial W}{\partial b} = \sum_j y_j \alpha_j = 0 \qquad (3)$$
which partition the training data D and the corresponding coefficients $\{\alpha_i, b\}$, $i = 1, \dots, l$, into three categories: the set S of margin support vectors strictly on the margin ($y_i f(x_i) = 1$), the set E of error support vectors exceeding the margin (not necessarily misclassified), and the remaining set R of (ignored) vectors within the margin. 3.2 Update Procedure The initial classification model is obtained from the previous training session. During use, new features are extracted from new data and given to the classifier; one then seeks to update the coefficients $\alpha$ and $b$ and to determine the coefficient $\alpha_c$ of the new point. The updated coefficients must satisfy the KT conditions:
$$\Delta g_i = Q_{i,c}\,\Delta\alpha_c + \sum_{j \in S} Q_{i,j}\,\Delta\alpha_j + y_i\,\Delta b \qquad (4)$$

$$0 = y_c\,\Delta\alpha_c + \sum_{j \in S} y_j\,\Delta\alpha_j \qquad (5)$$
It can be shown that the updating differentials $\Delta b$, $\Delta\alpha_j$ and $\Delta g_j$ are proportional to $\Delta\alpha_c$. The proportionality coefficients result directly from the linear system defined by the differential KT equations. The value of $\Delta\alpha_c$ is determined iteratively by taking the following conditions into account: $g_c \le 0$, with equality when $x_c$ joins the set of margin support vectors S; $\alpha_c \le C$, with equality when $x_c$ joins the error set E;
$0 \le \alpha_i \le C$ for margin vectors, with equality 0 when $x_i$ transfers from S to R and equality C when $x_i$ transfers from S to E; $g_i \le 0$, with equality when $x_i$ transfers from E to S; $g_i \ge 0$, with equality when $x_i$ transfers from R to S. The updating procedure when several labelled feature vectors become available results from repeatedly applying the procedure above.
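The bookkeeping implied by these conditions can be illustrated with a small sketch that evaluates g_i from Eq. (2) and partitions the current training points into the sets S, E and R. It is not the full incremental solver of Eqs. (4)-(5); the function below is only an assumed illustration of the set-membership tests.

```python
import numpy as np

def kt_partition(alpha, y, K, b, C, tol=1e-6):
    """Evaluate g_i = y_i f(x_i) - 1 and split the training points into
    margin support vectors S, error support vectors E and remaining vectors R."""
    Q = (y[:, None] * y[None, :]) * K          # Q_ij = y_i y_j K(x_i, x_j)
    g = Q @ alpha + y * b - 1.0                # Eq. (2)
    S = np.where((alpha > tol) & (alpha < C - tol))[0]  # on the margin, g_i ~ 0
    E = np.where((alpha >= C - tol) & (g <= tol))[0]    # exceeding the margin
    R = np.where((alpha <= tol) & (g >= -tol))[0]       # ignored, inside the margin
    return g, S, E, R
```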
4 Results 4.1 Data Description To verify the performance of the ISVM, the calibration data of BCI Competition IV were chosen. These data sets were recorded from healthy subjects, and in the whole session motor imagery was performed without feedback. For each subject, two classes of motor imagery were selected from the three classes left hand, right hand and foot (side chosen by the subject; optionally also both feet). In the first two runs, arrows pointing left, right or down were presented as visual cues on a computer screen, displayed for a period of 4 s, during which the subject was instructed to perform the cued motor imagery task. These periods were interleaved with 2 s of blank screen and 2 s with a fixation cross shown in the centre of the screen; the fixation cross was superimposed on the cues, i.e. it was shown for 6 s. The data sets are provided with complete marker information. 4.2 Preprocess Before being sent to the feature extraction program, the data sets were re-sorted according to the real timeline, keeping only useful data and discarding the data belonging to none of the motor imageries. The next step was to choose the data sets reflecting the motor-imagery brain areas: one terminal data set is the 'C4' channel minus the average of its nearest four channels 'CFC4', 'CFC6', 'CCP4' and 'CCP6', and the other is 'Cz' minus the average of its closest four channels 'CFC1', 'CFC2', 'CCP1' and 'CCP2'. The channels are shown in Fig. 2, the placement of electrodes in the 2D projection, in which the vertical axis Y is based on landmarks on the skull, namely the nasion (Nz) and the inion (Iz), and the horizontal axis X on the left and right pre-auricular points (LPA and RPA) [11]. Furthermore, the two data sets, C4 − (CFC4+CFC6+CCP4+CCP6)/4 and Cz − (CFC1+CFC2+CCP1+CCP2)/4, are filtered and then passed to the feature extraction procedure described in Section 2, resulting in a matrix of size 192×22 in which every row is the feature vector of one sample. Finally, the 192 rows were divided into two groups of equal size by taking every other row: one group is used for training and the other for testing. To find and demonstrate the better classification algorithm, all the features are kept the same while different classifiers, including the ISVM, SVM and PNN, are applied.
Fig. 2. Relative Positions of Electrodes in the 2D-projection
In total, data from 7 subjects were available in BCI Competition IV, all of which were used to compare the proposed ISVM with the traditional SVM and another common classification algorithm, the Probabilistic Neural Network (PNN). All the classification results are shown in Fig. 3.
Fig. 3. Comparison of classification accuracies among the ISVM, SVM and PNN
5 Conclusions and Future Work A brief glance at Fig. 3 shows that the classification accuracy of the ISVM is considerably higher than that of the traditional SVM, and that the ISVM adapts better to different subjects. Our future work consists in verifying this result on data acquired over a longer period and from more subjects, in an experiment spanning a couple of weeks, to check for consistency over time. In addition, we are designing a strategy to incorporate a third level of adaptation, such as providing feedback to the user. Acknowledgments. The project is supported by the National Natural Science Foundation of China (60975079), the State Key Laboratory of Robotics and System (HIT), Shanghai University, the "11th Five-Year Plan" 211 Construction Project, the Systems Biology Research Foundation of Shanghai University, and the Shanghai Key Laboratory of Power Station Automation Technology (08DZ2272400).
References 1. Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G., Vaughan, M.T.: Briancomputer interfaces for communication and control. J. Clinical Neurophysiology 113, 767–791 (2002) 2. Li, Y.Q., Guan, C.: A Semi-supervised SVM Learning Algorithm for Joint Feature Extraction and Classification in Brain Computer Interfaces. In: 28th IEEE EMBS Annual International Conference, pp. 2570–2573. IEEE Press, New York City (2006) 3. Vidaurre, C., Schlöogl, A., Cabeza, R., Scherer, R., Pfurtscheller, G.: A Fully On-Line Adaptive BCI. IEEE Transactions on Biomedical Engineering 53(6) (2006) 4. Wang, L., Xu, G.Z., Wang, J., Yang, S., Yan, W.L.: Application of Hilbert-Huang Transform for the Study of Motor Imagery Tasks. In: 30th Annual International IEEE EMBS Conference, pp. 3848–3851 (2008) 5. Windhorst, U., Johansson, H.: Modern Tecniques in Neuroscience Research. Springer, New York (1999) 6. Penghai, L., Baikun, W.: A Study on EEG Alpha Wave-based Brain-Computer Interface Remote Control System. In: 2007 IEEE International Conference on Mechatronics and Automation, Harbin, China (2007) 7. Wu, W., Gao, X.R., Hong, B., Gao, S.K.: Classifying Single-Trial EEG During Motor Imagery by Iterative Spatio-Spectral Patterns Learning. IEEE Transactions on Biomedical Engineering 55(6) (2008) 8. Molina, G.G.: BCI Adaptation using Incremental SVM Learning. In: 3rd International IEEE EMBS Conference on Neural Engineering, Hawaii, USA, pp. 337–341 (2007) 9. Niedermeyer, E., Silva, Lopes da Silva, F.H.: Electroence phalo graphy: Basic Principles, Clinical Applications and Related Fields, 4th edn. Williams and Wilkins (1999) 10. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995) 11. Oostenveld, R., Praamstra, P.: The five percent electrode system for high-resolution EEG and ERP measurements. J. Clinical Neurophysiology 112, 713–719 (2001)
Acute Isolation of Neurons Suitable for Patch-Clamping Study from Frontal Cortex of Mice Yuan-yuan Li1,2, Li-jun Cheng1, Gang Li1, Ling Lin1, and Dan-dan Li1 1
College of Precision Instruments and Opto-Electronics Engineering, Tianjin University, Tianjin, China 2 School of Computer Science & Software Engineering, Tianjin Polytechnic University, Tianjin, China [email protected]
Abstract. A method is described for the acute isolation of neurons suitable for patch-clamp study from the frontal cortex of 7-10-day-old Kunming mice by a combination of mechanical and enzymatic means. Using an inverted microscope and the whole-cell configuration of the patch-clamp technique, the morphological and electrophysiological properties of the cortical neurons were studied. The enzymatically isolated neurons had plump profiles, smooth surfaces, strong aureoles and long survival times, met the electrophysiological requirements, and exhibited whole-cell transmembrane currents and voltage-gated sodium and potassium currents. The experiments prove that this method is simple, efficient, reliable and practical. The dissociated cortical neurons can be applied to patch-clamp study, which is of reference value for studying the effects of physiological, pathological, pharmacological and physical factors on the ion channels of the cortical neurons of mice. Keywords: frontal cortex; neuron; acute isolation; mice; patch-clamp technique.
1 Introduction The cerebral cortex, on which there are approximately 14 billion neurons, is closely related to human learning and memory [1]. The patch-clamp technique provides powerful access to the ion channels of cell membranes and dissociated cells, and has proved valuable in understanding the function of ion channels, clarifying the pathogenesis of ion-channel diseases and suggesting new ways of treating them. In patch-clamp recording, the isolated cells should have a plump profile, smooth surface, strong aureole and long survival time. In the past, such cells were obtained through cell culture, but Blatz et al. showed that the properties of artificially cultured nerve cells are greatly changed by the culture environment, which can be avoided by acute isolation [2]. Lei Yang et al. proposed a method for the acute isolation of rat cortical neurons [3]. However, compared with rats, mice can be used in many more fields, including drug screening, the study of cancer and leukemia, radiology, genetic disease, immunology, etc.
Zheng-xiang Xie et al. showed that mice differ less from humans than rats do [4]. Moreover, mice have the advantages of smaller size, higher reproductive capacity and being more easily fed and handled; therefore, mice are more valuable than rats. In summary, it is significant to establish a simple and effective method that allows the isolation of large quantities of cortical neurons from mice while preserving the morphology and electrical properties required for patch-clamp study. Building on previous work [3, 5-9], every aspect of the experiment is improved, including the solution formulation, the cutting method and incubation time of the brain slices, and the enzyme digestion, and an isolation method and technique are thereby established. The key factors that determine the success or failure of the acute isolation, such as mouse age, the pH value of the solutions, the preparation of brain slices, enzyme digestion, incubation, mixed-gas volume and the trituration of brain slices, are also discussed in detail. The experiments show that this method is suitable for patch-clamp study and also provides an important way to further study the effects of physiological, pathological, pharmacological and physical factors on the ion channels of the cortical neurons of mice.
2 Materials and Methods 2.1 Preparation Before Experiment The following preparations should be made before the experiment: preparing the various kinds of glassware, cleaning the instruments and workbench, making ice packs, checking the patency of the mixed-gas outlet, preheating the water-bath, etc. 2.2 Experiment Operations (1) Weighing drugs and compounding solutions. The artificial cerebrospinal fluid (ACSF), standard extracellular bath solution and pipette solution used in this experiment are listed in Table 1.

Table 1. Solutions for brain slice preparation and acute isolation of neurons (components in mmol/L)
ACSF (200 ml): NaCl 134, KCl 5, HEPES 10, Glucose 10, NaH2PO4 1.5, MgSO4 2, CaCl2 2, NaHCO3 25; pH adjusted to 7.4 with KOH (1 mol/L)
Extracellular bath solution (50 ml): NaCl 130, KCl 5.4, HEPES 10, Glucose 10, MgCl2 1, CaCl2 2; pH adjusted to 7.3 with NaOH (1 mol/L); filtered through a 0.22 μm membrane
Pipette solution (50 ml): KCl 130, HEPES 10, MgCl2 2, CaCl2 1, EGTA 10, Na2ATP 2; pH adjusted to 7.3 with KOH (1 mol/L); filtered through a 0.22 μm membrane
(2) Saturating the solutions. After the solutions were compounded, 95%O2+5%CO2 mixed gas was bubbled through them to fully saturate the ACSF for 30 min and the extracellular bath solution for 15 min. (3) Adjusting the pH value. The pH values of the ACSF and the extracellular bath solution were adjusted to 7.4 and 7.3 using KOH (1 mol/L) and NaOH (1 mol/L), respectively. Then 10 ml of ACSF was put into the refrigerator to lower its temperature to 0~4 ℃. (4) Selecting mice. Kunming mice, regardless of sex, aged 7~10 days, were used in this experiment; they were supplied by the Institute of Radiation Medicine of CAMS. (5) Decapitation and removing the brain. The mice were decapitated and their crania were clipped open; the whole brain was then quickly removed and put into ice-water-mixed ACSF for about 1 min. (6) Separating the cortex and cutting brain slices. A piece of qualitative filter paper soaked with ACSF was put into a glass culture dish, and an ice pack was placed under the dish to keep it at low temperature. The cooled brain tissue was then placed onto the filter paper and cut into 400-500 μm-thick slices along the vertical midline direction with a blade. (7) Incubating brain slices. The brain slices were put into a beaker containing 50 ml of ACSF and incubated for about 50 min at a room temperature of 20~25 ℃, bubbled with 95%O2+5%CO2 mixed gas. (8) Enzymolysis and rinsing of brain slices. The incubated brain slices were transferred into a beaker containing 0.3 mg/ml Pronase, and the beaker was placed into the water-bath at 32 ℃ under 95%O2+5%CO2 mixed gas. After about 15 min of enzyme digestion, the brain slices were rinsed three times in ACSF to remove the Pronase from their surface. Intact brain slices remain alive, and the neurons are maintained well, for at least 6~8 h in ACSF under 95%O2+5%CO2 mixed gas. (9) Triturating brain slices, settling and sticking of cells. One or two brain slices were placed into a centrifuge tube containing ACSF and triturated mechanically with a graded series of fire-polished Pasteur pipettes in succession. After about 3~5 min, the cell suspension in the centrifuge tube was transferred onto a cleaned cover glass in the culture dish, and the cells were allowed to settle and stick for about 20 min. (10) Changing the solution. The cover glass was washed 2~3 times with extracellular bath solution, and the solution bathing the cells was changed to fresh extracellular bath solution after the cells had settled and stuck to the cover glass. When recording different kinds of ion-channel currents, extracellular bath solutions containing different blocking agents are added to form the Na bath solution, K bath solution, etc. (11) Whole-cell patch-clamp recording. Recording pipettes (Institute of Electronics, Chinese Academy of Sciences) were pulled in two steps from glass capillary tubes using a micropipette puller (05-E mode;
Inbio LifeScience Instrument Co., Ltd., Wuhan, China). The tip diameter was 1~2 μm, and the resistance of the whole-cell recording pipettes was 2~5 MΩ. Recordings were obtained according to standard patch-clamp methods using a PC2C amplifier (Inbio LifeScience Instrument Co., Ltd.) interfaced to a personal computer. Once a giga-ohm seal was formed between the pipette and the cell membrane, the fast capacitance was compensated to neutralize the capacitance overshoot. Next, the cell membrane was ruptured with negative pressure so that the pipette solution became continuous with the cell interior, forming the whole-cell configuration. The slow capacitance and series resistance were then compensated alternately to neutralize the transient currents, and the compensation values were recorded. Finally, the activation of channel currents could be observed under preset stimulation voltages. Voltage commands were generated, and current responses were recorded and analyzed, using several computerized acquisition and storage systems (pClamp4, Inbio LifeScience Instrument Co., Ltd.; pCLAMP, Axon Instruments; Origin8, OriginLab Co.).
3 Results 3.1 Morphological Observation of Cortical Neurons Acutely isolated neurons were observed with an inverted microscope (×250). The appearance of the isolated cells under the inverted microscope was used as an estimate of neuronal vitality when selecting cells for patch-clamp recording. As shown in Fig. 1, the somas of healthy cells had a plump profile, smooth surface and strong aureole; the shape was pyramidal, triangular or oval; and one apical dendrite and two or more basal dendrites were maintained. The intact neurons remained in good condition and could be used for patch-clamp recording for up to 3 h in ACSF.
Fig. 1. Photomicrograph of the acutely isolated neuron from frontal cortex of mice (Inverted microscope ×250)
3.2 Ion Channel Properties of Cortical Neurons The whole-cell transmembrane total currents were evoked by depolarizing voltages with the above pipette solution and an extracellular bath solution containing no blocking agent.
(a) Stimulus-pulse waveform; (b) whole-cell transmembrane total currents
Fig. 2. Whole-cell transmembrane total currents recorded from cortical neurons of mice. Inward currents were sodium currents and outward currents were potassium currents
As shown in Fig. 2, cells were held at a holding potential of -80 mV, and a series of 160 ms depolarizing steps from -70 mV to +60 mV (10 mV increment at each step) was applied at a frequency of 0.2 Hz. The transmembrane total currents of the cortical neurons were divided into inward currents and outward currents. The inward currents activated and deactivated rapidly and could be blocked by 1 μmol/L TTX, leaving the outward potassium currents shown in Fig. 3. There were two kinds of outward currents: rapidly activating and deactivating currents that could be blocked by 30 mmol/L TEA-Cl, and slowly activating, hardly deactivating currents that could be blocked by 3 mmol/L 4-AP. The inward sodium currents could be obtained when both outward currents were blocked simultaneously, as shown in Fig. 4.
Fig. 3. Block the inward currents to obtain outward potassium currents
Fig. 4. Block the outward currents to obtain inward sodium currents
4 Discussion The crucial issue in patch-clamp recording is to isolate intact, smooth-surfaced and long-surviving cells. This paper gives a fast and reliable method, suitable for patch-clamp recording, for the acute isolation of neurons from the frontal cortex of mice. Owing to the combination of mechanical and enzymatic means, the yield of dissociated
neurons of high quality was greatly improved. Based on the experience accumulated in repeated experiments, the following issues, which determine the success or failure of the acute isolation, should receive particular attention when isolating cells. (1) Mouse age. Adopting the proper age of mice is the prerequisite for obtaining ideal cortical neurons. The experiments use Kunming mice aged 7~10 days. If the mice are too young, their cortical neurons are not yet differentiated and mature, so the recorded channel currents are weak; if they are too old, their tissue is too loose, there are too many blood streaks on the tissue surface, and the dendrites of the cortical neurons are too long, all of which lower the quality of the acute isolation. (2) The pH value of the solutions. The accuracy of the pH values of the ACSF, extracellular bath solution and pipette solution is one of the key factors during isolation. The general principle in adjusting the pH is "better slightly acidic than basic": once the pH rises, the tenacity of the brain slices worsens and they become hard to dissociate. (3) Preparing brain slices. Decapitation, removal of the brain and preparation of the brain slices should be finished as quickly as possible, and the slices must be prepared at low temperature to avoid excessive metabolism and injury to the cells caused by ischemia and hypoxia. The brain slices are about 400-500 μm thick: slices that are too thick hinder oxygen permeation and cause cell hypoxia, while slices that are too thin suffer large mechanical injury. (4) Enzyme digestion. Enzyme digestion is another critical step in acute isolation, so the enzyme amount and digestion time need to be controlled precisely. The enzyme should be freshly prepared just before use, otherwise its activity decreases. If the enzyme amount is too large, the cells are over-digested and die easily during seal formation and membrane rupture; if it is too small, it is hard to form the whole-cell patch-clamp recording mode. (5) Incubation. The goal of incubation is to buffer the injury to the brain tissue caused during slice preparation. The incubation times in references [3] and [7] are 30 min and 1 h, respectively; our experiments show that an incubation time of about 50 min achieves the anticipated result. (6) Bubbling mixed gas. During incubation and enzymolysis, 95%O2+5%CO2 mixed gas should be bubbled, and the volume and speed of bubbling must both be moderate. If they are too small, the brain slices become hypoxic and their activity worsens; if they are too large, the air currents flip the brain slices, injuring them and reducing the number of intact cortical neurons. (7) Triturating brain slices. Pasteur pipettes with a proper inside diameter should be selected. If the inside diameter is too large, it is difficult to disperse the tissue blocks, trituration has to be repeated for a long time, and the number of living cells is reduced; if it is too small, the integrity of the cells is destroyed. Furthermore, trituration with the graded series of fire-polished Pasteur pipettes must be gentle and slow, or it will injure the cells.
5 Conclusions This paper successfully establishes a simple and reliable method, preserving the morphology and electrical properties needed for patch-clamp studies, for the acute isolation of cortical neurons from the frontal cortex of 7-10-day-old mice by a combination of mechanical and enzymatic means. Every step in the experiment is indispensable, and the key steps should be performed carefully. This method is important for studying the dynamic properties of ion channels using the whole-cell patch-clamp recording technique and for investigating the pathogenesis, prevention and cure of central nervous system diseases.
References 1. Selkoe, D.J.: Amyloid beta protein precursor and the pathogenesis of Alzheimer’s disease. Cell 58, 611–612 (1989) 2. Blatz, A.L.: Properties of single fast chloride channels from rat cerebral cortex neurons. J. Physiol. 441, 1–21 (1991) 3. Yang, L., Li, Y.-r., Su, L.-f., et al.: A modified method of acute isolation of rat cortex neurons. Journal of Harbin Medical University 39, 285–287 (2005) 4. Xie, Z.-x., Niu, Y.-h., Ma, H.-q., et al.: Comparison among beta-3 adrenoceptors of human, mice and rat, and the biological and pharmacological implications. Chinese Journal of Medical Physics 23, 123–125 (2006) 5. Li, G., Cheng, L.-j., Lin, L., et al.: Acute isolation of hippocampal neurons of neonate mice and application of patch-clamp technique. Journal of Tianjin University 41, 1157–1161 (2008) 6. Li, X.-m., Li, J.-g., Yang, J.-m., et al.: An improved method for acute isolation of neurons from the hippocampus of adult rats suitable for patch-clamping study. Acta Physiologica Sinica 56, 112–117 (2004) 7. Qiao, X.-y., Li, G., Dong, Y.-e., et al.: Neuron excitability changes induced by low-power laser irradiation. Acta Physica Sinica 57, 1259–1265 (2008) 8. Kay, A.R., Krupa, D.J.: Acute isolation of neurons from the mature mammalian central nervous system. Curr. Protoc. Neurosci., Somerset (2001) 9. Gao, X.-p., Qi, J.-s.: Comparison of the characteristics of fast-inactivating K+ channel currents in rat hippocampal neurons with those in cerebral cortical nenrons. J. Shanxi Med. Univ. 38, 481–483 (2007)
Palmprint Identification Using PCA Algorithm and Hierarchical Neural Network Ling Lin Dept. of Computer Science, YiLi Normal College, Yining, China 835000 [email protected]
Abstract. Palmprint-based personal identification, as a new member of the biometrics family, has become an active research topic in recent years. The rich texture information of the palmprint offers a powerful means of personal recognition. In this paper, a novel approach to handprint identification is proposed. First, the region of interest is segmented by localizing the hand's key points, and then the PCA algorithm is used to extract the palmprint features. A hierarchical neural network structure is employed to measure the degree of similarity in the identification stage. Experimental results show that the designed system achieves an acceptable level of performance. Keywords: Palmprint identification; PCA; Neural network.
1 Introduction Biometric identification refers to technologies that measure and analyze human physical and behavioral characteristics in order to identify an individual. Biometrics has received much attention in the security field recently [1,2], and the use of biological features as a personal identifier has gradually replaced the use of digits because of its advantages [3]. For example, in password-based systems people usually use different passwords for different purposes and often forget or confuse them; biometrics can provide a good solution to these problems. The biometric approach identifies a person by his/her physiological characteristics, such as the iris, palmprint, fingerprint and face [4,5]. Recently, voice-, face- and iris-based verification have been studied extensively, and many biometric systems for commercial applications have been successfully developed. Nevertheless, not much work has been reported on handprint identification and verification [6]. In contrast to existing techniques [7], our handprint identification system is based on a hierarchical neural network classifier. In this system, the test images are preprocessed using region-of-interest (ROI) localization and histogram equalization, the extracted hand geometry features are input into a self-organizing map (SOM) for coarse-level classification, and then, in the fine-level stage, the texture features extracted with the PCA algorithm are sent to a back-propagation (BP) neural network for the final decision (shown in Fig. 1).
Fig. 1. Block diagram of the proposed identification system
The rest of this paper is organized as follows. Section 2 introduces the image acquisition and segmentation ROI. Section 3 describes the palmprint feature extraction. The design of hierarchical neural network is depicted in Section 4. The experimental results are reported in Section 5. And Section 6 gives the conclusions.
2 Images Acquisition and Pre-processing Since the quality of an image directly influences the result of identification, capturing images of high quality is necessary for our system. At the same time, the pre-processing procedure is always a crucial stage in most identification systems. In this section, we introduce a new method for locating key points in the pre-processing procedure. 2.1 Images Acquisition The user puts his/her hand on the scanner with the fingers spread naturally, without any strict constraints. In this way, the user will not feel uncomfortable during the image acquisition stage. The resolution of the images captured in our study is 300 dpi. 2.2 Pre-processing and ROI Localization As mentioned above, the pre-processing procedure is a crucial stage in most identification systems [8]; it is also a very important stage for the segmentation of the ROI and for feature extraction in our system. In our study, the captured hand image is binarized using a global threshold value due to the high contrast between the background and the hand. After the hand image is segmented from the background, the following procedure localizes the ROI of the hand image. The hand image labeled with key points is shown in Fig. 2, and the process of ROI localization is described as follows: (1) Extract the blue channel from the RGB image, and use only it as the identification image in the following steps. (2) Locate points T1, T2, T3 and T4 using the algorithm proposed by Lin [2]. They are the peaks of the little finger, ring finger, middle finger and index finger, respectively.
(3) Locate points P1, P2, P3 and P4. They are the valley points among the little finger, ring finger, middle finger and index finger.
(4) Connect P2 and P3 and prolong the line P2P3 until it joins the border of the palmprint at P1. Connect P3 and P4 and prolong the line P3P4 until it joins the border of the palmprint at P5. The middle points of P1P2, P2P3, P3P4 and P4P5 are F1, F2, F3 and F4, respectively.
(5) Connect Ti and Fi, i = 1, 2, 3, 4.
(6) Connect F1 and F4; the angle θ between the line F1F4 and the horizontal line is computed as follows:

\theta = \tan^{-1}\left[ (y_{F_4} - y_{F_1}) / (x_{F_4} - x_{F_1}) \right]   (1)

(7) Find the line F1F5, which is perpendicular to the line F1F4 and equal to it in length, and the line F4F6, which is also perpendicular to the line F1F4 and equal to it in length; then connect points F5 and F6. The square region with corners F1, F5, F6 and F4 is regarded as the ROI, shown in Fig. 2(b).
(8) Rotate the ROI clockwise by the angle θ to make the line F1F4 horizontal.
(9) Enhance the contrast of the ROI using the Laplacian transform. The result is shown in Fig. 2(c).
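To make the geometry of steps (6)-(8) concrete, the following NumPy sketch computes the orientation angle θ from two key points and rotates the cropped ROI so that the line F1F4 becomes horizontal. It is an illustration under stated assumptions, not the authors' code: the point coordinates are made up and the rotation is delegated to SciPy.

```python
import numpy as np
from scipy.ndimage import rotate

def roi_angle(F1, F4):
    """Angle (degrees) between the line F1-F4 and the horizontal axis, cf. Eq. (1)."""
    (x1, y1), (x4, y4) = F1, F4
    return np.degrees(np.arctan2(y4 - y1, x4 - x1))

def align_roi(roi, theta_deg):
    """Rotate the cropped ROI by -theta so that F1F4 becomes horizontal.
    The sign convention depends on the image coordinate system; adjust if needed."""
    return rotate(roi, -theta_deg, reshape=False, order=1)

# Illustrative key points (pixel coordinates are assumptions for the example)
F1, F4 = (120.0, 340.0), (380.0, 300.0)
theta = roi_angle(F1, F4)
# roi = grayscale square region with corners F1, F5, F6, F4 (extraction not shown)
# aligned = align_roi(roi, theta)
```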
3 Feature Extraction 3.1 Geometry Features Extraction Nine geometry features are extracted and input to the SOM (shown in Fig. 2), including the lengths of the four fingers, the distances of the four finger bottoms and the width of the palmprint, which are signified by the lengths of the lines T1F1, T2F2, T3F3, T4F4, P1P2, P2P3, P3P4, P4P5 and P1P5.
3.2 Textural Features Extraction PCA (Principal Component Analysis) has been widely used for dimensionality reduction in computer vision, and extensive research shows that PCA also performs well in various recognition tasks [9,10,11]. In our context, the basis vectors bi(x, y) generated from a set of palmprint images are called eigenpalms, as they have the same dimension as the original images and are like palmprints in appearance. Recognition is performed
Fig. 2. ROI localization from the palmprint. (a) Key points of a hand. (b) The localized ROI from the palmprint. (c) ROI after enhancement.
by projecting a new image into the subspace spanned by the eigenpalms and then classifying the palm by comparing its position in palm space with the positions of known individuals. More formally, consider a set of M palmprint images i_1, i_2, ..., i_M. The average palm of the set is defined as

\bar{i} = \frac{1}{M} \sum_{j=1}^{M} i_j .

The difference between each palmprint image and the average palm \bar{i} is expressed by the vector \phi_n = i_n - \bar{i}. A covariance matrix is constructed by C = \sum_{j=1}^{M} \phi_j \phi_j^T, and the eigenvectors v_k and eigenvalues \lambda_k of the symmetric matrix C are calculated. v_k determines the linear combination of the M difference images \phi that forms the eigenpalms: b_l = \sum_{k=1}^{M} v_{lk} \phi_k, l = 1, ..., M. Then, K (< M) eigenpalms are selected corresponding to the K highest eigenvalues. The set of palmprint images {i} is transformed into its eigenpalm components (projected into the palm space) by the operation \omega_{nk} = b_k^T (i_n - \bar{i}), where n = 1, ..., M and k = 1, ..., K. The weights obtained form a vector \Omega_n = [\omega_{n1}, \omega_{n2}, ..., \omega_{nK}] that describes the contribution of each eigenpalm in representing the input palm image, and the eigenpalms are treated as a basis set for palm images. A feature extraction method integrating PCA is presented in this paper. The PCA features formed by the largest N values are first extracted from the ROI.
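The eigenpalm construction can be summarized in a few lines of linear algebra. The NumPy sketch below is an illustration, not the author's code; the array names and the small-matrix eigenvalue trick are assumptions made for the example.

```python
import numpy as np

def eigenpalm_features(train_rois, K):
    """train_rois: array of shape (M, H, W); returns mean palm, eigenpalms and weights."""
    M = train_rois.shape[0]
    X = train_rois.reshape(M, -1).astype(float)       # each row is one palm image
    mean_palm = X.mean(axis=0)                        # average palm  i_bar
    Phi = X - mean_palm                               # difference vectors  phi_n
    # solve the small M x M eigenproblem instead of the huge pixel-space covariance
    C_small = Phi @ Phi.T
    eigvals, V = np.linalg.eigh(C_small)
    order = np.argsort(eigvals)[::-1][:K]             # K largest eigenvalues
    B = V[:, order].T @ Phi                           # eigenpalms b_k (one per row)
    B /= np.linalg.norm(B, axis=1, keepdims=True)
    Omega = Phi @ B.T                                 # weight vectors Omega_n, shape (M, K)
    return mean_palm, B, Omega

def project(roi, mean_palm, B):
    """Project a new ROI into the eigenpalm space (the omega_nk operation)."""
    return B @ (roi.reshape(-1).astype(float) - mean_palm)
```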
4 Identification This section presents a hierarchical neural network classifier strategy, which is composed of a SOM neural network and BP neural networks. 4.1 SOM Neural Network The SOM neural network is an unsupervised neural network approach that can be used for classification tasks. A SOM neural network has an input layer and an output layer, and the output layer is the competition layer. When training data are fed into the network, the SOM neural network computes a winning neuron. The process works like that in the human brain, where neurons with similar functions tend to cluster in groups. SOM neural networks have been widely used in the field of classification [12,13,14]. In our system, at the coarse-level classification stage, the geometry features are input to the SOM neural network to decide to which class they belong. 4.2 BP Neural Network The BP neural network is one of the most popular and general methods for supervised classification. A BP network usually has three layers, including an input layer, a hidden layer and an output layer [15]. These layers are interconnected by modifiable weights, which are represented by links between layers. In this study, the number of BP neural networks is identical to the number of clusters produced by the SOM neural network, one BP neural network per class. We then train each BP network using the textural features of the samples belonging to its corresponding class. 4.3 Hierarchical Neural Network The SOM neural network has certain superiority in data clustering, especially for a large sample set, but it is a real challenge to use a SOM neural network for fine-level recognition. However, a BP neural network can accomplish the task of fine-level identification well. Therefore, in a hierarchical system with multiple networks, the SOM neural network is always placed in the first level and the BP neural network in the second [16,17]. The process of our system is composed of two stages. During the first stage, the SOM neural network clusters the handprint samples in the database using their geometry features. All of the samples are clustered into several classes, in which each enclosed region represents one class, and each class corresponds to one BP neural network for fine-level identification. In the identification stage, the geometry features of a sample are input to the SOM neural network; once the sample is classified into one class, its texture features are sent to the corresponding BP neural network for further identification. In the fine identification stage, the PCA features are fed to the BP neural network for fine identification.
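A minimal sketch of this two-stage classifier is given below. It is an illustration under stated assumptions rather than the system described here: MiniSom stands in for the SOM competition layer, scikit-learn's MLPClassifier stands in for the per-cluster BP networks, and the feature arrays and layer sizes are placeholders.

```python
import numpy as np
from minisom import MiniSom
from sklearn.neural_network import MLPClassifier

def train_hierarchy(geom_feats, pca_feats, labels, som_shape=(6, 6)):
    """Coarse SOM on geometry features, one BP (MLP) per SOM cluster on PCA features."""
    som = MiniSom(som_shape[0], som_shape[1], geom_feats.shape[1], sigma=1.0, learning_rate=0.5)
    som.train_random(geom_feats, num_iteration=5000)
    clusters = [som.winner(x) for x in geom_feats]       # winning neuron per sample
    experts = {}
    for node in set(clusters):
        idx = [i for i, c in enumerate(clusters) if c == node]
        ys = labels[idx]
        if len(set(ys)) > 1:
            clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000)
            clf.fit(pca_feats[idx], ys)                  # fine-level BP for this cluster
            experts[node] = clf
        else:
            experts[node] = ys[0]                        # trivial cluster: one enrolled identity
    return som, experts

def identify(som, experts, geom_x, pca_x):
    node = som.winner(geom_x)                            # coarse-level routing
    expert = experts[node]
    if not hasattr(expert, "predict"):
        return expert
    return expert.predict(pca_x.reshape(1, -1))[0]       # fine-level identification
```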
5 Experimental Results In order to test the effectiveness of the proposed approach, experiments are done on a database containing 1000 palmprints collected from 100 different persons using a
flatbed scanner. These palmprints were taken from people of different ages and both sexes. We captured the palmprints twice, at an interval of around three months, and acquired about 5 images from each palm each time. Therefore, this database contains about 10 images of each individual. We carried out a great number of experiments in selecting the size of the competition layer of the SOM, which is set to 40 × 40. Nine BP networks are established according to the number of winning neurons, and the samples from different regions are input to different BP networks for training. When testing a query sample, the testing sample is first input to the SOM for coarse-level classification; according to the result of the classification, the corresponding BP network is invoked for fine-level matching. The length of the features is determined based on the Equal Error Rate (EER) criterion, where the FAR equals the FRR. This is based on the rationale that both rates must be as low as possible for the biometric system to work effectively. Another performance measure obtained from FAR and FRR is the Correct Rate (CR). It represents the verification rate of the system and is calculated as follows:
CR = [1 − (FRR + FAR) / (Total number of test samples)] × 100%   (2)
As shown in Fig. 3, the CR reaches 95.4% when L = 30, with FRR = 2.3% and FAR = 2.3%.
Fig. 3. The distributions of FRR (L) and FAR (L)
Comparisons have been performed among our method, Hu's approach [20], Kumaral's algorithm [21] and Wong's algorithm [22]. Table 1 summarizes the results of our method and these approaches with respect to image resolution, feature type and accuracy. We can see from Table 1 that our method is superior to Kumaral's and Wong's algorithms in image resolution and accuracy. Although the image resolution in Hu's approach is lower, we achieve a much higher accuracy rate.
Table 1. Comparison of different palmprint recognition methods

Method        Feature       Image resolution   Accuracy rate
Hu's [18]     Statistical   60*60              84.67%
Wong's [19]   Structural    420*420            95%
Our method    Statistical   180*180            95.8%
The execution times for the pre-processing, feature extraction and matching are listed in Table 2 using an Intel Pentium IV processor (2.8 GHz). The time for identifying one testing sample is about 5 s, which is fast enough for real-time identification. In fact, we have not completely optimized the code, so it is possible to further reduce the computation time.

Table 2. Execution time for our handprint identification system

Operation            Execution time (ms)
Image acquisition    4000
Pre-processing       430
Feature extraction   233
Matching             23
6 Conclusions In this paper, a novel approach is presented to authenticate individuals by using their geometrical features and texture features. The hand images are captured by a scanner without any fixed peg. This mechanism is very suitable and comfortable for all users. In addition, we present a texture feature extraction method based on the PCA algorithm. Additionally, the system adopts the combination of SOM and BP for effective personal identification, and the system accuracy can reach above 95.4%. Acknowledgement. This research was supported by the University fund from the Xinjiang Government of China under Grant No. XJEDU2007I36, the Natural Science project of Xinjiang under Grant No. 2009211A10 and the Science Research project Plan of YiLi Normal College under project No. YB200937.
References 1. Connie, T., Teoh, A., Goh, M., Ngo, D.: Palmprint recognition with PCA and ICA, Palmerston North (November 2003) 2. Lin, C.-L., Chuang, T.C., Fan, K.-C.: Palmprint verification using hierarchical decomposition. Pattern Recognition 38, 2639–2652 (2005) 3. Han, C.-C., Cheng, H.-L., Lin, C.-L., Fan, K.-C.: Personal authentication using palm-print features. Pattern Recognition 36, 371–381 (2003) 4. Kumar, A., Shen, H.C.: Palmprint identification using PalmCodes. In: Proceedings of the Third International Conference on Image and Graphics (ICIG 2004)0-7695-2244-0/04 (2004)
5. Sun, Z., Wang, Y., Tan, T., Cui, J.: Improving iris recognition accuracy via cascaded classifiers. Appl. rev. 35(3) (2005) 6. Poon, C., Wong, D.C.M., Shen, H.C.: A new method in locating and segmenting palmprint into region-of-interest. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004)1051-4651/04 (2004) 7. You, J., Li, W., Zhang, D.: Hierarchical palmprint identification via multiple feature extraction. Pattern Recognition 35, 847–859 (2002) 8. Osowski, S., Nghia, D.D.: Fourier and wavelet descriptors for shape recognition using neural networks—a comparative study. Pattern Recognition 35, 1949–1957 (2002) 9. Connie, T., Teoh, A., Goh, M., Ngo, D.: Palmprint Recognition with PCA and ICA, Palmerston North (November 2003) 10. Wang, X., Kuldip, K.P.: Feature extraction and dimensionality reduction algorithms and their applications in vowel recognition. Pattern Recognition 36(10), 2429–2439 (2003) 11. Lu, G., David, Z., Wang, K.: Palmprint recognition using eigenpalms features. Pattern Recognition Letters 24(9-10), 1473–1477 (2003) 12. Mu, T., Nandi, A.K.: Breast cancer detection from FNA using SVM with different parameter tuning systems and SOMCRBF classifier. J. Franklin Inst. 344, 285–311 (2007) 13. Lee, J., Kwak, I.S., Lee, E., Kim, K.A.: Classification of breeding bird communities along an urbanization gradient using an unsupervised artificial neural network. Ecol. Modelling 203, 62–71 (2007) 14. Chou, H.C., Cheng, C.H., Chang, J.R.: Extracting drug utilization knowledge using selforganizing map and rough set theory. Expert Syst. Appl. 33, 499–508 (2007) 15. Osowski, S., Nghia, D.D.: Fourier and wavelet descriptors for shape recognition using neural networks—a comparative study. Pattern Recognition 35, 1949–1957 (2002) 16. Huang, R., Xi, L., Li, X., Liu, C.R., Qiu, H., Lee, J.: Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods. Mech. Syst. Signal Process. 21, 193–207 (2007) 17. Kong, J., Li, D.G., Watson, A.C.: A firearm identification system based on neural network. In: Gedeon, T(T.) D., Fung, L.C.C. (eds.) AI 2003. LNCS (LNAI), vol. 2903, pp. 315–326. Springer, Heidelberg (2003) 18. Hu, D., Feng, G., Zhou, Z.: Two-dimensional locality preserving projections (2DLPP) with its application to palmprint recognition. Pattern Recognition 40, 339–342 (2007) 19. Wong, M., Zhang, D., Kong, W.-K., Lu, G.: Real-time palmprint acquisition system design. IEE Proc. (online no. 20049040) 20. Hu, D., Feng, G., Zhou, Z.: Two-dimensional locality preserving projections (2DLPP) with its application to palmprint recognition. Pattern Recognition 40, 339–342 (2007) 21. Kumaral, A., Zhang, D.: Personal authentication using multiple palmprint representation. Pattern Recognition 38, 1695–1704 (2005) 22. Wong, M., Zhang, D., Kong, W.-K., Lu, G.: Real-time palmprint acquisition system design. IEE Proc. (online no. 20049040)
Image Fusion Using Self-constraint Pulse-coupled Neural Network Zhuqing Jiao, Weili Xiong, and Baoguo Xu School of IoT Engineering, Jiangnan University Wuxi 214122, China [email protected]
Abstract. In this paper, an image fusion method using a self-constraint pulse coupled neural network (PCNN) is proposed. A self-constraint restrictive function is introduced into the PCNN neuron, so that the relation among neuron linking strength, pixel clarity and historical linking strength is adjusted adaptively. Then the pixels of the original images corresponding to the fired and unfired neurons of the PCNN are considered as target and background respectively, after which new fire mapping images are obtained for the original images. Finally, the clear objects of the original images are decided by the weighted fusion rule with the fire mapping images and merged into a new image. Experimental results indicate that the proposed method has better fusion performance than several traditional approaches. Keywords: image fusion; pulse-coupled neural network; self-constraint; linking strength.
1 Introduction Image fusion is a process of combining two or more images from different modalities or instruments into a single image [1]. During this process, the more important visual information found in the original images is transferred into a fused image without the introduction of artifacts. A successful fusion method can achieve a more exact, reliable and comprehensive description of the images, so it is essential for image fusion to obtain a fusion effect with richer details and more prominent objectives [2]. However, there is often a strong correlation among pixels, and a single pixel cannot properly express image features. The pulse coupled neural network (PCNN) is a recently developed artificial neural network model [3, 4], which has been efficiently applied to image processing tasks such as image segmentation, image restoration and image recognition [3]. In a PCNN, a neuron's firing will cause the neighboring neurons with similar brightness to ignite, actively passing information, and its parameters never need any training, which can greatly save processing time and reduce the computational complexity. Generally, the parameters of the PCNN play a decisive role in its performance and have important research value, but the selection of several key parameters mainly relies on repeated tests and manual adjustment, which largely limits the application of the PCNN [5].
Both [6] and [7] discuss the self-constraint updating model of distributed networks and express the change of a new credit rating based on historical credibility through a self-constraint factor. This paper presents a novel image fusion method in which a self-constraint restrictive function is introduced into the PCNN neuron. The fired and unfired neurons of the PCNN are considered as the target and the background respectively, after which new fire mapping images are obtained. Then the clear objects of each original image are decided and merged into a new clear image. Finally, the experimental results show that the method is effective. The remainder of this paper is organized as follows. Section 2 introduces the self-constraint PCNN and proposes its image fusion method. In Section 3, two experiments are conducted and relevant discussions are reported. The conclusions are summarized in Section 4.
2 Image Fusion Method 2.1 PCNN Model PCNN is a style of feedback network and each neuron consists of three parts: the receptive field, the modulation field and the pulse generator [8]. The structure of PCNN neuron is shown in Figure 1.
Fig. 1. PCNN neuron structure
Where Fij represents the feeding input of the (i, j)th neuron, Lij is the linking input, βij is the linking strength, θij is the dynamic threshold, Uij is the internal activity of the neuron, and Yij is the pulse output. The neuron receives input signals from other neurons and from external sources through the receptive fields. The signals include pulses, analog time-varying signals, constants, or any combination of these. Generally, the pulses come from the other neurons and the external sources refer to something not belonging to the neural network. For a neuron, the feeding input is the primary input from the neuron's receptive area, while the linking input is the sum of the responses of the output pulses from surrounding neurons [9]. The PCNN neuron model can be described by a group of equations [10]:
F_{ij}[n] = I_{ij}   (1)

L_{ij}[n] = e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{pq} W_{ij,pq} Y_{ij,pq}[n-1]   (2)

U_{ij}[n] = F_{ij}[n] \, (1 + \beta_{ij} L_{ij}[n])   (3)

\theta_{ij}[n] = e^{-\alpha_\theta} \theta_{ij}[n-1] + V_\theta Y_{ij,pq}[n-1]   (4)

Y_{ij}[n] = \begin{cases} 1, & U_{ij}[n] > \theta_{ij}[n] \\ 0, & U_{ij}[n] \le \theta_{ij}[n] \end{cases}   (5)
Where n denotes the iteration number, Iij is the external input, Wij contains the synaptic gain strengths, αL and αθ are the decay constants, VL is the amplitude gain and Vθ is the time constant of the threshold adjuster. 2.2 Self-constraint PCNN
In the above model, β reflects the pixel characteristics and the value relationship between surrounding pixels. In many applications of the PCNN in image processing, β is generally the same for all neurons and set as a constant [11]. To human vision, the responses to a region with notable features are stronger than to a region with non-notable features [12]. Therefore, the linking strength of each neuron in the PCNN should be related to the features of the corresponding pixels of the images; based on this, it is not reasonable for all neurons to have the same linking strength. It is well known that the clarity of each pixel is a notable feature of the edges of the images. Accordingly, the clarity of each pixel is chosen as the linking strength of the corresponding neuron [13]. The linking strength is denoted as

\beta_{ij} = \sum_{(i,j)\in D} \frac{[\Delta f_x(i,j)]^2 + [\Delta f_y(i,j)]^2}{2}   (6)
Where D denotes an M×N neighborhood centered at pixel f(i, j), and Δfx(i, j) and Δfy(i, j) are the variations of f(i, j) in the x and y directions, respectively. The PCNN used for image fusion is a single-layer 2-D array of laterally linked pulse coupled neurons. Suppose that the size of each original image is M×N; the size of each PCNN designed is M×N accordingly. Each pixel value is input into the neuron connected to it, while each neuron is connected with its neighboring neurons. Each neuronal output has two states, ignition (state 1) or non-ignition (state 0). For a neuron, the feeding input is the intensity of the corresponding pixel, and the linking input is the sum of the responses of the output pulses from surrounding neurons. Because each image pixel is associated with a PCNN neuron, the structure of the PCNN follows from the structure of the input image. The better the clarity of a pixel, the larger the value of β and therefore the larger the linking strength of the corresponding neuron. As a bionic model of a complex biological visual system, it is difficult for a single PCNN to meet the various needs of
image processing. Therefore, it is necessary to combine it with other relevant models to achieve greater value [5]. In order to express the relationship among linking strength, pixel clarity and historical linking strength, the self-constraint restrictive function is introduced into the linking strength:
\beta_{ij}[n] = \gamma_{ij}[n] S_{ij}[n] + (1 - \gamma_{ij}[n]) \beta_{ij}[n-1]   (7)
Where βij[n] is the linking strength produced at the n-th iteration, βij[n−1] is the linking strength produced at the (n−1)-th iteration, Sij[n] is the clarity at the n-th iteration, and γij[n] ∈ [0, 1] is the self-restraint factor. From Equation (7) we can see that βij[n] not only depends on Sij[n] and Sij[n−1], but is also related to the jitter degree between them. The self-restraint factor γij is used to control the change of the linking strength, and its self-restraint ability can make the linking strength converge to a stable state. γij[n] is defined as follows:

\gamma_{ij}[n] = \frac{\lambda e^{\Delta t_{ij}[n]} - 1}{\lambda e^{\Delta t_{ij}[n]} + 1}, \qquad \Delta t_{ij}[n] = \left| S_{ij}[n] - \beta_{ij}[n-1] \right|   (8)
Where Sij[n], βij[n−1] ∈ [0, 1], and thus Δtij[n] ∈ [0, 1]; λ ∈ [0, 1] is the jitter parameter, which can change the jitter degree of γij[n]. The self-constraint restrictive function defines the linking strength among neighboring neurons, which is associated with both the linking strength from the previous iteration and the current clarity, and reflects the changes in the amplitude of the linking strength through the self-restraint factor. It avoids the artificial adjustment of each parameter for different images. 2.3 Image Fusion Rule
When an image is input into the PCNN, its edge, texture and other information can be effectively extracted through the ignition frequency of the neurons: the higher the ignition frequency of a neuron, the richer the information at that point. Supposing the ignition frequency of the PCNN neuron corresponding to the (i, j)th pixel of an original image is Y(i, j), the neighborhood mean of Y is denoted as

T(i, j) = \frac{1}{MN} \sum_{m=-(M-1)/2}^{(M-1)/2} \; \sum_{n=-(N-1)/2}^{(N-1)/2} Y(i+m, j+n)   (9)
The clear objects of each original image are decided by a weighted fusion rule with the fire mapping images pixel by pixel, and then all of them are merged into a new clear image. For original images A and B, the fused image is

F(i, j) = \omega_A A(i, j) + \omega_B B(i, j)   (10)

where \omega_A = Y_A(i, j) / (Y_A(i, j) + Y_B(i, j)) and \omega_B = Y_B(i, j) / (Y_A(i, j) + Y_B(i, j)).

The structure of the image fusion based on PCNN is plotted in Fig. 2.
Fig. 2. The structure of PCNN based image fusion
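The whole fusion procedure of Section 2 can be condensed into the following NumPy sketch. It is a simplified illustration of Eqs. (1)-(10) under stated assumptions, not the authors' implementation: the clarity measure, the normalization steps and the parameter values are choices made only for this example.

```python
import numpy as np
from scipy.ndimage import convolve

def clarity(img):
    """Pixel clarity S from local gradients, the data term of Eq. (7)."""
    dx = np.gradient(img.astype(float), axis=1)
    dy = np.gradient(img.astype(float), axis=0)
    return (dx ** 2 + dy ** 2) / 2.0

def pcnn_fire_map(img, W, n_iter=100, aL=0.2, ath=0.25, VL=1.0, Vth=0.5, lam=1.0):
    """Accumulated firing map of a self-constraint PCNN (Eqs. (1)-(8))."""
    F = img.astype(float) / (img.max() + 1e-12)
    L = np.zeros_like(F); Y = np.zeros_like(F); theta = np.ones_like(F)
    beta = np.zeros_like(F); fires = np.zeros_like(F)
    S = clarity(F); S = S / (S.max() + 1e-12)
    for _ in range(n_iter):
        dt = np.abs(S - beta)
        gamma = (lam * np.exp(dt) - 1.0) / (lam * np.exp(dt) + 1.0)   # self-restraint factor, Eq. (8)
        beta = gamma * S + (1.0 - gamma) * beta                        # Eq. (7)
        L = np.exp(-aL) * L + VL * convolve(Y, W, mode="constant")     # Eq. (2)
        U = F * (1.0 + beta * L)                                        # Eq. (3)
        Y = (U > theta).astype(float)                                   # Eq. (5)
        theta = np.exp(-ath) * theta + Vth * Y                          # Eq. (4)
        fires += Y                                                      # ignition frequency
    return fires

def fuse(imgA, imgB, W):
    """Weighted fusion of Eq. (10) from the two firing maps."""
    YA, YB = pcnn_fire_map(imgA, W), pcnn_fire_map(imgB, W)
    wA = YA / (YA + YB + 1e-12)
    return wA * imgA + (1.0 - wA) * imgB
```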
3 Experimental Results In the experiments, the parameter values of the PCNN are set as follows: αL = 0.2, αθ = 0.25, VL = 1, Vθ = 0.5, λ = 1, Nmax = 100, and the 5×5 linking matrix is

W = \begin{bmatrix} 1/2 & 1/3 & 1/2 & 1/3 & 1/2 \\ 1/3 & 1/2 & 1 & 1/2 & 1/3 \\ 1/2 & 1 & 1 & 1 & 1/2 \\ 1/3 & 1/2 & 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/2 & 1/3 & 1/2 \end{bmatrix}   (11)
Image Fusion Using Self-constraint Pulse-coupled Neural Network
Fig. 3. Original multi-focus images and fused images: (a) left focused; (b) right focused; (c) DWT; (d) PCNN; (e) the proposed method

Table 1. Objective evaluations of fusion results

Method        IE       AG       QAB/F
Figure 3(c)   7.3699   7.1423   0.6874
Figure 3(d)   7.4761   7.7342   0.7291
Figure 3(e)   7.5045   7.8372   0.7370
Compared with the other methods, the proposed method yields the largest IE and AG. This indicates that the image information, especially the details, is the richest in its fusion result. The QAB/F value of its fused image is also increased significantly compared with those of the other two methods. Meanwhile, the detail information of the original images is better retained, which indicates that the proposed method can significantly improve the final fusion effect. The objective evaluation is in accord with the visual effect, which not only further represents the advantages of the proposed method for image fusion but also proves the effectiveness of the self-constraint PCNN. The infrared and visible images and their fused images are shown in Figure 4.
Fig. 4. Infrared and visible light images and fused images: (a) infrared image; (b) visible light image; (c) DWT; (d) PCNN; (e) the proposed method
In Figure 4, after DWT the target is not distinct and the edges and details seem rather ambiguous. The fusion based on PCNN increases the ability to capture image edge information, but the overall contrast remains questionable. With the proposed method, the target edge is much clearer and the fused image is more natural than the above fused results. It not only integrates the target information of the infrared image successfully, but also retains the scene information of the visible light image as much as possible. Mutual information (MI) and overall cross entropy (EAB/F) reflect the amount of information extracted from the original images, so MI, EAB/F and QAB/F are used to evaluate the fusion performance quantitatively. The objective evaluation results are shown in Table 2:

Table 2. Objective evaluations of fusion results

Method        MI       EAB/F    QAB/F
Figure 4(c)   0.2070   1.3413   0.6006
Figure 4(d)   0.2641   1.2426   0.6481
Figure 4(e)   0.2660   1.0575   0.6991
In Table 2, the objective evaluation is in accord with the visual effect. Compared with DWT and PCNN, the proposed method has the highest MI and the lowest EAB/F, which indicates that the fused image extracts more scene and target information from the original images. The highest QAB/F verifies that the proposed method preserves more edge and detail information from the original images than the other two methods. The two experiments demonstrate the effectiveness of the proposed method for multi-focus image fusion and for infrared and visible image fusion in improving the fused image quality.
4 Conclusions In this paper, we have proposed a novel image fusion method with a self-constraint PCNN. The relation among neuron linking strength, pixel clarity and historical linking strength is adjusted adaptively. Furthermore, the variation of clarity drives the iteration and renewal of the linking strength, and the fired and unfired neurons of the PCNN are considered as target and background respectively. The clear objects of the original images are decided by the weighted fusion rule and merged into a fused image. In the experiments the proposed method, which performs well in preserving edge information and target information, shows better fusion performance than the DWT and PCNN methods. Acknowledgment. The authors acknowledge the support of the National High Technology Research and Development Program of China (No. 2006AA10Z248), the Fundamental Research Funds for the Central Universities (No. JUSRP10927) and the Ph.D Student Research Fund of Jiangnan University.
References 1. Xiaohui, Y., Licheng, J.: Fusion algorithm for remote sensing images based on nonsubsampled contourlet transform. Acta Automatica Sinica 34(3), 274–281 (2008) 2. Qiang, Z., Baolong, G.: Multifocus image fusion using the nonsubsampled contourlet transform. Signal Processing 89(7), 1334–1346 (2009) 3. Zhaobin, W., Yide, M.: Medical image fusion using m-PCNN. Information Fusion 9(2), 176–185 (2008) 4. Xiaobo, Q., Jingwen, Y., Hongzhi, X., et al.: Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain. Acta Automatica Sinica 34(12), 1508–1514 (2008) 5. Zhijiang, Z., Chunhui, Z., Zhihong, Z.: A new method of PCNN′s parameter′s optimization. Acta Electronic Asinic 35(5), 996–1000 (2007) 6. Wang, Y., Vijay, V.J.: Interaction trust evaluation in decentralized environments. In: Proc. of the 5th International Conference on Electronic Commerce and Web Technology, Zaragoza, Spain, pp. 144–153 (2004) 7. Mingwu, Z., Bo, Y., Wenzheng, Z.: Self-constraint reputation updating model. Computer Engineering 33(18), 145–147 (2007) 8. Zhaobin, W., Yide, M., Feiyan, C., et al.: Review of pulse-coupled neural networks. Image and Vision Computing 28(1), 5–13 (2010) 9. Shuyuan, Y., Min, W., Licheng, J., et al.: Image fusion based on a new contourlet packet. Information Fusion 11(2), 78–84 (2010) 10. Berg, H., Olsson, R., Lindblad, T., et al.: Automatic design of pulse coupled neurons for image segmentation. Neurocomputing 71(10-12), 1980–1993 (2008) 11. Jiangbo, Y., Houjin, C., Wei, W., et al.: Parameter determination of pulse coupled neural network in image processing. Acta Electronica Sinica 36(1), 81–85 (2008) 12. Shuyuan, Y., Min, W., Yanxiong, L., et al.: Fusion of multiparametric SAR images based on SW-nonsubsampled contourlet and PCNN. Signal Processing 89(12), 2596–2608 (2009) 13. Qiguang, M., Baoshu, W.: A novel image fusion algorithm based on local contrast and adaptive PCNN. Chinese Journal of Computers 31(5), 875–880 (2008)
Segmentation for SAR Image Based on a New Spectral Clustering Algorithm Li-Li Liu1,2, Xian-Bin Wen1,2, and Xing-Xing Gao1,2 1 Key Laboratory of Computer Vision and System of Ministry of Education, Tianjin University of Technology, 300191, Tianjin, China 2 Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, 300191 Tianjin, China [email protected]
Abstract. A new spectral clustering (SC) algorithm with the Nyström method is proposed for SAR image segmentation in this paper. The proposed algorithm differs from previous approaches in that not only is the Nyström method employed to alleviate the computational and storage burdens of the SC algorithm, but a new similarity function is also constructed by combining the pixel value and the spatial location of each pixel to better depict the intrinsic structure of the original SAR image. Our algorithm and the classic spectral clustering algorithm with the Nyström method are evaluated using real-world SAR images. The results compare the running time and the error rate of the proposed approach and the classic spectral clustering algorithm with the Nyström method. Keywords: Image segmentation, spectral clustering (SC), synthetic aperture radar (SAR).
1 Introduction Synthetic aperture radar (SAR) is a kind of microwave imaging system. It has the attractive property of producing images in any weather condition and also in the absence of sunlight [1]. Hence, SAR images have wide applications ranging from military to economic and social fields. Moreover, SAR image segmentation plays a significant role in image compression, target detection, the recognition of targets and so on. The purpose of SAR image segmentation is to partition an image into regions with different characteristics [2]. There are a wide variety of segmentation approaches, such as statistical model-based methods [3], [4], [5], morphological methods [6], [7], threshold methods [8], [9] and clustering algorithms [10], [11]. Compared with many other segmentation approaches, the spectral clustering (SC) algorithm can obtain clusters in sample spaces with arbitrary shape. The SC algorithm was first proposed by Donath and Hoffman [12] in 1973. Then Hagen and Kahng [14] put forward the ratio-cut criterion and established the spectral clustering algorithm. Recently, the SC algorithm has shown great promise for SAR image segmentation; for example, Zhang et al. put forward a spectral clustering ensemble applied to SAR image segmentation [2] in
2008, Belongie et al. proposed spectral partitioning with indefinite kernels using the Nyström extension [18] in 2002, Fowlkes et al. put forward spectral grouping using the Nyström method [20] in 2004, and Zhang and Kwok proposed a density-weighted Nyström method for computing large kernel eigensystems [21] in 2009. In this paper, we put forward a new SC algorithm combined with the Nyström method for SAR image segmentation. The Nyström method is a well-known sampling-based technique for approximating the affinity matrix: a small set of randomly sampled data points from all pixels is used to perform the approximation, which lowers the computational complexity. However, the drawback of the classic SC algorithm with the Nyström method is the requirement of choosing an appropriate scaling parameter and similarity function. Hence, we adopt the neighborhood adaptive scaling method to choose the scaling parameter automatically. Besides, we put forward a new similarity function to construct the affinity matrix in the SC algorithm. The scaling parameter σ appears in the Gaussian radial basis function and is constructed from the density and space features of the pixels in the SAR image, which is clearly stated in Section 3. Experimental results show that the proposed method is effective for SAR image segmentation with an appropriately chosen scaling parameter, which is clearly stated in Section 4. The structure of this paper is as follows: In Section 2, we describe the classic SC algorithm with the Nyström method. In Section 3, we bring in the new similarity function to construct the affinity matrix for SAR image segmentation. In Section 4, we analyze the performance of the proposed method and the classic spectral clustering algorithm with the Nyström method by experiments on SAR image segmentation. Besides, we discuss the significance of the proposed method by making a fair comparison with the existing methods. Finally, we conclude in Section 5.
2 The Classic SC Algorithm with the Nyström Method The SC algorithm is related to the eigenvectors and eigenvalues of affinity matrix W ∈ R n× n (a n × n symmetric matrix). The affinity matrix W defines the similarity between each pair of pixels in the SAR image. It can be obtained by computing the weighted adjacency matrix for a weighted undirected graph G = ( V, E ) , where the set of nodes V represents the pixels in the SAR image, V={vi } , and the weight wij on
each edge E is equal to the similarity between pixel vi and pixel v j , E={w ij |w ij ≥ 0} , and n denotes the number of pixels in the SAR image. The process of SAR image segmentation corresponds to the process of graph-partitioning. In graph-partitioning, we need to partition the set of nodes V into disjoint sets v1 , v2 ,..., vk , by comparing the similarities among the nodes. The similarities of nodes in the same set vi are far higher than different sets. Next, the normalized cut criterion [18] will be described. Partition the set of nodes V into disjoint sets A and B , so A ∪ B = V and A ∩ B = ∅ . The sum of the weights between sets A and B are expressed as cut ( A, B) , and cut ( A, B ) = ∑ i∈ A, j∈B Wij , where di = ∑ j Wij denotes the
degree of the i'th node. The volumes of sets A and B are vol(A) = \sum_{i \in A} d_i and vol(B) = \sum_{i \in B} d_i. The normalized cut between sets A and B is represented as follows:

NCut(A, B) = cut(A, B) \left( \frac{1}{vol(A)} + \frac{1}{vol(B)} \right) = \frac{2 \, cut(A, B)}{H(vol(A), vol(B))}.
(1)
Where H(a, b) denotes the harmonic mean, given by H(a, b) = 2ab/(a + b). In order to seek a satisfactory partitioning result, we need to minimize the value of NCut(A, B), namely, seek the sets A and B. According to spectral graph theory [19], we can obtain an approximate solution by thresholding the eigenvector corresponding to the second smallest eigenvalue of the normalized Laplacian matrix L [18], which is equal to: L = D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}.
(2)
Where D is the diagonal matrix and Dii is the sum of W's i'th row (Dii = di). The Laplacian matrix L is positive semidefinite, even when the symmetric matrix W is indefinite. The eigenvalues of matrix L lie in the interval [0, 2]. Eq. (2) can thus be divided into two terms. The first term of Eq. (2), I, is a constant, so the eigenvalues of the second term of Eq. (2), D^{-1/2} W D^{-1/2}, lie in the interval [−1, 1]. Moreover, multiple groups can be obtained by recursive bipartition or by using multiple eigenvectors. In order to alleviate the computational and storage burdens of the SC algorithm, we combine the SC algorithm with the Nyström method. Similar to the Nyström method [2, 19], we choose m random pixels from a SAR image with N pixels to perform the approximation, so we obtain the remaining n = N − m pixels, with m ≪ n. At first, these m randomly chosen pixels are used to solve the eigenvector problem of the affinity matrix, and then this solution is extrapolated to the full SAR image with N pixels. Now the affinity matrix can be expressed as:
W = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}.
(3)
Where A ∈ R^{m×m}, B ∈ R^{m×n} and C ∈ R^{n×n}; subblock A corresponds to the affinity matrix of the randomly chosen m pixels, subblock B contains the weights from the randomly chosen m pixels to the remaining n pixels of the affinity matrix, and subblock C corresponds to the affinity matrix of the remaining n pixels. Since m ≪ n, subblock C is very large. However, we can use B^T A^{-1} B to estimate an approximate solution of C via the Nyström extension method [22]. Hence, rewriting the affinity matrix, we have:
W' = \begin{bmatrix} A & B \\ B^T & B^T A^{-1} B \end{bmatrix}.
(4)
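As a concrete illustration of Eqs. (3)-(4), the small NumPy sketch below assembles the Nyström approximation of the full affinity matrix from the sampled blocks A and B. It is a generic illustration of the technique with random data standing in for SAR-derived similarities, not code from this paper.

```python
import numpy as np

def nystrom_affinity(A, B):
    """Approximate W' = [[A, B], [B^T, B^T A^{-1} B]] as in Eq. (4)."""
    A_inv = np.linalg.pinv(A)                 # pseudo-inverse guards against ill-conditioning
    C_hat = B.T @ A_inv @ B                   # Nyström estimate of the unsampled block C
    top = np.hstack([A, B])
    bottom = np.hstack([B.T, C_hat])
    return np.vstack([top, bottom])

# toy example: m = 4 sampled pixels, n = 6 remaining pixels (values are placeholders)
rng = np.random.default_rng(0)
A = rng.random((4, 4)); A = (A + A.T) / 2     # symmetric sampled block
B = rng.random((4, 6))
W_approx = nystrom_affinity(A, B)
```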
3 Our Algorithm 3.1 The Scaling Parameter σ and the New Similarity Function
The SC algorithm is sensitive to the scaling parameter σ. In the SC algorithm, a good segmentation result can be obtained by setting an appropriate scaling parameter. According to the pixel features (density/space) in the SAR image, the parameter σ controls how rapidly the affinity Wij falls off with the distance between vi and vj [17]. The value of the parameter σ is generally real-valued, and the appropriate values are between 0 and 1. Moreover, it is very time-consuming to choose the value of the parameter σ by repeating experiments. Hence, we adopt the neighborhood adaptive scaling method to choose it automatically. The method calculates the local C-neighbor average distance for each pixel vi. Consequently, the distance function d²(vi, vj) mentioned in Section 2 can be generalized as:
d^2(v_i, v_j) / (2\sigma_i^2) = d(v_i, v_j)\, d(v_j, v_i) / (2\sigma_i \sigma_j) = d^2(v_i, v_j) / (2\sigma_i \sigma_j).
(5)
The similarity function can be expressed as:
W_{ij} = \exp\left( - d^2(v_i, v_j) / (2\sigma_i \sigma_j) \right).
(6)
Where \sigma_i = \frac{1}{C}\sum_{m=1}^{C} d(v_i, v_m) = \frac{1}{C}\sum_{m=1}^{C} \| v_i - v_m \| is the local C-neighbor average distance of pixel v_i, and v_m is the m'th nearest neighbor of pixel v_i. In this paper, the value of C is 4; that is to say, the values of m are {1, 2, 3, 4}. In this paper, the similarity function is obtained by combining the pixel value and the spatial location of each pixel. Moreover, we use the neighborhood adaptive scaling method as the approximate estimate of the parameter σ. Hence the new similarity function is constructed in Gaussian-weighted Euclidean distance form:
W_{ij} = \exp\left( - \frac{d^2(v_i^G, v_j^G)}{2\sigma_i^G \sigma_j^G} - \frac{d^2(v_i^X, v_j^X)}{2\sigma_i^X \sigma_j^X} \right), \quad \text{if } i \neq j; \qquad W_{ii} = 0.
approximation estimation of scaling parameter based on pixel value, σ i denotes the approximation estimation of scaling parameter based on spatial location. X
3.2 Our Algorithm
The procedure of our algorithm is summarized as follows:
Segmentation for SAR Image Based on a New Spectral Clustering Algorithm
639
Step 1. Given a SAR image set W = {wi }iN=1 , randomly choose a subset S = {si }im=1 , R = {ri }in=1 and the rest subset is . Step 2. Compute the subblock A ∈ R m× m and the subblock B ∈ R m× n by Aij = exp(−
Bij = exp(−
d 2 ( siG , s Gj ) 2σ iGσ Gj d 2 ( siG , rjG ) 2σ iGσ Gj
−
−
d 2 ( siX , s Xj ) 2σ iX σ jX d 2 ( siX , rjX ) 2σ iX σ jX
).
(8)
).
(9)
Step 3. Compute the diagonal entry of diagonal matrix D by A1m + B1n ⎡ ⎤ ⎡ ar + br ⎤ d ' = W '1 = ⎢ T ⎥=⎢ ⎥. T −1 T −1 ⎣ B 1m + B A B1n ⎦ ⎣bc + B A br ⎦
(10)
Where ar , br ∈ R m denote the sum of every row of matrixes A and B , bc ∈ R n denotes the sum of every column of matrix B , 1 denotes the column vector, which all values are all 1. Step 4. Normalizing the matrix A and the matrix B by
Aij ←
Bij ←
Bij di' d 'j
Aij
, i, j = 1,..., n .
(11)
, i = 1,..., n, j = 1,..., m .
(12)
di' d 'j
In order to simplify the orthogonal process W ' = V ΛV T , we compute the matrix Q = A + A−1 2 BBT A−1 2 , where we can obtain A−1 2 by the singular value decomposition on A . As we know, Q can be orthogonalized to Q = U ΛU T , where we can obtain U by the singular value decomposition on Q . Moreover, through the eigenvalue decomposition ( D −1/ 2WD −1/ 2 )V = V Λ , we can obtain the orthogonal column ⎡ A⎤ eigenvector matrix V of Laplacian matrix L = D −1/ 2WD −1/ 2 by V = ⎢ T ⎥ A−1 2U Λ −1 2 . ⎣B ⎦ Step 5. The eigenvectors in matrix V are sorted in descending order by eigenvalues, compute the first k eigenvectors of matrix V , then get the matrix V = [v1 ,..., vk ] ∈ R N × k as columns.
Step 6. Normalizing each row inV to get the matrix F ∈ R N × k :
640
L.-L. Liu, X.-B. Wen, and X.-X. Gao
Fij =
Vi , j D jj
, i = 1,..., N , j = 1,..., k .
(13)
Step 7. Treat each row f_i ∈ R^k of F as a pixel and cluster the rows into k clusters via k-means. Assign the original pixel w_i to the j'th cluster if and only if the corresponding row i of the matrix F is assigned to the j'th cluster. Finally, output the k clusters of W.
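Steps 1-7 can be condensed into the following sketch. It follows the normalized Nyström orthogonalization described above, but it is a generic NumPy/SciPy re-implementation with assumed helper names (the affinity blocks would come from the similarity function of Eq. (7)), not the authors' Matlab code.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def nystrom_sc(A, B, k):
    """Approximate normalized spectral clustering from sampled blocks A (m x m) and B (m x n)."""
    m, n = B.shape
    # Step 3: approximate degrees d' = W'1
    ar, br, bc = A.sum(1), B.sum(1), B.sum(0)
    A_inv = np.linalg.pinv(A)
    d = np.concatenate([ar + br, bc + B.T @ (A_inv @ br)])
    # Step 4: normalize the blocks by the degrees
    dsq = np.sqrt(d)
    A_n = A / np.outer(dsq[:m], dsq[:m])
    B_n = B / np.outer(dsq[:m], dsq[m:])
    # Step 5: one-shot orthogonalization via Q = A + A^{-1/2} B B^T A^{-1/2}
    U_a, s_a, _ = np.linalg.svd(A_n)
    A_isqrt = U_a @ np.diag(1.0 / np.sqrt(s_a + 1e-12)) @ U_a.T
    Q = A_n + A_isqrt @ B_n @ B_n.T @ A_isqrt
    U_q, s_q, _ = np.linalg.svd(Q)
    V = np.vstack([A_n, B_n.T]) @ A_isqrt @ U_q @ np.diag(1.0 / np.sqrt(s_q + 1e-12))
    # Steps 6-7: row-normalize the leading k eigenvectors and cluster with k-means
    F = V[:, :k]
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    _, labels = kmeans2(F, k, minit="++")
    return labels       # labels are ordered: sampled pixels first, remaining pixels after
```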
4 Experimental Study Our experiments use real SAR images. Firstly, we map each pixel into a feature space with the pixel value feature and the spatial location feature of each pixel. Next, we set the parameter k = 5 and use the proposed similarity function (Eq. (7)) to segment the SAR images. The computational costs of the classic SC algorithm with the Nyström method and the proposed method are compared in terms of running time and error rate on a personal computer with a dual-core 2.16 GHz processor, 2 GB memory and the Windows XP operating system; the programs are run in Matlab 7.0.1.
Fig. 1. (a) Original SAR image (200×200 pixels). (b) Segmentation obtained by the classic SC algorithm with Nyström method (running time: 29.391s, error rate: 5.31%, the number of misclassified pixels: 2124). (c) The proposed method (running time: 33.685s, error rate: 2.58%, the number of misclassified pixels: 1032).
In Fig. 1, the original SAR image has 200×200 pixels, the running time of the classic SC algorithm with Nyström method is 29.391s, the error rate is 5.31%, and the number of misclassified pixels is 2124. The running time of the proposed method is 33.685s, the error rate is 2.58%, and the number of misclassified pixels is 1032. In Fig. 2, the original SAR image has 200×200 pixels, the running time of the classic SC algorithm with Nyström method is 15.266s, the error rate is 5.38%, and the number of misclassified pixels is 2152. The running time of the proposed method is 18.735s, the error rate is 2.45%, and the number of misclassified pixels is 980. In Fig. 3, the original SAR image has 256×256 pixels, the running time of the classic SC algorithm with
Segmentation for SAR Image Based on a New Spectral Clustering Algorithm
(a)
(b)
641
(c)
Fig. 2. (a) Original SAR image (200×200 pixels). (b) Segmentation obtained by the classic SC algorithm with Nyström method (running time: 15.266s, error rate: 5.38%, the number of misclassified pixels: 2152). (c) The proposed method (running time: 18.735s, error rate: 2.45%, the number of misclassified pixels: 980).
(a)
(b)
(c)
Fig. 3. (a) Original SAR image (256×256 pixels). (b) Segmentation obtained by the classic SC algorithm with Nyström method (running time: 25.25s, error rate: 5.49%, the number of misclassified pixels: 3598). (c) The proposed method (running time: 28.672s, error rate: 2.31%, the number of misclassified pixels: 1377).
Nyström method is 25.25s, the error rate is 5.49%, and the number of misclassified pixels is 3598. The running time of the proposed method is 28.672s, the error rate is 2.31%, and the number of misclassified pixels is 1377. Experimental results show that the proposed method is effective for SAR image segmentation. In the above experiments, to segment several real SAR images, the running time of the classic SC algorithm with Nyström method is slightly lower than the proposed algorithm, but the error rate of the proposed algorithm is much lower than the classic SC algorithm with Nyström method. Hence, the proposed algorithm is better than the classic SC algorithm with Nyström method in the segmentation performance.
5 Conclusion This paper proposed a new SC algorithm combined with the Nyström method for SAR image segmentation. Compared with the classic SC algorithm with the Nyström method, our algorithm has better segmentation performance. If we make better use of the SAR
image feature information, we can obtain more satisfactory segmentation results; this is a direction for future research. Moreover, the Nyström method is unstable for SAR image segmentation. Hence, it is expected that searching for a more precise similarity function and sampling-based technique to approximate the affinity matrix will improve the performance further. Acknowledgements. The authors would like to thank the anonymous reviewers for their detailed comments and questions, which improved the quality of the presentation of this paper. This work is supported in part by the National Natural Science Foundation of China (No. 60872064), the Aeronautics and Astronautics Basal Science Foundation of China (No. 03I53059), and the Tianjin Natural Science Foundation (08JCYBJC12300, 08JCYBJC12200).
References 1. Quan, J.: Multiscale Segmentation for SAR image based on Neural Networks. Tianjin University of Technology, D. Tianjin (2007) 2. Zhang, X., Jiao, L., Liu, F., Bo, L., Gong, M.: Spectral Clustering Ensemble Applied to SAR Image Segmentation. J. IEEE Trans. Geosci. Remote Sens. 46(7), 2126–2136 (2008) 3. Samadani, R.: A finite mixtures algorithm for finding proportions in SAR images. IEEE Trans. Image Process. 4(8), 1182–1185 (1995) 4. Dong, Y., Forster, B.C., Milne, A.K.: Comparison of radar image segmentation by Gaussian-and Gamma-Markov random field models. Int. J. Remote Sens. 24(4), 711–722 (2003) 5. Deng, H., Clausi, D.A.: Unsupervised segmentation of synthetic aperture radar sea ice imagery using a novel Markov random field model. IEEE Trans. Geosci. Remote Sens. 43(3), 528–538 (2005) 6. Lemaréchal, C., Fjørtoft, R., Marthon, P., Cubero-Castan, E., Lopes, A.: SAR image segmentation by morphological methods. In: Proc. SPIE, vol. 3497, pp. 111–121 (1998) 7. Ogor, B., Haese-coat, V., Ronsin, J.: SAR image segmentation by mathematical morphology and texture analysis. In: Proc. IGARSS, pp. 717–719 (1996) 8. Lee, J.S., Jurkevich, I.: Segmentation of SAR images. IEEE Trans. Geosci. Remote Sens. 27(6), 674–680 (1989) 9. Zaart, A.E., Ziou, D., Wang, S., Jiang, Q.: Segmentation of SAR images using mixture of gamma distribution. Pattern Recognit. 35(3), 713–724 (2002) 10. Kersten, P.R., Lee, J.-S., Ainsworth, T.L.: Unsupervised classification of polarimetric synthetic aperture radar images using fuzzy clustering and EM clustering. IEEE Trans. Geosci. Remote Sens. 43(3), 519–527 (2005) 11. Chumsamrong, W., Thitimajshima, P., Rangsanseri, Y.: Synthetic aperture radar (SAR) image segmentation using a new modified fuzzy c-means algorithm. In: Proc. IEEE Symp. Geosci., Remote Sens., Honolulu, pp. 624–626 (2000) 12. Donath, W.E., Hoffman, A.J.: Lower bounds for the partitioning of graphs. J. IBM J. Res. Develop. (17), 420–425 (1973) 13. Fiedler, M.: Algebraic connectivity of graphs. J. Czech Math J. (23), 298–305 (1973) 14. Hagen, L., Kahng, A.B.: New spectral methods for ratio cut partitioning and clustering. J. IEEE Transactions on Computed-Aided Design 11(9), 1074–1085 (1992)
15. Chan, P.K., Schlag, M.D.F., Zien, J.Y.: Spectral k-way ratio-cut partitioning and clustering. J. IEEE Trans. Computed-Aided Design Integr. Circuits Syst. 13(9), 1088–1096 (1994) 16. Shi, J., Malik, J.: Normalized cuts and image segmentation. J. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000) 17. Ng, A.Y., Jordan, M.I., Weiss, Y.: On Spectral Clustering: Analysis and an algorithm C. In: Dietterich, T.G., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processing Systems, vol. 14, pp. 849–856. MIT Press, MA (2002) 18. Belongie, S., Fowlkes, C., Chung, F., Malik, J.: Spectral partitioning with indefinite kernels using the Nyström extension. In: Proc. European Conf. Computer Vision (2002) 19. Chung, F.R.K.: Spectral Graph Theory. Am. Math. Soc. (1997) 20. Fowlkes, C., Belongie, S., Chung, F., Malik, J.: Spectral grouping using the Nyström method. J. IEEE Trans. Pattern Anal. Mach. Intell. 26(2), 214–225 (2004) 21. Zhang, K., Kwok, J.T.: Density-Weighted Nyström Method for Computing Large Kernel Eigensystems. Neural Computation 21(1), 121–146 (2009)
Satellite-Retrieved Surface Chlorophyll Concentration Variation Based on Statistical Methods in the Bohai Sea* Li Qian, Wen-ling Liu**, and Xiao-shen Zheng Tianjin Key Laboratory of Marine Resources and Chemistry, Tianjin University of Science and Technology, Tianjin 300457, China
Abstract. Data on the chlorophyll concentration in the Bohai Sea are obtained from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) from 1998 to 2009. The Empirical Orthogonal Function (EOF) is used to analyze the spatial-temporal variation of the chlorophyll concentration in the Bohai Sea, and the discrete power spectral density (PSD) is used to calculate the variation periods of the first four EOF modes. All the processing is carried out in the IDL language. The results show that the spatial distribution of chlorophyll concentration is characterized by a decrease from the coastal shore to offshore, the seasonal variations show that the lowest concentration occurs in summer, and the first four modes explain 22%, 11%, 4% and 3% of the variation, respectively. Keywords: SeaWiFS, Bohai Sea, ocean-color remote sensing, chlorophyll, EOF analysis, discrete power spectral density.
1 Introduction The Bohai Sea is a semi-enclosed sea spanning about 7.7 million km2. It is the largest inland sea of China, located at 37-41°N and 117-122°E, and is mainly subdivided into five parts: Bohai Bay, Liaodong Bay, Laizhou Bay, the Center Basin and the Bohai Strait. It is connected with the Northern Yellow Sea only through the Bohai Strait in the east (Fig. 1). With its predominant geography, it plays a significant role in the national economy, national defense and international trade. However, in recent years, with the change of the global climate and the regional environment, pollution has become more and more serious and red tides bloom frequently in the Bohai Sea [1]. Chlorophyll-a is a good indicator of marine phytoplankton biomass and the main pigment for photosynthesis of ocean phytoplankton. The Chl-a concentration has an important economic effect in the coastal marine environment on fisheries resources and marine aquaculture development [2]. The spatial-temporal variation of the ocean chlorophyll concentration contains basic ecological information, which is closely related to light, temperature, salinity, wind direction and other factors. Traditional vessel sampling is clearly unable to meet the needs of wide-range surveys, whereas ocean color remote sensing, with its large-scale, long-term and even continuous observation, makes up for the scattered nature of ship-route sampling data [3]. * This paper was supported by the Natural Science Foundation of Tianjin (NO. 09JCZDJC25400 and NO. 08JCYBJC10500). ** Corresponding author: [email protected]
Nikolay [4] analyzed the variation of chlorophyll derived from SeaWiFS and its relation to SST and the NAO. Zou [5] analyzed the spatial variation of chlorophyll derived from MODIS. Yoder [6] and Iida [7] used CZCS and SeaWiFS data, respectively, to analyze the spatial-temporal variability of the chlorophyll-a concentration. From all the above research, we can see that it is necessary and significant to analyze the variation of the surface chlorophyll concentration, but all of these studies cover only short time spans. In this paper, we present the variation of the chlorophyll concentration using EOF based on Sea-viewing Wide Field-of-View Sensor (SeaWiFS, aboard the OrbView-2 satellite) data over 12 complete years, from 1998 to 2009, in the Bohai Sea. We have three objectives:
1. To monitor the synoptic spatial variability in the Bohai Sea;
2. To examine remotely sensed chlorophyll variability using EOF analysis;
3. To analyze the significant variation periods using the PSD method with 95% confidence limit analysis.
Fig. 1. The location of the Bohai Sea
2 Data and Methods 2.1 Satellite Data SeaWiFS Level 3 monthly Standard Mapped Image (SMI) data were obtained from NASA Goddard Space Flight Center's Ocean Color Data Processing System (OCDPS, http://oceancolor.gsfc.nasa.gov) and processed with the OC4v4 algorithm. The retrieval equations are:

Ca = 10^{0.366 - 3.067R + 1.630R^2 + 0.649R^3 - 1.532R^4}   (1)

R = \lg\left[ \max(Rrs443/Rrs555,\; Rrs490/Rrs555,\; Rrs510/Rrs555) \right]   (2)

where Ca is the chlorophyll concentration (mg/m3) and Rrs is the remote-sensing reflectance.
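For illustration, the band-ratio retrieval of Eqs. (1)-(2) can be written as a small function using the coefficients as given in the text; the reflectance values in the example call are made up.

```python
import numpy as np

def oc4v4_chla(rrs443, rrs490, rrs510, rrs555):
    """Chlorophyll-a (mg/m^3) from remote-sensing reflectances, Eqs. (1)-(2)."""
    R = np.log10(np.maximum.reduce([rrs443 / rrs555, rrs490 / rrs555, rrs510 / rrs555]))
    return 10.0 ** (0.366 - 3.067 * R + 1.630 * R**2 + 0.649 * R**3 - 1.532 * R**4)

# example with made-up reflectances
print(oc4v4_chla(0.004, 0.005, 0.004, 0.003))
```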
The format of the Level 3 SMI data is a regular grid in equidistant cylindrical projection with 360°/4096 pixels (about 9 km spatial resolution) for SeaWiFS [8]. All the images were processed in the Environment for Visualizing Images (ENVI) version 4.4 and the Interactive Data Language (IDL) version 6.4. In this paper, the image of the study area is a 61*49 grid; to extract the synoptic spatial variability, the image is resampled to a 122*98 grid for visual effect. Mean monthly composite chlorophyll images over the 12 years from 1998 to 2009 are used for the spatial variability. The seasonal variability is shown by four representative ocean months (February, May, August and November), and 12-year mean images corresponding to these four representative months from 1998 to 2009 are used for the seasonal variability.
2.2 EOF Analysis
EOF is a useful technique for compressing the variability of time series data; it originated in meteorology and has been commonly used in studies of chlorophyll based on ocean color data. EOF analysis provides a compact description of the spatial and temporal variability of time series data in terms of orthogonal functions or statistical modes [9]. EOF modes describe the major fraction of the total variance of the data set and are defined by eigenvectors and eigenvalues. Typically, the lowest modes explain much of the variance, and their spatial and temporal patterns are the easiest to interpret. The basic idea of EOF is to decompose the element field into a time field and a spatial field; the two decomposed fields are uncorrelated, and the physical meaning of the EOF method is clear. The method can be expressed as [10]:

Z = V T    (3)

where Z represents the element field, and V and T are the spatial field and the time field, respectively. Fig.2 shows the basic flow of the EOF analysis in IDL used in this paper. As the covariance matrix is large, we used the LA_SVD function, which applies singular value decomposition, interchangeably with EOF. The eigenvectors of the time-domain space are in the columns of U; they have to be transformed into the eigenvectors of the spatial domain in a further step (not detailed in this paper).
Fig. 2. The flow chart of EOF analysis process
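The decomposition in Eq. (3) can be illustrated with a small sketch in Python/NumPy rather than the IDL LA_SVD routine used in the paper: the matrix of chlorophyll anomalies (time x space) is factored by singular value decomposition, the spatial patterns and time coefficients are read off the factors, and the explained variance of each mode follows from the singular values. Function and variable names and the array sizes below are illustrative assumptions only, not the authors' processing chain.

```python
import numpy as np

def eof_modes(chl, n_modes=4):
    """EOF analysis of a (time, space) chlorophyll array via SVD.

    chl     : 2-D array, each row is one monthly map flattened to 1-D.
    n_modes : number of leading modes to return.
    Returns spatial patterns, time coefficients and percent variance.
    """
    # Remove the temporal mean at every grid point so that the SVD
    # decomposes the variability rather than the mean field.
    anom = chl - chl.mean(axis=0, keepdims=True)

    # anom = U * diag(s) * Vt; rows of Vt are the spatial EOF patterns,
    # U * s gives the corresponding time series (principal components).
    u, s, vt = np.linalg.svd(anom, full_matrices=False)

    variance = s**2
    percent = 100.0 * variance / variance.sum()

    spatial = vt[:n_modes]                    # (mode, space)
    temporal = u[:, :n_modes] * s[:n_modes]   # (time, mode)
    return spatial, temporal, percent[:n_modes]

# Example with random data standing in for 144 monthly 61x49 images.
if __name__ == "__main__":
    data = np.random.rand(144, 61 * 49)
    eofs, pcs, pct = eof_modes(data)
    print("explained variance of leading modes (%):", np.round(pct, 1))
```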
2.3 PSD Analysis
We use the discrete power spectral density (PSD), a standard statistical method, to analyze the variation periods of a time series. The basic principle is to compute the Fourier coefficients; an F-inspection is then applied to test whether the resonance cycle Tk is significant at the 1 - α = 0.95 confidence limit. For a sequence xt (t = 1, 2, ..., n) of size n, the equations are as follows [11]:
a_k = (2/n) Σ_{t=1}^{n} x_t cos[2πk(t-1)/n]    (4)

b_k = (2/n) Σ_{t=1}^{n} x_t sin[2πk(t-1)/n]    (5)

s_k^2 = (1/2)(a_k^2 + b_k^2),  T_k = n/k,  k = 1, 2, ..., [n/2]    (6)
In this paper, we use the PSD method to analyze the cycles of the time series of the first four EOF modes. The confidence line equals F_0.05(2, 133) = 3.0462.
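As a rough illustration of Eqs. (4)-(6), the sketch below computes the Fourier coefficients, the discrete power spectral density and an F statistic for each wave number, and returns the periods that exceed the threshold quoted in the text. The specific form of the F statistic (spectral power against the remaining variance) is an assumption of this sketch, not quoted from this paper.

```python
import numpy as np

def psd_with_f_test(x, f_crit=3.0462):
    """Discrete PSD of a time series with a per-wave-number F test.

    Returns (k, period, F) for every wave number whose period is
    significant at the level implied by f_crit (F_0.05 in the text).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(1, n + 1)
    var = x.var()                        # sample variance of the series
    significant = []
    for k in range(1, n // 2 + 1):
        a_k = 2.0 / n * np.sum(x * np.cos(2 * np.pi * k * (t - 1) / n))
        b_k = 2.0 / n * np.sum(x * np.sin(2 * np.pi * k * (t - 1) / n))
        s_k2 = 0.5 * (a_k**2 + b_k**2)   # power at wave number k
        period = n / k                   # resonance cycle T_k (in months here)
        f_stat = (s_k2 / 2.0) / ((var - s_k2) / (n - 3))
        if f_stat >= f_crit:
            significant.append((k, period, f_stat))
    return significant
```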
3 Results and Discussions
3.1 Seasonal Variability of Chlorophyll
Fig.3 shows the 12-year mean image of the Bohai Sea. The coastal shelves are characterized by relatively high chlorophyll concentration, such as in Bohai Bay, Liaodong Bay and Laizhou Bay, while the lowest concentrations lie in the Northern Yellow Sea and the Bohai Strait. The concentrations decrease from the coast to the Bohai Strait, spanning 1~6 mg/m3; they are between 3~4 mg/m3 in the centre of the Bohai Sea, and the highest concentrations of about 6 mg/m3 occur along the coastal shores. Fig.4 shows the distribution of chlorophyll in the four representative months (February, May, August, and November). The spatial patterns are as follows: in winter (February), the concentration is high over the Bohai Sea, reaching 7~8 mg/m3, especially in Liaodong Bay and Bohai Bay; in spring (May), high concentrations of 4~5 mg/m3 appear only in parts of the Bohai Sea, and most of the sea is enclosed by values of 3~4 mg/m3; in summer (August), the concentration in the Bohai Strait and the Northern Yellow Sea is as low as 0~2 mg/m3, and high concentrations of 4~5 mg/m3 occur only in Bohai Bay and parts of Laizhou Bay; in autumn (November), the concentration is 3~4 mg/m3 in most of the Bohai Sea, and only in Liaodong Bay is it between 4~5 mg/m3. Fig.3 shows that, over the whole year, the concentration is relatively high in Liaodong Bay and relatively low in the Bohai Basin. Environmental factors in the ocean such as nutrient salts, light, sea temperature, mixed-layer depth, monsoons and the abundance of zooplankton have a great influence on the variation of chlorophyll. Owing to its shallow depth and enclosed nature, the Bohai Sea is strongly
affected by the continental climate. In winter, deep nutrients are brought to the surface by strong northerly winds and eddy mixing, which results in high chlorophyll concentration in the Bohai Sea. From winter to spring, as the temperature gradually rises and light is enhanced, phytoplankton biomass increases, giving relatively high concentrations. In summer, owing to zooplankton blooms and the amount of nutrients consumed in spring, the concentration is relatively low. In autumn, as light weakens, the surface water temperature decreases and convective mixing is enhanced, surface nutrients are replenished, which results in a second, higher concentration [12-15]. This type of variation is typical of temperate-zone coastal seas.
Fig. 3. Monthly composites (1998-2009) of SeaWiFS chlorophyll concentration (mg/m3) in Bohai Sea
Fig. 4. Mean monthly chlorophyll composite images of the four representative months (February, May, August, November) from 1998 to 2009
3.2 Spatial-temporal Variability of Chlorophyll
In this paper, we use the EOF statistical method to describe the above variation of chlorophyll in the Bohai Sea in detail. Fig.5 shows the percent and cumulative percent ratios of the eigenvalues; the percent ratio decreases quickly and the cumulative percent ratio increases quickly. The results of the top four EOF modes
are shown in Table 1. The first element of the percent variance is 22, indicating that mode 1 explains about 22% of the variation of chlorophyll in the Bohai Sea; the second mode explains about 11%, and the third and fourth modes explain only 4% and 3% of the variance, respectively. Together, the top four modes explain 40% of the variance.
Fig. 5. The percent ratio (left) and cumulative ratio (right) of eigenvalues

Table 1. Four top eigenvalues and percent variance

Mode   Eigenvalue    Percent variance (%)   Cumulative percent variance (%)
1      821.12079     22                     22
2      393.86876     11                     33
3      135.80478     4                      37
4      109.40900     3                      40
In this paper, the top four EOF modes are used to analyze the quantitative spatial and temporal variation of chlorophyll in the Bohai Sea from 1998 to 2009. Fig.6 shows the four EOF modes representing the spatial and temporal patterns of SeaWiFS chlorophyll variation over the Bohai Sea. The pattern of mode 1 (22% of variance) shows a spatial distribution similar to the 12-year mean (Fig.3): along the coastal shores the EOF values are positive and off-shore they are negative, indicating the average distribution of chlorophyll in the Bohai Sea. Its temporal pattern has a multi-year period and an intra-year oscillation, and it also indicates that the coastal shores and the centre of the Bohai Sea co-vary inversely. The second mode (11% of variance) is negative over all regions of the Bohai Sea except part of Laizhou Bay. Its temporal pattern shows no obvious variation cycles, but in the last months the oscillation is very strong; this mode indicates that chlorophyll concentrations were low in all regions of the Bohai Sea during 2008 (around x-coordinate 120). The third mode shows positive values in the east of Liaodong Bay and part of the Bohai Sea. Combined with the temporal curve, it can be concluded that Liaodong Bay and Bohai Bay have high chlorophyll concentration at x-coordinates 64, 77, 100 and 120 and low concentration at 9, 27, 40, 85 and 104; over time, these two areas show low chlorophyll concentration more and more frequently. The fourth mode shows positive values along Liaodong Bay to Caofeidian. Combined with the temporal curve, it shows that chlorophyll concentration is high in many months, such as around x-coordinate 15.
Fig. 6. The four EOF modes over twelve years (1998-2009) of the SeaWiFS chlorophyll data to show the spatial pattern and temporal function
Fig. 7. The power spectral density curve (of the first mode) and F-inspection line (the dashed line) of 68 wave numbers
This trend becomes more and more frequent, while chlorophyll concentration is very low around x-coordinate 122. In this paper, power spectral density is used to identify the significant variation periods of chlorophyll in the four modes. Fig.7 shows the power spectral density curve and the F-inspection line at the 95% confidence limit. The PSD curve of mode 1 is steady except for three crests, which exceed the F-inspection line at x-coordinates 11, 12 and 13. This indicates a variation period of about 10~12 months, which is
consistent with the average variation. The PSD curve of mode 2 has one point above the confidence limit, at x-coordinate 12, whose corresponding period is about 11~12 months; it indicates that chlorophyll concentration is low in the centre of the Bohai Sea throughout the 12 years. The PSD curve of mode 3 varies more strongly, with many obvious peaks; three points are higher than the confidence limit, at x-coordinates 12, 22 and 34, indicating periods of about 11, 3 and 6 months. That is to say, the east of Liaodong Bay shows high chlorophyll concentration with periods of about 11, 3 and 6 months. The curve of mode 4 is similar to that of mode 3; three points exceed the confidence limit, at x-coordinates 1, 23 and 56, with corresponding periods of about 11 years, six months and two months. That is to say, the periods of high chlorophyll concentration are about two and six months, or 11 years.
4 Conclusions
The Bohai Sea is classified as coastal water (Case 2), where the sea surface color also depends on dissolved and suspended matter concentrations uncorrelated with chlorophyll [8]. The chlorophyll concentration may therefore be overestimated, but the variation trends are credible. In this paper, we obtain the following results:
1. The spatial variation of surface chlorophyll concentration is higher along the coastal shores and decreases off-shore; the 12-year average concentration (1998 to 2009) is 5 mg/m3.
2. The seasonal variation over the 12 years is lowest in summer, owing to the highest temperatures and the nutrients consumed in spring, and higher in winter and spring.
3. The EOF analysis shows that the first four modes explain 22%, 11%, 4% and 3% of the variation, respectively. The first mode is positive along the coastal shores and negative off-shore; the second mode is negative over the whole Bohai Sea; the third and fourth modes show that the east of Liaodong Bay and the area from Liaodong Bay to Caofeidian change differently from other places.
4. PSD analysis with a 95% confidence limit was applied to the time series of modes 1~4; modes 1 and 2 each show one significant variation period, while modes 3 and 4 each show three.
Acknowledgments. This paper was supported by Natural Science Foundation of Tianjin (NO.09JCZDJC25400 and NO. 08JCYBJC10500).
References 1. Lin, F.X., Lu, X.W.: History, current situation and characteristics of red tide in Bohai Sea. J. Marine Environment Science 27(suppl. 2), 1–5 (2008) 2. Nyoman, R., Saitoh, S.I.: Satellite-derived measurements of spatial and temporal chlorophyll-a variability in Funka Bay, southwestern Hokkaido, Japan. J. Estuarine, Coastal and Shelf Science 79, 400–408 (2008) 3. Sha, H.M., Li, X.S.: Annual variation in sea surface temperature and chlorophyll-a concentration retrieved by MODIS in East China Sea. J. Journal of Dalian Fisheries 24, 151–156 (2009) 4. Nezlin, N.P.: Patterns of Seasonal and Interannual Variability of Remotely Sensed Chlorophyll. J. Hdb Env. Chem. Part Q. 5(part P), 143–157 (2005)
5. Zou, B.: Analysis of Characteristics of Seasonal and Spatial Variations of SST and Chlorophyll Concentration in the Bohai Sea. J. Advances In Marine Science 23(4), 487–492 (2005) 6. Yodera, J.A., O’Reillyb, J.E.: Variability in coastal zone color scanner (CZCS) Chlorophyll Imagery of ocean margin waters off the US East Coast. J. Continental Shelf Research 21, 1191–1218 (2001) 7. Iida, T., Saitoh, S.I.: Temporal and spatial variability of chlorophyll concentrations in the Bering Sea using empirical orthogonal function (EOF) analysis of remote sensing data. J. Deep-Sea Research II 54, 2657–2671 (2007) 8. Nezlin, N.P.: Seasonal and Interannual Variability of Remotely Sensed Chlorophyll. J. Hdb Env. Chem. 5, 333–349 (2008) 9. Emery, W.J., Thomson, R.E.: Data Analysis Methods in Physical Oceanography, second and revised edn., p. 638. Elsevier, Amsterdam (2001) 10. Shi, N.: Multivariate analysis method in weather research and forecasting, 2nd edn. Meteorological Press, Beijing (2002) (in chinese) 11. Wei, F.Y.: Modern diagnosis and prediction of climate statistics, 2nd edn. Meteorological Press, Beijing (2007) (in chinese) 12. Sun, X.P.: China coastal shore Regional Sea. Ocean Press, Beijing (2006) (in chinese) 13. Sheng, G.Y., Shi, B.Z.: Marine Ecology. Science Press, Beijing (2006) 14. Wu, R.J., Lv, R.H., Zhu, M.Y.: Impacts of sea water mixing and stratification on the vertical profile of Chlorophyll-a. J. Ecology and Environment 13(4), 515–519 (2004) 15. Wei, H., Zhao, L.: Variaiton of the Phytoplankton Biomass in the Bohai Sea. J.Journal Of Ocean University Of Qingdao 33(2), 173–179 (2003)
A Study on the Cooling Effects of Greenery on the Surrounding Areas by Computer Simulation for Green Built Environment Jiafang Song and Xinyu Li Department of Building Environment and Facility Engineering, Tianjin Polytechnic University, Tianjin, China,300160 [email protected]
Abstract. This paper discusses the effects of greenery on the surrounding environment in a sub-urban landscape in Singapore. The case study involved is Clementi Woods and its surrounding vicinity. Using computational tools such as ENVI-met and Leonardo, we focused on the simulation works, with the main objectives of the study including: to evaluate the cooling effects of the green area in Clementi Woods on the surrounding environment and to determine the impact of future removal of the green area on the surrounding environment. It was found that the cooling effects of greenery can be confirmed by the results derived from the simulation: Clementi Woods is consistently 0.3 to 0.6 °C cooler than the other zones. Keywords: Cooling effects, greenery, computer simulation.
1 Introduction
With rapid urbanization, there has been a tremendous growth in population and buildings in cities. The high concentration of hard surfaces has triggered many environmental issues. The Urban Heat Island effect, one of these issues, is a phenomenon where air temperatures in densely built cities are higher than in the suburban and rural areas. The primary cause of the heat island in cities is the absorption of solar radiation by massive building structures, roads, and other hard surfaces. The absorbed heat is subsequently re-radiated to the surroundings and increases ambient temperatures. In addition, heat generated from the use of air-conditioning, coupled with the greenhouse effect of pollutants, also contributes to the increase of temperature in cities. Plants are an ecological solution to the concrete jungle in cities. It is well known that plants strategically placed around buildings can bring thermal benefits to inhabitants [1]. As soon as a bare hard surface is covered with plants, the heat-absorbing surface transforms from an artificial layer to a living one. The alteration of the thermal environment by plants depends mainly on the energy exchange between plants and their surrounding environment. Vegetation can intercept and absorb most incoming solar radiation, and considerable solar radiation is consumed through the photosynthesis and evapotranspiration processes. Water in the leaves is converted from liquid to gas, resulting in
lower leaf temperature, lower surrounding air temperature and higher humidity. As a result of the evapo-transpiration process, green plants can bring thermal benefits to buildings by decreasing the surface temperature, the sensible heat flux, and even the diurnal temperature fluctuations of shaded buildings [2]. Urban parks, which are areas with a high concentration of plants, have a cooling influence on their surrounding built-up area, thus reducing the stress produced by the urban heat island. Many studies have examined the effect of green areas on the surrounding environment. From investigations of the green areas in Kumamoto City, it was concluded that the air temperature distribution in an urban area was closely related to the distribution of green cover, and that even a small green area of about 60 m × 40 m showed a cooling effect [3] (Saito et al., 1990). A study conducted for Mexico City showed that Chapultepec Park (~500 ha) was 2-3 °C cooler than its boundaries, and its influence reached a distance about the same as its width (2 km) on clear nights [4] (Jauregui, 1990). From field observations in the west of the Tokyo Metropolitan Area, it was found that, even though small, the Tama Central Park was significantly cooler than the surrounding area during the day and at night, and it was estimated that 4000 kWh of electricity for cooling, or US$650, could be saved within 1 h from 1 to 2 pm on a hot summer day [5] (Ca, 1998). This study discusses the cooling effects of greenery on the surrounding environment in a sub-urban landscape in Singapore. The case study involved is Clementi Woods and its surrounding vicinity. Using computational tools such as ENVI-met and Leonardo, we focus on the computer simulation works. The main objectives of the study are as follows:
1. To evaluate the cooling effects of the green area in Clementi Woods on the surrounding environment.
2. To determine the impact of future removal of the green area on the surrounding environment.
2 Methodology
This study was conducted using the ENVI-met and LEONARDO software in addition to a field study. ENVI-met is a free three-dimensional non-hydrostatic model for the simulation of surface-plant-air interactions inside urban environments. It is designed for the microscale, with a typical horizontal resolution of 0.5 to 10 m and a typical time frame of 24 to 48 hours with a time step of 10 s. This resolution allows small-scale interactions between individual buildings, surfaces and plants to be analyzed. The software LEONARDO is the main graphic processor for ENVI-met; the simulation results from ENVI-met cannot be visualized directly, and LEONARDO provides easily understood visual results such as color contours and vectors for air temperature, wind speed, specific humidity and so on.
2.1 Study Area Description
2.1.1 Field Measurement
In order to investigate the cooling effects of Clementi Woods, the study areas include Clementi Woods and its surroundings. Ambient temperature, wind speed and humidity
Fig. 1. Field measurement map for Clementi Woods and receptors
Fig. 2. Simulation models (zones shown: HDB1, HDB2, Kent Vale, WOODS, LEFT, SDE)
have been measured at 18 points in Clementi Woods and its surroundings (Fig.1). The field measurement results will be compared with the simulation results to validate the reliability of the ENVI-met simulation.
2.2 Simulation Models
In order to fully investigate the cooling effects of Clementi Woods, we have included two models in our simulation: the Woods model and the No-trees model.
Model 1 – Woods model. This model is the base case, which simulates the current conditions. It contains Clementi Woods and the surrounding areas, including SDE, HDB1, HDB2, Kent Vale and the Left side (see Fig.2). The simulation results will be validated against the field measurements, and the cooling effects will be demonstrated through comparison with the other models. The whole area is divided into six zones, and the site conditions are described below.

Table 1. Description of location names and site conditions

Abbreviation   Location name                   Site conditions
CW             Clementi Woods                  Woods
CWW            West of Clementi Woods          Floored Condominium
SDE            East of Clementi Woods          Floored Office Building
KV             East-North of Clementi Woods    Floored Condominium
HDB1           North of Clementi Woods         Floored Public Housing
HDB2           North-East of Clementi Woods    Floored Public Housing
Model 2 – No-trees model. For this model, the simulation involves the removal of all the plants in Clementi Woods, leaving behind only the bare soil.
2.3 Setting of Input Parameters
Based on preliminary analyses of weather station data obtained from previous studies and the project objectives, a clear sunny day is chosen to study the cooling effect of the trees of Clementi Woods on the microclimate. Table 2 shows the basic settings used in all the simulations. Since most of the buildings within the computational domain are HDB, the properties of a typical material used for HDB are used in the present simulation. Table 3 shows the settings of material properties for HDB.

Table 2. Basic settings

Tair (K)   WS at 10 m (m/s)   Wind direction   SH in 2500 m (g/Kg)   RH in 2 m (%)   Roughness length in 10 m   Total sim. time (hrs)
303        1.6                S to N           7                     68.5            0.1                        24

Note: Tair: air temperature (K), WS: wind speed (m/s), SH: specific humidity (g/Kg), RH: relative humidity (%)
Table 3. Settings of material properties for HDB

Tin (K)   Hw (W/m2K)   Hr (W/m2K)   Wall albedo   Roof albedo
303       3.28         1.492        0.23          0.4
To improve accuracy, the turbulence model in this study is chosen to calculate the turbulence continuously, together with temperature and humidity, in the main loop of the model until the maximum change of E or ε falls below 0.001 -/s, although this model is time-intensive compared with the other available turbulence model, in which the turbulence field is calculated at fixed time intervals until it is nearly stationary. A closed upper boundary at the top of the 3D model is used. The initial temperatures and relative humidity of the soil layers used in the four models are shown in Table 4.

Table 4. Initial temperature and RH in different soil layers

Layer                      Initial temperature [K]   RH (%)
Upper layer (0-20 cm)      303                       50
Middle layer (20-50 cm)    303                       36
Deep layer (below 50 cm)   303                       31
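For readers who script their simulation set-ups, the input parameters of Tables 2-4 can be collected in a single structure before being written into whatever configuration format the model expects. The dictionary below is only a restatement of the values reported above; it is not an ENVI-met configuration file, and the key names are illustrative.

```python
# Input parameters gathered from Tables 2-4 (values as reported above);
# this is a convenient record only, not an ENVI-met input format.
simulation_settings = {
    "air_temperature_K": 303,
    "wind_speed_10m_m_per_s": 1.6,
    "wind_direction": "S to N",
    "specific_humidity_2500m_g_per_kg": 7,
    "relative_humidity_2m_percent": 68.5,
    "roughness_length_10m": 0.1,
    "total_simulation_time_h": 24,
    "hdb_material": {
        "indoor_temperature_K": 303,
        "wall_heat_transfer_W_per_m2K": 3.28,
        "roof_heat_transfer_W_per_m2K": 1.492,
        "wall_albedo": 0.23,
        "roof_albedo": 0.4,
    },
    "soil_initial": {
        "temperature_K": {"0-20cm": 303, "20-50cm": 303, "below_50cm": 303},
        "relative_humidity_percent": {"0-20cm": 50, "20-50cm": 36, "below_50cm": 31},
    },
}
```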
3 Model Validation
A field measurement was conducted previously in Clementi Woods, the HDB flats and the Kent Vale apartments. The obtained data can be used to validate the results derived from the ENVI-met simulation. First of all, it is necessary to highlight some notable differences between the real situation and the simulation model, which may introduce bias into the validation: 1. Clementi Woods has been defined in the simulation model as an area evenly covered with 10 m high dense trees (distinct crown layer), whereas in the real Clementi Woods the density and height of the trees vary without a constant distribution; 2. The wind direction and velocity have been set constant in the simulation, while they vary on site; 3. The space between the HDB blocks and other buildings has been defined as hard surface (pavement or asphalt) without considering the planting in between. In order to compare the simulation results with the field measurement data, 18 receptors in the simulation model were chosen; almost every measuring point has a corresponding receptor in the simulation model. The comparison of measuring point and receptor in Clementi Woods is shown in Fig.3. W1(24) is the simulation result generated from a 24-hour simulation, while W1(48) is the result generated from the second 24 hours of a 48-hour simulation.
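A minimal sketch of the kind of comparison described here, assuming the measured and simulated hourly temperatures for a receptor are already available as arrays (the loading of the actual data files is left out): it reports the mean bias and root-mean-square difference between the two series. The example numbers are illustrative only.

```python
import numpy as np

def compare_series(measured, simulated):
    """Measured vs. simulated hourly air temperature for one receptor.

    Both inputs are 1-D arrays of equal length (e.g. 24 hourly values).
    Returns mean bias (simulated - measured) and RMS difference in the
    same units as the inputs (degrees C here).
    """
    measured = np.asarray(measured, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    diff = simulated - measured
    return diff.mean(), np.sqrt((diff**2).mean())

# Illustrative numbers only: a warm afternoon profile and a simulation
# that overestimates the temperature inside the woods, as discussed below.
measured_w1 = np.array([26.0, 27.5, 29.0, 30.5, 31.0, 30.0])
simulated_w1 = np.array([27.0, 28.5, 30.0, 31.0, 31.5, 31.0])
print(compare_series(measured_w1, simulated_w1))
```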
[Figure: air temperature (°C) vs. time of day for measuring point 1 and simulation receptors W1(24) and W1(48) in Clementi Woods]
Fig. 3. The comparison of measuring point and receptor in Clementi Woods
From the above figure, it can be found that: 1. W1(24) and W1(48) do not fit each other very well; compared with W1(24), higher values were generated by W1(48) from 1000 to 2300 hr, while lower values occurred from 0000 to 1000 hr, and a delay of the peak value also occurred in W1(48). 2. Compared with the field measurement data, the simulation seems to underestimate the cooling effect of vegetation at all times; it is worse during the night, when the temperature difference can be up to 4-5 °C. 3. Generally, the profile/trend of the simulation fits that of the field measurement.
4 Results and Discussion
Fig. 4 shows the average air temperature of each zone in the Woods model. The curves of the six zones have similar temporal patterns. It can be seen that the Clementi Woods zone has the lowest temperature throughout the whole day, owing to the plants. Fig. 5 shows that the difference in air temperature between CW and the other zones is higher at night than during the day, indicating that plants produce a greater cooling effect at night; the reason is that plants absorb less solar radiation and, unlike hard surfaces, do not re-emit it to the environment at night. As far as the impact on the surroundings is concerned, the plants obviously contribute more to the KV zone than to the other zones. From the figure, it can be seen that the CWW zone, located on the west side of Clementi Woods, has the highest temperature in the morning, while in the afternoon the air temperatures of the HDB1 and HDB2 zones are higher than elsewhere. The reason lies in the building density and layout of HDB1 and HDB2.
After 1600 hrs, the air temperature begins to drop because of sunset: the buildings stop absorbing solar radiation and start to re-emit the stored heat to the environment. Since the building densities in HDB1 and HDB2 are higher than in the other zones, the heat release there is the slowest at night, so the slopes of the lines representing HDB1 and HDB2 in Fig.4 after 1600 hrs are smaller than those of the other lines. That is, the variation and reduction of temperature in HDB1 and HDB2 is smaller at night.
[Figure 4: air temperature (°C) vs. time (hrs) for zones CW, CWW, SDE, KV, HDB1, HDB2. Summary statistics of the Woods model:
Zone    CW      CWW     SDE     KV      HDB1    HDB2
Max     31.48   31.94   31.77   31.66   31.75   31.70
Min     27.71   28.27   28.30   28.08   28.42   28.45
Ave     29.64   30.22   30.12   29.95   30.19   30.16
StDev   1.26    1.24    1.15    1.20    1.10    1.07]
Fig. 4. Comparison of air temperature by zone for Woods model
Fig. 5. Difference of air temperature between CW and other zones
In order to determine the cooling effects of the plants on the nearby zones, the average temperature at Clementi Woods is taken as a reference point (in this case = 0.0). Table 5 shows the comparison of the average temperature by zone in the Woods model. From the table, it is demonstrated that the cooling effect of the plants is best in the KV zone, because it is the nearest place to the woods (situated leeward) and also because of its low building density (which aids heat transfer), so the cool air from Clementi Woods flows into the KV zone more easily.

Table 5. Average air temperature of each zone relative to Clementi Woods (°C)

CL      KV      HDB1    HDB2    CWW     SDE
0.00    +0.31   +0.55   +0.52   +0.57   +0.48
Clementi Woods is much cooler than the surrounding area during both daytime and nighttime. This can be seen in Table 6: the air temperature in Clementi Woods is 0.3-0.6 °C lower than in the other zones.

Table 6. Comparison of air temperature among the six zones (°C)

Zone   CW      CWW     SDE     KV      HDB1    HDB2
Max    31.48   31.94   31.77   31.66   31.75   31.70
Min    27.71   28.27   28.30   28.08   28.42   28.45
Ave    29.64   30.22   30.12   29.95   30.19   30.16
From the cross-comparison, it is concluded that greenery areas have a cooling effect on their surrounding built-up area. The reduction of the air temperature due to Clementi Woods can reach 0.2-0.5 °C, as shown in Table 7.

Table 7. Comparison of air temperature across models and zones (°C)

Model     KV      CWW     HDB1    HDB2    SDE
Woods     29.95   30.22   28.42   28.45   30.12
No tree   30.21   30.36   28.60   28.60   30.28
5 Conclusion
The comparison between field measurement and simulation shows that the simulation predicts the temporal temperature profile reasonably well. The cooling effect of greenery can be confirmed by the simulation: a low-temperature region with a distinct boundary is created at night. This is confirmed by quantitative analysis, which showed that Clementi Woods is 0.3-0.6 °C cooler than the other zones. Also, the temperature difference between Clementi Woods and the surrounding areas is higher at night than during the day. The air temperature relative to Clementi Woods varies from 0.15 to 0.74 °C. In the cross-comparison of the four models for temperature, the best cooling effect on the surrounding built-up area is observed in the base case model (with vegetation). This effect is reduced when the vegetation is removed, leaving behind the soil, and the cooling effect is drastically reduced when buildings are erected. The reduction of the air temperature due to Clementi Woods can reach 0.2-0.5 °C.
References 1. Hoyano, A.: Climatological uses of plants for solar control and effects on the thermal environment of a building. Energy and Buildings 11, 181–199 (1988) 2. Wong, N.H., Chen, Y., Ong, C.L., Sia, A.: Investigation of thermal benefits of rooftop garden in the tropical environment. Building and Environment 38, 261–270 (2003) 3. Saito, I., Ishihara, O., Katayama, T.: Study of the effect of green areas on the thermal environment in an urban area. Energy and Buildings 15-16, 493–498 (1990) 4. Jaudregui, E.: Influence of a large urban park on temperature and convective precipitation in a tropical city. Energy and Buildings 15-16, 457–463 (1990) 5. Ca, V.T., Asaeda, T., Abu, E.M.: Reductions in air conditioning energy caused by a nearby park. Energy and Buildings 29, 83–92 (1998)
Spatial-temporal Variation of Chlorophyll-a Concentration in the Bohai Sea* Wen-ling Liu**, Li Qian, and Xiao-shen Zheng Tianjin Key Laboratory of Marine Resources and Chemistry , Tianjin University of Science and Technology, Tianjin 300457, China [email protected]
Abstract. The spatial-temporal variation of chlorophyll-a concentration retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS-Aqua, aboard the Aqua satellite) was analyzed for the Bohai Sea from the start of the MODIS-Aqua mission in July 2002 until 2009. Statistical methods including anomalies, sliding averages and power spectral density were used to analyze the spatial-temporal variation of chlorophyll-a concentration. The results showed that the seasonal variation of chlorophyll-a concentration has maximum values in February-March and minimum values in July. Monthly anomalies showed an approximately 2-year cycle. The spatial variation showed high concentration along the coastal shore, decreasing slowly offshore. The whole Bohai Sea showed high chlorophyll-a concentration in the year 2006. Keywords: Bohai Sea, Chlorophyll-a, MODIS, spatial and temporal variation.
1 Introduction
The Bohai Sea is the largest inland sea of China and is mainly composed of five parts: Bohai Bay, Laizhou Bay, Liaodong Bay, the Center basin and the Bohai strait (shown in Fig.1). It receives a large volume of land-based pollutants and sewage every year and has poor water exchange. In recent years, red tides have bloomed frequently in the Bohai Sea, and these are related to the chlorophyll-a (Chl-a) concentration in the water body. The detection of Chl-a is therefore of great significance for monitoring red tides, the environmental situation, carbon cycles and fisheries. The spatial-temporal variation of ocean Chl-a concentration contains basic ecological information and is closely related to light, temperature, salinity, wind direction and other factors. Traditional vessel sampling is clearly unable to meet the needs of wide-range surveys; ocean color remote sensing, with its large-scale, long-term and even continuous observation, makes up for the sparse coverage of ship-track measurements [1]. In recent years, high-resolution spectral technology, such as that used in the Moderate Resolution Imaging Spectroradiometer (MODIS), has developed rapidly. MODIS has been widely applied in water monitoring for its
* This paper was supported by Natural Science Foundation of Tianjin (NO.09JCZDJC25400 and NO. 08JCYBJC10500).
** Corresponding author.
advantages of high spectral resolution, short revisit period, free availability and fast acquisition [2]. In this paper, we use MODIS-Aqua data to analyze the long-term time series of Chl-a concentration. Nikolay analyzed the variation of chlorophyll derived from SeaWiFS and its relation to SST and the NAO [3]. Zou analyzed the spatial variation of chlorophyll derived from MODIS [4]. J.A. Yoder [5] and Takahiro Iida [6] used CZCS and SeaWiFS data, respectively, to analyze the spatial-temporal variability of chlorophyll-a concentration.
Fig. 1. The location of Bohai Sea of the study area
2 Data and Methods
2.1 Satellite Data
The analysis of the spatial-temporal variation of Chl-a concentration in the Bohai Sea of China was based on remotely sensed data collected by the MODIS-Aqua satellite sensor. We used monthly average Level-3 global Standard Mapped Images (L3 SMI) produced by the NASA Goddard Space Flight Center's Ocean Color Data Processing System (OCDPS, http://oceancolor.gsfc.nasa.gov). The retrieval equations are shown below.
Ca = 10^(0.283 - 2.753R + 0.659R^2 + 0.649R^3 - 1.403R^4)    (1)

R = lg[max(Rrs443/Rrs551, Rrs490/Rrs551)]    (2)
where Ca is the Chl-a concentration and Rrs is the remote sensing reflectance. The Level 3 SMI data are on a regular grid in equidistant cylindrical projection of 360°/8192 pixels (about 4.5 km resolution) for MODIS-Aqua. The standard MODIS chlorophyll algorithms were developed for clean open ocean waters (Case 1), where the color of the ocean surface results mainly from the chlorophyll concentration. Standard algorithms developed for the open ocean (Case 1) overestimate chlorophyll
concentration in Case 2 waters [7]. As is known, the Bohai Sea is classified as coastal water (Case 2), where the pigment concentration depends on constituents other than Chl-a (i.e. dissolved and suspended matter concentrations). In this paper, we use the satellite-derived chlorophyll concentration only to analyze the variation trends, without comparing it with in-situ measurements in the study area. Before the statistical analysis, we resized the data to the Bohai Sea domain, 37-41°N and 117-122°E. The long-term time series analyzed here spans 8 years, from the start of the MODIS-Aqua mission in July 2002 until December 2009, i.e. 90 months.
2.2 Statistical Methods
The spatial-temporal variation of Chl-a concentration was analyzed by statistical methods. For this study, the absolute values of Chl-a concentration are not as important as the spatial and temporal gradients of Chl-a, since values derived from satellite measurements are subject to significant inaccuracy owing to the technical difficulty of remotely sensed observations. Therefore anomalies were introduced to show the interannual variability; the anomalies were based on the mean value of the same months in 2003-2009. For a discrete sequence x, the anomaly at a given time i is computed with the equations below:
x̂_i = x_i - x̄,  i = 1, 2, ..., n    (3)

x̄ = (1/n) Σ_{i=1}^{n} x_i    (4)
To see the marked variation trends of the long-term time series, we introduced the sliding (running) average. For a discrete sequence x, the sliding average sequence can be expressed as:
x̂_j = (1/K) Σ_{i=1}^{K} x_{i+j-1},  j = 1, 2, ..., n - K + 1    (5)
where n is the sample size and K is the sliding length, which is usually an odd number; in this paper we choose K = 5 to eliminate the influence of the seasons. To analyze the significant variation periods of the long time series, we introduced the discrete power spectral density method. The basic principle is to compute the Fourier coefficients. For a sequence x_t (t = 1, 2, ..., n) of size n, the equations are given below [8]:
a_k = (2/n) Σ_{t=1}^{n} x_t cos[2πk(t-1)/n]    (6)

b_k = (2/n) Σ_{t=1}^{n} x_t sin[2πk(t-1)/n]    (7)

s_k^2 = (1/2)(a_k^2 + b_k^2),  T_k = n/k,  k = 1, 2, ..., [n/2]    (8)
where k is the wave number, n is the length of the time series, s_k^2 is the power spectral density at wave number k, and T_k is the resonance cycle of wave number k. In this paper, the length of the time series is n = 90 and the maximum wave number is k = 45. Finally, we introduced the F-inspection to test the resonance cycle T_k, using the 1 - α = 0.95 confidence limit. The F-inspection equations [9] are:
F = [(1/2)(a_k^2 + b_k^2) / 2] / {[S^2 - (1/2)(a_k^2 + b_k^2)] / (n - 2 - 1)}    (9)

S^2 = E(X^2) - [E(X)]^2    (10)
where 2 and n-2-1 are the degrees of freedom of the numerator and denominator, respectively, and S^2 is the variance of the time series. When F ≥ F_α(2, n - 2 - 1), the corresponding period is significant. The confidence line equals F_0.05(2, 87) = 3.1013.
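A compact sketch of Eqs. (3)-(5): monthly anomalies are taken against the mean of the same calendar month (as described in the text), and a K-point running mean (K = 5 here) smooths the series. The array layout is an assumption for illustration; this is not the processing chain actually applied to the L3 SMI files.

```python
import numpy as np

def monthly_anomaly(chl, months):
    """Anomaly of each value relative to the mean of its calendar month.

    chl    : 1-D array of monthly mean Chl-a values.
    months : 1-D array of the same length with calendar month (1-12).
    """
    chl = np.asarray(chl, dtype=float)
    months = np.asarray(months)
    anom = np.empty_like(chl)
    for m in range(1, 13):
        sel = months == m
        anom[sel] = chl[sel] - chl[sel].mean()   # Eqs. (3)-(4)
    return anom

def sliding_average(x, k=5):
    """K-point running mean, Eq. (5); output has len(x) - k + 1 values."""
    x = np.asarray(x, dtype=float)
    window = np.ones(k) / k
    return np.convolve(x, window, mode="valid")
```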
3 Results and Discussions
3.1 Seasonal Variability
We first averaged the same calendar months to obtain the monthly-averaged Chl-a concentration during 2003-2009; the monthly averages lie between 3.8 and 5.2 mg/m3 (Fig.2). The monthly-averaged Chl-a concentration is high in autumn-winter and low in spring-summer, and the seasonal variability is characterized by a maximum in February-March and a minimum in July, which is in agreement with the findings of Zou [4]. Environmental factors in the ocean such as nutrient salts, light, sea temperature, mixed-layer depth, monsoons and zooplankton have a great influence on the spatial and temporal changes of Chl-a. Owing to its shallow depth and enclosed nature, the Bohai Sea is strongly influenced by the continental climate. In winter, northerly winds prevail and the convective and eddy mixing of the shallow shelf sea is the strongest of the year; deep nutrients are brought to the surface, resulting in high Chl-a concentration [10]. From winter to spring, as the temperature gradually rises and the vertical mixing of the sea water weakens, most of the sea shows a decreasing Chl-a concentration [4]; nevertheless, the enhanced light is beneficial to the photosynthesis of phytoplankton, so the Chl-a concentration remains relatively high. In summer, the surface water temperature is the highest of the year, the vertical mixing becomes shallow, and with sufficient light the zooplankton bloom and consume a large number of phytoplankton; the mean chlorophyll-a value in July drops to 3.8 mg/m3 [11]. In autumn, as light weakens, the surface water temperature decreases and convective mixing is enhanced, surface nutrients are replenished, and the Chl-a concentration gradually increases [12].
Fig. 2. Variation of monthly-averaged of Chl-a in the Bohai Sea during 2003-2009
3.2 Inter-annual Variability
We computed the mean of all valid points to obtain the long-term variation curve of Chl-a concentration from 2002 to 2009. Fig.3 shows the long-term time series of chlorophyll-a concentration and the 5-month running mean during 2002-2009, and Fig.4 shows the Chl-a concentration anomaly curve and its 5-month running mean. Fig.3 shows more than one peak and one trough in each year, but the peaks mostly occur in February-March and the troughs in July-August. From Fig.4, the seasonal anomalies of Chl-a concentration are negative during 2002-2005 and 2008-2009. Starting from 2006, the seasonal anomalies of Chl-a concentration become positive (i.e. Chl-a exceeds the seasonal averages), and the positive anomalies last until 2007. From the monthly anomaly curves, we can see that the two curves present increasing trends and change very steadily; in the two 5-month running average curves, the increasing trends appear even more clearly. However, starting from 2008, there is a slight decrease. Generally, the Chl-a concentration shows a slight rising tendency.
Fig. 3. Monthly average of Chl-a concentration in the Bohai Sea during July 2002-2009
Fig. 4. Monthly anomalies of Chl-a concentration in the Bohai Sea during July 2002- December 2009
3.3 Variability Cycles
We analyzed the marked variation cycles of the monthly anomalies of Chl-a concentration using the power spectral density method, and the F-inspection with the 0.95 confidence limit was used to test the significance of the periods. The power spectral density and the F-inspection line are shown in Fig.5. There are many peaks in the spectral density curve, located at 90/1, 90/3, 90/10 and 90/15, respectively, but only two peaks pass the F-inspection, located at 90/1 and 90/3. That is to say, the monthly anomalies have two cycles, of about two years and three years.
Fig. 5. The power spectral density and F-inspection line (the dashed line) of 45 wave numbers
3.4 Spatial Variability
The spatial distribution of Chl-a concentration in the Bohai Sea was analyzed using the composite image from the MODIS-Aqua sensor during 2003-2009 (Fig.6). The image is the 7-year mean of Chl-a concentration from 2003 to 2009, which we then processed through density slicing. The Chl-a concentration in the Center basin of the Bohai Sea is mostly between 4-5 mg/m3, and the concentration gradient gradually decreases from the coastal sea to offshore. Higher Chl-a concentrations appear along the coasts of Liaodong Bay, Bohai Bay and Laizhou Bay, while lower concentrations appear in the Bohai Strait and the Northern Yellow Sea. Coastal seas, with their shallow depth, strong mixing and abundant nutrient salts, provide a good harbor for the growth of phytoplankton, and freshwater with abundant nutrients is discharged along the coastal shore [4]. All these factors result in high Chl-a concentrations along the coastal sea.
Fig. 6. 7-year average composite image of Chl-a distribution (mg/m3) during 2003-2009
Fig. 7. Seven anomaly images based on 7-year average image
Based on the above 7-year average image, we computed anomaly images for each year from 2003 to 2009 (Fig.7). In the figures, red regions represent positive anomalies (i.e. Chl-a exceeding the 7-year average) and green regions show negative anomalies. In 2003, only Liaodong Bay shows a positive anomaly. 2004 and 2005 show a similar situation, with positive anomalies in Bohai Bay and Laizhou Bay, but the Center basin is also positive in 2004. High Chl-a concentrations appear in 2006 and 2007, which coincides with the positive anomalies in the 11-month sliding average of the Chl-a anomaly. In 2006, the Bohai Sea shows the highest Chl-a concentration of the 2003-2008 period. In 2008, positive anomalies are dispersed over the whole Bohai Sea, and high Chl-a concentration emerges in Bohai Bay. In 2009, the distribution is similar to that of 2007; only in the center of the Bohai Sea is the chlorophyll-a concentration higher than the 7-year average.
4 Conclusions
In this study, the long-term time series of Chl-a concentration in the Bohai Sea of China was analyzed using 8 years of MODIS-Aqua measurements from July 2002 to December 2009. The results on seasonal variability showed that Chl-a concentration has a maximum in February-March and a minimum in July. The interannual variability showed a slightly increasing tendency of Chl-a concentration through 2002-2007; starting from 2008, the Chl-a concentration decreased, but in general the interannual curves show an increasing tendency. The monthly anomalies showed cycles of about 2 and 3 years. The spatial variability shows a decrease from the coastal shore to the Center basin. Chl-a concentration was high in 2006 and low in 2005 during 2003-2009. The Bohai Sea is classified as a Case 2 water body, in which the Chl-a concentration retrieved by MODIS is higher than in-situ values. The spatial-temporal variation of Chl-a concentration is influenced mostly by freshwater inflow, rainfall, ocean currents, wind direction and so on; further research on Chl-a concentration in the Bohai Sea is needed.
Acknowledgments. This paper was supported by Natural Science Foundation of Tianjin (NO.09JCZDJC25400 and NO. 08JCYBJC10500).
References 1. Sha, H.M., Li, X.S.: Annual variation in sea surface temperature and chlorophyll-a concentration retrieved by MODIS in East China Sea. Journal of Dalian Fisheries 24, 151–156 (2009) 2. Wu, M.: Application of MODIS satellite data in monitoring water quality parameters of Chaohu Lake in China. J. Environ. Monit. Assess. 148, 255–264 (2009) 3. Nikolay, N.P.: Patterns of Seasonal and Interannual Variability of Remotely Sensed Chlorophyll. J. Hdb Env. Chem. Part Q.5(part P), 143–157 (2005) 4. Zou, B.: Analysis of Characteristics of Seasonal and Spatial Variations of SST and Chlorophyll Concentration in the Bohai Sea. J. Advances in Marine Science 23(4), 487–492 (2005)
5. Yodera, J.A., O’Reillyb, J.E.: Variability in coastal zone color scanner (CZCS) Chlorophyll Imagery of ocean margin waters off the US East Coast. J. Continental Shelf Research 21, 1191–1218 (2001) 6. Iida, T., Saitoh, S.I.: Temporal and spatial variability of chlorophyll concentrations in the Bering Sea using empirical orthogonal function (EOF) analysis of remote sensing data. J. Deep-Sea Research II 54, 2657–2671 (2007) 7. Nikolay, N.P.: Seasonal and Interannual Variability of Remotely Sensed Chlorophyll. J. Hdb Env. Chem. 5, 333–349 (2008) 8. Wei, F.Y.: Modern diagnosis and prediction of climate statistics, 2nd edn. Meteorological Press, Beijing (2007) (in chinese) 9. Shi, N.: Multivariate analysis method in weather research and forecasting, 2nd edn. Meteorological Press, Beijing (2002) (in chinese) 10. Sun, X.P.: China coastal shore Regional Sea. Ocean Press, Beijing (2006) (in chinese) 11. Sheng, G.Y., Shi, B.Z.: Marine Ecology. Science Press, Beijing (2006)
Effect of the Twirling Frequency on Firing Patterns Evoked by Acupuncture Yu-Liang Liu1,2,*, Jiang Wang1, Wen-Jie Si1, Bin Deng1, and Xi-Le Wei1 1
School of Electrical Engineering and Automation, Tianjin University, Tianjin, 300072 School of Automation and Electrical Engineering, Tianjin University of Technology and Education, Tianjin, 300222 Tel.: 86-22-88181114 [email protected]
2
Abstract. Acupuncture is an important component of Traditional Chinese Medicine (TCM) with a long history. Although there are a number of different acupuncture manipulations, the relationship between the evoked electrical signals and the manipulations is rarely investigated. We therefore performed an experiment in which the Zusanli acupoint was needled with four acupuncture manipulations of different frequency to obtain spike trains at the spinal dorsal horn, and then studied the correlation between manipulations via the neural system outputs. Because neural information transmission relies on temporal spike timing, the concepts of interspike intervals (ISI) and firing rate (FR) are introduced. First, the distinction and correlation between different twirling frequencies are obtained through the ISI sequences of the evoked electrical signals; then the variation trend of the firing rate with the twirling frequency is discussed. Keywords: acupuncture, frequency, ISI, firing rate.
1 Introduction
Neural systems have strong nonlinear characteristics and can display different dynamic behaviors in response to different inputs from both internal and external environments. Their dynamics usually change little when the inputs are slightly modified, but when a stimulus parameter is close to a critical value related to the intrinsic oscillation of the system, the neural system shows obviously different firing patterns. Spike trains, as the outputs of the neural system, carry significant neural information, so it is necessary to study how the system output encodes the system input. Up to now, the main coding schemes fall into two classes: rate codes and temporal codes. The firing rate (FR) of a spike train is one kind of rate code. The interspike interval (ISI), the time interval between adjacent spikes, is generally recognized as a basic element of the temporal code and plays an important role in encoding neuronal information [1-10]. Acupuncture is an important part of traditional Chinese medicine (TCM) and its effectiveness has been recognized for more than 300 diseases [11]. Since the 20th century, acupuncture has been widely used in abirritation [12], drug treatment [13] and so on. Acupuncture, as a mechanical action, can be regarded as an external stimulus to
the neural system, which induces it to evoke various kinds of neural electrical signals owing to both the variation of the stimulus and the high nonlinearity of the neural system itself [14]. However, the encoding mechanism of acupuncture is still unclear. Hence, we designed an experiment in which the Zusanli acupoint was needled with four acupuncture manipulations of different frequency while action potentials were recorded at the dorsal spinal horn, and then studied the effect of the twirling frequency on the firing patterns evoked by acupuncture using rate and temporal codes. This paper is organized as follows. Section 2 describes the data source. In Section 3, the methods and the analysis results of the acupuncture neural electrical signals are given, and the conclusions are drawn in Section 4.
2 Experiment and Data Recording
All the experiments were performed on adult Sprague-Dawley rats regardless of sex. During the experiment, the animal was kept under anesthesia at all times. Extracellular recording was used to record the electrical signals from the spinal dorsal root that respond to acupuncture at the Zusanli acupoint. The Zusanli acupoint was chosen because acupuncture at this acupoint is widely accepted to be very effective for the treatment of gastropathy. The voltage trace of the neural electrical signal was recorded by a 16-channel physiology acquisition system (BIOPAC-MP150) at a 40 kHz sampling rate. The experiment connection diagram is shown in Fig.1.
Fig. 1. Experiment connection diagram
Fig. 2. Experiment flow chart
Four different acupuncture frequencies are involved, namely twirling at 50, 100, 150 and 200 times per minute. Each twirling frequency was applied three times, and each trial lasted 20 seconds followed by a 100-second pause. The experiment flow chart is shown in Fig.2. The spike trains evoked by the four
acupuncture manipulations are shown in Fig.3. It can be seen clearly that there are various kinds of firing patterns and that different acupuncture manipulations induce distinct electrical signals.
3 Analysis of Spike Trains Evoked by Acupuncture
Spike trains are the carriers of neural information. In the following, we explore the information underlying the spike trains using the ISI and the FR.
3.1 ISI Analysis
The ISI is considered a state variable that can characterize the temporal dynamics of the neural system. Fig.4 shows the ISI sequences for the different twirling frequencies. It can be seen that as the twirling frequency increases, the number of spikes increases; in particular, the number of spikes at 200 times/min is much larger than at 150 times/min.
Fig. 3. The spike trains evoked by acupuncture with different frequencies. w and n denote twirling frequency and the sequence number of trials for the same frequency, respectively; N represents the sampled points of the recorded data
Quantile-quantile (QQ) plots display the quantiles of two samples against each other and can be used to test whether the distributions of the two samples are the same: if the samples come from the same distribution, the plot will be linear. This method is robust with respect to changes in the location and scale of either distribution. In Fig.5, QQ plots are made between different trials with the same twirling frequency. Although the ISI sequences from different trials with the same twirling frequency differ, the approximately linear relationships suggest that ISI sequences evoked by the same frequency follow the same distribution, except in (e) and (k); even there, the deviations of the scatter points from the line are not very large. Hence, from a statistical point of view, it can be assumed that the acupuncture effects are similar under the same twirling frequency.
Fig. 4. ISI sequences evoked by different twirling frequency. w and n denote twirling frequency and the sequence number of trials for the same frequency, respectively.
Another group of QQ plots is made between trials with the same sequence number but different twirling frequencies. From Fig.6, several observations can be made: (1) the distributions of samples evoked at 50 times/min and 100 times/min are very similar, and the same holds between 150 times/min and 200 times/min; (2) apart from these two cases, the distributions differ from each other, and compared with Fig.5 most of the plots show large deviations, so the ISI sequences evoked by different twirling frequencies appear to belong to different distribution families. In conclusion, acupuncture manipulations with different frequencies are likely to evoke different firing patterns, which means that different twirling frequencies can produce different effects on the target organ; this is why different twirling frequencies are used in TCM to treat different diseases.
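A minimal sketch of how ISI sequences and an informal quantile-quantile comparison could be obtained from the recorded voltage traces; the threshold value, the spike-detection method and the variable names are assumptions for illustration, not the authors' processing pipeline.

```python
import numpy as np

FS = 40_000  # sampling rate of the acquisition system (Hz)

def isi_sequence(voltage, threshold=0.05):
    """Inter-spike intervals (ms) from a voltage trace by simple
    threshold crossing; `threshold` (mV) is an illustrative value."""
    v = np.asarray(voltage, dtype=float)
    above = v >= threshold
    crossings = np.flatnonzero(~above[:-1] & above[1:])  # rising edges
    spike_times_ms = crossings / FS * 1000.0
    return np.diff(spike_times_ms)

def qq_pairs(isi_a, isi_b, n_quantiles=100):
    """Matched quantiles of two ISI samples; points near the line y = x
    suggest the two samples share the same distribution family."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    return np.quantile(isi_a, q), np.quantile(isi_b, q)
```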
Fig. 5. QQ plots between different experiments in the same twirling frequency. w and n denote twirling frequency and the sequence number of trials for the same frequency, respectively.
Fig. 5. (Continued)
Fig. 6. QQ plots between different experiments in the same sequence number but different twirling frequency. w and n denote manipulation frequency and the sequence number of trials for the same frequency, respectively
Effect of the Twirling Frequency on Firing Patterns Evoked by Acupuncture
2500
600 500
w=200,n=2
w=150,n= 2
2000 1500 1000 500
200
300 200
400 600 w=50,n=2
800
0 0
1000
1200
1000
1000 w=150,n=3
1400
1200
w=100,n=3
1400
800 600
400 200 400 600 w=50,n=3
800
0 0
1000
2500
500
2000
400
1500
w=150,n=1
600
300 200
400 600 w=50,n=2
800
1000
200
400 600 w=50,n=3
800
1000
600
200 200
200
800
400
0 0
w=200,n=3
400
100
0 0
1000 500 0
100 0 0
200
400 600 w=50,n=3
800
-500 0
1000
600
350
500
300
300 200
200
300 400 w=100,n=1
500
600
700
500
600
(l)
200 150 100
100 0 0
100
250
400
w= 200,n= 1
w=200,n=1
677
50
100
200
300 400 w=100,n=1
500
600
700
Fig. 6. (Continued)
Fig. 6. (Continued)
Fig. 7. Variances and means of the ISI sequences. w and n denote twirling frequency and the sequence number of trials for the same frequency, respectively.
Furthermore, the variances and extreme values of the ISI sequences can help to distinguish the electrical signals produced by different twirling frequencies. In Fig. 7, the means of the ISI sequences evoked by different twirling frequencies cannot separate the frequencies, except that the mean for 200 times/min is much smaller than the others; the variances, however, distinguish them much more easily: as the twirling frequency increases, the variance of the ISI sequences becomes smaller. Similarly, Fig. 8 shows that the minimums of the ISI sequences have no regular dependence on the twirling frequency, whereas the maximum decreases markedly as the twirling frequency increases.
Fig. 8. Maximums and minimums of ISI sequences. w and n denote twirling frequency and the sequence number of trials for the same frequency, respectively
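The following short sketch (with hypothetical ISI arrays; not the authors' code) computes, for each trial, the four summary statistics discussed above, of which the variance and maximum are the most discriminative with respect to the twirling frequency:

```python
# Summary statistics of ISI sequences (assumed stored as arrays, in ms).
import numpy as np

def isi_summary(isi):
    isi = np.asarray(isi, dtype=float)
    return {
        "mean": isi.mean(),
        "var": isi.var(ddof=1),   # sample variance
        "max": isi.max(),
        "min": isi.min(),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical trials keyed by (twirling frequency, trial number)
    trials = {(50, 1): rng.exponential(60, 400), (200, 1): rng.exponential(15, 400)}
    for key, isi in trials.items():
        stats = isi_summary(isi)
        print(key, {k: round(v, 2) for k, v in stats.items()})
```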
3.2 FR Analysis
Before analyzing the firing rate of the electrical signals evoked by different acupuncture frequencies, it is important to choose a proper time window for calculating the firing rate. After a series of trials, the time window was set to 50,000 points, which means that the number of spikes is counted in each window of 50,000 sample points and divided by the
Fig. 9. Firing rate of ISI sequences. w and n denote twirling frequency and the sequence number of trials for the same frequency, respectively; N represents the sampled points of the recorded data.
Fig. 9. (Continued)
window length. Fig. 9 shows the firing-rate plots of the ISI sequences, from which several points can be drawn: (1) the firing rates at 50, 100 and 150 times/min are at a similar level, but when the frequency reaches 200 times/min the firing rate becomes much larger; (2) as the frequency increases, the firing-rate curve becomes smoother, which means that spikes occur more frequently at higher twirling frequencies.
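A minimal sketch of this firing-rate calculation is given below; it assumes the spike times are available as sample indices and that the record is divided into non-overlapping windows of 50,000 sample points (the data here are synthetic):

```python
# Firing rate as spike count per window divided by the window length.
import numpy as np

def firing_rate(spike_samples, n_samples, window=50_000):
    edges = np.arange(0, n_samples + window, window)
    counts, _ = np.histogram(spike_samples, bins=edges)
    return counts / window            # spikes per sample point

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n_samples = 800_000               # length of a hypothetical recording
    spikes = np.sort(rng.choice(n_samples, size=3000, replace=False))
    print(firing_rate(spikes, n_samples)[:5])
```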
4 Conclusion
In this paper, the correlation between different twirling frequencies was studied using a temporal code (ISI) and a rate code (FR) of the electrical signals. The QQ plots show that ISI sequences generated by the same twirling frequency follow the same distribution, whereas those generated by different twirling frequencies follow different distributions; for the same experimental subject, acupuncture at the same frequency therefore produces the same effect on the target, while acupuncture at different frequencies produces different effects. Furthermore, variances and extreme values were used to distinguish different twirling frequencies effectively via the corresponding ISI sequences. The FR analysis shows that the target organ receives more stimuli as the twirling frequency increases, especially at 200 times/min. In short, both the ISI and the FR analyses indicate that the twirling frequency has an obvious effect on the target organ through neural signals.
Acknowledgment. This work is supported by the NSFC (No. 50537030, 50707020 and 60901035).
Comparison of Two Models for Calculating Water Environment Capacity of Songhua River Shihu Shu1 and Huan Ma2 1
School of Environmental Science and Engineering, Tongji University, Shanghai, China 2 Anglian Water Services Ltd. , Anglian House, Ambury Road, Cambridgeshire, UK [email protected], [email protected]
Abstract. Water environment capacity is an important concept in environmental science. As a basic theory applied in Environmental Impact Assessment (EIA), water environmental capacity is also an indispensable factor in district environmental planning and total water pollutant control. Using the statistical monitoring data of 17 water quality indexes from 2001 to 2005 at six monitoring sections provided by the Harbin Environmental Protection Administration, the water quality of the Songhua River along Harbin City was assessed. Sensitivity analysis was performed to identify the critical model parameters among the 17 indexes, and COD and NH3-N were selected as the key parameters to be calculated. Both one-dimension and two-dimension water quality models were calibrated and used to derive the water environmental capacity of the Songhua River, and the model performance evaluation is discussed. The conclusion is that the two-dimension water quality model provides a more conservative water environmental capacity than the one-dimension model. There is no water environmental capacity left in the Harbin City reach of the Songhua River, which needs pollutant reduction; the reaches upstream and downstream of Harbin City can bear the current wastewater discharge. Keywords: water environmental capacity, EIA, water quality model, one-dimension model, two-dimension model.
1 Introduction
Water environment capacity is an important concept in environmental science. It is the load of pollutants that a water body can receive during a certain time, in a certain unit, while still fulfilling a given environmental objective (Wang et al., 1995); it reflects the capacity of the water body to receive pollutants without destroying its own function. As a basic theory applied in Environmental Impact Assessment (EIA), water environmental capacity is also an indispensable factor in district environmental planning and total water pollutant control. At present, the numerical computation methods for environmental capacity fall into three categories: the systematically optimized method, the trial-and-error method and the analytical formula method (Zhang & Zhang, 1991; Xu, 2003; Xu & Lu, 2003; Zheng et al., 1997). The optimized method based on linear programming has been applied in river rehabilitation planning (Zheng et al., 1997; Li & Chen, 1991; Cao,
1988; Yang & Sun, 1995), it is the technique that dynamic response between pollution load and water quality standard is established, then maximal permissible pollutant load can be given based on the objective function and the corresponding limitation conditions. This method is precise in theory, and the given environment capacity is also precise. However, the method is very complex, which is hard for a river with so many river segments matching the objective function (Yao et al., 2006). Trial-and-error method is the technique that well-calibrated model is used to make the simulated water quality concentration equal to the specified water quality standard by adjusting the discharged pollution load, thus the permissible pollutant load is obtained. This method is simple but less efficient, which is also difficult for a river with so many river segments (Xu & Lu, 2003; Zheng et al., 1997). By contrast, analytical formula method is the technique that the static water quality model is usually used to calculate stable environmental capacity under given water quality standard with a certain designed hydrological condition. This method is simple while accurate, which is very useful environmental capacity in river. However, the environment capacity varies with dynamic hydrodynamic patterns, so the dynamic model should be adopted to calculate environmental capacity (Zheng et al., 1997). In this study, the analytical formula with dynamic hydrodynamic patterns was deduced firstly; further, to the Songhua River along Harbin City, the hydrodynamic model was developed to simulate dynamic hydrodynamic conditions in this river, and the deduced analytical formula was utilized to compute the dynamic environment capacity, which lays basis for water quality conservation and restoration in this famous river.
2 Sensitivity Analysis of Water Quality Parameters
Sensitivity analysis is the study of how model output varies with changes in model inputs. Regional Sensitivity Analysis (RSA) (Hornberger and Spear, 1981; Spear and Hornberger, 1980) was used to assess the sensitivity of the model parameters, where sensitivity is defined as the effect of changes in the parameters on the overall model performance. One of the approaches adopted uses the extension to RSA introduced by Freer et al. (1996). Essentially, parameter sets are drawn at random from prior distributions of feasible values (which may simply be taken from a uniform distribution) and used for Monte Carlo simulations. Simulations are ranked according to the selected objective function and split into a behavioural set, which is feasibly consistent with the data, and a non-behavioural set, which is discarded as an unrealistic representation of the system (as judged by the objective function). The objective functions are then transformed into likelihood values (i.e. the chance of occurrence), split into ten quantile groups, and the cumulative frequency distribution of each group is calculated and plotted (Wagener et al., 2001). If the model performance is sensitive to a particular parameter, there will be a large difference between the cumulative frequency distributions of the quantile groups. If the model performance is not sensitive to a particular parameter, given an a priori uniform distribution each group will plot on a straight line (Wagener et al., 2001). Sensitivity is only one of the essential requirements of an identifiable parameter. A parameter is termed identifiable if it is possible to determine its value with relative
confidence within the feasible parameter space based on the model output produced. However, the values of sensitive parameters that produce a behavioural model output can be distributed over a range of the feasible parameter space and can change when estimated from different response modes (Wagener et al., 2001). Using the statistical monitoring data of 17 water quality indexes from 2001 to 2005 at six monitoring sections provided by the Harbin Environmental Protection Administration, the water quality of the Songhua River along Harbin City was assessed in this paper. Sensitivity analysis was performed to identify the critical model parameters from the 17 indexes, and COD and NH3-N were selected as the key parameters to be calculated.
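The RSA procedure described above can be sketched as follows. This is a schematic illustration, not the authors' code: `run_model`, the two parameters and their uniform ranges are placeholders for an actual water quality model and its feasible parameter space.

```python
# Schematic Regional Sensitivity Analysis: Monte-Carlo sampling, ranking by an
# objective function, and comparison of cumulative distributions by quantile group.
import numpy as np

def run_model(params):
    # Placeholder model: returns a scalar objective (e.g. RMSE against data).
    k, e = params
    return (k - 0.07) ** 2 + 0.1 * (e - 0.4) ** 2 + np.random.normal(0, 0.001)

def regional_sensitivity(n_runs=5000, n_groups=10, seed=0):
    rng = np.random.default_rng(seed)
    # Uniform priors over assumed feasible ranges
    params = np.column_stack([rng.uniform(0.0, 0.2, n_runs),   # decay coefficient k
                              rng.uniform(0.0, 1.0, n_runs)])  # dispersion E
    objective = np.array([run_model(p) for p in params])
    order = np.argsort(objective)              # rank simulations by objective
    groups = np.array_split(order, n_groups)   # ten quantile groups
    cdfs = []
    for g in groups:
        values = np.sort(params[g, 0])         # inspect parameter k
        cdfs.append((values, np.linspace(0, 1, len(values))))
    return cdfs                                # differing CDFs indicate sensitivity

if __name__ == "__main__":
    print("number of quantile groups:", len(regional_sensitivity()))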
3 Comparison of the Two Models
In this study, both one-dimension and two-dimension water quality models were calibrated and used to derive the water environmental capacity of the Songhua River. The mechanism of and the difference between the two models are discussed, and the model performance evaluation is presented.
3.1 One-Dimension Model
The water environmental capacity estimation can be based on the following one-dimensional water quality model:
\frac{\partial (AC)}{\partial t} + \frac{\partial (QC)}{\partial x} - \frac{\partial}{\partial x}\left( A E_M \frac{\partial C}{\partial x} \right) + AkC = S_m    (1)

where Q is the discharge along the longitudinal direction; A is the lateral cross-section area; C is the concentration of pollutant constituents; k is the decay coefficient of the pollutant; E_M is the diffusion coefficient of the river; S_m is the source and conflux; t is the time; x is the distance between the sensitive point and the discharge.

3.2 Two-Dimension Model

\frac{\partial (hC)}{\partial t} + \frac{\partial (huC)}{\partial x} + \frac{\partial (hvC)}{\partial y} = \frac{\partial}{\partial x}\left( h E_{xM} \frac{\partial C}{\partial x} \right) + \frac{\partial}{\partial y}\left( h E_{yM} \frac{\partial C}{\partial y} \right) - r(c) + S_{\downarrow\uparrow}    (2)

where u is the velocity along the longitudinal direction; h is the depth of the river; r(c) is the degradation term; C is the concentration of pollutant constituents;
v is the velocity along the transverse direction; E_{xM}, E_{yM} are the transverse and longitudinal diffusion coefficients of the river; S_{\downarrow\uparrow} is the source and conflux; t is the time; x, y are the transverse and longitudinal distances between the sensitive point and the discharge.
3.3 Comparison
A one-dimensional steady-state river water quality model was selected for the earlier EIA studies. Since the water quality parameters in the river were observed to vary predominantly in the longitudinal direction, a one-dimensional approximation was assumed; this model is one of the widely accepted simulation tools for water quality impact analysis. The restricting factor of the one-dimension simulation is that the water quality objective is controlled only at the cross section, whereas the two-dimension simulation gives the distribution of the pollutants and restricts the size of the polluted area. By comparing the two models in simulation, it was concluded that the two-dimension water quality model provides a more conservative water environmental capacity than the one-dimension model.
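As an illustration of how a model of the form of Equation (1) can be evaluated, the following sketch steps it forward in time with an explicit upwind scheme. It is not the authors' implementation; the cross-section area A, dispersion coefficient and grid are assumed values chosen only so that the velocity and decay coefficient resemble the magnitudes reported later in the paper.

```python
# Explicit finite-difference sketch of the 1D advection-dispersion-decay model.
import numpy as np

def step_1d(C, Q, A, EM, k, Sm, dx, dt):
    """One explicit step of dC/dt = -u dC/dx + EM d2C/dx2 - k C + Sm/A."""
    u = Q / A
    Cn = C.copy()
    adv = -u * (C[1:-1] - C[:-2]) / dx                      # upwind advection
    disp = EM * (C[2:] - 2 * C[1:-1] + C[:-2]) / dx ** 2    # dispersion
    Cn[1:-1] = C[1:-1] + dt * (adv + disp - k * C[1:-1] + Sm[1:-1] / A)
    Cn[0], Cn[-1] = C[0], C[-2]                             # simple boundaries
    return Cn

if __name__ == "__main__":
    nx, dx, dt = 200, 100.0, 10.0            # 20 km reach, 10 s time step
    C = np.zeros(nx); C[0] = 17.7            # upstream COD concentration, mg/L
    Sm = np.zeros(nx)
    for _ in range(5000):
        C = step_1d(C, Q=345.0, A=1500.0, EM=10.0, k=0.07 / 86400, Sm=Sm,
                    dx=dx, dt=dt)
    print(round(C[50], 3))
```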
4 Model Calibration
Model calibration is the first-stage testing or tuning of the model against a set of field data not used in the original construction of the model; such tuning should yield a consistent and rational set of theoretically defensible parameters and inputs (Thomann, 1982). Model calibration is the process by which estimates of the model parameters are obtained through comparison of field observations and model predictions. Even if a steady-state condition is assumed, the environmental parameters can still vary due to random changes in temperature, stream discharge, time of day and general weather conditions. Because of this inherent dynamic nature of the environment, discrepancies between predicted and observed results are bound to occur. The effect of measurement errors can be minimized by optimizing data collection procedures, for example by collecting data at the most sensitive locations and by collecting an optimum number of replicates. Calibration of the hydrodynamic part of the model was first carried out by comparing simulated hydrodynamic variables (depth and velocity) with the measured ones. It is the hydrodynamic simulation that provides the flow and velocity information needed to determine how a constituent is transported and reacts throughout a river, so hydrodynamic calibration must be conducted before embarking on water quality model calibration. Calibration of a water quality model is a complicated task: many uncertain parameters need to be adjusted to reduce the discrepancy between the model predictions and the field observations, and the objective is to minimize the difference between the observed and simulated constituent concentrations. Water quality calibration is a nonlinear implicit optimization problem. It can be solved using optimization techniques, as for hydraulic model
calibration by Wu et al. (2002 & 2004). This paper provides a calibration methodology in which the calibration is automatically optimized with Genetic algorithm (GA). GA is a robust search paradigm based on the principles of natural evolution and biological reproduction (Goldberg, 1989). For optimizing calibration of a water quality model, a genetic algorithm program first generates a population of trial solutions of the model parameters. One generation produced by the genetic algorithm is then complete. The fitness measure is taken into account when performing the next generation of the genetic algorithm operations. To find the optimal calibration solutions, fitter solutions will be selected by mimicking Darwin’s natural selection principal of “survival of the fittest”. The selected solutions are used to reproduce a next generation of calibration solutions by performing genetic operations. Over many generations, the solutions evolve, and the optimal or near optimal solutions ultimately emerge. The calibrated model was used to predict the water quality with an independent set of data as a part of validation exercise. Results of model predictions were fairly good, and performance of the model was further confirmed through statistical evaluation of the results. Two models were calibrated by adjusting hydrodynamic and water quality parameters to make the simulated results fit the field observations. Performance of the calibrated model was evaluated. Model output was found to be very sensitive to headwater quality.
5 Calculation of Water Environmental Capacity
(3)

where WEC is the water environmental capacity, kg/d; h is the depth of the river, m; u is the velocity, m/s; K is the decay coefficient of the pollutant, d-1; C0 is the concentration of the pollutant from the upper reach, mg/L; Cs is the water quality standard at the control point, mg/L; Ey is the transverse diffusion coefficient, m2/s; x1, x2 are the distances between the upper (lower) cross section and the discharge.
5.1 Parameters Determination
1) Decay coefficient k. Each pollutant index has its own decay coefficient k. There are many methods to determine k, such as analogy and empirical estimation. In this paper, the decay coefficients of COD and NH3-N were estimated to be 0.07 and 0.05, respectively, based on the water quality model calibration.
2) Hydrological parameters. Based on the monitoring data from 1980 to 2003, the hydrological parameters are shown in Table 1; they were determined based on the hydraulic model calibration.
3) The amount of discharges. The information on the discharges is given in Table 2.
Table 1. Hydrological parameters in calculation

Parameter              Value
H (m)                  337
Dy                     0.38
V (m)                  4.45
Q (m3/s)               345
B (m s-1)              0.23
C0COD (mg/L)           17.7
Kli                    0.9
C0NH3-N (mg/L)         0.79
Table 2. The information of the discharges
5.2 Results and Discussion
The ideal water environmental capacity was given by the simulation of the water quality model; it is defined as the water environmental capacity contributed by the discharges. The one-dimension and two-dimension models were used respectively to calculate the environmental capacity. The results obtained with the two-dimension water quality model are given in Tables 3 and 4, and the results obtained with the one-dimension model are given in Tables 5 and 6. The whole water environmental capacity of the Songhua River was calculated, and it can be concluded that pollution reduction is needed only in the Harbin City region, not in the upper and lower reaches. The required reduction of the discharge amount, obtained as the difference between the water environmental capacity and the actual discharge amount, is also given; this result can be used as a reference for pollution reduction in the pollution control of the Songhua River along Harbin City.
Table 3. Environmental capacity of COD with the two-dimension model
Part of the Songhua River        Q (m3 s-1)   Ideal water environmental capacity   Water environmental capacity   COD discharge amount
Zhushuntun—Dongjiangqiao         345.0        9887                                 9995.16                        28976
Dongjiangqiao—Dadingzishan       347.02       44837                                44674.2                        83816
Table 4. Environmental capacity of NH3-N with the two-dimension model

Table 5. Environmental capacity of COD with the one-dimension model

Table 6. Environmental capacity of NH3-N with the one-dimension model
6 Conclusion
Considering that the conventional water quality model calibration method is not efficient, a new optimization-based calibration approach was presented. This method was then applied to estimate the water environment capacity of the Songhua River along Harbin City after establishing the hydrodynamic and water quality models, which shows that the newly developed analytical computation approach is efficient and reliable both in theory and in practice.
Acknowledgment. Part of this work was supported by the National Water Special Project of China (2009ZX07421-005) and the Shanghai Post-doctoral Research Foundation (B type) Program (10R21420900).
References 1. Cao, L.L.: Computational method for aquatic environmental capacity of tide-affected river. Shanghai Environ. Sci. 17, 15–18 (1988) 2. Freer, J., Beven, K.J., Ambroise, B.: Bayesian estimation of uncertainty in runoff prediction and the value of data: an application of the GLUE approach. Water Resour. Res. 32, 2161–2173 (1996)
3. Goldberg, D.E., Korb, B., Deb, K.: Messy genetic algorithms: Motivation, analysis, and first results. Complex Systems 3, 493–530 (1989) 4. Hornberger, G.M., Spear, R.C.: An approach to the preliminary analysis of environmental systems. J. Environ. Manage. 12, 7–18 (1981) 5. Li, K.M., Chen, X.C.: Research on optimal water environmental capacity of Dongguan Canal. Res. Environ. Sci. 4, 13–16 (1991) 6. Spear, R.C., Hornberger, G.M.: Eutrophication in Peel Inlet, II, Identification of critical uncertainties via generalized sensitivity analysis. Water Res. 14, 43–49 (1980) 7. Thomann, R.V., Muller, J.A.: Verification of water quality models. J. Environ. Eng. 108, 923–940 (1982) 8. Wagener, T., Boyle, D.P., Lees, M.J., Wheater, H.S., Gupta, H.V., Sorooshian, S.: A framework for the development and application of hydrological models. Hydrol. Earth Syst. Sci. 5, 13–26 (2001) 9. Wang, H.D., Wang, S.H., Bao, Q.S.: On regional differentiation of river water environment capacity and strategies to control water environment pollution in China. Chinese Geogr. Sci. 5, 116–124 (1995) 10. Wu, Z.Y., Walski, T.M., Mankowski, R., Herrin, G., Gurrieri, R., Tryby, M.: Calibrating water distribution models via genetic algorithms. In: Proceedings AWWA Information Management Technology Conference, pp. 1–10. Taylor& Francis, Kansas (2002) 11. Wu, Z.Y., Arniella, E.F., Gianellaand, E.: Darwin calibrator-improving project productivity and model quality for large water systems. J. Am. Water Works Ass. 96, 27–34 (2004) 12. Xu, Z.X.: Practice and theory of river pollution control, 1st edn., pp. 58–99. China Environmental Science Press, Beijing (2003) 13. Xu, Z.X., Lu, S.Q.: Calculating analysis on water environmental capacity of tidal river system. Shanghai Environ. Sci. 22, 254–257 (2003) 14. Yang, Z.P., Sun, W.: Approach on computational Method for tidal river dynamic assimilative capacity. Shanghai Environ. Sci. 14, 14–16 (1995) 15. Yao, Y.J., Yin, H.L., Li, S.: The computation approach for water environmental capacity in tidal river network. J. Hydrod., Ser. B 18, 273–277 (2006) 16. Zhang, Y.L., Zhang, P.Z.: Handbook of the calculation of water environmental capacity, 1st edn., pp. 35–89. Tsinghua University Press, Beijing (1991) 17. Zheng, X.Y., Zhu, J.D., Zhu, W.B.: Research on water environmental capacity of dynamic river system. Adv. Water Sci. 8, 25–31 (1997)
Growth Characteristics and Fermentation Kinetics of Flocculants-Producing Bacterium F2 Jie Xing, Jixian Yang, Fang Ma*, Wei Wang, and Kexin Liu State Key Lab of Urban Water Resource and Environment, School of Municipal and Environmental Engineering, Harbin Institute of Technology, Harbin 150090, China Tel.: 0451-86282107 [email protected]
Abstract. We isolated the flocculant-producing bacterium F2 from soil. It shows high and stable flocculating activity for kaolin clay suspension. In order to understand its growth characteristics and make good use of it, we measured the changes of several parameters in shake-flask experiments, including pH, temperature, and the contents of glucose and the nitrogen source, and we built models of cell growth and substrate consumption. Comparison of the experimental data with the corresponding values calculated from the models shows good agreement, so the models can provide a theoretical basis for the large-scale fermentation of the flocculant-producing bacterium F2. Keywords: bio-flocculants; growth characteristics; fermentation kinetics.
1 Introduction
Flocculants have been widely used in water treatment. It is increasingly accepted that current chemical flocculants may be replaced by flocculants obtained from the fermentation of microorganisms. Bio-flocculants have many advantages, such as biodegradability, harmlessness and the absence of secondary pollution, so they have attracted more and more attention in recent years. Bio-flocculants are produced by microorganisms, and a large number of researchers at home and abroad have screened flocculant-producing microorganisms. In order to develop a cost-competitive fermentative flocculant production process, strain improvement that maximizes the production of flocculants while minimizing the formation of by-products through rational metabolic engineering is very important. We screened and developed an improved flocculant producer, the flocculant-producing bacterium F2, by two-stage fermentation. In addition, the possibility of producing cost-effective flocculants with bacterium F2 from inexpensive and abundant feedstocks was also investigated. Flocculant-producing bacterium F2 is classified as a radiation rhizobium and was isolated from soil; it has high and stable flocculating activity for kaolin clay suspension, domestic sewage, industrial wastewater and so on [1]. Kinetic modeling is regarded as an indispensable step in developing a fermentation process, since the model can be used to determine the best conditions for the production of a target metabolite. Although structured models can explain a complex microbial
* Corresponding author.
system at the molecular level, relatively simpler unstructured kinetic models have frequently been used for practical applications [2]. In this paper, we characterize the growth of flocculating bacterium F2 by tracking the changes of the environmental and nutritional factors that influence its growth. In addition, kinetic models of cell growth and substrate consumption were built to describe the growth and batch fermentation of strain F2, providing an experimental basis for further industrial application. A series of batch fermentations of different durations were conducted in shake flasks, and the experimental data were used to estimate the parameters and to validate the models. The behaviour of strain F2 during fermentation was predicted using the models, and the predictions were compared with the experimental observations. The established models can successfully explain cell growth and glucose utilization.
2 Materials and Methods
2.1 Materials
Strains: bio-flocculant-producing bacterium F2, Bacillus sp.
Medium: inclined medium (g/L): peptone 10, NaCl 5, beef extract 3, agar powder 15~18, H2O 1000 mL, pH 7.0~7.2. Flocculant fermentation medium (g/L): glucose 10, yeast extract 0.5, urea 0.5, MgSO4·7H2O 0.2, NaCl 0.1, K2HPO4 5, KH2PO4 2, H2O 1000 mL, pH 7.2~7.5.
Reagents: reagents for cultivation such as nutrient agar (NA) and nutrient broth (NB) were purchased from Beijing Aoboxing Biotechnology Co., Ltd, China. Medium components (glucose, K2HPO4, KH2PO4, MgSO4·7H2O, NaCl, urea) were obtained from the Tianjin Third Chemical Reagent Factory, China. Phenol, H2SO4 and Coomassie brilliant blue G-250 were from Beijing YiLi Fine Chemical Co., Ltd, China.
2.2 Methods
Culturing of the bio-flocculant-producing bacterium F2: fermentation was carried out on a rotary shaker (140 rpm) at 30 °C for 24 h in 100 ml of medium in 250 ml flasks. The seed culture was prepared at 30 °C and 140 rpm for 24 h in the same medium as the fermentation medium, and 10% of the seed culture was inoculated into the fermentation medium.
DO: dissolved oxygen meter [3]. pH: pH meter [4]. Polysaccharide: sulfuric acid-phenol method [5]. Protein: Coomassie brilliant blue method [6]. Glucose: biosensor [7].
3 Results and Discussion
3.1 The Effect of Environmental Factors on Bacterium F2
Dissolved oxygen and pH are essential environmental factors for cell growth. In order to examine their impact on cell growth, the bio-flocculant-producing bacterium F2 was fermented in the fermentation medium for 36 h, and the pH, temperature and biomass were measured. The changes are shown in Fig. 1.
Fig. 1. Time course of cell growth, pH and temperature by strain F2
According to Fig. 1, flocculating bacterium F2 essentially skips the lag phase and enters the exponential growth phase directly, because the selected strain had an appropriate cell age and inoculation amount. During 0-15 h the dry cell weight increases sharply and the bacterium is in the exponential growth phase; during 15-27 h the dry cell weight increases slowly and the bacterium is in the stationary phase. After 27 h the bacterium enters the decline phase: cells in the culture medium begin dying and the living cell concentration declines, while the dry cell weight of the fermentation broth still increases slowly, because the cell amount is measured by dry weight, which includes both viable and dead cells, so the measured dry weight does not fall. Compared with the growth curve of strain F2, in the neutral-to-alkaline range the change of pH has little effect on cell growth. The amount of dissolved oxygen is low or zero during the stage where the pH shows a declining trend. In the middle and final stages of the logarithmic phase and in the stationary phase, the sharp decrease of the oxygen concentration weakens its role as the terminal electron acceptor, and strain F2 uses intermediate metabolites
from the oxidative decomposition of organics as terminal electron acceptors for fermentation, generating organic acids and so on, which leads to the decline of the pH value. With a phase-divided oxygen supply control strategy, the fermentation of F2 needs a concentrated and large oxygen supply only during 3 to 27 h; that is, for 21 hours the fermentation system needs a relatively high dissolved oxygen concentration, while for the rest of the time a lower dissolved oxygen concentration can be maintained, which not only reduces the energy consumption of fermentation but also facilitates the formation of the flocculating products.
3.2 The Effect of Nutritional Factors on Bacterium F2
To observe the utilization of the carbon and nitrogen sources during the growth of flocculating bacterium F2, the glucose content, total nitrogen and dry cell weight are plotted in Fig. 2.
Fig. 2. Time course of cell growth, carbon and nitrogen source
According to Fig. 2, during 0-15 h cell growth enters the exponential phase directly, and the glucose and total nitrogen contents decrease correspondingly, providing the essential nutrients for cell growth. During 15-27 h cell growth enters the stationary phase and the decline of the glucose and total nitrogen contents slows down gradually. After 24 h the bacteria enter the death phase and the changes in glucose and total nitrogen content are no longer significant. It can thus be seen that cell growth is closely related to the consumption of the carbon and nitrogen sources, which may be supplemented appropriately during the growth process.
3.3 The Kinetic Equation of Cell Growth
In order to describe the growth of flocculating bacterium F2, a kinetic equation of cell growth was established. According to the above results, the Logistic equation was used to describe the cell growth of bacterium F2:

\frac{dX}{dt} = f(t) = \mu_m X \left(1 - \frac{X}{X_m}\right)    (1)

In the equation, X is the cell concentration (g/L) and \mu_m is the maximum specific growth rate. When fermentation starts, the cell concentration is very low, i.e. X is much smaller than X_m, so X/X_m can be ignored and the equation describes exponential growth. When the logarithmic phase ends and growth enters the stationary phase, X approaches X_m and the equation describes the cessation of growth. In batch fermentation the increasing cell concentration inhibits further growth, so cell growth can be depicted well by the Logistic equation. Integrating both sides of Equation (1) over the interval 0-t gives:

\ln \frac{X}{X_m - X} = Kt - \ln\left(\frac{X_m}{X_0} - 1\right)    (2)

In the equation, X_0 is the initial cell mass (g/L), X_m is the maximum cell mass (g/L) and t is time (h). Substituting the average dry cell weights obtained in the kinetic analysis of the batch culture of flocculating bacterium F2, X_0 = 0.16 g/L and X_m = 1.7 g/L, into Formula (2) gives:

X = \frac{X_m}{1 + e^{\ln(X_m/X_0) - Kt}} = \frac{1.7}{1 + e^{2.36 - Kt}}    (3)
The cell growth model, Formula (3), was fitted to the experimental data by nonlinear regression using mathematical software; the result is shown in Fig. 3.
Fig. 3. The fitting curve of cell growth dynamical model
Substituting the fitted value K = 0.2534 into the cell growth model, Equation (3), gives the kinetic equation of cell growth:

X = \frac{1.7}{1 + e^{2.36 - 0.2534t}}
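The nonlinear fit that yields K can be reproduced along the following lines. The sketch uses SciPy, which the paper does not name, and the time points and dry cell weights below are hypothetical stand-ins for the shake-flask measurements.

```python
# Fitting K in the Logistic model of Equation (3) by nonlinear least squares.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, Xm=1.7, X0=0.16):
    return Xm / (1.0 + np.exp(np.log(Xm / X0) - K * t))

if __name__ == "__main__":
    t_obs = np.array([0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36], float)
    x_obs = logistic(t_obs, 0.25) + np.random.default_rng(3).normal(0, 0.03, t_obs.size)
    (K_fit,), _ = curve_fit(logistic, t_obs, x_obs, p0=[0.2])
    print("fitted K:", round(K_fit, 4))   # the paper reports K = 0.2534
```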
According to Fig. 3, the growth curve of flocculating bacterium F2 follows the trend of the classical microbial growth curve, so the Logistic formula fits the experimental data well; the coefficient of determination between the simulated values and the experimental data is R2 = 0.9863, which shows that the fit is good and that this equation can be used as the kinetic model of strain growth. The figure also shows that from the beginning of fermentation until the end of the logarithmic phase the equation fits the experimental data well, while in the stationary phase it underestimates and in the decline phase it overestimates the data, which is common when fitting microbial growth with the Logistic formula. Comparing the simulated values from the cell growth model with the experimental data, the residual and relative errors were obtained: the maximum relative error is 9.0%, the minimum is 0.1% and the average is 4.72%, which shows that the measured and predicted values agree well. Thus the established kinetic model of the growth of flocculating bacterium F2 can reflect its batch fermentation process well.
3.4 The Kinetic Equation of Substrate Consumption of Flocculating Bacterium F2
The substrate includes the various nutrients essential for cell growth, and its consumption has three main components: consumption for cell growth to synthesize new cells, consumption to maintain the routine life of the cells, and consumption to synthesize metabolic products. Thus, the consumption rate can be depicted by the following equation:
−
(4)
In the equation, S is Glucose concentration (g/L), P is products content (g/L), yx/s is the velocity constant in carbon resources for cells growth, yp/s is the velocity constant in carbon resources for products accumulation, ms is the maintain constant in microorganism carbon resources and t is time (h). In our research, cells growth partly couples the largely synthesis of products. Cells growth and products accumulation mainly depend on the energy provided by substrate Glucose consumption. While the energy consumption to maintain the basic metabolism of flocculation bacteria is relatively small, thus msX in equation (4) could be ignored and the equation (4) turns into:
−
Make
1 yX / S
= m and
dS dX 1 1 dP = × + × dt dt yX /S yP / S dt
1 yP / S
= n, then equation (5) turn into:
(5)
Growth Characteristics and Fermentation Kinetics of Flocculants-Producing Bacterium F2
−
dS dX dP =m +n dt dt dt
697
(6)
Bring equation (3) into equation (6), in the 0 - t interval, integral on both size of Equation (6) then obtain:
S = S0 −
1.7 α m e 2.36 − 0.2534 t 1.7 n + 6.7088 β m ln − 2.36 − K t 2.36 − 0.2534 t 1+ e 1+ e 1 + e 2.36 − K t
Bring K=0.2534, S0=10g/L(initial concentration of Glucose), α=1.2926 0.00127 into Equation (7) then obtain:
S = 10 −
2.1974 m e 2.36 − 0.2534 t 1.7 n − 0.0085 m ln − 1 + e 2.36 − 0.2534 t 1 + e 2.36 − 0.2534 t 1 + e 2.36 − 0.2534 t
(7)
, β=(8)
Use mathematic software to make nonlinear fitting on Glucose consumption dynamical model Formula (8), the result is shown in Figure 4. 10
■measured 试 验 测value 定值 …95%confidence interval … .. . 9 5 % 置 信 区 间 95%prediction - - - 9 5 % 预 测interval 区间 —fitting —拟合 曲线 curve 2 R ²= 0 .9 9 6 4 R =0.9964 m = -1 2 .0 6 4 8 m=-12.0648 n = 1β=19.9385 9 .9 3 8 5
Fig. 4. The fitting curve of substrate consumption dynamical model
After the fitting, substituting m = -12.0648 and n = 19.9385 into Equation (8) gives the kinetic equation of substrate consumption:

S = 10 - \frac{26.5111\, e^{2.36 - 0.2534t}}{1 + e^{2.36 - 0.2534t}} - \frac{33.8955}{1 + e^{2.36 - 0.2534t}} - 0.1025 \ln\frac{1}{1 + e^{2.36 - 0.2534t}}
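The corresponding fit for m and n can be sketched as below. Again this uses SciPy as an assumption, the model function follows the form of Equation (8) as reconstructed above, and the glucose measurements are synthetic stand-ins.

```python
# Fitting m and n of the substrate consumption model, Equation (8).
import numpy as np
from scipy.optimize import curve_fit

def substrate(t, m, n, K=0.2534, S0=10.0):
    g = 1.0 + np.exp(2.36 - K * t)
    return (S0 - 2.1974 * m * np.exp(2.36 - K * t) / g
            - 1.7 * n / g - 0.0085 * m * np.log(1.0 / g))

if __name__ == "__main__":
    t_obs = np.array([0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36], float)
    s_obs = (substrate(t_obs, -12.06, 19.94)
             + np.random.default_rng(4).normal(0, 0.1, t_obs.size))
    (m_fit, n_fit), _ = curve_fit(substrate, t_obs, s_obs, p0=[-10.0, 20.0])
    print(round(m_fit, 3), round(n_fit, 3))   # paper: m = -12.0648, n = 19.9385
```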
According to Fig. 4, the substrate consumption equation fits the experimental data well; the coefficient of determination between the simulated values and the experimental data is R2 = 0.9863, which shows that the fit is good and that this equation can be used as the kinetic model of substrate consumption. The figure also shows that over the course of fermentation the equation fits the experimental data
well, while in the logarithmic and decline phases the equation overestimates the data and in the stationary phase it underestimates the data. Comparing the simulated values of the substrate consumption model with the experimental values, the residual and relative errors δ were obtained: the maximum relative error is -9.7%, the minimum is 0.1% and the average is 4.51%, which shows that the measured and predicted values agree well. Thus the established kinetic model of substrate consumption during the fermentation of flocculating bacterium F2 can reflect the substrate utilization process well.
4 Conclusions
The bio-flocculant-producing bacterium F2 was isolated from soil and shows high and stable flocculating activity. During growth the pH remains stable, and a large amount of oxygen should be supplied between 3 and 27 h. Glucose, the carbon source, was consumed rapidly and was nearly exhausted after 21 h. To simulate the fermentation process of flocculating bacterium F2 and to guide the industrial production of bio-flocculants, kinetic models of cell growth and substrate consumption were established:

X = \frac{1.7}{1 + e^{2.36 - 0.2534t}}

and

S = 10 - \frac{26.5111\, e^{2.36 - 0.2534t}}{1 + e^{2.36 - 0.2534t}} - \frac{33.8955}{1 + e^{2.36 - 0.2534t}} - 0.1025 \ln\frac{1}{1 + e^{2.36 - 0.2534t}}
These unique characteristics of the bio-flocculants produced by strain F2 indicate their potential value in industry and have attracted our interest for further study.
Acknowledgment I gratefully acknowledge the National High Technology Research and Development Program of China (863 Program) (Granted No.SQ2009AA06XK1482412) and The Science and Technology Development Program of Heilongjiang Province Department of Education (Granted No.11541284) for their financial supports.
References 1. Wei, W., Fang, M.: Purification and characterization of Compound Bioflocculant. In: The 2nd International Conference on Bioinformatics and Biomedical Engineering (2008) 2. Hyohak, S.: Modeling of batch fermentation kinetics for succinic acid production by Mannheimia succiniciproducens.. Biochemical Engineering Journal 40, 107–115 (2008) 3. Yanbin, Z.: The analysis on characteristics, physical and chemical propertise and flocculating process of CBF-producing bacterium. Doctoral dissertation in Harbin Institute Technology, pp. 50–53 (2006)
4. Kurane, R., Hatamochi, K., Kakuno, T., Kiyohara, M., Hirono, M., Taniguchi, Y.: Production of a bioflocculant by Rhodococcus erythropolis S-1 grown on alcohols. Biosci. Biotech. Biochem. 58, 428–429 (1994) 5. Koizumi, J.I., Takeda, M., Kurane, R., Nakamura, J.: Synergetic flocculation of the bioflocculant FIX extracellularly produced by Nocardia amarae. J. Gen. Appl. Microbiol. 37, 447– 454 (1991) 6. Takeda, M., Kurane, R., Koizumi, J.I., Nakamura, J.: A protein bio-flocculants produced by Rhodococcus erythropolis. Agric. Biol. Chem. 55, 2663–2664 (1991) 7. Qin, W., Fang, M.: The study on application of CBF. Industrial water treatment 27, 68–74 (2007)
Research on Enrichment for Anammox Bacteria Inoculated via Enhanced Endogenous Denitrification Yi Yuan1,2, Yong Huang2, Huiping Deng1, Yong Li2, and Yang Pan2 1
School of Environmental Science and Engineering, Tongji University, Shanghai ,China 2 Department of Environmental Science and Engineering, Suzhou university of Science and technology, Suzhou, China [email protected]
Abstract. The purpose of this study is to investigate the feasibility of anammox sludge enrichment by an endogenous denitrification method and the characteristics of the enriched anammox sludge. An SBR was used as the reactor; an operation mode with continuous influent during operation benefits the anammox reaction because of substrate dilution, and lower nitrogen concentrations and shorter operation cycles also benefit the bacterial culture at the same nitrogen load. The sludge selected from aerobic sludge treating municipal wastewater by the endogenous denitrification method could react steadily as anammox at a nitrogen load of 0.156 kgN/m3·d. In this study the highest influent nitrogen concentrations were 500 mg NH4+-N/l and 580 mg NO2--N/l, the removed NH4+-N/NO2--N ratio was 1:1.12, close to the reported value, and the sludge became red, which all indicate that endogenous-denitrification sludge can be adopted for anammox bacteria enrichment, and that the endogenous-denitrification method can be adopted for treating sludge digestion liquor. The research also showed that the SBR used for anammox sludge enrichment was quite stable against flow rate shocks but very sensitive to substrate shocks because of nitrite inhibition.
Keywords: endogenous denitrification; anammox; enrichment.
1 Introduction
Anaerobic ammonium oxidation (anammox) bacteria oxidize ammonium directly to dinitrogen gas using nitrite as the electron acceptor. The process is autotrophic, using CO2 as the only carbon source, and the reaction is described by Eq. (1) [1]:

NH4+ + 1.32 NO2- + 0.066 HCO3- + 0.13 H+ → 1.02 N2 + 0.26 NO3- + 0.066 CH2O0.5N0.15 + 2.03 H2O
(1)
The anammox process has been successfully implemented as a cost-effective and environmentally friendly N-removal system in the Netherlands [2,3]. With the SHARON-Anammox process, the oxygen requirement for nitrogen removal is reduced by 60% via half
nitrite nitrification, and the net CO2 emissions are strongly reduced, by 90% [4], because no COD is needed. Because of the low sludge yield [5], the surplus sludge output of the anammox process is only 8 percent of that of the traditional nitrogen removal process, which reduces the cost of sludge treatment and disposal. Hence the anammox process is the most economical treatment for wastewater with a high nitrogen concentration and a low C/N ratio, and the sludge digestion liquor with high ammonia concentration in wastewater treatment plants should be treated by the anammox process. The problem, however, is the cultivation of the anammox bacteria. The strategy adopted here for selecting anammox bacteria is endogenous denitrification [6]: nitrification liquid in the WWTP can be used to endogenously denitrify the sludge in the concentration tank in order to select anammox sludge. In this way, without additional nitrification or special sludge, anammox sludge can be selected very economically. The purpose of this study was to investigate the feasibility of anammox sludge enrichment by endogenous denitrification and the characteristics of the enriched anammox sludge.
2 Materials and Methods
2.1 Equipment
An anaerobic sequencing batch reactor (SBR) with an effective capacity of 2500 ml was used for anammox bacteria accumulation (Fig. 1). The SBR was operated at 35±1 °C by immersion in a water bath and was covered with a window blind to avoid inhibiting the anammox bacteria. Influent water was batch-fed from the bottom, 900 ml per cycle.
Fig. 1. Anammox SBR reactor
2.2 Influent and Seeding Sludge
A synthetic medium was used. The medium contained (per liter): NaNO2, 50-580 mg N; NH4Cl, 10-500 mg N; NaHCO3, 1500 mg; NaCl, 500 mg; KCl, 74 mg; KH2PO4, 27.2 mg; MgSO4, 50 mg; CaCl2, 50 mg; and 1.25 ml of trace element solutions I (including EDTA and FeSO4) and II [7]. The dissolved oxygen
(DO) concentration of the feeding solution was reduced to 0.5 mg/l or less by sparging nitrogen gas before it was supplied to the reactor. The amount of ammonium was calculated from a ratio to nitrite of 1:1.32. The sludge used for anammox bacteria enrichment originated from two 1-L SBR reactors in which sludge from the aerobic tank of the Suzhou west WWTP had been endogenously denitrified [6] with inorganic water containing 50~80 mg NO2--N/l. The sludge concentration for denitrification was 14.7 g/l.
2.3 Analytical Methods
Ammonium was measured by the Nessler's reagent spectrophotometric method [8]. Nitrite was analysed by the N-(1-naphthyl)-1,2-diaminoethane dihydrochloride spectrophotometric method [8]. pH was measured with a pH-3TC meter (Leici, Shanghai, China), and alkalinity was measured with the electrode method [8].
2.4 Sensitivity Ratio
Stability is regarded as the resistance of the performance parameters (e.g. substrate conversion percentage, effluent substrate concentration) to variations in the operational parameters (e.g. inflow rate, influent substrate concentration). If a slight disturbance in the inflow rate or influent substrate concentration results in an enormous alteration of the effluent substrate concentration, the system is judged to be sensitive. The sensitivity can be described by Eq. (2) [9]:

S = \frac{(O_i - O_b)/O_b}{(I_i - I_b)/I_b}    (2)
where S is the sensitivity of the reactor; O_i and I_i are the effluent substrate concentration and influent load at moment i, respectively; and O_b and I_b are the effluent substrate concentration and influent load during the steady operation period, respectively.
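Eq. (2) is straightforward to evaluate; the snippet below shows the calculation with illustrative numbers only (they are not taken from the paper's tables):

```python
# Sensitivity ratio of Eq. (2): relative change of effluent concentration
# divided by the relative change of influent load.
def sensitivity(o_i, o_b, i_i, i_b):
    return ((o_i - o_b) / o_b) / ((i_i - i_b) / i_b)

if __name__ == "__main__":
    # Hypothetical example: effluent nitrite rises from 5 to 40 mg N/L when the
    # influent TN concentration is raised from 273 to 321 mg N/L.
    print(round(sensitivity(40.0, 5.0, 321.0, 273.0), 2))
```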
3 Results and Discussion
3.1 Enrichment Process in the Reactor
During the first 6 days, influent with 10 mg NH4+-N/l and 80 mg NO2--N/l was fed to further digest the heterotrophic bacteria. In the following days, inorganic water with 50 mg NH4+-N/l and 50 mg NO2--N/l was fed to the reactor, and the sludge was settled for draining when the concentration of NH4+-N or NO2--N in the reactor fell below 20 mg N/l. The results are shown in Fig. 2. At day 93 the raw water contained 120 mg NH4+-N/l and 120 mg NO2--N/l. Between days 93 and 105 the average effluent concentrations were 30 mg NH4+-N/l and 40 mg NO2--N/l, and the highest nitrogen removal ratios for ammonia and nitrite were 75.4% and 55.7%, respectively. However, because of inappropriate operation and the difficulty of culturing the bacteria, the nitrogen removal deteriorated and the removal ratios for ammonia and nitrite dropped to 42.5% and 59.7%, respectively. Then the operation
cycle time was fixed at 6 days, with one batch feed per day from day 123, giving a nitrogen load of 0.0144 kgTN/m3·d. The reactor showed an obvious removal capacity after day 243: the effluent concentrations were less than 5 mg NH4+-N/l and 1 mg NO2--N/l, with removal ratios correspondingly above 97%. From day 243 to day 271 the cycle time was gradually reduced to 2 days. From day 305 the raw nitrogen concentrations were gradually doubled, reaching 500 mg NH4+-N/l and 580 mg NO2--N/l by day 354, with a total nitrogen load of 0.194 kgTN/m3·d. However, this high concentration inhibited the anammox reaction, shown by NO2--N accumulation in the first cycle, and endogenous denitrification took place in the second cycle, with the NH4+-N concentration increasing because of endogenous nitrogen release. Hence the raw nitrogen concentrations were reduced to 250 mg NH4+-N/l and 290 mg NO2--N/l for bacterial rejuvenation at day 394. The anammox reaction was steady with raw water of 400 mg NH4+-N/l and 464 mg NO2--N/l from day 415 to day 535; the removal ratios of NH4+-N and NO2--N were above 90% and 80%, respectively, and the nitrogen load was 0.156 kgTN/m3·d with a 2-day cycle. The anammox reaction was always inhibited, with NO2--N accumulation, whenever the nitrogen concentration of the raw water was increased, for example from 120 to 240 mg NH4+-N/l and from 240 to 500 mg NH4+-N/l. If the nitrite concentration of the effluent rose above 50 mg N/l, the nitrogen concentration of the influent was reduced appropriately for bacterial rejuvenation. A removed NO2--N:NH4+-N ratio slightly lower than the usual anammox stoichiometric coefficient (NO2--N:NH4+-N = 1.32) [1] was observed during the steady operation period (after day 243), as shown in Fig. 3; the value was 1.12.
Fig. 2. Variation of nitrogen during the operation
3.2 Impact of Different Hydraulic Shock Loads on Reactor Performance Stability
During the operation period from day 243 to day 269, the cycle time was gradually decreased from 6 days to 2 days, and the hydraulic load was accordingly increased from 0.06 m3/m3·d to 0.18 m3/m3·d with influent nitrogen concentrations of 120 mg NH4+-N/l and 132 mg NO2--N/l. The results in Fig. 2 show that both nitrogen removal ratios remained steady, and the sensitivity ratios were less than 1 (Table 1). These results indicate that hydraulic shock loads had little impact on the SBR operation at influent substrate concentrations of 120 mg NH4+-N/l and 132 mg NO2--N/l.
3.3 Impact of Different Volume Shock Loads on Reactor Performance Stability
During the operation period from day 269 to day 528, the influent nitrogen concentrations were gradually increased from 120 mg NH4+-N/l and 132 mg NO2--N/l to 400 mg NH4+-N/l and 464 mg NO2--N/l, and the volume load was accordingly increased from 0.049 kgTN/m3·d to 0.156 kgTN/m3·d with a cycle time of two days. Every time the influent concentration was raised, the nitrite and ammonia removal ratios declined and then recovered to above 80 percent, as confirmed by the sensitivity ratios listed in Table 2. In these cases the sensitivity ratios were higher than in the case of hydraulic shock, which shows that substrate shock had a greater impact on the reactor, mainly because of nitrite inhibition. This conclusion accords with the point of Jin [10].
Fig. 3. TN load and NH4+-N / NO2--N ratio profiles in anammox sludge accumulation
Table 1. Effect of different hydraulic shock loads on the performance stability of the reactor

Hydraulic load (m3/m3·d)    0.06     0.09     0.18
Ammonia removal ratio       98.6%    97.7%    97.8%
Nitrite removal ratio       99.6%    99.9%    97.4%
S (average)                 -        0.62     0.93
Table 2. Effect of different volume shock loads on the performance stability of the reactor

Influent TN concentration (mg N/l)    273→321    321→520    306→520    520→1080    540→864
S (max)                               4.59       103.7      9.78       69.6        6.29
3.4 Alkalinity and pH Variation in One Operation Cycle During the stable operation period, substrates were monitored in two SBR cycles, and the results were showed in Fig.4. Influent water was batch fed from the bottom per cycle, and 20ml effluent water was respectively discharged for monitored from the top of reactor at proper time. The beginning of nitrogen concentrations in reactor were 14.1mmol NH4+-N/l and 17.6mmol NO2--N/l in one of cycles, and nitrogen concentrations were monitored at 0.5 h 2 h 4 h 6 h 9 h 13 h 22 h 28h, respectively. In the other cycle, the beginning of nitrogen concentrations in reactor were 7.2mmol NH4+-N/l and 8.6mmol NO2--N/l, and nitrogen concentrations were monitored at 0.5 h 2 h 4 h 6 h 10 h 14 h 21 h 24 h 27 h 30 h and 33.5h, respectively. It was found that the final pH were little higher than initial pH, and alkalinity were vary between 600-700mgCaCO3/l. These were indicated that the anammox reaction could produce alkalinity. The average removed NH4+-N:NO2--N:NO3--N ratios were 1:1.24:0.252 and 1:1.204:0.202, respectively. And the removal NH4+-N:NO2--N
Fig. 4. Concentration profiles of soluble nitrogen compounds in an SBR cycle (mmol/l)
3.5 The Characteristics of the Sludge
The selected sludge was brown, and its colour became pale red when the sludge had high anammox activity, as shown in Fig. 5. Anammox sludge appears pale red because of its high concentration of c-type cytochromes, so a pale red sludge indicates that the dominant reaction in the reactor is the anammox process.
Fig. 5. The profile of enrichment sludge
4 Conclusion
Sludge selected from aerobic sludge treating municipal wastewater by the endogenous-denitrification method could perform the anammox reaction steadily at a nitrogen load of 0.156 kgN/m3·d. An SBR operation mode with continuous influent feeding would benefit the anammox reaction because of substrate dilution in the reactor, and lower nitrogen concentrations combined with a shorter operation period at the same nitrogen load would also benefit the bacteria culture. The SBR used for anammox sludge enrichment was quite stable against flow-rate shock but very sensitive to substrate shock because of nitrite inhibition. In this study the highest influent nitrogen concentrations were 500 mgNH4+-N/l and 580 mgNO2--N/l, the removed NH4+-N:NO2--N ratio was 1:1.12, close to the reported value of 1:1.32, and the sludge became red. These results all indicate that endogenous-denitrification sludge can be adopted for anammox bacteria enrichment, and that the endogenous-denitrification method can be adopted for treating sludge digestion liquor.
Acknowledgments. This work was supported by the Open Fund of the Key Laboratory of Environmental Science and Engineering of Jiangsu Province, China (project number ZD081202) and the Fund of Suzhou University of Science and Technology, China (project number A0030702).
References
1. Strous, M., Heijnen, J.J., Kuenen, J.G.: The sequencing batch reactor as a powerful tool for the study of slowly growing anaerobic ammonium-oxidizing microorganisms. Appl. Microbiol. Biotechnol. 50, 589–596 (1998)
2. Kartal, B., Van Niftrik, L., Sliekers, O.: Application, eco-physiology and biodiversity of anaerobic ammonium-oxidizing bacteria: a review. Rev. Environ. Sci. Bio/Technol. 3, 255–264 (2004)
3. Op den Camp, H.J.M., Kartal, B., Güven, D.: Global impact and application of the anaerobic ammonium-oxidizing (anammox) bacteria. Biochem. Soc. Trans. 34, 174–178 (2006)
4. van Dongen, U., Jetten, M.S.M., van Loosdrecht, M.C.M.: The SHARON-Anammox process for treatment of ammonium rich wastewater. Water Science and Technology 44(1), 153–160 (2001)
5. Jetten, M.S.M., Strous, M., van de Pas-Schoonen, K.T.: The anaerobic oxidation of ammonium. FEMS Microbiol. Rev. 22(5), 421–437 (1998)
6. Yuan, Y., Huang, Y.: An Experimental Research on Anammox Bacteria's Selection. Journal of University of Science and Technology of Suzhou (Engineering and Technology) 17(4), 6–10 (2004)
7. van de Graaf, A.A., de Bruijn, P., Robertson, L.A.: Autotrophic growth of anaerobic ammonium-oxidizing microorganisms in a fluidized bed reactor. Microbiology 142(8), 2187–2196 (1996)
8. State Environmental Protection Administration of China: The Methods of Water and Wastewater Monitoring, 4th edn. China Environmental Science Press, Beijing (2002)
9. Bolle, W.L., van Breugel, J., van Eybergen, G.C., Kossen, N.W.F., Zoetemeyer, J.: Modeling the liquid flow in up-flow anaerobic sludge blanket reactors. Biotechnol. Bioeng. 28, 1615–1620 (1986)
10. Jin, R., Hu, B., Zheng, P., Chen, X.: Stability of performance of ANAMMOX reactors and assessment criteria. Journal of Chemical Industry and Engineering (China) 157(15), 1166–1170 (2006)
Evaluation of Geological Disaster with Extenics Based on Entropy Method*

Xinmin Wang1, Zhansheng Tao1, and Xiwen Qin2

1 Institute of Applied Mathematics, Changchun University of Technology, Changchun 130012, China
2 College of Basic Sciences, Changchun University of Technology, Changchun 130012, China

* Sponsored by the Science and Technology Development Project of Jilin Province (No. [2008]839) and the State Water Project (2009ZX07424-002).
Abstract. A new evaluation method for geological disaster that combines extenics with the entropy method is presented. Using geological disaster monitoring data from Jilin Province and based on matter-element theory, extension sets and the dependent function, the classical domain and section field of geological disaster are determined, the weights of the evaluation indexes are calculated by the entropy method, and a comprehensive evaluation model of geological disaster is established with extenics theory. The results indicate that adopting extenics theory in the comprehensive assessment of geological disaster is reasonable and feasible.

Keywords: entropy; weight; extenics; evaluation model.
1 Introduction
Risk perception is a necessity for hazard mitigation; there are probably no greater problems for modern politicians and decision makers than dealing with scientific uncertainty and the public perception of risk. By understanding risk, disaster and vulnerability can be analyzed, past experience can be examined, the current situation can be monitored and the future can be predicted [1],[2]. Carrara introduced multivariate statistical analysis and prediction methods into regional landslide hazard forecasting, and this approach has since been developed and extended [3]. Haruyama and Kitamura (1984) assessed the hazard grade of landslide disasters caused by rainfall in an active volcanic area of Japan using quantification theory [4]. At present, fuzzy mathematics is also widely used in the spatial prediction and study of geological disasters. The evaluation methods for geological disaster tend to be quantitative and comprehensive, notably white-box, black-box and grey prediction methods and the Analytic Hierarchy Process (AHP). Many researchers now believe that extenics theory can be applied to the risk evaluation of geological disaster. The geological disaster system is a complicated matter-element system with the characteristics of holism, dynamics and openness.
Because it is impossible to identify all the influencing factors, researchers can only evaluate the risk degree of landslide disaster on the basis of incomplete factors. The extenics theory, however, takes incompatible problems as its research object and studies the principles and methods for transforming and solving them [5],[6]. Therefore, an evaluation method based on extenics theory can reflect the hazard grade of geological disaster objectively. The core of extenics theory is to transform contradictory problems into compatible ones, and the key step is to determine the weights of the evaluation indexes. In this study the entropy method is used to determine the index weights, because it reflects the information content of each index and yields more credible results than hierarchical grade analysis. The resulting model can predict the danger degree of geological disaster and provides a scientific and reasonable basis for disaster prevention and for adopting the relevant measures.
2 Extenics Evaluation Model Based on Entropy Method
Extenics is a new discipline established in 1983 by Professor Cai Wen in China [7],[8]. After more than 20 years of effort it has developed its own methodological system, the extension methods, which have been applied in many fields to form extension engineering; extenics has been widely used in economics, industry, medicine, the military, geology, culture and other areas. Extenics provides formal tools for studying and resolving contradictory problems, in principle and in method, from both qualitative and quantitative points of view. Its core concepts are the matter element and the extension set. The extenics comprehensive evaluation method is constructed on the basis of extension sets. It not only reflects the degree to which the assessed object belongs to each grade, but can also describe the dividing line between two different states quantitatively, which makes it convenient to follow the dynamic changes of the object being described.

The extension evaluation method proceeds in the following steps:
a. Determine the classical domain and section field;
b. Determine the evaluation matter element;
c. Determine the correlation degree;
d. Determine the evaluation grade.
2.1 Determining the Weights by the Entropy Method
Information entropy is a concept from information theory used to measure information content. The more ordered a system is, the lower its information entropy; conversely, the more disordered the system, the higher its information entropy. Information entropy can therefore be regarded as a measure of the degree of ordering of a system. In both risk evaluation and multi-objective decision making, the relative importance of each evaluation index must be considered, and the most direct and simple way to express this importance is to assign a weight to each index. According to the idea of entropy, the quality and quantity of information obtained for the decision is one of the crucial factors determining the precision and credibility of the decision. The weights are determined by the entropy method as follows.
2.1.1 Construction of the Evaluation Matrix
Consider an evaluation system with $m$ samples of $n$ evaluation indexes, giving the initial data matrix $X = (x_{ij})_{m \times n}$. Because the indexes differ greatly in dimension, order of magnitude and direction of quality, the initial data must first be standardized, yielding the standardized data matrix

$$Y = (y_{ij})_{m \times n}. \qquad (1)$$
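The paper does not spell out the standardization formula. A common choice consistent with values falling in [0, 1] is min–max normalization, applied per index and flipped for cost-type (negatively oriented) indexes; the Python sketch below is only an illustration under that assumption, and the function name and scaling choice are ours.

```python
import numpy as np

def standardize(X, benefit):
    """Min-max standardize an m-by-n data matrix column-wise.

    X       : (m, n) array, m samples of n evaluation indexes
    benefit : length-n boolean list, True if a larger raw value means
              a higher (more hazardous) score for that index
    Returns Y = (y_ij) with every entry scaled into [0, 1].
    """
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)          # avoid division by zero
    Y = (X - lo) / span
    Y = np.where(np.asarray(benefit), Y, 1.0 - Y)   # flip cost-type indexes
    return Y

# toy example: 4 sample cells, 3 indexes (the real study uses 5702 cells, 7 indexes)
X = np.array([[0.1, 30.0, 2.0],
              [0.5, 80.0, 5.0],
              [0.9, 60.0, 9.0],
              [0.3, 10.0, 4.0]])
print(standardize(X, benefit=[True, True, True]))
```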
2.1.2 Calculation of the Entropy of Evaluation Index $C_i$
Since information entropy can be used to measure the effective information carried by index $C_i$, the entropy of index $C_i$ is

$$H_i = -k \sum_{j=1}^{m} f_{ij} \ln f_{ij}, \qquad i = 1, 2, \ldots, n, \qquad (2)$$

where

$$f_{ij} = \frac{y_{ij}}{\sum_{j=1}^{m} y_{ij}}, \qquad k = \frac{1}{\ln m},$$

and the constant $k$ depends on the number of samples $m$.

2.1.3 Calculation of the Entropy Weight of Evaluation Index $C_i$
Using the entropy method to estimate the weight of each index essentially means using the information coefficient of the index to calculate its weight: the higher the coefficient, the greater the importance of the index in the evaluation. The weight of evaluation index $C_i$ is therefore

$$w_i = \frac{1 - H_i}{\,n - \sum_{i=1}^{n} H_i\,}. \qquad (3)$$
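Formulas (2) and (3) translate directly into code. The sketch below computes the entropy of each index and the corresponding entropy weights from a standardized matrix Y; it is a minimal illustration, and the convention 0·ln 0 = 0 as well as the variable names are our assumptions, not the authors'.

```python
import numpy as np

def entropy_weights(Y):
    """Entropy weights from a standardized (m, n) matrix Y, per Eqs. (2)-(3).

    Returns (H, w): index entropies H_i and weights w_i with sum(w) == 1.
    """
    Y = np.asarray(Y, dtype=float)
    m, n = Y.shape
    col_sums = Y.sum(axis=0)
    col_sums[col_sums == 0] = 1.0                   # guard against all-zero columns
    f = Y / col_sums                                # f_ij = y_ij / sum_j y_ij
    with np.errstate(divide="ignore", invalid="ignore"):
        flnf = np.where(f > 0, f * np.log(f), 0.0)  # convention: 0 * ln 0 = 0
    H = -flnf.sum(axis=0) / np.log(m)               # k = 1 / ln m, Eq. (2)
    w = (1.0 - H) / (n - H.sum())                   # Eq. (3)
    return H, w

# toy example with 4 samples and 3 indexes
Y = np.array([[0.0, 0.3, 0.1],
              [0.5, 1.0, 0.4],
              [1.0, 0.7, 1.0],
              [0.2, 0.0, 0.3]])
H, w = entropy_weights(Y)
print("entropies:", H.round(3), "weights:", w.round(3), "sum:", w.sum())
```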
3 Application
The basic idea of the risk assessment of geological disaster with extenics based on the entropy method is as follows. First, according to the actual situation of the study area and the technical requirements of geological disaster risk assessment, the geological hazard-prone areas of Jilin Province, China are divided into four grades: non-prone, low-prone, medium-prone and high-prone areas. Then the entropy values are used to calculate the weight of each index, and the object to be evaluated is matched against the collection of grades to make a multi-indicator evaluation. The evaluation model process is shown in Fig. 1.
Fig. 1. Evaluation of geological disaster with extenics based on entropy method (flowchart: classical domain and section field; evaluation matrix; matter element; index entropy; index weight; index entropy weight; correlation degree; evaluation grade)
3.1 The Evaluation Indexes
In the comprehensive evaluation of geological disaster it is vital to choose reasonable evaluation and prediction indexes. Seven indicators were selected: C1, forest coverage rate; C2, mean annual precipitation; C3, physiognomy and topography; C4, geological structure; C5, gneiss; C6, disaster-point density; and C7, human engineering activity. The selection of these seven indexes is based on a qualitative analysis of the influencing degree of each factor, combined with the real situation of geological disaster in Jilin Province (Table 1). Because the evaluation factors differ in dimension and in order of magnitude, the indexes that influence the risk degree of geological disaster must be normalized.
Table 1. Geological disaster classification standard

Index   Non-prone Area   Low-prone Area   Medium-prone Area   High-prone Area
C1      0~0.2            0.2~0.4          0.4~0.8             0.8~1
C2      0~0.2            0.2~0.6          0.6~0.8             0.8~1
C3      0~0.25           0.25~0.45        0.45~0.75           0.75~1
C4      0~0.3            0.3~0.6          0.6~0.8             0.8~1
C5      0~0.25           0.25~0.4         0.4~0.65            0.65~1
C6      0~0.2            0.2~0.4          0.4~0.8             0.8~1
C7      0~0.2            0.2~0.4          0.4~0.7             0.7~1
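For later computation it is convenient to hold the classification standard of Table 1, together with the section field (the full admissible range, here (0, 1) for every normalized index), in a small data structure. The Python sketch below does this; the dictionary layout and names are our own choice, not something prescribed by the paper.

```python
# Classical domains per grade (from Table 1) and the section field per index.
# Intervals are (lower, upper) on the normalized scale.
CLASSICAL_DOMAINS = {
    "non-prone":    {"C1": (0, 0.2),   "C2": (0, 0.2),   "C3": (0, 0.25),    "C4": (0, 0.3),
                     "C5": (0, 0.25),  "C6": (0, 0.2),   "C7": (0, 0.2)},
    "low-prone":    {"C1": (0.2, 0.4), "C2": (0.2, 0.6), "C3": (0.25, 0.45), "C4": (0.3, 0.6),
                     "C5": (0.25, 0.4), "C6": (0.2, 0.4), "C7": (0.2, 0.4)},
    "medium-prone": {"C1": (0.4, 0.8), "C2": (0.6, 0.8), "C3": (0.45, 0.75), "C4": (0.6, 0.8),
                     "C5": (0.4, 0.65), "C6": (0.4, 0.8), "C7": (0.4, 0.7)},
    "high-prone":   {"C1": (0.8, 1),   "C2": (0.8, 1),   "C3": (0.75, 1),    "C4": (0.8, 1),
                     "C5": (0.65, 1),  "C6": (0.8, 1),   "C7": (0.7, 1)},
}
# Section field: the whole range covered by all grades (normalized data, so (0, 1)).
SECTION_FIELD = {c: (0.0, 1.0) for c in ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]}
```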
According to the table above, the classical domains and the corresponding section field of the stability grades of geological disaster can be obtained.

3.2 Determination of the Classical Domain and Section Field
The concrete classical domains are as follows:

$$
R_1 = \begin{bmatrix} N_1 & C_1 & (0, 0.2) \\ & C_2 & (0, 0.2) \\ & C_3 & (0, 0.25) \\ & C_4 & (0, 0.3) \\ & C_5 & (0, 0.25) \\ & C_6 & (0, 0.2) \\ & C_7 & (0, 0.2) \end{bmatrix},
\qquad
R_2 = \begin{bmatrix} N_2 & C_1 & (0.2, 0.4) \\ & C_2 & (0.2, 0.6) \\ & C_3 & (0.25, 0.45) \\ & C_4 & (0.3, 0.6) \\ & C_5 & (0.25, 0.4) \\ & C_6 & (0.2, 0.4) \\ & C_7 & (0.2, 0.4) \end{bmatrix},
$$

$$
R_3 = \begin{bmatrix} N_3 & C_1 & (0.4, 0.8) \\ & C_2 & (0.6, 0.8) \\ & C_3 & (0.45, 0.75) \\ & C_4 & (0.6, 0.8) \\ & C_5 & (0.4, 0.65) \\ & C_6 & (0.4, 0.8) \\ & C_7 & (0.4, 0.7) \end{bmatrix},
\qquad
R_4 = \begin{bmatrix} N_4 & C_1 & (0.8, 1) \\ & C_2 & (0.8, 1) \\ & C_3 & (0.75, 1) \\ & C_4 & (0.8, 1) \\ & C_5 & (0.65, 1) \\ & C_6 & (0.8, 1) \\ & C_7 & (0.7, 1) \end{bmatrix}.
$$
Here $N_j$ denotes the $j$-th evaluation grade ($j = 1, 2, 3, 4$) and $C_i$ ($i = 1, 2, \ldots, n$) denotes the influencing factors of evaluation grade $N_j$: C1 is the forest coverage rate, C2 the mean annual precipitation, C3 the physiognomy and topography, C4 the geological structure, C5 the gneiss, C6 the disaster-point density and C7 the human engineering activity.
3.3 Determination of the Matter Element
The calculation of the model relies on the geological disaster monitoring data of Jilin Province. To guarantee the accuracy of the model, the calculation takes a 5 km × 5 km grid cell as the basic evaluation unit and analyzes every evaluation index in each unit carefully. Each index is then quantified to obtain the value of the evaluation index in each distribution area, and the corresponding matter element is set up.

3.4 Determination of the Weighting Coefficients
Based on the normalized data of 5702 cells and using the entropy method to determine the weighting coefficients, formulas (2) and (3) give the weight distribution coefficients of the evaluation indexes:
$$W = \{0.03066,\; 0.00973,\; 0.07089,\; 0.27873,\; 0.12696,\; 0.34270,\; 0.14033\}^{T},$$

where $\sum_{i=1}^{7} w_i = 1$.
3.5 The Calculation of Correlation Degree
Take a certain unit, cell No. 1618, as an example for calculating the correlation degree. For cell No. 1618 the correlation of evaluation factor C3 with each risk grade is calculated as follows:
$$k_{13}(v_3) = \frac{\rho(v_3, v_{13})}{\rho(v_3, v_{p3}) - \rho(v_3, v_{13})} = \frac{A}{\left|0.65 - \tfrac{1}{2}(0 + 1)\right| - \tfrac{1}{2}(1 - 0) - A} = -0.5,$$

where

$$A = \rho(v_3, v_{13}) = \left|0.65 - \tfrac{1}{2}(0 + 0.25)\right| - \tfrac{1}{2}(0.25 - 0).$$

Similarly, $k_{23}(v_3) = -0.364$, $k_{33}(v_3) = 0.400$ and $k_{43}(v_3) = 0.1428$.
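The extension distance ρ and the dependent (correlation) function k used above are easy to code. The sketch below reproduces the C3 calculation for cell No. 1618 under the usual extenics definitions, ρ(v, (a, b)) = |v − (a + b)/2| − (b − a)/2 and k = ρ(v, V0)/(ρ(v, Vp) − ρ(v, V0)); two of the resulting values match the text exactly, while the others differ somewhat, presumably due to rounding or a variant of the in-domain formula in the original. The function names are ours.

```python
def rho(v, interval):
    """Extension distance from point v to interval (a, b)."""
    a, b = interval
    return abs(v - (a + b) / 2.0) - (b - a) / 2.0

def dependent(v, classical, section):
    """Dependent (correlation) function k(v) of extenics.

    classical : classical domain (a, b) of one grade for this index
    section   : section field (a_p, b_p) for this index
    """
    d0 = rho(v, classical)
    dp = rho(v, section)
    denom = dp - d0
    if denom == 0:                 # degenerate case: fall back to -rho
        return -d0
    return d0 / denom

# C3 value of cell No. 1618 and the four grade domains from Table 1
v3 = 0.65
grades = {"non-prone": (0, 0.25), "low-prone": (0.25, 0.45),
          "medium-prone": (0.45, 0.75), "high-prone": (0.75, 1)}
section_field = (0.0, 1.0)

for grade, dom in grades.items():
    print(f"k({grade}) = {dependent(v3, dom, section_field):.3f}")
```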
Similarly, the correlation degrees of C1, C2, C4, C5, C6 and C7 with each grade, and the correlation degrees of every factor in the other evaluation units with each risk grade, can be obtained.

3.6 Evaluation Grade
The correlation of the cell to be evaluated with each evaluation category is then calculated. For cell No. 1618, the comprehensive correlation degree of fatalness with the non-prone area is
$$K_1(P_1) = \sum_{i=1}^{7} w_i k_{1i}(v_i) = -0.59211.$$
The correlation degrees of cell No. 1618 with the low-prone, medium-prone and high-prone areas can be obtained in the same way; they are −0.25001, −0.44633 and −0.1539, respectively. According to these results, the correlation degree of cell No. 1618 with the high-prone area is the largest, so its risk grade is the fourth grade and the fatalness of cell No. 1618 belongs to the high-prone area. The grades of all the cells in the study area are obtained by the same method, which yields the geological disaster grade classification (Fig. 2).
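The final step, K_j(P) = Σ_i w_i k_ji(v_i) followed by picking the grade with the largest K_j, can be sketched as follows. The helper names are ours; the weights and the four comprehensive correlation degrees for cell No. 1618 are the values reported above, while a full correlation matrix would come from the dependent function sketched earlier.

```python
import numpy as np

GRADES = ["non-prone", "low-prone", "medium-prone", "high-prone"]

# Entropy weights reported in Sect. 3.4 (indexes C1..C7)
W = np.array([0.03066, 0.00973, 0.07089, 0.27873, 0.12696, 0.34270, 0.14033])

def comprehensive_correlation(k_matrix, weights):
    """K_j(P) = sum_i w_i * k_ji(v_i) for every grade j.

    k_matrix : (4, 7) array of correlation degrees k_ji, one row per grade,
               obtained from the dependent function for each index value.
    """
    return np.asarray(k_matrix) @ weights

def classify(K):
    """Pick the grade with the largest comprehensive correlation degree."""
    return GRADES[int(np.argmax(K))]

# For cell No. 1618 the paper reports the four comprehensive correlation
# degrees directly; using them recovers the same classification.
K_1618 = np.array([-0.59211, -0.25001, -0.44633, -0.1539])
print(classify(K_1618))   # -> 'high-prone'
```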
Fig. 2. Geological disaster grade classification in Jilin province, China
4 Conclusion
In this paper an extenics evaluation model for geological disaster fatalness has been set up. The model uses seven evaluation indexes: forest coverage rate, mean annual precipitation, physiognomy and topography, geological structure, gneiss, disaster-point density and human engineering activity. The entropy method is used to obtain all the index weights, among which the disaster-point density has the largest weight, and the geological disaster fatalness grades of Jilin Province are then classified. The evaluation result is reasonable and feasible, providing a new and efficient method for regional appraisal. The extenics evaluation model based on entropy has great application value in
the fields of analysis and evaluation. With further research and growing awareness, it will surely find wider application and its theoretical system will be improved continuously.

Acknowledgment. We thank the Changchun University of Technology, through the School of Basic Sciences, for funding this work. We also thank the Head Station of Geological Environment Monitoring of Jilin for providing the geological level data.
References
1. Zhao, H.Q., Juang-jie, L.G., Zhang, Z.H.: Probability analysis of geological disaster in the mountainous area in east Jilin. Journal of Jilin University (Earth Science Edition) 34(1), 119–124 (2004)
2. Hu, K., Wu, D.H., Yang, D.M., et al.: Preliminary study of ecological effects of remote small sand descending on urban area. Journal of Changchun University of Science and Technology 31(2), 176–179 (2001)
3. Carrara, A.: Multivariate models for landslide hazard evaluation. Mathematical Geology 15(3), 403–426 (1983)
4. Haruyama, M., Kitamura, R.: An evaluation method by the quantification theory for the risk degree of landslides caused by rainfall. In: Proc. 4th Int. Symp. Landslides, Toronto, vol. 2, pp. 435–440 (1984)
5. Cai, W.: Extension Theory and Its Application. Chinese Science Bulletin 44(17), 1538–1548 (1999)
6. Cai, W.: Extension Management Engineering and Applications. International Journal of Operations and Quantitative Management 5(1), 59–72 (1999)
7. Che, Q., Chen, J.P., Que, J.S.: Determination of Weight Factor with Extenics Evaluation Based on Rough Set Theory. Journal of Jilin University (Earth Science Edition) 38(2), 268–272 (2008)
8. Abdou, S., Savoy, J.: Optimization Techniques 1973. LNCS, vol. 3, pp. 362–373. Springer, Heidelberg (2006)
Author Index
Ai, Diming 570 Archvadze, Nino 180 Bai, Xuejun 165 Ban, Xiaojuan 570 Cai, Yanliang 282 Cao, Qixin 307 Chen, Duan 509 Chen, Feng 439 Chen, Jianxin 205 Chen, Qiuwen 509 Chen, Weidong 33 Chen, Wenxiu 570 Chen, Xinyun 340 Chen, Xulu 579 Cheng, Jinyong 122 Cheng, Li-jun 611 Cheng, Sheng 532 Cheng, Yanhua 532 Chiu, David K.Y. 298 Chuo, Wenjing 205 Dai, Chuan 525 Deng, Bin 671 Deng, Chengcheng 307 Deng, Huiping 700 Ding, Jinli 148 Ding, Yanrui 358 Dong, Yuan 325 Dong, Zheng 604 Dou, Huiling 597 Dou, Jianhong 316 Feng, Liang-Bing Fu, Jingqi 500
541
Galbreath, Zack 100 Gao, Xiang-min 548 Gao, Xing-Xing 635 Gong, Chun 532 Gong, Zhenbang 459 Gu, Hong 1 Guo, Huaicheng 579 Guo, Shuzhen 205
Guo, Xiumei 131 Guo, Xudong 246 Gvajaia, Marika 180 Han, Xianglan 481 Harris, Gordon J. 27 He, Chuanhong 84, 238 He, Ran 348, 386 He, Wei 42, 84, 113, 212, 238 He, Xiaoxu 597 Huang, Guobing 509 Huang, Jian 189 Huang, Jiwei 469 Huang, Kai 579 Huang, Renhan 588 Huang, Xudong 340 Huang, Yong 700 Ji, Feng 413 Jiang, Jiaxin 75 Jiang, Mingfeng 51 Jiao, Zhuqing 626 Jin, Feng 509 Ju, Kang 238 Ju, Xuan 490 Kobayashi, Kunikazu 541 Kuremoto, Takashi 541 Lei, Jingtao 459 Li, Bing 238 Li, Chengyuan 358 Li, Chun 205 Li, Dan-dan 611 Li, Gang 67, 230, 611 Li, Jian 367 Li, Qian 42, 173, 212 Li, Shiyao 91 Li, Wenshu 220 Li, Xiang 604 Li, Xiaorun 451 Li, Xiaoxia 131 Li, Xinyu 653 Li, Yong 700
Li, Yuan-yuan 611 Lin, Feng 325 Lin, Ling 67, 230, 611, 618 Liu, Chunbo 490 Liu, Hao 564 Liu, Hongde 291 Liu, Kexin 691 Liu, Li 19 Liu, Lijuan 367 Liu, Li-Li 635 Liu, Qiang 122 Liu, Wanquan 271 Liu, Wen-ling 644, 662 Liu, Xiangyin 189 Liu, Xiaodong 271 Liu, Xin-yu 196 Liu, Yihui 122 Liu, Yu-Liang 671 Liu, Zhong 307 Long, Haixia 358 Lu, Chong 271 Lu, Jiahui 325 Lu, Luyao 395 L¨ u, Qiang 430 Lu, Wenjin 155 Lu, Wenyu 19 Lu, Yuesheng 555 Luo, Haijun 113, 212 Lv, Dan 91 Ma, Fang 691 Ma, Huan 683 Meng, Jun 189 Meng, Qingfan 325 Obayashi, Masanao Pan, Feng 490 Pan, Yang 700 Pang, Ming-yong
541
548
Qian, Jinwu 75 Qian, Li 644, 662 Qin, Xiwen 708 Qiu, Fuming 189 Quan, Yutong 325 Rong, Qiguo 173, 588 Roysam, Badrinath 100
Sang, Jun 122 Sang, Yuanyuan 348 Shang, Xiukui 131 Shao, Chenxi 597 Shao, Yong 517 Shen, Deli 165 Shou, Guofa 51, 316 Shu, Shihu 683 Si, Wen-Jie 671 Song, Jiafang 653 Song, Lantian 60 Song, Li 570 Song, Xiaodong 113 Song, Yixu 10 Stamateli, Anna 180 Su, Wei 500 Sun, Jun 358 Sun, Xiao 291 Tan, Guo-Zhen 386 Tan, Huimeng 333 Tang, Can 532 Tang, Yin 404 Tang, Yongning 439 Tao, Zhansheng 708 Tavdishvili, Otar 180 Teng, Lirong 325 Tian, Hongru 325 Tong, Songtao 597 Tsagareli, Sulkhan 180 Tu, Xincheng 469 Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang, Wang,
Anna 91 Fei 404 Hanpin 282 Hong 10 Huiquan 67 Jiang 671 Jingchuan 33 Junfeng 469 Junsong 165 Lanzhou 148 Lei 375 Na 375 Shaoqing 122 Sichun 262 Tianmiao 459 Tianpeng 333 Wei 205, 691 Wenshan 307
Author Index Wang, Xingce 196 Wang, Xinmin 708 Wang, Xiu-Kun 348, 386 Wang, Yong 205 Wang, Zhe 91 Wang, Zhelong 525 Wang, Zhen 75 Wang, Zhentao 469 Wang, Zicai 597 Wei, Hao 254 Wei, Xi-Le 671 Wen, Xian-Bin 635 Wu, Jinzhen 430 Wu, Peng 367 Wu, Xiaoming 451 Wu, Zhen 165 Wu, Zhongke 196 Xia, Ling 51, 316, 395, 421 Xiao, Liang 100 Xie, Hong 340 Xing, Jie 691 Xiong, Hui 230 Xiong, Weili 626 Xu, Baoguo 626 Xu, Feng 196 Xu, Guizhi 131 Xu, Qingzheng 375 Xu, Ruxiang 230 Xu, Wenbo 358 Xu, Wenshan 564 Xu, Xuesong 262 Xu, Zheng 42, 113, 212, 238 Yan, Guozheng 246, 555 Yan, Rongguo 246 Yang, Banghua 19, 517, 604 Yang, Fan 10 Yang, Jixian 691 Yang, Lingyun 430 Yang, Ming 597 Yang, Nanhai 348, 386 Yang, Peng 430 Yang, Wenlu 340 Yao, Jianfu 220
Yi, Xiaomei 367 Yin, Baoshu 564 Yin, Fuliang 139 Yu, Fengqin 60 Yu, Jiangsheng 282 Yu, Junda 205 Yu, Lianzhi 555 Yuan, Linlin 220 Yuan, Xiaosong 100 Yuan, Yi 700 Zamani, Masood 298 Zan, Peng 19, 517, 604 Zang, Yunliang 316, 421 Zhai, Weiming 10 Zhang, Bailing 155 Zhang, Gang 481 Zhang, Jianliang 1 Zhang, Jianwei 532 Zhang, Jinyi 517 Zhang, Li 84 Zhang, Lijun 139 Zhang, Liyan 139 Zhang, Wangming 230 Zhang, Yibo 325 Zhang, Yu 316 Zhang, Zhen 75 Zhao, Guangzhou 1 Zhao, Hongyu 525 Zhao, Huihui 205 Zhao, Liaoying 451 Zhao, Yannan 10 Zhao, Yong-Wu 413 Zhao, Zhe 67 Zheng, Hao 413 Zheng, Xiaoming 604 Zheng, Xiao-shen 254, 644, 662 Zheng, Xueqing 325 Zhou, Ming-quan 196 Zhou, Qinian 220 Zhu, Wenhua 333 Zhu, Xiaofei 555 Zhu, Xiuwei 395 Zuo, Chun-Cheng 413 Zuo, Yong-Xia 413